[
https://issues.apache.org/jira/browse/CASSANDRA-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Miklosovic updated CASSANDRA-21194:
------------------------------------------
Status: Ready to Commit (was: Review In Progress)
> Sampling data for dictionary training on more than Integer.MAX_VALUE bytes is
> pointless
> ---------------------------------------------------------------------------------------
>
> Key: CASSANDRA-21194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21194
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Feature/Compression
> Reporter: Stefan Miklosovic
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> ZstdDictTrainer, from the zstd-jni library we use, allocates the buffer for
> training samples with ByteBuffer.allocateDirect(size), where {{size}} is an
> int. Integer.MAX_VALUE is 2^31 - 1 bytes, i.e. just under 2 GiB. So if a user
> wants to sample more than that, say 3 GiB, sampling simply stops at about
> 2 GiB and the training output looks as if it were stuck. We should validate
> this value before training and reject anything that exceeds this limit.
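
The validation proposed above could look roughly like the following. This is only a minimal sketch, not the actual patch: the class name SamplingSizeValidator, the method validate, and the error messages are hypothetical, and the only assumption carried over from the description is that the requested size must fit into the int accepted by ByteBuffer.allocateDirect.

```java
public final class SamplingSizeValidator {
    // Hypothetical constant: the largest capacity that fits into the int
    // parameter of ByteBuffer.allocateDirect(int), i.e. 2^31 - 1 bytes.
    public static final long MAX_SAMPLING_BYTES = Integer.MAX_VALUE;

    private SamplingSizeValidator() {}

    // Returns the requested size as an int if it is usable for a direct
    // buffer, otherwise throws instead of silently sampling less data.
    public static int validate(long requestedBytes) {
        if (requestedBytes <= 0) {
            throw new IllegalArgumentException(
                "Sampling size must be positive, got " + requestedBytes);
        }
        if (requestedBytes > MAX_SAMPLING_BYTES) {
            throw new IllegalArgumentException(
                "Sampling size " + requestedBytes + " bytes exceeds the maximum of "
                + MAX_SAMPLING_BYTES + " bytes");
        }
        return (int) requestedBytes; // safe: the range was checked above
    }

    public static void main(String[] args) {
        // 1 GiB is accepted.
        System.out.println(validate(1L << 30));
        // 3 GiB is rejected before any training would start.
        try {
            validate(3L << 30);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Taking the requested size as a long matters here: if it were narrowed to int before the check, a value such as 3 GiB would already have overflowed and the comparison could not catch it.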