[
https://issues.apache.org/jira/browse/CASSANDRA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885494#comment-15885494
]
Sylvain Lebresne commented on CASSANDRA-10520:
----------------------------------------------
bq. should reopen CASSANDRA-11128
We don't re-open issues that have made it in a release, but it's worth opening
a followup, yes.
bq. this means that for upgrades from 3.0/3.x to 4.0 users must ensure
this/11128 is fixed.
Yes, and I wonder if avoiding this isn't a good enough reason to avoid enabling
this by default, _at least on existing tables_. I mean, in general, I wonder if
we shouldn't default to being more conservative for existing tables on upgrade.
That is, what I'd suggest is that we default existing tables to this being
disabled (no change from now), but enable it on new tables (basically, default
to false if the table doesn't have the option, but force it to true on new
tables if not provided).
> Compressed writer and reader should support non-compressed data.
> ----------------------------------------------------------------
>
> Key: CASSANDRA-10520
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10520
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Reporter: Branimir Lambov
> Assignee: Branimir Lambov
> Labels: messaging-service-bump-required
> Fix For: 4.x
>
> Attachments: ReadWriteTestCompression.java
>
>
> Compressing incompressible data, as done, for instance, to write SSTables
> during stress-tests, results in chunks larger than 64k, which are a problem
> for the buffer pooling mechanisms employed by the
> {{CompressedRandomAccessReader}}. This results in non-negligible performance
> issues due to excessive memory allocation.
> To solve this problem and avoid decompression delays in the cases where it
> does not provide benefits, I think we should allow compressed files to store
> uncompressed chunks as an alternative to compressed data. Such a chunk could
> be written after compression returns a buffer larger than, for example, 90% of
> the input, and would not result in additional delays in writing. On reads it
> could be recognized by size (using a single global threshold constant in the
> compression metadata) and data could be directly transferred into the
> decompressed buffer, skipping the decompression step and ensuring a 64k
> buffer for compressed data always suffices.
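> The write/read decision described above can be sketched roughly as follows.
> This is a simplified illustration, not Cassandra's actual implementation: the
> class and method names are hypothetical, {{java.util.zip}} stands in for the
> configured compressor, and a stored chunk is treated as uncompressed simply
> when its size equals the uncompressed chunk length (which the 90% write-side
> threshold guarantees a compressed chunk can never reach).

```java
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ChunkCodec {
    // Hypothetical global threshold: only keep the compressed form if it is
    // at most 90% of the input, mirroring the example figure in the ticket.
    static final double MAX_COMPRESSED_RATIO = 0.9;

    // Returns the bytes to store: the compressed form if it saves enough,
    // otherwise the raw input chunk, with no extra copy or delay on write.
    static byte[] maybeCompress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] out = new byte[input.length]; // only useful if output is smaller
        int written = deflater.deflate(out);
        boolean finished = deflater.finished();
        deflater.end();
        if (!finished || written > input.length * MAX_COMPRESSED_RATIO)
            return input; // compression gained too little: store uncompressed
        byte[] result = new byte[written];
        System.arraycopy(out, 0, result, 0, written);
        return result;
    }

    // On read, a stored chunk whose size equals the uncompressed chunk length
    // is recognized as uncompressed and transferred directly, skipping the
    // decompression step entirely.
    static byte[] readChunk(byte[] stored, int uncompressedLength) throws Exception {
        if (stored.length == uncompressedLength)
            return stored.clone(); // uncompressed chunk: direct transfer
        Inflater inflater = new Inflater();
        inflater.setInput(stored);
        byte[] out = new byte[uncompressedLength];
        int n = inflater.inflate(out);
        inflater.end();
        if (n != uncompressedLength)
            throw new IllegalStateException("truncated chunk");
        return out;
    }
}
```

> With this scheme, a repetitive chunk round-trips through the compressed path
> while a random (incompressible) chunk is stored verbatim, so the reader's
> buffer for stored data never needs to exceed the uncompressed chunk size.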
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)