[jira] [Commented] (CASSANDRA-11051) Make LZ4 Compression Level Configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231516#comment-15231516 ]

Michael Kjellman commented on CASSANDRA-11051:

+1 Looks great! Thanks [~krummas] (sorry for the delay again... stupid spam filter keeps triggering on your Jira updates)

> Make LZ4 Compression Level Configurable
>
> Key: CASSANDRA-11051
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11051
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Michael Kjellman
> Assignee: Michael Kjellman
> Attachments: lz4_2.2.patch
>
> We'd like to make the LZ4 Compressor implementation configurable on a per
> column family basis. Testing has shown a ~4% reduction in file size with the
> higher compression LZ4 implementation vs the standard compressor we currently
> use instantiated by the default constructor. The attached patch adds the
> following optional parameters 'lz4_compressor_type' and
> 'lz4_high_compressor_level' to the LZ4Compressor. If none of the new optional
> parameters are specified, the Compressor will use the same defaults Cassandra
> has always had for LZ4.
> New LZ4Compressor Optional Parameters:
> * lz4_compressor_type can currently be either 'high' (uses LZ4HCCompressor)
> or 'fast' (uses LZ4Compressor)
> * lz4_high_compressor_level can be set between 1 and 17. Not specifying a
> compressor level while specifying lz4_compressor_type as 'high' will use a
> default level of 9 (as picked by the LZ4 library as the "default").
> Currently, we use the default LZ4 compressor constructor. This change would
> just expose the level (and implementation to use) to the user via the schema.
> There are many potential cases where users may find that the tradeoff in
> additional CPU and memory usage is worth the on-disk space savings.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
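The two optional parameters described in the issue could be validated roughly as follows. This is a minimal sketch, not the code from lz4_2.2.patch: the class name `Lz4OptionsSketch` and its `parse` method are hypothetical, and treating a missing `lz4_compressor_type` as 'fast' is an assumption based on the statement that unspecified parameters keep the existing defaults. The option names, the 1 to 17 range, and the default level of 9 come from the description.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of Cassandra or the attached patch. */
final class Lz4OptionsSketch {
    static final String TYPE_KEY = "lz4_compressor_type";
    static final String LEVEL_KEY = "lz4_high_compressor_level";
    static final int DEFAULT_HIGH_LEVEL = 9; // default named in the issue description

    final String type; // "fast" or "high"
    final int level;   // only meaningful when type is "high"

    private Lz4OptionsSketch(String type, int level) {
        this.type = type;
        this.level = level;
    }

    /** Parses schema-style string options, rejecting values outside the documented ranges. */
    static Lz4OptionsSketch parse(Map<String, String> options) {
        // Assumption: no type given means the standard ("fast") compressor.
        String type = options.getOrDefault(TYPE_KEY, "fast");
        if (!type.equals("fast") && !type.equals("high")) {
            throw new IllegalArgumentException(TYPE_KEY + " must be 'fast' or 'high', got: " + type);
        }

        int level = DEFAULT_HIGH_LEVEL;
        String rawLevel = options.get(LEVEL_KEY);
        if (rawLevel != null) {
            try {
                level = Integer.parseInt(rawLevel);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(LEVEL_KEY + " must be an integer, got: " + rawLevel);
            }
            if (level < 1 || level > 17) {
                throw new IllegalArgumentException(LEVEL_KEY + " must be between 1 and 17, got: " + level);
            }
        }
        return new Lz4OptionsSketch(type, level);
    }
}
```

Keeping the parsing separate from compressor construction makes the range checks easy to unit test without loading the LZ4 library.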
[jira] [Commented] (CASSANDRA-11051) Make LZ4 Compression Level Configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225887#comment-15225887 ]

Marcus Eriksson commented on CASSANDRA-11051:

Wrote a unit test and noticed that we would log the "... parameter is ignored when ..."-message even if the newly created table was OK. Reason is that we call LZ4Compressor.create(...) for every existing table in the keyspace when we create a new table.

Moved the logging to the constructor and a few small changes to make this testable. Pushed as a new commit to the repo above.

Could you have a quick look [~mkjellman]?
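The effect of moving the log statement can be sketched as below. This is an illustration under stated assumptions, not the committed change: it assumes `create(...)` returns a cached instance per distinct parameter set (the comment does not spell this out), the class name `CachedCompressorSketch` is hypothetical, the `WARNINGS` list stands in for a real logger, and the warning text is placeholder wording because the thread elides the actual message.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a compressor with a caching factory. */
final class CachedCompressorSketch {
    /** Stand-in for a logger so the behaviour can be asserted in a test. */
    static final List<String> WARNINGS = new CopyOnWriteArrayList<>();

    private static final ConcurrentHashMap<String, CachedCompressorSketch> INSTANCES =
            new ConcurrentHashMap<>();

    final String type;
    final Integer level; // null when no level was supplied

    private CachedCompressorSketch(String type, Integer level) {
        this.type = type;
        this.level = level;
        // Warning lives in the constructor, so it fires once per distinct
        // configuration rather than once per create(...) call.
        if (type.equals("fast") && level != null) {
            WARNINGS.add("level ignored for 'fast' compressor (placeholder wording)");
        }
    }

    /** Called for every table; only the first call per configuration constructs. */
    static CachedCompressorSketch create(String type, Integer level) {
        String key = type + ":" + level;
        return INSTANCES.computeIfAbsent(key, k -> new CachedCompressorSketch(type, level));
    }
}
```

With this shape, repeated `create("fast", 5)` calls made while other tables are loaded return the same instance and add no further warnings.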
[jira] [Commented] (CASSANDRA-11051) Make LZ4 Compression Level Configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221910#comment-15221910 ]

Michael Kjellman commented on CASSANDRA-11051:

Sorry [~krummas] I missed your updates! Rebase looks great, thanks for doing that. Ship it!
[jira] [Commented] (CASSANDRA-11051) Make LZ4 Compression Level Configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154189#comment-15154189 ]

Marcus Eriksson commented on CASSANDRA-11051:

I rebased this on trunk while reviewing: https://github.com/krummas/cassandra/commits/mkjellman/11051-trunk (please have a look that I didn't mess anything up)

tests:
http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-mkjellman-11051-trunk-dtest/
http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-mkjellman-11051-trunk-testall/

Code LGTM but as Sylvain mentions above, a few unit tests would be good
[jira] [Commented] (CASSANDRA-11051) Make LZ4 Compression Level Configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144555#comment-15144555 ]

Sylvain Lebresne commented on CASSANDRA-11051:

I see no reason not to add this, though as that's a (nice but non terribly essential) improvement, we should probably stick to 3.x for that. So [~mkjellman], a 3.x version of the patch with a few unit tests would be really awesome.