[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773821#comment-16773821
 ] 

Dinesh Joshi edited comment on CASSANDRA-14482 at 2/21/19 8:40 AM:
-------------------------------------------------------------------

[~bdeggleston] thanks for the review. The reason I used the streams is that 
the {{Zstd::compress}} JNI static helper does not allow setting the 
checksumming flag. I confirmed this with the JNI author, and the limitation 
goes deeper than just the JNI bindings. However, using the compression stream 
generates garbage and therefore incurs GC overhead.

So here are the options we have right now:

# Move forward with checksumming and accept the GC overhead
# Move forward without checksumming
# Allow the user to turn checksumming on or off using a compression preference 
parameter (turning it on will incur GC, turning it off won't)
# Add our own checksumming (ugly, burns additional CPU and still generates some 
garbage)
# Work with Zstd and Zstd JNI to enable passing in flags such as the 
checksumming flag (no GC overhead)

I personally think that in the near term we should pick one of options 1-3, 
move forward, and open a follow-up ticket to address the GC issue. I am opposed 
to doing our own checksumming, especially because Zstd already supports it and 
it is just a matter of plumbing and adding the appropriate APIs to make it 
happen in a performant manner for JNI. If anybody has any other ideas, I am all 
ears.
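
To make the cost of option 4 concrete, here is a minimal sketch of what doing 
our own checksumming could look like. It is only an illustration under stated 
assumptions: {{ChecksummedFrame}}, {{wrap}} and {{unwrap}} are hypothetical 
names, not Cassandra or Zstd JNI APIs, and CRC32 over the compressed bytes is a 
stand-in that does not match how Zstd's built-in frame checksum works. The 
point is the extra pass over the data and the extra allocations per call.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; not part of Cassandra or zstd-jni.
final class ChecksummedFrame
{
    private static final int TRAILER_BYTES = 4;

    // Appends a 4-byte CRC32 of the already-compressed bytes as a trailer.
    static byte[] wrap(byte[] compressed)
    {
        CRC32 crc = new CRC32();                      // extra object per call
        crc.update(compressed, 0, compressed.length); // extra pass over the data
        return ByteBuffer.allocate(compressed.length + TRAILER_BYTES) // extra copy
                         .put(compressed)
                         .putInt((int) crc.getValue())
                         .array();
    }

    // Verifies the trailer and returns the compressed payload without it.
    static byte[] unwrap(byte[] framed)
    {
        if (framed.length < TRAILER_BYTES)
            throw new IllegalArgumentException("frame too short");

        int payloadLength = framed.length - TRAILER_BYTES;
        CRC32 crc = new CRC32();
        crc.update(framed, 0, payloadLength);

        int stored = ByteBuffer.wrap(framed, payloadLength, TRAILER_BYTES).getInt();
        if (stored != (int) crc.getValue())
            throw new IllegalStateException("checksum mismatch");

        return Arrays.copyOf(framed, payloadLength);
    }
}
```

Even in this small form, each call allocates a {{CRC32}} instance and a new 
array and walks the buffer one more time, which is the CPU and garbage cost 
mentioned in option 4.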

[~aweisberg] [~iamaleksey] [~benedict] [~jjirsa] please feel free to chime in.

I am already discussing this issue in the Zstd community and have a working 
prototype of what we need, but I think it is incomplete. I have reached out to 
[~dikanggu] to help surface it with the Zstd team as well.


> ZSTD Compressor support in Cassandra
> ------------------------------------
>
>                 Key: CASSANDRA-14482
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Dependencies, Feature/Compression
>            Reporter: Sushma A Devendrappa
>            Assignee: Dinesh Joshi
>            Priority: Major
>              Labels: performance, pull-request-available
>             Fix For: 4.x
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  


