Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Maulin Vasavada
code >> changes based on recommendations from folks here. >> >> Thanks, >> Shylaja >> >> -Original Message- >> From: Joshua McKenzie >> Sent: Friday, November 19, 2021 4:53 AM >> To: dev@cassandra.apache.org >> Subj

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Maulin Vasavada
. > > I would be happy to provide additional data points or make necessary code > changes based on recommendations from folks here. > > Thanks, > Shylaja > > -Original Message- > From: Joshua McKenzie > Sent: Friday, November 19, 2021 4:53 AM > To: dev@cassa

RE: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Kokoori, Shylaja
21 4:53 AM To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption > > setting performance requirements in this regard is a nonsense. As long > as it's reasonably usable in the real world, and Cassandra makes the > estimated effects on perfo

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Derek Chen-Becker
Thanks, that's really helpful to have some code to look at! Derek On Fri, Nov 19, 2021 at 9:35 AM Joseph Lynch wrote: > On Fri, Nov 19, 2021 at 9:52 AM Derek Chen-Becker > wrote: > > > > https://bugs.openjdk.java.net/browse/JDK-7184394 added AES intrinsics in > > Java 8, in 2012. While it's

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
On Fri, Nov 19, 2021 at 9:52 AM Derek Chen-Becker wrote: > > https://bugs.openjdk.java.net/browse/JDK-7184394 added AES intrinsics in > Java 8, in 2012. While it's always possible to have a regression, and it's > important to understand the performance impact, stories of 2-10x sound > apocryphal.

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Jeff Jirsa
> On Nov 19, 2021, at 2:53 PM, Joseph Lynch wrote: > >> >> For better or worse, different threat models mean that it’s not strictly >> better to do FDE and some use cases definitely want this at the db layer >> instead of file system. > > Do you mind elaborating which threat models?

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> For better or worse, different threat models mean that it’s not strictly > better to do FDE and some use cases definitely want this at the db layer > instead of file system. Do you mind elaborating which threat models? The only one I can think of is users can log onto the database machine and

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Derek Chen-Becker
https://bugs.openjdk.java.net/browse/JDK-7184394 added AES intrinsics in Java 8, in 2012. While it's always possible to have a regression, and it's important to understand the performance impact, stories of 2-10x sound apocryphal. If they're all using the same intrinsics, the performance should be

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> I think Joey's argument, and correct me if I'm wrong, is that implementing > a complex feature in Cassandra that we then have to manage that's > essentially worse in every way compared to a built-in full-disk encryption > option via LUKS+LVM etc is a poor use of our time and energy. > > i.e.

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> > Yes, this needs to be done. The credentials for this stuff should be > just fetched from wherever one wants. 100% agree with that and that > maybe next iteration on top of that, should be rather easy. This was > done in CEP-9 already for SSL context creation so we would just copy > that

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Bowen Song
On the performance note, I copy & pasted a small piece of Java code to do AES256-CBC on the stdin and write the result to stdout. I then ran the following two commands on the same machine (with AES-NI) for comparison: $ dd if=/dev/zero bs=4096 count=$((4*1024*1024)) status=none | time
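
The program used in this message is not included in the archive. As a rough illustration only, a stdin-to-stdout AES-256-CBC filter could look like the sketch below. The class name `AesCbcPipe`, the throwaway random key and IV, the PKCS5 padding, and the 4096-byte buffer are all assumptions made for the example, not details taken from the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: streams input through AES-256-CBC so throughput can be timed.
final class AesCbcPipe {

    /** Encrypts everything read from in, writes it to out, returns the ciphertext byte count. */
    static long encrypt(InputStream in, OutputStream out, byte[] key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] buffer = new byte[4096];
            long written = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                // update() may return null when no complete block is available yet
                byte[] chunk = cipher.update(buffer, 0, read);
                if (chunk != null) {
                    out.write(chunk);
                    written += chunk.length;
                }
            }
            byte[] last = cipher.doFinal(); // flushes the final padded block
            out.write(last);
            written += last.length;
            out.flush();
            return written;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: a random 32-byte key and 16-byte IV are enough for a throughput test,
        // since the ciphertext is discarded and never decrypted.
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[32];
        byte[] iv = new byte[16];
        random.nextBytes(key);
        random.nextBytes(iv);
        encrypt(System.in, System.out, key, iv);
    }
}
```

Piping a fixed amount of data through this class and through a native tool on the same machine gives a like-for-like comparison, provided JVM startup time is small relative to the run.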

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> > Are you for real here? Nobody will ever guarantee you these 1% numbers > ... come on. I think we are > super paranoid about performance when we are not paranoid enough about > security. This is a two way street. > People are willing to give up on performance if security is a must. > I am for

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Jeff Jirsa
For better or worse, different threat models mean that it’s not strictly better to do FDE and some use cases definitely want this at the db layer instead of file system. > On Nov 19, 2021, at 12:54 PM, Joshua McKenzie wrote: > >> >> >> setting performance requirements in this regard

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joshua McKenzie
> > setting performance requirements in this regard is a > nonsense. As long as it's reasonably usable in the real world, and Cassandra > makes the estimated effects on performance available, it will be up to > the operators to decide whether to turn on the feature I think Joey's argument, and

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Bowen Song
Sorry, but IMHO setting performance requirements in this regard is a nonsense. As long as it's reasonably usable in the real world, and Cassandra makes the estimated effects on performance available, it will be up to the operators to decide whether to turn on the feature. It's a trade off between

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Stefan Miklosovic
On Fri, 19 Nov 2021 at 03:03, Joseph Lynch wrote: > > > > > I've seen this be a significant obstacle for people who want to adopt > > Apache Cassandra many times and an insurmountable obstacle on multiple > > occasions. From what I've seen, I think this is one of the most watched > > tickets with

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Stefan Miklosovic
On Fri, 19 Nov 2021 at 02:51, Joseph Lynch wrote: > > On Thu, Nov 18, 2021 at 7:23 PM Kokoori, Shylaja > wrote: > > > To address Joey's concern, the OpenJDK JVM and its derivatives optimize > > Java crypto based on the underlying HW capabilities. For example, if the > > underlying HW supports

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-18 Thread Joseph Lynch
> > I've seen this be a significant obstacle for people who want to adopt > Apache Cassandra many times and an insurmountable obstacle on multiple > occasions. From what I've seen, I think this is one of the most watched > tickets with the most "is this coming soon" comments in the project backlog

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-18 Thread Joseph Lynch
On Thu, Nov 18, 2021 at 7:23 PM Kokoori, Shylaja wrote: > To address Joey's concern, the OpenJDK JVM and its derivatives optimize > Java crypto based on the underlying HW capabilities. For example, if the > underlying HW supports AES-NI, JVM intrinsics will use those for crypto > operations.

RE: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-18 Thread Kokoori, Shylaja
1 1:23 AM To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption I agree with Joey that most users may be better served by OS level encryption, but I also think this ticket can likely be delivered fairly easily. If we have a new contributor willing to p

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-18 Thread bened...@apache.org
at 09:07 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption I wanted to provide a bit of background on the interest we've seen in this ticket/feature (at Instaclustr) - essentially it comes down to in-db encryption at rest being a feature that compliance

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-18 Thread Ben Slater
I wanted to provide a bit of background on the interest we've seen in this ticket/feature (at Instaclustr) - essentially it comes down to in-db encryption at rest being a feature that compliance people are used to seeing in databases and having a very hard time believing that operating system

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Joseph Lynch
For FDE you'd probably have the key file in a tmpfs pulled from a remote secret manager and when the machine boots it mounts the encrypted partition that contains your data files. I'm not aware of anyone doing FDE with a password in production. If you wanted selective encryption it would make

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Bowen Song
I don't like the idea of FDE (Full Disk Encryption) as an alternative to application-managed encryption at rest. Each has its own advantages and disadvantages. For example, if the encryption key is the same across nodes in the same cluster, and Cassandra can share the key securely between

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
On Tue, 16 Nov 2021 at 16:17, Joseph Lynch wrote: > > > I find it rather strange to offer commit log and hints > encryption at rest but for some reason sstable encryption would be > omitted. > > I also think file/disk encryption may be superior in those cases Just for the record, I do not have

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Joseph Lynch
> I find it rather strange to offer commit log and hints encryption at rest but for some reason sstable encryption would be omitted. I also think file/disk encryption may be superior in those cases, but I imagine they were easier to implement in that you don't have to worry nearly as much about

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
I don't object to having the discussion about whether we actually need this feature at all :) Let's hear from people in the field what their perception is on this. Btw, if we should rely on file system encryption, for what reason is there encryption of commit logs and hints already? So this

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Joseph Lynch
I think a CEP is wise (or a more thorough design document on the ticket) given how easy it is to do security incorrectly and key management, rotation and key derivation are not particularly straightforward. I am curious what advantage Cassandra implementing encryption has over asking the user to

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
erhaps > >>> encrypted using a public key provided to you by the receiving node. This > >>> would permit efficient “zero copy” streaming for the data portion, but > >>> not require any knowledge of the recipient node’s master key(s). > >>> >

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Bowen Song
. From: Bowen Song Date: Tuesday, 16 November 2021 at 11:56 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption I think authenticating a receiving node is important, but it is perhaps not in the scope of this ticket (or CEP if it becomes one

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread bened...@apache.org
, and for an operator to thereby fail to protect their data. From: Bowen Song Date: Tuesday, 16 November 2021 at 12:33 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption I think a warning message is fine, but Cassandra should not enforce network

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Bowen Song
) than users that do not encrypt their data at rest. From: Bowen Song Date: Tuesday, 16 November 2021 at 11:56 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption I think authenticating a receiving node is important, but it is perhaps not in the scope

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
d some authentication of the > > recipient node before streaming the file as it would effectively be > > decrypted to any node that could request this streaming action. > > > > > > From: Stefan Miklosovic > > Date: Tuesday, 16 November 2021 at 10:45 >

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread bened...@apache.org
expectations (and compliance requirements) than users that do not encrypt their data at rest. From: Bowen Song Date: Tuesday, 16 November 2021 at 11:56 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption I think authenticating a receiving node

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Bowen Song
. From: Stefan Miklosovic Date: Tuesday, 16 November 2021 at 10:45 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption Ok but this also means that Km would need to be the same for all nodes right? If we are rolling in node by node fashion, Km is changed

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Bowen Song
No, the Km does not need to be the same across nodes. Each node can store its own encryption info file created with its own Km. The streaming process only requires that the Kr is shared. A quick description of the streaming process via an insecure connection: 1. the sender unwraps the wrapped key

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread bened...@apache.org
: Re: Resurrection of CASSANDRA-9633 - SSTable encryption Ok but this also means that Km would need to be the same for all nodes right? If we are rolling in node by node fashion, Km is changed at node 1, we change the wrapped key which is stored on disk and we stream this table to the other node

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
Ok but this also means that Km would need to be the same for all nodes right? If we are rolling in node by node fashion, Km is changed at node 1, we change the wrapped key which is stored on disk and we stream this table to the other node which is still on the old Km. Would this work? I think we

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Bowen Song
Yes, that's correct. The actual key used to encrypt the SSTable will stay the same once the SSTable is created. This is a widely used practice in many encrypt-at-rest applications. One good example is the LUKS full disk encryption, which also supports multiple keys to unlock (decrypt) the same
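
The wrapped-key idea discussed in this part of the thread can be sketched as follows. This is a minimal illustration, not the code from the CASSANDRA-9633 patch: it assumes AES key wrap through the JDK's `AESWrap` cipher, and the class name `KeyWrapSketch` and its method names are hypothetical. The derivation of the key-encryption key (KEK) from Km mentioned in the thread is left out; the KEKs here are simply random AES keys.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical sketch: the data key stays fixed, only its wrapped form changes on rotation.
final class KeyWrapSketch {

    /** Generates a random 256-bit AES key. */
    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps the data key under the KEK; the result is what would be stored on disk. */
    static byte[] wrap(SecretKey kek, SecretKey dataKey) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.WRAP_MODE, kek);
            return cipher.wrap(dataKey);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recovers the data key from its wrapped form using the KEK. */
    static SecretKey unwrap(SecretKey kek, byte[] wrapped) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.UNWRAP_MODE, kek);
            return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Rotation: re-wrap the same data key under a new KEK without touching the encrypted data. */
    static byte[] rotate(SecretKey oldKek, SecretKey newKek, byte[] wrapped) {
        return wrap(newKek, unwrap(oldKek, wrapped));
    }

    public static void main(String[] args) {
        SecretKey dataKey = newAesKey();
        SecretKey oldKek = newAesKey();
        SecretKey newKek = newAesKey();
        byte[] rewrapped = rotate(oldKek, newKek, wrap(oldKek, dataKey));
        boolean same = Arrays.equals(dataKey.getEncoded(), unwrap(newKek, rewrapped).getEncoded());
        System.out.println("data key unchanged after rotation: " + same);
    }
}
```

Under this scheme a rotation only rewrites the small wrapped-key record, so the encrypted data itself does not need to be rewritten.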

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
I really believe we likely need a CEP for this. This gets complicated pretty fast with all the details attached and I do not want to have endless discussions about this in the ticket. I can clearly see this is something a broader audience needs to vote on eventually. On Tue, 16 Nov 2021 at

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-16 Thread Stefan Miklosovic
Hi Bowen, Very interesting idea indeed. So if I got it right, the key used for the actual sstable encryption would always be the same; only its wrapped form would differ. So if we rotate, we basically only change Km hence KEK hence the result of wrapping but there would still be the original

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-15 Thread J. D. Jordan
Another comment here. I tried to find the patch to check but couldn’t find it linked to the ticket. If it is not already, given the TDE key class is pluggable in the yaml, when a file is written everything needed to instantiate the class to decrypt it should be in the metadata. Just like happens

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-15 Thread Bowen Song
The second question is about key rotation. If an operator needs to roll the key because it was compromised or there is some policy around that, we should be able to provide some way to rotate it. Our idea is to write a tool (either a subcommand of nodetool (rewritesstables)

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-15 Thread bened...@apache.org
November 2021 at 22:09 To: dev@cassandra.apache.org Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption > On Nov 15, 2021, at 2:25 PM, Stefan Miklosovic > wrote: > > On Mon, 15 Nov 2021 at 19:42, Jeremiah D Jordan > mailto:jeremiah.jor...@gmail.com>> wrote: >>

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-15 Thread Jeremiah D Jordan
> On Nov 15, 2021, at 2:25 PM, Stefan Miklosovic > wrote: > > On Mon, 15 Nov 2021 at 19:42, Jeremiah D Jordan > mailto:jeremiah.jor...@gmail.com>> wrote: >> >> >> >>> On Nov 14, 2021, at 3:53 PM, Stefan Miklosovic >>> wrote: >>> >>> Hey, >>> >>> there are two points we are not

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-15 Thread Stefan Miklosovic
On Mon, 15 Nov 2021 at 19:42, Jeremiah D Jordan wrote: > > > > > On Nov 14, 2021, at 3:53 PM, Stefan Miklosovic > > wrote: > > > > Hey, > > > > there are two points we are not completely sure about. > > > > The first one is streaming. If there is a cluster of 5 nodes, each > > node has its own

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-15 Thread Jeremiah D Jordan
> On Nov 14, 2021, at 3:53 PM, Stefan Miklosovic > wrote: > > Hey, > > there are two points we are not completely sure about. > > The first one is streaming. If there is a cluster of 5 nodes, each > node has its own unique encryption key. Hence, if a SSTable is stored > on a disk with the

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-14 Thread Stefan Miklosovic
Hey, there are two points we are not completely sure about. The first one is streaming. If there is a cluster of 5 nodes, each node has its own unique encryption key. Hence, if a SSTable is stored on a disk with the key for node 1 and this is streamed to node 2 - which has a different key - it

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-13 Thread scott
Same reaction here - great to have traction on this ticket. Shylaja, thanks for your work on this and to Stefan as well! It would be wonderful to have the feature complete. One thing I’d mention is that a lot’s changed about the project’s testing strategy since the original patch was written.

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-13 Thread Brandon Williams
We already have a ticket and this predated CEPs, and being an obviously good improvement to have that many have been asking for for some time now, I don't see the need for a CEP here. On Sat, Nov 13, 2021 at 5:01 AM Stefan Miklosovic wrote: > > Hi list, > > an engineer from Intel - Shylaja

Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-13 Thread Stefan Miklosovic
Hi list, an engineer from Intel - Shylaja Kokoori (who is watching this list closely) has ported the original CASSANDRA-9633 code, written in the 3.4 era, to the current trunk, with my help here and there, mostly cosmetic. I would like to know if there is a general consensus about me going