Re: [DISCUSS] Mentoring newcomers

2021-11-19 Thread Mick Semb Wever
also me On Thu, 18 Nov 2021 at 17:39, Caleb Rackliffe wrote: > You have my bow. > > On Fri, Nov 12, 2021 at 11:05 AM Benjamin Lerer wrote: > > > Hi everybody > > > > As discussed in the *Creating a new slack channel for newcomers* thread, a > > solution to help newcomers engage with the project

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Stefan Miklosovic
On Fri, 19 Nov 2021 at 03:03, Joseph Lynch wrote: > > > > > I've seen this be a significant obstacle for people who want to adopt > > Apache Cassandra many times and an insurmountable obstacle on multiple > > occasions. From what I've seen, I think this is one of the most watched > > tickets with

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Stefan Miklosovic
On Fri, 19 Nov 2021 at 02:51, Joseph Lynch wrote: > > On Thu, Nov 18, 2021 at 7:23 PM Kokoori, Shylaja > wrote: > > > To address Joey's concern, the OpenJDK JVM and its derivatives optimize > > Java crypto based on the underlying HW capabilities. For example, if the > > underlying HW supports

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Bowen Song
Sorry, but IMHO setting performance requirements in this regard is nonsense. As long as it's reasonably usable in the real world, and Cassandra makes the estimated effects on performance available, it will be up to the operators to decide whether to turn on the feature. It's a trade off between

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Jeff Jirsa
For better or worse, different threat models mean that it’s not strictly better to do FDE and some use cases definitely want this at the db layer instead of file system. > On Nov 19, 2021, at 12:54 PM, Joshua McKenzie wrote: > >> setting performance requirements on this regard

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> > Yes, this needs to be done. The credentials for this stuff should be > just fetched from wherever one wants. 100% agree with that and that > maybe next iteration on top of that, should be rather easy. This was > done in CEP-9 already for SSL context creation so we would just copy > that

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Derek Chen-Becker
https://bugs.openjdk.java.net/browse/JDK-7184394 added AES intrinsics in Java 8, in 2012. While it's always possible to have a regression, and it's important to understand the performance impact, stories of 2-10x sound apocryphal. If they're all using the same intrinsics, the performance should be

Re: [DISCUSS] CASSANDRA-17024: Artificial Latency Injection

2021-11-19 Thread bened...@apache.org
To resurrect this discussion briefly, does anyone have a preference for either CQL Grammar or Protocol support? This originally felt to me like something we might want to support at the native protocol level, however that creates a dependency on specific clients and the feature might

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joshua McKenzie
> > setting performance requirements on this regard is a > nonsense. As long as it's reasonably usable in real world, and Cassandra > makes the estimated effects on performance available, it will be up to > the operators to decide whether to turn on the feature I think Joey's argument, and

Re: [VOTE] CEP-14: Paxos Improvements

2021-11-19 Thread bened...@apache.org
For those who are interested, this work has been posted as CASSANDRA-17164. From: bened...@apache.org Date: Wednesday, 1 September 2021 at 05:29 To: dev@cassandra.apache.org Subject: Re: [VOTE] CEP-14: Paxos Improvements With 10 +1 votes and no -1 votes, the vote passes. Thanks everyone! From:

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Bowen Song
On the performance note, I copy & pasted a small piece of Java code to do AES256-CBC on the stdin and write the result to stdout. I then ran the following two commands on the same machine (with AES-NI) for comparison: $ dd if=/dev/zero bs=4096 count=$((4*1024*1024)) status=none | time

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> For better or worse, different threat models mean that it’s not strictly > better to do FDE and some use cases definitely want this at the db layer > instead of file system. Do you mind elaborating which threat models? The only one I can think of is users can log onto the database machine and

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> > Are you for real here? Nobody will ever guarantee you these %1 numbers > ... come on. I think we are > super paranoid about performance when we are not paranoid enough about > security. This is a two way street. > People are willing to give up on performance if security is a must. > I am for

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
> I think Joey's argument, and correct me if I'm wrong, is that implementing > a complex feature in Cassandra that we then have to manage that's > essentially worse in every way compared to a built-in full-disk encryption > option via LUKS+LVM etc is a poor use of our time and energy. > > i.e.

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Derek Chen-Becker
Thanks, that's really helpful to have some code to look at! Derek On Fri, Nov 19, 2021 at 9:35 AM Joseph Lynch wrote: > On Fri, Nov 19, 2021 at 9:52 AM Derek Chen-Becker > wrote: > > > > https://bugs.openjdk.java.net/browse/JDK-7184394 added AES intrinsics in > > Java 8, in 2012. While it's

Re: [DISCUSS] CASSANDRA-17024: Artificial Latency Injection

2021-11-19 Thread Abe Ratnofsky
I like the idea of adding this to the CQL Grammar, but would like to see it follow the ReplicationStrategy style of defining a map with a class and parameters. For example, something like this (names I’m not tied to): SELECT * FROM table WHERE pk = x WITH ARTIFICIAL LATENCY = { 'class':

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Jeff Jirsa
> On Nov 19, 2021, at 2:53 PM, Joseph Lynch wrote: > >> For better or worse, different threat models mean that it’s not strictly >> better to do FDE and some use cases definitely want this at the db layer >> instead of file system. > > Do you mind elaborating which threat models?

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Joseph Lynch
On Fri, Nov 19, 2021 at 9:52 AM Derek Chen-Becker wrote: > > https://bugs.openjdk.java.net/browse/JDK-7184394 added AES intrinsics in > Java 8, in 2012. While it's always possible to have a regression, and it's > important to understand the performance impact, stories of 2-10x sound > apocryphal.

Re: [DISCUSS] CASSANDRA-17024: Artificial Latency Injection

2021-11-19 Thread Jeremiah D Jordan
If it is per query, then I would think protocol level might be easier to “test” a given application with. Rather than having to append "WITH ADDITIONAL LATENCY” to all your queries, you just set some option in your query based object or such. We already have support at the protocol level for

Re: [DISCUSS] CASSANDRA-17024: Artificial Latency Injection

2021-11-19 Thread Jon Meredith
Wouldn't modifying the CQL grammar require updating the application under test to perform experimentation? The other thing I was wondering about is extensibility - for example you would like to add a percentage chance for dropping messages for more deterministic overload modeling. I can see

[DISCUSS] Nested YAML configs for new features

2021-11-19 Thread David Capwell
This has been brought up in a few tickets, so pushing to the dev list. CASSANDRA-15234 - Standardise config and JVM parameters CASSANDRA-16896 - hard/soft limits for queries CASSANDRA-17147 - Guardrails prototype In short, do we as a project wish to move "new features" into nested YAML when the

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread David Capwell
In org.apache.cassandra.config.YamlConfigurationLoader (and anything working on translation of configs to flat structures), we can detect this pattern and recursively get the field (similar to walking directories); main change would be in
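The recursive walk described above can be sketched roughly as follows. This is only an illustration in Python, not the YamlConfigurationLoader code: the function name `flatten`, the dotted key separator, and the placement of `coordinator_read_size` under `track_warnings` are all assumptions made for the example.

```python
from typing import Any, Dict


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Walk a nested mapping (like walking directories) and return flat dotted keys."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        # Build the full path of this key, e.g. "parent.child".
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A nested section: recurse into it, carrying the path so far.
            flat.update(flatten(value, path))
        else:
            # A leaf value: record it under its full dotted path.
            flat[path] = value
    return flat


# Hypothetical nested layout; the key names come from this thread,
# but the exact nesting is an assumption for illustration only.
nested = {
    "track_warnings": {
        "coordinator_read_size": {
            "warn_threshold_kb": 0,
            "abort_threshold_kb": 0,
        },
    },
}

flat = flatten(nested)
for name in sorted(flat):
    print(f"{name}: {flat[name]}")
```

Each flattened entry prints on a single line, so a plain grep for a key name would also show its value and its full nesting path.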

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread Caleb Rackliffe
If it's nested, "track_warnings" would still work if you're grepping around vim or less. I'd have to concede the point about grep output, although there are tools like https://github.com/kislyuk/yq that could probably be bent to do what you want. On Fri, Nov 19, 2021 at 1:08 PM Stefan Miklosovic

RE: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Kokoori, Shylaja
I agree with Joey, the kernel should also be able to take advantage of the crypto acceleration. I also want to add, since performance of the JDK is a concern here, newer Intel Icelake server platforms support VAES and SHA-NI, which further accelerate AES-GCM perf by 2x and SHA1 perf by ~6x using JDK

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread Caleb Rackliffe
I'm on record as early as the comments in CASSANDRA-15234 in support of nesting, and I think the biggest reason is that the structure it forces on our config makes it more cohesive and intelligible to those trying to understand how major features and subsystems work together. It's very easy to

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread David Capwell
> it is really handy to grep > cassandra.yaml on some config key and you know the value instantly. You can still do that $ grep -A2 coordinator_read_size conf/cassandra.yaml # coordinator_read_size: # warn_threshold_kb: 0 # abort_threshold_kb: 0 I was also arguing we should

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread Bowen Song
I'm with Stefan. I prefer the flat YAML file, which I can easily grep to check and confirm the settings on a large number of servers with parallel-ssh. This will be very hard to do on nested config in a YAML file. In addition to that, I also use grep in the Cassandra source code to locate

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread Jacek Lewandowski
With the flat structure it turns into properties file - would it be possible to support both formats - nested yaml and flat properties? Jacek Lewandowski On Fri, Nov 19, 2021 at 10:08 PM Caleb Rackliffe wrote: > If it's nested, "track_warnings" would

Re: [DISCUSS] Nested YAML configs for new features

2021-11-19 Thread David Capwell
> With the flat structure it turns into properties file - would it be > possible to support both formats - nested yaml and flat properties? For the majority of our configs yes, but there is a subset where flat properties are annoying: hinted_handoff_disabled_datacenters - set type, so you could do

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Maulin Vasavada
Basically, we also have to think about how operable these changes will be for operators in multi-tenant, multi-cluster/dc environments w.r.t. key rotations, security, key deployments etc. On Fri, Nov 19, 2021 at 8:03 PM Maulin Vasavada wrote: > Hi all > > Really interesting discussion. I

Re: Resurrection of CASSANDRA-9633 - SSTable encryption

2021-11-19 Thread Maulin Vasavada
Hi all Really interesting discussion. I started reading this thread and still have a lot to catch up on, but based on my experience many big organizations ultimately settle on having over-the-wire encryption combined with OS/disk encryption to comply with the security requirements for various reasons