Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-18 Thread Tolbert, Andy
In the context of what I think initially motivated this hot
reloading capability, a big win it provides is avoiding having to
bounce your cluster as your certificates near expiry.  If not watched
closely, you can get into a state where every node in the
cluster's cert has expired, which is effectively an outage.
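A cheap guardrail against that failure mode is monitoring certificate expiry directly. A minimal sketch using the openssl CLI (the function name and file path are mine, purely illustrative):

```shell
# Fail (non-zero exit) if a certificate expires within the given number
# of days. Assumes the openssl CLI is available; paths are illustrative.
check_expiry() {
  local cert="$1" min_days="$2"
  # -checkend N exits non-zero if the cert expires within N seconds
  openssl x509 -in "$cert" -noout -checkend "$(( min_days * 86400 ))"
}

# e.g. from cron/monitoring on each node:
# check_expiry /etc/cassandra/conf/node-cert.pem 30 || page-the-oncall
```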

I see the appeal of draining connections on a change of trust,
although the necessity of being able to "do it live" (as opposed to
doing a bounce) seems less important than avoiding the outage
condition of your certificates expiring, especially since you can sort
of already do this without bouncing by toggling nodetool
disablebinary/enablebinary.  I agree with Dinesh that most operators
would prefer that it does not do that as interrupting connections can
be disruptive to applications if they don't have retries configured,
but I also agree it'd be a nice improvement to support draining
existing connections in some way.

+1 on the idea of having a "timed connection" capability brought up
here, implemented in a way that lets connection lifetimes be
dynamically adjusted.  That way, on a truststore change, Cassandra
could simply shorten the connection lifetimes so that connections are
disconnected immediately or drained over a time period, as Josh
proposed.
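Not Cassandra code, but the shape of that idea might look something like this sketch (all names are invented for illustration):

```python
# Sketch of the "timed connection" idea: connection lifetimes are
# dynamically adjustable, so on a truststore change the server can
# shrink the max lifetime and drain pre-reload connections either
# immediately or over a window.
import time


class ConnectionRegistry:
    def __init__(self, max_lifetime_s):
        self.max_lifetime_s = max_lifetime_s
        self._opened = {}  # connection id -> timestamp it was opened

    def register(self, conn_id, now=None):
        self._opened[conn_id] = time.monotonic() if now is None else now

    def set_max_lifetime(self, seconds):
        # e.g. call with a small value right after a truststore reload
        # so pre-reload connections drain over that window
        self.max_lifetime_s = seconds

    def expired(self, now=None):
        # connections that outlived the (possibly just-shrunk) lifetime;
        # the server would close or drain these
        now = time.monotonic() if now is None else now
        return [c for c, t in self._opened.items()
                if now - t > self.max_lifetime_s]
```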

Thanks,
Andy


Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-18 Thread Josh McKenzie
I think it's all part of the same issue and you're not derailing IMO Abe. For 
the user Pabbireddy here, the unexpected behavior was that internode connections 
were not closed on that keystore refresh. So ISTM, from a "featureset that would be 
nice to have here" perspective, we could theoretically provide:
 1. An option to disconnect all connections on cert update, disabled by default
 2. An option to drain and recycle connections on a time period, also disabled 
by default
Leave the current behavior in place but allow for these kinds of strong 
cert guarantees if a user needs them in their environment.
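To make the two options concrete, here is a purely hypothetical cassandra.yaml sketch; none of these knobs exist today and the names are invented for illustration only:

```yaml
server_encryption_options:
    # option 1: hard-disconnect all connections when a new store loads
    disconnect_on_cert_update: false          # disabled by default
    # option 2: drain and recycle connections over a window instead
    recycle_connections_on_cert_update: false # also disabled by default
    recycle_connections_window: 1h
```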

On Mon, Apr 15, 2024, at 9:51 PM, Abe Ratnofsky wrote:
> Not to derail from the original conversation too far, but wanted to agree 
> that maximum connection establishment time on native transport would be 
> useful. That would provide a maximum duration before an updated client 
> keystore is used for connections, which can be used to safely roll out client 
> keystore updates.
> 
> For example, if the maximum connection establishment time is 12 hours, then 
> you can update the keystore on a canary client, wait 24 hours, confirm that 
> connectivity is maintained, then upgrade keystores across the rest of the 
> fleet.
> 
> With unbounded connection establishment, reconnection isn't tested as often 
> and issues can hide behind long-lived connections.
> 
>> On Apr 15, 2024, at 5:14 PM, Jeff Jirsa  wrote:
>> 
>> It seems like if folks really want the life of a connection to be finite 
>> (either client/server or server/server), adding in an option to quietly 
>> drain and recycle a connection on some period isn’t that difficult.
>> 
>> That type of requirement shows up in a number of environments, usually on 
>> interactive logins (cqlsh, login, walk away, the connection needs to become 
>> invalid in a short and finite period of time), but adding it to internode 
>> could also be done, and may help in some weird situations (if you changed 
>> certs because you believe a key/cert is compromised, having the connection 
>> remain active is decidedly inconvenient, so maybe it does make sense to add 
>> an expiration timer/condition on each connection).
>> 
>> 
>> 
>>> On Apr 15, 2024, at 12:28 PM, Dinesh Joshi  wrote:
>>> 
>>> In addition to what Andy mentioned, I want to point out that for the vast 
>>> majority of use-cases, we would like to _avoid_ interruptions when a 
>>> certificate is updated so it is by design. If you're dealing with a 
>>> situation where you want to ensure that the connections are cycled, you can 
>>> follow Andy's advice. It will require automation outside the database that 
>>> you might already have. If there is demand, we can consider adding a 
>>> feature to slowly cycle the connections so the old SSL context is not used 
>>> anymore.
>>> 
>>> One more thing you should bear in mind is that Cassandra will not load the 
>>> new SSL context if it cannot successfully initialize it. This is again by 
>>> design to prevent an outage when the updated truststore is corrupted or 
>>> could not be read in some way.
>>> 
>>> thanks,
>>> Dinesh
>>> 
>>> On Mon, Apr 15, 2024 at 9:45 AM Tolbert, Andy  
>>> wrote:
 I should mention, when toggling disablebinary/enablebinary between
 instances, you will probably want to give some time between doing this
 so connections can reestablish, and you will want to verify that the
 connections can actually reestablish.  You also need to be mindful of
 this being disruptive to inflight queries (if your client is
 configured for retries it will probably be fine).  Semantically to
 your applications it should look a lot like a rolling cluster bounce.
 
 Thanks,
 Andy
 
 On Mon, Apr 15, 2024 at 11:39 AM pabbireddy avinash
  wrote:
 >
> Thanks, Andy, for your reply. We will test the scenario you mentioned.
 >
 > Regards
 > Avinash
 >
 > On Mon, Apr 15, 2024 at 11:28 AM, Tolbert, Andy  
 > wrote:
 >>
 >> Hi Avinash,
 >>
>> As far as I understand it, if the underlying keystore/truststore(s)
 >> Cassandra is configured for is updated, this *will not* provoke
 >> Cassandra to interrupt existing connections, it's just that the new
 >> stores will be used for future TLS initialization.
 >>
 >> Via: 
 >> https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading
 >>
 >> > When the files are updated, Cassandra will reload them and use them 
 >> > for subsequent connections
 >>
 >> I suppose one could do a rolling disablebinary/enablebinary (if it's
 >> only client connections) after you roll out a keystore/truststore
 >> change as a way of enforcing the existing connections to reestablish.
 >>
 >> Thanks,
 >> Andy
 >>
 >>
 >> On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
 >>  wrote:
 >> >
 >> > Dear Community,
 >> >
 >> > I hope this email 

Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-15 Thread Abe Ratnofsky
Not to derail from the original conversation too far, but wanted to agree that 
maximum connection establishment time on native transport would be useful. That 
would provide a maximum duration before an updated client keystore is used for 
connections, which can be used to safely roll out client keystore updates.

For example, if the maximum connection establishment time is 12 hours, then you 
can update the keystore on a canary client, wait 24 hours, confirm that 
connectivity is maintained, then upgrade keystores across the rest of the fleet.

With unbounded connection establishment, reconnection isn't tested as often and 
issues can hide behind long-lived connections.

> On Apr 15, 2024, at 5:14 PM, Jeff Jirsa  wrote:
> 
> It seems like if folks really want the life of a connection to be finite 
> (either client/server or server/server), adding in an option to quietly drain 
> and recycle a connection on some period isn’t that difficult.
> 
> That type of requirement shows up in a number of environments, usually on 
> interactive logins (cqlsh, login, walk away, the connection needs to become 
> invalid in a short and finite period of time), but adding it to internode 
> could also be done, and may help in some weird situations (if you changed 
> certs because you believe a key/cert is compromised, having the connection 
> remain active is decidedly inconvenient, so maybe it does make sense to add 
> an expiration timer/condition on each connection).
> 
> 
> 
>> On Apr 15, 2024, at 12:28 PM, Dinesh Joshi  wrote:
>> 
>> In addition to what Andy mentioned, I want to point out that for the vast 
>> majority of use-cases, we would like to _avoid_ interruptions when a 
>> certificate is updated so it is by design. If you're dealing with a 
>> situation where you want to ensure that the connections are cycled, you can 
>> follow Andy's advice. It will require automation outside the database that 
>> you might already have. If there is demand, we can consider adding a feature 
>> to slowly cycle the connections so the old SSL context is not used anymore.
>> 
>> One more thing you should bear in mind is that Cassandra will not load the 
>> new SSL context if it cannot successfully initialize it. This is again by 
>> design to prevent an outage when the updated truststore is corrupted or 
>> could not be read in some way.
>> 
>> thanks,
>> Dinesh
>> 
>> On Mon, Apr 15, 2024 at 9:45 AM Tolbert, Andy wrote:
>>> I should mention, when toggling disablebinary/enablebinary between
>>> instances, you will probably want to give some time between doing this
>>> so connections can reestablish, and you will want to verify that the
>>> connections can actually reestablish.  You also need to be mindful of
>>> this being disruptive to inflight queries (if your client is
>>> configured for retries it will probably be fine).  Semantically to
>>> your applications it should look a lot like a rolling cluster bounce.
>>> 
>>> Thanks,
>>> Andy
>>> 
>>> On Mon, Apr 15, 2024 at 11:39 AM pabbireddy avinash
>>> wrote:
>>> >
>>> > Thanks, Andy, for your reply. We will test the scenario you mentioned.
>>> >
>>> > Regards
>>> > Avinash
>>> >
>>> > On Mon, Apr 15, 2024 at 11:28 AM, Tolbert, Andy >> > > wrote:
>>> >>
>>> >> Hi Avinash,
>>> >>
>>> >> As far as I understand it, if the underlying keystore/truststore(s)
>>> >> Cassandra is configured for is updated, this *will not* provoke
>>> >> Cassandra to interrupt existing connections, it's just that the new
>>> >> stores will be used for future TLS initialization.
>>> >>
>>> >> Via: 
>>> >> https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading
>>> >>
>>> >> > When the files are updated, Cassandra will reload them and use them 
>>> >> > for subsequent connections
>>> >>
>>> >> I suppose one could do a rolling disablebinary/enablebinary (if it's
>>> >> only client connections) after you roll out a keystore/truststore
>>> >> change as a way of enforcing the existing connections to reestablish.
>>> >>
>>> >> Thanks,
>>> >> Andy
>>> >>
>>> >>
>>> >> On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
>>> >> wrote:
>>> >> >
>>> >> > Dear Community,
>>> >> >
>>> >> > I hope this email finds you well. I am currently testing SSL 
>>> >> > certificate hot reloading on a Cassandra cluster running version 4.1 
>>> >> > and encountered a situation that requires your expertise.
>>> >> >
>>> >> > Here's a summary of the process and issue:
>>> >> >
>>> >> > Reloading Process: We reloaded certificates signed by our in-house 
>>> >> > certificate authority into the cluster, which was initially running 
>>> >> > with self-signed certificates. The reload was done node by node.
>>> >> >
>>> >> > Truststore and Keystore: The truststore and keystore passwords are the 
>>> >> > same across the cluster.
>>> >> >
>>> >> > 

Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-15 Thread Jeff Jirsa
It seems like if folks really want the life of a connection to be finite 
(either client/server or server/server), adding in an option to quietly drain 
and recycle a connection on some period isn’t that difficult.

That type of requirement shows up in a number of environments, usually on 
interactive logins (cqlsh, login, walk away, the connection needs to become 
invalid in a short and finite period of time), but adding it to internode could 
also be done, and may help in some weird situations (if you changed certs 
because you believe a key/cert is compromised, having the connection remain 
active is decidedly inconvenient, so maybe it does make sense to add an 
expiration timer/condition on each connection).



> On Apr 15, 2024, at 12:28 PM, Dinesh Joshi  wrote:
> 
> In addition to what Andy mentioned, I want to point out that for the vast 
> majority of use-cases, we would like to _avoid_ interruptions when a 
> certificate is updated so it is by design. If you're dealing with a situation 
> where you want to ensure that the connections are cycled, you can follow 
> Andy's advice. It will require automation outside the database that you might 
> already have. If there is demand, we can consider adding a feature to slowly 
> cycle the connections so the old SSL context is not used anymore.
> 
> One more thing you should bear in mind is that Cassandra will not load the 
> new SSL context if it cannot successfully initialize it. This is again by 
> design to prevent an outage when the updated truststore is corrupted or could 
> not be read in some way.
> 
> thanks,
> Dinesh
> 
> On Mon, Apr 15, 2024 at 9:45 AM Tolbert, Andy wrote:
>> I should mention, when toggling disablebinary/enablebinary between
>> instances, you will probably want to give some time between doing this
>> so connections can reestablish, and you will want to verify that the
>> connections can actually reestablish.  You also need to be mindful of
>> this being disruptive to inflight queries (if your client is
>> configured for retries it will probably be fine).  Semantically to
>> your applications it should look a lot like a rolling cluster bounce.
>> 
>> Thanks,
>> Andy
>> 
>> On Mon, Apr 15, 2024 at 11:39 AM pabbireddy avinash
>> wrote:
>> >
>> > Thanks, Andy, for your reply. We will test the scenario you mentioned.
>> >
>> > Regards
>> > Avinash
>> >
>> > On Mon, Apr 15, 2024 at 11:28 AM, Tolbert, Andy > > > wrote:
>> >>
>> >> Hi Avinash,
>> >>
>> >> As far as I understand it, if the underlying keystore/truststore(s)
>> >> Cassandra is configured for is updated, this *will not* provoke
>> >> Cassandra to interrupt existing connections, it's just that the new
>> >> stores will be used for future TLS initialization.
>> >>
>> >> Via: 
>> >> https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading
>> >>
>> >> > When the files are updated, Cassandra will reload them and use them for 
>> >> > subsequent connections
>> >>
>> >> I suppose one could do a rolling disablebinary/enablebinary (if it's
>> >> only client connections) after you roll out a keystore/truststore
>> >> change as a way of enforcing the existing connections to reestablish.
>> >>
>> >> Thanks,
>> >> Andy
>> >>
>> >>
>> >> On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
>> >> wrote:
>> >> >
>> >> > Dear Community,
>> >> >
>> >> > I hope this email finds you well. I am currently testing SSL 
>> >> > certificate hot reloading on a Cassandra cluster running version 4.1 
>> >> > and encountered a situation that requires your expertise.
>> >> >
>> >> > Here's a summary of the process and issue:
>> >> >
>> >> > Reloading Process: We reloaded certificates signed by our in-house 
>> >> > certificate authority into the cluster, which was initially running 
>> >> > with self-signed certificates. The reload was done node by node.
>> >> >
>> >> > Truststore and Keystore: The truststore and keystore passwords are the 
>> >> > same across the cluster.
>> >> >
>> >> > Unexpected Behavior: Despite the different truststore configurations 
>> >> > for the self-signed and new CA certificates, we observed no breakdown 
>> >> > in server-to-server communication during the reload. We did not upload 
>> >> > the new CA cert into the old truststore. We anticipated interruptions 
>> >> > due to the differing truststore configurations but did not encounter 
>> >> > any.
>> >> >
>> >> > Post-Reload Changes: After reloading, we updated the cqlshrc file with 
>> >> > the new CA certificate and key to connect to cqlsh.
>> >> >
>> >> > server_encryption_options:
>> >> >   internode_encryption: all
>> >> >   keystore: ~/conf/server-keystore.jks
>> >> >   keystore_password: 
>> >> >   truststore: ~/conf/server-truststore.jks
>> >> >   truststore_password: 
>> 

Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-15 Thread Dinesh Joshi
In addition to what Andy mentioned, I want to point out that for the vast
majority of use-cases, we would like to _avoid_ interruptions when a
certificate is updated so it is by design. If you're dealing with a
situation where you want to ensure that the connections are cycled, you can
follow Andy's advice. It will require automation outside the database that
you might already have. If there is demand, we can consider adding a
feature to slowly cycle the connections so the old SSL context is not used
anymore.

One more thing you should bear in mind is that Cassandra will not load the
new SSL context if it cannot successfully initialize it. This is again by
design to prevent an outage when the updated truststore is corrupted or
could not be read in some way.
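That validate-before-swap behavior can be sketched in Python (not Cassandra's actual Java; function and parameter names are mine):

```python
# Build the new SSL context first, and keep serving with the old one if
# the updated files are corrupted or unreadable, avoiding an outage.
import ssl


def reload_context(current, certfile, keyfile):
    """Return a fresh SSLContext, or `current` if the new files fail to load."""
    try:
        fresh = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        fresh.load_cert_chain(certfile, keyfile)  # raises on bad/missing files
        return fresh
    except (OSError, ssl.SSLError):
        return current  # initialization failed: keep the old context
```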

thanks,
Dinesh

On Mon, Apr 15, 2024 at 9:45 AM Tolbert, Andy  wrote:

> I should mention, when toggling disablebinary/enablebinary between
> instances, you will probably want to give some time between doing this
> so connections can reestablish, and you will want to verify that the
> connections can actually reestablish.  You also need to be mindful of
> this being disruptive to inflight queries (if your client is
> configured for retries it will probably be fine).  Semantically to
> your applications it should look a lot like a rolling cluster bounce.
>
> Thanks,
> Andy
>
> On Mon, Apr 15, 2024 at 11:39 AM pabbireddy avinash
>  wrote:
> >
> > Thanks, Andy, for your reply. We will test the scenario you mentioned.
> >
> > Regards
> > Avinash
> >
> > On Mon, Apr 15, 2024 at 11:28 AM, Tolbert, Andy 
> wrote:
> >>
> >> Hi Avinash,
> >>
> >> As far as I understand it, if the underlying keystore/truststore(s)
> >> Cassandra is configured for is updated, this *will not* provoke
> >> Cassandra to interrupt existing connections, it's just that the new
> >> stores will be used for future TLS initialization.
> >>
> >> Via:
> https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading
> >>
> >> > When the files are updated, Cassandra will reload them and use them
> for subsequent connections
> >>
> >> I suppose one could do a rolling disablebinary/enablebinary (if it's
> >> only client connections) after you roll out a keystore/truststore
> >> change as a way of enforcing the existing connections to reestablish.
> >>
> >> Thanks,
> >> Andy
> >>
> >>
> >> On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
> >>  wrote:
> >> >
> >> > Dear Community,
> >> >
> >> > I hope this email finds you well. I am currently testing SSL
> certificate hot reloading on a Cassandra cluster running version 4.1 and
> encountered a situation that requires your expertise.
> >> >
> >> > Here's a summary of the process and issue:
> >> >
> >> > Reloading Process: We reloaded certificates signed by our in-house
> certificate authority into the cluster, which was initially running with
> self-signed certificates. The reload was done node by node.
> >> >
> >> > Truststore and Keystore: The truststore and keystore passwords are
> the same across the cluster.
> >> >
> >> > Unexpected Behavior: Despite the different truststore configurations
> for the self-signed and new CA certificates, we observed no breakdown in
> server-to-server communication during the reload. We did not upload the new
> CA cert into the old truststore. We anticipated interruptions due to the
> differing truststore configurations but did not encounter any.
> >> >
> >> > Post-Reload Changes: After reloading, we updated the cqlshrc file
> with the new CA certificate and key to connect to cqlsh.
> >> >
> >> > server_encryption_options:
> >> >   internode_encryption: all
> >> >   keystore: ~/conf/server-keystore.jks
> >> >   keystore_password: 
> >> >   truststore: ~/conf/server-truststore.jks
> >> >   truststore_password: 
> >> >   protocol: TLS
> >> >   algorithm: SunX509
> >> >   store_type: JKS
> >> >   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
> >> >   require_client_auth: true
> >> >
> >> > client_encryption_options:
> >> >   enabled: true
> >> >   keystore: ~/conf/server-keystore.jks
> >> >   keystore_password: 
> >> >   require_client_auth: true
> >> >   truststore: ~/conf/server-truststore.jks
> >> >   truststore_password: 
> >> >   protocol: TLS
> >> >   algorithm: SunX509
> >> >   store_type: JKS
> >> >   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
> >> >
> >> > Given this situation, I have the following questions:
> >> >
> >> > Could there be a reason for the continuity of server-to-server
> communication despite the different truststores?
> >> > Is there a possibility that the old truststore remains cached even
> after reloading the certificates on a node?
> >> > Have others encountered similar issues, and if so, what were your
> solutions?
> >> >
> >> > Any insights or 

Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-15 Thread Tolbert, Andy
I should mention, when toggling disablebinary/enablebinary between
instances, you will probably want to give some time between doing this
so connections can reestablish, and you will want to verify that the
connections can actually reestablish.  You also need to be mindful of
this being disruptive to inflight queries (if your client is
configured for retries it will probably be fine).  Semantically to
your applications it should look a lot like a rolling cluster bounce.
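An ops-side sketch of that rolling procedure (not an official recipe: the host list, ssh invocation, and the clientstats check are illustrative; "$run" is an injectable command prefix so it can be dry-run):

```shell
# Roll disablebinary/enablebinary across nodes, pausing between steps
# so client connections can reestablish before the next node.
rolling_bounce() {
  local run="$1"; shift
  local pause="${BOUNCE_PAUSE:-30}"   # seconds to wait between steps
  for host in "$@"; do
    $run ssh "$host" nodetool disablebinary
    sleep "$pause"                    # let clients fail over to other nodes
    $run ssh "$host" nodetool enablebinary
    # verify connections actually re-established before the next node,
    # e.g. via "nodetool clientstats" or driver-side metrics
    sleep "$pause"
  done
}
```

Running `rolling_bounce echo node1 node2` first previews the commands without executing them.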

Thanks,
Andy

On Mon, Apr 15, 2024 at 11:39 AM pabbireddy avinash
 wrote:
>
> Thanks, Andy, for your reply. We will test the scenario you mentioned.
>
> Regards
> Avinash
>
> On Mon, Apr 15, 2024 at 11:28 AM, Tolbert, Andy  
> wrote:
>>
>> Hi Avinash,
>>
>> As far as I understand it, if the underlying keystore/truststore(s)
>> Cassandra is configured for is updated, this *will not* provoke
>> Cassandra to interrupt existing connections, it's just that the new
>> stores will be used for future TLS initialization.
>>
>> Via: 
>> https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading
>>
>> > When the files are updated, Cassandra will reload them and use them for 
>> > subsequent connections
>>
>> I suppose one could do a rolling disablebinary/enablebinary (if it's
>> only client connections) after you roll out a keystore/truststore
>> change as a way of enforcing the existing connections to reestablish.
>>
>> Thanks,
>> Andy
>>
>>
>> On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
>>  wrote:
>> >
>> > Dear Community,
>> >
>> > I hope this email finds you well. I am currently testing SSL certificate 
>> > hot reloading on a Cassandra cluster running version 4.1 and encountered a 
>> > situation that requires your expertise.
>> >
>> > Here's a summary of the process and issue:
>> >
>> > Reloading Process: We reloaded certificates signed by our in-house 
>> > certificate authority into the cluster, which was initially running with 
>> > self-signed certificates. The reload was done node by node.
>> >
>> > Truststore and Keystore: The truststore and keystore passwords are the 
>> > same across the cluster.
>> >
>> > Unexpected Behavior: Despite the different truststore configurations for 
>> > the self-signed and new CA certificates, we observed no breakdown in 
>> > server-to-server communication during the reload. We did not upload the 
>> > new CA cert into the old truststore. We anticipated interruptions due to 
>> > the differing truststore configurations but did not encounter any.
>> >
>> > Post-Reload Changes: After reloading, we updated the cqlshrc file with the 
>> > new CA certificate and key to connect to cqlsh.
>> >
>> > server_encryption_options:
>> >   internode_encryption: all
>> >   keystore: ~/conf/server-keystore.jks
>> >   keystore_password: 
>> >   truststore: ~/conf/server-truststore.jks
>> >   truststore_password: 
>> >   protocol: TLS
>> >   algorithm: SunX509
>> >   store_type: JKS
>> >   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
>> >   require_client_auth: true
>> >
>> > client_encryption_options:
>> >   enabled: true
>> >   keystore: ~/conf/server-keystore.jks
>> >   keystore_password: 
>> >   require_client_auth: true
>> >   truststore: ~/conf/server-truststore.jks
>> >   truststore_password: 
>> >   protocol: TLS
>> >   algorithm: SunX509
>> >   store_type: JKS
>> >   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
>> >
>> > Given this situation, I have the following questions:
>> >
>> > Could there be a reason for the continuity of server-to-server 
>> > communication despite the different truststores?
>> > Is there a possibility that the old truststore remains cached even after 
>> > reloading the certificates on a node?
>> > Have others encountered similar issues, and if so, what were your 
>> > solutions?
>> >
>> > Any insights or suggestions would be greatly appreciated. Please let me 
>> > know if further information is needed.
>> >
>> > Thank you
>> >
>> > Best regards,
>> >
>> > Avinash


Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-15 Thread pabbireddy avinash
Thanks, Andy, for your reply. We will test the scenario you mentioned.

Regards
Avinash

On Mon, Apr 15, 2024 at 11:28 AM, Tolbert, Andy  wrote:

> Hi Avinash,
>
> As far as I understand it, if the underlying keystore/truststore(s)
> Cassandra is configured for is updated, this *will not* provoke
> Cassandra to interrupt existing connections, it's just that the new
> stores will be used for future TLS initialization.
>
> Via:
> https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading
>
> > When the files are updated, Cassandra will reload them and use them for
> subsequent connections
>
> I suppose one could do a rolling disablebinary/enablebinary (if it's
> only client connections) after you roll out a keystore/truststore
> change as a way of enforcing the existing connections to reestablish.
>
> Thanks,
> Andy
>
>
> On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
>  wrote:
> >
> > Dear Community,
> >
> > I hope this email finds you well. I am currently testing SSL certificate
> hot reloading on a Cassandra cluster running version 4.1 and encountered a
> situation that requires your expertise.
> >
> > Here's a summary of the process and issue:
> >
> > Reloading Process: We reloaded certificates signed by our in-house
> certificate authority into the cluster, which was initially running with
> self-signed certificates. The reload was done node by node.
> >
> > Truststore and Keystore: The truststore and keystore passwords are the
> same across the cluster.
> >
> > Unexpected Behavior: Despite the different truststore configurations for
> the self-signed and new CA certificates, we observed no breakdown in
> server-to-server communication during the reload. We did not upload the new
> CA cert into the old truststore. We anticipated interruptions due to the
> differing truststore configurations but did not encounter any.
> >
> > Post-Reload Changes: After reloading, we updated the cqlshrc file with
> the new CA certificate and key to connect to cqlsh.
> >
> > server_encryption_options:
> >   internode_encryption: all
> >   keystore: ~/conf/server-keystore.jks
> >   keystore_password: 
> >   truststore: ~/conf/server-truststore.jks
> >   truststore_password: 
> >   protocol: TLS
> >   algorithm: SunX509
> >   store_type: JKS
> >   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
> >   require_client_auth: true
> >
> > client_encryption_options:
> >   enabled: true
> >   keystore: ~/conf/server-keystore.jks
> >   keystore_password: 
> >   require_client_auth: true
> >   truststore: ~/conf/server-truststore.jks
> >   truststore_password: 
> >   protocol: TLS
> >   algorithm: SunX509
> >   store_type: JKS
> >   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
> >
> > Given this situation, I have the following questions:
> >
> > Could there be a reason for the continuity of server-to-server
> communication despite the different truststores?
> > Is there a possibility that the old truststore remains cached even after
> reloading the certificates on a node?
> > Have others encountered similar issues, and if so, what were your
> solutions?
> >
> > Any insights or suggestions would be greatly appreciated. Please let me
> know if further information is needed.
> >
> > Thank you
> >
> > Best regards,
> >
> > Avinash
>


Re: ssl certificate hot reloading test - cassandra 4.1

2024-04-15 Thread Tolbert, Andy
Hi Avinash,

As far as I understand it, if the underlying keystore/truststore(s)
Cassandra is configured for is updated, this *will not* provoke
Cassandra to interrupt existing connections, it's just that the new
stores will be used for future TLS initialization.

Via: 
https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html#ssl-certificate-hot-reloading

> When the files are updated, Cassandra will reload them and use them for 
> subsequent connections

I suppose one could do a rolling disablebinary/enablebinary (if it's
only client connections) after you roll out a keystore/truststore
change as a way of enforcing the existing connections to reestablish.

Thanks,
Andy


On Mon, Apr 15, 2024 at 11:11 AM pabbireddy avinash
 wrote:
>
> Dear Community,
>
> I hope this email finds you well. I am currently testing SSL certificate hot 
> reloading on a Cassandra cluster running version 4.1 and encountered a 
> situation that requires your expertise.
>
> Here's a summary of the process and issue:
>
> Reloading Process: We reloaded certificates signed by our in-house 
> certificate authority into the cluster, which was initially running with 
> self-signed certificates. The reload was done node by node.
>
> Truststore and Keystore: The truststore and keystore passwords are the same 
> across the cluster.
>
> Unexpected Behavior: Despite the different truststore configurations for the 
> self-signed and new CA certificates, we observed no breakdown in 
> server-to-server communication during the reload. We did not upload the new 
> CA cert into the old truststore. We anticipated interruptions due to the 
> differing truststore configurations but did not encounter any.
>
> Post-Reload Changes: After reloading, we updated the cqlshrc file with the 
> new CA certificate and key to connect to cqlsh.
>
> server_encryption_options:
>   internode_encryption: all
>   keystore: ~/conf/server-keystore.jks
>   keystore_password: 
>   truststore: ~/conf/server-truststore.jks
>   truststore_password: 
>   protocol: TLS
>   algorithm: SunX509
>   store_type: JKS
>   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
>   require_client_auth: true
>
> client_encryption_options:
>   enabled: true
>   keystore: ~/conf/server-keystore.jks
>   keystore_password: 
>   require_client_auth: true
>   truststore: ~/conf/server-truststore.jks
>   truststore_password: 
>   protocol: TLS
>   algorithm: SunX509
>   store_type: JKS
>   cipher_suites: [TLS_RSA_WITH_AES_256_CBC_SHA]
>
> Given this situation, I have the following questions:
>
> Could there be a reason for the continuity of server-to-server communication 
> despite the different truststores?
> Is there a possibility that the old truststore remains cached even after 
> reloading the certificates on a node?
> Have others encountered similar issues, and if so, what were your solutions?
>
> Any insights or suggestions would be greatly appreciated. Please let me know 
> if further information is needed.
>
> Thank you
>
> Best regards,
>
> Avinash