Which exact version of OpenJDK are you using? Is it possible you don't have
JCE on those nodes? (I believe more recent versions of Java 8 have this
baked in, so that might not be it.)
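
One way to check this on a node is sketched below. It assumes a Java 8 style version string such as 1.8.0_222, and the function name jce_unlimited_by_default is hypothetical. It relies on the unlimited-strength JCE policy being the default from 8u161 onward; older updates would need the policy enabled separately.

```shell
# Hypothetical helper: succeeds (exit 0) when the given Java 8 version string
# is 8u161 or later, where unlimited-strength JCE is enabled by default.
# Returns 2 when the string is not a recognisable Java 8 update string.
jce_unlimited_by_default() {
    case "$1" in
        1.8.0_*) update="${1#1.8.0_}" ;;
        *) return 2 ;;
    esac
    update="${update%%-*}"   # drop any suffix after the update number
    case "$update" in
        ''|*[!0-9]*) return 2 ;;   # update number is not purely numeric
    esac
    [ "$update" -ge 161 ]
}

# On a node (assumes java is on PATH):
#   v=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p')
#   jce_unlimited_by_default "$v" && echo "JCE unlimited by default"
# With a full JDK installed, the effective AES limit can be printed directly;
# a value of 128 would indicate the limited policy:
#   jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'
```

Posting the full `java -version` output from one of the affected nodes would settle the question either way.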


*Marc Selwan | *DataStax *| *PM, Server Team *|* *(925) 413-7079* *|*
Twitter <https://twitter.com/MarcSelwan>




On Mon, Aug 26, 2019 at 1:56 PM Michael Carlise
<mcarl...@salesforce.com.invalid> wrote:

>
> I originally opened this issue on Stack Overflow (
> https://stackoverflow.com/questions/57516660/cassandra-node-to-node-encryption-throws-unable-to-gossip-with-peers-exception
> ).
>
> However, I haven't gotten any responses in over a week.  I'm going to post
> it here and maybe someone will have an idea on where I can look.
>
> We currently run a multi region cassandra cluster in AWS. It runs in four
> regions, 12 nodes per region. It runs without node to node encryption (or
> client encryption either). We are trying to enable inter datacenter node to
> node encryption. However, when we flip encryption over we get an exception
> that nodes are unable to gossip with any peers.
>
> It could possibly be that we didn't build our jks keystore/truststores
> correctly (more on how we built these files below). But, we additionally do
> not see intra datacenter communication working (which should be set to
> unencrypted communication). Additionally, cqlsh cannot connect to the node
> either, even though require_client_auth is false (the default).
>
> ERROR [main] 2019-08-15 18:46:32,241 CassandraDaemon.java:749 - Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any peers
>         at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1435) ~[apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:566) ~[apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:823) ~[apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:683) ~[apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:632) ~[apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:388) [apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:620) [apache-cassandra-3.11.4.jar:3.11.4]
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:732) [apache-cassandra-3.11.4.jar:3.11.4]
> INFO  [main] 2019-08-15 18:47:07,384 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
>
>
> Note that this error appears a few minutes after the node starts (i.e. there
> is a delay between startup and the exception being thrown).
>
> *Information about our cassandra setup*
>
> cassandra version: 3.11.4
> JDK version: openjdk-8.
> Linux: Ubuntu 18.04 (bionic).
>
> *cassandra.yaml*
>
> endpoint_snitch: Ec2MultiRegionSnitch
>
> server_encryption_options:
>   internode_encryption: dc
>   keystore: <omitted>
>   keystore_password: <omitted>
>   truststore: <omitted>
>   truststore_password: <omitted>
>
> client_encryption_options:
>   enabled: false
>
> *cassandra-rackdc.properties*
>
> prefer_local=true
>
> *No obvious errors in SSL debug output*
>
> When starting cassandra with JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl" added
> to cassandra-env.sh, we see SSL logs printed to stdout (*Note: Subject and
> Issuer were omitted on purpose*).
>
> found key for : cassy-us-west-2
> adding as trusted cert:
>   Subject: ...
>   Issuer:  ...
>   Algorithm: RSA; Serial number: 0xdad28d843fc73325d4c1a75207d4e74
>   Valid from Fri May 27 00:00:00 UTC 2016 until Tue May 26 23:59:59 UTC 2026
>
> ...
>
> trigger seeding of SecureRandom
> done seeding SecureRandom
>
> Compared against Java SE SSL/TLS connection debugging
> <https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/ReadDebug.html>,
> this looks correct. Note, however, that we see this series of messages (along
> with the RSA key signature output) repeated several times in rapid succession.
> We never observe any messages about the trust store being added; however, that
> might be something that occurs only on client initiation (?)
>
> Additionally, we do see cassandra report that the Encrypted Messaging
> service has been started.
>
> INFO  [main] 2019-08-15 18:45:31,022 MessagingService.java:704 - Starting 
> Encrypted Messaging Service on SSL port 7001
>
> *Doesn't appear to be a cassandra.yaml configuration problem*
>
> We can bring the node back online by simply configuring internode_encryption:
> none. This action seems to rule out a broadcast_address or rpc_address
> configuration problem.
>
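
Since internode_encryption: dc leaves intra-datacenter traffic on the plain storage port and moves cross-datacenter traffic to the SSL port, it may be worth confirming which ports are actually reachable between nodes before looking deeper at the certificates. The sketch below is only a plain TCP probe. The function name probe_port is hypothetical, and port 7000 assumes the default storage_port (7001 and 9042 come from the log line and cqlsh errors in this thread).

```shell
# Hypothetical helper: reports whether a TCP connection to host:port succeeds.
# Assumes bash (for /dev/tcp) and coreutils timeout are installed; if either
# is missing the probe fails and the port is reported as closed-or-filtered.
probe_port() {
    host="$1"
    port="$2"
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "${host}:${port} open"
    else
        echo "${host}:${port} closed-or-filtered"
    fi
}

# Example against the remote node from this thread:
#   for p in 7000 7001 9042; do probe_port 10.0.2.7 "$p"; done
# If 7001 is open, the TLS handshake itself can be inspected with:
#   openssl s_client -connect 10.0.2.7:7001 </dev/null
```

Running the probe in both directions, and across regions as well as within one, would show whether anything is listening on the SSL port on the nodes that have not been switched over yet.
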
> *How we built our keystore/truststores*
>
> We followed the DataStax docs for preparing SSL certificates
> <https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/configuration/secureSSLCertWithCA.html>
> as a basic template. One minor difference was that our private keys and CSRs
> were generated using openssl, one per region (we plan to share the key and
> signed cert across the nodes in each region). Each was created with a
> command of this form:
>
> openssl req -new -newkey rsa:2048 -out cassy-<region>.csr \
>     -keyout cassy-<region>.key -config cassy-<region>.conf -subj "..." \
>     -nodes -sha256
>
> The generated CSR was then signed by an internal root CA. Because we
> generated our files using openssl, we had to build our jks files by
> importing our certs into them.
>
> *Commands to generate truststore*
>
> We distribute this one file to all nodes.
>
> keytool -importcert \
>     -keystore generic-server-truststore.jks \
>     -alias rootCa \
>     -file rootCa.crt \
>     -noprompt \
>     -keypass omitted \
>     -storepass omitted
>
> *Commands to generate keystore*
>
> This was done once per region. Essentially, we created a keystore with
> keytool, deleted the generated key entry, and then imported our own key
> entry from a pkcs12 file.
>
> keytool -genkeypair -keyalg RSA -alias cassy-${region} \
>     -keystore cassy-${region}.jks -storepass omitted -keypass omitted \
>     -validity 365 -keysize 2048 -dname "..."
>
> keytool -delete -alias cassy-${region} -keystore cassy-${region}.jks \
>     -storepass omitted
>
> openssl pkcs12 -export -in signed_certs/${region}.pem \
>     -inkey keys/cassandra.${region}.key -name cassy-${region} \
>     -out ${region}.p12
>
> keytool -importkeystore -deststorepass omitted \
>     -destkeystore cassy-${region}.jks -srckeystore ${region}.p12 \
>     -srcstoretype PKCS12
>
> keytool -importcert -keystore cassy-${region}.jks -alias rootCa \
>     -file ca.crt -noprompt -keypass omitted -storepass omitted
>
> Looking back at this, I don't remember why we used keytool to generate a
> keypair/keystore, then deleted and imported. I think it was because the
> keytool importkeystore command refused to run if the keystore didn't
> already exist.
>
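
As far as I know, keytool -importkeystore creates the destination keystore when it does not exist yet, so the genkeypair and delete steps may not be needed. The sketch below shows that shorter path. The function names run and build_keystore are hypothetical, the file paths are copied from the commands above, and a single password is assumed for both the PKCS12 file and the JKS keystore.

```shell
# Hypothetical wrapper: prints the command when DRY_RUN=1, otherwise runs it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: turns an openssl key plus signed chain into a JKS
# keystore in two steps, with no placeholder keypair.
# Note: passwords on the command line are visible in the process list.
build_keystore() {
    region="$1"
    pass="$2"
    run openssl pkcs12 -export \
        -in "signed_certs/${region}.pem" \
        -inkey "keys/cassandra.${region}.key" \
        -name "cassy-${region}" \
        -out "${region}.p12" -passout "pass:${pass}"
    run keytool -importkeystore \
        -srckeystore "${region}.p12" -srcstoretype PKCS12 \
        -srcstorepass "$pass" \
        -destkeystore "cassy-${region}.jks" -deststoretype JKS \
        -deststorepass "$pass"
}
```

Calling it with DRY_RUN=1 exported, for example `build_keystore us-west-2 <password>`, prints the two commands without touching any files, which makes it easy to compare against what was actually run.
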
> *ca.crt and pem file*
>
> The ca.crt file contains the root certificate and the intermediate
> certificate that was used to sign the CSR. The pem file contains the signed
> CSR returned to us, the intermediate cert, and the root CA (in that order).
>
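
Note that openssl verify only checks the first certificate in the file it is given, so an OK result does not confirm that the rest of the bundle is what you expect. A quick sanity check on the bundle contents is sketched below. The function name count_pem_certs is hypothetical, and the expected counts (three in the pem file, two in ca.crt) are taken from the description above.

```shell
# Hypothetical helper: counts the certificates in a PEM bundle by counting
# the BEGIN CERTIFICATE marker lines.
count_pem_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Example with the files from this thread:
#   count_pem_certs signed_certs/us-west-2.pem   # 3 expected per the description
#   count_pem_certs ca.crt                       # 2 expected per the description
# To list the subject and issuer of every certificate in a bundle:
#   openssl crl2pkcs7 -nocrl -certfile signed_certs/us-west-2.pem \
#       | openssl pkcs7 -print_certs -noout
```

Running `keytool -list -v` on the resulting keystore should then show whether the key entry carries the same chain.
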
> *openssl verify ca.crt and pem*
>
> openssl verify -CAfile ca.crt us-west-2.pem
> signed_certs/us-west-2.pem: OK
>
> *Command output after enabling encryption*
>
> *nodetool status (output truncated)*
>
> Datacenter: us-east
> ===================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address         Load        Tokens  Owns (effective)  Host ID  Rack
> ?N  52.44.11.221    ?           256     25.4%             null     1c
> ...
> ?N  52.204.232.195  ?           256     23.2%             null     1d
> Datacenter: us-west-2
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address         Load        Tokens  Owns (effective)  Host ID  Rack
> ?N  34.209.2.144    ?           256     26.5%             null     2c
> UN  52.40.32.177    105.99 GiB  256     23.7%             null     2c
> ?N  34.210.109.203  ?           256     24.7%             null     2a
> ...
>
> The one node shown as up (UN) is the node with encryption enabled.
>
> *cqlsh to localhost*
>
> cassy-node6:~$ cqlsh
> Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
>
> *cqlsh to remote node* (the remote node has encryption enabled)
>
> cassy-node6:~$ cqlsh 10.0.2.7
> Connection error: ('Unable to connect to any servers', {'10.0.2.7': error(111, "Tried connecting to [('10.0.2.7', 9042)]. Last error: Connection refused")})
>
>
