Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Jai Bheemsen Rao Dhanwada
Yes, correct; it doesn't work for the servers. I'm trying to see if anyone has a
workaround for this issue (maybe changing the protocol version during the
upgrade window?)

On Fri, Jul 26, 2019 at 1:11 PM Durity, Sean R 
wrote:

> This would handle client protocol, but not streaming protocol between
> nodes.
>
>
>
>
>
> Sean Durity – Staff Systems Engineer, Cassandra
>
>
>
> *From:* Alok Dwivedi 
> *Sent:* Friday, July 26, 2019 3:21 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: [EXTERNAL] Apache Cassandra upgrade path
>
>
>
> Hi Sean
>
> The recommended practice for an upgrade is to explicitly control the protocol
> version in your application during the upgrade process. The protocol version
> is negotiated on the first connection, and by chance the driver may talk to an
> already-upgraded node first, which means it will negotiate a higher version
> that is not compatible with the nodes still on the lower Cassandra version. So
> initially you set it to a lower version (the lowest common denominator for the
> mixed-version cluster) and then remove the explicit setting once the upgrade
> has completed.
>
>
>
> Cluster cluster = Cluster.builder()
>     .addContactPoint("127.0.0.1")
>     .withProtocolVersion(ProtocolVersion.V2)
>     .build();
>
>
>
> Refer here for more information if using Java driver
>
>
> https://docs.datastax.com/en/developer/java-driver/3.7/manual/native_protocol/#protocol-version-with-mixed-clusters
> 
>
>
>
> Same thing applies to drivers in other languages.
>
>
>
> Thanks
>
> Alok Dwivedi
>
> Senior Consultant
>
> https://www.instaclustr.com/
> 
>
>
>
>
>
> On Fri, 26 Jul 2019 at 20:03, Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
> Thanks Sean,
>
>
>
> In my use case all my clusters are multi-DC, and I will try my best to
> upgrade ASAP; however, since all machines are VMs, there is a chance a node
> dies mid-upgrade. Also, my keyspaces are not uniform across DCs: some are
> replicated to all DCs and some to just one DC, so I am worried there.
>
>
>
> Is there a way to override the protocol version until the upgrade is done
> and then change it back once the upgrade is completed?
>
>
>
> On Fri, Jul 26, 2019 at 11:42 AM Durity, Sean R <
> sean_r_dur...@homedepot.com> wrote:
>
> What you have seen is totally expected. You can’t stream between different
> major versions of Cassandra. Get the upgrade done, then worry about any
> down hardware. If you are using DCs, upgrade one DC at a time, so that
> there is an available environment in case of any disasters.
>
>
>
> My advice, though, is to get through the rolling upgrade process as
> quickly as possible. Don’t stay in a mixed state very long. The cluster
> will function fine in a mixed state – except for those streaming
> operations. No repairs, no bootstraps.
>
>
>
>
>
> Sean Durity – Staff Systems Engineer, Cassandra
>
>
>
> *From:* Jai Bheemsen Rao Dhanwada 
> *Sent:* Friday, July 26, 2019 2:24 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Apache Cassandra upgrade path
>
>
>
> Hello,
>
>
>
> I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3, the regular
> rolling upgrade process works fine without any issues.
>
>
>
> However, I am running into an issue where, if a node with the older
> version dies (hardware failure) and a new node comes up and tries to
> bootstrap, the bootstrap fails.
>
>
>
> I tried two combinations:
>
>
>
> 1. Joining replacement node with 2.1.16 version of cassandra
>
> In this case nodes with 2.1.16 version are able to stream data to the new
> node, but the nodes with 3.11.3 version are failing with the below error.
>
>
>
> ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775
> IncomingStreamingConnection.java:80 - Error while reading from socket from
> /10.y.y.y:40296.
> java.io.IOException: Received stream using protocol version 2 (my version
> 4). Terminating connection
>
> 2. Joining replacement node with 3.11.3 version of cassandra
>
> In this case the nodes on 3.11.3 are able to stream data to the new node,
> but it is not able to stream data from the 2.1.16 nodes, failing with the
> below error.
>
>
>
> ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380
> StreamSession.java:593 - [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b]
> Streaming error occurred on session with peer 10.z.z.z
> java.io.IOException: Connection reset by peer
>at 

Cheat Sheet for Unix based OS, Performance troubleshooting

2019-07-26 Thread Krish Donald
Does anyone have a cheat sheet for Unix-based OS performance troubleshooting?


RE: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Durity, Sean R
This would handle client protocol, but not streaming protocol between nodes.


Sean Durity – Staff Systems Engineer, Cassandra

From: Alok Dwivedi 
Sent: Friday, July 26, 2019 3:21 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Apache Cassandra upgrade path

Hi Sean
The recommended practice for an upgrade is to explicitly control the protocol
version in your application during the upgrade process. The protocol version is
negotiated on the first connection, and by chance the driver may talk to an
already-upgraded node first, which means it will negotiate a higher version that
is not compatible with the nodes still on the lower Cassandra version. So
initially you set it to a lower version (the lowest common denominator for the
mixed-version cluster) and then remove the explicit setting once the upgrade has
completed.

Cluster cluster = Cluster.builder()
.addContactPoint("127.0.0.1")
.withProtocolVersion(ProtocolVersion.V2)
.build();

Refer here for more information if using Java driver
https://docs.datastax.com/en/developer/java-driver/3.7/manual/native_protocol/#protocol-version-with-mixed-clusters

Same thing applies to drivers in other languages.

Thanks
Alok Dwivedi
Senior Consultant
https://www.instaclustr.com/


On Fri, 26 Jul 2019 at 20:03, Jai Bheemsen Rao Dhanwada 
mailto:jaibheem...@gmail.com>> wrote:
Thanks Sean,

In my use case all my clusters are multi-DC, and I will try my best to upgrade
ASAP; however, since all machines are VMs, there is a chance a node dies
mid-upgrade. Also, my keyspaces are not uniform across DCs: some are replicated
to all DCs and some to just one DC, so I am worried there.

Is there a way to override the protocol version until the upgrade is done and 
then change it back once the upgrade is completed?

On Fri, Jul 26, 2019 at 11:42 AM Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
What you have seen is totally expected. You can’t stream between different 
major versions of Cassandra. Get the upgrade done, then worry about any down 
hardware. If you are using DCs, upgrade one DC at a time, so that there is an 
available environment in case of any disasters.

My advice, though, is to get through the rolling upgrade process as quickly as 
possible. Don’t stay in a mixed state very long. The cluster will function fine 
in a mixed state – except for those streaming operations. No repairs, no 
bootstraps.


Sean Durity – Staff Systems Engineer, Cassandra

From: Jai Bheemsen Rao Dhanwada 
mailto:jaibheem...@gmail.com>>
Sent: Friday, July 26, 2019 2:24 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Apache Cassandra upgrade path

Hello,

I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3, the regular 
rolling upgrade process works fine without any issues.

However, I am running into an issue where, if a node with the older version
dies (hardware failure) and a new node comes up and tries to bootstrap, the
bootstrap fails.

I tried two combinations:

1. Joining replacement node with 2.1.16 version of cassandra
In this case nodes with 2.1.16 version are able to stream data to the new node, 
but the nodes with 3.11.3 version are failing with the below error.

ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775 
IncomingStreamingConnection.java:80 - Error while reading from socket from 
/10.y.y.y:40296.
java.io.IOException: Received stream using protocol version 2 (my version 4). 
Terminating connection
2. Joining replacement node with 3.11.3 version of cassandra
In this case the nodes on 3.11.3 are able to stream data to the new node, but
it is not able to stream data from the 2.1.16 nodes, failing with the below
error.

ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380 StreamSession.java:593 
- [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b] Streaming error occurred on 
session with peer 10.z.z.z
java.io.IOException: Connection reset by peer
   at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_151]
   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
~[na:1.8.0_151]
   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) 
~[na:1.8.0_151]
   at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_151]
   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
~[na:1.8.0_151]
   at 

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Alok Dwivedi
Hi Sean
The recommended practice for an upgrade is to explicitly control the protocol
version in your application during the upgrade process. The protocol version
is negotiated on the first connection, and by chance the driver may talk to an
already-upgraded node first, which means it will negotiate a higher version
that is not compatible with the nodes still on the lower Cassandra version. So
initially you set it to a lower version (the lowest common denominator for the
mixed-version cluster) and then remove the explicit setting once the upgrade
has completed.

Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withProtocolVersion(ProtocolVersion.V2)
    .build();

Refer here for more information if using Java driver
https://docs.datastax.com/en/developer/java-driver/3.7/manual/native_protocol/#protocol-version-with-mixed-clusters

Same thing applies to drivers in other languages.

Thanks
Alok Dwivedi
Senior Consultant
https://www.instaclustr.com/


On Fri, 26 Jul 2019 at 20:03, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Thanks Sean,
>
> In my use case all my clusters are multi-DC, and I will try my best to
> upgrade ASAP; however, since all machines are VMs, there is a chance a node
> dies mid-upgrade. Also, my keyspaces are not uniform across DCs: some are
> replicated to all DCs and some to just one DC, so I am worried there.
>
> Is there a way to override the protocol version until the upgrade is done
> and then change it back once the upgrade is completed?
>
> On Fri, Jul 26, 2019 at 11:42 AM Durity, Sean R <
> sean_r_dur...@homedepot.com> wrote:
>
>> What you have seen is totally expected. You can’t stream between
>> different major versions of Cassandra. Get the upgrade done, then worry
>> about any down hardware. If you are using DCs, upgrade one DC at a time, so
>> that there is an available environment in case of any disasters.
>>
>>
>>
>> My advice, though, is to get through the rolling upgrade process as
>> quickly as possible. Don’t stay in a mixed state very long. The cluster
>> will function fine in a mixed state – except for those streaming
>> operations. No repairs, no bootstraps.
>>
>>
>>
>>
>>
>> Sean Durity – Staff Systems Engineer, Cassandra
>>
>>
>>
>> *From:* Jai Bheemsen Rao Dhanwada 
>> *Sent:* Friday, July 26, 2019 2:24 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* [EXTERNAL] Apache Cassandra upgrade path
>>
>>
>>
>> Hello,
>>
>>
>>
>> I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3, the
>> regular rolling upgrade process works fine without any issues.
>>
>>
>>
>> However, I am running into an issue where, if a node with the older
>> version dies (hardware failure) and a new node comes up and tries to
>> bootstrap, the bootstrap fails.
>>
>>
>>
>> I tried two combinations:
>>
>>
>>
>> 1. Joining replacement node with 2.1.16 version of cassandra
>>
>> In this case nodes with 2.1.16 version are able to stream data to the new
>> node, but the nodes with 3.11.3 version are failing with the below error.
>>
>>
>>
>> ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775
>> IncomingStreamingConnection.java:80 - Error while reading from socket from
>> /10.y.y.y:40296.
>> java.io.IOException: Received stream using protocol version 2 (my version
>> 4). Terminating connection
>>
>> 2. Joining replacement node with 3.11.3 version of cassandra
>>
>> In this case the nodes on 3.11.3 are able to stream data to the new node,
>> but it is not able to stream data from the 2.1.16 nodes, failing with the
>> below error.
>>
>>
>>
>> ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380
>> StreamSession.java:593 - [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b]
>> Streaming error occurred on session with peer 10.z.z.z
>> java.io.IOException: Connection reset by peer
>>at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>> ~[na:1.8.0_151]
>>at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>> ~[na:1.8.0_151]
>>at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>> ~[na:1.8.0_151]
>>at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_151]
>>at
>> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>> ~[na:1.8.0_151]
>>at
>> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:206)
>> ~[na:1.8.0_151]
>>at
>> sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>> ~[na:1.8.0_151]
>>at
>> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
>> ~[na:1.8.0_151]
>>at
>> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
>> ~[apache-cassandra-3.11.3.jar:3.11.3]
>>at
>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:311)
>> ~[apache-cassandra-3.11.3.jar:3.11.3]
>>at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
>>
>>
>>
>> Note: In both cases I 

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Jai Bheemsen Rao Dhanwada
Thanks Sean,

In my use case all my clusters are multi-DC, and I will try my best to upgrade
ASAP; however, since all machines are VMs, there is a chance a node dies
mid-upgrade. Also, my keyspaces are not uniform across DCs: some are replicated
to all DCs and some to just one DC, so I am worried there.

Is there a way to override the protocol version until the upgrade is done
and then change it back once the upgrade is completed?

On Fri, Jul 26, 2019 at 11:42 AM Durity, Sean R 
wrote:

> What you have seen is totally expected. You can’t stream between different
> major versions of Cassandra. Get the upgrade done, then worry about any
> down hardware. If you are using DCs, upgrade one DC at a time, so that
> there is an available environment in case of any disasters.
>
>
>
> My advice, though, is to get through the rolling upgrade process as
> quickly as possible. Don’t stay in a mixed state very long. The cluster
> will function fine in a mixed state – except for those streaming
> operations. No repairs, no bootstraps.
>
>
>
>
>
> Sean Durity – Staff Systems Engineer, Cassandra
>
>
>
> *From:* Jai Bheemsen Rao Dhanwada 
> *Sent:* Friday, July 26, 2019 2:24 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Apache Cassandra upgrade path
>
>
>
> Hello,
>
>
>
> I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3, the regular
> rolling upgrade process works fine without any issues.
>
>
>
> However, I am running into an issue where, if a node with the older
> version dies (hardware failure) and a new node comes up and tries to
> bootstrap, the bootstrap fails.
>
>
>
> I tried two combinations:
>
>
>
> 1. Joining replacement node with 2.1.16 version of cassandra
>
> In this case nodes with 2.1.16 version are able to stream data to the new
> node, but the nodes with 3.11.3 version are failing with the below error.
>
>
>
> ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775
> IncomingStreamingConnection.java:80 - Error while reading from socket from
> /10.y.y.y:40296.
> java.io.IOException: Received stream using protocol version 2 (my version
> 4). Terminating connection
>
> 2. Joining replacement node with 3.11.3 version of cassandra
>
> In this case the nodes on 3.11.3 are able to stream data to the new node,
> but it is not able to stream data from the 2.1.16 nodes, failing with the
> below error.
>
>
>
> ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380
> StreamSession.java:593 - [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b]
> Streaming error occurred on session with peer 10.z.z.z
> java.io.IOException: Connection reset by peer
>at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> ~[na:1.8.0_151]
>at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> ~[na:1.8.0_151]
>at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> ~[na:1.8.0_151]
>at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_151]
>at
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> ~[na:1.8.0_151]
>at
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:206)
> ~[na:1.8.0_151]
>at
> sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> ~[na:1.8.0_151]
>at
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> ~[na:1.8.0_151]
>at
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
>at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:311)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
>at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
>
>
>
> Note: In both cases I am using replace_address to replace the dead node, as
> I am running into some issues with "nodetool removenode". I use ephemeral
> disks, so a replacement node always comes up with an empty data dir and bootstraps.
>
>
>
> Any other workaround to mitigate this problem? I am worried about nodes
> going down while we are in the process of upgrading, as it could take
> several hours depending on the cluster size.
>

RE: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Durity, Sean R
What you have seen is totally expected. You can’t stream between different 
major versions of Cassandra. Get the upgrade done, then worry about any down 
hardware. If you are using DCs, upgrade one DC at a time, so that there is an 
available environment in case of any disasters.

My advice, though, is to get through the rolling upgrade process as quickly as 
possible. Don’t stay in a mixed state very long. The cluster will function fine 
in a mixed state – except for those streaming operations. No repairs, no 
bootstraps.


Sean Durity – Staff Systems Engineer, Cassandra

From: Jai Bheemsen Rao Dhanwada 
Sent: Friday, July 26, 2019 2:24 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Apache Cassandra upgrade path

Hello,

I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3, the regular 
rolling upgrade process works fine without any issues.

However, I am running into an issue where, if a node with the older version
dies (hardware failure) and a new node comes up and tries to bootstrap, the
bootstrap fails.

I tried two combinations:

1. Joining replacement node with 2.1.16 version of cassandra
In this case nodes with 2.1.16 version are able to stream data to the new node, 
but the nodes with 3.11.3 version are failing with the below error.

ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775 
IncomingStreamingConnection.java:80 - Error while reading from socket from 
/10.y.y.y:40296.
java.io.IOException: Received stream using protocol version 2 (my version 4). 
Terminating connection
2. Joining replacement node with 3.11.3 version of cassandra
In this case the nodes on 3.11.3 are able to stream data to the new node, but
it is not able to stream data from the 2.1.16 nodes, failing with the below
error.

ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380 StreamSession.java:593 
- [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b] Streaming error occurred on 
session with peer 10.z.z.z
java.io.IOException: Connection reset by peer
   at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_151]
   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
~[na:1.8.0_151]
   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) 
~[na:1.8.0_151]
   at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_151]
   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
~[na:1.8.0_151]
   at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:206) 
~[na:1.8.0_151]
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
~[na:1.8.0_151]
   at 
java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) 
~[na:1.8.0_151]
   at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
 ~[apache-cassandra-3.11.3.jar:3.11.3]
   at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:311)
 ~[apache-cassandra-3.11.3.jar:3.11.3]
   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]

Note: In both cases I am using replace_address to replace the dead node, as I
am running into some issues with "nodetool removenode". I use ephemeral disks,
so a replacement node always comes up with an empty data dir and bootstraps.

Any other workaround to mitigate this problem? I am worried about nodes going
down while we are in the process of upgrading, as it could take several hours
depending on the cluster size.





Apache Cassandra upgrade path

2019-07-26 Thread Jai Bheemsen Rao Dhanwada
Hello,

I am trying to upgrade Apache Cassandra from 2.1.16 to 3.11.3, the regular
rolling upgrade process works fine without any issues.

However, I am running into an issue where, if a node with the older version
dies (hardware failure) and a new node comes up and tries to bootstrap, the
bootstrap fails.

I tried two combinations:

1. Joining replacement node with 2.1.16 version of cassandra
In this case nodes with 2.1.16 version are able to stream data to the new
node, but the nodes with 3.11.3 version are failing with the below error.


> ERROR [STREAM-INIT-/10.x.x.x:40296] 2019-07-26 17:45:17,775
> IncomingStreamingConnection.java:80 - Error while reading from socket from
> /10.y.y.y:40296.
> java.io.IOException: Received stream using protocol version 2 (my version
> 4). Terminating connection

2. Joining replacement node with 3.11.3 version of cassandra
In this case the nodes on 3.11.3 are able to stream data to the new node, but
it is not able to stream data from the 2.1.16 nodes, failing with the below
error.


> ERROR [STREAM-IN-/10.z.z.z:7000] 2019-07-26 18:08:10,380
> StreamSession.java:593 - [Stream #538c6900-afd0-11e9-a649-ab2e045ee53b]
> Streaming error occurred on session with peer 10.z.z.z
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_151]
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> ~[na:1.8.0_151]
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_151]
> at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_151]
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> ~[na:1.8.0_151]
> at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:206)
> ~[na:1.8.0_151]
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> ~[na:1.8.0_151]
> at
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> ~[na:1.8.0_151]
> at
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:311)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]


Note: In both cases I am using replace_address to replace the dead node, as I
am running into some issues with "nodetool removenode". I use ephemeral disks,
so a replacement node always comes up with an empty data dir and bootstraps.

Any other workaround to mitigate this problem? I am worried about nodes going
down while we are in the process of upgrading, as it could take several hours
depending on the cluster size.


Re: When Apache Cassandra 4.0 will release?

2019-07-26 Thread Simon Fontana Oscarsson
Hi,
To my knowledge there is no set date for 4.0; the community is
prioritizing QA over a fast release, which I think is great! Sometime
during Q4 has been suggested. See the latest information from the dev mailing
list:
https://lists.apache.org/thread.html/1a768d057d1af5a0f373c4c399a23e65cb04c61bbfff612634b9437c@%3Cdev.cassandra.apache.org%3E
Maybe we will get some more information at the summit in September.
There is a plugin available for audit logging on Cassandra 3.0 and 3.11
called ecAudit: https://github.com/Ericsson/ecaudit
You can see what compatibility it has with the audit logging for 4.0 here:
https://github.com/Ericsson/ecaudit/blob/release/c3.0.11/doc/cassandra_compatibility.md#apache-cassandra-40

-- 
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscars...@ericsson.com
www.ericsson.com
On fre, 2019-07-26 at 16:01 +, bhaskar pandey wrote:
> Hi,
> 
> When will Apache Cassandra 4.0 be released? Is there a specific timeline? I am
> eagerly waiting for the audit log.
> 
> Regards,
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 



When Apache Cassandra 4.0 will release?

2019-07-26 Thread bhaskar pandey
Hi,

When will Apache Cassandra 4.0 be released? Is there a specific timeline? I am
eagerly waiting for the audit log.

Regards,

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Openings in BayArea/Remote for Cassandra Admin

2019-07-26 Thread Krish Donald
Hi,

This community is very helpful, so I'm looking for any pointers.
Does anyone know of any openings on your team for a Cassandra Admin in the Bay
Area or remote?
Please send me an email.

Thanks
Krish


Re: Performance impact with ALLOW FILTERING clause.

2019-07-26 Thread Christian Lorenz
Hi,

did you also consider "taming" your Spark job by reducing its executors?
The job will probably have a longer runtime in exchange for reduced stress
on the Cassandra cluster.

Regards
Christian
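
A minimal sketch of that idea (not from the original thread), assuming a Java
Spark job: spark.executor.instances and spark.executor.cores are standard Spark
properties, while the connector write-throughput cap is an assumption to verify
against the spark-cassandra-connector reference for your version.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ThrottledCassandraJob {
    public static void main(String[] args) {
        // Fewer executors/cores means fewer token-range scans hitting Cassandra in parallel.
        SparkConf conf = new SparkConf()
                .setAppName("throttled-cassandra-job")
                .set("spark.executor.instances", "4")
                .set("spark.executor.cores", "2")
                // Assumed connector knob; check the exact property name for your
                // spark-cassandra-connector version before relying on it.
                .set("spark.cassandra.output.throughput_mb_per_sec", "5");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... run the job, then:
        sc.stop();
    }
}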

Von: "ZAIDI, ASAD A" 
Antworten an: "user@cassandra.apache.org" 
Datum: Donnerstag, 25. Juli 2019 um 20:05
An: "user@cassandra.apache.org" 
Betreff: RE: Performance impact with ALLOW FILTERING clause.

Thank you all for your insights.

When the spark-connector adds ALLOW FILTERING to a query, the query just 'runs',
regardless of whether it is expensive (against a larger table) or not so
expensive (against a table with fewer rows).
In my particular case, nodes are reaching a 2 TB per-node load in a 50-node
cluster, and when a bunch of such queries run, they impact server resources.

Since ALLOW FILTERING is an expensive operation, I'm trying to find knobs I can
turn to mitigate the impact.

What I think (correct me if I am wrong) is that it is the query design itself,
not optimized for the table design, that in turn causes the connector to add
ALLOW FILTERING implicitly. I'm not thinking of adding secondary indexes on
tables because they have their own overheads. Kindly share if there are other
means we can use to influence the connector not to use ALLOW FILTERING.

Thanks again.
Asad



From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Thursday, July 25, 2019 10:24 AM
To: cassandra 
Subject: Re: Performance impact with ALLOW FILTERING clause.

"unpredictable" is such a loaded word. It's quite predictable, but it's often 
mispredicted by users.

"ALLOW FILTERING" basically tells the database you're going to do a query that 
will require scanning a bunch of data to return some subset of it, and you're 
not able to provide a WHERE clause that's sufficiently fine grained to avoid 
the scan. It's a loose equivalent of doing a full table scan in SQL databases - 
sometimes it's a valid use case, but it's expensive, you're ignoring all of the 
indexes, and you're going to do a lot more work.

It's predictable, though - you're probably going to walk over some range of 
data. Spark is grabbing all of the data to load into RDDs, and it probably does 
it by slicing up the range, doing a bunch of range scans.

It's doing that so it can get ALL of the data and do the filtering / joining / 
searching in-memory in spark, rather than relying on cassandra to do the 
scanning/searching on disk.
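
To make the contrast concrete, a small sketch with the Java driver against a
hypothetical table (schema and names are illustrative, not from this thread):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class FilteringContrast {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Hypothetical table:
        //   CREATE TABLE ks.events (sensor_id text, ts timestamp, value double,
        //                           PRIMARY KEY (sensor_id, ts));

        // Restricted by the partition key: Cassandra reads one partition, no scan needed.
        ResultSet keyed = session.execute(
                "SELECT value FROM ks.events WHERE sensor_id = 'a1'");

        // Restricted only by a non-key column: Cassandra would have to scan, so it
        // rejects the query unless ALLOW FILTERING is appended, which is what the
        // connector does implicitly when it cannot push the predicate down to the key.
        ResultSet scanned = session.execute(
                "SELECT * FROM ks.events WHERE value > 100 ALLOW FILTERING");

        System.out.println(keyed.one() + " / " + scanned.one());
        cluster.close();
    }
}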

On Thu, Jul 25, 2019 at 6:49 AM ZAIDI, ASAD A 
mailto:az1...@att.com>> wrote:
Hello Folks,

I was going through the documentation and saw in many places that ALLOW FILTERING
causes performance unpredictability. Our developers say the ALLOW FILTERING
clause is implicitly added to a bunch of queries by the spark-Cassandra connector
and they cannot control it; at the same time we see unpredictability in
application performance, just as the documentation says.

I'm trying to understand why a connector would add a clause to a query when it
can negatively impact database/application performance. Is it the data model
that drives the connector's decision to add ALLOW FILTERING to the query
automatically, or are there other reasons this clause is added? I'm not a
developer, but I want to know why developers don't have any control over this.

I’ll appreciate your guidance here.

Thanks
Asad




Re: Materialized View's additional PrimaryKey column

2019-07-26 Thread Christian Lorenz
I fully agree with Jon here. We also used MVs previously, and major problems
popped up when decommissioning/commissioning nodes.
After replacing them by doing the MVs' job "manually" in application code, we did
not face those issues anymore.

Regards,
Christian
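
For illustration (not from the original message), a minimal sketch of that
"maintain the tables yourself" approach with the Java driver, assuming
hypothetical table names; a logged batch keeps the two denormalized writes
applied together:

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import java.util.UUID;

public class ManualViewWriter {
    // Hypothetical tables, both maintained by the application instead of an MV:
    //   ks.users_by_id    (id uuid PRIMARY KEY, email text, name text)
    //   ks.users_by_email (email text PRIMARY KEY, id uuid, name text)
    private final Session session;
    private final PreparedStatement byId;
    private final PreparedStatement byEmail;

    public ManualViewWriter(Session session) {
        this.session = session;
        this.byId = session.prepare(
                "INSERT INTO ks.users_by_id (id, email, name) VALUES (?, ?, ?)");
        this.byEmail = session.prepare(
                "INSERT INTO ks.users_by_email (email, id, name) VALUES (?, ?, ?)");
    }

    public void saveUser(UUID id, String email, String name) {
        // A logged batch guarantees both denormalized rows are eventually applied,
        // at the cost of the batch-log write overhead.
        BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);
        batch.add(byId.bind(id, email, name));
        batch.add(byEmail.bind(email, id, name));
        session.execute(batch);
    }
}

A change to one of the denormalized keys would be handled the same way, as a
delete of the old row plus an insert of the new one.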

From: Jon Haddad 
Reply-To: "user@cassandra.apache.org" 
Date: Thursday, 25 July 2019 at 23:53
To: "user@cassandra.apache.org" 
Subject: Re: Materialized View's additional PrimaryKey column

The issues I have with MVs aren't related to how they aren't correctly
synchronized, although I'm not happy about that either. My issue with them is
that every cluster I've seen that uses them has been unstable, and I've put a
lot of time into helping teams undo them. You will almost certainly have
several hours or days of downtime as a result of using them.

There's a good reason they're marked as experimental (and disabled by default). 
 You should maintain the other tables yourself.

Jon



On Thu, Jul 25, 2019 at 12:22 AM mehmet bursali  
wrote:
Hi Jon, thanks for your suggestion (or warning :) ).
Yes, I've read something about your point, and I know there really are several
issues open in JIRA on bootstrapping, compaction and incremental repair caused
by using MVs. But after reading almost all the JIRA tickets (with comments and
history) related to MVs, as far as I understand all those issues come from
either losing synchronization between the base table and the MV (by deleting
column or row values on the base table) or from having a huge system with a
large and dynamic number of nodes/data/workloads. We use version 3.11.3 and
most of the critical issues were fixed in 3.10, but of course I might be
missing something, so I'll be glad if you point me to a specific JIRA ticket.
We have a certain use case that requires updates on filtering (clustering)
columns. Our motivation for using an MV was avoiding updates (delete + create)
on primary-key columns, because we assume the Cassandra developers can manage
this discouraged operation better than us. I'm really confused now.



On Wednesday, July 24, 2019, 11:30:15 PM GMT+3, Jon Haddad 
mailto:j...@jonhaddad.com>> wrote:


I really, really advise against using MVs.  I've had to help a number of teams 
move off them.  Not sure what list of bugs you read, but if the list didn't 
include "will destabilize your cluster to the point of constant downtime" then 
the list was incomplete.

Jon

On Wed, Jul 24, 2019 at 6:32 AM mehmet bursali  
wrote:
+ additional info: our production environment is a multi-DC cluster that consists
of 6 nodes in 2 DataCenters




On Wednesday, July 24, 2019, 3:35:11 PM GMT+3, mehmet bursali 
 wrote:


Hi Cassandra folks,
I'm planning to use Materialized Views (MVs) in production for some specific
cases. I've read a lot of blogs and technical documents about the risks of
using them, and everything seems OK for our use case.
My question is about the consistency (and durability) implications of using an
MV with an additional primary key column. In one of our cases, we select a UDT
column of the base table as the additional primary key column on the MV (the
UDT's possible values are non-nullable and restricted to a domain). After a
record is inserted in the base table, this additional column (the MV's primary
key column) will also be updated one or two times. So in our case, for each
update operation on the base table there are going to be delete and create
operations inside the MV.
From a consistency (and durability) perspective, does it matter whether the
additional primary key column is used as a partition column or a clustering column?



Re: high write latency on a single table

2019-07-26 Thread CPC
In the meantime we enabled tracing and found that normal insert operations
do not have high latency, but delete operations do, and all of our deletes
are range deletes. Is there any known performance regression for range
deletes? The mean time of delete operations is 45 times higher than that of
insert operations.
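
For clarity, a small sketch of the difference being discussed, using the Java
driver and a hypothetical message table (the actual schema of
tims.MESSAGE_HISTORY is not shown in this thread):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import java.util.Date;

public class DeleteShapes {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Hypothetical schema: ks.messages (conversation_id text, ts timestamp, body text,
        //                                   PRIMARY KEY (conversation_id, ts))
        String convId = "conv-42";
        Date cutoff = new Date();

        // Point delete: the full primary key is given, so a single row tombstone is written.
        session.execute(
                "DELETE FROM ks.messages WHERE conversation_id = ? AND ts = ?",
                convId, cutoff);

        // Range delete: only a slice of the clustering key is given, so one range
        // tombstone covers many rows and must later be reconciled on reads/compaction.
        session.execute(
                "DELETE FROM ks.messages WHERE conversation_id = ? AND ts < ?",
                convId, cutoff);

        cluster.close();
    }
}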

On Thu, Jul 25, 2019, 11:46 AM mehmet bursali 
wrote:

> Awesome! So we can investigate further by using the Cassandra exporter at
> this link: https://github.com/criteo/cassandra_exporter
> This exporter gives detailed information for read/write operations on each
> table (column family) by using the metrics below.
>
>  org:apache:cassandra:metrics:columnfamily:.*  ( reads from table metrics in 
> cassandra 
> https://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics )
>
>
>
>
>
>
>
>
>
>
>
> On Wednesday, July 24, 2019, 11:51:28 PM GMT+3, CPC 
> wrote:
>
>
> Hi Mehmet,
>
> Yes prometheus and opscenter
>
> On Wed, 24 Jul 2019 at 17:09, mehmet bursali 
> wrote:
>
> hi,
> do you use any performance monitoring tool like Prometheus?
>
>
>
>
> On Monday, July 22, 2019, 1:16:58 PM GMT+3, CPC 
> wrote:
>
>
> Hi everybody,
>
> The state column contains "R" or "D" values, just a single character. As Rajsekhar
> said, the only difference is that the table can have a high cell count.
> In the meantime we ran a major compaction and the data per node was 5-6 GB.
>
> On Mon, Jul 22, 2019, 10:56 AM Rajsekhar Mallick 
> wrote:
>
> Hello Team,
>
> The difference in write latencies between the two tables, though
> significant, still leaves the higher latency of 11.353 ms acceptable.
>
> Overall writes are not an issue, but the higher write latency for this
> particular table does point towards the data being written to it.
> One thing I noticed is that the cell count data in the nodetool
> tablehistograms output for the message_history_state table is scattered.
> The partition size histogram for the tables is consistent, but the cell
> count histogram for the impacted table isn't uniform.
> Maybe we can start thinking along these lines.
>
> I would also wait for some expert advice here.
>
> Thanks
>
>
> On Mon,the 22 Jul, 2019, 12:31 PM Ben Slater, 
> wrote:
>
> Is the size of the data in your “state” column variable? The higher write
> latencies at the 95%+ could line up with large volumes of data for
> particular rows in that column (the one column not in both tables)?
>
> Cheers
> Ben
>
> ---
>
>
> Ben Slater, Chief Product Officer
>
> 
>
>    
>
>
> Read our latest technical blog posts here
> .
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
>
> On Mon, 22 Jul 2019 at 16:46, CPC  wrote:
>
> Hi guys,
>
> Any idea? I thought it might be a bug but could not find anything related
> on jira.
>
> On Fri, Jul 19, 2019, 12:45 PM CPC  wrote:
>
> Hi Rajsekhar,
>
> Here the details:
>
> 1)
>
> [cassadm@bipcas00 ~]$ nodetool tablestats tims.MESSAGE_HISTORY
> Total number of tables: 259
> 
> Keyspace : tims
> Read Count: 208256144
> Read Latency: 7.655146714749506 ms
> Write Count: 2218205275
> Write Latency: 1.7826005103175133 ms
> Pending Flushes: 0
> Table: MESSAGE_HISTORY
> SSTable count: 41
> Space used (live): 976964101899
> Space used (total): 976964101899
> Space used by snapshots (total): 3070598526780
> Off heap memory used (total): 185828820
> SSTable Compression Ratio: 0.8219217809913125
> Number of partitions (estimate): 8175715
> Memtable cell count: 73124
> Memtable data size: 26543733
> Memtable off heap memory used: 27829672
> Memtable switch count: 1607
> Local read count: 7871917
> Local read latency: 1.187 ms
> Local write count: 172220954
> Local write latency: 0.021 ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 130
> Bloom filter false ratio: 0.0
> Bloom filter space used: 10898488
> Bloom filter off heap memory used: 10898160
> Index summary off heap memory used: 2480140
> Compression metadata off heap memory used: 144620848
> Compacted partition minimum 

Re: Materialized View's additional PrimaryKey column

2019-07-26 Thread Jasonstack Zhao Yang
Hi Jon,

Do you have any clue what causes the downtime when using MVs? E.g. memory
pressure, or being overloaded by view writes?


Thanks.

On Fri, 26 Jul 2019 at 13:59, mehmet bursali 
wrote:

> Thank you again for the clear information, Jon! I give up.
>
> Sent from Yahoo Mail on Android
> 
>
> On Friday, 26 July 2019 at 0:53, Jon Haddad
>  wrote:
> The issues I have with MVs aren't related to how they aren't correctly
> synchronized, although I'm not happy about that either. My issue with them
> is that every cluster I've seen that uses them has been unstable, and I've
> put a lot of time into helping teams undo them. You will almost certainly
> have several hours or days of downtime as a result of using them.
>
> There's a good reason they're marked as experimental (and disabled by
> default).  You should maintain the other tables yourself.
>
> Jon
>
>
>
> On Thu, Jul 25, 2019 at 12:22 AM mehmet bursali
>  wrote:
>
> Hi Jon, thanks for your suggestion (or warning :) ).
> Yes, I've read something about your point, and I know there really are
> several issues open in JIRA on bootstrapping, compaction and incremental
> repair caused by using MVs. But after reading almost all the JIRA tickets
> (with comments and history) related to MVs, as far as I understand all those
> issues come from either losing synchronization between the base table and
> the MV (by deleting column or row values on the base table) or from having
> a huge system with a large and dynamic number of nodes/data/workloads. We
> use version 3.11.3 and most of the critical issues were fixed in 3.10, but
> of course I might be missing something, so I'll be glad if you point me to
> a specific JIRA ticket.
> We have a certain use case that requires updates on filtering (clustering)
> columns. Our motivation for using an MV was avoiding updates (delete +
> create) on primary-key columns, because we assume the Cassandra developers
> can manage this discouraged operation better than us. I'm really confused
> now.
>
>
>
> On Wednesday, July 24, 2019, 11:30:15 PM GMT+3, Jon Haddad <
> j...@jonhaddad.com> wrote:
>
>
> I really, really advise against using MVs.  I've had to help a number of
> teams move off them.  Not sure what list of bugs you read, but if the list
> didn't include "will destabilize your cluster to the point of constant
> downtime" then the list was incomplete.
>
> Jon
>
> On Wed, Jul 24, 2019 at 6:32 AM mehmet bursali 
> wrote:
>
> + additional info: our production environment is a multi-DC cluster that
> consists of 6 nodes in 2 DataCenters
>
>
>
>
> On Wednesday, July 24, 2019, 3:35:11 PM GMT+3, mehmet bursali
>  wrote:
>
>
> Hi Cassandra folks,
> I'm planning to use Materialized Views (MVs) in production for some specific
> cases. I've read a lot of blogs and technical documents about the risks of
> using them, and everything seems OK for our use case.
> My question is about the consistency (and durability) implications of using
> an MV with an additional primary key column. In one of our cases, we select
> a UDT column of the base table as the additional primary key column on the
> MV (the UDT's possible values are non-nullable and restricted to a domain).
> After a record is inserted in the base table, this additional column (the
> MV's primary key column) will also be updated one or two times. So in our
> case, for each update operation on the base table there are going to be
> delete and create operations inside the MV.
> From a consistency (and durability) perspective, does it matter whether the
> additional primary key column is used as a partition column or a clustering
> column?
>
>