Commit-log structure changes - versions

2019-02-07 Thread Sreenivasulu Nallapati
Hello folks, I am exploring the CDC option to move data from Cassandra to Hive on a periodic basis. While exploring this option, I heard that the internal commit-log structure changes from version to version. Is this correct? As per this link

Re: Two datacenters with one cassandra node in each datacenter

2019-02-07 Thread Kunal
Hi Dinesh, We have a very small setup and the size of the data is also very small. Max data size is around 2 GB. Latency expectation is around 10-15 ms. Regards, Kunal On Wed, Feb 6, 2019 at 11:27 PM dinesh.jo...@yahoo.com.INVALID wrote: > You also want to use Cassandra with a minimum of 3 nodes. > >

RE: range repairs multiple dc

2019-02-07 Thread Kenneth Brotman
This webpage has relevant information on procedures you need to use: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] Sent: Thursday, February 07, 2019 1:31 PM To: user@cassandra.apache.org

RE: range repairs multiple dc

2019-02-07 Thread Kenneth Brotman
A nice article on The Last Pickle blog at http://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html should be helpful to you. A line in the comments following the article states: “So restricting a -pr repair on a specific datacenter will be forbidden by Cassandra to

Re: How to read the Index.db file

2019-02-07 Thread Ben Slater
They don’t do exactly what you want but depending on why you are trying to get this info you might find our sstable-tools useful: https://github.com/instaclustr/cassandra-sstable-tools --- *Ben Slater* *Chief Product Officer*

RE: How to read the Index.db file

2019-02-07 Thread Kenneth Brotman
When you say you’re trying to get all the partitions of a particular SSTable, I’m not sure what you mean. Do you want to make a copy of it? I don’t understand. Kenneth Brotman From: Pranay akula [mailto:pranay.akula2...@gmail.com] Sent: Wednesday, February 06, 2019 7:51 PM To:

Re: Bootstrap keeps failing

2019-02-07 Thread Kenneth Brotman
Lots of things come to mind. We need more information from you to help us understand: How long have you had your cluster running? Is it generally working ok? Is it just one node that is misbehaving at a time? How many nodes do you need to replace? Are you doing rolling restarts instead of

Re: [EXTERNAL] Re: Bootstrap keeps failing

2019-02-07 Thread Léo FERLIN SUTTON
Thank you for the recommendation. We are already using DataStax's recommended settings for tcp_keepalive. Regards, Leo On Thu, Feb 7, 2019 at 5:49 PM Durity, Sean R wrote: > I have seen unreliable streaming (streaming that doesn’t finish) because > of TCP timeouts from firewalls or switches.

RE: [EXTERNAL] Re: Bootstrap keeps failing

2019-02-07 Thread Durity, Sean R
I have seen unreliable streaming (streaming that doesn’t finish) because of TCP timeouts from firewalls or switches. The default tcp_keepalive kernel parameters are usually not tuned for that. See https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/idleFirewallLinux.html for more

RE: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

2019-02-07 Thread Durity, Sean R
Kenneth is right. Trying to port/support a relational model to a CQL model the way you are doing it is not going to go well. You won’t be able to scale or get the search flexibility that you want. It will make Cassandra seem like a bad fit. You want to play to Cassandra’s strengths –

RE: SASI queries- cqlsh vs java driver

2019-02-07 Thread Kenneth Brotman
Peter, Sounds like you may need to use a different architecture. Perhaps you need something like Presto or Kafka as a part of the solution. If the data from the legacy system is wrong for Cassandra it’s an ETL problem? You’d have to transform the data you want to use with Cassandra so

range repairs multiple dc

2019-02-07 Thread CPC
Hi All, I searched the documentation but could not find enough information on the -pr option. Some documentation says you have to cover the whole ring; elsewhere it says you have to run it on every node regardless of whether you have multiple DCs. In our case we have three DCs (DC1,DC2,DC3) with