RE: [EXTERNAL] multiple Cassandra instances per server, possible?

2019-04-18 Thread Durity, Sean R
What is the data problem that you are trying to solve with Cassandra? Is it 
high availability? Low latency queries? Large data volumes? High concurrent 
users? I would design the solution to fit the problem(s) you are solving.

For example, if high availability is the goal, I would be very cautious about 2 
nodes/machine. If you need the full amount of the disk – you *can* have larger 
nodes than 1 TB. I agree that administration tasks (like adding/removing nodes, 
etc.) are more painful with large nodes – but not impossible. For large amounts 
of data, I like nodes that have about 2.5 – 3 TB of usable SSD disk.

It is possible that your nodes might be under-utilized, especially at first. 
But if the hardware is already available, you have to use what you have.

We have run multiple nodes on a single physical server, but they were two 
separate clusters (for the same application). In that case, we had a different 
install location and different ports for one of the clusters.

Sean Durity

From: William R 
Sent: Thursday, April 18, 2019 9:14 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] multiple Cassandra instances per server, possible?

Hi all,

In our small company we have 10 nodes of 6 TB each (2 x 3 TB HDD), 128 GB RAM 
and 64 cores, and we are thinking of using them as Cassandra nodes. From what I am 
reading around, the community recommends that every node should not keep more 
than 1 TB of data, so I am wondering if it is possible to install 2 
instances per node using docker, so each docker instance can write to its own 
physical disk and utilise the rest of the hardware (CPU & RAM) more efficiently.

I understand that with this setup there is the danger of creating a single point of 
failure for 2 Cassandra nodes, but apart from that, do you think this is a feasible 
setup to start the cluster with?

Apart from the docker solution, do you recommend any other way to split a physical 
node into 2 instances? (VMware? Or maybe even 2 separate installations of 
Cassandra?)

Eventually we are aiming for a cluster consisting of 2 DCs with 10 nodes each (5 
bare-metal nodes per DC, each running 2 Cassandra instances).

Later, when we start introducing more nodes to the cluster, we can probably 
decommission the "double-instanced" ones and aim for a more homogeneous 
solution.

Thank you,

Wil





RE: [EXTERNAL] Re: Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException:

2019-04-17 Thread Durity, Sean R
If you are just trying to get a sense of the data, you could try adding a limit 
clause to limit the amount of results and hopefully beat the timeout.

However, ALLOW FILTERING really means "ALLOW ME TO DESTROY MY APPLICATION AND 
CLUSTER." It means the data model does not support the query and will not scale 
-- in this case, not even on one node. Design a new table to support the query 
with a proper partition key (and any clustering keys).
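
For illustration, a minimal CQL sketch of both suggestions (the keyspace, table, 
and column names here are hypothetical, not from this thread):

-- short-term: cap the result set while exploring the data
SELECT * FROM my_ks.events WHERE account_id = 42 LIMIT 100 ALLOW FILTERING;

-- long-term: a query-specific table where the filtered column is the partition key
CREATE TABLE my_ks.events_by_account (
    account_id int,
    event_id   timeuuid,
    payload    text,
    PRIMARY KEY ((account_id), event_id)
);

SELECT * FROM my_ks.events_by_account WHERE account_id = 42;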


Sean Durity


-Original Message-
From: Dinesh Joshi 
Sent: Wednesday, April 17, 2019 2:39 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Caused by: 
com.datastax.driver.core.exceptions.ReadTimeoutException:

More info with detailed explanation: 
https://www.instaclustr.com/apache-cassandra-scalability-allow-filtering-partition-keys/

Dinesh

> On Apr 16, 2019, at 11:24 PM, Mahesh Daksha  wrote:
>
> Hi,
>
> How much data are you trying to read in the single query? Is it large in size, 
> or normal text data?
> Looking at the exception, it seems the node is unable to deliver the data within 
> the stipulated time. I have faced a similar issue where the response data was huge 
> (some binary data), but it was solved once we spread the data across 
> multiple rows.
>
> Thanks,
> Mahesh Daksha
>
> On Wed, Apr 17, 2019 at 11:42 AM Krishnanand Khambadkone 
>  wrote:
> Hi,  I have a single-instance Cassandra server.  I am trying to execute a 
> query with the ALLOW FILTERING option.  When I run this same query from cqlsh it 
> runs fine, but when I try to execute the query through the Java driver it 
> throws this exception.  I have increased all the timeouts in the cassandra.yaml 
> file and also included the read timeout option in the SimpleStatement query I 
> am running.  Any idea how I can fix this issue?
> Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: 
> Cassandra timeout during read query at consistency LOCAL_ONE (1 responses 
> were required but only 0 replica responded)
>
>










RE: [EXTERNAL] Re: Getting Consistency level TWO when it is requested LOCAL_ONE

2019-04-11 Thread Durity, Sean R
https://issues.apache.org/jira/browse/CASSANDRA-9620 has something similar that 
was determined to be a driver error. I would start with looking at the driver 
version and also the RetryPolicy that is in effect for the Cluster. Secondly, I 
would look at whether a batch is really needed for the statements. Cassandra 
batches are for atomicity – not speed.
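
For reference, a minimal CQL sketch of the two batch flavors (keyspace, table, and 
column names are hypothetical). A logged batch is what produces the BATCH_LOG write 
at consistency TWO seen in the quoted error below, because the coordinator first 
writes a copy of the batch to the batchlog on two replicas; an unlogged batch skips 
the batchlog entirely:

-- logged batch: atomic across partitions, at the cost of the batchlog write
BEGIN BATCH
  INSERT INTO my_ks.orders (order_id, status) VALUES (11111111-1111-1111-1111-111111111111, 'NEW');
  INSERT INTO my_ks.orders_by_status (status, order_id) VALUES ('NEW', 11111111-1111-1111-1111-111111111111);
APPLY BATCH;

-- unlogged batch: no batchlog write, no atomicity guarantee across partitions
BEGIN UNLOGGED BATCH
  INSERT INTO my_ks.orders (order_id, status) VALUES (22222222-2222-2222-2222-222222222222, 'NEW');
APPLY BATCH;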


Sean Durity
Staff Systems Engineer – Cassandra
MTC 2250
#cassandra - for the latest news and updates



From: Mahesh Daksha 
Sent: Thursday, April 11, 2019 5:21 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Getting Consistency level TWO when it is requested 
LOCAL_ONE

Hi Jean,

I want to understand how you are setting the write consistency level to 
LOCAL_ONE. That is, are you specifying the consistency level with every query, or 
have you set it in the Spring Cassandra config with the provided consistency level, 
like this:
cluster.setQueryOptions(new 
QueryOptions().setConsistencyLevel(ConsistencyLevel.valueOf(cassandraConsistencyLevel)));

The only possibility I see for such behavior is that it is getting overridden from 
somewhere.

Thanks,
Mahesh Daksha

On Thu, Apr 11, 2019 at 1:43 PM Jean Carlo 
mailto:jean.jeancar...@gmail.com>> wrote:
Hello everyone,

I have a case where the developers are using spring data framework for 
Cassandra. We are writing batches setting consistency level at LOCAL_ONE but we 
got a timeout like this

Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra 
timeout during BATCH_LOG write query at consistency TWO (2 replica were 
required but only 1 acknowledged the write)

Is it Cassandra that somehow writes to the system.batchlog using consistency 
TWO or is it spring data that makes some dirty things behind the scenes ?
(I want to believe it is the second one)

Cheers

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay





RE: [EXTERNAL] Issue while updating a record in 3 node cassandra cluster deployed using kubernetes

2019-04-09 Thread Durity, Sean R
My first suspicion would be to look at the server times in the cluster. It 
looks like other cases where a write occurs (with no errors) but the data is 
not retrieved as expected. If the write occurs with an earlier timestamp than 
the existing data, this is the behavior you would see. The write would occur, 
but it would not be the latest data to be retrieved. The write looks like it 
fails silently, but it actually does exactly what it is designed to do.
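
One quick way to check for that (a sketch; the keyspace, table, and column names 
are hypothetical) is to compare the timestamp Cassandra actually stored for the 
column against the wall-clock time the update was issued:

SELECT chunk_id, state, writetime(state)
FROM my_ks.chunk_meta
WHERE chunk_id = 42;

writetime() returns microseconds since the epoch; if the stored value is later than 
the time of the "failed" update, a write with a newer timestamp has silently won.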

Sean Durity

From: Mahesh Daksha 
Sent: Tuesday, April 09, 2019 9:10 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Issue while updating a record in 3 node cassandra cluster 
deployed using kubernetes


Hello All,

I have a 3 node cassandra cluster with Replication factor as 2 and read-write 
consistency set to QUORUM. We are using Spring data cassandra. All 
infrastructure is deployed using kubernetes.

Now, in the normal use case, many records get inserted into the Cassandra table. 
Then we try to modify/update one of the records using the save method of the 
repository, like below:

ChunkMeta tmpRec = chunkMetaRepository.save(chunkMeta);

After execution of the above statement we never see any exception or error, but 
the update still fails silently and intermittently. That is, at times the record in 
the db gets updated successfully, whereas at other times it fails. Also, when we 
print tmpRec above, it contains the updated and correct value every time; still, 
these updated values don't get reflected in the db.

We checked the Cassandra transport TRACE logs on all nodes and found that our 
queries are getting logged there and are being executed, without any error 
or exception.

Another weird observation is that all of this works perfectly fine if I am using a 
single Cassandra node (in Kubernetes) or if we deploy the above infra using Ansible 
(it even works for 3 nodes with Ansible).

It looks like the issue is specific to the Kubernetes 3-node deployment of 
Cassandra. Primarily, it looks like replication among the nodes is causing this.

Please suggest.




Below are the contents of my Cassandra Dockerfile:

FROM ubuntu:16.04

RUN apt-get update && apt-get install -y python sudo lsof vim dnsutils net-tools && apt-get clean && \
    addgroup testuser && useradd -g testuser testuser && usermod --password testuser testuser;

RUN mkdir -p /opt/test && \
    mkdir -p /opt/test/data;

ADD jre8.tar.gz /opt/test/
ADD apache-cassandra-3.11.0-bin.tar.gz /opt/test/

RUN chmod 755 -R /opt/test/jre && \
    ln -s /opt/test/jre/bin/java /usr/bin/java && \
    mv /opt/test/apache-cassandra* /opt/test/cassandra;

RUN mkdir -p /opt/test/cassandra/logs;

ENV JAVA_HOME /opt/test/jre
RUN export JAVA_HOME

COPY version.txt /opt/test/cassandra/version.txt

WORKDIR /opt/test/cassandra/bin/

RUN mkdir -p /opt/test/data/saved_caches && \
    mkdir -p /opt/test/data/commitlog && \
    mkdir -p /opt/test/data/hints && \
    chown -R testuser:testuser /opt/test/data && \
    chown -R testuser:testuser /opt/test;

USER testuser

CMD cp /etc/cassandra/cassandra.yml ../conf/conf.yml && perl -p -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg; s/\$\{([^}]+)\}//eg' ../conf/conf.yml > ../conf/cassandra.yaml && rm ../conf/conf.yml && ./cassandra -f

Please note that conf.yml is basically the cassandra.yaml file, containing the 
properties related to Cassandra.



Thanks,

Mahesh Daksha




RE: [EXTERNAL] Re: Garbage Collector

2019-03-19 Thread Durity, Sean R
My default is G1GC using 50% of available RAM (so typically a minimum of 16 GB 
for the JVM). That has worked in just about every case I’m familiar with. In 
the old days we used CMS, but tuning that beast is a black art with few wizards 
available (though several on this mailing list). Today, I just don’t see GC 
issues – unless there is a bad query in play. For me, the data model/query 
construction is the more fruitful path to achieving performance and reliability.



Sean Durity

From: Jon Haddad 
Sent: Tuesday, March 19, 2019 2:16 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Garbage Collector

G1 is optimized for high throughput with higher pause times.  It's great if you 
have mixed / unpredictable workloads, and as Elliott mentioned is mostly set & 
forget.

ZGC requires Java 11, which is only supported on trunk.  I plan on messing with 
it soon, but I haven't had time yet.  We'll share the results on our blog (TLP) 
when we get to it.

Jon

On Tue, Mar 19, 2019 at 10:12 AM Elliott Sims 
mailto:elli...@backblaze.com>> wrote:
I use G1, and I think it's actually the default now for newer Cassandra 
versions.  For G1, I've done very little custom config/tuning.  I increased 
heap to 16GB (out of 64GB physical), but most of the rest is at or near 
default.  For the most part, it's been "feed it more RAM, and it works" 
compared to CMS's "lower overhead, works great until it doesn't" and dozens of 
knobs.
I haven't tried ZGC yet, but anecdotally I've heard that it doesn't really 
match or beat G1 quite yet.

On Tue, Mar 19, 2019 at 9:44 AM Ahmed Eljami 
mailto:ahmed.elj...@gmail.com>> wrote:
Hi Folks,

Does someone use G1 GC or ZGC on production?

Can you share your feedback, the configuration used if it's possible ?

Thanks.






RE: [EXTERNAL] Re: Default TTL on CF

2019-03-14 Thread Durity, Sean R
I spent a month of my life on a similar problem... There wasn't an easy answer, 
but this is what I did:

#1 - Stop the problem from growing further. Get new inserts using a TTL (or set 
the default on the table so they get it). App team had to do this one.
#2 - Delete any data that should already be expired.
- In my case the partition key included a date in the composite string they had 
built. So I could know from the partition key if the data needed to be deleted. 
I used sstablekeys to get the keys into files on each host. Then I parsed the 
files and created deletes for only those expired records. Then I executed the 
deletes. Then I had to do some compaction to actually create disk space. A long 
process with hundreds of billions of records...
#3 - Add TTL to data that should live. I gave this to the app team. Using the 
extracted keys I gave them, they could calculate the proper TTL. They read the 
data with the key, calculated TTL, and rewrote the data with TTL. Long, boring, 
etc. but they did it.
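
For illustration, the CQL pieces of those steps might look like this (a sketch with 
hypothetical keyspace/table/column names; extracting the keys themselves was done 
with sstablekeys, outside of CQL):

-- #1: give new writes a default expiration (value in seconds)
ALTER TABLE my_ks.events WITH default_time_to_live = 7776000;   -- ~90 days

-- #2: delete a record whose partition key shows it is already expired
DELETE FROM my_ks.events WHERE part_key = '20180115:abc';

-- #3: re-write a surviving record with the TTL it should have had
UPDATE my_ks.events USING TTL 2592000
SET payload = 'existing-value'
WHERE part_key = '20190601:abc';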



Sean Durity

-Original Message-
From: Jeff Jirsa 
Sent: Thursday, March 14, 2019 9:30 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Default TTL on CF

SSTableReader and CQLSSTableWriter if you’re comfortable with Java


--
Jeff Jirsa


> On Mar 14, 2019, at 1:28 PM, Nick Hatfield  wrote:
>
> Bummer, but reasonable. Any cool tricks I could use to make that process
> easier? I have many TB of data on a live cluster and was hoping to
> start cleaning out the earlier bad habits of data housekeeping
>
>> On 3/14/19, 9:24 AM, "Jeff Jirsa"  wrote:
>>
>> It does not impact existing data
>>
>> The data gets an expiration time stamp when you write it. Changing the
>> default only impacts newly written data
>>
>> If you need to change the expiration time on existing data, you must
>> update it
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>>> On Mar 14, 2019, at 1:16 PM, Nick Hatfield 
>>> wrote:
>>>
>>> Hello,
>>>
>>> Can anyone tell me if setting a default TTL will affect existing data?
>>> I would like to enable a default TTL and have cassandra add that TTL to
>>> any rows that don¹t currently have a TTL set.
>>>
>>> Thanks,
>>
>>
>
>
>








RE: [EXTERNAL] Re: Migrate large volume of data from one table to another table within the same cluster when COPY is not an option.

2019-03-14 Thread Durity, Sean R
The possibility of a highly available way to do this gives more challenges. I 
would be weighing the cost of a complex solution vs the possibility of a 
maintenance window when you stop your app to move the data, then restart.

For the straight copy of the data, I am currently enamored with DataStax’s 
dsbulk utility for unloading and loading larger amounts of data. I don’t have 
extensive experience, yet, but it has been fast enough in my experiments – and 
that is without doing too much tuning for speed. From a host not in the 
cluster, I was able to extract 3.5 million rows in about 11 seconds. I inserted 
them into a differently partitioned table in about 26 seconds. Very small data 
rows, but it was impressive for not doing much to try and speed it up further. 
(In some other tests, it was about ¼ the time of simple copy statement from 
cqlsh)

If I was designing something for a “can’t take an outage” scenario, I would 
start with:

- Writing the data to the old and new tables on all inserts

- On reads, read from the new table first. If not there, read from the 
old table <-- could introduce some latency, but would be available; could also 
do asynchronous reads on both tables and choose the latest

- Do this until the data has been copied from old to new (with dsbulk 
or custom code or Spark)

- Drop the double writes and conditional reads


Sean

From: Stefan Miklosovic 
Sent: Wednesday, March 13, 2019 6:39 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Migrate large volume of data from one table to 
another table within the same cluster when COPY is not an option.

Hi Leena,

as already suggested in my previous email, you could use Apache Spark and the 
Cassandra Spark connector (1). I have checked TTLs and I believe you should 
especially read this section (2) about TTLs. It seems like that's what you need to 
do: TTLs per row. The workflow would be that you read from your source table, 
make transformations per row (via some mapping) and then save it to the 
new table.

This would import it "all", but while records are still being saved into the 
original table before you switch to the new one, I am not sure how to cover "the 
gap": once you make the switch, you would miss records which were created in the 
first table after you did the loading. You could maybe leverage Spark streaming 
(the Cassandra connector supports that too) so you would apply this transformation 
on the fly to the new records.

(1) https://github.com/datastax/spark-cassandra-connector
(2) https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md#using-a-different-value-for-each-row


On Thu, 14 Mar 2019 at 00:13, Leena Ghatpande 
mailto:lghatpa...@hotmail.com>> wrote:
Understood, a 2nd table would be a better approach. So what would be the best way 
to copy 70M rows from the current table to the 2nd table, with the TTL set on each 
record as in the first table?

____
From: Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>>
Sent: Wednesday, March 13, 2019 8:17 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: [EXTERNAL] Re: Migrate large volume of data from one table to 
another table within the same cluster when COPY is not an option.


Correct, there is no current flag. I think there SHOULD be one.





From: Dieudonné Madishon NGAYA mailto:dmng...@gmail.com>>
Sent: Tuesday, March 12, 2019 7:17 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Migrate large volume of data from one table to another 
table within the same cluster when COPY is not an option.



Hi Sean, you can’t set a flag in cassandra.yaml to disallow ALLOW FILTERING; the 
only thing you can do will be from your data model.

Don’t ask Cassandra to query all the data from a table; the ideal query will 
use a single partition.



On Tue, Mar 12, 2019 at 6:46 PM Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>> 
wrote:

Hi Sean,



for sure, the best approach would be to create another table which would treat 
just that specific query.



How do I set the flag for not allowing allow filtering in cassandra.yaml? I 
read a doco and there seems to be nothing about that.



Regards



On Wed, 13 Mar 2019 at 06:57, Duri

RE: [EXTERNAL] Re: Cluster size "limit"

2019-03-13 Thread Durity, Sean R
Rebuild the DCs with a new number of vnodes… I have done it.

Sean

From: Ahmed Eljami 
Sent: Wednesday, March 13, 2019 2:09 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Cluster size "limit"

It is not possible with an existing cluster!
On Wed, 13 Mar 2019 at 18:39, Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
If you can change to 8 vnodes, it will be much better for repairs and other 
kinds of streaming operations. The old advice of 256 per node is now not very 
helpful.

Sean

From: Ahmed Eljami mailto:ahmed.elj...@gmail.com>>
Sent: Wednesday, March 13, 2019 1:27 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Cluster size "limit"

Yes, 256 vnodes

On Wed, 13 Mar 2019 at 17:31, Jeff Jirsa 
mailto:jji...@gmail.com>> wrote:
Do you use vnodes? How many vnodes per machine?
--
Jeff Jirsa


On Mar 13, 2019, at 3:58 PM, Ahmed Eljami 
mailto:ahmed.elj...@gmail.com>> wrote:
Hi,

We are planning to add a third datacenter to our cluster (it already has 2 
datacenters, every datacenter has 50 nodes, so 100 nodes in total).

My fear is that a large number of nodes per cluster (> 100) could cause a 
lot of problems like gossip duration, maintenance (repair...)...

I know that it depends on use cases, volume of data and many other things, but I 
would like you to share your experiences with that.

Thx











RE: [EXTERNAL] Re: Cluster size "limit"

2019-03-13 Thread Durity, Sean R
If you can change to 8 vnodes, it will be much better for repairs and other 
kinds of streaming operations. The old advice of 256 per node is now not very 
helpful.

Sean

From: Ahmed Eljami 
Sent: Wednesday, March 13, 2019 1:27 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cluster size "limit"

Yes, 256 vnodes

On Wed, 13 Mar 2019 at 17:31, Jeff Jirsa 
mailto:jji...@gmail.com>> wrote:
Do you use vnodes? How many vnodes per machine?
--
Jeff Jirsa


On Mar 13, 2019, at 3:58 PM, Ahmed Eljami 
mailto:ahmed.elj...@gmail.com>> wrote:
Hi,

We are planning to add a third datacenter to our cluster (it already has 2 
datacenters, every datacenter has 50 nodes, so 100 nodes in total).

My fear is that a large number of nodes per cluster (> 100) could cause a 
lot of problems like gossip duration, maintenance (repair...)...

I know that it depends on use cases, volume of data and many other things, but I 
would like you to share your experiences with that.

Thx








RE: [EXTERNAL] Re: Migrate large volume of data from one table to another table within the same cluster when COPY is not an option.

2019-03-13 Thread Durity, Sean R
Correct, there is no current flag. I think there SHOULD be one.


From: Dieudonné Madishon NGAYA 
Sent: Tuesday, March 12, 2019 7:17 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Migrate large volume of data from one table to another 
table within the same cluster when COPY is not an option.

Hi Sean, you can’t set a flag in cassandra.yaml to disallow ALLOW FILTERING; the 
only thing you can do will be from your data model.
Don’t ask Cassandra to query all the data from a table; the ideal query will 
use a single partition.

On Tue, Mar 12, 2019 at 6:46 PM Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>> 
wrote:
Hi Sean,

for sure, the best approach would be to create another table which would treat 
just that specific query.

How do I set the flag for not allowing allow filtering in cassandra.yaml? I 
read a doco and there seems to be nothing about that.

Regards

On Wed, 13 Mar 2019 at 06:57, Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
If there are 2 access patterns, I would consider having 2 tables. The first one 
with the ID, which you say is the majority use case.  Then have a second table 
that uses a time-bucket approach as others have suggested:
(time bucket, id) as primary key
Choose a time bucket (day, week, hour, month, whatever) that would hold less 
than 100 MB of data in the time-bucket partition.

You could include all relevant data in the second table to meet your query. OR, 
if that data seems too large or too volatile to duplicate, just include your 
primary key and look-up the data in the primary table as needed.

If you use allow filtering, you are setting yourself up for failure to scale. I 
tell my developers, “if you use allow filtering, you are doing it wrong.” In 
fact, I think the Cassandra admin should be able to set a flag in 
cassandra.yaml to not allow filtering at all. The cluster should be able to 
protect itself from bad queries.



From: Leena Ghatpande mailto:lghatpa...@hotmail.com>>
Sent: Tuesday, March 12, 2019 9:02 AM
To: Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>>; 
user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Migrate large volume of data from one table to another 
table within the same cluster when COPY is not an option.

Our data model cannot be like you have recommended below, as the majority of the 
reads need to select the data by the partition key (id) only, not by date.
You could remodel your data in such way that you would make primary key like 
this
((date), hour-minute, id)
or
((date, hour-minute), id)


By adding the date as clustering column, yes the idea was to use the Allow 
Filtering on the date and pull the records. Understand that it is not 
recommended to do this, but we have been doing this on another existing large 
table and have not run into any issue so far. But want to understand if there 
is a better approach to this?

Thanks


From: Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>>
Sent: Monday, March 11, 2019 7:12 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: Migrate large volume of data from one table to another table 
within the same cluster when COPY is not an option.

The query which does not work should be like this, I made a mistake there

cqlsh> SELECT * from my_keyspace.my_table where  number > 2;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot 
execute this query as it might involve data filtering and thus may have 
unpredictable performance. If you want to execute this query despite the 
performance unpredictability, use ALLOW FILTERING"


On Tue, 12 Mar 2019 at 10:10, Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>> 
wrote:
Hi Leena,

"We are thinking of creating a new table with a date field as a clustering 
column to be able to query for date ranges, but partition key to clustering key 
will be 1-1. Is this a good approach?"

If you want to select by some time range here, I am wondering how would making 
datetime a clustering column help you here? You still have to provide primary 
key, right?

E.g. select * from your_keyspace.your_table where id=123 and my_date > 
yesterday and my_date < tomorrow (you got the idea)

If you make my_date a clustering column, you cannot do the query below, because you 
still have to specify the partition key fully and then the clustering key 
(optionally), where you can further order and do ranges. But you can't do a query 
without specifying the partition key. Well, you can use ALLOW FILTERING, but you do 
not want to do this at all in your situation as it would scan everything.

select * from your_keyspace.your_table where my_date > yesterday and my_date < 
tomorrow

cqlsh> create KEYSPACE my_keyspace WITH replication = {'class': 
'NetworkTopologyStrategy', 'dc1': '1'};
cqlsh> CREATE TABLE my_keyspace.my_table (i

RE: Migrate large volume of data from one table to another table within the same cluster when COPY is not an option.

2019-03-12 Thread Durity, Sean R
If there are 2 access patterns, I would consider having 2 tables. The first one 
with the ID, which you say is the majority use case.  Then have a second table 
that uses a time-bucket approach as others have suggested:
(time bucket, id) as primary key
Choose a time bucket (day, week, hour, month, whatever) that would hold less 
than 100 MB of data in the time-bucket partition.

You could include all relevant data in the second table to meet your query. OR, 
if that data seems too large or too volatile to duplicate, just include your 
primary key and look-up the data in the primary table as needed.
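
A sketch of what that second table might look like (all names here are 
hypothetical):

CREATE TABLE my_ks.records_by_bucket (
    bucket  date,      -- one partition per day, sized to stay under ~100 MB
    id      uuid,
    payload text,      -- or just the id, with a second lookup into the main table
    PRIMARY KEY ((bucket), id)
);

-- a date-range query becomes a small, known set of single-partition reads
SELECT * FROM my_ks.records_by_bucket WHERE bucket = '2019-03-12';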

If you use allow filtering, you are setting yourself up for failure to scale. I 
tell my developers, "if you use allow filtering, you are doing it wrong." In 
fact, I think the Cassandra admin should be able to set a flag in 
cassandra.yaml to not allow filtering at all. The cluster should be able to 
protect itself from bad queries.



From: Leena Ghatpande 
Sent: Tuesday, March 12, 2019 9:02 AM
To: Stefan Miklosovic ; 
user@cassandra.apache.org
Subject: [EXTERNAL] Re: Migrate large volume of data from one table to another 
table within the same cluster when COPY is not an option.

Our data model cannot be like you have recommended below, as the majority of the 
reads need to select the data by the partition key (id) only, not by date.
You could remodel your data in such way that you would make primary key like 
this
((date), hour-minute, id)
or
((date, hour-minute), id)


By adding the date as clustering column, yes the idea was to use the Allow 
Filtering on the date and pull the records. Understand that it is not 
recommended to do this, but we have been doing this on another existing large 
table and have not run into any issue so far. But want to understand if there 
is a better approach to this?

Thanks


From: Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>>
Sent: Monday, March 11, 2019 7:12 PM
To: user@cassandra.apache.org
Subject: Re: Migrate large volume of data from one table to another table 
within the same cluster when COPY is not an option.

The query which does not work should be like this, I made a mistake there

cqlsh> SELECT * from my_keyspace.my_table where  number > 2;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot 
execute this query as it might involve data filtering and thus may have 
unpredictable performance. If you want to execute this query despite the 
performance unpredictability, use ALLOW FILTERING"


On Tue, 12 Mar 2019 at 10:10, Stefan Miklosovic 
mailto:stefan.mikloso...@instaclustr.com>> 
wrote:
Hi Leena,

"We are thinking of creating a new table with a date field as a clustering 
column to be able to query for date ranges, but partition key to clustering key 
will be 1-1. Is this a good approach?"

If you want to select by some time range here, I am wondering how would making 
datetime a clustering column help you here? You still have to provide primary 
key, right?

E.g. select * from your_keyspace.your_table where id=123 and my_date > 
yesterday and my_date < tomorrow (you got the idea)

If you make my_date a clustering column, you cannot do the query below, because you 
still have to specify the partition key fully and then the clustering key 
(optionally), where you can further order and do ranges. But you can't do a query 
without specifying the partition key. Well, you can use ALLOW FILTERING, but you do 
not want to do this at all in your situation as it would scan everything.

select * from your_keyspace.your_table where my_date > yesterday and my_date < 
tomorrow

cqlsh> create KEYSPACE my_keyspace WITH replication = {'class': 
'NetworkTopologyStrategy', 'dc1': '1'};
cqlsh> CREATE TABLE my_keyspace.my_table (id uuid, number int, PRIMARY KEY 
((id), number));

cqlsh> SELECT * from my_keyspace.my_table ;

 id                                   | number
--------------------------------------+--------
 6e23f79a-8b67-47e0-b8e0-50be78bb1c7f |      3
 abdc0184-a695-427d-b63b-57cdf7a45f00 |      1
 90fe112e-0f74-4cbc-8767-67bdc9c8c3b0 |      4
 8cff3eb7-1aff-4dc7-9969-60190c7e4675 |      2

cqlsh> SELECT * from my_keyspace.my_table where id = 
'6e23f79a-8b67-47e0-b8e0-50be78bb1c7f' and  number > 2;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid 
STRING constant (6e23f79a-8b67-47e0-b8e0-50be78bb1c7f) for "id" of type uuid"

cqlsh> SELECT * from my_keyspace.my_table where id = 
6e23f79a-8b67-47e0-b8e0-50be78bb1c7f and  number > 2;

 id                                   | number
--------------------------------------+--------
 6e23f79a-8b67-47e0-b8e0-50be78bb1c7f |      3

You could remodel your data in such way that you would make primary key like 
this

((date), hour-minute, id)

or

((date, hour-minute), id)

I would prefer the second one because if you expect a lot of data per day, they 
would all end up on same set of replicas as hash of partition key 

RE: [EXTERNAL] Re: A Question About Hints

2019-03-05 Thread Durity, Sean R
Versions 2.0 and 2.1 were generally very stable, so I can understand a 
reticence to move when there are so many other things competing for time and 
attention.

Sean Durity




From: shalom sagges 
Sent: Monday, March 04, 2019 4:21 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: A Question About Hints

Everyone really should move off of the 2.x versions just like you are doing.
Tell me about it... But since there are a lot of groups involved, these things 
take time unfortunately.

Thanks for your assistance Kenneth

On Mon, Mar 4, 2019 at 11:04 PM Kenneth Brotman 
mailto:kenbrot...@yahoo.com.invalid>> wrote:
Since you are in the process of upgrading, I’d do nothing on the settings right 
now.  But if you wanted to do something on the settings in the meantime, based 
on my read of the information available, I’d maybe double the default settings. 
The upgrade will help a lot of things as you know.

Everyone really should move off of the 2.x versions just like you are doing.

From: shalom sagges 
[mailto:shalomsag...@gmail.com]
Sent: Monday, March 04, 2019 12:34 PM
To: user@cassandra.apache.org
Subject: Re: A Question About Hints

See my comments inline.

Do the 8 nodes clusters have the problem too?
Yes

To the same extent?
It depends on the throughput, but basically the smaller clusters get low 
throughput, so the problem is naturally smaller.

Is it any cluster across multi-DC’s?
Yes

Do all the clusters use nodes with similar specs?
All nodes have similar specs within a cluster but different specs on different 
clusters.

The version of Cassandra you are on can make a difference.  What version are 
you on?
Currently I'm on various versions, 2.0.14, 2.1.15 and 3.0.12. In the process of 
upgrading to 3.11.4

Did you see Edward Capriolo’s presentation at 26:19 into the YouTube video at: 
https://www.youtube.com/watch?v=uN4FtAjYmLU
 where he briefly mentions you can get into trouble if you go too fast or too 
slow?
I guess you can say it about almost any parameter you change :)

BTW, I thought the comments at the end of the article you mentioned were really 
good.
The entire article is very good, but I wonder if it's still valid since it was 
created around 4 years ago.

Thanks!




On Mon, Mar 4, 2019 at 9:37 PM Kenneth Brotman 
mailto:kenbrot...@yahoo.com.invalid>> wrote:
Makes sense. If you have time and don’t mind, could you answer the following:
Do the 8 nodes clusters have the problem too?
To the same extent?
Is it just the clusters with the large node count?
Is it any cluster across multi-DC’s?
Do all the clusters use nodes with similar specs?

The version of Cassandra you are on can make a difference.  What version are 
you on?

Did you see Edward Capriolo’s presentation at 26:19 into the YouTube video at: 
https://www.youtube.com/watch?v=uN4FtAjYmLU
 where he briefly mentions you can get into trouble if you go too fast or too 
slow?
BTW, I thought the comments at the end of the article you mentioned were really 
good.



From: shalom sagges 
[mailto:shalomsag...@gmail.com]
Sent: Monday, March 04, 2019 11:04 AM
To: user@cassandra.apache.org
Subject: Re: A Question About Hints

It varies...
Some clusters have 48 nodes, others 24 nodes and some 8 nodes.
Both settings are on default.

I’d try making a single conservative change to one or the other, measure and 
reassess.  Then do same to other setting.
That's the plan, but I thought I might first get some valuable information from 
someone in the community that has already experienced in this type of change.

Thanks!

On Mon, Mar 4, 2019 at 8:27 PM Kenneth Brotman 
mailto:kenbrot...@yahoo.com.invalid>> wrote:
It sounds like your use case might be appropriate for tuning those two settings 
some.

How many nodes are in the cluster?
Are both settings definitely on the default values currently?

I’d try making a single conservative change to one or the other, measure and 
reassess.  Then do same to other setting.

Then of course share your results with us.

From: shalom sagges 
[mailto:shalomsag...@gmail.com]
Sent: Monday, March 04, 2019 9:54 AM
To: user@cassandra.apache.org
Subject: Re: A Question About Hints

Hi Kenneth,

The concern is that in some cases, hints accumulate on nodes, and it takes a 
while until they are delivered (multi DCs).
I see that 

RE: [EXTERNAL] Re: Question on changing node IP address

2019-02-27 Thread Durity, Sean R
I am not making a recommendation for anyone else – just sharing our experience 
and reasoning. It is why I argued for keeping PropertyFileSnitch in some JIRA 
that proposed dropping it completely. GPFS is the typical recommendation for 
production use. Just a hurdle not worth my time at the moment.

From: Alexander Dejanovski 
Sent: Wednesday, February 27, 2019 9:22 AM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Question on changing node IP address

It has to be balanced with the dangers related to the PropertyFileSnitch.
I've seen such incidents happen twice in the last few months in different 
places and both times recovery was difficult and hazardous.

I still strongly recommend against it.

On Wed, Feb 27, 2019 at 3:11 PM Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
We use the PropertyFileSnitch precisely because it is the same on every node. 
If each node has to have a different file (for GPFS) – deployment is more 
complicated. (And for any automated configuration you would have a list of 
hosts and DC/rack information to compile anyway)

I do put UNKNOWN as the default DC so that any missed node easily appears in 
its own unused DC.


Sean Durity

From: Alexander Dejanovski 
mailto:a...@thelastpickle.com>>
Sent: Wednesday, February 27, 2019 4:43 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Question on changing node IP address

This snitch is easy to misconfigure. It allows some nodes to have a different 
view of the cluster if they are configured differently, which can result in 
data loss (or at least data that is very hard to recover).
Also it has a nasty feature that allows to set a default DC/Rack. If one node 
isn't properly declared in all the files throughout the cluster, it will be 
seen as part of that "default" DC and then again, it's hard to recover.
Be aware that while the GossipingPropertyFileSnitch will not allow changing the 
rack or DC for a node that has already bootstrapped, the PropertyFileSnitch will 
allow you to change it without any notice. So a little misconfiguration could merge 
all nodes from DC1 into DC2, abruptly changing token ownership (and it could 
very well be the case that DC1 thinks it's part of DC2 but DC2 still thinks DC1 is 
DC1).
So again, I think this snitch is dangerous and shouldn't be used. The 
GossipingPropertyFileSnitch is much more secure and easy to use.

Cheers,


On Wed, Feb 27, 2019 at 10:13 AM shalom sagges 
mailto:shalomsag...@gmail.com>> wrote:
If you're using the PropertyFileSnitch, well... you shouldn't as it's a rather 
dangerous and tedious snitch to use

I inherited Cassandra clusters that use the PropertyFileSnitch. It's been 
working fine, but you've kinda scared me :-)
Why is it dangerous to use?
If I decide to change the snitch, is it seamless or is there a specific 
procedure one must follow?

Thanks!


On Wed, Feb 27, 2019 at 10:08 AM Alexander Dejanovski 
mailto:a...@thelastpickle.com>> wrote:
I confirm what Oleksandr said.
Just stop Cassandra, change the IP, and restart Cassandra.
If you're using the GossipingPropertyFileSnitch, the node will redeclare its 
new IP through Gossip and that's it.
If you're using the PropertyFileSnitch, well... you shouldn't as it's a rather 
dangerous and tedious snitch to use. But if you are, it'll require to change 
the file containing all the IP addresses across the cluster.

I've been changing IPs on a whole cluster back in 2.1 this way and it went 
through seamlessly.

Cheers,

On Wed, Feb 27, 2019 at 8:54 AM Oleksandr Shulgin 
mailto:oleksandr.shul...@zalando.de>> wrote:
On Wed, Feb 27, 2019 at 4:15 AM 
wxn...@zjqunshuo.com<mailto:wxn...@zjqunshuo.com> 
mailto:wxn...@zjqunshuo.com>> wrote:
>After restart with the new address the server will notice it and log a 
>warning, but it will keep token ownership as long as it keeps the old host id 
>(meaning it must use the same data directory as before restart).

Based on my understanding, the token range is bound to the host id. As long as the 
host id doesn't change, everything is ok. Besides the data directory, can any other 
thing lead to a host id change? And how is the host id calculated? For example, if 
I upgrade the Cassandra binary to a new version, after restart, will the host id 
change?

I believe host id is calculated once the new node is initialized and never 
changes afterwards, even through major upgrades.  It is stored in system 
keyspace in data directory, and is stable across restarts.

--
Alex

--
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
--
-
Alexander Dejanovski
France
@alexanderdeja


RE: [EXTERNAL] Re: Question on changing node IP address

2019-02-27 Thread Durity, Sean R
We use the PropertyFileSnitch precisely because it is the same on every node. 
If each node has to have a different file (for GPFS) – deployment is more 
complicated. (And for any automated configuration you would have a list of 
hosts and DC/rack information to compile anyway)

I do put UNKNOWN as the default DC so that any missed node easily appears in 
its own unused DC.


Sean Durity

From: Alexander Dejanovski 
Sent: Wednesday, February 27, 2019 4:43 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Question on changing node IP address

This snitch is easy to misconfigure. It allows some nodes to have a different 
view of the cluster if they are configured differently, which can result in 
data loss (or at least data that is very hard to recover).
Also it has a nasty feature that allows to set a default DC/Rack. If one node 
isn't properly declared in all the files throughout the cluster, it will be 
seen as part of that "default" DC and then again, it's hard to recover.
Be aware that while the GossipingPropertyFileSnitch will not allow changing the 
rack or DC for a node that has already bootstrapped, the PropertyFileSnitch will 
allow you to change it without any notice. So a little misconfiguration could merge 
all nodes from DC1 into DC2, abruptly changing token ownership (and it could 
very well be the case that DC1 thinks it's part of DC2 but DC2 still thinks DC1 is 
DC1).
So again, I think this snitch is dangerous and shouldn't be used. The 
GossipingPropertyFileSnitch is much more secure and easy to use.

Cheers,


On Wed, Feb 27, 2019 at 10:13 AM shalom sagges 
mailto:shalomsag...@gmail.com>> wrote:
If you're using the PropertyFileSnitch, well... you shouldn't as it's a rather 
dangerous and tedious snitch to use

I inherited Cassandra clusters that use the PropertyFileSnitch. It's been 
working fine, but you've kinda scared me :-)
Why is it dangerous to use?
If I decide to change the snitch, is it seamless or is there a specific 
procedure one must follow?

Thanks!


On Wed, Feb 27, 2019 at 10:08 AM Alexander Dejanovski 
mailto:a...@thelastpickle.com>> wrote:
I confirm what Oleksandr said.
Just stop Cassandra, change the IP, and restart Cassandra.
If you're using the GossipingPropertyFileSnitch, the node will redeclare its 
new IP through Gossip and that's it.
If you're using the PropertyFileSnitch, well... you shouldn't as it's a rather 
dangerous and tedious snitch to use. But if you are, it'll require to change 
the file containing all the IP addresses across the cluster.

I've been changing IPs on a whole cluster back in 2.1 this way and it went 
through seamlessly.

Cheers,

On Wed, Feb 27, 2019 at 8:54 AM Oleksandr Shulgin 
mailto:oleksandr.shul...@zalando.de>> wrote:
On Wed, Feb 27, 2019 at 4:15 AM 
wxn...@zjqunshuo.com 
mailto:wxn...@zjqunshuo.com>> wrote:
>After restart with the new address the server will notice it and log a 
>warning, but it will keep token ownership as long as it keeps the old host id 
>(meaning it must use the same data directory as before restart).

Based on my understanding, the token range is bound to the host id. As long as the 
host id doesn't change, everything is ok. Besides the data directory, can any other 
thing lead to a host id change? And how is the host id calculated? For example, if 
I upgrade the Cassandra binary to a new version, after restart, will the host id 
change?

I believe host id is calculated once the new node is initialized and never 
changes afterwards, even through major upgrades.  It is stored in system 
keyspace in data directory, and is stable across restarts.

--
Alex

--
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
--
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com



RE: [EXTERNAL] Re: Question on changing node IP address

2019-02-26 Thread Durity, Sean R
This has not been my experience. Changing an IP address is one of the worst admin 
tasks for Cassandra. System.peers and other information on each node is stored 
by IP address. And gossip is really good at sending around the old information 
mixed with the new…



Sean Durity

From: Oleksandr Shulgin 
Sent: Tuesday, February 26, 2019 5:36 AM
To: User 
Subject: [EXTERNAL] Re: Question on changing node IP address

On Tue, Feb 26, 2019 at 9:39 AM 
wxn...@zjqunshuo.com 
mailto:wxn...@zjqunshuo.com>> wrote:

I'm running 2.2.8 with vnodes and I'm planning to change node IP address.
My procedure is:
Turn down one node, setting auto_bootstrap to false in yaml file, then bring it 
up with -Dcassandra.replace_address. Repeat the procedure one by one for the 
other nodes.

I care about streaming because the data is very large and if there is 
streaming, it will take a long time. When the node with the new IP is brought up, 
will it take over the token ranges it had before? I expect no token range 
reassignment and no streaming. Am I right?

Anything I need to care about when making the IP address change?

Changing the IP address of a node does not require special considerations.  
After restart with the new address the server will notice it and log a warning, 
but it will keep token ownership as long as it keeps the old host id (meaning 
it must use the same data directory as before restart).

At the same time, *do not* use the replace_address option: it assumes empty 
data directory and will try to stream data from other replicas into the node.
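
To make the procedure concrete, a rough per-node sketch (assuming a package install 
managed by systemd and listen_address/rpc_address set explicitly in cassandra.yaml -- 
adjust to your environment):

    nodetool drain                 # flush memtables, stop accepting traffic
    systemctl stop cassandra
    # edit listen_address / rpc_address (and the seed list, if needed) in cassandra.yaml
    systemctl start cassandra
    nodetool status                # node should rejoin with the same host id and tokens

No replace_address flag, and no wiping of the data directory.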

--
Alex






RE: [EXTERNAL] Re: Make large partitions lighter on select without changing primary partition formation.

2019-02-13 Thread Durity, Sean R
Agreed. It’s pretty close to impossible to administrate your way out of a data 
model that doesn’t play to Cassandra’s strengths. Which is true for other data 
storage technologies – you need to model the data the way that the engine is 
designed to work.


Sean Durity

From: DuyHai Doan 
Sent: Wednesday, February 13, 2019 8:08 AM
To: user 
Subject: [EXTERNAL] Re: Make large partitions lighter on select without changing 
primary partition formation.

Plain answer is NO

There is a slight hope that the JIRA 
https://issues.apache.org/jira/browse/CASSANDRA-9754
 gets into the 4.0 release.

But right now there seems to be little interest in this ticket; the last comment 
dates back to 23/Feb/2017 ...


On Wed, Feb 13, 2019 at 1:18 PM Vsevolod Filaretov 
mailto:vsfilare...@gmail.com>> wrote:
Hi all,

The question.

We have Cassandra 3.11.1 with really heavy primary partitions:
cfhistograms shows the 95th percentile partition size at 130+ MB, the 95th 
percentile cell count at 3.3 million and higher, and the 98th percentile partition 
size at 220+ MB; sometimes partitions are 1+ GB. We have regular problems with node 
lockdowns leading to read request timeouts under read load.

Changing the primary partition key structure is out of the question.

Are there any sharding techniques available to dilute partitions at a level below 
the 'select' requests, to improve read performance without changing the read 
request syntax?

Thank you all in advance,
Vsevolod Filaretov.





RE: [EXTERNAL] Re: Bootstrap keeps failing

2019-02-07 Thread Durity, Sean R
I have seen unreliable streaming (streaming that doesn’t finish) because of TCP 
timeouts from firewalls or switches. The default tcp_keepalive kernel 
parameters are usually not tuned for that. See 
https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/idleFirewallLinux.html
 for more details. These “remote” timeouts are difficult to detect or prove if 
you don’t have access to the intermediate network equipment.
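
As an illustration, a sketch of the kind of kernel settings involved (example values 
for a firewall that drops idle connections -- check the linked page and your own 
firewall timeout before copying them):

    # inspect current values
    sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_probes net.ipv4.tcp_keepalive_intvl
    # example: start probing after 60s idle, 3 probes, 10s apart (persist in /etc/sysctl.conf)
    sudo sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10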

Sean Durity
From: Léo FERLIN SUTTON 
Sent: Thursday, February 07, 2019 10:26 AM
To: user@cassandra.apache.org; dinesh.jo...@yahoo.com
Subject: [EXTERNAL] Re: Bootstrap keeps failing

Hello !

Thank you for your answers.

So I have tried, multiple times, to start bootstrapping from scratch. I often 
have the same problem (on other nodes as well) but sometimes it works and I can 
move on to another node.

I have joined a jstack dump and some logs.

Our node was shut down at around 97% disk space used.
I turned it back on and it starting the bootstrap process again.

The log file is the log from this attempt, same for the thread dump.

Small warning, I have somewhat anonymised the log files so there may be some 
inconsistencies.

Regards,

Leo

On Thu, Feb 7, 2019 at 8:13 AM 
dinesh.jo...@yahoo.com.INVALID 
mailto:dinesh.jo...@yahoo.com.invalid>> wrote:
Would it be possible for you to take a thread dump & logs and share them?

Dinesh


On Wednesday, February 6, 2019, 10:09:11 AM PST, Léo FERLIN SUTTON 
mailto:lfer...@mailjet.com.INVALID>> wrote:


Hello !

I am having a recurrent problem when trying to bootstrap a few new nodes.

Some general info :

  *   I am running cassandra 3.0.17
  *   We have about 30 nodes in our cluster
  *   All healthy nodes have between 60% to 90% used disk space on 
/var/lib/cassandra
So I create a new node and let auto_bootstrap do its job. After a few days the 
bootstrapping node stops streaming new data but is still not a member of the 
cluster.

`nodetool status` says the node is still joining.

When this happens I run `nodetool bootstrap resume`. This usually ends in one of 
two ways:

  1.  The node fills up to 100% disk space and crashes.
  2.  The bootstrap resume finishes with errors
When I look at `nodetool netstats -H` it looks like `bootstrap resume` does 
not resume but restarts a full transfer of all data from every node.

This is the output I get from `nodetool resume` :
[2019-02-06 01:39:14,369] received file 
/var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-225-big-Data.db
 (progress: 2113%)
[2019-02-06 01:39:16,821] received file 
/var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-88-big-Data.db
 (progress: 2113%)
[2019-02-06 01:39:17,003] received file 
/var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-89-big-Data.db
 (progress: 2113%)
[2019-02-06 01:39:17,032] session with /10.16.XX.YYY complete (progress: 2113%)
[2019-02-06 01:41:15,160] received file 
/var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-220-big-Data.db
 (progress: 2113%)
[2019-02-06 01:42:02,864] received file 
/var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-226-big-Data.db
 (progress: 2113%)
[2019-02-06 01:42:09,284] received file 
/var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-227-big-Data.db
 (progress: 2113%)
[2019-02-06 01:42:10,522] received file 
/var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-228-big-Data.db
 (progress: 2113%)
[2019-02-06 01:42:10,622] received file 
/var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-229-big-Data.db
 (progress: 2113%)
[2019-02-06 01:42:11,925] received file 
/var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-90-big-Data.db
 (progress: 2114%)
[2019-02-06 01:42:14,887] received file 
/var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-91-big-Data.db
 (progress: 2114%)
[2019-02-06 01:42:14,980] session with /10.16.XX.ZZZ complete (progress: 2114%)
[2019-02-06 01:42:14,980] Stream failed
[2019-02-06 01:42:14,982] Error during bootstrap: Stream failed
[2019-02-06 01:42:14,982] Resume bootstrap complete

The bootstrap `progress` goes way over 100% and eventually fails.


Right now I have a node with this output from `nodetool status` :
`UJ  10.16.XX.YYY  2.93 TB  256  ?  5788f061-a3c0-46af-b712-ebeecd397bf7  c`

It is almost filled with data, yet if I look at `nodetool netstats` :
Receiving 480 files, 325.39 GB total. Already received 5 files, 68.32 
MB total
Receiving 499 files, 328.96 GB total. Already received 1 files, 1.32 GB 
total
Receiving 506 files, 345.33 GB total. Already received 6 files, 24.19 
MB total
Receiving 362 files, 206.73 GB total. Already received 7 files, 34 MB 
total
Receiving 424 files, 281.25 GB total. Already 

RE: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

2019-02-07 Thread Durity, Sean R
Kenneth is right. Trying to port/support a relational model to a CQL model the 
way you are doing it is not going to go well. You won’t be able to scale or get 
the search flexibility that you want. It will make Cassandra seem like a bad 
fit. You want to play to Cassandra’s strengths – availability, low latency, 
scalability, etc. so you need to store the data the way you want to retrieve it 
(query first modeling!). You could look at defining the “right” partition and 
clustering keys, so that the searches are within a single, reasonably sized 
partition. And you could have lookup tables for other common search patterns 
(item_by_model_name, etc.)
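
For example, a hypothetical lookup table along the lines of the item_by_model_name 
idea (the column names here are made up for illustration, not taken from your schema):

    CREATE TABLE item_by_model_name (
        model_name text,
        item_id    text,
        item_data  text,
        PRIMARY KEY (model_name, item_id)
    );
    -- written alongside the base table at insert time, then queried by partition key:
    SELECT item_id, item_data FROM item_by_model_name WHERE model_name = ?;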

If that kind of modeling gets you to a situation where you have too many lookup 
tables to keep consistent, you could consider something like DataStax 
Enterprise Search (embedded SOLR) to create SOLR indexes on searchable fields. 
A SOLR query will typically be an order of magnitude slower than a partition 
key lookup, though.

It really boils down to the purpose of the data store. If you are looking for 
primarily an “anything goes” search engine, Cassandra may not be a good choice. 
If you need Cassandra-level availability, extremely low latency queries (on 
known access patterns), high volume/low latency writes, easy scalability, etc. 
then you are going to have to rethink how you model the data.


Sean Durity

From: Kenneth Brotman 
Sent: Thursday, February 07, 2019 7:01 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

Peter,

Sounds like you may need to use a different architecture.  Perhaps you need 
something like Presto or Kafka as a part of the solution.  If the data from the 
legacy system is wrong for Cassandra it’s an ETL problem?  You’d have to 
transform the data you want to use with Cassandra so that a proper data model 
for Cassandra can be used.

From: Peter Heitman [mailto:pe...@heitman.us]
Sent: Wednesday, February 06, 2019 10:05 PM
To: user@cassandra.apache.org
Subject: Re: SASI queries- cqlsh vs java driver

Yes, I have read the material. The problem is that the application has a query 
facility available to the user where they can type in "(A = foo AND B = bar) OR 
C = chex" where A, B, and C are from a defined list of terms, many of which are 
columns in the mytable below while others are from other tables. This query 
facility was implemented and shipped years before we decided to move to 
Cassandra
On Thu, Feb 7, 2019, 8:21 AM Kenneth Brotman 
mailto:kenbrot...@yahoo.com.invalid>> wrote:
The problem is you’re not using a query first design.  I would recommend first 
reading chapter 5 of Cassandra: The Definitive Guide by Jeff Carpenter and Eben 
Hewitt.  It’s available free online at this 
link.

Kenneth Brotman

From: Peter Heitman [mailto:pe...@heitman.us]
Sent: Wednesday, February 06, 2019 6:33 PM

To: user@cassandra.apache.org
Subject: Re: SASI queries- cqlsh vs java driver

Yes, I "know" that allow filtering is a sign of a (possibly fatal) inefficient 
data model. I haven't figured out how to do it correctly yet
On Thu, Feb 7, 2019, 7:59 AM Kenneth Brotman 
mailto:kenbrot...@yahoo.com.invalid>> wrote:
Exactly.  When you design your data model correctly you shouldn’t have to use 
ALLOW FILTERING in the queries.  That is not recommended.

Kenneth Brotman

From: Peter Heitman [mailto:pe...@heitman.us]
Sent: Wednesday, February 06, 2019 6:09 PM
To: user@cassandra.apache.org
Subject: Re: SASI queries- cqlsh vs java driver

You are completely right! My problem is that I am trying to port code from SQL 
to CQL for an application that provides the user with a relatively general 
search facility. The original implementation didn't worry about secondary 
indexes - it just took advantage of the ability to create arbitrarily complex 
queries with inner joins, left joins, etc. I am reimplementing it to create a 
parse tree of CQL queries and doing the ANDs and ORs in the application. Of 
course once I get enough of this implemented I will have to load up the table 
with a large data set and see if it gives acceptable performance for our use 
case.
On Wed, Feb 6, 2019, 8:52 PM Kenneth Brotman 
mailto:kenbrotman@yahoo.cominvalid>> wrote:
Isn’t that a lot of SASI indexes for one table?  Could you denormalize 

RE: [EXTERNAL] fine tuning for wide rows and mixed worload system

2019-01-11 Thread Durity, Sean R
I will start – knowing that others will have additional help/questions.

What heap size are you using? Sounds like you are using the CMS garbage 
collector. That takes some arcane knowledge and lots of testing to tune. I 
would start with G1 and use ½ the available RAM as the heap size. I would 
want 32 GB RAM as a minimum on the hosts.
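
For example, on a 32 GB host that would look roughly like this in jvm.options (a 
sketch, not a tuned recommendation -- remember to comment out the CMS flags that 
ship enabled by default):

    -Xms16G
    -Xmx16G
    -XX:+UseG1GC
    -XX:MaxGCPauseMillis=500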

Spinning disks are a problem, too. Can you tell if the IO is getting 
overwhelmed? SSDs are much preferred.

Read before write is usually an anti-pattern for Cassandra. From your queries, 
it seems you have a partition key and clustering key. Can you give us the table 
schema? I’m also concerned about the IF EXISTS in your delete. I think that 
invokes a lightweight transaction – costly for performance. Is it really 
required for your use case?
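
For comparison, using the delete from your message below -- the conditional form pays 
for a Paxos round on every call, while the plain form does not (assuming you don't 
actually need to know whether the row existed):

    -- lightweight transaction (extra round trips per delete):
    delete from my_keyspace.my_table where pkey = ? and event_datetime = ? IF EXISTS;
    -- plain delete, same end state:
    delete from my_keyspace.my_table where pkey = ? and event_datetime = ?;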


Sean Durity

From: Marco Gasparini 
Sent: Friday, January 11, 2019 8:20 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] fine tuning for wide rows and mixed worload system

Hello everyone,

I need some advice in order to solve my use case problem. I have already tried 
some solutions but they didn't work out.
Can you help me with the following configuration please? Any help is very much 
appreciated.

I'm using:
- Cassandra 3.11.3
- java version "1.8.0_191"

My use case is composed of the following constraints:
- about 1M reads per day (and growing)
- about 2M writes per day (and growing)
- there is a high peak of requests in less than 2 hours in which the system 
receives half of the whole day's traffic (500K reads, 1M writes)
- each request is composed of 1 read and 2 writes (1 delete + 1 write)

* the read query selects max 3 records based on the primary key 
(select * from my_keyspace.my_table where pkey = ? limit 3)
* then is performed a deletion of one record (delete from 
my_keyspace.my_table where pkey = ? and event_datetime = ? IF EXISTS)
* finally the new data is stored (insert into my_keyspace.my_table 
(event_datetime, pkey, agent, some_id, ft, ftt..) values (?,?,?,?,?,?...))

- each row is pretty wide. I don't really know the exact size because there are 
2 dynamic text columns that store between 1 MB and 50 MB of data each.
  So, reads are going to be huge because I read 3 records of that size 
every time. Writes are heavy as well because each row is that wide.

Currently, I own 3 nodes with the following properties:
- node1:
* Intel Core i7-3770
* 2x HDD SATA 3,0 TB
* 4x RAM 8192 MB DDR3
* nominative bit rate 175MB/s
# blockdev --report /dev/sd[ab]
RORA   SSZ   BSZ   StartSecSize   Device
rw   256   512  4096  0   3000592982016   
/dev/sda
rw   256   512  4096  0   3000592982016   
/dev/sdb

- node2,3:
* Intel Core i7-2600
* 2x HDD SATA 3,0 TB
* 4x RAM 4096 MB DDR3
* nominative bit rate 155MB/s
# blockdev --report /dev/sd[ab]
RORA   SSZ   BSZ   StartSecSize   Device
rw   256   512  4096  0   3000592982016   
/dev/sda
rw   256   512  4096  0   3000592982016   
/dev/sdb

Each node has 2 disks but I have disabled the RAID option and created a 
single virtual disk in order to get more free space.
Can this configuration create issues?

I have already tried some configurations in order to make it work, like:
1) straightforward attempt
- default Cassandra configuration (cassandra.yaml)
- RF=1
- SizeTieredCompactionStrategy  (write strategy)
- no row cache (because of the wide row size it is better to have no 
row cache)
- gc_grace_seconds = 1 day (unfortunately, I had no repair schedule 
at all)
results:
too many timeouts, losing data

2)
- added repair schedules
- RF=3 (in order to increase read speed)
results:
- too many timeouts, losing data
- high I/O consumption on each node (iostat shows 100% 
in %util on each node, dstat shows hundreds of MB read on each iteration)
- node2 frozen until I stopped data writes.
- node3 almost frozen
- many pending MutationStage events in TPSTATS on node2
- many full GC
- many HintsDispatchExecutor events in system.log

actual)
- added repair schedules
- RF=3
- set durable_writes = false in order to speed up writes
- increased young heap
- decreased SurvivorRatio in order to get more young-generation space 
available because of the wide row data
- increased MaxTenuringThreshold from 1 to 3 in order to decrease 
read latency
- increased 

RE: [EXTERNAL] Re: Good way of configuring Apache spark with Apache Cassandra

2019-01-10 Thread Durity, Sean R
RF in the Analytics DC can be 2 (or even 1) if storage cost is more important 
than availability. There is a storage (and CPU and network latency) cost for a 
separate Spark cluster. So, the variables of your specific use case may swing 
the decision in different directions.


Sean Durity
From: Dor Laor 
Sent: Wednesday, January 09, 2019 11:23 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Good way of configuring Apache spark with Apache 
Cassandra

On Wed, Jan 9, 2019 at 7:28 AM Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
I think you could consider option C: Create a (new) analytics DC in Cassandra 
and run your spark nodes there. Then you can address the scaling just on that 
DC. You can also use less vnodes, only replicate certain keyspaces, etc. in 
order to perform the analytics more efficiently.

But this way you duplicate the entire dataset RF times over. It's very 
expensive.
It is common practice to run Spark on a separate Cassandra (virtual) 
datacenter, but that is done to isolate the analytics workload from the 
realtime workload and to preserve low-latency guarantees.
We addressed this problem elsewhere; it is beyond this scope.



Sean Durity

From: Dor Laor mailto:d...@scylladb.com>>
Sent: Friday, January 04, 2019 4:21 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Good way of configuring Apache spark with Apache 
Cassandra

I strongly recommend option B, separate clusters. Reasons:
 - Networking of node-node is negligible compared to networking within the node
 - Different scaling considerations
   Your workload may require 10 Spark nodes and 20 database nodes, so why 
bundle them?
   This ratio may also change over time as your application evolves and amount 
of data changes.
 - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't want it 
to affect Cassandra and the opposite.
   If you isolate it with cgroups, you may have too much idle time when the 
above doesn't happen.


On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy 
mailto:goutham.chiru...@gmail.com>> wrote:
Hi,
We have requirement of heavy data lifting and analytics requirement and decided 
to go with Apache Spark. In the process we have come up with two patterns
a. Apache Spark and Apache Cassandra co-located and shared on same nodes.
b. Apache Spark on one independent cluster and Apache Cassandra as one 
independent cluster.

Need good pattern how to use the analytic engine for Cassandra. Thanks in 
advance.

Regards
Goutham.





RE: [EXTERNAL] Re: Good way of configuring Apache spark with Apache Cassandra

2019-01-10 Thread Durity, Sean R
At this point, I would be talking to DataStax. They already have Spark and 
SOLR/search fully embedded in their product. You can look at their docs for 
some idea of the RAM and CPU required for combined Search/Analytics use cases. 
I would expect this to be a much faster route to production.


Sean Durity
From: Goutham reddy 
Sent: Wednesday, January 09, 2019 11:29 AM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Good way of configuring Apache spark with Apache 
Cassandra

Thanks Sean. But what if I want to have both Spark and Elasticsearch with 
Cassandra as separate data centers? Does that cause any overhead?

On Wed, Jan 9, 2019 at 7:28 AM Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
I think you could consider option C: Create a (new) analytics DC in Cassandra 
and run your spark nodes there. Then you can address the scaling just on that 
DC. You can also use less vnodes, only replicate certain keyspaces, etc. in 
order to perform the analytics more efficiently.


Sean Durity

From: Dor Laor mailto:d...@scylladb.com>>
Sent: Friday, January 04, 2019 4:21 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: Good way of configuring Apache spark with Apache 
Cassandra

I strongly recommend option B, separate clusters. Reasons:
 - Networking of node-node is negligible compared to networking within the node
 - Different scaling considerations
   Your workload may require 10 Spark nodes and 20 database nodes, so why 
bundle them?
   This ratio may also change over time as your application evolves and amount 
of data changes.
 - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't want it 
to affect Cassandra and the opposite.
   If you isolate it with cgroups, you may have too much idle time when the 
above doesn't happen.


On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy 
mailto:goutham.chiru...@gmail.com>> wrote:
Hi,
We have requirement of heavy data lifting and analytics requirement and decided 
to go with Apache Spark. In the process we have come up with two patterns
a. Apache Spark and Apache Cassandra co-located and shared on same nodes.
b. Apache Spark on one independent cluster and Apache Cassandra as one 
independent cluster.

Need good pattern how to use the analytic engine for Cassandra. Thanks in 
advance.

Regards
Goutham.



--
Regards
Goutham Reddy





RE: [EXTERNAL] Re: Good way of configuring Apache spark with Apache Cassandra

2019-01-09 Thread Durity, Sean R
I think you could consider option C: Create a (new) analytics DC in Cassandra 
and run your spark nodes there. Then you can address the scaling just on that 
DC. You can also use less vnodes, only replicate certain keyspaces, etc. in 
order to perform the analytics more efficiently.
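
For instance, per-keyspace replication settings control what the analytics DC 
actually holds (DC and keyspace names below are hypothetical, just to show the shape):

    -- replicated to both DCs, so Spark can read it locally:
    ALTER KEYSPACE analytics_data
        WITH replication = {'class': 'NetworkTopologyStrategy', 'main': '3', 'analytics': '2'};
    -- kept out of the analytics DC entirely:
    ALTER KEYSPACE oltp_only
        WITH replication = {'class': 'NetworkTopologyStrategy', 'main': '3'};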


Sean Durity

From: Dor Laor 
Sent: Friday, January 04, 2019 4:21 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Good way of configuring Apache spark with Apache 
Cassandra

I strongly recommend option B, separate clusters. Reasons:
 - Networking of node-node is negligible compared to networking within the node
 - Different scaling considerations
   Your workload may require 10 Spark nodes and 20 database nodes, so why 
bundle them?
   This ratio may also change over time as your application evolves and amount 
of data changes.
 - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't want it 
to affect Cassandra and the opposite.
   If you isolate it with cgroups, you may have too much idle time when the 
above doesn't happen.


On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy 
mailto:goutham.chiru...@gmail.com>> wrote:
Hi,
We have requirement of heavy data lifting and analytics requirement and decided 
to go with Apache Spark. In the process we have come up with two patterns
a. Apache Spark and Apache Cassandra co-located and shared on same nodes.
b. Apache Spark on one independent cluster and Apache Cassandra as one 
independent cluster.

Need good pattern how to use the analytic engine for Cassandra. Thanks in 
advance.

Regards
Goutham.





RE: [EXTERNAL] Howto avoid tombstones when inserting NULL values

2018-12-27 Thread Durity, Sean R
You say the events are incremental updates. I am interpreting this to mean only 
some columns are updated. Others should keep their original values.

You are correct that inserting null creates a tombstone.

Can you only insert the columns that actually have new values? Just skip the 
columns with no information. (Make the insert generator a bit smarter.)

CREATE TABLE happening (id text PRIMARY KEY, event text, a text, b text, c text);
INSERT INTO happening (id, event, a, b, c) VALUES ('MainEvent', 'The most 
complete info we have right now', 'Priceless', '10 pm', 'Grand Ballroom');
-- b changes
INSERT INTO happening (id, b) VALUES ('MainEvent', '9:30 pm');


Sean Durity


-Original Message-
From: Tomas Bartalos 
Sent: Thursday, December 27, 2018 9:27 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Howto avoid tombstones when inserting NULL values

Hello,

I’d start with describing my use case and how I’d like to use Cassandra to 
solve my storage needs.
We're processing a stream of events for various happenings. Every event has a 
unique happening_id.
One happening may have many events, usually ~ 20-100 events. I’d like to store 
only the latest event for the same happening (an event is an incremental update 
and it contains all up-to-date data about the happening).
Technically the events are streamed from Kafka, processed with Spark and saved 
to Cassandra.
In Cassandra we use upserts (insert with same primary key).  So far so good, 
however there comes the tombstone...

When I’m inserting a field with a NULL value, Cassandra creates a tombstone for this 
field. As I understand it, this is done for space efficiency: Cassandra doesn’t have 
to remember there is a NULL value, it just deletes the respective column, and a 
delete creates a ... tombstone.
I was hoping there could be an option to tell Cassandra not to be so space 
efficient and store the “unset” info without generating tombstones.
Something similar to inserting empty strings instead of null values:

CREATE TABLE happening (id text PRIMARY KEY, event text);
insert into happening (id, event) values ('1', 'event1');
-- tombstone is generated:
insert into happening (id, event) values ('1', null);
-- tombstone is not generated:
insert into happening (id, event) values ('1', '');

Possible solutions:
1. Disable tombstones with gc_grace_seconds = 0 or set it to a reasonably low value 
(1 hour?). Not good, since phantom data may re-appear.
2. Ignore NULLs on the Spark side with “spark.cassandra.output.ignoreNulls=true”. 
Not good, since this will never overwrite a previously inserted event field with an 
“empty” one.
3. On inserts with Spark, find all NULL values and replace them with an “empty” 
equivalent (empty string for text, 0 for integer). Very inefficient and 
problematic to find an “empty” equivalent for some data types.

Until tombstones appeared, Cassandra was the right fit for our use case; however, 
now I’m not sure if we’re heading in the right direction.
Could you please give me some advice on how to solve this problem?

Thank you,
Tomas
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org





-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org


RE: [EXTERNAL] Writes and Reads with high latency

2018-12-27 Thread Durity, Sean R
Your RF is only 1, so the data only exists on one node. This is not typically 
how Cassandra is used. If you need the high availability and low latency, you 
typically set RF to 3 per DC.
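
For example, using the keyspace definition you posted below, moving to RF 3 in your 
DC would look like this (a sketch -- run it once, then repair so the new replicas 
get the existing data):

    ALTER KEYSPACE my_keyspace
        WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'};

followed by `nodetool repair -full my_keyspace` on each node.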

How many event_datetime records can you have per pkey? How many pkeys (roughly) 
do you have? In general, you only want to have at most 100 MB of data per 
partition (pkey). If it is larger than that, I would expect some timeouts. And 
because only one node has the data, a single timeout means you won’t get any 
data. Server timeouts default to just 10 seconds. The secret to Cassandra is to 
always select your data by at least the primary key (which you are doing). So, 
I suspect you either have very wide rows or lots of tombstones.

Since you mention lots of deletes, I am thinking it could be tombstones. Are 
you getting any tombstone warnings or errors in your system.log? When you 
delete, are you deleting a full partition? If you are deleting just part of a 
partition over and over, I think you will be creating too many tombstones. I 
try to design my data partitions so that deletes are for a full partition. Then 
I won’t be reading through 1000s (or more) tombstones trying to find the live 
data.
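
To make that concrete with the table from your schema below: a partition-level delete 
leaves one partition tombstone, instead of a row tombstone per deleted event that 
later reads have to skip over.

    -- one tombstone covering the whole partition:
    delete from my_keyspace.my_table where pkey = ?;
    -- versus repeated row-level deletes, each leaving its own tombstone:
    delete from my_keyspace.my_table where pkey = ? and event_datetime = ?;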


Sean Durity

From: Marco Gasparini 
Sent: Thursday, December 27, 2018 3:01 AM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Writes and Reads with high latency

Hello Sean,

here are my schema and RF:

-
CREATE KEYSPACE my_keyspace WITH replication = {'class': 
'NetworkTopologyStrategy', 'DC1': '1'}  AND durable_writes = true;

CREATE TABLE my_keyspace.my_table (
pkey text,
event_datetime timestamp,
agent text,
ft text,
ftt text,
some_id bigint,
PRIMARY KEY (pkey, event_datetime)
) WITH CLUSTERING ORDER BY (event_datetime DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 9
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

-

Queries I make are very simple:

select pkey, event_datetime, ft, some_id, ftt from my_keyspace.my_table where 
pkey = ? limit ?;
and
insert into my_keyspace.my_table (event_datetime, pkey, agent, some_id, ft, 
ftt) values (?,?,?,?,?,?);

About the Retry policy, the answer is yes: when a write fails I store it 
somewhere else and, after a period, I try to write it to Cassandra again. This 
way I can store almost all my data, but when the problem is the read I don't 
apply any Retry policy (and that is my problem).


Thanks
Marco


Il giorno ven 21 dic 2018 alle ore 17:18 Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> ha scritto:
Can you provide the schema and the queries? What is the RF of the keyspace for 
the data? Are you using any Retry policy on your Cluster object?


Sean Durity

From: Marco Gasparini 
mailto:marco.gaspar...@competitoor.com>>
Sent: Friday, December 21, 2018 10:45 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Writes and Reads with high latency

hello all,

I have 1 DC of 3 nodes running Cassandra 3.11.3 with consistency level ONE and 
Java 1.8.0_191.

Every day, there are many Node.js programs that send data to the Cassandra 
cluster via the Node.js cassandra-driver.
Every day I get around 600k requests. Each request causes the server to:
1_ READ some data in Cassandra (by an id, usually I get 3 records),
2_ DELETE one of those records
3_ WRITE the data into Cassandra.

So every day I make many deletes.

Every day I find errors like:
"All host(s) tried for query failed. First host tried, 
10.8.0.10:9042:
 Host considered as DOWN. See innerErrors"
"Server timeout during write query at consistency LOCAL_ONE (0 peer(s) 
acknowledged the write over 1 required)"
"Server timeout during write query at consistency SERIAL (0 peer(s) 
acknowledged the write over 1 required)"
"Server timeout during read query at consistency LOCAL_ONE (0 peer(s) 
acknowledged the read over 1 required)"

nodetool tablehistograms tells me this:

Percentile  SSTables Writ

RE: [EXTERNAL] Writes and Reads with high latency

2018-12-21 Thread Durity, Sean R
Can you provide the schema and the queries? What is the RF of the keyspace for 
the data? Are you using any Retry policy on your Cluster object?


Sean Durity

From: Marco Gasparini 
Sent: Friday, December 21, 2018 10:45 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Writes and Reads with high latency

hello all,

I have 1 DC of 3 nodes running Cassandra 3.11.3 with consistency level ONE and 
Java 1.8.0_191.

Every day, there are many Node.js programs that send data to the Cassandra 
cluster via the Node.js cassandra-driver.
Every day I get around 600k requests. Each request causes the server to:
1_ READ some data in Cassandra (by an id, usually I get 3 records),
2_ DELETE one of those records
3_ WRITE the data into Cassandra.

So every day I make many deletes.

Every day I find errors like:
"All host(s) tried for query failed. First host tried, 
10.8.0.10:9042:
 Host considered as DOWN. See innerErrors"
"Server timeout during write query at consistency LOCAL_ONE (0 peer(s) 
acknowledged the write over 1 required)"
"Server timeout during write query at consistency SERIAL (0 peer(s) 
acknowledged the write over 1 required)"
"Server timeout during read query at consistency LOCAL_ONE (0 peer(s) 
acknowledged the read over 1 required)"

nodetool tablehistograms tells me this:

Percentile  SSTables Write Latency  Read LatencyPartition Size  
  Cell Count
  (micros)  (micros)   (bytes)
50% 8.00379.02   1955.67379022  
   8
75%10.00785.94 155469.30654949  
  17
95%12.00  17436.92 268650.95   1629722  
  35
98%12.00  25109.16 322381.14   2346799  
  42
99%12.00  30130.99 386857.37   3379391  
  50
Min 0.00  6.87 88.15   104  
   0
Max12.00  43388.63 386857.37  20924300  
 179

At the 99th percentile I noted that write and read latency are pretty high, but I 
don't know how to improve them.
I can provide more statistics if needed.

Is there any improvement I can make to the Cassandra's configuration in order 
to not to lose any data?

Thanks

Regards
Marco





RE: [EXTERNAL] Re: upgrade Apache Cassandra 2.1.9 to 3.0.9

2018-12-05 Thread Durity, Sean R
In my understanding, there is a balance of getting upgradesstables done vs 
normal activity. I think the cluster can function fine with old and new 
sstables, but there can be a performance hit to reading the older version 
(perhaps). Personally, I don’t restart repairs until upgradesstables is 
completed. So, I push to get upgradesstables completed as soon as possible.


Sean Durity

From: Shravan R 
Sent: Tuesday, December 04, 2018 3:39 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: upgrade Apache Cassandra 2.1.9 to 3.0.9

Thanks Sean. I have automation that can put the new binary in place and restart 
the node on the newer version as quickly as possible. upgradesstables is I/O 
intensive and takes time proportional to the data on the node. Given 
these constraints, is there a risk from a prolonged upgradesstables?

On Tue, Dec 4, 2018 at 12:20 PM Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:
We have had great success with Cassandra upgrades with applications staying 
on-line. It is one of the strongest benefits of Cassandra. A couple things I 
incorporate into upgrades:

-  The main task is getting the new binaries loaded, then restarting 
the node – in a rolling fashion. Get this done as quickly as possible

-  Streaming between versions is usually problematic. So, I never do 
any node additions or decommissions during an upgrade

-  With applications running, there is not an acceptable back-out plan 
(either lose data or take a long outage or both), so we are always going 
forward. So, lower life cycle testing is important before hitting production

-  Upgrading is a more frequent activity, so get the process/automation 
in place. The upgrade process should not be a reason to delay, especially for 
minor version upgrades that might be quickly necessary (security issue or bug 
fix).


Sean Durity

From: Shravan R mailto:skr...@gmail.com>>
Sent: Tuesday, December 04, 2018 12:22 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] Re: upgrade Apache Cassandra 2.1.9 to 3.0.9

Thanks Jeff. I tried to bootstrap a 3.x node to a partially upgraded cluster 
(2.1.9 + 3.x) and I was not able to do so. The schema never settled.

How does the below approach sound like?

  1.  Update the software binary on all nodes to use cassandra-3.x upon a 
restart.
  2.  Restart all nodes in a rolling fashion
  3.  Run nodetool upgradesstables in a rolling fashion

Is there a risk on pending nodetool upgradesstables?

On Sun, Dec 2, 2018 at 2:12 AM Jeff Jirsa 
mailto:jji...@gmail.com>> wrote:


On Dec 2, 2018, at 12:40 PM, Shravan R 
mailto:skr...@gmail.com>> wrote:
Marc/Dimitry/Jon - greatly appreciate your feedback. I will look into the 
version part that you suggested. The reason to go direct to 3.x is to take a big 
leap and reduce overall effort to upgrade a large cluster (development 
included).

I have these questions from my original post. Appreciate if you could shed some 
light and point me in the right direction.

1) How do deal with decommissioning a 2.1.9 node in a partially upgraded 
cluster?

If any of the replicas have already upgraded, which is almost guaranteed if 
you’re using vnodes, It’s hard / you don’t. You’d basically upgrade everything 
else and then deal with it. If a host fails mid upgrade you’ll likely have some 
period of unavailables while you bounce the replicas to finish, then you can 
decom



2) How to bootstrap a 3.x node to a partially upgraded cluster?

This may work fine, but test it because I’m not certain. It should be able to 
read the 2.1 and 3.0 sstables that’ll stream so it’ll just work

3) Is there an alternative approach to the upgrade large clusters. i.e instead 
of going through nodetool upgradesstables on each node in rolling fashion

Bounce them all as quickly as is practical, do the upgradesstables after the 
bounces complete
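
A minimal per-node sketch of such a bounce (package install and systemd assumed -- 
adjust to your environment):

    nodetool drain                 # flush and stop accepting traffic
    systemctl stop cassandra
    # install the new Cassandra package/binaries
    systemctl start cassandra
    nodetool version               # confirm the node is on the new version before moving on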





On Sat, Dec 1, 2018 at 1:03 PM Jonathan Haddad 
mailto:j...@jonhaddad.com>> wrote:
Dmitry is right. Generally speaking always go with the latest bug fix release.

On Sat, Dec 1, 2018 at 10:14 AM Dmitry Saprykin 
mailto:saprykin.dmi...@gmail.com>> wrote:
See more here
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-13004

On Sat, Dec 1, 2018 at 1:02 PM Dmitry Saprykin 
mailto:saprykin.dmi...@gmail.com>> wrote:
Even more, 3.0.9 is a terrible target choice by itself. It has a nasty bug 
corrupting sstables on alter.

On Sat, Dec 1, 2018 at 11:55 AM Marc Selwan 
mailto:marc.sel...@datastax.com>> wrote:
Hi Shravan,

Did you upgrade Apache Cassandra 2.1.9 to the latest patch release bef

RE: [EXTERNAL] Cassandra Upgrade Plan 2.2.4 to 3.11.3

2018-12-04 Thread Durity, Sean R
See my recent post for some additional points. But I wanted to encourage you to 
look at the in-place upgrade on your existing hardware. No need to add a DC to 
try and upgrade. The cluster will handle reads and writes with nodes of 
different versions – no problems. I have done this many times on many clusters.

Also, I tell my teams there is no real back-out after we get the first node 
upgraded. This is because any new data is being written in the new sstable 
format (assuming the version has a new sstable format) – whether inserts or 
compaction. Any snapshot of the cluster pre-upgrade is now obsolete. Test 
thoroughly, then go forward as quickly as possible.


Sean Durity

From: Devaki, Srinivas 
Sent: Sunday, December 02, 2018 9:24 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra Upgrade Plan 2.2.4 to 3.11.3

Hi everyone,

I have planned out our org's Cassandra upgrade plan and want to make sure it 
seems fine.

Details Existing Cluster:
* Cassandra 2.2.4
* 8 nodes with 32G ram and 12G max heap allocated to cassandra
* 4 nodes in each rack

1. Ensured all clients to use LOCAL_* consistency levels and all traffic to 
"old" dc
2. Add new cluster as "new" dc with cassandra 2.2.4
  2.1 update conf on all nodes in "old" dc
  2.2 rolling restart the "old" dc
3. Alter tables with similar replication factor on the "new" dc
4. cassandra repair on all nodes in "new" dc
5. upgrade each node in "new" dc to cassandra 3.11.3 (and upgradesstables)
6. switch all clients to connect to new cluster
7. repair all new nodes once more
8. alter tables to replication only on new dc
9. remove "old" dc

and I have some doubts about the plan:
D1. Can I just join a 3.11.3 cluster as the "new" dc in the 2.2.4 cluster?
D2. How does a rolling upgrade work, as in, within the same cluster how can 2 
versions coexist?

Will be grateful if you could review this plan.

PS: following this plan to ensure that I can revert back to old behaviour at 
any step

Thanks
Srinivas Devaki
SRE/SDE at Zomato








RE: [EXTERNAL] Re: upgrade Apache Cassandra 2.1.9 to 3.0.9

2018-12-04 Thread Durity, Sean R
We have had great success with Cassandra upgrades with applications staying 
on-line. It is one of the strongest benefits of Cassandra. A couple things I 
incorporate into upgrades:

-  The main task is getting the new binaries loaded, then restarting 
the node – in a rolling fashion. Get this done as quickly as possible

-  Streaming between versions is usually problematic. So, I never do 
any node additions or decommissions during an upgrade

-  With applications running, there is not an acceptable back-out plan 
(either lose data or take a long outage or both), so we are always going 
forward. So, lower life cycle testing is important before hitting production

-  Upgrading is a more frequent activity, so get the process/automation 
in place. The upgrade process should not be a reason to delay, especially for 
minor version upgrades that might be quickly necessary (security issue or bug 
fix).


Sean Durity

From: Shravan R 
Sent: Tuesday, December 04, 2018 12:22 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: upgrade Apache Cassandra 2.1.9 to 3.0.9

Thanks Jeff. I tried to bootstrap a 3.x node to a partially upgraded cluster 
(2.1.9 + 3.x) and I was not able to do so. The schema never settled.

How does the below approach sound like?

  1.  Update the software binary on all nodes to use cassandra-3.x upon a 
restart.
  2.  Restart all nodes in a rolling fashion
  3.  Run nodetool upgradesstables in a rolling fashion

Is there a risk on pending nodetool upgradesstables?

On Sun, Dec 2, 2018 at 2:12 AM Jeff Jirsa 
mailto:jji...@gmail.com>> wrote:


On Dec 2, 2018, at 12:40 PM, Shravan R 
mailto:skr...@gmail.com>> wrote:
Marc/Dimitry/Jon - greatly appreciate your feedback. I will look into the 
version part that you suggested. The reason to go direct to 3.x is to take a big 
leap and reduce overall effort to upgrade a large cluster (development 
included).

I have these questions from my original post. Appreciate if you could shed some 
light and point me in the right direction.

1) How do deal with decommissioning a 2.1.9 node in a partially upgraded 
cluster?

If any of the replicas have already upgraded, which is almost guaranteed if 
you’re using vnodes, It’s hard / you don’t. You’d basically upgrade everything 
else and then deal with it. If a host fails mid upgrade you’ll likely have some 
period of unavailables while you bounce the replicas to finish, then you can 
decom




2) How to bootstrap a 3.x node to a partially upgraded cluster?

This may work fine, but test it because I’m not certain. It should be able to 
read the 2.1 and 3.0 sstables that’ll stream so it’ll just work


3) Is there an alternative approach to the upgrade large clusters. i.e instead 
of going through nodetool upgradesstables on each node in rolling fashion

Bounce them all as quickly as is practical, do the upgradesstables after the 
bounces complete






On Sat, Dec 1, 2018 at 1:03 PM Jonathan Haddad 
mailto:j...@jonhaddad.com>> wrote:
Dmitry is right. Generally speaking always go with the latest bug fix release.

On Sat, Dec 1, 2018 at 10:14 AM Dmitry Saprykin 
mailto:saprykin.dmi...@gmail.com>> wrote:
See more here
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-13004

On Sat, Dec 1, 2018 at 1:02 PM Dmitry Saprykin 
mailto:saprykin.dmi...@gmail.com>> wrote:
Even more, 3.0.9 is a terrible target choice by itself. It has a nasty bug 
corrupting sstables on alter.

On Sat, Dec 1, 2018 at 11:55 AM Marc Selwan 
mailto:marc.sel...@datastax.com>> wrote:
Hi Shravan,

Did you upgrade Apache Cassandra 2.1.9 to the latest patch release before doing 
the major upgrade? It's generally favorable to go to the latest patch release 
as often times they include fixes that smooth over the upgrade process. There 
are hundreds of bug fixes between 2.1.9 and 2.1.20 (current version)

Best,
Marc

On Fri, Nov 30, 2018 at 3:13 PM Shravan R 
mailto:skr...@gmail.com>> wrote:
Hello,

I am planning to upgrade Apache Cassandra 2.1.9 to Apache Cassandra-3.0.9. I 
came up with the version based on [1]. I followed upgrade steps as in [2]. I 
was testing the same in the lab and encountered issues (streaming just fails 
and hangs for ever) with bootstrapping a 3.0.9 node on a partially upgraded 
cluster. [50% of nodes on 2.1.9 and 50% on 3.0.9]. The production cluster that 
I am supporting is pretty large and I am anticipating to end up in a situation 
like this (Hope not) and would like to be prepared.

1) How do deal with decommissioning a 2.1.9 node in a partially upgraded 
cluster?
2) How to bootstrap a 3.x node to a partially upgraded cluster?
3) Is there an alternative approach to the upgrade large 

RE: [EXTERNAL] Is Apache Cassandra supports Data at rest

2018-11-14 Thread Durity, Sean R
I think you are asking about *encryption* at rest. To my knowledge, open source 
Cassandra does not support this natively. There are options, like encrypting 
the data in the application before it gets to Cassandra. Some companies offer 
other solutions. IMO, if you need the increased security, it is worth using 
something like DataStax Enterprise.


Sean Durity
From: Goutham reddy 
Sent: Tuesday, November 13, 2018 1:22 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Is Apache Cassandra supports Data at rest

Hi,
Does Apache Cassandra support data at rest? DataStax Cassandra supports it. 
Can anybody help me?

Thanks and Regards,
Goutham.
--
Regards
Goutham Reddy





RE: [EXTERNAL] Re: Multiple cluster for a single application

2018-11-08 Thread Durity, Sean R
We have a cluster over 100 nodes that performs just fine for its use case. In 
our case, we needed the disk space and did not want the admin headache of very 
dense nodes. It does take more automation and process to handle a larger 
cluster, but those are all good things to solve anyway.

But count me in on being interested in what DataStax is calling “Big Node.” 
Would love to be able to use denser nodes, if the headaches are reduced.


Sean Durity

From: Ben Slater 
Sent: Wednesday, November 07, 2018 6:08 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Multiple cluster for a single application

I tend to recommend an approach similar to Eric’s functional sharding, although 
I describe it as quality-of-service sharding - group your small, hot data into 
one cluster and your large, cooler data into another so you can provision 
infrastructure and tune accordingly. I guess it depends on your management 
environment, but if your app functionality allows you to split into multiple 
clusters (i.e. all your data is not in one giant table) then I would 
generally look to split. Splitting also gives you the advantage of making it 
harder to have an outage that brings everything down.

Cheers
Ben

On Thu, 8 Nov 2018 at 08:44 Jonathan Haddad <j...@jonhaddad.com> wrote:
Interesting approach Eric, thanks for sharing that.

Regarding this:

> I've read documents that recommend using clusters with less than 50 or 100 
> nodes (Netflix has hundreds of clusters with less than 100 nodes each).

Not sure where you read that, but it's nonsense.  We work with quite a few 
clusters that are several hundred nodes each.  Your problems can get a bit 
amplified, for instance dynamic snitch can make a cluster perform significantly 
worse than if you just flat out disable it, which is what I usually recommend.

I'm curious how you arrived at the estimate of needing > 100 nodes.  Is that 
due to space constraints or performance ones?



On Wed, Nov 7, 2018 at 12:52 PM Eric Stevens <migh...@gmail.com> wrote:
We are engaging in both strategies at the same time:

1) We call it functional sharding - we write to clusters targeted according to 
the type of data being written.  Because different data types often have 
different workloads this has the nice side effect of being able to tune each 
cluster according to its workload.  Your ability to grow in this dimension is 
limited by the number of business object types you're recording.

2) We write to clusters sharded by time.  Our objects are network security 
events, so there's always an element of time.  We encode that time into 
deterministic object IDs so that we are able to identify in the read path which 
shard to direct the request to by extracting the time component.  This basic 
idea should be able to work any time you're able to use surrogate keys instead 
of natural keys.  If you are using natural keys, you may be facing an 
unpleasant migration should you need to increase the number of shards in this 
dimension.
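
As a toy illustration of that read-path routing: a minimal sketch assuming a
surrogate key of the form "<epoch-millis>-<uuid>" and made-up cluster names
(GNU date assumed).

shard_for_id() {
  local id="$1"
  local epoch_ms="${id%%-*}"                        # leading time component of the surrogate key
  local year
  year=$(date -u -d "@$(( epoch_ms / 1000 ))" +%Y)  # derive the time bucket
  echo "events-${year}"                             # e.g. route 2018 events to the 2018 cluster
}
shard_for_id "1541030400000-9b2f0c1a"               # prints: events-2018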

Our reason for engaging in the second strategy was not purely Cassandra's 
fault, rather we were using DSE with a search workload, and the cost of 
rebuilding Solr indexes on streaming operations (such as adding nodes to an 
existing cluster) required enough resources that we found it prohibitive.  
That's because the bootstrapping node was also taking a production write 
workload, and we didn't want to run our cluster with enough overhead that a 
node could bootstrap and take production workload at the same time.

For vanilla Cassandra workloads we have run clusters with quite a bit more 
nodes than 100 without any appreciable trouble.  Curious if you can share 
documents about clusters over 100 nodes causing troubles for users.  I'm 
wondering if it's related to node failure rate combined with vnodes meaning 
that several concurrent node failures cause a part of the ring to go offline 
too reliably.

On Mon, Nov 5, 2018 at 7:38 AM onmstester onmstester wrote:
Hi,

One of my applications requires creating a cluster with more than 100 nodes. 
I've read documents that recommend using clusters with less than 50 or 100 nodes 
(Netflix has hundreds of clusters with less than 100 nodes each).
Is it a good idea to use multiple clusters for a single application, just to 
decrease maintenance problems and system complexity and improve performance?
If so, which one of the policies below is more suitable for distributing data among 
clusters, and why?
1. Each cluster would be responsible for a specific subset of tables only 
(table sizes are almost equal, so the calculations are easy here); for example, 
inserts to table X would go to cluster Y.
2. Shard data at the loader level by some business-logic grouping of the data; 
for example, all rows with some column starting with X would go to cluster Y.

I would appreciate you sharing your experiences working with big clusters, 
problems encountered, and solutions.

Thanks in Advance


Sent using Zoho 

RE: Cassandra 2.1 bootstrap - No streaming progress from one node

2018-11-07 Thread Durity, Sean R
I would wipe the new node and bootstrap again. I do not know of any way to 
resume the streaming that was previously in progress.
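
For reference, a sketch of that wipe-and-rebootstrap on the new node (paths
assume the default package layout; adjust to your data_file_directories):

sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data/* \
            /var/lib/cassandra/commitlog/* \
            /var/lib/cassandra/saved_caches/*
sudo service cassandra start      # the node re-joins and bootstraps from scratch
nodetool netstats                 # watch the new streaming sessions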


Sean Durity
From: Steinmaurer, Thomas 
Sent: Wednesday, November 07, 2018 5:13 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra 2.1 bootstrap - No streaming progress from one 
node

Hello,

while bootstrapping a new node into an existing cluster, a node which is acting 
as source for streaming got restarted unfortunately. Since then, from nodetool 
netstats I don't see any progress for this particular node anymore.

E.g.:

/X.X.X.X
Receiving 94 files, 260.09 GB total. Already received 26 files, 69.33 
GB total

Basically, it is stuck at 69.33 GB for hours. Does Cassandra (2.1 in our case) not 
resume here, in case there have been e.g. connectivity troubles or, in 
our case, Cassandra on the node acting as the stream source got restarted?

Can I force the joining node to recover the connection to X.X.X.X, or do I need to 
restart the bootstrap from scratch via a restart of the new node?

Thanks,
Thomas

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313





RE: [EXTERNAL] Re: rolling version upgrade, upgradesstables, and vulnerability window

2018-10-30 Thread Durity, Sean R
Just to pile on:

I agree. On our upgrades, I always aim to get the binary part done on all nodes 
before worrying about upgradesstables. Upgrade is one node at a time 
(precautionary). Upgradesstables depends on cluster size, data size, 
compactionthroughput, etc. I usually start with running upgradesstables on 2 
nodes per DC and watch how the application performs. On larger clusters (over 
30 nodes), I usually work up to 4-5 nodes per DC running upgradesstables with 
staggered start times.

NOTE: I am rarely doing streaming operations outside of repairs. But I want to 
be able to handle a down node, etc., so I do not run in mixed version mode very 
long.
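
A rough sketch of that staggered upgradesstables pass, assuming hostnames that
encode the DC (all names and paths here are placeholders):

BATCH="dc1-cass01 dc1-cass02 dc2-cass01 dc2-cass02"   # start with ~2 nodes per DC
for host in $BATCH; do
  ssh "$host" 'nohup nodetool upgradesstables > /tmp/upgradesstables.log 2>&1 &'
done
# watch application latencies and compaction load before widening the batch
for host in $BATCH; do
  echo "== $host"; ssh "$host" 'nodetool compactionstats'
done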


Sean Durity

From: Carl Mueller 
Sent: Tuesday, October 30, 2018 11:51 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: rolling version upgrade, upgradesstables, and 
vulnerability window

Thank you very much. I couldn't find any definitive answer on that on the list 
or stackoverflow.

It's clear that the safest for a prod cluster is rolling version upgrade of the 
binary, then the upgradesstables.

I will strongly consider cstar for the upgradesstables


On Tue, Oct 30, 2018 at 10:39 AM Alexander Dejanovski <a...@thelastpickle.com> wrote:
Yes, as the new version can read both the old and the new sstables format.

Restrictions only apply when the cluster is in mixed versions.

On Tue, Oct 30, 2018 at 4:37 PM Carl Mueller <carl.muel...@smartthings.com.invalid> wrote:
But the topology change restrictions are only in place while there are 
heterogeneous versions in the cluster? Having all the nodes on the upgraded version 
with "degraded" sstables does NOT preclude topology changes or node 
replacement/addition?


On Tue, Oct 30, 2018 at 10:33 AM Jeff Jirsa <jji...@gmail.com> wrote:
Wait for 3.11.4 to be cut

I also vote for doing all the binary bounces and upgradesstables after the 
fact, largely because normal writes/compactions are going to naturally start 
upgrading sstables anyway, and there are some hard restrictions on mixed mode 
(e.g. schema changes won’t cross version) that can be far more impactful.



--
Jeff Jirsa


> On Oct 30, 2018, at 8:21 AM, Carl Mueller <carl.muel...@smartthings.com.INVALID> wrote:
>
> We are about to finally embark on some version upgrades for lots of clusters, 
> 2.1.x and 2.2.x, eventually targeting 3.11.x.
>
> I have seen recipes that do the full binary upgrade + upgrade sstables for 1 
> node before moving forward, while I've seen a 2016 vote by Jon Haddad (a TLP 
> guy) that backs doing the binary version upgrades through the cluster on a 
> rolling basis, then doing the upgradesstables on a rolling basis.
>
> Under what cluster conditions are streaming/node replacement precluded, that 
> is, where we are vulnerable to a cloud provider dumping one of our nodes under 
> us or to hardware failure? We ain't Apple, but we do have 30+ node datacenters 
> and 80-100 node clusters.
>
> Is node replacement and streaming only disabled while there are 
> heterogeneous Cassandra versions, or until all the sstables have been upgraded 
> in the cluster?
>
> My instincts tell me the best thing to do is to get all the cassandra nodes 
> to the same version without the upgradesstables step through the cluster, and 
> then roll through the upgradesstables as needed, and that upgradesstables is 
> a node-local concern that doesn't impact streaming or node replacement or 
> other situations since cassandra can read old version sstables and new 
> sstables would simply be the new format.

--
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com




RE: [EXTERNAL] Re: [E] Re: nodetool status and node maintenance

2018-10-29 Thread Durity, Sean R
I have wrapped nodetool info into my own script that strips out and interprets 
the information I care about. That script also sets a return code based on the 
health of that node (which protocols are up, etc.). Then I can monitor the 
individual health of the node – as that node sees itself. I have found these 
much more actionable than up/down alerts from a single node’s view of the whole 
cluster (like nodetool status)
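
A minimal sketch of that kind of wrapper, assuming the usual 'nodetool info'
labels; adjust the fields and thresholds to whatever you actually care about.

#!/bin/bash
INFO=$(nodetool info 2>/dev/null) || exit 2      # nodetool itself failed: node likely down
gossip=$(echo "$INFO" | awk -F: '/Gossip active/ {gsub(/ /,"",$2); print $2}')
native=$(echo "$INFO" | awk -F: '/Native Transport active/ {gsub(/ /,"",$2); print $2}')
echo "gossip=$gossip native=$native"
# a non-zero exit code lets the monitoring system decide to alert on this node
if [ "$gossip" != "true" ] || [ "$native" != "true" ]; then exit 1; fi
exit 0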


Sean Durity

From: Saha, Sushanta K 
Sent: Monday, October 29, 2018 7:52 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: [E] Re: nodetool status and node maintenance

Thanks!

On Fri, Oct 26, 2018 at 2:39 PM Alain RODRIGUEZ <arodr...@gmail.com> wrote:
Hello

Any way to temporarily make the node under maintenance invisible  from 
"nodetool status" output?

I don't think so.
I would use a different approach like for example only warn/email when the node 
is down for 30 seconds or a minute depending on how long it takes for your 
nodes to restart. This way the failure is not invisible, but ignored when only 
bouncing the nodes.
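
A sketch of that grace-period idea, here using a simple port check rather than
parsing "nodetool status"; the host list file, port, and mail command are
assumptions.

GRACE=60
for host in $(cat /etc/cassandra/monitored_hosts); do
  if ! nc -z -w5 "$host" 9042; then
    sleep "$GRACE"
    nc -z -w5 "$host" 9042 || \
      echo "$host still unreachable after ${GRACE}s" | mail -s "Cassandra node down: $host" ops@example.com
  fi
done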

As a side note, be aware that the 'nodetool status' only give a view of the 
cluster from a specific node, that can be completely wrong as well :).

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, Oct 26, 2018 at 15:16, Saha, Sushanta K <sushanta.s...@verizonwireless.com> wrote:
I have a script that parses "nodetool status" output and emails alerts if any 
node is down. So, when I stop Cassandra on a node for maintenance, all nodes 
start emailing alarms.

Any way to temporarily make the node under maintenance invisible  from 
"nodetool status" output?

Thanks



--

Sushanta Saha|MTS IV-Cslt-Sys Engrg|WebIaaS_DB Group|HQ - VerizonWireless
O 770.797.1260  C 770.714.6555 Iaas Support Line 949-286-8810





RE: [EXTERNAL] Re: Installing a Cassandra cluster with multiple Linux OSs (Ubuntu+CentOS)

2018-10-23 Thread Durity, Sean R
Agreed. I have run clusters with both RHEL5 and RHEL6 nodes.


Sean Durity
From: Jeff Jirsa 
Sent: Sunday, October 14, 2018 12:40 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Installing a Cassandra cluster with multiple Linux OSs 
(Ubuntu+CentOS)

Should be fine, just get the java and kernel versions and kernel tuning params 
as close as possible



--
Jeff Jirsa


On Oct 14, 2018, at 5:09 PM, Eyal Bar <eyal@kenshoo.com> wrote:
Hi all,

Did anyone install a Cassandra cluster with mixed Linux OSs, where some of the 
nodes were Ubuntu 12\14\16 and some of the nodes were CentOS 7?

Will it work without issues?

Rationale: We have a 40-server cluster which was originally installed only with 
Ubuntu servers. Now we want to move to CentOS 7, but the effort of reinstalling 
the entire cluster + migrating to CentOS 7 is not simple. So we thought about 
adding new CentOS 7 nodes to the existing cluster and gradually removing the 
Ubuntu ones.

Would love to read your thoughts.

Best,

--
Eyal Bar
Big Data Ops Team Lead | Data Platform and Monitoring  | Kenshoo
Office +972 (3) 746-6500 *473
Mobile +972 (52) 458-6100
www.Kenshoo.com
[Transform your marketing. Grow your 
business.]

This e-mail, as well as any attached document, may contain material which is 
confidential and privileged and may include trademark, copyright and other 
intellectual property rights that are proprietary to Kenshoo Ltd,  its 
subsidiaries or affiliates ("Kenshoo"). This e-mail and its attachments may be 
read, copied and used only by the addressee for the purpose(s) for which it was 
disclosed herein. If you have received it in error, please destroy the message 
and any attachment, and contact us immediately. If you are not the intended 
recipient, be aware that any review, reliance, disclosure, copying, 
distribution or use of the contents of this message without Kenshoo's express 
permission is strictly prohibited.





RE: [EXTERNAL] Upcoming Cassandra-related Conferences

2018-10-08 Thread Durity, Sean R
Thank you. I do want to hear about future conferences. I would also love to 
hear reports/summaries/highlights from folks who went to Distributed Data 
Summit (or other conferences). I think user conferences are great!


Sean Durity

From: Max C. 
Sent: Friday, October 05, 2018 8:33 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Upcoming Cassandra-related Conferences

Some upcoming Cassandra-related conferences, if anyone is interested:

Scylla Summit
November 5-7, 2018
Pullman San Francisco Bay Hotel, Redwood City CA
https://www.scylladb.com/scylla-summit-2018/

(This one seems to be almost entirely Scylla focussed, maybe not terribly 
useful for non-Scylla users)

DataStax Accelerate
May 21-23, 2019
National Harbor, Maryland
https://www.datastax.com/accelerate

(No talks list or sponsors have been posted yet)

DISCLAIMER:
I’m not in the middle of the politics, nor do I have any affiliation with 
either of these companies.  I just thought lowly users like myself might 
appreciate a mention of these on the -users list.

I wish we had had a post or two about the Distributed Data Summit; I 
think we probably would have had an even better conference!  :-)

- Max





RE: [EXTERNAL] Re: Rolling back Cassandra upgrades (tarball)

2018-10-01 Thread Durity, Sean R
Version choices aside, I am an advocate for forward-only (in most cases). Here 
is my reasoning, so that you can evaluate for your situation:
- upgrades are done while the application is up and live and writing data (no 
app downtime)
- the upgrade usually includes a change to the sstable version (which is 
unreadable in the older version)
- any data written to upgraded nodes will be written in the new sstable format
+ this includes any compaction that takes place on upgraded nodes, so even an 
app outage doesn't protect you
- so, there is no going back, unless you are willing to lose new (or compacted) 
data written to any upgraded nodes

As you can tell, if the assumptions don't hold true, a roll back may be 
possible. For example, if the sstable version is the same (e.g., for a minor 
upgrade), then the risk of lost data is gone. Or, if you are able to stop your 
application during the upgrade process and stop compaction. Etc.

You could upgrade a single node to see how it behaves. If there is some 
problem, you could wipe out the data, go back to the old version, and bootstrap 
it again. Once I get to the 2nd node, though, I am only going forward.

Sean Durity


-Original Message-
From: Jeff Jirsa 
Sent: Sunday, September 30, 2018 8:38 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Rolling back Cassandra upgrades (tarball)

Definitely don’t go to 3.10, go to 3.11.3 or newest 3.0 instead


--
Jeff Jirsa


On Sep 30, 2018, at 5:29 PM, Nate McCall  wrote:

>> I have a cluster on v3.0.11. I am planning to upgrade this to 3.10.
>> Is rolling back the binaries a viable solution?
>
> What's the goal with moving form 3.0 to 3.x?
>
> Also, our latest release in 3.x is 3.11.3 and has a couple of
> important bug fixes over 3.10 (which is a bit dated at this point).
>







RE: [EXTERNAL] Re: Adding datacenter and data verification

2018-09-18 Thread Durity, Sean R
You are correct that altering the keyspace replication settings does not 
actually move any data. It only affects new writes or reads. System_auth is one 
that needs to be repaired quickly OR, if your number of users/permissions is 
relatively small, you can just reinsert them after the alter. The data will 
then get written to all the proper, new nodes.
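
A sketch of the two options, assuming DC names DC1/DC2 and cqlsh access on one
of the nodes:

cqlsh -e "ALTER KEYSPACE system_auth WITH replication = \
  {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"
# Option A: repair so the existing roles/permissions reach the new replicas
nodetool repair -full system_auth
# Option B (few users): re-issue the CREATE ROLE / GRANT statements so the rows
# are rewritten to all of the new replicas
cqlsh -e "LIST ROLES;"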


Sean Durity

From: Pradeep Chhetri 
Sent: Tuesday, September 18, 2018 1:55 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Adding datacenter and data verification

Hi Eunsu,

Going through the documentation, I think you are right: you shouldn't use 
withUsedHostsPerRemoteDc because it will contact nodes in other datacenters. 
No, I don't use withUsedHostsPerRemoteDc; instead I use the withLocalDc option.

On Tue, Sep 18, 2018 at 11:02 AM, Eunsu Kim <eunsu.bil...@gmail.com> wrote:
Yes, I altered the system_auth keyspace before adding the data center.

However, I suspect that the new data center did not get the system_auth data 
and therefore could not authenticate the client, because the new data center 
did not get replicas just by altering the keyspace.

Do your clients have the 'withUsedHostsPerRemoteDc' option?



On 18 Sep 2018, at 1:17 PM, Pradeep Chhetri <prad...@stashaway.com> wrote:

Hello Eunsu,

I am also using PasswordAuthenticator in my cassandra cluster. I didn't come 
across this issue while doing the exercise on preprod.

Are you sure that you changed the configuration of system_auth keyspace before 
adding the new datacenter using this:

ALTER KEYSPACE system_auth WITH REPLICATION = {'class': 
'NetworkTopologyStrategy', 'datacenter1': '3'};

Regards,
Pradeep



On Tue, Sep 18, 2018 at 7:23 AM, Eunsu Kim <eunsu.bil...@gmail.com> wrote:

In my case, there were authentication issues when adding data centers.

I was using a PasswordAuthenticator.

As soon as the datacenter was added, the following authentication error log was 
recorded on the client log file.

com.datastax.driver.core.exceptions.AuthenticationException: Authentication 
error on host /xxx.xxx.xxx.xx:9042: Provided username apm and/or password are 
incorrect

I was using DCAwareRoundRobinPolicy, but I guess it's probably because of the 
withUsedHostsPerRemoteDc option.

I took several steps and the error log disappeared. It was probably the 'nodetool 
rebuild' after altering the system_auth keyspace that fixed it.

However, the procedure was not clearly defined.



On 18 Sep 2018, at 2:40 AM, Pradeep Chhetri <prad...@stashaway.com> wrote:

Hello Alain,

Thank you very much for reviewing it. Your answer on seed nodes cleared my 
doubts. I will update it as per your suggestion.

I have a few follow-up questions on decommissioning a datacenter:

- Do I need to run nodetool repair -full on each of the nodes (old + new DC 
nodes) before starting the decommissioning process of the old DC?
- We have around 15 apps using the Cassandra cluster. I want to make sure that, 
before starting the new datacenter, all queries are going with the right 
consistency level, i.e. LOCAL_QUORUM instead of QUORUM. Is there a way I can 
log the consistency level of each query somehow in some log file?

Regards,
Pradeep

On Mon, Sep 17, 2018 at 9:26 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
Hello Pradeep,

It looks good to me and it's a cool runbook for you to follow and for others to 
reuse.

To make sure that cassandra nodes in one datacenter can see the nodes of the 
other datacenter, add the seed node of the new datacenter in any of the old 
datacenter’s nodes and restart that node.

Nodes seeing each other from a distinct rack is not related to seeds. It's 
indeed recommended to use seeds from all the datacenters (a couple or 3). I 
guess it's to increase availability of seed nodes and/or maybe to make sure 
local seeds are available.

You can perfectly (and even have to) add your second datacenter nodes using 
seeds from the first data center. A bootstrapping node should never be in the 
list of seeds unless it's the first node of the cluster. Add nodes, then make 
them seeds.


On Mon, Sep 17, 2018 at 11:25, Pradeep Chhetri <prad...@stashaway.com> wrote:
Hello everyone,

Can someone please help me in validating the steps i am following to migrate 
cassandra snitch.

Regards,
Pradeep

On Wed, Sep 12, 2018 at 1:38 PM, Pradeep Chhetri <prad...@stashaway.com> wrote:
Hello

I am running a Cassandra 3.11.3 5-node cluster on AWS with SimpleSnitch. I was 
testing the process of migrating to GPFS, using the AWS region as the datacenter 
name and the AWS zone as the rack name, in my preprod environment and was able 
to achieve it.

But before decommissioning the older datacenter, I want to verify that the data 
in the newer DC is consistent with the one in the older DC. Is there an easy way 
to do that?

Do you suggest running a full repair before decommissioning the nodes of the 
older datacenter?

I am using the steps documented here: 

RE: [EXTERNAL] Re: cold vs hot data

2018-09-18 Thread Durity, Sean R
Wouldn’t you have the same problem with two similar tables with different 
primary keys (e.g., UserByID and UserByName)? This is a very common pattern in 
Cassandra – inserting into multiple tables… That’s what batches are for – 
atomicity.
I don’t understand the additional concern here.
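
A minimal sketch of that dual write as a logged batch; the keyspaces, tables, 
and the 7-day TTL are hypothetical.

cqlsh -e "
BEGIN BATCH
  INSERT INTO hot.events (id, ts, payload) VALUES (123, toTimestamp(now()), 'x') USING TTL 604800;
  INSERT INTO archive.events (id, ts, payload) VALUES (123, toTimestamp(now()), 'x');
APPLY BATCH;"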



Sean Durity

From: DuyHai Doan 
Sent: Monday, September 17, 2018 4:23 PM
To: user 
Subject: Re: [EXTERNAL] Re: cold vs hot data

Sean

Without transactions à la SQL, how can you guarantee atomicity between both 
tables for upserts? I mean, one write could succeed for the hot table and fail 
for the cold table.

The only solution I see is using a logged batch, with a huge overhead and perf 
hit on the writes.

On Mon, Sep 17, 2018 at 8:28 PM, Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
An idea:

On initial insert, insert into 2 tables:
Hot with short TTL
Cold/archive with a longer (or no) TTL
Then your hot data is always in the same table, but being expired. And you can 
access the archive table only for the more rare circumstances. Then you could 
have the HOT table on a different volume of faster storage. If the hot/cold 
tables are in different keyspaces, then you could also have different 
replication (a HOT DC and an archive DC, for example)


Sean Durity


-Original Message-
From: Mateusz <mateusz-li...@ant.gliwice.pl>
Sent: Friday, September 14, 2018 2:40 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: cold vs hot data

On Friday, September 14, 2018, 02:46:43 CEST, Alaa Zubaidi (PDF) wrote:
> The data can grow to +100TB however the hot data will be in most cases
> less than 10TB but we still need to keep the rest of data accessible.
> Anyone has this problem?
> What is the best way to make the cluster more efficient?
> Is there a way to somehow automatically move the old data to different
> storage (rack, dc, etc)?
> Any ideas?

We solved it using lvmcache.

--
Mateusz
"(...) I have a brother - serious, a homebody, a penny-pincher, a hypocrite, a pious type;
in short - a pillar of society."
Nikos Kazantzakis - "Zorba the Greek"














RE: [EXTERNAL] Re: cold vs hot data

2018-09-17 Thread Durity, Sean R
An idea:

On initial insert, insert into 2 tables:
Hot with short TTL
Cold/archive with a longer (or no) TTL
Then your hot data is always in the same table, but being expired. And you can 
access the archive table only for the more rare circumstances. Then you could 
have the HOT table on a different volume of faster storage. If the hot/cold 
tables are in different keyspaces, then you could also have different 
replication (a HOT DC and an archive DC, for example)


Sean Durity


-Original Message-
From: Mateusz 
Sent: Friday, September 14, 2018 2:40 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: cold vs hot data

On Friday, September 14, 2018, 02:46:43 CEST, Alaa Zubaidi (PDF) wrote:
> The data can grow to +100TB however the hot data will be in most cases
> less than 10TB but we still need to keep the rest of data accessible.
> Anyone has this problem?
> What is the best way to make the cluster more efficient?
> Is there a way to somehow automatically move the old data to different
> storage (rack, dc, etc)?
> Any ideas?

We solved it using lvmcache.

--
Mateusz
"(...) I have a brother - serious, a homebody, a penny-pincher, a hypocrite, a pious type;
in short - a pillar of society."
Nikos Kazantzakis - "Zorba the Greek"











RE: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread Durity, Sean R
3 starting points:

-  DO NOT migrate your tables as they are in Oracle to Cassandra. In 
most cases, you need a different model for Cassandra

-  DO take the (free) DataStax Academy courses to learn much more about 
Cassandra as you dive in. It is a systematic and bite-size approach to learning 
all things Cassandra (and eventually, DataStax Enterprise, should you go that 
way). However, open source Cassandra is fine as a data platform. DSE gives you 
more options for data models, better administration and monitoring tools, 
support, etc. It all depends on what you need/want to build/can afford

-  Cluster sizing depends on your goals for the data platform. Do you 
need lots of storage, lots of throughput, high availability, low latency, 
workload separation, etc.? A couple guidelines – use at least 3 nodes per data 
center (DC) and at least 2 DCs for availability. Use SSDs for storage and keep 
node size 3 TB or less for reasonable administration.  If six nodes are too 
many – you probably don’t need Cassandra. If you can define what you need your 
data platform to deliver, then you can start a sizing discussion. The good 
thing is, you can always scale (as long as the data model is good).


Sean Durity

From: sha p 
Sent: Wednesday, September 05, 2018 9:21 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate 
data from Oracle to Cassandra.


Hi all,
I am new to Cassandra; I was asked to migrate data from Oracle to Cassandra.
Please help me with your valuable guidance.
1) Can it be done using open source Cassandra?
2) Where should I start the data model from?
3) I should use Java; what kind of jars/libs/tools do I need to use?
4) How do I decide the size of the cluster? Please provide some sample guidelines.
5) This should be in production, so what kinds of things should I take care of 
for better support or debugging tomorrow?
6) Please provide some good books/links which can help me with this task.


Thanks in advance.
Your help is highly appreciated.

Regards,
Shyam





RE: [EXTERNAL] Re: adding multiple node to a cluster, cleanup and num_tokens

2018-09-04 Thread Durity, Sean R
I would only run the clean-up (on all nodes) after all new nodes are added. I 
would also look at increasing RF to 3 (and running repair) once there are 
plenty of nodes. (This is assuming that availability matters and that your 
queries use QUORUM or LOCAL_QUORUM for the consistency level.)
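
A sketch of that order of operations (keyspace and DC names are placeholders):

# 1) after ALL new nodes have joined, drop the ranges each old node no longer owns
for host in $(nodetool status | awk '/^UN/ {print $2}'); do
  ssh "$host" 'nodetool cleanup'
done
# 2) optionally raise RF, then repair so the new replicas actually receive the data
cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};"
nodetool repair -full my_ks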

Longer term, I agree with Oleksandr, the recommendation for number of vnodes is 
now much smaller than 256. I am using 8 or 16.


Sean Durity

From: Oleksandr Shulgin 
Sent: Monday, September 03, 2018 10:02 AM
To: User 
Subject: [EXTERNAL] Re: adding multiple node to a cluster, cleanup and 
num_tokens

On Mon, Sep 3, 2018 at 12:19 PM onmstester onmstester <onmstes...@zoho.com> wrote:
What I have understood from this part of the document is that, when I already 
have nodes A, B, and C in the cluster, there would be some old data on A, B, and 
C after a new node D has joined the cluster completely (the data that was 
streamed to D). Then, if I add node E to the cluster immediately, would the old 
data on A, B, and C also be moved between nodes every time?

Potentially, when you add node E it takes ownership of some of the data that D 
has.  So you have to run cleanup on all nodes (except the very last node you add) 
in the end.  It still makes sense to do this once, not after every single node 
you add.

--
Alex






RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Durity, Sean R
If you are going to compare vs commercial offerings like Scylla and CosmosDB, 
you should be looking at DataStax Enterprise. They are moving more quickly than 
open source (IMO) on adding features and tools that enterprises really need. I 
think they have some emerging tech for large/dense nodes, in particular. The 
ability to handle different data model types (Graph and Search) and embedded 
analytics sets it apart from plain Cassandra. Plus, they have replaced 
Cassandra’s SEDA architecture to give it a significant boost in performance. As 
a customer, I see the value in what they are doing.


Sean Durity
From: onmstester onmstester 
Sent: Wednesday, August 29, 2018 7:43 AM
To: user 
Subject: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

Could you please explain more about this (do you mean slower performance compared 
to Cassandra?):
---HBase tends to be quite average for transactional data

and about:
ScyllaDB IDK, I'd assume they just sorted out streaming by learning from 
C*'s mistakes.
While ScyllaDB is a much younger project than Cassandra, with much less usage 
and attention, I currently face a dilemma when launching new clusters: should I 
wait for the Cassandra community to apply all the enhancements and bug fixes 
already applied by its main competitors (ScyllaDB or Cosmos DB), or just switch 
to a competitor (afraid of the new world!)?
For example, right now, is there a motivation to handle denser nodes in the near 
future?

Again, Thank you for your time


Sent using Zoho 
Mail


On Wed, 29 Aug 2018 15:16:40 +0430, kurt greaves <k...@instaclustr.com> wrote:

Most of the issues around big nodes are related to streaming, which is currently 
quite slow (it should be a bit better in 4.0). HBase is built on top of Hadoop, 
which is much better at large files/very dense nodes, and tends to be quite 
average for transactional data. ScyllaDB I don't know; I'd assume they just 
sorted out streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester <onmstes...@zoho.com> wrote:


Thanks Kurt,
Actually my cluster has > 10 nodes, so there is a tiny chance of streaming a 
complete SSTable.
While logically any columnar NoSQL DB like Cassandra always needs to re-sort 
grouped data for fast later reads, and having nodes with a big amount of data 
(> 2 TB) would be annoying for this background process, how is it possible that 
some of these databases, like HBase and ScyllaDB, do not emphasize small nodes 
(like Cassandra does)?


Sent using Zoho 
Mail


 Forwarded message 
From: kurt greaves <k...@instaclustr.com>
To: "User" <user@cassandra.apache.org>
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was that if you have a small cluster with vnodes, you're more likely 
to have enough overlap between nodes that whole SSTables will be streamed on 
major ops. As N gets > RF you'll have fewer common ranges and thus be less likely 
to be streaming complete SSTables. Correct me if I've misunderstood.








RE: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

2018-08-29 Thread Durity, Sean R
Sstableloader, though, could require a lot more disk space – until compaction 
can reduce it. For example, if your RF=3, you will essentially be loading 3 copies 
of the data. Then it will get replicated 3 more times as it is being loaded. 
Thus, you could need up to 9x disk space.


Sean Durity
From: kurt greaves 
Sent: Wednesday, August 29, 2018 7:26 AM
To: User 
Subject: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

Removing dev...
Nodetool refresh only picks up new SSTables that have been placed in the table's 
directory. It doesn't account for actual ownership of the data like 
SSTableloader does. Refresh will only work properly if the SSTables you are 
copying in are completely covered by that node's tokens. It doesn't work if 
there's a change in topology; replication and token ownership will have to be 
more or less the same.

SSTableloader will break up the SSTables and send the relevant bits to 
whichever node needs them, so there is no need for you to worry about tokens and 
copying data to the right places; it will do that for you.
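
Side by side, a sketch of the two approaches (keyspace, table, and paths are 
examples):

# nodetool refresh: only valid when the copied-in SSTables already fall within
# this node's token ranges (same topology/ownership as the source)
cp /backup/snapshots/ks1/tbl1/* /var/lib/cassandra/data/ks1/tbl1-<table_id>/
nodetool refresh ks1 tbl1

# sstableloader: splits the SSTables and streams each piece to whichever replica
# owns it, so topology/token changes are fine
sstableloader -d 10.0.0.1,10.0.0.2 /backup/snapshots/ks1/tbl1/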

On 28 August 2018 at 11:27, Rajath Subramanyam <rajat...@gmail.com> wrote:
Hi Cassandra users, Cassandra dev,

When recovering using SSTables from a snapshot, I want to know what are the key 
differences between using:
1. Nodetool refresh and,
2. SSTableloader

Does nodetool refresh have restrictions that need to be met? Does nodetool 
refresh work even if there is a change in the topology between the source 
cluster and the destination cluster? Does it work if the token ranges don't 
match between the source cluster and the destination cluster? Does it work when 
an old SSTable in the snapshot has a dropped column that is not part of the 
current schema?

I appreciate any help in advance.

Thanks,
Rajath

Rajath Subramanyam







RE: [EXTERNAL] Re: Improve data load performance

2018-08-15 Thread Durity, Sean R
Might also help to know:
Size of cluster
How much data is being loaded (# of inserts/actual data size)
Single table or multiple tables?
Is this a one-time or occasional load or more frequently?
Is the data located in the same physical data center as the cluster? (any 
network latency?)

On the client side, prepared statements and ExecuteAsync can really speed 
things up.


Sean Durity

From: Elliott Sims 
Sent: Wednesday, August 15, 2018 1:13 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Improve data load performance

Step one is always to measure your bottlenecks.  Are you spending a lot of time 
compacting?  Garbage collecting?  Are you saturating CPU?  Or just a few cores? 
 Or I/O?  Are repairs using all your I/O?  Are you just running out of write 
threads?

On Wed, Aug 15, 2018 at 5:48 AM, Abdul Patel <abd786...@gmail.com> wrote:
The application team is trying to load data with leveled compaction and it's 
taking 1 hour to load. What are the best options to load data faster?


On Tuesday, August 14, 2018, @Nandan@ <nandanpriyadarshi...@gmail.com> wrote:
Bro, please explain your question as much as possible.
This is not a single-line Q&A session where we will be able to understand your 
in-depth queries from a single line.
For a better and more suitable reply, please ask a question and elaborate on what 
steps you took, what issue you are getting, and so on.
I hope I am making it clear. Don't take it personally.

Thanks

On Wed, Aug 15, 2018 at 8:25 AM Abdul Patel <abd786...@gmail.com> wrote:
How can we improve data load performance?






RE: [EXTERNAL] Re: Data Corruption due to multiple Cassandra 2.1 processes?

2018-08-13 Thread Durity, Sean R
I have definitely seen corruption, especially in system tables, when there are 
multiple instances of Cassandra running/trying to start. We had an internal 
tool that was supposed to restart processes (like Cassandra) if they were down, 
but it often re-checked before Cassandra was fully up and started another 
Cassandra process. Unwinding it could be very ugly.
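
A sketch of the kind of guard such a restart tool could use before starting 
Cassandra (the port and service name assume defaults):

if pgrep -f 'org.apache.cassandra.service.CassandraDaemon' >/dev/null; then
  echo "Cassandra already running (possibly still starting) - not starting another"; exit 0
fi
if ss -ltn 2>/dev/null | grep -q ':7000 '; then
  echo "storage port 7000 still bound from a previous run - investigate first"; exit 1
fi
sudo service cassandra start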


Sean Durity

From: kurt greaves 
Sent: Monday, August 13, 2018 7:24 AM
To: User 
Subject: [EXTERNAL] Re: Data Corruption due to multiple Cassandra 2.1 processes?

Yeah, that's not ideal and could lead to problems. I think corruption is only 
likely if compactions occur, but it seems like data loss is a possibility, not to 
mention all sorts of other possible nasties that could occur running two C*'s 
at once. It seems to me that 11540 should have gone to 2.1 in the first place, but 
it just got missed. It is a very simple patch, so I think a backport should be 
accepted.

On 7 August 2018 at 15:57, Steinmaurer, Thomas <thomas.steinmau...@dynatrace.com> wrote:
Hello,

With 2.1, in case a second Cassandra process/instance is started on a host (by 
accident), can this result in some sort of corruption, even though Cassandra will 
exit at some point in time due to not being able to bind TCP ports that are 
already in use?

What we have seen in this scenario is something like that:

ERROR [main] 2018-08-05 21:10:24,046 CassandraDaemon.java:120 - Error starting 
local jmx server:
java.rmi.server.ExportException: Port already in use: 7199; nested exception is:
java.net.BindException: Address already in use (Bind failed)
…

But then continuing with stuff like opening system and even user tables:

INFO  [main] 2018-08-05 21:10:24,060 CacheService.java:110 - Initializing key 
cache with capacity of 100 MBs.
INFO  [main] 2018-08-05 21:10:24,067 CacheService.java:132 - Initializing row 
cache with capacity of 0 MBs
INFO  [main] 2018-08-05 21:10:24,073 CacheService.java:149 - Initializing 
counter cache with capacity of 50 MBs
INFO  [main] 2018-08-05 21:10:24,074 CacheService.java:160 - Scheduling counter 
cache save to every 7200 seconds (going to save all keys).
INFO  [main] 2018-08-05 21:10:24,161 ColumnFamilyStore.java:365 - Initializing 
system.sstable_activity
INFO  [SSTableBatchOpen:2] 2018-08-05 21:10:24,692 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-165
 (2023 bytes)
INFO  [SSTableBatchOpen:3] 2018-08-05 21:10:24,692 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-167
 (2336 bytes)
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,692 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-166
 (2686 bytes)
INFO  [main] 2018-08-05 21:10:24,755 ColumnFamilyStore.java:365 - Initializing 
system.hints
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,758 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/hints-2666e20573ef38b390fefecf96e8f0c7/system-hints-ka-377
 (46210621 bytes)
INFO  [main] 2018-08-05 21:10:24,766 ColumnFamilyStore.java:365 - Initializing 
system.compaction_history
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,768 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-129
 (91269 bytes)
…

Replaying commit logs:

…
INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:267 - Replaying 
/var/opt/dynatrace-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:270 - Replaying 
/var/opt/dynatrace-managed/cassandra/commitlog/CommitLog-4-1533133668366.log 
(CL version 4, messaging version 8)
…

Even writing memtables already (below just pasted system tables, but also user 
tables):

…
INFO  [MemtableFlushWriter:4] 2018-08-05 21:11:52,524 Memtable.java:347 - 
Writing 
Memtable-size_estimates@1941663179(2.655MiB
 serialized bytes, 325710 ops, 2%/0% of on/off-heap limit)
INFO  [MemtableFlushWriter:3] 2018-08-05 21:11:52,552 Memtable.java:347 - 
Writing 
Memtable-peer_events@1474667699(0.199KiB
 serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
…

Until it comes to a point where it can’t bind ports like the storage port 7000:

ERROR [main] 2018-08-05 21:11:54,350 CassandraDaemon.java:395 - Fatal 
configuration error
org.apache.cassandra.exceptions.ConfigurationException: /XXX:7000 is in use by 
another process.  Change listen_address:storage_port in cassandra.yaml to 
values that do not conflict with other services
at 

RE: [EXTERNAL] Re: ETL options from Hive/Presto/s3 to cassandra

2018-08-09 Thread Durity, Sean R
DataStax Enterprise 6.0 has a new bulk loader tool. DSE is a commercial 
product, but it may be worth investigating for your needs.


Sean Durity

From: Rahul Singh 
Sent: Tuesday, August 07, 2018 9:37 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: ETL options from Hive/Presto/s3 to cassandra

Spark is scalable to as many nodes as you want and can be co-located with the 
data nodes; sstableloader won't be as performant for larger datasets. Although 
it can be run in parallel on different nodes, I don't believe it to be as fault 
tolerant.

If you have to do it continuously I would even think about leveraging Kafka as 
the transport layer and using Kafka Connect. It brings other tooling to get 
data into Cassandra from a variety of sources.

Rahul
On Aug 6, 2018, 3:16 PM -0400, srimugunthan dhandapani 
mailto:srimugunthan.dhandap...@gmail.com>>, 
wrote:

Hi all,
We have data that gets loaded into Hive/Presto every few hours.
We want that data to be transferred to Cassandra tables.
What are some of the high-performance ETL options for transferring data from 
Hive or Presto into Cassandra?

Also does anybody have any performance numbers comparing
- loading data from S3 to cassandra using SStableloader
- and loading data from S3 to cassandra using other means (like spark-api)?

Thanks,
mugunthan





RE: [EXTERNAL] Re: Cassandra rate dropping over long term test

2018-08-03 Thread Durity, Sean R
I wonder if you are building up tombstones with the deletes. Can you share your 
data model? Are the deleted rows using the same partition key as new rows? Any 
warnings in your system.log for reading through too many tombstones?


Sean Durity

From: Mihai Stanescu 
Sent: Friday, August 03, 2018 12:03 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra rate dropping over long term test

I looked at the compaction history on the affected node while it was affected, 
and the compaction history itself was not affected.

The number of compactions is fairly similar, and so is the amount of work.

Not affected time
[root@cassandra7 ~]# nodetool compactionhistory | grep 02T22
fda43ca0-9696-11e8-8efb-25b020ed0402 demodbtopic_message 
2018-08-02T22:59:47.946 433124864  339496194  {1:3200576, 2:2025936, 3:262919}
8a83e2c0-9696-11e8-8efb-25b020ed0402 demodbtopic_message 
2018-08-02T22:56:34.796 133610579  109321990  {1:1574352, 2:434814}
01811e20-9696-11e8-8efb-25b020ed0402 demodbtopic_message 
2018-08-02T22:52:44.930 132847372  108175388  {1:1577164, 2:432508}

Experiencing more ioread
[root@cassandra7 ~]# nodetool compactionhistory | grep 03T12
389aa220-970c-11e8-8efb-25b020ed0402 demodbtopic_message 
2018-08-03T12:58:57.986 470326446  349948622  {1:2590960, 2:2600102, 3:298369}
81fe6f10-970b-11e8-8efb-25b020ed0402 demodbtopic_message 
2018-08-03T12:53:51.617 143850880  11226  {1:1686260, 2:453627}
ce418e30-970a-11e8-8efb-25b020ed0402 demodbtopic_message 
2018-08-03T12:48:50.067 147035600  119201638  {1:1742318, 2:452226}

During a read operation the row should mostly be in one SSTable, since it was 
only inserted and then read, so it's strange.

We have a partition key and then a clustering key.

Rows that are written should be in kernel buffers, and the rows that are deleted 
are never read again either, so the kernel should only have the most recent data.

I remain puzzled



On Fri, Aug 3, 2018 at 3:58 PM, Jeff Jirsa 
mailto:jji...@gmail.com>> wrote:
Probably Compaction

Cassandra data files are immutable

The write path first appends to a commitlog, then puts data into the memtable. 
When the memtable hits a threshold, it’s flushed to data files on disk (let’s 
call the first one “1”, second “2” and so on)

Over time we build up multiple data files on disk - when Cassandra reads, it 
will merge data in those files to give you the result you expect, choosing the 
latest value for each column

But it’s usually wasteful to keep lots of files around, and that merging is 
expensive, so compaction combines those data files behind the scenes in a 
background thread.

By default they’re combined when 4 or more files are approximately the same 
size, so if your write rate is such that you fill and flush the memtable every 
5 minutes, compaction will likely happen at least every 20 minutes (sometimes 
more). This is called size tiered compaction; there are 4 strategies but size 
tiered is default and easiest to understand.
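
As an aside, the thresholds mentioned above are per-table settings; a minimal 
CQL sketch, assuming the keyspace/table from the nodetool output earlier in 
this thread is demodb.topic_message:

-- compact once at least 4 (and at most 32) similar-sized SSTables exist
ALTER TABLE demodb.topic_message
  WITH compaction = { 'class': 'SizeTieredCompactionStrategy',
                      'min_threshold': '4', 'max_threshold': '32' };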

You’re seeing mostly writes because the reads are likely in page cache (the 
kernel doesn’t need to go to disk to read the files, it’s got them in memory 
for serving normal reads).

--
Jeff Jirsa


> On Aug 3, 2018, at 12:30 AM, Mihai Stanescu 
> mailto:mihai.stane...@gmail.com>> wrote:
>
> Hi all,
>
> I am perf-testing Cassandra over a long run in a cluster of 8 nodes and I 
> noticed that the rate of service drops.
> Most of the nodes have CPU between 40-65%; however, one of the nodes has a 
> higher CPU and also started performing a lot of read IOPS, as seen in the 
> image. (Green is read IOPS.)
>
> My test has a mixed read/write scenario:
> 1. insert row
> 2. after 60 seconds, read row
> 3. delete row
>
> The rate of inserts is bigger than the rate of deletes, so some deletes will 
> not happen.
>
> I have checked the client and it does not accumulate RAM; GC is a straight 
> line, so I don't understand what's going on.
>
> Any hints?
>
> Regards,
> MIhai
>
> 
>
>

-
To unsubscribe, e-mail: 
user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: 
user-h...@cassandra.apache.org





RE: [EXTERNAL] full text search on some text columns

2018-07-31 Thread Durity, Sean R
That sounds like a problem tailor-made for the DataStax Search (embedded SOLR) 
solution. I think that would be the fastest path to success.
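
For a purely open-source route on Cassandra 3.4+, the (experimental) SASI index 
can also cover simple LIKE-style matching on a regular text column; a minimal 
sketch with made-up keyspace, table, and column names:

CREATE CUSTOM INDEX IF NOT EXISTS events_body_idx
  ON myks.events (body)
  USING 'org.apache.cassandra.index.sasi.SASIIndex'
  WITH OPTIONS = { 'mode': 'CONTAINS' };

-- queries like this then become possible:
SELECT id, body FROM myks.events WHERE body LIKE '%error%';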


Sean Durity

From: onmstester onmstester 
Sent: Tuesday, July 31, 2018 10:46 AM
To: user 
Subject: [EXTERNAL] full text search on some text columns

I need to do a full text search (LIKE) on one of my clustering keys and one of 
the partition keys (they use text as the data type). The input rate is high, so 
only Cassandra could handle it. Is there any open-source project which helps 
with using Cassandra + Solr or Cassandra + Elasticsearch?
Any recommendation on doing this with home-made solutions would be appreciated.


Sent using Zoho 
Mail







RE: [EXTERNAL] Server kernal Parameters for cassandra

2018-07-30 Thread Durity, Sean R
Here are some to review and test for Cassandra 3.x from DataStax:
https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configRecommendedSettings.html

Al Tobey has done extensive work in this area, too. This is dated (Cassandra 
2.1), but is worth mining for information:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html


Sean Durity

-Original Message-
From: rajasekhar kommineni 
Sent: Sunday, July 29, 2018 7:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Server kernal Parameters for cassandra

Hello,

Do we have any standard values for server kernel parameters to run Cassandra? 
Please share some insight.

Thanks,


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org





-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



RE: [EXTERNAL] optimization to cassandra-env.sh

2018-07-26 Thread Durity, Sean R
This is a very good explanation of CMS tuning for Cassandra:
http://thelastpickle.com/blog/2018/04/11/gc-tuning.html
(author Jon Haddad has extensive Cassandra experience – a super star in our 
field)

Sean Durity
From: Durity, Sean R 
Sent: Thursday, July 26, 2018 2:08 PM
To: user@cassandra.apache.org
Subject: RE: [EXTERNAL] optimization to cassandra-env.sh

Check the archives for CMS or G1 (whichever garbage collector you are using). 
There has been significant and good advice on both. In general, though, G1 has 
one basic number to set and does very well in our use cases. CMS has lots of 
black art/science tuning and configuration, but you can test options on a 
“canary” node and tweak until it runs well.

If you need more help after looking back in the archives, we would need the 
Cassandra version, JVM type and version, jvm.options (or cassandra-env.sh) and 
what kinds of errors/gc you are seeing (GC logs can be helpful).


Sean Durity
From: R1 J1 mailto:rjsoft...@gmail.com>>
Sent: Thursday, July 26, 2018 1:28 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: [EXTERNAL] optimization to cassandra-env.sh

Has anyone tried to optimize or change cassandra-env.sh in a server 
installation to make it use a larger heap for garbage collection?
Any ideas? We are having some OOM issues and are wondering if we have options 
other than increasing RAM for that node.

Regards






RE: [EXTERNAL] optimization to cassandra-env.sh

2018-07-26 Thread Durity, Sean R
Check the archives for CMS or G1 (whichever garbage collector you are using). 
There has been significant and good advice on both. In general, though, G1 has 
one basic number to set and does very well in our use cases. CMS has lots of 
black art/science tuning and configuration, but you can test options on a 
“canary” node and tweak until it runs well.

If you need more help after looking back in the archives, we would need the 
Cassandra version, JVM type and version, jvm.options (or cassandra-env.sh) and 
what kinds of errors/gc you are seeing (GC logs can be helpful).


Sean Durity
From: R1 J1 
Sent: Thursday, July 26, 2018 1:28 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] optimization to cassandra-env.sh

Has anyone tried to optimize or change cassandra-env.sh in a server 
installation to make it use a larger heap for garbage collection?
Any ideas? We are having some OOM issues and are wondering if we have options 
other than increasing RAM for that node.

Regards






RE: [EXTERNAL] Re: Cassandra recommended server uptime?

2018-07-17 Thread Durity, Sean R
We do not have any scheduled, periodic node restarts. I have been working on 
Cassandra across many versions, and I have not seen a case where periodic 
restarts would solve any problem that I saw.

There are certainly times when a node needs a restart – but those are because 
of specific reasons.


Sean Durity
From: Vsevolod Filaretov 
Sent: Tuesday, July 17, 2018 8:23 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra recommended server uptime?

@rahul.xavier.si...@gmail.com, 
@simon.fontana.oscars...@ericsson.com,

Thank you for answers.

I've got a really heavy data model, but no GC problems.

Moreover, the GC log shows a very efficient GC process and hardly any GC pauses.

So, the question remains open: how common is the practice of semi-periodic 
C* node restarts?

Best regards,
Vsevolod.

вт, 17 июл. 2018 г., 14:39 Rahul Singh 
mailto:rahul.xavier.si...@gmail.com>>:
It’s likely that if you have server stability issues, it's because of data model 
or compaction strategy configurations which lead to out-of-memory issues or 
massive GC pauses. Rebooting wouldn’t solve those issues.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation
On Jul 17, 2018, 7:28 AM -0400, Simon Fontana Oscarsson 
mailto:simon.fontana.oscars...@ericsson.com>>,
 wrote:

Not anything that I'm aware of. Cassandra can run for months/years without 
rebooting.
It is better to monitor your nodes, and if you find anything abnormal a restart 
can help.

--
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscars...@ericsson.com
www.ericsson.com

On tis, 2018-07-17 at 12:09 +0300, Vsevolod Filaretov wrote:

Good time of day everyone;

Does Cassandra have a "recommended uptime"? I.e., does regular Cassandra node 
reboots help anything? Are periodical node reboots recommended for general 
system stability?

Best regards, Vsevolod.





RE: [EXTERNAL] New cluster vs Increasing nodes to already existed cluster

2018-07-16 Thread Durity, Sean R
In most cases, we separate clusters by application. This does help with 
isolating problems. A bad query in one application won’t affect other 
applications. Also, you can then scale each cluster as required by the data 
demands. You can also upgrade separately, which may be a huge help. You only 
need one team’s testing (and driver change or whatever) before you can upgrade. 
With a multi-tenant ring, you will need much more coordination for any changes.

There is a practical limit on the number of tables (and therefore memtables) per 
cluster, too. This is somewhere in the low hundreds (200-300), based on the 
amount of RAM you have per node.


Sean Durity

From: onmstester onmstester 
Sent: Monday, July 16, 2018 9:17 AM
To: "user" 
Subject: [EXTERNAL] New cluster vs Increasing nodes to already existed cluster

Currently I have a cluster with 10 nodes dedicated to one keyspace (hardware 
sizing was done according to input rate and TTL just for current application 
requirements).
I need to launch a new application with a new keyspace on another set of servers 
(8 nodes); there is no relation between the current and new application. I have 
two options:
1. add the new nodes to the already existing cluster (10 nodes + 8 nodes) and 
share the power and storage between the keyspaces
2. create a new cluster for the new application (isolate the clusters)
Which option do you recommend and why? (I care about cost of maintenance, 
performance (write and read), and isolation of problems.)

Sent using Zoho 
Mail







RE: [EXTERNAL] Re: JVM Heap erratic

2018-07-03 Thread Durity, Sean R
THIS! A well-reasoned and clear explanation of a very difficult topic. This is 
the kind of gold that a user mailing list can provide. Thank you, Alain!


Sean Durity

From: Alain RODRIGUEZ 
Sent: Tuesday, July 03, 2018 6:37 AM
To: user cassandra.apache.org 
Subject: [EXTERNAL] Re: JVM Heap erratic

Hello Randy,

It's normal for the memory in the heap to follow this pattern. Java uses the 
memory available and, when needed, cleans some memory for new needs; that's the 
variation you see. In your case it's not really regular, but this can depend on 
the workload as well.

I'm a C# .NET guy, so I have no idea if this is normal Java behavior.

I feel you. I started operating Cassandra with no clue about the garbage 
collection and other JVM stuff. When I started tuning it the first time with 
some former colleagues, we ended up removing half of the nodes of the cluster 
and still divided latency by 2. It is an important part of Cassandra to tune, 
and often people (including myself) overlook it because it's too complex. I'll 
try to give you the big picture so you can do some analysis of what's going on 
and hopefully do some good to this cluster ("some good" - maybe not remove half 
of the nodes and reduce the latency, that was really a strong improvement on a 
badly tuned GC, but let's see :) ).

The heap is a limited amount of memory used to store Java objects. It's 
composed of 3 sections: the New Generation, the Old Generation, and the 
Permanent Generation. New objects go to the New Gen ('HEAP_NEW_SIZE' in CMS, 
auto in G1GC - do not set). From time to time, depending on usage and tuning, 
surviving objects are pushed from the Eden space, where they first land, to one 
of the 2 survivor spaces (the other one is empty). Then, depending on the 
tenuring threshold option (in CMS, auto in G1GC too I believe), the data will be 
passed from one survivor to the other, expiring old data in the process. This 
cleaning process in the New Gen is called a minor garbage collection (minor GC) 
and is triggered when Eden is full. After the tenuring threshold is reached and 
an object has been moved around the survivor spaces that many times, surviving 
objects are promoted (or tenured) to the Old Gen. This promotion of living 
objects is referred to as a major GC.
This is the most expensive GC, and even though it will have to happen from time 
to time in almost all cases, it's worthwhile to reduce the total duration and 
frequency of major GCs to improve GC statistics overall. We can ignore the 
Permanent Gen, which does not trigger any important GC activity.

Some more information is available here: 
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

In Cassandra, especially in read-heavy workloads, objects can often expire 
before being promoted if given enough space and time to do so. And this is far 
more performant than promoting objects because they did not survive long enough 
in the New Gen.

Using CMS with 20 GB is not recommended (out of the box, as a starting point at 
least) because CMS performance is known to degrade quickly with heaps bigger 
than 8 GB. 20 GB is a lot. It also depends on the total memory available.

tried 8GB = OOM
tried 12GB = OOM
tried 20GB w/ G1 = OOM (and long GC pauses usually over 2 secs)
tried 20GB w/ CMS = running

OOMs are not only related to the space available but also to the inability to 
clean the heap efficiently enough before the space is needed. Thus tuning more 
options than just the heap size might help.

CMS (over G1GC)
HEAP: 8 to 16 GB.
NEW_HEAP: 25 to 50% - nothing to do with CPU cores, contrary to the 
documentation/comments in the file imho.
MaxTenuringThreshold: 15 - from 1 all the way up to 15, that's what gave me the 
best results in the past; it reduces major GCs and makes the most of New 
Gen/minor GCs, which are less impactful, but still "stop the world" GCs. Default 
is 1, which is often way too short to expire objects...
SurvivorRatio: 2 to 8 - controls the survivor spaces' size. It will be: 'Survivor 
total space = New Gen Size / (SurvivorRatio + 2)'. Dividing by 2 you have the 
size of each survivor. Here it will depend on how fast the Eden space is 
allocated. Increasing the survivor space will diminish the Eden space (where 
new objects are allocated), so there is a tradeoff here as well and a balance 
to find.

I would try these settings on a canary node:
HEAP - 16 GB (if read heavy; if not, probably between 8 and 12 GB is better).
NEW_HEAP - 50% of the heap (4 - 8 GB)
MaxTenuringThreshold: 15
SurvivorRatio: 4

When testing GC, there is no better way than using a canary node: pick one rack 
and the node(s) you want in this rack to test. This should not impact

RE: [EXTERNAL] Re: consultant recommendations

2018-06-29 Thread Durity, Sean R
I haven’t ever hired a Cassandra consultant, but the company named The Last 
Pickle (yes, an odd name) has some outstanding Cassandra experts. Not sure how 
they work, but worth a mention here.

Nothing against Instaclustr. There are great folks there, too.


Sean Durity

From: Evelyn Smith 
Sent: Friday, June 29, 2018 1:54 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: consultant recommendations

Hey Randy

Instaclustr provides consulting services for Cassandra as well as managed 
services if you are looking to offload the admin burden.

https://www.instaclustr.com/services/cassandra-consulting/

Alternatively, send me an email at 
evelyn.ba...@instaclustr.com and I’d be 
happy to chase this up on Monday with the head of consulting (it’s Friday night 
my time).

Cheers,
Evelyn.

On 30 Jun 2018, at 2:26 am, Randy Lynn 
mailto:rl...@getavail.com>> wrote:

Having some OOM issues. Would love to get feedback from the group on what 
companies/consultants you might use?

--
Randy Lynn
rl...@getavail.com

office:

859.963.1616  ext 202


163 East Main Street - Lexington, KY 40507 - USA


getavail.com




RE: RE: [EXTERNAL] Cluster is unbalanced

2018-06-19 Thread Durity, Sean R
You are correct that the cluster decides where data goes (based on the hash of 
the partition key). However, if you choose a “bad” partition key, you may not 
get good distribution of the data, because the hash is deterministic (it always 
goes to the same nodes/replicas). For example, if you have a partition key of a 
datetime, it is possible that there is more data written for a certain time 
period – thus a larger partition and an imbalance across the cluster. Choosing 
a “good” partition key is one of the most important decisions for a Cassandra 
table.

Also, I have seen the use of racks in the topology cause an imbalance in the 
“first” node of the rack.

To help you more, we would need the create table statement(s) for your keyspace 
and the topology of the cluster (like with nodetool status).
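
To make the partition-key point concrete, a hedged, made-up sketch (table and 
column names are illustrative only):

-- skewed: every row for a given hour hashes to the same partition and replicas
CREATE TABLE myks.events_by_hour (
    event_hour timestamp,
    event_id   timeuuid,
    payload    text,
    PRIMARY KEY (event_hour, event_id)
);

-- better spread: a composite partition key adds a bucket chosen by the writer,
-- e.g. hash(event_id) % 32
CREATE TABLE myks.events_by_hour_bucketed (
    event_hour timestamp,
    bucket     int,
    event_id   timeuuid,
    payload    text,
    PRIMARY KEY ((event_hour, bucket), event_id)
);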


Sean Durity
From: learner dba 
Sent: Tuesday, June 19, 2018 9:50 AM
To: user@cassandra.apache.org
Subject: Re: RE: [EXTERNAL] Cluster is unbalanced

We do not choose the node where a partition will go. I thought it is the 
snitch's role to choose the replica nodes. Even the partition size does not vary 
on our largest column family:

Percentile  SSTables   Write Latency   Read Latency   Partition Size   Cell Count
                        (micros)        (micros)       (bytes)
50%         0.00        17.08            61.21           3311             1
75%         0.00        20.50            88.15           3973             1
95%         0.00        35.43           105.78           3973             1
98%         0.00        42.51           126.93           3973             1
99%         0.00        51.01           126.93           3973             1
Min         0.00         3.97            17.09             61
Max         0.00        73.46           126.93          11864             1

We are kind of stuck here trying to identify what could be causing this imbalance.

On Tuesday, June 19, 2018, 7:15:28 AM EDT, Joshua Galbraith 
 wrote:


>If it was partition key issue, we would see similar number of partition keys 
>across nodes. If we look closely number of keys across nodes vary a lot.

I'm not sure about that, is it possible you're writing more new partitions to 
some nodes even though each node owns the same number of tokens?


On Mon, Jun 18, 2018 at 6:07 PM, learner dba 
mailto:cassandra...@yahoo.com.invalid>> wrote:
Hi Sean,

Are you using any rack aware topology? --> we are using gossip file
What are your partition keys? --> the partition key is unique
Is it possible that your partition keys do not divide up as cleanly as you 
would like across the cluster because the data is not evenly distributed (by 
partition key)? --> No, we verified it.

If it were a partition key issue, we would see a similar number of partition 
keys across the nodes. If we look closely, the number of keys across nodes 
varies a lot.


Number of partitions (estimate): 3142552
Number of partitions (estimate): 15625442
Number of partitions (estimate): 15244021
Number of partitions (estimate): 9592992
Number of partitions (estimate): 15839280





On Monday, June 18, 2018, 5:39:08 PM EDT, Durity, Sean R 
mailto:sean_r_dur...@homedepot.com>> wrote:



Are you using any rack aware topology? What are your partition keys? Is it 
possible that your partition keys do not divide up as cleanly as you would like 
across the cluster because the data is not evenly distributed (by partition 
key)?





Sean Durity

lord of the (C*) rings (Staff Systems Engineer – Cassandra)

MTC 2250

#cassandra - for the latest news and updates



From: learner dba 
Sent: Monday, June 18, 2018 2:06 PM
To: User cassandra.apache.org
Subject: [EXTERNAL] Cluster is unbalanced



Hi,



Data volume varies a lot in our two DC cluster:

 Load   Tokens   Owns

 20.01 GiB  256  ?

 65.32 GiB  256  ?

 60.09 GiB  256  ?

 46.95 GiB  256  ?

 50.73 GiB  256  ?

kaiprodv2

=

/Leaving/Joining/Moving

 Load   Tokens   Owns

 25.19 GiB  256  ?

 30.26 GiB  256  ?

 9.82 GiB   256  ?

 20.54 GiB  256  ?

 9.7 GiB256  ?



I ran clearsnapshot, garbagecollect and cleanup, but it increased the size on 
heavier nodes instead of decreasing. Based on nodetool cfstats, I can see 
partition keys on each node varies a lot:



Number of partitions (estimate): 3142552

Number of partitions (estimate): 15625442

N

RE: [EXTERNAL] Re: Tombstone

2018-06-19 Thread Durity, Sean R
This sounds like a queue pattern, which is typically an anti-pattern for 
Cassandra. I would say that it is very difficult to get the access patterns, 
tombstones, and everything else lined up properly to solve a queue problem.


Sean Durity

From: Abhishek Singh 
Sent: Tuesday, June 19, 2018 10:41 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Tombstone

   The partition key is made of a datetime (basically the date 
truncated to the hour) and a bucket. I think your RCA may be correct, since we 
are deleting the partition rows one by one, not in a batch, so files may be 
overlapping for the particular partition. A scheduled thread picks the rows for 
a partition based on the current datetime and bucket number and checks for each 
row whether the entry is past due or not; if yes, we trigger an event and remove 
the entry.



On Tue 19 Jun, 2018, 7:58 PM Jeff Jirsa, 
mailto:jji...@gmail.com>> wrote:
The most likely explanation is tombstones in files that won’t be collected as 
they potentially overlap data in other files with a lower timestamp (especially 
true if your partition key doesn’t change and you’re writing and deleting data 
within a partition)
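
If that overlap check is what keeps the tombstones around, one knob sometimes 
used is to let single-SSTable tombstone compactions skip it; a sketch only, 
with an illustrative table name, not a blanket recommendation:

-- tombstone_threshold: consider an SSTable once ~20% of it is droppable tombstones
-- unchecked_tombstone_compaction: run that single-SSTable compaction without
-- the overlap pre-check
ALTER TABLE myks.events
  WITH compaction = { 'class': 'SizeTieredCompactionStrategy',
                      'tombstone_threshold': '0.2',
                      'unchecked_tombstone_compaction': 'true' };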

--
Jeff Jirsa


> On Jun 19, 2018, at 3:28 AM, Abhishek Singh 
> mailto:abh23...@gmail.com>> wrote:
>
> Hi all,
> We are using Cassandra for storing time-series events for batch processing.
> Once a particular batch (based on the hour) is processed we delete the
> entries, but we were left with almost 18% of deletes marked as tombstones.
> I ran compaction on the particular CF; the tombstone count didn't come down.
> Can anyone suggest the optimal tuning/recommended practice for the compaction
> strategy and GC grace period with 100k entries and deletes every hour?
>
> Warm Regards
> Abhishek Singh

-
To unsubscribe, e-mail: 
user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: 
user-h...@cassandra.apache.org


RE: [EXTERNAL] Cluster is unbalanced

2018-06-18 Thread Durity, Sean R
Are you using any rack aware topology? What are your partition keys? Is it 
possible that your partition keys do not divide up as cleanly as you would like 
across the cluster because the data is not evenly distributed (by partition 
key)?


Sean Durity
lord of the (C*) rings (Staff Systems Engineer – Cassandra)
MTC 2250
#cassandra - for the latest news and updates

From: learner dba 
Sent: Monday, June 18, 2018 2:06 PM
To: User cassandra.apache.org 
Subject: [EXTERNAL] Cluster is unbalanced

Hi,

Data volume varies a lot in our two DC cluster:

 Load   Tokens   Owns

 20.01 GiB  256  ?

 65.32 GiB  256  ?

 60.09 GiB  256  ?

 46.95 GiB  256  ?

 50.73 GiB  256  ?

kaiprodv2

=

/Leaving/Joining/Moving

 Load   Tokens   Owns

 25.19 GiB  256  ?

 30.26 GiB  256  ?

 9.82 GiB   256  ?

 20.54 GiB  256  ?

 9.7 GiB256  ?

I ran clearsnapshot, garbagecollect and cleanup, but it increased the size on 
heavier nodes instead of decreasing. Based on nodetool cfstats, I can see 
partition keys on each node varies a lot:


Number of partitions (estimate): 3142552

Number of partitions (estimate): 15625442

Number of partitions (estimate): 15244021

Number of partitions (estimate): 9592992
Number of partitions (estimate): 15839280

How can I diagnose this imbalance further?



RE: [EXTERNAL] Re: apache-cassandra 2.2.8 rpm

2018-06-11 Thread Durity, Sean R


>Finally can I run mixed Datastax and Apache nodes in the same cluster same 
>version?
>Thank you for all your help.

I have run DSE and Apache Cassandra in the same cluster while migrating to DSE. 
The versions of Cassandra were the same. It was relatively brief -- just during 
the upgrade process -- but they run just fine together. My experience was in 
the Cassandra 1.1 and 2.0 time frame, but I would expect it to be similar now. 
Exceptions, of course, would be the DSE additional tools like SOLR, Spark, 
OpsCenter, Graph, DSE Authentication, etc. 


Sean Durity
lord of the (C*) rings (Staff Systems Engineer – Cassandra)
MTC 2250
#cassandra - for the latest news and updates



RE: [EXTERNAL] IN clause of prepared statement

2018-05-21 Thread Durity, Sean R
One of the columns you are selecting is a list or map or other kind of 
collection. You can’t do that with an IN clause against a clustering column. 
Either don’t select the collection column OR don’t use the IN clause. Cassandra 
is trying to protect itself (and you) from a query that won’t scale well. Honor 
that.

As a good practice, you shouldn’t do select * (as a production query) against 
any database. You want to list the columns you actually want to select. That 
way a later “alter table add column” (or similar) doesn’t cause unpredictable 
results to the application.
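
A hedged sketch of both options against the table described in the question 
(the plain "value" column and the collection column "tags" are assumptions):

-- option 1: keep the IN, but leave the collection column out of the select list
SELECT partition, resource, timestamp, metric_name, value
  FROM samples
 WHERE partition = :partition AND resource = :resource
   AND timestamp >= :start AND timestamp <= :end
   AND metric_name IN :metric_names;

-- option 2: keep the collection column, but bind one metric per execution
SELECT partition, resource, timestamp, metric_name, value, tags
  FROM samples
 WHERE partition = :partition AND resource = :resource
   AND timestamp >= :start AND timestamp <= :end
   AND metric_name = :metric_name;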


Sean Durity
From: onmstester onmstester 
Sent: Sunday, May 20, 2018 10:13 AM
To: user 
Subject: [EXTERNAL] IN clause of prepared statement

The table is something like:

samples
...
primary key ((partition, resource), timestamp, metric_name)

creating prepared statement :
session.prepare("select * from samples where partition=:partition and 
resource=:resource and timestamp>=:start and timestamp<=:end and metric_name in 
:metric_names")

failed with  exception:

can not restrict clustering columns by IN relations when a collection is 
selected by the query

The query is OK using cqlsh. Using column names in the select did not help.
Is there any way to achieve this in Cassandra? I'm aware of the performance 
problems of this query, but that does not matter in my case!

I'm using datastax driver 3.2 and Apache cassandra 3.11.2

Sent using Zoho 
Mail







RE: [EXTERNAL] Re: Error after 3.1.0 to 3.11.2 upgrade

2018-05-14 Thread Durity, Sean R
A couple additional things:


-  Make sure that you ran repair on the system_auth keyspace on all nodes after 
changing the RF (a hedged sketch of this step follows below).

-  If you are not often changing roles/permissions, you might look to increase 
permissions_validity_in_ms and roles_validity_in_ms so they are not being 
fetched all the time (especially with the internal Cassandra 
Authorizer/Authenticator).
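
A minimal sketch of that first step, assuming a single data center named DC1 
with at least 3 nodes (adjust the DC name and replica count to the real topology):

ALTER KEYSPACE system_auth
  WITH replication = { 'class': 'NetworkTopologyStrategy', 'DC1': 3 };

-- then run "nodetool repair system_auth" on every node so the existing
-- credentials actually land on the new replicas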


Sean Durity

From: Jeff Jirsa 
Sent: Saturday, May 12, 2018 9:21 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Error after 3.1.0 to 3.11.2 upgrade

RF of one means all auth requests go to the same node, so they’re more likely 
to time out if that host is overloaded or restarts

Increasing it distributed the queries among more hosts

--
Jeff Jirsa


On May 12, 2018, at 6:11 AM, Abdul Patel 
> wrote:
Yeah, found that all keyspaces had a replication factor of 3 and system_auth had 
1; changed it to 3 now. So was this issue due to the system_auth replication 
factor mismatch?

On Saturday, May 12, 2018, Hannu Kröger 
> wrote:
Hi,

Did you check replication strategy and amounts of replicas of system_auth 
keyspace?

Hannu

Abdul Patel wrote on 12.5.2018 at 5.21:
No, the application isn't impacted.. no complaints..
Also, it's a 4-node cluster in a lower, non-production environment and all nodes 
are on the same version.

On Friday, May 11, 2018, Jeff Jirsa > 
wrote:
The read is timing out - is the cluster healthy? Is it fully upgraded or mixed 
versions? Repeated isn’t great, but is the application impacted?
--
Jeff Jirsa


On May 12, 2018, at 6:17 AM, Abdul Patel 
> wrote:
Seems it's coming from 3.10; I got a bunch of them today on 3.11.2. If this 
keeps repeating, what's the solution for this?

WARN  [Native-Transport-Requests-24] 2018-05-11 16:46:20,938 
CassandraAuthorizer.java:96 - CassandraAuthorizer failed to authorize # for 
ERROR [Native-Transport-Requests-24] 2018-05-11 16:46:20,940 
ErrorMessage.java:384 - Unexpected exception during request
com.google.common.util.concurrent.UncheckedExecutionException: 
java.lang.RuntimeException: 
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
received only 0 responses.
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) 
~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3937) 
~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) 
~[guava-18.0.jar:na]
at 
com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) 
~[guava-18.0.jar:na]
at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.auth.PermissionsCache.getPermissions(PermissionsCache.java:45)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.auth.AuthenticatedUser.getPermissions(AuthenticatedUser.java:104)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.service.ClientState.authorize(ClientState.java:439) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.service.ClientState.checkPermissionOnResourceChain(ClientState.java:368)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:345)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:332) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:310)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.checkAccess(SelectStatement.java:260)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:221)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:530)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:507)
 ~[apache-cassandra-3.11.2.jar:3.11.2]

On Fri, May 11, 2018 at 8:30 PM, Jeff Jirsa 
> wrote:
That looks like Cassandra 3.10 not 3.11.2

It’s also just the auth cache failing to refresh - if it’s transient it’s 
probably not a big deal. If it continues then there may be an issue with the 
cache refresher.
--
Jeff Jirsa


On May 12, 2018, at 5:55 AM, Abdul Patel 
> wrote:
Hi All,

I have seen the below stack trace messages in the error log one day after the 
upgrade. One of the blogs said this might be due to old drivers, but I am not 
sure about it.

FYI :

INFO  

RE: [EXTERNAL] Cassandra limitations

2018-05-04 Thread Durity, Sean R
The issue is more with the number of tables, not the number of keyspaces. 
Because each table has a memTable, there is a practical limit to the number of 
memtables that a node can hold in its memory. (And scaling out doesn’t help, 
because every node still has a memTable for every table.) The practical table 
limit I have heard is in the low hundreds – maybe 200 as a rough estimate.
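
A quick way to see where a cluster stands against that rough ceiling is to count 
its non-system tables; a sketch against the schema tables present in 3.x:

-- lists every table the cluster knows about; ignore the system* keyspaces when counting
SELECT keyspace_name, table_name FROM system_schema.tables;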

In general, we create a new cluster (instead of a new keyspace) for each 
application.


Sean Durity
From: Abdul Patel 
Sent: Thursday, May 03, 2018 5:56 PM
To: User@cassandra.apache.org
Subject: [EXTERNAL] Cassandra limitations

Hi ,

In my environment, we are coming up with 3 to 4 new projects, hence new 
keyspaces will be coming into the picture.
Do we have any limitations or performance issues when we hit a certain number of 
keyspaces, or with the number of nodes vs. keyspaces?
Also, are there any connection limitations?

I know that as data grows we can add more nodes and memory, but I am not sure 
about anything else that needs to be taken into consideration.







RE: [EXTERNAL] Re: Cassandra reaper

2018-04-26 Thread Durity, Sean R
Wait, isn’t this the Apache Cassandra mailing list? Shouldn’t this be on the 
pickle users list or something?

(Just kidding, everyone. I think there should be room for reaper and DataStax 
inquiries here.)


Sean Durity

From: Joaquin Casares [mailto:joaq...@thelastpickle.com]
Sent: Tuesday, April 24, 2018 9:01 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra reaper

Sure thing Abdul,

That's great to hear! Unfortunately, the JMX authentication needs to be in the 
config file currently. And even if the JMX authentication was stored within 
Cassandra, we would still need to store connection details within the yaml and 
storing the JMX authentication credentials within Cassandra may not be ideal 
from a security standpoint.

The UI keeps logs of all the previous repairs, to the best of my knowledge. If 
you want to completely uninstall Reaper, you can perform a DROP KEYSPACE 
reaper_db; from within cqlsh, but that would remove all schedules as well.

Cheers,

Joaquin

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting
http://www.thelastpickle.com

On Tue, Apr 24, 2018 at 7:49 PM, Abdul Patel 
> wrote:
Thanks Joaquin,

Yes, I used the same and it worked fine.. the only thing is I had to add the 
userid and password.. which is somewhat annoying to keep in the config file.. 
can I get rid of it and still store it in the reaper_db keyspace?
Also, how do I clean reaper_db by deleting completed reaper information from the 
GUI? Or is any other cleanup required?


On Tuesday, April 24, 2018, Joaquin Casares 
> wrote:
Hello Abdul,

Depending on what you want your backend to be stored on, you'll want to use a 
different file.

So if you want your Reaper state to be stored within a Cassandra cluster, which 
I would recommend, use this file as your base file:

https://github.com/thelastpickle/cassandra-reaper/blob/master/src/packaging/resource/cassandra-reaper-cassandra.yaml

Make a copy of the yaml and include your system-specific settings. Then symlink 
it to the following location:

/etc/cassandra-reaper/cassandra-reaper.yaml

For completeness, this file is an example of how to use a Postgres server to 
store the Reaper state:

https://github.com/thelastpickle/cassandra-reaper/blob/master/src/packaging/resource/cassandra-reaper.yaml

Hope that helped!

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting
http://www.thelastpickle.com

On Tue, Apr 24, 2018 at 7:07 PM, Abdul Patel 
> wrote:
Thanks

But the difference here is that cassandra-reaper-cassandra.yaml has more 
parameters than cassandra-reaper.yaml.
Can I just use the one file with all the details, or does it look for one 
specific file?


On Tuesday, April 24, 2018, Joaquin Casares 
> wrote:
Hello Abdul,

You'll only want one:

The yaml file used by the service is located at 
/etc/cassandra-reaper/cassandra-reaper.yaml and alternate config templates can 
be found under /etc/cassandra-reaper/configs. It is recommended to create a new 
file with your specific configuration and symlink it as 
/etc/cassandra-reaper/cassandra-reaper.yaml to avoid your configuration from 
being overwritten during upgrades.

Adapt the config file to suit your setup and then run `sudo service 
cassandra-reaper start`.

Source: 
http://cassandra-reaper.io/docs/download/install/#service-configuration

Hope that helps!

Joaquin Casares
Consultant
Austin, TX

Apache Cassandra Consulting

RE: [EXTERNAL] Re: How to configure Cassandra to NOT use SSLv2?

2018-04-24 Thread Durity, Sean R
I think I would start with the JVM. Sometimes, for export purposes, the 
cryptography extensions (JCE), are in a separate jar or package from the 
standard JRE or JVM. I haven’t used the IBM JDK, so I don’t know specifically 
about that one.

Also, perhaps the error is correct – SSLv2Hello is not a parameter that can be 
passed to the JVM. Maybe remove that option?


Sean Durity

From: Lou DeGenaro [mailto:lou.degen...@gmail.com]
Sent: Tuesday, April 24, 2018 10:08 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: How to configure Cassandra to NOT use SSLv2?

Thanks for your suggestions.  I tried using the -D shown below:
degenaro@bluej421:/users/degenaro/cassandra/bluej421> ./bin/cassandra
degenaro@bluej421:/users/degenaro/cassandra/bluej421> numactl --interleave=all 
/share/ibm-jdk1.8/bin/java -Dhttps.protocols=TLSv1.2,TLSv1.1,SSLv2Hello 
-Xloggc:./bin/../logs/gc.log -XX:+UseParNewGC -XX:+UseConcMarkSweepGC 
-XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
-XX:CMSWaitDuration=1...
...
WARN  14:01:09 Filtering out [TLS_RSA_WITH_AES_128_CBC_SHA, 
TLS_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, 
TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, 
TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA] as it isn't supported by the socket
Exception (java.lang.IllegalArgumentException) encountered during startup: 
SSLv2Hello is not a recognized protocol.
java.lang.IllegalArgumentException: SSLv2Hello is not a recognized protocol.
at com.ibm.jsse2.S.a(S.java:112)
at com.ibm.jsse2.S.b(S.java:136)
at com.ibm.jsse2.S.(S.java:177)
at com.ibm.jsse2.as.setEnabledProtocols(as.java:2)
at 
org.apache.cassandra.security.SSLFactory.getServerSocket(SSLFactory.java:67)
at 
org.apache.cassandra.net.MessagingService.getServerSockets(MessagingService.java:514)
at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:498)
at 
org.apache.cassandra.net.MessagingService.listen(MessagingService.java:482)
at 
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:765)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:654)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:534)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:344)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:568)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:696)
ERROR 14:01:09 Exception encountered during startup
java.lang.IllegalArgumentException: SSLv2Hello is not a recognized protocol.

Who is at fault: user, Cassandra, JVM, OS?
Thanks.
Lou.





On Tue, Apr 24, 2018 at 9:43 AM, Marcus Haarmann 
> wrote:
Hi,

I did take a look into the source code of 3.11, but I believe the code is more 
or less the same.
The SSL code makes use of Java SSL Sockets so you can limit the protocols in 
the "Java way".
The java way (at least for a recent Java 8) is to setup the protocols in the 
/lib/security/java.security file.
Or to define a system property on the command line (-Dhttps.protocols = 
TLSv1.2,TLSv1.1,SSLv2Hello).

There are multiple options for SSL configuration in the config
(https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/secureSSLNodeToNode.html)
The most interesting one in your situation would be the cipher_suites option, 
which allows you
to limit the avaliable cipher suites e.g. to 
TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384
(which is a TLS1.2-only cipher suite).

You can check the offered protocols for your server with an open source tool 
like sslyze 
(https://github.com/nabla-c0d3/sslyze)

Marcus Haarmann


Von: "Lou DeGenaro" >
An: "user" >
Gesendet: Dienstag, 24. April 2018 11:21:06
Betreff: Re: How to configure Cassandra to NOT use SSLv2?

Can someone please tell me how to prevent Cassandra 3.0.9 from using SSLv2? 
Happy to use a newer version of Cassandra if that's what's required.

On Sat, Apr 21, 2018 at 8:30 AM, Lou DeGenaro 
> wrote:
3.0.9

On Fri, Apr 20, 2018 at 10:26 

RE: [EXTERNAL] Re: Cassandra downgrade version

2018-04-19 Thread Durity, Sean R
This answer surprises me, because I would expect NOT to be able to downgrade if 
there are any changes in the sstable structure. I assume:

-  Upgrade is done while the application is up and writing data (so any 
new data is written in the new format)

-  Any compactions that happen to run post-upgrade are written in the 
new format

-  A restore to the time just before upgrade would lose all new data 
and would take time to move sstables back into place and restart all nodes – 
requiring an outage. The data loss and outage time are usually unacceptable.
Therefore, I generally tell my development teams and change controls that a 
backout is only a desperate, last-ditch effort. The upgrade goes forward only. 
So, Lerh’s comment about testing in lower life cycles is critically correct.

So, the specific question here is whether there is any sstable format change 
between 3.1.0 and 3.11.2. I don’t know if there is.


Sean Durity

From: Lerh Chuan Low [mailto:l...@instaclustr.com]
Sent: Monday, April 16, 2018 6:52 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra downgrade version

You should just be able to install 3.1.0 again if you need to as they are in 
the 3.X line. To be really safe you can also take a snapshot and backup your 
existing SSTables first..and always remember to test before upgrading in 
Production :)

On 17 April 2018 at 07:48, Abdul Patel 
> wrote:
Hi All,

I am planning to upgrade my Cassandra cluster from 3.1.0 to 3.11.2. Just in case 
something goes wrong, do we have any rollback or downgrade option in Cassandra 
to go back to the older/previous version?

Thanks







RE: [EXTERNAL] Cassandra vs MySQL

2018-03-20 Thread Durity, Sean R
I’m not sure there is a fair comparison. MySQL and Cassandra have different 
ways of solving related (but not necessarily the same) problems of storing and 
retrieving data.

The data model between MySQL and Cassandra is likely to be very different. The 
key for Cassandra is that you need to model for the queries that will be 
executed. If you cannot know the queries ahead of time, Cassandra is not the 
best choice. If table scans are typically required, Cassandra is not a good 
choice. If you need more than a few hundred tables in a cluster, Cassandra is 
not a good choice.

If multi-datacenter replication is required, Cassandra is an awesome choice. If 
you are going to always query by a partition key (or primary key), Cassandra is 
a great choice. The nice thing is that the performance scales linearly, so 
additional data is fine (as long as you add nodes) – again, if your data model 
is designed for Cassandra. If you like no-downtime upgrades and extreme 
reliability and availability, Cassandra is a great choice.
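To make the "model for the queries" point concrete, here is a hypothetical sketch (the keyspace, table and query are made up for illustration, not from this thread): if the application always asks "give me this customer's orders, newest first," the customer id becomes the partition key and the order timestamp the clustering column.

    # hypothetical example; assumes a keyspace named 'shop' already exists
    cqlsh -e "
      CREATE TABLE IF NOT EXISTS shop.orders_by_customer (
          customer_id uuid,
          order_ts    timestamp,
          order_id    uuid,
          total       decimal,
          PRIMARY KEY ((customer_id), order_ts)
      ) WITH CLUSTERING ORDER BY (order_ts DESC);"
    # the one query this table is built to answer:
    cqlsh -e "SELECT order_id, total FROM shop.orders_by_customer
              WHERE customer_id = 11111111-2222-3333-4444-555555555555 LIMIT 20;"

Every read hits a single partition, which is where the linear scaling comes from.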

Personally, I hope to never have to use/support MySQL again, and I love working 
with Cassandra. But, Cassandra is not the choice for all data problems.


Sean Durity

From: Oliver Ruebenacker [mailto:cur...@gmail.com]
Sent: Monday, March 12, 2018 3:58 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra vs MySQL


 Hello,
  We have a project currently using MySQL single-node with 5-6TB of data and 
some performance issues, and we plan to add data up to a total size of maybe 
25-30TB.
  We are thinking of migrating to Cassandra. I have been trying to find 
benchmarks or other guidelines to compare MySQL and Cassandra, but most of them 
seem to be five years old or older.
  Is there some good more recent material?
  Thanks!
 Best, Oliver

--
Oliver Ruebenacker
Senior Software Engineer, Diabetes 
Portal,
 Broad 
Institute






RE: [EXTERNAL] RE: What versions should the documentation support now?

2018-03-14 Thread Durity, Sean R
The DataStax documentation is far superior to the Apache Cassandra attempts. 
Apache is just poor with holes all over, goofy examples, etc. It would take a 
team of people working full time to try and catch up with DataStax. I have met 
the DataStax team; they are doing good work. I think it would be far more 
effective to support/encourage the DataStax documentation efforts. I think they 
accept corrections/suggestions – perhaps publish that email address…

What is missing most from DataStax (and most software) is the discussions of 
why/when you would change a particular parameter and what should change if the 
parameter changes. If DataStax created a community comments section (somewhat 
similar to what MySQL tried), that would be something worth contributing to. I 
love good docs (like DataStax); Apache Cassandra is hopelessly behind.

And, yes, the good documentation from DataStax was a strong reason why our 
company pursued Cassandra as a data technology. It was better than almost any 
other open source project we knew.

(Please, let’s refrain from the high pri emails to the user group list…)


Sean Durity

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID]
Sent: Wednesday, March 14, 2018 3:02 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] RE: What versions should the documentation support now?
Importance: High

This went nowhere quick.  Come on everyone.  The website has to support users 
who are on “supported” versions of the software.  That’s more than one version. 
 There was a JIRA on this months ago.  You are smart people. I just gave a 
perfect answer and ended up burning a bunch of time for nothing.  Now it's back 
on you.  Are you going to properly support the software you create or not!

Kenneth Brotman

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID]
Sent: Tuesday, March 13, 2018 11:03 PM
To: user@cassandra.apache.org
Subject: RE: What versions should the documentation support now?

I made sub directories “2_x” and “3_x” under docs and put a copy of the doc in 
each.  No links were changed yet.  We can work on the files first and discuss 
how we want to change the template and links.  I did the pull request already.

Kenneth Brotman

From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: Tuesday, March 13, 2018 6:19 PM
To: user@cassandra.apache.org
Subject: Re: What versions should the documentation support now?

Yes, I agree, we should host versioned docs.  I don't think anyone is against 
it, it's a matter of someone having the time to do it.

On Tue, Mar 13, 2018 at 6:14 PM kurt greaves 
> wrote:
I’ve never heard of anyone shipping docs for multiple versions, I don’t know 
why we’d do that.  You can get the docs for any version you need by downloading 
C*, the docs are included.  I’m a firm -1 on changing that process.
We should still host versioned docs on the website however. Either that or we 
specify "since version x" for each component in the docs with notes on 
behaviour.
​





RE: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Durity, Sean R
You cannot migrate and upgrade at the same time across major versions. 
Streaming is (usually) not compatible between versions.

As to the migration question, I would expect that you may need to put the 
external-facing ip addresses in several places in the cassandra.yaml file. And, 
yes, it would require a restart. Why is a non-restart more desirable? Most 
Cassandra changes require a restart, but you can do a rolling restart and not 
impact your application. This is fairly normal admin work and can/should be 
automated.

How large is the cluster to migrate (# of nodes and size of data). The 
preferred method might depend on how much data needs to move. Is any 
application outage acceptable?

Sean Durity
lord of the (C*) rings (Staff Systems Engineer – Cassandra)
From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Sunday, March 11, 2018 10:20 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] RE: Adding new DC?

Hi Kenneth,

Replies inline below.

On 12-Mar-2018 3:40 AM, "Kenneth Brotman" 
> wrote:
Hi Kunal,

That version of Cassandra is from well before my time, so I'll let others answer. I 
was wondering why you wouldn't want to end up on 3.0.x if you're going through all 
the trouble of migrating anyway?


Application side constraints - some data types are different between 2.1.x and 
3.x (for example, date vs. timestamp).

Besides, this is a production setup, so we cannot take risks.
Are both data centers in the same region on AWS?  Can you provide yaml file for 
us to see?


No, they are in different regions - GCE setup is in us-east while AWS setup is 
in Asia-south (Mumbai)

Thanks,
Kunal
Kenneth Brotman

From: Kunal Gangakhedkar 
[mailto:kgangakhed...@gmail.com]
Sent: Sunday, March 11, 2018 2:32 PM
To: user@cassandra.apache.org
Subject: Adding new DC?

Hi all,

We currently have a cluster in GCE for one of the customers.
They want it to be migrated to AWS.

I have setup one node in AWS to join into the cluster by following:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html

Will add more nodes once the first one joins successfully.

The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 
7199, 9042 in GCE firewall.

The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, 
rack=RAC1 while on AWS, I changed the DC to dc=DC2.

When I start cassandra service on the AWS instance, I see the version handshake 
msgs in the logs trying to connect to the public IPs of the GCE nodes:
OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx
However, nodetool status output on both sides don't show the other side at all. 
That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup 
doesn't show old DC (dc=DC1).

In cassandra.yaml file, I'm only using listen_interface and rpc_interface 
settings - no explicit IP addresses used - so, ends up using the internal 
private IP ranges.

Do I need to explicitly add the broadcast_address? for both side?
Would that require restarting of cassandra service on GCE side? Or is it 
possible to change that setting on-the-fly without a restart?

I would prefer a non-restart option.

PS: The cassandra version running in GCE is 2.1.18 while the new node setup in 
AWS is running 2.1.20 - just in case if that's relevant

Thanks,
Kunal






RE: [EXTERNAL] Re: Version Rollback

2018-02-28 Thread Durity, Sean R
My short answer is always – there are no rollbacks, we only go forward.  Jeff’s 
answer is much more complete and technically precise. You *could* rollback a 
few nodes (depending on topology) by just replacing them as if they had died.

I always upgrade all nodes (the binaries) as quickly as possible (but, one node 
at a time). The application stays up, stays happy, and my customers love 
“always up” Cassandra. I have clusters where we have done 3 or more major 
upgrades with 0 downtime for the application. One of the best things about 
supporting Cassandra! One node at a time upgrades can also be automated (which 
we have done).

After upgrading binaries on all nodes, I execute upgradesstables on groups of 
nodes (depending on load, hardware, cluster size, etc.). Reasoning: You cannot 
do any streaming operations (bootstrap, repairs) in a mixed-version cluster 
(except for maybe very minor version upgrades).


Sean Durity
From: shalom sagges [mailto:shalomsag...@gmail.com]
Sent: Wednesday, February 28, 2018 3:54 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Version Rollback

These are really good directions. Thanks a lot everyone!
@Kenneth - The cluster is comprised of 44 nodes, version 2.0.14, ~2.5TB of data 
per node. It's gonna be a major version upgrade (or upgrades to be exact... 
version 3.x is the target).

@Jeff, I have a passive DC. What if I upgrade the passive DC and if all goes 
well, move the applications to work with the passive DC and then upgrade the 
active DC. Is this doable?
Also, Would you suggest to upgrade one node (binaries), upgrade the SSTables 
and move to the second node, and then third etc, or first upgrade binaries to 
all nodes, and only then start with the SSTables upgrade?
Thanks!


On Tue, Feb 27, 2018 at 7:47 PM, Jeff Jirsa 
> wrote:
MOST minor versions support rollback - the exceptions are those where internode 
protocol changes (3.0.14 being the only one in recent memory), or where sstable 
format changes (again rare). No major versions support rollback - the only way 
to do it is to upgrade in a way that you can effectively reinstall the old 
version without data loss.

The steps usually look like:

Test in a lab
Test in a lab again
Test in a lab a few more times
Snapshot everything

If you have a passive data center:
- upgrade one instance
- check to see if it’s happy
- upgrade another
- check to see if it’s happy
- continue until the passive dc is done
- if at any point they’re unhappy rebuild (wipe and restream the old version) 
the dc from the active dc

On the active DCs, you’ll want to canary it one replica at a time so you can 
treat a failed upgrade like a bad disk:
- upgrade one instance
- check if it’s happy; if it’s not treat it like a failed disk and replace it 
with the old version
- if you’re using single token, do another instance in a different replica set, 
repeat until you’re out of different replicas.
- if you’re using vnodes but a rack aware snitch and have more racks than your 
RF, do another instance in the same rack as the canary, repeat until you’re out 
of instances in that rack

This is typically your point of no return - as soon as you have two replicas in 
the new version there’s no more rollback practical.


--
Jeff Jirsa


On Feb 27, 2018, at 9:22 AM, Carl Mueller 
> wrote:
My speculation is that IF (bigif) the sstable formats are compatible between 
the versions, which probably isn't the case for major versions, then you could 
drop back.

If the sstables changed format, then you'll probably need to figure out how to 
rewrite the sstables in the older format and then sstableloader them in the 
older-version cluster if need be. Alas, while there is an sstable upgrader, 
there isn't a downgrader AFAIK.

And I don't have an intimate view of version-by-version sstable format changes 
and compatibilities. You'd probably need to check the upgrade instructions 
(which you presumably did if you're upgrading versions) to tell.

Basically, version rollback is pretty unlikely to be done.

The OTHER option:

1) build a new cluster with the new version, no new data.

2) code your driver interfaces to interface with both clusters. Write to both, 
but read preferentially from the new, then fall through to the old. Yes, that 
gets hairy on multiple row queries. Port your data with sstable loading from 
the old to the new gradually.

When you've done a full load of all the data from old to new, and you're 
satisfied with the new cluster stability, retire the old cluster.

For merging two multirow sets you'll probably need your multirow queries to 
return the partition hash value (or extract the code that generates the hash), 
and have two simulaneous java-driver ResultSets going, and merge their results, 
providing the illusion of a single database query. You'll need to pay attention 
to both the row key ordering and column key ordering 

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Durity, Sean R
It is instructive to listen to the concerns of new and existing users in order 
to improve a product like Cassandra, but I think the school yard taunt model 
isn’t the most effective.

In my experience with open and closed source databases, there are always things 
that could be improved. Many have a historical base in how the product evolved 
over time. A newcomer sees those as rough edges right away. In other cases, the 
database creators have often widened their scope to try and solve every data 
problem. This creates the complexity of too many configuration options, etc. 
Even the best RDBMS (Informix!) battled these kinds of issues.

Cassandra, though, introduced another angle of difficulty. In trying to relate 
to RDBMS users (pun intended), it often borrowed terminology to make it seem 
familiar. But they don’t work the same way or even solve the same problems. The 
classic example is secondary indexes. For RDBMS, they are very useful; for 
Cassandra, they are anathema (except for very narrow cases).

However, I think the shots at Cassandra are generally unfair. When I started 
working with it, the DataStax documentation was some of the best documentation 
I had seen on any project, especially an open source one. (If anything the 
cooling off between Apache Cassandra and DataStax may be the most serious 
misstep so far…) The more I learned about how Cassandra worked, the more I 
marveled at the clever combination of intricate solutions (gossip, merkle 
trees, compaction strategies, etc.) to solve specific data problems. This is a 
great product! It has given me lots of sleep-filled nights over the last 4+ 
years. My customers love it, once I explain what it should be used for (and 
what it shouldn’t). I applaud the contributors, whether coders or users. Thank 
you!

Finally, a note on backup. Backing up a distributed system is tough, but 
restores are even more complex (if you want no down-time, no extra disk space, 
point-in-time recovery, etc). If you want to investigate why it is a tough 
problem for Cassandra, go look at RecoverX from Datos IO. They have solved many 
of the problems, but it isn’t an easy task. You could ask people to try and 
recreate all that, or just point them to a working solution. If backup and 
recovery is required (and I would argue it isn’t always required), it is 
probably worth paying for.


Sean Durity
From: Josh McKenzie [mailto:jmcken...@apache.org]
Sent: Wednesday, February 21, 2018 11:28 AM
To: d...@cassandra.apache.org
Cc: User 
Subject: [EXTERNAL] Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad, and here's 
what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a 
patch to move the needle on *any* of these things being complained about in 
this thread.

For the Apache 
Way
 to work, people need to step up and meaningfully contribute to a project to 
scratch their own itch instead of just waiting for a random 
corporation-subsidized engineer to happen to have interests that align with 
them and contribute that to the project.

Beating a dead horse for things everyone on the project knows are serious pain 
points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin 
> wrote:
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff.  I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release.  That makes perfect sense.- for every release, especially the
> major ones.  Continuous improvement is not a phase of development for
> example.  CI should be in everything, in every phase.  Stability and
> testing a part of every release not just one.  A major release should be a
> nice step from the previous major release though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair which was introduced as the
default option in 2.2 and now is not recommended to use because of so many
corner cases where it can fail.  So again experimental as an afterthought.

Not to 

RE: [EXTERNAL] Re: Even after the drop table, the data actually was not erased.

2018-01-17 Thread Durity, Sean R
We have found it very useful to set up an infrastructure where we can execute a 
nodetool command (or any other arbitrary command) from a single (non-Cassandra) 
host that will get executed on each node across the cluster (or a list of 
nodes).
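A minimal sketch of that kind of wrapper, assuming password-less SSH from the admin host and a plain text file listing the nodes (the file name is an assumption):

    #!/bin/bash
    # run-on-cluster.sh -- run an arbitrary command on every node, one at a time
    # usage: ./run-on-cluster.sh "nodetool clearsnapshot"
    while read -r host; do
        echo "=== ${host} ==="
        ssh -o BatchMode=yes "${host}" "$1"
    done < cassandra_hosts.txt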


Sean Durity

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Monday, January 15, 2018 1:19 PM
To: user cassandra.apache.org 
Subject: [EXTERNAL] Re: Even after the drop table, the data actually was not 
erased.

As you said, the auto_bootstrap setting was turned on.

Well I was talking about the 'auto_snapshot' ;-). I understand that's what you 
meant to say.

This command seems to apply only to one node. Can it be applied cluster-wide? 
Or should I run this command on each node?

Indeed, 'nodetool clearsnapshot' is only for the node where you run the 
command, like most of the nodetool commands (repair is a bit specific).

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-01-15 1:56 GMT+00:00 Eunsu Kim 
>:
Thank you for your response.  As you said, the auto_bootstrap setting was 
turned on.
The actual data was deleted with the 'nodetool clearsnapshot' command.
This command seems to apply only to one node. Can it be applied cluster-wide? 
Or should I run this command on each node?




On 12 Jan 2018, at 8:10 PM, Alain RODRIGUEZ 
> wrote:

Hello,

However, the actual size of the data directory did not decrease at all. Disk 
Load monitored by JMX has been decreased.

This sounds like 'auto_snapshot' is enabled. This option will trigger a 
snapshot before any table drop / truncate to prevent user mistakes mostly. Then 
the data is removed but as it is still referenced by the snapshot (hard link), 
space cannot be freed.

Running 'nodetool clearsnapshot' should help reducing the dataset size in this 
situation.
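For example, on each node (a sketch; note that newer releases want an explicit tag or --all for clearsnapshot, so check nodetool help on your version first):

    nodetool listsnapshots     # see which snapshots are holding the disk space
    nodetool clearsnapshot     # drop all snapshots on this node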


The client fails to establish a connection and I see the following exceptions 
in the Cassandra logs.
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
cfId…

This does not look like a failed connection to me but rather a try to query 
some inexistent data. If that's the data you just deleted (keyspace / table), 
this is expected. If not there is an other issue, I hope not related to the 
delete in this case...

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2018-01-12 7:14 GMT+00:00 Eunsu Kim 
>:
hi everyone

On the development server, I dropped all the tables and even dropped the keyspace 
in order to change the table schema.
Then I created the keyspace and the table.

However, the actual size of the data directory did not decrease at all. Disk 
Load monitored by JMX has been decreased.




After that, Cassandra does not work normally.

The client fails to establish a connection and I see the following exceptions 
in the Cassandra logs.

org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
cfId…….org.apache.cassandra.io.FSReadError:
 java.io.IOException: Digest mismatch exception……


After the data is forcibly deleted, Cassandra is restarted in a clean state and 
works well.

Can anyone guess why this is happening?

Thank you in advance.







RE: [EXTERNAL] Cassandra cluster add new node slowly

2018-01-03 Thread Durity, Sean R
You don't mention the version, but here are some general suggestions


-  2 GB heap is very small for a node, especially with 1 TB+ of data. 
What is the physical RAM on the host? In general, you want ½ of physical RAM 
for the JVM. (Look in jvm.options or cassandra-env.sh)

-  You can change the streaming throughput from the existing nodes, if 
it looks like the new node can handle it. Look at nodetool setstreamthroughput. 
Default is 200 (MB/sec).

-  You might want to check for a streaming_socket_timeout_in_ms. This 
has changed over the versions. Some details are at: 
https://issues.apache.org/jira/browse/CASSANDRA-11839. 24 hours is good 
recommendation.

-  If your new node can't compact fast enough to keep disk usage down, 
look at compactionthroughput on that node

-  nodetool netstats | grep -v "100%" is a good way to see what is 
happening/if anything is stuck. Newer versions give a bit more info on progress.

-  Don't forget to run cleanup on existing nodes after the new nodes are added (see the combined sketch below).
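Putting a few of those knobs together while the new node is joining, a rough sketch (the numbers are examples only and need to be tuned to your hardware):

    # on the existing nodes: raise streaming throughput if they can take it (MB/s)
    nodetool setstreamthroughput 400
    # on the joining node: let compaction keep up with the incoming sstables (MB/s)
    nodetool setcompactionthroughput 64
    # watch progress, hiding files that are already done
    nodetool netstats | grep -v "100%"
    # once ALL new nodes have joined, on each pre-existing node:
    nodetool cleanup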



Sean Durity
From: qf zhou [mailto:zhouqf2...@gmail.com]
Sent: Tuesday, January 02, 2018 10:30 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra cluster add new node slowly

The cluster has  3 nodes,  and  the data in each node is  about 1.2 T.  I want 
to add two new nodes to expand the cluster.

Following the instructions from the datastax  website, ie,  
(http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/operations/opsAddNodeToCluster.html),

I tried to add one node to the cluster. However, it is too slow and takes too much 
time. After about 24 hours, it still had not succeeded.

I ran nodetool netstats on the new node, and it shows:

(tb1fullwithstate2 is a big table and 90% of the cluster data is in it. 
Here I use TimeWindowCompactionStrategy as the compaction strategy.)

/*.*.*.3
Receiving 136 files, 328573794609 bytes total. Already received 8 
files, 5774621188 bytes total
tb1/tb1fullwithneweststatetest 3758271/3758271 bytes(100%) received 
from idx:0/*.*.*.3
system_distributed/repair_history 57534/57534 bytes(100%) received 
from idx:0/*.*.*.3
system_distributed/parent_repair_history 507660/507660 bytes(100%) 
received from idx:0/*.*.*.3
tb1/tb1_device_last_state_eachday 15754096/15754096 bytes(100%) 
received from idx:0/*.*.*.3
mytest1/tb1_test1 8143775/8143775 bytes(100%) received from 
idx:0/*.*.*.3
tb1/tb1fullwithstate 2251191007/2251191007 bytes(100%) received 
from idx:0/*.*.*.3
applocationinfo/weiyirong_app 2760/2760 bytes(100%) received from 
idx:0/*.*.*.3
tb1/tb1fullwithstate2 3490748006/4909554503 bytes(71%) received 
from idx:0/*.*.*.3
tb1/tb1fullwithneweststate 4458079/4458079 bytes(100%) received 
from idx:0/*.*.*.3
/*.*.*.2
Receiving 136 files, 336762487360 bytes total. Already received 3 
files, 5695770181 bytes total
system_distributed/repair_history 31684/31684 bytes(100%) received 
from idx:0/*.*.*.2
tb1/tb1fullwithstate 908260516/908260516 bytes(100%) received from 
idx:0/*.*.*.2
tb1/tb1fullwithstate2 4783622958/4990450588 bytes(95%) received 
from idx:0/*.*.*.2
tb1/tb1fullwithneweststate 3855023/3855023 bytes(100%) received 
from idx:0/*.*.*.2
/*.*.*.4
Receiving 132 files, 236250553620 bytes total. Already received 10 
files, 3117465128 bytes total
mytest1/wordstest2 46/46 bytes(100%) received from idx:0/*.*.*.4
tb1/tb1fullwithneweststatetest 3416891/3416891 bytes(100%) received 
from idx:0/*.*.*.4
system_distributed/repair_history 39720/39720 bytes(100%) received 
from idx:0/*.*.*.4
system_distributed/parent_repair_history 452250/452250 bytes(100%) 
received from idx:0/*.*.*.4
mytest1/weblogs 104/104 bytes(100%) received from idx:0/*.*.*.4
tb1/tb1_device_last_state_eachday 12670998/12670998 bytes(100%) 
received from idx:0/*.*.*.4
mytest1/tb1_test1 3257952/3257952 bytes(100%) received from 
idx:0/*.*.*.4
tb1/tb1fullwithstate 647702056/647702056 bytes(100%) received from 
idx:0/*.*.*.4
applocationinfo/weiyirong_app 3509/3509 bytes(100%) received from 
idx:0/*.*.*.4
tb1/tb1fullwithstate2 2446436305/3566562762 bytes(68%) received 
from idx:0/*.*.*.4
tb1/tb1fullwithneweststate 3485297/3485297 bytes(100%) received 
from idx:0/*.*.*.4



check  the log in the  logs/system.log,  it shows that:


INFO  06:09:33 Updating topology for /*.*.*.2
INFO  06:09:33 Updating topology for 

RE: [EXTERNAL] 3.0.15 or 3.11.1

2018-01-02 Thread Durity, Sean R
It might help if you let us know about which 3.11 features you are interested. 
As I hear it, some of the features may not be PR ready (like materialized 
views). In my opinion, it seems that 3.0.15 is the more stable way to go. 
However, I have not been testing 3.11, so my thoughts are more based on what 
others have experienced, not my own experience.


Sean Durity

From: shalom sagges [mailto:shalomsag...@gmail.com]
Sent: Tuesday, January 02, 2018 3:15 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] 3.0.15 or 3.11.1

Hi All,
I want to upgrade from 2.x to 3.x.
I can definitely use the features in 3.11.1 but it's not a must.
So my question is, is 3.11.1 stable and suitable for Production compared to 
3.0.15?
Thanks!





RE: [EXTERNAL] Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-28 Thread Durity, Sean R
DataStax Enterprise (pay to license) has embedded SOLR search with Cassandra if 
you don’t want to move the data to another cluster for indexing/searching. 
Similar to Cassandra modeling, you will need to understand the exact search 
queries in order to build the SOLR schema to support them.

The basic answer, though, is that Cassandra itself is not built for handling 
search queries like the ones you mention.


Sean Durity

From: Bradford Stephens [mailto:bradfordsteph...@gmail.com]
Sent: Friday, December 08, 2017 11:31 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Reg:- Data modelling For E-Commerce Pattern data 
modelling for Search

Hi -- you want to use Elasticsearch with a Cassandra store for the blob data.

On Thu, Dec 7, 2017 at 7:39 PM, @Nandan@ 
> wrote:
Hi Peoples,

Currently around 60-70% of websites worldwide run e-commerce workloads, in which we 
have to store huge amounts of data and query it with partial search, text match, 
full-text search and so on.

So below questions comes to mind :
1) Is Cassandra the right choice for a data model that supports complex search 
patterns like the ones Amazon or eBay use?
2) If we use a denormalized data model, will it be effective?

Please clarify this.

Thanks and Best regards,
Nandan Priyadarshi



--
Bradford Stephens
roboticprofit.com
Data for Driving Revenue





RE: [EXTERNAL] Add nodes change

2017-12-28 Thread Durity, Sean R
--> See inline

Hello All,

We are going to add 2 new nodes to our production environment, and there are 2 questions 
we would like some advice on.

1. In current production env, the cassandra version is 3.0.4, is it ok if we 
use 3.0.15 for the new node?

--> I would not do this. Streaming between versions usually doesn't work. 
Either upgrade the existing node before adding new ones OR do the upgrade after 
adding 3.0.4 nodes.

2. The current RF is 1 and we would like to change it to 2. How does this change affect 
our production data? Do we need to do any data migration?

--> Once you make the RF change, you will need to run repair on all nodes to 
have the existing data moved to the proper replicas. (New writes will be fine.) 
Until that is complete, query results may not be "correct." Depending on how 
critical the data and application are, it might be better to schedule an outage 
until the repairs are complete.
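A sketch of that sequence (the keyspace and DC names are placeholders; adjust the replication class and options to your topology):

    # 1. raise the replication factor
    cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2};"
    # 2. then, on EVERY node, stream the existing data onto its new replica
    nodetool repair -full my_keyspace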

Thanks a lot
Shijie

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org





-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



RE: [EXTERNAL] Lots of simultaneous connections?

2017-12-28 Thread Durity, Sean R
Have you determined if a specific query is the one getting timed out? It is 
possible that the query/data model does not scale well, especially if you are 
trying to do something like a full table scan.

It is also possible that your OS settings will limit the number of connections 
to the host. Do you see any timewait connections in netstat? I would agree that 
5,000 connections per host seems on the high side. Each one requires resources, 
like memory, so reducing connections is a good idea.
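To see what the nodes themselves think, a quick per-node check (assuming the default native transport port 9042; if I recall correctly there is also a connectedNativeClients JMX gauge under org.apache.cassandra.metrics, type=Client):

    # count client connections to the native transport port, grouped by TCP state
    netstat -an | grep ':9042' | awk '{print $6}' | sort | uniq -c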


Sean Durity

-Original Message-
From: Max Campos [mailto:mc_cassan...@core43.com]
Sent: Thursday, December 14, 2017 3:18 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Lots of simultaneous connections?

Hi -

We’re finally putting our new application under load, and we’re starting to get 
this error message from the Python driver when under heavy load:

('Unable to connect to any servers', {‘x.y.z.205': 
OperationTimedOut('errors=None, last_host=None',), ‘x.y.z.204': 
OperationTimedOut('errors=None, last_host=None',), ‘x.y.z.206': 
OperationTimedOut('errors=None, last_host=None',)})' (22.7s)

Our cluster is running 3.0.6, has 3 nodes and we use RF=3, CL=QUORUM 
reads/writes.  We have a few thousand machines which are each making 1-10 
connections to C* at once, but each of these connections only reads/writes a 
few records, waits several minutes, and then writes a few records — so while 
netstat reports ~5K connections per node, they’re generally idle.  Peak 
read/sec today was ~1500 per node, peak writes/sec was ~300 per node.  
Read/write latencies peaked at 2.5ms.

Some questions:
1) Is anyone else out there making this many simultaneous connections?  Any 
idea what a reasonable number of connections is, what is too many, etc?

2) Any thoughts on which JMX metrics I should look at to better understand what 
exactly is exploding?  Is there a “number of active connections” metric?  We 
currently look at:
- client reads/writes per sec
- read/write latency
- compaction tasks
- repair tasks
- disk used by node
- disk used by table
- avg partition size per table

3) Any other advice?

I think I’ll try doing an explicit disconnect during the waiting period of our 
application’s execution; so as to get the C* connection count down.  Hopefully 
that will solve the timeout problem.

Thanks for your help.

- Max
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org






RE: [EXTERNAL] Bring 2 nodes down

2017-12-28 Thread Durity, Sean R
Decommission the two nodes, one at a time (assumes you have enough disk space 
on the remaining hosts). That will move the data to the remaining nodes and 
keep RF=3. Then fix the host. Then add the hosts back into the cluster, one at 
a time. This is easier with vnodes. Finally, run clean-up on the 6 nodes that 
stayed up to recover disk space.
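A sketch of that sequence (the service name and data paths are assumptions for a package install):

    # on each of the two nodes coming down, ONE at a time:
    nodetool decommission            # streams its ranges to the remaining replicas
    sudo systemctl stop cassandra
    # ...fix the disk, then wipe the old state so the node bootstraps fresh:
    sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
    sudo systemctl start cassandra   # with auto_bootstrap on (the default)
    nodetool status                  # wait for UN before touching the next node
    # finally, on the 6 nodes that stayed up:
    nodetool cleanup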


Sean Durity
lord of the (C*) rings (Staff Systems Engineer – Cassandra)
MTC 2250
#cassandra - for the latest news and updates

From: Alaa Zubaidi (PDF) [mailto:alaa.zuba...@pdf.com]
Sent: Thursday, December 14, 2017 2:00 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Bring 2 nodes down

Hi,
I have a cluster of 8 Nodes, 4 physical machines 2 VMs each physical machine.
RF=3, and we have read/write with QUORUM consistency requirement.

One of the machines needs to be down for an hour or two to fix local disk.
What is the best way to do that with out losing data?

Regards
-- Alaa

This message may contain confidential and privileged information. If it has 
been sent to you in error, please reply to advise the sender of the error and 
then immediately permanently delete it and all attachments to it from your 
systems. If you are not the intended recipient, do not read, copy, disclose or 
otherwise use this message or any attachments to it. The sender disclaims any 
liability for such unauthorized use. PLEASE NOTE that all incoming e-mails sent 
to PDF e-mail accounts will be archived and may be scanned by us and/or by 
external service providers to detect and prevent threats to our systems, 
investigate illegal or inappropriate behavior, and/or eliminate unsolicited 
promotional e-mails (“spam”). If you have any concerns about this process, 
please contact us at legal.departm...@pdf.com.





RE: [EXTERNAL] Re: Any Cassandra Backup and Restore tool like Cassandra Reaper?

2017-12-27 Thread Durity, Sean R
Datos IO solves many of the problems inherent in Cassandra backups (primarily 
issues with acceptable restores). It is worth considering. Other groups in my 
company are happy with it.


Sean Durity

From: Lerh Chuan Low [mailto:l...@instaclustr.com]
Sent: Thursday, December 14, 2017 4:13 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Any Cassandra Backup and Restore tool like Cassandra 
Reaper?

Tablesnap assumes S3, and tableslurp can set up the stage for restoring by 
downloading the relevant SSTables (but then it's up to the operator to complete 
the restore from there). Restoring (especially point-in-time restore) isn't 
easy to handle so there aren't a lot available out there.

There's also Netflix's Priam 
https://github.com/Netflix/Priam
 but I think it's a little bit old and is meant to run alongside C* as an agent 
and be the agent for repairs, monitoring, backups and restores, configuring 
Cassandra YAMLs...

One other one I've heard of is 
https://github.com/tbarbugli/cassandra_snapshotter
 but there's no restore yet.

On 15 December 2017 at 07:52, Rutvij Bhatt 
> wrote:
There is tablesnap/tablechop/tableslurp - 
https://github.com/JeremyGrosser/tablesnap.


On Thu, Dec 14, 2017 at 3:49 PM Roger Brown 
> 
wrote:
I've found nothing affordable that works with vnodes. If you have money, you 
could use DataStax OpsCenter or Datos.io Recoverx.

I ended up creating a cron job to make snapshots along with 
incremental_backups: true in the cassandra.yaml. And I'm thinking of setting up 
a replication strategy so that one rack contains 1  replica of each keyspace 
and then using r1soft to image each of those servers to tape for offsite backup.
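A sketch of that kind of cron entry (the tag format, schedule and user are only examples; old snapshots still need to be cleared periodically or they will eat the disk):

    # /etc/cron.d/cassandra-snapshot -- daily snapshot at 02:15, tagged with the date
    15 2 * * * cassandra /usr/bin/nodetool snapshot -t daily_$(date +\%Y\%m\%d) > /dev/null 2>&1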


On Thu, Dec 14, 2017 at 1:30 PM Harika Vangapelli -T (hvangape - AKRAYA INC at 
Cisco) > wrote:
Any Cassandra Backup and Restore tool like Cassandra Reaper for Repairs?




Harika Vangapelli
Engineer - IT
hvang...@cisco.com
Tel:

Cisco Systems, Inc.



United States
cisco.com


Think before you print.

This email may contain confidential and privileged material for the sole use of 
the intended recipient. Any review, use, distribution or disclosure by others 
is strictly prohibited. If you are not the intended recipient (or authorized to 
receive for the recipient), please contact the sender by reply email and delete 
all copies of this message.
Please click 
here
 for Company Registration Information.








RE: [EXTERNAL] Re: Data Node Density

2017-12-27 Thread Durity, Sean R
You asked for experience; here’s mine.

I support one PR cluster where the hardware was built more for HBase than 
Cassandra. So the data capacity is large (4.5 TB/node). Administratively, it is 
the worst cluster to work on because any kind of repairs, streaming, 
replacement take forever. And when some nodes were hitting the disk capacity? 
Yikes!

So, I am hesitant to recommend anything over 3 TB/node for any application in 
our setting. I understand that the cost of disk storage (with a 35-50% 
compaction overhead, replication factor and more nodes) makes denser nodes 
more appealing, but I resist.


Sean Durity

From: Amit Agrawal [mailto:amit.ku.agra...@gmail.com]
Sent: Friday, December 15, 2017 9:38 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Data Node Density

Thanks Nicolas. I am aware of the official recommendations. However, in the last 
project we tried 5 TB per node and it worked fine.

So I am asking about real-world experiences.

Does anybody know of anyone who provides consultancy on open source Cassandra? 
DataStax only does it for the enterprise version!

On Fri, Dec 15, 2017 at 3:08 PM, Nicolas Guyomar 
> wrote:
Hi Amit,

This is way too much data per node, official recommendation are to try to stay 
below 2Tb per node, I have seen nodes up to 4Tb but then maintenance gets 
really complicated (backup, boostrap, streaming for repair etc etc)

Nicolas

On 15 December 2017 at 15:01, Amit Agrawal 
> wrote:
Hi,

We are trying to set up a 3-node cluster with 20 TB of HDD on each node.
It is a bare-metal setup with 44 cores on each node.

So in total: 60 TB, 66 cores, 3-node cluster.

The data velocity is very low, with low access rates.

Has anyone tried this configuration?

A bit urgent.

Regards,
-A









RE: [EXTERNAL] Re: Upgrade using rebuild

2017-12-27 Thread Durity, Sean R
The sstable formats/versions are different. Streaming uses those formats. 
Streaming doesn’t work across major versions (for sure), and I don’t even try 
it across minor versions.

To ensure Cassandra-happiness, follow the rule:
For streaming operations (adding nodes, rebuild, repairs, etc.) have everything 
on the same version.

Version upgrades can be automated relatively easily and complete in a 
reasonable period of time. Plan them accordingly and don’t do streaming until 
they are completed.


Sean Durity

From: Anshu Vajpayee [mailto:anshu.vajpa...@gmail.com]
Sent: Tuesday, December 19, 2017 5:17 AM
To: user@cassandra.apache.org
Cc: Hannu Kröger 
Subject: [EXTERNAL] Re: Upgrade using rebuild

Any specific reason why it doesn't work across major versions?

On Fri, Dec 15, 2017 at 12:05 AM, Jon Haddad 
> wrote:
Heh, hit send accidentally.

You generally can’t run rebuild to upgrade, because it’s a streaming operation. 
 Streaming isn’t supported between versions, although on 3.x it might work.



On Dec 14, 2017, at 11:01 AM, Jon Haddad 
> wrote:

no


On Dec 14, 2017, at 10:59 AM, Anshu Vajpayee 
> wrote:

Thanks! I am aware with these steps.

I am just thinking: is it possible to do the upgrade using nodetool rebuild, like 
we rebuild a new DC?

Has anyone tried an upgrade with nodetool rebuild?



On Thu, 14 Dec 2017 at 7:08 PM, Hannu Kröger 
> wrote:
If you want to do a version upgrade, you need to basically do follow node by 
node:

0) stop repairs
1) make sure your sstables are at the latest version (nodetool upgradesstables 
can do it)
2) stop cassandra
3) update cassandra software and update cassandra.yaml and cassandra-env.sh 
files
4) start cassandra

After all nodes are up, run “nodetool upgradesstables” on each node to update 
your sstables to the latest version.

Also please note that when you upgrade, you need to upgrade only between 
compatible versions.

E.g. 2.2.x -> 3.0.x  but not 1.2 to 3.11

Cheers,
Hannu

On 14 December 2017 at 12:33:49, Anshu Vajpayee 
(anshu.vajpa...@gmail.com) wrote:
Hi -

Is it possible to upgrade a  cluster ( DC wise) using nodetool rebuild ?



--
C*heers,
Anshu V


--
C*heers,
Anshu V







--
C*heers,
Anshu V







RE: Pending-range-calculator during bootstrapping

2017-09-22 Thread Durity, Sean R
I don't know a specific issue with these versions, but in general you do not 
want to do ANY streaming operations (bootstrap or repair) between Cassandra 
versions. I would get all the nodes (in all DCs) to the same version and then 
try the bootstrap.


Sean Durity

From: Peng Xiao [mailto:2535...@qq.com]
Sent: Friday, September 22, 2017 3:56 AM
To: user 
Subject: Pending-range-calculator during bootstrapping

Dear All,

When we are bootstrapping a new node, we are experiencing high CPU load which 
affects response time, and we noticed that most of the cost is in the 
pending-range calculator; this did not happen before.
We are using C* 2.1.13 in one DC and 2.1.18 in another DC.
Could anyone please advise on this?

Thanks,
Peng Xiao





RE: Massive deletes -> major compaction?

2017-09-22 Thread Durity, Sean R
Thanks for the pointer. I had never heard of this. While it seems that it could 
help, I think our rules for determining which records to keep are not 
supported. Also, this requires adding a new jar to production. Too risky at 
this point.


Sean Durity

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Thursday, September 21, 2017 2:59 PM
To: user <user@cassandra.apache.org>
Subject: Re: Massive deletes -> major compaction?

Have you considered the fantastic DeletingCompactionStrategy?  
https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy


On Sep 21, 2017, at 11:51 AM, Jeff Jirsa 
<jji...@gmail.com<mailto:jji...@gmail.com>> wrote:

The major compaction is most efficient but can temporarily double (nearly) disk 
usage - if you can afford that, go for it.

Alternatively you can do a user-defined compaction on each sstable in reverse 
generational order (oldest first) and as long as the data is minimally 
overlapping it’ll purge tombstones that way as well - takes longer but much 
less disk involved.
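
For anyone scripting that, here is a minimal sketch of ordering the SSTables
oldest-generation-first (the data directory path, keyspace and table are
placeholders; it assumes the standard SSTable file naming, where the generation
number is embedded in the -Data.db file name, and a nodetool recent enough to
accept --user-defined; on older versions such as 2.0/2.1 the same operation is
exposed as forceUserDefinedCompaction on the CompactionManager JMX bean):

import glob
import os
import re
import subprocess

# Placeholder data directory for the table being purged; adjust to your layout.
DATA_GLOB = "/var/lib/cassandra/data/my_keyspace/my_table*/*Data.db"

def generation(path):
    # SSTable names embed a generation number, e.g. ks-cf-jb-42-Data.db
    # or mc-42-big-Data.db; take the last dash-delimited number.
    numbers = re.findall(r"-(\d+)-", os.path.basename(path))
    return int(numbers[-1]) if numbers else 0

# Oldest generations first, as suggested above.
for sstable in sorted(glob.glob(DATA_GLOB), key=generation):
    print("requesting user-defined compaction of", sstable)
    # Assumes a nodetool that supports user-defined compactions; otherwise
    # invoke forceUserDefinedCompaction over JMX with the same file path.
    subprocess.run(["nodetool", "compact", "--user-defined", sstable], check=True)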


--
Jeff Jirsa


On Sep 21, 2017, at 11:27 AM, Durity, Sean R 
<sean_r_dur...@homedepot.com<mailto:sean_r_dur...@homedepot.com>> wrote:
Cassandra version 2.0.17 (yes, it’s old – waiting for new hardware/new OS to 
upgrade)

In a long-running system with billions of rows, TTL was not set. So a one-time 
purge is being planned to reduce disk usage. Records older than a certain date 
will be deleted. The table uses size-tiered compaction. Deletes are probably 
25-40% of the complete data set. To actually recover the disk space, would you 
recommend a major compaction after the gc_grace_seconds time? I expect 
compaction would then need to be scheduled regularly (ick)…

We also plan to re-insert the remaining data with a calculated TTL, which could 
also benefit from compaction.


Sean Durity









RE: Massive deletes -> major compaction?

2017-09-21 Thread Durity, Sean R
So, let me make sure my assumptions are correct (and let others learn as well):


-  A major compaction would read all sstables at once (ignoring the 
max_threshold), thus the potential for needing double the disk space (of course 
if it wrote 30% less, it wouldn’t be double…)

-  Major compaction would leave one massive sstable, that wouldn’t get 
automatically compacted for a long time

-  A user-defined compaction on 1 sstable would not evict any 
tombstoned data that is in any other sstable (like a newer one with the 
deletes…). It would only remove data if the tombstone is already in the same 
sstable.


Sean Durity

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Thursday, September 21, 2017 2:51 PM
To: user@cassandra.apache.org
Subject: Re: Massive deletes -> major compaction?

The major compaction is most efficient but can temporarily double (nearly) disk 
usage - if you can afford that, go for it.

Alternatively you can do a user-defined compaction on each sstable in reverse 
generational order (oldest first) and as long as the data is minimally 
overlapping it’ll purge tombstones that way as well - takes longer but much 
less disk involved.


--
Jeff Jirsa


On Sep 21, 2017, at 11:27 AM, Durity, Sean R 
<sean_r_dur...@homedepot.com<mailto:sean_r_dur...@homedepot.com>> wrote:
Cassandra version 2.0.17 (yes, it’s old – waiting for new hardware/new OS to 
upgrade)

In a long-running system with billions of rows, TTL was not set. So a one-time 
purge is being planned to reduce disk usage. Records older than a certain date 
will be deleted. The table uses size-tiered compaction. Deletes are probably 
25-40% of the complete data set. To actually recover the disk space, would you 
recommend a major compaction after the gc_grace_seconds time? I expect 
compaction would then need to be scheduled regularly (ick)…

We also plan to re-insert the remaining data with a calculated TTL, which could 
also benefit from compaction.


Sean Durity








Massive deletes -> major compaction?

2017-09-21 Thread Durity, Sean R
Cassandra version 2.0.17 (yes, it's old - waiting for new hardware/new OS to 
upgrade)

In a long-running system with billions of rows, TTL was not set. So a one-time 
purge is being planned to reduce disk usage. Records older than a certain date 
will be deleted. The table uses size-tiered compaction. Deletes are probably 
25-40% of the complete data set. To actually recover the disk space, would you 
recommend a major compaction after the gc_grace_seconds time? I expect 
compaction would then need to be scheduled regularly (ick)...

We also plan to re-insert the remaining data with a calculated TTL, which could 
also benefit from compaction.


Sean Durity





RE: Can I have multiple datacenter with different versions of Cassandra

2017-09-12 Thread Durity, Sean R
No – the general answer is that you cannot stream between major versions of 
Cassandra. I would upgrade the existing ring, then add the new DC.


Sean Durity

From: Chuck Reynolds [mailto:creyno...@ancestry.com]
Sent: Thursday, May 18, 2017 11:20 AM
To: user@cassandra.apache.org
Subject: Can I have multiple datacenter with different versions of Cassandra

I have a need to create another datacenter and upgrade my existing Cassandra 
from 2.1.13 to Cassandra 3.0.9.

Can I do this as one step?  Create a new Cassandra ring that is version 3.0.9 
and replicate the data from an existing ring that is Cassandra 2.1.13?

After replicating to the new ring if possible them I would upgrade the old ring 
to Cassandra 3.0.9





RE: Reg:- DSE 5.1.0 Issue

2017-09-12 Thread Durity, Sean R
In an attempt to help close the loop for future readers… I don’t think an 
upgrade from DSE 4.8 straight to 5.1 is supported. I think you have to go 
through 5.0.x first.

And, yes, you should contact DataStax support for help, but I’m ok with 
DSE-related questions. They may be more Cassandra-related and helpful to the 
community.


Sean Durity

From: DuyHai Doan [mailto:doanduy...@gmail.com]
Sent: Tuesday, May 16, 2017 8:36 AM
To: Hannu Kröger 
Cc: @Nandan@ ; user@cassandra.apache.org
Subject: Re: Reg:- DSE 5.1.0 Issue

Nandan

Since you have asked many questions about DSE on this OSS mailing list, I 
suggest you contact DataStax directly if you're using their enterprise 
edition. Every DataStax customer has access to their support. If you're a 
sub-contractor for an end customer that is using DSE, ask your customer to get 
you this support access. On this OSS mailing list we cannot answer questions 
related to a commercial product.



On Tue, May 16, 2017 at 1:07 PM, Hannu Kröger 
> wrote:
Hello,

DataStax is probably more than happy answer your particaly DataStax Enterprise 
related questions here (I don’t know if that is 100% right place but…):
https://support.datastax.com/hc/en-us

This mailing list is for open source Cassandra and DSE issues are mostly out of 
the scope here. HADOOP is one of DSE-only features.

Cheers,
Hannu

On 16 May 2017, at 14:01, @Nandan@ 
> wrote:

Hi ,
Sorry in Advance if I am posting here .

I am stuck at a particular step.

I was using DSE 4.8 on a single DC with 3 nodes. Today I upgraded all 3 nodes 
to DSE 5.1.
The issue is that when I try to run "service dse restart" I get this error 
message:

Hadoop functionality has been removed from DSE.
Please try again without the HADOOP_ENABLED set in /etc/default/dse.

Even in the /etc/default/dse file, HADOOP_ENABLED is set to 0.

For testing, once I changed HADOOP_ENABLED to 1, I got this error:

Found multiple DSE core jar files in /usr/share/dse/lib 
/usr/share/dse/resources/dse/lib /usr/share/dse /usr/share/dse/common . Please 
make sure there is only one.

I have searched many articles, but so far I have not been able to find a solution.
Please help me to get out of this mess.

Thanks and Best Regards,
Nandan Priyadarshi.







RE: AWS Cassandra backup/Restore tools

2017-09-12 Thread Durity, Sean R
Datos IO has a backup/restore product for Cassandra that another team here has 
used successfully. It solves many of the problems inherent with sstable 
captures. Without something like it, restores are a nightmare with any volume 
of data. The downtime required and the loss of data since the snapshot are 
usually not worth it.


Sean Durity

From: Alexander Dejanovski [mailto:a...@thelastpickle.com]
Sent: Friday, May 12, 2017 12:14 PM
To: Manikandan Srinivasan ; Nitan Kainth 

Cc: Blake Eggleston ; cass savy ; 
user@cassandra.apache.org
Subject: Re: AWS Cassandra backup/Restore tools

Hi,

here are the main techniques that I know of to perform backups for Cassandra :

  *   Tablesnap (https://github.com/JeremyGrosser/tablesnap): performs 
continuous backups on S3. Comes with tableslurp to restore backups (one table 
at a time only) and tablechop to delete outdated sstables from S3.
  *   Incremental backup: activate it in the cassandra.yaml file and it will 
create snapshots for all newly flushed SSTables. It's up to you to move the 
snapshots off-node and delete them. I don't really like that technique since it 
creates a lot of small sstables that eventually contain a lot of outdated data. 
Upon restore you'll have to wait until compaction catches up on compacting all 
the history (which could take a while and use a lot of power). Your backups 
could also grow indefinitely with this technique since there's no compaction, 
so no purge. You'll have to build the restore script/procedure.
  *   Scheduled snapshots: you perform full snapshots by yourself and move 
them off node. You'll have to build the restore script/procedure (a minimal 
sketch follows after this list).
  *   EBS snapshots: probably the easiest way to perform backups if you are 
using M4/R4 instances on AWS.
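
To illustrate the scheduled snapshots option above, here is a minimal sketch 
(assuming nodetool and the AWS CLI are on the PATH; the keyspace name, 
destination bucket and default data directory are placeholders). It only takes 
a snapshot and copies it off-node; pruning old remote copies and the restore 
procedure are still up to you:

import datetime
import glob
import subprocess

KEYSPACE = "my_keyspace"                          # placeholder
DEST = "s3://my-backup-bucket/cassandra"          # placeholder off-node target
tag = "backup-" + datetime.datetime.utcnow().strftime("%Y%m%d%H%M")

# 1. Take an on-disk snapshot (hard links, so it is cheap and fast).
subprocess.run(["nodetool", "snapshot", "-t", tag, KEYSPACE], check=True)

# 2. Copy the snapshot directories off-node. In a default install they live
#    under <data_dir>/<keyspace>/<table>/snapshots/<tag>/.
for snap_dir in glob.glob(
        "/var/lib/cassandra/data/%s/*/snapshots/%s" % (KEYSPACE, tag)):
    table_dir = snap_dir.split("/snapshots/")[0].rsplit("/", 1)[-1]
    subprocess.run(["aws", "s3", "sync", snap_dir,
                    "%s/%s/%s/" % (DEST, tag, table_dir)], check=True)

# 3. Drop the local snapshot once the copy has succeeded.
subprocess.run(["nodetool", "clearsnapshot", "-t", tag, KEYSPACE], check=True)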

Cheers,

On Thu, May 11, 2017 at 11:01 PM Manikandan Srinivasan 
> wrote:
Blake is correct. OpsCenter 6.0 and up doesn't work with OSS C*. @Nitan: We 
have made some substantial changes to the Opscenter 6.1 backup service, 
specifically when it comes to S3 backups. Having said this, I am not going to 
be sale-sy here. If folks need some help or need more clarity to know more 
about these improvements, please send me an email directly: 
msriniva...@datastax.com

Regards
Mani

On Thu, May 11, 2017 at 1:54 PM, Nitan Kainth 
> wrote:
Also , Opscenter backup/restore does not work for large databases

Sent from my iPhone

On May 11, 2017, at 3:41 PM, Blake Eggleston 
> wrote:
OpsCenter 6.0 and up don't work with Cassandra.


On May 11, 2017 at 12:31:08 PM, cass savy 
(casss...@gmail.com) wrote:
AWS Backup/Restore process/tools for C*/DSE C*:

Has anyone used Opscenter 6.1 backup tool to backup/restore data for larger 
datasets online ?

If yes, did you run into issues using that tool to backup/restore data in PROD 
that caused any performance or any other impact to the cluster?

If no, what are other tools that people have used or recommended for backup and 
restore of Cassandra keyspaces?

Please advice.





--
Regards,

Manikandan Srinivasan

Director, Product Management| +1.408.887.3686 | 
manikandan.sriniva...@datastax.com



RE: Getting all unique keys

2017-08-23 Thread Durity, Sean R
DataStax Enterprise bundles spark and spark connector on the DSE nodes and 
handles much of the plumbing work (and monitoring, etc.). Worth a look.


Sean Durity

From: Avi Levi [mailto:a...@indeni.com]
Sent: Tuesday, August 22, 2017 2:46 AM
To: user@cassandra.apache.org
Subject: Re: Getting all unique keys

Thanks Christophe, we will definitely consider that in the future.

On Mon, Aug 21, 2017 at 3:01 PM, Christophe Schmitz 
> wrote:
Hi Avi,

The spark-project documentation is quite good, as well as the 
spark-cassandra-connector github project, which contains some basic examples 
you can easily get inspired from. A few random advice you might find usefull:
- You will want one spark worker on each node, and a spark master on either one 
of the node, or on a separate node.
- Pay close attention at your port configuration (firewall) as the spark error 
log does not always give you the right hint.
- Pay close attention at your heap size. Make sure to configure your heap size 
such as Cassandra heap size + spark heap size < your node memory (take into 
account Cassandra off heap usage if enabled, OS etc...)
- If your Cassandra data center is used in production, make sure you throttle 
read / write from Spark, pay attention to your latencies, and consider using a 
separate analytic cassandra data center if you get serious with Spark.
- More or less everyone I know find that writing spark jobs in scala is 
natural, while writing them in java is painful :D

Getting spark running will be a bit of an investment at the beginning, but 
overall you will find out it allows you to run queries you can't naturally run 
in Cassandra, like the one you described.
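
For concreteness, here is a minimal PySpark sketch of the kind of job being 
discussed, assuming the spark-cassandra-connector is available to the Spark 
session and using placeholder keyspace/contact-point names (the table is the 
my_table example from Avi's original message):

from pyspark.sql import SparkSession

# Assumes the job is submitted with the spark-cassandra-connector on the
# classpath (e.g. via --packages) and with
# --conf spark.cassandra.connection.host=<one of your nodes>.
spark = SparkSession.builder.appName("distinct-partition-keys").getOrCreate()

df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="my_keyspace", table="my_table")    # placeholders
      .load())

# The connector splits the scan by token range, so each worker reads mostly
# from its local node; the distinct() then runs as an ordinary Spark job
# instead of a single SELECT DISTINCT through one coordinator.
unique_ids = df.select("id").distinct()
print(unique_ids.count())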

Cheers,

Christophe

On 21 August 2017 at 16:16, Avi Levi > 
wrote:
Thanks Christophe,
we didn't want to add too many moving parts but is sound like a good solution. 
do you have any reference / link that I can look at ?

Cheers
Avi

On Mon, Aug 21, 2017 at 3:43 AM, Christophe Schmitz 
> wrote:
Hi Avi,

Have you thought of using Spark for that work? If you collocate the spark 
workers on each Cassandra nodes, the spark-cassandra connector will split 
automatically the token range for you in such a way that each spark worker only 
hit the Cassandra local node. This will also be done in parallel. Should be 
much faster that way.

Cheers,
Christophe


On 21 August 2017 at 01:34, Avi Levi > 
wrote:
Thank you very much. One question: you wrote that I do not need DISTINCT here 
since it is part of the primary key, but only the combination is unique 
(PRIMARY KEY (id, timestamp)). Also, if I take the last token and feed it back 
as you showed, wouldn't I get overlapping boundaries?

On Sun, Aug 20, 2017 at 6:18 PM, Eric Stevens 
> wrote:
You should be able to fairly efficiently iterate all the partition keys like:

select id, token(id) from table where token(id) >= -9204925292781066255 limit 
1000;
 id | system.token(id)
+--
...
 0xb90ea1db5c29f2f6d435426dccf77cca6320fac9 | -7821793584824523686

Take the last token you receive and feed it back in, skipping duplicates from 
the previous page (on the unlikely chance that you have two ID's with a token 
collision on the page boundary):

select id, token(id) from table where token(id) >= -7821793584824523686 limit 
1000;
 id | system.token(id)
+-
...
 0xc6289d729c9087fb5a1fe624b0b883ab82a9bffe | -434806781044590339

Continue until you have no more results.  You don't really need distinct here: 
it's part of your primary key, it must already be distinct.

If you want to parallelize it, split the ring into n ranges and include it as 
an upper bound for each segment.

select id, token(id) from table where token(id) >= -9204925292781066255 AND 
token(id) < $rangeUpperBound limit 1000;
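
A minimal sketch of that paging loop with the Python driver follows (contact 
point and keyspace are placeholders, and it assumes the default 
Murmur3Partitioner, whose ring starts at -2**63; the table is the my_table 
from the original question, with id as the partition key):

from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("my_keyspace")   # placeholders

PAGE = 1000
last_token = -2**63          # Murmur3Partitioner minimum token
boundary_ids = set()
unique_ids = []

query = ("SELECT id, token(id) FROM my_table "
         "WHERE token(id) >= %s LIMIT %s")

while True:
    rows = list(session.execute(query, (last_token, PAGE)))
    for row in rows:
        # Skip ids already collected on the previous page boundary (the rare
        # case of a token collision straddling two pages).
        if row[0] not in boundary_ids:
            unique_ids.append(row[0])
    if len(rows) < PAGE:
        break
    last_token = rows[-1][1]
    boundary_ids = {r[0] for r in rows if r[1] == last_token}

print(len(unique_ids))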


On Sun, Aug 20, 2017 at 12:33 AM Avi Levi 
> wrote:
I need to get all unique keys (not the complete primary key, just the partition 
key) in order to aggregate all the relevant records of that key and apply some 
calculations on it.


CREATE TABLE my_table (
    id text,
    timestamp bigint,
    value double,
    PRIMARY KEY (id, timestamp) )

I know that to query like this

SELECT DISTINCT id FROM my_table

is not very efficient but how about the approach presented 

RE: Adding a new node with the double of disk space

2017-08-18 Thread Durity, Sean R
I am doing some on-the-job-learning on this newer feature of the 3.x line, 
where the token generation algorithm will compensate for different size nodes 
in a cluster. In fact, it is one of the main reasons I upgraded to 3.0.13, 
because I have a number of original nodes in a cluster that are about half the 
size of the newer nodes. With the same number of vnodes, they can get 
overwhelmed with too much data and have to be rebuilt, etc.

So, I am cutting vnodes in half on those original nodes and rebuilding them. So 
far, it is working as designed. The data size is about half on the smaller 
nodes.

With the more current advice being to use less vnodes, for the original 
question below, I might consider adding the new node in at 256 vnodes and then 
rebuilding all the other nodes at 128. Of course the cluster size and amount of 
data would be important factors, as well as the future growth of the cluster 
and the expected size of any additional nodes.


Sean Durity

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Thursday, August 17, 2017 4:20 PM
To: cassandra 
Subject: Re: Adding a new node with the double of disk space

If you really double the hardware in every way, it's PROBABLY reasonable to 
double num_tokens. It won't be quite the same as doubling all-the-things, 
because you still have a single JVM, and you'll still have to deal with GC as 
you're now reading twice as much and generating twice as much garbage, but you 
can probably adjust the tuning of the heap to compensate.



On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor 
> wrote:
Are you saying if a node had double the hardware capacity in every way it would 
be a bad idea to up num_tokens? I thought that was the whole idea of that 
setting though?

On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo 
> wrote:
No.

Even if you doubled all the hardware on that node vs. the others, it would 
still be a bad idea.
Keep the cluster uniform vnodes-wise.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
linkedin.com/in/carlosjuzarterolo

Mobile: +351 918 918 100
www.pythian.com

On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha 
> wrote:
Hi all,

I need to add a new node to my cluster but this time the new node will
have the double of disk space comparing to the other nodes.

I'm using the default vnodes (num_tokens: 256). To fully use the disk
space in the new node I just have to configure num_tokens: 512?

Thanks in advance.






--









RE: nodetool removenode causing the schema out of sync

2017-07-13 Thread Durity, Sean R
Late to this party, but Jeff is talking about nodetool setstreamthroughput. The 
default in most versions is 200 Mb/s (set in yaml file as 
stream_throughput_outbound_megabits_per_sec). This is outbound throttle only. 
So, if streams from multiple nodes are going to one, it can get inundated.

The nodetool command lets you change this on the fly (no bounce required), but 
I don’t think it affects any current streaming from that node (only future). 
You can use nodetool getstreamthroughput to see the current value.
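
A small illustration of checking and adjusting the throttle from a script 
(assuming nodetool is on the PATH of the node you run it on; the 50 Mb/s value 
is just an example, not a recommendation):

import subprocess

# Current outbound streaming throttle on this node (in megabits per second).
current = subprocess.run(["nodetool", "getstreamthroughput"],
                         capture_output=True, text=True, check=True)
print(current.stdout.strip())

# Lower it on the fly, e.g. while a removenode is overwhelming one recipient.
# As noted above, this only affects streams that start after the change.
subprocess.run(["nodetool", "setstreamthroughput", "50"], check=True)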


Sean Durity

From: Jai Bheemsen Rao Dhanwada [mailto:jaibheem...@gmail.com]
Sent: Thursday, June 29, 2017 6:39 PM
To: user@cassandra.apache.org
Subject: Re: nodetool removenode causing the schema out of sync

Thanks Jeff,

Can you please suggest what value to tweak from the Cassandra side?

On Thu, Jun 29, 2017 at 2:53 PM, Jeff Jirsa 
> wrote:


On 2017-06-29 13:45 (-0700), Jai Bheemsen Rao Dhanwada 
> wrote:
> Hello Jeff,
>
> Sorry the Version I am using 2.1.16, my first email had typo.
> When I say schema out of sync
>
> 1. nodetool descriebcluster shows Schema versions same for all nodes.

Ok got it, this is what I was most concerned with.

> 2. nodetool removenode, shows the node down messages in the logs
> 3. nodetool describecluster during this 1-2 mins shows several nodes as
> UNREACHABLE and recovers with in a minute or two.

This is likely due to overhead of streaming - you're probably running pretty 
close to your tipping point, and your streaming throughput creates enough GC 
pressure on the destinations to make them flap a bit. If you use the streaming 
throughput throttle, you may be able to help mitigate that somewhat (at the 
cost of speed).









RE: Node failure Due To Very high GC pause time

2017-07-13 Thread Durity, Sean R
I like Bryan’s terminology of an “antagonistic use case.” If I am reading this 
correctly, you are putting 5 (or 10) million records in a partition and then 
trying to delete them in the same order they are stored. This is not a good 
data model for Cassandra, in fact a dangerous data model. That partition will 
reside completely on one node (and a number of replicas). Then, you are forcing 
the reads to wade through all the tombstones to get to the undeleted records – 
all on the same nodes. This cannot scale to the scope you want.

For a distributed data store, you want the data distributed across all of your 
cluster. And you want to delete whole partitions, if at all possible. (Or at 
least a reasonable number of deletes within a partition.)
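
To make the contrast concrete, here is a minimal sketch with the Python driver 
(the bucketed table layout, keyspace and names are hypothetical, not the 
original poster's schema): if old data is grouped into its own partitions, for 
example one partition per branch/department/day, a purge becomes a single 
partition-level delete instead of millions of row tombstones that later reads 
have to wade through.

from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("my_keyspace")   # placeholders

# Hypothetical bucketed layout: one partition per (branch, department, day).
session.execute("""
    CREATE TABLE IF NOT EXISTS employee_events (
        branch_id     text,
        department_id text,
        day           text,
        emp_id        bigint,
        emp_details   text,
        PRIMARY KEY ((branch_id, department_id, day), emp_id)
    )
""")

# Purging an old day is now one partition-level tombstone.
session.execute(
    "DELETE FROM employee_events "
    "WHERE branch_id=%s AND department_id=%s AND day=%s",
    ("xxx", "yyy", "2017-06-01"),
)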


Sean Durity
From: Karthick V [mailto:karthick...@zohocorp.com]
Sent: Monday, July 03, 2017 12:47 PM
To: user 
Subject: Re: Node failure Due To Very high GC pause time

Hi Bryan,

Thanks for your quick response.  We have already tuned our memory 
and GC based on our hardware specification and it was working fine until 
yesterday, i.e before facing the below specified delete request. As you 
specified we will once again look into our GC & memory configuration.

FYI: we are using memtable_allocation_type as offheap_objects.

Consider the following table

CREATE TABLE EmployeeDetails (
    branch_id text,
    department_id text,
    emp_id bigint,
    emp_details text,
    PRIMARY KEY (branch_id, department_id, emp_id)
) WITH CLUSTERING ORDER BY (department_id ASC, emp_id ASC)


In this table I have 10 million records for a particular branch_id and 
department_id. The following is the list of operations which I perform in C*, 
in chronological order:

  1.  Delete 5 million records, from the start, in batches of 500 records per 
request, for a particular branch_id (say 'xxx') and department_id (say 'yyy').
  2.  Read the next 500 records as soon as the above delete operation has 
completed (SELECT * FROM EmployeeDetails WHERE branch_id='xxx' AND 
department_id='yyy' AND emp_id > 5000 LIMIT 500).

It is only after executing the above read request that there was a spike in 
memory, and within a few minutes the node was marked down.

So my question is: will the above read request load all of the deleted 
5 million records into memory before it starts fetching, or will it jump 
directly to record 5001 (since we have specified the greater-than condition)? 
If it is the former case, then the read request will keep the data in main 
memory and perform a merge operation before it delivers the data, as per this 
wiki ( https://wiki.apache.org/cassandra/ReadPathForUsers ). If not, let me 
know how the above read request will provide the data.


Note: also, while analyzing my heap dump it is clear that the majority of the 
memory is being held by tombstones.


Thanks in advance
-- karthick



 On Mon, 03 Jul 2017 20:40:10 +0530 Bryan Cheng 
> wrote 

This is a very antagonistic use case for Cassandra :P I assume you're familiar 
with Cassandra and deletes? (eg. 
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html,
 
http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_deletes_c.html)

That being said, are you giving enough time for your tables to flush to disk? 
Deletes generate markers which can and will consume memory until they have a 
chance to be flushed, after which they will impact query time and performance 
(but should relieve memory pressure). If you're saturating the capability of 
your nodes your tables will have difficulty flushing. See 
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_memtable_thruput_c.html.

This 

RE: READ Queries timing out.

2017-07-07 Thread Durity, Sean R
1 GB heap is very small. Why not try increasing it to 50% of RAM and see if it 
helps you track down the real issue. It is hard to tune around a bad data 
model, if that is indeed the issue. Seeing your tables and queries would help.


Sean Durity

From: Pranay akula [mailto:pranay.akula2...@gmail.com]
Sent: Friday, July 07, 2017 11:47 AM
To: user@cassandra.apache.org
Cc: ZAIDI, ASAD A 
Subject: Re: READ Queries timing out.

Thanks ZAIDI,

We are using the C++ driver, which doesn't support tracing, so I am executing 
those queries from cqlsh. When I enable tracing I get the error below; I 
increased --request-timeout to 3600 in cqlsh.

ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' 
responses] message="Operation timed out - received only 0 responses." 
info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
Statement trace did not complete within 10 seconds

Below are cfstats and cfhistograms. I can see that read latency, cell count and 
maximum live cells per slice (last five minutes) are high. Is there any way to 
get around this without changing the data model?

Percentile  SSTables    Write Latency    Read Latency              Partition Size    Cell Count
                        (micros)         (micros)                  (bytes)
50%         1.00        20.00            NaN                       1331              20
75%         2.00        29.00            NaN                       6866              86
95%         8.00        60.00            NaN                       126934            1331
98%         10.00       103.00           NaN                       315852            3973
99%         12.00       149.00           NaN                       545791            8239
Min         0.00        0.00             0.00                      104               0
Max         20.00       12730764.00      9773372036884776000.00    74975550          83457



Read Count: 44514407
Read Latency: 82.92876612928933 ms.
Write Count: 3007585812
Write Latency: 0.07094456590853208 ms.
Pending Flushes: 0
SSTable count: 9
Space used (live): 66946214374
Space used (total): 66946214374
Space used by snapshots (total): 0
Off heap memory used (total): 33706492
SSTable Compression Ratio: 0.5598380206656697
Number of keys (estimate): 2483819
Memtable cell count: 15008
Memtable data size: 330597
Memtable off heap memory used: 518502
Memtable switch count: 39915
Local read count: 44514407
Local read latency: 82.929 ms
Local write count: 3007585849
Local write latency: 0.071 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 12623632
Bloom filter off heap memory used: 12623560
Index summary off heap memory used: 3285614
Compression metadata off heap memory used: 17278816
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 74975550
Compacted partition mean bytes: 27111
Average live cells per slice (last five minutes): 
388.7486606077893
Maximum live cells per slice (last five minutes): 28983.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0


Thanks
Pranay.

On Fri, Jul 7, 2017 at 11:16 AM, Thakrar, Jayesh 
> wrote:
Can you provide more details.
E.g. table structure, the app used for the query, the query itself and the 
error message.

Also get the output of the following commands from your cluster nodes (note 
that one command uses "." and the other "space" between keyspace and tablename)

nodetool -h <host> tablestats <keyspace>.<tablename>
nodetool -h <host> tablehistograms <keyspace> <tablename>

Timeouts can happen at the client/application level (which can be tuned) and at 
the coordinator node level (which too can be tuned).
But again those timeouts are a symptom of something.
It can happen at the client side because of connection pool queue too full 
(which is likely due to response time from the cluster/coordinate nodes).
And the issues at the cluster side could be due to several reasons.
E.g. your query has to scan through too many tombstones, causing the delay or 
your query (if using filter).

From: "ZAIDI, ASAD A" 

RE: Starting Cassandrs after restore of Data - get error

2017-07-07 Thread Durity, Sean R
I have seen Windows format cause problems. Run dos2unix on the cassandra.yaml 
file (on the linux box) and see if it helps.


Sean Durity
lord of the (C*) rings (Staff Systems Engineer - Cassandra)
MTC 2250
#cassandra - for the latest news and updates

From: Jonathan Baynes [mailto:jonathan.bay...@tradeweb.com]
Sent: Friday, July 07, 2017 12:48 PM
To: user@cassandra.apache.org
Subject: Re: Starting Cassandrs after restore of Data - get error

Yes, both clusters match; I've checked 3 times and diff'd it as well. Would the 
file format have any effect? I'm amending the file on a Windows machine and 
returning it to Linux.

Thanks
J

Sent from my iPhone

On 7 Jul 2017, at 17:43, Nitan Kainth 
> wrote:
Jonathan,

Make sure initial tokens have values from back up cluster i.e. 256 tokens. It 
is possible to have typo.

On Jul 7, 2017, at 9:14 AM, Jonathan Baynes 
> wrote:

Hi again,

Trying to restart my nodes after restoring snapshot data, initial tokens have 
been added in as per the instructions online.

In system.log I get this error (same error is I run nodetool cleanup)

Exception encountered during startup: The number of initial tokens (by 
initial_token) specified is different from num_tokens value


On both Cluster A and Cluster B the Num_tokens = 256

I've taken the initial tokens from running this script:

nodetool ring | grep "$(ifconfig | awk '/inet /{print $2}' | head -1)" | awk 
'{print $NF ","}' | xargs > /tmp/tokens
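
A minimal sketch of building and sanity-checking that initial_token list in 
Python (it finds the local address the same way the one-liner above does, and 
assumes nodetool and ifconfig are on the PATH; NUM_TOKENS must equal num_tokens 
in cassandra.yaml, which is exactly what the startup error is complaining 
about):

import subprocess

NUM_TOKENS = 256   # must match num_tokens in cassandra.yaml

# Local address, mirroring the awk '/inet /{print $2}' | head -1 above.
ifconfig = subprocess.run(["ifconfig"], capture_output=True, text=True).stdout
local_ip = next(line.split()[1] for line in ifconfig.splitlines() if "inet " in line)

# Every token this address owns, according to nodetool ring.
ring = subprocess.run(["nodetool", "ring"], capture_output=True, text=True).stdout
tokens = [line.split()[-1] for line in ring.splitlines() if local_ip in line]

assert len(tokens) == NUM_TOKENS, (
    "got %d tokens, expected %d; this mismatch causes the startup error"
    % (len(tokens), NUM_TOKENS))

# One comma-separated line with no stray whitespace, ready for initial_token:
print(",".join(tokens))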

When pasting in the tokens originally I got an error, but that was due to the 
spacing between the tokens. That error has been resolved; I'm just left with 
this one.

Any ideas

Thanks
J

Jonathan Baynes
DBA
Tradeweb Europe Limited
Moor Place  *  1 Fore Street Avenue  *  London EC2Y 9DT
P +44 (0)20 77760988  *  F +44 (0)20 7776 3201  *  M +44 (0) xx
jonathan.bay...@tradeweb.com




This e-mail may contain confidential and/or privileged information. If you are 
not the intended recipient (or have received this e-mail in error) please 
notify the sender immediately and destroy it. Any unauthorized copying, 
disclosure or distribution of the material in this e-mail is strictly 
forbidden. Tradeweb reserves the right to monitor all e-mail communications 
through its networks. If you do not wish to receive marketing emails about our 
products / services, please let us know by contacting us, either by email at 
contac...@tradeweb.com or by writing to us at 
the registered office of Tradeweb in the UK, which is: Tradeweb Europe Limited 
(company number 3912826), 1 Fore Street Avenue London EC2Y 9DT. To see our 
privacy policy, visit our website @ 
www.tradeweb.com.





RE: cassandra OOM

2017-04-25 Thread Durity, Sean R
We have seen much better stability (and MUCH fewer GC pauses) from G1 with a 
variety of heap sizes. I don't even consider CMS any more.


Sean Durity

From: Gopal, Dhruva [mailto:dhruva.go...@aspect.com]
Sent: Tuesday, April 04, 2017 5:34 PM
To: user@cassandra.apache.org
Subject: Re: cassandra OOM

Thanks, that’s interesting – so CMS is a better option for 
stability/performance? We’ll try this out in our cluster.

From: Alexander Dejanovski 
>
Reply-To: "user@cassandra.apache.org" 
>
Date: Monday, April 3, 2017 at 10:31 PM
To: "user@cassandra.apache.org" 
>
Subject: Re: cassandra OOM

Hi,

we've seen G1GC going OOM on production clusters (repeatedly) with a 16GB heap 
when the workload is intense, and given you're running on m4.2xl I wouldn't go 
over 16GB for the heap.

I'd suggest to revert back to CMS, using a 16GB heap and up to 6GB of new gen. 
You can use 5 as MaxTenuringThreshold as an initial value and activate GC 
logging to fine tune the settings afterwards.

FYI CMS tends to perform better than G1 even though it's a little bit harder to 
tune.

Cheers,

On Mon, Apr 3, 2017 at 10:54 PM Gopal, Dhruva 
> wrote:
16 Gig heap, with G1. Pertinent info from jvm.options below (we’re using 
m2.2xlarge instances in AWS):


#
# HEAP SETTINGS #
#

# Heap size is automatically calculated by cassandra-env based on this
# formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
# That is:
# - calculate 1/2 ram and cap to 1024MB
# - calculate 1/4 ram and cap to 8192MB
# - pick the max
#
# For production use you may wish to adjust this for your environment.
# If that's the case, uncomment the -Xmx and Xms options below to override the
# automatic calculation of JVM heap memory.
#
# It is recommended to set min (-Xms) and max (-Xmx) heap sizes to
# the same value to avoid stop-the-world GC pauses during resize, and
# so that we can lock the heap in memory on startup to prevent any
# of it from being swapped out.
-Xms16G
-Xmx16G

# Young generation size is automatically calculated by cassandra-env
# based on this formula: min(100 * num_cores, 1/4 * heap size)
#
# The main trade-off for the young generation is that the larger it
# is, the longer GC pause times will be. The shorter it is, the more
# expensive GC will be (usually).
#
# It is not recommended to set the young generation size if using the
# G1 GC, since that will override the target pause-time goal.
# More info: 
http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
#
# The example below assumes a modern 8-core+ machine for decent
# times. If in doubt, and if you do not particularly want to tweak, go
# 100 MB per physical CPU core.
#-Xmn800M

#
#  GC SETTINGS  #
#

### CMS Settings

#-XX:+UseParNewGC
#-XX:+UseConcMarkSweepGC
#-XX:+CMSParallelRemarkEnabled
#-XX:SurvivorRatio=8
#-XX:MaxTenuringThreshold=1
#-XX:CMSInitiatingOccupancyFraction=75
#-XX:+UseCMSInitiatingOccupancyOnly
#-XX:CMSWaitDuration=1
#-XX:+CMSParallelInitialMarkEnabled
#-XX:+CMSEdenChunksRecordAlways
# some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
#-XX:+CMSClassUnloadingEnabled

### G1 Settings (experimental, comment previous section and uncomment section 
below to enable)

## Use the Hotspot garbage-first collector.
-XX:+UseG1GC
#
## Have the JVM do less remembered set work during STW, instead
## preferring concurrent GC. Reduces p99.9 latency.
-XX:G1RSetUpdatingPauseTimePercent=5
#
## Main G1GC tunable: lowering the pause target will lower throughput and vise 
versa.
## 200ms is the JVM default and lowest viable setting
## 1000ms increases throughput. Keep it smaller than the timeouts in 
cassandra.yaml.
-XX:MaxGCPauseMillis=500

## Optional G1 Settings

# Save CPU time on large (>= 16GB) heaps by delaying region scanning
# until the heap is 70% full. The default in Hotspot 8u40 is 40%.
-XX:InitiatingHeapOccupancyPercent=70

# For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number 
of logical cores.
# Otherwise equal to the number of cores when 8 or less.
# Machines with > 10 cores should try setting these to <= full cores.
#-XX:ParallelGCThreads=16
# By default, ConcGCThreads is 1/4 of ParallelGCThreads.
# Setting both to the same value can reduce STW durations.
#-XX:ConcGCThreads=16

### GC logging options -- uncomment to enable

#-XX:+PrintGCDetails
#-XX:+PrintGCDateStamps
#-XX:+PrintHeapAtGC

RE: Can we get username and timestamp in cqlsh_history?

2017-04-03 Thread Durity, Sean R
Sounds like you want full auditing of CQL in the cluster. I have not seen 
anything built into the open source version for that (but I could be missing 
something). DataStax Enterprise does have an auditing feature.


Sean Durity

From: anuja jain [mailto:anujaja...@gmail.com]
Sent: Wednesday, March 29, 2017 7:37 AM
To: user@cassandra.apache.org
Subject: Can we get username and timestamp in cqlsh_history?

Hi,
I have a cassandra cluster having a lot of keyspaces and users. I want to get 
the history of cql commands along with the username and the time at which the 
command is run.
Also if we are running some commands from GUI tools like Devcenter,dbeaver, can 
we log those commands too? If yes, how?

Thanks,
Anuja





RE: Issue with Cassandra consistency in results

2017-03-29 Thread Durity, Sean R
There have been many instances of supposed inconsistency noted on this list if 
nodes do not have the same system time. Make sure you have a matching clock on 
all nodes (ntp or similar).


Sean Durity

From: Shubham Jaju [mailto:shub...@vassarlabs.com]
Sent: Tuesday, March 21, 2017 9:58 PM
To: user@cassandra.apache.org
Subject: Re: Issue with Cassandra consistency in results

Hi

This issue used to appear for me as well. What I figured out in my case was:
  1. I had 3 machines.
  2. I inserted the data with ONE consistency (i.e. there is no guarantee that 
the data was propagated to the remaining nodes; Cassandra is supposed to take 
care of that eventually).
  3. I later also figured out that one of the machines had less disk space than 
the other two (and the data size was larger), i.e. it could not hold the whole 
set of data.

So I think in cases 2 and 3 above, queries will return different results 
because the nodes are not in sync.
nodetool repair should solve this problem, but it takes more time if you have 
more data.
Check whether this solves your problem.

Regards

Shubham Jaju

On Wed, Mar 22, 2017 at 12:23 AM, srinivasarao daruna 
> wrote:
The same issue is appearing in CQL shell as well.

1) Entered cqlsh
2) SET CONSISTENCY QUORUM;
3) Ran a SELECT * with the partition key in the WHERE clause.

The first run gave 0 records,
and the next run gave results.

It's really freaking us out at the moment. And nothing in debug.log or 
system.log.

Thank You,
Regards,
Srini

On Fri, Mar 17, 2017 at 2:33 AM, daemeon reiydelle 
> wrote:
The prep is needed. If I recall correctly it must remain in cache for the query 
to complete. I don't have the docs to dig out the yaml parm to adjust query 
cache. I had run into the problem stress testing a smallish cluster with many 
queries at once.

Do you have a sense of how many distinct queries are hitting the cluster at 
peak?

If many clients, how do you balance the connection load or do you always hit 
the same node?

sent from my mobile
Daemeon Reiydelle
skype daemeon.c.m.reiydelle
USA 415.501.0198

On Mar 16, 2017 3:25 PM, "srinivasarao daruna" 
> wrote:
Hi reiydelle,

I cannot confirm the range as the volume of data is huge and the query 
frequency is also high.
If the cache is the cause of the issue, can we increase the cache size, or is 
there a solution to avoid dropped prepared statements?






Thank You,
Regards,
Srini

On Thu, Mar 16, 2017 at 2:13 PM, daemeon reiydelle 
> wrote:
The discard due to oom is causing the zero returned. I would guess a cache miss 
problem of some sort, but not sure. Are you using row, index, etc. caches? Are 
you seeing the failed prep statement on random nodes (duh, nodes that have the 
relevant data ranges)?


...

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Thu, Mar 16, 2017 at 10:56 AM, Ryan Svihla 
> wrote:
Depends actually, restore just restores what's there, so if only one node had a 
copy of the data then only one node had a copy of the data meaning quorum will 
still be wrong sometimes.

On Thu, Mar 16, 2017 at 1:53 PM, Arvydas Jonusonis 
> wrote:
If the data was written at ONE, consistency is not guaranteed. ..but 
considering you just restored the cluster, there's a good chance something else 
is off.

On Thu, Mar 16, 2017 at 18:19 srinivasarao daruna 
> wrote:
Want to make read and write QUORUM as well.


On Mar 16, 2017 1:09 PM, "Ryan Svihla" 
> wrote:
Replication factor is 3, and write consistency is ONE and read 
consistency is QUORUM.

That combination is not gonna work well:

Write succeeds to NODE A but fails on node B,C

Read goes to NODE B, C

If you can tolerate some temporary inaccuracy you can use QUORUM but may still 
have the situation where

Write succeeds on node A a timestamp 1, B succeeds at timestamp 2
Read succeeds on node B and C at timestamp 1

If you need fully race condition free counts I'm afraid you need to use SERIAL 
or LOCAL_SERIAL (for in DC only accuracy)
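
For reference, this is how the read/write consistency being discussed is set 
per statement with the Python driver (contact point, keyspace, table and 
columns are placeholders; the same idea applies to the other drivers):

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect("my_keyspace")   # placeholders

write = SimpleStatement(
    "INSERT INTO my_table (id, value) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM)            # W = QUORUM

read = SimpleStatement(
    "SELECT value FROM my_table WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)            # R = QUORUM

# With RF=3, QUORUM writes plus QUORUM reads overlap on at least one replica,
# which the ONE-write / QUORUM-read combination described above does not give.
session.execute(write, ("some-id", 42.0))
rows = session.execute(read, ("some-id",))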

On Thu, Mar 16, 2017 at 1:04 PM, srinivasarao daruna 
> wrote:
Replication strategy is SimpleStrategy.

Snitch is: EC2 snitch, as we deployed the cluster on EC2 instances.

I was worried that CL=ALL would have more read latency and read failures, but I 
won't rule out trying it.

Should I switch SELECT COUNT(*) to selecting the partition_key column? Would 
that be of any help?


Thank you
Regards
Srini

On Mar 16, 2017 12:46 PM, "Arvydas Jonusonis" 
> wrote:
What are your 

RE: results differ on two queries, based on secondary index key and partition key

2017-03-29 Thread Durity, Sean R
This looks more like a problem for a graph-based model. Have you looked at DSE 
Graph as a possibility?


Sean Durity
From: ferit baver elhuseyni [mailto:feritba...@gmail.com]
Sent: Tuesday, March 14, 2017 11:40 AM
To: user@cassandra.apache.org
Subject: results differ on two queries, based on secondary index key and 
partition key

Hi all,


We are using a C* 2.2.8 cluster in our production system, composed of 5 nodes 
in 1 DC with RF=3. Our clients mostly write with CL.ALL and read with CL.ONE 
(both will be switched to quorum soon).

We face several problems while trying to persist a classical "follow 
relationship". Did any of you have similar problems, or have any idea what 
could be wrong?

1) First our model. It is based on two tables : follower and following, that 
should be identical. First one is for queries on getting followers of a user, 
latter is for getting who a user is following.

followings (uid bigint, ts timeuuid, fid bigint, PRIMARY KEY (uid, ts)) WITH 
CLUSTERING ORDER BY (ts DESC);

followers (uid bigint, ts timeuuid, fid bigint, PRIMARY KEY (uid, ts)) WITH 
CLUSTERING ORDER BY (ts DESC);


2) Both tables have secondary indexes on fid columns.

3) Definitely, a new follow relationship should insert one row to each table 
and delete should work on both too.



Problems :

1) We have a serious discrepancy problem between the tables. With "nodetool 
cfstats", followings is 18 MB and followers is 19 MB in total.
purposes of this problem, I got followers of the most-followed user from both 
tables.

A) select * from followers where uid=12345678
B) select * from followings where fid=12345678

using a small script on unix, i could find out this info on sets A and B:
count( A < B ) = 1247
count( B < A ) = 185
count( A ∩ B ) = 20894


2) Even more interesting than that is, if I query follower table on secondary 
index, I don't get a row that I normally get with filtering just on partition 
key. Let me try to visualize it :

select uid,ts,fid from followers where fid=X (cannot find uid=12345678)
 A | BBB | X
 C | DDD | X
 E | FFF | X

select uid,ts,fid from followers where uid=12345678 | grep X
 12345678 | GGG | X


My thoughts :

1) Currently, we don't use batches during inserts and deletes to both tables. 
Would this help with our problems?

2) I was first suspicious of a corruption in secondary indexes. But actually, 
through the use of secondary index, I get consistent results.

3) I also thought, there could be the case of zombie rows. However we didn't 
have any long downtimes with our nodes. But, to our shame, we haven't been 
running any scheduled repairs on the cluster.

4) Finally, do you think that there may be problem with our modelling?


Thanks in advance.




