Nodetool repair question

2016-05-10 Thread Anubhav Kale
Hello, Suppose I have 3 nodes, and stop Cassandra on one of them. Then I run a repair. Will repair move the token ranges from the down node to another node? In other words, in any situation, does a repair operation ever change token ownership? Thanks!

Re: Nodetool repair question

2016-05-10 Thread Joel Knighton
No - repair does not change token ownership. The up/down state of a node is not related to token ownership. On Tue, May 10, 2016 at 3:26 PM, Anubhav Kale wrote: > Hello, > > > > Suppose I have 3 nodes, and stop Cassandra on one of them. Then I run a > repair. Will
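The point above can be illustrated with a toy model. This is a minimal sketch and not Cassandra's own code: the node names, the small integer tokens, and the `owner` helper are all hypothetical. It only shows that ownership is computed from the ring's tokens, while up/down state is tracked separately and never consulted.

```python
from bisect import bisect_left

# Hypothetical three-node ring; the tokens are made-up small integers.
RING = {"node1": 0, "node2": 100, "node3": 200}


def owner(token, ring=RING):
    """Return the node whose token is the first one >= `token`, wrapping around."""
    ordered = sorted(ring.items(), key=lambda item: item[1])
    tokens = [node_token for _, node_token in ordered]
    index = bisect_left(tokens, token) % len(ordered)
    return ordered[index][0]


# Liveness is a separate piece of state; owner() never looks at it.
down_nodes = {"node2"}

print(owner(50))    # node2, even though node2 is marked down
print(owner(250))   # node1, because the ring wraps around
```

In this sketch, marking `node2` as down changes nothing about which node owns token 50; only editing `RING` itself would.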

Re: Lot's of hints, but only on a few nodes

2016-05-10 Thread Nate McCall
The most immediate work-around would be to run nodetool disablehandoff around the cluster before you load data. This would at least stop it snowballing from hints. On Tue, May 10, 2016 at 7:49 AM, Erik Forsberg wrote: > I have this situation where a few (like, 3-4 out of 84)

Re: Cassandra 3.0.6 Release?

2016-05-10 Thread Tyler Hobbs
On Mon, May 9, 2016 at 2:48 PM, Drew Kutcharian wrote: > > > What’s the 3.0.6 release date? Seems like the code has been frozen for a > few days now. I ask because I want to install Cassandra on Ubuntu 16.04 and > CASSANDRA-10853 is blocking it. > We've been holding it up to

RE: Accessing Cassandra data from Spark Shell

2016-05-10 Thread Mohammed Guller
Yes, it is very simple to access Cassandra data using Spark shell.
Step 1: Launch the spark-shell with the spark-cassandra-connector package:
$SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
Step 2: Create a DataFrame pointing to your Cassandra table

Re: COPY TO export fails with

2016-05-10 Thread Stefania Alborghetti
For COPY TO you can try increasing the page timeout or decreasing the page size:
PAGETIMEOUT=10 - the page timeout in seconds for fetching results
PAGESIZE='1000' - the page size for fetching results
You can pass these options to the COPY command by adding "WITH
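As an illustration of how these two options are attached to a COPY TO statement, here is a minimal sketch that only builds the statement text. The helper name `build_copy_to`, the table name, the file name, and the chosen values are hypothetical; the resulting string would still have to be run in cqlsh.

```python
def build_copy_to(table, path, page_timeout=None, page_size=None):
    """Build a cqlsh COPY TO statement, adding only the options that were given."""
    options = []
    if page_timeout is not None:
        options.append(f"PAGETIMEOUT={int(page_timeout)}")
    if page_size is not None:
        options.append(f"PAGESIZE={int(page_size)}")

    statement = f"COPY {table} TO '{path}'"
    if options:
        # cqlsh joins multiple COPY options with AND after a single WITH.
        statement += " WITH " + " AND ".join(options)
    return statement + ";"


# Hypothetical table and file names; a longer timeout and a smaller page.
print(build_copy_to("my_keyspace.my_table", "export.csv",
                    page_timeout=60, page_size=500))
```

Suitable values depend on the table and the cluster, so the numbers above are placeholders rather than recommendations.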

RE: Cassandra 2.0.x OOM during startsup - schema version inconsistency after reboot

2016-05-10 Thread Michael Fong
Hi, Thanks for your recommendation. I also opened a ticket to keep track of this at https://issues.apache.org/jira/browse/CASSANDRA-11748 and hope it brings the issue to someone's attention. Thanks. Sincerely, Michael Fong -Original Message- From: Michael Kjellman

RE: A question to 'paging' support in DataStax java driver

2016-05-10 Thread Sebastian Estevez
I think this request belongs in the Java driver JIRA, not the Cassandra JIRA. https://datastax-oss.atlassian.net/projects/JAVA/ all the best, Sebastián On May 10, 2016 1:09 AM, "Lu, Boying" wrote: > I filed a JIRA https://issues.apache.org/jira/browse/CASSANDRA-11741 to >

Lot's of hints, but only on a few nodes

2016-05-10 Thread Erik Forsberg
I have this situation where a few (like, 3-4 out of 84) nodes misbehave. Very long GC pauses, dropping out of cluster etc. This happens while loading data (via CQL), and analyzing metrics it looks like on these few nodes, a lot of hints are being generated close to the time when they start to

Re: Data platform support

2016-05-10 Thread Srini Sydney
I have a clarification based on your answer - Spark is installed in standalone mode (not on HDFS) in the SMACK framework. Our data lake is in HDFS. How do we overcome this? - cheers sreeni > On 10 May 2016, at 08:16, vincent gromakowski > wrote: > > Maybe a

Re: Data platform support

2016-05-10 Thread Srini Sydney
Thanks a lot, Denise. On 10 May 2016 at 02:42, Denise Rogers wrote: > It really depends how close you want to stay to the most current versions > of open source community products. > > Cloudera has tended to build more products that requires their > distribution to not be as

Re: Data platform support

2016-05-10 Thread Sruti S
Not sure what is meant.. Spark can access HDFS. Why is it in standalone mode? Please clarify. On Tue, May 10, 2016 at 11:08 AM, Srini Sydney wrote: > I have a clarification based on your answer - > > spark is installed as standalone mode (not hdfs) in SMACK framework.

Re: Data platform support

2016-05-10 Thread Srini Sydney
I understand that Spark supports HDFS and standalone modes. The recommendation from Cassandra is that Spark should be installed in standalone mode in the SMACK framework. On 10 May 2016 at 16:24, Sruti S wrote: > Not sure what is meant.. Spark can access HDFS. Why is it

COPY TO export fails with

2016-05-10 Thread Matthias Niehoff
Hi, I am trying to export the data of a table (~15 GB) using cqlsh COPY TO. It fails with "no host available". If I try it with a smaller table, everything works fine. The statistics of the big table: SSTable count: 81 Space used (live): 14102945336 Space

Re: COPY TO export fails with

2016-05-10 Thread Carlos Rolo
Hello, That is a lot of data for a COPY TO. If you want a fast way to export, and you're fine with Java, you can use Cassandra SSTableReader classes to read the sstables directly. Spark also works. Regards, Carlos Juzarte Rolo Cassandra Consultant / Datastax Certified Architect / Cassandra

Re: Data platform support

2016-05-10 Thread vincent gromakowski
Maybe a SMACK stack would be a better option for using spark with Cassandra... Le 10 mai 2016 8:45 AM, "Srini Sydney" a écrit : > Thanks a lot..denise > > On 10 May 2016 at 02:42, Denise Rogers wrote: > >> It really depends how close you want to stay to

Re: COPY TO export fails with

2016-05-10 Thread Matthias Niehoff
Sorry, sent too early. More errors: /export.cql:9:Error for (4549395184516451179, 4560441269902768904): NoHostAvailable - ('Unable to complete the operation against any hosts', {: ConnectionException('Host has been marked down or removed',)}) (will try again later attempt 1 of 5) /export.cql:9:Error

Re: A question to 'paging' support in DataStax java driver

2016-05-10 Thread Sebastian Estevez
I didn't read the whole thread last time around, please disregard my comment about the java driver jira. One other thought (hopefully relevant this time). Once we have https://issues.apache.org/jira/browse/CASSANDRA-10783, you could write a (*start*, *rows*) style paging UDF which would

Re: COPY TO export fails with

2016-05-10 Thread Matthias Niehoff
Hi, I already suspected that COPY TO might not be the best way to do this. I’ll write a small Spark job. Thanks 2016-05-10 10:36 GMT+02:00 Carlos Rolo : > Hello, > > That is a lot of data to do an "COPY TO. > > If you want a fast way to export, and you're fine with Java, you can use >

Re: [C*3.0.3]lucene indexes not deleted and nodetool repair makes DC unavailable

2016-05-10 Thread Eduardo Alonso
Hi all, Sorry, I tested with an old index jar. The cassandra-3.0.3 and dsc-cassandra-3.0.3 packages are the same. The error happens in both; I think we have fixed it, and the fix will be included in the next release (maybe 3.0.5.1). 1.- Full repair is very intensive, that's why your cluster is non

Low cardinality secondary index behaviour

2016-05-10 Thread Atul Saroha
I have a concern about using a secondary index on a field with low cardinality. Let's say I have a few billion rows and each row can be classified into one of 1000 categories. Let's say we have a 50-node cluster. Now we want to fetch data for a single category using a secondary index over the category. And query is
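To put rough numbers on the scenario above, here is a back-of-the-envelope sketch. It assumes "a few billion" means 3 billion rows, that rows are spread evenly over categories and nodes, and it ignores replication; all three are assumptions made for illustration. Cassandra's built-in secondary indexes are stored locally on each node, so a query by category alone cannot be routed to a single node.

```python
TOTAL_ROWS = 3_000_000_000  # assumption: "a few billion" taken as 3 billion
CATEGORIES = 1000
NODES = 50


def rows_per_category(total_rows, categories):
    """Average number of rows that share one category value."""
    return total_rows // categories


def rows_per_category_per_node(total_rows, categories, nodes):
    """Average matching rows held by each node, ignoring replication."""
    return rows_per_category(total_rows, categories) // nodes


print(rows_per_category(TOTAL_ROWS, CATEGORIES))                  # 3000000
print(rows_per_category_per_node(TOTAL_ROWS, CATEGORIES, NODES))  # 60000
```

Under these assumptions a single-category query matches about three million rows, with matching rows spread over every node in the cluster.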