Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-20 Thread Aaron Morton
The limit is just ignored and the entire column family is scanned. Which limit? 1. Am I right that there is no way to get some data limited by token range with ColumnFamilyInputFormat? From what

RE: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-20 Thread Anton Brazhnyk
Morton [mailto:aa...@thelastpickle.com] Sent: Monday, May 19, 2014 11:58 PM To: Cassandra User Subject: Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat) "between 1.2.6 and 2.0.6 the setInputRange(startToken, endToken) is not working" Can you confirm or disprove? My reading

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-19 Thread Aaron Morton
The limit is just ignored and the entire column family is scanned. Which limit? 1. Am I right that there is no way to get some data limited by token range with ColumnFamilyInputFormat? From what I understand, setting the input range is used when calculating the splits. The token ranges in

RE: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-19 Thread Anton Brazhnyk
, endToken) doesn't work. "between 1.2.6 and 2.0.6 the setInputRange(startToken, endToken) is not working" Can you confirm or disprove? WBR, Anton From: Aaron Morton [mailto:aa...@thelastpickle.com] Sent: Monday, May 19, 2014 1:58 AM To: Cassandra User Subject: Re: Cassandra token range support for Hadoop

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-16 Thread Paulo Ricardo Motta Gomes
Hello Anton, What version of Cassandra are you using? If it is between 1.2.6 and 2.0.6, setInputRange(startToken, endToken) does not work. This was fixed in 2.0.7: https://issues.apache.org/jira/browse/CASSANDRA-6436 If you can't upgrade, you can copy AbstractCFIF and CFIF to your project and

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-16 Thread Clint Kelly
Hi Anton, One approach you could look at is to write a custom InputFormat that allows you to limit the token range of rows that you fetch (if the AbstractColumnFamilyInputFormat does not do what you want). Doing so is not too much work. If you look at the class RowIterator within

RE: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-16 Thread Anton Brazhnyk
. WBR, Anton From: Paulo Ricardo Motta Gomes [mailto:paulo.mo...@chaordicsystems.com] Sent: Thursday, May 15, 2014 3:21 AM To: user@cassandra.apache.org Subject: Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat) Hello Anton, What version of Cassandra are you using? If between

Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-15 Thread Anton Brazhnyk
Greetings, I'm reading data from C* with Spark (via ColumnFamilyInputFormat) and I'd like to read just part of it, something like Spark's sample() function. Cassandra's API seems to allow this with its ConfigHelper.setInputRange(jobConfiguration, startToken, endToken) method, but it doesn't

Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-14 Thread Anton Brazhnyk
Greetings, I'm reading data from C* with Spark (via ColumnFamilyInputFormat) and I'd like to read just part of it, something like Spark's sample() function. Cassandra's API seems to allow this with its ConfigHelper.setInputRange(jobConfiguration, startToken, endToken) method, but it doesn't