KeyRange in the ColumnFamilyInputFormat

2011-09-05 Thread Vitaly Vengrov
Hi guys.

See these lines in the ColumnFamilyInputFormat.getSplits method:

assert jobKeyRange.start_key == null : "only start_token supported";
assert jobKeyRange.end_key == null : "only end_token supported";

So the question is: why aren't start_key and end_key supported?

What I actually need is the ability to specify an exact row key (a UUID), not a
key range.
I believe I can do this with identical start and end keys, but not with tokens.

Please advise.

Thanks
Vitaly


Re: KeyRange in the ColumnFamilyInputFormat

2011-09-05 Thread Mick Semb Wever
On Mon, 2011-09-05 at 18:18 +0300, Vitaly Vengrov wrote:
> See these rows in the ColumnFamilyInputFormat.getSplits method :
>
> assert jobKeyRange.start_key == null : "only start_token supported";
>
> assert jobKeyRange.end_key == null : "only end_token supported";
>
> So, the question is why start_key and end_key aren't supported ?
>
> What I actually need is the ability to specify exact rowKey (UUID).
> Not a key range.  I believe I can do this with same start and end keys
> but not with tokens.

The background to this is CASSANDRA-1125, and specifically this comment:
https://issues.apache.org/jira/browse/CASSANDRA-1125?focusedCommentId=13058858&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13058858

Tokens are used here to be consistent with the thrift API.

What you want is:


ConfigHelper.setInputRange(
    jobConf,
    partitioner.getTokenFactory().toString(partitioner.getToken(myKey)),
    partitioner.getTokenFactory().toString(partitioner.getToken(myKey)));


In fact this would not be possible if you were using range.start_key and
range.end_key since that would exclude the one row you are trying to
include.

Out of curiosity why are you using hadoop to process one row?
Won't this be solely processed by one split and therefore only one task?

~mck

-- 
The only thing I know, is that I know nothing. Socrates 

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |




Re: KeyRange in the ColumnFamilyInputFormat

2011-09-05 Thread Mick Semb Wever
On Mon, 2011-09-05 at 19:02 +0200, Mick Semb Wever wrote:
 
> ConfigHelper.setInputRange(
>     jobConf,
>     partitioner.getTokenFactory().toString(partitioner.getToken(myKey)),
>     partitioner.getTokenFactory().toString(partitioner.getToken(myKey)));
>
> In fact this would not be possible if you were using range.start_key and
> range.end_key since that would exclude the one row you are trying to
> include.

Sorry, I take that back. It is of course keys that are start-inclusive.
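
To make the inclusive/exclusive distinction concrete, here is a rough, self-contained sketch. It does not use any Cassandra classes: RangeSemantics, inKeyRange and inTokenRange are hypothetical names, keys are compared as plain strings and tokens as plain longs (real ordering depends on the partitioner). It assumes the semantics as I understand them from the Thrift KeyRange notes: key ranges include both ends, while token ranges exclude the start, include the end, and may wrap around the ring.

```java
public class RangeSemantics {

    // Key range: start and end are both inclusive, no wrap-around.
    // Assumption: keys sort by plain string comparison.
    public static boolean inKeyRange(String start, String end, String key) {
        return start.compareTo(key) <= 0 && key.compareTo(end) <= 0;
    }

    // Token range: start-exclusive, end-inclusive. When start >= end the
    // range wraps around the ring, so start == end covers the whole ring.
    // Assumption: tokens are modelled as longs for illustration only.
    public static boolean inTokenRange(long start, long end, long token) {
        if (start < end) {
            return start < token && token <= end;
        }
        return token > start || token <= end;
    }

    public static void main(String[] args) {
        // A key range from k5 to k5 contains exactly k5.
        System.out.println(inKeyRange("k5", "k5", "k5")); // true
        System.out.println(inKeyRange("k5", "k5", "k6")); // false

        // A non-wrapping token range excludes its start and includes its end.
        System.out.println(inTokenRange(10L, 20L, 10L)); // false
        System.out.println(inTokenRange(10L, 20L, 20L)); // true

        // A token range from 42 to 42 wraps and matches every token.
        System.out.println(inTokenRange(42L, 42L, 42L)); // true
        System.out.println(inTokenRange(42L, 42L, 7L));  // true
    }
}
```

If those semantics apply to the version in use, passing the same token as both start and end would describe the whole ring rather than a single row, so it is worth checking against the Thrift interface notes before relying on it.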

~mck

-- 
Those people who think they know everything are a great annoyance to
those of us who do. Isaac Asimov 

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |


