Hi,

I am trying to iterate over the entire dataset to calculate some
information. My plan is to go directly to the node that owns each token
range. Here is the route I am following:

   - Get the token ranges using describe_ring.
   - For each token range, pick a node that owns it and read all of that
   range from it (so talk directly to that node for its local data), using
   get_range_slices() with a KeyRange that has a start and an end token. I
   want to fetch about N rows at a time.
   - I want to page through each range, but I cannot find a way to get the
   token for the last KeySlice of a page; the only thing it gives me is the
   key. Is there a way to get the token for a given key? One suggestion was
   to take the MD5 of the last key and use that as the start token for the
   next query. Would that work?
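
To make the MD5 idea concrete, here is a rough sketch of what I have in
mind. It assumes that RandomPartitioner derives the token by reading the
16-byte MD5 digest of the key as a signed big-endian integer and taking its
absolute value, and that the start token of a KeyRange is exclusive. I have
not confirmed either against 0.6.6. fetch_page is a hypothetical stand-in
for the real get_range_slices call, not an actual client function, and
wrapping ranges are not handled.

```python
import hashlib
from typing import Callable, Iterator, List


def random_partitioner_token(key: bytes) -> int:
    """Token for a key, ASSUMING RandomPartitioner uses abs(signed MD5)."""
    digest = hashlib.md5(key).digest()
    # Interpret the 16 digest bytes as a signed big-endian integer.
    value = int.from_bytes(digest, byteorder="big", signed=True)
    return abs(value)


def iterate_token_range(
    fetch_page: Callable[[int, int, int], List[bytes]],
    start_token: int,
    end_token: int,
    page_size: int,
) -> Iterator[bytes]:
    """Yield every key in (start_token, end_token], one page at a time.

    fetch_page(start, end, count) is a HYPOTHETICAL wrapper around
    get_range_slices that returns at most `count` keys in token order.
    """
    current = start_token
    while True:
        keys = fetch_page(current, end_token, page_size)
        yield from keys
        if len(keys) < page_size:
            # A short (or empty) page means the range is exhausted.
            return
        # Resume after the last key by recomputing its token locally.
        current = random_partitioner_token(keys[-1])
```

If the start token turns out to be inclusive, each page after the first
would repeat the previous page's last row, and that duplicate would have to
be dropped. The real KeyRange also takes tokens as strings, so the integers
here would need str() applied before the call.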

Also, is there a better way of doing this? The data per row is very small.
This looks like a Hadoop kind of job, but I am trying to avoid Hadoop since
I have no other use for it and this operation will be infrequent.

I am using Cassandra 0.6.6 with RandomPartitioner.

Thanks
Anand
