Hi, I am trying to iterate over the entire dataset to calculate some information. The way I am trying to do this is by going directly to the node that owns each data range. Here is the route I am following:
- Get the TokenRanges using describe_ring.
- For each TokenRange, pick a node and get all the data from that node (so I talk directly to that node for its local data), using get_range_slices() with a KeyRange that has a start and end token. I want to get about N tokens at a time.
- I want to use a paging approach for this, but I cannot seem to find a way to get the token for my last KeySlice. The only thing I can find is the key.

Is there a way to get the token given a key? As per some suggestions, I could take the MD5 of the last key and use that as the starting token for the next query. Would that work? Also, is there a better way of doing this?

The data per row is very small. This looks like a Hadoop kind of job, but I am trying to avoid Hadoop since I have no other use for it and this operation will be infrequent.

I am using 0.6.6 with RandomPartitioner.

Thanks,
Anand
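
To make the MD5 idea concrete, here is a minimal sketch of the paging loop in Python. It rests on assumptions not confirmed in this post: that RandomPartitioner computes a token as the absolute value of the key's MD5 digest read as a signed big-endian integer, and that a token-based KeyRange excludes its start token and includes its end token. `fetch_page` is a hypothetical stand-in for the real get_range_slices() call, not part of any client library.

```python
import hashlib


def token_for_key(key: bytes) -> int:
    """Assumed RandomPartitioner rule: abs(MD5 digest as a signed big integer)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def iterate_range(fetch_page, start_token, end_token, page_size):
    """Yield every row key in (start_token, end_token], one page at a time.

    fetch_page(start_token, end_token, count) is a hypothetical callable that
    wraps get_range_slices() and returns row keys (bytes) in token order.
    """
    while True:
        keys = fetch_page(start_token, end_token, page_size)
        for key in keys:
            yield key
        if len(keys) < page_size:
            break  # a short page means the range is exhausted
        # Resume just after the last row by using its token as the new start.
        start_token = token_for_key(keys[-1])
        if start_token == end_token:
            break  # avoid start == end, which may be read as the whole ring
```

If the Thrift KeyRange expects tokens as strings, the integers above would be passed as `str(token)`. Whether the tokens computed this way match what the server uses on 0.6.6 should be checked against a few known rows before relying on it.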