Let's say I have an external job (MapReduce, Pig, etc.) that sorts a
Cassandra table by some complicated mechanism.

I want to store the sorted records back into Cassandra so that clients can
read them in sorted order.

What I was thinking of doing is storing the records as pages.

So page 0 would hold records 0-999, and so on.

The key would be the page ID, and the values would be the primary keys of
the records on that page, so that they can be fetched. I could also
denormalize the data and store the records inline as a materialized view,
but of course this would require much more disk space.

Thoughts on this strategy?

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile <https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.
