I will be inserting lots of rows into the db, keyed by userId and
generated in userId order.
I have already learned through this mailing list that this use-case is
not ideal, since it would mean most row-inserts will be on one region
server. I know that some people suggest adding some randomization to
the keys so that the writes get spread around, but I can't do that,
since I'll need to be able to do random access lookups on the rows via
userId.
But I'm wondering if I could map/hash the real userId into another
number that spreads the inserts around. I could still do random access
lookups given a real userId, by recomputing the hash.
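
To make the idea concrete, here is a rough sketch of what I have in
mind, assuming a Java client. The class and method names are made up
for illustration, and MD5 is just one possible choice of hash. The row
key is a prefix taken from the hash of the userId, followed by the
userId itself, so the key can be recomputed for a lookup and the
original userId can still be read back out of it.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any HBase API.
final class HashedRowKey {
    // Builds a 16-byte key: first 8 bytes of MD5(userId), then the userId.
    // The same userId always yields the same key, so lookups can recompute it.
    static byte[] rowKey(long userId) {
        byte[] idBytes = ByteBuffer.allocate(Long.BYTES).putLong(userId).array();
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(idBytes);
            return ByteBuffer.allocate(16).put(digest, 0, 8).put(idBytes).array();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Recovers the original userId from the last 8 bytes of the key.
    static long userIdOf(byte[] rowKey) {
        return ByteBuffer.wrap(rowKey).getLong(8);
    }
}
```

Keeping the real userId as a suffix means two users can never collide
on the same row, even if their hash prefixes happened to match.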
1) I think I like this idea. Does anyone have any experience with it?
2) Assume userId is an 8-byte long. What would be some good hashing
functions? I would be lazy and just store the bytes in little-endian
order, but I bet one of you could come up with something better. :)
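
For reference, the lazy version would look something like the sketch
below (again, the class and method names are hypothetical). It assumes
the resulting long is then written out most significant byte first, so
the userId's low-order byte, which changes on every consecutive id,
ends up at the front of the key. A bit-reversed variant is included for
comparison. Both mappings are their own inverse, so nothing is lost.

```java
// Hypothetical helper, not part of any HBase API.
final class ReversedUserId {
    // Swaps the byte order: the fastest-changing byte becomes the leading byte.
    static long byteReversed(long userId) {
        return Long.reverseBytes(userId);
    }

    // Reverses all 64 bits: the fastest-changing bit becomes the leading bit.
    static long bitReversed(long userId) {
        return Long.reverse(userId);
    }

    public static void main(String[] args) {
        for (long id = 1; id <= 3; id++) {
            System.out.printf("%d -> %016x / %016x%n",
                    id, byteReversed(id), bitReversed(id));
        }
    }
}
```

With byte reversal, consecutive userIds cycle through 256 different
leading bytes. Applying the same function to a stored key gives back
the original userId.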