Hi all,

We're storing timestamped data in HBase; from lurking on the mailing list it 
seems like the recommendation is usually to make the timestamp part of the row 
key. I'm curious why this is - is scanning over rows more efficient than 
scanning over timestamps within a cell? 
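
To make the question concrete, here is a minimal sketch of the row-key layout
I understand is being recommended. It relies only on HBase sorting rows by the
byte order of the key, and uses no HBase API. The class name, the "sensor-1"
series id, and the <id><8-byte timestamp> layout are my own assumptions for
illustration, not anything taken from the book.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TimestampRowKey {
    // Hypothetical layout: <series id bytes><8-byte big-endian timestamp>.
    // Assumes non-negative timestamps and ids of equal length (or some
    // separator), so that keys for one series sort together by time.
    static byte[] rowKey(String seriesId, long timestampMillis) {
        byte[] id = seriesId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(id.length + Long.BYTES)
                .put(id)
                .putLong(timestampMillis) // ByteBuffer is big-endian by default
                .array();
    }

    // Reads the timestamp back out of the last 8 bytes of the key.
    static long timestampOf(byte[] rowKey) {
        return ByteBuffer.wrap(rowKey, rowKey.length - Long.BYTES, Long.BYTES)
                .getLong();
    }

    public static void main(String[] args) {
        byte[] start = rowKey("sensor-1", 1000L);
        byte[] stop = rowKey("sensor-1", 2000L);
        // Unsigned byte-wise comparison: the earlier timestamp sorts first,
        // so a time range becomes a contiguous range of row keys.
        System.out.println(Arrays.compareUnsigned(start, stop) < 0);
        System.out.println(timestampOf(stop));
    }
}
```

With keys like these, a time range would be a scan between a start key and a
stop key, as opposed to reading one row and filtering its cell versions by
timestamp.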

The book says: "the version timestamp is used internally by HBase for things like 
time-to-live calculations. It's usually best to avoid setting this timestamp 
yourself. Prefer using a separate timestamp attribute of the row, or have the 
timestamp a part of the rowkey, or both." I understand that TTL would be ruined 
(or saved, depending on your goal) by custom timestamps, and I also gather that
HBase handles concurrency through MVCC. But we are using application-level
locks, and having HBase's TTL apply to our timestamps is a bonus if anything.

So is there any reason why we shouldn't set the timestamps manually?

Thanks!
-Ben
