I would say it depends on your context. As you say, your 'primary key' must be distinct for any two different records. Even if you use a hash in addition to the timestamp, you cannot guarantee that a record won't be overwritten, since hashes can collide. If something in your context can act as an identifier, you should use it; otherwise you need to create one. If you know that a given timestamp will only ever be loaded from a single source, adding a counter could do the trick. If that is not the case, you could append information to the timestamp (for example, a source identifier) so that each pre-key value comes from a single source, and then append a counter, which only needs to provide a distinct value for each record sharing the same pre-key value.
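To make the idea concrete, here is a minimal sketch in Python of the "timestamp + source + counter" scheme. It is only an illustration under my own assumptions: the class name `RowKeyBuilder`, the `|` separator, the zero-padding widths and the `loader-01` source id are all hypothetical, and it assumes that each source id is used by exactly one loader process at a time.

```python
import itertools
from collections import defaultdict


class RowKeyBuilder:
    """Builds row keys of the form <timestamp>|<source>|<counter>.

    Hypothetical helper: assumes one builder instance per source id.
    """

    def __init__(self, source_id: str):
        self.source_id = source_id
        # One independent counter (starting at 0) per pre-key value.
        self._counters = defaultdict(itertools.count)

    def next_key(self, timestamp_ns: int) -> bytes:
        # Zero-pad the timestamp so keys sort chronologically as bytes.
        pre_key = f"{timestamp_ns:020d}|{self.source_id}"
        # The counter only has to be distinct among records that
        # share the same pre-key (same timestamp, same source).
        seq = next(self._counters[pre_key])
        return f"{pre_key}|{seq:06d}".encode("utf-8")


builder = RowKeyBuilder("loader-01")
first = builder.next_key(1348324140000000000)
second = builder.next_key(1348324140000000000)  # same timestamp, new key
```

Note that the counters live in memory, so this sketch only avoids duplicates within a single run; if a batch can be reloaded or a loader restarted, the counter state would have to be persisted somewhere.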
I would love to hear about alternatives, though.

Regards

Bertrand

On Sat, Sep 22, 2012 at 4:29 PM, Ramasubramanian Narayanan <[email protected]> wrote:

> Hi,
>
> Can anyone suggest what is the best value that can be used for a rowkey in
> a hbase table which will not produce duplicate any point of time. For
> example timestamp with nano seconds may get duplicated if we are loading in
> a batch file.
>
> regards,
> Rams
>

--
Bertrand Dechoux
