What do you mean by similar? I'd think the speed would be the same doing inserts. How many rows and regions when you are done? What size cluster?
How do you intend to query HBase? Will you be requesting clumps of 'similars' or just getting an item at a time?

St.Ack

On Fri, Dec 3, 2010 at 4:28 PM, Peter Haidinyak <[email protected]> wrote:
> Hi,
> Which would be a better approach.
>
> 1. Having every entry into HBase use a unique Row Key
>
> 2. Having similar entries into HBase use the same Row Key and then use
> versions to extract the data.
>
> I have noticed that option 2 is much slower for putting data into HBase by a
> factor of 2.5 but would extracting the information be faster?
>
> Thanks
>
> -Pete
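
To make the two layouts in the question concrete, here is a toy in-memory sketch in Python. It is not HBase client code and says nothing about real put or read speed; the table is modeled as a plain dict, and the helper names (put_unique, put_versioned, read_unique, read_versioned) and the "group:timestamp" key format are assumptions made up for illustration. It only shows how "clumps of similars" would be fetched under each option: a key-prefix scan for option 1, a single-row read of all versions for option 2.

```python
from collections import defaultdict

# Toy stand-in for a table: row key -> {timestamp: value}.
# This is NOT the HBase client API, only a model of the two layouts.

def put_unique(table, group, ts, value):
    """Option 1: one row per entry; the (hypothetical) key is group + timestamp."""
    row_key = f"{group}:{ts:010d}"
    table[row_key][ts] = value

def put_versioned(table, group, ts, value):
    """Option 2: one row per group; each entry becomes a new version of the cell."""
    table[group][ts] = value

def read_unique(table, group):
    """Fetch a clump of similar entries by scanning rows that share the key prefix."""
    prefix = f"{group}:"
    keys = sorted(k for k in table if k.startswith(prefix))
    return [v for k in keys for v in table[k].values()]

def read_versioned(table, group):
    """Fetch a clump by reading every version of a single row, oldest first."""
    return [table[group][ts] for ts in sorted(table[group])]

entries = [("user42", 1, "a"), ("user42", 2, "b"), ("user7", 1, "c")]
t1, t2 = defaultdict(dict), defaultdict(dict)
for group, ts, value in entries:
    put_unique(t1, group, ts, value)
    put_versioned(t2, group, ts, value)

print(len(t1), len(t2))                # number of rows under each layout
print(read_unique(t1, "user42"))       # clump via prefix scan
print(read_versioned(t2, "user42"))    # clump via versions of one row
```

In the toy model, as in HBase, two entries written to the same row and column with the same timestamp overwrite each other under option 2. In a real table the number of retained versions is also bounded by the column family's configured maximum, so option 2 only works if that limit is set high enough for the largest clump.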
