Rams - you might enjoy this blog post from HBase committer Jesse Yates (from last summer):
http://jyates.github.io/2012/07/09/consistent-enough-secondary-indexes.html

Secondary indexing doesn't exist in HBase core today, but there are various proposals and early implementations of it in flight.

In the meantime, as Mike and others have said, if you don't need the indexes to be immediately consistent in a real-time write scenario, you can simply write the same data into multiple tables in different sort orders. (This is hard in a real-time write scenario because, without cross-table transactions, you'd have to handle all the cases where the record was written but the index wasn't, or vice versa.)

Ian

On Jun 4, 2013, at 12:22 PM, Ramasubramanian Narayanan wrote:

Hi Michel,

If you don't mind, can you please help explain in detail ...

Also, can you please let me know whether we have secondary index in HBASE?

regards,
Rams

On Tue, Jun 4, 2013 at 1:13 PM, Michel Segel wrote:

Quick and dirty...

Create an inverted table for each index....
Then you can take the intersection of the result set(s) to get your list of rows for further filtering.

There is obviously more to this, but it's the core idea...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 4, 2013, at 11:51 AM, Shahab Yunus wrote:

Just a quick thought: why don't you create different tables and duplicate data, i.e. go for denormalization and data redundancy? Are all of your read access patterns that require the 70 columns incorporated into one application/client, or will it be a bunch of different clients/applications? If that is not the case, then I think why not take advantage of more storage.

Regards,
Shahab

On Tue, Jun 4, 2013 at 12:43 PM, Ramasubramanian Narayanan wrote:

Hi,

In a HBASE table, there are 200 columns and the read pattern for different systems involves 70 columns...
In the above case, we cannot put 70 columns in the rowkey, as that would not be a good design...

Can you please suggest how to handle this problem?

Also, can we do indexing in HBASE apart from the rowkey? (something called secondary index)

regards,
Rams
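
To make the "inverted table for each index" suggestion from the replies concrete, here is a minimal Python sketch. It is only an illustration under assumptions: plain in-memory dicts stand in for HBase tables (this is not the HBase client API), and the column names `city` and `status`, the `INDEXED_COLUMNS` tuple, and the `put` and `lookup` helpers are all hypothetical. It shows one inverted table per indexed column, mapping a column value to the row keys that hold it, and a lookup that intersects those row-key sets before fetching the rows.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for HBase tables (not the HBase client API).
main_table = {}  # row key -> {column: value}
# One inverted "table" per indexed column: column -> value -> set of row keys.
index_tables = defaultdict(lambda: defaultdict(set))

INDEXED_COLUMNS = ("city", "status")  # assumed columns, for illustration only


def put(row_key, columns):
    """Replace a row, then update the inverted table of each indexed column."""
    old = main_table.get(row_key, {})
    for col in INDEXED_COLUMNS:
        if col in old:
            # Drop the stale index entry left by the previous version of the row.
            index_tables[col][old[col]].discard(row_key)
    main_table[row_key] = dict(columns)
    for col in INDEXED_COLUMNS:
        if col in columns:
            index_tables[col][columns[col]].add(row_key)


def lookup(**criteria):
    """Intersect the row-key sets from each index, then fetch matching rows."""
    key_sets = [index_tables[col].get(val, set()) for col, val in criteria.items()]
    if not key_sets:
        return {}
    keys = set.intersection(*key_sets)
    return {k: main_table[k] for k in sorted(keys)}


if __name__ == "__main__":
    put("r1", {"city": "Pune", "status": "active", "name": "a"})
    put("r2", {"city": "Pune", "status": "closed"})
    put("r3", {"city": "Delhi", "status": "active"})
    print(lookup(city="Pune", status="active"))
```

Note that the row write and the index writes inside `put` are separate steps. In a real cluster, without cross-table transactions, a failure between them would leave the record written but the index not (or vice versa), which is exactly the consistency problem described at the top of this thread.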
