Re: Advice wanted on supporting a tag feature for searching an HBase Table via Phoenix

Josh Elser Mon, 22 Feb 2021 12:14:47 -0800

I had a similar sort of issue (granted, at a smaller data scale), and I went with option 2.

If you put the rowkey of your "data" table plus the tag itself into the rowkey for your other table/index, you should be able to grow without running into HBase scalability limits (though pulling 10GB of tags for one lookup would be crazy slow :P). It's a fast rowkey prefix scan to pull all the tags for the "data record".

Just don't forget that HBase won't split a single row across multiple Regions. That's the important part in designing this table.
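
To make the rowkey layout concrete, here is a minimal sketch in plain Python. It does not call HBase or Phoenix; it only simulates a lexicographically sorted key space with a sorted list. The names `tag_rowkey`, `prefix_scan` and `tags_for`, the NUL separator, and the extra tag-type component (to cover several separate tag fields) are all assumptions made for illustration, not something from this thread.

```python
import bisect

# Assumed separator: a byte that never occurs in record IDs, tag types or tags.
SEP = b"\x00"


def tag_rowkey(record_id: str, tag_type: str, tag: str) -> bytes:
    """Build one rowkey per (record, tag field, tag): <record_id> SEP <tag_type> SEP <tag>."""
    return SEP.join(part.encode("utf-8") for part in (record_id, tag_type, tag))


def prefix_scan(sorted_keys, prefix: bytes):
    """Return every key starting with `prefix`, assuming `sorted_keys` is in byte order."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing further can match
        matches.append(key)
    return matches


def tags_for(sorted_keys, record_id: str):
    """All (tag_type, tag) pairs stored for one data record."""
    # Including SEP in the prefix keeps "rec1" from also matching "rec10".
    prefix = record_id.encode("utf-8") + SEP
    return [
        tuple(part.decode("utf-8") for part in key.split(SEP)[1:])
        for key in prefix_scan(sorted_keys, prefix)
    ]


# Stand-in for the tag table: one small row per tag, kept in sorted order.
keys = sorted([
    tag_rowkey("rec1", "type1", "hbase"),
    tag_rowkey("rec1", "type1", "phoenix"),
    tag_rowkey("rec1", "type2", "urgent"),
    tag_rowkey("rec10", "type1", "other"),
    tag_rowkey("rec2", "type1", "hbase"),
])

print(tags_for(keys, "rec1"))
```

Because every tag becomes its own small row, no single row grows with the number of tags on a record, which is the property that matters given that one row always lives in one Region.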


On 2/21/21 11:51 PM, Simon Mottram wrote:

The requirement is to be able to search from a list of tags; each record can have a possibly large number of tags. There would be more than one tag field.
An example might be 3 different hashtag fields. They do have to be different; we can't have just one tag cloud.
The data size is large, so we need to be able to search the tag clouds over large numbers of records. Millions but not billions (for now)
e.g:

I was wondering what the best method would be

1) a column per tag value.
ID, name, some_attributes..., type1_tag_1,  type1_tag_2

While HBase is happy with many columns, I can't see how to index this.
2) A tag join table. Maybe just a single row key ID + single tag. Then it becomes a straight join of ID + tag. Thus it would be indexed.
3) Is there a crafty way of using column families? Could that be indexed efficiently?
Any tips/tricks gratefully received

Simon
