Hi all,

I have a requirement to store lists in HBase columns like this:
"table", "rowid1", "f:list", "aa, bb, cc"
"table", "rowid2", "f:list", "aabb, cc"

There is a further requirement to be able to find rows where f:list
contains a particular item. For example, a search for item "aa" should
match only "rowid1", while a search for item "cc" should match both
"rowid1" and "rowid2".
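
To make the requirement concrete, here is a minimal client-side sketch of
the matching semantics I'm after (whole-item membership, not substring
matching). It is plain Java with no HBase calls; the class and method names
(ListMatch, items, hasItem) are made up for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ListMatch {
    // Split a stored value such as "aa, bb, cc" into trimmed, non-empty items.
    public static List<String> items(String stored) {
        return Arrays.stream(stored.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // True only when `item` is a whole element of the list,
    // not merely a fragment of a longer element.
    public static boolean hasItem(String stored, String item) {
        return items(stored).contains(item);
    }

    public static void main(String[] args) {
        System.out.println(hasItem("aa, bb, cc", "aa")); // rowid1: true
        System.out.println(hasItem("aabb, cc", "aa"));   // rowid2: false
        System.out.println(hasItem("aabb, cc", "cc"));   // rowid2: true
    }
}
```

This only illustrates the expected results; running it on the client would
mean fetching every row, which is why I'm looking at server-side filters.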

For now I have decided to use SingleColumnValueFilter with substring
matching. A comma-separated list proved difficult to search (a substring
search for "aa" also matches "aabb"), so I'm using pipe symbols to delimit
every item, like this: "|aa|bb|cc|". That way I can pass the search item
surrounded by pipes into the filter:
SingleColumnValueFilter ('f', 'list', =, 'substring:|aa|')
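
The pipe-delimited encoding and the check the substring filter effectively
performs can be sketched in plain Java as below. This is only an
illustration under my own assumptions: the names (PipeList, encode, matches)
are hypothetical, there are no HBase calls, and it assumes items never
contain the pipe character themselves.

```java
import java.util.Arrays;
import java.util.List;

public class PipeList {
    // Encode items as "|aa|bb|cc|": every item has a pipe on both sides.
    public static String encode(List<String> items) {
        StringBuilder sb = new StringBuilder("|");
        for (String item : items) {
            // Assumption: the delimiter never occurs inside an item.
            if (item.isEmpty() || item.contains("|")) {
                throw new IllegalArgumentException("invalid item: " + item);
            }
            sb.append(item).append('|');
        }
        return sb.toString();
    }

    // Same idea as the 'substring:|aa|' filter: look for the item
    // together with its surrounding delimiters.
    public static boolean matches(String encoded, String item) {
        return encoded.contains("|" + item + "|");
    }

    public static void main(String[] args) {
        String row1 = encode(Arrays.asList("aa", "bb", "cc")); // "|aa|bb|cc|"
        String row2 = encode(Arrays.asList("aabb", "cc"));     // "|aabb|cc|"

        System.out.println("aabb, cc".contains("aa")); // true: comma list false positive
        System.out.println(matches(row1, "aa"));       // true
        System.out.println(matches(row2, "aa"));       // false
        System.out.println(matches(row2, "cc"));       // true
    }
}
```

With the Java client, the equivalent filter would presumably be built from
SingleColumnValueFilter and SubstringComparator. As far as I know,
SubstringComparator compares case-insensitively, so a search for "|AA|"
would also match "|aa|", which may or may not be wanted.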

This works well enough; however, I would prefer to use something more
standard for my list storage (e.g. serialised JSON), or perhaps something
even better optimised for searching. Performance really does matter here.

Any opinions on this solution and possible enhancements are much
appreciated.

Many thanks,
Stas
