Take a look at Solr and Lucene. You should be able to do a text search on the 
HBase data written via Phoenix. Indexing works via the HBase replication 
mechanism, so it should be near-real-time. I think you would have to use the 
Solr API to do the initial search, which would get you the HBase rowkey; you 
could then parse that and run a follow-up Phoenix query for additional data. 
Note that I haven't done any of the above myself, so your mileage may vary.
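
A rough sketch of that two-step lookup, in Python with only the standard 
library. It is untested against a real cluster and rests on several 
assumptions: the Solr collection is called "people", the indexed text field is 
"name", the Solr document "id" holds the HBase rowkey unchanged, and the 
Phoenix table has a single VARCHAR primary key column "ID" (so the rowkey can 
be used as the key value without further parsing). All of those names are 
hypothetical. Only the Solr /select endpoint and its JSON response layout are 
standard.

```python
import json
from urllib.parse import urlencode


def build_solr_url(base_url, collection, text, field="name", rows=10):
    """Build a Solr /select URL that returns only the document id.

    `field` and `collection` are hypothetical names for this sketch.
    """
    # Escape backslashes and double quotes so the phrase query stays valid.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "fl": "id", "rows": rows, "wt": "json"}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def extract_rowkeys(solr_json):
    """Pull the id values out of a Solr JSON select response.

    Assumes the Solr document id is the HBase rowkey (an assumption,
    it depends on how the indexer is configured).
    """
    docs = json.loads(solr_json)["response"]["docs"]
    return [doc["id"] for doc in docs]


def build_phoenix_query(table, key_column, rowkeys):
    """Return (sql, params) for a parameterized Phoenix lookup by primary key.

    Assumes a single-column VARCHAR primary key; a composite key would
    need the rowkey split into its parts first.
    """
    if not rowkeys:
        raise ValueError("no rowkeys to look up")
    placeholders = ", ".join("?" for _ in rowkeys)
    sql = f"SELECT * FROM {table} WHERE {key_column} IN ({placeholders})"
    return sql, list(rowkeys)
```

In use, you would fetch the URL from build_solr_url (for example with 
urllib.request.urlopen), pass the response body to extract_rowkeys, and hand 
the resulting SQL and parameters to whatever Phoenix client you use (JDBC, or 
the Phoenix Query Server). Keeping the rowkeys as bind parameters avoids 
quoting problems in the follow-up query.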

> On Apr 3, 2017, at 6:27 PM, Randy <ruw...@gmail.com> wrote:
> 
> Wondering if anyone knows whether there is an approach to swap in a custom 
> indexing implementation while leveraging all the other functionality of 
> Phoenix. The initial goal is just SELECT queries, but it would be nice to 
> have custom index maintenance integrated into the record life cycle as well.
> 
> Phoenix already supports secondary indexes, but we need something more 
> flexible for really large data sets where the format and quality vary. 
> 
> For example, assume we have a table "PEOPLE" with a column "NAME" that 
> stores a person's name. If there is a record with "Joe Smith" as the value of 
> the "NAME" column, it would be really powerful if we could find it using 
> variants or a partial name as the criteria. Ideally, all of the following 
> queries would find the same record if we could plug a custom indexing 
> implementation into Phoenix:
> 
> SELECT * FROM PEOPLE WHERE NAME='Joe Smith';
> SELECT * FROM PEOPLE WHERE NAME='Smith,Joe';
> SELECT * FROM PEOPLE WHERE NAME='Joseph Smith';
> 
> Given that secondary indexes have global and local implementations, I would 
> imagine there is already some level of abstraction for consuming an index. 
> I'm not expecting an officially supported API; just some guidance on where 
> to start would be greatly appreciated.
> 
> Thanks,
> 
> Randy
