It sounds like you should investigate the Lily Project. They have already done a lot of work to integrate Solr and HBase into a single solution. I did something similar before they released their project -- I like my use of dynamic schemas, but their overall approach is probably more solid. In particular, they have given careful consideration to what to do with large objects and how to integrate them into the system. And most importantly, their project is open.
There was also some talk earlier of integrating HBase and Solr -- you might want to search the list for some of Jason's posts. I think that is still a work in progress.

Otherwise you will have to roll your own solution. It is actually not too difficult to set up a system to publish HBase contents to Solr. The difficulty is in maintaining a consistent view of the data between the two. I believe Lily uses queues to keep updates in sync. If you can tolerate some delay, you could simply update your indexes on a regular basis, or set up your application to populate HBase and Solr simultaneously.

The biggest challenge is resharding. HBase will automatically split regions when they become too large. Solr doesn't have that capability yet, so you will have to manage the shards yourself.

Another approach is to look at Elastic Search. That is a Lucene-based system that does do automatic sharding.

Direct search on HBase requires a clever key encoding (like OpenTSDB) and/or multiple copies of the data to imitate secondary indexes.

Dave

-----Original Message-----
From: Stuti Awasthi [mailto:[email protected]]
Sent: Thursday, September 29, 2011 2:52 AM
To: [email protected]
Subject: Hbase - Solr Integration

Hi Friends,

I am storing my data in HBase. I want to do search using Solr. I can't find much documentation about the integration. Is there any documentation on integrating the two? Please suggest.

Regards,
Stuti Awasthi
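
A footnote on the key-encoding point in the reply above, in case a concrete picture helps. This is only a rough sketch of the general idea, not OpenTSDB's actual key layout and not HBase client code: the composite key of (metric id, timestamp), the function names, and the sorted Python list standing in for a table are all assumptions made for illustration. The point is that HBase keeps rows sorted by raw key bytes, so if the fields are packed as fixed-width big-endian integers, a lookup by metric and time window becomes one contiguous range scan.

```python
import bisect
import struct

# Hypothetical layout: 4-byte metric id followed by 8-byte timestamp.
# Big-endian unsigned packing makes byte order match numeric order.
KEY_FORMAT = ">IQ"


def make_row_key(metric_id, timestamp):
    """Pack the two fields into a fixed-width, sortable row key."""
    return struct.pack(KEY_FORMAT, metric_id, timestamp)


def parse_row_key(key):
    """Recover (metric_id, timestamp) from a row key."""
    return struct.unpack(KEY_FORMAT, key)


def range_scan(sorted_keys, metric_id, start_ts, end_ts):
    """Return keys for metric_id with start_ts <= timestamp < end_ts."""
    lo = bisect.bisect_left(sorted_keys, make_row_key(metric_id, start_ts))
    hi = bisect.bisect_left(sorted_keys, make_row_key(metric_id, end_ts))
    return sorted_keys[lo:hi]


# Stand-in for a table: a list of keys kept in byte-sorted order.
table = sorted(
    make_row_key(m, t) for m in (1, 2, 300) for t in (5, 100, 70000)
)

hits = [parse_row_key(k) for k in range_scan(table, 2, 0, 1000)]
print(hits)
```

Against a real table, the two keys computed inside `range_scan` would serve as the start row (inclusive) and stop row (exclusive) of a scan. Queries that do not start from the leading key field get no help from this encoding, which is where the extra copies of the data under differently ordered keys come in.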
