Hi Raghav,

You could try Apache Solr along with HBase. Apache Solr is designed for full-text search and supports several modes for storing its indexes.

http://lucene.apache.org/solr/
http://github.com/akkumar/hbasene [Provides a distributed system that uses HBase as the backing store for the TF-IDF representation needed by Lucene]
http://www.lilyproject.org/lily/index.html [Cloud-scalable NoSQL-based content store and search repository, built on top of Apache HBase and Solr]
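To give a rough idea of why a search index answers lookups faster than a query that walks the rows, here is a minimal sketch of an inverted index over comma-delimited rows. This is only an illustration of the general idea, not Solr's or Lucene's API; the sample data and the names `build_index` and `search` are made up for the example, and a real index also handles ranking (e.g. TF-IDF), analysis, and persistence.

```python
import csv
import io
from collections import defaultdict


def build_index(rows):
    """Map each lower-cased token to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for field in row:
            for token in field.lower().split():
                index[token].add(row_id)
    return index


def search(index, query):
    """Return the sorted row ids that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # Copy the first posting set so the index itself is never modified.
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical sample standing in for the daily comma-delimited load.
SAMPLE = (
    "1001,Dhaka,network outage\n"
    "1002,Pune,disk failure\n"
    "1003,Dhaka,disk replaced\n"
)
rows = list(csv.reader(io.StringIO(SAMPLE)))
index = build_index(rows)

# Look up rows mentioning both terms without scanning every row.
print(search(index, "disk dhaka"))
```

The work is paid once at load time; each query then touches only the posting sets for its own terms, which is the trade-off that makes an index attractive when loads are daily and searches must be fast.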
If your requirement is not real-time in nature, you may also try the Scanner API of the HBase client.

http://hbase.apache.org/docs/r0.89.20100726/apidocs/index.html

Regards,

Imran

On Mon, Sep 27, 2010 at 10:27 AM, Sharma, Raghvendra <[email protected]> wrote:
> I am running a little test/poc here.
>
> I need to load a few million rows every day into a database. And it's not log
> file data, I have comma delimited rows (of columns) which would exactly fit a
> relational database.
>
> After the loading, I need to allow a very fast search mechanism. Looking a
> bit at Google's implementation of bigtable and structure around it, I
> originally thought of using hive integrated with hbase. Hive because of its
> querying capabilities. The loading works out fine, better than RDBMS perf.
> However, the querying bottleneck, which was the reason to look for
> alternatives to RDBMS in the first place, continues with hive too.
>
> Testing hive for querying is not really blazing performance. Perhaps I need
> to look for alternatives..
>
> Is there something else ? any other tool/solution/library that I can put on
> top of hbase ? or even without hbase ? (I looked at hbase as an alternative
> to the RDBMS, moving towards dist computing)
>
> Suggestions please...
>
> --raghav..

--
Imran M Yousuf
Entrepreneur & CEO
Smart IT Engineering Ltd.
Dhaka, Bangladesh
Twitter: @imyousuf - http://twitter.com/imyousuf
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
