Hi Raghav,

You could try Apache Solr along with HBase. Apache Solr is designed
for full-text search and supports several modes for storing its
indexes.
http://lucene.apache.org/solr/
http://github.com/akkumar/hbasene [Provides a distributed system to
use HBase as the backing store for the TF-IDF representation, as
needed by Lucene]
http://www.lilyproject.org/lily/index.html [Cloud-scalable NoSQL-based
content store and search repository, built on top of Apache HBase and
SOLR]

If your search does not need to be real-time, you could also try the
Scanner API of the HBase client.
http://hbase.apache.org/docs/r0.89.20100726/apidocs/index.html
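
To give a rough idea of what a scan does, here is a small sketch in Java.
It is an in-memory stand-in and not the HBase client API: the class
ScanSketch, its methods and the sample row keys are all made up for
illustration. It only assumes what HBase documents about its data model,
namely that rows are kept sorted by row key and that a scan walks a key
range from a start row (inclusive) to a stop row (exclusive).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table whose rows are sorted by row key.
// Each row maps column name -> value. This is NOT the HBase client API.
final class ScanSketch {
    private final NavigableMap<String, Map<String, String>> rows = new TreeMap<>();

    // Store one cell; the row is created on first use.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Range scan: visits only rows with startRow <= key < stopRow,
    // so the cost depends on the size of the range, not of the table.
    List<String> scanRange(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    // Filtered full scan: visits every row and keeps those whose column
    // equals the wanted value, so the cost grows with the table size.
    List<String> scanWhere(String column, String value) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : rows.entrySet()) {
            if (value.equals(e.getValue().get(column))) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }
}
```

If the row key starts with the field you search on most often (a load
date in this made-up example), that search becomes a range scan. A
search on any other column has to read every row, which is why I
suggested it only for the non-real-time case.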

Regards,

Imran

On Mon, Sep 27, 2010 at 10:27 AM, Sharma, Raghvendra
<[email protected]> wrote:
> I am running a little test/poc here.
>
> I need to load a few million rows into a database every day. It's not log
> file data; I have comma-delimited rows (of columns) that would fit a
> relational database exactly.
>
> After the loading, I need to allow a very fast search mechanism. Having
> looked a bit at Google's Bigtable and the structure around it, I
> originally thought of using Hive integrated with HBase, choosing Hive for
> its querying capabilities. The loading works out fine, with better
> performance than the RDBMS. However, the querying bottleneck, which was the
> reason to look for alternatives to the RDBMS in the first place, persists
> with Hive too.
>
> Querying through Hive is not exactly blazing fast. Perhaps I need
> to look for alternatives.
>
> Is there something else? Any other tool/solution/library that I can put on
> top of HBase, or even use without HBase? (I looked at HBase as an
> alternative to the RDBMS, moving towards distributed computing.)
>
> Suggestions please...
>
> --raghav..



-- 
Imran M Yousuf
Entrepreneur & CEO
Smart IT Engineering Ltd.
Dhaka, Bangladesh
Twitter: @imyousuf - http://twitter.com/imyousuf
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
