Re: HBase source

Alexander Alten-Lorenz Wed, 24 Jul 2013 03:07:12 -0700

Flume is an event collection tool: it either polls a source or receives events 
pushed to it. HBase is a database and usually stores data according to a schema 
(column families, CF). You could write a custom source that scans your tables, 
but I really see no sense in such a task, and a full table scan in HBase is 
expensive.
What do you mean by reindexing? HBase indexes rows by row key and documents 
approaches to secondary indexes 
(http://hbase.apache.org/book/secondary.indexes.html), and scans can be 
narrowed with filters. To integrate HBase with Solr, you can use one of the 
tools I mentioned in my previous post or ask on the Solr mailing lists.
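
To illustrate why a full table scan is so costly compared with a scan bounded 
by row key, here is a rough sketch. It does not use the HBase client API: the 
sorted list ROWS, the key format, and the helpers full_scan and range_scan are 
all hypothetical stand-ins that only model a table kept sorted by row key.

```python
import bisect

# Toy stand-in for an HBase table: rows kept sorted by row key.
# This is NOT the HBase client API, only a model of its key ordering.
# Zero-padded keys keep numeric order and lexicographic order identical.
ROWS = [(f"user{i:04d}", {"cf:name": f"name-{i}"}) for i in range(1000)]
KEYS = [key for key, _ in ROWS]


def full_scan(predicate):
    """Visit every row and keep those whose key matches the predicate."""
    visited = 0
    hits = []
    for key, cells in ROWS:
        visited += 1
        if predicate(key):
            hits.append((key, cells))
    return hits, visited


def range_scan(start_row, stop_row):
    """Visit only rows with start_row <= key < stop_row."""
    lo = bisect.bisect_left(KEYS, start_row)
    hi = bisect.bisect_left(KEYS, stop_row)
    hits = ROWS[lo:hi]
    return hits, len(hits)


if __name__ == "__main__":
    _, visited_full = full_scan(lambda k: "user0100" <= k < "user0110")
    _, visited_range = range_scan("user0100", "user0110")
    print("rows visited, full scan: ", visited_full)
    print("rows visited, range scan:", visited_range)
```

Both calls return the same ten rows, but the full scan has to touch all 1000 
rows to find them while the bounded scan touches only the ten in the key range. 
A source that reads a whole table pays the full-scan cost on every pass.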


- Alex

On Jul 24, 2013, at 11:29 AM, Flavio Pompermaier <[email protected]> wrote:

> I was thinking of reindexing my data stored in HBase, and Flume + SolrSink 
> would be perfect for this purpose (although I could obviously write a 
> MapReduce job). Don't you think this could be a common scenario in which 
> Flume could be useful?
> 
> On Wed, Jul 24, 2013 at 11:08 AM, Alexander Alten-Lorenz 
> <[email protected]> wrote:
> Hi,
> 
> No. And from my perspective it doesn't make sense. I think you are looking 
> for tools like https://github.com/Photobucket/Solbase or 
> http://code.google.com/p/hbase-solr-dataimport/.
> 
> - Alex
> 
> On Jul 24, 2013, at 10:51 AM, Flavio Pompermaier <[email protected]> wrote:
> 
> > Hi to all,
> > I'd like to read data from HBase and move it to Solr.
> > Is there an HBase source in Flume or something to read from it?
> >
> > Best,
> > Flavio
> 
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
> 
> 
> 
>
