Your task appears to be more of a periodic batch movement than continuous streaming. Flume is meant for the latter use case.

-roshan

On Wed, Jul 24, 2013 at 3:19 AM, Flavio Pompermaier <[email protected]> wrote:

> In my use case I have a Solr index that proxies access to data stored in
> HBase (I ask Solr for the rowkey of documents matching some query).
> What I'd like to do is to be able to rebuild this Solr index: read the
> JSON or XML stored in each record, map fields to my Solr document and
> commit.
> I know that this is not the main goal of Flume, but I think it could
> also be used for this kind of task.
> I looked at the tools you suggested, but they seem to be very small
> projects and they do not provide interesting features like those in
> morphlines (correct me if I'm wrong!).
>
> Best,
> Flavio
>
> On Wed, Jul 24, 2013 at 12:06 PM, Alexander Alten-Lorenz <[email protected]> wrote:
>
>> Flume is an event collection tool, meaning Flume polls a source or catches
>> events. HBase is a database, and usually stores some kind of data in a
>> schema (CF). You could write a custom source and do a scan on your tables,
>> but really I see no sense in such a task. And a full table scan in HBase is
>> really expensive.
>> What do you mean by reindexing? HBase has primary and secondary indexes
>> (http://hbase.apache.org/book/secondary.indexes.html), which can be
>> processed over filters. To integrate HBase into Solr, you can use one of
>> the tools I mentioned in my previous post or ask the Solr mailing lists.
>>
>> - Alex
>>
>> On Jul 24, 2013, at 11:29 AM, Flavio Pompermaier <[email protected]> wrote:
>>
>> I was thinking of reindexing my data stored in HBase, and Flume + SolrSink
>> were perfect for this purpose (although I could obviously write a MapReduce
>> job).
>> Don't you think this could be a common scenario in which Flume could be
>> useful?
>>
>> On Wed, Jul 24, 2013 at 11:08 AM, Alexander Alten-Lorenz <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> No. And from my perspective it doesn't make sense. I think you are
>>> looking for tools like https://github.com/Photobucket/Solbase or
>>> http://code.google.com/p/hbase-solr-dataimport/.
>>>
>>> - Alex
>>>
>>> On Jul 24, 2013, at 10:51 AM, Flavio Pompermaier <[email protected]> wrote:
>>>
>>> > Hi to all,
>>> > I'd like to read data from HBase and move it to Solr.
>>> > Is there an HBase source in Flume or something to read from it?
>>> >
>>> > Best,
>>> > Flavio
>>>
>>> --
>>> Alexander Alten-Lorenz
>>> http://mapredit.blogspot.com
>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
