FYI there is an ES module in StormCrawler [https://github.com/DigitalPebble/storm-crawler/tree/master/external/elasticsearch]. Its components are quite specific to storm-crawler, apart from the MetricsConsumer <https://github.com/DigitalPebble/storm-crawler/tree/master/external/elasticsearch/src/main/java/com/digitalpebble/storm/crawler/elasticsearch/metrics>, which can be used with any Storm topology, e.g. to build graphs with Kibana. To my knowledge this is not available in ES-Hadoop or storm-elasticsearch. There are a few dependencies on objects from storm-crawler, but these should be easy to circumvent if you wanted to piggyback on the code.
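
To give a rough idea of what such a metrics consumer boils down to, here is a minimal, self-contained sketch of turning one metric data point into a flat, timestamped document that Kibana could graph, plus a daily index name in the spirit of the time-based index rolling mentioned below. This is not the StormCrawler code: the class name, the field names (srcComponentId, name, value, timestamp) and the "metrics" index prefix are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the StormCrawler MetricsConsumer.
public class MetricDocSketch {

    // Daily suffix in UTC, e.g. 2016.04.01 (assumed naming scheme).
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy.MM.dd").withZone(ZoneOffset.UTC);

    // Flatten one data point into a document; field names are illustrative only.
    public static Map<String, Object> toDocument(String component, String name,
                                                 Object value, long epochMillis) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("srcComponentId", component);
        doc.put("name", name);
        doc.put("value", value);
        // ISO-8601 timestamp so a time-based dashboard can bucket on it.
        doc.put("timestamp", Instant.ofEpochMilli(epochMillis).toString());
        return doc;
    }

    // One index per day, derived from the data point's own timestamp.
    public static String dailyIndex(String prefix, long epochMillis) {
        return prefix + "-" + DAY.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long ts = 1459539120000L; // 2016-04-01T19:32:00Z
        System.out.println(dailyIndex("metrics", ts));
        System.out.println(toDocument("fetcher", "num_fetched", 42, ts));
    }
}
```

In a real topology the equivalent logic would sit behind Storm's metrics consumer hook and the documents would be sent to Elasticsearch; the sketch only shows the shape of the data.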

On 1 April 2016 at 19:32, Lakshmanan Muthuraman <[email protected]> wrote:

> 1."ES-Hadoop also provides a good support of time-based index-rolling
> which is great for logging-type use-cases."
> 2. If trident is not needed, ES-Hadoop is simple enough to use.
>
> The above is a big usecase for us that we are able to accomplish with
> ElasticSearch-Hadoop
>
>
> On Wed, Mar 30, 2016 at 11:50 AM, Aaron.Dossett <[email protected]>
> wrote:
>
>> Size of cluster is TBD, but we ultimately want to ingest tens of millions
>> of events per minute.
>>
>> On Mar 30, 2016, at 1:31 PM, Tech Id <[email protected]> wrote:
>>
>>
>> Hey Aaron,
>>
>> Do you have a target throughput that you want to achieve through ES-Bolt ?
>> How many storm machines you plan to run your topology on?
>>
>> Thanks
>>
>>
>> On Wed, Mar 30, 2016 at 10:53 AM, Aaron.Dossett <[email protected]> wrote:
>>
>>> In setting up a storm -> ES topology today I ran across that same fact.
>>> The EsIndexBolt is also synchronous mode only, with no option for asynch.
>>> This week I’ll evaluate adding batching/async to storm-elasticsearch or
>>> switching to ES-hadoop for my use use cases.
>>>
>>> From: Tech Id <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Wednesday, March 30, 2016 at 10:24 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Re: external/storm-elasticsearch - upgrade requested
>>>
>>> One thing I see in favor of the elasticsearch-hadoop is that it provides
>>> batching without Trident.
>>>
>>
>>
>

--
*Open Source Solutions for Text Engineering*

http://www.digitalpebble.com
http://digitalpebble.blogspot.com/
#digitalpebble <http://twitter.com/digitalpebble>
