Hi, inline.

--
Alexander Lorenz
http://mapredit.blogspot.com

On Apr 20, 2012, at 6:22 AM, M. Karthikeyan wrote:

> Thanks Brock for your thoughts.
> A few related questions:
> 1) Is there an out-of-the-box flume source that can monitor an RDBMS and pick
> new rows from there, similar to a tailf on a file?

No, here you have to write your own decorator. Here is a list of
implementations I have found over time:
https://github.com/figarocms/flume-plugins
https://github.com/stampy88/flume-amqp-plugin
https://github.com/thobbs/flume-cassandra-plugin

> 2) For systems that do not want to persist data into secondary storage, does
> flume provide an API for direct integration into the app generating the data?
> I guess the answer should be yes and in that case, is the app considered a
> flume agent or the app generates data in a form that can be consumed by
> another flume agent?
>

Again, here you have to write your own sink, but yes.
Having read all your requirements, I would really push you toward flumeNG.
I wrote a short writeup on my blog:
http://mapredit.blogspot.de/2012/03/flumeng-evolution.html

- Alex

> Thanks & Regards,
> MK
>
> -----Original Message-----
> From: Brock Noland [mailto:br...@cloudera.com]
> Sent: Thursday, April 19, 2012 8:50 PM
> To: flume-user@incubator.apache.org
> Subject: Re: Flume scalability & performance
>
> One mistake below of consequence.
>
> On Thu, Apr 19, 2012 at 2:44 PM, Brock Noland <br...@cloudera.com> wrote:
>> Hi,
>>
>> On Thu, Apr 19, 2012 at 10:04 AM, M. Karthikeyan
>> <m.karthike...@ericsson.com> wrote:
>>> I'm trying to choose between Flume and JMS for a data collection
>>> framework in our multi-node network.
>>> I have the following questions:
>>> 1) From a scalability point of view, how does Flume compare with JMS?
>>> Are there any numbers that can be referred to?
>>> 2) My typical payload for a single message is 2 KB. I expect traffic
>>> of approx. 50 million messages/day. The messages are usually one
>>> sender one receiver type. I require a reasonable level of reliability
>>> (at least the store-and-forward mode in Flume & durable/persistent
>>> messages in JMS). With these considerations, which will give better
>>> performance: Flume or JMS?
>>
>> All of this is extremely dependent on the implementation of JMS you
>> use. JMS is a specification; there are many implementations. Looking
>> at your numbers and assuming all the messages come in within 8 hours
>> (representing peak load), that is about 4MB/second.
>>
>> Both Flume and most JMS implementations should be able to handle this
>> throughput. The advantage of Flume is really configuration. Purchasing
>> and configuring a JMS server and then writing code to interact with
>> the JMS Server is, IMHO, going to be less work than installing and
>> configuring Flume.
>
> I meant to say setting up all that JMS infrastructure is going to be
> *more* work than flume.
>
> Brock
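
For reference, the peak-load estimate quoted above (50 million messages per day at 2 KB each, all arriving within an 8-hour window) can be checked with a few lines of Python. This is only a back-of-envelope sketch: the names are made up for illustration, and it assumes 1 KB = 1000 bytes and 1 MB = 1,000,000 bytes.

```python
# Back-of-envelope check of the peak throughput discussed in the thread.
# Assumptions (not measurements): 1 KB = 1000 bytes, 1 MB = 1_000_000 bytes,
# and the whole day's traffic arrives inside the peak window.

MESSAGES_PER_DAY = 50_000_000   # from the original question
PAYLOAD_BYTES = 2 * 1000        # 2 KB per message
PEAK_WINDOW_HOURS = 8           # peak-load assumption from the reply


def peak_rates(messages, payload_bytes, window_hours):
    """Return (messages per second, megabytes per second) over the window."""
    seconds = window_hours * 3600
    msgs_per_sec = messages / seconds
    mb_per_sec = messages * payload_bytes / seconds / 1_000_000
    return msgs_per_sec, mb_per_sec


msgs_per_sec, mb_per_sec = peak_rates(
    MESSAGES_PER_DAY, PAYLOAD_BYTES, PEAK_WINDOW_HOURS
)
print(f"{msgs_per_sec:.0f} messages/s, {mb_per_sec:.2f} MB/s")
# prints "1736 messages/s, 3.47 MB/s"
```

Under these assumptions the result is roughly 3.5 MB/s, which is in line with the "about 4MB/second" figure once rounded up.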