Hello Jeff,

Thanks for the reply. My use case is not really special. We have multiple products, and each product emits traditional log messages on different servers. I would like to stream those into HDFS. The logs are generally in Apache or log4j format. So, I have many sources from which I want to stream the logs into HDFS. I can have a channel/collector machine where I install Flume.

I guess my question is: do I need to install Flume on the servers where the log messages reside, and do I need to install Flume on the HDFS NameNode too?
Thanks,
- Seshu

On Wed, Feb 6, 2013 at 7:47 PM, Jeff Lord <[email protected]> wrote:

> Seshu,
>
> It really is going to depend on your use case.
> Though it sounds that you may need to run an agent on each of the source
> machines.
> Which source do you plan to use? It may also be the case that you can use
> the flume rpc client to write data directly from your application to the
> flume collector machine.
>
> http://flume.apache.org/FlumeDeveloperGuide.html#rpc-client-interface
>
> -Jeff
>
>
> On Wed, Feb 6, 2013 at 4:49 PM, Seshu V <[email protected]> wrote:
>
>> Hi All,
>>
>> I have used Flume 0.9.3 a while back; it worked fine at that time.
>> Now, I am looking to use 'Flume NG', and started reading documentation today.
>> In Flume 0.9.3, I installed flume agents on the servers wherever I had the
>> data source. And, I had a collector machine separately. My sink was
>> HDFS. I see that Flume NG is using Channel.
>> My question is that I have multiple source servers and my sink is
>> HDFS. I also have another machine for Channel (collector in old days).
>> Do I need to install flume NG on all the source machines and the Channel
>> machine? Or can I install flume NG only on the Channel server and
>> (somehow) specify in the configuration to pull data from source machines
>> and specify the sink as HDFS?
>> Thanks in advance for your replies..
>>
>> Thanks,
>> - Seshu
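
For reference, the two-tier layout Jeff describes (an agent on each source machine forwarding to a collector agent that writes to HDFS) can be sketched as a pair of Flume NG properties files. This is only a minimal illustration: the agent names, hostnames, ports, and paths below (`agent1`, `collector`, `collector.example.com`, `namenode.example.com`, `/var/log/myapp/app.log`) are made up and would need to be adapted. Note that the HDFS sink acts as an HDFS client, so Flume does not need to be installed on the NameNode itself, only somewhere that can reach it.

```properties
# --- On each source server: tail the log file and forward over Avro ---
agent1.sources = tail1
agent1.channels = mem1
agent1.sinks = avro1

agent1.sources.tail1.type = exec
agent1.sources.tail1.command = tail -F /var/log/myapp/app.log
agent1.sources.tail1.channels = mem1

agent1.channels.mem1.type = memory
agent1.channels.mem1.capacity = 10000

agent1.sinks.avro1.type = avro
agent1.sinks.avro1.hostname = collector.example.com
agent1.sinks.avro1.port = 4141
agent1.sinks.avro1.channel = mem1

# --- On the collector machine: receive Avro events and write to HDFS ---
collector.sources = avro1
collector.channels = file1
collector.sinks = hdfs1

collector.sources.avro1.type = avro
collector.sources.avro1.bind = 0.0.0.0
collector.sources.avro1.port = 4141
collector.sources.avro1.channels = file1

collector.channels.file1.type = file

collector.sinks.hdfs1.type = hdfs
collector.sinks.hdfs1.hdfs.path = hdfs://namenode.example.com:8020/flume/logs/%Y-%m-%d
collector.sinks.hdfs1.channel = file1
```

Each file would be started with `flume-ng agent` on its respective machine, naming the agent with `-n agent1` or `-n collector`. The alternative Jeff mentions, embedding the RPC client in the application, would replace the per-server agent entirely.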
