Hi, you could use Avro or Syslog, if possible, or write your own source that runs as a REST API.
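
For the Avro route, an agent config could look roughly like the sketch below. This is only an illustration: the agent and component names (agent, avroSrc, memCh, hdfsSink), the port, the namenode address and the /flume/clicks path are all made up, so adjust them to your setup. Your REST service would send events to the Avro source, and the HDFS sink writes them into one directory per day and hour.

```properties
# Hypothetical agent "agent": Avro source -> memory channel -> HDFS sink
agent.sources = avroSrc
agent.channels = memCh
agent.sinks = hdfsSink

# Avro source the REST service sends events to (port is an assumption)
agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 41414
agent.sources.avroSrc.channels = memCh

# In-memory buffer between source and sink
agent.channels.memCh.type = memory
agent.channels.memCh.capacity = 10000

# HDFS sink; %Y-%m-%d/%H are time escape sequences in the path
# (namenode address and base path are assumptions)
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memCh
agent.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/flume/clicks/%Y-%m-%d/%H
agent.sinks.hdfsSink.hdfs.filePrefix = clicks
agent.sinks.hdfsSink.hdfs.fileType = DataStream
# Roll to a new file every 300 seconds
agent.sinks.hdfsSink.hdfs.rollInterval = 300
```

Note that the time escapes are resolved from each event's "timestamp" header, so the client has to set that header on the events it sends.
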
Yes, Flume will create directories per timestamp. Take a look at the HDFS section in the user guide:
http://archive.cloudera.com/cdh4/cdh/4/flume-ng/FlumeUserGuide.html#h.rxt2g9parmkr

You can use the escape sequences (for example %Y, %m, %d, %H) in the sink path to match your needs. A short article about it:
http://mapredit.blogspot.de/2012/03/flumeng-evolution.html

cheers,
Alex

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF

On Jun 1, 2012, at 7:14 AM, Mohit Anchlia wrote:

> I am looking at integrating flume ng with our rest service API to record
> click stream data. Flow would be browser sends data to this REST service,
> which then acts as a client and send it to flume async. Flume then stores it
> in hdfs. I just want to make sure that this is a right use of flume.
>
> I do have another question, how does flume organizes hdfs files? Does it
> create new directory based on the timestamp? Could someone help me with this
> in understanding how to efficiently organize and store files such that data
> can be clustered based on timestamp?
