Chukwa is a Hadoop subproject aiming to do something similar, though particularly for the case of Hadoop logs. You may find it useful.
Hadoop unfortunately does not support concurrent appends. As a result, the Chukwa project found itself creating a whole new daemon, the Chukwa collector, precisely to merge the event streams and write them out, just once. We're set to do a release within the next week or two, but in the meantime you can check it out from SVN at https://svn.apache.org/repos/asf/hadoop/chukwa/trunk

--Ari

On Fri, Apr 10, 2009 at 12:06 AM, Ricky Ho <[email protected]> wrote:
> I want to analyze the traffic pattern and statistics of a distributed
> application. I am thinking of having the application write the events as log
> entries into HDFS and then later I can use a Map/Reduce task to do the
> analysis in parallel. Is this a good approach ?
>
> In this case, does HDFS support concurrent write (append) to a file ?
> Another question is whether the write API is thread-safe ?
>
> Rgds,
> Ricky

--
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
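[Editor's sketch, not Chukwa's actual code: the collector pattern Ari describes — many agents, one writer — can be illustrated in a few lines of Python. The names (`agent`, `collector`, `sink`) are made up for the example; the point is that because HDFS permits only a single writer per file, the streams are merged through one queue and written once, instead of each agent appending concurrently.]

```python
# Sketch of the single-writer "collector" pattern (illustrative only; Chukwa's
# real collector is a Java daemon writing to HDFS). Multiple agents feed one
# queue; a single writer thread drains it, so every event is written exactly
# once and no concurrent appends are needed.
import queue
import threading

def collector(event_queue, sink, n_agents):
    """Single writer: drain the merged event streams into one output."""
    finished = 0
    while finished < n_agents:
        event = event_queue.get()
        if event is None:          # sentinel: one agent has finished
            finished += 1
            continue
        sink.append(event)         # stand-in for the one HDFS file writer

def agent(name, events, event_queue):
    """Each agent ships its events to the shared collector queue."""
    for e in events:
        event_queue.put(f"{name}\t{e}")
    event_queue.put(None)

q = queue.Queue()
sink = []                          # stand-in for the merged log file
agents = [
    threading.Thread(target=agent, args=(f"agent{i}", ["start", "stop"], q))
    for i in range(3)
]
writer = threading.Thread(target=collector, args=(q, sink, len(agents)))
writer.start()
for t in agents:
    t.start()
for t in agents:
    t.join()
writer.join()
```

After the threads join, `sink` holds all six events (two from each of three agents), each written once by the single writer.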
