Can you elaborate on your use case a bit? At what point would your business logic decide that a file is complete (by time, or some other rule for cutting a file as completed)? And when do you batch process what the stream has piled up for you?
Writing to HDFS (http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample) is pretty straightforward, and doing so in a consumer is not a lot of fuss. Whether you need more layers and overhead all gets back to what you are trying to accomplish and such :) You might need to use ZooKeeper or something to coordinate when to run the batch process (depending on how you kick it off), so the other system knows that what the consumers were doing has completed. A rough sketch of the consumer side is below the quoted message.

On Fri, Aug 30, 2013 at 2:18 PM, Mark <static.void....@gmail.com> wrote:
> What is the quickest and easiest way to write messages from Kafka into
> HDFS? I've come across Camus, but before we go the whole route of writing
> Avro messages we want to test plain old vanilla messages.
>
> Thanks
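Here is roughly what I mean, using the 0.8 high-level consumer API and the Hadoop FileSystem API. The topic name, HDFS paths, and the 10,000-message roll threshold are all made up for illustration; treat it as a starting point, not a finished implementation:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class KafkaToHdfsConsumer {
    public static void main(String[] args) throws Exception {
        // Minimal high-level consumer config: group.id and
        // zookeeper.connect are the only required settings here.
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "hdfs-writer");

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream (thread) for the topic; names are placeholders.
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test-topic", 1);

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
            connector.createMessageStreams(topicCountMap);
        KafkaStream<byte[], byte[]> stream = streams.get("test-topic").get(0);

        FileSystem fs = FileSystem.get(new Configuration());
        int fileIndex = 0;
        long messagesInFile = 0;
        FSDataOutputStream out =
            fs.create(new Path("/kafka-dump/part-" + fileIndex));

        // Blocks waiting for messages; each payload is written as a line.
        for (MessageAndMetadata<byte[], byte[]> msg : stream) {
            out.write(msg.message());
            out.write('\n');

            // "Cut" a file as completed every 10,000 messages. This is the
            // arbitrary business-logic boundary from my question above;
            // swap in whatever time- or size-based rule fits your batch job.
            if (++messagesInFile >= 10000) {
                out.close();
                out = fs.create(new Path("/kafka-dump/part-" + (++fileIndex)));
                messagesInFile = 0;
            }
        }
    }
}

The point where out.close() runs is where you would signal (via ZooKeeper or otherwise) that the file is ready for the batch process to pick up.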