Hey Ted,

It's hard to avoid copying files unless you are able to change your application to talk to HDFS directly (and even then, there are a lot of "gotchas" that you wouldn't have to put up with at an application level -- look at the Chukwa paper).

I would advise looking at Chukwa, http://wiki.apache.org/hadoop/Chukwa, and then rotating logfiles quickly.
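
If you stay with plain copying for now, the rotate-and-copy idea can be sketched roughly as below. This is only an illustration, not how Chukwa works: the function names, the "access.log" naming scheme, and the choice to delete the local file after a successful copy are all assumptions. The only Hadoop piece it relies on is the standard `hadoop fs -put` shell command.

```python
import subprocess
from pathlib import Path


def rotated_logs(log_dir, active_name="access.log"):
    """Return rotated files (e.g. access.log.1), skipping the active log.

    The "access.log" naming scheme is an assumption for illustration.
    """
    return sorted(
        p for p in Path(log_dir).glob(active_name + ".*") if p.is_file()
    )


def put_command(local_path, hdfs_dir):
    """Build the `hadoop fs -put` command line for one local file."""
    return ["hadoop", "fs", "-put", str(local_path), hdfs_dir]


def ship(log_dir, hdfs_dir, run=subprocess.check_call):
    """Copy each rotated file into HDFS, then delete the local copy.

    `run` is injectable so the logic can be exercised without a cluster;
    the default raises if the hadoop command exits non-zero, in which
    case the local file is left in place.
    """
    shipped = []
    for path in rotated_logs(log_dir):
        run(put_command(path, hdfs_dir))
        path.unlink()
        shipped.append(path.name)
    return shipped
```

Something like ship("/var/log/httpd", "/logs/web") run from cron shortly after each rotation would keep the delay small; both paths are made up here. The faster the logs rotate, the fresher the data in HDFS, at the cost of more small files.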

Facebook's Scribe is supposed to do this sort of thing too (and is very impressive), but I'm not familiar with it. At face value, it appears that it might take more effort to get Scribe well integrated, but it would offer more functionality.

Brian

On Sep 7, 2009, at 4:18 AM, Ted Yu wrote:

We're using hadoop 0.20.0 to analyze large log files from web servers.
I am looking for better HDFS support so that I don't have to copy log files
from Linux File System over.

Please comment.

Thanks
