Arkady Borkovsky wrote:
With these points in mind, solution (3) does not look attractive -- at
least not unless the tools for immediate access to such logs are perfect.
Currently we have HTTP access to log files as each line is logged. Is
that not immediate enough?
I like solution (2). Concatenation is a separate issue -- the important
thing is immediate availability.
DFS files are not visible until they are closed. So logs in DFS would
not generally be viewable until the task is complete.
Maybe solution (2) could be modified so that the messages from all tasks
go to a single DFS file -- each line of the logs prefixed with a task ID
and timestamp?
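As a rough sketch (Python; the tab separator and field order are my assumptions, not part of the proposal):

```python
def format_log_line(task_id, timestamp, message):
    # Prefix each line with a timestamp and task ID so that messages from
    # many tasks, interleaved in one shared DFS file, can later be
    # separated per task and ordered in time.  Layout is illustrative only.
    return f"{timestamp}\t{task_id}\t{message}"
```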
That would create an i/o bottleneck. We don't want to be log-bound.
I really think we want to continue to log to a local file, and then
provide easy access to this over HTTP. What's needed are tools to: (a)
display fatal errors as they happen; (b) use logs as MapReduce input, so
that they can, if desired, be efficiently written to DFS for
retrospective analysis; (c) display all messages as they occur (tail
-f). I have reservations about (c), since reading logs from 1000 nodes
blended together in a single stream is not easy. Rather, I think
polling for the last N lines or new lines since last poll (whichever is
smaller) would be more useful, since you'd get some context with each entry.
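That polling rule -- return the last N lines, or only the lines added since the previous poll, whichever is smaller -- could be sketched like this (the class and its interface are hypothetical, purely for illustration):

```python
class TailPoller:
    """Hypothetical client-side poller for a task's log.

    Each poll returns the smaller of: the last max_lines lines, or the
    lines appended since the previous poll.  A first poll therefore gives
    some context; later polls give only what is new.
    """

    def __init__(self, max_lines=50):
        self.max_lines = max_lines
        self.seen = 0  # total line count at the previous poll

    def poll(self, log_lines):
        new = log_lines[self.seen:]           # lines since the last poll
        tail = log_lines[-self.max_lines:]    # last N lines
        result = new if len(new) < len(tail) else tail
        self.seen = len(log_lines)
        return result
```

In practice the lines would come from the per-node HTTP log interface rather than an in-memory list; the selection logic is the same.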
Doug