[ https://issues.apache.org/jira/browse/HADOOP-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511193 ]
Michael Bieniosek commented on HADOOP-1524:
-------------------------------------------

Ah, I see. In that case, your solution sounds good.

> Task Logs userlogs don't show up for a while
> ---------------------------------------------
>
> Key: HADOOP-1524
> URL: https://issues.apache.org/jira/browse/HADOOP-1524
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.13.0
> Reporter: Michael Bieniosek
> Attachments: eliminate-split-idx.patch
>
>
> When I start a task and go to the task logs, nothing shows up for a while.
> An examination of TaskLog.Writer and TaskLog.Reader reveals:
> 1. The TaskLog.Reader relies on the presence of a split.idx to identify the
> parts of the logs to display.
> 2. The TaskLog.Writer only updates the split.idx file when it moves on to the
> next log.
> As a result, updates to the log only get pushed when an entire file is done.
> Why is there a split.idx file? It seems that since files are called
> part-00000, part-00001, etc., the TaskLog.Reader can just look at all files
> and arrange them by alphabetical order. The split.idx file also contains
> file length, but this data is already stored by the filesystem.
> If nobody has objections, I'd like to write a patch to eliminate the
> split.idx file.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
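
For illustration only, the idea in the quoted description (find the part files by name, sort them alphabetically, and take lengths from the filesystem instead of split.idx) might look roughly like the sketch below. It is not the attached eliminate-split-idx.patch and not Hadoop's TaskLog code; the class name PartFileLister and its methods are hypothetical, and it assumes the logs sit in a local directory as zero-padded part-NNNNN files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Hadoop's TaskLog.
final class PartFileLister {

    // Returns the regular files in logDir whose names match "part-*",
    // sorted by file name. Because the numeric suffix is zero-padded
    // (part-00000, part-00001, ...), name order equals write order.
    static List<Path> listParts(Path logDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(logDir, "part-*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    parts.add(p);
                }
            }
        }
        parts.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
        return parts;
    }

    // Sums the current sizes as reported by the filesystem, so a part
    // that is still being written contributes whatever has reached disk.
    static long totalLength(List<Path> parts) throws IOException {
        long total = 0;
        for (Path p : parts) {
            total += Files.size(p);
        }
        return total;
    }
}
```

Since nothing here depends on an index file, a reader built this way would pick up a partially written part file on its next directory scan, which is the behavior the issue asks for.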