[ 
https://issues.apache.org/jira/browse/HADOOP-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

[EMAIL PROTECTED] updated HADOOP-1199:
--------------------------------------

    Attachment: hadoop1199-v2.patch

Version 2.  Keys are no longer LongWritable line numbers. Each key is now a 
compound of host, task id, and line number, e.g. 
debord.archive.org:task_0023_m_000000_0:11223.
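
For anyone consuming these keys downstream, here is a minimal sketch of how
such a compound key might be split back into its parts. TaskLogKey and its
methods are hypothetical names made up for illustration; they are not taken
from the attached patch, which may represent keys differently. The sketch
assumes the host contains no ':' (no port, no IPv6 literal) and that the
line number is the final ':'-separated field.

```java
// Hypothetical helper for illustration only; not part of hadoop1199-v2.patch.
final class TaskLogKey {
    private final String host;
    private final String taskId;
    private final long lineNumber;

    TaskLogKey(String host, String taskId, long lineNumber) {
        this.host = host;
        this.taskId = taskId;
        this.lineNumber = lineNumber;
    }

    // Splits "host:taskid:line" at the first and last ':'.
    // Assumption: the host itself contains no ':' character.
    static TaskLogKey parse(String key) {
        int first = key.indexOf(':');
        int last = key.lastIndexOf(':');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException(
                "expected host:taskid:line, got: " + key);
        }
        String host = key.substring(0, first);
        String taskId = key.substring(first + 1, last);
        // Long.parseLong throws NumberFormatException on a non-numeric field.
        long lineNumber = Long.parseLong(key.substring(last + 1));
        return new TaskLogKey(host, taskId, lineNumber);
    }

    String host() { return host; }
    String taskId() { return taskId; }
    long lineNumber() { return lineNumber; }

    // Re-creates the compound form described above.
    @Override
    public String toString() {
        return host + ":" + taskId + ":" + lineNumber;
    }
}
```

With the example key above, parse() would yield host debord.archive.org,
task id task_0023_m_000000_0, and line number 11223. Splitting on the first
and last ':' rather than on every ':' keeps the sketch working even if a
task id were ever to contain a ':' itself.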



> want InputFormat for task logs
> ------------------------------
>
>                 Key: HADOOP-1199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1199
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Doug Cutting
>         Attachments: hadoop1199-v2.patch, hadoop1199.patch
>
>
> We should provide an InputFormat implementation that includes all the task 
> logs from a job. Folks should be able to do something like:
> job = new JobConf();
> job.setInputFormatClass(TaskLogInputFormat.class);
> TaskLogInputFormat.setJobId(jobId);
> ...
> Tasks should ideally be localized to the node that each log is on.
> Examining logs should be as lightweight as possible, to facilitate debugging. 
> It should not require a copy to HDFS. A faster debug loop is like a faster 
> search engine: it makes people more productive. The sooner one can find that, 
> e.g., most tasks failed with a NullPointerException on line 723, the better. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
