[jira] [Commented] (SPARK-1622) Expose input split(s) accessed by a task in UI or logs

Patrick Wendell (JIRA) Sat, 26 Apr 2014 19:13:25 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982163#comment-13982163
 ]


Patrick Wendell commented on SPARK-1622:
----------------------------------------

I think it would be good to have a general mechanism for RDD implementations to 
add contextual information for each task that ends up in `TaskInfo`. This could 
include values pinned to a specific task as well as counters that can be 
aggregated across all the tasks in a stage.
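
A minimal sketch of what such a mechanism might look like, written in plain 
Python rather than against Spark's real classes. Everything here is an 
assumption made for illustration: `TaskContextInfo`, `set_property`, 
`add_to_counter`, `aggregate_stage_counters`, the counter names and the 
example split paths are all hypothetical and are not part of Spark's API.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TaskContextInfo:
    """Hypothetical per-task record; NOT Spark's actual TaskInfo."""
    task_id: int
    # Values pinned to this one task, e.g. the input split it read.
    properties: dict = field(default_factory=dict)
    # Numeric counters that are meaningful to sum across tasks.
    counters: Counter = field(default_factory=Counter)

    def set_property(self, key, value):
        self.properties[key] = value

    def add_to_counter(self, name, amount=1):
        self.counters[name] += amount


def aggregate_stage_counters(task_infos):
    """Sum every counter over all tasks of a stage."""
    totals = Counter()
    for info in task_infos:
        totals.update(info.counters)
    return totals


# Two tasks of the same stage; the split strings are made-up examples.
t0 = TaskContextInfo(task_id=0)
t0.set_property("input_split", "hdfs://example/data/part-00000:0+1024")
t0.add_to_counter("records_read", 100)
t0.add_to_counter("invalid_records", 2)

t1 = TaskContextInfo(task_id=1)
t1.set_property("input_split", "hdfs://example/data/part-00001:0+2048")
t1.add_to_counter("records_read", 150)

stage_totals = aggregate_stage_counters([t0, t1])
```

Per-task properties would answer "which split did this task read?", while 
the stage-level totals would be the kind of aggregate a UI could display.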

> Expose input split(s) accessed by a task in UI or logs
> ------------------------------------------------------
>
>                 Key: SPARK-1622
>                 URL: https://issues.apache.org/jira/browse/SPARK-1622
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Matei Zaharia
>
> Right now it's hard to debug which input files or blocks therein have invalid 
> data. The InputSplit for a HadoopRDD is not even exposed programmatically in 
> Scala/Java (it's private[spark]).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
