[
https://issues.apache.org/jira/browse/SPARK-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982163#comment-13982163
]
Patrick Wendell edited comment on SPARK-1622 at 4/27/14 2:12 AM:
-----------------------------------------------------------------
I think it would be good to have a general mechanism for RDD implementations to
add contextual information for each task that ends up in `TaskInfo` and
ultimately in the UI. This could include values pinned to a specific task but
also counters that can be aggregated across all the tasks in a stage.
was (Author: pwendell):
I think it would be good to have a general mechanism for RDD implementations to
add contextual information for each task that ends up in `TaskInfo`. This could
include values pinned to a specific task but also counters that can be
aggregated across all the tasks in a stage.
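The mechanism proposed above could look roughly like the following sketch. This is not the actual Spark API; the type names (`TaskContextInfo`) and the counter-aggregation helper are hypothetical, intended only to illustrate "values pinned to a specific task" plus "counters aggregated across all the tasks in a stage":

```scala
// Hypothetical sketch, NOT the real Spark TaskInfo API: per-task contextual
// values (e.g. the input split a task read) alongside stage-level counters
// summed across tasks.
object TaskInfoSketch {

  // Contextual values pinned to a single task, keyed by name.
  final case class TaskContextInfo(taskId: Long, context: Map[String, String])

  // Aggregate named counters across all tasks in a stage by summing
  // each counter's per-task values.
  def aggregateCounters(perTask: Seq[Map[String, Long]]): Map[String, Long] =
    perTask.flatten
      .groupBy { case (name, _) => name }
      .map { case (name, entries) => name -> entries.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    // A task-specific value: which input split this task processed.
    val info = TaskContextInfo(
      taskId = 0L,
      context = Map("inputSplit" -> "hdfs://namenode/data/part-00000:0+67108864"))

    // A stage-level counter aggregated from two tasks.
    val counters = aggregateCounters(
      Seq(Map("recordsRead" -> 100L), Map("recordsRead" -> 250L)))

    println(info.context("inputSplit"))
    println(counters("recordsRead"))
  }
}
```

Under this shape, the UI could render `context` entries on a task's detail row and the aggregated counters on the stage summary page.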
> Expose input split(s) accessed by a task in UI or logs
> ------------------------------------------------------
>
> Key: SPARK-1622
> URL: https://issues.apache.org/jira/browse/SPARK-1622
> Project: Spark
> Issue Type: Improvement
> Reporter: Matei Zaharia
>
> Right now it's hard to debug which input files or blocks therein have invalid
> data. The InputSplit for a HadoopRDD is not even exposed programmatically in
> Scala/Java (it's private[spark]).
--
This message was sent by Atlassian JIRA
(v6.2#6252)