[ https://issues.apache.org/jira/browse/SPARK-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322594#comment-14322594 ]

Jacek Lewandowski commented on SPARK-5745:
------------------------------------------

Thanks [~pwendell] for your reply.

The primary goal is to associate some additional data with the task, so that a 
driver-side listener can collect it afterwards. The data I'd like to collect is 
not directly accessible to the user - say, the number of rows fetched from the 
database, or the number of batches written to the database. These values are 
known inside the job code and could easily be reported through task metrics 
(just as the numbers of read and written bytes are reported now). 
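
For context, a minimal sketch of the driver-side listener I have in mind, using 
the existing {{SparkListener}} API (the printed metric is just an example; the 
custom counters this ticket asks for have no slot in {{TaskMetrics}} today):

{code:scala}
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Driver-side listener that inspects the metrics of every finished task.
class DbMetricsListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val metrics = taskEnd.taskMetrics
    if (metrics != null) {
      // Only the built-in metrics are visible here; there is no slot for
      // custom values such as "rows fetched from the database".
      println(s"task ${taskEnd.taskInfo.taskId}: resultSize=${metrics.resultSize}")
    }
  }
}

// Registered on the driver with: sc.addSparkListener(new DbMetricsListener)
{code}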

If I understand the idea correctly, accumulators are a great feature for 
application-specific metrics, but I don't see how to use them for more general 
metrics - RDD or job execution metrics that belong to an intermediate framework 
or a library rather than to the application itself. 
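
To make the distinction concrete, this is how an application would count rows 
with an accumulator (Spark 1.x API); the problem is that the accumulator has to 
be created and read by the application itself, so an RDD shipped in a library 
has no natural way to register one:

{code:scala}
// Application-level counting with a named accumulator.
val rowsFetched = sc.accumulator(0L, "rowsFetched")

rdd.foreach { row =>
  rowsFetched += 1L // incremented on the executors
}

// The application reads the total on the driver; a library-provided RDD
// has nowhere to publish this value as a regular task metric.
println(s"rows fetched: ${rowsFetched.value}")
{code}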


> Allow to use custom TaskMetrics implementation
> ----------------------------------------------
>
>                 Key: SPARK-5745
>                 URL: https://issues.apache.org/jira/browse/SPARK-5745
>             Project: Spark
>          Issue Type: Wish
>          Components: Spark Core
>            Reporter: Jacek Lewandowski
>
> Various RDDs can be implemented, and {{TaskMetrics}} provides a great API for 
> collecting and aggregating metrics. However, some RDDs may want to register 
> custom metrics (for example, the number of rows read), and the current 
> implementation doesn't allow for this.
> I suppose this could be changed without modifying the whole interface - a 
> factory could be used to create the initial {{TaskMetrics}} object, and the 
> default factory could be overridden by the user.
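
To illustrate the proposal quoted above, a hypothetical sketch of such a 
factory (none of these names exist in Spark, and {{TaskMetrics}} is currently 
{{private[spark]}}, so this assumes it were opened up for extension):

{code:scala}
import org.apache.spark.executor.TaskMetrics

// Hypothetical pluggable factory -- not an existing Spark API.
trait TaskMetricsFactory {
  def newTaskMetrics(): TaskMetrics
}

// A subclass carrying extra, library-specific counters.
// (Assumes TaskMetrics were made extensible; it is private[spark] today.)
class DbTaskMetrics extends TaskMetrics {
  var rowsFetched: Long = 0L
  var batchesWritten: Long = 0L
}

class DbTaskMetricsFactory extends TaskMetricsFactory {
  override def newTaskMetrics(): TaskMetrics = new DbTaskMetrics
}

// Installation could then be a configuration setting, for example:
// conf.set("spark.taskMetrics.factory", classOf[DbTaskMetricsFactory].getName)
{code}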


