[ https://issues.apache.org/jira/browse/SPARK-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322594#comment-14322594 ]
Jacek Lewandowski commented on SPARK-5745:
------------------------------------------

Thanks [~pwendell] for your reply. The primary goal is to associate some additional data with the task, which a driver-side listener can then collect. The data I'd like to collect is not directly accessible to the user: say, the number of rows fetched from the database, or the number of batches written to the database. These values are known inside the job code and could easily be reported to task metrics (just as the number of read/written bytes is reported now). If I understand the idea of accumulators correctly, they are a great feature for application-specific metrics, but I don't see how to use them for more general metrics, such as RDD or job execution metrics that belong to an intermediate framework or library.

> Allow to use custom TaskMetrics implementation
> ----------------------------------------------
>
>                 Key: SPARK-5745
>                 URL: https://issues.apache.org/jira/browse/SPARK-5745
>             Project: Spark
>          Issue Type: Wish
>          Components: Spark Core
>            Reporter: Jacek Lewandowski
>
> Various RDD implementations exist, and {{TaskMetrics}} provides a great API for collecting metrics and aggregating them. However, some RDDs may want to register custom metrics (for example, the number of rows read), and the current implementation doesn't allow for this.
> I suppose this can be changed without modifying the whole interface: a factory could be used to create the initial {{TaskMetrics}} object, and the default factory could be overridden by the user.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
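The factory idea from the issue description could be sketched roughly as below. This is a minimal, self-contained illustration, not Spark's actual internals: the names (`TaskMetricsLike`, `TaskMetricsFactory`, `CassandraTaskMetrics`, the `rowsFetched`/`batchesWritten` counters) are all hypothetical, invented only to show how a library could register extra counters by swapping in its own factory.

```scala
// Hypothetical sketch of the proposed design; names and fields are
// invented for illustration and do not reflect Spark's real TaskMetrics.

// A stand-in for the base metrics object Spark would create per task.
class TaskMetricsLike {
  var bytesRead: Long = 0L
  var bytesWritten: Long = 0L
}

// A library-specific extension adding custom, domain-level counters.
class CassandraTaskMetrics extends TaskMetricsLike {
  var rowsFetched: Long = 0L
  var batchesWritten: Long = 0L
}

// The factory the issue proposes: Spark would call create() when a task
// starts, and the user (or a library) could override the default.
trait TaskMetricsFactory {
  def create(): TaskMetricsLike
}

object DefaultTaskMetricsFactory extends TaskMetricsFactory {
  def create(): TaskMetricsLike = new TaskMetricsLike
}

object CassandraTaskMetricsFactory extends TaskMetricsFactory {
  def create(): TaskMetricsLike = new CassandraTaskMetrics
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Job code would receive the metrics object and update its counters,
    // just as bytesRead/bytesWritten are updated today.
    val metrics = CassandraTaskMetricsFactory.create()
    metrics match {
      case m: CassandraTaskMetrics =>
        m.rowsFetched += 42
        println(m.rowsFetched)
      case _ =>
        println("default metrics in use")
    }
  }
}
```

A driver-side listener would then downcast (or pattern match) the reported metrics object to read the custom counters, which is exactly the kind of framework-level collection the comment above describes.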