GitHub user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r22822398
--- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
@@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager) extends Logging {
       blockManager.get(key) match {
         case Some(blockResult) =>
           // Partition is already materialized, so just return its values
+          val existingMetrics = context.taskMetrics.inputMetrics
+          val prevBytesRead = existingMetrics
+            .filter(_.readMethod == blockResult.inputMetrics.readMethod)
+            .map(_.bytesRead)
+            .getOrElse(0L)
--- End diff --
Is it possible to do @pwendell's suggestion, where you check the type of
the input metrics and only append if it's the same type? I'd actually be
slightly in favor of just returning a list of input metrics, one for each input
type, because the other solutions seem a little hacky -- but I defer to @sryza /
@pwendell here (who I think argued in the past that this extra complexity
wasn't worth it).
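To make the two options concrete, here is a rough sketch of how they might differ. `DataReadMethod`, `InputMetrics`, `mergeIfSameMethod`, and `recordPerMethod` below are simplified, hypothetical stand-ins written for illustration, not Spark's actual classes or the code in this PR. The sketch also assumes that, in the first option, a read with a different method simply replaces the existing metrics.

```scala
// Hypothetical, simplified stand-ins for illustration only; not Spark's real API.
object DataReadMethod extends Enumeration {
  val Memory, Disk, Hadoop, Network = Value
}

final case class InputMetrics(readMethod: DataReadMethod.Value, bytesRead: Long)

object InputMetricsSketch {
  // Option 1: a single InputMetrics per task. Bytes are accumulated only when
  // the existing metrics use the same read method as the incoming read;
  // otherwise (an assumption here) the incoming metrics replace the old ones.
  def mergeIfSameMethod(
      existing: Option[InputMetrics],
      incoming: InputMetrics): InputMetrics = {
    val prevBytesRead = existing
      .filter(_.readMethod == incoming.readMethod)
      .map(_.bytesRead)
      .getOrElse(0L)
    incoming.copy(bytesRead = incoming.bytesRead + prevBytesRead)
  }

  // Option 2: one InputMetrics entry per read method, so reads of different
  // types are tracked side by side and nothing is overwritten.
  def recordPerMethod(
      existing: Map[DataReadMethod.Value, InputMetrics],
      incoming: InputMetrics): Map[DataReadMethod.Value, InputMetrics] = {
    val prevBytesRead = existing
      .get(incoming.readMethod)
      .map(_.bytesRead)
      .getOrElse(0L)
    existing.updated(
      incoming.readMethod,
      incoming.copy(bytesRead = incoming.bytesRead + prevBytesRead))
  }
}
```

Under these assumptions, the first option loses the earlier byte count whenever a task mixes read methods, while the second keeps every count at the cost of a collection-valued field.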