GitHub user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3120#discussion_r22778368
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala ---
    @@ -153,34 +157,19 @@ class NewHadoopRDD[K, V](
               throw new java.util.NoSuchElementException("End of stream")
             }
             havePair = false
    -
    -        // Update bytes read metric every few records
    -        if (recordsSinceMetricsUpdate == HadoopRDD.RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES
    --- End diff --
    
    This was done intentionally to keep the callback updates out of the 
`InputMetrics` class and isolate them in `HadoopRDD`. Callbacks make the 
`InputMetrics` class more complicated and mutable. Since it's an exposed 
class, we really wanted to keep its interface clean and simple, even if that 
meant some extra engineering in `HadoopRDD`. So could this part of the change 
be reverted to how it was before, leaving the `InputMetrics`/`TaskMetrics` 
classes unchanged?
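
    To illustrate the separation being asked for, here is a minimal sketch. It 
is not Spark's actual code: `SimpleInputMetrics`, `PeriodicBytesReadUpdater`, 
and the `bytesReadCallback` parameter are hypothetical names made up for this 
example. The metrics class stays a plain value holder, while the reader-side 
helper owns the callback and decides how often to refresh the byte count.

```scala
// Hypothetical stand-in for an exposed metrics class:
// a plain value holder that knows nothing about callbacks.
final class SimpleInputMetrics {
  var bytesRead: Long = 0L
}

// Hypothetical reader-side helper. It owns the callback and the update
// cadence, so the metrics class above does not have to.
final class PeriodicBytesReadUpdater(
    metrics: SimpleInputMetrics,
    bytesReadCallback: () => Long, // assumed source of the current byte count
    recordsBetweenUpdates: Int) {

  private var recordsSinceUpdate = 0

  // Call once per record; refreshes the metric every
  // `recordsBetweenUpdates` records.
  def onRecord(): Unit = {
    recordsSinceUpdate += 1
    if (recordsSinceUpdate == recordsBetweenUpdates) {
      flush()
    }
  }

  // Call when the reader closes so the final count is not lost.
  def flush(): Unit = {
    metrics.bytesRead = bytesReadCallback()
    recordsSinceUpdate = 0
  }
}
```

    With this shape, the record iterator calls `onRecord()` after each record 
and `flush()` on close. The extra bookkeeping lives next to the reader, and 
the exposed class keeps a single simple field.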

