Hello
I don't know whether this question has already been answered. I am trying to understand the
overlap between FILE_BYTES_READ and HDFS_BYTES_READ. What are the various
components that contribute to each of these counters? For example, when I see
FILE_BYTES_READ for a specific task (Map or Reduce), is it purely due to the
spills during the sort phase? If an HDFS read happens on a non-local node, does the
counter increase on the node where the data block resides? What happens when
the data is local? Do both HDFS_BYTES_READ and
FILE_BYTES_READ increase? From the values I am seeing, this looks to be the case, but I
am not sure.
I am not very fluent in Java, and hence I don't fully understand the source.
:-(
Raj