danielcweeks commented on issue #3062:
URL: https://github.com/apache/iceberg/issues/3062#issuecomment-983306313


   It's been quite a while since I looked at this (prior to Spark 3), but at 
the time, Spark relied entirely on Hadoop FileSystem metrics for tracking 
purposes.  I believe we created a shim that pulled IO metrics from S3FileIO 
and reported them through the Hadoop FileSystem in order to expose this 
information.
   
   I think it is possible to create such a shim in the Iceberg Spark project, 
but we need to be careful not to leak Hadoop packages into S3FileIO, so as not 
to introduce a Hadoop dependency there.  That would mean adding a metric 
callback interface to S3FileIO that the shim can hook into.
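
   A minimal sketch of what such a callback could look like, purely as an 
illustration.  The names `IoMetricsCallback`, `MeteredInputStream` and 
`CountingCallback` are hypothetical and are not part of Iceberg's API.  The 
idea is that the IO layer only sees a plain Java interface, while a 
Spark-side implementation (not shown) would be the only place that touches 
Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical callback interface that would live next to the FileIO
// implementation. It has no Hadoop imports, so no dependency leaks.
interface IoMetricsCallback {
  void onReadOperation();

  void onBytesRead(long bytes);
}

// Hypothetical wrapper that reports each read to the callback.
class MeteredInputStream extends FilterInputStream {
  private final IoMetricsCallback callback;

  MeteredInputStream(InputStream in, IoMetricsCallback callback) {
    super(in);
    this.callback = callback;
  }

  @Override
  public int read() throws IOException {
    int value = super.read();
    callback.onReadOperation();
    if (value >= 0) {
      callback.onBytesRead(1);
    }
    return value;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int count = super.read(buf, off, len);
    callback.onReadOperation();
    if (count > 0) {
      callback.onBytesRead(count);
    }
    return count;
  }
}

// Stand-in for the Spark-side shim. A real shim would forward these
// values to whatever Hadoop FileSystem statistics Spark reads (assumption).
class CountingCallback implements IoMetricsCallback {
  private final LongAdder readOps = new LongAdder();
  private final LongAdder bytesRead = new LongAdder();

  @Override
  public void onReadOperation() {
    readOps.increment();
  }

  @Override
  public void onBytesRead(long bytes) {
    bytesRead.add(bytes);
  }

  long readOps() {
    return readOps.sum();
  }

  long bytesRead() {
    return bytesRead.sum();
  }
}

class MetricsShimDemo {
  public static void main(String[] args) throws IOException {
    CountingCallback callback = new CountingCallback();
    InputStream source = new ByteArrayInputStream(new byte[10]);
    try (InputStream in = new MeteredInputStream(source, callback)) {
      byte[] buf = new byte[4];
      while (in.read(buf) != -1) {
        // Drain the stream; the callback records every read.
      }
    }
    System.out.println("ops=" + callback.readOps() + " bytes=" + callback.bytesRead());
  }
}
```

   With this split, the interface and the stream wrapper could sit in the 
module that holds S3FileIO, and only the callback implementation would need 
to live in the Spark module.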
   
   That may provide a workaround until the upstream Spark metrics framework is 
sorted out.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
