jackye1995 commented on pull request #4254:
URL: https://github.com/apache/iceberg/pull/4254#issuecomment-1062074207


   > I was thinking is a mix-in interface
   
   Yes that could work!
   
   > wouldn't you integrate with the engine rather than with low-level libraries
   
   Correct. However, the current `HadoopMetricsContext` appears to do this 
across all engines, which gives `HadoopFileIO` users a unique benefit: they can 
see their FileIO metrics from every compute platform, aggregated through the 
Hadoop FileSystem interface. 
   
   There is value in providing at least one alternative, such as 
`CollectdMetricsContext`. We have been advocating that `S3FileIO` should not 
depend on Hadoop, so it is awkward to now tell people they have to bring the 
dependency back and go through the Hadoop FileSystem just to report metrics. 
The same applies to other non-Hadoop FileIOs.
   
   Because of that, I think it still makes sense to have a default behavior 
controlled through catalog-level properties: perhaps `io.metrics.default.type`, 
limited to a few specific types such as `hadoop` and `collectd`, and 
`io.metrics.default.namespace`, which provides the namespace for the metrics.
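   
   A rough sketch of how those two properties could be read from the catalog 
properties map. Everything here is an assumption for illustration: the class 
`IoMetricsDefaults` and its methods are hypothetical and not part of Iceberg, 
the property names are only the ones proposed above, and treating an unset 
type as "no default metrics" is my guess rather than a decided behavior.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not an Iceberg API: resolves the proposed catalog properties.
final class IoMetricsDefaults {
  // Property names as proposed in this comment; they are not final.
  static final String TYPE_KEY = "io.metrics.default.type";
  static final String NAMESPACE_KEY = "io.metrics.default.namespace";

  // Only the two types named in this comment; the real list is undecided.
  static final Set<String> SUPPORTED_TYPES = Set.of("hadoop", "collectd");

  private IoMetricsDefaults() {}

  // Returns the configured type, or empty when the property is unset
  // (assumption: unset means no default metrics reporting).
  static Optional<String> resolveType(Map<String, String> catalogProperties) {
    String raw = catalogProperties.get(TYPE_KEY);
    if (raw == null) {
      return Optional.empty();
    }
    String type = raw.trim().toLowerCase(Locale.ROOT);
    if (!SUPPORTED_TYPES.contains(type)) {
      throw new IllegalArgumentException(
          "Unsupported " + TYPE_KEY + ": " + raw + " (expected one of " + SUPPORTED_TYPES + ")");
    }
    return Optional.of(type);
  }

  // Returns the configured namespace, or the caller-supplied fallback when unset.
  static String resolveNamespace(Map<String, String> catalogProperties, String fallback) {
    return catalogProperties.getOrDefault(NAMESPACE_KEY, fallback);
  }
}
```

   With this shape, a catalog configured with `io.metrics.default.type=collectd` 
would pick the collectd-based context without pulling in Hadoop, and an 
unrecognized type would fail fast at catalog initialization.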
   
   (Regarding the metrics namespace: it was discussed in 
https://github.com/apache/iceberg/pull/4254#discussion_r819178970, and it seems to 
be the term that most metrics systems use. Let me know what you think.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


