jackye1995 edited a comment on pull request #4254:
URL: https://github.com/apache/iceberg/pull/4254#issuecomment-1062027218


   > But the metrics reporting that this affects are being passed back to a 
processing engine, which is determined by the environment.
   
   Yes, I agree this is more related to the processing engine than to the 
catalog. The question then becomes: how do we support integration with different 
engines without dynamic loading? We do not really have a way of configuring 
FileIO outside the catalog.
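
   To make the dynamic-loading option concrete, here is a minimal sketch of how a 
metrics context could be picked by class name from the properties that already 
reach FileIO. Everything in it is an assumption for illustration: the 
`MetricsContext` interface is a simplified stand-in rather than the interface in 
this PR, and the `io.metrics-impl` property key and the two implementations are 
hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class MetricsContextLoading {

  /** Hypothetical, simplified stand-in for a metrics context interface. */
  public interface MetricsContext {
    void initialize(Map<String, String> properties);

    void increment(String name, long amount);

    long value(String name);
  }

  /** Hadoop-free implementation that keeps counters in memory. */
  public static class InMemoryMetricsContext implements MetricsContext {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    public InMemoryMetricsContext() {}

    @Override
    public void initialize(Map<String, String> properties) {}

    @Override
    public void increment(String name, long amount) {
      counters.computeIfAbsent(name, key -> new LongAdder()).add(amount);
    }

    @Override
    public long value(String name) {
      LongAdder adder = counters.get(name);
      return adder == null ? 0L : adder.sum();
    }
  }

  /** Fallback used when no implementation is configured; records nothing. */
  public static class NoopMetricsContext implements MetricsContext {
    public NoopMetricsContext() {}

    @Override
    public void initialize(Map<String, String> properties) {}

    @Override
    public void increment(String name, long amount) {}

    @Override
    public long value(String name) {
      return 0L;
    }
  }

  /** Hypothetical property key; not an existing Iceberg catalog property. */
  public static final String METRICS_IMPL = "io.metrics-impl";

  /** Loads the configured implementation reflectively via its no-arg constructor. */
  public static MetricsContext load(Map<String, String> properties) {
    String impl = properties.get(METRICS_IMPL);
    if (impl == null) {
      return new NoopMetricsContext();
    }

    try {
      MetricsContext context =
          Class.forName(impl)
              .asSubclass(MetricsContext.class)
              .getDeclaredConstructor()
              .newInstance();
      context.initialize(properties);
      return context;
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("Cannot load metrics context: " + impl, e);
    }
  }
}
```

   With something like this, an environment without Hadoop on the classpath would 
only need to set the property to a Hadoop-free class, and the default path would 
never touch Hadoop classes.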
   
   The issue with `HadoopMetricsContext` is that it ties us back to 
Hadoop. We have seen users trying to run Iceberg in environments without a Hadoop 
dependency (e.g. AWS Kinesis serverless Flink), and people don't want to load 
Hadoop dependencies just to get metrics. So I think we definitely need 
some other implementations.
   
   Our plan with this PR is to later introduce a metrics context such as 
`CollectdMetricsContext`, which could then be tied to any monitoring 
service that typically has an agent process running in the background to 
collect metrics. Our target is AWS CloudWatch integration, but I think most products 
offer a similar integration pattern.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


