jackye1995 edited a comment on pull request #4254: URL: https://github.com/apache/iceberg/pull/4254#issuecomment-1062027218
> But the metrics reporting that this affects are being passed back to a processing engine, which is determined by the environment.

Yes, I agree this is more related to the processing engine than to the catalog. The question then becomes: how do we support integration with different engines without dynamic loading? We do not really have a way of configuring FileIO outside the catalog.

The issue with HadoopMetricsContext is that it ties us to Hadoop again. We have seen users trying to run Iceberg in environments without a Hadoop dependency (e.g. AWS Kinesis serverless Flink), and people don't want to load Hadoop dependencies just to get metrics. So I think we definitely need some other implementations.

Our plan with this PR is to later introduce metrics contexts such as `CollectdMetricsContext`, which could then be tied to any monitoring service that typically has an agent process running in the background to collect metrics. Our target is AWS CloudWatch integration, but I think most products offer a similar integration pattern.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
