swuferhong commented on PR #22805: URL: https://github.com/apache/flink/pull/22805#issuecomment-1600493415
> @luoyuxia I find the code is called in multiple places. If we make it configurable, we need to change more modules and we get more parameters. If we set the parameter in the Hadoop config, both ORC and Parquet can use it. Could you give me some ideas?

Hi, did you run into slow ORC statistics reporting while using the Hive connector? If so, I think you can add this parameter to `HiveOptions` as a Flink conf, and then copy that Flink conf into the job conf in `HiveSourceBuilder.setFlinkConfigurationToJobConf()` (the jobConf is added to the hadoopConf in the Hive source). That way you can read the parameter from `hadoopConf`; if it is not present in `hadoopConf`, you can default it to `Runtime.getRuntime().availableProcessors()`. WDYT, @luoyuxia?
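A rough sketch of the lookup-with-default idea, in case it helps the discussion. This is not the PR's code: the key name `example.read-statistics.thread-num` and the class `StatisticsThreadNum` are made up for illustration, and a plain `Map<String, String>` stands in for the Hadoop configuration so the snippet runs without Flink or Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class StatisticsThreadNum {
    // Hypothetical key name, for illustration only; the real option would live in HiveOptions.
    static final String KEY = "example.read-statistics.thread-num";

    // Reads the thread count from the conf, falling back to the number of
    // available processors when the key is absent or blank.
    static int resolve(Map<String, String> conf) {
        String raw = conf.get(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return Runtime.getRuntime().availableProcessors();
        }
        int threadNum = Integer.parseInt(raw.trim());
        if (threadNum < 1) {
            throw new IllegalArgumentException(KEY + " must be at least 1, got " + threadNum);
        }
        return threadNum;
    }

    public static void main(String[] args) {
        // The map stands in for hadoopConf (an assumption made for this sketch).
        Map<String, String> conf = new HashMap<>();
        System.out.println("default: " + resolve(conf));

        conf.put(KEY, "4");
        System.out.println("configured: " + resolve(conf));
    }
}
```

With the real classes, the same fallback could presumably be expressed through Hadoop's `Configuration.getInt(name, defaultValue)`, passing `Runtime.getRuntime().availableProcessors()` as the default.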
