zhangjun0x01 commented on PR #494: URL: https://github.com/apache/flink-table-store/pull/494#issuecomment-1415635635
My initial idea was also to add the parameters to `CoreOptions`, read the configuration with `options.get(ORC_BLOOM_FILTER_FORMAT)`, and finally set it on the ORC write properties (`org.apache.orc.OrcConf#BLOOM_FILTER_COLUMNS`). But I found that `OrcFileFormatFactory` already collects all properties under the `orc` prefix through a `DelegatingConfiguration` ([code](https://github.com/apache/flink-table-store/blob/c8c3cd1d93012d6eba6a7f7639435d489e0cffd8/flink-table-store-format/src/main/java/org/apache/flink/table/store/format/orc/OrcFileFormatFactory.java#L41)):

```
private Configuration supplyDefaultOptions(Configuration options) {
    // it is DelegatingConfiguration
    if (!options.containsKey("compress")) {
        Properties properties = new Properties();
        options.addAllToProperties(properties);
        properties.setProperty("compress", "lz4");
        Configuration newOptions = new Configuration();
        properties.forEach((k, v) -> newOptions.setString(k.toString(), v.toString()));
        return newOptions;
    }
    return options;
}
```

and then adds them to the ORC write options:

```
// org.apache.flink.table.store.format.orc.OrcFileFormat#getOrcProperties
private static Properties getOrcProperties(ReadableConfig options) {
    Properties orcProperties = new Properties();
    Properties properties = new Properties();
    ((org.apache.flink.configuration.Configuration) options).addAllToProperties(properties);
    properties.forEach((k, v) -> orcProperties.put(IDENTIFIER + "." + k, v));
    return orcProperties;
}
```

If I read the properties with `options.get(xxxx)` and then set them on the ORC write options again, it would be redundant. So when we create a table, we can already enable the feature like this:
```
CREATE TABLE word_count (
    word STRING PRIMARY KEY NOT ENFORCED,
    cnt BIGINT
) WITH (
    'orc.bloom.filter.columns' = 'cnt',
    'orc.bloom.filter.fpp' = '0.05'
)
```

I don't know why this design was adopted in the first place, [including the compression method of ORC](https://github.com/apache/flink-table-store/blob/c8c3cd1d93012d6eba6a7f7639435d489e0cffd8/flink-table-store-format/src/main/java/org/apache/flink/table/store/format/orc/OrcFileFormatFactory.java#L41), which is currently hard-coded. I think we should add a factory class so that users can configure it, and then we can get the user's configuration with `options.get(xxxx)`.
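
To make the pass-through described above concrete, here is a minimal standalone sketch in plain Java with no Flink or ORC classes. The class `OrcOptionPassThrough` and its `stripPrefix` method are hypothetical names used only for illustration. Plain maps stand in for `DelegatingConfiguration`, and `bucket` is just an example of a non-ORC table option. It is not the actual Table Store implementation; it only mirrors the three steps: strip the `orc.` prefix, default `compress` to `lz4`, and re-add the prefix for the ORC writer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

class OrcOptionPassThrough {
    static final String IDENTIFIER = "orc";

    // Stand-in for the prefix-delegating view: keep only "orc."-prefixed keys
    // and expose them without the prefix.
    static Map<String, String> stripPrefix(Map<String, String> tableOptions) {
        String prefix = IDENTIFIER + ".";
        Map<String, String> formatOptions = new LinkedHashMap<>();
        tableOptions.forEach((k, v) -> {
            if (k.startsWith(prefix)) {
                formatOptions.put(k.substring(prefix.length()), v);
            }
        });
        return formatOptions;
    }

    // Same idea as supplyDefaultOptions: fall back to lz4 when the user
    // did not set a compression codec.
    static Map<String, String> supplyDefaultOptions(Map<String, String> formatOptions) {
        Map<String, String> result = new LinkedHashMap<>(formatOptions);
        result.putIfAbsent("compress", "lz4");
        return result;
    }

    // Same idea as getOrcProperties: re-add the "orc." prefix so the keys
    // match what the ORC writer expects.
    static Properties toOrcProperties(Map<String, String> formatOptions) {
        Properties orcProperties = new Properties();
        formatOptions.forEach((k, v) -> orcProperties.setProperty(IDENTIFIER + "." + k, v));
        return orcProperties;
    }

    public static void main(String[] args) {
        Map<String, String> tableOptions = new LinkedHashMap<>();
        tableOptions.put("orc.bloom.filter.columns", "cnt");
        tableOptions.put("orc.bloom.filter.fpp", "0.05");
        tableOptions.put("bucket", "4"); // unrelated option, must not reach ORC

        Properties orc = toOrcProperties(supplyDefaultOptions(stripPrefix(tableOptions)));
        System.out.println(orc.getProperty("orc.bloom.filter.columns"));
        System.out.println(orc.getProperty("orc.compress"));
    }
}
```

With this shape, any `orc.*` key in the `WITH` clause reaches the writer without a dedicated option being declared, which is why an extra `options.get(xxxx)` lookup would duplicate the work.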
