PrabhuJoseph commented on code in PR #8512:
URL: https://github.com/apache/hudi/pull/8512#discussion_r1177726006
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/sort/SortOperator.java:
##########
@@ -89,7 +95,7 @@ public void open() throws Exception {
binarySerializer,
computer,
comparator,
- getContainingTask().getJobConfiguration());
+ conf);
Review Comment:
You are right; the configuration used to instantiate HoodieTableSink contained
only the default Flink configuration values plus the SQL options. The
DynamicTableFactory#context is the right one to use: it comes from the Planner
and carries the configs specified in both flink-conf.yaml and the SqlClient
session, with the latter taking precedence. I ran the test below to verify that
the new changes pick up the configuration from the context as expected.
1. Add table.exec.spill-compression.block-size: 40 m to
/etc/flink/conf/flink-conf.yaml
2. Create a Hudi table in a Flink sql-client session with 'write.operation' =
'bulk_insert'.
3. Insert some rows into the table.
4. taskmanager.log shows the expected compressionBlockSize (40 MiB = 41943040 bytes):
```
taskmanager.log:2023-04-26 09:26:43,130 INFO
org.apache.flink.table.runtime.operators.sort.BinaryExternalSorter [] -
BinaryExternalSorter with initial memory segments 16384,
maxNumFileHandles(128), compressionEnable(true), compressionCodecFactory(class
org.apache.flink.runtime.io.compression.Lz4BlockCompressionFactory),
compressionBlockSize(41943040).
```
5. Set the option below in the session and rerun the insert command;
taskmanager.log shows the overridden compressionBlockSize (80 MiB = 83886080 bytes):
```
Flink SQL> set 'table.exec.spill-compression.block-size' = '80 m';
taskmanager.log:2023-04-26 09:29:14,906 INFO
org.apache.flink.table.runtime.operators.sort.BinaryExternalSorter [] -
BinaryExternalSorter with initial memory segments 16384,
maxNumFileHandles(128), compressionEnable(true), compressionCodecFactory(class
org.apache.flink.runtime.io.compression.Lz4BlockCompressionFactory),
compressionBlockSize(83886080).
```
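As a sanity check on the two logged values, here is a minimal sketch of the precedence and the size arithmetic. It does not use Flink's API: plain maps stand in for the flink-conf.yaml and session layers, and `SpillBlockSizeSketch`, `mebibytesToBytes`, and `merge` are hypothetical names made up for illustration. It assumes "m" means mebibytes, which matches the byte counts in the logs above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: plain maps stand in for Flink's configuration layers.
final class SpillBlockSizeSketch {

    static final String KEY = "table.exec.spill-compression.block-size";

    // Converts a size such as "40 m" to bytes, treating "m" as mebibytes (1024 * 1024 bytes).
    // Hypothetical helper; it handles only the "<number> m" form used in this comment.
    static long mebibytesToBytes(String value) {
        String trimmed = value.trim().toLowerCase();
        if (!trimmed.endsWith("m")) {
            throw new IllegalArgumentException("Expected a size like '40 m': " + value);
        }
        long mebibytes = Long.parseLong(trimmed.substring(0, trimmed.length() - 1).trim());
        return mebibytes * 1024L * 1024L;
    }

    // Session options are applied last, so they override values from flink-conf.yaml.
    static Map<String, String> merge(Map<String, String> fileConf, Map<String, String> sessionConf) {
        Map<String, String> merged = new LinkedHashMap<>(fileConf);
        merged.putAll(sessionConf);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> fileConf = Map.of(KEY, "40 m");
        Map<String, String> noOverride = merge(fileConf, Map.of());
        Map<String, String> withOverride = merge(fileConf, Map.of(KEY, "80 m"));
        System.out.println(mebibytesToBytes(noOverride.get(KEY)));   // 41943040
        System.out.println(mebibytesToBytes(withOverride.get(KEY))); // 83886080
    }
}
```

The two printed values line up with compressionBlockSize in steps 4 and 5 respectively.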