KevinyhZou opened a new pull request, #7189:
URL: https://github.com/apache/hudi/pull/7189
…eaming append write
### Change Logs
Add support for writing a success file to a finished partition in Flink
streaming append write:
1. Add a `PartitionSuccessFileWriteSink` class, which writes a success
file to the partition;
2. Add Flink options:
   - `partition.write-success-file.enable`: whether to enable this feature
     in the Flink job;
   - `partition.write-success-file.delay`: the delay before the success file
     is written;
   - `partition.time-extractor.timestamp-formatter`: the timestamp format used
     to extract the timestamp from the partition path;
3. Change the `HoodieTableSink` class to extract the partition path from
each streaming record and emit it downstream.
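
To illustrate how the timestamp options and the delay could interact, here is a minimal sketch. It is not the code in this PR: the class `PartitionTimeSketch` and its methods are hypothetical names, and the rule "partition time plus delay has passed" is an assumption about what "finished partition" means.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;

// Hypothetical helper, NOT the PR's actual implementation.
class PartitionTimeSketch {

    // Fill each $column placeholder in the pattern with the partition value,
    // then parse the result with the given formatter, e.g.
    // "$day $hour:00:00" + {day=2022-11-11, hour=08} -> 2022-11-11T08:00.
    // Naive replacement: assumes no column name is a prefix of another.
    static LocalDateTime extract(String pattern,
                                 Map<String, String> partitionValues,
                                 String formatter) {
        String filled = pattern;
        for (Map.Entry<String, String> e : partitionValues.entrySet()) {
            filled = filled.replace("$" + e.getKey(), e.getValue());
        }
        return LocalDateTime.parse(filled, DateTimeFormatter.ofPattern(formatter));
    }

    // Assumption: a partition counts as finished once the current time
    // (or watermark) reaches partition time + delay.
    static boolean isFinished(LocalDateTime partitionTime,
                              Duration delay,
                              LocalDateTime now) {
        return !now.isBefore(partitionTime.plus(delay));
    }
}
```

Under these assumptions, an hourly partition `day=2022-11-11, hour=08` with a delay of `1h` would become eligible for its success file at 09:00 on the same day.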
### Impact
This applies to Flink SQL jobs that consume from a message queue and write
into Hudi. To use this feature, set the options below in the DDL of the Hudi
sink table:
```
create table test_out(
id bigint,
data string,
`day` string,
`hour` string
) partitioned by (`day`, `hour`)
with (
'connector' = 'hudi',
'partition.time-extractor.timestamp-pattern' = '$day $hour:00:00',
'partition.write-success-file.delay' = '1h', -- set the delay to 1h for hourly partitions, 1day for daily partitions
'partition.write-success-file.enable' = 'true',
'table.type' = 'COPY_ON_WRITE',
'write.operation' = 'insert',
'write.tasks' = '5',
'write.insert.cluster' = 'false',
'metadata.enabled' = 'true',
'path' = 'hdfs://testcluster/user/test/hudi/test_out14'
);
```
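
A downstream job could use the success file as a readiness signal before reading a partition. The sketch below is an assumption-heavy illustration, not part of this PR: the marker name `_SUCCESS`, the `<table>/<day>/<hour>` directory layout, and the class `SuccessMarkerSketch` are all hypothetical, and it checks a local path rather than HDFS to stay self-contained.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical consumer-side check, NOT part of the PR.
class SuccessMarkerSketch {

    // Assumed marker name; the PR description does not state the file name.
    static final String MARKER = "_SUCCESS";

    // Assumed layout: <tablePath>/<day>/<hour>/_SUCCESS.
    // Returns true only once the marker exists in the partition directory.
    static boolean isPartitionReady(Path tablePath, String day, String hour) {
        return Files.exists(tablePath.resolve(day).resolve(hour).resolve(MARKER));
    }
}
```

A scheduler could poll `isPartitionReady` and start the batch job for a partition only once it returns `true`.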
### Risk level (write none, low, medium or high below)
LOW
### Documentation Update