liubo1022126 opened a new issue #3916: URL: https://github.com/apache/iceberg/issues/3916
@openinx @rdblue I sort data within partitions by column to improve query performance, e.g. `insert overwrite tableA partition(pt='20220118') select id,name,age from tableA where pt='20220118' order by id;`. The table has `write.format.default=orc` and `'write.target-file-size-bytes'='134217728'`. However, each partition ends up with only a single large data file. I found that [the ORC writer does not yet support checking the target file size before the file is closed](https://github.com/apache/iceberg/pull/1213#discussion_r459197243). Because every partition contains just one large data file, I cannot filter data files at planning time as described in https://iceberg.apache.org/#performance/#data-filtering.

So if I want to use the ORC file format, how can I roll to a new file?

By the way, a Flink streaming job rolls to a new file on every checkpoint. What is the difference with a batch job? Why can't a batch job roll?
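For reference, the setup described above can be sketched as follows. This is only a restatement of the configuration from the question in Spark SQL; the table name `tableA`, the partition value, and the property values are taken directly from the description, and `ALTER TABLE ... SET TBLPROPERTIES` is assumed as the way the properties were applied:

```sql
-- Table properties from the setup above (values as given in the question):
ALTER TABLE tableA SET TBLPROPERTIES (
  'write.format.default' = 'orc',
  'write.target-file-size-bytes' = '134217728'  -- 128 MB target file size
);

-- Sorted overwrite of a single partition, as in the question:
INSERT OVERWRITE tableA PARTITION (pt = '20220118')
SELECT id, name, age
FROM tableA
WHERE pt = '20220118'
ORDER BY id;
```

With this configuration, the expectation is that the writer rolls to a new file once 128 MB is reached, which is what does not happen for ORC in the scenario described.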
