QianyongY opened a new pull request, #2601: URL: https://github.com/apache/orc/pull/2601
### What changes were proposed in this pull request?

Extends the Java merge tool. For inputs that share the same schema, the tool still merges them into a single ORC file by default. With `-m` / `--maxSize`, it instead writes multiple ORC files named `part-xxxxx.orc` under an output directory, batching inputs by their on-disk file size.

### Why are the changes needed?

As the title says: users often need to merge many compatible ORC files without producing a single huge output file. This adds an optional mode that caps output size at whole-file boundaries, while keeping the existing single-file behavior when `--maxSize` is not set.

### How was this patch tested?

Added unit tests.
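To illustrate what "batching by on-disk input file size" at whole-file boundaries could look like, here is a minimal sketch. It is not the code from this pull request: the class `MergeBatchSketch`, the helpers `planBatches` and `partName`, the greedy grouping policy, the five-digit zero-padded numbering starting at 0, and the choice to let a single oversized input form its own batch are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names and policy are assumptions, not the PR's implementation.
public class MergeBatchSketch {

    // Greedily groups input files (by index) so that the summed on-disk size of
    // each group stays within maxSize. Files are never split: a file larger than
    // maxSize ends up alone in its own group (assumed behavior).
    static List<List<Integer>> planBatches(long[] fileSizes, long maxSize) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < fileSizes.length; i++) {
            // Close the current batch if adding this file would exceed the cap.
            if (!current.isEmpty() && currentBytes + fileSizes[i] > maxSize) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i);
            currentBytes += fileSizes[i];
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Assumed naming scheme for the part-xxxxx.orc outputs.
    static String partName(int index) {
        return String.format(Locale.ROOT, "part-%05d.orc", index);
    }

    public static void main(String[] args) {
        long[] sizes = {40, 50, 30, 120, 10}; // made-up input sizes in bytes
        List<List<Integer>> batches = planBatches(sizes, 100);
        for (int b = 0; b < batches.size(); b++) {
            System.out.println(partName(b) + " <- inputs " + batches.get(b));
        }
    }
}
```

Because grouping uses input sizes rather than the size of the merged result, a cap applied this way is only approximate: the merged output of a batch may be somewhat larger or smaller than the sum of its inputs.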
