QianyongY opened a new pull request, #2601:
URL: https://github.com/apache/orc/pull/2601

   ### What changes were proposed in this pull request?
   Extends the Java merge tool for inputs that share the same schema. By 
default it still merges them into a single ORC file. With -m / --maxSize it 
instead writes multiple ORC files under an output directory, named 
part-xxxxx.orc, batching the inputs by their on-disk file size.
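
   The batching idea can be sketched roughly as below. This is not the code 
from this patch: the class `MergeBatchSketch`, the `InputFile` holder, and the 
`batchBySize` and `partName` helpers are hypothetical names used only for 
illustration. The sketch assumes a greedy pass in input order, that a single 
input larger than the limit gets a batch of its own, and that part-xxxxx.orc 
means a five-digit zero-padded index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; names and policy are assumptions, not the patch's code.
class MergeBatchSketch {

  // Hypothetical holder for an input path and its on-disk size in bytes.
  static final class InputFile {
    final String path;
    final long sizeBytes;

    InputFile(String path, long sizeBytes) {
      this.path = path;
      this.sizeBytes = sizeBytes;
    }
  }

  // Groups inputs, in order, so that each batch's total size stays at or below
  // maxSizeBytes. Files are never split: an input that alone exceeds the limit
  // ends up in a batch by itself (an assumption of this sketch).
  static List<List<InputFile>> batchBySize(List<InputFile> inputs, long maxSizeBytes) {
    List<List<InputFile>> batches = new ArrayList<>();
    List<InputFile> current = new ArrayList<>();
    long currentBytes = 0;
    for (InputFile f : inputs) {
      // Close the current batch if adding this file would exceed the limit.
      if (!current.isEmpty() && currentBytes + f.sizeBytes > maxSizeBytes) {
        batches.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(f);
      currentBytes += f.sizeBytes;
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }

  // Assumed naming scheme: five-digit zero-padded batch index.
  static String partName(int index) {
    return String.format(Locale.ROOT, "part-%05d.orc", index);
  }

  public static void main(String[] args) {
    List<InputFile> inputs = List.of(
        new InputFile("a.orc", 40),
        new InputFile("b.orc", 50),
        new InputFile("c.orc", 30),
        new InputFile("d.orc", 120));
    List<List<InputFile>> batches = batchBySize(inputs, 100);
    for (int i = 0; i < batches.size(); i++) {
      StringBuilder names = new StringBuilder();
      for (InputFile f : batches.get(i)) {
        names.append(' ').append(f.path);
      }
      System.out.println(partName(i) + " <-" + names);
    }
  }
}
```

   Batching on input file size means the limit applies to the sum of the 
input sizes in a batch, so the size of a written output may differ somewhat 
from that sum.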
   
   ### Why are the changes needed?
   Users often need to merge many compatible ORC files without producing a 
single huge output file. This adds an optional mode that caps the size of each 
output at whole-file boundaries, so no input file is split across outputs, 
while keeping the existing single-file behavior when --maxSize is not set.
   
   ### How was this patch tested?
   Added unit tests.


