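TableBatchReader doesn't do any array concatenation -- it only iterates over the slices of the table that can be represented as a RecordBatch without copying. set_chunksize() therefore only sets an upper bound on the batch length; it cannot make batches larger than the underlying chunks allow. When the columns of the table have different chunk layouts, this can result in small batches.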
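
To illustrate why mismatched chunk layouts lead to small batches, here is a minimal sketch in plain C++ (no Arrow dependency). It is a simplified model of the behaviour described above, not the actual Arrow implementation; the function name ZeroCopyBatchSizes and its parameters are hypothetical and chosen for illustration only. It assumes every batch must lie inside a single chunk of every column, and that the chunk size setting acts only as an upper bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Arrow): given the chunk lengths of each
// column, return the row counts of the batches produced when every batch
// must fit inside one chunk of every column, so no concatenation is needed.
std::vector<int64_t> ZeroCopyBatchSizes(
    const std::vector<std::vector<int64_t>>& chunk_lengths_per_column,
    int64_t max_chunksize) {
  std::vector<int64_t> sizes;
  const std::size_t num_columns = chunk_lengths_per_column.size();
  if (num_columns == 0) return sizes;
  assert(max_chunksize > 0);

  // All columns of a table hold the same number of rows.
  const auto& first = chunk_lengths_per_column[0];
  const int64_t total_rows =
      std::accumulate(first.begin(), first.end(), int64_t{0});
  for (const auto& lengths : chunk_lengths_per_column) {
    assert(std::accumulate(lengths.begin(), lengths.end(), int64_t{0}) ==
           total_rows);
  }

  std::vector<std::size_t> chunk_index(num_columns, 0);  // current chunk
  std::vector<int64_t> offset(num_columns, 0);  // rows consumed in that chunk
  int64_t remaining = total_rows;

  while (remaining > 0) {
    // The requested chunk size is only an upper bound.
    int64_t batch = std::min(remaining, max_chunksize);
    for (std::size_t c = 0; c < num_columns; ++c) {
      const auto& lengths = chunk_lengths_per_column[c];
      // Move past chunks that are fully consumed (or empty).
      while (offset[c] == lengths[chunk_index[c]]) {
        ++chunk_index[c];
        offset[c] = 0;
      }
      // A batch cannot cross this column's next chunk boundary.
      batch = std::min(batch, lengths[chunk_index[c]] - offset[c]);
    }
    for (std::size_t c = 0; c < num_columns; ++c) offset[c] += batch;
    sizes.push_back(batch);
    remaining -= batch;
  }
  return sizes;
}
```

Under this model, two columns of 200 rows chunked as {100, 100} and {90, 110} yield batches of 90, 10, and 100 rows even with a very large chunk size limit, because a batch ends wherever any column's chunk ends. If larger batches matter more than avoiding a copy, one option is to consolidate the chunks before reading, for example with Table::CombineChunks, at the cost of copying the data.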

On Fri, Aug 7, 2020 at 1:10 PM Yifei Yang <yiy...@eng.ucsd.edu> wrote:
>
> Hello,
>
> I'm using TableBatchReader to process tables. I wonder is there any way to
> set the batch size or the number of tuples in a record batch? Sometimes a
> batch only contains tens of tuples, which slows down the processing a lot. I
> tried TableBatchReader::set_chunksize(), but got no change. Thanks!