tustvold commented on issue #5450: URL: https://github.com/apache/arrow-rs/issues/5450#issuecomment-1973960133
Currently this is expected behaviour: row groups are only automatically "closed" based on row count. I would suggest the following:

* Document that AsyncArrowWriter's buffer size is not authoritative, and is bounded by the row group size
* Add the ability to limit the maximum size of a row group before ArrowWriter starts a new row group. As it writes to separate memory regions per column, it can actually enforce this, unlike the underlying SerializedFileWriter
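
To make the difference between the two policies concrete, here is a minimal, self-contained sketch of a writer that closes a row group on row count alone versus one that also honours a byte limit. It is a toy model, not arrow-rs code: `FlushPolicy`, `ToyWriter`, `simulate` and the `max_bytes` option are all hypothetical names invented for illustration, the limits are arbitrary, and unlike the real writer it never splits a batch across row groups.

```rust
/// Toy model of a row-group flush policy. Every name here is hypothetical
/// and is NOT part of the arrow-rs or parquet crates.
#[derive(Debug, Clone, Copy)]
struct FlushPolicy {
    /// Close the row group once at least this many rows are buffered.
    max_rows: usize,
    /// Optionally also close it once at least this many bytes are buffered.
    max_bytes: Option<usize>,
}

#[derive(Debug, Default)]
struct ToyWriter {
    buffered_rows: usize,
    buffered_bytes: usize,
    /// (rows, bytes) of every row group closed so far.
    closed: Vec<(usize, usize)>,
}

impl ToyWriter {
    /// Buffer one batch, then close the row group if a limit was reached.
    fn write(&mut self, policy: FlushPolicy, rows: usize, bytes: usize) {
        self.buffered_rows += rows;
        self.buffered_bytes += bytes;
        let rows_hit = self.buffered_rows >= policy.max_rows;
        let bytes_hit = policy
            .max_bytes
            .map_or(false, |max| self.buffered_bytes >= max);
        if rows_hit || bytes_hit {
            self.flush();
        }
    }

    /// Record the buffered data as a closed row group and reset the buffer.
    fn flush(&mut self) {
        if self.buffered_rows > 0 {
            self.closed.push((self.buffered_rows, self.buffered_bytes));
            self.buffered_rows = 0;
            self.buffered_bytes = 0;
        }
    }
}

/// Feed a sequence of (rows, bytes) batches through a fresh writer.
fn simulate(policy: FlushPolicy, batches: &[(usize, usize)]) -> ToyWriter {
    let mut writer = ToyWriter::default();
    for &(rows, bytes) in batches {
        writer.write(policy, rows, bytes);
    }
    writer
}

fn main() {
    // Ten batches of 1,000 wide rows, 1 MiB per batch (made-up numbers).
    let batches = vec![(1_000usize, 1usize << 20); 10];

    // Row count only: the row limit is never reached, so memory keeps growing.
    let rows_only = FlushPolicy { max_rows: 1_000_000, max_bytes: None };
    let w = simulate(rows_only, &batches);
    println!("rows only:  {} closed, {} bytes buffered", w.closed.len(), w.buffered_bytes);

    // Row count plus byte limit: buffered memory stays bounded.
    let with_bytes = FlushPolicy { max_rows: 1_000_000, max_bytes: Some(4 << 20) };
    let w = simulate(with_bytes, &batches);
    println!("with bytes: {} closed, {} bytes buffered", w.closed.len(), w.buffered_bytes);
}
```

With wide rows, the row-count-only policy never triggers in this sketch, so all the data stays buffered, which is the situation the issue describes. The byte limit caps how much a single row group can hold before it is closed.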
