westonpace commented on issue #34363:
URL: https://github.com/apache/arrow/issues/34363#issuecomment-1548348207
This seems like a legitimate request and quite workable. We are fairly
close already. The code in ObjectOutputStream is roughly:
```
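# Comparisons are on sizes in bytes.
# Large writes bypass the buffer and are submitted on their own.
if request > part_limit:
    submit_request(request)
    return
# Smaller writes accumulate until the buffer exceeds the part limit.
buffer.append(request)
if buffer > part_limit:
    submit_request(buffer)
    buffer.reset()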
```
Given that we are already talking about cloud upload and I/O, I think we can
directly implement the equal-parts approach (instead of trying to maintain
both) without too much of a performance hit (though there will be some hit,
since this introduces a mandatory extra copy of the data in some cases). This
would change the above logic to:
```
buffer.append(request)
for chunk in slice_off_whole_chunks(buffer, part_limit):
    submit_request(chunk)
```
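To make the equal-parts idea concrete, here is a minimal runnable sketch in Python. It is only an illustration of the pseudocode above, not the actual ObjectOutputStream code: the `EqualPartBuffer` class, its `write`/`close` methods, and the choice to flush a shorter final part on close are all assumptions made for the example.

```python
from typing import Callable, List


class EqualPartBuffer:
    """Hypothetical stand-in for the buffering logic (not Arrow's real class)."""

    def __init__(self, part_limit: int, submit_request: Callable[[bytes], None]) -> None:
        if part_limit <= 0:
            raise ValueError("part_limit must be positive")
        self.part_limit = part_limit
        self.submit_request = submit_request
        self.buffer = bytearray()

    def write(self, request: bytes) -> None:
        # Every write is copied into the buffer first; this is the
        # mandatory extra copy mentioned above.
        self.buffer.extend(request)
        # Slice off as many whole, equal-sized chunks as the buffer holds.
        while len(self.buffer) >= self.part_limit:
            self.submit_request(bytes(self.buffer[: self.part_limit]))
            del self.buffer[: self.part_limit]

    def close(self) -> None:
        # Assumption: whatever remains is sent as a final, possibly shorter part.
        if self.buffer:
            self.submit_request(bytes(self.buffer))
            self.buffer.clear()


parts: List[bytes] = []
stream = EqualPartBuffer(part_limit=4, submit_request=parts.append)
for piece in (b"ab", b"cdefghij", b"k"):
    stream.write(piece)
stream.close()
part_sizes = [len(p) for p in parts]
```

With this shape, every part except possibly the last has exactly `part_limit` bytes, regardless of how the caller splits its writes.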
Does anyone want to create a PR?
--
This is an automated message from the Apache Git Service.