tustvold opened a new issue, #2935: URL: https://github.com/apache/arrow-datafusion/issues/2935
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

Currently `CsvOpener` and `JsonOpener` call [GetResult::bytes](https://docs.rs/object_store/latest/object_store/enum.GetResult.html#method.bytes), which downloads the entire file before feeding it to the appropriate arrow reader. This is not ideal:

* It adds decode latency, as the full payload must be buffered before reading
* It may read more data than necessary (#2930)

Following on from #2677, we now support streaming responses from object storage.

**Describe the solution you'd like**

The underlying challenge is to take an arbitrary `Stream<Bytes>` and convert it into a `Stream<Bytes>` where each stream element contains complete rows, as delimited by a newline character. Once we have this `DelimitedStream`, it is trivial to feed each of these byte chunks individually into the corresponding decoder.

**Describe alternatives you've considered**

We could not do this
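
As a rough illustration of the re-chunking described in the proposed solution, here is a minimal, dependency-free sketch. It makes several assumptions that are not part of this issue: it uses a synchronous `Iterator<Item = Vec<u8>>` in place of an async `Stream<Bytes>`, the name `DelimitedChunks` is hypothetical, and it splits on every `\n` byte without accounting for newlines inside quoted CSV fields. It is not the proposed implementation, only a way to make the idea concrete.

```rust
/// Hypothetical adapter (not part of DataFusion or object_store): re-chunks
/// arbitrary byte chunks so that every yielded chunk ends on a newline,
/// except possibly the final one if the input lacks a trailing newline.
pub struct DelimitedChunks<I> {
    input: I,
    /// Bytes seen after the last newline; never contains a newline itself.
    remainder: Vec<u8>,
    finished: bool,
}

impl<I: Iterator<Item = Vec<u8>>> DelimitedChunks<I> {
    pub fn new(input: I) -> Self {
        Self { input, remainder: Vec::new(), finished: false }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Iterator for DelimitedChunks<I> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        if self.finished {
            return None;
        }
        loop {
            match self.input.next() {
                Some(chunk) => {
                    // Only the new chunk needs searching, because the
                    // remainder is known to hold no newline.
                    match chunk.iter().rposition(|&b| b == b'\n') {
                        Some(pos) => {
                            // Emit the remainder plus everything up to and
                            // including the last newline in this chunk.
                            let mut out = std::mem::take(&mut self.remainder);
                            out.extend_from_slice(&chunk[..=pos]);
                            // Keep the trailing partial row for later.
                            self.remainder.extend_from_slice(&chunk[pos + 1..]);
                            return Some(out);
                        }
                        // No complete row yet: keep buffering.
                        None => self.remainder.extend_from_slice(&chunk),
                    }
                }
                None => {
                    // Input exhausted: flush any unterminated final row.
                    self.finished = true;
                    if self.remainder.is_empty() {
                        return None;
                    }
                    return Some(std::mem::take(&mut self.remainder));
                }
            }
        }
    }
}
```

With this shape, only the trailing partial row of each incoming chunk is carried over, so the amount buffered is bounded by the longest row rather than by the file size. A real version would presumably need to be async and aware of the format's quoting rules.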
