marsupialtail commented on code in PR #13931:
URL: https://github.com/apache/arrow/pull/13931#discussion_r968604626
##########
cpp/src/arrow/csv/reader.cc:
##########
@@ -970,6 +972,145 @@ class StreamingReaderImpl : public ReaderMixin,
   std::shared_ptr<std::atomic<int64_t>> bytes_decoded_;
 };
+class ParallelStreamingReaderImpl

Review Comment:
   This code duplication is admittedly not ideal. However, this change is not just about reading faster through a new RandomAccessFile generator. It is also about parsing in parallel and (in the future) decoding in parallel, which will introduce more changes to the StreamingReaderImpl. Once I finish the parallel decoder in a new PR, it will probably make sense to reduce the duplication. Alternatively, I could do that in this PR, though I think this PR has enough value to be merged by itself.
