westonpace opened a new pull request #10795:
URL: https://github.com/apache/arrow/pull/10795


   Added a basic mapping generator that does not queue incoming jobs, which 
allows it to forward async-reentrant pressure to the source.  Also fixed some 
issues in the CSV reader that were preventing it from running truly in 
parallel.  Performance is now significantly better, but it still does not 
match the threaded reader.  For the NY taxi dataset the streaming read time 
went from ~7 seconds to ~1.6 seconds.  However, the file reader is still at 
~0.8 seconds.  I'll investigate the remaining gap later.
   
   Leaving this in draft because I want to extract a thread-spawning 
generator I created into its own independently tested utility.
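
   To illustrate the non-queuing idea, here is a minimal sketch.  It is a 
Python asyncio analogy, not the C++ code in this PR.  The names 
`make_unqueued_mapped_generator` and `make_counting_source` are hypothetical, 
and an "async generator" is modelled as a plain callable that returns an 
awaitable for the next item.  Each pull on the mapped generator triggers 
exactly one pull on the source and nothing is buffered in between, so the 
consumer's pull rate is the only thing driving the source.

```python
import asyncio


def make_unqueued_mapped_generator(source, fn):
    """Wrap `source` so every pull applies `fn` to the next item.

    Assumption for this sketch: `source` is a callable that returns an
    awaitable for the next item. No queue is kept here.
    """

    def pull():
        # One downstream pull causes exactly one upstream pull, immediately.
        upstream = source()

        async def apply():
            return fn(await upstream)

        return apply()

    return pull


def make_counting_source(log):
    """Hypothetical source yielding 1, 2, 3, ... and logging each pull."""
    counter = 0

    def source():
        nonlocal counter
        counter += 1
        item = counter
        log.append(item)

        async def produce():
            await asyncio.sleep(0)  # stand-in for asynchronous I/O
            return item

        return produce()

    return source


async def demo():
    log = []
    mapped = make_unqueued_mapped_generator(
        make_counting_source(log), lambda x: x * 10
    )
    # Re-entrant use: pull three times before any result has completed.
    pending = [mapped(), mapped(), mapped()]
    results = await asyncio.gather(*pending)
    return log, results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

   Because the wrapper holds no state of its own, the number of in-flight 
source requests always equals the number of outstanding pulls made by the 
consumer.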

