yordan-pavlov commented on issue #200:
URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-830698410


   UPDATE: I have finally implemented enough to replace 
ComplexObjectArrayReader with ArrowArrayReader for reading StringArrays, and to 
run the test query I have been using for much of my performance testing. 
   Initial results look promising: the overall query time has dropped from 
about 125ms to about 100ms, roughly a 20% reduction.
   I will try to write some proper benchmarks in the next couple of days, in 
order to better compare performance against the previous implementation.
   
   In general, I have found that avoiding intermediate arrays as much as 
possible does help performance, and I believe I have finally been able to 
validate the idea of using iterators. I also think that switching from 
iterators to async streams should bring further performance improvements, as an 
async runtime should be able to better schedule a combination of disk-intensive 
and CPU-intensive tasks.
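
   To illustrate the point about intermediate arrays, here is a minimal, 
std-only sketch. It is not the ArrowArrayReader code from the commit; the 
names `StringColumn`, `encode_page`, `plain_values`, `build_via_intermediate` 
and `build_streaming` are hypothetical, and the page layout (a 4-byte 
little-endian length followed by the value bytes) is a simplifying assumption. 
Both builders produce the same offsets and values buffers, but the streaming 
one copies each value once instead of twice.

```rust
/// Hypothetical output shaped like a string array:
/// an offsets buffer plus one contiguous values buffer.
#[derive(Debug, PartialEq)]
struct StringColumn {
    offsets: Vec<i32>,
    values: Vec<u8>,
}

/// Hypothetical helper: encode each value as a 4-byte little-endian
/// length followed by its UTF-8 bytes (assumed layout, for illustration).
fn encode_page(values: &[&str]) -> Vec<u8> {
    let mut page = Vec::new();
    for v in values {
        page.extend_from_slice(&(v.len() as u32).to_le_bytes());
        page.extend_from_slice(v.as_bytes());
    }
    page
}

/// Lazily yields borrowed value slices from the page; nothing is copied
/// here. Iteration stops at the first truncated value.
fn plain_values<'a>(page: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    let mut pos = 0usize;
    std::iter::from_fn(move || {
        if pos + 4 > page.len() {
            return None;
        }
        let len_bytes = [page[pos], page[pos + 1], page[pos + 2], page[pos + 3]];
        let len = u32::from_le_bytes(len_bytes) as usize;
        let start = pos + 4;
        let end = start.checked_add(len)?;
        if end > page.len() {
            return None;
        }
        pos = end;
        Some(&page[start..end])
    })
}

/// Two passes: materialise every value as an owned String first,
/// then copy all of them again into the output buffers.
fn build_via_intermediate(page: &[u8]) -> StringColumn {
    let strings: Vec<String> = plain_values(page)
        .map(|b| String::from_utf8_lossy(b).into_owned())
        .collect();
    let mut col = StringColumn { offsets: vec![0], values: Vec::new() };
    for s in &strings {
        col.values.extend_from_slice(s.as_bytes());
        col.offsets.push(col.values.len() as i32);
    }
    col
}

/// One pass: the iterator feeds borrowed slices straight into the
/// output buffers, with no intermediate collection.
fn build_streaming(page: &[u8]) -> StringColumn {
    let mut col = StringColumn { offsets: vec![0], values: Vec::new() };
    for bytes in plain_values(page) {
        col.values.extend_from_slice(bytes);
        col.offsets.push(col.values.len() as i32);
    }
    col
}
```

   Calling `build_streaming(&encode_page(&["arrow", "", "parquet"]))` goes 
from encoded bytes to the final buffers directly, which is the shape of saving 
I would expect an iterator-based reader to offer; how much it matters in 
practice is something only the benchmarks can show.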
   
   The latest changes can be found here: 
   
https://github.com/yordan-pavlov/arrow/commit/95ed8a020c2f44f5b30cfffd0682b98022cc4aea

