emkornfield commented on pull request #7175: URL: https://github.com/apache/arrow/pull/7175#issuecomment-630568952
> BM_ReadColumn<true,Int32Type> reflects a lot the profile I get with real-life dataset (nyc taxi dataset). If this can guide you in further performance validation.

I don't think I'm going to do much more performance-related work beyond https://github.com/apache/arrow/pull/7143 (if you don't mind trying it out, it would be good to see whether it improves performance on real-world data). The last potential easy performance win is pushing the "all null"/"no nulls remaining" checks directly into the loops (for small batch sizes I wouldn't expect a huge difference there). My main goal is to get full nested functionality working, and I got a little distracted. Other changes will probably require a bigger refactoring than I want to take on right now.
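
For readers unfamiliar with the idea of moving all-null/no-null checks into the loop, here is a minimal standalone sketch. It is not the code from either PR and does not use Arrow's API: `SumValid`, its parameters, and the 64-bit-word validity layout are all assumptions made for illustration. Each 64-slot chunk of the validity bitmap is tested once, so a chunk with no nulls or with only nulls skips the per-bit branch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow): sums the non-null entries of
// `values`. Bit i of validity[w] is 1 when slot w * 64 + i is non-null.
// Precondition: validity.size() >= (values.size() + 63) / 64.
int64_t SumValid(const std::vector<int32_t>& values,
                 const std::vector<uint64_t>& validity) {
  int64_t sum = 0;
  const std::size_t n = values.size();
  for (std::size_t base = 0; base < n; base += 64) {
    const std::size_t len = std::min<std::size_t>(64, n - base);
    // Mask off bits past the end of the data in the final, partial word.
    const uint64_t mask =
        (len == 64) ? ~uint64_t{0} : ((uint64_t{1} << len) - 1);
    const uint64_t word = validity[base / 64] & mask;

    if (word == 0) {
      continue;  // Fast path: every slot in this chunk is null.
    }
    if (word == mask) {
      // Fast path: no nulls in this chunk, so no per-bit test is needed.
      for (std::size_t i = 0; i < len; ++i) sum += values[base + i];
      continue;
    }
    // Slow path: mixed chunk, test each validity bit.
    for (std::size_t i = 0; i < len; ++i) {
      if ((word >> i) & 1) sum += values[base + i];
    }
  }
  return sum;
}
```

With small batches most of the time is spent outside such a loop, which is consistent with the expectation above that the gain would be modest there.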
