emkornfield commented on pull request #8177:
URL: https://github.com/apache/arrow/pull/8177#issuecomment-696205222


   > Just for the record, apart from FixedSizeList, is there anything remaining 
for full nested Parquet -> Arrow reading?
   
   We still need to support LargeList and Map, which should be a smaller 
change at the schema-inference level (I'm working on a PR).  There are a few 
other JIRAs still open about benchmarking and randomized testing.  Past that, 
there are some open JIRAs about performance improvements:
   *  Computing all offsets/bitmaps together (the JIRA is about the 
non-vectorized case).  I would expect this to start showing performance 
improvements for deeply nested structures containing lists.
   *  Using the bitmap-based code that was removed from this PR.  For non-list 
types I think it can be a big performance win on all platforms, and a win at 
least for shallowly nested lists; I expect it to be better on native Intel.
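   To illustrate what "computing offsets/bitmaps together" means, here is a 
rough Python sketch (not the Arrow C++ implementation) for one nullable list 
of required values.  It assumes the usual Parquet level encoding for that 
shape: definition level 0 = null list, 1 = empty list, 2 = element present; 
repetition level 0 = start of a new list.  The function name is hypothetical.

```python
def levels_to_offsets_and_validity(def_levels, rep_levels):
    """Derive list offsets and a validity mask in a single pass.

    Assumed schema: optional list<required value>, so the maximum
    definition level is 2 and the maximum repetition level is 1.
    """
    offsets = [0]   # Arrow-style offsets: one more entry than there are lists
    validity = []   # one boolean per list slot (True = not null)
    for d, r in zip(def_levels, rep_levels):
        if r == 0:
            # Repetition level 0 starts a new top-level list slot.
            validity.append(d >= 1)      # def level 0 means the list is null
            offsets.append(offsets[-1])  # new slot starts out empty
        if d == 2:
            # Definition level 2 means an actual element is present.
            offsets[-1] += 1
    return offsets, validity


# Rows: [[1, 2], None, [], [3]]
offsets, validity = levels_to_offsets_and_validity(
    [2, 2, 0, 1, 2], [0, 1, 0, 0, 0]
)
print(offsets)   # [0, 2, 2, 2, 3]
print(validity)  # [True, False, True, True]
```

   With deeper nesting, each list level needs its own offsets and validity, 
which is where filling them all in one pass over the levels should pay off.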
   
   There is also an unrelated bug on the write side, 
https://github.com/apache/arrow/pull/8219, which I asked @wesm to review (it 
is based on some changes in this PR).
   
   

