nevi-me commented on pull request #9240:
URL: https://github.com/apache/arrow/pull/9240#issuecomment-761919066


   Hi everyone interested in the Parquet writer.
   
   This PR effectively gives us the ability to compute how to write arbitrarily 
nested types. It has the side effect that nested lists can also be written.
   There are a few places where I need to tidy up, but they depend on the 
Arrow reader (ARROW-10391), which unfortunately might be a lot of work on its 
own. I'm a bit worried that I might have to rework a fair share of the writer 
to handle nesting correctly. I've already seen cases where we don't have 
enough information to arrive at the correct solution.
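   
   For readers new to Parquet nesting, "computing how to write" a nested 
column largely means deriving definition and repetition levels for each leaf 
value. The following is a minimal sketch of that idea for a single nullable 
list of nullable integers. It is not the code in this PR: the function name 
`list_levels` and its signature are hypothetical, and it assumes the standard 
three-level Parquet list layout (optional list, repeated group, optional 
element), which gives a maximum definition level of 3 and a maximum repetition 
level of 1.

```rust
/// Hypothetical helper for illustration only (not the API in levels.rs).
/// Assumes the schema: optional list -> repeated group -> optional int32.
/// Returns (definition levels, repetition levels, non-null leaf values).
fn list_levels(rows: &[Option<Vec<Option<i32>>>]) -> (Vec<i16>, Vec<i16>, Vec<i32>) {
    let mut def = Vec::new();
    let mut rep = Vec::new();
    let mut values = Vec::new();
    for row in rows {
        match row {
            // Null list: nothing is defined, and the slot starts a new row.
            None => {
                def.push(0);
                rep.push(0);
            }
            // Empty list: the list itself is defined, but it has no elements.
            Some(items) if items.is_empty() => {
                def.push(1);
                rep.push(0);
            }
            Some(items) => {
                for (i, item) in items.iter().enumerate() {
                    // The first element starts a new row (0); the rest
                    // continue the same list (1).
                    rep.push(if i == 0 { 0 } else { 1 });
                    match item {
                        // Null element inside a defined, non-empty list.
                        None => def.push(2),
                        // Fully defined leaf value.
                        Some(v) => {
                            def.push(3);
                            values.push(*v);
                        }
                    }
                }
            }
        }
    }
    (def, rep, values)
}
```

   Under these assumptions, the rows `[1, null, 2]`, `null`, `[]`, `[3]` 
produce one level pair per slot, while only the three non-null integers are 
stored as values. Deeper nesting (lists of structs, lists of lists) raises the 
maximum levels, which is where the bookkeeping becomes difficult.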
   
   I'll open JIRAs as I go along.
   
   For reviewers, please note:
   
   This has taken me a few months on weekends to get right. I've iterated over 
various solutions to arrive here.
   The implementation is not optimal (I haven't benchmarked the latest impl), 
but I'm confident that it's correct.
   The extensive tests in levels.rs will allow us to refactor with some 
confidence.
   
   I've spent far too long on this, so I practically don't have any fresh eyes 
here. I worked on all the edge cases that I could think of with lists and structs. 
I've documented them, but I'll review the doc comments and add more detail 
where I still feel that it's lacking.
   
   Thank you ❤️

