scovich commented on issue #7119:
URL: https://github.com/apache/arrow-rs/issues/7119#issuecomment-2653923605

   > > if anything [storing pre-unioned null masks] encourages incorrect assumptions
   > 
   > If the spec (or the implementation) doesn't _require_ valid null masks at every level, then I have to agree with you there. No point computing _any_ null masks unless they are trustworthy -- it just forces both reader and writer to pay the ~same cost because they don't trust each other's work.
   
   ... but the more I look at the code changes required to handle this on the 
read path, the more I'm convinced the writer is in a _far_ better position to 
solve the problem than readers can ever be. It would be really helpful if the 
parquet reader at least offered a way to opt in to storing pre-unioned null 
masks (= library code written once), rather than requiring every reader (= user 
code replicated many times across many projects) to attempt the unioning and 
probably get it wrong at least some of the time.
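
For illustration only, here is a minimal sketch of what "unioning" means for a single parent/child pair. It assumes validity masks are plain `bool` slices where `true` means valid, and the helper name `union_null_masks` is hypothetical; it is not arrow-rs or parquet API. Unioning the nulls is the same as AND-ing the validity bits, so a child row counts as valid only if its parent row is valid too.

```rust
/// Hypothetical helper, not part of arrow-rs: fold a parent's validity mask
/// into a child's. `true` means "valid", so the union of the two null sets
/// is the logical AND of the two validity masks.
fn union_null_masks(parent: &[bool], child: &[bool]) -> Vec<bool> {
    // Assumption: both masks describe the same rows (no list offsets involved).
    assert_eq!(parent.len(), child.len(), "masks must cover the same rows");
    parent
        .iter()
        .zip(child)
        .map(|(p, c)| *p && *c) // a null parent masks the child row
        .collect()
}
```

With a parent mask of `[true, false, true, true]` and a child mask of `[true, true, false, true]`, this returns `[true, false, false, true]`. Every reader would have to repeat this at each nesting level (and handle list offsets, which this sketch ignores), which is the kind of duplicated work I'd rather see done once in library code.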


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
