scovich commented on issue #7119:
URL: https://github.com/apache/arrow-rs/issues/7119#issuecomment-2653923605

> > if anything [storing pre-unioned null masks] encourages incorrect assumptions
>
> If the spec (or the implementation) doesn't _require_ valid null masks at every level, then I have to agree with you there. No point computing _any_ null masks unless they are trustworthy -- it just forces both reader and writer to pay the ~same cost because they don't trust each other's work.

... but the more I look at the code changes required to handle this on the read path, the more I'm convinced the writer is in a _far_ better position to solve the problem than readers can ever be. It would be really helpful if the parquet reader at least offered a way to opt in to storing pre-unioned null masks (= library code written once), rather than requiring every reader (= user code replicated many times across many projects) to attempt the unioning and probably get it wrong at least some of the time.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org