etseidl commented on PR #45285:
URL: https://github.com/apache/arrow/pull/45285#issuecomment-2596259495

   > > but the read side needs to account for the asymmetry in the spec.
   > 
   > I am just following the PR, what do you mean by this? I am not sure I understand what case you're describing.
   
   @raulcd The Parquet spec says the repetition level histogram can be omitted if max_rep_level == 0, but the definition level histogram can be omitted if max_def_level == 0 or 1:
   https://github.com/apache/parquet-format/blob/a498aa9a377edcdbc5da802cf9f1763a2e409411/src/main/thrift/parquet.thrift#L231-L237
   
   So the case I'm describing is a file with a column with max_def_level == 1, where the repetition histogram is populated but the definition histogram is not.
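
As a rough illustration of that case, here is a minimal Python sketch of a read-side check that validates each histogram on its own instead of expecting both to be present together. The function names and the plain-list representation are hypothetical and are not the Arrow or parquet-format API. The sketch assumes an absent histogram is always acceptable (both fields are optional in the Thrift definition) and that a present one has max_level + 1 entries.

```python
from typing import Optional, Sequence


def histogram_ok(hist: Optional[Sequence[int]], max_level: int) -> bool:
    """Check one level histogram independently of the other (hypothetical helper).

    Assumption: an absent or empty histogram is accepted, and a present one
    must have exactly max_level + 1 entries.
    """
    if not hist:
        return True
    return len(hist) == max_level + 1


def size_statistics_ok(
    rep_hist: Optional[Sequence[int]],
    def_hist: Optional[Sequence[int]],
    max_rep_level: int,
    max_def_level: int,
) -> bool:
    """Validate both histograms without requiring them to be present together."""
    return histogram_ok(rep_hist, max_rep_level) and histogram_ok(
        def_hist, max_def_level
    )


# The asymmetric case: max_rep_level == 1 and max_def_level == 1, with the
# repetition histogram populated and the definition histogram omitted.
asymmetric_ok = size_statistics_ok([3, 5], None, max_rep_level=1, max_def_level=1)

# A histogram whose length does not match its max level is still rejected.
wrong_length_ok = size_statistics_ok([3, 5, 7], None, max_rep_level=1, max_def_level=1)
```

A reader that instead required both histograms to be present or both absent would reject the first input, even though a writer following the spec is allowed to produce it.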


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
