emkornfield commented on PR #197: URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1703090707
> As the implementation detail, can we ignore the rep-def histogram when max-rep <= 1, max-def <= 1? Since we already have page-ordinal in OffsetIndex and null-count in ColumnIndex? This might take less space but make it a bit tricky. @etseidl @emkornfield

I don't see any downside for the max-def level. For the max-rep level, this would lose the ability to filter on queries like `list_length(col) > 1`.

Regarding where to place the size: I agree option 2 is the best; I think having a separate list makes the most sense. I'll update the PR to reflect this.
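
To illustrate the max-rep point, here is a minimal sketch of how a per-page repetition-level histogram could answer `list_length(col) > 1` without decoding the page. It assumes a single-level list column (max-rep = 1) and pages that start on record boundaries. The function names are hypothetical and are not part of the PR or of any Parquet library.

```python
from typing import List, Sequence


def repetition_histogram(rep_levels: Sequence[int], max_rep: int = 1) -> List[int]:
    """Count how often each repetition level occurs in a page (hypothetical helper)."""
    hist = [0] * (max_rep + 1)
    for level in rep_levels:
        hist[level] += 1
    return hist


def may_contain_list_longer_than_one(rep_hist: Sequence[int]) -> bool:
    """Return True if the page may hold a list with more than one element.

    Assumes max-rep == 1: level 0 starts a new row and level 1 continues
    the current list. With no level-1 entries, every list in the page has
    at most one element, so the page can be skipped for
    `list_length(col) > 1`.
    """
    if len(rep_hist) != 2:
        raise ValueError("this sketch only handles max-rep == 1")
    return rep_hist[1] > 0


# Rows [[1], [2], []] encode as repetition levels [0, 0, 0].
short_lists = repetition_histogram([0, 0, 0])
# Rows [[1, 2, 3], [4]] encode as repetition levels [0, 1, 1, 0].
long_lists = repetition_histogram([0, 1, 1, 0])

print(may_contain_list_longer_than_one(short_lists))
print(may_contain_list_longer_than_one(long_lists))
```

The null count in ColumnIndex and the row ordinals in OffsetIndex do not carry the count of level-1 entries, which is why dropping the histogram for max-rep = 1 would give up this kind of pruning.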
