GitHub user etseidl added a comment to the discussion: Sorting and encoding of dictionary pages
Dictionary pages must be encoded using `PLAIN` encoding ([ref](https://github.com/apache/parquet-format/blob/master/Encodings.md#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8)). The Parquet community is currently exploring new encoding options (https://lists.apache.org/thread/djnbbcnft0fqm9ldby2q96nbtrwqz477), so you might want to ask on the Parquet dev list whether there's a desire to expand the possibilities for dictionary encoding. Being able to sort the dictionary would be a nice addition, but I'm not sure what the level of effort would be. Please feel free to create an issue to request this feature.

GitHub link: https://github.com/apache/arrow-rs/discussions/8778#discussioncomment-14882050
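
To make the sorting idea concrete, here is a minimal sketch of what sorting a dictionary involves in general: the values are reordered and every index that points into the dictionary is remapped to match. This is plain standard-library Rust, not the arrow-rs or parquet API; the function name `sort_dictionary` and its signature are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper (not part of arrow-rs): sort dictionary values and
/// remap the indices so that they still decode to the same sequence.
/// Returns (sorted_dictionary, remapped_indices).
fn sort_dictionary(dict: &[&str], indices: &[usize]) -> (Vec<String>, Vec<usize>) {
    // order[new_pos] = old_pos, ordered by dictionary value
    let mut order: Vec<usize> = (0..dict.len()).collect();
    order.sort_by_key(|&i| dict[i]);

    // remap[old_pos] = new_pos
    let mut remap = vec![0usize; dict.len()];
    for (new_pos, &old_pos) in order.iter().enumerate() {
        remap[old_pos] = new_pos;
    }

    // Rebuild the dictionary in sorted order and rewrite every index.
    let sorted: Vec<String> = order.iter().map(|&i| dict[i].to_string()).collect();
    let remapped: Vec<usize> = indices.iter().map(|&i| remap[i]).collect();
    (sorted, remapped)
}
```

For example, calling `sort_dictionary(&["pear", "apple", "fig"], &[0, 1, 2, 1, 0])` yields a dictionary in the order `apple`, `fig`, `pear`, with indices that still decode to `pear, apple, fig, apple, pear`. Sorting changes only the order of the values; in this sketch nothing about how the dictionary page itself is encoded changes.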
