wgtmac commented on code in PR #34177: URL: https://github.com/apache/arrow/pull/34177#discussion_r1105912139
########## cpp/src/parquet/encoding.cc: ########## @@ -480,11 +480,18 @@ class DictEncoderImpl : public EncoderImpl, virtual public DictEncoder<DType> { public: typedef typename DType::c_type T; - explicit DictEncoderImpl(const ColumnDescriptor* desc, MemoryPool* pool) - : EncoderImpl(desc, Encoding::PLAIN_DICTIONARY, pool), + explicit DictEncoderImpl(const ColumnDescriptor* desc, Encoding::type dict_encoding, Review Comment: To provide some detail, the encoding type of dictionary encoded pages is depending on the format version - Format v1: `PLAIN_DICTIONARY` is used for both dict page and data page. - Format v2: dict page uses `PLAIN` and data page uses `RLE_DICTIONARY` So it is kind of weird to manually pass `dict_encoding` because they are the same thing but relevant to different parquet formats. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org