wgtmac commented on code in PR #34177:
URL: https://github.com/apache/arrow/pull/34177#discussion_r1105912139


##########
cpp/src/parquet/encoding.cc:
##########
@@ -480,11 +480,18 @@ class DictEncoderImpl : public EncoderImpl, virtual 
public DictEncoder<DType> {
  public:
   typedef typename DType::c_type T;
 
-  explicit DictEncoderImpl(const ColumnDescriptor* desc, MemoryPool* pool)
-      : EncoderImpl(desc, Encoding::PLAIN_DICTIONARY, pool),
+  explicit DictEncoderImpl(const ColumnDescriptor* desc, Encoding::type 
dict_encoding,

Review Comment:
   
   To provide some detail, the encoding type of dictionary encoded pages is 
depending on the format version
   - Format v1: `PLAIN_DICTIONARY` is used for both dict page and data page. 
   - Format v2: dict page uses `PLAIN` and data page uses `RLE_DICTIONARY`
   
   So it is kind of weird to manually pass `dict_encoding` because they are the 
same thing but relevant to different parquet formats.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to