wgtmac commented on code in PR #35825:
URL: https://github.com/apache/arrow/pull/35825#discussion_r1218807398


##########
cpp/src/parquet/types.h:
##########
@@ -64,8 +64,12 @@ struct Type {
     DOUBLE = 5,
     BYTE_ARRAY = 6,
     FIXED_LEN_BYTE_ARRAY = 7,
+
+    // This parquet type does not actually exist (AFAIK) and is used to
+    // create proper type traits
+    LARGE_BYTE_ARRAY = 8,

Review Comment:
   Now I see the problem comes from here:
   ```cpp
   template <typename DType>
   class TypedDecoder : virtual public Decoder {
    public:
     using T = typename DType::c_type;
   
     virtual int DecodeArrow(int num_values, int null_count, const uint8_t* 
valid_bits,
                             int64_t valid_bits_offset,
                             typename EncodingTraits<DType>::Accumulator* out) 
= 0;
   
     virtual int DecodeArrow(int num_values, int null_count, const uint8_t* 
valid_bits,
                             int64_t valid_bits_offset,
                             typename EncodingTraits<DType>::DictAccumulator* 
builder) = 0;
   };
   ```
   
   So, I still prefer extending `EncodingTraits` with a new parameter like below
   ```cpp
   template <typename T, bool EnableLargeBinary = false>
   struct EncodingTraits;
   ```
   
   To workaround issues with the decoder, we can apply similar extension:
   ```cpp
   template <typename DType, bool EnableLargeBinary = false>
   class TypedDecoder : virtual public Decoder {};
   
   template <typename DType, bool EnableLargeBinary = false>
   class DictDecoder : virtual public TypedDecoder<DType, EnableLargeBinary> {};
   ```
   
   WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to