Chao Sun created PARQUET-2052:
---------------------------------

             Summary: Integer overflow when writing huge binary using 
dictionary encoding
                 Key: PARQUET-2052
                 URL: https://issues.apache.org/jira/browse/PARQUET-2052
             Project: Parquet
          Issue Type: Bug
            Reporter: Chao Sun
            Assignee: Chao Sun


To check whether it should fall back to plain encoding, 
{{DictionaryValuesWriter}} currently uses two variables, {{dictionaryByteSize}} 
and {{maxDictionaryByteSize}}, both of which are ints. This causes a problem 
when one first writes a relatively small binary within the threshold and then 
writes a huge string, which causes {{dictionaryByteSize}} to overflow and become 
negative. A negative size compares below the threshold, so the fallback check 
does not fire.
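
The overflow can be reproduced in isolation. The sketch below is not the actual 
{{DictionaryValuesWriter}} code: the class name, the threshold value, and both 
helper methods are hypothetical and only mirror the shape of the check described 
above. It compares an int accumulator with a long one as one possible remedy; 
the fix eventually applied in Parquet may differ.

```java
public class DictionarySizeOverflowDemo {
    // Hypothetical threshold (1 MiB), chosen only for illustration.
    static final int MAX_DICTIONARY_BYTE_SIZE = 1024 * 1024;

    // Mirrors the problematic pattern: the running size is an int,
    // so adding a huge value length can wrap around to a negative number.
    static boolean shouldFallBackInt(int dictionaryByteSize, int valueLength) {
        int updated = dictionaryByteSize + valueLength; // may overflow silently
        return updated > MAX_DICTIONARY_BYTE_SIZE;
    }

    // One possible remedy (an assumption, not necessarily the real fix):
    // accumulate in a long so the sum cannot wrap for int-sized inputs.
    static boolean shouldFallBackLong(long dictionaryByteSize, int valueLength) {
        long updated = dictionaryByteSize + valueLength;
        return updated > MAX_DICTIONARY_BYTE_SIZE;
    }

    public static void main(String[] args) {
        int smallSize = 100;                    // first, a small binary within the threshold
        int hugeLength = Integer.MAX_VALUE - 50; // then, a huge string

        // The int sum wraps to a negative value, so the check does not fire.
        System.out.println("int sum:  " + (smallSize + hugeLength));
        System.out.println("int check says fall back:  "
                + shouldFallBackInt(smallSize, hugeLength));

        // The long sum stays positive and exceeds the threshold.
        System.out.println("long sum: " + ((long) smallSize + hugeLength));
        System.out.println("long check says fall back: "
                + shouldFallBackLong(smallSize, hugeLength));
    }
}
```

With the int accumulator the check reports that no fallback is needed even 
though the dictionary is far over the threshold; with the long accumulator it 
reports the fallback as expected. {{Math.addExact}} would be another option if 
throwing on overflow is preferable to widening the type.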



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
