Csaba Ringhofer created PARQUET-1250:
----------------------------------------

             Summary: RLE decoding should treat 0 length runs as error 
                 Key: PARQUET-1250
                 URL: https://issues.apache.org/jira/browse/PARQUET-1250
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-mr
            Reporter: Csaba Ringhofer


RunLengthBitPackingHybridDecoder accepts run headers that encode zero-length 
repeated runs and treats them as if they were runs of length 2^32, so effectively 
every value returned for that data page will be the same (see 
https://github.com/apache/parquet-mr/blob/0a86429939075984edce5e3b8195dfb7f9e3ab6b/parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridDecoder.java#L66).

Throwing an exception if count is 0 would give a proper error message for some 
corrupt files, and would make it clear that these are not legal values.
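
A minimal sketch of the proposed check, assuming the header layout of the 
RLE/bit-packing hybrid encoding (an unsigned varint whose lowest bit is 0 for a 
repeated run, with the run length in the remaining bits). The class 
RleHeaderSketch and its methods are hypothetical names used for illustration 
only; this is not the parquet-mr code and not necessarily how the fix would be 
implemented there.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class; not part of parquet-mr.
public class RleHeaderSketch {

    /** Reads an unsigned varint (7 payload bits per byte, low bits first). */
    public static int readUnsignedVarInt(InputStream in) throws IOException {
        int value = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("unexpected end of stream in run header");
            }
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
    }

    /**
     * Reads the header of a repeated (RLE) run and returns its length.
     * A length of 0 is rejected instead of being silently accepted.
     */
    public static int readRleRunLength(InputStream in) throws IOException {
        int header = readUnsignedVarInt(in);
        if ((header & 1) != 0) {
            // Lowest bit set means a bit-packed run, which this sketch ignores.
            throw new IllegalArgumentException("not an RLE run header");
        }
        int count = header >>> 1;
        if (count == 0) {
            // The check proposed in this issue: fail fast on corrupt input.
            throw new IOException("corrupt RLE data: repeated run of length 0");
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // 0x0A == 5 << 1: a repeated run of length 5.
        System.out.println(readRleRunLength(new ByteArrayInputStream(new byte[] {0x0A})));
        try {
            // 0x00 == 0 << 1: a repeated run of length 0, which is rejected.
            readRleRunLength(new ByteArrayInputStream(new byte[] {0x00}));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The sketch throws IOException for the zero-length case; which exception type 
fits best in parquet-mr is left open here.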



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
