[ 
https://issues.apache.org/jira/browse/ORC-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018539#comment-17018539
 ] 

Quanlong Huang commented on ORC-590:
------------------------------------

Commit [bf5b780|https://github.com/apache/orc/commit/bf5b7800930bfa030db83aba925d9d3b75852839]
of ORC-469 unintentionally removed some safety checks in
StringDictionaryColumnReader, which causes this crash.
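
For illustration only, here is a minimal sketch of the kind of guard such a safety check might provide. It assumes the removed checks rejected a missing stream before a decoder was built on top of it; the comment above does not say which checks were removed. ParseError and InputStream below are hypothetical stand-ins rather than the real ORC classes, and requireStream is a made-up helper name.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an ORC-style parse error (not the real class).
struct ParseError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for a seekable input stream (not the real class).
struct InputStream {
  virtual ~InputStream() = default;
};

// Assumed helper: hand the stream back only if it exists, otherwise fail
// with a descriptive error instead of letting a decoder dereference null.
std::unique_ptr<InputStream> requireStream(std::unique_ptr<InputStream> stream,
                                           const std::string& name) {
  if (stream == nullptr) {
    throw ParseError(name + " stream not found");
  }
  return stream;
}
```

With a guard like this in place, a corrupt file that lacks an expected stream would surface as a catchable ParseError at reader construction time rather than as a crash inside the decoder.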

> Crash in orc::RleDecoderV2::readByte
> ------------------------------------
>
>                 Key: ORC-590
>                 URL: https://issues.apache.org/jira/browse/ORC-590
>             Project: ORC
>          Issue Type: Bug
>          Components: C++
>            Reporter: Quanlong Huang
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>         Attachments: RleDecoderV2_next_crash.orc
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hit a crash when reading a corrupt file.
> {code}
> (gdb) bt
> #0  0x00000000006108ad in orc::RleDecoderV2::readByte (this=0xd5a0d0) at /home/quanlong/workspace/orc/c++/src/RLEv2.hh:167
> #1  orc::RleDecoderV2::next (this=0xd5a0d0, data=0xd5a1d8, numValues=30, notNull=0x0) at /home/quanlong/workspace/orc/c++/src/RleDecoderV2.cc:119
> #2  0x00000000005f6b8c in orc::StringDictionaryColumnReader::StringDictionaryColumnReader (this=this@entry=0xb497a0, type=..., stripe=...) at /home/quanlong/workspace/orc/c++/src/ColumnReader.cc:581
> #3  0x00000000005f70bb in orc::buildReader (type=..., stripe=...) at /home/quanlong/workspace/orc/c++/src/ColumnReader.cc:1756
> #4  0x00000000005f722b in orc::StructColumnReader::StructColumnReader (this=this@entry=0xb07e40, type=..., stripe=...) at /home/quanlong/workspace/orc/c++/src/ColumnReader.cc:876
> #5  0x00000000005f701b in orc::buildReader (type=..., stripe=...) at /home/quanlong/workspace/orc/c++/src/ColumnReader.cc:1787
> #6  0x000000000059fd18 in orc::RowReaderImpl::startNextStripe (this=0xae2750) at /home/quanlong/workspace/orc/c++/src/Reader.cc:917
> #7  0x00000000005a016a in orc::RowReaderImpl::next (this=0xae2750, data=...) at /home/quanlong/workspace/orc/c++/src/Reader.cc:932
> #8  0x0000000000597a78 in scanFile (out=..., filename=<optimized out>, batchSize=batchSize@entry=1024) at /home/quanlong/workspace/orc/tools/src/FileScan.cc:39
> #9  0x00000000005972f8 in main (argc=1, argv=<optimized out>) at /home/quanlong/workspace/orc/tools/src/FileScan.cc:84
> (gdb) l
> 162   
> 163     unsigned char readByte() {
> 164     if (bufferStart == bufferEnd) {
> 165       int bufferLength;
> 166       const void* bufferPointer;
> 167       if (!inputStream->Next(&bufferPointer, &bufferLength)) {
> 168         throw ParseError("bad read in RleDecoderV2::readByte");
> 169       }
> 170       bufferStart = static_cast<const char*>(bufferPointer);
> 171       bufferEnd = bufferStart + bufferLength;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
