heguanhui opened a new issue, #63921:
URL: https://github.com/apache/doris/issues/63921

   ### Search before asking
   
   - [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   ### Version
   
   master (trunk)
   
   ### What's Wrong?
   
   When running the `OrcReadLinesTest.test0` BE unit test with an ASAN build on x86, AddressSanitizer reports a SEGV caused by a null pointer dereference of `_row_reader` in `OrcReader::_seek_to_read_one_line()`.
   
   **ASAN crash stack:**
   ```
   ==XX==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000030 (pc 0x555558b3c2e0 bp 0x7ffff3c8e400 sp 0x7ffff3c8e3a0 T2)
   ==XX==The signal is caused by a READ memory access.
       #0 0x555558b3c2e0 in orc::RowReaderImpl::seekToRow(unsigned long) be/src/formats/orc/../../thirdparty/installed/include/orc/Reader.hh
       #1 0x5555590a3b5c in doris::vectorized::OrcReader::_seek_to_read_one_line() be/src/format/orc/vorc_reader.h:710
       #2 0x5555590a3b5c in doris::vectorized::OrcReader::_get_next_block_impl(doris::vectorized::Block*, unsigned long*, bool*) be/src/format/orc/vorc_reader.cpp:2350
       #3 0x555558f5c5a0 in doris::vectorized::OrcReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) be/src/format/orc/vorc_reader.cpp:2260
       #4 0x555558f5c5a0 in doris::vectorized::GenericReader::read_by_rows(doris::RuntimeState*, doris::vectorized::Block*, unsigned long*, bool*) be/src/format/generic_reader.h:165
       ...
   ```
   
   **Root cause:**
   
   In `OrcReader::_init_orc_row_reader()`, when `createRowReader` throws an exception while `should_stop` is true and the error message is "stop", the catch block swallows the exception and returns `Status::OK()`, but `_row_reader` remains `nullptr`. The caller then calls `_seek_to_read_one_line()`, which dereferences the null `_row_reader` via `_row_reader->seekToRow()` and triggers the SEGV.
   
   This is inconsistent with `_create_file_reader()`, which returns `Status::EndOfFile("stop")` in the same `should_stop` scenario.
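
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the Doris code: `Status`, `Code`, `RowReader`, `Reader`, and every method name are hypothetical stand-ins invented for illustration. It contrasts a catch block that reports OK on "stop" with one that reports an end-of-file status, which is the behavior the issue attributes to `_create_file_reader()`.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; not the Doris or ORC types.
enum class Code { OK, END_OF_FILE, INTERNAL_ERROR };

struct Status {
    Code code;
    std::string msg;
    bool ok() const { return code == Code::OK; }
};

struct RowReader {
    unsigned long current_row = 0;
    void seekToRow(unsigned long row) { current_row = row; }
};

struct Reader {
    bool should_stop = false;
    std::unique_ptr<RowReader> row_reader;  // stays null until init succeeds

    // Stand-in for createRowReader: throws "stop" once a stop was requested.
    std::unique_ptr<RowReader> create_row_reader() const {
        if (should_stop) throw std::runtime_error("stop");
        return std::make_unique<RowReader>();
    }

    // Shared body; stop_code is what the catch block reports on "stop".
    Status init_with(Code stop_code) {
        try {
            row_reader = create_row_reader();
        } catch (const std::exception& e) {
            if (should_stop && std::string(e.what()) == "stop") {
                // row_reader is still null on this path.
                return Status{stop_code, "stop"};
            }
            return Status{Code::INTERNAL_ERROR, e.what()};
        }
        // Invariant the caller relies on: success implies a usable reader.
        assert(row_reader != nullptr);
        return Status{Code::OK, ""};
    }

    // Pattern from the report: the exception is swallowed and OK is returned.
    Status init_swallowing() { return init_with(Code::OK); }

    // One possible fix: surface the stop as an end-of-file status.
    Status init_reporting_eof() { return init_with(Code::END_OF_FILE); }
};
```

With `should_stop` set, `init_swallowing()` returns an OK status while `row_reader` is still null, so a caller that checks only the status goes on to dereference it. `init_reporting_eof()` gives the caller a non-OK status to stop on.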
   
   ### What You Expected?
   
   No SEGV crash. When `_row_reader` is not initialized, the code should either return a proper error status or assert the precondition, not silently continue and dereference a null pointer.
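
As a rough illustration of the "return a proper error status" option, the sketch below guards the seek with an explicit null check. The names `SeekResult`, `FakeRowReader`, and `seek_to_read_one_line` are hypothetical and do not reflect the actual signature of `OrcReader::_seek_to_read_one_line()`.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for illustration; not doris::Status.
struct SeekResult {
    bool ok;
    std::string msg;
};

// Hypothetical stand-in for the ORC row reader.
struct FakeRowReader {
    unsigned long row = 0;
    void seekToRow(unsigned long r) { row = r; }
};

// Checks the precondition before dereferencing instead of assuming that
// initialization succeeded.
SeekResult seek_to_read_one_line(FakeRowReader* row_reader, unsigned long row) {
    if (row_reader == nullptr) {
        return SeekResult{false, "row reader is not initialized"};
    }
    row_reader->seekToRow(row);
    return SeekResult{true, ""};
}
```

A guard like this turns the crash into a status the caller can handle, though it is a second line of defense; it does not replace fixing the status returned by the initialization path.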
   
   ### How to Reproduce?
   
   1. Build BE with ASAN: `BUILD_TYPE=ASAN ./build.sh --be`
   2. Run: `./run-be-ut.sh --run --filter=OrcReadLinesTest.test0`
   3. Observe ASAN SEGV crash
   
   ### Anything Else?
   
   The x86 vs. ARM difference is a typical manifestation of undefined behavior: on x86 the null pointer dereference hits unmapped memory and raises SIGSEGV, while on ARM the same access may happen to land on mapped memory, so the test appears to pass.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)



