jorisvandenbossche commented on PR #37854:
URL: https://github.com/apache/arrow/pull/37854#issuecomment-1750503852

   So this PR introduced a failure in the "AMD64 Ubuntu 22.04 C++ ASAN UBSAN" 
build 
(https://github.com/apache/arrow/actions/runs/6430392691/job/17462667620?pr=38069#logs),
 related to the LazyCache coalesced reads. See details below. 
   
   I assume this is an existing bug, given that this PR only changed the 
default of an option that users could already set before. But changing the 
default of course makes it more visible. 
   
   A potential short-term option is to change only `pre_buffer` and keep the 
current non-lazy default `cache_options` (if that fixes it). Alternatively, we 
could revert the PR entirely until this is resolved (I don't have time today to 
look into it in more detail).
   
   <details>
   
   ```
   2023-10-06T10:40:14.0622194Z Running: 
/arrow/testing/data/parquet/fuzzing/clusterfuzz-testcase-minimized-parquet-arrow-fuzz-5640198106120192
   2023-10-06T10:40:14.0651320Z /arrow/cpp/src/arrow/io/interfaces.cc:457:  
Check failed: (left.offset + left.length) <= (right.offset) Some read ranges 
overlap
   2023-10-06T10:40:14.0661169Z 
/build/cpp/debug/parquet-arrow-fuzz(backtrace+0x5b)[0x55893309d6bb]
   2023-10-06T10:40:14.0678721Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow4util7CerrLog14PrintBackTraceEv+0x1a5)[0x7fd67d9f5405]
   2023-10-06T10:40:14.0694280Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow4util7CerrLogD2Ev+0x1f7)[0x7fd67d9f5177]
   2023-10-06T10:40:14.0708313Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow4util7CerrLogD0Ev+0x61)[0x7fd67d9f5251]
   2023-10-06T10:40:14.0722939Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow4util8ArrowLogD1Ev+0x1d0)[0x7fd67d9f4d80]
   2023-10-06T10:40:14.0733586Z 
/usr/local/lib/libarrow.so.1400(+0xb13f151)[0x7fd67d3cc151]
   2023-10-06T10:40:14.0746700Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow2io8internal18CoalesceReadRangesESt6vectorINS0_9ReadRangeESaIS3_EEll+0x4c1)[0x7fd67d3cac81]
   2023-10-06T10:40:14.0762388Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow2io8internal14ReadRangeCache4Impl5CacheESt6vectorINS0_9ReadRangeESaIS5_EE+0x456)[0x7fd67d2c3be6]
   2023-10-06T10:40:14.0775666Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow2io8internal14ReadRangeCache8LazyImpl5CacheESt6vectorINS0_9ReadRangeESaIS5_EE+0x24a)[0x7fd67d2c1cca]
   2023-10-06T10:40:14.0790164Z 
/usr/local/lib/libarrow.so.1400(_ZN5arrow2io8internal14ReadRangeCache5CacheESt6vectorINS0_9ReadRangeESaIS4_EE+0x2a2)[0x7fd67d2bfec2]
   2023-10-06T10:40:14.0795950Z 
/usr/local/lib/libparquet.so.1400(_ZN7parquet14SerializedFile9PreBufferERKSt6vectorIiSaIiEES5_RKN5arrow2io9IOContextERKNS7_12CacheOptionsE+0x1696)[0x7fd69120ef96]
   2023-10-06T10:40:14.0801581Z 
/usr/local/lib/libparquet.so.1400(_ZN7parquet17ParquetFileReader9PreBufferERKSt6vectorIiSaIiEES5_RKN5arrow2io9IOContextERKNS7_12CacheOptionsE+0x360)[0x7fd69120d7c0]
   2023-10-06T10:40:14.0808329Z 
/usr/local/lib/libparquet.so.1400(+0x15435e5)[0x7fd6904885e5]
   2023-10-06T10:40:14.0808759Z 
/usr/local/lib/libparquet.so.1400(+0x1542728)[0x7fd690487728]
   2023-10-06T10:40:14.0815343Z 
/usr/local/lib/libparquet.so.1400(+0x1542c7c)[0x7fd690487c7c]
   2023-10-06T10:40:14.0816050Z 
/usr/local/lib/libparquet.so.1400(_ZN7parquet5arrow8internal10FuzzReaderESt10unique_ptrINS0_10FileReaderESt14default_deleteIS3_EE+0x3e2)[0x7fd69046cdf2]
   2023-10-06T10:40:14.0822733Z ==14349== ERROR: libFuzzer: deadly signal
   2023-10-06T10:40:14.0823311Z 
/usr/local/lib/libparquet.so.1400(_ZN7parquet5arrow8internal10FuzzReaderEPKhl+0x1130)[0x7fd69046e950]
   2023-10-06T10:40:14.0824114Z 
/build/cpp/debug/parquet-arrow-fuzz(+0x118e98)[0x558933121e98]
   2023-10-06T10:40:14.0825448Z 
/build/cpp/debug/parquet-arrow-fuzz(+0x3f354)[0x558933048354]
   2023-10-06T10:40:14.0826059Z 
/build/cpp/debug/parquet-arrow-fuzz(+0x290d0)[0x5589330320d0]
   2023-10-06T10:40:14.0826543Z 
/build/cpp/debug/parquet-arrow-fuzz(+0x2ee27)[0x558933037e27]
   2023-10-06T10:40:14.0827941Z 
/build/cpp/debug/parquet-arrow-fuzz(+0x58c43)[0x558933061c43]
   2023-10-06T10:40:14.0828405Z 
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fd6713bfd90]
   2023-10-06T10:40:14.0828882Z 
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fd6713bfe40]
   2023-10-06T10:40:14.0829351Z 
/build/cpp/debug/parquet-arrow-fuzz(+0x23995)[0x55893302c995]
   2023-10-06T10:40:15.2094786Z     #0 0x5589330eeab1 in 
__sanitizer_print_stack_trace (/build/cpp/debug/parquet-arrow-fuzz+0xe5ab1) 
(BuildId: 8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2096115Z     #1 0x558933061348 in 
fuzzer::PrintStackTrace() (/build/cpp/debug/parquet-arrow-fuzz+0x58348) 
(BuildId: 8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2097546Z     #2 0x558933046dc3 in 
fuzzer::Fuzzer::CrashCallback() (/build/cpp/debug/parquet-arrow-fuzz+0x3ddc3) 
(BuildId: 8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2098544Z     #3 0x7fd6713d851f  
(/lib/x86_64-linux-gnu/libc.so.6+0x4251f) (BuildId: 
229b7dc509053fe4df5e29e8629911f0c3bc66dd)
   2023-10-06T10:40:15.2099481Z     #4 0x7fd67142ca7b in pthread_kill 
(/lib/x86_64-linux-gnu/libc.so.6+0x96a7b) (BuildId: 
229b7dc509053fe4df5e29e8629911f0c3bc66dd)
   2023-10-06T10:40:15.2101878Z     #5 0x7fd6713d8475 in gsignal 
(/lib/x86_64-linux-gnu/libc.so.6+0x42475) (BuildId: 
229b7dc509053fe4df5e29e8629911f0c3bc66dd)
   2023-10-06T10:40:15.2102783Z     #6 0x7fd6713be7f2 in abort 
(/lib/x86_64-linux-gnu/libc.so.6+0x287f2) (BuildId: 
229b7dc509053fe4df5e29e8629911f0c3bc66dd)
   2023-10-06T10:40:15.2103486Z     #7 0x7fd67d9f5193 in 
arrow::util::CerrLog::~CerrLog() /arrow/cpp/src/arrow/util/logging.cc:72:7
   2023-10-06T10:40:15.2104144Z     #8 0x7fd67d9f5250 in 
arrow::util::CerrLog::~CerrLog() /arrow/cpp/src/arrow/util/logging.cc:66:22
   2023-10-06T10:40:15.2104793Z     #9 0x7fd67d9f4d7f in 
arrow::util::ArrowLog::~ArrowLog() /arrow/cpp/src/arrow/util/logging.cc:250:5
   2023-10-06T10:40:15.2105719Z     #10 0x7fd67d3cc150 in 
arrow::io::internal::(anonymous 
namespace)::ReadRangeCombiner::Coalesce(std::vector<arrow::io::ReadRange, 
std::allocator<arrow::io::ReadRange> >) 
/arrow/cpp/src/arrow/io/interfaces.cc:457:7
   2023-10-06T10:40:15.2106830Z     #11 0x7fd67d3cac80 in 
arrow::io::internal::CoalesceReadRanges(std::vector<arrow::io::ReadRange, 
std::allocator<arrow::io::ReadRange> >, long, long) 
/arrow/cpp/src/arrow/io/interfaces.cc:518:19
   2023-10-06T10:40:15.2107880Z     #12 0x7fd67d2c3be5 in 
arrow::io::internal::ReadRangeCache::Impl::Cache(std::vector<arrow::io::ReadRange,
 std::allocator<arrow::io::ReadRange> >) 
/arrow/cpp/src/arrow/io/caching.cc:177:14
   2023-10-06T10:40:15.2108897Z     #13 0x7fd67d2c1cc9 in 
arrow::io::internal::ReadRangeCache::LazyImpl::Cache(std::vector<arrow::io::ReadRange,
 std::allocator<arrow::io::ReadRange> >) 
/arrow/cpp/src/arrow/io/caching.cc:288:34
   2023-10-06T10:40:15.2109909Z     #14 0x7fd67d2bfec1 in 
arrow::io::internal::ReadRangeCache::Cache(std::vector<arrow::io::ReadRange, 
std::allocator<arrow::io::ReadRange> >) 
/arrow/cpp/src/arrow/io/caching.cc:320:17
   2023-10-06T10:40:15.2111039Z     #15 0x7fd69120ef95 in 
parquet::SerializedFile::PreBuffer(std::vector<int, std::allocator<int> > 
const&, std::vector<int, std::allocator<int> > const&, arrow::io::IOContext 
const&, arrow::io::CacheOptions const&) 
/arrow/cpp/src/parquet/file_reader.cc:368:5
   2023-10-06T10:40:15.2112348Z     #16 0x7fd69120d7bf in 
parquet::ParquetFileReader::PreBuffer(std::vector<int, std::allocator<int> > 
const&, std::vector<int, std::allocator<int> > const&, arrow::io::IOContext 
const&, arrow::io::CacheOptions const&) 
/arrow/cpp/src/parquet/file_reader.cc:862:9
   2023-10-06T10:40:15.2113660Z     #17 0x7fd6904885e4 in 
parquet::arrow::(anonymous 
namespace)::FileReaderImpl::ReadRowGroups(std::vector<int, std::allocator<int> 
> const&, std::vector<int, std::allocator<int> > const&, 
std::shared_ptr<arrow::Table>*) /arrow/cpp/src/parquet/arrow/reader.cc:1224:23
   2023-10-06T10:40:15.2114817Z     #18 0x7fd690487727 in 
parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadRowGroup(int, 
std::vector<int, std::allocator<int> > const&, std::shared_ptr<arrow::Table>*) 
/arrow/cpp/src/parquet/arrow/reader.cc:321:12
   2023-10-06T10:40:15.2115872Z     #19 0x7fd690487c7b in 
parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadRowGroup(int, 
std::shared_ptr<arrow::Table>*) /arrow/cpp/src/parquet/arrow/reader.cc:325:12
   2023-10-06T10:40:15.2116737Z     #20 0x7fd69046cdf1 in 
parquet::arrow::internal::FuzzReader(std::unique_ptr<parquet::arrow::FileReader,
 std::default_delete<parquet::arrow::FileReader> >) 
/arrow/cpp/src/parquet/arrow/reader.cc:1374:37
   2023-10-06T10:40:15.2117736Z     #21 0x7fd69046e94f in 
parquet::arrow::internal::FuzzReader(unsigned char const*, long) 
/arrow/cpp/src/parquet/arrow/reader.cc:1399:11
   2023-10-06T10:40:15.2118358Z     #22 0x558933121e97 in 
LLVMFuzzerTestOneInput /arrow/cpp/src/parquet/arrow/fuzz.cc:22:17
   2023-10-06T10:40:15.2119357Z     #23 0x558933048353 in 
fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) 
(/build/cpp/debug/parquet-arrow-fuzz+0x3f353) (BuildId: 
8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2120490Z     #24 0x5589330320cf in 
fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) 
(/build/cpp/debug/parquet-arrow-fuzz+0x290cf) (BuildId: 
8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2121762Z     #25 0x558933037e26 in 
fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned 
long)) (/build/cpp/debug/parquet-arrow-fuzz+0x2ee26) (BuildId: 
8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2122720Z     #26 0x558933061c42 in main 
(/build/cpp/debug/parquet-arrow-fuzz+0x58c42) (BuildId: 
8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2123509Z     #27 0x7fd6713bfd8f  
(/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 
229b7dc509053fe4df5e29e8629911f0c3bc66dd)
   2023-10-06T10:40:15.2124294Z     #28 0x7fd6713bfe3f in __libc_start_main 
(/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 
229b7dc509053fe4df5e29e8629911f0c3bc66dd)
   2023-10-06T10:40:15.2125121Z     #29 0x55893302c994 in _start 
(/build/cpp/debug/parquet-arrow-fuzz+0x23994) (BuildId: 
8286aad552d39ef7fd5d08d745adab7f6b613e22)
   2023-10-06T10:40:15.2125489Z 
   2023-10-06T10:40:15.2126159Z NOTE: libFuzzer has rudimentary signal handlers.
   2023-10-06T10:40:15.2127161Z       Combine libFuzzer with AddressSanitizer 
or similar for better crash reports.
   2023-10-06T10:40:15.2127655Z SUMMARY: libFuzzer: deadly signal
   2023-10-06T10:40:16.9350640Z 77
   2023-10-06T10:40:17.0185097Z Error: `docker-compose --file 
/home/runner/work/arrow/arrow/docker-compose.yml run --rm ubuntu-cpp-sanitizer` 
exited with a non-zero exit code 77, see the process log above.
   ```
   
   </details>
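
For readers unfamiliar with the check that fires at `interfaces.cc:457`, here is a minimal sketch of the invariant quoted in the log, `(left.offset + left.length) <= (right.offset)`. It is an illustration only, not Arrow's implementation: the `ReadRange` class and `find_overlap` helper below are hypothetical names, and sorting by offset before comparing neighbours is an assumption about how such a check is typically done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRange:
    """Hypothetical stand-in for a byte range: [offset, offset + length)."""
    offset: int
    length: int


def find_overlap(ranges):
    """Return the first adjacent pair (left, right) that overlaps, or None.

    Ranges are sorted by offset first (an assumption of this sketch), then
    each neighbouring pair is tested against the condition from the log:
    left.offset + left.length <= right.offset.
    """
    ordered = sorted(ranges, key=lambda r: (r.offset, r.length))
    for left, right in zip(ordered, ordered[1:]):
        if left.offset + left.length > right.offset:
            return left, right
    return None
```

Ranges that merely touch, such as `ReadRange(0, 10)` and `ReadRange(10, 5)`, satisfy the condition; `ReadRange(0, 10)` and `ReadRange(8, 4)` do not. Since the failing input is a fuzzing test case, the overlapping ranges presumably come from malformed file metadata, but that is a guess and not something the log confirms.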
   
   


