jorisvandenbossche commented on issue #39562: URL: https://github.com/apache/arrow/issues/39562#issuecomment-1888701463
Ignore the lldb output above; it is useless because of https://github.com/apache/arrow/issues/37589. Thanks to the workaround mentioned in https://stackoverflow.com/questions/74059978/why-is-lldb-generating-exc-bad-instruction-with-user-compiled-library-on-macos/76032052#76032052 (`settings set platform.plugin.darwin.ignored-exceptions EXC_BAD_INSTRUCTION`), I could get an actual backtrace:

```
(lldb) process launch
Process 2066 launched: '/opt/homebrew/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/Resources/Python.app/Contents/MacOS/Python' (arm64)
libc++abi: terminating due to uncaught exception of type std::length_error: vector
Process 2066 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00000001a9b30744 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
->  0x1a9b30744 <+8>:  b.lo   0x1a9b30764               ; <+40>
    0x1a9b30748 <+12>: pacibsp
    0x1a9b3074c <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a9b30750 <+20>: mov    x29, sp
Target 0: (Python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001a9b30744 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001a9b67c28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001a9a75ae8 libsystem_c.dylib`abort + 180
    frame #3: 0x00000001a9b20b84 libc++abi.dylib`abort_message + 132
    frame #4: 0x00000001a9b103b4 libc++abi.dylib`demangling_terminate_handler() + 320
    frame #5: 0x00000001a97e6e68 libobjc.A.dylib`_objc_terminate() + 160
    frame #6: 0x00000001a9b1ff48 libc++abi.dylib`std::__terminate(void (*)()) + 16
    frame #7: 0x00000001a9b22d34 libc++abi.dylib`__cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) + 36
    frame #8: 0x00000001a9b22ce0 libc++abi.dylib`__cxa_throw + 140
    frame #9: 0x0000000148022f90 libarrow_dataset.1500.dylib`std::__1::__throw_length_error[abi:v160006](char const*) + 60
    frame #10: 0x00000001480635f8 libarrow_dataset.1500.dylib`std::__1::vector<bool, std::__1::allocator<bool>>::__throw_length_error[abi:v160006]() const + 20
    frame #11: 0x000000014801d4f8 libarrow_dataset.1500.dylib`std::__1::vector<bool, std::__1::allocator<bool>>::resize(unsigned long, bool) + 600
    frame #12: 0x000000014801d14c libarrow_dataset.1500.dylib`arrow::dataset::ParquetFileFragment::SetMetadata(std::__1::shared_ptr<parquet::FileMetaData>, std::__1::shared_ptr<parquet::arrow::SchemaManifest>) + 432
    frame #13: 0x000000014801d7e4 libarrow_dataset.1500.dylib`arrow::dataset::ParquetFileFragment::SplitByRowGroup(arrow::compute::Expression) + 720
    frame #14: 0x000000010763b824 _dataset_parquet.cpython-310-darwin.so`__pyx_pw_7pyarrow_16_dataset_parquet_19ParquetFileFragment_5split_by_row_group(_object*, _object* const*, long, _object*) + 1428
```

So it is giving a `terminating due to uncaught exception of type std::length_error: vector` error for the vector resize in `ParquetFileFragment::SetMetadata`, presumably the one that I changed in https://github.com/apache/arrow/pull/39065:

```diff
-  statistics_expressions_complete_.resize(physical_schema_->num_fields(), false);
+  statistics_expressions_complete_.resize(manifest_->descr->num_columns(), false);
```

I am wondering if sometimes `manifest_->descr->num_columns()` could be undefined? The crash also happens specifically in a test where the dataset is created with `ParquetDatasetFactory`.

It's still very strange that this only occurs in the macOS wheels. I found a potentially similar issue (https://github.com/pyg-team/pytorch_geometric/issues/4419), but it also has no clear solution (my guess is that it was related to interference from system libraries; it was typically solved by using a (virtual) environment).
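To illustrate why a garbage column count would end in exactly this `std::length_error`, here is a minimal standalone sketch. It does not use Arrow at all: `size_from_count` and `resize_throws_length_error` are hypothetical helper names, and the `-1` input is only an assumed stand-in for whatever bogus value `num_columns()` might return if the descriptor were invalid or uninitialized.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: mimics passing a signed column count to
// std::vector<bool>::resize, which takes an unsigned size. A negative
// count wraps around to a huge unsigned value.
std::size_t size_from_count(int count) {
  return static_cast<std::size_t>(count);
}

// Hypothetical helper: returns true if resizing a std::vector<bool> to
// `n` elements throws std::length_error (n exceeds max_size()).
bool resize_throws_length_error(std::size_t n) {
  std::vector<bool> flags;
  try {
    flags.resize(n, false);
  } catch (const std::length_error&) {
    return true;
  }
  return false;
}
```

With these helpers, `resize_throws_length_error(size_from_count(-1))` should report a throw, while a sane count such as `size_from_count(8)` should not. In the real crash nothing catches the exception, which would match the `terminating due to uncaught exception` abort in the backtrace.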
