marin-ma opened a new pull request, #5657: URL: https://github.com/apache/incubator-gluten/pull/5657
If `spark.gluten.sql.columnar.backend.velox.IOThreads` is set > 0, async IO is enabled and the program may crash as below:

```
(gdb) thread apply all bt

Thread 2 (Thread 0x7fffebbff640 (LWP 1871752) "IOThreadPool0"):
#0  0x00007ffff6eba637 in facebook::velox::process::ThreadLocalRegistry<facebook::velox::process::TraceHistory>::Reference::~Reference() () from /home/sparkuser/gluten/cpp/cmake-build-release/releases/libvelox.so
#1  0x00007fffeea45d9f in __GI___call_tls_dtors () at ./stdlib/cxa_thread_atexit_impl.c:159
#2  0x00007fffeea94945 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:450
#3  0x00007fffeeb26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x7fffee9cf440 (LWP 1871731) "generic_benchma"):
#0  __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=1871752, futex_word=0x7fffebbff910) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=128, abstime=0x0, clockid=0, expected=1871752, futex_word=0x7fffebbff910) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7fffebbff910, expected=1871752, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128) at ./nptl/futex-internal.c:139
#3  0x00007fffeea96624 in __pthread_clockjoin_ex (threadid=140737148614208, thread_return=0x0, clockid=0, abstime=0x0, block=<optimized out>) at ./nptl/pthread_join_common.c:105
#4  0x00007fffeeedc2c7 in std::thread::join() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff6f4b610 in folly::ThreadPoolExecutor::joinStoppedThreads (this=0x555555617640, n=1) at /usr/include/c++/11/bits/shared_ptr_base.h:1295
#6  0x00007ffff6f4b774 in folly::ThreadPoolExecutor::stopAndJoinAllThreads (this=0x555555617640, isJoin=<optimized out>) at /home/sparkuser/gluten-debug/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp:290
#7  0x00007ffff6f3f56d in folly::IOThreadPoolExecutor::~IOThreadPoolExecutor (this=this@entry=0x555555617640, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/sparkuser/gluten-debug/ep/build-velox/build/velox_ep/folly/folly/executors/IOThreadPoolExecutor.cpp:130
#8  0x00007ffff6f3f6ad in folly::IOThreadPoolExecutor::~IOThreadPoolExecutor (this=0x555555617640, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/sparkuser/gluten-debug/ep/build-velox/build/velox_ep/folly/folly/executors/IOThreadPoolExecutor.cpp:131
#9  0x00007ffff3393275 in std::unique_ptr<gluten::VeloxBackend, std::default_delete<gluten::VeloxBackend> >::~unique_ptr() () from /home/sparkuser/gluten/cpp/cmake-build-release/releases/libvelox.so
#10 0x00007fffeea45a56 in __cxa_finalize (d=0x7ffff7e7f680) at ./stdlib/cxa_finalize.c:83
#11 0x00007ffff33017d7 in __do_global_dtors_aux () from /home/sparkuser/gluten/cpp/cmake-build-release/releases/libvelox.so
#12 0x00007fffffffd270 in ?? ()
#13 0x00007ffff7fc924e in _dl_fini () at ./elf/dl-fini.c:142
Backtrace stopped: frame did not save the PC
```

Currently, the `ioExecutor_` in `VeloxBackend` is destructed in the `VeloxBackend` dtor, which runs on the main thread during static destruction. Destructing the `IOThreadPoolExecutor` stops and joins all of its worker threads. When those threads exit, their thread-local destructors run, and these destructors may reference global variables that have already been destructed by that point. So we need to destruct the `IOThreadPoolExecutor` and stop its threads before global variables get destructed.

For example, in Velox's code https://github.com/facebookincubator/velox/blob/5c4903fe2c1f640094295141206f23fe9a54c218/velox/common/process/TraceContext.cpp#L27-L31, the thread-local variable `threadLocalTraceData` is torn down on thread exit inside the `VeloxBackend` dtor, but at that point the global `registry` it references has already been destructed.
