lidavidm commented on PR #35090:
URL: https://github.com/apache/arrow/pull/35090#issuecomment-1508217941
Ok, I checked out the instance that Weston set up. (Thanks a lot, Weston!)
The test fails intermittently even when I run only
`GrpcDataTest.TestDoGetInts`. The backtrace shows something suspicious: the
main thread is executing a gRPC-internal destructor, `ExecCtx::~ExecCtx`:
```
thread #1, queue = 'com.apple.main-thread'
    frame #0: 0x00007ff800bcb889 libsystem_pthread.dylib`pthread_mutex_lock + 11
    frame #1: 0x00000001002af7b9 libgpr.29.dylib`gpr_mu_lock + 9
    frame #2: 0x000000010123390f libgrpc.29.dylib`fd_orphan(grpc_fd*, grpc_closure*, int*, char const*) + 255
    frame #3: 0x0000000101249487 libgrpc.29.dylib`deactivated_all_ports(grpc_tcp_server*) + 135
    frame #4: 0x0000000101249195 libgrpc.29.dylib`on_read(void*, absl::lts_20230125::Status) + 1477
    frame #5: 0x000000010123794c libgrpc.29.dylib`grpc_core::ExecCtx::Flush() + 156
    frame #6: 0x0000000101034f11 libgrpc.29.dylib`grpc_core::ExecCtx::~ExecCtx() + 33
    frame #7: 0x00000001012e9110 libgrpc.29.dylib`grpc_server_shutdown_and_notify + 288
    frame #8: 0x0000000100a441bc libgrpc++.1.51.dylib`grpc::Server::ShutdownInternal(gpr_timespec) + 236
    frame #9: 0x00000001008d249f libarrow_flight.1200.dylib`arrow::flight::transport::grpc::(anonymous namespace)::GrpcServerTransport::Shutdown() + 47
```
Meanwhile, one of the gRPC server threads is also inside `ExecCtx::Flush()`,
where it crashes with `EXC_BAD_ACCESS`:
```
* thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x9)
  * frame #0: 0x0000000101222f64 libgrpc.29.dylib`grpc_core::StatusGetChildren(absl::lts_20230125::Status) + 68
    frame #1: 0x00000001012eeb8d libgrpc.29.dylib`grpc_error_has_clear_grpc_status(absl::lts_20230125::Status) + 125
    frame #2: 0x000000010113142c libgrpc.29.dylib`close_transport_locked(grpc_chttp2_transport*, absl::lts_20230125::Status) + 92
    frame #3: 0x0000000101135173 libgrpc.29.dylib`read_action_locked(void*, absl::lts_20230125::Status) + 1491
    frame #4: 0x0000000101231afe libgrpc.29.dylib`grpc_combiner_continue_exec_ctx() + 158
    frame #5: 0x00000001012378fe libgrpc.29.dylib`grpc_core::ExecCtx::Flush() + 78
    frame #6: 0x0000000101234a06 libgrpc.29.dylib`pollset_work(grpc_pollset*, grpc_pollset_worker**, grpc_core::Timestamp) + 2662
    frame #7: 0x0000000101236f86 libgrpc.29.dylib`pollset_work(grpc_pollset*, grpc_pollset_worker**, grpc_core::Timestamp) + 22
    frame #8: 0x000000010123ad23 libgrpc.29.dylib`grpc_pollset_work(grpc_pollset*, grpc_pollset_worker**, grpc_core::Timestamp) + 19
    frame #9: 0x00000001012dedb0 libgrpc.29.dylib`cq_next(grpc_completion_queue*, gpr_timespec, void*) + 544
    frame #10: 0x0000000100a29b00 libgrpc++.1.51.dylib`grpc::CompletionQueue::AsyncNextInternal(void**, bool*, gpr_timespec) + 80
    frame #11: 0x0000000100a45655 libgrpc++.1.51.dylib`grpc::Server::SyncRequestThreadManager::PollForWork(void**, bool*) + 101
    frame #12: 0x0000000100a50700 libgrpc++.1.51.dylib`grpc::ThreadManager::MainWorkLoop() + 64
```
That said, `ExecCtx` is supposed to be thread-local, so I don't see how the
two threads would trample each other here. I also can't get lldb to print the
thread-local value, so I can't check whether it's initialized.
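
For reference, here is a minimal sketch of the behavior I'd expect from a thread-local: each thread sees its own copy of the pointer, so writes on one thread should never be visible on another. `FakeCtx`, `current_ctx`, and `WorkerSeesOnlyItsOwnCtx` are hypothetical names for illustration only; this is not how gRPC actually implements `ExecCtx`.

```cpp
#include <thread>

// Hypothetical stand-in for a per-thread execution context (not gRPC's ExecCtx).
struct FakeCtx {
  int id;
};

// Every thread gets its own independent copy of this pointer.
thread_local FakeCtx* current_ctx = nullptr;

// Sets a context on the calling thread, then does the same on a worker thread.
// Returns true if the worker started with a null pointer, saw only its own
// context, and left the calling thread's pointer untouched.
bool WorkerSeesOnlyItsOwnCtx() {
  FakeCtx main_ctx{1};
  current_ctx = &main_ctx;  // affects the calling thread only

  bool worker_started_null = false;
  bool worker_saw_own = false;
  std::thread worker([&] {
    // The worker's copy is freshly initialized, regardless of the caller's.
    worker_started_null = (current_ctx == nullptr);
    FakeCtx worker_ctx{2};
    current_ctx = &worker_ctx;
    worker_saw_own = (current_ctx->id == 2);
    current_ctx = nullptr;  // do not leave a dangling pointer behind
  });
  worker.join();

  bool main_unchanged = (current_ctx == &main_ctx);
  current_ctx = nullptr;
  return worker_started_null && worker_saw_own && main_unchanged;
}
```

If the real thread-local behaves like this, the two stacks above shouldn't be sharing an `ExecCtx`, which suggests the corruption is in whatever state the closures touch rather than in the context itself. That part is a guess, not something the backtraces confirm.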