guowangy opened a new issue, #11895:
URL: https://github.com/apache/gluten/issues/11895
### Backend
VL (Velox)
### Bug description
When running TPC-DS or other heavy scan workloads on HDFS with `IOThreads > 0` and
`SplitPreloadPerDriver > 0`, the JVM process can crash with SIGSEGV
inside `jni_NewStringUTF` during `hdfsGetPathInfo()`. The crashing thread is a
`CPUThreadPoolN` thread used for async split preloading.
**Expected behavior**: HDFS file operations should work reliably on
IOThreadPool threads across consecutive preload tasks.
**Actual behavior**: After a certain number of tasks, the IOThreadPool
thread crashes with SIGSEGV when calling `hdfsGetPathInfo()` via `libhdfs.so`.
## Root cause
`libhdfs.so` caches `JNIEnv*` in an ELF thread-local (`__thread`) variable
after the first `AttachCurrentThread` on each thread. The cached env is
returned on all subsequent calls without re-validation (confirmed by
disassembly of `libhdfs.so`'s `getJNIEnv` function).
Gluten's `JniColumnarBatchIterator::~JniColumnarBatchIterator()`
(`JniCommon.cc`) and `JavaInputStreamAdaptor::Close()` (`JniWrapper.cc`) call
`vm_->DetachCurrentThread()` after JNI cleanup. This invalidates the `JNIEnv*`
and frees the backing `JavaThread` object in the JVM. But libhdfs's TLS cache
still holds the old pointer. On the next HDFS call, `libhdfs`'s `getJNIEnv()`
returns the stale pointer, and the JVM crashes when it tries to transition the
freed thread state.
### Detailed mechanism
**libhdfs `getJNIEnv` fast path** (from disassembly):
```
1. __tls_get_addr() → get &(__thread hdfsTls*)
2. if (tls_ptr != NULL) → return tls_ptr->env // NO RE-VALIDATION
3. else → slow path: AttachCurrentThread, cache env
```
**After `DetachCurrentThread`**:
- JVM frees the `JavaThread` object, reclaims the memory at the env address
- libhdfs `__thread` TLS still holds the stale `hdfsTls*` → stale `env`
- Next HDFS call → `getJNIEnv()` fast path returns stale env
- `jni_NewStringUTF(stale_env, ...)` → computes `JavaThread* = env - 0x200` → freed memory
- JVM reads `*(JavaThread + 0x290)` — gets garbage (not the magic alive marker `0xdeab`)
- JVM calls `block_if_vm_exited()`, which sets the `JavaThread*` to NULL
- `transition_from_native(NULL, ...)` → **SIGSEGV** at address 0x278
### Evidence from core dump
Core dump: `core.CPUThreadPool21.1770392` (from TPC-DS benchmark on YARN)
Registers at crash frame (`ThreadStateTransition::transition_from_native`):
```
RDI = 0x0              ← JavaThread* is NULL (set by block_if_vm_exited)
R12 = 0x7f3003a52200   ← stale JNIEnv* from libhdfs TLS cache
```
Memory at stale env (`0x7f3003a52200`):
```
0x7f3003a52200: 0x0000000000000000 0x0000000000000000  ← JNI function table is NULL
0x7f3003a52210: 0x0000001200000112 0x0000000000000000  ← JVM method resolution data (reused memory)
```
Call chain (resolved from `libvelox.so` symbol table via `nm`):
```
CPUThreadPool21 (preload task)
  → SplitReader::createReader() [libvelox.so + 0x6173914]
    → HdfsFileSystem::openFileForRead() [libvelox.so + 0x3787216]
      → HdfsReadFile::HdfsReadFile() [libvelox.so + 0x378AB36, constructor]
        → driver_->GetPathInfo()
          → hdfsGetPathInfo() [libhdfs.so]
            → getJNIEnv() → returns stale env
              → jni_NewStringUTF(stale_env, path) → SIGSEGV
```
### How DetachCurrentThread gets called on CPUThreadPool threads
The two call sites:
1. `JniColumnarBatchIterator::~JniColumnarBatchIterator()` —
`cpp/core/jni/JniCommon.cc`
2. `JavaInputStreamAdaptor::Close()` — `cpp/core/jni/JniWrapper.cc`
These objects are held via `shared_ptr` chains rooted in the Velox `Task`.
When a task is terminated (e.g., by memory arbitration or
`WholeStageResultIterator::~WholeStageResultIterator()` calling
`task_->requestCancel()`), `Task::terminate()` calls `driver->closeByTask()` →
`closeOperators()` which destroys `DataSource` objects, dropping the last
`shared_ptr` references. If this cleanup runs on a CPUThreadPool thread (e.g.,
triggered by memory pressure callback during a preload task), the destructor
calls `DetachCurrentThread` on that thread.
Sequence:
1. CPUThreadPool21 runs preload task A → libhdfs attaches thread, caches env
in TLS
2. Object cleanup on the same thread → destructor calls
`DetachCurrentThread` → env invalidated, but libhdfs TLS still holds it
3. CPUThreadPool21 runs preload task B → `hdfsGetPathInfo()` → stale env →
**SIGSEGV**
### Gluten version
main branch
### Spark version
Spark-3.5.x
### Spark configurations
_No response_
### System information
_No response_
### Relevant logs
```bash
core dump backtrace:
Core: core.CPUThreadPool21.1770392
#10 ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState) [libjvm.so]
    RDI=0x0 (NULL JavaThread*), R12=0x7f3003a52200 (stale JNIEnv*)
#11 jni_NewStringUTF [libjvm.so]
#12 newJavaStr (env=0x7f3003a52200, path="/.../catalog_sales/...parquet") [libhdfs.so]
#13 constructNewObjectOfPath [libhdfs.so]
#14 hdfsGetPathInfo [libhdfs.so]
#15 HdfsReadFile::HdfsReadFile() [libvelox.so + 0x378AB36]
#16 HdfsFileSystem::openFileForRead() [libvelox.so + 0x3787216]
```