[
https://issues.apache.org/jira/browse/HDFS-16084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoqiao He resolved HDFS-16084.
--------------------------------
Fix Version/s: 3.5.0
Hadoop Flags: Reviewed
Resolution: Fixed
> getJNIEnv() returns invalid pointer when called twice after getGlobalJNIEnv()
> failed
> ------------------------------------------------------------------------------------
>
> Key: HDFS-16084
> URL: https://issues.apache.org/jira/browse/HDFS-16084
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: libhdfs
> Affects Versions: 3.2.1
> Reporter: Antoine Pitrou
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.0
>
>
> First reported in ARROW-13011: when a libhdfs API call fails because
> CLASSPATH isn't set, calling the API a second time leads to a crash.
> *Backtrace*
> This was obtained from the ARROW-13011 reproducer:
> {code}
> #0 globalClassReference (className=className@entry=0x7f75883c13b0
> "org/apache/hadoop/conf/Configuration", env=env@entry=0x6c2f2f3a73666468,
> out=out@entry=0x7fffd86e3020) at
> /build/source/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jni_helper.c:279
> #1 0x00007f75883b9511 in constructNewObjectOfClass
> (env=env@entry=0x6c2f2f3a73666468, out=out@entry=0x7fffd86e3148,
> className=className@entry=0x7f75883c13b0
> "org/apache/hadoop/conf/Configuration",
> ctorSignature=ctorSignature@entry=0x7f75883c1180 "()V")
> at
> /build/source/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jni_helper.c:212
> #2 0x00007f75883bb6d0 in hdfsBuilderConnect (bld=0x5562e4bbb3e0) at
> /build/source/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c:700
> #3 0x00007f758de31ef3 in arrow::io::internal::LibHdfsShim::BuilderConnect
> (this=0x7f758e768240 <arrow::io::internal::(anonymous
> namespace)::libhdfs_shim>,
> bld=0x5562e4bbb3e0) at /arrow/cpp/src/arrow/io/hdfs_internal.cc:366
> #4 0x00007f758de2d098 in
> arrow::io::HadoopFileSystem::HadoopFileSystemImpl::Connect
> (this=0x5562e4a9f750, config=0x5562e46edc30)
> at /arrow/cpp/src/arrow/io/hdfs.cc:372
> #5 0x00007f758de2e646 in arrow::io::HadoopFileSystem::Connect
> (config=0x5562e46edc30, fs=0x5562e46edd08) at
> /arrow/cpp/src/arrow/io/hdfs.cc:590
> #6 0x00007f758d532d2a in arrow::fs::HadoopFileSystem::Impl::Init
> (this=0x5562e46edc30) at /arrow/cpp/src/arrow/filesystem/hdfs.cc:59
> #7 0x00007f758d536931 in arrow::fs::HadoopFileSystem::Make (options=...,
> io_context=...) at /arrow/cpp/src/arrow/filesystem/hdfs.cc:409
> #8 0x00007f75885d7445 in
> __pyx_pf_7pyarrow_5_hdfs_16HadoopFileSystem___init__
> (__pyx_v_self=0x7f758871a970, __pyx_v_host=0x7f758871cc00, __pyx_v_port=8020,
> __pyx_v_user=0x5562e3af6d30 <_Py_NoneStruct>, __pyx_v_replication=3,
> __pyx_v_buffer_size=0, __pyx_v_default_block_size=0x5562e3af6d30
> <_Py_NoneStruct>,
> __pyx_v_kerb_ticket=0x5562e3af6d30 <_Py_NoneStruct>,
> __pyx_v_extra_conf=0x5562e3af6d30 <_Py_NoneStruct>) at _hdfs.cpp:4759
> #9 0x00007f75885d4c88 in
> __pyx_pw_7pyarrow_5_hdfs_16HadoopFileSystem_1__init__
> (__pyx_v_self=0x7f758871a970, __pyx_args=0x7f75900bb048,
> __pyx_kwds=0x7f7590033a68)
> at _hdfs.cpp:4343
> #10 0x00005562e38ca747 in type_call () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Objects/typeobject.c:915
> #11 0x00005562e39117a3 in _PyObject_FastCallDict (kwargs=<optimized out>,
> nargs=<optimized out>, args=<optimized out>,
> func=0x7f75885f1420 <__pyx_type_7pyarrow_5_hdfs_HadoopFileSystem>) at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Objects/tupleobject.c:76
> #12 _PyObject_FastCallKeywords () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Objects/abstract.c:2496
> #13 0x00005562e39121d5 in call_function () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/ceval.c:4875
> #14 0x00005562e3973d68 in _PyEval_EvalFrameDefault () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/ceval.c:3351
> #15 0x00005562e38b98f5 in PyEval_EvalFrameEx (throwflag=0, f=0x7f74c0664768)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/ceval.c:4166
> #16 _PyEval_EvalCodeWithName () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/ceval.c:4166
> #17 0x00005562e38bad79 in PyEval_EvalCodeEx (_co=<optimized out>,
> globals=<optimized out>, locals=<optimized out>, args=<optimized out>,
> argcount=<optimized out>,
> kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0,
> closure=0x0)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/ceval.c:4187
> #18 0x00005562e398b6eb in PyEval_EvalCode (co=<optimized out>,
> globals=<optimized out>, locals=<optimized out>)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/ceval.c:731
> #19 0x00005562e39f30e3 in run_mod () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/pythonrun.c:1025
> #20 0x00005562e3896dd3 in PyRun_InteractiveOneObjectEx (fp=0x7f758f30aa00
> <_IO_2_1_stdin_>, filename=0x7f75900391b8, flags=0x7fffd86e40bc)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/pythonrun.c:246
> #21 0x00005562e3896f85 in PyRun_InteractiveLoopFlags (fp=0x7f758f30aa00
> <_IO_2_1_stdin_>, filename_str=<optimized out>, flags=0x7fffd86e40bc)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/pythonrun.c:114
> #22 0x00005562e3897024 in PyRun_AnyFileExFlags (fp=0x7f758f30aa00
> <_IO_2_1_stdin_>, filename=0x5562e3a32ee6 "<stdin>", closeit=0,
> flags=0x7fffd86e40bc)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Python/pythonrun.c:75
> #23 0x00005562e39f8cc7 in run_file (p_cf=0x7fffd86e40bc, filename=<optimized
> out>, fp=0x7f758f30aa00 <_IO_2_1_stdin_>)
> at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Modules/main.c:340
> #24 Py_Main () at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Modules/main.c:810
> #25 0x00005562e389bf77 in main (argc=1, argv=0x7fffd86e42c8) at
> /home/conda/feedstock_root/build_artifacts/python_1613711361059/work/Programs/python.c:69
> {code}
> *Analysis*
> The first time {{getJNIEnv()}} is called, no thread-local state is registered
> yet. It therefore starts by doing three initialization steps:
> 1) allocate a new {{ThreadLocalState}} structure on the heap
> 2) store the {{ThreadLocalState}} pointer in POSIX thread-local storage
> 3) store the same {{ThreadLocalState}} pointer in a native ({{__thread}})
> shortcut variable
> Then {{getGlobalJNIEnv()}} is called to actually fetch a valid JNI
> environment pointer. However, this call may fail (e.g. CLASSPATH not set
> properly). Then the following happens:
> 1) the {{ThreadLocalState}} is deallocated from the heap
> 2) and... that's all!
> Neither the POSIX thread-local storage slot nor the native {{__thread}}
> shortcut is reset. Both still hold the {{ThreadLocalState}} pointer, but the
> memory it points to has been freed and returned to the allocator.
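
The failure path described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the libhdfs source and not the patch that was committed: get_env(), fetch_global_env(), quick_state and tls_key are hypothetical stand-ins for getJNIEnv(), getGlobalJNIEnv(), the native __thread shortcut and the POSIX thread-local key. The point is only that both references are cleared before the state is freed, so a second call starts from scratch instead of reading freed memory.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-thread state kept by libhdfs. */
struct ThreadLocalState {
    void *env;
};

static pthread_key_t tls_key;                          /* POSIX thread-local slot */
static pthread_once_t tls_key_once = PTHREAD_ONCE_INIT;
static int tls_key_ok = 0;
static __thread struct ThreadLocalState *quick_state;  /* native shortcut */

/* Called by the pthread library at thread exit for a non-NULL slot value. */
static void destroy_state(void *p)
{
    free(p);
}

static void make_key(void)
{
    tls_key_ok = (pthread_key_create(&tls_key, destroy_state) == 0);
}

/* Hypothetical stand-in for getGlobalJNIEnv(): fails when succeed is 0. */
static void *fetch_global_env(int succeed)
{
    static int dummy_env;
    return succeed ? (void *)&dummy_env : NULL;
}

/* Hypothetical stand-in for getJNIEnv(). */
static void *get_env(int succeed)
{
    struct ThreadLocalState *state = quick_state;

    if (state != NULL) {
        return state->env;              /* fast path: state already set up */
    }

    pthread_once(&tls_key_once, make_key);
    if (!tls_key_ok) {
        return NULL;
    }

    state = calloc(1, sizeof(*state));  /* step 1: allocate on the heap */
    if (state == NULL) {
        return NULL;
    }
    if (pthread_setspecific(tls_key, state) != 0) {   /* step 2: POSIX slot */
        free(state);
        return NULL;
    }
    quick_state = state;                /* step 3: native shortcut */

    state->env = fetch_global_env(succeed);
    if (state->env == NULL) {
        /* Clear BOTH references before freeing, so that no dangling
         * pointer survives until the next call. */
        quick_state = NULL;
        pthread_setspecific(tls_key, NULL);
        free(state);
        return NULL;
    }
    return state->env;
}
```

With this cleanup, a second get_env(0) call after a failure returns NULL again rather than a stale pointer. The sketch uses the GCC/Clang __thread extension, and older toolchains may need -pthread at link time.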
> The next time the user tries to call a libhdfs API again, {{getJNIEnv()}}
> returns successfully... with an invalid pointer (or pointing to random data).
> For example:
> {code}
> (gdb) p getJNIEnv()
> $2 = (JNIEnv *) 0x6c2f2f3a73666468
> (gdb) p *getJNIEnv()
> Cannot access memory at address 0x6c2f2f3a73666468
> {code}
> (0x6c2f2f3a73666468 is the little-endian representation of the string
> "hdfs://l")
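
The byte-level reading above can be checked with a short C helper. This is an illustrative sketch only; pointer_bytes_as_text() is a hypothetical name, not part of libhdfs. It extracts the eight bytes of the value starting from the least significant one, which is the order in which a little-endian machine stores them in memory.

```c
#include <stdint.h>

/* Write the 8 bytes of value, least significant byte first, into out
 * as a NUL-terminated string. out must have room for 9 chars. */
static void pointer_bytes_as_text(uint64_t value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        out[i] = (char)((value >> (8 * i)) & 0xff);
    }
    out[8] = '\0';
}
```

Passing 0x6c2f2f3a73666468 yields the bytes 68 64 66 73 3a 2f 2f 6c, that is "hdfs://l", which suggests the freed block was reused for a string holding an HDFS URI.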
> *Note*
> This analysis was done with Hadoop 3.2.1. However, the 3.3.2 and trunk
> source code suggests that {{getJNIEnv()}} has not changed since then.