[ https://issues.apache.org/jira/browse/HADOOP-7156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003271#comment-13003271 ]
Todd Lipcon commented on HADOOP-7156:
-------------------------------------
bq. But for now, I'd rather just advertise "RHEL 6.0 is broken; don't use it"
just like we do for JREs.
Unfortunately many of us are not in a position to do this - RHEL 6.0 is a must
for us, regardless of some bugs it might have. Same with support for Vintela
Authentication Services (VAS) which has a similar bug. Asking users to switch
their entire OS or auth system is not an option.
I think there are three workable options here from my perspective:
1) Always lock around getpwuid_r. Devaraj is adding a cache for this function as
part of another JIRA, so it shouldn't be a big performance issue.
2) Add a compile-time macro like LOCK_AROUND_PWUID, which, when set, will add
the monitor lock around these calls.
3) Add a runtime Hadoop configuration option like
hadoop.workaround.broken.getpwuid, which when enabled adds the lock.
Which, if any, of these seem acceptable to you?
Since we've found that this isn't RHEL6 specific, but in fact occurs with other
broken pieces of software as well, I'm leaning towards option #1 or #3.
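To make options #1 and #2 concrete, here is a rough sketch of a wrapper that serializes calls to getpwuid_r. It is only an illustration, not the attached patch: the name locked_getpwuid_r and the pthread mutex are assumptions made for this example (the native code could just as well take a JNI monitor lock), and LOCK_AROUND_PWUID is the macro name proposed in option #2.

```c
#include <pthread.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

#ifdef LOCK_AROUND_PWUID
/* Hypothetical process-wide lock (an assumption for this sketch; a JNI
 * monitor lock would serve the same purpose). */
static pthread_mutex_t pwuid_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

/*
 * Hypothetical wrapper with the same contract as getpwuid_r(3).
 * When LOCK_AROUND_PWUID is defined at compile time (option #2), every
 * call is serialized so that a non-thread-safe NSS backend is never
 * entered by two threads at once. Removing the #ifdef guards and always
 * taking the lock would give option #1.
 */
static int locked_getpwuid_r(uid_t uid, struct passwd *pwd,
                             char *buf, size_t buflen,
                             struct passwd **result)
{
    int rc;
#ifdef LOCK_AROUND_PWUID
    pthread_mutex_lock(&pwuid_lock);
#endif
    rc = getpwuid_r(uid, pwd, buf, buflen, result);
#ifdef LOCK_AROUND_PWUID
    pthread_mutex_unlock(&pwuid_lock);
#endif
    return rc;
}
```

Callers would use locked_getpwuid_r wherever they call getpwuid_r today, and the build would pass -DLOCK_AROUND_PWUID on affected platforms. A lock of this kind only protects calls that go through the wrapper, so any other code in the process that calls getpwuid_r directly could still race with it.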
> getpwuid_r is not thread-safe on RHEL6
> --------------------------------------
>
> Key: HADOOP-7156
> URL: https://issues.apache.org/jira/browse/HADOOP-7156
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 0.22.0
> Environment: RHEL 6.0 "Santiago"
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Fix For: 0.22.0
>
> Attachments: hadoop-7156.txt
>
>
> Due to the following bug in SSSD, functions like getpwuid_r are not
> thread-safe in RHEL 6.0 if sssd is specified in /etc/nsswitch.conf (as it is
> by default):
> https://fedorahosted.org/sssd/ticket/640
> This causes many fetch failures in the case that the native libraries are
> available, since the SecureIO functions call getpwuid_r as part of fstat. By
> enabling -Xcheck:jni I get the following trace on JVM crash:
> *** glibc detected *** /mnt/toolchain/JDK6u20-64bit/bin/java: free(): invalid
> pointer: 0x0000003575741d23 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x3575675676]
> /lib64/libnss_sss.so.2(_nss_sss_getpwuid_r+0x11b)[0x7fe716cb42cb]
> /lib64/libc.so.6(getpwuid_r+0xdd)[0x35756a5dfd]
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira