[ 
https://issues.apache.org/jira/browse/HADOOP-7156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003271#comment-13003271
 ] 

Todd Lipcon commented on HADOOP-7156:
-------------------------------------

bq. But for now, I'd rather just advertise "RHEL 6.0 is broken; don't use it" 
just like we do for JREs.

Unfortunately many of us are not in a position to do this - RHEL 6.0 is a must 
for us, regardless of some bugs it might have. Same with support for Vintela 
Authentication Services (VAS) which has a similar bug. Asking users to switch 
their entire OS or auth system is not an option.

I think there are three workable options here from my perspective:

1) Always lock around getpwuid_r. Devaraj is adding a cache for this function as 
part of another JIRA, so it shouldn't be a big performance issue.
2) Add a compile-time macro like LOCK_AROUND_PWUID, which, when set, will add 
the monitor lock around these calls.
3) Add a runtime Hadoop configuration option like 
hadoop.workaround.broken.getpwuid, which when enabled adds the lock.
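
For illustration only, here is a rough sketch of what options 1 and 2 could look like in native code. It is not the attached patch. The function name lookup_user_name, the pwuid_lock mutex, and the fixed 4096-byte buffer are assumptions made up for this example, and a plain pthread mutex stands in for the JNI monitor lock mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <pwd.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Option 2: the lock is compiled in only when LOCK_AROUND_PWUID is defined.
 * Dropping the #ifdef and always defining the lock gives option 1.
 * pwuid_lock is a hypothetical name; a pthread mutex is used here in place
 * of a JNI monitor so the sketch stays self-contained.
 */
#ifdef LOCK_AROUND_PWUID
static pthread_mutex_t pwuid_lock = PTHREAD_MUTEX_INITIALIZER;
#define PWUID_LOCK()   pthread_mutex_lock(&pwuid_lock)
#define PWUID_UNLOCK() pthread_mutex_unlock(&pwuid_lock)
#else
#define PWUID_LOCK()   ((void)0)
#define PWUID_UNLOCK() ((void)0)
#endif

/*
 * Hypothetical helper: copy the user name for uid into out.
 * Returns 0 on success, ENOENT if no entry exists, ERANGE if out is too
 * small, or the error code reported by getpwuid_r.
 */
int lookup_user_name(uid_t uid, char *out, size_t outlen)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    char buf[4096]; /* assumed large enough for typical passwd entries */
    int rc;

    assert(out != NULL && outlen > 0);

    /* Serialize every call so a non-thread-safe NSS module is never
     * entered by two threads at once. */
    PWUID_LOCK();
    rc = getpwuid_r(uid, &pwd, buf, sizeof(buf), &result);
    if (rc == 0 && result == NULL) {
        rc = ENOENT; /* lookup succeeded but no matching entry */
    }
    if (rc == 0) {
        /* Copy while still holding the lock, before buf goes out of use. */
        if (strlen(result->pw_name) >= outlen) {
            rc = ERANGE;
        } else {
            strcpy(out, result->pw_name);
        }
    }
    PWUID_UNLOCK();
    return rc;
}
```

Building with -DLOCK_AROUND_PWUID -pthread would enable the lock. Option 3 would replace the #ifdef with a flag read once from the Hadoop configuration and checked at runtime.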

Which, if any, of these seem acceptable to you?

Since we've found that this isn't RHEL6 specific, but in fact occurs with other 
broken pieces of software as well, I'm leaning towards option #1 or #3.

> getpwuid_r is not thread-safe on RHEL6
> --------------------------------------
>
>                 Key: HADOOP-7156
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7156
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>         Environment: RHEL 6.0 "Santiago"
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.22.0
>
>         Attachments: hadoop-7156.txt
>
>
> Due to the following bug in SSSD, functions like getpwuid_r are not 
> thread-safe in RHEL 6.0 if sssd is specified in /etc/nsswitch.conf (as it is 
> by default):
> https://fedorahosted.org/sssd/ticket/640
> This causes many fetch failures in the case that the native libraries are 
> available, since the SecureIO functions call getpwuid_r as part of fstat. By 
> enabling -Xcheck:jni I get the following trace on JVM crash:
> *** glibc detected *** /mnt/toolchain/JDK6u20-64bit/bin/java: free(): invalid 
> pointer: 0x0000003575741d23 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x3575675676]
> /lib64/libnss_sss.so.2(_nss_sss_getpwuid_r+0x11b)[0x7fe716cb42cb]
> /lib64/libc.so.6(getpwuid_r+0xdd)[0x35756a5dfd]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
