On Thu, 2009-12-03 at 09:53 +0100, Petter Reinholdtsen wrote:
> I backported nss-pam-ldapd version 0.7.1 to Lenny, and installed the
> nslcd and libnss-ldapd (pam would not work without newer pam-runtime)
> packages to see if I could get rid of a segfault in nscd.

Thanks for reporting this. Just to be clear, you are seeing this with
libnss-ldapd 0.6.7.1 and 0.7.1?

> To trigger the problem, I try to log in using ssh. I suspect the cause
> is the huge amount of groups and the huge amount of group members we
> have here at the university.

Hmm, I've tried (even installed SSH in a lenny chroot jail in my test
environment) but haven't been able to reproduce this (one group with
1000 members).

> This is the valgrind output when it crashes:
>
> ==9529==
> ==9529== Process terminating with default action of signal 11 (SIGSEGV)
> ==9529==  Bad permissions for mapped region at address 0xAEA0000
> ==9529==    at 0xB5CB663: read_group (group.c:43)
> ==9529==    by 0xB5CB9E3: _nss_ldap_getgrent_r (group.c:155)
> ==9529==    by 0xB5A54D0: getgrent_next_nss (compat-initgroups.c:324)
> ==9529==    by 0xB5A596E: _nss_compat_initgroups_dyn (compat-initgroups.c:430)
> ==9529==    by 0x1232C: addinitgroupsX (in /usr/sbin/nscd)
> ==9529==    by 0x12E0E: readdinitgroups (in /usr/sbin/nscd)
> ==9529==    by 0xF302: prune_cache (in /usr/sbin/nscd)
> ==9529==    by 0x756A: nscd_run (in /usr/sbin/nscd)
> ==9529==    by 0x4836F3A: start_thread (pthread_create.c:297)
> ==9529==    by 0x492ABED: clone (in /usr/lib/debug/libc-2.7.so)
> ==9529==
> ==9529== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 25 from 1)
> ==9529== malloc/free: in use at exit: 276,148 bytes in 16 blocks.
> ==9529== malloc/free: 17,708 allocs, 17,692 frees, 1,159,719,814 bytes
> allocated.
> ==9529== For counts of detected errors, rerun with: -v
> ==9529== searching for pointers to 16 not-freed blocks.
> ==9529== checked 8,918,640 bytes.
> ==9529==
> ==9529== LEAK SUMMARY:
> ==9529==    definitely lost: 136 bytes in 1 blocks.
> ==9529==      possibly lost: 952 bytes in 7 blocks.
> ==9529==    still reachable: 275,060 bytes in 8 blocks.
> ==9529==         suppressed: 0 bytes in 0 blocks.
> ==9529== Rerun with --leak-check=full to see details of leaked memory.
> Killed

The segfault happens when reading a returned group entry (while listing
all groups) and specifically when reading the group members.

Can you also reproduce this with just 'getent group' (or id -a user)?
Does it make a difference if nscd is running or not? Does cleaning the
nscd cache make a difference (nscd -i passwd; nscd -i group)?

If this is a problem with the communication between nscd and the NSS
module, recompiling the NSS module with -DDEBUG_PROT (and maybe even
-DDEBUG_PROT_DUMP) could give a lot more details. Warning: this causes
every command that does NSS lookups (through LDAP) to output a lot of
debugging information.

> Changing the group entry to 'files ldap' avoid the crash, but will not
> work for me as we use +...@netgroup entries in /etc/group and
> /etc/passwd.

Does changing +...@netgroup to just + make a difference (haven't set
this up in my test environment)?

Anyway, thanks for pointing this out.

-- 
-- arthur - [email protected] - http://people.debian.org/~adejong --
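
The two lookups asked about above ('getent group' and 'id -a user') can
also be driven from a short script, which may make it easier to see how
large the groups are when the crash happens. This is only a rough sketch:
the function names largest_groups and supplementary_groups are made up
for illustration, and whether these calls reach the same code path as
the trace depends on the local nsswitch.conf. Python's grp.getgrall()
walks the C library's group enumeration, and os.getgrouplist() calls
getgrouplist(3).

```python
import grp
import os
import pwd


def largest_groups(limit=5):
    """Enumerate all groups (similar to 'getent group').

    Returns the total number of groups and the `limit` largest ones
    as (member_count, group_name) pairs, biggest first.
    """
    groups = grp.getgrall()  # goes through the configured NSS sources
    sized = sorted(((len(g.gr_mem), g.gr_name) for g in groups), reverse=True)
    return len(groups), sized[:limit]


def supplementary_groups(user):
    """Resolve the group IDs of `user` (roughly what 'id -a user' needs)."""
    entry = pwd.getpwnam(user)  # raises KeyError for an unknown user
    return os.getgrouplist(user, entry.pw_gid)


if __name__ == "__main__":
    total, top = largest_groups()
    print(f"{total} groups enumerated")
    for members, name in top:
        print(f"{members:6d} members  {name}")
```

If the enumeration itself segfaults here too, that would point at the
NSS module rather than at nscd. Calling supplementary_groups("someuser")
with a real account name exercises the per-user lookup instead.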

