--- Begin Message ---
Package: nis
Version: 3.17-32
I'm not sure what happened recently; I am running sid (unstable) and a
reasonably recent kernel (3.1.0). I just upgraded to perl 5.14.
The machine is an NIS client, and for some reason ypbind is not starting.
This causes all sorts of problems with userid->name mappings not working.
The tail of the strace -f output from an attempt to start ypbind is:
17044 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
17044 close(1022) = -1 EBADF (Bad file descriptor)
17044 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
17044 close(1023) = -1 EBADF (Bad file descriptor)
17044 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
17044 umask(0) = 022
17044 open("/dev/null", O_RDWR) = 0
17044 dup(0) = 1
17044 dup(0) = 2
17044 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT SEGV PIPE TERM CHLD], NULL, 8) = 0
17044 mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb6aa5000
17044 mprotect(0xb6aa5000, 4096, PROT_NONE) = 0
17044 clone(child_stack=0xb72a5434, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0xb72a5bd8, {entry_number:6, base_addr:0xb72a5b70, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0xb72a5bd8) = 17045
17045 set_robust_list(0xb72a5be0, 0xc) = 0
17045 open("/var/run/ypbind.pid", O_RDWR|O_CREAT, 0644) = 3
17045 fcntl64(3, F_GETFD) = 0
17045 fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
17045 fcntl64(3, F_GETLK, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0, pid=3077939200}) = 0
17045 fcntl64(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
17045 write(3, "17044\n", 6) = 6
17045 futex(0x8052b24, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x8052afc, 2) = 0
17045 rt_sigtimedwait([HUP INT QUIT SEGV TERM CHLD], NULL, NULL, 8 <unfinished ...>
17044 futex(0x8052b24, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EAGAIN (Resource temporarily unavailable)
17044 futex(0x8052afc, FUTEX_WAKE_PRIVATE, 1) = 0
17044 socket(PF_NETLINK, SOCK_RAW, 0) = 4
17044 bind(4, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
17044 getsockname(4, {sa_family=AF_NETLINK, pid=17044, groups=00000000}, [12]) = 0
17044 time(NULL) = 1321461087
17044 sendto(4, "\24\0\0\0\22\0\1\3_\345\303N\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = -1 ECONNREFUSED (Connection refused)
17044 close(4) = 0
17044 dup(2) = 4
17044 fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
17044 fstat64(4, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
17044 ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfa3fb68) = -1 ENOTTY (Inappropriate ioctl for device)
17044 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7810000
17044 _llseek(4, 0, [0], SEEK_CUR) = 0
17044 write(4, "get_myaddress: getifaddrs: Connection refused\n", 46) = 46
17044 close(4) = 0
17044 munmap(0xb7810000, 4096) = 0
17044 exit_group(1) = ?
17043 exit_group(0) = ?
17042 exit_group(0) = ?
17041 exit_group(0) = ?
I tried statically configuring the YP server in /etc/yp.conf rather than
using -broadcast; no apparent difference. (The above trace is actually
from this test.)
Trying to decode the netlink message: the socket is created with protocol 0,
which is NETLINK_ROUTE, and the header of the request sent on it is:
\24\0\0\0 = 024 = 20 (nlmsg_len)
\22\0 = 022 = 18 (nlmsg_type) RTM_GETLINK
\1\3 = 0x301 (nlmsg_flags): NLM_F_REQUEST | NLM_F_DUMP
_\345\303N = 0x4ec3e55f (nlmsg_seq)
\0\0\0\0 = 0 (nlmsg_pid)
\0\0\0\0 = 0 (not sure)
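For anyone who wants to double-check the hand decoding above, here is a minimal
Python sketch that unpacks the same 20 bytes as a struct nlmsghdr. The byte
string is copied from the sendto() line in the trace; the function name
decode_nlmsghdr and the dictionary keys are just illustrative. Little-endian
layout is assumed because the trace comes from a 32-bit x86 machine. The
trailing four bytes are returned undecoded; my guess (an assumption, not
verified against the glibc source) is that they are a one-byte struct rtgenmsg
family of AF_UNSPEC plus three bytes of padding.

```python
import struct

# Payload of the failing sendto(), copied verbatim from the strace output
# (strace prints non-printable bytes as octal escapes).
RAW = b"\24\0\0\0\22\0\1\3_\345\303N\0\0\0\0\0\0\0\0"

# Constants from <linux/rtnetlink.h> and <linux/netlink.h>.
RTM_GETLINK = 18
NLM_F_REQUEST = 0x001
NLM_F_ROOT = 0x100
NLM_F_MATCH = 0x200
NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH


def decode_nlmsghdr(buf):
    """Split a netlink message into its 16-byte header fields and payload.

    Assumes little-endian byte order (the trace is from i386).
    """
    nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid = struct.unpack_from(
        "<IHHII", buf, 0
    )
    return {
        "nlmsg_len": nlmsg_len,
        "nlmsg_type": nlmsg_type,
        "nlmsg_flags": nlmsg_flags,
        "nlmsg_seq": nlmsg_seq,
        "nlmsg_pid": nlmsg_pid,
        "payload": buf[16:nlmsg_len],  # whatever follows the header, undecoded
    }


hdr = decode_nlmsghdr(RAW)
for key, value in hdr.items():
    print(key, value)
```

This agrees with the list above: length 20, type 18 (RTM_GETLINK), flags 0x301
(NLM_F_REQUEST | NLM_F_DUMP), pid 0. The sequence number 0x4ec3e55f is
1321461087 in decimal, which is exactly the value returned by the time(NULL)
call just before the sendto() in the trace.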
Searching the kernel source, the ECONNREFUSED appears to be generated in
netlink_dump_start at net/netlink/af_netlink.c:1749, when netlink_lookup()
fails. But I don't know enough about what the #$#$% is going on inside the
netlink code to know whether it should be failing or not.
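To check whether the kernel rejects this request independently of ypbind, the
same message can be replayed from a small script. The following is only a
sketch under assumptions: the helper names build_getlink_dump and
try_getlink_dump are made up for illustration, the four-byte body is assumed to
be an address-family byte (AF_UNSPEC) plus padding, and it relies on Python's
Linux-only socket.AF_NETLINK support. It mirrors the socket/bind/sendto
sequence from the trace and does not read any reply.

```python
import socket
import struct
import time

# Constants from <linux/rtnetlink.h> and <linux/netlink.h>.
RTM_GETLINK = 18
NLM_F_REQUEST = 0x001
NLM_F_DUMP = 0x300  # NLM_F_ROOT | NLM_F_MATCH
AF_UNSPEC = 0


def build_getlink_dump(seq):
    """Build a 20-byte RTM_GETLINK dump request like the one in the trace.

    The 4-byte body (family byte + 3 pad bytes) is an assumption; the trace
    only shows that these bytes are all zero.
    """
    body = struct.pack("=Bxxx", AF_UNSPEC)
    header = struct.pack(
        "=IHHII",
        16 + len(body),              # nlmsg_len
        RTM_GETLINK,                 # nlmsg_type
        NLM_F_REQUEST | NLM_F_DUMP,  # nlmsg_flags
        seq,                         # nlmsg_seq
        0,                           # nlmsg_pid
    )
    return header + body


def try_getlink_dump():
    """Send the request to the kernel; return None on success, else the OSError."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE) as s:
        s.bind((0, 0))  # (pid, groups): let the kernel assign the port id
        try:
            s.sendto(build_getlink_dump(int(time.time())), (0, 0))
        except OSError as exc:
            return exc
    return None
```

Calling try_getlink_dump() on the affected machine and printing the result
should show whether sendto() fails with ECONNREFUSED outside of ypbind as well,
which would point at the kernel or its configuration rather than at the nis
package.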
--- End Message ---