Bug#855662: fakeroot: when msgrcv is interrupted by a signal, faked accidentally reprocesses the previous message
I've no reason to think that the patch I supplied here, some years ago now, was anything but a good idea, but it seems that my coworker Susi Berrington has found the real cause of my pain. According to:

https://github.com/systemd/systemd/issues/2039 (Change default value of RemoveIPC in logind.conf)

... systemd is removing fakeroot's IPC objects when the user performing the build, which in this case is happening under cron, programmatically ssh()s into the headless build machine to check on the build status. Argh! It's been more than a month since we stopped that from happening, a blissful month without failures.
Bug#855662: fakeroot: when msgrcv is interrupted by a signal, faked accidentally reprocesses the previous message
> A new bug is better

Agreed:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=856439 (fakeroot doesn't detect and handle message queue and semaphore id collision)
Bug#855662: fakeroot: when msgrcv is interrupted by a signal, faked accidentally reprocesses the previous message
> Good luck.

Sadly, several days of parallel package building later, I've had another failure. Happily, now that errno's been preserved by my previous patch, I have A New Clue:

FAKEROOT: r=-1, received message type=1, message=3
FAKEROOT, get_msg: Identifier removed
errno=43, EINTR=4

On this system, that's:

martind@ussw-cbld00:~$ grep -w 43 /usr/include/asm-generic/errno.h
#define EIDRM 43 /* Identifier removed */
martind@ussw-cbld00:~$

Now through early evening fog I see that the message queues and the semaphore are being generated with a random key. Collisions with existing keys aren't detected. I have a patch that I hope will detect that and retry. Would you like that here or in a new bug? Including the first patch or assuming it's applied first or against the baseline version? Any way is easy for me. I imagine the BTS will just do the right thing with a text attachment but five minutes of googling hasn't confirmed that's how to submit a patch to an existing Debian bug. Waiting a week for the patch to dis/prove itself would be perfectly sensible, but I'm happy to embarrass myself if you're happy to look at it before then.
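For readers following the thread, here is a minimal sketch of what "detect a key collision and retry" could look like for a System V message queue. It is not the patch mentioned above, which is not shown in this message. The function name `create_unique_queue`, the retry limit and the `0600` mode are hypothetical; the only technique relied on is that `msgget` with `IPC_CREAT | IPC_EXCL` fails with `EEXIST` when the key is already in use.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical helper, not taken from fakeroot.
 * Tries up to max_tries random keys. IPC_EXCL makes msgget fail with
 * EEXIST when a queue already exists for the key, rather than silently
 * returning the existing queue's id.
 * Returns the new queue id and stores the key in *key_out, or returns -1
 * with errno set. */
static int create_unique_queue(int max_tries, key_t *key_out)
{
    for (int i = 0; i < max_tries; i++) {
        key_t key = (key_t)rand();
        if (key == IPC_PRIVATE)
            continue;               /* IPC_PRIVATE has a special meaning */

        int qid = msgget(key, IPC_CREAT | IPC_EXCL | 0600);
        if (qid != -1) {
            *key_out = key;
            return qid;
        }
        if (errno != EEXIST)
            return -1;              /* a real error, not a collision */
        /* EEXIST: the key is taken, so loop and try another one */
    }
    errno = EEXIST;
    return -1;
}
```

The same `IPC_CREAT | IPC_EXCL` idea applies to `semget` for the semaphore. Note that this only guards against collisions at creation time; it does nothing about a queue being removed later by something else.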
Bug#855662: fakeroot: when msgrcv is interrupted by a signal, faked accidentally reprocesses the previous message
On Fri, Feb 24, 2017 at 02:58:41AM +, Martin Dorey wrote:
> Now through early evening fog I see that the message queues and the semaphore
> are being generated with a random key. Collisions with existing keys aren't
> detected. I have a patch that I hope will detect that and retry. Would you
> like that here or in a new bug? Including the first patch or assuming it's
> applied first or against the baseline version? Any way is easy for me. I
> imagine the BTS will just do the right thing with a text attachment but five
> minutes of googling hasn't confirmed that's how to submit a patch to an
> existing Debian bug. Waiting a week for the patch to dis/prove itself would
> be perfectly sensible, but I'm happy to embarrass myself if you're happy to
> look at it before then.

A new bug is better since it's fixing a different thing, preferably based off of https://anonscm.debian.org/cgit/users/clint/fakeroot.git/log/?id=refs/heads/upstream , text attachments work fine, whenever you're ready.
Bug#855662: fakeroot: when msgrcv is interrupted by a signal, faked accidentally reprocesses the previous message
On Mon, Feb 20, 2017 at 05:31:15PM -0800, Martin Dorey wrote:
> I enclose a patch to address that.

Thanks!

> I don't know at this juncture whether it fixes my original problem.
> I doubt it but fingers crossed.

Good luck.
Bug#855662: fakeroot: when msgrcv is interrupted by a signal, faked accidentally reprocesses the previous message
Package: fakeroot
Version: 1.20.2-1
Severity: normal
Tags: patch

Dear Maintainer,

I'm trying to track down an intermittent failure that originally presented like this:

dh_md5sums
dh_builddeb -- -Znone
dpkg-deb: building package `mercury-main-4604p00p1099' in `../mercury-main-4604p00p1099_14.1.4604.00.1099_amd64.deb'.
dpkg-deb: building package `mercury-main-4604p00p1099-dbg' in `../mercury-main-4604p00p1099-dbg_14.1.4604.00.1099_amd64.deb'.
semop(1): encountered an error: Invalid argument
debian/rules:57: recipe for target 'binary-arch' failed
make[8]: *** [binary-arch] Error 1
/usr/bin/fakeroot: line 1: kill: (6633) - No such process
dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit status 2

I've got fakeroot now lashed up to run with debugging. Now the presentation is like this:

FAKEROOT: r=320, received message type=1, message=3
FAKEROOT: process stat oldstate=dev:ino=(801:1661605), mode=0100644, own=(0,0), nlink=1, rdev=0
FAKEROOT:(previously unknown)
FAKEROOT: r=320, received message type=1, message=3
FAKEROOT: process stat oldstate=dev:ino=(801:1300078), mode=0100644, own=(0,0), nlink=1, rdev=0
FAKEROOT:(previously unknown)
semop(1): encountered an error: Invalid argument
debian/rules:57: recipe for target 'binary-arch' failed
make[8]: *** [binary-arch] Error 1
make[8]: Leaving directory '/u/u645/usdevadmin/work/dragon/1/packages/debian/mercury/ip6tables/x86_64_linux_libc-2.19_release/mercury-ip6tables-4604p00p1099'
FAKEROOT: r=-1, received message type=1, message=3
FAKEROOT: process stat oldstate=dev:ino=(801:1300078), mode=0100644, own=(0,0), nlink=1, rdev=0
FAKEROOT:(previously unknown)
libfakeroot, when sending message: Invalid argument
FAKEROOT, get_msg: Invalid argument
r=22, EINTR=4
fakeroot: clearing up message queues and semaphores, signal=-1
/usr/bin/fakeroot: line 1: kill: (4317) - No such process
dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit status 2

The r=22, EINTR=4 line is telling me that whatever last set errno had got EINVAL:

/usr/include/asm-generic/errno-base.h:#define EINVAL 22 /* Invalid argument */

I notice above that, even though msgrcv failed, faked called process stat with the same arguments as given in the previous message. The indentation of the source suggests that this was not the author's intent. It seems unlikely to be correct. Seeing this seeming bug still present in:

https://anonscm.debian.org/cgit/users/clint/fakeroot.git/tree/faked.c

I enclose a patch to address that.

I don't know at this juncture whether it fixes my original problem.
I doubt it but fingers crossed.

-- System Information:
Debian Release: 8.6
  APT prefers stable-updates
  APT policy: (990, 'stable-updates'), (990, 'stable'), (500, 'oldstable-updates'), (500, 'oldoldstable'), (500, 'stable'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16.0-4-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages fakeroot depends on:
ii  libc6        2.19-18+deb8u6
ii  libfakeroot  1.20.2-1

fakeroot recommends no packages.

fakeroot suggests no packages.

-- no debconf information

--- faked.c.orig 2017-02-20 16:59:47.614272988 -0800
+++ faked.c 2017-02-20 17:01:06.274709158 -0800
@@ -1079,13 +1079,14 @@
 r=msgrcv(msg_get,&buf,sizeof(struct fake_msg),0,0);
 if(debug)
 fprintf(stderr,"FAKEROOT: r=%i, received message type=%li, message=%i\n",r,buf.mtype,buf.id);
-if(r!=-1)
+if(r!=-1) {
 buf.remote = 0;
 process_msg(&buf);
+}
 }while ((r!=-1)||(errno==EINTR));
 if(debug){
 perror("FAKEROOT, get_msg");
-fprintf(stderr,"r=%i, EINTR=%i\n",errno,EINTR);
+fprintf(stderr,"errno=%i, EINTR=%i\n",errno,EINTR);
 }
 }
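To make the control flow in the report easier to see in isolation, here is a minimal, self-contained sketch of a receive loop that handles a message only when `msgrcv` succeeded and retries only on `EINTR`. It is not faked.c: `struct demo_msg`, `receive_loop`, `note_id` and `last_id` are hypothetical names for illustration, and the `msgflg` parameter is added so the loop can be exercised with `IPC_NOWAIT` (the code in the report passes 0 and blocks).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout; the real struct fake_msg in fakeroot differs. */
struct demo_msg {
    long mtype;   /* System V messages must start with a long type field */
    int  id;
};

static int last_id = -1;

/* Example handler: just records the id of the last message handled. */
static void note_id(const struct demo_msg *m)
{
    last_id = m->id;
}

/* Receive until msgrcv fails for a reason other than EINTR.
 * The handler runs only for messages that were actually received, so a
 * failed or interrupted msgrcv never causes the stale contents of buf to
 * be processed again.
 * Returns the number of messages handled; errno holds the final error. */
static int receive_loop(int qid, int msgflg,
                        void (*handler)(const struct demo_msg *))
{
    struct demo_msg buf;
    int handled = 0;
    ssize_t r;

    do {
        r = msgrcv(qid, &buf, sizeof(buf) - sizeof(long), 0, msgflg);
        if (r != -1) {      /* both statements are inside the braces */
            handler(&buf);
            handled++;
        }
    } while (r != -1 || errno == EINTR);

    return handled;
}
```

The braces around the success branch are the whole point: without them, only the first statement is guarded by the `r != -1` test and the handler runs on every iteration, including after a failed `msgrcv`.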