Bug#781742: upgrade-reports: armel wheezy-jessie on QNAP: flash-kernel error in dist-upgrade and various glitches but mostly successful
(CC initramfs-tools@packages; context is flash-kernel invocations not being deferred via triggers during an upgrade and ultimately running several times in a dist-upgrade.)

On Sat, 2015-04-04 at 10:49 +0100, Ian Campbell wrote:
> At first glance it seems like invocations via the initramfs-tools hooks
> are not being deferred.

This is because initramfs-tools.postinst contains:

    # Regenerate initramfs whenever we go to dpkg state `installed'
    if [ x$1 != xtriggered ]; then
        # this activates the trigger, if triggers are working
        update-initramfs -u
    else
        # force it to actually happen
        DPKG_MAINTSCRIPT_PACKAGE='' update-initramfs -u
    fi

and flash-kernel uses [ -n $DPKG_MAINTSCRIPT_PACKAGE ] when deciding to defer to a trigger. So the invocations of flash-kernel via /etc/initramfs/post-update.d/flash-kernel end up never being deferred.

I don't think initramfs-tools is wrong to do this per se, but it does mean that anything hooked off the post-update.d hooks cannot reliably use triggers (dpkg-trigger uses $DPKG_MAINTSCRIPT_PACKAGE itself).

flash-kernel itself does something similar, but instead of manipulating DPKG_MAINTSCRIPT_PACKAGE it sets FLASH_KERNEL_NOTRIGGER=1 and keys off that.

It seems like the best solution would be a patch to switch initramfs-tools to a similar scheme. Would such a patch be accepted?

If not, then I will arrange for /etc/initramfs/post-update.d/flash-kernel to signal to f-k somehow that triggers should be used despite the lack of DPKG_MAINTSCRIPT_PACKAGE.

Ian.
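For illustration only, the "similar scheme" proposed above might look roughly like the sketch below. The variable name UPDATE_INITRAMFS_NOTRIGGER and the helper should_defer_to_trigger are hypothetical, modelled on flash-kernel's FLASH_KERNEL_NOTRIGGER; neither is taken from initramfs-tools or flash-kernel.

```shell
#!/bin/sh
# Sketch only: UPDATE_INITRAMFS_NOTRIGGER is a hypothetical variable name,
# modelled on flash-kernel's FLASH_KERNEL_NOTRIGGER.

# Return 0 (defer to a trigger) when running under dpkg and nobody has asked
# for the work to happen immediately; return 1 otherwise.
should_defer_to_trigger() {
    [ -n "${DPKG_MAINTSCRIPT_PACKAGE:-}" ] && [ -z "${UPDATE_INITRAMFS_NOTRIGGER:-}" ]
}

# In a postinst, the "triggered" branch would then set the opt-out flag
# instead of clearing DPKG_MAINTSCRIPT_PACKAGE, for example:
#   UPDATE_INITRAMFS_NOTRIGGER=1 update-initramfs -u
```

The point of such a design is that DPKG_MAINTSCRIPT_PACKAGE stays intact for anything run from the post-update.d hooks, so those hooks can still decide for themselves whether to defer.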
Bug#781742: upgrade-reports: armel wheezy-jessie on QNAP: flash-kernel error in dist-upgrade and various glitches but mostly successful
On Sat, 2015-04-04 at 14:13 +0200, debbug.jessie.upgradereport.nospam@sub.noloop.net wrote:
> On Sat, Apr 04, 2015 at 10:52:35 +0100, Ian Campbell wrote:
> > On Thu, 2015-04-02 at 13:04 +0200, reportbug wrote:
> > > ** error: apt-get dist-upgrade broke during a flash-kernel
> > >
> > > This happened on the two QNAP TS-419P+ devices but not on the QNAP TS-219P II Turbo device. The apt-get dist-upgrade stage aborted in a flash-kernel trigger that failed, because it seemed to try to flash the jessie 3.16 kernel before it was properly unpacked. Unfortunately I don't have the error message
> >
> > Did you upgrade with apt-get update ; apt-get upgrade ; apt-get dist-upgrade as recommended by the installation guide or did you follow a different path? I tried an update+upgrade+dist-upgrade from a freshly installed Wheezy system and I didn't see this.
>
> Yep, update, then upgrade (no problems there), then dist-upgrade which failed, then apt-get -f install to keep going, then another dist-upgrade to wrap it up.
>
> I found I had logged the output of dpkg -l on one of the failing machines just before the update started, attaching that to this mail.
>
> I also just now discovered a /var/log/apt/term.log that contains the entire upgrade process recorded! It does seem to contain the entire terminal output, including the interactive diffs and manual root shell sessions (Z) shown during file conflicts (which contain local and perhaps confidential information) so I don't feel like attaching the whole log,

Understood.

> but here is everything from Log started to Log ended for the particular dist-upgrade run that failed, plus everything from Log started to Log ended for apt-get -f install just after that. Not shown are the first upgrade and the final dist-upgrade.

Thanks. I had a go at reproducing this back when you first reported it, without much luck. Perhaps the dpkg -l will give some clue as to what the difference is.

> The last lines of the attached log for dist-upgrade are:
>
> Preparing to unpack .../module-init-tools_18-3_all.deb ...
> Unpacking module-init-tools (18-3) over (9-3) ...
> Selecting previously unselected package linux-image-3.16.0-4-kirkwood.
> Preparing to unpack .../linux-image-3.16.0-4-kirkwood_3.16.7-ckt7-1_armel.deb ...
> Unpacking linux-image-3.16.0-4-kirkwood (3.16.7-ckt7-1) ...
> Processing triggers for initramfs-tools (0.119) ...
> update-initramfs: Generating /boot/initrd.img-3.2.0-4-kirkwood
> Can't find /boot/vmlinuz-3.16.0-4-kirkwood or /boot/initrd.img-3.16.0-4-kirkwood

Checking and logging those two separately might make sense so we can see which one failed if this happens again. And perhaps logging /boot/*$kver* would be a good idea too. In the absence of being able to repro, that might be the best we can manage.

> run-parts: /etc/initramfs/post-update.d//flash-kernel exited with return code 1
> dpkg: error processing package initramfs-tools (--unpack):
>  subprocess installed post-installation script returned error exit status 1
> Processing triggers for install-info (5.2.0.dfsg.1-6) ...
> Errors were encountered while processing:
>  initramfs-tools
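As a rough sketch of the extra logging suggested above, something along these lines would report each missing file separately and then list whatever in the boot directory matches the kernel version. The function name check_boot_files and its optional second argument are hypothetical, for illustration only; this is not flash-kernel's actual code.

```shell
#!/bin/sh
# Sketch only: check_boot_files is a hypothetical helper, not flash-kernel code.
# Usage: check_boot_files KVER [BOOTDIR]
check_boot_files() {
    kver=$1
    boot=${2:-/boot}
    rc=0
    # Test the kernel image and the initrd separately so the log shows
    # exactly which of the two is missing.
    for f in "$boot/vmlinuz-$kver" "$boot/initrd.img-$kver"; do
        if [ ! -e "$f" ]; then
            echo "Can't find $f" >&2
            rc=1
        fi
    done
    # On failure, also log everything that matches the kernel version.
    if [ "$rc" -ne 0 ]; then
        echo "Files in $boot matching *$kver*:" >&2
        ls -l "$boot"/*"$kver"* >&2 2>/dev/null || echo "  (none)" >&2
    fi
    return "$rc"
}
```

Called as check_boot_files 3.16.0-4-kirkwood, this would make it visible whether the kernel image, the initrd, or both were absent at the time the hook ran.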
Bug#781742: upgrade-reports: armel wheezy-jessie on QNAP: flash-kernel error in dist-upgrade and various glitches but mostly successful
Hello,

With regard to:

> ** odd entry in dmesg: alg: hash: Test 3 failed for mv-sha1
>
> dmesg reveals some slightly concerning messages:
>
> [ 35.120866] alg: hash: Test 3 failed for mv-sha1
> [ 35.120895] : 10 bf d7 00 71 0b bb 83 3a 26 d0 97 13 05 99 f5
> [ 35.120910] 0010: 3a 92 53 3c
> [ 35.216233] alg: hash: Test 1 failed for mv-hmac-sha1
> [ 35.216262] : 0c aa 9f d5 37 c3 79 3a 91 d9 21 5f 42 2b 2c 24
> [ 35.216277] 0010: b7 c3 16 0c
>
> This happens on all three machines. Not sure if this is a problem?

This is indeed a problem. I see the same thing on a TS-212P with an MV88F6282 CPU running the 3.16 kernel.

What it means is that the driver for the hardware crypto engine in the Feroceon failed a self-test (i.e. the cryptographic output is incorrect) and will be disabled. mv_sha1 remains present in /proc/crypto, but any attempt to use it will immediately fall back to sha1_arm (asm software mode).

For me this is a severe loss of functionality, as the crypto engine is a selling point of those CPUs.

This is likely an upstream bug in mv_cesa.c or the surrounding code. I strongly suspect that more devices with Kirkwood/Feroceon/Armada 300 CPUs are affected.

Should I file a separate bug to track this?

Best regards,
Jan
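For anyone wanting to check their own machine, a small sketch along these lines prints the selftest field that /proc/crypto records for a given driver. The helper name crypto_selftest is hypothetical, and the driver name to pass (mv-sha1 as in the dmesg output, or another spelling) is an assumption that should be checked against the actual "driver" lines in /proc/crypto on the device.

```shell
#!/bin/sh
# Sketch only: crypto_selftest is a hypothetical helper.
# Usage: crypto_selftest DRIVER [FILE]
# Prints the "selftest" value of every /proc/crypto entry whose "driver"
# field equals DRIVER; returns 1 if no such entry exists.
crypto_selftest() {
    driver=$1
    file=${2:-/proc/crypto}
    awk -v drv="$driver" '
        $1 == "driver"                 { cur = $3 }
        $1 == "selftest" && cur == drv { print $3; found = 1 }
        END                            { exit found ? 0 : 1 }
    ' "$file"
}
```

Running crypto_selftest mv-sha1 and comparing the result with that of a software implementation listed in the same file gives a quick indication of whether the hardware driver was accepted by the kernel's self-tests.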
Bug#781742: upgrade-reports: armel wheezy-jessie on QNAP: flash-kernel error in dist-upgrade and various glitches but mostly successful
On Thu, 2015-04-02 at 13:04 +0200, reportbug wrote:
> ** error: apt-get dist-upgrade broke during a flash-kernel
>
> This happened on the two QNAP TS-419P+ devices but not on the QNAP TS-219P II Turbo device. The apt-get dist-upgrade stage aborted in a flash-kernel trigger that failed, because it seemed to try to flash the jessie 3.16 kernel before it was properly unpacked. Unfortunately I don't have the error message

Did you upgrade with apt-get update ; apt-get upgrade ; apt-get dist-upgrade as recommended by the installation guide or did you follow a different path? I tried an update+upgrade+dist-upgrade from a freshly installed Wheezy system and I didn't see this.

[...]

> Full disclosure: on the failing machines I had an additional sources.list.d entry for www.deb-multimedia.org jessie, which was not present on the machine that didn't exhibit this error.

I don't think dmo ships anything directly related to kernels or flash-kernel but I suppose the extra packages may have made the apt resolver do something arbitrarily different.

If you are able to reproduce I would of course be very interested in the logs.

Ian.
Bug#781742: upgrade-reports: armel wheezy-jessie on QNAP: flash-kernel error in dist-upgrade and various glitches but mostly successful
Package: upgrade-reports
Severity: important

Hello,

I performed a wheezy-jessie upgrade on 3 different armel devices yesterday. Here are a few notes about some speed bumps that cropped up.

The devices in question are:

* Two QNAP TS-419P+ Turbo devices
* One QNAP TS-219P II Turbo device

I performed the upgrade using these steps:

* Purge removed packages with: apt-get purge $(dpkg -l | awk '/^rc/ { print $2 }')
* Change wheezy to jessie in sources.list
* apt-get update
* apt-get upgrade
* apt-get dist-upgrade
* reboot

These are the issues I encountered:

** error: apt-get dist-upgrade broke during a flash-kernel

This happened on the two QNAP TS-419P+ devices but not on the QNAP TS-219P II Turbo device.

The apt-get dist-upgrade stage aborted in a flash-kernel trigger that failed, because it seemed to try to flash the jessie 3.16 kernel before it was properly unpacked. Unfortunately I don't have the error message - I expected to capture it on the third device but of course the error didn't happen there. It was something about how it couldn't find the vmlinuz 3.16 file; it seemed confused about whether it should flash the wheezy 3.2 or the jessie 3.16 kernel.

During both the upgrade and the dist-upgrade stage, there were probably between 5 and 10 runs of the flash-kernel trigger, which takes quite a long time.

I recovered by issuing an apt-get -f install, which proceeded to unpack several more packages, and the later flash-kernel triggers successfully flashed a 3.16 kernel. Finally I re-ran apt-get dist-upgrade to wrap it up.

Full disclosure: on the failing machines I had an additional sources.list.d entry for www.deb-multimedia.org jessie, which was not present on the machine that didn't exhibit this error.

** frequent flash-kernel triggers

As mentioned above, on all the machines the flash-kernel trigger ran frequently during the upgrade and dist-upgrade operations.
Since this takes several minutes, it would be ideal if this only happened once during an upgrade or dist-upgrade run.

** odd entry in dmesg: alg: hash: Test 3 failed for mv-sha1

dmesg reveals some slightly concerning messages:

    [ 35.120866] alg: hash: Test 3 failed for mv-sha1
    [ 35.120895] : 10 bf d7 00 71 0b bb 83 3a 26 d0 97 13 05 99 f5
    [ 35.120910] 0010: 3a 92 53 3c
    [ 35.216233] alg: hash: Test 1 failed for mv-hmac-sha1
    [ 35.216262] : 0c aa 9f d5 37 c3 79 3a 91 d9 21 5f 42 2b 2c 24
    [ 35.216277] 0010: b7 c3 16 0c

This happens on all three machines. Not sure if this is a problem? Never saw this on the wheezy kernel.

** journalctl permission / no journal found

It was not immediately obvious how to view systemd journals as a non-root user, even as a member of the root, adm and staff groups. Apparently the correct solution is to add the user to the group systemd-journal. The error message given, "no journals found", is also not very helpful in diagnosing the problem. Perhaps something could be written about this in the release notes.

** shutdown -rf now doesn't work anymore

Apparently systemd has removed the skip-fsck option to shutdown. The error message given is not so pretty:

    Code should not be reached 'Unhandled option' at ../src/systemctl/systemctl.c:6316, function shutdown_parse_argv(). Aborting.

I guess the workaround is to just use shutdown -r and deal with potential fsck delays. tune2fs is too permanent a solution for avoiding fscks; I miss a way to skip fsck in a one-shot fashion.

Also, it seems running shutdown -r no longer kicks out active ssh sessions; instead other clients won't see the system going down until they try to type in their session and get a broken pipe error back.
** bind9 ignores /etc/default/bind9 and starts on ipv6

It looks like systemd ignores /etc/default/bind9, which contains OPTIONS=-u bind -4, so bind9 starts up with both IPv4 and IPv6 enabled. This causes a LOT of "named: error (network unreachable) resolving" log entries and possibly delays. I worked around this by hacking a -4 option into the ExecStart line of /lib/systemd/system/bind9.service

** bitlbee /version identifies as Linux/armv7l

When issuing a /version command in irssi against a local bitlbee installation, the response given back is "BitlBee-3.2.2-2. localhost Linux/armv7l", which makes me wonder if the bitlbee binary is built for armv7 (debian armel should be at armv5; uname -a gives armv5tel). It seems to work, so maybe not a problem.

** arpwatch does not start on boot

The arpwatch daemon no longer starts properly on boot. It logs the following lines:

    arpwatch[1052]: Running as uid=109 gid=105
    arpwatch[1052]: Link layer type 113 not ethernet or fddi

and exits. Manually restarting it after the system has booted completely seems to work.

** apcupsd's /sbin/apcaccess is broken

Running apcaccess just prints a "Usage:" help text, instead of dumping the UPS statistics. Looks like it was reported in November 2014 with a patch but there is no followup in