Bug#847204: nfs-kernel-server: `systemctl status` incorrectly reports server "active" even if not started
On 12/18/2016 11:41 PM, Michael Biebl wrote:
> On Tue, 6 Dec 2016 15:15:12 +0100 Riccardo Murri wrote:
>>
>> Package: nfs-kernel-server
>> Version: 1:1.2.8-9
>>
>> The systemd service unit file for `nfs-kernel-server` incorrectly
>> reports the service status as "active" even when the NFS server is
>> *not* running (e.g., empty exports file).
>
> ..
>
>> debian@debian-nfs-bug:~$ sudo systemctl status nfs-kernel-server.service
>> * nfs-kernel-server.service - LSB: Kernel NFS server support
>>    Loaded: loaded (/etc/init.d/nfs-kernel-server)
>>    Active: active (exited) since Tue 2016-12-06 12:12:58 UTC; 1min 36s ago
>>
>> Dec 06 12:12:58 debian-nfs-bug nfs-kernel-server[1544]: Not starting NFS
>> kernel daemon: no exports. ... (warning).
>> Dec 06 12:12:58 debian-nfs-bug systemd[1]: Started LSB: Kernel NFS
>> server support.
>> Dec 06 12:14:33 debian-nfs-bug systemd[1]: Started LSB: Kernel NFS
>> server support.
>
> The problem is that nfs-kernel-server in jessie is a sysv init script.
> For those, systemd creates a wrapper unit which uses RemainAfterExit=yes,
> as systemd cannot know whether the init script starts a long-running
> process or not.
>
> (That's why it shows "active (exited)" instead of "active (running)".)
>
> The obvious solution is to use a native .service file, where you can
> set the proper Type=. This is the case for stretch.

On Stretch this doesn't help, though, because of the way the NFS kernel server works. On Jessie (and before) the nfs-kernel-server init script would start all sorts of daemons (such as rpc.svcgssd) that are auxiliary to the NFS server, but at least for NFSv4 the NFS server itself is a kernel thread. It is started via rpc.nfsd $N, which tells the kernel to start $N kernel threads, and then exits immediately. It is stopped via rpc.nfsd 0. The problem here is that there's no userland process that is kept running, so the systemd service also has to have RemainAfterExit=yes to make this work.
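For illustration, a minimal sketch of what such a native unit could look like (modeled loosely on the upstream nfs-server.service; the exact ExecStart arguments, thread count and dependencies in the real unit differ):

```ini
[Unit]
Description=NFS server and services

[Service]
# rpc.nfsd only tells the kernel to start/stop nfsd threads and then
# exits, so systemd must treat the unit as "up" after ExecStart exits.
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/rpc.nfsd 8
ExecStop=/usr/sbin/rpc.nfsd 0

[Install]
WantedBy=multi-user.target
```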
And indeed, the upstream systemd unit (which will be used by the Debian package in Stretch) does have that setting:

https://sources.debian.net/src/nfs-utils/1:1.3.4-2/systemd/nfs-server.service/

I haven't tested the Stretch package yet, so I don't know how it reacts to an empty /etc/exports; maybe this specific bug doesn't occur there any more. However, if something else manually calls rpc.nfsd 0 during operation, the service will be stopped, but systemd won't recognize that.

That all said, for the empty /etc/exports case, for Jessie alone:

> For jessie, that change would most likely be too invasive though,

A workaround for Jessie could be to create a file /etc/systemd/system/nfs-kernel-server.service.d/stop-if-empty-exports.conf with the following contents:

[Service]
ExecStartPost=/usr/local/sbin/stop_nfs_systemd_service_if_no_exports.sh

And the contents of that script being something like:

#!/bin/sh
if ... same check as in the init script that /etc/exports is empty ... ; then
    # Enqueue a job for systemd to stop the service
    systemctl stop --no-block nfs-kernel-server.service
fi

(For the Debian package one would obviously use different directories for these files.)

This won't fix the problem entirely, but at least after boot the NFS server will be considered stopped.

Regards,
Christian
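PS: a hedged sketch of what that script's check could look like (the actual test in the nfs-kernel-server init script may differ; has_exports and the grep pattern are illustrative):

```shell
# Illustrative check: does the given exports file contain any
# non-comment, non-blank line?
has_exports() {
    grep -q '^[[:space:]]*[^#[:space:]]' "$1" 2>/dev/null
}

# The ExecStartPost= script would then do something like:
#   has_exports /etc/exports || systemctl stop --no-block nfs-kernel-server.service
```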
Re: cross building linux-image packages
On 11/01/2016 12:22 AM, Andrew Worsley wrote:
> I tried using pbuilder (based off
> https://jodal.no/2015/03/08/building-arm-debs-with-pbuilder/) but
> qemu-arm-static has bugs (segmentation faults - possibly
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087 )

You can use (in your pbuilderrc)

PBUILDERSATISFYDEPENDSCMD="/usr/lib/pbuilder/pbuilder-satisfydepends-classic"

to make pbuilder use a different (much slower, and in edge cases possibly problematic) dependency-resolution algorithm. In that case aptitude will never be invoked, and builds in qemu-user chroots will mostly work - with possibly some exceptions. I've never tried a kernel build in such a setup, though.

Note that qemu-user chroots are _really_, _really_ slow. Builds can take anywhere from 3 to 20 times as long as on native hardware. (For other packages, in my experience building armhf in qemu-user chroots on amd64 is ~12 times slower. YMMV.)

Note that if you are using qemu chroots, you're not actually cross-compiling: you are emulating a hardware architecture and using a native compiler (on an emulated instruction set) to build the software. This has the advantage that you're much closer to what the build is going to be on real hardware, but performance is going to be awful and bugs in qemu-user are going to bite you.

Alternatively, you can actually cross-compile by employing a real cross-compiler that runs natively on your hardware (hence faster) but generates code for the target architecture you're interested in. The build system of a package must support cross-compiling explicitly for this to work and be useful; I've never tried the kernel packages, so I don't know whether that will work. You can find a pointer on how to cross-compile Debian packages at:

https://wiki.debian.org/CrossCompiling

> But I see that buildd (
> https://buildd.debian.org/status/logs.php?pkg=linux&arch=armel )
> apparently works somehow.
Well, the official Debian buildds for armhf/armel actually run on ARM hardware, so that's why they "work". ;-)

Regards,
Christian
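PS: for completeness, the cross-compiling route sketched above usually looks something like the following on an amd64 host. This is an illustrative transcript, not something I've run for src:linux: the package name is a placeholder, and crossbuild-essential-armhf may need to come from testing/sid on a 2016-era system.

```shell
$ sudo dpkg --add-architecture armhf
$ sudo apt-get update
$ sudo apt-get install crossbuild-essential-armhf
$ apt-get source somepackage        # "somepackage" is a placeholder
$ cd somepackage-*/
$ dpkg-buildpackage -aarmhf -us -uc
```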
Bug#814648: linux kernel backports broken
Hello,

On 05/31/2016 10:15 PM, Antoine Beaupré wrote:
> On 2016-05-31 16:01:46, Hector Oron wrote:
>> Hello,
>>
>> On 31 May 2016, 9:56 p.m., "Antoine Beaupré" wrote:
>>
>>> Hi,
>>>
>>> I still see this problem in debian jessie right now. I can't install the
>>> linux kernel backport.
>>>
>>> cp: impossible d'évaluer « /boot/initrd.img-4.5.0-0.bpo.2-amd64 »: Aucun
>>> fichier ou dossier de ce type
>>>
>>> The initrd is simply not in the .deb:
>>
>> The initramfs binary is usually generated by a hook that calls an
>> initramfs generator; in Debian, there is initramfs-tools or dracut.
>
> I see. Well, that doesn't seem to be working correctly.
>
> [1017]anarcat@angela:dist100$ sudo apt-get install
> [sudo] password for anarcat:
> Reading package lists... Done
> Building dependency tree
> Reading state information... Done
> 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
> 1 not fully installed or removed.
> After this operation, 0 B of additional disk space will be used.
> Setting up linux-image-4.5.0-0.bpo.2-amd64 (4.5.4-1~bpo8+1) ...
> vmlinuz(/boot/vmlinuz-4.5.0-0.bpo.2-amd64
> ) points to /boot/vmlinuz-4.5.0-0.bpo.2-amd64
> (/boot/vmlinuz-4.5.0-0.bpo.2-amd64) -- doing nothing at
> /var/lib/dpkg/info/linux-image-4.5.0-0.bpo.2-amd64.postinst line 256.
> cp: cannot stat '/boot/initrd.img-4.5.0-0.bpo.2-amd64': No such file or
> directory
> Failed to copy /boot/initrd.img-4.5.0-0.bpo.2-amd64 to /initrd.img .
> dpkg: error processing package linux-image-4.5.0-0.bpo.2-amd64 (--configure):
> subprocess installed post-installation script returned error exit status 1
> Errors were encountered while processing:
> linux-image-4.5.0-0.bpo.2-amd64
> needrestart is being skipped since dpkg has failed
> E: Sub-process /usr/bin/dpkg returned an error code (1)
>
> iirc, there were problems with incompatible initramfs-tools in
> backports, and that was solved (in wheezy) by backporting
> initramfs-tools.
>
> is that what is required here?
So I just tried this on my system (I actually just did an apt-get upgrade, because I also run a backports kernel on my desktop, also amd64), and it worked just fine here.

Is initramfs-tools installed? (dpkg -l initramfs-tools) If so, could you do the following:

update-initramfs -k all -u

Does that work, or give you an error? How much space do you have left on your /boot partition?

As for your other problem:

> dpkg-deb: error: parsing file
> '/tmp/satisfydepends-aptitude/pbuilder-satisfydepends-dummy/DEBIAN/control'
> near line 8 package 'pbuilder-satisfydepends-dummy':
> `Depends' field, syntax error after reference to package `cpio'

This is not an issue with the package build: pbuilder (and by extension cowbuilder) only support the build-profile syntax as of 0.215+nmu4, whereas Jessie only has 0.215+nmu3. So if you either use pbuilder from testing/sid, or manually install the required build dependencies on your host system, you can indeed build the package on a pure jessie + jessie-backports system. (Probably; it takes a long time, and I haven't actually tried.)

Regards,
Christian
Bug#815787: May be a kernel problem not a pulseaudio one?
On 02/27/2016 10:52 PM, Cristian Ionescu-Idbohrn wrote:
>> * [amd64] mm,nvdimm: Disable ZONE_DMA; enable ZONE_DEVICE, NVDIMM_PFN
>>   - This disables drivers for some AC'97 sound cards
>
> Alright, but not obvious to me.
>
> So, what is to do about that?

There are people working upstream on solving this issue:

http://thread.gmane.org/gmane.linux.kernel.mm/145039/

So you could stick with the Jessie 3.16 kernel (don't use 4.3, because that won't receive security updates) until this is fixed in a future kernel version.

Regards,
Christian
Re: NFS / rpcbind and systemd on Jessie
Hi,

On 02/16/2016 06:57 PM, Frédéric SOSSON wrote:
> I have an issue with NFS / rpcbind and systemd on Jessie (up2date).
>
> According to systemctl, nfs-common.service and nfs-kernel-server.service
> both start properly.
>
> The issue comes down to rpcbind. Here is the message from journalctl:
>
> rpcbind.service/start deleted to break ordering cycle starting with
> basic.target/start
>
> Is there something broken with those packages?

This looks like #775542, see my comment and the following discussion at:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775542#109

Regards,
Christian
Bug#805252: linux-image-4.2.0-1-amd64: I/O errors when writing to iSCSI volumes
Control: tags -1 + fixed-upstream patch jessie
Control: found -1 3.16.7-ckt20-1+deb8u3
Control: fixed -1 4.1.1-1~exp1

Hi,

so I've found out that this is actually a bug in LIO (the target), not the initiator.

The problem is: LIO in Linux up to 4.0 uses vfs_writev to write out blocks when using the fileio backend, so it is subject to a limit of 1024 iovecs (as per UIO_MAXIOV in include/uapi/linux/uio.h), each at most a page in size. That gives a maximum of 4 MiB of data that can be processed per request (at 4 kiB page size).

Older versions of the Linux software initiator had a hard limit of 512 kiB per request, which means that at most 128 iovec entries were used, which fits perfectly. Newer versions of the Linux iSCSI initiator don't have this hard-coded limit but rather use the value supplied by the target. (This is correct behavior by the initiator, so there's no bug there, contrary to what I initially assumed.)

The problem now is that LIO with the fileio backend assumes that 8 MiB may be transferred at the same time, because (according to comments in drivers/target/target_core_file.h) it assumes, for some reason unbeknownst to me, that the maximum number of iovecs is 2048. Note that this also means that any non-Linux initiator that properly follows the target-supplied values for the maximum I/O size will run into the same problem and cause I/O errors, even if it behaves properly.

This problem doesn't affect upstream anymore, because they have rewritten LIO to use vfs_iter_write, which doesn't have such limitations, but that was only introduced after 3.16. Backporting this would be too much, and probably ABI-incompatible. Fortunately, there's a much easier way to fix this: just lower the limit in drivers/target/target_core_file.h to 4 MiB. I've tested that; the limit will be properly set by LIO and newer initiators won't choke on it, so that fixes the bug. See the attached patch.
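As a sanity check on the arithmetic above (UIO_MAXIOV and the page size are the values quoted from the kernel headers):

```shell
UIO_MAXIOV=1024     # include/uapi/linux/uio.h
PAGE_SIZE=4096      # 4 kiB pages
echo $((UIO_MAXIOV * PAGE_SIZE))   # 4194304 = 4 MiB: the real vfs_writev limit
echo $((2048       * PAGE_SIZE))   # 8388608 = 8 MiB: LIO's incorrect assumption
```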
CAVEAT: there is a slight problem with this change, and I don't know what the best solution here is: the optimal_sectors setting for fileio disks on people's existing setups is likely to be 16384, because that corresponds to 8 MiB (the previous maximum value, which is the default for optimal_sectors if no other value is set) - but the patched kernel will refuse that setting, because it's now larger than the maximum allowed value.

If you use targetcli 3 (not part of Jessie, but you can install e.g. the version from sid) then that will fail to set up the target properly, because it will abort as soon as it notices that it can't make the setting. (Leftover targetcli 2 from Wheezy upgrades should not be affected as badly as far as I can tell, because the startup scripts seem to ignore errors. But I haven't tested that.)

So that leaves the situation that without this fix, 3.16 kernels produce I/O errors when used with initiators that respect the kernel's setting, but with the fix the target configuration needs to be updated. (Of course, one could also patch the kernel to ignore the specific value of 16384 for optimal_sectors if fileio is used as a backend and print a warning.) I don't know what you'd prefer here.

Also note that this likely also affects the kernel in Wheezy, although I haven't done any tests in that direction.

Regards,
Christian

PS, for reference, the upstream discussion on the initiator mailing list that resulted in me finding out that it's not the initiator but the target that was the problem:
https://groups.google.com/forum/#!topic/open-iscsi/UE2JJfDmQ7w

From: Christian Seiler <christ...@iwakd.de>
Date: Sat, 30 Jan 2016 13:48:54 CET
Subject: LIO: assume a maximum of 1024 iovecs

Previously the code assumed that vfs_[read|write]v supported 2048
iovecs, which is incorrect, as UIO_MAXIOV is 1024 instead.

This caused the target to advertise a maximum I/O size that was too
large, which in turn caused conforming initiators (most notably recent
Linux kernels, which started to respect the maximum I/O size of the
target and not have a hard-coded 512 kiB limit as previous kernel
versions did) to send write requests that were too large, resulting in
LIO rejecting them (kernel: fd_do_rw() write returned -22), resulting
in data loss.

This patch adjusts the limit to 1024 iovecs, and also uses the
PAGE_SIZE macro instead of just assuming 4 kiB pages.

Signed-off-by: Christian Seiler <christ...@iwakd.de>
---
 drivers/target/target_core_file.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/target/target_core_file.h
+++ b/drivers/target/target_core_file.h
@@ -1,6 +1,8 @@
 #ifndef TARGET_CORE_FILE_H
 #define TARGET_CORE_FILE_H
 
+#include <linux/mm.h>
+
 #define FD_VERSION "4.0"
 
 #define FD_MAX_DEV_NAME 256
@@ -9,9 +11,9 @@
 #define FD_MAX_DEVICE_QUEUE_DEPTH 128
 #define FD_BLOCKSIZE 512
 /*
- * Limited by the number of iovecs (2048) per vfs_[writev,readv] call
+ * Limited by the number of iovecs (1024) per vfs_[writev,readv] call
  */
-#define FD_MAX_BYTES 8388608
+#define FD_MAX_BYTES (1024 * PAGE_SIZE)
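PS: to illustrate the numbers in the optimal_sectors caveat above - the attribute is counted in 512-byte sectors, so existing configurations with 16384 correspond to the old 8 MiB maximum, while the patched kernel accepts at most 8192 (4 MiB). The configfs path shown in the comment is the LIO layout with a made-up backstore name:

```shell
SECTOR=512                   # sector size used by the optimal_sectors attribute
echo $((16384 * SECTOR))     # 8388608 -> old 8 MiB maximum, now rejected
echo $(( 8192 * SECTOR))     # 4194304 -> new 4 MiB maximum

# On a live target one could then lower a stored value by hand, e.g.
# (hypothetical backstore "disk01", requires root and an active LIO config):
#   echo 8192 > /sys/kernel/config/target/core/fileio_0/disk01/attrib/optimal_sectors
```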
Bug#805252: linux-image-4.2.0-1-amd64: I/O errors when writing to iSCSI volumes
Control: tags -1 - fixed-upstream patch
Control: notfixed -1 4.4~rc5-1~exp1

Hi,

I'm sorry, but the commit in question doesn't help. I just got around to testing this with 4.4-1~exp1; I verified that the sources do indeed contain the aforementioned patch, yet the problem persists. I also tried this with the most recent upstream kernel git tree [1], and the problem persists there as well.

Using dd if=/dev/zero of=test.dat it will reproducibly cause lots of I/O errors *before* the disk runs full. As already said in the original report, the most recent 3.16 kernel that comes with Jessie does not show this problem, even when used with exactly the same userland (up-to-date sid); there the dd command will just create a large file until the disk is full (as expected).

I'll ask open-iscsi upstream for some help with this, but wanted to make sure this is properly tracked in Debian's bugtracker.

Regards,
Christian

[1] Latest commit at time of testing: 03c21cb775a313f1ff19be59c5d02df3e3526471
Built the custom kernel via make-kpkg, using 4.4-1~exp1's config as a basis and then running make oldconfig.
Bug#805252: linux-image-4.2.0-1-amd64: I/O errors when writing to iSCSI volumes
Package: src:linux
Version: 4.2.6-1
Severity: important
Tags: upstream

Dear Maintainer,

with this kernel version, writing to iSCSI volumes will consistently produce I/O errors. It appears as though this happens if a lot of writes are performed at once (but I'm not sure of that). I've seen this in the version in unstable (i.e. the version I'm reporting this against), but also with 4.3-1~exp1, and also with vanilla upstream git from just now.

Steps to reproduce:
- have an iSCSI target ready (hardware or e.g. targetcli)
- log in to the iSCSI target in a VM or on a separate computer
- dd if=/dev/zero of=/dev/iscsidevice bs=4M
  -> will consistently produce I/O errors after a short amount of time

Reading from the same device is not a problem; I can dd the whole device to /dev/null without any errors.

For example, the latest testing netinst installer is not able to install Debian on an iSCSI rootfs because of this: when using ext4 as the root filesystem, mkfs.ext4 will consistently fail due to I/O errors; when using btrfs, the filesystem will be created successfully, but the package installation will fail with I/O errors, presumably due to a different block access strategy.

Logs I gathered with the latest upstream git kernel (when doing dd if=/dev/zero of=/mnt/foo bs=4M until disk full on a mounted iSCSI drive):

---
[ 21.473911] sd 2:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 21.473914] sd 2:0:0:0: [sda] tag#3 Sense Key : Not Ready [current]
[ 21.473916] sd 2:0:0:0: [sda] tag#3 Add. Sense: Logical unit communication failure
[ 21.473917] sd 2:0:0:0: [sda] tag#3 CDB: Write(10) 2a 00 00 38 15 d8 00 20 28 00
[ 21.473918] blk_update_request: I/O error, dev sda, sector 3675608
[ 21.473939] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 50331648 size 4214784 starting block 459707)
[ 21.473941] Buffer I/O error on device sda1, logical block 457403
[ 21.473954] Buffer I/O error on device sda1, logical block 457404
[ 21.473966] Buffer I/O error on device sda1, logical block 457405
[ 21.473978] Buffer I/O error on device sda1, logical block 457406
[ 21.473990] Buffer I/O error on device sda1, logical block 457407
[ 21.474003] Buffer I/O error on device sda1, logical block 457408
[ 21.474015] Buffer I/O error on device sda1, logical block 457409
[ 21.474027] Buffer I/O error on device sda1, logical block 457410
[ 21.474039] Buffer I/O error on device sda1, logical block 457411
[ 21.474051] Buffer I/O error on device sda1, logical block 457412
[ 21.474096] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 50331648 size 4214784 starting block 459963)
[ 21.474128] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 50331648 size 4214784 starting block 460219)
[ 21.474162] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 50331648 size 4214784 starting block 460475)
[ 21.474191] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 50331648 size 4214784 starting block 460480)
[ 21.478983] sd 2:0:0:0: [sda] tag#12 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 21.478986] sd 2:0:0:0: [sda] tag#12 Sense Key : Not Ready [current]
[ 21.478987] sd 2:0:0:0: [sda] tag#12 Add. Sense: Logical unit communication failure
[ 21.478989] sd 2:0:0:0: [sda] tag#12 CDB: Write(10) 2a 00 00 3c 1f 20 00 20 e0 00
[ 21.478990] blk_update_request: I/O error, dev sda, sector 3940128
[ 21.479010] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 16777216 size 4308992 starting block 492772)
[ 21.479049] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 16777216 size 4308992 starting block 493028)
[ 21.479079] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 16777216 size 4308992 starting block 493284)
[ 21.479108] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 16777216 size 4308992 starting block 493540)
[ 21.479136] EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -5 writing to inode 46 (offset 16777216 size 4308992 starting block 493568)
[ 21.483612] sd 2:0:0:0: [sda] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 21.483615] sd 2:0:0:0: [sda] tag#15 Sense Key : Not Ready [current]
[ 21.483617] sd 2:0:0:0: [sda] tag#15 Add. Sense: Logical unit communication failure
[ 21.483618] sd 2:0:0:0: [sda] tag#15 CDB: Write(10) 2a 00 00 33 14 00 00 2c 00 00
[ 21.483620] blk_update_request: I/O error, dev sda, sector 3347456
[ 21.490616] sd 2:0:0:0: [sda] tag#20 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 21.490619] sd 2:0:0:0:
Re: auto-mount NFS shares on boot
Thanks for taking care of this!

On 2015-07-07 13:03, Jonas Meurer wrote:
> On 2015-06-28 12:54, Michael Biebl wrote:
>>> I suggest to add this simple fix to Jessie by uploading it to
>>> stable-proposed-updates. What do you think? Also, do you think that
>>> /etc/systemd/system/remote-fs-pre.target.d/nfs.conf belongs to the
>>> systemd package or to nfs-common? I would say it belongs to
>>> nfs-common as that one provides the required tools and services to
>>> mount NFS shares on a client.
>>>
>>> For Jessie:
>>> - nfs-common is still an init script, so one cannot simply add
>>>   Before=remote-fs-pre.target to that. But there are two other
>>>   options:
>>>   - just for Jessie: update systemd to change the original unit file
>>>     remote-fs-pre.target to include After=nfs-common.service
>>>   - or alternatively, package a drop-in in /lib in the nfs-common
>>>     package, i.e.
>>>     /lib/systemd/system/nfs-common.service.d/systemd-ordering.conf:
>>>
>>>     [Unit]
>>>     Before=remote-fs-pre.target
>>>
>>>   (IMHO at least, I'll defer to the maintainers of the respective
>>>   packages as to what they think is appropriate.)
>>
>> Certainly, the preferred fix is that packages ship native service
>> files which override/replace the sysv init scripts. In case of
>> nfs-common/rpcbind, Ubuntu has done some extensive work to properly
>> integrate nfs-common/rpcbind with systemd. This hasn't landed in
>> Debian (yet) and is not something which can be backported for jessie.
>> The drop-in snippet for nfs-common to augment the dependency
>> information when being run under systemd is something which seems to
>> be suitable for jessie and could be added to the package in sid to
>> give it some testing first. Ideally, that drop-in is shipped by the
>> package itself. This would mean a stable upload for nfs-common.
>
> I prepared a patch for nfs-utils 1.2.8-9 that adds a systemd drop-in
> for nfs-common at
> /lib/systemd/system/nfs-common.service.d/remote-fs-pre.conf. It places
> the nfs-common service before the remote-fs-pre target. This results
> in the rpc.gssd service being started before NFS shares are mounted
> during the boot process. The patch is tested on my system and works
> for me.

Not having built the package with your diff myself: are you sure that it works as expected and installs the file in the right place? Just from reading the debdiff, the following seems to be wrong: the second argument for lines in .install files should be a directory (see the dh_install manpage); dh_install alone doesn't support renaming files. (There is dh-exec for that if you need that functionality, but that requires an additional build-dep.)

OTOH, you don't really need to rename the file - the name you have is already fine - so why not just put the following line in nfs-common.install:

debian/systemd-remote-fs-pre.conf lib/systemd/system/nfs-common.service.d/

> I suggest to upload nfs packages with this patch to Jessie (through
> stable-proposed-updates) in order to fix auto-mounting
> Kerberos-secured NFS mounts at boot in Jessie.

Note that the release team wants bugs fixed in unstable first, before they accept uploads to s-p-u.

Christian

-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/f7d7fa566191109464191a7134327...@iwakd.de
Bug#775542: auto-mount NFS shares on boot
(Ccing the bugtracker because it appears you've stumbled upon a bug that a few other people also had, see below. Please don't reply to the bugtracker yourself unless you feel it's relevant for the bug report.)

Link to the thread on debian-user for people reading the bug report:
https://lists.debian.org/debian-user/2015/06/msg01508.html

On 06/27/2015 03:39 PM, Jonas Meurer wrote:
>>> Sure. My /etc/network/interfaces is pretty simple:
>>>
>>> # This file describes the network interfaces available on your system
>>> # and how to activate them. For more information, see interfaces(5).
>>>
>>> # The loopback network interface
>>> auto lo
>>> iface lo inet loopback
>>>
>>> # The primary network interface
>>> #allow-hotplug eth0
>>> auto eth0
>>> iface eth0 inet static
>>>     address 172.18.10.34
>>>     netmask 255.255.255.0
>>>     network 172.18.10.0
>>>     broadcast 172.18.10.255
>>>     gateway 172.18.10.1
>>>     # dns-* options are implemented by the resolvconf package, if installed
>>>     dns-nameservers XXX.XXX.XXX.XXX XXX.XXX.XXX.XXX
>>
>> That looks fine, there shouldn't be any problems with that.
>>
>> Could you run the following?
>>
>>     systemd-analyze plot > bootup.svg
>>
>> This will produce an SVG file that tells you when and in which order
>> systemd started different units at boot. Could you attach that file?
>> It may be able to tell us what happened.
>
> Didn't know about that feature until now. Happy to learn new things
> about systemd every day :)
>
> So ... here we go. The SVG is attached. If I understand the output,
> then my NFS share etc-exim4-virtual.mount is started just before
> rpcbind.service and network-online.target, right?
Ok, so the following is going on:

- local-fs.target is reached, and this leads to networking.service being started
- networking.service sets up the network configuration (takes 172ms)
- after networking.service is done, network.target is reached
- after network.target is reached, network-online.target is reached (since you don't have any services that wait for the network connection, like NetworkManager-wait-online.service; but you also don't need that here, since networking.service with a static configuration and 'auto eth0' will make sure the network is up properly before even network.target is reached, so that's not a problem)
- but then, immediately after that, systemd tries to mount the NFS filesystem
- in parallel, first rpcbind and then nfs-common is started

This is a bug, that shouldn't happen. Rationale:

The problem here is that you are using sec=krb5i type mounts, where rpc.gssd needs to have been started (by nfs-common) BEFORE mounting can take place. Unfortunately, there's no ordering relating nfs-common to remote filesystems, so systemd will start them in parallel and the mount will fail.

I myself have never seen this because I've only used sec=sys NFSv4 mounts with Jessie, and those don't require any service to be started when trying to mount them - and while the idmapper may be required to have proper permissions, that can be started later (or not at all if you use the new nfsidmap + request-key logic instead of idmapd). But with sec=krb5i mounts, this is bad, because you need rpc.gssd to mount the filesystems. Mounting them later works because rpc.gssd has been started by that point.

What's missing here is an ordering dependency between remote-fs-pre.target and nfs-common.service. Searching through the bugtracker, this appears to be the same bug as #775542 [1]; that's why I've copied this message to that bug report.

Could you try to do the following:

1. create a directory /etc/systemd/system/remote-fs-pre.target.d
2. create a file /etc/systemd/system/remote-fs-pre.target.d/nfs.conf
   with the following contents:

   [Unit]
   After=nfs-common.service

And then reboot your system? I would bet it should work then.

> So maybe you were correct earlier in this thread, that rpcbind is
> started too late? Is there an easy solution to fix it? Maybe by adding
> rpcbind to the LSB header 'Required-Start' of the nfs-common init
> script? I just tried that, unfortunately it didn't help at all.

You were on the right track here, but it's not rpcbind itself that's the issue. (See above.)

For the bugtracker: there has to be a new ordering dependency between remote-fs-pre.target and nfs-common.service. Ideally, since this is NFS-specific, this should probably go in the nfs-common package, but since at least in Jessie nfs-common is an init script and not a systemd service file, it might be better to explicitly add the After= dependency in the systemd package - whereas for Stretch there will probably be a native systemd unit file, so that's where the Before=remote-fs-pre.target dependency could be added. (IMHO.)

Christian

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775542
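PS: for convenience, the two steps above as a shell sketch. The STAGING variable is only there so the snippet can be dry-run in a scratch directory; on the real system you would write directly under /etc (as root) and then reboot:

```shell
# Dry-run in a scratch directory; drop the STAGING prefix on a real system.
STAGING="$(mktemp -d)"
DROPIN_DIR="$STAGING/etc/systemd/system/remote-fs-pre.target.d"
mkdir -p "$DROPIN_DIR"
printf '[Unit]\nAfter=nfs-common.service\n' > "$DROPIN_DIR/nfs.conf"
cat "$DROPIN_DIR/nfs.conf"
```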
Bug#775542: auto-mount NFS shares on boot
On 06/27/2015 08:02 PM, Jonas Meurer wrote:
> On 27.06.2015 16:07, Christian Seiler wrote:
>> Could you try to do the following:
>>
>> 1. create a directory /etc/systemd/system/remote-fs-pre.target.d
>> 2. create a file /etc/systemd/system/remote-fs-pre.target.d/nfs.conf
>>    with the following contents:
>>
>>    [Unit]
>>    After=nfs-common.service
>>
>> And then reboot your system? I would bet it should work then.
>
> Perfect, that solution works like a charm. nfs-common is started
> before remote-fs, thus rpc.gssd runs already when the NFS share is
> mounted.

Great, glad I could solve your problem. :)

> I suggest to add this simple fix to Jessie by uploading it to
> stable-proposed-updates. What do you think? Also, do you think that
> /etc/systemd/system/remote-fs-pre.target.d/nfs.conf belongs to the
> systemd package or to nfs-common? I would say it belongs to nfs-common
> as that one provides the required tools and services to mount NFS
> shares on a client.

So the fix I gave you is a fix that shouldn't be copied verbatim into a Debian package. /etc/systemd is administrator territory, /lib/systemd is package territory, so any proper fix in a package should go to /lib. Furthermore, my suggestion for your problem was to add a drop-in that augments the Debian-provided unit files (see man systemd.unit for how drop-ins work) - which is great since you only want to extend the unit, not completely replace it. But in the packages themselves it is perfectly possible to modify the unit files directly, so a drop-in is not necessary at all for any native unit.

So for Stretch:
- assuming that nfs-common will have a native unit by then, the proper fix would be to simply add Before=remote-fs-pre.target to that unit, and that would fix it

For Jessie:
- nfs-common is still an init script, so one cannot simply add Before=remote-fs-pre.target to that. But there are two other options:
  - just for Jessie: update systemd to change the original unit file remote-fs-pre.target to include After=nfs-common.service
  - or alternatively, package a drop-in in /lib in the nfs-common package, i.e. /lib/systemd/system/nfs-common.service.d/systemd-ordering.conf:

    [Unit]
    Before=remote-fs-pre.target

(IMHO at least, I'll defer to the maintainers of the respective packages as to what they think is appropriate.)

Christian
Bug#769935: linux: Please backport nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait
Source: linux
Version: 3.16.7-2
Severity: important
Tags: upstream patch

Dear Maintainer,

in certain circumstances, the kernel may busy-wait indefinitely after processing a SIGKILL to a process when using NFS. There is a patch for this issue that went into 3.17:

http://www.spinics.net/lists/linux-nfs/msg45807.html
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=92a56555bd576c61b27a5cab9f38a33a1e9a1df5

I've attached a backported version of the patch (the original patch does not apply directly due to another commit [1] in 3.17). I've tested that the patch applies (against 3.16.7-2), that the modified package compiles, and that the resulting kernel boots (with no obvious regressions). I haven't seen any busy-waits with it since, but the bug is not trivial to trigger.

Thanks!
Christian

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c1221321b7c25b53204447cff9949a6d5a7c

From 92a56555bd576c61b27a5cab9f38a33a1e9a1df5 Mon Sep 17 00:00:00 2001
From: David Jeffery <djeff...@redhat.com>
Date: Tue, 5 Aug 2014 11:19:42 -0400
Subject: nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait

If a SIGKILL is sent to a task waiting in __nfs_iocounter_wait, it
will busy-wait or soft lockup in its while loop. nfs_wait_bit_killable
won't sleep, and the loop won't exit on the error return. Stop the
busy-wait by breaking out of the loop when nfs_wait_bit_killable
returns an error.

Signed-off-by: David Jeffery <djeff...@redhat.com>
Signed-off-by: Trond Myklebust <trond.mykleb...@primarydata.com>

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 932c6cc..be7cbce 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -116,7 +116,7 @@ __nfs_iocounter_wait(struct nfs_io_counter *c)
 		if (atomic_read(&c->io_count) == 0)
 			break;
 		ret = nfs_wait_bit_killable(&c->flags);
-	} while (atomic_read(&c->io_count) != 0);
+	} while (atomic_read(&c->io_count) != 0 && !ret);
 	finish_wait(wq, &q.wait);
 	return ret;
 }
--
cgit v0.10.1
Bug#742619: linux: Please reenable CONFIG_SCSI_PROC_FS
Control: found -1 3.16.3-2

Hi,

since the freeze for Jessie is around the corner, I wanted to ask whether you have considered enabling this again, at least for Jessie?

Thank you!

Kind regards,
Christian

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/5437fccd.1050...@iwakd.de
Bug#742619: linux: Please reenable CONFIG_SCSI_PROC_FS
Hello,

I'm sorry if this seems pushy, but since it is trivial to implement this change, I'd like to ask if you've considered reenabling CONFIG_SCSI_PROC_FS?

Thank you!

Regards,
Christian

Archive: https://lists.debian.org/189678693b172f6b2e7e369f770d5...@iwakd.de
Bug#742619: linux: Please reenable CONFIG_SCSI_PROC_FS
Package: linux
Version: 3.2.54-2
Severity: wishlist

Some hardware vendor tools (e.g. HP Library and Tape Tools) still rely on the existence of /proc/scsi. In Squeeze, /proc/scsi was re-enabled in linux-2.6 (2.6.32-32), see also #618258, and it is still the default upstream:

path: root/drivers/scsi/Kconfig
blob: c8bd092fc945fd5a1a407b170c398e126beaa0e4 (plain)

config SCSI_PROC_FS
	bool "legacy /proc/scsi/ support"
	depends on SCSI && PROC_FS
	default y
	---help---
	  This option enables support for the various files in
	  /proc/scsi. In Linux 2.6 this has been superseded by
	  files in sysfs but many legacy applications rely on this.

	  If unsure say Y.

Many other distributions, for example Ubuntu and Fedora, activate it by default in all recent versions. Unfortunately, in Wheezy it is disabled again:

/boot/config-3.2.0-4-amd64:# CONFIG_SCSI_PROC_FS is not set

Re-enabling /proc/scsi is an unobtrusive change that would help a lot if one depends on one of these vendor applications. Please consider this small change for both Wheezy and Jessie.

Thank you.

Regards,
Christian

Archive: https://lists.debian.org/bbef507e88b4c14e4c7f91be254d8...@iwakd.de
Bug#696321: Launchpad bug for the same issue
I stumbled upon this bug while using a backported kernel on Squeeze, and while trying to find a solution I noticed that there is also a bug report against Ubuntu kernels in Launchpad. The patch attached to a comment there, which reverts a certain commit, seems to solve the problem for me. But since I don't know anything about the kernel's audio subsystem, I'm not sure it actually addresses the root cause.

The Launchpad bug is: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1097396

Christian

Archive: http://lists.debian.org/67a4ff88f266f80887daf42bc058f...@iwakd.de
Bug#699283: linux: Build Sandy Bridge EDAC module in kernels that support it
Package: linux
Version: 3.2.35-2
Severity: wishlist

In the current Wheezy kernel, all EDAC modules are built except the one for Intel's Sandy Bridge architecture. It would be nice if one could read out EDAC information on Sandy Bridge systems as well, thanks!

The missing kernel configuration is:

CONFIG_EDAC_SBRIDGE=m

Archive: http://lists.debian.org/7bd3cfa98d85f70f95584de427a84...@iwakd.de