[Kernel-packages] [Bug 2000257] Re: kdump fails on focal + linux-image-generic-hwe-20.04 kernel

2022-12-21 Thread Ryan Harper
Hi Dann, Thanks for the comments. The VM has 4G; bumping the crashkernel size to 256M still fails, but moving up to 512M allows this to work, whereas the equivalent kernel in jammy works with 192M... Thoughts? And you're quite right, once the dump works, I see bug 1970672 [3.684815] kdump-tools[487]:

[Kernel-packages] [Bug 2000257] Re: kdump fails on focal + linux-image-generic-hwe-20.04 kernel

2022-12-21 Thread Ryan Harper
Testing with jammy-server image (which uses 5.15 kernel) it crashes fine. The general steps are: 1) boot VM with server image 2) apt install kdump-tools (enable kexec, enable kdump) 3) reboot, check crashdump param is set 4) kdump-config show (says ok to dump) 5) echo c | sudo tee /proc/sysrq-tr
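Step 3 in the list above can be checked with a small helper. This is a minimal sketch, assuming the parameter being checked is the kernel's crashkernel= boot argument; the function name crashkernel_value is hypothetical and is not part of kdump-tools.

```shell
#!/bin/sh
# Print the value of the first crashkernel= argument found in a kernel
# command line passed as $1. Returns non-zero when none is present.
# (Hypothetical helper for illustration; not part of kdump-tools.)
crashkernel_value() {
    for arg in $1; do
        case "$arg" in
            crashkernel=*)
                # Strip the "crashkernel=" prefix and print the remainder.
                printf '%s\n' "${arg#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a booted system it could be called as crashkernel_value "$(cat /proc/cmdline)"; an empty result with a non-zero exit status would suggest the reservation was never requested.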

[Kernel-packages] [Bug 2000257] Re: kdump fails on focal + linux-image-generic-hwe-20.04 kernel

2022-12-21 Thread Ryan Harper
Serial console output of VM kexec panic after triggering crashdump on 5.15 hwe kernel. ** Attachment added: "serial console log of 5.15 kxec panic during crash dump" https://bugs.launchpad.net/ubuntu/+source/linux-meta-hwe-5.15/+bug/2000257/+attachment/5637077/+files/ubuntu-focal-hwe-20.04-cra

[Kernel-packages] [Bug 2000257] [NEW] kdump fails on focal + linux-image-generic-hwe-20.04 kernel

2022-12-21 Thread Ryan Harper
Public bug reported: 1) $ lsb_release -rd Description:Ubuntu 20.04.5 LTS Release:20.04 2) ubuntu@ubuntu:~$ apt-cache policy makedumpfile makedumpfile: Installed: 1:1.6.7-1ubuntu2.4 Candidate: 1:1.6.7-1ubuntu2.4 Version table: *** 1:1.6.7-1ubuntu2.4 500 500 http://archi

Re: [Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-19 Thread Ryan Harper
* dann frazier <1918...@bugs.launchpad.net> [2021-03-19 12:16]: > On Fri, Mar 19, 2021 at 10:01 AM Ryan Harper <1918...@bugs.launchpad.net> > wrote: > > > > * dann frazier <1918...@bugs.launchpad.net> [2021-03-18 16:30]: > > > On Th

Re: [Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-19 Thread Ryan Harper
* dann frazier <1918...@bugs.launchpad.net> [2021-03-18 16:30]: > On Thu, Mar 18, 2021 at 12:25 PM Ryan Harper <1918...@bugs.launchpad.net> > wrote: > > > > * dann frazier <1918...@bugs.launchpad.net> [2021-03-18 12:11]: > > > On Th

Re: [Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-18 Thread Ryan Harper
* dann frazier <1918...@bugs.launchpad.net> [2021-03-18 12:11]: > On Thu, Mar 18, 2021 at 10:25 AM Ryan Harper <1918...@bugs.launchpad.net> > wrote: > > > > * dann frazier <1918...@bugs.launchpad.net> [2021-03-17 20:30]: > > > On Tu

Re: [Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-18 Thread Ryan Harper
* dann frazier <1918...@bugs.launchpad.net> [2021-03-17 20:30]: > On Tue, Mar 16, 2021 at 10:05 AM Ryan Harper <1918...@bugs.launchpad.net> > wrote: > > > > Hi Dan, > > > > Could you summarize the problem with flash-kernel and this system? > >

Re: [Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-18 Thread Ryan Harper
* dann frazier <1918...@bugs.launchpad.net> [2021-03-17 20:40]: > On Wed, Mar 17, 2021 at 4:56 PM Ryan Harper <1918...@bugs.launchpad.net> > wrote: > > > > I still don't understand: > > > > 1) why does which not find flash-kernel if it's prese

[Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-17 Thread Ryan Harper
I still don't understand: 1) why does 'which' not find flash-kernel if it's present in the ephemeral image (meaning it will also be present in the target filesystem)? 2) What is the problem with flash-kernel such that you need to dpkg-divert it? Generally, we do not want to include paths to binaries

Re: [Kernel-packages] [Bug 1918427] Re: curtin: install flash-kernel in arm64 UEFI unexpected

2021-03-16 Thread Ryan Harper
Hi Dan, Could you summarize the problem with flash-kernel and this system? * dann frazier <1918...@bugs.launchpad.net> [2021-03-15 18:25]: > Attached is a patch for curtin that works for me, though it could use > some cleanup. It installs flash-kernel in the same place GRUB gets > installed for E

[Kernel-packages] [Bug 1894910] Re: fallocate swapfile has holes on 5.8 ext4

2020-09-08 Thread Ryan Harper
Turns out it's unrelated to bcache; it is trivially reproducible: lxc launch ubuntu-daily:groovy g1 --vm lxc exec g1 bash fallocate -l 1024M /swap.img mkswap /swap.img swapon --verbose /swap.img cat /proc/swaps On the 5.4 kernel that groovy had a few weeks back this works, on daily (5.8) this fa
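One rough way to look at whether a file such as /swap.img has unallocated ranges is to compare its allocated blocks against its apparent size. This is a heuristic sketch only, not the check that swapon or the kernel performs, and it may not distinguish every case discussed in this bug; the function names are hypothetical and GNU stat is assumed.

```shell
#!/bin/sh
# Succeeds when fewer bytes are allocated than the apparent size.
# $1 = allocated block count, $2 = bytes per block, $3 = apparent size.
allocated_lt_size() {
    [ $(( $1 * $2 )) -lt "$3" ]
}

# Apply the comparison to a real file using GNU stat:
#   %b = allocated blocks, %B = bytes per block, %s = apparent size.
# The unquoted substitution is intentional: it splits into three arguments.
file_looks_sparse() {
    allocated_lt_size $(stat -c '%b %B %s' -- "$1")
}
```

For example, file_looks_sparse /swap.img succeeding would indicate that less space is allocated than the file's length implies.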

[Kernel-packages] [Bug 1894910] Re: fallocate swapfile has holes on 5.8 ext4

2020-09-08 Thread Ryan Harper
** Summary changed: - fallocate swapfile has holes on 5.8 ext4 over bcache + fallocate swapfile has holes on 5.8 ext4 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1894910 Title: fallo

[Kernel-packages] [Bug 1894910] [NEW] fallocate swapfile has holes on 5.8 ext4 over bcache

2020-09-08 Thread Ryan Harper
Public bug reported: 1) Groovy 2) root@ubuntu:/home/ubuntu# apt-cache policy linux-image-generic linux-image-generic: Installed: 5.8.0.18.22 Candidate: 5.8.0.18.22 Version table: *** 5.8.0.18.22 500 500 http://archive.ubuntu.com/ubuntu groovy/main amd64 Packages 100 /var/lib

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-07-02 Thread Ryan Harper
@ddstreet I'm not sure where upstream is going just yet. For Ubuntu, I think: 1) the bcache-tools patch should change to use the full path to bcache-super-show; 2) if we fix (1) then I think we can drop the systemd patch from a bug-fixing perspective; on the openSUSE image I did testi

Re: [Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-06-30 Thread Ryan Harper
On Tue, Jun 30, 2020 at 6:35 AM Balint Reczey <1861...@bugs.launchpad.net> wrote: > @raharper I've forwarded the systemd fix for you with minimal tidying of > the commit message https://github.com/systemd/systemd/pull/16317 Thanks! > > -- > You received this bug notification because you are su

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
systemd debdiff with a fix to skip creating /dev/disk/by-uuid for bcache backing, caching devices. ** Patch added: "lp1861941-skip-bcache-links.debdiff" https://bugs.launchpad.net/ubuntu/+source/bcache-tools/+bug/1861941/+attachment/5375730/+files/lp1861941-skip-bcache-links.debdiff

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
debdiff of the changes ** Attachment added: "bcache-tools-debdiff-1.0.8-4_to_1.0.8-4ubuntu1" https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1861941/+attachment/5375722/+files/bcache-tools-debdiff-1.0.8-4_to_1.0.8-4ubuntu1

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
Updated test to be a bit more resilient. ** Attachment added: "test-bcache-byuuid-links-fixed.sh" https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1861941/+attachment/5375723/+files/test-bcache-byuuid-links-fixed.sh

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
Tarball of a source package with a fix for this issue: bcache-tools_1.0.8.orig.tar.gz bcache-tools_1.0.8-4ubuntu1_amd64.build bcache-tools_1.0.8-4ubuntu1_amd64.buildinfo bcache-tools_1.0.8-4ubuntu1_amd64.changes bcache-tools_1.0.8-4ubuntu1_amd64.deb bcache-tools_1.0.8-4ubuntu1.debian.tar.xz bcache

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
OK. I've reviewed the kernel code, and there are no unexpected changes w.r.t the CACHED_UUID change event. So I don't think we will need any kernel changes which is good. With the small change to the 60-persistent-storage.rules to not attempt to create a /dev/disk/by-uuid symlink for the backing

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-21 Thread Ryan Harper
Digging deeper and walking through this in a focal vm, I'm seeing some strange things. Starting with a clean disk, and just creating the backing device like so: make-bcache -B /dev/vdb We see /dev/bcache0 get created with vdb as the backing device. Now, after this, I see: /dev/bcache/by-uuid/

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-21 Thread Ryan Harper
@Balint I do not think the fix you've released is correct; can you upload a new version without the scripts? Also, we should fix make-bcache -B to ensure that cset.uuid is not initialized; that may be why the kernel thinks it should emit the CACHED_UUID if the superblock of the device has a cset.

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-20 Thread Ryan Harper
That doesn't explain why they show up sometimes, but not all of the time. There are 3 devices in play here. * The backing device, let's say /dev/vda; this is where we want to store the data. * The caching device, let's say /dev/vdb; this holds the cache. * The bcache device; this only appears

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-20 Thread Ryan Harper
I guess I don't understand why we see this in focal. The two events in Colin's trace always happen on any Ubuntu kernel. We should see if we can get another udev trace on bionic that captures both CHANGE events; one will be from the bcache driver itself, and one is from the block layer. The orde

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-21 Thread Ryan Harper
Following up on my question: we should see both events in all kernels. The first event is when /dev/bcache0 is joined with a cache device and emits the CACHED_UUID value in the uevent; the UUID is the *backing device bcache metadata UUID*; it is not related to the content contained within the b

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-21 Thread Ryan Harper
So, this looks like the bug to me: Apr 21 14:15:43 ubuntu-focal systemd-udevd[1916]: bcache0: /usr/lib/udev/rules.d/60-persistent-storage.rules:112 LINK 'disk/by-uuid/30b28bee-6a1e-423d-9d53-32c78ba5454a' Apr 21 14:15:43 ubuntu-focal systemd-udevd[1916]: bcache0: Updating old name, '/dev/bcache

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-21 Thread Ryan Harper
Do we have any more information on why we now get two events in Focal?

[Kernel-packages] [Bug 1871874] Re: lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

2020-04-10 Thread Ryan Harper
> This is in an integration lab so these hosts (including maas) are stopped, > MAAS is reinstalled, and the systems are redeployed without any release > or option to wipe during a MAAS release. > Then MAAS deploys Bionic on these hosts thinking they are completely new > systems but in reality they

[Kernel-packages] [Bug 1871874] Re: lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

2020-04-10 Thread Ryan Harper
> > Ryan, > We believe this is a bug as we expect curtin to wipe the disks. In this > case it's failing to wipe the disks and occasionally that causes issues > with our automation deploying ceph on those disks. I'm still confused about what the actual error you believe is happening. Note tha

[Kernel-packages] [Bug 1871874] Re: lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

2020-04-09 Thread Ryan Harper
During a clear-holders operation we do not need to catch any failure; we're attempting to destroy the devices in question. The destruction of a device is explicitly requested in the config via a wipe: value[1] present on one or more devices that are members of the LV. 1. https://curtin.readthedo

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
I'm on Focal desktop, running kvm like so qemu-system-x86_64 -smp 2 -m 1024 --enable-kvm \ -drive id=disk0,if=none,format=qcow2,file=bionic-bcache-links.qcow2 \ -device virtio-blk-pci,drive=disk0,bootindex=0 \ -drive id=disk1,if=none,format=raw,file=bcache1.img \ -device virtio-blk-pci,dri

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
It appears that it's always been a touch racy. Curtin does not create bcaches like the script does (make-bcache --wipe-bcache -C /dev/sdc -B /dev/sdb), rather we make the cache-dev and backing dev separately, and then attach them by echoing the cacheset uuid into the bcache device attach sysfs fil
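The separate-then-attach flow described above can be sketched roughly as follows. This is an illustration under assumptions, not curtin's actual code: the function names are hypothetical, the bcacheN name depends on registration order, and the parser assumes bcache-super-show prints a line beginning with cset.uuid followed by the UUID.

```shell
#!/bin/sh
# Extract the cache set UUID from bcache-super-show output read on stdin.
# Assumes a line whose first field is "cset.uuid" and second is the UUID.
cset_uuid_from_show() {
    awk '$1 == "cset.uuid" { print $2; exit }'
}

# Create the cache and backing devices separately, then attach via sysfs.
# DESTRUCTIVE: needs root and overwrites the named devices.
# $1 = cache device, $2 = backing device, $3 = bcache name (e.g. bcache0).
# The bcache name is an assumption; it depends on registration order.
attach_separately() {
    make-bcache -C "$1" &&
    make-bcache -B "$2" &&
    uuid=$(bcache-super-show "$1" | cset_uuid_from_show) &&
    printf '%s\n' "$uuid" > "/sys/block/$3/bcache/attach"
}
```

Keeping the UUID parsing in its own function makes that step easy to exercise without touching real block devices.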

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
Requested output on bionic release image (4.15-20) ** Attachment added: "bcache-release-4.15-20.txt" https://bugs.launchpad.net/ubuntu/+source/linux-signed-5.4/+bug/1861941/+attachment/5350481/+files/bcache-release-4.15-20.txt

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
Requested data from a daily cloud image with 4.15-76 ** Attachment added: "bcache-daily-4.15-76.txt" https://bugs.launchpad.net/ubuntu/+source/linux-signed-5.4/+bug/1861941/+attachment/5350483/+files/bcache-daily-4.15-76.txt

[Kernel-packages] [Bug 1871611] Re: multipath nvme, failed to install with multipath disabled install failed crashed with CalledProcessError

2020-04-08 Thread Ryan Harper
I'm marking the curtin task invalid; this looks like a kernel/platform issue at this point. Please reopen the curtin task if curtin needs to fix something. ** Changed in: curtin Status: New => Invalid

[Kernel-packages] [Bug 1871611] Re: multipath nvme, failed to install with multipath disabled install failed crashed with CalledProcessError

2020-04-08 Thread Ryan Harper
The current error looks like /target got unmounted ... or there was some corruption that forced the mount into read-only mode... Running command ['sh', '-c', 'mkdir -p "$2" && cd "$2" && rsync -aXHAS --one-file-system "$1/" .', '--', '/media/filesystem', '/target'] with allowed return codes [0]

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-07 Thread Ryan Harper
This is still occurring daily.

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-03-23 Thread Ryan Harper
Ah, from the journal.log: Command line: BOOT_IMAGE=/boot/vmlinuz-5.3.0-1008-azure root=PARTUUID=1261a2c6-48ca-43ee-9b70-197f5b89b82c ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-03-23 Thread Ryan Harper
None of the VMs will be using spinning disks, it's all SSD; and virtual disks anyhow. I would not expect much timing difference on virtual hardware; there aren't real device or pci timing delays; though the kernel may wait for them; however, it should be consistent. In terms of the things that ca

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-03-02 Thread Ryan Harper
*** This bug is a duplicate of bug 1863261 *** https://bugs.launchpad.net/bugs/1863261 I do not believe this is a duplicate; It is more likely a *packaging* issue. The question remains for this bug, why does it only appear in the focal kernels; but not Eoan or older? And if someone could c

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-02-27 Thread Ryan Harper
Note, by fail, we mean depmod emits the error message mentioned in bug title; there is nothing *functionally* wrong; just scary/noisy output which it did not use to produce.

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-02-27 Thread Ryan Harper
An easy recreate: lxc launch ubuntu-daily:focal f1 lxc exec f1 bash apt update && apt install linux-generic This does not fail on eoan or bionic.

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-02-27 Thread Ryan Harper
I can recreate the issue inside an LXC container on focal only (not on bionic, disco, or eoan) and without any dpkg-divert of update-initramfs; as such I'm marking the curtin task invalid. ** Changed in: curtin Status: Incomplete => Invalid

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
here is some debug data I captured: -rw-rw-r-- 1 ubuntu ubuntu 729 Dec 16 12:15 bug-bionic-baseline-after-templating-Standard-DS2-v2.csv drwxrwxr-x 12 ubuntu ubuntu 22 Dec 16 12:15 bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/ -rw-rw-r-- 1 ubuntu ubuntu 721 Dec 16 13:07

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
The primary concern is time before rootfs mounting and executing /sbin/init. For the spots after that, that falls into systemd territory.

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
> 14:48:15> this slowness is happening with a particular instance type? I've not tested extensively across all types; but it's common for any of the "fast" types which have SSD backing. I've seen this on DS1_v2, DS2_v2, DS3_v3, D4-v2, B2s, A2s, L4s > 14:49:38> the slowness is happening across mul

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
Sorry for missing the questions earlier. Azure has two "machine types": gen1, which boots a non-UEFI based virtual hardware platform, and gen2, which is UEFI with newer virtual hardware; details here: https://azure.microsoft.com/en-us/updates/generation-2-virtual-machines-in-azure-public-preview/?cd

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
On Tue, Feb 25, 2020 at 2:35 PM Scott Moser wrote: > this seemed to "just work" for me. > http://paste.ubuntu.com/p/93dWDPZfZT/ Ah, I didn't check that there was an existing ubuntu/devel branch. Sorry. I've pushed a MR here: https://code.launchpad.net/~raharper/cloud-utils/+git/cloud-utils/

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
** Attachment added: "tarball of source package to upload" https://bugs.launchpad.net/cloud-init/+bug/1834875/+attachment/5330895/+files/cloud-utils_0.31-7-gd99b2d76-source.tar.xz

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
@Scott, cloud-utils isn't quite new-upstream-snapshot out of the box; the debian dir does not contain the changelog; however, I think I've got this sorted out. I've a MP I can put up; but it only will show the add of the changelog file. I'll attach a debdiff and a source package.

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
** Patch added: "debdiff showing the changes to upload to fix the bug." https://bugs.launchpad.net/cloud-init/+bug/1834875/+attachment/5330894/+files/cloud-utils_0.31-6_to_0.31-7.debdiff

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-14 Thread Ryan Harper
Here are the upstream changes to growpart I'm suggesting: https://code.launchpad.net/~raharper/cloud-utils/+git/cloud-utils/+merge/379177 I've also proposed modifications to cloud-init's cc_growpart as a further method to aid debugging if this is hit, as well as some mitigation around the race. h

[Kernel-packages] [Bug 1862661] Re: zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
Sorry, I do not expect the zfs tools to function inside the unpriv container. There is some packaging change between previous releases which did not report an error to apt/dpkg when installing.

[Kernel-packages] [Bug 1862661] Re: zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
The latter; this may only be a packaging issue in that the bionic release of the tools doesn't report an error up through apt, whereas focal (and eoan) report an error to apt. ** Changed in: zfs-linux (Ubuntu) Status: Incomplete => New

[Kernel-packages] [Bug 1862661] Re: zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
Note, the fact that these services fail isn't new; they've failed for a long time. However, reporting the service failure to apt is new. For example, on bionic we don't see an apt error: # lsb_release -rd Description:Ubuntu 18.04.4 LTS Release:18.04 # apt-cache policy zfsutils-linu

[Kernel-packages] [Bug 1862661] [NEW] zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
Public bug reported: 1) # lsb_release -rd Description:Ubuntu Focal Fossa (development branch) Release:20.04 2) # apt-cache policy zfsutils-linux zfsutils-linux: Installed: (none) Candidate: 0.8.3-1ubuntu3 Version table: 0.8.3-1ubuntu3 500 500 http://archive.ubuntu.

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-10 Thread Ryan Harper
Yes, this is my read on the issue as well. The trigger is related to the inotify watch that systemd-udevd puts on the disk. Something that might help, which we could try per xnox's comment around use of flock: if growpart were to flock /dev/sda (we need to sort out what flags are needed to prevent

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-06 Thread Ryan Harper
Cloud-init service starts and will run growpart, etc Feb 06 00:37:26 ubuntu systemd[1]: Starting Initial cloud-init job (pre-networking)... Feb 06 00:37:37 test-xrdpdnvfctsofyygmzan systemd[1]: Starting Initial cloud-init job (metadata service crawler)... Something has modified sdb1 (growpart/s

[Kernel-packages] [Bug 1858615] Re: Fail to boot when NoCloud datasource is included

2020-01-08 Thread Ryan Harper
Thanks for filing a bug. I've added a dmidecode task to track the issue with the tool. It may also affect the kernel package, and possibly firmware (though that's not something that Ubuntu provides). Cloud-init and any other tool may invoke this package and it should not reboot the system; but t

[Kernel-packages] [Bug 1858495] [NEW] multiple long delays during kernel and userspace boot

2020-01-06 Thread Ryan Harper
Public bug reported: Booting some Bionic instances in Azure (gen1 machines), I see some large delays during kernel/userspace boot, and it would be good to understand what's going on. Additionally, the areas during boot that see delays are different for an image that's been created from a templat

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
On Thu, Nov 7, 2019 at 2:41 PM Dan Streetman wrote: > > Issuing a second > > trigger will repeat this. > > IMO, that's a non-zero amount of time that slows the boot down, so I'd > like > > to avoid that. > > systemd-udev-trigger.service retriggers *everything* at boot (except in > an unprivileged

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
On Thu, Nov 7, 2019 at 11:30 AM Dimitri John Ledkov wrote: > > So that means we have this sequence of events: > > a.) growpart change partition table > > b.) growpart call partx > > c.) udev created and events being processed > > That is not true. whilst sfdisk is deleting, creating, finishing

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
On Thu, Nov 7, 2019 at 1:30 PM Dan Streetman wrote: > > Yes, settle does not help. > > Well, I didn't suggest just to settle ;-) > Sorry; long bug thread. > > I'm currently suggesting as a heavy-handed workaround. > > I don't really see why you think this is heavy-handed, but I must be > missi

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
@ddstreet Yes, settle does not help. Re-triggering udevadm trigger --action=add /sys/class/block/sda Would re-run all of them after the partition change has occurred, which is what I'm currently suggesting as a heavy-handed workaround. I would like to understand *why* the udevd/kernel pair exhi

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
> it will prevent udevd from running the rules against it. Thus effectively the event will be fired and done, but nothing actually executed for it. Interesting, I suspect this is the race we see. The events are emitted but no actions are taken (i.e. we didn't get our by-partuuid symlink created). > I someh

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-06 Thread Ryan Harper
A couple of comments on the suggested path: > Imho the sequence of commands should be: > * take flock on the device, to neutralise udev +1 on this approach. Do you know if the flock will block systemd's inotify write watch on the block device which triggers udevd? This is the typical race we se
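The "take flock on the device" step quoted above could look roughly like this with flock(1) from util-linux. It is a sketch only: the wrapper name with_device_lock is hypothetical, and whether holding the lock actually closes the race is the open question in this thread.

```shell
#!/bin/sh
# Run a command while holding an exclusive flock on a device node.
# $1 = device (e.g. /dev/sda); remaining arguments = the command to run.
# (Hypothetical wrapper for illustration; not part of growpart.)
with_device_lock() {
    dev=$1
    shift
    # flock takes the lock on "$dev", runs the command, then releases it.
    flock --exclusive "$dev" "$@"
}
```

A call such as with_device_lock /dev/sda growpart /dev/sda 1 would hold the lock for the duration of the resize; the device and partition number here are placeholders.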

[Kernel-packages] [Bug 1830740] Re: [linux-azure] Delay during boot of some instance sizes

2019-10-29 Thread Ryan Harper
This looks hypervisor/kernel related. Some observations: The cloud-init.log in the collect-logs shows cloud-init running twice. The first time, run-time is as expected, approx 17s of cloud-init time; the second boot took much longer, but the bulk of the time is in udev 2019-06-20 18:09:18,951 - u

[Kernel-packages] [Bug 1838278] Re: zfs-initramfs wont mount rpool

2019-09-17 Thread Ryan Harper
A quick hack shows that exporting after unmount works around the issue. I'd like to understand if we need/should use import -f; however, curtin can now ensure it exports pools it has created at the end of install. ** Also affects: curtin (Ubuntu) Importance: Undecided Status: New ** Changed in: curtin

[Kernel-packages] [Bug 1838278] Re: zfs-initramfs wont mount rpool

2019-09-17 Thread Ryan Harper
Curtin hasn't ever run zfs export on the pools; so either something else did this previously, or it wasn't a requirement. I can see if adding a zfs export on the pool works around the issue.

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-23 Thread Ryan Harper
Overnight testing of the revised deployment configuration has no errors, 200 runs completed.

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Ryan Harper
Finally, I did verify xenial proposed with our original test. I had over 100 installs with no issue. @Jason Have you had any runs on Xenial or Disco? (or do you not test those)?

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Ryan Harper
Also, I had some confusion earlier about what kernel I was testing. apt-cache policy shows the package version 4.15.0.59.61, however that's the meta package, the actual kernel is the .66 one. # dpkg --list | grep linux-image ii linux-image-4.15.0-59-generic 4.15.0-59.66

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Ryan Harper
I've adjusted my bionic testing with the simpler configuration. I cannot reproduce the failure so far. I'll leave this running overnight. I suspect there's something else going on on bare metal that we can't reproduce in a VM.

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-20 Thread Ryan Harper
I've verified that the disco-proposed linux-virtual kernel passes our test case (curtin-nvme). I've run 50 installs with no issue. root@ubuntu:~# apt-cache policy linux-virtual linux-virtual: Installed: 5.0.0.26.27 Candidate: 5.0.0.26.27 Version table: *** 5.0.0.26.27 500 500 http://arc

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-20 Thread Ryan Harper
I've verified that the bionic-proposed linux-virtual kernel passes our test case (curtin-nvme). I've run 50 installs with no issue. ubuntu@ubuntu:~$ apt-cache policy linux-virtual linux-virtual: Installed: (none) Candidate: 4.15.0.59.61 Version table: 4.15.0.59.61 500 500 http://arch

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-05 Thread Ryan Harper
On Mon, Aug 5, 2019 at 1:19 PM Ryan Harper wrote: > > > On Mon, Aug 5, 2019 at 8:01 AM Andrea Righi > wrote: > >> Ryan, I've uploaded a new test kernel with the fix mentioned in the >> comment before: >> >> https://kernel.ubuntu.com/~arighi/LP-

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-05 Thread Ryan Harper
On Mon, Aug 5, 2019 at 8:01 AM Andrea Righi wrote: > Ryan, I've uploaded a new test kernel with the fix mentioned in the > comment before: > > https://kernel.ubuntu.com/~arighi/LP-1796292/4.15.0-56.62~lp1796292+4/ > > I've performed over 100 installations using curtin-nvme.sh > (install_count = 1

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-02 Thread Ryan Harper
Trying the first kernel without the change event sauce also fails: [ 532.823594] bcache: run_cache_set() invalidating existing data [ 532.828876] bcache: register_cache() registered cache device nvme0n1p2 [ 532.869716] bcache: register_bdev() registered backing device vda1 [ 532.994355] bcache

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-02 Thread Ryan Harper
I tried the +3 kernel first, and I got 3 installs and then this hang: [ 549.828710] bcache: run_cache_set() invalidating existing data [ 549.836485] bcache: register_cache() registered cache device nvme1n1p2 [ 549.937486] bcache: register_bdev() registered backing device vdg [ 550.018855] bca

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-01 Thread Ryan Harper
Reproducer script ** Attachment added: "curtin-nvme.sh" https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-01 Thread Ryan Harper
On Thu, Aug 1, 2019 at 10:15 AM Andrea Righi wrote: > Thanks Ryan, this is very interesting: > > [ 259.411486] bcache: register_bcache() error /dev/vdg: device already > registered (emitting change event) > [ 259.537070] bcache: register_bcache() error /dev/vdg: device already > registered (emitt

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-01 Thread Ryan Harper
ubuntu@ubuntu:~$ uname -r 4.15.0-56-generic ubuntu@ubuntu:~$ cat /proc/version Linux version 4.15.0-56-generic (arighi@kathleen) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #62~lp1796292 SMP Thu Aug 1 07:45:21 UTC 2019 This failed on the second install while running bcache-super-show /dev

Re: [Kernel-packages] [Bug 1838276] Re: zfs-module depedency selects random kernel package to install

2019-07-29 Thread Ryan Harper
On Mon, Jul 29, 2019 at 11:35 AM Richard Laager wrote: > What was the expected behavior from your perspective? > > The ZFS utilities are useless without a ZFS kernel module. It seems to > me that this is working fine, and installing the ZFS utilities in this > environment doesn’t make sense. > Y

[Kernel-packages] [Bug 1838276] [NEW] zfs-module depedency selects random kernel package to install

2019-07-29 Thread Ryan Harper
Public bug reported: In MAAS (ephemeral environment) or LXD where no kernel package is currently installed; installing the zfsutils-linux package will pull in a kernel package from the zfs-modules dependency. 1) # lsb_release -rd Description:Ubuntu Eoan Ermine (development branch) Release:

[Kernel-packages] [Bug 1838278] [NEW] zfs-initramfs wont mount rpool

2019-07-29 Thread Ryan Harper
Public bug reported: 1. Eoan 2. http://archive.ubuntu.com/ubuntu eoan/main amd64 zfs-initramfs amd64 0.8.1-1ubuntu7 [23.1 kB] 3. ZFS rootfs rpool is mounted at boot 4. Booting an image with a rootfs rpool: [0.00] Linux version 5.2.0-8-generic (buildd@lgw01-amd64-015) (gcc version 9.1.

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-10 Thread Ryan Harper
The newer kernel went about 16 runs and then popped this: [ 2137.810559] md: md0: resync done. [ 2296.795633] INFO: task python3:11639 blocked for more than 120 seconds. [ 2296.800320] Tainted: P O 4.15.0-55-generic #60+lp1796292+1 [ 2296.805097] "echo 0 > /proc/sys/kernel/hun

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-10 Thread Ryan Harper
Andrea, thanks for the updated kernels. On the first one, I got 23 installs before I ran into an issue; I'll test the newer kernel next. https://paste.ubuntu.com/p/2B4Kk3wbvQ/ [ 5436.870482] BUG: unable to handle kernel NULL pointer dereference at 09b8 [ 5436.873374] IP: cache_set_

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-03 Thread Ryan Harper
Without the patch, I can reproduce the hang fairly frequently, in one or two loops, which fails in this way: [ 1069.711956] bcache: cancel_writeback_rate_update_dwork() give up waiting for dc->writeback_write_update to quit [ 1088.583986] INFO: task kworker/0:2:436 blocked for more than 120 secon

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-03 Thread Ryan Harper
I've set up our integration test that runs the CDO-QA bcache/ceph setup. On the updated kernel I got through 10 loops on the deployment before it stacktraced: http://paste.ubuntu.com/p/zVrtvKBfCY/ [ 3939.846908] bcache: bch_cached_dev_attach() Caching vdd as bcache5 on set 275985b3-da58-41f8

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-06-03 Thread Ryan Harper
On Mon, Jun 3, 2019 at 2:05 PM Andrey Grebennikov < agrebennikov1...@gmail.com> wrote: > Is there an estimate on getting this package in bionic-updates please? > We are starting an SRU of curtin this week. SRUs take at least 7 days from when they hit -proposed, possibly longer depending on test

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-05-09 Thread Ryan Harper
On Wed, May 8, 2019 at 11:55 PM Trent Lloyd wrote: > I have been running into this (curtin 18.1-17-gae48e86f- > 0ubuntu1~16.04.1) > > I think this commit basically agrees with my thoughts but I just wanted > to share them explicitly in case they are interesting > > (1) If you *unregister* the ca

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-05-09 Thread Ryan Harper
Xenial GA kernel bcache unregister oops: http://paste.ubuntu.com/p/BzfHFjzZ8y/

[Kernel-packages] [Bug 1825413] Re: mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-05-06 Thread Ryan Harper
Sorry, I missed responding. These were run in separate VMs; this is under our curtin vmtest integration testing. Yes, let me get the q35 trace; it doesn't happen as often.

[Kernel-packages] [Bug 1825413] Re: mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-04-22 Thread Ryan Harper
Hi Seth, notice that only one of the stack traces has the floppy; the mdadm one does not. I've also recreated this on a qemu q35 machine type, which does not include the floppy device.

[Kernel-packages] [Bug 1825413] Re: mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-04-18 Thread Ryan Harper
root@ubuntu:~# lspci -v -nn 00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02) Subsystem: Red Hat, Inc. Qemu virtual machine [1af4:1100] Flags: fast devsel 00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton I

[Kernel-packages] [Bug 1825413] [NEW] mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-04-18 Thread Ryan Harper
Public bug reported: 1. disco 2. # apt-cache policy linux-image-virtual linux-image-virtual: Installed: 5.0.0.13.14 Candidate: 5.0.0.13.14 Version table: *** 5.0.0.13.14 500 500 http://archive.ubuntu.com/ubuntu disco/main amd64 Packages 100 /var/lib/dpkg/status 3. installat

[Kernel-packages] [Bug 1820754] [NEW] bcache null pointer exception , recursive fault

2019-03-18 Thread Ryan Harper
Public bug reported: 1) # cat /proc/version_signature Ubuntu 3.13.0-166.216-generic 3.13.11-ckt39 ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: linux-image-generic 3.13.0.167.178 ProcVersionSignature: Ubuntu 3.13.0-166.216-generic 3.13.11-ckt39 Uname: Linux 3.13.0-166-generic x86_64 Alsa

[Kernel-packages] [Bug 1820754] Re: bcache null pointer exception , recursive fault

2019-03-18 Thread Ryan Harper
Kernel oops when attempting to stop an online bcache device. ** Attachment added: "trusty-bcache-null.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1820754/+attachment/5247406/+files/trusty-bcache-null.txt ** Tags added: curtin

[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-02-21 Thread Ryan Harper
https://github.com/lxc/lxd/issues/4656 ** Bug watch added: LXD bug tracker #4656 https://github.com/lxc/lxd/issues/4656
