[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-10 Thread Ryan Harper
Andrea, thanks for the updated kernels. On the first one, I got 23 installs before I ran into an issue; I'll test the newer kernel next. https://paste.ubuntu.com/p/2B4Kk3wbvQ/ [ 5436.870482] BUG: unable to handle kernel NULL pointer dereference at 09b8 [ 5436.873374] IP: cache_set_

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-10 Thread Ryan Harper
The newer kernel went about 16 runs and then popped this: [ 2137.810559] md: md0: resync done. [ 2296.795633] INFO: task python3:11639 blocked for more than 120 seconds. [ 2296.800320] Tainted: P O 4.15.0-55-generic #60+lp1796292+1 [ 2296.805097] "echo 0 > /proc/sys/kernel/hun

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-20 Thread Ryan Harper
I've verified that the bionic-proposed linux-virtual kernel passes our test case (curtin-nvme); I've done 50 installs with no issue. ubuntu@ubuntu:~$ apt-cache policy linux-virtual linux-virtual: Installed: (none) Candidate: 4.15.0.59.61 Version table: 4.15.0.59.61 500 500 http://arch

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-20 Thread Ryan Harper
I've verified that the disco-proposed linux-virtual kernel passes our test case (curtin-nvme); I've done 50 installs with no issue. root@ubuntu:~# apt-cache policy linux-virtual linux-virtual: Installed: 5.0.0.26.27 Candidate: 5.0.0.26.27 Version table: *** 5.0.0.26.27 500 500 http://arc

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Ryan Harper
I've adjusted my bionic testing with the simpler configuration. I cannot reproduce the failure so far. I'll leave this running overnight. I suspect there's something else going on on bare metal that we can't reproduce in a VM. -- You received this bug notification because you are a member of Kern

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Ryan Harper
Also, I had some confusion earlier about what kernel I was testing. apt-cache policy shows the package version 4.15.0.59.61; however, that's the meta package, and the actual kernel is the .66 one. # dpkg --list | grep linux-image ii linux-image-4.15.0-59-generic 4.15.0-59.66

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Ryan Harper
Finally, I did verify xenial proposed with our original test. I had over 100 installs with no issue. @Jason Have you had any runs on Xenial or Disco? (or do you not test those)? -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-23 Thread Ryan Harper
Overnight testing of the revised deployment configuration has no errors, 200 runs completed. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1784665 Title: bcache: bch_allocator_thread():

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-05-09 Thread Ryan Harper
Xenial GA kernel bcache unregister oops: http://paste.ubuntu.com/p/BzfHFjzZ8y/ -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1796292 Title: Tight timeout for bcache removal causes spur

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-05-09 Thread Ryan Harper
On Wed, May 8, 2019 at 11:55 PM Trent Lloyd wrote: > I have been running into this (curtin 18.1-17-gae48e86f- > 0ubuntu1~16.04.1) > > I think this commit basically agrees with my thoughts but I just wanted > to share them explicitly in case they are interesting > > (1) If you *unregister* the ca

[Kernel-packages] [Bug 1820754] Re: bcache null pointer exception , recursive fault

2019-03-18 Thread Ryan Harper
Kernel oops when attempting to stop an online bcache device. ** Attachment added: "trusty-bcache-null.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1820754/+attachment/5247406/+files/trusty-bcache-null.txt ** Tags added: curtin -- You received this bug notification because you a

[Kernel-packages] [Bug 1820754] [NEW] bcache null pointer exception , recursive fault

2019-03-18 Thread Ryan Harper
Public bug reported: 1) # cat /proc/version_signature Ubuntu 3.13.0-166.216-generic 3.13.11-ckt39 ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: linux-image-generic 3.13.0.167.178 ProcVersionSignature: Ubuntu 3.13.0-166.216-generic 3.13.11-ckt39 Uname: Linux 3.13.0-166-generic x86_64 Alsa

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-06-03 Thread Ryan Harper
On Mon, Jun 3, 2019 at 2:05 PM Andrey Grebennikov <agrebennikov1...@gmail.com> wrote: > Is there an estimate on getting this package in bionic-updates please? > We are starting an SRU of curtin this week. SRUs take at least 7 days from when they hit -proposed, possibly longer depending on test

[Kernel-packages] [Bug 1825413] [NEW] mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-04-18 Thread Ryan Harper
Public bug reported: 1. disco 2. # apt-cache policy linux-image-virtual linux-image-virtual: Installed: 5.0.0.13.14 Candidate: 5.0.0.13.14 Version table: *** 5.0.0.13.14 500 500 http://archive.ubuntu.com/ubuntu disco/main amd64 Packages 100 /var/lib/dpkg/status 3. installat

[Kernel-packages] [Bug 1825413] Re: mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-04-18 Thread Ryan Harper
root@ubuntu:~# lspci -v -nn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
        Subsystem: Red Hat, Inc. Qemu virtual machine [1af4:1100]
        Flags: fast devsel
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton I

[Kernel-packages] [Bug 1825413] Re: mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-04-22 Thread Ryan Harper
Hi Seth, notice that only one of the stack traces has the floppy; the mdadm one does not. I've also recreated this on a qemu q35 machine type which does not include the floppy device. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in

[Kernel-packages] [Bug 1825413] Re: mdadm, mkfs, other io commands hang, stuck task, bad rip

2019-05-06 Thread Ryan Harper
Sorry, I missed responding. These were run in separate VMs; this is under our curtin vmtest integration testing. Yes, let me get the q35 trace; it doesn't happen as often. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. h

[Kernel-packages] [Bug 1838278] [NEW] zfs-initramfs wont mount rpool

2019-07-29 Thread Ryan Harper
Public bug reported: 1. Eoan 2. http://archive.ubuntu.com/ubuntu eoan/main amd64 zfs-initramfs amd64 0.8.1-1ubuntu7 [23.1 kB] 3. ZFS rootfs rpool is mounted at boot 4. Booting an image with a rootfs rpool: [0.00] Linux version 5.2.0-8-generic (buildd@lgw01-amd64-015) (gcc version 9.1.

[Kernel-packages] [Bug 1838276] [NEW] zfs-module depedency selects random kernel package to install

2019-07-29 Thread Ryan Harper
Public bug reported: In MAAS (ephemeral environment) or LXD, where no kernel package is currently installed, installing the zfsutils-linux package will pull in a kernel package from the zfs-modules dependency. 1) # lsb_release -rd Description:Ubuntu Eoan Ermine (development branch) Release:
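
A minimal recreate sketch of the scenario above (the container name and image alias are illustrative, not taken from the report):

lxc launch ubuntu-daily:eoan zfs-dep-test
lxc exec zfs-dep-test -- sh -c 'apt update && apt install -y zfsutils-linux'
# then check which kernel package apt pulled in to satisfy zfs-modules
lxc exec zfs-dep-test -- sh -c "dpkg -l 'linux-image*'"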

Re: [Kernel-packages] [Bug 1838276] Re: zfs-module depedency selects random kernel package to install

2019-07-29 Thread Ryan Harper
On Mon, Jul 29, 2019 at 11:35 AM Richard Laager wrote: > What was the expected behavior from your perspective? > > The ZFS utilities are useless without a ZFS kernel module. It seems to > me that this is working fine, and installing the ZFS utilities in this > environment doesn’t make sense. > Y

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-01 Thread Ryan Harper
ubuntu@ubuntu:~$ uname -r
4.15.0-56-generic
ubuntu@ubuntu:~$ cat /proc/version
Linux version 4.15.0-56-generic (arighi@kathleen) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #62~lp1796292 SMP Thu Aug 1 07:45:21 UTC 2019
This failed on the second install while running bcache-super-show /dev

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-01 Thread Ryan Harper
On Thu, Aug 1, 2019 at 10:15 AM Andrea Righi wrote: > Thanks Ryan, this is very interesting: > > [ 259.411486] bcache: register_bcache() error /dev/vdg: device already > registered (emitting change event) > [ 259.537070] bcache: register_bcache() error /dev/vdg: device already > registered (emitt

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-01 Thread Ryan Harper
Reproducer script ** Attachment added: "curtin-nvme.sh" https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.n

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-02 Thread Ryan Harper
I tried the +3 kernel first, and I got 3 installs and then this hang: [ 549.828710] bcache: run_cache_set() invalidating existing data [ 549.836485] bcache: register_cache() registered cache device nvme1n1p2 [ 549.937486] bcache: register_bdev() registered backing device vdg [ 550.018855] bca

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-02 Thread Ryan Harper
Trying the first kernel without the change event sauce also fails: [ 532.823594] bcache: run_cache_set() invalidating existing data [ 532.828876] bcache: register_cache() registered cache device nvme0n1p2 [ 532.869716] bcache: register_bdev() registered backing device vda1 [ 532.994355] bcache

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-05 Thread Ryan Harper
On Mon, Aug 5, 2019 at 8:01 AM Andrea Righi wrote: > Ryan, I've uploaded a new test kernel with the fix mentioned in the > comment before: > > https://kernel.ubuntu.com/~arighi/LP-1796292/4.15.0-56.62~lp1796292+4/ > > I've performed over 100 installations using curtin-nvme.sh > (install_count = 1

Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-08-05 Thread Ryan Harper
On Mon, Aug 5, 2019 at 1:19 PM Ryan Harper wrote: > > > On Mon, Aug 5, 2019 at 8:01 AM Andrea Righi > wrote: > >> Ryan, I've uploaded a new test kernel with the fix mentioned in the >> comment before: >> >> https://kernel.ubuntu.com/~arighi/LP-

[Kernel-packages] [Bug 1862661] [NEW] zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
Public bug reported: 1) # lsb_release -rd Description:Ubuntu Focal Fossa (development branch) Release:20.04 2) # apt-cache policy zfsutils-linux zfsutils-linux: Installed: (none) Candidate: 0.8.3-1ubuntu3 Version table: 0.8.3-1ubuntu3 500 500 http://archive.ubuntu.

[Kernel-packages] [Bug 1862661] Re: zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
Note, the fact that these services fail isn't new; they've failed for a long time. However, reporting the service failure to apt is new. For example, on bionic we don't see an apt error: # lsb_release -rd Description:Ubuntu 18.04.4 LTS Release:18.04 # apt-cache policy zfsutils-linu

[Kernel-packages] [Bug 1862661] Re: zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
The latter; this may only be a packaging issue, in that the bionic release of the tools doesn't report an error up through apt, whereas focal (and eoan) report an error to apt. ** Changed in: zfs-linux (Ubuntu) Status: Incomplete => New -- You received this bug notification because you are a memb

[Kernel-packages] [Bug 1862661] Re: zfs-mount.service and others fail inside unpriv containers

2020-02-10 Thread Ryan Harper
Sorry, I do not expect the zfs tools to function inside the unpriv container. There is some packaging change between previous releases which did not report an error to apt/dpkg when installing. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-14 Thread Ryan Harper
Here are the upstream changes to growpart I'm suggesting: https://code.launchpad.net/~raharper/cloud-utils/+git/cloud-utils/+merge/379177 I've also proposed modifications to cloud-init's cc_growpart as a further method to aid debugging if this is hit, as well as some mitigation around the race. h

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-07 Thread Ryan Harper
This is still occurring daily. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-5.4 in Ubuntu. https://bugs.launchpad.net/bugs/1861941 Title: bcache by-uuid links disappear after mounting bcache0 Status in linux-signed-5.4

[Kernel-packages] [Bug 1871611] Re: multipath nvme, failed to install with multipath disabled install failed crashed with CalledProcessError

2020-04-08 Thread Ryan Harper
The current error looks like /target got unmounted ... or there was some corruption that forced the mount into read-only mode... Running command ['sh', '-c', 'mkdir -p "$2" && cd "$2" && rsync -aXHAS --one-file-system "$1/" .', '--', '/media/filesystem', '/target'] with allowed return codes [0]

[Kernel-packages] [Bug 1871611] Re: multipath nvme, failed to install with multipath disabled install failed crashed with CalledProcessError

2020-04-08 Thread Ryan Harper
I'm marking curtin task invalid; this looks like kernel/platform issue at this point. Please reopen curtin task if curtin needs to fix something. ** Changed in: curtin Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subsc

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
Requested output on bionic release image (4.15-20) ** Attachment added: "bcache-release-4.15-20.txt" https://bugs.launchpad.net/ubuntu/+source/linux-signed-5.4/+bug/1861941/+attachment/5350481/+files/bcache-release-4.15-20.txt -- You received this bug notification because you are a member o

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
Requested data from a daily cloud image with 4.15-76 ** Attachment added: "bcache-daily-4.15-76.txt" https://bugs.launchpad.net/ubuntu/+source/linux-signed-5.4/+bug/1861941/+attachment/5350483/+files/bcache-daily-4.15-76.txt -- You received this bug notification because you are a member of

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
I'm on Focal desktop, running kvm like so:
qemu-system-x86_64 -smp 2 -m 1024 --enable-kvm \
  -drive id=disk0,if=none,format=qcow2,file=bionic-bcache-links.qcow2 \
  -device virtio-blk-pci,drive=disk0,bootindex=0 \
  -drive id=disk1,if=none,format=raw,file=bcache1.img \
  -device virtio-blk-pci,dri

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-09 Thread Ryan Harper
It appears that it's always been a touch racy. Curtin does not create bcaches like the script does (make-bcache --wipe-bcache -C /dev/sdc -B /dev/sdb), rather we make the cache-dev and backing dev separately, and then attach them by echoing the cacheset uuid into the bcache device attach sysfs fil
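
A rough sketch of that separate create-then-attach flow (device names are illustrative, not taken from the bug; assumes bcache-tools is installed):

make-bcache -C /dev/nvme0n1p2        # create the cache device on its own
make-bcache -B /dev/vdb              # create the backing device separately
# read the cache set UUID from the cache device superblock
cset_uuid=$(bcache-super-show /dev/nvme0n1p2 | awk '/cset.uuid/ {print $2}')
# attach the backing device to the cache set via sysfs
echo "$cset_uuid" > /sys/block/bcache0/bcache/attach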

[Kernel-packages] [Bug 1871874] Re: lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

2020-04-09 Thread Ryan Harper
During a clear-holders operation we do not need to catch any failure; we're attempting to destroy the devices in question. The destruction of a device is explicitly requested in the config via a wipe: value[1] present on one or more devices that are members of the LV. 1. https://curtin.readthedo
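
For reference, a sketch of the kind of storage-config entry that wipe: value refers to (the id, path, and chosen wipe mode are examples; see the curtin docs linked above for the actual schema):

cat > wipe-example.yaml <<'EOF'
storage:
  version: 1
  config:
    - id: disk-sdb
      type: disk
      path: /dev/sdb
      wipe: superblock   # explicit request to destroy any existing metadata on this device
EOF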

[Kernel-packages] [Bug 1871874] Re: lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

2020-04-10 Thread Ryan Harper
> > Ryan, > We believe this is a bug as we expect curtin to wipe the disks. In this > case it's failing to wipe the disks and occasionally that causes issues > with our automation deploying ceph on those disks. I'm still confused about what actual error you believe is happening. Note tha

[Kernel-packages] [Bug 1871874] Re: lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

2020-04-10 Thread Ryan Harper
> This is in an integration lab so these hosts (including maas) are stopped, > MAAS is reinstalled, and the systems are redeployed without any release > or option to wipe during a MAAS release. > Then MAAS deploys Bionic on these hosts thinking they are completely new > systems but in reality they

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-03-23 Thread Ryan Harper
None of the VMs will be using spinning disks; it's all SSD, and virtual disks anyhow. I would not expect much timing difference on virtual hardware; there aren't real device or PCI timing delays (though the kernel may wait for them), and it should be consistent. In terms of the things that ca

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-03-23 Thread Ryan Harper
Ah, from the journal.log: Command line: BOOT_IMAGE=/boot/vmlinuz-5.3.0-1008-azure root=PARTUUID=1261a2c6-48ca-43ee-9b70-197f5b89b82c ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-21 Thread Ryan Harper
Do we have any more information on why we now get two events in Focal? -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1861941 Title: bcache by-uuid links disappear after mounting bcache0

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-21 Thread Ryan Harper
So, this looks like the bug to me:
Apr 21 14:15:43 ubuntu-focal systemd-udevd[1916]: bcache0: /usr/lib/udev/rules.d/60-persistent-storage.rules:112 LINK 'disk/by-uuid/30b28bee-6a1e-423d-9d53-32c78ba5454a'
Apr 21 14:15:43 ubuntu-focal systemd-udevd[1916]: bcache0: Updating old name, '/dev/bcache

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-04-21 Thread Ryan Harper
Following up on my question: we should see both events in all kernels. The first event is when /dev/bcache0 is joined with a cache device and emits the CACHED_UUID value in the uevent; the UUID is the *backing device bcache metadata UUID*; it is not related to the content contained within the b
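
A couple of commands that show what udev has recorded for the device (a generic sketch, not output captured from this bug):

# properties udev holds for the bcache device, including any CACHED_UUID from the uevent
udevadm info --query=property /dev/bcache0 | grep -E 'CACHED_UUID|ID_FS_UUID|DEVLINKS'
# watch block-subsystem CHANGE events live while attaching the cache / mounting
udevadm monitor --udev --property --subsystem-match=block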

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-20 Thread Ryan Harper
I guess I don't understand why we see this in focal. The two events in Colin's trace always happen on any Ubuntu kernel. We should see if we can get another udev trace on bionic that captures both CHANGE events; one will be from the bcache driver itself, and one is from the block layer. The orde

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-20 Thread Ryan Harper
That doesn't explain why they show up sometimes, but not all of the time. There are 3 devices in play here.
* The backing device, let's say /dev/vda; this is where we want to store the data.
* The caching device, let's say /dev/vdb; this holds the cache.
* The bcache device; this only appears

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-21 Thread Ryan Harper
@Balint I do not think the fix you've released is correct; can you upload a new version without the scripts? Also, we should fix make-bcache -B to ensure that cset.uuid is not initialized; that may be why the kernel thinks it should emit the CACHED_UUID if the superblock of the device has a cset.

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-21 Thread Ryan Harper
Digging deeper and walking through this in a focal vm, I'm seeing some strange things. Starting with a clean disk, and just creating the backing device like so:
make-bcache -B /dev/vdb
We see /dev/bcache0 get created with vdb as the backing device. Now, after this, I see: /dev/bcache/by-uuid/

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
OK. I've reviewed the kernel code, and there are no unexpected changes w.r.t the CACHED_UUID change event. So I don't think we will need any kernel changes which is good. With the small change to the 60-persistent-storage.rules to not attempt to create a /dev/disk/by-uuid symlink for the backing
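
A sketch of the kind of rules guard being described (not the actual patch; the real change is in the systemd debdiff attached in a later comment):

# something along these lines, placed ahead of the by-uuid LINK rule, would skip
# by-uuid symlinks for devices carrying a bcache superblock:
#   ENV{ID_FS_TYPE}=="bcache", GOTO="persistent_storage_end"
# to find where the by-uuid rule lives today:
grep -n 'by-uuid' /usr/lib/udev/rules.d/60-persistent-storage.rules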

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
Tarball of a source package with a fix for this issue:
bcache-tools_1.0.8.orig.tar.gz
bcache-tools_1.0.8-4ubuntu1_amd64.build
bcache-tools_1.0.8-4ubuntu1_amd64.buildinfo
bcache-tools_1.0.8-4ubuntu1_amd64.changes
bcache-tools_1.0.8-4ubuntu1_amd64.deb
bcache-tools_1.0.8-4ubuntu1.debian.tar.xz
bcache

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
Updated test to be a bit more resilient. ** Attachment added: "test-bcache-byuuid-links-fixed.sh" https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1861941/+attachment/5375723/+files/test-bcache-byuuid-links-fixed.sh -- You received this bug notification because you are a member of

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
debdiff of the changes ** Attachment added: "bcache-tools-debdiff-1.0.8-4_to_1.0.8-4ubuntu1" https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1861941/+attachment/5375722/+files/bcache-tools-debdiff-1.0.8-4_to_1.0.8-4ubuntu1 -- You received this bug notification because you are a m

[Kernel-packages] [Bug 1861941] Re: bcache by-uuid links disappear after mounting bcache0

2020-05-22 Thread Ryan Harper
systemd debdiff with a fix to skip creating /dev/disk/by-uuid for bcache backing, caching devices. ** Patch added: "lp1861941-skip-bcache-links.debdiff" https://bugs.launchpad.net/ubuntu/+source/bcache-tools/+bug/1861941/+attachment/5375730/+files/lp1861941-skip-bcache-links.debdiff -- You r

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
** Patch added: "debdiff showing the changes to upload to fix the bug." https://bugs.launchpad.net/cloud-init/+bug/1834875/+attachment/5330894/+files/cloud-utils_0.31-6_to_0.31-7.debdiff -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
@Scott, cloud-utils isn't quite new-upstream-snapshot out of the box; the debian dir does not contain the changelog. However, I think I've got this sorted out. I have an MP I can put up, but it will only show the addition of the changelog file. I'll attach a debdiff and a source package. -- You recei

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
** Attachment added: "tarball of source package to upload" https://bugs.launchpad.net/cloud-init/+bug/1834875/+attachment/5330895/+files/cloud-utils_0.31-7-gd99b2d76-source.tar.xz -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-a

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-25 Thread Ryan Harper
On Tue, Feb 25, 2020 at 2:35 PM Scott Moser wrote: > this seemed to "just work" for me. > http://paste.ubuntu.com/p/93dWDPZfZT/ Ah, I didn't check that there was an existing ubuntu/devel branch. Sorry. I've pushed a MR here: https://code.launchpad.net/~raharper/cloud-utils/+git/cloud-utils/

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
Sorry for missing the questions earlier. Azure has two "machine types": gen1, which boots a non-UEFI virtual hardware platform, and gen2, which is UEFI with newer virtual hardware; details here: https://azure.microsoft.com/en-us/updates/generation-2-virtual-machines-in-azure-public-preview/?cd

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
> 14:48:15> this slowness is happening with a particular instance type? I've not tested extensively across all types; but it's common for any of the "fast" types which have SSD backing. I've seen this DS1_v2, DS2_v2, DS3_v3, D4-v2, B2s, A2s, L4s > 14:49:38> the slowness is happening across mul

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
The primary concern is time before rootfs mounting and executing /sbin/init. For the spots after that, that falls into systemd territory. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-azure in Ubuntu. https://bugs.launchpad

[Kernel-packages] [Bug 1858495] Re: multiple long delays during kernel and userspace boot

2020-02-26 Thread Ryan Harper
here is some debug data I captured:
-rw-rw-r-- 1 ubuntu ubuntu 729 Dec 16 12:15 bug-bionic-baseline-after-templating-Standard-DS2-v2.csv
drwxrwxr-x 12 ubuntu ubuntu 22 Dec 16 12:15 bug-bionic-baseline-after-templating-Standard-DS2-v2.csv.debug/
-rw-rw-r-- 1 ubuntu ubuntu 721 Dec 16 13:07

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-02-27 Thread Ryan Harper
I can recreate the issue inside an LXC container on focal only (not bionic, disco, or eoan) and without any dpkg-divert of update-initramfs; as such I'm marking the curtin task invalid. ** Changed in: curtin Status: Incomplete => Invalid -- You received this bug notification because you are a me

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-02-27 Thread Ryan Harper
An easy recreate:
lxc launch ubuntu-daily:focal f1
lxc exec f1 bash
apt update && apt install linux-generic
This does not fail on eoan or bionic.

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-02-27 Thread Ryan Harper
Note, by fail, we mean depmod emits the error message mentioned in the bug title; there is nothing *functionally* wrong, just scary/noisy output which it did not previously produce. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.

[Kernel-packages] [Bug 1864992] Re: depmod: ERROR: ../libkmod/libkmod.c:515 lookup_builtin_file() could not open builtin file '/lib/modules/5.4.0-14-generic/modules.builtin.bin'

2020-03-02 Thread Ryan Harper
*** This bug is a duplicate of bug 1863261 *** https://bugs.launchpad.net/bugs/1863261 I do not believe this is a duplicate; it is more likely a *packaging* issue. The question remains for this bug: why does it only appear in the focal kernels but not eoan or older? And if someone could c

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-06 Thread Ryan Harper
A couple of comments on the suggested path: > Imho the sequency of commands should be: > * take flock on the device, to neutralise udev +1 on this approach. Do you know if the flock will block systemd's inotify write watch on the block device which triggers udevd? This is the typical race we se

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
> it will prevent udevd from running the rules against it. Thus effectively the event will be fired and done, but nothing actually executed for it. Interesting, I suspect this is the race we see: the events are emitted but no actions are taken (i.e. we didn't get our by-partuuid symlink created). > I someh

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
@ddstreet Yes, settle does not help. Re-triggering udevadm trigger --action=add /sys/class/block/sda would re-run all of them after the partition change has occurred, which is what I'm currently suggesting as a heavy-handed workaround. I would like to understand *why* the udevd/kernel pair exhi

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
On Thu, Nov 7, 2019 at 1:30 PM Dan Streetman wrote: > > Yes, settle does not help. > > Well, I didn't suggest just to settle ;-) > Sorry; long bug thread. > > I'm currently suggesting as a heavy-handed workaround. > > I don't really see why you think this is heavy-handed, but I must be > missi

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
On Thu, Nov 7, 2019 at 11:30 AM Dimitri John Ledkov wrote: > > So that means we have this sequence of events: > > a.) growpart change partition table > > b.) growpart call partx > > c.) udev created and events being processed > > That is not true. whilst sfdisk is deleting, creating, finishing

Re: [Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2019-11-07 Thread Ryan Harper
On Thu, Nov 7, 2019 at 2:41 PM Dan Streetman wrote: > > Issuing a second > > trigger will repeat this. > > IMO, that's a non-zero amount of time that slows the boot down, so I'd > like > > to avoid that. > > systemd-udev-trigger.serivce retriggers *everything* at boot (except in > an unprivileged

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-06 Thread Ryan Harper
Cloud-init service starts and will run growpart, etc.
Feb 06 00:37:26 ubuntu systemd[1]: Starting Initial cloud-init job (pre-networking)...
Feb 06 00:37:37 test-xrdpdnvfctsofyygmzan systemd[1]: Starting Initial cloud-init job (metadata service crawler)...
Something has modified sdb1 (growpart/s

[Kernel-packages] [Bug 1834875] Re: cloud-init growpart race with udev

2020-02-10 Thread Ryan Harper
Yes, this is my read on the issue as well. The trigger is related to the inotify watch that systemd-udevd puts on the disk. Something that might help, which we could try per xnox's comment, is the use of flock: if growpart were to flock /dev/sda (we need to sort out what flags are needed to prevent
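
A rough sketch of what that could look like (the lock mode, fd choice, and trailing settle are assumptions to be validated, not a tested fix):

# take the BSD lock on the whole disk so udevd defers processing events for it
(
  flock -x 9                 # exclusive lock on fd 9, which is open on /dev/sda
  growpart /dev/sda 1        # resize the partition while the lock is held
) 9</dev/sda
udevadm settle               # then let any queued events drain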

[Kernel-packages] [Bug 1830740] Re: [linux-azure] Delay during boot of some instance sizes

2019-10-29 Thread Ryan Harper
This looks hypervisor/kernel related. Some observations: the cloud-init.log in the collect-logs shows cloud-init running twice. The first time, the run time is as expected, approx 17s of cloud-init time; the second boot took much longer, but the bulk of the time is in udev 2019-06-20 18:09:18,951 - u

[Kernel-packages] [Bug 1858495] [NEW] multiple long delays during kernel and userspace boot

2020-01-06 Thread Ryan Harper
Public bug reported: Booting some Bionic instances in Azure (gen1 machines), I see some large delays during kernel/userspace boot; it would be good to understand what's going on. Additionally, the areas during boot that see delays are different for an image that's been created from a templat

[Kernel-packages] [Bug 1858615] Re: Fail to boot when NoCloud datasource is included

2020-01-08 Thread Ryan Harper
Thanks for filing a bug. I've added a dmidecode task to track the issue with the tool. It may also affect the kernel package, and possibly firmware (though that's not something that Ubuntu provides). Cloud-init and any other tool may invoke this package and it should not reboot the system; but t

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-03 Thread Ryan Harper
I've set up our integration test that runs the CDO-QA bcache/ceph setup. On the updated kernel I got through 10 loops of the deployment before it stacktraced: http://paste.ubuntu.com/p/zVrtvKBfCY/ [ 3939.846908] bcache: bch_cached_dev_attach() Caching vdd as bcache5 on set 275985b3-da58-41f8

[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-03 Thread Ryan Harper
Without the patch, I can reproduce the hang fairly frequently, in one or two loops, which fails in this way: [ 1069.711956] bcache: cancel_writeback_rate_update_dwork() give up waiting for dc->writeback_write_update to quit [ 1088.583986] INFO: task kworker/0:2:436 blocked for more than 120 secon

[Kernel-packages] [Bug 1838278] Re: zfs-initramfs wont mount rpool

2019-09-17 Thread Ryan Harper
Curtin hasn't ever run zfs export on the pools; so either something else did this previously, or it wasn't a requirement. I can see if adding a zfs export on the pool works around the issue. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to

[Kernel-packages] [Bug 1838278] Re: zfs-initramfs wont mount rpool

2019-09-17 Thread Ryan Harper
A quick hack shows that it works if we export after unmount. I'd like to understand if we need/should use import -f; however, curtin can now ensure it exports pools it has created at the end of install. ** Also affects: curtin (Ubuntu) Importance: Undecided Status: New ** Changed in: curtin
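
A sketch of the end-of-install ordering being described (mountpoint and pool name are examples):

umount -R /target      # unmount the installed system's filesystems
zpool export rpool     # export the pool so first boot can import it cleanly from the initramfs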

[Kernel-packages] [Bug 1760173] [NEW] zfs, zpool commands hangs for 10 seconds without a /dev/zfs

2018-03-30 Thread Ryan Harper
Public bug reported: 1. # lsb_release -rd Description:Ubuntu 16.04.4 LTS Release:16.04 2. # apt-cache policy zfsutils-linux zfsutils-linux: Installed: 0.6.5.6-0ubuntu19 Candidate: 0.6.5.6-0ubuntu19 Version table: *** 0.6.5.6-0ubuntu19 500 500 http://archive.ubuntu.com/u

Re: [Kernel-packages] [Bug 1729145] Re: /dev/bcache/by-uuid links not created after reboot

2018-04-03 Thread Ryan Harper
Is this also fixed in bionic yet? On Tue, Apr 3, 2018 at 9:10 AM, Launchpad Bug Tracker <1729...@bugs.launchpad.net> wrote: > This bug was fixed in the package linux - 4.13.0-38.43 > > --- > linux (4.13.0-38.43) artful; urgency=medium > > * linux: 4.13.0-38.43 -proposed tracker (LP:

Re: [Kernel-packages] [Bug 1760173] Re: zfs, zpool commands hangs for 10 seconds without a /dev/zfs

2018-04-10 Thread Ryan Harper
On Tue, Apr 10, 2018 at 2:44 AM, Colin Ian King <1760...@bugs.launchpad.net> wrote: > Would an immediate return with some error/warning message be more > appropriate that a 10 second delay? Yes. I would think that the amount of time to wait could be an option. I've read that in some scenarios user

[Kernel-packages] [Bug 1757565] Re: btrfs and tar sparse truncate archives

2018-04-10 Thread Ryan Harper
Verified artful-proposed. root@ubuntu:~# cat /etc/cloud/build.info build_name: server serial: 20180404 root@ubuntu:~# uname -a Linux ubuntu 4.13.0-38-generic #43-Ubuntu SMP Wed Mar 14 15:20:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux root@ubuntu:~# mount /dev/sda /mnt root@ubuntu:~# grep sda /pr

[Kernel-packages] [Bug 1739672] Re: Regression in getaddrinfo(): calls block for much longer on Bionic (compared to Xenial), please disable LLMNR

2018-04-10 Thread Ryan Harper
I'm seeing this on Artful as well, in Azure cloud. ** Also affects: glibc (Ubuntu Artful) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Artful) Importance: Undecided Status: New ** Also affects: systemd (Ubuntu Artful) Importance: Undecided Status

[Kernel-packages] [Bug 1739672] Re: Regression in getaddrinfo(): calls block for much longer on Bionic (compared to Xenial), please disable LLMNR

2018-04-10 Thread Ryan Harper
ubuntu@foufoune:~$ lsb_release -rd Description:Ubuntu 17.10 Release:17.10 ubuntu@foufoune:~$ apt-cache policy systemd systemd: Installed: 234-2ubuntu12.3 Candidate: 234-2ubuntu12.3 Version table: *** 234-2ubuntu12.3 500 500 http://azure.archive.ubuntu.com/ubuntu artful-up

[Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-15 Thread Ryan Harper
test kernel 4.15-rc1 up to abb62c46d4949d44979fa647740feff3f7538799 FAILED ** Attachment added: "kernel oops for 4.15-rc1 up to abb62c46d4949d44979fa647740feff3f7538799" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+attachment/5175947/+files/rc1-bisect2-oops.txt -- You rece

[Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-21 Thread Ryan Harper
FAILED: Kernel 4.15-rc1 up to f17b9e764dfcd838dab51572d620a371c05a8e60 Attached is the oops of the failure. ** Attachment added: "rc1-bisect3-ops.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+attachment/5178807/+files/rc1-bisect3-ops.txt -- You received this bug notific

[Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-27 Thread Ryan Harper
FAILED: Kernel 4.15-rc1 up to f48f66a962a54c3d26d688c3df5441c9d1ba8730 Attached is the oops of the failure. ** Attachment added: "rc1-bisect4-oops.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+attachment/5181417/+files/rc1-bisect4-oops.txt -- You received this bug notif

[Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-27 Thread Ryan Harper
PASS: Kernel 4.15-rc1 (bisect5) up to 6b457409169b7686d293b408da0b6446ccb57a76 I've 2600 seconds of uptime with over 800 loops in the test-case. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/b

[Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-28 Thread Ryan Harper
PASS: Kernel 4.15-rc1 (bisect6) up to 87eba0716011e528f7841026f2cc65683219d0ad -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1784665 Title: mkfs.ext4 over /dev/bcache0 hangs Status in

[Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-28 Thread Ryan Harper
FAILED: Kernel 4.15-rc1 up to bc631943faba6fc3f755748091ada31798fb7d50 Attached is the oops of the failure. ** Attachment added: "rc1-bisect7-oops.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+attachment/5181823/+files/rc1-bisect7-oops.txt -- You received this bug notif

Re: [Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-28 Thread Ryan Harper
It's always been a locking issue, so either misuse (or missing) locks in the bcache attach/detach path, or generic locking changes to block layer path I'd think. I'll see if I can find any hits on those oops tracebacks too On Tue, Aug 28, 2018 at 2:29 PM Ryan Harper w

Re: [Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-28 Thread Ryan Harper
https://patchwork.kernel.org/patch/10094201/
diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index d4f80786e7c2..3890468678ce 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dt

Re: [Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-28 Thread Ryan Harper
https://www.spinics.net/lists/linux-bcache/msg04869.html https://www.spinics.net/lists/linux-bcache/msg05774.html On Tue, Aug 28, 2018 at 2:30 PM Ryan Harper wrote: > > It's always been a locking issue, so either misuse (or missing) locks > in the bcache attach/detach path, or g

Re: [Kernel-packages] [Bug 1784665] Re: mkfs.ext4 over /dev/bcache0 hangs

2018-08-29 Thread Ryan Harper
On Wed, Aug 29, 2018 at 12:01 PM Joseph Salisbury wrote: > > Thanks for digging up this additional information. I'll investigate > further. While I do that, the v4.19-rc1 kernel is now out. It might be > worthwhile to give that one a go, and see if the fix was already > committed to mainline: >

[Kernel-packages] [Bug 1789758] [NEW] bluetooth headphones a2dp profile does not function after suspend

2018-08-29 Thread Ryan Harper
Public bug reported: 1) lsb_release -rd Description:Ubuntu 18.04.1 LTS Release:18.04 2) $ apt-cache policy bluez bluez: Installed: 5.48-0ubuntu3.1 Candidate: 5.48-0ubuntu3.1 Version table: *** 5.48-0ubuntu3.1 500 500 http://us.archive.ubuntu.com/ubuntu bionic-updates/ma
