from:"Stéphane Graber"

[Kernel-packages] [Bug 1824719] [NEW] [shiftfs] Allow stacking overlayfs on top

2019-04-14 Thread Stéphane Graber

Public bug reported:

Shiftfs currently prevents stacking overlayfs on top of it, which
unfortunately means that all Docker users, as well as some nested LXC
users who aren't using btrfs, are going to break when they get switched
over to shiftfs.

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824719

Title:
  [shiftfs] Allow stacking overlayfs on top

Status in linux package in Ubuntu:
  Incomplete

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824719/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

[Kernel-packages] [Bug 1824719] Re: shiftfs: Allow stacking overlayfs on top

2019-04-16 Thread Stéphane Graber

** Changed in: linux (Ubuntu)
   Status: Incomplete => Triaged

** Tags added: shiftfs


[Kernel-packages] [Bug 1824812] Re: apparmor does not start in Disco LXD containers

2019-04-16 Thread Stéphane Graber

** Tags added: shiftfs

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824812

Title:
  apparmor does not start in Disco LXD containers

Status in AppArmor:
  Triaged
Status in apparmor package in Ubuntu:
  In Progress
Status in libvirt package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  In Progress

Bug description:
  In LXD apparmor now skips starting.

  Steps to reproduce:
  1. start LXD container
    $ lxc launch ubuntu-daily:d d-testapparmor
    (disco to trigger the issue, cosmic as reference)
  2. check the default profiles loaded
    $ aa-status

  => In cosmic, and until recently in disco, this lists plenty of active
  profiles even in the default install.
  Cosmic:
    25 profiles are loaded.
    25 profiles are in enforce mode.
  Disco:
    15 profiles are loaded.
    15 profiles are in enforce mode.

  All those 15 remaining are from snaps.
  apparmor.service actually states that it refuses to start.

  $ systemctl status apparmor
  ...
  Apr 15 13:56:12 testkvm-disco-to apparmor.systemd[101]: Not starting AppArmor in container

  I can get those profiles (the default installed ones) loaded, for example:
    $ sudo apparmor_parser -r /etc/apparmor.d/sbin.dhclient
  makes it appear
    22 profiles are in enforce mode.
     /sbin/dhclient

  I was puzzled because in my case I found my guest with no (=0) profiles
  loaded, yet as shown above profiles seemed fine after "apparmor_parser -r"
  and after package installs. The puzzle was solved: on package install,
  packages call apparmor_parser via the dh_apparmor snippet, so their
  profiles get loaded.

  To fully disable all of them:
$ lxc stop 
$ lxc start 
$ lxc exec d-testapparmor aa-status
  apparmor module is loaded.
  0 profiles are loaded.
  0 profiles are in enforce mode.
  0 profiles are in complain mode.
  0 processes have profiles defined.
  0 processes are in enforce mode.
  0 processes are in complain mode.
  0 processes are unconfined but have a profile defined.

  That would match the service doing an early exit as shown in systemctl
  status output above. The package install or manual load works, but
  none are loaded by the service automatically e.g. on container
  restart.
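
  The comparison above can be automated. The following is a minimal sketch,
  assuming the aa-status summary lines keep the "<count> profiles are ..."
  shape quoted in this report; the function names are hypothetical helpers,
  not part of AppArmor.

```python
import re

# Matches summary lines such as "15 profiles are in enforce mode."
_COUNT_LINE = re.compile(r"^\s*(\d+)\s+(profiles|processes)\s+(.+?)\.\s*$")


def parse_aa_status(output: str) -> dict[str, int]:
    """Map each aa-status summary line to its leading count."""
    counts = {}
    for line in output.splitlines():
        match = _COUNT_LINE.match(line)
        if match:
            number, subject, rest = match.groups()
            counts[f"{subject} {rest}"] = int(number)
    return counts


def profiles_missing_after_restart(output: str) -> bool:
    """True when the module is loaded but no profiles are (hypothetical check)."""
    counts = parse_aa_status(output)
    module_loaded = "apparmor module is loaded." in output
    return module_loaded and counts.get("profiles are loaded") == 0


# Sample text copied from the aa-status output quoted in this report.
sample = """apparmor module is loaded.
0 profiles are loaded.
0 profiles are in enforce mode.
"""
print(profiles_missing_after_restart(sample))  # prints True
```

  Feeding it the output of "lxc exec d-testapparmor aa-status" after a
  container restart would flag the state described here.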

  --- --- ---

  This bug started as:
  Migrations to Disco trigger "Unable to find security driver for model apparmor"

  This is most likely related to my KVM-in-LXD setup, but it worked fine
  for years and I'd like to sort out what broke. I have already migrated to
  Disco's qemu 3.1, which makes me doubt that this is a generic issue in
  qemu 3.1.

  The virt tests that run cross release work fine starting from X/B/C, but all
  those chains now fail at migrating to Disco with:
    $ lxc exec testkvm-cosmic-from -- virsh migrate --unsafe --live kvmguest-bionic-normal qemu+ssh://10.21.151.207/system
    error: unsupported configuration: Unable to find security driver for model apparmor

  I need to analyze what changed

To manage notifications about this bug go to:
https://bugs.launchpad.net/apparmor/+bug/1824812/+subscriptions


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-05-05 Thread Stéphane Graber

@Khaled yes, it is and we have it now. What's still needed is for the
kernel to be signed so it can be used under secureboot.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged

Bug description:
  The `disk-kvm.img` images, which are to be preferred when running under
  virtualization, currently fail completely to boot under UEFI.

  A workaround was put in place so that LXD will instead pull generic-based
  images until this is resolved. This does, however, come with a much
  longer boot time (as the kernel panics, reboots and then boots) and
  reduced functionality from cloud-init, so we'd still like this fixed in
  the near future.

  To get things behaving, it looks like we need the following config
  options to be enabled in linux-kvm:

   - CONFIG_EFI_STUB
   - CONFIG_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS_COMMON
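
  One way to check a given kernel for these options is to parse its config
  file. This is only a sketch: it assumes the usual "CONFIG_X=y" and
  "# CONFIG_X is not set" line formats, and the helper names are
  hypothetical.

```python
import re

# The options listed in this report.
REQUIRED = (
    "CONFIG_EFI_STUB",
    "CONFIG_VSOCKETS",
    "CONFIG_VIRTIO_VSOCKETS",
    "CONFIG_VIRTIO_VSOCKETS_COMMON",
)

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text: str) -> dict[str, str]:
    """Return option -> value; 'n' stands for an explicit 'is not set'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if (m := _SET.match(line)):
            options[m.group(1)] = m.group(2)
        elif (m := _UNSET.match(line)):
            options[m.group(1)] = "n"
    return options


def missing_options(text: str, required=REQUIRED) -> list[str]:
    """List required options that are neither built in (y) nor modular (m)."""
    options = parse_kernel_config(text)
    return [name for name in required if options.get(name) not in ("y", "m")]


# Made-up config fragment, for illustration only.
sample = "CONFIG_EFI_STUB=y\nCONFIG_VSOCKETS=m\n# CONFIG_VIRTIO_VSOCKETS is not set\n"
print(missing_options(sample))
# prints ['CONFIG_VIRTIO_VSOCKETS', 'CONFIG_VIRTIO_VSOCKETS_COMMON']
```

  On an Ubuntu system the text would typically come from the config file
  shipped under /boot for the running kernel (an assumption about the
  image layout, not something this report states).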

  == Rationale ==
  We'd like to be able to use the linux-kvm based images for LXD. Those will
  boot directly without needing the panic+reboot behavior of generic images and
  will be much lighter in general.

  We also need the LXD agent to work, which requires functional virtio
  vsock.

  == Test case ==
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
   - lxc launch bug1873809 v1
   - lxc console v1
   - 
   - 
   - lxc exec v1 bash

  To validate a new kernel, you'll need to manually repack the .img file
  and install the new kernel in there.

  == Regression potential ==
  I don't know who else is using those kvm images right now, but those changes
  will cause a change to the kernel binary such that it contains the EFI stub
  bits + a signature. This could cause some (horribly broken) systems to no
  longer be able to boot that kernel, though considering that such a setup is
  common to our other kernels, this seems unlikely.

  Also, this will be introducing virtio vsock support which, again, could
  maybe confuse some horribly broken systems?

  In either case, the kernel conveniently is the only package which ships
  multiple versions concurrently, so rebooting on the previous kernel is always
  an option, mitigating some of the risks.

  
  -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  On the graphical console, you'll see EDK2 load (TianoCore) followed by basic
  boot messages and then a message from grub (error: can't find command
  `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However, it'll hang indefinitely and eat
  up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3FBEEA98 0047, LDTR - 
  IDTR - 3F2D8018 0FFF,   TR - 
  FXSAVE_STATE - 3FF1C290
   Find image based on I

[Kernel-packages] [Bug 1882955] [NEW] LXD 4.2 broken on linux-kvm due to missing VLAN filtering

2020-06-10 Thread Stéphane Graber

Public bug reported:

This is another case of linux-kvm having unexplained differences
compared to linux-generic in areas that aren't related to hardware
drivers (see other bug we filed for missing nft).

This time, CPC is reporting that LXD no longer works on linux-kvm as we
now set vlan filtering on our bridges to prevent containers from
escaping firewalling through custom vlan tags.

This relies on CONFIG_BRIDGE_VLAN_FILTERING which is a built-in on the
generic kernel but is apparently missing on linux-kvm (I don't have any
system running that kernel to confirm its config, but the behavior
certainly matches that).

We need this fixed in focal and groovy.
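
For anyone wanting to confirm this on a running system, here is a rough
sketch that inspects a bridge through sysfs. It assumes (this is my
understanding of the bridge driver, not something verified on linux-kvm)
that the `vlan_filtering` attribute under the bridge directory is only
exposed when CONFIG_BRIDGE_VLAN_FILTERING is built. The function name and
the `sysfs_root` parameter are hypothetical, the latter only there so the
check can be tried against a fake tree.

```python
import tempfile
from pathlib import Path


def bridge_vlan_filtering_state(bridge: str, sysfs_root: str = "/sys") -> str:
    """Report 'not-a-bridge', 'unsupported', 'disabled' or 'enabled'."""
    bridge_dir = Path(sysfs_root) / "class" / "net" / bridge / "bridge"
    if not bridge_dir.is_dir():
        return "not-a-bridge"
    attr = bridge_dir / "vlan_filtering"
    if not attr.exists():
        # Assumption: the attribute is absent when the kernel lacks
        # CONFIG_BRIDGE_VLAN_FILTERING.
        return "unsupported"
    return "enabled" if attr.read_text().strip() == "1" else "disabled"


# Demo against a throwaway directory tree instead of the real /sys.
with tempfile.TemporaryDirectory() as root:
    fake = Path(root) / "class" / "net" / "br0" / "bridge"
    fake.mkdir(parents=True)
    without_attr = bridge_vlan_filtering_state("br0", root)
    (fake / "vlan_filtering").write_text("1\n")
    with_attr = bridge_vlan_filtering_state("br0", root)
print(without_attr, with_attr)  # prints: unsupported enabled
```

On a real host the call would be made with the actual bridge name (for
example the LXD-managed bridge) and the default `sysfs_root`.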

** Affects: linux-kvm (Ubuntu)
 Importance: Undecided
 Status: Triaged

** Changed in: linux-kvm (Ubuntu)
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1882955

Title:
  LXD 4.2 broken on linux-kvm due to missing VLAN filtering

Status in linux-kvm package in Ubuntu:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-kvm/+bug/1882955/+subscriptions


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-18 Thread Stéphane Graber

Trying to boot the proposed kernel in LXD:

"""
BdsDxe: loading Boot0007 "ubuntu" from HD(1,GPT,25633192-5DBD-412A-8A50-E29B79F72A50,0x800,0x32000)/\EFI\ubuntu\shimx64.efi
BdsDxe: starting Boot0007 "ubuntu" from HD(1,GPT,25633192-5DBD-412A-8A50-E29B79F72A50,0x800,0x32000)/\EFI\ubuntu\shimx64.efi
RAMDISK: incomplete write (4194304 != 8388608)
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-1017-kvm #17-Ubuntu
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS 0.0.0 02/06/2015
Call Trace:
 0x9392230d
 0x932a8a21
 0x9412e5c1
 0x9412e80f
 0x9412e976
 0x9412e274
 ? 0x93938a70
 0x93938a79
 0x93a00215
Kernel Offset: 0x1220 from 0x8100 (relocation range: 0x8000-0xbfff)
---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---
"""


This appears to be lz4 related. Changing the initramfs to gzip makes the VM
boot just fine.
It's worth noting that when booting the generic kernel, we get the unpack error
shown in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1835660 but
things still boot fine.

Marking verification as failed based on this. We need images to work
properly with a standard Ubuntu config, so we need lz4 fixed.
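
To tell which compression an initramfs payload actually uses, its leading
magic bytes can be inspected. The sketch below is illustrative only: the
function name is hypothetical, and on some images the file may begin with
an uncompressed cpio segment, in which case the compressed payload starts
further in and the first bytes will report cpio.

```python
# Magic numbers found at the start of common initramfs payload formats.
_MAGICS = (
    (b"\x1f\x8b", "gzip"),
    (b"\x02\x21\x4c\x18", "lz4 (legacy frame)"),
    (b"\x04\x22\x4d\x18", "lz4 (modern frame)"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"BZh", "bzip2"),
    (b"070701", "cpio (uncompressed newc)"),
)


def detect_compression(head: bytes) -> str:
    """Name the format whose magic number prefixes `head`, else 'unknown'."""
    for magic, name in _MAGICS:
        if head.startswith(magic):
            return name
    return "unknown"


# Hand-written byte prefixes, not data taken from a real image.
print(detect_compression(b"\x02\x21\x4c\x18" + b"\x00" * 8))  # prints lz4 (legacy frame)
print(detect_compression(b"\x1f\x8b\x08\x00"))                # prints gzip
```

Reading the first 16 bytes or so of the initrd inside the repacked image
and passing them to `detect_compression` is enough to see whether a
rebuilt initramfs really switched from lz4 to gzip.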

** Tags removed: verification-needed-focal
** Tags added: verification-failed-focal

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged
Status in linux-kvm source package in Focal:
  Fix Committed


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-18 Thread Stéphane Graber

"""
Jun 18 13:56:15 f1 kernel: [0.383207] Trying to unpack rootfs image as initramfs...
Jun 18 13:56:15 f1 kernel: [0.463102] Initramfs unpacking failed: Decoding failed
"""

This is what we're getting on the current generic kernel, though boot
continues after that.
I don't know whether, when that happens, we're actually skipping the initrd
entirely and just get lucky that the generic kernel has everything we need
built in so it boots, or whether the error in that case is just wrong and the
initrd is still properly unpacked and run.

Either way, this needs sorting. Looking at the other bug report, there's
been something wrong with our kernel and lz4 initrd for a long time, and
it's apparently biting us a lot more now.


[Kernel-packages] [Bug 1835660] Re: initramfs unpacking failed

2020-06-18 Thread Stéphane Graber

All LXD virtual machines are hitting this too.

Run:
 - lxc launch images:ubuntu/focal/cloud f1 && lxc console f1

And you'll see it show that message. As mentioned above, boot then still
goes ahead and you get a login prompt, but that may not always be the
case.

For example in linux-kvm, that fallback mechanism doesn't appear to work and we
instead get a kernel panic unless we've manually modified the initrd to be gzip:
https://bugs.launchpad.net/cloud-images/+bug/1873809

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1835660

Title:
  initramfs unpacking failed

Status in initramfs-tools package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  "initramfs unpacking failed: Decoding failed",  message appears on
  boot up.

  If I "update-initramfs" using gzip instead of lz4, then boot up passes
  without the decoding failed message.

  ---

  However, we currently believe that the decoding error reported in
  dmesg is actually harmless and has no impact on usability on the
  system.

  Switching from lz4 to gzip compression, simply papers over the
  warning, without any benefits, and slows down boot.

  Kernel should be fixed to correctly parse lz4 compressed initrds, or
  at least lower the warning, to not be user visible as an error.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1835660/+subscriptions


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-18 Thread Stéphane Graber

@Stefan, so this is an actual regression.

1015 will boot just fine in LXD with secureboot disabled.
1017 will not boot at all in LXD, whether or not secureboot is disabled.

I don't know if it's switching to a signed kernel which causes the lz4
issue but the result is a clear regression so I would not consider this
kernel suitable for release to anyone.


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

Yeah, I think you're right. I also had the exact same panic happen now
on 1015, so it's likely some grub weirdness rather than a kernel
regression.

It just so happened that in my last test I managed to get a working grub
config after moving to 1015 and not with 1017. Looks like we'll need to
poke at grub...

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged
Status in linux-kvm source package in Focal:
  Fix Committed

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, currently completely fail to boot under UEFI.

  A workaround was put in place such that LXD instead will pull generic-
  based images until this is resolved, this however does come with a
  much longer boot time (as the kernel panics, reboots and then boots)
  and also reduced functionality from cloud-init, so we'd still like
  this fixed in the near future.

  To get things behaving, it looks like we need the following config
  options to be enable in linux-kvm:

   - CONFIG_EFI_STUB
   - CONFIG_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS_COMMON

  == Rationale ==
  We'd like to be able to use the linux-kvm based images for LXD, those will 
directly boot without needing the panic+reboot behavior of generic images and 
will be much lighter in general.

  We also need the LXD agent to work, which requires functional virtio
  vsock.

  == Test case ==
   - wget 
http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
   - wget 
http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz 
focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
   - lxc launch bug1873809 v1
   - lxc console v1
   - 
   - 
   - lxc exec v1 bash

  To validate a new kernel, you'll need to manually repack the .img file
  and install the new kernel in there.

  == Regression potential ==
  I don't know who else is using those kvm images right now, but those
  changes will cause a change to the kernel binary such that it contains
  the EFI stub bits + a signature. This could cause some (horribly
  broken) systems to no longer be able to boot that kernel. Though
  considering that such a setup is common to our other kernels, this
  seems unlikely.

  Also, this will be introducing virtio vsock support which, again,
  could maybe confuse some horribly broken systems?

  In either case, the kernel conveniently is the only package which
  ships multiple versions concurrently, so rebooting on the previous
  kernel is always an option, mitigating some of the risks.

  
  -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  On the graphical console, you'll see EDK2 load (TianoCore) followed by
  basic boot messages and then a message from grub (error: can't find
  command `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However it'll hang indefinitely and
  eat up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

"""
Loading Linux 5.4.0-1015-kvm ...
Loading initial ramdisk ...
Linux version 5.4.0-1015-kvm (buildd@lcy01-amd64-027) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #15-Ubuntu SMP Fri Jun 5 00:55:20 UTC 2020 (Ubuntu 5.4.0-1015.15-kvm 5.4.41)
Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1015-kvm root=UUID=03167f19-fb7f-4ba9-b4da-5e4acc0d97e3 ro single nomodeset
x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
x86/fpu: xstate_offset[3]:  832, xstate_sizes[3]:   64
x86/fpu: xstate_offset[4]:  896, xstate_sizes[4]:   64
x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x-0x0009] usable
BIOS-e820: [mem 0x0010-0x3ee6afff] usable
BIOS-e820: [mem 0x3ee6b000-0x3ef2bfff] reserved
BIOS-e820: [mem 0x3ef2c000-0x3f8eefff] usable
BIOS-e820: [mem 0x3f8ef000-0x3faeefff] reserved
BIOS-e820: [mem 0x3faef000-0x3fb75fff] usable
BIOS-e820: [mem 0x3fb76000-0x3fb7efff] ACPI data
BIOS-e820: [mem 0x3fb7f000-0x3fbfefff] ACPI NVS
BIOS-e820: [mem 0x3fbff000-0x3ffd] usable
BIOS-e820: [mem 0x3ffe-0x3fff] reserved
BIOS-e820: [mem 0xb000-0xbfff] reserved
BIOS-e820: [mem 0xffe0-0x] reserved
NX (Execute Disable) protection: active
extended physical RAM map:
reserve setup_data: [mem 0x-0x0009] usable
reserve setup_data: [mem 0x0010-0x3df4b017] usable
reserve setup_data: [mem 0x3df4b018-0x3df86457] usable
reserve setup_data: [mem 0x3df86458-0x3df87017] usable
reserve setup_data: [mem 0x3df87018-0x3df90a57] usable
reserve setup_data: [mem 0x3df90a58-0x3ee6afff] usable
reserve setup_data: [mem 0x3ee6b000-0x3ef2bfff] reserved
reserve setup_data: [mem 0x3ef2c000-0x3f8eefff] usable
reserve setup_data: [mem 0x3f8ef000-0x3faeefff] reserved
reserve setup_data: [mem 0x3faef000-0x3fb75fff] usable
reserve setup_data: [mem 0x3fb76000-0x3fb7efff] ACPI data
reserve setup_data: [mem 0x3fb7f000-0x3fbfefff] ACPI NVS
reserve setup_data: [mem 0x3fbff000-0x3ffd] usable
reserve setup_data: [mem 0x3ffe-0x3fff] reserved
reserve setup_data: [mem 0xb000-0xbfff] reserved
reserve setup_data: [mem 0xffe0-0x] reserved
efi: EFI v2.70 by EDK II
efi:  SMBIOS=0x3f915000  ACPI=0x3fb7e000  ACPI 2.0=0x3fb7e014  MEMATTR=0x3e115118
secureboot: Secure boot disabled
SMBIOS 2.8 present.
DMI: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS 0.0.0 02/06/2015
Hypervisor detected: KVM
kvm-clock: Using msrs 4b564d01 and 4b564d00
kvm-clock: cpu 0, msr 14630001, primary cpu clock
kvm-clock: using sched offset of 4626558194 cycles
clocksource: kvm-clock: mask: 0x max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
tsc: Detected 2712.000 MHz processor
last_pfn = 0x3ffe0 max_arch_pfn = 0x4
x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC  
Using GB pages for direct mapping
secureboot: Secure boot disabled
RAMDISK: [mem 0x2c111000-0x2cbadfff]
ACPI: Early table checksum verification disabled
ACPI: RSDP 0x3FB7E014 24 (v02 BOCHS )
ACPI: XSDT 0x3FB7D0E8 4C (v01 BOCHS  BXPCFACP 0001  0113)
ACPI: FACP 0x3FB7A000 F4 (v03 BOCHS  BXPCFACP 0001 BXPC 0001)
ACPI: DSDT 0x3FB7B000 001E86 (v01 BOCHS  BXPCDSDT 0001 BXPC 0001)
ACPI: FACS 0x3FBDD000 40
ACPI: APIC 0x3FB79000 78 (v01 BOCHS  BXPCAPIC 0001 BXPC 0001)
ACPI: HPET 0x3FB78000 38 (v01 BOCHS  BXPCHPET 0001 BXPC 0001)
ACPI: MCFG 0x3FB77000 3C (v01 BOCHS  BXPCMCFG 0001 BXPC 0001)
ACPI: BGRT 0x3FB76000 38 (v01 INTEL  EDK2 0002  0113)
No NUMA configuration found
Faking a node at [mem 0x-0x3ffd]
NODE_DATA(0) allocated [mem 0x3ff8-0x3ff82fff]
Zone ranges:
  DMA32[mem 0x1000-0x3ffd]
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x1000-0x0009]
  node   0: [mem 0x0010-0x3ee6afff]
  node   0: [mem 0x3ef2c000-0x3f8eefff]
  node   0: [mem 0x3faef000-0x3fb75fff]
  node   0: [mem 0x3fbff000-0x3ffd]
Zeroed struct page in unavaila

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

Hmm, actually no luck booting either 1015 or 1017 with
security.secureboot=false here. I poked at grub and it does load both
the kernel and the initrd...

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged
Status in linux-kvm source package in Focal:
  Fix Committed

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

@smb Can you confirm that your system indeed goes through the initrd and
isn't just silently falling back to directly mounting and booting /?

Booting with break=mount would likely be a valid way to test this
(should drop you in a shell).
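
The check suggested above amounts to adding `break=mount` to the kernel
command line (for example by editing the boot entry in grub) and seeing
whether the boot stops in an initramfs shell. A small sketch of building
such a command line follows; `with_break` is a hypothetical helper name,
and the sample command line in the usage note is made up.

```shell
#!/bin/sh
# Hypothetical helper: append an initramfs-tools "break=" stage to a kernel
# command line, unless the command line already carries a break= parameter.
with_break() {
    # $1: existing kernel command line, $2: break stage (defaults to mount)
    cmdline="$1"
    stage="${2:-mount}"
    case " $cmdline " in
        *" break="*) echo "$cmdline" ;;
        *) echo "$cmdline break=$stage" ;;
    esac
}
```

For instance, `with_break "root=/dev/vda1 ro"` prints the same line with
`break=mount` appended. If the initrd really is executed, booting with
that command line should stop in a shell before the root filesystem is
mounted.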

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

https://paste.ubuntu.com/p/7yHDCFt75m/ for additional proof that the
initrd is never executed (break=top would immediately drop to a shell).

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

"""
stgraber@castiana:~$ lxc launch images:ubuntu/focal f1 --vm
Creating f1
Starting f1
stgraber@castiana:~$ lxc exec f1 bash
root@f1:~# echo "deb http://archive.ubuntu.com/ubuntu focal-proposed main restricted universe multiverse" >> /etc/apt/sources.list
root@f1:~# apt-get update
Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [107 kB]
Get:3 http://security.ubuntu.com/ubuntu focal-security InRelease [107 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal-proposed InRelease [265 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [201 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-updates/main Translation-en [80.2 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [11.1 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-updates/restricted Translation-en [3036 B]
Get:9 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [114 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages [82.4 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal-proposed/main Translation-en [35.0 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal-proposed/restricted amd64 Packages [7132 B]
Get:13 http://archive.ubuntu.com/ubuntu focal-proposed/restricted Translation-en [2144 B]
Get:14 http://archive.ubuntu.com/ubuntu focal-proposed/universe amd64 Packages [35.8 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-proposed/universe Translation-en [24.5 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-proposed/multiverse Translation-en [3404 B]
Fetched 1079 kB in 1s (794 kB/s)   
Reading package lists... Done
root@f1:~# apt-get install linux-kvm
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following additional packages will be installed:
  linux-headers-5.4.0-1017-kvm linux-headers-kvm linux-image-5.4.0-1017-kvm linux-image-kvm linux-kvm-headers-5.4.0-1017 linux-modules-5.4.0-1017-kvm
Suggested packages:
  fdutils linux-kvm-doc-5.4.0 | linux-kvm-source-5.4.0 linux-kvm-tools
The following NEW packages will be installed:
  linux-headers-5.4.0-1017-kvm linux-headers-kvm linux-image-5.4.0-1017-kvm linux-image-kvm linux-kvm linux-kvm-headers-5.4.0-1017
  linux-modules-5.4.0-1017-kvm
0 upgraded, 7 newly installed, 0 to remove and 18 not upgraded.
Need to get 28.4 MB of archives.
After this operation, 126 MB of additional disk space will be used.
Do you want to continue? [Y/n] 
Get:1 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-kvm-headers-5.4.0-1017 all 5.4.0-1017.17 [11.3 MB]
Get:2 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-headers-5.4.0-1017-kvm amd64 5.4.0-1017.17 [1254 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-headers-kvm amd64 5.4.0.1017.16 [4376 B]
Get:4 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-modules-5.4.0-1017-kvm amd64 5.4.0-1017.17 [10.6 MB]
Get:5 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-image-5.4.0-1017-kvm amd64 5.4.0-1017.17 [5158 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-image-kvm amd64 5.4.0.1017.16 [ B]
Get:7 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 linux-kvm amd64 5.4.0.1017.16 [4416 B]
Fetched 28.4 MB in 2s (14.2 MB/s)
Selecting previously unselected package linux-kvm-headers-5.4.0-1017.
(Reading database ... 46372 files and directories currently installed.)
Preparing to unpack .../0-linux-kvm-headers-5.4.0-1017_5.4.0-1017.17_all.deb ...
Unpacking linux-kvm-headers-5.4.0-1017 (5.4.0-1017.17) ...
Selecting previously unselected package linux-headers-5.4.0-1017-kvm.
Preparing to unpack .../1-linux-headers-5.4.0-1017-kvm_5.4.0-1017.17_amd64.deb ...
Unpacking linux-headers-5.4.0-1017-kvm (5.4.0-1017.17) ...
Selecting previously unselected package linux-headers-kvm.
Preparing to unpack .../2-linux-headers-kvm_5.4.0.1017.16_amd64.deb ...
Unpacking linux-headers-kvm (5.4.0.1017.16) ...
Selecting previously unselected package linux-modules-5.4.0-1017-kvm.
Preparing to unpack .../3-linux-modules-5.4.0-1017-kvm_5.4.0-1017.17_amd64.deb ...
Unpacking linux-modules-5.4.0-1017-kvm (5.4.0-1017.17) ...
Selecting previously unselected package linux-image-5.4.0-1017-kvm.
Preparing to unpack .../4-linux-image-5.4.0-1017-kvm_5.4.0-1017.17_amd64.deb ...
Unpacking linux-image-5.4.0-1017-kvm (5.4.0-1017.17) ...
Selecting previously unselected package linux-image-kvm.
Preparing to unpack .../5-linux-image-kvm_5.4.0.1017.16_amd64.deb ...
Unpacking linux-image-kvm (5.4.0.1017.16) ...
Selecting previously unselected package linux-kvm.
Preparing to unpack .../6-linux-kvm_5.4.0.1017.16_amd64.deb ...
Unpacking linux-kvm (5.4.0.1017.16) ...
Setting up linux-kvm-headers-5.4.0-1017 (5.4.0-1017.17) ...
Setting up linux-modules-5.4.0-1017-kvm (5.4.0-1017.17) ...
Setting up linux-headers-5.4.0-1017-kvm (5.4.0-1017

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-23 Thread Stéphane Graber

It's not; the log above clearly shows the kernel loading an initrd.

[Kernel-packages] [Bug 1874519] Re: ZFS installation on Raspberry Pi is problematic

2020-06-23 Thread Stéphane Graber

Good to hear. I just ran into this today when working on an LXD
appliance based on Ubuntu Core.
btrfs isn't exactly great as an alternative and the 8GB Pi is definitely
ZFS-capable, so this would be great to have :)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1874519

Title:
  ZFS installation on Raspberry Pi is problematic

Status in zfs-linux package in Ubuntu:
  Triaged

Bug description:
  Version: Ubuntu Server 20.04 - preinstalled 64-bit image for Raspberry
  Pi.

  ZFS on the Pi under 20.04 is currently a bit problematic. Upon issuing
  the command 'zpool status', I'm helpfully directed to install
  zfsutils-linux. When I do this, it complains that it cannot find the
  ZFS module, then errors out. Worse than that, the zfsutils-linux
  package does not depend on the zfs-dkms package, so it doesn't attempt
  to build the ZFS kernel modules automatically.

  The workaround is to install zfs-dkms, which builds the required
  kernel modules. (Once this has been done, the usual errors when
  installing the zfsutils-linux package, caused by there being no ZFS
  pools on the system, can be worked around by creating a zpool, then
  rerunning 'sudo apt install zfsutils-linux', as with previous versions
  of Ubuntu and Debian).
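
  The order of the workaround above can be sketched as a small shell
  function. This is an illustration only: `zfs_workaround_plan` is a
  made-up name, using `modinfo zfs` to detect an existing module is an
  assumption rather than something the report states, and the function
  merely prints the apt commands instead of running them. It does not
  cover the zpool creation step mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the install steps from the workaround,
# skipping the DKMS build when a zfs module is already available.
zfs_workaround_plan() {
    # $1: command used to probe for the module (assumed default: modinfo zfs)
    probe="${1:-modinfo zfs}"
    if ! $probe >/dev/null 2>&1; then
        # No module found: build it via DKMS first.
        echo "sudo apt install zfs-dkms"
    fi
    echo "sudo apt install zfsutils-linux"
}
```

  Run without arguments it probes the running system; passing `true`
  or `false` as the probe command is a simple way to see both branches.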

  I have not tested on other hardware platforms - this problem may also
  exist on other platforms where the user has not selected to install to
  ZFS.

  I have selected 'zfsutils' as the affected package, which is not the
  name of an actual current package, since launchpad won't let me submit
  the bug without selecting a package, however it's not clear to me that
  the problem is caused by that package.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1874519/+subscriptions

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-27 Thread Stéphane Graber

@smb what's the state of groovy, did you push the config update there
too?

For the cloud images, we'll want to switch over to those using linux-kvm
in groovy first, then focal, so just want to make sure we'll get a
working kernel on there too!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged
Status in linux-kvm source package in Focal:
  Fix Committed

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, currently completely fail to boot under UEFI.

  A workaround was put in place such that LXD will instead pull
  generic-based images until this is resolved. This however comes with
  a much longer boot time (as the kernel panics, reboots and then boots)
  and reduced functionality from cloud-init, so we'd still like this
  fixed in the near future.

  To get things behaving, it looks like we need the following config
  options to be enabled in linux-kvm:

   - CONFIG_EFI_STUB
   - CONFIG_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS_COMMON

  == Rationale ==
  We'd like to be able to use the linux-kvm based images for LXD. Those
  will boot directly without needing the panic+reboot behavior of the
  generic images and will be much lighter in general.

  We also need the LXD agent to work, which requires functional virtio
  vsock.

  == Test case ==
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
   - lxc launch bug1873809 v1
   - lxc console v1
   - 
   - 
   - lxc exec v1 bash

  To validate a new kernel, you'll need to manually repack the .img file
  and install the new kernel in there.

  == Regression potential ==
  I don't know who else is using those kvm images right now, but those
  changes will cause a change to the kernel binary such that it contains
  the EFI stub bits + a signature. This could cause some (horribly
  broken) systems to no longer be able to boot that kernel. Though
  considering that such a setup is common to our other kernels, this
  seems unlikely.

  Also, this will be introducing virtio vsock support which, again,
  could maybe confuse some horribly broken systems?

  
  In either case, the kernel conveniently is the only package which
  ships multiple versions concurrently, so rebooting on the previous
  kernel is always an option, mitigating some of the risks.

  
  -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  On the graphical console, you'll see EDK2 load (TianoCore) followed by
  basic boot messages and then a message from grub (error: can't find
  command `hwmatch`). Those also appear on successful boots of other
  images, so I don't think there's anything concerning there. However,
  it'll hang indefinitely and eat up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-27 Thread Stéphane Graber

Confirmed, 1018 boots fine here under Secure Boot, all good!

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged
Status in linux-kvm source package in Focal:
  Fix Committed

[Kernel-packages] [Bug 1858389] Re: lxd won't restart a container

2020-03-21 Thread Stéphane Graber

Moved the bug over to the kernel.

Those log messages are caused by reference issues in a network namespace
preventing it from being flushed, in turn preventing the LXC monitor
from exiting, holding everything up.

** Package changed: lxd (Ubuntu) => linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1858389

Title:
  lxd won't restart a container

Status in linux package in Ubuntu:
  New

Bug description:
  During the upgrade of a container from xenial to bionic, the container
  stopped working after the reboot.

  In a manual attempt to start the container:
  # lxc start juju-3fae13-25-lxd-6 
  error: Monitor is hung

  There are no failures reported in lxd/lxc related log files

  The dmesg does give some possibly related lines:
  [24085329.749762] unregister_netdevice: waiting for eth2 to become free. Usage count = 1
  [24085339.874547] unregister_netdevice: waiting for eth2 to become free. Usage count = 1
  [24085349.931433] unregister_netdevice: waiting for eth2 to become free. Usage count = 1
  [24085360.136221] unregister_netdevice: waiting for eth2 to become free. Usage count = 1
  [24085370.329020] unregister_netdevice: waiting for eth2 to become free. Usage count = 1
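
When there are many such messages, it can help to summarise which devices are stuck and how often the kernel complained. The following is a rough sketch under assumptions: `stuck_netdevs` is a hypothetical helper name, and it expects each kernel message on a single line (as `dmesg` prints it, not wrapped as in a mail archive).

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print one line per
# device that unregister_netdevice is waiting on, followed by the number of
# matching messages.
stuck_netdevs() {
    # Keep only the device name from the matching messages...
    sed -n 's/.*unregister_netdevice: waiting for \([^ ]*\) to become free.*/\1/p' |
        # ...then count occurrences per device and print "device count".
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Typical use would be `dmesg | stuck_netdevs`. It only summarises the log; it does not free the stuck reference.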

  
  I could find that eth2 got renamed quite a few times:
  dmesg | grep eth2 | grep renamed
  [   10.974225] i40e :02:00.2 eno3: renamed from eth2
  [   38.168206] eth2: renamed from vethX6IXIX
  [   39.528396] eth1: renamed from veth2L8Q43
  [   39.544217] eth2: renamed from vethCL4RR8
  [   42.132600] eth0: renamed from veth27ILVJ
  [   42.184425] eth2: renamed from vethGV30Y9
  [   43.332523] eth2: renamed from veth38IOJO
  [   44.553249] eth2: renamed from vethYWPS85
  [   47.696816] eth2: renamed from vethRS5NIA
  [12244954.658741] eth0: renamed from veth23WYKR
  [12244954.712483] eth2: renamed from vethXKHJAY
  [21391530.547187] eth2: renamed from vethON24JW
  [21392047.344985] eth2: renamed from vethBHEXYH
  [21852338.859877] eth2: renamed from veth44669K

  
  Kernel running:
  uname -a
  Linux openstack-9 4.4.0-143-generic #169-Ubuntu SMP Thu Feb 7 07:56:38 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  lxd running:
  dpkg -l | grep lxd
  ii  lxd         2.0.11-0ubuntu1~16.04.4  amd64  Container hypervisor based on LXC - daemon
  ii  lxd-client  2.0.11-0ubuntu1~16.04.4  amd64  Container hypervisor based on LXC - client

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1858389/+subscriptions

[Kernel-packages] [Bug 1530617] Re: FUSE in wily image with upstart installed causes chaos

2020-03-25 Thread Stéphane Graber

** Changed in: lxc (Ubuntu)
   Status: Confirmed => Invalid

** Changed in: upstart (Ubuntu)
   Status: New => Won't Fix

** Changed in: linux (Ubuntu)
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1530617

Title:
  FUSE in wily image with upstart installed causes chaos

Status in linux package in Ubuntu:
  Invalid
Status in lxc package in Ubuntu:
  Invalid
Status in upstart package in Ubuntu:
  Won't Fix

Bug description:
  Host:
  DISTRIB_ID=Ubuntu
  DISTRIB_RELEASE=15.10
  DISTRIB_CODENAME=wily
  DISTRIB_DESCRIPTION="Ubuntu 15.10"

  lxc version: 1.1.4-0ubuntu1

  In a LXC container running Ubuntu 15.10, install upstart-sysv to
  replace systemd. Using FUSE then causes almost all processes in the
  container to be killed.

  The following steps reproduce the problem using sshfs:

  # create a wily container and attach to it
  sudo lxc-create -t download -n wily -- -d ubuntu -r wily -a amd64
  sudo lxc-start -n wily
  sudo lxc-attach -n wily

  # inside the container, install upstart-sysv and reboot
  apt-get update && apt-get -y install upstart-sysv
  reboot

  # on the host, reattach to the container
  sudo lxc-attach -n wily

  # back in the container, install ssh and sshfs
  apt-get -y install openssh-server sshfs

  # create an ssh key pair in /root/.ssh
  ssh-keygen

  # set up passwordless ssh
  mkdir ~ubuntu/.ssh
  cat /root/.ssh/id_rsa.pub >> ~ubuntu/.ssh/authorized_keys
  eval $(ssh-agent)
  ssh-add /root/.ssh/id_rsa

  # take a note of the running processes and their PIDs
  ps axjf

  # run sshfs
  mkdir /fuse
  sshfs ubuntu@localhost:/ /fuse

  # we are kicked out of the container
  # run ps again in the container
  sudo lxc-attach -n wily -- ps axjf

  # a whole bunch of processes are now gone. the getty processes now
  have new PIDs, indicating they have been restarted.

  
  Other debugging performed:
  - On a 14.10 host with lxc version 1.1.0~alpha2-0ubuntu3.3, the problem
    does not occur. FUSE works fine.
  - On the same 14.10 host with lxc upgraded to
    1.1.5-0ubuntu3~ubuntu14.04.1, the problem occurs.
  - On a 15.10 host, when running a wily container without upstart, the
    problem does not occur.
  - On a 15.10 host, when running a trusty container, the problem does
    not occur.
  - The problem can't be reproduced outside a container (15.10 host,
    install upstart-sysv, then use FUSE)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1530617/+subscriptions

[Kernel-packages] [Bug 1527374] Re: CVE-2015-8709

2020-03-25 Thread Stéphane Graber

** No longer affects: lxc (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-goldfish in Ubuntu.
https://bugs.launchpad.net/bugs/1527374

Title:
  CVE-2015-8709

Status in linux package in Ubuntu:
  Fix Released
Status in linux-armadaxp package in Ubuntu:
  Confirmed
Status in linux-flo package in Ubuntu:
  Confirmed
Status in linux-goldfish package in Ubuntu:
  Confirmed
Status in linux-lts-quantal package in Ubuntu:
  Won't Fix
Status in linux-lts-raring package in Ubuntu:
  Won't Fix
Status in linux-lts-saucy package in Ubuntu:
  Won't Fix
Status in linux-lts-utopic package in Ubuntu:
  Fix Released
Status in linux-lts-vivid package in Ubuntu:
  Fix Released
Status in linux-lts-wily package in Ubuntu:
  Fix Released
Status in linux-lts-xenial package in Ubuntu:
  New
Status in linux-mako package in Ubuntu:
  Confirmed
Status in linux-manta package in Ubuntu:
  Confirmed
Status in linux-raspi2 package in Ubuntu:
  Fix Released
Status in linux-snapdragon package in Ubuntu:
  New
Status in linux-ti-omap4 package in Ubuntu:
  Confirmed
Status in linux source package in Precise:
  Invalid
Status in linux-lts-trusty source package in Precise:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux source package in Vivid:
  Fix Released
Status in linux source package in Wily:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  ** DISPUTED ** kernel/ptrace.c in the Linux kernel through 4.4.1
  mishandles uid and gid mappings, which allows local users to gain
  privileges by establishing a user namespace, waiting for a root
  process to enter that namespace with an unsafe uid or gid, and then
  using the ptrace system call.  NOTE: the vendor states "there is no
  kernel bug here."

  Break-Fix: - local-2015-8709

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1527374/+subscriptions

[Kernel-packages] [Bug 1684481] Re: KVM guest execution start apparmor blocks on /dev/ptmx now (regression?)

2020-03-25 Thread Stéphane Graber

** Changed in: lxc (Ubuntu)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1684481

Title:
  KVM guest execution start apparmor blocks on /dev/ptmx now
  (regression?)

Status in apparmor package in Ubuntu:
  Won't Fix
Status in linux package in Ubuntu:
  Invalid
Status in lxc package in Ubuntu:
  Fix Released
Status in lxd package in Ubuntu:
  Invalid

Bug description:
  Setup:
  - Xenial host
  - lxd guests with Trusty, Xenial, ...
  - add a LXD profile to allow kvm [3] (inspired by stgraber)
  - spawn KVM guests in the LXD guests using the different distro release
    versions
  - guests are based on the uvtool default template which has a serial
    console [4]

  Issue:
  - guest starting with serial device gets blocked by apparmor and killed
    on creation
  - This affects at least ppc64el and x86 (s390x has no serial concept
    that would match)
  - This appeared in our usual checks on -proposed releases so maybe we
    can/should stop something?
    Last good was "Apr 5, 2017 10:40:50 AM", first bad one "Apr 8, 2017
    5:11:22 AM"

  Background:
  We have used this setup for a while and it was working without a
  change on our end. Also, the fact that it still works in the Trusty
  LXD makes it somewhat suspicious. Therefore I'd assume an SRUed change
  in LXD/Kernel/Apparmor might be the reason, and I'm opening this bug
  to get your opinion on it.

  You can look into [1] and search for uvt-kvm create in it.

  Deny in dmesg:
  [652759.606218] audit: type=1400 audit(1492671353.134:4520): apparmor="DENIED" operation="open" namespace="root//lxd-testkvm-xenial-from_" profile="libvirt-668e21f1-fa55-4a30-b325-0ed5cfd55e5b" name="/dev/pts/ptmx" pid=27162 comm="qemu-system-ppc" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
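
Audit lines like the one above are long, and the interesting parts are the confining profile, the denied path and the denied access mask. The following is a minimal sketch for pulling those out; `apparmor_denials` is a hypothetical helper name, and it assumes each audit record is on a single line with `key="value"` fields as shown in the report.

```shell
#!/bin/sh
# Hypothetical helper: read kernel/audit log lines on stdin and print
# "profile path denied_mask" for every AppArmor DENIED record.
apparmor_denials() {
    grep 'apparmor="DENIED"' | while IFS= read -r line; do
        # The leading space in each pattern keeps name= from matching namespace=.
        profile=$(printf '%s\n' "$line" | sed -n 's/.* profile="\([^"]*\)".*/\1/p')
        name=$(printf '%s\n' "$line" | sed -n 's/.* name="\([^"]*\)".*/\1/p')
        mask=$(printf '%s\n' "$line" | sed -n 's/.* denied_mask="\([^"]*\)".*/\1/p')
        printf '%s %s %s\n' "$profile" "$name" "$mask"
    done
}
```

Typical use would be `dmesg | apparmor_denials`. For the record above this gives the libvirt profile, `/dev/pts/ptmx` and `wr`.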

  Qemu-log:
  2017-04-20T06:55:53.139450Z qemu-system-ppc64: -chardev pty,id=charserial0: Failed to create PTY: No such file or directory

  There was a similar issue on qemu namespacing (which we don't use on
  any of these releases) [2]. While we surely don't have the "same"
  issue, the debugging done on the namespacing might be worth a look as
  it could be related.

  Workaround for now:
  - drop serial section from guest xml

  [1]: https://jenkins.ubuntu.com/server/view/Virt/job/virt-migration-cross-release-amd64/78/consoleFull
  [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1421036
  [3]: https://git.launchpad.net/~ubuntu-server/ubuntu/+source/qemu-migration-test/tree/kvm_profile.yaml
  [4]: https://libvirt.org/formatdomain.html#elementsCharPTY
  --- 
  ApportVersion: 2.20.1-0ubuntu2.5
  Architecture: ppc64el
  DistroRelease: Ubuntu 16.04
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
  Package: lxd
  PackageArchitecture: ppc64el
  ProcKernelCmdline: root=UUID=902eaad1-2164-4f9a-bec4-7ff3abc15804 ro 
console=hvc0
  ProcLoadAvg: 3.15 3.02 3.83 1/3056 79993
  ProcSwaps:
   Filename   Type  Size     Used  Priority
   /swap.img  file  8388544  0     -1
  ProcVersion: Linux version 4.4.0-72-generic (buildd@bos01-ppc64el-022) (gcc 
version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #93-Ubuntu SMP Fri 
Mar 31 14:05:15 UTC 2017
  ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
  Syslog:
   
  Tags:  xenial uec-images
  Uname: Linux 4.4.0-72-generic ppc64le
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: utah
  _MarkForUpload: True
  cpu_cores: Number of cores present = 20
  cpu_coreson: Number of cores online = 20
  cpu_smt: SMT is off
  --- 
  ApportVersion: 2.20.1-0ubuntu2.5
  Architecture: ppc64el
  DistroRelease: Ubuntu 16.04
  NonfreeKernelModules: cfg80211 ebtable_broute ebtable_nat binfmt_misc veth 
nbd openvswitch vhost_net vhost macvtap macvlan xt_conntrack ipt_REJECT 
nf_reject_ipv4 ebtable_filter ebtables ip6t_MASQUERADE nf_nat_masquerade_ipv6 
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter 
ip6_tables xt_comment xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables 
zfs zunicode zcommon znvpair spl zavl kvm_hv kvm ipmi_powernv ipmi_msghandler 
uio_pdrv_genirq vmx_crypto powernv_rng ibmpowernv leds_powernv uio ib_iser 
rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp 
libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 
multipath linear ses enclosure mlx4_en vxlan ip6_udp_tunnel udp_tunnel 
mlx4_core ipr
  Package: lxd
  PackageArchitecture: ppc64el
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdline: root=UUID=902eaad1-2164-4f9a-bec4-7ff3abc15804 ro 
console=hvc0
  P

[Kernel-packages] [Bug 1873809] Re: disk-kvm.img aren't UEFI bootable

2020-04-20 Thread Stéphane Graber

Ok, so the fact that we thought this worked is clearly the result of
bad testing on our part, probably because of our simplestreams parsing
code we fixed yesterday...

We obviously still need to move LXD onto these images, as booting the
non-kvm images takes twice as long as it should (due to them panicking
and rebooting every time) AND also breaks cloud-init, at least in the
way we'd like to use it.


Now realistically this can't be fixed in time for 20.04, so what we've
done is submit a change to simplestreams to force all LXD users onto
the non-kvm image:
  https://code.launchpad.net/~stgraber/simplestreams/+git/simplestreams/+merge/382597

We'll also tell all users of `ubuntu:` and `ubuntu-daily:` that they need to do:
 - lxc config device add NAME config disk source=cloud-init:config

Which passes a stable config drive to cloud-init, avoiding the cloud-
init issue they'd be getting out of the box.
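
For anyone with several affected instances, the command above has to be repeated once per instance. The following is only a sketch: `cloud_init_config_cmds` is a hypothetical helper name, and it deliberately prints the commands instead of running them so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the "lxc config device add" command from this
# message once for each instance name given as an argument (dry run only).
cloud_init_config_cmds() {
    for name in "$@"; do
        printf 'lxc config device add %s config disk source=cloud-init:config\n' "$name"
    done
}
```

For example, `cloud_init_config_cmds v1 v2` prints two commands, and piping the output to `sh` would run them.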


Moving forward, we'd like the -kvm kernel to be both EFI enabled AND signed. 
This will then allow those images to work properly inside LXD, at which point 
we can undo the simplestreams change and have those images used once again by 
our users.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  disk-kvm.img aren't UEFI bootable

Status in cloud-images:
  New
Status in linux-kvm package in Ubuntu:
  New

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, completely fail to boot under UEFI.

  This is a critical issue as those are the images that LXD is now
  pulling by default.

  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  Note that the non optimized images boot just fine (disk1.img).

  
  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  
  On the graphical console, you'll see EDK2 load (TianoCore) followed by
  basic boot messages and then a message from grub (error: can't find
  command `hwmatch`). Those also appear on successful boots of other
  images, so I don't think there's anything concerning there. However,
  it'll hang indefinitely and eat up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3FBEEA98 0047, LDTR - 
  IDTR - 3F2D8018 0FFF,   TR - 
  FXSAVE_STATE - 3FF1C290
   Find image based on IP(0x3FF2DA12) /build/edk2-dQLD17/edk2-0~20191122.bd85bf54/Build/OvmfX64/RELEASE_GCC5/X64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll (ImageBase=3FF1E000, EntryPoint=3FF30781)


  If booting in a SecureBoot enabled environment, you instead get an
  `Access Denied` at kernel loading time, indicating that the kernel
  binary isn't a normal signed kernel. That has the same result (boot
  hangs) but without the crash message.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-images/+bug/1873809/+subscriptions

[Kernel-packages] [Bug 1873809] Re: disk-kvm.img aren't UEFI bootable

2020-04-20 Thread Stéphane Graber

I've tested a kernel with CONFIG_EFI_STUB added (thanks cking!).

This does boot with secureboot enabled, though the LXD agent fails to
start due to lack of vsock.

So in addition to CONFIG_EFI_STUB, it looks like we also need:
 - CONFIG_VSOCKETS
 - CONFIG_VIRTIO_VSOCKETS
 - CONFIG_VIRTIO_VSOCKETS_COMMON

Which should give us the bits needed for virtio vsock.
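
To check a given kernel build against that list, something along these lines could be used. It is a minimal sketch with assumptions: `missing_kvm_options` is a hypothetical helper name, the config file is assumed to be in the usual `CONFIG_FOO=y` format, and an option counts as present whether it is built in (`y`) or built as a module (`m`).

```shell
#!/bin/sh
# Hypothetical helper: print every option from the list in this message that
# is not set to y or m in the kernel config file given as $1.
missing_kvm_options() {
    config=$1
    for opt in CONFIG_EFI_STUB CONFIG_VSOCKETS CONFIG_VIRTIO_VSOCKETS \
               CONFIG_VIRTIO_VSOCKETS_COMMON; do
        # Anchored match so CONFIG_VIRTIO_VSOCKETS does not satisfy
        # CONFIG_VIRTIO_VSOCKETS_COMMON, and "is not set" comments are ignored.
        grep -Eq "^${opt}=(y|m)\$" "$config" || echo "$opt"
    done
}
```

On an installed system this could be run as `missing_kvm_options "/boot/config-$(uname -r)"`. Empty output means all four options are enabled.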

The rest all looked good, so we should be fine with those tweaks and the
kernel getting signed.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-images/+bug/1873809/+subscriptions

[Kernel-packages] [Bug 1873809] Re: disk-kvm.img aren't UEFI bootable

2020-04-20 Thread Stéphane Graber

Marking cloud-images side of this as Invalid since the images themselves are 
built correctly.
Re-packing with an updated kernel boots just fine, so we only need to track 
this against linux-kvm.

** Changed in: cloud-images
   Status: New => Invalid

** Summary changed:

- disk-kvm.img aren't UEFI bootable
+ Make linux-kvm bootable in LXD VMs

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  New

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, completely fail to boot under UEFI.

  This is a critical issue as those are the images that LXD is now
  pulling by default.

  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  Note that the non optimized images boot just fine (disk1.img).

  
  I've reproduced this issue with:
   - wget 
http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda 
focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  
  On the graphical console, you'll see EDK2 load (TianoCore) followed by basic
  boot messages and then a message from grub (error: can't find command
  `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However it'll then hang indefinitely and
  eat up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3FBEEA98 0047, LDTR - 
  IDTR - 3F2D8018 0FFF,   TR - 
  FXSAVE_STATE - 3FF1C290
  Find image based on IP(0x3FF2DA12) /build/edk2-dQLD17/edk2-0~20191122.bd85bf54/Build/OvmfX64/RELEASE_GCC5/X64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll (ImageBase=3FF1E000, EntryPoint=3FF30781)


  If booting in a SecureBoot enabled environment, you instead get an
  `Access Denied` at kernel loading time, indicating that the kernel
  binary isn't a normal signed kernel. That has the same result (boot
  hangs) but without the crash message.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-images/+bug/1873809/+subscriptions

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-04-20 Thread Stéphane Graber

** Description changed:

  The `disk-kvm.img` images which are to be preferred when run under
- virtualization, completely fail to boot under UEFI.
+ virtualization, currently completely fail to boot under UEFI.
  
- This is a critical issue as those are the images that LXD is now pulling
- by default.
+ A workaround was put in place such that LXD will instead pull
+ generic-based images until this is resolved. This does however come with
+ a much longer boot time (as the kernel panics, reboots and then boots)
+ and reduced functionality from cloud-init, so we'd still like this fixed
+ in the near future.
  
+ To get things behaving, it looks like we need the following config
+ options to be enabled in linux-kvm:
+ 
+  - CONFIG_EFI_STUB
+  - CONFIG_VSOCKETS
+  - CONFIG_VIRTIO_VSOCKETS
+  - CONFIG_VIRTIO_VSOCKETS_COMMON
+ 
+ == Rationale ==
+ We'd like to be able to use the linux-kvm based images for LXD; those will
+ directly boot without needing the panic+reboot behavior of generic images
+ and will be much lighter in general.
+ 
+ We also need the LXD agent to work, which requires functional virtio
+ vsock.
+ 
+ == Test case ==
+  - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
+  - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
+  - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
+  - lxc launch bug1873809 v1
+  - lxc console v1
+  - 
+  - 
+  - lxc exec v1 bash
+ 
+ To validate a new kernel, you'll need to manually repack the .img file
+ and install the new kernel in there.
+ 
+ == Regression potential ==
+ I don't know who else is using those kvm images right now, but those changes
+ will cause a change to the kernel binary such that it contains the EFI stub
+ bits + a signature. This could cause some (horribly broken) systems to no
+ longer be able to boot that kernel. Though considering that such a setup is
+ common to our other kernels, this seems unlikely.
+ 
+ Also, this will be introducing virtio vsock support which again, could
+ maybe confuse some horribly broken systems?
+ 
+ 
+ In either case, the kernel conveniently is the only package which ships
+ multiple versions concurrently, so rebooting on the previous kernel is always
+ an option, mitigating some of the risks.
+ 
+ 
+ -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224
  
- Note that the non optimized images boot just fine (disk1.img).
- 
- 
  I've reproduced this issue with:
-  - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
-  - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G
- 
+  - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
+  - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G
  
  On the graphical console, you'll see EDK2 load (TianoCore) followed by basic
  boot messages and then a message from grub (error: can't find command
  `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However it'll then hang indefinitely and
  eat up all your CPU.
  
  Switching to the text console view (serial0), you'll see the same issue
  as that LXD report:
  
  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3FBEEA98 0047, LDTR - 
  IDTR - 3F2D8018 0FFF,   TR - 
  FXSAVE_STATE - 3FF

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-04-20 Thread Stéphane Graber

Just tested it now, confirmed that this still boots fine and that this
time the LXD agent successfully starts too.

So this config seems suitable for us. That + enabling kernel signing
will get us working images.

Thanks!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  New

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, currently completely fail to boot under UEFI.

  A workaround was put in place such that LXD will instead pull
  generic-based images until this is resolved. This does however come with
  a much longer boot time (as the kernel panics, reboots and then boots)
  and reduced functionality from cloud-init, so we'd still like this fixed
  in the near future.

  To get things behaving, it looks like we need the following config
  options to be enabled in linux-kvm:

   - CONFIG_EFI_STUB
   - CONFIG_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS_COMMON

  == Rationale ==
  We'd like to be able to use the linux-kvm based images for LXD; those will
  directly boot without needing the panic+reboot behavior of generic images
  and will be much lighter in general.

  We also need the LXD agent to work, which requires functional virtio
  vsock.

  == Test case ==
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
   - lxc launch bug1873809 v1
   - lxc console v1
   - 
   - 
   - lxc exec v1 bash

  To validate a new kernel, you'll need to manually repack the .img file
  and install the new kernel in there.

  == Regression potential ==
  I don't know who else is using those kvm images right now, but those changes
  will cause a change to the kernel binary such that it contains the EFI stub
  bits + a signature. This could cause some (horribly broken) systems to no
  longer be able to boot that kernel. Though considering that such a setup is
  common to our other kernels, this seems unlikely.

  Also, this will be introducing virtio vsock support which again, could
  maybe confuse some horribly broken systems?

  In either case, the kernel conveniently is the only package which ships
  multiple versions concurrently, so rebooting on the previous kernel is always
  an option, mitigating some of the risks.

  
  -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  On the graphical console, you'll see EDK2 load (TianoCore) followed by basic
  boot messages and then a message from grub (error: can't find command
  `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However it'll then hang indefinitely and
  eat up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3FBEEA98 0047, LDTR - 
  IDTR - 3F2D8018

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-04-21 Thread Stéphane Graber

Thanks Louis, so our testing may in fact have been accurate and things
regressed afterwards :)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in Cloud-Bio-Linux Tutorials:
  Confirmed
Status in cloud-images:
  Confirmed
Status in linux-kvm package in Ubuntu:
  Fix Released

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-04-21 Thread Stéphane Graber

Hmm, actually, CONFIG_EFI_STUB is the one we were missing and I'm not
seeing that in your VM either, which makes me wonder how it was booted
in the first place :)
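
For anyone wanting to repeat that check, here is a minimal sketch (not
part of the original report). It assumes the kernel config is available
as a plain text file, for example /boot/config-$(uname -r) as on a stock
Ubuntu install; the check_kvm_config helper name is made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: report which of the config options named in this
# bug are enabled (=y or =m) in the kernel config file passed as $1.
check_kvm_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_EFI_STUB CONFIG_VSOCKETS CONFIG_VIRTIO_VSOCKETS \
               CONFIG_VIRTIO_VSOCKETS_COMMON; do
        # Only "OPT=y" or "OPT=m" counts; "# OPT is not set" does not match.
        if grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "${opt}: enabled"
        else
            echo "${opt}: missing"
            missing=1
        fi
    done
    # Non-zero exit status if at least one option is missing.
    return "$missing"
}
```

Calling `check_kvm_config "/boot/config-$(uname -r)"` inside the VM
should print one line per option and return non-zero if any of them is
missing.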

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in Cloud-Bio-Linux Tutorials:
  Confirmed
Status in cloud-images:
  Confirmed
Status in linux-kvm package in Ubuntu:
  Fix Released

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-04-21 Thread Stéphane Graber

Ok, fixed the bug tasks and re-opened the bug as we still need this
kernel to get signed.

** Changed in: linux-kvm (Ubuntu)
   Status: Fix Released => Triaged

** Changed in: cloud-images
 Assignee: Roufique Hossain (roufique) => (unassigned)

** Changed in: linux-kvm (Ubuntu)
 Assignee: Roufique Hossain (roufique) => (unassigned)

** Changed in: cloud-bl-tutorials
   Status: Confirmed => Invalid

** Changed in: cloud-bl-tutorials
   Status: Invalid => New

** Changed in: cloud-bl-tutorials
 Remote watch: Email to roufique@rtat # => None

** Project changed: cloud-bl-tutorials => linux (Ubuntu)

** No longer affects: linux (Ubuntu)

** Changed in: cloud-images
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1879690] Re: Docker registry doesn't stay up and keeps restarting

2020-05-21 Thread Stéphane Graber

To confirm that this isn't shiftfs related and that we were just hiding
the issue, I've run the same test on openSUSE Tumbleweed.

I chose that distro because it's apparmor-enabled, has snapd and a 5.4
kernel.

```
localhost:~ # snap install docker
docker 18.09.9 from Canonical* installed
localhost:~ # auth_folder=/var/snap/docker/common/auth
localhost:~ # mkdir -p $auth_folder
localhost:~ # docker run --entrypoint htpasswd registry:2 -Bbn user passwd > $auth_folder/htpasswd
Unable to find image 'registry:2' locally
2: Pulling from library/registry
486039affc0a: Pulling fs layer
ba51a3b098e6: Pulling fs layer
8bb4c43d6c8e: Pulling fs layer
6f5f453e5f2d: Pulling fs layer
42bc10b72f42: Pulling fs layer
6f5f453e5f2d: Waiting
42bc10b72f42: Waiting
ba51a3b098e6: Download complete
486039affc0a: Verifying Checksum
486039affc0a: Download complete
8bb4c43d6c8e: Verifying Checksum
8bb4c43d6c8e: Download complete
6f5f453e5f2d: Verifying Checksum
6f5f453e5f2d: Download complete
42bc10b72f42: Verifying Checksum
42bc10b72f42: Download complete
486039affc0a: Pull complete
ba51a3b098e6: Pull complete
8bb4c43d6c8e: Pull complete
6f5f453e5f2d: Pull complete
42bc10b72f42: Pull complete
Digest: sha256:7d081088e4bfd632a88e3f3bcd9e007ef44a796fddfe3261407a3f9f04abe1e7
Status: Downloaded newer image for registry:2
localhost:~ # docker run -d -p 5000:5000 --restart=always --name registry \
>   -v $auth_folder:/auth \
>   -e "REGISTRY_AUTH=htpasswd" \
>   -e "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" \
>   -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
>registry:2
cba1ec94734a8a198fa0c474d9873233958fad6cdafe93d2ccf4d701ecab55ff
localhost:~ # docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS                                  PORTS   NAMES
cba1ec94734a   registry:2   "/entrypoint.sh /etc…"   5 seconds ago   Restarting (2) Less than a second ago           registry
localhost:~ # uname -a
Linux localhost 5.4.10-1-default #1 SMP Thu Jan 9 15:45:45 UTC 2020 (556a6fe) x86_64 x86_64 x86_64 GNU/Linux
localhost:~ # 
```

As you can see, the exact same thing happens there. So this is either an
apparmor kernel bug or some issue with snapd or the docker snap. It isn't
a shiftfs bug, and reverting the change would just expose a different bug
rather than actually fix things.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1879690

Title:
  Docker registry doesn't stay up and keeps restarting

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Eoan:
  Fix Committed
Status in linux source package in Focal:
  Fix Committed

[Kernel-packages] [Bug 1879690] Re: Docker registry doesn't stay up and keeps restarting

2020-05-21 Thread Stéphane Graber

/var/log/audit.log on SUSE logs the same:

type=AVC msg=audit(1590086639.489:8595): apparmor="DENIED"
operation="open" profile="snap.docker.dockerd" name="/entrypoint.sh"
pid=5656 comm="entrypoint.sh" requested_mask="r" denied_mask="r" fsuid=0
ouid=0
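
A minimal sketch for pulling such denials out of an audit log, assuming
each record sits on a single line in the log file (the record above is
only wrapped by the mail client). The apparmor_denials helper name is
hypothetical and not part of any tool mentioned in this bug.

```shell
#!/bin/sh
# Hypothetical helper: for every AppArmor denial in the audit log passed
# as $1, print "<profile> <name> <requested_mask>" on one line.
apparmor_denials() {
    grep 'apparmor="DENIED"' "$1" |
        sed -n 's/.*profile="\([^"]*\)".*name="\([^"]*\)".*requested_mask="\([^"]*\)".*/\1 \2 \3/p'
}
```

Pass it whichever path the audit log lives at on your system; records
that are not AppArmor denials are ignored.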

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1879690

Title:
  Docker registry doesn't stay up and keeps restarting

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Eoan:
  Fix Committed
Status in linux source package in Focal:
  Fix Committed

Bug description:
  [Impact]
  The change applied for bug 1857257 and its followup fix bug 1876645, which
  were released on focal and eoan -updates, introduced a regression on
  overlayfs, breaking the docker snap.

  [Test case]
  See original bug report.

  [Fix]
  While we don't have a final fix, the solution for now is to revert the
  following commits:

  UBUNTU: SAUCE: overlayfs: fix shitfs special-casing
  UBUNTU: SAUCE: overlayfs: use shiftfs hacks only with shiftfs as underlay

  [Regression potential]
  Low. Reverting these two commits will reintroduce the issue reported in
  bug 1857257, but will fix the other use cases which were broken by the
  latest release.

  
  Original bug report.
  ---
  Tested kernels:
  Focal 5.4.0-31.35
  Eoan 5.3.0-53.47

  To reproduce:
  1) Spin up a cloud image
  2) snap install docker
  3) auth_folder=/var/snap/docker/common/auth
  4) mkdir -p $auth_folder
  5) docker run --entrypoint htpasswd registry:2 -Bbn user passwd > $auth_folder/htpasswd
  6) docker run -d -p 5000:5000 --restart=always --name registry \
    -v $auth_folder:/auth \
    -e "REGISTRY_AUTH=htpasswd" \
    -e "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" \
    -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
     registry:2

  On a good kernel 'docker ps' shows something like:
  # docker ps
  CONTAINER ID   IMAGE        COMMAND                  CREATED          STATUS          PORTS                    NAMES
  a346b65b4509   registry:2   "/entrypoint.sh /etc…"   14 seconds ago   Up 12 seconds   0.0.0.0:5000->5000/tcp   registry

  On a bad kernel:
  # docker ps
  CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS                        PORTS   NAMES
  0322374f1b1d   registry:2   "/entrypoint.sh /etc…"   5 seconds ago   Restarting (2) 1 second ago           registry

  Note status 'Restarting' on the bad kernel.
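
  The good/bad check above could be automated when bisecting kernels. A
  minimal Python sketch, assuming the input was produced by something
  like `docker ps --format '{{.Names}}\t{{.Status}}'` (one tab-separated
  name/status pair per line); `restarting_containers` is a hypothetical
  helper name:

```python
def restarting_containers(ps_output: str) -> list:
    """Return the names of containers whose status starts with 'Restarting'."""
    names = []
    for line in ps_output.splitlines():
        name, sep, status = line.partition("\t")
        if sep and status.strip().startswith("Restarting"):
            names.append(name)
    return names


# Example input mirroring the two statuses shown in this report.
sample = "registry\tRestarting (2) 1 second ago\nother\tUp 12 seconds"
print(restarting_containers(sample))
```

  A non-empty result a few seconds after step 6 would mark the kernel as
  bad for the purposes of this reproducer.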

  This seems to be introduced by any of the following commits:
  b3bdda24f1bc UBUNTU: SAUCE: overlayfs: fix shitfs special-casing
  6f18a8434050 UBUNTU: SAUCE: overlayfs: use shiftfs hacks only with shiftfs as underlay
  629edd70891c UBUNTU: SAUCE: shiftfs: record correct creator credentials
  cfaa482afb97 UBUNTU: SAUCE: shiftfs: fix dentry revalidation

  Kernels that don't have these commits seem fine.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1879690/+subscriptions


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-05-26 Thread Stéphane Graber

Re-opening as I'm not seeing any mention of this being signed now.

** Changed in: linux-kvm (Ubuntu)
   Status: Fix Released => Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, currently completely fail to boot under UEFI.

  A workaround was put in place such that LXD will instead pull
  generic-based images until this is resolved. This however does come
  with a much longer boot time (as the kernel panics, reboots and then
  boots) and also reduced functionality from cloud-init, so we'd still
  like this fixed in the near future.

  To get things behaving, it looks like we need the following config
  options to be enabled in linux-kvm:

   - CONFIG_EFI_STUB
   - CONFIG_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS_COMMON
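
  Checking a candidate kernel for these options can be scripted. A
  minimal Python sketch, assuming the kernel config is available as text
  in the usual `NAME=value` form (on Ubuntu it is typically shipped as
  /boot/config-<version>); `missing_options` is a hypothetical helper
  name:

```python
REQUIRED = (
    "CONFIG_EFI_STUB",
    "CONFIG_VSOCKETS",
    "CONFIG_VIRTIO_VSOCKETS",
    "CONFIG_VIRTIO_VSOCKETS_COMMON",
)


def missing_options(config_text: str, required=REQUIRED) -> list:
    """Return the required options that are neither built in (y) nor modular (m)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" mark disabled options.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


# Made-up config fragment, for illustration only.
sample = "CONFIG_EFI_STUB=y\n# CONFIG_VSOCKETS is not set\nCONFIG_VIRTIO_VSOCKETS=m\n"
print(missing_options(sample))
```

  An empty result means all four options are enabled in the config that
  was checked.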

  == Rationale ==
  We'd like to be able to use the linux-kvm based images for LXD. Those will
  boot directly without needing the panic+reboot behavior of generic images
  and will be much lighter in general.

  We also need the LXD agent to work, which requires functional virtio
  vsock.

  == Test case ==
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
   - lxc launch bug1873809 v1
   - lxc console v1
   - 
   - 
   - lxc exec v1 bash

  To validate a new kernel, you'll need to manually repack the .img file
  and install the new kernel in there.

  == Regression potential ==
  I don't know who else is using those kvm images right now, but those changes
  will cause a change to the kernel binary such that it contains the EFI stub
  bits + a signature. This could cause some (horribly broken) systems to no
  longer be able to boot that kernel. Though considering that such a setup is
  common to our other kernels, this seems unlikely.

  Also, this will be introducing virtio vsock support which again, could
  maybe confuse some horribly broken systems?

  In either case, the kernel conveniently is the only package which ships
  multiple versions concurrently, so rebooting on the previous kernel is
  always an option, mitigating some of the risks.

  
  -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  On the graphical console, you'll see EDK2 load (TianoCore) followed by basic
  boot messages and then a message from grub (error: can't find command
  `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However it'll hang indefinitely and eat up
  all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - , DR1 - , DR2 - 
  DR3  - , DR6 - 0FF0, DR7 - 0400
  GDTR - 3FBEEA98 0047, LDTR - 
  IDTR - 3F2D8018 0FFF,   TR - 
  FXSAVE_STATE - 3FF1C290
   Find i

[Kernel-packages] [Bug 1881346] Re: linux-kvm should support nftables

2020-05-29 Thread Stéphane Graber

Right, I've sent a tweak to LXD upstream to detect such a kernel setup and
fall back to xtables, but that's obviously not a situation we'd like to
rely on.

nftables is the currently supported way of doing firewalling and is what
Ubuntu uses by default (through shim packages) as of 20.04, so we need
to ensure that all our kernels support it.

The easy fix would be to align CONFIG_NFT* to what we have in generic. If
that increases size too much, then I guess we can look at trimming
things a bit to only include the usual bits we need (ipv4, ipv6, nat,
mangling, mac filtering, ...).
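
To see how far linux-kvm is from generic, the two configs can be
compared option by option. A rough Python sketch, assuming both kernel
configs are available as `NAME=value` text; `parse_config` and
`nft_differences` are hypothetical helper names, and the default prefix
only covers options literally starting with CONFIG_NFT:

```python
def parse_config(text: str) -> dict:
    """Map each set option to its value; '# ... is not set' lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def nft_differences(generic_text: str, kvm_text: str, prefix: str = "CONFIG_NFT") -> dict:
    """Return {option: (generic value, kvm value or None)} where the two differ."""
    generic = parse_config(generic_text)
    kvm = parse_config(kvm_text)
    diff = {}
    for name, value in generic.items():
        if name.startswith(prefix) and kvm.get(name) != value:
            diff[name] = (value, kvm.get(name))
    return diff


# Made-up fragments, for illustration only.
generic_sample = "CONFIG_NFT_NAT=m\nCONFIG_NFT_CT=m\nCONFIG_EFI_STUB=y\n"
kvm_sample = "CONFIG_NFT_CT=m\n# CONFIG_NFT_NAT is not set\n"
print(nft_differences(generic_sample, kvm_sample))
```

Passing a broader prefix such as "CONFIG_NF" would widen the comparison
to the rest of the netfilter options.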

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1881346

Title:
  linux-kvm should support nftables

Status in linux-kvm package in Ubuntu:
  New

Bug description:
  LXD can't use nftables on the latest linux-kvm kernels for eoan,
  focal, and groovy:

  - groovy: 5.4.0.1009.9
  - focal: 5.4.0-1011.11
  - eoan: 5.3.0.1017.19

  LXD detects that nft tools are available, and nft tables can be
  listed; however, trying to create a new table or rule fails.

  Because of this, LXD has to fall back on xtables, which is a legacy
  package.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-kvm/+bug/1881346/+subscriptions


[Kernel-packages] [Bug 1648143] Re: tor in lxd: apparmor="DENIED" operation="change_onexec" namespace="root//CONTAINERNAME_" profile="unconfined" name="system_tor"

2020-06-01 Thread Stéphane Graber

** Changed in: apparmor (Ubuntu)
   Status: Confirmed => Invalid

** No longer affects: apparmor (Ubuntu Xenial)

** No longer affects: apparmor (Ubuntu Yakkety)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1648143

Title:
  tor in lxd: apparmor="DENIED" operation="change_onexec"
  namespace="root//CONTAINERNAME_" profile="unconfined"
  name="system_tor"

Status in apparmor package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Fix Released
Status in tor package in Ubuntu:
  Invalid
Status in linux source package in Xenial:
  Fix Released
Status in tor source package in Xenial:
  Invalid
Status in linux source package in Yakkety:
  Fix Released
Status in tor source package in Yakkety:
  Invalid

Bug description:
  Environment:
  

  Distribution: ubuntu
  Distribution version: 16.10
  lxc info:
  apiextensions:
  - storage_zfs_remove_snapshots
  - container_host_shutdown_timeout
  - container_syscall_filtering
  - auth_pki
  - container_last_used_at
  - etag
  - patch
  - usb_devices
  - https_allowed_credentials
  - image_compression_algorithm
  - directory_manipulation
  - container_cpu_time
  - storage_zfs_use_refquota
  - storage_lvm_mount_options
  - network
  - profile_usedby
  - container_push
  apistatus: stable
  apiversion: "1.0"
  auth: trusted
  environment:
    addresses:
    - 163.172.48.149:8443
    - 172.20.10.1:8443
    - 172.20.11.1:8443
    - 172.20.12.1:8443
    - 172.20.22.1:8443
    - 172.20.21.1:8443
    - 10.8.0.1:8443
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----
      -----END CERTIFICATE-----
    certificatefingerprint: 3048baa9f20d316f60a6c602452b58409a6d9e2c3218897e8de7c7c72af0179b
    driver: lxc
    driverversion: 2.0.5
    kernel: Linux
    kernelarchitecture: x86_64
    kernelversion: 4.8.0-27-generic
    server: lxd
    serverpid: 32694
    serverversion: 2.4.1
    storage: btrfs
    storageversion: 4.7.3
  config:
    core.https_address: '[::]:8443'
    core.trust_password: true

  Container: ubuntu 16.10

  
  Issue description
  -----------------

  tor can't start in an unprivileged container

  
  Logs from the container:
  ------------------------

  Dec 7 15:03:00 anonymous tor[302]: Configuration was valid
  Dec 7 15:03:00 anonymous systemd[303]: tor@default.service: Failed at step APPARMOR spawning /usr/bin/tor: No such file or directory
  Dec 7 15:03:00 anonymous systemd[1]: tor@default.service: Main process exited, code=exited, status=231/APPARMOR
  Dec 7 15:03:00 anonymous systemd[1]: Failed to start Anonymizing overlay network for TCP.
  Dec 7 15:03:00 anonymous systemd[1]: tor@default.service: Unit entered failed state.
  Dec 7 15:03:00 anonymous systemd[1]: tor@default.service: Failed with result 'exit-code'.
  Dec 7 15:03:00 anonymous systemd[1]: tor@default.service: Service hold-off time over, scheduling restart.
  Dec 7 15:03:00 anonymous systemd[1]: Stopped Anonymizing overlay network for TCP.
  Dec 7 15:03:00 anonymous systemd[1]: tor@default.service: Failed to reset devices.list: Operation not permitted
  Dec 7 15:03:00 anonymous systemd[1]: Failed to set devices.allow on /system.slice/system-tor.slice/tor@default.service: Operation not permitted
  Dec 7 15:03:00 anonymous systemd[1]: message repeated 6 times: [ Failed to set devices.allow on /system.slice/system-tor.slice/tor@default.service: Operation not permitted]
  Dec 7 15:03:00 anonymous systemd[1]: Couldn't stat device /run/systemd/inaccessible/chr
  Dec 7 15:03:00 anonymous systemd[1]: Couldn't stat device /run/systemd/inaccessible/blk
  Dec 7 15:03:00 anonymous systemd[1]: Failed to set devices.allow on /system.slice/system-tor.slice/tor@default.service: Operation not permitted


  Logs from the host
  

  audit: type=1400 audit(1481119378.856:6950): apparmor="DENIED" operation="change_onexec" info="label not found" error=-2 namespace="root//lxd-anonymous_" profile="unconfined" name="system_tor" pid=12164 comm="(tor)"

  
  Steps to reproduce
  ------------------

  Install an Ubuntu 16.10 container on an Ubuntu 16.10 host
  Install tor in the container
  Launch tor

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1648143/+subscriptions


[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-06-01 Thread Stéphane Graber

Pinged in #ubuntu-kernel today for an update. It'd be good to have
groovy signed soon so we can then roll this out to focal users.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged


[Kernel-packages] [Bug 1645037] Re: apparmor_parser hangs indefinitely when called by multiple threads

2020-06-01 Thread Stéphane Graber

** No longer affects: apparmor (Ubuntu)

** No longer affects: linux (Ubuntu Xenial)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1645037

Title:
  apparmor_parser hangs indefinitely when called by multiple threads

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Yakkety:
  Won't Fix
Status in linux source package in Zesty:
  Fix Released

Bug description:
  This bug surfaced when starting ~50 LXC containers with LXD in parallel
  multiple times:

  # Create the containers
  for c in c foo{1..50}; do lxc launch images:ubuntu/xenial $c; done

  # Execute this loop multiple times until you observe errors.
  for c in c foo{1..50}; do lxc restart $c & done

  After this you can

  ps aux | grep apparmor

  and you should see output similar to:

  root 19774  0.0  0.0  12524  1116 pts/1  S+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo30
  root 19775  0.0  0.0  12524  1208 pts/1  S+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo26
  root 19776  0.0  0.0  13592  3224 pts/1  D+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo30
  root 19778  0.0  0.0  13592  3384 pts/1  D+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo26
  root 19780  0.0  0.0  12524  1208 pts/1  S+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo43
  root 19782  0.0  0.0  12524  1208 pts/1  S+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo34
  root 19783  0.0  0.0  13592  3388 pts/1  D+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo43
  root 19784  0.0  0.0  13592  3252 pts/1  D+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo34
  root 19794  0.0  0.0  12524  1208 pts/1  S+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo25
  root 19795  0.0  0.0  13592  3256 pts/1  D+  20:14  0:00 apparmor_parser -RWL /var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo25

  apparmor_parser remains stuck even after all LXC/LXD commands have
  exited.
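
  The stuck parsers are the ones in uninterruptible sleep (state D), so
  they can be picked out of the process list mechanically. A minimal
  Python sketch, assuming `ps aux` output with the standard eleven
  whitespace-separated columns; `uninterruptible` is a hypothetical
  helper name:

```python
def uninterruptible(ps_output: str, command: str = "apparmor_parser") -> list:
    """Return PIDs of matching processes whose STAT column starts with 'D'."""
    stuck = []
    for line in ps_output.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        pid, stat, cmd = fields[1], fields[7], fields[10]
        if stat.startswith("D") and command in cmd:
            stuck.append(int(pid))
    return stuck


# Two of the processes from this report, with cleanly separated columns.
sample = (
    "root 19775 0.0 0.0 12524 1208 pts/1 S+ 20:14 0:00 apparmor_parser -RWL "
    "/var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo26\n"
    "root 19778 0.0 0.0 13592 3384 pts/1 D+ 20:14 0:00 apparmor_parser -RWL "
    "/var/lib/lxd/security/apparmor/cache /var/lib/lxd/security/apparmor/profiles/lxd-foo26\n"
)
print(uninterruptible(sample))
```

  The returned PIDs are the ones worth inspecting through
  /proc/<pid>/stack.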

  dmesg output yields lines like:

  [41902.815174] audit: type=1400 audit(1480191089.678:43): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-foo30_" pid=12545 comm="apparmor_parser"

  and cat /proc/12545/stack shows:

  [] aa_remove_profiles+0x88/0x270
  [] profile_remove+0x144/0x2e0
  [] __vfs_write+0x18/0x40
  [] vfs_write+0xb8/0x1b0
  [] SyS_write+0x55/0xc0
  [] entry_SYSCALL_64_fastpath+0x1e/0xa8
  [] 0x

  This looks like a potential kernel bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1645037/+subscriptions


[Kernel-packages] [Bug 1864303] Re: Removing the e1000e module causes a crash

2020-02-22 Thread Stéphane Graber

** Changed in: linux-5.4 (Ubuntu)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1864303

Title:
  Removing the e1000e module causes a crash

Status in linux-5.4 package in Ubuntu:
  Confirmed

Bug description:
  I have a Lenovo X1 Carbon Gen5. When it initially came out, leaving the
  onboard NIC (e1000e) module loaded would suck CPU/battery life, so I
  have it removed in rc.local on boot.

  In 5.4 (also happens on 5.4.0.15.18 which I'm running from proposed
  right now), this is what happens when the module is unloaded:

  [  608.979789] e1000e :00:1f.6 enp0s31f6: removed PHC
  [  609.008352] [ cut here ]
  [  609.008353] kernel BUG at drivers/pci/msi.c:375!
  [  609.008358] invalid opcode:  [#1] SMP PTI
  [  609.008359] CPU: 0 PID: 6829 Comm: rmmod Tainted: P   O  
5.4.0-15-generic #18-Ubuntu
  [  609.008360] Hardware name: LENOVO 20HRCTO1WW/20HRCTO1WW, BIOS N1MET59W 
(1.44 ) 11/25/2019
  [  609.008364] RIP: 0010:free_msi_irqs+0x17d/0x1b0
  [  609.008365] Code: 84 df fe ff ff 45 31 f6 eb 11 41 83 c6 01 44 39 73 14 0f 
86 cc fe ff ff 8b 7b 10 44 01 f7 e8 ea c3 b6 ff 48 83 78 70 00 74 e0 <0f> 0b 49 
8d b5 b0 00 00 00 e8 b5 7d b7 ff e9 cd fe ff ff 49 8b 78
  [  609.008366] RSP: 0018:a7d2072f7d40 EFLAGS: 00010286
  [  609.008367] RAX: 8bc9bfb49e00 RBX: 8bc9ad69c720 RCX: 

  [  609.008368] RDX:  RSI: 0084 RDI: 
a9e65980
  [  609.008369] RBP: a7d2072f7d70 R08: 8bc9bb564db0 R09: 
8bc9bb564df8
  [  609.008369] R10:  R11: a9e65988 R12: 
8bc9cb5272c0
  [  609.008370] R13: 8bc9cb527000 R14:  R15: 
dead0100
  [  609.008371] FS:  7f188f1f9500() GS:8bc9ce20() 
knlGS:
  [  609.008372] CS:  0010 DS:  ES:  CR0: 80050033
  [  609.008373] CR2: 7f6d1a6af060 CR3: 00046d9f8006 CR4: 
003606f0
  [  609.008373] Call Trace:
  [  609.008376]  pci_disable_msi+0x100/0x130
  [  609.008385]  e1000e_reset_interrupt_capability+0x52/0x60 [e1000e]
  [  609.008389]  e1000_remove+0xc4/0x180 [e1000e]
  [  609.008391]  pci_device_remove+0x3e/0xb0
  [  609.008394]  device_release_driver_internal+0xf0/0x1d0
  [  609.008396]  driver_detach+0x4c/0x8f
  [  609.008397]  bus_remove_driver+0x5c/0xd0
  [  609.008399]  driver_unregister+0x31/0x50
  [  609.008400]  pci_unregister_driver+0x40/0x90
  [  609.008405]  e1000_exit_module+0x10/0x3c1 [e1000e]
  [  609.008407]  __x64_sys_delete_module+0x147/0x2b0
  [  609.008409]  ? exit_to_usermode_loop+0xea/0x160
  [  609.008411]  do_syscall_64+0x57/0x190
  [  609.008413]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [  609.008414] RIP: 0033:0x7f188f345c9b
  [  609.008416] Code: 73 01 c3 48 8b 0d f5 71 0c 00 f7 d8 64 89 01 48 83 c8 ff 
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 
f0 ff ff 73 01 c3 48 8b 0d c5 71 0c 00 f7 d8 64 89 01 48
  [  609.008416] RSP: 002b:7fffc8d32e68 EFLAGS: 0206 ORIG_RAX: 
00b0
  [  609.008418] RAX: ffda RBX: 561e1a391790 RCX: 
7f188f345c9b
  [  609.008419] RDX: 000a RSI: 0800 RDI: 
561e1a3917f8
  [  609.008420] RBP: 7fffc8d32ec8 R08:  R09: 

  [  609.008420] R10: 7f188f3c1ac0 R11: 0206 R12: 
7fffc8d33090
  [  609.008422] R13: 7fffc8d3474a R14: 561e1a3912a0 R15: 
561e1a391790
  [  609.008424] Modules linked in: thunderbolt rfcomm xfrm_user xfrm_algo 
l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppox ccm xt_comment 
xt_CHECKSUM xt_MASQUERADE ip6table_mangle ip6table_nat dummy iptable_mangle 
iptable_nat nf_tables nfnetlink bridge stp llc cmac algif_hash algif_skcipher 
af_alg bnep zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) 
spl(O) zlua(PO) joydev intel_rapl_msr mei_hdcp nls_iso8859_1 snd_seq_midi 
snd_seq_midi_event snd_hda_codec_hdmi snd_hda_codec_conexant 
snd_hda_codec_generic intel_rapl_common x86_pkg_temp_thermal intel_powerclamp 
coretemp kvm_intel snd_hda_intel snd_rawmidi kvm snd_intel_nhlt snd_hda_codec 
intel_cstate rmi_smbus intel_rapl_perf snd_hda_core snd_hwdep iwlmvm rmi_core 
mac80211 uvcvideo input_leds videobuf2_vmalloc videobuf2_memops btusb btrtl 
snd_pcm libarc4 serio_raw snd_seq btbcm intel_wmi_thunderbolt videobuf2_v4l2 
btintel videobuf2_common wmi_bmof thinkpad_acpi bluetooth videodev nvram 
snd_seq_device ledtrig_audio
  [  609.008447]  iwlwifi mc snd_timer rtsx_pci_ms ecdh_generic ecc cfg80211 
memstick snd mei_me ucsi_acpi typec_ucsi intel_xhci_usb_role_switch mei roles 
intel_pch_thermal typec soundcore acpi_pad mac_hid nf_log_ipv6 ip6t_REJECT 
nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT 
nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_

[Kernel-packages] [Bug 1834475] Re: lxd 3.0.3-0ubuntu1~18.04.1 ADT test failure with linux 4.15.0-54.58

2019-06-28 Thread Stéphane Graber

We've changed some of those timings in 3.0.4, which will make it into
Ubuntu in the next month or so, but those tests can still be slightly
flaky even in our CI. We're testing cluster recovery during random
node losses, and sometimes things take a bit longer than the 30s timeout
to recover, especially on busy systems like those adt test VMs.

It's not an actual problem that users will ever see, so if a retry gets
you through, don't worry about that particular failure.

** Changed in: lxd (Ubuntu)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1834475

Title:
  lxd 3.0.3-0ubuntu1~18.04.1 ADT test failure with linux 4.15.0-54.58

Status in linux package in Ubuntu:
  Incomplete
Status in lxd package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Incomplete
Status in lxd source package in Bionic:
  New

Bug description:
  Testing failed on:
  arm64: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/arm64/l/lxd/20190626_154218_46720@/log.gz

  Some test cases have been flaky on arm64.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1834475/+subscriptions


[Kernel-packages] [Bug 1788314] [NEW] Conflict between zfs-linux and s390-tools

2018-08-21 Thread Stéphane Graber

Public bug reported:

Not sure which of the two needs fixing, but there's a path conflict
between zfs-linux and s390-tools which effectively prevents installing
ZFS on s390x in cosmic.

(Reading database ... 83042 files and directories currently installed.)
Preparing to unpack .../zfsutils-linux_0.7.9-3ubuntu5_s390x.deb ...
Unpacking zfsutils-linux (0.7.9-3ubuntu5) ...
dpkg: error processing archive /var/cache/apt/archives/zfsutils-linux_0.7.9-3ubuntu5_s390x.deb (--unpack):
 trying to overwrite '/usr/share/initramfs-tools/hooks/zdev', which is also in package s390-tools 2.6.0-0ubuntu2
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Errors were encountered while processing:
 /var/cache/apt/archives/zfsutils-linux_0.7.9-3ubuntu5_s390x.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
Exit request sent.
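
The conflicting path and the package that already owns it can be
extracted from such a log automatically. A minimal Python sketch,
assuming the dpkg wording shown above; `find_conflicts` and the regular
expression are illustrative only and not part of dpkg or apt:

```python
import re

# Matches dpkg's "trying to overwrite" message once wrapped lines are rejoined.
CONFLICT = re.compile(
    r"trying to overwrite '(?P<path>[^']+)', "
    r"which is also in package (?P<package>\S+) (?P<version>\S+)"
)


def find_conflicts(log: str) -> list:
    """Return one dict (path, package, version) per file conflict in the log."""
    flat = " ".join(log.split())  # undo line wrapping and collapse whitespace
    return [match.groupdict() for match in CONFLICT.finditer(flat)]


sample = (
    "dpkg: error processing archive \n"
    "/var/cache/apt/archives/zfsutils-linux_0.7.9-3ubuntu5_s390x.deb (--unpack):\n"
    " trying to overwrite '/usr/share/initramfs-tools/hooks/zdev', which is also in \n"
    "package s390-tools 2.6.0-0ubuntu2\n"
)
print(find_conflicts(sample))
```

Here the result points at /usr/share/initramfs-tools/hooks/zdev being
shipped by both packages.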

** Affects: s390-tools (Ubuntu)
 Importance: High
 Status: Triaged

** Affects: zfs-linux (Ubuntu)
 Importance: High
 Status: Triaged

** Also affects: s390-tools (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: s390-tools (Ubuntu)
   Status: New => Triaged

** Changed in: zfs-linux (Ubuntu)
   Status: New => Triaged

** Changed in: s390-tools (Ubuntu)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1788314

Title:
  Conflict between zfs-linux and s390-tools

Status in s390-tools package in Ubuntu:
  Triaged
Status in zfs-linux package in Ubuntu:
  Triaged


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/s390-tools/+bug/1788314/+subscriptions


[Kernel-packages] [Bug 1788314] Re: Conflict between zfs-linux and s390-tools

2018-08-22 Thread Stéphane Graber

Closing the zfs task as this will be fixed in s390-tools.

** Changed in: zfs-linux (Ubuntu)
   Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1788314

Title:
  Conflict between zfs-linux and s390-tools

Status in s390-tools package in Ubuntu:
  Confirmed
Status in zfs-linux package in Ubuntu:
  Invalid


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/s390-tools/+bug/1788314/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

[Kernel-packages] [Bug 1784501] Re: libvirtd is unable to configure bridge devices inside of LXD containers

2018-08-23 Thread Stéphane Graber

Were you maybe using a privileged container before? Those aren't
affected by the /sys ownership issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1784501

Title:
  libvirtd is unable to configure bridge devices inside of LXD
  containers

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Triaged

Bug description:
  libvirtd cannot properly configure the default bridge device when
  installed inside of unprivileged LXD containers. 'systemctl status
  libvirtd' shows the following error:

    error : virNetDevBridgeSet:140 : Unable to set bridge virbr0 forward_delay: Permission denied

  This happens because the files under /sys/class/net/ are owned by
  init namespace root rather than container root, even when the bridge
  device is created inside of the container. Here's an example from
  inside of an unprivileged container:

  # brctl addbr testbr0
  # ls -al /sys/class/net/testbr0/bridge/forward_delay
  -rw-r--r-- 1 nobody nogroup 4096 Jul 30 22:33 /sys/class/net/testbr0/bridge/forward_delay

  libvirt cannot open this file for writing even though it created the
  device. Where safe, files under /sys/class/net/ should be owned by
  container root.
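
  The ownership mismatch can be checked from inside the container with a
  plain stat() call. The following is a minimal sketch, not taken from
  libvirt or from the kernel patches; the helper name owned_by_caller is
  hypothetical. In an unprivileged container, a file owned by init
  namespace root has no mapping in the container's user namespace (which
  is why ls prints nobody/nogroup), so the check returns 0 even when the
  caller is container root.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` is owned by the calling
 * process's effective UID, 0 if it is owned by another UID (including
 * the overflow UID shown for unmapped owners), -1 if stat() fails. */
static int owned_by_caller(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return st.st_uid == geteuid();
}
```

  Calling owned_by_caller("/sys/class/net/testbr0/bridge/forward_delay")
  as container root would be expected to return 0 on an affected kernel.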

  The following upstream patches, which fix this bug, have been merged
  into linux-next:

  https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=c59e18b876da3e466abe5fa066aa69050f5be17c
  https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1753390274f7760e5b593cb657ea34f0617e559

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784501/+subscriptions

[Kernel-packages] [Bug 1789746] Re: getxattr: always handle namespaced attributes

2018-08-29 Thread Stéphane Graber

** Changed in: linux (Ubuntu)
   Status: Confirmed => Triaged

** Also affects: linux (Ubuntu Cosmic)
   Importance: High
   Status: Triaged

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => Triaged

** Changed in: linux (Ubuntu Xenial)
   Status: New => Triaged

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1789746

Title:
  getxattr: always handle namespaced attributes

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Triaged
Status in linux source package in Bionic:
  Triaged
Status in linux source package in Cosmic:
  Triaged

Bug description:
  Hey everyone,

  When running in a container with a user namespace, if you call getxattr
  with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
  silently skips the user namespace fixup that it normally does resulting in
  un-fixed-up data being returned.
  This is caused by posix_acl_fix_xattr_to_user() being passed the total
  buffer size and not the actual size of the xattr as returned by
  vfs_getxattr().
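
  The "size % 8 != 4" condition follows from the layout of the ACL
  xattr: a 4-byte header followed by 8-byte entries, so a valid value is
  always 4 + 8*n bytes long. The sketch below is a userspace
  illustration of that length check, modelled on the kernel's
  posix_acl_xattr_header and posix_acl_xattr_entry structures; the
  struct and function names used here are made up for the example and
  this is not the kernel code itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the xattr layout: 4-byte header, 8-byte entries. */
struct acl_xattr_header { uint32_t version; };
struct acl_xattr_entry  { uint16_t tag; uint16_t perm; uint32_t id; };

/* Returns how many ACL entries a value of `size` bytes holds,
 * or -1 if `size` is not of the form 4 + 8*n. */
static int acl_xattr_count(size_t size)
{
    if (size < sizeof(struct acl_xattr_header))
        return -1;
    size -= sizeof(struct acl_xattr_header);
    if (size % sizeof(struct acl_xattr_entry) != 0)
        return -1;
    return (int)(size / sizeof(struct acl_xattr_entry));
}
```

  Fed the buffer size instead of the attribute length, such a check
  rejects 128 (the fixup is skipped) but accepts 132, which is the
  difference the reproducer relies on.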

  I have pushed a commit upstream that fixes this bug:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=82c9a927bc5df6e06b72d206d24a9d10cced4eb5

  This commit passes the actual length of the xattr as returned by
  vfs_getxattr() down.

  A reproducer for the issue is:

    touch acl_posix
    setfacl -m user:0:rwx acl_posix

  and then compile:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/xattr.h>
    #include <unistd.h>

    /* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
    int main(int argc, char **argv)
    {
        ssize_t ret1, ret2;
        /* 128 % 8 != 4, so affected kernels skip the fixup; 132 % 8 == 4. */
        char buf1[128], buf2[132];
        int fret = EXIT_SUCCESS;
        char *file;

        if (argc < 2) {
            fprintf(stderr,
                    "Please specify a file with "
                    "\"system.posix_acl_access\" permissions set\n");
            _exit(EXIT_FAILURE);
        }
        file = argv[1];

        ret1 = getxattr(file, "system.posix_acl_access",
                        buf1, sizeof(buf1));
        if (ret1 < 0) {
            fprintf(stderr, "%s - Failed to retrieve "
                            "\"system.posix_acl_access\" "
                            "from \"%s\"\n", strerror(errno), file);
            _exit(EXIT_FAILURE);
        }

        ret2 = getxattr(file, "system.posix_acl_access",
                        buf2, sizeof(buf2));
        if (ret2 < 0) {
            fprintf(stderr, "%s - Failed to retrieve "
                            "\"system.posix_acl_access\" "
                            "from \"%s\"\n", strerror(errno), file);
            _exit(EXIT_FAILURE);
        }

        if (ret1 != ret2) {
            fprintf(stderr, "The value of \"system.posix_acl_"
                            "access\" for file \"%s\" changed "
                            "between two successive calls\n", file);
            _exit(EXIT_FAILURE);
        }

        /* Both calls read the same attribute, so every byte must match. */
        for (ssize_t i = 0; i < ret2; i++) {
            if (buf1[i] == buf2[i])
                continue;

            fprintf(stderr,
                    "Unexpected different in byte %zd: "
                    "%02x != %02x\n", i, buf1[i], buf2[i]);
            fret = EXIT_FAILURE;
        }

        if (fret == EXIT_SUCCESS)
            fprintf(stderr, "Test passed\n");
        else
            fprintf(stderr, "Test failed\n");

        _exit(fret);
    }

  and run:

    ./tester acl_posix

  On an unfixed kernel this should print something like:

    root@c1:/# ./t
    Unexpected different in byte 16: ffa0 != 00
    Unexpected different in byte 17: ff86 != 00
    Unexpected different in byte 18: 01 != 00

  and on a fixed kernel:

    root@c1:~# ./t
    Test passed

  
  Please backport this to the 4.15 (bionic) and 4.4 (xenial) kernels. :)

  Thanks!
  Christian

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1789746/+subscriptions

[Kernel-packages] [Bug 1760173] Re: zfs, zpool commands hangs for 10 seconds without a /dev/zfs

2018-06-05 Thread Stéphane Graber

Actually, LXC/LXD can't set environment variables in that way as systemd
strips all inherited environment.

Looking at the backlog it sounds like it'd be safe for us to just turn
off that timeout entirely in Ubuntu given that we can assume we'll
always have devtmpfs where it matters and so there's no potential race
between module loading and device node appearing.

** Changed in: lxd (Ubuntu)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1760173

Title:
  zfs, zpool commands hangs for 10 seconds without a /dev/zfs

Status in lxd package in Ubuntu:
  Invalid
Status in zfs-linux package in Ubuntu:
  Triaged

Bug description:
  1. # lsb_release -rd
  Description:  Ubuntu 16.04.4 LTS
  Release:  16.04

  2. # apt-cache policy zfsutils-linux
  zfsutils-linux:
    Installed: 0.6.5.6-0ubuntu19
    Candidate: 0.6.5.6-0ubuntu19
    Version table:
   *** 0.6.5.6-0ubuntu19 500
          500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
          100 /var/lib/dpkg/status

  3. Expected: when inside a lxd container with zfs storage, zfs list or
  zpool status should either return promptly or report what's going on.

  4. Actual: when inside a lxd container with zfs storage, zfs list or
  zpool status appears to hang, with no output for 10 seconds.

  strace reveals that without a /dev/zfs the tools wait up to 10 seconds
  for it to appear, and provide no command-line switch to disable the
  wait or make it more verbose.
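
  What strace shows is a bounded wait for a device node. As a rough
  illustration only (this is not the libzfs code; wait_for_node and its
  10 ms polling step are invented for this sketch), such a wait could
  look like the following. A timeout of 0 makes the lookup fail
  immediately instead of stalling.

```c
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll for `path` every 10 ms until it exists or
 * `timeout_ms` has elapsed. Returns 0 if the node exists, -1 otherwise. */
static int wait_for_node(const char *path, int timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited_ms = 0;

    for (;;) {
        if (access(path, F_OK) == 0)
            return 0;
        if (waited_ms >= timeout_ms)
            return -1;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
}
```

  With this shape, wait_for_node("/dev/zfs", 10000) would reproduce the
  10 second stall when the node never appears, while a timeout of 0
  returns at once.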

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: zfsutils-linux 0.6.5.6-0ubuntu19
  ProcVersionSignature: Ubuntu 4.13.0-36.40~16.04.1-generic 4.13.13
  Uname: Linux 4.13.0-36-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Fri Mar 30 18:09:29 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
  SourcePackage: zfs-linux
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1760173/+subscriptions

[Kernel-packages] [Bug 1760173] Re: zfs, zpool commands hangs for 10 seconds without a /dev/zfs

2018-06-07 Thread Stéphane Graber

I'm confused, how is this change going to work when the "container"
environment variable is only present in PID1's environment but not in
any of its descendants?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1760173

Title:
  zfs, zpool commands hangs for 10 seconds without a /dev/zfs

Status in lxd package in Ubuntu:
  Invalid
Status in zfs-linux package in Ubuntu:
  Triaged
Status in zfs-linux source package in Xenial:
  Fix Committed
Status in zfs-linux source package in Artful:
  Fix Committed
Status in zfs-linux source package in Bionic:
  Fix Committed
Status in zfs-linux source package in Cosmic:
  Triaged

Bug description:
  == SRU Justification, Xenial, Artful, Bionic ==

  When outside a lxd container with zfs storage, zfs list or zpool
  status either returns or reports what's going on.

  When inside a lxd container with zfs storage, zfs list or zpool status
  appears to hang, no output for 10 seconds.

  == Fix ==

  Inside a container we don't need the 10-second timeout, so check for
  this scenario and default the timeout to 0 seconds.

  == Regression Potential ==

  Minimal, this caters for a corner case inside a containerized
  environment, the fix will not alter the behaviour for other cases.

  -

  1. # lsb_release -rd
  Description:  Ubuntu 16.04.4 LTS
  Release:  16.04

  2. # apt-cache policy zfsutils-linux
  zfsutils-linux:
    Installed: 0.6.5.6-0ubuntu19
    Candidate: 0.6.5.6-0ubuntu19
    Version table:
   *** 0.6.5.6-0ubuntu19 500
          500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
          100 /var/lib/dpkg/status

  3. Expected: when inside a lxd container with zfs storage, zfs list or
  zpool status should either return promptly or report what's going on.

  4. Actual: when inside a lxd container with zfs storage, zfs list or
  zpool status appears to hang, with no output for 10 seconds.

  strace reveals that without a /dev/zfs the tools wait up to 10 seconds
  for it to appear, and provide no command-line switch to disable the
  wait or make it more verbose.

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: zfsutils-linux 0.6.5.6-0ubuntu19
  ProcVersionSignature: Ubuntu 4.13.0-36.40~16.04.1-generic 4.13.13
  Uname: Linux 4.13.0-36-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Fri Mar 30 18:09:29 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
  SourcePackage: zfs-linux
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1760173/+subscriptions

[Kernel-packages] [Bug 1760173] Re: zfs, zpool commands hangs for 10 seconds without a /dev/zfs

2018-06-07 Thread Stéphane Graber

That's because an attached process ("lxc-attach" or "lxc exec") isn't a
child of init, it's spawned directly by liblxc and so does have our env
variable set.

Any process which is a direct or indirect child of PID1 in the container
will be inheriting its environment through that path and as init systems
strip any inherited environment, the container env variable will not be
set for those.

So for example, sshing into your container will not have the env
variable set, same goes for any systemd unit.
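
To make the distinction concrete: the variable lives in PID 1's
environment block, which on Linux is exposed as NUL-separated KEY=VALUE
entries in /proc/1/environ (normally readable only with sufficient
privileges), while getenv("container") only sees the caller's own
environment. The lookup below is a minimal sketch of searching such a
block; environ_block_get is a hypothetical helper name, not part of
liblxc or zfs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: search a NUL-separated environment block (the
 * format of /proc/<pid>/environ) for `key`. Returns a pointer to the
 * value, or NULL if the key is absent. The caller must make sure the
 * block ends with a NUL byte so the returned string is terminated. */
static const char *environ_block_get(const char *buf, size_t len,
                                     const char *key)
{
    size_t klen = strlen(key);
    size_t off = 0;

    while (off < len) {
        const char *entry = buf + off;
        const char *end = memchr(entry, '\0', len - off);
        size_t elen = end ? (size_t)(end - entry) : len - off;

        /* Match "key=" exactly, so "containerX=..." does not match. */
        if (elen > klen && entry[klen] == '=' &&
            memcmp(entry, key, klen) == 0)
            return entry + klen + 1;
        off += elen + 1;
    }
    return NULL;
}
```

A process reached over ssh would find nothing via getenv("container"),
whereas the same lookup against the contents of /proc/1/environ would,
assuming it is allowed to read that file.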

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1760173

Title:
  zfs, zpool commands hangs for 10 seconds without a /dev/zfs

Status in lxd package in Ubuntu:
  Invalid
Status in zfs-linux package in Ubuntu:
  Triaged
Status in zfs-linux source package in Xenial:
  Fix Committed
Status in zfs-linux source package in Artful:
  Fix Committed
Status in zfs-linux source package in Bionic:
  Fix Committed
Status in zfs-linux source package in Cosmic:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1760173/+subscriptions

[Kernel-packages] [Bug 1760173] Re: zfs, zpool commands hangs for 10 seconds without a /dev/zfs

2018-06-07 Thread Stéphane Graber

Not really, no. You can use systemd-detect-virt, which is systemd
specific but should work as a regular user. Otherwise you can try to add
some specialized checks, such as looking at whether /dev in the mount
table is devtmpfs or not.
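
As a sketch of the mount table check (an illustration under stated
assumptions, not code from zfs or LXD), the glibc getmntent() interface
can walk /proc/self/mounts and report the filesystem type of whatever
is mounted on /dev. The function name dev_is_devtmpfs is hypothetical.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the last entry mounted on /dev in
 * `mounts_path` is devtmpfs, 0 if /dev is another filesystem or not a
 * mount point, -1 if the mount table cannot be opened. */
static int dev_is_devtmpfs(const char *mounts_path)
{
    FILE *f = setmntent(mounts_path, "r");
    struct mntent *m;
    int result = 0;

    if (!f)
        return -1;
    /* Later entries shadow earlier ones, so keep the last match. */
    while ((m = getmntent(f)) != NULL) {
        if (strcmp(m->mnt_dir, "/dev") == 0)
            result = strcmp(m->mnt_type, "devtmpfs") == 0;
    }
    endmntent(f);
    return result;
}
```

Calling dev_is_devtmpfs("/proc/self/mounts") works as a regular user
and does not depend on systemd being present.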

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1760173

Title:
  zfs, zpool commands hangs for 10 seconds without a /dev/zfs

Status in lxd package in Ubuntu:
  Invalid
Status in zfs-linux package in Ubuntu:
  Triaged
Status in zfs-linux source package in Xenial:
  Fix Committed
Status in zfs-linux source package in Artful:
  Fix Committed
Status in zfs-linux source package in Bionic:
  Fix Committed
Status in zfs-linux source package in Cosmic:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1760173/+subscriptions

[Kernel-packages] [Bug 1790521] Re: lxd 3.0.2-0ubuntu3 ADT test failure with linux 4.18.0-7.8

2018-09-04 Thread Stéphane Graber

The new liblxc has now migrated, so may be worth retrying.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1790521

Title:
  lxd 3.0.2-0ubuntu3 ADT test failure with linux 4.18.0-7.8

Status in linux package in Ubuntu:
  Incomplete
Status in lxd package in Ubuntu:
  Invalid
Status in linux source package in Cosmic:
  Incomplete
Status in lxd source package in Cosmic:
  Invalid

Bug description:
  Testing failed on:
  amd64: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-cosmic/cosmic/amd64/l/lxd/20180831_072844_6ea94@/log.gz
  arm64: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-cosmic/cosmic/arm64/l/lxd/20180831_074034_6ea94@/log.gz
  i386: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-cosmic/cosmic/i386/l/lxd/20180831_073216_6ea94@/log.gz
  ppc64el: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-cosmic/cosmic/ppc64el/l/lxd/20180831_072127_6ea94@/log.gz
  s390x: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-cosmic/cosmic/s390x/l/lxd/20180831_072401_6ea94@/log.gz

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1790521/+subscriptions

[Kernel-packages] [Bug 1780227] Re: locking sockets broken due to missing AppArmor socket mediation patches

2018-07-24 Thread Stéphane Graber

In preparation for an SRU, here is a minimal C testcase provided by
Wolfgang Bumiller:

```
/*
# apparmor_parser -r /etc/apparmor.d/bug-profile
# (tested without the flags here as well btw.)
profile bug-profile flags=(attach_disconnected,mediate_deleted) {
   network,
   file,
   unix,
}

# gcc this.c
# ./a.out
lock = 2 (Success)
# aa-exec -p bug-profile ./a.out
lock = 2 (Permission denied)

kernel: audit: type=1400 audit(1530774919.510:93): apparmor="DENIED" 
operation="file_lock" profile="bug-profile" pid=21788 comm="a.out" 
family="unix" sock_type="dgram" protocol=0 addr=none
*/

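#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/file.h>
#include <sys/socket.h>
#include <sys/types.h>

int
main(int argc, char **argv)
{
 int sp[2];
 if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sp) != 0) {
  perror("socketpair");
  exit(1);
 }
 /* flock() returns 0 on success and -1 on failure with errno set. */
 int rc = flock(sp[0], LOCK_EX);
 /* Pass rc explicitly; %m is a glibc extension that prints strerror(errno).
  * The "lock = 2" in the sample output above appears to predate this fix. */
 printf("lock = %i (%m)\n", rc);

 close(sp[0]);
 close(sp[1]);
 return 0;
}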
```

Another very easy way to reproduce the issue is to run "hostnamectl
status" inside a container which will hang as the systemd unit (socket
activated) will fail to trigger.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1780227

Title:
  locking sockets broken due to missing AppArmor socket mediation
  patches

Status in apparmor package in Ubuntu:
  Triaged
Status in linux package in Ubuntu:
  Invalid
Status in apparmor source package in Xenial:
  Triaged
Status in linux source package in Xenial:
  Invalid
Status in apparmor source package in Bionic:
  Triaged
Status in linux source package in Bionic:
  Invalid

Bug description:
  Hey,

  Newer systemd makes use of locks placed on AF_UNIX sockets created
  with the socketpair() syscall to synchronize various bits and pieces
  when isolating services. On kernels prior to 4.18 that have not had
  the AppArmor socket mediation patchset backported, these locks are
  denied with EACCES. This breaks systemd in LXC and LXD containers
  that do not run unconfined, which is a pretty big deal. We have seen
  various bug reports related to this. See for example [1] and [2].

  If feasible it would be excellent if we could backport the socket
  mediation patchset to all LTS kernels. Afaict, this should be 4.4 and
  4.15. This will unbreak a whole range of use-cases.

  The socket mediation patchset is available here:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=80a17a5f501ea048d86f81d629c94062b76610d4

  [1]: https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1575779
  [2]: https://github.com/systemd/systemd/issues/9493

  Thanks!
  Christian

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1780227/+subscriptions

[Kernel-packages] [Bug 1780227] Re: locking sockets broken due to missing AppArmor socket mediation patches

2018-07-24 Thread Stéphane Graber

Per discussion above:
 - Closing the kernel tasks
 - Raising priority on apparmor tasks to Critical (to match what kernel had)
 - Assigning to jjohansen as the AppArmor maintainer

As we care about xenial, bionic and cosmic, we need point releases (or 
cherry-pick) for:
 - AppArmor 2.10 (2.10.95 in xenial)
 - AppArmor 2.12 (2.12 in bionic and cosmic)

John: Any ETA for those two point releases or pointer to a commit which
we could SRU on its own?

For now our focus is obviously on getting this resolved in Ubuntu as
soon as possible, since it's breaking a number of systemd services that
are now (18.04) shipping with more confinement than in the past. The
same issue is also currently preventing us from starting newer Fedora
and Arch containers on Ubuntu.

Our standard response so far has been to tell users to turn off AppArmor
for those containers, but it's obviously not an answer we like to give
(I'm sure you'll agree).

** Changed in: linux (Ubuntu)
   Status: Triaged => Invalid

** Changed in: linux (Ubuntu Xenial)
   Status: Triaged => Invalid

** Changed in: linux (Ubuntu Bionic)
   Status: Triaged => Invalid

** Changed in: apparmor (Ubuntu)
   Status: New => Triaged

** Changed in: apparmor (Ubuntu Xenial)
   Status: New => Triaged

** Changed in: apparmor (Ubuntu Bionic)
   Status: New => Triaged

** Changed in: apparmor (Ubuntu)
   Importance: Undecided => Critical

** Changed in: apparmor (Ubuntu Xenial)
   Importance: Undecided => Critical

** Changed in: apparmor (Ubuntu Bionic)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu)
   Importance: Critical => Undecided

** Changed in: linux (Ubuntu Xenial)
   Importance: High => Undecided

** Changed in: linux (Ubuntu Bionic)
   Importance: High => Undecided

** Changed in: apparmor (Ubuntu)
 Assignee: (unassigned) => John Johansen (jjohansen)

** Changed in: apparmor (Ubuntu Xenial)
 Assignee: (unassigned) => John Johansen (jjohansen)

** Changed in: apparmor (Ubuntu Bionic)
 Assignee: (unassigned) => John Johansen (jjohansen)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1780227

Title:
  locking sockets broken due to missing AppArmor socket mediation
  patches

Status in apparmor package in Ubuntu:
  Triaged
Status in linux package in Ubuntu:
  Invalid
Status in apparmor source package in Xenial:
  Triaged
Status in linux source package in Xenial:
  Invalid
Status in apparmor source package in Bionic:
  Triaged
Status in linux source package in Bionic:
  Invalid

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1780227/+subscriptions

[Kernel-packages] [Bug 1780227] Re: locking sockets broken due to missing AppArmor socket mediation patches

2018-07-26 Thread Stéphane Graber

@John any update on the point releases?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1780227

Title:
  locking sockets broken due to missing AppArmor socket mediation
  patches

Status in apparmor package in Ubuntu:
  Triaged
Status in linux package in Ubuntu:
  Invalid
Status in apparmor source package in Xenial:
  Triaged
Status in linux source package in Xenial:
  Invalid
Status in apparmor source package in Bionic:
  Triaged
Status in linux source package in Bionic:
  Invalid

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1780227/+subscriptions

[Kernel-packages] [Bug 1780227] Re: locking sockets broken due to missing AppArmor socket mediation patches

2018-07-27 Thread Stéphane Graber

Ok, thanks for the update. I've now updated the bug once again to move
all the tasks over to the kernel. Could you attach the kernel patch here
when you can? I'm sure some of the subscribers may want to test this
ahead of the Ubuntu kernel fixes :)

** Changed in: linux (Ubuntu)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu)
   Status: Invalid => Triaged

** Changed in: linux (Ubuntu Xenial)
   Status: Invalid => Triaged

** Changed in: linux (Ubuntu Bionic)
   Status: Invalid => Triaged

** Changed in: apparmor (Ubuntu)
   Status: Triaged => Invalid

** Changed in: apparmor (Ubuntu Xenial)
   Status: Triaged => Invalid

** Changed in: apparmor (Ubuntu Bionic)
   Status: Triaged => Invalid

** Changed in: apparmor (Ubuntu)
 Assignee: John Johansen (jjohansen) => (unassigned)

** Changed in: apparmor (Ubuntu Xenial)
 Assignee: John Johansen (jjohansen) => (unassigned)

** Changed in: apparmor (Ubuntu Bionic)
 Assignee: John Johansen (jjohansen) => (unassigned)

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => John Johansen (jjohansen)

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => John Johansen (jjohansen)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => John Johansen (jjohansen)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1780227

Title:
  locking sockets broken due to missing AppArmor socket mediation
  patches

Status in apparmor package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Triaged
Status in apparmor source package in Xenial:
  Invalid
Status in linux source package in Xenial:
  Triaged
Status in apparmor source package in Bionic:
  Invalid
Status in linux source package in Bionic:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1780227/+subscriptions


[Kernel-packages] [Bug 1780227] Re: locking sockets broken due to missing AppArmor socket mediation patches

2018-07-30 Thread Stéphane Graber

I tested on two systems, one clean xenial and one clean bionic, both
running the current stable LXD snap with latest ArchLinux and Debian
containers. On both of them, upgrading to the kernels provided by John
fixed the file_lock denials and made the containers boot again.

So as far as I'm concerned, we're good to start pushing this to Ubuntu
kernels.


[Kernel-packages] [Bug 1778286] Re: Backport namespaced fscaps to xenial 4.4

2018-08-05 Thread Stéphane Graber

Installing the LXD snap from edge channel (for fscaps support), on the
current 4.4 kernel:

root@djanet:~# lxc launch ubuntu-daily:cosmic c1
To start your first container, try: lxc launch ubuntu:18.04

Creating c1
Starting c1
root@djanet:~# lxc exec c1 -- setcap cap_net_raw+ep /usr/bin/mtr-packet
Failed to set capabilities on file `/usr/bin/mtr-packet' (Operation not permitted)
The value of the capability argument is not permitted for a file. Or the file is not a regular (non-symlink) file

As expected on that kernel, the caps were lost when the container got
uid shifted and manually setting the caps from within the container
fails.


After switching to 4.4.0-132:

root@djanet:~# lxc exec c1 -- setcap cap_net_raw+ep /usr/bin/mtr-packet
root@djanet:~# lxc exec c1 -- getcap /usr/bin/mtr-packet
/usr/bin/mtr-packet = cap_net_raw+ep

** Tags removed: verification-needed-xenial
** Tags added: verification-done

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1778286

Title:
  Backport namespaced fscaps to xenial 4.4

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Committed

Bug description:
  SRU Justification

  Impact: Support for using filesystem capabilities in unprivileged user
  namespaces was added upstream in Linux 4.14. This is a useful feature
  that allows unprivileged containers to set fscaps that are valid only
  in user namespaces where a specific kuid is mapped to root. This
  allows for e.g. support for Linux distros within lxd which make use of
  filesystem capabilities.

  Fix: Backport upstream commit 8db6c34f1dbc "Introduce v3 namespaced
  file capabilities" and any subsequent fixes to xenial 4.4.

  Test Case: Test use of fscaps within a lxd container.

  Regression Potential: This has been upstream since 4.14 (and thus is
  present in bionic), and the backport to xenial 4.4 was
  straightforward, so regression potential is low.
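
  To illustrate what "v3 namespaced file capabilities" adds, here is a
  rough sketch that decodes a raw security.capability xattr value. The
  constants and layout are taken from my reading of the kernel's uapi
  header linux/capability.h; parse_fscaps and the sample value (with a
  rootid of 100000) are hypothetical and not from this bug.

```python
import struct

# Constants as defined in the kernel's uapi linux/capability.h.
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_2 = 0x02000000
VFS_CAP_REVISION_3 = 0x03000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001
CAP_NET_RAW = 13


def parse_fscaps(raw):
    """Decode a v2 (20 byte) or v3 (24 byte) security.capability value."""
    magic_etc = struct.unpack_from("<I", raw, 0)[0]
    revision = magic_etc & VFS_CAP_REVISION_MASK
    if revision == VFS_CAP_REVISION_2 and len(raw) == 20:
        rootid = None  # v2 carries no namespace root id
    elif revision == VFS_CAP_REVISION_3 and len(raw) == 24:
        # v3 appends the kuid that must be mapped to root for the caps to apply.
        rootid = struct.unpack_from("<I", raw, 20)[0]
    else:
        raise ValueError("unsupported fscaps revision or size")
    perm_lo, inh_lo, perm_hi, inh_hi = struct.unpack_from("<4I", raw, 4)
    return {
        "version": revision >> 24,
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": perm_lo | (perm_hi << 32),
        "inheritable": inh_lo | (inh_hi << 32),
        "rootid": rootid,
    }


# Hypothetical v3 value: cap_net_raw+ep, valid where kuid 100000 is root.
sample = struct.pack(
    "<6I",
    VFS_CAP_REVISION_3 | VFS_CAP_FLAGS_EFFECTIVE,
    1 << CAP_NET_RAW, 0, 0, 0,
    100000,
)
caps = parse_fscaps(sample)
```

  Note that reading the xattr from inside the container may show a v2
  value, since the kernel can translate v3 data for the namespace whose
  root matches the stored rootid.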

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1778286/+subscriptions


[Kernel-packages] [Bug 1778286] Re: Backport namespaced fscaps to xenial 4.4

2018-08-05 Thread Stéphane Graber

** Tags added: verification-done-xenial


[Kernel-packages] [Bug 1784501] Re: libvirtd is unable to configure bridge devices inside of LXD containers

2018-08-10 Thread Stéphane Graber

Adding a task for bionic as we'll want this fix to be available for our
18.04 users. No need to backport it to anything older than that though.

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => Triaged

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1784501

Title:
  libvirtd is unable to configure bridge devices inside of LXD
  containers

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Triaged

Bug description:
  libvirtd cannot properly configure the default bridge device when
  installed inside of unprivileged LXD containers. 'systemctl status
  libvirtd' shows the following error:

  error : virNetDevBridgeSet:140 : Unable to set bridge virbr0 forward_delay: Permission denied

  This is caused by the files under /sys/class/net/ being owned by
  init namespace root rather than container root even when the bridge
  device is created inside of the container. Here's an example from
  inside of an unprivileged container:

  # brctl addbr testbr0
  # ls -al /sys/class/net/testbr0/bridge/forward_delay
  -rw-r--r-- 1 nobody nogroup 4096 Jul 30 22:33 /sys/class/net/testbr0/bridge/forward_delay

  libvirt cannot open this file for writing even though it created the
  device. Where safe, files under /sys/class/net/ should be owned by
  container root.
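
  The same check can be sketched in a few lines of Python. It assumes
  that IDs with no mapping in the container show up as the kernel's
  overflow uid/gid (normally 65534, which is what "nobody nogroup"
  above suggests). The helper names are hypothetical.

```python
import os


def read_overflow_id(path, default=65534):
    """Read an overflow ID sysctl, falling back to the usual default."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return default


def is_unmapped(uid, gid, overflow_uid, overflow_gid):
    """True when either ID equals the overflow ID shown for unmapped owners."""
    return uid == overflow_uid or gid == overflow_gid


def sysfs_owner_unmapped(path):
    """Stat a file and report whether its owner looks unmapped."""
    st = os.stat(path)
    return is_unmapped(
        st.st_uid,
        st.st_gid,
        read_overflow_id("/proc/sys/kernel/overflowuid"),
        read_overflow_id("/proc/sys/kernel/overflowgid"),
    )


# Example (inside the container, after creating the bridge):
# sysfs_owner_unmapped("/sys/class/net/testbr0/bridge/forward_delay")
```

  A file that is genuinely owned by the nobody user would also match,
  so treat a True result as a hint rather than proof.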

  The following upstream patches, which fix this bug, have been merged
  into linux-next:

  https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=c59e18b876da3e466abe5fa066aa69050f5be17c
  https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1753390274f7760e5b593cb657ea34f0617e559

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784501/+subscriptions


[Kernel-packages] [Bug 1799497] [NEW] 4.15 kernel hard lockup about once a week

2018-10-23 Thread Stéphane Graber

Public bug reported:

My main server has been running into hard lockups about once a week ever
since I switched to the 4.15 Ubuntu 18.04 kernel.

When this happens, nothing is printed to the console, it's effectively
stuck showing a login prompt. The system is running with panic=1 on the
cmdline but isn't rebooting so the kernel isn't even processing this as
a kernel panic.
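
For reference, a small sketch of how the relevant boot parameters can be
pulled out of a kernel command line. The sample string is the
ProcKernelCmdLine value apport reported for this machine; parse_cmdline
is a hypothetical helper, and the naive whitespace split is an
assumption that ignores quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: [values]}.

    Flags without "=" map to [None]. Repeated names (such as console=)
    keep every value in order. Quoted values with spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


SAMPLE = (
    "BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic "
    "root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 "
    "net.ifnames=0 panic=1 verbose console=tty0 console=ttyS0,115200n8"
)

params = parse_cmdline(SAMPLE)
# panic=1 only asks for a reboot one second after a panic has been
# declared; it does nothing if the kernel never reaches panic().
panic_timeout = int(params["panic"][0])
consoles = params["console"]
```

On a live system the same function can be fed the contents of
/proc/cmdline.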


As this felt like a potential hardware issue, I had my hosting provider
give me a completely different system: different motherboard, CPU, RAM
and storage. I installed 18.04 on that system and moved my data over.
A week later, I hit the issue again.

We've since also had a LXD user report similar symptoms, also on varying
hardware:
  https://github.com/lxc/lxd/issues/5197


My system doesn't have a lot of memory pressure with about 50% of free memory:

root@vorash:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          31819       17574         402         513       13842       13292
Swap:         15909        2687       13222

I will now try to increase console logging as much as possible on the
system in the hopes that next time it hangs we can get a better idea of
what happened but I'm not too hopeful given the complete silence on the
console when this occurs.
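
As a rough illustration of what "console logging" is governed by, the
sketch below decodes the four values in /proc/sys/kernel/printk and
checks whether a message of a given level would reach the console. The
helper names and the sample values are assumptions for illustration;
they are not readings from this server.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    console: int          # messages with a lower level reach the console
    default_message: int  # level given to messages without an explicit one
    minimum_console: int  # lowest value console may be set to
    default_console: int  # boot-time default for console


def parse_printk(text):
    """Parse the contents of /proc/sys/kernel/printk."""
    fields = [int(part) for part in text.split()]
    if len(fields) != 4:
        raise ValueError("expected four integers")
    return PrintkLevels(*fields)


def console_shows(levels, message_level):
    """True when a message of this level (0=emerg .. 7=debug) is printed."""
    return message_level < levels.console


quiet = parse_printk("4\t4\t1\t7\n")    # hypothetical default-looking values
verbose = parse_printk("8\t4\t1\t7\n")  # console level raised to the maximum
```

With the console level at 8, even debug messages (level 7) are sent to
the console, which is about as much as the loglevel alone can give.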

System is currently on:
  Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

But I've seen this since the GA kernel on 4.15 so it's not a recent
regression.

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week

2018-10-23 Thread Stéphane Graber

Oh and whatever kernel I boot needs to have support for ZFS 0.7 or I
won't be able to read my drives.

** Tags added: apport-collected

** Description changed:

+ --- 
+ ProblemType: Bug
+ AlsaDevices:
+  total 0
+  crw-rw 1 root audio 116,  1 Oct 23 16:12 seq
+  crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
+ AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
+ ApportVersion: 2.20.9-0ubuntu7.4
+ Architecture: amd64
+ ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
+ AudioDevicesInUse:
+  Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
+  Cannot stat file /proc/22831/fd/10: Permission denied
+ DistroRelease: Ubuntu 18.04
+ HibernationDevice:
+  RESUME=none
+  CRYPTSETUP=n
+ IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
+ Lsusb:
+  Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
+  Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard and Mouse
+  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+ MachineType: Intel Corporation S1200SP
+ NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
+ Package: linux (not installed)
+ PciMultimedia:
+  
+ ProcEnviron:
+  TERM=xterm
+  PATH=(custom, no user)
+  XDG_RUNTIME_DIR=
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcFB: 0 mgadrmfb
+ ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 panic=1 verbose console=tty0 console=ttyS0,115200n8
+ ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
+ RelatedPackageVersions:
+  linux-restricted-modules-4.15.0-38-generic N/A
+  linux-backports-modules-4.15.0-38-generic  N/A
+  linux-firmware 1.173.1
+ RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
+ Tags:  bionic
+ Uname: Linux 4.15.0-38-generic x86_64
+ UnreportableReason: This report is about a package that is not installed.
+ UpgradeStatus: No upgrade log present (probably fresh install)
+ UserGroups:
+  
+ _MarkForUpload: False
+ dmi.bios.date: 01/25/2018
+ dmi.bios.vendor: Intel Corporation
+ dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
+ dmi.board.asset.tag: Base Board Asset Tag
+ dmi.board.name: S1200SP
+ dmi.board.vendor: Intel Corporation
+ dmi.board.version: H57532-271
+ dmi.chassis.asset.tag: 
+ dmi.chassis.type: 23
+ dmi.chassis.vendor: ...
+ dmi.chassis.version: ..
+ dmi.modalias: dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...:ct23:cvr..:
+ dmi.product.family: Family
+ dmi.product.name: S1200SP
+ dmi.product.version: 
+ dmi.sys.vendor: Intel Corporation

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in

[Kernel-packages] [Bug 1799497] CRDA.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "CRDA.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204632/+files/CRDA.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete


[Kernel-packages] [Bug 1799497] ProcCpuinfoMinimal.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204635/+files/ProcCpuinfoMinimal.txt


[Kernel-packages] [Bug 1799497] Lspci.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204634/+files/Lspci.txt


[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week

2018-10-23 Thread Stéphane Graber

Well, kinda, this is a production server running a lot of publicly
visible services, so I can run test kernels on it so long as they don't
regress system security.

There's also the unfortunate problem that it takes over a week for me to
see the problem in most cases and that my last known good kernel was the
latest 4.4 kernel from xenial...

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console, it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting so the kernel isn't even processing
  this as a kernel panic.

  
  As this felt like a potential hardware issue, I had my hosting
  provider give me a completely different system: different
  motherboard, different CPU, different RAM and different storage. I
  installed 18.04 on that system and moved my data over. A week later,
  I hit the issue again.

  We've since also had an LXD user report similar symptoms, also on
  varying hardware:
  https://github.com/lxc/lxd/issues/5197

  
  My system doesn't have a lot of memory pressure with about 50% of free memory:

  root@vorash:~# free -m
                  total        used        free      shared  buff/cache   available
  Mem:            31819       17574         402         513       13842       13292
  Swap:           15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
  Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15 so it's not a recent
  regression.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Oct 23 16:12 seq
   crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic  N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...
  dmi.chassis.version: ..
  dmi.modalias: dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...:ct23:cvr..:
  dmi.product.f

[Kernel-packages] [Bug 1799497] CurrentDmesg.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "CurrentDmesg.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204633/+files/CurrentDmesg.txt


[Kernel-packages] [Bug 1799497] UdevDb.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204638/+files/UdevDb.txt


[Kernel-packages] [Bug 1799497] ProcInterrupts.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "ProcInterrupts.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204636/+files/ProcInterrupts.txt


[Kernel-packages] [Bug 1799497] ProcModules.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "ProcModules.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204637/+files/ProcModules.txt


[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week

2018-10-23 Thread Stéphane Graber

Note that I've deleted the WifiSyslog and CurrentDmesg attachments:
they only cover the current boot, so they're not relevant, and they
included information that I'd rather not have exposed publicly.


[Kernel-packages] [Bug 1799497] WifiSyslog.txt

2018-10-23 Thread Stéphane Graber

apport information

** Attachment added: "WifiSyslog.txt"
   https://bugs.launchpad.net/bugs/1799497/+attachment/5204639/+files/WifiSyslog.txt

** Attachment removed: "CurrentDmesg.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799497/+attachment/5204633/+files/CurrentDmesg.txt

** Attachment removed: "WifiSyslog.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799497/+attachment/5204639/+files/WifiSyslog.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console, it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting so the kernel isn't even processing
  this as a kernel panic.

  
  As this felt like a potential hardware issue, I had my hosting
  provider give me a completely different system: different
  motherboard, different CPU, different RAM and different storage. I
  installed 18.04 on that system and moved my data over. A week later,
  I hit the issue again.

  We've since also had an LXD user report similar symptoms, also on
  varying hardware:
  https://github.com/lxc/lxd/issues/5197

  
  My system doesn't have a lot of memory pressure with about 50% of free memory:

  root@vorash:~# free -m
                  total        used        free      shared  buff/cache   available
  Mem:            31819       17574         402         513       13842       13292
  Swap:           15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
  Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15 so it's not a recent
  regression.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Oct 23 16:12 seq
   crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with 
exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard 
and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic 
root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 
panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic  N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...
  dmi.chassis.version: ..
  dmi.modalias: 
dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr:rvnIntelCorporation:rn

[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week

2018-10-24 Thread Stéphane Graber

The server doesn't respond to pings when locked up.

I do have IPMI and console redirection going for my server and have
enabled all sysrq now though it's unclear whether I can send those
through the BMC yet (as just typing them would obviously send them to my
laptop...).

I've set up a debug console both to screen and to IPMI, raised the kernel
log level to 9, set up the NMI watchdog, enabled panic on oops and panic
on hard lockup, and disabled reboot on panic, so maybe I'll get lucky with
the next hang and get some output on console, though that'd be a first...
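
For reference, a minimal sketch of what such a setup might look like as
sysctl settings. This is not the exact configuration from the server: the
function name is made up for this illustration, and the last three
kernel.printk fields are assumed common defaults (only the console log
level of 9 comes from the message above).

```shell
# Print sysctl.d-style settings for debugging a silent lockup.
# Assumption: only the first kernel.printk field (console log level 9) is
# taken from the message; the remaining three are common default values.
lockup_debug_sysctls() {
    cat <<'EOF'
kernel.sysrq = 1
kernel.printk = 9 4 1 7
kernel.nmi_watchdog = 1
kernel.panic_on_oops = 1
kernel.hardlockup_panic = 1
kernel.panic = 0
EOF
}

lockup_debug_sysctls
```

The output could be redirected to a file such as
/etc/sysctl.d/90-lockup-debug.conf (a hypothetical name) and loaded with
"sysctl --system" as root. kernel.panic = 0 keeps the machine halted after a
panic so that the console output stays visible.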

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console, it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting so the kernel isn't even processing
  this as a kernel panic.

  
  As this felt like a potential hardware issue, I had my hosting provider give
  me a completely different system: different motherboard, different CPU,
  different RAM and different storage. I installed 18.04 on that system and
  moved my data over. A week later, I hit the issue again.

  We've since also had a LXD user reporting similar symptoms, also on
  varying hardware:
  https://github.com/lxc/lxd/issues/5197

  
  My system doesn't have a lot of memory pressure with about 50% of free memory:

  root@vorash:~# free -m
                total        used        free      shared  buff/cache   available
  Mem:          31819       17574         402         513       13842       13292
  Swap:         15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
    Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15, so it's not a recent
  regression.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Oct 23 16:12 seq
   crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with 
exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard 
and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic 
root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 
panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic  N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...
  dmi.chassis.version: ..
  dmi.modalias: 
dmi:bvnIntelCorporation:bvrS1200SP.8

[Kernel-packages] [Bug 1789746] Re: getxattr: always handle namespaced attributes

2018-10-02 Thread Stéphane Graber

** Changed in: linux (Ubuntu Cosmic)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1789746

Title:
  getxattr: always handle namespaced attributes

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Fix Released

Bug description:
  
  == SRU Justification ==
  When running in a container with a user namespace, if you call getxattr
  with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
  silently skips the user namespace fixup that it normally does resulting in
  un-fixed-up data being returned.
  This is caused by posix_acl_fix_xattr_to_user() being passed the total
  buffer size and not the actual size of the xattr as returned by
  vfs_getxattr().
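
  To illustrate why the buffer size matters, here is a minimal shell sketch
  of the size check, assuming the ACL xattr layout of a 4-byte header
  followed by 8-byte entries. The function name acl_xattr_count is made up
  for this illustration; it mirrors the arithmetic only, not the kernel code
  itself.

```shell
# Assumed layout of a POSIX ACL xattr: 4-byte version header, 8-byte entries.
ACL_HEADER_SIZE=4
ACL_ENTRY_SIZE=8

# Print how many ACL entries a blob of the given size would hold,
# or -1 if the size cannot describe a valid ACL xattr.
acl_xattr_count() {
    xattr_size=$1
    if [ "$xattr_size" -lt "$ACL_HEADER_SIZE" ]; then
        echo -1
        return
    fi
    payload=$((xattr_size - ACL_HEADER_SIZE))
    if [ $((payload % ACL_ENTRY_SIZE)) -ne 0 ]; then
        echo -1
        return
    fi
    echo $((payload / ACL_ENTRY_SIZE))
}

# The two buffer sizes used by the reproducer in this report.
for buf_size in 128 132; do
    echo "size=$buf_size -> entries=$(acl_xattr_count "$buf_size")"
done
```

  With the buffer size passed in place of the real xattr length, a 128-byte
  buffer fails the check (124 is not a multiple of 8), so the fixup is
  skipped, while a 132-byte buffer passes it. This matches the
  "size % 8 != 4" condition described above.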

  I have pushed a commit upstream that fixes this bug:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=82c9a927bc5df6e06b72d206d24a9d10cced4eb5

  This commit passes the actual length of the xattr as returned by
  vfs_getxattr() down.

  A reproducer for the issue is:

    touch acl_posix

    setfacl -m user:0:rwx acl_posix

  and then compile:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/xattr.h>
    #include <unistd.h>

    /* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
    int main(int argc, char **argv)
    {
        ssize_t ret1, ret2;
        /* Two buffer sizes: 128 skips the fixup on a buggy kernel, 132 does not. */
        char buf1[128], buf2[132];
        int fret = EXIT_SUCCESS;
        char *file;

        if (argc < 2) {
            fprintf(stderr,
                    "Please specify a file with "
                    "\"system.posix_acl_access\" permissions set\n");
            _exit(EXIT_FAILURE);
        }
        file = argv[1];

        ret1 = getxattr(file, "system.posix_acl_access",
                        buf1, sizeof(buf1));
        if (ret1 < 0) {
            fprintf(stderr, "%s - Failed to retrieve "
                    "\"system.posix_acl_access\" "
                    "from \"%s\"\n", strerror(errno), file);
            _exit(EXIT_FAILURE);
        }

        ret2 = getxattr(file, "system.posix_acl_access",
                        buf2, sizeof(buf2));
        if (ret2 < 0) {
            fprintf(stderr, "%s - Failed to retrieve "
                    "\"system.posix_acl_access\" "
                    "from \"%s\"\n", strerror(errno), file);
            _exit(EXIT_FAILURE);
        }

        if (ret1 != ret2) {
            fprintf(stderr, "The value of \"system.posix_acl_"
                    "access\" for file \"%s\" changed "
                    "between two successive calls\n", file);
            _exit(EXIT_FAILURE);
        }

        /* Both calls must return identical bytes regardless of buffer size. */
        for (ssize_t i = 0; i < ret2; i++) {
            if (buf1[i] == buf2[i])
                continue;

            fprintf(stderr,
                    "Unexpected different in byte %zd: "
                    "%02x != %02x\n", i, buf1[i], buf2[i]);
            fret = EXIT_FAILURE;
        }

        if (fret == EXIT_SUCCESS)
            fprintf(stderr, "Test passed\n");
        else
            fprintf(stderr, "Test failed\n");

        _exit(fret);
    }

  and run:

    ./tester acl_posix

  On a non-fixed up kernel this should return something like:

    root@c1:/# ./t
    Unexpected different in byte 16: ffa0 != 00
    Unexpected different in byte 17: ff86 != 00
    Unexpected different in byte 18: 01 != 00

  and on a fixed kernel:

    root@c1:~# ./t
    Test passed

  
  == Fix ==
  82c9a927bc5d ("getxattr: use correct xattr length")

  == Regression Potential ==
  Low.  One liner that passes the actual length of the xattr as returned by
  vfs_getxattr() down.

  == Test Case ==
  A test kernel was built with this patch and tested by the original bug
  reporter.
  The bug reporter states the test kernel resolved the bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1789746/+subscriptions

[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week

2018-10-31 Thread Stéphane Graber

Just happened again, though the machine wouldn't reboot at all
afterwards, leading to the hosting provider going for a motherboard
replacement, so I guess better luck next week with debugging this.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console, it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting so the kernel isn't even processing
  this as a kernel panic.

  
  As this felt like a potential hardware issue, I had my hosting provider give
  me a completely different system: different motherboard, different CPU,
  different RAM and different storage. I installed 18.04 on that system and
  moved my data over. A week later, I hit the issue again.

  We've since also had a LXD user reporting similar symptoms, also on
  varying hardware:
  https://github.com/lxc/lxd/issues/5197

  
  My system doesn't have a lot of memory pressure with about 50% of free memory:

  root@vorash:~# free -m
                total        used        free      shared  buff/cache   available
  Mem:          31819       17574         402         513       13842       13292
  Swap:         15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
    Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15, so it's not a recent
  regression.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Oct 23 16:12 seq
   crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with 
exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard 
and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic 
root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 
panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic  N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...
  dmi.chassis.version: ..
  dmi.modalias: 
dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...:ct23:cvr..:
  dmi.product.family: Family
  dmi.product.name: S1200SP
  dmi.product.version: 
  dmi.sys.vendor: Intel Corporation

To manage notifications

[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week

2018-11-01 Thread Stéphane Graber

Oh, I am also using zram-config on the affected machine.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console, it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting so the kernel isn't even processing
  this as a kernel panic.

  
  As this felt like a potential hardware issue, I had my hosting provider give
  me a completely different system: different motherboard, different CPU,
  different RAM and different storage. I installed 18.04 on that system and
  moved my data over. A week later, I hit the issue again.

  We've since also had a LXD user reporting similar symptoms, also on
  varying hardware:
  https://github.com/lxc/lxd/issues/5197

  
  My system doesn't have a lot of memory pressure with about 50% of free memory:

  root@vorash:~# free -m
                total        used        free      shared  buff/cache   available
  Mem:          31819       17574         402         513       13842       13292
  Swap:         15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
    Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15, so it's not a recent
  regression.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Oct 23 16:12 seq
   crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with 
exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard 
and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic 
root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 
panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic  N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...
  dmi.chassis.version: ..
  dmi.modalias: 
dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...:ct23:cvr..:
  dmi.product.family: Family
  dmi.product.name: S1200SP
  dmi.product.version: 
  dmi.sys.vendor: Intel Corporation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799497/+subscriptions

-- 
Mailing list: https://launchpad.net/~

[Kernel-packages] [Bug 1788314] Update Released

2018-11-05 Thread Stéphane Graber

The verification of the Stable Release Update for lxd has completed
successfully and the package has now been released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1788314

Title:
  Conflict between zfs-linux and s390-tools

Status in s390-tools package in Ubuntu:
  Fix Released
Status in zfs-linux package in Ubuntu:
  Invalid

Bug description:
  Not sure which of the two needs fixing, but there's a path conflict
  between zfs-linux and s390-tools which effectively prevents installing
  ZFS on s390x in cosmic.

  (Reading database ... 83042 files and directories currently installed.)
  Preparing to unpack .../zfsutils-linux_0.7.9-3ubuntu5_s390x.deb ...
  Unpacking zfsutils-linux (0.7.9-3ubuntu5) ...
  dpkg: error processing archive /var/cache/apt/archives/zfsutils-linux_0.7.9-3ubuntu5_s390x.deb (--unpack):
   trying to overwrite '/usr/share/initramfs-tools/hooks/zdev', which is also in package s390-tools 2.6.0-0ubuntu2
  dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
  Errors were encountered while processing:
   /var/cache/apt/archives/zfsutils-linux_0.7.9-3ubuntu5_s390x.deb
  E: Sub-process /usr/bin/dpkg returned an error code (1)
  Exit request sent.
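
  When triaging this kind of conflict, it can help to pull the path and the
  owning package out of the dpkg output. Below is a minimal sketch, assuming
  the error message format shown above; the function name is made up for
  this illustration.

```shell
# Read dpkg output on stdin and print "<path> <package> <version>" for each
# "trying to overwrite" error. Assumes the message format shown in this report.
dpkg_overwrite_conflict() {
    sed -n "s/.*trying to overwrite '\([^']*\)', which is also in package \([^ ]*\) \([^ ]*\).*/\1 \2 \3/p"
}

# Example using the error line from this report.
printf '%s\n' " trying to overwrite '/usr/share/initramfs-tools/hooks/zdev', which is also in package s390-tools 2.6.0-0ubuntu2" \
    | dpkg_overwrite_conflict
```

  On an affected system, "dpkg -S /usr/share/initramfs-tools/hooks/zdev"
  would then confirm which installed package currently owns the file.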

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/s390-tools/+bug/1788314/+subscriptions

[Kernel-packages] [Bug 1624540] Re: please have lxd recommend zfs

2018-11-06 Thread Stéphane Graber

Marking the LXD side of this fixed as we're now shipping as a snap by
default and the snap contains zfs.

** Changed in: lxd (Ubuntu)
   Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1624540

Title:
  please have lxd recommend zfs

Status in lxd package in Ubuntu:
  Fix Released
Status in zfs-linux package in Ubuntu:
  In Progress

Bug description:
  Since ZFS is now in Main (Bug #1532198), LXD should recommend the ZFS
  userspace package, such that 'sudo lxd init' just works.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1624540/+subscriptions

[Kernel-packages] [Bug 1789746] Re: getxattr: always handle namespaced attributes

2018-11-08 Thread Stéphane Graber

** Changed in: linux (Ubuntu Xenial)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1789746

Title:
  getxattr: always handle namespaced attributes

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Cosmic:
  Fix Released

Bug description:
  
  == SRU Justification ==
  When running in a container with a user namespace, if you call getxattr
  with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
  silently skips the user namespace fixup that it normally does resulting in
  un-fixed-up data being returned.
  This is caused by posix_acl_fix_xattr_to_user() being passed the total
  buffer size and not the actual size of the xattr as returned by
  vfs_getxattr().

  I have pushed a commit upstream that fixes this bug:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=82c9a927bc5df6e06b72d206d24a9d10cced4eb5

  This commit passes the actual length of the xattr as returned by
  vfs_getxattr() down.

  A reproducer for the issue is:

    touch acl_posix

    setfacl -m user:0:rwx acl_posix

  and then compile:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/xattr.h>
    #include <unistd.h>

    /* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
    int main(int argc, char **argv)
    {
        ssize_t ret1, ret2;
        /* Two buffer sizes: 128 skips the fixup on a buggy kernel, 132 does not. */
        char buf1[128], buf2[132];
        int fret = EXIT_SUCCESS;
        char *file;

        if (argc < 2) {
            fprintf(stderr,
                    "Please specify a file with "
                    "\"system.posix_acl_access\" permissions set\n");
            _exit(EXIT_FAILURE);
        }
        file = argv[1];

        ret1 = getxattr(file, "system.posix_acl_access",
                        buf1, sizeof(buf1));
        if (ret1 < 0) {
            fprintf(stderr, "%s - Failed to retrieve "
                    "\"system.posix_acl_access\" "
                    "from \"%s\"\n", strerror(errno), file);
            _exit(EXIT_FAILURE);
        }

        ret2 = getxattr(file, "system.posix_acl_access",
                        buf2, sizeof(buf2));
        if (ret2 < 0) {
            fprintf(stderr, "%s - Failed to retrieve "
                    "\"system.posix_acl_access\" "
                    "from \"%s\"\n", strerror(errno), file);
            _exit(EXIT_FAILURE);
        }

        if (ret1 != ret2) {
            fprintf(stderr, "The value of \"system.posix_acl_"
                    "access\" for file \"%s\" changed "
                    "between two successive calls\n", file);
            _exit(EXIT_FAILURE);
        }

        /* Both calls must return identical bytes regardless of buffer size. */
        for (ssize_t i = 0; i < ret2; i++) {
            if (buf1[i] == buf2[i])
                continue;

            fprintf(stderr,
                    "Unexpected different in byte %zd: "
                    "%02x != %02x\n", i, buf1[i], buf2[i]);
            fret = EXIT_FAILURE;
        }

        if (fret == EXIT_SUCCESS)
            fprintf(stderr, "Test passed\n");
        else
            fprintf(stderr, "Test failed\n");

        _exit(fret);
    }

  and run:

    ./tester acl_posix

  On a non-fixed up kernel this should return something like:

    root@c1:/# ./t
    Unexpected different in byte 16: ffa0 != 00
    Unexpected different in byte 17: ff86 != 00
    Unexpected different in byte 18: 01 != 00

  and on a fixed kernel:

    root@c1:~# ./t
    Test passed

  
  == Fix ==
  82c9a927bc5d ("getxattr: use correct xattr length")

  == Regression Potential ==
  Low.  One liner that passes the actual length of the xattr as returned by
  vfs_getxattr() down.

  == Test Case ==
  A test kernel was built with this patch and tested by the original bug
  reporter.
  The bug reporter states the test kernel resolved the bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1789746/+subscriptions

[Kernel-packages] [Bug 1884767] Re: shiftfs: fix btrfs regression

2020-08-03 Thread Stéphane Graber

** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1884767

Title:
  shiftfs: fix btrfs regression

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Eoan:
  Fix Released

Bug description:
  SRU Justification

  Impact: The patch
  commit cfaa482afb97e3c05d020af80b897b061109d51f
  Author: Christian Brauner 
  Date:   Tue Apr 14 22:26:53 2020 +0200

  UBUNTU: SAUCE: shiftfs: fix dentry revalidation

  BugLink: https://bugs.launchpad.net/bugs/1872757

  to fix https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1872757
  regresses various btrfs + shiftfs users. Creating a btrfs subvolume,
  deleting it, and then trying to recreate it causes EEXIST to be returned.
  It also leaves some files in a half-visible state because they are not
  revalidated correctly.
  The faulty behavior can be reproduced via:

  btrfs subvolume create my-subvol
  btrfs subvolume delete my-subvol
  btrfs subvolume create my-subvol

  Fix: We need to revert this patch, restoring the old behavior. This will
  briefly resurface
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1872757, which I will
  fix in a follow-up patch on top of this revert. We basically split the part
  that fixes that bug out of the revert.

  Regression Potential: Limited to shiftfs.

  Test Case: Build a kernel with fix applied and run above reproducer.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1884767/+subscriptions

[Kernel-packages] [Bug 1873809] Re: Make linux-kvm bootable in LXD VMs

2020-08-24 Thread Stéphane Graber

We weren't planning to as the previous releases (xenial and bionic) did
not have a "-kvm" image and their default image includes an initrd, making
them boot just fine under LXD.

So it's really just groovy and focal that we need before we can start using
those images. focal has been taken care of, so we're just waiting for
linux-kvm to hit the release pocket on groovy before we can switch over to
those.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1873809

Title:
  Make linux-kvm bootable in LXD VMs

Status in cloud-images:
  Invalid
Status in linux-kvm package in Ubuntu:
  Triaged
Status in linux-kvm source package in Focal:
  Fix Released

Bug description:
  The `disk-kvm.img` images which are to be preferred when run under
  virtualization, currently completely fail to boot under UEFI.

  A workaround was put in place such that LXD will instead pull
  generic-based images until this is resolved. This however comes with a
  much longer boot time (as the kernel panics, reboots and then boots)
  and reduced functionality from cloud-init, so we'd still like
  this fixed in the near future.

  To get things behaving, it looks like we need the following config
  options to be enabled in linux-kvm:

   - CONFIG_EFI_STUB
   - CONFIG_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS
   - CONFIG_VIRTIO_VSOCKETS_COMMON
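
  As a rough way to check a given kernel for these options, here is a
  minimal sketch. The function name is made up for this illustration, and it
  assumes the kernel config is available as a plain-text file such as
  /boot/config-$(uname -r).

```shell
# Print every requested option that is not set to y or m in a kernel config file.
missing_kconfig_options() {
    config_file=$1
    shift
    for opt in "$@"; do
        if ! grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "$opt"
        fi
    done
}

# Assumed location of the running kernel's config.
config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
    missing_kconfig_options "$config" \
        CONFIG_EFI_STUB CONFIG_VSOCKETS \
        CONFIG_VIRTIO_VSOCKETS CONFIG_VIRTIO_VSOCKETS_COMMON
fi
```

  No output means all four options are enabled, either built in or as
  modules. Run inside a VM booted from the image, this checks the linux-kvm
  kernel under test rather than the host kernel.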

  == Rationale ==
  We'd like to be able to use the linux-kvm based images for LXD. Those will
  boot directly without needing the panic+reboot behavior of generic images
  and will be much lighter in general.

  We also need the LXD agent to work, which requires functional virtio
  vsock.

  == Test case ==
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
   - lxc launch bug1873809 v1
   - lxc console v1
   - 
   - 
   - lxc exec v1 bash

  To validate a new kernel, you'll need to manually repack the .img file
  and install the new kernel in there.

  == Regression potential ==
  I don't know who else is using those kvm images right now, but those
  changes will cause a change to the kernel binary such that it contains
  the EFI stub bits + a signature. This could cause some (horribly broken)
  systems to no longer be able to boot that kernel. Though considering
  that such a setup is common to our other kernels, this seems unlikely.

  Also, this will be introducing virtio vsock support which, again, could
  maybe confuse some horribly broken systems?

  In either case, the kernel conveniently is the only package which ships
  multiple versions concurrently, so rebooting on the previous kernel is
  always an option, mitigating some of the risks.

  
  -- Details from original report --
  User report on the LXD side: https://github.com/lxc/lxd/issues/7224

  I've reproduced this issue with:
   - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
   - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

  On the graphical console, you'll see EDK2 load (TianoCore) followed by
  basic boot messages and then a message from grub (error: can't find
  command `hwmatch`).
  Those also appear on successful boots of other images so I don't think
  there's anything concerning there. However it'll hang indefinitely and
  eat up all your CPU.

  Switching to the text console view (serial0), you'll see the same
  issue as that LXD report:

  BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM3 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
  BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM1 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
  error: can't find command `hwmatch'.
  e X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 
 
  ExceptionData - 
  RIP  - 3FF2DA12, CS  - 0038, RFLAGS - 00200202
  RAX  - AFAFAFAFAFAFAFAF, RCX - 3E80F108, RDX - AFAFAFAFAFAFAFAF
  RBX  - 0398, RSP - 3FF1C638, RBP - 3FF34360
  RSI  - 3FF343B8, RDI - 1000
  R8   - 3E80F108, R9  - 3E815B98, R10 - 0065
  R11  - 2501, R12 - 0004, R13 - 3E80F100
  R14  - , R15 - 
  DS   - 0030, ES  - 0030, FS  - 0030
  GS   - 0030, SS  - 0030
  CR0  - 80010033, CR2 - , CR3 - 3FC01000
  CR4  - 0668, CR8 - 
  DR0  - 00

[Kernel-packages] [Bug 1624540] Re: please have lxd recommend zfs

2017-04-19 Thread Stéphane Graber

Colin: This is not what this issue is about.

This issue is about getting the ZFS tools installed by default in server
images, with the problem that doing so now would result in zfs-zed
running all the time for everyone, regardless of whether they use ZFS or
not.

What we want is:
 - Don't load the module by default (as it taints the kernel)
 - Use of ZFS tools should automatically load the kernel module
 - zfs-zed should only start when one or more zpools have been imported

With those done, we'll be able to ship the zfs tools in all Ubuntu
installs without negatively affecting users.

** Changed in: zfs-linux (Ubuntu)
   Status: Fix Released => Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1624540

Title:
  please have lxd recommend zfs

Status in lxd package in Ubuntu:
  Incomplete
Status in zfs-linux package in Ubuntu:
  Triaged

Bug description:
  Since ZFS is now in Main (Bug #1532198), LXD should recommend the ZFS
  userspace package, such that 'sudo lxd init' just works.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1624540/+subscriptions

[Kernel-packages] [Bug 1611078] Re: Support snaps inside of lxd containers

2017-04-19 Thread Stéphane Graber

No, the solution is that snapd shouldn't assume that /lib/modules exists
and should just not attempt to bind-mount it if it's missing.

Systems that don't have kernels installed (like containers) shouldn't
have /lib/modules at all.
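
To illustrate the behavior being asked for, here is a minimal sketch of "skip the bind-mount when the source is missing". It is not snapd's code (snapd is not written in Python), and `plan_bind_mounts` is a hypothetical name; the existence check is injectable purely so the logic can be exercised without touching the real filesystem.

```python
import os


def plan_bind_mounts(candidates, exists=os.path.isdir):
    """Return only the (source, target) pairs whose source directory exists.

    Hypothetical helper: a missing source such as /lib/modules in a
    container is silently dropped instead of being treated as an error.
    """
    return [(src, dst) for src, dst in candidates if exists(src)]
```

In a container without any kernel installed, a candidate list containing /lib/modules would simply come back without that entry, and the remaining mounts would proceed as usual.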

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1611078

Title:
  Support snaps inside of lxd containers

Status in Snappy:
  Fix Released
Status in apparmor package in Ubuntu:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in lxd package in Ubuntu:
  Fix Released
Status in apparmor source package in Xenial:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in lxd source package in Xenial:
  Fix Committed
Status in apparmor source package in Yakkety:
  Fix Released
Status in linux source package in Yakkety:
  Fix Released
Status in lxd source package in Yakkety:
  Fix Released

Bug description:
  I tried following the instructions on snapcraft.io and got a failure.
  See the output below.  I've also attached the relevant output from
  running "journalctl -xe".

  uname: Linux 3.19.0-65-generic x86_64
  release: Ubuntu 16.04
  package: snapd 2.11+0.16.04

  Notably, I'm running this in an LXD container (version: 2.0.0.rc9).

  -

  $ sudo snap install hello-world
  64.75 MB / 64.75 MB [==] 100.00 % 2.85 MB/s

  error: cannot perform the following tasks:
  - Mount snap "ubuntu-core" (122) ([start snap-ubuntu\x2dcore-122.mount] failed with exit status 1: Job for snap-ubuntu\x2dcore-122.mount failed. See "systemctl status "snap-ubuntu\\x2dcore-122.mount"" and "journalctl -xe" for details.
  )
  $ ls -la /snap
  total 4K
  drwxr-xr-x 3 root root 4096 Aug  8 17:49 ubuntu-core
  $ ls -la /snap/ubuntu-core/
  total 4K
  drwxr-xr-x 2 root root 4096 Aug  8 17:49 122
  $ ls -la /snap/ubuntu-core/122/
  total 0K
  $ systemctl status "snap-ubuntu\\x2dcore-122.mount"
  ● snap-ubuntu\x2dcore-122.mount - Mount unit for ubuntu-core
     Loaded: loaded (/etc/systemd/system/snap-ubuntu\x2dcore-122.mount; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2016-08-08 17:49:36 UTC; 6min ago
      Where: /snap/ubuntu-core/122
       What: /var/lib/snapd/snaps/ubuntu-core_122.snap
    Process: 31781 ExecMount=/bin/mount /var/lib/snapd/snaps/ubuntu-core_122.snap /snap/ubuntu-core/122 -t squashfs (code=exited, status=32)

  Aug 08 17:49:35 my-host systemd[1]: Mounting Mount unit for ubuntu-core...
  Aug 08 17:49:35 my-host mount[31781]: mount: /snap/ubuntu-core/122: mount failed: Unknown error -1
  Aug 08 17:49:36 my-host systemd[1]: snap-ubuntu\x2dcore-122.mount: Mount process exited, code=exited status=32
  Aug 08 17:49:36 my-host systemd[1]: Failed to mount Mount unit for ubuntu-core.
  Aug 08 17:49:36 my-host systemd[1]: snap-ubuntu\x2dcore-122.mount: Unit entered failed state.

To manage notifications about this bug go to:
https://bugs.launchpad.net/snappy/+bug/1611078/+subscriptions

[Kernel-packages] [Bug 1684481] Re: KVM guest execution start apparmor blocks on /dev/ptmx now (regression?)

2017-04-20 Thread Stéphane Graber

Ok, so that's an apparmor or apparmor profile problem.

LXD recently changed to also allow for apparmor profiles to be loaded
inside privileged containers. This seems to align with your timeline
above.

Before that change, your kvm process wasn't itself confined when run
inside a privileged LXD container, instead only being confined by the
container's own profile. With this LXD fix, we now offer the same
behavior for unprivileged and privileged containers, letting the
container load its own profile in both cases.

There are a number of problems with apparmor profiles that are loaded as
part of an apparmor stack not behaving the same as when loaded on the
host, but those are issues that need to be addressed either in the
profiles or in the apparmor kernel code.

As far as we (LXD) are concerned, we'd very much appreciate it if
apparmor could behave the same in containers as it does on the host, but
we understand that there are design problems with this and so most
apparmor profiles are now showing some problems...

Closing LXD task as invalid, since as far as LXD is concerned, we are
doing the right thing wrt apparmor setup. This is caused by either
apparmor misbehaving or the apparmor profile being invalid.

** Changed in: lxd (Ubuntu)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1684481

Title:
  KVM guest execution start apparmor blocks on /dev/ptmx now
  (regression?)

Status in apparmor package in Ubuntu:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in lxd package in Ubuntu:
  Invalid

Bug description:
  Setup:
  - Xenial host
  - lxd guests with Trusty, Xenial, ...
  - add a LXD profile to allow kvm [3] (inspired by stgraber)
  - spawn KVM guests in the LXD guests using the different distro release versions
  - guests are based on the uvtool default template which has a serial console [4]

  Issue:
  - guest starting with serial device gets blocked by apparmor and killed on creation
  - This affects at least ppc64el and x86 (s390x has no serial concept that would match)
  - This appeared in our usual checks on -proposed releases so maybe we can/should stop something?
    Last good was "Apr 5, 2017 10:40:50 AM", first bad one "Apr 8, 2017 5:11:22 AM"

  Background:
  We have used this setup for a while and it was working without a change on our end.
  Also the fact that it still works in the Trusty LXD makes it somewhat suspicious.
  Therefore I'd assume an SRUed change in LXD/Kernel/Apparmor might be the reason and open this bug to get your opinion on it.

  You can look into [1] and search for uvt-kvm create in it.

  Deny in dmesg:
  [652759.606218] audit: type=1400 audit(1492671353.134:4520): apparmor="DENIED" operation="open" namespace="root//lxd-testkvm-xenial-from_" profile="libvirt-668e21f1-fa55-4a30-b325-0ed5cfd55e5b" name="/dev/pts/ptmx" pid=27162 comm="qemu-system-ppc" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0

  Qemu-log:
  2017-04-20T06:55:53.139450Z qemu-system-ppc64: -chardev pty,id=charserial0: Failed to create PTY: No such file or directory

  There was a similar issue with qemu namespacing (which we don't use on any of these releases) [2].
  While we surely don't have the "same" issue, the debugging done on the namespacing one might be worth a look as it could be related.

  Workaround for now:
  - drop serial section from guest xml

  [1]: https://jenkins.ubuntu.com/server/view/Virt/job/virt-migration-cross-release-amd64/78/consoleFull
  [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1421036
  [3]: https://git.launchpad.net/~ubuntu-server/ubuntu/+source/qemu-migration-test/tree/kvm_profile.yaml
  [4]: https://libvirt.org/formatdomain.html#elementsCharPTY
  --- 
  ApportVersion: 2.20.1-0ubuntu2.5
  Architecture: ppc64el
  DistroRelease: Ubuntu 16.04
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
  Package: lxd
  PackageArchitecture: ppc64el
  ProcKernelCmdline: root=UUID=902eaad1-2164-4f9a-bec4-7ff3abc15804 ro console=hvc0
  ProcLoadAvg: 3.15 3.02 3.83 1/3056 79993
  ProcSwaps:
   Filename   Type  Size     Used  Priority
   /swap.img  file  8388544  0     -1
  ProcVersion: Linux version 4.4.0-72-generic (buildd@bos01-ppc64el-022) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #93-Ubuntu SMP Fri Mar 31 14:05:15 UTC 2017
  ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
  Syslog:
   
  Tags:  xenial uec-images
  Uname: Linux 4.4.0-72-generic ppc64le
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: utah
  _MarkForUpload: True
  cpu_cores: Number of cores present = 20
  cpu_coreson: Number of cores online = 20
  cpu_smt: SMT is off
  --- 
  ApportVersion: 2.20.1-0ubuntu2.5
  Architecture: ppc64el
  DistroRelease: Ubuntu 16.04
  NonfreeKernelModules: cfg80211 ebtable_broute ebtable_nat binfmt_mi

[Kernel-packages] [Bug 1684481] Re: KVM guest execution start apparmor blocks on /dev/ptmx now (regression?)

2017-04-21 Thread Stéphane Graber

We're looking at changing lxc to show /dev/ptmx as a real file rather
than a symlink. This is however not particularly easy because:
 - It can't be a bind-mount from the host (or it will interact with the
   host's devpts)
 - It can't be a straight mknod (because that's not allowed in
   unprivileged containers)

So we're looking at re-ordering the liblxc code to set up a bind-mount
from /dev/pts/ptmx to /dev/ptmx INSIDE the container, which should work.

That part of the kernel has changed quite a bit, so making sure we don't
break things for supported kernels (2.6.32 or higher) is going to be a
bit tricky.

Note that there is nothing wrong with /dev/ptmx being a symlink to
/dev/pts/ptmx and I'd argue it's actually "more right" than having it be
a device node. But since that's not what udev/devtmpfs do, we probably
should mimic the host's behavior.
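
For anyone wanting to see which of the two layouts a given system or container currently has, a small check along these lines may help. This is just a sketch; `describe_node` is a hypothetical helper and is not part of lxc or LXD.

```python
import os
import stat


def describe_node(path):
    """Classify path as 'symlink', 'chardev', 'missing' or 'other'.

    Uses lstat so that a symlink is reported as such rather than
    being followed to its target.
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISCHR(mode):
        return "chardev"
    return "other"
```

Calling `describe_node("/dev/ptmx")` on a host where udev/devtmpfs created the node would be expected to report a character device, while a container using the symlink layout described above would report a symlink.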

** Changed in: lxd (Ubuntu)
   Status: New => Invalid

** Also affects: lxc (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: lxc (Ubuntu)
   Status: New => Triaged

** Changed in: lxc (Ubuntu)
   Importance: Undecided => Wishlist

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1684481

Title:
  KVM guest execution start apparmor blocks on /dev/ptmx now
  (regression?)

Status in apparmor package in Ubuntu:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in lxc package in Ubuntu:
  Triaged
Status in lxd package in Ubuntu:
  Invalid

[Kernel-packages] [Bug 1753288] [NEW] ZFS setgid broken on 0.7

2018-03-04 Thread Stéphane Graber

Public bug reported:

Hey there,

We've had one of our LXD users report that setting the setgid bit inside
a container using ZFS on Ubuntu 18.04 (zfs 0.7) is silently failing.
This is not a LXD bug as the exact same operation works on other
filesystems.

There are more details available here:
https://github.com/lxc/lxd/issues/4294

Reproducer looks something like:

```
root@c1:~# touch a
root@c1:~# chmod g+s a
root@c1:~# touch b
root@c1:~# chown 0:117 b
root@c1:~# chmod g+s b
root@c1:~# stat a
  File: a
  Size: 0   Blocks: 1  IO Block: 131072 regular empty file
Device: 43h/67d Inode: 33890   Links: 1
Access: (2644/-rw-r-Sr--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-03-02 03:32:47.019430367 +
Modify: 2018-03-02 03:32:47.019430367 +
Change: 2018-03-02 03:32:49.459445015 +
 Birth: -
root@c1:~# stat b
  File: b
  Size: 0   Blocks: 1  IO Block: 131072 regular empty file
Device: 43h/67d Inode: 34186   Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (  117/postdrop)
Access: 2018-03-02 03:32:50.907453706 +
Modify: 2018-03-02 03:32:50.907453706 +
Change: 2018-03-02 03:33:01.299516054 +
 Birth: -
root@c1:~# 
```

And for confirmation, using a tmpfs in the same container:

```
root@c1:~# mkdir tmpfs
root@c1:~# mount -t tmpfs tmpfs tmpfs
root@c1:~# cd tmpfs/
root@c1:~/tmpfs# touch a
root@c1:~/tmpfs# chmod g+s a
root@c1:~/tmpfs# touch b
root@c1:~/tmpfs# chown 0:117 b
root@c1:~/tmpfs# chmod g+s b
root@c1:~/tmpfs# stat a
  File: a
  Size: 0   Blocks: 0  IO Block: 4096   regular empty file
Device: 65h/101d Inode: 3   Links: 1
Access: (2644/-rw-r-Sr--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-03-02 03:33:35.783722623 +
Modify: 2018-03-02 03:33:35.783722623 +
Change: 2018-03-02 03:33:40.507750883 +
 Birth: -
root@c1:~/tmpfs# stat b
  File: b
  Size: 0   Blocks: 0  IO Block: 4096   regular empty file
Device: 65h/101d Inode: 4   Links: 1
Access: (2644/-rw-r-Sr--)  Uid: (0/root)   Gid: (  117/postdrop)
Access: 2018-03-02 03:33:42.131760597 +
Modify: 2018-03-02 03:33:42.131760597 +
Change: 2018-03-02 03:33:46.227785091 +
 Birth: -
root@c1:~/tmpfs# 
```

This is particularly troubling because there are no errors returned to
the user, so we now have containers that will have broken binaries and
permissions applied to them with no visible way to detect the problem
short of scanning the entire filesystem against a list of known
permissions.
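
As a rough idea of what "scanning the entire filesystem against a list of known permissions" could look like, here is a minimal sketch. The manifest format (a mapping of relative path to expected mode bits) and the name `find_mode_mismatches` are assumptions made for illustration only; no such list ships with LXD or ZFS as far as this report is concerned.

```python
import os
import stat


def find_mode_mismatches(root, expected_modes):
    """Yield (relative_path, expected, actual) for entries whose mode differs.

    expected_modes maps paths relative to root to permission bits,
    including setuid/setgid/sticky (for example 0o2644). A missing
    file is reported with actual set to None.
    """
    for rel_path, expected in sorted(expected_modes.items()):
        try:
            st = os.lstat(os.path.join(root, rel_path))
        except FileNotFoundError:
            yield rel_path, expected, None
            continue
        actual = stat.S_IMODE(st.st_mode)
        if actual != expected:
            yield rel_path, expected, actual
```

With the reproducer above, a manifest entry of `{"b": 0o2644}` for the ZFS-backed home directory would flag file `b`, since its actual mode stayed at 0644 after the `chmod g+s`.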

** Affects: linux (Ubuntu)
 Importance: Critical
 Status: Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1753288

Title:
  ZFS setgid broken on 0.7

Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1753288] Re: ZFS setgid broken on 0.7

2018-03-04 Thread Stéphane Graber

That looks like it, yes. As far as I know most of us only noticed this
when bionic switched from 0.6.x to 0.7.x so yes, 0.6.x seems fine and
current 0.7.x is affected.

I've commented on the github issue and will reach out to Wolfgang (Blub)
on IRC otherwise (he hangs out in the LXC/LXD dev channel) to see if he
made any progress on this since November.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1753288

Title:
  ZFS setgid broken on 0.7

Status in linux package in Ubuntu:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1753288/+subscriptions

[Kernel-packages] [Bug 1753288] Re: ZFS setgid broken on 0.7

2018-03-08 Thread Stéphane Graber

This has now been fixed upstream:

https://github.com/zfsonlinux/zfs/pull/7270#event-1510096286

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1753288

Title:
  ZFS setgid broken on 0.7

Status in linux package in Ubuntu:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1753288/+subscriptions

[Kernel-packages] [Bug 1567597] Re: [FFe] implement 'complain mode' in seccomp for developer mode with snaps

2017-09-21 Thread Stéphane Graber

Looks good to me. Delta on libseccomp is small and self contained and
aligns with what has been included in the upstream kernel.

FFe granted

** Changed in: libseccomp (Ubuntu)
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1567597

Title:
  [FFe] implement 'complain mode' in seccomp for developer mode with
  snaps

Status in Snappy:
  Confirmed
Status in libseccomp package in Ubuntu:
  Triaged
Status in linux package in Ubuntu:
  Fix Released

Bug description:
  A requirement for snappy is that a snap may be placed in developer
  mode which will put the security sandbox in complain mode such that
  violations against policy are logged, but permitted. In this manner
  learning tools can be written to parse the logs, etc and make
  developing on snappy easier.

  Unfortunately with seccomp only SCMP_ACT_KILL logs to dmesg and while
  we can set complain mode to permit all calls, they are not logged at
  this time. I've discussed this with upstream and we are working
  together on the approach. This may require a kernel patch and an
  update to libseccomp, so I'm filing this bug for now as a placeholder
  and we'll add other tasks as necessary.

  UPDATE: ubuntu-core-launcher now supports the '@complain' directive
  that is a synonym for '@unrestricted' so people can at least turn on
  developer mode and not be blocked by seccomp. Proper complain mode for
  seccomp needs to still be implemented (this bug).

  [Impact]

  Snapd needs a way to log seccomp actions without blocking any syscalls
  in order to have a more useful complain mode. Such functionality has
  been acked upstream and patches are on their way into the Linux 4.14
  kernel (backported to 4.12.0-13.14 in artful).

  The corresponding libseccomp changes are still undergoing review
  (https://github.com/seccomp/libseccomp/pull/92). The pull request adds
  a number of new symbols and probably isn't appropriate to backport
  until upstream has acked the pull request. However, only a small part
  of that larger pull request is needed by snapd and that change can be
  safely backported since the only added symbol, the SCMP_ACT_LOG macro,
  must match the SECCOMP_RET_LOG macro that has already been approved
  and merged in the upstream Linux kernel.

  [Test Case]

  A large number of tests are run as part of the libseccomp build.
  However, the "live" tests which test libseccomp with actual kernel
  enforcement are not run at that time. They can be manually exercised
  to help catch any regressions. Note that on Artful, there's an
  existing test failure (20-live-basic_die%%002-1):

  $ sudo apt build-dep -y libseccomp
  $ sudo apt install -y cython
  $ apt source libseccomp
  $ autoreconf -ivf && ./configure --enable-python && make check-build
  $ (cd tests && ./regression -T live)
  ...
  Test 20-live-basic_die%%002-1 result:   FAILURE 20-live-basic_die TRAP rc=159
  ...
  Regression Test Summary
   tests run: 12
   tests skipped: 0
   tests passed: 11
   tests failed: 1
   tests errored: 0
  

  Now we can build and run a small test program to test the SCMP_ACT_LOG
  action in the way that snapd wants to use it for developer mode:

  $ sudo apt install -y libseccomp-dev
  $ gcc -o lp1567597-test lp1567597-test.c -lseccomp
  $ ./lp1567597-test

  The exit code should be 0 and you should have an entry in the system
  log that looks like this:

  audit: type=1326 audit(1505859630.994:69): auid=1000 uid=1000 gid=1000
  ses=2 pid=18451 comm="lp1567597-test"
  exe="/home/tyhicks/lp1567597-test" sig=0 arch=c03e syscall=2
  compat=0 ip=0x7f547352c5c0 code=0x7ffc
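
  As a rough illustration of how a learning tool could pick such entries
  out of the system log, here is a minimal Python sketch. The function
  names are hypothetical and are not part of snapd or libseccomp; the
  sketch only assumes the key=value layout of the sample entry above.
  Record type 1326 is AUDIT_SECCOMP, and SECCOMP_RET_LOG is 0x7ffc0000 in
  the kernel headers. The sample shows code=0x7ffc, so both spellings are
  accepted here.

```python
import shlex

# SECCOMP_RET_LOG as defined in the kernel headers, plus the shortened
# form that appears in the sample log entry above.
LOG_CODES = {0x7FFC0000, 0x7FFC}


def parse_audit_record(line):
    """Return the key=value fields of one audit record as a dict."""
    fields = {}
    for token in shlex.split(line):  # shlex strips the double quotes
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def is_seccomp_log(fields):
    """True if the record is a seccomp event carrying the LOG action."""
    if fields.get("type") != "1326":  # AUDIT_SECCOMP
        return False
    try:
        return int(fields.get("code", ""), 16) in LOG_CODES
    except ValueError:
        return False
```

  Feeding it the entry above (joined onto one line) yields a dict whose
  "comm" and "syscall" fields identify the program and the system call
  that would have been blocked outside of developer mode.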

  [Regression Potential]

  Relatively small since the core logic is in the kernel and we're only
  exposing the new action through libseccomp. The changes include smarts
  to query the kernel to see if the action is available in the kernel.
  Calling applications will not be able to use the action on older
  kernels that don't support it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/snappy/+bug/1567597/+subscriptions

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone"

2017-08-22 Thread Stéphane Graber

** No longer affects: lxd (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1567557

Title:
  Performance degradation of "zfs clone"

Status in Native ZFS for Linux:
  New
Status in zfs-linux package in Ubuntu:
  Fix Released
Status in lxd source package in Xenial:
  New
Status in zfs-linux source package in Xenial:
  Fix Committed
Status in lxd source package in Zesty:
  New
Status in zfs-linux source package in Zesty:
  Fix Committed
Status in lxd source package in Artful:
  Confirmed
Status in zfs-linux source package in Artful:
  Fix Released

Bug description:
  [SRU Justification]

  Creating thousands of clones can be prohibitively slow. The
  underlying mechanism to gather clone information uses a 16K buffer,
  which limits performance. Also, the initial approach is to pass a
  zero-sized buffer to the underlying ioctl() to get an idea of the
  size of the buffer required to fetch information back to userspace.
  If we bump the initial buffer to a larger size then we reduce the need
  for two ioctl calls, which improves performance.

  [Fix]
  Bump initial buffer size from 16K to 256K

  [Regression Potential]
  This is minimal as this is just a tweak in the initial buffer size, and
  larger sizes are handled correctly by ZFS since they are normally used
  on the second ioctl() call once we have established the size of the
  buffer required from the first ioctl() call. Larger initial buffers
  just remove the need for the initial size estimation for most cases
  where the number of clones is less than ~5000. There is a risk that a
  larger buffer size could lead to an ENOMEM issue when allocating the
  buffer, but the size of buffer used is still trivial for modern large
  64 bit servers running ZFS.
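
  The "ask for the size, then call again" pattern described above can be
  sketched in Python as follows. This is only a simulation under stated
  assumptions: make_query() and fetch() are hypothetical names, the
  "ioctl" is an ordinary function, and none of this is the actual ZFS
  code. It just shows why a larger initial buffer saves a call whenever
  the reply fits.

```python
def make_query(payload):
    """Stand-in for the ioctl(): if the caller's buffer is too small it
    fails and reports the size it needs, otherwise it returns the data."""
    def query(buffer_size):
        if buffer_size < len(payload):
            return False, len(payload)
        return True, payload
    return query


def fetch(query, initial_size):
    """Call query until the buffer is big enough; return (data, calls)."""
    size, calls = initial_size, 0
    while True:
        calls += 1
        ok, result = query(size)
        if ok:
            return result, calls
        size = result  # retry with the size the callee asked for


if __name__ == "__main__":
    reply = make_query(b"x" * (100 * 1024))  # pretend 100K of clone data
    for initial in (0, 16 * 1024, 256 * 1024):
        _, calls = fetch(reply, initial)
        print(f"initial buffer {initial:>6}: {calls} call(s)")
```

  A reply larger than the initial buffer still takes two calls, which is
  the case the text above says ZFS already handles correctly.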

  [Test case]
  Create 4000 clones. With the fix this takes 35-40% less time than
  without the fix. See the test.sh script for an example of how to
  create this many clones.


  
  --

  I've been running some scale tests for LXD and what I've noticed is
  that "zfs clone" gets slower and slower as the zfs filesystem is
  getting busier.

  It feels like "zfs clone" requires some kind of pool-wide lock or
  something and so needs all operations to complete before it can
  clone a new filesystem.

  A basic LXD scale test with btrfs vs zfs shows what I mean, see below
  for the reports.

  The test is run on a completely dedicated physical server with the
  pool on a dedicated SSD; the exact same machine and SSD were used for
  the btrfs test.

  The zfs filesystem is configured with these settings:
   - relatime=on
   - sync=disabled
   - xattr=sa

  So it shouldn't be related to pending sync() calls...

  The workload in this case is ultimately 1024 containers running
  busybox as their init system and udhcpc grabbing an IP.
  The problem gets significantly worse if spawning busier containers,
  say a full Ubuntu system.

  === zfs ===
  root@edfu:~# /home/ubuntu/lxd-benchmark spawn --count=1024 --image=images:alpine/edge/amd64 --privileged=true
  Test environment:
    Server backend: lxd
    Server version: 2.0.0.rc8
    Kernel: Linux
    Kernel architecture: x86_64
    Kernel version: 4.4.0-16-generic
    Storage backend: zfs
    Storage version: 5
    Container backend: lxc
    Container version: 2.0.0.rc15

  Test variables:
    Container count: 1024
    Container mode: privileged
    Image: images:alpine/edge/amd64
    Batches: 128
    Batch size: 8
    Remainder: 0

  [Apr  3 06:42:51.170] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
  [Apr  3 06:42:52.657] Starting the test
  [Apr  3 06:42:53.994] Started 8 containers in 1.336s
  [Apr  3 06:42:55.521] Started 16 containers in 2.864s
  [Apr  3 06:42:58.632] Started 32 containers in 5.975s
  [Apr  3 06:43:05.399] Started 64 containers in 12.742s
  [Apr  3 06:43:20.343] Started 128 containers in 27.686s
  [Apr  3 06:43:57.269] Started 256 containers in 64.612s
  [Apr  3 06:46:09.112] Started 512 containers in 196.455s
  [Apr  3 06:58:19.309] Started 1024 containers in 926.652s
  [Apr  3 06:58:19.309] Test completed in 926.652s

  === btrfs ===
  Test environment:
    Server backend: lxd
    Server version: 2.0.0.rc8
    Kernel: Linux
    Kernel architecture: x86_64
    Kernel version: 4.4.0-16-generic
    Storage backend: btrfs
    Storage version: 4.4
    Container backend: lxc
    Container version: 2.0.0.rc15

  Test variables:
    Container count: 1024
    Container mode: privileged
    Image: images:alpine/edge/amd64
    Batches: 128
    Batch size: 8
    Remainder: 0

  [Apr  3 07:42:12.053] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
  [Apr  3 07:42:13.351] Starting the test
  [Apr  3 07:42:14.793] Started 8 containers in 1.442s
  [Apr  3 07:42:16.495] Started

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone"

2017-08-22 Thread Stéphane Graber

** No longer affects: lxd (Ubuntu Xenial)

** No longer affects: lxd (Ubuntu Zesty)

** No longer affects: lxd (Ubuntu Artful)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1567557

Title:
  Performance degradation of "zfs clone"

Status in Native ZFS for Linux:
  New
Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Xenial:
  Fix Committed
Status in zfs-linux source package in Zesty:
  Fix Committed
Status in zfs-linux source package in Artful:
  Fix Released


[Kernel-packages] [Bug 1611078] Re: Support snaps inside of lxd containers

2017-08-25 Thread Stéphane Graber

** Changed in: lxd (Ubuntu Xenial)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1611078

Title:
  Support snaps inside of lxd containers

Status in Snappy:
  Fix Released
Status in apparmor package in Ubuntu:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in lxd package in Ubuntu:
  Fix Released
Status in apparmor source package in Xenial:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in lxd source package in Xenial:
  Fix Released
Status in apparmor source package in Yakkety:
  Fix Released
Status in linux source package in Yakkety:
  Fix Released
Status in lxd source package in Yakkety:
  Fix Released

Bug description:
  I tried following the instructions on snapcraft.io and got a failure.
  See the output below.  I've also attached the relevant output from
  running "journalctl -xe".

  uname: Linux 3.19.0-65-generic x86_64
  release: Ubuntu 16.04
  package: snapd 2.11+0.16.04

  Notably, I'm running this in an LXD container (version: 2.0.0.rc9).

  -

  $ sudo snap install hello-world
  64.75 MB / 64.75 MB [==] 100.00 % 2.85 MB/s

  error: cannot perform the following tasks:
  - Mount snap "ubuntu-core" (122) ([start snap-ubuntu\x2dcore-122.mount] failed with exit status 1: Job for snap-ubuntu\x2dcore-122.mount failed. See "systemctl status "snap-ubuntu\\x2dcore-122.mount"" and "journalctl -xe" for details.
  )
  $ ls -la /snap
  total 4K
  drwxr-xr-x 3 root root 4096 Aug  8 17:49 ubuntu-core
  $ ls -la /snap/ubuntu-core/
  total 4K
  drwxr-xr-x 2 root root 4096 Aug  8 17:49 122
  $ ls -la /snap/ubuntu-core/122/
  total 0K
  $ systemctl status "snap-ubuntu\\x2dcore-122.mount"
  ● snap-ubuntu\x2dcore-122.mount - Mount unit for ubuntu-core
     Loaded: loaded (/etc/systemd/system/snap-ubuntu\x2dcore-122.mount; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2016-08-08 17:49:36 UTC; 6min ago
      Where: /snap/ubuntu-core/122
       What: /var/lib/snapd/snaps/ubuntu-core_122.snap
    Process: 31781 ExecMount=/bin/mount /var/lib/snapd/snaps/ubuntu-core_122.snap /snap/ubuntu-core/122 -t squashfs (code=exited, status=32)

  Aug 08 17:49:35 my-host systemd[1]: Mounting Mount unit for ubuntu-core...
  Aug 08 17:49:35 my-host mount[31781]: mount: /snap/ubuntu-core/122: mount failed: Unknown error -1
  Aug 08 17:49:36 my-host systemd[1]: snap-ubuntu\x2dcore-122.mount: Mount process exited, code=exited status=32
  Aug 08 17:49:36 my-host systemd[1]: Failed to mount Mount unit for ubuntu-core.
  Aug 08 17:49:36 my-host systemd[1]: snap-ubuntu\x2dcore-122.mount: Unit entered failed state.

To manage notifications about this bug go to:
https://bugs.launchpad.net/snappy/+bug/1611078/+subscriptions

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-05-31 Thread Stéphane Graber

Our test machines aren't particularly impressive, just 12GB of RAM or so.
Note that, as can be seen above, we're using Alpine (busybox) images
rather than Ubuntu to limit the resource usage and get us to a lot more
containers per system.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1567557

Title:
  Performance degradation of "zfs clone" when under load

Status in zfs-linux package in Ubuntu:
  Incomplete

Bug description:
  I've been running some scale tests for LXD and what I've noticed is
  that "zfs clone" gets slower and slower as the zfs filesystem is
  getting busier.

  It feels like "zfs clone" requires some kind of pool-wide lock or
  something and so needs all operations to complete before it can
  clone a new filesystem.

  A basic LXD scale test with btrfs vs zfs shows what I mean, see below
  for the reports.

  The test is run on a completely dedicated physical server with the
  pool on a dedicated SSD; the exact same machine and SSD were used for
  the btrfs test.

  The zfs filesystem is configured with these settings:
   - relatime=on
   - sync=disabled
   - xattr=sa

  So it shouldn't be related to pending sync() calls...

  The workload in this case is ultimately 1024 containers running
  busybox as their init system and udhcpc grabbing an IP.
  The problem gets significantly worse if spawning busier containers,
  say a full Ubuntu system.

  === zfs ===
  root@edfu:~# /home/ubuntu/lxd-benchmark spawn --count=1024 --image=images:alpine/edge/amd64 --privileged=true
  Test environment:
    Server backend: lxd
    Server version: 2.0.0.rc8
    Kernel: Linux
    Kernel architecture: x86_64
    Kernel version: 4.4.0-16-generic
    Storage backend: zfs
    Storage version: 5
    Container backend: lxc
    Container version: 2.0.0.rc15

  Test variables:
    Container count: 1024
    Container mode: privileged
    Image: images:alpine/edge/amd64
    Batches: 128
    Batch size: 8
    Remainder: 0

  [Apr  3 06:42:51.170] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
  [Apr  3 06:42:52.657] Starting the test
  [Apr  3 06:42:53.994] Started 8 containers in 1.336s
  [Apr  3 06:42:55.521] Started 16 containers in 2.864s
  [Apr  3 06:42:58.632] Started 32 containers in 5.975s
  [Apr  3 06:43:05.399] Started 64 containers in 12.742s
  [Apr  3 06:43:20.343] Started 128 containers in 27.686s
  [Apr  3 06:43:57.269] Started 256 containers in 64.612s
  [Apr  3 06:46:09.112] Started 512 containers in 196.455s
  [Apr  3 06:58:19.309] Started 1024 containers in 926.652s
  [Apr  3 06:58:19.309] Test completed in 926.652s

  === btrfs ===
  Test environment:
    Server backend: lxd
    Server version: 2.0.0.rc8
    Kernel: Linux
    Kernel architecture: x86_64
    Kernel version: 4.4.0-16-generic
    Storage backend: btrfs
    Storage version: 4.4
    Container backend: lxc
    Container version: 2.0.0.rc15

  Test variables:
    Container count: 1024
    Container mode: privileged
    Image: images:alpine/edge/amd64
    Batches: 128
    Batch size: 8
    Remainder: 0

  [Apr  3 07:42:12.053] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
  [Apr  3 07:42:13.351] Starting the test
  [Apr  3 07:42:14.793] Started 8 containers in 1.442s
  [Apr  3 07:42:16.495] Started 16 containers in 3.144s
  [Apr  3 07:42:19.881] Started 32 containers in 6.530s
  [Apr  3 07:42:26.798] Started 64 containers in 13.447s
  [Apr  3 07:42:42.048] Started 128 containers in 28.697s
  [Apr  3 07:43:13.210] Started 256 containers in 59.859s
  [Apr  3 07:44:26.238] Started 512 containers in 132.887s
  [Apr  3 07:47:30.708] Started 1024 containers in 317.357s
  [Apr  3 07:47:30.708] Test completed in 317.357s
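
  To make the comparison between the two reports easier to read, here is
  a small Python sketch that works only from the "Started N containers"
  timings above. The helper names are made up for illustration and this
  is not part of lxd-benchmark.

```python
# Cumulative "Started N containers in Xs" timings copied from the
# zfs and btrfs reports above.
ZFS = {8: 1.336, 16: 2.864, 32: 5.975, 64: 12.742,
       128: 27.686, 256: 64.612, 512: 196.455, 1024: 926.652}
BTRFS = {8: 1.442, 16: 3.144, 32: 6.530, 64: 13.447,
         128: 28.697, 256: 59.859, 512: 132.887, 1024: 317.357}


def slowdown(zfs, btrfs):
    """Ratio of zfs to btrfs elapsed time at each container count."""
    return {n: zfs[n] / btrfs[n] for n in sorted(zfs)}


def seconds_per_container(times, start, end):
    """Average start-up cost per container between two checkpoints."""
    return (times[end] - times[start]) / (end - start)


if __name__ == "__main__":
    for n, ratio in slowdown(ZFS, BTRFS).items():
        print(f"{n:5d} containers: zfs/btrfs = {ratio:.2f}")
    for name, times in (("zfs", ZFS), ("btrfs", BTRFS)):
        cost = seconds_per_container(times, 512, 1024)
        print(f"{name}: {cost:.2f}s per container from 512 to 1024")
```

  On these numbers zfs is slightly ahead of btrfs up to 128 containers
  and falls behind after that, ending roughly 2.9 times slower at 1024.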

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1567557/+subscriptions

[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-05-31 Thread Stéphane Graber

I'm trying to remember if we had to bump any of the sysctls to actually
reach 1024 containers. I don't think any of the usual suspects would be
in play until you reach 2000+ Alpine containers though.

If you do run out of some kernel resources, you can try applying the
following sysctls to get you past that:
net.ipv6.neigh.default.gc_thresh3 = 8192
net.ipv6.neigh.default.gc_thresh2 = 4096
net.ipv6.neigh.default.gc_thresh1 = 1024
net.ipv6.route.gc_thresh = 8192
kernel.pty.max = 65536
kernel.pid_max = 2097152
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
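
If you want to see which of these are currently lower on a given host
before changing anything, a minimal Python sketch along these lines should
do. The function names are hypothetical; it only assumes the usual mapping
of a sysctl key to a file under /proc/sys (dots become slashes, which holds
for the keys listed here) and it reads values without writing anything.

```python
from pathlib import Path

# Values copied from the list above.
SUGGESTED = {
    "net.ipv6.neigh.default.gc_thresh3": 8192,
    "net.ipv6.neigh.default.gc_thresh2": 4096,
    "net.ipv6.neigh.default.gc_thresh1": 1024,
    "net.ipv6.route.gc_thresh": 8192,
    "kernel.pty.max": 65536,
    "kernel.pid_max": 2097152,
    "fs.inotify.max_queued_events": 1048576,
    "fs.inotify.max_user_instances": 8192,
    "fs.inotify.max_user_watches": 524288,
}


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return Path(root) / name.replace(".", "/")


def below_suggested(suggested, root="/proc/sys"):
    """Return {key: (current, suggested)} for keys set lower than suggested.

    Keys that are missing or unreadable on this host are skipped.
    """
    low = {}
    for name, wanted in suggested.items():
        try:
            current = int(sysctl_path(name, root).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            continue
        if current < wanted:
            low[name] = (current, wanted)
    return low


if __name__ == "__main__":
    for key, (current, wanted) in below_suggested(SUGGESTED).items():
        print(f"{key}: {current} (suggested {wanted})")
```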

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1567557

Title:
  Performance degradation of "zfs clone" when under load

Status in zfs-linux package in Ubuntu:
  Incomplete


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1567557/+subscriptions

[Kernel-packages] [Bug 1669611] [NEW] Regression in 4.4.0-65-generic causes very frequent system crashes

2017-03-02 Thread Stéphane Graber

Public bug reported:

After upgrading to 4.4.0-65-generic all of our Jenkins test runners are
dying every 10 minutes or so. They don't answer on the network, on the
console or through serial console.

The kernel backtraces we got are:
```
buildd04 login: [ 1443.707658] BUG: unable to handle kernel paging request at 2d5e501d
[ 1443.707969] IP: [] mntget+0xf/0x20
[ 1443.708086] *pdpt = 24056001 *pde = 
[ 1443.708237] Oops: 0002 [#1] SMP
[ 1443.708325] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables xt_comment veth ebtable_filter ebtables dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables zram lz4_compress bridge stp llc kvm_intel ppdev kvm irqbypass crc32_pclmul aesni_intel aes_i586 xts lrw gf128mul ablk_helper cryptd joydev input_leds serio_raw parport_pc 8250_fintek i2c_piix4 mac_hid lp parport autofs4 btrfs xor raid6_pq psmouse virtio_scsi pata_acpi floppy
[ 1443.710365] CPU: 1 PID: 14167 Comm: apparmor_parser Not tainted 4.4.0-65-generic #86-Ubuntu
[ 1443.710505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 1443.710651] task: f5920a00 ti: e63f2000 task.ti: e63f2000
[ 1443.710776] EIP: 0060:[] EFLAGS: 00010286 CPU: 1
[ 1443.710875] EIP is at mntget+0xf/0x20
[ 1443.710946] EAX: f57e4d90 EBX:  ECX: c1d333cc EDX: 0002801d
[ 1443.711088] ESI: c1d36404 EDI: c1d36408 EBP: e63f3de8 ESP: e63f3de8
[ 1443.711228]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 1443.711334] CR0: 80050033 CR2: 2d5e501d CR3: 35072440 CR4: 001406f0
[ 1443.711471] Stack:
[ 1443.711593]  e63f3e04 c1203752 c13b7f71 c1d333cc eebb5980 e59d71e0 41ed e63f3e30
[ 1443.711822]  c130546b e59d7230 1a628dcf 0003  e63f3e58 6c0a010a e53b6800
[ 1443.712044]  00de eebb5980 e63f3e44 c13055be    e63f3e6c
[ 1443.712264] Call Trace:
[ 1443.712314]  [] simple_pin_fs+0x32/0xa0
[ 1443.712421]  [] ? vsnprintf+0x321/0x420
[ 1443.712516]  [] securityfs_create_dentry+0x5b/0x150
[ 1443.712632]  [] securityfs_create_dir+0x2e/0x30
[ 1443.712729]  [] __aa_fs_profile_mkdir+0x46/0x3c0
[ 1443.712826]  [] aa_replace_profiles+0x4c0/0xbc0
[ 1443.712927]  [] ? ns_capable_common+0x55/0x80
[ 1443.713022]  [] policy_update+0x97/0x230
[ 1443.713122]  [] ? security_file_permission+0x39/0xc0
[ 1443.713247]  [] profile_replace+0x98/0xe0
[ 1443.713346]  [] ? policy_update+0x230/0x230
[ 1443.713445]  [] __vfs_write+0x1f/0x50
[ 1443.713535]  [] vfs_write+0x8c/0x1b0
[ 1443.713633]  [] SyS_write+0x51/0xb0
[ 1443.713738]  [] do_fast_syscall_32+0x8d/0x150
[ 1443.713838]  [] sysenter_past_esp+0x3d/0x61
[ 1443.713938] Code: c0 74 09 83 42 10 01 89 d0 5b 5d c3 3b 5b 10 b8 fe ff ff ff 75 e3 eb eb 8d 74 26 00 55 89 e5 3e 8d 74 26 00 85 c0 74 06 8b 50 14 <64> ff 02 5d c3 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 3e
[ 1443.715713] EIP: [] mntget+0xf/0x20 SS:ESP 0068:e63f3de8
[ 1443.715852] CR2: 2d5e501d
```

```
buildd07 login: [ 1262.522071] BUG: unable to handle kernel NULL pointer dereference at 0008
[ 1262.522339] IP: [] mntput_no_expire+0x68/0x180
[ 1262.522464] PGD 439912067 PUD 43997f067 PMD 0
[ 1262.522556] Oops: 0002 [#1] SMP
[ 1262.522760] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables xt_comment veth ebtable_filter ebtables dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc zram lz4_compress zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ppdev ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd input_leds joydev i2c_piix4 serio_raw 8250_fintek parport_pc mac_hid lp parport autofs4 btrfs xor raid6_pq psmouse virtio_scsi pata_acpi floppy
[ 1262.535658] CPU: 10 PID: 163332 Comm: apparmor_parser Tainted: P   O    4.4.0-65-generic #86-Ubuntu
[ 1262.536544] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 1262.53] task: 88043c3fd400 ti: 88044fed task.ti: 88044fed
[ 1262.536773] RIP: 0010:[]  [] mntput_no_expire+0x68/0x180
[ 1262.537046] RAX:  RBX: 88046a74e480 RCX: 
[ 1262.537205] RDX:  RSI: 020

[Kernel-packages] [Bug 1669611] Re: Regression in 4.4.0-65-generic causes very frequent system crashes

2017-03-02 Thread Stéphane Graber

We can reproduce this very easily by triggering a LXD testsuite run,
which creates and deletes a lot of AppArmor profiles and namespaces and
so triggers this issue. A busy LXD host would also hit this eventually
(if the similar BUG we had before is any indication).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1669611

Title:
  Regression in 4.4.0-65-generic causes very frequent system crashes

Status in linux package in Ubuntu:
  Triaged

Bug description:
  After upgrading to 4.4.0-65-generic all of our Jenkins test runners
  are dying every 10 minutes or so. They don't answer on the network, on
  the console or through serial console.

  The kernel backtraces we got are:
  ```
  buildd04 login: [ 1443.707658] BUG: unable to handle kernel paging request at 
2d5e501d
  [ 1443.707969] IP: [] mntget+0xf/0x20
  [ 1443.708086] *pdpt = 24056001 *pde = 
  [ 1443.708237] Oops: 0002 [#1] SMP
  [ 1443.708325] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables 
xt_comment veth ebtable_filter ebtables dm_snapshot dm_thin_pool 
dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp 
iptable_filter ip_tables x_tables zram lz4_compress bridge stp llc kvm_intel 
ppdev kvm irqbypass crc32_pclmul aesni_intel aes_i586 xts lrw gf128mul 
ablk_helper cryptd joydev input_leds serio_raw parport_pc 8250_fintek i2c_piix4 
mac_hid lp parport autofs4 btrfs xor raid6_pq psmouse virtio_scsi pata_acpi 
floppy
  [ 1443.710365] CPU: 1 PID: 14167 Comm: apparmor_parser Not tainted 
4.4.0-65-generic #86-Ubuntu
  [ 1443.710505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Bochs 01/01/2011
  [ 1443.710651] task: f5920a00 ti: e63f2000 task.ti: e63f2000
  [ 1443.710776] EIP: 0060:[] EFLAGS: 00010286 CPU: 1
  [ 1443.710875] EIP is at mntget+0xf/0x20
  [ 1443.710946] EAX: f57e4d90 EBX:  ECX: c1d333cc EDX: 0002801d
  [ 1443.711088] ESI: c1d36404 EDI: c1d36408 EBP: e63f3de8 ESP: e63f3de8
  [ 1443.711228]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  [ 1443.711334] CR0: 80050033 CR2: 2d5e501d CR3: 35072440 CR4: 001406f0
  [ 1443.711471] Stack:
  [ 1443.711593]  e63f3e04 c1203752 c13b7f71 c1d333cc eebb5980 e59d71e0 
41ed e63f3e30
  [ 1443.711822]  c130546b e59d7230 1a628dcf 0003  e63f3e58 
6c0a010a e53b6800
  [ 1443.712044]  00de eebb5980 e63f3e44 c13055be   
 e63f3e6c
  [ 1443.712264] Call Trace:
  [ 1443.712314]  [] simple_pin_fs+0x32/0xa0
  [ 1443.712421]  [] ? vsnprintf+0x321/0x420
  [ 1443.712516]  [] securityfs_create_dentry+0x5b/0x150
  [ 1443.712632]  [] securityfs_create_dir+0x2e/0x30
  [ 1443.712729]  [] __aa_fs_profile_mkdir+0x46/0x3c0
  [ 1443.712826]  [] aa_replace_profiles+0x4c0/0xbc0
  [ 1443.712927]  [] ? ns_capable_common+0x55/0x80
  [ 1443.713022]  [] policy_update+0x97/0x230
  [ 1443.713122]  [] ? security_file_permission+0x39/0xc0
  [ 1443.713247]  [] profile_replace+0x98/0xe0
  [ 1443.713346]  [] ? policy_update+0x230/0x230
  [ 1443.713445]  [] __vfs_write+0x1f/0x50
  [ 1443.713535]  [] vfs_write+0x8c/0x1b0
  [ 1443.713633]  [] SyS_write+0x51/0xb0
  [ 1443.713738]  [] do_fast_syscall_32+0x8d/0x150
  [ 1443.713838]  [] sysenter_past_esp+0x3d/0x61
  [ 1443.713938] Code: c0 74 09 83 42 10 01 89 d0 5b 5d c3 3b 5b 10 b8 fe ff ff 
ff 75 e3 eb eb 8d 74 26 00 55 89 e5 3e 8d 74 26 00 85 c0 74 06 8b 50 14 <64> ff 
02 5d c3 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 3e
  [ 1443.715713] EIP: [] mntget+0xf/0x20 SS:ESP 0068:e63f3de8
  [ 1443.715852] CR2: 2d5e501d
  ```


[Kernel-packages] [Bug 1669611] Re: Regression in 4.4.0-65-generic causes very frequent system crashes

2017-03-02 Thread Stéphane Graber

Running the same thing on zesty to see if the problem is present there too.
We get something a bit different, but the result ends up being the same: all the
test runners crash.

```
buildd07 login: [  976.607283] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[  988.645772] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1004.605673] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1009.642113] INFO: rcu_sched self-detected stall on CPU
[ 1009.645498]  3-...: (13599 ticks this GP) idle=769/141/0 softirq=32564/32569 fqs=6049
[ 1009.649690] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1009.649697]  3-...: (13599 ticks this GP) idle=769/141/0 softirq=32564/32569 fqs=6049
[ 1009.649699]  11-...: (1 GPs behind) idle=8cd/140/0 softirq=36685/36686 fqs=6049
[ 1009.649700]  (detected by 9, t=15002 jiffies, g=20785, c=20784, q=16519)
[ 1009.663598]   (t=15005 jiffies g=20785 c=20784 q=16519)
[ 1016.645667] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1036.606795] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1044.645665] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1064.605727] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1072.645669] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1092.605690] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1100.645669] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [lxd:22980]
[ 1120.605669] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1128.645698] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [lxd:22980]
[ 1148.605669] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1156.645721] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [lxd:22980]
[ 1176.605670] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1184.645668] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [lxd:22980]
[ 1189.665664] INFO: rcu_sched self-detected stall on CPU
[ 1189.669683] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1189.669689]  3-...: (56078 ticks this GP) idle=769/141/0 softirq=32564/32569 fqs=26269
[ 1189.669695]  11-...: (1 GPs behind) idle=8cd/140/0 softirq=36685/36686 fqs=26269
[ 1189.669696]  (detected by 2, t=60007 jiffies, g=20785, c=20784, q=16775)
[ 1189.691113]  3-...: (56078 ticks this GP) idle=769/141/0 softirq=32564/32569 fqs=26272
[ 1189.692748]   (t=60012 jiffies g=20785 c=20784 q=16775)
[ 1212.645668] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1216.605666] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [lxd:34563]
[ 1240.645876] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1244.606272] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [lxd:34563]
[ 1268.645669] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1272.608277] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [lxd:34563]
[ 1296.645701] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1300.605699] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [lxd:34563]
[ 1324.645670] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1328.605706] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1352.645674] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1356.605673] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1369.685663] INFO: rcu_sched self-detected stall on CPU
[ 1369.692335] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1369.692342]  3-...: (98608 ticks this GP) idle=769/141/0 softirq=32564/32569 fqs=45792
[ 1369.692345]  11-...: (1 GPs behind) idle=8cd/140/0 softirq=36685/36686 fqs=45792
[ 1369.692345]  (detected by 9, t=105012 jiffies, g=20785, c=20784, q=17084)
[ 1369.728009]  3-...: (98608 ticks this GP) idle=769/141/0 softirq=32564/32569 fqs=45797
[ 1369.728591]   (t=105021 jiffies g=20785 c=20784 q=17084)
[ 1380.645674] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1396.605667] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1408.645668] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1424.605671] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
[ 1436.645773] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [lxd:22980]
[ 1452.605666] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxd:34563]
```
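
To see at a glance which CPUs the watchdog is complaining about in a log
like the one above, the soft-lockup lines can be tallied per CPU. This is
only a sketch: "lockups_per_cpu" is a hypothetical helper name, and it
assumes the kernel log is piped in on stdin in the message format shown here.

```sh
#!/bin/sh
# Sketch only: count "soft lockup" reports per CPU in kernel log text
# read from stdin. The helper name is an illustrative assumption.
lockups_per_cpu() {
    # Extract the CPU number from each soft-lockup line, then count repeats.
    sed -n 's/.*soft lockup - CPU#\([0-9][0-9]*\) stuck.*/\1/p' |
        sort -n | uniq -c |
        awk '{ printf "CPU#%s: %s\n", $2, $1 }'
}
```

On an affected runner this could be used as "dmesg | lockups_per_cpu".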

```
buildd06 login: [ 1416.276297] BUG: unable to handle kernel paging request at 
2717207f
[ 1416.279824] IP: mntput_no_expire+0x11/0x170
[ 1416.281522] *pdpt = 2afb3001 *pde = 
[ 1416.281525]
[ 1416.286449] Oops: 0002 [#1] SMP
[ 1416.289702] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_rejec

Re: [Kernel-packages] [Bug 1669611] Re: Regression in 4.4.0-65-generic causes very frequent system crashes

2017-03-09 Thread Stéphane Graber

I'll install -67 on our jenkins runners and see if we can reproduce it.
The changelog is a bit confusing as it shows a whole bunch of apparmor
reverts, including the commits that were meant to fix this issue. So
it's unclear whether a proper implementation of the fix was then applied
on top. If not, this kernel obviously wouldn't fix the issue.

https://bugs.launchpad.net/bugs/1669611

Title:
  Regression in 4.4.0-65-generic causes very frequent system crashes

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Zesty:
  Fix Committed

Bug description:
  After upgrading to 4.4.0-65-generic all of our Jenkins test runners
  are dying every 10 minutes or so. They don't answer on the network, on
  the console or through serial console.

  The kernel backtraces we got are:
  ```
  buildd04 login: [ 1443.707658] BUG: unable to handle kernel paging request at 
2d5e501d
  [ 1443.707969] IP: [] mntget+0xf/0x20
  [ 1443.708086] *pdpt = 24056001 *pde = 
  [ 1443.708237] Oops: 0002 [#1] SMP
  [ 1443.708325] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables 
xt_comment veth ebtable_filter ebtables dm_snapshot dm_thin_pool 
dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp 
iptable_filter ip_tables x_tables zram lz4_compress bridge stp llc kvm_intel 
ppdev kvm irqbypass crc32_pclmul aesni_intel aes_i586 xts lrw gf128mul 
ablk_helper cryptd joydev input_leds serio_raw parport_pc 8250_fintek i2c_piix4 
mac_hid lp parport autofs4 btrfs xor raid6_pq psmouse virtio_scsi pata_acpi 
floppy
  [ 1443.710365] CPU: 1 PID: 14167 Comm: apparmor_parser Not tainted 
4.4.0-65-generic #86-Ubuntu
  [ 1443.710505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Bochs 01/01/2011
  [ 1443.710651] task: f5920a00 ti: e63f2000 task.ti: e63f2000
  [ 1443.710776] EIP: 0060:[] EFLAGS: 00010286 CPU: 1
  [ 1443.710875] EIP is at mntget+0xf/0x20
  [ 1443.710946] EAX: f57e4d90 EBX:  ECX: c1d333cc EDX: 0002801d
  [ 1443.711088] ESI: c1d36404 EDI: c1d36408 EBP: e63f3de8 ESP: e63f3de8
  [ 1443.711228]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  [ 1443.711334] CR0: 80050033 CR2: 2d5e501d CR3: 35072440 CR4: 001406f0
  [ 1443.711471] Stack:
  [ 1443.711593]  e63f3e04 c1203752 c13b7f71 c1d333cc eebb5980 e59d71e0 
41ed e63f3e30
  [ 1443.711822]  c130546b e59d7230 1a628dcf 0003  e63f3e58 
6c0a010a e53b6800
  [ 1443.712044]  00de eebb5980 e63f3e44 c13055be   
 e63f3e6c
  [ 1443.712264] Call Trace:
  [ 1443.712314]  [] simple_pin_fs+0x32/0xa0
  [ 1443.712421]  [] ? vsnprintf+0x321/0x420
  [ 1443.712516]  [] securityfs_create_dentry+0x5b/0x150
  [ 1443.712632]  [] securityfs_create_dir+0x2e/0x30
  [ 1443.712729]  [] __aa_fs_profile_mkdir+0x46/0x3c0
  [ 1443.712826]  [] aa_replace_profiles+0x4c0/0xbc0
  [ 1443.712927]  [] ? ns_capable_common+0x55/0x80
  [ 1443.713022]  [] policy_update+0x97/0x230
  [ 1443.713122]  [] ? security_file_permission+0x39/0xc0
  [ 1443.713247]  [] profile_replace+0x98/0xe0
  [ 1443.713346]  [] ? policy_update+0x230/0x230
  [ 1443.713445]  [] __vfs_write+0x1f/0x50
  [ 1443.713535]  [] vfs_write+0x8c/0x1b0
  [ 1443.713633]  [] SyS_write+0x51/0xb0
  [ 1443.713738]  [] do_fast_syscall_32+0x8d/0x150
  [ 1443.713838]  [] sysenter_past_esp+0x3d/0x61
  [ 1443.713938] Code: c0 74 09 83 42 10 01 89 d0 5b 5d c3 3b 5b 10 b8 fe ff ff 
ff 75 e3 eb eb 8d 74 26 00 55 89 e5 3e 8d 74 26 00 85 c0 74 06 8b 50 14 <64> ff 
02 5d c3 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 3e
  [ 1443.715713] EIP: [] mntget+0xf/0x20 SS:ESP 0068:e63f3de8
  [ 1443.715852] CR2: 2d5e501d
  ```

  ```
  buildd07 login: [ 1262.522071] BUG: unable to handle kernel NULL pointer 
dereference at 0008
  [ 1262.522339] IP: [] mntput_no_expire+0x68/0x180
  [ 1262.522464] PGD 439912067 PUD 43997f067 PMD 0
  [ 1262.522556] Oops: 0002 [#1] SMP
  [ 1262.522760] Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables 
xt_comment veth ebtable_filter ebtables dm_snapshot dm_thin_pool 
dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp 
iptable_filter ip_tables x_tables bridge stp llc zram lz4_compress zfs(PO) 
zunicode(PO) zcommon(PO) znvpair(PO) spl(O) za

[Kernel-packages] [Bug 1669611] Re: Regression in 4.4.0-65-generic causes very frequent system crashes

2017-03-09 Thread Stéphane Graber

Oh, I got confused between the two bug reports. So -67 is just the
revert. If so, then it's fine: we've been running a pre-upload build of
this, provided by Jon, for a while now and haven't seen any full hang.
We do still run into the original apparmor bug, but at least it's no
worse than before.

https://bugs.launchpad.net/bugs/1669611

Title:
  Regression in 4.4.0-65-generic causes very frequent system crashes

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Zesty:
  Fix Committed


[Kernel-packages] [Bug 1672749] Re: Please don't assume zfs module is always loaded

2017-03-14 Thread Stéphane Graber

I'd have preferred that Ubuntu's zfsutils be patched to attempt to load
the kernel module as needed. As it stands, this change means that any
documentation telling the user to run "zpool create" or similar zfs
commands will now fail unless the user manually loads the module with
modprobe.

That very much feels like a regression to me. I understand the reason
for not always loading the module, but having the tools attempt to load
it would have achieved the same thing without breaking every single tool
that relies on the zfs tools in the process.

** Changed in: lxd (Ubuntu)
   Status: New => Triaged

** Also affects: zfsutils (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: zfsutils (Ubuntu)
   Status: New => Triaged

** Changed in: zfsutils (Ubuntu)
   Importance: Undecided => High

** Package changed: zfsutils (Ubuntu) => zfs-linux (Ubuntu)

** Changed in: zfs-linux (Ubuntu)
 Assignee: (unassigned) => Aron Xu (happyaron)

https://bugs.launchpad.net/bugs/1672749

Title:
  Please don't assume zfs module is always loaded

Status in lxd package in Ubuntu:
  Triaged
Status in zfs-linux package in Ubuntu:
  Triaged

Bug description:
  Since zfsutils-linux/0.6.5.9-4, the zfs module is no longer loaded
  automatically on systems where no zpool exists. This avoids tainting the
  kernel of everyone who has the package installed but is not using zfs.

  The ADT test of lxd (at least at 2.11-0ubuntu4[1]) shows the storage tests
  failing with "Could not determine ZFS module version.", which is
  printed by zfsModuleVersionGet() in lxd/storage_zfs.go. This indicates
  that some code still assumes the zfs kernel module is always loaded,
  which is no longer true.

  [1] https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-zesty/zesty/amd64/l/lxd/20170314_120059_2ce5b@/log.gz

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1672749/+subscriptions


[Kernel-packages] [Bug 1672749] Re: Please don't assume zfs module is always loaded

2017-03-14 Thread Stéphane Graber

Adding a priority "high" task against zfs-linux since this is a post-FF
regression in expected behavior from a tool in main.

Consider this as coming from me as a release team member and TB member
rather than LXD upstream.

My preference here is that, rather than breaking every script and tool
that creates zfs pools on clean systems today, the "zpool" utility
should be patched to automatically load the "zfs" kernel module if it is
not already loaded.


With my upstream LXD hat on, we will modify LXD itself to attempt to
load the zfs module on first use, if only to cope with other distros
that may use this non-optimal behavior.
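
A minimal sketch of the "load on first use" approach described above,
written as a shell helper rather than as the actual LXD or zpool change
(neither of which is shown in this thread). The function names and the
SYSFS_ROOT override are hypothetical; the check relies on
/sys/module/<name> existing once a module is loaded, and modprobe needs
root privileges.

```sh
#!/bin/sh
# Sketch only: helper names and SYSFS_ROOT are illustrative assumptions.

# Return 0 if the named kernel module shows up under /sys/module.
module_loaded() {
    [ -d "${SYSFS_ROOT:-/sys}/module/$1" ]
}

# Load the module only when it is not already present.
ensure_module() {
    module_loaded "$1" && return 0
    modprobe "$1"
}

# Run a zfs tool ("zpool", "zfs") after making sure the module is available.
with_zfs_module() {
    ensure_module zfs || {
        echo "could not load the zfs kernel module" >&2
        return 1
    }
    "$@"
}
```

A script would then call, for example, "with_zfs_module zpool list"
instead of invoking zpool directly, so a clean system without the module
loaded still works.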

https://bugs.launchpad.net/bugs/1672749

Title:
  Please don't assume zfs module is always loaded

Status in lxd package in Ubuntu:
  Triaged
Status in zfs-linux package in Ubuntu:
  Triaged


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1672749/+subscriptions


[Kernel-packages] [Bug 1672749] Re: Please don't assume zfs module is always loaded

2017-03-20 Thread Stéphane Graber

** Changed in: lxd (Ubuntu)
   Status: Triaged => Fix Released

https://bugs.launchpad.net/bugs/1672749

Title:
  Please don't assume zfs module is always loaded

Status in lxd package in Ubuntu:
  Fix Released
Status in zfs-linux package in Ubuntu:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1672749/+subscriptions


[Kernel-packages] [Bug 1567557] Re: Performance degradation of "zfs clone" when under load

2017-07-05 Thread Stéphane Graber

Creating 100 clones
Took: 4 seconds (25/s)
Creating 200 clones
Took: 13 seconds (15/s)
Creating 400 clones
Took: 46 seconds (8/s)
Creating 600 clones
Took: 156 seconds (3/s)


```
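#!/bin/sh
# Benchmark "zfs clone" throughput as the number of existing clones grows.
# Expects an existing pool named "castiana" (the pool used in this test).

# Start from a clean state (the destroy fails harmlessly on the first run).
zfs destroy -R castiana/testzfs
rm -Rf /tmp/testzfs

zfs create -o mountpoint=none castiana/testzfs
zfs snapshot castiana/testzfs@base
mkdir /tmp/testzfs

# clone N: create N clones of the base snapshot and report the elapsed time.
clone() {
    echo "Creating ${1} clones"
    BASE=$(date +%s)
    for i in $(seq "${1}"); do
        UUID=$(uuidgen)
        zfs clone -o mountpoint="/tmp/testzfs/${UUID}" \
            castiana/testzfs@base "castiana/testzfs-${UUID}"
    done
    END=$(date +%s)
    TOTAL=$((END-BASE))
    # Guard against division by zero when a batch finishes within a second.
    DIV=$TOTAL
    [ "$DIV" -gt 0 ] || DIV=1
    PER=$((${1}/DIV))
    echo "Took: $TOTAL seconds ($PER/s)"
}

clone 100
clone 200
clone 400
clone 600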
```

That's on an up-to-date artful system. I'll do a similar test on an
up-to-date xenial system.

https://bugs.launchpad.net/bugs/1567557

Title:
  Performance degradation of "zfs clone" when under load

Status in lxd package in Ubuntu:
  Confirmed
Status in zfs-linux package in Ubuntu:
  Triaged

Bug description:
  I've been running some scale tests for LXD and what I've noticed is
  that "zfs clone" gets slower and slower as the zfs filesystem gets
  busier.

  It feels like "zfs clone" takes some kind of pool-wide lock and so
  needs to wait for all other operations to complete before it can clone
  a new filesystem.

  A basic LXD scale test with btrfs vs zfs shows what I mean; see below
  for the reports.

  The test is run on a completely dedicated physical server with the
  pool on a dedicated SSD; the exact same machine and SSD were used for
  the btrfs test.

  The zfs filesystem is configured with these settings:
   - relatime=on
   - sync=disabled
   - xattr=sa

  So it shouldn't be related to pending sync() calls...
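
For reference, properties like these are applied with "zfs set" and
checked with "zfs get". The sketch below is an assumption about how that
could be scripted, not the commands actually used for this test;
"apply_bench_settings" is a hypothetical helper and the dataset name is
supplied by the caller.

```sh
#!/bin/sh
# Sketch only: apply the three properties listed above to a dataset.
# The helper name is an illustrative assumption.
apply_bench_settings() {
    dataset=$1
    for prop in relatime=on sync=disabled xattr=sa; do
        zfs set "$prop" "$dataset" || return 1
    done
    # Show the resulting values for verification.
    zfs get relatime,sync,xattr "$dataset"
}
```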

  The workload in this case is ultimately 1024 containers running busybox as
  their init system and udhcpc grabbing an IP.
  The problem gets significantly worse when spawning busier containers, say a
  full Ubuntu system.

  === zfs ===
  root@edfu:~# /home/ubuntu/lxd-benchmark spawn --count=1024 --image=images:alpine/edge/amd64 --privileged=true
  Test environment:
    Server backend: lxd
    Server version: 2.0.0.rc8
    Kernel: Linux
    Kernel architecture: x86_64
    Kernel version: 4.4.0-16-generic
    Storage backend: zfs
    Storage version: 5
    Container backend: lxc
    Container version: 2.0.0.rc15

  Test variables:
    Container count: 1024
    Container mode: privileged
    Image: images:alpine/edge/amd64
    Batches: 128
    Batch size: 8
    Remainder: 0

  [Apr  3 06:42:51.170] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
  [Apr  3 06:42:52.657] Starting the test
  [Apr  3 06:42:53.994] Started 8 containers in 1.336s
  [Apr  3 06:42:55.521] Started 16 containers in 2.864s
  [Apr  3 06:42:58.632] Started 32 containers in 5.975s
  [Apr  3 06:43:05.399] Started 64 containers in 12.742s
  [Apr  3 06:43:20.343] Started 128 containers in 27.686s
  [Apr  3 06:43:57.269] Started 256 containers in 64.612s
  [Apr  3 06:46:09.112] Started 512 containers in 196.455s
  [Apr  3 06:58:19.309] Started 1024 containers in 926.652s
  [Apr  3 06:58:19.309] Test completed in 926.652s

  === btrfs ===
  Test environment:
    Server backend: lxd
    Server version: 2.0.0.rc8
    Kernel: Linux
    Kernel architecture: x86_64
    Kernel version: 4.4.0-16-generic
    Storage backend: btrfs
    Storage version: 4.4
    Container backend: lxc
    Container version: 2.0.0.rc15

  Test variables:
    Container count: 1024
    Container mode: privileged
    Image: images:alpine/edge/amd64
    Batches: 128
    Batch size: 8
    Remainder: 0

  [Apr  3 07:42:12.053] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
  [Apr  3 07:42:13.351] Starting the test
  [Apr  3 07:42:14.793] Started 8 containers in 1.442s
  [Apr  3 07:42:16.495] Started 16 containers in 3.144s
  [Apr  3 07:42:19.881] Started 32 containers in 6.530s
  [Apr  3 07:42:26.798] Started 64 containers in 13.447s
  [Apr  3 07:42:42.048] Started 128 containers in 28.697s
  [Apr  3 07:43:13.210] Started 256 containers in 59.859s
  [Apr  3 07:44:26.238] Started 512 containers in 132.887s
  [Apr  3 07:47:30.708] Started 1024 containers in 317.357s
  [Apr  3 07:47:30.708] Test completed in 317.357s
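
The slowdown is easier to compare when the cumulative timings are turned
into per-interval start rates. The following is a rough sketch, assuming
the "Started N containers in Ts" line format shown in the reports above;
"interval_rates" is a hypothetical helper, not part of lxd-benchmark.

```sh
#!/bin/sh
# Sketch only: read lxd-benchmark output on stdin and print, for each
# "Started N containers in Ts" line, the start rate since the previous one.
interval_rates() {
    awk '/Started [0-9]+ containers in/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "Started") { n = $(i + 1) + 0; t = $(i + 4) }
        }
        sub(/s$/, "", t)    # strip the trailing "s" from the elapsed time
        t = t + 0           # force a numeric value
        if (t > pt) {
            printf "%d-%d: %.1f containers/s\n", pn, n, (n - pn) / (t - pt)
        }
        pn = n; pt = t
    }'
}
```

Fed the two logs above, the last interval (512 to 1024 containers) works
out to roughly 0.7 containers/s for zfs and roughly 2.8 containers/s for
btrfs.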

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1567557/+subscriptions

