[Kernel-packages] [Bug 1531768] Re: lxd and other commands get stuck on arm64 kernel and multiple CPUs

2016-02-02 Thread Stéphane Graber
Very much looks like it's related to threading and futexes somehow.

Forcing golang to use a single thread rather than one per container made
things more stable in a very simple test (an infinite loop of "lxc
list"), though starting containers still caused the hang.

I've seen a similar hang on futex when running (lxc-tests package):
lxc-test-concurrent -j 8 -i 50

This creates and spawns 8 containers in parallel using threads and
attempts that 50 times in a row. This is done entirely in C so doesn't
touch golang.
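
For anyone who wants to reproduce the pattern without the lxc-tests
package, here is a rough Python sketch of the same shape of test: N
workers in parallel, repeated M times, with a timeout so a hang gets
reported instead of waited on. This is not lxc-test-concurrent itself
(which drives the LXC C API directly); run_once, stress and the command
being run are hypothetical names chosen for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, timeout):
    """Run cmd once; return True if it finished in time, False if it timed out."""
    try:
        subprocess.run(cmd, timeout=timeout,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return True
    except subprocess.TimeoutExpired:
        # Note: subprocess.run() kills the child and waits for it before
        # raising. A task stuck in uninterruptible sleep may not die, in
        # which case this call would itself block.
        return False


def stress(cmd, jobs=8, iterations=50, timeout=120.0):
    """Run `jobs` copies of cmd in parallel, `iterations` times in a row.

    Returns the 1-based iteration in which a worker first timed out,
    or None if every round completed.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for i in range(1, iterations + 1):
            results = list(pool.map(lambda _: run_once(cmd, timeout), range(jobs)))
            if not all(results):
                return i
    return None
```

Something like stress(["lxc", "list"]) would mirror the simple loop
mentioned above; the defaults of 8 jobs and 50 iterations are only there
to match the -j 8 -i 50 invocation.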

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1531768

Title:
  lxd and other commands get stuck on arm64 kernel and multiple CPUs

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I created an 8 CPU arm64 instance on Canonical's Scalingstack (which I
  want to use for armhf autopkgtesting in LXD). I started with wily as
  that has lxd available (it's not yet available in trusty nor the PPA
  for arm64).

  However, pretty much any LXD task that I do (I haven't tried much
  else) on this machine takes unbearably long. A simple "lxc profile set
  default raw.lxc lxc.seccomp=" or "lxc list" takes several minutes.

  I see tons of

  [ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
  [ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.

  in dmesg (the attached apport info has the complete dmesg).
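
  To count these reports rather than eyeball them, a small parser along
  the following lines may help. It is only a sketch: the regular
  expression is written against the single "blocked for more than" line
  quoted above, and hung_tasks is a hypothetical helper name.

```python
import re

# Matches hung-task reports such as:
# [ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def hung_tasks(dmesg_text):
    """Yield (timestamp, task name, pid, seconds) for each hung-task report."""
    for line in dmesg_text.splitlines():
        m = HUNG_TASK.search(line)
        if m:
            yield float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"])
```

  Feeding it the output of dmesg gives one tuple per blocked task; other
  lines, such as the rcu_sched one, are skipped.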

  ProblemType: Bug
  DistroRelease: Ubuntu 15.10
  Package: linux-image-4.2.0-22-generic 4.2.0-22.27
  ProcVersionSignature: Ubuntu 4.2.0-22.27-generic 4.2.6
  Uname: Linux 4.2.0-22-generic aarch64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Jan  7 09:18 seq
   crw-rw 1 root audio 116, 33 Jan  7 09:18 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.19.1-0ubuntu5
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  Date: Thu Jan  7 09:24:01 2016
  IwConfig:
   eth0      no wireless extensions.

   lo        no wireless extensions.

   lxcbr0    no wireless extensions.
  Lspci:
   00:00.0 Host bridge [0600]: Red Hat, Inc. Device [1b36:0008]
    Subsystem: Red Hat, Inc Device [1af4:1100]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:

  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-22-generic root=LABEL=cloudimg-rootfs earlyprintk
  RelatedPackageVersions:
   linux-restricted-modules-4.2.0-22-generic N/A
   linux-backports-modules-4.2.0-22-generic  N/A
   linux-firmware                            1.149.3
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1531768/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1531768] Re: lxd and other commands get stuck on arm64 kernel and multiple CPUs

2016-02-02 Thread Martin Pitt
Reducing the number of threads that Go uses seems to help a bit:

$ cat /etc/systemd/system/lxd.service.d/override.conf
[Service]
Environment=GOMAXPROCS=1

(GOMAXPROCS defaults to the number of CPUs). But Stéphane is still able
to lock up LXD pretty fast even with that.
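
To spell out what that default means, here is a small Python sketch of
the selection rule as I understand it: a positive integer in the
GOMAXPROCS environment variable wins, anything else falls back to the
CPU count. It only mirrors the rule for illustration and does not call
into Go; effective_gomaxprocs is a hypothetical name, and newer Go
releases may take other limits into account that this ignores.

```python
import os


def effective_gomaxprocs(environ=None, ncpu=None):
    """Approximate the thread limit a Go program would start with.

    Assumption: a positive decimal integer in GOMAXPROCS is used as-is;
    an unset, empty, non-numeric or zero value falls back to the CPU count.
    """
    environ = os.environ if environ is None else environ
    ncpu = (os.cpu_count() or 1) if ncpu is None else ncpu
    raw = environ.get("GOMAXPROCS", "")
    if raw.isascii() and raw.isdigit() and int(raw) > 0:
        return int(raw)
    return ncpu
```

With the override above in place this gives 1; without it, an 8 CPU
instance gives 8.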



[Kernel-packages] [Bug 1531768] Re: lxd and other commands get stuck on arm64 kernel and multiple CPUs

2016-01-28 Thread Martin Pitt
I managed to get the 4x CPU instance into the same locked up state now,
so AFAICS the problem isn't fundamentally different between 4 and 8
cores.



[Kernel-packages] [Bug 1531768] Re: lxd and other commands get stuck on arm64 kernel and multiple CPUs

2016-01-26 Thread Martin Pitt
Retitling. The "unusably slow" part was fixed by installing haveged,
so what remains is that the 8x CPU instance gets into this lockup state
after some time.
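
haveged is an entropy-gathering daemon, so the slowness was presumably
down to the instance running short of entropy. A quick way to check
that on a kernel of this vintage is to read the kernel's entropy
estimate from /proc. The sketch below assumes that standard path; the
200-bit cut-off in looks_starved is an arbitrary illustrative value,
not a kernel constant, and both function names are hypothetical.

```python
from pathlib import Path

# Standard Linux procfs location of the kernel's entropy estimate.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def entropy_available(path=ENTROPY_AVAIL):
    """Return the entropy estimate (in bits) stored in the given file."""
    return int(Path(path).read_text().strip())


def looks_starved(bits, threshold=200):
    """Return True if the estimate is below an arbitrary illustrative cut-off."""
    return bits < threshold
```

Comparing the value before and after installing haveged should show
whether entropy was the bottleneck.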

On the 4x instance I'm now running adt-run in a loop; so far it's
through ~10 iterations. I'll let it run overnight and see how it
keeps up.

** Summary changed:

- arm64 kernel and multiple CPUs is unusably slow
+ lxd and other commands get stuck on arm64 kernel and multiple CPUs
