[Kernel-packages] [Bug 1908108] [NEW] tg3: transmit timed out, resetting

2020-12-14 Thread Jason Hobbs

Public bug reported:

On a Kubernetes deployment, we're seeing a machine have issues with its
tg3-driven NICs.

We see:

Dec 14 07:44:08 juju-fcf29c-0-lxd-1 kernel: [ 1496.772960] tg3 :02:00.1 eth1: transmit timed out, resetting

Around that time, we have issues with services losing network
connections.

A juju crashdump with logs is available here:
https://oil-jenkins.canonical.com/artifacts/c15028dc-46fa-4f08-8895-55e9d500c362/generated/generated/kubernetes/juju-crashdump-kubernetes-2020-12-14-07.48.49.tar.gz

syslog is at kubernetes-master_0/var/log/syslog

This is on focal:

[0.00] kernel: Linux version 5.4.0-58-generic
(buildd@lcy01-amd64-004) (gcc version 9.3.0 (Ubuntu
9.3.0-17ubuntu1~20.04)) #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020
(Ubuntu 5.4.0-58.64-generic 5.4.73)
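
For anyone going through the crashdump, a minimal sketch of how the syslog could be scanned for these resets. The regular expression is modelled only on the single message quoted above; the function name and the second sample line are made up for illustration, and the device field is matched loosely because the PCI address in the quoted line is incomplete.

```python
import re

# Shape taken from the one log line quoted in this report. The device field
# is kept loose (\S+) because the PCI address in that line is truncated.
TX_TIMEOUT = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] tg3 (?P<device>\S+) (?P<iface>\w+): "
    r"transmit timed out, resetting"
)


def find_tx_timeouts(lines):
    """Return (uptime_seconds, interface) for each tg3 transmit-timeout line."""
    hits = []
    for line in lines:
        match = TX_TIMEOUT.search(line)
        if match:
            hits.append((float(match.group("uptime")), match.group("iface")))
    return hits


sample = [
    # Line from the report, joined onto one line.
    "Dec 14 07:44:08 juju-fcf29c-0-lxd-1 kernel: [ 1496.772960] tg3 "
    ":02:00.1 eth1: transmit timed out, resetting",
    # Invented filler line, only here to show that non-matching lines are skipped.
    "Dec 14 07:44:09 juju-fcf29c-0-lxd-1 kernel: [ 1497.000000] unrelated message",
]
timeouts = find_tx_timeouts(sample)
```

To run it against the real log, pass an open file handle for kubernetes-master_0/var/log/syslog from the extracted crashdump instead of the sample list; comparing the returned uptimes with the times services lost their connections would show whether the two line up.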

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New

** Description changed:

  On a deploy of kubernetes, we're seeing a machine have issues with its
  tg3 driven nics.
  
  We see:
  
  Dec 14 07:44:08 juju-fcf29c-0-lxd-1 kernel: [ 1496.772960] tg3
  :02:00.1 eth1: transmit timed out, resetting
  
  Around that time, we have issues with services losing network
  connections.
  
+ A juju crashdump with logs is available here:
+ https://oil-jenkins.canonical.com/artifacts/c15028dc-46fa-4f08-8895-55e9d500c362/generated/generated/kubernetes/juju-crashdump-kubernetes-2020-12-14-07.48.49.tar.gz
+ 
+ syslog is at kubernetes-master_0/var/log/syslog
  
  this is on focal:
  
  [0.00] kernel: Linux version 5.4.0-58-generic
  (buildd@lcy01-amd64-004) (gcc version 9.3.0 (Ubuntu
  9.3.0-17ubuntu1~20.04)) #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020
  (Ubuntu 5.4.0-58.64-generic 5.4.73)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1908108

Title:
  tg3: transmit timed out, resetting

Status in linux package in Ubuntu:
  New

Bug description:
  On a deploy of kubernetes, we're seeing a machine have issues with its
  tg3 driven nics.

  We see:

  Dec 14 07:44:08 juju-fcf29c-0-lxd-1 kernel: [ 1496.772960] tg3
  :02:00.1 eth1: transmit timed out, resetting

  Around that time, we have issues with services losing network
  connections.

  A juju crashdump with logs is available here:
  https://oil-jenkins.canonical.com/artifacts/c15028dc-46fa-4f08-8895-55e9d500c362/generated/generated/kubernetes/juju-crashdump-kubernetes-2020-12-14-07.48.49.tar.gz

  syslog is at kubernetes-master_0/var/log/syslog

  this is on focal:

  [0.00] kernel: Linux version 5.4.0-58-generic
  (buildd@lcy01-amd64-004) (gcc version 9.3.0 (Ubuntu
  9.3.0-17ubuntu1~20.04)) #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020
  (Ubuntu 5.4.0-58.64-generic 5.4.73)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1908108/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Re: [Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Jason Hobbs

@Ryan we do not test Xenial or Disco.

On Thu, Aug 22, 2019 at 7:41 PM Ryan Harper <1784...@bugs.launchpad.net>
wrote:

> Finally, I did verify xenial proposed with our original test.  I had
> over 100 installs with no issue.
>
> @Jason
>
> Have you had any runs on Xenial or Disco?  (or do you not test those)?
>
> --
> You received this bug notification because you are a member of Canonical
> Field Critical, which is subscribed to a duplicate bug report (1796292).
> https://bugs.launchpad.net/bugs/1784665
>
> Title:
>   bcache: bch_allocator_thread(): hung task timeout
>
> Status in linux package in Ubuntu:
>   Fix Committed
> Status in linux source package in Xenial:
>   Fix Committed
> Status in linux source package in Bionic:
>   New
> Status in linux source package in Disco:
>   Fix Committed
> Status in linux source package in Eoan:
>   Fix Committed
>
> Bug description:
>   [Impact]
>
>   bcache_allocator() can call the following:
>
>    bch_allocator_thread()
>    -> bch_prio_write()
>       -> bch_bucket_alloc()
>          -> wait on &ca->set->bucket_wait
>
>   But the wake-up event on bucket_wait is supposed to come from
>   bch_allocator_thread() itself, causing a deadlock.
>
>   [Test Case]
>
>   This is a simple script that can easily trigger the deadlock condition:
>   https://launchpadlibrarian.net/381282009/bcache-basic-repro.sh
>
>   A better test case has also been provided in bug 1796292 (duplicate of
>   this bug):
>   https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh
>
>   [Fix]
>
>   Fix by making the call to bch_prio_write() non-blocking, so that
>   bch_allocator_thread() never waits on itself. Moreover, make sure to
>   wake up the garbage collector thread when bch_prio_write() is failing
>   to allocate buckets to increase the chance of freeing up more buckets.
>
>   In addition to that it would be safe to also import other upstream
>   bcache fixes (all clean cherry picks):
>
>   7e865eba00a3df2dc8c4746173a8ca1c1c7f042e bcache: fix potential deadlock
> in cached_def_free()
>   80265d8dfd77792e133793cef44a21323aac2908 bcache: acquire
> bch_register_lock later in cached_dev_free()
>   ce4c3e19e5201424357a0c82176633b32a98d2ec bcache: Replace
> bch_read_string_list() by __sysfs_match_string()
>   ecb37ce9baac653cc09e2b631393dde3df82979f bcache: Move couple of
> functions to sysfs.c
>   04cbc21137bfa4d7b8771a5b14f3d6c9b2aee671 bcache: Move couple of string
> arrays to sysfs.c
>   5f2b18ec8e1643410a2369f06888951cdedea0bf bcache: Fix a compiler warning
> in bcache_device_init()
>   20d3a518713e394efa5a899c84574b4b79ec5098 bcache: Reduce the number of
> sparse complaints about lock imbalances
>   42361469ae84c851e40cb1f94c8c9a14cdd94039 bcache: Suppress more warnings
> about set-but-not-used variables
>   f0d3814090ac77de94c42b7124c37ece23629197 bcache: Remove an unused
> variable
>   47344e330eabc1515cbe6061eb337100a3ab6d37 bcache: Fix kernel-doc warnings
>   9dfbdec7b7fea1ff1b7b5d5d12980dbc7dca46c7 bcache: Annotate switch
> fall-through
>   4a4e443835a43a79113cc237c472c0d268eb1e1c bcache: Add __printf annotation
> to __bch_check_keys()
>   fd01991d5c20098c5c1ffc4dca6c821cc60a2f74 bcache: Fix indentation
>   ca71df31661a0518ed58a1a59cf1993962153ebb bcache: fix using of loop
> variable in memory shrink
>   f3641c3abd1da978ee969b0203b71b86ec1bfa93 bcache: fix error return value
> in memory shrink
>   688892b3bc05e25da94866e32210e5f503f16f69 bcache: fix incorrect sysfs
> output value of strip size
>   09a44ca2114737e0932257619c16a2b50c7807f1 bcache: use pr_info() to inform
> duplicated CACHE_SET_IO_DISABLE set
>   c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0 bcache: fix high CPU occupancy
> during journal
>   a728eacbbdd229d1d903e46261c57d5206f87a4a bcache: add journal statistic
>   616486ab52ab7f9739b066d958bdd20e65aefd74 bcache: fix writeback target
> calc on large devices
>   1f0ffa67349c56ea54c03ccfd1e073c990e7411e bcache: only set
> BCACHE_DEV_WB_RUNNING when cached device attached
>   eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60 bcache: improve bcache_reboot()
>   9951379b0ca88c95876ad9778b9099e19a95d566 bcache: never writeback a
> discard operation
>
>   [Regression Potential]
>
>   The upstream fixes are all clean cherry picks from stable (most of
>   them are small cleanups), so regression potential is minimal.
>
>   The only special patch is "UBUNTU: SAUCE: bcache: fix deadlock in
>   bcache_allocator()" that is addressing the main deadlock bug (that
>   seems to be a mainline bug - not fixed yet). We should spend more time
>   trying to reproduce this deadlock with a mainline kernel and post the
>   patch to the LKML for review / feedback.
>
>   However, considering that this patch seems to fix/prevent the specific
>   deadlock problem reported in this bug (tested on the affected
>   platform) it can be considered safe to apply it.
>
>   [Original Bug Report]
>
>   $ cat /proc/version_signature
>   Ubuntu 4.15.0-29.31-generic 4.15.18
>
>   $ lsb_release

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Jason Hobbs

** Changed in: linux (Ubuntu Bionic)
   Status: Fix Committed => New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1784665

Title:
  bcache: bch_allocator_thread(): hung task timeout

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  New
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

  bcache_allocator() can call the following:
  
   bch_allocator_thread()
   -> bch_prio_write()
      -> bch_bucket_alloc()
         -> wait on &ca->set->bucket_wait

  But the wake-up event on bucket_wait is supposed to come from
  bch_allocator_thread() itself, causing a deadlock.

  [Test Case]

  This is a simple script that can easily trigger the deadlock condition:
  https://launchpadlibrarian.net/381282009/bcache-basic-repro.sh

  A better test case has also been provided in bug 1796292 (duplicate of
  this bug):
  https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh

  [Fix]

  Fix by making the call to bch_prio_write() non-blocking, so that
  bch_allocator_thread() never waits on itself. Moreover, make sure to
  wake up the garbage collector thread when bch_prio_write() is failing
  to allocate buckets to increase the chance of freeing up more buckets.
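
  As a rough illustration of the idea only, here is a user-space toy model
  in Python. It is not the kernel patch: the class ToyCache, the timeout,
  and the gc_wakeups counter are invented for this sketch, and only the
  names bucket_alloc / prio_write / bucket_wait echo the call chain above.
  The blocking path waits on a condition that nobody else will ever signal
  (the timeout exists just so the demo terminates), while the non-blocking
  path returns at once and records a simulated GC wake-up.

```python
import threading


class ToyCache:
    """Toy stand-in for a cache device; not the kernel's data structures."""

    def __init__(self):
        self.free_buckets = 0                     # nothing free yet
        self.bucket_wait = threading.Condition()  # models bucket_wait
        self.gc_wakeups = 0                       # counts simulated GC wake-ups

    def bucket_alloc(self, wait, timeout=0.05):
        """Take a free bucket, optionally waiting for one to appear."""
        with self.bucket_wait:
            if self.free_buckets == 0:
                if not wait:
                    return None                   # non-blocking: give up at once
                # Blocking: in this model only the caller itself could ever
                # notify the condition, so the wait cannot be satisfied.
                # The timeout is here purely so the example finishes.
                self.bucket_wait.wait(timeout)
                if self.free_buckets == 0:
                    return None
            self.free_buckets -= 1
            return "bucket"

    def prio_write(self, wait):
        """Model of the priority write: it needs one bucket to proceed."""
        bucket = self.bucket_alloc(wait)
        if bucket is None and not wait:
            self.gc_wakeups += 1                  # nudge GC to free more buckets
        return bucket is not None


cache = ToyCache()
stalled = cache.prio_write(wait=True)      # sits in the wait until the timeout
skipped = cache.prio_write(wait=False)     # returns immediately, wakes "GC"
cache.free_buckets = 1                     # pretend GC freed a bucket
succeeded = cache.prio_write(wait=False)   # now the write can go ahead
```

  In the real deadlock there is no timeout, so the blocking call would
  simply never return; the non-blocking variant lets the allocator loop
  carry on and retry once buckets have been reclaimed.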

  In addition to that it would be safe to also import other upstream
  bcache fixes (all clean cherry picks):

  7e865eba00a3df2dc8c4746173a8ca1c1c7f042e bcache: fix potential deadlock in cached_def_free()
  80265d8dfd77792e133793cef44a21323aac2908 bcache: acquire bch_register_lock later in cached_dev_free()
  ce4c3e19e5201424357a0c82176633b32a98d2ec bcache: Replace bch_read_string_list() by __sysfs_match_string()
  ecb37ce9baac653cc09e2b631393dde3df82979f bcache: Move couple of functions to sysfs.c
  04cbc21137bfa4d7b8771a5b14f3d6c9b2aee671 bcache: Move couple of string arrays to sysfs.c
  5f2b18ec8e1643410a2369f06888951cdedea0bf bcache: Fix a compiler warning in bcache_device_init()
  20d3a518713e394efa5a899c84574b4b79ec5098 bcache: Reduce the number of sparse complaints about lock imbalances
  42361469ae84c851e40cb1f94c8c9a14cdd94039 bcache: Suppress more warnings about set-but-not-used variables
  f0d3814090ac77de94c42b7124c37ece23629197 bcache: Remove an unused variable
  47344e330eabc1515cbe6061eb337100a3ab6d37 bcache: Fix kernel-doc warnings
  9dfbdec7b7fea1ff1b7b5d5d12980dbc7dca46c7 bcache: Annotate switch fall-through
  4a4e443835a43a79113cc237c472c0d268eb1e1c bcache: Add __printf annotation to __bch_check_keys()
  fd01991d5c20098c5c1ffc4dca6c821cc60a2f74 bcache: Fix indentation
  ca71df31661a0518ed58a1a59cf1993962153ebb bcache: fix using of loop variable in memory shrink
  f3641c3abd1da978ee969b0203b71b86ec1bfa93 bcache: fix error return value in memory shrink
  688892b3bc05e25da94866e32210e5f503f16f69 bcache: fix incorrect sysfs output value of strip size
  09a44ca2114737e0932257619c16a2b50c7807f1 bcache: use pr_info() to inform duplicated CACHE_SET_IO_DISABLE set
  c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0 bcache: fix high CPU occupancy during journal
  a728eacbbdd229d1d903e46261c57d5206f87a4a bcache: add journal statistic
  616486ab52ab7f9739b066d958bdd20e65aefd74 bcache: fix writeback target calc on large devices
  1f0ffa67349c56ea54c03ccfd1e073c990e7411e bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
  eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60 bcache: improve bcache_reboot()
  9951379b0ca88c95876ad9778b9099e19a95d566 bcache: never writeback a discard operation

  [Regression Potential]

  The upstream fixes are all clean cherry picks from stable (most of
  them are small cleanups), so regression potential is minimal.

  The only special patch is "UBUNTU: SAUCE: bcache: fix deadlock in
  bcache_allocator()" that is addressing the main deadlock bug (that
  seems to be a mainline bug - not fixed yet). We should spend more time
  trying to reproduce this deadlock with a mainline kernel and post the
  patch to the LKML for review / feedback.

  However, considering that this patch seems to fix/prevent the specific
  deadlock problem reported in this bug (tested on the affected
  platform) it can be considered safe to apply it.

  [Original Bug Report]

  $ cat /proc/version_signature
  Ubuntu 4.15.0-29.31-generic 4.15.18

  $ lsb_release -rd
  Description:  Ubuntu Cosmic Cuttlefish (development branch)
  Release:  18.10

  $ apt-cache policy linux-image-`uname -r`
  linux-image-4.15.0-29-generic:
    Installed: 4.15.0-29.31
    Candidate: 4.15.0-29.31
    Version table:
   *** 4.15.0-29.31 500
  500 http://archive.ubuntu.com/ubuntu cosmic/main amd64 Packages
  100 /var/lib/dpkg/status

  3) mkfs.ext4 /dev/bcache0 returns successful creating an ext4

[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Jason Hobbs

** Attachment added: "spinda.maas-curtin_config.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+attachment/5284072/+files/spinda.maas-curtin_config.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1784665

Title:
  bcache: bch_allocator_thread(): hung task timeout

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed


[Kernel-packages] [Bug 1784665] Re: bcache: bch_allocator_thread(): hung task timeout

2019-08-22 Thread Jason Hobbs

We're still seeing a bcache timeout failure during curtin install:

2019-08-22T10:16:40+00:00 spinda cloud-init[1604]: finish: cmd-install/stage-partitioning/builtin/cmd-block-meta/clear-holders: FAIL: removing previous storage devices
2019-08-22T10:16:40+00:00 spinda cloud-init[1604]: TIMED BLOCK_META: 1203.679


I attached the rsyslog from a unit that failed.

Linux version 4.15.0-59-generic (buildd@lgw01-amd64-035) (gcc version
7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #66-Ubuntu SMP Wed Aug 14
10:56:44 UTC 2019 (Ubuntu 4.15.0-59.66-generic 4.15.18)

** Attachment added: "messages"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784665/+attachment/5284071/+files/messages

** Tags removed: verification-done-bionic
** Tags added: verification-failed-bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1784665

Title:
  bcache: bch_allocator_thread(): hung task timeout

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed


Re: [Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-07-03 Thread Jason Hobbs

This is difficult for us to test in our lab because we are using MAAS, and
we hit this during MAAS deployments of nodes, so we would need MAAS images
built with these kernels. Additionally, this doesn't reproduce every time,
it is maybe 1/4 test runs. It may be best to find a way to reproduce this
outside of MAAS.
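
A back-of-the-envelope sketch of what the roughly 1-in-4 failure rate means for verification. It assumes test runs are independent and that 1/4 is a fair estimate of the failure rate without a fix; both are assumptions rather than measurements, and the function name is made up for this example. It computes how many consecutive clean runs are needed before "we just got lucky" becomes an unlikely explanation.

```python
import math


def clean_runs_needed(failure_rate, confidence):
    """Smallest n such that n clean runs in a row would happen by chance
    with probability at most (1 - confidence), if the bug were still there.

    Solves (1 - failure_rate) ** n <= 1 - confidence for n.
    """
    if not 0 < failure_rate < 1 or not 0 < confidence < 1:
        raise ValueError("failure_rate and confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


runs_95 = clean_runs_needed(0.25, 0.95)  # 11 clean runs
runs_99 = clean_runs_needed(0.25, 0.99)  # 17 clean runs
```

So a handful of passing runs says very little, while a dozen or more consecutive passes with a test kernel would be reasonably convincing under these assumptions.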

On Wed, Jul 3, 2019 at 11:16 AM Andrea Righi wrote:

> From a kernel perspective this big slowness on shutting down a bcache
> volume might be caused by a locking / race condition issue. If I read
> correctly this problem has been reproduced in bionic (and in xenial we
> even got a kernel oops - it looks like caused by a NULL pointer
> dereference). I would try to address these issues separately.
>
> About bionic it would be nice to test this commit (also mentioned by
> @elmo in comment #28):
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60
>
> Moreover, even if we didn't get an explicit NULL pointer dereference
> with bionic, I think it would be interesting to test also the following
> fixes:
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a4b732a248d12cbdb46999daf0bf288c011335eb
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f0ffa67349c56ea54c03ccfd1e073c990e7411e
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9951379b0ca88c95876ad9778b9099e19a95d566
>
> I've already backported all of them and applied to the latest bionic
> kernel. A test kernel is available here:
>
> https://kernel.ubuntu.com/~arighi/LP-1796292/
>
> If it doesn't cost too much it would be great to do a test with it. In
> the meantime I'll try to reproduce the problem locally. Thanks in
> advance!
>
> --
> You received this bug notification because you are a member of Canonical
> Field High, which is subscribed to the bug report.
> https://bugs.launchpad.net/bugs/1796292
>
> Title:
>   Tight timeout for bcache removal causes spurious failures
>
> Status in curtin:
>   Fix Released
> Status in linux package in Ubuntu:
>   Confirmed
> Status in linux source package in Bionic:
>   New
> Status in linux source package in Cosmic:
>   New
> Status in linux source package in Disco:
>   New
> Status in linux source package in Eoan:
>   Confirmed
>
> Bug description:
>   I've had a number of deployment faults where curtin would report
>   Timeout exceeded for removal of /sys/fs/bcache/xxx when doing a mass-
>   deployment of 30+ nodes. Upon retrying the node would usually deploy
>   fine. Experimentally I've set the timeout ridiculously high, and it
>   seems I'm getting no faults with this. I'm wondering if the timeout
>   for removal is set too tight, or might need to be made configurable.
>
>   --- curtin/util.py~ 2018-05-18 18:40:48.0 +
>   +++ curtin/util.py  2018-10-05 09:40:06.807390367 +
>   @@ -263,7 +263,7 @@
>        return _subp(*args, **kwargs)
>
>
>   -def wait_for_removal(path, retries=[1, 3, 5, 7]):
>   +def wait_for_removal(path, retries=[1, 3, 5, 7, 1200, 1200]):
>        if not path:
>            raise ValueError('wait_for_removal: missing path parameter')
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/curtin/+bug/1796292/+subscriptions
>

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1796292

Title:
  Tight timeout for bcache removal causes spurious failures

Status in curtin:
  Fix Released
Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  New
Status in linux source package in Cosmic:
  New
Status in linux source package in Disco:
  New
Status in linux source package in Eoan:
  Confirmed

Bug description:
  I've had a number of deployment faults where curtin would report
  Timeout exceeded for removal of /sys/fs/bcache/xxx when doing a mass-
  deployment of 30+ nodes. Upon retrying the node would usually deploy
  fine. Experimentally I've set the timeout ridiculously high, and it
  seems I'm getting no faults with this. I'm wondering if the timeout
  for removal is set too tight, or might need to be made configurable.

  --- curtin/util.py~ 2018-05-18 18:40:48.0 +
  +++ curtin/util.py  2018-10-05 09:40:06.807390367 +
  @@ -263,7 +263,7 @@
       return _subp(*args, **kwargs)


  -def wait_for_removal(path, retries=[1, 3, 5, 7]):
  +def wait_for_removal(path, retries=[1, 3, 5, 7, 1200, 1200]):
       if not path:
           raise ValueError('wait_for_removal: missing path parameter')
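
  To make the effect of that one-line change concrete, here is a sketch
  of how a retry list like this typically behaves. Only the function
  signature, the ValueError, and the "Timeout exceeded for removal of"
  message come from this report; the polling loop, the injectable sleep
  and exists parameters, and the total_wait helper are assumptions made
  for illustration and may differ from curtin's actual implementation.

```python
import os
import time


def wait_for_removal(path, retries=(1, 3, 5, 7), sleep=time.sleep,
                     exists=os.path.exists):
    """Poll until path disappears, sleeping retries[i] seconds between checks.

    A tuple default is used here to avoid a mutable default argument.
    """
    if not path:
        raise ValueError('wait_for_removal: missing path parameter')
    for wait in retries:
        if not exists(path):
            return
        sleep(wait)
    # One last check after the final sleep.
    if not exists(path):
        return
    raise OSError('Timeout exceeded for removal of %s' % path)


def total_wait(retries):
    """Worst-case number of seconds spent sleeping before giving up."""
    return sum(retries)


default_budget = total_wait([1, 3, 5, 7])              # 16 seconds
patched_budget = total_wait([1, 3, 5, 7, 1200, 1200])  # 2416 seconds
```

  Under these assumptions the default list gives up after about 16
  seconds, whereas the experimental list allows roughly 40 minutes, which
  would explain why the faults stopped appearing with the patched values.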

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/1796292/+subscriptions


[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-05-14 Thread Jason Hobbs

** Tags added: cdo-qa foundations-engine

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1796292

Title:
  Tight timeout for bcache removal causes spurious failures

Status in curtin:
  New
Status in linux package in Ubuntu:
  Confirmed


[Kernel-packages] [Bug 1796292] Re: Tight timeout for bcache removal causes spurious failures

2019-05-06 Thread Jason Hobbs

This occurs on a target machine during MAAS install. Apport is not
collected in this case.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1796292

Title:
  Tight timeout for bcache removal causes spurious failures

Status in curtin:
  New
Status in linux package in Ubuntu:
  Confirmed


[Kernel-packages] [Bug 1797581] Re: Composing a VM in MAAS with exactly 2048 MB RAM causes the VM to kernel panic

2019-03-21 Thread Jason Hobbs

@Christian

- release: bionic
- seabios: 1.10.2-1ubuntu1
- qemu: 1:2.11+dfsg-1ubuntu7.10
- libvirt: 4.0.0-1ubuntu8.8
- ovmf - this is a uefi thing right? we're not using it.

- kernel 2019-03-18T12:17:11+00:00 elastic-2 kernel: [0.00]
Linux version 4.15.0-46-generic (buildd@lgw01-amd64-038) (gcc version
7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC
2019 (Ubuntu 4.15.0-46.49-generic 4.15.18)

I don't have copies of the binaries from this run - it was from daily
maas images:

2019-03-18T12:03:37.296671+00:00 leafeon maas.import-images: [info]
Region downloading image descriptions from
'http://images.maas.io/ephemeral-v3/daily/'.

I don't see anything in the logs that indicates an ID number for the
kernel, initrd, or image coming from there.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1797581

Title:
  Composing a VM in MAAS with exactly 2048 MB RAM causes the VM to
  kernel panic

Status in MAAS:
  Incomplete
Status in linux package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Using latest MAAS master, I'm unable to compose a VM over the UI
  successfully when composed with 2048 MB of RAM. By that I mean that
  the VM is created, but it fails with a kernel panic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1797581/+subscriptions

[Kernel-packages] [Bug 1797581] Re: Composing a VM in MAAS with exactly 2048 MB RAM causes the VM to kernel panic

2019-03-20 Thread Jason Hobbs

Bumped to field-high as we ran into this again in testing.

We have a workaround, but it's to not use 2G VMs, which is really silly
and hard to remember when we go and add new deployments, especially
because the failure mode is not obvious at all.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1797581

Title:
  Composing a VM in MAAS with exactly 2048 MB RAM causes the VM to
  kernel panic

Status in MAAS:
  Incomplete
Status in linux package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Using latest MAAS master, I'm unable to compose a VM over the UI
  successfully when composed with 2048 MB of RAM. By that I mean that
  the VM is created, but it fails with a kernel panic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1797581/+subscriptions

Re: [Kernel-packages] [Bug 1820287] Re: kernel panic during pxe boot on DL360 gen9

2019-03-16 Thread Jason Hobbs

This happens only sporadically. If it happens, is there some keyboard
sequence I can use to dump more information, or is the system totally
frozen at this point?

Jason

On Sat, Mar 16, 2019 at 11:35 AM Kai-Heng Feng 
wrote:

> Would it be possible to get earlier trace?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1820287

Title:
  kernel panic during pxe boot on DL360 gen9

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  A machine in our test lab kernel panic'd during PXE boot from MAAS.

  It was running 4.15.0-46-generic #49-Ubuntu

  I've attached a screenshot of the call trace.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1820287/+subscriptions

[Kernel-packages] [Bug 1820287] Re: kernel panic during pxe boot on DL360 gen9

2019-03-15 Thread Jason Hobbs

I can't get logs from the system because it's kernel panic'd.

** Changed in: linux (Ubuntu)
   Status: Incomplete => New

** Changed in: linux (Ubuntu)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1820287

Title:
  kernel panic during pxe boot on DL360 gen9

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  A machine in our test lab kernel panic'd during PXE boot from MAAS.

  It was running 4.15.0-46-generic #49-Ubuntu

  I've attached a screenshot of the call trace.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1820287/+subscriptions

[Kernel-packages] [Bug 1820287] [NEW] kernel panic during pxe boot on DL360 gen9

2019-03-15 Thread Jason Hobbs

Public bug reported:

A machine in our test lab kernel panic'd during PXE boot from MAAS.

It was running 4.15.0-46-generic #49-Ubuntu

I've attached a screenshot of the call trace.

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: cdo-qa foundations-engine

** Attachment added: "kernel panic beartic"
   https://bugs.launchpad.net/bugs/1820287/+attachment/5246430/+files/kernel%20panic%20beartic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1820287

Title:
  kernel panic during pxe boot on DL360 gen9

Status in linux package in Ubuntu:
  New

Bug description:
  A machine in our test lab kernel panic'd during PXE boot from MAAS.

  It was running 4.15.0-46-generic #49-Ubuntu

  I've attached a screenshot of the call trace.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1820287/+subscriptions

[Kernel-packages] [Bug 1772490] Re: 'Deploying' timed out after 40 minutes / Failedbcache: register_bcache() error

2018-05-23 Thread Jason Hobbs

*** This bug is a duplicate of bug 1768893 ***
https://bugs.launchpad.net/bugs/1768893

** This bug has been marked a duplicate of bug 1768893
   installation on several nodes failed with errors relating to dmsetup remove 
of ceph devices.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1772490

Title:
  'Deploying' timed out after 40 minutes / Failedbcache:
  register_bcache() error

Status in curtin:
  Invalid
Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  We had a few runs over the weekend that failed to deploy with MAAS 2.3.3.

  May 21 11:33:50 swoobat maas.node: [info] geodude: Status transition from DEPLOYING to FAILED_DEPLOYMENT
  May 21 11:33:50 swoobat maas.node: [error] geodude: Marking node failed: Node operation 'Deploying' timed out after 40 minutes.

  https://solutions.qa.canonical.com/#/qa/testRun/67dae845-b22e-4de1-9b30-0ecb28eb3c35

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/1772490/+subscriptions

[Kernel-packages] [Bug 1759445] Re: kernel panic when trying to reboot in bionic

2018-04-03 Thread Jason Hobbs

After updating firmware on the servers, we can't reproduce it at all
anymore.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1759445

Title:
  kernel panic when trying to reboot in bionic

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  cpe_foundation test deployment of Bionic failed.
  After some investigation, it looks like the nodes deployed and installed
  bionic, but never came back from a reboot.

  Accessing the ILO console of a node in question (all nodes failed), it
  revealed a kernel panic (attached)

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1759445/+subscriptions

[Kernel-packages] [Bug 1759445] Re: kernel panic when trying to reboot in bionic

2018-04-02 Thread Jason Hobbs

So far we've only been able to reproduce this by doing bionic deploys.

One thing that stands out in the rsyslog for bionic deploys is this
failure:

http://paste.ubuntu.com/p/y8xXc7PYjp/

Apr  2 17:48:35 leafeon blkdeactivate[1782]: /sbin/blkdeactivate: line
345: /bin/sort: No such file or directory

Could it be related?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1759445

Title:
  kernel panic when trying to reboot in bionic

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  cpe_foundation test deployment of Bionic failed.
  After some investigation, it looks like the nodes deployed and installed
  bionic, but never came back from a reboot.

  Accessing the ILO console of a node in question (all nodes failed), it
  revealed a kernel panic (attached)

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1759445/+subscriptions

[Kernel-packages] [Bug 1759445] Re: kernel panic when trying to reboot in bionic

2018-03-30 Thread Jason Hobbs

We reproduced it again... looking to try the testing now.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1759445

Title:
  kernel panic when trying to reboot in bionic

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  cpe_foundation test deployment of Bionic failed.
  After some investigation, it looks like the nodes deployed and installed
  bionic, but never came back from a reboot.

  Accessing the ILO console of a node in question (all nodes failed), it
  revealed a kernel panic (attached)

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1759445/+subscriptions

[Kernel-packages] [Bug 1759445] Re: kernel panic when trying to reboot in bionic

2018-03-28 Thread Jason Hobbs

We can no longer reproduce this.

** Changed in: linux (Ubuntu Bionic)
   Status: Triaged => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1759445

Title:
  kernel panic when trying to reboot in bionic

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  cpe_foundation test deployment of Bionic failed.
  After some investigation, it looks like the nodes deployed and installed
  bionic, but never came back from a reboot.

  Accessing the ILO console of a node in question (all nodes failed), it
  revealed a kernel panic (attached)

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1759445/+subscriptions

[Kernel-packages] [Bug 1759445] Re: kernel panic when trying to reboot in bionic

2018-03-28 Thread Jason Hobbs

** Tags added: foundations-engine

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1759445

Title:
  kernel panic when trying to reboot in bionic

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged

Bug description:
  cpe_foundation test deployment of Bionic failed.
  After some investigation, it looks like the nodes deployed and installed
  bionic, but never came back from a reboot.

  Accessing the ILO console of a node in question (all nodes failed), it
  revealed a kernel panic (attached)

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1759445/+subscriptions

[Kernel-packages] [Bug 1759445] Re: Bionic due to kernel panic

2018-03-28 Thread Jason Hobbs

This bug is a kernel panic when rebooting at the end of a MAAS
deployment of bionic; there is no way to run apport-collect.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

** Summary changed:

- Bionic due to kernel panic
+ kernel panic when trying to reboot in bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1759445

Title:
  kernel panic when trying to reboot in bionic

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  cpe_foundation test deployment of Bionic failed.
  After some investigation, it looks like the nodes deployed and installed
  bionic, but never came back from a reboot.

  Accessing the ILO console of a node in question (all nodes failed), it
  revealed a kernel panic (attached)

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1759445/+subscriptions

[Kernel-packages] [Bug 1742505] Re: gre_sys set to default 1472 when using path_mtu > 1500 with ovs 2.8.x

2018-01-25 Thread Jason Hobbs

@james-page When will the 2.8.1 release be?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1742505

Title:
  gre_sys set to default 1472 when using path_mtu > 1500 with ovs 2.8.x

Status in Ubuntu Cloud Archive:
  In Progress
Status in Ubuntu Cloud Archive pike series:
  In Progress
Status in Ubuntu Cloud Archive queens series:
  In Progress
Status in neutron:
  Invalid
Status in linux package in Ubuntu:
  Confirmed
Status in openvswitch package in Ubuntu:
  In Progress
Status in linux source package in Artful:
  Confirmed
Status in openvswitch source package in Artful:
  In Progress
Status in linux source package in Bionic:
  Confirmed
Status in openvswitch source package in Bionic:
  In Progress

Bug description:
  [Impact]
  OpenStack clouds using GRE overlay tunnels with MTUs > 1500 will observe
  packet fragmentation/networking issues for traffic in overlay networks.

  [Test Case]
  Deploy OpenStack Pike (xenial + pike UCA or artful)
  Create tenant networks using GRE segmentation
  Boot instances
  Instance networking will be broken/slow

  gre_sys devices will be set to mtu=1472 on hypervisor hosts.

  [Regression Potential]
  Minimal; the fix to OVS works around an issue for GRE tunnel port setup via
  rtnetlink by performing a second request once the gre device is set up to set
  the MTU to a high value (65000).

  
  [Original Bug Report]
  Setup:
  Pike neutron 11.0.2-0ubuntu1.1~cloud0
  OVS 2.8.0
  Jumbo frames settings per:
  https://docs.openstack.org/mitaka/networking-guide/config-mtu.html
  global_physnet_mtu = 9000
  path_mtu = 9000

  Symptoms:
  gre_sys MTU is 1472
  Instances with MTUs > 1500 fail to communicate across GRE

  Temporary Workaround:
  ifconfig gre_sys mtu 9000
  Note: When ovs rebuilds tunnels, such as on a restart, gre_sys MTU is set
  back to default 1472.

  Note: downgrading from OVS 2.8.0 to 2.6.1 resolves the issue.

  Previous behavior:
  With Ocata or Pike and OVS 2.6.x
  gre_sys MTU defaults to 65490
  It remains at 65490 through restarts.

  This may be related to some combination of the following changes in OVS which
  seem to imply MTUs must be set in the ovs database for tunnel interfaces and
  patches:
  https://github.com/openvswitch/ovs/commit/8c319e8b73032e06c7dd1832b3b31f8a1189dcd1
  https://github.com/openvswitch/ovs/commit/3a414a0a4f1901ba015ec80b917b9fb206f3c74f
  https://github.com/openvswitch/ovs/blob/6355db7f447c8e83efbd4971cca9265f5e0c8531/datapath/vport-internal_dev.c#L186
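
One way to spot the symptom on a hypervisor is to read the interface MTU from
sysfs and compare it with the configured path_mtu. The sketch below is only
an illustration: the helper names are hypothetical, and it assumes the
standard Linux /sys/class/net/<interface>/mtu attribute is available on the
host.

```python
import os


def read_mtu(ifname, sysfs_root='/sys/class/net'):
    """Return the MTU of `ifname` as reported by sysfs."""
    with open(os.path.join(sysfs_root, ifname, 'mtu')) as f:
        return int(f.read().strip())


def mtu_below_path_mtu(ifname, path_mtu, sysfs_root='/sys/class/net'):
    """True when the device MTU is smaller than the configured path MTU."""
    return read_mtu(ifname, sysfs_root) < path_mtu
```

On an affected host, mtu_below_path_mtu('gre_sys', 9000) would be expected to
return True, since gre_sys reports 1472. Running the check again after an OVS
restart would show whether a manual workaround has been reverted.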

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1742505/+subscriptions

[Kernel-packages] [Bug 1737640] Re: /usr/sbin/fanctl: arithmetic expression: expecting primary | unconfigured interfaces cause ifup failures

2017-12-13 Thread Jason Hobbs

Testing on arm64, the workaround of adding xenial-proposed via MAAS
doesn't work - the newer ubuntu-fan package isn't being installed:

http://paste.ubuntu.com/26178859/

I don't know how that can be, since the repo is being added (or should
be added) before juju installs ubuntu-fan.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to ubuntu-fan in Ubuntu.
https://bugs.launchpad.net/bugs/1737640

Title:
  /usr/sbin/fanctl: arithmetic expression: expecting primary |
  unconfigured interfaces cause ifup failures

Status in juju:
  Triaged
Status in ubuntu-fan package in Ubuntu:
  Confirmed
Status in ubuntu-fan source package in Xenial:
  Fix Committed

Bug description:
  I'm seeing this error as the status of multiple containers in my
  deploy:

  http://paste.ubuntu.com/26166720/

  I can't connect to the parent machines anymore either - it seems
  networking is totally hosed on the machines.

  This is with juju 2.3.1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/juju/+bug/1737640/+subscriptions

[Kernel-packages] [Bug 1737640] Re: /usr/sbin/fanctl: arithmetic expression: expecting primary | unconfigured interfaces cause ifup failures

2017-12-13 Thread Jason Hobbs

I just tested this also and can verify it fixed it in the
environment/test where it was originally reported as broken.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to ubuntu-fan in Ubuntu.
https://bugs.launchpad.net/bugs/1737640

Title:
  /usr/sbin/fanctl: arithmetic expression: expecting primary |
  unconfigured interfaces cause ifup failures

Status in juju:
  Triaged
Status in ubuntu-fan package in Ubuntu:
  Confirmed
Status in ubuntu-fan source package in Xenial:
  Fix Committed

Bug description:
  I'm seeing this error as the status of multiple containers in my
  deploy:

  http://paste.ubuntu.com/26166720/

  I can't connect to the parent machines anymore either - it seems
  networking is totally hosed on the machines.

  This is with juju 2.3.1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/juju/+bug/1737640/+subscriptions

Re: [Kernel-packages] [Bug 1641593] Re: unable to enable iommu on HPE Proliant Gen9 server

2017-01-30 Thread Jason Hobbs

Doing some more testing, it looks like the systems without the firmware
update are not stable.  I can sometimes, but not always, get them to boot
using either 4.4.0-59-generic #80  or  4.4.0-62-generic #83, but once
they're up, they don't last long.  The longest I've seen is about 40
minutes before getting "sd 0:0:2:0: rejecting I/O to offline device" errors
and /dev/sda going offline.  I can get it to go offline quicker - almost
immediately - by doing "cat /dev/urandom > /dev/sdb".
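
To compare how long each system lasts before failing, the kernel log can be
scanned for the first occurrence of this error. This is a rough sketch only:
it assumes log lines carry the usual "[seconds.micros]" uptime prefix, and
the function name is made up for illustration.

```python
import re

# Matches e.g. "[ 2400.123456] sd 0:0:2:0: rejecting I/O to offline device"
OFFLINE_RE = re.compile(
    r'\[\s*(?P<uptime>\d+\.\d+)\]\s+sd (?P<dev>\d+:\d+:\d+:\d+): '
    r'rejecting I/O to offline device')


def first_offline_event(lines):
    """Return (uptime_seconds, scsi_address) for the first match, else None."""
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            return float(match.group('uptime')), match.group('dev')
    return None
```

Feeding it the lines of a saved dmesg or syslog file from each machine gives
a comparable time-to-failure figure per kernel and firmware combination.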

The two systems with the firmware updates both reliably boot up and stay up
using either 4.4.0-59-generic #80 or 4.4.0-62-generic #83, and haven't gone
offline yet from the "cat /dev/urandom > /dev/sdb" test. I will leave them
running over night.

On Mon, Jan 30, 2017 at 6:51 PM, Jason Hobbs <jason.ho...@canonical.com>
wrote:

> So, I appear to have spoken too soon on exactly what fixes this.
>
> We have two systems being tested with 4.4.0-62-generic #83 - one with the
> firmware update and one without.
>
> The one with the firmware updates has been up for over 6 hours now without
> any issues.
>
> The one without firmware updates has been up for 40 minutes and is getting
> I/O errors now.
>
> I'm also seeing a system with 4.4.0-59-generic #80 and no firmware updates
> boot up with iommu enabled, I will see how long it stays up..
>
> I'll also test with 4.4.0-59-generic #80 and the firmware updates.
>
> On Mon, Jan 30, 2017 at 6:13 PM, Jason Hobbs <jason.ho...@canonical.com>
> wrote:
>
>> We found testing with the latest Xenial kernel (4.4.0.62.65) from
>> https://launchpad.net/~canonical-kernel-
>> team/+archive/ubuntu/ppa/+build/11278866 fixes this issue - no firmware
>> updates required.  We did also test with just the latest firmware
>> updates, and that did not fix the issue. Latest firmware + 4.4.0.62.65
>> also works.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1641593

Title:
  unable to enable iommu on HPE Proliant Gen9 server

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  I'm using MAAS to enable the following kernel flags on install/boot:

    iommu=pt intel_iommu=on

  in order to be able to pass through SR-IOV VF functions to KVM guests;
  however when these options are enabled, the servers fail to install
  (see attached screenshot).

  The install eventually fails - it looks like the writes back to one of
  the disks start to fail for some reason.

  Servers are targeted with Xenial and the release 4.4 kernel (no HWE).

  Here's the LSHW output from the system:
  http://pastebin.ubuntu.com/23875929/
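
When comparing machines, it can help to confirm that the flags actually
reached the running kernel. The following is a minimal sketch, not part of
MAAS: the helper names are hypothetical, it assumes /proc/cmdline is readable,
and the simple whitespace split does not handle quoted parameter values.

```python
def parse_cmdline(text):
    """Map each kernel parameter to its value (None for bare flags)."""
    params = {}
    for token in text.split():
        key, sep, value = token.partition('=')
        params[key] = value if sep else None
    return params


def iommu_flags_present(text):
    """True if both iommu=pt and intel_iommu=on appear in the command line."""
    params = parse_cmdline(text)
    return params.get('iommu') == 'pt' and params.get('intel_iommu') == 'on'


def running_kernel_has_iommu_flags(path='/proc/cmdline'):
    """Check the command line of the currently running kernel."""
    with open(path) as f:
        return iommu_flags_present(f.read())
```

Calling running_kernel_has_iommu_flags() on a deployed node returns True only
when both options are set exactly as listed in the bug description.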

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1641593/+subscriptions

Re: [Kernel-packages] [Bug 1641593] Re: unable to enable iommu on HPE Proliant Gen9 server

2017-01-30 Thread Jason Hobbs

So, I appear to have spoken too soon on exactly what fixes this.

We have two systems being tested with 4.4.0-62-generic #83 - one with the
firmware update and one without.

The one with the firmware updates has been up for over 6 hours now without
any issues.

The one without firmware updates has been up for 40 minutes and is getting
I/O errors now.

I'm also seeing a system with 4.4.0-59-generic #80 and no firmware updates
boot up with iommu enabled; I will see how long it stays up.

I'll also test with 4.4.0-59-generic #80 and the firmware updates.

On Mon, Jan 30, 2017 at 6:13 PM, Jason Hobbs <jason.ho...@canonical.com>
wrote:

> We found testing with the latest Xenial kernel (4.4.0.62.65) from
> https://launchpad.net/~canonical-kernel-
> team/+archive/ubuntu/ppa/+build/11278866 fixes this issue - no firmware
> updates required.  We did also test with just the latest firmware
> updates, and that did not fix the issue. Latest firmware + 4.4.0.62.65
> also works.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1641593

Title:
  unable to enable iommu on HPE Proliant Gen9 server

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  I'm using MAAS to enable the following kernel flags on install/boot:

    iommu=pt intel_iommu=on

  in order to be able to pass through SR-IOV VF functions to KVM guests;
  however when these options are enabled, the servers fail to install
  (see attached screenshot).

  The install eventually fails - it looks like the writes back to one of
  the disks start to fail for some reason.

  Servers are targeted with Xenial and the release 4.4 kernel (no HWE).

  Here's the LSHW output from the system:
  http://pastebin.ubuntu.com/23875929/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1641593/+subscriptions

[Kernel-packages] [Bug 1641593] Re: unable to enable iommu on HPE Proliant Gen9 server

2017-01-30 Thread Jason Hobbs

We found testing with the latest Xenial kernel (4.4.0.62.65) from
https://launchpad.net/~canonical-kernel-
team/+archive/ubuntu/ppa/+build/11278866 fixes this issue - no firmware
updates required.  We did also test with just the latest firmware
updates, and that did not fix the issue. Latest firmware + 4.4.0.62.65
also works.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1641593

Title:
  unable to enable iommu on HPE Proliant Gen9 server

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  I'm using MAAS to enable the following kernel flags on install/boot:

    iommu=pt intel_iommu=on

  in order to be able to pass through SR-IOV VF functions to KVM guests;
  however when these options are enabled, the servers fail to install
  (see attached screenshot).

  The install eventually fails - it looks like the writes back to one of
  the disks start to fail for some reason.

  Servers are targeted with Xenial and the release 4.4 kernel (no HWE).

  Here's the LSHW output from the system:
  http://pastebin.ubuntu.com/23875929/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1641593/+subscriptions

[Kernel-packages] [Bug 1641593] Re: unable to enable iommu on HPE Proliant Gen9 server

2017-01-27 Thread Jason Hobbs

** Description changed:

  I'm using MAAS to enable the following kernel flags on install/boot:
  
    iommu=pt intel_iommu=on
  
  in order to be able to pass through SR-IOV VF functions to KVM guests;
  however when these options are enabled, the servers fail to install (see
  attached screenshot).
  
  The install eventually fails - it looks like the writes back to one of
  the disks start to fail for some reason.
  
  Servers are targeted with Xenial and the release 4.4 kernel (no HWE).
+ 
+ Here's the LSHW output from the system:
+ http://pastebin.ubuntu.com/23875929/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1641593

Title:
  unable to enable iommu on HPE Proliant Gen9 server

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  I'm using MAAS to enable the following kernel flags on install/boot:

    iommu=pt intel_iommu=on

  in order to be able to pass through SR-IOV VF functions to KVM guests;
  however when these options are enabled, the servers fail to install
  (see attached screenshot).

  The install eventually fails - it looks like the writes back to one of
  the disks start to fail for some reason.

  Servers are targeted with Xenial and the release 4.4 kernel (no HWE).

  Here's the LSHW output from the system:
  http://pastebin.ubuntu.com/23875929/
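
When debugging a report like this, it can help to confirm that the flags really reached the booted kernel. The following is a minimal Python sketch, not part of MAAS; the function name `missing_kernel_flags` is made up for illustration, and it assumes the command line is read from `/proc/cmdline` on the affected machine.

```python
def missing_kernel_flags(cmdline, wanted=("iommu=pt", "intel_iommu=on")):
    """Return the wanted flags that do not appear in a kernel command line.

    cmdline is the text of the kernel command line, for example the
    contents of /proc/cmdline on a running Linux system.
    """
    present = set(cmdline.split())
    return [flag for flag in wanted if flag not in present]


# Example use on the target machine (not executed here):
#   with open("/proc/cmdline") as f:
#       print(missing_kernel_flags(f.read()))
```

An empty list means both flags were passed to the kernel, so the install failure is not simply a case of the options being dropped.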

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1641593/+subscriptions

[Kernel-packages] [Bug 1618572] Re: apt-key add fails in overlayfs

2016-09-08 Thread Jason Hobbs

** Tags added: oil

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1618572

Title:
  apt-key add fails in overlayfs

Status in cloud-init:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  Fix Committed

Bug description:
  Sending a custom APT config to cloud-init fails to:
  1. add keys
  2. configure sources
  3. configure additional repository.

  The same config is being sent to curtin, and curtin doesn't seem to
  fail (curtin install log http://paste.ubuntu.com/23112826/ just in
  case).

  config sent by maas = http://pastebin.ubuntu.com/23112834/
  cloud-init.log = http://paste.ubuntu.com/23112820/
  cloud-init-output.log = http://paste.ubuntu.com/23112822/
  sources.list = http://paste.ubuntu.com/23112824/
  ubuntu@node03:/var/log$ ls -l /etc/apt/sources.list.d/
  total 0

  
  ubuntu@node03:/var/log$ sudo apt-get update
  Hit:2 http://us.archive.ubuntu.com/ubuntu yakkety-updates InRelease
  Get:3 http://us.archive.ubuntu.com/ubuntu yakkety-backports InRelease [92.2 
kB]
  Err:2 http://us.archive.ubuntu.com/ubuntu yakkety-updates InRelease
The following signatures couldn't be verified because the public key is not 
available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  Ign:3 http://us.archive.ubuntu.com/ubuntu yakkety-backports InRelease
  Hit:4 http://us.archive.ubuntu.com/ubuntu yakkety-security InRelease
  Get:1 http://us.archive.ubuntu.com/ubuntu yakkety InRelease [247 kB]
  Err:4 http://us.archive.ubuntu.com/ubuntu yakkety-security InRelease
The following signatures couldn't be verified because the public key is not 
available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  Err:1 http://us.archive.ubuntu.com/ubuntu yakkety InRelease
The following signatures couldn't be verified because the public key is not 
available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  Fetched 339 kB in 0s (388 kB/s)
  Reading package lists... Error!
  W: An error occurred during the signature verification. The repository is not 
updated and the previous index files will be used. GPG error: 
http://us.archive.ubuntu.com/ubuntu yakkety-updates InRelease: The following 
signatures couldn't be verified because the public key is not available: 
NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: GPG error: http://us.archive.ubuntu.com/ubuntu yakkety-backports 
InRelease: The following signatures couldn't be verified because the public key 
is not available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: The repository 'http://us.archive.ubuntu.com/ubuntu yakkety-backports 
InRelease' is not signed.
  N: Data from such a repository can't be authenticated and is therefore 
potentially dangerous to use.
  N: See apt-secure(8) manpage for repository creation and user configuration 
details.
  W: An error occurred during the signature verification. The repository is not 
updated and the previous index files will be used. GPG error: 
http://us.archive.ubuntu.com/ubuntu yakkety-security InRelease: The following 
signatures couldn't be verified because the public key is not available: 
NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: An error occurred during the signature verification. The repository is not 
updated and the previous index files will be used. GPG error: 
http://us.archive.ubuntu.com/ubuntu yakkety InRelease: The following signatures 
couldn't be verified because the public key is not available: NO_PUBKEY 
40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: Failed to fetch 
http://us.archive.ubuntu.com/ubuntu/dists/yakkety/InRelease  The following 
signatures couldn't be verified because the public key is not available: 
NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: Failed to fetch 
http://us.archive.ubuntu.com/ubuntu/dists/yakkety-updates/InRelease  The 
following signatures couldn't be verified because the public key is not 
available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: Failed to fetch 
http://us.archive.ubuntu.com/ubuntu/dists/yakkety-security/InRelease  The 
following signatures couldn't be verified because the public key is not 
available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32
  W: Some index files failed to download. They have been ignored, or old ones 
used instead.
  E: Problem renaming the file /var/cache/apt/srcpkgcache.bin.3HKvbX to 
/var/cache/apt/srcpkgcache.bin - rename (116: Stale file handle)
  E: Problem renaming the file /var/cache/apt/pkgcache.bin.d0JUHJ to 
/var/cache/apt/pkgcache.bin - rename (116: Stale file handle)
  W: You may want to run apt-get update to correct these problems
  E: The package cache file is corrupted
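
The output above repeats the same two key IDs many times. A small Python sketch like the following could pull the distinct IDs out of a captured `apt-get update` log; the function name `missing_pubkeys` is hypothetical, and the pattern assumes 16-hex-digit key IDs as they appear in this report.

```python
import re


def missing_pubkeys(apt_output):
    """Return the distinct NO_PUBKEY key IDs in first-seen order.

    The whitespace after NO_PUBKEY may be a line break, because long
    apt messages are wrapped in the mail.
    """
    seen = []
    for key_id in re.findall(r"NO_PUBKEY\s+([0-9A-Fa-f]{16})", apt_output):
        if key_id not in seen:
            seen.append(key_id)
    return seen
```

Run against the log in this report, it would reduce the wall of warnings to the two keys that apt could not find.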

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1618572/+subscriptions

[Kernel-packages] [Bug 1464442] Re: installing or upgrading libc6 in Trusty removes all content from /tmp directory

2015-06-26 Thread Jason Hobbs

Steve's suggested workaround:

 # dpkg-divert --rename --add /sbin/telinit
 # cat > /sbin/telinit
 #!/bin/sh
 exit 0
 ^D
 # apt-get install [...]
 # dpkg-divert --rename --remove /sbin/telinit

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1464442

Title:
  installing or upgrading libc6 in Trusty removes all content from /tmp
  directory

Status in linux package in Ubuntu:
  Invalid
Status in upstart package in Ubuntu:
  Triaged

Bug description:
  We are seeing an issue with installation of dkms package during a
  curtin installation which ends up with /tmp directory being wiped
  clean. This is very bad for curtin as it saves critical installation
  files in /tmp.

  It turns out that it's the act of upgrading libc6, which is triggered as a
  result of installing dependencies, that removes the content of /tmp. For
  example, installing gcc has the same effect, since it ends up upgrading
  libc6. The only way to avoid reproducing this is to have the latest
  libc6 already installed.

  This problem does not exist in precise. It can also be recreated by
  installing the .deb file for any version in trusty including 2.17.

  
  ubuntu@host:~$ ls /tmp
  tmpHHbRkP
  ubuntu@sirrush:~$ sudo apt-get install libc6
  sudo: unable to resolve host sirrush
  Reading package lists... Done
  Building dependency tree
  Reading state information... Done
  The following extra packages will be installed:
    libc-dev-bin libc6-dev
  Suggested packages:
    glibc-doc
  Recommended packages:
    manpages-dev
  The following packages will be upgraded:
    libc-dev-bin libc6 libc6-dev
  3 upgraded, 0 newly installed, 0 to remove and 148 not upgraded.
  Need to get 6,714 kB of archives.
  After this operation, 6,144 B disk space will be freed.
  Do you want to continue? [Y/n] y
  Get:1 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc6-dev amd64 
2.19-0ubuntu6.6 [1,910 kB]
  Get:2 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc-dev-bin 
amd64 2.19-0ubuntu6.6 [68.9 kB]
  Get:3 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc6 amd64 
2.19-0ubuntu6.6 [4,735 kB]
  Fetched 6,714 kB in 0s (18.5 MB/s)
  Preconfiguring packages ...
  (Reading database ... 57798 files and directories currently installed.)
  Preparing to unpack .../libc6-dev_2.19-0ubuntu6.6_amd64.deb ...
  Unpacking libc6-dev:amd64 (2.19-0ubuntu6.6) over (2.19-0ubuntu6.3) ...
  Preparing to unpack .../libc-dev-bin_2.19-0ubuntu6.6_amd64.deb ...
  Unpacking libc-dev-bin (2.19-0ubuntu6.6) over (2.19-0ubuntu6.3) ...
  Preparing to unpack .../libc6_2.19-0ubuntu6.6_amd64.deb ...
  Unpacking libc6:amd64 (2.19-0ubuntu6.6) over (2.19-0ubuntu6.3) ...
  Processing triggers for man-db (2.6.7.1-1) ...
  Setting up libc6:amd64 (2.19-0ubuntu6.6) ...
  Setting up libc-dev-bin (2.19-0ubuntu6.6) ...
  Setting up libc6-dev:amd64 (2.19-0ubuntu6.6) ...
  Processing triggers for libc-bin (2.19-0ubuntu6.3) ...
  ubuntu@host:~$ ls /tmp
  ubuntu@host:~$
  

  This is easily reproducible.
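
The before/after `ls /tmp` check in the transcript can be automated. This is a rough Python sketch under stated assumptions: the helper name `entries_removed_by` is invented here, and the command is whatever package operation is suspected (for example `["apt-get", "install", "-y", "libc6"]`, run as root).

```python
import os
import subprocess


def entries_removed_by(command, directory="/tmp"):
    """Run command and return the names that disappeared from directory.

    command is an argument list as accepted by subprocess.run. The
    command's exit status is ignored; only the directory listing before
    and after is compared.
    """
    before = set(os.listdir(directory))
    subprocess.run(command, check=False)
    after = set(os.listdir(directory))
    return sorted(before - after)
```

A non-empty result after a libc6 upgrade would match the behaviour described in this bug.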

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1464442/+subscriptions

[Kernel-packages] [Bug 1444003] Re: BUG: soft lockup - CPU#6 stuck for 22s! [systemd-udevd:166]

2015-05-06 Thread Jason Hobbs

** Changed in: linux (Ubuntu)
   Status: Incomplete => New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1444003

Title:
  BUG: soft lockup - CPU#6 stuck for 22s! [systemd-udevd:166]

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Seeing a number of failed deployments on the SM15K:

  This is from console log for a failed deployment this morning:
  [ 3310.700695] Call Trace:^M
  [ 3310.700698] [81057180] ? try_to_free_pmd_page+0x50/0x50^M
  [ 3310.700700] [8105709fff^M
  [ 3310.700707] [810590ed] change_page_attr_set_clr+0x38d/0x4a0^M
  [ 3310.700709] [a020a000] ? 0xa0209fff^M
  [ 3310.700711] [810596cf] set_memory_ro+0x2f/0x40^M
  [ 3310.700714] [8171bad4] set_section_ro_nx+0x3a/0x71^M
  [ 3310.700716] [810e25a8] load_module+0x12c8/0x1b40^M
  [ 3310.700719] [810de040] ? store_uevent+0x40/0x40^M
  [ 3310.700722] [810e2f96] SyS_finit_module+0x86/0xb0^M
  [ 3310.700725] [8173263d] system_call_fastpath+0x1a/0x1f^M
  [ 3310.700745] Code: 1d a9 29 00 3b 05 8f 90 c3 00 89 c2 0f 8d 25 fe ff ff 48 
98 49 8b 4d 00 4p - CPU#2 stuck for 23s! [systemd-udevd:159]^M
  [ 3338.633651] Modules linked in: cryptd(+) e1000(+) ahci libahci^M
  [ 3338.633653] CPU: 2 PID: 159 Comm: systemd-udevd Not tainted 
3.13.0-49-generic #81-Ubuntu^M
  [ 3338.633654] Hardware name: SeaMicro SM15000-64-CC-AA-1Ox1/AMD Server CRB, 
BIOS Estoc.3.72.19.0015 10/29/2012^M
  [ 3338.633655] task: 88022f361800 ti: 8800be95c000 task.ti: 
8800be95c000^M
  [ 3338.633657] RIP: 0010:[810dc4ba] [810dc4ba] 
smp_call_function_many+0x26a/0x2d0^M
  [ 3338.633658] RSP: 0018:8800be95db60 EFLAGS: 0202^M
  [ 3338.633659] RAX:  RBX: 88023fc93fc8 R09: 
0004^M
  [ 3338.633661] R10: 88023fc93fc8 R11: 880234259c28 R12: 
061c^M
  [ 3338.633662] R13: 8802341caf80 R14: 8802342eb180 R15: 
^M
  [ 3338.633663] FS: 7f569e8ce880() GS:88023fc8() 
knlGS:^M
  [ 3338.633664] CS: 0010 DS:  ES:  CR0: 80050033^M
  [ 3338.633665] CR2: 02736138 CR3: be95b000 CR4: 
000407e0^M
  [ 3338.633666] Stack:^M
  [ 3338.633669] 88023fc93fe8 00013f80 8800be95dbe8 
8105c6e0^M
  [ 3338.633671] 0101 0012 8105c6e0 
fff8105c6e0] ? rbt_memtype_copy_nth_element+0xa0/0xa0^M
  [ 3338.633680] [8105c6e0] ? rbt_memtype_copy_nth_element+0xa0/0xa0^M
  [ 3338.633682] [810dc67d] on_each_cpu+0x2d/0x60^M
  [ 3338.633685] [8105ccdd] flush_tlb_kernel_range+0x6d/0x70^M
  [ 3338.633687] [81187555] __purge_vmap_area_lazy+0x335/0x430^M
  [ 3338.633690] [811877b2] vm_unmap_aliases+0x162/0x180^M
  [ 3338.633693] [81058e2e] change_page_attr_set_clr+0xce/0x4a0^M
  [ 3338.633696] [81725ad1] ? __schedule+0x381/0x7d0^M
  [ 3338.633699] [81059243] set_memory_x+0x43/0x50^M
  [ 3338.633702] [ff ? store_uevent+0x40/0x40^M
  [ 3338.633711] [810e2f96] SyS_finit_module+0x86/0xb0^M
  [ 3338.633714] [8173263d] system_call_fastpath+0x1a/0x1f^M
  [ 3338.633734] Code: 1d a9 29 00 3b 05 8f 90 c3 00 89 c2 0f 8d 25 fe ff ff 48 
98 49 8b 4d 00 48 03 0c c5 20 37 d1 81 f6 41 20 01 74 cb 0f 1f 00 f3 90 f6 41 
20 01 75 f8 eb be 0f b6 4d d0 48 8b 55 c0 44 89 ef 48 8b ^M
  [ 3338.701651] BUG: soft lockup - CPU#6 stuck for 22s! [systemd-udevd:166]^M
  [ 3338.701654] Modules linked in: cryptd(+) e1000(+) ahci libahci^M
  [ 3338.701655] CPU: 6 P^M
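
To see how many CPUs were affected across a long console log, the lockup lines can be extracted mechanically. A minimal Python sketch follows; the name `soft_lockups` is made up for this example, and the pattern is based only on the message format visible in this log.

```python
import re

# Matches lines such as:
#   BUG: soft lockup - CPU#6 stuck for 22s! [systemd-udevd:166]
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def soft_lockups(console_log):
    """Return a (cpu, seconds, process_name, pid) tuple per lockup line."""
    return [
        (int(cpu), int(secs), comm, int(pid))
        for cpu, secs, comm, pid in SOFT_LOCKUP.findall(console_log)
    ]
```

Trailing `^M` markers and the bracketed timestamps do not interfere, because the pattern only looks at the message text itself.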

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1444003/+subscriptions

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-06-03 Thread Jason Hobbs

I don't believe there are plans to fix this against Saucy.

** Changed in: linux (Ubuntu Saucy)
   Status: Confirmed => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Fix Released
Status in Open Compute Project:
  Fix Released
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Saucy:
  Incomplete

Bug description:
  The OCPv3 Roadrunner machine has been fully enabled and passes
  certification testing.  When testing ipmitool locally I'm able to
  set up the BMC and users, etc.

  When using MAAS, MAAS is able to set up the BMC network information (I
  see that it changes that), but it appears to fail to set a username
  and password.  If I try to use the username and password as defined in
  the MAAS GUI, it fails.  Therefore commissioning and juju
  bootstrapping the node have to be done manually (by physically pushing
  the power button).

  If I use the username/password I've set on the BMC I can see that MAAS
  fails to set the username 'maas' and the password as defined in the
  MAAS GUI.

  Since the commissioning/enlisting process is temporary and I'm not
  sure how to log in during this phase to gather data, troubleshooting
  tips are welcome.

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1210393/+subscriptions

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-04-21 Thread Jason Hobbs

** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Released

** Changed in: opencompute
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Fix Released
Status in Open Compute Project:
  Fix Released
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1292927] [NEW] temp bug - ignore for now!

2014-03-15 Thread Jason Hobbs

Private bug reported:

please ignore for now - filling out details in a minute

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New

** Information type changed from Public to Private

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1292927

Title:
  temp bug - ignore for now!

Status in “linux” package in Ubuntu:
  New

Bug description:
  please ignore for now - filling out details in a minute

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1292927/+subscriptions

[Kernel-packages] [Bug 1282329] Re: juju requires cpu-checker which is unavailable on arm64/ppc64el

2014-02-27 Thread Jason Hobbs

** Tags added: server-hwe

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to cpu-checker in Ubuntu.
https://bugs.launchpad.net/bugs/1282329

Title:
  juju requires cpu-checker which is unavailable on arm64/ppc64el

Status in juju-core:
  Invalid
Status in “cpu-checker” package in Ubuntu:
  Fix Released

Bug description:
  I'm testing out deploying charms to an arm64 target using the manual
  provider. After hacking juju to recognize the arch, and hacking in the
  necessary tools, I reach the following failure:

  dannf@laptop:~$ juju add-machine ssh:arm64.dannf -v
  verbose is deprecated with the current meaning, use show-log
  2014-02-19 23:55:44 INFO juju api.go:231 connecting to API addresses: 
[bootstrap.dannf:17070]
  2014-02-19 23:55:44 INFO juju apiclient.go:118 state/api: dialing 
wss://bootstrap.dannf:17070/
  2014-02-19 23:55:44 INFO juju apiclient.go:128 state/api: connection 
established
  2014-02-19 23:55:44 INFO juju.environs.manual init.go:156 initialising 
arm64.dannf, user 
  2014-02-19 23:55:44 INFO juju.environs.manual init.go:167 ubuntu user is 
already initialised
  2014-02-19 23:55:44 INFO juju.environs.manual provisioner.go:260 addresses 
for arm64.dannf: [192.168.1.117 public:arm64.dannf]
  2014-02-19 23:55:44 INFO juju.environs.manual init.go:29 Checking if 
arm64.dannf is already provisioned
  2014-02-19 23:55:44 INFO juju.environs.manual init.go:46 arm64.dannf is not 
provisioned
  2014-02-19 23:55:44 INFO juju.environs.manual init.go:55 Detecting series and 
characteristics on arm64.dannf
  2014-02-19 23:55:45 INFO juju.environs.manual init.go:118 series: trusty, 
characteristics: arch=arm64 cpu-cores=1 mem=16062M
  Logging to /var/log/cloud-init-output.log on remote host
  Running apt-get update
  Installing package: git
  Installing package: cpu-checker
  2014-02-19 23:56:23 ERROR juju.environs.manual provisioner.go:78 provisioning 
failed, removing machine 2: exit status 1
  2014-02-19 23:56:23 ERROR juju.cmd supercommand.go:294 exit status 1

  The issue here is that cpu-checker is not available for arm64 (or
  ppc64el) in the archive.

To manage notifications about this bug go to:
https://bugs.launchpad.net/juju-core/+bug/1282329/+subscriptions

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-27 Thread Jason Hobbs

** Tags added: server-hwe

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Fix Committed
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-20 Thread Jason Hobbs

Hey Dustin - I reassigned to David since I'm not sure who will be
testing it.  David/Samantha/Rod - please reassign to whoever is doing
the test!

** Changed in: maas
   Status: Triaged => In Progress

** Changed in: maas
 Assignee: Jason Hobbs (jason-hobbs) => David Duffey (david-duffey)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  In Progress
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-19 Thread Jason Hobbs

Cool David - let me know how it works out. The branch is otherwise
complete/reviewed and ready to land.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Triaged
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-18 Thread Jason Hobbs

** Changed in: maas
Milestone: None => 14.04

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Triaged
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-18 Thread Jason Hobbs

** Branch linked: lp:~jason-hobbs/maas/lp-1210393

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Triaged
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-18 Thread Jason Hobbs

I've posted a branch with a fix to lp:~jason-hobbs/maas/lp-1210393

I've manually tested this, but for lack of access, not on OCPv3
Roadrunner.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Triaged
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed

[Kernel-packages] [Bug 1210393] Re: MAAS ipmi fails on OCPv3 Roadrunner

2014-02-14 Thread Jason Hobbs

I've started work on a patch to fix this. It will look for an existing
maas user or, failing that, the first disabled user with an empty
username. If it can't find either, it will bail and give up on automatic
IPMI config.
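
The selection rule described above can be sketched in Python roughly as follows. This is an illustration only, not the code from the actual branch: the function name `pick_ipmi_user_slot` and the dictionary keys `slot`, `name` and `enabled` are assumptions about how the BMC user list might be represented after parsing.

```python
def pick_ipmi_user_slot(users, wanted_name="maas"):
    """Pick the BMC user slot to configure, or None to give up.

    users is a list of dicts with the hypothetical keys "slot", "name"
    and "enabled", in the order the BMC reports them.
    """
    users = list(users)
    # Prefer a slot that already holds the wanted user name.
    for user in users:
        if user["name"] == wanted_name:
            return user["slot"]
    # Otherwise take the first disabled slot with an empty user name.
    for user in users:
        if not user["enabled"] and user["name"] == "":
            return user["slot"]
    # Neither exists: the caller should skip automatic IPMI config.
    return None
```

Returning `None` rather than raising keeps the "bail and give up" case explicit for the caller.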

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1210393

Title:
  MAAS ipmi fails on OCPv3 Roadrunner

Status in MAAS:
  Triaged
Status in Open Compute Project:
  Confirmed
Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” source package in Saucy:
  Confirmed
