[Group.of.nepali.translators] [Bug 1651602] Re: [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

2017-01-05 Thread Mike Pontillo
After more troubleshooting with cgregan, we've narrowed this down
further.

We ran the following script on the node that was having trouble:

https://gist.github.com/pontillo/0b92a7da2fba43fb5dce705be2dcf38b

Unlike all the other devices MAAS works with, the Intel NVMe device
reports a serial number that cannot be found anywhere in
/dev/disk/by-id/*. When curtin is supplied a serial number, it uses a
heuristic to find the device, as follows:

http://bazaar.launchpad.net/~curtin-dev/curtin/trunk/view/435/curtin/commands/block_meta.py#L270

http://bazaar.launchpad.net/~curtin-dev/curtin/trunk/view/435/curtin/block/__init__.py#L601
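
As a rough illustration of that heuristic (this is not curtin's actual
code; the function name, the in-memory list of link names, and the
space-to-underscore normalisation are assumptions made to keep the
sketch runnable anywhere), the lookup amounts to a wildcard match of the
serial against the names under /dev/disk/by-id:

```python
import fnmatch


def find_links_for_serial(serial, link_names):
    """Return the by-id link names that contain the given serial.

    link_names stands in for os.listdir('/dev/disk/by-id'); passing it
    in keeps the sketch runnable on any machine.
    """
    # Assumption: spaces in the serial appear as underscores in link names.
    pattern = "*%s*" % serial.replace(" ", "_")
    return sorted(n for n in link_names if fnmatch.fnmatchcase(n, pattern))


# Hypothetical link names: a disk with a conventional by-id link, plus
# the bare 'nvme-INTEL' link seen on the affected node.
links = ["ata-EXAMPLE_DISK_SN12345", "nvme-INTEL"]
print(find_links_for_serial("SN12345", links))  # one match
print(find_links_for_serial("CVMDxx", links))   # no match
```

With no link name containing the NVMe serial, a lookup of this kind has
nothing to return, which is consistent with the "no disk with serial ...
found" error in the bug description.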

So arguably, this is a bug in how the Intel NVMe device exposes its
serial number; the way it populates /dev/disk/* leaves much to be
desired.

This is *arguably* a bug in curtin (and maybe MAAS, since we knowingly
use the serial number even though `udevadm` can tell us that the serial
cannot be found anywhere in /dev/disk/by-id/*), in that we could do a
better job dealing with devices backed by not-so-robust kernel drivers.
But I think we shouldn't encourage bad behavior on the part of driver
writers, so I'm on the fence about whether or not we should fix it.

But mostly, I would argue that this is a bug in the Intel NVMe driver.
The way they expose the device to userland is non-standard and arguably
broken. When we ran `udevadm info -q all -n nvme0n1` on the device, we
got the following pseudo-output:

nvme0n1:
P: /devices/pci:00/:00:xx.0/:xx:00.0/nvme/nvme0/nvme0n1
N: nvme0n1
S: SSDxx_CVMDxx
S: disk/by-id/nvme-INTEL
E: DEVLINKS=/dev/disk/by-id/nvme-INTEL /dev/SSDxx_CVMDxx
E: DEVNAME=/dev/nvme0n1
E: DEVPATH=/devices/pci:00/:00:xx.0/:xx:00.0/nvme/nvme0/nvme0n1
E: DEVTYPE=disk
E: ID_SERIAL=INTEL SSDxx_CVMDxx
E: ID_SERIAL_SHORT=CVMDxx
E: MAJOR=259
E: MINOR=0
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=xxx

You can see from the lines that start with "S:" and from the
"DEVLINKS=" line that the way this device is exposed is very
non-standard. One would expect /dev/disk/by-id/* to contain a link whose
name includes the serial number. Instead, the device gets a bare
'nvme-INTEL' link, which is (IMHO) a critical bug, because anyone
expecting the things in /dev/disk/by-id/* to be unique will be in for a
big surprise when they add a second NVMe device to a machine.
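
To make that check concrete, here is a minimal sketch (the function name
is hypothetical, and this is not what MAAS or curtin does today) that
parses `udevadm info` property lines and reports whether the short
serial appears in any /dev/disk/by-id link:

```python
def serial_in_devlinks(udevadm_output):
    """Return True if ID_SERIAL_SHORT appears in a /dev/disk/by-id link."""
    props = {}
    for line in udevadm_output.splitlines():
        # Property lines look like 'E: KEY=VALUE'.
        if line.startswith("E: ") and "=" in line:
            key, _, value = line[3:].partition("=")
            props[key] = value
    serial = props.get("ID_SERIAL_SHORT")
    if not serial:
        return False
    # DEVLINKS is a space-separated list; only by-id links count here.
    by_id = [link for link in props.get("DEVLINKS", "").split()
             if link.startswith("/dev/disk/by-id/")]
    return any(serial in link for link in by_id)


# Property lines taken from the sanitized output above.
sample = """\
E: DEVLINKS=/dev/disk/by-id/nvme-INTEL /dev/SSDxx_CVMDxx
E: DEVNAME=/dev/nvme0n1
E: ID_SERIAL=INTEL SSDxx_CVMDxx
E: ID_SERIAL_SHORT=CVMDxx
"""
print(serial_in_devlinks(sample))  # False
```

Note that /dev/SSDxx_CVMDxx does contain the serial, but it sits
directly under /dev rather than under /dev/disk/by-id, so it does not
count.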

** Also affects: curtin
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu)
   Status: Invalid => New

** Changed in: linux (Ubuntu Xenial)
   Status: Fix Committed => New

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1651602

Title:
  Intel NVMe driver does not expose consistent links in /dev/disk/by-id

Status in curtin:
  New
Status in MAAS:
  Won't Fix
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  Incomplete

Bug description:
  MAAS Version 2.1.1+bzr5544-0ubuntu1 (16.10.1)
  Deploying Xenial Nodes

  1) Deploy MAAS 2.1.1 on Yakkety
  2) Associate Juju 2.1 beta3
  3) Juju deploy Kubernetes Core

  Nodes begin to deploy but fail

  Installation failed with exception: Unexpected error while running command.
  Command: ['curtin', 'block-meta', 'custom']
  Exit code: 3
  Reason: -
  Stdout: b"no disk with serial 'CVMD434500BN400AGN' found\n"

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/1651602/+subscriptions

___
Mailing list: https://launchpad.net/~group.of.nepali.translators
Post to : group.of.nepali.translators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~group.of.nepali.translators
More help   : https://help.launchpad.net/ListHelp


[Group.of.nepali.translators] [Bug 1651602] Re: [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

2016-12-23 Thread Dan Streetman
Chris, since your specific problem seems different from the 1-CPU NVMe
bug that the rest of this bug describes and that my patch fixes, can you
please open a new bug?

** Changed in: maas
   Status: New => Invalid

-- 

Title:
  [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Xenial:
  Confirmed



[Group.of.nepali.translators] [Bug 1651602] Re: [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

2016-12-22 Thread Scott Moser
I've just tested Yakkety, and both 4.8.0-30-generic and
4.8.0-32-generic seem to work fine, so I do not believe this affects
Yakkety.


** No longer affects: linux (Ubuntu Yakkety)

** Changed in: linux (Ubuntu)
   Status: Confirmed => Invalid

-- 

Title:
  [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

Status in MAAS:
  New
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Xenial:
  Confirmed
