[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-21 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.13.0-36.40

---
linux (4.13.0-36.40) artful; urgency=medium

  * linux: 4.13.0-36.40 -proposed tracker (LP: #1750010)

  * Rebuild without "CVE-2017-5754 ARM64 KPTI fixes" patch set

linux (4.13.0-35.39) artful; urgency=medium

  * linux: 4.13.0-35.39 -proposed tracker (LP: #1748743)

  * CVE-2017-5715 (Spectre v2 Intel)
- Revert "UBUNTU: SAUCE: turn off IBPB when full retpoline is present"
- SAUCE: turn off IBRS when full retpoline is present
- [Packaging] retpoline files must be sorted
- [Packaging] pull in retpoline files

linux (4.13.0-34.37) artful; urgency=medium

  * linux: 4.13.0-34.37 -proposed tracker (LP: #1748475)

  * libata: apply MAX_SEC_1024 to all LITEON EP1 series devices (LP: #1743053)
- libata: apply MAX_SEC_1024 to all LITEON EP1 series devices

  * KVM patches for s390x to provide facility bits 81 (ppa15) and 82 (bpb)
(LP: #1747090)
- KVM: s390: wire up bpb feature

  * artful 4.13 i386 kernels crash after memory hotplug remove (LP: #1747069)
- Revert "mm, memory_hotplug: do not associate hotadded memory to zones until online"

  * CVE-2017-5715 (Spectre v2 Intel)
- x86/feature: Enable the x86 feature to control Speculation
- x86/feature: Report presence of IBPB and IBRS control
- x86/enter: MACROS to set/clear IBRS and set IBPB
- x86/enter: Use IBRS on syscall and interrupts
- x86/idle: Disable IBRS entering idle and enable it on wakeup
- x86/idle: Disable IBRS when offlining cpu and re-enable on wakeup
- x86/mm: Set IBPB upon context switch
- x86/mm: Only set IBPB when the new thread cannot ptrace current thread
- x86/entry: Stuff RSB for entry to kernel for non-SMEP platform
- x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD to kvm
- x86/kvm: Set IBPB when switching VM
- x86/kvm: Toggle IBRS on VM entry and exit
- x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature
- x86/spec_ctrl: Add lock to serialize changes to ibrs and ibpb control
- x86/cpu/AMD: Add speculative control support for AMD
- x86/microcode: Extend post microcode reload to support IBPB feature
- KVM: SVM: Do not intercept new speculative control MSRs
- x86/svm: Set IBRS value on VM entry and exit
- x86/svm: Set IBPB when running a different VCPU
- KVM: x86: Add speculative control CPUID support for guests
- SAUCE: turn off IBPB when full retpoline is present

  * Artful 4.13 fixes for tun (LP: #1748846)
- tun: call dev_get_valid_name() before register_netdevice()
- tun: allow positive return values on dev_get_valid_name() call
- tun/tap: sanitize TUNSETSNDBUF input

  * boot failure on AMD Raven + WestonXT (LP: #1742759)
- SAUCE: drm/amdgpu: add atpx quirk handling (v2)

linux (4.13.0-33.36) artful; urgency=low

  * linux: 4.13.0-33.36 -proposed tracker (LP: #1746903)

  [ Stefan Bader ]
  * starting VMs causing retpoline4 to reboot (LP: #1747507) // CVE-2017-5715
(Spectre v2 retpoline)
- x86/retpoline: Fill RSB on context switch for affected CPUs
- x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros
- x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
- x86/retpoline: Remove the esp/rsp thunk
- x86/retpoline: Simplify vmexit_fill_RSB()

  * Missing install-time driver for QLogic QED 25/40/100Gb Ethernet NIC
(LP: #1743638)
- [d-i] Add qede to nic-modules udeb

  * hisi_sas: driver robustness fixes (LP: #1739807)
- scsi: hisi_sas: fix reset and port ID refresh issues
- scsi: hisi_sas: avoid potential v2 hw interrupt issue
- scsi: hisi_sas: fix v2 hw underflow residual value
- scsi: hisi_sas: add v2 hw DFX feature
- scsi: hisi_sas: add irq and tasklet cleanup in v2 hw
- scsi: hisi_sas: service interrupt ITCT_CLR interrupt in v2 hw
- scsi: hisi_sas: fix internal abort slot timeout bug
- scsi: hisi_sas: us start_phy in PHY_FUNC_LINK_RESET
- scsi: hisi_sas: fix NULL check in SMP abort task path
- scsi: hisi_sas: fix the risk of freeing slot twice
- scsi: hisi_sas: kill tasklet when destroying irq in v3 hw
- scsi: hisi_sas: complete all tasklets prior to host reset

  * [Artful/Zesty] ACPI APEI error handling bug fixes (LP: #1732990)
- ACPI: APEI: fix the wrong iteration of generic error status block
- ACPI / APEI: clear error status before acknowledging the error

  * [Zesty/Artful] On ARM64 PCIE physical function passthrough guest fails to
boot (LP: #1732804)
- vfio/pci: Virtualize Maximum Payload Size
- vfio/pci: Virtualize Maximum Read Request Size

  * hisi_sas: Add ATA command support for SMR disks (LP: #1739891)
- scsi: hisi_sas: support zone management commands

  * thunderx2: i2c driver PEC and ACPI clock fixes (LP: #1738073)
- ACPI / APD: Add clock frequency for ThunderX2 I2C controller
- i2c: xlp9xx: Get clock frequency with clk API
- i2c: xlp9xx: Handle 

[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-14 Thread Colin Ian King
Ran the tests against the i386 -proposed kernel and could not reproduce
the issue with the fixed kernel.  Also ran the ADT tests, which include
the memory hotplug tests, and could not reproduce the issue there either.

Tested, and verified.

** Tags removed: verification-needed-artful
** Tags added: verification-done-artful

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1747069

Title:
  artful 4.13 i386 kernels crash after memory hotplug remove

Status in linux package in Ubuntu:
  In Progress

Bug description:
  == SRU Request, Artful ==

  Hotplug removal causes i386 crashes when exercised with the kernel
  selftest mem-on-off-test script.  

  == Fix ==

  Revert commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate
  hotadded memory to zones until online")

  Note: A fix landed in 4.15; however, it requires a set of changes far
  too large to be SRU'able, so the least risky way forward is to revert
  the offending commit.

  == Testcase ==

  Running the kernel selftest script mem-on-off-test.sh, followed by a
  sync, followed by re-installing kernel packages will always trigger
  this issue. Simply running the mem-on-off-test.sh script sometimes
  won't trigger the problem.  I believe this is why we've not seen this
  happen too frequently with our ADT tests.  I can reproduce this in a
  VM with 4 CPUs and 2GB of memory.
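
  The three steps of the testcase can be written down as a dry-run
  sketch. This is only an illustration under stated assumptions:
  reproducer_commands() is a hypothetical helper, the script path assumes
  a kernel source tree checkout, and the package name
  linux-image-<release> is an assumed example of "kernel packages"; the
  report does not say which packages were re-installed.

```python
import shlex
from typing import List

# Assumed location of the selftest inside a kernel source tree.
SELFTEST_DIR = "tools/testing/selftests/memory-hotplug"


def reproducer_commands(kernel_release: str,
                        selftest_dir: str = SELFTEST_DIR) -> List[List[str]]:
    """Return the reproducer steps in order, without running anything."""
    return [
        # 1. exercise memory online/offline with the kernel selftest
        ["sudo", "sh", f"{selftest_dir}/mem-on-off-test.sh"],
        # 2. flush dirty pages to disk
        ["sync"],
        # 3. re-install a kernel package (package name is an assumption)
        ["sudo", "apt-get", "install", "--reinstall", "-y",
         f"linux-image-{kernel_release}"],
    ]


# Print the commands rather than executing them.
for cmd in reproducer_commands("4.13.0-32-generic"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

  Printing instead of executing keeps the sketch safe to run outside a
  disposable VM; the crash itself is only expected on an affected i386
  kernel.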

  == Regression Potential ==

  Reverting this commit removes some functionality; however, this does
  not regress the kernel compared to previous releases, and a working,
  reliable memory hotplug is the preferred option.  The fix touches
  memory hotplug code, so there is a risk that it breaks hotplug
  functionality that is not covered by the kernel regression testing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747069/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-14 Thread Kleber Sacilotto de Souza
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-artful' to 'verification-done-artful'. If the
problem still exists, change the tag 'verification-needed-artful' to
'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-artful



[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-09 Thread Colin Ian King
** Description changed:

- It seems all of the artful 4.13 are crashing on i386 install when memory
- hotplug removal is attempted.  This crash occurs a few seconds after the
- removal.  I have done a gross bisect back to the first 4.13.0-11 which
- is also affected.
+ == SRU Request, Artful ==
+ 
+ Hotplug removal causes i386 crashes when exercised with the kernel
+ selftest mem-on-off-test script.  
+ 
+ == Fix ==
+ 
+ Revert commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate
+ hotadded memory to zones until online")
+ 
+ Note: A fix occurs in 4.15 however this requires a large set of changes
+ that are way too large to be SRU'able and the least risky way forward is
+ to revert the offending commit.
+ 
+ == Testcase ==
+ 
+ Running the kernel selftest script mem-on-off-test.sh, followed by a
+ sync, followed by re-installing kernel packages will always trigger this
+ issue. Simply running the mem-on-off-test.sh script sometimes won't
+ trigger the problem.  I believe this is why we've not seen this happen
+ too frequently with our ADT tests.  I can reproduce this in a VM with 4
+ CPUs and 2GB of memory.
+ 
+ == Regression Potential ==
+ 
+ Reverting this commit does remove some functionality, however this does
+ not regress the kernel compared to previous releases and having a
+ working reliable memory hotplug is the preferred option.  This fix does
+ touch some memory hotplug, so there is a risk that this may break this
+ functionality that is not covered by the kernel regression testing.



[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-08 Thread Colin Ian King
Ignore my earlier comments. I bisected this again using a more reliable
reproducer and found the first bad commit to be:
f1dd2cd13c4bbbc9a7c4617b3b034fa643de98fe is the first bad commit
commit f1dd2cd13c4bbbc9a7c4617b3b034fa643de98fe
Author: Michal Hocko 
Date:   Thu Jul 6 15:38:11 2017 -0700

mm, memory_hotplug: do not associate hotadded memory to zones until online

The current memory hotplug implementation relies on having all the
struct pages associate with a zone/node during the physical hotplug
phase (arch_add_memory->__add_pages->__add_section->__add_zone).  In the
vast majority of cases this means that they are added to ZONE_NORMAL.
This has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory
hotadd without sparsemem") and it wasn't a big deal back then because
movable onlining didn't exist yet.

Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
onlining 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
memory and portion memory") and then things got more complicated.
Rather than reconsidering the zone association which was no longer
needed (because the memory hotplug already depended on SPARSEMEM) a
convoluted semantic of zone shifting has been developed.  Only the
currently last memblock or the one adjacent to the zone_movable can be
onlined movable.  This essentially means that the online type changes as
the new memblocks are added.

Let's simulate memory hot online manually
  $ echo 0x100000000 > /sys/devices/system/memory/probe
  $ grep . /sys/devices/system/memory/memory32/valid_zones
  Normal Movable

  $ echo $((0x100000000+(128<<20))) > /sys/devices/system/memory/probe
  $ grep . /sys/devices/system/memory/memory3?/valid_zones
  /sys/devices/system/memory/memory32/valid_zones:Normal
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable

  $ echo $((0x100000000+2*(128<<20))) > /sys/devices/system/memory/probe
  $ grep . /sys/devices/system/memory/memory3?/valid_zones
  /sys/devices/system/memory/memory32/valid_zones:Normal
  /sys/devices/system/memory/memory33/valid_zones:Normal
  /sys/devices/system/memory/memory34/valid_zones:Normal Movable

  $ echo online_movable > /sys/devices/system/memory/memory34/state
  $ grep . /sys/devices/system/memory/memory3?/valid_zones
  /sys/devices/system/memory/memory32/valid_zones:Normal
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable
  /sys/devices/system/memory/memory34/valid_zones:Movable Normal
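
The block numbers in the transcript above follow from the probed
physical address and the memory block size. A minimal sketch of that
arithmetic, assuming 128 MiB blocks (the 128<<20 step used between
probes) and a probe base of 0x100000000 (4 GiB), which is the address
that lands in memory32 at that block size. block_index() is a
hypothetical helper for illustration, not a kernel interface:

```python
# Illustrative arithmetic only; block_index() is a hypothetical helper.
BLOCK_SIZE = 128 << 20  # 128 MiB, the step between probed addresses


def block_index(phys_addr: int, block_size: int = BLOCK_SIZE) -> int:
    """Return N such that phys_addr falls in .../memory/memoryN."""
    return phys_addr // block_size


BASE = 0x100000000  # assumed probe base address (4 GiB)
probed = [BASE + i * BLOCK_SIZE for i in range(3)]
indices = [block_index(addr) for addr in probed]
print(indices)  # [32, 33, 34]
```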

This is an awkward semantic because a udev event is sent as soon as the
block is onlined and a udev handler might want to online it based on
some policy (e.g. association with a node) but it will inherently race
with new blocks showing up.

This patch changes the physical online phase to not associate pages with
any zone at all.  All the pages are just marked reserved and wait for
the onlining phase to be associated with the zone as per the online
request.  There are only two requirements

- existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap

- ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses

The latter is not an inherent requirement and can be changed in the
future; it preserves the current behavior and makes the code slightly
simpler.
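
The two requirements can be modelled as a check on page frame number
(PFN) ranges. This is a simplified sketch, not the kernel's
allow_online_pfn_range implementation: PfnRange and zones_layout_ok()
are hypothetical names, and treating an absent or empty zone as
unconstrained is an assumption of the model.

```python
from typing import NamedTuple, Optional


class PfnRange(NamedTuple):
    start: int  # first page frame number in the zone
    end: int    # one past the last page frame number


def zones_layout_ok(normal: Optional[PfnRange],
                    movable: Optional[PfnRange]) -> bool:
    """Check that Normal and Movable do not overlap and that Normal
    precedes Movable in physical addresses."""
    # Assumption: an absent or empty zone constrains nothing.
    if normal is None or movable is None:
        return True
    if normal.start >= normal.end or movable.start >= movable.end:
        return True
    # Requirement 1: the two half-open ranges must not overlap.
    overlap = normal.start < movable.end and movable.start < normal.end
    # Requirement 2: Normal must lie entirely below Movable.
    ordered = normal.end <= movable.start
    return (not overlap) and ordered


print(zones_layout_ok(PfnRange(0, 100), PfnRange(100, 200)))
print(zones_layout_ok(PfnRange(0, 150), PfnRange(100, 200)))
```

For non-empty ranges the ordering test already implies the overlap test;
both are spelled out only to mirror the two requirements as listed.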

This means that the same physical online steps as above will lead to the
following state:
  Normal Movable

  /sys/devices/system/memory/memory32/valid_zones:Normal Movable
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable

  /sys/devices/system/memory/memory32/valid_zones:Normal Movable
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable
  /sys/devices/system/memory/memory34/valid_zones:Normal Movable

  /sys/devices/system/memory/memory32/valid_zones:Normal Movable
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable
  /sys/devices/system/memory/memory34/valid_zones:Movable

Implementation:
The current move_pfn_range is reimplemented to check the above
requirements (allow_online_pfn_range) and then updates the respective
zone (move_pfn_range_to_zone), the pgdat and links all the pages in the
pfn range with the zone/node.  __add_pages is updated to not require the
zone and only initializes sections in the range.  This allows the
arch_add_memory code to be simplified (s390 could get rid of quite a
lot of code).

devm_memremap_pages is the only user of arch_add_memory which relies on
the zone association because it hooks into the memory hotplug only
half way.  It uses it to associate the new memory with ZONE_DEVICE but
doesn't allow it to be {on,off}lined via sysfs.  This means that this

[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-07 Thread Colin Ian King
Requires backports of commits:

a86d69d58aad561b6bbb44e60f74c41cd4b5f3ab
ed067d4a859ff696373324c5061392e013a7561a
f7f99100d8d95dbcf09e0216a143211e79418b9f
a4a3ede2132ae0863e2d43e06f9b5697c51a7a3b
ea1f5f3712afe895dfa4176ec87376b4a9ac23be

Title:
  artful 4.13 i386 kernels crash after memory hotplug remove

Status in linux package in Ubuntu:
  In Progress

Bug description:
  It seems all of the artful 4.13 kernels crash on i386 installs when
  memory hotplug removal is attempted.  The crash occurs a few seconds
  after the removal.  I have done a coarse bisect back to the first
  4.13.0-11 kernel, which is also affected.



[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-07 Thread Colin Ian King
Fixed in 4.15 with:

f7f99100d8d95dbcf09e0216a143211e79418b9f is the first bad commit
commit f7f99100d8d95dbcf09e0216a143211e79418b9f
Author: Pavel Tatashin 
Date:   Wed Nov 15 17:36:44 2017 -0800

mm: stop zeroing memory during allocation in vmemmap



[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-05 Thread Colin Ian King
Bisected, bad commit: b3c6858fb172512f63838523ae7817ae8adec564  - this
is a merge and contains a lot of misc changes across the tree that may
have broken this.



[Kernel-packages] [Bug 1747069] Re: artful 4.13 i386 kernels crash after memory hotplug remove

2018-02-03 Thread Colin Ian King
** Changed in: linux (Ubuntu)
   Status: Confirmed => In Progress
