The 4.15 kernel is now available in Bionic.  Marking this Fix Released.

** Tags added: kernel

** Information type changed from Proprietary to Public

** Changed in: intel
       Status: New => Fix Released

** Changed in: linux (Ubuntu)
       Status: New => Fix Released

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1744637

Title:
  [Bug]Crystal Ridge - non-temporal stores receive double fault in KVM
  guest

Status in intel:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  Description:
  I'm seeing a regression in my QEMU-based NVDIMM testing system, and I
  bisected it to this commit:

  664f8e26b00c7673a8303b0d40853a0c24ca93e1 is the first bad commit
  commit 664f8e26b00c7673a8303b0d40853a0c24ca93e1
  Author: Wanpeng Li <wanpeng...@hotmail.com>
  Date: Thu Aug 24 03:35:09 2017 -0700

  KVM: X86: Fix loss of exception which has not yet been injected

  The behavior I'm seeing is that heavy I/O to simulated NVDIMMs in
  multiple virtual machines causes the QEMU guests to receive double
  faults, crashing them. Here's an example backtrace:

  [ 1042.653816] PANIC: double fault, error_code: 0x0
  [ 1042.654398] CPU: 2 PID: 30257 Comm: fsstress Not tainted 4.15.0-rc5 #1
  [ 1042.655169] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
  [ 1042.656121] RIP: 0010:memcpy_flushcache+0x4d/0x180
  [ 1042.656631] RSP: 0018:ffffac098c7d3808 EFLAGS: 00010286
  [ 1042.657245] RAX: ffffac0d18ca8000 RBX: 0000000000000fe0 RCX: ffffac0d18ca8000
  [ 1042.658085] RDX: ffff921aaa5df000 RSI: ffff921aaa5e0000 RDI: 000019f26e6c9000
  [ 1042.658802] RBP: 0000000000001000 R08: 0000000000000000 R09: 0000000000000000
  [ 1042.659503] R10: 0000000000000000 R11: 0000000000000000 R12: ffff921aaa5df020
  [ 1042.660306] R13: ffffac0d18ca8000 R14: fffff4c102a977c0 R15: 0000000000001000
  [ 1042.661132] FS: 00007f71530b90c0(0000) GS:ffff921b3b280000(0000) knlGS:0000000000000000
  [ 1042.662051] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1042.662528] CR2: 0000000001156002 CR3: 000000012a936000 CR4: 00000000000006e0
  [ 1042.663093] Call Trace:
  [ 1042.663329] write_pmem+0x6c/0xa0 [nd_pmem]
  [ 1042.663668] pmem_do_bvec+0x15f/0x330 [nd_pmem]
  [ 1042.664056] ? kmem_alloc+0x61/0xe0 [xfs]
  [ 1042.664393] pmem_make_request+0xdd/0x220 [nd_pmem]
  [ 1042.664781] generic_make_request+0x11f/0x300
  [ 1042.665135] ? submit_bio+0x6c/0x140
  [ 1042.665436] submit_bio+0x6c/0x140
  [ 1042.665754] ? next_bio+0x18/0x40
  [ 1042.666025] ? _cond_resched+0x15/0x40
  [ 1042.666341] submit_bio_wait+0x53/0x80
  [ 1042.666804] blkdev_issue_zeroout+0xdc/0x210
  [ 1042.667336] ? __dax_zero_page_range+0xb5/0x140
  [ 1042.667810] __dax_zero_page_range+0xb5/0x140
  [ 1042.668197] ? xfs_file_iomap_begin+0x2bd/0x8e0 [xfs]
  [ 1042.668611] iomap_zero_range_actor+0x7c/0x1b0
  [ 1042.668974] ? iomap_write_actor+0x170/0x170
  [ 1042.669318] iomap_apply+0xa4/0x110
  [ 1042.669616] ? iomap_write_actor+0x170/0x170
  [ 1042.669958] iomap_zero_range+0x52/0x80
  [ 1042.670255] ? iomap_write_actor+0x170/0x170
  [ 1042.670616] xfs_setattr_size+0xd4/0x330 [xfs]
  [ 1042.670995] xfs_ioc_space+0x27e/0x2f0 [xfs]
  [ 1042.671332] ? terminate_walk+0x87/0xf0
  [ 1042.671662] xfs_file_ioctl+0x862/0xa40 [xfs]
  [ 1042.672035] ? _copy_to_user+0x22/0x30
  [ 1042.672346] ? cp_new_stat+0x150/0x180
  [ 1042.672663] do_vfs_ioctl+0xa1/0x610
  [ 1042.672960] ? SYSC_newfstat+0x3c/0x60
  [ 1042.673264] SyS_ioctl+0x74/0x80
  [ 1042.673661] entry_SYSCALL_64_fastpath+0x1a/0x7d
  [ 1042.674239] RIP: 0033:0x7f71525a2dc7
  [ 1042.674681] RSP: 002b:00007ffef97aa778 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [ 1042.675664] RAX: ffffffffffffffda RBX: 00000000000112bc RCX: 00007f71525a2dc7
  [ 1042.676592] RDX: 00007ffef97aa7a0 RSI: 0000000040305825 RDI: 0000000000000003
  [ 1042.677520] RBP: 0000000000000009 R08: 0000000000000045 R09: 00007ffef97aa78c
  [ 1042.678442] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
  [ 1042.679330] R13: 0000000000019e38 R14: 00000000000fcca7 R15: 0000000000000016
  [ 1042.680216] Code: 48 8d 5d e0 4c 8d 62 20 48 89 cf 48 29 d7 48 89 de 48 83 e6 e0 4c 01 e6 48 8d 04 17 4c 8b 02 4c 8b 4a 08 4c 8b 52 10 4c 8b 5a 18 <4c> 0f c3 00 4c 0f c3 48 08 4c 0f c3 50 10 4c 0f c3 58 18 48 83
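
  To tie the trace to the bug title: the byte marked <4c> in the Code:
  line is the first byte of the instruction at the faulting RIP. The
  Python below is a minimal sketch, not part of the original report, and
  the helper names faulting_bytes and is_movnti are made up for
  illustration. It checks that the marked bytes are a REX prefix
  followed by 0F C3, the opcode of MOVNTI, the x86 non-temporal integer
  store.

```python
# "Code:" bytes copied from the oops above; <..> marks the byte at the faulting RIP.
CODE_LINE = (
    "48 8d 5d e0 4c 8d 62 20 48 89 cf 48 29 d7 48 89 de 48 83 e6 e0 4c 01 e6 "
    "48 8d 04 17 4c 8b 02 4c 8b 4a 08 4c 8b 52 10 4c 8b 5a 18 <4c> 0f c3 00 "
    "4c 0f c3 48 08 4c 0f c3 50 10 4c 0f c3 58 18 48 83"
)


def faulting_bytes(code_line, count=4):
    """Return `count` byte values starting at the <..>-marked byte."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def is_movnti(insn):
    """True if the bytes are an optional REX prefix followed by 0F C3 (MOVNTI)."""
    body = insn[1:] if 0x40 <= insn[0] <= 0x4F else insn
    return body[:2] == [0x0F, 0xC3]


insn = faulting_bytes(CODE_LINE)
print(" ".join(f"{b:02x}" for b in insn), "->", "movnti" if is_movnti(insn) else "other")
# prints: 4c 0f c3 00 -> movnti
```

  The three instructions that follow in the dump (4c 0f c3 48 08, 4c 0f
  c3 50 10, 4c 0f c3 58 18) carry the same 0F C3 opcode, which matches
  an unrolled run of non-temporal stores in memcpy_flushcache.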

  This appears to be independent of both the guest kernel version (this
  backtrace has v4.15.0-rc5, but I've seen it with other kernels) and
  the host QEMU version (mine happens to be qemu-2.10.1-2.fc27 in
  Fedora 27).

  The new behavior is due to this commit being present in the host OS
  kernel. Prior to this commit I could fire up 4 VMs and run xfstests
  on my simulated NVDIMMs, but after this commit such testing results in
  several of my VMs crashing almost immediately.

  Reproduction is very simple, at least on my development box. All you
  need is a pair of VMs (I just did it with clean installs of Fedora
  27) with NVDIMMs. Here's a sample QEMU command to get one of these:

  qemu-system-x86_64 /home/rzwisler/vms/Fedora27.qcow2 \
    -m 4G,slots=3,maxmem=512G -smp 12 \
    -machine pc,accel=kvm,nvdimm -enable-kvm \
    -object memory-backend-file,id=mem1,share,mem-path=/home/rzwisler/nvdimms/nvdimm-1,size=17G \
    -device nvdimm,memdev=mem1,id=nv1

  In my setup, my NVDIMM backing files (/home/rzwisler/nvdimms/nvdimm-1)
  are created on a filesystem on an SSD.

  After these two QEMU guests are up, run write I/O to the resulting
  /dev/pmem0 devices. I've done this with xfstests and fio to get the
  error, but the simplest way is just:

  dd if=/dev/zero of=/dev/pmem0

  The double fault should happen in under a minute, definitely before
  the dd processes run out of space on their /dev/pmem0 devices.
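
  The reproduction needs two guests, so the host-side setup can be
  sketched as below. This is an illustrative sketch only, not part of
  the original report. It assumes each guest gets its own disk image and
  its own NVDIMM backing file; the function name nvdimm_guest_cmd and
  the /path/to/... locations are hypothetical. The QEMU options are
  copied unchanged from the sample command above.

```python
import shlex


def nvdimm_guest_cmd(disk_image, backing_file, size="17G"):
    """Build the QEMU argv from the report for one guest with one NVDIMM."""
    return [
        "qemu-system-x86_64", disk_image,
        "-m", "4G,slots=3,maxmem=512G",
        "-smp", "12",
        "-machine", "pc,accel=kvm,nvdimm",
        "-enable-kvm",
        "-object",
        f"memory-backend-file,id=mem1,share,mem-path={backing_file},size={size}",
        "-device", "nvdimm,memdev=mem1,id=nv1",
    ]


# Hypothetical paths: one disk image and one backing file per guest (assumption).
guests = [
    nvdimm_guest_cmd(f"/path/to/guest{n}.qcow2", f"/path/to/nvdimm-{n}")
    for n in (1, 2)
]

# Print the commands rather than launching them, so they can be reviewed first.
for argv in guests:
    print(shlex.join(argv))
```

  Each printed line can be started on the host by hand; the dd command
  is then run inside each guest.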

  I've reproduced this on multiple development boxes, so I'm pretty sure
  it's not related to a flaky hardware setup.

  Commit ids: 2a266f23550be997d783f27e704b9b40c4010292

  Target Kernel: 4.15
  Target Release: 18.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1744637/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
