Re: [PATCH] Revert "xen: add helpers to allocate unpopulated memory"

2020-12-06 Thread Jürgen Groß

On 06.12.20 18:22, Marek Marczykowski-Górecki wrote:

> This reverts commit 9e2369c06c8a181478039258a4598c1ddd2cadfa.
>
> On a Xen PV dom0 with an NVMe disk, this makes the dom0 crash when starting
> a domain. This looks like some bad interaction between xen-blkback and

xen-scsiback has the same use pattern.

> the NVMe driver, both using ZONE_DEVICE. Since the author is on leave now,
> revert the change until a proper solution is developed.


[PATCH] Revert "xen: add helpers to allocate unpopulated memory"

2020-12-06 Thread Marek Marczykowski-Górecki
This reverts commit 9e2369c06c8a181478039258a4598c1ddd2cadfa.

On a Xen PV dom0 with an NVMe disk, this makes the dom0 crash when starting
a domain. This looks like some bad interaction between xen-blkback and
the NVMe driver, both using ZONE_DEVICE. Since the author is on leave now,
revert the change until a proper solution is developed.

The specific crash message is:

general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
CPU: 1 PID: 134 Comm: kworker/u12:2 Not tainted 5.9.9-1.qubes.x86_64 #1
Hardware name: LENOVO 20M9CTO1WW/20M9CTO1WW, BIOS N2CET50W (1.33 ) 
01/15/2020
Workqueue: dm-thin do_worker [dm_thin_pool]
RIP: e030:nvme_map_data+0x300/0x3a0 [nvme]
Code: b8 fe ff ff e9 a8 fe ff ff 4c 8b 56 68 8b 5e 70 8b 76 74 49 8b 02 48 
c1 e8 33 83 e0 07 83 f8 04 0f 85 f2 fe ff ff 49 8b 42 08 <83> b8 d0 00 00 00 04 
0f 85 e1 fe ff ff e9 38 fd ff ff 8b 55 70 be
RSP: e02b:c900010e7ad8 EFLAGS: 00010246
RAX: dead0100 RBX: 1000 RCX: 8881a58f5000
RDX: 1000 RSI:  RDI: 8881a679e000
RBP: 8881a5ef4c80 R08: 8881a5ef4c80 R09: 0002
R10: ea0003dfff40 R11: 0008 R12: 8881a679e000
R13: c900010e7b20 R14: 8881a70b5980 R15: 8881a679e000
FS:  () GS:8881b544() knlGS:
CS:  e030 DS:  ES:  CR0: 80050033
CR2: 01d64408 CR3: 0001aa2c CR4: 00050660
Call Trace:
 nvme_queue_rq+0xa7/0x1a0 [nvme]
 __blk_mq_try_issue_directly+0x11d/0x1e0
 ? add_wait_queue_exclusive+0x70/0x70
 blk_mq_try_issue_directly+0x35/0xc0
 blk_mq_submit_bio+0x58f/0x660
 __submit_bio_noacct+0x300/0x330
 process_shared_bio+0x126/0x1b0 [dm_thin_pool]
 process_cell+0x226/0x280 [dm_thin_pool]
 process_thin_deferred_cells+0x185/0x320 [dm_thin_pool]
 process_deferred_bios+0xa4/0x2a0 [dm_thin_pool]
 do_worker+0xcc/0x130 [dm_thin_pool]
 process_one_work+0x1b4/0x370
 worker_thread+0x4c/0x310
 ? process_one_work+0x370/0x370
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30
Modules linked in: loop snd_seq_dummy snd_hrtimer nf_tables nfnetlink vfat
fat snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common
snd_soc_hdac_hda snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof snd_soc_skl
snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match
snd_soc_acpi snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine elan_i2c
snd_hda_codec_hdmi mei_hdcp iTCO_wdt intel_powerclamp intel_pmc_bxt ee1004
intel_rapl_msr iTCO_vendor_support joydev pcspkr intel_wmi_thunderbolt
wmi_bmof thunderbolt ucsi_acpi idma64 typec_ucsi snd_hda_codec_realtek typec
snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec
thinkpad_acpi snd_hda_core ledtrig_audio int3403_thermal snd_hwdep snd_seq
snd_seq_device snd_pcm iwlwifi snd_timer processor_thermal_device mei_me
cfg80211 intel_rapl_common snd e1000e mei int3400_thermal int340x_thermal_zone
i2c_i801 acpi_thermal_rel soundcore intel_soc_dts_iosf i2c_smbus rfkill
intel_pch_thermal xenfs ip_tables dm_thin_pool dm_persistent_data
dm_bio_prison dm_crypt nouveau rtsx_pci_sdmmc mmc_core mxm_wmi
crct10dif_pclmul ttm crc32_pclmul crc32c_intel i915 ghash_clmulni_intel
i2c_algo_bit serio_raw nvme drm_kms_helper cec xhci_pci nvme_core rtsx_pci
xhci_pci_renesas drm xhci_hcd wmi video pinctrl_cannonlake pinctrl_intel
xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
---[ end trace f8d47e4aa6724df4 ]---
RIP: e030:nvme_map_data+0x300/0x3a0 [nvme]
Code: b8 fe ff ff e9 a8 fe ff ff 4c 8b 56 68 8b 5e 70 8b 76 74 49 8b 02 48 
c1 e8 33 83 e0 07 83 f8 04 0f 85 f2 fe ff ff 49 8b 42 08 <83> b8 d0 00 00 00 04 
0f 85 e1 fe ff ff e9 38 fd ff ff 8b 55 70 be
RSP: e02b:c900010e7ad8 EFLAGS: 00010246
RAX: dead0100 RBX: 1000 RCX: 8881a58f5000
RDX: 1000 RSI:  RDI: 8881a679e000
RBP: 8881a5ef4c80 R08: 8881a5ef4c80 R09: 0002
R10: ea0003dfff40 R11: 0008 R12: 8881a679e000
R13: c900010e7b20 R14: 8881a70b5980 R15: 8881a679e000
FS:  () GS:8881b544() knlGS:
CS:  e030 DS:  ES:  CR0: 80050033
CR2: 01d64408 CR3: 0001aa2c CR4: 00050660
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled

Discussion at 
https://lore.kernel.org/xen-devel/20201205082839.ts3ju6yta46cgwjn@Air-de-Roger/T

Cc: sta...@vger.kernel.org #v5.9+
(for 5.9 it's easier to revert the original commit directly)
Signed-off-by: Marek Marczykowski-Górecki 
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c |   9 +-
 drivers/xen/Kconfig |  10 --