Private bug reported:

I've been trying to run the server certification tools on a server with
DCPMM devices that are configured as:  15% MemoryMode, 85% AppDirect
mode.

I've configured fsdax, devdax, sector and raw devices on the DCPMMs.
fsdax, devdax and sector are all mountable, formatted etc.

The test runs CKing's stress-ng disk tests against the DCPMM storage
devices, and after some time the entire server abends and resets.

This is the exact same test we run on all servers, and it has never caused
this sort of behaviour. So the likely causes are:

1: Some oddity in testing the AppDirect storage devices on DCPMMs.
2: Something in the kernel that can't deal with high I/O loads on DCPMMs.


I'm currently working out an easier way to recreate this, so I'll provide
directions soon.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-54-generic 4.15.0-54.58
ProcVersionSignature: Ubuntu 4.15.0-54.58-generic 4.15.18
Uname: Linux 4.15.0-54-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116,  1 Jul  2 04:41 seq
 crw-rw---- 1 root audio 116, 33 Jul  2 04:41 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
Date: Tue Jul  2 05:37:17 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 0b1f:03e9 Insyde Software Corp. 
 Bus 001 Device 002: ID 0000:0001  
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Intel Corporation S2600WFD
PciMultimedia:
 
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-54-generic 
root=UUID=d8f7444e-3965-49ba-bc42-628bc368893a ro
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-54-generic N/A
 linux-backports-modules-4.15.0-54-generic  N/A
 linux-firmware                             1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/27/2019
dmi.bios.vendor: Intel Corporation
dmi.bios.version: SE5C620.86B.0D.01.0395.022720191340
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: S2600WFD
dmi.board.vendor: Intel Corporation
dmi.board.version: J46732-610
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 23
dmi.chassis.vendor: ...............................
dmi.chassis.version: ..................
dmi.modalias: 
dmi:bvnIntelCorporation:bvrSE5C620.86B.0D.01.0395.022720191340:bd02/27/2019:svnIntelCorporation:pnS2600WFD:pvr....................:rvnIntelCorporation:rnS2600WFD:rvrJ46732-610:cvn...............................:ct23:cvr..................:
dmi.product.family: Family
dmi.product.name: S2600WFD
dmi.product.version: ....................
dmi.sys.vendor: Intel Corporation

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug bionic uec-images

** Information type changed from Public to Public Security

** Information type changed from Public Security to Private

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1834990

Title:
  Cascade Lake system with DCPMM devices abends under stress

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1834990/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs