[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2019-01-10 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

Warner Losh  changed:

   What|Removed |Added

   Priority|--- |Normal
   Assignee|b...@freebsd.org|i...@freebsd.org
 Status|New |Open

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2019-01-10 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #13 from Warner Losh  ---
I understand what I must do next, but haven't had the time to do the changes
for the next round of testing because of time off around the holidays, travel,
etc.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-15 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #11 from Alexandr Krivulya  ---
With kernel patched from nvme-suspend branch I have a new messages in log:

Dec 15 14:50:41 thinkpad kernel: pm_runtime_get_if_in_use not implemented --
see your local kernel hacker
Dec 15 14:50:41 thinkpad kernel: sched_setscheduler_nocheck not implemented --
see your local kernel hacker
Dec 15 14:50:41 thinkpad kernel: register_oom_notifier not implemented -- see
your local kernel hacker
Dec 15 14:50:41 thinkpad kernel: [drm] Initialized i915 1.6.0 20171222 for
drmn0 on minor 0
Dec 15 14:50:41 thinkpad kernel: register_acpi_notifier not implemented -- see
your local kernel hacker
Dec 15 14:50:41 thinkpad kernel: async_schedule is dodgy -- see your local
kernel hacker
Dec 15 14:50:41 thinkpad kernel: pm_runtime_set_autosuspend_delay not
implemented -- see your local kernel hacker
Dec 15 14:50:41 thinkpad kernel: __pm_runtime_use_autosuspend not implemented
-- see your local kernel hacker
Dec 15 14:50:41 thinkpad kernel: async_synchronize_cookie not implemented --
see your local kernel hacker

Also kernel panicked while suspending:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x4d8
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x807ca21d
stack pointer   = 0x28:0xfe4857a0
frame pointer   = 0x28:0xfe4857c0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 13 (g_down)
trap number = 12
panic: page fault
cpuid = 3
time = 1544877759
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe485450
vpanic() at vpanic+0x1b4/frame 0xfe4854b0
panic() at panic+0x43/frame 0xfe485510
trap_fatal() at trap_fatal+0x369/frame 0xfe485560
trap_pfault() at trap_pfault+0x49/frame 0xfe4855c0
trap() at trap+0x29f/frame 0xfe4856d0
calltrap() at calltrap+0x8/frame 0xfe4856d0
--- trap 0xc, rip = 0x807ca21d, rsp = 0xfe4857a0, rbp =
0xfe4857c0 ---
nvme_qpair_submit_request() at nvme_qpair_submit_request+0x2d/frame
0xfe4857c0
nvme_ctrlr_submit_io_request() at nvme_ctrlr_submit_io_request+0x2f/frame
0xfe4857e0
nvme_ns_cmd_deallocate() at nvme_ns_cmd_deallocate+0x84/frame
0xfe485820
nvme_ns_bio_process() at nvme_ns_bio_process+0x257/frame 0xfe4858a0
nvd_strategy() at nvd_strategy+0x4d/frame 0xfe4858d0
g_disk_start() at g_disk_start+0x360/frame 0xfe485920
g_io_schedule_down() at g_io_schedule_down+0x1a1/frame 0xfe485960
g_down_procbody() at g_down_procbody+0x6d/frame 0xfe485970
fork_exit() at fork_exit+0x83/frame 0xfe4859b0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe4859b0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #10 from Warner Losh  ---
(In reply to Alexandr Krivulya from comment #6)
Just rebuild the kernel and/or module.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #9 from mmata...@gmail.com ---
I uploaded photos of what the kernel panic I get on resume.  Hopefully that is
useful.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #8 from mmata...@gmail.com ---
Created attachment 200117
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=200117=edit
Kernel Panic while resuming 2

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #7 from mmata...@gmail.com ---
Created attachment 200116
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=200116=edit
Kernel Panic while resuming

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-12 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #6 from Alexandr Krivulya  ---
Do I need to build a whole tree or just patch these files?

sys/dev/nvme/nvme.c
sys/dev/nvme/nvme_ctrlr.c
sys/dev/nvme/nvme_private.h

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-12 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

--- Comment #5 from Warner Losh  ---
I have some code in my https://github.com/bsdimp/freebsd in the nvme-suspend
branch you might want to test. I've not had time to test it out, so I'm looking
for help testing it out.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-12-12 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

Alexandr Krivulya  changed:

   What|Removed |Added

 CC||shur...@shurik.kiev.ua

--- Comment #4 from Alexandr Krivulya  ---
I have not any data corruptions, but I prefer to shutdown system instead of
suspend until we do not have any progress on it.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-11-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

ultrageran...@bleu255.com changed:

   What|Removed |Added

 CC||ultrageran...@bleu255.com

--- Comment #3 from ultrageran...@bleu255.com ---
I'm seeing the same on a Thinkpad X1 (2018) with 12.0-PRERELEASE (r341261) and
following hardware:

nvme0@pci0:4:0:0:   class=0x010802 card=0xa801144d chip=0xa808144d rev=0x00
hdr=0x00
  vendor = 'Samsung Electronics Co Ltd'
  device = 'NVMe SSD Controller SM981/PM981'
  class  = mass storage
  subclass   = NVM

Not sure if related but I suffered an unpleasant data loss during one of these
hanging resuming state.

Is it reliable to use suspend while this issue is not solved?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-09-26 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

mmata...@gmail.com changed:

   What|Removed |Added

 CC||mmata...@gmail.com

--- Comment #2 from mmata...@gmail.com ---
I am experiencing the same problem however I will get, around 50% of the time,
a kernel panic or the machine simply becomes unresponsive.  I have seen errors
saying something along the lines of the nvme device disappeared.  Unfortunately
because logs don't actually persist to disk since the disk is gone.

I am running a new Dell XPS 13 and commit 338924 with drm-kmod-next

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-08-24 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

Eric van Gyzen  changed:

   What|Removed |Added

 CC||vangy...@freebsd.org
   Severity|Affects Only Me |Affects Some People

--- Comment #1 from Eric van Gyzen  ---
I get these on my Dell XPS 13 running head, too.

Aug 12 20:35:51 hammy kernel: nvme0: Resetting controller due to a timeout.
Aug 12 20:35:51 hammy kernel: nvme0: resetting controller
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: READ sqid:1 cid:103 nsid:1 lba:296718939
len:24
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:1
cid:103 cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: READ sqid:1 cid:106 nsid:1 lba:147433251
len:186
Aug 12 20:35:51 hammy kernel: nvme0: nvme0: ABORTED - BY REQUEST (00/07) sqid:1
cid:106 cdw0:0
Aug 12 20:35:51 hammy kernel: async event occurred (type 0x0, info 0x00, page
0x01)
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: WRITE sqid:2 cid:125 nsid:1 lba:290334796
len:7
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:2
cid:125 cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: DATASET MANAGEMENT sqid:3 cid:84 nsid:1
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:3 cid:84
cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: DATASET MANAGEMENT sqid:3 cid:94 nsid:1
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:3 cid:94
cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: READ sqid:4 cid:126 nsid:1 lba:261212309
len:7
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:4
cid:126 cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: WRITE sqid:5 cid:87 nsid:1 lba:290332732
len:16
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:5 cid:87
cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: READ sqid:5 cid:86 nsid:1 lba:107551651
len:4
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:5 cid:86
cdw0:0
Aug 12 20:35:51 hammy kernel: nvme0: aborting outstanding i/o
Aug 12 20:35:51 hammy kernel: nvme0: READ sqid:6 cid:115 nsid:1 lba:312559547
len:5
Aug 12 20:35:51 hammy kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:6
cid:115 cdw0:0
Aug 12 20:35:51 hammy acpi[1696]: resumed at 20180812 20:35:51

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 230722] [nvme][resume] Delay and possible deadlock after resume

2018-08-17 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230722

Bug ID: 230722
   Summary: [nvme][resume] Delay and possible deadlock after
resume
   Product: Base System
   Version: CURRENT
  Hardware: amd64
OS: Any
Status: New
  Severity: Affects Only Me
  Priority: ---
 Component: kern
  Assignee: b...@freebsd.org
  Reporter: go...@freebsd.org
CC: i...@freebsd.org, jimhar...@freebsd.org

When resuming from S3 sleep any disk activity stops for around 30 seconds and
kernel prints a bunch of error messages from NVME. Eventually, the system
resumes normal operations. I also saw this scenario resulting in deadlock
although can't reproduce it reliably anymore.

Error messages printed by the kernel:
nvme0: Resetting controller due to a timeout.
nvme0: resetting controller
nvme0: aborting outstanding i/o
nvme0: READ sqid:1 cid:111 nsid:1 lba:98381272 len:8
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:111 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:1 cid:124 nsid:1 lba:626180072 len:8
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:124 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:2 cid:121 nsid:1 lba:617174512 len:40
nvme0: ABORTED - BY REQUEST (00/07) sqid:2 cid:121 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:94 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:94 cdw0:0
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:4 cid:93 nsid:1 lba:46557808 len:16
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:93 cdw0:0
nvme0: Missing interrupt

Controller information:

Controller Capabilities/Features

Vendor ID:   17aa
Subsystem Vendor ID: 17aa
Serial Number:   1150592304774
Model Number:LENSE20512GMSP34MEAT2TA
Firmware Version:1.9.8341
Recommended Arb Burst:   2
IEEE OUI Identifier: 99 32 a0
Multi-Path I/O Capabilities: Not Supported
Max Data Transfer Size:  131072
Controller ID:   0x01
Version: 0.0.0

Admin Command Set Attributes

Security Send/Receive:   Supported
Format NVM:  Supported
Firmware Activate/Download:  Supported
Namespace Managment: Not Supported
Device Self-test:Not Supported
Directives:  Not Supported
NVMe-MI Send/Receive:Not Supported
Virtualization Management:   Not Supported
Doorbell Buffer Config   Not Supported
Abort Command Limit: 4
Async Event Request Limit:   4
Number of Firmware Slots:1
Firmware Slot 1 Read-Only:   No
Per-Namespace SMART Log: No
Error Log Page Entries:  4
Number of Power States:  5

NVM Command Set Attributes
==
Submission Queue Entry Size
  Max:   64
  Min:   64
Completion Queue Entry Size
  Max:   16
  Min:   16
Number of Namespaces:1
Compare Command: Not Supported
Write Uncorrectable Command: Supported
Dataset Management Command:  Supported
Write Zeroes Command:Not Supported
Save Features:   Supported
Reservations:Not Supported
Timestamp feature:   Not Supported
Fused Operation Support: Not Supported
Format NVM Attributes:   Crypto Erase, Per-NS Erase, Per-NS Format
Volatile Write Cache:Present

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"