Hi,

Any update on this issue?

I have the same system (SYS-2028U-TN24R4T+) with 11 x Intel P3600 2.0TB, NVMe PCIe 3.0, HET MLC 20nm, 3DWPD (SSDPE2ME020T4). I installed it successfully with the 20170116T154141Z platform image, and it worked for a few days before the headnode lost its connection to it. I cannot log in to it from anywhere, even from the console: it accepts the password and then the screen freezes, with no error shown. I have also rebooted it several times and booted it with many images, including the latest (20170201T235645Z), with no luck.

Thank you,
Viraphan

> On Aug 11, 2559 BE, at 8:38 PM, Youzhong Yang <[email protected]> wrote:
>
> Hi Robert,
>
> Thanks for looking into this issue.
>
> I tried MSI interrupt type on my own but it didn't work, but I will try your
> patch again and then report back.
>
> I've studied the nvme driver in Solaris 11.3, it seems they do the same thing
> as Linux - MSI-X first, then MSI, finally FIXED, see attached file for the
> assembly code and my comments. By the way, if I set nvmex_enable_msi and
> nvmex_enable_msix to 0 (false) in /etc/system, Solaris crashed immediately
> upon reboot.
>
> MSI-X works well on our host with a minor issue:
>
> In nvme_var.h, NVME_ADMIN_CMD_TIMEOUT is defined as 100000, i.e. 100ms I
> think. It's too small. One of the INTEL NVMe SSDs took 366ms to execute a GET
> LOG PAGE command. Once I bumped up the value to 1,000,000, the nvme driver
> happily attached all the 24 SSDs in many reboots.
>
> Here come new issues/lack of functionality:
>
> - The INTEL drives are formatted to use 512 bytes block size, but they
> advertise 4096 bytes block size as the best performing one (see the data in
> issue report https://www.illumos.org/issues/7279). So far we don't have the
> ability to perform such low level FORMAT and I had to install other OS such
> as Linux and use their tool (nvme-cli).
>
> - With NVMe SSD formatted to use 4096 block size, many things don't work.
> It seems our blkdev driver was never intended to support devices with a block
> size larger than 512 bytes. I tried hacking the bd_strategy() function to
> convert bp->b_lblkno (in 512-byte units, passed all the way down from the zfs
> layer) to 4096-byte units. It appeared to be working for most I/O ops, but I
> know it's just a hack and I will open a new thread discussing this particular
> issue.
>
> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/io/blkdev/blkdev.c#1142
> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/vdev_disk.c#783
>
> - Using 512-byte block size, I was able to create a zpool and do filesystem
> ops, and everything appeared to be great. But once I do zpool scrub, it
> reports checksum errors. It can be reproduced easily with one drive. I tried
> the same thing using the same drive in Solaris and it worked, so I don't
> think it's a hardware issue.
>
> The checksum errors issue just bothers me; I don't know where to start.
> I've done some dtracing in the nvme driver and found no error returned for
> the read/write cmds.
>
> Thanks,
>
> -Youzhong
>
> On Wed, Aug 10, 2016 at 10:41 PM, Robert Mustacchi <[email protected]> wrote:
> On 8/4/16 11:25 , Youzhong Yang wrote:
> > Thanks for the input Robert.
> >
> > I believe the issue is now resolved by using MSI-X (instead of FIXED)
> > interrupt type inside nvme_init() for the admin queue.
> >
> > Here is the issue report I just filed:
> >
> > https://www.illumos.org/issues/7273
> >
> > I don't know why FIXED interrupt would cause issue, probably because we
> > have too many NVMe SSDs?
>
> Following up on this aspect, I've put together the following:
>
> https://us-east.manta.joyent.com/rmustacc/public/webrevs/7273/index.html
>
> Youzhong, would it be possible for you to use this? We've had success
> with this with someone who was seeing issues. I'll hopefully get some
> time to look at the offlining issue, but it's still a bit out, sorry.
>
> Hans, can you review this and take a look?
>
> Thanks,
> Robert

<nvme_register_intrs.txt>

-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
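
The timeout arithmetic in Youzhong's message can be checked with a short sketch. It assumes, as the message itself guesses, that NVME_ADMIN_CMD_TIMEOUT is expressed in microseconds; the helper name admin_cmd_fits is hypothetical and is not part of the nvme driver.

```python
# Assumption: the timeout constant is in microseconds, as the quoted
# message guesses ("100000, i.e. 100ms I think").
USEC_PER_MSEC = 1000

OLD_TIMEOUT_USEC = 100_000        # value reported from nvme_var.h
NEW_TIMEOUT_USEC = 1_000_000      # value the timeout was bumped to
OBSERVED_GET_LOG_PAGE_MSEC = 366  # slowest GET LOG PAGE reported


def admin_cmd_fits(timeout_usec, duration_msec):
    """Hypothetical helper: True if a command that takes duration_msec
    finishes within a timeout of timeout_usec."""
    return duration_msec * USEC_PER_MSEC <= timeout_usec


if __name__ == "__main__":
    # 366 ms exceeds the old 100 ms budget but fits in the new 1 s budget.
    print(admin_cmd_fits(OLD_TIMEOUT_USEC, OBSERVED_GET_LOG_PAGE_MSEC))
    print(admin_cmd_fits(NEW_TIMEOUT_USEC, OBSERVED_GET_LOG_PAGE_MSEC))
```

Under that unit assumption, the reported 366 ms command is more than three times the original budget, which is consistent with the attach failures going away once the value was raised.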

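
The bd_strategy() hack described in the quoted message amounts to rescaling a block number from 512-byte units to the device's native block size. The sketch below shows only that arithmetic; it is not the blkdev code, the function name to_device_lba is hypothetical, and it assumes (as the message states) that bp->b_lblkno counts 512-byte units.

```python
DEV_BSIZE = 512  # assumed unit of bp->b_lblkno, per the quoted message


def to_device_lba(lblkno_512, nblocks_512, dev_block_size):
    """Hypothetical helper: convert a (start, length) pair given in
    512-byte units into units of dev_block_size bytes."""
    if dev_block_size <= 0 or dev_block_size % DEV_BSIZE != 0:
        raise ValueError("device block size must be a positive multiple of 512")
    ratio = dev_block_size // DEV_BSIZE
    # A request that does not start and end on a device block boundary
    # cannot be expressed in device blocks without read-modify-write.
    if lblkno_512 % ratio != 0 or nblocks_512 % ratio != 0:
        raise ValueError("request is not aligned to the device block size")
    return lblkno_512 // ratio, nblocks_512 // ratio


if __name__ == "__main__":
    # Sixteen 512-byte blocks in is 8192 bytes, i.e. device block 2 at 4096;
    # eight 512-byte blocks of length is one 4096-byte block.
    print(to_device_lba(16, 8, 4096))
```

The alignment check illustrates one reason a plain rescale could work for most I/O but not all of it: any request that is not a multiple of the device block size has no exact representation after the conversion.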