This does not make sense to me, and I'm not sure whether it's relevant to the
issue I am seeing:

http://www.nvmexpress.org/wp-content/uploads/NVM-Express-1_1b.pdf
page 42 defines 'CSTS - Controller Status',

but our code defines it as

typedef union {
	struct {
		uint32_t csts_rdy:1;	/* Ready */
		uint32_t csts_cfs:1;	/* Controller Fatal Status */
		uint32_t csts_shst:2;	/* Shutdown Status */
		uint32_t csts_nssro:1;	/* NVM Subsystem Reset Occured */
		uint32_t csts_rsvd:28;
	} b;
	uint32_t r;
} nvme_reg_csts_t;
(http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/io/nvme/nvme_reg.h#108)

Shouldn't "uint32_t csts_rsvd:28;" be "uint32_t csts_rsvd:27;"? The defined
fields account for 1 + 1 + 2 + 1 = 5 bits, so only 27 bits (31:05) remain for
the reserved field.
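
For reference, this is how I'd expect the definition to read with the spec's
bit assignments -- just a sketch to illustrate the question, not a tested
patch; the field names are the ones already used in nvme_reg.h:

typedef union {
	struct {
		uint32_t csts_rdy:1;	/* Ready (bit 0) */
		uint32_t csts_cfs:1;	/* Controller Fatal Status (bit 1) */
		uint32_t csts_shst:2;	/* Shutdown Status (bits 3:2) */
		uint32_t csts_nssro:1;	/* NVM Subsystem Reset Occurred (bit 4) */
		uint32_t csts_rsvd:27;	/* bits 31:05 reserved */
	} b;
	uint32_t r;
} nvme_reg_csts_t;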

Thanks!

On Mon, Aug 1, 2016 at 2:59 PM, Youzhong Yang <youzh...@gmail.com> wrote:

> I'm not sure what sector size it uses. Randomly failing some devices
> suggests that the NVMe driver is not doing the right thing. I am not going
> to blame the hardware, because everything looks good under Solaris 11.3 and
> CentOS.
>
> Thanks!
>
>
> On Mon, Aug 1, 2016 at 2:39 PM, Michael Loftis <mlof...@wgops.com> wrote:
>
>> One random question: are the affected SSDs LBA#3/4K sector formatted?
>>
>> On Mon, Aug 1, 2016 at 10:41 AM, Youzhong Yang <youzh...@gmail.com>
>> wrote:
>>
>>> Hello again,
>>>
>>> Thanks, Robert, for the advice. I've spent some time struggling with why
>>> the NVMe SSDs were retired even though the NVMe driver reported no error;
>>> it turns out to be a victim of fmd_asru_hash_replay_asru(), i.e. if we
>>> don't tell fmd a fault has been repaired, fmd replays the event the next
>>> time the host is rebooted.
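>>>
>>> (For anyone else hitting this, a sketch of the repair step I mean,
>>> assuming the stock fmadm; the UUID is whatever fmadm faulty reports:)
>>>
>>> # list the cached faults that fmd will replay on the next boot
>>> fmadm faulty
>>> # tell fmd the fault has been repaired so it is not replayed
>>> fmadm repair <uuid-from-fmadm-faulty>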
>>>
>>> I plugged in all 24 NVMe SSDs, and the driver reported errors like these
>>> (see the attached txt file for additional info):
>>>
>>> 2016-07-30T23:11:53.468013-04:00 batfs9995 nvme: [ID 265585
>>> kern.warning] WARNING: nvme3: command timeout, OPC = 6, CFS = 0
>>> 2016-07-30T23:11:53.468018-04:00 batfs9995 nvme: [ID 265585
>>> kern.warning] WARNING: nvme3: command timeout, OPC = 8, CFS = 0
>>> 2016-07-30T23:11:53.468024-04:00 batfs9995 nvme: [ID 176450
>>> kern.warning] WARNING: nvme3: nvme_admin_cmd failed for ABORT
>>> 2016-07-30T23:11:53.468032-04:00 batfs9995 nvme: [ID 366983
>>> kern.warning] WARNING: nvme3: nvme_admin_cmd failed for IDENTIFY
>>> 2016-07-30T23:11:53.468038-04:00 batfs9995 nvme: [ID 318795
>>> kern.warning] WARNING: nvme3: failed to identify controller
>>> 2016-07-30T23:11:53.468045-04:00 batfs9995 genunix: [ID 408114 kern.info]
>>> /pci@6d,0/pci8086,6f04@2/pci10b5,9765@0/pci10b5,9765@7/pci8086,370a@0
>>> (nvme3) down
>>>
>>> Here is my understanding of what happened after NVMe driver reported the
>>> above errors:
>>>
>>> - The NVMe driver called ddi_fm_service_impact(nvme->n_dip,
>>> DDI_SERVICE_LOST) to report the error for device
>>> /pci@6d,0/pci8086,6f04@2/pci10b5,9765@0/pci10b5,9765@7/pci8086,370a@0
>>>
>>> - fmd received an ereport.io.service.lost event with device-path =
>>> /pci@6d,0/pci8086,6f04@2/pci10b5,9765@0/pci10b5,9765@7/pci8086,370a@0
>>>
>>> - fmd decided the event affected the following devs:
>>>        dev:////pci@6d,0/pci8086,6f04@2/pci10b5,9765@0/pci10b5,9765@7/pci8086,370a@0
>>>        dev:////pci@6d,0/pci8086,6f04@2/pci10b5,9765@0
>>>        dev:////pci@6d,0/pci8086,6f04@2
>>>
>>> - fmd sent requests to retire the above devices, which caused all the
>>> SSDs under /pci@6d,0/pci8086,6f04@2 to be retired (commands to verify
>>> this are sketched below).
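>>>
>>> (A sketch of the verification commands, assuming the stock FMA tooling;
>>> the class name is the one from the event above:)
>>>
>>> # dump the matching error telemetry, verbose
>>> fmdump -eV -c ereport.io.service.lost
>>> # show everything fmd currently considers faulty, including retired devices
>>> fmadm faulty -a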
>>>
>>> Why fmd decides to retire the ancestors of the problematic device is a
>>> different issue; the issue here is why the NVMe driver failed to execute
>>> some of its commands during nvme_attach(). Every time I rebooted the
>>> host, it randomly failed some of the 24 devices, and only rarely was
>>> there no error at all.
>>>
>>> This is just an update on what I am up to; hopefully you guys can shed
>>> some light on what can be done next.
>>>
>>> Thanks,
>>>
>>> -Youzhong
>>>
>>>
>>> On Fri, Jun 24, 2016 at 8:13 PM, Robert Mustacchi <r...@joyent.com> wrote:
>>>
>>>> On 6/24/16 11:05, Youzhong Yang wrote:
>>>>
>>>> > I panicked the host when e_ddi_retire_device() was called; here is
>>>> > what I found:
>>>> >
>>>> > it is /usr/lib/fm/fmd/fmd that calls modctl -> modctl_retire
>>>> > -> e_ddi_retire_device to retire /pci@0,0/pci8086,6f08@3.
>>>>
>>>> Okay, this makes some amount of sense; we're seeing various FM ereports
>>>> being generated at a rate that causes us to eventually offline the
>>>> device.
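>>>>
>>>> (One way to see exactly what triggers the retire without panicking the
>>>> box -- a sketch, assuming the fbt provider exposes e_ddi_retire_device
>>>> in your build:)
>>>>
>>>> dtrace -n 'fbt::e_ddi_retire_device:entry{ stack(); ustack(); }'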
>>>>
>>>> > Attached is a file with some entries produced by fmdump. It's weird
>>>> > that sometimes I got those fm entries, but sometimes the system
>>>> > generated nothing yet still retired the drives.
>>>> >
>>>> > I don't know how to interpret those entries; maybe someone on the
>>>> > list can shed some light?
>>>>
>>>> So, these are errors based on the PCI Express specification, and the
>>>> various entries usually refer to parts of the Advanced Error Reporting
>>>> (AER) capability. What I do here is go through and look at the
>>>> correctable and uncorrectable error status members, which correspond to
>>>> the AER registers.
>>>>
>>>> So the first one, starting at line 11, indicates that a receive error
>>>> was encountered. Note that the entry that generated it is not the
>>>> device, but what seems to be the non-transparent bridge.
>>>>
>>>> It's also worth calling out what the general ereports are talking about.
>>>> You'll note there are basically three different classes there:
>>>>
>>>> - ereport.io.pci.fabric
>>>> - ereport.io.pciex.rc.ce-msg
>>>> - ereport.io.pciex.pl.re
>>>>
>>>> So, the pl.re class indicates receiver errors, which, if I'm reading
>>>> this correctly, points to issues in decoding some of the data on the
>>>> link.
>>>>
>>>> The rc.ce-msg class means that the root complex has been informed of
>>>> correctable errors.
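>>>>
>>>> (If it's useful, a sketch for pulling just those classes out of the
>>>> error log, assuming the stock fmdump; the class names are the ones
>>>> listed above:)
>>>>
>>>> fmdump -e -c ereport.io.pciex.pl.re
>>>> fmdump -e -c ereport.io.pciex.rc.ce-msg
>>>> fmdump -e -c ereport.io.pci.fabric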
>>>>
>>>> That said, some of the messages that have arrived at the root port seem
>>>> a bit odd.
>>>>
>>>> > Device 8086:6f08 is "Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3
>>>> > v4/Xeon D PCI Express Root Port 3" and seems to use "PCIe
>>>> > bridge/switch driver" (pcieb). Is it possible the pcieb driver in
>>>> > illumos does not work properly with this device?
>>>> 
>>>> It looks like the actual NVMe devices may be connected to a
>>>> non-transparent bridge. So it's highly likely that that device, which is
>>>> also what's directly connected to that port, is the one failing. I have
>>>> seen something similar, but not on a system we have at Joyent.
>>>> 
>>>> I'm going to have to spend a bit more time understanding the exact set
>>>> of FM actions that caused us to end up deciding to offline that device,
>>>> but in the interim, I'd suggest that we go through and see if this is
>>>> correlated at all with activity on the NVMe devices. While I'm not sure
>>>> I have any reason to believe that the NVMe driver is at issue, it might
>>>> be a useful data point.
>>>> 
>>>> First, what I'd suggest is that you use dtrace -Z here. -Z basically
>>>> tells DTrace to ignore probes that don't exist yet. That way, when you
>>>> run add_drv on nvme and DTrace sees that the functions are now in the
>>>> nvme driver, it'll end up enabling them. Then make sure you kill DTrace
>>>> before you want to rem_drv; otherwise it'll block the removal.
>>>> 
>>>> Perhaps let's try something like:
>>>> 
>>>> dtrace -Zn 'fbt::pf_send_ereport:entry,fbt::nvme_submit_cmd:entry{
>>>> trace(timestamp); }' -n 'fbt::nvme_wait_cmd:return{ trace(timestamp);
>>>> trace(arg1); }'
>>>> 
>>>> Robert
>>>> 
>>>
>>>
>>
>>
>> --
>>
>> "Genius might be described as a supreme capacity for getting its
>> possessors
>> into trouble of all kinds."
>> -- Samuel Butler
>>
>
>


