> On May 15, 2017, at 5:31 AM, Ewan D. Milne <emi...@redhat.com> wrote:
> 
> On Mon, 2017-05-15 at 08:04 +0200, Hannes Reinecke wrote:
>> In general I like the idea; however, the 'filter' thingie is somewhat
>> odd. I could somewhat buy into the idea of filtering for sense keys, but
>> then I would have expected to use the 'sense_event_filter' as a bitmap
>> of allowed sense keys.
>> With your current design you can only filter for one specific sense key,
>> _and_ you leave the remaining bits of the sense_event_filter variable
>> unused.
>> So either turn it into a bitmap and allow for several sense keys to be
>> filtered, or treat it as mask for the _entire_ sense, but then you'd
>> need for ASC and ASCQ, too.
>> 
>> Cheers,
>> 
>> Hannes
> 
> I think what would help here is understanding what you are trying to
> accomplish with this change.  Are you trying to allow generation of
> a uevent on a particular sense KCQ combination you know in advance,
> or a set of them, or do you want to be able to normally generate
> uevents for every sense keys/codes (i.e. all the bitmap bits set)
> and then perhaps narrow down to a specific one when you are looking
> for a problem?

Thanks for these feedbacks!

My goal here is to generate uevent for a set of sense keys. I would 
like to store these information for a fleet of many HDDs over a 
couple years. We are hoping to use these information to get insights
in the quality of the HDDs. 

> When I added the sense code uevent generation for Unit Attentions,
> one big concern I had was how many events per second would be generated.
> The userspace event handling is very slow, and can only handle a few
> hundred events per second before it starts backing up, and then you
> can lose events, which you don't want.

Thanks for this information. I am trying to limit the rate with the 
filter. But I am not sure whether this is sufficient. 

> Generating a uevent on MEDIUM ERROR is somewhat worrying, when a drive
> goes bad you could easily get these on every I/O.

MEDIUM ERROR is an important event that we would like to keep. Maybe 
we need better mechanism for rate limit, for example, no more than 10 
MEDIUM ERROR per second per drive. 

> 
> Your actual uevent emit:
> 
> +#ifdef CONFIG_SCSI_SENSE_UEVENT
> +       case SDEV_EVT_SCSI_SENSE:
> +               envp[idx++] = "SDEV_UA=SCSI_SENSE";
> +               for (i = idx; i < idx + 3; ++i) {
> +                       envp[i] = kzalloc(32, GFP_ATOMIC);
> +                       if (!envp[i])
> +                               break;
> +                       free_envp = i;
> +               }
> +               snprintf(envp[idx++], 32, "LBA=%lu", evt->sense_evt_data.lba);
> +               snprintf(envp[idx++], 32, "SIZE=%d", 
> evt->sense_evt_data.size);
> +               snprintf(envp[idx++], 32, "SENSE_CODE=%1x/%02x/%02x",
> +                        evt->sense_evt_data.sshdr.sense_key,
> +                        evt->sense_evt_data.sshdr.asc,
> +                        evt->sense_evt_data.sshdr.ascq);
> +               break;
> +#endif
> 
> Is a little strange.  The "SDEV_UA" property implies that it was a
> UNIT ATTENTION, but you are filtering on other sense types, so you
> would probably want a different property.  Many SCSI commands do not
> have a LOGICAL BLOCK ADDRESS field, or a TRANSFER LENGTH field, so
> you might be reporting garbage if they don't.  And, since this is a
> SCSI command, you probably want to report the length from the command,
> not the count in bytes from the request structure.

I didn't realize SDEV_UA means UNIT ATTENTION. Let me think about a 
different name. 

For the data being logged, I was thinking to log the whole SCSI command
and full sense buffer. Let me try a few different things. 

Thanks again,
Song


Reply via email to