On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley
<james.bottom...@hansenpartnership.com> wrote:
> On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote:
>> On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley
>> <james.bottom...@hansenpartnership.com> wrote:
>> >
>> >>  }
>> >>
>> >>  void sas_device_set_phy(struct domain_device *dev, struct sas_port *port)
>> >> diff --git a/drivers/scsi/libsas/sas_port.c 
>> >> b/drivers/scsi/libsas/sas_port.c
>> >> index d3c5297c6c89..9a25ae3a52a4 100644
>> >> --- a/drivers/scsi/libsas/sas_port.c
>> >> +++ b/drivers/scsi/libsas/sas_port.c
>> >> @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int 
>> >> gone)
>> >>
>> >>       if (port->num_phys == 1) {
>> >>               sas_unregister_domain_devices(port, gone);
>> >> -             sas_port_delete(port->port);
>> >>               port->port = NULL;
>> >>       } else {
>> >>               sas_port_delete_phy(port->port, phy->phy);
>> >>
>> >
>> > This should become
>> >
>> > if (port->num_phys == 1)
>> >         sas_unregister_domain_device(port, gone);
>> >
>> > sas_port_delete_phy(port->port, phy->phy);
>> >
>> > So we end up with a port scheduled for destruction with no phys rather
>> > than making the last phy association hang around until the DISCE
>> > workqueue runs.
>>
>> Sounds ok in theory.
>
> It's not really a choice.  The specific problem you've introduced with
> this patch is failure to cope with link flutter: a deform and form event
> queued sequentially.  In the new scheme you're trying to introduce, the
> destruct event gets queued from the deform but behind the form and the
> link flutter results in a dead link.  I thought just forcing a zero phy
> port would fix this, but it won't, either the destruct has to run in the
> context of the deform event or the form has to be queued later than the
> destruct.  I think coupled with the changes above, there needs to be
>
> if (port->port) {
>         /* dying port, requeue form event */
>         resend the PORTE_BYTES_DMAED event
>         return
> }
>
> inside the unmatched port loop in sas_port_form() if nothing is found as
> well to close this.

I think it's too late.  Once the lldd has triggered libsas to start
tear down I seem to recall the lldd has the expectation that a new
PORTE_BYTES_DMAED triggers the creation of a new port instance for
that phy.  Once the flutter reaches libsas the race is already lost
and the port needs to be torn down, but I would need to take a closer
look.

>>   I don't have a libsas environment handy, I worked with Praveen to
>> validate the version as submitted if you want to re-work it.
>
> A couple of days ago, this was so urgent as to have to go outside the
> usual patch process ... now it's not important enough for you to bother
> working on it; which is it?

Neither, it was a reviewed patch that was idling in the process.  I'm
still of the opinion that pinging Andrew in a case like this *is* the
expected process, unless there's a place I can check that a patch is
still in the application queue?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to