On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote:
> On 12/12/2017 01:00 AM, Bart Van Assche wrote:
> > On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote:
> > > @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev,
> > > struct alua_port_group *pg)
> > > retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);
> > >
> > > if (retval) {
> > > + /*
> > > + * If the target only supports active/optimized there's
> > > + * not much we can do; it's not that we can switch paths
> > > + * or somesuch.
> > > + * So ignore any errors to avoid spurious failures during
> > > + * path failover.
> > > + */
> > > + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {
> > > + sdev_printk(KERN_INFO, sdev,
> > > + "%s: ignoring rtpg result %d\n",
> > > + ALUA_DH_NAME, retval);
> > > + kfree(buff);
> > > + return SCSI_DH_OK;
> > > + }
> >
> > Hello Hannes,
> >
> > Sorry but this change looks weird to me. If an RTPG fails, shouldn't
> > alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target
> > supports? Are you perhaps trying to implement a work-around for an array
> > that does not respond to RTPG during path transitions?
> >
>
> Yes, precisely.
>
> Thing is: if an array is only supporting active/optimized the entire
> device-handler stuff is basically a no-op as we can't switch paths anyway.
> So it doesn't really matter if the RTPG fails; in fact, we could just
> short-circuit the entire logic. I did not do that to allow for a state
> modification (ie arrays _might_ decide to announce additional states
> eventually, and only starting off with active/optimized as an initial
> state set).
>
> But if we return SCSI_DH_IO here the multipath logic will not attempt to
> switch paths, and failover will not work.
Hello Hannes,
I would appreciate it if it would be mentioned more clearly in the comment
above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code
is a workaround for non-standard array behavior. I'm afraid that otherwise
people who will read that code will be puzzled about why that code has been
added.
Thanks,
Bart.