On Thu, 2017-03-16 at 15:53 -0700, James Bottomley wrote:
> On Thu, 2017-03-16 at 13:56 -0700, Bart Van Assche wrote:
> > scsi_target_unblock() must unblock SCSI devices even if this function
> > is called after unloading of the LLD that created these devices has
> > started. This is necessary to prevent that __scsi_remove_device() 
> > hangs on the SYNCHRONIZE CACHE command issued by the sd driver during
> > shutdown.
> 
> Your special get function misses the try_module_get().  But this is all
> really a bit ugly. Since the only problem is the SYNC CACHE triggered
> by device_del, isn't a better solution a new state: SDEV_CANCEL_BLOCK. 
>  This will make the device visible to scsi_get_device() and we can take
> it back from CANCEL_BLOCKED->CANCEL when the queue is unblocked.  I
> suspect we could also simply throw away the sync cache command when the
> device is blocked (the cache should destage naturally in the time it
> takes for the device to be unblocked).

Hello James,

The purpose of this patch series is to make sure that unblock also occurs
after module unloading has started. My understanding of try_module_get() is
that it fails once module unloading has started. In other words, it is on
purpose that try_module_get() is not called. From the kernel module code:

bool try_module_get(struct module *module)
{
        bool ret = true;

        if (module) {
                preempt_disable();
                /* Note: here, we can fail to get a reference */
                if (likely(module_is_live(module) &&
                           atomic_inc_not_zero(&module->refcnt) != 0))
                        trace_module_get(module, _RET_IP_);
                else
                        ret = false;
                preempt_enable();
        }
        return ret;
}

static inline int module_is_live(struct module *mod)
{
        return mod->state != MODULE_STATE_GOING;
}

Regarding introducing a new device state: this is something I would like to
avoid. Any code that manipulates the SCSI device state is unnecessarily hard
to modify because multiple types of state information have been mixed up in
a single state variable (blocked / not blocked; created / running / cancel /
offline). Additionally, the SCSI device state is visible to user space.
Adding a new SCSI device state could break existing user space applications.

There is another problem with the introduction of the SDEV_CANCEL_BLOCKED
state: we do not want open() calls to succeed for devices that are in the
SDEV_DEL, SDEV_CANCEL nor for devices that are in the SDEV_CANCEL_BLOCKED
state. scsi_disk_get() in the sd driver relies on scsi_device_get() to check
the SCSI device state. If scsi_device_get() would succeed for devices in the
SDEV_CANCEL_BLOCKED state then an explicit check for that state would have
to be added in several users of scsi_device_get(). In other words, I think
adding the SDEV_CANCEL_BLOCKED state would result in a much more complex and
also harder to test patch.

Thanks,

Bart.

Reply via email to