Hello list,
Over in the OSX camp, I am trying to fix the situation caused by users
pulling out media without first exporting the pool (apparently they are
determined to do this). The net effect is that you have to reboot to
get any functionality back from ZFS.
Summarising 4 hours of debugging, it seems that:
* we detect the error in vdev_disk_io_intr():
    printf("ZFS: Device %p removal(?) detected\n",
        dvd->vd_devvp);
    zfs_post_remove(zio->io_spa, vd);
    vd->vdev_remove_wanted = B_TRUE;
    printf("vdev_disk issuing request\n");
    spa_async_request(zio->io_spa, SPA_ASYNC_REMOVE);
Even though I re-insert the device and issue a clear, the pool does
not come back. It calls vdev_reopen(), vdev_probe(), vdev_validate() etc.
as expected.
In the end, it tries to do a simple zio_read_physio() in vdev_probe(),
which fails due to vdev_accessible(), which in turn fails because
vdev_remove_wanted is still true.
So even though we issue spa_async_request(zio->io_spa, SPA_ASYNC_REMOVE),
the request is never acted on, so vdev_remove_wanted is never cleared.
The spa async thread is triggered at the end of spa_sync().
Indeed, issuing a spindump to look at the stacks at this point shows:
  *1000 spa_sync + 1104 (spa.c:6414 in zfs + 531184) [0xffffff7f829aaaf0]
  *1000 dsl_pool_sync + 211 (dsl_pool.c:489 in zfs + 339667) [0xffffff7f8297bed3]
  *1000 zio_wait + 117 (zio.c:1544 in zfs + 1107125) [0xffffff7f82a374b5]
  *1000 spl_cv_wait + 132 (spl-condvar.c:68 in spl + 7828) [0xffffff7f828e7e94]
The 1000 shows that out of 1000 samples, the stack looked like this 1000
times, so it is stalled.
It turns out that once the device has been unplugged, spa_sync()
pretty much hangs forever.
I am unsure what mechanism should prevent this from happening. I have
had buf_*() IO hang forever on OSX. There is no timeout to set with buf
IO, and I don't think other Unix platforms have one either. Does the txg
work ever resurface and cancel buf IO requests? Can you even do that?
So, as a test, I add this:
  spa_async_request(spa_t *spa, int task)
  {
          zfs_dbgmsg("spa=%s async request task=%u", spa->spa_name, task);
          mutex_enter(&spa->spa_async_lock);
          spa->spa_async_tasks |= task;
          mutex_exit(&spa->spa_async_lock);
  +       spa_async_dispatch(spa);
  }
Lo and behold, the device gets remove_wanted=FALSE, vdev_probe() can
complete its IO test, and the pool is restored after I issue another pool
clear.
Interestingly, spa_sync() also comes back to life.
What is the correct behavior? I assume I can't just keep the dispatch
call in spa_async_request()? Might we end up with more than one thread?
Lund
_______________________________________________
developer mailing list
http://lists.open-zfs.org/mailman/listinfo/developer