----- Original Message ----- From: "George Wilson" <[email protected]>
To: "Steven Hartland" <[email protected]>; <[email protected]>
Sent: Wednesday, May 07, 2014 3:07 PM
Subject: Re: [OpenZFS Developer] zfs zio reordering in zio_vdev_io_start 
causing panics


On 5/7/14, 4:44 AM, Steven Hartland wrote:
----- Original Message ----- From: "George Wilson"
I think there is probably another way we can solve this problem but I first want to get a better understanding of the corruption. We have not integrated the TRIM support upstream and I suspect that's the source of most of the problems. Can you confirm that with TRIM disabled that most of corruption you've seen does not occur? I'm trying to get context here since we've not seen this type of failure elsewhere.

In the case of TRIM if BIO delete's are disabled the free IO's don't get
sent to physical media and instead instantly return ZIO_PIPELINE_CONTINUE.
static int
vdev_geom_io_start(zio_t *zio)
{
   ...
   switch (zio->io_type) {
   case ZIO_TYPE_FREE:
       if (vdev_geom_bio_delete_disable)
           return (ZIO_PIPELINE_CONTINUE);
   }
}

For your simple corruption case could you change the code to do the following:

if (vdev_geom_bio_delete_disable) {
    zio_interrupt(zio);
    return (ZIO_PIPELINE_STOP);
}

I believe this is the way it should behave. Let me know how this work for you. I'm going to change the way the lower consumer work and do some testing also.

So your saying no path from vdev_*io_start(..) that queues an IO return
ZIO_PIPELINE_CONTINUE?

If so it looks like there's a number of locations where that needs fixing
at least in head of FreeBSD e.g.
static int
vdev_file_io_start(zio_t *zio)
{
....
   if (!vdev_readable(vd)) {
       zio->io_error = SET_ERROR(ENXIO);
       return (ZIO_PIPELINE_CONTINUE);
   }
-----
static int
vdev_mirror_io_start(zio_t *zio)
{
....

   if (zio->io_type == ZIO_TYPE_READ) {
       if ((zio->io_flags & ZIO_FLAG_SCRUB) && !mm->mm_replacing) {
           ...
           return (ZIO_PIPELINE_CONTINUE);
       }
....

   return (ZIO_PIPELINE_CONTINUE);
}
----

Also should that be asserted to make it clear and testable e.g.
static int
zio_vdev_io_start(zio_t *zio)
{
   zio_t *sio = zio;
....
   ret = vd->vdev_ops->vdev_op_io_start(zio);
   ASSERT(ret != ZIO_PIPELINE_CONTINUE || sio == zio);

   return (ret);
}

This would also ensure the stack overflow issue shouldn't occur.

   Regards
   Steve
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to