On 5/7/14, 4:44 AM, Steven Hartland wrote:
----- Original Message ----- From: "George Wilson"
I think there is probably another way we can solve this problem, but I first want to get a better understanding of the corruption. We have not integrated the TRIM support upstream and I suspect that's the source of most of the problems. Can you confirm that, with TRIM disabled, most of the corruption you've seen does not occur? I'm trying to get context here since we've not seen this type of failure elsewhere.

In the case of TRIM, if BIO deletes are disabled the free IOs don't get
sent to physical media and instead instantly return ZIO_PIPELINE_CONTINUE.
static int
vdev_geom_io_start(zio_t *zio)
{
   ...
   switch (zio->io_type) {
   case ZIO_TYPE_FREE:
       if (vdev_geom_bio_delete_disable)
           return (ZIO_PIPELINE_CONTINUE);
   }
}

For your simple corruption case could you change the code to do the following:

if (vdev_geom_bio_delete_disable) {
    zio_interrupt(zio);
    return (ZIO_PIPELINE_STOP);
}

I believe this is the way it should behave. Let me know how this works for you. I'm going to change the way the lower consumers work and do some testing also.
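
To make the difference between the two return paths concrete, here is a minimal toy model. It is not the ZFS code: every toy_* name is hypothetical, and toy_interrupt() resumes the pipeline synchronously, whereas the real zio_interrupt() hands the zio off to be continued elsewhere. It only shows who drives the remaining stages in each case.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the pipeline return codes (hypothetical values). */
enum { TOY_PIPELINE_CONTINUE, TOY_PIPELINE_STOP };

typedef struct toy_zio {
	size_t stage;		/* index of the next stage to run */
	int stages_run;		/* how many stage functions have executed */
	int resumes;		/* times toy_interrupt() re-entered the pipeline */
	int done;		/* set once the last stage has completed */
	int use_interrupt;	/* 0: return CONTINUE, 1: interrupt + STOP */
} toy_zio_t;

typedef int (*toy_stage_fn)(toy_zio_t *);

static int toy_execute(toy_zio_t *zio);

/* Simplified: resumes the pipeline right away instead of dispatching. */
static void
toy_interrupt(toy_zio_t *zio)
{
	zio->resumes++;
	(void) toy_execute(zio);
}

static int
toy_ready(toy_zio_t *zio)
{
	zio->stages_run++;
	return (TOY_PIPELINE_CONTINUE);
}

/* Models the "free IO with deletes disabled" short-circuit. */
static int
toy_vdev_io_start(toy_zio_t *zio)
{
	zio->stages_run++;
	if (!zio->use_interrupt)
		return (TOY_PIPELINE_CONTINUE);	/* caller keeps driving */
	toy_interrupt(zio);			/* someone else finishes it */
	return (TOY_PIPELINE_STOP);		/* caller must let go */
}

static int
toy_done(toy_zio_t *zio)
{
	zio->stages_run++;
	return (TOY_PIPELINE_CONTINUE);
}

static const toy_stage_fn toy_stages[] = {
	toy_ready, toy_vdev_io_start, toy_done
};
#define	TOY_NSTAGES	(sizeof (toy_stages) / sizeof (toy_stages[0]))

/* Run stages until one asks to stop; never touch the zio after STOP. */
static int
toy_execute(toy_zio_t *zio)
{
	while (zio->stage < TOY_NSTAGES) {
		assert(zio->done == 0);
		toy_stage_fn fn = toy_stages[zio->stage++];
		if (fn(zio) == TOY_PIPELINE_STOP)
			return (TOY_PIPELINE_STOP);
	}
	zio->done = 1;
	return (TOY_PIPELINE_CONTINUE);
}

/* Run one toy zio through the pipeline and return its final state. */
static toy_zio_t
toy_run(int use_interrupt)
{
	toy_zio_t zio = { 0, 0, 0, 0, use_interrupt };
	(void) toy_execute(&zio);
	return (zio);
}
```

Calling toy_run(0) and toy_run(1) completes all three stages either way. The only difference in this model is that the second variant records one resume, because the start stage handed the zio off and told its caller to stop.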

Thanks,
George


Since I'm not familiar with the TRIM implementation in FreeBSD I was wondering if you could explain the scenario that leads to the corruption. The fact that the pipeline doesn't allow the zio to change mid-pipeline is actually intentional so I don't think we want to make a change to allow this to occur. From code inspection it does look like the vdev_*_io_start() routines should never return ZIO_PIPELINE_CONTINUE. I will look at this closer but it looks like there is a bug there.

When I looked, there were a few cases where this can happen, for example
scrubbing reads in a mirror and unreadable devices for file IOs.

Anyway, if you could give me more details on the corruption I would be happy to help you design a way that this can be implemented while still ensuring that a zio cannot change while the pipeline is still active. Thanks for diving into this and I will post more about the bugs that look to exist in the vdev_*_io_start() routines.

Given how queuing can execute a different IO than the one passed in by
zio_execute to the pipeline, I don't think it's valid to assume that
IOs can't return continue without asserting otherwise.

   Regards
   Steve





_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
