On 5/7/14, 4:44 AM, Steven Hartland wrote:
----- Original Message ----- From: "George Wilson"
I think there is probably another way we can solve this problem but I
first want to get a better understanding of the corruption. We have
not integrated the TRIM support upstream and I suspect that's the
source of most of the problems. Can you confirm that, with TRIM
disabled, most of the corruption you've seen does not occur? I'm
trying to get context here since we've not seen this type of failure
elsewhere.
In the case of TRIM, if BIO deletes are disabled the free IOs don't get
sent to physical media and instead instantly return
ZIO_PIPELINE_CONTINUE.
static int
vdev_geom_io_start(zio_t *zio)
{
	...
	switch (zio->io_type) {
	case ZIO_TYPE_FREE:
		if (vdev_geom_bio_delete_disable)
			return (ZIO_PIPELINE_CONTINUE);
	}
}
For your simple corruption case could you change the code to do the
following:
	if (vdev_geom_bio_delete_disable) {
		zio_interrupt(zio);
		return (ZIO_PIPELINE_STOP);
	}
I believe this is the way it should behave. Let me know how this works
for you. I'm going to change the way the lower consumers work and do some
testing also.
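To make the suggested control flow concrete, here is a minimal, self-contained toy model of it. It is only a sketch: every toy_* name is hypothetical and none of this is the real ZFS code. It assumes that zio_interrupt() asks for the pipeline to be resumed later (modelled here as a simple flag) and that a stage returning STOP hands responsibility for resuming to whoever signals completion.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the real ZFS definitions. */
enum toy_ret { TOY_PIPELINE_CONTINUE, TOY_PIPELINE_STOP };

typedef struct toy_zio {
	size_t stage;        /* index of the next stage to run */
	int delete_disabled; /* models vdev_geom_bio_delete_disable */
	int pending;         /* set by toy_interrupt(): resume requested */
	int interrupts;      /* number of completions signalled */
	int device_ios;      /* IOs that actually reached the "device" */
	int done;            /* set once the final stage has run */
} toy_zio_t;

typedef enum toy_ret (*toy_stage_fn)(toy_zio_t *);

/* Models zio_interrupt(): ask for the pipeline to be resumed later. */
static void
toy_interrupt(toy_zio_t *zio)
{
	zio->interrupts++;
	zio->pending = 1;
}

/* Models the io_start stage with the suggested change applied. */
static enum toy_ret
toy_io_start(toy_zio_t *zio)
{
	if (!zio->delete_disabled)
		zio->device_ios++;	/* pretend the free was issued */
	toy_interrupt(zio);		/* completion always goes via interrupt */
	return (TOY_PIPELINE_STOP);	/* never TOY_PIPELINE_CONTINUE */
}

/* Final stage: mark the toy zio as finished. */
static enum toy_ret
toy_io_done(toy_zio_t *zio)
{
	zio->done = 1;
	return (TOY_PIPELINE_CONTINUE);
}

static const toy_stage_fn toy_stages[] = { toy_io_start, toy_io_done };
#define	TOY_NSTAGES (sizeof (toy_stages) / sizeof (toy_stages[0]))

/* Models zio_execute(): run stages until one returns STOP. */
static void
toy_execute(toy_zio_t *zio)
{
	assert(zio != NULL);
	while (zio->stage < TOY_NSTAGES) {
		if (toy_stages[zio->stage++](zio) == TOY_PIPELINE_STOP)
			return;
	}
}

/* Driver: start the pipeline, then resume it for each pending interrupt. */
static void
toy_run(toy_zio_t *zio)
{
	toy_execute(zio);
	while (zio->pending) {
		zio->pending = 0;
		toy_execute(zio);
	}
}
```

Calling toy_run() on a zero-initialised toy_zio_t with delete_disabled set to 1 should finish the zio without touching the "device": the start stage stops the pipeline, and the recorded interrupt is what resumes it into the final stage.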
Thanks,
George
Since I'm not familiar with the TRIM implementation in FreeBSD I was
wondering if you could explain the scenario that leads to the
corruption. The fact that the pipeline doesn't allow the zio to
change mid-pipeline is actually intentional so I don't think we want
to make a change to allow this to occur. From code inspection it does
look like the vdev_*_io_start() routines should never return
ZIO_PIPELINE_CONTINUE. I will look at this closer but it looks like
there is a bug there.
When I looked, there were a few cases where this can happen, for example
scrubbing reads in a mirror and unreadable devices for file IOs.
Anyway, if you could give me more details to the corruption I would
be happy to help you design a way that this can be implemented while
still ensuring that a zio cannot change while the pipeline is still
active. Thanks for diving into this and I will post more about the
bugs that look to exist in the vdev_*_io_start() routines.
Given how queuing can execute a different IO than was passed in by
zio_execute to the pipeline, I don't think it's valid to assume that
IOs can't return continue without asserting otherwise.
Regards
Steve