I am looking at a rare but nasty case of corruption in which a block
pointer has one of its ditto DVAs out of range.
This causes vdev_lookup_top() to return NULL for a vdev pointer.
Going through the callers of vdev_lookup_top(), I noticed that some of
them handle a NULL vdev properly, while others do not.
In particular, during an import, ditto blocks are handled by a mirror
vdev code path that has no handling for a NULL vdev and would panic
the system.
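
To illustrate the failure mode, here is a minimal sketch of a lookup that
returns NULL for an out-of-range id, and of a caller that checks for it.
This is a toy model, not the actual ZFS code: every toy_* name is
hypothetical, and the choice of ENXIO as the error is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these are the real ZFS types. */
typedef struct toy_vdev {
	uint64_t tv_id;
} toy_vdev_t;

typedef struct toy_root {
	toy_vdev_t **tr_child;	/* top-level vdevs */
	uint64_t tr_children;	/* number of entries in tr_child */
} toy_root_t;

/* Return the top-level vdev for an id, or NULL if the id is out of range. */
static toy_vdev_t *
toy_lookup_top(const toy_root_t *root, uint64_t vdev_id)
{
	assert(root != NULL);
	if (vdev_id >= root->tr_children)
		return (NULL);
	return (root->tr_child[vdev_id]);
}

/*
 * Hypothetical caller: map one copy of a block to its device.
 * A corrupted vdev id yields ENXIO (assumed) instead of a NULL dereference.
 */
static int
toy_map_copy(const toy_root_t *root, uint64_t vdev_id, toy_vdev_t **out)
{
	toy_vdev_t *vd = toy_lookup_top(root, vdev_id);

	if (vd == NULL)
		return (ENXIO);
	*out = vd;
	return (0);
}
```

A caller that skips the NULL check and dereferences the result directly is
the pattern that panics.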

My question, though, is not about the mirror (for which I have a fix that I
will upstream eventually), but about the metaslab allocator.
In particular, I am looking at metaslab_allocate() and
metaslab_allocate_dva().
In the latter function, a NULL vdev is properly handled in the case of a
gang block (since a hint may be stale and the device may be gone).
However, that very same function does not handle a NULL vdev in the case of
a regular block pointer with ditto blocks.
The dilemma I am facing here is that I can either just use the rotor (i.e.
mimic the gang block behavior), or return an error immediately.
In the latter case, the caller, metaslab_allocate(), would handle it
properly.
I am inclined to go with the second option, but I would greatly appreciate
any insight on this from someone who is familiar with the internal logic
and the theory behind the metaslab allocator.
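
To make the two options concrete, here is a minimal sketch of the decision
in isolation. It is not the real metaslab_allocate_dva() logic: the
toy_* types, the policy enum, and the ENXIO return value are all
assumptions made for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for an allocation group; not the real ZFS type. */
typedef struct toy_group {
	int tg_id;
} toy_group_t;

/* What to do when the vdev behind a previous DVA cannot be found. */
typedef enum {
	TOY_NULL_USE_ROTOR,	/* option 1: ignore the bad hint, use the rotor */
	TOY_NULL_FAIL		/* option 2: return an error to the caller */
} toy_null_policy_t;

/*
 * Pick the group where allocation should start.
 * "hint" is the group derived from a previous DVA; it is NULL when the
 * vdev lookup failed. "rotor" is the allocator's current position.
 */
static int
toy_pick_start(toy_group_t *hint, toy_group_t *rotor,
    toy_null_policy_t policy, toy_group_t **start)
{
	if (hint != NULL) {
		*start = hint;
		return (0);
	}
	if (policy == TOY_NULL_USE_ROTOR) {
		*start = rotor;
		return (0);
	}
	/* Assumed error code; *start is left untouched. */
	return (ENXIO);
}
```

With TOY_NULL_USE_ROTOR the allocation proceeds as if no hint had been
given; with TOY_NULL_FAIL the bad DVA is surfaced to the caller.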

Thank you very much,
Ilya.
