It might be a good idea to discuss whether labelclear in OpenZFS should or
should not do this: https://github.com/zfsonlinux/zfs/issues/3156
(I'm using illumos device names because this is the OpenZFS repo, but the same
points apply to ZoL (sda/sda1) and OS X (disk3/disk3s1) device names. I don't
know enough about FreeBSD device names to comment, but I assume the same issues
apply there.)
In my opinion, labelclear should clear the exact device it's given, as it does
now. In other words, it should take the same block device that zdb -l would.
However, it is worth noting that this can be quite confusing to users.
Suppose we create a pool on a physical disk with this command:
zpool create notrpool c2t1d0
zpool status will report notrpool is on c2t1d0.
If we destroy the pool, and want to clear the label, the command is zpool
labelclear c2t1d0, right? Wrong. Or is it zpool labelclear /dev/rdsk/c2t1d0?
Also wrong.
Currently, the correct answer is zpool labelclear /dev/rdsk/c2t1d0s0, which is
indeed the same device that we would need to supply to zdb -l in order to read
the label. zdb -l c2t1d0 will not work. zdb -l /dev/rdsk/c2t1d0 will not work.
zdb -l /dev/rdsk/c2t1d0s0 is what is required. So it is logical that
/dev/rdsk/c2t1d0s0 is the device zpool labelclear expects.
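To make that concrete, here is a minimal shell sketch of the whole sequence,
assuming illumos device naming and the c2t1d0 disk from the example above (the
exact output will of course vary):

    # Create a pool on the whole disk; ZFS auto-partitions it and puts the
    # label on slice 0 (whole_disk=1).
    zpool create notrpool c2t1d0

    # zpool status reports the vdev as "c2t1d0", with the s0 stripped.
    zpool status notrpool

    zpool destroy notrpool

    # Reading the label: only the slice works.
    zdb -l c2t1d0                # no label found
    zdb -l /dev/rdsk/c2t1d0      # no label found
    zdb -l /dev/rdsk/c2t1d0s0    # prints the label

    # Clearing the label: same device that zdb -l wants.
    zpool labelclear /dev/rdsk/c2t1d0s0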
However, we're using the zpool command not the zdb command, and zpool status
tells us the device is c2t1d0. So many (most?) users will naturally conclude
that they should zpool labelclear c2t1d0 or at worst zpool labelclear
/dev/rdsk/c2t1d0. zpool labelclear /dev/rdsk/c2t1d0s0 violates the principle of
least surprise from that perspective.
The original sin here is the lying that zpool status does in the case where
whole_disk=1. It strips off the s0 if whole_disk=1 in the name of readability
and to communicate that ZFS was given the whole device in the original
create/add/attach command.
The confusion is compounded by the fact that sometimes zpool status DOES print
the same device that labelclear expects. This is true whenever whole_disk=0
(modulo the fact that labelclear wants the full path and ZFS only prints the
basename of whole_disk=0 block devices, though it does print the full path in
the case of file pools).
So sometimes labelclear expects the exact device zpool status reports (file
pools). Sometimes labelclear expects the device zpool status reports but with
the dirname prepended (virtual disks and physical disk partitions). And
sometimes (the most common case) labelclear expects the device zpool status
reports but with both the dirname prepended and the partition name appended.
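Spelling those three cases out as commands (a sketch; the file and device names
here are made up for illustration):

    # File pool: zpool status prints the full path; labelclear takes it as-is.
    zpool labelclear /tank/vdevs/file1

    # whole_disk=0 (e.g. a slice or virtual disk given explicitly): status
    # prints only the basename (c2t1d0s3), so the dirname must be prepended.
    zpool labelclear /dev/rdsk/c2t1d0s3

    # whole_disk=1 (the common case): status prints c2t1d0, so both the
    # dirname and the slice must be added.
    zpool labelclear /dev/rdsk/c2t1d0s0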
OK, so users can get confused. What else is new? RTFM, etc. All that I've said
so far could be treated as a documentation issue.
But the situation gets worse.
Suppose the user, seeing c2t1d0 in the zpool status output, DOES run "zpool
labelclear /dev/rdsk/c2t1d0". The user is likely to conclude that the command
was successful, because mistakenly supplying the whole device has the side
effect of destroying the GPT. The GPT was there before the labelclear command;
partitions were visible, etc. After the command, the GPT is gone and the
partitions are no longer listed. So the command worked, right?
After all, it did do something.
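A sketch of that trap, continuing the c2t1d0 example:

    # Mistakenly clear the whole device instead of the slice:
    zpool labelclear /dev/rdsk/c2t1d0

    # The GPT at the start of the disk gets zeroed, so the partitions vanish
    # from tools like format/prtvtoc -- which looks like success -- while the
    # real ZFS label, sitting at the old s0 offsets, is untouched.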
So thinking the labelclear command was successful, the user proceeds to try to
reuse the device for a new pool. Unfortunately, this can result in a
"mysterious" EBUSY being returned by the kernel. Of course, now there's lots of
fuss trying to determine what could be busying the device, and troubleshooting
ensues.
Why EBUSY? Userland looks at the disk that had its GPT (accidentally) wiped,
does its checks to make sure the device isn't being used by another file system
or another pool, etc., and all looks good to go. So it proceeds to apply the
standard whole_disk=1 auto-partitioning at exactly the same offsets that were
previously used by the dead GPT. With the auto-partitioning completed, userland
then hands its freshly minted partition off to the kernel. The kernel, being
wiser than userland and the final failsafe against dangerous behavior, sees the
old pool's label on the partition it was just handed and balks at creating a
new one, because the device is already in use by another pool and we wouldn't
want to blow that away.
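Continuing the sketch, this is roughly what the users in the threads linked
below presumably ran into, and what eventually clears it up:

    # Try to reuse the disk:
    zpool create newpool c2t1d0     # fails with a "mysterious" EBUSY

    # The failed create already re-partitioned the disk at the same offsets
    # the dead GPT used, so the new s0 lines up with the old label, which the
    # whole-device labelclear never touched:
    zdb -l /dev/rdsk/c2t1d0s0       # still shows the old pool

    # Clearing the label where it actually lives resolves the EBUSY
    # (-f may be needed if the label claims the pool is still active):
    zpool labelclear -f /dev/rdsk/c2t1d0s0
    zpool create newpool c2t1d0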
So has this happened in real life or am I telling a plausible tale?
2011: https://github.com/zfsonlinux/zfs/issues/440#issuecomment-3144878
2015: https://openzfsonosx.org/forum/viewtopic.php?f=26&t=2323&start=10#p6118
(Two separate users if you read from the beginning of the topic. The link is to
the final post confirming the solution.)
(If nothing else, userland should probably do its busy checks a second time
AFTER the partitioning to prevent the kernel from having to step in and issue
mysterious EBUSY errors.)
Should zpool labelclear be made smarter? Should it accept the device names
zpool status reports? It is probably safest and most flexible to have it only
operate on the exact device it is given, full path required (same exact path as
zdb -l wants, after all), just as it does now.
But I can certainly see why the temptation exists to propagate zpool status's
lying to other commands. Some might suggest automatically labelclearing the
partition not the full device if whole_disk=1 and the user supplies the full
device, but that may not always be readable, and it strikes me as dangerous to
"guess" at what the user really meant when we're zeroing things out. Also, if
we're going to automatically use the partition instead, how would you actually
get labelclear to operate on the full device if that is indeed what you
actually intend? There's also the snag that the partition table may already be
gone in which case double guessing would be required. Others might suggest
clearing both the whole device and the partition. And I'm sure there are other
possible approaches. It's also worth keeping in mind that whole_disk=1 may in
the future be possible with arbitrary partition numbers and is certainly
theoretically separable and logically distinct from the auto-partitioning,
though they do usually go together nicely.