So, I think I've narrowed it down to two things:

* ZFS tries to destroy the dataset every time the pool is imported, because the destroy didn't finish the last time.
* In the process, ZFS makes the kernel run out of memory and die.
So I thought of two options, but I'm not sure if I'm right:

Option 1: "destroy" is an atomic operation

If destroy is atomic, then I guess what it does is look up all the blocks that need to be deleted/unlinked/released/freed (not sure which is the right word). Once it has that list, it writes it to the ZIL (remember, this is just what I suppose; correct me if I'm wrong!) and starts to physically delete the blocks, until the operation is done and finally committed. If this is the case, then the process restarts from scratch every time the system is rebooted. I read that, apparently, in previous versions, rebooting while destroying a clone that was taking too long made the clone reappear intact on the next boot. That, and the fact that zpool iostat shows only reads and no (or very few) writes, is what led me to think this is how it works.

If this is the case, I'd like to abort this destroy. After importing the pool I would have everything as it was, and maybe I could delete the snapshots preceding the clone's parent snapshot (maybe that would speed up the destroy), or just leave the clone alone.

Option 2: destroy is "not" atomic

By this I don't mean "not atomic" as in "if the operation is canceled, it will be left in an incomplete state", but as in "if the system is rebooted, the operation will RESUME from the point where it died". If this is the case, maybe I can write a script that reboots the computer after a fixed amount of time, and run it on boot:

    zpool import xx &
    sleep 20
    rm /etc/zfs/zpool.cache
    sleep 1800
    reboot

This works under the assumption that the list of blocks to be deleted is flushed to the ZIL (or somewhere) before the reboot, so the operation can restart from the same point. It's a very nasty hack, and it would do the trick only in a very slow fashion: zpool iostat shows about 1 MB/s of reads while the destroy is running.
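Fleshed out a bit, the reboot-loop hack might look like the script below. This is only a sketch built on the assumptions above, none of which I've verified: that the pool is really named "xx", that removing /etc/zfs/zpool.cache prevents the pool from auto-importing at boot, and that 30 minutes is a safe window before memory runs out.

```shell
#!/bin/sh
# Reboot-loop workaround sketch: let the resumed destroy make some
# progress, then reboot before the kernel exhausts memory.
# UNVERIFIED ASSUMPTIONS: pool is named "xx"; deleting zpool.cache
# stops the auto-import on the next boot, so this script (run at
# boot) controls when the import, and thus the destroy, happens.

zpool import xx &            # importing the pool resumes the destroy
sleep 20                     # give the import a moment to start
rm -f /etc/zfs/zpool.cache   # next boot will not auto-import the pool
zpool iostat xx 60 &         # optional: watch destroy progress (reads)
sleep 1800                   # let the destroy run for 30 minutes
reboot                       # restart before memory runs out
```

The 30-minute window and the 20-second import delay are guesses; anyone trying this would want to tune them against how long their box survives before the OOM hits.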
The dataset in question is 450 GB, which means the operation will take about 5 days to finish if it needs to read the whole dataset to destroy it, or about 7 days if it also needs to go through the other snapshots (600 GB total).

So, my only viable option seems to be to "abort" this. How can I do that? Disable the ZIL, maybe? Delete the ZIL? Scrub afterwards?

Thanks,
Hernán

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
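P.S. For anyone checking the arithmetic: at the ~1 MB/s read rate that zpool iostat reports, the day estimates work out as below (a throwaway sketch; gb_to_days is just a helper name I made up, and it assumes 1 GB = 1024 MB with whole-day truncation).

```shell
#!/bin/sh
# Back-of-the-envelope check of the time estimates above,
# at the observed ~1 MB/s read rate from zpool iostat.
gb_to_days() {
  # $1 = gigabytes; prints whole days needed at 1 MB/s
  echo $(( $1 * 1024 / 86400 ))
}
gb_to_days 450   # the 450 GB dataset alone: ~5 days
gb_to_days 600   # reading the other snapshots too (600 GB): ~7 days
```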