I've been running into an issue with ZFS file systems hanging during
our backup processes.

We currently back up to two servers, one onsite and one offsite. When
a single process creates a snapshot, sends it, and deletes an older
snapshot, everything works fine. However, if we add a second process
that does the same thing for the offsite server, the snapshot delete
ends up as a stuck process that cannot be killed. From what I can
tell, the problem occurs whenever we attempt to delete any snapshot
while a "zfs send" is in progress.
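
To make the sequence concrete, here is a rough sketch of what each
backup process does. It is not our actual script: the dataset,
snapshot, and host names are placeholders, and the plain "zfs send"
piped to "zfs receive" over ssh is an assumption for illustration. It
only builds and prints the commands; it does not run them.

```python
def backup_pass(dataset, new_snap, old_snap, host, target):
    """Return, in order, the commands one backup process would run."""
    new = f"{dataset}@{new_snap}"
    old = f"{dataset}@{old_snap}"
    return [
        f"zfs snapshot {new}",                                # create the snap
        f"zfs send {new} | ssh {host} zfs receive {target}",  # send it
        f"zfs destroy {old}",                                 # delete an older snap
    ]


# Two independent processes using different snapshot names (placeholders).
onsite = backup_pass("tank/data", "onsite-2", "onsite-1",
                     "onsite-host", "backup/data")
offsite = backup_pass("tank/data", "offsite-2", "offsite-1",
                      "offsite-host", "backup/data")

for name, commands in (("onsite", onsite), ("offsite", offsite)):
    print(f"# {name} process")
    for command in commands:
        print(command)
```

The hang appears when the "zfs destroy" of one process runs while the
"zfs send" of the other is still in progress, even though the two
commands name different snapshots.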

Some things to note:

* Only the volume with the stuck process freezes. Accessing files is
fine; only ZFS commands hang.
* We used different snapshot names, so the delete and the send in the
two processes were not operating on the same snaps.
* We also tried tiering, where the snaps are sent on from the onsite
backup server, but processes still froze.
* We are using OmniOS.

On the advice of Omni, I used mdb to walk the threads. Here is the
output:

  http://pastie.org/private/xism3bd6qixbfaugprqyqw

They also mentioned it might be related to this bug:

  https://www.illumos.org/issues/2807

I attempted to take a crash dump, but after 30 minutes it was only 8%
complete and we had to force a reboot. The box has 198 GB of RAM, so I
assume that is why it was taking so long.

Does anyone know more about this issue? I am trying to figure out a
way to run tiered backups to multiple locations without locking up
ZFS.

Thanks,
Chris

