Re: [zfs-discuss] cannot destroy snapshot

2011-09-16 Thread Matt Banks

On Apr 11, 2011, at 3:22 PM, Paul Kraus wrote:

> On Wed, Apr 6, 2011 at 1:58 PM, Rich Morris wrote:
>> On 04/06/11 12:43, Paul Kraus wrote:
>>> 
>>> xxx> zfs holds zpool-01/dataset-01@1299636001
>>> NAME                            TAG            TIMESTAMP
>>> zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
>>> xxx> zfs holds zpool-01/dataset-01@1300233615
>>> NAME                            TAG            TIMESTAMP
>>> zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
>>> xxx>
>>> 
>>> That is what I was looking for. Looks like when a zfs send got
>>> killed it left a hanging lock (hold) around. I assume these will
>>> clear on the next export/import (not likely, as this is a production
>>> zpool) or reboot (which will happen eventually, and I can wait),
>>> unless there is a way to force-clear the hold.
>> 
>> The user holds won't be released by an export/import or a reboot.
>> 
>> "zfs get defer_destroy snapname" will show whether this snapshot is marked
>> for
>> deferred destroy and "zfs release .send-18440-0 snapname" will clear that
>> hold.
>> If the snapshot is marked for deferred destroy then the release of the last
>> tag
>> will also destroy it.
> 
> Sorry I did not get back on this last week; it got busy late in the week.
> 
>I tried the `zfs release` and it appeared to hang, so I just let
> it be. A few hours later the server experienced a resource crunch of
> some type (fork errors about being unable to allocate resources). The load
> also varied between about 16 and 50 (it is a 16 CPU M4000).
> 
>Users who had an open SAMBA connection seemed OK, but eventually
> we needed to reboot the box (I did let it sit in that state as long as
> I could). Since I could not even get on the XSCF console, I had to
> `break` it to the OK prompt and sync it. The first boot hung. I then
> did a boot -rv, hoping to see a device probe that caused the hang,
> but that also hung (it looked like it was getting past all the
> device discovery). Finally a boot -srv got me to a login prompt. I
> logged in as root, then logged out, and it came up to
> multiuser-server without a hitch.
> 
>I do not know what the root cause of the initial resource problem
> was, as I did not get a good core dump. I *hope* it was not the `zfs
> release`, but it may have been.
> 
> After the boot cycle(s) the zfs snapshots were no longer held and I
> could destroy them.
> 
>Thanks to all those who helped. This discussion is one of the best
> sources, if not THE best source, of zfs support and knowledge.


I hate to dredge up this "old" email thread, but I just wanted to:

a) say thanks ("thanks!"), as I had exactly this issue crop up on
Sol10u9 (zpool rev22) and, sure enough, it had a hold from a previous send.

b) mention (for those that may find this thread in the future) that once I 
found the hold, the "zfs release [hold] [snapname]" method mentioned above 
worked swimmingly for me. I was nervous doing this during production hours, but 
the release command returned in about 5-7 seconds with no apparent adverse 
effects. I was then able to destroy the snap.
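
Concretely, the whole sequence was essentially the following (a sketch
reusing Paul's snapshot and hold names from the quoted output above;
substitute your own):

xxx> zfs holds zpool-01/dataset-01@1299636001
xxx> zfs release .send-18440-0 zpool-01/dataset-01@1299636001
xxx> zfs destroy zpool-01/dataset-01@1299636001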

I was initially afraid that it was somehow the "memory bug" mentioned in the 
current thread (when things are fresh in your mind, they seem more likely), so 
I'm glad this thread was out there.

matt


Re: [zfs-discuss] cannot destroy snapshot

2011-04-11 Thread Paul Kraus
On Wed, Apr 6, 2011 at 1:58 PM, Rich Morris wrote:
> On 04/06/11 12:43, Paul Kraus wrote:
>>
>> xxx> zfs holds zpool-01/dataset-01@1299636001
>> NAME                               TAG            TIMESTAMP
>> zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
>> xxx> zfs holds zpool-01/dataset-01@1300233615
>> NAME                               TAG            TIMESTAMP
>> zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
>> xxx>
>>
>> That is what I was looking for. Looks like when a zfs send got
>> killed it left a hanging lock (hold) around. I assume these will
>> clear on the next export/import (not likely, as this is a production
>> zpool) or reboot (which will happen eventually, and I can wait),
>> unless there is a way to force-clear the hold.
>
> The user holds won't be released by an export/import or a reboot.
>
> "zfs get defer_destroy snapname" will show whether this snapshot is marked
> for
> deferred destroy and "zfs release .send-18440-0 snapname" will clear that
> hold.
> If the snapshot is marked for deferred destroy then the release of the last
> tag
> will also destroy it.

Sorry I did not get back on this last week; it got busy late in the week.

I tried the `zfs release` and it appeared to hang, so I just let
it be. A few hours later the server experienced a resource crunch of
some type (fork errors about being unable to allocate resources). The load
also varied between about 16 and 50 (it is a 16 CPU M4000).

Users who had an open SAMBA connection seemed OK, but eventually
we needed to reboot the box (I did let it sit in that state as long as
I could). Since I could not even get on the XSCF console, I had to
`break` it to the OK prompt and sync it. The first boot hung. I then
did a boot -rv, hoping to see a device probe that caused the hang,
but that also hung (it looked like it was getting past all the
device discovery). Finally a boot -srv got me to a login prompt. I
logged in as root, then logged out, and it came up to
multiuser-server without a hitch.

I do not know what the root cause of the initial resource problem
was, as I did not get a good core dump. I *hope* it was not the `zfs
release`, but it may have been.

After the boot cycle(s) the zfs snapshots were no longer held and I
could destroy them.

Thanks to all those who helped. This discussion is one of the best
sources, if not THE best source, of zfs support and knowledge.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] cannot destroy snapshot

2011-04-06 Thread Rich Morris

On 04/06/11 12:43, Paul Kraus wrote:

xxx> zfs holds zpool-01/dataset-01@1299636001
NAME                            TAG            TIMESTAMP
zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
xxx> zfs holds zpool-01/dataset-01@1300233615
NAME                            TAG            TIMESTAMP
zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
xxx>

That is what I was looking for. Looks like when a zfs send got
killed it left a hanging lock (hold) around. I assume these will clear
on the next export/import (not likely, as this is a production zpool)
or reboot (which will happen eventually, and I can wait), unless there
is a way to force-clear the hold.


The user holds won't be released by an export/import or a reboot.

"zfs get defer_destroy snapname" will show whether this snapshot is 
marked for
deferred destroy and "zfs release .send-18440-0 snapname" will clear 
that hold.
If the snapshot is marked for deferred destroy then the release of the 
last tag

will also destroy it.
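
For example, against the snapshot above (a sketch; the hold tag comes
from Paul's zfs holds output, and the off value shown is illustrative):

xxx> zfs get defer_destroy zpool-01/dataset-01@1299636001
NAME                            PROPERTY       VALUE  SOURCE
zpool-01/dataset-01@1299636001  defer_destroy  off    -
xxx> zfs release .send-18440-0 zpool-01/dataset-01@1299636001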

-- Rich



Re: [zfs-discuss] cannot destroy snapshot

2011-04-06 Thread Paul Kraus
On Tue, Apr 5, 2011 at 9:26 PM, Edward Ned Harvey wrote:

> This may not apply to you, but in some other unrelated situation it was
> useful...
>
> Try zdb -d poolname
> In an older version of zpool, under certain conditions, there would
> sometimes be "hidden" clones listed with a % in the name.  Maybe the % won't
> be there in your case, but maybe you have some other manifestation of the
> hidden clone problem?

I have seen a dataset with a '%' in the name, but that was during
a zfs recv (if the zfs recv dies, it sometimes hangs around and has to
be destroyed, and the zfs destroy claims to fail even though it
succeeds ;-) ), but not in this case. The snapshots are all valid (I
just can't destroy two of them); we are snapshotting fairly frequently
as we load data.

Thanks for the suggestion.

xxx> zdb -d zpool-01
Dataset mos [META], ID 0, cr_txg 4, 18.7G, 745 objects
Dataset zpool-01/dataset-01@1302019202 [ZPL], ID 140, cr_txg 654658,
38.9G, 990842 objects
Dataset zpool-01/dataset-01@1302051600 [ZPL], ID 158, cr_txg 655776,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302062401 [ZPL], ID 189, cr_txg 656162,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1301951162 [ZPL], ID 108, cr_txg 652292,
1.02M, 478 objects
Dataset zpool-01/dataset-01@1302087601 [ZPL], ID 254, cr_txg 657065,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302105601 [ZPL], ID 291, cr_txg 657710,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302058800 [ZPL], ID 164, cr_txg 656033,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1299636001 [ZPL], ID 48, cr_txg 560375,
1.12T, 28468324 objects
Dataset zpool-01/dataset-01@1302007173 [ZPL], ID 125, cr_txg 654202,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302055201 [ZPL], ID 161, cr_txg 655905,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302080401 [ZPL], ID 248, cr_txg 656807,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302044400 [ZPL], ID 152, cr_txg 655518,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1301950939 [ZPL], ID 106, cr_txg 652280,
1.02M, 478 objects
Dataset zpool-01/dataset-01@1302015602 [ZPL], ID 137, cr_txg 654530,
10.3G, 175879 objects
Dataset zpool-01/dataset-01@1302030001 [ZPL], ID 143, cr_txg 655029,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1300233615 [ZPL], ID 79, cr_txg 594951,
4.48T, 99259515 objects
Dataset zpool-01/dataset-01@1302094801 [ZPL], ID 282, cr_txg 657323,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302066001 [ZPL], ID 214, cr_txg 656291,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302006933 [ZPL], ID 120, cr_txg 654181,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302098401 [ZPL], ID 285, cr_txg 657452,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302007755 [ZPL], ID 131, cr_txg 654240,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302048001 [ZPL], ID 155, cr_txg 655647,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302005414 [ZPL], ID 116, cr_txg 654119,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302007469 [ZPL], ID 128, cr_txg 654221,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302084001 [ZPL], ID 251, cr_txg 656936,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302076801 [ZPL], ID 245, cr_txg 656678,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302069601 [ZPL], ID 217, cr_txg 656420,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302073201 [ZPL], ID 242, cr_txg 656549,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302102001 [ZPL], ID 288, cr_txg 657581,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302005162 [ZPL], ID 112, cr_txg 654101,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302012001 [ZPL], ID 134, cr_txg 654391,
1.18G, 63312 objects
Dataset zpool-01/dataset-01@1302004805 [ZPL], ID 110, cr_txg 654085,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302006769 [ZPL], ID 118, cr_txg 654171,
1.09M, 506 objects
Dataset zpool-01/dataset-01@1302091201 [ZPL], ID 257, cr_txg 657194,
71.1G, 1845553 objects
Dataset zpool-01/dataset-01 [ZPL], ID 84, cr_txg 439406, 71.1G, 1845553 objects
Dataset zpool-01 [ZPL], ID 16, cr_txg 1, 39.3K, 5 objects
xxx>

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] cannot destroy snapshot

2011-04-06 Thread Paul Kraus
On Tue, Apr 5, 2011 at 6:56 PM, Rich Morris wrote:
> On 04/05/11 17:29, Ian Collins wrote:

> If there are clones then zfs destroy should report that.  The error being
> reported is "dataset is busy", which is what you would see if there are
> user holds on the snapshots that can't be deleted.
>
> Try running "zfs holds zpool-01/dataset-01@1299636001"

xxx> zfs holds zpool-01/dataset-01@1299636001
NAME                            TAG            TIMESTAMP
zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
xxx> zfs holds zpool-01/dataset-01@1300233615
NAME                            TAG            TIMESTAMP
zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
xxx>

That is what I was looking for. Looks like when a zfs send got
killed it left a hanging lock (hold) around. I assume these will clear
on the next export/import (not likely, as this is a production zpool)
or reboot (which will happen eventually, and I can wait), unless there
is a way to force-clear the hold.

Thanks Rich.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] cannot destroy snapshot

2011-04-05 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Paul Kraus
> 
> I have a zpool with one dataset and a handful of snapshots. I
> cannot delete two of the snapshots. The message I get is "dataset is
> busy". Neither fuser or lsof show anything holding open the
> .zfs/snapshot/ directory. What can cause this ?

This may not apply to you, but in some other unrelated situation it was
useful...

Try zdb -d poolname
In an older version of zpool, under certain conditions, there would
sometimes be "hidden" clones listed with a % in the name.  Maybe the % won't
be there in your case, but maybe you have some other manifestation of the
hidden clone problem?
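
For example, something along these lines would surface any such
entries (the pool name "tank" here is hypothetical):

xxx> zdb -d tank | grep '%'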



Re: [zfs-discuss] cannot destroy snapshot

2011-04-05 Thread Rich Morris

On 04/05/11 17:29, Ian Collins wrote:

 On 04/ 6/11 12:28 AM, Paul Kraus wrote:

 I have a zpool with one dataset and a handful of snapshots. I
cannot delete two of the snapshots. The message I get is "dataset is
busy". Neither fuser or lsof show anything holding open the
.zfs/snapshot/  directory. What can cause this ?


Do you have any clones?


If there are clones then zfs destroy should report that.  The error being
reported is "dataset is busy", which is what you would see if there are
user holds on the snapshots that can't be deleted.

Try running "zfs holds zpool-01/dataset-01@1299636001"

-- Rich



Re: [zfs-discuss] cannot destroy snapshot

2011-04-05 Thread Paul Kraus
On Tue, Apr 5, 2011 at 5:29 PM, Ian Collins wrote:
>  On 04/ 6/11 12:28 AM, Paul Kraus wrote:
>>
>>     I have a zpool with one dataset and a handful of snapshots. I
>> cannot delete two of the snapshots. The message I get is "dataset is
>> busy". Neither fuser or lsof show anything holding open the
>> .zfs/snapshot/  directory. What can cause this ?
>>
> Do you have any clones?

Nope. Just a basic snapshot.

I did a `zfs destroy -d` and that did not complain, so I'll see if
they magically disappear at some point in the future.
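
For the record, the deferred-destroy attempt looked like this (a
sketch against one of the two stuck snapshots; the zfs get is just to
confirm the mark took):

xxx> zfs destroy -d zpool-01/dataset-01@1299636001
xxx> zfs get defer_destroy zpool-01/dataset-01@1299636001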

I just can't figure out what can be holding those snapshots open to
prevent destruction. It reminds me of the first time I could not
umount a UFS and fuser/lsof showed nothing... it was NFS shared and
the kernel does not show up in fuser/lsof.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] cannot destroy snapshot

2011-04-05 Thread Ian Collins

 On 04/ 6/11 12:28 AM, Paul Kraus wrote:

 I have a zpool with one dataset and a handful of snapshots. I
cannot delete two of the snapshots. The message I get is "dataset is
busy". Neither fuser or lsof show anything holding open the
.zfs/snapshot/  directory. What can cause this ?


Do you have any clones?

--
Ian.



[zfs-discuss] cannot destroy snapshot

2011-04-05 Thread Paul Kraus
I have a zpool with one dataset and a handful of snapshots. I
cannot delete two of the snapshots. The message I get is "dataset is
busy". Neither fuser or lsof show anything holding open the
.zfs/snapshot/ directory. What can cause this ?
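
For reference, the checks that came up empty were along these lines (a
sketch; the mountpoint matches the zfs get output below):

xxx> fuser -c /zpool-01/dataset-01
xxx> lsof /zpool-01/dataset-01/.zfs/snapshot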

xxx> uname -a
SunOS nyc-sed3 5.10 Generic_142909-17 sun4u sparc SUNW,SPARC-Enterprise
xxx> zpool upgrade
This system is currently running ZFS pool version 22.

All pools are formatted using this version.
xxx> zpool get all zpool-01
NAME   PROPERTY   VALUE   SOURCE
zpool-01  size   74.9T   -
zpool-01  capacity   10% -
zpool-01  altroot-   default
zpool-01  health ONLINE  -
zpool-01  guid   6976165213827467407  default
zpool-01  version22  default
zpool-01  bootfs -   default
zpool-01  delegation on  default
zpool-01  autoreplaceoff default
zpool-01  cachefile  -   default
zpool-01  failmode   waitdefault
zpool-01  listsnapshots  on  default
zpool-01  autoexpand off default
zpool-01  free   67.2T   -
zpool-01  allocated  7.75T   -
xxx> zfs upgrade
This system is currently running ZFS filesystem version 4.

All filesystems are formatted with the current version.
xxx> zfs get all zpool-01/dataset-01
NAME                 PROPERTY              VALUE                  SOURCE
zpool-01/dataset-01  type                  filesystem             -
zpool-01/dataset-01  creation              Tue Jan 25 10:02 2011  -
zpool-01/dataset-01  used                  4.60T                  -
zpool-01/dataset-01  available             39.3T                  -
zpool-01/dataset-01  referenced            1.09M                  -
zpool-01/dataset-01  compressratio         1.54x                  -
zpool-01/dataset-01  mounted               yes                    -
zpool-01/dataset-01  quota                 none                   default
zpool-01/dataset-01  reservation           none                   default
zpool-01/dataset-01  recordsize            32K                    inherited from zpool-01
zpool-01/dataset-01  mountpoint            /zpool-01/dataset-01   default
zpool-01/dataset-01  sharenfs              off                    default
zpool-01/dataset-01  checksum              on                     default
zpool-01/dataset-01  compression           on                     inherited from zpool-01
zpool-01/dataset-01  atime                 on                     default
zpool-01/dataset-01  devices               on                     default
zpool-01/dataset-01  exec                  on                     default
zpool-01/dataset-01  setuid                on                     default
zpool-01/dataset-01  readonly              off                    default
zpool-01/dataset-01  zoned                 off                    default
zpool-01/dataset-01  snapdir               hidden                 default
zpool-01/dataset-01  aclmode               passthrough            inherited from zpool-01
zpool-01/dataset-01  aclinherit            passthrough            inherited from zpool-01
zpool-01/dataset-01  canmount              on                     default
zpool-01/dataset-01  shareiscsi            off                    default
zpool-01/dataset-01  xattr                 on                     default
zpool-01/dataset-01  copies                1                      default
zpool-01/dataset-01  version               4                      -
zpool-01/dataset-01  utf8only              off                    -
zpool-01/dataset-01  normalization         none                   -
zpool-01/dataset-01  casesensitivity       sensitive              -
zpool-01/dataset-01  vscan                 off                    default
zpool-01/dataset-01  nbmand                off                    default
zpool-01/dataset-01  sharesmb              off                    default
zpool-01/dataset-01  refquota              none                   default
zpool-01/dataset-01  refreservation        none                   default
zpool-01/dataset-01  primarycache          all                    default
zpool-01/dataset-01  secondarycache        all                    default
zpool-01/dataset-01  usedbysnapshots       4.60T                  -
zpool-01/dataset-01  usedbydataset         1.09M                  -
zpool-01/dataset-01  usedbychildren        0                      -
zpool-01/dataset-01  usedbyrefreservation  0                      -
zpool-01/dataset-01  logbias               latency                default
xxx> zfs list | grep zpool-01/dataset-01
zpool-01/dataset-01             4.60T  39.3T  1.09M  /zpool-01/dataset-01
zpool-01/dataset-01@1299636001   117G      -  1.12T  -
zpool-01/dataset-01@1300233615  3.48T      -  4.48T  -
zpool-01/dataset-01@1301950939      0      -  1.02M  -
zpool-01/dataset