Re: ZFS command can block the whole ZFS subsystem!
On Sun, Jan 5, 2014 at 9:41 AM, O. Hartmann wrote:
> > As already described by Dan and perhaps not followed up on: dedup
> > requires a very large amount of memory. Assuming 32GB is sufficient
> > is most likely wrong.
> >
> > What does zdb -S BACKUP00 say?
>
> That command is stuck for 2 hours by now ...

That is expected. It is not stuck, it is running. Its output will
indicate what the minimum required for your dataset is. The command
will be slow if you have a large dataset or insufficient RAM. Really
slow if both.

--
Adam
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ZFS command can block the whole ZFS subsystem!
On Sun, 5 Jan 2014 06:43:18 -0600 Adam Vande More wrote:
> On Sun, Jan 5, 2014 at 2:11 AM, O. Hartmann wrote:
> > On Sun, 5 Jan 2014 10:14:26 +1100
> > Peter Jeremy wrote:
> > > On 2014-Jan-04 23:26:42 +0100, "O. Hartmann" wrote:
> > > > zfs list -r BACKUP00
> > > > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > > > BACKUP00         1.48T  1.19T   144K  /BACKUP00
> > > > BACKUP00/backup  1.47T  1.19T  1.47T  /backup
> > >
> > > Well, that at least shows it's making progress - it's gone from
> > > 2.5T to 1.47T used (though I gather that has taken several
> > > days). Can you please post the result of
> > > zfs get all BACKUP00/backup
> >
> > Here we go:
> >
> > NAME             PROPERTY     VALUE       SOURCE
> > BACKUP00/backup  type         filesystem  -
> > [...]
> > BACKUP00/backup  checksum     sha256      local
> > BACKUP00/backup  compression  lz4         local
> > [...]
> > BACKUP00/backup  logbias      latency     default
> > BACKUP00/backup  dedup        on          local
>
> As already described by Dan and perhaps not followed up on: dedup
> requires a very large amount of memory. Assuming 32GB is sufficient
> is most likely wrong.
>
> What does zdb -S BACKUP00 say?

That command is stuck for 2 hours by now ...

> Also I will note you were asked if the ZFS FS in question had dedup
> enabled. You replied with a response from an incorrect FS.
Re: ZFS command can block the whole ZFS subsystem!
On Sun, Jan 5, 2014 at 2:11 AM, O. Hartmann wrote:
> On Sun, 5 Jan 2014 10:14:26 +1100
> Peter Jeremy wrote:
> > On 2014-Jan-04 23:26:42 +0100, "O. Hartmann" wrote:
> > > zfs list -r BACKUP00
> > > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > > BACKUP00         1.48T  1.19T   144K  /BACKUP00
> > > BACKUP00/backup  1.47T  1.19T  1.47T  /backup
> >
> > Well, that at least shows it's making progress - it's gone from 2.5T
> > to 1.47T used (though I gather that has taken several days). Can you
> > please post the result of
> > zfs get all BACKUP00/backup
>
> Here we go:
>
> NAME             PROPERTY     VALUE       SOURCE
> BACKUP00/backup  type         filesystem  -
> [...]
> BACKUP00/backup  checksum     sha256      local
> BACKUP00/backup  compression  lz4         local
> [...]
> BACKUP00/backup  logbias      latency     default
> BACKUP00/backup  dedup        on          local

As already described by Dan and perhaps not followed up on: dedup
requires a very large amount of memory. Assuming 32GB is sufficient is
most likely wrong.

What does zdb -S BACKUP00 say?

Also I will note you were asked if the ZFS FS in question had dedup
enabled. You replied with a response from an incorrect FS.

--
Adam
Re: ZFS command can block the whole ZFS subsystem!
On Sun, 5 Jan 2014 19:30:39 +1100 Peter Jeremy wrote:
> On 2014-Jan-05 09:11:38 +0100, "O. Hartmann" wrote:
> > On Sun, 5 Jan 2014 10:14:26 +1100
> > Peter Jeremy wrote:
> > > On 2014-Jan-04 23:26:42 +0100, "O. Hartmann" wrote:
> > > > zfs list -r BACKUP00
> > > > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > > > BACKUP00         1.48T  1.19T   144K  /BACKUP00
> > > > BACKUP00/backup  1.47T  1.19T  1.47T  /backup
> > >
> > > Well, that at least shows it's making progress - it's gone from
> > > 2.5T to 1.47T used (though I gather that has taken several days).
> > > Can you please post the result of
> > > zfs get all BACKUP00/backup
> >
> > BACKUP00/backup  dedup  on  local
>
> This is your problem. Before it can free any block, it has to check
> for other references to the block via the DDT and I suspect you don't
> have enough RAM to cache the DDT.
>
> Your options are:
> 1) Wait until the delete finishes.
> 2) Destroy the pool with extreme prejudice: Forcibly export the pool
>    (probably by booting to single user and not starting ZFS) and
>    write zeroes to the first and last MB of ada3p1.
>
> BTW, this problem will occur on any filesystem where you've ever
> enabled dedup - once there are any dedup'd blocks in a filesystem,
> all deletes need to go via the DDT.

As I stated earlier in this thread, the box in question has 32 GB RAM
and this should be sufficient.
Re: ZFS command can block the whole ZFS subsystem!
On 2014-Jan-05 09:11:38 +0100, "O. Hartmann" wrote:
> On Sun, 5 Jan 2014 10:14:26 +1100
> Peter Jeremy wrote:
> > On 2014-Jan-04 23:26:42 +0100, "O. Hartmann" wrote:
> > > zfs list -r BACKUP00
> > > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > > BACKUP00         1.48T  1.19T   144K  /BACKUP00
> > > BACKUP00/backup  1.47T  1.19T  1.47T  /backup
> >
> > Well, that at least shows it's making progress - it's gone from 2.5T
> > to 1.47T used (though I gather that has taken several days). Can you
> > please post the result of
> > zfs get all BACKUP00/backup
>
> BACKUP00/backup  dedup  on  local

This is your problem. Before it can free any block, it has to check
for other references to the block via the DDT and I suspect you don't
have enough RAM to cache the DDT.

Your options are:
1) Wait until the delete finishes.
2) Destroy the pool with extreme prejudice: Forcibly export the pool
   (probably by booting to single user and not starting ZFS) and write
   zeroes to the first and last MB of ada3p1.

BTW, this problem will occur on any filesystem where you've ever
enabled dedup - once there are any dedup'd blocks in a filesystem, all
deletes need to go via the DDT.

--
Peter Jeremy
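[The "check every freed block against the DDT" cost can be turned into a
rough number. A back-of-the-envelope sketch in Python; the ~114 reads/s
figure is taken from the gstat output posted elsewhere in this thread,
and "one random read per freed 128K record" is an assumption, not a
measurement:]

```python
# Back-of-the-envelope: how long a dedup-aware delete takes when the
# DDT does not fit in RAM and every freed block costs ~1 random read.
# All figures below are taken from this thread or are assumptions.

FILE_BYTES = 2.7e12        # the ~2.7 TB dump file being removed
RECORD_SIZE = 128 * 1024   # default ZFS recordsize per the zfs get output
READS_PER_SEC = 114        # r/s the drive sustains per the gstat output

blocks = FILE_BYTES / RECORD_SIZE    # ~20.6 million records to free
seconds = blocks / READS_PER_SEC     # one random DDT read per record
print(f"{blocks / 1e6:.1f}M blocks, ~{seconds / 3600:.0f} hours")
# → 20.6M blocks, ~50 hours
```

[That lower bound is consistent with the days-long delete described in
the thread; more than one I/O per block (the DDT is itself an on-disk
tree, and it is written as well as read) pushes it higher.]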
Re: ZFS command can block the whole ZFS subsystem!
On Sun, 5 Jan 2014 10:14:26 +1100 Peter Jeremy wrote:
> On 2014-Jan-04 23:26:42 +0100, "O. Hartmann" wrote:
> > zfs list -r BACKUP00
> > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > BACKUP00         1.48T  1.19T   144K  /BACKUP00
> > BACKUP00/backup  1.47T  1.19T  1.47T  /backup
>
> Well, that at least shows it's making progress - it's gone from 2.5T
> to 1.47T used (though I gather that has taken several days). Can you
> please post the result of
> zfs get all BACKUP00/backup

Here we go:

NAME             PROPERTY              VALUE                 SOURCE
BACKUP00/backup  type                  filesystem            -
BACKUP00/backup  creation              Fr Dez 20 23:17 2013  -
BACKUP00/backup  used                  1.47T                 -
BACKUP00/backup  available             1.19T                 -
BACKUP00/backup  referenced            1.47T                 -
BACKUP00/backup  compressratio         1.00x                 -
BACKUP00/backup  mounted               no                    -
BACKUP00/backup  quota                 none                  default
BACKUP00/backup  reservation           none                  default
BACKUP00/backup  recordsize            128K                  default
BACKUP00/backup  mountpoint            /backup               local
BACKUP00/backup  sharenfs              off                   default
BACKUP00/backup  checksum              sha256                local
BACKUP00/backup  compression           lz4                   local
BACKUP00/backup  atime                 on                    default
BACKUP00/backup  devices               on                    default
BACKUP00/backup  exec                  on                    default
BACKUP00/backup  setuid                on                    default
BACKUP00/backup  readonly              off                   default
BACKUP00/backup  jailed                off                   default
BACKUP00/backup  snapdir               hidden                default
BACKUP00/backup  aclmode               discard               default
BACKUP00/backup  aclinherit            restricted            default
BACKUP00/backup  canmount              on                    default
BACKUP00/backup  xattr                 on                    default
BACKUP00/backup  copies                1                     default
BACKUP00/backup  version               5                     -
BACKUP00/backup  utf8only              off                   -
BACKUP00/backup  normalization         none                  -
BACKUP00/backup  casesensitivity       sensitive             -
BACKUP00/backup  vscan                 off                   default
BACKUP00/backup  nbmand                off                   default
BACKUP00/backup  sharesmb              on                    local
BACKUP00/backup  refquota              none                  default
BACKUP00/backup  refreservation        none                  default
BACKUP00/backup  primarycache          all                   default
BACKUP00/backup  secondarycache        all                   default
BACKUP00/backup  usedbysnapshots       0                     -
BACKUP00/backup  usedbydataset         1.47T                 -
BACKUP00/backup  usedbychildren        0                     -
BACKUP00/backup  usedbyrefreservation  0                     -
BACKUP00/backup  logbias               latency               default
BACKUP00/backup  dedup                 on                    local
BACKUP00/backup  mlslabel                                    -
BACKUP00/backup  sync                  standard              default
BACKUP00/backup  refcompressratio      1.00x                 -
BACKUP00/backup  written               1.47T                 -
BACKUP00/backup  logicalused           1.47T                 -
BACKUP00/backup  logicalreferenced     1.47T                 -
Re: ZFS command can block the whole ZFS subsystem!
On 2014-Jan-04 23:26:42 +0100, "O. Hartmann" wrote:
> zfs list -r BACKUP00
> NAME              USED  AVAIL  REFER  MOUNTPOINT
> BACKUP00         1.48T  1.19T   144K  /BACKUP00
> BACKUP00/backup  1.47T  1.19T  1.47T  /backup

Well, that at least shows it's making progress - it's gone from 2.5T
to 1.47T used (though I gather that has taken several days). Can you
please post the result of
zfs get all BACKUP00/backup

--
Peter Jeremy
Re: ZFS command can block the whole ZFS subsystem!
On Sun, 5 Jan 2014 09:10:04 +1100 Peter Jeremy wrote:
> On 2014-Jan-03 20:25:35 +0100, "O. Hartmann" wrote:
> > [~] zfs get all BACKUP00
> > NAME      PROPERTY              VALUE  SOURCE
> ...
> > BACKUP00  usedbysnapshots       0      -
> > BACKUP00  usedbydataset         144K   -
> > BACKUP00  usedbychildren        2.53T  -
> > BACKUP00  usedbyrefreservation  0      -
> >
> > Funny, the disk is supposed to be "empty" ... but is marked as used
> > by 2.5 TB ...
>
> That says there's another filesystem inside BACKUP00 which has 2.5TB
> used.
>
> What are the results of:
> zpool status -v BACKUP00
> zfs list -r BACKUP00

No, not stuck, came back after a while:

zpool status -v BACKUP00
  pool: BACKUP00
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool
        can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not
        support the features. See zpool-features(7) for details.
  scan: none requested
config:

        NAME      STATE   READ WRITE CKSUM
        BACKUP00  ONLINE     0     0     0
          ada3p1  ONLINE     0     0     0

errors: No known data errors

[...]

zfs list -r BACKUP00
NAME              USED  AVAIL  REFER  MOUNTPOINT
BACKUP00         1.48T  1.19T   144K  /BACKUP00
BACKUP00/backup  1.47T  1.19T  1.47T  /backup
Re: ZFS command can block the whole ZFS subsystem!
On Sun, 5 Jan 2014 09:10:04 +1100 Peter Jeremy wrote:
> On 2014-Jan-03 20:25:35 +0100, "O. Hartmann" wrote:
> > [~] zfs get all BACKUP00
> > NAME      PROPERTY              VALUE  SOURCE
> ...
> > BACKUP00  usedbysnapshots       0      -
> > BACKUP00  usedbydataset         144K   -
> > BACKUP00  usedbychildren        2.53T  -
> > BACKUP00  usedbyrefreservation  0      -
> >
> > Funny, the disk is supposed to be "empty" ... but is marked as used
> > by 2.5 TB ...
>
> That says there's another filesystem inside BACKUP00 which has 2.5TB
> used.
>
> What are the results of:
> zpool status -v BACKUP00
> zfs list -r BACKUP00

Nothing - the drive is still operating on something (as reported);
every ZFS-related command makes the terminal stuck ...
Re: ZFS command can block the whole ZFS subsystem!
On 2014-Jan-03 20:25:35 +0100, "O. Hartmann" wrote:
> [~] zfs get all BACKUP00
> NAME      PROPERTY              VALUE  SOURCE
...
> BACKUP00  usedbysnapshots       0      -
> BACKUP00  usedbydataset         144K   -
> BACKUP00  usedbychildren        2.53T  -
> BACKUP00  usedbyrefreservation  0      -
>
> Funny, the disk is supposed to be "empty" ... but is marked as used by
> 2.5 TB ...

That says there's another filesystem inside BACKUP00 which has 2.5TB
used.

What are the results of:
zpool status -v BACKUP00
zfs list -r BACKUP00

--
Peter Jeremy
Re: ZFS command can block the whole ZFS subsystem!
On Fri, 3 Jan 2014 17:04:00 - "Steven Hartland" wrote:
..
> Sorry I'm confused then as you said "locks up the entire command and
> even worse - it seems to wind up the pool in question for being
> exported!"
>
> Which to me read like you were saying the pool ended up being
> exported.

I'm not a native English speaker. My intention was, to make it short:
remove the dummy file. While having issued the command in the
foreground of the terminal, I decided a second later after hitting
return to send it to the background by suspending the rm command and
issuing "bg" then.

Ahh thanks for explaining :)

> > > > I expect to get the command into the background as every other
> > > > UNIX command does when sending Ctrl-Z in the console.
> > > > Obviously, ZFS related stuff in FreeBSD doesn't comply.
> > > >
> > > > The file has been removed from the pool but the console is still
> > > > stuck with "^Z fg" (as I typed this in). Process list tells me:
> > > >
> > > > top
> > > > 17790 root  1  20  0  8228K 1788K STOP  10  0:05  0.00% rm
> > > >
> > > > for the particular "rm" command issued.
> > >
> > > That's not backgrounded yet otherwise it wouldn't be in the state
> > > STOP.
> >
> > As I said - the job never backgrounded, locked up the terminal and
> > makes the whole pool unresponsive.
>
> Have you tried sending a continue signal to the process?

No, not by intention. Since the operation started to slow down the
whole box and seemed to influence nearly every operation with ZFS
pools I intended (zpool status, zpool import the faulty pool, zpool
export), I rebooted the machine. After the reboot, when ZFS came up,
the drive started working like crazy again and the system stopped
while recognizing the ZFS pools. I then did a hard reset and restarted
in single user mode, exported the pool successfully, and rebooted. But
the moment I did a zpool import POOL, the heavy working continued.

> > > > Now, having the file deleted, I'd like to export the pool for
> > > > further maintenance
> > >
> > > Are you sure the delete is complete? Also don't forget ZFS has
> > > TRIM by default, so depending on support of the underlying
> > > devices you could be seeing deletes occurring.
> >
> > Quite sure it didn't! It takes hours (~ 8 now) and the drive is
> > still working, although I tried to stop.
>
> A delete of a file shouldn't take 8 hours, but you don't say how large
> the file actually is?

The drive has a capacity of ~ 2,7 TiB (Western Digital 3TB drive). The
file I created was, do not laugh, please, 2,7 TB :-( I guess, given
the COW technique and what I read about ZFS in this thread and others,
this seems to be the culprit. There is no space left to delete the
file safely.

By the way - the box is still working at 100% on that drive :-(
That's now > 12 hours.

> > > > You can check that
> >
> > gstat -d command reports 100% activity on the drive. I exported the
> > pool in question in single user mode and now try to import it back
> > while in multiuser mode.
>
> Sorry you seem to be stating conflicting things:
> 1. The delete hasn't finished
> 2. The pool export hung
> 3. You have exported the pool

Not conflicting, but in my non-expert terminology not quite as
accurate and precise as you may expect.

ad 1) I terminated (by the brute force of the mighty RESET button) the
copy command. It hasn't finished the operation on the pool as far as I
can see, but it might be a kind of recovery mechanism in progress now,
not the rm command anymore.

ad 2) Yes, first it hung, then I reset the box, then in single user
mode the export to avoid further interaction, then I tried to import
the pool again ...

ad 3) Yes, successfully after the reset; now I imported the pool and
the terminal in which I issued the command is stuck again while the
pool is under heavy load.

> What exactly is gstat -d reporting, can you paste the output please.

I think this is boring looking at 100% activity, but here it is ;-)

dT: 1.047s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy  Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada2
   10    114    114    455   85.3      0      0    0.0      0      0    0.0  100.0| ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada4
...
   10    114    114    455   85.3      0      0    0.0      0      0    0.0  100.0| ada3p1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada4p1

> > Shortly after issuing the command
> >
> > zpool import POOL00
> >
> > the terminal is stuck again, the drive is working at 100% for two
> > hours now and it seems the great ZFS is deleting every block per
> > pedes. Is this supposed
Re: ZFS command can block the whole ZFS subsystem!
On Fri, 3 Jan 2014 17:04:00 - "Steven Hartland" wrote:
> - Original Message -
> From: "O. Hartmann"
> > On Fri, 3 Jan 2014 14:38:03 -
> > "Steven Hartland" wrote:
> > > - Original Message -
> > > From: "O. Hartmann"
> > > >
> > > > For some security reasons, I dumped via "dd" a large file onto
> > > > a 3TB disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20
> > > > 22:43:56 CET 2013 amd64. Filesystem in question is a single ZFS
> > > > pool.
> > > >
> > > > Issuing the command
> > > >
> > > > "rm dumpfile.txt"
> > > >
> > > > and then hitting Ctrl-Z to bring the rm command into background
> > > > via fg" (I use FreeBSD's csh in that console) locks up the
> > > > entire command and even worse - it seems to wind up the pool in
> > > > question for being exported!
> > >
> > > I can't think of any reason why backgrounding a shell would export
> > > a pool.
> >
> > I sent the job "rm" into background and I didn't say that implies an
> > export of the pool!
> >
> > I said that the pool can not be exported once the bg-command has
> > been issued.
>
> Sorry I'm confused then as you said "locks up the entire command and
> even worse - it seems to wind up the pool in question for being
> exported!"
>
> Which to me read like you were saying the pool ended up being
> exported.

I'm not a native English speaker. My intention was, to make it short:
remove the dummy file. While having issued the command in the
foreground of the terminal, I decided a second later after hitting
return to send it to the background by suspending the rm command and
issuing "bg" then.

> > > > I expect to get the command into the background as every other
> > > > UNIX command does when sending Ctrl-Z in the console.
> > > > Obviously, ZFS related stuff in FreeBSD doesn't comply.
> > > >
> > > > The file has been removed from the pool but the console is still
> > > > stuck with "^Z fg" (as I typed this in). Process list tells me:
> > > >
> > > > top
> > > > 17790 root  1  20  0  8228K 1788K STOP  10  0:05  0.00% rm
> > > >
> > > > for the particular "rm" command issued.
> > >
> > > That's not backgrounded yet otherwise it wouldn't be in the state
> > > STOP.
> >
> > As I said - the job never backgrounded, locked up the terminal and
> > makes the whole pool unresponsive.
>
> Have you tried sending a continue signal to the process?

No, not by intention. Since the operation started to slow down the
whole box and seemed to influence nearly every operation with ZFS
pools I intended (zpool status, zpool import the faulty pool, zpool
export), I rebooted the machine. After the reboot, when ZFS came up,
the drive started working like crazy again and the system stopped
while recognizing the ZFS pools. I then did a hard reset and restarted
in single user mode, exported the pool successfully, and rebooted. But
the moment I did a zpool import POOL, the heavy working continued.

> > > > Now, having the file deleted, I'd like to export the pool for
> > > > further maintenance
> > >
> > > Are you sure the delete is complete? Also don't forget ZFS has
> > > TRIM by default, so depending on support of the underlying
> > > devices you could be seeing deletes occurring.
> >
> > Quite sure it didn't! It takes hours (~ 8 now) and the drive is
> > still working, although I tried to stop.
>
> A delete of a file shouldn't take 8 hours, but you don't say how large
> the file actually is?

The drive has a capacity of ~ 2,7 TiB (Western Digital 3TB drive). The
file I created was, do not laugh, please, 2,7 TB :-( I guess, given
the COW technique and what I read about ZFS in this thread and others,
this seems to be the culprit. There is no space left to delete the
file safely.

By the way - the box is still working at 100% on that drive :-(
That's now > 12 hours.

> > > > You can check that
> >
> > gstat -d command reports 100% activity on the drive. I exported the
> > pool in question in single user mode and now try to import it back
> > while in multiuser mode.
>
> Sorry you seem to be stating conflicting things:
> 1. The delete hasn't finished
> 2. The pool export hung
> 3. You have exported the pool

Not conflicting, but in my non-expert terminology not quite as
accurate and precise as you may expect.

ad 1) I terminated (by the brute force of the mighty RESET button) the
copy command. It hasn't finished the operation on the pool as far as I
can see, but it might be a kind of recovery mechanism in progress now,
not the rm command anymore.

ad 2) Yes, first it hung, then I reset the box, then in single user
mode the export to avoid further interaction, then I tried to import
the pool again ...

ad 3) Yes, successfully after the reset; now I imported the pool and
the terminal in which I issued the command is stuck again while the
pool is under heavy load.

> What exactly is gstat -d reporting, can you paste the output please.

I think this is boring looking at 100% activity, but here it is ;-)
Re: ZFS command can block the whole ZFS subsystem!
On Fri, 3 Jan 2014 12:16:22 -0600 Dan Nelson wrote:
> In the last episode (Jan 03), O. Hartmann said:
> > On Fri, 3 Jan 2014 14:38:03 - "Steven Hartland" wrote:
> > > From: "O. Hartmann"
> > > > For some security reasons, I dumped via "dd" a large file onto
> > > > a 3TB disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20
> > > > 22:43:56 CET 2013 amd64. Filesystem in question is a single
> > > > ZFS pool.
> > > >
> > > > Issuing the command
> > > >
> > > > "rm dumpfile.txt"
> > > >
> > > > and then hitting Ctrl-Z to bring the rm command into background
> > > > via fg" (I use FreeBSD's csh in that console) locks up the
> > > > entire command and even worse - it seems to wind up the pool in
> > > > question for being exported!
> >
> > You can check that gstat -d command reports 100% activity on the
> > drive. I exported the pool in question in single user mode and now
> > try to import it back while in multiuser mode.
>
> Did you happen to have enabled deduplication on the filesystem in
> question? That's the only thing I can think of that would make file
> deletions run slow. I have deleted files up to 10GB on regular
> filesystems with no noticeable delay at the command line. If you have
> deduplication enabled, however, each block's hash has to be looked up
> in the dedup table, and if you don't have enough RAM for it to be
> loaded completely into memory, that will be very very slow :)
>
> There are varying recommendations on how much RAM you need for a
> given pool size, since the DDT has to hold an entry for each block
> written, and blocksize depends on whether you wrote your files
> sequentially (128K blocks) or randomly (8k or smaller). Each DDT
> entry takes 320 bytes of RAM, so a full 3TB ZFS pool would need at
> minimum 320*(3TB/128K) ~= 7GB of RAM to hold the DDT, and much more
> than that if your average blocksize is less than 128K.
>
> So, if your system has less than 8GB of RAM in it, there's no way the
> DDT will be able to stay in memory, so you're probably going to have
> to do at least one disk seek (probably more, since you're writing to
> the DDT as well) per block in the file you're deleting. You should
> probably have 16GB or more RAM, and use an SSD as a L2ARC device as
> well.

Thanks for the explanation. The box in question has 32GB RAM. I wrote
a single file, 2,72 TB in size, to the pool, which I then tried to
remove via "rm". Dedup seems to be off according to this information:

[~] zfs get all BACKUP00
NAME      PROPERTY              VALUE                 SOURCE
BACKUP00  type                  filesystem            -
BACKUP00  creation              Fr Dez 20 23:14 2013  -
BACKUP00  used                  2.53T                 -
BACKUP00  available             147G                  -
BACKUP00  referenced            144K                  -
BACKUP00  compressratio         1.00x                 -
BACKUP00  mounted               yes                   -
BACKUP00  quota                 none                  default
BACKUP00  reservation           none                  default
BACKUP00  recordsize            128K                  default
BACKUP00  mountpoint            /BACKUP00             default
BACKUP00  sharenfs              off                   default
BACKUP00  checksum              on                    default
BACKUP00  compression           off                   default
BACKUP00  atime                 on                    default
BACKUP00  devices               on                    default
BACKUP00  exec                  on                    default
BACKUP00  setuid                on                    default
BACKUP00  readonly              off                   default
BACKUP00  jailed                off                   default
BACKUP00  snapdir               hidden                default
BACKUP00  aclmode               discard               default
BACKUP00  aclinherit            restricted            default
BACKUP00  canmount              on                    default
BACKUP00  xattr                 off                   temporary
BACKUP00  copies                1                     default
BACKUP00  version               5                     -
BACKUP00  utf8only              off                   -
BACKUP00  normalization         none                  -
BACKUP00  casesensitivity       sensitive             -
BACKUP00  vscan                 off                   default
BACKUP00  nbmand                off                   default
BACKUP00  sharesmb              off                   default
BACKUP00  refquota              none                  default
BACKUP00  refreservation        none                  default
BACKUP00  primarycache          all                   default
BACKUP00  secondarycache        all                   default
BACKUP00  usedbysnapshots       0                     -
BACKUP00  usedbydataset         144K                  -
BACKUP00  usedbychildren        2.53T                 -
BACKUP00  usedbyrefreservation  0                     -
Re: ZFS command can block the whole ZFS subsystem!
In the last episode (Jan 03), O. Hartmann said:
> On Fri, 3 Jan 2014 14:38:03 - "Steven Hartland" wrote:
> > From: "O. Hartmann"
> > > For some security reasons, I dumped via "dd" a large file onto a
> > > 3TB disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20
> > > 22:43:56 CET 2013 amd64. Filesystem in question is a single ZFS
> > > pool.
> > >
> > > Issuing the command
> > >
> > > "rm dumpfile.txt"
> > >
> > > and then hitting Ctrl-Z to bring the rm command into background via
> > > fg" (I use FreeBSD's csh in that console) locks up the entire
> > > command and even worse - it seems to wind up the pool in question
> > > for being exported!
>
> You can check that gstat -d command reports 100% activity on the
> drive. I exported the pool in question in single user mode and now
> try to import it back while in multiuser mode.

Did you happen to have enabled deduplication on the filesystem in
question? That's the only thing I can think of that would make file
deletions run slow. I have deleted files up to 10GB on regular
filesystems with no noticeable delay at the command line. If you have
deduplication enabled, however, each block's hash has to be looked up
in the dedup table, and if you don't have enough RAM for it to be
loaded completely into memory, that will be very very slow :)

There are varying recommendations on how much RAM you need for a given
pool size, since the DDT has to hold an entry for each block written,
and blocksize depends on whether you wrote your files sequentially
(128K blocks) or randomly (8k or smaller). Each DDT entry takes 320
bytes of RAM, so a full 3TB ZFS pool would need at minimum
320*(3TB/128K) ~= 7GB of RAM to hold the DDT, and much more than that
if your average blocksize is less than 128K.

So, if your system has less than 8GB of RAM in it, there's no way the
DDT will be able to stay in memory, so you're probably going to have
to do at least one disk seek (probably more, since you're writing to
the DDT as well) per block in the file you're deleting. You should
probably have 16GB or more RAM, and use an SSD as a L2ARC device as
well.

--
Dan Nelson
dnel...@allantgroup.com
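[Dan's arithmetic can be checked directly. A minimal sketch in Python;
the 320-bytes-per-entry and 3TB/128K figures are the ones quoted above,
and the 8K case is the "randomly written" scenario he mentions:]

```python
# Check of the DDT sizing rule of thumb: ~320 bytes of RAM per
# allocated block. Average block size is the big unknown in practice.

DDT_ENTRY_BYTES = 320

def ddt_ram(pool_bytes, avg_block_bytes):
    """RAM needed to keep the whole dedup table in core."""
    return (pool_bytes / avg_block_bytes) * DDT_ENTRY_BYTES

full_3tb_seq = ddt_ram(3e12, 128 * 1024)  # sequential writes, 128K records
full_3tb_rnd = ddt_ram(3e12, 8 * 1024)    # random writes, 8K records

print(f"128K records: {full_3tb_seq / 1e9:.1f} GB")  # ≈ the ~7GB above
print(f"  8K records: {full_3tb_rnd / 1e9:.1f} GB")
# → 128K records: 7.3 GB
# →   8K records: 117.2 GB
```

[With an 8K average block size even 32GB of RAM is nowhere near enough
for a full 3TB pool, which is why "the box has 32GB, so it should be
sufficient" does not follow without knowing the block-size mix.]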
Re: ZFS command can block the whole ZFS subsystem!
On Fri, 2014-01-03 at 17:14 +0100, O. Hartmann wrote:
> > > Issuing the command
> > >
> > > "rm dumpfile.txt"
> > >
> > > and then hitting Ctrl-Z to bring the rm command into background
> > > via fg" (I use FreeBSD's csh in that console) locks up the entire
> > > command and even worse - it seems to wind up the pool in question
> > > for being exported!

It's probably just a typo in your email, but "^Z fg" suspends the
process then resumes it in foreground; I suspect you meant "^Z bg".

Also, at the point you would hit ^Z, it might be handy to hit ^T and
see what the process is actually doing.

--
Ian
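[The ^Z/bg distinction is plain POSIX job control: ^Z delivers SIGTSTP
and the shell's `bg`/`fg` deliver SIGCONT; until SIGCONT arrives the
process sits in state STOP, exactly as the top output in this thread
showed for `rm`. A small illustrative sketch in Python, using `sleep`
as a hypothetical stand-in for the stuck `rm`:]

```python
import os
import signal
import subprocess

# Spawn a long-running child, stop it (SIGSTOP is the uncatchable
# sibling of the SIGTSTP that ^Z sends), then resume it with SIGCONT,
# which is what `bg`/`fg` do. A process that never receives SIGCONT
# stays in state STOP -- what top reported for the rm command.
child = subprocess.Popen(["sleep", "60"])

os.kill(child.pid, signal.SIGSTOP)
pid, status = os.waitpid(child.pid, os.WUNTRACED)  # reports the stop
print("stopped:", os.WIFSTOPPED(status))           # → stopped: True

os.kill(child.pid, signal.SIGCONT)                 # resume in background
child.terminate()
child.wait()
```

[Sending SIGCONT by hand (`kill -CONT <pid>`) is exactly the "continue
signal" suggested elsewhere in this thread.]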
Re: ZFS command can block the whole ZFS subsystem!
- Original Message - From: "O. Hartmann" On Fri, 3 Jan 2014 14:38:03 - "Steven Hartland" wrote: > > - Original Message - > From: "O. Hartmann" > > > > For some security reasons, I dumped via "dd" a large file onto a 3TB > > disk. The systems is 11.0-CURRENT #1 r259667: Fri Dec 20 22:43:56 > > CET 2013 amd64. Filesystem in question is a single ZFS pool. > > > > Issuing the command > > > > "rm dumpfile.txt" > > > > and then hitting Ctrl-Z to bring the rm command into background via > > fg" (I use FreeBSD's csh in that console) locks up the entire > > command and even worse - it seems to wind up the pool in question > > for being exported! > > I cant think of any reason why backgrounding a shell would export a > pool. I sent the job "rm" into background and I didn't say that implies an export of the pool! I said that the pool can not be exported once the bg-command has been issued. Sorry Im confused then as you said "locks up the entire command and even worse - it seems to wind up the pool in question for being exported!" Which to me read like you where saying the pool ended up being exported. > > I expect to get the command into the background as every other UNIX > > command does when sending Ctrl-Z in the console. Obviously, ZFS > > related stuff in FreeBSD doesn't comply. > > > > The file has been removed from the pool but the console is still > > stuck with "^Z fg" (as I typed this in). Process list tells me: > > > > top > > 17790 root 1 200 8228K 1788K STOP 10 0:05 > > 0.00% rm > > > > for the particular "rm" command issued. > > Thats not backgrounded yet otherwise it wouldnt be in the state STOP. As I said - the job never backgrounded, locked up the terminal and makes the whole pool inresponsive. Have you tried sending a continue signal to the process? > > Now, having the file deleted, I'd like to export the pool for > > further maintainance > > Are you sure the delete is complete? 
Also don't forget ZFS has TRIM by > default, so depending on support of the underlying devices you could > be seeing deletes occurring. Quite sure it didn't! It takes hours (~ 8 now) and the drive is still working, although I tried to stop it. A delete of a file shouldn't take 8 hours, but you don't say how large the file actually is? > You can check that gstat -d The command reports 100% activity on the drive. I exported the pool in question in single user mode and now try to import it back while in multiuser mode. Sorry, you seem to be stating conflicting things: 1. The delete hasn't finished 2. The pool export hung 3. You have exported the pool What exactly is gstat -d reporting, can you paste the output please? Shortly after issuing the command zpool import POOL00 the terminal is stuck again, the drive is working at 100% for two hours now and it seems the great ZFS is deleting every block one at a time. Is this supposed to last days or a week? What controller and what drive? What does the following report: sysctl kstat.zfs.misc.zio_trim > > but that doesn't work with > > zpool export -f poolname > > This command is now also stuck blocking the terminal and the pool > from further actions. > If the delete hasn't completed and is stuck in the kernel this is > to be expected. At this moment I do not want to imagine what will happen if I have to delete several tens of terabytes. If the weird behaviour of the current system can be extrapolated, then this is a no-go. As I'm sure you'll appreciate, that depends on whether the file is simply being unlinked or each sector is being erased; the answers to the above questions should help determine that :) Regards Steve This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 
In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk. 
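[Editor's note - not from the thread: the diagnostics Steven asks for above, collected into one runnable crib. The pid 17790 is the stuck rm from the thread; each command is guarded so the script degrades gracefully on systems where the FreeBSD tools are absent.]

```shell
# Collected diagnostics for a stuck ZFS delete (FreeBSD-specific tools,
# guarded so this is harmless elsewhere):
run() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "## $*"
        "$@" 2>&1 || true
    else
        echo "## $* (tool not available on this system)"
    fi
}
run gstat -bd                        # -b: one batch-mode pass; -d: show DELETE (TRIM) ops
run sysctl kstat.zfs.misc.zio_trim   # TRIM request counters Steven asks about
run procstat -kk 17790               # kernel stack of the stuck rm (pid from the thread)
```

`procstat -kk <pid>` is the non-interactive equivalent of Ian's ^T suggestion: it shows where in the kernel the process is actually waiting.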
Re: ZFS command can block the whole ZFS subsystem!
On 2014-01-03 11:14, O. Hartmann wrote: > On Fri, 3 Jan 2014 14:38:03 - > "Steven Hartland" wrote: > >> - Original Message - >> From: "O. Hartmann" >>> For some security reasons, I dumped via "dd" a large file onto a 3TB >>> disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20 22:43:56 >>> CET 2013 amd64. Filesystem in question is a single ZFS pool. >>> >>> Issuing the command >>> >>> "rm dumpfile.txt" >>> >>> and then hitting Ctrl-Z to bring the rm command into background via >>> fg" (I use FreeBSD's csh in that console) locks up the entire >>> command and even worse - it seems to wind up the pool in question >>> for being exported! >> I can't think of any reason why backgrounding a shell would export a >> pool. > I sent the job "rm" into background and I didn't say that implies an > export of the pool! > > I said that the pool cannot be exported once the bg-command has been > issued. > >>> I expect to get the command into the background as every other UNIX >>> command does when sending Ctrl-Z in the console. Obviously, ZFS >>> related stuff in FreeBSD doesn't comply. >>> >>> The file has been removed from the pool but the console is still >>> stuck with "^Z fg" (as I typed this in). Process list tells me: >>> >>> top >>> 17790 root 1 200 8228K 1788K STOP 10 0:05 >>> 0.00% rm >>> >>> for the particular "rm" command issued. >> That's not backgrounded yet, otherwise it wouldn't be in the state STOP. > As I said - the job never backgrounded, locked up the terminal and > makes the whole pool unresponsive. > > >>> Now, having the file deleted, I'd like to export the pool for >>> further maintenance >> Are you sure the delete is complete? Also don't forget ZFS has TRIM by >> default, so depending on support of the underlying devices you could >> be seeing deletes occurring. > Quite sure it didn't! It takes hours (~ 8 now) and the drive is still > working, although I tried to stop it. >> You can check that gstat -d > The command reports 100% activity on the drive. 
I exported the pool in > question in single user mode and now try to import it back while in > multiuser mode. > > Shortly after issuing the command > > zpool import POOL00 > > the terminal is stuck again, the drive is working at 100% for two > hours now and it seems the great ZFS is deleting every block one at a time. > Is this supposed to last days or a week? > > >>> but that doesn't work with >>> >>> zpool export -f poolname >>> >>> This command is now also stuck blocking the terminal and the pool >>> from further actions. >> If the delete hasn't completed and is stuck in the kernel this is >> to be expected. > At this moment I do not want to imagine what will happen if I have to > delete several tens of terabytes. If the weird behaviour of the current > system can be extrapolated, then this is a no-go. > >>> This is painful. Last time I faced the problem, I had to reboot >>> prior to take any action regarding any pool in the system, since >>> one single ZFS command could obviously block the whole subsystem (I >>> tried to export and import). >>> >>> What is up here? >> Regards >> Steve > Regards, > Oliver Deleting large amounts of data with 'rm' is slow. When destroying a dataset, ZFS grew a feature flag, async_destroy, that lets this happen in the background and avoids a lot of these issues. An async_delete might be something to consider some day. -- Allan Jude
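[Editor's note - not from the thread: the backgrounded destroys Allan describes can be watched via the pool-level "freeing" property, which reports how many bytes a pending async_destroy still has to reclaim. The pool name POOL00 is taken from the thread; the guard keeps this a no-op on machines without ZFS or without that pool.]

```shell
# Sketch: poll the "freeing" pool property until a background destroy
# (async_destroy feature) has finished reclaiming space.
watch_freeing() {
    pool=POOL00   # pool name from the thread
    if command -v zpool >/dev/null 2>&1 && zpool list "$pool" >/dev/null 2>&1; then
        while :; do
            left=$(zpool get -H -o value freeing "$pool")
            echo "still freeing: $left"
            case "$left" in 0|0B) break ;; esac
            sleep 10
        done
    else
        echo "pool $pool not available on this system"
    fi
}
watch_freeing
```

Note that "freeing" tracks dataset destroys, not plain rm of a file; for a single huge file the delete happens inside the unlink itself, which is exactly the blocking behaviour reported in this thread.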
Re: ZFS command can block the whole ZFS subsystem!
On Fri, 3 Jan 2014 14:38:03 - "Steven Hartland" wrote: > > - Original Message - > From: "O. Hartmann" > > > > For some security reasons, I dumped via "dd" a large file onto a 3TB > > disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20 22:43:56 > > CET 2013 amd64. Filesystem in question is a single ZFS pool. > > > > Issuing the command > > > > "rm dumpfile.txt" > > > > and then hitting Ctrl-Z to bring the rm command into background via > > fg" (I use FreeBSD's csh in that console) locks up the entire > > command and even worse - it seems to wind up the pool in question > > for being exported! > > I can't think of any reason why backgrounding a shell would export a > pool. I sent the job "rm" into background and I didn't say that implies an export of the pool! I said that the pool cannot be exported once the bg-command has been issued. > > > I expect to get the command into the background as every other UNIX > > command does when sending Ctrl-Z in the console. Obviously, ZFS > > related stuff in FreeBSD doesn't comply. > > > > The file has been removed from the pool but the console is still > > stuck with "^Z fg" (as I typed this in). Process list tells me: > > > > top > > 17790 root 1 200 8228K 1788K STOP 10 0:05 > > 0.00% rm > > > > for the particular "rm" command issued. > > That's not backgrounded yet, otherwise it wouldn't be in the state STOP. As I said - the job never backgrounded, locked up the terminal and makes the whole pool unresponsive. > > > Now, having the file deleted, I'd like to export the pool for > > further maintenance > > Are you sure the delete is complete? Also don't forget ZFS has TRIM by > default, so depending on support of the underlying devices you could > be seeing deletes occurring. Quite sure it didn't! It takes hours (~ 8 now) and the drive is still working, although I tried to stop it. > > You can check that gstat -d The command reports 100% activity on the drive. 
I exported the pool in question in single user mode and now try to import it back while in multiuser mode. Shortly after issuing the command zpool import POOL00 the terminal is stuck again, the drive is working at 100% for two hours now and it seems the great ZFS is deleting every block one at a time. Is this supposed to last days or a week? > > > but that doesn't work with > > > > zpool export -f poolname > > > > This command is now also stuck blocking the terminal and the pool > > from further actions. > > If the delete hasn't completed and is stuck in the kernel this is > to be expected. At this moment I do not want to imagine what will happen if I have to delete several tens of terabytes. If the weird behaviour of the current system can be extrapolated, then this is a no-go. > > > > This is painful. Last time I faced the problem, I had to reboot > > prior to take any action regarding any pool in the system, since > > one single ZFS command could obviously block the whole subsystem (I > > tried to export and import). > > > > What is up here? > > Regards > Steve Regards, Oliver
Re: ZFS command can block the whole ZFS subsystem!
- Original Message - From: "O. Hartmann" For some security reasons, I dumped via "dd" a large file onto a 3TB disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20 22:43:56 CET 2013 amd64. Filesystem in question is a single ZFS pool. Issuing the command "rm dumpfile.txt" and then hitting Ctrl-Z to bring the rm command into background via fg" (I use FreeBSD's csh in that console) locks up the entire command and even worse - it seems to wind up the pool in question for being exported! I can't think of any reason why backgrounding a shell would export a pool. I expect to get the command into the background as every other UNIX command does when sending Ctrl-Z in the console. Obviously, ZFS related stuff in FreeBSD doesn't comply. The file has been removed from the pool but the console is still stuck with "^Z fg" (as I typed this in). Process list tells me: top 17790 root 1 200 8228K 1788K STOP 10 0:05 0.00% rm for the particular "rm" command issued. That's not backgrounded yet, otherwise it wouldn't be in the state STOP. Now, having the file deleted, I'd like to export the pool for further maintenance Are you sure the delete is complete? Also don't forget ZFS has TRIM by default, so depending on support of the underlying devices you could be seeing deletes occurring. You can check that with gstat -d but that doesn't work with zpool export -f poolname This command is now also stuck blocking the terminal and the pool from further actions. If the delete hasn't completed and is stuck in the kernel this is to be expected. This is painful. Last time I faced the problem, I had to reboot prior to take any action regarding any pool in the system, since one single ZFS command could obviously block the whole subsystem (I tried to export and import). What is up here? Regards Steve 
ZFS: ZFS command can block the whole ZFS subsystem!
For some security reasons, I dumped via "dd" a large file onto a 3TB disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20 22:43:56 CET 2013 amd64. Filesystem in question is a single ZFS pool. Issuing the command "rm dumpfile.txt" and then hitting Ctrl-Z to bring the rm command into background via fg" (I use FreeBSD's csh in that console) locks up the entire command and even worse - it seems to wind up the pool in question for being exported! I expect to get the command into the background as every other UNIX command does when sending Ctrl-Z in the console. Obviously, ZFS related stuff in FreeBSD doesn't comply. The file has been removed from the pool but the console is still stuck with "^Z fg" (as I typed this in). Process list tells me: top 17790 root 1 200 8228K 1788K STOP 10 0:05 0.00% rm for the particular "rm" command issued. Now, having the file deleted, I'd like to export the pool for further maintenance, but that doesn't work with zpool export -f poolname This command is now also stuck blocking the terminal and the pool from further actions. This is painful. Last time I faced the problem, I had to reboot prior to take any action regarding any pool in the system, since one single ZFS command could obviously block the whole subsystem (I tried to export and import). What is up here?
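[Editor's note - a workaround nobody in the thread suggested, but commonly recommended for exactly this situation: instead of unlinking a multi-terabyte file in one go (one enormous kernel-side delete that blocks everything touching the pool), shrink it in bounded steps with truncate(1) so each transaction group only has to free a manageable amount, then remove the empty file. The file name, demo setup, and 64 GiB step size are illustrative, not from the thread.]

```shell
# Demo setup: a sparse 200 GB stand-in for the real dump file.
truncate -s 200G dumpfile.txt

f=dumpfile.txt
size=$(stat -f %z "$f" 2>/dev/null || stat -c %s "$f")   # BSD stat, then GNU stat
step=$((64 * 1024 * 1024 * 1024))                        # free 64 GiB per step
while [ "$size" -gt "$step" ]; do
    size=$((size - step))
    truncate -s "$size" "$f"   # release one bounded chunk of blocks
    sync                       # give the filesystem a chance to commit them
done
rm "$f"                        # final unlink is now cheap
```

Each truncate is an ordinary synchronous operation on a bounded amount of data, so the shell stays responsive between steps rather than being wedged for hours inside a single rm.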