Re: [zfs-discuss] ZFS overhead killed my ZVOL

2007-04-03 Thread Brian H. Nelson

Can anyone comment?

-Brian


Brian H. Nelson wrote:

Adam Leventhal wrote:

On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
  
Why does this happen? Is it a bug? I know there is a recommendation of 
20% free space for good performance, but that thought never occurred to 
me when this machine was set up (zvols only, no zfs proper).



It sounds like this bug:

  6430003 record size needs to affect zvol reservation size on RAID-Z

Adam


Could be, but 6429996 sounds like a more likely candidate: zvols don't 
reserve enough space for requisite meta data.



I can create some large files (2GB) and the 'available' space only 
decreases by 0.01-0.04GB for each file. The raidz pool is 7x36GB disks, 
with the default 8k volblocksize. Would/should 6430003 affect me? I 
don't understand what determines the minimum allocatable size and the 
number of 'skipped' sectors for a given situation.
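
Doing the math as best I understand it (back-of-the-envelope only, my 
own reading of the raidz allocator, not taken from the bug reports): on 
raidz1 each block gets parity sectors in proportion to its data 
sectors, and the allocation is then rounded up to a multiple of 
(nparity+1) sectors so no unusable one-sector holes are left behind. 
Something like:

   #!/bin/ksh
   # Rough raidz1 overhead for one 8K zvol block on a 7-disk pool.
   # Assumes 512-byte sectors; treat the numbers as an estimate.
   BLOCK=8192 SECTOR=512 DISKS=7 NPARITY=1
   data=$((BLOCK / SECTOR))                                  # 16 sectors
   parity=$(( (data + DISKS - NPARITY - 1) / (DISKS - NPARITY) ))  # 3
   total=$((data + parity))                                  # 19
   rem=$((total % (NPARITY + 1)))
   [ $rem -ne 0 ] && total=$((total + NPARITY + 1 - rem))    # pad to 20
   echo "8K block occupies $((total * SECTOR)) bytes"        # 10240

If that arithmetic is right, every 8K block here really occupies 10K 
(25% more than the logical size), which would fit with 6430003.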


Either way, my main concern is that I can address the problem so that 
the same situation does not recur. Are there workarounds for these 
bugs? How can I determine how much space needs to be reserved? How 
much (if any) of the remaining free space could be used for an 
additional zvol (with its own allocation of reserved space)?
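
For what it's worth, this is how I've been sizing things up so far. The 
volume names and the reservation figure below are examples, not my real 
ones:

   # Pool-level vs. dataset-level view of the space
   zpool list pool
   zfs list -o name,volsize,reservation,used,available pool/vol

   # Possible band-aid: shrink the reservation below volsize to leave
   # headroom for metadata/parity, accepting the risk of ENOSPC inside
   # the zvol if the pool genuinely fills up
   zfs set reservation=180g pool/vol

   # Or create any future zvol sparse, with no reservation at all
   zfs create -s -V 6g pool/scratch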


Thanks,
Brian



--
-----------------------------------------------------------------
Brian H. Nelson                      Youngstown State University
System Administrator                 Media and Academic Computing
                    bnelson[at]cis.ysu.edu
-----------------------------------------------------------------



[zfs-discuss] ZFS overhead killed my ZVOL

2007-03-20 Thread Brian H. Nelson

Dear list,

Solaris 10 U3 on SPARC.

I had a 197GB raidz storage pool. Within that pool, I had allocated a 
191GB zvol (filesystem A), and a 6.75GB zvol (filesystem B). These used 
all but a couple hundred K of the zpool. Both zvols contained UFS 
filesystems with logging enabled. The (A) filesystem was about 79% full. 
(B) was also nearly full, but unmounted and not being used.
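
For reference, the setup was along these lines (pool/volume names and 
disk numbers here are illustrative, not the exact ones):

   zpool create pool raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0
   zfs create -V 191g pool/volA
   zfs create -V 6.75g pool/volB
   newfs /dev/zvol/rdsk/pool/volA
   newfs /dev/zvol/rdsk/pool/volB
   mount -o logging /dev/zvol/dsk/pool/volA /export/home/engr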


This configuration worked happily for a bit over two months. Then the 
other day, a user decided to copy (cp) about 11GB worth of video files 
within (A). This caused UFS to choke as follows:


Mar  9 17:34:43 maxwell ufs: [ID 702911 kern.warning] WARNING: Error 
writing master during ufs log roll
Mar  9 17:34:43 maxwell ufs: [ID 127457 kern.warning] WARNING: ufs log 
for /export/home/engr changed state to Error
Mar  9 17:34:43 maxwell ufs: [ID 616219 kern.warning] WARNING: Please 
umount(1M) /export/home/engr and run fsck(1M)


I do as the message says: unmount and attempt to fsck. I am then 
bombarded with thousands of errors, BUT fsck cannot fix them due to 'no 
space left on device'. That's right: the filesystem with about 30GB 
free didn't have enough free space to fsck. Strange.


After messing with the machine all weekend (rebooting, calling 
coworkers and other sysadmins, calling Sun, scratching my head, etc.), 
the solution ended up being to _delete the (B) zvol_ (which contained 
only junk data). Once that was done, fsck ran all the way through 
without problems (besides wiping all my ACLs) and things were happy again.
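
In commands, the fix was roughly this (names illustrative again):

   umount /export/home/engr
   zfs destroy pool/volB           # the junk zvol
   fsck /dev/zvol/rdsk/pool/volA   # now runs to completion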


So I surmised that ZFS ran out of space to do its thing, and for 
whatever reason that 'out of space' condition got pushed down into the 
zvol as well, causing fsck to choke. I _have_ been able to reproduce 
the situation on a test machine, but not reliably. It basically 
consists of setting up two zvols that take up almost all of the pool 
space, newfs'ing them, filling one up to about 90% full, then looping 
through copies of 1/2 of the remaining space until it dies.


(So for a 36GB pool, create a 34GB zvol and a 2.xxGB zvol. newfs them. 
Mount the larger one. Create a 30GB junk file. Create a directory of, 
say, 5 files worth about 2GB total. Then do 'while true; do cp -r dira 
dirb; done' until it fails. Sometimes it fails, sometimes not.)
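
Here is the same repro as a script, in case anyone wants to try it. It 
assumes an existing ~36GB test pool named tank; the volume names are 
made up, and as noted above it only fails sometimes:

   #!/bin/ksh
   zfs create -V 34g tank/big
   zfs create -V 2g tank/small
   echo y | newfs /dev/zvol/rdsk/tank/big     # answer the (y/n) prompt
   echo y | newfs /dev/zvol/rdsk/tank/small
   mount /dev/zvol/dsk/tank/big /mnt
   mkfile 30g /mnt/junk                       # fill to roughly 90%
   mkdir /mnt/dira
   for i in 1 2 3 4 5; do
       mkfile 400m /mnt/dira/f$i              # ~2GB of source files
   done
   while true; do
       cp -r /mnt/dira /mnt/dirb              # loop until UFS errors out
   done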


Why does this happen? Is it a bug? I know there is a recommendation of 
20% free space for good performance, but that thought never occurred to 
me when this machine was set up (zvols only, no zfs proper).


I think it is a bug simply because it _allowed_ me to create a 
configuration that didn't leave enough room for overhead. There isn't a 
whole lot of info surrounding zvols. Does the 80% rule (i.e., keep 20% 
free) still apply to the underlying ZFS if only zvols are used? That 
would be really unfortunate. I think most people wanting to use a zvol 
would want to put 100% of a pool toward the zvol.


-Brian

--
-----------------------------------------------------------------
Brian H. Nelson                      Youngstown State University
System Administrator                 Media and Academic Computing
                    bnelson[at]cis.ysu.edu
-----------------------------------------------------------------



Re: [zfs-discuss] ZFS overhead killed my ZVOL

2007-03-20 Thread Adam Leventhal
On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
 Why does this happen? Is it a bug? I know there is a recommendation of 
 20% free space for good performance, but that thought never occurred to 
 me when this machine was set up (zvols only, no zfs proper).

It sounds like this bug:

  6430003 record size needs to affect zvol reservation size on RAID-Z

Adam

-- 
Adam Leventhal, Solaris Kernel Development   http://blogs.sun.com/ahl


Re[2]: [zfs-discuss] ZFS overhead killed my ZVOL

2007-03-20 Thread Robert Milkowski
Hello Adam,

Wednesday, March 21, 2007, 12:42:49 AM, you wrote:

AL On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
 Why does this happen? Is it a bug? I know there is a recommendation of 
 20% free space for good performance, but that thought never occurred to 
 me when this machine was set up (zvols only, no zfs proper).

AL It sounds like this bug:

AL   6430003 record size needs to affect zvol reservation size on RAID-Z

AL Adam


Adam, while you are here: what about gzip compression in ZFS?
I mean, are you going to integrate the changes soon?



-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com



Re[2]: [zfs-discuss] ZFS overhead killed my ZVOL

2007-03-20 Thread Robert Milkowski
Hello Adam,

Wednesday, March 21, 2007, 1:24:35 AM, you wrote:

AL On Wed, Mar 21, 2007 at 01:23:06AM +0100, Robert Milkowski wrote:
 Adam, while you are here, what about gzip compression in ZFS?
 I mean are you going to integrate changes soon?

AL I submitted the RTI today.

Great!

btw: I assume that the compression level will be hard-coded after all,
right?

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com



Re: [zfs-discuss] ZFS overhead killed my ZVOL

2007-03-20 Thread Adam Leventhal
On Wed, Mar 21, 2007 at 01:36:10AM +0100, Robert Milkowski wrote:
 btw: I assume that compression level will be hard coded after all,
 right?

Nope. You'll be able to choose from gzip-N with N ranging from 1 to 9 just
like gzip(1).
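
So once it integrates, it should look like any other compression 
setting (dataset name made up):

   zfs set compression=gzip tank/fs      # default level, same as gzip-6
   zfs set compression=gzip-9 tank/fs    # slowest, best compression
   zfs get compression tank/fs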

Adam

-- 
Adam Leventhal, Solaris Kernel Development   http://blogs.sun.com/ahl