Re: [zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks

2010-03-18 Thread Brian H. Nelson
I have only heard of alignment being discussed in reference to 
block-based storage (like DASD/iSCSI/FC). I'm not really sure how it 
would work out over NFS. I do see why you are asking though.


My understanding is that VMDK files are basically 'aligned' but the 
partitions inside of them may not be. You don't state what OS you are 
using in your guests. Windows XP/2003 and older create mis-aligned
partitions by default (within a VMDK). You would need to manually 
create/adjust NTFS partitions in those cases in order for them to 
properly fall on a 4k boundary. This could be a cause of the problem you 
are describing.


This doc from VMware is aimed at block-based storage but it has some 
concepts that might be helpful as well as info on aligning guest OS 
partitions:

http://www.vmware.com/pdf/esx3_partition_align.pdf

-Brian


Chris Murray wrote:

Good evening,
I understand that NTFS & VMDK do not relate to Solaris or ZFS, but I was
wondering if anyone has any experience of checking the alignment of data blocks 
through that stack?

I have a VMware ESX 4.0 host using storage presented over NFS from ZFS filesystems (recordsize 4KB). Within virtual machine VMDK files, I have formatted NTFS filesystems with a 4KB block size. Dedup is turned on. When I run 'zdb -DD', I see a figure for unique blocks that is higher than I expect, which makes me wonder whether any given 4KB block in the NTFS filesystem is perfectly aligned with a 4KB block in ZFS.


e.g. consider two virtual machines sharing lots of the same blocks. Assuming there /is/
a misalignment between NTFS & VMDK or VMDK & ZFS, if they're not in the same order
within NTFS, they don't align, and will actually produce different blocks in ZFS:

VM1
NTFS      1---2---3---
ZFS     1---2---3---4---

ZFS blocks are   AA, AABB and so on ...

Then in another virtual machine, the blocks are in a different order:

VM2
NTFS      1---2---3---
ZFS     1---2---3---4---

ZFS blocks for this VM would be   CC, CCAA, AABB etc. So, no overlap
between virtual machines, and no benefit from dedup.

I may have it wrong, and there are indeed 30,785,627 unique blocks in my setup, 
but if there's a mechanism for checking alignment, I'd find that very helpful.

Thanks,
Chris
  

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS upgrade.

2010-01-07 Thread Brian H. Nelson

James Lever wrote:

Is there a way to upgrade my current ZFS version.  I show the version could
be as high as 22.



The version of Solaris you are running only supports ZFS versions up to
version 15, as demonstrated by your 'zfs upgrade -v' output. You probably need
a newer version of Solaris, but I cannot tell you whether any newer release
supports later ZFS versions.
  


John,

You are already running the Update 8 kernel (141444-09). That is the 
latest version of ZFS that is available for Solaris 10.
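(To double-check what the running kernel supports versus what the pool and
filesystems are actually at, something like this works:)

  zpool upgrade -v    # pool versions this kernel supports
  zfs upgrade -v      # filesystem versions this kernel supports
  zfs upgrade         # lists any filesystems still below the current version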


-Brian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs/io performance on Netra X1

2009-11-13 Thread Brian H. Nelson

Bob Friesenhahn wrote:

On Fri, 13 Nov 2009, Tim Cook wrote:


If it is using parallel SCSI, perhaps there is a problem with the 
SCSI bus termination or a bad cable?


SCSI?  Try PATA ;)


Is that good?  I don't recall ever selecting that option when 
purchasing a computer.  It seemed safer to stick with SCSI than to try 
exotic technologies.




I hope you're being facetious. :-)   
http://en.wikipedia.org/wiki/Parallel_ATA



The Netra X1 has two IDE channels, so it should be able to handle 2 
disks without contention so long as only one disk is on each channel. 
OTOH, that machine is basically a desktop machine in a rack mount case 
(similar to a Blade 100) and is also vintage 2001. I wouldn't expect 
much performance out of it regardless.


-Brian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20

2009-09-24 Thread Brian H. Nelson

Roland Rambau wrote:

Richard, Tim,

yes, one might envision the X4275 as OpenStorage appliances, but
they are not. Exadata 2 is
 - *all* Sun hardware
 - *all* Oracle software (*)
and that combination is now an Oracle product: a database appliance.


Is there any reason the X4275 couldn't be an OpenStorage appliance? It 
seems like it would be a good fit. It doesn't seem specific to Exadata2.


The F20 accelerator card isn't something specific to Exadata2 either, is
it? It looks like something that would benefit any kind of storage
server. When I saw the F20 on the Sun site the other day, my first
thought was 'Oh cool, they reinvented Prestoserve!'


-Brian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Why did my zvol shrink ?

2009-03-11 Thread Brian H. Nelson

I'm doing a little testing and I hit a strange point. Here is a zvol (clone)

pool1/volclone  type          volume            -
pool1/volclone  origin        pool1/v...@diff1  -
pool1/volclone  reservation   none              default
pool1/volclone  volsize       191G              -
pool1/volclone  volblocksize  8K                -

The zvol has UFS on it. It has always been 191G and we've never 
attempted to resize it. However, if I just try to grow it, it gives me 
an error:


-bash-3.00# growfs /dev/zvol/rdsk/pool1/volclone
400555998 sectors < current size of 400556032 sectors
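(For what it's worth, 191 GiB works out to exactly the 400556032 sectors that
growfs reports as the current size, so the device itself now appears to be 34
sectors (17 KB) short. A quick check, using the same volume name as above:)

  echo $((191 * 1024 * 1024 * 1024 / 512))   # 191 GiB in 512-byte sectors = 400556032
  zfs get -p volsize pool1/volclone          # -p prints the raw byte value for comparison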


Is the zvol somehow smaller than it was originally? How/why?

It fscks OK, so UFS doesn't seem to notice.

This is Solaris 10 U6 currently; the machine (and zpool) have gone
through a few update releases since creation.


Thanks for any input,
-Brian

--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
 bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Performance issue with zfs send of a zvol (Again)

2009-01-20 Thread Brian H. Nelson
Nobody can comment on this?

-Brian


Brian H. Nelson wrote:
 I noticed this issue yesterday when I first started playing around with 
 zfs send/recv. This is on Solaris 10U6.

 It seems that a zfs send of a zvol issues 'volblocksize' reads to the 
 physical devices. This doesn't make any sense to me, as zfs generally 
 consolidates read/write requests to improve performance. Even the dd 
 case with the same snapshot does not exhibit this behavior. It seems to 
 be specific to zfs send.

 I checked with 8k, 64k, and 128k volblocksize, and the reads generated 
 by zfs send always seem to follow that size, while the reads with dd do not.

 The small reads seem to hurt performance of zfs send. I tested with a
 mirror, but on another machine with a 7 disk raidz, the performance is 
 MUCH worse because the 8k reads get broken up into even smaller reads 
 and spread across the raidz.

 Is this a bug, or can someone explain why this is happening?

 Thanks
 -Brian

 Using 8k volblocksize:

 -bash-3.00# zfs send pool1/vo...@now > /dev/null

 capacity operationsbandwidth
 pool  used  avail   read  write   read  write
 ---  -  -  -  -  -  -
 pool14.01G   274G  1.88K  0  15.0M  0
   mirror 4.01G   274G  1.88K  0  15.0M  0
 c0t9d0   -  -961  0  7.46M  0
 c0t11d0  -  -968  0  7.53M  0
 ---  -  -  -  -  -  -
 == ~8k reads to pool and drives

 -bash-3.00# dd if=/dev/zvol/dsk/pool1/vo...@now of=/dev/null bs=8k

 capacity operationsbandwidth
 pool  used  avail   read  write   read  write
 ---  -  -  -  -  -  -
 pool14.01G   274G  2.25K  0  17.9M  0
   mirror 4.01G   274G  2.25K  0  17.9M  0
 c0t9d0   -  -108  0  9.00M  0
 c0t11d0  -  -109  0  8.92M  0
 ---  -  -  -  -  -  -
 == ~8k reads to pool, ~85k reads to drives


 Using volblocksize of 64k:

 -bash-3.00# zfs send pool1/vol...@now > /dev/null

 capacity operationsbandwidth
 pool  used  avail   read  write   read  write
 ---  -  -  -  -  -  -
 pool16.01G   272G378  0  23.5M  0
   mirror 6.01G   272G378  0  23.5M  0
 c0t9d0   -  -189  0  11.8M  0
 c0t11d0  -  -189  0  11.7M  0
 ---  -  -  -  -  -  -
 == ~64k reads to pool and drives

 -bash-3.00# dd if=/dev/zvol/dsk/pool1/vol...@now of=/dev/null bs=64k

 capacity operationsbandwidth
 pool  used  avail   read  write   read  write
 ---  -  -  -  -  -  -
 pool16.01G   272G414  0  25.7M  0
   mirror 6.01G   272G414  0  25.7M  0
 c0t9d0   -  -107  0  12.9M  0
 c0t11d0  -  -106  0  12.8M  0
 ---  -  -  -  -  -  -
 == ~64k reads to pool, ~124k reads to drives


 Using volblocksize of 128k:

 -bash-3.00# zfs send pool1/vol1...@now > /dev/null

 capacity operationsbandwidth
 pool  used  avail   read  write   read  write
 ---  -  -  -  -  -  -
 pool14.01G   274G188  0  23.3M  0
   mirror 4.01G   274G188  0  23.3M  0
 c0t9d0   -  - 94  0  11.7M  0
 c0t11d0  -  - 93  0  11.7M  0
 ---  -  -  -  -  -  -
 == ~128k reads to pool and drives

 -bash-3.00# dd if=/dev/zvol/dsk/pool1/vol1...@now of=/dev/null bs=128k

 capacity operationsbandwidth
 pool  used  avail   read  write   read  write
 ---  -  -  -  -  -  -
 pool14.01G   274G247  0  30.8M  0
   mirror 4.01G   274G247  0  30.8M  0
 c0t9d0   -  -122  0  15.3M  0
 c0t11d0  -  -123  0  15.5M  0
 ---  -  -  -  -  -  -
 == ~128k reads to pool and drives

   

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs destroy is taking a long time...

2009-01-08 Thread Brian H. Nelson
David Smith wrote:
 I was wondering if anyone has any experience with how long a zfs destroy of 
 about 40 TB should take?  So far, it has been about an hour...  Is there any 
 good way to tell if it is working or if it is hung?

 Doing a zfs list just hangs.  If you do a more specific zfs list, then it 
 is okay... zfs list pool/another-fs

 Thanks,

 David
   

I can't speak to something like 40 TB, but I can share a related story
(on Solaris 10u5).

A couple days ago, I tried to zfs destroy a clone of a snapshot of a 191 
GB zvol. It didn't complete right away, but the machine appeared to 
continue working on it, so I decided to let it go overnight (it was near 
the end of the day). Well, by about 4:00 am the next day, the machine 
had completely run out of memory and hung. When I came in, I forced a
sync from prom to get it back up. While it was booting, it stopped 
during (I think) the zfs initialization part, where it ran the disks for 
about 10 minutes before continuing. When the machine was back up, 
everything appeared to be ok. The clone was still there, although usage 
had changed to zero.

I ended up patching the machine up to the latest u6 kernel + zfs patch 
(13-01 + 139579-01). After that, the zfs destroy went off without a 
hitch.

I turned up bug 6606810 'zfs destroy volume is taking hours to 
complete' which is supposed to be fixed by 139579-01. I don't know if 
that was the cause of my issue or not. I've got a 2GB kernel dump if 
anyone is interested in looking.
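(As for telling whether a destroy is still working rather than hung, the only
crude check I know of is to watch whether the pool is still doing I/O, e.g.:)

  zpool iostat pool 5    # steady read/write activity suggests it is still grinding away
  iostat -xnz 5          # per-device view of the same thing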

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Performance issue with zfs send of a zvol

2009-01-06 Thread Brian H. Nelson
I noticed this issue yesterday when I first started playing around with 
zfs send/recv. This is on Solaris 10U6.

It seems that a zfs send of a zvol issues 'volblocksize' reads to the 
physical devices. This doesn't make any sense to me, as zfs generally 
consolidates read/write requests to improve performance. Even the dd 
case with the same snapshot does not exhibit this behavior. It seems to 
be specific to zfs send.

I checked with 8k, 64k, and 128k volblocksize, and the reads generated 
by zfs send always seem to follow that size, while the reads with dd do not.
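(For reference, the numbers below are the kind of thing you get from watching
the pool while the send/dd runs; average read size is just bandwidth divided
by operations:)

  zpool iostat -v pool1 5    # per-vdev operations and bandwidth
  iostat -xnz 5              # cross-check average I/O size per physical device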

The small reads seem to hurt performance of zfs send. I tested with a
mirror, but on another machine with a 7 disk raidz, the performance is 
MUCH worse because the 8k reads get broken up into even smaller reads 
and spread across the raidz.

Is this a bug, or can someone explain why this is happening?

Thanks
-Brian

Using 8k volblocksize:

-bash-3.00# zfs send pool1/vo...@now > /dev/null

capacity operationsbandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
pool14.01G   274G  1.88K  0  15.0M  0
  mirror 4.01G   274G  1.88K  0  15.0M  0
c0t9d0   -  -961  0  7.46M  0
c0t11d0  -  -968  0  7.53M  0
---  -  -  -  -  -  -
== ~8k reads to pool and drives

-bash-3.00# dd if=/dev/zvol/dsk/pool1/vo...@now of=/dev/null bs=8k

capacity operationsbandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
pool14.01G   274G  2.25K  0  17.9M  0
  mirror 4.01G   274G  2.25K  0  17.9M  0
c0t9d0   -  -108  0  9.00M  0
c0t11d0  -  -109  0  8.92M  0
---  -  -  -  -  -  -
== ~8k reads to pool, ~85k reads to drives


Using volblocksize of 64k:

-bash-3.00# zfs send pool1/vol...@now > /dev/null

capacity operationsbandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
pool16.01G   272G378  0  23.5M  0
  mirror 6.01G   272G378  0  23.5M  0
c0t9d0   -  -189  0  11.8M  0
c0t11d0  -  -189  0  11.7M  0
---  -  -  -  -  -  -
== ~64k reads to pool and drives

-bash-3.00# dd if=/dev/zvol/dsk/pool1/vol...@now of=/dev/null bs=64k

capacity operationsbandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
pool16.01G   272G414  0  25.7M  0
  mirror 6.01G   272G414  0  25.7M  0
c0t9d0   -  -107  0  12.9M  0
c0t11d0  -  -106  0  12.8M  0
---  -  -  -  -  -  -
== ~64k reads to pool, ~124k reads to drives


Using volblocksize of 128k:

-bash-3.00# zfs send pool1/vol1...@now > /dev/null

capacity operationsbandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
pool14.01G   274G188  0  23.3M  0
  mirror 4.01G   274G188  0  23.3M  0
c0t9d0   -  - 94  0  11.7M  0
c0t11d0  -  - 93  0  11.7M  0
---  -  -  -  -  -  -
== ~128k reads to pool and drives

-bash-3.00# dd if=/dev/zvol/dsk/pool1/vol1...@now of=/dev/null bs=128k

capacity operationsbandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
pool14.01G   274G247  0  30.8M  0
  mirror 4.01G   274G247  0  30.8M  0
c0t9d0   -  -122  0  15.3M  0
c0t11d0  -  -123  0  15.5M  0
---  -  -  -  -  -  -
== ~128k reads to pool and drives

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Inexpensive ZFS home server

2008-11-12 Thread Brian H. Nelson

Jonathan Loran wrote:

David Evans wrote:
  

For anyone looking for a cheap home ZFS server...

Dell is having a sale on their PowerEdge SC440 for $199 (regular $598) through 
11/12/2008.

http://www.dell.com/content/products/productdetails.aspx/pedge_sc440?c=uscs=04l=ens=bsd

It's got a Dual Core Intel® Pentium® E2180, 2.0GHz, 1MB Cache, 800MHz FSB,
and you can upgrade the memory (ECC too) to 2GB for $19.

@$199, I just ordered 2.

dce
  



I don't think the Pentium E2180 has the lanes to use ECC RAM.  I'm also 
not confident the system board for this machine would make use of ECC 
memory either, which is not good from a ZFS perspective.  How many SATA 
plugs are there on the MB in this guy?


Jon

  


ECC support is a function of the chipset AFAIK. That system has an Intel 
3000 chipset which is stated to have ECC support. The Dell literature 
also states ECC support. I don't see any reason it wouldn't work as such.


From the manual, it appears to have 4 SATA ports. For anyone 
contemplating buying one for home use, note that it has only PCIe x8, 
not x16 (for graphics cards).


The SC440 is basically just a re-badged workstation. Nothing too 
exciting, but $199 is not a bad deal.


-Brian

--

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs patches in latest sol10 u2 patch bundle

2008-07-16 Thread Brian H. Nelson
Manyam wrote:
 Hi ZFS gurus -- I have a V240 with the Solaris 10 U2 release and ZFS. Could
 you please tell me if, by applying the latest patch bundle of Update 2, I
 will get all the ZFS patches installed as well?

   

It is possible to patch your way up to the U5 kernel and related
patches, which should give you all the latest ZFS bits (available in
Solaris, anyway). I have done this from U3, but I believe coming from U2
wouldn't be much different. I assume that the required patches are in
the latest bundle, but I believe 'smpatch update' is the prescribed
method these days. Be aware that there is at least one obsolete patch
that must be installed by hand in order to satisfy a dependency. I don't
recall the patch number, but the dependent patch will print a notice to
that effect if the required patch is not installed. You will have to
go through several patch-reboot iterations (one for each kernel patch,
U2 through U5) in order to get all the way there. Once you're done patching,
you should be able to do a 'zpool upgrade' to the current version (4).
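(Roughly, the cycle looks like this, repeated once per kernel patch level --
a sketch, not a tested recipe:)

  smpatch update      # download and apply the current recommended patches
  init 6              # reboot onto the new kernel
  # ...repeat until you are on the U5 kernel, then:
  zpool upgrade -a    # bring all pools up to the version the new kernel supports (v4)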

Depending on your situation though, it may just be easier to do an 
upgrade :)

-Brian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to identify zpool version

2008-06-16 Thread Brian H. Nelson
Peter Hawkins wrote:
 Can zpool on U3 be patched to V4? I've applied the latest cluster and it 
 still seems to be V3.

   
Yes, you can patch your way up to the Sol 10 U4 kernel (or even U5 
kernel) which will give you zpool v4 support. The particular patch you 
need is 120011-14 or 120012-14 (sparc or x86). There is at least one 
dependency patch that is obsolete (122660-10/122661-10) but must still 
be installed before the kernel patch will go in, so you may need to 
install one or two patches manually to get it working.

http://mail.opensolaris.org/pipermail/zfs-discuss/2007-October/043331.html
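(A quick way to confirm where you actually are, before and after patching:)

  showrev -p | grep 120011    # SPARC kernel patch; check 120012 on x86
  zpool upgrade -v            # highest pool version this kernel supports
  zpool upgrade               # lists any pools still below that version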

-Brian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [SOLVED] USB hard to ZFS

2008-06-16 Thread Brian H. Nelson


Andrius wrote:

 That is true, but
 # kill -HUP `pgrep vold`
 usage: kill [ [ -sig ] id ... | -l ]



I think you already did this as per a previous message:

# svcadm disable volfs

As such, vold isn't running. Re-enable the service and you should be fine.
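(That is, something like:)

  svcadm enable volfs    # bring vold back
  svcs -x volfs          # confirm the service comes online
  pgrep vold             # should now print a PID, so the kill -HUP will work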


-Brian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to identify zpool version

2008-06-13 Thread Brian H. Nelson
S10 U4 and U5 both use ZFS v4 (you specified your U4 machine as using v3).

If you have access to both machines, you can do 'zpool upgrade -v' to 
confirm which versions are being used.

-Brian


Peter Hawkins wrote:
 By the way I'm sure the pool was created using S10 Update 5
  
  
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The ZFS inventor and Linus sitting in a tree?

2008-05-20 Thread Brian H. Nelson

Keith Bierman wrote:


Not being a lawyer, and this not being a Legal forum ... can we leave  
license analysis alone?


  


The GNU project _itself_ states, in plain English, that it is not allowable. Why
people continue to argue about it is beyond me :-)


Common Development and Distribution License (CDDL) 
http://www.opensolaris.org/os/licensing/cddllicense.txt


   This is a free software license. It has a copyleft with a scope
   that's similar to the one in the Mozilla Public License, which makes
   it incompatible with the GNU GPL
   <http://www.gnu.org/licenses/gpl.html>. This means a module covered
   by the GPL and a module covered by the CDDL cannot legally be linked
   together. We urge you not to use the CDDL for this reason.

   Also unfortunate in the CDDL is its use of the term "intellectual
   property" <http://www.gnu.org/philosophy/not-ipr.html>.


(from http://www.gnu.org/licenses/license-list.html#SoftwareLicenses)

-Brian


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Need help with a dead disk

2008-02-12 Thread Brian H. Nelson
Here's a bit more info. The drive appears to have failed at 22:19 EST 
but it wasn't until 1:30 EST the next day that the system finally 
decided that it was bad. (Why?) Here's some relevant log stuff, with
lots of repeated 'device not responding' errors removed. I don't know if
it will be useful:


Feb 11 22:19:09 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 11 22:19:09 maxwell SCSI transport failed: reason 
'incomplete': retrying command
Feb 11 22:19:10 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 11 22:19:10 maxwell disk not responding to selection
...
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED] (isp0):

Feb 11 22:21:08 maxwell SCSI Cable/Connection problem.
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.notice]   
Hardware/Firmware error.
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED] (isp0):

Feb 11 22:21:08 maxwell Fatal error, resetting interface, flg 16

... (Why did this take so long?)

Feb 12 01:30:05 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 12 01:30:05 maxwell offline
...
Feb 12 01:30:22 maxwell fmd: [ID 441519 daemon.error] SUNW-MSG-ID: 
ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major

Feb 12 01:30:22 maxwell EVENT-TIME: Tue Feb 12 01:30:22 EST 2008
Feb 12 01:30:22 maxwell PLATFORM: SUNW,Ultra-250, CSN: -, HOSTNAME: maxwell
Feb 12 01:30:22 maxwell SOURCE: zfs-diagnosis, REV: 1.0
Feb 12 01:30:22 maxwell EVENT-ID: 7f48f376-2eb1-ccaf-afc5-e56f5bf4576f
Feb 12 01:30:22 maxwell DESC: A ZFS device failed.  Refer to 
http://sun.com/msg/ZFS-8000-D3 for more information.

Feb 12 01:30:22 maxwell AUTO-RESPONSE: No automated response will occur.
Feb 12 01:30:22 maxwell IMPACT: Fault tolerance of the pool may be 
compromised.
Feb 12 01:30:22 maxwell REC-ACTION: Run 'zpool status -x' and replace 
the bad device.



One thought I had was to unconfigure the bad disk with cfgadm. Would 
that force the system back into the 'offline' response?
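(Something along these lines -- the attachment point name below is just a
guess; the real one comes from the cfgadm listing:)

  cfgadm -al | grep c2t2d0                  # find the attachment point for the bad disk
  cfgadm -c unconfigure c2::dsk/c2t2d0      # then unconfigure it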


Thanks,
-Brian



Brian H. Nelson wrote:
Ok. I think I answered my own question. ZFS _didn't_ realize that the 
disk was bad/stale. I power-cycled the failed drive (external) to see if 
it would come back up and/or run diagnostics on it. As soon as I did 
that, ZFS put the disk ONLINE and started using it again! Observe:


bash-3.00# zpool status
  pool: pool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

NAME STATE READ WRITE CKSUM
pool1ONLINE   0 0 0
  raidz1 ONLINE   0 0 0
c0t9d0   ONLINE   0 0 0
c0t10d0  ONLINE   0 0 0
c0t11d0  ONLINE   0 0 0
c0t12d0  ONLINE   0 0 0
c2t0d0   ONLINE   0 0 0
c2t1d0   ONLINE   0 0 0
c2t2d0   ONLINE   2.11K 20.09 0

errors: No known data errors


Now I _really_ have a problem. I can't offline the disk myself:

bash-3.00# zpool offline pool1 c2t2d0
cannot offline c2t2d0: no valid replicas


I don't understand why, as 'zpool status' says all the other drives are OK.

What's worse, if I just power off the drive in question (trying to get 
back to where I started) the zpool hangs completely! I let it go for 
about 7 minutes thinking maybe there was some timeout, but still 
nothing. Any command that would access the zpool (including 'zpool  
status') hangs. The only way to fix is to power the external disk back 
on upon which everything starts working like nothing has happened. 
Nothing gets logged other than lots of these only while the drive is 
powered off:


Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 12 11:49:32 maxwell disk not responding to selection
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):

Feb 12 11:49:32 maxwell offline or reservation conflict
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0

[zfs-discuss] Need help with a dead disk (was: ZFS keeps trying to open a dead disk: lots of logging)

2008-02-12 Thread Brian H. Nelson
Ok. I think I answered my own question. ZFS _didn't_ realize that the 
disk was bad/stale. I power-cycled the failed drive (external) to see if 
it would come back up and/or run diagnostics on it. As soon as I did 
that, ZFS put the disk ONLINE and started using it again! Observe:

bash-3.00# zpool status
  pool: pool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

NAME STATE READ WRITE CKSUM
pool1ONLINE   0 0 0
  raidz1 ONLINE   0 0 0
c0t9d0   ONLINE   0 0 0
c0t10d0  ONLINE   0 0 0
c0t11d0  ONLINE   0 0 0
c0t12d0  ONLINE   0 0 0
c2t0d0   ONLINE   0 0 0
c2t1d0   ONLINE   0 0 0
c2t2d0   ONLINE   2.11K 20.09 0

errors: No known data errors


Now I _really_ have a problem. I can't offline the disk myself:

bash-3.00# zpool offline pool1 c2t2d0
cannot offline c2t2d0: no valid replicas

I don't understand why, as 'zpool status' says all the other drives are OK.

What's worse, if I just power off the drive in question (trying to get 
back to where I started) the zpool hangs completely! I let it go for 
about 7 minutes thinking maybe there was some timeout, but still 
nothing. Any command that would access the zpool (including 'zpool  
status') hangs. The only way to fix is to power the external disk back 
on upon which everything starts working like nothing has happened. 
Nothing gets logged other than lots of these only while the drive is 
powered off:

Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell disk not responding to selection
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell offline or reservation conflict
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell i/o to invalid geometry


What's going on here? What can I do to make ZFS let go of the bad drive? 
This is a production machine and I'm getting concerned. I _really_ don't 
like the fact that ZFS is using a suspect drive, but I can't seem to 
make it stop!

Thanks,
-Brian


Brian H. Nelson wrote:
 This is Solaris 10U3 w/127111-05.

 It appears that one of the disks in my zpool died yesterday. I got 
 several SCSI errors finally ending with 'device not responding to 
 selection'. That seems to be all well and good. ZFS figured it out and 
 the pool is degraded:

 maxwell /var/adm zpool status
   pool: pool1
  state: DEGRADED
 status: One or more devices could not be opened.  Sufficient replicas 
 exist for
 the pool to continue functioning in a degraded state.
 action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-D3
  scrub: none requested
 config:

 NAME STATE READ WRITE CKSUM
 pool1DEGRADED 0 0 0
   raidz1 DEGRADED 0 0 0
 c0t9d0   ONLINE   0 0 0
 c0t10d0  ONLINE   0 0 0
 c0t11d0  ONLINE   0 0 0
 c0t12d0  ONLINE   0 0 0
 c2t0d0   ONLINE   0 0 0
 c2t1d0   ONLINE   0 0 0
 c2t2d0   UNAVAIL  1.88K 17.98 0  cannot open

 errors: No known data errors


 My question is why does ZFS keep attempting to open the dead device? At 
 least that's what I assume is happening. About every minute, I get eight 
 of these entries in the messages log:

 Feb 12 10:15:54 maxwell scsi: [ID 107833 kern.warning] WARNING: 
 /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
 PROTECTED],0 (sd32):
 Feb 12 10:15:54 maxwell disk not responding to selection

 I also got a number of these thrown in for good measure:

 Feb 11 22:21:58 maxwell scsi: [ID 107833 kern.warning] WARNING: 
 /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
 PROTECTED],0 (sd32):
 Feb 11 22:21:58 maxwell SYNCHRONIZE CACHE command failed (5)


 Since the disk died last night (at about 11:20pm EST) I now have over 
 15K of similar entries in my log. What gives? Is this expected behavior? 
 If ZFS knows the device is having problems, why does it not just leave

[zfs-discuss] ZFS keeps trying to open a dead disk: lots of logging

2008-02-12 Thread Brian H. Nelson
This is Solaris 10U3 w/127111-05.

It appears that one of the disks in my zpool died yesterday. I got 
several SCSI errors finally ending with 'device not responding to 
selection'. That seems to be all well and good. ZFS figured it out and 
the pool is degraded:

maxwell /var/adm zpool status
  pool: pool1
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas 
exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

NAME STATE READ WRITE CKSUM
pool1DEGRADED 0 0 0
  raidz1 DEGRADED 0 0 0
c0t9d0   ONLINE   0 0 0
c0t10d0  ONLINE   0 0 0
c0t11d0  ONLINE   0 0 0
c0t12d0  ONLINE   0 0 0
c2t0d0   ONLINE   0 0 0
c2t1d0   ONLINE   0 0 0
c2t2d0   UNAVAIL  1.88K 17.98 0  cannot open

errors: No known data errors


My question is why does ZFS keep attempting to open the dead device? At 
least that's what I assume is happening. About every minute, I get eight 
of these entries in the messages log:

Feb 12 10:15:54 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd32):
Feb 12 10:15:54 maxwell disk not responding to selection

I also got a number of these thrown in for good measure:

Feb 11 22:21:58 maxwell scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd32):
Feb 11 22:21:58 maxwell SYNCHRONIZE CACHE command failed (5)


Since the disk died last night (at about 11:20pm EST) I now have over 
15K of similar entries in my log. What gives? Is this expected behavior? 
If ZFS knows the device is having problems, why does it not just leave 
it alone and wait for user intervention?

Also, I noticed that the 'action' says to attach the device and 'zpool 
online' it. Am I correct in assuming that a 'zpool replace' is what 
would really be needed, as the data on the disk will be outdated?
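(In other words, I assume the right sequence, once a good disk is physically
in place, would be something like this rather than just an online:)

  zpool replace pool1 c2t2d0    # resilver the new disk from the surviving raidz members
  zpool status -v pool1         # watch the resilver progress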

Thanks,
-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Do we have a successful installation method for patch 120011-14?

2007-10-04 Thread Brian H. Nelson
Manually installing the obsolete patch 122660-10 has worked fine for me. 
Until sun fixes the patch dependencies, I think that is the easiest way.
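(That is, something like this, run from the directory where the patches are
unpacked:)

  patchadd 122660-10    # obsolete, but still satisfies the dependency check
  patchadd 120011-14    # the kernel patch should then go in cleanly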

-Brian

Bruce Shaw wrote:
 It fails on my machine because it requires a patch that's deprecated.


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Do we have a successful installation method for patch 120011-14?

2007-10-04 Thread Brian H. Nelson
It was 120272-12 that caused the snmpd.conf problem and was withdrawn.
120272-13 has replaced it and has that bug fixed.

122660-10 does not have any issues that I am aware of. It is only 
obsolete, not withdrawn. Additionally, it appears that the circular 
patch dependency is by design if you read this BugID:

6574472 U4 feature Ku's need to hard require a patch that enforces 
zoneadmd patch is installed

So hacking the prepatch script for 125547-02/125548-02 to bypass the 
dependency check (as others have recommended) is a BAD THING and you may 
wind up with a broken system.

-Brian


Rob Windsor wrote:
 Yeah, the only thing wrong with that patch is that it eats 
 /etc/sma/snmp/snmpd.conf

 All is not lost, your original is copied to 
 /etc/sma/snmp/snmpd.conf.save in the process.

 Rob++

 Brian H. Nelson wrote:
   
 Manually installing the obsolete patch 122660-10 has worked fine for me. 
 Until sun fixes the patch dependencies, I think that is the easiest way.

 -Brian


 

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS array NVRAM cache?

2007-09-26 Thread Brian H. Nelson
Vincent Fox wrote:
 It seems like ZIL is a separate issue.

 I have read that putting ZIL on a separate device helps, but what about the 
 cache?

 OpenSolaris has some flag to disable it.  Solaris 10u3/4 do not.  I have 
 dual-controllers with NVRAM and battery backup, why can't I make use of it?   
 Would I be wasting my time to mess with this on 3310 and 3510 class 
 equipment?  I would think it would help but perhaps not.
  
  

   

I'm probably being really daft in thinking that everyone is overlooking 
the obvious, but...

Is this what you're referring to?
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
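(If so, the knob that page describes is the zfs_nocacheflush tunable, set in
/etc/system on bits new enough to have it -- which, as you say, S10u3/4 are
not:)

  * /etc/system -- only on releases that actually have the tunable
  set zfs:zfs_nocacheflush = 1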

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Brian H. Nelson
Stephen Usher wrote:
 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could 
 outline the gotchas so that the rest of us don't have to re-invent the 
 wheel... or at least not fall down the pitfalls.
   

I believe I ran into one or both of these bugs:

6429996 zvols don't reserve enough space for requisite meta data
6430003 record size needs to affect zvol reservation size on RAID-Z

Basically what happened was that the zpool filled to 100% and broke UFS 
with 'no space left on device' errors. This was quite strange to sort 
out since the UFS zvol had 30GB of free space.

I never got any replies to my request for more info and/or workarounds 
for the above bugs. My workaround and recommendation is to leave a 
'healthy' amount of un-allocated space in the zpool. I don't know what a 
good level for 'healthy' is. Currently I've left about 1% (2GB) on a 
200GB raid-z pool.
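(One way to enforce that slack, so the pool can never be driven all the way to
100%, is a placeholder dataset with a reservation -- the name here is just an
example:)

  zfs create pool1/slack                 # empty placeholder dataset
  zfs set reservation=2G pool1/slack     # space nothing else in the pool can consume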

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Brian H. Nelson
Mike Gerdts wrote:
 The UFS on zvols option sounds intriguing to me, but I would guess
 that the following could be problems:

 1) Double buffering:  Will ZFS store data in the ARC while UFS uses
 traditional file system buffers?
   
This is probably an issue. You also have the journal+COW combination 
issue. I'm guessing that both would be performance concerns. My 
application is relatively low bandwidth, so I haven't dug deep into this 
area.
 2) Boot order dependencies.  How does the startup of zfs compare to
 processing of /etc/vfstab?  I would guess that this is OK due to
 legacy mount type supported by zfs.  If this is OK, then dfstab
 processing is probably OK.
Zvols by nature are not available under ZFS automatic mounting. You 
would need to add the /dev/zvol/dsk/... lines to /etc/vfstab just as you 
would for any other /dev/dsk... or /dev/md/dsk/... devices.
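(For example -- the zvol name and mount point are illustrative:)

  #device to mount             device to fsck                mount point   FS   fsck  mount    mount
  #                                                                        type pass  at boot  options
  /dev/zvol/dsk/pool1/homevol  /dev/zvol/rdsk/pool1/homevol  /export/home  ufs  2     yes      logging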

If you are not using the zpool for anything else, I would remove the
automatic mount point for it.

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Brian H. Nelson
Stephen Usher wrote:

 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could 
 outline the gotchas so that the rest of us don't have to re-invent the 
 wheel... or at least not fall down the pitfalls.
   
Also, here's a link to the ufs on zvol blog where I originally found the 
idea:

http://blogs.sun.com/scottdickson/entry/fun_with_zvols_-_ufs

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Brian H. Nelson
Mike Gerdts wrote:
 Having worked in academia and multiple Fortune 100's, the problem
 seems to be most prevalent in academia, although possibly a minor
 inconvenience in some engineering departments in industry.  In the
 .edu where I used to manage the UNIX environment, I would have a tough
 time weighing the complexities of quotas he mentions vs. the other
 niceties.  My guess is that unless I had something that was really
 broken, I would stay with UFS or VxFS waiting for a fix.
   

UFS on a zvol is a pretty good compromise. You get lots of the nice ZFS 
stuff (checksums, raidz/z2, snapshots, growable pool, etc) with no 
changes in userland.

There are a couple of gotchas, but as long as you're aware of them, it
works pretty well. We've been using it since January.
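(The basic recipe, for anyone curious -- names and size are illustrative:)

  zfs create -V 100g pool1/homevol                  # carve a zvol out of the pool
  newfs /dev/zvol/rdsk/pool1/homevol                # put UFS on it
  mount -F ufs -o logging /dev/zvol/dsk/pool1/homevol /export/home
  # UFS quotas and the existing backup setup then behave exactly as before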

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS overhead killed my ZVOL

2007-04-03 Thread Brian H. Nelson

Can anyone comment?

-Brian


Brian H. Nelson wrote:

Adam Leventhal wrote:

On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
  
Why does this happen? Is it a bug? I know there is a recommendation of 
20% free space for good performance, but that thought never occurred to 
me when this machine was set up (zvols only, no zfs proper).



It sounds like this bug:

  6430003 record size needs to affect zvol reservation size on RAID-Z

Adam


Could be, but 6429996 sounds like a more likely candidate: zvols don't 
reserve enough space for requisite meta data.



I can create some large files (2GB) and the 'available' space only 
decreases by .01-.04GB for each file. The raidz pool is 7x36GB disks, 
with the default 8k volblocksize. Would/should 6430003 affect me? I 
don't understand what determines minimum allocatable size and the 
number of 'skipped' sectors for a given situation.
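(The 'available space' check above amounted to something like this -- the path
is illustrative:)

  mkfile 2g /export/home/engr/testfile    # write a large file into the UFS on the zvol
  zfs list pool1                          # compare AVAIL before and after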


Either way, my main concern is that I can address the problem so that 
the same situation does not reoccur. Are there workarounds for these 
bugs? How can I determine how much space needs to be reserved? How 
much (if any) of the remaining free space could be used for an 
additional zvol (with its own allocation of reserved space)?


Thanks,
Brian

--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---
  



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
 bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS overhead killed my ZVOL

2007-03-20 Thread Brian H. Nelson

Dear list,

Solaris 10 U3 on SPARC.

I had a 197GB raidz storage pool. Within that pool, I had allocated a 
191GB zvol (filesystem A), and a 6.75GB zvol (filesystem B). These used 
all but a couple hundred K of the zpool. Both zvols contained UFS 
filesystems with logging enabled. The (A) filesystem was about 79% full. 
(B) was also nearly full, but unmounted and not being used.


This configuration worked happily for a bit over two months. Then the 
other day, a user decided to copy (cp) about 11GB worth of video files 
within (A). This caused UFS to choke as such:


Mar  9 17:34:43 maxwell ufs: [ID 702911 kern.warning] WARNING: Error 
writing master during ufs log roll
Mar  9 17:34:43 maxwell ufs: [ID 127457 kern.warning] WARNING: ufs log 
for /export/home/engr changed state to Error
Mar  9 17:34:43 maxwell ufs: [ID 616219 kern.warning] WARNING: Please 
umount(1M) /export/home/engr and run fsck(1M)


I do as the message says: unmount and attempt to fsck. I am then 
bombarded with thousands of errors, BUT fsck can not fix them due to 'no 
space left on device'. That's right, the filesystem with about 30GB free 
didn't have enough free space to fsck. Strange.


After messing with the machine all weekend, rebooting, calling coworkers 
(other sys admins), calling sun, scratching my head, etc.. The solution 
ended up being to _delete the (B) zvol_ (which contained only junk 
data). Once that was done, fsck ran all the way through without problems 
(besides wiping all my ACLs) and things were happy again.


So I surmised that ZFS ran out of space to do its thing, and for
whatever reason, that 'out of space' got pushed down into the zvol as
well, causing fsck to choke. I _have_ been able to reproduce the
situation on a test machine, but not reliably. It basically consists of
setting up two zvols that take up almost all of the pool space, newfsing
them, filling one up to about 90% full, then looping through copies of 1/2
of the remaining space until it dies.


(So for a 36GB pool, create a 34GB zvol and a 2.xxGB zvol. newfs them.
Mount the larger one. Create a 30GB junk file. Create a directory of say
5 files worth about 2GB total. Then do 'while true; do cp -r dira
dirb; done' until it fails. Sometimes it does, sometimes not.)
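(Spelled out, the reproduction looks roughly like this -- sizes and names are
illustrative:)

  zfs create -V 34g pool1/big
  zfs create -V 2g pool1/small
  newfs /dev/zvol/rdsk/pool1/big
  newfs /dev/zvol/rdsk/pool1/small
  mount -F ufs /dev/zvol/dsk/pool1/big /mnt
  mkfile 30g /mnt/junk                                # fill the big filesystem to ~90%
  mkdir /mnt/dira
  mkfile 400m /mnt/dira/f1 /mnt/dira/f2 /mnt/dira/f3 /mnt/dira/f4 /mnt/dira/f5
  while true; do cp -r /mnt/dira /mnt/dirb; done      # loop until it fails (sometimes it doesn't)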


Why does this happen? Is it a bug? I know there is a recommendation of 
20% free space for good performance, but that thought never occurred to 
me when this machine was set up (zvols only, no zfs proper).


I think it is a bug simply because it _allowed_ me to create a 
configuration that didn't leave enough room for overhead. There isn't a 
whole lot of info surrounding zvols. Does the 80% free rule still apply
to the underlying ZFS if only zvols are used? That would be really
unfortunate. I think most people wanting to use a zvol would want to use 
100% of a pool toward the zvol.


-Brian

--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
 bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] UFS on zvol: volblocksize and maxcontig

2007-01-26 Thread Brian H. Nelson

Hi all!

First off, if this has been discussed, please point me in that 
direction. I have searched high and low and really can't find much info 
on the subject.


We have a large-ish (200GB) UFS file system on a Sun Enterprise 250 that
is being shared with Samba (lots of files, mostly random IO). OS is
Solaris 10u3. The disk set is 7x36GB 10k SCSI, 4 internal and 3 external.


For several reasons we currently need to stay on UFS and can't switch to 
ZFS proper. So instead we have opted to do UFS on a zvol using raid-z, 
in lieu of UFS on SVM using raid5 (we want/need raid protection). This 
decision was made because of the ease of disk set portability of zpools, 
and also the [assumed] performance benefit vs SVM.


Anyways, I've been pondering the volblocksize parameter, and trying to 
figure out how it interacts with UFS. When the zvol was setup, I took 
the default 8k size. Since UFS uses an 8k blocksize, this seemed to be a 
reasonable choice. I've been thinking more about it lately, and have 
also read that UFS will do R/W in bigger than 8k blocks when it can, up 
to maxcontig (default of 16, ie 128k).
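(For concreteness, these are the two knobs in question -- the dataset name is
illustrative, and volblocksize can only be set when the zvol is created:)

  zfs create -b 128k -V 200g pool1/smbvol       # zvol with 128k volblocksize
  newfs /dev/zvol/rdsk/pool1/smbvol
  tunefs -a 16 /dev/zvol/rdsk/pool1/smbvol      # maxcontig, in UFS blocks (16 x 8k = 128k clusters)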


This presented me with several questions: Would a volblocksize of 128k 
and maxcontig 16 provide better UFS performance? Overall, or only in 
certain situations (ie only for sequential IO)? Would increasing the 
maxcontig beyond 16 make any difference (good, bad or indifferent) if 
the underlying device is limited to 128k blocks?


What exactly does volblocksize control? My observations thus far
indicate that it simply sets a max block size for the [virtual] zvol
device. Changing volblocksize does NOT seem to have an impact on IOs to
the underlying physical disks (which always seem to float in the 50-110k
range). How does volblocksize affect IO that is not of a set block size?


Finally, why does volblocksize only affect raidz and mirror devices? It
seems to have no effect on 'simple' devices, even though I presume 
striping is still used there. That is also assuming that volblocksize 
interacts with striping.


Any answers or input is greatly appreciated.

Thanks much!
-Brian

--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
 bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] UFS on zvol: volblocksize and maxcontig

2007-01-26 Thread Brian H. Nelson

Darren J Moffat wrote:

Brian H. Nelson wrote:
For several reasons we currently need to stay on UFS and can't switch 
to ZFS proper. So instead we have opted to do UFS on a zvol using 
raid-z, 


Can you state what those reasons are please ?

I know that isn't answering the question you are asking but it is 
worth making sure you have the correct info.


I'd also like to understand why UFS works for you but ZFS as a 
filesystem does not.




I knew someone would ask that :)

The primary reason is that our backup software (EMC/Legato Networker 
7.2) does not appear to support zfs. We don't have the funds currently 
to upgrade to the new version that does.


The other reason is that the machine has been around for years, already 
using UFS and quotas extensively. Over winter break we had time to 
upgrade to Solaris 10 and migrate the volume from svm to zvol, but not 
much more. There are a few thousand users on the machine. The thought of
transitioning to that many zfs 'partitions' in order to have per-user 
quotas seemed daunting, not to mention the administrative re-training 
needed (edquota doesn't work. du is reporting 3000 filesystems?! etc).


IMO, the quota-per-file-system approach seems inconvenient when you get 
past a handful of file systems. Unless I'm really missing something, it 
just seems like a nightmare to have to deal with such a ridiculous 
number of file systems.


-Brian

--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
 bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] UFS on zvol: volblocksize and maxcontig

2007-01-26 Thread Brian H. Nelson

[EMAIL PROTECTED] wrote:

*snip*
IMO, the quota-per-file-system approach seems inconvenient when you get 
past a handful of file systems. Unless I'm really missing something, it 
just seems like a nightmare to have to deal with such a ridiculous 
number of file systems.



Why?  What additional per-filesystem overhead from a maintenance perspective
are you seeing?

Casper
  
The obvious example would be /var/mail . UFS quotas are easy. Doing the 
same thing with ZFS would be (I think) impossible. You would have to 
completely convert and existing system to a maildir or home directory 
mail storage setup.


Other file-system-specific software could also have issues. Networker 
for instance does backups per filesystem. In that situation I could then 
possibly have ~3000 backup sets DAILY for a single machine (worst case, 
that each file system has changes). Granted, that may not be better or 
worse, just 'different' and not what I'm used to. On the other hand, I 
could certainly see where that could add a ton of overhead to backup 
processing.


Don't get me wrong, zfs quotas are a good thing, and could certainly be 
useful in many situations. I just don't think I agree that they are a 
one to one replacement for ufs quotas in terms of usability in all 
situations.


-Brian

--
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
 bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss