Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Daniel Carosone
On Tue, Mar 23, 2010 at 07:22:59PM -0400, Frank Middleton wrote:
 On 03/22/10 11:50 PM, Richard Elling wrote:
  
 Look again, the checksums are different.

 Whoops, you are correct, as usual. Just 6 bits out of 256 different...

 Look which bits are different -  digits 24, 53-56 in both cases.

This is very likely an error introduced during the calculation of
the hash, rather than an error in the input data.  I don't know how
that helps narrow down the source of the problem, though..

It suggests an experiment: try switching to another hash algorithm.
It may move the problem around, or even make it worse, of course.

I'm also reminded of a thread about the implementation of fletcher2
being flawed; perhaps you're better off switching regardless.
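(Something like the following on a scratch dataset, for example; the dataset
name is just an illustration, and only newly written blocks pick up the new
algorithm:)

  zfs set checksum=sha256 tank/testfs
  zfs get checksum tank/testfs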

 o Why is the file flagged by ZFS as fatally corrupted still accessible?

 This is the part I was hoping to get answers for since AFAIK this
 should be impossible. Since none of this is having any operational
 impact, all of these issues are of interest only, but this is a bit scary!

It's only the blocks with bad checksums that should return errors.
Maybe you're not reading those, or the transient error doesn't happen
the next time you actually try to read it, or the read is satisfied
from the other side of the mirror.

Repeated errors in the same file could also be a symptom of an error
calculating the hash when the file was written.  If there's a
bit-flipping issue at the root of it, with some given probability,
that would invert the probabilities of correct and error results.

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Damon Atkins
You could try copying the file to /tmp (i.e. swap/RAM) and running a continuous
loop of checksums, e.g.:

while [ ! -f libdlpi.so.1.x ] ; do sleep 1; cp libdlpi.so.1 libdlpi.so.1.x ; \
  A=`sha512sum -b libdlpi.so.1.x` ; \
  [ "$A" = "<what it should be> *libdlpi.so.1.x" ] && rm libdlpi.so.1.x ; done ; date

Assuming the file never goes to swap, this would tell you if something on the 
motherboard is playing up.
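Spelled out as a script (just a sketch; the expected checksum is a placeholder
taken from a known-good run, and the library path may differ on your system):

  #!/bin/sh
  # Copy the library into /tmp (i.e. swap/RAM) and checksum it in a loop.
  cp /lib/libdlpi.so.1 /tmp && cd /tmp
  # Placeholder: paste the output of a known-good "sha512sum -b" run here,
  # with the file name changed to the copy's name.
  EXPECTED='<known-good-hash> *libdlpi.so.1.x'
  while [ ! -f libdlpi.so.1.x ]; do
      sleep 1
      cp libdlpi.so.1 libdlpi.so.1.x
      A=`sha512sum -b libdlpi.so.1.x`
      # Remove the copy only when the checksum matches; a mismatch leaves
      # the bad copy behind and terminates the loop.
      [ "$A" = "$EXPECTED" ] && rm libdlpi.so.1.x
  done
  date   # when the mismatch happened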

I have seen a CPU randomly set a byte to 0 that should not have been 0; I think it was 
an L1 or L2 cache problem.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Saso Kiselkov

How about running memtest86+ (http://www.memtest.org/) on the machine
for a while? It doesn't test the CPU's arithmetic very much, but it
stresses the data paths quite a lot. Just a quick suggestion...

--
Saso

Damon Atkins wrote:
 You could try copying the file to /tmp (i.e. swap/RAM) and running a continuous loop 
 of checksums, e.g.:
 
 while [ ! -f libdlpi.so.1.x ] ; do sleep 1; cp libdlpi.so.1 libdlpi.so.1.x ; \
   A=`sha512sum -b libdlpi.so.1.x` ; \
   [ "$A" = "<what it should be> *libdlpi.so.1.x" ] && rm libdlpi.so.1.x ; done ; date
 
 Assuming the file never goes to swap, this would tell you if something on the 
 motherboard is playing up.
 
 I have seen a CPU randomly set a byte to 0 that should not have been 0; I think it was 
 an L1 or L2 cache problem.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Damon Atkins
You could also use psradm to take a CPU off-line.
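For example (the processor ID is only an illustration; psrinfo lists the real ones):

  psrinfo          # list processors and their state
  psradm -f 1      # take processor 1 off-line
  psradm -n 1      # bring it back on-line afterwards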

At boot I would assume the system boots the same way every time unless 
something changes, so you could be hitting the same CPU core every time, or the 
same bit of RAM, until the system is fully booted.

Or even run the SunVTS Validation Test Suite, which I believe has a test similar 
to the cp-to-/tmp loop, in addition to all its other tests.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-24 Thread Joerg Schilling
Darren J Moffat darr...@opensolaris.org wrote:

  You cannot get a single file out of the zfs send datastream.

 I don't see that as part of the definition of a backup - you obviously 
 do - so we will just have to disagree on that.

If you need to set up a file server of the same size as the original one 
in order to be able to access a specific file from the backup data, this could be
seen as a major handicap.

  getattrat(3C) / setattrat(3C)
 
  Even has example code in it.
 
  This is what ls(1) uses.
 
  It could be easily possible to add portable support integrated into the
  framework that already supports FreeBSD and Linux attributes.

 Great, do you have a time frame for when you will have this added to 
 star then ?

I need to write some 50 missing lines of code (formatting virgin BD-RE and 
BD-RE/DL media) in cdrecord and publish cdrtools-3.0-final before I start
working on other projects, but this will hopefully be soon.


  -   A public interface to get the property state

 That would come from libzfs.  There are private interfaces just now that 
 are very likely what you need zfs_prop_get()/zfs_prop_set(). They aren't 
 documented or public though and are subject to change at any time.

mmm, as the state of the compression flag may seriously affect media 
consumption, this seems to be an important part of the metadata in the case of a 
backup. Is there no way to define an interface that will just evolve without
becoming incompatible?

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-24 Thread David Magda
On Wed, March 24, 2010 10:36, Joerg Schilling wrote:

  -  A public interface to get the property state

 That would come from libzfs.  There are private interfaces just now that
 are very likely what you need zfs_prop_get()/zfs_prop_set(). They aren't
 documented or public though and are subject to change at any time.

 mmm, as the state of the compression flag may seriously affect media
 consumption, this seems to be an important part of the meta data in case
 of a
 backup. Is there no way to define an interface that will just evolve
 without
 becoming ncompatible?

I think the larger question is: when will ZFS be stable enough that Oracle
will say that libzfs is an officially supported interface? Once that
happens it will probably be possible for third parties to start accessing
ZFS in ways other than the POSIX interface.

I'm guessing that support for crypto, device removal, and parity changing
(RAID-Z1 -> Z2 -> Z3) needs to be put in first (the latter two
necessitating bp rewrite). I would hazard a guess that it will be at least a
year before it's even considered, and longer before it happens (Solaris 12?
or maybe a later update of Solaris 11?).

Until that happens we'll be stuck with working at the ZPL for most things.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Dusan Radovanovic
Hello all,

I am a complete newbie to OpenSolaris, and must set up a ZFS NAS. I do have 
Linux experience, but have never used ZFS. I have tried to install OpenSolaris 
Developer 134 on an 11TB HW RAID-5 virtual disk, but after the installation I 
can only use one 2TB disk, and I cannot partition the rest. I realize that the 
maximum partition size is 2TB, but I guess the rest must be usable. For 
hardware I am using an HP ProLiant DL180 G6, with 12 1TB disks connected to a P212 
controller in RAID-5. Could someone direct me or suggest what I am doing wrong? 
Any help is greatly appreciated.

Cheers,
Dusan
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Tim Cook
On Wed, Mar 24, 2010 at 11:01 AM, Dusan Radovanovic dusa...@gmail.comwrote:

 Hello all,

 I am a complete newbie to OpenSolaris, and must to setup a ZFS NAS. I do
 have linux experience, but have never used ZFS. I have tried to install
 OpenSolaris Developer 134 on a 11TB HW RAID-5 virtual disk, but after the
 installation I can only use one 2TB disk, and I cannot partition the rest. I
 realize that maximum partition size is 2TB, but I guess the rest must be
 usable. For hardware I am using HP ProLiant DL180G6, 12 1TB disks connected
 to P212 controller in RAID-5. Could someone direct me or suggest what I am
 doing wrong. Any help is greatly appreciated.

 Cheers,
 Dusan



You would be much better off installing to a small internal disk, and then
creating a separate pool for the 11TB of storage.  The 2TB limit is because
it's a boot drive.  That limit should go away if you're using it as a
separate storage pool.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Carsten Aulbert
Hi

On Wednesday 24 March 2010 17:01:31 Dusan Radovanovic wrote:

  connected to P212 controller in RAID-5. Could someone direct me or suggest
  what I am doing wrong. Any help is greatly appreciated.
 

I don't know what's going wrong there, but I would work around it like this:

My suggestion would be to configure the HW RAID controller to act as a dumb 
JBOD controller and thus make all 12 disks visible to the OS.

Then you can start playing around with ZFS on these disks, e.g. creating 
different pools:

zpool create testpool raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 \
  raidz c0t6d0 c0t7d0 c0t8d0 c0t9d0 c0t10d0 c0t11d0

(Caveat: this is off the top of my head and might be very wrong.) This 
would create something like RAID50.

Then I would start reading, reading and testing and testing :)

HTH

Carsten
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Fwd: Re: ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Svein Skogen

Forgot to cc the list, well here goes...

-------- Original Message --------
Subject: Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller
Date: Wed, 24 Mar 2010 17:10:58 +0100
From: Svein Skogen sv...@stillbilde.net
To: Dusan Radovanovic dusa...@gmail.com

On 24.03.2010 17:01, Dusan Radovanovic wrote:
 Hello all,

 I am a complete newbie to OpenSolaris, and must to setup a ZFS NAS. I do have 
 linux experience, but have never used ZFS. I have tried to install 
 OpenSolaris Developer 134 on a 11TB HW RAID-5 virtual disk, but after the 
 installation I can only use one 2TB disk, and I cannot partition the rest. I 
 realize that maximum partition size is 2TB, but I guess the rest must be 
 usable. For hardware I am using HP ProLiant DL180G6, 12 1TB disks connected 
 to P212 controller in RAID-5. Could someone direct me or suggest what I am 
 doing wrong. Any help is greatly appreciated.

 Cheers,
 Dusan

If you have a recent enough raid controller to reliably handle more than
2TB per Logical Disk, it has support for more than one logical disk per
drivegroup/span of drivegroups. Do yourself the favour of setting up a
100GB logical disk 0 for the system, and the rest of the drivegroup/span
for the storage pool. Remember to disable the write cache if you don't have
battery backup, unless you really want to try out ZFS's famous
corruption-recovery algorithms from personal experience.

//Svein

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Karl Rossing
I believe that write caching is turned off on the boot drives; or is it 
turned off on the controller, or both?


Which could be a big problem.

On 03/24/10 11:07, Tim Cook wrote:



On Wed, Mar 24, 2010 at 11:01 AM, Dusan Radovanovic dusa...@gmail.com 
mailto:dusa...@gmail.com wrote:


Hello all,

I am a complete newbie to OpenSolaris, and must to setup a ZFS
NAS. I do have linux experience, but have never used ZFS. I have
tried to install OpenSolaris Developer 134 on a 11TB HW RAID-5
virtual disk, but after the installation I can only use one 2TB
disk, and I cannot partition the rest. I realize that maximum
partition size is 2TB, but I guess the rest must be usable. For
hardware I am using HP ProLiant DL180G6, 12 1TB disks connected to
P212 controller in RAID-5. Could someone direct me or suggest what
I am doing wrong. Any help is greatly appreciated.

Cheers,
Dusan



You would be much better off installing to a small internal disk, and 
then creating a separate pool for the 11TB of storage.  The 2TB limit 
is because it's a boot drive.  That limit should go away if you're 
using it as a separate storage pool.


--Tim


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-24 Thread Richard Elling
On Mar 23, 2010, at 11:21 PM, Daniel Carosone wrote:

 On Tue, Mar 23, 2010 at 07:22:59PM -0400, Frank Middleton wrote:
 On 03/22/10 11:50 PM, Richard Elling wrote:
 
 Look again, the checksums are different.
 
 Whoops, you are correct, as usual. Just 6 bits out of 256 different...
 
 Look which bits are different -  digits 24, 53-56 in both cases.
 
 This is very likely an error introduced during the calculation of
 the hash, rather than an error in the input data.  I don't know how
 that helps narrow down the source of the problem, though..

The exact same code is used to calculate the checksum when writing
or reading. However, we assume the processor works and Frank's tests
do not indicate otherwise.

 
 It suggests an experiment: try switching to another hash algorithm.
 It may move the problem around, or even make it worse, of course.
 
 I'm also reminded of a thread about the implementation of fletcher2
 being flawed, perhaps you're better switching regardless.

Clearly, fletcher2 identified the problem.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Dusan Radovanovic
Thank you all for your valuable experience and fast replies. I see your point 
and will create one virtual disk for the system and one for the storage pool. 
My RAID controller is battery backed up, so I'll leave write caching on.

Thanks again,
Dusan
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: Re: ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Dusan Radovanovic
Thank you for your advice. I see your point and will create one virtual disk 
for the system and one for the storage pool. My RAID controller is battery 
backed up, so I'll leave write caching on.

Thanks again,
Dusan
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Richard Elling
On Mar 24, 2010, at 9:14 AM, Karl Rossing wrote:
 I believe that write caching is turned off on the boot drives or is it the 
 controller or both?

By default, ZFS will not enable the volatile write cache on SMI-labeled
disk drives (e.g. boot disks).

 Which could be a big problem.

Actually, it is very rare that the synchronous write performance of a boot
drive is a performance problem.

Nonvolatile write caches are not a problem.

 On 03/24/10 11:07, Tim Cook wrote:
 
 
 On Wed, Mar 24, 2010 at 11:01 AM, Dusan Radovanovic dusa...@gmail.com 
 wrote:
 Hello all,
 
 I am a complete newbie to OpenSolaris, and must to setup a ZFS NAS. I do 
 have linux experience, but have never used ZFS. I have tried to install 
 OpenSolaris Developer 134 on a 11TB HW RAID-5 virtual disk, but after the 
 installation I can only use one 2TB disk, and I cannot partition the rest. I 
 realize that maximum partition size is 2TB, but I guess the rest must be 
 usable. For hardware I am using HP ProLiant DL180G6, 12 1TB disks connected 
 to P212 controller in RAID-5. Could someone direct me or suggest what I am 
 doing wrong. Any help is greatly appreciated.

Simple. Make a small LUN, say 20GB or so, and install the OS there.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Svein Skogen

On 24.03.2010 17:42, Richard Elling wrote:

Nonvolatile write caches are not a problem.


Which is why ZFS isn't a replacement for proper array controllers 
(defining "proper" as those with sufficient battery to leave you with a 
seemingly intact filesystem), but a very nice augmentation for them. ;)


As someone pointed out in another thread: Proper storage still takes 
proper planning. ;)


//Svein

--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] LSISAS2004 support

2010-03-24 Thread Bart Nabbe
Loaded mpt_sas, world of difference, thanks.

Then I yanked a drive out of the hot-plug backplane to see what would happen.
My zpool detects an I/O failure and runs in degraded mode. All good; pop the drive 
back in, but a zpool replace appears not to be sufficient. (This works with the 
1068E/mpt driver combo.) I then ran cfgadm -c configure c4, which completes with 
no change in the configuration status of the device. cfgadm -c configure 
c4::dsk/c4t3d0 fails. Is there an equivalent to -x sata_port_activate for 
scsi-sas that I should use?

Thanks, 

Bart

 
On Mar 22, 2010, at 23:40, James C. McPherson wrote:

 On 23/03/10 01:23 PM, Bart Nabbe wrote:
 All,
 
 I did some digging and I was under the impression that the
  mr_sas driver was to support the LSISAS2004 HBA controller
  from LSI.
 I did add the pci id to the driver alias for mr_sas, but
  then the driver still showed up as unattached (see below).
 Did I miss something, or was my assumption that this controller
  was supported in the dev branch flawed.
 I'm running:  SunOS 5.11 snv_134 i86pc i386 i86pc Solaris.
 
 Thanks in advance for any pointers.
 
 
 node name:  pci1000,3010
 Vendor: LSI Logic / Symbios Logic
 Device: SAS2004 PCI-Express Fusion-MPT SAS-2 
 [Spitfire]
 Sub-Vendor: LSI Logic / Symbios Logic
 binding name:   pciex1000,70
 devfs path: /p...@0,0/pci8086,3...@3/pci1000,3010
 pci path:   3,0,0
 compatible name:
 (pciex1000,70.1000.3010.2)(pciex1000,70.1000.3010)(pciex1000,70.2)(pciex1000,70)(pciexclass,010700)(pciexclass,0107)(pci1000,70.1000.3010.2)(pci1000,70.1000.3010)(pci1000,3010)(pci1000,70.2)(pci1000,70)(pciclass,010700)(pciclass,0107)
 driver name:mr_sas
 
 
 This should be using the mpt_sas driver, not the mr_sas driver.
 
 
 James C. McPherson
 --
 Senior Software Engineer, Solaris
 Sun Microsystems
 http://www.jmcp.homeunix.com/blog
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Moving drives around...

2010-03-24 Thread Russ Price
 On Tue, March 23, 2010 12:00, Ray Van Dolson wrote:
 ZFS recognizes disks based on various ZFS special
 blocks written to them. 
 It also keeps a cache file on where things have been
 lately.  If you
 export a ZFS pool, swap the physical drives around,
 and import it,
 everything should be fine.  If you don't export
 first, you may have to
 give it a bit of help.  And there are pathological
 cases where for example
 you don't have a link in the /dev/dsk directory which
 can cause a default
 import to not find all the pieces of a pool.

Indeed. Before I wised up and bought an HBA for my RAIDZ2 array instead of 
using randomly-assorted SATA controllers, I tried rearranging some disks 
without exporting the pool first. I almost had a heart attack when the system 
came up reporting corrupted data on the drives that had been switched. As it 
turned out, I just needed to export and re-import the pool, and it was fine 
after that. Needless to say, when the HBA went in, I made sure to export the 
pool FIRST.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] To reserve space

2010-03-24 Thread Edward Ned Harvey
Is there a way to reserve space for a particular user or group?  Or perhaps
to set a quota for a group which includes everyone else?

 

I have one big pool, which holds users' home directories, and also the
backend files for the svn repositories etc.  I would like to ensure the svn
server process will always have some empty space to work with, even if some
users go hog wild and consume everything they can.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Richard Elling
On Mar 24, 2010, at 10:05 AM, Svein Skogen wrote:

 On 24.03.2010 17:42, Richard Elling wrote:
 Nonvolatile write caches are not a problem.
 
 Which is why ZFS isn't a replacement for proper array controllers (defining 
 proper as those with sufficient battery to leave you with a seemingly intact 
 filesystem), but a very nice augmentation for them. ;)

Nothing prevents a clever chap from building a ZFS-based array controller
which includes nonvolatile write cache. However, the economics suggest
that the hybrid storage pool model can provide a highly dependable service
at a lower price-point than the traditional array designs.

 As someone pointed out in another thread: Proper storage still takes proper 
 planning. ;)

Good advice :-)
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To reserve space

2010-03-24 Thread Freddie Cash
On Wed, Mar 24, 2010 at 10:52 AM, Edward Ned Harvey
solar...@nedharvey.comwrote:

  Is there a way to reserve space for a particular user or group?  Or
 perhaps to set a quota for a group which includes everyone else?

 I have one big pool, which holds users’ home directories, and also the
 backend files for the svn repositories etc.  I would like to ensure the svn
 server process will always have some empty space to work with, even if some
 users go hog wild and consume everything they can.

zfs set reservation=100GB dataset/name

That will reserve 100 GB of space for the dataset, and will make that space
unavailable to the rest of the pool.

-- 
Freddie Cash
fjwc...@gmail.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Edward Ned Harvey
 Thank you all for your valuable experience and fast replies. I see your
 point and will create one virtual disk for the system and one for the
 storage pool. My RAID controller is battery backed up, so I'll leave
 write caching on.

I think the point is this:  ZFS software raid is both faster and more
reliable than your hardware raid.  Surprising though it may be for a
newcomer, I have statistics to back that up, and an explanation of how it's
possible, if you want to know.

You will do best if you configure the raid controller to JBOD.  Yes it's ok
to enable WriteBack on all those disks, but just use the raid card for write
buffering, not raid.

The above suggestion might be great ideally.  But how do you boot from some
disk which isn't attached to the raid controller?  Most servers don't have
any other option ...  So you might just make a 2-disk mirror, use that as a
boot volume, and then JBOD all the other disks.  That's somewhat a waste of
disk space, but it might be your best solution.  This is, in fact, what I do.
I have 2x 1TB disks dedicated to nothing but the OS.  That's tremendous
overkill.  And all the other disks are a data pool.  All of the disks are
1TB, because it greatly simplifies the usage of a hotspare...  And I'm
wasting nearly 1TB on the OS disks.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To reserve space

2010-03-24 Thread Edward Ned Harvey
 zfs set reservation=100GB dataset/name
 
 That will reserve 100 GB of space for the dataset, and will make that
 space unavailable to the rest of the pool.

That doesn't make any sense to me ... 

How does that allow subversionuser to use the space, and block joeuser from 
using it?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To reserve space

2010-03-24 Thread Edward Ned Harvey
  zfs set reservation=100GB dataset/name
 
  That will reserve 100 GB of space for the dataset, and will make that
  space unavailable to the rest of the pool.
 
 That doesn't make any sense to me ...
 
 How does that allow subversionuser to use the space, and block
 joeuser from using it?

Oh - I get it. In the case of a subversion server, it's pretty safe to assume
all the svnuser files live under a specific subdirectory (or a manageably
finite number of directories), so they could use a separate zfs
filesystem within the same pool, and that filesystem could carry a space
reservation.
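For example, something along these lines (dataset name and size are only
illustrations):

  zfs create tank/svn               # separate filesystem for the svn back-end
  zfs set reservation=50G tank/svn  # guarantee it 50 GB even if others fill the pool
  zfs get reservation tank/svn      # verify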

I think that will be sufficient for our immediate needs.  Thanks for the
suggestion.

Out of curiosity, the more general solution would be the ability to create a
reservation on a per-user or per-group basis (just like you create quotas on
a per-user or per-group basis).  Is this possible?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Moving drives around...

2010-03-24 Thread Ray Van Dolson
On Wed, Mar 24, 2010 at 10:47:27AM -0700, Russ Price wrote:
  On Tue, March 23, 2010 12:00, Ray Van Dolson wrote:
  ZFS recognizes disks based on various ZFS special
  blocks written to them. 
  It also keeps a cache file on where things have been
  lately.  If you
  export a ZFS pool, swap the physical drives around,
  and import it,
  everything should be fine.  If you don't export
  first, you may have to
  give it a bit of help.  And there are pathological
  cases where for example
  you don't have a link in the /dev/dsk directory which
  can cause a default
  import to not find all the pieces of a pool.
 
 Indeed. Before I wised up and bought an HBA for my RAIDZ2 array
 instead of using randomly-assorted SATA controllers, I tried
 rearranging some disks without exporting the pool first. I almost had
 a heart attack when the system came up reporting corrupted data on
 the drives that had been switched. As it turned out, I just needed to
 export and re-import the pool, and it was fine after that. Needless
 to say, when the HBA went in, I made sure to export the pool FIRST.

In my limited testing (with an HBA-based system), I've been able to
move drives around without exporting first... but it sounds like good
practice to export anyway, just to be on the safe side. :)

Thanks,
Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To reserve space

2010-03-24 Thread Brandon High
On Wed, Mar 24, 2010 at 11:18 AM, Edward Ned Harvey
solar...@nedharvey.comwrote:

 Out of curiosity, the more general solution would be the ability to create
 a
 reservation on a per-user or per-group basis (just like you create quotas
 on
 a per-user or per-group basis).  Is this possible?


OpenSolaris's zfs has supported per-user quotas for a little while, so make sure
you're using a recent build. I'm not sure if it's in Solaris 10, but I
believe it is.

Before user quotas were supported, the answer was to create a new dataset per
user, e.g. tank/home/user1, tank/home/user2, etc. It's easy to do in zfs, but
it doesn't always work for storage that is shared between users.
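For example (dataset, user, and group names and the sizes are only
illustrations), the per-user/per-group quota properties look like:

  zfs set userquota@joeuser=20G tank/home
  zfs set groupquota@staff=500G tank/home
  zfs get userquota@joeuser tank/home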

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Edward Ned Harvey
  Which is why ZFS isn't a replacement for proper array controllers
 (defining proper as those with sufficient battery to leave you with a
 seemingly intact filesystem), but a very nice augmentation for them. ;)
 
 Nothing prevents a clever chap from building a ZFS-based array
 controller
 which includes nonvolatile write cache. However, the economics suggest
 that the hybrid storage pool model can provide a highly dependable
 service
 at a lower price-point than the traditional array designs.

I don't have finished results that are suitable for sharing yet, but I'm
doing a bunch of benchmarks right now that suggest:

-1-  WriteBack enabled is much faster for writing than WriteThrough.  (duh.)
-2-  Ditching the WriteBack, and using a ZIL instead, is even faster than
that.

Oddly, the best performance seems to be using ZIL, with all the disks
WriteThrough.  You actually get slightly lower performance if you enable the
ZIL together with WriteBack.  My theory to explain the results I'm seeing
is:  Since the ZIL performs best for zillions of tiny write operations and
the spindle disks perform best for large sequential writes, I suspect the
ZIL accumulates tiny writes until they add up to a large sequential write,
and then they're flushed to spindle disks.  In this configuration, the HBA
writeback cannot add any benefit, because the datastreams are already
optimized for the device they're writing to.  Yet, by enabling the
WriteBack, you introduce a small delay before writes begin to hit the
spindle.  By switching to WriteThrough, you actually get better performance.
As counter-intuitive as that may seem.  :-)

So, if you've got access to a pair of decent ZIL devices, it's actually
faster and more reliable to run all your raid and caching and buffering via
ZFS instead of using a fancy HBA.
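For reference, a dedicated log device (slog), which is what "using a ZIL" above
refers to, is added with a one-liner (pool and device names are only
illustrations; mirror the slog if you can):

  zpool add tank log c3t0d0
  # or, mirrored:
  zpool add tank log mirror c3t0d0 c3t1d0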

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Dedup Performance

2010-03-24 Thread Miles Nordin
 srbi == Steve Radich, BitShop, Inc ste...@bitshop.com writes:

  srbi 
http://www.bitshop.com/Blogs/tabid/95/EntryId/78/Bug-in-OpenSolaris-SMB-Server-causes-slow-disk-i-o-always.aspx

I'm having trouble understanding many things in here like ``our file
move'' (moving what from where to where with what protocol?) and
``with SMB running'' (with the server enabled on Solaris, with
filesystems mounted, with activity on the mountpoints?  what does
running mean?) and ``RAID-0/stripe reads is the slow point'' (what
does this mean?  How did you determine which part of the stack is
limiting the observed speed?  This is normally quite difficult and
requires comparing several experiments, not doing just one experiment
like ``a file move between zfs pools''.).  What is ``bytes the
negotiated protocol allows''?  mtu, mss, window size?  Can you show us
in what tool you see one number and where you see the other number
that's too big?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To reserve space

2010-03-24 Thread Edward Ned Harvey
The question is not how to create quotas for users.

The question is how to create reservations for users.

 

One way to create a reservation for a user is to create a quota for everyone
else, but that's a little less manageable, so a reservation per-user would
be cleaner and more desirable.

 

 

 

 

From: Brandon High [mailto:bh...@freaks.com] 
Sent: Wednesday, March 24, 2010 2:33 PM
To: Edward Ned Harvey
Cc: Freddie Cash; zfs-discuss
Subject: Re: [zfs-discuss] To reserve space

 

On Wed, Mar 24, 2010 at 11:18 AM, Edward Ned Harvey solar...@nedharvey.com
wrote:

Out of curiosity, the more general solution would be the ability to create a
reservation on a per-user or per-group basis (just like you create quotas on
a per-user or per-group basis).  Is this possible?


OpenSolaris's zfs has supported quotas for a little while, so make sure
you're using a recent build. I'm not sure if it's in Solaris 10, but I
believe it is.

Before quotas were supported, the answer was to create a new dataset per
user, eg: tank/home/user1, tank/home/user2, etc. It's easy to do in zfs, but
it doesn't always work for storage that is shared between users.

-B


-- 
Brandon High : bh...@freaks.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Karl Rossing

On 03/24/10 12:54, Richard Elling wrote:


Nothing prevents a clever chap from building a ZFS-based array controller
which includes nonvolatile write cache.


+1 to that.  Something that is inexpensive and small (4GB?) and works in 
a PCI express slot.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To reserve space

2010-03-24 Thread David Magda
On Wed, March 24, 2010 14:36, Edward Ned Harvey wrote:
 The question is not how to create quotas for users.

 The question is how to create reservations for users.

There is currently no way to do per-user reservations. That ZFS property
is only available per-file system.


Even per-user and per-group quotas are a recent addition (requested a lot
from academic environments). For most of the existence of ZFS, only
per-file system (i.e., data set) quotas were available.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Svein Skogen

On 24.03.2010 19:53, Karl Rossing wrote:
 On 03/24/10 12:54, Richard Elling wrote:

 Nothing prevents a clever chap from building a ZFS-based array controller
 which includes nonvolatile write cache.
 
 +1 to that.  Something that is inexpensive and small (4GB?) and works in
 a PCI express slot.

Maybe someone should look at implementing the zfs code for the XScale
range of io-processors (such as the IOP333)?

//Svein

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS backup configuration

2010-03-24 Thread Wolfraider
Sorry if this has been discussed before. I tried searching but I couldn't find 
any info about it. We would like to export our ZFS configurations in case we 
need to import the pool onto another box. We do not want to back up the actual 
data in the zfs pool; that is already handled by another program.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS backup configuration

2010-03-24 Thread Eric D. Mudama

On Wed, Mar 24 at 12:20, Wolfraider wrote:

Sorry if this has been dicussed before. I tried searching but I
couldn't find any info about it. We would like to export our ZFS
configurations in case we need to import the pool onto another
box. We do not want to backup the actual data in the zfs pool, that
is already handled through another program.


I'm pretty sure the configuration is embedded in the pool itself.
Just import on the new machine.  You may need to force the import (-f) if
the pool wasn't exported cleanly on the old system.
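Roughly (the pool name is only an example):

  # on the old host, if it is still running:
  zpool export tank
  # on the new host:
  zpool import          # lists pools found on the attached devices
  zpool import tank     # add -f if the pool was not cleanly exported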

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pool vdev imbalance - getting worse?

2010-03-24 Thread Ian Collins

On 02/28/10 08:09 PM, Ian Collins wrote:
I was running zpool iostat on a pool comprising a stripe of raidz2 
vdevs that appears to be writing slowly and I notice a considerable 
imbalance of both free space and write operations.  The pool is 
currently feeding a tape backup while receiving a large filesystem.


Is this imbalance normal?  I would expect a more even distribution as 
the pool configuration hasn't been changed since creation.


The system is running Solaris 10 update 7.

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        15.9T  2.19T     87    119  2.34M  1.88M
  raidz2    2.90T   740G     24     27   762K  95.5K
  raidz2    3.59T  37.8G     20      0   546K      0
  raidz2    3.58T  44.1G     27      0  1.01M      0
  raidz2    3.05T   587G      7     47  24.9K  1.07M
  raidz2    2.81T   835G      8     45  30.9K   733K
----------  -----  -----  -----  -----  -----  -----


This system has since been upgraded, but the imbalance is getting worse:

zpool iostat -v tank | grep raid
  raidz2  3.60T  28.5G166 41  6.97M   764K
  raidz2  3.59T  33.3G170 35  7.35M   709K
  raidz2  3.60T  26.1G173 35  7.36M   658K
  raidz2  1.69T  1.93T129 46  6.70M   610K
  raidz2  2.25T  1.38T124 54  5.77M   967K

Is there any way to determine how this is happening?

I may have to resort to destroying and recreating some large 
filesystems, but there's no way to determine which ones to target...


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS backup configuration

2010-03-24 Thread Khyron
Yes, I think Eric is correct.

Funny, this is an adjunct to the thread I started entitled "Thoughts on ZFS Pool
Backup Strategies".  I was going to include this point in that thread but
thought better of it.

It would be nice if there were an easy way to extract a pool configuration, with
all of the dataset properties, ACLs, etc. so that you could easily reload it into
a new pool.  I could see this being useful in a disaster recovery sense, and I'm
sure people smarter than I can think of other uses.

From my reading of the documentation and man pages, I don't see that any such
command currently exists.  Something that would allow you to dump the config
into a file and read it back from a file using typical Unix semantics like
STDIN/STDOUT.  I was thinking something like:

zpool dump <pool> [-o filename]

zpool load <pool> [-f filename]

Without -o or -f, the output would go to STDOUT or the input would come from
STDIN, so you could use this in pipelines.  If you have a particularly long-lived
and stable pool, or one that has been through many upgrades, this might be a nice
way to save a configuration that you could restore later (if necessary) with a
single command.

Thoughts?

On Wed, Mar 24, 2010 at 15:31, Eric D. Mudama edmud...@bounceswoosh.orgwrote:

 On Wed, Mar 24 at 12:20, Wolfraider wrote:

 Sorry if this has been dicussed before. I tried searching but I
 couldn't find any info about it. We would like to export our ZFS
 configurations in case we need to import the pool onto another
 box. We do not want to backup the actual data in the zfs pool, that
 is already handled through another program.


 I'm pretty sure the configuration is embedded in the pool itself.
 Just import on the new machine.  You may need --force/-f the pool
 wasn't exported on the old system properly.

 --eric

 --
 Eric D. Mudama
 edmud...@mail.bounceswoosh.org


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
You can choose your friends, you can choose the deals. - Equity Private

If Linux is faster, it's a Solaris bug. - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-24 Thread Harry Putnam
Brandon High bh...@freaks.com writes:

 Someone pointed out that you can use bart, but that also scans the
 directories. It might do what you want, but it doesn't work at the zpool /
 zfs level, just at the file level layer.

Apparently I missed any suggestion about bart, but looking it up just
now, I guess maybe in what they call `safe mode' where changed files aren't
deleted, it might be useful as a versioning tool.  Sounds like it
could be targeted with a little finer granularity than snapshots
generally can be.

At just a quick read, it really just sounds like rsync after it's been in a
severe wreck and was badly crippled.

Maybe `bart' handles windows files better than rsync?

I'm just curious why `bart' would be recommended over rsync?  Are there
abilities that make it more attractive?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-24 Thread Harry Putnam
Harry Putnam rea...@newsguy.com writes:

 At just a quick read, it really just sounds like rsync, after its been in a
 severe wreck and was badly crippled.

Oops, I may have looked at the wrong bart.  One of the first hits
Google turned up was:

   http://www.zhornsoftware.co.uk/bart/index.html

But I think maybe this `bart' is what was suggested:

   http://www.unisol.com/papers/bart_paper.html

This looks to be a comprehensive network backup system, but it would be
way overdone for what I was talking about.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Daniel Carosone
On Wed, Mar 24, 2010 at 08:02:06PM +0100, Svein Skogen wrote:
 Maybe someone should look at implementing the zfs code for the XScale
 range of io-processors (such as the IOP333)?

NetBSD runs on (many of) those.
NetBSD has an (in-progress, still-some-issues) ZFS port.

Hopefully they will converge in due course to provide exactly this.

The particularly nice thing would be that using ZFS in the RAID
controller firmware like this would result in contents that are
interchangeable with standard zfs, needing just an import/export.  This
is a big improvement over many other dedicated raid solutions, and
provides good comfort when thinking about recovery scenarios for a
controller failure. 

Unfortunately, it would mostly only be useful with zvols for
presentation to the host - there's not a good interface, and usually
not much RAM, for the controller to run the whole ZPL layer.   That
would still be useful for controllers running in non-ZFS servers, as
an alternative to external boxes with comstar and various transports.
If you could find a way to get a zfs send/recv stream through from the
controller, though, some interesting deployment possibilities open up.

--
Dan.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-24 Thread Christine Tran
 OOps, I may have looked at the wrong bart.

I think he meant this BART:
http://blogs.sun.com/gbrunett/entry/automating_solaris_10_file_integrity

I'm going to make one quick comment about this, despite my better
judgment telling me to keep quiet.  I don't think anyone should use ZFS
as a VCS like Subversion ... that's nuts!  How many developers on your
project?  How many sub projects, how many commits a day?  I just
started a new repo and I'm up in the hundreds in a few weeks.  Do you
want to keep that many snapshots around?  Someone is going to get the
idea to use ZFS like this, and 8 months from now, get bitter and
heart-broken and dump on ZFS for not behaving like a VCS, which it is
not.

VCS has logs, easy diffs, easy rollback, merge, branch, feeds into
build automation software, allows IDEs to be fed into it, integrates
with tracking tools ... none of which ZFS does (and I'm not saying
this like it's a bad thing.)

If you want to do a code release, say, 1.0, put that on a ZFS
filesystem, snapshot it, keep developing until you get to somewhere
you want to call a 1.1, snapshot that ... that is a wonderful thing to
do.  You can clone and make active an entire bundle of stuff.

But please, use versioning software (good ones are free, even) for
versioning and don't shoehorn ZFS.

CT
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS file system confusion

2010-03-24 Thread Chris Dunbar
Hello,

I have boxed myself into a mental corner and need some help getting out. I am 
confused about working with ZFS file systems. Here is a simple example of what 
has me confused: Let's say I create the ZFS file system tank/nfs and share that 
over NFS. Then I create the ZFS file systems tank/nfs/foo1 and tank/nfs/foo2. I 
want to manage snapshots independently for foo1 and foo2, but I would like to 
be able to access both from the single NFS share for tank/nfs. Here are my 
questions:

1. Can I in fact access foo1 and foo2 through the NFS share of tank/nfs or do I 
need to create separate NFS shares for each of them?

2. Is there any difference in interacting with foo1 and foo2 through the 
tank/nfs share versus interacting with them directly? I don't even know if that 
question makes sense, but it's at the heart of my confusion - nesting file 
systems.

3. If I make a snapshot of tank/nfs, does it include the data in foo1 and foo2 
or are they excluded since they are separate ZFS file systems?

Thanks for your help.

Regards,
Chris Dunbar

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS file system confusion

2010-03-24 Thread Brandon High
2010/3/24 Chris Dunbar cdun...@earthside.net

 I have boxed myself into a mental corner and need some help getting out. I
 am confused about working with ZFS file systems. Here is a simple example of
 what has me confused: Let's say I create the ZFS file system tank/nfs and
 share that over NFS. Then I create the ZFS file systems tank/nfs/foo1 and
 tank/nfs/foo2. I want to manage snapshots independently for foo1 and foo2,
 but I would like to be able to access both from the single NFS share for
 tank/nfs. Here are my questions:

 1. Can I in fact access foo1 and foo2 through the NFS share of tank/nfs or
 do I need to create separate NFS shares for each of them?


No, but sort of yes.

If you mount server:/nfs on another host, it will not include
server:/nfs/foo1 or server:/nfs/foo2. Some nfs clients (notably Solaris's)
will attempt to mount the foo1 and foo2 datasets automatically, so it looks
like you've exported everything under server:/nfs. Linux clients don't
behave in the same fashion; you'll have to mount all the exports separately.

The sharenfs property will be inherited by the descendant datasets, so if
you set it on tank/nfs, tank/nfs/foo1 will have the same settings.

2. Is there any difference in interacting with foo1 and foo2 through the
 tank/nfs share versus interacting with them directly? I don't even know if
 that question makes sense, but it's at the heart of my confusion - nesting
 file systems.


There are some functions that are unavailable, such as retrieving the zfs
settings, etc. I'm not really sure about specifics.

Depending on the client nfs version, you may not be able to manipulate acls
from clients.


 3. If I make a snapshot of tank/nfs, does it include the data in foo1 and
 foo2 or are they excluded since they are separate ZFS file systems?


No, foo1 and foo2 are separate datasets and have completely independent
snapshots.

You can do 'zfs snapshot -r tank/nfs@snapname', which will make a recursive snapshot.
All the datasets under tank/nfs will have a snapshot taken at exactly the same
transaction. I'm guessing that's what you'd want?
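For example (snapshot names are only illustrations):

  # independent snapshots per child dataset:
  zfs snapshot tank/nfs/foo1@tuesday
  zfs snapshot tank/nfs/foo2@nightly
  # or one consistent recursive snapshot of everything under tank/nfs:
  zfs snapshot -r tank/nfs@nightly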

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS file system confusion

2010-03-24 Thread Chris Dunbar
Brandon,

Thank you for the explanation. It looks like I will have to share out each file 
system. I was trying to keep the number of shares manageable, but it sounds 
like that won't work.

Regards,
Chris

On Mar 24, 2010, at 9:36 PM, Brandon High wrote:

 2010/3/24 Chris Dunbar cdun...@earthside.net
 I have boxed myself into a mental corner and need some help getting out. I am 
 confused about working with ZFS file systems. Here is a simple example of 
 what has me confused: Let's say I create the ZFS file system tank/nfs and 
 share that over NFS. Then I create the ZFS file systems tank/nfs/foo1 and 
 tank/nfs/foo2. I want to manage snapshots independently for foo1 and foo2, 
 but I would like to be able to access both from the single NFS share for 
 tank/nfs. Here are my questions:
 
 1. Can I in fact access foo1 and foo2 through the NFS share of tank/nfs or do 
 I need to create separate NFS shares for each of them?
 
 No, but sort of yes.
 
 If you mount server:/nfs on another host, it will not include 
 server:/nfs/foo1 or server:/nfs/foo2. Some nfs clients (notably Solaris's) 
 will attempt to mount the foo1  foo2 datasets automatically, so it looks 
 like you've exported everything under server:/nfs. Linux clients don't behave 
 in the same fashion, you'll have to separately mount all the exports.
 
 The sharenfs property will be inherited by the descendant datasets, so if you 
 set it on tank/nfs, tank/nfs/foo1 will have the same settings. 
 
 2. Is there any difference in interacting with foo1 and foo2 through the 
 tank/nfs share versus interacting with them directly? I don't even know if 
 that question makes sense, but it's at the heart of my confusion - nesting 
 file systems.
 
 There are some functions that are unavailable, such as retrieving the zfs 
 settings, etc. I'm not really sure about specifics.
 
 Depending on the client nfs version, you may not be able to manipulate acls 
 from clients.
  
 3. If I make a snapshot of tank/nfs, does it include the data in foo1 and 
 foo2 or are they excluded since they are separate ZFS file systems?
 
 No, foo1 and foo2 are separate datasets and have completely independent 
 snapshots.
 
 You can do 'zfs snapshot -r tank/nfs' which will make a recursive snapshot. 
 All the datasets under tank/nfs will have a snapshot taken at the exact same 
 transaction. I'm guessing that's what you'd want?
 
 -B 
 
 -- 
 Brandon High : bh...@freaks.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS file system confusion

2010-03-24 Thread Brandon High
On Wed, Mar 24, 2010 at 6:39 PM, Chris Dunbar cdun...@earthside.net wrote:

 Thank you for the explanation. It looks like I will have to share out each
 file system. I was trying to keep the number of shares manageable, but it
 sounds like that won't work.


Thanks to inheritance, it's easier than you think when you've laid out your
datasets properly. If all the datasets you want to export are descendants of
the same starting point, you'll only need to set sharenfs once.
Management on the opensolaris box is easy, but you may have to do some
clever automounter configs on other hosts.
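For example (the share options are only an illustration):

  zfs set sharenfs=rw tank/nfs    # descendants such as tank/nfs/foo1 inherit this
  zfs get -r sharenfs tank/nfs    # verify what each dataset inherited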

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Fajar A. Nugraha
On Thu, Mar 25, 2010 at 1:02 AM, Edward Ned Harvey
solar...@nedharvey.com wrote:
 I think the point is to say:  ZFS software raid is both faster and more
 reliable than your hardware raid.  Surprising though it may be for a
 newcomer, I have statistics to back that up,

Can you share it?

 You will do best if you configure the raid controller to JBOD.

Problem: HP's storage controller doesn't support that mode.

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Carson Gaspar

Fajar A. Nugraha wrote:

On Thu, Mar 25, 2010 at 1:02 AM, Edward Ned Harvey
solar...@nedharvey.com wrote:

I think the point is to say:  ZFS software raid is both faster and more
reliable than your hardware raid.  Surprising though it may be for a
newcomer, I have statistics to back that up,


Can you share it?


You will do best if you configure the raid controller to JBOD.


Problem: HP's storage controller doesn't support that mode.


It does, ish: it forces you to create a bunch of single-disk RAID 0 
logical drives. It's what we do at work on our HP servers running ZFS.


The bigger problem is that you have to script around a disk failure, as 
the array won't bring a non-redundant logicaldrive back online after a 
disk failure without being kicked (which is a good thing in general, but 
annoying for ZFS).


--
Carson


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Carson Gaspar

Carson Gaspar wrote:

Fajar A. Nugraha wrote:

On Thu, Mar 25, 2010 at 1:02 AM, Edward Ned Harvey
solar...@nedharvey.com wrote:

I think the point is to say:  ZFS software raid is both faster and more
reliable than your hardware raid.  Surprising though it may be for a
newcomer, I have statistics to back that up,


Can you share it?


You will do best if you configure the raid controller to JBOD.


Problem: HP's storage controller doesn't support that mode.


It does, ish. It forces you to create a bunch of single disk raid 0 
logical drives. It's what we do at work on our HP servers running ZFS.


The bigger problem is that you have to script around a disk failure, as 
the array won't bring a non-redundant logicaldrive back online after a 
disk failure without being kicked (which is a good thing in general, but 
annoying for ZFS).


*sigh* too tired - I meant after you replace a failed disk. Obviously it 
won't come back online while the disk is failed...


--
Carson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Carson Gaspar

Fajar A. Nugraha wrote:

On Thu, Mar 25, 2010 at 10:31 AM, Carson Gaspar car...@taltos.org wrote:

Fajar A. Nugraha wrote:

You will do best if you configure the raid controller to JBOD.

Problem: HP's storage controller doesn't support that mode.

It does, ish. It forces you to create a bunch of single disk raid 0 logical
drives. It's what we do at work on our HP servers running ZFS.


that's different. Among other things, it won't allow tools like
smartctl to work.


The bigger problem is that you have to script around a disk failure, as the
array won't bring a non-redundant logicaldrive back online after a disk
failure without being kicked (which is a good thing in general, but annoying
for ZFS).


How do you replace a bad disk then? Is there some userland tool for
opensolaris which can tell the HP array to bring that disk back up? Or
do you have to restart the server, go to BIOS, and enable it there?


hpacucli will do it (usually /opt/HPQacucli/sbin/hpacucli). You need to:

# Wipe the new disk. Not strictly necessary, but I'm paranoid
hpacucli ctrl slot=$n physicaldrive $fixeddrive modify erase
# And online the LD...
hpacucli ctrl slot=$n logicaldrive $ld modify reenable forced

--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS backup configuration

2010-03-24 Thread Freddie Cash
On Wed, Mar 24, 2010 at 4:00 PM, Khyron khyron4...@gmail.com wrote:

 Yes, I think Eric is correct.

 Funny, this is an adjunct to the thread I started entitled Thoughts on ZFS
 Pool
 Backup Strategies.  I was going to include this point in that thread but
 thought
 better of it.

 It would be nice if there were an easy way to extract a pool configuration,
 with
 all of the dataset properties, ACLs, etc. so that you could easily reload
 it into a
 new pool.  I could see this being useful in a disaster recovery sense, and
 I'm
 sure people smarter than I can think of other uses.

 From my reading of the documentation and man pages, I don't see that any
 such command currently exists.  Something that would allow you dump the
 config into a file and read it back from a file using typical Unix
 semantics like
 STDIN/STDOUT.  I was thinking something like:

 zpool dump pool [-o filename]

 zpool load pool [-f filename]

 Without -o or -f, the output would go to STDOUT or the input would come

 from STDIN, so you could use this in pipelines.  If you have a particularly
 long
 lived and stable pool, or one that has been through many upgrades, this
 might
 be a nice way to save a configuration that you could restore later (if
 necessary)
 with a single command.


I don't use ACLs, but you can get the pool configuration and dataset
properties via 'zfs get all poolname'.  With some fancy scripting, you should be
able to come up with something that would take that output and recreate the
pool with the same settings.
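For example, something like this captures a starting point (pool name and output
file names are only illustrations):

  # every property of every dataset, tab-separated, parsable:
  zfs get -rHp all tank > tank-all-properties.txt
  # or just the properties that were explicitly set on this pool:
  zfs get -rH -s local all tank > tank-local-properties.txt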

-- 
Freddie Cash
fjwc...@gmail.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss