Re: [zfs-discuss] zfs and small files

2007-09-21 Thread Roch - PAE

Claus Guttesen writes:
I have many small - mostly jpg - files where the original file is
approx. 1 MB and the thumbnail generated is approx. 4 KB. The files
are currently on vxfs. I have copied all files from one partition onto
a zfs-ditto. The vxfs-partition occupies 401 GB and zfs 449 GB. Most
files uploaded are in jpg and all thumbnails are always jpg.
  
   Is there a problem?
  
  Not by the disk usage itself. But if zfs takes up 12% more space than
  vxfs, our current 80 TB of storage will become 89 TB and add cost.
  
   Also, how are you measuring this (what commands)?
  
  I did a 'df -h'.
  
Will a different volblocksize (during creation of the partition) make
better use of the available diskspace? Will (meta)data require less
space if compression is enabled?
  
   volblocksize won't have any effect on file systems; it is for zvols.
   Perhaps you mean recordsize?  But recall that recordsize is the maximum
   limit, not the actual limit, which is decided dynamically.
  
I read 
http://www.opensolaris.org/jive/thread.jspa?threadID=37673&tstart=105
which is very similar to my case except for the file type. But no
clear pointers otherwise.
  
   A good start would be to find the distribution of file sizes.
  
  The files are approx. 1 MB with a thumbnail of approx. 4 KB.
  

So the 1 MB files are stored as ~8 x 128K recordsize.

Because of 
5003563 use smaller tail block for last block of object

The last block of your file is partially used. It will depend
on your filesize distribution, but without that info we can
only guess that we're wasting an average of 64K per file, or 6%.
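
For example, a rough histogram (counts of files per 128K bucket) could be
gathered with something like this sketch; the path is only a placeholder:

# find /vxfs/images -type f -exec ls -l {} \; | \
      nawk '{ b = int($5 / 131072); c[b]++ }
            END { for (i in c) print i*128 "K-" (i+1)*128 "K:", c[i] }'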

If your distribution is such that most files are slightly
more than 1M, then we'd have 12% overhead from this effect.

So using a 16K/32K recordsize would quite possibly help, as
files would be stored using ~64 x 16K blocks with an
overhead of about 1-2% (0.5 blocks wasted every 64).
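
As a quick experiment (dataset and path names are only placeholders, and
note that recordsize only affects files written after it is set):

# zfs create tank/images16k
# zfs set recordsize=16k tank/images16k
# cp -rp /vxfs/images/. /tank/images16k/
# zfs list -o name,used,refer tank/images16k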


-r


  -- 
  regards
  Claus
  
  When lenity and cruelty play for a kingdom,
  the gentlest gamester is the soonest winner.
  
  Shakespeare
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs and small files

2007-09-21 Thread Claus Guttesen
 So the 1 MB files are stored as ~8 x 128K recordsize.

 Because of
 5003563 use smaller tail block for last block of object

 The last block of your file is partially used. It will depend
 on your filesize distribution, but without that info we can
 only guess that we're wasting an average of 64K per file, or 6%.

 If your distribution is such that most files are slightly
 more than 1M, then we'd have 12% overhead from this effect.

 So using a 16K/32K recordsize would quite possibly help, as
 files would be stored using ~64 x 16K blocks with an
 overhead of about 1-2% (0.5 blocks wasted every 64).

I will (re)create the partition and modify the recordsize. I was
unwilling to do so when I read the man page which discourages
modifying this setting unless a database was used.

Does zfs use suballocation if a file does not use an entire
recordsize? If not, the thumbnails probably waste the most space. They are
approx. 4 KB.

I'll be testing recordsizes from 1K and upwards. Actually 1K made zfs
very slow but 2K seems fine. I'll report back when the entire
partition has been copied. When I find the sweet spot I'll try to
enable (default) compression.

Thank you.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs and small files

2007-09-21 Thread Roch - PAE

Claus Guttesen writes:
   So the 1 MB files are stored as ~8 x 128K recordsize.
  
   Because of
   5003563 use smaller tail block for last block of object
  
   The last block of your file is partially used. It will depend
   on your filesize distribution, but without that info we can
   only guess that we're wasting an average of 64K per file, or 6%.
  
   If your distribution is such that most files are slightly
   more than 1M, then we'd have 12% overhead from this effect.
  
   So using a 16K/32K recordsize would quite possibly help, as
   files would be stored using ~64 x 16K blocks with an
   overhead of about 1-2% (0.5 blocks wasted every 64).
  
  I will (re)create the partition and modify the recordsize. I was
  unwilling to do so when I read the man page which discourages
  modifying this setting unless a database was used.
  
  Does zfs use suballocation if a file does not use an entire
  recordsize? If not, the thumbnails probably waste the most space. They are
  approx. 4 KB.
  

Files smaller than 'recordsize' are stored using a multiple
of the sector size. So small files should not factor in this 
equation.
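
One quick way to confirm that on your own data (paths are only examples):

# ls -l /tank/images/thumb.jpg      (logical size)
# du -k /tank/images/thumb.jpg      (blocks actually allocated, in KB)

A ~4 KB thumbnail should show only a few KB allocated, regardless of the
filesystem's recordsize.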


  I'll be testing recordsizes from 1K and upwards. Actually 1K made zfs
  very slow but 2K seems fine. I'll report back when the entire
  partition has been copied. When I find the sweet spot I'll try to
  enable (default) compression.

Beware, because at 2K you might be generating more indirect
blocks. For 1 MB files the gains from using a recordsize
smaller than 16K start to be quite small.
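
Rough arithmetic, assuming ~1 MB files: at 128K recordsize a file needs ~8
data blocks, at 16K ~64, and at 2K ~512. The average tail-block waste drops
from ~64K to ~8K to ~1K, but the number of block pointers (and hence
indirect blocks and metadata) grows by the same factor.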

-r

  
  Thank you.
  
  -- 
  regards
  Claus
  
  When lenity and cruelty play for a kingdom,
  the gentlest gamester is the soonest winner.
  
  Shakespeare

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS (and quota)

2007-09-21 Thread Pawel Jakub Dawidek
I'm CCing zfs-discuss@opensolaris.org, as this doesn't look like a
FreeBSD-specific problem.

It looks like there is a problem with block allocation(?) when we are near
the quota limit. The tank/foo dataset has its quota set to 10m:

Without quota:

FreeBSD:
# dd if=/dev/zero of=/tank/test bs=512 count=20480
time: 0.7s

Solaris:
# dd if=/dev/zero of=/tank/test bs=512 count=20480
time: 4.5s

With quota:

FreeBSD:
# dd if=/dev/zero of=/tank/foo/test bs=512 count=20480
dd: /tank/foo/test: Disc quota exceeded
time: 306.5s

Solaris:
# dd if=/dev/zero of=/tank/foo/test bs=512 count=20480
write: Disc quota exceeded
time: 602.7s

CPU is almost entirely idle, but disk activity seems to be high.

Any ideas?
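
For reference, the test setup is roughly the following (the device name is
a placeholder):

# zpool create tank c0t1d0
# zfs create tank/foo
# zfs set quota=10m tank/foo
# dd if=/dev/zero of=/tank/foo/test bs=512 count=20480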

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS panic when trying to import pool

2007-09-21 Thread Geoffroy Doucet
OK, I found the problem with 0x06: one disk was missing. But now I have all my
disks and I get 0x05:
Sep 21 10:25:53 unknown ^Mpanic[cpu0]/thread=ff0001e12c80:
Sep 21 10:25:53 unknown genunix: [ID 603766 kern.notice] assertion failed:
dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 == 0x0), file:
../../common/fs/zfs/space_map.c, line: 339
Sep 21 10:25:53 unknown unix: [ID 10 kern.notice]
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e124f0 
genunix:assfail3+b9 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12590 
zfs:space_map_load+2ef ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e125d0 
zfs:metaslab_activate+66 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12690 
zfs:metaslab_group_alloc+24e ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12760 
zfs:metaslab_alloc_dva+192 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12800 
zfs:metaslab_alloc+82 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12850 
zfs:zio_dva_allocate+68 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12870 
zfs:zio_next_stage+b3 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e128a0 
zfs:zio_checksum_generate+6e ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e128c0 
zfs:zio_next_stage+b3 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12930 
zfs:zio_write_compress+239 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12950 
zfs:zio_next_stage+b3 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e129a0 
zfs:zio_wait_for_children+5d ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e129c0 
zfs:zio_wait_children_ready+20 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e129e0 
zfs:zio_next_stage_async+bb ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12a00 
zfs:zio_nowait+11 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12a80 
zfs:dmu_objset_sync+196 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12ad0 
zfs:dsl_dataset_sync+5d ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12b40 
zfs:dsl_pool_sync+b5 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12bd0 
zfs:spa_sync+1c5 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12c60 
zfs:txg_sync_thread+19a ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12c70 
unix:thread_start+8 ()

There is no SCSI on the disks, because those are virtual disks. Also, for anyone
who is interested, I wrote a little program to show the properties of the vdev:

http://www.projectvolcano.org/zfs/list_vdev.c.

Here is a sample output:
bash-3.00# ./list_vdev -d /dev/dsk/c1t12d0s0 
Vdev properties for /dev/dsk/c1t12d0s0:
version: 0x0003
name: share02
state: 0x0001
txg: 0x003fd0e4
pool_guid: 0x88f93fc54c215cfa
top_guid: 0x65400f2e7db0c2a5
guid: 0xfc3b9af2d3b6fd46
vdev_tree: type: raidz
id: 0x
guid: 0x65400f2e7db0c2a5
nparity: 0x0001
metaslab_array: 0x000d
metaslab_shift: 0x001e
ashift: 0x0009
asize: 0x00196e0c
children: [
[0]
type: disk
id: 0x
guid: 0xfc3b9af2d3b6fd46
path: /dev/dsk/c1t12d0s0
devid: id1,[EMAIL PROTECTED]/a
whole_disk: 0x0001
DTL: 0x004e
[1]
type: disk
id: 0x0001
guid: 0x377cc1a2beb3c985
path: /dev/dsk/c1t13d0s0
devid: id1,[EMAIL PROTECTED]/a
whole_disk: 0x0001
DTL: 0x004d
[2]
type: disk
id: 0x0002
guid: 0xe97db62ad7fe325d
path: /dev/dsk/c1t14d0s0
devid: id1,[EMAIL PROTECTED]/a
whole_disk: 0x0001
DTL: 0x0091
]


So my question is: is there a way to really know why I got the I/O error (0x05)? Is
there a way to find out in the debugger? How can I access it?
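
One avenue I could try (just a sketch, I have not verified it): inspect the
saved crash dump with mdb and the on-disk labels with zdb, e.g.

# cd /var/crash/`uname -n`
# mdb unix.0 vmcore.0
> ::status
> ::stack
# zdb -l /dev/dsk/c1t12d0s0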
 
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] The ZFS-Man.

2007-09-21 Thread Pawel Jakub Dawidek
Hi.

I gave a talk about ZFS during EuroBSDCon 2007, and because it won
the best talk award and some found it funny, here it is:

http://youtube.com/watch?v=o3TGM0T1CvE

a bit better version is here:

http://people.freebsd.org/~pjd/misc/zfs/zfs-man.swf

BTW, inspired by the ZFS demos from the OpenSolaris page, I created a few demos
of ZFS on FreeBSD:

http://youtube.com/results?search_query=freebsd+zfs&search=Search

And better versions:

http://people.freebsd.org/~pjd/misc/zfs/

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Andy Lubel



On 9/20/07 7:31 PM, Paul B. Henson [EMAIL PROTECTED] wrote:

 On Thu, 20 Sep 2007, Tim Spriggs wrote:
 
 It's an IBM re-branded NetApp which we are using for NFS and
 iSCSI.

Yeah, it's fun to see IBM compete with its OEM provider NetApp.
 
 Ah, I see.
 
 Is it comparable storage though? Does it use SATA drives similar to the
 x4500, or more expensive/higher performance FC drives? Is it one of the
 models that allows connecting dual clustered heads and failing over the
 storage between them?
 
 I agree the x4500 is a sweet looking box, but when making price comparisons
 sometimes it's more than just the raw storage... I wish I could just drop
 in a couple of x4500's and not have to worry about the complexity of
 clustering sigh...
 
 
zfs send/receive.


NetApp is great; we have about 6 varieties in production here. But for what I
pay in maintenance and up-front cost on just 2 filers, I can buy an x4500 a
year, and have a 3-year warranty each time I buy. It just depends on the
company you work for.

I haven't played too much with anything but NetApp and StorageTek, but once
I got started on zfs I just knew it was the future; and I think NetApp
realizes that too. And if Apple does what I think it will, it will only get
better :)

Fast, Cheap, Easy - you only get 2.  Zfs may change that.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server

2007-09-21 Thread Tim Spriggs
Gino wrote:
 The x4500 is very sweet and the only thing stopping us from buying two
 instead of another shelf is the fact that we have lost pools on Sol10u3
 servers and there is no easy way of making two pools redundant (ie the
 complexity of clustering.) Simply sending incremental snapshots is not a
 viable option.

 The pools we lost were pools on iSCSI (in a mirrored config) and they
 were mostly lost on zpool import/export. The lack of a recovery
 mechanism really limits how much faith we can put into our data on ZFS.
 It's safe as long as the pool is safe... but we've lost multiple pools.
 

 Hello Tim,
 did you try SNV60+ or S10U4 ?

 Gino
   
Hi Gino,

We need Solaris proper for these systems and we will have to
schedule a significant downtime to patch/update to U4.

-Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Mike Gerdts
On 9/20/07, Paul B. Henson [EMAIL PROTECTED] wrote:
 Again though, that would imply two different storage locations visible to
 the clients? I'd really rather avoid that. For example, with our current
 Samba implementation, a user can just connect to
 '\\files.csupomona.edu\username' to access their home directory or
 '\\files.csupomona.edu\groupname' to access a shared group directory.
 They don't need to worry on which physical server it resides or determine
 what server name to connect to.

MS-DFS could be helpful here.  You could have a virtual samba instance
that generates MS-DFS redirects to the appropriate spot.  At one point
in the past I wrote a script (long since lost - at a different job)
that would automatically convert automounter maps into the
appropriately formatted symbolic links used by the Samba MS-DFS
implementation.  It worked quite well for giving one place to
administer the location mapping while providing transparency to the
end-users.
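
A rough sketch of what that looks like (server, share and path names below
are made up; "host msdfs" and "msdfs root" are standard smb.conf options):

  [global]
      host msdfs = yes

  [files]
      path = /export/dfsroot
      msdfs root = yes

and each entry under /export/dfsroot is a symlink of the form:

# ln -s 'msdfs:server1\username' /export/dfsroot/username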

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zoneadm clone doesn't support ZFS snapshots in s10u4?

2007-09-21 Thread Cindy . Swearingen
Mike, Grant,

I reported the zoneadm.1m man page problem to the man page group.

I also added some stronger wording to the ZFS Admin Guide and the
ZFS FAQ about not using ZFS for zone root paths in the Solaris 10
release, and that upgrading or patching is not supported for either
the Solaris 10 or Solaris Express releases.

It is true that none of the current install/patch tools recognize
ZFS yet and this has been discussed many times before. You can follow
the ZFS boot/root project here:

http://opensolaris.org/os/community/zfs/boot/

I've tried to add all the things you *can't* do with ZFS in the ZFS
FAQ and the next topic to add is around the inability to split
mirrors.

If you are considering doing something with ZFS where recovering
would be difficult, then please ask here first. :-)

Cindy

Mike Gerdts wrote:
 On 9/19/07, grant beattie [EMAIL PROTECTED] wrote:
 
according to the zoneadm(1m) man page on s10u4:

 clone [-m copy] [-s zfs_snapshot] source_zone

 Install a zone by copying an  existing  installed  zone.
 This  subcommand  is  an  alternative way to install the
 zone.
 
 
 That's interesting... I reported this as a bug during the S10U2
 (11/06) beta and it got fixed for 11/06.  My bug report was closed as
 6480274 as duplicate of 6383119.  This was wrong - I was reporting a
 man page bug and not the feature request (CR 6383119).  The feature
 request from me was the previous bug that I filed in that beta
 program.  :)
 
 Someone really wants this man page to be out of sync with the command.
 
 The rather consistent answer is that zoneadm clone will not do zfs
 until live upgrade does zfs.  Since there is a new project in the
 works (Snap Upgrade) that is very much targeted at environments that
 use zfs, I would be surprised to see zfs support come into live
 upgrade.
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The ZFS-Man.

2007-09-21 Thread eric kustarz

On Sep 21, 2007, at 11:47 AM, Pawel Jakub Dawidek wrote:

 Hi.

 I gave a talk about ZFS during EuroBSDCon 2007, and because it won the
 the best talk award and some find it funny, here it is:

   http://youtube.com/watch?v=o3TGM0T1CvE

 a bit better version is here:

   http://people.freebsd.org/~pjd/misc/zfs/zfs-man.swf

Looks like Jeff has been working out :)


 BTW. Inspired by ZFS demos from OpenSolaris page I created few  
 demos of
 ZFS on FreeBSD:

   http://youtube.com/results?search_query=freebsd+zfssearch=Search

 And better versions:

   http://people.freebsd.org/~pjd/misc/zfs/

Nice.

eric

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Thu, 20 Sep 2007, Tim Spriggs wrote:

 The x4500 is very sweet and the only thing stopping us from buying two
 instead of another shelf is the fact that we have lost pools on Sol10u3
 servers and there is no easy way of making two pools redundant (ie the
 complexity of clustering.) Simply sending incremental snapshots is not a
 viable option.

 The pools we lost were pools on iSCSI (in a mirrored config) and they
 were mostly lost on zpool import/export. The lack of a recovery
 mechanism really limits how much faith we can put into our data on ZFS.
 It's safe as long as the pool is safe... but we've lost multiple pools.

Lost data doesn't give me a warm fuzzy 8-/. Were you running an officially
supported version of Solaris at the time? If so, what did Sun support have
to say about this issue?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The ZFS-Man.

2007-09-21 Thread Jonathan Edwards

On Sep 21, 2007, at 14:57, eric kustarz wrote:

 Hi.

 I gave a talk about ZFS during EuroBSDCon 2007, and because it won  
 the
 the best talk award and some find it funny, here it is:

  http://youtube.com/watch?v=o3TGM0T1CvE

 a bit better version is here:

  http://people.freebsd.org/~pjd/misc/zfs/zfs-man.swf

 Looks like Jeff has been working out :)

my first thought too:
http://blogs.sun.com/bonwick/resource/images/bonwick.portrait.jpg

funny - i always pictured this as UFS-man though:
http://www.benbakerphoto.com/business/47573_8C-after.jpg

but what's going on with the sheep there?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The ZFS-Man.

2007-09-21 Thread Torrey McMahon
Jonathan Edwards wrote:
 On Sep 21, 2007, at 14:57, eric kustarz wrote:

   
 Hi.

 I gave a talk about ZFS during EuroBSDCon 2007, and because it won  
 the
 the best talk award and some find it funny, here it is:

 http://youtube.com/watch?v=o3TGM0T1CvE

 a bit better version is here:

 http://people.freebsd.org/~pjd/misc/zfs/zfs-man.swf
   
 Looks like Jeff has been working out :)
 

 my first thought too:
 http://blogs.sun.com/bonwick/resource/images/bonwick.portrait.jpg

 funny - i always pictured this as UFS-man though:
 http://www.benbakerphoto.com/business/47573_8C-after.jpg

 but what's going on with the sheep there?

Got me but they do look kind of nervous. (Happy friday folks...)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Tim Spriggs
Paul B. Henson wrote:
 On Thu, 20 Sep 2007, Tim Spriggs wrote:

   
 The x4500 is very sweet and the only thing stopping us from buying two
 instead of another shelf is the fact that we have lost pools on Sol10u3
 servers and there is no easy way of making two pools redundant (ie the
 complexity of clustering.) Simply sending incremental snapshots is not a
 viable option.

 The pools we lost were pools on iSCSI (in a mirrored config) and they
 were mostly lost on zpool import/export. The lack of a recovery
 mechanism really limits how much faith we can put into our data on ZFS.
 It's safe as long as the pool is safe... but we've lost multiple pools.
 

 Lost data doesn't give me a warm fuzzy 8-/. Were you running an officially
 supported version of Solaris at the time? If so, what did Sun support have
 to say about this issue?
   

Sol 10 with just about all patches up to date.

I joined this list in hope of a good answer. After answering a few 
questions over two days I had no hope of recovering the data. Don't 
import/export (especially between systems) without serious cause, at 
least not with U3. I haven't tried updating our servers yet and I don't 
intend to for a while now. The filesystems contained databases that were 
luckily redundant and could be rebuilt, but our DBA was not too happy to 
have to do that at 3:00am.

I still have a pool that can not be mounted or exported. It shows up 
with zpool list but nothing under zfs list. zpool export gives me an IO 
error and does nothing. On the next downtime I am going to attempt to 
yank the lun out from under its feet (as gently as I can) after I have 
stopped all other services.

Still, we are using ZFS but we are re-thinking on how to deploy/manage 
it. Our original model had us exporting/importing pools in order to move 
zone data between machines. We had done the same with UFS on iSCSI 
without a hitch. ZFS worked for about 8 zone moves and then killed 2 
zones. The major operational difference between the moves involved a 
reboot of the global zones. The initial import worked but after the 
reboot the pools were in a bad state reporting errors on both drives in 
the mirror. I exported one (bad choice) and attempted to gain access to 
the other. Now attempting to import the first pool will panic a 
solaris/opensolaris box very reliably. The second is in the state I 
described above. Also, the drive labels are intact according to zdb.

When we don't move pools around, zfs seems to be stable on both Solaris 
and OpenSolaris. I've done snapshots/rollbacks/sends/receives/clones/... 
without problems. We even have zvols exported as mirrored luns from an 
OpenSolaris box. It mirrors the luns that the IBM/NetApp box exports and 
seems to be doing fine with that. There are a lot of other people that 
seem to have the same opinion and use zfs with direct attached storage.

-Tim

PS: when I have a lot of time I might try to reproduce this by:

m2# zpool create test mirror iscsi_lun1 iscsi_lun2
m2# zpool export test
m1# zpool import -f test
m1# reboot
m2# reboot
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread eric kustarz

On Sep 21, 2007, at 3:50 PM, Tim Spriggs wrote:

 Paul B. Henson wrote:
 On Thu, 20 Sep 2007, Tim Spriggs wrote:


 The x4500 is very sweet and the only thing stopping us from  
 buying two
 instead of another shelf is the fact that we have lost pools on  
 Sol10u3
 servers and there is no easy way of making two pools redundant  
 (ie the
 complexity of clustering.) Simply sending incremental snapshots  
 is not a
 viable option.

 The pools we lost were pools on iSCSI (in a mirrored config) and  
 they
 were mostly lost on zpool import/export. The lack of a recovery
 mechanism really limits how much faith we can put into our data  
 on ZFS.
 It's safe as long as the pool is safe... but we've lost multiple  
 pools.


 Lost data doesn't give me a warm fuzzy 8-/. Were you running an  
 officially
 supported version of Solaris at the time? If so, what did Sun  
 support have
 to say about this issue?


 Sol 10 with just about all patches up to date.

 I joined this list in hope of a good answer. After answering a few
 questions over two days I had no hope of recovering the data. Don't
 import/export (especially between systems) without serious cause, at
 least not with U3. I haven't tried updating our servers yet and I  
 don't
 intend to for a while now. The filesystems contained databases that  
 were
 luckily redundant and could be rebuilt, but our DBA was not too  
 happy to
 have to do that at 3:00am.

 I still have a pool that can not be mounted or exported. It shows up
 with zpool list but nothing under zfs list. zpool export gives me  
 an IO
 error and does nothing. On the next downtime I am going to attempt to
 yank the lun out from under its feet (as gently as I can) after I have
 stopped all other services.

 Still, we are using ZFS but we are re-thinking on how to deploy/manage
 it. Our original model had us exporting/importing pools in order to  
 move
 zone data between machines. We had done the same with UFS on iSCSI
 without a hitch. ZFS worked for about 8 zone moves and then killed 2
 zones. The major operational difference between the moves involved a
 reboot of the global zones. The initial import worked but after the
 reboot the pools were in a bad state reporting errors on both  
 drives in
 the mirror. I exported one (bad choice) and attempted to gain  
 access to
 the other. Now attempting to import the first pool will panic a
 solaris/opensolaris box very reliably. The second is in the state I
 described above. Also, the drive labels are intact according to zdb.

 When we don't move pools around, zfs seems to be stable on both  
 Solaris
 and OpenSolaris. I've done snapshots/rollbacks/sends/receives/ 
 clones/...
 without problems. We even have zvols exported as mirrored luns from an
 OpenSolaris box. It mirrors the luns that the IBM/NetApp box  
 exports and
 seems to be doing fine with that. There are a lot of other people that
 seem to have the same opinion and use zfs with direct attached  
 storage.

 -Tim

 PS: when I have a lot of time I might try to reproduce this by:

 m2# zpool create test mirror iscsi_lun1 iscsi_lun2
 m2# zpool export test
 m1# zpool import -f test
 m1# reboot
 m2# reboot


Since I haven't actually looked into what problem caused your pools
to become damaged/lost, I can only guess that it's possibly due to the
pool being actively imported on multiple machines (perhaps even
accidentally).

If it is that, you'll be happy to note that we specifically no longer
allow that to happen (unless you use the -f flag):
http://blogs.sun.com/erickustarz/entry/poor_man_s_cluster_end
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6282725

Looks like it just missed the s10u4 cut off, but should be in s10_u5.

In your above example, there should be no reason why you have to use  
the '-f' flag on import (the pool was cleanly exported) - when you're  
moving the pool from system to system, this can get you into trouble  
if things don't go exactly how you planned.
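
In other words, a clean hand-off should not need -f at all, roughly (same
hypothetical m1/m2 hosts as in your example):

m2# zpool export test
m1# zpool import test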

eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Tim Spriggs
eric kustarz wrote:

 On Sep 21, 2007, at 3:50 PM, Tim Spriggs wrote:

 m2# zpool create test mirror iscsi_lun1 iscsi_lun2
 m2# zpool export test
 m1# zpool import -f test
 m1# reboot
 m2# reboot

 Since I haven't actually looked into what problem caused your pools to
 become damaged/lost, I can only guess that it's possibly due to the
 pool being actively imported on multiple machines (perhaps even
 accidentally).

 If it is that, you'll be happy to note that we specifically no longer
 allow that to happen (unless you use the -f flag):
 http://blogs.sun.com/erickustarz/entry/poor_man_s_cluster_end
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6282725

 Looks like it just missed the s10u4 cut off, but should be in s10_u5.

 In your above example, there should be no reason why you have to use 
 the '-f' flag on import (the pool was cleanly exported) - when you're 
 moving the pool from system to system, this can get you into trouble 
 if things don't go exactly how you planned.

 eric

That's a very plausible diagnosis. Even when the pools are exported from
one system, they are still marked as attached (thus the -f was
necessary). Since I rebooted both systems at the same time, I guess it's
possible that they both laid claim to the pool and corrupted it.

I'm glad this will be fixed in the future.

-Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zoneadm clone doesn't support ZFS snapshots in

2007-09-21 Thread Christine Tran
grant beattie wrote:

 I don't have any advice, unfortunately, but I do know that in my case 
 putting zones on UFS is simply not an option. there must be a way 
 considering there is nothing in the documentation to suggest that zones 
 on ZFS are not supported.
 

There's a very explicit "Do not place the zonepath on ZFS for this
release" in this doc:
http://docs.sun.com/app/docs/doc/817-1592/z.conf.start-5?a=view


 one question though, why does patchadd care about filesystems in the 
 first place? what if I put my zones on VxFS, or QFS? I don't see why it 
 should make any difference to patchadd. live upgrade is obviously 
 another kettle of fish entirely, though.

Patch and install tools can't figure out pools yet. If you have a 1GB
pool with 10 filesystems on it, du reports each as having 1GB; do you have
10GB of capacity? The tools can't tell. Please check the archives; this
subject has been extensively discussed.
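
For example (pool name is just a placeholder):

# zfs list -o name,used,avail -r tank

Every filesystem in the pool reports the same pool-wide AVAIL, so a tool
that sums per-filesystem free space (as it could with separate UFS slices)
badly overstates the real capacity.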

CT
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zoneadm clone doesn't support ZFS snapshots in

2007-09-21 Thread Mike Gerdts
On 9/20/07, Matthew Flanagan [EMAIL PROTECTED] wrote:
 Mike,

 I followed your procedure for cloning zones and it worked
 well up until yesterday when I tried applying the S10U4
 kernel patch 12001-14 and it wouldn't apply because I had
 my zones on zfs :(

Thanks for sharing.  That sucks.

 I'm still figuring out how to fix this other than moving all of my zones onto 
 UFS.

How about a dtrace script that changes the fstype in what statvfs() returns,
so that it says it is ufs?  :)

I bet someone comes along and says that isn't supported either...

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zoneadm clone doesn't support ZFS snapshots in

2007-09-21 Thread Mike Gerdts
On 9/21/07, Christine Tran [EMAIL PROTECTED] wrote:
 patch and install tools can't figure out pools yet.  If you have a 1GB
 pool and 10 filesystems on it, du reports each having 1GB, do you have
 10GB capacity?  The tools can't tell.  Please check the archives, this
 subject has been extensively discussed.

Two responses come immediately to mind...

1) Thanks for protecting stupid/careless people from doing bad things.
2) UNIX has a longstanding tradition of adding a -f flag for cases
when the sysadmin realizes there is additional risk but feels that
appropriate precautions have been taken.

I would really like to ask Sun for a roadmap as to when this is going
to be supported.  Since this is the zfs list (not zones or install
list) and it is OpenSolaris (not Solaris) I guess I should probably
find a more appropriate forum.

So, for now I will use OpenSolaris where I can and wait patiently for
the new installer + snap upgrade basket and wait for it to find its
way into Solaris in about a year or two.  In the meantime, I'll
probably end up putting most zones on a particular competitor's NAS
devices and looking into how well their file system cloning
capabilities play in coordination with iSCSI.

<irony>
Oh, wait!  What if the NAS device runs out of space while I'm
patching?  Better rule out the thin provisioning capabilities of the
HDS storage that Sun sells as well.
</irony>

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Thu, 20 Sep 2007, eric kustarz wrote:

  As far as quotas, I was less than impressed with their implementation.

 Would you mind going into more details here?

The feature set was fairly extensive: they supported volume quotas for
users or groups, or qtree quotas, which, similar to the ZFS quota, would
limit space for a particular directory and all of its contents regardless
of user/group ownership.

But all quotas were set in a single flat text file. Anytime you added a new
quota, you needed to turn off quotas, then turn them back on, and quota
enforcement was disabled while it recalculated space utilization.

Like a lot of aspects of the filer, it seemed possibly functional but
rather kludgy. I hate kludgy :(. I'd have to go review the documentation to
recall the other issues I had with it, quotas were one of the last things
we reviewed and I'd about given up taking notes at that point.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, Andy Lubel wrote:

 Yeah its fun to see IBM compete with its OEM provider Netapp.

Yes, we had both IBM and Netapp out as well. I'm not sure what the point
was... We do have some IBM SAN equipment on site, I suppose if we had gone
with the IBM variant we could have consolidated support.

  sometimes it's more than just the raw storage... I wish I could just drop
  in a couple of x4500's and not have to worry about the complexity of
  clustering sigh...
 
 zfs send/receive.

If I understand correctly, that would be sort of a poor man's replication?
So you would end up with a physical copy on server2 of all of the data on
server1? What would you do when server1 crashed and died? One of the
benefits of a real cluster would be the automatic failover, and failback
when the server recovered.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, James F. Hranicky wrote:

  It just seems rather involved, and relatively inefficient to continuously
  be mounting/unmounting stuff all the time. One of the applications to be
  deployed against the filesystem will be web service, I can't really
  envision a web server with tens of thousands of NFS mounts coming and
  going, seems like a lot of overhead.

 Well, that's why ZFS wouldn't work for us :-( .

Although, I'm just saying that from my gut -- does anyone have any actual
experience with automounting thousands of file systems? Does it work? Is it
horribly inefficient? Poor performance? Resource intensive?


 Makes sense -- in that case you would be looking at multiple SMB servers,
 though.

Yes, with again the resultant problem of worrying about where a user's
files are when they want to access them :(.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, Mike Gerdts wrote:

 MS-DFS could be helpful here.  You could have a virtual samba instance
 that generates MS-DFS redirects to the appropriate spot.  At one point in

That's true, although I rather detest Microsoft DFS (they stole the acronym
from DCE/DFS, even though particularly the initial versions sucked
feature-wise in comparison). Also, the current release version of MacOS X
does not support CIFS DFS referrals. I'm not sure if the upcoming version
is going to rectify that or not. Windows clients not belonging to the
domain also occasionally have problems accessing shares across different
servers.

Although it is definitely something to consider if I'm going to be unable
to achieve my single namespace by having one large server...

Thanks...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs mount points (all-or-nothing)

2007-09-21 Thread Matthew Ahrens
msl wrote:
 Hello all,
  Is there a way to configure the zpool to legacy_mount, and have all
  filesystems in that pool mounted automatically?
  I will try to explain better:
  - Imagine that I have a zfs pool with 1000 filesystems.
  - I want to control the mount/unmount of that pool, so I configured
  the zpool to legacy_mount.
  - But I don't want to have to mount the other 1000 filesystems... so, when
  I issue a mount -F zfs mypool, all the filesystems would be mounted too (I
  think the mount property is per-filesystem).

I don't quite follow what behavior you are looking for.  When you say you 
want to control the mount/unmount of the pool, do you mean just the 
poolname filesystem, or all filesystems in the pool?

You may be looking for zfs set canmount=off poolname.  This will cause the 
poolname (top-most) filesystem to not be mounted, but all filesystems below 
it will be mounted as usual.
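
For example (pool name is just a placeholder):

# zfs set canmount=off mypool
# zfs mount -a

This mounts the 1000 child filesystems but leaves mypool itself unmounted.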

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, Tim Spriggs wrote:

 Still, we are using ZFS but we are re-thinking on how to deploy/manage
 it. Our original model had us exporting/importing pools in order to move
 zone data between machines. We had done the same with UFS on iSCSI
[...]
 When we don't move pools around, zfs seems to be stable on both Solaris
 and OpenSolaris. I've done snapshots/rollbacks/sends/receives/clones/...

Sounds like your problems are in an area we probably wouldn't be delving
into... Thanks for the detail.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Ed Plese
On Thu, Sep 20, 2007 at 12:49:29PM -0700, Paul B. Henson wrote:
   I was planning to provide CIFS services via Samba. I noticed a posting a
   while back from a Sun engineer working on integrating NFSv4/ZFS ACL 
   support
   into Samba, but I'm not sure if that was ever completed and shipped either
   in the Sun version or pending inclusion in the official version, does
   anyone happen to have an update on that? Also, I saw a patch proposing a
   different implementation of shadow copies that better supported ZFS
   snapshots, any thoughts on that would also be appreciated.
 
  This work is done and, AFAIK, has been integrated into S10 8/07.
 
 Excellent. I did a little further research myself on the Samba mailing
 lists, and it looks like ZFS ACL support was merged into the official
 3.0.26 release. Unfortunately, the patch to improve shadow copy performance
 on top of ZFS still appears to be floating around the technical mailing
 list under discussion.

ZFS ACL support was going to be merged into 3.0.26 but 3.0.26 ended up
being a security fix release and the merge got pushed back.  The next
release will be 3.2.0 and ACL support will be in there.

As others have pointed out though, Samba is included in Solaris 10
Update 4 along with support for ZFS ACLs, Active Directory, and SMF.

The patches for the shadow copy module can be found here:

http://www.edplese.com/samba-with-zfs.html

There are hopefully only a few minor changes that I need to make to them
before submitting them again to the Samba team.

I recently compiled the module for someone to use with Samba as shipped
with U4 and he reported that it worked well.  I've made the compiled
module available on this page as well if anyone is interested in testing
it.

The patch doesn't improve performance anymore in order to preserve
backwards compatibility with the existing module but adds usability
enhancements for both admins and end-users.  It allows shadow copy
functionality to just work with ZFS snapshots without having to create
symlinks to each snapshot in the root of each share.  For end-users it
allows the Previous Versions list to be sorted chronologically to make
it easier to use.  If performance is an issue the patch can be
modified to improve performance like the original patch did but this
only affects directory listings and is likely negligible in most cases.
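
For comparison, the setup the stock module expects looks roughly like this
(names are illustrative; @GMT-... is the naming the unpatched shadow_copy
module looks for in the share root):

# zfs snapshot tank/home@2007.09.21-00.00.00
# ln -s .zfs/snapshot/2007.09.21-00.00.00 \
      /tank/home/@GMT-2007.09.21-00.00.00

and in smb.conf:

  [homes]
      vfs objects = shadow_copy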

   Is there any facility for managing ZFS remotely? We have a central 
   identity
   management system that automatically provisions resources as necessary for
 [...]
  This is a loaded question.  There is a webconsole interface to ZFS which can
  be run from most browsers.  But I think you'll find that the CLI is easier
  for remote management.
 
 Perhaps I should have been more clear -- a remote facility available via
 programmatic access, not manual user direct access. If I wanted to do
 something myself, I would absolutely login to the system and use the CLI.
 However, the question was regarding an automated process. For example, our
 Perl-based identity management system might create a user in the middle of
 the night based on the appearance in our authoritative database of that
 user's identity, and need to create a ZFS filesystem and quota for that
 user. So, I need to be able to manipulate ZFS remotely via a programmatic
 API.

While it won't help you in your case since your users access the files
using protocols other than CIFS, if you use only CIFS it's possible to
configure Samba to automatically create a user's home directory the
first time the user connects to the server.  This is done using the
root preexec share option in smb.conf and an example is provided at
the above URL.
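
A rough sketch of that (the helper script and pool layout below are made
up, not the exact example from the URL):

  [homes]
      root preexec = /usr/local/sbin/mkzfshome %U

where /usr/local/sbin/mkzfshome is something like:

  #!/bin/sh
  user="$1"
  zfs list tank/home/"$user" >/dev/null 2>&1 || {
      zfs create tank/home/"$user"
      zfs set quota=1g tank/home/"$user"
      chown "$user" /tank/home/"$user"
  }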


Ed Plese


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss