[zfs-discuss] There is no NFS over ZFS issue

2007-06-26 Thread Roch - PAE

Regarding the bold statement 


There is no NFS over ZFS issue


What I mean here is that, if you _do_ encounter a performance
pathology not linked to the NVRAM storage/cache-flush issue, then you
_should_ complain, or better, get someone to do an analysis of the
situation.

One should not assume that some observed pathological performance of
NFS/ZFS is widespread and due to some known ZFS issue about to be
fixed.

To be sure, there are lots of performance opportunities that will
provide incremental improvements, the most significant of which is the
ZFS Separate Intent Log, just integrated in Nevada. This opens up the
field for further NFS/ZFS performance investigations.

But the data that got this thread started seems to highlight an NFS
vs. Samba opportunity, something we need to look into. Otherwise I
don't think the data produced so far has highlighted any specific
NFS/ZFS issue. There are certainly opportunities for incremental
performance improvements but, to the best of my knowledge, outside the
NVRAM/flush issue on certain storage:


There are no known prevalent NFS over ZFS performance
pathologies on record.


-r


Ref: 
http://mail.opensolaris.org/pipermail/zfs-discuss/2007-June/thread.html#29026




[zfs-discuss] ZFS - DB2 Performance

2007-06-26 Thread Roshan Perera
Hi all,

I am after some help/feedback to the subject issue explained below.

We are in the process of migrating a big DB2 database from a
6900 (24 x 200MHz CPUs, Veritas FS, 8TB of storage, Solaris 8) to a
25K (12 dual-core CPUs x 1800MHz, ZFS on 8TB of SAN storage, compressed
RaidZ, Solaris 10).

Unfortunately, we are having massive performance problems with the new solution.
It all points towards IO and ZFS.

A couple of questions relating to ZFS:
1. What is the impact of using ZFS compression? What percentage of system resources
is required, and how much of an overhead is this as opposed to non-compression? In our
case DB2 does a similar amount of reads and writes.
2. Unfortunately we are using RAID twice (SAN-level RAID and RaidZ) to overcome
the panic problem described in my previous blog (for which I had a good response).
3. Any way of monitoring ZFS performance other than iostat?
4. Any help on ZFS tuning in this kind of environment, like caching etc.?

Would appreciate any feedback/help on where to go next.
If this cannot be resolved we may have to go back to VxFS, which would be a
shame.


Thanks in advance.



[zfs-discuss] ZFS test suite released on OpenSolaris.org

2007-06-26 Thread Jim Walker
The ZFS test suite is being released today on OpenSolaris.org along with
the Solaris Test Framework (STF), Checkenv and Runwattr test tools.

The source tarball, binary package and baseline can be downloaded from the test
consolidation download center at http://dlc.sun.com/osol/test/downloads/current.
And, the source code can be viewed in the Solaris Test Collection (STC) 2.0
source tree at: 
http://cvs.opensolaris.org/source/xref/test/ontest-stc2/src/suites/zfs.

The STF, Checkenv and Runwattr packages must be installed prior to executing
a ZFS test run. More information is available in the ZFS README file and on the
ZFS test suite webpage at: http://opensolaris.org/os/community/zfs/zfstestsuite.

Any questions about the ZFS test suite can be sent to zfs discuss at:
http://www.opensolaris.org/os/community/zfs/discussions.
Any questions about STF and the test tools can be sent to testing discuss at:
http://www.opensolaris.org/os/community/testing/discussions.

Happy Hunting,
Jim
 
 


Re: [zfs-discuss] ZFS - DB2 Performance

2007-06-26 Thread Will Murnane

On 6/26/07, Roshan Perera [EMAIL PROTECTED] wrote:

25K   12 CPU dual core x 1800Mhz with ZFS 8TB storage SAN storage (compressed  
RaidZ) Solaris 10.

RaidZ is a poor choice for database apps in my opinion; due to the way
it handles checksums on raidz stripes, it must read every disk in
order to satisfy small reads that traditional RAID-5 would only have
to read a single disk for.  Raid-Z doesn't have the terrible write
performance of RAID-5, because you can stick small writes together and
then do full-stripe writes, but by the same token you must do
full-stripe reads, all the time.  That's how I understand it, anyway.
Thus, raidz is a poor choice for a database application, which tends
to do a lot of small reads.

Using mirrors (at the zfs level, not the SAN level) would probably
help with this.  Mirrors each get their own copy of the data, each
with its own checksum, so you can read a small block by touching only
one disk.

What is your vdev setup like right now?  'zpool list', in other words.
How wide are your stripes?  Is the SAN doing raid-1ish things with
the disks, or something else?


2. Unfortunately we are using twice RAID (San level Raid and RaidZ) to overcome 
the panic problem my previous blog (for which I had good response).

Can you convince the customer to give ZFS a chance to do things its
way?  Let the SAN export raw disks, and make two- or three-way
mirrored vdevs out of them.
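
Something along these lines (the LUN names are made up for illustration)
would build a pool of two-way mirrors out of raw SAN LUNs:

  zpool create datapool \
      mirror emcpower40a emcpower41a \
      mirror emcpower42a emcpower43a \
      mirror emcpower44a emcpower45a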


3. Any way of monitoring ZFS performance other than iostat ?

In a word, yes.  What are you interested in?  DTrace or 'zpool iostat'
(which reports activity of individual disks within the pool) may prove
interesting.
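
For instance, something like the following (pool name taken from earlier in
the thread; the trailing argument is the sampling interval in seconds) shows
per-vdev and per-disk activity:

  zpool iostat -v datapool1 5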

Will


[zfs-discuss] ZFS usb keys

2007-06-26 Thread Dick Davies

I used a zpool on a usb key today to get some core files off a non-networked
Thumper running S10U4 beta.

Plugging the stick into my SXCE b61 x86 machine worked fine; I just had to
'zpool import sticky' and it worked ok.

But when we attach the drive to a Blade 100 (running s10u3), it sees the
pool as corrupt. I thought I'd been too hasty pulling out the stick, but it
works ok back in the b61 desktop and the Thumper.

I'm trying to figure out if this is an endian thing (which I thought ZFS
was immune from) - or has the b61 machine upgraded the zpool format?



--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/


Re: [zfs-discuss] ZFS - DB2 Performance

2007-06-26 Thread Roshan Perera

Hi Will,
Thanks for your reply.
The customer has an EMC SAN solution and will not change their current layout.
Therefore, asking the customer to give raw disks to ZFS is a no-no. Hence the
RaidZ configuration as opposed to RAID-5.
I have given some stats below. I know it's a bit difficult to troubleshoot with
the type of data you have, but whatever input you can give would be much appreciated.


zpool list

NAME        SIZE   USED   AVAIL   CAP  HEALTH  ALTROOT
datapool1  2.12T   707G   1.43T   32%  ONLINE  -
datapool2  2.12T   706G   1.44T   32%  ONLINE  -
datapool3  2.12T   702G   1.44T   32%  ONLINE  -
datapool4  2.12T   701G   1.44T   32%  ONLINE  -
dumppool    272G   171G    101G   62%  ONLINE  -
localpool    68G  12.5G   55.5G   18%  ONLINE  -
logpool     272G   157G    115G   57%  ONLINE  -



zfs get all datapool1

NAME       PROPERTY       VALUE                  SOURCE
datapool1  type           filesystem             -
datapool1  creation       Fri Jun  8 18:46 2007  -
datapool1  used           615G                   -
datapool1  available      1.22T                  -
datapool1  referenced     42.6K                  -
datapool1  compressratio  2.08x                  -
datapool1  mounted        no                     -
datapool1  quota          none                   default
datapool1  reservation    none                   default
datapool1  recordsize     128K                   default
datapool1  mountpoint     none                   local
datapool1  sharenfs       off                    default
datapool1  checksum       on                     default
datapool1  compression    on                     local
datapool1  atime          on                     default
datapool1  devices        on                     default
datapool1  exec           on                     default
datapool1  setuid         on                     default
datapool1  readonly       off                    default
datapool1  zoned          off                    default
datapool1  snapdir        hidden                 default
datapool1  aclmode        groupmask              default
datapool1  aclinherit     secure                 default


[su621dwdb/root] zpool status -v
  pool: datapool1
 state: ONLINE
 scrub: none requested
config:
 
NAME STATE READ WRITE CKSUM
datapool1    ONLINE   0 0 0
  raidz1 ONLINE   0 0 0
emcpower8h   ONLINE   0 0 0
emcpower9h   ONLINE   0 0 0
emcpower10h  ONLINE   0 0 0
emcpower11h  ONLINE   0 0 0
emcpower12h  ONLINE   0 0 0
emcpower13h  ONLINE   0 0 0
emcpower14h  ONLINE   0 0 0
emcpower15h  ONLINE   0 0 0
 
errors: No known data errors
 
  pool: datapool2
 state: ONLINE
 scrub: none requested
config:
 
NAME STATE READ WRITE CKSUM
datapool2    ONLINE   0 0 0
  raidz1 ONLINE   0 0 0
emcpower16h  ONLINE   0 0 0
emcpower17h  ONLINE   0 0 0
emcpower18h  ONLINE   0 0 0
emcpower19h  ONLINE   0 0 0
emcpower20h  ONLINE   0 0 0
emcpower21h  ONLINE   0 0 0
emcpower22h  ONLINE   0 0 0
emcpower23h  ONLINE   0 0 0
 
errors: No known data errors
 
  pool: datapool3
 state: ONLINE
 scrub: none requested
config:
 
NAME STATE READ WRITE CKSUM
datapool3    ONLINE   0 0 0
  raidz1 ONLINE   0 0 0
emcpower24h  ONLINE   0 0 0
emcpower25h  ONLINE   0 0 0
emcpower26h  ONLINE   0 0 0
emcpower27h  ONLINE   0 0 0
emcpower28h  ONLINE   0 0 0
emcpower29h  ONLINE   0 0 0
emcpower30h  ONLINE   0 0 0
emcpower31h  ONLINE   0 0 0
 
errors: No known data errors
 
  pool: datapool4
 state: ONLINE
 scrub: none requested
config:
 
NAME STATE READ WRITE CKSUM
datapool4    ONLINE   0

Re: [zfs-discuss] Suggestions on 30 drive configuration?

2007-06-26 Thread Rob Logan

 an array of 30 drives in a RaidZ2 configuration with two hot spares
 I don't want to mirror 15 drives to 15 drives

ok, so space over speed... and you are willing to toss somewhere between 4
and 15 drives for protection.

raidz splits the (up to 128k) write/read recordsize across each element of
the raidz set (i.e. all drives must be touched and all must finish
before the block request is complete). So with a 9-disk raidz1 set, that's
8 data + 1 parity (8+1), or 16k per disk for a full 128k write; for
a smaller 4k block, that's a single 512b sector per disk. On a 26+2 raidz2
set that 4k block would still use 8 disks, with the other 18 disks
unneeded but allocated.

so perhaps three sets of 8+2 would let three blocks be read/written
at once, with a total of 6 disks for protection.

but for twice the speed, six sets of 4+1 would be the same size (same
number of disks for protection), though it isn't quite as safe for its 2x speed.

Rob



[zfs-discuss] Re: ZFS usb keys

2007-06-26 Thread Jürgen Keil
 I used a zpool on a usb key today to get some core files off a non-networked
 Thumper running S10U4 beta.
 
 Plugging the stick into my SXCE b61 x86 machine worked fine; I just had to
 'zpool import sticky' and it worked ok.
 
 But when we attach the drive to a blade 100 (running s10u3), it sees the
 pool as corrupt. I thought I'd been too hasty pulling out the stick,
 but it works ok back in the b61 desktop and Thumper.
 
 I'm trying to figure out if this is an endian thing (which I thought
 ZFS was immune from) - or has the b61 machine upgraded the zpool
 format?

Most likely the zpool on the usb stick was formatted using a zpool version
that s10u3 does not yet support.

Check on the b61 machine which zpool version is supported by b61, and which
zpool version is on the usb stick.
Repeat on the s10u3 machine.
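
For instance (the exact output differs a little between releases), something
like this on each machine should make any mismatch obvious:

  zpool upgrade -v     # lists the on-disk versions this release supports
  zpool upgrade        # reports pools formatted with an older on-disk version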
 
 


RE: [zfs-discuss] ZFS - DB2 Performance

2007-06-26 Thread Ellis, Mike
At what Solaris10 level (patch/update) was the single-threaded
compression situation resolved? 
Could you be hitting that one?

 -- MikeE 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roch - PAE
Sent: Tuesday, June 26, 2007 12:26 PM
To: Roshan Perera
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS - DB2 Performance


Possibly the storage is flushing the write caches when it should not.
Until we get a fix, cache flushing could be disabled in the storage (ask
the vendor for the magic incantation). If that's not forthcoming, and if
all pools are attached to NVRAM-protected devices, then these evil
/etc/system tunables might help:

In older Solaris releases we have

set zfs:zil_noflush = 1

On newer releases

set zfs:zfs_nocacheflush = 1


If you implement this, do place a comment that this is a
temporary workaround waiting for bug 6462690 to be fixed.
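
For example (a sketch, assuming the newer zfs_nocacheflush tunable applies),
the /etc/system entry could look like:

  * Temporary workaround for NVRAM-protected storage: do not send
  * cache-flush requests.  Remove once bug 6462690 is fixed.
  set zfs:zfs_nocacheflush = 1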

About compression: I don't have the numbers, but a reasonable
guess would be that it consumes roughly 1 GHz of CPU to
compress 100 MB/sec. This will of course depend on the type
of data being compressed.

-r

Roshan Perera writes:
  Hi all,
  
  I am after some help/feedback to the subject issue explained below.
  
  We are in the process of migrating a big DB2 database from a 
  
  6900 24 x 200MHz CPU's with Veritas FS 8TB of storage Solaris 8 to 
  25K   12 CPU dual core x 1800Mhz with ZFS 8TB storage SAN storage
(compressed  RaidZ) Solaris 10.
  
  Unfortunately, we are having massive perfomance problems with the new
solution. It all points towards IO and ZFS. 
  
  Couple of questions relating to ZFS.
  1. What is the impace on using ZFS compression ? Percentage of system
  resources required, how much of a overhead is this as suppose to
  non-compression. In our case DB2 do similar amount of read's and
  writes. 
  2. Unfortunately we are using twice RAID (San level Raid and RaidZ)
to
  overcome the panic problem my previous blog (for which I had good
  response). 
  3. Any way of monitoring ZFS performance other than iostat ?
  4. Any help on ZFS tuning in this kind of environment like caching
etc ?
  
  Would appreciate for any feedback/help wher to go next. 
  If this cannot be resolved we may have to go back to VXFS which would
be a shame.
  
  
  Thanks in advance.
  


[zfs-discuss] Drive Failure w/o Redundancy

2007-06-26 Thread Jef Pearlman
Hi. I'm looking for the best solution to create an expandable heterogeneous 
pool of drives. I think in an ideal world, there'd be a raid version which 
could cleverly handle both multiple drive sizes and the addition of new drives 
into a group (so one could drop in a new drive of arbitrary size, maintain some 
redundancy, and gain most of that drive's capacity), but my impression is that 
we're far from there.

Absent that, I was considering using zfs and just having a single pool. My main 
question is this: what is the failure mode of zfs if one of those drives either 
fails completely or has errors? Do I permanently lose access to the entire 
pool? Can I attempt to read other data? Can I 'zfs replace' the bad drive and
get some level of data recovery? Otherwise, by pooling drives am I simply
increasing the probability of a catastrophic data loss? I apologize if this is 
addressed elsewhere -- I've read a bunch about zfs, but not come across this 
particular answer.

As a side-question, does anyone have a suggestion for an intelligent way to 
approach this goal? This is not mission-critical data, but I'd prefer not to 
make data loss _more_ probable. Perhaps some volume manager (like LVM on linux) 
has appropriate features?

Thanks for any help.

-puk
 
 


Re: [zfs-discuss] ZFS - DB2 Performance

2007-06-26 Thread Louwtjie Burger


Roshan Perera writes:
  Hi all,
 
  I am after some help/feedback to the subject issue explained below.
 
  We are in the process of migrating a big DB2 database from a
 
  6900 24 x 200MHz CPU's with Veritas FS 8TB of storage Solaris 8 to
  25K   12 CPU dual core x 1800Mhz with ZFS 8TB storage SAN storage
(compressed  RaidZ) Solaris 10.
 


200MHz!? You mean 1200MHz ;) The slowest CPUs in a 6900 were 900MHz UltraSPARC III Cu.

You mention Veritas FS ... as in Veritas filesystem, vxfs ? I suppose
you also include vmsa or the whole Storage Foundation? (could still be
vxva on Solaris 8 ! Oh, those were the days...)

First impressions of the system... well, it's fair to say that you
have some extra CPU power (and then some). The old UltraSPARC III at 1.2GHz
was nice, but by no means a screamer (years ago).


  Unfortunately, we are having massive perfomance problems with the new
solution. It all points towards IO and ZFS.
 


Yep... CPU it isn't. Keep in mind that you have now completely moved
the goal posts when it comes to performance or comparing performance
with the previous installation. Not only do you have a large increase
in CPU performance, Solaris 10 will blitz 8 on a bad day by miles.
With all of the CPU/OS bottlenecks removed I sure hope you have decent
I/O at the back...


  Couple of questions relating to ZFS.
  1. What is the impace on using ZFS compression ? Percentage of system
  resources required, how much of a overhead is this as suppose to
  non-compression. In our case DB2 do similar amount of read's and
  writes.


I'm unsure why someone who buys a 24-core 25K would activate
compression on an OLTP database. Surely when you fork out that kind of
cash you want to get every bang for your buck (and then some!). I
don't think compression was created with high-performance OLTP DBs in
mind.

I would hope that the 25K (which in this case is light years faster
than the 6900) wasn't spec'ed with the idea of running compression
with the extra CPU cycles... oooh... *crash* *burn*.


  2. Unfortunately we are using twice RAID (San level Raid and RaidZ)
to
  overcome the panic problem my previous blog (for which I had good
  response).


I've yet to deploy a DB on ZFS in production, so I cannot comment on
the real world performance.. what I can comment on is some basic
things.

RAID on top of RAID seems silly. Especially RAID-Z. It's just not as
fast as a mirror or stripe when it comes to a decent db workout.

Are you sure that you want to go with ZFS ... any real reason to go
that way now? I would wait for U4 ... and give the machine/storage a
good workout with SVM and UFS/DirectIO.
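
A bare-bones sketch of that kind of test setup (the metadevice, slice and
mount point names are made up):

  metainit d10 1 1 c1t0d0s0                                # simple SVM concat/stripe
  newfs /dev/md/rdsk/d10
  mount -F ufs -o forcedirectio /dev/md/dsk/d10 /db2data   # UFS with DirectIO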

Yep... it's a bastard to manage but very little can touch it when it
comes to pure performance. With so many $$$ standing on the datacentre
floor, I'd forget about technology for now and let common sense and
good business practice prevail.



  3. Any way of monitoring ZFS performance other than iostat ?


DTrace gurus can comment... however iostat should suffice.


  4. Any help on ZFS tuning in this kind of environment like caching
etc ?
 


As was posted, read the blog on ZFS and db's.


  Would appreciate for any feedback/help wher to go next.
  If this cannot be resolved we may have to go back to VXFS which would
be a shame.


By the way ... if the client has already purchased vmsa/vxfs (oh my
word, how much was that!) then I'm unsure as to what ZFS will bring to
the party... apart from saving the yearly $$$ for updates and
patches/support. Is that the idea? It's not like SF is bad...

Nope, 8TB on a decently configured storage unit is not too big to give it
a go with SVM, especially if you want to save money on Storage
Foundation.

I'm sure I'm preaching to the converted here but DB performance and
problems will usually reside inside the storage architecture... I've
seldom found a system wanting in the CPU department if the architect
wasn't a moron. With the upgrade that I see here... all the pressure
will move to the back end (bar a bad configuration).

If you want to speed up a regular OLTP DB... fiddle with the I/O :)

2c


Re: [zfs-discuss] Drive Failure w/o Redundancy

2007-06-26 Thread Richard Elling

Jef Pearlman wrote:

Hi. I'm looking for the best solution to create an expandable heterogeneous 
pool of drives. I think in an ideal world, there'd be a raid version which 
could cleverly handle both multiple drive sizes and the addition of new drives 
into a group (so one could drop in a new drive of arbitrary size, maintain some 
redundancy, and gain most of that drive's capacity), but my impression is that 
we're far from there.


Mirroring (aka RAID-1, though technically more like RAID-1+0) in ZFS will do 
this.
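
For instance (hypothetical device names), a mirrored pool can be grown a
pair of disks at a time, and a new pair does not have to match the size of
the existing ones:

  zpool create tank mirror c1t0d0 c1t1d0      # e.g. a pair of 250GB drives
  zpool add tank mirror c2t0d0 c2t1d0         # later, a pair of 500GB drives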


Absent that, I was considering using zfs and just having a single pool. My main question 
is this: what is the failure mode of zfs if one of those drives either fails completely 
or has errors? Do I permanently lose access to the entire pool? Can I attempt to read 
other data? Can I zfs replace the bad drive and get some level of data 
recovery? Otherwise, by pooling drives am I simply increasing the probability of a 
catastrophic data loss? I apologize if this is addressed elsewhere -- I've read a bunch 
about zfs, but not come across this particular answer.


We generally recommend a single pool, as long as the use case permits.
But I think you are confused about what a zpool is.  I suggest you look
at the examples or docs.  A good overview is the slide show
http://www.opensolaris.org/os/community/zfs/docs/zfs_last.pdf


As a side-question, does anyone have a suggestion for an intelligent way to 
approach this goal? This is not mission-critical data, but I'd prefer not to 
make data loss _more_ probable. Perhaps some volume manager (like LVM on linux) 
has appropriate features?


A ZFS mirrored pool will be the most performant and easiest to manage,
with better RAS than a raidz pool.
 -- richard


[zfs-discuss] Re: zfs and 2530 jbod

2007-06-26 Thread Joel Miller
Hi folks,

So the expansion unit for the 2500 series is the 2501.
The back-end drive channels are SAS.

Currently it is not supported to connect a 2501 directly to a SAS HBA.

SATA drives are in the pipe, but will not be released until the RAID firmware 
for the 2500 series officially supports the SATA drives. The current firmware 
does not lock out those drives and prematurely releasing the drives would 
result in lots of service calls for unsupported configurations.

The 750GB and 1TB drives are on the map behind the initial release of SATA 
support.

The 2500 series engineering team is talking with the ZFS folks to understand
the various aspects of delivering a complete solution. (There is a lot more
to it than it might seem...)

-Joel
 
 


[zfs-discuss] NFS, nested ZFS filesystems and ownership

2007-06-26 Thread Marko Milisavljevic

Hello,
I'm sure there is a simple solution, but I am unable to figure this one out.

Assuming I have tank/fs, tank/fs/fs1, tank/fs/fs2, and I set sharenfs=on for
tank/fs (child filesystems are inheriting it as well), and I chown
user:group /tank/fs, /tank/fs/fs1 and /tank/fs/fs2, I see:

ls -la /tank/fs
user:group .
user:group fs1
user:group fs2
user:group some_other_file

If I mount server:/tank/fs /tmp/mount from another machine, I see:


ls -la /tmp/mount
user:group .
root:wheel fs1
root:wheel fs2
user:group some_other_file

How can I get user:group to propagate down the nested ZFS filesystem over
NFS?

Thanks,
Marko


[zfs-discuss] Re: ZFS+NFS on storedge 6120 (sun t4)

2007-06-26 Thread Joel Miller
I am pretty sure the T3/6120/6320 firmware does not support the 
SYNCHRONIZE_CACHE commands..

Off the top of my head, I do not know if that triggers any change in behavior 
on the Solaris side...

The firmware does support the use of the FUA bit...which would potentially lead 
to similar flushing behavior...

I will try to check in my infinite spare time...

-Joel
 
 


Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Mark Shellenbaum

Nicolas Williams wrote:

Couldn't wait for ZFS delegation, so I cobbled something together; see
attachment.

Nico



The *real* ZFS delegation code was integrated into Nevada this morning. 
 I've placed a little overview in my blog.


http://blogs.sun.com/marks
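
For the impatient, the heart of it is the new 'zfs allow' / 'zfs unallow'
subcommands; a rough sketch (user and dataset names are made up, see the
blog and manpage for the exact syntax):

  zfs allow marko create,mount,snapshot tank/home/marko
  zfs unallow marko snapshot tank/home/marko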

  -Mark


Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Nicolas Williams
On Tue, Jun 26, 2007 at 04:19:03PM -0600, Mark Shellenbaum wrote:
 Nicolas Williams wrote:
 Couldn't wait for ZFS delegation, so I cobbled something together; see
 attachment.
 
 The *real* ZFS delegation code was integrated into Nevada this morning. 
  I've placed a little overview in my blog.
 
 http://blogs.sun.com/marks

Yup.  I'd written my script a while back but had left it unfinished.
Fortunately I only spent a couple of hours on Friday finishing it up,
but I really should have checked when ZFS delegation was scheduled to
integrate (actually, I did ask on #onnv, but the only answer I got was
"not soon enough!" :)

Perhaps folks may find this script useful for pre-updated systems.
(Speaking of which, what S10 update will ZFS delegation be rolled into?)

Nico


Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Nicolas Williams
Oh, and thanks!  ZFS delegation rocks.


[zfs-discuss] Re: NFS, nested ZFS filesystems and ownership

2007-06-26 Thread Marko Milisavljevic

I figured out how to get it to work, but I still don't quite understand it.
The way I got it to work is to 'zfs unmount' tank/fs/fs1 and tank/fs/fs2, and
then it looked like this:

ls -la /tank/fs
user:group .
root:root fs1
root:root fs2

That is, those mountpoints changed to root:root from the user:group that was
in effect while it was mounted. This I don't understand - what is determining
this? How did ZFS know to change this back to user:group after 'zfs mount -a'
on the local filesystem? Does ZFS inherit parent directory ownership at the
time of mounting, regardless of the ownership of the mountpoint? Does NFS
respect the ownership of the underlying mountpoint, regardless of how ZFS is
mounting it? I would appreciate an explanation or a pointer to appropriate
documentation. In any case, I would expect that reasonable behaviour would be
for both the local ZFS filesystem hierarchy and the view of the same over NFS
to display the same ownership (the user:group in question exists on both
machines, and the client is Mac OS X 10.4.9).


Marko

On 6/26/07, Marko Milisavljevic [EMAIL PROTECTED] wrote:


Hello,

I'm sure there is a simple solution, but I am unable to figure this one
out.


Assuming I have tank/fs, tank/fs/fs1, tank/fs/fs2, and I set sharenfs=on
for tank/fs (child filesystems are inheriting it as well), and I chown
user:group /tank/fs, /tank/fs/fs1 and /tank/fs/fs2, I see:


ls -la /tank/fs
user:group .
user:group fs1
user:group fs2
user:group some_other_file

If I mount server:/tank/fs /tmp/mount from another machine, I see:


ls -la /tmp/mount
user:group .
root:wheel fs1
root:wheel fs2
user:group some_other_file


How can I get user:group to propagate down the nested ZFS filesystem over
NFS?


Thanks,
Marko



Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Roland Mainz
Nicolas Williams wrote:
 On Sat, Jun 23, 2007 at 12:31:28PM -0500, Nicolas Williams wrote:
  On Sat, Jun 23, 2007 at 12:18:05PM -0500, Nicolas Williams wrote:
   Couldn't wait for ZFS delegation, so I cobbled something together; see
   attachment.
 
  I forgot to slap on the CDDL header...
 
 And I forgot to add a -p option here:
 
  #!/bin/ksh
 
 That should be:
 
  #!/bin/ksh -p

Uhm... that's no longer needed for /usr/bin/ksh in Solaris 10 and ksh93
never needed it.

 Note that this script is not intended to be secure, just to keep honest
 people honest and from making certain mistakes.  Setuid-scripts (which
 this isn't quite) are difficult to make secure.

Uhm... why? You only have to make sure the users can't inject
data/code. David Korn provided some guidelines for such cases, see
http://mail.opensolaris.org/pipermail/shell-discuss/2007-June/000493.html
(mainly: avoid eval, put all variable expansions in quotes, set IFS= at
the beginning of the script, and harden your script against unexpected
input (the classical example is $ myscript $(cat /usr/bin/cat) #, i.e. an
attempt to pass a giant binary string as an argument)) ... and I am
currently working on a new shell code style guideline at
http://www.opensolaris.org/os/project/shell/shellstyle/ with more stuff.
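
A tiny fragment illustrating those guidelines (a purely hypothetical script,
not taken from the thread):

  #!/bin/ksh -p
  IFS=                                   # disable word splitting of expansions
  PATH=/usr/bin:/usr/sbin; export PATH   # fixed PATH, no caller influence
  fs=$1
  # quote every expansion; never pass user input through eval
  /usr/sbin/zfs list -H -o name "$fs" > /dev/null 2>&1 || {
      print -u2 "unknown dataset: $fs"
      exit 1
  }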



Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [EMAIL PROTECTED]
  \__\/\/__/  MPEG specialist, CJAVASunUnix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)


[zfs-discuss] Re: ZFS usb keys

2007-06-26 Thread andrewk9
Shouldn't S10u3 just see the newer on-disk format and report that fact, rather 
than complain it is corrupt?

Andrew.
 
 


Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Nicolas Williams
On Wed, Jun 27, 2007 at 12:55:15AM +0200, Roland Mainz wrote:
 Nicolas Williams wrote:
  On Sat, Jun 23, 2007 at 12:31:28PM -0500, Nicolas Williams wrote:
   On Sat, Jun 23, 2007 at 12:18:05PM -0500, Nicolas Williams wrote:
Couldn't wait for ZFS delegation, so I cobbled something together; see
attachment.
  
   I forgot to slap on the CDDL header...
  
  And I forgot to add a -p option here:
  
   #!/bin/ksh
  
  That should be:
  
   #!/bin/ksh -p
 
 Uhm... that's no longer needed for /usr/bin/ksh in Solaris 10 and ksh93
 never needed it.

But will ksh or ksh93 know that this script must not source $ENV?

Apparently ksh won't source it anyways; this was not clear from the man
page.

Note that in the RBAC profile for this script the script gets run with
privs=all, not euid=0, so checking that euid == uid is not sufficient.

  Note that this script is not intended to be secure, just to keep honest
  people honest and from making certain mistakes.  Setuid-scripts (which
  this isn't quite) are difficult to make secure.
 
 Uhm... why ? You only have to make sure the users can't inject
 data/code. David Korn provided some guidelines for such cases, see
 http://mail.opensolaris.org/pipermail/shell-discuss/2007-June/000493.html
 (mainly avoid eval, put all variable expensions in quotes, set IFS= at
 the beginning of the script and harden your script against unexpected
 input (classical example is $ myscript $(cat /usr/bin/cat) # (e.g. the
 attempt to pass a giant binary string as argument))) ... and I am
 currently working on a new shell code style guideline at
 http://www.opensolaris.org/os/project/shell/shellstyle/ with more stuff.

As you can see the script quotes user arguments throughout.  It's
probably secure -- what I meant is that I make no guarantees about this
script :)

Nico


[zfs-discuss] Re: NFS, nested ZFS filesystems and ownership

2007-06-26 Thread Marko Milisavljevic

Well, I didn't realize this at first because I was testing with newly created,
empty directories - sorry about wasting the bandwidth here - but it appears NFS
is not showing nested ZFS filesystems *at all*; all I was seeing were the
mountpoints within the parent filesystem, and their ownership changing as the
server mounted and unmounted the child filesystems locally.
I see nothing in the zfs or mount commands that would allow me to recursively
propagate my NFS mount to display the hierarchy created by ZFS on the server
side. Is this at all possible, or do NFS and nested ZFS filesystems not mix?

And it seems to be a bug, but unless sharenfs=on is explicitly set on a
filesystem with 'zfs set' (even though it is already inherited as on, and shows
as such in 'zfs list -o name,sharenfs'), it does not appear in the share command
output, nor can it be mounted by the NFS client - it gives a permission error.
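
Spelled out, the workaround described above looks roughly like this (server
side, then an OS X client; paths as in the earlier examples):

  zfs set sharenfs=on tank/fs/fs1
  zfs set sharenfs=on tank/fs/fs2
  # on the client, each child filesystem then has to be mounted separately
  mount -t nfs server:/tank/fs/fs1 /tmp/mount/fs1
  mount -t nfs server:/tank/fs/fs2 /tmp/mount/fs2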

Marko

On 6/26/07, Marko Milisavljevic [EMAIL PROTECTED] wrote:


I figured out how to get it to work, but I still don't quite understand
it.

The way i got it to work is to zfs unmount tank/fs/fs1 and tank/fs/fs2,
and then it looked like this:


ls -la /tank/fs
user:group .
root:root fs1
root:root fs2


That is, those mountpoints changed to root:root from user:group that was
in effect while it was mounted. This I don't understand - what is
determining this? How did zfs know to change this to user:group after zfs
mount -a on local filesystem? Does ZFS inherit parent directory ownership at
time of mounting, regardless of ownership of mountpoint? Does NFS respect
ownership of underlying mountpoint, regardless of how ZFS is mounting it? I
would appreciate an explanation or pointing to appropriate documentation. In
any case, I would expect that reasonable behavour would be for both local
ZFS filesystem hierarchy and the view of the same over NFS to display same
ownership (user:group in question exists on both machines, and client is Mac
OSX 10.4.9)


Marko

On 6/26/07, Marko Milisavljevic [EMAIL PROTECTED] wrote:

 Hello,

 I'm sure there is a simple solution, but I am unable to figure this one
 out.


 Assuming I have tank/fs, tank/fs/fs1, tank/fs/fs2, and I set sharenfs=on
 for tank/fs (child filesystems are inheriting it as well), and I chown
 user:group /tank/fs, /tank/fs/fs1 and /tank/fs/fs2, I see:


 ls -la /tank/fs
 user:group .
 user:group fs1
 user:group fs2
 user:group some_other_file

 If I mount server:/tank/fs /tmp/mount from another machine, I see:


 ls -la /tmp/mount
 user:group .
 root:wheel fs1
 root:wheel fs2
 user:group some_other_file


 How can I get user:group to propagate down the nested ZFS filesystem
 over NFS?


 Thanks,
 Marko






Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Roland Mainz
Nicolas Williams wrote:
 On Wed, Jun 27, 2007 at 12:55:15AM +0200, Roland Mainz wrote:
  Nicolas Williams wrote:
   On Sat, Jun 23, 2007 at 12:31:28PM -0500, Nicolas Williams wrote:
On Sat, Jun 23, 2007 at 12:18:05PM -0500, Nicolas Williams wrote:
 Couldn't wait for ZFS delegation, so I cobbled something together; see
 attachment.
   
I forgot to slap on the CDDL header...
  
   And I forgot to add a -p option here:
  
#!/bin/ksh
  
   That should be:
  
#!/bin/ksh -p
 
  Uhm... that's no longer needed for /usr/bin/ksh in Solaris 10 and ksh93
  never needed it.
 
 But will ksh or ksh93 know that this script must not source $ENV?

Erm, I don't know what's the correct behaviour for Solaris ksh88... but
for ksh93 it's clearly defined that ${ENV} and /etc/ksh.kshrc are only
sourced for _interactive_ shell sessions by default - and that excludes
non-interactive scripts.

 Apparently ksh won't source it anyways; this was not clear from the man
 page.
 
 Note that in the RBAC profile for this script the script gets run with
 privs=all, not euid=0, so checking that euid == uid is not sufficient.

What do you mean by that?

   Note that this script is not intended to be secure, just to keep honest
   people honest and from making certain mistakes.  Setuid-scripts (which
   this isn't quite) are difficult to make secure.
 
  Uhm... why ? You only have to make sure the users can't inject
  data/code. David Korn provided some guidelines for such cases, see
  http://mail.opensolaris.org/pipermail/shell-discuss/2007-June/000493.html
  (mainly avoid eval, put all variable expensions in quotes, set IFS= at
  the beginning of the script and harden your script against unexpected
  input (classical example is $ myscript $(cat /usr/bin/cat) # (e.g. the
  attempt to pass a giant binary string as argument))) ... and I am
  currently working on a new shell code style guideline at
  http://www.opensolaris.org/os/project/shell/shellstyle/ with more stuff.
 
 As you can see the script quotes user arguments throughout.  It's
 probably secure -- what I meant is that I make no guarantees about this
 script :)

Yes... I saw that... and I realised that the new ksh93 getopts, pattern
matching (e.g. [[ ${pat} == ~(Ei).*myregex.* ]] to replace something
like [ "$(echo ${pat} | egrep -i '.*myregex.*')" != "" ]) and
associative arrays (e.g. using strings as indexes instead of numbers)
would be useful for this script.
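
Side by side, for the pattern-matching case (the pattern itself is just an
example):

  pat=$1
  # old style: fork an egrep for case-insensitive matching
  if [ -n "$(echo "$pat" | egrep -i 'raidz')" ]; then
      print "matched (egrep)"
  fi
  # ksh93: built-in extended-regex match, no fork
  if [[ $pat == ~(Ei).*raidz.* ]]; then
      print "matched (ksh93)"
  fi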

Anyway... the script looks good... I wish the script code in OS/Net
Makefiles had that quality... ;-/



Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [EMAIL PROTECTED]
  \__\/\/__/  MPEG specialist, CJAVASunUnix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)


Re: [zfs-discuss] ZFS delegation script

2007-06-26 Thread Nicolas Williams
On Wed, Jun 27, 2007 at 01:45:07AM +0200, Roland Mainz wrote:
 Nicolas Williams wrote:
  But will ksh or ksh93 know that this script must not source $ENV?
 
 Erm, I don't know what's the correct behaviour for Solaris ksh88... but
 for ksh93 it's clearly defined that ${ENV} and /etc/ksh.kshrc are only
 sourced for _interactive_ shell sessions by default - and that excludes
 non-interactive scripts.

Right, and I'd forgotten that, and when I glanced at the manpage,
nervous that I might have missed a ksh option that's important for
setuid scripts, it was not obvious that this was indeed the case.

  Apparently ksh won't source it anyways; this was not clear from the man
  page.
  
  Note that in the RBAC profile for this script the script gets run with
  privs=all, not euid=0, so checking that euid == uid is not sufficient.
 
 What do you mean with that ?

Read the part of the script that deals with the 'setup' sub-command.

  As you can see the script quotes user arguments throughout.  It's
  probably secure -- what I meant is that I make no guarantees about this
  script :)
 
 Yes... I saw that... and I realised that the new ksh93 getopts, pattern
 matching (e.g. [[ ${pat} == ~(Ei).*myregex.* ]] to replace something
 like [ $(echo ${pat} | egrep -i .*myregex.*) !=  ] ) and
 associative arrays (e.g. use string as index instead of numbers) would
 be usefull for this script.

Indeed.  I can't tell you how many times I've wished that Solaris had
had ksh93 back in, well, 1993 :)  Although, I must say that I *like* KSH
globs quite a bit, enough so that I'd not resort to regexps in a ksh93
script unless I had to match patterns that were not easily expressible
as KSH globs.  And I like KSH variable substitution transformations like
${var%pattern} and so on (though, again, I wish ksh88 had a few more
extensions of that sort).
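
For instance, with a made-up path:

  file=/tank/fs/core.1234
  print "${file%/*}"      # -> /tank/fs    (strip shortest trailing /*)
  print "${file##*/}"     # -> core.1234   (strip longest leading */)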

Nico