Re: [zfs-discuss] Deleting large amounts of files

2010-07-29 Thread Constantin Gonzalez

Hi,


Is there a way to see which files have been deduped, so I can copy them again 
an un-dedupe them?


Unfortunately, that's not easy (I've tried it :) ).

The issue is that the dedup table (which knows which blocks have been deduped)
doesn't know about files.

And if you pull block pointers for deduped blocks from the dedup table,
you'll need to backtrack from there through the filesystem structure
to figure out what files are associated with those blocks.

(remember: Deduplication happens at the block level, not the file level.)

So, in order to compile a list of deduped _files_, one would need to extract
the list of deduped _blocks_ from the dedup table, then chase the pointers
from the root of the zpool to the blocks in order to figure out which files
they're associated with.

Unless there's a different way that I'm not aware of (and I hope someone can
correct me here), the only way to do that is to run a scrub-like process and
build up a table of files and their blocks.
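
If you just want to gauge how much dedup is actually in play before attempting
that, the pool-level counters are easy to get at (a rough sketch; 'tank' is a
placeholder pool name):

  zpool list tank     # the DEDUP column shows the pool-wide dedup ratio
  zdb -DD tank        # dumps the DDT histogram: how many blocks are
                      # referenced once, twice, and so on

That still won't name the files, but it tells you whether the chase is worth it.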

Cheers,
  Constantin

--

Constantin Gonzalez Schmitz | Principal Field Technologist
Phone: +49 89 460 08 25 91 || Mobile: +49 172 834 90 30
Oracle Hardware Presales Germany

ORACLE Deutschland B.V. & Co. KG | Sonnenallee 1 | 85551 Kirchheim-Heimstetten

ORACLE Deutschland B.V. & Co. KG
Head office: Riesstraße 25, D-80992 München
Registration court: Amtsgericht München, HRA 95603

General partner: ORACLE Deutschland Verwaltung B.V.
Rijnzathe 6, 3454PV De Meern, Netherlands
Commercial register of the Chamber of Commerce Midden-Nederland, no. 30143697
Managing directors: Jürgen Kunz, Marcel van de Molen, Alexander van der Ven

Oracle is committed to developing practices and products that help protect the
environment


Re: [zfs-discuss] Tips for ZFS tuning for NFS store of VM images

2010-07-29 Thread erik.ableson
Hmmm, that's odd. I have a number of VMs running on NFS (hosted on ESX, rather 
than Xen) with no problems at all. I did add a SLOG device to get performance 
up to a reasonable level, but it's been running flawlessly for a few months 
now. Previously I was using iSCSI for most of the connections, but with the 
addition of the SLOG device NFS has become feasible.

All I'm using on the OSOL side is sharenfs=anon=0 and adding the server's 
addresses to /etc/hosts to permit access. Running osol2009.06.
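
For what it's worth, the whole server-side setup amounts to roughly this (a
sketch only -- the pool/dataset and device names here are placeholders, not my
actual config):

  zfs set sharenfs=anon=0 tank/vmstore    # the share option mentioned above
  zpool add tank log c4t0d0               # the SLOG device, for synchronous NFS writes

plus the ESX hosts' addresses in /etc/hosts.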

Cheers,

Erik

On 28 juil. 2010, at 21:11, sol wrote:

 Richard Elling wrote:
 Gregory Gee wrote:
 I am using OpenSolaris to host VM images over NFS for XenServer.  I'm 
 looking 
 for tips on what parameters can be set to help optimize my ZFS pool that 
 holds 
 my VM images.
 There is nothing special about tuning for VMs, the normal NFS tuning applies.
 
 
 That's not been my experience. Out of the box VMware server would not work 
 with 
 the VMs stored on a zfs pool via NFS. I've not yet found out why but the 
 analytics showed millions of getattr/access/lookup compared to read/write.
 
 A partial workaround was to turn off access time on the share and to mount 
 with 
 noatime,actimeo=60
 
 But that's not perfect because when left alone the VM got into a stuck 
 state. 
 I've never seen that state before when the VM was hosted on a local disk. 
 Hosting VMs on NFS is not working well so far...
 
 
 
 



Re: [zfs-discuss] Tips for ZFS tuning for NFS store of VM images

2010-07-29 Thread rwalists

On Jul 28, 2010, at 3:11 PM, sol wrote:

 A partial workaround was to turn off access time on the share and to mount 
 with 
 noatime,actimeo=60
 
 But that's not perfect because when left alone the VM got into a stuck 
 state. 
 I've never seen that state before when the VM was hosted on a local disk. 
 Hosting VMs on NFS is not working well so far...

We host a lot of VMs on NFS shares (from a 7000 series) on ESXi with no issues 
other than an occasional Ubuntu machine that would do something similar to what 
you describe.  For us it was this:

http://communities.vmware.com/thread/237699?tstart=30

With the timeout set to 180, the issue has been completely eliminated.  The 
current ESXi VMware Tools (and, I think, all versions of 4) set this properly.
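
For guests where the tools don't take care of it, the knob in question is (if I
remember right) the guest's SCSI disk timeout; on a Linux guest that is
something like the following, with sda as a placeholder device:

  echo 180 > /sys/block/sda/device/timeout

which, as far as I know, is what the VMware Tools udev rule does for you on
current versions.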

Also, EMC and NetApp have this description of using NFS shares for VMWare:

http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html

which by and large applies to any NFS server rather than just their own equipment.
We found it helpful, though nothing in it is that surprising.

Good luck,
Ware


Re: [zfs-discuss] [osol-discuss] ZFS read performance terrible

2010-07-29 Thread Karol
Hi Eric - thanks for your reply.
Yes, zpool iostat -v

I've re-configured the setup into two pools for a test:
1st pool: 8 disk stripe vdev
2nd pool: 8 disk stripe vdev

The SSDs are currently not in the pool since I am not even reaching what the 
spinning rust is capable of - I believe I have a deeper issue and they would 
only complicate things for me at this point.
I can reconfigure the pool however needed, since this server is not yet in 
production.

My test is through an 8Gb FC target via COMSTAR from a Windows workstation.
The pool is currently configured with a default 128k recordsize.

Then I:
touch /pool/file
stmfadm create-lu -p wcd=false -s 10T /pool/file
stmfadm add-view lu
(The lu defaults to reporting a 512 blk size)

I formatted the volume with the default NTFS cluster size of 4k.
I do that twice (two separate pools, two separate LUNs, etc.)

Then I copy a large file (700MB or so) to one of the LUNs from the local 
workstation.
The read performance of my workstation harddrive is about 100+ MBps, and as 
such the file copies at about that speed.
Then I make a few copies of the file on that LUN so that I have about 20+ GB of 
that same file on one of the LUNs.
Then I reboot the opensolaris server (since the cache is nicely populated at 
this point and everything is running fast)

Then I try copying the lot of those files from one lun to the other.
The read performance appears to be limiting my write performance.

I have tried matching recordsize to NTFS cluster size at 4k, 16k, 32k and 64k.
I have tried making the NTFS cluster size a multiple of the recordsize.
I have seen performance improvements as a result (I don't have numbers);
however, none of the cluster/block combinations brought me to where I should be 
on reads.
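
For reference, when I test a recordsize/cluster combination I recreate the
backing store roughly like this (a sketch only -- the dataset name and the 32k
figure are just an example):

  zfs create -o recordsize=32k edit1/luns
  touch /edit1/luns/lun0
  stmfadm create-lu -p wcd=false -s 10T /edit1/luns/lun0
  stmfadm add-view <lu-guid>    # GUID as reported by create-lu

and then format NTFS with a matching 32k cluster on the Windows side.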

I've tried many configurations - and I've seen my performance fluctuate up and 
down here and there.  However, it's never on-par with what it should be and the 
reads seem to be a limiting factor.

For clarity - here's some 'zpool iostat -v 1' output from my current 
configuration directly following a reboot of the server 
while copying 13GB of those files from LUN - LUN:



                           capacity     operations    bandwidth
pool                     alloc   free   read  write   read  write
---  -  -  -  -  -  -

~snip~

edit1                    13.8G  16.3T    773      0  96.5M      0
  c0t5000C50020C7A44Bd0  1.54G  1.81T 75  0  9.38M  0
  c0t5000C50020C7C9DFd0  1.54G  1.81T 89  0  11.2M  0
  c0t5000C50020C7CE1Fd0  1.53G  1.81T 82  0  10.3M  0
  c0t5000C50020C7D86Bd0  1.53G  1.81T 85  0  10.6M  0
  c0t5000C50020C61ACBd0  1.55G  1.81T 83  0  10.4M  0
  c0t5000C50020C79DEFd0  1.54G  1.81T 92  0  11.5M  0
  c0t5000C50020CD3473d0  1.53G  1.81T 84  0  10.6M  0
  c0t5000C50020CD5873d0  1.53G  1.81T 87  0  11.0M  0
  c0t5000C500103F36BFd0  1.54G  1.81T 92  0  11.5M  0
---  -  -  -  -  -  -
syspool  35.1G  1.78T  0  0  0  0
  mirror 35.1G  1.78T  0  0  0  0
c0t5000C5001043D3BFd0s0  -  -  0  0  0  0
c0t5000C500104473EFd0s0  -  -  0  0  0  0
---  -  -  -  -  -  -
test1                    11.0G  16.3T    850      0   106M      0
  c0t5000C500103F48FFd0  1.23G  1.81T 95  0  12.0M  0
  c0t5000C500103F49ABd0  1.23G  1.81T 92  0  11.6M  0
  c0t5000C500104A3CD7d0  1.22G  1.81T 92  0  11.6M  0
  c0t5000C500104A5867d0  1.24G  1.81T 97  0  12.0M  0
  c0t5000C500104A7723d0  1.22G  1.81T 95  0  11.9M  0
  c0t5000C5001043A86Bd0  1.23G  1.81T 96  0  12.1M  0
  c0t5000C5001043C1BFd0  1.22G  1.81T 91  0  11.3M  0
  c0t5000C5001043D1A3d0  1.23G  1.81T 91  0  11.4M  0
  c0t5000C5001046534Fd0  1.23G  1.81T 97  0  12.2M  0
---  -  -  -  -  -  -

~snip~

Here's some zpool iostat (no -v) output over the same time:


            capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
--  -  -  -  -  -  -

~snip~

edit1   13.8G  16.3T  0  0  0  0
syspool 35.1G  1.78T  0  0  0  0
test1   11.9G  16.3T      0    956      0   120M
--  -  -  -  -  -  -
edit1   13.8G  16.3T  0  0  0  0
syspool 35.1G  1.78T  0  0  0  0
test1   11.9G  16.3T    142    564  17.9M  52.8M
--  -  -  -  -  -  -
edit1   13.8G  16.3T  0  0  0  0
syspool 35.1G  1.78T  0  0  0  0
test1   11.9G  16.3T    723      0  90.3M      0
--  -  -  -  -  -  -
edit1  

Re: [zfs-discuss] [osol-discuss] ZFS read performance terrible

2010-07-29 Thread Karol
Sorry - I said the 2 iostats were run at the same time - the second was run 
after the first during the same file copy operation.


Re: [zfs-discuss] ZFS read performance terrible

2010-07-29 Thread Karol
 Update to my own post.  Further tests more
 consistently resulted in closer to 150MB/s.
 
 When I took one disk offline, it was just shy of
 100MB/s on the single disk.  There is both an obvious
 improvement with the mirror, and a trade-off (perhaps
 the latter is controller related?).
 
 I did the same tests on my work computer, which has
 the same 7200.12 disks (except larger), an i7-920,
 ICH10, and 12GB memory.  The mirrored pool
 performance was identical, but the individual disks
 performed at near 120MB/s when isolated.  Seems like
 the 150MB/s may be a wall, and all disks and
 controllers are definitely in SATA2 mode.  But I
 digress

You could be running into a hardware bandwidth bottleneck somewhere 
(controller, bus, memory, CPU, etc.) - however, my experience isn't exactly the 
same as yours, since I am not even getting 150MBps from 8 disks - so I am 
probably running into a 1) hardware issue, 2) driver issue, 3) ZFS issue, or 
4) configuration issue.

I have tried OSOL 2009.06, but its driver doesn't recognize my SAS controller.
I then went with OSOL b134 to get the controller recognized, which is where the 
performance issues I am discussing appeared, and now I'm using the RC2 of Nexenta 
(OSOL b134 with backported fixes) with the same performance issues.


Re: [zfs-discuss] ZFS read performance terrible

2010-07-29 Thread Richard Jahnel
Hi r2ch

The operations column shows about 370 operations for read - per spindle
(Between 400-900 for writes)
How should I be measuring iops? 

It seems to me then that your spindles are going about as fast as they can and 
you're just moving small block sizes.

There are lots of ways to test for iops, but for this purpose imo the 
operations column is fine. 

I think the next step would be to attach a couple of inexpensive SSDs as cache 
and zil to see what that does for me, understanding that it will only make a 
difference for reads of data that is warm and for writes that require a commit.
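
Roughly this, with the device names as placeholders for whatever SSDs I end up
using:

  zpool add tank cache c8t0d0                 # L2ARC: helps repeated/warm reads
  zpool add tank log mirror c8t1d0 c8t2d0     # slog: helps synchronous writes only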


[zfs-discuss] TRIM support added

2010-07-29 Thread David Magda
Hello,

TRIM support has just been committed into OpenSolaris:

http://mail.opensolaris.org/pipermail/onnv-notify/2010-July/012674.html

Via:

http://www.c0t0d0s0.org/archives/6792-SATA-TRIM-support-in-Opensolaris.html




[zfs-discuss] zfs upgrade unmounts filesystems

2010-07-29 Thread Gary Mills
Zpool upgrade on this system went fine, but zfs upgrade failed:

# zfs upgrade -a
cannot unmount '/space/direct': Device busy
cannot unmount '/space/dcc': Device busy
cannot unmount '/space/direct': Device busy
cannot unmount '/space/imap': Device busy
cannot unmount '/space/log': Device busy
cannot unmount '/space/mysql': Device busy
2 filesystems upgraded

Do I have to shut down all the applications before upgrading the
filesystems?  This is on a Solaris 10 5/09 system.

-- 
-Gary Mills--Unix Group--Computer and Network Services-


[zfs-discuss] snapshot question

2010-07-29 Thread Mark
I'm trying to understand how snapshots work in terms of how I can use them for 
recovering and/or duplicating virtual machines, and how I should set up my file 
system.

I want to use OpenSolaris as a storage platform with NFS/ZFS for some 
development VMs; that is, the VMs use the OpenSolaris box as their NAS for 
shared access.

Should I set up a separate ZFS file system for each VM so I can individually 
snapshot each one on a regular basis, or does it matter? The goal would be to 
be able to take an individual VM back to a previous point in time without 
changing the others.
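
In other words, something like this (names invented just for the example),
where each VM lives in its own file system:

  zfs create tank/vms
  zfs create tank/vms/dev01
  zfs create tank/vms/dev02
  zfs snapshot tank/vms/dev01@before-patch
  zfs rollback tank/vms/dev01@before-patch   # dev01 goes back in time, dev02 is untouched

versus one big file system holding all of the VM directories.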

Thanks


Re: [zfs-discuss] Fwd: zpool import despite missing log [PSARC/2010/292 Self Review]

2010-07-29 Thread Richard Elling
On Jul 28, 2010, at 4:11 PM, Robert Milkowski wrote:
 
 fyi

This covers the case where an exported pool has lost its log.
zpool export
[log disk or all disks in a mirrored log disappear]
zpool import -- currently fails, missing top-level vdev

The following cases are already recoverable:

If the pool is not exported and the log disappears, then the pool
can import ok if the zpool.cache file is current.
*crash*
[log disk or all disks in a mirrored log disappear]
zpool import -- succeeds, pool state is updated
keep on truckin'

If the log device fails while the pool is imported, then the pool
marks the device as failed.
[log disk or all disks in a mirrored log disappear]
report error, change pool state to show failed log device
keep on truckin'

 -- richard

 
 -- 
 Robert Milkowski
 http://milek.blogspot.com
 
 
  Original Message 
 Subject:  zpool import despite missing log [PSARC/2010/292 Self Review]
 Date: Mon, 26 Jul 2010 08:38:22 -0600
 From: Tim Haley tim.ha...@oracle.com
 To:   psarc-...@sun.com
 CC:   zfs-t...@sun.com
 
 I am sponsoring the following case for George Wilson.  Requested binding 
 is micro/patch.  Since this is a straightforward addition of a command 
 line option, I think it qualifies for self review.  If an ARC member 
 disagrees, let me know and I'll convert to a fast-track.
 
 Template Version: @(#)sac_nextcase 1.70 03/30/10 SMI
 This information is Copyright (c) 2010, Oracle and/or its affiliates. 
 All rights reserved.
 1. Introduction
 1.1. Project/Component Working Name:
  zpool import despite missing log
 1.2. Name of Document Author/Supplier:
  Author:  George Wilson
 1.3  Date of This Document:
 26 July, 2010
 
 4. Technical Description
 
 OVERVIEW:
 
  ZFS maintains a GUID (global unique identifier) on each device and
  the sum of all GUIDs of a pool are stored into the ZFS uberblock.
  This sum is used to determine the availability of all vdevs
  within a pool when a pool is imported or opened.  Pools which
  contain a separate intent log device (e.g. a slog) will fail to
  import when that device is removed or is otherwise unavailable.
  This proposal aims to address this particular issue.
 
 PROPOSED SOLUTION:
 
 This fast-track introduces a new command line flag to the
  'zpool import' sub-command.  This new option, '-m', allows
  pools to import even when a log device is missing.  The contents
  of that log device are obviously discarded and the pool will
  operate as if the log device were offlined.
 
 MANPAGE DIFFS:
 
zpool import [-o mntopts] [-p property=value] ... [-d dir | -c
 cachefile]
 -  [-D] [-f] [-R root] [-n] [-F] -a
 +  [-D] [-f] [-m] [-R root] [-n] [-F] -a
 
 
zpool import [-o mntopts] [-o property=value] ... [-d dir | -c
 cachefile]
 -  [-D] [-f] [-R root] [-n] [-F] pool |id [newpool]
 +  [-D] [-f] [-m] [-R root] [-n] [-F] pool |id [newpool]
 
zpool import [-o mntopts] [ -o property=value] ... [-d dir |
 - -c cachefile] [-D] [-f] [-n] [-F] [-R root] -a
 + -c cachefile] [-D] [-f] [-m] [-n] [-F] [-R root] -a
 
Imports all  pools  found  in  the  search  directories.
Identical to the previous command, except that all pools
 
 + -m
 +
 +Allows a pool to import when there is a missing log device
 
 EXAMPLES:
 
 1). Configuration with a single intent log device:
 
 # zpool status tank
pool: tank
 state: ONLINE
  scan: none requested
  config:
 
  NAMESTATE READ WRITE CKSUM
  tankONLINE   0 0 0
c7t0d0ONLINE   0 0 0
  logs
c5t0d0ONLINE   0 0 0
 
 errors: No known data errors
 
 # zpool import tank
 The devices below are missing, use '-m' to import the pool anyway:
  c5t0d0 [log]
 
 cannot import 'tank': one or more devices is currently unavailable
 
 # zpool import -m tank
 # zpool status tank
pool: tank
   state: DEGRADED
 status: One or more devices could not be opened.  Sufficient replicas
 exist for
  the pool to continue functioning in a degraded state.
 action: Attach the missing device and online it using 'zpool online'.
 see: 
 http://www.sun.com/msg/ZFS-8000-2Q
 
scan: none requested
 config:
 
  NAME   STATE READ WRITE CKSUM
  tank   DEGRADED 0 0 0
c7t0d0   ONLINE   0 0 0
  logs
1693927398582730352  UNAVAIL  0 0 0  was
 /dev/dsk/c5t0d0
 
 errors: No known data errors
 
 2). Configuration with mirrored intent log device:
 
 # zpool add tank log mirror c5t0d0 c5t1d0
 zr...@diskmonster:/dev/dsk# zpool status 

Re: [zfs-discuss] ZFS acl and chmod

2010-07-29 Thread Cindy Swearingen
Which Solaris release is this and are you using /usr/bin/ls and 
/usr/bin/chmod?


Thanks,

Cindy
On 07/29/10 02:44, . . wrote:

Hi,
while playing with ZFS ACLs I have noticed strange chmod behavior: it 
duplicates some ACLs. Is it a bug or a feature :) ?

For example scenario:
#ls -dv ./2
drwxr-xr-x   2 root root   2 Jul 29 11:22 2
 0:owner@::deny
 1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
 2:group@:add_file/write_data/add_subdirectory/append_data:deny
 3:group@:list_directory/read_data/execute:allow
 4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
 5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow


chmod  A3=group@:list_directory/read_data/write_data/execute:allow 2

bash-3.00# ls -dv 2
drwxr-xr-x   2 root root   2 Jul 29 11:22 2
 0:owner@::deny
 1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
 2:group@:add_file/write_data/add_subdirectory/append_data:deny
 3:group@:list_directory/read_data/add_file/write_data/execute:allow
 4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
 5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow

bash-3.00# chmod 755 2
bash-3.00# ls -dv
drwxr-xr-x+  2 root root   2 Jul 29 11:22 2
 0:owner@::deny
 1:owner@:write_xattr/write_attributes/write_acl/write_owner:allow
 2:group@::deny
 3:group@::allow
 4:group@::allow
 5:everyone@:write_xattr/write_attributes/write_acl/write_owner:deny
 6:everyone@:read_xattr/read_attributes/read_acl/synchronize:allow
 7:owner@::deny
 8:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
 9:group@:add_file/write_data/add_subdirectory/append_data:deny
 10:group@:list_directory/read_data/execute:allow
 11:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
 12:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow





--
-
http://unixinmind.blogspot.com






Re: [zfs-discuss] zfs upgrade unmounts filesystems

2010-07-29 Thread Cindy Swearingen

Hi Gary,

I found a similar zfs upgrade failure with the device busy error, which
I believe was caused by a file system mounted under another file system.

If this is the cause, I will file a bug or find an existing one.

The workaround is to unmount the nested file systems and upgrade them
individually, like this:

# zfs upgrade space/direct
# zfs upgrade space/dcc

Thanks,

Cindy

On 07/29/10 09:48, Cindy Swearingen wrote:

Hi Gary,

This should just work without having to do anything.

Looks like a bug but I haven't seen this problem before.

Anything unusual about the mount points for the file systems
identified below?

Thanks,

Cindy

On 07/29/10 07:07, Gary Mills wrote:

Zpool upgrade on this system went fine, but zfs upgrade failed:

# zfs upgrade -a
cannot unmount '/space/direct': Device busy
cannot unmount '/space/dcc': Device busy
cannot unmount '/space/direct': Device busy
cannot unmount '/space/imap': Device busy
cannot unmount '/space/log': Device busy
cannot unmount '/space/mysql': Device busy
2 filesystems upgraded

Do I have to shut down all the applications before upgrading the
filesystems?  This is on a Solaris 10 5/09 system.




Re: [zfs-discuss] ZFS read performance terrible

2010-07-29 Thread Richard Elling
On Jul 29, 2010, at 9:57 AM, Carol wrote:

 Yes, I noticed that thread a while back and have been doing a great deal of 
 testing with various scsi_vhci options.  
 I am disappointed that the thread hasn't moved further, since I also suspect 
 that it is mpt-sas, multipath, or expander related.

The thread is in the ZFS forum, but the problem is not a ZFS problem.

 I was able to get aggregate writes up to 500MB out to the disks but reads 
 have not improved beyond an aggregate average of about 50-70MBps for the pool.

I find zpool iostat to be only marginally useful.  You need to look at the
output of iostat -zxCn which will show the latency of the I/Os.  Check to
see if the latency (asvc_t) is similar to the previous thread.
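
Something like this while the copy is running (a sketch; adjust the interval to
taste):

  iostat -zxCn 1

asvc_t is the average time, in milliseconds, the device takes to service an
I/O, and %b shows how busy each disk is; that tells you far more about where
the reads are stalling than the pool-level bandwidth numbers do.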

 I did not look much at read speeds during a lot of my previous testing because 
 I thought write speeds were my issue... And I've since realized that my 
 userland write speed problem from zpool to zpool was actually read limited.

Writes are cached in RAM, so looking at iostat or zpool iostat doesn't offer
the observation point you'd expect.

 Since then I've tried mirrors, stripes, raidz, checked my drive caches, 
 tested recordsizes, volblocksizes, clustersizes, combinations therein, tried 
 vol-backed luns, file-backed luns, wcd=false - etc.
 
 Reads from disk are slow no matter what.  Of course - once the arc cache is 
 populated, the userland experience is blazing - because the disks are not 
 being read.

Yep, classic case of slow disk I/O.

 Seeing write speeds so much faster than reads strikes me as quite strange from 
 a hardware perspective, though, since writes also invoke a read operation - 
 do they not?

In many cases, writes do not invoke a read.
 -- richard



Re: [zfs-discuss] ZFS read performance terrible

2010-07-29 Thread StorageConcepts
Actually, writes faster than reads are typical for a Copy on Write FS (or Write 
Anywhere). I usually describe it like this.

CoW in ZFS works like when you come home after a long day and you just want to 
go to bed. You take off one piece of clothing after another and drop it on the 
floor just where you are - this is very fast (and it actually is copy on write 
with a block allocation policy of closest).

Then the next day when you have to get to work (in this example assuming that 
you wear the same underwear again - remember, not supported! :) - you have to 
pick up all the clothes one after another and you have to move all over the 
room to get dressed. This takes time, and it is the same for reads.

So in CoW it is usual that writes are faster than reads (especially for 
RaidZ/RaidZ2, where each vdev can be viewed as one disk). For 100% synchronous 
writes (wcd=true), you should see about the same write and read performance.

So for your setup I assume: 

4 x 2 disk mirror with Nearline SATA:

Write (sync, wcd=true) = 4 x 80 IOPS = 320 IOPS x 8 KB recordsize = 2.6 MB/sec. 
If you see more, that's ZFS optimizations already. If you see less, make sure 
you have proper partition alignment (otherwise 1 write can become 2).

Read = 8 x 100 IOPS (a few more IOPS because of head optimization and the 
elevator) = 800 IOPS x 8k = 6.4 MB/sec from disk. Same problem with partition 
alignment.

For a 128k block size?

Write: 320 x 128k = 42 MB/sec
Read: 800 x 128k = 102 MB/sec

ZFS needs caching (L2ARC, ZIL, etc.), otherwise it is slow - just like any other 
disk system for random I/O. For sequential I/O ZFS is not optimal because of 
CoW. Also, with iSCSI you get more fragmentation because of the small block 
updates.

So how to tune?

1) Use a dedicated ZIL/slog device (this will make your writes more sequential, 
which also helps the reads)
2) Use L2ARC
3) Make sure partition alignment is OK
4) Try to disable read-ahead on the client (otherwise you cause even more 
random I/O)
5) Use a larger block size (128k) to have some kind of implicit read-ahead 
(except for DB workloads) - see the sketch below for 3) and 5)
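
For 3) and 5) that boils down to something like this (a sketch only -- the
device and dataset names are placeholders; use your own layout):

  prtvtoc /dev/rdsk/c7t0d0s2       # check the first sector of each slice for alignment
  zfs set recordsize=128k tank/vmdata
  zfs get recordsize tank/vmdata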

Regards, 
Robert Heinzmann


Re: [zfs-discuss] ZFS acl and chmod

2010-07-29 Thread Cindy Swearingen

Hey Nix,

I think I see the problem now.

If you want to review the interaction of setting an explicit ACL and
using the chmod 755 command on 2, you need this command:

# ls -dv 2

What you have is this command:

# ls -dv

(I have no idea what's going on with the parent dir ACL.)

I tested your syntax, which says replace ACL #3 and then reset the
permissions by using the chmod command. It's working as expected.
See below.

Thanks

Cindy


# zpool create tank c0t1d0
# zfs create tank/test
# cd /tank/test
# mkdir 2
# ls -dv 2
drwxr-xr-x   2 root root   2 Jul 29 12:45 2
0:owner@::deny
1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
2:group@:add_file/write_data/add_subdirectory/append_data:deny
3:group@:list_directory/read_data/execute:allow
4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow

# chmod  A3=group@:list_directory/read_data/write_data/execute:allow 2
# ls -dv 2
drwxr-xr-x   2 root root   2 Jul 29 12:45 2
0:owner@::deny
1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
2:group@:add_file/write_data/add_subdirectory/append_data:deny
3:group@:list_directory/read_data/add_file/write_data/execute:allow
4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow
# chmod 755 2
# ls -dv 2
drwxr-xr-x   2 root root   2 Jul 29 12:45 2
0:owner@::deny
1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
2:group@:add_file/write_data/add_subdirectory/append_data:deny
3:group@:list_directory/read_data/execute:allow
4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow

On 07/29/10 11:56, Cindy Swearingen wrote:
Which Solaris release is this and are you using /usr/bin/ls and 
/usr/bin/chmod?


Thanks,

Cindy
On 07/29/10 02:44, . . wrote:

Hi,
while playing with ZFS ACLs I have noticed strange chmod behavior: it 
duplicates some ACLs. Is it a bug or a feature :) ?

For example scenario:
#ls -dv ./2
drwxr-xr-x   2 root root   2 Jul 29 11:22 2
 0:owner@::deny
 1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
 2:group@:add_file/write_data/add_subdirectory/append_data:deny
 3:group@:list_directory/read_data/execute:allow
 4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
 5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow


chmod  A3=group@:list_directory/read_data/write_data/execute:allow 2

bash-3.00# ls -dv 2
drwxr-xr-x   2 root root   2 Jul 29 11:22 2
 0:owner@::deny
 1:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
 2:group@:add_file/write_data/add_subdirectory/append_data:deny
 3:group@:list_directory/read_data/add_file/write_data/execute:allow
 4:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny
 5:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
 /read_acl/synchronize:allow

bash-3.00# chmod 755 2
bash-3.00# ls -dv
drwxr-xr-x+  2 root root   2 Jul 29 11:22 2
 0:owner@::deny
 1:owner@:write_xattr/write_attributes/write_acl/write_owner:allow
 2:group@::deny
 3:group@::allow
 4:group@::allow
 5:everyone@:write_xattr/write_attributes/write_acl/write_owner:deny
 6:everyone@:read_xattr/read_attributes/read_acl/synchronize:allow
 7:owner@::deny
 8:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
 /append_data/write_xattr/execute/write_attributes/write_acl
 /write_owner:allow
 9:group@:add_file/write_data/add_subdirectory/append_data:deny
 10:group@:list_directory/read_data/execute:allow
 11:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
 /write_attributes/write_acl/write_owner:deny

Re: [zfs-discuss] zfs upgrade unmounts filesystems

2010-07-29 Thread Pawel Jakub Dawidek
On Thu, Jul 29, 2010 at 12:00:08PM -0600, Cindy Swearingen wrote:
 Hi Gary,
 
 I found a similar zfs upgrade failure with the device busy error, which
 I believe was caused by a file system mounted under another file system.
 
 If this is the cause, I will file a bug or find an existing one.
 
 The workaround is to unmount the nested file systems and upgrade them
 individually, like this:
 
 # zfs upgrade space/direct
 # zfs upgrade space/dcc

'zfs upgrade' unmounts file system first, which makes it hard to upgrade
for example root file system. The only work-around I found is to clone
root file system (clone is created with most recent version), change
root file system to newly created clone, reboot, upgrade original root
file system, change root file system back, reboot, destroy clone.
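
Roughly (a sketch only -- dataset names are placeholders, and the
boot-environment plumbing such as the boot menu or beadm/luactivate on Solaris
is left out):

  zfs snapshot rpool/ROOT/be@pre-upgrade
  zfs clone rpool/ROOT/be@pre-upgrade rpool/ROOT/be-tmp  # clone comes up at the newest version
  zpool set bootfs=rpool/ROOT/be-tmp rpool
  # reboot into the clone, then:
  zfs upgrade rpool/ROOT/be
  zpool set bootfs=rpool/ROOT/be rpool
  # reboot back, then clean up:
  zfs destroy rpool/ROOT/be-tmp
  zfs destroy rpool/ROOT/be@pre-upgrade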

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: [zfs-discuss] Deleting large amounts of files

2010-07-29 Thread Brandon High
On Tue, Jul 20, 2010 at 9:48 AM, Hernan Freschi drge...@gmail.com wrote:
 Is there a way to see which files are using dedup? Or should I just
 copy everything  to a new ZFS?

Using 'zfs send' to copy the datasets will work and preserve other
metadata that copying will lose.
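
Something along these lines (dataset names are placeholders; a sketch, not a
tested recipe):

  zfs set dedup=off tank/data           # so the rewritten copy is not re-deduped
  zfs snapshot -r tank/data@rewrite
  zfs send -R tank/data@rewrite | zfs receive tank/data-new

then rename or destroy the old dataset once you're happy with the copy.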

-B

-- 
Brandon High : bh...@freaks.com


[zfs-discuss] Moved to new controller now degraded

2010-07-29 Thread -
I moved one hard drive from a pool to a different controller, and now it isn't 
recognized as part of the pool.

This is the pool:

NAME STATE READ WRITE CKSUM
videoDEGRADED 0 0 0
  raidz1-0   DEGRADED 0 0 0
c13t0d0  UNAVAIL  0 0 0  cannot open
c12d1ONLINE   0 0 0
c11d0ONLINE   0 0 0
c12d0ONLINE   0 0 0
c11d1ONLINE   0 0 0

c13t0d0 is now c15d0, so I tried zpool replace -f c3t0d0 c15d0 but that just 
tells me that it is already part of a zpool.

How do I tell zfs that c15d0 is the new name for c13t0d0?
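
The only other thing I can think of trying is a full export/import, which as I
understand it rescans the devices and should pick the disk up under its new
name (untested on my side, so treat this as a sketch):

  zpool export video
  zpool import video      # or 'zpool import -d /dev/dsk video' to force a device scan

but I'd like to confirm that's safe for a degraded raidz1 before doing it.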


Re: [zfs-discuss] zfs upgrade unmounts filesystems

2010-07-29 Thread Gary Mills
On Thu, Jul 29, 2010 at 10:26:14PM +0200, Pawel Jakub Dawidek wrote:
 On Thu, Jul 29, 2010 at 12:00:08PM -0600, Cindy Swearingen wrote:
  
  I found a similar zfs upgrade failure with the device busy error, which
  I believe was caused by a file system mounted under another file system.
  
  If this is the cause, I will file a bug or find an existing one.

No, it was caused by processes active on those filesystems.

  The workaround is to unmount the nested file systems and upgrade them
  individually, like this:
  
  # zfs upgrade space/direct
  # zfs upgrade space/dcc

Except that I couldn't unmount them because the filesystems were busy.

 'zfs upgrade' unmounts file system first, which makes it hard to upgrade
 for example root file system. The only work-around I found is to clone
 root file system (clone is created with most recent version), change
 root file system to newly created clone, reboot, upgrade original root
 file system, change root file system back, reboot, destroy clone.

In this case it wasn't the root filesystem, but I still had to disable
twelve services before doing the upgrade and enable them afterwards.
`fuser -c' is useful to identify the processes.  Mapping them to
services can be difficult.  The server is essentially down during the
upgrade.
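
For anyone else hitting this, the kind of sequence involved is roughly the
following (a sketch -- the PID below is hypothetical):

  fuser -c /space/imap       # PIDs with files open on the filesystem
  ptree 1234                 # walk up from one of those PIDs to find the daemon
  svcs -p                    # then search the service-to-process listing for it
  svcadm disable <service-fmri>

with the matching `svcadm enable' commands afterwards.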

For a root filesystem, you might have to boot off the failsafe archive
or a DVD and import the filesystem in order to upgrade it.

-- 
-Gary Mills--Unix Group--Computer and Network Services-