Re: [zfs-discuss] ZFS is very slow in our test, when the capacity is high

2007-10-12 Thread Thomas Liesner
Hi,

did you read the following? 
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

 Currently, pool performance can degrade when a pool is very full and
 filesystems are updated frequently, such as on a busy mail server.
 Under these circumstances, keep pool space under 80% utilization
 to maintain pool performance.

I wonder whether setting a ZFS quota of roughly 80% of the whole pool capacity
would help keep performance up; users will always use all the space available.
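
For example, on a hypothetical 1 TB pool named "tank" (names and sizes here
are only placeholders), a quota on the top-level dataset caps everything
beneath it:

  # cap total usage at roughly 80% of a 1 TB pool
  zfs set quota=800G tank
  zfs get quota tank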

Regards,
Tom
 
 


Re: [zfs-discuss] zfs: allocating allocated segment (offset=77984887808 size=66560)

2007-10-12 Thread Jürgen Keil

 how does one free segment(offset=77984887808 size=66560)
 on a pool that won't import?
 
 looks like I found
 http://bugs.opensolaris.org/view_bug.do?bug_id=6580715
 http://mail.opensolaris.org/pipermail/zfs-discuss/2007-September/042541.html

Btw., my machine from that mail.opensolaris.org zfs-discuss thread,
which panicked with "freeing free segment", did have a defective RAM
module.

I don't know for sure, but I suspect that the bad RAM module might
have been the root cause for that "freeing free segment" zfs panic,
too ...
 
 


Re: [zfs-discuss] zfs: allocating allocated segment (offset=77984887808 size=66560)

2007-10-12 Thread Rob Logan

  I suspect that the bad RAM module might have been the root
  cause for that "freeing free segment" zfs panic,

Perhaps, so I removed the two 2G SIMMs but left the two 512M
SIMMs, and also removed the kernelbase setting, but the zpool
import still crashed the machine.

It's also registered ECC RAM; memtest86 v1.7 hasn't found
anything yet, but I'll let it run overnight.

Rob


Re: [zfs-discuss] Zone root on a ZFS filesystem and Cloning zones

2007-10-12 Thread Dick Davies
On 11/10/2007, Dick Davies [EMAIL PROTECTED] wrote:
 No, they aren't (i.e. zoneadm clone on S10u4 doesn't use zfs snapshots).

 I have a workaround I'm about to blog

Here it is - hopefully it will be of some use:

  
http://number9.hellooperator.net/articles/2007/10/11/fast-zone-cloning-on-solaris-10
-- 
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/


[zfs-discuss] zfs/zpools iscsi

2007-10-12 Thread Krzys
Hello all, sorry if somebody has already asked this. I was playing today with
iSCSI and was able to create a zpool and then, via iSCSI, see it on two other
hosts. I was curious whether I could use ZFS to have it shared on those two
hosts, but apparently I was unable to do so, for obvious reasons. On my Linux
Oracle RAC I was using OCFS, which works just as I need it; does anyone know
if something like that could be achieved with ZFS, maybe if not now then in
the future? Is there anything I could do at this moment to have my two other
Solaris clients see the zpool that I am presenting to them both via iSCSI?
Are there any solutions of this kind out there?

Thanks so much for your help.

Regards,

Chris



Re: [zfs-discuss] Inherited quota question

2007-10-12 Thread Rahul Mehta
Has there been any solution to the problem discussed above in ZFS version 8??
 
 


[zfs-discuss] XFS_IOC_FSGETXATTR XFS_IOC_RESVSP64 like options in ZFS ?

2007-10-12 Thread Manoj Nayak
Hi,

I am using XFS_IOC_FSGETXATTR in an ioctl() call on Linux running an XFS
file system. I want to use a similar facility on Solaris running ZFS.

struct fsxattr fsx;
ioctl(fd, XFS_IOC_FSGETXATTR, &fsx);

The above call gets additional attributes associated with files in XFS
file systems. The final argument points to a variable of type struct
fsxattr, whose fields include: fsx_xflags (extended flag bits),
fsx_extsize (nominal extent size in file system blocks), and fsx_nextents
(number of data extents in the file). A non-zero fsx_extsize indicates
that a preferred extent size was previously set on the file; a
fsx_extsize of zero indicates that the defaults for that filesystem will
be used.

Structure for XFS_IOC_FSGETXATTR and XFS_IOC_FSSETXATTR.

struct fsxattr {
        __u32           fsx_xflags;     /* xflags field value (get/set)  */
        __u32           fsx_extsize;    /* extsize field value (get/set) */
        __u32           fsx_nextents;   /* nextents field value (get)    */
        __u32           fsx_projid;     /* project identifier (get/set)  */
        unsigned char   fsx_pad[12];
};

Is it possible to use ioctl() to allocate disk space on ZFS? I am using
XFS_IOC_RESVSP64 in an ioctl() call on Linux running an XFS file system.

xfs_flock64_t flock;            /* range of the file to reserve */
flock.l_whence = SEEK_SET;
flock.l_start = file_size;
flock.l_len = n_bytes_grow;
ioctl_ret = ioctl(fd, XFS_IOC_RESVSP64, &flock);

The above call is used to allocate space to a file. A range of bytes is 
specified using a pointer to a variable of type xfs_flock64_t in the 
final argument. The blocks are allocated, but not zeroed, and the file 
size does not change. If the XFS filesystem is configured to flag 
unwritten file extents, performance will be negatively affected when 
writing to preallocated space, since extra filesystem transactions are 
required to convert extent flags on the range of the file written.

Thanks
Manoj Nayak




Re: [zfs-discuss] zfs/zpools iscsi

2007-10-12 Thread Mattias Pantzare
2007/10/12, Krzys [EMAIL PROTECTED]:
 Hello all, sorry if somebody has already asked this. I was playing today
 with iSCSI and was able to create a zpool and then, via iSCSI, see it on two
 other hosts. I was curious whether I could use ZFS to have it shared on
 those two hosts, but apparently I was unable to do so, for obvious reasons.
 On my Linux Oracle RAC I was using OCFS, which works just as I need it; does
 anyone know if something like that could be achieved with ZFS, maybe if not
 now then in the future? Is there anything I could do at this moment to have
 my two other Solaris clients see the zpool that I am presenting to them both
 via iSCSI? Are there any solutions of this kind out there?

Why not use NFS?
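
i.e. export the filesystem from whichever host has the pool imported and
mount it on the other two, along these lines (pool and dataset names are
just examples):

  # on the host that owns the pool
  zfs set sharenfs=on tank/shared
  # on each of the other Solaris clients
  mount -F nfs poolhost:/tank/shared /mnt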


Re: [zfs-discuss] XFS_IOC_FSGETXATTR XFS_IOC_RESVSP64 like options in ZFS ?

2007-10-12 Thread Darren J Moffat
Manoj Nayak wrote:
 Hi,
 
 I am using XFS_IOC_FSGETXATTR in an ioctl() call on Linux running an XFS
 file system. I want to use a similar facility on Solaris running ZFS.

See openat(2).
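
At the command line, runat(1) gives a view of the same extended-attribute
namespace that openat(2) reaches with the O_XATTR flag; for instance (file
and attribute names below are only illustrative):

  # list the extended attributes attached to a file
  runat myfile ls -l
  # store a plain file as a named attribute, then read it back
  runat myfile cp /tmp/attrdata myattr
  runat myfile cat myattr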



-- 
Darren J Moffat


Re: [zfs-discuss] ZFS is very slow in our test, when the capacity is high

2007-10-12 Thread LI Xin
eSX wrote:
 We are testing ZFS in OpenSolaris, writing TBs of data to ZFS. But when the
 capacity is close to 90%, ZFS becomes very slow. We do ls, rm, and write
 something, and those operations are terribly slow; for example, an ls in a
 directory which has about 4000 directories takes about 5-10s!
 We've checked the CPU, memory (and swap), and I/O, and all of those are
 normal and idle. So is there anything special about ZFS when capacity is
 high, as with UFS? Thanks.

It's insane to exhaust every last bit of almost *any* file system IMHO,
because when this happens you end up with a lot of fragmentation (this will
affect ZFS more than UFS, because of the difference in on-disk layout), which
will hurt performance because of the increased disk seeks.

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





[zfs-discuss] practicality of zfs send/receive for failover

2007-10-12 Thread Paul B. Henson

We've been evaluating ZFS as a possible enterprise file system for our
campus. Initially, we were considering one large cluster, but it doesn't
look like that will scale to meet our needs. So, now we are thinking about
breaking our storage across multiple servers, probably three.

However, I don't necessarily want to incur the expense and hassle of
maintaining three clusters, so I am thinking of having three standalone
servers instead. If one of them happens to break, we're only down 1/3 of our
files, not all of them. Given our budget, that's probably an acceptable
compromise.

On the other hand, it would be nice to have some level of redundancy, so
I'm toying with the idea of having each server be primary for some amount
of storage, and secondary for a different set of storage. Each server would
use zfs send to replicate snapshots to its backup server.

I've read a number of threads and blog posts discussing zfs send/receive
and its applicability to such an implementation, but I'm curious whether
anyone has actually done something like that in practice, and if so how well
it worked.

What authentication/authorization was used to transfer the zfs snapshots
between servers? I'm thinking about using ssh with public-key authentication
over an internal private network that the servers are connected to via
different Ethernet interfaces than the ones facing the world and actually
serving files. Does zfs send/receive have to be done with root privileges, or
can RBAC or some other mechanism be used so that a lower-privileged account
could be used?

In the various threads I read about this type of failover, there was some
issue about marking the filesystems readonly on the slave, or else changes
there would cause subsequent snapshot receives to fail? Supposedly some
feature was added to zfs receive to rectify this problem; did that make it
into S10U4, or is it still only in the development version?
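
The basic shape I have in mind is something like the following, run from
cron on the primary (host, pool, and snapshot names are placeholders, and
I'm assuming the -F flag on the receive side is the forced-rollback
behaviour referred to above):

  # take today's snapshot and ship the delta since yesterday's
  zfs snapshot tank/home@2007-10-12
  zfs send -i tank/home@2007-10-11 tank/home@2007-10-12 | \
      ssh -i /root/.ssh/replica_key replica zfs receive -F backup/home

Whether that can run as something other than root presumably depends on
whether the delegated-administration bits ("zfs allow") are available on the
release in question.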

Did you have automatic or manual failover? I'm thinking about having a manual
failover process: since the replication is only one way, if a failover
happened automatically and the secondary server started providing service,
updates would happen there that would not be on the primary server if it
suddenly came back to life and took over again.

How did you implement the failover at the network level? DNS change?
Virtual IP address switched from one server to the other?

Thanks much for any feedback...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] zfs/zpools iscsi

2007-10-12 Thread roland
> I was curious whether I could use ZFS to have it shared on those two hosts

no, that's not possible for now.

> but apparently I was unable to do so, for obvious reasons.

you will corrupt your data!

> On my Linux Oracle RAC I was using OCFS, which works just as I need it

yes, because ocfs is built for that.
it's a cluster filesystem - that's what you need for this.
another name is "shared disk filesystem";
see wikipedia - http://en.wikipedia.org/wiki/List_of_file_systems

> maybe if not now then in the future?

it has been discussed, iirc.

> is there anything I could do at this moment to have my two other Solaris
> clients see the zpool that I am presenting to them both via iSCSI?

zpool? i assume you mean zvol, correct?

> Are there any solutions of this kind out there?

i'm not that deep into solaris, but iirc there isn't one for free.
veritas is quite popular, but you need to spend lots of bucks for it.
maybe SAM-QFS?

regards
roland
 
 


Re: [zfs-discuss] zfs/zpools iscsi

2007-10-12 Thread Richard Elling
roland wrote:
 Are there any solutions of this kind out there?
 i'm not that deep into solaris, but iirc there isn't one for free.
 veritas is quite popular, but you need to spend lots of bucks for it.
 maybe SAM-QFS?

We have lots of customers using shared QFS with RAC.
QFS is on the road to open source, too bad RAC isn't :-P
http://www.opensolaris.org/os/community/storage/

  -- richard


Re: [zfs-discuss] ZFS array NVRAM cache

2007-10-12 Thread Vincent Fox
So what are the failure modes to worry about?

I'm not exactly sure what the implications of this nocache option are for my
configuration.

Say, from a recent example: I have an overtemp, and first one array shuts
down, then the other one.

I come in after the A/C is restored, then shut down and repower everything.
Bring up the zpool and scrub it, and I would think I should be good.

Any other scenarios I should play out?

I really like mirrored dual-array setups with clustered frontends in failover 
mode.  I want performance but don't want to risk my data, so if there are 
reasons to remove this option from /etc/system I will do that.

I still see little or no usage of the cache according to the status-page on the 
3310.  I really would expect more activity so I'm wondering if it's still not 
being used.
 
 


Re: [zfs-discuss] io:::start and zfs filenames?

2007-10-12 Thread Matthew Ahrens
Jim Mauro wrote:
 Hi Neel - Thanks for pushing this out. I've been tripping over this for 
 a while.
 
 You can instrument zfs_read() and zfs_write() to reliably track filenames:
 
 #!/usr/sbin/dtrace -s
 
 #pragma D option quiet
 
 zfs_read:entry,
 zfs_write:entry
 {
 printf("%s of %s\n", probefunc, stringof(args[0]->v_path));
 }

FYI, this tracks the system calls made (which may hit in the cache), whereas 
io:::start tracks i/o sent down to the disk driver.

 What sayeth the ZFS team regarding the use of a stable DTrace provider 
 with their file system?

Sounds like a good idea.  However, as others discussed, the data needed to do 
that is not immediately available.

--matt


Re: [zfs-discuss] ZFS Space Map optimalization

2007-10-12 Thread Matthew Ahrens
Łukasz K wrote:
 Now space maps, intent log, spa history are compressed.
 All normal metadata (including space maps and spa history) is always
 compressed.  The intent log is never compressed.
 
 Can you tell me where the space map is compressed?

we specify that it should be compressed in dbuf_sync_leaf:

    if (dmu_ot[dn->dn_type].ot_metadata) {
        checksum = os->os_md_checksum;
        compress = zio_compress_select(dn->dn_compress,
            os->os_md_compress);

os_md_compress is set to ZIO_COMPRESS_LZJB in dmu_objset_open_impl(), so the 
compression will happen in lzjb_compress().

 I want to propose a few optimizations here:
  - space map block size should be dynamic (the 4 KB buffer is a bug);
    my space map on a Thumper takes over 3.5 GB / 4 kB = 855k blocks

A small block size is used because we typically have to keep the last block 
of every space map in memory (as we are constantly appending to it).  This is 
a trade-off between memory usage and time taken to load the space map.

--matt


Re: [zfs-discuss] practicality of zfs send/receive for failover

2007-10-12 Thread Vincent Fox
So the problem with the zfs send/receive approach is: what if your network
glitches out during the transfers?

We have these once a day due to some as-yet-undiagnosed switch problem, a
drop-out of 50 seconds or so, which is enough to trip all our IPMP setups and
enough to abort SSH transfers in progress.

Just saying you need error-checking to account for these. The transfers in my
testing seemed fairly slow: I was doing a full send and receive (not
incremental) of some 400 gigs, and it took over 24 hours, at which point I
lost the connection and gave up on the idea. Once you were just down to
incrementals it probably wouldn't be so bad.
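
i.e. at minimum wrap the pipeline so a dropped connection gets noticed
instead of silently leaving a partial receive behind, something like this
(dataset and host names are made up):

  zfs send -i tank/fs@prev tank/fs@now | \
      ssh replica zfs receive -F backup/fs
  if [ $? -ne 0 ]; then
      logger -p daemon.err "zfs replication of tank/fs@now failed"
  fi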
 
 


Re: [zfs-discuss] Some test results: ZFS + SAMBA + Sun Fire X4500 (Thumper)

2007-10-12 Thread Matthew Ahrens
Tim Thomas wrote:
 Hi
 
 this may be of interest:
 
 http://blogs.sun.com/timthomas/entry/samba_performance_on_sun_fire
 
 I appreciate that this is not a frightfully clever set of tests, but I
 needed some throughput numbers, and the easiest way to share the
 results is to blog.

It seems that we can conclude that for this workload (streaming write over 
SAMBA), you saturated 2 x 1Gb/sec ethernet links, and the rest of the system 
(CPU, disk bandwidth) was under-utilized.

--matt


Re: [zfs-discuss] ZFS 60 second pause times to read 1K

2007-10-12 Thread Matthew Ahrens
Michael Kucharski wrote:
 We have an x4500 set up as a single 4*(raidz2 9+2) + 2 spares pool and have
 the file system mounted over v5 krb5 NFS and accessed directly. The pool
 is a 20TB pool and is using . There are three filesystems: backup, test
 and home. Test has about 20 million files and uses 4TB. These files range
 from 100B to 200MB. Test has a cron job to take snapshots every 15 minutes,
 starting at 1 minute past the hour. Every 15 minutes, at 2 minutes past the
 hour, a cron batch job runs a zfs send/recv to the backup filesystem. Home
 has only 100GB.

Are you doing this send|recv within the same pool?  What's the motivation for 
that?  Can't you just use zfs clone, which would be much faster and use 
less disk space?  Or if you want another copy (which seems unlikely since you 
can already tolerate any 2 disks failing in your pool), then use zfs set 
copies=2 fs.
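
For example (dataset names are made up):

  # a clone shares blocks with its origin snapshot instead of copying them
  zfs snapshot tank/test@1215
  zfs clone tank/test@1215 tank/backup-1215

  # or keep two on-disk copies of every block in the filesystem
  # (applies to data written after the property is set)
  zfs set copies=2 tank/test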

--matt


[zfs-discuss] enlarge a mirrored pool

2007-10-12 Thread Ivan Wang
Hi all,

Forgive me if this is a dumb question. Is it possible for a two-disk mirrored
zpool to be seamlessly enlarged by gradually replacing the existing disks
with larger ones?

Say, in a constrained desktop where there is only space for two internal
disks, could I begin with two 160G disks, then at some point replace one of
the 160G disks with a 250G, let it resilver, then replace the other 160G, and
finally end up with a two-disk 250G mirrored pool?

Cheers,
Ivan.
 
 


Re: [zfs-discuss] enlarge a mirrored pool

2007-10-12 Thread Erik Trimble
Ivan Wang wrote:
 Hi all,
 
 Forgive me if this is a dumb question. Is it possible for a two-disk
 mirrored zpool to be seamlessly enlarged by gradually replacing the existing
 disks with larger ones?

 Say, in a constrained desktop where there is only space for two internal
 disks, could I begin with two 160G disks, then at some point replace one of
 the 160G disks with a 250G, let it resilver, then replace the other 160G,
 and finally end up with a two-disk 250G mirrored pool?
 
 Cheers,
 Ivan.
  
  

Yes.

After both drives are replaced, you will automatically see the 
additional space.

-- 
Erik Trimble
Java System Support
Mailstop:  usca14-102
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)


Re: [zfs-discuss] enlarge a mirrored pool

2007-10-12 Thread Neil Perrin


Erik Trimble wrote:
 Ivan Wang wrote:
 Hi all,

 Forgive me if this is a dumb question. Is it possible for a two-disk
 mirrored zpool to be seamlessly enlarged by gradually replacing the
 existing disks with larger ones?

 Say, in a constrained desktop where there is only space for two internal
 disks, could I begin with two 160G disks, then at some point replace one
 of the 160G disks with a 250G, let it resilver, then replace the other
 160G, and finally end up with a two-disk 250G mirrored pool?

 Cheers,
 Ivan.
  
  
 
 Yes.
 
 After both drives are replaced, you will automatically see the 
 additional space.

I believe that currently, after the last replace, an export/import of the
pool is needed to force ZFS to see the increased size.
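
So the whole sequence for a two-disk mirror would look roughly like this
(device names are just examples):

  zpool replace tank c1t0d0   # swap in the first 250G disk, wait for resilver
  zpool status tank           # confirm the resilver has completed
  zpool replace tank c1t1d0   # swap in the second 250G disk, wait again
  zpool export tank
  zpool import tank           # the pool should now show the larger size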

Neil.
 