Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-25 Thread Miles Nordin
 ca == Carsten Aulbert carsten.aulb...@aei.mpg.de writes:
 ls == Lutz Schumann presa...@storageconcepts.de writes:

ca X25-E drives and a converter from 3.5 to 2.5 inches. So far
ca two systems have shown pretty bad instabilities with that.

instability after crashing or instability while running?  Lutz
Schumann 2010-01-10 seemed to find the x25m g2 was ignoring sync cache
commands when its write cache was set to ``on'', but it did do
uncached writing if you turned the write cache off for the whole
drive, albeit at half the performance advertised in the spec sheet:

 ls Intel X25-M G2: - If I pull the power cable much data is lost,
 ls although committed to the app (some hundred) - If I pull the
 ls sata cable no data is lost
  
ls ST3500418AS: - If I pull the power cable almost no data is
ls lost, but still the last write is lost (strange!)  - If I pull
ls the sata cable no data is lost

the test for it was to write a program that did 'write, sync, write,
sync' and notice that when yanking the x25m power connector with the
cache on, n transactions were lost, while yanking the SATA connector
lost 0 or 1 transactions.  Therefore I suspect the x25e, which also
lacks a supercap, might also be another deliberately-broken-to-inflate-specs
drive, and if it's instability after crashing you might try disabling
the x25e write cache (1/2 performance) and trying again?
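
if you do want to try the cache-off test, the way I'd expect it to work
on solaris is through format's expert mode -- the menu layout varies a
bit by release, so this is only a sketch, not something I've run on an
x25e myself:

  # format -e
      (pick the X25-E from the disk list)
  format> cache
  cache> write_cache
  write_cache> disable

and re-check the setting after a reboot, since not every drive or
controller keeps it across a power cycle.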




Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Edward Ned Harvey
 zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0
  mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0
 mirror c0t2d0 c1t2d0
  mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0
 mirror c4t3d0 c5t3d0
  mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0
 mirror c0t5d0 c1t5d0
  mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0
 mirror c4t6d0 c5t6d0
  mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0
 mirror c6t7d0 c7t7d0
  mirror c7t0d0 c7t4d0

This looks good.  But you probably want to stick a spare in there, and add
an SSD specified as a log device.
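
For example (the device names here are only placeholders for whatever
slots you still have free), something along these lines:

 zpool add testpool spare c5t0d0
 zpool add testpool log c5t4d0

adds a hot spare and a dedicated log device to the pool created above.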



Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Edward Ned Harvey
 Zfs does not strictly support RAID 1+0.  However, your sample command
 will create a pool based on mirror vdevs which is written to in a
 load-shared fashion (not striped).  This type of pool is ideal for

Although it's not technically striped according to the RAID definition of
striping, it does achieve the same performance result (actually better) so
people will generally refer to this as striping anyway.



Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Carsten Aulbert
On Thursday 21 January 2010 10:29:16 Edward Ned Harvey wrote:
  zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0
   mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0
  mirror c0t2d0 c1t2d0
   mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0
  mirror c4t3d0 c5t3d0
   mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0
  mirror c0t5d0 c1t5d0
   mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0
  mirror c4t6d0 c5t6d0
   mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0
  mirror c6t7d0 c7t7d0
   mirror c7t0d0 c7t4d0
 
 This looks good.  But you probably want to stick a spare in there, and
  add an SSD specified as a log device.

May I jump in here and ask how people are using SSDs reliably in an X4500? So far
we have had very little success with X25-E drives and a converter from 3.5 to 2.5
inches; two systems have shown pretty bad instabilities with that.

Anyone with a success here?

Cheers

Carsten


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Edward Ned Harvey
 zpool create testpool disk1 disk2 disk3

In the traditional sense of RAID, this would create a concatenated data set.
The size of the data set is the size of disk1 + disk2 + disk3.  However,
since this is ZFS, it's not constrained to linearly assigning virtual disk
blocks to physical disk blocks ...  ZFS will happily write a single large
file to all 3 disks simultaneously and just keep track of where all the
blocks landed.

As a result, you get performance which is 3x that of a single disk for large
files (like striping), but the performance for small files has not been harmed
(as it is in striping)...  As an added bonus, unlike striping, you can still
just add more disks to your zpool and expand your volume on the fly, as in the
example below.  The filesystem will dynamically adjust to accommodate more
space and more devices, and will intelligently optimize for performance.
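
For example, a minimal sketch using the disk names from your command
(disk4 is a hypothetical extra disk):

 zpool create testpool disk1 disk2 disk3
 zpool add testpool disk4
 zfs list testpool

The extra capacity shows up immediately, and new writes will favor the
freshly added (empty) vdev.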



Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Phil Harman
Can ASM match ZFS for checksum and self healing? The reason I ask is  
that the x45x0 uses inexpensive (less reliable) SATA drives. Even the  
J4xxx paper you cite uses SAS for production data (only using SATA for  
Oracle Flash, although I gave my concerns about that too).


The thing is, ZFS and the x45x0 seem made for each other. The latter  
only makes sense to me with all the goodness and assurance added by  
the former.


Phil


On 21 Jan 2010, at 02:58, John hort...@gmail.com wrote:

Have you looked at using Oracle ASM instead of or with ZFS? Recent  
Sun docs concerning the F5100 seem to recommend a hybrid of both.


If you don't go that route, generally you should separate redo logs  
from actual data so they don't compete for I/O, since a redo switch  
lagging hangs the database. If you use archive logs, separate those  
onto yet another pool.


Realistically, it takes lots of analysis with different  
configurations. Every workload and database is different.


A decent overview of configuring JBOD-type storage for databases is  
here, though it doesn't use ASM...

https://www.sun.com/offers/docs/j4000_oracle_db.pdf
It's a couple years old and that might contribute to the lack of an  
ASM mention.



Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread John
No, but that's where the hybrid solution comes in. ASM would be used for the 
database files and ZFS for the redo/archive logs and undo. Corrupt blocks in 
the datafiles would be repaired with data from redo during a recovery, and ZFS 
should give you assurance that the redo didn't get corrupted. Sun's docs on the 
F5100 point to this as the best solution for performance and 
recoverability/reliability.



Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Bob Friesenhahn

On Thu, 21 Jan 2010, Edward Ned Harvey wrote:


Although it's not technically striped according to the RAID definition of
striping, it does achieve the same performance result (actually better) so
people will generally refer to this as striping anyway.


People will say a lot of things, but that does not make them right. 
At some point, using the wrong terminology becomes foolish and 
counterproductive.


Striping and load-share seem quite different to me.  The difference is 
immediately apparent when watching the drive activity LEDs.
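
If you don't happen to have eyes on the LEDs, a rough software 
equivalent is to watch the per-vdev counters, e.g.:

 zpool iostat -v testpool 5

prints the read/write operations and bandwidth for each mirror vdev 
every five seconds, so you can see how (un)evenly the load lands.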


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Brad
Did you buy the SSDs directly from Sun?  I've heard there could possibly be 
firmware that's vendor-specific for the X25-E.


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-21 Thread Carsten Aulbert
Hi

On Friday 22 January 2010 07:04:06 Brad wrote:
 Did you buy the SSDs directly from Sun?  I've heard there could possibly be
  firmware that's vendor specific for the X25-E.

No.

So far I've heard that they are not readily available, as certification 
procedures are still underway (apart from this, the 8850 firmware should be OK, 
but that's just what I've heard).

C


[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread Brad
Can anyone recommend an optimum and redundant striped configuration for an X4500? 
 We'll be using it for an OLTP (Oracle) database and will need the best performance. 
 Is it also true that reads will be load-balanced across the mirrors?

Is this considered a RAID 1+0 configuration?
zpool create -f testpool \
 mirror c0t0d0 c1t0d0  mirror c4t0d0 c6t0d0  mirror c0t1d0 c1t1d0 \
 mirror c4t1d0 c5t1d0  mirror c6t1d0 c7t1d0  mirror c0t2d0 c1t2d0 \
 mirror c4t2d0 c5t2d0  mirror c6t2d0 c7t2d0  mirror c0t3d0 c1t3d0 \
 mirror c4t3d0 c5t3d0  mirror c6t3d0 c7t3d0  mirror c0t4d0 c1t4d0 \
 mirror c4t4d0 c6t4d0  mirror c0t5d0 c1t5d0  mirror c4t5d0 c5t5d0 \
 mirror c6t5d0 c7t5d0  mirror c0t6d0 c1t6d0  mirror c4t6d0 c5t6d0 \
 mirror c6t6d0 c7t6d0  mirror c0t7d0 c1t7d0  mirror c4t7d0 c5t7d0 \
 mirror c6t7d0 c7t7d0  mirror c7t0d0 c7t4d0

Is it even possible to do a RAID 0+1?
zpool create -f testpool \
 c0t0d0 c4t0d0 c0t1d0 c4t1d0 c6t1d0 c0t2d0 c4t2d0 c6t2d0 \
 c0t3d0 c4t3d0 c6t3d0 c0t4d0 c4t4d0 c0t5d0 c4t5d0 c6t5d0 \
 c0t6d0 c4t6d0 c6t6d0 c0t7d0 c4t7d0 c6t7d0 c7t0d0 \
 mirror c1t0d0 c6t0d0 c1t1d0 c5t1d0 c7t1d0 c1t2d0 c5t2d0 c7t2d0 \
 c1t3d0 c5t3d0 c7t3d0 c1t4d0 c6t4d0 c1t5d0 c5t5d0 c7t5d0 \
 c1t6d0 c5t6d0 c7t6d0 c1t7d0 c5t7d0 c7t7d0 c7t4d0


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread Bob Friesenhahn

On Wed, 20 Jan 2010, Brad wrote:

Can anyone recommend an optimum and redundant striped configuration 
for an X4500?  We'll be using it for an OLTP (Oracle) database and 
will need the best performance.  Is it also true that reads will be 
load-balanced across the mirrors?


Is this considered a raid 1+0 configuration?


Zfs does not strictly support RAID 1+0.  However, your sample command 
will create a pool based on mirror vdevs which is written to in a 
load-shared fashion (not striped).  This type of pool is ideal for 
databases since it consumes the least of those precious IOPS.  With 
SATA drives, you need to preserve those precious IOPS as much as 
possible.


Zfs does not do striping across vdevs, but its load-share approach 
writes on a (roughly) round-robin basis; it will also prefer a less 
loaded vdev when under a heavy write load, and will prefer to write 
to an empty vdev rather than to an almost full one.  Due to zfs 
behavior, it is best to provision the full number of disks to start 
with so that the disks are evenly filled and the data is well 
distributed.


Reads from mirror pairs use a simple load-share algorithm to select 
the mirror side; it does not attempt to strictly balance the reads. 
This does provide more performance than one disk, but not twice the 
performance.



Is it even possible to do a raid 0+1?


No.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread John
Have you looked at using Oracle ASM instead of or with ZFS? Recent Sun docs 
concerning the F5100 seem to recommend a hybrid of both.

If you don't go that route, generally you should separate redo logs from actual 
data so they don't compete for I/O, since a redo switch lagging hangs the 
database. If you use archive logs, separate those onto yet another pool.
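
A minimal sketch of that layout (device and pool names are only 
illustrative, not a sizing recommendation) might be:

 zpool create datapool mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0
 zpool create redopool mirror c0t2d0 c1t2d0
 zpool create archpool mirror c4t2d0 c5t2d0
 zfs create datapool/oradata
 zfs create redopool/redo
 zfs create archpool/arch

then point the datafiles, redo logs and archive destination at the 
corresponding filesystems.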

Realistically, it takes lots of analysis with different configurations. Every 
workload and database is different.

A decent overview of configuring JBOD-type storage for databases is here, 
though it doesn't use ASM...
https://www.sun.com/offers/docs/j4000_oracle_db.pdf
It's a couple years old and that might contribute to the lack of an ASM mention.


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread Brad
@hortnon - ASM is not within the scope of this project.


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread Brad
Zfs does not do striping across vdevs, but its load share approach
will write based on (roughly) a round-robin basis, but will also
prefer a less loaded vdev when under a heavy write load, or will
prefer to write to an empty vdev rather than write to an almost full
one.

I'm trying to visualize this...can you elaborate or give an ascii example?

So with the syntax below, load sharing is implemented?

zpool create testpool disk1 disk2 disk3


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread Brad
I was reading your old posts about load-shares 
http://opensolaris.org/jive/thread.jspa?messageID=294580#294580 .

So between raidz and load-share striping: raidz stripes a file system block 
evenly across each vdev, but with load sharing the file system block is written 
to a vdev that's not filled up (slab??), and then for the next file system block 
it continues filling up the 1MB slab until it's full before moving on to the 
next one?

Richard can you comment? :)


Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration

2010-01-20 Thread Richard Elling
On Jan 20, 2010, at 8:14 PM, Brad wrote:

 I was reading your old posts about load-shares 
 http://opensolaris.org/jive/thread.jspa?messageID=294580#294580 .
 
 So between raidz and load-share striping: raidz stripes a file system block 
 evenly across each vdev, but with load sharing the file system block is 
 written to a vdev that's not filled up (slab??), and then for the next file 
 system block it continues filling up the 1MB slab until it's full before 
 moving on to the next one?
 
 Richard can you comment? :)

That seems to be a reasonable interpretation.  The nit is that the 1MB
changeover is not the slab size.  Slab sizes are usually much larger.
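
If you want to see the actual metaslab layout rather than guess, zdb will
dump it -- the output format varies by build, so treat this as a sketch:

 zdb -m testpool

lists each top-level vdev's metaslabs with their sizes and free space,
and they are typically far larger than the ~1MB write changeover.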

In my list of things to remember for Oracle and ZFS:
1. recordsize is the biggest tuning knob
2. put redo log on a low latency device, SSD if possible
3. avoid raidz, when possible
4. prefer to give memory to the SGA rather than the ARC

Roch provides some good guidelines when you have an SSD and a
ZFS release which offers the logbias property here:
http://blogs.sun.com/roch/entry/synchronous_write_bias_property
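
As a sketch of 1, 2 and the logbias property together (dataset names are
illustrative, and 8k assumes the default Oracle db_block_size):

 zfs set recordsize=8k datapool/oradata
 zfs set logbias=throughput datapool/oradata
 zfs set logbias=latency redopool/redo
 zpool add redopool log c5t4d0

redo goes through the low-latency slog while datafile writes skip it,
which is essentially what Roch's post recommends.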

 -- richard

