Re: [zfs-discuss] How to avoid striping ?

2010-11-01 Thread Erik Ableson
On 18 Oct 2010, at 08:44, Habony, Zsolt zsolt.hab...@hp.com wrote:

 Hi,
 
 I have seen a similar question on this list in the archive but
 haven’t seen the answer.
 
 Can I avoid striping across top level vdevs ?
 
 If I use a zpool which is one LUN from the SAN, and when it
 becomes full I add a new LUN to it.
 
 But I cannot guarantee that the LUN will not come from the same spindles on
 the SAN.
 
 Can I force zpool to not to stripe the data ?
 
No. The basic principle of the zpool is dynamic striping across vdevs in order 
to ensure that all available spindles are contributing to the workload. If you 
want/need more granular control over what data goes to which disk, then you'll 
need to create multiple pools.

Just create a new pool from the new SAN volume and you will segregate the IO. 
But then you risk having hot and cold spots in your storage, as the IO won't be 
striped. If the approach is to fill a vdev completely before adding a new one, 
this possibility exists anyway until the block rewrite feature arrives to 
redistribute existing data across available vdevs.
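
For concreteness, a minimal sketch of the two approaches (pool and device names 
here are hypothetical):

  # Dynamic striping: the new LUN becomes a second top-level vdev of the pool
  zpool add apppool c4t1d0

  # Segregated IO: put the new LUN into its own pool instead
  zpool create apppool2 c4t1d0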

Cheers,

Erik



Re: [zfs-discuss] How to avoid striping ?

2010-11-01 Thread Ketola Sami

On 18 Oct 2010, at 12:40, Habony, Zsolt wrote:

 Is there a way to avoid it, or can we be sure that the problem does not 
 exist at all ?
 Grow the existing LUN rather than adding another one.
 
 The only way to have ZFS not stripe is to not give it devices to stripe 
 over.  So stick with simple mirrors ...
 
 (I do not mirror, as the storage gives redundancy behind LUNs.)

Then you lose ZFS's self-healing ability.

Sami


Re: [zfs-discuss] How to avoid striping ?

2010-10-19 Thread Sami Ketola

On 18 Oct 2010, at 17:44, Habony, Zsolt wrote:

 Thank You all for the comments.
 
 You should imagine a datacenter with 
 - standards not completely under my control.
 - SAN for many OSs, one of them Solaris (and not the majority).

So you get LUNs from the storage team and there is nothing you can do about that. 
Then just use the LUNs you get as well as you can, which means a host-based 
mirrored zpool.

 - usually level 2 engineers doing filesystem increases.
 - hundreds of physical boxes, dozens of virtuals on one physical
 - ability to move VMs (zones) across physical boxes. (by assigning LUNs to 
 other boxes)

You can do that even if the RAID management is done host-side with ZFS. 

 
 That probably explains why I cannot use host-based RAID management; it is 
 done by the storage team as standard.

No, it does not. I would still let ZFS do the RAID management on the host side, 
even if you can't stop the storage team from RAIDing it again on the storage box.


 I cannot assign whole disks to boxes, as I get LUNs standardized for all 
 other OSs, and in a size optimized for small virtual machines.

You still should mirror across two storage boxes.
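
A minimal sketch of such a host-side mirror, assuming one LUN from each of two 
arrays (device names hypothetical):

  # ZFS keeps two copies and can repair either side from the other
  zpool create apppool mirror c4t0d0 c5t0d0
  zpool status apppool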

Sami



Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Darren J Moffat

On 18/10/2010 07:44, Habony, Zsolt wrote:

I have seen a similar question on this list in the archive but haven’t
seen the answer.

Can I avoid striping across top level vdevs ?

If I use a zpool which is one LUN from the SAN, and when it becomes full
I add a new LUN to it.

But I cannot guarantee that the LUN will not come from the same spindles
on the SAN.


That sounds like a problem with your SAN config if that matters to you.


Can I force zpool to not to stripe the data ?


You can't, but why do you care ?

--
Darren J Moffat


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Habony, Zsolt
In many large datacenters, a different storage team handles LUN requests and 
assignment.
We ask for a LUN of a specific size, and we get one.

It might turn out that the first vdev (LUN) is at the beginning of a RAID set on 
the storage,
and the second vdev is at the end of the same RAID set, on the same physical 
disks. (If not at creation time, then later, when a filled zpool is grown by 
adding a LUN.)

I worry about head thrashing.  Though the memory cache of a large storage array 
should ease the problem, I would be happier if I could be sure that the zpool 
will not be handled as a stripe.

Is there a way to avoid it, or can we be sure that the problem does not exist 
at all ?

-Original Message-
From: Darren J Moffat [mailto:darr...@opensolaris.org] 
Sent: 18 October 2010 10:19
To: Habony, Zsolt
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] How to avoid striping ?

On 18/10/2010 07:44, Habony, Zsolt wrote:
 I have seen a similar question on this list in the archive but haven't
 seen the answer.

 Can I avoid striping across top level vdevs ?

 If I use a zpool which is one LUN from the SAN, and when it becomes full
 I add a new LUN to it.

 But I cannot guarantee that the LUN will not come from the same spindles
 on the SAN.

That sounds like a problem with your SAN config if that matters to you.

 Can I force zpool to not to stripe the data ?

You can't, but why do you care ?

-- 
Darren J Moffat


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Habony, Zsolt
No. The basic principle of the zpool is dynamic striping across vdevs in order 
to ensure that all available spindles are contributing to the workload. If 
you want/need more granular control over what data goes to which disk, then 
you'll need to create multiple pools.

Just create a new pool from the new SAN volume and you will segregate the IO.

That's my understanding and that's my problem.
You have an application filesystem from one LUN. (vxfs is expensive, ufs/svm is 
not really able to handle online filesystem increase. Thus we plan to use zfs 
for application filesystems.)
When it fills up you increase it by adding a new LUN.

You have to make sure that the added LUN is on different physical disks. That 
might not be obvious with today's large storage arrays, with thousands of LUNs.

If I can force concatenation, then I do not have to investigate where the 
existing parts of the filesystems are.



Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Darren J Moffat

On 18/10/2010 09:28, Habony, Zsolt wrote:

I worry about head thrashing.  Though memory cache of large storage should make 
the problem


Is that really something you should be worried about with all the other 
software and hardware between ZFS and the actual drives ?


If that is a problem then it isn't ZFS causing it; ZFS will just be using 
the LUNs that were given to it by the SAN.  An access pattern of an 
application on a completely different filesystem could still mean that 
you are using both LUNs in that way.



Is there a way to avoid it, or can we be sure that the problem does not exist 
at all ?


Grow the existing LUN rather than adding another one.

The only way to have ZFS not stripe is to not give it devices to stripe 
over.  So stick with simple mirrors, e.g. this style of configuration:


  pool: builds
 state: ONLINE
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        builds      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0

Where in your configuration c7t3d0/c8t4d0 are your LUNs from the SAN.

Rather than this style:

  pool: builds
 state: ONLINE
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        builds      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c8t5d0  ONLINE       0     0     0
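
If the existing LUN is grown on the SAN instead of adding a second one, the pool 
can pick up the extra space in place; a sketch, assuming a zpool version recent 
enough to have the autoexpand property, using the pool and device from the 
example above:

  # let the pool expand automatically when the underlying LUN is grown
  zpool set autoexpand=on builds

  # or expand an already-resized LUN explicitly
  zpool online -e builds c7t3d0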

--
Darren J Moffat


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Darren J Moffat

On 18/10/2010 10:01, Habony, Zsolt wrote:

If I can force concatenation, then I do not have to investigate where the 
existing parts of the filesystems are.


You can't; the code for concatenation rather than striping does not 
exist and there are no plans to add it.


Instead of assuming you have a problem, I'd highly recommend you go with 
the recommendation in my other email, or don't worry about it.  Don't 
assume that you will have a problem with ZFS because of your experience 
with other systems.  Striping isn't bad; it is usually good.


Or fix the root cause of the problem - which in this example case isn't 
ZFS - on the SAN where the LUNs are getting allocated.


--
Darren J Moffat


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Brandon High
On Mon, Oct 18, 2010 at 1:28 AM, Habony, Zsolt zsolt.hab...@hp.com wrote:
 Is there a way to avoid it, or can we be sure that the problem does not exist 
 at all ?

ZFS will coalesce asynchronous writes, which should help for most of
the head thrash on write. Using a log device will convert sync writes
to async.

For reads, make sure you have enough memory and a cache device.
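
A sketch of adding both kinds of device (pool and device names are hypothetical):

  # separate intent log (slog) absorbs synchronous writes
  zpool add tank log c9t0d0

  # L2ARC cache device serves repeated reads without hitting the SAN LUNs
  zpool add tank cache c9t1d0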

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Rainer J.H. Brandt
Hi,

Habony, Zsolt writes:
 You have an application filesystem from one LUN. (vxfs is expensive, ufs/svm 
 is not really able to handle online filesystem increase. Thus we plan to use 
 zfs for application filesystems.)

What do you mean by not really?
Use metattach to grow a metadevice or soft partition.
Use growfs to grow UFS on the grown device.
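
A sketch of that SVM/UFS sequence, with hypothetical device and mount-point names:

  # grow the metadevice (or soft partition) by another slice
  metattach d10 c2t3d0s0

  # grow UFS to fill the enlarged device; write-locks the fs while it runs
  growfs -M /data /dev/md/rdsk/d10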

Rainer
-- 

Rainer J. H. Brandt
Brandt  Brandt Computer GmbH
Am Wiesenpfad 6, 53340 Meckenheim
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
Handelsregister: Amtsgericht Bonn, HRB 10513

RFC 5322: Each line [...] SHOULD be no more than 78 characters


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Carson Gaspar

On 10/18/10 2:13 AM, Rainer J.H. Brandt wrote:


Habony, Zsolt writes:

You have an application filesystem from one LUN. (vxfs is
expensive, ufs/svm is not really able to handle online filesystem
increase. Thus we plan to use zfs for application filesystems.)


What do you mean by not really? Use metattach to grow a metadevice
or soft partition. Use growfs to grow UFS on the grown device.


He is probably referring to the fact that growfs locks the filesystem.

--
Carson Gaspar


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Habony, Zsolt
 You have an application filesystem from one LUN. (vxfs is expensive, ufs/svm 
 is not really able to handle online filesystem increase. Thus we plan to use 
 zfs for application filesystems.)

What do you mean by not really?
...
Use growfs to grow UFS on the grown device.

I know it's off-topic, but the statement "growfs will ``write-lock'' (see 
lockfs(1M)) a mounted file system when expanding" has always made me 
uncomfortable with this online expansion. I cannot guarantee how a specific 
application will behave during the expansion.


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Habony, Zsolt
 Is there a way to avoid it, or can we be sure that the problem does not 
 exist at all ?
Grow the existing LUN rather than adding another one.

The only way to have ZFS not stripe is to not give it devices to stripe 
over.  So stick with simple mirrors ...

(I do not mirror, as the storage gives redundancy behind LUNs.)

Online LUN expansion seems promising, and answering my question.
Thank You for that.

Zsolt




Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Casper . Dik

 You have an application filesystem from one LUN. (vxfs is expensive, 
 ufs/svm is not really able
 to handle online filesystem increase. Thus we plan to use zfs for application 
filesystems.)

What do you mean by not really?
...
Use growfs to grow UFS on the grown device.

I know it's off-topic, but the statement "growfs will ``write-lock''
 (see lockfs(1M)) a mounted filesystem when expanding" has always made me
uncomfortable with this online expansion. I cannot guarantee how a
specific application will behave during the expansion.


-w

 Write-lock  (wlock)  the  specified  file-system.  wlock
 suspends  writes  that  would  modify  the  file system.
 Access times are not kept while a file system is  write-
 locked.


All the applications trying to write will suspend.  What would be the
risk of that?
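
One way to find out is to write-lock the filesystem by hand for a moment and 
watch the application; a sketch using lockfs(1M), with a hypothetical mount point:

  # suspend writes, as growfs would do during expansion
  lockfs -w /data

  # ... observe the application, then release the lock
  lockfs -u /data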

Casper



Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Kyle McDonald



On 10/18/2010 4:28 AM, Habony, Zsolt wrote:

 I worry about head thrashing.
Why?

If your SAN group gives you a LUN that is at the opposite end of the
array, I would think that was because they had already assigned the
space in the middle to other customers (other groups like yours, or
other hosts of yours.)

If so, don't you think that all those other hosts and customers will
be reading and writing from that array all the time anyway? I mean if
the heads are going to 'thrash', then they'll be doing so even before
you request your second LUN right?

Adding your second LUN to the mix isn't going to seriously change the
workload on the disks in the array.

 Though memory cache of large storage should make the problem
 easier, I would be more happy if I can be sure that zpool will not
 be handled as a stripe.

 Is there a way to avoid it, or can we be sure that the problem does
 not exist at all ?

As I think the logic above suggests, if the problem exists, it exists
even when you have only one LUN.

  -Kyle



Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Kyle McDonald



On 10/18/2010 5:40 AM, Habony, Zsolt wrote:
 (I do not mirror, as the storage gives redundancy behind LUNs.)

By not enabling redundancy (mirror or RAIDZ[123]) at the ZFS level,
you are opening yourself to corruption problems that the underlying
SAN storage can't protect you from. The SAN array won't even notice
the problem.

ZFS will notice the problem, and (if you don't give it redundancy to
work with) it won't be able to repair it for you.

You'd be better off getting unprotected LUNs from the array and
letting ZFS handle the redundancy.
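
With ZFS-level redundancy in place, a scrub will both detect and repair that 
kind of corruption; a quick check, with a hypothetical pool name:

  # read every block and repair from the good side of the mirror
  zpool scrub apppool

  # the CKSUM column and the scan line show what was found and healed
  zpool status -v apppool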

  -Kyle
 Online LUN expansion seems promising, and answering my question.
 Thank You for that.

 Zsolt





Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Tim Cook
On Mon, Oct 18, 2010 at 3:28 AM, Habony, Zsolt zsolt.hab...@hp.com wrote:

 In many large datacenters, a different storage team handles LUN requests
 and assignment.
 We ask a LUN in a specific size, and we get one.

 It might result that the first vdev (LUN) is on a beginning of a RAID set
 on the storage,
 and the second vdev is on the end of the same RAID set on the same physical
 disks. (If not in the creation time, then
 later, during the increase of a filled zpool, by adding a LUN)

 I worry about head thrashing.  Though memory cache of large storage should
 make the problem
 easier, I would be more happy if I can be sure that zpool will not be
 handled as a stripe.

 Is there a way to avoid it, or can we be sure that the problem does not
 exist at all ?

 -Original Message-
 From: Darren J Moffat [mailto:darr...@opensolaris.org]
 Sent: 18 October 2010 10:19
 To: Habony, Zsolt
 Cc: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] How to avoid striping ?

 On 18/10/2010 07:44, Habony, Zsolt wrote:
  I have seen a similar question on this list in the archive but haven't
  seen the answer.
 
  Can I avoid striping across top level vdevs ?
 
  If I use a zpool which is one LUN from the SAN, and when it becomes full
  I add a new LUN to it.
 
  But I cannot guarantee that the LUN will not come from the same spindles
  on the SAN.

 That sounds like a problem with your SAN config if that matters to you.

  Can I force zpool to not to stripe the data ?

 You can't, but why do you care ?

 --
 Darren J Moffat



It shouldn't matter if LUNs are on the same backend disks.  Unless the
manufacturer of the array is brain-dead, their wide-striping algorithm
should handle it without breaking a sweat.  If the pool of disks can't
service the number of IOPS, the storage team should be moving LUNs
around; that's what they get paid to do.

Your *issue* shouldn't be an issue at all unless the backend disk is junk.
I've never seen an issue with Hitachi's HDP or NetApp's aggregates.

--Tim


Re: [zfs-discuss] How to avoid striping ?

2010-10-18 Thread Peter Jeremy
On 2010-Oct-18 17:45:34 +0800, casper@sun.com wrote:
 Write-lock  (wlock)  the  specified  file-system.  wlock
 suspends  writes  that  would  modify  the  file system.
 Access times are not kept while a file system is  write-
 locked.


All the applications trying to write will suspend.  What would be the
risk of that?

At least some versions of the Oracle RDBMS have timeouts around I/O and
will abort if I/O operations don't complete within a short period.

-- 
Peter Jeremy

