Re: [zfs-discuss] zfs send/receive - actual performance

2010-03-26 Thread Erik Ableson


On 25 March 2010, at 22:00, Bruno Sousa bso...@epinfante.com wrote:


Hi,

Indeed the 3 disks per vdev (raidz2) seems a bad idea...but it's the  
system I have now.
Regarding the performance...let's assume that a bonnie++ benchmark  
could go to 200 MB/s. Is getting the same values (or near) in a  
zfs send / zfs receive just a matter of putting, let's say, a 10GbE  
card between both systems?
I have the impression that benchmarks are always synthetic,  
therefore live/production environments behave quite differently.
Again, it might be just me, but with a 1Gb link, being able to  
replicate 2 servers at an average speed above 60 MB/s does seem  
quite good. However, like I said, I would like to know other results  
from other guys...


Don't forget to factor in your transport mechanism. If you're using  
ssh to pipe the send/recv data, your overall speed may end up being CPU  
bound, since I think ssh is single threaded: even on a multicore  
system you'll only be able to consume one core, and here raw clock  
speed will make a difference.
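
For illustration, a rough sketch of the difference (hostnames, dataset and
snapshot names are placeholders, and the arcfour cipher is only an example
of a cheaper cipher that has to be allowed on both ends):

  # default ssh transport: encryption runs on a single core
  zfs send tank/data@snap1 | ssh backuphost zfs receive -F backup/data

  # same pipeline with a lighter cipher, which often raises the CPU ceiling
  zfs send tank/data@snap1 | ssh -c arcfour backuphost zfs receive -F backup/data

On a trusted LAN, replacing ssh with an unencrypted transport (netcat,
mbuffer, and so on) removes that bottleneck entirely.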


Cheers,

Erik


Re: [zfs-discuss] zfs send/receive - actual performance

2010-03-26 Thread Bruno Sousa
Hi,

I think that in this case the CPU is not the bottleneck, since I'm not
using ssh.
However, my 1Gb network link probably is the bottleneck.

Bruno

On 26-3-2010 9:25, Erik Ableson wrote:

 On 25 March 2010, at 22:00, Bruno Sousa bso...@epinfante.com wrote:

 Hi,

 Indeed the 3 disks per vdev (raidz2) seems a bad idea...but it's the
 system I have now.
 Regarding the performance...let's assume that a bonnie++ benchmark
 could go to 200 MB/s. Is getting the same values (or near) in a
 zfs send / zfs receive just a matter of putting, let's say, a 10GbE
 card between both systems?
 I have the impression that benchmarks are always synthetic, therefore
 live/production environments behave quite differently.
 Again, it might be just me, but with a 1Gb link, being able to replicate
 2 servers at an average speed above 60 MB/s does seem quite good.
 However, like I said, I would like to know other results from other
 guys...

 Don't forget to factor in your transport mechanism. If you're using
 ssh to pipe the send/recv data, your overall speed may end up being CPU
 bound, since I think ssh is single threaded: even on a multicore
 system you'll only be able to consume one core, and here raw clock
 speed will make a difference.

 Cheers,

 Erik







Re: [zfs-discuss] zfs send/receive - actual performance

2010-03-26 Thread Bruno Sousa
Hi,

The jumbo frames in my case give me a boost of around 2 MB/s, so it's
not that much.
Now I will play with link aggregation and see how it goes, and of course
I'm counting on incremental replication being slower...but since the
amount of data will be much smaller, it will probably still deliver good
performance.
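
For reference, a minimal sketch of the dladm setup I'd expect for that
(link names are placeholders, and the exact syntax differs between
OpenSolaris builds):

  dladm create-aggr -L active -l e1000g0 -l e1000g1 aggr0
  dladm set-linkprop -p mtu=9000 aggr0   # jumbo frames on the aggregation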

And what a relief to know that I'm not alone when I say that storage
management is part science, part art and part voodoo magic ;)

Cheers,
Bruno

On 25-3-2010 23:22, Ian Collins wrote:
 On 03/26/10 10:00 AM, Bruno Sousa wrote:

 [Boy top-posting sure mucks up threads!]

 Hi,

 Indeed the 3 disks per vdev (raidz2) seems a bad idea...but it's the
 system I have now.
 Regarding the performance...let's assume that a bonnie++ benchmark
 could go to 200 MB/s. Is getting the same values (or near) in a
 zfs send / zfs receive just a matter of putting, let's say, a 10GbE
 card between both systems?

 Maybe, or a 2x1G LAG would be more cost effective (and easier to
 check!).  The only way to know for sure is to measure.  I managed to
 get slightly better transfers by enabling jumbo frames.

 I have the impression that benchmarks are always synthetic, therefore
 live/production environments behave quite differently.

 Very true, especially in the black arts of storage management!

 Again, it might be just me, but with a 1Gb link, being able to replicate
 2 servers at an average speed above 60 MB/s does seem quite good.
 However, like I said, I would like to know other results from other
 guys...

 As I said, the results are typical for a 1G link.  Don't forget you
 are measuring full copies, incremental replications may well be
 significantly slower.

 -- 
 Ian.
   






Re: [zfs-discuss] RAIDZ2 configuration

2010-03-26 Thread Edward Ned Harvey
 Using fewer than 4 disks in a raidz2 defeats the purpose of raidz2, as

 you will always be in a degraded mode.

 

Freddie, are you nuts?  This is false.

 

Sure you can use raidz2 with 3 disks in it.  But it does seem pointless to do 
that instead of a 3-way mirror.



Re: [zfs-discuss] RAIDZ2 configuration

2010-03-26 Thread Edward Ned Harvey
 Coolio.  Learn something new everyday.  One more way that raidz is
 different from RAID5/6/etc.

Freddie, again, you're wrong.  Yes, it's perfectly acceptable to create either 
raid-5 or raidz using 2 disks.  It's not degraded, but it does seem pointless 
to do this instead of a mirror.

Likewise, it's perfectly acceptable to create a raid-6 or raid-dp or raidz2 
using 3 disks.  It's not degraded, but seems pointless to do this instead of a 
3-way mirror.
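
For illustration, both layouts are accepted by zpool and end up with the same
usable capacity of a single disk (device names reused from the sun.com
example):

  zpool create tank raidz2 c1t0d0 c2t0d0 c3t0d0   # double parity across 3 disks
  zpool create tank mirror c1t0d0 c2t0d0 c3t0d0   # 3-way mirror

The 3-way mirror will generally resilver and read faster, which is why it is
the usual choice.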

Since it's pointless, some hardware vendors may not implement it in their raid 
controllers.  They might only give you the option of creating a mirror instead. 
 But that doesn't mean it's an invalid raid configuration.


 So, is it just a standard that hardware/software RAID setups require
 3 drives for a RAID5 array?  And 4 drives for RAID6?

It is just standard not to create a silly 2-disk raid5 or raidz.  But don't 
use the word "require".

It is common practice to create raidz2 only with 4 disks or more, but again, 
don't use the word "require".

Some people do in fact create these silly configurations just because they're 
unfamiliar with what it all means.  Take Bruno's original post as an example, and 
that article he referenced on sun.com.  How these things get started, I'll 
never know.



Re: [zfs-discuss] RAIDZ2 configuration

2010-03-26 Thread Edward Ned Harvey
Just because most people are probably too lazy to click the link, I’ll paste a 
phrase from that sun.com webpage below:

“Creating a single-parity RAID-Z pool is identical to creating a mirrored pool, 
except that the ‘raidz’ or ‘raidz1’ keyword is used instead of ‘mirror’.”

And

“zpool create tank raidz2 c1t0d0 c2t0d0 c3t0d0”

 

So … Shame on you, Sun, for doing this to your poor unfortunate readers.  It 
would be nice if the page were a wiki, or somehow able to have feedback 
submitted…

 

 

 

From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bruno Sousa
Sent: Thursday, March 25, 2010 3:28 PM
To: Freddie Cash
Cc: ZFS filesystem discussion list
Subject: Re: [zfs-discuss] RAIDZ2 configuration

 

Hmm...it might be completely wrong, but the idea of a raidz2 vdev with 3 disks 
came from reading http://docs.sun.com/app/docs/doc/819-5461/gcvjg?a=view 
.

This particular page has the following example :

zpool create tank raidz2 c1t0d0 c2t0d0 c3t0d0
# zpool status -v tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:
 
NAME  STATE READ WRITE CKSUM
tank  ONLINE   0 0 0
  raidz2  ONLINE   0 0 0
c1t0d0ONLINE   0 0 0
c2t0d0ONLINE   0 0 0
c3t0d0ONLINE   0 0 0
 

So...what am i missing here? Just a bad example in the sun documentation 
regarding zfs?

Bruno

On 25-3-2010 20:10, Freddie Cash wrote: 

On Thu, Mar 25, 2010 at 11:47 AM, Bruno Sousa bso...@epinfante.com wrote:

What do you mean by "Using fewer than 4 disks in a raidz2 defeats the purpose 
of raidz2, as you will always be in a degraded mode"? Does it mean that 
having 2 vdevs with 3 disks won't be redundant in the event of a drive 
failure?

 

raidz1 is similar to raid5 in that it is single-parity, and requires a minimum 
of 3 drives (2 data + 1 parity)

raidz2 is similar to raid6 in that it is double-parity, and requires a minimum 
of 4 drives (2 data + 2 parity)

 

IOW, a raidz2 vdev made up of 3 drives will always be running in degraded mode 
(it's missing a drive).

 

-- 

Freddie Cash
fjwc...@gmail.com


 
 


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Edward Ned Harvey
 OK, I have 3Ware looking into a driver for my cards (3ware 9500S-8) as
 I don't see an OpenSolaris driver for them.
 
 But this leads me to the fact that they do have a FreeBSD driver, so I
 could still use ZFS.
 
 What does everyone think about that? I bet it is not as mature as on
 OpenSolaris.

"Mature" is not the right term in this case.  FreeBSD has been around much
longer than opensolaris, and it's equally if not more mature.  FreeBSD is
probably somewhat less featureful, because their focus is heavily on the
reliability and stability side rather than early adoption.  Also, it's less
popular, so there is less package availability.

And FreeBSD in general will be built using older versions of packages than
what's in OpenSolaris.

Both are good OSes.  If you can use FreeBSD but OpenSolaris doesn't have the
driver for your hardware, go for it.



Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Svein Skogen

On 26.03.2010 12:46, Edward Ned Harvey wrote:
 OK, I have 3Ware looking into a driver for my cards (3ware 9500S-8) as
 I don't see an OpenSolaris driver for them.

 But this leads me to the fact that they do have a FreeBSD driver, so I
 could still use ZFS.

 What does everyone think about that? I bet it is not as mature as on
 OpenSolaris.
 
 "Mature" is not the right term in this case.  FreeBSD has been around much
 longer than opensolaris, and it's equally if not more mature.  FreeBSD is
 probably somewhat less featureful, because their focus is heavily on the
 reliability and stability side rather than early adoption.  Also, it's less
 popular, so there is less package availability.

Have you had a look at /usr/ports? ;)
As of a few days ago (when I last updated ports): 21430 ports.

I know, strictly speaking ports aren't packages, since things are
compiled locally (but you can output the result into packages if you
need to install on several systems).
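
As a concrete sketch (the port chosen is only an example; the resulting
package lands under the port's work directory, or /usr/ports/packages if that
exists):

  cd /usr/ports/ftp/wget && make package
  # ...copy the .tbz to the other machines and:
  pkg_add wget-*.tbz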

 And FreeBSD in general will be built using older versions of packages than
 what's in OpenSolaris.

Where did you get that info? Of course, ZFS is a little older:
NAMEPROPERTY  VALUESOURCE
pollux  version   14   default

But for other packages FreeBSD is at least as cutting edge as (Open)Solaris.

 Both are good OSes.  If you can use FreeBSD but OpenSolaris doesn't have the
 driver for your hardware, go for it.

Finally something we agree on. ;)

FreeBSD also has a less restrictive license.

//Svein



Re: [zfs-discuss] ZFS backup configuration

2010-03-26 Thread Edward Ned Harvey
 It seems like the zpool export will quiesce the drives and mark the pool
 as exported. This would be good if we wanted to move the pool at that
 time but we are thinking of a disaster recovery scenario. It would be
 nice to export just the config to where if our controller dies, we can
 use the zpool import on another box to get back up and running.

Correct, zpool export will offline your disks so you can remove them and
bring them somewhere else.

I don't think you need to do anything in preparation for possible server
failure.  Am I wrong about this?  I believe once your first server is down,
you just move your disks to another system, and then zpool import.  I
don't believe the export is necessary in order to do an import.  You would
only export if you wanted to disconnect while the system is still powered
on.

You just export to tell the running OS "I'm about to remove those disks,
so don't freak out."  But if there is no running OS, you don't worry about
it.

Again, I'm only 98% sure of the above.  So it might be wise to test on a
sandbox system.
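
For anyone trying this in a sandbox, a minimal sketch of both paths (the pool
name is a placeholder):

  # planned move, old host still running:
  zpool export tank
  # on the new host:
  zpool import tank

  # unplanned failure (pool never exported): force the import on the new host
  zpool import -f tank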

One thing that is worth mention:  If you have an HBA such as 3ware, or Perc,
or whatever ... it might be impossible to move the disks to a different HBA,
such as Perc or 3ware (swapped one for the other).  If your original system
is using Perc 6/i, only move them to another system with Perc 6/i (and if
possible, ensure the controller is using the same rev of firmware.)

If you're using a simple unintelligent non-raid sas or sata controller, you
should be good.



Re: [zfs-discuss] zfs send and ARC

2010-03-26 Thread Edward Ned Harvey
 In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated
 that zfs send sends uncompressed data and uses the ARC.
 
 If zfs send sends uncompressed data which has already been compressed,
 this is not very efficient, and it would be *nice* to see it send the
 original compressed data. (or an option to do it)

You've got 2 questions in your post.  The one above first ...

It's true that zfs send sends uncompressed data.  So I've heard.  I haven't 
tested it personally.

I seem to remember there's some work to improve this, but not available yet.  
Because it was easier to implement the uncompressed send, and that already is 
super-fast compared to all the alternatives.


 I thought I would ask a true or false type question mainly for
 curiosity's sake.
 
 If zfs send uses the standard ARC cache (when something is not already in
 the ARC) I would expect this to hurt (to some degree??) the performance
 of the system. (ie I assume it has the effect of replacing
 current/useful data in the cache with not very useful/old data)

And this is a separate question.

I can't say first-hand what ZFS does, but I have an educated guess.  I would 
say, for every block the zfs send needs to read ... if the block is in ARC or 
L2ARC, then it won't fetch again from disk.  But it is not obliterating the ARC 
or L2ARC with old data.  Because it's smart enough to work at a lower level 
than a user-space process, and tell the kernel (or whatever) something like 
"I'm only reading this block once; don't bother caching it for my sake."



Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Eugen Leitl
On Fri, Mar 26, 2010 at 07:46:01AM -0400, Edward Ned Harvey wrote:

 And FreeBSD in general will be built using older versions of packages than
 what's in OpenSolaris.
 
 Both are good OSes.  If you can use FreeBSD but OpenSolaris doesn't have the
 driver for your hardware, go for it.

While I use zfs with FreeBSD (FreeNAS appliance with 4x SATA 1 TByte drives), 
it is trailing OpenSolaris by at least a year if not longer and hence lacks
many of the key features for which people pick zfs over other file systems. The
performance, especially CIFS, is quite lacking. Purportedly (I have never seen
the source nor am I a developer), such crucial features are nontrivial to
backport because FreeBSD doesn't practice layer separation. Whether this is
still true in the future, we'll see once the Oracle/Sun dust settles.

-- 
Eugen* Leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Edward Ned Harvey
 While I use zfs with FreeBSD (FreeNAS appliance with 4x SATA 1 TByte
 drives), it is trailing OpenSolaris by at least a year if not longer and
 hence lacks many of the key features for which people pick zfs over other
 file systems. The performance, especially CIFS, is quite lacking.
 Purportedly (I have never seen the source nor am I a developer), such
 crucial features are nontrivial to backport because FreeBSD doesn't
 practice layer separation. Whether this is still true in the future,
 we'll see once the Oracle/Sun dust settles.

I'm not sure if it's a version thing, or something else ... I am running
solaris 10u6 (at least a year or two old) and the performance of that is not
just fine ... it's super awesome.

An important note, though, is that I'm using samba and not the zfs built-in
cifs.



Re: [zfs-discuss] zfs send and ARC

2010-03-26 Thread David Dyer-Bennet

On Fri, March 26, 2010 07:06, Edward Ned Harvey wrote:
 In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated
 that zfs send sends uncompressed data and uses the ARC.

 If zfs send sends uncompressed data which has already been compressed,
 this is not very efficient, and it would be *nice* to see it send the
 original compressed data. (or an option to do it)

 You've got 2 questions in your post.  The one above first ...

 It's true that zfs send sends uncompressed data.  So I've heard.  I
 haven't tested it personally.

 I seem to remember there's some work to improve this, but not available
 yet.  Because it was easier to implement the uncompressed send, and that
 already is super-fast compared to all the alternatives.

I don't know that it makes sense to.  There are lots of existing filter
packages that do compression; so if you want compression, just put them in
your pipeline.  That way you're not limited by what zfs send has
implemented, either.  When they implement bzip98 with a new compression
technology breakthrough, you can just use it :-) .
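
For example, a sketch of such a pipeline (hostnames and dataset names are
placeholders; any compressor with a matching decompressor works the same way):

  zfs send tank/fs@snap1 | gzip -c | ssh backuphost 'gunzip -c | zfs receive -F backup/fs'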

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] RAIDZ2 configuration

2010-03-26 Thread Eric D. Mudama

On Fri, Mar 26 at  7:29, Edward Ned Harvey wrote:

   Using fewer than 4 disks in a raidz2 defeats the purpose of raidz2, as
   you will always be in a degraded mode.



  Freddie, are you nuts?  This is false.

  Sure you can use raidz2 with 3 disks in it.  But it does seem pointless to
  do that instead of a 3-way mirror.


One thing about mirrors is you can put each side of your mirror on a
different controller, so that any single controller failure doesn't
cause your pool to go down.

While controller failure rates are very low, using 16/24 or 14/21
drives for parity on a dataset seems crazy to me.  I know disks can be
unreliable, but they shouldn't be THAT unreliable.  I'd think that
spending fewer drives for hot redundancy and then spending some of
the balance on an isolated warm/cold backup solution would be more
cost effective.

http://blog.richardelling.com/2010/02/zfs-data-protection-comparison.html

Quoting from the summary: "at some point, the system design will be
dominated by common failures and not the failure of independent
disks."

Another thought is that if heavy seeking is more likely to lead to
high temperature and/or drive failure, then reserving one or two slots
for an SSD L2ARC might be a good idea.  It'll take a lot of load off
of your spindles if your data set fits or mostly fits within the
L2ARC.  You'd need a lot of RAM to make use of a large L2ARC though,
just something to keep in mind.
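
For reference, adding and then watching an L2ARC device is a one-liner each
way (pool and device names are placeholders):

  zpool add tank cache c4t0d0
  zpool iostat -v tank 5   # the cache device gets its own line in the output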

We have a 32GB X25-E as L2ARC and though it's never more than ~5GB
full with our workloads, most every file access saturates the wire
(1.0 Gb/s ethernet) once the cache has warmed up, resulting in very
little IO to our spindles.

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org



Re: [zfs-discuss] ZFS hex dump diagrams?

2010-03-26 Thread Eric D. Mudama

On Fri, Mar 26 at 11:10, Sanjeev wrote:

On Thu, Mar 25, 2010 at 02:45:12PM -0700, John Bonomi wrote:

I'm sorry if this is not the appropriate place to ask, but I'm a
student and for an assignment I need to be able to show at the hex
level how files and their attributes are stored and referenced in
ZFS. Are there any resources available that will show me how this
is done?


You could try zdb.


Or just look at the source code.
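
As a hedged starting point (pool/dataset name and object number are
placeholders; the object number of a file is its inode number, visible with
ls -i or with zdb itself):

  zdb -dddd tank/home 12345   # dump the object's dnode, attributes and block pointers
  zdb -R tank 0:400000:200    # read a raw block (vdev:offset:size, in hex) for inspection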

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org



Re: [zfs-discuss] ZFS hex dump diagrams?

2010-03-26 Thread m...@bruningsystems.com

Hi,

You might take a look at
http://www.osdevcon.org/2008/files/osdevcon2008-max.pdf
and
http://www.osdevcon.org/2008/files/osdevcon2008-proceedings.pdf, starting
at page 36.

Or you might just use od -x file for the file part of your assignment.

Have fun.
max


Eric D. Mudama wrote:

On Fri, Mar 26 at 11:10, Sanjeev wrote:

On Thu, Mar 25, 2010 at 02:45:12PM -0700, John Bonomi wrote:

I'm sorry if this is not the appropriate place to ask, but I'm a
student and for an assignment I need to be able to show at the hex
level how files and their attributes are stored and referenced in
ZFS. Are there any resources available that will show me how this
is done?


You could try zdb.


Or just look at the source code.





Re: [zfs-discuss] RAIDZ2 configuration

2010-03-26 Thread David Magda
On Fri, March 26, 2010 07:38, Edward Ned Harvey wrote:
 Coolio.  Learn something new everyday.  One more way that raidz is
 different from RAID5/6/etc.

 Freddie, again, you're wrong.  Yes, it's perfectly acceptable to create
 either raid-5 or raidz using 2 disks.  It's not degraded, but it does seem
 pointless to do this instead of a mirror.

I think the word you're looking for is "possible", not "acceptable".



Re: [zfs-discuss] zfs send and ARC

2010-03-26 Thread David Magda
On Fri, March 26, 2010 09:46, David Dyer-Bennet wrote:

 I don't know that it makes sense to.  There are lots of existing filter
 packages that do compression; so if you want compression, just put them in
 your pipeline.  That way you're not limited by what zfs send has
 implemented, either.  When they implement bzip98 with a new compression
 technology breakthrough, you can just use it :-) .

Actually a better example may be using parallel implementations of popular
algorithms:

http://www.zlib.net/pigz/
http://www.google.com/search?q=parallel+bzip

Given the amount of cores we have nowadays (especially the Niagara-based
CPUs), might as well use them. There are also better algorithms out there
(some of which assume parallelism):

http://en.wikipedia.org/wiki/Xz
http://en.wikipedia.org/wiki/7z

If you're using OpenSSH, there are also some third-party patches that may
help in performance:

http://www.psc.edu/networking/projects/hpn-ssh/

However, if the data is already compressed (and/or deduped), there's no
sense in doing it again. If ZFS does have to go to disk, might as well
send the data as-is.




Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Bob Friesenhahn

On Fri, 26 Mar 2010, Edward Ned Harvey wrote:

"Mature" is not the right term in this case.  FreeBSD has been around much
longer than opensolaris, and it's equally if not more mature.  FreeBSD is
probably somewhat less featureful, because their focus is heavily on the
reliability and stability side rather than early adoption.  Also, it's less
popular, so there is less package availability.

And FreeBSD in general will be built using older versions of packages than
what's in OpenSolaris.


I am confused.  What is the meaning of "package" and why would 
OpenSolaris be ahead of FreeBSD when it comes to packages?  I am not 
sure what the meaning of "package" is but the claim seems quite 
dubious to me.


To be sure, FreeBSD 8.0 is behind with zfs versions:

% zpool upgrade
This system is currently running ZFS pool version 13.

but of course this is continually being worked on, and the latest 
stuff (with dedup) is in the process of being ported for delivery in 
FreeBSD 9.0 (and possibly FreeBSD 8.X).


I think that the main advantage that Solaris ultimately has over 
FreeBSD when it comes to zfs is that Solaris provides an advanced 
fault management system and FreeBSD does not.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


[zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

2010-03-26 Thread Bottone, Frank
Does zfs handle 4kb sectors properly or does it always assume 512b sectors?

If it does, we could manually create a slice properly aligned and set zfs to 
use it...
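
One way to see what ZFS actually chose for an existing pool is to look at the
vdev's ashift value (9 means 512-byte sectors, 12 would mean 4 KB); a hedged
sketch, with the pool name as a placeholder:

  zdb -C tank | grep ashift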








Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

2010-03-26 Thread Larry Liu

Yes, it does.

Bottone, Frank wrote:


Does zfs handle 4kb sectors properly or does it always assume 512b 
sectors?


If it does, we could manually create a slice properly aligned and set 
zfs to use it…





Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

2010-03-26 Thread Svein Skogen

On 26.03.2010 16:55, Bottone, Frank wrote:
 Does zfs handle 4kb sectors properly or does it always assume 512b sectors?
 
  
 
 If it does, we could manually create a slice properly aligned and set
 zfs to use it...

A real simple patch would be to attempt alignment with 4096 every time
(since 4096 is a multiple of 512 there really wouldn't be a performance
penalty here). This would mean that things are optimal on _ALL_ disks.
(and allow those of us using more advanced disk controllers to set the
strip-size (strip size, not stripe size) to 4K as well)

//Svein



Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

2010-03-26 Thread Bottone, Frank
Awesome!

Just when I thought zfs couldn’t get any better...



-Original Message-
From: larry@sun.com [mailto:larry@sun.com] 
Sent: Friday, March 26, 2010 11:58 AM
To: Bottone, Frank
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital 
GREEN Drives?)

Yes, it does.

Bottone, Frank wrote:

 Does zfs handle 4kb sectors properly or does it always assume 512b 
 sectors?

 If it does, we could manually create a slice properly aligned and set 
 zfs to use it…











[zfs-discuss] RAID10

2010-03-26 Thread Slack-Moehrle
Hi All,

I am looking at ZFS and I get that they call it RAIDZ which is similar to RAID 
5, but what about RAID 10? Isn't a RAID 10 setup better for data protection?

So if I have 8 x 1.5tb drives, wouldn't I:

- mirror drive 1 and 5
- mirror drive 2 and 6
- mirror drive 3 and 7
- mirror drive 4 and 8

Then stripe 1,2,3,4

Then stripe 5,6,7,8

How does one do this with ZFS?

-Jason


Re: [zfs-discuss] RAID10

2010-03-26 Thread Slack-Moehrle

And I should mention that I have a boot drive (500gb SATA) so I don't have to 
consider booting from the RAID, I just want to use it for storage.

- Original Message -
From: Slack-Moehrle mailingli...@mailnewsrss.com
To: zfs-discuss zfs-discuss@opensolaris.org
Sent: Friday, March 26, 2010 11:39:35 AM
Subject: [zfs-discuss] RAID10

Hi All,

I am looking at ZFS and I get that they call it RAIDZ which is similar to RAID 
5, but what about RAID 10? Isn't a RAID 10 setup better for data protection?

So if I have 8 x 1.5tb drives, wouldn't I:

- mirror drive 1 and 5
- mirror drive 2 and 6
- mirror drive 3 and 7
- mirror drive 4 and 8

Then stripe 1,2,3,4

Then stripe 5,6,7,8

How does one do this with ZFS?

-Jason


Re: [zfs-discuss] RAID10

2010-03-26 Thread Carson Gaspar

Slack-Moehrle wrote:

And I should mention that I have a boot drive (500gb SATA) so I dont have to 
consider booting from the RAID, I just want to use it for storage.

- Original Message -
From: Slack-Moehrle mailingli...@mailnewsrss.com
To: zfs-discuss zfs-discuss@opensolaris.org
Sent: Friday, March 26, 2010 11:39:35 AM
Subject: [zfs-discuss] RAID10

Hi All,

I am looking at ZFS and I get that they call it RAIDZ which is similar to RAID 
5, but what about RAID 10? Isn't a RAID 10 setup better for data protection?

So if I have 8 x 1.5tb drives, wouldn't I:

- mirror drive 1 and 5
- mirror drive 2 and 6
- mirror drive 3 and 7
- mirror drive 4 and 8

Then stripe 1,2,3,4

Then stripe 5,6,7,8

How does one do this with ZFS?


You don't, because your description is insane. You mirror each pair, 
then stripe each mirror, not the drives in the mirror (not really a 
stripe in ZFS, but...)


zpool create mypool mirror 1 5 mirror 2 6 mirror 3 7 mirror 4 8

Replacing the numbers with the actual device names.


Re: [zfs-discuss] RAID10

2010-03-26 Thread Tim Cook
On Fri, Mar 26, 2010 at 1:39 PM, Slack-Moehrle mailingli...@mailnewsrss.com
 wrote:

 Hi All,

 I am looking at ZFS and I get that they call it RAIDZ which is similar to
 RAID 5, but what about RAID 10? Isn't a RAID 10 setup better for data
 protection?

 So if I have 8 x 1.5tb drives, wouldn't I:

 - mirror drive 1 and 5
 - mirror drive 2 and 6
 - mirror drive 3 and 7
 - mirror drive 4 and 8

 Then stripe 1,2,3,4

 Then stripe 5,6,7,8

 How does one do this with ZFS?

 -Jason


Just keep adding mirrored vdevs to the pool.  It isn't exactly like a
raid-10, as zfs doesn't do a typical raid-0 stripe, per se.  It is the same
basic concept as raid-10 though.  You would be striping across all of the
mirrored sets, not just a subset.

So you would do:
zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5
drive6 mirror drive7 drive8

See here:
http://www.stringliterals.com/?p=132
--Tim


Re: [zfs-discuss] RAID10

2010-03-26 Thread Rich Teer
On Fri, 26 Mar 2010, Slack-Moehrle wrote:

 Hi All,
 
 I am looking at ZFS and I get that they call it RAIDZ which is
 similar to RAID 5, but what about RAID 10? Isn't a RAID 10 setup better
 for data protection?

I think so--at the expense of extra disks for a given amount of available
storage.

 So if I have 8 x 1.5tb drives, wouldn't I:
 
 - mirror drive 1 and 5
 - mirror drive 2 and 6
 - mirror drive 3 and 7
 - mirror drive 4 and 8
 
 How does one do this with ZFS?

Try this:

zpool create dpool mirror drive1 drive5 mirror drive2 drive6 \
mirror drive3 drive7 mirror drive4 drive8

Isn't ZFS great?!

-- 
Rich Teer, Publisher
Vinylphile Magazine

www.vinylphilemag.com


Re: [zfs-discuss] RAID10

2010-03-26 Thread Freddie Cash
On Fri, Mar 26, 2010 at 11:39 AM, Slack-Moehrle 
mailingli...@mailnewsrss.com wrote:

 I am looking at ZFS and I get that they call it RAIDZ which is similar to
 RAID 5, but what about RAID 10? Isn't a RAID 10 setup better for data
 protection?

 So if I have 8 x 1.5tb drives, wouldn't I:

 - mirror drive 1 and 5
 - mirror drive 2 and 6
 - mirror drive 3 and 7
 - mirror drive 4 and 8

 Then stripe 1,2,3,4

 Then stripe 5,6,7,8

 How does one do this with ZFS?

 Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member
vdevs, which can be either mirrors (essentially RAID10), or raidz1
(essentially RAID50), or raidz2 (essentially RAID60), or raidz3 (essentially
RAID70???).

A pool with a single mirror vdev is just a RAID1.  A pool with a single
raidz1 vdev is just a RAID5.  And so on.

But, as you add vdevs to a pool, it becomes a stripeset across all the
vdevs.
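
For example, growing such a pool later is just a matter of adding another
mirror vdev, after which writes are spread across all of them (device names
are placeholders):

  zpool add tank mirror drive9 drive10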

-- 
Freddie Cash
fjwc...@gmail.com


Re: [zfs-discuss] RAID10

2010-03-26 Thread Slack-Moehrle


So if I have 8 x 1.5tb drives, wouldn't I: 

- mirror drive 1 and 5 
- mirror drive 2 and 6 
- mirror drive 3 and 7 
- mirror drive 4 and 8 

Then stripe 1,2,3,4 

Then stripe 5,6,7,8 

How does one do this with ZFS? 

So you would do: 
zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5 
drive6 mirror drive7 drive8 

See here: 
http://www.stringliterals.com/?p=132 

So, effectively mirroring the drives, but the pool that is created is one giant 
pool of all of the mirrors?

I looked at: http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID-Z and 
they had a brief description of RAIDZ2.

Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 8 
x 1.5tb?

I apologize for seeming dense, I just am confused about non-standard raid 
setups, they seem tricky.

-Jason


Re: [zfs-discuss] RAID10

2010-03-26 Thread Svein Skogen

On 26.03.2010 20:04, Slack-Moehrle wrote:
 
 
 So if I have 8 x 1.5tb drives, wouldn't I: 
 
 - mirror drive 1 and 5 
 - mirror drive 2 and 6 
 - mirror drive 3 and 7 
 - mirror drive 4 and 8 
 
 Then stripe 1,2,3,4 
 
 Then stripe 5,6,7,8 
 
 How does one do this with ZFS? 
 
 So you would do: 
 zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5 
 drive6 mirror drive7 drive8 
 
 See here: 
 http://www.stringliterals.com/?p=132 
 
 So, effectively mirroring the drives, but the pool that is created is one 
 giant pool of all of the mirrors?
 
 I looked at: http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID-Z and 
 they had a brief description of RAIDZ2.
 
 Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 
 8 x 1.5tb?
 
 I apologize for seeming dense, I just am confused about non-standard raid 
 setups, they seem tricky.

raidz eats one disk. Like RAID5
raidz2 digests another one. Like RAID6
raidz3 yet another one. Like ... h...

//Svein



Re: [zfs-discuss] RAID10

2010-03-26 Thread Matt Cowger
RAIDZ = RAID5, so lose 1 drive (1.5TB)
RAIDZ2 = RAID6, so lose 2 drives (3TB)
RAIDZ3 = RAID7(?), so lose 3 drives (4.5TB).

What you lose in useable space, you gain in redundancy.

-m

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Slack-Moehrle
Sent: Friday, March 26, 2010 12:04 PM
To: Tim Cook
Cc: zfs-discuss
Subject: Re: [zfs-discuss] RAID10



So if I have 8 x 1.5tb drives, wouldn't I: 

- mirror drive 1 and 5 
- mirror drive 2 and 6 
- mirror drive 3 and 7 
- mirror drive 4 and 8 

Then stripe 1,2,3,4 

Then stripe 5,6,7,8 

How does one do this with ZFS? 

So you would do: 
zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5 
drive6 mirror drive7 drive8 

See here: 
http://www.stringliterals.com/?p=132 

So, effectively mirroring the drives, but the pool that is created is one giant 
pool of all of the mirrors?

I looked at: http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID-Z and 
they had a brief description of RAIDZ2.

Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 8 
x 1.5tb?

I apologize for seeming dense, I just am confused about non-standard raid 
setups, they seem tricky.

-Jason


Re: [zfs-discuss] RAID10

2010-03-26 Thread Slack-Moehrle


 Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 
 8 x 1.5tb?
 
 I apologize for seeming dense, I just am confused about non-standard raid 
 setups, they seem tricky.

 raidz eats one disk. Like RAID5
 raidz2 digests another one. Like RAID6
 raidz3 yet another one. Like ... h...

So: 

RAIDZ would be 8 x 1.5tb = 12tb - 1.5tb = 10.5tb

RAIDZ2 would be 8 x 1.5tb = 12tb - 3.0tb = 9.0tb

RAIDZ3 would be 8 x 1.5tb = 12tb - 4.5tb = 7.5tb

But not really that usable space for each since the mirroring?

So do you not mirror drives with RAIDZ2 or RAIDZ3, because you would have 
nothing for space left?

-Jason


Re: [zfs-discuss] RAID10

2010-03-26 Thread Bob Friesenhahn

On Fri, 26 Mar 2010, Freddie Cash wrote:


Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member vdevs, 
which can be


Except that ZFS does not support RAID0.  I don't know why you guys 
persist with these absurd claims and continue to use wrong and 
misleading terminology.


What you guys are effectively doing is calling a mule a horse 
because it has four legs, two ears, and a tail, like a donkey.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] RAID10

2010-03-26 Thread Malte Schirmacher
Bob Friesenhahn wrote:

 Except that ZFS does not support RAID0.  I don't know why you guys
 persist with these absurd claims and continue to use wrong and
 misleading terminology.

What is the main difference between RAID0 and striping (what zfs really
does, i guess?)



Re: [zfs-discuss] RAID10

2010-03-26 Thread Ray Van Dolson
On Fri, Mar 26, 2010 at 12:25:54PM -0700, Malte Schirmacher wrote:
 Bob Friesenhahn wrote:
 
  Except that ZFS does not support RAID0.  I don't know why you guys
  persist with these absurd claims and continue to use wrong and
  misleading terminology.
 
 What is the main difference between RAID0 and striping (what zfs really
 does, i guess?)

There's a difference in implementation, but, for your purposes of
describing how the vdevs stripe, I'd say it's fair enough. :)

Some folks are just a little sensitive about ZFS being compared to
standard RAID is all, so watch your P's and Q's around here! ;)

Ray


Re: [zfs-discuss] RAID10

2010-03-26 Thread Freddie Cash
On Fri, Mar 26, 2010 at 12:21 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Fri, 26 Mar 2010, Freddie Cash wrote:


 Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member
 vdevs, which can be


 Except that ZFS does not support RAID0.


Wow, what part of "overly simplified" did you not read, see, understand, or
parse?  You even quoted it.

 I don't know why you guys persist with these absurd claims and continue to
 use wrong and misleading terminology.


So, mister "I'm so much better than everyone because I know that ZFS doesn't
use RAID0 but don't provide any actual useful info":

How would you describe how a ZFS pool works for striping data across
multiple vdevs, in such a way that someone coming from a RAID background can
understand, without using fancy-shmancy terms that no one else has ever
heard?  (Especially considering how confused the OP was as to how even a
RAID10 array works.)

Where I come from, you start with what the person knows (RAID terminology),
find ways to relate that to the new knowledge domain (basically a RAID0
stripeset), and then later build on that to explain all the fancy-shmancy
terminology and nitty-gritty of how it works.

We didn't all pop into the work full of all the knowledge of everything.

What you guys are effectively doing is calling a mule a horse because it
 has four legs, two ears, and a tail, like a donkey.


For someone who's only ever seen, dealt with, and used horses, then (overly
simplified), a mule is like a horse.  Just as it is like a donkey.  From
there, you can go on to explain how a mule actually came to be, and what
makes it different from a horse and a donkey.  And what makes it better than
either.

-- 
Freddie Cash
fjwc...@gmail.com


Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

2010-03-26 Thread Richard Elling
On Mar 26, 2010, at 8:58 AM, Svein Skogen wrote:

 On 26.03.2010 16:55, Bottone, Frank wrote:
 Does zfs handle 4kb sectors properly or does it always assume 512b sectors?
 
 
 
 If it does, we could manually create a slice properly aligned and set
 zfs to use it...
 
 A real simple patch would be to attempt alignment with 4096 every time
 (since 4096 is a multiple of 512 there really wouldn't be a performance
 penalty here). This would mean that things are optimal on _ALL_ disks.
 (and allow those of us using more advanced diskcontrollers to set the
 strip-size (strip size, not stripe size) to 4K as well)

Two thoughts:

1. the performance impact may not be very great, but I'm sure there are
exceptions in the consumer-grade market

2. people will be disappointed with the reduced compressibility of their data

 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 



Re: [zfs-discuss] ZFS size calculation. Again!

2010-03-26 Thread Richard Elling
On Mar 25, 2010, at 7:25 PM, antst wrote:

 I have two storages, both on snv133. Both filled with 1TB drives.
 1) stripe over two raidz vdevs, 7 disks in each. In total available size is 
 (7-1)*2=12TB
 2) zfs pool over HW raid, also 12TB.
 
 Both storages keep the same data with minor differences. The first pool keeps 24 
 hourly snapshots + 7 daily snapshots. The second one (backup) keeps only daily 
 snapshots, but for a longer period (2 weeks for now).

Good idea :-)

 But they report strangely different sizes which can't be explained by 
 differences in snapshots, I believe.
 
 1) 
 # zpool list export
 NAME     SIZE  ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
 export  12.6T  3.80T  8.82T   30%  1.00x  ONLINE  -
 
 # zfs list export
 NAME USED  AVAIL  REFER  MOUNTPOINT
 export  3.24T  7.35T  40.9K  /export
 
 2) 
 # zpool list export
 NAME     SIZE  ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
 export  12.6T  3.19T  9.44T   25%  1.00x  ONLINE  -
 
 # zfs list export
 NAME USED  AVAIL  REFER  MOUNTPOINT
 export  3.19T  9.24T25K  /export
 
 As we see, both pools have the same size according to zpool.

Correct.

 As we see, for second storage size reported by zpool list and sum of used 
 and avail in zfs list are in agreement.

Correct.

 But for first one, 2TB is missing somehow, sum of USED and avail is 10.6 TB.

Correct.  To understand this, please see the ZFS FAQ:
http://hub.opensolaris.org/bin/view/Community+Group+zfs/faq#HWhydoesntthespacethatisreportedbythezpoollistcommandandthezfslistcommandmatch
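
As a rough illustration for the first pool (assuming roughly 0.9 TiB usable per
marketed 1 TB drive): zpool list counts all 14 disks of the two raidz vdevs,
parity included, 14 x 0.9 = ~12.6T, while zfs list counts only the 12 data
disks, 12 x 0.9 = ~10.8T, which is close to the 10.6T sum of USED and AVAIL;
the remainder goes to raidz overhead and reservations.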

[richard pauses to look in awe at the aforementioned URL...]
 -- richard

 Also, what makes me wonder a bit is that I would expect more space to be used 
 on the backup pool (more daily snapshots). The zfs list difference can be 
 explained if the amount taken by the hourly snapshots is bigger than the 
 amount taken by the extra 7 daily snapshots on the backup storage (the 
 difference is 50GB, which is still pretty big, taking into account that the 
 backup storage also holds an extra 10 gig of backup of rpool from the primary 
 storage). But there is no way for that explanation to be valid for the 
 difference in USED reported by zpool list: 600GB is much more than any 
 possible difference coming from storing different snapshots, because our guys 
 just don't produce that much data daily. I also tried to look at how much 
 space is referred to by the hourly snapshots - no way to be even close to 
 600GB.
 
 What's wrong there? My main concern, though, is the difference between the 
 zpool size and the sum of used+avail for zfs on the primary storage. 2TB is 
 2TB!








Re: [zfs-discuss] RAID10

2010-03-26 Thread David Dyer-Bennet

On Fri, March 26, 2010 14:21, Bob Friesenhahn wrote:
 On Fri, 26 Mar 2010, Freddie Cash wrote:

 Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member
 vdevs, which can be

 Except that ZFS does not support RAID0.  I don't know why you guys
 persist with these absurd claims and continue to use wrong and
 misleading terminology.

They're attempting to communicate with the OP, who made it pretty clear
that he was comfortable with traditional RAID terms, and trying to
understand ZFS.

 What you guys are effectively doing is calling a mule a horse
 because it has four legs, two ears, and a tail, like a donkey.

They're short-circuiting that discussion, and we can have it later if
necessary.  The differences  you're emphasizing are important for
implementation, and performance analysis, and even for designing the
system at some levels, but they're not important to the initial
understanding of the system.

The question was essentially "Wait, I don't see RAID 10 here, and that's
what I like.  How do I do that?"  I think the answer was responsive and
not misleading enough to be dangerous; the differences can be explicated
later.

YMMV :-)

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] RAID10

2010-03-26 Thread David Dyer-Bennet

On Fri, March 26, 2010 14:25, Malte Schirmacher wrote:
 Bob Friesenhahn wrote:

 Except that ZFS does not support RAID0.  I don't know why you guys
 persist with these absurd claims and continue to use wrong and
 misleading terminology.

 What is the main difference between RAID0 and striping (what zfs really
 does, i guess?)

RAID creates fixed, absolute, patterns of spreading blocks, bytes, and
bits around the various disks; ZFS does not, it makes on-the-fly decisions
about where things should go at some levels.  In RAID1, a block will go
the same physical place on each drive; in a ZFS mirror it won't, it'll
just go *somewhere* on each drive.

In the end, RAID produces a block device that you then run a filesystem
on, whereas ZFS includes the filesystem (and other things; including block
devices you can run other filesystems on).
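
As a small aside, the block devices bit is a zvol; a minimal sketch (names and
size are placeholders):

  zfs create -V 10G tank/vol1
  newfs /dev/zvol/rdsk/tank/vol1   # on Solaris the volume appears under /dev/zvol/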
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] zfs send/receive - actual performance

2010-03-26 Thread Richard Elling
On Mar 26, 2010, at 2:34 AM, Bruno Sousa wrote:
 Hi,
 
 The jumbo-frames in my case give me a boost of around 2 MB/s, so it's not 
 that much. 

That is about right.  IIRC, the theoretical max is about 4% improvement, for 
MTU of 8KB.

 Now i will play with link aggregation and see how it goes, and of course i'm 
 counting that incremental replication will be slower...but since the amount 
 of data would be much less probably it will still deliver a good performance.

Probably won't help at all because of the brain-dead way link aggregation has to
work.  See "Ordering of frames" at
http://en.wikipedia.org/wiki/Link_Aggregation_Control_Protocol#Link_Aggregation_Control_Protocol

If you see the workload on the wire go through regular patterns of fast/slow 
response
then there are some additional tricks that can be applied to increase the 
overall
throughput and smooth the jaggies. But that is fodder for another post...
You can measure this with iostat using samples  15 seconds or with tcpstat.
tcpstat is a handy DTrace script often located as /opt/DTT/Bin/tcpstat.d
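
Concretely, something along these lines (assuming the DTraceToolkit is installed
under /opt/DTT; adjust the path to wherever it lives on your box):

iostat -xn 5               # 5-second samples, short enough to show the fast/slow pattern
/opt/DTT/Bin/tcpstat.d     # TCP traffic statistics via DTrace (DTraceToolkit script)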
 -- richard

 And what a relief to know that i'm not alone when i say that storage 
 management is part science, part arts and part voodoo magic ;)
 
 Cheers,
 Bruno
 
 On 25-3-2010 23:22, Ian Collins wrote:
 On 03/26/10 10:00 AM, Bruno Sousa wrote:
 
 [Boy top-posting sure mucks up threads!]
 
 Hi,
 
 Indeed the 3 disks per vdev (raidz2) seems a bad idea...but it's the system 
 i have now.
 Regarding the performance...let's assume that a bonnie++ benchmark could go 
 to 200 mg/s in. The possibility of getting the same values (or near) in a 
 zfs send / zfs receive is just a matter of putting , let's say a 10gbE card 
 between both systems?
 
 Maybe, or a 2x1G LAG would me more cost effective (and easier to check!).  
 The only way to know for sure is to measure.  I managed to get slightly 
 better transfers by enabling jumbo frames.
 
 I have the impression that benchmarks are always synthetic, therefore 
 live/production environments behave quite differently.
 
 Very true, especially in the black arts of storage management!
 
 Again, it might be just me, but with 1gb link being able to replicate 2 
 servers with a average speed above 60 mb/s does seems quite good. However, 
 like i said i would like to know other results from other guys...
 
 As I said, the results are typical for a 1G link.  Don't forget you are 
 measuring full copies, incremental replications may well be significantly 
 slower.
 
 -- 
 Ian.
   
 
 
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hex dump diagrams?

2010-03-26 Thread Richard Elling
On Mar 25, 2010, at 2:45 PM, John Bonomi wrote:

 I'm sorry if this is not the appropriate place to ask, but I'm a student and 
 for an assignment I need to be able to show at the hex level how files and 
 their attributes are stored and referenced in ZFS. Are there any resources 
 available that will show me how this is done?

IMHO the best place to start with this level of analysis is the ZFS on-disk
specification doc:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf

It is getting long in the tooth and doesn't document recent features, but
it is fundamentally correct.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Eric Andersen
It depends a bit on how you set up the drives really.  You could make one raidz 
vdev of 8 drives, losing one of them for parity, or you could make two raidz 
vdevs of 4 drives each and lose two drives for parity (one for each vdev).  You 
could also do one raidz2 vdev of 8 drives and lose two drives for parity, or 
two raidz2 vdevs of 4 drives each and lose four drives for parity (2 for each 
raidz2 vdev).  That would give you a bit better redundancy than using 4 mirrors 
while giving you the same available storage space.  The list goes on and on.  
There are a lot of different configurations you could use with 8 drives, but 
keep in mind once you add a vdev to your pool, you can't remove it.
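
For concreteness, a couple of those 8-drive layouts written out as zpool commands
(the c1tXd0 names here are just placeholders for your disks; these are alternatives,
not steps):

# one 8-disk raidz2 vdev: 6 disks of usable space, any 2 disks can fail
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0

# two 4-disk raidz1 vdevs: 6 disks of usable space, 1 failure tolerated per vdev
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 raidz c1t4d0 c1t5d0 c1t6d0 c1t7d0

# four 2-disk mirrors: 4 disks of usable space, 1 failure tolerated per mirror
zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0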

Personally, I would not choose to create one vdev of 8 disks, but that's just 
me.  It is important to be aware that when and if you want to replace the 1.5TB 
disks with something bigger, you need to replace ALL the disks in the vdev to 
gain the extra space.  So, if you wanted to go from 1.5TB to 2TB disks down the 
road, and you set up one raidz of 8 drives, you need to replace all 8 drives 
before you gain the additional space.  If you do two raidz vdevs of 4 drives 
each, you need to replace 4 drives to gain additional space.  If you use 
mirrors, you need to replace 2 drives.  Or, you can add a new vdev of 2, 4, 8, 
or however many disks you want if you have the physical space to do so.

I believe you can mix and match mirror vdevs and raidz vdevs within a zpool, 
but I don't think it's recommended to do so.  The ZFS best practices guide has 
a lot of good information in it if you have not read it yet (google).

You might have less usable drive space using mirrors, but you will gain a bit 
of performance, and it's a bit easier to expand your zpool when the time comes. 
 A raidz (1,2,3) can give you more usable space, and can give you better or 
worse redundancy depending on how you set it up.  There is a lot to consider.  
I hope I didn't cloud things up for you any further or misinform you on 
something (I'm a newb too, so don't take my word alone on anything).  

Hell, if you wanted to, you could also do one 8-way mirror that would give you 
an ignorant amount of redundancy at the cost of 7 drives worth of usable space.

It all boils down to personal choice.  You have to determine how much usable 
space, redundancy, performance, and ease of replacing drives mean to you and go 
from there.  ZFS will do pretty much any configuration to suit your needs. 

eric
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Richard Elling
On Mar 26, 2010, at 4:46 AM, Edward Ned Harvey wrote:
 What does everyone thing about that? I bet it is not as mature as on
 OpenSolaris.
 
 mature is not the right term in this case.  FreeBSD has been around much
 longer than opensolaris, and it's equally if not more mature.

Bill Joy might take offense to this statement.  Both FreeBSD and Solaris trace
their roots to the work done at Berkeley 30 years ago. Both have evolved in
different ways at different rates. Since Solaris targets the enterprise market,
I will claim that Solaris is proven in that space. OpenSolaris is just one of 
the
next steps forward for Solaris.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hex dump diagrams?

2010-03-26 Thread m...@bruningsystems.com

Hi Richard,

Richard Elling wrote:

On Mar 25, 2010, at 2:45 PM, John Bonomi wrote:

  

I'm sorry if this is not the appropriate place to ask, but I'm a student and 
for an assignment I need to be able to show at the hex level how files and 
their attributes are stored and referenced in ZFS. Are there any resources 
available that will show me how this is done?



IMHO the best place to start with this level of analysis is the ZFS on-disk
specification doc:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf

It is getting long in the tooth and doesn't document recent features, but
it is fundamentally correct.
  

I completely agree with this, but good luck getting a hex dump from that
information.
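
For the hex dumps themselves, zdb is the usual starting point. A rough sketch,
assuming a pool named tank; options and output vary between builds, so check
zdb(1M) on your system:

zdb -dddd tank/somefs      # walk datasets/objects and print their block pointers
zdb -R tank 0:400000:200   # dump a raw block from vdev 0 (offset:size)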

max


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Bob Friesenhahn

On Fri, 26 Mar 2010, Malte Schirmacher wrote:


Bob Friesenhahn wrote:


Except that ZFS does not support RAID0.  I don't know why you guys
persist with these absurd claims and continue to use wrong and
misleading terminology.


What is the main difference between RAID0 and striping (what zfs really
does, i guess?)


Zfs only stripes within raidzN vdevs, and even then at the zfs record 
level and not using a RAID0 (fixed mapping on the LUN) approach.


RAID0 and striping are similar concepts.  When one stripes across 
an array of disks, one breaks up the written block (record), and 
writes parts of it across all of the disks in the stripe.  This is 
usually done to increase sequential read/write performance but may 
also be used to assist with error recovery (which zfs does take 
advantage of).


Zfs only writes whole records (e.g. 128K) to a vdev so that it does 
not stripe across vdevs.  Within a vdev, it may stripe.


The difference is pretty huge when one considers that zfs is able to 
support vdevs of different sizes and topologies, as well as ones added 
much more recently than when the pool was created.  RAID0 and striping 
can't do that.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Bob Friesenhahn

On Fri, 26 Mar 2010, David Dyer-Bennet wrote:


The question was essentially Wait, I don't see RAID 10 here, and that's
what I like.  How do I do that?  I think the answer was responsive and
not misleading enough to be dangerous; the differences can be explicated
later.


Most of us choose a pool design and then copy all of our data to it. 
If one does not understand how the pool works, then a poor design may 
be selected, which can be difficult to extricate from later.  That is 
why it is important to know that zfs writes full records to each vdev 
and does not stripe the blocks across vdevs as was suggested.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Bob Friesenhahn

On Fri, 26 Mar 2010, Freddie Cash wrote:


On Fri, Mar 26, 2010 at 12:21 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:
  On Fri, 26 Mar 2010, Freddie Cash wrote:

Overly-simplified, a ZFS pool is a RAID0 stripeset across all the 
member
vdevs, which can be

  Except that ZFS does not support RAID0.

Wow, what part of overly simplified did you not read, see, understand, or 
parse?  You even quoted
it.


Sorry to pick on your email in particular.  Everyone here should
consider it to be their personal duty to correct such statements.  The
distinctions may not seem important, but they are worth understanding
since they can matter quite a bit for pool performance.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS RaidZ to RaidZ2

2010-03-26 Thread Muhammed Syyid
Hi
I have a couple of questions
I currently have a 4-disk RaidZ1 setup and want to move to a RaidZ2:
4 x 2TB = RaidZ1 (tank)
My current plan is to set up
8 x 1.5TB in a RAIDZ2 and migrate the data from the tank vdev over.
What's the best way to accomplish this with minimal disruption?
I have seen the zfs send / receive commands, which seem to be what I should be
using?
The reason I'm not doing a simple copy is I have Xen Volumes as well which I'm 
not exactly sure how to copy over.
ZFS list (snipping the rpool) yields
tank
tank/vm
tank/vm/centos48
tank/vm/centos48/disk0
tank/vm/centos54
tank/vm/centos54/disk0

Basically I'm looking to replace a 4-disk raidz1 with an 8-disk raidz2 (from what
I understand I can't simply add 4 more disks to the existing raidz1 and convert
it to raidz2).
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Marc Nicholas
Richard,

My challenge to you is that at least three vendors that I know of built
their storage platforms on FreeBSD. One of them sells $4bn/year of
product - pretty sure that eclipses all (Open)Solaris-based storage ;)

-marc

On 3/26/10, Richard Elling richard.ell...@gmail.com wrote:
 On Mar 26, 2010, at 4:46 AM, Edward Ned Harvey wrote:
 What does everyone thing about that? I bet it is not as mature as on
 OpenSolaris.

 mature is not the right term in this case.  FreeBSD has been around much
 longer than opensolaris, and it's equally if not more mature.

 Bill Joy might take offense to this statement.  Both FreeBSD and Solaris
 trace
 their roots to the work done at Berkeley 30 years ago. Both have evolved in
 different ways at different rates. Since Solaris targets the enterprise
 market,
 I will claim that Solaris is proven in that space. OpenSolaris is just one
 of the
 next steps forward for Solaris.
  -- richard

 ZFS storage and performance consulting at http://www.RichardElling.com
 ZFS training on deduplication, NexentaStor, and NAS performance
 Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com





 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SSD As ARC

2010-03-26 Thread Muhammed Syyid
Hi
I'm planning on setting up two RaidZ2 vdevs in different pools for added
flexibility in removing / resizing (from what I understand, if they were in the
same pool I couldn't remove them at all). I also have an SSD drive that I was
going to use as cache (L2ARC). How do I set this up to have two L2ARCs off one
SSD (to service each pool)? Do I need to create two slices (50% of the SSD disk
space each) and assign one to each pool?
Also, I'm not expecting a lot of writes (primarily a file server), so I didn't
think a separate ZIL (log) device would be a worthwhile investment. Any advice
appreciated.
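
Roughly what that setup would look like, assuming the SSD is carved into two
slices with format (device/slice names here are hypothetical):

zpool add pool1 cache c9t0d0s0    # first slice becomes L2ARC for the first pool
zpool add pool2 cache c9t0d0s1    # second slice becomes L2ARC for the second pool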
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Svein Skogen

On 26.03.2010 23:25, Marc Nicholas wrote:
 Richard,
 
 My challenge to you is that at least three vedors that I know of built
 their storage platforms on FreeBSD. One of them sells $4bn/year of
 product - petty sure that eclipses all (Open)Solaris-based storage ;)

<sarcasm alert>
Butbutbutbut! Solaris is more enterprise focused!
</sarcasm alert>

Seriously. FreeBSD has a _VERY_ good track record (at all levels of
business). This is not an attempt at belittling Solaris, nor the effort
of Sun, but trying to claim that FreeBSD is not enterprise-ready seems silly.

//Svein
- -- 
- +---+---
  /\   |Svein Skogen   | sv...@d80.iso100.no
  \ /   |Solberg Østli 9| PGP Key:  0xE5E76831
   X|2020 Skedsmokorset | sv...@jernhuset.no
  / \   |Norway | PGP Key:  0xCE96CE13
|   | sv...@stillbilde.net
 ascii  |   | PGP Key:  0x58CD33B6
 ribbon |System Admin   | svein-listm...@stillbilde.net
Campaign|stillbilde.net | PGP Key:  0x22D494A4
+---+---
|msn messenger: | Mobile Phone: +47 907 03 575
|sv...@jernhuset.no | RIPE handle:SS16503-RIPE
- +---+---
 If you really are in a hurry, mail me at
   svein-mob...@stillbilde.net
 This mailbox goes directly to my cellphone and is checked
even when I'm not in front of my computer.
- 
 Picture Gallery:
  https://gallery.stillbilde.net/v/svein/
- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ to RaidZ2

2010-03-26 Thread Ian Collins

On 03/27/10 11:22 AM, Muhammed Syyid wrote:

Hi
I have a couple of questions
I currently have a 4disk RaidZ1 setup and want to move to a RaidZ2
4x2TB = RaidZ1 (tank)
My current plan is to setup
8x1.5TB  in a RAIDZ2 and migrate the data from the tank vdev over.
What's the best way to accomplish this with minimal disruption?
I have seen the zfs send / receive commands which seem to be what I should be 
using?
   


Yes, they are the only option if you wish to preserve your filesystem 
properties.  You will end up with a clone of your original pool's 
filesystems on the new pool.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ to RaidZ2

2010-03-26 Thread Richard Jahnel
zfs send s...@oldpool | zfs receive newpool
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Ian Collins

On 03/27/10 11:32 AM, Svein Skogen wrote:

On 26.03.2010 23:25, Marc Nicholas wrote:
   

Richard,

My challenge to you is that at least three vedors that I know of built
their storage platforms on FreeBSD. One of them sells $4bn/year of
product - petty sure that eclipses all (Open)Solaris-based storage ;)
 

sarcasm alert
Butbutbutbut! Solaris is more enterprise focused!
/sarcasm alert

Seriously. FreeBSD has a _VERY_ good track record (in all levels of
busness). This is not an attempt at belittling Solaris, nor the effort
of Sun, but trying to claim FreeBSD not being enterprise-ready seems silly.

   

Which is why no one on this thread has.


//Svein
- --


Please use a standard signature delimiter ("-- ") if you are going to tag
on so much ASCII art and unnecessary PGP baggage!


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD As ARC

2010-03-26 Thread David Dyer-Bennet

On Fri, March 26, 2010 17:26, Muhammed Syyid wrote:
 Hi
 I'm planning on setting up two RaidZ2 volumes in different pools for added
 flexibility in removing / resizing (from what I understand if they were in
 the same pool I can't remove them at all).

What do you mean by "remove"?

You cannot remove a vdev from a pool.  You can however destroy the entire
pool, thus essentially removing the vdev.

You CAN replace the drives in a vdev, one at a time, with larger drives,
and when you are done the extra space will be available to the pool, so
for resizing purposes you can essentially replace a vdev, though not
remove it or alter the number of drives or the type.
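
Roughly, the replace-in-place upgrade looks like this (disk names hypothetical;
autoexpand exists on recent builds, otherwise export/import the pool after the
last replace to pick up the new size):

zpool set autoexpand=on tank        # let the pool grow once every disk in the vdev is bigger
zpool replace tank c1t2d0 c1t9d0    # repeat for each disk, waiting for resilver to finish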

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ to RaidZ2

2010-03-26 Thread Ian Collins

On 03/27/10 11:33 AM, Richard Jahnel wrote:

zfs send s...@oldpool | zfs receive newpool
   

In the OP's case, a recursive send is in order.
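
Something along these lines, with placeholder pool/snapshot names; -R picks up
the descendant filesystems, zvols and their properties, and -F on the receive
side overwrites the (empty) target pool:

zfs snapshot -r tank@migrate
zfs send -R tank@migrate | zfs receive -F newpool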

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Richard Elling
On Mar 26, 2010, at 3:25 PM, Marc Nicholas wrote:

 Richard,
 
 My challenge to you is that at least three vedors that I know of built
 their storage platforms on FreeBSD. One of them sells $4bn/year of
 product - petty sure that eclipses all (Open)Solaris-based storage ;)

FreeBSD 8 or  FreeBSD 7.3?  If neither, then the point is moot.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] *SPAM* Re: zfs send/receive - actual performance

2010-03-26 Thread Ian Collins

On 03/27/10 09:39 AM, Richard Elling wrote:

On Mar 26, 2010, at 2:34 AM, Bruno Sousa wrote:
   

Hi,

The jumbo-frames in my case give me a boost of around 2 mb/s, so it's not that 
much.
 

That is about right.  IIRC, the theoretical max is about 4% improvement, for 
MTU of 8KB.

   

Now i will play with link aggregation and see how it goes, and of course i'm 
counting that incremental replication will be slower...but since the amount of 
data would be much less probably it will still deliver a good performance.
 

Probably won't help at all because of the brain dead way link aggregation has to
work.  See Ordering of frames at
http://en.wikipedia.org/wiki/Link_Aggregation_Control_Protocol#Link_Aggregation_Control_Protocol

   
Arse, thanks for reminding me Richard! A single stream will only use one 
path in a LAG.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Slack-Moehrle

OK, so I made progress today. FreeBSD sees all of my drives, and ZFS is acting
correctly.

Now for my confusion.

RAIDz3

# zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 da6 da7
Gives: 'raidz3': no such GEOM provider

I am looking at the best practices guide and I am confused about adding a hot
spare. Won't that happen with the above command, or do I really just zpool create
datastore raidz3 da0 da1 da2 da3 da4 da5 and then issue the hot-spare command
twice for da6 and da7?

-Jason

- Original Message -
From: Slack-Moehrle mailingli...@mailnewsrss.com
To: zfs-discuss@opensolaris.org
Sent: Friday, March 26, 2010 12:13:58 PM
Subject: Re: [zfs-discuss] RAID10



 Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 
 8 x 1.5tb?
 
 I apologize for seeming dense, I just am confused about non-stardard raid 
 setups, they seem tricky.

 raidz eats one disk. Like RAID5
 raidz2 digests another one. Like RAID6
 raidz3 yet another one. Like ... h...

So: 

RAIDZ would be 8 x 1.5tb = 12tb - 1.5tb = 10.5tb

RAIDZ2 would be 8 x 1.5tb = 12tb - 3.0tb = 9.0tb

RAIDZ3 would be 8 x 1.5tb = 12tb - 4.5tb = 7.5tb

But not really that usable space for each since the mirroring?

So do you not mirror drives with RAIDZ2 or RAIDZ3 because you would have 
nothing for space left

-Jason
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Tim Cook
On Fri, Mar 26, 2010 at 6:29 PM, Slack-Moehrle mailingli...@mailnewsrss.com
 wrote:


 OK, so I made progress today. FreeBSD see's all of my drives, ZFS is acting
 correct.

 Now for me confusion.

 RAIDz3

 # zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 da6 da7
 Gives: 'raidz3' no such GEOM providor

 # I am looking at the best practices guide and I am confused about adding a
 hot spare. Wont that happen with the above command or do I really just zpool
 create datastore raidz3 da0 da1 da2 da3 da4 da5 and then issue the hotspare
 command twice for da6 and da7?

 -Jason

 - Original Message -
 From: Slack-Moehrle mailingli...@mailnewsrss.com
 To: zfs-discuss@opensolaris.org
 Sent: Friday, March 26, 2010 12:13:58 PM
 Subject: Re: [zfs-discuss] RAID10



  Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3?
 With 8 x 1.5tb?

  I apologize for seeming dense, I just am confused about non-stardard
 raid setups, they seem tricky.

  raidz eats one disk. Like RAID5
  raidz2 digests another one. Like RAID6
  raidz3 yet another one. Like ... h...

 So:

 RAIDZ would be 8 x 1.5tb = 12tb - 1.5tb = 10.5tb

 RAIDZ2 would be 8 x 1.5tb = 12tb - 3.0tb = 9.0tb

 RAIDZ3 would be 8 x 1.5tb = 12tb - 4.5tb = 7.5tb

 But not really that usable space for each since the mirroring?

 So do you not mirror drives with RAIDZ2 or RAIDZ3 because you would have
 nothing for space left

 -Jason



Triple parity (raidz3) did not get added until pool version 17.  The ZFS in
FreeBSD cannot do raidz3 yet.
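
On that release, a raidz2 plus hot spares is probably the closest thing to what
you're after; the spare syntax you asked about can go on the create line or be
added later (reusing your da0-da7 names):

zpool create datastore raidz2 da0 da1 da2 da3 da4 da5 spare da6 da7
zpool add datastore spare da6 da7    # or add spares to an already-created pool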

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Tim Cook
On Fri, Mar 26, 2010 at 5:42 PM, Richard Elling richard.ell...@gmail.comwrote:

 On Mar 26, 2010, at 3:25 PM, Marc Nicholas wrote:

  Richard,
 
  My challenge to you is that at least three vedors that I know of built
  their storage platforms on FreeBSD. One of them sells $4bn/year of
  product - petty sure that eclipses all (Open)Solaris-based storage ;)

 FreeBSD 8 or  FreeBSD 7.3?  If neither, then the point is moot.
  -- richard

 ZFS storage and performance consulting at http://www.RichardElling.com
 ZFS training on deduplication, NexentaStor, and NAS performance
 Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com


Well, that depends on exactly what you mean.  There are several that are
actively contributing and using code from both.  "Built on" is all relative.
Given the recent SMP improvements from all of the major players using BSD, if
you're talking kernel code, I would say every single one of them has pulled
code from the 7-branch, and likely the 8-branch as well.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Victor Latushkin

On Mar 26, 2010, at 23:37, David Dyer-Bennet d...@dd-b.net wrote:



On Fri, March 26, 2010 14:25, Malte Schirmacher wrote:

Bob Friesenhahn wrote:


Except that ZFS does not support RAID0.  I don't know why you guys
persist with these absurd claims and continue to use wrong and
misleading terminology.


What is the main difference between RAID0 and striping (what zfs  
really

does, i guess?)


RAID creates fixed, absolute, patterns of spreading blocks, bytes, and
bits around the various disks; ZFS does not, it makes on-the-fly  
decisions
about where things should go at some levels.  In RAID1, a block will  
go

the same physical place on each drive; in a ZFS mirror it won't, it'll
just go *somewhere* on each drive.


This is not correct. In a ZFS mirror a block will go to the same offset
within the data area on both submirrors.


But if you set up your mirrored slices starting at different offsets  
you can arrange for blocks on submirrors to have different physical  
offsets ;-)




In the end, RAID produces a block device that you then run a  
filesystem
on, whereas ZFS includes the filesystem (and other things; including  
block

devices you can run other filesystems on).
--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

2010-03-26 Thread Darren Mackay
For the time being, the EARS series of drives actually present 512-byte sectors
to the OS through emulation in firmware.

The drive I tested was a WD20EARS (2TB WD Caviar Green Advanced Format drive):

MDL: WD20EARS-00S81
DATE: 29 DEC 2009
DCM: HBRNHT2BB
DCX: 6019S1W87
LBA: 3907029168

The LBA count above is key - this is the number of sectors presented by the drive's
firmware to the host OS (3907029168 x 512 bytes = 2 TB). Combinations of jumpers and
running the WD alignment utility only appear to reorganise how the ECC is stored in
4k blocks physically on disk, but the drive still presents each 4K physical disk
block as 8 x 512-byte logical blocks to the host.
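
A quick way to confirm the emulated sector size from the OS side (disk/slice name
hypothetical) is to look at the label summary:

prtvtoc /dev/rdsk/c8t1d0s2 | head    # the header reports the 512 bytes/sector the firmware presents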

I have logged a support request with WD to see if they may be releasing firmware
that will present the 4k blocks natively. As an individual user, I actually doubt
that WD will ever respond. I can only hope that quite a few other people (hundreds
or thousands) also log similar requests and that WD may release appropriate firmware.

Would be grateful to hear of any others and their testing experiences with 
other series of advanced format drives from WD.

The drive works perfectly on the 64-bit kernel, but not on 32-bit OpenSolaris
kernels. I purchased the drive just to test on the 32-bit kernel - mainly because
there are quite a lot of SOHO NAS devices that may be able to use our Velitium
Embedded Kit for OpenSolaris with drives larger than 1TB.

It would be nice if the 32-bit OpenSolaris kernel supported 48-bit LBA (similar to
Linux; not sure if 32-bit BSD supports 48-bit LBA) - then the drive would probably
work. Perhaps later in the year we will have time to work on a patch to support
48-bit LBA on the 32-bit OpenSolaris kernels...

Darren Mackay
http://www.sikkra.com
http://sourceforge.net/projects/velitium/
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Mixed ZFS vdev in same pool.

2010-03-26 Thread Justin
I have a question about using mixed vdevs in the same zpool and what the community
opinion is on the matter.  Here is my setup:

I have four 1TB drives and two 500GB drives.  When I first set up ZFS I was under
the assumption that it does not really care how you add devices to the pool and
assumes you are thinking things through.  But when I tried to create a pool (called
group) with four 1TB disks in a raidz and two 500GB disks in a mirror in the same
pool, ZFS complained and said that if I wanted to do it I had to add -f (which I
assume stands for force).  So was ZFS attempting to stop me from doing something
generally considered bad?
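
For reference, the create command in question looks roughly like this (device
names taken from the zpool status output below):

zpool create -f group raidz c7t0d0 c7t1d0 c8t0d0 c8t1d0 mirror c10d0 c10d1
# without -f, zpool refuses and warns about the mismatched replication levels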

Some other questions I have, lets assume that this setup isn't that bad (or it 
is that bad and these questions will be why):

If one 500GB disk (c10dX) dies in the mirror and I choose not to replace it, would
I be able to migrate the files that are on the surviving half of the mirror over to
the drives in the raidz configuration, assuming there is space?  Would ZFS inform
me which files are affected, like it does in other situations?

In this configuration, how does Solaris/ZFS determine which vdev to place the
current write operation's worth of data into?
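
One way to watch where writes actually land is per-vdev iostat while the pool is busy:

zpool iostat -v group 5    # per-vdev read/write ops and bandwidth every 5 seconds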

Are there any situations where data would, for some reason, not be protected
against a single disk failure?

Would this configuration survive a two-disk failure if the disks are in
separate vdevs?


jsm...@corax:~# zpool status group
  pool: group
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        group       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c8t0d0  ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c10d0   ONLINE       0     0     0
            c10d1   ONLINE       0     0     0

errors: No known data errors
jsm...@corax:~# zfs list group
NAME    USED  AVAIL  REFER  MOUNTPOINT
group  94.4K  3.12T  23.7K  /group


This isn't for a production environment in some datacenter but nevertheless I 
would like to make the data as reasonably secure as possible while maximizing 
total storage space.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss