Re: [zfs-discuss] Best practice for moving FS between pool on same machine?

2007-06-20 Thread Constantin Gonzalez
Hi Chris,

 What is the best (meaning fastest) way to move a large file system 
 from one pool to another pool on the same machine.  I have a machine
 with two pools.  One pool currently has all my data (4 filesystems), but it's
 misconfigured. Another pool is configured correctly, and I want to move the 
 file systems to the new pool.  Should I use 'rsync' or 'zfs send'?

zfs send/receive is the fastest and most efficient way.

I've used it multiple times on my home server until I had my configuration
right :).

 What happens is I forgot I couldn't incrementally add raid devices.  I want
 to end up with two raidz(x4) vdevs in the same pool.  Here's what I have now:

For this reason, I decided to go with mirrors. Yes, they use more raw storage
space, but they are also much more flexible to expand. Just add two disks when
the pool is full and you're done.
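
For example, assuming a pool called "tank" and made-up device names:

  # when the pool fills up, grow it by another mirrored pair
  zpool add tank mirror c2t0d0 c2t1d0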

If you have a lot of disks or can afford to add 4-5 disks at a time, then
RAID-Z may be just as easy to do, but remember that two-disk failures in RAID-5
variants can be quite common, so you may want RAID-Z2 instead.

 1. move data to dbxpool2
 2. remount using dbxpool2
 3. destroy dbxpool1
 4. create new proper raidz vdev inside dbxpool2 using devices from dbxpool1

Add:

0. Snapshot data in dbxpool1 so you can use zfs send/receive

Then the above should work fine.

 I'm constrained by trying to minimize the downtime for the group
 of people using this as their file server.  So I ended up with
 an ad-hoc assignment of devices.  I'm not worried about
 optimizing my controller traffic at the moment.

OK. If you really want to be thorough, I'd recommend:

0. Run a backup, just in case. It never hurts.
1. Do a snapshot of dbxpool1
2. zfs send/receive dbxpool1 -> dbxpool2
   (This happens while users are still using dbxpool1, so no downtime).
3. Unmount dbxpool1
4. Do a second snapshot of dbxpool1
5. Do an incremental zfs send/receive of dbxpool1 -> dbxpool2.
   (This should take only a small amount of time)
6. Mount dbxpool2 where dbxpool1 used to be.
7. Check everything is fine with the new mounted pool.
8. Destroy dbxpool1
9. Use disks from dbxpool1 to expand dbxpool2 (be careful :) ).
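
To illustrate, here is a rough sketch of the above as commands, assuming a
single file system dbxpool1/data and a made-up mount point /export/dbx
(adjust names and paths to your real layout, and repeat per file system):

  # 1.-2. snapshot and copy while users keep working
  zfs snapshot dbxpool1/data@migrate1
  zfs send dbxpool1/data@migrate1 | zfs receive dbxpool2/data

  # 3.-5. stop access, snapshot again, send only the differences
  zfs unmount dbxpool1/data
  zfs snapshot dbxpool1/data@migrate2
  zfs send -i migrate1 dbxpool1/data@migrate2 | zfs receive dbxpool2/data
  # (if the copy on dbxpool2 was touched in between, zfs receive -F
  #  rolls it back to the first snapshot before applying the increment)

  # 6. mount the copy where the old file system used to live
  zfs set mountpoint=/export/dbx dbxpool2/data

  # 8.-9. once everything checks out
  zpool destroy dbxpool1
  zpool add dbxpool2 raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0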

You might want to exercise the above steps on an extra spare disk with
two pools just to gain some confidence before doing it in production.

I have a script that automates steps 1-6 and is looking for beta
testers. If you're interested, let me know.

Hope this helps,
   Constantin

-- 
Constantin Gonzalez                          Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Constantin Gonzalez
Hi,

 I'm quite interested in ZFS, like everybody else I suppose, and am about
 to install FBSD with ZFS.

welcome to ZFS!

 Anyway, back to business :)
 I have a whole bunch of different sized disks/speeds. E.g. 3 300GB disks
 @ 40MB/s, a 320GB disk @ 60MB/s, 3 120GB disks @ 50MB/s and so on.
 
 Raid-Z and ZFS claim to be uber scalable and all that, but would it
 'just work' with a setup like that too?

Yes. If you dump a set of variable-size disks into a mirror or RAID-Z
configuration, each disk will only contribute as much as the smallest
disk's size. The pool will then grow when you exchange the smaller disks
for larger ones.

I used to run a ZFS pool on 1x250GB, 1x200GB, 1x85 GB and 1x80 GB the following
way:

- Set up an 80 GB slice on all 4 disks and make a 4 disk RAID-Z vdev
- Set up a 5 GB slice on the 250, 200 and 85 GB disks and make a 3 disk RAID-Z
- Set up a 115GB slice on the 200 and the 250 GB disk and make a 2 disk mirror.
- Concatenate all 3 vdevs into one pool. (You need zpool add -f for that).

Not something to be done on a professional production system, but it worked
for my home setup just fine. The remaining 50GB from the 250GB drive then
went into a scratch pool.

Kinda like playing Tetris with RAID-Z...
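
As a rough sketch of how that was put together (slice and device names below
are made up; the real ones depend on your controller layout):

  # 80 GB slice (s0) on all four disks -> 4-disk RAID-Z vdev
  zpool create tank raidz c1t0d0s0 c1t1d0s0 c1t2d0s0 c1t3d0s0

  # 5 GB slice (s1) on the three larger disks -> second RAID-Z vdev
  zpool add -f tank raidz c1t0d0s1 c1t1d0s1 c1t2d0s1

  # 115 GB slice (s3) on the two largest disks -> mirrored vdev
  zpool add -f tank mirror c1t0d0s3 c1t1d0s3

  # leftover ~50 GB on the 250 GB drive -> separate scratch pool
  zpool create scratch c1t0d0s4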

Later, I decided that using just paired disks as mirrors is really more
flexible and easier to expand, since disk space is cheap.

Hope this helps,
   Constantin

-- 
Constantin Gonzalez                          Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is this storage model correct?

2007-06-20 Thread Mario Goebbels
 I had the same question last week and decided to take a similar approach.
 Instead of a giant raidz of 6 disks, I created 2 raidz's of 3 disks
 each. So when I want to add more storage, I just add 3 more disks.

Even if you had created a giant 6-disk RAID-Z, nothing would have
prevented you from adding a second top-level vdev representing a 3-disk
RAID-Z to the existing 6-disk one, apart from a formal warning requiring
the -f parameter on addition.
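
In other words (disk names here are placeholders), something like this would
have gone through:

  # the existing pool with one 6-disk RAID-Z top-level vdev
  zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

  # later: add a second top-level vdev, a 3-disk RAID-Z; -f overrides
  # the warning about the mismatched vdev widths
  zpool add -f tank raidz c2t0d0 c2t1d0 c2t2d0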

-mg


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Roshan Perera


 But Roshan, if your pool is not replicated from ZFS'
 point of view, then all the multipathing and raid
 controller backup in the world will not make a difference.

James, I agree from the ZFS point of view. However, from the EMC or the customer
point of view, they want to do the replication at the EMC level and not from
ZFS. By replicating at the ZFS level they will lose some storage and it doubles
the replication. It's just that the customer is used to working with Veritas and UFS
and they don't want to change their habits. I just have to convince the
customer to use ZFS replication.

Thanks again


 
 
 
 James C. McPherson
 --
 Solaris kernel software engineer
 Sun Microsystems
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Slow write speed to ZFS pool (via NFS)

2007-06-20 Thread Mario Goebbels
 Correction: 
 
 SATA Controller is a Silicon Image 3114, not a 3112.

Do these slow speeds only appear when writing via NFS or generally in
all scenarios? Just asking, because Solaris' ata driver doesn't
initialize settings like block mode, prefetch and such on IDE/SATA
drives (that is if ata applies here with that chipset).

-mg


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Z-Raid performance with Random reads/writes

2007-06-20 Thread Mario Goebbels
 A 6 disk raidz set is not optimal for random reads, since each disk in 
 the raidz set needs to be accessed to retrieve each item.

I don't understand, if the file is contained within a single stripe, why
would it need to access the other disks, if the checksum of the stripe
is OK? Also, why wouldn't it be able to concurrently access different
disks for multiple reads?

-mg


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Z-Raid performance with Random reads/writes

2007-06-20 Thread Ian Collins
Mario Goebbels wrote:
 A 6 disk raidz set is not optimal for random reads, since each disk in 
 the raidz set needs to be accessed to retrieve each item.
 

 I don't understand, if the file is contained within a single stripe, why
 would it need to access the other disks, if the checksum of the stripe
 is OK? Also, why wouldn't it be able to concurrently access different
 disks for multiple reads?

   
The item is striped across all the drives, so you have to wait for the
slowest drive.

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Oliver Schinagl


Pawel Jakub Dawidek wrote:
 On Tue, Jun 19, 2007 at 07:52:28PM -0700, Richard Elling wrote:
   
 On that note, i have a different first question to start with. I
 personally am a Linux fanboy, and would love to see/use ZFS on linux. I
 assume that I can use those ZFS disks later with any os that can
 work/recognizes ZFS correct? e.g.  I can install/setup ZFS in FBSD, and
 later use it in OpenSolaris/Linux Fuse(native) later?
   
 The on-disk format is an available specification and is designed to be
 platform neutral.  We certainly hope you will be able to access the
 zpools from different OSes (one at a time).
 

 Will be nice to not EFI label disks, though:) Currently there is a
 problem with this - zpool created on Solaris is not recognized by
 FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
 hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.

   

I read earlier that it's recommended to use a whole disk instead of a
partition with ZFS; the thing that's holding me back, however, is the
mixture of different-sized disks I have. Suppose I had a RAID-Z using 300GB
per disk across three 300GB disks and one 320GB disk, where the larger disk
only has a 300GB partition on it (still with me?). Could I later expand that
partition with fdisk so that the entire RAID-Z expands to 320GB per disk
(assuming the other disks magically gain 20GB, so this is a bad example in
that sense :) )?

Also, what about a full disk vs. a full partition, i.e. making one partition
that spans the entire disk vs. using the entire disk?
Is there any significant performance penalty? (So not a disk split into two
partitions, but one disk, one partition.) I read that with a full raw disk
ZFS is better able to utilize the disk's write cache, but I don't see how.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Roshan Perera
Hi all,

Is there a place where I can find a ZFS best practices guide for use against DMX, 
and a roadmap for ZFS?

Also, the customer is now looking at big ZFS installations in production. Would 
you guys happen to know where I can find details on the number of current 
installations? We are looking at almost 10 terabytes of data to be stored on 
DMX using ZFS (the customer is not comfortable with the RAID-Z solution, in addition 
to their best practice of raiding at the DMX level). Any feedback, experiences and, 
more importantly, gotchas will be much appreciated.

Thanks in advance.

Roshan



- Original Message -
From: Roshan Perera [EMAIL PROTECTED]
Date: Wednesday, June 20, 2007 10:49 am
Subject: Re: [zfs-discuss] Re: ZFS - SAN and Raid
To: [EMAIL PROTECTED]
Cc: Bruce McAlister [EMAIL PROTECTED], zfs-discuss@opensolaris.org, Richard 
Elling [EMAIL PROTECTED]

 
 
  But Roshan, if your pool is not replicated from ZFS'
  point of view, then all the multipathing and raid
  controller backup in the world will not make a difference.
 
 James, I agree from the ZFS point of view. However, from the EMC or
 the customer point of view, they want to do the replication at the
 EMC level and not from ZFS. By replicating at the ZFS level they
 will lose some storage and it doubles the replication. It's just that the
 customer is used to working with Veritas and UFS and they don't want
 to change their habits. I just have to convince the customer to
 use ZFS replication.
 
 Thanks again
 
 
  
  
  
  James C. McPherson
  --
  Solaris kernel software engineer
  Sun Microsystems
  
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread James C. McPherson

Roshan Perera wrote:

Hi all,

Is there a place where I can find a ZFS best practices guide for use against
DMX, and a roadmap for ZFS?
Also, the customer is now looking at big ZFS installations in production.
Would you guys happen to know where I can find details on the number
of current installations? We are looking at almost 10 terabytes of data
to be stored on DMX using ZFS (the customer is not comfortable with the RAID-Z
solution, in addition to their best practice of raiding at the DMX level). Any
feedback, experiences and, more importantly, gotchas will be much
appreciated.


http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
and

I know Ben Rockwood (now of Joyent) has blogged about how much
storage they're using, all managed with ZFS... I just can't
find the blog entry.

Hope this helps,
James C. McPherson
--
Solaris kernel software engineer
Sun Microsystems
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Constantin Gonzalez
Hi,

 How are paired mirrors more flexible?

well, I'm talking about a small home system. If the pool gets full, the
way to expand with RAID-Z would be to add 3+ disks (typically 4-5).

With mirrors only, you just add two. So in my case it's just about
the granularity of expansion.

The reasoning is that of the three factors reliability, performance and
space, I value them in this order. Space comes last since disk space
is cheap.

If I had a bigger number of disks (12+), I'd be using them in RAID-Z2
sets (4+2 plus 4+2 etc.). Here, the speed is ok and the reliability is
ok and so I can use RAID-Z2 instead of mirroring to get some extra
space as well.

 Right now, I have a 3-disk raid 5 running with the Linux DM driver. One
 of the most recent additions was raid5 expansion, so I could pop in a
 matching disk and expand my raid5 to 4 disks instead of 3 (which is
 always interesting as you're cutting down on your parity loss). I think,
 though, that in raid5 you shouldn't put more than 6 - 8 disks afaik, so I
 wouldn't be expanding this endlessly.
 
 So how would this translate to ZFS? I have learned so far that, ZFS

ZFS does not yet support rearranging the disk configuration. Right now,
you can expand a single disk to a mirror or an n-way mirror to an n+1-way
mirror.
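
For example (device names invented), a single disk is turned into a mirror
with zpool attach:

  # attach a second disk to the existing single-disk vdev
  zpool attach tank c1t0d0 c1t1d0

  # attach a third disk to make it a 3-way mirror
  zpool attach tank c1t0d0 c1t2d0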

RAID-Z vdevs can't be changed right now. But you can add more disks
to a pool by adding more vdevs (if you have a 1+1 mirror, add another 1+1
pair and get more space; if you have a 3+2 RAID-Z2, add another 5+2 RAID-Z2, etc.).

 basically is raid + LVM. e.g. the mirrored raid-z pairs go into the
 pool, just like one would use LVM to bind all the raid pairs. The
 difference being I suppose, that you can't use a zfs mirror/raid-z
 without having a pool to use it from?

Here's the basic idea:

- You first construct vdevs from disks:

  One disk can be one vdev.
  A 1+1 mirror can be a vdev, too.
  An n+1 or n+2 RAID-Z (RAID-Z2) set can be a vdev, too.

- Then you concatenate vdevs to create a pool. Pools can be extended by
  adding more vdevs.

- Then you create ZFS file systems that draw their block usage from the
  resources supplied by the pool. Very flexible.
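
A minimal sketch of those three steps, with made-up pool and disk names:

  # a pool made of one mirrored vdev
  zpool create tank mirror c1t0d0 c1t1d0

  # grow the pool later by adding another vdev, here a 3+1 RAID-Z
  zpool add tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

  # file systems then draw their blocks from whatever the pool provides
  zfs create tank/home
  zfs create tank/media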

 What I'm wondering now is if I can simply add a new disk to my raid-z and
 have it 'just work', e.g. the raid-z would be expanded to use the new
 disk (partition of matching size).

If you have a RAID-Z based pool in ZFS, you can add another group of disks
that are organized in a RAID-Z manner (a vdev) to expand the storage capacity
of the pool.

Hope this clarifies things a bit. And yes, please check out the admin guide and
the other collateral available on ZFS. It's full of new concepts, and it takes
some getting used to before you can explore all the possibilities.

Cheers,
   Constantin

-- 
Constantin Gonzalez                          Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread James C. McPherson

Roshan Perera wrote:



But Roshan, if your pool is not replicated from ZFS' point of view,
then all the multipathing and raid controller backup in the world will
not make a difference.


James, I agree from the ZFS point of view. However, from the EMC or the
customer point of view, they want to do the replication at the EMC level
and not from ZFS. By replicating at the ZFS level they will lose some
storage and it doubles the replication. It's just that the customer is used to
working with Veritas and UFS and they don't want to change their habits.
I just have to convince the customer to use ZFS replication.


Hi Roshan,
that's a great shame because if they actually want
to make use of the features of ZFS such as replication,
then they need to be serious about configuring their
storage to play in the ZFS world and that means
replication that ZFS knows about.



James C. McPherson
--
Solaris kernel software engineer
Sun Microsystems
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread mike

On 6/20/07, Constantin Gonzalez [EMAIL PROTECTED] wrote:


 One disk can be one vdev.
 A 1+1 mirror can be a vdev, too.
 A n+1 or n+2 RAID-Z (RAID-Z2) set can be a vdev too.

- Then you concatenate vdevs to create a pool. Pools can be extended by
 adding more vdevs.

- Then you create ZFS file systems that draw their block usage from the
 resources supplied by the pool. Very flexible.


This actually brings up something I was wondering about last night:

If I were to plan for a 16-disk ZFS-based system, you would probably
suggest that I configure it as something like 5+1, 4+1, 4+1, all raid-z
(I don't need the double-parity concept).

I would prefer something like 15+1 :) I want ZFS to be able to detect
and correct errors, but I do not need to squeeze all the performance
out of it (I'll be using it as a home storage server for my DVDs and
other audio/video stuff, so only a few clients at most streaming
off of it).

I would be interested in hearing if there are any other configuration
options to squeeze the most space out of the drives. I have no issue
with powering down to replace a bad drive, and I expect that at most
one will fail at a time. If I really do need room for two to fail,
then I suppose I can look for a setup with 14 drives' worth of usable
space and use raidz2.

Thanks,
mike
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Constantin Gonzalez
Hi Mike,

 If I was to plan for a 16 disk ZFS-based system, you would probably
 suggest me to configure it as something like 5+1, 4+1, 4+1 all raid-z
 (I don't need the double parity concept)
 
 I would prefer something like 15+1 :) I want ZFS to be able to detect
 and correct errors, but I do not need to squeeze all the performance
 out of it (I'll be using it as a home storage server for my DVDs and
 other audio/video stuff. So only a few clients at the most streaming
 off of it)

this is possible. ZFS in theory does not significantly limit n, and 15+1
is indeed possible.

But for a number of reasons (among them performance) people generally
advise to use no more than 10+1.

A lot of ZFS configuration wisdom can be found on the Solaris internals
ZFS Best Practices Guide Wiki at:

  http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

Richard Elling has done a great job of thoroughly analyzing different
reliability concepts for ZFS in his blog. One good introduction is the
following entry:

  http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance

That may help you find the right tradeoff between space and reliability.

Hope this helps,
   Constantin


-- 
Constantin Gonzalez                          Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Paul Fisher
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of mike
 Sent: Wednesday, June 20, 2007 9:30 AM
 
 I would prefer something like 15+1 :) I want ZFS to be able to detect
 and correct errors, but I do not need to squeeze all the performance
 out of it (I'll be using it as a home storage server for my DVDs and
 other audio/video stuff. So only a few clients at the most streaming
 off of it)

I would not risk raidz on that many disks.  A nice compromise may be 14+2 
raidz2, which should perform nicely for your workload and be pretty reliable 
when the disks start to fail.
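
For example (device names are placeholders), that would be a single 16-disk
raidz2 top-level vdev:

  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
      c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0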


--

paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread mike

On 6/20/07, Paul Fisher [EMAIL PROTECTED] wrote:

I would not risk raidz on that many disks.  A nice compromise may be 14+2 
raidz2, which should perform nicely for your workload and be pretty reliable 
when the disks start to fail.


Would anyone on the list not recommend this setup? I could live with 2
drives being used for parity (or the parity concept)

I would be able to reap the benefits of ZFS - self-healing, corrupted
file reconstruction (since it has some parity to read from) and should
have decent performance (obviously not smokin' since I am not
configuring this to try for the fastest possible)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Tomas Ögren
On 20 June, 2007 - Oliver Schinagl sent me these 1,9K bytes:

 Also what about full disk vs full partition, e.g. make 1 partition to
 span the entire disk vs using the entire disk.
 Is there any significant performance penalty? (So not having a disk
 split into 2 partitions, but 1 disk, 1 partition) I read that with a
 full raw disk zfs will be better able to utilize the disk's write cache, but I
 don't see how.

Because when given a whole disk, ZFS can safely play with the write
cache in the disk without jeopardizing a UFS or other file system that might
be on some other slice. This helps when ZFS is batch-writing a transaction group.
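
If you are curious what ZFS did on a given drive, the write cache can be
inspected from format's expert mode (going from memory here, so treat the
exact menu names as approximate):

  # format -e
  #   ... select the disk ...
  #   format> cache
  #   cache> write_cache
  #   write_cache> display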

/Tomas
-- 
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Porting ZFS file system to FreeBSD (BSDCan 2007)

2007-06-20 Thread Rayson Ho

http://www.bsdcan.org/2007/schedule/events/43.en.html

Direct link to the presentation:
http://www.bsdcan.org/2007/schedule/attachments/27-Porting_ZFS_file_system_to_FreeBSD_Pawel_Jakub_Dawidek.pdf

And presentation for Asia BSDCon 2007:
http://asiabsdcon.org/papers/P16-slides.pdf
http://asiabsdcon.org/papers/P16-paper.pdf

Rayson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Will Murnane

On 6/20/07, mike [EMAIL PROTECTED] wrote:

On 6/20/07, Paul Fisher [EMAIL PROTECTED] wrote:
 I would not risk raidz on that many disks.  A nice compromise may be 14+2
 raidz2, which should perform nicely for your workload and be pretty reliable
 when the disks start to fail.
Would anyone on the list not recommend this setup? I could live with 2
drives being used for parity (or the parity concept)

Yes.  2 disks means when one fails, you've still got an extra.  In
raid 5 boxes, it's not uncommon with large arrays for one disk to die,
and when it's replaced, the stress on the other disks causes another
failure.  Then the array is toast.  I don't know if this is a problem
on ZFS... but they took the time to implement raidz2, so I'd suggest
it.


I would be able to reap the benefits of ZFS - self-healing, corrupted
file reconstruction (since it has some parity to read from) and should
have decent performance (obviously not smokin' since I am not
configuring this to try for the fastest possible)

And since you'll generally be doing full-stripe reads and writes, you
get good bandwidth anyways.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Notes for Cindys and Goo

2007-06-20 Thread Will Murnane

On 6/20/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Huitzi,

Awesome graphics! Do we have your permission to use them? :-)
I might need to recreate them in another format.

The numbers don't look quite right.  Shouldn't the first image have a
600GB zpool as a result, not 400GB?  Similarly, the second image
should be 200GB and then 600GB.

I'd suggest the yEd graph editor, here, for a format:
http://www.yworks.com/en/products_yed_about.htm  It's not open
sourced, but it is free.  Or there's always xfig.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Pawel Jakub Dawidek
On Wed, Jun 20, 2007 at 01:45:29PM +0200, Oliver Schinagl wrote:
 
 
 Pawel Jakub Dawidek wrote:
  On Tue, Jun 19, 2007 at 07:52:28PM -0700, Richard Elling wrote:

  On that note, i have a different first question to start with. I
  personally am a Linux fanboy, and would love to see/use ZFS on linux. I
  assume that I can use those ZFS disks later with any os that can
  work/recognizes ZFS correct? e.g.  I can install/setup ZFS in FBSD, and
  later use it in OpenSolaris/Linux Fuse(native) later?

  The on-disk format is an available specification and is designed to be
  platform neutral.  We certainly hope you will be able to access the
  zpools from different OSes (one at a time).
  
 
  Will be nice to not EFI label disks, though:) Currently there is a
  problem with this - zpool created on Solaris is not recognized by
  FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
  hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.
 

 
 I read this earlier, that it's recommended to use a whole disk instead
 of a partition with zfs, the thing that's holding me back however is the
 mixture of different sized disks I have. I suppose if I had a 300gb per
 disk raid-z going on 3 300 disk and one 320gb disk, but only have a
 partition of 300gb on it (still with me), i could later expand that
 partition with fdisk and the entire raid-z would then expand to 320gb
 per disk (assuming the other disks magically gain 20gb, so this is a bad
 example in that sense :) )
 
 Also what about full disk vs full partition, e.g. make 1 partition to
 span the entire disk vs using the entire disk.
 Is there any significant performance penalty? (So not having a disk
 split into 2 partitions, but 1 disk, 1 partition) I read that with a
 full raw disk zfs will be better able to utilize the disk's write cache, but I
 don't see how.

On FreeBSD (thanks to GEOM) it makes no difference what you have
under ZFS. On Solaris, ZFS turns on the disk's write cache when a whole disk
is used. On FreeBSD the write cache is enabled by default, and GEOM consumers
can send a write-cache-flush (BIO_FLUSH) request to any GEOM provider.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Torrey McMahon

James C. McPherson wrote:

Roshan Perera wrote:



But Roshan, if your pool is not replicated from ZFS' point of view,
then all the multipathing and raid controller backup in the world will
not make a difference.


James, I agree from the ZFS point of view. However, from the EMC or the
customer point of view, they want to do the replication at the EMC level
and not from ZFS. By replicating at the ZFS level they will lose some
storage and it doubles the replication. It's just that the customer is used to
working with Veritas and UFS and they don't want to change their
habits.

I just have to convince the customer to use ZFS replication.


Hi Roshan,
that's a great shame because if they actually want
to make use of the features of ZFS such as replication,
then they need to be serious about configuring their
storage to play in the ZFS world and that means
replication that ZFS knows about.



Also, how does replication at the ZFS level use more storage - I'm
assuming raw block - than at the array level?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Oliver Schinagl


Pawel Jakub Dawidek wrote:
 On Wed, Jun 20, 2007 at 01:45:29PM +0200, Oliver Schinagl wrote:
   
 Pawel Jakub Dawidek wrote:
 
 On Tue, Jun 19, 2007 at 07:52:28PM -0700, Richard Elling wrote:
   
   
 On that note, i have a different first question to start with. I
 personally am a Linux fanboy, and would love to see/use ZFS on linux. I
 assume that I can use those ZFS disks later with any os that can
 work/recognizes ZFS correct? e.g.  I can install/setup ZFS in FBSD, and
 later use it in OpenSolaris/Linux Fuse(native) later?
   
   
 The on-disk format is an available specification and is designed to be
 platform neutral.  We certainly hope you will be able to access the
 zpools from different OSes (one at a time).
 
 
 Will be nice to not EFI label disks, though:) Currently there is a
 problem with this - zpool created on Solaris is not recognized by
 FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
 hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.

   
   
 I read this earlier, that it's recommended to use a whole disk instead
 of a partition with zfs, the thing that's holding me back however is the
 mixture of different sized disks I have. I suppose if I had a 300gb per
 disk raid-z going on 3 300 disk and one 320gb disk, but only have a
 partition of 300gb on it (still with me), i could later expand that
 partition with fdisk and the entire raid-z would then expand to 320gb
 per disk (assuming the other disks magically gain 20gb, so this is a bad
 example in that sense :) )

 Also what about full disk vs full partition, e.g. make 1 partition to
 span the entire disk vs using the entire disk.
 Is there any significant performance penalty? (So not having a disk
 split into 2 partitions, but 1 disk, 1 partition) I read that with a
 full raw disk zfs will be better able to utilize the disk's write cache, but I
 don't see how.
 

 On FreeBSD (thanks to GEOM) there is no difference what do you have
 under ZFS. On Solaris, ZFS turns on write cache on disk when whole disk
 is used. On FreeBSD write cache is enabled by default and GEOM consumers
 can send write-cache-flush (BIO_FLUSH) request to any GEOM providers.

   
So basically, what you are saying is that on FBSD there's no performance
issue, whereas on Solaris there can be (if write caches aren't enabled)?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Eric Schrock
On Wed, Jun 20, 2007 at 12:45:52PM +0200, Pawel Jakub Dawidek wrote:
 
 Will be nice to not EFI label disks, though:) Currently there is a
 problem with this - zpool created on Solaris is not recognized by
 FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
 hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.
 

FYI, the primary reason for using EFI labels is that they are
endian-neutral, unlike Solaris VTOC.  The secondary reason is that they
are simpler and easier to use (at least on Solaris).

I'm curious why FreeBSD claims the GPT label is corrupted.  Is this
because FreeBSD doesn't understand EFI labels, our EFI label is bad, or
is there a bug in the FreeBSD EFI implementation?

Thanks,

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Victor Engle

On 6/20/07, Torrey McMahon [EMAIL PROTECTED] wrote:
Also, how does replication at the ZFS level use more storage - I'm
assuming raw block - than at the array level?
___



Just to add to the previous comments. In the case where you have a SAN
array providing storage to a host for use with ZFS the SAN storage
really needs to be redundant in the array AND the zpools need to be
redundant pools.

The reason the SAN storage should be redundant is that SAN arrays are
designed to serve logical units. The logical units are usually
allocated from a raid set, storage pool or aggregate of some kind. The
array side pool/aggregate may include 10 300GB disks and may have 100+
luns allocated from it, for example. If redundancy is not used in the
array-side pool/aggregate, then one disk failure will kill 100+ luns
at once.

On 6/20/07, Torrey McMahon [EMAIL PROTECTED] wrote:

James C. McPherson wrote:
 Roshan Perera wrote:

 But Roshan, if your pool is not replicated from ZFS' point of view,
 then all the multipathing and raid controller backup in the world will
 not make a difference.

 James, I agree from the ZFS point of view. However, from the EMC or the
 customer point of view, they want to do the replication at the EMC level
 and not from ZFS. By replicating at the ZFS level they will lose some
 storage and it doubles the replication. It's just that the customer is used to
 working with Veritas and UFS and they don't want to change their
 habits.
 I just have to convince the customer to use ZFS replication.

 Hi Roshan,
 that's a great shame because if they actually want
 to make use of the features of ZFS such as replication,
 then they need to be serious about configuring their
 storage to play in the ZFS world and that means
 replication that ZFS knows about.


Also, how does replication at the ZFS level use more storage - I'm
assuming raw block - than at the array level?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Slow write speed to ZFS pool (via NFS)

2007-06-20 Thread Joe S

After researching this further, I found that there are some known
performance issues with NFS + ZFS. I tried transferring files via SMB, and
got write speeds of 25MB/s on average.

So I will have my UNIX systems use SMB to write files to my Solaris server.
This seems weird, but it's fast. I'm sure Sun is working on fixing this. I
can't imagine running a Sun box without NFS.



On 6/20/07, Mario Goebbels [EMAIL PROTECTED] wrote:


 Correction:

 SATA Controller is a Silicon Image 3114, not a 3112.

Do these slow speeds only appear when writing via NFS or generally in
all scenarios? Just asking, because Solaris' ata driver doesn't
initialize settings like block mode, prefetch and such on IDE/SATA
drives (that is if ata applies here with that chipset).

-mg


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Gary Mills
On Wed, Jun 20, 2007 at 12:23:18PM -0400, Torrey McMahon wrote:
 James C. McPherson wrote:
 Roshan Perera wrote:
 
 But Roshan, if your pool is not replicated from ZFS' point of view,
 then all the multipathing and raid controller backup in the world will
 not make a difference.
 
 James, I agree from the ZFS point of view. However, from the EMC or the
 customer point of view, they want to do the replication at the EMC level
 and not from ZFS. By replicating at the ZFS level they will lose some
 storage and it doubles the replication. It's just that the customer is used to
 working with Veritas and UFS and they don't want to change their
 habits.
 I just have to convince the customer to use ZFS replication.
 
 that's a great shame because if they actually want
 to make use of the features of ZFS such as replication,
 then they need to be serious about configuring their
 storage to play in the ZFS world and that means
 replication that ZFS knows about.
 
 Also, how does replication at the ZFS level use more storage - I'm 
 assuming raw block - than at the array level?

SAN storage generally doesn't work that way.  They use some magical
redundancy scheme, which may be RAID-5 or WAFL, from which the Storage
Administrator carves out virtual disks.  These are best viewed as an
array of blocks.  All disk administration, such as replacing failed
disks, takes place on the storage device without affecting the virtual
disks.  There's no need for disk administration or additional
redundancy on the client side.  If more space is needed on the client,
the Storage Administrator simply expands the virtual disk by extending
its blocks.  ZFS needs to play nicely in this environment because
that's what's available in large organizations that have centralized
their storage.  Asking for raw disks doesn't work.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs and EMC

2007-06-20 Thread Carisdad

Dominik Saar wrote:

Hi there,

I see strange behavior when I create a ZFS pool on an EMC PowerPath
pseudo device.

I can create a pool on emcpower0a
but not on emcpower2a.

zpool core dumps with invalid argument.

That's my second machine with PowerPath and ZFS;
the first one works fine, even ZFS/PowerPath and failover ...

Is there anybody who has the same failure and a solution? :)
  
I've had the same failure, but no solution, yet.  These were drives in 
an EMC CX3-80.  I was able to create zpools out of drives which were 
hosted by SP-A, but not those which defaulted to SP-B.


-Andy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Oliver Schinagl


mike wrote:
 On 6/20/07, Constantin Gonzalez [EMAIL PROTECTED] wrote:

  One disk can be one vdev.
  A 1+1 mirror can be a vdev, too.
  A n+1 or n+2 RAID-Z (RAID-Z2) set can be a vdev too.

 - Then you concatenate vdevs to create a pool. Pools can be extended by
  adding more vdevs.

 - Then you create ZFS file systems that draw their block usage from the
  resources supplied by the pool. Very flexible.

 This actually brings up something I was wondering about last night:

 If I was to plan for a 16 disk ZFS-based system, you would probably
 suggest me to configure it as something like 5+1, 4+1, 4+1 all raid-z
 (I don't need the double parity concept)

 I would prefer something like 15+1 :) I want ZFS to be able to detect
 and correct errors, but I do not need to squeeze all the performance
 out of it (I'll be using it as a home storage server for my DVDs and
 other audio/video stuff. So only a few clients at the most streaming
 off of it)

 I would be interested in hearing if there are any other configuration
 options to squeeze the most space out of the drives. I have no issue
 with powering down to replace a bad drive, and I expect that I'll only
Just know that, if your server/disks are up all the time, shutting down
your server while you wait for replacement drives might actually kill
your array, especially with consumer IDE/SATA drives.

Those pesky consumer drives aren't made for 24/7 usage; I think they
spec them at 8 hours a day? Either way, that's me being sidetracked. The
problem is, you'll have a disk spinning normally, with some access and the
same temperature, all the time. All of a sudden you change the environment:
you let it cool down and what not. Hard disks don't like that at all! I've
even heard of hard disk (cases) cracking because of the temperature
differences and such.

My requirements are the same, and I want space, but the thought of
having more disks die on me while I replace the broken one doesn't
really make me happy either. (I personally use only the WD RAID editions
of HDDs; whether it's worth it or not, I dunno, but they have a better
warranty and supposedly should be able to run 24/7.)

 have one at the most fail at a time. If I really do need room for two
 to fail then I suppose I can look for a 14 drive space usable setup
 and use raidz-2.

 Thanks,
 mike
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Shrinking of Pools.

2007-06-20 Thread Casper . Dik

One of the reasons I switched back from X/JFS to ReiserFS on my Linux
box was that I couldn't shrink the FS on top of my LVM, which was highly
annoying. Also, sometimes you might want to just remove a disk from your
array: say you set up a mirrored ZFS with two 120GB disks. 4 years
later, you get some of those fancy 1TB disks, say 3 or 4 of them, and
raid-z them. Not only would those 120GB disks be insignificant, but maybe
they've become a liability: they are old, replacing them isn't that easy
anymore, who still sells disks that size, and why bother if you have plenty
of space.

Your scenario is adequately covered by zpool replace; you can
replace disks with bigger disks or disks of the same size.
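
For instance (device names invented), replacing the old 120 GB mirror members
with bigger drives one at a time:

  zpool replace tank c1t0d0 c3t0d0   # swap the first old disk for a new one
  zpool status tank                  # wait for the resilver to finish
  zpool replace tank c1t1d0 c3t1d0   # then swap the second one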

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Bill Sommerfeld
On Wed, 2007-06-20 at 12:45 +0200, Pawel Jakub Dawidek wrote:
 Will be nice to not EFI label disks, though:) Currently there is a
 problem with this - zpool created on Solaris is not recognized by
 FreeBSD, because FreeBSD claims GPT label is corrupted.

Hmm.  I'd think the right answer here is to understand why FreeBSD and
Solaris disagree about EFI/GPT labels.  It could be a Solaris bug or it could
be a FreeBSD bug, but the intent of the label format is to permit
interchange between different platforms.

- Bill


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Shrinking of Pools.

2007-06-20 Thread Oliver Schinagl



[EMAIL PROTECTED] wrote:

One of the reasons I switched back from X/JFS to ReiserFS on my Linux
box was that I couldn't shrink the FS on top of my LVM, which was highly
annoying. Also, sometimes you might want to just remove a disk from your
array: say you set up a mirrored ZFS with two 120GB disks. 4 years
later, you get some of those fancy 1TB disks, say 3 or 4 of them, and
raid-z them. Not only would those 120GB disks be insignificant, but maybe
they've become a liability: they are old, replacing them isn't that easy
anymore, who still sells disks that size, and why bother if you have plenty
of space.



Your scenario is adequately covered by zpool replace; you can
replace disks with bigger disks or disks of the same size.

Casper
  

yes, but can I replace a mirror with a raid-z?

What I understood was that I can have a 5-way mirror, remove/replace 4 
disks fine, but I can't remove the 5th disk. I imagine I can add 1 (or 
4) bigger disks, let it resync etc. and then pull the last disk, but what 
if I wanted to go from mirror to raid-z? Would that still work with zpool 
replace? (Or am I simply not far enough into the document?)


oliver
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Pawel Jakub Dawidek
On Wed, Jun 20, 2007 at 09:48:08AM -0700, Eric Schrock wrote:
 On Wed, Jun 20, 2007 at 12:45:52PM +0200, Pawel Jakub Dawidek wrote:
  
  Will be nice to not EFI label disks, though:) Currently there is a
  problem with this - zpool created on Solaris is not recognized by
  FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
  hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.
  
 
 FYI, the primary reason for using EFI labels is that they are
 endian-neutral, unlike Solaris VTOC.  The secondary reason is that they
 are simpler and easier to use (at least on Solaris).
 
 I'm curious why FreeBSD claims the GPT label is corrupted.  Is this
 because FreeBSD doesn't understand EFI labels, our EFI label is bad, or
 is there a bug in the FreeBSD EFI implementation?

I haven't investigated this yet. FreeBSD should understand EFI, so it's
either one of the last two or a bug in the Solaris EFI implementation:) I
seem to recall similar problems on Linux with ZFS/FUSE...

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: New german white paper on ZFS

2007-06-20 Thread roland
Nice one!

I think this is one of the best and most comprehensive papers about ZFS I
have seen.

regards
roland
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Notes for Cindys and Goo

2007-06-20 Thread Richard Elling

Will Murnane wrote:

On 6/20/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Huitzi,

Awesome graphics! Do we have your permission to use them? :-)
I might need to recreate them in another format.

The numbers don't look quite right.  Shouldn't the first image have a
600GB zpool as a result, not 400GB?  Similarly, the second image
should be 200GB and then 600GB.

I'd suggest the yEd graph editor, here, for a format:
http://www.yworks.com/en/products_yed_about.htm  It's not open
sourced, but it is free.  Or there's always xfig.


StarOffice does objects and connectors.  It is included in Solaris.
I didn't add folders because I'm not sure if the folder was intended
to represent a file system or directory... either way the folders didn't
seem to clarify the intent to show that the zpool can be dynamically
expanded.
 -- richard
[attachments: zpool-desc.png, zpool-desc.odg - zpool diagram]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS-fuse on linux

2007-06-20 Thread mario heimel
Linux is the first operating system that can boot from RAID-1+0, RAID-Z or 
RAID-Z2 ZFS pools; it's a really cool trick to put zfs-fuse in the initramfs.
(Solaris can only boot from single-disk or RAID-1 pools.)


http://www.linuxworld.com/news/2007/061807-zfs-on-linux.html
http://groups.google.com/group/zfs-fuse/browse_thread/thread/3e781ace9de600bc/230ca0608235e216?lnk=gst&q=boot&rnum=1#230ca0608235e216
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS-fuse on linux

2007-06-20 Thread Eric Schrock
On Wed, Jun 20, 2007 at 01:25:35PM -0700, mario heimel wrote:
 Linux is the first operating system that can boot from RAID-1+0,
 RAID-Z or RAID-Z2 ZFS, really cool trick to put zfs-fuse in the
 initramfs.  ( Solaris can only boot from single-disk or RAID-1 pools ) 

Note that this method is much like the old 'UFS mountroot' support which
was replaced in favor of the current native GRUB boot. This method could
support arbitrary pools at the cost of maintaining an extra UFS slice.
You basically boot from something that GRUB can read (in this case
initramfs), and then switch to the real boot environment.

This is very different from booting natively from GRUB, as you have to
maintain two separate boot environments (one on your initramfs and one
in your ZFS root).

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS-fuse on linux

2007-06-20 Thread eric kustarz


On Jun 20, 2007, at 1:25 PM, mario heimel wrote:

Linux is the first operating system that can boot from RAID-1+0,  
RAID-Z or RAID-Z2 ZFS, really cool trick to put zfs-fuse in the  
initramfs.

( Solaris can only boot from single-disk or RAID-1 pools )


http://www.linuxworld.com/news/2007/061807-zfs-on-linux.html
http://groups.google.com/group/zfs-fuse/browse_thread/thread/3e781ace9de600bc/230ca0608235e216?lnk=gst&q=boot&rnum=1#230ca0608235e216


cool stuff!

Looks like the FUSE port to linux is getting an entirely different  
audience excited about ZFS... nice.


eric

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Richard Elling

Oliver Schinagl wrote:

zo basically, what you are saying is that on FBSD there's no performane
issue, whereas on solaris there (can be if write caches aren't enabled)


Solaris plays it safe by default.  You can, of course, override that safety.
Whether it is a performance win seems to be the subject of some debate,
but intuitively it seems like it should help for most cases.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS Scalability/performance

2007-06-20 Thread Toby Thain


On 20-Jun-07, at 12:23 PM, Richard L. Hamilton wrote:


Hello,

I'm quite interested in ZFS, like everybody else I suppose, and am about
to install FBSD with ZFS.

On that note, I have a different first question to start with. I
personally am a Linux fanboy, and would love to see/use ZFS on Linux. I
assume that I can use those ZFS disks later with any OS that
works with/recognizes ZFS, correct? E.g. I can install/set up ZFS in FBSD,
and later use it in OpenSolaris/Linux FUSE (native)?

I've seen some discussions that implied adding attributes
to support non-Solaris (*BSD) uses of zfs, so that the format would
remain interoperable (i.e. free of incompatible extensions),
although not all OSs might fully support those.  But I don't know
if there's some firm direction to keeping the on-disk format
compatible across platforms that zfs is ported to.  Indeed, if the
code is open-source, I'm not sure that's possible to _enforce_.  But
I suspect (and certainly hope) it's being encouraged.  If someone who
works on zfs could comment on that, it might help.


Matt Ahrens recently did, on this list:

... as a leader of Sun's ZFS team, and the OpenSolaris ZFS
community, I would do everything in my power to prevent the ZFS
on-disk format from diverging in different implementations.  Let's
discuss the issues on this mailing list as they come up, and try to
arrive at a conclusion which offers the best ZFS for *all* ZFS
users, OpenSolaris or otherwise.

...
FYI, we're already working with engineers on some other ports to
ensure on-disk compatibility.  Those changes are going smoothly.
So please, contact us if you want to make (or want us to make)
on-disk changes to ZFS for your port or distro.  We aren't that
difficult to work with :-)


--mat


--Toby




Anyway, back to business :)
I have a whole bunch of different sized disks/speeds. E.g. 3 300GB disks
@ 40MB/s, a 320GB disk @ 60MB/s, 3 120GB disks @ 50MB/s and so on.

Raid-Z and ZFS claim to be uber scalable and all that, but would it
'just work' with a setup like that too?

I used to match up partition sizes in Linux, so make the 320GB disk into
2 partitions of 300 and 20GB, then use the 4 300GB partitions as a
raid5, same with the 120 gigs and use the scrap on those as well, finally
stitching everything together with LVM2. I can't easily find how this
would work with raid-Z/ZFS, e.g. can I really just put all these disks
in 1 big pool and remove/add to it at will? And do I really not need to
use software raid yet still have the same reliability with raid-z as I had
with raid-5? What about hardware raid controllers: just use them as a JBOD
device, or would I use them to match up disk sizes in raid0 stripes (e.g.
the 3x 120GB to make a 360 raid0)?

Or would you recommend to just stick with raid/lvm/reiserfs and use that?


One of the advantages of zfs is said to be that if it's used
end-to-end, it can catch more potential data integrity issues
(including controller, disk, cabling glitches, misdirected writes,  
etc).


As far as I understand, raid-z is like raid-5 except that the stripes
are of varying size, so all writes are full-stripe, closing the write hole,
so no NVRAM is needed to ensure that recovery would always be possible.

Components of raid-z or raid-z2 or mirrors can AFAIK only be used up to the
size of the smallest component.  However, a zpool can consist of
the aggregation (dynamic striping, I think) of various mirrors or raid-z[2]
virtual devices.  So you could group similar sized chunks (be it partitions
or whole disks) into redundant virtual devices, and aggregate them all into
a zpool (and add more later to grow it, too).  Ideally, all such virtual
devices would have the same level of redundancy; I don't think that's
_required_, but there isn't much good excuse for doing otherwise, since the
performance of raid-z[2] is different from that of a mirror.

There may be some advantages to giving zfs entire disks where possible;
it will handle labelling (using EFI labels) and IIRC, may be able to better
manage the disk's write cache.

For the most part, I can't see many cases where using zfs together with
something else (like vxvm or lvm) would make much sense.  One possible
exception might be AVS (http://opensolaris.org/os/project/avs/) for
geographic redundancy; see
http://blogs.sun.com/AVS/entry/avs_and_zfs_seamless for more details.

It can be quite easy to use, with only two commands (zpool and zfs);
however, you still want to know what you're doing, and there are plenty of
issues and tradeoffs to consider to get the best out of it.

Look around a little for more info; for example,
http://www.opensolaris.org/os/community/zfs/faq/
http://en.wikipedia.org/wiki/ZFS
http://docs.sun.com/app/docs/doc/817-2271   (ZFS Administration Guide)
http://www.google.com/search?hl=en&q=zpool+OR+zfs+site%3Ablogs.sun.com&btnG=Search



This message posted from opensolaris.org
___

[zfs-discuss] Re: Best practice for moving FS between pool on same machine?

2007-06-20 Thread Chris Quenelle
Thanks, Constantin!  That sounds like the right answer for me.
Can I use send and/or snapshot at the pool level?  Or do I have
to use it on one filesystem at a time?  I couldn't quite figure this
out from the man pages.

--chris
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] legacy shared ZFS vs. ZFS NFS shares

2007-06-20 Thread Ed Ravin
Looking over the info at 

   
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_NFS_Server_Performance

I see this:

   Do not mix NFS legacy shared ZFS file systems and ZFS NFS shared file
   systems. Go with ZFS NFS shared file systems.

Other than which command turns on the NFS sharing, (shareall vs zfs share),
what is the difference between the two forms of exporting a ZFS filesystem
via NFS?  In particular, what differences might there be in an environment
composed of all NFSv3 clients?
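
For context, the two forms look roughly like this (paths and options are
only examples):

  # ZFS-managed sharing: the property travels with the dataset
  zfs set sharenfs=rw tank/export/home

  # legacy sharing: leave sharenfs off, put the export in /etc/dfs/dfstab,
  #   share -F nfs -o rw /tank/export/home
  # and then run
  shareall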

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] legacy shared ZFS vs. ZFS NFS shares

2007-06-20 Thread Cindy . Swearingen

Hi Ed,

This BP was added as a lesson learned about not mixing these
models, because it's too confusing to administer, and for no other reason.
I'll update the BP to be clear about this.

I'm sure someone else will answer your NFSv3 question. (I'd like
to know too).

Cindy



Ed Ravin wrote:
Looking over the info at 


   
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_NFS_Server_Performance

I see this:

   Do not mix NFS legacy shared ZFS file systems and ZFS NFS shared file
   systems. Go with ZFS NFS shared file systems.

Other than which command turns on the NFS sharing, (shareall vs zfs share),
what is the difference between the two forms of exporting a ZFS filesystem
via NFS?  In particular, what differences might there be in an environment
composed of all NFSv3 clients?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Migrating ZFS pool with zones from one host to another

2007-06-20 Thread Hector De Jesus
I have created a ZFS pool and I have installed a zone in the pool. For example, 
my pool name is hecpool (/hecpool) and I have installed my zone in the following 
location: /hecpool/zones/heczone.  Is there a way to migrate all of my pool data 
and zones to another Sun host if my pools are created on provisioned storage 
from a SAN? I have found an interesting blog article that discusses this, but it 
does not seem to work. I am able to import the pool on another machine, but the 
zone info is not there. zoneadm -z heczone attach or zoneadm -z heczone boot do 
not work.  Can anyone help with a set of steps to do a migration? Thanks in 
advance.

PS. please see this link,  these are the steps that i followed to try the 
migration
https://www.sdn.sap.com/irj/sdn/weblogs?blog=/pub/wlg/6162
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Install new Solaris - how to see old ZFS disk

2007-06-20 Thread Joubert Nel
Hi,

Stupid question I'm sure - I've just upgraded to Solaris Express Dev Edition 
(05/07) by installing over my previous Solaris 10 installation (intentionally, 
so as to get a clean setup).
The install is on Disk #1.

I also have a Disk #2, which was the sole disk in a ZFS pool under Solaris 10.
How can I now mount/incorporate/import this Disk #2 into a ZFS pool on my new 
Solaris so that I can see the data stored on that disk?

Joubert
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Install new Solaris - how to see old ZFS disk

2007-06-20 Thread Eric Schrock
On Wed, Jun 20, 2007 at 05:54:49PM -0700, Joubert Nel wrote:
 Hi,
 
 Stupid question I'm sure - I've just upgraded to Solaris Express Dev
 Edition (05/07) by installing over my previous Solaris 10 installation
 (intentionally, so as to get a clean setup).  The install is on Disk
 #1.
 
 I also have a Disk #2, which was the sole disk in a ZFS pool under
 Solaris 10.  How can I now mount/incorporate/import this Disk #2 into
 a ZFS pool on my new Solaris so that I can see the data stored on that
 disk?
 
 Joubert

Try 'zpool import'.
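
Something along these lines (the pool name is just an example):

  # list pools that are visible on attached disks but not yet imported
  zpool import

  # import the one that lives on disk #2, e.g. a pool called tank
  zpool import tank

  # if it wasn't cleanly exported on the old install, force it
  zpool import -f tank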

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Install new Solaris - how to see old ZFS disk

2007-06-20 Thread Michael Schuster

Joubert Nel wrote:

Hi,

Stupid question I'm sure - I've just upgraded to Solaris Express Dev Edition 
(05/07) by installing over my previous Solaris 10 installation (intentionally, 
so as to get a clean setup).
The install is on Disk #1.

I also have a Disk #2, which was the sole disk in a ZFS pool under Solaris 10.
How can I now mount/incorporate/import this Disk #2 into a ZFS pool on my new 
Solaris so that I can see the data stored on that disk?


zpool import

HTH
--
Michael Schuster
Recursion, n.: see 'Recursion'
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss