Re: [zfs-discuss] fchmod(2) returns ENOSPC on ZFS

2007-06-15 Thread Manoj Joseph

Matthew Ahrens wrote:
In a COW filesystem such as ZFS, it will sometimes be necessary to 
return ENOSPC in cases such as chmod(2) which previously did not.  This 
is because there could be a snapshot, so overwriting some information 
actually requires a net increase in space used.


That said, we may be generating this ENOSPC in cases where it is not 
strictly necessary (eg, when there are no snapshots).  We're working on 
some of these cases.  Can you show us the output of 'zfs list' when the 
ENOSPC occurs?


Is there a bug id for this?

Regards,
Manoj

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fchmod(2) returns ENOSPC on ZFS

2007-06-15 Thread Matthew Ahrens

Manoj Joseph wrote:

Matthew Ahrens wrote:
In a COW filesystem such as ZFS, it will sometimes be necessary to 
return ENOSPC in cases such as chmod(2) which previously did not.  
This is because there could be a snapshot, so overwriting some 
information actually requires a net increase in space used.


That said, we may be generating this ENOSPC in cases where it is not 
strictly necessary (eg, when there are no snapshots).  We're working 
on some of these cases.  Can you show us the output of 'zfs list' when 
the ENOSPC occurs?


Is there a bug id for this?


Can you search for ENOSPC in solaris/kernel/zfs?  (That's 
product/category/subcat.  I don't know how the external bug interface works.) 
 Or check out 6362156 and 6453407.


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Karma Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Alec Muffett
As I understand matters, from my notes to design the perfect home NAS 
server :-)


1) you want to give ZFS entire spindles if at all possible; that will 
mean it can enable and utilise the drive's hardware write cache 
properly, leading to a performance boost. You want to do this if you 
can.  Alas it knocks out the "split all disks into 7 & 493Gb 
partitions" design concept.


2) I've considered pivot-root solutions based around a USB stick or 
drive; cute, but I want a single tower box and no dongles


3) This leads me to the following design points:

- enormous tower case with 10+ bays
- HE/high-efficiency mobo with 8+ SATA capability
- crank down the CPU, big fans, etc... quiet
	- 1x [small/cheap]Gb Drive @ 1+rpm for root / swap / alternate 
boot environments

- 4x 750Gb SATA @ 7200rpm for full-spindle RAID-Z
	- populate the spare SATA ports when 1Tb disks hit the price point; 
make a separate RAIDZ and drop *that* into the existing pool.
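
A rough command sketch of that layout and the later expansion, as a hedged 
example only (the cXtYd0 device names are hypothetical):

  # whole-disk RAID-Z across the four 750Gb drives, so ZFS manages the
  # spindles (and their write caches) itself
  zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0

  # later, when 1Tb disks land on the spare SATA ports, grow the pool by
  # adding a second RAID-Z vdev alongside the first
  zpool add tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

  zpool status tank       # confirm both vdevs
  zpool list tank         # confirm the new capacity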


This - curiously - echoes the Unixes of my youth (and earlier!) where 
root was a small fast disk for swapping and access to key utilities 
which were used frequently (hence /bin and /lib) - whereas usr 
was a bigger, slower, cheaper disk, where the less frequently-used 
stuff was stored (/usr/bin, home directories, etc)...


Funny how the karmic wheel turns; I was suffering from the above 
architecture until the early 1990s - arguably we still suffer from it 
today, watch Perl building some time - and now I am redesigning the 
same thing but at least now the whole OS squeezes into the small disk 
pretty easily. :-)


As an aside there is nothing wrong with using ZFS - eg: a zvol - as a 
swap device; but just as you say, if we use real disks for root then 
they will be so big that there's probably no point in pushing swap off 
to ZFS.
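
For anyone who wants to try the zvol-as-swap aside, a minimal sketch (pool and 
volume names are made up):

  zfs create -V 2g tank/swapvol           # 2GB volume; size is just an example
  swap -a /dev/zvol/dsk/tank/swapvol      # add it as a swap device
  swap -l                                 # verify it shows up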


-a
--
Alec Muffett
http://www.google.com/search?q=alec-muffett

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS wastesd diskspace?

2007-06-15 Thread Samuel Borgman
Tsk, turns out MySQL was holding on to some old files...

Thanks Daniel!
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: zfs reports small st_size for directories?

2007-06-15 Thread Joerg Schilling
Ed Ravin [EMAIL PROTECTED] wrote:

  15 years ago, Novell Netware started to return a fixed size of 512 for all
  directories via NFS. 
  
  If there is still unfixed code, there is no help.

 The Novell behavior, commendable as it is, did not break the BSD scandir()
 code, because BSD scandir() fails in the other direction, when st_size is
 a low number, like less than 24.

This is wrong:

If you use such a Novell server, you only see the first 21 entries of a 
directory (the old BSD scandir() sizes its array at st_size/24, and 512/24 = 21).
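
A quick way to see the directory sizes in question (paths are only examples):

  ls -ld /tank/somedir /var/tmp
  # On ZFS the size reported for a directory is roughly its entry count;
  # on UFS it is a multiple of 512.  Code that pre-allocates st_size/24
  # entries, as the old BSD scandir() does, therefore under-allocates on
  # ZFS, and a server that always reports 512 caps the estimate at 21.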

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
   [EMAIL PROTECTED](uni)  
   [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Mario Goebbels
 I definitely *don't* want to use flash for swap...

You could use a ZVOL on the RAID-Z. Ok, not the most efficient thing,
but there's no sort of flag to disable parity on a specific object. I
wish there was, exactly for this reason.

-mg


signature.asc
Description: This is a digitally signed message part
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: zfs reports small st_size for directories?

2007-06-15 Thread Tomas Ögren
On 14 June, 2007 - Bill Sommerfeld sent me these 0,6K bytes:

 On Thu, 2007-06-14 at 09:09 +0200, [EMAIL PROTECTED] wrote:
  The implication of which, of course, is that any app build for Solaris 9
  or before which uses scandir may have picked up a broken one.
 
 or any app which includes its own copy of the BSD scandir code, possibly
 under a different name, because not all systems support scandir..
 
 it can be impossible to fix all copies of a bug which has been cut & 
 pasted too many times... 

Such stuff does exist out in the world..

http://www.google.com/codesearch?hl=en&lr=&q=scandir.c+st_size+24&btnG=Search

/Tomas
-- 
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS Boot manual setup in b65

2007-06-15 Thread Douglas Atique
 No.  There is nothing else the OS can do when it
 cannot mount the root
 filesystem. 
I have the impression (though I didn't check) that the pool is made available by 
just setting some information in its main superblock or something like that 
(sorry for the imprecision in ZFS jargon). I understand the OS knows which 
pool/fs it wants to mount onto /. It also knows that the root filesystem is ZFS, 
so it could in theory import the pool at boot, I suppose. So I wonder if the OS 
could prompt the user on the console to import the pool, or even accept some 
(additional) boot options telling it to either import the pool without asking or 
reboot without panicking (e.g. -B auto-import-exported-root-pool=true/false). I 
guess this would be an RFE rather than a bug. Any thoughts on it?

 That being said, it should have a nicer
 message (using
 FMA-style knowledge articles) that tell you what's
 actually going
 wrong.  There is already a bug filed against this
 failure mode.
Off-topic question, but I cannot resist. What are "FMA-style knowledge articles"?

-- Douglas
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: Karma Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Ian Collins
Alec Muffett wrote:
 As I understand matters, from my notes to design the perfect home
 NAS server :-)

 1) you want to give ZFS entire spindles if at all possible; that will
 mean it can enable and utilise the drive's hardware write cache
 properly, leading to a performance boost. You want to do this if you
 can.  Alas it knocks out the "split all disks into 7 & 493Gb
 partitions" design concept.

 2) I've considered pivot-root solutions based around a USB stick or
 drive; cute, but I want a single tower box and no dongles

 3) This leads me to the following design points:

 - enormous tower case with 10+ bays
A good alternative is a smaller case with 6 bays and two 5 way
SuperMicro cages.  Better for space and drive cooling.

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs and EMC

2007-06-15 Thread Dominik Saar
Hi there,

I see strange behavior when I create a ZFS pool on an EMC PowerPath
pseudo device.

I can create a pool on emcpower0a
but not on emcpower2a

zpool core dumps with "invalid argument"

That's my second machine with PowerPath and ZFS;
the first one works fine, even ZFS/PowerPath and failover ...

Is there anybody who has the same failure and a solution? :)

Greets

Dominik



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Virtual IP Integration

2007-06-15 Thread Vic Engle
Has there been any discussion here about the idea of integrating a virtual IP into 
ZFS? It makes sense to me because of the integration of NFS and iSCSI with the 
sharenfs and shareiscsi properties. Since these are both dependent on an IP, it 
would be pretty cool if there were also a virtual IP that would automatically 
move with the pool. 

Maybe something like zfs set ip.nge0=x.x.x.x mypool

Or since we may have different interfaces on the nodes where we want to move 
the zpool...

zfs set ip.server1.nge0=x.x.x.x mypool
zfs set ip.server2.bge0=x.x.x.x mypool

I know this could be handled with Sun Cluster but if I am only building a 
simple storage appliance to serve NFS and iSCSI along with CIFS via SAMBA then 
I don't want or need the overhead and complexity of Sun Cluster.

Anyone have comments about whether this is needed and worthwhile? Any good 
simple alternative ways to move a virtual IP with a zpool from one node to 
another?

Regards,
Vic
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS zpool created with MPxIO devices question

2007-06-15 Thread James Lefebvre



Customer asks:

Will SunCluster 3.2 support ZFS zpool created with MPxIO devices instead 
of the corresponding DID devices?

Will it cause any support issues?

Thank you,

James Lefebvre

--
James Lefebvre - OS Technical Support[EMAIL PROTECTED]
(800)USA-4SUN (Reference your Case Id #) Hours 8:00 - 5:00 EST
Sun Support Services
4 Network Drive,  UBUR04-105
Burlington MA 01803-0902 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs and EMC

2007-06-15 Thread Torrey McMahon
This sounds familiar... like something about the powerpath device not 
responding to the SCSI inquiry strings. Are you using the same version 
of powerpath on both systems? Same type of array on both?


Dominik Saar wrote:

Hi there,

have a strange behavior if i´ll create a zfs pool at an EMC PowerPath
pseudo device.

I can create a pool on emcpower0a
but not on emcpower2a

zpool core dumps with invalid argument  

Thats my second maschine with powerpath and zfs
the first one works fine, even zfs/powerpath and failover ...

Is there anybody who has the same failure and a solution ? :)

Greets

Dominik



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

  



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs and EMC

2007-06-15 Thread Dominik Saar
Same version on both systems

On Monday I'll put together the facts that stuck out to me ...

there are some points that are very strange ..





Am Freitag, den 15.06.2007, 10:52 -0400 schrieb Torrey McMahon:
 This sounds familiarlike something about the powerpath device not 
 responding to the SCSI inquiry strings. Are you using the same version 
 of powerpath on both systems? Same type of array on both?
 
 Dominik Saar wrote:
  Hi there,
 
  have a strange behavior if i´ll create a zfs pool at an EMC PowerPath
  pseudo device.
 
  I can create a pool on emcpower0a
  but not on emcpower2a
 
  zpool core dumps with invalid argument  
 
  Thats my second maschine with powerpath and zfs
  the first one works fine, even zfs/powerpath and failover ...
 
  Is there anybody who has the same failure and a solution ? :)
 
  Greets
 
  Dominik
 
 
 
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 

 
 
-- 
Dominik Saar
IT Corporate Server

1&1 Internet AG
Elgendorfer Str. 57
56410 Montabaur

Telefon: 02602/96-1635
Telefax: 02602/96--1635

Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich,
Andreas Gauger, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss 
Aufsichtsratsvorsitzender: Michael Scheeren

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Ruben Wisniewski
Hi Rick,

 Hmm. Not sure I can do RAID5 (and boot from it). Presumably, though,
 this would continue to function if a drive went bad.
 
 It also prevents ZFS from managing the devices itself, which I think
 is undesirable (according to the ZFS Admin Guide).
 
 I'm also not sure if I have RAID5 support in the BIOS. I think it's
 just RAID0/1.

A mainboard BIOS RAID is really just a software RAID - except that
you may not be able to get your data back if the RAID controller dies ...
and if it does, you'll need the same hardware again.

So you may want to use a software RAID instead of a plug-in card, unless
the card supports buffering the read/write operations on power loss -
otherwise it is nearly the same as using your BIOS (which is not a real
RAID controller) for that.

And with RAID5 you'll be able to boot if you have a /boot
partition outside the RAID5.


Greetings Cyron


signature.asc
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: Karma Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Rob Windsor

Ian Collins wrote:

Alec Muffett wrote:

As I understand matters, from my notes to design the perfect home
NAS server :-)

1) you want to give ZFS entire spindles if at all possible; that will
mean it can enable and utilise the drive's hardware write cache
properly, leading to a performance boost. You want to do this if you
can.  Alas it knocks out the "split all disks into 7 & 493Gb
partitions" design concept.

2) I've considered pivot-root solutions based around a USB stick or
drive; cute, but I want a single tower box and no dongles

3) This leads me to the following design points:

- enormous tower case with 10+ bays



A good alternative is a smaller case with 6 bays and two 5 way
SuperMicro cages.  Better for space and drive cooling.



- HE/high-efficency mobo with 8+ SATA capability


What 8-port-SATA motherboard models are Solaris-friendly?  I've hunted 
and hunted and have finally resigned myself to getting a generic 
motherboard with PCIe-x16 and dropping in an Areca PCIe-x8 RAID card (in 
JBOD config, of course).


As for drive arrangement, I went with the Addonics 5-drives-in-3-bays 
cages, rather similar to the SuperMicro ones mentioned above.


--
Internet: [EMAIL PROTECTED] __o
Life: [EMAIL PROTECTED]_`\,_
   (_)/ (_)
They couldn't hit an elephant at this distance.
  -- Major General John Sedgwick
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: Karma Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Richard Elling

comments from the peanut gallery...

Rob Windsor wrote:

Ian Collins wrote:

Alec Muffett wrote:

As I understand matters, from my notes to design the perfect home
NAS server :-)

1) you want to give ZFS entire spindles if at all possible; that will
mean it can enable and utilise the drive's hardware write cache
properly, leading to a performance boost. You want to do this if you
can.  Alas it knocks out the "split all disks into 7 & 493Gb
partitions" design concept.


Most mobos still have IDE ports where you can hang a 40GByte disk or two
for installing bootable ZFS (until it gets fully integrated into install)
and as a dump device.
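
If you do hang a small IDE disk there, pointing the dump device at one of its 
slices is a one-liner; a sketch with an example device name:

  dumpadm -d /dev/dsk/c0d0s1     # dedicate a slice on the IDE disk to dumps
  dumpadm                        # show the resulting dump configuration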


2) I've considered pivot-root solutions based around a USB stick or
drive; cute, but I want a single tower box and no dongles

3) This leads me to the following design points:

- enormous tower case with 10+ bays



A good alternative is a smaller case with 6 bays and two 5 way
SuperMicro cages.  Better for space and drive cooling.



- HE/high-efficency mobo with 8+ SATA capability


What 8-port-SATA motherboard models are Solaris-friendly?  I've hunted 
and hunted and have finally resigned myself to getting a generic 
motherboard with PCIe-x16 and dropping in an Areca PCIe-x8 RAID card (in 
JBOD config, of course).


In the short term, look for AHCI (eg. Intel ICH6 and Via vt8251) for onboard
SATA.  NVidia SATA (nv_sata) is still not integrated :-(.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS Boot manual setup in b65

2007-06-15 Thread Eric Schrock
On Fri, Jun 15, 2007 at 04:37:06AM -0700, Douglas Atique wrote:

 I have the impression (didn't check though) that the pool is made
 available by just setting some information in its main superblock or
 something alike (sorry for the imprecisions in ZFS jargon). I
 understand the OS knows which pool/fs it wants to mount onto /. It
 also knows that the root filesystem is ZFS, so it could in theory be
 able to import the pool at boot, I suppose. So I wonder if the OS
 could prompt the user on the console to import the pool or even use
 some (additional) boot options to instruct it to either import the
 pool without asking or reboot without panicking (e.g. -B
 auto-import-exported-root-pool=true/false). I guess this would be an
 RFE rather than a bug. Any thoughts on it?


Sure, that would seem possible.  Keep in mind that the boot environment
is extremely limited when dealing with devices.  For example, I don't
know if it's possible for a grub plugin to search all attached devices,
which would be necessary for pool import.

 
 Off-topic question, but I cannot resist. What is FMA-style knowledge
 articles?


See the Fault Management community:

http://www.opensolaris.org/os/community/fm/

As well as the event registry:

http://www.opensolaris.org/os/project/events-registry/

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual IP Integration

2007-06-15 Thread Richard Elling

Vic Engle wrote:
Has there been any discussion here about the idea integrating a virtual IP into ZFS. It makes sense to me because of the integration of NFS and iSCSI with the sharenfs and shareiscsi properties. Since these are both dependent on an IP it would be pretty cool if there was also a virtual IP that would automatically move with the pool. 


Maybe something like zfs set ip.nge0=x.x.x.x mypool

Or since we may have different interfaces on the nodes where we want to move 
the zpool...

zfs set ip.server1.nge0=x.x.x.x mypool
zfs set ip.server2.bge0=x.x.x.x mypool

I know this could be handled with Sun Cluster but if I am only building a 
simple storage appliance to serve NFS and iSCSI along with CIFS via SAMBA then 
I don't want or need the overhead and complexity of Sun Cluster.


Overhead?

The complexity of a simple HA storage service is quite small.
The complexity arises when you have multiple dependencies where
various applications depend on local storage and other applications.
(think SMF, but spread across multiple OSes).  For a simple
relationship such as storage--ZFS--share, there isn't much complexity.

Reinventing the infrastructure needed to manage access in the
face of failures is a distinctly non-trivial task.  You can
even begin with a single node cluster, though a virtual IP on a
single node cluster isn't very interesting.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: Karma Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Will Murnane

On 6/15/07, Ian Collins [EMAIL PROTECTED] wrote:

Alec Muffett wrote:
 2) I've considered pivot-root solutions based around a USB stick or
 drive; cute, but I want a single tower box and no dongles

You could buy a laptop disk, or mount one of these on the motherboard:
http://www.newegg.com/Product/Product.aspx?Item=N82E16822998003 with a
card like http://www.newegg.com/Product/Product.aspx?Item=N82E16820214113
for about $20.

A good alternative is a smaller case with 6 bays and two 5 way
SuperMicro cages.  Better for space and drive cooling.

Interesting in this regard is the yy-0221:
http://www.directron.com/yy0221bk.html with 10 3.5" bays (well-cooled,
too - fans mounted in front of 8 of them) and 6 5.25" bays.  This
doesn't leave much room for a power supply, unfortunately - the
Supermicro bays are almost 10" deep, and the case is only 18" deep.
I'll measure when I get home, but suffice it to say that the Enermax
EG651P-VE I have in mine (at 140mm deep, if the manufacturer specs are
correct) is a little on the tight side.  But powerful PSUs aren't
necessarily any deeper - the Silverstone OP750, for example, is only
150mm, which I think would fit fine.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual IP Integration

2007-06-15 Thread Victor Engle

Well I suppose complexity is relative. Still, to use Sun Cluster at
all I have to install the cluster framework on each node, correct? And
even before that I have to install an interconnect with 2 switches
unless I direct connect a simple 2 node cluster.

My thinking was that ZFS seems to try to bundle all storage-related
tasks into one simple interface, to the point that vfstab and dfstab
entries are unnecessary and considered legacy with respect to ZFS. If I am
using ZFS only to serve storage via IP then the only component I'm forced
to manage outside of ZFS is the IP, and if that's really all I want then
it does seem like overkill to install, configure and administer the Sun
Cluster framework on even 2 nodes.

I'm not really thinking about an application where I really need Sun
Cluster-like availability. Just the convenience factor of being able
to export a pool to another system if I need to do maintenance or
patching or whatever, without having to go configure the other system.
As it is now, the only thing I might need to do is go bring up the
virtual IP on the system I import the pool to.

A good example would be maybe a system where I keep jumpstart images.
I really don't need HA for it but simple administration is always a
plus.

It's an easy enough task to script, I suppose, but it occurred to me
that it would be very convenient to have this task built in to ZFS.
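
A minimal sketch of what such a hand-off script could run today, with no new 
ZFS feature involved (pool name, interface and address are placeholders):

  # on the node giving up the pool:
  ifconfig nge0 removeif 192.168.1.50      # drop the shared service address
  zpool export tank

  # on the node taking it over:
  zpool import tank
  zfs share -a                             # re-share any sharenfs filesystems
  ifconfig nge0 addif 192.168.1.50/24 up   # bring the service address up here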

Regards,
Vic


On 6/15/07, Richard Elling [EMAIL PROTECTED] wrote:

Vic Engle wrote:
 Has there been any discussion here about the idea integrating a virtual IP 
into ZFS. It makes sense to me because of the integration of NFS and iSCSI with 
the sharenfs and shareiscsi properties. Since these are both dependent on an IP it 
would be pretty cool if there was also a virtual IP that would automatically move 
with the pool.

 Maybe something like zfs set ip.nge0=x.x.x.x mypool

 Or since we may have different interfaces on the nodes where we want to move 
the zpool...

 zfs set ip.server1.nge0=x.x.x.x mypool
 zfs set ip.server2.bge0=x.x.x.x mypool

 I know this could be handled with Sun Cluster but if I am only building a 
simple storage appliance to serve NFS and iSCSI along with CIFS via SAMBA then I 
don't want or need the overhead and complexity of Sun Cluster.

Overhead?

The complexity of a simple HA storage service is quite small.
The complexity arises when you have multiple dependencies where
various applications depend on local storage and other applications.
(think SMF, but spread across multiple OSes).  For a simple
relationship such as storage--ZFS--share, there isn't much complexity.

Reinventing the infrastructure needed to manage access in the
face of failures is a distinctly non-trivial task.  You can
even begin with a single node cluster, though a virtual IP on a
single node cluster isn't very interesting.
  -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual IP Integration

2007-06-15 Thread Richard Elling

Victor Engle wrote:

Well I suppose complexity is relative. Still, to use Sun Cluster at
all I have to install the cluster framework on each node, correct? And
even before that I have to install an interconnect with 2 switches
unless I direct connect a simple 2 node cluster.


Yes, rolling your own cluster software will not release you from these
requirements.  The only way to release these requirements is to increase
the risk of data corruption.


My thinking was that ZFS seems to try and bundle all storage related
tasks into 1 simple interface including making vfstab and dfstab
entries unnecessary and considered legacy wrt ZFS. If I am using ZFS
only to serve storage via IP then the only component I'm forced to
manage outside of ZFS is the IP and if that's really all I want then
it does seem like overkill to install, configure and administer sun
cluster framework on even 2 nodes.


If you are considering manual failover, then this isn't very difficult.
For automated failover, you will need automated management of the
services, which is what Solaris Cluster provides.


I'm not really thinking about an application where I really need sun
cluster like availability. Just the convenience factor of being able
to export a pool to another system if I need to do maintenance or
patching or whatever without having to go configure the other system.
As it is now, the only thing I might need to do is go bring the
virtual IP on the system I import the pool to.

A good example would be maybe a system where I keep jumpstart images.
I really don't need HA for it but simple administration is always a
plus.

It's an easy enough task to script I suppose but it occurred to me
that it would be very convenient to have this task builtin to ZFS.


I don't see where this is a problem in a manual case.  You can have
more than one IP address per NIC.  So you could use a virtual IP address
that moves to the active server without affecting the fixed IP
addresses.  But you can't have both servers attempting to use the same
IP address at the same time -- Solaris Cluster manages this task
automatically.

There are other nuances as well.  Clients tend to react poorly if an
IP address is active, but the service is not.  For NFS, the behaviour
is widely understood and Solaris Cluster will ensure that the server
side tasks occur in proper order (eg. import storage before it is shared,
resolve locks, etc.)  For iSCSI, I'm not sure what the client behaviours
are.  In any case, you do not want your clients to restart/reboot/hang
when you migrate the service between nodes.

Bottom line: manual tasks can work around the nuances, for those who
are interested in manual tasks.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Karma Re: Re: Best use of 4 drives?

2007-06-15 Thread Tom Kimes
Here's a start for a suggested equipment list:

Lian Li case with 17 drive bays (12 3.5", 5 5.25")   
http://www.newegg.com/Product/Product.aspx?Item=N82E1682064

Asus M2N32-WS motherboard has PCI-X and PCI-E slots. I'm using Nevada b64 for 
iSCSI targets: 
http://www.newegg.com/Product/Product.aspx?Item=N82E16813131026

Your choice of CPU and memory.

I'm using an Opteron 1212 
http://www.newegg.com/Product/Product.aspx?Item=N82E16819105016

and DDR2-800 memory 
http://www.newegg.com/Product/Product.aspx?Item=N82E16820145034.

BTW, I'm not spamming for Newegg, it's just who I used and had the links handy 
;^] 

TK
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Karma Re: Re: Best use of 4 drives?

2007-06-15 Thread Neal Pollack

Tom Kimes wrote:

Here's a start for a suggested equipment list:

Lian Li case with 17 drive bays (12 3.5", 5 5.25")   
http://www.newegg.com/Product/Product.aspx?Item=N82E1682064
  


So it only has room for one power supply.  How many disk drives will you 
be installing?
It's not the steady state current that matters, as much as it is the 
ability to handle the surge current
of starting to spin 17 disks from zero rpm.   That initial surge can 
stall a lot of lesser power supplies.

Will be interesting to see what happens here.

Asus M2N32-WS motherboard has PCI-X and PCI-E slots. I'm using Nevada b64 for iSCSI targets: 
http://www.newegg.com/Product/Product.aspx?Item=N82E16813131026


Your choice of CPU and memory.

I'm using an Opteron 1212 
http://www.newegg.com/Product/Product.aspx?Item=N82E16819105016


and DDR2-800 memory 
http://www.newegg.com/Product/Product.aspx?Item=N82E16820145034.


BTW, I'm not spamming for Newegg, it's just who I used and had the links handy ;^] 


TK
 
 
This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: Karma Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Ian Collins
Rob Windsor wrote:

 What 8-port-SATA motherboard models are Solaris-friendly?  I've hunted
 and hunted and have finally resigned myself to getting a generic
 motherboard with PCIe-x16 and dropping in an Areca PCIe-x8 RAID card
 (in JBOD config, of course).

I don't know about 8 port SATA, but I used an Asus L1N64-SLI which has
12 and is very Solaris friendly.

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Karma Re: Re: Best use of 4 drives?

2007-06-15 Thread mike

On 6/15/07, Brian Hechinger [EMAIL PROTECTED] wrote:


Hmmm, that's an interesting point.  I remember the old days of having to
stagger startup for large drives (physically large, not capacity large).

Can that be done with SATA?


I had to link 2 600w power supplies together to be able to power on 12 drives...

I believe it is up to the controller (and possibly the drives) to
support staggering. But it is allowed in SATA if the controller/drives
support it.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Rick Mann
I'm having a heckuva time posting to individual replies (keep getting 
exceptions).

I have a 1U rackmount server with 4 bays. I don't think there's any way to 
squeeze in a small IDE drive, and I don't want to reduce the swap transfer rate 
if I can avoid it.

The machine has 4 500 GB SATA drives, 2 GB RAM, and an AMD Opteron 175 Denmark 
2.2GHz CPU

The machine itself is a TYAN B2865G20S4H Industry 19" rack-mountable 1U 
chassis Barebone Server NVIDIA nForce4 Ultra Socket 939 AMD Opteron Up to 1 GHz 
Hyper-Transport link support FSB:

http://www.newegg.com/product/product.asp?item=N82E16856152019

I'm afraid I don't know much about the different peripheral controllers 
available in the PC world (I'm a Mac guy), so I don't know if I've shot myself 
in the foot with what I've bought.

Sadly, I think I'll just waste 500 GB of space for now; don't really see a 
better solution. I may have to bail on the whole effort if I can't get all my 
other apps running on b65 (Java, Resin, MySQL, etc).
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Richard Elling

Rick Mann wrote:

I'm having a heckuva time posting to individual replies (keep getting 
exceptions).

I have a 1U rackmount server with 4 bays. I don't think there's any way to 
squeeze in a small IDE drive, and I don't want to reduce the swap transfer rate 
if I can avoid it.

The machine has 4 500 GB SATA drives, 2 GB RAM, and an AMD Opteron 175 Denmark 
2.2GHz CPU

The machine itself is a TYAN B2865G20S4H Industry 19 rack-mountable 1U chassis 
Barebone Server NVIDIA nForce4 Ultra Socket 939 AMD Opteron Up to 1 GHz Hyper-Transport link 
support FSB:

http://www.newegg.com/product/product.asp?item=N82E16856152019


For the time being, these SATA disks will operate in IDE compatibility mode, so
don't worry about the write cache.  There is some debate about whether the write
cache is a win at all, but that is another rat hole.  Go ahead and split off some
space for boot and swap.  Put the rest in ZFS.  Mirror for best all-around 
performance.


I'm afraid I don't know much about the different peripheral controllers 
available in the PC world (I'm a Mac guy), so I don't know if I've shot myself 
in the foot with what I've bought.

Sadly, I think I'll just waste 500 GB of space for now; don't really see a 
better solution. I may have to bail on the whole effort if I can't get all my 
other apps running on b65 (Java, Resin, MySQL, etc).


Java and mysql are already integrated into Solaris.  The only resin I use comes
from trees, not software developers :-).
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Ian Collins
Rick Mann wrote:
 Richard Elling wrote:

   
 For the time being, these SATA disks will operate in IDE compatibility mode, so
 don't worry about the write cache.  There is some debate about whether the write
 cache is a win at all, but that is another rat hole.  Go ahead and split off some
 space for boot and swap.  Put the rest in ZFS.  Mirror for best all-around
 performance.
 

 I assume you mean to dedicate one drive to boot/swap/upgrade, and the other 
 three drives to ZFS. But I can't mirror with only 3 drives, so I think RAIDZ 
 is best, wouldn't you agree?
  
   
Try some tests on you box:

slice one drive for root, swap and the rest ZFS.

Install on that drive and create a raidz pool with the other drives,
benchmark.

Slice the other drives in the same way as the first and build either a
raidz pool or a stripe of two mirrors from the four ZFS slices,
benchmark.

The time will be well spent.
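
A sketch of the two layouts to benchmark, with hypothetical device names (s6 
being the slice set aside for ZFS):

  # layout 1: one sliced boot drive, RAID-Z across the three whole drives
  zpool create tank raidz c0t1d0 c0t2d0 c0t3d0

  # ...benchmark, then tear it down...
  zpool destroy tank

  # layout 2: all four drives sliced alike; either RAID-Z over the slices
  zpool create tank raidz c0t0d0s6 c0t1d0s6 c0t2d0s6 c0t3d0s6
  # or a stripe of two mirrors over the same slices
  # zpool create tank mirror c0t0d0s6 c0t1d0s6 mirror c0t2d0s6 c0t3d0s6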

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Richard Elling

Rick Mann wrote:

Richard Elling wrote:


For the time being, these SATA disks will operate in IDE compatibility mode, so
don't worry about the write cache.  There is some debate about whether the write
cache is a win at all, but that is another rat hole.  Go ahead and split off some
space for boot and swap.  Put the rest in ZFS.  Mirror for best all-around 
performance.


I assume you mean to dedicate one drive to boot/swap/upgrade, and the other 
three drives to ZFS. But I can't mirror with only 3 drives, so I think RAIDZ is 
best, wouldn't you agree?


What I would do:
2 disks: slice 0 & 3 root (BE and ABE), slice 1 swap/dump, slice 6 ZFS 
mirror
2 disks: whole disk mirrors

The ZFS config would be a dynamic stripe of mirrors.  Later, when you spring for
1TByte disks, you can replace them one at a time, and grow with minimal effort.
KISS
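
A hedged sketch of that config and the later in-place growth (device names are 
invented for illustration):

  # dynamic stripe of mirrors: one mirror over the s6 slices, one over whole disks
  zpool create tank mirror c0t0d0s6 c0t1d0s6 mirror c0t2d0 c0t3d0

  # later: physically swap in a 1Tb disk and resilver, one disk at a time
  zpool replace tank c0t2d0
  zpool status tank       # wait for the resilver before touching the next disk;
                          # once both sides of a mirror are larger, that vdev
                          # can grow into the extra space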

Your challenge will be how to back this beast up.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread David Dyer-Bennet

Richard Elling wrote:

What I would do:
2 disks: slice 0 & 3 root (BE and ABE), slice 1 swap/dump, slice 6 
ZFS mirror

2 disks: whole disk mirrors

I don't understand "slice 6 ZFS mirror".  A mirror takes *two* things of 
the same size.


--
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/dd-b
Pics: http://dd-b.net/dd-b/SnapshotAlbum, http://dd-b.net/photography/gallery
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Marion Hakanson
[EMAIL PROTECTED] said:
 Richard Elling wrote:
 For the time being, these SATA disks will operate in IDE compatibility mode,
 so don't worry about the write cache.  There is some debate about whether
 the write cache is a win at all, but that is another rat hole.  Go ahead
 and split off some space for boot and swap.  Put the rest in ZFS.  Mirror
 for best all-around performance.
 
 I assume you mean to dedicate one drive to boot/swap/upgrade, and the other
 three drives to ZFS. But I can't mirror with only 3 drives, so I think RAIDZ
 is best, wouldn't you agree? 

I'll chime in here with a "me too" to Richard's suggestion, though
slightly different layout.  We have a Sun T2000 here with 4x 73GB drives,
and it works just fine to mix UFS and ZFS on old-fashioned slices
(partitions) across all four of them.

On your first disk, use s0 for a large-enough root, maybe 10GB;  Then
s1 is swap;  The rest of the disk can be s6, which you'll use for ZFS.

Now, slice up all four disks exactly the same way.  Create an SVM mirror
across the first two s0's, that's your root.  Create a 2nd SVM mirror
across the first two s1's, that's your swap.  The 3rd and 4th s0's
and s1's can be anything you like, maybe mirrored alternate boot env.
for liveupgrade, etc.  I made mine into a ZFS-mirrored /export.

Lastly, use the four s6's to create a big RAID-Z pool.  With your
four 500GB drives, you've given up only 12-14GB each for your system
usage, so the remaining 486-488GB should give you nearly 1.4TB of
useable RAID-Z protected space.

Sure, it's not optimal, but it's really quite good.
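
A rough command sketch of that layout, assuming four disks c0t0d0 through 
c0t3d0 sliced identically (the small s7 metadb slice is my assumption, not part 
of the description above):

  # SVM state database replicas need a small slice somewhere
  metadb -a -f c0t0d0s7 c0t1d0s7

  # mirrored root across the first two s0 slices
  metainit d11 1 1 c0t0d0s0
  metainit d12 1 1 c0t1d0s0
  metainit d10 -m d11
  metattach d10 d12          # plus metaroot d10 and the usual vfstab update

  # mirrored swap across the first two s1 slices
  metainit d21 1 1 c0t0d0s1
  metainit d22 1 1 c0t1d0s1
  metainit d20 -m d21
  metattach d20 d22

  # RAID-Z across all four s6 slices
  zpool create tank raidz c0t0d0s6 c0t1d0s6 c0t2d0s6 c0t3d0s6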

Regards,

Marion


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Ian Collins
David Dyer-Bennet wrote:
 Richard Elling wrote:
 What I would do:
 2 disks: slice 0 & 3 root (BE and ABE), slice 1 swap/dump, slice
 6 ZFS mirror
 2 disks: whole disk mirrors

 I don't understand slice 6 zfs mirror.  A mirror takes *two* things
 of the same size.

Note the "2 disks:".

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Karma Re: Re: Best use of 4 drives?

2007-06-15 Thread Will Murnane

On 6/15/07, Brian Hechinger [EMAIL PROTECTED] wrote:

On Fri, Jun 15, 2007 at 02:27:18PM -0700, Neal Pollack wrote:

 So it only has room for one power supply.  How many disk drives will you
 be installing?
 It's not the steady state current that matters, as much as it is the
 ability to handle the surge current
 of starting to spin 17 disks from zero rpm.   That initial surge can
 stall a lot of lesser power supplies.
 Will be interesting to see what happens here.

Drives only really take 1.5A or so from the 12V rail when spinning up,
but it's a good rule of thumb to pretend they take 3A each on top of
the other junk in your system.  A low-usage system like a Core 2 Duo
with a non-nV chipset and onboard or low-end video can run in 100
watts with no problems, so add 51A of 12V rail capacity to 100 watts
worth and you can still find PSUs that supply that.  If you do
staggered spinup, you can allocate something more like 10A plus one
amp per drive.  In either case, an OP1000
(http://www.newegg.com/Product/Product.aspx?Item=N82E16817256010) or a
PCPC TC1KW-SR (http://www.newegg.com/Product/Product.aspx?Item=N82E16817703007)
would do you even if you wanted to do SLI or something equally
ridiculous.


Hmmm, that's an interesting point.  I remember the old days of having to
stagger startup for large drives (physically large, not capacity large).

Can that be done with SATA?

Can and is.  On my Marvell 88sx6081 controller, it happened without my
having to configure anything magical.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Best use of 4 drives?

2007-06-15 Thread Richard Elling

Ian Collins wrote:

David Dyer-Bennet wrote:

Richard Elling wrote:

What I would do:
2 disks: slice 0 & 3 root (BE and ABE), slice 1 swap/dump, slice
6 ZFS mirror
2 disks: whole disk mirrors


I don't understand slice 6 zfs mirror.  A mirror takes *two* things
of the same size.


Note the 2 disks:.


Yeah, should probably draw a picture :-)
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Mac OS X Leopard to use ZFS

2007-06-15 Thread George

I'm curious about something.  Wouldn't ZFS `send` and `recv` be a
perfect fit for Apple Time Machine in Leopard if glued together by
some scripts?  In this scenario you could have an external volume and
simply send snapshots to it and reciprocate as needed with recv.
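
A hand-rolled sketch of that glue, with made-up pool, filesystem and snapshot 
names (not what Time Machine actually does, just the send/recv plumbing 
described above):

  zfs snapshot -r tank@2007-06-15

  # first run: full copy to the external pool
  zfs send tank/home@2007-06-15 | zfs recv backup/home

  # later runs: incremental from the previous snapshot
  zfs send -i tank/home@2007-06-14 tank/home@2007-06-15 | zfs recv backup/home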

Also, it would seem that Apple really can't push ZFS into Mac OS X
until evacuation of data and removal of vdevs is supported for pools.
Once this is in place it would seem reasonable that Apple would more
than want to push ZFS rw support into Leopard.  This would then allow
for very flexible and robust usage for desktop users as well in that
folks will very often wish to manipulate their storage arrangement in
terms of saying "woops, I didn't mean to put that volume here
permanently, I need to remove it!"  This would especially be true of
firewire/usb drives for backups and all sorts of questions that
would arise.  A simple, `remove` would be perfect to cure these blues.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Mac OS X Leopard to use ZFS

2007-06-15 Thread Richard Elling

George wrote:

I'm curious about something.  Wouldn't ZFS `send` and `recv` be a
perfect fit for Apple Time Machine in Leopard if glued together by
some scripts?  In this scenario you could have an external volume and
simply send snapshots to it and reciprocate as needed with recv.

Also, it would seem that Apple really can't push ZFS into Mac OS X
until evacuation of data and removal of vdevs is supported for pools.


Does hfs+ support this?  I see no evidence that it does.


Once this is in place it would seem reasonable that Apple would more
than want to push ZFS rw support into Leopard.  This would then allow
for very flexible and robust usage for desktop users as well in that
folks will very often wish to manipulate their storage arrangement in
terms of saying woops, I didn't mean to put that volume here
permanently, I need to remove it!  This would especially be true of
firewire/usb drives for backups and all sorts of questions that
would arise.  A simple, `remove` would be perfect to cure these blues.


More likely, they are trying to make sure it fits their integration
time schedules.  There is still a lot of development being done on ZFS.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss