Re: [zfs-discuss] Hardware for high-end ZFS NAS file server - 2010 March edition

2010-03-05 Thread Tim Cook
On Fri, Mar 5, 2010 at 8:41 PM, Dan Dascalescu <
bigbang7+opensola...@gmail.com > wrote:

> Thanks for your suggestions.
>
> In the meantime I had found this case and PSU - what do folks think?
>
> Antec Twelve Hundred Gaming Case -
> http://wiki.dandascalescu.com/reviews/gadgets/computers/cases#Antec_Twelve_Hundred_Gaming_Case_.E2.98.85
>
> + 12 5.25" externally-accessible bays, in which you can elastic-mount the
> hard drives (http://www.silentpcreview.com/forums/viewtopic.php?t=8240)
> + under 28 dBA SPL @ 1 meter with the CPU fan at 100%
>
>
> PSU: Nexus RX-8500, http://www.silentpcreview.com/article970-page7.html
> It has a relatively constant noise profile: 28 dBA at 300 W, 33 dBA at 700 W.
>
> Not sure what wattage the system will use most of the time though?
>
> If it's under 400W, then the Antec CP-850 doesn't exceed 14 dBA - almost
> silent.
>
>
I think you'd be better off with a NORCO for a cheap storage server case.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller

2010-03-07 Thread Tim Cook
On Sun, Mar 7, 2010 at 3:12 AM, James C. McPherson wrote:

> On  7/03/10 12:28 PM, norm.tallant wrote:
>
>> I'm about to try it!  My LSI SAS 9211-8i should arrive Monday or
>> Tuesday.  I bought the cable-less version, opting instead to save a few
>> $ and buy Adaptec 2247000-R SAS to SATA cables.
>>
>> My rig will be based off of fairly new kit, so it should be interesting
>> to see how 2009.06 deals with it all :)
>>
>
> As far as I recall, the chip on that card is
> supported with the mpt_sas(7d) driver, not the
> mpt(7d) driver.
>
> It won't, however, be detected by a 2009.06
> installation, since that was based on build snv_111b,
> and the mpt_sas driver wasn't introduced until build
> snv_118.
>
> So you could either wait until 2010.$spring comes out,
> or start using the /dev repo instead.
>
>
> hth,
> James C. McPherson
> --
> Senior Software Engineer, Solaris
> Sun Microsystems
> http://www.jmcp.homeunix.com/blog
>
>

Or manually load the driver onto the older version of OSOL :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] getting drive serial number

2010-03-07 Thread Tim Cook
On Sun, Mar 7, 2010 at 12:30 PM, Ethan  wrote:

> I have a failing drive, and no way to correlate the device with errors in
> the zpool status with an actual physical drive.
> If I could get the device's serial number, I could use that as it's printed
> on the drive.
> I come from linux, so I tried dmesg, as that's what's familiar (I see that
> the man page for dmesg on opensolaris says that I should be using syslogd
> but I haven't been able to figure out how to get the same output from
> syslogd). But, while I see at the top the serial numbers for some other
> drives, I don't see the one I want because it seems to be scrolled off the
> top.
> Can anyone tell me how to get the serial number of my failing drive? Or
> some other way to correlate the device with the physical drive?
>
> -Ethan
>
>
>
smartctl will do what you're looking for.  I'm not sure if it's included by
default or not with the latest builds.  Here's the package if you need to
build from source:
http://smartmontools.sourceforge.net/
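
Something along these lines should print it (untested sketch; I'm assuming
c9t1d0 is the suspect disk, and that smartctl wants the raw /dev/rdsk node):

# smartctl -i /dev/rdsk/c9t1d0s0
# iostat -En

The first prints the drive identity (model, firmware, serial); the second
dumps vendor/product/serial for every disk Solaris knows about.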

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] getting drive serial number

2010-03-07 Thread Tim Cook
On Sun, Mar 7, 2010 at 1:05 PM, Dennis Clarke  wrote:

>
> > On Sun, Mar 7, 2010 at 12:30 PM, Ethan  wrote:
> >
> >> I have a failing drive, and no way to correlate the device with errors
> >> in
> >> the zpool status with an actual physical drive.
> >> If I could get the device's serial number, I could use that as it's
> >> printed
> >> on the drive.
> >> I come from linux, so I tried dmesg, as that's what's familiar (I see
> >> that
> >> the man page for dmesg on opensolaris says that I should be using
> >> syslogd
> >> but I haven't been able to figure out how to get the same output from
> >> syslogd). But, while I see at the top the serial numbers for some other
> >> drives, I don't see the one I want because it seems to be scrolled off
> >> the
> >> top.
> >> Can anyone tell me how to get the serial number of my failing drive? Or
> >> some other way to correlate the device with the physical drive?
> >>
> >> -Ethan
> >>
> >>
> >>
> > smartctl will do what you're looking for.  I'm not sure if it's included
> > by
> > default or not with the latest builds.  Here's the package if you need to
> > build from source:
> > http://smartmontools.sourceforge.net/
> >
>
> You can find it at http://blastwave.network.com/csw/unstable/
>
> Just install it with pkgadd or use pkgtrans to extract it and then run the
> binary.
>

Speaking of which, what happened to the IPS mirror?  Using a separate
utility just for that repository is a bit ridiculous.
--Tim

>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] getting drive serial number

2010-03-07 Thread Tim Cook
On Sun, Mar 7, 2010 at 3:12 PM, Ethan  wrote:

> On Sun, Mar 7, 2010 at 15:30, Tim Cook  wrote:
>
>>
>>
>> On Sun, Mar 7, 2010 at 2:10 PM, Ethan  wrote:
>>
>>> On Sun, Mar 7, 2010 at 14:55, Tim Cook  wrote:
>>>
>>>>
>>>>
>>>> On Sun, Mar 7, 2010 at 1:05 PM, Dennis Clarke wrote:
>>>>
>>>>>
>>>>> > On Sun, Mar 7, 2010 at 12:30 PM, Ethan  wrote:
>>>>> >
>>>>> >> I have a failing drive, and no way to correlate the device with
>>>>> errors
>>>>> >> in
>>>>> >> the zpool status with an actual physical drive.
>>>>> >> If I could get the device's serial number, I could use that as it's
>>>>> >> printed
>>>>> >> on the drive.
>>>>> >> I come from linux, so I tried dmesg, as that's what's familiar (I
>>>>> see
>>>>> >> that
>>>>> >> the man page for dmesg on opensolaris says that I should be using
>>>>> >> syslogd
>>>>> >> but I haven't been able to figure out how to get the same output
>>>>> from
>>>>> >> syslogd). But, while I see at the top the serial numbers for some
>>>>> other
>>>>> >> drives, I don't see the one I want because it seems to be scrolled
>>>>> off
>>>>> >> the
>>>>> >> top.
>>>>> >> Can anyone tell me how to get the serial number of my failing drive?
>>>>> Or
>>>>> >> some other way to correlate the device with the physical drive?
>>>>> >>
>>>>> >> -Ethan
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> > smartctl will do what you're looking for.  I'm not sure if it's
>>>>> included
>>>>> > by
>>>>> > default or not with the latest builds.  Here's the package if you
>>>>> need to
>>>>> > build from source:
>>>>> > http://smartmontools.sourceforge.net/
>>>>> >
>>>>>
>>>>> You can find it at http://blastwave.network.com/csw/unstable/
>>>>>
>>>>> Just install it with pkgadd or use pkgtrans to extract it and then run
>>>>> the
>>>>> binary.
>>>>>
>>>>
>>>> Speaking of which, what happened to the IPS mirror?  Using a separate
>>>> utility just for that repository is a bit ridiculous.
>>>> --Tim
>>>>
>>>>>
>>>>
>>>>
>>> Thanks. Actually I already had smartmontools built from source
>>> previously, but I was never able to get it to do much of anything. It
>>> outputs
>>>
>>> ###
>>> ATA command routine ata_command_interface() NOT IMPLEMENTED under
>>> Solaris.
>>> Please contact smartmontools-supp...@lists.sourceforge.net if
>>> you want to help in porting smartmontools to Solaris.
>>> ###
>>>
>>> Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
>>>
>>> I'm not sure if that last line means I'm giving it the wrong thing - in
>>> fact I'm not really sure what to give it. I tried
>>> # smartctl -d ata /dev/dsk/c9t1d0p0
>>> and
>>> # smartctl -d ata /devices/p...@0,0/pci1043,8...@1f,2/d...@1,0
>>> but I am not sure if either of those correctly specifies the disk as
>>> smartctl wants it, or if the first message is the important one and I just
>>> can't use smartctl as it hasn't implemented what I need.
>>>
>>> -Ethan
>>>
>>>
>>
>> What kind of drive is it?  Is it ATA or SATA?
>>
>> http://opensolaris.org/jive/thread.jspa?threadID=120402
>>
>> --Tim
>>
>>
> (whoops, meant to reply to the list before.)
>
> It is sata, but ata seems to be the closest option available.
>
> # ./smartctl /devices/p...@0,0/pci1043,8...@1f,2/d...@1,0
> smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> /devices/p...@0,0/pci1043,8...@1f,2/d...@1,0: Unable to detect device type
> Smartctl: please specify device type with the -d option.
>
> ===> VALID ARGUMENTS ARE: ata, scsi, sat[,N][+TYPE], usbcypress[,X],
> usbjmicron[,x][,N], usbsunplus, marvell, areca,N, 3ware,N, hpt,L/M/N,
> cciss,N, test <===
>
>
> Is that the right device to be giving it? It seems to behave the same when
> I try /dev/dsk/c9t1d0s0 or /dev/dsk/c9t1d0s2 or /dev/dsk/c9t1d0p0.
> The controller is: Marvell Technology Group Ltd. 88SE6121 SATA II
> Controller
> and I see marvell as a type in that list, a connection I hadn't made before,
> but when I do -d marvell, it says:
> # ./smartctl -d marvell /devices/p...@0,0/pci1043,8...@1f,2/d...@1,0
> Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
>
> (again, same when I try various /dev/dsk/c9t1d0* devices)
>
> Reading http://opensolaris.org/jive/thread.jspa?messageID=384927 one
> person says "SATA drives are ATA and unsupported in smartmontools
> for Solaris."
> Any ideas?
>
> -Ethan
>
>
what do you get from:
smartctl -a /dev/dsk/c9t1d0s0

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Can you manually trigger spares?

2010-03-08 Thread Tim Cook
Is there a way to manually trigger a hot spare to kick in?  Mine doesn't
appear to be doing so.  What happened is I exported a pool to reinstall
solaris on this system.  When I went to re-import it, one of the drives
refused to come back online.  So, the pool imported degraded, but it doesn't
seem to want to use the hot spare... I've tried triggering a scrub to see if
that would give it a kick, but no-go.

r...@fserv:~$ zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: scrub completed after 3h19m with 0 errors on Mon Mar  8 02:28:08 2010
config:

        NAME                      STATE     READ WRITE CKSUM
        fserv                     DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            c2t0d0                ONLINE       0     0     0
            c2t1d0                ONLINE       0     0     0
            c2t2d0                ONLINE       0     0     0
            c2t3d0                ONLINE       0     0     0
            c2t4d0                ONLINE       0     0     0
            c2t5d0                ONLINE       0     0     0
            c3t0d0                ONLINE       0     0     0
            c3t1d0                ONLINE       0     0     0
            c3t2d0                ONLINE       0     0     0
            c3t3d0                ONLINE       0     0     0
            c3t4d0                ONLINE       0     0     0
            12589257915302950264  UNAVAIL      0     0     0  was /dev/dsk/c7t5d0s0
        spares
          c3t6d0                  AVAIL
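
I'm tempted to just force it with something like the following (a sketch,
using the GUID of the missing device and the spare shown in the status
above), but I'd have expected the spare to kick in on its own:

# zpool replace fserv 12589257915302950264 c3t6d0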

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fishworks 2010Q1 and dedup bug?

2010-03-08 Thread Tim Cook
On Mon, Mar 8, 2010 at 2:10 PM, Miles Nordin  wrote:

> > "al" == Adam Leventhal  writes:
>
>al> As always, we welcome feedback (although zfs-discuss is not
>al> the appropriate forum),
>
> ``Please, you criticize our work in private while we compliment it in
> public.''
>

I'm betting it's more the fact that zfs-discuss is not fishworks-support.
 Nobody is stopping you from making a blog talking about how badly you think
fishworks sucks, or how awesome you think it is.  I don't see Adam and co.
posting to this list announcing new features or code releases for the
fishworks project.  If they have on a regular basis and I've just been
missing it, feel free to link to the threads.  I'm fairly certain his
response is that if you want to discuss fishworks, you should go about the
proper channels, not that he's somehow trying to cover up a glaring issue
with the product.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fishworks 2010Q1 and dedup bug?

2010-03-08 Thread Tim Cook
On Mon, Mar 8, 2010 at 5:47 PM, Miles Nordin  wrote:

> >>>>> "tc" == Tim Cook  writes:
>
>tc> I'm betting its more the fact that zfs-discuss is not
>
> Firstly, there's no need for you to respond on anyone's behalf,
> especially not by ``betting.''
>
>
I'm not betting, I know.  It's called being polite and leaving the door open
for him to speak his own mind if he so chooses.



> Secondly, fishworks does run ZFS, and I for one am interested in what
> works and what doesn't.
>

So does nexenta.  So does green-bytes.  So does milax.  So does belenix.  So
does freebsd.  Fortunately this isn't nexenta-enterprise-support, or
green-bytes-enterprise-support, this is zfs-discuss.  We're here to talk
about zfs as it's implemented in opensolaris.  NOT fishworks or any other
one-off.


>
>tc> I don't see Adam and co.  posting to this list announcing new
>tc> features or code releases
>
> I don't recall whether he does or not, but I do recall reading about
> fishworks here and not regarding it OT.
>
>
Mentioning fishworks in passing is a far cry from turning this into a forum
to discuss the implementations of solaris components in a closed appliance.



>tc> Nobody is stopping you from making a blog talking about
>
> Yup, and if this forum's not a neutral one, I'll not be the only one
> who stops wasting his time on it and goes looking for another.  But,
> so far, notwithstanding your efforts, it is neutral, and there's no
> need for me to do that.
>
>
This forum being neutral has absolutely nothing to do with the specifics of
Oracle's closed appliances.  People keeping the discussion on-topic by
telling you/whoever this isn't the proper place to discuss those closed
appliances also has nothing to do with this forum being neutral.

If you want to debate the pros and cons of fishworks, you're free to do
so, but this isn't the proper place to do it.  Start your own forum.  Start
a blog.  Call up your local sales rep and ask to speak to an engineer.
 You've got all sorts of avenues to have the discussion, but this isn't one
of them.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intel SASUC8I - worth every penny

2010-03-14 Thread Tim Cook
On Sun, Mar 14, 2010 at 4:26 AM, Svein Skogen  wrote:

> How does it fare with regard to bug ID 6894775?
>
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6894775
>
> //Svein
>
>

It fares identically; it's literally the exact same card, OEM'd by Intel and
sold for less money.  Same drivers, same firmware, and IIRC it's even the same
PCI device ID.  When I ordered the card, I thought there was a mistake: as
the previous poster already mentioned, it comes in a box with an LSI
sticker, and the card says LSI all over it.  The only place I saw Intel was
on the receipt.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iScsi storage

2010-03-15 Thread Tim Cook
On Mon, Mar 15, 2010 at 9:55 AM, Gabriele Bulfon wrote:

> Hello,
> I'd like to check for any guidance about using zfs on iscsi storage
> appliances.
> Recently I had an unlucky situation with an unlucky storage machine
> freezing.
> Once the storage was up again (rebooted) all other iscsi clients were
> happy, while one of the iscsi clients (a sun solaris sparc, running Oracle)
> did not mount the volume marking it as corrupted.
> I had no way to get back my zfs data: had to destroy and recreate from
> backups.
> So I have some questions regarding this nice story:
> - I remember sysadmins being able to almost always recover data on
> corrupted ufs filesystems by magic of superblocks. Is there something
> similar on zfs? Is there really no way to access data of a corrupted zfs
> filesystem?
> - In this case, the storage appliance is a legacy system based on linux, so
> raids/mirrors are managed at the storage side its own way. Being an iscsi
> target, this volume was mounted as a single iscsi disk from the solaris
> host, and prepared as a zfs pool consisting of this single iscsi target. ZFS
> best practices, tell me that to be safe in case of corruption, pools should
> always be mirrors or raidz on 2 or more disks. In this case, I considered
> all safe, because the mirror and raid was managed by the storage machine.
> But from the solaris host point of view, the pool was just one! And maybe
> this has been the point of failure. What is the correct way to go in this
> case?
> - Finally, looking forward to run new storage appliances using OpenSolaris
> and its ZFS+iscsitadm and/or comstar, I feel a bit confused by the
> possibility of having a double zfs situation: in this case, I would have the
> storage zfs filesystem divided into zfs volumes, accessed via iscsi by a
> possible solaris host that creates his own zfs pool on it (...is it too
> redundant??) and again I would fall in the same previous case (host zfs pool
> connected to one only iscsi resource).
>
> Any guidance would be really appreciated :)
> Thanks a lot
> Gabriele.
>
>
To answer the other portion of your question: yes, you can roll back ZFS if
you're at the proper version.  The procedure is linked below; essentially it
tries to roll the pool back to the last known good transaction.  If that
doesn't work, your only remaining option is to restore from backup:
http://docs.sun.com/app/docs/doc/817-2271/gbctt?l=ja&a=view
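
On a build with pool-recovery support (b128 or later, if memory serves), the
short version looks something like this (a sketch; "tank" is a placeholder
pool name):

# zpool import -Fn tank     (dry run: shows how far back it would need to roll)
# zpool import -F tank      (discards the last few transactions and imports)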

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iScsi storage

2010-03-15 Thread Tim Cook
On Mon, Mar 15, 2010 at 9:10 PM, Ross Walker  wrote:

> On Mar 15, 2010, at 7:11 PM, Tonmaus  wrote:
>
>  Being an iscsi
>>> target, this volume was mounted as a single iscsi
>>> disk from the solaris host, and prepared as a zfs
>>> pool consisting of this single iscsi target. ZFS best
>>> practices, tell me that to be safe in case of
>>> corruption, pools should always be mirrors or raidz
>>> on 2 or more disks. In this case, I considered all
>>> safe, because the mirror and raid was managed by the
>>> storage machine.
>>>
>>
>> As far as I understand Best Practises, redundancy needs to be within zfs
>> in order to provide full protection. So, actually Best Practises says that
>> your scenario is rather one to be avoided.
>>
>
> There is nothing saying redundancy can't be provided below ZFS just if you
> want auto recovery you need redundancy within ZFS itself as well.
>
> You can have 2 separate raid arrays served up via iSCSI to ZFS which then
> makes a mirror out of the storage.
>
> -Ross
>
>
Perhaps I'm remembering incorrectly, but I didn't think mirroring would
auto-heal/recover; I thought that was limited to the raidz* implementations.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-20 Thread Tim Cook
On Sat, Mar 20, 2010 at 4:00 PM, Richard Elling wrote:

> On Mar 20, 2010, at 12:07 PM, Svein Skogen wrote:
> > We all know that data corruption may happen, even on the most reliable of
> hardware. That's why zfs has pool scrubbing.
> >
> > Could we introduce a zpool option (as in zpool set  )
> for "scrub period", in "number of hours" (with 0 being no automatic
> scrubbing).
>
> Currently you can do this with cron, of course (or at).  The ZFS-based
> appliances
> in the market offer simple ways to manage such jobs -- NexentaStor,
> Oracle's Sun
> OpenStorage, etc.
>


Right, but I rather agree with Svein.  It would be nice to have it
integrated.  I would argue that, at the very least, it should become an
integrated service much like auto-snapshot (which could be, and once was,
done from cron).  Doing it with basic cron means that if you have lots of
pools, you might trigger several scrubs at the same time, which may or may
not crush the system with I/O load.  The answer to that is "well, then query
to see whether the last scrub is done," and suddenly we've gone from a simple
cron job to custom scripting based on what could be a myriad of variables.
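
Something like the script below is what you end up maintaining (a rough
sketch; the pool names are made up), which is exactly the kind of glue an
integrated service would make unnecessary:

#!/bin/sh
# scrub pools one at a time so the I/O load doesn't stack up
for pool in tank backup media; do
    zpool scrub "$pool"
    # poll until this scrub finishes before starting the next one
    while zpool status "$pool" | grep "scrub in progress" > /dev/null; do
        sleep 300
    done
done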



>
> > I see several modern RAID controllers (such as the LSI MegaRAID MFI line)
> have such a feature (called "patrol reads") already built into them. Why
> shouldn't zfs have the same? Having the zpool automagically handle this
> (probably a good thing to default it to 168 hours, or one week) would also
> mean that the scrubbing feature is independent from cron, and since scrub
> already has lower priority than ... actual work, it really shouldn't annoy
> anybody (except those having their server under their bed).
> >
> > Of course I'm more than willing to stand corrected if someone can tell me
> where this is already implemented, or why it's not needed. Proper flames
> over this should start with a "warning, flame" header, so I can don my
> asbestos longjohns. ;)
>
> Prepare your longjohns!  Ha!
> Just kidding... the solution exists, just turn it on.  And remember the
> UNIX philosophy.
> http://en.wikipedia.org/wiki/Unix_philosophy
>  -- richard
>
>
Funny (ironic?) you'd quote the UNIX philosophy when the Linux folks have
been running around since day one claiming the basic concept of ZFS flies in
the face of that very concept.  Rather than do one thing well, it unifies
two things (file system and raid/disk management) into one.  :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-20 Thread Tim Cook
On Sat, Mar 20, 2010 at 5:00 PM, Gary Gendel  wrote:

> I'm not sure I like this at all.  Some of my pools take hours to scrub.  I
> have a cron job run scrubs in sequence...  Start one pool's scrub and then
> poll until it's finished, start the next and wait, and so on so I don't
> create too much load and bring all I/O to a crawl.
>
> The job is launched once a week, so the scrubs have plenty of time to
> finish. :)
>
> Scrubs every hour?  Some of my pools would be in continuous scrub.
>
>
Who said anything about scrubs every hour?  I see he mentioned hour being
the granularity of the frequency, but that hardly means you'd HAVE to run
scrubs every hour.  Nobody is stopping you from setting it to 3600 hours if
you so choose.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-20 Thread Tim Cook
On Sat, Mar 20, 2010 at 5:36 PM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Sat, 20 Mar 2010, Tim Cook wrote:
>
>>
>> Funny (ironic?) you'd quote the UNIX philosophy when the Linux folks have
>> been running around since day
>> one claiming the basic concept of ZFS flies in the face of that very
>> concept.  Rather than do one thing
>> well, it unifies two things (file system and raid/disk management) into
>> one.  :)
>>
>
> Most software introduced in Linux clearly violates the "UNIX philosophy".
>  Instead of small and simple parts we have huge and complex parts, with many
> programs requiring 70 or 80 libraries in order to run.  Zfs's intermingling
> of layers is benign in comparison.
>
>
> Bob
>
>
You can take that up with them :)  I'm just pointing out the obvious irony
of  claiming separation as an excuse for not adding features when the
product is based on the very idea of unification of
layers/features/functionality.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [indiana-discuss] future of OpenSolaris

2010-03-23 Thread Tim Cook
On Tue, Mar 23, 2010 at 7:11 AM, Jacob Ritorto wrote:

> Sorry to beat the dead horse, but I've just found perhaps the only
> written proof that OpenSolaris is supportable.  For those of you who
> deny that this is an issue, its existence as a supported OS has been
> recently erased from every other place I've seen on the Oracle sites.
> Everyone please grab a copy of this before they silently delete it and
> claim that it never existed.  I'm buying a contract right now.  I may
> just take back every mean thing I ever said about Oracle.
>
> http://www.sun.com/servicelist/ss/lgscaledcsupprt-us-eng-20091001.pdf
>
>

Erased from every site?  Assuming the several links I pointed out the first
go-round weren't enough, how about this, directly on the OpenSolaris page
itself?

http://www.opensolaris.com/learn/features/availability/

• Highly available open source based solutions ready to deploy on
OpenSolaris with full production support from Sun.
OpenSolaris enables developers to develop, debug, and globally deploy
applications faster, with built-in innovative features and with full
production support from Sun.

Full production level support

Both Standard and Premium support offerings are available for deployment of
Open HA Cluster 2009.06 with OpenSolaris 2009.06 with following
configurations:

etc. etc. etc.

 So do you get paid directly by IBM then, or is it more of a "consultant"
type role?

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on a 11TB HW RAID-5 controller

2010-03-24 Thread Tim Cook
On Wed, Mar 24, 2010 at 11:01 AM, Dusan Radovanovic wrote:

> Hello all,
>
> I am a complete newbie to OpenSolaris, and must set up a ZFS NAS. I do
> have Linux experience, but have never used ZFS. I have tried to install
> OpenSolaris Developer build 134 on an 11TB HW RAID-5 virtual disk, but after
> the installation I can only use one 2TB disk, and I cannot partition the
> rest. I realize that the maximum partition size is 2TB, but I guess the rest
> must be usable. For hardware I am using an HP ProLiant DL180 G6 with 12 1TB
> disks connected to a P212 controller in RAID-5. Could someone direct me or
> suggest what I am doing wrong? Any help is greatly appreciated.
>
> Cheers,
> Dusan
>


You would be much better off installing to a small internal disk, and then
creating a separate pool for the 11TB of storage.  The 2TB limit is because
it's a boot drive.  That limit should go away if you're using it as a
separate storage pool.
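
Something like this once the OS is on its own small disk (a sketch; I'm
assuming the 11TB RAID-5 LUN shows up as c0t1d0):

# zpool create tank c0t1d0
# zfs create tank/nas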

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Tim Cook
On Fri, Mar 26, 2010 at 1:39 PM, Slack-Moehrle  wrote:

> Hi All,
>
> I am looking at ZFS and I get that they call it RAIDZ which is similar to
> RAID 5, but what about RAID 10? Isn't a RAID 10 setup better for data
> protection?
>
> So if I have 8 x 1.5tb drives, wouldn't I:
>
> - mirror drive 1 and 5
> - mirror drive 2 and 6
> - mirror drive 3 and 7
> - mirror drive 4 and 8
>
> Then stripe 1,2,3,4
>
> Then stripe 5,6,7,8
>
> How does one do this with ZFS?
>
> -Jason
>

Just keep adding mirrored vdevs to the pool.  It isn't exactly like a
raid-10, as zfs doesn't do a typical raid-0 stripe, per se.  It is the same
basic concept as raid-10 though: you would be striping across all of the
mirrored sets, not just a subset.

So you would do:
zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5
drive6 mirror drive7 drive8

See here:
http://www.stringliterals.com/?p=132
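
Growing it later is just more of the same (a sketch; drive9 and drive10 are
hypothetical):

# zpool add tank mirror drive9 drive10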

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID10

2010-03-26 Thread Tim Cook
On Fri, Mar 26, 2010 at 6:29 PM, Slack-Moehrle  wrote:

>
> OK, so I made progress today. FreeBSD sees all of my drives, and ZFS is
> acting correctly.
>
> Now for my confusion.
>
> RAIDZ3
>
> # zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 da6 da7
> Gives: 'raidz3' no such GEOM provider
>
> I am looking at the best practices guide and I am confused about adding a
> hot spare. Won't that happen with the above command, or do I really just
> zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 and then issue the
> hot-spare command twice for da6 and da7?
>
> -Jason
>
> - Original Message -
> From: "Slack-Moehrle" 
> To: zfs-discuss@opensolaris.org
> Sent: Friday, March 26, 2010 12:13:58 PM
> Subject: Re: [zfs-discuss] RAID10
>
>
>
> >> Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3?
> With 8 x 1.5tb?
>
> >> I apologize for seeming dense, I just am confused about non-standard
> raid setups; they seem tricky.
>
> > raidz "eats" one disk. Like RAID5
> > raidz2 digests another one. Like RAID6
> > raidz3 yet another one. Like ... h...
>
> So:
>
> RAIDZ would be 8 x 1.5tb = 12tb - 1.5tb = 10.5tb
>
> RAIDZ2 would be 8 x 1.5tb = 12tb - 3.0tb = 9.0tb
>
> RAIDZ3 would be 8 x 1.5tb = 12tb - 4.5tb = 7.5tb
>
> But not really that usable space for each since the mirroring?
>
> So do you not mirror drives with RAIDZ2 or RAIDZ3 because you would have
> nothing for space left
>
> -Jason
>
>

Triple parity didn't get added until pool version 17, so FreeBSD can't do
raidz3 yet.
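
You can check what your system supports and, if raidz3 isn't there, fall back
to raidz2 plus hot spares (a sketch reusing the da0-da7 names from your mail).
Spares never get configured implicitly; you have to name them with the
'spare' keyword:

# zpool upgrade -v          (lists the pool versions this build knows about)
# zpool create datastore raidz2 da0 da1 da2 da3 da4 da5 spare da6 da7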

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Tim Cook
On Fri, Mar 26, 2010 at 5:42 PM, Richard Elling wrote:

> On Mar 26, 2010, at 3:25 PM, Marc Nicholas wrote:
>
> > Richard,
> >
> > My challenge to you is that at least three vendors that I know of built
> > their storage platforms on FreeBSD. One of them sells $4bn/year of
> > product - pretty sure that eclipses all (Open)Solaris-based storage ;)
>
> FreeBSD 8 or  FreeBSD 7.3?  If neither, then the point is moot.
>  -- richard
>
> ZFS storage and performance consulting at http://www.RichardElling.com
> ZFS training on deduplication, NexentaStor, and NAS performance
> Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
>
>
Well, that depends on exactly what you mean.  There are several that are
actively contributing to and using code from both.  "Built on" is all
relative.  Given the recent SMP improvements from all of the major players
using BSD, if you're talking kernel code, I would say every single one of
them has pulled code from the 7-branch, and likely the 8-branch as well.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present

2010-03-27 Thread Tim Cook
On Sat, Mar 27, 2010 at 2:26 PM, Russ Price  wrote:

> I have two 500 GB drives on my system that are attached to built-in SATA
> ports on my Asus M4A785-M motherboard, running in AHCI mode. If I shut down
> the system, remove either drive, and then try to boot the system, it will
> fail to boot. If I disable the splash screen, I find that it will display
> the SunOS banner and the hostname, but it never gets as far as the "Reading
> ZFS config:" stage. GRUB is installed on both drives, and if both drives are
> present, I can flip the boot order in the BIOS and still have it boot
> successfully. I can even move one of the mirrors to a different SATA port
> and still have it boot. But if a mirror is missing, forget it. I can't find
> any log entries in /var/adm/messages about why it fails to boot, and the
> console is equally uninformative. If I check fmdump, it reports an empty
> fault log.
>
> If I throw in a blank drive in place of one of the mirrors, the boot still
> fails. Needless to say, this pretty much makes the whole idea of mirroring
> rather useless.
>
> Any idea what's really going wrong here?
>
>
What build?  How long have you waited for the boot?  It almost sounds to me
like it's waiting for the drive and hasn't timed out before you give up and
power it off.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present

2010-03-27 Thread Tim Cook
On Sat, Mar 27, 2010 at 2:45 PM, Russ Price  wrote:

> > What build?  How long have you waited for the boot?  It
> > almost sounds to me like it's waiting for the
> > drive and hasn't timed out before you give up and
> > power it off.
>
> I waited about three minutes. This is a b134 installation.
>
> One one of my tests, I tried shoving the removed mirror into the hotswap
> bay, and got a console message indicating that the device was detected, but
> that didn't make the boot complete. I restarted the system with the drive
> present, and everything's fine.
>
> How long should I expect to wait if a drive is missing? It shouldn't take
> more than 30 seconds, IMHO.
>
>
Depends on a lot of things.  I'd let it sit for at least half an hour to see
if you get any messages.  30 seconds, if it's waiting for the driver stack
timeouts, is way too short.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present

2010-03-27 Thread Tim Cook
On Sat, Mar 27, 2010 at 7:57 PM, William Bauer  wrote:

> Posted this reply in the help forum, copying it here:
>
> I frequently use mirrors to replace disks, or even as a backup with an
> esata dock. So I set up v134 with a mirror in VB, ran installgrub, then
> detached each drive in turn. I completely duplicated and can confirm your
> problem, and since I'm quite comfortable with this process I suggest you
> have found a serious bug and should report it immediately. This is a
> horrible problem if a mirror member fails and renders a system unbootable!!
>
>

Have you tried booting from a livecd and importing the pool from there?  It
might help narrow down exactly where the problem lies.
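
From the live CD shell, something like this (a sketch; -R keeps the pool from
mounting over the live environment):

# zpool import -f -R /a rpool
# zpool status rpool

If it imports cleanly there, the problem is more likely in the boot path than
in the pool itself.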

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present

2010-03-27 Thread Tim Cook
On Sat, Mar 27, 2010 at 10:03 PM, William Bauer  wrote:

> Depends on a lot of things.  I'd let it sit for at least half an hour to
> see if you get any messages.  30 seconds, if it's waiting for the driver
> stack timeouts, is way too short.
> -
>
> I'm not the OP, but I let my VB guest sit for an hour now, and nothing new
> has happened.  The last thing it displayed was the "Hostname:" line, as the
> original post stated.  Personally, I've never seen an OpenSolaris system,
> virtual or physical, take more than a few seconds to pause for a missing
> mirror member.
>
> I did notice something a little odd--usually an OpenSolaris VB guest
> doesn't use all of its allocated memory immediately, even after a user logs
> into gnome.  However, this seemingly idle system quickly ate up all of the
> 2GB I allocated to it.  The host is 2009.06 with 8GB memory and an Intel
> quad Q6600, so I have adequate resources for this guest.
>
>
>
Sounds exactly like the behavior people have had previously while a system
is trying to recover a pool with a faulted drive.  I'll have to check and
see if I can dig up one of those old threads.  I vaguely recall someone here
had a single drive fail on an import and it took forever to import the pool,
running out of memory every time.  I think he eventually added significantly
more memory and was able to import the pool (of course my memory sucks, so
I'm sure that's not quite accurate).

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-03-31 Thread Tim Cook
On Wed, Mar 31, 2010 at 6:31 AM, Edward Ned Harvey
wrote:

> > > Nobody knows any way for me to remove my unmirrored
> > > log device.  Nobody knows any way for me to add a mirror to it (until
> >
> > Since snv_125 you can remove log devices. See
> > http://bugs.opensolaris.org/view_bug.do?bug_id=6574286
> >
> > I've used this all the time during my testing and was able to remove
> > both
> > mirrored and unmirrored log devices without any problems (and without
> > reboot). I'm using snv_134.
>
> Aware.  Opensolaris can remove log devices.  Solaris cannot.  Yet.  But if
> you want your server in production, you can get a support contract for
> solaris.  Opensolaris cannot.
>


According to who?

http://www.opensolaris.com/learn/features/availability/

Full production level support

Both Standard and Premium support offerings are available for deployment of
Open HA Cluster 2009.06 with OpenSolaris 2009.06 with following
configurations:


--Tim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-03-31 Thread Tim Cook
On Wed, Mar 31, 2010 at 9:47 AM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Wed, 31 Mar 2010, Tim Cook wrote:
>
>>
>> http://www.opensolaris.com/learn/features/availability/
>>
>>  Full production level support
>>
>> Both Standard and Premium support offerings are available for deployment
>> of Open HA Cluster 2009.06 with OpenSolaris 2009.06 with following
>> configurations:
>>
>
> This formal OpenSolaris release is too ancient to do him any good. In
> fact, zfs-wise, it lags the Solaris 10 releases.
>
> If there is ever another OpenSolaris formal release, then the situation
> will be different.
>
> Bob
>

Cmon now, have a little faith.  It hasn't even slipped past March yet :)  Of
course it'd be way more fun if someone from Sun threw caution to the wind
and told us what the hold-up is *cough*oracle*cough*.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-03-31 Thread Tim Cook
On Wed, Mar 31, 2010 at 11:23 AM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Wed, 31 Mar 2010, Tim Cook wrote:
>
>>
>> If there is ever another OpenSolaris formal release, then the situation
>> will be different.
>>
>> Cmon now, have a little faith.  It hasn't even slipped past March yet :)
>>  Of course it'd be way more fun if someone from Sun threw caution to the
>> wind and told us what the hold-up is *cough*oracle*cough*.
>>
>
> Oracle is a total "cold boot" for me.  Everything they have put on their
> web site seems carefully designed to cast fear and panic into the former Sun
> customer base and cause substantial doubt, dismay, and even terror.  I don't
> know what I can and can't trust.  Every bit of trust that Sun earned with me
> over the past 19 years is clean-slated.
>
> Regardless, it seems likely that Oracle is taking time to change all of the
> copyrights, documentation, and logos to reflect the new othership.  They are
> probably re-evaluating which parts should be included for free in
> OpenSolaris.  The name "Sun" is deeply embedded in Solaris.  All of the
> Solaris 10 packages include "SUN" in their name.
>
> Yesterday I noticed that the Sun Studio 12 compiler (used to build
> OpenSolaris) now costs a minimum of $1,015/year.  The "Premium" service plan
> costs $200 more.
>
> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Where did you see that?  It looks to be free to me:
Sun Studio 12 Update 1 - FREE for SDN members.

SDN members can download a free, full-license copy of Sun Studio 12 Update
1.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-03-31 Thread Tim Cook
On Wed, Mar 31, 2010 at 11:39 AM, Chris Ridd  wrote:

> On 31 Mar 2010, at 17:23, Bob Friesenhahn wrote:
>
> > Yesterday I noticed that the Sun Studio 12 compiler (used to build
> OpenSolaris) now costs a minimum of $1,015/year.  The "Premium" service plan
> costs $200 more.
>
> The download still seems to be a "free, full-license copy" for SDN members;
> the $1015 you quote is for the standard Sun Software service plan. Is a
> service plan now *required*, a la Solaris 10?
>
> Cheers,
>
> Chris
>


It's still available in the opensolaris repo, and I see no license reference
stating you have to have a support contract, so I'm guessing no...

Several releases of Sun Studio Software are available in the OpenSolaris
repositories. The following list shows you how to download and install each
release, and where you can find the documentation for the release:

   - Sun Studio 12 Update 1: The Sun Studio 12 Update 1 release is the
   latest full production release of Sun Studio software. It has recently been
   added to the OpenSolaris IPS repository.

   To install this release in your OpenSolaris 2009.06 environment using the
   Package Manager:
--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Tim Cook
On Fri, Apr 2, 2010 at 10:08 AM, Kyle McDonald wrote:

> On 4/2/2010 8:08 AM, Edward Ned Harvey wrote:
> >> I know it is way after the fact, but I find it best to coerce each
> >> drive down to the whole GB boundary using format (create Solaris
> >> partition just up to the boundary). Then if you ever get a drive a
> >> little smaller it still should fit.
> >>
> > It seems like it should be unnecessary.  It seems like extra work.  But
> > based on my present experience, I reached the same conclusion.
> >
> > If my new replacement SSD with identical part number and firmware is
> 0.001
> > Gb smaller than the original and hence unable to mirror, what's to
> prevent
> > the same thing from happening to one of my 1TB spindle disk mirrors?
> > Nothing.  That's what.
> >
> >
> Actually, It's my experience that Sun (and other vendors) do exactly
> that for you when you buy their parts - at least for rotating drives, I
> have no experience with SSD's.
>
> The Sun disk label shipped on all the drives is setup to make the drive
> the standard size for that sun part number. They have to do this since
> they (for many reasons) have many sources (diff. vendors, even diff.
> parts from the same vendor) for the actual disks they use for a
> particular Sun part number.
>
> This isn't new; I believe IBM, EMC, HP, etc. all do it also for the same
> reasons.
> I'm a little surprised that the engineers would suddenly stop doing it
> only on SSD's. But who knows.
>
>  -Kyle
>
>

If I were forced to ignorantly cast a stone, it would be into Intel's lap
(if the SSD's indeed came directly from Sun).  Sun's "normal" drive vendors
have been in this game for decades, and know the expectations.  Intel on the
other hand, may not have quite the same QC in place yet.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey
wrote:

>  Momentarily, I will begin scouring the omniscient interweb for
> information, but I’d like to know a little bit of what people would say
> here.  The question is to slice, or not to slice, disks before using them in
> a zpool.
>
>
>
> One reason to slice comes from recent personal experience.  One disk of a
> mirror dies.  Replaced under contract with an identical disk.  Same model
> number, same firmware.  Yet when it’s plugged into the system, for an
> unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
> unable to attach and un-degrade the mirror.  It seems logical this problem
> could have been avoided if the device added to the pool originally had been
> a slice somewhat smaller than the whole physical device.  Say, a slice of
> 28G out of the 29G physical disk.  Because later when I get the
> infinitesimally smaller disk, I can always slice 28G out of it to use as the
> mirror device.
>
>
>
> There is some question about performance.  Is there any additional overhead
> caused by using a slice instead of the whole physical device?
>
>
>
> There is another question about performance.  One of my colleagues said he
> saw some literature on the internet somewhere, saying ZFS behaves
> differently for slices than it does on physical devices, because it doesn’t
> assume it has exclusive access to that physical device, and therefore caches
> or buffers differently … or something like that.
>
>
>
> Any other pros/cons people can think of?
>
>
>
> And finally, if anyone has experience doing this, and process
> recommendations?  That is … My next task is to go read documentation again,
> to refresh my memory from years ago, about the difference between “format,”
> “partition,” “label,” “fdisk,” because those terms don’t have the same
> meaning that they do in other OSes…  And I don’t know clearly right now,
> which one(s) I want to do, in order to create the large slice of my disks.
>
>
Your experience is exactly why I suggested ZFS start doing some "right
sizing" if you will.  Chop off a bit from the end of any disk so that we're
guaranteed to be able to replace drives from different manufacturers.  The
excuse being "no reason to, Sun drives are always of identical size".  If
your drives did indeed come from Sun, their response is clearly not true.
 Regardless, I guess I still think it should be done.  Figure out what the
greatest variation we've seen from drives that are supposedly of the exact
same size, and chop it off the end of every disk.  I'm betting it's no more
than 1GB, and probably less than that.  When we're talking about a 2TB
drive, I'm willing to give up a gig to be guaranteed I won't have any issues
when it comes time to swap it out.
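
If you do go the slice route in the meantime, the workflow is roughly this (a
sketch; c1t0d0 and c1t1d0 are hypothetical SMI-labeled disks):

# format c1t0d0
    (partition -> size slice 0 a bit under the full capacity -> label -> quit)
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
# zpool create tank mirror c1t0d0s0 c1t1d0s0

The prtvtoc | fmthard step just stamps the same label onto the second disk so
both slices come out identical.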

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 6:53 PM, Robert Milkowski  wrote:

>  On 03/04/2010 19:24, Tim Cook wrote:
>
>
>
> On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey  > wrote:
>
>>   Momentarily, I will begin scouring the omniscient interweb for
>> information, but I’d like to know a little bit of what people would say
>> here.  The question is to slice, or not to slice, disks before using them in
>> a zpool.
>>
>>
>>
>> One reason to slice comes from recent personal experience.  One disk of a
>> mirror dies.  Replaced under contract with an identical disk.  Same model
>> number, same firmware.  Yet when it’s plugged into the system, for an
>> unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
>> unable to attach and un-degrade the mirror.  It seems logical this problem
>> could have been avoided if the device added to the pool originally had been
>> a slice somewhat smaller than the whole physical device.  Say, a slice of
>> 28G out of the 29G physical disk.  Because later when I get the
>> infinitesimally smaller disk, I can always slice 28G out of it to use as the
>> mirror device.
>>
>>
>>
>> There is some question about performance.  Is there any additional
>> overhead caused by using a slice instead of the whole physical device?
>>
>>
>>
>> There is another question about performance.  One of my colleagues said he
>> saw some literature on the internet somewhere, saying ZFS behaves
>> differently for slices than it does on physical devices, because it doesn’t
>> assume it has exclusive access to that physical device, and therefore caches
>> or buffers differently … or something like that.
>>
>>
>>
>> Any other pros/cons people can think of?
>>
>>
>>
>> And finally, if anyone has experience doing this, and process
>> recommendations?  That is … My next task is to go read documentation again,
>> to refresh my memory from years ago, about the difference between “format,”
>> “partition,” “label,” “fdisk,” because those terms don’t have the same
>> meaning that they do in other OSes…  And I don’t know clearly right now,
>> which one(s) I want to do, in order to create the large slice of my disks.
>>
>
>  Your experience is exactly why I suggested ZFS start doing some "right
> sizing" if you will.  Chop off a bit from the end of any disk so that we're
> guaranteed to be able to replace drives from different manufacturers.  The
> excuse being "no reason to, Sun drives are always of identical size".  If
> your drives did indeed come from Sun, their response is clearly not true.
>  Regardless, I guess I still think it should be done.  Figure out what the
> greatest variation we've seen from drives that are supposedly of the exact
> same size, and chop it off the end of every disk.  I'm betting it's no more
> than 1GB, and probably less than that.  When we're talking about a 2TB
> drive, I'm willing to give up a gig to be guaranteed I won't have any issues
> when it comes time to swap it out.
>
>
>  that's what open solaris is doing more or less for some time now.
>
> look in the archives of this mailing list for more information.
> --
> Robert Milkowski
> http://milek.blogspot.com
>
>

Since when?  It isn't doing it on any of my drives, build 134, and judging
by the OP's issues, it isn't doing it for him either... I try to follow this
list fairly closely and I've never seen anyone at Sun/Oracle say they were
going to start doing it after I was shot down the first time.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook  wrote:

>
>
> On Sat, Apr 3, 2010 at 6:53 PM, Robert Milkowski wrote:
>
>>  On 03/04/2010 19:24, Tim Cook wrote:
>>
>>
>>
>> On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey <
>> guacam...@nedharvey.com> wrote:
>>
>>>   Momentarily, I will begin scouring the omniscient interweb for
>>> information, but I’d like to know a little bit of what people would say
>>> here.  The question is to slice, or not to slice, disks before using them in
>>> a zpool.
>>>
>>>
>>>
>>> One reason to slice comes from recent personal experience.  One disk of a
>>> mirror dies.  Replaced under contract with an identical disk.  Same model
>>> number, same firmware.  Yet when it’s plugged into the system, for an
>>> unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
>>> unable to attach and un-degrade the mirror.  It seems logical this problem
>>> could have been avoided if the device added to the pool originally had been
>>> a slice somewhat smaller than the whole physical device.  Say, a slice of
>>> 28G out of the 29G physical disk.  Because later when I get the
>>> infinitesimally smaller disk, I can always slice 28G out of it to use as the
>>> mirror device.
>>>
>>>
>>>
>>> There is some question about performance.  Is there any additional
>>> overhead caused by using a slice instead of the whole physical device?
>>>
>>>
>>>
>>> There is another question about performance.  One of my colleagues said
>>> he saw some literature on the internet somewhere, saying ZFS behaves
>>> differently for slices than it does on physical devices, because it doesn’t
>>> assume it has exclusive access to that physical device, and therefore caches
>>> or buffers differently … or something like that.
>>>
>>>
>>>
>>> Any other pros/cons people can think of?
>>>
>>>
>>>
>>> And finally, if anyone has experience doing this, and process
>>> recommendations?  That is … My next task is to go read documentation again,
>>> to refresh my memory from years ago, about the difference between “format,”
>>> “partition,” “label,” “fdisk,” because those terms don’t have the same
>>> meaning that they do in other OSes…  And I don’t know clearly right now,
>>> which one(s) I want to do, in order to create the large slice of my disks.
>>>
>>
>>  Your experience is exactly why I suggested ZFS start doing some "right
>> sizing" if you will.  Chop off a bit from the end of any disk so that we're
>> guaranteed to be able to replace drives from different manufacturers.  The
>> excuse being "no reason to, Sun drives are always of identical size".  If
>> your drives did indeed come from Sun, their response is clearly not true.
>>  Regardless, I guess I still think it should be done.  Figure out what the
>> greatest variation we've seen from drives that are supposedly of the exact
>> same size, and chop it off the end of every disk.  I'm betting it's no more
>> than 1GB, and probably less than that.  When we're talking about a 2TB
>> drive, I'm willing to give up a gig to be guaranteed I won't have any issues
>> when it comes time to swap it out.
>>
>>
>>  that's what open solaris is doing more or less for some time now.
>>
>> look in the archives of this mailing list for more information.
>> --
>> Robert Milkowski
>> http://milek.blogspot.com
>>
>>
>
> Since when?  It isn't doing it on any of my drives, build 134, and judging
> by the OP's issues, it isn't doing it for him either... I try to follow this
> list fairly closely and I've never seen anyone at Sun/Oracle say they were
> going to start doing it after I was shot down the first time.
>
> --Tim
>


Oh... and after 15 minutes of searching for everything from 'right-sizing'
to 'block reservation' to 'replacement disk smaller size fewer blocks' etc.
etc. I don't see a single thread on it.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 9:52 PM, Richard Elling wrote:

> On Apr 3, 2010, at 5:56 PM, Tim Cook wrote:
> >
> > On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook  wrote:
> >> Your experience is exactly why I suggested ZFS start doing some "right
> sizing" if you will.  Chop off a bit from the end of any disk so that we're
> guaranteed to be able to replace drives from different manufacturers.  The
> excuse being "no reason to, Sun drives are always of identical size".  If
> your drives did indeed come from Sun, their response is clearly not true.
>  Regardless, I guess I still think it should be done.  Figure out what the
> greatest variation we've seen from drives that are supposedly of the exact
> same size, and chop it off the end of every disk.  I'm betting it's no more
> than 1GB, and probably less than that.  When we're talking about a 2TB
> drive, I'm willing to give up a gig to be guaranteed I won't have any issues
> when it comes time to swap it out.
> >>
> >>
> > that's what open solaris is doing more or less for some time now.
> >
> > look in the archives of this mailing list for more information.
> > --
> > Robert Milkowski
> > http://milek.blogspot.com
> >
> >
> >
> > Since when?  It isn't doing it on any of my drives, build 134, and
> judging by the OP's issues, it isn't doing it for him either... I try to
> follow this list fairly closely and I've never seen anyone at Sun/Oracle say
> they were going to start doing it after I was shot down the first time.
> >
> > --Tim
> >
> >
> > Oh... and after 15 minutes of searching for everything from
> 'right-sizing' to 'block reservation' to 'replacement disk smaller size
> fewer blocks' etc. etc. I don't see a single thread on it.
>
> CR 6844090, zfs should be able to mirror to a smaller disk
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> b117, June 2009
>  -- richard
>
>

Unless the bug description is incomplete, that's talking about adding a
mirror to an existing drive.  Not about replacing a failed drive in an
existing vdev that could be raid-z#.  I'm almost positive I had an issue
post b117 with replacing a failed drive in a raid-z2 vdev.

I'll have to see if I can dig up a system to test the theory on.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mpxio load-balancing...it doesn't work??

2010-04-04 Thread Tim Cook
On Sun, Apr 4, 2010 at 8:55 PM, Brad  wrote:

> I had always thought that with mpxio, it load-balances IO request across
> your storage ports but this article
> http://christianbilien.wordpress.com/2007/03/23/storage-array-bottlenecks/
> has got me thinking it's not true.
>
> "The available bandwidth is 2 or 4Gb/s (200 or 400MB/s – FC frames are 10
> bytes long -) per port. As load balancing software (Powerpath, MPXIO, DMP,
> etc.) are most of the times used both for redundancy and load balancing,
> I/Os coming from a host can take advantage of an aggregated bandwidth of two
> ports. However, reads can use only one path, but writes are duplicated, i.e.
> a host write ends up as one write on each host port. "
>
> Is this true?
> --
>


I have no idea what MPIO stack he's talking about, but I've never heard of
anything that operates the way he describes.  Writes aren't "duplicated on
each port".  The path a read OR write goes down depends on the host-side
MPIO stack and how you have it configured to load-balance.  It could be
simple round-robin, it could be based on queue depth, it could be most
recently used, etc.
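On Solaris, for example, the MPxIO load-balance policy lives in
/kernel/drv/scsi_vhci.conf (round-robin is the default):

  load-balance="round-robin";

and something like

  # mpathadm show lu /dev/rdsk/cXtYdZs2

will list every path to that LUN and whether each one is active (cXtYdZ is
just a placeholder; 'mpathadm list lu' shows the real names on your box).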

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Tim Cook
On Sun, Apr 4, 2010 at 9:46 PM, Edward Ned Harvey wrote:

> > CR 6844090, zfs should be able to mirror to a smaller disk
> > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> > b117, June 2009
>
> Awesome.  Now if someone would only port that to solaris, I'd be a happy
> man.   ;-)
>
>

Have you tried pointing that bug out to the support engineers who have your
case at Oracle?  If the fixed code is already out there, it's just a matter
of porting the code, right?  :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mpxio load-balancing...it doesn't work??

2010-04-05 Thread Tim Cook
On Mon, Apr 5, 2010 at 8:16 PM, Brad  wrote:

> I'm wondering if the author is talking about "cache mirroring" where the
> cache is mirrored between both controllers.  If that is the case, is he
> saying that for every write to the active controlle,r a second write issued
> on the passive controller to keep the cache mirrored?
>
>
He's talking about multipathing; he just has no clue what he's talking
about.  He specifically calls out software packages that are used for
multipathing.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Diagnosing Permanent Errors

2010-04-05 Thread Tim Cook
On Mon, Apr 5, 2010 at 9:39 PM, Willard Korfhage wrote:

> It certainly has symptoms that match a marginal power supply, but I
> measured the power consumption some time ago and found it comfortably within
> the power supply's capacity. I've also wondered if the RAM is fine, but
> there is just some kind of flaky interaction of the ram configuration I had
> with the motherboard.
> --
> This message posted from opensolaris.org
>
>
I think the confusion is that you said you ran memtest86+ and the memory
tested just fine.  Did you remove some memory before running memtest86+ and
narrow it down to a certain stick being bad, or something like that?  Your
post makes it sound as though you found that all of the RAM is working
perfectly fine, i.e. it's not the problem.

Also, a low power draw doesn't mean much of anything.  The power supply
could just be dying.  Load wouldn't really matter in that scenario (although
a high load will generally help it out the door a bit quicker due to higher
heat/etc.).

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Diagnosing Permanent Errors

2010-04-05 Thread Tim Cook
On Tue, Apr 6, 2010 at 12:24 AM, Daniel Carosone  wrote:

> On Mon, Apr 05, 2010 at 09:35:21PM -0700, Willard Korfhage wrote:
> > By the way, I see that now one of the disks is listed as degraded - too
> many errors. Is there a good way to identify exactly which of the disks it
> is?
>
> It's hidden in iostat -E, of all places.
>
> --
> Dan.
>
>
I think he wants to know how to identify which physical drive maps to the
device ID in Solaris.  The only way I can think of is to run something like
dd against the drive to light up the activity LED.
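Something along these lines should do it (c8t3d0 is just a placeholder for
whatever device zpool status is complaining about; on SPARC/SMI-labeled
disks use s2 instead of p0):

  # dd if=/dev/rdsk/c8t3d0p0 of=/dev/null bs=1024k

then go look for the drive whose activity LED stays solidly lit.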

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Diagnosing Permanent Errors

2010-04-06 Thread Tim Cook
On Tue, Apr 6, 2010 at 12:47 AM, Daniel Carosone  wrote:

> On Tue, Apr 06, 2010 at 12:29:35AM -0500, Tim Cook wrote:
> > On Tue, Apr 6, 2010 at 12:24 AM, Daniel Carosone 
> wrote:
> >
> > > On Mon, Apr 05, 2010 at 09:35:21PM -0700, Willard Korfhage wrote:
> > > > By the way, I see that now one of the disks is listed as degraded -
> too
> > > many errors. Is there a good way to identify exactly which of the disks
> it
> > > is?
> > >
> > > It's hidden in iostat -E, of all places.
> > >
> > > --
> > > Dan.
> > >
> > >
> > I think he wants to know how to identify which physical drive maps to the
> > dev ID in solaris.  The only way I can think of is to run something like
> DD
> > against the drive to light up the activity LED.
>
> or look at the serial numbers printed in iostat -E
>
> --
> Dan.
>


And then what?  Cross your fingers and hope you pull the right drive on the
first go?  I don't know of any drives that come from the factory in a
hot-swap bay with the serial number printed on the front of the caddy.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Are there (non-Sun/Oracle) vendors selling OpenSolaris/ZFS based NAS Hardware?

2010-04-07 Thread Tim Cook
On Wed, Apr 7, 2010 at 2:20 PM, Jeremy Archer  wrote:

> GreenBytes (USA) sells OpenSolaris based storage appliances
> Web site: www.getgreenbytes.com
>  
>

Unless something has changed recently, they were using their own modified,
non-open-source version of ZFS.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ recommendation

2010-04-07 Thread Tim Cook
On Wednesday, April 7, 2010, Jason S  wrote:
> Since i already have Open Solaris installed on the box, i probably wont jump 
> over to FreeBSD. However someone has suggested to me to look into 
> www.nexenta.org and i must say it is quite interesting. Someone correct me if 
> i am wrong but it looks like it is Open Solaris based and has basically 
> everything i am looking for (NFS and CIFS sharing). I am downloading it right 
> now and am going to install it on another machine to see if this GUI is easy 
> enough to use.
>
> Does anyone have any experience or pointers with this NAS software?
> --
> This message posted from opensolaris.org
> _


I wouldn't waste your time.  On my last go-round, LACP was completely
broken for no apparent reason.  The community is basically non-existent.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ recommendation

2010-04-07 Thread Tim Cook
On Wed, Apr 7, 2010 at 5:59 PM, Richard Elling wrote:

> On Apr 7, 2010, at 3:24 PM, Tim Cook wrote:
> > On Wednesday, April 7, 2010, Jason S  wrote:
> >> Since i already have Open Solaris installed on the box, i probably wont
> jump over to FreeBSD. However someone has suggested to me to look into
> www.nexenta.org and i must say it is quite interesting. Someone correct me
> if i am wrong but it looks like it is Open Solaris based and has basically
> everything i am looking for (NFS and CIFS sharing). I am downloading it
> right now and am going to install it on another machine to see if this GUI
> is easy enough to use.
> >>
> >> Does anyone have any experience or pointers with this NAS software?
> >> --
> >> This message posted from opensolaris.org
> >> _
> >
> >
> > I wouldn't waste your time. My last go round lacp was completely
> > broken for no apparent reason. The community is basically
> > non-existent.
>
> [richard pinches himself... yep, still there :-)]
>
> NexentaStor version 3.0 is based on b134 so it has the same basic
> foundation
> as the yet-unreleased OpenSolaris 2010.next.  For an easy-to-use NAS box
> for the masses, it is much more friendly and usable than a basic
>  OpenSolaris
> or Solaris 10 release.
>  -- richard
>
> ZFS storage and performance consulting at http://www.RichardElling.com
> ZFS training on deduplication, NexentaStor, and NAS performance
> Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
>
>

**Unless of course you were looking for any community support or basic LACP
functionality.

;)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-04-10 Thread Tim Cook
On Sat, Apr 10, 2010 at 10:08 AM, Edward Ned Harvey
wrote:

>  Due to recent experiences, and discussion on this list, my colleague and
> I performed some tests:
>
>
>
> Using solaris 10, fully upgraded.  (zpool 15 is latest, which does not have
> log device removal that was introduced in zpool 19)  In any way possible,
> you lose an unmirrored log device, and the OS will crash, and the whole
> zpool is permanently gone, even after reboots.
>
>
>
> Using opensolaris, upgraded to latest, which includes zpool version 22.
> (Or was it 23?  I forget now.)  Anyway, it’s >=19 so it has log device
> removal.
>
> 1.   Created a pool, with unmirrored log device.
>
> 2.   Started benchmark of sync writes, verified the log device getting
> heavily used.
>
> 3.   Yank out the log device.
>
> Behavior was good.  The pool became “degraded” which is to say, it started
> using the primary storage for the ZIL, performance presumably degraded, but
> the system remained operational and error free.
>
> I was able to restore perfect health by “zpool remove” the failed log
> device, and “zpool add” a new log device.
>
>
>
> Next:
>
> 1.   Created a pool, with unmirrored log device.
>
> 2.   Started benchmark of sync writes, verified the log device getting
> heavily used.
>
> 3.   Yank out both power cords.
>
> 4.   While the system is down, also remove the log device.
>
> (OOoohhh, that’s harsh.)  I created a situation where an unmirrored log
> device is known to have unplayed records, there is an ungraceful shutdown, *
> *and** the device disappears.  That’s the absolute worst case scenario
> possible, other than the whole building burning down.  Anyway, the system
> behaved as well as it possibly could.  During boot, the faulted pool did not
> come up, but the OS came up fine.  My “zpool status” showed this:
>
>
>
> # zpool status
>
>
>
>   pool: junkpool
>
>  state: FAULTED
>
> status: An intent log record could not be read.
>
> Waiting for adminstrator intervention to fix the faulted pool.
>
> action: Either restore the affected device(s) and run 'zpool online',
>
> or ignore the intent log records by running 'zpool clear'.
>
>see: http://www.sun.com/msg/ZFS-8000-K4
>
>  scrub: none requested
>
> config:
>
>
>
> NAMESTATE READ WRITE CKSUM
>
> junkpoolFAULTED  0 0 0  bad intent log
>
>   c8t4d0ONLINE   0 0 0
>
>   c8t5d0ONLINE   0 0 0
>
> logs
>
>   c8t3d0UNAVAIL  0 0 0  cannot open
>
>
>
> (---)
>
> I know the unplayed log device data is lost forever.  So I clear the error,
> remove the faulted log device, and acknowledge that I have lost the last few
> seconds of written data, up to the system crash:
>
>
>
> # zpool clear junkpool
>
> # zpool status
>
>
>
>   pool: junkpool
>
>  state: DEGRADED
>
> status: One or more devices could not be opened.  Sufficient replicas exist
> for
>
> the pool to continue functioning in a degraded state.
>
> action: Attach the missing device and online it using 'zpool online'.
>
>see: http://www.sun.com/msg/ZFS-8000-2Q
>
>  scrub: none requested
>
> config:
>
>
>
> NAMESTATE READ WRITE CKSUM
>
> junkpoolDEGRADED 0 0 0
>
>   c8t4d0ONLINE   0 0 0
>
>   c8t5d0ONLINE   0 0 0
>
> logs
>
>   c8t3d0UNAVAIL  0 0 0  cannot open
>
>
>
> # zpool remove junkpool c8t3d0
>
> # zpool status junkpool
>
>
>
>   pool: junkpool
>
>  state: ONLINE
>
>  scrub: none requested
>
> config:
>
>
>
> NAMESTATE READ WRITE CKSUM
>
> junkpoolONLINE   0 0 0
>
>   c8t4d0ONLINE   0 0 0
>
>   c8t5d0ONLINE   0 0 0
>
>
>


Awesome!  Thanks for letting us know the results of your tests, Ed; that's
extremely helpful.  I was actually interested in grabbing some of the
cheaper Intel SSDs for home use, but didn't want to waste my money if ZFS
wasn't going to handle the various failure modes gracefully.
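For anyone else planning the same thing, attaching a mirrored slog after
the fact is a one-liner (pool and device names invented for the example):

  # zpool add tank log mirror c8t3d0 c8t4d0

and, per Ed's tests, on pool version 19 or later you can 'zpool remove' it
again if the SSD ever dies on you.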

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ recommendation

2010-04-10 Thread Tim Cook
On Fri, Apr 9, 2010 at 9:31 PM, Eric D. Mudama wrote:

> On Sat, Apr 10 at  7:22, Daniel Carosone wrote:
>
>> On Fri, Apr 09, 2010 at 10:21:08AM -0700, Eric Andersen wrote:
>>
>>>  If I could find a reasonable backup method that avoided external
>>>  enclosures altogether, I would take that route.
>>>
>>
>> I'm tending to like bare drives.
>>
>> If you have the chassis space, there are 5-in-3 bays that don't need
>> extra drive carriers, they just slot a bare 3.5" drive.  For e.g.
>>
>> http://www.newegg.com/Product/Product.aspx?Item=N82E16817994077
>>
>
> I have a few of the 3-in-2 versions of that same enclosure from the
> same manufacturer, and they installed in about 2 minutes in my tower
> case.
>
> The 5-in-3 doesn't have grooves in the sides like their 3-in-2 does,
> so some cases may not accept the 5-in-3 if your case has tabs to
> support devices like DVD drives in the 5.25" slots.
>
> The grooves are clearly visible in this picture:
>
> http://www.newegg.com/Product/Product.aspx?Item=N82E16817994075
>
> The doors are a bit "light" perhaps, but it works just fine for my
> needs and holds drives securely.  The small fans are a bit noisy, but
> since the box lives in the basement I don't really care.
>
> --eric
>
>
> --
> Eric D. Mudama
> edmud...@mail.bounceswoosh.org
>


At that price, for the 5-in-3 at least, I'd go with Supermicro.  For $20
more, you get what appears to be a far more solid enclosure.

--Tim

>
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd

2010-04-15 Thread Tim Cook
On Fri, Apr 16, 2010 at 12:21 AM, george  wrote:

> hi all
>
> im brand new to opensolaris ... feel free to call me noob :)
>
> i need to build a home server for media and general storage
>
> zfs sound like the perfect solution
>
> but i need to buy a 8 (or more) SATA controller
>
> any suggestions for compatible 2 opensolaris products will be really
> appreciated
>
> for the moment made a silly purchase spending 250 euros on lsi megaraid
> 8208elp which
> is SR and not compatible with OpenSolaris...
>
> thanx in advance
> G
>
>
Depends on what sort of interface you're looking for.  The Supermicro
AOC-SAT2-MV8s work great.  They're PCI-X based, have 8 ports, come with
SATA cables, and are relatively cheap (<$150 most places).

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd

2010-04-16 Thread Tim Cook
On Fri, Apr 16, 2010 at 1:57 AM, Günther  wrote:

> hello
>
> if you are looking for pci-e (8x), i would recommend sas/sata  controller
> with lsi 1068E sas chip. they are nearly perfect with opensolaris.
>
> you must look for controller with it firmware (jbod mode) not
> those with raid enabled (ir mode). normally the cheaper
> variants are the right ones.
>
> one of the cheapest at all any my favourite is the supermicro usas-l8i
> http://www.supermicro.com/products/accessories/addon/AOC-USAS-L8i.cfm
> (about 100 euro in germany)
>
> although it is uio (wrong side mounted for special supermicro cases),
> it's normally not a problem because it's internal only.
> (you will loose one slot normally)
>
> see my hardware
> http://www.napp-it.org/hardware/
>
> gea
> --
>
>
The firmware can be flashed regardless of what the card came with.  Why
would you buy the UIO card when you can get the Intel SASUC8i for the same
price or less, and it comes in a standard form factor?

The (potential) problem with the 1068-based cards is that they don't
support AHCI with SATA.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd

2010-04-16 Thread Tim Cook
On Fri, Apr 16, 2010 at 7:35 PM, Harry Putnam  wrote:

> "Eric D. Mudama"  writes:
>
> > On Thu, Apr 15 at 23:57, Günther wrote:
> >>hello
> >>
> >>if you are looking for pci-e (8x), i would recommend sas/sata  controller
> >>with lsi 1068E sas chip. they are nearly perfect with opensolaris.
> >
> > For just a bit more, you can get the LSI SAS 9211-9i card which is
> > 6Gbit/s.  It works fine for us, and does JBOD no problem.
>
> I can't resist getting in a similar questions here.  Its not so easy
> to really get good info about this subject... there is a lot of info
> on the subject but when you remove all pci-e info .. maybe not so
> much.
>
> I will be needing a 4 or more port PCI sata controller soon and would
> like to get one that can make use of the newest sata (alleged) 3GB
> transfer rates.
>
> It's older base hardware... athlon64 3400+ 2.2 ghz  3GB Ram
> With A-open AK86-L Motherboard.
>
> So what do any of you know about a PCI card that fills the bill?
>
>
>
If you're talking about standard PCI, and not PCI-e or PCI-X, there's no
reason to try to get a faster controller.  A standard PCI slot can't even
max out the first revision of SATA.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd

2010-04-17 Thread Tim Cook
On Sat, Apr 17, 2010 at 2:12 PM, Harry Putnam  wrote:

> Tim Cook  writes:
>
> > On Fri, Apr 16, 2010 at 7:35 PM, Harry Putnam 
> wrote:
> >
> >> "Eric D. Mudama"  writes:
> >>
> >> > On Thu, Apr 15 at 23:57, Günther wrote:
> >> >>hello
> >> >>
> >> >>if you are looking for pci-e (8x), i would recommend sas/sata
>  controller
> >> >>with lsi 1068E sas chip. they are nearly perfect with opensolaris.
> >> >
> >> > For just a bit more, you can get the LSI SAS 9211-9i card which is
> >> > 6Gbit/s.  It works fine for us, and does JBOD no problem.
> >>
> >> I can't resist getting in a similar questions here.  Its not so easy
> >> to really get good info about this subject... there is a lot of info
> >> on the subject but when you remove all pci-e info .. maybe not so
> >> much.
> >>
> >> I will be needing a 4 or more port PCI sata controller soon and would
> >> like to get one that can make use of the newest sata (alleged) 3GB
> >> transfer rates.
> >>
> >> It's older base hardware... athlon64 3400+ 2.2 ghz  3GB Ram
> >> With A-open AK86-L Motherboard.
> >>
> >> So what do any of you know about a PCI card that fills the bill?
> >>
> >>
> >>
> > If you're talking about standard PCI, and not PCI-e or PCI-X, there's no
> > reason to try to get a faster controller.  A standard PCI slot can't even
> > max out the first revision of SATA.
>
> Ahh good to know.  So will sata2 drives have any trouble with a plain
> pci sata controller.
>
> I have no option for pci-e or whatever.  just PCI.  And I need at
> least a 4 port, whether its faster or not.
>
>
They'll work just fine, they'll just be slow.  Standard 32-bit/33MHz PCI
tops out at a theoretical 133MB/sec (33MHz x 4 bytes), and that bandwidth
is shared by every device on the bus, so a single first-generation SATA
drive can already saturate it.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd

2010-04-19 Thread Tim Cook
On Monday, April 19, 2010, Roy Sigurd Karlsbakk  wrote:
> - "Harry Putnam"  skrev:
>
>> Erik Trimble  writes:
>>
>> >> Do you think it would be a problem having a second sata card in a
>> PCI
>> >> slot?  That would be 8 sata ports in all, since the A-open AK86
>> >> motherboard has 2 built in.  Or should I swap out the 2prt for the
>> 4
>> >> prt.  I really only need 2 more prts currently, but would be nice
>> to
>> >> have a couple still open for the future.
>> >>
>> > Your PCI bus bandwidth is shared, so it doesn't matter if you use 3
>> x
>> > 2port cards, or 2 x 4port cards (or, in your case, 1x2port +
>> > 1x4port). Performance is going to be virtually identical.
>>
>> Thanks.
>> So performance with drop as number of drives increases?
>
> Performance is likely to drop with the number of PCI cards on the same 
> bridge. Some motherboards have multiple PCI bridges, but mostly on more 
> expensive server boards (those with PCI-X etc). Performance will probably be 
> limited to the (theoretical) 133/266MB/s plus overhead with more PCI cards. 
> I'd say get an 8-port card to get the best out of it. I would guess your 
> motherboard supports 66MHz, since that came in PCI 2.1 (from wikipedia PCI 
> 2.1, released on June 1, 1995, allows for 66 MHz signaling at 3.3 volt signal 
> voltage (peak transfer rate of 533 MB/s), but at 33 MHz both 5 volt and 3.3 
> volt signal voltages are still allowed. It also added transaction latency 
> limits to the specification.[7]).
>
> roy
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>

I'd be shocked to see 66MHz PCI on a consumer board.  In fact, I'd be
shocked to see a 66MHz 32-bit bus, period.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opteron 6100? Does it work with opensolaris?

2010-05-12 Thread Tim Cook
The problem is that the Solaris team and LSI have put a lot of work into
the new SAS2008-based cards.  Claiming there are issues without listing
specific bugs they can address is, I'm sure, frustrating to say the least.

On May 12, 2010 8:22 AM, "Thomas Burgess"  wrote:

>>
>
> Now wait just a minute. You're casting aspersions on
> stuff here without saying what you're ...
I think he was just trying to tell me that my cpu should be fine, that the
only thing which i might have to worry about is network and disk drivers.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] iSCSI confusion

2010-05-23 Thread Tim Cook
Yes, it requires a clustered filesystem to share out a single LUN to
multiple hosts.  VMFS3, however bad an implementation, is in fact a
clustered filesystem.  I highly doubt NFS is your problem, though; I'd take
NFS over iSCSI and VMFS any day.
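If you do stick with NFS, the export itself is just a ZFS property.  A
minimal sketch, with the pool/dataset and ESX host names made up:

  # zfs create tank/esx
  # zfs set sharenfs='rw=esx1:esx2,root=esx1:esx2' tank/esx

Point every ESX host at that same export.  And as you mention, an SSD ZIL
device is the first thing I'd try before writing NFS off on performance
grounds.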

On May 23, 2010 8:06 PM, "Chris Dunbar - Earthside, LLC" <
cdun...@earthside.net> wrote:

Hello,

I think I know the answer to this, but not being an iSCSI expert I am hoping
to be pleasantly surprised by your answers. I currently use ZFS plus NFS to
host a shared VMFS store for my VMware ESX cluster. It's easy to set up and
high availability works great since all the ESX hosts see the same storage
pool. However, NFS performance has been pretty poor and I am looking for
other options. I do not currently use any SSD drives in my pool and I
understand adding a couple as ZIL devices might improve performance. I am
also thinking about switching to iSCSI. Here is my confusion/question. Is it
possible to share the same ZFS file system with multiple ESX hosts via
iSCSI? My belief is that an iSCSI connection is sort of like having a
dedicated physical drive and therefore does not lend itself to sharing
between multiple systems. Please set me straight.

Thank you,
Chris Dunbar

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-06-07 Thread Tim Cook
On Mon, Jun 7, 2010 at 9:45 AM, David Magda  wrote:

> On Mon, June 7, 2010 09:21, Richard Jahnel wrote:
> > I'll have to take your word on the Zeus drives. I don't see any thing in
> > thier literature that explicitly states that cache flushes are obeyed or
> > other wise protected against power loss.
>
> The STEC units is what Oracle/Sun use in their 7000 series appliances, and
> I believe EMC and many others use them as well.
>
>
When did that start?  Every 7000 I've seen uses Intel drives.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] NexentaStor 3.0.3 vs OpenSolaris - Patches more up to date?

2010-07-02 Thread Tim Cook
On Fri, Jul 2, 2010 at 8:06 PM, Richard Elling  wrote:

> On Jul 2, 2010, at 12:53 PM, Steve Radich, BitShop, Inc. wrote:
>
> > I see in NexentaStor's announcement of Community Edition 3.0.3 they
> mention some backported patches in this release.
>
> Yes.  These patches are in the code tree, currently at b143 (~18 weeks
> newer than b134)
>
> > Aside from their management features / UI what is the core OS difference
> if we move to Nexenta from OpenSolaris b134?
>
> You're not stuck at b134 for ZFS anymore ;-)
>
> > These DeDup bugs are my main frustration - if a staff member does a rm *
> in a directory with dedup you can take down the whole storage server - all
> with 1% cpu load and relatively little disk i/o due to DeDup DDT not fitting
> in the SSD + RAM (l2arc+arc). This is ridiculous, something must be single
> threaded and it can't be that difficult to at least allow reads from other
> files.. Writes perhaps are more complex - But in our case the "other files"
> don't even have DeDup enabled on them and they can't be read.
>
> Some are fixed, more are in the upstream development queue.
>
> > It seems like some of these bugs have been fixed but Oracle hasn't
> published a new build - Perhaps we should be updating to newer builds, I
> haven't invested much time in seeking these out but b134 is the latest
> "obvious" build I see. Am I just not RTFM enough on finding new builds?
>
> No, what you see is what you get.  After the CIC there hasn't been a
> binary release from Oracle, just source releases.  I read this as saying
> the community should build their own distros. In a quick look at
> http://www.genunix.org it appears that Nexenta and EON are the only
> distro releases since early March.  Rich Lowe has released a b142
> tarball, too, but does that qualify as a distro?
>
> > I hate to move to Nexenta, I would think in the future Oracle will
> maintain this better than a third party and don't want to switch back and
> forth.
>
>
> I understand, but if actions speak louder than words, then consider joining
> the Nexenta core platform community at http://www.nexenta.org
> But don't forget to stay up to date with ZFS on zfs-discuss :-)
>  -- richard
>
> --
> Richard Elling
> rich...@nexenta.com   +1-760-896-4422
> ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
> http://nexenta-rotterdam.eventbrite.com/



Given that the most basic functionality was broken in Nexenta, and not in
OpenSolaris, and I couldn't get a single response, I have a hard time
recommending ANYONE go to Nexenta.  It's great that they're employing you
now, but the community edition has an extremely long way to go before it
comes close to touching the community that still hangs around here, despite
Oracle's lack of care and feeding.

http://www.nexenta.org/boards/1/topics/211


--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] NexentaStor 3.0.3 vs OpenSolaris - Patches more up to date?

2010-07-02 Thread Tim Cook
On Fri, Jul 2, 2010 at 9:25 PM, Richard Elling  wrote:

> On Jul 2, 2010, at 6:48 PM, Tim Cook wrote:
> > Given that the most basic of functionality was broken in Nexenta, and not
> Opensolaris, and I couldn't get a single response, I have a hard time
> recommending ANYONE go to Nexenta.  It's great they're employing you now,
> but the community edition has an extremely long way to go before it comes
> close to touching the community that still hangs around here, despite
> Oracle's lack of care and feeding.
> >
> > http://www.nexenta.org/boards/1/topics/211
>
> I can't test that, due to lack of equivalent hardware, but did you file a
> bug?
> The dladm code and nge drivers come from upstream, so look for an
> equivalent
> opensolaris bug,  perhaps something like
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913874
>  -- richard
>
>

No, I didn't file a bug.  I couldn't get a response to the issue to even
begin troubleshooting, so I had no desire to file a bug or continue using a
product that was broken out of the box.  OpenSolaris worked, so I went back
to it.  I can say with a fair amount of confidence that the same wouldn't
happen with OpenSolaris proper.  Even when I've chosen not to keep running
down a problem, I've never run into a situation where I didn't at least get
some troubleshooting suggestions.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] NexentaStor 3.0.3 vs OpenSolaris - Patches more up to date?

2010-07-02 Thread Tim Cook
On Fri, Jul 2, 2010 at 9:55 PM, James C. McPherson wrote:

> On  3/07/10 12:25 PM, Richard Elling wrote:
>
>> On Jul 2, 2010, at 6:48 PM, Tim Cook wrote:
>>
>>> Given that the most basic of functionality was broken in Nexenta, and not
>>> Opensolaris, and I couldn't get a single response, I have a hard time
>>> recommending ANYONE go to Nexenta.  It's great they're employing you now,
>>> but the community edition has an extremely long way to go before it comes
>>> close to touching the community that still hangs around here, despite
>>> Oracle's lack of care and feeding.
>>>
>>> http://www.nexenta.org/boards/1/topics/211
>>>
>>
>> I can't test that, due to lack of equivalent hardware, but did you file a
>> bug?
>> The dladm code and nge drivers come from upstream, so look for an
>> equivalent
>> opensolaris bug,  perhaps something like
>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913874
>>  -- richard
>>
>>
>
> Hi Tim,
> does this CR match what you were experiencing?
>
> http://bugs.opensolaris.org/view_bug.do?bug_id=6901419
> 6901419 dladm create-aggr -u incorrectly rejects some valid ethernet
> addresses
>
> If so - fixed in snv_136.
>
> The only other dladm CR I can see in the push logs for builds
> post 134 is
>
> 6932656 "dladm set-linkprop -p cpus" can't take more than 32 CPUs
> fixed in 138.
>
>
> hth,
> James
> --
> Senior Software Engineer, Solaris
> Oracle
> http://www.jmcp.homeunix.com/blog


Hi James,

Nope.  I'm not sure exactly what I was hitting.  I've never run into a
problem on any release of OpenSolaris.  I believe I've tested on 126, 132,
133, and 134 (as well as many iterations of older versions).  The dladm
issue was exclusive to Nexenta.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Legality and the future of zfs...

2010-07-11 Thread Tim Cook
On Sat, Jul 10, 2010 at 1:20 PM, Edward Ned Harvey
wrote:

> > From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> > boun...@opensolaris.org] On Behalf Of Peter Taps
> >
> > A few companies have already backed out of zfs
> > as they cannot afford to go through a lawsuit.
>
> Or, in the case of Apple, who could definitely afford a lawsuit, but choose
> to avoid it anyway.
>
>
> > I am in a stealth
> > startup company and we rely on zfs for our application. The future of
> > our company, and many other businesses, depends on what happens to zfs.
>
> For a lot of purposes, ZFS is the clear best solution.  But maybe you're
> not
> necessarily in one of those situations?  Perhaps you could use Microsoft
> VSS, or Linux BTRFS?
>
> 'Course, by all rights, those are copy-on-write too.  So why doesn't netapp
> have a lawsuit against kernel.org, or microsoft?  Maybe cuz they just know
> they'll damage their own business too much by suing Linus, and they can't
> afford to go up against MS.  I guess.
>
>
Because VSS isn't doing anything remotely close to what WAFL is doing when
it takes snapshots.

I haven't spent much time looking at the exact BTRFS implementation, but
I'd imagine the fact that its on-disk format isn't "finalized" (last I
heard) would make it a bit premature to file a lawsuit.  I'm sure they're
actively watching it as well.

Furthermore, I'm sure the fact that one of the core ZFS developers, Matt
Ahrens, previously interned with the filesystem group at NetApp had just a
*BIT* to do with the lawsuit.  From their perspective, it's just a bit too
convenient that someone gets access to the crown jewels, then runs off to a
new company and creates a filesystem that looks and feels so similar.

Of course, taking stabs in the dark on this mailing list without having
access to all of the court documents isn't really constructive in the first
place.  Then again, neither is it constructive for people on this list who
aren't IP lawyers to claim they have a solid understanding of the validity
of the lawsuit(s).

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Legality and the future of zfs...

2010-07-12 Thread Tim Cook
On Mon, Jul 12, 2010 at 8:32 AM, Edward Ned Harvey
wrote:

> > From: Tim Cook [mailto:t...@cook.ms]
> >
> > Because VSS isn't doing anything remotely close to what WAFL is doing
> > when it takes snapshots.
>
> It may not do what you want it to do, but it's still copy on write, as
> evidenced by the fact that it takes instantaneous snapshots, and snapshots
> don't get overwritten when new data is written.
>
> I wouldn't call that "not even remotely close."  It's different, but
> definitely the same ballpark.
>
>

Everyone else's SNAPSHOTS are copy-on-write; for ZFS and WAFL it's the
filesystem itself that is copy-on-write, which is why there is no
performance degradation when you take a snapshot on NetApp/Oracle.

Per Microsoft:
When a change to the original volume occurs, but before it is written to
disk, the block about to be modified is read and then written to a
“differences area”, which preserves a copy of the data block before it is
overwritten with the change.

That is exactly how pretty much everyone else takes snapshots in the
industry, and exactly why nobody can keep more than a handful on disk at any
one time, and sometimes not even that for data that has heavy change rates.

It's not in the same ballpark, it's a completely different implementation.
 It's about as similar as a gas and diesel engine.  They might both go in
cars, they might both move the car.  They aren't remotely close to each
other from a design perspective.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Solaris Filesystem

2010-07-14 Thread Tim Cook
On Wed, Jul 14, 2010 at 4:07 PM, Beau J. Bechdol  wrote:

> So not sue if this is the correct list to email to or not. I am curious to
> know on my machine I have two hard drive (c8t0d0 and c8t1d0). Can some one
> explain to me what this exactly means? What does "c8" "t0" and "d0" actually
> mean. I might have to go back to solaris 101 to understand what this all
> means.
>
> Thanks,
>
> -Beau
>
>
>
Controller 8 (this could be a SATA/FC/PATA card)
Target 0 (pretty self-explanatory... it's the first target on that
controller)
Disk 0 (the first disk, i.e. LUN, behind that target)

http://www.idevelopment.info/data/Unix/Solaris/SOLARIS_UnderstandingDiskDeviceFiles.shtml
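If you want to see where a given name actually points, the /dev entries
are just symlinks into the physical device tree (output trimmed, your path
will differ):

  $ ls -l /dev/dsk/c8t0d0s0
  ... /dev/dsk/c8t0d0s0 -> ../../devices/pci@0,0/.../disk@0,0:a

The @0,0 on the end is the target and LUN, and :a is slice 0.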


--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Legality and the future of zfs...

2010-07-14 Thread Tim Cook
On Wed, Jul 14, 2010 at 9:27 PM, BM  wrote:

> On Thu, Jul 15, 2010 at 12:49 AM, Edward Ned Harvey
>  wrote:
> > I'll second that.  And I think this is how you can tell the difference:
> > With supermicro, do you have a single support number to call and a 4hour
> > onsite service response time?
>
> Yes.
>
> BTW, just for the record, people potentially have a bunch of other
> supermicros in a stock, that they've bought for the rest of the money
> that left from a budget that was initially estimated to get shiny
> Sun/Oracle hardware. :) So normally you put them online in a cluster
> and don't really worry that one of them gone — just power that thing
> down and disconnect from the whole grid.
>
> > When you pay for the higher prices for OEM hardware, you're paying for
> the
> > knowledge of parts availability and compatibility. And a single point
> > vendor who supports the system as a whole, not just one component.
>
> What exactly kind of compatibility you're talking about? For example,
> if I remove my broken mylar air shroud for X8 DP with a
> MCP-310-18008-0N number because I step on it accidentally :-D, pretty
> much I think I am gonna ask them to replace exactly THAT thing back.
> Or you want to let me tell you real stories how OEM hardware is
> supported and how many emails/phonecalls it involves? One of the very
> latest (just a week ago): Apple Support reported me that their
> engineers in US has no green idea why Darwin kernel panics on their
> XServe, so they suggested me replace mother board TWICE and keep OLDER
> firmware and never upgrade, since it will cause crash again (although
> identical server works just fine with newest firmware)! I told them
> NNN times that traceback of Darwin kernel was yelling about ACPI
> problem and gave them logs/tracebacks/transcripts etc, but they still
> have no idea where is the problem. Do I need such "support"? No. Not
> at all.
>
> --
> Kind regards, BM
>
> Things, that are stupid at the beginning, rarely ends up wisely.
> ___
>
>

You're clearly talking about something completely different from everyone
else.  Whitebox works GREAT if you've got 20 servers.  Try scaling it to
10,000.  "A couple of extras" ends up being an entire climate-controlled
warehouse full of parts that may or may not be in the right city.  Not to
mention you've then got full-time staff on hand constantly replacing parts.
Your model doesn't scale for 99% of businesses out there.  Unless they're
Google, and can leave a dead server in a rack for years, it's an
unsustainable plan.  Out of the Fortune 500, I'd be willing to bet there
are exactly zero companies that use whitebox systems, and for a reason.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Legality and the future of zfs...

2010-07-15 Thread Tim Cook
On Thu, Jul 15, 2010 at 1:50 AM, BM  wrote:

> On Thu, Jul 15, 2010 at 1:51 PM, Tim Cook  wrote:
> > Not to mention you've then got full-time staff on-hand to constantly be
> replacing
> > parts.
>
> Maybe I don't understand something, but we also had on-hand full-time
> staff to constantly replacing Dell's parts..., so what's the problem?
> Dell or HP or Sun are crashing exactly as same as SuperMicro machines
> (well, not really: Dell is more horrible, if you ask). Vendor, that
> sells us SuperMicro boxes offers as same support as we could get from
> HP or Dell. So all we do is simply pull out off the rack the thing and
> let vendor takes care of it. Machines are built automatically from the
> kickstart.
>
> What exactly I am missing then?
>


I'm not sure why you would intentionally hire someone to be on staff just
to watch a tech from Dell come out and swap a part...  I'm starting to
think you HAVEN'T actually had any enterprise-class boxes, because your
description of the service and what you get doesn't match reality at all.



> > Your model doesn't scale for 99% of businesses out there. Unless
> > they're google, and they can leave a dead server in a rack for years,
> it's
> > an unsustainable plan.
>
> Not sure what you're talking about here, but if I run a cluster, then
> I am probably OK if some node[s] gone. :)
>
> Now, how it does not scales, if the vendor that works with IBM
> directly (in my case there is no real IBM in the über-country I am
> living but a third-party company that only merchandizing the name)
> came and took my hardware for repair. Vendor that works with the Dell
> (same situation) directly came and took my hardware for repair. Vendor
> that works with HP directly came and took my hardware for repair.
>

What are you talking about?  They don't "come and take your hardware".  If
you're paying for a proper service contract, a tech brings the replacement
to your site and swaps the defective part (or the whole chassis) right in
your datacenter.  Again, you're talking like you've never owned a piece of
enterprise hardware with a proper support contract.


> Apple officially NOT repairing their XServe, but give parts to a
> third-party company that does the same to HP or IBM (!) or Dell or
> Supermicro — that happens in the country I am living, yes. And now the
> vendor that works directly with Supermicro took my hardware for repair
> on the same conditions as others. In any case, no matter what box
> (white, black, beige, silver, green, red, purple) I still
> experiencing:
>

Apple isn't an enterprise-class server provider; I'm not even sure why
you'd bring them into the conversation, other than that, once again, I
think you have no idea what we're talking about.



> 1. A downtime of the box (obviously).
> 2. A chain of phonecalls to support, language of which could be more
> censored.
> 3. A vendor coming and taking a brick with himself.
> 4. A some time for repair taking a while.
> 5. A smile from the vendor, when they returning the box back to the DC.
>

Not how it works, not even close.  If you've got a contract and you've got
a bad piece of hardware, it's generally one call, with a tech onsite in
four hours to fix the problem.


>
> This sequence yields to all the vendors I've mentioned.
>
>
No, it doesn't.



> Now, what exactly is the problem other than just scary grandma's
> stories that my model does not scales and big snow bear will eat me
> alive? I have to admit that I have no experience running 10K servers
> in one block like you do, so my respect is to you and I'd like to know
> the exact problems I might step into and the solution to avoid. Since
> you running this amount of machines, so you know it and you can share
> the experience. But from what I do have experience, I can not foresee
> some additional problems that we have with HP or Dell or Sun or IBM
> boxes.
>
>
Again, where are you planning on keeping all the spare parts required to
service boxes on your own?  Who is going to manage your inventory?  Who is
going to be on staff to replace parts?


> So could you please elaborate your statements? I would appreciate that
> (and some other folks here as well would be interested to listen to
> your lesson).
>
> Thank you.
>
> --
> Kind regards, BM
>
> Things, that are stupid at the beginning, rarely ends up wisely.
>


Gladly.  It's clear you haven't actually ever had a service call with a
proper 4-hour support contract from any major vendor.  The "steps" you
describe above aren't even close to how it actually works.  Once again:
zero companies in the Fortune 500.  You can continue to rant about how
great whiteboxes are, but the reality is that they don't scale.  You can
break it down any way you'd like and that isn't changing.  If I didn't know
any better, I'd think you're just another internet troll.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Legality and the future of zfs...

2010-07-15 Thread Tim Cook
On Thu, Jul 15, 2010 at 9:09 AM, David Dyer-Bennet  wrote:

>
> On Wed, July 14, 2010 23:51, Tim Cook wrote:
> > On Wed, Jul 14, 2010 at 9:27 PM, BM  wrote:
> >
> >> On Thu, Jul 15, 2010 at 12:49 AM, Edward Ned Harvey
> >>  wrote:
> >> > I'll second that.  And I think this is how you can tell the
> >> difference:
> >> > With supermicro, do you have a single support number to call and a
> >> 4hour
> >> > onsite service response time?
> >>
> >> Yes.
> >>
> >> BTW, just for the record, people potentially have a bunch of other
> >> supermicros in a stock, that they've bought for the rest of the money
> >> that left from a budget that was initially estimated to get shiny
> >> Sun/Oracle hardware. :) So normally you put them online in a cluster
> >> and don't really worry that one of them gone — just power that thing
> >> down and disconnect from the whole grid.
> >>
> >> > When you pay for the higher prices for OEM hardware, you're paying for
> >> the
> >> > knowledge of parts availability and compatibility. And a single point
> >> > vendor who supports the system as a whole, not just one component.
> >>
> >> What exactly kind of compatibility you're talking about? For example,
> >> if I remove my broken mylar air shroud for X8 DP with a
> >> MCP-310-18008-0N number because I step on it accidentally :-D, pretty
> >> much I think I am gonna ask them to replace exactly THAT thing back.
> >> Or you want to let me tell you real stories how OEM hardware is
> >> supported and how many emails/phonecalls it involves? One of the very
> >> latest (just a week ago): Apple Support reported me that their
> >> engineers in US has no green idea why Darwin kernel panics on their
> >> XServe, so they suggested me replace mother board TWICE and keep OLDER
> >> firmware and never upgrade, since it will cause crash again (although
> >> identical server works just fine with newest firmware)! I told them
> >> NNN times that traceback of Darwin kernel was yelling about ACPI
> >> problem and gave them logs/tracebacks/transcripts etc, but they still
> >> have no idea where is the problem. Do I need such "support"? No. Not
> >> at all.
> >>
> >> --
> >> Kind regards, BM
> >>
> >> Things, that are stupid at the beginning, rarely ends up wisely.
> >> ___
> >>
> >>
> >
> > You're clearly talking about something completely different than everyone
> > else.  Whitebox works GREAT if you've got 20 servers.  Try scaling it to
> > 10,000.  "A couple extras" ends up being an entire climate controlled
> > warehouse full of parts that may or may not be in the right city.  Not to
> > mention you've then got full-time staff on-hand to constantly be
> replacing
> > parts.  Your model doesn't scale for 99% of businesses out there.  Unless
> > they're google, and they can leave a dead server in a rack for years,
> it's
> > an unsustainable plan.  Out of the fortune 500, I'd be willing to bet
> > there's exactly zero companies that use whitebox systems, and for a
> > reason.
>
> You might want to talk to Google about that; as I understand it they
> decided that buying expensive servers was a waste of money precisely
> because of the high numbers they needed.  Even with the good ones, some
> will fail, so they had to plan to work very well through server failures,
> so they can save huge amounts of money on hardware by buying cheap servers
> rather than expensive ones.
>
>
Obviously someone was going to bring up Google, whose business model is
unique and doesn't really apply to anyone else.  Google makes it work
because they order so many thousands of servers at a time that they can
demand custom-made parts, built to their specifications.  Furthermore, the
clustering and filesystem they use wouldn't function at all for 99% of the
workloads out there.  Their core application, search, is what makes the
hardware they use possible.  If they were serving up a highly transactional
database that required millisecond latency, it would be a different story.



> And your juxtaposition of "fortune 500" and "99% of businesses" is
> significant; possibly the Fortune 500, other than Google, use expensive
> proprietary hardware; but 99% of businesses out there are NOT in the
> Fortune 500, and mostly use whitebox systems (and not rackmount at all;

Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Tim Cook
On Wed, Aug 11, 2010 at 7:27 PM, Edward Ned Harvey wrote:

> > From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> > boun...@opensolaris.org] On Behalf Of Paul Kraus
> >
> >I am looking for references of folks using ZFS with either NFS
> > or iSCSI as the backing store for VMware (4.x) backing store for
>
> I'll try to clearly separate what I know, from what I speculate:
>
> I know you can do either one, NFS or iscsi served by ZFS for the backend
> datastore used by ESX.  I know (99.9%) that vmware will issue sync-mode
> operations in both cases.  Which means you are strongly encouraged to use a
> mirrored dedicated log device, presumably SSD or some sort of high IOPS low
> latency devices.
>
> I speculate that iscsi will perform better.  If you serve it up via NFS,
> then vmware is going to create a file in your NFS filesystem, and inside
> that file it will create a new filesystem.  So you get twice the filesytem
> overhead.  Whereas in iscsi, ZFS presents a raw device to VMware, and then
> vmware maintains its filesystem in that.
>
>
>
That's not true at all.  Whether you use iSCSI or NFS, VMware is laying
down a file which it presents as a disk to the guest VM, which then formats
it with its own filesystem.  That's the advantage of virtualization: you've
got a big, hardware-agnostic file you can pick up and move anywhere.  With
iSCSI, you're forced to use VMFS, which is an adaptation of the Legato
clustered filesystem from the early '90s.  It is nowhere near as robust as
NFS, and I can't think of a reason you would use it if given the choice,
short of a massive pre-existing investment in Fibre Channel.  With NFS
you're simply using ZFS; there is no VMFS to worry about.  You don't have
to have another ESX box if something goes wrong; any host with an NFS
client can mount the share and diagnose the VMDK.
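Either way, if you do add a log device for ESX's sync writes, something
like

  # zpool iostat -v tank 1

(pool name assumed) lists the log vdev separately, so you can verify the
slog is actually absorbing the NFS or iSCSI sync traffic.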

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Tim Cook
>
>
>
> This is not entirely correct either. You're not forced to use VMFS.
>

It is entirely true.  You absolutely cannot use ESX with a guest on a block
device without formatting the LUN with VMFS.  You are *FORCED* to use VMFS.


You can format the LUN with VMFS, then put VM files inside the VMFS; in this
> case you get the Guest OS filesystem inside a VMDK file on the VMFS
> filesystem inside a LUN/ZVOL on your ZFS filesystem. You can also set up Raw
> Device Mapping (RDM) directly to a LUN, in which case you get the Guest OS
> filesystem inside the LUN/ZVOL on your ZFS filesystem. There has to be VMFS
> available somewhere to store metadata, though.
>
>
You cannot boot a VM off an RDM.  You *HAVE* to use VMFS with block devices
for your guest operating systems.  Regardless, we aren't talking about
RDMs; we're talking about storing virtual machines.


It was and may still be common to use RDM for VMs that need very high IO
> performance. It also used to be the only supported way to get thin
> provisioning for an individual VM disk. However, VMware regularly makes a
> lot of noise about how VMFS does not hurt performance enough to outweigh its
> benefits anymore, and thin provisioning has been a native/supported feature
> on VMFS datastores since 4.0.
>
> I still think there are reasons why iSCSI would be better than NFS and vice
> versa.
>
>
I'd love for you to name one.  Short of a piss-poor NFS server
implementation, I've never once seen iSCSI beat out NFS in a VMware
environment.  I have, however, seen countless examples of their "clustered
filesystem" causing permanent SCSI locks on a LUN that take an entire
datastore offline.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Tim Cook
>
>
>
> My understanding is that if you wanted to use MS Cluster Server, you'd need
> to use a LUN as an RDM for the quorum drive. VMDK files are locked when
> open, so they can't typically be shared. VMware's Fault Tolerance gets
> around this somehow, and I have a suspicion that their Lab Manager product
> does as well.
>
>
Right, but again, we're talking about storing virtual machines, not RDMs.
Using MSCS on top of VMware rarely makes any sense, and MS is doing their
damnedest to make it as painful as possible for those who try anyway.
There's nothing stopping you from putting your virtual machine on an NFS
datastore and mounting a LUN directly to the guest OS with a software iSCSI
client, cutting out the middleman and bypassing the RDM entirely; RDMs just
add yet another headache when it comes to things like SRM and vMotion.




> I don't think you can use VMware's built-in multipathing with NFS. Maybe
> it's possible, it doesn't look that way but I'm not going to verify it one
> way or the other. There are probably better/alternative ways to achieve the
> same thing with NFS.
>

You can achieve the same thing with a little bit of forethought on your
network design.  No, ALUA is not compatible with NFS, it is a block protocol
feature.  Then again, ALUA is also not compatible with the MSCS example you
listed above.




> The new VAAI stuff that VMware announced with vSphere 4.1 does not support
> NFS (yet), it only works with storage servers that implement the requires
> commands.
>
>
VAAI is an attempt to give block storage more NFS-like features (for
instance, finer-grained locking, which NFS already has by default).  The
"features" are basically useless in an NFS environment on intelligent
storage.



The locked LUN thing has happened to me once. I've had more trouble with
> thin provisioning and negligence leading to a totally-full VMFS, which is
> irritating to recover from, and moved/restored luns needing VMFS
> resignaturing, which is also irritating.
>
> I don't want to argue with you about the other stuff.
>
>
Which is why block with VMware blows :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Opensolaris is apparently dead

2010-08-13 Thread Tim Cook
http://www.theregister.co.uk/2010/08/13/opensolaris_is_dead/

I'm a bit surprised at this development... Oracle really just doesn't get
it.  The part that's most disturbing to me is the fact that they won't be
releasing nightly snapshots.  It appears they've stopped Illumos in its
tracks before it really even got started (perhaps that explains the timing
of this press release), as well as killed the OpenSolaris community.

Quite frankly, I think there will be an even faster decline of the Solaris
installed base after this move.  I know I have no interest in pushing it
anywhere after this mess.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-13 Thread Tim Cook
On Fri, Aug 13, 2010 at 3:54 PM, Erast  wrote:

>
>
> On 08/13/2010 01:39 PM, Tim Cook wrote:
>
>> http://www.theregister.co.uk/2010/08/13/opensolaris_is_dead/
>>
>> I'm a bit surprised at this development... Oracle really just doesn't
>> get it.  The part that's most disturbing to me is the fact they won't be
>> releasing nightly snapshots.  It appears they've stopped Illumos in its
>> tracks before it really even got started (perhaps that explains the
>> timing of this press release)
>>
>
> Wrong. Be patient, with the pace of current Illumos development it soon
> will have all the closed binaries liberated and ready to sync up with
> promised ON code drops as dictated by GPL and CDDL licenses.
>


Given the path they are heading down, there's absolutely zero guarantee
that new features added to Solaris will be released under the CDDL.
Furthermore, there's nothing guaranteeing the community will be able to
reproduce those features on its own if things shut down further.  That's
clearly by design.

Obviously Illumos can fork, but that's still 'stopped dead in its tracks'
as far as I'm concerned.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-15 Thread Tim Cook
On Sun, Aug 15, 2010 at 9:48 AM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Sun, 15 Aug 2010, David Magda wrote:
>
>>
>> But that US$ 400 was only if you wanted support. For the last little while
>> you could run Solaris 10 legally without a support contract without issues.
>>
>
> The $400 number is bogus since the amount that Oracle quotes now depends on
> the value of the hardware that the OS will run on.  For my old SPARC Blade
> 2500 (which will probably not go beyond Solaris 10), the OS support cost was
> only in the $60-70 range.  On a brand-new high-end system, the cost is
> higher.  The OS support cost on a million dollar system would surely be
> quite high but owners of such systems will surely pay for system support
> rather than just OS support and care very much that their system continues
> running.
>
> The previous Sun software support pricing model was completely bogus. The
> Oracle model is also bogus, but at least it provides a means for an
> entry-level user to be able to afford support.
>
> Bob
>
>

The cost discussion is ridiculous, period.  $400 is a steal for support.
 You'll pay 3x or more for the same thing from Redhat or Novell.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! Dedup delete FS advice needed!!

2010-08-15 Thread Tim Cook
On Sun, Aug 15, 2010 at 2:30 PM, Marc Emmerson wrote:

> Hi all,
> I have a 10TB array (zpool = 2x 5 disk raidz1), I had dedup enabled on a
> couple of filesystems which I decided to delete last week, the first
> contained about 6GB of data and was deleted in about 30 minutes, the second
> (about 100GB of VMs) is still being deleted (I think) 4.5 days later!
>
> Now, I've seen delete "dedup enabled fs" operations take a while before (2
> days) but 4.5 days is a surprise.
>
> I am wondering what (if anything) I can do to speed this up, my server only
> has 4GB RAM, would it be beneficial/safe for me to switch off, upgrade to
> 8GB?  I am assuming this may help the delete operation as more memory should
> mean that more of the dedup table is stored in RAM?
>
> Or is there anything else I can do to speed things up or indeed determine
> how much longer left?
>
> I'd appreciate any advice, cheers
>
>
It would be extremely beneficial for you to switch off and upgrade to 8GB.
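
To put a rough number on it, you can check how big the dedup table actually
is (the pool name below is just an example, and the bytes-per-entry figure is
only the rule of thumb that gets thrown around on this list, not an exact
value):

  # zdb -D tank     (summary of DDT entries and sizes)
  # zdb -DD tank    (adds the full DDT histogram)

Multiply the entry count by something like 250-320 bytes to estimate the
in-core footprint.  If that lands well above 4GB, every block freed by the
delete is forcing DDT reads from disk, which is exactly the kind of grind
you're describing, and 8GB (or an L2ARC device) should make a real
difference.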

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-16 Thread Tim Cook
On Mon, Aug 16, 2010 at 10:21 AM, David Dyer-Bennet  wrote:

>
> On Sun, August 15, 2010 20:44, Peter Jeremy wrote:
>
> > Irrespective of the above, there is nothing requiring Oracle to release
> > any future btrfs or ZFS improvements (or even bugfixes).  They can't
> > retrospectively change the license on already released code but they
> > can put a different (non-OSS) license on any new code.
>
> That's true.
>
> However, if Oracle makes a binary release of BTRFS-derived code, they must
> release the source as well; BTRFS is under the GPL.
>

BTRFS can be under any license they want, they own the code.  There's
absolutely nothing preventing them from dual-licensing it.


>
> So, if they're going to use it in any way as a product, they have to
> release the source.  If they want to use it just internally they can do
> anything they want, of course.
>
>
No, no they don't.  You're under the misconception that they no longer own
the code just because they released a copy as GPL.  That is not true.
 Anyone ELSE who uses the GPL code must release modifications if they wish
to distribute it due to the GPL.  The original author is free to license the
code as many times under as many conditions as they like, and release or not
release subsequent changes they make to their own code.

I absolutely guarantee Oracle can and likely already has dual-licensed
BTRFS.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-16 Thread Tim Cook
On Mon, Aug 16, 2010 at 10:40 AM, Ray Van Dolson wrote:

> On Mon, Aug 16, 2010 at 08:35:05AM -0700, Tim Cook wrote:
> > No, no they don't.  You're under the misconception that they no
> > longer own the code just because they released a copy as GPL.  That
> > is not true.  Anyone ELSE who uses the GPL code must release
> > modifications if they wish to distribute it due to the GPL.  The
> > original author is free to license the code as many times under as
> > many conditions as they like, and release or not release subsequent
> > changes they make to their own code.
> >
> > I absolutely guarantee Oracle can and likely already has
> > dual-licensed BTRFS.
>
> Well, Oracle obviously would want btrfs to stay as part of the Linux
> kernel rather than die a death of anonymity outside of it...
>
> As such, they'll need to continue to comply with GPLv2 requirements.
>
>

Why would they obviously want that?  When the project started, they were
competing with Sun.  They now own Solaris; they no longer have a need to
produce a competing product.  I would be EXTREMELY surprised to see Oracle
continue to push Linux as hard as they have in the past, over the next 5
years.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-16 Thread Tim Cook
2010/8/16 "C. Bergström" 

> Joerg Schilling wrote:
>
>> "C. Bergström"  wrote:
>>
>>
>>
>>> I absolutely guarantee Oracle can and likely already has dual-licensed
 BTRFS.


>>> No.. talk to Chris Mason.. it depends on the linux kernel too much
>>> already to be available under anything, but GPLv2
>>>
>>>
>>
> If he really believes this, then he seems to be misinformed about legal
>> background.
>> The question is: who wrote the btrfs code and who owns it.
>>
>> If Oracle pays him for writing the code, then Oracle owns the code and can
>> relicense it under any license they like.
>>
>>
> Why don't all you license trolls go crawl under a rock.. Are you so dense
> to believe
>
> 1) Only Oracle devs have by now contributed to btrfs?
> 2) That it's so tightly intermingled with the linux kernel code you can't
> separate the two of them.
>
> Just STFU already and go check commit logs and source if you don't
> believe..
>
> ZFS-discuss != BTRFS+Oracle-license troll-ml
>

Before making yourself look like a fool, I suggest you look at the BTRFS
commits.  Can you find a commit submitted by anyone BUT Oracle employees?
 I've yet to see any significant contribution from anyone outside the walls
of Oracle to the project.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-16 Thread Tim Cook
On Mon, Aug 16, 2010 at 11:08 AM, Ray Van Dolson wrote:

> On Mon, Aug 16, 2010 at 08:57:19AM -0700, Joerg Schilling wrote:
> > "C. Bergström"  wrote:
> >
> > > > I absolutely guarantee Oracle can and likely already has
> dual-licensed
> > > > BTRFS.
> > > No.. talk to Chris Mason.. it depends on the linux kernel too much
> > > already to be available under anything, but GPLv2
> >
> > If he really believes this, then he seems to be misinformed about legal
> > background.
> >
> > The question is: who wrote the btrfs code and who owns it.
> >
> > If Oracle pays him for writing the code, then Oracle owns the code and
> can
> > relicense it under any license they like.
> >
> > Jörg
>
> I don't think anyone is arguing that Oracle can relicense their own
> copyrighted code as they see fit.
>
> The real question is, WHY would they do it?  What would be the business
> motivation here?  Chris Mason would most likely leave Oracle, Red Hat
> would hire him and fork the last GPL'd version of btrfs and Oracle
> would have relegated itself to a non-player in the Linux filesystem
> space...
>
> So, yes, they can do it if they want, I just think they're not THAT
> stupid. :)
>
>
>
Or, for all you know, Chris Mason's contract has a non-compete that states
if he leaves Oracle he's not allowed to work on any project he was a part of
for five years.

The "business motivation" would be to set the competition back a decade.


--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-16 Thread Tim Cook
2010/8/16 "C. Bergström" 

> Tim Cook wrote:
>
>>
>>
>> 2010/8/16 "C. Bergström" > codest...@osunix.org>>
>>
>>
>>Joerg Schilling wrote:
>>
>>"C. Bergström" ><mailto:codest...@osunix.org>> wrote:
>>
>>
>>I absolutely guarantee Oracle can and likely already
>>has dual-licensed BTRFS.
>>
>>No.. talk to Chris Mason.. it depends on the linux kernel
>>too much already to be available under anything, but GPLv2
>>
>>
>>If he really believes this, then he seems to be missinformed
>>about legal background.
>>The question is: who wrote the btrfs code and who owns it.
>>
>>If Oracle pays him for writing the code, then Oracle owns the
>>code and can relicense it under any license they like.
>>
>>Why don't all you license trolls go crawl under a rock.. Are you
>>so dense to believe
>>
>>1) Only Oracle devs have by now contributed to btrfs?
>>2) That it's so tightly intermingled with the linux kernel code
>>you can't separate the two of them.
>>
>>Just STFU already and go check commit logs and source if you don't
>>believe..
>>
>>ZFS-discuss != BTRFS+Oracle-license troll-ml
>>
>>
>> Before making yourself look like a fool, I suggest you look at the BTRFS
>> commits.  Can you find a commit submitted by anyone BUT Oracle employees?
>>  I've yet to see any significant contribution from anyone outside the walls
>> of Oracle to the project.
>>
> I think I've probably dug into the issue a bit deeper than you..
>
> http://www.codestrom.com/wandering/2009/03/zfs-vs-btrfs-comparison.html
>
> Oh. .and if you don't believe me ask Josef Bacik from RH..
>
> I'm not directing this at anyone specifically..  Pretty please..  STFU and
> go back to trolling somewhere else...
>
>
Nobody here appears to be trolling beyond you.  The rest of us were having a
civilized conversation prior to you feeling the need to start throwing out
insults.  Oracle can pull the plug at any time they choose.  *ONE* developer
from Redhat does not change the fact that Oracle owns the rights to the
majority of the code, and can relicense it, or discontinue code updates, as
they see fit.

Grow up.


--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Quickest way to find files with cksum errors without doing scrub

2009-09-28 Thread Tim Cook
On Mon, Sep 28, 2009 at 12:16 PM, Richard Elling
wrote:

> On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:
>
>  On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
>>
>>> On Mon, 28 Sep 2009, Richard Elling wrote:
>>>

 Scrub could be faster, but you can try
tar cf - . > /dev/null

 If you think about it, validating checksums requires reading the data.
 So you simply need to read the data.

>>>
>>> This should work but it does not verify the redundant metadata.  For
>>> example, the duplicate metadata copy might be corrupt but the problem
>>> is not detected since it did not happen to be used.
>>>
>>
>> Too bad we cannot scrub a dataset/object.
>>
>
> Can you provide a use case? I don't see why scrub couldn't start and
> stop at specific txgs for instance. That won't necessarily get you to a
> specific file, though.
>  -- richard
>
>

I get the impression he just wants to check a single file in a pool without
waiting for it to check the entire pool.
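
If that's all he's after, simply reading the file end to end is enough to
make ZFS verify every block checksum along the way (the path is made up, and
the file can't already be sitting in the ARC or the read never touches disk):

  # dd if=/tank/data/somefile of=/dev/null bs=1M

Any checksum error it trips over shows up in zpool status just as it would
during a scrub.  What you don't get, as Bob points out, is verification of
the redundant copies.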

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] "Hot Space" vs. hot spares

2009-09-30 Thread Tim Cook
On Wed, Sep 30, 2009 at 7:06 PM, Brandon High  wrote:

> I might have this mentioned already on the list and can't find it now,
> or I might have misread something and come up with this ...
>
> Right now, using hot spares is a typical method to increase storage
> pool resiliency, since it minimizes the time that an array is
> degraded. The downside is that drives assigned as hot spares are
> essentially wasted. They take up space & power but don't provide
> usable storage.
>
> Depending on the number of spares you've assigned, you could have 7%
> of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
> This is on top of the RAID6 / raidz[1-3] overhead.
>
> What about using the free space in the pool to cover for the failed drive?
>
> With bp rewrite, would it be possible to rebuild the vdev from parity
> and simultaneously rewrite those blocks to a healthy device? In other
> words, when there is free space, remove the failed device from the
> zpool, resizing (shrinking) it on the fly and restoring full parity
> protection for your data. If online shrinking doesn't work, create a
> phantom file that accounts for all the space lost by the removal of
> the device until an export / import.
>
> It's not something I'd want to do with less than raidz2 protection,
> and I imagine that replacing the failed device and expanding the
> stripe width back to the original would have some negative performance
> implications that would not occur otherwise. I also imagine it would
> take a lot longer to rebuild / resilver at both device failure and
> device replacement. You wouldn't be able to share a spare among many
> vdevs either, but you wouldn't always need to if you leave some space
> free on the zpool.
>
> Provided that bp rewrite is committed, and vdev & zpool shrinks are
> functional, could this work? It seems like a feature most applicable
> to SOHO users, but I'm sure some enterprise users could find an
> application for nearline storage where available space trumps
> performance.
>
> -B
>
> --
> Brandon High : bh...@freaks.com
> Always try to do things in chronological order; it's less confusing that
> way.
>


What are you hoping to accomplish?  You're still going to need a drive's
worth of free space, and if you're so performance-strapped that one drive
makes the difference, you've got some bigger problems on your hands.

To me it sounds like complexity for complexity's sake, and leaving yourself
with a far less flexible option in the face of a drive failure.

BTW, you shouldn't need one spare per tray of 14 disks.  Unless you've got
some known bad disks or environmental issues, one spare for every 2-3 trays
should be fine.  Quite frankly, if you're doing raid-z3, I'd feel comfortable
with one per thumper.
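
For reference, adding a shared spare to an existing pool is a one-liner
(the device name below is made up):

  # zpool add tank spare c5t0d0

and the same disk can be listed as a spare in more than one pool on the same
host, which keeps the number of idle spindles down.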

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed

2009-10-13 Thread Tim Cook
On Tue, Oct 13, 2009 at 8:54 AM, Aaron Brady  wrote:

> All's gone quiet on this issue, and the bug is closed, but I'm having
> exactly the same problem; pulling a disk on this card, under OpenSolaris
> 111, is pausing all IO (including, weirdly, network IO), and using the ZFS
> utilities (zfs list, zpool list, zpool status) causes a hang until I replace
> the disk.
> --
>


Did you set your failmode to continue?


--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE

2009-10-13 Thread Tim Cook
On Tue, Oct 13, 2009 at 8:24 AM, Derek Anderson wrote:

> Before you all start taking bets, I am having a difficult time
> understanding why you would.   If you think I am nuts because SSD's have a
> limited lifespan, I would agree with you, however we all know that SSD's are
> going to get cheaper and cheaper as the days go by.  The Intels I bought in
> April are half the price now they were then.  So are the Samsungs.   I
> suspect that by next spring, I will replace them all with new ones and they
> will be half the cost they are now.   Why would anyone spend 3K on disks and
> just toss it in the river?
>
> Simple answer:  Man hour math.  I have 150 virtual machines on these disks
> for shared storage.  They hold no actual data so who really cares if they
> get lost.  However 150 users of these virtual machines will save 5 minutes
> or so every day of work, which translates to $250.   So $3,000 in SSD's
> which are easily replaced one by one with zfs saves the company $250,000 in
> labor.  So when I replace these drives in 6 months, for somewhere around
> $1500 its a fantastic deal.
>
> The only bad part is I cannot estimate how much of the old disks have life
> is left because in a few months, I am going to have a handful of the fastest
> SSD's around and not sure if I would trust them for much of anything.
>
> Am I really that wrong?
>
> Derek
>

I'll take them when you're done :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed

2009-10-13 Thread Tim Cook
On Tue, Oct 13, 2009 at 9:42 AM, Aaron Brady  wrote:

> I did, but as tcook suggests running a later build, I'll try an
> image-update (though, 111 > 2008.11, right?)
>


It should be, yes.  b111 was released in April of 2009.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fishworks on x4275?

2009-10-16 Thread Tim Cook
On Fri, Oct 16, 2009 at 1:05 PM, Frank Cusack  wrote:

> Apologies if this has been covered before, I couldn't find anything
> in my searching.
>
> Can the software which runs on the 7000 series servers be installed
> on an x4275?
>
> -frank
>


Fishworks can only be run on systems purchased as a 7000 series, Sun will
not support it on anything else.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fishworks on x4275?

2009-10-16 Thread Tim Cook
On Fri, Oct 16, 2009 at 1:14 PM, Frank Cusack  wrote:

> On October 16, 2009 1:08:17 PM -0500 Tim Cook  wrote:
>
>> On Fri, Oct 16, 2009 at 1:05 PM, Frank Cusack 
>> wrote:
>>
>>> Can the software which runs on the 7000 series servers be installed
>>> on an x4275?
>>>
>>
>> Fishworks can only be run on systems purchased as a 7000 series, Sun will
>> not support it on anything else.
>>
>
> I don't care about "support", I only care if it can be convinced to run
> on another hardware.  I guess the answer is no.  Kind of ashame, really.
>
> thanks
> -frank
>


I'm sure you could convince it to work if you could get a copy of it.  I
just don't know why you'd bother since there's no guarantee it won't munch
data.  You might as well just use the simulator.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20

2009-10-20 Thread Tim Cook
On Tue, Oct 20, 2009 at 10:23 AM, Robert Dupuy wrote:

> A word of caution, be sure not to read a lot into the fact that the F20 is
> included in the Exadata Machine.
>
> >From what I've heard the flash_cache feature of 11.2.0 Oracle that was
> enabled in beta, is not working in the production release, for anyone except
> the Exadata 2.
>
> The question is, why did they need to give this machine an unfair software
> advantage?  Is it because of the poor performance they found with the F20?
>
> Oracle bought Sun, they have reason to make such moves.
>
> I have been talking to a Sun rep for weeks now, trying to get the latency
> specs on this F20 card, with no luck in getting that revealed so far.
>
> However, you can look at Sun's other products like the F5100, which are
> very unimpressive and high latency.
>
> I would not assume this Sun tech is in the same league as a Fusion-io
> ioDrive, or a Ramsan-10.  They would not confirm whether its a native PCIe
> solution, or if the reason it comes on a SAS card, is because it requires
> SAS.
>
> So, test, test, test, and don't assume this card is competitive because it
> came out this year, I am not sure its even competitive with last years
> ioDrive.
>
> I told my sun reseller that I merely needed it to be faster than the Intel
> X25-E in terms of latency, and they weren't able to demonstrate that, at
> least so far...lots of feet dragging, and I can only assume they want to
> sell as much as they can, before the cards metrics become widely known.
> --
>


That's an awful lot of assumptions with no factual basis for any of your
claims.

As for your bagging on the F5100... what exactly is your problem with its
latency?  Assuming you aren't using absurdly large block sizes, it would
appear to fly.  0.15ms is bad?
http://blogs.sun.com/BestPerf/entry/1_6_million_4k_iops

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20

2009-10-20 Thread Tim Cook
On Tue, Oct 20, 2009 at 3:58 PM, Robert Dupuy wrote:

> "there is no consistent latency measurement in the industry"
>
> You bring up an important point, as did another poster earlier in the
> thread, and certainly its an issue that needs to be addressed.
>
> "I'd be surprised if anyone could answer such a question while
> simultaneously being credible."
>
>
> http://download.intel.com/design/flash/nand/extreme/extreme-sata-ssd-product-brief.pdf
>
> Intel:  X-25E read latency 75 microseconds
>
> http://www.sun.com/storage/disk_systems/sss/f5100/specs.xml
>
> Sun:  F5100 read latency 410 microseconds
>
> http://www.fusionio.com/PDFs/Data_Sheet_ioDrive_2.pdf
>
> Fusion-IO:  read latency less than 50 microseconds
>
> Fusion-IO lists theirs as .05ms
>
>
> I find the latency measures to be useful.
>
> I know it isn't perfect, and I agree benchmarks can be deceiving, heck I
> criticized one vendors benchmarks in this thread already :)
>
> But, I did find, that for me, I just take a very simple, single thread,
> read as fast you can approach, and get the # of random access per second, as
> one type of measurement, that gives you some data, on the raw access ability
> of the drive.
>
> No doubt in some cases, you want to test multithreaded IO too, but my
> application is very latency sensitive, so this initial test was telling.
>
> As I got into the actual performance of my app, the lower latency drives,
> performed better than the higher latency drives...all of this was on SSD.
>
> (I did not test the F5100 personally, I'm talking about the SSD drives that
> I did test).
>
> So, yes, SSD and HDD are different, but latency is still important.
>


Timeout, rewind, etc.  What workload do you have where 410 microseconds of
latency is detrimental?  More to the point, what workload do you have where
you'd rather have 5-microsecond latency with 1/10th the IOPS?  Whatever it
is, I've never run across such a workload in the real world.  It sounds like
you're comparing paper numbers for the sake of comparison, rather than to
solve a real-world problem...

BTW, latency alone does not give you "# of random access per second"; a
5-microsecond latency for one access != # of random accesses per second,
sorry.
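
Rough back-of-the-envelope: with a single outstanding request, IOPS is at
best 1 / latency, so 410 microseconds caps one thread at roughly 2,400 IOPS,
while the same device with, say, 32 requests in flight can be doing tens of
thousands of IOPS at that very same per-request latency.  A latency figure
only turns into an IOPS figure once you also know the queue depth.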
--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Flash Accelerator F20

2009-10-21 Thread Tim Cook
On Wed, Oct 21, 2009 at 9:15 PM, Jake Caferilla  wrote:

> Clearly a lot of people don't understand latency, so I'll talk about
> latency, breaking it down in simpler components.
>
> Sometimes it helps to use made up numbers, to simplify a point.
>
> Imagine a non-real system that had these 'ridiculous' performance
> characteristics:
>
> The system has a 60 second (1 minute) read latency.
> The system can scale dramatically, it can do 60 billion IO's per minute.
>
> Now some here are arguing about the term latency, but its rather a simple
> term.
> It simply means the amount of time it takes, for data to move from one
> point to another.
>
> And some here have argued there is no good measurement of latency, but also
> it very simple.
> It is measured in time units.
>
> OK, so we have a latency of 1 minute, in this 'explanatory' system.
>
> That means, I issued a read request, the Flash takes 1 minute to return the
> data requested to the program.
>
> But remember, this example system, has massive parallel scalability.
>
> I issue 2 read requests, both read requests return after 1 minute.
> I issue 3 read requests, all 3 return after 1 minute.
>
> I defined this made up system, as one, such that if you issue 60 billion
> read requests, they all return, simultaneously, after 1 minute.
>
> Let's do some math.
>
> 60,000,000,000 divided by 60 seconds, well this system does 1 billion IOPS!
>
> Wow, what wouldn't run fast with 1 billion IOPS?
>
> The answer, is, most programs would not, not with such a high latency as
> waiting 1 minute for data to return.  Most apps wouldn't run acceptably, no
> not at all.
>
> Imagine you are in Windows, or Solaris, or Linux, and every time you needed
> to go to disk, a 1 minute wait.  Wow, it would be totally unacceptable,
> despite the IOPS, latency matters.
>
> Certain types of apps wouldn't be latency sensitive, some people would love
> to have this 1 billion IOPs system :)
>
> The good news is, the F20 latency, even if we don

> flash has lower latency than traditional disks, that's part of what makes it
> competitive...and by the same token, flash with lower latency than other
> flash, has a competitive advantage.
>
> Some here say latency (that wait times) doesn't matter with flash.  That
> latency (waiting) only matters with traditional hard drives.
>
> Uhm, who told you that?  I've never heard someone make that case before,
> anywhere, ever.
>
> And lets give you credit and say you had some minor point to make about hdd
> and flash differences...still you are using it in such a way, that someone
> could draw the wrong conclusion, so. clarify this point, you are
> certainly not suggesting that higher wait times speeds up an application,
> correct?
>
> Or that the F20's latency cannot impact performace, right?  C'mon, some
> common sense? anyone?
>
>
Yet again, you're making up situations on paper.  We're dealing with the
real world, not theory.  So please, describe the electronics that have been
invented that can somehow take in 1 billion I/O requests, process them, and
have a memory back end that can return them, yet do absolutely nothing with
them for a full minute.  Even if you scale those numbers down, your theory is
absolutely ridiculous.

Of course, you also failed to address the other issue.  How exactly does a
drive have a .05ms response time, yet only provide 500 IOPS?  It's IMPOSSIBLE
for those numbers to work out.

But hey, lets ignore reality and just go with vendor numbers.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] new google group for ZFS on OSX

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 2:38 PM, Richard Elling wrote:

> FYI,
> The ZFS project on MacOS forge (zfs.macosforge.org) has provided the
> following announcement:
>
>ZFS Project Shutdown2009-10-23
>The ZFS project has been discontinued. The mailing list and
> repository will
>also be removed shortly.
>
> The community is migrating to a new google group:
>http://groups.google.com/group/zfs-macos
>
>  -- richard
>


Any official word from Apple on the abandonment?

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool with very different sized vdevs?

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 3:05 PM, Travis Tabbal  wrote:

> Hmm.. I expected people to jump on me yelling that it's a bad idea. :)
>
> How about this, can I remove a vdev from a pool if the pool still has
> enough space to hold the data? So could I add it in and mess with it for a
> while without losing anything? I would expect the system to resliver the
> data onto the remaining vdevs, or tell me to go jump off a pier. :)
> --
>


Jump off a pier.  Removing devices is not currently supported but it is in
the works.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 3:48 PM, Bruno Sousa  wrote:

> Could Sun'x x4540 Thumper reason to have 6 LSI's some sort of "hidden"
> problems found by Sun where the HBA resets, and due to market time pressure
> the "quick and dirty" solution was to spread the load over multiple HBA's
> instead of software fix?
>
> Just my 2 cents..
>
>
> Bruno
>
>
What else were you expecting them to do?  According to LSI's website, the
1068e in an x8 configuration is an 8-port card.
http://www.lsi.com/DistributionSystem/AssetDocument/files/docs/marketing_docs/storage_stand_prod/SCG_LSISAS1068E_PB_040407.pdf

While they could've used expanders, that just creates one more component
that can fail/have issues.  Looking at the diagram, they've taken the
absolute shortest I/O path possible, which is what I would hope to
see/expect.
http://www.sun.com/servers/x64/x4540/server_architecture.pdf

One drive per channel, 6 channels total.

I also wouldn't be surprised to find out that they found this the optimal
configuration from a performance/throughput/IOPS perspective as well.  Can't
seem to find those numbers published by LSI.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 6:32 PM, Adam Cheal  wrote:

> I don't think there was any intention on Sun's part to ignore the
> problem...obviously their target market wants a performance-oriented box and
> the x4540 delivers that. Each 1068E controller chip supports 8 SAS PHY
> channels = 1 channel per drive = no contention for channels. The x4540 is a
> monster and performs like a dream with snv_118 (we have a few ourselves).
>
> My issue is that implementing an archival-type solution demands a dense,
> simple storage platform that performs at a reasonable level, nothing more.
> Our design has the same controller chip (8 SAS PHY channels) driving 46
> disks, so there is bound to be contention there especially in high-load
> situations. I just need it to work and handle load gracefully, not timeout
> and cause disk "failures"; at this point I can't even scrub the zpools to
> verify the data we have on there is valid. From a hardware perspective, the
> 3801E card is spec'ed to handle our architecture; the OS just seems to fall
> over somewhere though and not be able to throttle itself in certain
> intensive IO situations.
>
> That said, I don't know whether to point the finger at LSI's firmware or
> mpt-driver/ZFS. Sun obviously has a good relationship with LSI as their
> 1068E is the recommended SAS controller chip and is used in their own
> products. At least we've got a bug filed now, and we can hopefully follow
> this through to find out where the system breaks down.
>
>
Have you checked in with LSI to verify the IOPS ability of the chip?  Just
because it supports having 46 drives attached to one ASIC doesn't mean it
can actually service all 46 at once.  You're talking (VERY conservatively)
2800 IOPS.

Even ignoring that, I know for a fact that the chip can't handle the raw
throughput of 46 disks unless you've got some very severe RAID overhead.
That chip is good for roughly 2GB/sec in each direction, and 46 7200RPM
drives can fairly easily push more than double that in streaming IO loads.
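
Rough numbers: 46 7200RPM spindles at a conservative 60-75 random IOPS apiece
is on the order of 2,800-3,400 IOPS, and at roughly 100MB/s apiece of
sequential reads they can source around 4.5GB/s between them, which is well
beyond what that one chip can move in a single direction.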

Long story short, it appears you've got a 50lbs load in a 5lbs bag...

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Checksums

2009-10-23 Thread Tim Cook
So, from what I gather, even though the documentation appears to state
otherwise, default checksums have been changed to SHA256.  Making that
assumption, I have two questions.

First, is the default updated from fletcher2 to SHA256 automatically for a
pool that was created with an older version of zfs and then upgraded to the
latest?  Second, would all of the blocks be re-checksummed with a zfs
send/receive on the receiving side?

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:17 PM, Adam Cheal  wrote:

> LSI's sales literature on that card specs "128 devices" which I take with a
> few hearty grains of salt. I agree that with all 46 drives pumping out
> streamed data, the controller would be overworked BUT the drives will only
> deliver data as fast as the OS tells them to. Just because the speedometer
> says 200 mph max doesn't mean we should (or even can!) go that fast.
>
> The IO intensive operations that trigger our timeout issues are a small
> percentage of the actual normal IO we do to the box. Most of the time the
> solution happily serves up archived data, but when it comes time to scrub or
> do mass operations on the entire dataset bad things happen. It seems a waste
> to architect a more expensive performance-oriented solution when you aren't
> going to use that performance the majority of the time. There is a balance
> between performance and functionality, but I still feel that we should be
> able to make this situation work.
>
> Ideally, the OS could dynamically adapt to slower storage and throttle its
> IO requests accordingly. At the least, it could allow the user to specify
> some IO thresholds so we can "cage the beast" if need be. We've tried some
> manual tuning via kernel parameters to restrict max queued operations per
> vdev and also a "scrub" related one (specifics escape me), but it still
> manages to overload itself.
> --
>

Where are you planning on queueing up those requests?  The scrub, I can
understand wanting throttling, but what about your user workload?  Unless
you're talking about EXTREMELY  short bursts of I/O, what do you suggest the
OS do?  If you're sending 3000 IOPS at the box from a workstation, where is
that workload going to sit if you're only dumping 500 IOPS to disk?  The
only thing that will change is that your client will timeout instead of your
disks.

I don't recall seeing what generates the I/O, but I do recall that it's
backup.  My assumption would be it's something coming in over the network,
in which case I'd say you're far, far better off throttling at the network
stack.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:17 PM, Richard Elling wrote:

>
> Tim has a valid point. By default, ZFS will queue 35 commands per disk.
> For 46 disks that is 1,610 concurrent I/Os.  Historically, it has proven to
> be
> relatively easy to crater performance or cause problems with very, very,
> very expensive arrays that are easily overrun by Solaris. As a result, it
> is
> not uncommon to see references to setting throttles, especially in older
> docs.
>
> Fortunately, this is  simple to test by reducing the number of I/Os ZFS
> will queue.  See the Evil Tuning Guide
>
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
>
> The mpt source is not open, so the mpt driver's reaction to 1,610
> concurrent
> I/Os can only be guessed from afar -- public LSI docs mention a number of
> 511
> concurrent I/Os for SAS1068, but it is not clear to me that is an explicit
> limit.  If
> you have success with zfs_vdev_max_pending set to 10, then the mystery
> might be solved. Use iostat to observe the wait and actv columns, which
> show the number of transactions in the queues.  JCMP?
>
> NB sometimes a driver will have the limit be configurable. For example, to
> get
> high performance out of a high-end array attached to a qlc card, I've set
> the execution-throttle in /kernel/drv/qlc.conf to be more than two orders
> of
> magnitude greater than its default of 32. /kernel/drv/mpt*.conf does not
> seem
> to have a similar throttle.
>  -- richard
>
>

I believe there's a caveat here though.  That really only helps if the total
I/O load is actually enough for the controller to handle.  If the sustained
I/O workload is still 1600 concurrent I/O's, lowering the batch won't
actually cause any difference in the timeouts, will it?  It would obviously
eliminate burstiness (yes, I made that word up), but if the total sustained
I/O load is greater than the ASIC can handle, it's still going to fall over
and die with a queue of 10, correct?
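
For anyone following along, the knob you mention can be poked on a live
system and then made persistent; the value of 10 below is purely
illustrative:

  # echo zfs_vdev_max_pending/W0t10 | mdb -kw   (takes effect immediately)

  set zfs:zfs_vdev_max_pending = 10             (in /etc/system, next boot)

That at least makes it quick to experiment while watching the wait and actv
columns in iostat, as you suggest.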

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Checksums

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:19 PM, Adam Leventhal  wrote:

> On Fri, Oct 23, 2009 at 06:55:41PM -0500, Tim Cook wrote:
> > So, from what I gather, even though the documentation appears to state
> > otherwise, default checksums have been changed to SHA256.  Making that
> > assumption, I have two questions.
>
> That's false. The default checksum has changed from fletcher2 to fletcher4
> that is to say, the definition of the value of 'on' has changed.
>
> > First, is the default updated from fletcher2 to SHA256 automatically for
> a
> > pool that was created with an older version of zfs and then upgraded to
> the
> > latest?  Second, would all of the blocks be re-checksummed with a zfs
> > send/receive on the receiving side?
>
> As with all property changes, new writes get the new properties. Old data
> is not rewritten.
>
> Adam
>
>

Adam,

Thank you for the correction.  My next question is, do you happen to know
what the overhead difference between fletcher4 and SHA256 is?  Is the
checksumming multi-threaded in nature?  I know my fileserver has a lot of
spare cpu cycles, but it would be good to know if I'm going to take a
substantial hit in throughput moving from one to the other.
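
For the archives: flipping it is just a property change, and per your note
above only newly written blocks pick up the new checksum (the dataset name is
just an example):

  # zfs set checksum=sha256 tank/data
  # zfs get checksum tank/data

so I suppose the easiest way to measure the difference here is to time a
large write into a scratch dataset with each setting and compare.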

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 4:49 AM, Adam Cheal  wrote:

> The iostat I posted previously was from a system we had already tuned the
> zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10
> in actv per disk).
>
> I reset this value in /etc/system to 7, rebooted, and started a scrub.
> iostat output showed busier disks (%b is higher, which seemed odd) but a cap
> of about 7 queue items per disk, proving the tuning was effective. iostat at
> a high-water mark during the test looked like this:
>


> ...and sure enough about 20 minutes into it I get this (bus reset?):
>
> scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
> /pci1000,3...@0/s...@34,0 (sd49):
>   incomplete read- retrying
> scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
> /pci1000,3...@0/s...@21,0 (sd30):
>   incomplete read- retrying
> scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
> /pci1000,3...@0/s...@1e,0 (sd27):
>   incomplete read- retrying
> scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
>   Rev. 8 LSI, Inc. 1068E found.
> scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
>   mpt0 supports power management.
> scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
>   mpt0: IOC Operational.
>
> During the "bus reset", iostat output looked like this:
>
>
> During our previous testing, we had tried even setting this max_pending
> value down to 1, but we still hit the problem (albeit it took a little
> longer to hit it) and I couldn't find anything else I could set to throttle
> IO to the disk, hence the frustration.
>
> If you hadn't seen this output, would you say that 7 was a "reasonable"
> value for that max_pending queue for our architecture and should give the
> LSI controller in this situation enough breathing room to operate? If so, I
> *should* be able to scrub the disks successfully (ZFS isn't to blame) and
> therefore have to point the finger at the
> mpt-driver/LSI-firmware/disk-firmware instead.
> --
>
>
A little bit of searching google says:
http://downloadmirror.intel.com/17968/eng/ESRT2_IR_readme.txt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 11:20 AM, Tim Cook  wrote:

>
>
> On Sat, Oct 24, 2009 at 4:49 AM, Adam Cheal  wrote:
>
>> The iostat I posted previously was from a system we had already tuned the
>> zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10
>> in actv per disk).
>>
>> I reset this value in /etc/system to 7, rebooted, and started a scrub.
>> iostat output showed busier disks (%b is higher, which seemed odd) but a cap
>> of about 7 queue items per disk, proving the tuning was effective. iostat at
>> a high-water mark during the test looked like this:
>>
>
>
>> ...and sure enough about 20 minutes into it I get this (bus reset?):
>>
>>
>> scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
>> /pci1000,3...@0/s...@34,0 (sd49):
>>   incomplete read- retrying
>> scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
>> /pci1000,3...@0/s...@21,0 (sd30):
>>   incomplete read- retrying
>> scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
>> /pci1000,3...@0/s...@1e,0 (sd27):
>>   incomplete read- retrying
>> scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0(mpt0):
>>   Rev. 8 LSI, Inc. 1068E found.
>> scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0(mpt0):
>>   mpt0 supports power management.
>> scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0(mpt0):
>>   mpt0: IOC Operational.
>>
>> During the "bus reset", iostat output looked like this:
>>
>>
>> During our previous testing, we had tried even setting this max_pending
>> value down to 1, but we still hit the problem (albeit it took a little
>> longer to hit it) and I couldn't find anything else I could set to throttle
>> IO to the disk, hence the frustration.
>>
>> If you hadn't seen this output, would you say that 7 was a "reasonable"
>> value for that max_pending queue for our architecture and should give the
>> LSI controller in this situation enough breathing room to operate? If so, I
>> *should* be able to scrub the disks successfully (ZFS isn't to blame) and
>> therefore have to point the finger at the
>> mpt-driver/LSI-firmware/disk-firmware instead.
>> --
>>
>>
> A little bit of searching google says:
> http://downloadmirror.intel.com/17968/eng/ESRT2_IR_readme.txt
>
>
Huh, good old keyboard shortcuts firing off emails before I'm done with
them.  Anyways, in that link, I found the following:
 3. Updated - to provide NCQ queue depth of 32 (was 8) on 1064e and 1068e
and 1078 internal-only controllers in IR and ESRT2 modes.

Then there's also this link from someone using a similar controller under
freebsd:
http://www.nabble.com/mpt-errors-QUEUE-FULL-EVENT,-freebsd-7.0-on-dell-1950-td20019090.html

It would make total sense that you're having issues if the default queue
depth for that controller really is 8 per port.  Even setting
zfs_vdev_max_pending to 1 isn't going to fix your issue if you've got 46
drives on one channel/port.
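
Rough math, assuming that 8 really is a per-port limit rather than per drive:
even with zfs_vdev_max_pending dropped to 1, 46 drives behind one port is
still 46 outstanding commands trying to fit into a queue of 8, and at the
default of 35 it's over 1,600.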

Honestly I'm just taking shots in the dark though.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 12:30 PM, Carson Gaspar  wrote:

>
> I saw this with my WD 500GB SATA disks (HDS725050KLA360) and LSI firmware
> 1.28.02.00 in IT mode, but I (almost?) always had exactly 1 "stuck" I/O.
> Note that my disks were one per channel, no expanders. I have _not_ seen it
> since replacing those disks. So my money is on a bug in the LSI firmware,
> the drive firmware, the drive controller hardware, or some combination
> thereof.
>
> Note that LSI has released firmware 1.29.00.00. Sadly I cannot find any
> documentation on what has changed. Downloadable from LSI at
> http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas3081e-r/index.html?remote=1&locale=EN
>
> --
> Carson


Here's the closest I could find from some Intel release notes.  It came
from: ESRT2_IR_readme.txt and does mention the 1068e chipset, as well as
that firmware rev.



Package Information

FW and OpROM Package for Native SAS mode, IT/IR mode and Intel(R) Embedded
Server RAID Technology II

Package version: 2009.10.06
FW Version = 01.29.00 (includes fixed firmware settings)
BIOS (non-RAID) Version = 06.28.00
BIOS (SW RAID) Version = 08.09041155

Supported RAID modes: 0, 1, 1E, 10, 10E and 5 (activation key AXXRAKSW5
required for RAID 5 support)

Supported Intel(R) Server Boards and Systems:
 - S5000PSLSASR, S5000XVNSASR, S5000VSASASR, S5000VCLSASR, S5000VSFSASR
 - SR1500ALSASR, SR1550ALSASR, SR2500ALLXR, S5000PALR (with SAS I/O Module)
 - S5000PSLROMBR (SROMBSAS18E) without HW RAID activation key AXXRAK18E
installed (native SAS or SW RAID modes only) - for HW RAID mode separate
package is available
 - NSC2U, TIGW1U

Supported Intel(R) RAID controller (adapters):
- SASMF8I, SASWT4I, SASUC8I

Intel(R) SAS Entry RAID Module AXX4SASMOD, when inserted in below Intel(R)
Server Boards and Systems:
 - S5520HC / S5520HCV, S5520SC,S5520UR,S5500WB


Known Restrictions

1. The sasflash versions within this package don't support ESRTII
controllers.
2. The sasflash utility for Windows and Linux version within this package
only support Intel(R) IT/IR RAID controllers.  The sasflash utility for
Windows and Linux version within this package don't support sasflash -o -e 6
command.
3. The sasflash utility for DOS version doesn't support the Intel(R) Server
Boards and Systems due to BIOS limitation.  The DOS version sasflash might
still be supported on 3rd party server boards which don't have the BIOS
limitation.
4. No PCI 3.0 support
5. No Foreign Configuration Resolution Support
6. No RAID migration Support
7. No mixed RAID mode support ever
8. No Stop On Error support


Known Bugs

(1)
For Intel(R) chipset S5000P/S5000V/S5000X based server systems, please use
the 32 bit, non-EBC version of sasflash which is
SASFLASH_Ph17-1.22.00.00\sasflash_efi_bios32_rel\sasflash.efi, instead of
the ebc version of sasflash which is in the top package directory and also
in
SASFLASH_Ph17-1.22.00.00\sasflash_efi_ebc_rel\sasflash.efi.  The latter one
may return a wrong sas address with a sasflash -list command in the listed
systems.

(2)
LED behavior does not match between SES and SGPIO for some conditions
(documentation in process).

(3)
When in EFI Optimized Boot mode, the task bar is not displayed in EFI_BSD
after two volumes are created.

(4)
If a system is rebooted while a volume rebuild is in progress, the rebuild
will start over from the beginning.


Fixes/Updates

Version 2009.10.06
 1. Fixed - MP2 HDD fault LED stays on after rebuild completes
 2. Fixed - System hangs if drive hot-unplugged during stress

Version 2009.07.30
 1. Fixed - SES over i2c for 106x products
 2. Fixed - FW settings updated to support SES over i2c drive lights on
FALSASMP2.

Version 2009.06.15
 1. Fixed - SES over I2C issue for 1078IR.
 2. Updated - 1068e fw to fix SES over I2C on MP2 bug.
 3. Updated - to provide NCQ queue depth of 32 (was 8) on 1064e and 1068e
and 1078 internal-only controllers in IR and ESRT2 modes.
 4. Updated - Firmware to enable SES over I2C on AXX4SASMOD.
 5. Updated - Settings to provide better LED indicators for SGPIO.

Version 2008.12.11
 1. Fixed - Media can't boot from SATA DVD in some systems in Software RAID
(ESRT2) mode.
 2. Fixed - Incorrect RAID 5 ECC error handling in Ctrl+M

Version 2008.11.07
 1. Added support for - Enable ICH10 support
 2. Added support for - Software RAID5 to support ICH10R
 3. Added support for - Single Drive RAID 0 (IS) Volume
 4. Fixed - Resolved issue where user could not create a second volume
immediately following the deletion of a second volume.
 5. Fixed - Second hot spare status not shown when first hot spare is
inactive/missing

Version 2008.09.22
 1. Fixed - SWR:During hot PD removal and then quick reboot, not updating
the DDF correctly.

Version 2008.06.16
 1. Fixed - the issue with "The LED functions are not working inside the
OSes" for SWR5
 2. Fixed - the

Re: [zfs-discuss] zfs code and fishworks "fork"

2009-10-27 Thread Tim Cook
On Tue, Oct 27, 2009 at 2:35 AM, Bruno Sousa  wrote:

>  Hi all,
>
> I fully understand that within a cost effective point of view, developing
> the fishworks for a reduced set of hardware makes , alot, of sense.
> However, i think that Sun/Oracle would increase their user base if they
> make availabe a Fishwork framework certified only for a reduced set of
> hardware, ie :
>
>- it needs Western Digital HDD firmware version x.y.z
>- it needs a SAS/SATA controller from a specific brand, model and
>firmare ( LSI SAS1068E )
>- if SSD's are used they need to be from vendor X with firmware Y
>- the system motherboard chipset needs to be from vendor X or Y and not
>from Z
>
> Within this possible landscape i'm pretty sure that alot more customers
> would pay for the Fishworks stack and support, given the fact that not all
> customers "need" aKa can afford, the Unified Storage platform from Sun.
>
> Anyway..Fishworks it's an awesome product! Congratulations for the extreme
> good job.
>
> Regards,
> Bruno
>
>


You're making a very, very bad assumption that the price of Fishworks would
be "cheap" for just the software.  Sun hardware does not cost that much more
than their competitors when it comes down to it.  You should expect the
software to make up the difference in price if they were to unbundle it.
Heck, I would expect it to be MORE if they're forced into having to deal
with third party vendors that are pointing fingers at software problems vs.
hardware problems and wasting Sun support engineers valuable time.  I think
you'd find yourself unpleasantly surprised at the end price tag.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs code and fishworks "fork"

2009-10-27 Thread Tim Cook
On Tue, Oct 27, 2009 at 2:13 PM, Dale Ghent  wrote:

>
> On Oct 27, 2009, at 2:58 PM, Bryan Cantrill wrote:
>
>
>>  I can agree that the software is the one that really has the added
> value, but to my opinion allowing a stack like Fishworks to run
> outside
> the Sun Unified Storage would lead to lower price per unit(Fishwork
> license) but maybe increase revenue.
>

 I'm afraid I don't see that argument at all; I think that the
 economics
 that you're advocating would be more than undermined by the
 necessarily
 higher costs of validating and supporting a broader range of
 hardware and
 firmware...

>>>
>>> (Just playing Devil's Advocate here)
>>>
>>> There could be no economics at all. A basic warranty would be provided
>>> but running a standalone product is a wholly on your own proposition
>>> once one ventures outside a very small hardware support matrix.
>>>
>>> Perhaps Fishworks/AK would have a OpenSolaris edition - leave the bulk
>>> of the actual hardware support up to a support infrastructure that's
>>> already geared towards making wide ranges of hardware supportable -
>>> OpenSolaris/Solaris, after all, does allow that.
>>>
>>> Perhaps this could be a version of Fishworks that's not as integrated
>>> with what you get on a SUS platform; if some of the Fishworks
>>> functionality that depends on a precise hardware combo could be
>>> reduced or generalized, perhaps it's worth consideration. Knowing the
>>> little I do about what's going on under the hood of a SUS system, I
>>> wouldn't expect the version of Fishworks uses on the SUS systems to
>>> have 100% parity with a unbundled Fishworks edition - but the core
>>> features, by and large, would convey.
>>>
>>
>> Why would we do this?  I'm all for zero-cost endeavors, but this isn't
>> zero-cost -- and I'm having a hard time seeing the business case here,
>> especially when we have so many paying customers for whom the business
>> case for our time and energy is crystal clear...
>>
>
> Hey, I was just offering food for thought from the technical end :)
>
> Of course the cost in man hours to attain a reasonable, unbundled version
> would have to be justifiable. If that aspect isn't currently justifiable,
> then that's as far as the conversation needs to go. However, times change
> and one day demand could very well justify the business costs.
>
>
> /dale
>



The problem is, most of the things that make fishworks desirable are the
things that wouldn't work.  Want to light up a failed drive with an LED?
 Clustering?  Timeouts for failed hardware?

The fact of the matter is, people asking for this are people that aren't
willing to spend the money that Sun would be asking for anyways.  I mean,
seriously, a 7110 is $10,000 LIST!  Assuming you absolutely despise
bartering on price, you can get the thing for 20% off just by using try and
buy.  If you're balking at that price, you wouldn't like the price of the
software.  No amount of "but you don't have to support it" is going to
change that.  I think you're failing to take into consideration the PR
suicide it would be for Sun to offer fishworks on any platform people want,
offer support contracts (that's the ONLY way this will make them money), and
then turn around and tell people the reason feature XYZ isn't working is
because their hardware isn't supported... oh, and they have no plans to ever
add support either.

I honestly can't believe this is even a discussion.  What next, are you
going to ask NetApp to support ONTAP on Dell systems, and EMC to support
Enginuity on HP blades?

Just because the underpinnings are based on an open source OS that supports
many platforms doesn't mean this customized build can or ever should.

And one last example... QLogic and Brocade FC switches run Linux... I
wouldn't expect or ask them to make a version that I could run on a desktop
full of HBA's to act as my very own FC switch even though it is entirely
possible for them to do so.

And just as a reminder... if you look back through the archives, I am FAR
from a Sun fanboy... I just feel you guys aren't even grounded in reality
when making these requests.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import single user mode incompatible version

2009-10-27 Thread Tim Cook
On Tue, Oct 27, 2009 at 4:25 PM, Paul Lyons  wrote:

> I know this is opensolaris and Solaris, but I'm stuck...
>
> I want to demonstrate to my client how to recover an unbootable system from
> a zfs snapshot. (Say some dope rm -rf /kernel/drv...) Running Solaris 10 U8
> sparc.
>
> Normal procedures are boot cdrom -s (or boot net -s)
> zpool import rpool
> zfs rollback 
> reboot and all is well
>
> I've done this before with earlier rev's of Sol 10.
>
> When I boot off Solaris 10 U8 I get the error that pool is formatted using
> an incompatible version. Status show the pool as "Newer Version"
>
> I know Update 8 has version 15, but it looks like the "miniroot" from the
> install media is only version 10.
>
> This is not good.
>
> Any advice? I am already thinking about installing U7 on my test box to
> demonstrate. Glad I haven't rolled out u8 into production.
>
> Thanks,
>
> Paul
>



Not to be a jerk, but is there a question in there?  The system told you
exactly what is wrong, and you seem to already know.  You're booting from an
old cd that has an old version of zfs.  Grab a new iso.

How would you expect a system that shipped with version 10 of zfs to know
what to do with version 15?
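
For what it's worth, you can confirm what the booted environment actually
supports straight from the miniroot shell (a rough sketch; "rpool" is just
the usual root pool name):

  # from the booted install/miniroot shell
  zpool upgrade -v     # lists every ZFS pool version this environment knows
  zpool import         # lists importable pools; a pool created with a newer
                       # version is flagged as incompatible right here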

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import single user mode incompatible version

2009-10-27 Thread Tim Cook
On Tue, Oct 27, 2009 at 4:59 PM, dick hoogendijk  wrote:

> Tim Cook wrote:
>
>>
>>
>> On Tue, Oct 27, 2009 at 4:25 PM, Paul Lyons <paulrly...@gmail.com> wrote:
>>
>>When I boot off Solaris 10 U8 I get the error that pool is
>>formatted using an incompatible version.
>>
>>
>> You're booting from an old cd that has an old version of zfs.  Grab a new
>> iso.
>>
> It might be that I can't read but does OP not state he is booting off
> Solaris 10 update 8 DVD?
> What can be newer than that one? If the miniroot really only supports ZFS
> v10 then this is indeed not good (unworkable/unusable/..)
>

Assuming he didn't accidentally burn the wrong media, and 10u8 really does
default to pool version 15 while its miniroot only supports version 10 (which
sounds more than a bit odd to me), it's simply a matter of grabbing an
OpenSolaris ISO instead and doing the exact same thing.
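
Roughly, the same recovery from a newer live CD looks like this (a sketch
only; the dataset and snapshot names are placeholders, not taken from Paul's
box):

  # boot the newer live media, drop to a shell, then:
  zpool import -f -R /a rpool              # import under an alternate root
  zfs list -t snapshot -r rpool            # find the snapshot to roll back to
  zfs rollback rpool/ROOT/s10root@pre-rm   # placeholder dataset/snapshot name
  zpool export rpool                       # detach cleanly before rebooting
  reboot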

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool failmode

2009-10-27 Thread Tim Cook
On Tue, Oct 27, 2009 at 5:13 PM, deniz rende  wrote:

> Hi,
>
> I am trying to understand the behavior of zpool failmode=continue rpool.
>
> I've read the man page regarding to this and I understand that the default
> mode is set to wait. So if I set up my zfs pool to continue, in the case of
> loss of connectivity, what is this setting supposed to do?
>
> Does setting failmode=continue prevent the system from panicking when an
> event like loss of connectivity or pool failure occurs? What does the "EIO"
> term refer to in the man page for this setting?
>
> Could somebody explain what really this setting does?
>
> Thanks
>
> Deniz
>

"wait" causes all I/O to hang while the system attempts to retry the
device.  "continue" returns EIO (the generic "I/O error" errno, which is the
term the man page refers to) to new write requests, but still allows reads
from the remaining healthy devices, so the system carries on instead of
hanging.  "panic" will cause the system to panic and core dump.  The only
real advantage I see in wait is that it will alert the admin to a failure
rather quickly if you aren't checking the health of the system on a regular
basis.
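
For reference, checking or changing it is just (sketch; "tank" is a
placeholder pool name):

  zpool get failmode tank            # shows the current setting (default: wait)
  zpool set failmode=continue tank   # fail writes with EIO instead of hanging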

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs code and fishworks "fork"

2009-10-27 Thread Tim Cook
On Wed, Oct 28, 2009 at 12:15 AM, Eric D. Mudama wrote:

> On Tue, Oct 27 at 18:58, Bryan Cantrill wrote:
>
>> Why would we do this?  I'm all for zero-cost endeavors, but this isn't
>> zero-cost -- and I'm having a hard time seeing the business case here,
>> especially when we have so many paying customers for whom the business
>> case for our time and energy is crystal clear...
>>
>>- Bryan
>>
>
> I don't have a need for a large 7110 box, my group's file serving
> needs are quite small.  I decided on a Dell T610 running OpenSolaris,
> with half the drives populated now and half to be populated as we get
> close to filling them up.  Pair of mirrored vdevs for performance,
> with an SSD cache.
>
> I'd have loved to have, instead, the nice fishworks gui interface to
> the whole thing, and if that existed on something like an X2270,
> that's what we would have bought instead of the Dell box.
>
> Ultimately, I wanted the simplicity of a Drobo, capable of saturating
> a Gig-E port or two, in an easy to maintain and administer system.
> One and a half out of three ain't bad, but Fishworks GUI on a 4-disk
> X2270 would have been a 3 for 3 solution I believe.  We just can't
> afford to spend $8-10k to "try" a 7110 which is likely complete
> overkill for our needs, and we have no expectation of our business
> growing into it within the next two years.
>
> $2k was our absolute ceiling for a trial purchase, and I knew that if
> my OpenSolaris experiment didn't work out, I could just repurpose the
> Dell box with Debian, EXT3, software RAID and Samba and get a 75-80%
> solution.
>
> Yes, this may not make business sense for Sun-as-structured, but
> someone will figure out how to scratch that itch because it's real for
> a LOT of small businesses.  They want that low cost entry into a
> business-grade NAS without having to build it themselves, something
> that's a step up from a whitebox 2-disk mirror from some no-name
> vendor who won't exist in 6 months.
>
> --eric
>
> PS: Not having enough engineers to support a growing and paying
> customer base is a *good* problem to have.  The opposite is much, much
> worse.
>


So use Nexenta?

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs code and fishworks "fork"

2009-10-28 Thread Tim Cook
2009/10/28 Eric D. Mudama 

> On Wed, Oct 28 at 13:40, "C. Bergström" wrote:
>
>> Tim Cook wrote:
>>
>>>
>>>
>>>   PS: Not having enough engineers to support a growing and paying
>>>   customer base is a *good* problem to have.  The opposite is much, much
>>>   worse.
>>>
>>>
>>>
>>> So use Nexenta?
>>>
>> Got data you care about?
>>
>> Verify extensively before you jump to that ship.. :)
>>
>
> I am not aware of any data issues; it's simply that when I investigated
> Nexenta, they lagged far enough behind OpenSolaris that I was concerned
> they didn't have enough critical mass to keep up.  High quality
> distros are a ton of work.
>
> That, and the supported NexentaStor pricing exceeded our $2k ceiling.
>
> --eric
>

If Nexenta was too expensive, there's nothing Sun will ever offer that will
fit your price profile.  "Home electronics" is not their business model and
never will be.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

