Re: [zfs-discuss] Resizing ZFS partition, shrinking NTFS?

2011-06-17 Thread Michael Sullivan
On 17 Jun 11, at 21:14 , Bob Friesenhahn wrote:

> On Fri, 17 Jun 2011, Jim Klimov wrote:
>> I gather that he is trying to expand his root pool, and you can
>> not add a vdev to one. Though, true, it might be possible to
>> create a second, data pool, in the partition. I am not sure if
>> zfs can make two pools in different partitions of the same
>> device though - underneath it still uses Solaris slices, and
>> I think those can only be used within one partition. That was my
>> assumption for a long time, though never really tested.
> 
> This would be a bad assumption.  Zfs should not care and you are able to do 
> apparently silly things with it.  Sometimes allowing potentially silly things 
> is quite useful.
> 

This is true.  If you have mirrored disks, you could do something like what I explain 
here WRT partitioning and resizing pools.

http://www.kamiogi.net/Kamiogi/Frame_Dragging/Entries/2009/5/19_Everything_in_Its_Place_-_Moving_and_Reorganizing_ZFS_Storage.html

I did some shuffling using Solaris partitions here on a home server, but that was 
using mirrors of disks with the same geometry.

You might be able to do a similar shuffle using an appropriately sized external USB 
drive and then turn on autoexpand.
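
For reference, autoexpand is a per-pool property; a minimal sketch, with the
pool name being just an example:

zpool set autoexpand=on rpool    # grow the pool once all devices in a vdev are larger
zpool get autoexpand rpool       # verify the setting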

Mike

---
Michael Sullivan   
m...@axsh.us
http://www.axsh.us/
Phone: +1-662-259-
Mobile: +1-662-202-7716

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] question about COW and snapshots

2011-06-17 Thread Michael Sullivan
On 17 Jun 11, at 21:02 , Ross Walker wrote:

> On Jun 16, 2011, at 7:23 PM, Erik Trimble  wrote:
> 
>> On 6/16/2011 1:32 PM, Paul Kraus wrote:
>>> On Thu, Jun 16, 2011 at 4:20 PM, Richard Elling
>>>   wrote:
>>> 
>>>> You can run OpenVMS :-)
>>> Since *you* brought it up (I was not going to :-), how does VMS'
>>> versioning FS handle those issues ?
>>> 
>> It doesn't, per se.  VMS's filesystem has a "versioning" concept (i.e. every 
>> time you do a close() on a file, it creates a new file with the version 
>> number appended, e.g.  foo;1  and foo;2  are the same file, different 
>> versions).  However, it is completely missing the rest of the features we're 
>> talking about, like data *consistency* in that file. It's still up to the 
>> app using the file to figure out what data consistency means, and such.  
>> Really, all VMS adds is versioning, nothing else (no API, no additional 
>> features, etc.).
> 
> I believe NTFS was built on the same concept of file streams the VMS FS used 
> for versioning.
> 
> It's a very simple versioning system.
> 
> Personally I use SharePoint, but there are other content management systems 
> out there that provide what you're looking for, so no need to bring out the 
> crypt keeper.
> 

I think, from following this whole discussion, that people are wanting "Versions", 
which will be offered by OS X Lion soon. However, it is dependent upon 
applications playing nice, behaving, and using the "standard" APIs.

It would likely take a major overhaul in the way ZFS handles snapshots to 
create them at the object level rather than the filesystem level.  It might be a 
nice exploratory exercise for those in the know with the ZFS roadmap, but then 
there are two "roadmaps", right?

Also, consistency and integrity cannot be guaranteed at the object level, since 
an application may have more than a single filesystem object in use at a time, 
and operations would need to be transaction-based with commits and rollbacks.

Way off-topic, but Smalltalk and its variants do this by maintaining the state 
of everything in an operating environment image.

But then again, I could be wrong.

Mike

---
Michael Sullivan   
m...@axsh.us
http://www.axsh.us/
Phone: +1-662-259-
Mobile: +1-662-202-7716

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resizing ZFS partition, shrinking NTFS?

2011-06-17 Thread Bob Friesenhahn

On Fri, 17 Jun 2011, Jim Klimov wrote:

I gather that he is trying to expand his root pool, and you can
not add a vdev to one. Though, true, it might be possible to
create a second, data pool, in the partition. I am not sure if
zfs can make two pools in different partitions of the same
device though - underneath it still uses Solaris slices, and
I think those can only be used within one partition. That was my
assumption for a long time, though never really tested.


This would be a bad assumption.  Zfs should not care and you are able 
to do apparently silly things with it.  Sometimes allowing potentially 
silly things is quite useful.
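
For instance, one could simply try it; a hedged sketch with hypothetical device
names (on x86 the p-devices name the fdisk partitions of a disk):

zpool create datapool c1t0d0p2   # second fdisk partition of the disk that also hosts rpool

If zpool considers the device in use it will refuse with an error (unless
forced with -f) rather than damage anything.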


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] question about COW and snapshots

2011-06-17 Thread Ross Walker
On Jun 16, 2011, at 7:23 PM, Erik Trimble  wrote:

> On 6/16/2011 1:32 PM, Paul Kraus wrote:
>> On Thu, Jun 16, 2011 at 4:20 PM, Richard Elling
>>   wrote:
>> 
>>> You can run OpenVMS :-)
>> Since *you* brought it up (I was not going to :-), how does VMS'
>> versioning FS handle those issues ?
>> 
> It doesn't, per se.  VMS's filesystem has a "versioning" concept (i.e. every 
> time you do a close() on a file, it creates a new file with the version 
> number appended, e.g.  foo;1  and foo;2  are the same file, different 
> versions).  However, it is completely missing the rest of the features we're 
> talking about, like data *consistency* in that file. It's still up to the app 
> using the file to figure out what data consistency means, and such.  Really, 
> all VMS adds is versioning, nothing else (no API, no additional features, 
> etc.).

I believe NTFS was built on the same concept of file streams the VMS FS used 
for versioning.

It's a very simple versioning system.

Personally I use SharePoint, but there are other content management systems 
out there that provide what you're looking for, so no need to bring out the crypt 
keeper.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] write cache partial-disk pools (was Server with 4 drives, how to configure ZFS?)

2011-06-17 Thread Ross Walker
On Jun 17, 2011, at 7:06 AM, Edward Ned Harvey wrote:

> I will only say, that regardless of whether or not that is or ever was true,
> I believe it's entirely irrelevant.  Because your system performs read and
> write caching and buffering in ram, the tiny little ram on the disk can't
> possibly contribute anything.

You would be surprised.

The on-disk write buffer is there so data is ready when the hard drive head 
lands; without it, the drive's average rotational latency will trend higher due 
to missed landings, because the data wasn't in the buffer at the right time.

The read buffer allows the disk to continuously read sectors whether the 
system bus is ready to transfer or not. Without it, sequential reads wouldn't 
last long enough to reach max throughput before they had to pause because of 
bus contention, and would then suffer a rotational latency hit, which would 
kill read performance.

Try disabling the on-board write or read cache and see how your sequential I/O 
performs; you'll see just how valuable those puny caches are.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is ZFS internal reservation excessive?

2011-06-17 Thread MasterCATZ

OK, what is the point of the RESERVE

when we cannot even delete a file when there is no space left?!

If they are going to have a RESERVE, they should make it a little smarter and
maybe have the FS use some of that free space, so that when we do hit 0 bytes 
data can still be deleted, because there is over 50 GB free in the reserve.


# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank  2.68T      0  2.68T  /tank
# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
tank  3.64T  3.58T  58.2G    98%  1.00x  ONLINE  -

rm -f -r downloads
rm: downloads: No space left on device
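
A commonly suggested workaround, sketched below on hypothetical paths and
assuming no snapshot or clone still holds the file's blocks, is to free the
data before unlinking:

cp /dev/null /tank/downloads/some-large-file   # truncate first, releasing its blocks
rm -rf /tank/downloads                         # now the delete has room to proceed

If a snapshot still references the data, destroying that snapshot is usually
the only way to get the space back.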





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Jim Klimov

2011-06-18 0:24, marty scholes wrote:
>>> It makes me wonder how large shops with thousands of spindles
>>> handle this.
>>
>> We pay for the brand-name disk enclosures or servers where the
>> fault-management stuff is supported by Solaris.
>>
>> Including the blinky lights.
>
> Funny you say that.
>
> My Sun v40z connected to a pair of Sun A5200 arrays running OSol 128a
> can't see the enclosures. The luxadm command comes up blank.
>
> Except for that annoyance (and similar other issues) the Sun gear
> works well with a Sun operating system.




For the sake of weekend sarcasm:

Why would you wonder? That's the wrong brand name, and it is too old.
Does it say "Oracle" anywhere on the label? Really, "v40z", pff!
When was it made? Like, in the two-thousand-zeroes, back when
dinosaurs roamed the earth and Sun was still high above the horizon?
Is it still supported at all, let alone by Solaris (not OSol, may I add)?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread marty scholes
Funny you say that. 

My Sun v40z connected to a pair of Sun A5200 arrays running OSol 128a can't see 
the enclosures. The luxadm command comes up blank. 

Except for that annoyance (and similar other issues) the Sun gear works well 
with a Sun operating system. 

Sent from Yahoo! Mail on Android

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Erik Trimble

On 6/17/2011 6:52 AM, Marty Scholes wrote:

>> Lights.  Good.
>
> Agreed. In a fit of desperation and stupidity I once enumerated disks by
> pulling them one by one from the array to see which zfs device faulted.
>
> On a busy array it is hard even to use the LEDs as indicators.
>
> It makes me wonder how large shops with thousands of spindles handle this.


We pay for the brand-name disk enclosures or servers where the 
fault-management stuff is supported by Solaris.


Including the blinky lights.



--
Erik Trimble
Java Platform Group Infrastructure
Mailstop:  usca22-317
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Marty Scholes
> Lights. Good.

Agreed. In a fit of desperation and stupidity I once enumerated disks by 
pulling them one by one from the array to see which zfs device faulted.

On a busy array it is hard even to use the LEDs as indicators.

It makes me wonder how large shops with thousands of spindles handle this.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] write cache partial-disk pools (was Server with 4 drives, how to configure ZFS?)

2011-06-17 Thread Jim Klimov

2011-06-17 15:06, Edward Ned Harvey wrote:


> When it comes to reads:  The OS does readahead more intelligently than the
> disk could ever hope.  Hardware readahead is useless.


Here's another (lame?) question to the experts, partly as a
follow-up to my last post about large arrays and a shared
bus that should be freed ASAP: can the OS request a disk
readahead (send a small command and release the bus)
and then later poll the disk's cache for the readahead
results? That is, it would not "hold the line" between
sending a request and receiving the result.

Alternatively, does it work as a packetized protocol (where,
in effect, requests and responses do not "hold the line",
but the controller must keep state - are these the command
queues?), so that the ability to transfer packets faster
and free the shared ether between disks, backplanes
and controllers is critical in itself?

Thanks,
//Jim

The more I know, the more I know how little I know ;)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] write cache partial-disk pools (was Server with 4 drives, how to configure ZFS?)

2011-06-17 Thread Jim Klimov

2011-06-17 15:41, Edward Ned Harvey wrote:

>> From: Daniel Carosone [mailto:d...@geek.com.au]
>> Sent: Thursday, June 16, 2011 11:05 PM
>>
>> the [sata] channel is idle, blocked on command completion, while
>> the heads seek.
>
> I'm interested in proving this point.  Because I believe it's false.
>
> Just hand waving for the moment ... Presenting the alternative viewpoint
> that I think is correct...

I'm also interested to hear from the in-the-trenches specialists
and architects on this point. However, the way it was
explained to me a while ago, disk caches and higher
interface speeds really matter in large arrays, where
you have one (okay, 8) links from your controller to a
backplane with dozens of disks, and the faster any of
these disks completes its bursty operation, the less
latency is induced on the array as a whole.

So even if the spinning drive cannot sustain 6Gbps,
its 64MB of cache can quickly spit out (or read in) its
bit of data, free the bus, and let the many other drives
spit theirs.

I am not sure if this is relevant to, say, a motherboard
controller where one chip handles 6-8 disks, but
maybe there's something to it there too...

//Jim


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resizing ZFS partition, shrinking NTFS?

2011-06-17 Thread Jim Klimov

2011-06-17 9:37, Michael Schuster wrote:


> I'd suggest a somewhat different approach:
> 1) boot a live cd and use something like parted to shrink the NTFS
>    partition
> 2) create a new partition without FS in the space now freed from NTFS
> 3) boot OpenSolaris, add the partition from 2) as vdev to your zpool.
>
> HTH
> Michael

I gather that he is trying to expand his root pool, and you can
not add a vdev to one. Though, true, it might be possible to
create a second, data pool, in the partition. I am not sure if
zfs can make two pools in different partitions of the same
device though - underneath it still uses Solaris slices, and
I think those can only be used within one partition. That was my
assumption for a long time, though never really tested.

I only did similar expansions within one slice table which
covered the whole disk, and that was a mirror set so I
could juggle data around (migrating to ZFS Root from
UFS+LUroot+swap layouts, where the original slices were
not always laid out one after another). That went well,
and autoexpansion worked as soon as I increased
the slice sizes via format's partition editor. And maybe
after a reboot, too...

I think the method of zfs send/recv via a secondary device
(including network storage) and another boot device
(including a LiveCD or LiveUSB), and then remaking the partition,
is the least error-prone option for this poster's situation.
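
A rough sketch of that method, with hypothetical pool names, run from the
alternate boot environment so the root pool is not in active use:

zfs snapshot -r rpool@migrate                      # recursive snapshot of the root pool
zfs send -R rpool@migrate | zfs recv -Fdu backup   # copy it to a pool on the spare device
# ... repartition the disk and recreate rpool ...
zfs send -R backup@migrate | zfs recv -Fdu rpool   # copy everything back

The boot blocks also have to be reinstalled (e.g. installgrub on x86) before
the recreated root pool will boot again.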

And resizing the system NTFS partition is really unlikely
to be allowed from within a running Windows system. It should be
done from external boot media, and probably after a
defragmentation pass which stuffs all the data into the
start or end of that NTFS partition as appropriate (see for
example the free JkDefrag/MyDefrag project).


--
Климов Евгений, Jim Klimov
технический директор / CTO
ЗАО "ЦОС и ВТ" / JSC COS&HT

+7-903-7705859 (cellular)   mailto:jimkli...@cos.ru
CC: ad...@cos.ru, jimkli...@mail.ru

()  ascii ribbon campaign - against html mail
/\  - against microsoft attachments



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] write cache partial-disk pools (was Server with 4 drives, how to configure ZFS?)

2011-06-17 Thread Edward Ned Harvey
> From: Daniel Carosone [mailto:d...@geek.com.au]
> Sent: Thursday, June 16, 2011 11:05 PM
> 
> the [sata] channel is idle, blocked on command completion, while
> the heads seek.

I'm interested in proving this point.  Because I believe it's false.

Just hand waving for the moment ... Presenting the alternative viewpoint
that I think is correct...

All drives, regardless of whether or not their disk cache or buffer is
enabled, support PIO and DMA.  This means that no matter the state of the
cache or buffer, the bus will deliver information to/from the memory of the
disk as fast as possible, the disk will optimize the visible workload to the
best of its ability, and the disk will report back an interrupt as each
operation is completed, possibly out of order.

The difference between enabling and disabling the disk write buffer is:  If
the write buffer is disabled...  It still gets used temporarily ... but the
disk doesn't interrupt "completed" until the buffer is flushed to platter.
If the disk write buffer is enabled, the disk will immediately report
"completed" as soon as it receives the data, before flushing to platter...
And if your application happens to have issued the write in "sync" mode (or
the fsync() command), your OS will additionally issue the hardware sync
command, and your application will block until the hardware sync has
completed.

It would be stupid for a disk to hog the bus in an idle state.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Lanky Doodle
>  
> or is it completely random leaving me with some trial and error to work out
> what disk is on what port?

It's highly desirable to have drives with lights on them.  So you can
manually make the light blink (or stay on) just by reading the drive with
dd.
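
A minimal sketch of that trick (the device name below is just an example):

dd if=/dev/rdsk/c2t3d0s0 of=/dev/null bs=1024k   # keeps the activity LED lit; Ctrl-C when found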

Even if you dig down and quantify precisely how the drives are numbered and in
which order ... you would have to find labels printed on the system board or
other SATA controllers, and trace the spaghetti of the SATA cables, and if
you make any mistake along the way, you destroy your pool.  (Being dramatic,
but not necessarily unrealistic.)

Lights.  Good.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] write cache partial-disk pools (was Server with 4 drives, how to configure ZFS?)

2011-06-17 Thread Edward Ned Harvey
> From: Daniel Carosone [mailto:d...@geek.com.au]
> Sent: Thursday, June 16, 2011 10:27 PM
> 
> Is it still the case, as it once was, that allocating anything other
> than whole disks as vdevs forces NCQ / write cache off on the drive
> (either or both, forget which, guess write cache)?

I will only say, that regardless of whether or not that is or ever was true,
I believe it's entirely irrelevant.  Because your system performs read and
write caching and buffering in ram, the tiny little ram on the disk can't
possibly contribute anything.

When it comes to reads:  The OS does readahead more intelligently than the
disk could ever hope.  Hardware readahead is useless.

When it comes to writes:  Categorize as either async or sync.

When it comes to async writes:  The OS will buffer and optimize, and the
applications have long since marched onward before the disk even sees the
data.  It's irrelevant how much time has elapsed before the disk finally
commits to platter.

When it comes to sync writes:  The write will not be completed, and the
application will block, until all the buffers have been flushed.  Both ram
and disk buffer.  So neither the ram nor disk buffer is able to help you.

It's like selling USB fobs labeled USB2 or USB3.  If you look up or measure
the actual performance of any one of these devices, they can't come anywhere
near the bus speed...  In fact, I recently paid $45 for a USB3 16G fob,
which is finally able to achieve 380 Mbit.  Oh, thank goodness I'm no longer
constrained by that slow 480 Mbit bus...   ;-)   Even so, my new fob is
painfully slow compared to a normal cheap-o USB2 hard disk.  They just put
these labels on there because it's a marketing requirement.  Something that
mattered once upon a time, but that people still use as a purchasing decider.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Erik Trimble

On 6/17/2011 12:55 AM, Lanky Doodle wrote:

Thanks Richard.

How does ZFS enumerate the disks? In terms of listing them does it do them 
logically, i.e;

controller #1 (motherboard)
 |
 |--- disk1
 |--- disk2
controller #3
 |--- disk3
 |--- disk4
 |--- disk5
 |--- disk6
 |--- disk7
 |--- disk8
 |--- disk9
 |--- disk10
controller #4
 |--- disk11
 |--- disk12
 |--- disk13
 |--- disk14
 |--- disk15
 |--- disk16
 |--- disk17
 |--- disk18

or is it completely random leaving me with some trial and error to work out 
what disk is on what port?


This is not a ZFS issue; this is a Solaris device driver issue.

Solaris uses a location-based disk naming scheme, NOT the 
BSD/Linux style of simply incrementing the disk numbers. I.e. drives are 
usually named something like c<controller>t<target>d<disk>.


In most cases, the on-board controllers receive a lower controller 
number than any add-in adapters, and add-in adapters are enumerated in 
PCI ID order. However, there is no good explanation of exactly *what* 
number a given controller may be assigned.


After receiving a controller number, disks are enumerated in ascending 
order by ATA ID, SCSI ID, SAS WWN, or FC WWN.


The naming rules can get a bit complex.
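
For example, a quick way to see which names were assigned, without guessing
(output and numbering will differ from system to system):

echo | format             # lists the visible disks and their c#t#d# names non-interactively
ls -l /dev/dsk/c*t*d*s2   # each entry is a symlink to the physical /devices path of its controller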

--
Erik Trimble
Java Platform Group Infrastructure
Mailstop:  usca22-317
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs global hot spares?

2011-06-17 Thread Fred Liu


> -Original Message-
> From: Fred Liu
> Sent: Thursday, June 16, 2011 17:28
> To: Fred Liu; 'Richard Elling'
> Cc: 'Jim Klimov'; 'zfs-discuss@opensolaris.org'
> Subject: RE: [zfs-discuss] zfs global hot spares?
> 
> Fixing a typo in my last thread...
> 
> > -Original Message-
> > From: Fred Liu
> > Sent: Thursday, June 16, 2011 17:22
> > To: 'Richard Elling'
> > Cc: Jim Klimov; zfs-discuss@opensolaris.org
> > Subject: RE: [zfs-discuss] zfs global hot spares?
> >
> > > This message is from the disk saying that it aborted a command.
> These
> > > are
> > > usually preceded by a reset, as shown here. What caused the reset
> > > condition?
> > > Was it actually target 11 or did target 11 get caught up in the
> reset
> > > storm?
> > >
> >
>  It happened in the middle of the night and nobody touched the file box.
>  I assume it is a transitional state before the disk is *thoroughly*
>  damaged:
> 
>  Jun 10 09:34:11 cn03 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-
>  8000-FD, TYPE: Fault, VER: 1, SEVERITY:
> 
>  Major
>  Jun 10 09:34:11 cn03 EVENT-TIME: Fri Jun 10 09:34:11 CST 2011
>  Jun 10 09:34:11 cn03 PLATFORM: X8DTH-i-6-iF-6F, CSN: 1234567890,
>  HOSTNAME: cn03
>  Jun 10 09:34:11 cn03 SOURCE: zfs-diagnosis, REV: 1.0
>  Jun 10 09:34:11 cn03 EVENT-ID: 4f4bfc2c-f653-ed20-ab13-eef72224af5e
>  Jun 10 09:34:11 cn03 DESC: The number of I/O errors associated with a
>  ZFS device exceeded
>  Jun 10 09:34:11 cn03 acceptable levels.  Refer to
>  http://sun.com/msg/ZFS-8000-FD for more information.
>  Jun 10 09:34:11 cn03 AUTO-RESPONSE: The device has been offlined and
>  marked as faulted.  An attempt
>  Jun 10 09:34:11 cn03 will be made to activate a hot spare if
>  available.
>  Jun 10 09:34:11 cn03 IMPACT: Fault tolerance of the pool may be
>  compromised.
>  Jun 10 09:34:11 cn03 REC-ACTION: Run 'zpool status -x' and replace the
>  bad device.
> 
>  After I rebooted it, I got:
>  Jun 10 11:38:49 cn03 genunix: [ID 540533 kern.notice] ^MSunOS Release
>  5.11 Version snv_134 64-bit
>  Jun 10 11:38:49 cn03 genunix: [ID 683174 kern.notice] Copyright 1983-
>  2010 Sun Microsystems, Inc.  All rights
> 
>  reserved.
>  Jun 10 11:38:49 cn03 Use is subject to license terms.
>  Jun 10 11:38:49 cn03 unix: [ID 126719 kern.info] features:
>  7f7f  t,sse2,sse,sep,pat,cx8,pae,mca,mmx,cmov,de,pge,mtrr,msr,tsc,lgpg>
> 
>  Jun 10 11:39:06 cn03 scsi: [ID 365881 kern.info]
>  /pci@0,0/pci8086,3410@9/pci1000,72@0 (mpt_sas0):
>  Jun 10 11:39:06 cn03  mptsas0 unrecognized capability 0x3
> 
>  Jun 10 11:39:42 cn03 scsi: [ID 107833 kern.warning] WARNING:
>  /scsi_vhci/disk@g5000c50009723937 (sd3):
>  Jun 10 11:39:42 cn03  drive offline
>  Jun 10 11:39:47 cn03 scsi: [ID 107833 kern.warning] WARNING:
>  /scsi_vhci/disk@g5000c50009723937 (sd3):
>  Jun 10 11:39:47 cn03  drive offline
>  Jun 10 11:39:52 cn03 scsi: [ID 107833 kern.warning] WARNING:
>  /scsi_vhci/disk@g5000c50009723937 (sd3):
>  Jun 10 11:39:52 cn03  drive offline
>  Jun 10 11:39:57 cn03 scsi: [ID 107833 kern.warning] WARNING:
>  /scsi_vhci/disk@g5000c50009723937 (sd3):
>  Jun 10 11:39:57 cn03  drive offline
> >
> >
> > >
> > > Hot spare will not help you here. The problem is not constrained to
> > one
> > > disk.
> > > In fact, a hot spare may be the worst thing here because it can
> kick
> > in
> > > for the disk
> > > complaining about a clogged expander or spurious resets.  This
> causes
> > a
> > > resilver
> > > that reads from the actual broken disk, that causes more resets,
> that
> > > kicks out another
> > > disk that causes a resilver, and so on.
> > >  -- richard
> > >
> >
>  So warm spares could be the "better" choice in this situation?
>  BTW, under what conditions will the SCSI reset storm happen?
>  How can we be immune to this, so as NOT to interrupt the file
>  service?
> >
> >
> > Thanks.
> > Fred
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Lanky Doodle
> 4 - the 16th port
> 
> Can you find somewhere inside the case for an SSD as
> L2ARC on your
> last port?

Although saying that, if we are saying hot spares may be bad in my scenario, I 
could ditch it and use a 3.5" SSD in the 15th drive's place?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Lanky Doodle
>I was planning on using one of
> these
> http://www.scan.co.uk/products/icy-dock-mb994sp-4s-4in
> 1-sas-sata-hot-swap-backplane-525-raid-cage

Imagine if 2.5" 2TB disks were price neutral compared to 3.5" equivalents.

I could have 40 of the buggers in my system giving 80TB raw storage! I'd 
happily use mirrors all the way in that scenario
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Lanky Doodle
Thanks Richard.

How does ZFS enumerate the disks? In terms of listing them does it do them 
logically, i.e;

controller #1 (motherboard)
|
|--- disk1
|--- disk2
controller #3
|--- disk3
|--- disk4
|--- disk5
|--- disk6
|--- disk7
|--- disk8
|--- disk9
|--- disk10
controller #4
|--- disk11
|--- disk12
|--- disk13
|--- disk14
|--- disk15
|--- disk16
|--- disk17
|--- disk18

or is it completely random leaving me with some trial and error to work out 
what disk is on what port?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-17 Thread Lanky Doodle
> 1 - are the 2 vdevs in the same pool, or two separate
> pools?
> 
I was planning on having the 2 z2 vdevs in one pool. Although having 2 pools 
and having them sync'd sounds really good, I fear it may be overkill for the 
intended purpose.

> 
> 
> 3 - spare temperature
> 
> for levels raidz2 and better, you might be happier
> with a warm spare
> and manual replacement, compared to overly-aggressive
> automated
> replacement if there is a cascade of errors.  See
> recent threads.
> 
> You may also consider a cold spare, leaving a drive
> bay free for
> disks-as-backup-tapes swapping.  If you replace the
> 1Tb's now,
> repurpose them for this rather than reselling.  
> 
I have considered this. The fact I am using cheap disks inevitably means they 
will fail sooner and more often than enterprise equivalents, so the hot spare 
may need to be over-used.
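
For what it's worth, with a warm spare the replacement is a single manual
command, something like the following (device names are hypothetical):

zpool replace tank c3t5d0 c3t9d0   # c3t9d0 is the installed-but-unconfigured warm spare
zpool status tank                  # watch the resilver progress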

Could I have different-sized vdevs and still have them both in one pool - i.e. 
an 8-disk z2 vdev and a 7-disk z2 vdev?
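
For reference, such a layout would be created roughly like this (hypothetical
device names):

zpool create tank \
    raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 \
    raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0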

> 
> 4 - the 16th port
> 
> Can you find somewhere inside the case for an SSD as
> L2ARC on your
> last port?  Could be very worthwhile for some of your
> other data and
> metadata (less so the movies).

Yes! I have 10 5.25" drive bays in my case. 9 of them are occupied by the 
5-in-3 hot-swap caddies, leaving 1 bay left. I was planning on using one of 
these 
http://www.scan.co.uk/products/icy-dock-mb994sp-4s-4in1-sas-sata-hot-swap-backplane-525-raid-cage
 in the drive bay and having 2x 2.5" SATA drives mirrored for the root pool, 
leaving 2 drive bays spare.

For the mirrored root pool I was going to use 2 of the 6 motherboard SATA II 
ports so they are entirely separate from the 'data' controllers. So I could 
either use the 16th port on the Supermicro controllers for an SSD, or one of the 
remaining motherboard ports.

What size would you recommend for the L2ARC disk? I ask as I have a 72GB 10k SAS 
disk spare, so I could use this for now (being faster than SATA), but it would 
have to be on the Supermicro card as that also supports SAS drives. SSDs are a 
bit out of range price-wise at the moment, so I'd wait to use one. Also, ZFS 
doesn't support TRIM yet, does it?
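
For what it's worth, a cache device can be added and removed at any time, so
starting with the 72GB SAS disk and swapping in an SSD later is easy (pool and
device names below are just examples):

zpool add tank cache c5t0d0      # add the disk as L2ARC
zpool remove tank c5t0d0         # later, remove it before adding the SSD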

Thank you for your excellent post! :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss