Heya Kent,

Kent Watsen wrote:
>> It sounds good, that way, but (in theory), you'll see random I/O 
>> suffer a bit when using RAID-Z2: the extra parity will drag 
>> performance down a bit.
> I know what you are saying, but I wonder if it would be noticeable?  I 

Well, "noticeable" again comes back to your workflow. As you point out 
to Richard, it's (theoretically) a 2x difference in IOPS, which can be 
very significant for some people.

> think my worst case scenario would be 3 myth frontends watching 1080p 
> content while 4 tuners are recording 1080p content - with each 1080p 
> stream being 27Mb/s, that would be 108Mb/s writes and 81Mb/s reads (all 
> sequential I/O) - does that sound like it would even come close to 
> pushing a 4(4+2) array?

I would say no, not even close to pushing it. Remember, we measure disk 
performance in MBytes/s, while video throughput is quoted in Mbit/s, so 
there's a factor of eight in the units alone (and even then, I imagine 
that a 27 Mbit/s over-the-air stream is going to be pretty rare). So I'm 
figuring you're just scratching the surface of even a minimal array.

Put it this way: can a single, modern hard drive keep up with an ADSL2+ 
(24 Mbit/s) connection? Easily. Throw 24 spindles at the problem, and 
I'd say you have headroom for a *lot* of streams.
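
Back-of-the-envelope, if you want it spelled out (a quick Python 
sketch; the ~60 MB/s sustained figure for a single modern SATA drive is 
my assumption, not a measurement):

  # Worst case from above: 4 recordings + 3 playbacks, each 27 Mbit/s
  streams = 4 + 3
  mbit_per_stream = 27.0
  total_mbyte = streams * mbit_per_stream / 8      # ~23.6 MB/s aggregate

  single_drive_mbyte = 60.0      # assumed sustained MB/s for one drive

  print("aggregate stream load: %.1f MB/s" % total_mbyte)
  print("fraction of one drive: %.0f%%"
        % (100 * total_mbyte / single_drive_mbyte))

So the whole worst case fits inside one spindle's sequential bandwidth, 
let alone 24 of them.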

>> The RAS guys will flinch at this, but have you considered 8*(2+1) 
>> RAID-Z1?
> That configuration showed up in the output of the program I posted back 
> in July 
> (http://mail.opensolaris.org/pipermail/zfs-discuss/2007-July/041778.html):
> 
>    24 bays w/ 500 GB drives having MTBF=5 years
>      - can have 8 (2+1) w/ 0 spares providing 8000 GB with MTTDL of
>    95.05 years
>      - can have 4 (4+2) w/ 0 spares providing 8000 GB with MTTDL of
>    8673.50 years
> 
> But it is 91 times more likely to fail, and this system will contain 
> data that I don't want to risk losing.

I wasn't sure, given your workload. I know with mine, I see the data 
store as mostly temporary. With that much data streaming in and 
out, are you planning on archiving *everything*? Cos that's "only" one 
month's worth of HD video.
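
Rough numbers, assuming "a month" means one continuous 27 Mbit/s stream 
filling the 8000 GB of usable space quoted above (a Python sketch, 
decimal gigabytes):

  usable_gb = 8000.0                   # usable space in the 4(4+2) layout
  stream_mbit = 27.0                   # one continuous 1080p stream

  gb_per_day = stream_mbit / 8 / 1000 * 86400    # ~292 GB/day
  print("filled in about %.0f days" % (usable_gb / gb_per_day))

Run all four tuners at once and you're down to about a week.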

I'd consider tuning a portion of the array for high throughput, and 
another for high redundancy as an archive for whatever you don't want to 
lose. Whether that's by setting copies=2, or by having a mirrored zpool 
(smart for an archive, because an archive is less sensitive to the 
write performance you give up with mirroring), it's up to you...
ZFS gives us a *lot* of choices. (But then you knew that, and it's what 
brought you to the list :)

>> I don't want to over-pimp my links, but I do think my blogged 
>> experiences with my server (also linked in another thread) might give 
>> you something to think about:
>>  http://lindsay.at/blog/archive/tag/zfs-performance/
> I see that you also set up a video server (myth?), 

For the uncompressed HD test case, no. It'd be for storage/playout of 
Ultra-Grid-like streams, and really, that's there so our network guys 
can give their 10Gb links a little bit of a workout.

> from your blog, I 
> think you are doing 5(2+1) (plus a hot-spare?)  - this is what my 
> program says about a 16-bay system:
> 
>    16 bays w/ 500 GB drives having MTBF=5 years
>      - can have 5 (2+1) w/ 1 spares providing 5000 GB with MTTDL of
>    1825.00 years
> [snipped some interesting numbers]
> Note that your MTTDL isn't quite as bad as 8(2+1) since you have three 
> fewer stripes.

I also committed to having at least one hot spare, which, after staring 
at relling's graphs for days on end, seems to be the cheapest, easiest 
way of upping the MTTDL for any array. I'd recommend it.

> Also, it's interesting for me to note that you have 5 
> stripes and my 4(4+2) setup would have just one less - so the question 
> to answer is whether your extra stripe is better than my 2 extra disks 
> in each raid-set?

As I understand it, 5(2+1) would scale to better IOPS than 4(4+2), 
since each RAID-Z group delivers roughly one drive's worth of random 
IOPS, so more groups means more IOPS. And IOPS is the performance 
baseline: as you ask the array to do more and more at once, the 
workload looks more and more like random seeks.
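
A sketch of that reasoning (Python; the ~80 random IOPS per 7200 rpm 
drive is an assumption, and the rule of thumb is that a RAID-Z group 
does roughly one drive's worth of random reads):

  iops_per_drive = 80      # assumed random IOPS for one 7200 rpm drive

  def raidz_random_iops(groups, per_drive=iops_per_drive):
      # rule of thumb: each RAID-Z group behaves like a single drive
      # for small random reads
      return groups * per_drive

  print("5 x (2+1): ~%d IOPS" % raidz_random_iops(5))
  print("4 x (4+2): ~%d IOPS" % raidz_random_iops(4))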

What you get from the bigger 4+2 groups is higher sequential throughput 
per group. That said, my few datapoints on 4+1 RAID-Z groups (running 
on 2 controllers) suggest that configuration runs into a bottleneck 
somewhere, and underperforms what's expected.

>> Testing 16 disks locally, however, I do run into noticeable I/O 
>> bottlenecks, and I believe it's down to the top limits of the PCI-X bus.
> Yes, too bad Supermicro doesn't make a PCIe-based version...   But 
> still, the limit of a 64-bit, 133.3MHz PCI-X bus is 1067 MB/s whereas a 
> 64-bit, 100MHz PCI-X bus is 800 MB/s - either way, it's much faster than 
> my worst case scenario from above where 7 1080p streams would be 189Mb/s...

Oh, the bus will far exceed your needs, I think.
The exercise is to specify something that handles what you need without 
breaking the bank, no?
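
Putting rough numbers on both (Python again; the ~70 MB/s sustained per 
drive is a guess, the bus figures are yours):

  pcix_133 = 1067.0        # MB/s, 64-bit 133.3 MHz PCI-X (your number)
  pcix_100 = 800.0         # MB/s, 64-bit 100 MHz PCI-X
  drive_mbs = 70.0         # assumed sustained MB/s per drive
  stream_load = 189.0 / 8  # the 7-stream worst case, ~24 MB/s

  print("16 local drives could source ~%.0f MB/s" % (16 * drive_mbs))
  print("bus headroom over the streams: %.0fx / %.0fx"
        % (pcix_133 / stream_load, pcix_100 / stream_load))

Which is exactly the pattern I see: local benchmarks across 16 drives 
can bump into the bus, while a workload like yours never gets anywhere 
near it.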

BTW, where are these HDTV streams coming from/going to? Ethernet? A 
capture card? (and which ones will work with Solaris?)

>> Still, though, take a look at the block diagram on the datasheet: that 
>> actually looks like 1x PCI-X 133MHz slot and a bridge sharing 2x 
>> 100MHz slots. My benchmarks so far show that putting a controller on a 
>> 100MHz slot is measurably slower than 133MHz, but contention over a 
>> single bridge can be even worse.
> Hmmm, I hadn't thought about that...  Here is another new mobo from Tyan 
> (http://tyan.com/product_board_detail.aspx?pid=517) - its datasheet 
> shows the PCI-X buses configured the same way as your S3892:

Yeah, perhaps I've been a bit too circumspect about it, but I haven't 
been all that impressed with my PCI-X bus configuration. Knowing what I 
know now, I might've spec'd something different. Of all the suggestions 
that've gone out on the list, I was most impressed with Tim Cook's:

> Won't come cheap, but this mobo comes with 6x pci-x slots... should get the 
> job done :)
> 
> http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBE-X.cfm

That has 3x 133MHz PCI-X slots each connected to the Southbridge via a 
different PCIe bus, which sounds worthy of being the core of the 
demi-Thumper you propose.

...But... it all depends on what you intend to spend. (This is what I 
was going to say in my next blog entry on the system:) We're talking 
about benchmarks that are far past what you say is your most taxing 
workload. I say I'm "disappointed" that contention on my bus puts a 
ceiling on maximum throughput, but really, what I have far outstrips my 
ability to get data into or out of the system.

So all of my "disappointment" is in theory.

adam

