Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-17 Thread Kent Watsen






  Probably not, my box has 10 drives and two very thirsty FX74 processors
and it draws 450W max.

At 1500W, I'd be more concerned about power bills and cooling than the UPS!
  


Yeah - good point, but I need my TV! - or so I tell my wife so I can
play with all this gear  :-X 

Cheers,
Kent



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-16 Thread Adam Lindsay
Heya Kent,

Kent Watsen wrote:
 It sounds good, that way, but (in theory), you'll see random I/O 
 suffer a bit when using RAID-Z2: the extra parity will drag 
 performance down a bit.
 I know what you are saying, but I wonder if it would be noticeable?  I 

Well, noticeable again comes back to your workflow. As you point out 
to Richard, it's (theoretically) 2x IOPS difference, which can be very 
significant for some people.

 think my worst case scenario would be 3 myth frontends watching 1080p 
 content while 4 tuners are recording 1080p content - with each 1080p 
 stream being 27Mb/s, that would be 108Mb/s writes and 81Mb/s reads (all 
 sequential I/O) - does that sound like it would even come close to 
 pushing a 4(4+2) array?

I would say no, not even close to pushing it. Remember, we're measuring 
performance in MBytes/s, and video throughput is measured in Mbit/s (and 
even then, I imagine that a 27 Mbit/s stream over the air is going to be 
pretty rare). So I'm figuring you're just scratching the surface of even 
a minimal array.

Put it this way: can a single, modern hard drive keep up with an ADSL2+ 
(24 Mbit/s) connection?
Throw 24 spindles at the problem, and I'd say you have headroom for a 
*lot* of streams.
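
To put rough numbers on it, a back-of-the-envelope sketch in Python (the
50 MByte/s per-spindle figure is just a conservative guess for a current
SATA drive, not a measurement):

  MBIT_PER_STREAM = 27                  # Kent's 1080p figure, in Mbit/s
  writes = 4 * MBIT_PER_STREAM          # 4 tuners recording
  reads  = 3 * MBIT_PER_STREAM          # 3 myth frontends watching
  total_mbyte = (writes + reads) / 8.0  # 189 Mbit/s ~= 23.6 MByte/s

  per_disk_mbyte = 50.0                 # assumed sustained sequential rate per spindle
  data_disks = 4 * 4                    # 4 x (4+2) RAID-Z2 = 16 data spindles

  print("worst case: %.1f MByte/s" % total_mbyte)
  print("rough sequential ceiling: %.0f MByte/s" % (per_disk_mbyte * data_disks))

Even with a pessimistic per-disk number, the worst case is only a few
percent of what the pool should be able to stream.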

 The RAS guys will flinch at this, but have you considered 8*(2+1) 
 RAID-Z1?
 That configuration showed up in the output of the program I posted back 
 in July 
 (http://mail.opensolaris.org/pipermail/zfs-discuss/2007-July/041778.html):
 
24 bays w/ 500 GB drives having MTBF=5 years
  - can have 8 (2+1) w/ 0 spares providing 8000 GB with MTTDL of
95.05 years
  - can have 4 (4+2) w/ 0 spares providing 8000 GB with MTTDL of
8673.50 years
 
 But it is 91 times more likely to fail and this system will contain data 
 that  I don't want to risk losing

I wasn't sure, with your workload. I know with mine, I'm seeing the data 
store as being mostly temporary. With that much data streaming in and 
out, are you planning on archiving *everything*? Cos that's only one 
month's worth of HD video.
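
(Quick sanity check on the "one month" figure, assuming a single
continuous 27 Mbit/s stream filling 8000 GB of usable space:)

  bytes_total = 8000e9                  # 8000 GB usable from 4 x (4+2) of 500 GB drives
  bytes_per_sec = 27e6 / 8              # one 27 Mbit/s stream ~= 3.4 MByte/s
  print(bytes_total / bytes_per_sec / 86400)   # ~27 days of footage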

I'd consider tuning a portion of the array for high throughput, and 
another for high redundancy as an archive for whatever you don't want to 
lose. Whether that's by setting copies=2 or by having a mirrored zpool 
(smart for an archive, because you'll be less sensitive to the 
write-performance hit there) is up to you...
ZFS gives us a *lot* of choices. (But then you knew that, and it's what 
brought you to the list :)

 I don't want to over-pimp my links, but I do think my blogged 
 experiences with my server (also linked in another thread) might give 
 you something to think about:
  http://lindsay.at/blog/archive/tag/zfs-performance/
 I see that you also set up a video server (myth?), 

For the uncompressed HD test case, no. It'd be for storage/playout of 
Ultra-Grid-like streams, and really, that's there so our network guys 
can give their 10Gb links a little bit of a workout.

 from your blog, I 
 think you are doing 5(2+1) (plus a hot-spare?)  - this is what my 
 program says about a 16-bay system:
 
16 bays w/ 500 GB drives having MTBF=5 years
  - can have 5 (2+1) w/ 1 spares providing 5000 GB with MTTDL of
1825.00 years
  [snipped some interesting numbers]
 Note that your MTTDL isn't quite as bad as 8(2+1) since you have three 
 fewer stripes.  

I also committed to having at least one hot spare, which, after staring 
at relling's graphs for days on end, seems to be the cheapest, easiest 
way of upping the MTTDL for any array. I'd recommend it.

Also, it's interesting for me to note that you have 5 
 stripes and my 4(4+2) setup would have just one less - so the question to 
 answer is whether your extra stripe is better than my 2 extra disks in each 
 raid-set?

As I understand it, 5(2+1) would scale to better IOPS performance than 
4(4+2), and IOPS represents the performance baseline; as you ask the 
array to do more and more at once, it'll look more like random seeks.

What you get from those bigger zvol groups of 4+2 is higher performance 
per zvol. That said, my few datapoints on 4+1 RAID-Z groups 
(running on 2 controllers) suggest that that configuration runs into a 
bottleneck somewhere and underperforms what's expected.

 Testing 16 disks locally, however, I do run into noticeable I/O 
 bottlenecks, and I believe it's down to the top limits of the PCI-X bus.
 Yes, too bad Supermicro doesn't make a PCIe-based version...   But 
 still, the limit of a 64-bit, 133.3MHz PCI-X bus is 1067 MB/s whereas a 
 64-bit, 100MHz PCI-X bus is 800MB/s - either way, it's much faster than 
 my worst case scenario from above where 7 1080p streams would be 189Mb/s...

Oh, the bus will far exceed your needs, I think.
The exercise is to specify something that handles what you need without 
breaking the bank, no?

BTW, where are these HDTV streams coming from/going to? Ethernet? A 
capture card? (and which ones will work with Solaris?)

 Still, though, take a look at 

Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-16 Thread Anton B. Rang
 - can have 6 (2+2) w/ 0 spares providing 6000 GB with MTTDL of
 28911.68 years

This should, of course, set off one's common-sense alert.

 it is 91 times more likely to fail and this system will contain data 
 that I don't want to risk losing

If you don't want to risk losing data, you need multiple -- off-site -- copies.

(Incidentally, I rarely see these discussions touch upon what sort of UPS is 
being used. Power fluctuations are a great source of correlated disk failures.)

Anton
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-16 Thread Kent Watsen

 I know what you are saying, but I wonder if it would be noticeable?  I 

 Well, noticeable again comes back to your workflow. As you point out 
 to Richard, it's (theoretically) 2x IOPS difference, which can be very 
 significant for some people.
Yeah, but my point is whether it would be noticeable to *me* (yes, I am a bit 
self-centered)

 I would say no, not even close to pushing it. Remember, we're 
 measuring performance in MBytes/s, and video throughput is measured in 
 Mbit/s (and even then, I imagine that a 27 Mbit/s stream over the air 
 is going to be pretty rare). So I'm figuring you're just scratching 
 the surface of even a minimal array.

 Put it this way: can a single, modern hard drive keep up with an 
 ADSL2+ (24 Mbit/s) connection?
 Throw 24 spindles at the problem, and I'd say you have headroom for a 
 *lot* of streams.
Sweet!  I should probably hang up this thread now, but there are too 
many other juicy bits to respond to...

 I wasn't sure, with your workload. I know with mine, I'm seeing the 
 data store as being mostly temporary. With that much data streaming in 
 and out, are you planning on archiving *everything*? Cos that's only 
 one month's worth of HD video.
Well, not to down-play the importance of my TV recordings - which is 
really a laugh because I'm not a big TV watcher - I simply don't 
want to ever have to think about this again after getting it set up

 I'd consider tuning a portion of the array for high throughput, and 
 another for high redundancy as an archive for whatever you don't want 
 to lose. Whether that's by setting copies=2 or by having a mirrored 
 zpool (smart for an archive, because you'll be less sensitive to the 
 write-performance hit there) is up to you...
 ZFS gives us a *lot* of choices. (But then you knew that, and it's 
 what brought you to the list :)
All true, but if 4(4+2) serves all my needs, I think that it's simpler to 
administer, as I can arbitrarily allocate space as needed without 
needing to worry about what kind of space it is - all the space is good 
and fast space...

 I also committed to having at least one hot spare, which, after 
 staring at relling's graphs for days on end, seems to be the cheapest, 
 easiest way of upping the MTTDL for any array. I'd recommend it.
No doubt that a hot-spare gives you a bump in MTTDL, but double-parity 
trumps it big time - check out Richard's blog...

 As I understand it, 5(2+1) would scale to better IOPS performance than 
 4(4+2), and IOPS represents the performance baseline; as you ask the 
 array to do more and more at once, it'll look more like random seeks.

 What you get from those bigger zvol groups of 4+2 is higher 
 performance per zvol. That said, my few datapoints on 4+1 RAID-Z 
 groups (running on 2 controllers) suggest that that configuration runs 
 into a bottleneck somewhere and underperforms what's expected.
Er?  Can anyone fill in the missing blank here?


 Oh, the bus will far exceed your needs, I think.
 The exercise is to specify something that handles what you need 
 without breaking the bank, no?
Bank, smank - I build a system every 5+ years and I want it to kick ass 
all the way until I build the next one - cheers!


 BTW, where are these HDTV streams coming from/going to? Ethernet? A 
 capture card? (and which ones will work with Solaris?)
Glad you asked - for the list's sake, I'm using two HDHomeRun tuners 
(http://www.silicondust.com/wiki/products/hdhomerun) - actually, I 
bought 3 of them because I felt like I needed a spare :-D


 Yeah, perhaps I've been a bit too circumspect about it, but I haven't 
 been all that impressed with my PCI-X bus configuration. Knowing what 
 I know now, I might've spec'd something different. Of all the 
 suggestions that've gone out on the list, I was most impressed with 
 Tim Cook's:

 Won't come cheap, but this mobo comes with 6x pci-x slots... should 
 get the job done :)

 http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBE-X.cfm 


 That has 3x 133MHz PCI-X slots each connected to the Southbridge via a 
 different PCIe bus, which sounds worthy of being the core of the 
 demi-Thumper you propose.
Yeah, but getting back to PCIe I see these tasty SAS/SATA HBAs from LSI: 
http://www.lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/lsisas3081er/index.html
 
(note, LSI also sells matching PCI-X HBA controllers, in case you need 
to balance your mobo's architecture)

 ...But It all depends what you intend to spend. (This is what I 
 was going to say in my next blog entry on the system:) We're talking 
 about benchmarks that are really far past what you say is your most 
 taxing work load. I say I'm disappointed with the contention on my 
 bus putting limits on maximum throughputs, but really, what I have far 
 outstrips my ability to get data into or out of the system.
So moving to the PCIe-based cards should fix that - no?

 So all of my disappointment is in theory.
Seems like this 

Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-16 Thread Ian Collins
Kent Watsen wrote:

 Glad you brought that up - I currently have an APC 2200XL 
 (http://www.apcc.com/resource/include/techspec_index.cfm?base_sku=SU2200XLNET)
  
  - it's rated for 1600 watts, but my current case selections are saying 
 they have a 1500W 3+1, should I be worried?

   
Probably not, my box has 10 drives and two very thirsty FX74 processors
and it draws 450W max.

At 1500W, I'd be more concerned about power bills and cooling than the UPS!

Ian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-15 Thread Kent Watsen

Hey Adam,

 My first posting contained my use-cases, but I'd say that video 
 recording/serving will dominate the disk utilization - thats why I'm 
 pushing for 4 striped sets of RAIDZ2 - I think that it would be all 
 around goodness

 It sounds good, that way, but (in theory), you'll see random I/O 
 suffer a bit when using RAID-Z2: the extra parity will drag 
 performance down a bit.
I know what you are saying, but I wonder if it would be noticeable?  I 
think my worst case scenario would be 3 myth frontends watching 1080p 
content while 4 tuners are recording 1080p content - with each 1080p 
stream being 27Mb/s, that would be 108Mb/s writes and 81Mb/s reads (all 
sequential I/O) - does that sound like it would even come close to 
pushing a 4(4+2) array?



 The RAS guys will flinch at this, but have you considered 8*(2+1) 
 RAID-Z1?
That configuration showed up in the output of the program I posted back 
in July 
(http://mail.opensolaris.org/pipermail/zfs-discuss/2007-July/041778.html):

24 bays w/ 500 GB drives having MTBF=5 years
  - can have 8 (2+1) w/ 0 spares providing 8000 GB with MTTDL of
95.05 years
  - can have 6 (2+2) w/ 0 spares providing 6000 GB with MTTDL of
28911.68 years
  - can have 4 (4+1) w/ 4 spares providing 8000 GB with MTTDL of
684.38 years
  - can have 4 (4+2) w/ 0 spares providing 8000 GB with MTTDL of
8673.50 years
  - can have 2 (8+1) w/ 6 spares providing 8000 GB with MTTDL of
380.21 years
  - can have 2 (8+2) w/ 4 spares providing 8000 GB with MTTDL of
416328.12 years

But it is 91 times more likely to fail and this system will contain data 
that  I don't want to risk losing
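
For the curious, here is a minimal sketch of where those figures come
from, using the standard two- and three-failure MTTDL formulas.  The
5-year MTBF is from the listing above; the 48-hour MTTR is an assumption,
chosen because it reproduces the zero-spare rows (the rows with spares
need the fuller model on Richard's blog):

  HOURS_PER_YEAR = 8760.0
  mtbf = 5 * HOURS_PER_YEAR      # per-drive MTBF in hours, as in the listing above
  mttr = 48.0                    # assumed replace-and-resilver time, in hours

  def mttdl_raidz1(disks, group_size):
      # single parity: data loss needs a 2nd failure in the same group during repair
      return mtbf**2 / (disks * (group_size - 1) * mttr)

  def mttdl_raidz2(disks, group_size):
      # double parity: data loss needs a 3rd failure in the same group during repair
      return mtbf**3 / (disks * (group_size - 1) * (group_size - 2) * mttr**2)

  print(mttdl_raidz1(24, 3) / HOURS_PER_YEAR)         # 8 (2+1) -> ~95.05 years
  print(mttdl_raidz2(24, 4) / HOURS_PER_YEAR)         # 6 (2+2) -> ~28911.68 years
  print(mttdl_raidz2(24, 6) / HOURS_PER_YEAR)         # 4 (4+2) -> ~8673.50 years
  print(mttdl_raidz2(24, 6) / mttdl_raidz1(24, 3))    # ~91x, the ratio quoted above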



 I don't want to over-pimp my links, but I do think my blogged 
 experiences with my server (also linked in another thread) might give 
 you something to think about:
  http://lindsay.at/blog/archive/tag/zfs-performance/
I see that you also set up a video server (myth?) - from your blog, I 
think you are doing 5(2+1) (plus a hot-spare?)  - this is what my 
program says about a 16-bay system:

16 bays w/ 500 GB drives having MTBF=5 years
  - can have 5 (2+1) w/ 1 spares providing 5000 GB with MTTDL of
1825.00 years
  - can have 4 (2+2) w/ 0 spares providing 4000 GB with MTTDL of
43367.51 years
  - can have 3 (4+1) w/ 1 spares providing 6000 GB with MTTDL of
912.50 years
  - can have 2 (4+2) w/ 4 spares providing 4000 GB with MTTDL of
2497968.75 years
  - can have 1 (8+1) w/ 7 spares providing 4000 GB with MTTDL of
760.42 years
  - can have 1 (8+2) w/ 6 spares providing 4000 GB with MTTDL of
832656.25 years

Note that your MTTDL isn't quite as bad as 8(2+1) since you have three 
fewer stripes.  Also, it's interesting for me to note that you have 5 
stripes and my 4(4+2) setup would have just one less - so the question to 
answer is whether your extra stripe is better than my 2 extra disks in each raid-set?



 Testing 16 disks locally, however, I do run into noticeable I/O 
 bottlenecks, and I believe it's down to the top limits of the PCI-X bus.
Yes, too bad Supermicro doesn't make a PCIe-based version...   But 
still, the limit of a 64-bit, 133.3MHz PCI-X bus is 1067 MB/s whereas a 
64-bit, 100MHz PCI-X bus is 800MB/s - either way, it's much faster than 
my worst case scenario from above where 7 1080p streams would be 189Mb/s...
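
(Spelling out the arithmetic - raw bus bandwidth is just width times
clock, ignoring protocol overhead:)

  bus_width_bytes = 64 / 8.0            # 64-bit PCI-X moves 8 bytes per clock
  print(bus_width_bytes * 133.3)        # ~1067 MB/s at 133.3 MHz
  print(bus_width_bytes * 100.0)        # 800 MB/s at 100 MHz
  print(7 * 27 / 8.0)                   # 7 x 27 Mbit/s streams ~= 23.6 MByte/s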



  As far as a mobo with good PCI-X architecture - check out
 the latest from Tyan 
 (http://tyan.com/product_board_detail.aspx?pid=523) - it has three 
 133/100MHz PCI-X slots

 I use a Tyan in my server, and have looked at a lot of variations, but 
 I hadn't noticed that one. It has some potential.

 Still, though, take a look at the block diagram on the datasheet: that 
 actually looks like 1x PCI-X 133MHz slot and a bridge sharing 2x 
 100MHz slots. My benchmarks so far show that putting a controller on a 
 100MHz slot is measurably slower than 133MHz, but contention over a 
 single bridge can be even worse.
Hmmm, I hadn't thought about that...  Here is another new mobo from Tyan 
(http://tyan.com/product_board_detail.aspx?pid=517) - its datasheet 
shows the PCI-X buses configured the same way as your S3892:



Thanks!
Kent

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-15 Thread Kent Watsen

 Nit: small, random read I/O may suffer.  Large random read or any random
 write workloads should be ok.
Given that video-serving is all sequential-read, is it correct that 
raidz2, specifically 4(4+2), would be just fine?

 For 24 data disks there are enough combinations that it is not easy to
 pick from.  The attached RAIDoptimizer output may help you decide on
 the trade-offs.  
Wow! - thanks for running it with 24 disks!

 For description of the theory behind it, see my blog
 http://blogs.sun.com/relling
I used your theory to write my own program (posted in July), but yours 
is way more complete

 I recommend loading it into StarOffice 
Nice little plug  ;-)

 and using graphs or sorts to
 reorder the data, based on your priorities.  
Interesting, my 4(4+2) has 282 iops, whereas 8(2+1) has 565 iops - 
essentially double, which is kind of expected given that it has twice as 
many stripes...  Also, it helps to see that the iops extremes are 
12(raid1) with 1694 iops and 2(10+2) with 141 iops - so 4(4+2) is not a 
great 24-disk performer, but isn't 282 iops probably overkill for my 
home network?
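
Those iops figures look consistent with a very simple model: for small
random reads, each RAID-Z/RAID-Z2 group delivers roughly one disk's worth
of iops, while a mirror delivers one disk's worth per side.  A sketch,
with ~71 iops per drive as an assumed figure for a 7200rpm disk:

  per_disk_iops = 70.6               # assumed small-random-read iops per drive

  def raidz_iops(sets):              # RAID-Z/RAID-Z2: ~one disk's worth per group
      return sets * per_disk_iops

  def mirror_iops(sets, way=2):      # mirrors can read from every side
      return sets * way * per_disk_iops

  print(raidz_iops(4))     # 4(4+2)             -> ~282
  print(raidz_iops(8))     # 8(2+1)             -> ~565
  print(raidz_iops(2))     # 2(10+2)            -> ~141
  print(mirror_iops(12))   # 12 two-way mirrors -> ~1694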

Yes, I (obviously :-) recommend

 http://www.sun.com/storagetek/storage_networking/hba/sas/specs.xml
Very nice - think I'll be getting 3 of these!


Thanks,
Kent




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-15 Thread Kent Watsen


 Sorry, but looking again at the RMP page, I see that the chassis I 
 recommended is actually different than the one we have.  I can't find 
 this chassis only online, but here's what we bought:

 http://www.siliconmechanics.com/i10561/intel-storage-server.php?cat=625
That is such a cool looking case!

 From their picture gallery, you can't see the back, but it has space 
 for 3.5" drives in the back.  You can put hot swap trays back there 
 for your OS drives.  The guys at Silicon Mechanics are great, so you 
 could probably call them to ask who makes this chassis.  They may also 
 be able to build you a partial system, if you like.
An excellent suggestion, but after configuring the nServ K501 (because I 
want quad-core AMD) the way I want it, their price is almost exactly the 
same as my thrifty-shopper price, unlike RackMountPro which seems to add 
about 20% overhead - so I'll probably order the whole system from them, 
sans the Host Bus Adapter, as I'll use the Sun card Richard suggested



Thanks!
Kent
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-15 Thread Kent Watsen

[CC-ing xen-discuss regarding question below]


 Probably a 64 bit dual core with 4GB of (ECC) RAM would be a good
 starting point.

 Agreed.
So I was completely out of the ball-park - I hope the ZFS Wiki can be 
updated to contain some sensible hardware-sizing information...

One option I'm still holding on to is to also use the ZFS system as a 
Xen server - that is, OpenSolaris would be running in Dom0...  Given that 
the Xen hypervisor has a pretty small CPU/memory footprint, do you think 
it could share 2 cores + 4GB with ZFS, or should I allocate 3 cores to 
Dom0 and bump the memory up by 512MB?


Thanks,
Kent

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Kent Watsen


 I will only comment on the chassis, as this is made by AIC (short for 
 American Industrial Computer), and I have three of these in service at 
 my work.  These chassis are quite well made, but I have experienced 
 the following two problems:

 snip
Oh my, thanks for the heads-up!  Charlie at RMP said that they were the most 
popular - so I assumed that they were solid...


 For all new systems, I've gone with this chassis instead (I just 
 noticed Rackmount Pro sells 'em also):

 http://rackmountpro.com/productpage.php?prodid=2043
But I was hoping for resiliency and easy replacement for the OS drive - 
hot-swap RAID1 seemed like a no-brainer...  This case has one internal and 
one external 3.5" drive bay.  I could use a CF reader for resiliency 
and reduce the need for replacement - assuming I spool logs to the internal 
drive so as to not burn out the CF.  Alternatively, I could put a couple of 
2.5" drives into a single 3.5" bay for RAID1 resiliency, but I'd have to 
shut down to replace a drive...  What do you recommend?


 One other thing, that you may know already.  Rackmount Pro will try to 
 sell you 3ware cards, which work great in the Linux/Windows 
 environment, but aren't supported in Open Solaris, even in JBOD mode.  
 You will need alternate SATA host adapters for this application.
Indeed, but why pay for a RAID controller when you only need SATA 
ports?  That's why I was thinking of picking up three of these bad boys 
(http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm) 
for about $100 each

 Good luck,
Getting there - can anybody clue me in on how much CPU/memory ZFS needs?
I have an old 1.2GHz box with 1GB of memory lying around - would it be sufficient?


Thanks!
Kent






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Adam Lindsay
Kent Watsen wrote:

 I'm putting together an OpenSolaris ZFS-based system and need help 
 picking hardware.

Fun exercise! :)

 I'm thinking about using this 26-disk case:  [FYI: 2-disk RAID1 for the 
 OS & 4*(4+2) RAIDZ2 for SAN]

What are you *most* interested in for this server? Reliability? 
Capacity? High Performance? Reading or writing? Large contiguous reads 
or small seeks?

One thing that I did that got good feedback from this list was picking 
apart the requirements of the most demanding workflow I imagined for the 
machine I was speccing out.

 Regarding the mobo, CPUs, and memory - I searched Google and the ZFS 
 site and all I came up with so far is that, for a dedicated iSCSI-based 
 SAN, I'll need about 1 Gb of memory and a low-end processor - can anyone 
 clarify exactly how much memory/cpu I'd need to be in the safe-zone?  
 Also, are there any mobo/chipsets that are particularly well suited for 
 a dedicated iSCSI-based SAN?

I'm learning more and more about this subject as I test the server (not 
all that dissimilar to what you've described, except with only 18 disks) 
I now have. I'm frustrated at the relative unavailability of PCIe SATA 
controller cards that are ZFS-friendly (i.e., JBOD), and the relative 
unavailability of motherboards that support both the latest CPUs as well 
as have a good PCI-X architecture.

If you come across some potential solutions, I think a lot of people 
here will thank you for sharing...


adam
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Kent Watsen

 Fun exercise! :)

Indeed! - though my wife and kids don't seem to appreciate it so much  ;)


 I'm thinking about using this 26-disk case:  [FYI: 2-disk RAID1 for 
 the OS & 4*(4+2) RAIDZ2 for SAN]

 What are you *most* interested in for this server? Reliability? 
 Capacity? High Performance? Reading or writing? Large contiguous reads 
 or small seeks?

 One thing that I did that got a good feedback from this list was 
 picking apart the requirements of the most demanding workflow I 
 imagined for the machine I was speccing out.
My first posting contained my use-cases, but I'd say that video 
recording/serving will dominate the disk utilization - that's why I'm 
pushing for 4 striped sets of RAIDZ2 - I think that it would be all 
around goodness


 I'm learning more and more about this subject as I test the server 
 (not all that dissimilar to what you've described, except with only 18 
 disks) I now have. I'm frustrated at the relative unavailability of 
 PCIe SATA controller cards that are ZFS-friendly (i.e., JBOD), and the 
 relative unavailability of motherboards that support both the latest 
 CPUs as well as have a good PCI-X architecture.
Good point - another reply I just sent noted a PCI-X SATA controller 
card, but I'd prefer a PCIe card - do you have a recommendation on a 
PCIe card?  As far as a mobo with good PCI-X architecture - check out 
the latest from Tyan (http://tyan.com/product_board_detail.aspx?pid=523) 
- it has three 133/100MHz PCI-X slots

 If you come across some potential solutions, I think a lot of people 
 here will thank you for sharing...
Will keep the list posted!


Thanks,
Kent





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Adam Lindsay
Kent Watsen wrote:

 What are you *most* interested in for this server? Reliability? 
 Capacity? High Performance? Reading or writing? Large contiguous reads 
 or small seeks?

 One thing that I did that got a good feedback from this list was 
 picking apart the requirements of the most demanding workflow I 
 imagined for the machine I was speccing out.
 My first posting contained my use-cases, but I'd say that video 
 recording/serving will dominate the disk utilization - thats why I'm 
 pushing for 4 striped sets of RAIDZ2 - I think that it would be all 
 around goodness

It sounds good, that way, but (in theory), you'll see random I/O suffer 
a bit when using RAID-Z2: the extra parity will drag performance down a 
bit. The RAS guys will flinch at this, but have you considered 8*(2+1) 
RAID-Z1?

I don't want to over-pimp my links, but I do think my blogged 
experiences with my server (also linked in another thread) might give 
you something to think about:
  http://lindsay.at/blog/archive/tag/zfs-performance/

 
 I'm learning more and more about this subject as I test the server 
 (not all that dissimilar to what you've described, except with only 18 
 disks) I now have. I'm frustrated at the relative unavailability of 
 PCIe SATA controller cards that are ZFS-friendly (i.e., JBOD), and the 
 relative unavailability of motherboards that support both the latest 
 CPUs as well as have a good PCI-X architecture.
 Good point - another reply I just sent noted a PCI-X sata controller 
 card, but I'd prefer a PCIe card - do you have a recommendation on a 
 PCIe card? 

Nope, but I can endorse the Supermicro card you mentioned. That's one 
component in my server I have few doubts about.

When I was kicking around possibilities on the list, I started out 
thinking about Areca's PCIe RAID cards, used in JBOD mode. The on-list 
consensus was that they would be overkill. (Plus, there's the reliance 
on Solaris drivers from Areca.) It's true, for my configuration: disk 
I/O far exceeds the network I/O I'll be dealing with.

Testing 16 disks locally, however, I do run into noticeable I/O 
bottlenecks, and I believe it's down to the top limits of the PCI-X bus.

  As far as a mobo with good PCI-X architecture - check out
 the latest from Tyan (http://tyan.com/product_board_detail.aspx?pid=523) 
 - it has three 133/100MHz PCI-X slots

I use a Tyan in my server, and have looked at a lot of variations, but I 
hadn't noticed that one. It has some potential.

Still, though, take a look at the block diagram on the datasheet: that 
actually looks like 1x PCI-X 133MHz slot and a bridge sharing 2x 100MHz 
slots. My benchmarks so far show that putting a controller on a 100MHz 
slot is measurably slower than 133MHz, but contention over a single 
bridge can be even worse.

hth,
adam
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Richard Elling

comments from a RAS guy below...

Adam Lindsay wrote:

Kent Watsen wrote:

What are you *most* interested in for this server? Reliability? 
Capacity? High Performance? Reading or writing? Large contiguous reads 
or small seeks?


One thing that I did that got a good feedback from this list was 
picking apart the requirements of the most demanding workflow I 
imagined for the machine I was speccing out.
My first posting contained my use-cases, but I'd say that video 
recording/serving will dominate the disk utilization - thats why I'm 
pushing for 4 striped sets of RAIDZ2 - I think that it would be all 
around goodness


It sounds good, that way, but (in theory), you'll see random I/O suffer 
a bit when using RAID-Z2: the extra parity will drag performance down a 
bit. The RAS guys will flinch at this, but have you considered 8*(2+1) 
RAID-Z1?


Nit: small, random read I/O may suffer.  Large random read or any random
write workloads should be ok.

For 24 data disks there are enough combinations that it is not easy to
pick from.  The attached RAIDoptimizer output may help you decide on
the trade-offs.  For description of the theory behind it, see my blog
http://blogs.sun.com/relling

I recommend loading it into StarOffice and using graphs or sorts to
reorder the data, based on your priorities.  Also, this uses a generic
model; knowing the drive model will allow bandwidth analysis (with the
caveats shown in Adam's blog below).

I don't want to over-pimp my links, but I do think my blogged 
experiences with my server (also linked in another thread) might give 
you something to think about:

  http://lindsay.at/blog/archive/tag/zfs-performance/

I'm learning more and more about this subject as I test the server 
(not all that dissimilar to what you've described, except with only 18 
disks) I now have. I'm frustrated at the relative unavailability of 
PCIe SATA controller cards that are ZFS-friendly (i.e., JBOD), and the 
relative unavailability of motherboards that support both the latest 
CPUs as well as have a good PCI-X architecture.
Good point - another reply I just sent noted a PCI-X sata controller 
card, but I'd prefer a PCIe card - do you have a recommendation on a 
PCIe card? 


Yes, I (obviously :-) recommend
http://www.sun.com/storagetek/storage_networking/hba/sas/specs.xml

Note: marketing still seems to have SATA-phobia, so if you search for SATA
you'll be less successful than searching for SAS.  But many SAS HBAs support
SATA devices, too.
 -- richard
RAIDOptimizer Report
Date: Sep 14, 2007 at 9:50:42 AM
RAIDOptimizer version: 1.0

RAID Type | Disks/Set | Sets | Spares | Copies | Space (GBytes) | MTTDL[1] (yrs) | MTTDL[2] (yrs) | Performance (iops) | Min BW (MBytes/s) | Max BW (MBytes/s) | MTBS[1] (yrs) | MTBS[2] (yrs) | Note
RAID-Z  | 12 | 2  | 0 | 1 | 11,000 | 8,559          | 17        | 141   | 0 | 0 | 14 | 14    | 1
RAID-Z  | 8  | 3  | 0 | 1 | 10,500 | 13,450         | 27        | 212   | 0 | 0 | 14 | 14    | 1
RAID-Z  | 6  | 4  | 0 | 1 | 10,000 | 18,830         | 38        | 282   | 0 | 0 | 14 | 14    | 1
RAID-Z2 | 12 | 2  | 0 | 1 | 10,000 | 27,905,578     | 9,825     | 141   | 0 | 0 | 37 | 37    | 1
RAID-Z  | 4  | 6  | 0 | 1 | 9,000  | 31,383         | 63        | 424   | 0 | 0 | 14 | 14    | 1
RAID-Z2 | 8  | 3  | 0 | 1 | 9,000  | 73,086,037     | 15,439    | 212   | 0 | 0 | 37 | 37    | 1
RAID-Z  | 3  | 8  | 0 | 1 | 8,000  | 47,074         | 95        | 565   | 0 | 0 | 14 | 14    | 1
RAID-Z2 | 6  | 4  | 0 | 1 | 8,000  | 153,480,678    | 33,774    | 282   | 0 | 0 | 37 | 37    | 1
Mirror  | 2  | 12 | 0 | 1 | 6,000  | 108,076        | 190       | 1,694 | 0 | 0 | 14 | 14    | 1
RAID-Z2 | 4  | 6  | 0 | 1 | 6,000  | 511,602,260    | 150,106   | 424   | 0 | 0 | 37 | 37    | 1
RAID-Z2 | 3  | 8  | 0 | 1 | 4,000  | 1,534,806,780  | 675,475   | 565   | 0 | 0 | 37 | 37    | 1
RAID-Z  | 11 | 2  | 2 | 1 | 10,000 | 39,851         | 21        | 141   | 0 | 0 | 14 | 137   | 1
RAID-Z2 | 11 | 2  | 2 | 1 | 9,000  | 1,700,290,577  | 79,701    | 141   | 0 | 0 | 37 | 642   | 1
Mirror  | 2  | 11 | 2 | 1 | 5,500  | 797,011        | 208       | 1,553 | 0 | 0 | 14 | 137   | 1
RAID-Z  | 7  | 3  | 3 | 1 | 9,000  | 69,580         | 36        | 212   | 0 | 0 | 14 | 642   | 1
RAID-Z2 | 7  | 3  | 3 | 1 | 7,500  | 5,343,770,385  | 185,548   | 212   | 0 | 0 | 37 | 3,739 | 1
RAID-Z  | 3  | 7  | 3 | 1 | 7,000  | 208,741        | 109       | 494   | 0 | 0 | 14 | 642   | 1
RAID-Z2 | 3  | 7  | 3 | 1 | 3,500  | 80,156,555,773 | 5,964,029 | 494   | 0 | 0 | 37 | 3,739 | 1
RAID-Z  | 10 | 2  | 4 | 1 | 9,000  | 48,706         | 25        | 141   | 0 | 0 | 14 | 3,739 | 1
RAID-Z  | 5  | 4  | 4 | 1 | 8,000  | 109,589        | 57        | 282   | 0 |

Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Tim Cook
Won't come cheap, but this mobo comes with 6x pci-x slots... should get the job 
done :)

http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBE-X.cfm
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Will Snow
Go look at Intel - they have a pretty decent mobo with 6 SATA ports


Tim Cook wrote:
 Won't come cheap, but this mobo comes with 6x pci-x slots... should get the 
 job done :)

 http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBE-X.cfm
  
  
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

-- 
--will snow
[EMAIL PROTECTED]
Director, Web Engineering
Sun Microsystems, Inc.
http://www.sun.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Ian Collins
Kent Watsen wrote:
 Getting there - can anybody clue me in on how much CPU/memory ZFS needs?
 I have an old 1.2GHz box with 1GB of memory lying around - would it be sufficient?


   
It'll use as much memory as you can spare, and it has a strong preference
for 64-bit systems.  Considering how much you are spending on the case
and drives, it would be foolish to skimp on the motherboard/CPU combination.

Probably a 64 bit dual core with 4GB of (ECC) RAM would be a good
starting point.

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-14 Thread Rob Windsor
Tim Cook wrote:
 Won't come cheap, but this mobo comes with 6x pci-x slots... should get the 
 job done :)
 
 http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBE-X.cfm

Yes, but where do you buy SuperMicro toys?

SuperMicro doesn't sell online; anything neat that I've found is not 
in stock at CDW; other preferred vendors on their list all have nasty 
online catalogs (if any), yadda yadda.

Rob++
-- 
Internet: [EMAIL PROTECTED]
Life: [EMAIL PROTECTED]
They couldn't hit an elephant at this distance.
   -- Major General John Sedgwick
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] hardware sizing for a zfs-based system?

2007-09-13 Thread Kent Watsen

Hi all,

I'm putting together an OpenSolaris ZFS-based system and need help 
picking hardware.

I'm thinking about using this 26-disk case:  [FYI: 2-disk RAID1 for the 
OS & 4*(4+2) RAIDZ2 for SAN]

http://rackmountpro.com/productpage.php?prodid=2418

Regarding the mobo, CPUs, and memory - I searched Google and the ZFS 
site and all I came up with so far is that, for a dedicated iSCSI-based 
SAN, I'll need about 1 GB of memory and a low-end processor - can anyone 
clarify exactly how much memory/CPU I'd need to be in the safe-zone?  
Also, are there any mobo/chipsets that are particularly well suited for 
a dedicated iSCSI-based SAN?

This is for my home network, which includes internet/intranet services 
(mail, web, ldap, samba, netatalk, code-repository), build/test 
environments (for my cross-platform projects), and a video server 
(mythtv-backend). 

Right now, the aforementioned run on two separate machines, but I'm 
planning to consolidate them into a single Xen-based server.  One idea I 
have is to host a Xen-server on this same machine - that is, an 
OpenSolaris-based Dom0 serving ZFS-based volumes to the DomU guest 
machines.  But if I go this way, then I'd be looking at a 4-socket Opteron 
mobo to use with AMD's just-released quad-core CPUs and tons of memory.  
My biggest concern with this approach is getting PSUs large enough to 
power it all - if anyone has experience on this front, I'd love to hear 
about it too

Thanks!
Kent





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware sizing for a zfs-based system?

2007-09-13 Thread Jonathan Loran

I will only comment on the chassis, as this is made by AIC (short for 
American Industrial Computer), and I have three of these in service at 
my work.  These chassis are quite well made, but I have experienced the 
following two problems:

1) The rails really are not up to the task of supporting such a heavy 
box when fully extended.  If you rack this guy, you are at serious risk 
of having a rail failure, and dropping the whole party on the floor.  
Ouch.  If you do use this chassis in a rack, I highly recommend you 
either install a very strong rail mounted shelf below it, or you support 
it with a lift when the rails are fully extended.

2) The power distribution board in these is a little flaky.  I haven't 
ever had one outright fail on me, but I have had some interesting 
power-on scenarios.  For example, after a planned power outage, the chassis 
would power on, but then turn itself off again after about 4-5 
seconds.  I couldn't get it powered on to stay.  What was happening was 
that the power distribution card was confused, thought it didn't have the 
necessary 3 (of 4) power supplies on line, and safed itself off.  To 
fix this, I had to pull the power supplies all out, and wait a few 
minutes to fully discharge the power distribution card, then plug the 
supplies back in.  Then it was able to power on again to stay.  A really 
odd pain in the posterior. 

For all new systems, I've gone with this chassis instead (I just noticed 
Rackmount Pro sells 'em also):

http://rackmountpro.com/productpage.php?prodid=2043

Functional rails, and better power system.

One other thing, that you may know already.  Rackmount Pro will try to 
sell you 3ware cards, which work great in the Linux/Windows environment, 
but aren't supported in Open Solaris, even in JBOD mode.  You will need 
alternate SATA host adapters for this application.

Good luck,

Jon

Kent Watsen wrote:
 Hi all,

 I'm putting together an OpenSolaris ZFS-based system and need help 
 picking hardware.

 I'm thinking about using this 26-disk case:  [FYI: 2-disk RAID1 for the 
 OS & 4*(4+2) RAIDZ2 for SAN]

 http://rackmountpro.com/productpage.php?prodid=2418

 Regarding the mobo, CPUs, and memory - I searched Google and the ZFS 
 site and all I came up with so far is that, for a dedicated iSCSI-based 
 SAN, I'll need about 1 Gb of memory and a low-end processor - can anyone 
 clarify exactly how much memory/cpu I'd need to be in the safe-zone?  
 Also, are there any mobo/chipsets that are particularly well suited for 
 a dedicated iSCSI-based SAN?

 This is for my home network, which includes internet/intranet services 
 (mail, web, ldap, samba, netatalk, code-repository), build/test 
 environments (for my cross-platform projects), and a video server 
 (mythtv-backend). 

 Right now, the aforementioned run on two separate machines, but I'm 
 planning to consolidate them into a single Xen-based server.  One idea I 
 have is to host a Xen-server on this same machine - that is, an 
 OpenSolaris-based Dom0 serving ZFS-based volumes to the DomU guest 
 machines.  But if I go this way, then I'd be looking at 4-socket Opteron 
 mobo to use with AMD's just released quad-core CPUs and tons of memory.  
 My biggest concern with this approach is getting PSUs large enough to 
 power it all - if anyone has experience on this front, I'd love to hear 
 about it too

 Thanks!
 Kent





 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

-- 


Jonathan Loran
IT Manager
Space Sciences Laboratory, UC Berkeley
(510) 643-5146  [EMAIL PROTECTED]
AST:7731^29u18e3
 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss