Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
could you explain how raid 5 relates to sata vs sas? i can't see how it's anything but a non sequitur. Here is the motivating real-world business case: You are in the movie post-production business and need 50 TB of online storage at as low a price as possible with good performance and reliability. 7200 rpm SATA (currently ~15¢/GB on Newegg) plus RAID narrows the performance and reliability benefits of 15k rpm SAS (currently ~$1/GB) at a much lower cost.
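The cost gap John is pointing at can be made concrete with a quick back-of-the-envelope calculation using the prices he quotes (April 2009 figures, illustrative only):

```python
# Storage cost comparison at the per-GB prices quoted in the thread
# (circa April 2009); the numbers are illustrative, not current.

TB = 1000  # GB per TB, decimal, as drive vendors count

capacity_gb = 50 * TB          # 50 TB of online storage
sata_per_gb = 0.15             # ~15 cents/GB for 7200 rpm SATA
sas_per_gb = 1.00              # ~$1/GB for 15k rpm SAS

sata_cost = capacity_gb * sata_per_gb
sas_cost = capacity_gb * sas_per_gb

print(f"SATA: ${sata_cost:,.0f}")             # $7,500
print(f"SAS:  ${sas_cost:,.0f}")              # $50,000
print(f"ratio: {sas_cost / sata_cost:.1f}x")  # 6.7x
```

Raw drive cost alone puts 15k SAS at roughly 6-7x the price for the same capacity, before counting the extra spindles, enclosures, and power.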
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
On Sun, Apr 19, 2009 at 12:58 AM, John Barham jbar...@gmail.com wrote: I certainly can't think ahead 20 years but I think it's safe to say that the next 5 (at least doing HPC and large-scale web type stuff) will increasingly look like this: http://www.technologyreview.com/computing/22504/?a=f, which talks about building a cluster from AMD Geode (!) nodes w/ compact flash storage. Sure it's not super-fast, but it's very efficient per watt. If you had more cash you might substitute HE Opterons and SSD's but the principle is the same. It's nice. We did that one a few years ago. Here is the 7 year old version: http://eri.ca.sandia.gov/eri/howto.html We've been doing these with the Geode stuff since about 2006. We are certainly not the first. The RLX was doing what FAWN did about 8 years ago; orion, about 3-4 years ago (both transmeta). RLX and Orion multisystems showed there is not much of a market for lots of wimpy nodes -- yet or never, is the real question. Either way, they did not have enough buyers to stay in business. And RLX had to drop its wimpy transmetas for P4s, and they could not keep up with the cheap mainboards. It's a tough business. ron
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
On Mon Apr 20 11:13:01 EDT 2009, jbar...@gmail.com wrote: could you explain how raid 5 relates to sata vs sas? i can't see how it's anything but a non sequitur. Here is the motivating real-world business case: You are in the movie post-production business and need 50 TB of online storage at as low a price as possible with good performance and reliability. 7200 rpm SATA (currently ~15¢/GB on Newegg) this example has nothing to do with raid. if the object is to find the lowest cost per gigabyte, enterprise sata drives are the cheaper option. (it would make more sense to compare 7.2k sas and sata drives. there is also a premium on spindle speed.) the original argument was that scsi is better than ata or sas is better than sata (i'm not sure which); in my opinion, there are no facts to justify either assertion. plus RAID narrows the performance and reliability benefits of 15k rpm SAS (currently ~$1/GB) at a much lower cost. without raid, such a configuration might be impossible to deal with. most 15k drives are 73gb. this means you would need 685 for 50tb. the afr is probably something like 0.15% - 0.25%. this would mean you will lose 1-2 drives/year. (if you believe those rosy afr numbers.) - erik
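erik's drive-count and failure-rate arithmetic checks out; reproduced here as a quick sketch:

```python
# Reproduce erik's arithmetic: number of 73 GB drives needed for 50 TB,
# and expected annual failures at the quoted AFR range.
import math

capacity_gb = 50 * 1000   # 50 TB, decimal GB
drive_gb = 73             # typical 15k rpm drive capacity at the time

drives = math.ceil(capacity_gb / drive_gb)
print(drives)  # 685

for afr in (0.0015, 0.0025):  # 0.15% .. 0.25% annualized failure rate
    print(f"AFR {afr:.2%}: ~{drives * afr:.1f} failures/year")
# AFR 0.15%: ~1.0 failures/year
# AFR 0.25%: ~1.7 failures/year
```

With 1-2 expected drive losses per year across the array, running such a configuration without RAID (or some other redundancy) would indeed be untenable.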
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
ron minnich wrote: RLX and Orion multisystems showed there is not much of a market for lots of wimpy nodes -- yet or never, is the real question. Either way, they did not have enough buyers to stay in business. And RLX had to drop its wimpy transmetas for P4s, and they could not keep up with the cheap mainboards. It's a tough business. All RLX showed was that they didn't know how to market benefits rather than nifty technology. Once again, the company with beautifully engineered products failed to understand that the decision makers who would buy the products were not engineers who loved them but people who needed education and handholding in order to understand them and overcome FUD from RLX competitors. A well-worn path. Wes
[9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
I certainly can't think ahead 20 years but I think it's safe to say that the next 5 (at least for HPC and large-scale web type stuff) will increasingly look like this: http://www.technologyreview.com/computing/22504/?a=f, which talks about building a cluster from AMD Geode (!) nodes w/ compact flash storage. Sure it's not super-fast, but it's very efficient per watt. If you had more cash you might substitute HE Opterons and SSDs but the principle is the same. The general trend is that capital expenditures for computing are going down but operating expenditures are going up. Indeed if you sign up for something like Amazon's EC2 service, your initial capital outlay is exactly $0. (I vividly recall paying over $3000 for a low-end server and $300/month in colo fees back in early 2003 when I had a hosting business.) Apparently they use the above cluster to implement some type of distributed memcached-style cache. Here is the page listing the many clients for memcached: http://code.google.com/p/memcached/wiki/Clients. However, if w/ Plan 9 you implement the interface to the cache as a 9p service, it is automatically available to any language that can do file I/O (heck, even Haskell, if you can slog through the advanced type theory). So your software development costs go down. Another change that levels the playing field in Plan 9's favor is the clock-speed wall and the move to multi-core chips. Soon everyone is going to have to re-write their software to make it concurrent if they want it to run faster. And concurrency is hard, especially when the predominant model is preemptive threads. Here again Plan 9's technical advantages, its lightweight kernel and CSP threading model, confer an economic advantage. I think the key to successfully using Plan 9 commercially is to use its unique technical advantages to exploit disruptive economic changes. Economics beats technology every time (e.g., x86/amd64 vs. MIPS/Itanium, Ethernet vs. Infiniband, SATA vs. SCSI) so don't try to fight it. John
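John's "cache as a 9p service" point can be sketched without Plan 9 at all: below is a toy stand-in where the cache is just a directory of files, so any language with ordinary file I/O can use it. The directory path is hypothetical; a real 9P cache server mounted at, say, /n/cache would look the same to the client.

```python
# Toy sketch of the "cache as a file tree" idea: one file per key.
# A client needs only plain file I/O -- no memcached client library.
# The path below is a local stand-in for a hypothetical 9P mount.
import os

CACHE_ROOT = "/tmp/demo-cache"  # stand-in for a real mount like /n/cache
os.makedirs(CACHE_ROOT, exist_ok=True)

def cache_set(key: str, value: bytes) -> None:
    # A write to the key's file becomes a set operation.
    with open(os.path.join(CACHE_ROOT, key), "wb") as f:
        f.write(value)

def cache_get(key: str):
    # A read becomes a get; a missing file means a cache miss.
    try:
        with open(os.path.join(CACHE_ROOT, key), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None

cache_set("greeting", b"hello")
print(cache_get("greeting"))  # b'hello'
print(cache_get("missing"))   # None
```

The point is that the protocol work happens once, in the file server; every client language gets the interface for free through open/read/write.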
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
On Sun, Apr 19, 2009 at 2:58 AM, John Barham jbar...@gmail.com wrote: I certainly can't think ahead 20 years but I think it's safe to say that the next 5 (at least doing HPC and large-scale web type stuff) will increasingly look like this: http://www.technologyreview.com/computing/22504/?a=f, which talks about building a cluster from AMD Geode (!) nodes w/ compact flash storage. Sure it's not super-fast, but it's very efficient per watt. If you had more cash you might substitute HE Opterons and SSD's but the principle is the same. We thought this was the future several years ago (http://bit.ly/16ZWjc), but couldn't convince the company that such an approach would win out over big iron. Of course, if you look at Blue Gene, it's really just a massive realization of this model with several really tightly coupled interconnects. Apparently they use the above cluster to implement some type of distributed memcached style cache. I'm not convinced that such ad-hoc DSM models are the way to go as a general principle. Full blown DSM didn't fare very well in the past. Plan 9 distributed applications take a different approach and instead of sharing memory they share services in much more of a message passing model. This isn't to say that all caches are bad -- I just don't believe in making them the foundation of your programming model as it will surely lead to trouble. -eric
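Eric's preference for sharing services rather than memory can be illustrated with a toy sketch, with plain Python threads and queues standing in for CSP channels (nothing here is Plan 9 specific): the counter's state is private to one service thread, and clients only exchange messages with it.

```python
# Toy sketch of the service/message-passing style: state lives inside
# one service thread; clients send requests over a queue rather than
# touching shared memory. Queues stand in for CSP channels here.
import threading
import queue

requests = queue.Queue()

def counter_service():
    counts = {}  # private state, never shared directly
    while True:
        op, key, reply = requests.get()
        if op == "stop":
            return
        counts[key] = counts.get(key, 0) + 1
        reply.put(counts[key])

t = threading.Thread(target=counter_service)
t.start()

reply = queue.Queue()
results = []
for _ in range(3):
    requests.put(("incr", "hits", reply))
    results.append(reply.get())

requests.put(("stop", None, None))
t.join()
print(results)  # [1, 2, 3]
```

Because only the service thread ever touches `counts`, there are no locks and no shared-memory races; the same request/reply shape maps naturally onto a 9p file server, where the "channel" is a file.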
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
Economics beats technology every time (e.g., x86/amd64 vs. MIPS/Itanium, Ethernet vs. Infiniband, SATA vs. SCSI) so don't try to fight it. if those examples prove your point, i'm not sure i agree. having just completed a combined-mode sata/sas driver, scsi vs ata is fresh in my mind. i'll use it as an example. To clarify, I meant that given X vs. Y, the cost benefits of X eventually overwhelm the initial technical benefits of Y. With SATA vs. SCSI in particular, I wasn't so much thinking of command sets or physical connections but of providing cluster scale storage (i.e., 10's or 100's of TB) where it's fast enough and reliable enough but much cheaper to use commodity 7200 rpm SATA drives in RAID 5 than server grade 10k or 15k rpm SCSI or SAS drives. John
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
To clarify, I meant that given X vs. Y, the cost benefits of X eventually overwhelm the initial technical benefits of Y. With SATA vs. SCSI in particular, I wasn't so much thinking of command sets or physical connections but of providing cluster scale storage (i.e., 10's or 100's of TB) where it's fast enough and reliable enough but much cheaper to use commodity 7200 rpm SATA drives in RAID 5 than server grade 10k or 15k rpm SCSI or SAS drives. this dated prejudice that scsi is for servers and ata is for your dad's computer has just got to die. could you explain how raid 5 relates to sata vs sas? i can't see how it's anything but a non sequitur. you do realize that enterprise sata drives are available? you do realize that many of said drives are built with the same drive mechanism as sas hard drives? as an example, the seagate es.2 drives are available with a sas interface or a sata interface. (by the way, enterprise drives are well worth it, as i discovered on monday. :-(.) while it's true there aren't any 15k sata drives currently on the market, on the other hand if you want real performance, you can beat sas by getting an intel ssd drive. these are not currently available with a sas interface. - erik
Re: [9fans] FAWN: Fast array of wimpy nodes (was: Plan 9 - the next 20 years)
On Sun, Apr 19, 2009 at 09:27:43AM -0500, Eric Van Hensbergen wrote: I'm not convinced that such ad-hoc DSM models are the way to go as a general principle. Full blown DSM didn't fare very well in the past. Plan 9 distributed applications take a different approach and instead of sharing memory they share services in much more of a message passing model. This isn't to say that all caches are bad -- I just don't believe in making them the foundation of your programming model as it will surely lead to trouble. FWIW, the more satisfying definition for me of a computing unit (the atom an OS is based on) is memory based: all the processing units having direct hardware access to a memory space / sharing the same directly hardware-accessible memory space. There seem to be two kinds of NUMA out there: 1) Cathedral-model NUMA: a hierarchical association of memories, tightly coupled but with different speeds (a lot of uniprocessors are NUMA these days, with L1 cache, L2 cache and main memory). All directly known by the cores. 2) Bazaar-model NUMA, or software NUMA, or GPLNUMA: treating an unorganized collection of storage as addressable memories, since one can always give a way to locate the resource, including by URL, associating high-speed tightly connected memories with remote storage accessible via IP packets sent by surface mail if the hardware-driven whistle is heard by the human writing the letters. Curiously enough, I believe in 1. I don't believe in 2. -- Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C