Quoting "Jeffrey B. Layton" <[EMAIL PROTECTED]>, on Sat 10 Nov 2007 08:49:01 AM PST:

andrew holway wrote:
Sod all this tin pot stuff.

Buying all this crap, sticking it in a rack and stringing it together
with wire ain't difficult. Making the damn software work is the tricky
bit.

Get loads of RAM, VMware Server and BINGO! You have a cluster!


But this isn't a cluster - it's enterprise masturbation. We're talking about
HPC, not running payrolls on a server. It's all about performance.
Running a bunch of VM instances on a server is not really HPC to me.
Of course, it's a GREAT way to learn and I know a bunch of people
who use it for testing and development.

And, in fact, I contend that the grubby aspects of stringing wires, making netboot or sneakernet distribution work, and so forth are exactly what future cluster builders desperately need practice with.

If your interest is parallel algorithm design, then multiple VMs are a great way to go.

If your interest is understanding the practicalities of cluster engineering, then a stack of 50 very cheap boxes might be a suitable playground for learning by ordeal.

Suppose you have a class in cluster engineering with 20-30 students. You make up groups of 2-3 bodies (so they can learn social skills, if nothing else) and give each group a crate with 8 boxes with freshly wiped disks, plus one head node and a box full of power cords, network cables, VGA cables, and keyboards, all thrown in there by last semester's groups. There will, of course, be 9 power cords in some crates and 7 in others.

Have them build up a cluster and run some trivial demo. There will be much learning, just getting a bootable image on all 8 machines (some might go the PXEboot route, some might sneakernet).

Then tell them they have to gang all 80 machines into two clusters of 40 machines each and install a new OS. Configuration management (CM) by hand and sneakernet will be painful. Then have them swap 20 of the machines between clusters and do the same again. CM by hand and sneakernet is even MORE painful. Heck, they can start to understand the differences between parallelism on machines and parallelism in bodies (OK, Bob, you put the boot CD in machines 1-5; Fred, you do 6-10; Ann, 11-15; etc.). If they're all running from one networked file server, they'll also learn empirically why you don't want them all to boot from the network simultaneously.
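
For what it's worth, the Bob/Fred/Ann split is just a block decomposition of the machine list across workers, the same thing you'd do with loop iterations across processes. A rough sketch in Python, assuming contiguous machine numbers starting at 1; the function name assign_blocks is made up for illustration:

```python
def assign_blocks(machines, people):
    """Split machines 1..machines into contiguous blocks, one per person.

    Hypothetical helper: any leftover machines go to the first few people,
    so block sizes differ by at most one.
    """
    if not people:
        raise ValueError("need at least one person")
    base, extra = divmod(machines, len(people))
    plan = {}
    start = 1
    for i, person in enumerate(people):
        size = base + (1 if i < extra else 0)
        plan[person] = range(start, start + size)
        start += size
    return plan


if __name__ == "__main__":
    # The example from the text: 15 machines, three people.
    for person, block in assign_blocks(15, ["Bob", "Fred", "Ann"]).items():
        print(f"{person}: machines {block[0]}-{block[-1]}")
```

With 15 machines and three people this reproduces the 1-5, 6-10, 11-15 split. What it can't model is Fred wandering off with the only boot CD.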

If you've given them cheap 5-port switches, they'll also get to learn about multi-tier networking topologies.
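
To put a number on the multi-tier point, here is a back-of-the-envelope sketch in Python. It assumes a plain layered tree in which every switch below the root gives up one port as an uplink; that is one simple layout, not necessarily the cheapest one, and the name layered_tree is invented for this example.

```python
def layered_tree(hosts, ports=5):
    """Return the number of switches per tier, edge tier first.

    Assumption: each non-root switch uses one port as an uplink and so
    offers ports - 1 downlinks; the root uses all of its ports downward.
    """
    if ports < 3:
        raise ValueError("need at least 3 ports to build a tree")
    tiers = []
    links = hosts            # cables that still need a port one tier up
    while links > ports:     # too many to land on a single root switch
        switches = -(-links // (ports - 1))  # ceiling division
        tiers.append(switches)
        links = switches     # each switch contributes one uplink
    tiers.append(1)          # the root switch
    return tiers


if __name__ == "__main__":
    plan = layered_tree(40)
    print(plan, "total switches:", sum(plan))
```

Under those assumptions one 40-machine cluster needs 10 edge switches, 3 in the middle tier and 1 root, 14 in all, with four machines sharing each edge uplink. That's before anybody works out where the head node plugs in.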


That'll larn 'em....
(If anyone decides to do this, let me know... I'd love to watch. I'll even bring suitable frosty beverages for the spectators.)


Jim...

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
