Re: [Beowulf] MS HPC... Oh dear...

2006-06-12 Thread Chris Samuel
On Monday 12 June 2006 15:39, Greg Lindahl wrote: FWIW there is a de-facto ABI with MPICH when using shared libraries, we have used this trick to run all sorts of ISV codes with our interconnect. My guess is that it is the ISVs that MS are going to target: Why bother building and supporting

Re: [Beowulf] MS HPC... Oh dear...

2006-06-12 Thread Chris Samuel
On Monday 12 June 2006 14:59, Gerry Creager N5JXS wrote: If I recall correctly, the Data General ad in response included the comment, "The bastards say, 'Welcome'". Apparently DG produced but never actually ran that advert.

Re: [Beowulf] MS HPC... Oh dear...

2006-06-12 Thread Chris Samuel
On Monday 12 June 2006 14:02, Joe Landman wrote: If HPC has been both too expensive AND too difficult to use, then why is it as a market growing at 20+% per year? Ahh, don't let facts confuse you! My guess is that the targets of their comment are MS's customers who have never touched a

Re: [Beowulf] NCSU and FORTRAN

2006-09-10 Thread Chris Samuel
On Sunday 10 September 2006 11:48 am, Jim Lux wrote: To me, the license reads pretty clear... you can fool with it at home to learn about the product, and tinker to your heart's content, but don't do it as a job or for product development. /lurk I suspect they also have a strong interest in

Re: NFS Performance (was Re: [Beowulf] GPFS on Linux (x86))

2006-09-16 Thread Chris Samuel
On Saturday 16 September 2006 12:32 am, Brent Franks wrote: Nice, any sort of comparison data in terms of differences in throughput achieved? We weren't as concerned about throughput as the fact that when the NFS server was under mild load (which the previous RH7.3 box could cope with) it

Re: NFS Performance (was Re: [Beowulf] GPFS on Linux (x86))

2006-09-16 Thread Chris Samuel
On Saturday 16 September 2006 12:49 am, Mark Hahn wrote: upgrading the rhel server to a kernel.org kernel would be minimal work, probably... That certainly wasn't the case with RHEL3 where the 2.4 kernel had NPTL backported from 2.6 and their userspace was built around that.. :-( --

Re: NFS Performance (was Re: [Beowulf] GPFS on Linux (x86))

2006-09-16 Thread Chris Samuel
On Sunday 17 September 2006 2:55 am, Mark Hahn wrote: so you couldn't have slapped a modern 2.6 kernel on it?  I've certainly put much newer kernels on old distros (though not any of the commercial RH variants so far.) This was back in mid-late 2004 and (from memory) the versions of kernel

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Chris Samuel
On Wednesday 27 September 2006 5:54 am, Mike Davis wrote: We've had good luck with Apple's arrays. /lurk So have we. On the other hand we've had an IBM FAStT EXP enclosure (or whatever they're called today) which lost 2 SCSI drives (one in the main unit and one in the EXP) within a

Re: [Beowulf] Looking for external RAID vendors

2006-09-28 Thread Chris Samuel
On Wednesday 27 September 2006 6:10 am, Erik Paulson wrote: We don't do any SAN, each one is attached to a 1U server and we have our own filesystem to track where stuff is. Works for us. Snap - IBM e325's running FC4. -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Chris Samuel
On Thursday 28 September 2006 2:08 pm, Robert G. Brown wrote: I don't think mkdir(2) does the equivalent of mkdir -p and create parent directories as required. That's quite correct, you'll always have to create those yourself if they don't already exist (otherwise you'll get ENOENT). --
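For anyone following along at home, the shell shows exactly the same behaviour when you leave off -p; a quick sketch (the paths are just examples):

    # mkdir(2), like plain mkdir, won't create missing parent directories
    $ mkdir /tmp/demo/a/b
    mkdir: cannot create directory '/tmp/demo/a/b': No such file or directory
    # mkdir -p creates each missing parent in turn - which is what your
    # code has to do by hand, one mkdir(2) call per path component
    $ mkdir -p /tmp/demo/a/b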

Re: [Beowulf] commercial clusters

2006-09-28 Thread Chris Samuel
On Wednesday 27 September 2006 5:10 am, Angel Dimitrov wrote:  Is there many clients for processor time? As I saw the biggest supercomputers in the World are very busy! I'm wondering if it's worthwhile to setup a commercial cluster. Intel are planning for new processors - two CPUs each with

Re: [Beowulf] Stupid MPI programming question

2006-09-28 Thread Chris Samuel
On Thursday 28 September 2006 2:41 pm, Mark Hahn wrote: also, man 3 strerror ;) and to make life even easier - man 3 perror :-) -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria

Re: [Beowulf] commercial clusters

2006-09-30 Thread Chris Samuel
On Saturday 30 September 2006 3:37 am, Vincent Diepeveen wrote: No average joe knows the word 'linux'. For them an astronaut course is easier than learning linux. That is demonstrably false: http://vic.computerbank.org.au/ Computerbank Victoria founded the Australian Computerbank

Re: [Beowulf] Question about root login(s)

2006-10-24 Thread Chris Samuel
On Thursday 19 October 2006 05:10, Tim Moore wrote: but do not quite understand how to login as root (without a password) throughout the compute nodes via ssh because /root is present on each system. Others have covered all the keying, but one thing that often catches a lot of people out
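For readers wanting the keying recipe itself, a minimal sketch (the hostnames are examples, and a passphrase-less root key may well be against your site policy):

    # on the head node, generate a passphrase-less key for root
    ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
    # append the public key to each compute node's authorized_keys
    for n in node01 node02; do
        ssh $n 'cat >> /root/.ssh/authorized_keys' < /root/.ssh/id_rsa.pub
    done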

Re: [Beowulf] IBM p5 cluster discussion list (off-topic)

2006-10-30 Thread Chris Samuel
On Tuesday 31 October 2006 10:54, Craig Tierney wrote: Sorry to ask this question here, but is anyone familiar with discussion lists around IBM power systems(p5 575), and in particular their cluster solution?  I have been trying for several days to find the right mailing-list or newsgroup

Re: [Beowulf] IBM p5 cluster discussion list (off-topic)

2006-10-31 Thread Chris Samuel
On Wednesday 01 November 2006 02:01, Craig Tierney wrote: Thanks for the pointers.  My customer recently purchased a pSeries p5 575 cluster (8 node, 8 dual-core sockets) with GPFS, HPS, and AIX. Ahh, AIX. 'nuff said.. Our system is OpenPOWER 720 so AIX is nobbled not to run on it (which

[Beowulf] SC'06 Beowulf Bash ?

2006-10-31 Thread Chris Samuel
Hi folks, I'm off to my first SC'06 this year to help man the VPAC booth with a couple of friends and was wondering if there were any details about the Beowulf bash yet ? cheers, Chris -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager Victorian Partnership for Advanced

Re: [Beowulf] SC'06 Beowulf Bash ?

2006-11-01 Thread Chris Samuel
On Thursday 02 November 2006 10:19, Donald Becker wrote: I'm now trying to convince people that it's better for the meeting to be the bash itself, and the topic to be Beer: is more always better? I'm pretty certain that we can find speakers for that. It depends on the beer. 'nuff said.. :-)

Re: [Beowulf] SC'06 Beowulf Bash ?

2006-11-02 Thread Chris Samuel
On Thursday 02 November 2006 7:03 pm, Leif Nixon wrote: Chris Samuel [EMAIL PROTECTED] writes: It depends on the beer.  'nuff said.. :-) Brains S.A., er, Chris.. I had a real problem with that at CCGrid in Cardiff. How are you supposed to order a pint of Brains with a straight face? I

Re: [Beowulf] Thought that this might be of interest

2006-11-06 Thread Chris Samuel
On Tuesday 07 November 2006 02:18, Vincent Diepeveen wrote: So until end 2007 the core2 annihilates any opteron system. In my experience with both architectures it depends on what you want out of a box, and what you're running on it, as well as your power constraints.. There is a good reason

Re: [Beowulf] Thought that this might be of interest

2006-11-07 Thread Chris Samuel
On Wednesday 08 November 2006 14:43, Mark Hahn wrote: I agree with Pathscale on this.  we evaluated Pathscale about 2 years ago, and were rather appalled at the license-hold time (which was substantially longer at the time.) Aha, that was around the time that we looked at it last. --

Re: [Beowulf] Apologies for the spam/virus yesterday

2006-11-09 Thread Chris Samuel
On Friday 10 November 2006 03:37, Michael Will wrote: Completely useless unless you plan to litigate and have to prove authorship Certainly for me being able to demonstrate authorship and/or modification of a message or file is useful in its own right, I don't understand why utility should

Re: [Beowulf] Apologies for the spam/virus yesterday

2006-11-09 Thread Chris Samuel
On Friday 10 November 2006 05:12, Robert G. Brown wrote: besides, signatures don't have to be attachments That's the old style ASCII armour technique, effectively superseded in 1996 by RFC 2015 (based on the previous PGP/MIME work). -- Christopher Samuel - (03)9925 4751 - VPAC Deputy

Re: [Beowulf] More cores/More processors/More nodes?

2006-11-10 Thread Chris Samuel
On Wednesday 04 October 2006 00:37, Douglas Eadline wrote: This is a non-obvious result many find hard to believe. That is, MPI on the same node may be faster than some shared/threaded mode. (of course it all depends on the application etc.) We believe we have seen that with LS-Dyna comparing

Re: [Beowulf] /etc/security/limits.conf and Torque jobs

2006-11-26 Thread Chris Samuel
On Friday 24 November 2006 18:16, 陈齐旺 wrote: I just installed Rocks v4.2.1 on our IA64 cluster and encountered this problem: with torque 2.1.2 I still get 128k for ulimit -l and I can't modify the limit, so I can't use OpenMPI or Intel MPI to run MPI programs. Probably worth asking
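For others hitting this, the usual fix (not necessarily what Rocks ships, so treat it as a sketch) is to raise the memlock limit in /etc/security/limits.conf on each node and then restart pbs_mom from a shell that already has the new limit, since the daemon passes its own limits on to the jobs it spawns:

    # /etc/security/limits.conf on each compute node (example values)
    *   soft   memlock   unlimited
    *   hard   memlock   unlimited
    # then confirm what a batch job actually sees
    echo 'ulimit -l' | qsub -l nodes=1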

Re: [Beowulf] More technical information and spec of beowulf

2006-12-03 Thread Chris Samuel
On Thursday 30 November 2006 18:19, reza bakhshi wrote: How can i find some more detailed technical information on Beowulf software infrastructure? Hopefully these will help.. http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_book/ http://en.wikipedia.org/wiki/Beowulf_(computing)

Re: [Beowulf] distributed file storage solution?

2006-12-14 Thread Chris Samuel
On Wednesday 13 December 2006 15:15, Anand Vaidya wrote: Have you considered G-FARM? http://datafarm.apgrid.org Any news on when they're going to do a release as a POSIX filesystem yet ? Last time I'd heard that was planned for 2.0, but I've not heard anything about that release.. --

Re: [Beowulf] LAM -beowulf problems

2006-12-26 Thread Chris Samuel
On Thursday 21 December 2006 03:45, Mr. Sumit Saxena wrote: I have provided the link of the libraries of LAM in my ld.so.conf as well as .bash_profile Try putting the PATH configuration for LAM into your .bashrc instead.. good luck! Chris -- Christopher Samuel - (03)9925 4751 - VPAC Deputy
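The reason this matters is that the shells lamboot starts on the remote nodes are non-interactive, and bash only reads ~/.bashrc (not ~/.bash_profile) for those. A sketch, with /opt/lam standing in for wherever LAM actually lives on your system:

    # in ~/.bashrc, so the non-interactive shells spawned via rsh/ssh pick it up
    export PATH=/opt/lam/bin:$PATH
    export LD_LIBRARY_PATH=/opt/lam/lib:$LD_LIBRARY_PATH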

Re: [Beowulf] Selling computation time

2006-12-27 Thread Chris Samuel
On Wednesday 27 December 2006 05:29, Chetoo Valux wrote: I wonder then if there would be potential buyers for cluster time. I've been browsing,  not too deep, the net, and I've not found (yet) any information of someone selling cluster time. We occasionally get approached by commercial

Re: [Beowulf] OT: some quick Areca rpm packages for interested users

2006-12-27 Thread Chris Samuel
On Thursday 28 December 2006 08:15, Yaroslav Halchenko wrote: It is a pity that Areca's drivers got kicked even from the -mm devel branch of the mainstream Linux kernel (unfortunately I don't remember in which exact version that happened). The ARECA drivers have just moved from the mm tree into the

Re: [Beowulf] OT: some quick Areca rpm packages for interested users

2006-12-27 Thread Chris Samuel
On Thursday 28 December 2006 14:10, Yaroslav Halchenko wrote: on the server I am still running 2.6.18.2 and also I am not using ext3 (just reiser and xfs), and I am not sure when I will have a chance to reboot (that beast is main file server for our cluster at the moment). Understood, we tend

Re: [Beowulf] OT: some quick Areca rpm packages for interested users

2006-12-27 Thread Chris Samuel
On Thursday 28 December 2006 15:24, Joe Landman wrote: If you have the financial wherewithal to support them, I urge you to do so. They do have a starving hacker rate that's about US$40 a year (it's an honour system that you choose the appropriate amount, there's a project leader option for

Re: [Beowulf] Re: Selling computation time

2006-12-28 Thread Chris Samuel
On Thursday 28 December 2006 07:36, Jeff Johnson wrote: It is a reasonable assumption that Sun did their homework. I wonder who they are targeting. http://www.channelregister.co.uk/2005/10/25/sun_grid_slip/ Sun's grid: lights on, no customers (October 2005) 14 months of utility computing

Re: [Beowulf] SW Giaga, what kind?

2006-12-28 Thread Chris Samuel
On Thursday 28 December 2006 18:22, Ruhollah Moussavi Baygi wrote: Please let me know your idea about SW level1 (Giga). Is it a proper choice for a  small Beowulf cluster? Never heard of it. Care to enlighten us ? -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager

Re: [Beowulf] Which distro for the cluster?

2006-12-28 Thread Chris Samuel
On Friday 29 December 2006 04:24, Robert G. Brown wrote: I'd be interested in comments to the contrary, but I suspect that Gentoo is pretty close to the worst possible choice for a cluster base. Maybe slackware is worse, I don't know. But think of the speed you could emerge applications with

Re: [Beowulf] Which distro for the cluster?

2006-12-28 Thread Chris Samuel
person hog 64 CPUs for one job ? On Fri, Dec 29, 2006 at 09:39:59AM +1100, Chris Samuel wrote: On Friday 29 December 2006 04:24, Robert G. Brown wrote: I personally would suggest that you go with one of the mainstream, reasonably well supported, package based distributions.  Centos, FC

Re: [Beowulf] Which distro for the cluster?

2007-01-01 Thread Chris Samuel
On Friday 29 December 2006 21:05, Geoff Jacobs wrote: Here's a bare bones kickstart method (not Kickstart[tm] per se): http://linuxmafia.com/faq/Debian/kickstart.html Good old Rick, he crops up everywhere and is a mine of information. ;-) Regarding kickstart, among choices for pre-scripted

Re: [Beowulf] picking out a job scheduler

2007-01-02 Thread Chris Samuel
On Wednesday 03 January 2007 08:06, Chris Dagdigian wrote: Both should be fine although if you are considering *PBS you should   look at both Torque (a fork of OpenPBS I think) That's correct, it (and ANU-PBS, another fork) seem to be the de facto queuing systems in the state and national HPC

Re: [Beowulf] picking out a job scheduler

2007-01-03 Thread Chris Samuel
On Wednesday 03 January 2007 15:55, Nathan Moore wrote: WARNING:  server not specified (set $pbsserver) This has already been answered on the Torque list, but for the folks on the Beowulf list this was the issue. cheers! Chris -- Christopher Samuel - (03)9925 4751 - VPAC Deputy
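For the archive: that warning means the node's pbs_mom config doesn't name its server. A minimal example (the path is the Torque default and the hostname is made up):

    # /var/spool/torque/mom_priv/config
    $pbsserver  headnode.example.org
    # restart the mom so it re-reads the config
    service pbs_mom restart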

Re: [Beowulf] RE: OT: Announcing MPI-HMMER

2007-01-03 Thread Chris Samuel
On Thursday 04 January 2007 04:27, David Mathog wrote: Joe and I apparently exist in parallel software universes ;-). Being MPI means it can take advantage of high speed interconnects (e.g. building it with MPICH-GM to use native Myrinet). Of course whether that would benefit HMMER is

Re: [Beowulf] picking out a job scheduler

2007-01-04 Thread Chris Samuel
On Thursday 04 January 2007 22:16, Reuti wrote: Linda and PVM* need some kind of rsh/ssh between the nodes, and I   didn't get a clue up to now to convince Linda to use the PBS TM of   Torque. Torque provides a pbsdsh command that uses the TM interface and acts like the various DSH variants.
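A quick sketch of pbsdsh inside a Torque job script; because it talks to the moms over the TM API there's no rsh/ssh involved at all:

    #!/bin/sh
    #PBS -l nodes=2:ppn=2
    # run the command once per allocated CPU, launched via TM
    pbsdsh hostname
    # or run it just on a single numbered task
    pbsdsh -n 0 hostname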

[Beowulf] OT: Software RAID Multipath

2007-01-08 Thread Chris Samuel
This is about a storage node for a cluster, so it's partly on topic.. :-) Through happy coincidence we now have a box with two FC cards going to a SAN switch and thence into each side of an IBM FAStT 600 (doing H/W RAID5). The FAStT is partitioned into two 1.6TB lumps and each FC card can see

Re: [Beowulf] OT: Software RAID Multipath

2007-01-09 Thread Chris Samuel
On Tuesday 09 January 2007 23:35, Leif Nixon wrote: I suspect this is deprecated these days, but I have handled situations like this by using the *MD* multipath support instead. Then you can explicitly define your multipath devices and stripe them together, all in /etc/mdadm.conf. I don't
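For reference, the (now deprecated) MD multipath personality Leif describes looks roughly like this - the device names are examples for two paths to the same LUN:

    mdadm --create /dev/md0 --level=multipath --raid-devices=2 \
        /dev/sda1 /dev/sdb1
    # record it so it assembles at boot
    echo 'ARRAY /dev/md0 level=multipath num-devices=2 UUID=...' \
        >> /etc/mdadm.conf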

Re: no 'commodity' OS is 'secure' Re: [Beowulf] Which distro for the cluster?

2007-01-10 Thread Chris Samuel
On Thursday 11 January 2007 01:21, Andrew Piskorski wrote: It sounds suspiciously like decision making driven by what the rules and paperwork says you're supposed to do I knew an organisation (not this one) that had the rule that every system had to run a full virus scan once a day. The

[Beowulf] OneSis experiences ?

2007-01-13 Thread Chris Samuel
Hi folks, Anyone here played with OneSis ? It looks like a variation on the Warewulf theme.. http://www.onesis.org/ A thin, role-based, single image system for scalable cluster management. oneSIS is a simple, flexible method for managing one or more clusters from a single filesystem image.

Re: [Beowulf] Teraflop chip hints at the future

2007-02-12 Thread Chris Samuel
On Tue, 13 Feb 2007, Mitchell Wisidagamage wrote: I think this news is of some relevance to the group... http://news.bbc.co.uk/1/hi/technology/6354225.stm Single or double precision ? -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager Victorian Partnership for Advanced

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-18 Thread Chris Samuel
On Sat, 17 Feb 2007, Jim Lux wrote: I think it's pretty obvious that Google has figured out how to partition their workload in a "can use any number of processors" sort of way, in which case, they probably should be buying the cheap drives and just letting them fail (and stay failed.. it's

[Beowulf] DMA Memory Mapping Question

2007-02-21 Thread Chris Samuel
Hi folks, We've got an IBM Power5 cluster running SLES9 and using the GM drivers. We occasionally get users who manage to use up all the DMA memory that is addressable by the Myrinet card through the Power5 hypervisor. Through various firmware and driver tweaks (thanks to both IBM and Myrinet)

Re: [Beowulf] DMA Memory Mapping Question

2007-02-21 Thread Chris Samuel
On Thu, 22 Feb 2007, Chris Samuel wrote: Through various firmware and driver tweaks (thanks to both IBM and Myrinet) we've gotten that limit up to almost 1GB and then we use an undocumented environment variable (GMPI_MAX_LOCKED_MBYTE) to say only use 248MB of that per process (as we've got 4
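In case it saves anyone else the digging, setting that variable is just a matter of exporting it before mpirun, e.g. in the job script:

    # cap GM's locked (DMA-able) memory at 248MB per process - value
    # from the message above, tune for your own IOMMU window and
    # processes per node
    export GMPI_MAX_LOCKED_MBYTE=248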

Re: [Beowulf] DMA Memory Mapping Question

2007-02-21 Thread Chris Samuel
On Thu, 22 Feb 2007, Patrick Geoffray wrote: Hi Chris, G'day Patrick! Chris Samuel wrote: We occasionally get users who manage to use up all the DMA memory that is addressable by the Myrinet card through the Power5 hypervisor. The IOMMU limit set by the hypervisor varies depending

Re: [Beowulf] anyone using 10gbaseT?

2007-02-21 Thread Chris Samuel
On Thu, 22 Feb 2007, Craig Tierney wrote: I didn't think it was that cheap.  I would prefer Layer 3 if this was going into a rack of a multi-rack system, but the price is right. Thanks Craig! -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager Victorian Partnership for

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-22 Thread Chris Samuel
On Thu, 22 Feb 2007, Robin Harker wrote: So if we now know, (and we have seen similarly spurious behaviour with SATA Raid arrays), isn't the real solution to lose the node discs? Depends on the code you're running, if it hammers local scratch then either you have to have them or you have to

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-28 Thread Chris Samuel
On Thu, 1 Mar 2007, Douglas Gilbert wrote: That FAQ entry is about 2 years out of date. smartmontools support for SATA disks behind a SCSI to ATA Translation (SAT) layer is now much better. Please try the recently released version 5.37 of smartmontools. For instance, you should be able to
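The usual incantation for a SATA disk sitting behind a SAT layer (the device name is an example):

    # ask smartmontools 5.37+ to use the SCSI-to-ATA Translation layer
    smartctl -a -d sat /dev/sdb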

Re: [Beowulf] extreme dynamic underclocking and undervolting

2007-03-05 Thread Chris Samuel
On Sat, 3 Mar 2007, David Mathog wrote: So it would be nice if the range of underclocking / undervolting adjustments  provided on compute nodes extended quite a bit further towards the lower end than it currently does. FWIW 2.6.21 looks like it will include i386 support for the clockevents

Re: [Beowulf] network filesystem

2007-03-05 Thread Chris Samuel
On Fri, 2 Mar 2007, [EMAIL PROTECTED] wrote: I have a small (16 dual xeon machines) cluster. [...] Does anybody knows what is better for a cluster of this size, exporting the filesystem via NFS FWIW we run two NFS servers (dual 2.0GHz Opteron 240's) with users split across the two and they

Re: [Beowulf] network filesystem

2007-03-05 Thread Chris Samuel
On Mon, 5 Mar 2007, John Hearns wrote: Purely as a point of interest, since high energy physics labs use AFS (and hence kerberos) they have already faced this one. Interesting, though it's not clear from that whether it can cope with, say, automatically renewing expiring tickets for running

Re: [Beowulf] Benchmark between Dell Poweredge 1950 And 1435

2007-03-08 Thread Chris Samuel
On Wed, 7 Mar 2007, Juan Camilo Hernandez wrote: I would like to know what server has the best performance for HPC systems between The Dell Poweredge 1950 (Xeon) And 1435SC (Opteron) If you are fortunate enough to have only a couple of applications you care about, then get one of each on loan

Re: [Beowulf] number of NFS daemons

2007-03-08 Thread Chris Samuel
On Fri, 9 Mar 2007, Kozin, I (Igor) wrote: How many NFS daemons people are using on a dedicated NFS server? We use 128 and I've since found out that, coincidentally, that is the same number which SGI use on their NAS head units (of which we have none). YMMV. :-) cheers! Chris --
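On Red Hat style distros the thread count lives in /etc/sysconfig/nfs; a sketch with our value (tune to taste, and YMMV as above):

    # /etc/sysconfig/nfs
    RPCNFSDCOUNT=128
    # or change it on the fly without a restart
    rpc.nfsd 128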

[Beowulf] DrQueue feeding jobs into PBS ?

2007-03-18 Thread Chris Samuel
Hi folks, We have a potential user here who is interested in using DrQueue to manage rendering jobs. Problem is that all our clusters are controlled exclusively by PBS (Torque) and need to keep that because of the demand we have. So I'm left wondering if anyone here has any knowledge about

Re: [Beowulf] A start in Parallel Programming?

2007-03-19 Thread Chris Samuel
On Wed, 14 Mar 2007, Peter St. John wrote: and Ken wrote B in the late 60's (to extend your gleam in the eye metaphor beyond bearability, I'd say that Dennis carried the project to term). Nice thing about B is that the formal definition fits in 2 pages. B, I remember it well and with fondness.

Re: [Beowulf] A start in Parallel Programming?

2007-03-19 Thread Chris Samuel
On Fri, 16 Mar 2007, Robert G. Brown wrote: Or be human-readable.  f2c code was just about as evil as any zomby woof or eskimo boy could be. The TenDRA compilers built at RSRE/DRA/DERA in the early/mid 90's that implemented ANDF as a distribution format had large auto-generated chunks and

Re: [Beowulf] A start in Parallel Programming?

2007-03-19 Thread Chris Samuel
On Thu, 15 Mar 2007, Joe Landman wrote: I seem to remember after my joyous year with Pascal in the early 80s that they quickly caught the Modula fad (Niklaus Wirth could do no wrong), dabbled a bit in other things, and came out strong in C++ around late 80s early 90s. Certainly at

Re: [Beowulf] A start in Parallel Programming?

2007-03-19 Thread Chris Samuel
On Thu, 15 Mar 2007, Greg Lindahl wrote: Robert is in the South, all youse guys is a Northeast super-plural. It also gets used around Melbourne (Australia).. -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager Victorian Partnership for Advanced Computing http://www.vpac.org/

Re: [Beowulf] Cell programming

2007-03-25 Thread Chris Samuel
On Wed, 21 Mar 2007, Tim Wilcox wrote: The API, as far as I have read it, does not have nice routines for message passing between the SPUs, you have to set up your own memory transfers or address remote memory directly using the MFC. FWIW the Charm++ folks are working on supporting Cell.

Re: [Beowulf] scheduler policy design

2007-04-25 Thread Chris Samuel
On Wed, 25 Apr 2007, Toon Knapen wrote: Does anyone know of any projects underway that are trying to accomplish exactly this ? I believe the Moab scheduler supports provisioning of nodes on demand via various means (xCat, System Imager) and this includes Xen when used with Torque 2.1:

Re: [Beowulf] scheduler policy design

2007-04-25 Thread Chris Samuel
On Wed, 25 Apr 2007, Mark Hahn wrote: hell no.  only a few users even have a guess about how long their job will run, let alone what it will actually do (in mem or IO usage). Our users are roughly split into 3 groups, those who have a reasonable idea of how long their job will run, those who

Re: [Beowulf] scientific computing with the PS3

2007-04-29 Thread Chris Samuel
On Sat, 28 Apr 2007, Eugen Leitl wrote: http://www.netlib.org/utk/people/JackDongarra/PAPERS/scop3.pdf has some details on the GBit interface (yes, it's virtualized). Thanks Eugen. I've just managed to find the presentation about the work being done with Charm++ (and NAMD) on Cell.

Re: [Beowulf] SSH without login in nodes

2007-05-04 Thread Chris Samuel
On Sat, 5 May 2007, Peter St. John wrote: I am configuring a cluster with ssh (but without passwords) and currently the users can log in to compute nodes. I wish the clients to use the queue system (Torque, it works fine) without being able to access the compute nodes. In the past, we used

Re: [Beowulf] SSH without login in nodes

2007-05-06 Thread Chris Samuel
On Sun, 6 May 2007, Kilian CAVALOTTI wrote: Not that ugly, actually. But what if users do a ssh node -t bash --noprofile? ;) Then if any of the 500 odd tried we would spot them with some other scripts and chase them about it. We've not had to do that yet, though, fortunately! To handle
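One common way of enforcing this (not necessarily the scripts we use, so treat it as a sketch) is pam_access, denying interactive logins on the compute nodes to everyone but root while leaving TM-spawned processes untouched:

    # /etc/security/access.conf on each compute node
    + : root : ALL
    - : ALL : ALL
    # and make sshd consult it, in /etc/pam.d/sshd
    account  required  pam_access.so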

Re: [Beowulf] a question about running HPL

2007-05-12 Thread Chris Samuel
On Sat, 12 May 2007, M C wrote: After I install HPL on the machine, I try to run it in the bin dir of HPL by mpirun -np 1 xhpl. But it reports cannot find mpirun command. Looks like it can't find your MPI version's mpirun command. Which MPICH did you use ? -- Christopher Samuel - (03)9925
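First thing to check (the install prefix below is just a guess):

    # is mpirun on your PATH at all?
    which mpirun || echo not found
    # if not, add your MPICH's bin directory, e.g.
    export PATH=/usr/local/mpich/bin:$PATH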

[Beowulf] Gaussian g03 on CentOS5/RHEL5 ?

2007-07-12 Thread Chris Samuel
Hi folks, Don't suppose anyone out there has any war stories about trying to get Gaussian 03 going with CentOS5/RHEL5 ? We're looking at running G03 here at VPAC and the new cluster will be running CentOS5 and I'm trying to find out as much as possible before committing! All the best, Chris

Re: [Beowulf] power usage, Intel 5160 vs. AMD 2216

2007-07-13 Thread Chris Samuel
On Sat, 14 Jul 2007, Lombard, David N wrote: Amen, brother.  Don's comments on PS efficiency are highly relevant, and the kill-a-watt takes that all into account.  There's also a spiffier (and more expensive) model that logs the reading for later analysis. I wish I could find an alternative

Re: [Beowulf] power usage, Intel 5160 vs. AMD 2216

2007-07-14 Thread Chris Samuel
On Sat, 14 Jul 2007, Jim Lux wrote: Hi Jim, The Kill-A-Watt is available in a 220V 50Hz version. I didn't realise that, thanks! Might have to cobble a plug/receptacle that works depending on your local style.. the 220V ones I've seen have the round plugs and I think you have the slanted

Re: [Beowulf] openMosix ending

2007-07-16 Thread Chris Samuel
On Tue, 17 Jul 2007, Robert G. Brown wrote: The real drivers will install into the BIOS and should stop being OS specific at all Given the general quality of BIOS and ACPI implementations this somehow does not fill me with a warm glow... Our BIOS supports both types of Linux, RHEL and

Re: [Beowulf] Sidebar: Vista Rant

2007-07-17 Thread Chris Samuel
On Wed, 18 Jul 2007, Jaime Perea wrote: Something related, it can't be serious: Any comment from the experts? :-) I love the irony of: In 2006, Microsoft announced the release of Windows Compute Cluster Server (CCS) 2003 My first thought was oh, only 3 years late then.. (and yes, I do know

Re: A Modest Proposal (was [Beowulf] openMosix ending)

2007-07-17 Thread Chris Samuel
On Wed, 18 Jul 2007, Robert G. Brown wrote: No, you misunderstand. No, I just have a different point of view. :-) At this point in time, one major job of an operating system is to hide the details of the hardware from the programmer. Correct - you should not need to know whether the path to

Re: [Beowulf] Sidebar: Vista Rant

2007-07-17 Thread Chris Samuel
On Wed, 18 Jul 2007, Robert G. Brown wrote: I don't believe that Vista's slowness has anything to do with hardware memory footprint. Probably this, from page 14 of Output Content Protection and Windows Vista at http://www.microsoft.com/whdc/device/stream/output_protect.mspx : In addition to

Re: [Beowulf] Sidebar: Vista Rant

2007-07-19 Thread Chris Samuel
On Thu, 19 Jul 2007, Tim Cutts wrote: And this is different from Linux how? Because you are comparing two different system calls. fsync(2) under Linux says: fsync() transfers (flushes) all modified in-core data of (i.e., modified buffer cache pages for) the file referred to

Re: [Beowulf] Virtualization

2007-08-05 Thread Chris Samuel
On Thu, 26 Jul 2007, Julien Leduc wrote: This last technique ensures reproducible experiments and better performance; the drawback is more work on the middleware that makes all that magic come true. http://workspace.globus.org/vm/index.html The general idea being that you can request the config

Re: [Beowulf] openmosix-kernel-2.6

2007-08-07 Thread Chris Samuel
On Wed, 8 Aug 2007, A Lenzo wrote: I am installing OpenMosix right now, and found that the kernel for 2.6 is in beta.  But this beta was released in late 2006 and I don't see anything newer.  http://sourceforge.net/forum/forum.php?forum_id=715406 # Moshe Bar, openMosix founder and project

[Beowulf] CFP: VTDC 2007 - Second International Workshop on Virtualization Technology in Distributed Computing

2007-08-09 Thread Chris Samuel
=== CALL FOR PAPERS (VTDC 2007) Workshop on Virtualization Technologies in Distributed Computing held in conjunction with SC 07, the International Conference for High Performance Computing, Networking and Storage.

Re: [Beowulf] Network Filesystems performance

2007-08-21 Thread Chris Samuel
On Tue, 21 Aug 2007, Glen Dosey wrote: I've left out a lot of detail to keep this succinct. I can provide it when necessary. I think the details you've omitted mean there's not enough information there to actually come to any conclusions! :-) What distro on the clients and servers ? What

Re: [Beowulf] [tt] World's most powerful supercomputer goes online

2007-09-01 Thread Chris Samuel
On Sat, 1 Sep 2007, Jim Lux wrote: The wikipedia didn't say if the P45 is pink, though. Nope, they seem to be a blueish colour. I'm not up to digging through my records for the one I got when I left the UK civil service before migrating to Australia, so here's one someone else prepared

Re: [Beowulf] Node not answering...

2007-09-01 Thread Chris Samuel
On Fri, 31 Aug 2007, Nestor Waldyd Alvarez Villa wrote: mpirun: cannot start a.out on n1: No such file or directory NFS (or other network/distributed fs) mountpoint not mounted ? -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O.
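A quick way to test that theory from the head node (hostname and path are examples):

    # does the node actually see the binary and the filesystem?
    ssh n1 ls -l /home/user/a.out
    ssh n1 mount | grep home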

Re: [Beowulf] [tt] World's most powerful supercomputer goes online

2007-09-02 Thread Chris Samuel
On Mon, 3 Sep 2007, Gerry Creager wrote: Both should be required reading (and it's time to re-read The Puzzle Palace) before asking open-ended questions about the organization whose very name was once classified, and whose acronym was expanded to, in most instances, No Such Agency. Yesterday

Re: [Beowulf] Sun buys Lustre

2007-09-14 Thread Chris Samuel
On Thursday 13 September 2007 04:30:29 Mark Hahn wrote: wow, that's a bit of a shock.  I had heard mutterings about CFS using ZFS, but would not have guessed a buyout. This blog from Ricardo, the nice chap who has been working on the ZFS/FUSE port to Linux (originally sponsored by the Google

Re: [Beowulf] Big storage

2007-09-14 Thread Chris Samuel
On Friday 14 September 2007 06:19:34 Leif Nixon wrote: I still think it would be interesting to see how often one gets data corruption from other sources than disk errors (presuming ZFS is perfect). A good friend of mine has just had what appears to be a rather nasty (but recoverable)

Re: [Beowulf] Big storage

2007-09-16 Thread Chris Samuel
On Sunday 16 September 2007 04:48:56 Greg Lindahl wrote: Several people have commented that fsprobe doesn't check existing files. For your system binaries, you can test them using rpm -V. One interesting problem I've experienced on a non-HPC server is a latent memory error corrupting files
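For anyone who hasn't used it, rpm's verify mode compares each installed file against the checksums and metadata recorded in the package database; a '5' in the output means the MD5 digest no longer matches:

    # verify every installed package (can take a while)
    rpm -Va
    # or just the packages you care about
    rpm -V coreutils
    # illustrative output: changed checksum and mtime on a config file
    # S.5....T  c /etc/something.conf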

Re: [Beowulf] Advantage to compiling RHEL 5 with opteron option?

2007-09-26 Thread Chris Samuel
On Thu, 27 Sep 2007, Jeremy Fleming wrote: Anyone know if there is any advantage to recompiling the default RHEL 5 kernel to include the Opteron config option on a quad processor opteron machine? It may well be worth your while moving to the current kernel from the RHEL one (2.6.22.9), we've

[Beowulf] Odd Infiniband scaling behaviour

2007-10-07 Thread Chris Samuel
Hi fellow Beowulfers.. We're currently building an Opteron based IB cluster, and are seeing some rather peculiar behaviour that has had us puzzled for a while. If I take a CPU bound application, like NAMD, I can run an 8 CPU job on a single node and it pegs the CPUs at 100% (this is built

Re: [Beowulf] Odd Infiniband scaling behaviour - *SOLVED* - MVAPICH2 problem

2007-10-08 Thread Chris Samuel
On Mon, 8 Oct 2007, Chris Samuel wrote: If I then run 2 x 4 CPU jobs of the *same* problem, they all run at 50% CPU. With big thanks to Mark Hahn, this problem is solved. Infiniband is exonerated, it was the MPI stack that was the problem! Mark suggested that this sounded like a CPU
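For anyone else chasing the same symptoms, the giveaway is two jobs bound to the same cores, which taskset will show you (the PID is an example):

    # show which CPUs an MPI rank is bound to
    taskset -pc 12345
    # seeing "pid 12345's current affinity list: 0-3" from both jobs
    # means the two 4-way jobs are fighting over the same four cores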

Re: [Beowulf] Virtualisation and high performance interconnects.

2007-11-01 Thread Chris Samuel
On Thu, 1 Nov 2007, andrew holway wrote: I'm trying to find out about the effects of virtualisation on high performance interconnects. Effects on latency and bandwidth. Google is your friend.. :-) There is an IBM presentation from the 2006 Xen conference on virtualising InfiniBand networks,

Re: [Beowulf] How to Monitor Cluster

2007-11-09 Thread Chris Samuel
On Thu, 23 Aug 2007, Lombard, David N wrote: - You can directly manipulate the XML data from the multicast channel. We run a script on each node that uses the ipmitool command to pull out CPU and system temperatures and then use gmetric to inject them into Ganglia. Very handy.. cheers,
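The script boils down to a line or two per sensor; IPMI sensor names vary by board, so treat these as placeholders:

    # pull the CPU temperature via IPMI (sensor name is board-specific)
    temp=$(ipmitool sensor reading 'CPU Temp' | awk -F'|' '{gsub(/ /,"",$2); print $2}')
    # and inject it into Ganglia
    gmetric --name cpu_temp --value "$temp" --type float --units Celsius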

Re: [Beowulf] impressions of Super Micro IPMI management cards?

2007-11-09 Thread Chris Samuel
On Mon, 22 Oct 2007, Chris Dagdigian wrote: Does anyone have any experience/impressions of the Supermicro   Intelligent Management stuff? We're using them on our new Opteron cluster that will start to blossom into a Barcelona cluster soon. The biggest problem we've found is what appears to

Re: [Beowulf] Network Filesystems performance

2007-11-10 Thread Chris Samuel
On Fri, 24 Aug 2007, Michael Will wrote: I tested several NFS server configurations with a 19 node cluster. Same, but with 150+ nodes and about 500+ CPUs. The first advice is to stay away from redhat for file servers since they have some bursty I/O bugs and don't support XFS. Amen. We've

Re: [Beowulf] Network Filesystems performance

2007-11-10 Thread Chris Samuel
On Sun, 11 Nov 2007, Buccaneer for Hire. wrote: A stock kernel will from Redhat will not give you the performance you need. Indeed, and the people I know at Red Hat in Australia are well aware of my thoughts on their restrictive choice of filesystems.. :-) cheers! Chris -- Christopher

Re: [Beowulf] Quad-Core Parallelism

2007-11-17 Thread Chris Samuel
On Sat, 17 Nov 2007, Joe Landman wrote: Hey Joe, ESSL is/was generally only available to IBM customers on AIX machines (Power class). You can get ESSL and PESSL (no MORTAR? :-)) from IBM for Linux on Power too. We have it on our SLES9 OpenPOWER 720 cluster. cheers, Chris -- Christopher

Re: [Beowulf] Opteron

2007-11-17 Thread Chris Samuel
On Fri, 16 Nov 2007, andrew holway wrote: Come on then SC07'ers. Whats the buzz with barcelona? We've had a 1.9GHz Barcelona node on our new cluster for a few months now (under NDA that expired on the launch) and have been happy enough with it to go for the full upgrade to the 2.3GHz

Re: How bleeding edge are people with kernels (Was Re: [Beowulf] impressions of Super Micro IPMI management cards?)

2007-11-17 Thread Chris Samuel
On Tue, 13 Nov 2007, stephen mulcahy wrote: This prompted me to wonder how close to the bleeding edge people in clusterland are living with regard to Linux kernel versions? I should just point out that on our 4 year old Intel P4 cluster we run CentOS5 with the kernels out of the CentOS Plus

Re: [Beowulf] Opteron

2007-11-17 Thread Chris Samuel
On Fri, 16 Nov 2007, andrew holway wrote: Come on then SC07'ers. Whats the buzz with barcelona? I should also point out that we're using the PGI 7.x series of compilers and tell our users to build with: -tp k8-64,barcelona-64 so that they get the optimisations for both dual and quad core
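So a typical compile line for such a unified binary looks like this (everything apart from -tp is up to you):

    # one binary with optimised code paths for both K8 and Barcelona
    pgcc -tp k8-64,barcelona-64 -fast -o mycode mycode.c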
