Re: [gridengine users] Parallel GE jobs on 48-way nodes

2011-10-11 Thread Gerald Ragghianti
Like the OP mentioned, one could use a consumable complex for 6.1. If you add "complex_values network=16" to the queue, and "load_thresholds network=15" it will be pushed to alarm state automatically and you can avoid the load sensor. When you add a default consumption of 1, it works out-of-t

Re: [gridengine users] Re: `cloud' nodes

2011-10-11 Thread Rayson Ho
On Tue, Oct 11, 2011 at 3:58 PM, Jesse Becker wrote: > We're getting a bit off topic here, but CFEngine fits one of your > requirements (but probably not the other). It is written in C, is quite > fast, and has a much lower resource footprint than anything based on > Ruby. Like many other th

Re: [gridengine users] Parallel GE jobs on 48-way nodes

2011-10-11 Thread Reuti
On 11.10.2011 at 14:37, William Hay wrote: > On 11 October 2011 12:55, Reuti wrote: >> On 10.10.2011 at 20:46, Gerald Ragghianti wrote: >> >>> We have a cluster consisting of 48-core compute nodes where we need to run >>> parallel (MPI) jobs across nodes. There is a hardware limitation on th

Re: [gridengine users] Re: `cloud' nodes

2011-10-11 Thread Jesse Becker
On Tue, Oct 11, 2011 at 03:51:57PM -0400, Chi Chan wrote: I am not against Chef or Puppet, but if there is a simple package that is written in C/C++, and is much simpler to learn, then I am more willing to use auto configuration management. In the end, the common tasks are not that complicated

[gridengine users] Re: `cloud' nodes

2011-10-11 Thread Chi Chan
The biggest problem with running Chef or Puppet is that you have an extra package to install. When you mention Ruby to people, they immediately think of web applications. I think a lot of people are more comfortable with C/C++ or even Java applications; with Ruby you need the Ruby interpreter &

[gridengine users] Re: OT: IBM to acquire Platform Computing!

2011-10-11 Thread Chi Chan
I also read the news today, yet other lists are quiet about this announcement, so I will also share my thoughts on the SGE list. I think it is good for Platform to find a new home, as the original organic growth period was behind when SGE was opensourced in 2001 (thanks Sun - we will remember

Re: [gridengine users] OT: IBM to acquire Platform Computing!

2011-10-11 Thread Rayson Ho
On Tue, Oct 11, 2011 at 2:08 PM, Chris Dagdigian wrote: > On a related note I was talking to a former Platform person who I'm sure > many of us know on this list and he mentioned that the stripped down older > variant of Platform LSF that platform produced back in the day ("lava") has > a new open

Re: [gridengine users] OT: IBM to acquire Platform Computing!

2011-10-11 Thread Chris Dagdigian
On a related note I was talking to a former Platform person who I'm sure many of us know on this list and he mentioned that the stripped down older variant of Platform LSF that platform produced back in the day ("lava") has a new open source home and developer group: http://openlava.net/ -

[gridengine users] OT: IBM to acquire Platform Computing!

2011-10-11 Thread Rayson Ho
http://www.platform.com/press-releases/2011/IBMtoAcquireSystemSoftwareCompanyPlatformComputingtoExtendReachofTechnicalComputing Not sure what's going to happen to Loadleveler... Rayson

Re: [gridengine users] `cloud' nodes

2011-10-11 Thread Rayson Ho
Hi, some update... 1) StarCluster stable release 0.92rc2 has new features that make running larger clusters (100+ instances) real easy. The new additions like dynamic cluster size grow & shrink would reduce the cost of operating semi-permanent clusters (where jobs arrive unpredictably during the l

Re: [gridengine users] Grid Engine Slotwise Preemption

2011-10-11 Thread Reuti
On 11.10.2011 at 14:48, Fabio Martinelli wrote: > Dear Colleagues > > could somebody kindly point me to a good doc about Grid Engine Preemption and > its internals, especially how RAM is managed for a suspended job? It's not managed at all. It's still allocated to the suspended jo

[gridengine users] Grid Engine Slotwise Preemption

2011-10-11 Thread Fabio Martinelli
Dear Colleagues, could somebody kindly point me to a good doc about Grid Engine Preemption and its internals, especially how RAM is managed for a suspended job? Searching the Internet, I gathered that 6.2u5 as released by Sun has bugs in this area, and I see: * Fix slotwise pree

Re: [gridengine users] Parallel GE jobs on 48-way nodes

2011-10-11 Thread William Hay
On 11 October 2011 12:55, Reuti wrote: > On 10.10.2011 at 20:46, Gerald Ragghianti wrote: > >> We have a cluster consisting of 48-core compute nodes where we need to run >> parallel (MPI) jobs across nodes. There is a hardware limitation on the QDR >> Infiniband cards that limits the available

Re: [gridengine users] Parallel GE jobs on 48-way nodes

2011-10-11 Thread Reuti
On 10.10.2011 at 20:46, Gerald Ragghianti wrote: > We have a cluster consisting of 48-core compute nodes where we need to run > parallel (MPI) jobs across nodes. There is a hardware limitation on the QDR > Infiniband cards that limits the available hardware contexts to 16 per card. > We have