Re: [gridengine users] Resource quotas and parallel jobs across multiple queues

2012-01-12 Thread Reuti
Hi, On 12.01.2012 at 08:00, Brendan Moloney wrote: I seem to have found a combination of resource quotas that is preventing the scheduler from scheduling parallel jobs across multiple queues. I have multiple queues for jobs with different run times: veryshort.q, short.q, long.q, and

[gridengine users] How setup queue priority?

2012-01-12 Thread Semi
I need to set up high and low priority queues for the same nodes. I would prefer to do it without subordinate lists. I know that the following parameters deal with this: seq_no 10 priority 20 If I'm right, please explain the meaning of the numbers to me, if no correct

Re: [gridengine users] How setup queue priority?

2012-01-12 Thread William Hay
On 12 January 2012 11:41, Semi s...@bgu.ac.il wrote: I need to set up high and low priority queues for the same nodes. I would prefer to do it without subordinate lists. I know that the following parameters deal with this: seq_no 10 The seq_no is used to determine which

Re: [gridengine users] How setup queue priority?

2012-01-12 Thread Semi
I have 3 queues. I want: all.q lowest priority, mid.q middle, hig.q highest. Can I solve this problem only with a subordinate list? qconf -sq hig.q subordinate_list all.q=1, mid.q=1 qconf -sq mid.q subordinate_list all.q=1 On 1/12/2012 2:04 PM, William Hay wrote: On 12 January 2012 11:41,
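A minimal Python sketch of how the subordinate_list values quoted above could be read. It assumes the queue=threshold meaning described in queue_conf(5): a subordinate queue instance is suspended once that many slots of the superior queue are in use on the same host. The function names and slot counts are illustrative only, and nothing here talks to SGE.

```python
def parse_subordinate_list(value):
    """Parse a value such as 'all.q=1, mid.q=1' into {queue_name: threshold}."""
    thresholds = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry or entry.upper() == "NONE":
            continue
        name, sep, threshold = entry.partition("=")
        # No explicit threshold: None stands for "only when the superior queue is full".
        thresholds[name.strip()] = int(threshold) if sep else None
    return thresholds


def suspended_queues(subordinates, used_slots, total_slots):
    """Return the subordinate queues that would be suspended on one host (assumed semantics)."""
    result = []
    for name, threshold in subordinates.items():
        limit = total_slots if threshold is None else threshold
        if used_slots >= limit:
            result.append(name)
    return sorted(result)
```

With the hig.q setting from the message, `suspended_queues(parse_subordinate_list("all.q=1, mid.q=1"), 1, 4)` lists both all.q and mid.q as soon as one hig.q slot is busy on that host.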

Re: [gridengine users] Automatic CPU core binding - JSV script

2012-01-12 Thread Daniel Gruber
While core binding itself should work with such a topology (I never tried it) in 6.2u5, the reporting of the topology string will be wrong. As you might have noticed, string based load values are just reported up to a length of 1024 bytes, that means that with 1000 nodes not the full topology

Re: [gridengine users] How setup queue priority?

2012-01-12 Thread Reuti
Hi, On 12.01.2012 at 12:41, Semi wrote: I need to set up high and low priority queues for the same nodes. I would prefer to do it without subordinate lists. I know that the following parameters deal with this: seq_no 10 priority 20 In addition to

[gridengine users] documentation for SGE

2012-01-12 Thread Peskin, Eric
All, What is the best source of documentation for SGE? I had been using http://wikis.sun.com/display/GridEngine But that seems to have disappeared. Thanks, Eric

Re: [gridengine users] documentation for SGE

2012-01-12 Thread Gerard Henry
Hello, I've noted http://gridscheduler.sourceforge.net/documentation.html and http://arc.liv.ac.uk/SGE/ On 01/12/12 04:41 PM, Peskin, Eric wrote: All, What is the best source of documentation for SGE? I had been using http://wikis.sun.com/display/GridEngine But that seems to have

[gridengine users] deciding spool directory location

2012-01-12 Thread Wolf, Dale
We are in the planning phase for the initial installation of grid engine. The initial configuration is a single cluster with 30 SLES 11 machines. This number may grow to as many as 100 SLES 11 servers. The Oracle N1 Grid Engine 6 Installation Guide, under sge-root Installation

Re: [gridengine users] deciding spool directory location

2012-01-12 Thread Rayson Ho
You can reference this HOWTO: http://gridscheduler.sourceforge.net/howto/nfsreduce.html You can put everything on NFS, and if the NFS server can't handle the load, then change the configuration to local spooling instead later on. Rayson On Thu, Jan 12, 2012 at 12:17 PM, Wolf, Dale

Re: [gridengine users] More Univa FUD???

2012-01-12 Thread William Deegan
Chi, On Jan 11, 2012, at 6:44 PM, Chi Chan wrote: So what's your point, William? Like others have already said, did you read what Ron said, or are you just not happy with many forks and each with features that are different, and like you said before that you needed to choose one to use?

Re: [gridengine users] More Univa FUD???

2012-01-12 Thread Joe Landman
On 01/11/2012 01:46 AM, Ron Chen wrote: And I just found this one today: http://www.univagridengine.com/ Again, as a contributor who has stayed with Oracle and Sun Grid Engine and Open Grid Scheduler for over 10 years, I think it is unacceptable to register a domain using other company's

Re: [gridengine users] deciding spool directory location

2012-01-12 Thread Chris Dagdigian
Hi Dale, We are trying to determine where the spool directory should reside based on performance versus ease of administration. Can somebody explain how ease of administration would be made easier? Here is a short answer: When the spool directory is shared it is far easier for an

Re: [gridengine users] More Univa FUD???

2012-01-12 Thread Rayson Ho
On Thu, Jan 12, 2012 at 1:46 PM, Joe Landman land...@scalableinformatics.com wrote: More than merely wrong, it opens up the people/company who registered it to legal action in the US if univa and/or gridengine are trademarks, or copyrighted of a particular entity. Joe, you haven't showed up on

Re: [gridengine users] More Univa FUD???

2012-01-12 Thread Joe Landman
On 01/12/2012 02:14 PM, Rayson Ho wrote: On Thu, Jan 12, 2012 at 1:46 PM, Joe Landman land...@scalableinformatics.com wrote: More than merely wrong, it opens up the people/company who registered it to legal action in the US if univa and/or gridengine are trademarks, or copyrighted of a

Re: [gridengine users] Resource quotas and parallel jobs across multiple queues

2012-01-12 Thread Brendan Moloney
Hello, { name shortlimit description NONE enabled TRUE limit queues short.q hosts * to slots=32 I think you can leave the hosts * out here and the other RQS below. It means used slots across all machines limited to 32 in this queue. The same can be achieved
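As a rough illustration of the reading given in the reply (used slots in short.q, summed over all machines, capped at 32), the check can be sketched in Python. This only models the arithmetic of the quoted rule, not how the SGE scheduler evaluates resource quota sets; the function name and host names are hypothetical.

```python
SHORT_Q_LIMIT = 32  # from the quoted rule: limit queues short.q ... to slots=32


def slots_allowed(limit, used_by_host, requested):
    """Return True if `requested` more slots still fit under a queue-wide slot limit."""
    # The limit counts slots on all hosts together, not per host.
    used = sum(used_by_host.values())
    return used + requested <= limit
```

For example, with 16 slots used on one hypothetical host and 12 on another, `slots_allowed(SHORT_Q_LIMIT, {"node1": 16, "node2": 12}, 4)` holds, while a request for 5 slots would exceed the limit.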

[gridengine users] My notes on building Open GridScheduler 2011.11 on RedHat/CentOS 6.x based systems

2012-01-12 Thread Chris Dagdigian
Tried to reverse engineer my crusty old build environment into something that I (or even others) can actually replicate or follow. Going to try similar for 32bit binaries as well as document the process for RHEL/CentOS 5.x based systems in the near future... Short link: http://biote.am/6y

Re: [gridengine users] Resource quotas and parallel jobs across multiple queues

2012-01-12 Thread Brendan Moloney
All the queues are on the same machines. I am not sure which algorithm you refer to. I refer to the internal algorithm of SGE how to collect slots from various queues. As mentioned, the scheduler sorts by sequence number so the queues are checked in shortest to longest order. Not for
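The ordering described in this message (queues checked by ascending sequence number) can be sketched as a plain sort. This is only an illustration of that ordering; it does not model how SGE collects slots for a parallel job, which is the point under discussion in the thread. The seq_no values below are made up.

```python
def order_by_seq_no(queues):
    """Return queue names sorted by ascending seq_no, ties broken by name."""
    return [name for name, _ in sorted(queues.items(), key=lambda kv: (kv[1], kv[0]))]


# Hypothetical sequence numbers for the queues named in the thread.
example_queues = {"long.q": 30, "veryshort.q": 10, "short.q": 20}
```

Calling `order_by_seq_no(example_queues)` puts veryshort.q first and long.q last.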

Re: [gridengine users] My notes on building Open GridScheduler 2011.11 on RedHat/CentOS 6.x based systems

2012-01-12 Thread Rayson Ho
Thanks Chris for posting this - I've never tried to build OGS outside of our machines or EC2 images. And we needed to use BerkeleyDB version 4.4.20 because the on-disk data structure is not compatible across different releases of Berkeley DB - it's not Oracle's fault, but it's just that it is not

[gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Simon Matthews
I have an installation of SGE 6.2U4 that I downloaded some years ago that I have installed on a couple of qmaster hosts. I hope that I do not offend the users of this list by asking for help using a binary installation, using binaries built by Sun. I hope that someone can shed some light on the

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Rayson Ho
On Fri, Jan 13, 2012 at 12:02 AM, Simon Matthews simon.d.matth...@gmail.com wrote: I have an installation of SGE 6.2U4 that I downloaded some years ago that I have installed on a couple of qmaster hosts. Are you using the same version of SGE (SGE 6.2u4) on both the qmaster and the node? You can

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Simon Matthews
I am running the same version. I have one installation tree that is NFS mounted. All clients use the same binaries. I had wanted to move to 6.2U5, but I can't find a source to download it. Simon On Thu, Jan 12, 2012 at 9:50 PM, Rayson Ho ray...@scalablelogic.com wrote: On Fri, Jan 13, 2012 at

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Simon Matthews
On Thu, Jan 12, 2012 at 10:00 PM, Simon Matthews simon.d.matth...@gmail.com wrote: I am running the same version. I have one installation tree that is NFS mounted. All clients use the same binaries. I had wanted to move to 6.2U5, but I can't find a source to download it. Arrgh ---

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Simon Matthews
On Thu, Jan 12, 2012 at 9:50 PM, Rayson Ho ray...@scalablelogic.com wrote: On Fri, Jan 13, 2012 at 12:02 AM, Simon Matthews simon.d.matth...@gmail.com wrote: I have an installation of SGE 6.2U4 that I downloaded some years ago that I have installed on a couple of qmaster hosts. Are you

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Rayson Ho
Does it hang when you issue the qconf command on that node, or does it return the error message immediately?? Rayson On Fri, Jan 13, 2012 at 1:00 AM, Simon Matthews simon.d.matth...@gmail.com wrote: I am running the same version. I have one installation tree that is NFS mounted. All clients

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Simon Matthews
On Thu, Jan 12, 2012 at 10:15 PM, Rayson Ho ray...@scalablelogic.com wrote: Does it hang when you issue the qconf command on that node, or does it return the error message immediately?? It hangs. I see the message either after it times out or if I kill it. Simon Rayson On Fri, Jan 13,

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Rayson Ho
Good! That means qconf is waiting for the master's response but not getting it. If an IP filter or firewall is configured on that node, then it is very likely to be the cause. Make sure that firewalls are turned off or configured properly... I used to use sniffers like tcpdump to debug issues
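One quick way to test the firewall theory from the affected node is to attempt a plain TCP connection to the qmaster port. The Python sketch below uses only the standard library; the function name is hypothetical, and the host and port must be taken from the local installation (6444 is the commonly used sge_qmaster port, but that is an assumption here, so check SGE_QMASTER_PORT or /etc/services).

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Running something like `can_reach("qmaster.example.com", 6444)` on the node (the hostname is a placeholder) and getting False after the timeout would be consistent with packets being dropped, though it does not show where they are dropped.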

Re: [gridengine users] qconf -sh fails on Centos 4 guest.

2012-01-12 Thread Rayson Ho
Simon, I'm logging off now, please let the list know whether it's still causing problems and/or your findings. (I'm in North America - EST timezone, and I normally don't stay up this late - but it usually takes me some time to get back to the normal daily schedule after the holidays :-D )