Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread baf035
I'm testing it as root, which is temporarily a member of a user group (acl) configured for the project prj_cfd_of_ext: qconf -sprj prj_cfd_of_ext | grep acl acl cfd_ext xacl NONE qconf -su cfd_ext | grep root ,czcmmma,ezcsmuy,fzcmh3q,d486sa0,hjkljqf,xzcjk9r,root The hostgroup is

Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread Reuti
On 04.12.2012 at 16:49, baf035 wrote: "Projects should probably be ignored for AR; I do not see any relation or dependency here. Or am I wrong?" In my opinion: no, they should be honored, but in a proper way. Otherwise a user could submit an AR to certain queues/machines where he has no

[gridengine users] Advance Reservation, project relationship

2012-12-11 Thread baf035
Hello, I cannot find this in any documentation. Could somebody please explain the relationship between an AR definition and the project configuration in a queue? I'm able to configure an AR: ~# qrsub -pe Pe_C02_Ba 8 -q all.q@nodec02n122 -N reservation -d 1:0:0 Your advance reservation

Re: [gridengine users] Defining and launching jobs programmatically

2012-12-11 Thread Dave Love
Lane Schwartz dowob...@gmail.com writes: I'm trying to figure out a way to define and launch jobs programmatically. The idea is that I am writing a computationally intensive program in some programming language (most typically Java or Scala, but sometimes C, C++, Ruby). I would like to be

Re: [gridengine users] $'\r': command not found

2012-12-11 Thread jan roels
Never mind, problem with the license. 2012/12/10 jan roels janro...@gmail.com Hi, I added some new nodes and when I run my script I get the following: /var/spool/gridengine/execd/node1/job_scripts/160: line 2: $'\r': command not found /var/spool/gridengine/execd/node1/job_scripts/160:

Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread baf035
Projects should probably be ignored for AR; I do not see any relation or dependency here. Or am I wrong? 2012/12/4 Reuti re...@staff.uni-marburg.de On 04.12.2012 at 08:50, baf035 wrote: I'm testing it as root, which is temporarily a member of a user group (acl) configured for the project

[gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Jake Carroll
Hi all. We've got some memory allocation/memory contention issues our users are complaining about. Many are saying they can't get their jobs to run because of memory resource issues. An example: scheduling info: (-l h_vmem=24G,virtual_free=24G) cannot run at host

[gridengine users] memory utilization of job using qstat

2012-12-11 Thread Vamsi Krishna
Hi, is there any Grid Engine command to find out the actual memory utilization of a submitted job? Regards PVK ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users

Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread Dave Love
baf035 baf...@gmail.com writes: I'm testing it as root, which is temporarily a member of a user group (acl) configured for the project prj_cfd_of_ext: qconf -sprj prj_cfd_of_ext | grep acl acl cfd_ext xacl NONE qconf -su cfd_ext | grep root

Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread Dave Love
baf035 baf...@gmail.com writes: When a project is set in the queue for a hostgroup, it is impossible to use Advance Reservation: ~# qconf -sq all.q | grep projects projects NONE,[@c02_Ba=prj_cfd_of_ext] xprojects NONE ~# qrsub -pe Pe_C02_Ba 8 -q

Re: [gridengine users] Intermittent commlib errors with MPI jobs

2012-12-11 Thread Brendan Moloney
Hello again, I got a chance to run some more tests. I can recreate the problem with different ports on the switch, and I can recreate it between different pairs of nodes. I also used tcpdump to look for bad checksums (while recreating the commlib error) and got nothing. Is it still possible

Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread Reuti
On 04.12.2012 at 08:50, baf035 wrote: I'm testing it as root, which is temporarily a member of a user group (acl) configured for the project prj_cfd_of_ext: qconf -sprj prj_cfd_of_ext | grep acl acl cfd_ext xacl NONE qconf -su cfd_ext | grep root

Re: [gridengine users] [SGE-discuss] Advance Reservation, project relationship

2012-12-11 Thread Reuti
On 04.12.2012 at 13:59, Reuti wrote: On 04.12.2012 at 08:50, baf035 wrote: I'm testing it as root, which is temporarily a member of a user group (acl) configured for the project prj_cfd_of_ext: qconf -sprj prj_cfd_of_ext | grep acl acl cfd_ext xacl NONE qconf -su cfd_ext | grep

Re: [gridengine users] Functional share policy question

2012-12-11 Thread Dave Love
Ben De Luca bdel...@gmail.com writes: Careful not to set the number of tickets too high; I believe there is a bug that can cause the code to wrap the number of tickets to a negative value. Do you know/remember how that shows up? If it's reproducible, we could make an issue of it. The underlying

[gridengine users] qstat reports a job's priority as '-nan'

2012-12-11 Thread Gowtham
While performing routine weekly checks on our test cluster (it runs Rocks 6.0 with CentOS 6.2), I noticed that qstat currently reports every job's priority as '-nan' (running and waiting). Uncertain of what caused this issue, I submitted a test job 'hello_world_serial.sh'. It seems to start out by

[gridengine users] $'\r': command not found

2012-12-11 Thread jan roels
Hi, I added some new nodes and when I run my script I get the following: /var/spool/gridengine/execd/node1/job_scripts/160: line 2: $'\r': command not found /var/spool/gridengine/execd/node1/job_scripts/160: line 9: $'\r': command not found I'm sure the script is correct, it always worked on my

Re: [gridengine users] memory utilization of job using qstat

2012-12-11 Thread Jesse Becker
On Fri, Dec 07, 2012 at 09:38:26AM +0530, Vamsi Krishna wrote: hi is there any grid engine command to find out the actual memory utilization of job submitted? qacct -j jobid Note that in some versions of SGE there was a bug that would cause the value to wrap at 4GB (and thus be wrong for
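The `qacct -j jobid` suggestion above can be scripted. The following is a minimal sketch, not something posted in the thread: it assumes that `qacct -j` prints one whitespace-separated `key value` pair per line and reports peak memory in a `maxvmem` field with an uppercase K/M/G suffix. The helper names `parse_qacct` and `to_bytes` and the sample text are hypothetical.

```python
# Assumed suffix multipliers (binary units); adjust if your installation differs.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_qacct(text: str) -> dict:
    """Collect 'key value' lines into a dict; separator lines are skipped."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            record[parts[0]] = parts[1].strip()
    return record


def to_bytes(value: str) -> int:
    """Convert a value such as '1.500G' to a byte count."""
    suffix = value[-1]
    if suffix in UNITS:
        return int(float(value[:-1]) * UNITS[suffix])
    return int(float(value))


# Hypothetical excerpt in the assumed output format, for illustration only.
sample = """\
==============================================================
qname        all.q
jobnumber    160
maxvmem      1.500G
"""

record = parse_qacct(sample)
print(record["maxvmem"], to_bytes(record["maxvmem"]))
```

In practice the text would come from running `qacct -j` for a finished job, for example through `subprocess.run(..., capture_output=True, text=True)`.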

Re: [gridengine users] [Rocks-Discuss] qstat reports a job's priority as '-nan'

2012-12-11 Thread Luca Clementi
On Mon, Dec 10, 2012 at 7:07 AM, Gowtham sgowt...@mtu.edu wrote: While performing routine weekly checks on our test cluster (it runs Rocks 6.0 with CentOS 6.2), I noticed that qstat currently reports every job's priority as '-nan' (running and waiting). Uncertain of what caused this issue, I submitted

Re: [gridengine users] [Rocks-Discuss] Problem with mpi program and sge

2012-12-11 Thread Luca Clementi
On Mon, Dec 10, 2012 at 9:27 AM, Forster, Robert robert.fors...@agr.gc.ca wrote: Hello all: I'm running a small Rocks cluster (Rocks 5.4, 7 nodes, 56 cores). I need to run many iterations of a program that takes 13 hrs to finish on 53 cores. I can successfully run the program via the command

Re: [gridengine users] $'\r': command not found

2012-12-11 Thread Reuti
On 10.12.2012 at 11:35, jan roels wrote: Never mind, problem with the license. Hehe - it should be handled in a better way by the software instead of resulting in some kind of syntax error. - Reuti 2012/12/10 jan roels janro...@gmail.com Hi, I added some new nodes and when I run my

Re: [gridengine users] $'\r': command not found

2012-12-11 Thread Ian Kaufman
Hi Jan, Have you made any modifications to the script recently? It looks like something is inserting Windows line endings (\r\n). On UNIX/Linux the shell splits lines at the \n, leaving a stray \r on what would otherwise be an empty line. Ian On Mon, Dec 10, 2012 at 4:01 AM, jan roels
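The carriage-return diagnosis above can be checked with a few lines of Python. This is a minimal sketch rather than anything from the thread: `has_crlf` and `strip_carriage_returns` are hypothetical helper names, the sample script is invented, and the approach assumes the job script is small enough to hold in memory.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the data contains a Windows-style CRLF line ending."""
    return b"\r\n" in data


def strip_carriage_returns(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving all other bytes untouched."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical job script saved with Windows line endings.
# Its second line looks empty but actually holds a lone \r.
script = b"#!/bin/bash\r\n\r\necho hello\r\n"

print(has_crlf(script))
print(strip_carriage_returns(script))
```

The `dos2unix` utility performs the same conversion on files in place, where it is installed.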

Re: [gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Alex Chekholko
Hi Jake, You can do 'qhost -F h_vmem,mem_free,virtual_free', that might be a useful view for you. In general, I've only ever used one of the three complexes above. Which one(s) do you have defined for the execution hosts? e.g. qconf -se compute-1-7 h_vmem will map to 'ulimit -v' mem_free

Re: [gridengine users] host group @allhosts is required attribute of SGE?

2012-12-11 Thread Reuti
On 10.12.2012 at 15:13, Semi wrote: "Is host group @allhosts a required attribute of SGE?" No. "Can it be deleted?" Yes. -- Reuti

Re: [gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Gowtham
I second Alex's thoughts. In all our clusters, we only use h_vmem (to indicate the hard cap per job) and mem_free (a suggestion to the scheduler as to which node the job should be started on). Best regards, g -- Gowtham Information Technology Services Michigan Technological University (906)

Re: [gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Reuti
On 11.12.2012 at 21:32, Gowtham wrote: I second Alex's thoughts. In all our clusters, we only use h_vmem The difference is that virtual_free is only guidance for SGE, whereas h_vmem is also enforced. Which one you prefer depends on the working style of the users/groups. Is

Re: [gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Reuti
On 07.12.2012 at 11:31, Jake Carroll wrote: Hi all. We've got some memory allocation/memory contention issues our users are complaining about. Many are saying they can't get their jobs to run because of memory resource issues. An example: scheduling info:

Re: [gridengine users] qstat reports a job's priority as '-nan'

2012-12-11 Thread Reuti
On 10.12.2012 at 16:07, Gowtham wrote: While performing routine weekly checks on our test cluster (it runs Rocks 6.0 with CentOS 6.2), I noticed that qstat currently reports every job's priority as '-nan' (running and waiting). Uncertain of what caused this issue, I submitted a test job

Re: [gridengine users] [Rocks-Discuss] Problem with mpi program and sge

2012-12-11 Thread Reuti
On 10.12.2012 at 19:37, Luca Clementi wrote: On Mon, Dec 10, 2012 at 9:27 AM, Forster, Robert robert.fors...@agr.gc.ca wrote: Hello all: I'm running a small Rocks cluster (Rocks 5.4, 7 nodes, 56 cores). I need to run many iterations of a program that takes 13 hrs to finish on 53 cores.

[gridengine users] Requesting CPU Type with qsub / qrsh ?

2012-12-11 Thread Joseph Farran
Greetings. How do I request the CPU type in qrsh / qsub with SGE 8.1.2? Googling this question shows some answers of the type qrsh -l arch=xxx. However, all the nodes in my qhost output show the same arch: # qhost -F | grep arch hl:arch=lx-amd64 hl:arch=lx-amd64 hl:arch=lx-amd64

Re: [gridengine users] Some generic questions: binding, parallel, over-subscription

2012-12-11 Thread Reuti
Hi, On 07.12.2012 at 11:16, Arnau Bria wrote: I've configured our cluster in the way that slots/memory are consumable resources. [Reuti: slots are consumable by default in `qconf -sc`.] Our nodes have their limits and there are some default resource requirements at job submission. All this conf

Re: [gridengine users] Requesting CPU Type with qsub / qrsh ?

2012-12-11 Thread Reuti
On 11.12.2012 at 22:14, Joseph Farran wrote: Greetings. How do I request the CPU type in qrsh / qsub with SGE 8.1.2? Googling this question shows some answers of the type qrsh -l arch=xxx. However, all the nodes in my qhost output show the same arch: # qhost -F | grep arch

Re: [gridengine users] Some generic questions: binding, parallel, over-subscription

2012-12-11 Thread Reuti
On 11.12.2012 at 22:19, Reuti wrote: Hi, On 07.12.2012 at 11:16, Arnau Bria wrote: I've configured our cluster in the way that slots/memory are consumable resources. [Reuti: slots are consumable by default in `qconf -sc`.] Our nodes have their limits and there are some default

Re: [gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Jake Carroll
Cool. Thanks for the responses, guys. See inline: On 12/12/12 6:45 AM, Reuti re...@staff.uni-marburg.de wrote: On 11.12.2012 at 21:32, Gowtham wrote: I second Alex's thoughts. In all our clusters, we only use h_vmem The difference is that virtual_free is only guidance for SGE, whereas h_vmem

Re: [gridengine users] memory utilization of job using qstat

2012-12-11 Thread Vamsi Krishna
Thanks Jesse, yes, it always wraps around at 4G; it seems to be a bug. On Wed, Dec 12, 2012 at 1:08 AM, Jesse Becker becker...@mail.nih.gov wrote: On Fri, Dec 07, 2012 at 09:38:26AM +0530, Vamsi Krishna wrote: hi is there any grid engine command to find out the actual memory utilization of job submitted?
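A wrap at 4G is what one would see if the memory value were stored in an unsigned 32-bit counter. The thread does not confirm that this is the cause, so the following is only an illustration of the arithmetic under that assumption; `wrapped` is a hypothetical helper name.

```python
GIB = 1024**3
WRAP = 2**32  # 4 GiB: the number of distinct values in an unsigned 32-bit counter


def wrapped(true_bytes: int) -> int:
    """Value that a 32-bit unsigned counter would report for true_bytes."""
    return true_bytes % WRAP


# Under this assumption, a job that really peaked at 5 GiB is reported as 1 GiB,
# while anything below 4 GiB is reported correctly.
print(wrapped(5 * GIB) / GIB)
print(wrapped(3 * GIB) / GIB)
```

Under the same assumption, a reported value is ambiguous for any job that may have used 4 GiB or more, since the true figure could be the reported one plus any multiple of 4 GiB.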

Re: [gridengine users] Memory allocation woes. Any thoughts?

2012-12-11 Thread Schmidt U.
On 12/12/2012 02:17 AM, Jake Carroll wrote: Cool. Thanks for the responses, guys. See inline: On 12/12/12 6:45 AM, Reuti re...@staff.uni-marburg.de wrote: On 11.12.2012 at 21:32, Gowtham wrote: I second Alex's thoughts. In all our clusters, we only use h_vmem The difference is that