[gridengine users] Use cgroup functionality using prolog
Hello everybody,

I'm using OGS/GE 2011.11p1 on Linux and I want to play with cgroups. My idea was simply to create a cgroup dynamically on an execution node per JOBID using a prolog (with cgcreate or a simple shell command), but I also need to run the user-provided command under cgexec. Is there a way to alter the command from within the prolog?

I'm not an everyday Grid Engine administrator, so I'm certainly not thinking the GE way. Any comments or suggestions are welcome.

PS: By the way, I've seen this blog post http://blogs.scalablelogic.com/2012/05/grid-engine-cgroups-integration.html but it is not clear to me how it can be used. I've just checked out the trunk source and grepped around, but I can't find anything related to cgroups besides some lines in hwloc. Maybe I'm grepping wrong or looking in the wrong place? =)

Regards,
Jean-Baptiste

___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] Use cgroup functionality using prolog
Hi,

On 22.08.2013 at 10:12, Jean-Baptiste Denis wrote:

> Hello everybody, I'm using OGS/GE 2011.11p1 on Linux and I want to play with cgroups. My idea was simply to dynamically create a cgroup on an execution node per JOBID using a prolog (and cgcreate or simple shell command), but I also need to run the provided user command with cgexec. Is there a way to alter the command within the prolog ?

What do you mean by alter the command? I would suggest looking into a redefinition of the starter_method in the queue definition.

-- Reuti

> I'm not an everyday grid engine administrator, so I'm certainly not thinking the GE way. Any comments or suggestions are welcomed. PS : By the way, i've seen this blog post http://blogs.scalablelogic.com/2012/05/grid-engine-cgroups-integration.html but this is not clear for me how it can be used. I've just checkouted the trunk source, grepping around, but I'm not able to find anything related to cgroup besides some lines in hwloc. Maybe I'm grepping wrong or looking at the wrong place ? =) Regards, Jean-Baptiste
Re: [gridengine users] Use cgroup functionality using prolog
Hi,

> What do you mean by alter the command?

Prepend something, like "ionice ... command" or "cgexec command".

> I would suggest to look into a redefinition of the starter_method in the queue definition.

I'll look into that. Thank you very much for the suggestion.

Jean-Baptiste
Re: [gridengine users] Use cgroup functionality using prolog
On 08/22/2013 11:50 AM, Jean-Baptiste Denis wrote:

> > I would suggest to look into a redefinition of the starter_method in the queue definition.
> I'll look into that. Thank you very much for the suggestion.

From the manpage, it does not look like I can access the $job_id or $task_id variables, which I would need in order to create a unique cgroup per task/job. I could also generate a random cgroup name, but that would complicate the cgroup cleanup. I'll try that to begin with. If anybody has another suggestion, I'll take it!

Jean-Baptiste
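For illustration, a starter_method wrapper that prepends cgexec could look roughly like the sketch below. It rests on assumptions that are not confirmed in this thread: that the starter inherits the job's environment, including JOB_ID and SGE_TASK_ID (with SGE_TASK_ID set to "undefined" for non-array jobs), and that a cgroup with the chosen name already exists and is usable by the job owner. The "sge/" prefix and the function names are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical starter_method wrapper: run the job under cgexec."""
import os
import sys


def cgroup_name(env):
    """Build a per-job (or per-task) cgroup path from the job environment."""
    job_id = env.get("JOB_ID", "unknown")
    task_id = env.get("SGE_TASK_ID", "undefined")
    # Assumption: non-array jobs report the task ID as "undefined".
    if task_id == "undefined":
        return "sge/{}".format(job_id)
    return "sge/{}.{}".format(job_id, task_id)


def build_command(env, job_argv, controllers="memory"):
    """Prefix the job's command line with 'cgexec -g controllers:path'."""
    group = "{}:{}".format(controllers, cgroup_name(env))
    return ["cgexec", "-g", group] + list(job_argv)


if __name__ == "__main__" and len(sys.argv) > 1:
    # The job script and its arguments arrive as this wrapper's arguments.
    cmd = build_command(os.environ, sys.argv[1:])
    os.execvp(cmd[0], cmd)  # replace the wrapper with cgexec
```

Deriving the name from the job and task IDs rather than generating a random one keeps cleanup simple, since whatever removes the cgroup afterwards can recompute the same path.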
Re: [gridengine users] [SGE-discuss] variable getting truncated in soge8.1.3 and OGS 2011.11p1
Ed Lauzier elauzi...@perlstar.com writes:

> Hi Dave, I found the section where the static buffer is defined. I'm thinking on the best way to handle this for the patch. Having a static buffer is fine especially for the size it is. Properly handling variables that exceed this limit is where the issue is. I'll work on it and get back to thru the bug tracking system.

Thanks. I suspect there are fixed buffers in multiple places which limit the values, probably defined as MAX_STRING_SIZE. You probably want to handle arbitrary-length strings with the dstring library http://arc.liv.ac.uk/SGE/adoc/libuti.html#uti-dstring if that doesn't require wholesale changes. I don't remember quite how the environment passing is all done, but I can check and advise if it's helpful (which might solve my puzzlement at it changing since 6.2u5).

I'm still intrigued about what generates the huge values.

-- Community Grid Engine: http://arc.liv.ac.uk/SGE/
Re: [gridengine users] Use cgroup functionality using prolog
Jean-Baptiste Denis jbde...@pasteur.fr writes:

> Hello everybody, I'm using OGS/GE 2011.11p1 on Linux and I want to play with cgroups.

What do you want to do with them? (They're not without problems.)

> My idea was simply to dynamically create a cgroup on an execution node per JOBID using a prolog

The prolog is no use, as such, because it only runs on the master node, and isn't connected to the shepherd.

> (and cgcreate or simple shell command), but I also need to run the provided user command with cgexec. Is there a way to alter the command within the prolog ? I'm not an everyday grid engine administrator, so I'm certainly not thinking the GE way. Any comments or suggestions are welcomed.

There is preliminary cpuset support in the current SGE (though there's a bug report of problems I haven't got to the bottom of) and Mark Dixon has posted patches whose functionality will get merged soon.

> PS : By the way, i've seen this blog post http://blogs.scalablelogic.com/2012/05/grid-engine-cgroups-integration.html but this is not clear for me how it can be used. I've just checkouted the trunk source, grepping around, but I'm not able to find anything related to cgroup besides some lines in hwloc. Maybe I'm grepping wrong or looking at the wrong place ? =)

Look in the current SGE source (which has been partly restructured, but that's not finished and published), and also in the source of SLURM, and possibly Condor.

If you want to work on it, I'm happy to make suggestions for contributions to SGE (and Mark probably will), bearing in mind much of the work is already done. The main issue, apart from cgroups being a moving target with their own problems, is actually providing a good user/admin interface, at least in the current SGE style.

-- Community Grid Engine: http://arc.liv.ac.uk/SGE/
Re: [gridengine users] Use cgroup functionality using prolog
Reuti re...@staff.uni-marburg.de writes:

> Hi, On 22.08.2013 at 10:12, Jean-Baptiste Denis wrote:
> > Hello everybody, I'm using OGS/GE 2011.11p1 on Linux and I want to play with cgroups. My idea was simply to dynamically create a cgroup on an execution node per JOBID using a prolog (and cgcreate or simple shell command), but I also need to run the provided user command with cgexec. Is there a way to alter the command within the prolog ?
> What do you mean by alter the command? I would suggest to look into a redefinition of the starter_method in the queue definition.

That runs as the user, so it can only manipulate cgroups with an suid utility or suitable capability (caveat emptor). The job needs doing in execd and shepherd. The only hook currently available (pending SLURM-like loadable modules) that runs privileged at least once per node is shepherd_cmd. It's non-trivial to have that do cleanup (as opposed to exec'ing sge_shepherd) without major breakage.

-- Community Grid Engine: http://arc.liv.ac.uk/SGE/
Re: [gridengine users] Use cgroup functionality using prolog
> What do you want to do with them? (They're not without problems.)

I've already been playing with them on my desktop machine for a few months, and I just want to experiment with using cgroups within SGE. I'd like to prevent a job from eating more than X% of RAM+swap using:

    memory.limit_in_bytes        # set/show limit of memory usage
    memory.memsw.limit_in_bytes  # set/show limit of memory+Swap usage

> The prolog is no use, as such, because it only runs on the master node, and isn't connected to the shepherd.

OK, I should have understood that.

> There is preliminary cpuset support in the current SGE (though there's a bug report of problems I haven't got to the bottom of) and Mark Dixon has posted patches whose functionality will get merged soon.

OK.

> Look in the current SGE source (which has been partly restructured, but that's not finished and published), and also in the source of SLURM, and possibly Condor.

I've cloned the current trunk of the Subversion repository, but I didn't see anything. Maybe it is in a specific branch; I'll look around. From the blog post I read, I thought it was already in 2011.11p1, hence the confusion.

> If you want to work on it, I'm happy to make suggestions for contributions to SGE (and Mark probably will), bearing in mind much of the work is already done. The main issue, apart from cgroups being a moving target with their own problems, is actually providing a good user/admin interface, at least in the current SGE style.

I'm too lazy (and not really the right guy) =)

Thank you for your answer.
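The two control files mentioned in this message could be driven along the lines of the sketch below. It is only an illustration under stated assumptions: "X% of RAM+swap" is read here as applying the same fraction to RAM (for memory.limit_in_bytes) and to RAM plus swap (for memory.memsw.limit_in_bytes), the cgroup directory is one the caller has already created and may write to, and the function names are hypothetical.

```python
import os


def memory_limits(ram_bytes, swap_bytes, fraction):
    """Return (memory limit, memory+swap limit) in bytes for a fraction in (0, 1]."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ram_limit = int(ram_bytes * fraction)
    memsw_limit = int((ram_bytes + swap_bytes) * fraction)
    return ram_limit, memsw_limit


def apply_limits(cgroup_dir, ram_limit, memsw_limit):
    """Write both limits into an existing cgroup v1 memory directory."""
    # memory.limit_in_bytes goes first: the memory+swap limit may not be
    # lower than the memory limit, so the order matters when tightening.
    settings = (
        ("memory.limit_in_bytes", ram_limit),
        ("memory.memsw.limit_in_bytes", memsw_limit),
    )
    for name, value in settings:
        with open(os.path.join(cgroup_dir, name), "w") as handle:
            handle.write(str(value))
```

With 8 GiB of RAM, 2 GiB of swap and a fraction of 0.5, memory_limits returns 4 GiB and 5 GiB. Writing the values requires whatever privilege the cgroup hierarchy demands, which is the same constraint raised elsewhere in this thread for hooks that run as the job owner.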
Re: [gridengine users] Random queue errors, and suspect pe_hostfiles
In the message dated: Thu, 01 Aug 2013 03:09:56 -, The pithy ruminations from Jewell, Chris on [gridengine users] Random queue errors, and suspect pe_hostfiles were:

= Hello all,
=
= A while since I posted here, so good to be back!
=
= My installation of GE 8.1.3 from the Scientific Linux 6.3 RPM repo has started misbehaving of late, since I introduced a share tree policy the other day.

I'm using SGE 6.2u5 under RHEL 6.3.

= My setup is contained entirely on my 32 cpu, 2 NVIDIA Tesla card machine (both qmaster and execd), and the spool directory is mounted in /opt which is on the root partition. Having had a very stable vanilla

I'm trying to do performance testing on 2 machines:

    32 CPU Intel vs 64 CPU AMD
    each machine is isolated as a single 'cluster' with both qmaster and execd on each server
    the spool directory for each 'cluster' is mounted in /opt which is on the root partition

The SGE configuration is very closely based on the stable config we've been using on our production cluster for several years.

[SNIP!]

= This seems associated with another new (though less frequent) error message:
=
= 07/29/2013 09:55:54| main | it060123 | E | can't start job 821: can't open file /opt/sge/default/spool/it060123/active_jobs/821.36000/pe_hostfile: Permission denied

I'm seeing the same error on both clusters.

The error shows up reliably but randomly. For example, I've got a sample set of jobs, intended to load each server in order to test throughput. The test submits 2048 jobs. The error occurs when about 75~95 jobs have run on each server, at which point the queue is in an error state with the remaining jobs waiting. The error doesn't occur after a fixed number of jobs, and seems to be independent of the job content: I've tried 4 different types of workloads during the process of evaluating the servers, and the error is the same.

= which puts the queue into an error state. This appears to happen to a minority of jobs at random, but of course stalls the queue. I'm fairly sure the filesystem is okay (at least, an fsck tells me it is), so I'm assuming it's something related to GE.

Same here...

= Any ideas on where to start?

I started with a search of the SGE mailing list archive, and found your post. :)

Have you found a solution?

Thanks,
Mark

= Cheers,
=
= Chris
=
= --
= Dr Chris Jewell
= Lecturer in Biostatistics
= Institute of Fundamental Sciences
= Massey University
= Private Bag 11222
= Palmerston North 4442
= New Zealand
= Tel: +64 (0) 6 350 5701 Extn: 3586
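To see how the failures are spread over time and jobs, the error line quoted in this message can be picked out of a messages file with a small parser. The sketch below is only an aid for that kind of counting: the regular expressions are modelled on the single sample line shown here and may need adjusting for other messages, and the function name is hypothetical.

```python
import re

# Layout assumed from the sample line: date time|thread|host|level|message
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\|"
    r"\s*(?P<thread>[^|]+?)\s*\|"
    r"\s*(?P<host>[^|]+?)\s*\|"
    r"\s*(?P<level>[A-Z])\s*\|"
    r"\s*(?P<message>.*)$"
)
FAIL_RE = re.compile(
    r"can't start job (\d+): can't open file (\S+/pe_hostfile): Permission denied"
)


def pe_hostfile_failures(lines):
    """Return one dict per 'pe_hostfile: Permission denied' error line."""
    failures = []
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if parsed is None or parsed.group("level") != "E":
            continue  # not an error line in the expected layout
        hit = FAIL_RE.search(parsed.group("message"))
        if hit is None:
            continue  # some other error
        failures.append({
            "date": parsed.group("date"),
            "time": parsed.group("time"),
            "host": parsed.group("host"),
            "job_id": hit.group(1),
            "path": hit.group(2),
        })
    return failures
```

Feeding it the lines of a messages file gives a list that can be sorted by time or grouped by host, which makes it easier to check whether the failures cluster around bursts of very short jobs.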
Re: [gridengine users] Random queue errors, and suspect pe_hostfiles
> I started with a search of the SGE mailing list archive, and found your post. :) Have you found a solution?

Hello all,

Sorry for the long leave of absence. I've been thoroughly testing my system for this issue. I checked my RAID1 for consistency, and performed an xfs_repair to make doubly sure my filesystem was okay. It was. I also disabled SELinux in case that was the problem.

In reply to Reuti:

> The directories (/opt/sge/default/spool/it060123/active_jobs/...) are normally created by the admin user - is this root or any other one with normal rights (which would be fine)? Nevertheless also other users must be allowed to read this directory and the files inside. Is there any special `umask` in place and/or does it only happen to parallel jobs and/or only certain users?

No special umask or parallel jobs being used. The problem seems more apparent when lots of very short jobs are sent to the system.

The one thing that Mark and I have in common is high-CPU-count machines. My box is currently configured to provide 28 slots out of 32 logical cores. I wonder if this might be causing a race condition to become apparent in the creation of the pe_hostfile?

Cheers,

Chris

--
Dr Chris Jewell
Lecturer in Biostatistics
Institute of Fundamental Sciences
Massey University
Private Bag 11222
Palmerston North 4442
New Zealand
Tel: +64 (0) 6 350 5701 Extn: 3586