Stuart,

On 7.05.08, Stuart Martin wrote:
> On May 7, 2008, at 7:19 AM, Steve White wrote:
>
> > These are things we'd like to do, but have not been able to get to
> > them.
> >
> > 1) the lack of a "prologue/epilogue script" in the job submission
>
> We had a prototype of a GRAM service Java RM API that included
> prologue and epilogue callouts, but we have not been able to give it
> the attention needed to get it into release form.

I'm not sure I understand "RM API" and "callouts".
I have expanded on my idea of a good user interface (for JDD, anyway)
in Jan's bug report:
http://bugzilla.mcs.anl.gov/globus/show_bug.cgi?id=5698

> > 2) generic control of RAM-per-process
>
> We would like to move to JSDL, where I would think this would be
> covered, but after scanning, it looks like it isn't. JSDL POSIX has
> MemoryLimit, but that is for the job and not for each process in the
> job. So I don't think even JSDL provides this.

This would suffice if implemented properly. The memory per process
would be

    mem_per_process = MemoryLimit / count

The number of cores to assign per node on a cluster with multi-core
nodes could then be calculated as

    available_RAM_per_node / mem_per_process

Cheers!

> From the JSDL 1.0 doc:
> >>>>
> jsdl-posix:MemoryLimit
>
> 8.1.14 MemoryLimit Element
>
> 8.1.14.1 Definition
> This element is a positive integer that describes the maximum amount
> of physical memory that the job should use when executing. The
> amount is given in bytes. If this is not present then the consuming
> system MAY choose its default value.
> <<<<
>
> >I regard these omissions as bugs.
> >
> >Generally, it is bad policy to add another layer of software to
> >compensate for bugs in a lower layer. It is to put bandages on
> >bandages.
> >
> >On the other hand, if we can fix these middle-layer problems, much
> >better higher-layer software can be made, much more easily.
> >
> >Cheers!
> >
> >On 6.05.08, Jan Ploski wrote:
> >>Steve White wrote:
> >>>Jan,
> >>>
> >>>I agree with your assessment that the need to adjust the memory
> >>>use per process is a general one in cluster job submission, that
> >>>it is in some way implemented by any underlying job management
> >>>system, and that these extensions ought not to be PBS-specific.
> >>>
> >>>I also looked at your "messy solution". (The code looks very
> >>>professional, really.)
> >>>It won't do for my purposes, because I need to present a minimal,
> >>>easily understood solution.
> >>>
> >>>Let me explain my situation:
> >>>
> >>>None of the compute resources is under my control. I can point
> >>>out problems to admins; that is all.
> >>>
> >>>I have been assigned two jobs.
> >>>
> >>>Our users and I are familiar with conventional cluster job
> >>>submission. One job was to bring them into the grid fold, showing
> >>>them the advantages of globusrun-ws. If it can be shown to be
> >>>really a cross-platform solution, giving them the ability to
> >>>(almost) effortlessly switch between grid clusters, the effort
> >>>will be a success.
> >>>
> >>>My other job is to write a report on practical MPI job submission
> >>>over the grid.
> >>>
> >>>We have come a long way, but still have to deal with a couple of
> >>>practical details. At this point, it looks like both of them will
> >>>end up as work-arounds to an incomplete implementation of a job
> >>>submission interface in Globus.
> >>>
> >>>If these issues can be dealt with in a future release of Globus,
> >>>grid job submission will look very attractive to real researchers.
> >>
> >>Hi,
> >>
> >>Based on my experience with Globus, you might be following the
> >>wrong route (the route to disenchantment). I view Globus more as
> >>middleware that has to be adapted (as in: "wrapped around" or
> >>"slightly modified") according to your users' needs and which
> >>plays an important role behind the scenes, but it probably should
> >>not be exposed directly to users as a drop-in replacement for
> >>their familiar job submission tools.
> >>
> >>There is a reason for that more important than the limitations you
> >>have discovered so far: Globus doesn't ship with command-line job
> >>management commands on par with those of TORQUE/Maui, Condor, or
> >>SGE.
> >>If you let users submit jobs with globus-job-submit, the next
> >>thing they are going to ask you is "how can I see what jobs I have
> >>submitted", "how can I cancel the job or resubmit it elsewhere",
> >>"is my job running or not", "why is my job not running", "when is
> >>my job going to start", etc.
> >>
> >>You need something in front of Globus to make your users' lives
> >>bearable. Some projects lean toward application-specific web
> >>portals (I think that's AstroGrid's approach). In our project, we
> >>have deployed a largely application-agnostic frontend based on
> >>Condor-G, but even so, some customization and some user training
> >>were required. The Condor-G approach might be relevant for you
> >>because it covers the scenario of making a transparent transition
> >>from a local batch system to a Grid - the Condor tools for
> >>submitting jobs and querying status are pretty much the same
> >>regardless of whether your job goes to a machine from a local pool
> >>(equivalent to an SGE- or PBS-managed cluster) or to a pool of
> >>Globus hosts. (In fact, Condor can submit to GT2 [gLite], GT4,
> >>Unicore, and some other Grid middlewares.)
> >>
> >>The disadvantage of Condor is that it is a rather huge software
> >>product, and trying to understand all of it can be daunting.
> >>Still, I suppose you could get the Grid submission piece of it
> >>running in a couple of hours if you wish to give it a try (by
> >>following our tutorials and asking questions where necessary).
> >>
> >>Regards,
> >>Jan Ploski
> >
> >--
> >- - - - - - - - - - - - - - - - - - - - - - - - -
> >Steve White                              +49(331)7499-202
> >e-Science / AstroGrid-D                  Zi. 35 Bg. 20
> >- - - - - - - - - - - - - - - - - - - - - - - - -
> >Astrophysikalisches Institut Potsdam (AIP)
> >An der Sternwarte 16, D-14482 Potsdam
> >
> >Vorstand: Prof. Dr. Matthias Steinmetz, Peter A.
> >Stolz
> >
> >Stiftung privaten Rechts, Stiftungsverzeichnis Brandenburg:
> >III/7-71-026
> >- - - - - - - - - - - - - - - - - - - - - - - - -

--
- - - - - - - - - - - - - - - - - - - - - - - - -
Steve White                              +49(331)7499-202
e-Science / AstroGrid-D                  Zi. 35 Bg. 20
- - - - - - - - - - - - - - - - - - - - - - - - -
Astrophysikalisches Institut Potsdam (AIP)
An der Sternwarte 16, D-14482 Potsdam

Vorstand: Prof. Dr. Matthias Steinmetz, Peter A. Stolz

Stiftung privaten Rechts, Stiftungsverzeichnis Brandenburg: III/7-71-026
- - - - - - - - - - - - - - - - - - - - - - - - -
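
P.S. A minimal sketch, in Python, of the per-process memory arithmetic
I describe above. The job and node sizes are made-up example values,
not anything a real scheduler reports; the only inputs assumed are a
JSDL-style MemoryLimit (whole job, in bytes) and the process count.

```python
# Sketch: derive per-process memory and per-node process packing from
# a JSDL-style job description. All concrete numbers below are
# hypothetical examples, not values from any real cluster.

def plan(memory_limit_bytes, count, ram_per_node_bytes):
    """memory_limit_bytes: jsdl-posix MemoryLimit for the whole job (bytes).
    count: number of processes in the job.
    ram_per_node_bytes: physical RAM available on each compute node."""
    mem_per_process = memory_limit_bytes // count
    # How many processes fit on one node without oversubscribing RAM:
    procs_per_node = ram_per_node_bytes // mem_per_process
    return mem_per_process, procs_per_node

# Example: a 16-process job limited to 32 GiB total, on 8 GiB nodes.
GiB = 1024 ** 3
mem, ppn = plan(32 * GiB, 16, 8 * GiB)
print(mem // GiB, ppn)  # 2 GiB per process, 4 processes per node
```

A middleware layer that did this would only need the two JSDL numbers
plus per-node RAM from the resource's information service.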
