On 29.01.2013 at 19:40, Reuti wrote:

> On 29.01.2013 at 12:50, Stefano Bridi wrote:
>
>> On Tue, Jan 29, 2013 at 9:26 AM, William Hay <[email protected]> wrote:
>>>
>>> On 25 January 2013 17:21, Stefano Bridi <[email protected]> wrote:
>>>>
>>>> Hi all, is there a way to use the scratch area (local disk) on the
>>>> compute node in a transparent way from the submitted script's point of
>>>> view?
>>>> What I want to do is to copy the job data to and from the compute
>>>> node's scratch area using the prolog/epilog, but I also need to start
>>>> the submitted script in the scratch area instead of the cwd.
>>>> Is there a way?
>>>
>>> Since you are mucking around with prolog and epilog I assume you have
>>> administrative control of the cluster.
>>> One solution would be to use a starter method to cd to $TMPDIR before
>>> execing the real job. starter_method is a bit of a swiss army chainsaw
>>> though (a flexible but dangerous tool).
>>>
>>> William
>>
>> Yes, I'm the admin: the problem I want to solve this way is lowering the
>> load on the central file server by using the local scratch area on the
>> "master" node as a scratch area.
>> What I mean is that if the job is serial or SMP, it is the local disk
>> (/scratch), and if the job uses multiple nodes (MPI), it is the local
>> disk "/scratch" of the first node, exported via NFS and mounted on the
>> fly via autofs ("/net/n0000/scratch/") on the other nodes.
>> This way, traffic to the central file server ("/home") happens only at
>> the start and at the end of the job, with the possibility of applying a
>> filter to throw away the useless, redundant huge files generated by the
>> software.
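William's starter_method idea could look like the sketch below. This is a hypothetical wrapper, assuming SGE is configured to invoke it with the real job command as its arguments and that $TMPDIR is the per-job node-local scratch directory SGE creates:

```shell
#!/bin/sh
# Hypothetical starter_method sketch: SGE runs this wrapper instead of the
# job script directly and passes the real job command as "$@".
start_in_scratch() {
    # $TMPDIR is the per-job node-local scratch directory managed by SGE.
    cd "${TMPDIR:?no per-job scratch directory}" || exit 1
    # exec replaces the wrapper so the job's exit status and signals
    # reach sge_execd unchanged.
    exec "$@"
}

# The real wrapper would end with:
#   start_in_scratch "$@"
```

It would be set as the starter_method of the relevant queue (via `qconf -mq`); since it then applies to every batch job in that queue, it should be tested carefully before rollout.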
>> Please don't laugh... Currently I'm doing this by copying the files to
>> the scratch area of the first node in the "start PE phase" and copying
>> them back in the "stop PE phase". It was my first try, and I discovered
>> too late the existence of the prolog/epilog way, which I now think
>> should be "the way" of doing this.
>
> Whether you do it in the PE script or prolog/epilog is personal taste IMO.
> If it's only necessary in case of a parallel run, the PE scripts might
> even be the more appropriate place.
>
>> Anyway, currently the users need to do a
>>
>> cd /net/`hostname -s`/scratch/${USER}.${JOB_ID}
>
> Do you create these directories on your own instead of using the built-in
> $TMPDIR?
>
> So this is done also on the machine where the jobscript runs, even though
> it would be accessible in /scratch?
>
> I'm still not sure about the workflow in detail, but I got 2 ideas and
> maybe you can make use of them:
>
> a) Submit the job with a hold to modify the -wd:
>
> reuti@pc15370:~> qsub -h -l h=pc15370 test.sh
> Your job 5532 ("test.sh") has been submitted
> reuti@pc15370:~> qalter -wd /tmp/5532.1.all.q 5532
> modified working directory of job 5532
> reuti@pc15370:~> qrls 5532
> modified hold of job 5532
>
> You need to submit with a hold, as you don't know the job number
> beforehand. So, no `cd` by hand is necessary, but a wrapper around `qsub`
> to do these steps for you.
>
> b) Use path aliasing in SGE. In the file:
>
> /usr/sge/default/common/sge_aliases
>
> you can put a line for each exechost:
>
> /dummy/ * pc15370 /tmp/
>
> reuti@pc15370:~> qsub -h -l h=pc15370 -wd /foobar test.sh
> Your job 5533 ("test.sh") has been submitted
> reuti@pc15370:~> qalter -wd /dummy/5533.1.all.q 5533
> modified working directory of job 5533
> reuti@pc15370:~> qrls 5533
> modified hold of job 5533
>
> You can submit with a plain /scratch/ there, and it will be replaced
> before execution to /tmp/ (man sge_aliases).
Correction: You can submit with a plain /dummy/ there, and it will be
replaced before execution to /tmp/ (man sge_aliases). Maybe it can be used
to map /scratch/ to /net/n0000/scratch/ or alike for each exechost.

> NB: It looks like a bug that the flag to enable path aliasing isn't set by
> `qalter`; hence already at submission time it's necessary to use -cwd or
> -wd /foobar to set it with an arbitrary path.
>
> -- Reuti
>
>> in the job script they submit in order to keep the mechanism working.
>> Now I have a new "user", which in fact is an automated system that I
>> prefer not to tweak, so I'm thinking of adapting GE to that automated
>> system instead.
>> What I'm trying to achieve is a system configured this way, but
>> "hardcoded" and transparent to the end user.
>> I suppose that the prolog/epilog is the right place to do the first/last
>> step (copying data around) and the starter_method is the right way to do
>> the other step; now I need to figure out how to do it and what side
>> effects could emerge: any idea on the second question?
>>
>> Thanks
>> Stefano
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
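Idea (a) above can be packaged as the small wrapper around `qsub` that Reuti suggests. A minimal sketch, assuming qsub's -terse option (which prints only the job id) and the site-specific "<jobid>.1.all.q" scratch naming from the example session:

```shell
#!/bin/sh
# Hypothetical qsub wrapper for idea (a): submit held, retarget the working
# directory to the job's node-local scratch path, then release the hold.

# Compose the per-job scratch working directory; the ".1.all.q" suffix
# mirrors the example session above and is site-specific.
scratch_wd() {
    printf '/tmp/%s.1.all.q' "$1"
}

if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub -h -terse "$@") || exit 1  # -terse prints only the job id
    qalter -wd "$(scratch_wd "$jobid")" "$jobid"
    qrls "$jobid"
fi
```

Users would call the wrapper instead of `qsub` directly, so no manual `cd` in the job script is needed.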
