Hello,

On Friday, March 21, 2014 4:31:59 AM UTC+1, Jiahao Chen wrote:
>
> I wrote a similar package long ago for Python and remember SGE array jobs
> well.
> If ClusterManager's addprocs_sge function doesn't respect the current
> working directory in the worker processes, it would be nice to file an
> issue about it. It would be really nice to have your code integrated
>
OK, I might do that. I feel a bit hesitant, though, to call this an issue.
Maybe the behaviour is by design.

> more tightly with ClusterManager rather than exist as a separate
> package.
>
It is just a few lines of code---if anybody from ClusterManagers is willing
to integrate it---by all means.

---david

> Thanks,
>
> Jiahao Chen
> Staff Research Scientist
> MIT Computer Science and Artificial Intelligence Laboratory
>
>
> On Thu, Mar 13, 2014 at 4:07 AM, David van Leeuwen
> <[email protected]> wrote:
> > Hello,
> >
> > I've got a tiny package that makes certain Sun Grid Engine array
> > processing jobs easier with Julia. I've put it up at
> >
> >     Pkg.clone("https://github.com/davidavdav/SGEArrays.jl.git")
> >
> > The premise is that your main Julia script needs to process a large
> > number of files, which are given as a list.
> >
> > Rather than splitting the files in separate lists outside Julia, and
> > spawning an array of jobs calling the Julia script with a different list
> > of files for every job, this splitting is done in an iterator.
> >
> > Your main julia script `bin/julia-script` could look like
> >
> >     using SGEArrays
> >
> >     listfile = ARGS[1]
> >     files = readdlm(listfile)
> >
> >     for file in SGEArray(files)
> >         ## process file $file
> >     end
> >
> > i.e., the `SGEArray(files)` replaces the bit where you would normally have
> > `files`.
> > Calling the script as an SGE task array of size 80 would go like:
> >
> >     find data/input/ -type f > filelist
> >     qsub -t 1-80 -b y -cwd bin/julia-script filelist
> >
> > but the code would also work outside SGE
> >
> >     bin/julia-script filelist
> >
> > For certain computing tasks I find this somewhat easier than using
> > ClusterManagers.addprocs_sge, which also doesn't seem to respect the
> > current working directory in the worker processes.
> >
> > Cheers,
> >
> > ---david
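
For readers unfamiliar with SGE task arrays, here is a rough shell sketch of the kind of per-task selection that the iterator saves you from doing by hand. It is not the SGEArrays.jl code. The helper name `task_files` is made up for illustration, and the round-robin split is an assumption (the package may well divide the list differently). The only things taken from SGE itself are the environment variables `SGE_TASK_ID`, `SGE_TASK_FIRST`, `SGE_TASK_LAST` and `SGE_TASK_STEPSIZE`, which SGE sets for each task of an array job.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT the SGEArrays.jl implementation.
# task_files is a hypothetical helper: it prints the lines of a list file
# that the current array task would handle, assuming a round-robin split.
task_files() {
    listfile=$1
    id=${SGE_TASK_ID:-1}          # index of this task within the array
    first=${SGE_TASK_FIRST:-1}    # first task index of the array
    last=${SGE_TASK_LAST:-1}      # last task index of the array
    step=${SGE_TASK_STEPSIZE:-1}  # step between task indices

    # Outside an array job SGE_TASK_ID is unset or the string "undefined";
    # treat that as a single task that handles every line.
    case $id in
        ''|*[!0-9]*) id=1; first=1; last=1; step=1 ;;
    esac

    awk -v id="$id" -v first="$first" -v last="$last" -v step="$step" '
        BEGIN {
            n = int((last - first) / step) + 1   # number of tasks in the array
            k = int((id - first) / step)         # zero-based rank of this task
        }
        (NR - 1) % n == k                        # keep every n-th line, offset k
    ' "$listfile"
}
```

With something like this, a job script could run `task_files filelist | while read -r f; do ...; done`, and the same script would process the whole list when started outside SGE. Doing the split inside the Julia iterator keeps that bookkeeping out of the job script.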
