i've used qrsh instead of qsub to avoid the aggressive disk buffering.
it's a much better solution all around. check out this PR:
https://github.com/JuliaParallel/ClusterManagers.jl/pull/11
Yes, this seems to be the same issue. When my cluster is heavily loaded, it
is sometimes so slow that I hit the 60-second limit anyway.
Why don't we just grab the list of nodes from the file named by the SGE
environment variable PE_HOSTFILE as soon as it becomes available, and pass
that list to a simple addprocs(machines)?
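
Something like the following might work as a rough sketch of that idea. It assumes the usual PE_HOSTFILE layout (one line per node: hostname, slot count, queue, processor range), that it runs inside a job submitted to a parallel environment so PE_HOSTFILE is actually set, and that passwordless ssh between nodes is available, since addprocs(machines) launches workers over ssh. The helper name parse_pe_hostfile is made up for illustration and is not part of ClusterManagers.jl.

```julia
using Distributed  # on older Julia versions addprocs lives in Base; drop this line there

# Hypothetical helper: read an SGE PE_HOSTFILE and return (hostname, slots) pairs.
# Each line is assumed to look like:
#   node01 4 all.q@node01 UNDEFINED
function parse_pe_hostfile(path::AbstractString)
    machines = Tuple{String,Int}[]
    for line in eachline(path)
        fields = split(line)
        length(fields) < 2 && continue   # skip blank or malformed lines
        push!(machines, (String(fields[1]), parse(Int, fields[2])))
    end
    return machines
end

# Only try to add workers when SGE has actually provided a host file.
if haskey(ENV, "PE_HOSTFILE")
    machines = parse_pe_hostfile(ENV["PE_HOSTFILE"])
    addprocs(machines)   # starts `slots` workers on each listed host via ssh
end
```

Since the nodes are already allocated by the time PE_HOSTFILE exists, this would sidestep waiting on the scheduler inside the connection timeout, though starting the ssh connections themselves can still be slow on a loaded cluster.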
i'm having a problem on a cluster where setting up the connections via
ClusterManagers.addprocs_sge() takes longer than the 60 second limit. how
can I extend that limit? thanks!