On 28 February 2012 11:02, Stefano Bridi <[email protected]> wrote:
> Hi list,
> I have a problem on an SGE setup where the home directories are
> shared through glusterfs, and some jobs fail to start because of
> latency in filesystem propagation between the login node and the
> compute node.
> What happens is that a script creates a workdir with some support files,
> does a "cd" into it and then qsubs a script; sometimes the script starts
> to run on the compute node too quickly and the "workdir" is not yet
> visible on that node. I know it is a glusterfs problem that must be
> resolved elsewhere, but in the meantime, where can I put a "sleep"?
> Does a prerun hook exist that I can use for that? For other uses
> (copying files around and cleanup), does a similar postrun hook exist?

Had another thought. Set up a load sensor that reports the current time
(seconds since 1970) for a complex defined with the >= relation. Add a
request to the qsub (via a JSV if you don't want to make the submission
process more complex) for that complex with a value greater than
now + fudge factor. The job then can't be scheduled onto a host until the
time that host reports has reached the requested value, which gives the
filesystem a chance to catch up.
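A rough sketch of what such a load sensor could look like, in Python. It
follows the usual load sensor protocol as I understand it: execd writes a
line to the sensor's stdin for each report, the sensor answers with a
begin / host:complex:value / end block, and it exits on "quit". The complex
name "epoch" and the helper names are made up for illustration; the complex
would have to be defined first (qconf -mc) as an INT with relop >= and
requestable YES, and the script registered as load_sensor in the host
configuration. The host name printed may also need to match the name the
qmaster knows the host by.

```python
#!/usr/bin/env python
import socket
import sys
import time

# Hypothetical complex name; must match the entry created with qconf -mc.
COMPLEX_NAME = "epoch"


def format_report(host, name, value):
    """Build one load report block in the begin/host:name:value/end form."""
    return "begin\n{}:{}:{}\nend\n".format(host, name, value)


def request_value(fudge_seconds, clock=time.time):
    """Value to request at submit time: now plus a fudge factor in seconds."""
    return int(clock()) + fudge_seconds


def run_sensor(stdin=sys.stdin, stdout=sys.stdout, clock=time.time, host=None):
    """Answer each line from execd with the current epoch time until 'quit'."""
    host = host or socket.gethostname()
    while True:
        line = stdin.readline()
        # Empty string means EOF (execd went away); "quit" is a clean shutdown.
        if not line or line.strip() == "quit":
            break
        stdout.write(format_report(host, COMPLEX_NAME, int(clock())))
        stdout.flush()


if __name__ == "__main__":
    run_sensor()
```

On the submission side, something like
qsub -l epoch=$(( $(date +%s) + 30 )) job.sh
would ask for a 30 second fudge factor (again assuming the complex is called
"epoch"). Note that the reported value is only as fresh as the last load
report, so the real delay will be the fudge factor plus up to one load
report interval.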
William
