Sounds like a bad case of a network file system. You probably need to
harass your sysadmin, and try a few of the suggestions here too:
    http://fixunix.com/nfs/61890-forcing-nfs-sync.html
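
In the meantime, a fixed sleep() is unlikely to cover a delay that varies up to a couple of minutes, so polling with a timeout may work better. Below is a minimal sketch of that idea in Python, not the actual EMS code: the function name wait_for_files and its timeout and interval defaults are hypothetical, and the same loop would have to be ported to Perl to live inside check_if_crashed(). Listing the parent directory is a commonly suggested way to nudge an NFS client into refreshing its cache; treat that as an assumption, since it depends on how the share is mounted.

```python
import os
import time


def wait_for_files(paths, timeout=300.0, interval=5.0):
    """Return True once every path exists, or False if the timeout expires.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Assumption: listing each parent directory may prompt the NFS
        # client to refresh its cached directory entries. Not guaranteed.
        for parent in {os.path.dirname(p) or "." for p in paths}:
            try:
                os.listdir(parent)
            except OSError:
                pass  # directory not visible yet; keep polling
        if all(os.path.exists(p) for p in paths):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The idea would be to call this on the step's .STDERR and .STDOUT files once the .DONE file is seen, and only treat the step as crashed if it returns False.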

On 02/09/2010 04:09, Suzy Howlett wrote:
> Hi everyone,
>
> I'm running Moses through its experiment management system across a
> cluster and I'm finding that sometimes jobs will finish successfully but
> the .STDERR and .STDOUT files will be slow in appearing relative to the
> .DONE file, meaning that the EMS concludes that the step crashed. I can
> run the system again and it successfully reuses the results of the step
> (it doesn't have to rerun the step) but this is becoming frustrating as
> I have to restart the system frequently. I tried adding a call to sleep()
> in the check_if_crashed() method in experiment.perl but this is not
> helping in general - I think sometimes the delay is as much as a couple
> of minutes.
>
> Has anyone else faced this problem, or does anyone have a better idea
> for how to get around it?
>
> Cheers,
> Suzy
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
