The bail-out is that "make" will eventually either succeed or fail with something other than "interrupted system call". Do we need another condition?
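
If we do want another condition, one option is a cap on the number of restarts. The following is only a sketch under assumptions: `run_with_retry`, `$max_tries`, and the stand-in command runner are hypothetical names that are not part of MTT, and a plain `exit_status != 0` check stands in for MTT::DoCommand::wsuccess.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MTT): run $cmd through $runner and
# restart it only while it fails with "interrupted system call" on
# stderr, giving up after $max_tries attempts.
sub run_with_retry {
    my ($runner, $cmd, $max_tries) = @_;
    my ($x, $tries) = (undef, 0);
    do {
        $x = $runner->($cmd);
        $tries++;
    } while ($tries < $max_tries
             and $x->{exit_status} != 0    # assumed stand-in for wsuccess()
             and $x->{result_stderr} =~ /interrupted system call/i);
    return ($x, $tries);
}

# Stand-in for MTT::DoCommand::Cmd (an assumption for this demo):
# fails twice with the NFS error, then succeeds.
my $calls = 0;
my $fake_cmd = sub {
    $calls++;
    return $calls < 3
        ? { exit_status => 256, result_stderr => "make: Interrupted system call" }
        : { exit_status => 0,   result_stderr => "" };
};

my ($result, $tries) = run_with_retry($fake_cmd, "make install", 5);
print "tries=$tries exit_status=$result->{exit_status}\n";
# prints "tries=3 exit_status=0"
```

With a cap in place, a make that keeps getting interrupted still ends the loop after `$max_tries` attempts, and the last failing result is returned to the caller to report as usual.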

I do not know which system call is getting interrupted, but here's an
interesting article on how different Unixes deal with connect()
interruptions:

  http://www.madore.org/~david/computers/connect-intr.html

-Ethan


On Tue, Oct/16/2007 04:59:29PM, Jeff Squyres wrote:
> Ick!
>
> This is a long-known problem [apparently] with Sun's NFS,
> unfortunately.  :-(
>
> I'd be ok with this if there is an eventual bail out of the loop --
> the prospect of an infinite loop is a bit scary for me.
>
>
> On Oct 16, 2007, at 11:23 AM, Ethan Mallove wrote:
>
> > On certain NFS servers, I run into the error message
> > "Interrupted system call" when executing long running
> > commands such as "make all". One solution I've been able to
> > use is to setup an NFS mount point solely for the cluster
> > I'm using, but this is not always an option. The below link
> > advises to restart the build on "Interrupted system call":
> >
> >   http://developers.sun.com/solaris/articles/parallel_make.html
> >
> > I wrapped the GNU_Install.pm make commands in a do-while to
> > effect the build restarts. E.g.,
> >
> >   do {
> >       $x = MTT::DoCommand::Cmd("make install")
> >   } while (!MTT::DoCommand::wsuccess($x->{exit_status}) and
> >            ($x->{result_stderr} =~ /interrupted system call/i));
> >
> > As long as make emits "interrupted system call" and fails,
> > MTT will keep restarting make.
> >
> > I realize this is ugly, but is it acceptable?
> >
> > -Ethan
> > _______________________________________________
> > mtt-devel mailing list
> > mtt-de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
>
>
> --
> Jeff Squyres
> Cisco Systems