On Fri, Feb 20, 2009 at 9:14 PM, mabshoff <[email protected]> wrote:
>
> On Feb 20, 8:59 pm, mabshoff <[email protected]> wrote:
>> On Feb 20, 8:42 pm, William Stein <[email protected]> wrote:
>
> <SNIP>
>
>> > In the above example the restart happened exactly once, then the build
>> > failed later and I got an error about the machine being too loaded.
>> > The script thus didn't successfully do the job it is supposed to do.
>>
>> Examine the log. This is something else in the ATLAS build.
>
> Ok, I figured it out I think: on debian32 this happens:
>
> STAGE 2-1-5: GEMV TUNE
> make -f Makefile INSTALL_LOG/dMVRES pre=d 2>&1 | ./xatlas_tee INSTALL_LOG/dMVTUNE.LOG
> make[3]: *** [build] Error 255
> make[3]: Leaving directory `/space/wstein/farm/sage-3.3.rc3/spkg/build/atlas-3.8.3/ATLAS-build'
> make[2]: *** [build] Error 2
> make[2]: Leaving directory `/space/wstein/farm/sage-3.3.rc3/spkg/build/atlas-3.8.3/ATLAS-build'
> ATLAS failed - round 1 - sleeping 5 minutes
>
> Then the restart kicks in and finishes the build.
>
> Restartig build for the first time
> make[2]: Entering directory `/space/wstein/farm/sage-3.3.rc3/spkg/build/atlas-3.8.3/ATLAS-build'
> make -f Make.top build
> make[3]: Entering directory `/space/wstein/farm/sage-3.3.rc3/spkg/build/atlas-3.8.3/ATLAS-build'
> cd bin/ ; make xatlas_install
> <SNIP>
>
> Because at some point there was a failure the makefile errors out at
> the very end even though it all worked. So the script you wrote is
> likely to hit the same bug unless you completely clean out the ATLAS
> build directory.
>
> The fix here is to figure out which file causes the tuning failure
> message in the end and to get rid of it before restart.
>
> Thoughts?

My script deletes the entire ATLAS-build directory, so it works. I just tested doing builds of the ATLAS spkg here all at once on boxen and everything worked perfectly. There were a total of five separate complete restarts, but after doing them everything finished building perfectly.

Anyway, I build Sage a lot on loaded systems, and I just want something that actually works, since this "restart since system is too loaded" ATLAS build issue has literally been causing me pain and suffering for over a year now. I know from tons of experience that simply restarting from scratch the build of that package nearly always works, since I have to do it a lot manually when doing build testing and building Sage binaries.

I don't care what you do to get this to work. Just fix it or use my spkg:

http://sage.math.washington.edu/home/wstein/patches/atlas-3.8.3.p1.spkg

I can easily test any proposed spkg.

 -- William
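
For reference, here is a minimal sketch of the restart-from-scratch idea discussed above: wipe the whole build directory before every attempt so that no file left over from a failed tuning run can trip the makefile at the end. It is not the code in the spkg. The function name, the `run_build` callback, and the default round count and sleep time are all assumptions made for illustration; `run_build` stands in for whatever configures and builds ATLAS in the freshly created directory and returns its exit status.

```python
import shutil
import time
from pathlib import Path


def build_with_clean_restarts(build_dir, run_build, max_rounds=5,
                              sleep_seconds=300, sleep=time.sleep):
    """Call run_build(build_dir) until it returns 0, wiping build_dir first.

    Hypothetical helper: returns the number of restarts that were needed
    (0 if the first attempt succeeded) and raises RuntimeError if every
    round fails.
    """
    build_dir = Path(build_dir)
    for round_no in range(1, max_rounds + 1):
        # Start from scratch so nothing from a failed attempt survives.
        if build_dir.exists():
            shutil.rmtree(build_dir)
        build_dir.mkdir(parents=True)

        if run_build(build_dir) == 0:
            return round_no - 1

        if round_no < max_rounds:
            # Give a loaded machine some time to calm down before retrying.
            print(f"build failed - round {round_no} - "
                  f"sleeping {sleep_seconds} seconds")
            sleep(sleep_seconds)

    raise RuntimeError(f"build still failing after {max_rounds} rounds")
```

A caller would pass a function that runs the real configure and make steps with `build_dir` as the working directory. The `sleep` parameter is only there so the wait can be replaced when trying the loop out.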
