On Jun 19, 2013, at 12:43 , Berend Hasselman wrote: > > On 19-06-2013, at 10:24, peter dalgaard <pda...@gmail.com> wrote: > >> >> On Jun 18, 2013, at 21:49 , peter dalgaard wrote: >> >>> >>> On Jun 18, 2013, at 21:23 , Berend Hasselman wrote: >>> >>>> >>>> So it seems that the blocked algorithm is the cause of the error and that >>>> using the (possibly slow) unblocked algorithm gives the correct result. >>> >>> Thanks Berend, >>> >>> The finger seems to be pointing to the internal libRlapack/libRblas. The >>> apparent pattern is that things work when R is linked against external >>> libraries and not when the internal ones are used. So it could be time to >>> start looking for differences between R's lapack module and the original >>> LAPACK code. >> >> Now done and no (relevant) differences found. Hmmm.... >> >> There could be compiler issues. It could also be that the internal LAPACK >> uses a newer version than the external libs and a bug was introduced in >> between them. >> >> One option could be to bypass R by coding Robin's example in pure Fortran >> and see whether issues persist when linked against libRlapack, vecLib, and >> the relevant subset of current LAPACK sources. (The bogus 1.0000 eigenvalue >> in the 33x33 variant of Robin's example should make it rather easy to spot >> whether things work or not.) > > > I have made that pure Fortran program. > I have run it on OS X 10.8.4 with > > - libRblas libRlapack > - my private refblas and Lapack > - framework Accelerate > - downloaded zheev.f + dependencies + libRblas > - downloaded zheev.f + dependencies + my refblas > > The Fortran program and the shell script to do the compiling and running and > the output file follow at the end of this mail. > The shell scripts runs otool -L on each executable to make sure it is linked > to the intended libraries. > The fortran program runs zheev with the minimal lwork and the optimal lwork. > > Summary of the output: > > - libRblas libRlapack ==> Bad > - my private refblas and Lapack ==> OK > - framework Accelerate ==> OK > - downloaded zheev.f + dependencies + libRblas ==> Bad > - downloaded zheev.f + dependencies + my refblas ==> OK > > It looks like libRblas is the cause of the problem. > I haven't done any further investigation of the matter. > > Berend > > Output is: > > <output> > Running hb with Rlapack and Rblas > ./hbRlapack: > > /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRblas.dylib > (compatibility version 0.0.0, current version 0.0.0) > > /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRlapack.dylib > (compatibility version 3.0.0, current version 3.0.1) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current > version 169.3.0) > lwork= 199 > info= 0 > min eig value= -4.433595556845234E-008 > lwork= 3300 > info= 0 > min eig value= -0.359347003948635 > > Running hb with Haslapack and Hasblas (Lapack 3.4.2) > ./hbHaslapack: > /Users/berendhasselman/Library/lib/x86_64/librefblas.dylib > (compatibility version 0.0.0, current version 0.0.0) > /Users/berendhasselman/Library/lib/x86_64/liblapack.dylib > (compatibility version 0.0.0, current version 0.0.0) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current > version 169.3.0) > lwork= 199 > info= 0 > min eig value= -4.433595556845234E-008 > lwork= 3300 > info= 0 > min eig value= -4.433595559424842E-008 > > Running hb with -framework Accelerate > ./hbveclib: > /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate > (compatibility version 1.0.0, current version 4.0.0) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current > version 169.3.0) > lwork= 199 > info= 0 > min eig value= -4.433595568797253E-008 > lwork= 3300 > info= 0 > min eig value= -4.433595550683581E-008 > > Compile downloaded zheev + dependencies > /Users/berendhasselman/Documents/Programming/R/problems/bug-error/eigbug > Running hb with downloaded zheev and Rblas > ./hbzheev: > > /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRblas.dylib > (compatibility version 0.0.0, current version 0.0.0) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current > version 169.3.0) > lwork= 199 > info= 0 > min eig value= -4.433595556845234E-008 > lwork= 3300 > info= 0 > min eig value= -0.359347003948635 > > Running hb with downloaded zheev and Hasrefblas > ./hbzheev: > /Users/berendhasselman/Library/lib/x86_64/librefblas.dylib > (compatibility version 0.0.0, current version 0.0.0) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current > version 169.3.0) > lwork= 199 > info= 0 > min eig value= -4.433595556845234E-008 > lwork= 3300 > info= 0 > min eig value= -4.433595559424842E-008 > </output> > > Fortran program: > <fortran> > program test > > integer n > parameter(n=100) > complex*16 A(n,n) > double precision w(n), rwork(3*n-2) > > integer lwork, info > > complex*16 work, wtmp(1) > allocatable work(:) > > call genmat(A,n) > lwork = 2*n-1 > allocate(work(lwork)) > print *, "lwork=",lwork > call zheev("N","L",n,A,n,w,work,lwork,rwork,info) > print *, "info=",info > print *, "min eig value=",minval(w) > deallocate(work) > > call genmat(A,n) > lwork = -1 > call zheev("N","L",n,A,n,w,wtmp,lwork,rwork,info) > lwork = real(wtmp(1)) > allocate(work(lwork)) > print *, "lwork=",lwork > call zheev("N","L",n,A,n,w,work,lwork,rwork,info) > print *, "info=",info > print *, "min eig value=",minval(w) > > stop > end > > subroutine genmat(A,n) > integer n > complex*16 A(n,*) > integer i, j > double precision tmp > > do i=1,n > do j=1,n > tmp = exp(-0.1D0 * (i-j)**2) > A(i,j) = cmplx(tmp,0.0D0) > enddo > enddo > > return > end > </fortran> > > > Shell script to do the compiling and running: > <script> > gfortran -c hankinbug.f -o hankinbug.o > gfortran hankinbug.o -L/Library/Frameworks/R.framework/Libraries -lRblas > -lRlapack -o hbRlapack > echo "Running hb with Rlapack and Rblas" > otool -L ./hbRlapack > ./hbRlapack > echo "" > > gfortran hankinbug.o -L${HOME}/Library/lib/x86_64 -lrefblas -llapack -o > hbHaslapack > echo "Running hb with Haslapack and Hasblas (Lapack 3.4.2)" > otool -L ./hbHaslapack > ./hbHaslapack > echo "" > > gfortran hankinbug.o -framework Accelerate -o hbveclib > echo "Running hb with -framework Accelerate" > otool -L ./hbveclib > ./hbveclib > echo "" > > echo "Compile downloaded zheev + dependencies" > cd zheev > gfortran -c *.f > cd - > gfortran hankinbug.o zheev/*.o -L/Library/Frameworks/R.framework/Libraries > -lRblas -o hbzheev > echo "Running hb with downloaded zheev and Rblas" > otool -L ./hbzheev > ./hbzheev > echo "" > > gfortran hankinbug.o zheev/*.o -L${HOME}/Library/lib/x86_64 -lrefblas -o > hbzheev > echo "Running hb with downloaded zheev and Hasrefblas" > otool -L ./hbzheev > ./hbzheev > echo "" > </script> >
Thanks. I think I have it nailed down now. The culprit was indeed in our reference BLAS (I had only checked the LAPACK code), cmplxblas.f to be specific. Revision 53001 had a number of IF statements being commented out, but two of the changes looked like this: @@ -1561,7 +1561,7 @@ C( J, J ) = DBLE( C( J, J ) ) END IF DO 170 L = 1, K - IF( ( A( J, L ).NE.ZERO ) .OR. ( B( J, L ).NE.ZERO ) ) +c IF( ( A( J, L ).NE.ZERO ) .OR. ( B( J, L ).NE.ZERO ) ) $ THEN TEMP1 = ALPHA*DCONJG( B( J, L ) ) TEMP2 = DCONJG( ALPHA*A( J, L ) ) Notice that the continuation line was NOT commented out. So FORTRAN, being what it is, continues the line before the comment and parses it as DO 170 L = 1, KTHEN with KTHEN uninitialized! and things go downhill from there. (The uninitialized variable was actually hinted at in PR14964 and the fact that I could get one of my builds to segfault also helped.) -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd....@cbs.dk Priv: pda...@gmail.com ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel