On Jun 19, 2013, at 12:43 , Berend Hasselman wrote:

> 
> On 19-06-2013, at 10:24, peter dalgaard <pda...@gmail.com> wrote:
> 
>> 
>> On Jun 18, 2013, at 21:49 , peter dalgaard wrote:
>> 
>>> 
>>> On Jun 18, 2013, at 21:23 , Berend Hasselman wrote:
>>> 
>>>> 
>>>> So it seems that the blocked algorithm is the cause of the error and that 
>>>> using the (possibly slow) unblocked algorithm gives the correct result.
>>> 
>>> Thanks Berend,
>>> 
>>> The finger seems to be pointing to the internal libRlapack/libRblas. The 
>>> apparent pattern is that things work when R is linked against external 
>>> libraries and not when the internal ones are used. So it could be time to 
>>> start looking for differences between R's lapack module and the original 
>>> LAPACK code.
>> 
>> Now done and no (relevant) differences found. Hmmm....
>> 
>> There could be compiler issues. It could also be that the internal LAPACK 
>> uses a newer version than the external libs and a bug was introduced in 
>> between them.
>> 
>> One option could be to bypass R by coding Robin's example in pure Fortran 
>> and see whether issues persist when linked against libRlapack, vecLib, and 
>> the relevant subset of current LAPACK sources. (The bogus 1.0000 eigenvalue 
>> in the 33x33 variant of Robin's example should make it rather easy to spot 
>> whether things work or not.)
> 
> 
> I have made that pure Fortran program.
> I have run it on OS X 10.8.4 with
> 
> - libRblas libRlapack
> - my private refblas and Lapack
> - framework Accelerate
> - downloaded zheev.f + dependencies + libRblas
> - downloaded zheev.f + dependencies + my refblas
> 
> The Fortran program and the shell script to do the compiling and running and 
> the output file follow at the end of this mail.
> The shell scripts runs otool -L on each executable to make sure it is linked 
> to the intended libraries.
> The fortran program runs zheev with the minimal lwork and the optimal lwork.
> 
> Summary of the output:
> 
> - libRblas libRlapack  ==> Bad
> - my private refblas and Lapack  ==> OK
> - framework Accelerate ==> OK
> - downloaded zheev.f + dependencies + libRblas ==> Bad
> - downloaded zheev.f + dependencies + my refblas ==> OK
> 
> It looks like libRblas is the cause of the problem.
> I haven't done any further investigation of the matter.
> 
> Berend
> 
> Output is:
> 
> <output>
> Running hb with Rlapack and Rblas
> ./hbRlapack:
>       
> /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRblas.dylib 
> (compatibility version 0.0.0, current version 0.0.0)
>       
> /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRlapack.dylib 
> (compatibility version 3.0.0, current version 3.0.1)
>       /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
> version 169.3.0)
> lwork=         199
> info=           0
> min eig value= -4.433595556845234E-008
> lwork=        3300
> info=           0
> min eig value= -0.359347003948635     
> 
> Running hb with Haslapack and Hasblas (Lapack 3.4.2)
> ./hbHaslapack:
>       /Users/berendhasselman/Library/lib/x86_64/librefblas.dylib 
> (compatibility version 0.0.0, current version 0.0.0)
>       /Users/berendhasselman/Library/lib/x86_64/liblapack.dylib 
> (compatibility version 0.0.0, current version 0.0.0)
>       /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
> version 169.3.0)
> lwork=         199
> info=           0
> min eig value= -4.433595556845234E-008
> lwork=        3300
> info=           0
> min eig value= -4.433595559424842E-008
> 
> Running hb with -framework Accelerate
> ./hbveclib:
>       /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate 
> (compatibility version 1.0.0, current version 4.0.0)
>       /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
> version 169.3.0)
> lwork=         199
> info=           0
> min eig value= -4.433595568797253E-008
> lwork=        3300
> info=           0
> min eig value= -4.433595550683581E-008
> 
> Compile downloaded zheev + dependencies
> /Users/berendhasselman/Documents/Programming/R/problems/bug-error/eigbug
> Running hb with downloaded zheev and Rblas
> ./hbzheev:
>       
> /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRblas.dylib 
> (compatibility version 0.0.0, current version 0.0.0)
>       /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
> version 169.3.0)
> lwork=         199
> info=           0
> min eig value= -4.433595556845234E-008
> lwork=        3300
> info=           0
> min eig value= -0.359347003948635     
> 
> Running hb with downloaded zheev and Hasrefblas
> ./hbzheev:
>       /Users/berendhasselman/Library/lib/x86_64/librefblas.dylib 
> (compatibility version 0.0.0, current version 0.0.0)
>       /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
> version 169.3.0)
> lwork=         199
> info=           0
> min eig value= -4.433595556845234E-008
> lwork=        3300
> info=           0
> min eig value= -4.433595559424842E-008
> </output>
> 
> Fortran program:
> <fortran>
>      program test
> 
>      integer n
>      parameter(n=100)
>      complex*16 A(n,n)
>      double precision w(n), rwork(3*n-2)
> 
>      integer lwork, info
> 
>      complex*16 work, wtmp(1)
>      allocatable work(:)
> 
>      call genmat(A,n)
>      lwork = 2*n-1
>      allocate(work(lwork))
>      print *, "lwork=",lwork
>      call zheev("N","L",n,A,n,w,work,lwork,rwork,info)
>      print *, "info=",info
>      print *, "min eig value=",minval(w)
>      deallocate(work)
> 
>      call genmat(A,n)
>      lwork = -1
>      call zheev("N","L",n,A,n,w,wtmp,lwork,rwork,info)
>      lwork = real(wtmp(1))
>      allocate(work(lwork))
>      print *, "lwork=",lwork
>      call zheev("N","L",n,A,n,w,work,lwork,rwork,info)
>      print *, "info=",info
>      print *, "min eig value=",minval(w)
> 
>      stop
>      end
> 
>      subroutine genmat(A,n)
>      integer n
>      complex*16 A(n,*)
>      integer i, j
>      double precision tmp
> 
>      do i=1,n
>          do j=1,n
>              tmp = exp(-0.1D0 * (i-j)**2)
>              A(i,j) = cmplx(tmp,0.0D0)
>          enddo
>      enddo
> 
>      return
>      end
> </fortran>
> 
> 
> Shell script to do the compiling and running:
> <script>
> gfortran -c hankinbug.f -o hankinbug.o
> gfortran hankinbug.o -L/Library/Frameworks/R.framework/Libraries -lRblas 
> -lRlapack -o hbRlapack
> echo "Running hb with Rlapack and Rblas"
> otool -L ./hbRlapack
> ./hbRlapack
> echo ""
> 
> gfortran hankinbug.o -L${HOME}/Library/lib/x86_64 -lrefblas -llapack -o 
> hbHaslapack
> echo "Running hb with Haslapack and Hasblas (Lapack 3.4.2)"
> otool -L ./hbHaslapack
> ./hbHaslapack
> echo ""
> 
> gfortran hankinbug.o -framework Accelerate -o hbveclib
> echo "Running hb with -framework Accelerate"
> otool -L ./hbveclib
> ./hbveclib
> echo ""
> 
> echo "Compile downloaded zheev + dependencies"
> cd zheev
> gfortran -c *.f
> cd -
> gfortran hankinbug.o zheev/*.o -L/Library/Frameworks/R.framework/Libraries 
> -lRblas -o hbzheev
> echo "Running hb with downloaded zheev and Rblas"
> otool -L ./hbzheev
> ./hbzheev
> echo ""
> 
> gfortran hankinbug.o zheev/*.o -L${HOME}/Library/lib/x86_64 -lrefblas -o 
> hbzheev
> echo "Running hb with downloaded zheev and Hasrefblas"
> otool -L ./hbzheev
> ./hbzheev
> echo ""
> </script>
> 


Thanks. I think I have it nailed down now. The culprit was indeed in our 
reference BLAS (I had only checked the LAPACK code), cmplxblas.f to be 
specific. Revision 53001 had a number of IF statements being commented out, but 
two of the changes looked like this:

@@ -1561,7 +1561,7 @@
                   C( J, J ) = DBLE( C( J, J ) )
                END IF
                DO 170 L = 1, K
-                  IF( ( A( J, L ).NE.ZERO ) .OR. ( B( J, L ).NE.ZERO ) )
+c                  IF( ( A( J, L ).NE.ZERO ) .OR. ( B( J, L ).NE.ZERO ) )
      $                 THEN
                      TEMP1 = ALPHA*DCONJG( B( J, L ) )
                      TEMP2 = DCONJG( ALPHA*A( J, L ) )

Notice that the continuation line was NOT commented out. So FORTRAN, being what 
it is, continues the line before the comment and parses it as 
      DO 170 L = 1, KTHEN
with KTHEN uninitialized! and things go downhill from there. 

(The uninitialized variable was actually hinted at in PR14964 and the fact that 
I could get one of my builds to segfault also helped.)


-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to