On Wed, Jun 2, 2010 at 2:00 PM, Dirk Eddelbuettel <e...@debian.org> wrote:
>
> On 2 June 2010 at 13:28, Paul Johnson wrote:
> | On Tue, Jun 1, 2010 at 10:56 PM, Dirk Eddelbuettel <e...@debian.org> wrote:
> | > Who says you need libRblas.so? We no longer do.
> | >
> | > | different BLAS.
> | > |
> | > |
> |
> | The R install & admin manual says so, actually. I think that you know
> | what's going on much more accurately than it does, and perhaps you
> | don't see that doc the way we do.
>
> I actually don't read it much as I have no problems building R (on Linux), so
> maybe I shouldn't have sent you that way as it may have clouded and muddled
> your understanding instead of helping.
>
> Initially, in your previous email, you said
>
>    This question reminded me I never understood BLAS linkage with R when
>    I asked about it 2 months ago and I forgot to follow up.
>
> and there is really nothing magic here. There is an interface (called BLAS)
> and a number of interchangeable libraries that can all provide libblas.so.
> They are "simply" arranged in such a manner (by the atlas + lapack packages)
> that the best one is preferred. That's all. The devil is in the detail _and I
> am merely using these facilities from R and other packages_ (which included
> Octave when I still maintained Octave).
>
> All I do is make sure libblas.so is there when R runs configure, so that R
> finds it and builds against it. Presto -- now you get your "pluggability".
>
> Depending on which package (libatlas*, refblas, ...) you have installed,
> running
>
>    ldd /usr/lib/R/bin/exec/R
>
> will point to different libraries standing in for libblas.so.
>
> Lastly, I would recommend that you stop worrying about libRblas.so. Note the
> R in that name. It is a fallback provided by R when the system has nothing
> better. In a darker age we (as in Debian) had to use it too (as gfortran and
> lapack had issues) but that has long passed. We now have something better,
> and it works. Enjoy it.
>

Thanks. I've got to make this work for RedHat/Centos, Fedora, and some
Solaris systems, so it helps to get to the core of it.

I'm "pretty sure" this is approximately right. If you think I'm right,
thanks for your tips along the way. If I'm wrong, well, it's somebody
else's fault :)

Question: How does the R executable know which BLAS shared library to use?

Answer: It uses whatever soname it was told to use at build time, and
the dynamic linking mechanism of the OS helps it find the file.
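
A minimal sketch of how one might check that answer on a given machine:
the function below filters ldd-style output down to the BLAS-like
entries, printing each soname and the file the loader resolved it to.
The name blas_libs is hypothetical (it is not part of R, Debian, or
ldd), and matching on "blas", "atlas", or "goto" in the soname is an
assumption about how the libraries are named.

```shell
# Hypothetical helper: reduce `ldd` output to the BLAS-like entries.
# Assumption: the library's soname contains "blas", "atlas", or "goto".
blas_libs() {
    # ldd lines look like:  <soname> => <resolved path> (<address>)
    # Keep lines with a resolved path whose soname matches, and print
    # "<soname> -> <resolved path>".
    awk 'tolower($1) ~ /blas|atlas|goto/ && $2 == "=>" { print $1, "->", $3 }'
}
```

It would be used as "ldd /usr/lib/R/bin/exec/R | blas_libs". Reading
from stdin keeps the function independent of where R is installed.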

If you build R with a configure option that tells it to use an external
BLAS like atlas and you do not have --enable-BLAS-shlib specified, then
no libRblas.so is created, and the R executable is linked directly
against the BLAS library that it found at compile time.

Here's what I see with the deb packages from CRAN:

    $ ldd /usr/lib/R/bin/exec/R
            linux-vdso.so.1 =>  (0x00007fff9c9ff000)
            libR.so => /usr/lib/libR.so (0x00007fcfe9550000)
            libc.so.6 => /lib/libc.so.6 (0x00007fcfe91ce000)
            libblas.so.3gf => /usr/lib/libblas.so.3gf (0x00007fcfe8f32000)

In Ubuntu 10.04 (lucid), the installation of R from CRAN (r-base,
r-base-core, etc.) causes the installation of the add-on packages
"libblas-dev" and "libblas3gf", and you see above that R is linked
against that BLAS.

In theory, according to README.Atlas.gz, it should be possible to have
several BLAS library collections installed at once. From Atlas, we
could have "base", "sse", and "sse2". The docs say "sse2" is best. In my
Ubuntu system, I have the universe repository enabled, but I don't see
the Atlas sse or sse2 versions (even though the launchpad listing says
those packages exist).

For Ubuntu 10.04, for blas we have the ones r-base-core pulls in,

    libblas-dev version 1.2-2build1
    libblas3gf version 1.2-2build1

and uninstalled:

    libatlas3gf-base version 3.6.0-24
    libatlas-base-dev
    libatlas-headers

The libblas packages above install /usr/lib64/libblas.so.3gf, and
libatlas3gf-base installs in a subdir: /usr/lib64/atlas/libblas.so.3gf.

R goes MUCH FASTER if libatlas3gf-base is installed.

Before installing libatlas3gf-base, I get this:

    > mm <- matrix(rnorm(10^7), ncol = 10^3)
    > system.time(crossprod(mm))
       user  system elapsed
      9.390   0.010   9.424

After installing libatlas3gf-base and restarting the computer, it
became MUCH faster:

    > mm <- matrix(rnorm(10^7), ncol = 10^3)
    > system.time(crossprod(mm))
       user  system elapsed
      2.250   0.000   2.254

I've installed and uninstalled that package several times, and R gets
slower and faster accordingly.
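
To put a number on "MUCH faster", the ratio of the two elapsed times
can be computed directly. This is only a small arithmetic sketch; the
function name speedup is hypothetical, and it assumes both arguments
are plain decimal seconds as printed by system.time().

```shell
# Hypothetical helper: ratio of a baseline time to a faster time,
# printed to two decimals.
speedup() {
    # $1 = baseline (slower) time in seconds, $2 = new (faster) time
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}
```

For the elapsed times quoted above, "speedup 9.424 2.254" prints 4.18,
i.e. crossprod() ran a bit more than four times faster with
libatlas3gf-base installed.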

The minimal conclusion I draw from this is that Ubuntu users should
install libatlas3gf-base.

Clearly, there is some dynamic linking "magic" going on so that the
system knows which libblas.so.3gf to use when R asks for it. I have
not seen this before, where 2 identically named .so files exist. But
check the output of

    $ /sbin/ldconfig -p
            libblas.so.3gf (libc6,x86-64) => /usr/lib/atlas/libblas.so.3gf
            libblas.so.3gf (libc6,x86-64) => /usr/lib/libblas.so.3gf

Hm. 2 libraries with the exact same name; the one in the atlas
directory is found first, so R uses it. If I remove libatlas3gf-base,
then, of course, the only one that is found is from libblas3gf.

Question: How can one experiment with other versions of BLAS?

Answer: Either replace the file /usr/lib/libblas.so.3gf with some
other shared object file, or rebuild R using --enable-BLAS-shlib and
replace the libRblas.so that it creates.

Explanation: The README.Atlas is pretty outdated. It outlines a
testing procedure, a script that uses the package manager to remove
all blas packages, run R, then install an atlas package, run R, then
install a different atlas, and so forth. That does not work on
current Ubuntu. The package system will not allow you to remove
libblas3gf to test this out.

The README.Atlas file shows a substantial speedup from ordinary R blas
to atlas3gf-base, and then about the same percentage improvement after
upgrading to atlas-sse2. So if I can figure out how to set the Ubuntu
repositories to get their version of sse2, I'll test.

In the meanwhile, I downloaded the Gotoblas2 code, version 1.13, from
the U Texas site (http://www.tacc.utexas.edu/tacc-projects/). I can't
give you that file because you are not supposed to redistribute it,
but you can sign up and get it for free. It's easy to build. I just ran
their script "quickbuild.64bit" and 5 minutes later out popped a
shared library.
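
The ldconfig observation above (two entries with the same soname, the
atlas one listed first) can be checked with a short filter. This is a
sketch only: the function name first_match is hypothetical, and it
assumes, as observed above, that the first entry listed by
"ldconfig -p" for a soname is the one the loader ends up using.

```shell
# Hypothetical helper: print the path of the first `ldconfig -p` entry
# whose soname equals $1. Reads the ldconfig listing on stdin.
first_match() {
    # Lines look like:  <soname> (<flags>) => <path>
    # so the path is the last whitespace-separated field.
    awk -v name="$1" '!seen && $1 == name { print $NF; seen = 1 }'
}
```

It would be used as "/sbin/ldconfig -p | first_match libblas.so.3gf".
With libatlas3gf-base installed, the listing above suggests it would
report the copy under /usr/lib/atlas.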

After removing the libatlas3gf-base package (just to be sure), I did this:

    $ sudo cp libgoto2_penrynp-r1.13.so /usr/lib64
    $ cd /usr/lib64
    $ sudo mv libblas.so.3gf.0 libblas.so.3gf.0-orig
    $ sudo ln -sf libgoto2_penrynp-r1.13.so libblas.so.3gf.0

Now look at my time:

    > mm <- matrix(rnorm(10^7), ncol = 10^3)
    > system.time(crossprod(mm))
       user  system elapsed
      1.140   0.040   0.592

WOW! Almost 2x as fast as Atlas3gf-base in user time (almost 4x in
elapsed time), and roughly 16x faster in elapsed time than Ubuntu's
default libblas3gf.

The only downside here is that I've replaced the libblas3gf that all
the other users and programs on the system expect. I should probably
bother only the R users.

That's where libRblas.so comes into the picture.

In order to leave libblas.so.3gf.0 unchanged, I found it is valuable
to build R with the --enable-BLAS-shlib option. That creates
libRblas.so, which R finds like so:

    $ ldd /usr/lib/R/bin/exec/R
            linux-vdso.so.1 =>  (0x00007fffe49c0000)
            libR.so => /usr/lib/libR.so (0x00007f4137331000)
            libRblas.so => /usr/lib/libRblas.so (0x00007f413712d000)

Replace libRblas.so with a symlink to the Atlas or Gotoblas2 shared
object file and all is done.

If I see any significant differences on our Fedora or RedHat/Centos
systems, I'll let you know.

pj

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

_______________________________________________
R-SIG-Debian mailing list
R-SIG-Debian@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-debian