On Oct 22, 2010, at 9:58 AM, Steve Lianoglou wrote:

> Hi,
> 
> Although I should, I don't follow too closely in the places I should
> (I don't know where those places are) to know if Apple is aware of
> this vecLib breakdown issue ... are they? Where can we go to add our
> voices/votes for them to fix it?
> 

The way to go is to file a concise bug report (using, say, just DGEMM 
timings), but I have not had time to do so. If someone else wants to go ahead, 
let me know and I can pass the bug # on to our contacts at Apple.
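
For anyone who wants to collect DGEMM timings for such a report, here is a minimal sketch of a timing harness. It is in Python with NumPy as a stand-in, not the R benchmark discussed in this thread. It assumes NumPy is installed and linked against the BLAS under test, in which case a double-precision matrix product is normally dispatched to DGEMM. The helper names (time_call, dgemm_gflops) and the matrix sizes are made up for illustration.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def dgemm_gflops(n, seconds):
    """GFLOP/s for an n x n times n x n product, counted as 2*n^3 operations."""
    return 2.0 * n ** 3 / seconds / 1e9


def main():
    try:
        import numpy as np  # assumption: NumPy built against the BLAS being tested
    except ImportError:
        print("NumPy not available; nothing to time.")
        return
    for n in (500, 1000, 2000):  # hypothetical sizes, adjust to taste
        a = np.random.rand(n, n)
        b = np.random.rand(n, n)
        seconds = time_call(lambda: a @ b)  # float64 product, normally DGEMM
        print(f"n={n:5d}  {seconds:8.3f} s  {dgemm_gflops(n, seconds):8.2f} GFLOP/s")


if __name__ == "__main__":
    main()
```

Running the same script once per BLAS (and, where possible, once per architecture) would give a small table that fits in a bug report.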


> Apple still has their apple.com/science section, so I guess this
> should be (somehow) important to them ... maybe we can make an
> R-SIG-Mac 20-person-strong tidal wave to help force their hand? ;-)
> 

Good question - and it's not only R people that are appalled.

Cheers,
Simon


> 
> 
> On Thu, Oct 21, 2010 at 12:36 PM, Simon Urbanek
> <simon.urba...@r-project.org> wrote:
>> On Oct 21, 2010, at 7:47 AM, Stefan Evert wrote:
>> 
>>> 
>>> On 21 Oct 2010, at 03:28, Simon Urbanek wrote:
>>> 
>>>> It's not vague at all, it's MacPro4,1 and MacPro5,1 models (you can use 
>>>> "sysctl hw.model" to find out what you have). If in doubt, check on 
>>>> Wikipedia ;)
>>>> 
>>>> The latter uses the Nehalem architecture but I don't have a specimen of 
>>>> those so I can't confirm that the bug still holds true for those.
>>> 
>>> Not just those ... I'm plagued by the same problem on my Penryn-based 
>>> MacBookPro4,1.  In 64-bit mode, BLAS performance breaks down to single core 
>>> levels, whereas in 32-bit mode (i.e. R --arch=i386) it uses both cores.  I 
>>> posted some benchmark results to this list a few weeks ago.
>>> 
>> 
>> Well, given that it is only a two-thread CPU there is not much you can gain 
>> so I wouldn't lose any sleep over it. If you have a 16-thread CPU it's a whole 
>> different story ;). For illustration, these are the timings from your 
>> benchmarks (only those that use BLAS) for 64-bit R 2.1...@10.6.4 on a 
>> 2.66GHz MacPro4,1:
>> 
>> test                    R BLAS  vecLib  ATLAS   MKL
>> inner M %*% t(M) D      19.961  3.470   0.519   0.662
>> inner tcrossprod D      0.658   1.867   0.243   0.235
>> inner crossprod t(M) D  9.574   1.849   0.242   0.256
>> cosine normalised D     0.798   2.009   0.385   0.411
>> cosine general D        0.770   1.993   0.380   0.352
>> euclid() D              2.072   3.271   1.637   1.635
>> euclid() small D        0.515   0.821   0.421   0.395
>> 
>> As you can see both MKL and ATLAS outperform vecLib and R BLAS by an order 
>> of magnitude. It's sad, because vecLib used to be fairly well optimized ... 
>> (in fact it is actually some version of ATLAS which is even more strange 
>> ...).
>> 
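
To put numbers on the gap in the table above, each timing can be divided by the ATLAS timing for the same test. A minimal Python sketch, with the values copied from the table. The unit is assumed to be seconds (the post does not state it), and the names TIMINGS, LIBS and slowdown_vs are made up for illustration.

```python
# Timings copied from the table above; unit assumed to be seconds.
LIBS = ("R BLAS", "vecLib", "ATLAS", "MKL")
TIMINGS = {
    "inner M %*% t(M) D":     (19.961, 3.470, 0.519, 0.662),
    "inner tcrossprod D":     (0.658, 1.867, 0.243, 0.235),
    "inner crossprod t(M) D": (9.574, 1.849, 0.242, 0.256),
    "cosine normalised D":    (0.798, 2.009, 0.385, 0.411),
    "cosine general D":       (0.770, 1.993, 0.380, 0.352),
    "euclid() D":             (2.072, 3.271, 1.637, 1.635),
    "euclid() small D":       (0.515, 0.821, 0.421, 0.395),
}


def slowdown_vs(reference, timings=TIMINGS):
    """Return, per test, each library's time divided by the reference library's time."""
    idx = LIBS.index(reference)
    return {test: tuple(t / row[idx] for t in row) for test, row in timings.items()}


if __name__ == "__main__":
    print(f"{'test':24s}" + "".join(f"{lib:>8s}" for lib in LIBS))
    for test, ratios in slowdown_vs("ATLAS").items():
        print(f"{test:24s}" + "".join(f"{r:8.1f}" for r in ratios))
```

On these figures vecLib takes roughly 7 to 8 times as long as ATLAS on the three matrix-product tests, about 5 times as long on the cosine tests, and about twice as long on the euclid() tests.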
>> 
>>> My solution has also been to switch to the reference BLAS, which 
>>> outperforms vecLib on most of the operations I benchmarked, except for 
>>> crossprod(), which is terribly slow (more than 10x slower than 
>>> tcrossprod()).  I've just tested again with R 2.12.0, and the situation has 
>>> become even worse: now an explicit matrix multiplication M %*% t(M) -- 
>>> which used to be fast -- performs as poorly as crossprod().
>>> 
>>> Any ideas about this?  The crossprod() slowdown isn't a Mac problem: I got 
>>> similar results on a Pentium Dual Core laptop running Ubuntu.  If this is a 
>>> known problem of the reference BLAS, is there any way to work around it?
>>> 
>>> Apart from the speed hiccups, in my benchmarks vecLib BLAS performed 
>>> consistently slower than the reference BLAS.  Is there evidence from other 
>>> benchmarks / hardware architectures that vecLib can be faster?  If not, 
>>> perhaps the default should be _not_ to use vecLib on Mac?  Or perhaps it 
>>> would be possible to autodetect hardware in the R startup wrapper and 
>>> select the BLAS that's known to run faster on this setup?
>>> 
>> 
>> I don't think we would want to do that since that would prevent the user 
>> from choosing the BLAS they want to use. We will probably abandon vecLib as 
>> the default for the next release (more due to its numerical instability 
>> issues) and maybe provide all three options (vecLib, R BLAS, ATLAS) for the 
>> user to choose from in case they have a machine that can take advantage of 
>> it.
>> 
>> Cheers,
>> Simon
>> 
>> _______________________________________________
>> R-SIG-Mac mailing list
>> R-SIG-Mac@stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>> 
> 
> 
> 
> -- 
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
> 
> 

