I once wrote the benchmark mentioned in Stefan's post (based on initial work by Stephan Steinhaus), and it is still available for anyone who would like to update it. Note that it lacks checks on the results to make sure that a calculation is not only faster, but also correct!
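
A minimal sketch of what such a check could look like, assuming each benchmark step can be wrapped in a function and compared against a known reference value. The helper name timed_check and the matrix-inversion test case are hypothetical and are not part of the original benchmark:

```r
# Hypothetical helper: time a computation and verify its result
# against a known reference value.
timed_check <- function(fun, reference, tolerance = 1e-8) {
  elapsed <- system.time(result <- fun())[["elapsed"]]
  ok <- isTRUE(all.equal(result, reference, tolerance = tolerance))
  list(elapsed = elapsed, correct = ok)
}

# Illustrative test case: A %*% solve(A) should be the identity matrix
set.seed(42)
A <- matrix(rnorm(100 * 100), nrow = 100)
res <- timed_check(function() A %*% solve(A), reference = diag(100))
res$elapsed   # time in seconds
res$correct   # TRUE only if the result matches the reference
```

A timing would then only be reported when the corresponding check passes.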

Now, I'll explain why I haven't updated it, and you'll see that it is connected with the current topic.

First, lack of time, for sure.

Second, this benchmark has always been heavily criticized by several people, including members of the R Core Team. Basically, it is just a set of toy examples, disconnected from reality. Even with better test cases, benchmarks do not take into account the time needed to write the code for your particular application (from the question to the results).

I wrote this benchmark at a time when I overemphasized the raw performance of software, when I was looking for the best tool to choose for my future career.

Now, what is my choice, ten years later? Not two, not three programs... but just ONE: R. I do about 95% of my calculations with R (the rest is ImageJ/Java). Indeed, these benchmark results (and the toy example of Ajay Shah, a <- a + 1) should carry only marginal weight, because what matters is how your software tool performs in real applications, not in simplistic toy examples.

R lags behind Matlab for pure arithmetic calculation... right! But R has a better object-oriented approach, offers more variable types (factor, for instance), and has a richer mechanism for metadata handling (column/row names, various other attributes, ...), which makes it better suited than Matlab to representing complex datasets or analyses. Of course, this has a small cost in performance.
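
To make the point about factors and metadata concrete, here is a small sketch. The variable names and values are invented purely for illustration:

```r
# A factor stores categorical data together with its set of levels
site <- factor(c("river", "lake", "river", "sea"),
               levels = c("river", "lake", "sea"))
table(site)   # counts per level

# Row and column names travel with the matrix
counts <- matrix(1:6, nrow = 2,
                 dimnames = list(c("sample1", "sample2"),
                                 c("copepods", "diatoms", "rotifers")))
counts["sample2", "diatoms"]   # index by name instead of by position

# Arbitrary attributes can be attached as extra metadata
attr(counts, "units") <- "individuals per litre"
attributes(counts)
```

The names and attributes stay attached to the object as it is passed around, which is part of what the small performance cost pays for.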

As soon as you think about your problem in a vectorized way, R is, I think, one of the best tools for going "from the question to the answer" in real situations. How could we quantify this? I can only imagine big contests in which experts in each language would be given real problems, and one would measure the time needed to solve them. One should also measure the robustness, reusability, flexibility, and "elegance" of the code produced (how to quantify these?). Such a contest between R, Matlab, Octave, Scilab, etc. is very unlikely to happen.
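
As an illustration of what thinking in a vectorized way means in practice, here is a small sketch comparing an explicit loop with its vectorized equivalent. The function names are hypothetical, and the timings will depend on the machine and the R version:

```r
set.seed(1)
x <- runif(1e5)

# Element-by-element loop
loop_sum_sq <- function(x) {
  total <- 0
  for (xi in x) total <- total + xi^2
  total
}

# Vectorized equivalent: one call operating on the whole vector
vec_sum_sq <- function(x) sum(x^2)

# Same result (up to floating-point tolerance)...
all.equal(loop_sum_sq(x), vec_sum_sq(x))

# ...but the vectorized form is shorter to write and typically faster
system.time(loop_sum_sq(x))
system.time(vec_sum_sq(x))
```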

In the end, it is really a matter of personal feeling: you can run your own little contest by trying to solve a given problem in several software packages... and then decide which one you prefer. I think many people do or did this, and the still exponential growth in R use (at least as far as it can be observed from the increasing number of R packages on CRAN) is probably a good sign that R is one of the top performers when it comes to efficiency "from the question to the answer" on real problems, not just on little toy examples!

(Sorry for being so long; I think I miss some interaction with the R community these days ;-)
Best,

Philippe

..............................................<°}))><........
 ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons-Hainaut University, Belgium
( ( ( ( (
..............................................................

Stefan Grosse wrote:
I don't have octave (on the same machine) to compare these with.
And I don't have MatLab at all. So I can't provide a comparison
on that front, I'm afraid.
Ted.

Just to add some timings, I ran 1000 repetitions (adding up to
a=1001) on a notebook with a Core 2 Duo T7200:

R 2.8.1 on Fedora 10: mean 0.10967, st.dev 0.005238
R 2.8.1 on Windows Vista: mean 0.13245, st.dev 0.00943

Octave 3.0.3 on Fedora 10: mean 0.097276, st.dev 0.0041296

Matlab 2008b on Windows Vista: mean 0.0626, st.dev 0.005
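
For anyone wanting to collect comparable numbers in R, here is a rough sketch of how a mean and standard deviation of repeated timings could be obtained. The exact code that was timed is not shown in this thread, so the vector length, the number of steps, and the number of repetitions below are all assumptions:

```r
# Assumed setup: increment a numeric vector repeatedly, as in the
# toy example a <- a + 1, and return the elapsed time in seconds.
time_increment <- function(n_elements = 1e5, n_steps = 10) {
  a <- numeric(n_elements)
  system.time(for (i in seq_len(n_steps)) a <- a + 1)[["elapsed"]]
}

# Repeat the measurement and summarise it
timings <- replicate(20, time_increment())
c(mean = mean(timings), st.dev = sd(timings))
```

The resolution of system.time is limited, so very short runs may all report zero; increasing n_elements or n_steps helps in that case.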

But I am not sure how representative this is with such a very simple
example. To compare Matlab speed with R, a kind of benchmark suite is
necessary, like http://www.sciviews.org/benchmark/index.html, but that
one is very old. I would guess that not much has changed: sometimes
R is faster, sometimes not.

This difference between the Windows and Linux timing is probably not
really relevant: when I was comparing the timings of my usual analysis
there was no difference between the two operating systems. (count data
and time series stuff)

Cheers
Stefan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


