I'll quote one of my comments on this StackOverflow question <http://stackoverflow.com/questions/9968578/speeding-up-julias-poorly-written-r-examples> :
> That all depends on what you are trying to measure. Personally, I'm not at
> all interested in how fast one can compute Fibonacci numbers. Yet that is
> one of our benchmarks. Why? Because I am very interested in how well
> languages support recursion – and the doubly recursive algorithm happens to
> be a great test of recursion, precisely because it is such a terrible way
> to compute Fibonacci numbers. So what would be learned by comparing an
> intentionally slow, excessively recursive algorithm in C and Julia against
> a tricky, clever, vectorized algorithm in R? Nothing at all.

On Fri, May 1, 2015 at 12:58 PM, Steven Sagaert wrote:

> Of course I'm not saying loops should not be benchmarked and I do use
> loops in julia also. I'm just saying that when doing performance comparison
> one should try to write the programs in each language in their most optimal
> style rather than similar style which is optimal for one language but very
> suboptimal in another language.
>
> Ah I didn't know the article was rebutted by Stefan. I read that article
> before that happened and just looked it up again now as an example.
>
> I guess the conclusion is that cross-language performance benchmarks are
> very tricky which was kinda my point :)
>
>
> On Friday, May 1, 2015 at 3:13:24 PM UTC+2, Tim Holy wrote:
>>
>> Hi Steven,
>>
>> I understand your point---you're saying you'd be unlikely to write those
>> algorithms in that manner, if your goal were to do those particular
>> computations. But the important point to keep in mind is that those
>> benchmarks are simply "toys" for the purpose of testing performance of
>> various language constructs. If you think it's irrelevant to benchmark
>> loops for scientific code, then you do very, very different stuff than me.
>> Not all algorithms reduce to BLAS calls. I use julia to write all kinds of
>> algorithms that I used to write MEX functions for, back in my Matlab days.
>> If all you need is A*b, then of course basically any scientific language
>> will be just fine, with minimal differences in performance.
>>
>> Moreover, that R benchmark on cumsum is simply not credible. I'm not sure
>> what was happening (and that article doesn't post its code or procedures
>> used to test), but julia's cumsum reduces to efficient machine code
>> (basically, a bunch of addition operations). If they were computing cumsum
>> across a specific dimension, then this PR:
>> https://github.com/JuliaLang/julia/pull/7359
>> changed things. But more likely, someone forgot to run the code twice (so
>> it got JIT-compiled), had a type-instability in the code they were
>> testing, or made some other mistake. It's too bad one can make mistakes,
>> of course, but then it becomes a comparison of different programmers
>> rather than different programming languages.
>>
>> Indeed, if you read the comments in that post, Stefan already rebutted
>> that benchmark, with a 4x advantage for Julia:
>>
>> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89
>>
>> --Tim
>>
>>
>>
>> On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote:
>> > I think the performance comparisons between Julia & Python are flawed.
>> > They seem to be between standard Python & Julia but since Julia is all
>> > about scientific programming it really should be between SciPy & Julia.
>> > Since SciPy uses much of the same underlying libs in Fortran/C the
>> > performance gap will be much smaller and to be really fair it should be
>> > between numba compiled SciPy code & julia. I suspect the performance
>> > will be very close then (and close to C performance).
>> >
>> > Similarly the standard benchmark (on the opening page of julia website)
>> > between R & julia is also flawed because it takes the best case scenario
>> > for julia (loops & mutable datastructures) & the worst case scenario for
>> > R. When the same R program is rewritten in vectorised style it beats
>> > julia, see
>> > https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/.
>> >
>> > So my interest in julia isn't because it is the fastest scientific high
>> > level language (because clearly at this stage you can't really claim
>> > that) but because it's a clean interesting language (still needs work
>> > for some rough edges of course) with clean(er) & clear(er) libraries and
>> > that gives reasonable performance out of the box without much tweaking.
>> >
>> > On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote:
>> > > Yes... Python will win on string processing... esp. with Python 3... I
>> > > quickly ran into things that were more than 800x faster in Python...
>> > > (I hope to help change that though!)
>> > >
>> > > Scott
>> > >
>> > > On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote:
>> > >> I wouldn't expect a difference in Julia for code like that (didn't
>> > >> check). But I guess what we are often seeing is someone comparing a
>> > >> tuned Python code to newbie Julia code. I still want it faster than
>> > >> that code.. (assuming same algorithm, note row vs. column major
>> > >> caveat).
>> > >>
>> > >> The main point of mine, *should* Python at any time win?
>> > >>
>> > >> 2015-04-30 21:36 GMT+00:00 Sisyphuss:
>> > >>> This post interests me. I'll write something here to follow this
>> > >>> post.
>> > >>>
>> > >>> The performance gap between normal code in Python and badly-written
>> > >>> code in Julia is something I'd like to know too.
>> > >>> As far as I know, the Python interpreter does some mysterious
>> > >>> optimizations.
>> > >>> For example `(x**2)**2` is 100x faster than `x**4`.
>> > >>>
>> > >>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>> > >>>> Hi,
>> > >>>>
>> > >>>> [As a best language is subjective, I'll put that aside for a moment.]
>> > >>>>
>> > >>>> Part I.
>> > >>>>
>> > >>>> The goal, as I understand, for Julia is at least within a factor of
>> > >>>> two of C and already matching it mostly and long term beating that
>> > >>>> (and C++).
>> > >>>> [What other goals are there? How about 0.4 now or even 1.0..?]
>> > >>>>
>> > >>>> While that is the goal as a language, you can write slow code in any
>> > >>>> language and Julia makes that easier. :) [If I recall, Bezanson
>> > >>>> mentioned it (the global "problem") as a feature, any change there?]
>> > >>>>
>> > >>>>
>> > >>>> I've been following this forum for months and newbies hit the same
>> > >>>> issues. But almost always without fail, Julia can be sped up (easily
>> > >>>> as Tim Holy says). I'm thinking about the exceptions to that - are
>> > >>>> there any left? And about the "first code slowness" (see Part II).
>> > >>>>
>> > >>>> Just recently the last two flaws of Julia that I could see were
>> > >>>> fixed: Decimal floating point is in (I'll look into the 100x
>> > >>>> slowness, that is probably to be expected of any language, still I
>> > >>>> think may be a misunderstanding and/or I can do much better). And I
>> > >>>> understand the tuple slowness has been fixed (that was really the
>> > >>>> only "core language" defect). The former wasn't a performance
>> > >>>> problem (mostly a non-existence problem and correctness one (where
>> > >>>> needed)..).
>> > >>>>
>> > >>>>
>> > >>>> Still we see threads like this recent one:
>> > >>>>
>> > >>>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
>> > >>>> "It seems changing the order of nested loops also helps"
>> > >>>>
>> > >>>> Obviously Julia can't beat assembly but really C/Fortran is already
>> > >>>> close enough (within a small factor). The above row vs. column major
>> > >>>> (caching effects in general) can kill performance in all languages.
>> > >>>> Putting that newbie mistake aside, is there any reason Julia can be
>> > >>>> within a small factor of assembly (or C) in all cases already?
>> > >>>>
>> > >>>>
>> > >>>> Part II.
>> > >>>>
>> > >>>> Except for caching issues, I still want the most newbie code or
>> > >>>> intentionally brain-damaged code to run faster than at least
>> > >>>> Python/scripting/interpreted languages.
>> > >>>>
>> > >>>> Potential problems (that I think are solved or at least not problems
>> > >>>> in theory):
>> > >>>>
>> > >>>> 1. I know Any kills performance. Still, isn't that the default in
>> > >>>> Python (and Ruby, Perl?)? Is there a good reason Julia can't be
>> > >>>> faster than at least all the so-called scripting languages in all
>> > >>>> cases (excluding small startup overhead, see below)?
>> > >>>>
>> > >>>> 2. The global issue, not sure if that slows other languages down,
>> > >>>> say Python. Even if it doesn't, should Julia be slower than Python
>> > >>>> because of global?
>> > >>>>
>> > >>>> 3. Garbage collection. I do not see that as a problem, incorrect?
>> > >>>> Mostly performance variability ("[3D] games" - subject for another
>> > >>>> post, as I'm not sure that is even a problem in theory..). Should
>> > >>>> reference counting (Python) be faster? On the contrary, I think RC
>> > >>>> and even manual memory management could be slower.
>> > >>>>
>> > >>>> 4. Concurrency, see nr. 3.
>> > >>>> GC may or may not have an issue with it. It can be a problem, what
>> > >>>> about in Julia? There are concurrent GC algorithms and/or real-time
>> > >>>> (just not in Julia). Other than GC is there any big (potential)
>> > >>>> problem for concurrent/parallel? I know about the threads work and
>> > >>>> new GC in 0.4.
>> > >>>>
>> > >>>> 5. Subarrays ("array slicing"?). Not really what I consider a
>> > >>>> problem, compared to say C (and Python?). I know 0.4 did optimize
>> > >>>> it, but what languages do similar stuff? Functional ones?
>> > >>>>
>> > >>>> 6. In theory, pure functional languages "should" be faster. Are they
>> > >>>> in practice in many or any case? Julia has non-mutable state if
>> > >>>> needed but maybe not as powerful? This seems a double-edged sword. I
>> > >>>> think Julia designers intentionally chose mutable state to conserve
>> > >>>> memory. Pros and cons? Mostly Pros for Julia?
>> > >>>>
>> > >>>> 7. Startup time. Python is faster and for say web use, or compared
>> > >>>> to PHP could be an issue, but would be solved by not doing CGI-style
>> > >>>> web. How good/fast is Julia/the libraries right now for say web use?
>> > >>>> At least for long running programs (intended target of Julia)
>> > >>>> startup time is not an issue.
>> > >>>>
>> > >>>> 8. MPI, do not know enough about it and parallel in general, seems
>> > >>>> you are doing a good job. I at least think there is no inherent
>> > >>>> limitation. At least Python is not in any way better for
>> > >>>> parallel/concurrent?
>> > >>>>
>> > >>>> 9. Autoparallel. Julia doesn't try to be, but could (be an addon?).
>> > >>>> Is anyone doing really good and could outperform manual Julia?
>> > >>>>
>> > >>>> 10. Any other I'm missing?
>> > >>>>
>> > >>>>
>> > >>>> Wouldn't any of the above or any you can think of be considered
>> > >>>> performance bugs?
>> > >>>> I know for libraries you are very aggressive. I'm thinking about
>> > >>>> Julia as a core language mostly, but maybe you are already fastest
>> > >>>> for most math stuff (if implemented at all)?
>> > >>>>
>> > >>>>
>> > >>>> I know to get the best speed, 0.4 is needed. Still, (for the above)
>> > >>>> what are the problems for 0.3? Have most of the fixed speed issues
>> > >>>> been backported? Is Compat.jl needed (or have anything to do with
>> > >>>> speed?) I think slicing and threads stuff (and global?) may be the
>> > >>>> only exceptions.
>> > >>>>
>> > >>>> Rust and some other languages also claim "no abstraction penalty"
>> > >>>> and maybe also other desirable things (not for speed) that Julia
>> > >>>> doesn't have. Good reason it/they might be faster or a good reason
>> > >>>> to prefer for non-safety related? Still any good reason to choose
>> > >>>> Haskell or Erlang? I do not know too much about Nim language that
>> > >>>> seems interesting but not clearly better/faster. Possibly Rust (or
>> > >>>> Nim?) would be better if you really need to avoid GC or for
>> > >>>> safety-critical. Would there be a best complementary language to
>> > >>>> Julia?
>> > >>>>
>> > >>>>
>> > >>>> Part III.
>> > >>>>
>> > >>>> Faster for developer time not CPU time. Seems to be.. (after a short
>> > >>>> learning curve). This one is subjective, but any languages clearly
>> > >>>> better? Right metric shouldn't really be to first code that seems
>> > >>>> right but bug-free or proven code. I'll leave that aside and
>> > >>>> safety-critical issues.
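
For readers who have not seen it, the doubly recursive Fibonacci algorithm mentioned at the top of this thread looks roughly like the sketch below. This is only an illustration in Python, not the benchmark code itself (which is not reproduced in this thread); the names `fib` and `count_calls` are made up here. The second function counts how many calls are made, which is why this is a test of function-call overhead rather than of arithmetic.

```python
def fib(n: int) -> int:
    """Doubly recursive Fibonacci: each call spawns two more calls.

    Deliberately inefficient; the point is to exercise recursion,
    not to compute Fibonacci numbers quickly.
    """
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def count_calls(n: int) -> int:
    """Hypothetical helper: number of calls that fib(n) performs in total."""
    if n < 2:
        return 1
    return 1 + count_calls(n - 1) + count_calls(n - 2)


if __name__ == "__main__":
    print(fib(20))          # 6765
    print(count_calls(20))  # 21891 calls to produce that one number
```

A vectorized or closed-form version would produce the same number with almost no function calls, so timing it against this version compares two different workloads.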
