Yes, 6581 sounds like it. Thanks for the clarification.

Jan

On Friday, 12 September 2014 14:12:46 UTC+2, Andreas Noack wrote:
>
> I think the reason for the slowdown in rand since 0.2.1 is this
>
> https://github.com/JuliaLang/julia/pull/6581
>
> Right now we are filling the array one by one, which is not efficient, but
> unfortunately it is our best option right now. In applications where you
> draw one random variate at a time there shouldn't be a difference.
>
> Kind regards
>
> Andreas Noack
>
> 2014-09-12 4:46 GMT-04:00 Ján Dolinský:
>
>> Finally, I found that Octave has an equivalent to sumabs2() called
>> sumsq(). Just for the sake of completeness, here are the timings:
>>
>> Octave
>> X = rand(7000);
>> tic; sumsq(X); toc;
>> Elapsed time is 0.0616651 seconds.
>>
>> Julia v0.3
>> @time X = rand(7000,7000);
>> elapsed time: 0.285218597 seconds (392000160 bytes allocated)
>> @time sumabs2(X, 1);
>> elapsed time: 0.05705666 seconds (56496 bytes allocated)
>>
>> Essentially, speed is about the same, with Julia being a little faster.
>>
>> It was, however, interesting to observe that @time X = rand(7000,7000);
>> is about 2.5 times slower in Julia 0.3 than it was in Julia 0.2 ...
>>
>> in Julia (v0.2.1):
>> @time X = rand(7000,7000);
>> elapsed time: 0.114418731 seconds (392000128 bytes allocated)
>>
>> Jan
>>
>> On Tuesday, 9 September 2014 17:06:59 UTC+2, Ján Dolinský wrote:
>>>
>>> Hello Andreas,
>>>
>>> Thanks for the tip. I'll check it out. Thumbs up for the 0.4!
>>>
>>> Jan
>>>
>>> On 09.09.2014 17:04, Andreas Noack wrote:
>>>
>>> If you need the speed now you can try one of the packages ArrayViews or
>>> ArrayViewsAPL. It is something similar to the functionality in these
>>> packages that we are trying to include in base.
>>>
>>> Kind regards
>>>
>>> Andreas Noack
>>>
>>> 2014-09-09 9:38 GMT-04:00 Ján Dolinský:
>>>
>>>> OK, so basically there is nothing wrong with the syntax X[:,1001:end]?
>>>> d = sumabs2(X[:,1001:end], 1);
>>>> and I should just wait until v0.4 is available (perhaps available
>>>> soon in the Julia Nightlies PPA).
>>>>
>>>> I did the benchmark with the floating point power function based on
>>>> Simon's comment. Here are my results (after a couple of repeated
>>>> runs):
>>>> @time X.^2;
>>>> elapsed time: 0.511988142 seconds (392000256 bytes allocated, 2.52% gc time)
>>>> @time X.^2.0;
>>>> elapsed time: 0.411791612 seconds (392000256 bytes allocated, 3.12% gc time)
>>>>
>>>> Thanks,
>>>> Jan Dolinsky
>>>>
>>>> On 09.09.2014 14:06, Andreas Noack wrote:
>>>>
>>>> The problem is that right now X[:,1001:end] makes a copy of the array.
>>>> However, in 0.4 this will instead be a view of the original matrix and
>>>> therefore the computing time should be almost the same.
>>>>
>>>> It might also be worth repeating Simon's comment that the floating
>>>> point power function has special handling of the exponent 2. The result
>>>> is that
>>>>
>>>> julia> @time A.^2;
>>>> elapsed time: 1.402791357 seconds (200000256 bytes allocated, 5.90% gc time)
>>>>
>>>> julia> @time A.^2.0;
>>>> elapsed time: 0.554241105 seconds (200000256 bytes allocated, 15.04% gc time)
>>>>
>>>> I tend to agree with Simon that special casing of integer 2 would be
>>>> reasonable.
>>>>
>>>> Kind regards
>>>>
>>>> Andreas Noack
>>>>
>>>> 2014-09-09 4:24 GMT-04:00 Ján Dolinský:
>>>>
>>>>> Hello guys,
>>>>>
>>>>> Thanks a lot for the lengthy discussions. They helped me a lot to get a
>>>>> feeling for what Julia is like. I did some more performance comparisons
>>>>> as suggested by the first two posts (thanks a lot for the tips). In the
>>>>> meantime I upgraded to v0.3.
>>>>> X = rand(7000,7000);
>>>>> @time d = sum(X.^2, 1);
>>>>> elapsed time: 0.573125833 seconds (392056672 bytes allocated, 2.25% gc time)
>>>>> @time d = sum(X.*X, 1);
>>>>> elapsed time: 0.178715901 seconds (392057080 bytes allocated, 14.06% gc time)
>>>>> @time d = sumabs2(X, 1);
>>>>> elapsed time: 0.067431808 seconds (56496 bytes allocated)
>>>>>
>>>>> In Octave then
>>>>> X = rand(7000);
>>>>> tic; d = sum(X.^2); toc;
>>>>> Elapsed time is 0.167578 seconds.
>>>>>
>>>>> So the ultimate solution is the sumabs2 function, which is a blast. I
>>>>> am coming from Matlab/Octave and I would expect X.^2 to be fast "out of
>>>>> the box", but nevertheless if I can get excellent performance by
>>>>> learning some new paradigms I will go for it.
>>>>>
>>>>> The above tests lead me to another question. I often need to calculate
>>>>> the "self" dot product over a portion of a matrix, e.g.
>>>>> @time d = sumabs2(X[:,1001:end], 1);
>>>>> elapsed time: 0.175333366 seconds (336048688 bytes allocated, 7.01% gc time)
>>>>>
>>>>> Apparently this is not the way to do it in Julia, because working on a
>>>>> smaller matrix of 7000x6000 more than doubles the computing time and,
>>>>> furthermore, it seems to allocate unnecessary memory.
>>>>>
>>>>> Best Regards,
>>>>> Jan
>>>>>
>>>>> On Monday, 8 September 2014 10:36:02 UTC+2, Ján Dolinský wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am a new Julia user. I am trying to write a function for computing
>>>>>> the "self" dot product of all columns in a matrix, i.e. calculating
>>>>>> the square of each element of a matrix and computing a column-wise
>>>>>> sum. I am interested in a proper way of doing it because I often need
>>>>>> to process large matrices.
>>>>>>
>>>>>> I first put a focus on calculating the squares. For testing purposes
>>>>>> I use a matrix of random floats of size 7000x7000. All timings here
>>>>>> were taken after several repeated runs.
>>>>>>
>>>>>> I used to do it in Octave (v3.8.1) as follows:
>>>>>> tic; X = rand(7000); toc;
>>>>>> Elapsed time is 0.579093 seconds.
>>>>>> tic; XX = X.^2; toc;
>>>>>> Elapsed time is 0.114737 seconds.
>>>>>>
>>>>>> I tried to do the same in Julia (v0.2.1):
>>>>>> @time X = rand(7000,7000);
>>>>>> elapsed time: 0.114418731 seconds (392000128 bytes allocated)
>>>>>> @time XX = X.^2;
>>>>>> elapsed time: 0.369641268 seconds (392000224 bytes allocated)
>>>>>>
>>>>>> I was surprised to see that Julia is about 3 times slower when
>>>>>> calculating a square than my original routine in Octave. I then read
>>>>>> "Performance tips" and found out that one should use * instead of
>>>>>> raising to small integer powers, for example x*x*x instead of x^3. I
>>>>>> therefore tested the following.
>>>>>> @time XX = X.*X;
>>>>>> elapsed time: 0.146059577 seconds (392000968 bytes allocated)
>>>>>>
>>>>>> This approach indeed resulted in a much shorter computing time. It is
>>>>>> still, however, a little slower than my code in Octave. Can someone
>>>>>> advise on any performance tips?
>>>>>>
>>>>>> I then finally do a sum over all columns of XX to get the "self" dot
>>>>>> product, but first I'd like to fix the squaring part.
>>>>>>
>>>>>> Thanks a lot.
>>>>>> Best Regards,
>>>>>> Jan
>>>>>>
>>>>>> p.s. A while ago I found in the Julia manual an example of using the
>>>>>> @vectorize macro with a squaring function, but I cannot find it any
>>>>>> more. Perhaps the name of the macro was different ...
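
For anyone reading this thread later, here is a minimal sketch of the two ideas discussed above: computing the column-wise sum of squares without building a temporary X.^2 matrix, and applying it to a sub-block through a view instead of a copy. It assumes a recent Julia (1.x), where sumabs2(X, 1) is no longer available and sum(abs2, X, dims=1) plays that role. The helper name colsumsq and the small matrix size are made up for illustration and do not come from the thread.

```julia
# Column-wise sum of squares (the "self" dot product of each column).
# Assumption: Julia 1.x. `colsumsq` is a hypothetical helper name.
function colsumsq(X::AbstractMatrix)
    d = zeros(eltype(X), size(X, 2))
    for j in axes(X, 2)
        s = zero(eltype(X))
        for i in axes(X, 1)        # inner loop walks down one column (column-major storage)
            s += abs2(X[i, j])     # accumulate squares, no temporary matrix is created
        end
        d[j] = s
    end
    return d
end

X = rand(200, 300)                 # small illustrative size, not the 7000x7000 from the thread

d_loop = colsumsq(X)
d_ref  = vec(sum(abs2, X, dims=1)) # Julia 1.x counterpart of sumabs2(X, 1)

# Sub-block without copying: a view shares memory with X.
V = @view X[:, 101:end]
d_view = colsumsq(V)

# Plain indexing X[:, 101:end] allocates a new matrix before summing.
d_copy = vec(sum(abs2, X[:, 101:end], dims=1))
```

Both variants should give the same numbers; the difference is only in allocation, which you can check on your own machine by wrapping the two calls in @time after a warm-up run.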
