Yes, 6581 sounds like it. Thanks for the clarification.

Jan 

On Friday, 12 September 2014 14:12:46 UTC+2, Andreas Noack wrote:
>
> I think the reason for the slowdown in rand since 0.2.1 is this
>
> https://github.com/JuliaLang/julia/pull/6581
>
> Right now we are filling the array one by one, which is not efficient, but 
> unfortunately it is our best option right now. In applications where you 
> draw one random variate at a time there shouldn't be a difference.
>
> Kind regards
>
> Andreas Noack
>
> 2014-09-12 4:46 GMT-04:00 Ján Dolinský:
>
>> Finally, I found that Octave has an equivalent to sumabs2() called 
>> sumsq(). Just for the sake of completeness, here are the timings:
>>
>> Octave
>> X = rand(7000);
>> tic; sumsq(X); toc;
>> Elapsed time is 0.0616651 seconds.
>>
>> Julia v0.3
>> @time X = rand(7000,7000);
>> elapsed time: 0.285218597 seconds (392000160 bytes allocated)
>> @time sumabs2(X, 1);
>> elapsed time: 0.05705666 seconds (56496 bytes allocated)
>>
>>
>> Essentially the speed is about the same, with Julia being a little faster.
>>
>> It was however interesting to observe that @time X = rand(7000,7000);
>> is about 2.5 times slower in Julia 0.3 than it was in Julia 0.2 ...
>>
>> in Julia (v0.2.1):
>>  @time X = rand(7000,7000);
>> elapsed time: 0.114418731 seconds (392000128 bytes allocated)
>>  
>>
>> Jan
>>
>> On Tuesday, 9 September 2014 17:06:59 UTC+2, Ján Dolinský wrote:
>>>
>>>  Hello Andreas,
>>>
>>> Thanks for the tip. I'll check it out. Thumbs up for the 0.4!
>>>
>>> Jan
>>>
>>>  On 09.09.2014 17:04, Andreas Noack wrote:
>>>  
>>> If you need the speed now, you can try one of the packages ArrayViews or 
>>> ArrayViewsAPL. We are trying to include functionality similar to these 
>>> packages in base.
>>>
>>>  Kind regards
>>>
>>> Andreas Noack
>>>  
>>> 2014-09-09 9:38 GMT-04:00 Ján Dolinský:
>>>
>>>>  OK, so basically there is nothing wrong with the syntax X[:,1001:end]?
>>>>  d = sumabs2(X[:,1001:end], 1);
>>>>  and I should just wait until v0.4 is available (perhaps available 
>>>> soon in the Julia Nightlies PPA).
>>>>
>>>> I did the benchmark with the floating point power function based on 
>>>> Simon's comment. Here are my results (after a couple of repeated 
>>>> runs):
>>>>  @time X.^2;
>>>> elapsed time: 0.511988142 seconds (392000256 bytes allocated, 2.52% gc time)
>>>> @time X.^2.0;
>>>> elapsed time: 0.411791612 seconds (392000256 bytes allocated, 3.12% gc time)
>>>>  
>>>> Thanks, 
>>>> Jan Dolinsky
>>>>
>>>>   On 09.09.2014 14:06, Andreas Noack wrote:
>>>>  
>>>> The problem is that right now X[:,1001:end] makes a copy of the array. 
>>>> However, in 0.4 this will instead be a view of the original matrix and 
>>>> therefore the computing time should be almost the same. 
>>>>
>>>>  It might also be worth repeating Simon's comment that the floating 
>>>> point power function has special handling of 2. The result is that
>>>>
>>>> julia> @time A.^2;
>>>> elapsed time: 1.402791357 seconds (200000256 bytes allocated, 5.90% gc time)
>>>>
>>>> julia> @time A.^2.0;
>>>> elapsed time: 0.554241105 seconds (200000256 bytes allocated, 15.04% gc time)
>>>>
>>>>  I tend to agree with Simon that special casing of integer 2 would be 
>>>> reasonable.
>>>>  
>>>>  Kind regards
>>>>
>>>> Andreas Noack
>>>>  
>>>> 2014-09-09 4:24 GMT-04:00 Ján Dolinský:
>>>>
>>>>> Hello guys,
>>>>>
>>>>> Thanks a lot for the lengthy discussions. They helped me a lot to get a 
>>>>> feeling for what Julia is like. I did some more performance comparisons as 
>>>>> suggested by the first two posts (thanks a lot for the tips). In the 
>>>>> meantime I upgraded to v0.3.
>>>>>  X = rand(7000,7000);
>>>>> @time d = sum(X.^2, 1);
>>>>> elapsed time: 0.573125833 seconds (392056672 bytes allocated, 2.25% gc time)
>>>>> @time d = sum(X.*X, 1);
>>>>> elapsed time: 0.178715901 seconds (392057080 bytes allocated, 14.06% gc time)
>>>>> @time d = sumabs2(X, 1);
>>>>> elapsed time: 0.067431808 seconds (56496 bytes allocated)
>>>>>  
>>>>> In Octave then
>>>>>  X = rand(7000);
>>>>> tic; d = sum(X.^2); toc;
>>>>> Elapsed time is 0.167578 seconds.
>>>>>  
>>>>> So the ultimate solution is the sumabs2 function, which is a blast. I 
>>>>> am coming from Matlab/Octave and I would expect X.^2 to be fast "out of 
>>>>> the box", but nevertheless if I can get excellent performance by 
>>>>> learning some new paradigms I will go for it.
>>>>>
>>>>> The above tests lead me to another question. I often need to calculate 
>>>>> the "self" dot product over a portion of a matrix, e.g.
>>>>>  @time d = sumabs2(X[:,1001:end], 1);
>>>>> elapsed time: 0.175333366 seconds (336048688 bytes allocated, 7.01% gc time)
>>>>>  
>>>>> Apparently this is not the way to do it in Julia, because working on a 
>>>>> smaller matrix of 7000x6000 more than doubles the computing time and 
>>>>> furthermore it seems to allocate unnecessary memory.
>>>>>
>>>>> Best Regards,
>>>>> Jan
>>>>>
>>>>>
>>>>>
>>>>> On Monday, 8 September 2014 10:36:02 UTC+2, Ján Dolinský wrote:
>>>>>  
>>>>>> Hello,
>>>>>>
>>>>>> I am a new Julia user. I am trying to write a function for computing 
>>>>>> the "self" dot product of all columns in a matrix, i.e. squaring 
>>>>>> each element of a matrix and computing a column-wise sum. I am 
>>>>>> interested in a proper way of doing it because I often need to 
>>>>>> process large matrices.
>>>>>>
>>>>>> I first focused on calculating the squares. For testing purposes 
>>>>>> I use a matrix of random floats of size 7000x7000. All timings here were 
>>>>>> taken after several repeated runs.
>>>>>>
>>>>>> I used to do it in Octave (v3.8.1) as follows:
>>>>>>  tic; X = rand(7000); toc;
>>>>>> Elapsed time is 0.579093 seconds.
>>>>>> tic; XX = X.^2; toc;
>>>>>> Elapsed time is 0.114737 seconds.
>>>>>>  
>>>>>>
>>>>>> I tried to do the same in Julia (v0.2.1):
>>>>>>  @time X = rand(7000,7000);
>>>>>> elapsed time: 0.114418731 seconds (392000128 bytes allocated)
>>>>>> @time XX = X.^2;
>>>>>> elapsed time: 0.369641268 seconds (392000224 bytes allocated)
>>>>>>  
>>>>>> I was surprised to see that Julia is about 3 times slower when 
>>>>>> calculating a square than my original routine in Octave. I then read 
>>>>>> "Performance tips" and found out that one should use * instead of 
>>>>>> raising to small integer powers, for example x*x*x instead of x^3. I 
>>>>>> therefore tested the following.
>>>>>>  @time XX = X.*X;
>>>>>> elapsed time: 0.146059577 seconds (392000968 bytes allocated)
>>>>>>  
>>>>>> This approach indeed resulted in a much shorter computing time. It is, 
>>>>>> however, still a little slower than my code in Octave. Can someone 
>>>>>> suggest any performance tips?
>>>>>>
>>>>>> I then finally do a sum over all columns of XX to get the "self" dot 
>>>>>> product but first I'd like to fix the squaring part.
>>>>>>
>>>>>> Thanks a lot. 
>>>>>> Best Regards,
>>>>>> Jan 
>>>>>>
>>>>>> P.S. A while ago I found in the Julia manual an example of using the 
>>>>>> @vectorize macro with a squaring function, but I cannot find it any more. 
>>>>>> Perhaps the name of the macro was different ... 
>>>>>>   
>>>>>>  
>>>>>    
>>>>  
>>>>    
>>>  
>>>  
>
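
The copy-versus-view point discussed in the thread can be sketched as follows. This is a minimal illustration that assumes a Julia 1.x release rather than the v0.2/v0.3 versions used above: in those releases sum(abs2, X, dims=1) takes the place of sumabs2(X, 1), and @view creates a non-copying slice. The matrix size and the variable names are illustrative choices, not taken from the thread.

```julia
# Sketch for Julia 1.x (an assumption; the thread itself uses v0.2/v0.3 syntax).
X = rand(700, 700)                 # smaller than the 7000x7000 matrix in the thread

# Column-wise sum of squares over the whole matrix.
# sum(abs2, X, dims=1) is the Julia 1.x spelling of sumabs2(X, 1).
d_full = sum(abs2, X, dims=1)

# Indexing with a range copies the selected block before reducing it.
d_copy = sum(abs2, X[:, 101:end], dims=1)

# A view refers to the same memory as X, so the large block is not copied.
Xv = @view X[:, 101:end]
d_view = sum(abs2, Xv, dims=1)
```

Running each reduction under @time after a warm-up call is one way to compare the allocations of the copying and non-copying variants; no timings are claimed here.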
