The reason for the improvement when you fixed the truncations is that 
indexing with a float is deprecated, and calling a deprecated method is very 
slow. For good performance it is therefore important not to call any 
deprecated method repeatedly, especially inside a loop.
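
As a rough illustration of the difference (a minimal sketch; mxnx = 301 is a 
made-up value, not taken from your code, and div is just one alternative you 
could consider):

```julia
mxnx = 301                       # hypothetical grid size, for illustration only

x_float = trunc(mxnx / 2)        # 150.0, still a Float64
x_int   = trunc(Int, mxnx / 2)   # 150, an Int that can be used as an index
x_div   = div(mxnx, 2)           # 150, integer division never creates a float

a = zeros(mxnx)
a[x_int] = 1.0                   # indexing with an Int is the fast, supported path
# a[x_float] is the deprecated form; on recent Julia versions it is an error
```

Checking typeof(sxloc) and typeof(szloc) at the REPL is a quick way to see 
whether a value is still a float.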

@inbounds is good if you have a very tight loop with many vector loads 
(like in your example). Remember that @inbounds turns off bounds checking, 
so if you mess up an index you can segfault or silently get garbage data.
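
A minimal sketch of how this can be used with less risk, assuming a simple 1D 
stencil (laplacian! is a hypothetical helper, not a function from your code): 
check the sizes once outside the loop, then put @inbounds on the loop itself.

```julia
# Hypothetical 1D second-difference stencil, for illustration only.
function laplacian!(out::Vector{Float64}, u::Vector{Float64})
    n = length(u)
    # Validate sizes once, so every index used in the loop below is in range.
    length(out) == n || throw(DimensionMismatch("out and u must have the same length"))
    @inbounds for i in 2:n-1
        out[i] = u[i-1] - 2.0 * u[i] + u[i+1]
    end
    return out
end

u = [1.0, 4.0, 9.0, 16.0]
out = laplacian!(zeros(4), u)    # the two end points are left untouched
```

Because the loop range 2:n-1 is derived from the checked length, the indices 
i-1, i and i+1 stay inside both arrays.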

For parallelization I would look at shared arrays: 
http://julia.readthedocs.org/en/latest/manual/parallel-computing/#shared-arrays
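
A minimal sketch of the idea, assuming a recent Julia (1.x), where SharedArrays 
and Distributed are standard libraries; the syntax in the 0.4 manual linked 
above differs. The worker count and array size are arbitrary choices for 
illustration.

```julia
using Distributed
addprocs(2)                       # assumption: two local worker processes
@everywhere using SharedArrays

a = SharedArray{Float64}(10)      # memory visible to all local processes

# Each worker fills its own chunk of the index range; @sync waits for all.
@sync @distributed for i in 1:10
    a[i] = i^2
end

result = Array(a)                 # copy back into an ordinary array
rmprocs(workers())                # shut the workers down again
```

This only pays off when the work per element is large compared with the cost 
of starting and coordinating the workers.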

On Sunday, October 11, 2015 at 11:28:08 PM UTC+2, Alain Clo wrote:
>
> Pretty good improvements, thanks to you, Kristoffer.
> I am puzzled why the changes to the truncations for sxloc and szloc bring 
> a factor of 3 to the whole program, and to the loop underneath. Maybe the 
> truncation changed the index type for the whole array, right?
> How did you figure out this effect?
>
> Regarding @inbounds and @fastmath, I have the same question, but I now see 
> them in the performance tips.
> Maybe, as a general rule, those options should always be checked.
>
> Would you have any suggestions for parallelisation?
>
> Thanks for the feedback
> Alain
>
>
>
>
> On Sunday, October 11, 2015 at 11:48:55 PM UTC+3, Kristoffer Carlsson wrote:
>>
>> Some unsolicited comments on the code.
>>
>> You probably want to change lines 118 and 119 to
>>
>> sxloc = trunc(Int, mxnx / 2)
>> szloc = trunc(Int, mxnz / 2)
>>
>> so that they really are ints. Without the "Int" the result is still a float.
>>
>> As a baseline, your code ran in 1.239 seconds for me.
>>
>> Adding "@inbounds" on line 127 takes this down to 0.75 seconds.
>>
>> Adding "@fastmath" on the same line reduces it further, down to 0.52 
>> seconds. The normal caveats for fastmath of course apply.
>>
>> // Kristoffer
>>
>>
>>
>> On Sunday, October 11, 2015 at 9:32:04 PM UTC+2, Alain Clo wrote:
>>>
>>> See the attached updated code (slight modifications: the code is now 
>>> embedded in a function acoustic(), and some loops are interchanged).
>>> To see some images, uncomment the code at the end.
>>> Enjoy !
>>> Alain
>>> On Sunday, October 11, 2015 at 9:30:43 PM UTC+3, Kristoffer Carlsson wrote:
>>>>
>>>> Maybe you can give a link to the updated code?
>>>
>>>
