Maybe we need to undeprecate the order keyword until we can make function
arguments faster. When I deprecated that, I figured we'd have that soon,
but it hasn't happened yet. In this case, I wonder if the comparison
function is getting inlined in C++ but not in Julia. The function call
overhead could make a 2x difference like you're seeing.


On Fri, Jul 4, 2014 at 3:29 PM, gentlebeldin <[email protected]>
wrote:

> You're right, that's very concise, and Julia inlined it, graciously. So it
> works, sort of:
>
> julia> @time e461(10000)
> WARNING: the `order` keyword is deprecated, use `lt`, `by` and `rev`
> instead.
> elapsed time: 66.65448558 seconds (1736533296 bytes allocated)
>
>
> But then, the warning is not pretty, and the C++11 equivalent runs in 26
> seconds on the same computer.
>
> Am Freitag, 4. Juli 2014 16:20:46 UTC+2 schrieb Stefan Karpinski:
>>
>> If you're indexing into an external array in Julia without any additional
>> annotation, there's quite a bit more going on than in C++ – C++ does array
>> accesses without any regards for safety. First, make sure that the array
>> that you're indexing into is a constant – if it isn't Julia has no chance
>> of generating decent code since it has no idea what the type of the array
>> or its elements are, or where it is in memory. Next, you'll want to turn of
>> bounds checking of the array with @inbounds. Of course, you'll want to be
>> *very* sure that you're not going out of bounds when you do this or you'll
>> get segfaults just like you would in C/C++. For example, if I do this:
>>
>> const f = [1,2,3]
>> sf(i) = @inbounds return f[i>>16+1] + f[i&65535+1]
>> lt(i,j) = sf(i) < sf(j)
>>
>> Then the native code for lt(::Int,::Int) is
>>
>> julia> @code_native lt(1,2)
>> .section __TEXT,__text,regular,pure_instructions
>> Filename: none
>> Source line: 1
>> push RBP
>> mov RBP, RSP
>> Source line: 1
>> movzx ECX, SI
>> movabs RAX, 140305558344800
>>  mov RAX, QWORD PTR [RAX]
>> mov RAX, QWORD PTR [RAX + 8]
>>  mov RCX, QWORD PTR [RAX + 8*RCX]
>> sar RSI, 16
>>  add RCX, QWORD PTR [RAX + 8*RSI]
>> movzx EDX, DI
>>  mov RDX, QWORD PTR [RAX + 8*RDX]
>> sar RDI, 16
>>  add RDX, QWORD PTR [RAX + 8*RDI]
>> cmp RDX, RCX
>>  setl AL
>> pop RBP
>>  ret
>>
>>
>> That's quite respectable. If I leave out the const or the @inbounds, then
>> it's not nearly so neat and tidy.
>>
>>
>> On Fri, Jul 4, 2014 at 9:51 AM, gentlebeldin <[email protected]>
>> wrote:
>>
>>> Unfortunately, my happiness was a bit premature. With a simple function
>>> like f(x)=x or f(x)=sin(x), it works, that's inlined, as it seems. If the
>>> function is more complicated, it isn't, and we're back at many seconds, and
>>> memory allocation of gigabytes for sorting an array of 1000000 elements. :-(
>>> In C++11, that works:
>>>     vector<int> arr(S);
>>>     for(int i = 0, k = 0; k <= M; k++)
>>>         for(int l = k; l <= M; l++, i++)
>>>             arr[i] = (k << 16) + l;
>>>
>>>     sort (arr.begin(), arr.end(), [](int i, int j)
>>>     {
>>>         return (sf(i) < sf(j));
>>>     });
>>>
>>> Here, sf is a more complex function, sf(i)=f[i>>16+1]+f[i&65535+1],
>>> where the array f is precomputed
>>> The syntax of a lambda expression (since C++11) is rather strange, but
>>> the implementation seems to be efficient, it's actually much faster than
>>> defining the lt function outside.
>>> As it seems, I won't be able port that to Julia.
>>>
>>> Am Freitag, 4. Juli 2014 15:22:08 UTC+2 schrieb Stefan Karpinski:
>>>>
>>>> Yes, exactly. Sorry I was not especially helpful yesterday – was on my
>>>> phone. But you seem to have figured it out. Sorry also that this is not the
>>>> most elegant interface. Obviously the longer term is to make passing
>>>> functions as arguments more efficient.
>>>>
>>>>
>>>> On Fri, Jul 4, 2014 at 5:36 AM, gentlebeldin <[email protected]>
>>>> wrote:
>>>>
>>>>> I guess you mean something like this:
>>>>>
>>>>> julia> using Base.Order
>>>>>
>>>>> julia> import Base.lt
>>>>>
>>>>> julia> type MyNewOrdering<:Ordering
>>>>>        end
>>>>>
>>>>> julia> f(x::Float64)=x
>>>>> f (generic function with 1 method)
>>>>>
>>>>>
>>>>> julia> lt(o::MyNewOrdering,x::Float64,y::Float64)=f(x)<f(y)
>>>>> lt (generic function with 9 methods)
>>>>>
>>>>>
>>>>> julia> a=rand(1000000)
>>>>> 1000000-element Array{Float64,1}:
>>>>> ...
>>>>>
>>>>> julia> @time sort(a,order=MyNewOrdering())
>>>>> elapsed time: 0.119845282 seconds (8000496 bytes allocated)
>>>>> 1000000-element Array{Float64,1}:
>>>>> ...
>>>>>
>>>>>
>>>>> Yes, that's better, both time and memory allocation. Thank you very
>>>>> much! :-)
>>>>>
>>>>> Am Donnerstag, 3. Juli 2014 21:14:05 UTC+2 schrieb Stefan Karpinski:
>>>>>
>>>>>> You can also define your own subtype of Order. See base/order.jl.
>>>>>>
>>>>>>
>>>>
>>

Reply via email to