Maybe we need to undeprecate the order keyword until we can make function arguments faster. When I deprecated that, I figured we'd have that soon, but it hasn't happened yet. In this case, I wonder if the comparison function is getting inlined in C++ but not in Julia. The function call overhead could make a 2x difference like you're seeing.
On Fri, Jul 4, 2014 at 3:29 PM, gentlebeldin <[email protected]> wrote: > You're right, that's very concise, and Julia inlined it, graciously. So it > works, sort of: > > julia> @time e461(10000) > WARNING: the `order` keyword is deprecated, use `lt`, `by` and `rev` > instead. > elapsed time: 66.65448558 seconds (1736533296 bytes allocated) > > > But then, the warning is not pretty, and the C++11 equivalent runs in 26 > seconds on the same computer. > > Am Freitag, 4. Juli 2014 16:20:46 UTC+2 schrieb Stefan Karpinski: >> >> If you're indexing into an external array in Julia without any additional >> annotation, there's quite a bit more going on than in C++ – C++ does array >> accesses without any regards for safety. First, make sure that the array >> that you're indexing into is a constant – if it isn't Julia has no chance >> of generating decent code since it has no idea what the type of the array >> or its elements are, or where it is in memory. Next, you'll want to turn of >> bounds checking of the array with @inbounds. Of course, you'll want to be >> *very* sure that you're not going out of bounds when you do this or you'll >> get segfaults just like you would in C/C++. For example, if I do this: >> >> const f = [1,2,3] >> sf(i) = @inbounds return f[i>>16+1] + f[i&65535+1] >> lt(i,j) = sf(i) < sf(j) >> >> Then the native code for lt(::Int,::Int) is >> >> julia> @code_native lt(1,2) >> .section __TEXT,__text,regular,pure_instructions >> Filename: none >> Source line: 1 >> push RBP >> mov RBP, RSP >> Source line: 1 >> movzx ECX, SI >> movabs RAX, 140305558344800 >> mov RAX, QWORD PTR [RAX] >> mov RAX, QWORD PTR [RAX + 8] >> mov RCX, QWORD PTR [RAX + 8*RCX] >> sar RSI, 16 >> add RCX, QWORD PTR [RAX + 8*RSI] >> movzx EDX, DI >> mov RDX, QWORD PTR [RAX + 8*RDX] >> sar RDI, 16 >> add RDX, QWORD PTR [RAX + 8*RDI] >> cmp RDX, RCX >> setl AL >> pop RBP >> ret >> >> >> That's quite respectable. If I leave out the const or the @inbounds, then >> it's not nearly so neat and tidy. >> >> >> On Fri, Jul 4, 2014 at 9:51 AM, gentlebeldin <[email protected]> >> wrote: >> >>> Unfortunately, my happiness was a bit premature. With a simple function >>> like f(x)=x or f(x)=sin(x), it works, that's inlined, as it seems. If the >>> function is more complicated, it isn't, and we're back at many seconds, and >>> memory allocation of gigabytes for sorting an array of 1000000 elements. :-( >>> In C++11, that works: >>> vector<int> arr(S); >>> for(int i = 0, k = 0; k <= M; k++) >>> for(int l = k; l <= M; l++, i++) >>> arr[i] = (k << 16) + l; >>> >>> sort (arr.begin(), arr.end(), [](int i, int j) >>> { >>> return (sf(i) < sf(j)); >>> }); >>> >>> Here, sf is a more complex function, sf(i)=f[i>>16+1]+f[i&65535+1], >>> where the array f is precomputed >>> The syntax of a lambda expression (since C++11) is rather strange, but >>> the implementation seems to be efficient, it's actually much faster than >>> defining the lt function outside. >>> As it seems, I won't be able port that to Julia. >>> >>> Am Freitag, 4. Juli 2014 15:22:08 UTC+2 schrieb Stefan Karpinski: >>>> >>>> Yes, exactly. Sorry I was not especially helpful yesterday – was on my >>>> phone. But you seem to have figured it out. Sorry also that this is not the >>>> most elegant interface. Obviously the longer term is to make passing >>>> functions as arguments more efficient. >>>> >>>> >>>> On Fri, Jul 4, 2014 at 5:36 AM, gentlebeldin <[email protected]> >>>> wrote: >>>> >>>>> I guess you mean something like this: >>>>> >>>>> julia> using Base.Order >>>>> >>>>> julia> import Base.lt >>>>> >>>>> julia> type MyNewOrdering<:Ordering >>>>> end >>>>> >>>>> julia> f(x::Float64)=x >>>>> f (generic function with 1 method) >>>>> >>>>> >>>>> julia> lt(o::MyNewOrdering,x::Float64,y::Float64)=f(x)<f(y) >>>>> lt (generic function with 9 methods) >>>>> >>>>> >>>>> julia> a=rand(1000000) >>>>> 1000000-element Array{Float64,1}: >>>>> ... >>>>> >>>>> julia> @time sort(a,order=MyNewOrdering()) >>>>> elapsed time: 0.119845282 seconds (8000496 bytes allocated) >>>>> 1000000-element Array{Float64,1}: >>>>> ... >>>>> >>>>> >>>>> Yes, that's better, both time and memory allocation. Thank you very >>>>> much! :-) >>>>> >>>>> Am Donnerstag, 3. Juli 2014 21:14:05 UTC+2 schrieb Stefan Karpinski: >>>>> >>>>>> You can also define your own subtype of Order. See base/order.jl. >>>>>> >>>>>> >>>> >>
