[julia-users] Type stability of function in composite types
I am failing to understand why the following code produces a type instability (caveat: the code is a reduction of larger and more complex code, but the features are the same).

```
type Foo
    f::Function
    y::Array{Float64, 1}
    x::Array{Float64, 2}
end

type Bar{T}
    b::T
end

type A{T}
    a::T
end

x = randn(100, 4)
y = randn(100)
f(theta, y, x) = x'*(y - x*theta)
(g::Foo)(x) = g.f(x, g.y, g.x)

a = A(Bar(Foo(f, y, x)))
m(a::A) = a.a.b([.1, .1, .1, .1])
```

The output of @code_warntype is below:

```
@code_warntype m(a)
Variables:
  #self#::#m
  a::A{Bar{Foo}}

Body:
  begin
      SSAValue(1) = (Core.getfield)((Core.getfield)(a::A{Bar{Foo}},:a)::Bar{Foo},:b)::Foo
      SSAValue(0) = $(Expr(:invoke, LambdaInfo for vect(::Float64, ::Vararg{Float64,N}), :(Base.vect), 0.1, 0.1, 0.1, 0.1))
      return ((Core.getfield)(SSAValue(1),:f)::F)(SSAValue(0),(Core.getfield)(SSAValue(1),:y)::Array{Float64,1},(Core.getfield)(SSAValue(1),:x)::Array{Float64,2})::Any
  end::Any
```

I thought v0.5 would solve the problem with functions not being correctly inferred. Maybe I am simply doing something patently stupid.
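A likely cause, hedged: the field `f::Function` is abstract (in 0.5 every function has its own concrete type, and `Function` is only their abstract supertype), so the compiler cannot infer the result of calling `g.f`. A minimal sketch of the usual fix, with illustrative names (`Foo2`, `m2` are not from the original post), is to make the function's type a parameter of the struct:

```julia
# Sketch, assuming 0.5 syntax: parameterize Foo on the concrete type of f
# so the call through the field is inferable.
type Foo2{F}
    f::F                      # concrete function type, not ::Function
    y::Array{Float64, 1}
    x::Array{Float64, 2}
end

(g::Foo2)(theta) = g.f(theta, g.y, g.x)

x = randn(100, 4)
y = randn(100)
f(theta, y, x) = x'*(y - x*theta)

m2(a) = a([.1, .1, .1, .1])
m2(Foo2(f, y, x))             # @code_warntype should now infer the result type
```

With `Foo2{typeof(f)}` as the field's type, the `::Any` annotations in the `@code_warntype` output should disappear.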
[julia-users] Trick to use plotlyjs in Atom
I think that in recent versions of Atom the plot shows in the plot pane without the need for tricks.
[julia-users] Re: Julia and the Tower of Babel
JuliaPraxis seems like a good idea. I have been struggling to get consistent naming, and having a guide to follow may at least cut the struggling time short.
Re: [julia-users] Re: chol() more strict in v0.5?
Me too! I have been trying to update a package to `v0.5` and I do not really see a clean way to support both 0.4 and 0.5 without an entry like this in Compat.jl.

On Sunday, August 7, 2016 at 10:02:44 PM UTC+2, Andreas Noack wrote:
>
> It would be great with an entry for this in Compat.jl, e.g. something like
>
> cholfact(A::HermOrSym, args...) = cholfact(A.data, A.uplo, args...)
>
> On Sun, Aug 7, 2016 at 2:44 PM, Chris <7hunde...@gmail.com> wrote:
>
>> mmh, could you explain your comment a little more?
>>
>> David, thanks for the tip.
[julia-users] Re: Tips for optimizing this short code snippet
As Eric pointed out, in Julia 0.4.x functions passed as arguments are not optimized, since their type is difficult to infer. That's why the profiler shows jl_generic_function as the bottleneck. Try it with 0.5 and things could get dramatically faster.
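For readers stuck on 0.4, a hedged sketch of the usual workaround (the names here are illustrative, not from the thread): wrap the operation in its own type and overload `call`, so dispatch sees a concrete type instead of `::Function`.

```julia
# Illustrative 0.4-era "functor" pattern: a dedicated type has a concrete
# type, so calls through it can be inferred and inlined.
immutable SqKernel end
Base.call(::SqKernel, x) = x * x

function apply_n(op, x, n)
    s = zero(x)
    for i in 1:n
        s += op(x)        # op::SqKernel is concrete; this call is fast
    end
    s
end

apply_n(SqKernel(), 2.0, 3)   # 12.0
```

In 0.5 this pattern becomes unnecessary for inference, since each function already has its own concrete type.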
[julia-users] Re: getting a warning when overloading +(a,b)
The fix is to import `+` from Base before adding methods to it:

```
import Base.+

type numerr
    num
    err
end

+(a::numerr, b::numerr) = numerr(a.num + b.num, sqrt(a.err^2 + b.err^2));
+(a::Any, b::numerr) = numerr(a + b.num, b.err);
+(a::numerr, b::Any) = numerr(a.num + b, a.err);

x = numerr(10, 1);
y = numerr(20, 2);

println(x+y)
println(2+x)
println(y+2)
```

On Friday, April 1, 2016 at 4:51:53 PM UTC+2, Gerson J. Ferreira wrote:
>
> I'm trying to overload simple math operators to try a code for error
> propagation, but I'm getting a warning. Here's a short code that already
> shows the warning message:
>
> type numerr
>     num
>     err
> end
>
> +(a::numerr, b::numerr) = numerr(a.num + b.num, sqrt(a.err^2 + b.err^2));
> +(a::Any, b::numerr) = numerr(a + b.num, b.err);
> +(a::numerr, b::Any) = numerr(a.num + b, a.err);
>
> x = numerr(10, 1);
> y = numerr(20, 2);
>
> println(x+y)
> println(2+x)
> println(y+2)
>
> I didn't see much about operator overloading in Julia's manual. I would
> really appreciate it if someone could point me in the right direction.
>
> The code above returns this warning in Julia 0.4.2:
>
> julia> include("overload.jl")
> WARNING: module Main should explicitly import + from Base
> numerr(30,2.23606797749979)
> numerr(12,1)
> numerr(22,2)
Re: [julia-users] Do I have simd?
I am pretty sure it must be something specific to your installation. On my machine

```
Darwin Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; RELEASE_X86_64 x86_64
```

running the code, I get the following timings:

```
julia> timeit(1000,1000)
GFlop        = 2.4503017546610866
GFlop (SIMD) = 11.622906423980382
```

On Friday, November 6, 2015 at 12:15:47 AM UTC+1, DNF wrote:
>
> I install using homebrew from here:
> https://github.com/staticfloat/homebrew-julia
>
> I have limited understanding of the process, but believe there is some
> compilation involved.
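The benchmark itself is not quoted in this digest; a minimal loop in the same spirit (hypothetical function names, not the thread's actual code) that lets you check whether `@simd` pays off on a given installation:

```julia
# Hypothetical stand-in for the thread's benchmark: sum a vector with and
# without @simd; on a working installation the @simd version is typically
# noticeably faster for this reduction.
function mysum(a)
    s = 0.0
    @inbounds for i in 1:length(a)
        s += a[i]
    end
    s
end

function mysum_simd(a)
    s = 0.0
    @inbounds @simd for i in 1:length(a)
        s += a[i]
    end
    s
end

a = rand(10^7)
mysum(a); mysum_simd(a)        # warm up / compile
@time mysum(a)
@time mysum_simd(a)
```

Inspecting `@code_llvm mysum_simd(a)` for `<4 x double>` vector instructions is another way to confirm SIMD is actually being used.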
Re: [julia-users] Number of loops not known at compile time
@erik thank you. I know of Base.Cartesian, but I was not able to make it work for my use case. @steven yes, that is what I was afraid of (I was trying to shy away from recursion).

On Monday, September 28, 2015 at 8:23:48 PM UTC+2, Erik Schnetter wrote:
>
> Giuseppe
>
> In addition to using recursion, you can also use a macro to generate the
> code.
>
> However, this functionality is available in the "Cartesian" module which
> is part of Base Julia. You are probably looking for "@nloops".
>
> -erik
>
> On Mon, Sep 28, 2015 at 11:24 AM, Giuseppe Ragusa <giusepp...@gmail.com> wrote:
>
>> I am having problems (serious problems!) dealing with algorithms that
>> boil down to nested for loops but whose number of loops is not known at
>> compile time. As an example consider this:
>>
>> ```
>> function tmp()
>>     k = 3
>>     r = 3
>>     n = 100
>>     λ = zeros(r)
>>     u = rand(n, r)
>>     counter = 0
>>     Y = Array(Float64, n)
>>
>>     @inbounds for j_1 = 0:k
>>         for i = 1:n
>>             Y[i] = u[i, 1]^j_1
>>         end
>>         λ[1] = j_1
>>         for j_2 = 0:(k-j_1)
>>             λ[2] = j_2
>>             for i = 1:n
>>                 Y[i] *= u[i, 2]^j_2
>>             end
>>             for j_3 = 0:(k-j_1-j_2)
>>                 λ[3] = j_3
>>                 for i = 1:n
>>                     Y[i] *= u[i, 3]^j_3
>>                 end
>>                 counter += 1
>>                 println(λ, " => ", j_1, j_2, j_3, " counter => ", counter)
>>             end
>>         end
>>     end
>> end
>> ```
>>
>> This is what I want when `r = 3`. For larger `r = 4` there would be an
>> additional loop, etc. Now, everything is complicated by the fact that `r`
>> is not known at compile time.
>>
>> I have tried to generate the loops using the macros in Base.Cartesian
>> and then using generated functions to accomplish what I want, but I cannot
>> get what I want. The main difficulty is that with Base.Cartesian I can't
>> get the ranges I want.
>>
>> Does anybody have suggestions on how to deal with this?
>
> --
> Erik Schnetter <schn...@gmail.com>
> http://www.perimeterinstitute.ca/personal/eschnetter/
[julia-users] Number of loops not known at compile time
I am having problems (serious problems!) dealing with algorithms that boil down to nested for loops but whose number of loops is not known at compile time. As an example consider this:

```
function tmp()
    k = 3
    r = 3
    n = 100
    λ = zeros(r)
    u = rand(n, r)
    counter = 0
    Y = Array(Float64, n)

    @inbounds for j_1 = 0:k
        for i = 1:n
            Y[i] = u[i, 1]^j_1
        end
        λ[1] = j_1
        for j_2 = 0:(k-j_1)
            λ[2] = j_2
            for i = 1:n
                Y[i] *= u[i, 2]^j_2
            end
            for j_3 = 0:(k-j_1-j_2)
                λ[3] = j_3
                for i = 1:n
                    Y[i] *= u[i, 3]^j_3
                end
                counter += 1
                println(λ, " => ", j_1, j_2, j_3, " counter => ", counter)
            end
        end
    end
end
```

This is what I want when `r = 3`. For larger `r = 4` there would be an additional loop, etc. Now, everything is complicated by the fact that `r` is not known at compile time.

I have tried to generate the loops using the macros in Base.Cartesian and then using generated functions to accomplish what I want, but I cannot get what I want. The main difficulty is that with Base.Cartesian I can't get the ranges I want.

Does anybody have suggestions on how to deal with this?
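Since the depth `r` is only known at run time, one option is to replace the literal loop nest with recursion over the loop level. A hedged sketch with illustrative names (not code from the thread), which enumerates the same exponent vectors λ with sum(λ) <= k as the nest above:

```julia
# Visit every exponent vector λ of length r with sum(λ) <= k, mimicking
# the nested loops above for an r chosen at run time.
function visit_exponents(f::Function, k::Int, r::Int,
                         λ::Vector{Int} = zeros(Int, r), depth::Int = 1)
    if depth > r
        f(λ)                             # one admissible exponent vector
        return
    end
    for j in 0:(k - sum(λ[1:depth-1]))   # same range as 0:(k-j_1-...-j_{d-1})
        λ[depth] = j
        visit_exponents(f, k, r, λ, depth + 1)
    end
    λ[depth] = 0                         # reset before unwinding
end

# usage: count the vectors for k = 3, r = 3 (matches `counter` above)
counter = Ref(0)
visit_exponents(3, 3) do λ
    counter[] += 1
end
counter[]   # 20
```

The per-row products `Y[i] *= u[i, d]^j_d` from the original code would go inside the callback (or be accumulated level by level before each recursive call); the sketch only shows the loop structure.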
Re: [julia-users] Writing Type-Stable Code in Julia - type stable version not that fast in 0.4
The issue is also present for 0.3.3 --- the oldest version I have installed. This is the LLVM IR code generated:

```
julia> code_llvm(sumofsins2, (Int,))

define double @julia_sumofsins2_19930(i64) {
top:
  %1 = icmp sgt i64 %0, 0, !dbg !848
  br i1 %1, label %L, label %L3, !dbg !848

L:                                                ; preds = %top, %pass
  %r.0 = phi double [ %6, %pass ], [ 0.000000e+00, %top ]
  %"#s119.0" = phi i64 [ %5, %pass ], [ 1, %top ]
  %2 = call double inttoptr (i64 4707812848 to double (double)*)(double 3.400000e+00), !dbg !849
  %3 = fcmp ord double %2, 0.000000e+00, !dbg !849
  br i1 %3, label %pass, label %fail, !dbg !849

fail:                                             ; preds = %L
  %4 = load %jl_value_t** @jl_domain_exception, align 8, !dbg !849, !tbaa %jtbaa_const
  call void @jl_throw_with_superfluous_argument(%jl_value_t* %4, i32 4), !dbg !849
  unreachable, !dbg !849

pass:                                             ; preds = %L
  %5 = add i64 %"#s119.0", 1, !dbg !848
  %6 = fadd double %r.0, %2, !dbg !849
  %7 = icmp eq i64 %"#s119.0", %0, !dbg !849
  br i1 %7, label %L3, label %L, !dbg !849

L3:                                               ; preds = %pass, %top
  %r.1 = phi double [ 0.000000e+00, %top ], [ %6, %pass ]
  ret double %r.1, !dbg !852
}
```

On Tuesday, January 27, 2015 at 8:46:59 AM UTC+1, Jason Merrill wrote:
>
> Looking at code_llvm(sumofsins2, (Int,)), it looks like sin(3.4) used to
> be constant folded [1], but is now recomputed each time through the loop.
>
> [1] That's the 0xBFD05AC910FF4C6C in the LLVM code in John's post:
> reinterpret(Float64, 0xBFD05AC910FF4C6C) == -0.2555411020268312 == sin(3.4)
>
> On Monday, January 26, 2015 at 11:23:36 PM UTC-8, Mauro wrote:
>
>> This is a performance regression, also for 0.3.5.
My timings for 0.3.5:

```
julia> @time [sumofsins1(100_000) for i in 1:100];
elapsed time: 0.446675737 seconds (320109932 bytes allocated, 21.32% gc time)

julia> @time [sumofsins2(100_000) for i in 1:100];
elapsed time: 0.115537618 seconds (896 bytes allocated)
```

but for 0.2.1:

```
julia> @time [sumofsins1(100_000) for i in 1:100];
elapsed time: 0.347052858 seconds (320072020 bytes allocated)

julia> @time [sumofsins2(100_000) for i in 1:100];
elapsed time: 0.008610216 seconds (896 bytes allocated)
```

Can you check whether an issue for this has been filed and, if you can't find one, file one?

On Tue, 2015-01-27 at 07:36, Kuba Roth <kuba...@gmail.com> wrote:
>
> Hi there,
> Recently I ran into this very interesting post:
> http://www.johnmyleswhite.com/notebook/2013/12/06/writing-type-stable-code-in-julia/
>
> Surprisingly, when I tested both examples against the latest 0.4 build,
> the type-stable version is only 2-3 times faster than the unstable one.
> I wonder what is the source of such a huge disparity and what version of
> Julia was used?
>
> My timings:
> unstable: 0.425013212 seconds (305 MB allocated, 7.56% gc time in 14 pauses with 0 full sweep)
> stable: 0.14287404 seconds (896 bytes allocated)
>
> John's:
> unstable: 0.412261722 seconds (320002496 bytes allocated)
> stable: 0.008509995 seconds (896 bytes allocated)
>
> Thanks,
> kuba
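For readers following along: the two benchmark functions are not quoted in this thread. Reconstructed from the linked blog post (modulo formatting), they are:

```julia
# Type-unstable: r starts as the Int 0, then becomes Float64 after the
# first addition, so r's type changes inside the loop.
function sumofsins1(n::Integer)
    r = 0
    for i in 1:n
        r += sin(3.4)
    end
    return r
end

# Type-stable: r is Float64 throughout.
function sumofsins2(n::Integer)
    r = 0.0
    for i in 1:n
        r += sin(3.4)
    end
    return r
end
```

The only difference is the initialization of `r`, which is what makes the first version uninferable and the second one eligible for the constant folding of sin(3.4) discussed above.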
[julia-users] Re: Matrix crossproduct by groups of rows
Hi Gray, thank you very much. Yes, the memory allocation is astonishing. This works great.

On Friday, September 5, 2014 6:59:53 AM UTC+2, Gray Calhoun wrote:
>
> On Wednesday, September 3, 2014 6:46:07 PM UTC-5, Giuseppe Ragusa wrote:
>
>> I have been struggling to find a fast and elegant way to implement the
>> following algorithm. [...]
>>
>> function block_crossprod!(out, X, cl)
>>     for j in distinct(cl)
>>         out += X[find(cl.==j),:]'*X[find(cl.==j),:]
>>     end
>> end
>
> Hi Giuseppe, hope you're doing well! I think that += creates a temporary
> array, even though the notation suggests that it's mutating. The BLAS
> function syrk! does essentially what you want but without creating the
> temporary variables[1], i.e.:
>
> ```
> function block_crossprod2!(out, X, cl)
>     for j in unique(cl)
>         Base.BLAS.syrk!('U', 'T', 1., X[cl .== j,:], 1., out)
>     end
> end
> ```
>
> out will only return the upper triangular part, but Symmetric(out) will
> make it symmetric. If you know more about the indexing scheme, you may be
> able to improve things more by replacing X[cl .== j,:] with a subarray:
>
> ```
> function block_crossprod3!(out, X, bstarts, blengths)
>     for j in 1:length(bstarts)
>         Base.BLAS.syrk!('U', 'T', 1., sub(X, range(bstarts[j], blengths[j]),:), 1., out)
>     end
> end
> ```
>
> Other people may have more improvements, or know how to use subarrays in
> the general case. Hope this helps! A quick profile:
>
> ```
> X = randn(10_000, 200)
> cl = rep(1:500, 20)
> out1 = zeros(200, 200)
> bstarts = 1:20:10_000
> blengths = rep(20, 500)
>
> @time block_crossprod!(out1, X, cl)
> # elapsed time: 0.611584694 seconds (358190568 bytes allocated, 41.52% gc time)
> @time block_crossprod2!(out1, X, cl)
> # elapsed time: 0.179214852 seconds (19066568 bytes allocated, 20.01% gc time)
> @time block_crossprod3!(out1, X, bstarts, blengths)
> # elapsed time: 0.095740838 seconds (390416 bytes allocated)
> ```
>
> The memory allocation for the last version is kind of astonishing.
>
> [1]: http://julia.readthedocs.org/en/latest/stdlib/linalg/#Base.LinAlg.BLAS.syrk!
[julia-users] Matrix crossproduct by groups of rows
I have been struggling to find a fast and elegant way to implement the following algorithm. I have an Array{Float64, 2}, say X = randn(100, 2), and an Array{Int64, 1}, call it cl, which denotes row-blocks of X. More concretely:

```
 0.863377   0.867817   1.0
-0.310559  -0.863393   1.0
 1.74963    0.0464976  1.0
-1.29083   -0.496978   1.0
 ⋮
 1.04181    0.682401   5.0
 0.427546   0.237533   5.0
 0.650279  -0.727299   5.0
 0.121416   1.61274    5.0
```

What I have to calculate is the crossproducts X_1'X_1, X_2'X_2, ..., X_N'X_N, where X_i = X[find(cl.==i),:]. Something like this works, but I feel it is far from optimal:

```
X = randn(100, 2)
rep(v, n) = [v[div(i,n)+1] for i = 0:n*length(v)-1]
cl = rep(1:5, 20)
out = zeros(2, 2)

function block_crossprod!(out, X, cl)
    for j in unique(cl)
        out += X[find(cl.==j),:]'*X[find(cl.==j),:]
    end
end
```

Any suggestions will be greatly appreciated. Thank you.