Re: [julia-users] Re: Optimizing Function Injection in Julia

Cedric St-Jean Fri, 22 Jan 2016 12:25:09 -0800

What about creating a parametric type, with one parameter/closed-over 
variable?
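
A minimal sketch of that idea, written in current Julia syntax (struct plus
a callable-object method) rather than the 0.4-era type/Base.call used further
down the thread. The names AddZ and apply_if_greater are made up for
illustration and are not part of FastAnonymous. The type is defined once, and
the type parameter keeps the field concrete for each kind of z:

```julia
# Hypothetical parametric "closure" type: one definition, with the
# closed-over variable stored in a field whose type is the parameter T.
struct AddZ{T}
    z::T
end

# Make instances callable, mirroring the closure c -> c + z.
(f::AddZ)(c) = c + f.z

# Stand-in for function1 from the thread: calls f only when a > b.
function apply_if_greater(a, b, f)
    if a > b
        return f(a + b)
    end
    return nothing
end

# 100 instances, but they all share the single concrete type AddZ{Int}.
callbacks = Dict{Int, AddZ{Int}}()
for z in 1:100
    callbacks[z] = AddZ(z)
end

# Different z types give different concrete instantiations of the same
# parametric type, so the field type stays concrete in both cases.
int_cb = AddZ(1)
float_cb = AddZ(1.0)
```

Since every Int-valued callback shares one type, code called with it should
only need to be compiled once per z type rather than once per instance.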


On Friday, January 22, 2016 at 3:20:48 PM UTC-5, Tim Holy wrote:
>
> On Friday, January 22, 2016 12:03:02 PM Cedric St-Jean wrote: 
> > It looks like my understanding of FastAnonymous was flawed. Why doesn't it
> > create the type at macro time, and just instantiate it at runtime, yielding
> > 1 type / @anon ? Is there any complication that prevents that?
>
> Yes: 
>
> for z in (1, 1.0) 
>     f = @anon c->c+z 
>     @show fieldtype(typeof(f), :z) 
> end 
>
> yields this output: 
>
> fieldtype(typeof(f),:z) = Int64 
> fieldtype(typeof(f),:z) = Float64 
>
> The fields of the constructed "function" need a concrete type if you're
> going to get good performance.
>
> Best, 
> --Tim 
>
>
> > 
> > Cédric 
> > 
> > On Friday, January 22, 2016 at 2:15:20 PM UTC-5, Tim Holy wrote: 
> > > On Friday, January 22, 2016 10:21:31 AM Bryan Rivera wrote: 
> > > > For 1000 elements: 
> > > > 
> > > > 0.00019s vs 0.035s respectively 
> > > > 
> > > > Thanks! 
> > > 
> > > Glad it helped. 
> > > 
> > > > Is the reason the faster code has more allocations because it is
> > > > inserting vars into the single function?  (As opposed to the slower
> > > > code already having its vars filled in.)
> > > 
> > > Every time you call @anon, it creates a brand-new type (and an instance of
> > > that type) that julia has never seen before. That requires JITting any code
> > > that gets invoked on this instance. So the usual advice, "run once to JIT,
> > > then do your timing" doesn't work in this case: it JITs every time.
> > > 
> > > --Tim 
> > > 
> > > > On Friday, January 22, 2016 at 12:23:59 PM UTC-5, Tim Holy wrote: 
> > > > > Just use 
> > > > > 
> > > > > z = 1 
> > > > > function2 = @anon c -> c + z 
> > > > > for z = 1:100 
> > > > > 
> > > > >     function2.z = z 
> > > > >     # do whatever with function2, including making a copy 
> > > > > 
> > > > > end 
> > > > > 
> > > > > --Tim 
> > > > > 
> > > > > On Friday, January 22, 2016 08:55:25 AM Cedric St-Jean wrote: 
> > > > > > > (non-mutating) Closures and FastAnonymous work essentially the same
> > > > > > > way. They store the data that is closed over (more or less) and a
> > > > > > > function pointer. The thing is that there's only one data structure in
> > > > > > > Julia for all regular anonymous functions, whereas FastAnonymous
> > > > > > > creates one per @anon site. Because the FastAnonymous-created datatype
> > > > > > > is specific to that function definition, the standard Julia machinery
> > > > > > > takes over and produces efficient code. It's just as good as if the
> > > > > > > function had been defined normally with `function foo(...) ... end`
> > > > > > 
> > > > > > 
> > > > > > for z = 1:100 
> > > > > > 
> > > > > >     function2 = @anon c -> (c + z) 
> > > > > >     
> > > > > >     dict[z] =  function2 
> > > > > > 
> > > > > > end 
> > > > > > 
> > > > > > 
> > > > > > So we end up creating multiple functions for each z value. 
> > > > > > 
> > > > > > 
> > > > > > > In this code, whether you use @anon or not, Julia will create 100
> > > > > > > object instances to store the z values.
> > > > > > 
> > > > > > The speed difference between the two will soon be gone. 
> > > > > > <https://github.com/JuliaLang/julia/pull/13412> 
> > > > > > 
> > > > > > Cédric 
> > > > > > 
> > > > > > > On Friday, January 22, 2016 at 11:31:36 AM UTC-5, Bryan Rivera wrote:
> > > > > > > > I have to do some investigating here.  I thought we could do something
> > > > > > > > like that but wasn't quite sure how it would look.
> > > > > > > 
> > > > > > > Check this out: 
> > > > > > > 
> > > > > > > > This code using FastAnonymous optimizes to the very same code below it
> > > > > > > > where functions have been manually injected:
> > > > > > > 
> > > > > > > using FastAnonymous 
> > > > > > > 
> > > > > > > 
> > > > > > > function function1(a, b, function2) 
> > > > > > > 
> > > > > > >   if(a > b) 
> > > > > > >   
> > > > > > >     c = a + b 
> > > > > > >     return function2(c) 
> > > > > > >   
> > > > > > >   else 
> > > > > > >   
> > > > > > >     # do anything 
> > > > > > >     # but return nothing 
> > > > > > >   
> > > > > > >   end 
> > > > > > > 
> > > > > > > end 
> > > > > > > 
> > > > > > > 
> > > > > > > z = 10 
> > > > > > > function2 = @anon c -> (c + z) 
> > > > > > > 
> > > > > > > 
> > > > > > > a = 1 
> > > > > > > b = 2 
> > > > > > > @code_llvm function1(a, b, function2) 
> > > > > > > @code_native function1(a, b, function2) 
> > > > > > > 
> > > > > > > Manually unrolled equivalent: 
> > > > > > > 
> > > > > > > function function1(a, b, z) 
> > > > > > > 
> > > > > > >   if(a > b) 
> > > > > > >   
> > > > > > >     c = a + b 
> > > > > > >     return function2(c, z) 
> > > > > > >   
> > > > > > >   else 
> > > > > > >   
> > > > > > >     # do anything 
> > > > > > >     # but return nothing 
> > > > > > >   
> > > > > > >   end 
> > > > > > > 
> > > > > > > end 
> > > > > > > 
> > > > > > > 
> > > > > > > function function2(c, z) 
> > > > > > > 
> > > > > > >   return c + z 
> > > > > > > 
> > > > > > > end 
> > > > > > > 
> > > > > > > 
> > > > > > > a = 1 
> > > > > > > b = 2 
> > > > > > > z = 10 
> > > > > > > 
> > > > > > > 
> > > > > > > @code_llvm function1(a, b, z) 
> > > > > > > 
> > > > > > > @code_native function1(a, b, z) 
> > > > > > > 
> > > > > > > > However, this is a bit too simplistic.  My program actually does this:
> > > > > > > >
> > > > > > > > # Test to see if multiple functions are created.  They are.
> > > > > > > > # We would only need to create a single function if we used julia anon,
> > > > > > > > # but it's time inefficient.
> > > > > > > 
> > > > > > > dict = Dict{Int, Any}() 
> > > > > > > for z = 1:100 
> > > > > > > 
> > > > > > >     function2 = @anon c -> (c + z) 
> > > > > > >     
> > > > > > >     dict[z] =  function2 
> > > > > > > 
> > > > > > > end 
> > > > > > > 
> > > > > > > 
> > > > > > > a = 1 
> > > > > > > b = 2 
> > > > > > > 
> > > > > > > function test() 
> > > > > > > 
> > > > > > >   function1(a,b, dict[100]) 
> > > > > > >   function1(a,b, dict[50]) 
> > > > > > > 
> > > > > > > end 
> > > > > > > 
> > > > > > > @code_llvm test() 
> > > > > > > @code_native test() 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > So we end up creating multiple functions for each z value.  We could use
> > > > > > > > Julia's anon funs, which would only create a single function; however,
> > > > > > > > these lambdas are less performant than FastAnon.
> > > > > > > >
> > > > > > > > So it's a space vs time tradeoff: I want the speed of FastAnon without the
> > > > > > > > spatial overhead of storing multiple functions.
> > > > > > > 
> > > > > > > Can we be greedy?  :) 
> > > > > > > 
> > > > > > > > On Thursday, January 21, 2016 at 9:56:51 PM UTC-5, Cedric St-Jean wrote:
> > > > > > > >> Something like this?
> > > > > > > >>
> > > > > > > >> function function1(a, b, f) # Variable needed in callback fun injected.
> > > > > 
> > > > > > >>     if(a > b) 
> > > > > > >>     
> > > > > > >>       c = a + b 
> > > > > > >>       res = f(c) # Callback function has been injected. 
> > > > > > >>       return res + 1 
> > > > > > >>     
> > > > > > >>     else 
> > > > > > >>     
> > > > > > >>       # do anything 
> > > > > > >>       # but return nothing 
> > > > > > >>     
> > > > > > >>     end 
> > > > > > >> 
> > > > > > >> end 
> > > > > > >> 
> > > > > > >> type SomeCallBack 
> > > > > > >> 
> > > > > > >>     z::Int 
> > > > > > >> 
> > > > > > >> end 
> > > > > > >> Base.call(callback::SomeCallBack, c) = c + callback.z 
> > > > > > >> 
> > > > > > >> function1(2, 1, SomeCallBack(10)) 
> > > > > > >> 
> > > > > > > >> Because of JIT, this is 100% equivalent to your "callback function has
> > > > > > > >> been injected" example, performance-wise. My feeling is that .call
> > > > > > > >> overloading is not to be abused in Julia, so I would favor using a
> > > > > > > >> regular function call with a descriptive name instead of call
> > > > > > > >> overloading, but the same performance guarantees apply. Does that
> > > > > > > >> answer your question?
> > > > > > > >>
> > > > > > > >> On Thursday, January 21, 2016 at 9:02:50 PM UTC-5, Bryan Rivera wrote:
> > > > > > > >>> I think what I wrote above might be too complicated, as it is an attempt
> > > > > > > >>> to solve this problem.
> > > > > > >>> 
> > > > > > >>> In essence this is what I want: 
> > > > > > >>> 
> > > > > > >>> 
> > > > > > > >>> # wasmerged, _, _, _ = elide_pairwise!(ttree1, ttree2, canmerge; nbrs=idbgv)
> > > > > > > >>>
> > > > > > >>> function function1(a, b, onGreaterThanCallback) 
> > > > > > >>> 
> > > > > > >>>   if(a > b) 
> > > > > > >>>   
> > > > > > >>>     c = a + b 
> > > > > > >>>     res = onGreaterThanCallback(c, z) 
> > > > > > >>>     return res + 1 
> > > > > > >>>   
> > > > > > >>>   else 
> > > > > > >>>   
> > > > > > >>>     # do anything 
> > > > > > >>>     # but return nothing 
> > > > > > >>>   
> > > > > > >>>   end 
> > > > > > >>> 
> > > > > > >>> end 
> > > > > > >>> 
> > > > > > >>> 
> > > > > > >>> global onGreaterThanCallback = (c) -> c + z 
> > > > > > >>> 
> > > > > > >>> function1(a, b, onGreaterThanCallback) 
> > > > > > >>> 
> > > > > > >>> 
> > > > > > >>> Problems: 
> > > > > > >>> 
> > > > > > >>> The global variable. 
> > > > > > >>> 
> > > > > > > >>> The anonymous function which has performance impact (vs other
> > > > > > > >>> approaches).  We could use Tim Holy's @anon, but then the value of `z`
> > > > > > > >>> is fixed at function definition, which we don't always want.
> > > > > > >>> 
> > > > > > >>> I think that the ideal optimization would look like this: 
> > > > > > > >>>       function function1(a, b, z) # Variable needed in callback fun injected.
> > > > > > >>> 
> > > > > > >>>         if(a > b) 
> > > > > > >>>         
> > > > > > >>>           c = a + b 
> > > > > > >>>           res = c + z # Callback function has been injected. 
> > > > > > >>>           return res + 1 
> > > > > > >>>         
> > > > > > >>>         else 
> > > > > > >>>         
> > > > > > >>>           # do anything 
> > > > > > >>>           # but return nothing 
> > > > > > >>>         
> > > > > > >>>         end 
> > > > > > >>>       
> > > > > > >>>       end 
> > > > > > >>>       
> > > > > > >>>       
> > > > > > >>>       function1(a, b, z) 
> > > > > > >>> 
> > > > > > > >>> In OO languages we would be using an abstract class or its equivalent.
> > > > > > > >>> But I've thought about it, and read the discussions on interfaces, and
> > > > > > > >>> don't see those solutions optimizing the code out like I did above.
> > > > > > > >>>
> > > > > > >>> Any ideas? 
>
>
