The literal 1 has type Int. If you add it to a value of a different type, you
may be introducing a type instability. What happens if you replace the literal
1 with one(T), where T is the type you're working with?
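
As a minimal sketch of that suggestion (not taken from the attached code): the function name count_at_most and its arguments are hypothetical, and the `where` syntax is that of current Julia, which is newer than this thread. The counter is created with zero(T) and incremented with one(T), so it keeps the element type throughout the loop.

```julia
# Hypothetical example: count the elements at or below a limit, using a counter
# whose type follows the element type T instead of defaulting to Int.
function count_at_most(xs::AbstractVector{T}, limit::T) where {T<:Integer}
    n = zero(T)            # the counter starts out as a T, not an Int
    for x in xs
        if x <= limit
            # T + T stays T. With a literal 1, a narrower T such as Int32
            # would be promoted to Int64 on a 64-bit machine, changing the
            # type of n partway through the loop.
            n += one(T)
        end
    end
    return n
end

count_at_most(Int32[1, 5, 9], Int32(5))  # Int32 value 2
```

Whether this removes the slowdown in the real function depends on what else in it is type-unstable, so it is worth checking the result with @code_warntype.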

  -- John

On Sep 18, 2014, at 9:56 AM, G. Patrick Mauroy wrote:

> Profiling shows that incrementing integers by 1 (i += 1) is the bottleneck.
> 
> Within the same loop are other statements that take much less time.
> 
> In my performance-optimizing zeal, I over-typed the hell out of everything to 
> try to squeeze out performance to the last ounce.
> Some of this zeal did help in other parts of the code, but now I am 
> struggling to make sense of spending most of the time incrementing by 1.
> I suspect the problem is my over-typing zeal, because I seem to recall having 
> a version not so strongly typed that ran consistently 2-3 times faster for 
> default Int (but not for other Int types).  It was late at night so I don't 
> recall the details!
> 
> I am pretty confident the increment variables are typed, so there should not 
> be any undue cast.
> 
> Any ideas?
> 
> Here is what my code conceptually looks like:
> 
> # Global static type declaration ahead seems to have helped (as opposed to
> # deriving from eltype of underlying array at the beginning of function being
> # profiled).
> IdType = Int # Int64
> DType = Int
> function my_fct(dt1, dt2)
>   # Convert is for sure unnecessary for default Int types but more rigorous
>   # and necessary in some parts of code when experimenting with other IdType &
>   # DType types.
>   # Used to make sure I increment with a value of the proper type, again
>   # useless with IdType = Int.
>   const oneIdType = convert(IdType, 1)
>   const zeroIdType = convert(IdType, 0)
>   i::IdType = zeroIdType; i2Match::IdType = zeroIdType
>   i2Lower::IdType = zeroIdType; i2Upper::IdType = oneIdType
>   ...
>     # Critical loop.
>     i2Match = i2Lower
>     while i2Match < i2Upper
>       @inbounds i2MatchD2 = dt2D2[i2Match]
>       if i1D <= i2MatchD2
>         i += oneIdType # SLOW!
>         @inbounds i2MatchD1 = dt2D1[i2Match]
>         @inbounds resid1[i] = i1id1
>         ...
>       end
>       i2Match += oneIdType # SLOW!
>     end
>   ...
> end
> 
> The undeclared variables are 1-dimensional arrays of the appropriate type, 
> basically all Int in this configuration.
> 
> Enclosed is the full stand-alone code if anyone cares to try.
> On my machines, one function call takes 0.05 to 0.1 sec, depending heavily 
> on garbage collection, so profiling with 100 runs is done in about 10 sec.
> 
> Thanks.
> 
> Patrick
> 
> <crossJoinFilter.jl>
