No change. I over-typed everything to avoid such type mismatches, particularly when experimenting with other integer types, so unless I missed something somewhere, that should not be the cause. I suspect something like the compiler not recognizing that the incremented variables should be kept in registers. Or it could be the inherent cost of incrementing, but I doubt it: I had some faster runs at some point...
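For anyone who wants to try the one(T) suggestion quoted below in isolation, here is a minimal sketch. It is not the code from crossJoinFilter.jl: the function name count_matches and its arguments are hypothetical, and it assumes current Julia (1.x) syntax rather than the version in use when this thread was written. It only shows counters that keep the requested integer type because every increment uses one(T).

```julia
# Hypothetical stand-in for the critical loop: count the entries of `d2`
# that are >= `x`, keeping both counters in the integer type `T`.
function count_matches(::Type{T}, d2::Vector, x) where {T<:Integer}
    n = convert(T, length(d2))  # throws InexactError if the length does not fit in T
    i = zero(T)                 # number of matches so far
    k = zero(T)                 # index of the last element examined
    while k < n
        k += one(T)             # increment with a value of type T, not the literal 1
        if x <= d2[k]
            i += one(T)
        end
    end
    return i
end

result = count_matches(Int32, [1, 5, 3, 7], 3)
println(result, " :: ", typeof(result))
```

The type is passed as an argument here instead of being read from a non-constant global such as IdType, since non-constant globals are a known source of type instability in Julia. Running @code_warntype on a call like the one above is one way to check whether the counters are inferred as concrete types.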
On Thursday, September 18, 2014 12:58:12 PM UTC-4, John Myles White wrote:
>
> 1 has type Int. If you add it to something with a different type, you
> might be causing type instability. What happens if you replace the literal
> 1 with one(T) for the type you're working with?
>
> -- John
>
> On Sep 18, 2014, at 9:56 AM, G. Patrick Mauroy <gpma...@gmail.com> wrote:
>
> Profiling shows incrementing integers by 1 (i += 1) being the bottleneck.
> Within the same loop are other statements that take much less time.
>
> In my performance-optimizing zeal, I over-typed the hell out of everything
> to attempt to squeeze performance to the last ounce.
> Some of this zeal did help in other parts of the code, but now I am struggling
> to make sense of spending most of the time incrementing by 1.
> I suspect the problem is over-typing zeal because I seem to recall having
> a version not so strongly typed that ran consistently 2-3 times faster for
> default Int (but not for other Int types). It was late at night so I don't
> recall the details!
>
> I am pretty confident the increment variables are typed so there should
> not be any undue cast.
>
> Any idea?
>
> Here is what my code conceptually looks like:
>
>> # Global static type declaration ahead seems to have helped (as opposed to
>> # deriving from eltype of underlying array at the beginning of function being
>> # profiled).
>> IdType = Int # Int64
>> DType = Int
>> function my_fct(dt1, dt2)
>>     # Convert is for sure unnecessary for default Int types but more
>>     # rigorous and necessary in some parts of code when experimenting with other
>>     # IdType & DType types.
>>     const oneIdType = convert(IdType, 1) # Used to make sure I increment with a value of the proper type, again useless with IdType = Int.
>>     const zeroIdType = convert(IdType, 0)
>>     i::IdType = zeroIdType; i2Match::IdType = zeroIdType; i2Lower::IdType = zeroIdType; i2Upper::IdType = oneIdType;
>>     ...
>>     # Critical loop.
>>     i2Match = i2Lower
>>     while i2Match < i2Upper
>>         @inbounds i2MatchD2 = dt2D2[i2Match]
>>         if i1D <= i2MatchD2
>>             i += oneIdType # SLOW!
>>             @inbounds i2MatchD1 = dt2D1[i2Match]
>>             @inbounds resid1[i] = i1id1
>>             ...
>>         end
>>         i2Match += oneIdType # SLOW!
>>     end
>>     ...
>> end
>
> The undeclared types are 1-dim arrays of the appropriate type -- basically
> all Int in this configuration.
>
> Enclosed is the full stand-alone code if anyone cares to try.
> On my machines, one function call is in the range of 0.05 to 0.1 sec,
> highly depending upon garbage collection, so profiling with 100 runs is
> done in about 10 sec.
>
> Thanks.
>
> Patrick
>
> <crossJoinFilter.jl>