No change.
I over-typed everything to avoid such type mismatches, particularly when 
experimenting with other integer types.  So unless I missed something 
somewhere, that should not be the case.
I suspect the compiler does not recognize that the incremented variables 
should be kept in registers.  It could also be the inherent cost of 
incrementing, but I doubt it, since I had some faster runs at some points...
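For anyone following along, here is a minimal sketch of the substitution John suggests below, isolated from the attached file. The function name `count_at_least` and its arguments are hypothetical and are not taken from crossJoinFilter.jl. The syntax assumes a current Julia release rather than the 2014 one. It only illustrates how a literal 1 can change the type of a narrower integer counter, not where the time in the real loop goes.

```julia
# Hypothetical helper (not from crossJoinFilter.jl): counts the entries of `v`
# that are >= `threshold`, using a counter of the array's element type T.
function count_at_least(v::AbstractVector{T}, threshold::T) where {T<:Integer}
    n = zero(T)              # counter starts out with type T
    i = 1                    # array index, a plain Int
    while i <= length(v)
        if v[i] >= threshold
            n += one(T)      # one(T) keeps n of type T throughout the loop
        end
        i += 1
    end
    return n
end

# A literal 1 is an Int, so adding it to a narrower integer promotes the result.
mixed = Int32(5) + 1             # promoted to Int (Int64 on a 64-bit machine)
stable = Int32(5) + one(Int32)   # stays Int32
```

With IdType = Int, as in the configuration posted, both forms give an Int, so this substitution alone would not be expected to change the timings.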

On Thursday, September 18, 2014 12:58:12 PM UTC-4, John Myles White wrote:
> 1 has type Int. If you add it to something with a different type, you 
> might be causing type instability. What happens if you replace the literal 
> 1 with one(T) for the type you're working with?
>   -- John
> On Sep 18, 2014, at 9:56 AM, G. Patrick Mauroy wrote:
> Profiling shows incrementing integers by 1 (i += 1) to be the bottleneck.
> Other statements within the same loop take much less time.
> In my performance-optimizing zeal, I over-typed the hell out of everything 
> to attempt to squeeze out the last ounce of performance.
> Some of this zeal did help in other parts of the code, but now I am 
> struggling to make sense of spending most of the time incrementing by 1.
> I suspect the problem is the over-typing zeal, because I seem to recall 
> having a less strongly typed version that ran consistently 2-3 times faster 
> for the default Int (but not for other integer types).  It was late at 
> night, so I don't recall the details!
> I am pretty confident the increment variables are typed, so there should 
> not be any undue cast.
> Any ideas?
> Here is what my code conceptually looks like:
>> # Global static type declaration ahead seems to have helped (as opposed to
>> # deriving from eltype of the underlying array at the beginning of the
>> # function being profiled).
>> IdType = Int # Int64
>> DType = Int
>> function my_fct(dt1, dt2)
>>   # Convert is for sure unnecessary for default Int types but more
>>   # rigorous and necessary in some parts of the code when experimenting
>>   # with other IdType & DType types.
>>   # oneIdType is used to make sure I increment with a value of the proper
>>   # type, again useless with IdType = Int.
>>   const oneIdType = convert(IdType, 1)
>>   const zeroIdType = convert(IdType, 0)
>>   i::IdType = zeroIdType
>>   i2Match::IdType = zeroIdType
>>   i2Lower::IdType = zeroIdType
>>   i2Upper::IdType = oneIdType
>>   ...
>>     # Critical loop.
>>     i2Match = i2Lower
>>     while i2Match < i2Upper
>>       @inbounds i2MatchD2 = dt2D2[i2Match]
>>       if i1D <= i2MatchD2
>>         i += oneIdType # SLOW!
>>         @inbounds i2MatchD1 = dt2D1[i2Match]
>>         @inbounds resid1[i] = i1id1
>>         ...
>>       end
>>       i2Match += oneIdType # SLOW!
>>     end
>>   ...
>> end
> The variables not declared above are 1-dimensional arrays of the 
> appropriate type -- basically all Int in this configuration.
> Enclosed is the full stand-alone code if anyone cares to try.
> On my machines, one function call takes in the range of 0.05 to 0.1 sec, 
> depending heavily on garbage collection, so profiling with 100 runs is 
> done in about 10 sec.
> Thanks.
> Patrick
> <crossJoinFilter.jl>

