Also, somewhat related: the "speeds up short path" change you just committed most likely does not actually speed anything up. :)
Check the generated code; it's surprisingly similar before and after: http://goo.gl/XIPWyu The inner loop is the same length, the only difference is which registers are used, and the overall function size differs by 2 instructions, one of which is in the error handling that is rather unlikely to be triggered. Again, the C compiler is good at obvious optimizations, so most of the time I go for the most easily readable version of the code. The change itself is otherwise fine, I guess, but the commit message is misleading.
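
To illustrate the kind of thing I mean (this is NOT the code from your commit, just a made-up sketch with hypothetical function names): a hand-written "short path" often only duplicates a check that the loop condition already performs, so an optimizing compiler may well end up with nearly the same code for both variants.

```c
#include <stddef.h>

/* Hypothetical "optimized" variant: special-cases the short input up front. */
size_t skip_spaces_fast(const char *s, size_t len)
{
    size_t i = 0;

    if (len == 0 || s[0] != ' ')    /* hand-written short path */
        return 0;
    while (i < len && s[i] == ' ')
        i++;
    return i;
}

/* Plain variant: the loop condition already covers the short case. */
size_t skip_spaces_plain(const char *s, size_t len)
{
    size_t i = 0;

    while (i < len && s[i] == ' ')
        i++;
    return i;
}
```

Both return the same result for every input, so the only question is what the compiler makes of them. Building each with something like `gcc -O2 -S` and diffing the assembly is the quickest way to find out before claiming a speedup in a commit message.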