LCD 47 <[email protected]> wrote: > On 12 June 2013, Dominique Pellé <[email protected]> wrote: > [...] >> You can half the number of comparisons when doing range checking. >> Instead of doing 2 comparisons... >> >> if (c >= '0' && c <= '9') >> >> You can do it in one comparison with this trick: >> >> if ((unsigned)c - '0' < 10) > [...] > > Actually, the second version is a bit slower than the first. :) A > comparison is exactly as fast as a subtraction of the same length, and > (unsigned)c - '0' needs to store the result in memory. > > /lcd
It depends on the machine of course, but often branching is more expensive because the processor can mispredict the outcome, causing it to discard in pipeline instruction(s) already partially processed: http://en.wikipedia.org/wiki/Branch_misprediction http://en.wikipedia.org/wiki/Instruction_pipeline#Branches That's why there are people who find clever branchless algorithms for all kind of things (min, max, abs...). The branchless abs() algorithm is even apparently patented (sigh): http://www.strchr.com/optimized_abs_function Having said that, I have compiled this function... int is_digit(int d) { #ifdef OPTIMIZE return ((unsigned)d - '0') <= 9; #else return d >= '0' && d <= '9'; #endif } With gcc -O0 on x86_64, the code produced is shorter and faster with -DOPTIMIZE as I was hoping for. -O0 without -DOPTIMIZE (two comparisons): 000000000040051c <is_digit>: 40051c: 55 push %rbp 40051d: 48 89 e5 mov %rsp,%rbp 400520: 89 7d fc mov %edi,-0x4(%rbp) 400523: 83 7d fc 2f cmpl $0x2f,-0x4(%rbp) 400527: 7e 0d jle 400536 <is_digit+0x1a> 400529: 83 7d fc 39 cmpl $0x39,-0x4(%rbp) 40052d: 7f 07 jg 400536 <is_digit+0x1a> 40052f: b8 01 00 00 00 mov $0x1,%eax 400534: eb 05 jmp 40053b <is_digit+0x1f> 400536: b8 00 00 00 00 mov $0x0,%eax 40053b: 5d pop %rbp 40053c: c3 retq -O0 -DOPTIMIZE (one comparison): 000000000040051c <is_digit>: 40051c: 55 push %rbp 40051d: 48 89 e5 mov %rsp,%rbp 400520: 89 7d fc mov %edi,-0x4(%rbp) 400523: 8b 45 fc mov -0x4(%rbp),%eax 400526: 83 e8 30 sub $0x30,%eax 400529: 83 f8 09 cmp $0x9,%eax 40052c: 0f 96 c0 setbe %al 40052f: 0f b6 c0 movzbl %al,%eax 400532: 5d pop %rbp 400533: c3 retq But with gcc -O1, -O2, or -O3, the generated code is identical anyway whether OPTIMIZE is defined or not. For example gcc -O3 produces this code with/without -DOPTIMIZE: 00000000004005f0 <is_digit>: 4005f0: 83 ef 30 sub $0x30,%edi 4005f3: 31 c0 xor %eax,%eax 4005f5: 83 ff 09 cmp $0x9,%edi 4005f8: 0f 96 c0 setbe %al 4005fb: c3 retq 4005fc: 0f 1f 40 00 nopl 0x0(%rax) Dominique -- -- You received this message from the "vim_dev" maillist. Do not top-post! Type your reply below the text you are replying to. For more information, visit http://www.vim.org/maillist.php --- You received this message because you are subscribed to the Google Groups "vim_dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/groups/opt_out.
