On Dec 5, 2010, at 9:49 AM, Chris Lattner wrote:
>
> On Dec 5, 2010, at 3:19 AM, Richard Guenther wrote:
>
>>> $ clang t.cc -S -o - -O3 -mkernel -fomit-frame-pointer -mllvm -show-mc-encoding
>>> .section __TEXT,__text,regular,pure_instructions
>>> .globl __Z4testl
>>> .align 4, 0x90
>>> __Z4testl: ## @_Z4testl
>>> ## BB#0: ## %entry
>>> movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
>>> movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
>>> mulq %rcx ## encoding: [0x48,0xf7,0xe1]
>>> movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
>>> cmovnoq %rax, %rdi ## encoding: [0x48,0x0f,0x41,0xf8]
>>> jmp __Znam ## TAILCALL
>>> ## encoding: [0xeb,A]
>>> ## fixup A - offset: 1, value: __Znam-1, kind: FK_PCRel_1
>>> .subsections_via_symbols
>>>
>>> This could be further improved by inverting the cmov condition to avoid the
>>> first movq, which we'll tackle as a general regalloc improvement.
>>
>> I'm curious as to how you represent the overflow checking in your high-level
>> IL.
>
> The (optimized) generated IR is:
>
> $ clang t.cc -emit-llvm -S -o - -O3
> ...
> define noalias i8* @_Z4testl(i64 %count) ssp {
> entry:
> %0 = tail call %0 @llvm.umul.with.overflow.i64(i64 %count, i64 4)
> %1 = extractvalue %0 %0, 1
> %2 = extractvalue %0 %0, 0
> %3 = select i1 %1, i64 -1, i64 %2
> %call = tail call noalias i8* @_Znam(i64 %3)
> ret i8* %call
> }

Sorry, it's a little easier to read with expanded names and types:

define noalias i8* @_Z4testl(i64 %count) ssp {
entry:
%A = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %count, i64 4)
%B = extractvalue { i64, i1 } %A, 1
%C = extractvalue { i64, i1 } %A, 0
%D = select i1 %B, i64 -1, i64 %C
%call = tail call noalias i8* @_Znam(i64 %D)
ret i8* %call
}
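
For anyone less used to reading the IR, the size computation it performs can be sketched in C++ roughly as follows. This is only an illustration, not the contents of t.cc (which isn't shown in this thread): size_for_new is a hypothetical name, and __builtin_mul_overflow is a GCC/Clang builtin that is assumed to be available. It stands in for the llvm.umul.with.overflow call, the two extractvalues and the select; the result is what gets passed to operator new[].

```cpp
#include <cstdint>

// Hypothetical helper mirroring the IR above: multiply the element count by
// the element size (4 bytes here) and, if the product does not fit in 64
// bits, return all-ones (the "i64 -1" in the select) instead.
std::uint64_t size_for_new(std::uint64_t count) {
    std::uint64_t bytes = 0;
    // Returns true when count * 4 overflows; bytes then holds the wrapped value.
    bool overflowed = __builtin_mul_overflow(count, std::uint64_t{4}, &bytes);
    return overflowed ? ~std::uint64_t{0} : bytes;
}
```

Passing the all-ones size on overflow means the allocation request is for a size operator new[] should not be able to satisfy, rather than for a silently wrapped, too-small buffer.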

-Chris