I thought whether CVT is needed is decided with the table (I forgot exactly
where that is now). This is a machine dependent operation (zero or sign
extended when loaded from memory). E.g. IA is zero extended, Mips is sign
extended. Or am I in the wrong context?
Sun
On Tue, Nov 29, 2011 at 4:29 PM, Gang Yu <yugang...@gmail.com> wrote:
> Hi, list and Sun
>
> Let me re-organize the review request for bug912 fix. The bug case:
>
> struct go7007 {
> int sensor_framerate;
> int fps_scale;
> };
> int vti_bitlen(struct go7007 *go)
> {
> unsigned int i, max_time_incr = go->sensor_framerate / go->fps_scale;
> for (i = 31; (max_time_incr & ((1 << i) - 1)) == max_time_incr; --i);
> return i + 1;
> }
>
> opencc O2 compile will assert:
>
> ### Assertion failure at line 2109 of
> /fc/proj/ctires/open64/o64guru/src/Sat/trunk/osprey/be/cg/cgemit.cxx:
> ### Compiler Error in file bug912-cutdown.c during Assembly phase:
> ### literal for operand 1 is not in range
>
> If we check the whirl output, we get the initial condition check for for
> loop is as follows:
>
> U4U4LDID 50 <1,8,.preg_U4> T<8,.predef_U4,4> # max_time_incr
> U4U4LDID 50 <1,8,.preg_U4> T<8,.predef_U4,4> # max_time_incr
> I8INTCONST -2147483649 (0xffffffff7fffffff)
> U4BAND
> I4U4EQ
> FALSEBR L2562 {line: 0/0}
>
> this is already wrong, the expected whirl output for the loop entry
> condition should be:
>
> U4U4LDID 50 <1,8,.preg_U4> T<8,.predef_U4,4> # max_time_incr
> U4INTCONST 2147483647
> U4BAND
> I4U4EQ
> FALSEBR L2562 {line: 0/0}
> and investigation shows before the phase of "Create CODEMAP
> Representation"
>
> this condition can be expressed as:
> U4U4LDID 0 <st 19> T<8,.predef_U4,4>
> I4INTCONST 1 (0x1)
> I4I4LDID 0 <st 20> T<4,.predef_I4,4>
> I4SHL
> I4INTCONST -1 (0xffffffffffffffff)
> I4ADD
> U4I4CVT
> U4U4LDID 0 <st 19> T<8,.predef_U4,4>
> U4BAND
> I4U4EQ
> FALSEBR L1538 {line: 0/0} {freq: 0, ln: 0, col: 0}
> st19 is max_time_incr, st20 is i, this is still right.
>
> After the CODEMAP phase, the expression is transformed to
> > LDID U4 U4 sym3v3 0 ty=802 <u=4 cr12> flags:0x0 b=-1 #max_time_incr
> > LDID U4 U4 sym3v3 0 ty=802 <u=4 cr12> flags:0x0 b=-1 #max_time_incr
> > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1
> > U4BAND <u=1 cr18> isop_flags:0x0 flags:0x0 b=-1
> > I4U4EQ <u=1 cr19> isop_flags:0x0 flags:0x0 b=-1
> >OPR_FALSEBR 1538 b=-1 flags:0x2 pj2 Sid-1
>
> The expression
> I4INTCONST 1 (0x1)
> I4I4LDID 0 <st 20> T<4,.predef_I4,4>
> I4SHL
> I4INTCONST -1 (0xffffffffffffffff)
> I4ADD
> U4I4CVT
> U4U4LDID 0 <st 19> T<8,.predef_U4,4>
> is unexpected tranformed to
> > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1
>
> it should rightly transformed be
>
> > LDC U4 2147483647 <u=1 cr19> flags:0x0 b=-1
>
> So, we can make sure there is something unexpected done in codemap phase.
>
> More investigation shows when control goes to Canon_CVT, it get the
> representation of
> I4INTCONST 1 (0x1)
> I4I4LDID 0 <st 20> T<4,.predef_I4,4>
> I4SHL
> I4INTCONST -1 (0xffffffffffffffff)
> I4ADD
> to
> LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1
>
> This is still right, instead of get I4 CONST, we get I8 const because a
> canonical const scale is 64bit default, while fold the coderep
> > LDC U4 1 <u=1 cr15> flags:0x0 b=-1
> > LDC U4 31 <u=2 cr13> flags:0x0 b=-1
> >I4SHL <u=0 cr0> isop_flags:0x0 flags:0x0 b=-1
> is sign extended to 0xffffffff10000000, so we get
> > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1
>
> The right solution here is we should not delete the U4I4CVT in
> this expression.
>
> So, my suggested patch is to change the control flow in Canon_cvt to avoid
> ignore U4I4CVT in such case.
>
> The patch below:
>
> osprey/be/opt/opt_htable.cxx -- 1fc05d0..0b7a14a 100644
> --- a/osprey/be/opt/opt_htable.cxx
> +++ b/osprey/be/opt/opt_htable.cxx
> @@ -2271,7 +2271,11 @@ CODEMAP::Canon_cvt(WN *wn,
> #endif
> if ((Get_mtype_class(OPCODE_rtype(op)) &
> Get_mtype_class(OPCODE_desc(op))) != 0 &&
> - MTYPE_size_min(OPCODE_rtype(op)) == MTYPE_size_min(OPCODE_desc(op)))
> + MTYPE_size_min(OPCODE_rtype(op)) == MTYPE_size_min(OPCODE_desc(op))
> &&
> + // bug912 open64.net. Do not delete U4I4CVT if his kid is a
> constant
> + (!(OPCODE_rtype(op) == MTYPE_U4 &&
> + OPCODE_desc(op) == MTYPE_I4 &&
> + ccr->Tree() == NULL)))
> return propagated;
> if ( WOPT_Enable_Cvt_Folding &&
>
> This new patch does not lost the chances to U4I4CVT optimisation, while it
> makes the case of U4I4CVT const right.
>
> If you still have other questions, let me know.
>
> Regards
> Gang
>
------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure
contains a definitive record of customers, application performance,
security threats, fraudulent activity, and more. Splunk takes this
data and makes sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-novd2d
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel