Re: ICE with MEM_REF when Pmode is different from word_mode

2012-05-30 Thread Mohamed Shafi
On 29 May 2012 17:31, Richard Guenther richard.guent...@gmail.com wrote:
 On Tue, May 29, 2012 at 1:57 PM, Mohamed Shafi shafi...@gmail.com wrote:
 Hi,

 I am porting a private target in GCC 4.6.3 version. For my target
 pointer size is 24bits and word size is 32bits. Moreover a byte is
 32bit

 For the testcase gcc.c-torture/compile/92-1.c i get the following ICE

 92-1.c: In function 'f':
 92-1.c:18:5: internal compiler error: in size_binop_loc, at
 fold-const.c:1436
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See http://gcc.gnu.org/bugs.html for instructions

 This is the reduced testcase of the same

  struct vp {
  int wa;
 };

 typedef struct vp *vpt;

 typedef struct vc {
  int o;
  vpt py[8];
 } *vct;

 typedef struct np *npt;
 struct np {
  vct d;
  int di;
 };

 int f(npt dp)
 {
  vpt *py;

  py = &dp->d->py[dp->di];
  return (int)(py[1])->wa;
 }

 The ICE happens in tree_slp_vectorizer pass. The following is the tree
 dump just before that

 ;; Function f (f)

 f (struct np * dp)
 {
  struct vp * D.1232;
  int D.1230;
  unsigned int D.1228;
  int D.1227;
  struct vc * D.1225;

 <bb 2>:
  D.1225_2 = dp_1(D)->d;
  D.1227_4 = dp_1(D)->di;
  D.1228_5 = (unsigned int) D.1227_4;
  D.1232_9 = MEM[(struct vp * *)D.1225_2 + 4B].py[D.1228_5]{lb: 0 sz: 4};
  D.1230_10 = D.1232_9->wa;
  return D.1230_10;
 }

 The ICE happens for

  D.1232_9 = MEM[(struct vp * *)D.1225_2 + 4B].py[D.1228_5]{lb: 0 sz: 4};

 This is due to new code added in tree-data-ref.c (it was not there in
 the 4.5 series)

  if (TREE_CODE (base) == MEM_REF)
    {
      if (!integer_zerop (TREE_OPERAND (base, 1)))
        {
          if (!poffset)
            {
              double_int moff = mem_ref_offset (base);
              poffset = double_int_to_tree (sizetype, moff);
            }
          else
            poffset = size_binop (PLUS_EXPR, poffset, TREE_OPERAND (base, 1));

 This should use mem_ref_offset, too.


This is present in the trunk also. Will you be submitting a patch for this?

Shafi


ICE with MEM_REF when Pmode is different from word_mode

2012-05-29 Thread Mohamed Shafi
Hi,

I am porting a private target to GCC 4.6.3. On my target the pointer
size is 24 bits and the word size is 32 bits. Moreover, a byte is
32 bits.

For the testcase gcc.c-torture/compile/92-1.c I get the following ICE:

92-1.c: In function 'f':
92-1.c:18:5: internal compiler error: in size_binop_loc, at
fold-const.c:1436
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions

This is a reduced version of that testcase:

 struct vp {
  int wa;
};

typedef struct vp *vpt;

typedef struct vc {
  int o;
  vpt py[8];
} *vct;

typedef struct np *npt;
struct np {
  vct d;
  int di;
};

int f(npt dp)
{
  vpt *py;

  py = &dp->d->py[dp->di];
  return (int)(py[1])->wa;
}

The ICE happens in the tree_slp_vectorizer pass. The following is the
tree dump just before it:

;; Function f (f)

f (struct np * dp)
{
  struct vp * D.1232;
  int D.1230;
  unsigned int D.1228;
  int D.1227;
  struct vc * D.1225;

<bb 2>:
  D.1225_2 = dp_1(D)->d;
  D.1227_4 = dp_1(D)->di;
  D.1228_5 = (unsigned int) D.1227_4;
  D.1232_9 = MEM[(struct vp * *)D.1225_2 + 4B].py[D.1228_5]{lb: 0 sz: 4};
  D.1230_10 = D.1232_9->wa;
  return D.1230_10;
}

The ICE happens for

  D.1232_9 = MEM[(struct vp * *)D.1225_2 + 4B].py[D.1228_5]{lb: 0 sz: 4};

This is due to new code added in tree-data-ref.c (it was not there in
the 4.5 series):

  if (TREE_CODE (base) == MEM_REF)
    {
      if (!integer_zerop (TREE_OPERAND (base, 1)))
        {
          if (!poffset)
            {
              double_int moff = mem_ref_offset (base);
              poffset = double_int_to_tree (sizetype, moff);
            }
          else
            poffset = size_binop (PLUS_EXPR, poffset, TREE_OPERAND (base, 1));
        }
      base = TREE_OPERAND (base, 0);
    }
  else
    base = build_fold_addr_expr (base);

The following assert check in size_binop fails:


  gcc_assert (int_binop_types_match_p (code, TREE_TYPE (arg0),
   TREE_TYPE (arg1)));

This is because the modes of arg0 and arg1 are different: one is Pmode
and the other is word_mode.
The same failure occurs on the m32c target, which also has a Pmode
different from word_mode.
Is this a known failure? I cannot find a bug entry for this issue.
Should I report this?
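
The mismatch can be illustrated with a small toy model in C. This is only a sketch and not GCC code: struct typed_cst, toy_types_match, toy_to_sizetype and toy_size_plus are hypothetical names, standing in loosely for a typed constant, the type check that size_binop asserts on, the conversion of the MEM_REF offset to sizetype, and the addition itself. The 24-bit and 32-bit precisions are taken from the target description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a typed integer constant; this is NOT a GCC tree. */
struct typed_cst {
  unsigned precision;   /* bits in the type, e.g. 24 for the pointer type */
  bool is_sizetype;     /* true when the constant has the size type */
  long value;
};

/* Hypothetical analogue of the operand type check: both operands must
   have the same precision and the same "sizetype-ness". */
bool toy_types_match (struct typed_cst a, struct typed_cst b)
{
  return a.precision == b.precision && a.is_sizetype == b.is_sizetype;
}

/* Hypothetical analogue of converting an offset to the size type
   before it takes part in size arithmetic. */
struct typed_cst toy_to_sizetype (struct typed_cst c, unsigned size_precision)
{
  struct typed_cst r = { size_precision, true, c.value };
  return r;
}

/* Adds two constants, insisting on matching types first. */
struct typed_cst toy_size_plus (struct typed_cst a, struct typed_cst b)
{
  assert (toy_types_match (a, b));
  struct typed_cst r = { a.precision, a.is_sizetype, a.value + b.value };
  return r;
}
```

In this model, adding a 32-bit size-typed offset to a raw 24-bit pointer-typed offset trips the check, while converting the second operand first makes the addition go through.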

Regards,
Shafi


Re: Reloading going wrong. Bug in GCC?

2012-03-20 Thread Mohamed Shafi
ping !!!. Any help on http://gcc.gnu.org/ml/gcc/2011-09/msg00150.html

shafi

On 14 September 2011 15:07, Mohamed Shafi shafi...@gmail.com wrote:
 Hi,

 I am working on a 32bit private target which has the following restriction

 1. store/load can happen only through a general purpose register (GP_REGS)
 2. base register should be an address register (AD_REGS)
 3. moves between GP_REGS and AD_REGS can happen only through PT_REGS

 In a PRE_MODIFY instruction when both the base register and the output
 register gets spilled the reloading is going wrong.

 before IRA pass
 ~~~
 (insn 259 336 317 2 ../rld_bug.c:94 (set (reg:QI 234 [+1 ])
        (mem/s/j/c:QI (pre_modify:PQI (reg/f:PQI 233)
                (plus:PQI (reg/f:PQI 233)
                    (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
 (expr_list:REG_INC (reg/f:PQI 233)
        (nil)))

 after IRA pass
 ~~~
 Reloads for insn # 259
 Reload 0: GP_REGS, RELOAD_FOR_OPADDR_ADDR (opnum = 1), can't combine,
 secondary_reload_p
        reload_reg_rtx: (reg:PQI 11 g11)
 Reload 1: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
 combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 12 as0)
        secondary_in_reload = 0
 Reload 2: GP_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
 combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 11 g11)
 Reload 3: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
 combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 13 as1)
        secondary_out_reload = 2

 Reload 4: reload_in (PQI) = (reg/f:PQI 233)
        reload_out (PQI) = (reg/f:PQI 233)
        AD_REGS, RELOAD_OTHER (opnum = 1)
        reload_in_reg: (reg/f:PQI 233)
        reload_out_reg: (reg/f:PQI 233)
        reload_reg_rtx: (reg:PQI 31 a3)
        secondary_in_reload = 1, secondary_out_reload = 3

 Reload 5: reload_out (QI) = (reg:QI 234 [+1 ])
        GP_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:QI 234 [+1 ])
        reload_reg_rtx: (reg:QI 11 g11)


 (insn 744 336 745 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
        (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
                (const_int -24 [0xffe8])) [99 %sfp+8 S1
 A32])) 9 {movpqi_op} (nil))

 (insn 745 744 746 2 ../rld_bug.c:94 (set (reg:PQI 12 as0)
        (reg:PQI 11 g11)) 9 {movpqi_op} (nil))

 (insn 746 745 259 2 ../rld_bug.c:94 (set (reg:PQI 31 a3)
        (reg:PQI 12 as0)) 9 {movpqi_op} (nil))

 (insn 259 746 747 2 ../rld_bug.c:94 (set (reg:QI 11 g11)
        (mem/s/j/c:QI (pre_modify:PQI (reg:PQI 31 a3)
                (plus:PQI (reg:PQI 31 a3)
                    (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
 (expr_list:REG_INC (reg:PQI 31 a3)
        (nil)))

 (insn 747 259 748 2 ../rld_bug.c:94 (set (reg:PQI 13 as1)
        (reg:PQI 31 a3)) 9 {movpqi_op} (nil))

 (insn 748 747 749 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
        (reg:PQI 13 as1)) 9 {movpqi_op} (nil))

 (insn 749 748 750 2 ../rld_bug.c:94 (set (mem/c:PQI (plus:PQI (reg/f:PQI 32 
 sp)
                (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])
        (reg:PQI 11 g11)) 9 {movpqi_op} (nil))

 (insn 750 749 751 2 ../rld_bug.c:94 (set (mem/c:QI (plus:PQI (reg/f:PQI 32 sp)
                (const_int -29 [0xffe3])) [99 %sfp+3 S1 A32])
        (reg:QI 11 g11)) 7 {movqi_op} (nil))


 After IRA pass for insn 259 1st the modified address is stored into
 its spilled location and then the modified value is stored. As you can
 see from the instructions same register (g11) is used for Reload 5 and
 2, and hence the modified value is getting corrupted and hence the
 modified address gets stored instead of modified value (insn 749 and
 insn 750). I am not able to figure out where this is going wrong in
 the reload phase. I suspect that this is a GCC issue.

 Can some one give me some pointers to resolve this issue?

 Regards,
 Shafi


Restricting with Multilib

2012-03-07 Thread Mohamed Shafi
Hi,

The target that I am porting needs a CPU command-line option, i.e.
it doesn't have a default. Currently it takes three variants, say
cpu1, cpu2 and cpu3.

So when I enable multilib with

MULTILIB_OPTIONS = mcpu=1/mcpu=2/mcpu=3

I get the following libgcc variants:

cpu1/libgcc
cpu2/libgcc
cpu3/libgcc
libgcc

That includes one variant for each CPU and a default version. Is there
any way to stop GCC from building the default version?
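
Why four variants appear can be sketched with a toy enumeration in C. This is only an illustration of the behaviour described above, assuming that a '/'-separated set lists mutually exclusive options and that the "no option" default is always one of the variants. The function name toy_multilib_variants and the "." label for the default are made up for this sketch and are not part of GCC.

```c
#include <assert.h>
#include <string.h>

/* Splits one '/'-separated option set and writes the name of each
   variant into out[].  The first entry is always "." and stands for
   the default build with no option given.  Returns the number of
   variants written.  Toy model only. */
int toy_multilib_variants (const char *set, char out[][32], int max)
{
  int n = 0;
  if (max > 0)
    strcpy (out[n++], ".");

  const char *p = set;
  while (*p && n < max)
    {
      size_t len = strcspn (p, "/");   /* length of the next option */
      if (len > 0 && len < 32)
        {
          memcpy (out[n], p, len);
          out[n][len] = '\0';
          n++;
        }
      p += len;
      if (*p == '/')
        p++;                           /* skip the separator */
    }
  return n;
}
```

Called with "mcpu=1/mcpu=2/mcpu=3", the model yields the default plus one entry per option, which matches the four libgcc builds listed above.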

Regards,
Shafi


Reloading going wrong. Bug in GCC?

2011-09-14 Thread Mohamed Shafi
Hi,

I am working on a 32-bit private target which has the following restrictions:

1. store/load can happen only through a general purpose register (GP_REGS)
2. base register should be an address register (AD_REGS)
3. moves between GP_REGS and AD_REGS can happen only through PT_REGS
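
Restriction 3 can be written down as a tiny C function. This is a sketch only: the enum and the function toy_intermediate_class are hypothetical and do not use the signature of any GCC target hook. It assumes that every other pair of classes can be copied directly, which the list above does not state.

```c
/* Toy register classes mirroring the names used in the list above. */
enum toy_class { TOY_NO_REGS, TOY_GP_REGS, TOY_AD_REGS, TOY_PT_REGS };

/* Returns the class a value must pass through when it is copied from
   FROM to TO, or TOY_NO_REGS when a direct move is assumed to exist. */
enum toy_class toy_intermediate_class (enum toy_class from, enum toy_class to)
{
  if ((from == TOY_GP_REGS && to == TOY_AD_REGS)
      || (from == TOY_AD_REGS && to == TOY_GP_REGS))
    return TOY_PT_REGS;   /* restriction 3: go through PT_REGS */
  return TOY_NO_REGS;
}
```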

In a PRE_MODIFY instruction, when both the base register and the output
register get spilled, the reloading goes wrong.

before IRA pass
~~~
(insn 259 336 317 2 ../rld_bug.c:94 (set (reg:QI 234 [+1 ])
        (mem/s/j/c:QI (pre_modify:PQI (reg/f:PQI 233)
                (plus:PQI (reg/f:PQI 233)
                    (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
(expr_list:REG_INC (reg/f:PQI 233)
        (nil)))

after IRA pass
~~~
Reloads for insn # 259
Reload 0: GP_REGS, RELOAD_FOR_OPADDR_ADDR (opnum = 1), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 11 g11)
Reload 1: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 12 as0)
        secondary_in_reload = 0
Reload 2: GP_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 11 g11)
Reload 3: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:PQI 13 as1)
        secondary_out_reload = 2

Reload 4: reload_in (PQI) = (reg/f:PQI 233)
        reload_out (PQI) = (reg/f:PQI 233)
        AD_REGS, RELOAD_OTHER (opnum = 1)
        reload_in_reg: (reg/f:PQI 233)
        reload_out_reg: (reg/f:PQI 233)
        reload_reg_rtx: (reg:PQI 31 a3)
        secondary_in_reload = 1, secondary_out_reload = 3

Reload 5: reload_out (QI) = (reg:QI 234 [+1 ])
        GP_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:QI 234 [+1 ])
        reload_reg_rtx: (reg:QI 11 g11)


(insn 744 336 745 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
        (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
                (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])) 9 {movpqi_op} (nil))

(insn 745 744 746 2 ../rld_bug.c:94 (set (reg:PQI 12 as0)
        (reg:PQI 11 g11)) 9 {movpqi_op} (nil))

(insn 746 745 259 2 ../rld_bug.c:94 (set (reg:PQI 31 a3)
        (reg:PQI 12 as0)) 9 {movpqi_op} (nil))

(insn 259 746 747 2 ../rld_bug.c:94 (set (reg:QI 11 g11)
        (mem/s/j/c:QI (pre_modify:PQI (reg:PQI 31 a3)
                (plus:PQI (reg:PQI 31 a3)
                    (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
(expr_list:REG_INC (reg:PQI 31 a3)
        (nil)))

(insn 747 259 748 2 ../rld_bug.c:94 (set (reg:PQI 13 as1)
        (reg:PQI 31 a3)) 9 {movpqi_op} (nil))

(insn 748 747 749 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
        (reg:PQI 13 as1)) 9 {movpqi_op} (nil))

(insn 749 748 750 2 ../rld_bug.c:94 (set (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
                (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])
        (reg:PQI 11 g11)) 9 {movpqi_op} (nil))

(insn 750 749 751 2 ../rld_bug.c:94 (set (mem/c:QI (plus:PQI (reg/f:PQI 32 sp)
                (const_int -29 [0xffe3])) [99 %sfp+3 S1 A32])
        (reg:QI 11 g11)) 7 {movqi_op} (nil))


After the IRA pass, the code for insn 259 first stores the modified
address into its spill slot and then stores the loaded value. As you can
see from the instructions, the same register (g11) is used for both
Reload 5 and Reload 2, so the loaded value gets overwritten and the
modified address is stored in its place (insn 749 and insn 750). I am
not able to figure out where this goes wrong in the reload phase. I
suspect that this is a GCC issue.
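
The clobber can be replayed with a small C model of insns 744 to 750. This is a sketch under assumptions: struct toy_machine, toy_replay and the two slot indices are invented for the illustration, the stack slots %sfp+8 and %sfp+3 are reduced to two array entries, and the pre_modify is modelled as "increment the address, then load".

```c
#include <assert.h>

/* Toy stand-ins for the spill slots %sfp+8 (address) and %sfp+3 (value). */
enum { ADDR_SLOT = 0, VALUE_SLOT = 1 };

struct toy_machine {
  int g11, as0, as1, a3;   /* the hard registers used in the dump */
  int stack[2];            /* the two spill slots */
  int data[256];           /* memory addressed through a3 */
};

/* Replays the reloaded sequence from the dump, one statement per insn. */
void toy_replay (struct toy_machine *m)
{
  m->g11 = m->stack[ADDR_SLOT];    /* insn 744: load the spilled address */
  m->as0 = m->g11;                 /* insn 745 */
  m->a3 = m->as0;                  /* insn 746 */
  m->a3 += 1;                      /* insn 259: pre_modify of a3 ... */
  m->g11 = m->data[m->a3];         /* ... then the load into g11 */
  m->as1 = m->a3;                  /* insn 747 */
  m->g11 = m->as1;                 /* insn 748: overwrites the loaded value */
  m->stack[ADDR_SLOT] = m->g11;    /* insn 749: address written back */
  m->stack[VALUE_SLOT] = m->g11;   /* insn 750: address stored as the value */
}
```

With the address slot holding 100 and data[101] holding 42, the model leaves 101 in both slots, so the loaded 42 is lost, which is the behaviour reported above.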

Can someone give me some pointers to resolve this issue?

Regards,
Shafi


Issue with delay slot scheduling?

2011-09-06 Thread Mohamed Shafi
Hi,

I am doing a private port of GCC 4.5.1. On my target I see some
strange behavior in delay slot scheduling. On my target, the
instructions in the delay slots get executed irrespective of whether
the branch is taken or not. I generated the following code after
commenting out the call to 'relax_delay_slots' in the function
'dbr_schedule'.

RTL:

(insn 97 42 51 del1.c:19 (sequence [
    (jump_insn 61 42 38 del1.c:19 (set (pc)
    (if_then_else (ne (reg:CCF 34 CC)
    (const_int 0 [0x0]))
    (label_ref:PQI 86)
    (pc))) 56 {conditional_branch}
(expr_list:REG_BR_PRED (const_int 5 [0x5])
    (expr_list:REG_DEAD (reg:CCF 34 CC)
    (expr_list:REG_BR_PROB (const_int 5000 [0x1388])
    (nil))))
 -> 86)
    (insn 38 61 43 (set (mem/s/j:QI (reg/f:PQI 28 a0 [orig:62
D.1955 ] [62]) [0 bytes S1 A32])
    (reg:QI 1 g1 [orig:65 D.1938 ] [65])) 7 {movqi_op} (nil))
    (insn 43 38 51 (set (reg:QI 1 g1 [75])
    (ior:QI (reg:QI 1 g1 [orig:65 D.1938 ] [65])
    (reg:QI 3 g3 [77]))) 31 {iorqi3}
(expr_list:REG_EQUAL (ior:QI (reg:QI 1 g1 [orig:65 D.1938 ] [65])
    (const_int 128 [0x80]))
    (nil)))
    ]) -1 (nil))

(code_label 51 97 52 1  [2 uses])

(note 52 51 73 [bb 4] NOTE_INSN_BASIC_BLOCK)

(jump_insn 73 52 72 (return) 72 {return_rts} (expr_list:REG_BR_PRED
(const_int 12 [0xc])
    (nil)))

(barrier 72 73 86)

(code_label 86 72 41 5  [1 uses])

(note 41 86 45 [bb 5] NOTE_INSN_BASIC_BLOCK)

(insn 45 41 44 del1.c:20 (set (reg:QI 2 g2 [orig:68 ivtmp.7 ] [68])
    (plus:QI (reg:QI 2 g2 [orig:68 ivtmp.7 ] [68])
    (const_int 1 [0x1]))) 13 {addqi3} (nil))

(insn 44 45 101 del1.c:20 (set (mem/s/j:QI (reg/f:PQI 28 a0 [orig:62
D.1955 ] [62]) [0 bytes S1 A32])
    (reg:QI 1 g1 [75])) 7 {movqi_op} (expr_list:REG_DEAD
(reg/f:PQI 28 a0 [orig:62 D.1955 ] [62])
    (expr_list:REG_DEAD (reg:QI 1 g1 [75])
    (nil))))

(code_label 101 44 79 7  [1 uses])


Corresponding code:

jmp.ne  .L5;
st  [a0], g1; (INSN 38)
or  g1, g1, g3;  (INSN 43)
.L1:
rts;
nop;
nop;
.L5:
add   g2, g2, 1;   (INSN 45)
st  [a0], g1;(INSN 44)  - deleted
.L7:



You can see that INSN 44 and INSN 38 are identical. In
'relax_delay_slots', while processing INSN 97, the second call to
'try_merge_delay_insns' deletes INSN 44, which produces an
incorrect result.

  /* If we own the thread opposite the way this insn branches, see if we
     can merge its delay slots with following insns.  */
  if (INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1))
      && own_thread_p (NEXT_INSN (insn), 0, 1))
    try_merge_delay_insns (insn, next);
  else if (! INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1))
           && own_thread_p (target_label, target_label, 0))
    try_merge_delay_insns (insn, next_active_insn (target_label));

Deleting INSN 44 would have been correct if the second delay slot insn
had not modified g1. But look at the comment on the function
'try_merge_delay_insns':

/* Try merging insns starting at THREAD which match exactly the insns in
   INSN's delay list.

   If all insns were matched and the insn was previously annulling, the
   annul bit will be cleared.

   For each insn that is merged, if the branch is or will be non-annulling,
   we delete the merged insn.  */

I think REGOUT dependency of g1 between instructions 38 and 43 in the
delay slot is not being considered by 'try_merge_delay_insns'.

Is this a bug?
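
The effect of the deletion on the taken-branch path can be checked with a small C model. This is a sketch only: toy_taken_path is a made-up function, the memory word at [a0] is reduced to one variable, and insn 45 is left out because it only touches g2.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns the word stored at [a0] after the branch to .L5 is taken.
   KEEP_INSN_44 selects whether the store at .L5 is still present. */
unsigned toy_taken_path (unsigned g1, unsigned g3, bool keep_insn_44)
{
  unsigned mem_a0;

  mem_a0 = g1;      /* insn 38, first delay slot:  st [a0], g1 */
  g1 = g1 | g3;     /* insn 43, second delay slot: or g1, g1, g3 */

  /* .L5: insn 45 (add g2, g2, 1) does not affect the result. */
  if (keep_insn_44)
    mem_a0 = g1;    /* insn 44: st [a0], g1, now with the updated g1 */

  return mem_a0;
}
```

With g1 = 0x01 and g3 = 0x80, keeping insn 44 leaves 0x81 at [a0], while deleting it leaves 0x01, so the two stores are not interchangeable once insn 43 sits between them.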

Regards,
Shafi


Re: Issue with delay slot scheduling?

2011-09-06 Thread Mohamed Shafi
On 6 September 2011 20:50, Jeff Law l...@redhat.com wrote:

 On 09/06/11 08:46, Mohamed Shafi wrote:
 Hi,

 I am doing a private port in GCC 4.5.1. For the my target i see some
 strange behavior in delay slot scheduling. For my target the
 instruction in the delay slots gets executed irrespective of whether
 the branch is taken or not. I have generated the following code
 after commenting out the call to 'relax_delay_slots' in the function
 'dbr_schedule'.
 [ ... ]
 It looks like you have found a bug.  While reorg.c is supposed to work
 with targets that have multiple delay slots, it's not something that has
 been extensively tested.


 I think REGOUT dependency of g1 between instructions 38 and 43 in
 the delay slot is not being considered by 'try_merge_delay_insns'.
 You're probably correct.

 Jeff

How do I raise a bug report, given that mine is a private target?

Regards,
Shafi


How to generate loop counter with a different mode ?

2011-05-16 Thread Mohamed Shafi
Hi all,

I am trying to add support for hardware loops for a 32-bit target. On
the target, QImode is 32 bits. The loop counter used in the hardware loop
construct is a 17-bit address register, which is represented using
PQImode. Since the mode for the doloop pattern is determined after loop
discovery, it need not always be PQImode. So what I did was convert
the counter variable to PQImode and then emit a new pattern in PQImode,
along with the other bells and whistles required by the target loop
construct. I am able to generate assembly files with the proper loop
initialization instructions. But the issue is that the loop counter
is set to 0 in the body of the loop.

In the define_expand for doloop_end and doloop_begin I convert to
PQImode using the following construct:

operands[0] = convert_to_mode (PQImode, operands[0], 0);

So the above construct will result in an rtl pattern like:

(insn 33 17 34 4 loop.c:52 (set (reg:PQI 50)
(truncate:PQI (reg:QI 49))) -1 (nil))

But GCC will extract the loop counter from the define_expand generated
doloop pattern, which is in PQImode.

(insn 33 17 34 4 loop.c:52 (set (reg:PQI 50)
(truncate:PQI (reg:QI 49))) -1 (nil))

(jump_insn 34 33 20 4 loop.c:52 (parallel [
(set (pc)
(if_then_else (ne (reg:PQI 50)
(const_int 1 [0x1]))
(label_ref:PQI 30)
(pc)))
(set (reg:PQI 50)
(plus:PQI (reg:PQI 50)
(const_int -1 [0x])))
(unspec [
(const_int 0 [0x0])
] 3)
(clobber (scratch:PQI))
]) 62 {doloop_end_pqi} (expr_list:REG_BR_PROB (const_int 9100 [0x238c])
(nil))
 -> 30)


This is the counter that gets used for doloop_begin. Hence the
original loop counter (reg:QI 49) never gets initialized. Because of
this, the 'if-conversion' pass changes the statement to:

(insn 33 38 34 4 loop.c:52 (set (reg:PQI 50)
(const_int 0 [0x0])) 9 {movpqi_op} (nil))

This results in the loop counter being set to 0 in the body of the loop.
Can someone suggest a way out of this?

Regards,
Shafi


Reloading an auto-increment addresses

2011-02-11 Thread Mohamed Shafi
Hello all,

I am porting GCC 4.5.1 to a private target. For one particular test the
reload pass is being asked to reload the following instruction:

(insn 45 175 46 11 pr20601-1.c:90 (set (reg/f:PQI 3 g3 [70])
(mem/f:PQI (pre_inc:PQI (reg/f:PQI 1 g1 [orig:55 prephitmp.16
] [55])) [2 S1 A32])) 9 {movpqi_op} (expr_list:REG_INC (reg/f:PQI 1 g1
[orig:55 prephitmp.16 ] [55])
(nil)))

The address in this insn is invalid: the base address should always be
held in an address register. The instruction gets reloaded in the
following manner:

(insn 175 43 202 11 pr20601-1.c:90 (set (reg/f:PQI 1 g1 [orig:55
prephitmp.16 ] [55])
(reg/f:PQI 12 as0 [orig:49 e.4 ] [49])) 9 {movpqi_op} (nil))

(insn 202 175 203 11 pr20601-1.c:90 (set (reg/f:PQI 1 g1 [orig:55
prephitmp.16 ] [55])
(plus:PQI (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55])
(const_int 1 [0x1]))) 14 {addpqi3} (nil))

(insn 203 202 45 11 pr20601-1.c:90 (set (reg:PQI 28 a0)
(reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55])) 9 {movpqi_op} (nil))

(insn 45 203 46 11 pr20601-1.c:90 (set (reg/f:PQI 3 g3 [70])
(mem/f:PQI (reg:PQI 28 a0) [2 S1 A32])) 9 {movpqi_op} (nil))

The issue with this reload is that there is no move operation between
GP registers and address registers, so insn 203 is invalid. I am
catching these cases with secondary reloads, but auto-increment
addressing modes are not handled there. If I try to handle them in
TARGET_SECONDARY_RELOAD, I get an assert failure from
reload1.c:emit_input_reload_insns() due to the following code:

  /* Auto-increment addresses must be reloaded in a special way.  */
  if (rl->out && ! rl->out_reg)
    {
      /* We are not going to bother supporting the case where a
         incremented register can't be copied directly from
         OLDEQUIV since this seems highly unlikely.  */
      gcc_assert (rl->secondary_in_reload < 0);

How can I overcome this failure? Can someone suggest a solution?

Thanks for the help.

Regards,
Shafi


Re: Reloading an auto-increment addresses

2011-02-11 Thread Mohamed Shafi
On 11 February 2011 15:28, Paulo J. Matos pocma...@gmail.com wrote:


 On 11/02/11 09:46, Mohamed Shafi wrote:

 How can i overcome this failure?  Can some one suggest a solution?



 Have you defined TARGET_LEGITIMATE_ADDRESS_P and also BASE_REG_CLASS
 correctly for your target?



Yes, I have. The register allocator is allocating the wrong registers
for the base registers. This is probably because address registers
cannot be saved and restored directly; a secondary reload is
required. There is also the restriction that there is no move
operation between address registers, and for that a secondary
reload is also required. (I know it's weird.) I am trying to figure out
why the register allocator is not assigning a base register. But even
then, reload could be asked to reload an auto-increment address.

Shafi


Re: ICE in get_constraint_for_component_ref

2011-02-10 Thread Mohamed Shafi
On 10 February 2011 15:57, Richard Guenther richard.guent...@gmail.com wrote:
 On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi shafi...@gmail.com wrote:
 Hi all,

 I am trying to port a private target in GCC 4.5.1. Following are the
 properties of the target

 #define BITS_PER_UNIT           32
 #define BITS_PER_WORD        32
 #define UNITS_PER_WORD       1


 #define CHAR_TYPE_SIZE        32
 #define SHORT_TYPE_SIZE       32
 #define INT_TYPE_SIZE         32
 #define LONG_TYPE_SIZE        32
 #define LONG_LONG_TYPE_SIZE   32



 I am getting an ICE
 internal compiler error: in get_constraint_for_component_ref, at
 tree-ssa-structalias.c:3031

 For the following testcase:

 struct fb_cmap {
  int start;
  int len;
  int *green;
 };

 extern struct fb_cmap fb_cmap;

 void directcolor_update_cmap(void)
 {
  fb_cmap.green[0] = 34;
 }

 The following is the output of debug_tree of the argument thats given
 for the function get_constraint_for_component_ref

 component_ref 0x2b6a45618080
    type pointer_type 0x2b6a45559930
        type integer_type 0x2b6a4554a498 int public QI
            size integer_cst 0x2b6a4553c460 constant 32
            unit size integer_cst 0x2b6a4553c488 constant 1
            align 32 symtab 0 alias set -1 canonical type
 0x2b6a4554a498 precision 32 min integer_cst 0x2b6a4553c5c8
 -2147483648 max integer_cst 0x2b6a4553c5f0 2147483647
            pointer_to_this pointer_type 0x2b6a45559930
        unsigned PQI size integer_cst 0x2b6a4553c460 32 unit size
 integer_cst 0x2b6a4553c488 1
        align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930

    arg 0 var_decl 0x2b6a45614000 fb_cmap
        type record_type 0x2b6a45602888 fb_cmap type_0 BLK
            size integer_cst 0x2b6a455fc4d8 constant 96
            unit size integer_cst 0x2b6a455fc488 constant 3
            align 32 symtab 0 alias set -1 canonical type
 0x2b6a45602888 fields field_decl 0x2b6a45613000 start context
 translation_unit_decl 0x2b6a4555f7e8 D.1201
            chain type_decl 0x2b6a4555f730 D.1193
        used public external common BLK file pr28675.c line 7 col 23
 size integer_cst 0x2b6a455fc4d8 96 unit size integer_cst
 0x2b6a455fc488 3
        align 32
        chain function_decl 0x2b6a45616000 directcolor_update_cmap
 type function_type 0x2b6a45560888
            public static QI file pr28675.c line 9 col 6 align 32
 initial block 0x2b6a45619000 result result_decl 0x2b6a45617000
 D.1200
            (mem:QI (symbol_ref:PQI (directcolor_update_cmap) [flags
 0x3] function_decl 0x2b6a45616000 directcolor_update_cmap) [0 S1
 A32])
            struct-function 0x2b6a455453f0
    arg 1 field_decl 0x2b6a45613130 green type pointer_type 0x2b6a45559930
        unsigned PQI file pr28675.c line 4 col 7 size integer_cst
 0x2b6a4553c460 32 unit size integer_cst 0x2b6a4553c488 1
        align 32 offset_align 32
        offset integer_cst 0x2b6a4553c8c0 constant 2
        bit offset integer_cst 0x2b6a4553cc80 constant 0 context
 record_type 0x2b6a45602888 fb_cmap
    pr28675.c:11:10

 I was wondering if this ICE is due to the fact that this is a 32bit
 char target ? Can somebody help me with pointers to debug this issue?

 Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.


That did the trick. Looking at the code, I assume that this is the
proper fix and hence should be committed to the trunk and the 4.5
branch. Will that be done?
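
The difference the suggested change makes can be worked through with the numbers from the debug_tree output quoted above (field 'green': offset 2 units, bit offset 0; int size 32 bits). The helper below is a toy sketch and not the GCC function: toy_bitpos and its parameters are hypothetical names, and it only models "unit offset times bits per unit, plus bit offset".

```c
#include <assert.h>

/* Toy version of a field bit-position computation: the offset in
   addressable units is scaled by the number of bits per unit and the
   remaining bit offset is added. */
unsigned long toy_bitpos (unsigned long unit_offset,
                          unsigned long bit_offset,
                          unsigned bits_per_unit)
{
  return unit_offset * bits_per_unit + bit_offset;
}
```

With bits_per_unit set to 32, 'green' lands at bit 64, after the two 32-bit ints. With a hard-coded 8 it would land at bit 16, inside the first field, which gives a picture of the structure layout that does not match the 96-bit record.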

Shafi


Re: ICE in get_constraint_for_component_ref

2011-02-10 Thread Mohamed Shafi
On 10 February 2011 17:16, Richard Guenther richard.guent...@gmail.com wrote:
 On Thu, Feb 10, 2011 at 12:42 PM, Mohamed Shafi shafi...@gmail.com wrote:
 On 10 February 2011 15:57, Richard Guenther richard.guent...@gmail.com 
 wrote:
 On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi shafi...@gmail.com wrote:
 Hi all,

 I am trying to port a private target in GCC 4.5.1. Following are the
 properties of the target

 #define BITS_PER_UNIT           32
 #define BITS_PER_WORD        32
 #define UNITS_PER_WORD       1


 #define CHAR_TYPE_SIZE        32
 #define SHORT_TYPE_SIZE       32
 #define INT_TYPE_SIZE         32
 #define LONG_TYPE_SIZE        32
 #define LONG_LONG_TYPE_SIZE   32



 I am getting an ICE
 internal compiler error: in get_constraint_for_component_ref, at
 tree-ssa-structalias.c:3031

 For the following testcase:

 struct fb_cmap {
  int start;
  int len;
  int *green;
 };

 extern struct fb_cmap fb_cmap;

 void directcolor_update_cmap(void)
 {
  fb_cmap.green[0] = 34;
 }

 The following is the output of debug_tree of the argument thats given
 for the function get_constraint_for_component_ref

 component_ref 0x2b6a45618080
    type pointer_type 0x2b6a45559930
        type integer_type 0x2b6a4554a498 int public QI
            size integer_cst 0x2b6a4553c460 constant 32
            unit size integer_cst 0x2b6a4553c488 constant 1
            align 32 symtab 0 alias set -1 canonical type
 0x2b6a4554a498 precision 32 min integer_cst 0x2b6a4553c5c8
 -2147483648 max integer_cst 0x2b6a4553c5f0 2147483647
            pointer_to_this pointer_type 0x2b6a45559930
        unsigned PQI size integer_cst 0x2b6a4553c460 32 unit size
 integer_cst 0x2b6a4553c488 1
        align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930

    arg 0 var_decl 0x2b6a45614000 fb_cmap
        type record_type 0x2b6a45602888 fb_cmap type_0 BLK
            size integer_cst 0x2b6a455fc4d8 constant 96
            unit size integer_cst 0x2b6a455fc488 constant 3
            align 32 symtab 0 alias set -1 canonical type
 0x2b6a45602888 fields field_decl 0x2b6a45613000 start context
 translation_unit_decl 0x2b6a4555f7e8 D.1201
            chain type_decl 0x2b6a4555f730 D.1193
        used public external common BLK file pr28675.c line 7 col 23
 size integer_cst 0x2b6a455fc4d8 96 unit size integer_cst
 0x2b6a455fc488 3
        align 32
        chain function_decl 0x2b6a45616000 directcolor_update_cmap
 type function_type 0x2b6a45560888
            public static QI file pr28675.c line 9 col 6 align 32
 initial block 0x2b6a45619000 result result_decl 0x2b6a45617000
 D.1200
            (mem:QI (symbol_ref:PQI (directcolor_update_cmap) [flags
 0x3] function_decl 0x2b6a45616000 directcolor_update_cmap) [0 S1
 A32])
            struct-function 0x2b6a455453f0
    arg 1 field_decl 0x2b6a45613130 green type pointer_type 
 0x2b6a45559930
        unsigned PQI file pr28675.c line 4 col 7 size integer_cst
 0x2b6a4553c460 32 unit size integer_cst 0x2b6a4553c488 1
        align 32 offset_align 32
        offset integer_cst 0x2b6a4553c8c0 constant 2
        bit offset integer_cst 0x2b6a4553cc80 constant 0 context
 record_type 0x2b6a45602888 fb_cmap
    pr28675.c:11:10

 I was wondering if this ICE is due to the fact that this is a 32bit
 char target ? Can somebody help me with pointers to debug this issue?

 Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.


 That did the trick. Looking at the code i assume that this is proper
 and hence should be committed in the trunk and 4.5 branch.  Will that
 be done?

 I'll include it in one of my next bootstraps/tests and commit it.

Thanks Richard :)

Shafi


ICE in get_constraint_for_component_ref

2011-02-09 Thread Mohamed Shafi
Hi all,

I am trying to port a private target to GCC 4.5.1. The following are the
properties of the target:

#define BITS_PER_UNIT           32
#define BITS_PER_WORD           32
#define UNITS_PER_WORD          1


#define CHAR_TYPE_SIZE          32
#define SHORT_TYPE_SIZE         32
#define INT_TYPE_SIZE           32
#define LONG_TYPE_SIZE          32
#define LONG_LONG_TYPE_SIZE     32



I am getting an ICE
internal compiler error: in get_constraint_for_component_ref, at
tree-ssa-structalias.c:3031

For the following testcase:

struct fb_cmap {
 int start;
 int len;
 int *green;
};

extern struct fb_cmap fb_cmap;

void directcolor_update_cmap(void)
{
  fb_cmap.green[0] = 34;
}

The following is the debug_tree output for the argument that is passed
to the function get_constraint_for_component_ref:

component_ref 0x2b6a45618080
type pointer_type 0x2b6a45559930
type integer_type 0x2b6a4554a498 int public QI
size integer_cst 0x2b6a4553c460 constant 32
unit size integer_cst 0x2b6a4553c488 constant 1
align 32 symtab 0 alias set -1 canonical type
0x2b6a4554a498 precision 32 min integer_cst 0x2b6a4553c5c8
-2147483648 max integer_cst 0x2b6a4553c5f0 2147483647
pointer_to_this pointer_type 0x2b6a45559930
unsigned PQI size integer_cst 0x2b6a4553c460 32 unit size
integer_cst 0x2b6a4553c488 1
align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930

arg 0 var_decl 0x2b6a45614000 fb_cmap
type record_type 0x2b6a45602888 fb_cmap type_0 BLK
size integer_cst 0x2b6a455fc4d8 constant 96
unit size integer_cst 0x2b6a455fc488 constant 3
align 32 symtab 0 alias set -1 canonical type
0x2b6a45602888 fields field_decl 0x2b6a45613000 start context
translation_unit_decl 0x2b6a4555f7e8 D.1201
chain type_decl 0x2b6a4555f730 D.1193
used public external common BLK file pr28675.c line 7 col 23
size integer_cst 0x2b6a455fc4d8 96 unit size integer_cst
0x2b6a455fc488 3
align 32
chain function_decl 0x2b6a45616000 directcolor_update_cmap
type function_type 0x2b6a45560888
public static QI file pr28675.c line 9 col 6 align 32
initial block 0x2b6a45619000 result result_decl 0x2b6a45617000
D.1200
(mem:QI (symbol_ref:PQI (directcolor_update_cmap) [flags
0x3] function_decl 0x2b6a45616000 directcolor_update_cmap) [0 S1
A32])
struct-function 0x2b6a455453f0
arg 1 field_decl 0x2b6a45613130 green type pointer_type 0x2b6a45559930
unsigned PQI file pr28675.c line 4 col 7 size integer_cst
0x2b6a4553c460 32 unit size integer_cst 0x2b6a4553c488 1
align 32 offset_align 32
offset integer_cst 0x2b6a4553c8c0 constant 2
bit offset integer_cst 0x2b6a4553cc80 constant 0 context
record_type 0x2b6a45602888 fb_cmap
pr28675.c:11:10

I was wondering if this ICE is due to the fact that this is a 32-bit
char target. Can somebody give me pointers for debugging this issue?

Regards,
Shafi


Re: Help with reloading

2010-12-20 Thread Mohamed Shafi
On 20 December 2010 10:56, Jeff Law l...@redhat.com wrote:
 On 12/15/10 07:14, Mohamed Shafi wrote:

 Hi,

 I am doing a port in GCC 4.5.1.
 The target supports storing immediate values into memory location
 represented by a symbolic address. So in the move pattern i have given
 constraints to represent this.

 Presumably the target does not support storing an immediate value into other
 MEMs?  ie, the only store-immediate is to a symbolic memory operand, right?


yes you are right.

 I think this is a case where you're going to need a secondary reload to
 force the immediate into a register if the destination is a non-symbolic MEM
 or a pseudo without a hard reg and its equivalent address is non-symbolic.

I am not sure how i should be implementing this.
Currently in define_expand for move i have code to force the
immediate value into a register if the destination is not a symbolic
address. If i understand correctly this is the only place where i can
decide what to do with the source depending on the destination. right?

Moreover for the pattern

(insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 12 as0
[69]) [0 S1 A32])
   (reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 12 as0 [69])
   (expr_list:REG_EQUAL (const_int 0 [0x0])
   (nil))))

the src operand (reg 93) gets replaced by its constant equivalent in

  /* This is equivalent to calling find_reloads_toplev.
     The code is duplicated for speed.
     When we find a pseudo always equivalent to a constant,
     we replace it by the constant.  We must be sure, however,
     that we don't try to replace it in the insn in which it
     is being set.  */
  int regno = REGNO (recog_data.operand[i]);
  if (reg_equiv_constant[regno] != 0
      && (set == 0 || &SET_DEST (set) != recog_data.operand_loc[i]))
    {
      /* Record the existing mode so that the check if constants are
         allowed will work when operand_mode isn't specified.  */

      if (operand_mode[i] == VOIDmode)
        operand_mode[i] = GET_MODE (recog_data.operand[i]);

      substed_operand[i] = recog_data.operand[i]
        = reg_equiv_constant[regno];
    }

and since the destination is already selected for reload

  /* If the address was already reloaded,
     we win as well.  */
  else if (MEM_P (operand)
           && address_reloaded[i] == 1)
    win = 1;

the reload phase never reaches secondary reload.
So i do not understand your answer. Could you explain it briefly?

Regards,
Shafi


Re: Help with reloading

2010-12-20 Thread Mohamed Shafi
On 20 December 2010 19:30, Jeff Law l...@redhat.com wrote:
 On 12/20/10 01:47, Mohamed Shafi wrote:


 I think this is a case where you're going to need a secondary reload to
 force the immediate into a register if the destination is a non-symbolic
 MEM
 or a pseudo without a hard reg and its equivalent address is
 non-symbolic.

     I am not sure how i should be implementing this.
     Currently in define_expand for move i have code to force the
 immediate value into a register if the destination is not a symbolic
 address. If i understand correctly this is the only place where i can
 decide what to do with the source depending on the destination. right?

 Just changing the movxx expander is not sufficient since for this case you
 do not know until reload time whether or not a particular insn needs an
 extra register to implement the move.   That's the whole point of the
 secondary reload mechanism -- to allow you to allocate a scratch register
 during reloading to handle oddball cases like this.


 In your secondary reload code you'll need to check for the case where the
 destination is a MEM and the source is an unallocated pseudo with a constant
 equivalent and return a suitable register class for that case.
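
The check described above can be pictured with a toy model. This is only a sketch in plain C, not GCC code: the enum, the function name and the boolean parameters are hypothetical stand-ins for what a port's secondary-reload hook would inspect, and the real hook signature is not reproduced here. It assumes, as stated earlier in the thread, that store-immediate exists only for symbolic addresses.

```c
#include <stdbool.h>

/* Hypothetical register classes for the toy model.  */
enum toy_class { TOY_NO_REGS, TOY_DATA_REGS };

/* Return the class of scratch register needed to perform a store,
   or TOY_NO_REGS when the store can be done directly.  All four
   parameters are illustrative; a real port would derive them from
   the rtx operands it is handed.  */
enum toy_class
toy_scratch_class_for_store (bool dest_is_mem, bool dest_addr_is_symbolic,
                             bool src_is_immediate,
                             bool src_is_pseudo_with_const_equiv)
{
  /* A literal constant, or an unallocated pseudo that reload will
     replace by its constant equivalent.  */
  bool src_ends_up_constant = src_is_immediate
                              || src_is_pseudo_with_const_equiv;

  /* Store-immediate only works for symbolic addresses, so for any
     other memory destination the constant must go through a register.  */
  if (dest_is_mem && !dest_addr_is_symbolic && src_ends_up_constant)
    return TOY_DATA_REGS;

  return TOY_NO_REGS;
}
```

The point of the sketch is that the decision needs both operands at once, which is exactly the information the move expander does not have before register allocation.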

   Jeff, thanks for the reply.
   I didn't know that you could do that in the TARGET_SECONDARY_RELOAD
hook. Can you point me to some target that does this, i.e. figures out
what the destination is based on the source or vice versa? In my case
only the address operand comes into the TARGET_SECONDARY_RELOAD hook
during the reload pass. I am not sure how to find the source of
the pattern which has this particular address as the destination.

Sorry for the trouble.

Shafi


Help with reloading

2010-12-15 Thread Mohamed Shafi
Hi,

I am doing a port in GCC 4.5.1.
The target supports storing immediate values into memory location
represented by a symbolic address. So in the move pattern i have given
constraints to represent this.

(define_insn "movqi_op"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=!Q,!Q,d,d,d,d,d,d,d,Q,R,S")
        (match_operand:QI 1 "general_operand"      "I,J,i,W,J,d,Q,R,S,d,d,d"))]
  ""
  "@
  st.s32\t%0, %1;
  st.u32\t%0, %1;
  set\t%0, %1;
  set.u32\t%0, %1;
  set.u32\t%0, %1;
  move\t%0, %1;
  ld%u0\t%0, %1;
  ld%u0\t%0, %1;
  ld%u0\t%0, %1;
  st%u0\t%0, %1;
  st%u0\t%0, %1;
  st%u0\t%0, %1;"
 )

where
Q represents symbolic address,
R represents all address formed using SP
S represents all address formed using address registers
I, J,W,i represents various const_ints
d represents general registers.


Whenever reload gets a pattern that stores a const_int to a memory location
that is scheduled for reloading, the reload pass will match it with the Q
constraint. To avoid this i added the constraint modifier '!' to
'Q'. But even then there is one particular case that causes trouble.
This happens when the reload pass gets a pattern where the destination is
an illegal address and the source is a pseudo register (no hard register
allocated) for which reg_equiv_constant[regno] != 0.

Before IRA pass:

(insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 69) [0 S1 A32])
(reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 69)
(expr_list:REG_EQUAL (const_int 49 [0x31])
(nil))))

Just before reloading phase:

(insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 12 as0
[69]) [0 S1 A32])
(reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 12 as0 [69])
(expr_list:REG_EQUAL (const_int 0 [0x0])
(nil))))

Since reg 93 is not allocated a hard register, it is replaced with
reg_equiv_constant[regno], and this combination wins the (Q, I)
constraint pair. In that process 'losers' (a variable in the loop over
alternatives) becomes 0, so the loop breaks out and returns. Due to this
the compiler crashes with an "insn does not satisfy its constraints" error.
Any pointers for fixing this?

Regards,
Shafi

P.S. When can we merge constraints? What are the criteria for deciding
which constraints to merge?


Re: Help with reloading FP + offset addressing mode

2010-11-24 Thread Mohamed Shafi
On 30 October 2010 05:45, Joern Rennecke joern.renne...@embecosm.com wrote:
 Quoting Mohamed Shafi shafi...@gmail.com:

 On 29 October 2010 00:06, Joern Rennecke joern.renne...@embecosm.com
 wrote:

 Quoting Mohamed Shafi shafi...@gmail.com:

 Hi,

 I am doing a port in GCC 4.5.1. For the port

 1. there is only (reg + offset) addressing mode only when reg is SP.
 Other base registers are not allowed
 2. FP cannot be used as a base register. (FP based addressing is done
 by copying it into a base register)
 In order to take advantage of FP elimination (this will create SP +
 offset addressing), what i did the following

 1. Created a new register class (address registers + FP) and used this
 new class as the BASE_REG_CLASS

 Stop right there.  You need to distinguish between FRAME_POINTER_REGNUM
 and HARD_FRAME_POINTER_REGNUM.


 From the description given in the internals, i am not able to
 understand why you suggested this. Could you please explain this?

 In order to trigger reloading of the address, you have to have a register
 elimination, even if the stack pointer is not a suitable destination
 for the elimination.  Also, if you want reload to do the work for you,
 you must not lie to it about the addressing capabilities of an actual hard
 register.  Hence, you need separate hard and soft frame pointers.

 If you have them, but conflate them when you describe what you are doing
 in your port, you are not only likely to confuse the listener/reader,
 but also your documentation, your code, and ultimately yourself.


Having a FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM will
trigger reloading of address. But for the following pattern

(insn 3 2 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg/f:QI 35 SFP)
 (const_int 1 [0x1])) [0 c+0 S1 A32])
(reg:QI 0 g0 [ c ])) 7 {movqi_op} (nil))

where SFP is FRAME_POINTER_REGNUM, an elimination will result in

(insn 3 2 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg/f:QI 27 as15)
 (const_int 1 [0x1])) [0 c+0 S1 A32])
(reg:QI 0 g0 [ c ])) 7 {movqi_op} (nil))

where as15 is the HARD_FRAME_POINTER_REGNUM. But remember this new
address is not valid (as only SP is allowed in this addressing mode).
When the above pattern is reloaded i get:

(insn 28 27 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg:QI 28 a0)
 (const_int 1 [0x1])) [0 c+0 S1 A32])
  (reg:QI 3 g3)) -1 (nil))

I get an unrecognizable insn ICE, because this addressing mode is not
valid. I believe this happens because when the reload pass gets an
address of the form (reg + off), it assumes that the address is
invalid due to one of the following:

1. 'reg' is not a suitable base register
2. the offset is out of range
3. the address has an eliminatable register as a base register.

Is there any way to overcome this?

Any help is appreciated.

Shafi


A question about combining constraints

2010-11-12 Thread Mohamed Shafi
Hi,

For a private target that i am porting in GCC 4.5 I have the following
pattern in my md file for call value:


(define_insn "call_value_op"
  [(set (match_operand 0 "register_operand" "=da")
        (call (mem:QI (match_operand:QI 1 "call_operand" "Wd"))
              (match_operand:QI 2 "" "")))]
  ""
  "jsr\\t%1"
  [(set_attr "slottable" "has_slot")]
)

All the constraints are one letter constraints for my target. Here 'W'
is for symbol_ref and all others are register constraints. So for a
particular combination when operand 0 is 'a' and operand 1 is 'W' i
got the following ICE :

error: unable to generate reloads for:
(call_insn 11 4 12 2 test.c:7 (set (reg:QI 12 as0)
(call (mem:QI (symbol_ref:QI (malloc) [flags 0x41]
function_decl 0x2b5733ff3600 __builtin_malloc) [0 S1 A32])
(const_int 0 [0x0]))) 50 {call_value_op}
(expr_list:REG_DEAD (reg:QI 0 g0)
(expr_list:REG_EH_REGION (const_int 0 [0x0])
(nil)))
(expr_list:REG_DEP_TRUE (use (reg:QI 0 g0))
(nil)))

I get this ICE because the constraints are not matched properly. The ICE
goes away when i write the constraints as:

=ad, Wd

or

a,a,d,d, , W,W,d,d

So i have the following questions:

1. Why are the constraints not matched here?
2. When can i combine constraints?

Regards,
Shafi


Re: A question about combining constraints

2010-11-12 Thread Mohamed Shafi
On 12 November 2010 18:39, Joern Rennecke amyl...@spamcop.net wrote:
 Quoting Mohamed Shafi shafi...@gmail.com:

 So i have the following questions:

 1. Why is that constraints are not matched here?

 Please read the node Register Classes in doc/tm.texi .


I am sorry, could you please highlight the relevant portion for me?
In the pattern that i have given, the combination (a,W) satisfies the
pattern. But it is not matched because i have written the constraints as
(da,Wd). I know that we can combine the constraints together.
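
One possible reading of the pointer to the Register Classes node: as far as I recall, that node asks a port to define a class for the union of two classes whenever some instruction allows both, because reload works with a single class per operand and alternative. The toy below is a sketch in plain C under that assumption, not GCC code. The register numbering, the two class tables and the function names are all hypothetical; it only shows how the smallest class containing both 'd' and 'a' registers changes depending on whether a union class is defined.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file: bits 0-3 are data registers ('d'), bits 4-7 are
   address registers ('a'), bits 8-11 are other registers.  */

/* Port without a union class: NO_REGS, DATA, ADDR, ALL.  */
const uint32_t toy_without_union[4] = { 0x000, 0x00f, 0x0f0, 0xfff };

/* Port with a union class: NO_REGS, DATA, ADDR, DATA_OR_ADDR, ALL.  */
const uint32_t toy_with_union[5] = { 0x000, 0x00f, 0x0f0, 0x0ff, 0xfff };

/* Number of registers in a class mask.  */
static int
count_bits (uint32_t mask)
{
  int n = 0;
  while (mask)
    {
      n += (int) (mask & 1);
      mask >>= 1;
    }
  return n;
}

/* Index of the smallest class whose register set contains WANTED,
   or -1 if there is none.  */
int
smallest_class_containing (const uint32_t *masks, int n_classes,
                           uint32_t wanted)
{
  assert (masks != 0 && n_classes > 0);
  int best = -1;
  for (int i = 0; i < n_classes; i++)
    if ((masks[i] & wanted) == wanted
        && (best < 0 || count_bits (masks[i]) < count_bits (masks[best])))
      best = i;
  return best;
}
```

With the first table the "da" union (0x0ff) can only be represented by the class covering every register, including ones neither letter allows; with the second table it maps to exactly the registers the constraint names. Whether this is what triggers the ICE on this port is not confirmed in the thread.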

Shafi


Opinion on a hardware feature for conditional instructions

2010-11-09 Thread Mohamed Shafi
Hi all,

I need an opinion on a design question. I am doing a port for a private
target in GCC 4.5.1. We are also in the process of designing the
hardware along with the development of the build tools. Currently we
don't have enough bits in the encoding to support conditional
instructions the way ARM does, i.e. with the option to decide whether
to update the status flags or not. So what is the next best thing to
have?

1. Allow both conditional and non-conditional instructions to update
the status flags
2. Allow only non-conditional instructions to update the status flags

Could you please let me know your thoughts on this and the reason for
choosing it?

Regards,
Shafi


Re: Help with reloading FP + offset addressing mode

2010-11-02 Thread Mohamed Shafi
On 30 October 2010 05:45, Joern Rennecke joern.renne...@embecosm.com wrote:
 Quoting Mohamed Shafi shafi...@gmail.com:

 On 29 October 2010 00:06, Joern Rennecke joern.renne...@embecosm.com
 wrote:

 Quoting Mohamed Shafi shafi...@gmail.com:

 Hi,

 I am doing a port in GCC 4.5.1. For the port

 1. there is only (reg + offset) addressing mode only when reg is SP.
 Other base registers are not allowed
 2. FP cannot be used as a base register. (FP based addressing is done
 by copying it into a base register)
 In order to take advantage of FP elimination (this will create SP +
 offset addressing), what i did the following

 1. Created a new register class (address registers + FP) and used this
 new class as the BASE_REG_CLASS

 Stop right there.  You need to distinguish between FRAME_POINTER_REGNUM
 and HARD_FRAME_POINTER_REGNUM.


 From the description given in the internals, i am not able to
 understand why you suggested this. Could you please explain this?

 In order to trigger reloading of the address, you have to have a register
 elimination, even if the stack pointer is not a suitable destination
 for the elimination.  Also, if you want reload to do the work for you,
 you must not lie to it about the addressing capabilities of an actual hard
 register.  Hence, you need separate hard and soft frame pointers.


Debugging sessions of the reload pass tell me that if the reload pass
gets an address of the form (reg + off), it assumes one of the
following:

1. the address is invalid because 'reg' is not a suitable base register
2. the offset is out of range
3. the address has an eliminatable register as a base register.

Depending on what it finds, reload_pass reloads the address
accordingly. So for my target when the pass encounters the address of
the form:

(plus:QI (reg/f:QI 33 ArgP) (const_int -2 [0xfffe]))

it eliminates the arg pointer to either the stack or the frame pointer and
reloads it. If the base register is FP, during reloading it just
reloads the FP into a valid base register, but then the address
becomes invalid. The reload pass cannot figure out that the addressing
mode itself is invalid due to the wrong base register. Since SP is the
only register among the base registers that can form the (reg + off)
addressing mode, for reload to work properly i would have to allow
this addressing mode only when SP is the base register, even in
non-strict mode. But then i would lose a lot of opportunities when
elimination happens in favour of SP. Hence i allow the above form of
address for all frame related pseudos.
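
One way to picture the "separate hard and soft frame pointers" suggestion is a predicate where (reg + offset) is always accepted for SP, accepted for the soft frame pointer and the arg pointer only in non-strict mode (they will be eliminated later), and never accepted for the hard frame pointer. This is a minimal sketch in plain C, not GCC code: the function name and the SP register number are hypothetical, while 27 (as15), 33 (ArgP) and 35 (SFP) are the numbers that appear in the dumps in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Register numbers for the toy model.  TOY_SP is made up; the others
   follow the RTL dumps quoted in the thread.  */
enum
{
  TOY_SP = 1,
  TOY_HARD_FP = 27,   /* as15 */
  TOY_ARG_P = 33,     /* ArgP */
  TOY_SOFT_FP = 35    /* SFP */
};

/* Is (plus (reg BASE_REGNO) (const_int N)) acceptable as an address?
   STRICT mirrors the strict/non-strict distinction: non-strict is the
   view before reload, strict the view after it.  */
bool
toy_reg_plus_offset_ok (int base_regno, bool strict)
{
  assert (base_regno >= 0);

  /* The hardware supports reg + offset only with SP.  */
  if (base_regno == TOY_SP)
    return true;

  /* After reload nothing else is acceptable.  */
  if (strict)
    return false;

  /* Before reload, registers that are going to be eliminated may
     still appear; the hard frame pointer is never accepted.  */
  return base_regno == TOY_SOFT_FP || base_regno == TOY_ARG_P;
}
```

Under this model the port never claims that as15 can be used with an offset, so an elimination to the hard frame pointer leaves an address that is visibly invalid as a whole rather than one with merely the wrong base register.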

So to respond to your comments, i agree that as far as possible the
port has to be truthful to reload pass about the addressing mode
capabilities, but then i am not sure if distinguishing between
FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM will help my cause.

Do you agree? Or am i not understanding what your suggestion implies?

Shafi


Help with reloading FP + offset addressing mode

2010-10-28 Thread Mohamed Shafi
Hi,

I am doing a port in GCC 4.5.1. For the port

1. there is only (reg + offset) addressing mode only when reg is SP.
Other base registers are not allowed
2. FP cannot be used as a base register. (FP based addressing is done
by copying it into a base register)

In order to take advantage of FP elimination (this will create SP +
offset addressing), i did the following

1. Created a new register class (address registers + FP) and used this
new class as the BASE_REG_CLASS
2. Defined HARD_REGNO_OK_FOR_BASE_P like the following :

#define HARD_REGNO_OK_FOR_BASE_P(NUM) \
    ((NUM) < FIRST_PSEUDO_REGISTER \
     && (((reload_completed || reload_in_progress) ? 0 : (NUM) == FP_REG) \
         || REGNO_REG_CLASS (NUM) == ADD_REGS))

3. In legitimate_address_p i have the following:

  if (REGNO (x) == FP_REG)
    {
      if (strict)
        return false;
      else
        return true;
    }
  else if (strict)
    return STRICT_REG_OK_FOR_BASE_P (REGNO (x));
  else
    return NONSTRICT_REG_OK_FOR_BASE_P (REGNO (x));

But when FP doesn't get eliminated i get an address of the form

(plus:QI (reg/f:QI 27 as15) (const_int 2))

which gets reloaded by replacing FP with an address register other than
SP. I am guessing this happens because of the modified BASE_REG_CLASS. I
haven't confirmed this. So in order to overcome this, in
legitimize_reload_address i have the following:

  if (GET_CODE (*x) == PLUS
      && REG_P (XEXP (*x, 0))
      && REGNO (XEXP (*x, 0)) < FIRST_PSEUDO_REGISTER
      && GET_CODE (XEXP (*x, 1)) == CONST_INT
      && XEXP (*x, 0) == frame_pointer_rtx)
    {
      /* GCC will by default reload the FP into a BASE_CLASS_REG,
         which results in an invalid address.  For us, the best
         thing to do is move the whole expression to a REG.  */
      push_reload (*x, NULL_RTX, x, NULL, SPAA_REGS,
                   mode, VOIDmode, 0, 0, opnum, (enum reload_type) type);
      return 1;
    }

Does my logic make sense? Is there any better way to implement this?

With this implementation for the following sequence :

(insn 9 6 10 2 fun_calls.c:12 (set (reg/f:QI 42)
    (mem/f/c/i:QI (plus:QI (reg/f:QI 33 AP)
    (const_int -2 [0xfffe])) [0 f+0 S1 A32]))
9 {movqi_op} (nil))

(insn 10 9 11 2 fun_calls.c:12 (set (reg:QI 43)
    (const_int 60 [0x3c])) 7 {movqi_op} (nil))

I am getting the following output:

(insn 45 6 47 2 fun_calls.c:12 (set (reg:QI 28 a0)
    (const_int 2 [0x2])) 9 {movqi_op} (nil))

(insn 47 45 48 2 fun_calls.c:12 (set (reg:QI 28 a0)
    (reg/f:QI 27 as15)) 9 {movqi_op} (nil))

(insn 48 47 49 2 fun_calls.c:12 (set (reg:QI 28 a0)
    (plus:QI (reg:QI 28 a0)
    (const_int 2 [0x2]))) 14 {addqi3} (expr_list:REG_EQUIV
(plus:QI (reg/f:QI 27 as15)
    (const_int 2 [0x2]))
    (nil)))

(insn 49 48 10 2 fun_calls.c:12 (set (reg/f:QI 0 g0 [42])
    (mem/f/c/i:QI (reg:QI 28 a0) [0 f+0 S1 A32])) 9 {movqi_op} (nil))

insn 45 is redundant. Is this generated because the
legitimize_reload_address is wrong?

Any hints as to why the redundant instruction gets generated?

Regards,
Shafi


Re: Help with reloading FP + offset addressing mode

2010-10-28 Thread Mohamed Shafi
On 29 October 2010 00:06, Joern Rennecke joern.renne...@embecosm.com wrote:
 Quoting Mohamed Shafi shafi...@gmail.com:

 Hi,

 I am doing a port in GCC 4.5.1. For the port

 1. there is only (reg + offset) addressing mode only when reg is SP.
 Other base registers are not allowed
 2. FP cannot be used as a base register. (FP based addressing is done
 by copying it into a base register)
 In order to take advantage of FP elimination (this will create SP +
 offset addressing), what i did the following

 1. Created a new register class (address registers + FP) and used this
 new class as the BASE_REG_CLASS

 Stop right there.  You need to distinguish between FRAME_POINTER_REGNUM
 and HARD_FRAME_POINTER_REGNUM.


From the description given in the internals, i am not able to
understand why you suggested this. Could you please explain this?

Shafi


Need help in deciding the instruction set for a new target.

2010-08-23 Thread Mohamed Shafi
Hello all,

I am trying to do a port on GCC 4.5. The target has a memory
resolution of 32bits i.e. char is 32bits in the target (addr 0 selects
1st 32bit and addr 1 selects 2nd 32bit). It has only word (32bit)
access.

In terms of address resolution this target is similar to c4x which
became obsolete in GCC 4.2. There are two ways to implement this port.
One is to have BITS_PER_UNIT ==32, like c4x and other is to have a
normal C like char == 8, short == 16, and int == 32. We are thinking
about having BITS_PER_UNIT == 32. Yes, I know the support for such a
target is bit rotten in GCC. I am currently trying to remove it.

In the mean time, we are in the process of finalizing the
instructions. The current instruction set has support for 32bit
immediate data only in move operations. i.e.

move src1GP, #imm32

For all other operations like div, sub, add, compare, modulus, load,
store the support is only for 16bit immediate. For all these
instruction there is separate flavor for sign and zero extension. i.e.

mod.s32 srcdstGP, #imm16 // 32%imm16   signed modulus
mod.u32 srcdstGP, #imm16 // 32%imm16 unsigned modulus

cmp.s32 src1GP, #imm16 // signed register to 16-bit immediate compare
cmp.u32 src1GP, #imm16 // unsigned register to 16-bit immediate compare

sub.s32 srcdstGP, #imm16 // signed register to 16-bit immediate subtract
sub.u32 srcdstGP, #imm16 // unsigned register to 16-bit immediate subtract
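
To make the difference between the two flavors concrete, here is a small sketch in plain C of what a 16-bit immediate field becomes under each extension, and of the range check a compiler would need before choosing the `.s32` or `.u32` form. The function names are illustrative and not part of any instruction set document; the only assumption is two's complement 32-bit arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value seen by a ".s32" instruction: the 16-bit field is sign-extended.  */
int32_t
imm16_sign_extend (uint16_t field)
{
  return (field & 0x8000u) ? (int32_t) field - 0x10000 : (int32_t) field;
}

/* Value seen by a ".u32" instruction: the 16-bit field is zero-extended.  */
uint32_t
imm16_zero_extend (uint16_t field)
{
  return (uint32_t) field;
}

/* Can the signed constant V be encoded in a sign-extended imm16?  */
bool
fits_signed_imm16 (int32_t v)
{
  return v >= -32768 && v <= 32767;
}

/* Can the unsigned constant V be encoded in a zero-extended imm16?  */
bool
fits_unsigned_imm16 (uint32_t v)
{
  return v <= 0xffffu;
}
```

The same bit pattern 0xfffe therefore means -2 to the signed form and 65534 to the unsigned form, and a constant such as 40000 is encodable only in the unsigned form.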


I want to know if it is good to have both sign and zero extension for
16bit immediate.
Will it be of any use with a configuration where char == short == int == 32bit?
Will I be able to support these kinds of instructions in a GCC port?
Or would it be better to have separate sign and zero extension
instructions, which the current instruction set doesn't have?
Do I need separate sign and zero extension instructions along with the
above instructions?

It would be of great help if you could guide me in deciding these instructions.

Regards,
Shafi


Help for target with BITS_PER_UNIT = 16

2010-08-16 Thread Mohamed Shafi
Hello all,

I am trying to port GCC 4.5.1 for a processor that has the following
addressing capability:

The data memory address space of 64K bytes is represented by a total
of 15 bits, with each address selecting a 16-bit element. When using
the address register, the LSB of address reg (AD) points to a 16-bit
field in data memory. If a data memory line is 128 bits there are 8,
16-bit elements per data memory line. We use little endian addressing,
so
if AD=0, bits [15:0] of data memory line address 0 would be selected.
If AD=1, bits [31:16] of data memory line address 0 would be selected.
If AD=9, bits [31:16] of data memory line address 1 would be selected.
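
The mapping in the three examples above can be written down as a small function. This is a sketch in plain C; the struct and function names are illustrative, while the constants (8 elements of 16 bits per 128-bit line, little endian) are the ones given in the description.

```c
/* Position of one 16-bit element inside the data memory.  */
struct elem_loc
{
  unsigned line;      /* data memory line address */
  unsigned low_bit;   /* lowest bit of the element within the line */
  unsigned high_bit;  /* highest bit of the element within the line */
};

enum { ELEM_BITS = 16, ELEMS_PER_LINE = 8 };

/* Translate an address register value AD into a line and a bit range.  */
struct elem_loc
locate_element (unsigned ad)
{
  struct elem_loc loc;
  loc.line = ad / ELEMS_PER_LINE;
  loc.low_bit = (ad % ELEMS_PER_LINE) * ELEM_BITS;
  loc.high_bit = loc.low_bit + ELEM_BITS - 1;
  return loc;
}
```

For AD=9 this gives line 1 and bits [31:16], matching the third example.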

So if i have the following program

short arr[5] = {11,12,13,14,15};

int foo ()
{
    short a = arr[1] + arr[3];
    return a;
}

Assume that short is 16 bits and that a short address is 2-byte aligned. Then I
expect the following code to get generated:

mov a0, #arr      // Load the address

mov a1, a0        // Copy the address
add a1, 1         // Increment the address by 1 so that it points to arr[1]
ld.16 g0, (a1)    // Load the value 12 into g0

mov a1, a0        // Copy the address
add a1, 3         // Increment the address by 3 so that it points to arr[3]
ld.16 g1, (a1)    // Load the value 14 into g1

add g1, g1, g0    // Add 12 and 14

For the following code:

short arr[5] = {11,12,13,14,15};

int foo ()
{
    short a, b;

    a = (short) (&arr[3] - &arr[1]);                   // a is 2 after this operation
    b = (short) ((char *) &arr[3] - (char *) &arr[1]); // b is 4 after this operation

    return a;
}

My question is: should i set the macro BITS_PER_UNIT = 16 to get code
generated like this? From IRC chat i realize that BITS_PER_UNIT != 8
support is seriously rotten. If that is the case, how can i proceed to port
this target?
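
For reference, in C the second difference is always 2 * sizeof (short), since sizeof is measured in units of char. The expected b == 4 therefore corresponds to sizeof (short) == 2, i.e. a char that is half the width of a short. A small host-side check, with illustrative function names, assuming nothing beyond standard C:

```c
#include <stddef.h>

short arr[5] = {11, 12, 13, 14, 15};

/* Distance between arr[1] and arr[3] counted in shorts.  */
ptrdiff_t
diff_in_elements (void)
{
  return &arr[3] - &arr[1];
}

/* The same distance counted in chars, the unit of sizeof.  */
ptrdiff_t
diff_in_chars (void)
{
  return (char *) &arr[3] - (char *) &arr[1];
}
```

On a target where char and short are both 16 bits wide, sizeof (short) would be 1 and the second function would return 2 rather than 4.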

Regards,
Shafi


how to identify a part of a multi-word register

2010-02-10 Thread Mohamed Shafi
Hi,

I am doing a port for a 32bit target in GCC 4.4.0. I need a way to
identify that a register is part of a multiword register. I need to
emit an instruction that works on LSW of the double word register on
move instructions. Currently the target splits the DImode and DFmode
moves after reloading. So i am able to generate the required
instruction while doing the split. But it seems that sometimes the
subreg pass splits the multiword register into SImode or SFmode
register references before reg-alloc. Since it is not required to
split these moves, I am not able to insert the required instruction
for LSW.  So I was wondering if it is possible to recognize a register
as a part of a multiword register? In the rtl-dumps there are
expressions like :

(insn 255 254 256 2 pr28634.c:13 (set (mem/v/c/i:SI (plus:SI (reg/f:SI 49 sp)
(const_int -16 [0xfff0])) [2 y+0 S4 A64])
(reg:SI 2 d2)) 2 {*movsi_internal} (nil))

(insn 256 255 257 2 pr28634.c:13 (set (mem/v/c/i:SI (plus:SI (reg/f:SI 49 sp)
(const_int -12 [0xfff4])) [2 y+4 S4 A32])
(reg:SI 3 d3 [+4 ])) 2 {*movsi_internal} (nil))

which points out that d3 is part of a multiword register. Looking into
the gcc sources I find that this is done with the help of REG_OFFSET
macro. So can I use this macro to identify a register as a part of
multiword register? Is there any other way to do this?

Regards,
Shafi


Question about peephole2 and addressing mode

2010-01-21 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32bit target in GCC 4.4.0. The target
supports (base + offset) addressing mode for QImode store instructions
but not for QImode load instructions. GCC doesn't take the middle
path. It either supports an addressing mode completely or doesn't
support it at all. I tried a lot of hacks to support (base + offset)
addressing mode only for QImode store instructions. After a lot of
fighting i finally gave up and removed the QImode support for this
addressing mode completely in the GO_IF_LEGITIMATE_ADDRESS macro. Now i
am pursuing an alternate solution: have peephole2 patterns implement
QImode (base + offset) addressing mode for store instructions. How does
it sound?

So now i have written a peephole2 pattern like:

(define_peephole2
  [(parallel
    [(set (match_operand:SI 0 "register_operand" "")
          (plus:SI (match_operand:SI 1 "register_operand" "")
                   (match_operand:SI 2 "const_int_operand" "")))
     (clobber (reg:CCC CC_REGNUM))
     (clobber (reg:CCO EMR_REGNUM))])
   (set (mem:QI (match_dup 0))
        (match_operand:QI 3 "register_operand" ""))]
  "REGNO_OK_FOR_BASE_P (REGNO (operands[1]))
   && constraint_satisfied_p (operands[2], CONSTRAINT_N)"
  [(set (mem:QI (plus:SI (match_dup 1) (match_dup 2)))
        (match_dup 3))]
  "")


In the rtl dumps just before peephole2 pass i get

(insn 213 211 215 39 20010408-1.c:71 (parallel [
(set (reg/f:SI 16 r0 [121])
(plus:SI (reg/v/f:SI 18 r2 [orig:93 p ] [93])
(const_int -1 [0x])))
(clobber (reg:CCC 50 sr))
(clobber (reg:CCO 54 emr))
]) 18 {addsi3} (expr_list:REG_UNUSED (reg:CCO 54 emr)
(expr_list:REG_UNUSED (reg:CCC 50 sr)
(nil))))

(insn 215 213 214 39 20010408-1.c:71 (set (mem/f/c/i:SI (plus:SI
(reg/f:SI 23 r7)
(const_int -32 [0xffe0])) [5 s+0 S4 A32])
(reg/v/f:SI 18 r2 [orig:93 p ] [93])) 2 {*movsi_internal}
(expr_list:REG_DEAD (reg/v/f:SI 18 r2 [orig:93 p ] [93])
(nil)))

(insn 214 215 284 39 20010408-1.c:71 (set (mem:QI (reg/f:SI 16 r0
[121]) [0 S1 A8])
(reg/v:QI 6 d6 [orig:92 ch ] [92])) 0 {*movqi_internal}
(expr_list:REG_DEAD (reg/f:SI 16 r0 [121])
(expr_list:REG_DEAD (reg/v:QI 6 d6 [orig:92 ch ] [92])
(nil))))


This is not matched by the peephole2 pattern. After debugging i see that
the function 'peephole2_insns' matches only consecutive insns. Is
that true? Is there a way to overcome this?

Another issue. In another instance peephole2 matched but the generated
pattern did not get recognized because GO_IF_LEGITIMATE_ADDRESS was
rejecting the addressing mode. Since the peephole2 pass runs after
reload i changed the GO_IF_LEGITIMATE_ADDRESS macro to allow the
addressing mode after reload is completed. So now the check is
something like this:

    case PLUS:
      {
        rtx offset = XEXP (x, 1);
        rtx base = XEXP (x, 0);

        if (!(BASE_REG_RTX_P (base, strict) || STACK_REG_RTX_P (base)))
          return 0;

        /* For QImode the target does not support (base + offset) addresses
           in load instructions.  So we disable this addressing mode
           until reload is completed.  */
        if (!reload_completed && mode == QImode
            && BASE_REG_RTX_P (base, strict))
          return 0;

I haven't run the testsuite, but is it ok to do it like this?
Please let me know your thoughts on this.

Thanks for your time.

Regards
Shafi


Re: How to implement pattens with more that 30 alternatives

2009-12-22 Thread Mohamed Shafi
2009/12/22 Richard Earnshaw rearn...@arm.com:

 On Mon, 2009-12-21 at 18:44 +, Paul Brook wrote:
I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of
scheduling framework i have to write the move patterns with more
clarity, so that i could control the scheduling with the help of
attributes. Re-writting the pattern resulted in movsi pattern with 41
alternatives :(
  
   Use rtl expressions instead of alternatives. e.g. arm.md:arith_shiftsi
 
  Or use the more modern iterators approach.

 Aren't iterators for generating multiple insns (e.g. movsi and movdi) from
 the same pattern, whereas in this case we have a single insn that needs to
 accept many different operand combinations?

 Yes, but that is often better, I suspect, than having too fancy a
 pattern that breaks the optimization simplifications that genrecog does.

 Note that the attributes that were requested could be made part of the
 iterator as well, using a mode_attribute.

  I can't find a back-end that does this. Can you show me an example?

Regards,
Shafi


How to implement pattens with more that 30 alternatives

2009-12-21 Thread Mohamed Shafi
Hi all,

I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of the
scheduling framework i have to write the move patterns with more
clarity, so that i can control the scheduling with the help of
attributes. Rewriting the pattern resulted in a movsi pattern with 41
alternatives :(  When i specify the attributes it seems that all the
alternatives above 31 are given the default value of the
attribute. This is done in the generated file insn-attrtab.c. The
following is one such piece of code:


    case 2:  /* *movsi_internal */
      extract_constrain_insn_cached (insn);
      if (((1 << which_alternative) & 0xf))
        {
          return DELAY_SLOT_TYPE_CLOB_SR;
        }
      else if (((1 << which_alternative) & 0x30))
        {
          return DELAY_SLOT_TYPE_RW_SP;
        }
      else if (which_alternative == 6)
        {
          return DELAY_SLOT_TYPE_CLOB_SR;
        }
      else if (((1 << which_alternative) & 0x1fff80))
        {
          return DELAY_SLOT_TYPE_COMMON;
        }
      else if (((1 << which_alternative) & 0x1e0))
        {
          return DELAY_SLOT_TYPE_RW_SP;
        }
      else if (which_alternative == 25)
        {
          return DELAY_SLOT_TYPE_READ_SR;
        }
      else if (which_alternative == 26)
        {
          return DELAY_SLOT_TYPE_READ_EMR;
        }
      else if (which_alternative == 27)
        {
          return DELAY_SLOT_TYPE_COMMON;
        }
      else if (which_alternative == 28)
        {
          return DELAY_SLOT_TYPE_WRITE_SR;
        }
      else
        {
          return DELAY_SLOT_TYPE_COMMON;
        }


As you can see from the above code, all the alternatives above
31 will always get the default value of the attribute. This is
because GCC assumes that the target has only 31 alternatives. Even
after changing the macro

#define MAX_RECOG_ALTERNATIVES 30

in the file recog.h there is no change in this assumption (which i
think should have affected the attribute calculation). I guess that if
i set need_64bit_hwint=yes, then this problem should go away. I
haven't checked this. But i don't want to do that, since it means that i
will have to change all the dependencies that are affected by this
change. Is there any other solution for my problem?
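
The generated code above tests set membership by shifting 1 by which_alternative and masking the result. If the integer holding such a set is 32 bits wide, alternatives beyond its range cannot be represented at all, which would be consistent with them falling through to the default. The sketch below is plain C and not what genattrtab actually emits; the function names are illustrative, and whether the real masks are tied to HOST_WIDE_INT is the unverified guess already made in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Can alternative ALT be represented as a bit in a 32-bit set?  */
bool
alt_fits_mask32 (int alt)
{
  return alt >= 0 && alt < 32;
}

/* Membership test with a 64-bit set, which has room for
   alternatives 0..63.  */
bool
alt_in_mask64 (int alt, uint64_t mask)
{
  assert (alt >= 0 && alt < 64);
  return ((UINT64_C (1) << alt) & mask) != 0;
}
```

With 41 alternatives, numbers 32 to 40 only fit in the wider set, so any scheme that keeps a single 32-bit mask per attribute value has to treat them specially.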

Any help is appreciated.

Regards,
Shafi


Re: How to support 40bit GP register - Take two

2009-12-17 Thread Mohamed Shafi
2009/12/18 Hans-Peter Nilsson h...@bitrange.com:
 On Fri, 20 Nov 2009, Mohamed Shafi wrote:
 I tried implementing the suggestion given by Richard, but got into
 issues. The GCC frame work is written assuming that there are no modes
 with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 *
 HOST_BITS_PER_WIDE_INT.

 (Not seeing a reply regarding this issue, so here's mine, belated:)

 Perhaps a wart, but with a 64-bit HOST_BITS_PER_WIDE_INT, would
 that affect your port?  It's not?  Just set need_64bit_hwint=yes
 in config.gcc.  And send a patch for the introductory comment in
 that file, unless your port already matches the BITS_PER_WORD >
 32 bits condition.

   Thanks Hans for your reply.
   I have already tried that. What you are suggesting is the first
solution that i got from Richard Henderson. I have mentioned the
issues i faced with this in my mail. The GCC framework is written
assuming that there are no modes with HOST_BITS_PER_WIDE_INT <
GET_MODE_BITSIZE (mode) < 2 * HOST_BITS_PER_WIDE_INT. So i had to hack
at places to get things working. For my target BITS_PER_WORD ==
32. The mode that i am using is RImode (5 bytes).

Regards,
Shafi


How to support 40bit GP register - Take two

2009-11-19 Thread Mohamed Shafi
Hello all,

I am porting GCC 4.4.0 for a 32bit target. The target has 40bit data
registers and 32bit address register. Both can be used as general
purpose registers. All load and store operations are 32bit. If a 40bit
data register is involved in a load/store the register gets sign
extended. Whenever there is a move from an address register to a data
register, sign extension is automatically performed. Currently GCC
generates code as for a 32bit register target. Since the data register is
40bit, sign/zero extension has to be performed after/before some
operations for the result to be correct. So at present the results for
the port are not correct. I need a solution to fix this.

I had mailed about this previously. You can see about this here
http://www.mail-archive.com/gcc@gcc.gnu.org/msg47224.html

I tried implementing the suggestion given by Richard, but got into
issues. The GCC framework is written assuming that there are no modes
with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 *
HOST_BITS_PER_WIDE_INT. Moreover i am getting ICEs when there is an
optimization/operation related to subreg (GCC tries to split RImode
values). RImode is 5 bytes and uses SImode load/store instructions. So
GCC generates offsets/addresses that are not 32bit aligned. Currently
i am hacking the compiler all the way to get an executable (though i
have not tested the output of the obtained executables). Even if i
somehow manage to get proper output there is the issue of using 32bit
registers in RImode instructions. RImode values are meant for the 40bit
registers, i.e. data registers. That means i will not be able to use
address registers (32bit registers) in RImode patterns even though the
instructions accept them. This will definitely hamper efficiency.

So i was wondering if anybody has any alternative solution that i can
try. All i can think is to flag an insn for unsigned operation so that
i will be able to insert sign/zero extension during say reorg pass.
Can this be implemented? How feasible is this?

Regards,
Shafi


Re: How to split mulsi3 pattern

2009-11-12 Thread Mohamed Shafi
2009/11/10 Richard Henderson r...@redhat.com:
 On 11/10/2009 05:48 AM, Mohamed Shafi wrote:

 (define_insn "mulsi3"
  [(set (match_operand:SI 0 "register_operand"           "=d")
       (mult:SI (match_operand:SI 1 "register_operand"  "%d")
               (match_operand:SI 2 "register_operand" "d")))]

 Note that % is only useful if the constraints for the two operands are
 different (e.g. only one operand accepts an immediate input).  When they're
 identical, you simply waste cpu cycles asking reload to try the operands in
 the other order.

  [(set (match_dup 0)
        (ashift:SI
         (plus:SI (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
                           (unspec:HI [(match_dup 1)] UNSPEC_REG_HIGH))
                  (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_HIGH)
                           (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW)))
         (const_int 16)))
   (set (match_dup 0)
        (plus:SI (match_dup 0)
                 (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
                           (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW))))]

 Well for one, your modes don't match.  You actually want your unspecs and
 MULTs to be SImode.

 You could probably usefully model the second insn as

 (define_insn "mulsi3_part2"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (plus:SI
          (mult:SI (zero_extend:SI
                     (match_operand:HI 1 "register_operand" "d"))
                   (zero_extend:SI
                     (match_operand:HI 2 "register_operand" "d")))
          (match_operand:SI 3 "register_operand" "0")))]
  ""
  ...)

So i need to change the mode of the register from SI to HI after
reloading. Is that allowed?

Regards,
Shafi


How to split mulsi3 pattern

2009-11-10 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32bit target in GCC 4.4.0. In my target 32bit
multiply instruction is carried out in two instructions.

Dn = Da x Db is executed as

Dn = (Da.L * Db.H + Da.H * Db.L) << 16
Dn = Dn + (Da.L * Db.L)
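
The two-step sequence can be checked against an ordinary 32-bit multiply with a small C model. This is a sketch that assumes the .L and .H halves are taken as unsigned 16-bit values and that all arithmetic wraps modulo 2^32; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Low and high 16-bit halves of a 32-bit register value. */
static uint32_t lo16(uint32_t x) { return x & 0xFFFFu; }
static uint32_t hi16(uint32_t x) { return x >> 16; }

/* Model of the two-instruction multiply; arithmetic wraps modulo 2^32. */
static uint32_t mul32_two_step(uint32_t da, uint32_t db)
{
    /* Step 1: Dn = (Da.L * Db.H + Da.H * Db.L) << 16 */
    uint32_t dn = (lo16(da) * hi16(db) + hi16(da) * lo16(db)) << 16;
    /* Step 2: Dn = Dn + (Da.L * Db.L) */
    dn += lo16(da) * lo16(db);
    return dn;
}
```

The Da.H * Db.H term never appears because it would be shifted left by 32 bits and so contributes nothing to the low 32 bits of the product.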

Currently the pattern that i have for this is as follows:

(define_insn "mulsi3"
 [(set (match_operand:SI 0 "register_operand"   "=d")
   (mult:SI (match_operand:SI 1 "register_operand"  "%d")
            (match_operand:SI 2 "register_operand" "d")))]

I would like to split this pattern into two (either after or before
reload). Currently i am doing something like this:

(define_insn_and_split "mulsi3"
 [(set (match_operand:SI 0 "register_operand"   "=d")
   (mult:SI (match_operand:SI 1 "register_operand"  "%d")
            (match_operand:SI 2 "register_operand" "d")))]
 ""
 "#"
 "reload_completed"
 [(set (match_dup 0)
   (ashift:SI
    (plus:SI (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
                      (unspec:HI [(match_dup 1)] UNSPEC_REG_HIGH))
             (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_HIGH)
                      (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW)))
    (const_int 16)))
  (set (match_dup 0)
   (plus:SI (match_dup 0)
    (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
             (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW))))]
  ""
)

But in a few testcases this is creating problems. So i would like to
know of better ways to split the mulsi3 pattern.
Can someone help me out?

Regards,
Shafi


Re: How to write shift and add pattern?

2009-11-09 Thread Mohamed Shafi
2009/11/6 Richard Henderson r...@redhat.com:
 On 11/06/2009 05:29 AM, Mohamed Shafi wrote:

     The target that i am working on has 1 & 2 bit shift-add patterns.
 GCC is not generating shift-add patterns when the shift count is 1. It
 is currently generating add operations. What should be done to
 generate shift-add pattern instead of add-add pattern?

 I'm not sure.  You may have to resort to matching

  (set (match_operand 0 "register_operand" "")
       (plus (plus (match_operand 1 "register_operand" "")
                   (match_dup 1))
             (match_operand 2 "register_operand" "")))

 But you should debug make_compound_operation first to
 figure out what's going on for your port, because it's
 working for x86_64:

        long foo(long a, long b) { return a*2 + b; }

        leaq    (%rsi,%rdi,2), %rax     # 8     *lea_2_rex64
        ret                             # 26    return_internal


 r~


   I have fixed this. The culprit was the cost factor. I added the
case in targetm.rtx_costs and now it works properly. But i am having
issues with the reload.
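
For reference, the shift-add and add-add forms discussed in this thread compute the same value, which is why the choice between them comes down to cost. A small C sketch (function names invented for the example, unsigned arithmetic so that wrap-around is well defined):

```c
#include <assert.h>
#include <stdint.h>

/* a*2 + b written three ways; all wrap modulo 2^32. */
static uint32_t via_mult(uint32_t a, uint32_t b)      { return a * 2u + b; }
static uint32_t via_shift_add(uint32_t a, uint32_t b) { return (a << 1) + b; }
static uint32_t via_add_add(uint32_t a, uint32_t b)   { return a + a + b; }
```

Since the three forms are interchangeable, which one is emitted depends on the relative costs the port reports for them.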

Regards,
Shafi


Re: How to write shift and add pattern?

2009-11-09 Thread Mohamed Shafi
2009/11/6 Ian Lance Taylor i...@google.com:
 Mohamed Shafi shafi...@gmail.com writes:

 It is generating with data registers. Here is the pattern that i have
 written:


 (define_insn "*saddl"
   [(set (match_operand:SI 0 "register_operand" "=r,d")
       (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r,d")
                         (match_operand:SI 2 "const24_operand" "J,J"))
                (match_operand:SI 3 "register_operand" "0,0")))]

 How can i do this. Will the constraint modifiers '?' or '!' help?
 How can make GCC generate shift and add sequence when the shift count is 1?

 Does 'd' represent a data register?  I assume that 'r' is a general
 register, as it always is.  What is the constraint character for an
 address register?  You don't seem to have an alternative here for
 address registers, so I'm not surprised that the compiler isn't
 picking it.  No doubt I misunderstand something.

   Ok the constraint for address registers is 'a'. That's a typo in the
pattern that i gave here. The proper pattern is

 (define_insn "*saddl"
   [(set (match_operand:SI 0 "register_operand" "=a,d")
         (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "a,d")
                           (match_operand:SI 2 "const24_operand" "J,J"))
                  (match_operand:SI 3 "register_operand" "0,0")))]

So how can i choose the address registers over data registers if that
is more profitable?

Regards,
Shafi


Re: How to support 40bit GP register

2009-11-09 Thread Mohamed Shafi
2009/10/22 Richard Henderson r...@redhat.com:
 On 10/21/2009 07:25 AM, Mohamed Shafi wrote:

 For accessing a->b GCC generates the following code:

        move.l  (sp-16), d3
        lsrr.l  #16, d3
        move.l  (sp-12),d2
        asll    #16,d2
        or      d3,d2
        cmpeq.w #2,d2
        jf      _L2

 Because data registers are 40 bit for 'asll' operation the shift count
 should be 16+8 or there should be sign extension from 32bit to 40 bits
 after the 'or' operation. The target has instruction to sign extend
 from 32bit to 40 bit.

 Similarly there are other operation that requires sign/zero extension.
 So is there any way to tell GCC that the data registers are 40bit and
 there by expect it to generate sign/zero extension accordingly ?

 Define a machine mode for your 40-bit type in cpu-modes.def.  Depending on
 how your 40-bit type is stored in memory, you'll use either

  INT_MODE (RI, 5)                // load-store uses exactly 5 bytes
  FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes

 Where I've arbitrarily chosen RImode as a mnemonic for Register Integral
 Mode.  Now you define arithmetic operations, as needed, on
 RImode.  You define the extendsiri pattern to be that sign-extend from
 32-to-40-bit instruction.  You define your comparison patterns on RImode,
 and not on SImode, since your comparison instruction works on the entire 40
 bits.

 You'll wind up with a selection of patterns in your machine description that
 have a sign-extension pattern built in, depending on the exact behaviour of
 your ISA.  There are plenty of examples on x86_64, mips64, and Alpha (to
 name a few) that have similar properties with SI and DImodes.  Examine the
 -fdump-rtl-combine-details dump for exemplars of the canonical forms that
 the combiner creates when it tries to merge sign-extension instructions into
 preceding patterns.

  Ok i have comparison patterns written in RImode. When you say that i
will wind up with a selection of patterns do you mean to say that i
should have patterns for operations that operate on full 40bits in
RImode and disable the corresponding SImode patterns? Or is it that i
have to write nameless patterns in RImode for arithmetic operations
and look at the dumps to see how the combiner will merge the patterns
so that it can match the comparison operations?

Regards,
Shafi


Re: How to support 40bit GP register

2009-11-04 Thread Mohamed Shafi
2009/10/22 Richard Henderson r...@redhat.com:
 On 10/21/2009 07:25 AM, Mohamed Shafi wrote:

 For accessing a->b GCC generates the following code:

        move.l  (sp-16), d3
        lsrr.l  #16, d3
        move.l  (sp-12),d2
        asll    #16,d2
        or      d3,d2
        cmpeq.w #2,d2
        jf      _L2

 Because data registers are 40 bit for 'asll' operation the shift count
 should be 16+8 or there should be sign extension from 32bit to 40 bits
 after the 'or' operation. The target has instruction to sign extend
 from 32bit to 40 bit.

 Similarly there are other operation that requires sign/zero extension.
 So is there any way to tell GCC that the data registers are 40bit and
 there by expect it to generate sign/zero extension accordingly ?

 Define a machine mode for your 40-bit type in cpu-modes.def.  Depending on
 how your 40-bit type is stored in memory, you'll use either

  INT_MODE (RI, 5)                // load-store uses exactly 5 bytes
  FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes

Richard thanks for the reply.

Load-store uses 32bits. Sign extension happens automatically. So i
have chosen INT_MODE (RI, 5) and copied movsi and renamed it to
movri. I have also specified that RImode needs only one register.

 Where I've arbitrarily chosen RImode as a mnemonic for Register Integral
 Mode.  Now you define arithmetic operations, as needed, on
 RImode.  You define the extendsiri pattern to be that sign-extend from
 32-to-40-bit instruction.  You define your comparison patterns on RImode,
 and not on SImode, since your comparison instruction works on the entire 40
 bits.

I have defined extendsiri and cbranchri4 patterns. When i compile a
program like

unsigned long xh = 1;
int main ()
{
unsigned long yh = 0xffffull;
unsigned long z = xh * yh;

 if (z != yh)
   abort ();

return 0;
}

I get the following ICE

internal compiler error: in immed_double_const, at emit-rtl.c:553

This happens when cse_insn () calls insert () -> gen_lowpart ->
gen_lowpart_common -> simplify_gen_subreg -> simplify_immed_subreg.
simplify_immed_subreg is called with the parameters (outermode=RImode,
(const_int 65535), innermode=DImode, byte=0)

cse_insn is called for the following insn

(insn 10 9 11 3 bug7.c:14 (set (reg:RI 67)
(const_int 65535 [0xffff])) 4 {movri} (nil))


How can i overcome this?

Regards,
Shafi


 You'll wind up with a selection of patterns in your machine description that
 have a sign-extension pattern built in, depending on the exact behaviour of
 your ISA.  There are plenty of examples on x86_64, mips64, and Alpha (to
 name a few) that have similar properties with SI and DImodes.  Examine the
 -fdump-rtl-combine-details dump for exemplars of the canonical forms that
 the combiner creates when it tries to merge sign-extension instructions into
 preceding patterns.



Re: IRA is not looking into the predicates ?

2009-11-04 Thread Mohamed Shafi
2009/10/30 Jeff Law l...@redhat.com:
 On 10/30/09 07:13, Mohamed Shafi wrote:

 Hi,

 I am doing a port for a 32bit target in GCC 4.4.0. The target does not
 have support for symbolic address in QImode for load operations.

 You'll need to make sure to reject such addresses for QImode in
 GO_IF_LEGITIMATE_ADDRESS.


  In
 order to do this what i have done is in define_expand for movqi
 reject symbolic address if they come in source operands and i have
 also written a predicate for *movqi_internal to reject such cases.


 OK.  Nothing wrong with these steps.  Though you really need to make sure
 GO_IF_LEGITIMATE_ADDRESS is defined correctly.

 IRA doesn't look at operand predicates or insn conditions.  It assumes that
 any insns are valid assuming any pseudo registers appearing in the insn get
 suitable hard registers.

 Based on the dumps you provided it appears that reg61 does not get a hard
 register and reload is generating the problematical insn #24.  This is a
 good indication that your GO_IF_LEGITIMATE_ADDRESS is incorrectly
 implemented.

   In the GO_IF_LEGITIMATE_ADDRESS macro i am allowing this
address because the target supports symbolic address in QImode for
store operations. And in the macro GO_IF_LEGITIMATE_ADDRESS there is
no option to check if the address is used in a load or a store. That's why
in define_expand for movqi i reject symbolic address if they come in
source operands and have a predicate for *movqi_internal to reject such
cases. But still i am getting the ICE.  IIRC the control does not reach
TARGET_SECONDARY_RELOAD either. How can i overcome this?

Regards,
Shafi


Re: IRA is not looking into the predicates ?

2009-11-04 Thread Mohamed Shafi
2009/10/30 Ian Lance Taylor i...@google.com:
 Mohamed Shafi shafi...@gmail.com writes:

From ice4.c.168r.asmcons

 (insn 5 2 6 2 ice4.c:4 (set (reg:SI 61 [ s ])
         (mem/c/i:SI (symbol_ref:SI ("s") [flags 0x2] <var_decl
 0xb7bfd000 s>) [0 s+0 S4 A32])) 2 {*movsi_internal} (nil))

 (insn 6 5 7 2 ice4.c:4 (set (reg:QI 62)
         (plus:QI (subreg:QI (reg:SI 61 [ s ]) 0)
             (const_int -100 [0xffffff9c]))) 16 {addqi3}
 (expr_list:REG_DEAD (reg:SI 61 [ s ])
         (nil)))

 How can i prevent this ICE ?

 If asmcons is the first place that this appears, then I think it must
 be coming from some asm statement.  So the first step would be to look
 at the asm statement and see if it can be rewritten using a different
 constraint.

   No this appears from the rtl expand onwards.

Shafi


IRA is not looking into the predicates ?

2009-10-30 Thread Mohamed Shafi
Hi,

I am doing a port for a 32bit target in GCC 4.4.0. The target does not
have support for symbolic address in QImode for load operations. In
order to do this what i have done is in define_expand for movqi
reject symbolic address if they come in source operands and i have
also written a predicate for *movqi_internal to reject such cases.
But i get the following ICE:

 insn does not satisfy its constraints:
(insn 24 5 6 2 ice4.c:4 (set (reg:QI 17 r1)
(mem/c/i:QI (symbol_ref:SI ("s") [flags 0x2] <var_decl
0xb7bfd000 s>) [0 s+0 S1 A32])) 0 {*movqi_internal} (nil))


From ice4.c.172r.ira

(insn 24 5 6 2 ice4.c:4 (set (reg:QI 17 r1)
(mem/c/i:QI (symbol_ref:SI ("s") [flags 0x2] <var_decl
0xb7bfd000 s>) [0 s+0 S1 A32])) 0 {*movqi_internal} (nil))

(insn 6 24 7 2 ice4.c:4 (set (reg:QI 16 r0 [62])
(plus:QI (reg:QI 17 r1)
(const_int -100 [0xffffff9c]))) 16 {addqi3} (nil))

From ice4.c.168r.asmcons

(insn 5 2 6 2 ice4.c:4 (set (reg:SI 61 [ s ])
(mem/c/i:SI (symbol_ref:SI ("s") [flags 0x2] <var_decl
0xb7bfd000 s>) [0 s+0 S4 A32])) 2 {*movsi_internal} (nil))

(insn 6 5 7 2 ice4.c:4 (set (reg:QI 62)
(plus:QI (subreg:QI (reg:SI 61 [ s ]) 0)
(const_int -100 [0xffffff9c]))) 16 {addqi3}
(expr_list:REG_DEAD (reg:SI 61 [ s ])
(nil)))

How can i prevent this ICE ?

Regards,
Shafi


Typo in internals

2009-10-23 Thread Mohamed Shafi
Hi,

The internal doc says :

— Target Hook: bool TARGET_CAN_INLINE_P (tree caller, tree callee)

This target hook returns false if the caller function cannot
inline callee, based on target specific information. By default,
inlining is not allowed if the callee function has function specific
target options and the caller does not use the same options.


But looking in the sources i think this really should have been
TARGET_OPTION_CAN_INLINE_P


Shafi.


Re: Supporting FP cmp lib routines

2009-10-23 Thread Mohamed Shafi
2009/9/14 Richard Henderson r...@redhat.com:
 Another thing to look at, since you have hand-written routines and may be
 able to specify that e.g. only a subset of the normal call clobbered
 registers are actually modified, is to leave the call as a compare insn.
  Something like

 (define_insn "*cmpsf"
  [(set (reg:CC status-reg)
        (compare:CC
          (match_operand:SF 0 "register_operand" "R0")
          (match_operand:SF 1 "register_operand" "R1")))
   (clobber (reg:SI r2))
   (clobber (reg:SI r3))]
  ""
  "call __compareSF"
  [(set_attr "type" "call")])

 Where the R0 and R1 constraints resolve to the input registers for the
 routine.  Depending on your ISA and ABI, you may not even need to split this
 pattern post-reload.


I have implemented the above solution and it works. I have to support
the same for DF also. But with DF i have a problem with the
constraints. My target generates code for both big and little endian.
The ABI specifies that when a 64bit value is passed as an argument
it is passed in R6 and R7, R6 containing the most significant long
word and R7 containing the least significant long word, regardless of
the endianness mode. How can i do this in the DF compare pattern?
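
The ABI rule described above can be modelled in C with shifts, which do not depend on the byte order of the host or target. This is only an illustration of the rule as stated in the post; `reg_pair`, `split_for_abi` and `join_from_abi` are names invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the R6/R7 argument pair. */
typedef struct {
    uint32_t r6;  /* most significant long word */
    uint32_t r7;  /* least significant long word */
} reg_pair;

/* Split a 64-bit value as the ABI describes, independent of endianness. */
static reg_pair split_for_abi(uint64_t value)
{
    reg_pair p;
    p.r6 = (uint32_t)(value >> 32);
    p.r7 = (uint32_t)(value & 0xFFFFFFFFu);
    return p;
}

/* Inverse operation: rebuild the 64-bit value from the pair. */
static uint64_t join_from_abi(reg_pair p)
{
    return ((uint64_t)p.r6 << 32) | p.r7;
}
```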

Regards,
Shafi


How to support 40bit GP register

2009-10-21 Thread Mohamed Shafi
HI all,

I am porting GCC 4.4.0 for a 32bit target. The target has 40bit data
registers and 32bit address registers that can be used as general
purpose registers. When 40bit registers are used for arithmetic
operations or comparison operations GCC generates code assuming that
it's a 32bit register. Whenever there is a move from address register
to data register sign extension is automatically performed by the
target. Since the data register is 40bit after some operations
sign/zero extension has to be performed for the result to be proper.
Take the following test case for example :

typedef struct
{
  char b0;
  char b1;
  char b2;
  char b3;
  char b4;
  char b5;
} __attribute__ ((packed)) b_struct;


typedef struct
{
  short a;
  long b;
  short c;
  short d;
  b_struct e;
} __attribute__ ((packed)) a_struct;


int
main(void)
{
  volatile a_struct *a;
  volatile a_struct b;

  a = &b;
  *a = (a_struct){1,2,3,4};
  a->e.b4 = 'c';

  if (a->b != 2)
    abort ();

  exit (0);
}

For accessing a->b GCC generates the following code:

   move.l  (sp-16), d3
   lsrr.l  #16, d3
   move.l  (sp-12),d2
   asll    #16,d2
   or      d3,d2
   cmpeq.w #2,d2
   jf      _L2

Because data registers are 40 bit for 'asll' operation the shift count
should be 16+8 or there should be sign extension from 32bit to 40 bits
after the 'or' operation. The target has instruction to sign extend
from 32bit to 40 bit.
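
The effect can be reproduced with a small C model of a 40-bit register held in a 64-bit integer. This is a sketch under stated assumptions: the helper names are invented, and the model assumes that a left shift keeps bits 32..39 and that the compare looks at all 40 bits.

```c
#include <assert.h>
#include <stdint.h>

#define REG_BITS 40
#define REG_MASK ((UINT64_C(1) << REG_BITS) - 1)

/* Model of a 32-to-40-bit sign-extend instruction. */
static uint64_t sext32_to_40(uint64_t reg)
{
    uint64_t low = reg & 0xFFFFFFFFu;
    if (low & 0x80000000u)
        low |= UINT64_C(0xFF) << 32;   /* copy bit 31 into bits 32..39 */
    return low & REG_MASK;
}

/* Left shift in a 40-bit register: bits above bit 31 are kept. */
static uint64_t shl40(uint64_t reg, unsigned count)
{
    assert(count < REG_BITS);
    return (reg << count) & REG_MASK;
}
```

For example, if d2 holds 0x00030000 and d3 holds 2, then shl40 (d2, 16) | d3 is 0x300000002 in the model, which does not compare equal to 2; applying sext32_to_40 to that result gives 2 again.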

Similarly there are other operation that requires sign/zero extension.
So is there any way to tell GCC that the data registers are 40bit and
there by expect it to generate sign/zero extension accordingly ?

Regards,
Shafi


Re: How to split 40bit data types load/store?

2009-10-05 Thread Mohamed Shafi
2009/9/14 Richard Henderson r...@redhat.com:
 On 09/14/2009 07:24 AM, Mohamed Shafi wrote:

 Hello all,

 I am doing a port for a 32bit target in GCC 4.4.0. I have to support a
 40bit data (_Accum) in the port. The target has 40bit registers which
 is a GPR and works as 32bit reg in other modes. The load and store for
 _Accum happens in two step. The lower 32bit in one instruction and the
 upper 8bit in the next instruction. I want to split the instruction
 after reload. I tried to have a pattern (for load) like this:

 (define_insn "fn_load_ext_sa"
  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                    UNSPEC_FN_EXT)
        (match_operand:SA 1 "memory_operand" ""))]

 (define_insn "fn_load_sa"
  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                     UNSPEC_FN)
        (match_operand:SA 1 "memory_operand" ""))]

 Unspec on the left-hand-side isn't something that's supposed to happen, and
 is more than likely the cause of your problems.  Try moving the unspec to
 the right-hand-side like:

  (set (reg:SI reg) (mem:SI addr))

  (set (reg:SA reg)
       (unspec:SA [(reg:SI reg) (mem:QI addr)]
                  UNSPEC_ACCUM_INSERT))

 and

  (set (mem:SI addr) (reg:SI reg))

  (set (mem:QI addr)
       (unspec:QI [(reg:SA reg)]
                  UNSPEC_ACCUM_EXTRACT))

 Note that after reload it's perfectly acceptable for a hard register to
 appear with the different SI and SAmodes.

 It's probably not too hard to define this with zero_extract sequences
 instead of unspecs, but given that these only appear after reload, it may
 not be worth the effort.


   I was able to implement this with unspecs. But now it seems that i
need to split the pattern before reload also. So i am thinking of
removing this and doing a split before reload. The issue is that there
is no support for register indirect addressing mode for accessing
the upper eight bits of the 40bit register. The only addressing mode
supported for accessing this section is (SP+offset). So what i thought
was to allow this addressing mode and fix it up at the time of
secondary reload with the help of a scratch register and a
scratch memory. But it seems that in GCC it is not possible to have
both scratch memory and a scratch register for the same operation. Am
i right?
So what i did was to implement this at the define_expand stage itself.
The idea is to generate the following sequence:

for load (R0), D0 generate

load (R0), D0                 // 32bit mode, SAmode move
load (R0+4), scratch_reg      // 32bit mode, SAmode
store scratch_reg, (SP+off)   // 32bit mode, SAmode
load.ext (SP+off), D0.u8

and similarly for store.
 Here are the patterns that i used for this purpose:

(define_expand "movda"
 [(set (match_operand:DA 0 "nonimmediate_operand" "")
   (match_operand:DA 1 "nonimmediate_operand" ""))]
 ""
 {
  if (MEM_P (operands[1]) && REG_P (XEXP (operands[1], 0))
      && XEXP (operands[1], 0) != virtual_stack_vars_rtx)
    {
      rtx lo_half, hi_half;
      rtx scratch_mem, scratch_reg, subreg;

      gcc_assert (can_create_pseudo_p ());
      scratch_reg = gen_reg_rtx (SAmode);
      scratch_mem = assign_stack_temp (SAmode, GET_MODE_SIZE (SAmode), 0);
      subreg = gen_rtx_SUBREG (SAmode, operands[0], 0);

      lo_half = adjust_address (operands[1], SAmode, 0);
      hi_half = adjust_address (operands[1], SAmode, 4);
      emit_insn (gen_rtx_SET (SAmode, subreg, lo_half));
      emit_insn (gen_rtx_SET (SAmode, scratch_reg, hi_half));
      emit_insn (gen_rtx_SET (SAmode, scratch_mem, scratch_reg));
      emit_insn (gen_load_reg_ext (operands[0], scratch_mem));
      DONE;
    }
   /* and similarly for store operation */
 }
)

(define_insn "load_reg_ext"
 [(set (subreg:SA (zero_extract:DA (match_operand:DA 0 "register_operand" "=d")
                                   (const_int 8)
                                   (const_int 24)) 4)
   (match_operand:SA 1 "memory_operand" "Sd3"))]

(define_insn "store_reg_ext"
 [(set (match_operand:SA 0 "memory_operand" "=Sd3")
   (zero_extract:SA (match_operand:DA 1 "register_operand" "d")
                    (const_int 8)
                    (const_int 24)))]

(define_insn "*movsa_internal"
 [(set (match_operand:SA 0 "nonimmediate_operand" "=m,d,d")
       (match_operand:SA 1 "nonimmediate_operand" "d,m,d"))]


By default -fomit-frame-pointer will be passed to the compiler. Without
optimization the compiler generates the expected output. But with
optimization that is not the case. It seems that the patterns that i
have written above are not proper. For a simple function like the
following

_Accum foo (_Accum *a)
{
   _Accum b = *a;
   return b;
}

with optimization enabled the compiler generates only

load (R0), D0                 // 32bit mode, SAmode move

the 1st instruction in the expected 4 instruction sequence.
How can i write the patterns properly?

Regards
Shafi


Re: define_memory_constraint and REG_OK_STRICT

2009-10-02 Thread Mohamed Shafi
2009/9/30 Richard Henderson r...@redhat.com:
 On 09/29/2009 09:46 PM, Mohamed Shafi wrote:

  bool strict =  reload_completed ? true : false;

 What happens if you set strict = false here?
 That's what ARM does.

  That particular case works, and yes arm does it that way but there
are other targets that use (reload_completed || reload_in_progress),
like s390.  So that's why i had to ask if my definition of strict is
proper or not. I am not sure which one to use.

Shafi


Re: Reload going wrong for addition.

2009-10-02 Thread Mohamed Shafi
2009/9/28 Richard Henderson r...@redhat.com:
 On 09/28/2009 07:25 AM, Mohamed Shafi wrote:

 Hope someone suggests me a solution.

 The solution is almost certainly something involving the
 TARGET_SECONDARY_RELOAD hook.  You need to inform reload that it's going to
 need some scratch registers in order to perform the operation.

 It's been a long time since I had to fiddle with this sort of thing, so I
 forget all the details involved.  Perhaps someone else has some additional
 advice.


Ok what i did was to remove the code from the preferred_reload_class
function, so that now it returns class, i.e.

#define PREFERRED_RELOAD_CLASS(class, x) class

And in TARGET_SECONDARY_RELOAD i added the code to have a scratch
register to do the move operation. Now things are working. So i guess
i should ask why we have PREFERRED_RELOAD_CLASS when we can do the same
with TARGET_SECONDARY_RELOAD?

Shafi


define_memory_constraint and REG_OK_STRICT

2009-09-29 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32bit target in GCC 4.4.0.
I have defined memory_constraints in predicates.c like this

(define_memory_constraint "Sr0"
   "Memory reference through base registers"
   (match_test "target_mem_constraint (\"r0\", op)"))

In the function target_mem_constraint i have

int
target_mem_constraint (const char *str, rtx op)
{
  char c0 = str[0];
  char c1 = str[1];
  rtx op0 = XEXP (op, 0);
  bool strict = reload_completed;

  if (!MEM_P (op))
    return 0;

  switch (c0)
    {
    case 'r':
      return (!STACK_REG_RTX_P (op0)
              && BASE_REG_RTX_P (op0, strict));
...
...

My question is: is my definition of strict correct?
Or should it be reload_in_progress || reload_completed?
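
What the strict flag usually decides can be shown with a toy C model. This is a sketch of the general strict/non-strict convention, not the port's BASE_REG_RTX_P: the register numbering and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbering, for illustration only. */
enum { FIRST_ADDR_REG = 16, LAST_ADDR_REG = 31, FIRST_PSEUDO = 64 };

static bool hard_base_reg_p(unsigned regno)
{
    return regno >= FIRST_ADDR_REG && regno <= LAST_ADDR_REG;
}

/* Non-strict: a pseudo is acceptable because it may still be assigned a
   base register.  Strict: only an actual hard base register will do. */
static bool base_reg_ok_p(unsigned regno, bool strict)
{
    if (regno >= FIRST_PSEUDO)
        return !strict;
    return hard_base_reg_p(regno);
}
```

In this model the only difference between the two settings is whether a not-yet-allocated pseudo is accepted as a base register.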

Regards,
Shafi


Reload going wrong for addition.

2009-09-28 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32bit target for GCC 4.4.0. I am getting the
following error:

rd_er.c:19: error: insn does not satisfy its constraints:
(insn 5 35 34 2 rd_er.c:8 (set (reg:SI 16 r0)
(plus:SI (reg:SI 16 r0)
(reg:SI 2 d2))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI
(reg/f:SI 49 sp)
(const_int -65544 [0xfffefff8]))
(nil)))


My target has 16 data registers and 16 address registers. All are
32bit registers.
The target also has a dedicated stack pointer.
There is no move operation possible between SP and data regs.
There is no provision for addition between data and address registers.
R7 is used as Frame Pointer.


Pattern for addition
---

(define_insn "add<mode>3"
  [(set (match_operand:INT 0 "register_operand"
         "=d, t, k, a, a, t, k, t, d")
        (plus:INT (match_operand:INT 1 "register_operand"
                   "0, 0, 0, t, k, 0, 0, 0, 0")
                  (match_operand:INT 2 "nonmemory_operand"
                   "J, J, J, L, L, t, t, k, d")))]

The constraints used are -
;;d  -   Data registers [D0 - D15]
;;a  -   Address registers [R0 - R15]
;;t   -   Address and Index registers
;;k   -  Stack Pointer
;;J   -   Unsigned 5bit immediate
;;L   -   Signed 16bit immediate

Since there is no move operation between SP and data regs i have
specified 12 as the register_move_cost between them. I also return the
reload class as address register class in preferred_reload_class when
the rtx is SP.

Before IRA pass
---

(insn 5 2 12 2 rd_er.c:8 (set (reg/v/f:SI 60 [ bufptr ])
(reg/f:SI 23 r7)) 43 {*movsi_internal} (nil))


Input for reload pass
-

(insn 5 2 12 2 rd_er.c:8 (set (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
(plus:SI (reg/f:SI 49 sp)
(const_int -65536 [0xffff0000]))) 57 {addsi3}
(expr_list:REG_EQUAL (plus:SI (reg/f:SI 49 sp)
(const_int -65536 [0xffff0000]))
(nil)))


After IRA
---

Reloads for insn # 5
Reload 0: reload_in (SI) = (reg/f:SI 49 sp)
reload_out (SI) = (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
HIGH_OR_LOW, RELOAD_OTHER (opnum = 0)
reload_in_reg: (reg/f:SI 49 sp)
reload_out_reg: (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
reload_reg_rtx: (reg:SI 16 r0)
Reload 1: reload_in (SI) = (const_int -65544 [0xfffefff8])
DALU_REGS, RELOAD_FOR_INPUT (opnum = 2)
reload_in_reg: (const_int -65544 [0xfffefff8])
reload_reg_rtx: (reg:SI 2 d2)

(insn 5 35 34 2 rd_er.c:8 (set (reg:SI 16 r0)
(plus:SI (reg:SI 16 r0)
(reg:SI 2 d2))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI
(reg/f:SI 49 sp)
(const_int -65544 [0xfffefff8]))
(nil)))

The reload pass chooses the final alternative as the goal for reloading.
Since the input instruction already has a data register as the
destination, the constraint combination (t, 0, t) loses to (d, 0, d),
since the latter requires the least amount of copying for
constraint matching (or so the reload pass believes). There are cases
when reload fixes the add pattern, namely when either the
destination is an address register or there is no stack pointer involved.
But otherwise i am getting this ICE. I am not sure how to overcome
this.

Hope someone suggests me a solution.

Regards,
Shafi

P.S Can i have commutative operation for the constraint combination
(t, 0, t) i.e (t, %0, t). If so what will be the output template?


Segmentation fault when calling a library fun - GCC bug?

2009-09-25 Thread Mohamed Shafi
I am doing a port for a 32bit target in GCC 4.4.0
I am getting segmentation fault in the function assign_temp in the
following line:

if (DECL_P (type_or_decl))

After analyzing the issue i find that this might be a bug. I just want
to confirm if that is the case or not.
In order to reproduce i think the target should have the following properties
a. Only 2 32bit registers available as argument registers.
b. Second 64bit value will be pushed in stack
c. ACCUMULATE_OUTGOING_ARGS is set
d. STRICT_ALIGNMENT is set
e. PARM_BOUNDARY is 32

When there is a library call for an operation that takes two 64bit
arguments, say division of two long long values (__divdi3), the
following sequence happens:
emit_library_call_value -> emit_library_call_value_1 ->
emit_push_insn -> assign_temp

emit_push_insn is called because the second argument is pushed on the
stack and ACCUMULATE_OUTGOING_ARGS is set.
assign_temp is called because STRICT_ALIGNMENT && PARM_BOUNDARY <
GET_MODE_ALIGNMENT (DImode) is true.


Can somebody please confirm whether this is due to some mistake in my
port or a GCC bug?

Thanks,
Shafi


How to implement compare and branch instruction

2009-09-24 Thread Mohamed Shafi
Hello all,

I am porting a 32bit target in GCC 4.4.0.
The target has distinct signed and unsigned compare instructions,
and only one set of conditional branch instructions. Moreover the
operands cannot be immediate values if the comparison is unsigned. I
have implemented this using a compare-and-branch instruction. This gets
split after reload. The patterns that i have written are as follows:

(define_expand "cmp<mode>"
 [(set (reg:CC CC_REGNUM)
   (compare (match_operand:INT 0 "register_operand" "")
            (match_operand:INT 1 "nonmemory_operand" "")))]
 ""
 "
  {
   compare_op0 = operands[0];
   compare_op1 = operands[1];
   DONE;
  }
 "
)


(define_expand "b<code>"
 [(set (reg:CC CC_REGNUM)
   (compare:CC (match_dup 1)
               (match_dup 2)))
  (set (pc)
   (if_then_else (comp_op:CC (reg:CC CC_REGNUM) (const_int 0))
     (label_ref (match_operand 0 "" ""))
     (pc)))]
  ""
  {
    operands[1] = compare_op0;
    operands[2] = compare_op1;

    if (CONSTANT_P (operands[2])
        && (<CODE> == LTU || <CODE> == GTU || <CODE> == LEU || <CODE> == GEU))
      operands[2] = force_reg (GET_MODE (operands[1]), operands[2]);

    operands[3] = gen_rtx_fmt_ee (<CODE>, CCmode,
                                  gen_rtx_REG (CCmode, CC_REGNUM), const0_rtx);
    emit_jump_insn (gen_compare_and_branch_insn (operands[0], operands[1],
                                                 operands[2], operands[3]));
    DONE;
  }
)

(define_insn_and_split "compare_and_branch_insn"
 [(set (pc)
   (if_then_else (match_operator:CC 3 "comparison_operator"
                   [(match_operand 1 "register_operand"
                                     "d,d,a,a,d,t,k,t")
                    (match_operand 2 "nonmemory_operand"
                                     "J,L,J,L,d,t,t,k")])
     (label_ref (match_operand 0 "" ""))
     (pc)))]
 "!unsigned_immediate_compare_p (GET_CODE (operands[3]), operands[2])"
 "#"
 "reload_completed"
 [(set (reg:CC CC_REGNUM)
   (match_op_dup:CC 3 [(match_dup 1) (match_dup 2)]))
  (set (pc)
   (if_then_else (eq (reg:CC CC_REGNUM) (const_int 0))
     (label_ref (match_dup 0))
     (pc)))]
  {
    if (expand_compare_insn (operands, 0))
      DONE;
  }
)

In the function expand_compare_insn I am asserting that operands[2]
is not an immediate value if the comparison is unsigned. I am getting an
assertion failure in this function. The problem is that the reload pass
replaces operands[2] with its equivalent constant. This breaks the
pattern after the reload pass.
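
The body of unsigned_immediate_compare_p is not shown in the post. A
minimal sketch of what such a check might look like follows, written as
plain C so it can be compiled on its own. The enum and struct are
hypothetical, simplified stand-ins for GCC's rtx codes and operands.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for GCC's comparison codes. */
enum cmp_code { CMP_EQ, CMP_NE, CMP_LT, CMP_GT, CMP_LE, CMP_GE,
                CMP_LTU, CMP_GTU, CMP_LEU, CMP_GEU };

struct operand {
    bool is_constant;   /* true for an immediate, false for a register */
    long value;         /* immediate value or register number */
};

static bool is_unsigned_code(enum cmp_code code)
{
    return code == CMP_LTU || code == CMP_GTU
        || code == CMP_LEU || code == CMP_GEU;
}

/* True for the combination the target cannot encode: an unsigned
   comparison whose second operand is an immediate. */
bool unsigned_immediate_compare_p(enum cmp_code code, const struct operand *op2)
{
    return is_unsigned_code(code) && op2->is_constant;
}
```

Under this model, an unsigned comparison against a register passes the
check, while the same comparison against a constant does not.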

Before reload pass

(jump_insn 58 56 59 10 20070129.c:73 (set (pc)
(if_then_else (leu:CC (reg:QI 84)
(reg:QI 91))
(label_ref 87)
(pc))) 77 {compare_and_branch_insn} (expr_list:REG_DEAD (reg:QI 84)
(expr_list:REG_BR_PROB (const_int 200 [0xc8])
(nil

After reload pass:

(jump_insn 58 56 59 10 20070129.c:73 (set (pc)
(if_then_else (leu:CC (reg:QI 17 r1 [84])
(const_int 1 [0x1]))
(label_ref 87)
(pc))) 77 {compare_and_branch_insn} (expr_list:REG_BR_PROB
(const_int 200 [0xc8])
(nil)))


How can I overcome this error?
Thanks for your help.

Regards,
Shafi


Supporting FP cmp lib routines

2009-09-14 Thread Mohamed Shafi
Hi all,

I am doing a GCC port for a 32-bit target in GCC 4.4.0. The target uses
hand-coded floating-point compare routines. Generally, functions
return their values in the general purpose registers, but these FP compare
routines return the result in the Status Register itself. So there is
no need for a compare instruction after the function call for an FP
compare. Is there a way to let GCC know that the result of an FP compare
is stored in the Status Register, so that GCC directly generates a
jump operation? How can I implement this?

Regards,
Shafi


How to split 40bit data types load/store?

2009-09-14 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. I have to support a
40-bit data type (_Accum) in the port. The target has 40-bit registers, which
are GPRs and work as 32-bit registers in other modes. The load and store for
_Accum happen in two steps: the lower 32 bits in one instruction and the
upper 8 bits in the next instruction. I want to split the instruction
after reload. I tried to have patterns (for load) like this:

(define_insn "fn_load_ext_sa"
 [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                  UNSPEC_FN_EXT)
   (match_operand:SA 1 "memory_operand" ""))]

(define_insn "fn_load_sa"
 [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                  UNSPEC_FN)
   (match_operand:SA 1 "memory_operand" ""))]


The above patterns work at -O0, but with optimizations I am getting an
ICE. It seems that GCC won't accept an unspec object as the destination
operand. So how can I split the patterns for the load and store of these
data types?

Regards,
Shafi


Reloading is going wrong?

2009-09-03 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. Of the addressing
modes that are allowed by my target, the one with (base register +
offset) is restricted in QImode. The restriction is that if the base
register is not the Stack Pointer, then this kind of address cannot appear in
a load instruction, but only in a store instruction.
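
The restriction can be written down as a small predicate. This is a
sketch in plain C rather than GCC code: the struct, the enum and the
value chosen for SP_REGNUM are all hypothetical.

```c
#include <stdbool.h>

enum access_kind { ACCESS_LOAD, ACCESS_STORE };

/* Hypothetical description of a (base register + offset) address. */
struct base_offset_addr {
    int base_regno;
    long offset;
};

#define SP_REGNUM 31  /* assumed register number for the stack pointer */

/* QImode rule from the text: base + offset is valid in a store with any
   base register, but in a load only when the base is the stack pointer. */
bool qi_base_offset_ok(const struct base_offset_addr *addr, enum access_kind kind)
{
    if (kind == ACCESS_STORE)
        return true;
    return addr->base_regno == SP_REGNUM;
}
```

A QImode load through a non-SP base register with a non-zero offset is
the case this predicate rejects.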

To implement this I added constraints for all supported memory
operations in QImode. The pattern is as follows:

(define_insn "movqi"
  [(set (match_operand:QI 0 "nonimmediate_operand"
         "=b,b,d,t,d, b,Ss0, Ss1, a,Se1, Sb2,  b,Sd3,  d,Se0")
        (match_operand:QI 1 "general_operand"
          "I,  L,d,d,t, Ss0,b,  b,Se1,a,  b, Sd3,b,  Se0,d"))]

where
d is data registers
a is address registers
b is data and address registers
Sb2 is Rn + offset addressing mode
Sd3 is SP + offset addressing mode

Se0 - (Rn), (Rn)+, (Rn)-, (Rn + Ri) and Post modify register addressing mode
Se1 - Se0 excluding Post modify register addressing mode

I believe that there are enough combinations available for reload
to try an alternate addressing mode if it encounters the restricted
addressing mode. But I am still getting the following error:

main1.c:11: error: insn does not satisfy its constraints:
(insn 30 29 7 2 main1.c:9 (set (reg:QI 2 d2 [orig:61 variable.a+1 ] [61])
(mem/s/j:QI (plus:SI (reg:SI 16 r0)
(const_int 1 [0x1])) [0 variable.a+1 S1 A8])) 41
{movqi} (nil))
main1.c:11: internal compiler error: in reload_cse_simplify_operands,
at postreload.c:396


So what am I doing wrong?
Can't this scenario be solved by the reload pass?
How can I generate instructions with the QImode restriction?

Regards,
Shafi


How to write shift and add pattern?

2009-08-28 Thread Mohamed Shafi
Hello all,

I am trying to port a 32-bit architecture in GCC 4.4.0. My target has support
for 1-bit and 2-bit shift-and-add operations. I tried to write patterns for
this, but GCC is not generating them. The following are the patterns
that I have written in the md file:

(define_insn "shift_add_mode"
 [(set (match_operand:SI 0 "register_operand" "")
   (plus:SI (match_operand:SI 3 "register_operand" "")
            (ashift:SI (match_operand:SI 1 "register_operand" "")
                       (match_operand:SI 2 "immediate_operand" ""))))]
 ""
 "shadd1\\t%1, %0"
)

(define_insn "shift_add1_mode"
 [(set (match_operand:SI 0 "register_operand" "")
   (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" "")
                       (match_operand:SI 2 "immediate_operand" ""))
            (match_operand:SI 3 "register_operand" "")))]
 ""
 "shadd1\\t%1, %0"
)

(define_insn "shift_n_add_mode"
 [(set (match_operand:SI 1 "register_operand" "")
   (ashift:SI (match_dup 1)
              (match_operand:SI 2 "immediate_operand" "")))
  (set (match_operand:SI 0 "register_operand" "")
   (plus:SI (match_dup 0)
            (match_dup 1)))]
 ""
 "shadd2\\t%1, %0"
)


As you can see, I have tried several combinations. Since I was only looking for
a pattern match, I didn't bother to write them according to the target;
I thought I would do that after I got a matching pattern. When I debugged,
GCC was generating patterns with multiply, but those get discarded
since the md file doesn't have such patterns. How can I make GCC generate
the shift-and-add pattern? Is GCC generating patterns with multiply due to
cost issues? I haven't specified any cost details.
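
The multiply forms mentioned above describe the same arithmetic as the
shift forms: for a constant shift count c, x << c equals x * 2^c modulo
2^32. The sketch below shows the two spellings side by side in plain C
(not md syntax); the function names are made up for illustration.
Whether the multiply form is the one GCC 4.4.0 presents to the pattern
matcher here is inferred only from the observation in the post.

```c
#include <stdint.h>

/* Shift-and-add written with an explicit left shift (shift < 32). */
uint32_t shadd_shift(uint32_t acc, uint32_t x, unsigned shift)
{
    return acc + (x << shift);
}

/* The same operation written as multiplication by a power of two. */
uint32_t shadd_mult(uint32_t acc, uint32_t x, unsigned shift)
{
    return acc + x * (UINT32_C(1) << shift);
}
```

If the two forms are interchangeable for the target, a pattern written
with a multiplication by 2 or 4 would describe the same instruction as
one written with a shift by 1 or 2.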

Regards,
Shafi


Re: Function argument passing

2009-08-23 Thread Mohamed Shafi
2009/7/16 Richard Henderson r...@redhat.com:
 On 07/13/2009 07:35 AM, Mohamed Shafi wrote:

 So i made both TARGET_STRICT_ARGUMENT_NAMING and
 PRETEND_OUTGOING_VARARGS_NAMED to return false. Is this correct?

 Yes.

 How to make the varargs argument to be promoted to 32bits when the
 normal argument don't require promotion as mentioned in point (1) ?

 There is no way at present.  You'll have to extend the promote_function_args
 hook to accept a bool named parameter.

 4. A long long return value is returned in R6 and R7, R6 containing
 the most significant  long word and R7 containing the least
 significant long word, regardless of the endianess mode.
 Solution: Used TARGET_RETURN_IN_MSB to return true when the mode is
 little endian

 I don't believe this is correct.  RETURN_IN_MSB is supposed to be handling
 the case where the data to be returned is smaller than the register in which
 it is returned -- e.g. a 3 byte structure returned in a 32-bit register.  I
 believe you should be using...

 5. If the first argument is a long long , it is passed in R6 and R7,
 R6 containing the most significant long word and R7 containing the
 least significant long word, regardless of the endianess mode.
 For return value, i have done as mentioned in (4) but I am not sure
 how to control the argument passing so that R6 contains the msw and R7
 contains lsw, regardless of the endianess mode.

 For both return values and arguments, we support a PARALLEL which allows the
 target to indicate where each piece of the value is located.  It's also true
 that the generated rtl is more complicated, so you'd want to avoid this
 solution in big-endian mode, when it isn't needed.

 So here you would do

 if (WORDS_BIG_ENDIAN)
  return gen_rtx_REG (DImode, 6);
 else
  {
    rtx r6, r7, par;

    r7 = gen_rtx_REG (SImode, 7);
    r7 = gen_rtx_EXPR_LIST (SImode, r7, GEN_INT (0));
    r6 = gen_rtx_REG (SImode, 6);
    r6 = gen_rtx_EXPR_LIST (SImode, r6, GEN_INT (4));
    par = gen_rtx_PARALLEL (DImode, gen_rtvec (2, r7, r6));
    return par;
  }

 See the docs for FUNCTION_ARG for details.


I am getting the following error when I make a function call.

(call_insn 18 17 19 3 1.c:29 (set (parallel:DI [
(expr_list:REG_UNUSED (reg:SI 7 d7)
(const_int 0 [0x0]))
(expr_list:REG_UNUSED (reg:SI 6 d6)
(const_int 4 [0x4]))
])
(call:SI (mem:SI (symbol_ref:SI (dd1) [flags 0x41]
function_decl 0xb7bfa980 dd1) [0 S4 A8])
(const_int 8 [0x8]))) -1 (nil)
(expr_list:REG_DEP_TRUE (use (reg:SI 7 d7))
(expr_list:REG_DEP_TRUE (use (reg:SI 6 d6))
(nil

How do I write a pattern for this?
Another question: in little-endian mode, will the compiler know that the
return value is actually stored the other way round, i.e. in
big-endian word order, and generate code accordingly for the rest of
the program?
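
The register/offset pairing in the quoted suggestion can be modelled in
plain C. This is only an illustration of the layout, not GCC code: the
struct and both functions are hypothetical, and register numbers 6 and 7
are taken from the ABI description in this thread.

```c
/* One piece of a value spread over several registers: which register
   holds the bytes starting at byte_offset of the in-memory layout. */
struct piece {
    int regno;
    int byte_offset;
};

/* Fill pieces[0..1] for a 64-bit value in R6/R7 so that R6 always holds
   the most significant word. */
void di_pieces(int words_big_endian, struct piece pieces[2])
{
    /* Byte offset 0 is the most significant word only on big-endian. */
    pieces[0].regno = words_big_endian ? 6 : 7;
    pieces[0].byte_offset = 0;
    pieces[1].regno = words_big_endian ? 7 : 6;
    pieces[1].byte_offset = 4;
}

/* Register that receives the most significant 32 bits. */
int msw_regno(int words_big_endian)
{
    struct piece p[2];
    di_pieces(words_big_endian, p);
    /* The MSW lives at offset 0 on big-endian, offset 4 on little-endian. */
    return words_big_endian ? p[0].regno : p[1].regno;
}
```

In this model the most significant word ends up in R6 for either
endianness, because each piece carries its own byte offset.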

Regards,
Shafi


DI mode and endianess

2009-08-19 Thread Mohamed Shafi
HI,

I am trying to port a 32-bit target in GCC 4.4.0. My target supports
both big and little endian, selected using a target switch. So I
have defined the macro

#define WORDS_BIG_ENDIAN (TARGET_BIG_ENDIAN)

Currently I have written patterns only for SImode moves, so GCC
synthesizes the DImode operations for me. The problem is that GCC is
generating the same code for both big and little endian, i.e. for the
following code

extern long long h;
extern long long j;
extern long long k;
int temp()
{
  k = j+h;
  return 0;
}

the compiler is generating the following code.

section .text local
ALIGN   16
GLOBAL  _temp
_temp:
mov  _h,d4
mov  _h+4,d5
mov  _j,d2
mov  _j+4,d3
add  d4,d2
adc  d5,d3
mov  d2,_k
mov  d3,_k+4
ret
SIZE  _temp,*-_temp


irrespective of the endianness.
What could I be missing here? Should I add anything specific for this
in the back end?
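
To make the expected difference concrete, here is a sketch of a 64-bit
add done as two 32-bit word operations in plain C. The struct and
function names are hypothetical. In the listing above, add is applied
to the words at _h and _j and adc to the words at _h+4 and _j+4, which
corresponds to words_big_endian == 0 in this model; with big-endian word
order one would expect the two roles to be swapped.

```c
#include <stdint.h>

/* A 64-bit value as two 32-bit words in memory order:
   w[0] is at the lower address (_x), w[1] at _x+4. */
struct di { uint32_t w[2]; };

/* Add two 64-bit values word by word, as an add/adc pair does.
   words_big_endian selects which memory word is the low half. */
struct di di_add(struct di a, struct di b, int words_big_endian)
{
    int lo = words_big_endian ? 1 : 0;
    int hi = 1 - lo;
    struct di r;

    r.w[lo] = a.w[lo] + b.w[lo];            /* add */
    uint32_t carry = r.w[lo] < a.w[lo];     /* carry out of the low word */
    r.w[hi] = a.w[hi] + b.w[hi] + carry;    /* adc */
    return r;
}
```

The carry has to travel from the low word to the high word, so which
memory word gets the plain add depends on the word order.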

Regards,
Shafi


Re: About feasibility of implementing an instruction

2009-08-14 Thread Mohamed Shafi
2009/7/3 Ian Lance Taylor i...@google.com:
 Mohamed Shafi shafi...@gmail.com writes:

 I just want to know about the feasibility of implementing an
 instruction for a port in gcc 4.4
 The target has 40 bit register where the normal load/store/move
 instructions will be able to access the 32 bits of the register. In
 order to move data into the rest of the register [b32 to b39] the data
 has to be stored into a 32bit memory location. The data should be
 stored in such a way that if it is stored for 0-7 in memory the data
 can be moved to b32-b39 of a even register and if the data in the
 memory is stored in 16-23 of the memory word then it can be moved to
 b32-b39 of a odd register. Hope i make myself clear.

 Will it be possible to implement this in the gcc back-end so that the
 particular instruction is supported?

 In general, the gcc backend can do anything, so, yes, this can be
 supported.  It sounds like this is not a general purpose register, so I
 would probably do it using a builtin function.  If you need to treat it
 as a general purpose register (i.e., the register is managed by the
 register allocator) then you will need a secondary reload to handle
 this.


This is a general purpose register. All 40 bits are used only for
fixed-point data types. When the register is used for a fixed-point data
type, all operations except initialization are done through
built-in functions. For initialization, the immediate value has to move
through memory, i.e. there is no immediate load when the data is
40-bit. So I am planning to control this using the LEGITIMATE_CONSTANT_P
macro. But then I have a question. If all the operations are through
intrinsics, will there be a need to spill the variables used in
the built-in functions? If so, then depending on whether the register that
gets spilled is even or odd, [b32 to b39] of the register gets stored in
memory at [b0 to b7] or [b16 to b23] respectively. Will I be able to
keep track of the spilling so that I can reload into the proper
register?
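
The even/odd placement rule can be stated as a pair of helpers. This is
a sketch in plain C with hypothetical function names; it only models
where the extension byte sits in the 32-bit memory word, not how GCC
would spill or reload it.

```c
#include <stdint.h>

/* Bit position inside the 32-bit memory word where the extension byte
   (register bits b32..b39) is kept: 0 for even registers, 16 for odd. */
unsigned ext_shift(unsigned regno)
{
    return (regno & 1u) ? 16u : 0u;
}

/* Place the 8-bit extension of register regno into a memory word. */
uint32_t pack_ext(uint8_t ext, unsigned regno)
{
    return (uint32_t)ext << ext_shift(regno);
}

/* Recover the 8-bit extension for register regno from a memory word. */
uint8_t unpack_ext(uint32_t word, unsigned regno)
{
    return (uint8_t)(word >> ext_shift(regno));
}
```

A word packed for an odd register and read back for an even one yields
the wrong byte, which is why the parity of the reload register matters.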

I hope I am clear.

Regards
Shafi


Restrictive addressing mode

2009-08-10 Thread Mohamed Shafi
Hello all,

I am trying to port a 32-bit target in GCC 4.4.0.
Of the addressing modes that are allowed by my target, the one with
(base register + offset) is restricted in QImode.
The restriction is that if the base register is not the Stack Pointer, then
this kind of address cannot appear in a load instruction, but only in a
store instruction. So how can I implement this? Should I write a
define_expand for movqi and force the address into a register when I get this
addressing mode?

Please let me know your thoughts on this.

Regards,
shafi


Re: How to set the alignment

2009-08-05 Thread Mohamed Shafi
2009/8/5 Jim Wilson wil...@codesourcery.com:
 On Tue, 2009-08-04 at 11:09 +0530, Mohamed Shafi wrote:
  i am not able to implement the alignment for short.
  The following is are the macros that i used for this
  #define PARM_BOUNDARY 8
  #define STACK_BOUNDARY 64
 The target is 32bit . The first two parameters are passed in registers
 and the rest in stack. For the parameters that are passed in stack the
 alignment is that of the data type. The stack pointer is 8 byte
 aligned. char is 1 byte, int is 4 byte and short is 2 byte. The code
 that is getting generated is give below (-O0 -fomit-frame-pointer)

 Er, wait.  You set PARM_BOUNDARY to 8.  This means all arguments will be
 padded to at most an 8-bit boundary, which means that yes, a short after
 a char will have only 1 byte alignment.  If you want all arguments to
 have 2-byte alignment, then you need to set PARM_BOUNDARY to 16.  But
 you probably want a value of 32 here so that 4-byte ints get 4-byte
 alignment.  This will allocate a minimum 4-byte stack slot for every
 argument.  I don't know the calling convention, so I don't know exactly
 how you want arguments arranged on the stack.

 If you are pushing arguments, then you can lie in the PUSH_ROUNDING
 macro.  You could say for instance that one byte pushes always push 2
 bytes.  This ensures that the stack always has 2-byte alignment while
 pushing arguments.  If your push instruction doesn't actually do this,
 then you need to modify the pushqi pattern to emit two pushes or use a
 HImode push to get the right behaviour.

 Try looking at the code in store_one_arg in calls.c, and emit_push_insn
 in expr.c.

What I did was to define the FUNCTION_ARG_BOUNDARY macro to return the
alignment as per the requirement, i.e. 8 bits for char, 16 bits for
short and 32 bits for int, and kept PARM_BOUNDARY at 8. Now the compiler is
emitting the alignment properly.
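
The rule described above can be sketched as follows. These are plain C
helpers with hypothetical names, not the macro definition used in the
port; they only illustrate how type-sized alignment affects stack
offsets.

```c
#include <stddef.h>

/* Alignment in bits for a stack argument of the given size in bytes,
   following the rule in the text: the alignment is that of the data
   type (8 for char, 16 for short, 32 for int). */
unsigned arg_boundary_bits(size_t size_bytes)
{
    if (size_bytes >= 4)
        return 32;
    if (size_bytes >= 2)
        return 16;
    return 8;
}

/* Round offset up to the next multiple of the argument's alignment. */
size_t next_arg_offset(size_t offset, size_t size_bytes)
{
    size_t align_bytes = arg_boundary_bits(size_bytes) / 8;
    return (offset + align_bytes - 1) / align_bytes * align_bytes;
}
```

For stack arguments (char w, short e, short r) this places w at offset
0, e at offset 2 and r at offset 4, so both shorts are 2-byte aligned.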

Is this OK?

Regards,
Shafi


Re: How to set the alignment

2009-08-03 Thread Mohamed Shafi
2009/8/3 Jim Wilson wil...@codesourcery.com:
 On 08/03/2009 02:14 AM, Mohamed Shafi wrote:

 short - 2 bytes
 i am not able to implement the alignment for short.
 The following is are the macros that i used for this
 #define PARM_BOUNDARY 8
 #define STACK_BOUNDARY 64

 You haven't explained what the actual problem is.  Is there a problem with
 global variables?  Is the variable initialized or uninitialized? If it is
 uninitialized, is it common?  If this a local variable?  Is this a function
 argument or parameter?  Is this a named or unnamed (stdarg) argument or
 parameter?  Etc.  It always helps to include a testcase.

 You should also mention what gcc is currently emitting, why it is wrong, and
 what the output should be instead.

 All this talk about stack and parm boundary suggests that it might be an
 issue with function arguments, in which case you will probably have to
 describe the calling conventions a bit so we can understand what you want.

This is the test case that I tried:

short funs (int a, int b, char w,short e,short r)
{
  return e+r;
}

The target is 32-bit. The first two parameters are passed in registers
and the rest on the stack. For the parameters that are passed on the stack, the
alignment is that of the data type. The stack pointer is 8-byte
aligned. char is 1 byte, int is 4 bytes and short is 2 bytes. The code
that is generated is given below (-O0 -fomit-frame-pointer):

funs:
add  16,sp
mov  d0,(sp-16)
mov  d1,(sp-12)
movh  (sp-19),d0
movh  d0,(sp-8)
movh  (sp-21),d0
movh  d0,(sp-6)
movh  (sp-8),d1
movh  (sp-6),d0
add d1,d0,d0
sub  16,sp
ret


From the above code you can see that some of the half-word accesses are
not aligned on a 2-byte boundary.

So where am I going wrong?
I hope this info is enough.

Regards,
Shafi


Re: Output sections

2009-08-01 Thread Mohamed Shafi
2009/8/1 Dave Korn dave.korn.cyg...@googlemail.com:
 Mohamed Shafi wrote:
 I am looking for adding something to the end of each section in the
 generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to
 keep track of the sections that are being emitted. But from
 TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the
 sections stored in some global variable?

  I'm not sure I understand the question.  You enter a section simply by
 emitting the correct .section directive into the asm output.  You re-enter it 
 by
 the same method.

    cheers,
      DaveK


OK, then I don't understand your solution.

 you could use the TARGET_ASM_FILE_END hook to output
 directives that re-enter each used section and then output your new 
 directive.

If I want to do the following in the assembly output

section .code
.
.
..
section_end


you are saying that if I emit a section directive, the compiler will
switch to the previously emitted section, and then I have to somehow
seek to the end of that section and emit my 'section_end' directive?

Shafi


Re: Output sections

2009-08-01 Thread Mohamed Shafi
2009/8/1 Dave Korn dave.korn.cyg...@googlemail.com:
 Mohamed Shafi wrote:
 2009/8/1 Dave Korn dave.korn.cyg...@googlemail.com:
 Mohamed Shafi wrote:
 I am looking for adding something to the end of each section in the
 generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to
 keep track of the sections that are being emitted. But from
 TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the
 sections stored in some global variable?
  I'm not sure I understand the question.  You enter a section simply by
 emitting the correct .section directive into the asm output.  You re-enter 
 it by
 the same method.

 Ok, Then i don't understand your solution.

  Ah, it looks like I didn't quite understand your problem.

 you could use the TARGET_ASM_FILE_END hook to output
 directives that re-enter each used section and then output your new 
 directive.

 if i want to do the following in the assembly output

 section .code
 .
 .
 ..
 section_end

  I thought you just wanted to have

   .section .code
   section_end
   .section .data
   section_end

 ... etc. for all used sections, at the very end of the file; after all, all 
 the
 contributions to a section get concatenated in the assembler.  Now you seem to
 be saying that you want to have multiple section_end directives throughout the
 file, every time the current section changes.

 you are saying that if i emit a section directive the compiler will
 switch to the previously emitted section and then i have to somehow
 seek to the end of that section and emit my 'section_end' directive?

  I think you may need to re-read the assembler manual about sections, you are 
 a
 little confused about the concepts.  The compiler doesn't really switch
 anything; the compiler emits .section directives, in response to which the
 *assembler* switches to emit code in the chosen section.  The compiler doesn't
 keep track of sections; it just randomly emits directives for whichever one it
 wants the assembly output to go into at any given time, according to whether
 it's generating the assembly for a function or a variable or other data 
 object.


OK. Will TARGET_ASM_NAMED_SECTION get invoked for the normal sections like
text, data and bss? I tried to include this hook in my code, but
execution never reaches the hook for these sections.

Shafi


Re: Output sections

2009-07-31 Thread Mohamed Shafi
2009/7/18 Dave Korn dave.korn.cyg...@googlemail.com:
 Mohamed Shafi wrote:
 Hello all,

 Is it possible to emit a assembler directive at the end of each sections?
 Say like section_end
 Is there any support for doing something like this in the back-end files?
 Or should i need to the make changes in the gcc sources?
 Is so do does anyone know in which function it should happen?

  There isn't really such a concept as 'end of a section' until you get to
 final-link time and get all the contributions from different .o files to a
 given section.  During assembler output GCC treats sections as random access,
 switching freely from one to another and back; it doesn't have any concept of
 starting/stopping/opening/closing a section but just jumps into any one it
 likes completely ad-hoc.

  Assuming you're happy with adding something to the end of each section in
 each generated .s file, you could use the TARGET_ASM_FILE_END hook to output
 directives that re-enter each used section and then output your new directive.
  You may find it hard to know which sections have been used or not in a given
 file - you can define TARGET_ASM_NAMED_SECTION and make a note of which
 sections get invoked there, but I'm not sure if that gets called for all
 sections e.g. init/fini, you may have to try it and see.


I am looking to add something to the end of each section in the
generated .s file. Using TARGET_ASM_NAMED_SECTION I will be able to
keep track of the sections that are being emitted. But from the
TARGET_ASM_FILE_END hook, how can I re-enter each section? Are the
sections stored in some global variable?
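
One way to read the suggestion is that the port keeps its own list of
section names. The following is a minimal sketch in plain C, not GCC
code: the table, both function names and the size limits are
hypothetical, and the "section NAME" / "section_end" spellings are taken
from the example in this thread.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SECTIONS 32
#define MAX_NAME     64

/* Hypothetical port-private record of every section name seen so far. */
static char seen[MAX_SECTIONS][MAX_NAME];
static int n_seen;

/* Would be called from the port's named-section hook: remember the name
   once.  Returns 1 if recorded or already known, 0 if the table is full. */
int note_section(const char *name)
{
    for (int i = 0; i < n_seen; i++)
        if (strcmp(seen[i], name) == 0)
            return 1;
    if (n_seen == MAX_SECTIONS)
        return 0;
    snprintf(seen[n_seen++], MAX_NAME, "%s", name);
    return 1;
}

/* Would be called from the port's file-end hook: write, for each recorded
   section, a directive that re-enters it followed by the closing
   directive.  Returns the number of characters written into buf. */
size_t render_section_ends(char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';
    for (int i = 0; i < n_seen; i++) {
        int n = snprintf(buf + used, size - used,
                         "section %s\nsection_end\n", seen[i]);
        if (n < 0 || (size_t)n >= size - used) {
            buf[used] = '\0';   /* drop the entry that did not fit */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

In a real port the text would presumably be written to the assembler
output stream instead of a buffer; the buffer keeps the sketch easy to
check in isolation.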

Shafi


Re: current_function_outgoing_args_size

2009-07-19 Thread Mohamed Shafi
2009/7/18 Ian Lance Taylor i...@google.com:
 Mohamed Shafi shafi...@gmail.com writes:

 The change logs says that current_function_outgoing_args_size is no
 more available. But it doesnt say with what it is replaced. Looking at
 the other targets i find that its replaced with some field in a
 structure crtl. Where is this defined/declared.

 crtl is declared in function.h.

 I am working in GCC 4.4.0. I checked with the mainline internals. Even
 there the references of these deleted variables are not replaced.
 Could somebody please take care of this.

And also references to regs_ever_live.

Regards,
Shafi


Output sections

2009-07-18 Thread Mohamed Shafi
Hello all,

Is it possible to emit an assembler directive at the end of each section,
say something like section_end?
Is there any support for doing something like this in the back-end files,
or do I need to make changes in the GCC sources?
If so, does anyone know in which function it should happen?

Regards,
Shafi


current_function_outgoing_args_size

2009-07-18 Thread Mohamed Shafi
Hello all,

The change logs say that current_function_outgoing_args_size is no
longer available, but they don't say what it was replaced with. Looking at
the other targets, I find that it has been replaced with a field in a
structure called crtl. Where is this defined/declared?

I am working with GCC 4.4.0. I checked the mainline internals manual. Even
there, the references to these deleted variables have not been replaced.
Could somebody please take care of this?

Regards,
Shafi


Function argument passing

2009-07-13 Thread Mohamed Shafi
Hello all,

I am doing a port for a private target in GCC 4.4.0. It generates code
for both little and big endian.

The ABI for the target is as follows:

1. All arguments passed on the stack are passed using their alignment
constraints.
Solution: For this to happen, no argument promotion should be done.

2. Functions with a variable number of arguments pass the last fixed
argument and all subsequent variable arguments on the stack. Such
arguments of fewer than 4 bytes are located on the stack as if the
argument had been promoted to 32 bits.

Solution:
For TARGET_STRICT_ARGUMENT_NAMING the internals manual says the following:

This hook controls how the named argument to FUNCTION_ARG is set for
varargs and stdarg functions. If this hook returns true, the named
argument is always true for named arguments, and false for unnamed
arguments. If it returns false, but
TARGET_PRETEND_OUTGOING_VARARGS_NAMED returns true, then all arguments
are treated as named. Otherwise, all named arguments except the last
are treated as named.

So I made both TARGET_STRICT_ARGUMENT_NAMING and
TARGET_PRETEND_OUTGOING_VARARGS_NAMED return false. Is this correct?
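
The three cases in the quoted paragraph can be written as a small
decision function. This is a sketch in plain C that restates the quoted
rule; the function name and its parameters are hypothetical and it is
not how GCC itself computes the flag.

```c
#include <stdbool.h>

/* Value of the "named" flag for argument number index (0-based) in a
   call with n_fixed declared parameters, following the rule quoted
   above from the internals manual. */
bool arg_named_p(bool strict_naming, bool pretend_named,
                 int index, int n_fixed)
{
    if (strict_naming)
        return index < n_fixed;     /* named iff it is a declared parameter */
    if (pretend_named)
        return true;                /* every argument is treated as named */
    return index < n_fixed - 1;     /* all declared parameters but the last */
}
```

With both hooks returning false, the last declared parameter and every
variable argument come out as unnamed, which lines up with point 2 of
the ABI, where those are the arguments passed on the stack.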

How can I make the varargs arguments be promoted to 32 bits when
normal arguments don't require promotion, as mentioned in point (1)?

3. A function returning a structure or union receives in D0 the
address of the returned structure or union. The caller allocates space
for the returned object.
Solution: Used TARGET_FUNCTION_VALUE and returned D0 reg_rtx for
structure and unions.

4. A long long return value is returned in R6 and R7, R6 containing
the most significant long word and R7 containing the least
significant long word, regardless of the endianness mode.
Solution: Used TARGET_RETURN_IN_MSB to return true when the mode is
little endian.

5. If the first argument is a long long, it is passed in R6 and R7,
R6 containing the most significant long word and R7 containing the
least significant long word, regardless of the endianness mode.
For the return value, I have done as mentioned in (4), but I am not sure
how to control the argument passing so that R6 contains the msw and R7
contains the lsw, regardless of the endianness mode.


Regards,
Shafi


CALL_USED_REGISTERS vs CALL_REALLY_USED_REGISTERS

2009-07-10 Thread Mohamed Shafi
Hello all,

The GCC 4.4.0 internals manual says:
[Macro] CALL_REALLY_USED_REGISTERS
Like CALL_USED_REGISTERS except this macro doesn’t require that the
entire set of FIXED_REGISTERS be included. (CALL_USED_REGISTERS must be
a superset of FIXED_REGISTERS). This macro is optional. If not specified,
it defaults to the value of CALL_USED_REGISTERS.

But it doesn't say why one needs to use this.
What is the need for the macro CALL_REALLY_USED_REGISTERS when
compared to CALL_USED_REGISTERS?
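
The one difference the quoted text states is the superset requirement.
A small sketch in plain C of that requirement follows; the function name
is hypothetical and the register tables in the note below it are
invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when every register marked fixed is also marked call-used,
   i.e. call_used is a superset of fixed, as the quoted text requires
   for CALL_USED_REGISTERS. */
bool call_used_covers_fixed(const char *fixed, const char *call_used, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (fixed[i] && !call_used[i])
            return false;
    return true;
}
```

For a made-up four-register target with fixed = {0, 0, 0, 1}, a table
{1, 1, 0, 1} satisfies the check, while {1, 1, 0, 0} does not; per the
quoted text, only the second macro is allowed to look like the latter.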

regards,
Shafi


About feasibility of implementing an instruction

2009-07-01 Thread Mohamed Shafi
Hello all,

I just want to know about the feasibility of implementing an
instruction for a port in GCC 4.4.
The target has 40-bit registers, where the normal load/store/move
instructions can access only the lower 32 bits of the register. In
order to move data into the rest of the register [b32 to b39], the data
has to be stored in a 32-bit memory location. The data should be
stored in such a way that if it is stored in bits 0-7 of the memory word, it
can be moved to b32-b39 of an even register, and if it
is stored in bits 16-23 of the memory word, it can be moved to
b32-b39 of an odd register. I hope I have made myself clear.

Will it be possible to implement this in the GCC back end so that this
particular instruction is supported?


Regards,
Shafi


Variable Length Execution Set?

2009-05-27 Thread Mohamed Shafi
Hi all,

Does GCC support architectures that have a Variable Length Execution Set (VLES)?
Are there any developments happening in this direction?

Regards,
Shafi


Re: Variable Length Execution Set?

2009-05-27 Thread Mohamed Shafi
2009/5/27 Ian Lance Taylor i...@google.com:
 Mohamed Shafi shafi...@gmail.com writes:

 Does GCC support architectures that has Variable Length Execution Set (VLES)?
 Are there any developments happening in this direction?

 gcc supports many instruction sets whose instructions are not all the
 same size, including x86.  In particular, gcc supports ia64, which uses
 bundling.  If you mean something else, I think you need to give more
 details.

I know that GCC supports VLIW. VLES is similar to VLIW, except that
a packet can have a variable number of instructions, i.e. each packet
should contain at least one instruction, with a maximum of six instructions
per packet.

Shafi


Re: insn does not satisfy its constraints

2008-08-31 Thread Mohamed Shafi




----- Original Message -----
 From: Omar Torres [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: gcc@gcc.gnu.org
 Sent: Saturday, August 30, 2008 12:11:36 AM
 Subject: Re: insn does not satisfy its constraints
 
 shafi wrote:
 Operand 0 is a register
 Operand 1 is a memory
 Operand 2 is a register
 
 
  The md description for this instruction is:
 
  ;; addhi3
  (define_expand "addhi3"
    [(set (match_operand:HI 0 "register_operand" "")
      (plus:HI (match_operand:HI 1 "cool_addhi_operand" "")
               (match_operand:HI 2 "cool_addhi_operand" "")))]

)
 
  (define_insn "*addhi3"
    [(set (match_operand:HI 0 "register_operand" "=r ,r  ,r")
      (plus:HI (match_operand:HI 1 "cool_addhi_operand" "%0 ,rim,r")
               (match_operand:HI 2 "cool_addhi_operand" "rim,0  ,r")))]

 
 
 Do you have an option where operand 0 is reg and operand 1 is mem and
  operand 2 is reg?
 
 My purpose is to describe the three possible scenarios:
 1)  Operand 0 is a register
  Operand 1 is the same register as operand 0
  Operand 2 is a register, immediate or memory
 
 2)  Operand 0 is a register
  Operand 1 is a register, immediate or memory
   Operand 2 is the same register as operand 0
 
 3)  Operand 0 is a register
  Operand 1 is a register
  Operand 2 is also a register
 
 
 I am not sure what rim is for?
 
 rim is a shortcut for r, m, i. I think it is allowed to mix several
 constraints like this, right?
 
   So rim is a user-defined constraint. Then I think you may want to look
carefully at EXTRA_CONSTRAINT_STR; this is probably where you are going
wrong.

HTH
Shafi



  


Need a pointer for debugging

2008-07-24 Thread Mohamed Shafi
Hello all,

I am involved in porting GCC 4.1.2 to a 16-bit target. The
target doesn't have any SImode comparisons. Most of the time SImode
comparisons are synthesized using HImode comparisons. But in some
instances SImode patterns are generated as they are, with the code
expanded in the pattern template. During regression testing I got an ICE
related to SImode comparisons, more specifically in
unroll_and_peel_loops().

In unroll_loop_runtime_iterations, called from unroll_and_peel_loops (),
the following piece of code


  for (i = 0; i < n_peel; i++)
    {
      /* Peel the copy.  */
      sbitmap_zero (wont_exit);
      if (i != n_peel - 1 || !last_may_exit)
        SET_BIT (wont_exit, 1);
      ok = duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
                                          loops, 1,
                                          wont_exit, desc->out_edge,
                                          remove_edges, &n_remove_edges,
                                          DLTHE_FLAG_UPDATE_FREQ);
      gcc_assert (ok);

      /* Create item for switch.  */
      j = n_peel - i - (extra_zero_check ? 0 : 1);
      p = REG_BR_PROB_BASE / (i + 2);

      preheader = loop_split_edge_with (loop_preheader_edge (loop), NULL_RTX);
      branch_code = compare_and_jump_seq (copy_rtx (niter), GEN_INT (j), EQ,
                                          block_label (preheader), p,
                                          NULL_RTX);

      swtch = loop_split_edge_with (single_pred_edge (swtch), branch_code);
      set_immediate_dominator (CDI_DOMINATORS, preheader, swtch);
      single_pred_edge (swtch)->probability = REG_BR_PROB_BASE - p;
      e = make_edge (swtch, preheader,
                     single_succ_edge (swtch)->flags & EDGE_IRREDUCIBLE_LOOP);
      e->probability = p;
    }



generates SImode comparisons via 'compare_and_jump_seq()'. These
SImode comparisons are synthesized using HImode comparisons. This
results in the generation of two jump instructions, and the ICE occurs
because both jump instructions end up in the same basic block.
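
To make the shape of the problem concrete, here is a minimal standalone C
sketch (not GCC code) of how a 32-bit equality test decomposes into two
16-bit tests, each of which becomes its own conditional jump. The function
name eq32_via_16 is made up for illustration, and which half is tested
first is an assumption.

```c
#include <stdint.h>

/* Illustrative only: 32-bit equality built from two 16-bit compares.
   Each `if` stands for one conditional jump after expansion.  */
int
eq32_via_16 (uint32_t a, uint32_t b)
{
  uint16_t a_hi = (uint16_t) (a >> 16), b_hi = (uint16_t) (b >> 16);
  uint16_t a_lo = (uint16_t) a, b_lo = (uint16_t) b;

  if (a_hi != b_hi)   /* first jump: halves differ, skip to "not equal" */
    return 0;
  if (a_lo == b_lo)   /* second jump: remaining halves match, take the branch */
    return 1;
  return 0;
}
```

Because a jump normally has to end a basic block, two back-to-back jumps
produced this late need the block to be split between them.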

From  bug.c.21.loop2_unroll

;; Start of basic block 26, registers live: (nil)
(note 248 174 245 26 [bb 26] NOTE_INSN_BASIC_BLOCK)

(jump_insn 245 248 246 26 (set (pc)
(if_then_else (ne:CC (subreg:HI (reg:SI 113) 0)
(const_int 0 [0x0]))
(label_ref 247)
(pc))) -1 (nil)
(nil))

(jump_insn 246 245 247 26 (set (pc)
(if_then_else (eq:CC (subreg:HI (reg:SI 113) 2)
(const_int 0 [0x0]))
(label_ref 244)
(pc))) -1 (nil)
(nil))

(code_label 247 246 224 26 18  [0 uses])
;; End of basic block 26, registers live:
 (nil)


Note that if i compile code which has SImode EQ comparisons, the
basic blocks and code are generated properly. Right now i am stuck in
debugging.
Could anybody please provide me with any pointers?

Regards,
Shafi


ICE in flow.c - Gcc 4.1.2 private port

2008-07-21 Thread Mohamed Shafi
Hello all,

For the target that i am porting, if support for partial argument
passing is enabled i get the following error:
error: Attempt to delete prologue/epilogue insn:
internal compiler error: in propagate_one_insn, at flow.c:1699

This is a 16bit target with 4 argument registers. FRAME_POINTER_REQUIRED
is defined to 0.
The code that is being compiled is:

f(float a[],int b[],int c,float d)
{
}


From *.c.00.expand

;; Start of basic block 0, registers live: (nil)
(note 15 2 6 0 [bb 0] NOTE_INSN_BASIC_BLOCK)

(insn 6 15 7 0 (set (reg/v/f:HI 24 [ a ])
(reg:HI 0 R0 [ a ])) -1 (nil)
(nil))

(insn 7 6 8 0 (set (reg/v/f:HI 25 [ b ])
(reg:HI 1 R1 [ b ])) -1 (nil)
(nil))

(insn 8 7 9 0 (set (reg/v:HI 26 [ c ])
(reg:HI 2 R2 [ c ])) -1 (nil)
(nil))

(insn 9 8 12 0 (set (mem/c/i:HI (reg/f:HI 18 virtual-incoming-args) [0
d+0 S2 A16])
(reg:HI 3 R3)) -1 (nil)
(nil))

(insn 12 9 10 0 (clobber (reg/v:SF 27 [ d ])) -1 (nil)
(nil))

(insn 10 12 11 0 (set (subreg:HI (reg/v:SF 27 [ d ]) 0)
(mem/c/i:HI (reg/f:HI 18 virtual-incoming-args) [0 d+0 S2
A16])) -1 (nil)
(nil))

(insn 11 10 13 0 (set (subreg:HI (reg/v:SF 27 [ d ]) 2)
(mem/c/i:HI (plus:HI (reg/f:HI 18 virtual-incoming-args)
(const_int 2 [0x2])) [0 d+2 S2 A16])) -1 (nil)
(nil))

(note 13 11 14 0 NOTE_INSN_FUNCTION_BEG)


from *.c.37.lreg

(note 2 0 15 NOTE_INSN_DELETED)

;; Start of basic block 0, registers live: 3 [R3] 12 [R12]
(note 15 2 9 0 [bb 0] NOTE_INSN_BASIC_BLOCK)

(insn 9 15 13 0 (set (mem/c/i:HI (reg/f:HI 12 R12) [0 d+0 S2 A16])
(reg:HI 3 R3)) 1 {movhi_internal} (nil)
(nil))

(note 13 9 17 0 NOTE_INSN_FUNCTION_BEG)



from *.c.40.flow2

(note 15 2 31 0 [bb 0] NOTE_INSN_BASIC_BLOCK)

(insn/f 31 15 32 0 (set (reg/f:HI 12 R12)
(minus:HI (reg/f:HI 12 R12)
(const_int 2 [0x2]))) -1 (nil)
(nil))

(insn/f 32 31 33 0 (set (mem:HI (reg/f:HI 12 R12) [0 S2 A16])
(reg:HI 3 R3)) -1 (nil)
(nil))

(note 33 32 9 0 NOTE_INSN_PROLOGUE_END)

(insn 9 33 13 0 (set (mem/c/i:HI (reg/f:HI 12 R12) [0 d+0 S2 A16])
(reg:HI 3 R3)) 1 {movhi_internal} (nil)
(nil))

(note 13 9 17 0 NOTE_INSN_FUNCTION_BEG)



When an argument is passed partially,
'current_function_pretend_args_size' is initialized and the prologue will
set up stack space accordingly. Based on the live information,
'propagate_one_insn()' is trying to delete the insn from the
prologue. My question is: is gcc supposed to delete insn 9, even before
prologue generation?
If that is not the case, where am i going wrong?

Regards,
Shafi


What are the functions that i can use?

2008-07-19 Thread Mohamed Shafi
Hello all,

I am involved in porting gcc 4.1.2.
For some processing i need to know whether a register is being defined
and used in a particular instruction. Till now i have been using
'refers_to_regno_p()' to know whether a register is being used in an
instruction and 'modified_in_p()' to know whether a register is being
defined in the instruction. But 'refers_to_regno_p()' also looks into
the expr_list and/or notes in an instruction. So sometimes
refers_to_regno_p() returns 1 when the register is referred to in the
expr_list of the instruction even though it is not in the use list of
the instruction.

Could anyone tell me the functions that i can use to find out whether
a register is being used and/or defined in a particular instruction?

Regards,
Shafi


A question about varargs

2008-07-16 Thread Mohamed Shafi
Hello all,

I am involved in the porting of gcc 4.1.2 for 16 bit target. For this
target size of long long is 32bits. For the following code

#define VALUE 0x1B4E81B4E81B4DLL
#define AFTER 0x55

//void test (int n, long long q, int y);
void test (int n, ...);

int
main ()
{
  test (1, VALUE, AFTER);
  exit(0);
}

i find that the machine modes of the arguments of test are HImode,
DImode and HImode. When i replace function 'test' with a normal one
instead of the varargs one, i find that the machine modes are HImode,
SImode and HImode respectively.
My question is: even if the function is a vararg function, shouldn't the
mode of the argument be SImode instead of DImode, since long long is
only 32bit for the target?

Regards,
Shafi


Re: A question about varargs

2008-07-16 Thread Mohamed Shafi
2008/7/16 Ian Lance Taylor [EMAIL PROTECTED]:
 Mohamed Shafi [EMAIL PROTECTED] writes:

 I am involved in the porting of gcc 4.1.2 for 16 bit target. For this
 target size of long long is 32bits. For the following code

 #define VALUE 0x1B4E81B4E81B4DLL

 That is not a 32-bit value.


 #define AFTER 0x55

 //void test (int n, long long q, int y);
 void test (int n, ...);

 int
 main ()
 {
   test (1, VALUE, AFTER);
   exit(0);
 }

 i find that the machine mode of the arguments of test are HImode,
 DImode and HImode. When replace function 'test' with normal one
 instead of varargs i find that the machine modes are HImode, SImode
 and HImode respectively.
 My question is even if the function is a vararg function shouldn't the
 mode of the argument be SImode instead of DImode since long long is
 only 32bit for the target?

 The value is too big for a long long.  When you specify the type, gcc
 is forced to convert (I hope you can get a warning for that).  When
 you don't specify the type, gcc does not convert.  The resulting value
 has a type which can only be expressed using a gcc extension.
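
As a quick check of the point above, the following standalone C sketch
counts how many bits the constant from the test case needs. The helper
name bits_needed is made up for illustration and has nothing to do with
GCC internals.

```c
#include <stdint.h>

/* Number of bits needed to represent V as an unsigned value.  */
int
bits_needed (uint64_t v)
{
  int n = 0;

  while (v != 0)
    {
      n++;
      v >>= 1;
    }
  return n;
}
```

For 0x1B4E81B4E81B4D this gives 53 bits, so the value cannot fit in a
32-bit long long, whereas 0x55 needs only 7 bits.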

   So the behavior that i am getting is a proper one.


 If you change the TARGET_SCALAR_MODE_SUPPORTED_P hook to reject all
 modes larger than SImode, you may get a different result--probably
 some sort of error.

   Yes, this is one option that i didn't think about. But let me ask you
something. For my target, returning structures will use registers if
they are available. So a structure that has a size of 16x4 bits will
be given 4 registers (i.e. DImode). So if i use this hook, will the
structure returning work properly? I mean, will they be broken down
into two 32bit data types?

Shafi


Is this the expected behavior?

2008-07-15 Thread Mohamed Shafi
Hello all,

I am not sure if this is the right mailing list.

I am involved in the porting of gcc 4.1.2 for a 16 bit target.
In some cases i noticed that callee save registers were getting
allocated in the body even though there isn't any function call. I
believe that callee save registers will be allocated only if some
variable values are required across a function call. So if there is no
function call there shouldn't be any callee save registers used in a
function body. So my question is: will GCC allocate callee save
registers for a function even if the function doesn't call any other
function?
Or is this a gcc bug?

Hope my question is clear.

Regards,
Shafi


Re: Is this the expected behavior?

2008-07-15 Thread Mohamed Shafi
2008/7/15 Ramana Radhakrishnan [EMAIL PROTECTED]:
 Hi Mohamed,



 Why not ? Callee save registers are after all registers and the split
 is in the ABI's head (so to speak). So GCC is well within its right to
 use callee save registers. In fact if you were in a leaf function that
 did not make any function calls the first preference would be to
 allocate caller save registers and then to allocate callee save
 registers - Instead of spilling a caller save register , GCC could
 very well use a callee save register and the only extra cost would be
 saving and restoring context of the callee save register in the
 prologue and the epilogue respectively.


   I agree with you, but what about when there are still caller save
registers available and there are no register restrictions for any
instructions? In my case i find that GCC has used only the argument
registers, stack pointer and callee saved registers. So out of the 16
available registers only 5+1+4 registers were used, even though there
were 6 caller save registers available.


 HTH.


 cheers
 Ramana

 On Tue, Jul 15, 2008 at 7:50 AM, Mohamed Shafi [EMAIL PROTECTED] wrote:

 Hello all,

 I am not sure if this the right mailing list.

 I am involved in the porting of gcc 4.1.2 for a 16 bit target.
 In some cases i noticed that callee save registers were getting
 allocated in the body even though there isn't any function call. I
 believe that callee save registers will be allocated only if some
 variable values are required across a function call. So if there is no
 function call there shouldn't be any callee save registers used in a
 function body. So my question is will GCC allocate callee save
 registers for function even if the function doesn't call any other
 function?
 Or is this a gcc bug?

 Hope my question is clear.

 Regards,
 Shafi



 --
 Ramana Radhakrishnan



Re: Is this the expected behavior?

2008-07-15 Thread Mohamed Shafi
2008/7/15 Ramana Radhakrishnan [EMAIL PROTECTED]:
 snipped parts of the last mail


   I agree with you, but what about when there are still caller save
 register are available and there are no register restrictions for any
 instructions? In my case i find that GCC has used only the argument
 registers, stack pointer and callee saved registers. So out of the 16
 available registers ony 5+1+4 registers were used, even though there
 was 6 caller save registers were available


 Check your REG_ALLOC_ORDER macro ?

  The order is argument registers, caller save registers and finally
the callee save registers.
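
A toy model of that ordering, as a standalone C sketch. The register
numbering (R0-R4 argument, R5-R10 caller-saved, R11-R14 callee-saved, R15
stack pointer) is an assumption chosen to match the 5+6+4+1 split mentioned
in this thread, and the real allocator also weighs conflicts and costs, not
just the order.

```c
#include <stddef.h>

/* Hypothetical layout: R0-R4 argument, R5-R10 caller-saved,
   R11-R14 callee-saved, R15 stack pointer (never allocated).  */
static const int alloc_order[] =
  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 };

int
is_callee_saved (int regno)
{
  return regno >= 11 && regno <= 14;
}

/* Hand out WANTED registers following alloc_order and return how many of
   them are callee-saved, i.e. how many prologue saves would be needed.  */
int
callee_saves_needed (int wanted)
{
  size_t n = sizeof alloc_order / sizeof alloc_order[0];
  int saves = 0;

  for (size_t i = 0; i < n && wanted > 0; i++, wanted--)
    if (is_callee_saved (alloc_order[i]))
      saves++;
  return saves;
}
```

With this order alone, a function needing up to 11 registers would never
touch a callee-saved one, which is why seeing them used while caller-saved
registers are free looks surprising.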




 cheers
 Ramana


 HTH.


 cheers
 Ramana

 On Tue, Jul 15, 2008 at 7:50 AM, Mohamed Shafi [EMAIL PROTECTED] wrote:

 Hello all,

 I am not sure if this the right mailing list.

 I am involved in the porting of gcc 4.1.2 for a 16 bit target.
 In some cases i noticed that callee save registers were getting
 allocated in the body even though there isn't any function call. I
 believe that callee save registers will be allocated only if some
 variable values are required across a function call. So if there is no
 function call there shouldn't be any callee save registers used in a
 function body. So my question is will GCC allocate callee save
 registers for function even if the function doesn't call any other
 function?
 Or is this a gcc bug?

 Hope my question is clear.

 Regards,
 Shafi



 --
 Ramana Radhakrishnan





 --
 Ramana Radhakrishnan



How to get signedness from rtx?

2008-07-05 Thread Mohamed Shafi
Hello all,

Is there a way to know whether an operand is signed or unsigned from its rtx?

Regards,
Shafi


How to identify comparison of 8bit operands

2008-07-02 Thread Mohamed Shafi
Hello all,

I am involved in porting a 16bit target in gcc 4.1.2
The target that i am porting to has a minor flaw. Comparison of signed
variables will go wrong. So i have to use a different approach to do
comparison of signed operands. This obviously takes more cycles and
instructions. But the comparison of sign-extended 8bit values is
correct. So i can use the normal comparison for char and the modified
one for 16bit values. So my question is: in the back-end, will i be able
to distinguish between comparisons of sign-extended 8bit and 16bit
operands?

Regards,
Shafi


How to implement conditional execution

2008-06-27 Thread Mohamed Shafi
Hello all,

The 16-bit target that i am now porting to gcc 4.1.2 doesn't have any
branch instructions. It only has jump instructions. For comparison
operations it has this instruction:

if cond Rx Ry
 execute this insn

So compare and branch is implemented as

if cond Rx Ry
 jmp Label

If the condition in the 'if' instruction is satisfied the processor
will execute the next instruction; otherwise it will replace it with a
nop. So this means that i can have instructions similar to:

if eq Rx, Ry
  add Rx, Ry
add Rx, 2

This is similar to conditional execution. This way any instruction can
be executed conditionally. But this is different from normal. Normally
the comparison operations set the status flags. An instruction gets
conditionally executed based on these flags. This means that GCC can
schedule instructions between the comparison instruction and the
conditional instruction, provided none of the scheduled instructions
are altering the status flags. This is not possible in my case as
there shouldn't be any instruction between 'if eq Rx, Ry' and 'add Rx,
Ry'; this is not a comparison operation as such, and the 'if'
instruction doesn't set any status flags.
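
The behaviour described above can be modelled with a small standalone C
interpreter. This is only a sketch of the semantics as stated in this
post; the opcode names and the struct layout are made up for illustration.

```c
enum opcode { OP_IF_EQ, OP_ADD, OP_ADD_IMM };

struct insn
{
  enum opcode op;
  int rx;      /* first register operand */
  int y;       /* second register number, or immediate for OP_ADD_IMM */
};

/* Execute N instructions over register file REGS.  An OP_IF_EQ whose
   condition is false turns the immediately following instruction into
   a nop; nothing else is affected.  */
void
run (const struct insn *prog, int n, int *regs)
{
  int squash = 0;

  for (int i = 0; i < n; i++)
    {
      const struct insn *p = &prog[i];

      if (squash)
        {
          squash = 0;   /* this instruction is treated as a nop */
          continue;
        }
      switch (p->op)
        {
        case OP_IF_EQ:
          squash = (regs[p->rx] != regs[p->y]);
          break;
        case OP_ADD:
          regs[p->rx] += regs[p->y];
          break;
        case OP_ADD_IMM:
          regs[p->rx] += p->y;
          break;
        }
    }
}

/* The three-instruction example from the post, with R0 as Rx and R1 as Ry.  */
int
demo (int r0, int r1)
{
  int regs[2] = { r0, r1 };
  const struct insn prog[] = {
    { OP_IF_EQ, 0, 1 },
    { OP_ADD, 0, 1 },
    { OP_ADD_IMM, 0, 2 },
  };

  run (prog, 3, regs);
  return regs[0];
}
```

The model makes the constraint visible: the effect of the 'if' lives only
in the position of the next instruction, so the pair has to stay adjacent.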

Will it be possible to implement this in the GCC backend?
Do any other targets have similar instructions?

Regards,
Shafi


Can register rename pass rename a callee-saved register?

2008-06-19 Thread Mohamed Shafi
Hello everyone,

I am involved in gcc port in which i found the following problem.

Before the register renaming pass, a callee-saved register was being used
in the body of the code. Hence the function prologue saved the register
and the epilogue restored it. But the register renaming pass removed
this particular callee-saved register. The output and code generation
are correct, but there is an unnecessary save and restore of a
callee-saved register in the prologue and epilogue even though the
reference to it has been removed by the renaming pass.
I am using the prologue/epilogue patterns instead of the target macros.

So is the rename pass allowed to rename a callee saved register? Where
might this be going wrong?

Thanks for your help.

Regards,
Shafi


Re: Can register rename pass rename a callee-saved register?

2008-06-19 Thread Mohamed Shafi
2008/6/19 Ian Lance Taylor [EMAIL PROTECTED]:
 Mohamed Shafi [EMAIL PROTECTED] writes:

 Before register renaming pass, callee registers was being used in the
 body of the code. Hence function prologue saved the register and
 epilogue restored the register. But register renaming pass removed
 this particular callee saved register.The output and code generation
 is proper, but there is an unnecessary save and restore of a callee
 saved register in the prologue and epilogue even though the reference
 of the callee saved register has been removed by the renaming pass.
 I am using the prologue/epilogue patterns instead of the target macros.

 Which version of gcc?  I was under the impression that this
 longstanding buglet was cleaned up by the dataflow work.


   I am doing a port in gcc 4.1.2. The register is actually replaced
by the register copy-propagation optimization pass.
   Here are the RTL dumps before .rnreg (the relevant portions)


(insn/f 42 41 43 0 (set (mem:HI (reg/f:HI 12 R12) [0 S2 A16])
(reg:HI 4 R4)) -1 (nil)
(expr_list:REG_DEAD (reg:HI 4 R4)
(nil)))

(note 43 42 9 0 NOTE_INSN_PROLOGUE_END)

(note 9 43 14 0 NOTE_INSN_FUNCTION_BEG)

(insn 14 9 37 0 (set (reg:HI 4 R4 [orig:26+2 ] [26])
(reg:HI 0 R0 [ pExtern ])) 1 {*movhi_internal}
(insn_list:REG_DEP_ANTI 16 (nil))
(expr_list:REG_DEAD (reg:HI 0 R0 [ pExtern ])
(expr_list:REG_NO_CONFLICT (reg/v:SI 0 R0 [orig:23 pExtern ] [23])
(nil

(insn 37 14 18 0 (set (reg:HI 8 R8)
(const_int 42 [0x2a])) 1 {*movhi_internal} (nil)
(nil))

(insn 18 37 38 0 (set (unspec:HI [
(reg:HI 8 R8)
] 2)
(unspec_volatile:HI [
(reg:HI 4 R4 [orig:26+2 ] [26])
] 6)) 6 {out} (insn_list:REG_DEP_TRUE 17
(insn_list:REG_DEP_ANTI 14 (insn_list:REG_DEP_TRUE 7
(insn_list:REG_DEP_TRUE 6 (nil)
(expr_list:REG_DEAD (reg:HI 8 R8)
(expr_list:REG_DEAD (reg:HI 4 R4 [orig:26+2 ] [26])
(nil


And this is after the optimization pass


insn 18: replaced reg 4 with 0

.
(insn/f 42 41 43 0 (set (mem:HI (reg/f:HI 12 R12) [0 S2 A16])
(reg:HI 4 R4)) 1 {*movhi_internal} (nil)
(expr_list:REG_DEAD (reg:HI 4 R4)
(nil)))

(note 43 42 9 0 NOTE_INSN_PROLOGUE_END)

(note 9 43 37 0 NOTE_INSN_FUNCTION_BEG)

(insn 37 9 18 0 (set (reg:HI 8 R8)
(const_int 42 [0x2a])) 1 {*movhi_internal} (nil)
(nil))

(insn 18 37 38 0 (set (unspec:HI [
(reg:HI 8 R8)
] 2)
(unspec_volatile:HI [
(reg:HI 0 R0 [orig:26+2 ] [26])
] 6)) 6 {out} (insn_list:REG_DEP_TRUE 17
(insn_list:REG_DEP_ANTI 14 (insn_list:REG_DEP_TRUE 7
(insn_list:REG_DEP_TRUE 6 (nil)
(expr_list:REG_DEAD (reg:HI 8 R8)
(expr_list:REG_DEAD (reg:HI 0 R0 [orig:26+2 ] [26])
(nil




 So is the rename pass allowed to rename a callee saved register? Where
 might this going wrong?

 If this is the buglet I'm thinking of, the resulting code does work,
 despite being suboptimal.  It just does an unnecessary save and
 restore.

  The resulting code is proper except for the unnecessary save and
 restore.

 Ian



Re: Can register rename pass rename a callee-saved register?

2008-06-19 Thread Mohamed Shafi
2008/6/19 Ian Lance Taylor [EMAIL PROTECTED]:
 Mohamed Shafi [EMAIL PROTECTED] writes:

 Which version of gcc?  I was under the impression that this
 longstanding buglet was cleaned up by the dataflow work.


I am doing a port in gcc 4.1.2. The register is actually replaced
 by register copy-propagation optimization pass.

 I believe that in gcc 4.3 this unnecessary store and load should no
 longer happen.

   Can you tell me what was done in gcc 4.3 so that i can back port
the changes to gcc 4.1.2

Regards,
Shafi


Re: Can register rename pass rename a callee-saved register?

2008-06-19 Thread Mohamed Shafi
2008/6/20 Andrew Pinski [EMAIL PROTECTED]:
 On Thu, Jun 19, 2008 at 11:56 PM, Mohamed Shafi [EMAIL PROTECTED] wrote:
   Can you tell me what was done in gcc 4.3 so that i can back port
 the changes to gcc 4.1.2

 It was a rewrite of life information of flow.c really.  It is very
 hard to backport (trust me I have tried already).

   So i should do something in the machine reorg pass to catch cases like these.
   I guess that is the only hack that is possible. Is there any other way?

  Was there a bug report filed for this case?

  Regards,
  Shafi


How to write pattern for addition with carry operation

2008-06-06 Thread Mohamed Shafi
Hello all,

The 16bit target that i am porting to gcc 4.1.2 doesn't have any
instructions for 32bit operations. But for addition and subtraction
there are
addc
subc
instructions that also take the carry bit into account. Presently i have patterns
for SImode addition and subtraction such that the template will have

add %0, %1\naddc %N0, %N1
sub %0, %1\nsubc %N0, %N1

Will it be possible for me to write separate patterns for the
instructions add and addc?
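
For reference, this is what the add/addc pair computes, written as a
standalone C sketch (the function name is made up for illustration). The
carry is the only link between the two halves, which is the dependency any
separate add and addc patterns would have to express.

```c
#include <stdint.h>

/* 32-bit addition from 16-bit pieces: "add" on the low halves produces a
   carry, "addc" on the high halves consumes it.  */
uint32_t
add32_via_16 (uint32_t a, uint32_t b)
{
  uint16_t a_lo = (uint16_t) a, a_hi = (uint16_t) (a >> 16);
  uint16_t b_lo = (uint16_t) b, b_hi = (uint16_t) (b >> 16);

  uint16_t lo = (uint16_t) (a_lo + b_lo);          /* add  */
  unsigned carry = lo < a_lo;                      /* carry flag */
  uint16_t hi = (uint16_t) (a_hi + b_hi + carry);  /* addc */

  return ((uint32_t) hi << 16) | lo;
}
```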

Regards,
Shafi


How to insert nops

2008-06-04 Thread Mohamed Shafi
Hello all,

For the big endian 16bit target that i am porting to gcc 4.1.2 a nop
is needed after a load instruction if the destination register of the
load instruction is used as the source in the next instruction. So

load R0, R3[2]
add R2, R0

needs a nop inserted in between the instructions. I have issues when
the operation is that of 32bit data types. The target doesn't have any
32bit instructions. All the 32bit move instructions are split after
reload. The following is an example where i am having issues

(set (reg:HI 2 R2)
(mem/s:HI (reg/f:HI 8 R8)

(set (reg:HI 3 R3)
(mem/s:HI (plus:HI (reg/f:HI 8 R8)
   (const_int 2 [0x2]))

(set (reg:SI 0 R0)
(minus:SI (reg:SI 0 R0)
   (reg:SI 2 R2)))


load R2, R8
load R3, R8[2]
sub R1, R3
subc R0, R2

For the above case no nop is inserted. But because of the endianness, a
just-loaded register gets used as a source in the next instruction. How
do i solve this?
I do nop insertion in the reorg pass, where i first do delay slot
scheduling. The following is what i have in reorg() for nop insertion

  attr = get_attr_type (insn);
  if (next_insn && attr == TYPE_LOAD)
    {
      if (insn_true_dependent_p (insn, next_insn))
        emit_insn_after (gen_nop (), insn);
    }



static bool
insn_true_dependent_p (rtx x, rtx y)
{
  rtx tmp;

  if (! INSN_P (x) || ! INSN_P (y))
    return 0;

  tmp = PATTERN (y);
  note_stores (PATTERN (x), insn_dependent_p_1, &tmp);
  return (tmp == NULL_RTX);
}

static void
insn_dependent_p_1 (rtx x, rtx pat ATTRIBUTE_UNUSED, void *data)
{
  rtx * pinsn = (rtx *) data;

  if (*pinsn && reg_mentioned_p (x, *pinsn))
    *pinsn = NULL_RTX;
}


I think apart from the above cases i will also have cases where nop
gets inserted when it's not really required.
How will it be possible to solve this issue?
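
One way to think about the check is on individual hard register numbers
after all 32-bit operations have been split, so that the word actually read
by the next instruction is visible. The following standalone C sketch
models only that idea; the struct, the function names and the register
encoding are made up for illustration and are not GCC's RTL API.

```c
enum kind { K_LOAD, K_OTHER, K_NOP };

struct minsn
{
  enum kind kind;
  int dest;        /* register written, or -1 */
  int src[2];      /* registers read, -1 if unused */
};

/* True if NEXT reads the register that LOAD writes.  */
int
needs_nop (const struct minsn *load, const struct minsn *next)
{
  if (load->kind != K_LOAD || load->dest < 0)
    return 0;
  return next->src[0] == load->dest || next->src[1] == load->dest;
}

/* Copy IN to OUT, inserting a nop after each hazardous load.
   OUT must have room for 2 * N entries.  Returns the new length.  */
int
insert_nops (const struct minsn *in, int n, struct minsn *out)
{
  static const struct minsn nop = { K_NOP, -1, { -1, -1 } };
  int m = 0;

  for (int i = 0; i < n; i++)
    {
      out[m++] = in[i];
      if (i + 1 < n && needs_nop (&in[i], &in[i + 1]))
        out[m++] = nop;
    }
  return m;
}

/* The four-instruction sequence from the post.  */
int
demo_length (void)
{
  const struct minsn in[4] = {
    { K_LOAD,  2, { 8, -1 } },   /* load R2, R8    */
    { K_LOAD,  3, { 8, -1 } },   /* load R3, R8[2] */
    { K_OTHER, 1, { 1, 3 } },    /* sub  R1, R3    */
    { K_OTHER, 0, { 0, 2 } },    /* subc R0, R2    */
  };
  struct minsn out[8];

  return insert_nops (in, 4, out);
}
```

In this model the only hazard is between the load of R3 and the sub that
reads it, so exactly one nop is inserted and the sequence grows from four
to five instructions.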

Regards,
Shafi


Implementing a restrictive addressing mode for a gcc port - Take 2

2008-05-28 Thread Mohamed Shafi
Hello all,

The target that i am working on is 16bit, big endian and with 16 registers.
It has this particular addressing mode

load Rd, Ra[offset]
store Rs, Ra[offset]

where the offset should be positive, the base register Ra should be an
even register, and for the source or destination register Rs/Rd
the restriction is
that its number should be one more than that of the base register. So
the following instructions are valid:

load R5, R4[4]
store R11, R10[2]

while the following ones are wrong:

load R8, R6[4]
store R3, R8[2]
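
The rule can be written down as a small standalone C predicate. This is
only a sketch of the restriction as stated here; the function name is made
up, and accepting a zero offset is an assumption.

```c
/* Returns nonzero if "load/store Rreg, Rbase[offset]" satisfies the
   restrictions described above, for a machine with registers R0-R15.  */
int
valid_pair_address (int reg, int base, int offset)
{
  if (offset < 0)               /* assumption: zero offset is acceptable */
    return 0;
  if (base < 0 || base > 14)    /* base must leave room for base + 1 */
    return 0;
  if (base % 2 != 0)            /* base register must be even */
    return 0;
  return reg == base + 1;       /* data register is the odd partner */
}
```

It accepts the two valid examples (R5 with R4, R11 with R10) and rejects
the two invalid ones (R8 with R6, R3 with R8).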

What i did to implement this was to have eight register classes, with
each class having two registers, an even register and an odd register.
Then in the define_expand i look for the register indirect with offset
addressing mode and emit the gen_store_offset or gen_load_offset pattern
if the addressing mode is found. In the pattern i have the 8
similar constraints for the base register and the source/destination
register. But this didn't work out properly, probably because i had
many patterns for movhi operations. So i tried what Jim Wilson
suggested to me when i posted this question earlier. What he suggested
was:

One thing you could try is generating a double-word pseudo-reg at RTL
expand time, and then using subreg 0 for the source and subreg 1 for
the dest (or vice versa depending on endianness/word order).  This
will get you a register pair you can use from the register allocator.
This doesn't help at reload time though.

You probably have to define a constraint for every register, and then
write an alternative for every register pair matching the correct even
register with the correct odd register.  That gets you past reload.

So i did the following to implement his suggestion.
I have a single define_expand and define_insn for movhi patterns. In
the define_expand for movhi i have the following

offset = INTVAL (XEXP (XEXP (mem_op, 0), 1));
dword = gen_reg_rtx (SImode);
base = simplify_gen_subreg (HImode, dword, SImode, 0);

if (mode == Pmode)
  {
    reg_op = simplify_gen_subreg (HImode, dword, SImode, 2);
    mem_op1 = gen_rtx_MEM (Pmode, plus_constant (base, offset));
  }
else
  {
    reg_op = simplify_gen_subreg (QImode, dword, SImode, 3);
    mem_op1 = gen_rtx_MEM (QImode, plus_constant (base, offset));
  }

if (GET_CODE (operands[0]) == MEM)
  {
    operands[0] = mem_op1;
    operands[1] = reg_op;
  }
else if (GET_CODE (operands[1]) == MEM)
  {
    operands[1] = mem_op1;
    operands[0] = reg_op;
  }


and in define_insn i have the following pattern:

(define_insn "*movhi_internal"
  [(set (match_operand:HI 0 "nonimmediate_operand"
         "=r,R01,R03,R05,R07,R09,R13,R15,r,U00,U02,U04,U06,U08,U12,U14,m,r")
        (match_operand:HI 1 "general_operand"
         "r,U00,U02,U04,U06,U08,U12,U14,m,R01,R03,R05,R07,R09,R13,R15,r,i"))]

where Uxx are memory constraints and Rxx are register constraints.
After implementing this i came across this problem:


(insn 12 11 13 1 (set (reg/f:HI 24)
(mem/c/i:HI (reg/f:HI 25) [0 m+0 S2 A16])) -1 (nil)
(nil))

(insn 13 12 14 1 (set (subreg:QI (reg:SI 28) 3)
(mem:QI (plus:HI (subreg:HI (reg:SI 28) 0)
(const_int 1 [0x1])) [0 S1 A8])) -1 (nil)
(nil))

(insn 14 13 15 1 (set (reg:HI 26)
(zero_extend:HI (reg:QI 27))) -1 (nil)
(nil))

For zero-extend both operands should be in registers. So the operand
which was previously in memory is moved to reg 27 through a load
operation (insn 13). Since the offset addressing mode is used for this,
the define_expand for movqi generates an SImode register, reg 28, and
does the operation on it. But this is not reflected in the subsequent
instruction (insn 14), which still reads reg 27. Hence insn 13 is
deleted, as its result is never used.

What am i doing wrong? Am i implementing the addressing mode properly?

Any help is appreciated.

Regards,
Shafi


How to specify registers constraints for memory operands?

2008-05-23 Thread Mohamed Shafi
Hello everyone,

I need to specify constraints for registers used in the memory
operands in a load pattern. For these the following are the things
that i have done.

#define CONSTRAINT_LEN(CHAR,STR) \
((CHAR) == 'R' ? 3 \
 : DEFAULT_CONSTRAINT_LEN(CHAR,STR))

#define EXTRA_MEMORY_CONSTRAINT(C, STR) \
  ((C) == 'R')

#define REG_CLASS_FROM_CONSTRAINT(CHAR,STR) \
 reg_class_from_constraint (CHAR, STR)

#define EXTRA_CONSTRAINT_STR(VALUE,C,STR) \
 extra_constraint (VALUE, C, STR)

in extra_constraint i have the following code:

{
  if (GET_CODE (value) != MEM)
    return 0;

  if (c == 'R')
    {
      r = XEXP (value, 0);
      if ((GET_CODE (r) == REG) && (REGNO (r) < FIRST_PSEUDO_REGISTER))
        {
          rclass = REG_CLASS_FROM_CONSTRAINT (c, str);
          if (rclass == REGNO_REG_CLASS (REGNO (r)))
            return 1;
        }
    }
  return 0;
}

And i have the following pattern in the md file:

(define_insn "movhi_load"
  [(set (match_operand:HI 0 "register_operand"
         "=R01,R03,R05,R07,R09,R13,R15,r")
        (match_operand:HI 1 "memory_operand"
         "R00,R02,R04,R06,R08,R12,R14,m"))]

Is this the proper way to do this?

Thank you for taking the time to read this.

Regards,
Shafi


Re: Few question regarding the implementation of splitting HImode patterns

2008-05-23 Thread Mohamed Shafi
On Sat, May 24, 2008 at 12:26 AM, Omar Torres [EMAIL PROTECTED] wrote:
 Mohamed Shafi wrote:
 Hello Omar,

 I saw your mail to gcc mailing list regarding splitting of HImode
 patterns into QImode patterns. I am also involved in porting. My
 problem is similar to yours. But i have to split SImode patterns into
 HImode patterns.

 I am sure that you have modified your define_split patterns after
 receiving the suggestions from the mailing list. Could you just mail
 me the finalized define_split pattern of HImode.

 One thing that i noticed in your split pattern is that you are not
 handling cases where operand[0] is memory, i.e store patterns.  How
 are you handling this? Do you have a define_insn for this case?

 I hope you don't mind me asking these questions.

 Thank you for your time.

 Regards,
 Shafi



 Hi Mohamed,
  I added the gcc mailing list to the thread.

 My current implementation looks like this:
 ;; movhi
 (define_expand "movhi"
   [(set (match_operand:HI 0 "nonimmediate_operand" "")
         (match_operand:HI 1 "general_operand" ""))]
   ""
 {
   if (c816_expand_move (HImode, operands)) {
     DONE;
   }
 })

 ;; =&r creates an early clobber.
 ;; It prevents insns where the target register
 ;; is the same as the base register used for memory addressing...
 ;; This is needed so that the split produces correct code.
 (define_insn "*movhi"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=&r,m")
         (match_operand:HI 1 "general_operand" "g,r"))]
   ""
   "#")

 (define_split
   [(set (match_operand:HI 0 "nonimmediate_operand" "")
         (match_operand:HI 1 "general_operand" ""))]
   "reload_completed"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
   "{
   gcc_assert (REG_P (operands[0]) || MEM_P (operands[0]));

 #ifdef DEBUG_OVERLAP
   if (reg_overlap_mentioned_p (operands[0], operands[1])) {
     fprintf (stderr, \"\nOperands Overlap:\n\");
     debug_rtx (curr_insn);
   }
 #endif

   if (REG_P (operands[0])) {
     operands[2] = gen_highpart (QImode, operands[0]);
     operands[3] = gen_lowpart (QImode, operands[0]);
   }
   else if (MEM_P (operands[0])) {
     operands[2] = adjust_address (operands[0], QImode, 0);
     operands[3] = adjust_address (operands[0], QImode, 1);
   }

   if (MEM_P (operands[1])) { // || CONST == GET_CODE (operands[1])) {
     operands[4] = adjust_address (operands[1], QImode, 0);
     operands[5] = adjust_address (operands[1], QImode, 1);
   }
   else if (LABEL_REF == GET_CODE (operands[1])
            || SYMBOL_REF == GET_CODE (operands[1])) { //
     operands[4] = simplify_gen_subreg (QImode, operands[1], HImode, 0);
     operands[5] = simplify_gen_subreg (QImode, operands[1], HImode, 1);
   }
   else if (CONST_INT == GET_CODE (operands[1])
            || REG_P (operands[1])) {
     operands[4] = simplify_gen_subreg (QImode, operands[1], HImode, 0);
     operands[5] = simplify_gen_subreg (QImode, operands[1], HImode, 1);
   }
   else {
     error (\"Unrecognized code in operands[1]\");
     fputs (\"\nrtx code is: \", stderr);
     debug_rtx (curr_insn);
     abort ();
   }
   }")


 The purpose of the expand is to load Label or Symbol references into
 base registers for index addressing. I decided to use the expand since
 the force_reg() was failing when I called from the split.

Thank you for your reply.
I think you can do this in the GO_IF_LEGITIMATE_ADDRESS macro. There,
just return false if you find the above addressing modes, or rather
return true only for the addressing modes you want to use. That way gcc
will automatically load the symbol ref into registers.


Regards,
Shafi


Re: gmon.out creation procedure

2008-05-21 Thread Mohamed Shafi
On Tue, May 20, 2008 at 1:54 PM,  [EMAIL PROTECTED] wrote:
 Dear Shafi

 Thanks you very much for the clear details. Definitely your inputs are
 helpful.

 1) I am sure that in gcc-4.0 I found there is file gmon.c in the path
 gcc-4.0.0/gcc/gmon.c.  Anyhow let me concentrate on gmon.c of glibc.

   I am  not sure why this is found in gcc. It is not available in
other versions.

 2) Next thing I would like to know: to better understand the gmon.c of
 glibc I would like to debug glibc. Since glibc is linked with gcc, I built
 gcc and glibc separately. While debugging, gcc refers to the shared glibc
 library, not the one I built freshly for debugging purposes. To make
 this happen, where do I need to change the path to link both gcc and glibc?

   IIRC by passing -static to linker you can link with the static
glibc. To make sure that your glibc is picked up maybe you can hide
the other glibc from the PATH variable.


 3) Please correct me If I am wrong
   a. for every function mcount() function is called to collect the caller
 and callee address. where this collected info is placed ?
   b. the flow of monstartup() function
monstartup()--moncontrol() -- profil()

  who will call the monstartup() ? is it gcrt0 ? before calling the
 main() function of our routine ?

  That's right. That's the other thing that happens when the -pg option is
provided. A different startup file is used. This will have a call to
monstartup. monstartup will initialize all the data structures
required for collecting profile data and invoke the profil system call.


   c. write_profiling() ---  write_gmon() functions calls write_hist(),
 write_call_graph() and write_bb_counts(). here who calls the
 write_profiling() ?

   d. mcleanup() calls write_gmon(). who calls the mcleanup() ? is it
 gcrt0 ? after control return from main() function ?

IIRC it is mcleanup that calls the output function write_gmon, which
in turn calls the other functions. mcleanup will be called from the
startup file after main returns. mcleanup dumps all the information in
the output file.

Hope this helps.

Regards,
Shafi

 Thanks and Regards
 Raja








 2008/5/19  [EMAIL PROTECTED]:
 Hi,

 I am Raja, I need a favor on understand how the gmon.out file is
 created.
 Please help me.

 1. gmon.c is available in both gcc and glibc.  Which is the one used to
 create gmon.out ?

 I don't think gcc has gmon.c. Only glibc has it. You can also find
 gmon in newlib for some targets. But this will be customized for the
 target


 2. Can you brief how profile information required to create gmon.out is
 captured? Which functions are responsible for this?

 These days gmon.c is used only to get histogram records (time related
 information). All the other information is now produced by gcc itself,
 that can be analyzed using gcov. (You will get gcov when you build
 gcc).
 For histogram records, gmon.c code primarily uses 'profil' system
 call. You can get more information about this in man pages. And of
 course you will get to know how this is used if you go through the
 code in gmon.c

 When -pg switch is enabled all complier does is inserting a call to
 the function mcount, usually after the function prologue. This is the
 function the collects all the needed information. For profiling
 information about caller address and callee address is necessary. If
 this information cannot be obtained using __bultin_return_address then
 this is calculated by mcount in a target specific manner and passed
 onto another function that takes these address as the arguments and
 gathers the profiling information.


 3. Suppose the executable is built without the -pg option, but we want
 to create gmon.out at run time. Is there any way or guideline to
 implement this?

 A call to the profiling function (mcount) should be there to generate
 profiling information. Without that you won't be able to generate
 gmon.out

 Hope this helps,

 Regards,
 Shafi


 Thanks and Regards
 Raja Saleru







Re: How to legitimize the reload address?

2008-05-21 Thread Mohamed Shafi
On Wed, May 21, 2008 at 1:42 AM, Jeff Law [EMAIL PROTECTED] wrote:
 Ian Lance Taylor wrote:

 Mohamed Shafi [EMAIL PROTECTED] writes:

 The 16-bit target that I am currently porting allows only
 positive offsets less than 0x100 (unsigned 8-bit) in its offset
 addressing mode.

 I would expect reload to be able to handle this kind of thing anyhow,
 assuming you define GO_IF_LEGITIMATE_ADDRESS correctly.  reload should
 automatically try loading an out of range offset into a register.

 Agreed.

 Typically if there are problems in this area it is because the port hasn't
 properly defined secondary reloads, or the valid offsets are not consistent
 within a machine mode.

 Mohamed, without more details, there's not much we can do to help you.

I am sure that i have written GO_IF_LEGITIMATE_ADDRESS correctly.
What i have in my port is something similar to mcore back-end. These
are the relevant parts:

  else if (GET_CODE (X) == PLUS)
    {
      rtx xop0 = XEXP (X, 0);
      rtx xop1 = XEXP (X, 1);

      if (BASE_REGISTER_RTX_P (xop0))
        return legitimate_index_p (mode, xop1);
    }

static int
legitimate_index_p (enum machine_mode mode, rtx OP)
{
  if (GET_CODE (OP) == CONST_INT)
    {
      /* Word-sized (or larger) accesses: offset must be 4-byte aligned.  */
      if (GET_MODE_SIZE (mode) >= 4
          && (((unsigned) INTVAL (OP)) % 4) == 0
          && ((unsigned) INTVAL (OP)) <= 0x0100)
        return 1;

      /* Halfword accesses: offset must be 2-byte aligned.  */
      if (GET_MODE_SIZE (mode) == 2
          && (((unsigned) INTVAL (OP)) % 2) == 0
          && ((unsigned) INTVAL (OP)) <= 0x0100)
        return 1;

      /* Byte accesses: any offset in range.  */
      if (GET_MODE_SIZE (mode) == 1
          && ((unsigned) INTVAL (OP)) <= 0x0100)
        return 1;
    }

  return 0;
}


The compiler is crashing in change_address_1, at emit-rtl.c
...
  if (validate)
    {
      if (reload_in_progress || reload_completed)
        gcc_assert (memory_address_p (mode, addr));
      else
        addr = memory_address (mode, addr);
    }



Everything starts when cleanup_subreg_operands() is called from
reload() for the following pattern.

(set (subreg:HI (mem:SI (plus:HI (reg:HI 12 [SP]) (const_int 256)) 2)
   (reg:HI 3))

and then this becomes

(set (mem:HI (plus:HI (reg:HI 12 [SP] ) (const_int 258)))
   (reg:HI 3))

This pattern is not legitimate because the offset is out of range.
Will I be able to overcome this if I define LEGITIMIZE_RELOAD_ADDRESS
or LEGITIMIZE_ADDRESS?
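
The failure can be reproduced outside the compiler with a small stand-alone model of the check. This is only a sketch: offset_ok and fold_subreg_offset are hypothetical helpers, the access size is passed as a plain byte count instead of a machine mode, and the bound is assumed to be "<= 0x100" as in the port's legitimate_index_p.

```c
#include <stdbool.h>

/* Hypothetical stand-alone model of legitimate_index_p.  SIZE is the
   access size in bytes, OFFSET the constant displacement.  Assumes the
   bound is "<= 0x100" and that the offset must be aligned to the
   access size.  */
static bool
offset_ok (unsigned size, long offset)
{
  unsigned long uoff = (unsigned long) offset;

  if (size >= 4)
    return (uoff % 4) == 0 && uoff <= 0x100;
  if (size == 2)
    return (uoff % 2) == 0 && uoff <= 0x100;
  if (size == 1)
    return uoff <= 0x100;
  return false;
}

/* Folding (subreg:HI (mem:SI (plus base OFFSET)) BYTE) into a narrower
   MEM adds the subreg byte number to the displacement.  */
static long
fold_subreg_offset (long offset, unsigned subreg_byte)
{
  return offset + (long) subreg_byte;
}
```

With these helpers, offset_ok (4, 256) holds for the original SImode access, while offset_ok (2, fold_subreg_offset (256, 2)) fails for the HImode access at offset 258, which matches the assert in change_address_1.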

Thank you for your time.

Regards,
Shafi


How to legitimize the reload address?

2008-05-20 Thread Mohamed Shafi
Hello all,

The 16-bit target that I am currently porting allows only
positive offsets less than 0x100 (unsigned 8-bit) in its offset
addressing mode.
During reload I am getting an ICE because the address created is not
legitimate. So I guess I have to define the macro
LEGITIMIZE_RELOAD_ADDRESS,
but I am not sure how to do this.

With this will i be able to convert
load Rd, Rb[offset]
into
li Rs, offset
add Rs,Rb
load Rd, Rs

where Rs is a reserved register.

Or is the only way to do this the way other targets do it, for example rs6000?

From rs6000_legitimize_reload_address()

  /* Reload the high part into a base reg; leave the low part
     in the mem directly.  */

  x = gen_rtx_PLUS (GET_MODE (x),
                    gen_rtx_PLUS (GET_MODE (x), XEXP (x, 0),
                                  GEN_INT (high)),
                    GEN_INT (low));

  push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
               BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
               opnum, (enum reload_type) type);
  *win = 1;
  return x;

I guess this will generate something like

add Rs, Rb, excess_offset
load Rd, Rs[legitimate_offset];
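
The high/low split itself is plain arithmetic and can be sketched independently of reload. The helpers below are hypothetical and assume the unsigned 8-bit displacement range described above (0 to 0xff); rs6000 uses a signed 16-bit low part instead, so its computation differs.

```c
/* Largest displacement the offset addressing mode is assumed to accept.  */
#define MAX_DISP 0xff

/* Hypothetical helper: the part of OFFSET that can stay in the memory
   address (the "legitimate_offset" in the load).  */
static long
offset_low (long offset)
{
  return offset & MAX_DISP;
}

/* Hypothetical helper: the part of OFFSET that has to be added to the
   base register first (the "excess_offset" in the add).  */
static long
offset_high (long offset)
{
  return offset - offset_low (offset);
}
```

For an offset of 258 this gives a high part of 256 and a low part of 2. Because the low part keeps the low-order bits, any alignment of the original offset is preserved.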

Regards,
Shafi


Re: gmon.out creation procedure

2008-05-19 Thread Mohamed Shafi
2008/5/19  [EMAIL PROTECTED]:
 Hi,

 I am Raja, I need a favor on understand how the gmon.out file is created.
 Please help me.

 1. gmon.c is available in both gcc and glibc.  Which is the one used to
 create gmon.out ?

I don't think gcc has gmon.c. Only glibc has it. You can also find
gmon in newlib for some targets. But this will be customized for the
target


 2. Can you briefly explain how the profile information required to create
 gmon.out is captured? Which functions are responsible for this?

These days gmon.c is used only to get histogram records (time-related
information). All the other information is now produced by gcc itself
and can be analyzed using gcov. (You will get gcov when you build
gcc.)
For histogram records, the gmon.c code primarily uses the 'profil' system
call. You can get more information about this in the man pages. And of
course you will get to know how it is used if you go through the
code in gmon.c.

When the -pg switch is enabled, all the compiler does is insert a call to
the function mcount, usually after the function prologue. This is the
function that collects all the needed information. For profiling,
the caller address and the callee address are necessary. If
this information cannot be obtained using __builtin_return_address, then
it is calculated by mcount in a target-specific manner and passed
on to another function that takes these addresses as arguments and
gathers the profiling information.
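
As a rough illustration of the caller/callee bookkeeping described above, here is a minimal sketch. It is not the glibc implementation: record_arc, PROFILE_ENTRY, the arcs table and the work function are all hypothetical names, and the hook is placed by hand where -pg would have the compiler insert the call to mcount. Only __builtin_return_address is a real GCC builtin.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ARCS 64

/* One caller -> callee arc and the number of times it was taken.  */
struct arc
{
  uintptr_t from;
  uintptr_t to;
  unsigned long count;
};

static struct arc arcs[MAX_ARCS];
static size_t n_arcs;

/* Hypothetical stand-in for the function mcount hands the two
   addresses to: bump the counter for an existing arc or add a new one.  */
static void
record_arc (uintptr_t from, uintptr_t to)
{
  for (size_t i = 0; i < n_arcs; i++)
    if (arcs[i].from == from && arcs[i].to == to)
      {
        arcs[i].count++;
        return;
      }

  if (n_arcs < MAX_ARCS)
    {
      arcs[n_arcs].from = from;
      arcs[n_arcs].to = to;
      arcs[n_arcs].count = 1;
      n_arcs++;
    }
}

/* Hand-placed hook: the caller address comes from the return address of
   the current function, the callee address is the function itself.  */
#define PROFILE_ENTRY(fn) \
  record_arc ((uintptr_t) __builtin_return_address (0), (uintptr_t) (fn))

/* Example of an instrumented function.  */
static int
work (int x)
{
  PROFILE_ENTRY (work);
  return x + 1;
}
```

A real mcount is usually written in assembly or with target-specific care, since it runs before the callee has finished setting up its frame; the sketch ignores that.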


 3. Suppose the executable is built without the -pg option, but we want
 to create gmon.out at run time. Is there any way or guideline to
 implement this?

A call to the profiling function (mcount) should be there to generate
profiling information. Without that you won't be able to generate
gmon.out

Hope this helps,

Regards,
Shafi


 Thanks and Regards
 Raja Saleru




A question about UNSPEC expression and register allocation

2008-05-17 Thread Mohamed Shafi
Hello all,

Recently I noticed that register allocation for the operands in an
unspec pattern was going wrong.
This was because the registers used in the unspec pattern did not
conflict with the other registers, although they should have.
While debugging I found that the code is written in such a way
that it doesn't consider registers used inside an unspec expression.
So I rewrote the pattern so that the unspec is in the source rather
than in the destination of the pattern. That solved the issue. But is
this expected?
Will the allocation also go wrong for the source operands if they
contain registers inside an unspec expression? I haven't
encountered this yet.

What about live analysis? How are the registers inside an unspec
expression handled there?

Regards,
Shafi


Re: Implementing a restrictive addressing mode for a gcc port

2008-05-16 Thread Mohamed Shafi
On Tue, Apr 1, 2008 at 2:10 AM, Jim Wilson [EMAIL PROTECTED] wrote:
 Mohamed Shafi wrote:

 For the source or the destination register Rd/Ra, the restriction is
 that it should be one more than the base register . So the following
 instructions are valid:

 GCC doesn't provide any easy way for the source address to depend on the
 destination address, or vice versa.

 One thing you could try is generating a double-word pseudo-reg at RTL expand
 time, and then using subreg 0 for the source and subreg 1 for the dest (or
 vice versa depending on endianness/word order).  This will get you a
 register pair you can use from the register allocator.  This doesn't help at
 reload time though.

OK, whatever I tried to do didn't work properly. So I am trying to
implement it the way you have suggested.
In the define_expand for movhi I have the following code to generate
the double-word pseudo-reg.

.
rtx dword, base, reg;
HOST_WIDE_INT offset;

offset = INTVAL (XEXP (XEXP (mem_op, 0), 1));
dword = gen_reg_rtx (SImode);
base = simplify_gen_subreg (HImode, dword, SImode, 0);
reg = simplify_gen_subreg (HImode, dword, SImode, 2);

if (GET_CODE (operands[0]) == MEM)
  {
    operands[0] = gen_rtx_MEM (Pmode, plus_constant (base, offset));
    operands[1] = reg;
  }
else if (GET_CODE (operands[1]) == MEM)
  {
    operands[1] = gen_rtx_MEM (Pmode, plus_constant (base, offset));
    operands[0] = reg;
  }

I hope I am doing this correctly.



 You probably have to define a constraint for every register, and then write
 an alternative for every register pair matching the correct even register
 with the correct odd register.  That gets you past reload.


I have defined a constraint for each of the registers. But I am not sure
how to use them in the pattern.

  [(set (match_operand:HI 0 "register_operand" "=r")
        (match_operand:HI 1 "memory_operand" "m"))]

I have to add the new constraints alongside 'm' and 'r'. But the new
constraints are supposed to indicate the register that has to be used.
So I have defined the REG_CLASS_FROM_CONSTRAINT macro to return the
reg class of a particular constraint. But I am not sure how this can
be used with a memory operand. Should I be defining
EXTRA_MEMORY_CONSTRAINT?
Can I directly use the register constraints for a memory operand?

Thanks for your time.

Regards,
Shafi


Re: GCC 4.1.2 Port - Is live analysis going wrong?

2008-05-16 Thread Mohamed Shafi
On Fri, May 16, 2008 at 11:39 PM, Eric Botcazou [EMAIL PROTECTED] wrote:
 (insn 211 210 215 1 (set (reg:HI 1 R1 [+2 ])
 (subreg:HI (reg/v:SF 207 [ d.104 ]) 2)) 4 {movhi_regmove}
 (insn_list:REG_DEP_TRUE 208 (nil))
 (nil))

 (call_insn/u 215 211 217 1 (set (reg:HI 0 R0)
 (call:HI (mem:HI (reg/f:HI 234) [0 S2 A16])
 (const_int 0 [0x0]))) 25 {*call_value_internal_long}
 (insn_list:REG_DEP_ANTI 207 (insn_list:REG_DEP_ANTI 209
 (insn_list:REG_DEP_TRUE 213 (insn_list:REG_DEP_TRUE 212
 (insn_list:REG_DEP_TRUE 211 (insn_list:REG_DEP_TRUE 210
 (insn_list:REG_DEP_ANTI 208 (nil
 (expr_list:REG_DEAD (reg:SF 2 R2)
 (insn_list:REG_RETVAL 210 (expr_list:REG_EH_REGION (const_int
 -1 [0x])
 (nil
 (expr_list:REG_DEP_TRUE (use (reg:SF 2 R2))
 (expr_list:REG_DEP_TRUE (use (reg:SF 0 R0))
 (nil
 [...]
 Things go wrong in call_insn/u 215. Target has R0 and R1 are the
 parameter registers.

 There should probably be a USE for R1 on the call insn then, like for R0.
 Why is it there for the latter and not for the former?

 --

This is a 16-bit target, and SF uses two registers, so the USE of
(reg:SF 0 R0) also covers R1; that part is correct.
But I am still tracing the bug. The problem is that, for some reason, a
definition of R1 is not getting emitted for a library call. This
definition actually sets up one of the parameters of the library call.
The call also returns a two-register value, i.e. in R0 and R1. So as far
as live analysis is concerned, there is a use of R1 but no definition,
and hence it stays live throughout the program. I now just need to
find out why the instruction is not getting emitted.
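
The effect of the missing definition can be seen with the usual backward liveness equation, live-in = use | (live-out & ~def). The sketch below is a toy model, not GCC's flow analysis: regset, insn_info and live_before_libcall are hypothetical, and the three-insn sequence (set R0, optionally set R1, then a call that uses and sets R0 and R1) is a simplification of the libcall in the dump.

```c
/* One bit per hard register; bit N set means register N is in the set.  */
typedef unsigned int regset;
#define REG(n) (1u << (n))

struct insn_info
{
  regset use;   /* registers read by the insn */
  regset def;   /* registers written by the insn */
};

/* Walk the insns backwards, applying
   live-in = use | (live-out & ~def) at each step.  */
static regset
live_at_entry (const struct insn_info *insns, int n, regset live_out)
{
  regset live = live_out;

  for (int i = n - 1; i >= 0; i--)
    live = insns[i].use | (live & ~insns[i].def);
  return live;
}

/* Hypothetical libcall sequence: set R0, (optionally) set R1, then the
   call, which uses R0/R1 as arguments and sets R0/R1 as the result.  */
static regset
live_before_libcall (int emit_r1_def)
{
  struct insn_info insns[3];

  insns[0].use = 0;
  insns[0].def = REG (0);
  insns[1].use = 0;
  insns[1].def = emit_r1_def ? REG (1) : 0;
  insns[2].use = REG (0) | REG (1);
  insns[2].def = REG (0) | REG (1);

  return live_at_entry (insns, 3, 0);
}
```

With the R1 definition present nothing is live before the sequence; without it, R1 is live on entry and the liveness propagates backwards through every predecessor.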

But thanks for taking your time to read this.

Regards,
Shafi


GCC 4.1.2 Port - Is live analysis going wrong?

2008-05-14 Thread Mohamed Shafi
Hello all,

In the GCC 4.1.2 port I am working on, I get an ICE in insert_save, at
caller-save.c:725.
The following is the assert that fails.

  /* A common failure mode if register status is not correct in the
 RTL is for this routine to be called with a REGNO we didn't
 expect to save.  That will cause us to write an insn with a (nil)
 SET_DEST or SET_SRC.  Instead of doing so and causing a crash
 later, check for this common case here.  This will remove one
 step in debugging such problems.  */
  gcc_assert (regno_save_mem[regno][1]);

The insert_save function is called by save_call_clobbered_regs() in the
same file. Below is the relevant portion of the dump after local
register allocation.

(insn 210 213 211 1 (set (reg:HI 0 R0 [ d.104 ])
(subreg:HI (reg/v:SF 207 [ d.104 ]) 0)) 4 {movhi_regmove}
(insn_list:REG_DEP_TRUE 208 (nil))
(insn_list:REG_LIBCALL 215 (nil)))

(insn 211 210 215 1 (set (reg:HI 1 R1 [+2 ])
(subreg:HI (reg/v:SF 207 [ d.104 ]) 2)) 4 {movhi_regmove}
(insn_list:REG_DEP_TRUE 208 (nil))
(nil))

(call_insn/u 215 211 217 1 (set (reg:HI 0 R0)
(call:HI (mem:HI (reg/f:HI 234) [0 S2 A16])
(const_int 0 [0x0]))) 25 {*call_value_internal_long}
(insn_list:REG_DEP_ANTI 207 (insn_list:REG_DEP_ANTI 209
(insn_list:REG_DEP_TRUE 213 (insn_list:REG_DEP_TRUE 212
(insn_list:REG_DEP_TRUE 211 (insn_list:REG_DEP_TRUE 210
(insn_list:REG_DEP_ANTI 208 (nil
(expr_list:REG_DEAD (reg:SF 2 R2)
(insn_list:REG_RETVAL 210 (expr_list:REG_EH_REGION (const_int
-1 [0x])
(nil
(expr_list:REG_DEP_TRUE (use (reg:SF 2 R2))
(expr_list:REG_DEP_TRUE (use (reg:SF 0 R0))
(nil

(jump_insn 217 215 222 1 (set (pc)
(if_then_else (le:CC (reg:HI 0 R0)
(const_int 0 [0x0]))
(label_ref:HI 226)
(pc))) 48 {cmpbrhi_le} (insn_list:REG_DEP_TRUE 215 (nil))
(expr_list:REG_DEAD (reg:HI 0 R0)
(expr_list:REG_BR_PROB (const_int 5000 [0x1388])
(nil
;; End of basic block 1, registers live:
 1 [R1] 12 [R12] 14 [R14] 16 [AP] 206 207 208 209 215 218 234

Things go wrong in call_insn/u 215. On this target, R0 and R1 are the
parameter registers. So while building the reload chain for the call
instruction, the registers that are live during the call are R0 and R1,
among other registers. This information is stored in the live_throughout
member of the reload chain. In the function save_call_clobbered_regs(),
the register life information in CHAIN is used to compute which regs are
live during the call, and this is stored in hard_regs_to_save.
After doing the following operations

  /* Compute which hard regs must be saved before this call.  */
  AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);
  AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);
  AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);
  AND_HARD_REG_SET (hard_regs_to_save, call_used_reg_set);

hard_regs_to_save will still contain R1. (Parameter registers are
part of the call-used register set.) Hence insert_save is called to
save reg R1.
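
The four set operations are easy to model with plain bit masks. The sketch below is a simplification: hard_reg_set is a single unsigned word rather than GCC's HARD_REG_SET, regs_to_save is a hypothetical function, and the register sets used in the note that follows are assumptions chosen to mirror the dump, not values taken from the port.

```c
/* One bit per hard register; a simplification of HARD_REG_SET.  */
typedef unsigned int hard_reg_set;
#define REG(n) (1u << (n))

/* Model of the masking done in save_call_clobbered_regs: start from the
   registers live across the call and keep only call-used registers that
   are not fixed, not set by the call itself, and not already saved.  */
static hard_reg_set
regs_to_save (hard_reg_set live_throughout, hard_reg_set call_fixed,
              hard_reg_set this_insn_sets, hard_reg_set already_saved,
              hard_reg_set call_used)
{
  hard_reg_set to_save = live_throughout;

  to_save &= ~call_fixed;       /* AND_COMPL_HARD_REG_SET */
  to_save &= ~this_insn_sets;   /* AND_COMPL_HARD_REG_SET */
  to_save &= ~already_saved;    /* AND_COMPL_HARD_REG_SET */
  to_save &= call_used;         /* AND_HARD_REG_SET */
  return to_save;
}
```

Assuming R1, R12 and R14 are live across the call, R12 is fixed, the call sets only R0, and R0 to R2 are call-used, the result is exactly R1, which is the register insert_save is then asked to save.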

From the time the reload chain is generated until save_call_clobbered_regs()
is called, everything looks proper, even though, as the comment says, the
register status is not correct when save_call_clobbered_regs() is
called. Looking at the dumps, I think the only thing that is going
wrong is the live information of the registers.
For a call instruction, all the parameter registers used by the call
will be live at the time of the call. But if these
registers are used in the successor blocks only to pass
parameters, i.e. their value is not used again, shouldn't these
registers be marked as dead in the call instruction? After some
lengthy debugging this is the only conclusion that I can come to. But
I am not sure if this is the way live information is handled.

Can some one give any thoughts on this?

Regards,
Shafi

