improve -fverbose-asm option

2009-03-16 Thread Eric Fisher
Hello,

I'd like to get more helpful information from the final .S file, such
as basic block info, so that I can draw a cfg graph through a script.
Perhaps the -fverbose-asm option is the right way to open this
functionality. Here's a simple patch based on the current trunk svn.

Index: gcc/final.c
===
--- gcc/final.c (revision 144878)
+++ gcc/final.c (working copy)
@@ -1830,10 +1830,38 @@ final_scan_insn (rtx insn, FILE *file, i
       targetm.asm_out.unwind_emit (asm_out_file, insn);
 #endif

-      if (flag_debug_asm)
+      if (flag_debug_asm && !flag_verbose_asm)
         fprintf (asm_out_file, "\t%s basic block %d\n",
                  ASM_COMMENT_START, NOTE_BASIC_BLOCK (insn)->index);

+      /* Print basic block info.  */
+      if (flag_verbose_asm)
+        {
+          fprintf (asm_out_file, "\t%s BLOCK %d",
+                   ASM_COMMENT_START, NOTE_BASIC_BLOCK (insn)->index);
+          if (NOTE_BASIC_BLOCK (insn)->frequency)
+            fprintf (asm_out_file, " freq: %d",
+                     NOTE_BASIC_BLOCK (insn)->frequency);
+          if (NOTE_BASIC_BLOCK (insn)->count)
+            fprintf (asm_out_file, " count: %d",
+                     NOTE_BASIC_BLOCK (insn)->count);
+          fprintf (asm_out_file, "\n");
+
+          fprintf (asm_out_file, "\t%s PRED:", ASM_COMMENT_START);
+          FOR_EACH_EDGE (e, ei, NOTE_BASIC_BLOCK (insn)->preds)
+            {
+              dump_edge_info (asm_out_file, e, 0);
+            }
+          fprintf (asm_out_file, "\n");
+
+          fprintf (asm_out_file, "\t%s SUCC:", ASM_COMMENT_START);
+          FOR_EACH_EDGE (e, ei, NOTE_BASIC_BLOCK (insn)->succs)
+            {
+              dump_edge_info (asm_out_file, e, 1);
+            }
+          fprintf (asm_out_file, "\n");
+        }
+
       if ((*seen & (SEEN_EMITTED | SEEN_BB)) == SEEN_BB)
         {
           *seen |= SEEN_EMITTED;

Also, I think it would be better to generate one label for each basic
block, with the function name as a suffix of the local label, because
some profiling tools, such as oprofile, report samples by label. That
would help us analyze the samples for each basic block. The currently
generated code, however, has many local labels with the same name.
Perhaps -fverbose-asm is again the right option to enable this. Where
should I start if I want to implement it?

Cheers,

Eric Fisher
Mar 16, 2009


Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax

2009-03-16 Thread Martin Guy
On 3/14/09, Paolo Bonzini bonz...@gnu.org wrote:
 Hans-Peter Nilsson wrote:
   The answer to the question is no, but I'd guess the more
   useful answer is yes, for different definitions of truncate.

 Ok, after my patches you will be able to teach GCC about this definition
  of truncate.

I expect it's a bit too extreme an example, but I've just found (to my
horror) that the MaverickCrunch FPU truncates all its shift counts to
6-bit signed (-32(right) to +31(left)), including on 64-bit integers,
which is not very helpful to compile for.
...unless it happens to come easy to handle "shift count is truncated
to less than size of word" in your new framework

M


Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax

2009-03-16 Thread Paolo Bonzini
Martin Guy wrote:
 On 3/14/09, Paolo Bonzini bonz...@gnu.org wrote:
 Hans-Peter Nilsson wrote:
   The answer to the question is no, but I'd guess the more
   useful answer is yes, for different definitions of truncate.

 Ok, after my patches you will be able to teach GCC about this definition
  of truncate.
 
 I expect it's a bit too extreme an example, but I've just found (to my
 horror) that the MaverickCrunch FPU truncates all its shift counts to
 6-bit signed (-32(right) to +31(left)), including on 64-bit integers,
 which is not very helpful to compile for.
 ...unless it happens to come easy to handle "shift count is truncated
 to less than size of word" in your new framework

Uhm, well, no. :-)

This could already be handled by faking a 63 bit truncation and using a
splitter to expand those into something like this (I only know integer
ARM assembly, so I'm making this up):

   AND R1, R0, #31
   MOV R2, R2, SHIFT R1
   ANDS R1, R0, #32
   MOVNE R2, R2, SHIFT #31
   MOVNE R2, R2, SHIFT #1

or

   ANDS R1, R0, #32
   MOVNE R2, R2, SHIFT #-32
   SUB R1, R1, R0  ; R1 = (x >= 32 ? 32 - x : -x)
   MOV R2, R2, SHIFT R1

(which requires a scratch register, so it cannot be done postreload...
this might be a problem)
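
The two sequences above can be checked on a host machine with a small C
model. This is only a sketch: crunch_shift() is a hypothetical stand-in for
the hardware shifter, assuming the count is truncated to 6-bit signed
(-32..31), that positive counts shift left, and that negative counts are
logical right shifts. shl64() follows the first sequence and shr64() the
second.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the shifter: only the low six bits of the count
   are used, interpreted as a signed value in -32..31.  Positive counts
   shift left, negative counts shift right (logical shift assumed).  */
static uint64_t crunch_shift(uint64_t v, int count)
{
    int c = count & 63;         /* keep the low six bits */
    if (c >= 32)
        c -= 64;                /* sign-extend to -32..31 */
    return c >= 0 ? v << c : v >> -c;
}

/* Left shift by n in 0..63: shift by (n & 31), then, if bit 5 of n is
   set, shift by a further 32 in two steps (31 + 1).  */
static uint64_t shl64(uint64_t v, unsigned n)
{
    assert(n < 64);
    v = crunch_shift(v, (int)(n & 31));
    if (n & 32) {
        v = crunch_shift(v, 31);
        v = crunch_shift(v, 1);
    }
    return v;
}

/* Right shift by n in 0..63: if bit 5 of n is set, shift right by 32
   first, then shift by ((n & 32) - n), which is 32 - n or -n.  */
static uint64_t shr64(uint64_t v, unsigned n)
{
    assert(n < 64);
    int r = (int)(n & 32);
    if (r)
        v = crunch_shift(v, -32);
    return crunch_shift(v, r - (int)n);
}
```

Under those assumptions, shl64(v, n) matches v << n and shr64(v, n) matches
v >> n for every n from 0 to 63.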

But my new stuff won't change anything.

Paolo


Re: Preprocessor for assembler macros?

2009-03-16 Thread Ph . Marek
Philipp Marek philipp at marek.priv.at writes:
  gcc -S tmp.S for some reason prints to stdout, so gcc -S tmp.S > tmp.s
  is what you need
 Thank you very much, I'll take a look.
I tried very hard to achieve that; and one time it seemed to work, but I cannot
make it work again.

As an example I'm trying to expand the macros in the linux kernel source file
   arch/x86/kernel/entry_64.S

I tried to call gcc -S, to put the various -I.. paths as needed, and I even
renamed my as to as.bin and tried to get the assembler source directly (by
using gcc -S $COLLECT_GCC_OPTIONS sourcefile) ...

I cannot make it work again ...


Do you have some other hint for me?

Thank you very much.


Regards,

Phil




Re: -mfpmath=sse,387 is experimental ?

2009-03-16 Thread Zuxy Meng

Hi,

Timothy Madden terminato...@gmail.com wrote in message
news:5078d8af0903120218i23b69a4bma28ad9b3f1bd4...@mail.gmail.com...

On Thu, Mar 12, 2009 at 1:15 AM, Jan Hubicka hubi...@ucw.cz wrote:

Timothy Madden wrote:
 Hello

 Is -mfpmath=both for i386 and x86-64 still experimental in gcc 4.3, as
 stated in the online manual page ?

[...]


The fundamental problem here is that backend lies to compiler about the
fact that FP operation can not take one operand from SSE and other from
X87.  This is something I want to look into once I have more time.  With
new RA, perhaps we can drop all these fake constraints.


That would be great !
I am sure having twice the number of registers (sse+387) would make a
big difference.

Even if the SSE and FPU instruction sets cannot mix operands, using both
at the same time (each with its own registers) would be an improvement.

Until then I have a question: if I compile with -msse, then would using
-mfpmath=387 keep floating-point operations from stealing SSE registers
that are already used by CPU operations? And would using -mfpmath=sse make
the FPU and CPU share the SSE registers and compete for them?

How would I know if my AMD Sempron 2200+ has separate execution units
for SSE and FPU instructions, with independent registers?


Most CPUs use the same FP unit for both x87 and SIMD operations, so it
wouldn't give you double the performance. The only exception I know of is
the K6-2/3, whose x87 and 3DNow! units are separate.


--
Zuxy 





Re: GCC 4.4.0 Status Report (2009-03-13)

2009-03-16 Thread Paolo Bonzini
NightStrike wrote:
 On Fri, Mar 13, 2009 at 1:58 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 Given the SC request we need to stay in Stage 4 rather than trying to work
 around it.
 
 What if GCC went back to stage 3 until the issue is resolved, thus
 opening the door for a number of stage3-type patches that don't affect
 1) licensing and 2) plugin frameworks, but are merely bug fixes which
 would have long been shaken out by now.

No, not at all.  The only benefit we're having from this is that GCC 4.4
should be quite stable already in GCC 4.4.0, let's not destroy this one too.

Paolo


Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax

2009-03-16 Thread Martin Guy
On 3/16/09, Paolo Bonzini bonz...@gnu.org wrote:
AND R1, R0, #31
MOV R2, R2, SHIFT R1
ANDS R1, R0, #32
MOVNE R2, R2, SHIFT #31
MOVNE R2, R2, SHIFT #1

  or

ANDS R1, R0, #32
MOVNE R2, R2, SHIFT #-32
SUB R1, R1, R0  ; R1 = (x >= 32 ? 32 - x : -x)
MOV R2, R2, SHIFT R1

Thanks for the tips. Yes, I was contemplating cooking up something
like that, hobbled by the fact that if you use maverick instructions
conditionally you either have to put seven nops either side of them or
risk death by astonishment.

M


Re: -mfpmath=sse,387 is experimental ?

2009-03-16 Thread Tim Prince
Zuxy Meng wrote:
 Hi,
 
 Timothy Madden terminato...@gmail.com wrote in message
 That would be great !
 I am sure having twice the number of registers (sse+387) would make a
 big difference.
You're not counting the rename registers, you're talking about 32-bit mode
only, and you're discounting the different mode of accessing the registers.


 How would I know if my AMD Sempron 2200+ has separate execution units
 for SSE and
 FPU instructions, with independent registers ?
 
 Most CPUs use the same FP unit for both x87 and SIMD operations, so it
 wouldn't give you double the performance. The only exception I know of
 is the K6-2/3, whose x87 and 3DNow! units are separate.
 
-march=pentium-m observed the preference of those CPUs for mixing the
types of code.  This was due more to the limited issue rate for SSE
instructions than to the expanded number of registers in use.  You are
welcome to test it on your CPU; however, AMD CPUs were designed to perform
well with SSE alone, particularly in 64-bit mode.



RE: ARM compiler rewriting code to be longer and slower

2009-03-16 Thread Ramana Radhakrishnan
[Resent because of account funnies. Apologies to those who get this twice]

Hi,

  This problem is reported every once in a while, all targets with small
  load-immediate instructions suffer from this, especially since GCC 4.0
  (i.e. since tree-ssa).  But it seems there is just not enough interest
  in having it fixed somehow, or someone would have taken care of it by
  now.
 
  I've summed up before how the problem _could_ be fixed, but I can't
  find where.  So here we go again.
 
  This could be solved in CSE by extending the notion of related
  expressions to constants that can be generated from other constants
  by a shift. Alternatively, you could create a simple, separate pass
  that applies CSE's related expressions thing in dominator tree walk.
 
 See http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00158.html for handling
 something similar when related expressions differ by a small additive
 constant.  I am planning to finish this and submit it for 4.5.

Wouldn't doing this in CSE only solve the problem within an extended basic
block and not necessarily across the program ? Surely you'd want to do it
globally or am I missing something very basic here ?
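
For readers unfamiliar with the idea quoted above, "constants that can be
generated from other constants by a shift" can be illustrated with a small
standalone check. This is only a sketch and not GCC code: shift_relating()
and derive_from() are hypothetical helper names, and 32-bit constants with
left shifts only are an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Return the left-shift amount s (0..31) such that (base << s) == target,
   or -1 if target cannot be produced from base by a left shift.  */
static int shift_relating(uint32_t base, uint32_t target)
{
    for (int s = 0; s < 32; s++)
        if ((uint32_t)(base << s) == target)
            return s;
    return -1;
}

/* Rebuild the related constant from the one already held in a register,
   which is what a single shift instruction would do.  */
static uint32_t derive_from(uint32_t base, int s)
{
    assert(s >= 0 && s < 32);
    return base << s;
}
```

With 0xff already available, 0xff00 is related to it by a shift of 8, so it
could be formed with one shift instead of a second constant load; 0x1ff is
not related to 0xff in this sense.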

Ramana





Does gcc provide any function to build def-use chain in RTL form

2009-03-16 Thread villa gogh
hi
I'm now trying to construct def-use chains after PASS_LEAF_REGS,
because the SSA form has been destroyed by the earlier passes.
I have found that GCC provides a way to build def-use chains in
PASS_REGRENAME, but they only cover the defs and uses within one
basic block.

So if I want the global def-use data for the whole function,
do I need to construct it myself?

Does gcc provide any function to build the def-use chain in RTL form?

thank you


Re: Does gcc provide any function to build def-use chain in RTL form

2009-03-16 Thread Paolo Bonzini
villa gogh wrote:
 hi
 I'm now trying to construct def-use chains after PASS_LEAF_REGS,
 because the SSA form has been destroyed by the earlier passes.
 I have found that GCC provides a way to build def-use chains in
 PASS_REGRENAME, but they only cover the defs and uses within one
 basic block.

No, don't look at those.  Instead look at fwprop.c which uses use-def
chains -- DU chains are the same but they are computed with

  df_chain_add_problem (DF_DU_CHAIN);

instead of

  df_chain_add_problem (DF_UD_CHAIN);

before df_analyze.

fwprop accesses use-def chains by using DF_REF_CHAIN (use); def-use
chains are the same but the DF_REF_CHAIN macro is used with a def
argument instead.
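
The remark that the two kinds of chain hold the same information, looked up
from opposite ends, can be illustrated outside GCC with a toy model. The
sketch below does not use the df API at all: struct toy_insn, build_ud()
and build_du() are hypothetical names, and the code handles straight-line
code only, where the reaching definition of a use is simply the closest
earlier write to the same register.

```c
#include <assert.h>

enum { MAX_INSNS = 16, MAX_USES = 2, NO_DEF = -1 };

/* Toy instruction: dest = src[0] op src[1]; a source of -1 is unused.  */
struct toy_insn {
    int dest;
    int src[MAX_USES];
};

/* Use-def chains: ud[i][k] is the index of the instruction that defines
   source k of instruction i, or NO_DEF if nothing defines it.  */
static void build_ud(const struct toy_insn *code, int n, int ud[][MAX_USES])
{
    assert(n <= MAX_INSNS);
    for (int i = 0; i < n; i++)
        for (int k = 0; k < MAX_USES; k++) {
            ud[i][k] = NO_DEF;
            if (code[i].src[k] < 0)
                continue;
            for (int j = i - 1; j >= 0; j--)
                if (code[j].dest == code[i].src[k]) {
                    ud[i][k] = j;
                    break;
                }
        }
}

/* Def-use chains: the same links, indexed by the defining instruction.
   du[d] lists the instructions using the value defined by instruction d,
   and du_count[d] is the length of that list.  */
static void build_du(int n, int ud[][MAX_USES],
                     int du[][MAX_INSNS * MAX_USES], int *du_count)
{
    assert(n <= MAX_INSNS);
    for (int d = 0; d < n; d++)
        du_count[d] = 0;
    for (int i = 0; i < n; i++)
        for (int k = 0; k < MAX_USES; k++)
            if (ud[i][k] != NO_DEF) {
                int d = ud[i][k];
                du[d][du_count[d]++] = i;
            }
}
```

In GCC itself the df framework computes these links across basic blocks;
the toy only shows why one set of links serves both directions.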

Paolo


Re: GCC 4.4.0 Status Report (2009-03-13)

2009-03-16 Thread Jack Howarth
What about allowing for more backports from the graphite
branch if this drags out for an extended period of time? In
particular, I am thinking of those changes in graphite branch
that might reduce those cases where -fgraphite-identity
degrades the performance of the resulting code.
 Jack

On Mon, Mar 16, 2009 at 11:10:07AM +0100, Paolo Bonzini wrote:
 NightStrike wrote:
  On Fri, Mar 13, 2009 at 1:58 PM, Joseph S. Myers
  jos...@codesourcery.com wrote:
  Given the SC request we need to stay in Stage 4 rather than trying to work
  around it.
  
  What if GCC went back to stage 3 until the issue is resolved, thus
  opening the door for a number of stage3-type patches that don't affect
  1) licensing and 2) plugin frameworks, but are merely bug fixes which
  would have long been shaken out by now.
 
 No, not at all.  The only benefit we're having from this is that GCC 4.4
 should be quite stable already in GCC 4.4.0, let's not destroy this one too.
 
 Paolo


Re: sign/zero extension of function arguments on x86-64

2009-03-16 Thread Rafael Espindola
I got mixed results with icc

for
--
short a;
void g(short);
void f(void)
{ g(a); }
--

it produces a movswl. For

---
void g(int);
void f(short a) {
 g(a);
}
--

it produces a  movswq.

For the original test
-
void g(short);
void f(short a) {
 g(a);
}
--

it avoids the extension.

Cheers,
-- 
Rafael Avila de Espindola

Google | Gordon House | Barrow Street | Dublin 4 | Ireland
Registered in Dublin, Ireland | Registration Number: 368047


Re: Typo or intended?

2009-03-16 Thread Andrew Haley
Bingfeng Mei wrote:

 I just updated our port to include the last 2-3 weeks of GCC
 development. I noticed a large number of test failures at -O1 that
 use a user-defined data type (based on a special register file of
 our processor). All variables of this type are now spilled to memory,
 which we don't allow at -O1 because it is too expensive. After
 investigation, I found that the following new code causes the
 trouble. I don't quite understand the function of the new code, but
 I don't see what's special about -O1 in terms of register allocation
 in comparison with higher optimization levels. If I change it to
 (optimize < 1), everything is fine as before. I am starting to wonder
 whether (optimize <= 1) is a typo or intended. Thanks in advance.

-O1 is supposed to allow debugging but still optimize, so it's quite
possible that Vlad did intend to do this.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39432

Andrew.



RE: ARM compiler rewriting code to be longer and slower

2009-03-16 Thread Adam Nemet
Ramana Radhakrishnan writes:
 [Resent because of account funnies. Apologies to those who get this twice]
 
 Hi,
 
   This problem is reported every once in a while, all targets with small
   load-immediate instructions suffer from this, especially since GCC 4.0
   (i.e. since tree-ssa).  But it seems there is just not enough interest
   in having it fixed somehow, or someone would have taken care of it by
   now.
  
   I've summed up before how the problem _could_ be fixed, but I can't
   find where.  So here we go again.
  
   This could be solved in CSE by extending the notion of related
   expressions to constants that can be generated from other constants
   by a shift. Alternatively, you could create a simple, separate pass
   that applies CSE's related expressions thing in dominator tree walk.
  
  See http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00158.html for handling
  something similar when related expressions differ by a small additive
  constant.  I am planning to finish this and submit it for 4.5.
 
 Wouldn't doing this in CSE only solve the problem within an extended basic
 block and not necessarily across the program ? Surely you'd want to do it
 globally or am I missing something very basic here ?

No, you're not.  There are plans to move some of what's in CSE to a new LCM
(global) pass.  Also note that for a global pass you clearly need a more
sophisticated cost model for deciding when CSEing is beneficial.  On a
multi-scalar architecture, instructions synthesizing consts sometimes appear
to be free whereas holding a value in a register for an extended period of
time is not.

Adam


Typo or intended?

2009-03-16 Thread Bingfeng Mei
Hello,
I just updated our port to include the last 2-3 weeks of GCC development. I
noticed a large number of test failures at -O1 that use a user-defined data
type (based on a special register file of our processor). All variables of
this type are now spilled to memory, which we don't allow at -O1 because it
is too expensive. After investigation, I found that the following new code
causes the trouble. I don't quite understand the function of the new code,
but I don't see what's special about -O1 in terms of register allocation in
comparison with higher optimization levels. If I change it to (optimize < 1),
everything is fine as before. I am starting to wonder whether (optimize <= 1)
is a typo or intended. Thanks in advance.

Cheers,
Bingfeng Mei
Broadcom UK

  if ((! flag_caller_saves && ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
      /* For debugging purposes don't put user defined variables in
         callee-clobbered registers.  */
      || (optimize <= 1      <-- why include -O1?
          && (attrs = REG_ATTRS (regno_reg_rtx [ALLOCNO_REGNO (a)])) != NULL
          && (decl = attrs->decl) != NULL
          && VAR_OR_FUNCTION_DECL_P (decl)
          && ! DECL_ARTIFICIAL (decl)))
    {
      IOR_HARD_REG_SET (ALLOCNO_TOTAL_CONFLICT_HARD_REGS (a),
                        call_used_reg_set);
      IOR_HARD_REG_SET (ALLOCNO_CONFLICT_HARD_REGS (a),
                        call_used_reg_set);
    }
  else if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
    {
      IOR_HARD_REG_SET (ALLOCNO_TOTAL_CONFLICT_HARD_REGS (a),
                        no_caller_save_reg_set);
      IOR_HARD_REG_SET (ALLOCNO_TOTAL_CONFLICT_HARD_REGS (a),
                        temp_hard_reg_set);
      IOR_HARD_REG_SET (ALLOCNO_CONFLICT_HARD_REGS (a),
                        no_caller_save_reg_set);
      IOR_HARD_REG_SET (ALLOCNO_CONFLICT_HARD_REGS (a),
                        temp_hard_reg_set);
    }


Re: ARM compiler rewriting code to be longer and slower

2009-03-16 Thread Steven Bosscher
On Mon, Mar 16, 2009 at 2:52 PM, Ramana Radhakrishnan
ramana.radhakrish...@arm.com wrote:
 Wouldn't doing this in CSE only solve the problem within an extended basic
 block and not necessarily across the program ? Surely you'd want to do it
 globally or am I missing something very basic here ?

Why so serious^Wsurely?

I think doing this optimization over extended basic blocks would catch
90% of the cases.  The loop-carried form is covered by auto-increment
generation (and yes I know that pass also needs to be improved ;-)

Ciao!
Steven


Re: ARM compiler rewriting code to be longer and slower

2009-03-16 Thread Daniel Berlin
On Mon, Mar 16, 2009 at 12:11 PM, Adam Nemet ane...@caviumnetworks.com wrote:
 Ramana Radhakrishnan writes:
 [Resent because of account funnies. Apologies to those who get this twice]

 Hi,

   This problem is reported every once in a while, all targets with small
   load-immediate instructions suffer from this, especially since GCC 4.0
   (i.e. since tree-ssa).  But it seems there is just not enough interest
   in having it fixed somehow, or someone would have taken care of it by
   now.
  
   I've summed up before how the problem _could_ be fixed, but I can't
   find where.  So here we go again.
  
   This could be solved in CSE by extending the notion of related
   expressions to constants that can be generated from other constants
   by a shift. Alternatively, you could create a simple, separate pass
   that applies CSE's related expressions thing in dominator tree walk.
 
  See http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00158.html for handling
  something similar when related expressions differ by a small additive
  constant.  I am planning to finish this and submit it for 4.5.

 Wouldn't doing this in CSE only solve the problem within an extended basic
 block and not necessarily across the program ? Surely you'd want to do it
 globally or am I missing something very basic here ?

 No, you're not.  There are plans to move some of what's in CSE to a new LCM
 (global) pass.  Also note that for a global pass you clearly need a more
 sophisticated cost model for deciding when CSEing is beneficial.  On a
 multi-scalar architecture, instructions synthesizing consts sometimes appear
 to be free whereas holding a value in a register for an extended period of
 time is not.


Right. You probably want something closer to Nigel Horspool's
isothermal speculative PRE, which takes into account (using
heuristics and profiles) where the best place to put things is based
on costs, instead of LCM, which uses a notion of lifetime optimality.

See http://webhome.cs.uvic.ca/~nigelh/pubs.html for "Fast
Profile-Based Partial Redundancy Elimination".

There was a working implementation of this done for GCC 4.1 that used
profile info and execution counts.
If you are interested, and can hunt down David Pereira (he isn't at
UVic anymore, and I haven't talked to him since, so I don't have his
email), he'd probably give you the code :)


Re: Fwd: Mips, -fpie and TLS management

2009-03-16 Thread Joel Porquet
2009/3/12 Daniel Jacobowitz d...@false.org:
 On Thu, Mar 12, 2009 at 02:02:36PM +0100, Joel Porquet wrote:
  Check what symbol is at, or near, 0x40030000 + 22368.  It's probably
  the GOT plus a constant bias.

 It seems there is nothing at this address. Here is the program header:

 Don't know then.  Look at compiler-generated assembly instead of
 disassembly; that often helps.

Do you mean the object file produced by gcc before linkage?
If yes, the code looks like:

3c050000    lui     a1,0x0
40: R_MIPS_TLS_DTPREL_HI16  a

which will be computed later as

3c054003    lui     a1,0x4003

 By the way, how did you test the code of TLS for mips? I mean, uclibc
 seems the more advanced lib for mips, and although this lib seems to
 have the necessary code to manage tls once it is installed, the ldso
 doesn't contain any code for handling TLS (relocation, tls allocation,
 etc)...

 That statement about uclibc strikes me as bizarre.  I tested it with
 glibc, naturally.  GLIBC has a much more reliable TLS implementation
 than uclibc's in-progress one.

I just downloaded the glibc archive without noticing that the mips
port was in another archive... My mistake..

  Last question, is there a difference between DSO and PIE objects other
  than the INTERP entry in the program header?
 
  Yes.  Symbol preemption is allowed for DSOs but not for PIEs or normal
  executables.  That explains the different choice of model.

 But this is only a property, isn't it? I was meaning, how can you
 differentiate them at loading time, when you analyse the ELF file.

 You can't.

 As you surely know, ELF_R_SYM() macro performs (val >> 8) which gives
 the symbol index in order to retrieve the name of the symbol. This
 name then allows to look up the symbol. Unfortunately, in the case of
 local-dynamic, ELF_R_SYM will return 0 which is not correct (the same
 for global-dynamic will return 9): we can see by the way that readelf
 is not able to get the symbol name. What do you think about this?

 This is a *module* relocation.  In local dynamic the module is always
 the current DSO; it does not need a symbol.

But what if the DSO accesses another module's TLS?

Finally, I noticed another problem. GCC seems not to make room for the
4 arguments as specified in the ABI when calling __tls_get_addr.
For example, here is an extract of the calling code (we see that
data are stored directly at the top of the stack):

...
5ffe0bfc:   27bdfff0    addiu   sp,sp,-16
5ffe0c00:   afbf000c    sw      ra,12(sp)
5ffe0c04:   afbc0000    sw      gp,0(sp)
5ffe0c08:   afa40010    sw      a0,16(sp)
5ffe0c0c:   1000000d    b       5ffe0c44 <puts+0x54>
5ffe0c10:   00000000    nop
5ffe0c14:   8f998030    lw      t9,-32720(gp)
5ffe0c18:   27848038    addiu   a0,gp,-32712
5ffe0c1c:   0320f809    jalr    t9
5ffe0c20:   00000000    nop
5ffe0c24:   8fbc0000    lw      gp,0(sp)
...

The jalr t9 is the call to __tls_get_addr, whose code is:

...
5ffe0b40:   27bdffe8    addiu   sp,sp,-24
5ffe0b44:   afbc0000    sw      gp,0(sp)
5ffe0b48:   afa40018    sw      a0,24(sp)
5ffe0b4c:   7c03e83b    0x7c03e83b
...

We notice then that sw a0,24(sp) will overwrite the $gp that was saved at
the same place (sw gp,0(sp)) by the caller.
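
The overlap described above is plain address arithmetic, and can be
sketched in C. This is an illustration only: the function names are
hypothetical, the stack pointer value used in any check is arbitrary, and
the 16-byte figure assumes the o32 convention in which the caller reserves
four 4-byte home slots for a0..a3 at the bottom of its frame.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed o32 convention: four 4-byte home slots for a0..a3 sit at
   0..15(sp) in the caller's frame and belong to the callee.  */
enum { O32_ARG_HOME_BYTES = 16 };

/* Absolute address written by a callee store to off(sp), after the callee
   has run "addiu sp,sp,-callee_frame".  */
static uint32_t callee_store_addr(uint32_t caller_sp, uint32_t callee_frame,
                                  uint32_t off)
{
    assert(callee_frame <= caller_sp);
    return caller_sp - callee_frame + off;
}

/* Non-zero when a caller save slot at slot_off(sp) lies inside the home
   area that the callee is allowed to overwrite.  */
static int overlaps_arg_home_area(uint32_t slot_off)
{
    return slot_off < O32_ARG_HOME_BYTES;
}
```

With a 24-byte callee frame, a store to 24(sp) in the callee lands on the
caller's 0(sp). A $gp save slot at offset 0 is therefore inside the home
area, while an offset of 16 or more would be outside it.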

Regards,

Joel


Re: Fwd: Mips, -fpie and TLS management

2009-03-16 Thread Daniel Jacobowitz
On Mon, Mar 16, 2009 at 06:19:01PM +0100, Joel Porquet wrote:
 2009/3/12 Daniel Jacobowitz d...@false.org:
  On Thu, Mar 12, 2009 at 02:02:36PM +0100, Joel Porquet wrote:
   Check what symbol is at, or near, 0x40030000 + 22368.  It's probably
   the GOT plus a constant bias.
 
  It seems there is nothing at this address. Here is the program header:
 
  Don't know then.  Look at compiler-generated assembly instead of
  disassembly; that often helps.
 
 Do you mean the object file produced by gcc before linkage?

That will do, but the actual assembly (-S) is more helpful sometimes.

  This is a *module* relocation.  In local dynamic the module is always
  the current DSO; it does not need a symbol.
 
 But what if the DSO access other module's TLS?

Then it does not use Local Dynamic to do so.

 
 Finally, I noticed another problem. GCC seems to not make room for the
 4 arguments as specified in the ABI, when calling __tls_get_addr.
 For example, here is an extract of the code for calling (we see that
 data are stored directly at the top of the stack):
 
 ...
 5ffe0bfc: 27bdfff0    addiu   sp,sp,-16
 5ffe0c00: afbf000c    sw      ra,12(sp)
 5ffe0c04: afbc0000    sw      gp,0(sp)

That line is bogus.  Figure out where it came from; the cprestore
offset should not be zero.

-- 
Daniel Jacobowitz
CodeSourcery


[Fwd: gomp - cost of threadprivate data access]

2009-03-16 Thread Toon Moene

[ Perhaps we need a somewhat larger audience for this one, as it isn't a
  gfortran specific issue (despite the COMMONs). ]

The reporter of this problem (perhaps it's necessary to open a bugzilla 
PR) uses:


It is GNU/linux on x86_64, fedora 10

kernel 2.6.27.12-170.2.5.fc10.x86_64
glibc-2.9-3.x86_64

--
Toon Moene - e-mail: t...@moene.org (*NEW*) - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html
---BeginMessage---

Hello,

We have parallelized a relatively large f77 project (GEANT3, ~200k loc) using 
OpenMP.


Now we are running comparisons between the standard and parallel versions,
and it turns out that just making the commons threadprivate results in a
20 percent speed penalty. This extra time is spent in the __tls_get_addr()
function, which seems to be called for every access to a threadprivate
variable.


Would it be in principle possible to optimize this access?

I figure that the base address of all referenced commons could be obtained once 
per function thus drastically reducing the __tls_get_addr() call count.
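
The idea of fetching the base address once per function can be sketched by
hand in C. This is only an illustration of the transformation being asked
for, not what the compiler or libgomp does: the array named block is a
hypothetical stand-in for a threadprivate COMMON, __thread is the GCC
extension for thread-local storage, and whether each access in
sum_direct() really goes through __tls_get_addr() depends on the TLS model
and optimization level in use.

```c
#include <assert.h>

#define NVALS 4

/* Stand-in for a threadprivate COMMON block (hypothetical name).  */
static __thread double block[NVALS];

static void fill_block(void)
{
    for (int i = 0; i < NVALS; i++)
        block[i] = (double)(i + 1);
}

/* Every access names the thread-local object directly, so each one may
   need its own TLS address lookup.  */
static double sum_direct(int reps)
{
    assert(reps >= 0);
    double s = 0.0;
    for (int r = 0; r < reps; r++)
        for (int i = 0; i < NVALS; i++)
            s += block[i];
    return s;
}

/* The thread-local address is taken once and reused through an ordinary
   pointer for the rest of the function.  */
static double sum_hoisted(int reps)
{
    assert(reps >= 0);
    double *base = block;
    double s = 0.0;
    for (int r = 0; r < reps; r++)
        for (int i = 0; i < NVALS; i++)
            s += base[i];
    return s;
}
```

Both functions compute the same sum; the second just makes the single
address lookup explicit. The pointer is only valid in the thread that took
it.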


We are using gcc-4.3 branch from the beginning of February, with patches to 
allow equivalence statements among threadprivate data.


Callgrind output of a sample run is available at:

-O2 https://mtadel.web.cern.ch/mtadel/callgrind.out.13032
-O2 -g  https://mtadel.web.cern.ch/mtadel/callgrind.out.13055

Best,
Matevz

---End Message---


Re: [Fwd: gomp - cost of threadprivate data access]

2009-03-16 Thread Steven Bosscher
On Mon, Mar 16, 2009 at 7:06 PM, Toon Moene t...@moene.org wrote:
 [ Perhaps we need a somewhat larger audience for this one, as it isn't a
  gfortran specific issue (despite the COMMONs). ]

 The reporter of this problem (perhaps it's necessary to open a bugzilla PR)
 uses:

 It is GNU/linux on x86_64, fedora 10

 kernel 2.6.27.12-170.2.5.fc10.x86_64
 glibc-2.9-3.x86_64

The __tls_get_addr() calls should already be optimized if the proper
TLS model is used.
Do we have a test case?

Ciao!
Steven


Re: improve -fverbose-asm option

2009-03-16 Thread Ian Lance Taylor
Eric Fisher joefoxr...@gmail.com writes:

 I'd like to get more helpful information from the final .S file, such
 as basic block info, so that I can draw a cfg graph through a script.

The basic block information and the CFG graph is not reliable at that
point in the compilation.  Your patch will work reliably for some
targets and optimization levels but not for others.  The CFG information
is messed up by the machine dependent reorg pass and the delay slot
pass.  I would be worried about confusing people.


 Also, I think it will be better to generate one label for each basic
 block, and the local label should have the function name as the
 suffix. Because some profile tools, such as oprofile, will output
 samples based on the labels. So this will help us to analyze the
 samples for each basic block. But current generated code will have
 many local labels with the same name. Perhaps it's again the
 -fverbose-asm to enable this functionality. But where should I go if I
 wanna implement this functionality?

The local labels used for blocks are normally discarded by the assembler
and thus are never seen by tools like oprofile.  Using named symbols for
basic blocks seems like a reasonable option if it will indeed give
better information from oprofile, but it should be an option separate
from -fverbose-asm.  The labels in RTL are CODE_LABEL insns, so you
would want to change the way that they are emitted in final_scan_insn.
The fact that there can be several CODE_LABELs in sequence doesn't seem
to matter too much, since only one will be picked up by profiling tools.
To be clear, I would want to see that you really do get better results
from profiling tools before accepting such a patch.

Ian


Re: Preprocessor for assembler macros?

2009-03-16 Thread Ian Lance Taylor
Ph. Marek phil...@marek.priv.at writes:

 Philipp Marek philipp at marek.priv.at writes:
  gcc -S tmp.S for some reason prints to stdout, so gcc -S tmp.S > tmp.s
  is what you need
 Thank you very much, I'll take a look.
 I tried very hard to achieve that; and one time it seemed to work, but I 
 cannot
 make it work again.

I already asked you to take this question to a different mailing list,
and I already answered your question.

http://gcc.gnu.org/ml/gcc/2009-03/msg00187.html

Please take any followups to a different mailing list.

Ian


Difference between local/global/parameter array handling

2009-03-16 Thread Jean Christophe Beyler
Dear all,

I've been working on explaining to GCC the cost of loads/stores on my
target and I ran into this problem. Consider the following code:

uint64_t sum = 0;
for (i = 0; i < N; i += 2) {  /* N is defined by a macro */
    z0 = buff[i];
    z1 = buff[i+1];
    sum += z0 + z1;
}

Depending on the type (local/global or parameter of the function) of
buff, I get different code generations for the loop:

For global and local definitions of buff:
$L2:
        ldd     r6,8(r10)
        ldd     r7,0(r10)
        addi    r10,r10,16
        cmpne   r8,r11,r10
        add     r6,r6,r7
        add     r9,r9,r6
        bt      r8,$L2

For the parameter, I get this:
$L7:
        add     r6,r48,r10
        ldd     r8,0(r6)
        ldd     r7,0(r11)
        addi    r10,r10,16
        cmpine  r6,r10,1024
        addi    r11,r11,16
        add     r7,r7,r8
        add     r9,r9,r7
        bt      r6,$L7


I don't see why the compiler handles the case of buff as a
parameter to the function differently. It uses 2 registers and fails
to see that it could use the same one with an offset, as it does
in the global/local cases. Any idea why this happens in my code
generation?

Now that I look at this, I wonder if it's an address issue. If you
compare the way it handles the end test, for local and global (where
the compiler has the information about the array), the compare is done
using the end address of the array, whereas this is no longer the case
for the parameter, where it uses the number of iterations instead.

I have just now confirmed this by defining the global array as a
pointer or an array (int *tab or int tab[128];). In the case of the
array, I get the solution I would expect. In the case of the pointer,
I get the version that I do not like. Any ideas?

Thank you very much for your help,
Jean Christophe Beyler


Re: Understand BLKmode and returning structure in register.

2009-03-16 Thread Richard Sandiford
Bingfeng Mei b...@broadcom.com writes:
> In foo function, compute_record_mode function will set the mode for
> struct COMPLEX as BLKmode partly because STRICT_ALIGNMENT is 1 on my
> target. In TARGET_RETURN_IN_MEMORY hook, I return 1 for BLKmode type
> and 0 otherwise for small size (8) (like MIPS).  Thus, this structure
> is still returned through memory, which is not very efficient. More
> importantly, ABI is NOT FIXED under such situation. If an assembly
> code programmer writes a function returning a structure, how does he
> know the structure will be treated as BLKmode or otherwise? So he
> doesn't know whether to pass result through memory or register. Do I
> understand correctly?

Yes.  I think having TARGET_RETURN_IN_MEMORY depend on internal details
like the RTL mode is often seen as an historical mistake.  As you say,
the ABI should be defined directly by the type instead.
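
As a rough illustration of a rule that depends only on the type, the
following self-contained C sketch decides "memory or register" from the
size alone. It is a toy model, not GCC code: returns_in_memory,
MAX_REGISTER_RETURN_BYTES and the two structs are hypothetical names, and
treating 8 bytes as an inclusive limit is an assumption based on the size
mentioned in the quoted message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed ABI limit: aggregates of up to this many bytes are returned
   in registers.  The figure 8 comes from the quoted message. */
#define MAX_REGISTER_RETURN_BYTES 8

/* Example aggregates (hypothetical, for illustration only). */
struct complex16 { short re, im; };
struct wide_pair { long long a, b; };

/* Type-based rule: the answer depends only on the size of the type,
   which an assembly programmer can work out from the declaration.
   It does not depend on whether the compiler internally picked
   BLKmode or an integer mode for the struct. */
bool returns_in_memory(size_t type_size)
{
    assert(type_size > 0);
    return type_size > MAX_REGISTER_RETURN_BYTES;
}
```

With such a rule the alignment-driven choice of BLKmode no longer leaks
into the calling convention, which is the property the paragraph above
asks for.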

Unfortunately, once you start using a mode, it's difficult to stop
using a mode without breaking compatibility.  So one of the main reasons
the MIPS port still uses the mode is because no-one dares touch it.

Likewise, it's now difficult to change the mode attached to a structure
(which could potentially make structure accesses more efficient) without
accidentally breaking someone's ABI.

> On the other hand, if I return 0 only according to struct type's size
> regardless of BLKmode or not, GCC will produce very inefficient
> code. For example, stack setup code in foo is still generated even if it
> is totally unnecessary.

Yeah, there's definitely room for improvement here.  And as you say,
it's already a problem for MIPS.  I think it's just one of those things
that doesn't occur often enough in critical code for anyone to have
spent time optimising it.

Richard


generic bug in fixed-point constant folding

2009-03-16 Thread Sean D'Epagnier
Hi,

I think I found a generic problem for fixed point constant folding.

In fold-const.c:11872 gcc tries to apply:
  /* Transform (x >> c) << c into x & (-1<<c), or transform (x << c) >> c
     into x & ((unsigned)-1 >> c) for unsigned types.  */

I attached a simple patch which fixes the problem by not applying this
optimization to fixed-point types.  I would like to keep this
optimization, since it is possible in principle, but fixed-point
types do not support bitwise operations like & | ^ ~.  Unless these are
supported internally somehow (without exposing them to the user), the
transformation can't take place.
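
For reference, the identity behind this fold can be checked on ordinary
unsigned integers with a small C sketch. The function names are made up
for the illustration, and an all-ones unsigned constant is used instead
of -1 so that the shifts are well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* (x >> c) << c clears the low c bits of x. */
uint32_t shift_right_then_left(uint32_t x, unsigned c)
{
    assert(c < 32);  /* shifting by the full width is undefined in C */
    return (x >> c) << c;
}

/* Folded form: x & (-1 << c), with the mask built from UINT32_MAX. */
uint32_t mask_low_bits(uint32_t x, unsigned c)
{
    assert(c < 32);
    return x & (UINT32_MAX << c);
}

/* (x << c) >> c clears the high c bits of an unsigned x. */
uint32_t shift_left_then_right(uint32_t x, unsigned c)
{
    assert(c < 32);
    return (x << c) >> c;
}

/* Folded form: x & ((unsigned)-1 >> c). */
uint32_t mask_high_bits(uint32_t x, unsigned c)
{
    assert(c < 32);
    return x & (UINT32_MAX >> c);
}
```

The folded forms need a bitwise AND on the operand type, which is exactly
the operation that is not available for fixed-point types.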

I am open to other suggestions.  For future reference, should this be
posted as a bug report?  It seems simple enough that it could be
included right away, but I feel that if it's a bug report no one will
notice, since fixed-point support is not widely used.

Sean
Index: fold-const.c
===
--- fold-const.c	(revision 144210)
+++ fold-const.c	(working copy)
@@ -11877,7 +11877,8 @@ fold_binary (enum tree_code code, tree t
 	  && host_integerp (arg1, false)
 	  && TREE_INT_CST_LOW (arg1) < TYPE_PRECISION (type)
 	  && host_integerp (TREE_OPERAND (arg0, 1), false)
-	  && TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1)) < TYPE_PRECISION (type))
+	  && TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1)) < TYPE_PRECISION (type)
+	  && TREE_CODE (type) != FIXED_POINT_TYPE)
 	{
 	  HOST_WIDE_INT low0 = TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1));
 	  HOST_WIDE_INT low1 = TREE_INT_CST_LOW (arg1);


[Bug libobjc/39465] libobjc does not find classes of DLLs

2009-03-16 Thread ayers at gcc dot gnu dot org


--- Comment #2 from ayers at gcc dot gnu dot org  2009-03-16 07:27 ---
So the situation seems to be:
- libobjc is a static library.
- libfoo is a dll statically linked against libobjc.
- test is a program which is linked against both libfoo and libobjc.

I'm guessing here since I have no experience with mingw or with linking libobjc
statically, but I could imagine that you may have two copies of libobjc in your
executable, each with its own set of runtime structures, which may cause
confusion.

Is there any reason why libobjc isn't dynamically linked if you are going to use
DLLs?

Note I'll still need to build a mingw compiler and look into the auto-import
warning and I'm not sure when I'll get around to it, so I haven't assigned the
bug yet in case someone else can easily test it.

Cheers,
David


-- 

ayers at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||ayers at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39465



[Bug debug/39355] [4.4 Regression] Revision 144529 miscompiled libcpp/expr.c

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #25 from jakub at gcc dot gnu dot org  2009-03-16 07:52 ---
I'd say first try to add noinline attribute on all callers of num_positive, if
it fails even with those, add also __attribute__((__optimize__(0))) to them one
by one.  If the noinline attribute to those makes the miscompilation go away,
search one by one which one it is and retry with all callers of that function.
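
A minimal sketch of what those annotations look like in source, assuming
GCC's attribute syntax. The functions suspect, caller_a and caller_b are
hypothetical placeholders; the real num_positive in libcpp/expr.c and its
callers have different signatures.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the function under suspicion. */
static int suspect(int x)
{
    assert(x != INT_MIN);  /* -INT_MIN would overflow */
    return x < 0 ? -x : x;
}

/* Step 1: mark a caller noinline, so that it is never merged into its
   own callers and stays a separate unit when comparing object code. */
__attribute__((__noinline__))
int caller_a(int x)
{
    return suspect(x) + 1;
}

/* Step 2: if the failure persists, additionally compile one caller at
   a time without optimization. */
__attribute__((__noinline__, __optimize__(0)))
int caller_b(int x)
{
    return suspect(x) * 2;
}
```

The attributes do not change what the functions compute; they only limit
what the optimizer may do with them, which narrows down where the
miscompilation happens.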


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39355



[Bug tree-optimization/39455] [4.3/4.4 Regression] ICE : in compare_values_warnv, at tree-vrp.c:1073

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #7 from jakub at gcc dot gnu dot org  2009-03-16 08:15 ---
Reduced testcase:

/* { dg-do compile } */
/* { dg-options "-O2 -fprefetch-loop-arrays" } */

void
foo (char *x, unsigned long y, unsigned char *z)
{
  unsigned int c[256], *d;

  for (d = c + 1; d < c + 256; ++d)
    *d += d[-1];
  x[--c[z[y]]] = 0;
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



[Bug tree-optimization/39455] [4.3/4.4 Regression] ICE : in compare_values_warnv, at tree-vrp.c:1073

2009-03-16 Thread pinskia at gmail dot com


--- Comment #8 from pinskia at gmail dot com  2009-03-16 08:28 ---
Subject: Re:  [4.3/4.4 Regression] ICE : in compare_values_warnv, at
tree-vrp.c:1073



Sent from my iPhone

On Mar 16, 2009, at 1:15 AM, jakub at gcc dot gnu dot org
gcc-bugzi...@gcc.gnu.org 
  wrote:



> --- Comment #7 from jakub at gcc dot gnu dot org  2009-03-16 08:15 ---
> Reduced testcase:
>
> /* { dg-do compile } */
> /* { dg-options "-O2 -fprefetch-loop-arrays" } */
>
> void
> foo (char *x, unsigned long y, unsigned char *z)
> {
>   unsigned int c[256], *d;
>
>   for (d = c + 1; d < c + 256; ++d)
>     *d += d[-1];
>   x[--c[z[y]]] = 0;

Hmm. Could this be the char-- bug, where the front-end/gimplifier does
not promote that to int?

Thanks,
Andrew Pinski

> }
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



[Bug rtl-optimization/30688] Branch registers loaded too late on ia64

2009-03-16 Thread steven at gcc dot gnu dot org


--- Comment #5 from steven at gcc dot gnu dot org  2009-03-16 08:46 ---
Can someone point me to the IA64 optimization manuals mentioned in comment #0?

I'm looking for some answers, for example:

* Which branch registers can I use? bt-load can actually perform register
renaming.  It has to, of course, because bt-load runs after the register
allocator. The register allocator prefers to always use tr0 on sh64, and it
probably always tries to use the same branch register on ia64 too.  So register
renaming is a Good Thing here.  But which regs can I use on IA64?

* What does "as early as possible" mean in comment #0?  Are there
recommendations for what is considered "too early" (for example due to
interactions with calls and such)?

* What happens if a value is assigned to a branch register on IA64?  Is the
prefetcher always triggered?  What is the latency of the prefetching after a
branch register has been assigned a value?

* Is there a possibility to add a prediction hint to say "branch register A is
more likely to be used than branch register B" when multiple branch registers
are assigned a value in the same basic block?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688



[Bug tree-optimization/39455] [4.3/4.4 Regression] ICE : in compare_values_warnv, at tree-vrp.c:1073

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #9 from jakub at gcc dot gnu dot org  2009-03-16 08:49 ---
No, this seems to be the aprefetch pass's fault; at least on a quick skim,
*.cunroll seems to be OK typewise, while *.aprefetch has:
  D.1649_44 = c + 1024;
  D.1650_43 = (long unsigned int) D.1649_44;
  if (c[2] = D.1650_43)

D.1650 is long unsigned int and c is unsigned int c[256], so obviously the
comparison above is wrong.

Will try to debug it.


-- 

jakub at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |jakub at gcc dot gnu dot org
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2009-03-13 14:03:54 |2009-03-16 08:49:06
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



[Bug driver/39356] assembler isn't called

2009-03-16 Thread ktietz at gcc dot gnu dot org


--- Comment #9 from ktietz at gcc dot gnu dot org  2009-03-16 09:15 ---
(In reply to comment #8)
> (In reply to comment #7)
> > The following patch solves this problem and prevents the name collision
> > for 32 and 64 bits win32 systems.
> >
> > ChangeLog
> >
> > * config/i386/i386.md (allocate_stack_worker_32): Use
> > ___gnu_chkstk.
> > (allocate_stack_worker_64): Likewise.
> > * config/i386/cygwin.asm (__alloca): Renamed to __gnu_alloca.
> > (___chkstk): Renamed to ___gnu_chkstk.
> >
> No. This breaks backward compatibility.  Static libraries and objects built
> with current and older versions of gcc will not be able to resolve references
> to __alloca or ___chkstk.  Why not add labels with the new names as aliases
> rather than replace?
>
> Danny
>

OK, for 32-bit it makes sense to keep the old symbol names. Besides, there is
still a chance that a user manually uses the chkstk.o file, which can lead
to undefined behaviour (at least if the user code references __chkstk).
For 64-bit I prefer to avoid those old names and simply rename them.
Is this OK with you? If so, I'll file a patch for it.

Kai


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39356



[Bug target/39115] [4.3 Regression] Value of variable is not read again

2009-03-16 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

   What|Removed |Added

  Known to fail||4.3.3
  Known to work|4.2.4   |4.2.4 4.4.0
   Priority|P3  |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39115



[Bug tree-optimization/39455] [4.3/4.4 Regression] ICE : in compare_values_warnv, at tree-vrp.c:1073

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #10 from jakub at gcc dot gnu dot org  2009-03-16 09:43 ---
Seems tree-ssa-loop-niter.c has a lot of p+ issues.  The following untested
patch fixes just the number_of_iterations_lt_to_ne bugs and fixes this
testcase:

--- gcc/tree-ssa-loop-niter.c.jj2009-03-04 20:06:31.0 +0100
+++ gcc/tree-ssa-loop-niter.c   2009-03-16 10:30:39.0 +0100
@@ -699,8 +699,10 @@ number_of_iterations_lt_to_ne (tree type
         iv0->base <= iv1->base + MOD.  */
       if (!iv0->no_overflow && !integer_zerop (mod))
        {
-         bound = fold_build2 (MINUS_EXPR, type,
+         bound = fold_build2 (MINUS_EXPR, type1,
                               TYPE_MAX_VALUE (type1), tmod);
+         if (POINTER_TYPE_P (type))
+           bound = fold_convert (type, bound);
          assumption = fold_build2 (LE_EXPR, boolean_type_node,
                                    iv1->base, bound);
          if (integer_zerop (assumption))
@@ -708,6 +710,11 @@ number_of_iterations_lt_to_ne (tree type
        }
       if (mpz_cmp (mmod, bnds->below) < 0)
        noloop = boolean_false_node;
+      else if (POINTER_TYPE_P (type))
+       noloop = fold_build2 (GT_EXPR, boolean_type_node,
+                             iv0->base,
+                             fold_build2 (POINTER_PLUS_EXPR, type,
+                                          iv1->base, tmod));
       else
        noloop = fold_build2 (GT_EXPR, boolean_type_node,
                              iv0->base,
@@ -723,6 +730,8 @@ number_of_iterations_lt_to_ne (tree type
        {
          bound = fold_build2 (PLUS_EXPR, type1,
                               TYPE_MIN_VALUE (type1), tmod);
+         if (POINTER_TYPE_P (type))
+           bound = fold_convert (type, bound);
          assumption = fold_build2 (GE_EXPR, boolean_type_node,
                                    iv0->base, bound);
          if (integer_zerop (assumption))
@@ -730,6 +739,13 @@ number_of_iterations_lt_to_ne (tree type
        }
       if (mpz_cmp (mmod, bnds->below) < 0)
        noloop = boolean_false_node;
+      else if (POINTER_TYPE_P (type))
+       noloop = fold_build2 (GT_EXPR, boolean_type_node,
+                             fold_build2 (POINTER_PLUS_EXPR, type,
+                                          iv0->base,
+                                          fold_unary (NEGATE_EXPR,
+                                                      type1, tmod)),
+                             iv1->base);
       else
        noloop = fold_build2 (GT_EXPR, boolean_type_node,
                              fold_build2 (MINUS_EXPR, type1,

but e.g. number_of_iterations_le doesn't look correct at all either.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



[Bug middle-end/39333] gcc 4.3.3 miscompiles when -finline-small-functions is used

2009-03-16 Thread falk at debian dot org


--- Comment #19 from falk at debian dot org  2009-03-16 10:24 ---
(In reply to comment #18)
> Well, I've got bad news for you anyway:
> it seems that the problem affects gcc-4.3.2 too:
> it seems it's reproducible in another app,
> however one potentially much harder to debug.
> Please read http://bugs.winehq.org/show_bug.cgi?id=17406
> and give some ideas for a test.

The fact that -fno-inline helps gives only very little indication that this is
actually the same problem.

In any case, I don't think there's really anything we can do without a complete
test case (that is, a single file with a main() that exits with 0 when
everything's fine and 1 otherwise). This is very difficult to do for someone
who doesn't know the freeciv codebase.


-- 

falk at debian dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39333



[Bug libobjc/39465] libobjc does not find classes of DLLs

2009-03-16 Thread js-gcc at webkeks dot org


--- Comment #3 from js-gcc at webkeks dot org  2009-03-16 11:24 ---
When the target is mingw32, it seems that libobjc is only built as a static
library. This isn't a bad idea after all, because I guess no win32 user has a
libobjc.so installed somewhere, so you would need to ship that file with every
binary produced from ObjC-sources.

I heard from the GNUstep guys that they had the same problem until they linked
libobjc dynamically. But IMO, this is only a workaround - it should also work
if libobjc is linked statically.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39465



[Bug target/36047] -pg does not work on large binaries and m68k

2009-03-16 Thread mkuvyrkov at gcc dot gnu dot org


--- Comment #2 from mkuvyrkov at gcc dot gnu dot org  2009-03-16 11:35 
---
Would you please attach a preprocessed testcase so one can reproduce the
problem.


-- 

mkuvyrkov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36047



[Bug libobjc/39465] libobjc does not find classes of DLLs

2009-03-16 Thread ayers at gcc dot gnu dot org


--- Comment #4 from ayers at gcc dot gnu dot org  2009-03-16 11:41 ---
Well, consider me a GNUstep guy, yet I'm definitely not a GNUstep on MinGW32
guy. (Or anything on MinGW32... which is why this is a bit difficult, yet I'm
trying to help maintain libobjc so I'll see what I can do.)

Could you please add a link to that discussion?  It seems that I missed it. 
I've found a few mingw32 discussions searching the archive but nothing recent
wrt static linking.

In the meantime I'm learning how to set up a cross tool chain... please be
patient.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39465



[Bug libobjc/39465] libobjc does not find classes of DLLs

2009-03-16 Thread js-gcc at webkeks dot org


--- Comment #5 from js-gcc at webkeks dot org  2009-03-16 11:46 ---
It would be hard to link to that discussion, as it took place IRL at FOSDEM in
the GNUstep Dev Room :).
I reported that bug once on the mingw32 list, but they didn't really care
about it. After speaking to Nicola Pero at FOSDEM, I decided that it'd be best
to file a bug for libobjc - and so I did :).

For building mingw32 with gcc 4, you could have a look at these Port files I
wrote:
https://webkeks.org/hg/crux_ports/file/6062794869e8/mingw32-api
https://webkeks.org/hg/crux_ports/file/6062794869e8/mingw32-binutils
https://webkeks.org/hg/crux_ports/file/6062794869e8/mingw32-gcc
https://webkeks.org/hg/crux_ports/file/6062794869e8/mingw32-runtime


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39465



[Bug debug/37890] Incorrect nesting for DW_TAG_imported_declaration

2009-03-16 Thread jan dot kratochvil at redhat dot com


--- Comment #1 from jan dot kratochvil at redhat dot com  2009-03-16 14:24 
---
Verified: the problem exists on GNU C++ 4.4.0 20090315 (experimental).
Also tried a non-main function and a slightly more complicated function.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37890



[Bug debug/39471] New: DW_TAG_imported_module should be used (not DW_TAG_imported_declaration)

2009-03-16 Thread jan dot kratochvil at redhat dot com
Regression from g++-4.3 for GNU C++ 4.4.0 20090315 (experimental)
(+also for 4.4.0 20090313 (Red Hat 4.4.0-0.26))
For full namespace import one should use DW_TAG_imported_module.

namespace A
{
  int i = 1;
}

int
main ()
{
  using namespace A;
  i = 2;
  return 0;
}

Using g++-4.4 DWARF one must use `A::i' at `main' in the debugger.
The whole namespace `A' should be imported there instead.

WRONG g++-4.4 debuginfo:
 <c>   DW_AT_producer: (indirect string, offset: 0x0): GNU C++ 4.4.0 20090315 (experimental)
 <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2f>   DW_AT_name: (indirect string, offset: 0x7d): main
 <2><51>: Abbrev Number: 3 (DW_TAG_lexical_block)
    <52>   DW_AT_low_pc  : 0x4
    <5a>   DW_AT_high_pc : 0x13
 <3><62>: Abbrev Number: 4 (DW_TAG_imported_declaration)
    <65>   DW_AT_name: A
    <67>   DW_AT_import  : 0x74   [Abbrev Number: 6 (DW_TAG_namespace)]
 <1><74>: Abbrev Number: 6 (DW_TAG_namespace)
    <75>   DW_AT_name: A
 <2><7d>: Abbrev Number: 7 (DW_TAG_variable)
    <7e>   DW_AT_name: i
    <82>   DW_AT_MIPS_linkage_name: (indirect string, offset: 0x74): _ZN1A1iE

Correct g++-4.3 debuginfo:
 <c>   DW_AT_producer: (indirect string, offset: 0x0): GNU C++ 4.3.2 20081105 (Red Hat 4.3.2-7)
 <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2f>   DW_AT_name: (indirect string, offset: 0x80): main
 <2><51>: Abbrev Number: 3 (DW_TAG_imported_module)
    <54>   DW_AT_import  : 0x60   [Abbrev Number: 5 (DW_TAG_namespace)]
 <1><60>: Abbrev Number: 5 (DW_TAG_namespace)
    <61>   DW_AT_name: A
 <2><69>: Abbrev Number: 6 (DW_TAG_variable)
    <6a>   DW_AT_name: i
    <6e>   DW_AT_MIPS_linkage_name: (indirect string, offset: 0x77): _ZN1A1iE
    <72>   DW_AT_type: 0x59

It causes regressions on gdb.cp/namespace-using.exp for the GDB project Archer.


-- 
   Summary: DW_TAG_imported_module should be used (not
DW_TAG_imported_declaration)
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jan dot kratochvil at redhat dot com
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39471



[Bug tree-optimization/39455] [4.3/4.4 Regression] ICE : in compare_values_warnv, at tree-vrp.c:1073

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #11 from jakub at gcc dot gnu dot org  2009-03-16 16:07 ---
Subject: Bug 39455

Author: jakub
Date: Mon Mar 16 16:07:07 2009
New Revision: 144885

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144885
Log:
PR tree-optimization/39455
* tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Fix types
mismatches for POINTER_TYPE_P (type).
(number_of_iterations_le): Likewise.

* gcc.dg/pr39455.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr39455.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-niter.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



[Bug tree-optimization/39455] [4.3 Regression] ICE : in compare_values_warnv, at tree-vrp.c:1073

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #12 from jakub at gcc dot gnu dot org  2009-03-16 16:27 ---
Fixed on the trunk so far.


-- 

jakub at gcc dot gnu dot org changed:

   What|Removed |Added

  Known to fail|4.3.3 4.4.0 |4.3.3
  Known to work|4.1.2 4.2.4 |4.1.2 4.2.4 4.4.0
Summary|[4.3/4.4 Regression] ICE :  |[4.3 Regression] ICE : in
   |in compare_values_warnv, at |compare_values_warnv, at
   |tree-vrp.c:1073 |tree-vrp.c:1073


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39455



[Bug target/39472] New: Add -mabi=[ms|sysv]

2009-03-16 Thread hjl dot tools at gmail dot com
UEFI uses MS x64 calling convention. It will be nice to
support -mabi=ms on Linux so that we can use gcc 4.4
to build UEFI applications on Linux.


-- 
   Summary: Add -mabi=[ms|sysv]
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hjl dot tools at gmail dot com
 GCC build triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39472



[Bug c/39375] asm with a =X output overwrites the output

2009-03-16 Thread balrogg at gmail dot com


--- Comment #4 from balrogg at gmail dot com  2009-03-16 16:53 ---
Reopening because
int params; __asm__ ("xxx" : "=X" (params));
and
int params[1]; __asm__ ("xxx" : "=X" (params[0]));
still produce different output in a way that is undocumented.


-- 

balrogg at gmail dot com changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|INVALID |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39375



[Bug c/39375] asm with a =X output overwrites the output

2009-03-16 Thread pinskia at gcc dot gnu dot org


--- Comment #5 from pinskia at gcc dot gnu dot org  2009-03-16 17:02 ---
(In reply to comment #4)
> Reopening because
> int params; __asm__ ("xxx" : "=X" (params));
> and
> int params[1]; __asm__ ("xxx" : "=X" (params[0]));
> still produce different output in a way that is undocumented.

How so?  "=X" (params[0]) says it can be in memory, which means params is
addressable.  This is documented, as "=X" really means "=rfm" (plus extra
constraints which don't correspond to r, f, or m).


-- 

pinskia at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39375



[Bug debug/39355] [4.4 Regression] Revision 144529 miscompiled libcpp/expr.c

2009-03-16 Thread dave at hiauly1 dot hia dot nrc dot ca


--- Comment #26 from dave at hiauly1 dot hia dot nrc dot ca  2009-03-16 
17:20 ---
Subject: Re:  [4.4 Regression] Revision 144529 miscompiled libcpp/expr.c

> Since revision 144529:
>
> http://gcc.gnu.org/ml/gcc-patches/2009-03/msg0.html
>
> is the cause and it is inline related, I suggest you use revision
> 144529 as base and revert the tree-inline.c change to see if it fixes
> libcpp/expr.c.

The regressions don't occur with revision 144874 if I replace
tree-inline.c with the version from revision 144528.

http://gcc.gnu.org/ml/gcc-testresults/2009-03/msg01655.html

144529 significantly changed the amount of inlining.  Thus, it's
very difficult to determine the location of the miscompilation in
expr.o by comparing the difference in code between 144528 and 144529.
It's also my impression that the miscompilation has moved in subsequent
revisions.

The miscompilation is related to the generation of dwarf2 debug
information as it doesn't appear with hpux.

While it may be that the changes to tree-inline.c are not directly
responsible for the regressions, they are definitely a contributing
factor.  I note that Jan does have an account on a hppa linux machine,
gsyprf11.external.hp.com.

Probably, I should rebuild 144529 and try Jakub's suggestions.

Dave


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39355



[Bug debug/39355] [4.4 Regression] Revision 144529 miscompiled libcpp/expr.c

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #27 from hjl dot tools at gmail dot com  2009-03-16 17:26 
---
(In reply to comment #26)
 
> Probably, I should rebuild 144529 and try Jakub's suggestions.

You need the fix for PR 39345 on top of revision 144529.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39355



[Bug target/39473] New: Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com
untyped_call in i386.md has

  ix86_expand_call ((TARGET_FLOAT_RETURNS_IN_80387
 ? gen_rtx_REG (XCmode, FIRST_FLOAT_REG) : NULL),
operands[0], const0_rtx,
GEN_INT ((DEFAULT_ABI == SYSV_ABI ? X86_64_SSE_REGPARM_MAX
  : X64_SSE_REGPARM_MAX)
 - 1),
NULL, 0);


It doesn't look right for 32bit. Shouldn't it be GEN_INT (SSE_REGPARM_MAX)
instead?


-- 
   Summary: Typo in untyped_call in i386.md
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hjl dot tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug target/39473] Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #1 from hjl dot tools at gmail dot com  2009-03-16 18:26 ---
Also

void
ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
  rtx callarg2,
  rtx pop, int sibcall)
{
  rtx use = NULL, call; 
  enum calling_abi function_call_abi;

  if (callarg2 && INTVAL (callarg2) == -2)
function_call_abi = MS_ABI;
  else  
function_call_abi = SYSV_ABI;

doesn't look right either. Where does -2 come from? Shouldn't it check
TARGET_64BIT?


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

 CC||ktietz at onevision dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug target/39473] Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #2 from hjl dot tools at gmail dot com  2009-03-16 18:40 ---
(In reply to comment #1)
> Also
>
> void
> ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
>                   rtx callarg2,
>                   rtx pop, int sibcall)
> {
>   rtx use = NULL, call;
>   enum calling_abi function_call_abi;
>
>   if (callarg2 && INTVAL (callarg2) == -2)
>     function_call_abi = MS_ABI;
>   else
>     function_call_abi = SYSV_ABI;
>
> doesn't look right either. Where does -2 come from? Shouldn't it check
> TARGET_64BIT?
 

This was added by revision 142859:

http://gcc.gnu.org/ml/gcc-cvs/2008-12/msg00559.html


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

 CC||jh at suse dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug target/39473] Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #3 from hjl dot tools at gmail dot com  2009-03-16 18:47 ---
(In reply to comment #0)
> untyped_call in i386.md has
>
>   ix86_expand_call ((TARGET_FLOAT_RETURNS_IN_80387
>                      ? gen_rtx_REG (XCmode, FIRST_FLOAT_REG) : NULL),
>                     operands[0], const0_rtx,
>                     GEN_INT ((DEFAULT_ABI == SYSV_ABI ? X86_64_SSE_REGPARM_MAX
>                                                       : X64_SSE_REGPARM_MAX)
>                              - 1),
>                     NULL, 0);
>
>
> It doesn't look right for 32bit. Shouldn't it be GEN_INT (SSE_REGPARM_MAX)
> instead?
 

This was changed by revision 136311:

http://gcc.gnu.org/ml/gcc-cvs/2008-06/msg00067.html

Those changes:

@@ -1953,9 +1972,22 @@
is also used as the pic register in ELF.  So for now, don't allow more than
3 registers to be passed in registers.  */

-#define REGPARM_MAX (TARGET_64BIT ? 6 : 3)
-
-#define SSE_REGPARM_MAX (TARGET_64BIT ? 8 : (TARGET_SSE ? 3 : 0))
+/* Abi specific values for REGPARM_MAX and SSE_REGPARM_MAX */
+#define X86_64_REGPARM_MAX 6
+#define X64_REGPARM_MAX 4
+#define X86_32_REGPARM_MAX 3
+
+#define X86_64_SSE_REGPARM_MAX 8
+#define X64_SSE_REGPARM_MAX 4
+#define X86_32_SSE_REGPARM_MAX (TARGET_SSE ? 3 : 0)
+
+#define REGPARM_MAX (TARGET_64BIT ? (TARGET_64BIT_MS_ABI ? X64_REGPARM_MAX \
+                                                         : X86_64_REGPARM_MAX) \
+                                  : X86_32_REGPARM_MAX)
+
+#define SSE_REGPARM_MAX (TARGET_64BIT ? (TARGET_64BIT_MS_ABI ? X64_SSE_REGPARM_MAX \
+                                                             : X86_64_SSE_REGPARM_MAX) \
+                                      : X86_32_SSE_REGPARM_MAX)

weren't properly mentioned in ChangeLog.
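
To see what the new macros evaluate to in each configuration, here is a
small self-contained C model. It is a sketch, not the i386.h code: the
target flags become plain function parameters, and the function names
(regparm_max, sse_regparm_max, untyped_call_arg) are made up for the
example. Only the numeric limits are copied from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* ABI-specific limits, as in the quoted hunk. */
#define X86_64_REGPARM_MAX      6
#define X64_REGPARM_MAX         4
#define X86_32_REGPARM_MAX      3
#define X86_64_SSE_REGPARM_MAX  8
#define X64_SSE_REGPARM_MAX     4

/* Model of REGPARM_MAX.  Assumption of this model: the MS ABI flag is
   only meaningful when targeting 64-bit. */
int regparm_max(bool target_64bit, bool ms_abi)
{
    assert(target_64bit || !ms_abi);
    if (!target_64bit)
        return X86_32_REGPARM_MAX;
    return ms_abi ? X64_REGPARM_MAX : X86_64_REGPARM_MAX;
}

/* Model of SSE_REGPARM_MAX; the 32-bit value depends on TARGET_SSE. */
int sse_regparm_max(bool target_64bit, bool ms_abi, bool target_sse)
{
    assert(target_64bit || !ms_abi);
    if (!target_64bit)
        return target_sse ? 3 : 0;  /* X86_32_SSE_REGPARM_MAX */
    return ms_abi ? X64_SSE_REGPARM_MAX : X86_64_SSE_REGPARM_MAX;
}

/* Model of the expression passed to GEN_INT in untyped_call.  Note
   that it never consults the 32-bit limit. */
int untyped_call_arg(bool default_abi_is_sysv)
{
    return (default_abi_is_sysv ? X86_64_SSE_REGPARM_MAX
                                : X64_SSE_REGPARM_MAX) - 1;
}
```

In this model untyped_call_arg can only yield 7 or 3, whichever target is
selected, while sse_regparm_max yields 3 or 0 for 32-bit. That gap is the
reason the untyped_call expression looks suspicious for 32-bit.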


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug debug/39474] New: DW_AT_location missing for unused variables even at -O0

2009-03-16 Thread jan dot kratochvil at redhat dot com
It is a regression since gcc-4.3, but it was found only in an artificial (GDB)
testcase.  Also, at -O2 such behavior is even expected.

The variable is considered optimized out, which should not happen at -O0.

Testcase:
--
int
main (void)
{
  int var;

  return 0;
}
--

gcc -Wall -g

WRONG gcc-4.4:
 <c>   DW_AT_producer: (indirect string, offset: 0xb): GNU C 4.4.0 20090315 (experimental)
 <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2f>   DW_AT_name: (indirect string, offset: 0x39): main
 <2><52>: Abbrev Number: 3 (DW_TAG_variable)
    <53>   DW_AT_name: var
    <59>   DW_AT_type: 0x5e

Correct gcc-4.3:
 <c>   DW_AT_producer: (indirect string, offset: 0xf): GNU C 4.3.2 20081105 (Red Hat 4.3.2-7)
 <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2f>   DW_AT_name: (indirect string, offset: 0x36): main
 <2><52>: Abbrev Number: 3 (DW_TAG_variable)
    <53>   DW_AT_name: var
    <59>   DW_AT_type: 0x61
    <5d>   DW_AT_location: 2 byte block: 91 6c  (DW_OP_fbreg: -20)


-- 
   Summary: DW_AT_location missing for unused variables even at -O0
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: debug
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jan dot kratochvil at redhat dot com
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39474



[Bug c++/39475] New: c++0x type-traits should error out in case of incompleteness

2009-03-16 Thread d dot frey at gmx dot de
The current implementation returns misleading results if used the wrong way. A
simple example is:

#include <iostream>
struct X;
int main()
{
  std::cout << __is_abstract(X) << std::endl;
}

compiles and prints 0. Things get worse when templates are involved. PR
libstdc++/39405 shows why this can be a real problem. I attach the example code
from 39405 to this PR again.


-- 
   Summary: c++0x type-traits should error out in case of
incompleteness
   Product: gcc
   Version: 4.3.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: d dot frey at gmx dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39475



[Bug c++/39475] c++0x type-traits should error out in case of incompleteness

2009-03-16 Thread d dot frey at gmx dot de


--- Comment #1 from d dot frey at gmx dot de  2009-03-16 19:05 ---
Created an attachment (id=17468)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17468&action=view)
show inconsistency for is_abstract


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39475



[Bug rtl-optimization/30688] Branch registers loaded too late on ia64

2009-03-16 Thread wilson at codesourcery dot com


--- Comment #6 from wilson at codesourcery dot com  2009-03-16 19:07 ---
Subject: Re:  Branch registers loaded too late
 on ia64

steven at gcc dot gnu dot org wrote:
> --- Comment #5 from steven at gcc dot gnu dot org  2009-03-16 08:46 ---
> Can someone point me to the IA64 optimization manuals mentioned in comment #0?
  

You can find manuals on the Intel web site.  You want the Intel Itanium 
2 Processor Reference Manual (For Software Development and 
Optimization).  Chapter 7 talks about branch instructions.

> * Which branch registers can I use?

Any one of the 8 special branch registers, class BR_REGS.

> * What does "as early as possible" mean in comment #0?

The manual says there should be several cycles between the branch
register write and the branch for correct prediction.  There is probably
no "too early" to worry about, as long as you don't use more than the
available 8 registers.  You want to avoid reloads here.  Some of the
regs are call clobbered, some are preserved, and probably some are
reserved for call/return.  I don't recall all of the ABI details.  You
can look them up in the manuals.  See the Itanium Software Conventions
and Runtime Architecture Guide.

> * What happens if a value is assigned to a branch register on IA64?  Is the
> prefetcher always triggered?  What is the latency of the prefetching after a
> branch register has been assigned a value?

This is complicated.  I suggest downloading the docs and reading them.

> * Is there a possibility to add a prediction hint to say "branch register A is
> more likely to be used than branch register B" when multiple branch registers
> are assigned a value in the same basic block?

There is separate predication support for each branch register, but I
assume this is about priority for prefetching?  Yes, there are branch
hints for that.  See the Itanium Architecture Software Developer's
Manual, Volume 1; section 4.5 covers branch instructions.  There is a
"few" completer for prefetching a few lines, and a "many" completer for
prefetching many lines.  ia64.md uses "many" for call and return.

Jim


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688



[Bug libstdc++/39405] [4.3 regression] std::shared_ptr barfs on incomplete template class that boost::shared_ptr accepts

2009-03-16 Thread d dot frey at gmx dot de


--- Comment #27 from d dot frey at gmx dot de  2009-03-16 19:08 ---
Thanks Paolo. I've opened PR c++/39475 for the type traits intrinsics.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39405



[Bug c++/39475] c++0x type-traits should error out in case of incompleteness

2009-03-16 Thread paolo dot carlini at oracle dot com


--- Comment #2 from paolo dot carlini at oracle dot com  2009-03-16 19:20 
---
Indeed, ICC errors out.


-- 

paolo dot carlini at oracle dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |paolo dot carlini at oracle
   |dot org |dot com
 Status|UNCONFIRMED |ASSIGNED
 Ever Confirmed|0   |1
   Last reconfirmed|0000-00-00 00:00:00 |2009-03-16 19:20:13
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39475



[Bug target/39473] Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #4 from hjl dot tools at gmail dot com  2009-03-16 19:21 ---
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00749.html


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

URL||http://gcc.gnu.org/ml/gcc-
   ||patches/2009-
   ||03/msg00749.html


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug testsuite/37628] gcc.c-torture/execute/pr35456.c is not generic

2009-03-16 Thread janis at gcc dot gnu dot org


--- Comment #1 from janis at gcc dot gnu dot org  2009-03-16 19:58 ---
Subject: Bug 37628

Author: janis
Date: Mon Mar 16 19:58:32 2009
New Revision: 144890

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144890
Log:
PR testsuite/37628
* gcc.c-torture/execute/pr35456.x: New, skip for vax.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr35456.x
Modified:
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37628



[Bug testsuite/37630] gcc.dg/20001012-1.c depends on IEEE FP encoding

2009-03-16 Thread janis at gcc dot gnu dot org


--- Comment #2 from janis at gcc dot gnu dot org  2009-03-16 19:59 ---
Subject: Bug 37630

Author: janis
Date: Mon Mar 16 19:59:37 2009
New Revision: 144891

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144891
Log:
PR testsuite/37630
* lib/target-supports.exp (check_effective_target_ieee): New.
* gcc.c-torture/execute/ieee/ieee.exp: Use it.
* gcc.dg/20001012-1.c: Require ieee.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.c-torture/execute/ieee/ieee.exp
trunk/gcc/testsuite/gcc.dg/20001012-1.c
trunk/gcc/testsuite/lib/target-supports.exp


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37630



[Bug testsuite/37960] FAIL: gcc.dg/pr11492.c (test for bogus messages, line 8)

2009-03-16 Thread janis at gcc dot gnu dot org


--- Comment #10 from janis at gcc dot gnu dot org  2009-03-16 20:01 ---
Subject: Bug 37960

Author: janis
Date: Mon Mar 16 20:01:15 2009
New Revision: 144892

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144892
Log:
PR testsuite/37960
* gcc.dg/pr11492.c: Replace constant and remove xfail.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/pr11492.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37960



[Bug debug/39471] DW_TAG_imported_module should be used (not DW_TAG_imported_declaration)

2009-03-16 Thread jakub at gcc dot gnu dot org


--- Comment #1 from jakub at gcc dot gnu dot org  2009-03-16 20:55 ---
Created an attachment (id=17469)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17469&action=view)
gcc44-pr39471.patch

Untested patch.  Dodji, any reason why you started emitting
DW_TAG_imported_declaration for this instead of DW_TAG_imported_module?

Also, looking at the http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37410#c6
comment, I'm wondering about the "C++ doesn't allow that usage" part in the
comment.  Isn't:

namespace A
{
  int i = 1;
  int j = 2;
}

namespace B
{
  int k = 3;
}

int k = 13;

int
main ()
{
  using namespace A;
  i++;
  j++;
  k++;
  {
    using B::k;
    k++;
  }
  return 0;
}

a testcase which needs IMPORTED_DECL with non-NAMESPACE_DECL
IMPORTED_DECL_ASSOCIATED_DECL?


-- 

jakub at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |jakub at gcc dot gnu dot org
   |dot org |
 Status|UNCONFIRMED |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39471



[Bug target/39473] Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #5 from hjl dot tools at gmail dot com  2009-03-16 21:00 ---
An updated patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00754.html


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

URL|http://gcc.gnu.org/ml/gcc-  |http://gcc.gnu.org/ml/gcc-
   |patches/2009-   |patches/2009-
   |03/msg00749.html|03/msg00754.html


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug testsuite/37630] gcc.dg/20001012-1.c depends on IEEE FP encoding

2009-03-16 Thread janis at gcc dot gnu dot org


--- Comment #3 from janis at gcc dot gnu dot org  2009-03-16 21:12 ---
Subject: Bug 37630

Author: janis
Date: Mon Mar 16 21:11:57 2009
New Revision: 144893

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144893
Log:
Revert patch for PR testsuite/37630.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.c-torture/execute/ieee/ieee.exp
trunk/gcc/testsuite/gcc.dg/20001012-1.c
trunk/gcc/testsuite/lib/target-supports.exp


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37630



[Bug target/39291] _Unwind_Backtrace fails.

2009-03-16 Thread pluto at agmk dot net


--- Comment #6 from pluto at agmk dot net  2009-03-16 21:24 ---
I've tested u-dw2.exe on wine and got more info.

$ ./u-dw2.exe
err:process:start_wineboot failed to start wineboot, err 2
err:process:__wine_kernel_init boot event wait timed out
fixme:msvcrt:__lconv_init  stub
foo:enter
bar:enter
zoo:enter
boom!
signalHandler:enter
lookupSymbol: 00401887
lookupSymbol: 0040166A
signalHandler:longjmp
err:seh:raise_exception Unhandled exception code c096 flags 0 addr 0x409461

$ i486-pc-mingw32-objdump -hw u-dw2.exe

u-dw2.exe: file format pei-i386

Sections:
Idx Name  Size  VMA   LMA   File off  Algn  Flags
  0 .text 54c4  00401000  00401000  0400  2**4  CONTENTS, ALLOC, LOAD, READONLY, CODE, DATA
  1 .data 0030  00407000  00407000  5a00  2**2  CONTENTS, ALLOC, LOAD, DATA
  2 .rdata0c28  00408000  00408000  5c00  2**2  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .bss  0538  00409000  00409000    2**5  ALLOC
  4 .idata0558  0040a000  0040a000  6a00  2**2  CONTENTS, ALLOC, LOAD, DATA
  5 .CRT  0034  0040b000  0040b000  7000  2**2  CONTENTS, ALLOC, LOAD, DATA
  6 .tls  0008  0040c000  0040c000  7200  2**2  CONTENTS, ALLOC, LOAD, DATA

the 0xc096 code means 'EXCEPTION_PRIV_INSTRUCTION'
and the 0x409461 points to the .bss section.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39291



[Bug debug/39471] DW_TAG_imported_module should be used (not DW_TAG_imported_declaration)

2009-03-16 Thread jan dot kratochvil at redhat dot com


--- Comment #2 from jan dot kratochvil at redhat dot com  2009-03-16 21:37 
---
Thanks, although there is still an excessive DW_AT_name:
 <3><422>: Abbrev Number: 12 (DW_TAG_imported_module)
<425>   DW_AT_name: A
<427>   DW_AT_import  : <0x113> [Abbrev Number: 2 (DW_TAG_namespace)]

DW_AT_name looks undefined to me for DW_TAG_imported_module, and it
certainly breaks the current Archer C++ implementation.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39471



[Bug target/39476] New: Typo in ix86_function_regparm in i386.c

2009-03-16 Thread hjl dot tools at gmail dot com
ix86_function_regparm in i386.c has

  if (TARGET_64BIT)
    {
      if (ix86_function_type_abi (type) == ix86_abi)
        return regparm;
      return ix86_abi != SYSV_ABI ? X86_64_REGPARM_MAX : X64_REGPARM_MAX;
    }

Shouldn't it be

return ix86_abi == SYSV_ABI ? X86_64_REGPARM_MAX : X64_REGPARM_MAX;


-- 
   Summary: Typo in ix86_function_regparm in i386.c
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hjl dot tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39476



[Bug target/39476] Typo in ix86_function_regparm in i386.c

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #1 from hjl dot tools at gmail dot com  2009-03-16 21:59 ---
It is

  if (TARGET_64BIT)
    {
      if (ix86_function_type_abi (type) == DEFAULT_ABI)
        return regparm;
      return DEFAULT_ABI != SYSV_ABI ? X86_64_REGPARM_MAX : X64_REGPARM_MAX;
    }

Shouldn't it be

return DEFAULT_ABI == SYSV_ABI ? X86_64_REGPARM_MAX : X64_REGPARM_MAX;


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

 CC||ktietz at onevision dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39476



[Bug bootstrap/39470] [melt] - lrand48_r() and srand48_r() are GNU extensions and are not portable

2009-03-16 Thread rob1weld at aol dot com


--- Comment #2 from rob1weld at aol dot com  2009-03-16 22:08 ---
My next difficulty (on OpenSolaris) is the lack of a fopencookie()
function (and the related support in FILE). I'm now building melt
on i686-pc-linux-gnu and running into a few other errors; thus melt
does need some fixing, even on a Linux Operating System.

Rob


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39470



[Bug target/39476] Typo in ix86_function_regparm in i386.c

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #2 from hjl dot tools at gmail dot com  2009-03-16 22:09 ---
We never change regparm for 64bit. Does this patch

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 144817)
+++ gcc/config/i386/i386.c  (working copy)
@@ -4273,17 +4273,15 @@ static int
 ix86_function_regparm (const_tree type, const_tree decl)
 {
   tree attr;
-  int regparm = ix86_regparm;
+  int regparm;

   static bool error_issued;

   if (TARGET_64BIT)
-    {
-      if (ix86_function_type_abi (type) == DEFAULT_ABI)
-        return regparm;
-      return DEFAULT_ABI != SYSV_ABI ? X86_64_REGPARM_MAX : X64_REGPARM_MAX;
-    }
+    return (ix86_function_type_abi (type) == SYSV_ABI
+            ? X86_64_REGPARM_MAX : X64_REGPARM_MAX);

+  regparm = ix86_regparm;
   attr = lookup_attribute ("regparm", TYPE_ATTRIBUTES (type));
   if (attr)
     {

look OK?
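
As an illustration only, the 64-bit selection proposed in the patch can be
exercised in isolation with the standalone C sketch below.  The enum
abi_kind, the function regparm_max_64, and the limit values 6 and 4 are
assumptions made for this sketch (they mirror the integer argument register
counts of the SysV and Microsoft x64 calling conventions); none of it is
GCC's actual code.

```c
/* Hypothetical stand-ins for GCC's internal ABI tags.  */
enum abi_kind { SYSV_ABI, MS_ABI };

/* Assumed limits: 6 integer argument registers for SysV, 4 for MS x64.  */
#define X86_64_REGPARM_MAX 6
#define X64_REGPARM_MAX 4

/* Pick the register-parameter limit from the ABI of the function type
   itself, independent of whatever the default ABI happens to be.  */
static int
regparm_max_64 (enum abi_kind fn_abi)
{
  return fn_abi == SYSV_ABI ? X86_64_REGPARM_MAX : X64_REGPARM_MAX;
}
```

With this form the limit depends only on the ABI of the function type being
queried, so a SysV function gets 6 and an MS function gets 4 regardless of
the compiler's default.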


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

 CC||jh at suse dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39476



[Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange

2009-03-16 Thread il dot basso dot buffo at gmail dot com


--- Comment #3 from il dot basso dot buffo at gmail dot com  2009-03-16 
22:21 ---
Here's a further reduction:

struct Point
  {
  int line, col;

  Point( int l = -1, int c = 0 ) throw() : line( l ), col( c ) {}
  bool operator==( const Point & p ) const throw()
    { return ( line == p.line && col == p.col ); }
  bool operator<( const Point & p ) const throw()
    { return ( line < p.line || ( line == p.line && col < p.col ) ); }
  };

class Basic_buffer
  {

protected:
  void add_line( const char * const buf, const int len );

public:
  Basic_buffer( const Basic_buffer & b, const Point & p1, const Point & p2 );

  int characters( const int line ) const throw();

  int pgetc( Point & p ) const throw();
  Point eof() const throw() { return Point( 0, 0 ); }

  bool pisvalid( const Point & p ) const throw()
    { return ( ( p.col >= 0 && p.col < characters( p.line ) ) || p == eof() ); }
  };


class Buffer : public Basic_buffer
  {
public:
  bool save( Point p1 = Point(), Point p2 = Point() ) const;
  };

bool Buffer::save( Point p1, Point p2 ) const
  {
  if( !this->pisvalid( p1 ) ) p1 = eof();
  if( !this->pisvalid( p2 ) ) p2 = eof();
  for( Point p = p1; p < p2; ) { pgetc( p ); }
  return true;
  }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39447



[Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange

2009-03-16 Thread il dot basso dot buffo at gmail dot com


--- Comment #4 from il dot basso dot buffo at gmail dot com  2009-03-16 
22:24 ---
Bah, here's an even smaller example:

struct Point
  {
  int line, col;

  Point( int l = -1, int c = 0 ) throw() : line( l ), col( c ) {}
  bool operator==( const Point & p ) const throw()
    { return ( line == p.line && col == p.col ); }
  bool operator<( const Point & p ) const throw()
    { return ( line < p.line || ( line == p.line && col < p.col ) ); }
  };

class Buffer
  {
public:
  int characters( const int line ) const throw();
  int pgetc( Point & p ) const throw();
  Point eof() const throw() { return Point( 0, 0 ); }
  bool pisvalid( const Point & p ) const throw()
    { return ( ( p.col >= 0 && p.col < characters( p.line ) ) || p == eof() ); }
  bool save( Point p1 = Point(), Point p2 = Point() ) const;
  };

bool Buffer::save( Point p1, Point p2 ) const
  {
  if( !this->pisvalid( p1 ) ) p1 = eof();
  if( !this->pisvalid( p2 ) ) p2 = eof();
  for( Point p = p1; p < p2; ) { pgetc( p ); }
  return true;
  }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39447



Re: [Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange

2009-03-16 Thread Sebastian Pop
Thanks for the reduced testcase, it had completely dropped off my radar
(by now my delta script should have finished reducing it as well on
the gcc-farm, but I won't even look at it).

Thanks again for the reduced case.  I will look at the bug now.

Sebastian


[Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange

2009-03-16 Thread sebpop at gmail dot com


--- Comment #5 from sebpop at gmail dot com  2009-03-16 22:34 ---
Subject: Re:  ICE in create_data_ref with -O1 
-floop-interchange

Thanks for the reduced testcase, it had completely dropped off my radar
(by now my delta script should have finished reducing it as well on
the gcc-farm, but I won't even look at it).

Thanks again for the reduced case.  I will look at the bug now.

Sebastian


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39447



[Bug debug/39474] DW_AT_location missing for unused variables even at -O0

2009-03-16 Thread rguenth at gcc dot gnu dot org


--- Comment #1 from rguenth at gcc dot gnu dot org  2009-03-16 22:59 ---
Well, it doesn't even have a value assigned.  So I consider this a valid
optimization for -O0.  Does the variable have a location once you initialize
it?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39474



[Bug c++/39429] compiler create bad asm codes.

2009-03-16 Thread rearnsha at gcc dot gnu dot org


--- Comment #2 from rearnsha at gcc dot gnu dot org  2009-03-16 22:53 
---
Confirmed.  This is a bug in the arith_adjacent_mem pattern that only triggers
when the offset to the memory from the base pointer exceeds the range of a
simple add instruction (i.e. more than 1024 bytes).  In that case we fall back to
emitting two ldr instructions, but fail to consider the case when the first
load overwrites the base address.
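
The failure mode described above can be modelled outside the compiler.  The
following C sketch is a toy register machine, not GCC's ARM backend: cpu_t,
ldr, load_adjacent_naive and load_adjacent are hypothetical names, and
offsets are counted in words for simplicity.  It shows how loading the first
word into the base register corrupts the address used by the second load,
and how choosing the load order by destination avoids that.

```c
#include <stdint.h>

/* Toy machine state: 16 registers and a word-addressed memory.  */
typedef struct
{
  uint32_t reg[16];
  const uint32_t *mem;
} cpu_t;

/* Emulates "ldr rd, [rb, #off]" with OFF counted in words.  */
static void
ldr (cpu_t *c, int rd, int rb, uint32_t off)
{
  c->reg[rd] = c->mem[c->reg[rb] + off];
}

/* Always loads D0 first; wrong when D0 is also the base register,
   because the second load then uses the freshly loaded value as its
   address.  */
static void
load_adjacent_naive (cpu_t *c, int d0, int d1, int base, uint32_t off)
{
  ldr (c, d0, base, off);
  ldr (c, d1, base, off + 1);
}

/* Loads two adjacent words, deferring whichever load would overwrite
   the base register until last.  */
static void
load_adjacent (cpu_t *c, int d0, int d1, int base, uint32_t off)
{
  if (d0 == base)
    {
      ldr (c, d1, base, off + 1);
      ldr (c, d0, base, off);
    }
  else
    {
      ldr (c, d0, base, off);
      ldr (c, d1, base, off + 1);
    }
}
```

The sketch only covers the ordering issue; it says nothing about how the
real pattern should be changed in arm.md.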


-- 

rearnsha at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu dot
   ||org, ramana dot r at gmail
   ||dot com
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|0000-00-00 00:00:00 |2009-03-16 22:53:12
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39429



[Bug target/39429] compiler create bad asm codes.

2009-03-16 Thread rearnsha at gcc dot gnu dot org


-- 

rearnsha at gcc dot gnu dot org changed:

   What|Removed |Added

   Keywords||wrong-code
   Priority|P3  |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39429



Re: [Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange

2009-03-16 Thread Sebastian Pop
Hi,

I don't know who coded the overly complicated exclude_component_ref.
In the graphite branch we already cleaned up all this code, but in
trunk we still have it.

Attached is a patch that fixes the problem by looking at whether the
operand contains COMPONENT_REFs before calling the data reference
analysis.

I'm testing the patch on the gcc farm, and will send it to the gcc-patches
once it finishes regstrap.

Sebastian

	* graphite.c (exclude_component_ref): Renamed contains_component_ref_p.
	(is_simple_operand): Call contains_component_ref_p before calling data
	reference analysis that would fail on COMPONENT_REFs.

Index: graphite.c
===
--- graphite.c	(revision 144893)
+++ graphite.c	(working copy)
@@ -1062,27 +1062,20 @@ loop_affine_expr (basic_block scop_entry
is component_ref.  */
 
 static bool
-exclude_component_ref (tree op) 
+contains_component_ref_p (tree op) 
 {
   int i;
-  int len;
 
-  if (op)
-{
-  if (TREE_CODE (op) == COMPONENT_REF)
-	return false;
-  else
-	{
-	  len = TREE_OPERAND_LENGTH (op);	  
-	  for (i = 0; i < len; ++i)
-	{
-	  if (!exclude_component_ref (TREE_OPERAND (op, i)))
-		return false;
-	}
-	}
-}
+  if (!op)
+return false;
 
-  return true;
+  if (TREE_CODE (op) == COMPONENT_REF)
+return true;
+
+  for (i = 0; i < TREE_OPERAND_LENGTH (op); i++)
+return contains_component_ref_p (TREE_OPERAND (op, i));
+
+  return false;
 }
 
 /* Return true if the operand OP is simple.  */
@@ -1094,13 +1087,15 @@ is_simple_operand (loop_p loop, gimple s
   if (DECL_P (op)
   /* or a structure,  */
   || AGGREGATE_TYPE_P (TREE_TYPE (op))
+  /* or a COMPONENT_REF,  */
+  || contains_component_ref_p (op)
   /* or a memory access that cannot be analyzed by the data
 	 reference analysis.  */
   || ((handled_component_p (op) || INDIRECT_REF_P (op))
 	  && !stmt_simple_memref_p (loop, stmt, op)))
 return false;
 
-  return exclude_component_ref (op);
+  return true;
 }
 
 /* Return true only when STMT is simple enough for being handled by
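
Note that the loop in contains_component_ref_p, as posted, returns the
result for the first operand and so never inspects the remaining ones.  A
minimal sketch of a search that visits every operand is given below.  It
uses a toy tree type rather than GCC's tree: struct node, enum node_code
and has_component_ref are hypothetical names introduced only for this
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds; only COMPONENT_REF matters to the search.  */
enum node_code { COMPONENT_REF, PLUS_EXPR, VAR_DECL };

/* Toy expression tree with at most two operands per node.  */
struct node
{
  enum node_code code;
  size_t n_ops;
  struct node *ops[2];
};

/* Return true if OP or any operand reachable from it is a
   COMPONENT_REF.  Every operand is visited until one matches.  */
static bool
has_component_ref (const struct node *op)
{
  size_t i;

  if (!op)
    return false;

  if (op->code == COMPONENT_REF)
    return true;

  for (i = 0; i < op->n_ops; i++)
    if (has_component_ref (op->ops[i]))
      return true;

  return false;
}
```

The only difference from the posted loop is that the recursive result is
tested and the loop continues when it is false.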


[Bug middle-end/39447] ICE in create_data_ref with -O1 -floop-interchange

2009-03-16 Thread sebpop at gmail dot com


--- Comment #6 from sebpop at gmail dot com  2009-03-16 23:18 ---
Subject: Re:  ICE in create_data_ref with -O1 
-floop-interchange

Hi,

I don't know who coded the overly complicated exclude_component_ref.
In the graphite branch we already cleaned up all this code, but in
trunk we still have it.

Attached is a patch that fixes the problem by looking at whether the
operand contains COMPONENT_REFs before calling the data reference
analysis.

I'm testing the patch on the gcc farm, and will send it to the gcc-patches
once it finishes regstrap.

Sebastian


--- Comment #7 from sebpop at gmail dot com  2009-03-16 23:18 ---
Created an attachment (id=17470)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17470&action=view)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39447



[Bug inline-asm/38815] Taking the address of a __thread variable prevents the r0 register from being loaded

2009-03-16 Thread rearnsha at gcc dot gnu dot org


--- Comment #2 from rearnsha at gcc dot gnu dot org  2009-03-16 23:27 
---
I believe this is a bug in the way we expand local reg vars.  The manual says:

Local register variables in specific registers do not reserve the
registers, except at the point where they are used as input or output
operands in an @code{asm} statement and the @code{asm} statement itself is
not deleted.  The compiler's data flow analysis is capable of determining
where the specified registers contain live values, and where they are
available for other uses.

There are two key points to note in the above: 1) The only point at which a
register variable *has* to be in the named register is when an inline ASM
appears.  2) Data flow is supposed to know when the value is live.  I thus
believe we need to expand local vars as used in this test-case by copying a
pseudo reg that contains the real value into the required register immediately
before its use in an ASM -- and to leave optimizing this code path to the
register allocator -- so that ideally no copy is necessary.

In the test-case cited, the user assigns the variable r0 with a value and then
tries to assign another value to the variable r1.  The second step requires a
libcall sequence that clobbers the value previously stored into r0 -- to avoid
this happening the value previously assigned must be copied to a call-saved
register (or the assignment deferred until after the libcall).


-- 

rearnsha at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu dot
   ||org, ramana dot r at gmail
   ||dot com
 Status|UNCONFIRMED |NEW
  Component|target  |inline-asm
 Ever Confirmed|0   |1
   Keywords||wrong-code
   Last reconfirmed|0000-00-00 00:00:00 |2009-03-16 23:27:22
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38815



[Bug middle-end/38674] When storing in a register the address of a value contained in the same register, gcc 4.3.2 on ARM clobbers the register before saving its content on the stack.

2009-03-16 Thread rearnsha at gcc dot gnu dot org


--- Comment #2 from rearnsha at gcc dot gnu dot org  2009-03-16 23:38 
---
Confirmed.  We need a way to represent an early-clobber between a register and
a memory-address with side-effects.


-- 

rearnsha at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu dot
   ||org, ramana dot r at gmail
   ||dot com
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|0000-00-00 00:00:00 |2009-03-16 23:38:45
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38674



[Bug middle-end/38674] When storing in a register the address of a value contained in the same register, gcc 4.3.2 on ARM clobbers the register before saving its content on the stack.

2009-03-16 Thread rearnsha at gcc dot gnu dot org


-- 

rearnsha at gcc dot gnu dot org changed:

   What|Removed |Added

   Keywords||wrong-code
   Priority|P3  |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38674



[Bug c++/39475] c++0x type-traits should error out in case of incompleteness

2009-03-16 Thread d dot frey at gmx dot de


--- Comment #3 from d dot frey at gmx dot de  2009-03-16 23:49 ---
One more thought on the diagnostics: There are two cases: Incomplete types
(like in the initial example in the description of this PR) and recursive
template instantiations (see attachment). I think the latter produces a
diagnostic which suggests it is the former. This problem not only affects
C++0x, it also happens for normal C++:

f...@viasko:~/work/test/recursive_instantiation$ cat t.cc
template< typename T >
struct foo
{
  typename T::type dummy();
};

template< typename T >
struct bar
{
  typedef void type;
  foo< bar > p;
};

foo< bar< int > > x;
f...@viasko:~/work/test/recursive_instantiation$ g++ t.cc
t.cc: In instantiation of 'bar<int>':
t.cc:4:   instantiated from 'foo<bar<int> >'
t.cc:14:   instantiated from here
t.cc:11: error: 'bar<T>::p' has incomplete type
t.cc:3: error: declaration of 'struct foo<bar<int> >'
f...@viasko:~/work/test/recursive_instantiation$ 

g++ is Ubuntu's GCC 4.3.2. The error message says bar<T>::p is incomplete, but
there is no hint why this is the case. Would it be possible to generally
improve this type of diagnostic? Should I open yet another PR or is that not
possible/worth it/...?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39475



[Bug libobjc/39465] libobjc does not find classes of DLLs

2009-03-16 Thread ayers at gcc dot gnu dot org


--- Comment #6 from ayers at gcc dot gnu dot org  2009-03-16 23:51 ---
I've played a bit with creating a trivial static library and linking it into a
dynamic library and into an executable.  After tweaking back and forth it seems
that at least on GNU/Linux the static version linked into the executable
actually replaces the version that was linked into the dynamic library... not
sure what would happen if the version linked in last doesn't satisfy all the
requirements needed by the dynamic library.

All very intriguing, yet I believe it has nothing to do with your issue.

Since I wasn't able to get a cross tool chain running (and www.mingw.org
doesn't seem to support that with the current gcc versions) I went ahead and
updated an old Windows VM, installed all kinds of updates... and then installed
MinGW/MSYS natively.  First I reproduced your issue successfully and then went
about installing GNUstep.  Note that GNUstep's MinGW HOWTO explicitly states:

It's a good idea to remove the libobjc.a and libobjc.la and
include/objc headers that come with gcc (gcc -v for location) so that
they are not accidentally found instead of the libobjc DLL that you
will compile below.  ...

After installing the GNUstep packages, I was able to build and execute
applications.  Now GNUstep uses its own build environment (gnustep-make) to
hide all the fancy stuff that needs to be done on windows.  I was hoping to see
something with messages=yes to give me an indication of what you need to do.
Yet I had no luck in identifying anything interesting.  Well except that
GNUstep is using a shared libobjc.

I'm going to throw in the towel here, but I don't believe your issue has to do
with libobjc.  I think you're missing some flag or extra processing that
gnustep-make might do for your DLL or the program.

But I also believe that statically linking (potentially different versions) of
libobjc into different modules is error prone.  I guess it would be OK, if you
only have a single executable, but the constellation of the dll linking one
version and the executable potentially linking another scares me... even if
that itself is most likely not your issue either.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39465



[Bug target/38644] Optimization flag -O1 -fschedule-insns2 causes wrong code

2009-03-16 Thread rearnsha at gcc dot gnu dot org


--- Comment #2 from rearnsha at gcc dot gnu dot org  2009-03-17 00:03 
---
Confirmed, this is a nasty bug that might silently bite users after a long
period of apparently correct operation.


-- 

rearnsha at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu dot
   ||org, ramana dot r at gmail
   ||dot com
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Priority|P3  |P2
   Last reconfirmed|0000-00-00 00:00:00 |2009-03-17 00:03:45
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644



[Bug target/10242] [ARM] subsequent use of plus and minus operators could be improved

2009-03-16 Thread ramana dot r at gmail dot com


--- Comment #4 from ramana dot r at gmail dot com  2009-03-17 00:05 ---
Still present with 4.4 mainline as of the 20090312 revision. It looks like some
sort of relic left behind with the calculations of the soft frame pointer.
Maybe a peephole will help.


-- 

ramana dot r at gmail dot com changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu dot
   ||org, ramana dot r at gmail
   ||dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10242



[Bug rtl-optimization/11222] arm/thumb __Unwind_SjLj_Register prologue optimization causes crash on interrupts

2009-03-16 Thread ramana dot r at gmail dot com


--- Comment #10 from ramana dot r at gmail dot com  2009-03-17 00:11 ---
This should be a target bug. Also with mainline the testcase empty described in
comment #9 appears fixed.


-- 

ramana dot r at gmail dot com changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu dot
   ||org, ramana dot r at gmail
   ||dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11222



[Bug target/10242] [ARM] subsequent use of plus and minus operators could be improved

2009-03-16 Thread rearnsha at gcc dot gnu dot org


--- Comment #5 from rearnsha at gcc dot gnu dot org  2009-03-17 00:15 
---
This is a case where early splitting (before register allocation) of a constant
in a plus expression leads to poor code.  We should try disabling the split of
a plus when combined with the internal frame pointer.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10242



[Bug target/39477] New: Incorrect document for regparm attribute

2009-03-16 Thread hjl dot tools at gmail dot com
extend.texi has

---
@item regparm (@var{number})
@cindex @code{regparm} attribute
@cindex functions that are passed arguments in registers on the 386
On the Intel 386, the @code{regparm} attribute causes the compiler to
pass arguments number one to @var{number} if they are of integral type
in registers EAX, EDX, and ECX instead of on the stack.  Functions that
take a variable number of arguments will continue to be passed all of their
arguments on the stack.

Beware that on some ELF systems this attribute is unsuitable for
global functions in shared libraries with lazy binding (which is the
default).  Lazy binding will send the first call via resolving code in
the loader, which might assume EAX, EDX and ECX can be clobbered, as
per the standard calling conventions.  Solaris 8 is affected by this. 
GNU systems with GLIBC 2.1 or higher, and FreeBSD, are believed to be 
safe since the loaders there save all registers.  (Lazy binding can be
disabled with the linker or the loader if desired, to avoid the
problem.)
---

Although glibc is safe since it preserves EAX, EDX and ECX:

_dl_runtime_resolve:
cfi_adjust_cfa_offset (8) 
pushl %eax  # Preserve registers otherwise clobbered.
cfi_adjust_cfa_offset (4) 
pushl %ecx
cfi_adjust_cfa_offset (4) 
pushl %edx
cfi_adjust_cfa_offset (4) 
movl 16(%esp), %edx # Copy args pushed by PLT in register.  Note
movl 12(%esp), %eax # that `fixup' takes its parameters in regs.
call _dl_fixup  # Call resolver.

it doesn't save all registers.
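
For readers unfamiliar with the attribute under discussion, a minimal usage
sketch follows.  The REGPARM macro and the function sum3 are hypothetical
names made up for this example; the guard is an assumption added so that
the code still compiles on targets where regparm does not apply, in which
case the macro expands to nothing.

```c
/* regparm is specific to 32-bit x86 with GCC-compatible compilers, so
   the attribute is only applied there (an assumption of this sketch).  */
#if defined(__GNUC__) && defined(__i386__)
# define REGPARM(n) __attribute__ ((regparm (n)))
#else
# define REGPARM(n)
#endif

/* With regparm (3) on i386, the three integral arguments are passed in
   EAX, EDX and ECX instead of on the stack.  */
static int REGPARM (3)
sum3 (int a, int b, int c)
{
  return a + b + c;
}
```

The result of the call is the same with or without the attribute; only the
argument-passing convention changes, which is why a lazy-binding resolver
that clobbers EAX, EDX or ECX would break such calls.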


-- 
   Summary: Incorrect document for regparm attribute
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hjl dot tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39477



[Bug target/39477] Incorrect document for regparm attribute

2009-03-16 Thread hjl dot tools at gmail dot com


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

 GCC target triplet||i686-pc-linux-gnu
   Target Milestone|--- |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39477



[Bug c++/39475] c++0x type-traits should error out in case of incompleteness

2009-03-16 Thread paolo dot carlini at oracle dot com


--- Comment #4 from paolo dot carlini at oracle dot com  2009-03-17 00:34 
---
Maybe Daniel, but this is a completely separate issue.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39475



[Bug target/39477] Incorrect document for regparm attribute

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #1 from hjl dot tools at gmail dot com  2009-03-17 00:45 ---
A patch is posted at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39477


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

 CC||ubizjak at gmail dot com
URL||http://gcc.gnu.org/bugzilla/
   ||show_bug.cgi?id=39477


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39477



[Bug target/39476] Typo in ix86_function_regparm in i386.c

2009-03-16 Thread hjl dot tools at gmail dot com


--- Comment #3 from hjl dot tools at gmail dot com  2009-03-17 01:24 ---
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00761.html


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

URL||http://gcc.gnu.org/ml/gcc-
   ||patches/2009-
   ||03/msg00761.html
   Target Milestone|--- |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39476



[Bug target/39473] Typo in untyped_call in i386.md

2009-03-16 Thread hjl dot tools at gmail dot com


-- 

hjl dot tools at gmail dot com changed:

   What|Removed |Added

   Target Milestone|--- |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39473



[Bug target/35180] built-in-setjmp.x2

2009-03-16 Thread hp at gcc dot gnu dot org


--- Comment #1 from hp at gcc dot gnu dot org  2009-03-17 04:18 ---
Does this still happen?  See also PR38609.


-- 

hp at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||hp at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35180



[Bug target/38609] [4.4 Regression]: gcc.c-torture/execute/built-in-setjmp.c execute -O2 and above

2009-03-16 Thread hp at gcc dot gnu dot org


--- Comment #9 from hp at gcc dot gnu dot org  2009-03-17 05:35 ---
(In reply to comment #8)
> Guess it probably won't be TARGET_BUILTIN_SETJMP_FRAME_VALUE then.

At any rate, changing it to hard_frame_pointer_rtx doesn't help by itself.
(Resulting diffs in RTL dumps are gone after 132r.unshare, for r144898.)

Either GCC should punt and force p to the stack, or it should calculate p / keep track
of the stack-pointer correctly: the value is off by 20 when used after the
longjump.  (It should be move.d [$sp+28],$r10, not $sp+8.)  Right, that's the
sp -= 20 due to the __builtin_alloca (20) before the __builtin_setjmp call.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38609