RE: [PATCH] Fix stack red zone bug (PR38644)
-----Original Message-----
From: Richard Henderson [mailto:r...@redhat.com]
Sent: Saturday, October 01, 2011 3:05 AM
To: Jiangning Liu
Cc: 'Jakub Jelinek'; 'Richard Guenther'; Andrew Pinski; gcc-patc...@gcc.gnu.org
Subject: Re: [PATCH] Fix stack red zone bug (PR38644)

On 09/29/2011 06:13 PM, Jiangning Liu wrote:

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com]
Sent: Thursday, September 29, 2011 6:14 PM
To: Jiangning Liu
Cc: 'Richard Guenther'; Andrew Pinski; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix stack red zone bug (PR38644)

On Thu, Sep 29, 2011 at 06:08:50PM +0800, Jiangning Liu wrote:

As far as I know, different back-ends implement different prologues/epilogues in GCC. If one day this part can be refined and abstracted as well, I would say that solving this stack-red-zone problem in shared prologue/epilogue code would be a perfect solution, and the barrier can be inserted there. I'm not saying you are wrong to keep the scheduler using a pure barrier interface. From an engineering point of view, I just feel my proposal is good enough so far, because this patch at least solves the problem for all targets in a quite simple way. Maybe it can be improved in future based on this.

But you don't want to hear about any other alternative. Other backends are happy with being able to put the best kind of barrier at the best spot in the epilogue and don't need a generic solution which won't model the target diversity very well anyway.

Jakub,

I appreciate your attention on this issue.

1) Can you clarify which the other back-ends are? Do they cover most of the back-ends supported by GCC right now?

Your red-stack barrier issue is *exactly* the same as the frame pointer barrier issue, which affects many backends. That is, if the frame pointer is initialized before the local stack frame is allocated, then one has to add a barrier such that memory references based on the frame pointer are not scheduled before the local stack frame allocation.

One example of this is in the i386 port, where the prologue looks like

        push    %ebp
        mov     %esp, %ebp
        sub     $frame, %esp

The rtl we emit for that subtraction looks like

(define_insn "pro_epilogue_adjust_stack_<mode>_add"
  [(set (match_operand:P 0 "register_operand" "=r,r")
        (plus:P (match_operand:P 1 "register_operand" "0,r")
                (match_operand:P 2 "<nonmemory_operand>" "r<i>,l<i>")))
   (clobber (reg:CC FLAGS_REG))
   (clobber (mem:BLK (scratch)))]

Note the final clobber, which is a memory scheduling barrier.

Other targets use similar tricks. For instance arm "stack_tie".

Honestly, I've found nothing convincing throughout this thread that suggests to me that this problem should be handled generically.

Richard H.,

Thanks for your explanation and for the x86 example.

The key point is that, if possible, fixing it in the middle end benefits all ports directly and avoids the bug-fixing burden in the back-ends, rather than fixing this problem port by port.

Actually, the debate here is now whether the memory barrier is properly modelled throughout the whole of GCC rather than in a single component, because my current understanding is that the scheduler is not the only component using memory barriers.

Thanks,
-Jiangning

r~
RE: [PATCH] Fix stack red zone bug (PR38644)
-----Original Message-----
From: Richard Guenther [mailto:richard.guent...@gmail.com]
Sent: Friday, September 30, 2011 8:57 PM
To: Jiangning Liu; Jakub Jelinek; Richard Guenther; Andrew Pinski; gcc-patc...@gcc.gnu.org; richard.sandif...@linaro.org
Subject: Re: [PATCH] Fix stack red zone bug (PR38644)

On Fri, Sep 30, 2011 at 2:46 PM, Richard Sandiford richard.sandif...@linaro.org wrote:

Jiangning Liu jiangning@arm.com writes:

You seem to feel strongly about this because it's a wrong-code bug that is very easy to introduce and often very hard to detect. And I definitely sympathise with that. If we were going to do it in a target-independent way, though, I think it would be better to scan patterns like epilogue and automatically introduce barriers before assignments to stack_pointer_rtx (subject to the kind of hook in your patch). But I still don't think that's better than leaving the onus on the backend. The backend is still responsible for much more complicated things like determining the correct deallocation and register-restore sequence, and for determining the correct CFI sequence.

I think the middle-end in GCC is actually the shared code rather than the part exactly in the middle. A pass working on RTL can be middle-end just because the code can be shared by all targets, and some passes can even work on both GIMPLE and RTL. Actually, some optimizations need to work through the shared part (middle-end) plus the target-specific part (back-end). You are thinking that the interface between this shared part and the target-specific part should use a barrier as the proper model. To some extent I agree with this. However, it doesn't mean the fix should be in the back-end rather than the middle end, because obviously this problem is a common ABI issue for all targets. If we can abstract this issue into the shared part, why shouldn't we do it in the middle end to reduce the onus on the back-end? The back-end should handle the target-specific things rather than only the complicated things.

And for avoidance of doubt, the automatic barrier insertion that I described would be one way of doing it in target-independent code. But...

If a complicated problem can be implemented in a shared-code manner, we still want to put it into the middle end rather than the back-end. I believe those optimizations based on SSA form are complicated enough, but they are all in the middle end. This is the logic I'm seeing in GCC.

The situation here is different. The target-independent rtl code is being given a blob of instructions that the backend has generated for the epilogue. There's no fine-tuning beyond that. E.g. we don't have separate patterns for "restore registers", "deallocate stack", "return": we just have one monolithic epilogue pattern. The target-independent code has very little control.

In contrast, after the tree optimisers have handed off the initial IL, the tree optimisers are more or less in full control. There are very few cases where we generate further trees outside the middle-end. The only case I know off-hand is the innards of va_start and va_arg, which can be generated by the backend.

So let's suppose we had a similar situation there, where we wanted va_arg to do something special in a certain situation. If we had the same three choices of:

1. use an on-the-side hook to represent the special something
2. scan the code generated by the backend and automatically inject the special something at an appropriate place
3. require each backend to do it properly from the start

(OK, slightly prejudiced wording :-)) I think we'd still choose 3.

For this particular issue, I don't think the hook interface I'm proposing is more complicated than the barrier. Instead, it makes it easier for a back-end implementer to be aware of the potential issue before actually solving the stack red zone problem, because it is very clearly listed in the target hook list.

The point for "model it in the IL" supporters like myself is that we have both many backends and many rtl passes. Putting it in a hook keeps things simple for the backends, but it means that every rtl pass must be aware of this on-the-side dependency.

Perhaps sched2 really is the only pass that needs to look at the hook at present. But perhaps not. E.g. dbr_schedule (not a problem on ARM, I realise) also reorders instructions, so maybe it would need to be audited to see whether any calls to this hook are needed. And perhaps we'd add more rtl passes later.

The point behind using a barrier is that the rtl passes do not then need to treat the stack-deallocation dependency as a special case. They can just use the normal analysis and get it right.

In other words, we're both arguing for safety here.

Indeed. It's certainly not only scheduling that can move instructions, but RTL PRE, combine, ifcvt all can
Out-of-order update of new_spill_reg_store[]
This patch fixes an ordering problem in reload: the output reloads are emitted in reverse operand order, but new_spill_reg_store[] is updated in forward reload order. This causes problems if the same register is used for two reloads.

I saw this hit on mips64-linux-gnu/-mabi=64 as a failure in execute/scal-to-vec1.c at -O3. The reloads were:

Reloads for insn # 580
Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:SI 5 $5)
Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
        MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:SI 32 $f0 [1655])
        reload_reg_rtx: (reg:SI 65 lo)
        secondary_out_reload = 0
Reload 2: reload_out (SI) = (reg:SI 1656)
        GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
        reload_out_reg: (reg:SI 1656)
        reload_reg_rtx: (reg:SI 5 $5)

So $5 is first stored in 1656 (operand 3), then $5 is used as a secondary reload in copying LO to $f0 (operand 0, reg 1655). The next and final use of 1655 ends up inheriting this second reload of $5, so we try to delete the original output copy. The problem is that we delete the wrong one: we delete the store of $5 to 1656 rather than the copy of $5 to 1655/$f0.

The fix I went for is to clear new_spill_reg_store[] for all reloads as a separate pass (rather than in the main do_{input,output}_reload loop), then only allow new_spill_reg_store[] to be set if the associated reload register reaches the end of the reload sequence.

emit_input_reload_insns has:

          /* Output a special code sequence for this case, and forget about
             spill reg information.  */
          new_spill_reg_store[REGNO (reloadreg)] = NULL;
          inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);

I think this store is redundant: emit_reload_insns should already have cleared it, both before and after the patch. (The code was originally:

          /* Output a special code sequence for this case.  */
          new_spill_reg_store[REGNO (reloadreg)]
            = inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);

but was changed because we can't inherit auto-inc reloads as easily as that. So the nullification came from an existing new_spill_reg_store[] assignment, rather than being added explicitly.)

Also, emit_reload_insns has two blocks to record inheritance information: one for spill registers and one for non-spill registers. The spill version checks that the reload register reaches the end of the sequence, and I think the non-spill version should too.

Tested on mips64-linux-gnu and x86_64-linux-gnu. It fixes the testcase (by deleting the correct instruction -- the inheritance still happens). Bernd, Uli, does this look OK?

Richard

gcc/
	* reload1.c (reload_regs_reach_end_p): Replace with...
	(reload_reg_rtx_reaches_end_p): ...this function.
	(new_spill_reg_store): Update commentary.
	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
	before setting new_spill_reg_store.
	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
	Also use reload_reg_rtx_reaches_end_p when recording inheritance
	information for non-spill reload registers.

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c 2011-10-08 16:32:26.0 +0100
+++ gcc/reload1.c 2011-10-08 16:32:26.0 +0100
@@ -5499,15 +5499,15 @@ reload_reg_reaches_end_p (unsigned int r
 }

 /* Like reload_reg_reaches_end_p, but check that the condition holds for
-   every register in the range [REGNO, REGNO + NREGS).  */
+   every register in REG.  */

 static bool
-reload_regs_reach_end_p (unsigned int regno, int nregs, int reloadnum)
+reload_reg_rtx_reaches_end_p (rtx reg, int reloadnum)
 {
-  int i;
+  unsigned int i;

-  for (i = 0; i < nregs; i++)
-    if (!reload_reg_reaches_end_p (regno + i, reloadnum))
+  for (i = REGNO (reg); i < END_REGNO (reg); i++)
+    if (!reload_reg_reaches_end_p (i, reloadnum))
       return false;
   return true;
 }
@@ -7052,7 +7052,9 @@ static rtx operand_reload_insns = 0;
 static rtx other_operand_reload_insns = 0;
 static rtx other_output_reload_insns[MAX_RECOG_OPERANDS];

-/* Values to be put in spill_reg_store are put here first.  */
+/* Values to be put in spill_reg_store are put here first.  Instructions
+   must only be placed here if the associated reload register reaches
+   the end of the instruction's reload sequence.  */
 static rtx new_spill_reg_store[FIRST_PSEUDO_REGISTER];
 static HARD_REG_SET reg_reloaded_died;

@@ -7213,9 +7215,7 @@ emit_input_reload_insns (struct insn_cha
          /* Prevent normal processing of this reload.  */
          special = 1;
-         /* Output a special code sequence for this case, and forget about
-            spill
Re: PATCH RFA: New configure option --with-native-system-header-dir
Ian Lance Taylor i...@google.com writes:

So, it seems to me that we should:

* Remove SYSTEM_INCLUDE_DIR, which is undefined and unnecessary.
* Move the definition of NATIVE_SYSTEM_HEADER_DIR into config.gcc (named native_system_header_dir). The default is /usr/include. This appears to be necessary since the configure script itself needs to know this value.
* Have the configure script use NATIVE_SYSTEM_HEADER_DIR when setting target_header_dir.
* Arrange for Makefile to define NATIVE_SYSTEM_HEADER_DIR when compiling cppdefault.c (i.e., add it to PREPROCESSOR_DEFINES in Makefile.in).
* Replace STANDARD_INCLUDE_DIR in cppdefault.c with NATIVE_SYSTEM_HEADER_DIR.
* Remove STANDARD_INCLUDE_DIR.
* Add the --with-native-system-header-dir option.

This patch implements this proposal. Only lightly tested so far. How does this look if testing succeeds?

Ian

2011-10-08  Simon Baldwin  sim...@google.com
	    Ian Lance Taylor  i...@google.com

	* configure.ac: Add --with-native-system-header-dir. Set and
	substitute NATIVE_SYSTEM_HEADER_DIR. Use native_system_header
	when setting target_header_dir.
	* config.gcc: Always set native_system_header_dir.
	(*-*-gnu*): Set native_system_header_dir. Don't use t-gnu.
	(i[34567]86-pc-msdosdjgpp*): Set native_system_header_dir. Don't
	use i386/t-djgpp.
	(i[34567]86-*-mingw* | x86_64-*-mingw*): Set
	native_system_header_dir.
	(spu-*-elf*): Set native_system_header_dir.
	* Makefile.in (NATIVE_SYSTEM_HEADER_DIR): Set to
	@NATIVE_SYSTEM_HEADER_DIR@.
	(PREPROCESSOR_DEFINES): Define NATIVE_SYSTEM_HEADER_DIR.
	* cppdefault.c (STANDARD_INCLUDE_DIR): Don't define.
	(NATIVE_SYSTEM_HEADER_COMPONENT): Rename from
	STANDARD_INCLUDE_COMPONENT.
	(cpp_include_defaults): Don't use SYSTEM_INCLUDE_DIR. Rename
	STANDARD_INCLUDE_DIR to NATIVE_SYSTEM_HEADER_DIR.
	* system.h: Poison SYSTEM_INCLUDE_DIR, STANDARD_INCLUDE_DIR, and
	STANDARD_INCLUDE_COMPONENT.
	* config/i386/t-mingw32 (NATIVE_SYSTEM_HEADER_DIR): Remove.
	* config/i386/t-mingw-w32: Likewise.
	* config/i386/t-mingw-w64: Likewise.
	* config/spu/t-spu-elf: Likewise.
	* config/i386/t-djgpp: Remove.
	* config/t-gnu: Remove.
	* config/i386/mingw32.h (STANDARD_INCLUDE_DIR): Don't define.
	(NATIVE_SYSTEM_HEADER_COMPONENT): Rename from
	STANDARD_INCLUDE_COMPONENT.
	* config/i386/djgpp.h (STANDARD_INCLUDE_DIR): Don't define.
	* config/spu/spu-elf.h: Likewise.
	* config/vms/xm-vms.h: Likewise.
	* config/gnu.h: Likewise.
	* config/openbsd.h (INCLUDE_DEFAULTS): Change STANDARD_INCLUDE_DIR
	and STANDARD_INCLUDE_COMPONENT to NATIVE_SYSTEM_HEADER_DIR and
	NATIVE_SYSTEM_HEADER_COMPONENT.
	* doc/install.texi (Configuration): Document
	--with-native-system-header-dir. Mention it in the documentation
	for --with-sysroot and --with-build-sysroot.
	* doc/tm.texi.in (Driver): Don't document SYSTEM_INCLUDE_DIR or
	STANDARD_INCLUDE_DIR. Rename STANDARD_INCLUDE_COMPONENT to
	NATIVE_SYSTEM_HEADER_COMPONENT. Rename uses of
	STANDARD_INCLUDE_DIR to NATIVE_SYSTEM_HEADER_DIR.
	* doc/fragments.texi (Target Fragment): Don't document
	NATIVE_SYSTEM_HEADER_DIR.
	* configure, doc/tm.texi: Rebuild.

Index: doc/fragments.texi
===================================================================
--- doc/fragments.texi (revision 179696)
+++ doc/fragments.texi (working copy)
@@ -1,5 +1,6 @@
 @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-@c 1999, 2000, 2001, 2003, 2004, 2005, 2008 Free Software Foundation, Inc.
+@c 1999, 2000, 2001, 2003, 2004, 2005, 2008, 2011
+@c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.

@@ -128,12 +129,6 @@ compiler. In that case, set @code{MULTI
 of options to be used for all builds. If you set this, you should
 probably set @code{CRTSTUFF_T_CFLAGS} to a dash followed by it.

-@findex NATIVE_SYSTEM_HEADER_DIR
-@item NATIVE_SYSTEM_HEADER_DIR
-If the default location for system headers is not @file{/usr/include},
-you must set this to the directory containing the headers. This value
-should match the value of the @code{SYSTEM_INCLUDE_DIR} macro.
-
 @findex SPECS
 @item SPECS
 Unfortunately, setting @code{MULTILIB_EXTRA_OPTS} is not enough, since
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in (revision 179696)
+++ doc/tm.texi.in (working copy)
@@ -468,33 +468,15 @@ initialize the necessary environment var
 Define this macro as a C string constant if you wish to override the
 standard choice of @file{/usr/local/include} as the default prefix to
 try when searching for local header files.
Re: [google] record compiler options to .note sections
On Sun, Oct 09, 2011 at 09:18:25AM +0800, Dehao Chen wrote:

Unfortunately -frecord-gcc-switches cannot serve our purpose because the recorded switches are mergeable, i.e. the linker will merge all options into a set of strings. However, object files may have distinct compile options. We want to preserve every object file's compile options when doing a LIPO build.

And -grecord-gcc-switches? That one, although it is mergeable, still preserves every object file's compile options.

Jakub
RE: [PATCH] Fix stack red zone bug (PR38644)
-----Original Message-----
From: Richard Sandiford richard.sandif...@linaro.org
Date: Fri, Sep 30, 2011 at 8:46 PM
Subject: Re: [PATCH] Fix stack red zone bug (PR38644)
To: Jiangning Liu jiangning@arm.com
Cc: Jakub Jelinek ja...@redhat.com, Richard Guenther richard.guent...@gmail.com, Andrew Pinski pins...@gmail.com, gcc-patches@gcc.gnu.org

Jiangning Liu jiangning@arm.com writes:

You seem to feel strongly about this because it's a wrong-code bug that is very easy to introduce and often very hard to detect. And I definitely sympathise with that. If we were going to do it in a target-independent way, though, I think it would be better to scan patterns like epilogue and automatically introduce barriers before assignments to stack_pointer_rtx (subject to the kind of hook in your patch). But I still don't think that's better than leaving the onus on the backend. The backend is still responsible for much more complicated things like determining the correct deallocation and register-restore sequence, and for determining the correct CFI sequence.

I think the middle-end in GCC is actually the shared code rather than the part exactly in the middle. A pass working on RTL can be middle-end just because the code can be shared by all targets, and some passes can even work on both GIMPLE and RTL. Actually, some optimizations need to work through the shared part (middle-end) plus the target-specific part (back-end). You are thinking that the interface between this shared part and the target-specific part should use a barrier as the proper model. To some extent I agree with this. However, it doesn't mean the fix should be in the back-end rather than the middle end, because obviously this problem is a common ABI issue for all targets. If we can abstract this issue into the shared part, why shouldn't we do it in the middle end to reduce the onus on the back-end? The back-end should handle the target-specific things rather than only the complicated things.

And for avoidance of doubt, the automatic barrier insertion that I described would be one way of doing it in target-independent code. But...

If a complicated problem can be implemented in a shared-code manner, we still want to put it into the middle end rather than the back-end. I believe those optimizations based on SSA form are complicated enough, but they are all in the middle end. This is the logic I'm seeing in GCC.

The situation here is different. The target-independent rtl code is being given a blob of instructions that the backend has generated for the epilogue. There's no fine-tuning beyond that. E.g. we don't have separate patterns for "restore registers", "deallocate stack", "return": we just have one monolithic epilogue pattern. The target-independent code has very little control.

In contrast, after the tree optimisers have handed off the initial IL, the tree optimisers are more or less in full control. There are very few cases where we generate further trees outside the middle-end. The only case I know off-hand is the innards of va_start and va_arg, which can be generated by the backend.

So let's suppose we had a similar situation there, where we wanted va_arg to do something special in a certain situation. If we had the same three choices of:

1. use an on-the-side hook to represent the special something
2. scan the code generated by the backend and automatically inject the special something at an appropriate place
3. require each backend to do it properly from the start

(OK, slightly prejudiced wording :-)) I think we'd still choose 3.

Richard S.,

Although I implemented va_arg for a commercial compiler a long time ago, I have forgotten all the details. :-) I'm not sure whether va_arg is a good example to compare with this stack red zone case.

For this particular issue, I don't think the hook interface I'm proposing is more complicated than the barrier. Instead, it makes it easier for a back-end implementer to be aware of the potential issue before actually solving the stack red zone problem, because it is very clearly listed in the target hook list.

The point for "model it in the IL" supporters like myself is that we have both many backends and many rtl passes. Putting it in a hook keeps things simple for the backends, but it means that every rtl pass must be aware of this on-the-side dependency.

Perhaps sched2 really is the only pass that needs to look at the hook at present. But perhaps not. E.g. dbr_schedule (not a problem on ARM, I realise) also reorders instructions, so maybe it would need to be audited to see whether any calls to this hook are needed. And perhaps we'd add more rtl passes later.

Let me rephrase your justification in my own words.

===
We can't compare adding a new pass and adding a new port, because they are totally different things. But it implies with my proposal the burden may still be added
Fix for PR libobjc/49883 (clang + gcc 4.6 runtime = broken) and a small related clang fix
This patch fixes PR libobjc/49883.

To fix it, I installed clang and tried out what happens if you compile Objective-C code using clang and targeting the GCC runtime. Unfortunately, the report was correct in that clang is producing incorrect code and abusing the higher bits of the class->info field to store some other information.

On the good side, the fix I proposed in the discussion of PR libobjc/49883 actually works. :-) So, I applied that fix.

I also found that clang still emits calls to the objc_lookup_class() function, so this patch also adds that function back into the runtime to get code compiled with clang to work.

Committed to trunk.

Thanks

PS: In case anyone wonders, I do want the GNU Objective-C Runtime to be usable with free, non-GCC Objective-C compilers. It should obviously work perfectly with GCC, the GNU compiler, which is its natural partner, but some people would like to use it with other free compilers and that seems a reasonable request. Refusing that request just provides an incentive to write and support other Objective-C runtimes, which is a waste of time and resources. ;-)

Index: init.c
===================================================================
--- init.c (revision 179711)
+++ init.c (working copy)
@@ -643,6 +643,15 @@
   assert (CLS_ISMETA (class->class_pointer));
   DEBUG_PRINTF ("installing class '%s'\n", class->name);

+  /* Workaround for a bug in clang: Clang may set flags other than
+     _CLS_CLASS and _CLS_META even when compiling for the
+     traditional ABI (version 8), confusing our runtime.  Try to
+     wipe these flags out.  */
+  if (CLS_ISCLASS (class))
+    __CLS_INFO (class) = _CLS_CLASS;
+  else
+    __CLS_INFO (class) = _CLS_META;
+
   /* Initialize the subclass list to be NULL.  In some cases it isn't
      and this crashes the program.  */
   class->subclass_list = NULL;
Index: class.c
===================================================================
--- class.c (revision 179711)
+++ class.c (working copy)
@@ -764,6 +764,15 @@
   return objc_get_class (name)->class_pointer;
 }

+/* This is not used by GCC, but the clang compiler seems to use it
+   when targetting the GNU runtime.  That's wrong, but we have it to
+   be compatible.  */
+Class
+objc_lookup_class (const char *name)
+{
+  return objc_getClass (name);
+}
+
 /* This is used when the implementation of a method changes.  It goes
    through all classes, looking for the ones that have these methods
    (either method_a or method_b; method_b can be NULL), and reloads
Index: ChangeLog
===================================================================
--- ChangeLog (revision 179711)
+++ ChangeLog (working copy)
@@ -1,3 +1,18 @@
+2011-10-09  Nicola Pero  nicola.p...@meta-innovation.com
+
+	PR libobjc/49883
+	* init.c (__objc_exec_class): Work around a bug in clang's code
+	generation.  Clang sets the class->info field to values different
+	from 0x1 or 0x2 (the only allowed values in the traditional GNU
+	Objective-C runtime ABI) to store some additional information, but
+	this breaks backwards compatibility.  Wipe out all the bits in the
+	fields other than the first two upon loading a class.
+
+2011-10-09  Nicola Pero  nicola.p...@meta-innovation.com
+
+	* class.c (objc_lookup_class): Added back for compatibility with
+	clang which seems to emit calls to it.
+
 2011-10-08  Richard Frith-Macdonald  r...@gnu.org
 	    Nicola Pero  nicola.p...@meta-innovation.com
Re: [RFC] Slightly fix up vgather* patterns
On Sat, Oct 8, 2011 at 5:43 PM, Jakub Jelinek ja...@redhat.com wrote:

The AVX2 docs say that the insns will #UD if any of the mask, src and index registers are the same, but e.g. on

#include <x86intrin.h>

__m256 m;
float f[1024];

__m256
foo (void)
{
  __m256i mi = (__m256i) m;
  return _mm256_mask_i32gather_ps (m, f, mi, m, 4);
}

which is IMHO valid and should for m being zero vector just return a zero vector and clear mask (in this case it was already cleared) we compile it as

        vmovdqa m(%rip), %ymm1
        vmovaps %ymm1, %ymm0
        vgatherdps %ymm1, (%rax, %ymm1, 4), %ymm0

and thus IMHO it will #UD.

Also, the insns should make it clear that the mask register is modified too (the patch clobbers it, perhaps we could instead say that it zeros the register (which is true if it doesn't segfault), but then what if a segfault handler chooses to continue with the next insn and doesn't clear the mask register?).

Still, the insn description is imprecise, saying that it loads from mem at the address register is wrong and perhaps some DCE might delete what shouldn't be deleted. So, either it should (use (mem (scratch))) or something similar, or in the unspec list all the memory locations that are being read

(mem:scalarssemode (plus:SI (reg:SI) (vec_select:SI (match_operand:V4SI) (parallel [(const_int N)]

for N 0 through something (but it is complicated by Pmode size vs. the need to do nothing/truncate/sign_extend the vec_select to the right mode).

What do you think?

Regarding the clear of mask operand: I agree that this should be modelled as a clobber. Zeroing can't be guaranteed due to the fact you described above.

About memory - can't we use (mem:BLK (match_operand:P register_operand r)) here?

BTW: No need to use %c modifier:

/* Meaning of CODE:
   L,W,B,Q,S,T -- print the opcode suffix for specified size of operand.
   C -- print opcode suffix for set/cmov insn.
   c -- like C, but print reversed condition
   ...
*/

Uros.
Re: [patch] C6X unwinding/exception handling
This did break libobjc and libjava on arm-linux-gnueabi.

libobjc now has an undefined reference to _Unwind_decode_target2, which can be avoided with

--- libobjc/exception.c.orig 2011-07-21 15:33:57.0 +
+++ libobjc/exception.c 2011-10-09 10:53:12.554940776 +
@@ -182,7 +182,7 @@
   _Unwind_Ptr ptr;

   ptr = (_Unwind_Ptr) (info->TType - (i * 4));
-  ptr = _Unwind_decode_target2 (ptr);
+  ptr = _Unwind_decode_typeinfo_ptr (info->ttype_base, (_Unwind_Word) ptr);

   /* NULL ptr means catch-all.  Note that if the class is not found,
      this will abort the program.  */

libjava fails to build; the same change doesn't work for libjava/exception.cc, because the struct lsda_header_info in exception.cc is missing the ttype_base member. Any suggestions?

On 09/13/2011 02:48 PM, Paul Brook wrote:

C6X uses an unwinding/exception handling scheme very similar to that defined by the ARM EABI. The core of the unwinder is the same, so I've pulled it out into a common file. Other than the obvious target specific bits, the main compiler visible difference is that the C6X assembler generates the unwinding tables from DWARF .cfi directives, rather than the separate set of directives used by the ARM assembler.

The libstdc++ changes probably deserve a bit of explanation. The ttype_base field was clearly used in an early draft of the ARM EABI, and the current ARM definition is a compatible subset of that used by C6X. _GLIBCXX_OVERRIDE_TTYPE_ENCODING is an unfortunate hack because when doing the ARM implementation I failed to realise ttype_encoding was the same thing as R_ARM_TARGET2. We now have a lot of ARM binaries floating around with that field set incorrectly, so it's either this or an ABI bump.

I've updated the patch to accommodate the move to libgcc/, done a quick sanity recheck of arm-linux and c6x-elf and applied to svn.

P.S. in case it's not clear from my description, the libstdc++ changes aren't really a new hack, it's just making an old one more obvious.

Paul

2011-09-13  Paul Brook  p...@codesourcery.com

gcc/
	* config/arm/arm.h (ASM_PREFERRED_EH_DATA_FORMAT): Define.
	(ARM_TARGET2_DWARF_FORMAT): Provide default definition.
	* config/arm/linux-eabi.h (ARM_TARGET2_DWARF_FORMAT): Define.
	* config/arm/symbian.h (ARM_TARGET2_DWARF_FORMAT): Define.
	* config/arm/uclinux-eabi.h (ARM_TARGET2_DWARF_FORMAT): Define.
	* config/arm/t-bpabi (EXTRA_HEADERS): Add unwind-arm-common.h.
	* config/arm/t-symbian (EXTRA_HEADERS): Add unwind-arm-common.h.
	* config/c6x/c6x.c (c6x_output_file_unwind): Don't rely on dwarf2
	code enabling unwind tables.
	(c6x_debug_unwind_info): New function.
	(TARGET_ARM_EABI_UNWINDER): Define.
	(TARGET_DEBUG_UNWIND_INFO): Define.
	* config/c6x/c6x.h (DWARF_FRAME_RETURN_COLUMN): Define.
	(TARGET_EXTRA_CFI_SECTION): Remove.
	* config/c6x/t-c6x-elf (EXTRA_HEADERS): Set.
	* ginclude/unwind-arm-common.h: New file.

libgcc/
	* config.host (tic6x-*-*): Add c6x/t-c6x-elf. Set unwind_header.
	* unwind-c.c (PERSONALITY_FUNCTION): Use UNWIND_POINTER_REG.
	* unwind-arm-common.inc: New file.
	* config/arm/unwind-arm.c: Use unwind-arm-common.inc.
	* config/arm/unwind-arm.h: Use unwind-arm-common.h.
	(_GLIBCXX_OVERRIDE_TTYPE_ENCODING): Define.
	* config/c6x/libunwind.S: New file.
	* config/c6x/pr-support.c: New file.
	* config/c6x/unwind-c6x.c: New file.
	* config/c6x/unwind-c6x.h: New file.
	* config/c6x/t-c6x-elf: New file.

libstdc++-v3/
	* libsupc++/eh_arm.cc (__cxa_end_cleanup): Add C6X implementation.
	* libsupc++/eh_call.cc (__cxa_call_unexpected): Set rtti_base.
	* libsupc++/eh_personality.cc (NO_SIZE_OF_ENCODED_VALUE): Remove
	__ARM_EABI_UNWINDER__ check.
	(parse_lsda_header): Check _GLIBCXX_OVERRIDE_TTYPE_ENCODING.
	(get_ttype_entry): Use generic implementation on ARM EABI.
	(check_exception_spec): Use _Unwind_decode_typeinfo_ptr and
	UNWIND_STACK_REG.
	(PERSONALITY_FUNCTION): Set ttype_base.
[Patch, Fortran, committed] PR 50659: [4.4/4.5/4.6/4.7 Regression] ICE with PROCEDURE statement
Hi all,

I have just committed as obvious a patch for an ICE-on-valid problem with PROCEDURE statements:

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179723

The problem was the following: When setting up an external procedure or procedure pointer (declared via a PROCEDURE statement), we copy the expressions for the array bounds and string length from the interface symbol given in the PROCEDURE declaration (cf. 'resolve_procedure_interface'). If those expressions depend on the actual args of the interface, we have to replace those args by the args of the new procedure symbol that we're setting up. This is what 'gfc_expr_replace_symbols' / 'replace_symbol' does. Unfortunately we failed to check whether the symbol we try to replace is actually a dummy!

Contrary to Andrew's initial assumption, I think the test case is valid. I could find neither a compiler which rejects it, nor a restriction in the standard which makes it invalid. The relevant part of F08 is probably chapter 7.1.11 (Specification expression). This states that a specification expression can contain variables which are made accessible via use association.

I'm planning to apply the patch to the 4.6, 4.5 and 4.4 branches soon.

Cheers,
Janus
[C++ Patch] PR 50660
Hi,

another duplicated diagnostic message. This one happens for snippets like the one below, due to the temporary for the const ref:

int g(const int&);
int m2()
{
  return g(__null);
}

50660.C:4:18: warning: passing NULL to non-pointer argument 1 of ‘int g(const int&)’
50660.C:4:18: warning: passing NULL to non-pointer argument 1 of ‘int g(const int&)’

I'm changing conversion_null_warnings to return true when a warning is actually produced; convert_like_real checks that result before it calls itself recursively again. I think it should be safe to shut down all kinds of further warnings in that case. Otherwise, as a last resort which certainly works, we could even envisage adding an issue_conversion_null_warnings parameter to convert_like_real.

Patch tested x86_64-linux.

Thanks,
Paolo.

2011-10-09  Paolo Carlini  paolo.carl...@oracle.com

    PR c++/50660
    * call.c (conversion_null_warnings): Return true when a warning is
    actually emitted.
    (convert_like_real): When conversion_null_warnings returns true
    set issue_conversion_warnings to false.

Index: call.c
===================================================================
--- call.c      (revision 179720)
+++ call.c      (working copy)
@@ -5509,9 +5509,9 @@ build_temp (tree expr, tree type, int flags,

 /* Perform warnings about peculiar, but valid, conversions from/to NULL.
    EXPR is implicitly converted to type TOTYPE.
-   FN and ARGNUM are used for diagnostics.  */
+   FN and ARGNUM are used for diagnostics.  Returns true if warned.  */

-static void
+static bool
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
   tree t = non_reference (totype);
@@ -5526,6 +5526,7 @@ conversion_null_warnings (tree totype, tree expr,
       else
         warning_at (input_location, OPT_Wconversion_null,
                     "converting to non-pointer type %qT from NULL", t);
+      return true;
     }

   /* Issue warnings if "false" is converted to a NULL pointer */
@@ -5538,7 +5539,9 @@ conversion_null_warnings (tree totype, tree expr,
       else
         warning_at (input_location, OPT_Wconversion_null,
                     "converting %<false%> to pointer type %qT", t);
+      return true;
     }
+  return false;
 }

 /* Perform the conversions in CONVS on the expression EXPR.  FN and
@@ -5624,8 +5627,9 @@ convert_like_real (conversion *convs, tree expr, t
       return cp_convert (totype, expr);
     }

-  if (issue_conversion_warnings && (complain & tf_warning))
-    conversion_null_warnings (totype, expr, fn, argnum);
+  if (issue_conversion_warnings && (complain & tf_warning)
+      && conversion_null_warnings (totype, expr, fn, argnum))
+    issue_conversion_warnings = false;

   switch (convs->kind)
     {
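The control flow of this change can be illustrated outside of GCC with a minimal sketch. Everything below is hypothetical: `warn_if_null_to_non_pointer` and `convert_like` only stand in for conversion_null_warnings and convert_like_real, and the `depth` parameter is an assumed way of modelling the nested conversion created for the temporary. It shows only how returning a "warned" flag lets the caller silence the recursive invocation.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts diagnostics so the effect is observable (illustration only).  */
static int warnings_emitted;

/* Hypothetical stand-in for conversion_null_warnings: returns true
   only when a diagnostic was actually emitted.  */
static bool
warn_if_null_to_non_pointer (bool arg_is_null, bool target_is_pointer)
{
  if (arg_is_null && !target_is_pointer)
    {
      fprintf (stderr, "warning: passing NULL to non-pointer argument\n");
      warnings_emitted++;
      return true;
    }
  return false;
}

/* Hypothetical stand-in for convert_like_real.  DEPTH models the nested
   conversions; once a warning has been issued, the flag passed down to
   the recursive call is cleared so the same warning is not repeated.  */
static void
convert_like (int depth, bool issue_warnings, bool arg_is_null,
              bool target_is_pointer)
{
  if (issue_warnings
      && warn_if_null_to_non_pointer (arg_is_null, target_is_pointer))
    issue_warnings = false;

  if (depth > 0)
    convert_like (depth - 1, issue_warnings, arg_is_null, target_is_pointer);
}
```

In this sketch a call such as `convert_like (1, true, true, false)` emits the warning once, at the outer level, and the nested call stays quiet.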
[CRIS] Hookize PREFERRED_RELOAD_CLASS
Hello.

This patch removes the obsolete PREFERRED_RELOAD_CLASS macro from the CRIS back end in GCC and introduces the equivalent TARGET_PREFERRED_RELOAD_CLASS target hook.

Regression tested on cris-axis-elf.

OK to install?

    * config/cris/cris.c (cris_preferred_reload_class): New function.
    (TARGET_PREFERRED_RELOAD_CLASS): Define.
    * config/cris/cris.h (PREFERRED_RELOAD_CLASS): Remove.

Index: gcc/config/cris/cris.c
===================================================================
--- gcc/config/cris/cris.c      (revision 179721)
+++ gcc/config/cris/cris.c      (working copy)
@@ -123,6 +123,8 @@ static void cris_file_start (void);

 static void cris_init_libfuncs (void);

+static reg_class_t cris_preferred_reload_class (rtx, reg_class_t);
+
 static int cris_register_move_cost (enum machine_mode, reg_class_t, reg_class_t);
 static int cris_memory_move_cost (enum machine_mode, reg_class_t, bool);
 static bool cris_rtx_costs (rtx, int, int, int, int *, bool);
@@ -198,6 +200,9 @@
 #undef TARGET_INIT_LIBFUNCS
 #define TARGET_INIT_LIBFUNCS cris_init_libfuncs

+#undef TARGET_PREFERRED_RELOAD_CLASS
+#define TARGET_PREFERRED_RELOAD_CLASS cris_preferred_reload_class
+
 #undef TARGET_REGISTER_MOVE_COST
 #define TARGET_REGISTER_MOVE_COST cris_register_move_cost
 #undef TARGET_MEMORY_MOVE_COST
@@ -1342,6 +1347,31 @@
   return false;
 }

+
+/* Worker function for TARGET_PREFERRED_RELOAD_CLASS.
+
+   It seems like gcc (2.7.2 and 2.9x of 2000-03-22) may send "NO_REGS" as
+   the class for a constant (testcase: __Mul in arit.c).  To avoid forcing
+   out a constant into the constant pool, we will trap this case and
+   return something a bit more sane.  FIXME: Check if this is a bug.
+   Beware that we must not "override" classes that can be specified as
+   constraint letters, or else asm operands using them will fail when
+   they need to be reloaded.  FIXME: Investigate whether that constitutes
+   a bug.  */
+
+static reg_class_t
+cris_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t rclass)
+{
+  if (rclass != ACR_REGS
+      && rclass != MOF_REGS
+      && rclass != SRP_REGS
+      && rclass != CC0_REGS
+      && rclass != SPECIAL_REGS)
+    return GENERAL_REGS;
+
+  return rclass;
+}
+
 /* Worker function for TARGET_REGISTER_MOVE_COST.  */

 static int
Index: gcc/config/cris/cris.h
===================================================================
--- gcc/config/cris/cris.h      (revision 179721)
+++ gcc/config/cris/cris.h      (working copy)
@@ -583,22 +583,6 @@
 /* See REGNO_OK_FOR_BASE_P.  */
 #define REGNO_OK_FOR_INDEX_P(REGNO) REGNO_OK_FOR_BASE_P(REGNO)

-/* It seems like gcc (2.7.2 and 2.9x of 2000-03-22) may send "NO_REGS" as
-   the class for a constant (testcase: __Mul in arit.c).  To avoid forcing
-   out a constant into the constant pool, we will trap this case and
-   return something a bit more sane.  FIXME: Check if this is a bug.
-   Beware that we must not "override" classes that can be specified as
-   constraint letters, or else asm operands using them will fail when
-   they need to be reloaded.  FIXME: Investigate whether that constitutes
-   a bug.  */
-#define PREFERRED_RELOAD_CLASS(X, CLASS)       \
-  ((CLASS) != ACR_REGS                         \
-   && (CLASS) != MOF_REGS                      \
-   && (CLASS) != SRP_REGS                      \
-   && (CLASS) != CC0_REGS                      \
-   && (CLASS) != SPECIAL_REGS                  \
-   ? GENERAL_REGS : (CLASS))
-
 /* We can't move special registers to and from memory in smaller than
    word_mode.  We also can't move between special registers.  Luckily,
    -1, as returned by true_regnum for non-sub/registers, is valid as a

Anatoly.
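For readers without a GCC tree at hand, the selection logic carried over from the macro to the hook can be tried in isolation. This is a sketch under stated assumptions: the enum below is a hypothetical, reduced stand-in for the CRIS register classes (the real `reg_class_t` and its values come from the GCC back end), and `preferred_reload_class_sketch` is an illustrative name, not the hook itself.

```c
/* Hypothetical, reduced stand-in for the CRIS register classes.  */
enum reg_class_sketch
{
  NO_REGS,
  GENERAL_REGS,
  ACR_REGS,
  MOF_REGS,
  SRP_REGS,
  CC0_REGS,
  SPECIAL_REGS
};

/* Same decision as the old macro and the new hook: classes that can be
   named by constraint letters are passed through unchanged, anything
   else (including NO_REGS) is mapped to GENERAL_REGS.  */
static enum reg_class_sketch
preferred_reload_class_sketch (enum reg_class_sketch rclass)
{
  switch (rclass)
    {
    case ACR_REGS:
    case MOF_REGS:
    case SRP_REGS:
    case CC0_REGS:
    case SPECIAL_REGS:
      return rclass;
    default:
      return GENERAL_REGS;
    }
}
```

With these assumed values, `preferred_reload_class_sketch (NO_REGS)` yields `GENERAL_REGS`, while `MOF_REGS` is returned as is.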
[C++ Patch] Trailing comma in enum
Hi.

As I understand it, C++11 allows trailing commas in enum definitions. Thus I think the following little patch should be included.

On a side note, I have to say that the effects of pedwarn_cxx98 are unexpected, especially in light of the comment above the function body.

/MF

2011-10-09  Magnus Fromreide  ma...@lysator.liu.se

    * gcc/cp/parser.c (cp_parser_enumerator_list): Do not warn about
    trailing commas in C++0x mode.
    * gcc/testsuite/g++.dg/cpp0x/enum21a.C: Test that enum x { y, }
    does generate a pedwarning in c++98-mode.
    * gcc/testsuite/g++.dg/cpp0x/enum21b.C: Test that enum x { y, }
    doesn't generate a pedwarning in c++0x-mode.

Index: gcc/testsuite/g++.dg/cpp0x/enum21a.C
===================================================================
--- gcc/testsuite/g++.dg/cpp0x/enum21a.C        (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/enum21a.C        (revision 0)
@@ -0,0 +1,4 @@
+// { dg-do compile }
+// { dg-options "-pedantic" }
+
+enum x { y, }; // { dg-warning "comma at end of enumerator list" }
Index: gcc/testsuite/g++.dg/cpp0x/enum21b.C
===================================================================
--- gcc/testsuite/g++.dg/cpp0x/enum21b.C        (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/enum21b.C        (revision 0)
@@ -0,0 +1,4 @@
+// { dg-do compile }
+// { dg-options "-pedantic -std=c++0x" }
+
+enum x { y, };
Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c     (revision 179711)
+++ gcc/cp/parser.c     (working copy)
@@ -13444,6 +13444,7 @@ cp_parser_elaborated_type_specifier (cp_parser* pa

    enum-specifier:
      enum-head { enumerator-list [opt] }
+     enum-head { enumerator-list , } [C++0x]

    enum-head:
      enum-key identifier [opt] enum-base [opt]
@@ -13463,6 +13464,8 @@ cp_parser_elaborated_type_specifier (cp_parser* pa
    GNU Extensions:
      enum-key attributes[opt] identifier [opt] enum-base [opt]
        { enumerator-list [opt] }attributes[opt]
+     enum-key attributes[opt] identifier [opt] enum-base [opt]
+       { enumerator-list, }attributes[opt] [C++0x]

    Returns an ENUM_TYPE representing the enumeration, or NULL_TREE
    if the token stream isn't an enum-specifier after all.  */
@@ -13802,8 +13805,9 @@ cp_parser_enumerator_list (cp_parser* parser, tree
       /* If the next token is a `}', there is a trailing comma.  */
       if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_BRACE))
         {
-          if (!in_system_header)
-            pedwarn (input_location, OPT_pedantic, "comma at end of enumerator list");
+          if (cxx_dialect < cxx0x && !in_system_header)
+            pedwarn (input_location, OPT_pedantic,
+                     "comma at end of enumerator list");
           break;
         }
     }
[Patch] Don't ignore testsuite errors in Makefile
Hello,

currently, the testsuite return value is ignored by make. It is a little annoying if one wants to check automatically for regressions, as we have to parse the testsuite output.

This patch reverts to the normal make behaviour, which is to not ignore commands' return values.

Note: As a result, the -k flag has to be added to the make command line if one wants the tests to continue after one failure.

OK for trunk?

Mikael

PS: Jakub, I CCed you as you are the author of the Makefile chunk.

2011-10-09  Mikael Morin  mikael.mo...@sfr.fr

    * Makefile.in (check-parallel-%): Don't ignore testsuite errors.

Index: Makefile.in
===================================================================
--- Makefile.in (revision 179710)
+++ Makefile.in (working copy)
@@ -5116,10 +5124,10 @@ $(patsubst %,%-subtargets,$(lang_checks_paralleliz
 # Otherwise check-$lang isn't parallelized and runtest is invoked just with
 # the $(RUNTESTFLAGS) arguments.
 check-parallel-% : site.exp
-       -test -d plugin || mkdir plugin
-       -test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
+       test -d plugin || mkdir plugin
+       test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
        test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir $(TESTSUITEDIR)/$(check_p_subdir)
-       -(rootme=`${PWD_COMMAND}`; export rootme; \
+       (rootme=`${PWD_COMMAND}`; export rootme; \
        srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
        cd $(TESTSUITEDIR)/$(check_p_subdir); \
        rm -f tmp-site.exp; \
[patch] Fix PR tree-optimization/50635
Hi, In vectorizer pattern recognition when a pattern def_stmt already exists, we need to mark it properly for the current pattern. Another problem is that we don't really have to check that TYPE_OUT is a vector type. It is set by the pattern detection procedures, and if the type is invalid we fail later in the operation analysis anyway. Bootstrapped and tested on powerpc64-suse-linux. Committed. Ira ChangeLog: PR tree-optimization/50635 * tree-vect-patterns.c (vect_handle_widen_mult_by_const): Add DEF_STMT to the list of statements to be replaced by the pattern statements. (vect_handle_widen_mult_by_const): Don't check TYPE_OUT. testsuite/ChangeLog: PR tree-optimization/50635 * gcc.dg/vect/pr50635.c: New test. Index: testsuite/gcc.dg/vect/pr50635.c === --- testsuite/gcc.dg/vect/pr50635.c (revision 0) +++ testsuite/gcc.dg/vect/pr50635.c (revision 0) @@ -0,0 +1,21 @@ +/* { dg-do compile } */ + +typedef signed long int32_t; +typedef char int8_t; + +void f0a(int32_t * result, int32_t * arg1, int8_t * arg2, int32_t temp_3) +{ + int idx; + for (idx=0;idx10;idx += 1) +{ + int32_t temp_4; + int32_t temp_12; + + temp_4 = (-2 arg2[idx]) + temp_3; + temp_12 = -2 * arg2[idx] + temp_4; + result[idx] = temp_12; +} +} + +/* { dg-final { cleanup-tree-dump vect } } */ + Index: tree-vect-patterns.c === --- tree-vect-patterns.c(revision 179718) +++ tree-vect-patterns.c(working copy) @@ -388,6 +388,7 @@ vect_handle_widen_mult_by_const (gimple stmt, tree || TREE_TYPE (gimple_assign_lhs (new_stmt)) != new_type) return false; + VEC_safe_push (gimple, heap, *stmts, def_stmt); *oprnd = gimple_assign_lhs (new_stmt); } else @@ -1424,8 +1425,6 @@ vect_pattern_recog_1 (vect_recog_func_ptr vect_rec { /* No need to check target support (already checked by the pattern recognition function). */ - if (type_out) - gcc_assert (VECTOR_MODE_P (TYPE_MODE (type_out))); pattern_vectype = type_out ? type_out : type_in; } else
[committed] small change (was: Re: [Patch, Fortran] PR 35831: [F95] Shape mismatch check missing for dummy procedure argument)
On Tuesday 04 October 2011 20:54:21 Janus Weil wrote:
> The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk?
> The patch is basically OK.
> Otherwise I'll just start by committing the patch as posted ...
> Just did so (r179520).

Hello,

I've just committed the following amendment as revision 179726.

Mikael

Index: interface.c
===================================================================
--- interface.c (revision 179725)
+++ interface.c (revision 179726)
@@ -1098,7 +1098,7 @@ check_dummy_characteristics (gfc_symbol *s1, gfc_s
         case 1:
         case -3:
           snprintf (errmsg, err_len, "Shape mismatch in dimension %i of "
-                    "argument '%s'", i, s1->name);
+                    "argument '%s'", i + 1, s1->name);
           return FAILURE;

         case -2:
Index: ChangeLog
===================================================================
--- ChangeLog   (revision 179725)
+++ ChangeLog   (revision 179726)
@@ -1,3 +1,8 @@
+2011-10-09  Mikael Morin  mikael.mo...@sfr.fr
+
+       * interface.c (check_dummy_characteristics): Count dimensions starting
+       from one in diagnostic.
+
 2011-10-09  Tobias Burnus  bur...@net-b.de

        * Make-lang.in (F95_PARSER_OBJS, GFORTRAN_TRANS_DEPS): Add
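The effect of the amendment is simply that the zero-based loop index is shifted before it reaches the user, since Fortran dimensions are counted from one. A minimal sketch, assuming a hypothetical helper name `format_shape_mismatch` and a plain `const char *` in place of the gfc_symbol name field:

```c
#include <stdio.h>

/* Format the diagnostic for a mismatch found at zero-based index
   DIM_INDEX.  The "+ 1" converts the internal C index to the
   one-based dimension number a Fortran user expects.  */
static int
format_shape_mismatch (char *errmsg, size_t err_len, int dim_index,
                       const char *arg_name)
{
  return snprintf (errmsg, err_len,
                   "Shape mismatch in dimension %i of argument '%s'",
                   dim_index + 1, arg_name);
}
```

Called with `dim_index` 0 and the argument name `a`, the buffer reads "Shape mismatch in dimension 1 of argument 'a'".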
Re: [Patch, fortran] [00/14] PR fortran/50420 Support coarray subreferences
On 07.10.2011 16:38, Mikael Morin wrote:
> The full patchset has passed the fortran testsuite successfully.
> OK for trunk?

OK for the whole patch set. Thanks for finding and fixing the issue!

Tobias

> Patches layout
> 01..04/14: Add support for non-full arrays in descriptor initialization code.
> 05..09/14: Make walk_coarray initialize the scalarizer structs properly to
>            accept expression with subreferences.
> 10..11/14: Fix corank checking
> 12/14: Accept coarray subreferences in simplify_cobound
> 13/14: Fix gfc_build_array_type
> 14/14: Fix gfc_build_array_ref
[committed] Fix bogus e-mail address in ChangeLogs
Hello, it seems that a bogus e-mail address (mistake of mine in the first place) has been promoted lately to being the main way to (miss-)communicate with me. Committed as revision 179727. Mikael Index: ChangeLog === --- ChangeLog (révision 179726) +++ ChangeLog (révision 179727) @@ -380,7 +380,7 @@ * symbol.c (check_conflict): Allow threadprivate attribute with FL_PROCEDURE if proc_pointer. -2011-08-25 Mikael Morin mikael.mo...@gcc.gnu.org +2011-08-25 Mikael Morin mik...@gcc.gnu.org PR fortran/50050 * expr.c (gfc_free_shape): Do nothing if shape is NULL. @@ -430,7 +430,7 @@ * cpp.c (gfc_cpp_init): Force BUILTINS_LOCATION for tokens defined in cpp_define_builtins. -2011-08-22 Mikael Morin mikael.mo...@gcc.gnu.org +2011-08-22 Mikael Morin mik...@gcc.gnu.org PR fortran/50050 * gfortran.h (gfc_clear_shape, gfc_free_shape): New prototypes. Index: ChangeLog-2010 === --- ChangeLog-2010 (révision 179726) +++ ChangeLog-2010 (révision 179727) @@ -71,7 +71,7 @@ substring references. (gfc_check_same_strlen): Use gfc_var_strlen. -2010-12-23 Mikael Morin mikael.mo...@gcc.gnu.org +2010-12-23 Mikael Morin mik...@gcc.gnu.org PR fortran/46978 Revert part of revision 164112
Re: [Patch, Fortran, committed] PR 50585: [4.6/4.7 Regression] ICE with assumed length character array argument
On 08.10.2011 11:51, Janus Weil wrote: Thanks! What's about the .texi change for -fwhole-file? Will do. Should I include a note about deprecation? And if yes, do you have a suggestion for the wording? How about the following attachment? Tobias diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi index 41fee67..cae114a 100644 --- a/gcc/fortran/invoke.texi +++ b/gcc/fortran/invoke.texi @@ -164,7 +164,7 @@ and warnings}. @item Code Generation Options @xref{Code Gen Options,,Options for code generation conventions}. @gccoptlist{-fno-automatic -ff2c -fno-underscoring @gol --fwhole-file -fsecond-underscore @gol +-fno-whole-file -fsecond-underscore @gol -fbounds-check -fcheck-array-temporaries -fmax-array-constructor =@var{n} @gol -fcheck=@var{all|array-temps|bounds|do|mem|pointer|recursion} @gol -fcoarray=@var{none|single|lib} -fmax-stack-var-size=@var{n} @gol @@ -1225,12 +1225,13 @@ in the source, even if the names as seen by the linker are mangled to prevent accidental linking between procedures with incompatible interfaces. -@item -fwhole-file -@opindex @code{fwhole-file} -By default, GNU Fortran parses, resolves and translates each procedure -in a file separately. Using this option modifies this such that the -whole file is parsed and placed in a single front-end tree. During -resolution, in addition to all the usual checks and fixups, references +@item -fno-whole-file +@opindex @code{fno-whole-file} +This flag causes the compiler to resolve and translate each procedure in +a file separately. + +By default, the whole file is parsed and placed in a single front-end tree. +During resolution, in addition to all the usual checks and fixups, references to external procedures that are in the same file effect resolution of that procedure, if not already done, and a check of the interfaces. The dependences are resolved by changing the order in which the file is @@ -1238,6 +1239,8 @@ translated into the backend tree. 
Thus, a procedure that is referenced is translated before the reference and the duplication of backend tree declarations eliminated. +The @option{-fno-whole-file} option is deprecated and may lead to wrong code. + @item -fsecond-underscore @opindex @code{fsecond-underscore} @cindex underscore
[committed] More e-mail address fixes in ChangeLogs: dead e-mail address
That address is long dead. Committed as revision 179728. Mikael Index: ChangeLog-2008 === --- ChangeLog-2008 (révision 179727) +++ ChangeLog-2008 (révision 179728) @@ -45,7 +45,7 @@ * trans-intrinsic.c (conv_same_strlen_check): New method. (gfc_conv_intrinsic_merge): Call it here to actually do the check. -2008-12-15 Mikael Morin mikael.mo...@tele2.fr +2008-12-15 Mikael Morin mik...@gcc.gnu.org PR fortran/38487 * dependency.c (gfc_is_data_pointer): New function. @@ -53,7 +53,7 @@ in the pointer case. (gfc_check_dependency): Use gfc_is_data_pointer. -2008-12-15 Mikael Morin mikael.mo...@tele2.fr +2008-12-15 Mikael Morin mik...@gcc.gnu.org PR fortran/38113 * error.c (show_locus): Start counting columns at 0. @@ -98,13 +98,13 @@ * invoke.texi (idirafter): New. (no-range-check): Fixed entry in option-index. -2008-12-09 Mikael Morin mikael.mo...@tele2.fr +2008-12-09 Mikael Morin mik...@gcc.gnu.org PR fortran/37469 * expr.c (find_array_element): Simplify array bounds. Assert that both bounds are constant expressions. -2008-12-09 Mikael Morin mikael.mo...@tele2.fr +2008-12-09 Mikael Morin mik...@gcc.gnu.org PR fortran/35983 * trans-expr.c (gfc_trans_subcomponent_assign): @@ -158,7 +158,7 @@ * trans-types.c (gfc_sym_type,gfc_get_function_type): Support procedure pointers as function result. -2008-12-01 Mikael Morin mikael.mo...@tele2.fr +2008-12-01 Mikael Morin mik...@gcc.gnu.org PR fortran/38252 * parse.c (parse_spec): Skip statement order check in case @@ -193,7 +193,7 @@ * module.c (gfc_dump_module): Report error on unlink only if errno != ENOENT. -2008-11-25 Mikael Morin mikael.mo...@tele2.fr +2008-11-25 Mikael Morin mik...@gcc.gnu.org PR fortran/36463 * expr.c (replace_symbol): Don't replace the symtree @@ -218,7 +218,7 @@ * arith.c (gfc_check_real_range): Add mpfr_check_range. * simplify.c (gfc_simplify_nearest): Add mpfr_check_range. 
-2008-11-24 Mikael Morin mikael.mo...@tele2.fr +2008-11-24 Mikael Morin mik...@gcc.gnu.org PR fortran/38184 * simplify.c (is_constant_array_expr): Return true instead of false @@ -308,7 +308,7 @@ * module.c (load_equiv): Regression fix; check that equivalence members come from the same module only. -2008-11-16 Mikael Morin mikael.mo...@tele2.fr +2008-11-16 Mikael Morin mik...@gcc.gnu.org PR fortran/35681 * dependency.c (gfc_check_argument_var_dependency): Add @@ -333,7 +333,7 @@ * dependency.h (enum gfc_dep_check): New enum. (gfc_check_fncall_dependency): Update prototype. -2008-11-16 Mikael Morin mikael.mo...@tele2.fr +2008-11-16 Mikael Morin mik...@gcc.gnu.org PR fortran/37992 * gfortran.h (gfc_namespace): Added member old_cl_list, @@ -518,7 +518,7 @@ * fortran/check.c (gfc_check_random_seed): Check PUT size at compile time. -2008-10-31 Mikael Morin mikael.mo...@tele2.fr +2008-10-31 Mikael Morin mik...@gcc.gnu.org PR fortran/35840 * expr.c (gfc_reduce_init_expr): New function, containing checking code @@ -528,7 +528,7 @@ checking that the expression is a constant. * match.h (gfc_reduce_init_expr): Prototype added. -2008-10-31 Mikael Morin mikael.mo...@tele2.fr +2008-10-31 Mikael Morin mik...@gcc.gnu.org PR fortran/35820 * resolve.c (gfc_count_forall_iterators): New function. @@ -548,7 +548,7 @@ gfc_simplify_ifix, gfc_simplify_idint, simplify_nint): Update function calls to include locus. -2008-10-30 Mikael Morin mikael.mo...@tele2.fr +2008-10-30 Mikael Morin mik...@gcc.gnu.org PR fortran/37903 * trans-array.c (gfc_trans_create_temp_array): If n is less @@ -563,7 +563,7 @@ possible. Calculate the translation from loop variables to array indices if an array constructor. 
-2008-10-30 Mikael Morin mikael.mo...@tele2.fr +2008-10-30 Mikael Morin mik...@gcc.gnu.org PR fortran/37749 * trans-array.c (gfc_trans_create_temp_array): If size is NULL Index: ChangeLog-2009 === --- ChangeLog-2009 (révision 179727) +++ ChangeLog-2009 (révision 179728) @@ -3519,7 +3519,7 @@ * intrinsic.texi (MALLOC): Make example more portable. -2009-02-13 Mikael Morin mikael.mo...@tele2.fr +2009-02-13 Mikael Morin mik...@gcc.gnu.org PR fortran/38259 * module.c (gfc_dump_module,gfc_use_module): Add module @@ -3566,7 +3566,7 @@ * invoke.texi (RANGE): RANGE also takes INTEGER arguments. -2009-01-19 Mikael Morin mikael.mo...@tele2.fr +2009-01-19 Mikael Morin mik...@gcc.gnu.org PR fortran/38859 * simplify.c (simplify_bound): Don't use array specification @@ -3656,7 +3656,7 @@ is substituted by a function. * resolve.c (check_host_association): Return if above is set. -2009-01-04 Mikael Morin mikael.mo...@tele2.fr +2009-01-04 Mikael Morin mik...@gcc.gnu.org PR fortran/35681 * ChangeLog-2008: Fix function
[PATCH] RFC: Cache LTO streamer mappings
From: Andi Kleen a...@linux.intel.com

Currently the LTO file mappings use a simple one-off cache. This doesn't match the access pattern very well.

This patch adds a real LRU of LTO mappings with a total limit. Each file is completely mapped now instead of only specific sections. This addresses one of the FIXME comments in LTO.

The limit is 256GB on 64bit and 256MB on 32bit. The limit can be temporarily exceeded by a single file. The whole file has to fit into the address space now. This may increase the address space requirements a bit.

I originally wrote this in an attempt to minimize fragmentation of the virtual memory map, but it didn't make too much difference for that because it was all caused by GGC.

Also, on my fairly large builds it didn't make a measurable compile time difference, probably because it was shadowed by other much slower passes. That is why I'm just sending it as an RFC. It certainly complicates the code somewhat.

Maybe if people have other LTO builds they could try whether it makes a difference for them.

Is it still a good idea?

Passes a full LTO bootstrap plus test suite on x86-64-linux.

gcc/lto/:

2011-10-05  Andi Kleen  a...@linux.intel.com

    * lto.c (list_head, mapped_file): Add.
    (page_mask): Rename to page_size.
    (MB, GB, max_mapped, cur_mapped, mapped_lru, list_add, list_del): Add.
    (mf_lru_enforce_limit, mf_hashtable, mf_lru_finish_cache, mf_eq): Add
    (mf_hash, mf_lookup_or_create)
    (lto_read_section_data): Split into two ifdef versions.  Implement
    version using LRU cache.  Add more error checks.
    (mf_lru_finish_cached): Add dummy in ifdef.
    (free_section_data): Rewrite for LRU.
    (read_cgraph_and_symbols): Call mf_lru_finish_cache.
--- gcc/lto/lto.c | 292 ++--- 1 files changed, 238 insertions(+), 54 deletions(-) diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c index a77eeb4..29dc3b8 100644 --- a/gcc/lto/lto.c +++ b/gcc/lto/lto.c @@ -1141,6 +1141,30 @@ lto_file_finalize (struct lto_file_decl_data *file_data, lto_file *file) lto_free_section_data (file_data, LTO_section_decls, NULL, data, len); } +/* A list node or head */ +struct list_head + { +struct list_head *next; /* Next node */ +struct list_head *prev; /* Prev node */ + }; + +/* Cache of mapped files */ +struct mapped_file + { +struct list_head lru; /* LRU list. Must be first. */ +char *map; /* Mapping of the file */ +size_t size; /* Size of mapping (rounded up) */ +int refcnt;/* Number of users */ +const char *filename; /* File name */ + }; + +struct lwstate +{ + lto_file *file; + struct lto_file_decl_data **file_data; + int *count; +}; + /* Finalize FILE_DATA in FILE and increase COUNT. */ static int @@ -1200,65 +1224,213 @@ lto_file_read (lto_file *file, FILE *resolution_file, int *count) #endif #if LTO_MMAP_IO + /* Page size of machine is used for mmap and munmap calls. */ -static size_t page_mask; -#endif +static size_t page_size; + +#define MB (1UL 20) +#define GB (1UL 30) + +/* Limit of mapped files */ +static unsigned HOST_WIDE_INT max_mapped = sizeof(void *) 4 ? 256*GB : 256*MB; + +/* Total size of currently mapped files */ +static unsigned HOST_WIDE_INT cur_mapped; + +/* LRU of mapped files */ +static struct list_head mapped_lru = { mapped_lru, mapped_lru }; + +/* Add NODE into list HEAD */ + +static void +list_add(struct list_head *node, struct list_head *head) +{ + struct list_head *prev = head; + struct list_head *next = head-next; + + next-prev = node; + node-next = next; + node-prev = prev; + prev-next = node; +} + +/* Remove NODE from list. 
*/ + +static void +list_del(struct list_head *node) +{ + struct list_head *prev = node-prev; + struct list_head *next = node-next; + + if (!next !prev) +return; + next-prev = prev; + prev-next = next; + node-next = NULL; + node-prev = NULL; +} + +/* Enforce the global LRU limit MAX when the commitment changes by INCREMENT. */ + +static void +mf_lru_enforce_limit (unsigned HOST_WIDE_INT increment, unsigned HOST_WIDE_INT max) +{ + struct mapped_file *mf; + unsigned HOST_WIDE_INT new_mapped = cur_mapped + increment; + struct list_head *node, *prev; + + for (node = mapped_lru.prev; new_mapped max node != mapped_lru; node = prev) +{ + prev = node-prev; + mf = (struct mapped_file *) node; + if (mf-refcnt 0) +continue; + munmap (mf-map, mf-size); + mf-map = NULL; + new_mapped -= mf-size; + list_del (node); +} + + cur_mapped = new_mapped; +} + +/* Hash table of mapped_files */ +static htab_t mf_hashtable; + +/* Free all mappings in the hash table. */ + +static void +mf_lru_finish_cache (void) +{ + mf_lru_enforce_limit (0, 0); + gcc_assert (mapped_lru.next == mapped_lru.prev); + htab_delete (mf_hashtable); + mf_hashtable = NULL; +} + +/* Compare hash table entries A and B. */ +
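The list handling and the eviction walk in the patch can be exercised on their own. The following is a minimal sketch, not the patch itself: it assumes the comparison and member-access operators lost in the mail archive (`->`, `&&`, `>`), replaces `struct mapped_file` with a hypothetical `struct entry`, adds a `list_init` helper that is not in the patch, and omits the actual `munmap` call so that only the bookkeeping remains.

```c
#include <stddef.h>

/* A list node or head of a circular doubly linked list.  */
struct list_head
{
  struct list_head *next;
  struct list_head *prev;
};

/* Hypothetical helper: make HEAD an empty list pointing at itself.  */
static void
list_init (struct list_head *head)
{
  head->next = head;
  head->prev = head;
}

/* Add NODE right after HEAD, i.e. at the most recently used end.  */
static void
list_add (struct list_head *node, struct list_head *head)
{
  struct list_head *next = head->next;

  next->prev = node;
  node->next = next;
  node->prev = head;
  head->next = node;
}

/* Remove NODE from its list; a node that is not linked is left alone.  */
static void
list_del (struct list_head *node)
{
  if (!node->next && !node->prev)
    return;
  node->prev->next = node->next;
  node->next->prev = node->prev;
  node->next = NULL;
  node->prev = NULL;
}

/* Hypothetical cache entry.  The list node must be the first member so
   that a node pointer can be cast back to the entry.  */
struct entry
{
  struct list_head lru;
  size_t size;
  int refcnt;
};

/* Walk from the least recently used end and drop unreferenced entries
   until TOTAL no longer exceeds MAX.  Returns the new total.  A real
   implementation would release the mapping where noted.  */
static size_t
enforce_limit (struct list_head *head, size_t total, size_t max)
{
  struct list_head *node, *prev;

  for (node = head->prev; total > max && node != head; node = prev)
    {
      struct entry *e = (struct entry *) node;

      prev = node->prev;
      if (e->refcnt > 0)
        continue;               /* Still in use, cannot be evicted.  */
      /* The mapping would be released here.  */
      total -= e->size;
      list_del (node);
    }
  return total;
}
```

As in the patch description, entries that are still referenced are skipped, so the limit can remain exceeded while users hold on to a mapping.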
Re: [committed] More e-mail address fixes in ChangeLogs: dead e-mail address
On Sun, Oct 9, 2011 at 7:04 PM, Mikael Morin mikael.mo...@sfr.fr wrote:
> That address is long dead.
> Committed as revision 179728.

We usually don't retroactively change ChangeLogs this way. Please refrain from making further changes like this.

Thanks,
Richard.

> Mikael
[Patch, Fortran] Fix PR 50564
Hello world,

the attached patch fixes the PR by removing common function elimination in FORALL statements.

In the course of fixing this PR, I had originally fixed the ICE, only to find that the transformation (where f is a function)

forall (i=1:2)
  a(i) = f(i) + f(i)
end forall

to

forall (i=1:2)
  tmp = f(i)
  a(i) = tmp
end forall

did the Wrong Thing. Oh well...

Regression-tested. OK for trunk?

Thomas

2011-10-09  Thomas Koenig  tkoe...@gcc.gnu.org

    PR fortran/50564
    * frontend-passes.c (forall_level): New variable.
    (cfe_register_funcs): Don't register functions if we
    are within a forall loop.
    (optimize_namespace): Set forall_level to 0 before entry.
    (gfc_code_walker): Increase/decrease forall_level.

2011-10-09  Thomas Koenig  tkoe...@gcc.gnu.org

    PR fortran/50564
    * gfortran.dg/forall_15.f90: New test case.

Index: frontend-passes.c
===================================================================
--- frontend-passes.c   (revision 179709)
+++ frontend-passes.c   (working copy)
@@ -62,6 +62,10 @@ static gfc_code *inserted_block, **changed_stateme

 gfc_namespace *current_ns;

+/* If we are within any forall loop.  */
+
+static int forall_level;
+
 /* Entry point - run all passes for a namespace.  So far, only an
    optimization pass is run.  */

@@ -165,6 +169,12 @@ cfe_register_funcs (gfc_expr **e, int *walk_subtre
           || (*e)->ts.u.cl->length->expr_type != EXPR_CONSTANT))
     return 0;

+  /* We don't do function elimination within FORALL statements, it can
+     lead to wrong-code in certain circumstances.  */
+
+  if (forall_level > 0)
+    return 0;
+
   /* If we don't know the shape at compile time, we create an allocatable
      temporary variable to hold the intermediate result, but only if
      allocation on assignment is active.  */
@@ -493,6 +503,7 @@ optimize_namespace (gfc_namespace *ns)
 {

   current_ns = ns;
+  forall_level = 0;

   gfc_code_walker (&ns->code, convert_do_while, dummy_expr_callback, NULL);
   gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
@@ -1193,6 +1204,7 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t code
                 WALK_SUBEXPR (fa->end);
                 WALK_SUBEXPR (fa->stride);
               }
+            forall_level ++;
             break;
           }

@@ -1335,6 +1347,10 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t code
               WALK_SUBEXPR (b->expr2);
               WALK_SUBCODE (b->next);
             }
+
+          if (co->op == EXEC_FORALL || co->op == EXEC_DO_CONCURRENT)
+            forall_level --;
+
         }
     }
   return 0;

! { dg-do run }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 50564 - this used to ICE with front end optimization.
! Original test case by Andrew Benson.
program test
  implicit none
  double precision, dimension(2) :: timeSteps, control
  integer :: iTime
  double precision :: ratio
  double precision :: a

  ratio = 0.7d0
  control(1) = ratio**(dble(1)-0.5d0)-ratio**(dble(1)-1.5d0)
  control(2) = ratio**(dble(2)-0.5d0)-ratio**(dble(2)-1.5d0)
  forall(iTime=1:2)
     timeSteps(iTime)=ratio**(dble(iTime)-0.5d0)-ratio**(dble(iTime)-1.5d0)
  end forall
  if (any(abs(timesteps - control) > 1d-10)) call abort
  ! Make sure we still do the front-end optimization after a forall
  a = cos(ratio)*cos(ratio) + sin(ratio)*sin(ratio)
  if (abs(a-1.d0) > 1d-10) call abort
end program test
! { dg-final { scan-tree-dump-times "__builtin_cos" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sin" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }
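Why the transformation goes wrong can be seen by emulating FORALL semantics by hand: each statement inside a FORALL construct is executed for all index values before the next statement starts, so a scalar temporary only retains the value from the last index. The sketch below is an illustration in C, not compiler output; the function `f` is an arbitrary stand-in, and the second assignment is written as `tmp + tmp`, which is an assumption about the intended form of the rewritten statement.

```c
/* Arbitrary stand-in for the Fortran function f.  */
static int
f (int i)
{
  return 10 * i;
}

/* The original FORALL: a(i) = f(i) + f(i) for i = 1, 2.  */
static void
forall_original (int a[2])
{
  for (int i = 1; i <= 2; i++)
    a[i - 1] = f (i) + f (i);
}

/* The FORALL after naive common function elimination with a scalar
   temporary.  FORALL semantics run the first statement for every
   index, then the second statement for every index.  */
static void
forall_naive_cfe (int a[2])
{
  int tmp = 0;

  for (int i = 1; i <= 2; i++)
    tmp = f (i);                /* tmp ends up as f(2).  */
  for (int i = 1; i <= 2; i++)
    a[i - 1] = tmp + tmp;       /* Every element uses f(2).  */
}
```

With this choice of `f`, the original form fills the array with 20 and 40, whereas the transformed form stores 40 in both elements.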
Avoid double mangling at WHOPR
Hi,

WHOPR currently produces symbols of the form local_static.1234.43124. This is because everything gets mangled at WPA time and then again at ltrans time. This patch simply avoids the second mangling. This saves some space and makes WHOPR and non-WHOPR symbol tables more directly comparable.

Bootstrapped/regtested x86_64-linux, also tested with Mozilla LTO, OK?

Honza

    * lto.c (lto_register_var_decl_in_symtab,
    lto_register_function_decl_in_symtab): Do not mangle at ltrans time.
    * lto-lang.c (lto_set_decl_assembler_name): Likewise.

Index: lto/lto.c
===================================================================
--- lto/lto.c   (revision 179664)
+++ lto/lto.c   (working copy)
@@ -604,7 +604,7 @@ lto_register_var_decl_in_symtab (struct

   /* Variable has file scope, not local. Need to ensure static variables
      between different files don't clash unexpectedly.  */
-  if (!TREE_PUBLIC (decl)
+  if (!TREE_PUBLIC (decl) && !flag_ltrans
       && !((context = decl_function_context (decl))
            && auto_var_in_fn_p (decl, context)))
     {
@@ -646,7 +646,7 @@ lto_register_function_decl_in_symtab (st
 {
   /* Need to ensure static entities between different files
      don't clash unexpectedly.  */
-  if (!TREE_PUBLIC (decl))
+  if (!TREE_PUBLIC (decl) && !flag_ltrans)
     {
       /* We must not use the DECL_ASSEMBLER_NAME macro here, as it
          may set the assembler name where it was previously empty.  */
Index: lto/lto-lang.c
===================================================================
--- lto/lto-lang.c      (revision 179664)
+++ lto/lto-lang.c      (working copy)
@@ -954,7 +954,7 @@ lto_set_decl_assembler_name (tree decl)
      TREE_PUBLIC, to avoid conflicts between individual files.  */
   tree id;

-  if (TREE_PUBLIC (decl))
+  if (TREE_PUBLIC (decl) || flag_ltrans)
     id = targetm.mangle_decl_assembler_name (decl, DECL_NAME (decl));
   else
     {
Re: [committed] More e-mail address fixes in ChangeLogs: dead e-mail address
On Sunday 09 October 2011 19:30:20 Richard Guenther wrote:
> We usually don't retroactively change ChangeLogs this way.

On the other hand, ChangeLogs usually don't need to be changed.

> Please refrain from making further changes like this.

OK, I will.
Is there a reason for such a policy?

Mikael
[patch bfd]: Some adjustments on coff-link.c
Hello,

this patch improves the COFF linker's handling of undefined weak symbols and avoids writing symbols for discarded sections (if the linker says so) and for IR-generated sections.

ChangeLog

2011-10-09  Kai Tietz  kti...@redhat.com

        * cofflink.c (coff_link_check_ar_symbols): Allow adding of
        archive-file if symbol was undefined weak.
        (_bfd_coff_write_global_sym): Skip write for symbol in discarded
        section, or if section is coming from IR, or if input section has
        explicit SEC_EXCLUDE set.
        (_bfd_coff_generic_relocate_section): For undefined weak symbol
        and replacing it by another undefined weak, mark section as
        absolute.

Regression tested for i686-w64-mingw32, x86_64-w64-mingw32, and i686-pc-cygwin. Ok to apply?

Regards,
Kai

Index: src/bfd/cofflink.c
===================================================================
--- src.orig/bfd/cofflink.c
+++ src/bfd/cofflink.c
@@ -242,7 +242,8 @@ coff_link_check_ar_symbols (bfd *abfd,
              COFF linkers do not bring in an object file which defines
              it.  */
           if (h != (struct bfd_link_hash_entry *) NULL
-              && h->type == bfd_link_hash_undefined)
+              && (h->type == bfd_link_hash_undefined
+                  || h->type == bfd_link_hash_undefweak))
             {
               if (!(*info->callbacks
                     ->add_archive_element) (info, abfd, name, &subsbfd))
@@ -2527,6 +2528,7 @@ _bfd_coff_write_global_sym (struct bfd_h
   bfd_size_type symesz;
   unsigned int i;
   file_ptr pos;
+  asection *input_sec;

   output_bfd = finfo->output_bfd;

@@ -2547,6 +2549,21 @@ _bfd_coff_write_global_sym (struct bfd_h
                                    h->root.root.string, FALSE, FALSE)
                   == NULL))))
     return TRUE;
+  else if (h->indx != -2
+           && (h->root.type == bfd_link_hash_defined
+               || h->root.type == bfd_link_hash_defweak)
+           && ((finfo->info->strip_discarded
+                && !bfd_is_abs_section (h->root.u.def.section)
+                && bfd_is_abs_section (h->root.u.def.section->output_section))
+               || (h->root.u.def.section->owner != NULL
+                   && (h->root.u.def.section->owner->flags & BFD_PLUGIN) != 0)))
+    return TRUE;
+  else if (h->indx != -2
+           && (h->root.type == bfd_link_hash_undefined
+               || h->root.type == bfd_link_hash_undefweak)
+           && h->root.u.undef.abfd != NULL
+           && (h->root.u.undef.abfd->flags & BFD_PLUGIN) != 0)
+    return TRUE;

   switch (h->root.type)
     {
@@ -2560,26 +2577,37 @@ _bfd_coff_write_global_sym (struct bfd_h
     case bfd_link_hash_undefweak:
       isym.n_scnum = N_UNDEF;
       isym.n_value = 0;
+      input_sec = bfd_und_section_ptr;
       break;

     case bfd_link_hash_defined:
     case bfd_link_hash_defweak:
-      {
-        asection *sec;
+      input_sec = h->root.u.def.section;
+      if (input_sec->output_section != NULL)
+        {
+          asection *sec;

-        sec = h->root.u.def.section->output_section;
-        if (bfd_is_abs_section (sec))
-          isym.n_scnum = N_ABS;
-        else
-          isym.n_scnum = sec->target_index;
-        isym.n_value = (h->root.u.def.value
-                        + h->root.u.def.section->output_offset);
-        if (! obj_pe (finfo->output_bfd))
-          isym.n_value += sec->vma;
-      }
+          sec = h->root.u.def.section->output_section;
+          if (bfd_is_abs_section (sec))
+            isym.n_scnum = N_ABS;
+          else
+            isym.n_scnum = sec->target_index;
+          isym.n_value = (h->root.u.def.value
+                          + h->root.u.def.section->output_offset);
+          if (! obj_pe (finfo->output_bfd))
+            isym.n_value += sec->vma;
+        }
+      else
+        {
+          BFD_ASSERT (input_sec->owner == NULL);
+          isym.n_scnum = N_UNDEF;
+          isym.n_value = 0;
+          input_sec = bfd_und_section_ptr;
+        }
       break;

     case bfd_link_hash_common:
+      input_sec = h->root.u.c.p->section;
       isym.n_scnum = N_UNDEF;
       isym.n_value = h->root.u.c.size;
       break;
@@ -2589,6 +2617,9 @@ _bfd_coff_write_global_sym (struct bfd_h
       return TRUE;
     }

+  if ((input_sec->flags & SEC_EXCLUDE) != 0)
+    return TRUE;
+
   if (strlen (h->root.root.string) <= SYMNMLEN)
     strncpy (isym._n._n_name, h->root.root.string, SYMNMLEN);
   else
@@ -3013,7 +3044,8 @@ _bfd_coff_generic_relocate_section (bfd
                     h->auxbfd->tdata.coff_obj_data->sym_hashes[
                     h->aux->x_sym.x_tagndx.l];

-                  if (!h2 || h2->root.type == bfd_link_hash_undefined)
+                  if (!h2 || h2->root.type == bfd_link_hash_undefined
+                      || h2->root.type == bfd_link_hash_undefweak)
                     {
                       sec = bfd_abs_section_ptr;
                       val = 0;
Re: [Patch] Don't ignore testsuite errors in Makefile
On Sun, Oct 09, 2011 at 04:32:12PM +0200, Mikael Morin wrote:
> currently, the testsuite return value is ignored by make. It is a little
> annoying if one wants to check automatically for regressions as we have to
> parse the testsuite output.
> This patch reverts to the normal make behaviour, which is to not ignore
> commands' return values.
> Note: As a result the -k flag has to be added to the make command line if
> one wants the tests to continue after one failure.
> OK for trunk?

Please no. This is a very bad idea, most of the testsuites on many architectures contain some FAILs and a failure from check-parallel-% would mean the *.log/*.sum files would be never merged in that case.

If you really need to propagate the return value (I fail to see how it is useful), then you should e.g. store the $? value from $(RUNTEST) in check-parallel-% into some file in that directory and have the parallelization goal after the merging collect those from the individual files and OR them all together into the final return value.

> 2011-10-09  Mikael Morin  mikael.mo...@sfr.fr
>
>         * Makefile.in (check-parallel-%): Don't ignore testsuite errors.

> Index: Makefile.in
> ===================================================================
> --- Makefile.in (revision 179710)
> +++ Makefile.in (working copy)
> @@ -5116,10 +5124,10 @@ $(patsubst %,%-subtargets,$(lang_checks_paralleliz
>  # Otherwise check-$lang isn't parallelized and runtest is invoked just with
>  # the $(RUNTESTFLAGS) arguments.
>  check-parallel-% : site.exp
> -       -test -d plugin || mkdir plugin
> -       -test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
> +       test -d plugin || mkdir plugin
> +       test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
>         test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir $(TESTSUITEDIR)/$(check_p_subdir)
> -       -(rootme=`${PWD_COMMAND}`; export rootme; \
> +       (rootme=`${PWD_COMMAND}`; export rootme; \
>         srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
>         cd $(TESTSUITEDIR)/$(check_p_subdir); \
>         rm -f tmp-site.exp; \

        Jakub
RE: Intrinsics for N2965: Type traits and base classes
Here is a new diff that works for non-class types (fixing Benjamin's failing test), fixes some spacing and alphabetization, and doesn't inadvertently break the __underlying_type trait.

Index: libstdc++-v3/include/tr2/type_traits
===================================================================
--- libstdc++-v3/include/tr2/type_traits        (revision 0)
+++ libstdc++-v3/include/tr2/type_traits        (revision 0)
@@ -0,0 +1,96 @@
+// TR2 type_traits -*- C++ -*-
+
+// Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
+// Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file tr2/type_traits
+ *  This is a TR2 C++ Library header.
+ */
+
+#ifndef _GLIBCXX_TR2_TYPE_TRAITS
+#define _GLIBCXX_TR2_TYPE_TRAITS 1
+
+#pragma GCC system_header
+#include <type_traits>
+#include <bits/c++config.h>
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+namespace tr2
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  /**
+   * @defgroup metaprogramming Type Traits
+   * @ingroup utilities
+   *
+   * Compile time type transformation and information.
+   * @{
+   */
+
+  template<typename... _Elements> struct typelist;
+  template<>
+    struct typelist<>
+    {
+      typedef std::true_type empty;
+    };
+
+  template<typename _First, typename... _Rest>
+    struct typelist<_First, _Rest...>
+    {
+      struct first
+      {
+        typedef _First type;
+      };
+
+      struct rest
+      {
+        typedef typelist<_Rest...> type;
+      };
+
+      typedef std::false_type empty;
+    };
+
+  // Sequence abstraction metafunctions default to looking in the type
+  template<typename T> struct first : public T::first {};
+  template<typename T> struct rest : public T::rest {};
+  template<typename T> struct empty : public T::empty {};
+
+
+  template<typename T>
+    struct bases
+    {
+      typedef typelist<__bases(T)...> type;
+    };
+
+  template<typename T>
+    struct direct_bases
+    {
+      typedef typelist<__direct_bases(T)...> type;
+    };
+
+_GLIBCXX_END_NAMESPACE_VERSION
+}
+}
+
+#endif // _GLIBCXX_TR2_TYPE_TRAITS
Index: gcc/c-family/c-common.c
===================================================================
--- gcc/c-family/c-common.c     (revision 178892)
+++ gcc/c-family/c-common.c     (working copy)
@@ -423,6 +423,7 @@
   { "__asm__",          RID_ASM,        0 },
   { "__attribute",      RID_ATTRIBUTE,  0 },
   { "__attribute__",    RID_ATTRIBUTE,  0 },
+  { "__bases",          RID_BASES, D_CXXONLY },
   { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
   { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
   { "__builtin_offsetof", RID_OFFSETOF, 0 },
@@ -433,6 +434,7 @@
   { "__const",          RID_CONST,      0 },
   { "__const__",        RID_CONST,      0 },
   { "__decltype",       RID_DECLTYPE,   D_CXXONLY },
+  { "__direct_bases",   RID_DIRECT_BASES, D_CXXONLY },
   { "__extension__",    RID_EXTENSION,  0 },
   { "__func__",         RID_C99_FUNCTION_NAME, 0 },
   { "__has_nothrow_assign", RID_HAS_NOTHROW_ASSIGN, D_CXXONLY },
Index: gcc/c-family/c-common.h
===================================================================
--- gcc/c-family/c-common.h     (revision 178892)
+++ gcc/c-family/c-common.h     (working copy)
@@ -129,12 +129,13 @@
   RID_CONSTCAST, RID_DYNCAST, RID_REINTCAST, RID_STATCAST,

   /* C++ extensions */
+  RID_BASES,                   RID_DIRECT_BASES,
   RID_HAS_NOTHROW_ASSIGN,      RID_HAS_NOTHROW_CONSTRUCTOR,
   RID_HAS_NOTHROW_COPY,        RID_HAS_TRIVIAL_ASSIGN,
   RID_HAS_TRIVIAL_CONSTRUCTOR, RID_HAS_TRIVIAL_COPY,
   RID_HAS_TRIVIAL_DESTRUCTOR,  RID_HAS_VIRTUAL_DESTRUCTOR,
   RID_IS_ABSTRACT,             RID_IS_BASE_OF,
-  RID_IS_CONVERTIBLE_TO,       RID_IS_CLASS,
+  RID_IS_CLASS,                RID_IS_CONVERTIBLE_TO,
   RID_IS_EMPTY,                RID_IS_ENUM,
   RID_IS_LITERAL_TYPE,         RID_IS_POD,
   RID_IS_POLYMORPHIC,          RID_IS_STD_LAYOUT,
Index: gcc/cp/pt.c
===================================================================
--- gcc/cp/pt.c (revision 178892)
+++ gcc/cp/pt.c (working copy)
@@ -2976,6
Re: Intrinsics for N2965: Type traits and base classes
On 10/09/2011 08:13 PM, Michael Spertus wrote:
> +dfs_calculate_bases_pre (tree binfo, void *data_)
> +{
> +  (void)data_;

You can use ATTRIBUTE_UNUSED to mark an unused parameter.

I'd still like to see some testcases for the intrinsic, independent of the library.

Jason
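
To illustrate the review comment, here is a minimal sketch of marking a callback parameter as unused instead of casting it to void. In GCC's own sources ATTRIBUTE_UNUSED comes from ansidecl.h; it is defined locally here only so the fragment stands alone, and the callback name `count_visit` and its behaviour are hypothetical, not taken from the patch.

```c
#include <stddef.h>

/* Local fallback definition so the sketch is self-contained; inside GCC
   the macro is provided by ansidecl.h.  */
#ifndef ATTRIBUTE_UNUSED
# if defined(__GNUC__)
#  define ATTRIBUTE_UNUSED __attribute__ ((__unused__))
# else
#  define ATTRIBUTE_UNUSED
# endif
#endif

/* Hypothetical walker callback: the callback signature requires a data
   pointer, but this particular callback never looks at it.  Marking the
   parameter avoids an unused-parameter warning without a (void) cast.  */
static int
count_visit (int node, void *data ATTRIBUTE_UNUSED)
{
  return node + 1;
}
```

The attribute only suppresses the warning; it does not forbid using the parameter later.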
Re: [Patch, Fortran] PR 50547 / cleanup in resolve_formal_arglist
On 02.10.2011 01:43, Janus Weil wrote:
> Hi all,
>
> while working on PR50547, I noticed some strange things about
> resolve_formal_arglist, so I decided to clean it up a little. The
> attached patch does a couple of things:
>
> Regtested on x86_64-unknown-linux-gnu. Ok for trunk?

OK. Thanks for the cleanup.

Tobias

> 2011-10-02  Janus Weil  ja...@gcc.gnu.org
>
>         PR fortran/50547
>         * resolve.c (resolve_formal_arglist): Remove unneeded error message.
>         Some reshuffling.
>
> 2011-10-02  Janus Weil  ja...@gcc.gnu.org
>
>         PR fortran/50547
>         * gfortran.dg/elemental_args_check_4.f90: New.
[PATCH, testsuite]: Remove *.gdb files from testsuite dir
Hello!

Attached patch removes *.gdb temporary files from testsuite directory.

2011-10-09  Uros Bizjak  ubiz...@gmail.com

        * lib/gcc-gdb-test.exp (gdb-test): Delete $cmd_file before return.

Tested on x86_64-pc-linux-gnu {,-m32}. OK for mainline and branches?

Uros.

Index: gcc-gdb-test.exp
===================================================================
--- gcc-gdb-test.exp    (revision 179718)
+++ gcc-gdb-test.exp    (working copy)
@@ -56,6 +56,7 @@
     set res [remote_spawn target "$gdb_name -nx -nw -quiet -x $cmd_file ./$output_file"]
     if { $res < 0 || $res == "" } {
         unsupported "$testname"
+        file delete $cmd_file
         return
     }

@@ -64,6 +65,7 @@
         -re "Unhandled dwarf expression|Error in sourced command file" {
             unsupported "$testname"
             remote_close target
+            file delete $cmd_file
             return
         }
         -re {[\n\r]\$1 = ([^\n\r]*)[\n\r]+\$2 = ([^\n\r]*)[\n\r]} {
@@ -76,16 +78,19 @@
                 fail "$testname"
             }
             remote_close target
+            file delete $cmd_file
             return
         }
         timeout {
             unsupported "$testname"
             remote_close target
+            file delete $cmd_file
             return
         }
     }

+    unsupported "$testname"
     remote_close target
-    unsupported "$testname"
+    file delete $cmd_file
     return
 }
Re: [PATCH, testsuite, i386] FMA3 testcases + typo fix in MD
Hi guys,

This is a ping. Could anybody with appropriate rights commit that?

Thanks, K

On Thu, Oct 6, 2011 at 11:46 PM, Uros Bizjak ubiz...@gmail.com wrote:
> On Thu, Oct 6, 2011 at 3:48 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
>>> BTW, don't you also need -mfpmath=sse in dg-options?
>>
>> According to doc/invoke.texi
>> ...
>> @itemx -mfma
>> ...
>> These options will enable GCC to use these extended instructions in
>> generated code, even without @option{-mfpmath=sse}.
>>
>> It seems -mfpmath=sse is unnecessary. Although, if this is wrong, we
>> probably have to update the doc as well.
>
> Well, OK then.
>
> Uros.
Improve ggc-page fragmentation
I ran into problems with virtual memory fragmentation in ggc-page on large LTO builds. The memory was so fragmented that builds failed because the compiler would use more than the 64k mappings Linux allows each process by default. For more details see PR 50636.

This patchkit includes various improvements to the fragmentation behaviour plus some optimizations to increase the use of 2MB pages on modern Linux kernels.

This fixes the fragmentation problem for me and increases the use of huge pages significantly. My simple benchmarks didn't show a lot of performance improvement though.

On non-Linux kernels the fragmentation problem will still be somewhat visible (the best fix is using the Linux-specific MADV_DONTNEED), but the new threshold should still improve things there.

All passed bootstrap and test suite run on x86-64.

-Andi
[PATCH 3/5] On a Linux kernel ask explicitly for a huge page in ggc
From: Andi Kleen a...@linux.intel.com

Benchmarks show slightly faster build times on a kernel build, near the measurement error unfortunately.

This will only work with a recent glibc that defines MADV_HUGEPAGE.

2011-10-08  Andi Kleen  a...@linux.intel.com

        * ggc-page.c (alloc_page): Add madvise for hugepage
---
 gcc/ggc-page.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 1f52b56..6e08cda 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -779,6 +779,11 @@ alloc_page (unsigned order)

       page = alloc_anon (NULL, G.pagesize * GGC_QUIRE_SIZE);

+#if defined(HAVE_MADVISE) && defined(MADV_HUGEPAGE)
+      /* Kernel, I would like to have hugepages, please. */
+      madvise(page, G.pagesize * GGC_QUIRE_SIZE, MADV_HUGEPAGE);
+#endif
+
       /* This loop counts down so that the chain will be in ascending
          memory order.  */
       for (i = GGC_QUIRE_SIZE - 1; i >= 1; i--)
--
1.7.5.4
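
For readers unfamiliar with the call, the idea of the hunk above can be sketched outside of GCC roughly as follows. This is only an illustration under assumptions: the helper name `alloc_with_hugepage_hint` is hypothetical, plain mmap stands in for ggc-page.c's alloc_anon, and the hint is compiled in only where the headers define MADV_HUGEPAGE, mirroring the guard in the patch.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Map LEN bytes of anonymous memory and, where the headers define
   MADV_HUGEPAGE, hint that the kernel may back the range with huge
   pages.  Returns NULL if the mapping itself fails.  */
static void *
alloc_with_hugepage_hint (size_t len)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
#ifdef MADV_HUGEPAGE
  /* Purely advisory: a failure (for example on a kernel built without
     transparent huge pages) is ignored and the mapping stays usable.  */
  (void) madvise (p, len, MADV_HUGEPAGE);
#endif
  return p;
}
```

Because the call is only a hint, the memory behaves the same whether or not the kernel honours it; only the page size used behind the scenes may differ.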
[PATCH 2/5] Increase the GGC quire size to 2MB
From: Andi Kleen a...@linux.intel.com

Using 2MB allows modern kernels to use 2MB huge pages on x86.

gcc/:

2011-10-08  Andi Kleen  a...@linux.intel.com

        * ggc-page.c (GGC_QUIRE_SIZE): Increase to 512
---
 gcc/ggc-page.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index b0b3b3f..1f52b56 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -469,7 +469,7 @@ static struct globals
    can override this by defining GGC_QUIRE_SIZE explicitly.  */
 #ifndef GGC_QUIRE_SIZE
 # ifdef USING_MMAP
-#  define GGC_QUIRE_SIZE 256
+#  define GGC_QUIRE_SIZE 512 /* 2MB for 4K pages */
 # else
 #  define GGC_QUIRE_SIZE 16
 # endif
--
1.7.5.4
[PATCH 5/5] Add error checking to lto_section_read
From: Andi Kleen a...@linux.intel.com

Various callers of lto_section_read segfault on a NULL return when the mmap fails. Add some internal_errors to give a better message to the user.

gcc/lto/:

2011-10-09  Andi Kleen  a...@linux.intel.com

        * lto.c (lto_section_read): Call internal_error on IO or mmap errors.
---
 gcc/lto/lto.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index a77eeb4..dc16db4 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -1237,7 +1237,10 @@ lto_read_section_data (struct lto_file_decl_data *file_data,
     {
       fd = open (file_data->file_name, O_RDONLY|O_BINARY);
       if (fd == -1)
-        return NULL;
+        {
+          internal_error ("Cannot open %s", file_data->file_name);
+          return NULL;
+        }
       fd_name = xstrdup (file_data->file_name);
     }

@@ -1255,7 +1258,10 @@ lto_read_section_data (struct lto_file_decl_data *file_data,
   result = (char *) mmap (NULL, computed_len, PROT_READ, MAP_PRIVATE,
                           fd, computed_offset);
   if (result == MAP_FAILED)
-    return NULL;
+    {
+      internal_error ("Cannot map %s", file_data->file_name);
+      return NULL;
+    }

   return result + diff;
 #else
@@ -1264,6 +1270,7 @@ lto_read_section_data (struct lto_file_decl_data *file_data,
       || read (fd, result, len) != (ssize_t) len)
     {
       free (result);
+      internal_error ("Cannot read %s", file_data->file_name);
       result = NULL;
     }
 #ifdef __MINGW32__
--
1.7.5.4
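
The pattern in this patch (report which file failed instead of silently returning NULL) can be sketched standalone as below. This is an illustration under assumptions, not the lto.c code: `map_section_or_report` is a hypothetical name, fprintf to stderr stands in for GCC's internal_error, and the page-alignment arithmetic is a simplified version of what the `computed_offset` and `diff` variables in the hunk suggest.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map LEN bytes of FILE_NAME starting at OFFSET, read-only.  mmap needs
   a page-aligned file offset, so the mapping starts at the enclosing
   page boundary and the returned pointer is advanced by the difference.
   Every failure is reported by file name before NULL is returned.  */
static const char *
map_section_or_report (const char *file_name, off_t offset, size_t len)
{
  long page = sysconf (_SC_PAGESIZE);
  off_t aligned = offset - (offset % page);
  size_t diff = (size_t) (offset - aligned);
  char *base;
  int fd;

  fd = open (file_name, O_RDONLY);
  if (fd == -1)
    {
      fprintf (stderr, "Cannot open %s\n", file_name);
      return NULL;
    }

  base = (char *) mmap (NULL, len + diff, PROT_READ, MAP_PRIVATE,
                        fd, aligned);
  /* The mapping remains valid after the descriptor is closed.  */
  close (fd);
  if (base == MAP_FAILED)
    {
      fprintf (stderr, "Cannot map %s\n", file_name);
      return NULL;
    }

  return base + diff;
}
```

A caller that still gets NULL back at least leaves a message naming the file, which is the user-visible improvement the patch is after.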
[PATCH 4/5] Add a freeing threshold for the garbage collector.
From: Andi Kleen a...@linux.intel.com

Add a threshold to avoid freeing pages back too early to the OS. This avoids virtual memory map fragmentation.

Based on an idea from Honza.

gcc/doc/:

2011-10-08  Andi Kleen  a...@linux.intel.com

        PR other/50636
        * invoke.texi (ggc-free-threshold, ggc-free-min): Add.

gcc/:

2011-10-08  Andi Kleen  a...@linux.intel.com

        PR other/50636
        * ggc-page.c (ggc_collect): Add free threshold.
        * params.def (GGC_FREE_THRESHOLD, GGC_FREE_MIN): Add.
---
 gcc/doc/invoke.texi |   11 +++++++++++
 gcc/ggc-page.c      |   13 +++++++++----
 gcc/params.def      |   10 ++++++++++
 3 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ef7ac68..6557f66 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8837,6 +8837,17 @@ very large effectively disables garbage collection.  Setting this
 parameter and @option{ggc-min-expand} to zero causes a full collection
 to occur at every opportunity.

+@item ggc-free-threshold
+
+Only free memory back to the system when it would free more than this
+many percent of the total allocated memory. Default is 20 percent.
+This avoids memory fragmentation.
+
+@item ggc-free-min
+
+Only free memory back to the system when it would free more than this.
+Unit is kilobytes.
+
 @item max-reload-search-insns
 The maximum number of instruction reload should look backward for equivalent
 register.  Increasing values mean more aggressive optimization, making the
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 6e08cda..cd1c41a 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -1968,14 +1968,19 @@ ggc_collect (void)
   if (GGC_DEBUG_LEVEL >= 2)
     fprintf (G.debug_file, "BEGIN COLLECTING\n");

+  /* Release the pages we freed the last time we collected, but didn't
+     reuse in the interim.  But only do this if this would free a
+     reasonable number of pages. Otherwise hold on to them
+     to avoid virtual memory fragmentation. */
+  if (G.bytes_mapped - G.allocated >=
+        (PARAM_VALUE (GGC_FREE_THRESHOLD) / 100.0) * G.bytes_mapped &&
+      G.bytes_mapped - G.allocated >= (size_t)PARAM_VALUE (GGC_FREE_MIN) * 1024)
+    release_pages ();
+
   /* Zero the total allocated bytes.  This will be recalculated in the
      sweep phase.  */
   G.allocated = 0;

-  /* Release the pages we freed the last time we collected, but didn't
-     reuse in the interim.  */
-  release_pages ();
-
   /* Indicate that we've seen collections at this context depth.  */
   G.context_depth_collections = ((unsigned long)1 << (G.context_depth + 1)) - 1;

diff --git a/gcc/params.def b/gcc/params.def
index 5e49c48..ca28715 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -561,6 +561,16 @@ DEFPARAM(GGC_MIN_HEAPSIZE,
 #undef GGC_MIN_EXPAND_DEFAULT
 #undef GGC_MIN_HEAPSIZE_DEFAULT

+DEFPARAM(GGC_FREE_THRESHOLD,
+         "ggc-free-threshold",
+         "Dont free memory back to system less this percent of the total memory",
+         20, 0, 100)
+
+DEFPARAM(GGC_FREE_MIN,
+         "ggc-free-min",
+         "Dont free less memory than this back to the system, in kilobytes",
+         8 * 1024, 0, 0)
+
 DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
          "max-reload-search-insns",
          "The maximum number of instructions to search backward when looking for equivalent reload",
--
1.7.5.4
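
The release condition in the ggc_collect hunk can be read as a small predicate. The sketch below restates it as a standalone function so the two thresholds are easier to reason about; the function name and parameters are hypothetical, and the early return for the case where the allocated count exceeds the mapped count is an addition of this sketch, not something the patch contains.

```c
#include <stddef.h>

/* Return nonzero when the unused part of the mapped memory is large
   enough to be worth handing back: at least THRESHOLD_PERCENT of
   everything mapped, and at least MIN_KB kilobytes in absolute terms.  */
static int
should_release_pages (size_t bytes_mapped, size_t bytes_allocated,
                      unsigned threshold_percent, size_t min_kb)
{
  size_t unused;

  /* Guard added for the sketch only: avoid unsigned wrap-around.  */
  if (bytes_allocated >= bytes_mapped)
    return 0;

  unused = bytes_mapped - bytes_allocated;
  return unused >= (threshold_percent / 100.0) * bytes_mapped
         && unused >= min_kb * 1024;
}
```

With the defaults from params.def above (20 percent and 8 * 1024 kilobytes), a process with 100MB mapped and 95MB in use would keep its free pages, one with 100MB mapped and 70MB in use would release them, and one with 10MB mapped and 5MB in use would keep them because 5MB is below the 8MB minimum.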
[PATCH 1/5] Use MADV_DONTNEED for freeing in garbage collector
From: Andi Kleen a...@linux.intel.com

Use the Linux MADV_DONTNEED call to unmap free pages in the garbage collector. Then keep the unmapped pages in the free list. This avoids excessive memory fragmentation on large LTO builds, which can lead to gcc bumping into the Linux vm_max_map limit per process.

Based on an idea from Jakub.

gcc/:

2011-10-08  Andi Kleen  a...@linux.intel.com

        PR other/50636
        * config.in, configure: Regenerate.
        * configure.ac (madvise): Add to AC_CHECK_FUNCS.
        * ggc-page.c (USING_MADVISE): Add.
        (page_entry): Add unmapped field.
        (alloc_page): Check for unmapped pages.
        (release_pages): Add USING_MADVISE branch.
---
 gcc/config.in    |    6 ++++++
 gcc/configure    |    2 +-
 gcc/configure.ac |    2 +-
 gcc/ggc-page.c   |   48 +++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index f2847d8..e8148b6 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1276,6 +1276,12 @@
 #endif


+/* Define to 1 if you have the `madvise' function. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_MADVISE
+#endif
+
+
 /* Define to 1 if you have the <malloc.h> header file. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_MALLOC_H
diff --git a/gcc/configure b/gcc/configure
index cb55dda..4a54adf 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -9001,7 +9001,7 @@ fi
 for ac_func in times clock kill getrlimit setrlimit atoll atoq \
         sysconf strsignal getrusage nl_langinfo \
         gettimeofday mbstowcs wcswidth mmap setlocale \
-        clearerr_unlocked feof_unlocked ferror_unlocked fflush_unlocked fgetc_unlocked fgets_unlocked fileno_unlocked fprintf_unlocked fputc_unlocked fputs_unlocked fread_unlocked fwrite_unlocked getchar_unlocked getc_unlocked putchar_unlocked putc_unlocked
+        clearerr_unlocked feof_unlocked ferror_unlocked fflush_unlocked fgetc_unlocked fgets_unlocked fileno_unlocked fprintf_unlocked fputc_unlocked fputs_unlocked fread_unlocked fwrite_unlocked getchar_unlocked getc_unlocked putchar_unlocked putc_unlocked madvise
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/gcc/configure.ac b/gcc/configure.ac
index a7b94e6..357902e 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1027,7 +1027,7 @@ define(gcc_UNLOCKED_FUNCS, clearerr_unlocked feof_unlocked dnl
 AC_CHECK_FUNCS(times clock kill getrlimit setrlimit atoll atoq \
         sysconf strsignal getrusage nl_langinfo \
         gettimeofday mbstowcs wcswidth mmap setlocale \
-        gcc_UNLOCKED_FUNCS)
+        gcc_UNLOCKED_FUNCS madvise)

 if test x$ac_cv_func_mbstowcs = xyes; then
   AC_CACHE_CHECK(whether mbstowcs works, gcc_cv_func_mbstowcs_works,
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 624f029..b0b3b3f 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -50,6 +50,10 @@ along with GCC; see the file COPYING3.  If not see
 #define USING_MALLOC_PAGE_GROUPS
 #endif

+#if defined(HAVE_MADVISE) && defined(MADV_DONTNEED)
+# define USING_MADVISE
+#endif
+
 /* Strategy:

    This garbage-collecting allocator allocates objects on one of a set
@@ -277,6 +281,9 @@ typedef struct page_entry
   /* The lg of size of objects allocated from this page.  */
   unsigned char order;

+  /* Unmapped page? */
+  bool unmapped;
+
   /* A bit vector indicating whether or not objects are in use.  The
      Nth bit is one if the Nth object on this page is allocated.  This
      array is dynamically sized.  */
@@ -740,6 +747,10 @@ alloc_page (unsigned order)

   if (p != NULL)
     {
+      if (p->unmapped)
+        G.bytes_mapped += p->bytes;
+      p->unmapped = false;
+
       /* Recycle the allocated memory from this page ...  */
       *pp = p->next;
       page = p->page;
@@ -956,7 +967,42 @@ free_page (page_entry *entry)
 static void
 release_pages (void)
 {
-#ifdef USING_MMAP
+#ifdef USING_MADVISE
+  page_entry *p, *start_p;
+  char *start;
+  size_t len;
+
+  for (p = G.free_pages; p; )
+    {
+      if (p->unmapped)
+        {
+          p = p->next;
+          continue;
+        }
+      start = p->page;
+      len = p->bytes;
+      start_p = p;
+      p = p->next;
+      while (p && p->page == start + len)
+        {
+          len += p->bytes;
+          p = p->next;
+        }
+      /* Give the page back to the kernel, but don't free the mapping.
+         This avoids fragmentation in the virtual memory map of the
+         process. Next time we can reuse it by just touching it. */
+      madvise (start, len, MADV_DONTNEED);
+      /* Don't count those pages as mapped to not touch the garbage collector
+         unnecessarily. */
+      G.bytes_mapped -= len;
+      while (start_p != p)
+        {
+          start_p->unmapped = true;
+          start_p = start_p->next;
+        }
+    }
+#endif
+#if defined(USING_MMAP) && !defined(USING_MADVISE)
   page_entry *p, *next;
   char *start;
   size_t len;
--
1.7.5.4
Re: [Patch] Don't ignore testsuite errors in Makefile
On Sunday 09 October 2011 21:12:12 Jakub Jelinek wrote:
> On Sun, Oct 09, 2011 at 04:32:12PM +0200, Mikael Morin wrote:
> > currently, the testsuite return value is ignored by make. It is a little
> > annoying if one wants to check automatically for regressions as we have
> > to parse the testsuite output.
> > This patch reverts to the normal make behaviour, which is to not ignore
> > commands' return values.
> > Note: As a result the -k flag has to be added to the make command line
> > if one wants the tests to continue after one failure.
> > OK for trunk?
>
> Please no. This is a very bad idea, most of the testsuites on many
> architectures contain some FAILs and a failure from check-parallel-%
> would mean the *.log/*.sum files would be never merged in that case.
>
> If you really need to propagate the return value (I fail to see how it is
> useful), then you should e.g. store the $? value from $(RUNTEST) in
> check-parallel-% into some file in that directory and have the
> parallelization goal after the merging collect those from the individual
> files and or them all together into the final return value.

Thanks for the tips. I will just keep the patch locally for now. I don't use parallel testing anyway.

Mikael
Re: [C++ Patch] PR 38980
OK. Jason
Re: [C++ Patch] PR 50660
Hmm, I guess it's unlikely that a conversion is going to hit both that warning and another one. OK. Jason
Re: [C++ Patch] PR 50660
On 10/09/2011 11:40 PM, Jason Merrill wrote:
> Hmm, I guess it's unlikely that a conversion is going to hit both that
> warning and another one. OK.

Wait...how about changing conversion_null_warnings to stop looking through references? Does that break anything?

Jason
Re: [C++ Patch] PR 50660
On 10/10/2011 12:41 AM, Jason Merrill wrote:
> On 10/09/2011 11:40 PM, Jason Merrill wrote:
>> Hmm, I guess it's unlikely that a conversion is going to hit both that
>> warning and another one. OK.
>
> Wait...how about changing conversion_null_warnings to stop looking
> through references? Does that break anything?

Let me check...

Paolo.
Re: [wwwdocs] Re: [2/2] tree-ssa-strlen optimization pass
Hi Jakub,

this is a minor update on top of yours that I just applied. Thanks for taking the time to write this up.

Gerald

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.46
diff -u -r1.46 changes.html
--- changes.html        4 Oct 2011 19:07:01 -0000       1.46
+++ changes.html        9 Oct 2011 23:05:47 -0000
@@ -125,13 +125,13 @@
         growth.</li>
       </ul></li>

-    <li>String length optimization pass has been added.  This pass attempts
+    <li>A string length optimization pass has been added.  It attempts
       to track string lengths and optimize various standard C string functions
       like <code>strlen</code>, <code>strchr</code>, <code>strcpy</code>,
       <code>strcat</code>, <code>stpcpy</code> and their
       <code>_FORTIFY_SOURCE</code> counterparts into faster alternatives.
       This pass is enabled by default at <code>-O2</code> or above, unless
-      optimizing for size, and can be disabled by
+      optimizing for size, and can be disabled by the
       <code>-fno-optimize-strlen</code> option.  The pass can e.g. optimize
       <pre>
 char *bar (const char *a)
Re: [C++-11] User defined literals
On 10/08/2011 07:15 PM, Jason Merrill wrote: On 10/08/2011 07:25 PM, Ed Smith-Rowland wrote: Also, in spite of the documentation, cp_parser_template_parameter_list returns a TREE_VEC, not a TREE_LIST. This happens inside end_template_parm_list called inside the former. So parameter_list is a TREE_VEC, parm_list is a TREE_LIST, parm is a PARM_DECL, etc.

Ah, I was thinking of template arguments rather than parameters. You're right, except that INNERMOST_TEMPLATE_PARMS should be just TREE_VALUE; you are already starting from the innermost parm list if you use what end_template_parm_list returns. Though it occurs to me that push_template_decl_real might be a better place for this check.

I'm still looking for a fix for duplicate errors/warnings coming from cp_parser_operator. I tried cp_parser_error and lost the errors. I'll look for different code paths for the two invocations and see if I can either move something up or see if something is set differently between the two that would be useful for a flag.

One approach would be changing the token stream after the first error to something that won't produce another error, e.g. changing token->u.value to be an empty string after you complain about it being non-empty.

Interesting. That one error is the one of the three that does *not* repeat. One idea: the first error about the non-empty string is followed by a consume_token (for the string). Does cp_parser_identifier (parser) *not* consume the identifier token? Is that token left on the stream for the second pass? I'll try it and get back.

Jason
Re: [4/4] Make SMS schedule register moves
On Wed, Sep 28, 2011 at 4:49 PM, Richard Sandiford richard.sandif...@linaro.org wrote: Ayal Zaks ayal.z...@gmail.com writes: + /* The cyclic lifetime of move-new_reg starts and ends at move-def + (the instruction that defines move-old_reg). So instruction I_REG_MOVE (new_reg=reg) must be scheduled before the next I_MUST_FOLLOW move/original-def (due to anti-dependence: it overwrites reg), but after the previous instance of I_MUST_FOLLOW (due to true dependence; i.e. account for latency also). Why do moves, except for the one closest to move-def (which is directly dependent upon it, i.e. for whom move-def == I_MUST_FOLLOW), have to worry about move-def at all? (Or have their cyclic lifetimes start and end there?) Because the uses of new_reg belong to the same move-def based cycle. the cycle (overloaded term; rather iteration in this context) to which the uses belong, is inferred from the cycle (absolute schedule time) in which they are scheduled, regardless of move-def. Just to prove your point about cycle being an overloaded term: I wasn't actually meaning it in the sense of (loop) iteration. I meant a circular window based on move-def. Point proven ;-) So (I think this is the uncontroversial bit): [M1] must be scheduled cyclically before [B] and cyclically after [C], with the cycle based at [B]: row 3 after [B]: empty row 4: [C] row 5: [D] row 0: empty row 1: empty row 2: [A] row 3 before [B]: empty [M1] could therefore go in row 1. This part is OK. Here's how I see it: [M1] feeds [C] which is scheduled at cycle 10, so it must be scheduled before cycle 10-M_latency and after cycle 10-ii. [M1] uses the result of [B] which is scheduled at cycle 3, so must be scheduled after cycle 3+B_latency and before cycle 3+ii. Taking all latencies to be 1 and ii=6, this yields a scheduling window of cycles [4,9]\cap[4,9]=[4,9]; if scheduled at cycle 4 it must_follow [C], if scheduled at cycle 9 it must_precede [B]. 
This is identical to the logic behind the sched_window of any instruction, based on its dependencies (as you've updated not too long ago..), if we do not allow reg_moves (and arguably, one should not allow reg_moves when scheduling reg_moves...). To address the potential erroneous scenario of Loop 2, suppose [A] is scheduled as in the beginning in cycle 20, and that [M1] is scheduled in cycle 7 (\in[4,9]). Then [M2] feeds [D] and [A] which are scheduled at cycles 17 and 20, so it must be scheduled before cycle 17-1 and after cycle 20-6. [M2] uses the result of [M1], so must be scheduled after cycle 7+1 and before cycle 7+6. This yields the desired [14,16]\cap[8,13]=\emptyset. I agree it's natural to schedule moves for intra-iteration dependencies in the normal get_sched_window way. But suppose we have a dependency: A --(T,N,1)-- B that requires two moves M1 and M2. If we think in terms of cycles (in the SCHED_TIME sense), then this effectively becomes: A --(T,N1,1)-- M1 --(T,N2,0)-- M2 --(T,N3,0)-- B because it is now M1 that is fed by both the loop and the incoming edge. But if there is a second dependency: A --(T,M,0)-- C that also requires two moves, we end up with: A --(T,N1,1)-- M1 --(T,N2,0)-- M2 --(T,N3,0)-- B --(T,M3,-1)-- B and dependence distances of -1 feel a bit weird. :-) Of course, what we really have are two parallel dependencies: A --(T,N1,1)-- M1 --(T,N2,0)-- M2 --(T,N3,0)-- B A --(T,M1,0)-- M1' --(T,M2,0)-- M2' --(T,N3,0)-- B where M1' and M2' occupy the same position as M1 and M2 in the schedule, but are one stage further along. But we only schedule them once, so if we take the cycle/SCHED_TIME route, we have to introduce dependencies of distance -1. 
Interesting; had to digest this distance 1 business, a result of thinking in cycles instead of rows (or conversely), and mixing dependences with scheduling; here's my understanding, based on your explanations:

Suppose a Use is truly dependent on a Def, where both have been scheduled at some absolute cycles; think of them as timing the first iteration of the loop.

Assume first that Use appears originally after Def in the original instruction sequence of the loop (dependence distance 0). In this case, Use requires register moves if its distance D from Def according to the schedule is more than ii cycles long -- by the time Use is executed, the value it needs is no longer available in the def'd register due to intervening occurrences of Def. So in this case, the first reg-move (among D/ii) should appear after Def, recording its value before the next occurrence of Def overwrites it, and feeding subsequent moves as needed before each is overwritten. Thus the scheduling window of this first reg-move is within (Def, Def+ii).

Now, suppose Use appears before Def, i.e., Use is upwards-exposed; if it remains
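
The scheduling-window arithmetic quoted earlier in this message ([4,9] for [M1], and an empty intersection of [14,16] and [8,13] for [M2]) can be checked with a few lines of C. This is only a worked example of those numbers: the type and function names are hypothetical and are not taken from modulo-sched.c, and the inclusive bounds follow the way the message itself states the windows.

```c
/* Inclusive range of absolute cycles; empty when lo > hi.  */
struct window
{
  int lo;
  int hi;
};

/* Window imposed on a move by the instruction whose result it reads:
   no earlier than DEF_CYCLE + DEF_LATENCY, no later than DEF_CYCLE + II.  */
static struct window
window_from_def (int def_cycle, int def_latency, int ii)
{
  struct window w = { def_cycle + def_latency, def_cycle + ii };
  return w;
}

/* Window imposed on a move by an instruction that consumes it:
   no earlier than USE_CYCLE - II, no later than USE_CYCLE - MOVE_LATENCY.  */
static struct window
window_from_use (int use_cycle, int move_latency, int ii)
{
  struct window w = { use_cycle - ii, use_cycle - move_latency };
  return w;
}

/* Intersection of two windows.  */
static struct window
window_intersect (struct window a, struct window b)
{
  struct window w;
  w.lo = a.lo > b.lo ? a.lo : b.lo;
  w.hi = a.hi < b.hi ? a.hi : b.hi;
  return w;
}

static int
window_is_empty (struct window w)
{
  return w.lo > w.hi;
}
```

With all latencies 1 and ii = 6, [M1] (fed by [B] at cycle 3, feeding [C] at cycle 10) gets the window [4,9]. [M2] (fed by [M1] at cycle 7, feeding [D] at 17 and [A] at 20) gets [14,16] from its uses and [8,13] from its definition, which do not overlap.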
Re: [C++ Patch] PR 50660
On 10/10/2011 12:41 AM, Jason Merrill wrote:
> On 10/09/2011 11:40 PM, Jason Merrill wrote:
>> Hmm, I guess it's unlikely that a conversion is going to hit both that
>> warning and another one. OK.
>
> Wait...how about changing conversion_null_warnings to stop looking
> through references? Does that break anything?

If I just do this (I hope it's what you had in mind):

 static void
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
-  tree t = non_reference (totype);
+  tree t = totype; /*non_reference (totype); */

I see this failure, for sure: cpp0x/variadic111.C, that is:

// PR c++/48424
// { dg-options "-std=c++0x" }

template<typename... Args1>
struct S
{
  template<typename... Args2>
    void f(Args1... args1, Args2... args2)
    {
    }
};

int main()
{
  S<int, double> s;
  s.f(1,2.0,false,'a');
}

triggers:

variadic111.C:16:22: warning: converting ‘false’ to pointer type for argument 3 of ‘void S<Args1>::f(Args1 ..., Args2 ...) [with Args2 = {bool, char}; Args1 = {int, double}]’ [-Wconversion-null]

Also, tree-ssa/copyprop.C, for example.

Paolo.
Re: [wwwdocs] add libstdc++/1773 change to gcc-4.7/changes.html
On Tue, 4 Oct 2011, Jonathan Wakely wrote:
> I've committed this, which documents the fix for http://gcc.gnu.org/PR1773 in gcc-4.7/changes.html, and also replaces some > characters with the &gt; entity.

Interesting that the latter was not caught by the validator?

Thanks for addressing it, Jonathan! There also is a minor change on top of yours that I just committed; see below.

Gerald

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.47
diff -u -r1.47 changes.html
--- changes.html	9 Oct 2011 23:08:14 -0000	1.47
+++ changes.html	10 Oct 2011 01:06:49 -0000
@@ -267,8 +267,8 @@
 } a; // initializes a.i to 42
 </pre></blockquote></li>
-  <li>G++ now sets the predefined macro <tt>__cplusplus</tt> to the
-correct value, <tt>199711L</tt>.
+  <li>G++ now sets the predefined macro <code>__cplusplus</code> to the
+correct value, <code>199711L</code>.
 </li>
 </ul>
Re: [google] record compiler options to .note sections
On Sun, Oct 9, 2011 at 5:28 PM, Jakub Jelinek ja...@redhat.com wrote:
> On Sun, Oct 09, 2011 at 09:18:25AM +0800, Dehao Chen wrote:
>> Unfortunately -frecord-gcc-switches cannot serve our purpose because the recorded switches are mergeable, i.e. the linker will merge all options into a set of strings. However, object files may have distinct compile options. We want to preserve every object file's compile options when doing a LIPO build.
>
> And -grecord-gcc-switches? That one, although it is mergeable, still preserves every object file's compile options.

I tried -grecord-gcc-switches, but it looks like it's not recording the options that I want. E.g. the following two commands output the same assembly code, while the former should record one more option:

  gcc -g3 -grecord-gcc-switches a.c -Dabcdefgh -Dxy -I/usr/ -S
  gcc -g3 -grecord-gcc-switches a.c -Dabcdefgh -Dxy -S

Thanks,
Dehao

> Jakub
Re: [CRIS] Hookize PREFERRED_RELOAD_CLASS
> Date: Sun, 9 Oct 2011 17:47:22 +0400
> From: Anatoly Sokolov ae...@post.ru

> OK to install?
>
> * config/cris/cris.c (cris_preferred_reload_class): New function.
> (TARGET_PREFERRED_RELOAD_CLASS): Define.
> * config/cris/cris.h (OUTPUT_ADDR_CONST_EXTRA): Remove.
                        ^^^

With the macro name in the ChangeLog entry fixed, yes, thanks.

brgds, H-P