Re: abs insn with QI and HI mode
Ying Yi wrote: The generated code does the following: 1) extends variable a_HI (HImode) to a temporary SImode variable, and does the abs operation with SImode operators. I find the gimple intermediate representation as shown below:
abs is a standard library function that takes an int as an argument. So if you call abs(), then gcc must convert the argument to type int before generating code for the abs. To get your special char/short abs instructions, we need one of two things:
1) Optimization support to recognize a sign-extend followed by an abs, where the target has an abs instruction that operates on the pre-extended value. We can then optimize away the sign extend instruction. This optimization support apparently does not exist at the moment, perhaps because no one has needed it before.
2) Alternative abs function calls that accept short/char. We already have abs (int), labs (long), llabs (long long), fabs (double), fabsf (float) and fabsl (long double).
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
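A minimal sketch in C of the promotion described in the reply above. `abs_via_library` shows the implicit short-to-int conversion that the standard `abs` forces; `sabs_sketch` is a hypothetical helper (not an existing libc function or GCC builtin) showing what a short-typed abs would compute.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* abs() is declared as int abs(int), so the short argument is
   converted (sign-extended) to int before the call. */
int abs_via_library(short a_hi)
{
    return abs(a_hi);   /* implicit short -> int conversion happens here */
}

/* Hypothetical short-typed abs, for illustration only.
   SHRT_MIN is excluded because its magnitude does not fit in a short. */
short sabs_sketch(short a_hi)
{
    assert(a_hi != SHRT_MIN);
    return (short)(a_hi < 0 ? -a_hi : a_hi);
}
```

Calling `abs_via_library(SHRT_MIN)` is well defined because the arithmetic is done in int, which is exactly why the compiler has to emit the extension unless it can prove the narrower instruction gives the same result.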
Re: abs insn with QI and HI mode
On Tue, 2007-07-10 at 10:35 +0100, Ying Yi wrote: Thanks very much for your email. Will gcc add the optimization support in the future (method 1)? For method 2, if abs accepts short/char, may I give the function names as sabs and qabs? Gcc already has cabs as complex abs, doesn't it?
The function names I gave are mostly standard library names, specified by ISO C, POSIX, GNU libc, or whatever. New functions sabs and qabs would not be standard in that sense, and may conflict with user code, which may be undesirable. Names like __builtin_sabs and __builtin_qabs would be better from that point of view. The builtins.def file has a number of different ways of defining builtin functions depending on which command line options should enable/disable them, and whether or not __builtin should be prepended.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Host/Target confusion in Dwarf output
Michael Eager wrote: Is it guaranteed to hold all target integer sizes? How does this work for 32-bit hosts and 64-bit targets? RTL and tree constants were defined from the beginning as two HOST_WIDE_INTs. This was necessary to bootstrap long long support on 32-bit systems before most compilers had long long support. This means that the largest int on the host must be at least half the size of the largest int on the target. Hence, building 64-bit target compilers on 32-bit host systems has never been a problem. We have never supported 16-bit hosts from the beginning, so that is no problem. This does mean that you can't build a 128-bit target compiler on a 32-bit host, but that hasn't been a problem yet. This restriction by the way is one of the reasons why long long is 64-bits on 64-bit targets, because the 32-bit hosts used to build the initial 64-bit cross compilers could not support 128-bit integer constants. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
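The two-word representation described above can be sketched in C as follows. This is only an illustration: `host_wide_uint` stands in for HOST_WIDE_INT on a 32-bit host, and the struct and function names are hypothetical, not GCC's own.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for HOST_WIDE_INT on a 32-bit host (assumption of this sketch). */
typedef uint32_t host_wide_uint;

/* A target constant twice as wide as the host word, kept as two host words. */
struct two_word_const {
    host_wide_uint low;
    host_wide_uint high;
};

struct two_word_const make_two_word(host_wide_uint low, host_wide_uint high)
{
    struct two_word_const c;
    c.low = low;
    c.high = high;
    return c;
}

/* Add two double-width constants using only host-word arithmetic. */
struct two_word_const two_word_add(struct two_word_const a,
                                   struct two_word_const b)
{
    struct two_word_const r;
    r.low = a.low + b.low;                       /* wraps modulo 2^32 */
    r.high = a.high + b.high + (r.low < a.low);  /* propagate the carry */
    return r;
}

/* Only for checking the sketch; a real 32-bit host of that era might
   not have had a 64-bit type at all. */
uint64_t two_word_to_u64(struct two_word_const c)
{
    return ((uint64_t)c.high << 32) | c.low;
}
```

With two 32-bit words the widest representable constant is 64 bits, which matches the restriction stated above: the largest host integer must be at least half the size of the largest target integer.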
Re: Host/Target confusion in Dwarf output
Jim Wilson wrote: This does mean that you can't build a 128-bit target compiler on a 32-bit host, but that hasn't been a problem yet. And now that we allow HOST_WIDE_INT to be defined as long long, this shouldn't be a problem any more either. A 32-bit host with 2 long longs gets us up to 128-bit constants. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: GCC 4.2.1 : bootstrap fails at stage 2. compiler produces wrong binary for wrong processor
Dennis Clarke wrote: At the moment GCC 4.2.1 seems to be tied to the UltraSparc processor and thus the older sun4m and 32-bit Sparc machines are being ignored. The default cpu is v8plus. You can change that by using the configure option --with-cpu=v8 or --with-cpu=v7 depending on how old your machine is. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: How to remove the option -Qy to as
Andrea Marongiu wrote: It seems to be using the correct as (arm-elf-as) which anyhow doesn't support the -Qy option. No, it is using the wrong assembler. There are two compilers here, the native x86 linux one, and the cross compiler. You have the native x86 linux compiler using the arm-elf-as assembler, which won't work. The solution is to remove $toolchain-prefix/arm-elf/bin from your path. It is unnecessary and undesirable. Do put $toolchain-prefix/bin on your path though. Jim
Re: error compiling libgcc with ported cross-compiler
Tomas Svensson wrote: It seems that gcc has emitted rtl describing a memory reference (mem (plus (mem (plus (reg ..) (const_int ..))) (const_int ..))), which should not have been permitted by GO_IF_LEGITIMATE_ADDRESS since it only allows (mem (plus (reg ..) (const ..))), and forbids a second level of memory reference. This is probably a REG_OK_STRICT bug. During reload, an unallocated pseudo-reg is actually a memory reference in disguise, so you must check for and reject pseudo-regs during reload. This is handled by the REG_OK_STRICT macro. Just look at any port to see how to handle this correctly. Jim
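A simplified model in C of the strict versus non-strict check mentioned above. The register numbering and the choice of address registers are hypothetical, for a toy target; a real port expresses this through the REG_OK_STRICT macro and, in strict mode, also consults the register allocation results, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy target: hard registers are 0..15, pseudos start at 16. */
enum { TOY_FIRST_PSEUDO = 16 };

/* Hypothetical rule: only r8..r15 may be used as base (address) registers. */
static bool toy_hard_reg_ok_for_base(int regno)
{
    return regno >= 8 && regno < TOY_FIRST_PSEUDO;
}

/* Non-strict: a pseudo is accepted, since it may still get a hard register.
   Strict (during and after reload): an unallocated pseudo is really a
   stack slot, i.e. a memory reference, so it must be rejected. */
bool toy_reg_ok_for_base(int regno, bool strict)
{
    assert(regno >= 0);
    if (regno >= TOY_FIRST_PSEUDO)
        return !strict;
    return toy_hard_reg_ok_for_base(regno);
}
```

If the strict case wrongly accepts the pseudo, the address `(plus (reg pseudo) (const_int ..))` is treated as legitimate and later turns into the nested memory reference quoted above.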
Re: understanding __FUNCTION__ generation
Sunzir Deepur wrote: recently I've encountered a problem in which some removals of (what seem to be unneeded) lines of header files inflicted changes in the resulting binary. further investigation showed that the changes were different __FUNCTION__ numbers (in the __FUNCTION__.xxx symbol names).
This would be clearer if you gave an example. However, I'm guessing that the issue here is the names for function local statics. The names for these are unique within a function, but not unique across a file. The assembler requires unique names for them, so we add the DECL_UID to make them unique. See lhd_set_decl_assembler_name in langhooks.c. The DECL_UID changes if you add/delete declarations. For instance, if I compile this with -O -S for an x86_64-linux target
int *sub (void) { static int i = 10; return &i; }
int *sub2 (void) { static int i = 20; return &i; }
I get variables i.1499 and i.1504. If I add another variable in the first line
static int y;
then I now have variables i.1500 and i.1505. This wouldn't happen with functions unless you have nested functions. You could see a similar problem if compiling multiple files as one unit, since static functions in different files could have the same name, which would have to be changed to make them unique. This is probably handled in a different place in the gcc code, but probably handled the same way.
Jim
Re: SImode and PSImode question
Tal Agmon wrote: I see many references in gcc code to SImode. Isn't this problematic for ports such as this when SImode does not represent the natural int?
In the gcc dir, grep SImode *.[ch] | wc shows only 67 lines. That isn't a large number relatively speaking. Many of these are in comments, and some are correct because they relate to libgcc functions. It does look like loop-iv.c is broken though. Every simplify_gen_relational call uses SImode. That probably should be word_mode instead. You might want to submit a bug report for that.
I would like to define PSImode for 36-bit accumulators. Yet, when I'm using attribute mode to define a PSImode variable, gcc is choosing SImode, which is the smallest mode for it defined through the wider_mode table. I want to define different behavior for PSImode variables; how can I make sure this mode is actually attached to a variable?
I'm not sure if anyone has ever done this before. The assembler/linker/etc have no support for odd sized variables. Usually the partial int modes were just used internal to gcc. For instance, if you have 24-bit address registers, you would use 32-bit pointers, and convert on input/output to the 24-bit PSImode internally, but no user variable would ever be PSImode. Anyways, I assume you are talking about the code in stor-layout.c in layout_type that calls smallest_mode_for_size. This uses TYPE_PRECISION. Have you defined a built-in type with an appropriate size for your PSImode variables? There is certainly no standard type with odd sizes that will work here.
Re: porting problem: segfault when compiling programs that call malloc
Tomas Svensson wrote: Ok, do you have any idea about what might cause this to happen? Could it be something wrong with exception handling or dwarf2 debugging output? Or possibly varargs handling? I am completely lost here unfortunately... Other function calls work just fine.
Build any port that works, like x86, and step through the code to see why it does work. Now step through the code in your port to see what it does differently that causes it to fail. My guess is that you have struct_value_addr set, because aggregate_value_p returned true, because of some problem with your port. The gcc code here isn't expecting a function like malloc to return a value indirectly through memory.
Jim
Re: porting problem again: ICE in add_clobbers
Tomas Svensson wrote: I have tried compiling with all optimization flags turned on manually (list included below) and that compiles just fine. That leads me to think that what is causing the bug is some undocumented optimization, triggered only if optimize > 0.
There is no optimization at all without -O, no matter how many -f options you use. What you want to do is -O -fno-foo -fno-bar etc. However, we do not have -f options for every optimization, so there is no guarantee that this will identify the optimization pass that exposes the bug in your port.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Signed division with rounding towards -infinity (and floating point rounding)
Christopher Key wrote: The most concise form that I've found so far is:
const int d = 8; // 16, 32 etc
x = y / d - ((y % d < 0) ? 1 : 0)
although this still produces rather longer code (see example below).
Any integer divide by constant can be replaced by a sequence of multiply, shift, and add instructions that produces the same result for all inputs. There is a proof in a paper by Peter Montgomery and Torbjorn Granlund. Because integer divide instructions are slow on most hardware, this is often a profitable optimization. The same is true for the modulo operation. So what you are seeing here are two sequences of instructions to compute divide/modulo results using multiply/shift/add. Gcc is not able to simplify the two sequences further. We would have to add code to recognize this sequence and handle it specially, and that isn't what you are asking for.
On a similar point, is there a good way to get floats rounded to the nearest integer value rather than truncated? The following gives the correct rounding behaviour (I'm only interested in +ve values),
x = (int) (f + 0.5)
although it gets compiled as an addition of 0.5 followed by a truncation (again, see example).
Gcc has some builtin support for the lrint library function; however, only x86 supports this, and you have to use -ffast-math to get this optimization. On an x86_64-linux system, compiling this testcase
long int lrint (double d);
long int sub (double d) { return lrint (d); }
with -O2 -ffast-math -S gives me
sub:
        cvtsd2siq %xmm0, %rax
        ret
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
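For reference, the two expressions quoted in this message can be written as small C functions. This is only a sketch of what the expressions compute, not of the code gcc generates; the function names are hypothetical.

```c
#include <assert.h>

/* Division rounding toward -infinity for a positive divisor d.
   C's '/' truncates toward zero and '%' takes the sign of the dividend,
   so a negative remainder means the quotient was rounded up and must be
   decremented. */
int floor_div(int y, int d)
{
    assert(d > 0);
    return y / d - ((y % d < 0) ? 1 : 0);
}

/* Round a non-negative double to the nearest integer by adding 0.5 and
   truncating, as in the quoted expression. */
int round_nonneg(double f)
{
    assert(f >= 0.0);
    return (int)(f + 0.5);
}
```

For example, `floor_div(-9, 8)` yields -2 where plain `-9 / 8` yields -1, which is why the extra remainder computation appears in the generated code.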
Re: porting problem again: ICE in add_clobbers
Tomas Svensson wrote: On 19 Sep 2007 07:54:14 -0700, Ian Lance Taylor [EMAIL PROTECTED] wrote: gcse will never convert a recognizable insn into an unrecognizable insn.
Ok. Do you know of any other reasons why this particular optimization switch would cause this problem?
There are millions of reasons why there might be a problem. You need to do some debugging to analyze the problem. I think the fact that gcse created an insn that caused combine to choke is probably not relevant. Step through combine to see what is going on. add_clobbers should only be called for an insn number that contains clobbers. This is conditional on num_clobbers_to_add, which is set in a recog() call. So step through recog to see why it is being set. This is the variable pnum_clobbers in recog, in the file insn-recog.c. insn-recog.c is created by genrecog.c, but you want to look there only as a last resort. No one else can do this analysis for you as no one else has a copy of your port.
But this should not need a matching insn, since it ends in a DONE;, right? Or am I missing something again?
This is correct. You need a define_insn to match all RTL emitted during the RTL expansion phase. However, when you have a define_expand which uses a minimal RTL template just to indicate the number of operands, and which does not actually emit this RTL template into any insn, then you do not need a define_insn to match this RTL template. You do need a define_insn to match the RTL emitted by the expand_branch function though.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: support single predicate set instructions in GCC-4.1.1
吴曦 wrote:
(define_insn *shift_predicate_cmp [(set (const_int 0) (and:BI (and:BI (match_operand:BI 1 register_operand c) (and:BI (match_operand:DI 2 gr_reg_or_8bit_adjusted_operand rL) (match_operand:DI 3 gr_register_operand r))) (match_operand:BI 0 register_operand c)))] (%0) cmp.ne %1, p0 = %2, %3 [(set_attr itanium_class icmp)])
it warns WAW and there should be stop ;; between these two instructions.
It is the assembler that is giving the warning. The assembler knows that the %1 operand is modified by the instruction, but the compiler does not, because the %1 operand is not a SET_DEST operand. Your SET_DEST is (const_int 0) which is useless info and incorrect. You need to make sure that the RTL is an accurate description of what the instruction does.
Besides the problem with the missing SET_DEST, there is also the problem that you are using AND operands for a compare, which won't work. AND and NE are not interchangeable operations. Consider what happens if you compare 0x1 with 0x1. cmp.ne returns false. However, AND returns 0x1, which when truncated from DImode to BImode is still 0x1, i.e. true. So the RTL does not perform the same operation as the instruction you emitted. This could confuse the optimizer.
GCC internals assume that predicate registers are always allocated in pairs, and that the second one is always the inverse of the first one. Defining a special pattern that only modifies one predicate register probably isn't gaining you much. If you are doing this before register allocation, then you are still using 2 predicate registers, as the register allocator will always give you 2 even if you only use one. Worst case, if this pattern is exposed to the optimizer, then the optimizer may make changes that break your assumptions. It might simplify a following instruction by using the second predicate reg for instance, which then fails at run-time because you didn't actually set the second predicate reg. If you are only using this in sequences that the optimizer can't rewrite, then you should be OK.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
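The AND versus NE mismatch described above can be checked with a small C model. The function names are hypothetical, and the truncation from DImode to BImode is modeled here as keeping only the low bit, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What the emitted instruction does: the predicate is set when the
   two operands differ. */
bool predicate_from_ne(uint64_t a, uint64_t b)
{
    return a != b;
}

/* What the RTL in the pattern says: AND the operands, then truncate the
   64-bit result to a 1-bit value (modeled as the low bit). */
bool predicate_from_and(uint64_t a, uint64_t b)
{
    return ((a & b) & 1u) != 0;
}
```

For the inputs 0x1 and 0x1 the first function returns false and the second returns true, so an optimizer reasoning from the RTL would predict the opposite of what the hardware computes.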
Re: support single predicate set instructions in GCC-4.1.1
On Tue, 2007-09-25 at 15:13 +0800, 吴曦 wrote: propagate_one_insn), I don't understand why GCC fails the computation of liveness if there is no optimization flag :-(. There is probably something else happening with -O that is recomputing some liveness or CFG info. For instance, the flow2 pass will call split_all_insns and cleanup_cfg, but only with -O. You could try selectively disabling other optimization passes to determine which one is necessary in order for your code to work. Actually, looking closer, I see several of them call update_life_info. regrename for instance has two update_life_info calls. Another possibility here is to try calling recompute_reg_usage instead of doing it yourself. Or maybe calling just update_life_info directly, if you need different flags set. FYI This stuff is all different on mainline since the dataflow merge. I'm assuming you are using gcc-4.2.x. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: support tnat instruction on IA-64. error occurs in bundling. help
吴曦 wrote: [(set_attr itanium_class tnat)])
The itanium_class names are based on info from the Itanium Processor Microprocessor Reference by the way. I believe the problem is that you didn't add info to the DFA scheduler descriptions in the itanium1.md and itanium2.md files for this new instruction class. Normally the DFA scheduler info is optional. However, for itanium, we also use the scheduler for bundling, and hence proper DFA scheduler info for each instruction class is required. It appears that the tnat instruction schedules and bundles the same as the tbit instruction, so just use the existing tbit class instead of trying to add a new one. The docs are a bit unclear here though, since some places mention tbit and tnat, and other places just mention tbit. For your purposes, this isn't important. Modifying the DFA scheduler descriptions is complicated. It is best to avoid that if you can. Specifying that tnat is an I type instruction isn't enough for bundling purposes, since a lot of instructions have further restrictions. In this case, for instance, tnat can only go into an I0 slot, not an I1 slot. This detail is handled in the DFA scheduler descriptions.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: support tnat instruction on IA-64. error occurs in bundling. help
On Wed, 2007-09-26 at 08:11 +0800, 吴曦 wrote: Truly thanks, I have discovered this problem after I sent the first mail, and I found itanium1.md and itanium2.md describe the pipeline hazard, but they are really complex... :-(. Is there any guide or docs on this? thanks There is Itanium microarchitecture documentation available from Intel, and there is general DFA documentation available in the gcc docs. However, there is no documentation specifically for the itanium dfa descriptions. You just have to spend a lot of time looking at them, and eventually they start to make sense. By the way, I didn't write them, and fortunately haven't had to modify them yet. Hopefully I never will. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: support single predicate set instructions in GCC-4.1.1
On Wed, 2007-09-26 at 23:35 +0800, 吴曦 wrote: Thanks, it's the problem of pass_stack_adjustments. pass_stack_adjustments isn't in gcc-4.2.x; it is only on mainline. But the flow stuff you are using isn't on mainline anymore since the dataflow merge. Maybe you are using a month or two old snapshot of mainline? This will limit the help I can provide, since I only have copies of mainline and gcc-4.2.x to look at, neither of which matches what you are working with. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: again problems implementing own target
Michael_fogel wrote: The failing instruction is a call. The compiler loads every symbol reference into a register and calls the function using this. In front of the error this load is missing. In smaller files the compiler uses the reference, which is the way I want it to do it.
We need more info in order to determine what is wrong. Some RTL examples would be nice. Also, a copy of the pattern in your md file that matches call instructions. Also, in which pass does the load symbol instruction disappear? If you compile with -da, you will get RTL dumps after each optimization pass. You can look at them to see where the problem was introduced. There does appear to be a problem with your port, but there is also a way to work around it. If you define NO_FUNCTION_CSE, then gcc will no longer try to optimize function addresses by loading them into pseudos. See the docs for this macro in the doc/tm.texi file.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Scheduling problem - A more detailed explain
袁立威 wrote: So, my question becomes clear: How to solve this problem by making GCC know the data dependencies between mov X = pr (or mov pr = X, -1) and other usage of a specific predicate register (e.g. p6, p7)?
We already have support for these move instructions. See the movdi_internal pattern. Since there are 64 1-bit PR registers, we use a DImode reference to pr0 to represent the entire set of PR registers. Is this the RTL that you are using? Or do you have your own representation? If different, what RTL are you using?
Personally speaking, I think I need to modify itanium1.md or itanium2.md to instruct GCC to notice these dependencies (However, these files look much too complex :-(); or is there any simpler way to get around this problem?
You don't need to touch the itanium1.md or itanium2.md files for this issue. They are only for pipeline info. Dependency info is handled in rtx_needs_barrier and friends in ia64.c. rws_access_reg should be handling this correctly. It uses HARD_REGNO_NREGS to get the number of regs referred to by a reg rtl. So it should return 64 in this case, and then it will iterate over all 64 PR regs when checking for a dependency. So this should work if you are using the correct RTL representation. If you are using a different RTL representation, it won't work.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: again problems implementing own target
Michael_fogel wrote: In my opinion the compiler ran out of registers and was not able to allocate the pseudo register. In this case the compiler has to spill these registers. How is this done in GCC? Is there a way to control it? There are a lot of things that affect this. The main things are constraints in the cpu.md file which specify register classes, and definitions in the cpu.h file which define the register classes. There are also various macros for adjusting costs related to register classes, availability of registers, suitability of registers for reloads, secondary reloads, etc. There is far too much stuff to list here. See the docs, and learn how to read -da RTL dumps. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Scheduling problem - A more detailed explain
吴曦 wrote: Well... Is there anything I miss or forget to do?
Someone needs to step through code in a debugger and try to figure out what is going wrong. I made an initial attempt at that. I hacked gcc a little to try to force a failure with a contrived testcase. The first thing I noticed is that I pointed you at the wrong bit of code. The rws_access_reg code is used for placing stop bits, which is not the problem here. The problem here is in the scheduler.
Unfortunately, this leads to a second problem. How this works in the scheduler depends on what gcc sources you are using. There is the pre-dataflow merge code and the post-dataflow merge code. I don't know which one you are using. From earlier comments, I'm guessing that you are using some snapshot from early to mid summer. It would be helpful to know exactly what sources you are starting from. Worst case, I might also need your gcc patches, if I can't find a way to trigger the failure on my own.
The scheduler dep code is in sched-deps.c. Try looking at sched_analyze_insn. As before, it does use hard_regno_nregs so in theory it should be working. Do we know for sure that the scheduler is failing here? Have you looked at -da RTL dumps to verify which pass is performing the incorrect optimization? Currently, gcc only emits these pr reg group save/restores in the prologue and epilogue, and we have scheduling barriers after/before the prologue/epilogue, so it is possible that there is a latent problem here which has gone unnoticed simply because it is impossible to reproduce with unmodified FSF gcc sources.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: builtin_frame_address for stack pointer
skaller wrote: BTW: what happens on ia64 which has two? stacks? You have to search both stacks for GC roots. Only one stack is visible to normal user code, the regular program stack, and __builtin_frame_address(0) will point there. For the other stack, the backing store, you need some IA-64 specific code to get to it. You can use the __builtin_ia64_bsp() function, or you could write some assembly language code to read the ar.bsp register. Taking a quick look at boehm-gc, I see that there is a glibc hook it uses first, and then if that isn't available it tries to get the value from /proc/self/maps. Those are obviously linux specific solutions. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Problem when build glibc on IA64
袁立威 wrote: I'm a guy working on IA64 and I need to compile glibc with gcc4.2. I tried gcc version 4.2.2 to build glibc 2.5, 2.6 and 2.7, all failed with:
internal compiler error: RTL flag check: INSN_DELETED_P used with unexpected rtx code 'plus' in output_constant_pool_1, at varasm.c: 3393
This is lacking some info about how to reproduce. You didn't mention which file in glibc fails to compile. You didn't say how anything was configured. Etc. Googling for an error isn't always useful, as there are many different ways to trigger some failures. Unfortunately, since you didn't mention which web page you were looking at, I can't comment on that further. I have no idea what you were looking at.
One thing I did notice, however, is that you got an internal consistency checking failure for a release, which is odd, as releases default to --enable-checking=release which disables most of the internal consistency checks. So I tried building a release with --enable-checking=yes and was able to reproduce the problem. The failure happens with top of tree on the gcc-4.2 branch. The failure occurs with the file iconvdata/iso-2022-cn-ext.c. I used top of tree glibc for the test build. I can not reproduce the failure with mainline gcc with this testcase, but the bug still appears to be there.
The problem is in output_constant_pool_1 in varasm.c. There is code that tries to handle a LABEL_REF and also (CONST (PLUS (LABEL_REF) ...)). It does this
tmp = x;
switch (GET_CODE (x))
  {
  case CONST:
    ...
    tmp = XEXP (XEXP (x, 0), 0);
    /* FALLTHRU */
  case LABEL_REF:
    tmp = XEXP (x, 0);
which obviously doesn't work for the CONST case. We end up with a PLUS instead of a CODE_LABEL after falling through. The fix here is to replace every use of x with tmp inside this switch statement. The bug is still present in gcc mainline (4.3), but I don't have a testcase that reproduces it there.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
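The fallthrough bug described above can be reproduced with a toy model in C. Everything here (`toy_rtx`, `toy_code`, the function names) is hypothetical and stands in for GCC's rtx, GET_CODE and XEXP; it is a sketch of the control flow only, not of varasm.c.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_CONST, TOY_PLUS, TOY_LABEL_REF, TOY_CODE_LABEL };

/* Toy expression node: a code plus its first operand (or NULL). */
struct toy_rtx {
    enum toy_code code;
    const struct toy_rtx *op0;
};

/* Buggy shape: the LABEL_REF case reads x, so after falling through
   from CONST it discards the value just stored in tmp. */
enum toy_code label_code_buggy(const struct toy_rtx *x)
{
    const struct toy_rtx *tmp = x;
    switch (x->code) {
    case TOY_CONST:
        tmp = x->op0->op0;      /* the LABEL_REF inside (CONST (PLUS ..)) */
        /* FALLTHRU */
    case TOY_LABEL_REF:
        tmp = x->op0;           /* for CONST this is the PLUS, not the label */
        break;
    default:
        break;
    }
    return tmp->code;
}

/* Fixed shape: every use of x inside the switch is replaced by tmp. */
enum toy_code label_code_fixed(const struct toy_rtx *x)
{
    const struct toy_rtx *tmp = x;
    switch (tmp->code) {
    case TOY_CONST:
        tmp = tmp->op0->op0;
        /* FALLTHRU */
    case TOY_LABEL_REF:
        tmp = tmp->op0;         /* now always the CODE_LABEL */
        break;
    default:
        break;
    }
    return tmp->code;
}

/* Builds (CONST (PLUS (LABEL_REF (CODE_LABEL)))) and runs one variant. */
enum toy_code run_on_const_plus_label(int use_fixed)
{
    struct toy_rtx label = { TOY_CODE_LABEL, NULL };
    struct toy_rtx ref   = { TOY_LABEL_REF, &label };
    struct toy_rtx plus  = { TOY_PLUS, &ref };
    struct toy_rtx cst   = { TOY_CONST, &plus };
    return use_fixed ? label_code_fixed(&cst) : label_code_buggy(&cst);
}
```

The buggy variant ends on the PLUS node, which corresponds to the "unexpected rtx code 'plus'" in the reported error, while the fixed variant ends on the label.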
Re: Can CODE_FOR_$(div$V$I$a3$) ever match?
Kai Henningsen wrote: This is genopinit.c:92 (sdivv_optab) (in revision 127595). I read this as the next mode must be a full integer mode; add a v if it is a float mode. Which is doubly strange as this is the only place where $V is used. Am I missing something here, or is this a bug?
It looks like a bug to me. The sdivv_optab rule should look like the addv/subv/smulv optab rules. Probably an oversight because this code is not immediately next to the rest of the code, and there is no target using this optab, so it went unnoticed. I also can't help but notice that none of the named patterns here are documented. There is no mention of addv, subv, smulv, sdivv, negv, or absv in doc/md.texi.
One more thing, there is no divvsi3/divvdi3 in libgcc, nor in doc/libgcc.texi. This implies that if we fix the genopinit.c bug we may end up exposing a latent bug in libgcc. Hmm, maybe there is no need for sdivv as many targets already trap on divide overflow, or to say it differently, most users accept the fact that integer divide might trap, even without -ftrapv, and hence a target doesn't have to use sdivv for a trapping integer divide. In fact, looking at optabs.c, I see that there is no use of sdivv here, whereas the others listed above are clearly used with -ftrapv. I think we can just delete all sdivv optab support from optabs.c, optabs.h, and genopinit.c, plus the $V operator from genopinit.c.
The code was added here http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00607.html Another interesting point, the ChangeLog entry doesn't mention the addition of $V. Do you want to try writing a patch, or maybe just file a formal bug report?
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: own target: combine emits invalid RTL
Michael_fogel wrote: (ior:SI (subreg:SI (mem/s:QI (reg/f:SI 1250) [0 variable.flags+0 S1 A32]) 0)
See register_operand and general_operand in recog.c. (SUBREG (MEM)) is accepted by register_operand if INSN_SCHEDULING is not defined, for historical reasons. This is something that should be fixed some day. INSN_SCHEDULING is defined if you have any of the instruction scheduling related patterns in your md file. If this is a new port, and you haven't tried to add instruction scheduling support, then INSN_SCHEDULING won't be defined yet. Anyways, this means that the RTL is correct, and we expect reload to fix it. The error from gen_reg_rtx implies that reload is failing, perhaps because of a bug in your port that doesn't handle (SUBREG (MEM)) correctly. There are other legitimate cases where (SUBREG (MEM)) can occur during reload, when you have a subreg of a pseudo that did not get allocated to a hard register for instance, so even if register_operand and general_operand are changed, you still need to find and fix the bug in your port.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Build Failure for gcc-4.3-20071109
Tom Browder wrote: Attached is a log of my build attempt (and the config.log).
There is a config.log file in every directory that gets configured. It looks like you attached the one from the top-level dir which is not where the problem is occurring. The make -j3 makes the output hard to read. You might try a make all build from scratch to get a better look at what is wrong. You can build a host libiberty with make all-libiberty and a target libiberty with make all-target-libiberty. The same is true for other things, e.g. make all-gcc will just build the gcc directory, and make all-target-libstdc++ will build a target libstdc++. But make all works just as well to get back to the error point.
It looks to me as if libiberty is getting compiled and tested Ok, but, for some reason, make reports an error which then prevents a good build completion. These lines in the output are suspect:
/bin/sh: /usr/bin/true: Success
I don't have a /usr/bin/true on my F7 machines. There is a /bin/true. The program true should just return without error and not do anything else.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: How to let GCC produce flat assembly
Li Wang wrote: and execute it. If I want to let GCC produce assembly for it, how should I code the machine description file? Should I first let cc1 produce an ELF assembly for it, and then let binutils truncate it to a flat assembly? It seems like ugly hacking. Thanks.
I don't know what a .com file is, but you can use objcopy from binutils to convert between file formats. You could use objcopy to convert an ELF file into a binary file by using -O binary for instance. See the objcopy documentation. The binary file format may be what you are looking for. If you want code without function calls, then you would have to write a C program without any function calls. Neither gcc nor binutils will help you there.
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: bootstrap failure with rev 130208
Thomas Koenig wrote:
build/genmodes -h tmp-modes.h
/bin/sh: build/genmodes: No such file or directory
Does the file build/genmodes exist? If the file isn't there, then you need to figure out what happened to it. If the file is there, then this might mean that the interpreter for the binary is missing. For an ELF executable, the interpreter is the dynamic linker ld.so. This info is stored in the interp section.
localhost$ objdump --full-contents --section .interp /bin/ls
/bin/ls: file format elf32-i386
Contents of section .interp:
 8048134 2f6c6962 2f6c642d 6c696e75 782e736f /lib/ld-linux.so
 8048144 2e3200 .2.
If the interp section points at a file that is non-existent, then you may get a confusing message when you run the program. A similar problem can occur with shell scripts. E.g. if you have
#!/bin/foo
exit 0
in a script, make it executable, and try to run it directly, you may get a confusing message saying the file does not exist. The message means the interpreter is missing, not the actual shell script. The error message comes from the shell. Bash prints a useful message, but some shells print a confusing one.
localhost$ echo $SHELL
/bin/bash
localhost$ cat tmp.script
#!/bin/foo
exit 0
localhost$ ./tmp.script
bash: ./tmp.script: /bin/foo: bad interpreter: No such file or directory
localhost$ csh
[EMAIL PROTECTED] ~/tmp]$ ./tmp.script
./tmp.script: Command not found.
[EMAIL PROTECTED] ~/tmp]$
-- Jim Wilson, GNU Tools Support, http://www.specifix.com
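The missing-interpreter behaviour described above can be observed from C as well. The sketch below assumes a POSIX system such as Linux with an executable, writable /tmp; the function name and the interpreter path `/nonexistent/interpreter` are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates an executable script whose #! line names a missing interpreter,
   tries to exec it in a child process, and returns the errno value that
   execv() reported there. Returns -1 if the setup itself fails. */
int errno_for_missing_interpreter(void)
{
    char path[] = "/tmp/bad-interp-XXXXXX";
    const char script[] = "#!/nonexistent/interpreter\nexit 0\n";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, script, sizeof script - 1) != (ssize_t)(sizeof script - 1)
        || fchmod(fd, 0700) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        char *argv[] = { path, NULL };
        execv(path, argv);      /* only returns on failure */
        _exit(errno);           /* hand the error code back to the parent */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        unlink(path);
        return -1;
    }
    unlink(path);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The script file itself exists and is executable, yet the kernel reports ENOENT ("No such file or directory") because the interpreter named on the first line is missing; a shell that prints only that error text produces the confusing message discussed above.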
Re: own target: combine emits invalid RTL
Dave Korn wrote: First places to look would be GO_IF_LEGITIMATE_ADDRESS and REG_OK_FOR_BASE_P, wouldn't they? Particularly in conjunction with REG_OK_STRICT. This could be a REG_OK_STRICT issue, but it isn't the usual case of accepting an unallocated pseudo in reload, as we have a SUBREG MEM here. Probably there is code in the backend that assumes SUBREG can only contain a REG, which is incorrect. SUBREG can also contain a MEM. You need to check to make sure. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: bootstrap failure with rev 130208
Thomas Koenig wrote: On Thu, 2007-11-15 at 17:42 -0800, Jim Wilson wrote: Thomas Koenig wrote: build/genmodes -h tmp-modes.h /bin/sh: build/genmodes: No such file or directory Your problem is that you accidentally ran ../gcc/gcc/configure instead of ../gcc/configure. However, why it fails this way has me baffled. I can easily reproduce it though. The rule to build genmodes by the way is build/gen%$(build_exeext): build/gen%.o $(BUILD_LIBDEPS) $(CC_FOR_BUILD) $(BUILD_CFLAGS) $(BUILD_LDFLAGS) -o $@ \ $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS) This is some GNU make magic. For some reason, make finds this rule if gcc is correctly configured, but can't find it if gcc is incorrectly configured. I looked at make -d output; I didn't find it helpful. For some reason the search order for matching patterns is different, but I see no clue why. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: hard_regno_nregs == 0 ?
DJ Delorie wrote: Which assumption is wrong? That hard_regno_nregs can be zero (m32c), or that hard_regno_nregs will never be zero (rtlanal)? I would say the m32c port is wrong. HARD_REGNO_MODE_OK indicates whether a register can hold a mode. HARD_REGNO_NREGS indicates how many registers we need to hold a value. This is a number that can be larger than 1 if the value is larger than one register, but it makes no sense for it to be zero, as no (non-void) value can ever be held in zero registers. The number of registers needed is irrespective of whether the register can actually hold the value, as that is specified by HARD_REGNO_MODE_OK. There are lots of places that use HARD_REGNO_NREGS in division/modulus operations. It would be complicated to fix them all to handle a zero value. However, as Ian mentioned, there does seem to be something else wrong here, as it seems odd that you have an invalid subreg being passed in here. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
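To illustrate why a zero register count is troublesome, here is a small sketch. It is not GCC code: regs_needed and word_within_value are hypothetical stand-ins for HARD_REGNO_NREGS and for the kind of caller that divides or takes a modulus by it.

```c
/* Hypothetical model of HARD_REGNO_NREGS: the number of registers of
   BYTES_PER_REG bytes needed to hold a value of SIZE_IN_BYTES bytes.
   The result is rounded up and is never zero.  */
unsigned
regs_needed (unsigned size_in_bytes, unsigned bytes_per_reg)
{
  unsigned n = (size_in_bytes + bytes_per_reg - 1) / bytes_per_reg;
  return n ? n : 1;
}

/* Hypothetical caller: which register of a multi-register value a
   given word index falls into.  The modulus would be a division by
   zero if regs_needed could return 0.  */
unsigned
word_within_value (unsigned word_index, unsigned size_in_bytes,
                   unsigned bytes_per_reg)
{
  return word_index % regs_needed (size_in_bytes, bytes_per_reg);
}
```

With 2-byte registers, a 1-byte value still needs one register, which is the case at issue here: whether that register can actually hold the mode is a separate question.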
Re: hard_regno_nregs == 0 ?
On Wed, 2008-01-09 at 15:38 -0500, DJ Delorie wrote: [EMAIL PROTECTED] This macro must never return zero, even if a register +cannot hold the requested mode - indicate that with HARD_REGNO_MODE_OK +and/or CANNOT_CHANGE_MODE_CLASS instead. I think that HARD_REGNO_NREGS should not be returning zero, but I also think it is a moot point, as we shouldn't be using it on invalid subregs. Hence I like Paul's wording better than yours, but it appears not to be sufficient for your case, as your m32c is only fixing HARD_REGNO_NREGS to not return zero. It appears that the m32c port does define both HARD_REGNO_MODE_OK and CANNOT_CHANGE_MODE_CLASS, so it isn't clear what the problem is. I suspect that there may be another problem here. Unfortunately, I don't have a testcase I can use to look at this myself. Maybe if we found and fixed this other problem, Paul's wording would be correct again. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: hard_regno_nregs == 0 ?
On Thu, 2008-01-10 at 19:16 -0500, DJ Delorie wrote: IIRC, the bug happened building either libgcc or newlib. If you want to revert my latest patch in a local source tree and just try a build, it's likely to show you an example ;-) It was unwind-dw2.c in the m32cm libgcc multilib. The problem lies with the var-tracking pass. It sees a QImode variable being held in an HImode hard-reg, and decides to track only the low QImode part. This happens in track_loc_p, when it decides to use DECL_MODE instead of the hard reg mode. We then have to create RTL to represent the part that we are tracking, which means we have to create an invalid QImode lowpart of a hard reg that can't hold QImode values. Since we will only use this for emitting debug info this seems perfectly harmless. Unfortunately, this does mean that HARD_REGNO_NREGS must work correctly for invalid register modes. We need that info in order to construct the lowpart/subreg. Nowhere does var-tracking check to see if it is creating valid register references, nor do I think this is necessary. Hence, I now believe that your suggested doc change is correct, and is OK to check in to mainline. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: gcc using wrong as
Antoine Kaufmann wrote: after some small changes. But when I tried to compile a simple Hello World, gcc failed with some assembler error messages about illegal operand size with some push instructions. As I guessed they seem to come from the host 64bit as. I compared the error messages by trying to assemble the code manually. You must use the exact same prefix option when configuring and installing binutils as you use when configuring and installing gcc. If you do this, then gcc will find the right as automatically. If you don't then the gcc driver will need help to find it. The gcc driver will just search for a program called as in a pre-defined list of directories. So you need to make sure it will find one there before it finds the one in /usr/bin/as. The canonical place for this is in $prefix/$target/bin/as. Or you could put it in the same directory where cc1 got installed; I think that works too. The fact that configure found the cross assembler is irrelevant. This is only used during the build process. It has no effect on where gcc looks for the assembler at run time. Adding the -v option to a gcc run will show some useful info about what is going on. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Invalid address after reload
Vander Aa Tom wrote: This fails the legitimate address constraint since I'm not allowing a mem inside another mem. Sounds like a REG_OK_STRICT bug. GO_IF_LEGITIMATE_ADDRESS should accept a pseudo-reg when !REG_OK_STRICT, and should reject a pseudo-reg when REG_OK_STRICT. In reload, an unallocated pseudo-reg is really a memory reference in disguise. This is typically handled by having two different versions of macros like REG_OK_FOR_BASE_P. GO_IF_LEGITIMATE_ADDRESS then uses these macros for its checks. This is a common mistake in first time ports. See also strict_memory_address_p in reload.c, and memory_address_p in recog.c. Jim
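The strict/non-strict split can be modelled outside of GCC. The sketch below is an assumption-laden toy, not a real port: the register numbering (hard registers 0..15, base registers 8..15) and the function name are hypothetical, and it ignores reg_renumber, which a real strict check consults for pseudos that did get a hard register.

```c
#include <stdbool.h>

/* Hypothetical target: hard registers are 0..15, and only 8..15 may be
   used as base registers.  Register numbers from FIRST_PSEUDO upward
   are pseudo-registers.  */
enum { FIRST_BASE_REG = 8, LAST_BASE_REG = 15, FIRST_PSEUDO = 16 };

/* Model of a REG_OK_FOR_BASE_P-style check.  When STRICT is false
   (before reload), any pseudo is accepted because it may still be
   allocated to a base register.  When STRICT is true (during and
   after reload), an unallocated pseudo is really a stack slot, so it
   is rejected.  */
bool
reg_ok_for_base_p (unsigned regno, bool strict)
{
  if (regno >= FIRST_PSEUDO)
    return !strict;
  return regno >= FIRST_BASE_REG && regno <= LAST_BASE_REG;
}
```

An address check built on top of this accepts (mem (reg pseudo)) before reload and rejects it in strict mode, which is what forces reload to fix the address.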
Re: GCC build problem
Dasarath Weeratunge wrote: I added a field to struct basic_block_def in basic-block.h of type struct pointer_set_t. Now when I try to rebuild GCC build fails with error: libbackend.a(gtype-desc.o): In function `gt_pch_nx_basic_block_def': /scratch/dweeratu/gcc/build/gcc/gtype-desc.c:2472: undefined reference to `gt_pch_nx_pointer_set_t' See http://gcc.gnu.org/onlinedocs/gccint/Type-Information.html#Type-Information See also the gengtype.c file, and the doc/gty.texi file. I'm just guessing, but you probably either need to add a GTY marker to the struct pointer_set_t definition, or else add a GTY skip marker to the new field in struct basic_block_def that uses the pointer_set_t type. It depends on whether this info will be allocated in the garbage collection space (e.g. ggc_alloc). Jim
Re: Injecting data declarations?
Reuben Harris wrote: I would like to modify GCC to inject a link-once word-sized data declaration into the object file, i.e. to behave AS IF there were extra declarations in the source code, e.g.: Builtin functions are a good source for how to create decls, but they create mostly type decls. See for instance c_register_builtin_type in c-common.c. The profilers are another good choice, and they create variables. See for instance tree_init_ic_make_global_vars in tree-profile.c. These aren't link-once variables though. I think all you need for that is a call to make_decl_one_only, but you might want to look for examples. Jim
Re: Seg fault in call_gmon_start
Desineni, Harikishan wrote: I just compiled an app with GCC. It is segmentation faulting in call_gmon_start (even before reaching main() of my program). Gcc usage questions should not be sent to the gcc list. This list is for people doing development work on gcc. This is an appropriate question for the gcc-help list. call_gmon_start will probably do nothing unless you compiled with -pg. The important bit here is that your program is apparently failing on the very first instruction it executes, as call_gmon_start is probably the first function in the init section, which contains initializers run before calling main. So there is apparently something seriously wrong with your setup, e.g. the kernel isn't constructing the process stack correctly, or the process stack is being mapped to non-existent memory, or something else unusual is going wrong. There may not be anything wrong with gcc itself. call_gmon_start comes from glibc by the way. Look there for more info. Jim
Re: RTL definition
Fran Baena wrote: RTL represents a low-level, machine-independent language. But I didn't find any specification of the language represented. That is, I found no document where the language represented was described or defined by a grammar. RTL isn't a programming language, and hence has no grammar. It is merely one of the internal representations that gcc uses for optimizing and generating code. We do have gcc internals documentation that you have already been pointed at. Jim
Re: -B vs Multilib
Greg Schafer wrote: Currently, -B doesn't add the multilib search paths when processing startfile_prefixes. For example, -B $prefix/lib/ doesn't find startfiles in $prefix/lib/../lib64 GCC has two different schemes for multilib search dirs. One that is used in the gcc build tree, and one that is used in the OS install dirs. See the difference between multilib_dir and multilib_os_dir. We are doing multilib searches for startfile_prefix. See find_file. However, we are using the gcc build tree form. This is necessary because -B options are used during the gcc build. Perhaps what you need is to make -B options work both ways, so that they are also useful when pointing into OS dirs like /lib. Or maybe we need yet another option which is like -B but meant to be used in OS dirs instead of in the gcc build tree. Jim
Re: insn appears multiple times
Boris Boesler wrote: insn 381 appears in the delay slot and later in another basic block (but same function). These insns are equal but they are not the same, two disjunct pieces of memory. Is this possible? Yes. Reorg calls copy_rtx to avoid having shared RTL. Unsharing the insns means that we can do things like set flag bits in one without accidentally changing the other. Jim
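The difference between "equal" and "the same" can be shown with a toy structure. This is only a sketch: struct insn_model and both functions are hypothetical and are far simpler than real RTL or copy_rtx.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an insn: a pattern plus one flag bit.  */
struct insn_model
{
  int pattern;
  int operand;
  unsigned flag : 1;
};

/* Return a freshly allocated copy of ORIG, or NULL if out of memory.
   The copy has equal contents but lives in its own memory.  */
struct insn_model *
copy_insn_model (const struct insn_model *orig)
{
  struct insn_model *copy = malloc (sizeof *copy);
  if (copy)
    *copy = *orig;
  return copy;
}

/* Two insns are "equal" if they describe the same operation, whether
   or not they are the same object.  */
int
insns_equal (const struct insn_model *a, const struct insn_model *b)
{
  return a->pattern == b->pattern && a->operand == b->operand;
}
```

Setting the flag on the copy leaves the original untouched, which is the point of unsharing.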
Re: Basic block infrastructure after dbr pass
Boris Boesler wrote: But some basic blocks seem to point to insns which are not in the insn-list. I had a short look at dbr_schedule() in reorg.c and the basic blocks are not updated. Are they evaluated in a later pass? No. See pass_free_cfg, which is the third pass before pass_delay_slots. Jim
Re: xscale-elf-gcc: compilation of header file requested
Ajit Mittal wrote: This command $(CC) -M $(HOST_CFLAGS) $(CPPFLAGS) -MQ $@ include/common.h [EMAIL PROTECTED] xscale-elf-gcc: compilation of header file requested Looks like an old bug fixed long ago, sometime around the gcc-3.3 time frame. You should always include the gcc version number in a bug report. Jim
Re: DFA state and arc explosion
Bingfeng Mei wrote: However, if I also want to model the resource for writing back register file, the number of states and arcs just explodes. It is especially true for long pipeline instruction. The usual solution is to have two DFAs, one used for most instructions, and one used just for the long pipeline instructions. See for instance the gcc/config/mips/sb1.md file which has the sb1_cpu_div automaton which is only used for divide instructions. DFA stands for deterministic finite automaton, which is a type of finite state machine. NDFA is non-deterministic finite automaton. Any good book on the theory of computing should cover this. Jim
Re: Basic block infrastructure after dbr pass
Boris Boesler wrote: The following code generators use FOR_EACH_BB[_REVERSE] in the target machine dependent reorg pass: - bfin - frv - ia64 - mt - s390 The very first thing that ia64_reorg does is compute_bb_for_insn (); Just taking a quick look, I don't see any bb usage in the s390 port, and I see bfin and frv call compute_bb_for_insn also. mt isn't using the full bb-insn mapping table, it is only using the bb tail, so it is probably not broken. I think most basic block info should still be there at the end, it is the bb-insn mapping table that disappears. See the code for pass_free_cfg. This table can be recomputed if you need it. Jim
Re: Different *CFLAGS in gcc/Makefile.in
Basile STARYNKEVITCH wrote: It is indeed the easiest. But for X_CFLAGS T_CFLAGS I only found the comment # These exists to be overridden by the x-* and t-* files, respectively. t-* files are target makefile fragments. x-* files are (cross)-host makefile fragments. See config.gcc and config.host respectively. You can find example uses by grepping in config/*/*. and for XCFLAGS # XCFLAGS is used for most compilations but not when using the GCC just built. XCFLAGS is apparently obsolete and unused. Looks like the last use was removed in 2004 from rs6000/x-darwin. I see that libgomp is using XCFLAGS, but I think it is a different makefile variable than the gcc one. By the way, X_CFLAGS and XCFLAGS are documented in doc/fragments.texi. T_CFLAGS docs are missing there though. You can find all sorts of stuff if you grep the entire gcc source tree. Jim
Re: Basic block infrastructure after dbr pass
Boris Boesler wrote: I haven't specified my problem properly? If I traverse basic blocks via FOR_EACH_BB (used in compute_bb_for_insn, too) I get insns which are not in the insn-stream for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) .. As Ian mentioned, the delay-slot filling pass does not update the CFG. So you can't use it after this pass, at least for any target that has delay slots. As I mentioned, the machine dependent reorg passes are the very first passes that are run after the last pass that updates the CFG. Hence, the CFG is still valid when machine dependent reorg runs, and it can still be used there if you are careful. You really need to look at the order of passes defined in passes.c, particularly the pass_free_cfg I pointed you at earlier. The CFG is always valid before this point. The CFG is not valid after this point, if you run any optimization pass that moves instructions around, which includes almost all of them that run after this point. None of these passes try to maintain the CFG info. md_reorg is a special case because it is the first one run after we stop maintaining the CFG, so the residual info is still usable if you are careful. Jim
Re: Is vec_initmode allowed to FAIL?
Jan Hoogerbrugge wrote: I see however that no code is generated if trimedia_expand_vector_init() returns 0 and the define_expand FAILs. I also see in other targets that a vec_init always ends with a DONE. Could it be that vec_init is not allowed to FAIL? Grep for vec_init, and we see that it is used in expr.c, store_constructor(), case VECTOR_TYPE. It calls optab_handler to set icode, and then at the end calls GEN_FCN (icode). There is no provision for the GEN_FCN (icode) call returning failure, so yes, vec_init is not allowed to fail. This could probably be fixed with a little code rearrangement. For instance, you could try putting all code starting with the line that tests need_to_clear && size > 0 && !vector into a loop that iterates twice. If vector is not set, then we exit the loop after emitting code. If vector is set, and the GEN_FCN call succeeded, then we exit the loop. If vector is set, and the GEN_FCN call fails, then we clear vector, throw away any RTL that may have been emitted, and loop back for a second pass using the non-vector code, which always succeeds. Throwing away the temporary RTL can be accomplished by using start_sequence and end_sequence. There may also be other ways to fix this. Jim
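The retry loop suggested above can be sketched in isolation. This is a toy model, not a patch to store_constructor: the gen_fn type, expand_init and the two sample generators are all hypothetical, and a local insn counter stands in for the sequence that start_sequence and end_sequence would manage.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generator: tries to emit vector-init code, records how
   many insns it emitted in *EMITTED, and returns false on failure.  */
typedef bool (*gen_fn) (int *emitted);

/* Sample generators used to exercise the loop.  */
bool
gen_succeeds (int *emitted)
{
  *emitted = 1;
  return true;
}

bool
gen_always_fails (int *emitted)
{
  *emitted = 2;   /* Partial output that the caller must discard.  */
  return false;
}

/* Emit initialization code and return the number of insns kept.
   First try TRY_VECTOR if there is one; if it fails, throw its output
   away and fall back to the scalar path, which always succeeds and
   costs SCALAR_INSNS.  *USED_VECTOR reports which path was taken.  */
int
expand_init (gen_fn try_vector, int scalar_insns, bool *used_vector)
{
  bool vector = (try_vector != NULL);

  for (;;)
    {
      int emitted = 0;   /* Fresh, empty sequence for this pass.  */

      if (vector)
        {
          if (try_vector (&emitted))
            {
              *used_vector = true;
              return emitted;
            }
          /* Failure: EMITTED is dropped at the next iteration, and the
             second pass uses the non-vector code.  */
          vector = false;
          continue;
        }

      emitted = scalar_insns;
      *used_vector = false;
      return emitted;
    }
}
```

The loop runs at most twice, matching the description: once for the vector attempt and once for the fallback.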
Re: GCC 4.3.0 compilation error
Wirawan Purwanto wrote: I tried to compile GCC 4.3.0 on a Red Hat Linux 9.0 box, it stopped at stage 1: Compiling new gcc versions on old linux versions may not always work, and is unlikely to be fixed. You are probably on your own here if you run into a non-trivial problem. ../../gcc-4.3.0/gcc/config/host-linux.c: In function `linux_gt_pch_use_address': ../../gcc-4.3.0/gcc/config/host-linux.c:207: `SSIZE_MAX' undeclared (first use in this function) GCC is expecting SSIZE_MAX to be defined by the glibc headers, which is true with current versions of glibc. Assuming ssize_t is a long, then SSIZE_MAX should be LONG_MAX. Jim
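A local workaround along the lines described above might look like the following sketch. It assumes, as stated, that ssize_t is a long on the host; the helper fits_in_ssize_t is hypothetical and only there to show the macro in use.

```c
#include <limits.h>
#include <stddef.h>

/* Fallback for old C libraries whose headers do not define SSIZE_MAX.
   This assumes ssize_t is a long on the host.  */
#ifndef SSIZE_MAX
# define SSIZE_MAX LONG_MAX
#endif

/* Hypothetical helper: nonzero if N bytes can be represented as a
   non-negative ssize_t value.  */
int
fits_in_ssize_t (size_t n)
{
  return n <= (size_t) SSIZE_MAX;
}
```

On a host where the headers already define SSIZE_MAX, the fallback is skipped and the system value is used.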
Re: [RFH] Uninitialized warning as error is disabled on the trunk
Andrew Pinski wrote: /src/gcc/local/gcc/gcc/cp/pt.c: In function 'subst_copy': /src/gcc/local/gcc/gcc/cp/pt.c:9919: warning: 'len' may be used uninitialized in this function This was introduced by your patch here: http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01675.html Please suggest a fix. Jim
Re: Implementing a restrictive addressing mode for a gcc port
Mohamed Shafi wrote: For the source or the destination register Rd/Ra, the restriction is that it should be one more than the base register . So the following instructions are valid: GCC doesn't provide any easy way for the source address to depend on the destination address, or vice versa. One thing you could try is generating a double-word pseudo-reg at RTL expand time, and then using subreg 0 for the source and subreg 1 for the dest (or vice versa depending on endianness/word order). This will get you a register pair you can use from the register allocator. This doesn't help at reload time though. You probably have to define a constraint for every register, and then write an alternative for every register pair matching the correct even register with the correct odd register. That gets you past reload. Another alternative might be to have a pattern (e.g. reload_insi) that generates secondary reloads that request the register pair you need. This is unlikely to get good code, but would not be hard to do. Jim
Re: genattrtab segfault on RH 7.3 (powerpc cross)
Sergei Poselenov wrote: I'm building a powerpc cross of gcc-4.2.2 on RH 7.2 host and ran into this: RHL 7.2 is very old. It is unlikely that we can help you here. The bug is very hard to reproduce; on FC6 I was unable to reproduce it after running a test loop overnight. If the bug isn't reproducible, then it usually isn't a gcc problem. There are some rare cases where a bug is only reproducible with the right environment. Different sized environments can cause changes in how stack frames get allocated, which can result in different behavior in the presence of stray pointer reads/writes. If it fails inside make, but not from the command line, it could be such a case. Jim
Re: GCC : how to add VFPU to PSP Allegrex (MIPS target) ?
Christophe Avoinne wrote: * How can I make coexist the SF mode between the FPU registers and the VFPU registers in the argument list of a function ? You probably don't want to use VFPU registers for argument passing. That will complicate the ABI. If you really do, then you need two different float types, which in turn may cause other problems. * Another way to distinguish a VFPU scalar is to use typedef float __attribute__((vector_size(4))) V1SF;. Is that difficult to make it possible (right now, gcc refuses it) ? See the code in make_vector_modes in genmodes.c. /* Do not construct vector modes with only one element, or vector modes where the element size doesn't divide the full size evenly. */ That is only the first step though. I'm not sure what the rest of gcc will do for single element vectors. * Same question for V3SF, is that difficult to make it possible ? Not obvious what the problem is here, seems like it should already be possible to do this. first 32 registers, would it be difficult to have combined V2V1SF, V3V1SF, V4V1SF to define column vectors of two, three or four components ? and to have V2V2SF, V3V3SF, V4V4SF as matrixes ? The gcc support was designed to handle SIMD instructions. There is some support for stride on memory accesses, but not within the register set. I don't think there is any way to support column or matrix vectors at this time. You might get better responses if you put something obvious like autovectorization in the subject line. The people doing this work might not have realized that this message was relevant to them. Jim
Re: Implementing a restrictive addressing mode for a gcc port
On Tue, 2008-04-01 at 09:48 +0530, Mohamed Shafi wrote: What i did was to have 8 register class with each class having two registers, an even register and an odd register then in define expand look for the register indirect with offset addressing mode and emit gen_store_offset or gen_load_offset pattern if the addressing mode is found. This sounds similar to what I suggested, so it may work. However, having a separate pattern for certain kinds of loads/stores may not work. reload doesn't re-recognize an insn while it is fixing it, hence you need to have a single movsi (or whatever) pattern that can handle any kind of operand. If you have a movsi pattern that doesn't accept load/store offset, then probably what will happen is that any fp +offset addresses generated during reload will get reloaded into a register, and you may not get very good code. Of course, with your address register restrictions, there is only one register that can be used with fp+offset, so you might not get good code for that regardless of what you do. Jim
Re: gcc4.3 configuring problems with mpfr
Swapna Pawar wrote: configure:4542: checking for correct version of mpfr.h configure:4573: i386-pc-mingw32msvc-gcc -o conftest.exe -O2 -I/home/manjunathm1/gmp/prefix/include -I/home/manjunathm1/mpfr/prefix/include conftest.c -L/home/manjunathm1/gmp/prefix/lib -L/home/manjunathm1/mpfr/prefix/lib -lmpfr -lgmp >&5 /tmp/cc9dzbXZ.o(.text+0x23):conftest.c: undefined reference to `mpfr_init' Are the -I and -L options correct? Run the command by hand with --save-temps and look at the .i file and verify that the right header files got included. Run the command by hand with -Wl,--verbose and verify that the right libraries got linked in. Look at the installed libraries and make sure that they contain the functions in question, i.e. make sure that they were built and installed correctly. Maybe try installing gmp and mpfr in the same place instead of in two different places? Jim
Re: Copyright assignment wiki page
FX Coudert wrote: Moreover, our contribute page says the GCC maintainer that is taking care of your contributions and there is no documentation to maintainers, so that part at least is wrong: maintainers don't know what to do. Or else, I just didn't receive the maintainer welcome package including the appropriate documentation :) See http://www.gnu.org/prep/maintain/ http://www.gnu.org/prep/standards/ The FSF defines maintainer a bit differently than the gcc project. In the FSF view, the GCC SC is the maintainer of GCC, and no one else. When the FSF appoints a maintainer for a package, they usually do point the new maintainer at the appropriate documents, but gcc has so many sub-maintainers(?) that not all of them get the info they need. Note that these documents assume you have an account at the FSF, which true maintainers do, but most gcc sub-maintainers(?) don't. More fundamentally, I think it's a strange pattern when an open-source project makes the path harder for our contributors. This is unfortunately a problem with the US legal system, not a problem with the FSF. I'd point out that other GNU (or not) projects have the same form on their website, e.g. Other people making mistakes does not give us permission to make the same mistake. By the way, the main reason why the FSF doesn't want the forms up for public view is that most people who grab them from there will fill them out wrong. This just creates more work for the FSF. Hence the FSF generally prefers that you just point people at them first, and then the FSF will fill out the necessary forms to ensure that they are done right. Jim
Re: Doubt about filling delay slot
Mohamed Shafi wrote: 'liu' will load the immediate value into the upper byte of the specified register. The lower byte of the register is unaffected. The liu pattern should be something like (set (regX) (ior:HI (and:HI (regX) (const_int 255)) (const_int Y))) Jim
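In C terms, the suggested pattern computes the following. This is a sketch under one assumption: the constant Y in the RTL is taken to be the 8-bit immediate already shifted into the upper byte, so the function below does the shift explicitly. The function name is just the mnemonic from the question.

```c
#include <stdint.h>

/* Model of "load immediate upper" on a 16-bit register: keep the low
   byte of REG (the "and" with 255) and replace the upper byte with
   IMM (the "ior" with the shifted constant).  */
uint16_t
liu (uint16_t reg, uint8_t imm)
{
  return (uint16_t) ((reg & 0x00ffu) | ((unsigned) imm << 8));
}
```

For example, applying liu with immediate 0xab to a register holding 0x1234 gives 0xab34: the low byte 0x34 is preserved.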
Re: Problem with reloading in a new backend...
Stelian Pop wrote: #define PREFERRED_RELOAD_CLASS(X, CLASS)\ ((CONSTANT_P(X)) ? EIGHT_REGS : \ (MEM_P(X)) ? EVEN_REGS : CLASS) #define PREFERRED_OUTPUT_RELOAD_CLASS(X, CLASS) \ ((CONSTANT_P(X)) ? EIGHT_REGS : \ (MEM_P(X)) ? EVEN_REGS : CLASS) I think most of your trouble is here. Suppose we are trying to reload a constant into an even-reg. We call PREFERRED_RELOAD_CLASS, which says to use eight_regs instead, and you get a fatal_insn error because you didn't get the even-reg that the instruction needed. PREFERRED_RELOAD_CLASS must always return a class that is a subset of the class that was passed in (possibly that class itself). So define another register class which is the intersection of eight regs and even regs, and when we call PREFERRED_RELOAD_CLASS with a constant and even regs, then return the eight/even intersection class. Likewise in all of the other cases you are trying to handle. Fix this problem, and you probably don't need most of the other changes you have made recently. Jim
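The subset rule is easy to see if register classes are modelled as bit masks. Everything in this sketch is hypothetical: the masks assume EIGHT_REGS means r0..r7 and EVEN_REGS means the even-numbered registers of a 16-register file, which may not match the actual port, and the functions are not GCC interfaces.

```c
/* Hypothetical register classes as bit masks over r0..r15.  */
enum
{
  EIGHT_REGS_MASK = 0x00ff,                      /* r0..r7 */
  EVEN_REGS_MASK = 0x5555,                       /* r0, r2, ..., r14 */
  EIGHT_EVEN_REGS_MASK = 0x00ff & 0x5555,        /* r0, r2, r4, r6 */
  ALL_REGS_MASK = 0xffff
};

/* Nonzero if every register in A is also in B.  */
int
class_subset_p (unsigned a, unsigned b)
{
  return (a & ~b) == 0;
}

/* Model of a PREFERRED_RELOAD_CLASS that never leaves RCLASS: for a
   constant, narrow RCLASS to its intersection with EIGHT_REGS if that
   intersection is non-empty; otherwise return RCLASS unchanged.  */
unsigned
preferred_reload_class (int is_constant, unsigned rclass)
{
  if (is_constant)
    {
      unsigned narrowed = rclass & EIGHT_REGS_MASK;
      if (narrowed)
        return narrowed;
    }
  return rclass;
}
```

Asked for a constant in EVEN_REGS, this returns the eight/even intersection, so the insn still gets an even register. In a real port the intersection has to exist as a named register class, since the macro returns a class and not a mask.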
Re: Where is scheduling going wrong? - GCC-4.1.2
Mohamed Shafi wrote: This looks like reordering is proper. When schedule-insn2 is run for the above region/block the number of instructions in the region (rgn_n_insns) is 3. Maybe bb reorder got the basic block structure wrong, and the barrier is not supposed to be part of the basic block. In fact, looking at bb-reorder.c, I see this /* Make BB_END for cur_bb be the jump instruction (NOT the barrier instruction at the end of the sequence...). */ BB_END (cur_bb) = jump_insn; Maybe it is getting this wrong someplace else in the file. This might already be fixed in current sources, so you could try looking for an existing patch to fix it. Jim
Re: address taken problem
Dasarath Weeratunge wrote: In the following code I marked the tree 'node.0' as address taken using 'c_mark_addressable'. Now in the assembly code, isn't the return value of the second call to malloc completely discarded? c_mark_addressable is meant to be called during parsing. It may affect the IL that will be generated, so you can't expect to be able to call it during optimization. You haven't given us a copy of your patch, or the C code for your testcase. It is difficult to say much definitive when we are lacking detailed info about what you are doing. Jim
Re: Problem with reloading in a new backend...
On Sat, 2008-04-12 at 00:06 +0200, Stelian Pop wrote: I will still have the problems with the fact that my indirect addressing doesn't allow displacements, no ? (so I would need to implement LEGITIMIZE_RELOAD_ADDRESS, in which I'll need a special reserved register to compute the full address by adding the base and the displacement). Or do you imply that I won't need this anymore ? I didn't see an obvious explanation for your troubles here. There are other targets like IA-64 that do not have base+offset addressing modes. It should just work. LEGITIMIZE_RELOAD_ADDRESS is a hack. It should never be necessary for correct code generation, in theory, though I think there are some rare corner cases where it may be required for correct results. Long term this is something that should be fixed. Meanwhile, I can point out that the IA-64 port does not define LEGITIMIZE_RELOAD_ADDRESS, and does not have base+offset addressing modes, and it works. IA-64 does have auto-increment addressing modes, but those shouldn't matter here. Reload won't generate such addressing modes for a stack slot reference. Jim
Re: Problem with reloading in a new backend...
Stelian Pop wrote: I will still have the problems with the fact that my indirect addressing doesn't allow displacements, no ? (so I would need to implement LEGITIMIZE_RELOAD_ADDRESS, in which I'll need a special reserved register to compute the full address by adding the base and the displacement). Or do you imply that I won't need this anymore ? In your original message, the right thing happens. The stack slot address gets reloaded. (insn 1117 16 1118 2 ../../../../src/gcc-4.3.0/libgcc/../gcc/libgcc2.c:1090 (set (reg:QI 10 r10) (const_int 24 [0x18])) 1 {*movqi_imm} (nil)) (insn 1118 1117 1119 2 ../../../../src/gcc-4.3.0/libgcc/../gcc/libgcc2.c:1090 (set (reg:QI 10 r10) (plus:QI (reg:QI 10 r10) (reg/f:QI 30 r30))) 13 {addqi3} (expr_list:REG_EQUIV (plus:QI (reg/f:QI 30 r30) (const_int 24 [0x18])) (nil))) (insn 1119 1118 21 2 ../../../../src/gcc-4.3.0/libgcc/../gcc/libgcc2.c:1090 (set (mem/c:QI (reg:QI 10 r10) [31 S1 A16]) (reg:QI 14 r14)) 8 {*movqi_tomem} (nil)) The only problem here is that the register choice is wrong, and I already explained why it is wrong. It is only after you started making some changes that this apparently broke. It isn't clear why. I can only suggest that you try to debug it. Put a breakpoint in find_reloads() conditional on the instruction number. Then step through to see what happens. It should call find_reloads_address or find_reloads_toplev at some point, which will see that we don't have a strictly valid address, and then reload it. Jim
Re: Where is scheduling going wrong? - GCC-4.1.2
On Sun, 2008-04-13 at 17:05 +0530, Mohamed Shafi wrote: Well I tracked down the cause to the md file. In the md file I had a define_expand for the jump pattern. Inside the pattern I was checking whether the value of the offset for the jump is out of range and if it's out of range then force the offset into a register and emit indirect_jump. Though this didn't work, every time an unconditional jump was being emitted a barrier was also being emitted. It looks like in define_expand for jump we should consider all the cases and emit DONE for all the cases, if you are considering any case, otherwise a barrier will be generated for cases not covered in DONE. Am I right? Sorry, I don't understand what the problem is. We always emit a barrier after an unconditional branch. Whether or not you call DONE inside the jump pattern is irrelevant. Also, whether you emit a PC-relative branch or an indirect branch is irrelevant. The following link is the reply from Ian for a query of mine regarding scheduling. http://gcc.gnu.org/ml/gcc/2008-04/msg00245.html After reading this, I feel that gcc should have looked for barrier insn while scheduling and should have given an ICE if it found one. True, barrier got into the instruction stream because of my mistake, but then that's what ICEs are for. Then again I might be wrong about this. We do have consistency checks for many problems with the RTL, but it isn't possible to catch all of them all of the time. P.S. I am still searching for a solution to choose between jump and indirect_jump pattern when the offset is out of range. http://gcc.gnu.org/ml/gcc/2008-04/msg00290.html Maybe you can help with that. This is what the shorten_branches optimization pass is for. Define a length attribute that says how long a branch is for each offset to the target label. Then when emitting assembly language code, you can choose the correct instruction to emit based on the instruction length computed by the shorten branches pass. 
If you need to allocate a register, that gets a bit tricky, but there are various solutions. See the sh, mips16 (mips) and thumb (arm) ports for ideas. Jim
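The length-based choice can be sketched as two small functions. All numbers and names here are hypothetical: the sketch assumes a 2-byte short branch that reaches offsets from -128 to 127 bytes and a 6-byte long sequence for everything else, which will differ on a real target.

```c
/* Which form of the branch to output.  */
enum branch_kind { BRANCH_SHORT, BRANCH_LONG };

/* Hypothetical "length" attribute: 2 bytes if OFFSET is within reach
   of the short branch, 6 bytes for the long sequence otherwise.  */
int
branch_length (long offset)
{
  return (offset >= -128 && offset <= 127) ? 2 : 6;
}

/* At output time, pick the instruction form from the length that was
   computed earlier, rather than re-deriving the offset.  */
enum branch_kind
branch_for_length (int length)
{
  return length == 2 ? BRANCH_SHORT : BRANCH_LONG;
}
```

In a port, the first function corresponds to the length attribute in the md file and the second to the output code of the jump pattern, which inspects the computed length of the insn.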
Re: A query regarding the implementation of pragmas
Mohamed Shafi wrote: For a function call will I be able to implement long call/short call for the same function at different locations? Say fun1 calls bar and fun2 calls bar. I want short-call to be generated for bar in fun1 and long-call to be generated in fun2. Is it possible to implement this in the back-end using pragmas? A simple grep command shows that both arm and rs6000 already support long call pragmas. Jim
Re: Problem with reloading in a new backend...
On Tue, 2008-04-15 at 00:06 +0200, Stelian Pop wrote: - I had to add a PLUS case in PREFERRED_RELOAD_CLASS() or else reload kept generating incorrect insn (putting constants into EVEN_REGS for example). I'm not sure this is correct or if it hides something else... It does sound odd, but I can't really say it is wrong, as you have an odd set of requirements here. At least it is working which is good. Jim
IA-64 ICE on integer divide due to trap_if and cfgrtl
This testcase, extracted from libgcc2.c: int sub (int i) { if (i == 0) return 1 / i; return i + 2; } compiled with -minline-int-divide-min-latency for IA-64 generates an ICE. tmp2.c:8: error: flow control insn inside a basic block (insn 18 17 19 3 tmp2.c:5 (trap_if (const_int 1 [0x1]) (const_int 1 [0x1])) 352 {*trap} (nil)) tmp2.c:8: internal compiler error: in rtl_verify_flow_info_1, at cfgrtl.c:1920 The problem is that IA-64 ABI specifies that integer divides trap, so we must emit a conditional trap instruction. cse simplifies the compare. combine substitutes the compare into the conditional trap, changing it to an unconditional trap. The next pass then fails a consistency check in cfgrtl. It seems odd that cfgrtl allows a conditional trap inside a basic block, but not an unconditional trap. The way things are now, it means we need to fix up the basic blocks after running combine or any other pass that might be able to simplify a conditional trap into an unconditional trap. I can work around this in the IA64 port. For instance I could use different patterns for conditional and unconditional traps so that one can't be converted to the other. Or I could try to hide the conditional trap inside some pattern that doesn't get expanded until after reload. None of these solutions seems quite right. But changing the basic block tree during/after combine doesn't seem quite right either. The other solution would be to fix cfgbuild to treat all trap instructions as control flow insns, instead of just the unconditional ones. I'm not sure why it was written this way though, so I don't know if this will cause other problems. I see that sibling and noreturn calls are handled the same way as trap instructions, implying that they are broken too. I'm looking for suggestions here as to what I should do to fix this. Jim
Re: A query regarding the implementation of pragmas
On Tue, 2008-04-15 at 11:27 +0530, Mohamed Shafi wrote: On Mon, Apr 14, 2008 at 11:44 PM, Jim Wilson [EMAIL PROTECTED] wrote: A simple grep command shows that both arm and rs6000 already support long call pragmas. I did see those but I couldn't determine whether it is possible to change the attribute of the same function at different points of compilation. Configure and build an arm and/or rs6000 compiler, and then try it, and see what happens. In theory, it should work as you expect. Jim
Re: protect label from being optimized
Kunal Parmar wrote: But my return label is getting optimized away. Could you please tell me how to avoid this. You may also need to add a (USE (REG RA)) to the call pattern. Gcc will see that you set a register to the value of the return label, but it won't see any code that uses that register, so it will optimize away both the load and the label. To prevent this, you need to add an explicit use of that register to the call insn pattern. Jim
Re: Implementing built-in functions for I/O
Mohamed Shafi wrote: short k; __OUT(port no) = k; So how can I do that? Make __OUT take two parameters. __OUT(port no, k); Jim
Re: Common Subexpression Elimination Opportunity not being exploited
Pranav Bhandarkar wrote: GCC 4.3 does fine here except when the operator is logical and (see attached. test.c uses logical and and test1.c uses plus) Logical and generates control-flow instructions, i.e. compares, branches, and labels. This makes optimizing it a very different problem than optimizing for plus. Try compiling with -fdump-tree-all and notice that the gimple dump file already contains all of the control-flow expressed in the IL, which means optimizing this is going to be very difficult. We could perhaps add a new high level gimple that contains the C language && as an operator, run a CSE pass, and then later lower it to expose the control flow, but that will be a lot of work, and probably won't give enough benefit to justify it. It is simpler to rewrite the code. For instance if you change this a[0] = ione && itwo && ithree && ifour && ifive; to a[0] = !!ione & !!itwo & !!ithree & !!ifour & !!ifive; then you get the same effect (assuming none of the subexpressions have side-effects), and gcc is able to perform the optimization. You also get code without branches which is likely to be faster on modern workstation cpus. Jim
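The rewrite suggested in this reply can be sketched in a few lines of C. This is only an illustration, assuming side-effect-free int operands; the function names are made up for the example and do not come from the attached test.c. Each !! turns an operand into 0 or 1, so the bitwise & of the normalized values gives the same answer as the short-circuit form.

```c
/* Short-circuit form: each && is a compare and a branch in the IL. */
int all_nonzero_branchy(int a, int b, int c, int d, int e)
{
    return a && b && c && d && e;
}

/* Branch-free form: !! normalizes each operand to 0 or 1, and the
   bitwise & then combines the normalized values without control flow.
   Only equivalent when the operands have no side effects. */
int all_nonzero_flat(int a, int b, int c, int d, int e)
{
    return !!a & !!b & !!c & !!d & !!e;
}
```

Both functions return 1 only when every argument is nonzero. The difference is in the IL the compiler sees, not in the result.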
Re: Feature request - a macro defined for GCC
x z wrote: I would like to see that GCC define a macro in the case it is being used to compile a program. Currently there is a __GNUC__ macro defined by the GNU C preprocessor CPP. This is our mistake. Originally __GNUC__ meant that this was the GNU C compiler (aka GNU Compiler Collection). However, we have added so many extensions to the compiler that it later came to mean that this is the GNU C language. There is unfortunately no way to distinguish between a program written in GNU C, and a program intended to be compiled by the GNU C compiler. All compilers that implement the GNU C language must define __GNUC__. There is no way around this. The use of __GNUC__ is so pervasive in GNU/Linux that a compiler has to define it or else it simply won't work. This is why the Intel compilers and other compilers define it. They have no choice. If we want a solution to this problem, complaining to Intel and others will do no good. We will have to fix it ourselves. One way to fix the problem is to separate the meaning of the macros. We can have one macro that means this is a GNU C program and other macro that means this is the GNU C compiler. We then have to make sure that glibc and other libraries use the macros correctly. Etc. While I agree that this is our mistake, it isn't clear to me why it matters. If the Intel compiler correctly implements the GNU C language, then it shouldn't matter if the code is being compiled by GCC or ICC. Unless maybe you ran into a GCC bug, and want to enable a workaround only for GCC. Jim
Re: Feature request - a macro defined for GCC
x z wrote: This is somewhat off-topic. Perhaps the GCC development team should consider making this __GNUC__ stuff more clarified in the GCC Manual. I don't think this is off-topic. We need to get people to understand that __GNUC__ is ambiguous before we can solve the problem. It means two things: 1) This code is written in the GNU C language. 2) This code is meant to be compiled by GCC. Other compilers that implement the GNU C language are forced to define __GNUC__ because of the first issue, even though it then confuses the second issue. If we want to fix this, gcc must change. And this may also require GNU libc changes and linux kernel changes, etc. The talk about whether __GNUC__ is defined by the preprocessor or the compiler proper is irrelevant. Either way, it is still ambiguous. You are right that we may also have trouble with other related macros. I am not sure if there is a GNU Fortran language, if there is, then we may have the same problem with __GFORTRAN__. We don't need things like __GNUG_MINOR__ as G++ is always distributed in lock step with the C compiler, so we only need one set of macros for gcc version numbers. We do however have the problem that the GNU C language changes frequently, and people have gotten in the habit of testing __GNUC_MINOR__ and other related macros to determine which features are present in the version of the GNU C language implemented by this compiler. Hence, this means that other compilers that implement the GNU C language may also be forced to define macros like __GNUC_MINOR__ through no fault of their own, to correctly describe which version of the GNU C language that they implement. This is a very complicated issue, and until people realize how complicated it has gotten, and accept that we need a solution, it is unlikely that we will make progress on this issue. Jim
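The habit of testing __GNUC__ together with __GNUC_MINOR__ usually looks something like the sketch below. GNUC_AT_LEAST and UNUSED_PARAM are hypothetical names chosen for this example. Note that the test only tells you which version of the GNU C language the compiler claims to implement, which is exactly the ambiguity discussed here: any compiler that defines these macros takes the first branch.

```c
/* Hypothetical helper: true when the compiler claims to implement
   GNU C version maj.min or later. It says nothing about whether the
   compiler is actually GCC. */
#if defined(__GNUC__) && defined(__GNUC_MINOR__)
# define GNUC_AT_LEAST(maj, min) \
    ((__GNUC__ > (maj)) || (__GNUC__ == (maj) && __GNUC_MINOR__ >= (min)))
#else
# define GNUC_AT_LEAST(maj, min) 0
#endif

/* Use a GNU C extension only when the dialect version advertises it. */
#if GNUC_AT_LEAST(3, 0)
# define UNUSED_PARAM __attribute__((unused))
#else
# define UNUSED_PARAM
#endif

/* The second parameter is deliberately ignored. */
int ignore_second(int x, int y UNUSED_PARAM)
{
    return x;
}
```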
Re: Feature request - a macro defined for GCC
x z wrote: If we want to fix this, gcc must change. And this may also require GNU libc changes and linux kernel changes, etc. Maybe you can enlighten us a bit on why GNU libc and linux kernel need changes so that we can realize better how complicated the issue is. Because there are header files in /usr/include that test __GNUC__. In order for these header files to do the right thing, the Intel compiler (and other compilers) need to define __GNUC__. Now suppose we add a new macro __GCC_COMPILER__ that is intended to be unambiguous, and mean only that this is the GCC compiler. What happens next? Within a few months, someone will post a patch to glibc and/or the linux kernel that uses __GCC_COMPILER__, based on the misconception that because it is new, that they are supposed to use it. A few months later, there is a glibc release and/or linux kernel release that contains this code. A few months later it gets into a linux release. Then Intel discovers that their compiler no longer works as intended on linux, and in order to fix it, they have to define __GCC_COMPILER__. And now we are back where we started, except now we have two useless ambiguous macros instead of one, and hence we are worse off than before. If we want to make progress on this issue, we need to get people to understand what the underlying problem is first, and adopt changes that will lead to a solution, otherwise adding new macros is futile. One thing I haven't seen you answer yet is why you think you need a macro that uniquely identifies GCC. If the Intel compiler correctly implements the GNU C language, then it should not matter whether the code is being compiled by GCC or ICC. I think it would help the discussion if you could give a specific testcase where this matters. Jim
Re: PATCH: [4.1/4.2 Regression]: Miscompiled FORTRAN program
H. J. Lu wrote: PR rtl-optimization/25603 * reload.c (reg_inc_found_and_valid_p): New. (regno_clobbered_p): Handle REG_INC as 25603 workaround. I don't believe this is safe. If you look at the uses of regno_clobbered_p in reload.c, the comments clearly indicate that we are looking for registers used in clobbers. So unconditionally adding code that handles REG_INC notes will break these uses. You have to add the REG_INC support the same way that the sets support was added, by adding another argument (or reusing the sets argument), and then modifying the one place we know is broken (choose_reload_regs) to use the new argument (or new sets argument value). -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Since r110852: Mainline broken for VAX (cc0 target)
Hans-Peter Nilsson wrote: PS. No, I don't know why the simple some search terms doesn't work. Probably because simple search only looks at the summary field (and a few other usually useless fields), and the summary field often doesn't have the info you are searching for. If you click on the help link next to the simple search box, it will tell you all of the usually useless things that simple search does. I also recommend using the advanced search feature, and typing your search term into the A Comment search field. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Ada bootstrap failure for mainline on hppa2.0w-hp-hpux11.00
Rainer Emrich wrote: I'm using gmake 3.80 and it's the first time that I see this kind of problem. As you say there's no such option -f in the xgcc command. The gcc driver has support to convert arbitrary long options into -f options. So for instance --test-coverage gets converted to -ftest-coverage. See the last entry of the option_map table which performs this magic. This was added for POSIX support, as POSIX has some special rules for how command line options work, but -- options are always safe in POSIX. If you have something like --special-character on the command line, the gcc driver would convert that to a -f option, and then cc1 would complain that -fspecial-character is not a recognized option. It does look like you have some spurious -- characters on your gcc command line. Maybe one of them is followed by a non-printing non-white space character? I wonder if this -- to -f conversion feature is documented anywhere. It probably isn't. Also, the option_map table is probably long out of date, because it is supposed to list every option that doesn't start with -f, or which requires arguments. Otherwise, gcc won't be fully compliant with the POSIX rules for command line arguments. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
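As a rough illustration of the fallback described in this reply, the sketch below rewrites an arbitrary "--name" argument as "-fname". It is not the driver's actual code: long_option_to_f is a hypothetical function, and the real driver consults the option_map table first and only falls back to this conversion in its last entry.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of the "--name" to "-fname" fallback.
   Returns 0 on success, -1 if arg is not a long option or if the
   output buffer is too small. */
int long_option_to_f(const char *arg, char *out, size_t out_size)
{
    /* Only arguments of the form "--something" qualify. */
    if (strncmp(arg, "--", 2) != 0 || arg[2] == '\0')
        return -1;

    /* Replace the leading "--" with "-f". */
    int n = snprintf(out, out_size, "-f%s", arg + 2);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return 0;
}
```

Under this sketch, a stray "--" followed by any character produces an -f option that cc1 has never heard of, which matches the symptom in the report.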
Re: GCC 4.1.0 RC1
Rainer Emrich wrote: /SCRATCH/gcc-build/Linux/ia64-unknown-linux-gnu/install/bin/ld: unrecognized option '-Wl,-rpath' This looks like PR 21206. See my explanation at the end. I see this on some of our FreeBSD machines, but I've never seen it on an IA-64 linux machine. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Fwd: trees: function declaration
[EMAIL PROTECTED] wrote: I need some kind of assistance. I am trying to substitute function name during the compilation procedure. The only way to tell what is wrong is to debug the patch. And since it is your patch, you are the one that should be trying to debug it. Try setting breakpoints at every place that calls SET_DECL_ASSEMBLER_NAME. Perhaps someone is calling it after you are. Try looking at the implementation of the _asm_ extension, which provides the same feature, i.e. changing the assembler name of a function. See the set_user_assembler_name function in varasm.c. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: static inline function body is missing
Nemanja Popov wrote: dlx-elf-gcc -S foo.c -funit-at-a-time Mike's suggestions are good in general, but there is another thing you should be looking at. Since you are explicitly asking for -funit-at-a-time, I would suggest looking in cgraph. cgraph has code to optimize away unused static functions. You may be confusing cgraph somehow. Take a look at cgraph_mark_needed_node and decide_is_function_needed. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: intermediate representation
Mateusz Berezecki wrote: I'm new to GCC and I'd appreciate if somebody could point me to _all_ files which are responsible for intermediate representation and constructing it. See the internals documentation, for instance: http://gcc.gnu.org/onlinedocs/gccint/Passes.html#Passes We have two ILs, a high level one (gimple), and a low level one (rtl), and the details for each is different. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: debug_hooks->end_prologue problem
Douglas B Rupp wrote: The HP debugger on IA64 VMS defines a new Dwarf2 attribute that computes the offset of the end of the prologue from the beginning of the function. To implement this an end prologue label must be emitted and some related info saved in dwarf2out.c. However I've noted that calling debug_hooks->end_prologue as it is now results in a label at the beginning of the prologue, not the end. I've attached the obvious fix below, but I fear it might mess up sdbout.c. Should I create a new hook or investigate the impact on sdbout and try to fix it? Belatedly looking at this, you seem to be confused over the purpose of the NOTE_INSN_FUNCTION_BEG note. It does not mark the beginning of the function. That is the first RTL insn. Instead, it marks the beginning of the function body. If you compile without optimization, then this is the same as the prologue end. If you compile with optimization, then the function body and the prologue overlap. Now the question is, do you really want to mark the beginning of the function body (which may be before the prologue end), or the end of the prologue (which may be after the function body beginning). Either way, it is likely that someone may not be happy. However, in general, the former (function body beginning) is more useful to end users, as otherwise they won't be able to debug all of the code in the function body. So the current code in final.c looks correct. Orthogonal to this, there is the problem of what to do with these notes when instruction scheduling is enabled. It looks like the current code just always moves notes to the beginning of the basic block containing them. That means effectively that the function beginning is always before the prologue. However, the prologue end note also ends up before the prologue in the same place. So using that one doesn't help. The only solution is a more general one. In the scheduler, before scheduling, you need to mark all instructions that originally came from the prologue.
After scheduling, you go back through, find the first non-prologue instruction, and insert the NOTE_INSN_FUNCTION_BEG note before it. You find the last prologue instruction, and insert the NOTE_INSN_PROLOGUE_END note after it. You would also have to do the same (reversed) for the epilogue. It isn't clear if this extra work is worthwhile. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
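The mark-then-rescan idea in this reply can be sketched on a toy instruction array. Everything here is hypothetical (struct sketch_insn and both functions are invented for illustration and are unrelated to the real scheduler data structures). The point is only that, once each instruction carries a "came from the prologue" mark, the two note positions fall out of a linear scan over the scheduled order.

```c
#include <stddef.h>

/* Hypothetical stand-in for an insn: it only remembers whether it
   originally came from the prologue (marked before scheduling). */
struct sketch_insn {
    int from_prologue;
};

/* Index of the first non-prologue insn in scheduled order; a
   function-body-begin note would go before it. Returns n if every
   insn is a prologue insn. */
size_t first_body_insn(const struct sketch_insn *insns, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (!insns[i].from_prologue)
            return i;
    return n;
}

/* Index of the last prologue insn in scheduled order; a prologue-end
   note would go after it. Returns n if there is no prologue insn. */
size_t last_prologue_insn(const struct sketch_insn *insns, size_t n)
{
    size_t i, last = n;
    for (i = 0; i < n; i++)
        if (insns[i].from_prologue)
            last = i;
    return last;
}
```

For a scheduled order of prologue, body, prologue, body, the first function returns 1 and the second returns 2, which shows the overlap: the function body begins before the prologue ends.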
Re: information request about generated trees during gcc process
Bruno GALLEGGIANTI - LLSP wrote: I'm looking for tools that can generate AST (Abstract Syntax Tree) from C files. I have found with gcc toolchain the option -fdump-tree-original. The generated files are very interesting for me and I think they can be exploited for my project. The dump files generated by gcc are intended only as debugging aids. They are not intended to be read by other programs. You may or may not be able to use them for this purpose. There is no documentation for the format of the dump files themselves. The format may change whenever it is convenient for us to do so. There is documentation for the structure of the trees that we use. See the gcc internals documentation, either in the source tree or on the web page. The format of the dump files follows from that. You may also want to take a look at the papers published in the GCC conference. Some of them discuss the tree structures we use, particularly the tree-ssa papers. There are pointers to these conference proceedings in the wiki off of our web page. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Receive only special Trees (fdump-tree...)
Jan Wegner wrote: Hi! Is it possible to receive only special trees from -fdump-tree-{all-raw}? Try reading the docs for the -fdump-tree-* options. You can choose which dump files are created by using the appropriate option. I only need original, generic and gimple. Is there a description about the generic-file somewhere? You mean the text format dump file? Probably not. This is just a debugging aid. There is some documentation for the GENERIC and GIMPLE intermediate languages in the internal docs. See for instance the gcc/doc/tree-ssa.texi file. The format of the dump files follows from the definition of GENERIC and GIMPLE and should mostly be pretty obvious. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: tracking pointers in hardware
Yoav Etsion wrote: I'm designing a new hardware that needs to know which GPR contains a simple integer, and which contains a pointer. The hardware simply needs different load operations for both (we're talking load/store machines, with no indirect addressing to make life easier). You can try using a different mode for pointers, e.g. PSImode. Then you can be sure that SImode is an integer and PSImode is a pointer, and you won't get them confused. This takes some extra work, and may hinder optimization. This is usually used when pointers are an odd size, like 24-bits, but it could work for your case also. There is one existing port that uses it, the m32c port, which supports an optional 24-bit pointer size. You can also try looking at REG_POINTER. This is simpler, but it isn't clear whether this is good enough for your purposes. You may run into bugs with the REG_POINTER support. Try looking at a port that uses this one, like the pa port. The PA port requires that in a REG+REG address, we must know which one is the pointer, and which one is the offset, and the PA port uses REG_POINTER for this. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: i686 architecture behaviour in gcc
David Fernandez wrote: Can anyone explain why has been chosen that -march=i686 makes the compiler change the normal behaviour, and zero-expand unsigned short parameters into 32-bit registers by all means? You failed to mention the gcc version, and your testcase doesn't actually use any unsigned short parameters unless you forgot to mention something important, like a macro, or an uncommon command line option. So I can't actually reproduce your problem with the testcase that you gave unless I modify it. Anyways, I've seen a similar problem reported before, so I am guessing this is related to this patch: http://gcc.gnu.org/ml/gcc-patches/2000-02/msg00890.html So, yes, this was done for a good reason, because it results in faster PentiumPro code on average. Though it does result in some unnecessary extra instructions in some unfortunate cases, like this one. If you don't have a PentiumPro processor, then you may not want to use this option. Pentium 2 through 4 do not have this problem. See also PR 15184 in our bugzilla database on our web site. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: reload problem in GCC 4.1
Rajkishore Barik wrote: problems with the following instruction in post-reload.c:391 in reload_cse_simplify_operands function stating that the insn does not satisfy constraint. There are lots of different ways that this problem can occur. It is hard to say much without having a testcase I can use to reproduce the problem. You may just have to spend a little time stepping through reload inside gdb to see what is going wrong. You didn't mention the target. I'm assuming it is x86. The instruction pattern name is in the dump, it is divmodsi4_ctld, which you can find in the i386.md file. There is nothing really special about it other than the fact it uses match_dup. If you are changing compiler internals, then it is possible that you have made an error somewhere. One thing that comes to mind is that match_dup requires identical objects, i.e. the address of the objects must be the same. Also, in general, pseudo-regs are unique objects. There can be only one instance of (reg:SI 75) in the compiler, so they should be identical objects anyways. This is something you should check though. You can't just create random pseudo-register objects. You must reuse the existing RTL if a pseudo-reg already exists. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: undefined BITS_PER_UNIT
Rogelio Serrano wrote: When building gcc-4.1.0 with uclibc I'm getting an undefined BITS_PER_UNIT error when building libgcc at the muldi3. Using grep would show that it is defined in the defaults.h file. Using grep again shows that defaults.h is supposed to be automatically included in the tm.h file by configure. The fact that it isn't implies that tm_file is being set wrong in config.gcc. You failed to mention the target triplet. Just uclibc is ambiguous, as there are many different systems using uclibc. If you are configuring for an existing target, then this would indicate a bug in a supported target that should be reported and fixed. If you are writing your own port, this indicates a bug in your port. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Help needed with gcc-4.1.0 on Linux
Tom Williams wrote: I downloaded gcc-4.1.0 the other day and the compile went fine. When I ran make check to make sure all went well, I get this error: Always use make -k check. Otherwise, make will exit after the first failure, instead of running all of the testsuites. Some failures are normal. It is hard to get all tests working for all targets all at the same time. So just because something failed doesn't mean there is anything seriously wrong. Most people don't bother to run the fixincludes testsuite. It requires an extra tool that most people don't have installed. So, offhand, I can't say for sure whether this is a known problem. It wouldn't hurt to file a bug report if you already haven't done so. See http://gcc.gnu.org/bugs.html for info on how to report bugs. Linux is ambiguous. If this is an IA-64 Linux system, then you should say that. The installed headers on an IA-64 Linux system may be different than the installed headers on, say, an x86_64 Linux system. Actually, looking at the fixincludes check support, it looks like it is a bug in the make check support. It appears that everytime we add fixinclude support for a new file, and we include test_text in that support, then we need to add a copy of the resulting corrected output file to the testsuite. This was not done for the ia64/sys/getppdp.h file. There could also be other similar errors. A bug report in bugzilla would be useful to track this. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: [RFC] Removal of loop notes
Zdenek Dvorak wrote: *sched* -- no idea what happens there; it seems to make REG_SAVE_NOTE notes from loop notes, and then makes some magic, but I do not understand what and why. The loop notes are converted into REG_NOTES, and attached to an adjacent instruction. The REG_SAVE_NOTES are treated as scheduling barriers. Nothing is allowed to move forwards or backwards across them. After scheduling, the REG_SAVE_NOTES are converted back to loop notes. This ensures that loop structure is preserved by the instruction scheduler. All insns inside the loop before sched will still be inside the loop after sched. All insns outside the loop before sched will still be outside the loop after sched. The reason for doing this is that some of the register lifetime info computed by flow is loop dependent. For instance, reg_n_refs is multiplied by a constant factor if a use occurs inside a loop. This ensures the registers used inside a loop are more likely to get a hard register than pseudo registers used outside a loop. However, a side effect of this is that the scheduler either has to preserve loop structure to prevent the numbers from getting out of whack, or else we need to recompute flow info after sched. The decision was made long ago to preserve loop structure. This is my recollection of how and why things work the way they do. My recollection could be wrong and/or out-of-date. I see that flow no longer uses loop_depth when computing REG_N_REFS, so the original reason for the sched support seems to be gone. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: [RFC] Removal of loop notes
Bernd Schmidt wrote: Do we have a replacement for this heuristic? I see REG_FREQ, which is computed from some basic block frequency info. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: comparing DejaGNU results
James Lemke wrote: I wanted some mechanical way to compare the output of dejagnu runs between releases, etc. Did you look at contrib/compare_tests? It does something very similar to what your script is doing. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Expansion of __builtin_frame_address
Mark Shinwell wrote: Option (i), which is in all but name the solution 5 approach [1] proposed last year, means that the count == 0 case is elevated to the same level of importance as the count > 0 cases, in line with the use in backtrace (). The problem with this is that on platforms where the soft and hard FPs coincide there is going to be a slight performance degradation, as identified previously, whenever these builtins are used. I don't have a problem with the proposed patch that implements this. I didn't choose this option earlier because I didn't have a testcase that required it. You have now supplied one (glibc backtrace), so I think it is now reasonable to go forward with this. The workaround for the performance problem is for backend maintainers to add an appropriate definition for the INITIAL_FRAME_ADDRESS_RTX macro. Perhaps forcing the issue will provide some incentive to people to fix their backends. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
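For readers unfamiliar with the builtin under discussion, here is a minimal sketch of the count == 0 case, the one that a backtrace implementation would start from. The function name is made up for this example, and the sketch assumes a compiler that provides __builtin_frame_address (GCC and compatible compilers). Calls with a nonzero count walk up the call chain and are less portable, so they are left out.

```c
/* Hypothetical example: ask for the frame address of the current
   function (count == 0) and report whether it is non-null. The count
   argument must be a constant integer. */
int frame_address_is_nonnull(void)
{
    void *fp = __builtin_frame_address(0);
    return fp != 0;
}
```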
Re: Solaris 2.8 build failure for 4.1.1 (libtool/libjava)
Joe Buck wrote: It's GNU ld version 2.16.1. This is strange; I would have expected the linker to get just -rpath: -Wl should tell gcj to pass the following option to the linker. Known problem. See PR 21206. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Problem with address reloading
Juul Vanderspek wrote: In order to enable indexed loads, I extended GO_IF_LEGITIMATE_ADDRESS to also match indexed addresses, added a memory constraint 'R' to match nonindexed addresses, and used R for stores in the 'mov' rule (see below). This may be easier if you do it the other way. I.e. remove indexed addresses from GO_IF_LEGITIMATE_ADDRESS, add an 'R' constraint that only matches indexed addresses, and then use mR for load memory operands, and m for store memory operands. With a proper predicate for input_operands that accepts memory_operand plus indexed addresses, combine can create the indexed loads. It should be possible to make your approach work. First thing is that you have to fix the predicate. You can't use nonimmediate_operand here, as this will accept invalid indexed addresses. Define your own output_operand predicate that rejects indexed addresses. It is always better to reject an operand than to rely on reload to fix it. You probably also need a mov* expander that fixes a store indexed address if one is passed in, since it is a bad idea for the mov* patterns to fail. Another problem here is that reload knows how to fix simple things like 'm' and 'r' operands, but it doesn't know how to fix an 'R' operand. This means you either have to be careful to only accept operands that will never be reloading, or add code to reload it yourself, or add hints to help reload. The last one is the easiest option, see the EXTRA_MEMORY_CONSTRAINT documentation. If you define this, then reload will know that it can always fix an 'R' by reloading the address into a register. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: GCC 3.4.6 problem
Roland Persson wrote: In the above the case 2 is the problem so I guess the u1sel member must be a 1 bit bitfield. These warnings seem like a problem or can I just ignore them? This is target independent, and fixed in gcc-4.0.0, so it should be safe to ignore it. If it was a real problem, it would have already been fixed in gcc-3.4.x. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Errors while building bootstrap GCC for mipsisa32-elf target
Monika Sapra wrote: I am not able to understand, why the checkout source of GCC is so large in size? I am using the following command to checkout source: See the info in the wiki. It talks about ways to reduce disk space. http://gcc.gnu.org/wiki/SvnHelp -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: sparc elf
Niklaus wrote: when i executed a.out on sparc machine it segfaulted and dumped core. On what kind of sparc machine? It sounds like you tried to run the code on a sparc-solaris or sparc-linux machine, which won't work. sparc-elf code can only be run on bare hardware. Try building a cross gdb, and then using the sparc-elf-run simulator. You can also try using the sparc-elf-gdb, but you will have to read the docs to learn how to do it. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: make proto fails
Andreas Jaeger wrote: Andreas Schwab writes: That's probably the same bug as PR21059. That report even has a patch - but no action since December. Jim, will you handle this one? It isn't exactly the same problem, as there is no auto-inc address here. So my patch in the PR doesn't look like it will help. We need one of the more complicated solutions which require a smarter CFG and/or smarter register lifetime check. I don't know if I can help with that. Others are doing more work in this area than I am. This isn't a new problem. The .i file I produced from mainline generates the same failure with gcc-4.1 and gcc-4.0, and maybe also other gcc versions. I'm not sure what broke this; maybe it was the addition of -Werror to the Makefile. As for my patch, I would like to finish it now that I'm functional again, but I've got a big backlog of stuff so I don't know when I will get to it. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: dejaGNU testsuite files for 2.95.3 20010315 (release)
J.J.Garcia wrote: I'm messed with this, anyway i don't understand why the 2.95.3 doesn't have a testsuite folder within. Because gcc had no official testsuite at that time, so obviously, there was no testsuite folder (directory). We did have a package called c-torture that was released separately from gcc, which included the beginnings of a regression testsuite. It was eventually added to gcc after resolving some copyright issues, and formed the core of what is now the gcc testsuite. You might be able to find an 8-year-old copy if you google for it. I don't know offhand where to find a copy. It is an oversight that we didn't archive a copy of it on our ftp server. There was a period during which c-torture was in the FSF GCC development tree, but not part of the gcc releases. I suspect gcc-2.95 was during this period. The cvs/svn tree might be a little confused during this period, as the testsuite dir might be on mainline, but not on release branches. Or it might be on release branches but not in release tarballs. -- Jim Wilson, GNU Tools Support, http://www.specifix.com