GCC 4.1.1 Freeze
I will be building the GCC 4.1.1 release later tonight, or, at latest, tomorrow (Wednesday) in California. Please refrain from all check-ins on the branch until I have announced the release. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.1.1 Released
GCC 4.1.1 has been released. This release is a bug-fix release for problems in GCC 4.0.2. GCC 4.1.1 contains changes to correct regressions from previous releases, but no new features. This release is available from the FTP servers listed here: http://www.gnu.org/order/ftp.html The release is in the gcc/gcc-4.1.1 subdirectory. If you encounter any difficulties using GCC 4.1.1, please do not send them directly to me. Instead, please see http://gcc.gnu.org/ for information about getting help and filing problem reports. As usual, a vast number of people contributed to this release -- far too many to thank by name! -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.1.1 Released
Roberto Bagnara wrote: Mark Mitchell wrote: GCC 4.1.1 has been released. This release is a bug-fix release for problems in GCC 4.0.2. GCC [...] Do you mean a bug-fix release for problems in GCC 4.1.0? Yup. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Expansion of __builtin_frame_address
Mark Shinwell wrote: Hi, I'd like to gather some opinions and advice on the expansion of __builtin_frame_address, as discussed on gcc-patches last year [1, 2]. This centres on the following comment in expand_builtin_return_addr arising from revision 103294 last year: I've explicitly Cc'd Jim Wilson on this email, since he did the work in this area that you cite. I'm not sure whether Jim is reading GCC email presently or not, but I want to give him every opportunity to comment. Let us come back to the more general case. As identified above, when expand_builtin_return_addr is used to expand __builtin_frame_address (), we do care about the count == 0 case. I believe that the comment should be adjusted to reflect this whatever other changes, if any, are made. I think this is non-controversial. As for the remaining problem, I suggest that we could: (i) always return the hard frame pointer, and disable FP elimination in the current function; or (iii) ...the same as option (i), but allow targets to define another macro that will cause the default code to use the soft frame pointer rather than the hard frame pointer, and hence allow FP elimination. (If such a macro were set by a particular target, am I right in thinking that it would be safe to use the soft frame pointer even in the count = 1 cases?) I tend to think that option (iii) might be best, although perhaps it is overkill and option (i) would do. But I'm not entirely sure; still being a gcc novice I have to admit to not being quite thoroughly clear on this myself at this stage. So any advice or comments would be appreciated! I agree that option (iii) is best, as it provides the ability to optimize on platforms where that is feasible, and still provides a working default elsewhere. I will review and approve a suitable patch to implement (iii), assuming that there are no objections from Jim or others. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: IA-64 speculation patches have bad impact on ARM
David Edelsohn wrote: Maxim Kuvyrkov writes: Anyway, this work is for stage 1 or 2 and for now I propose the following fix: implement the targetm.sched.reorder hook so that it will ensure that if there is an insn from the current block in the ready list, then an insn from the other block won't stand first in the line (and, therefore, won't be chosen for scheduling). I feel that this will be what you are calling 'filling holes'. Please find an example patch attached (arm.patch). What about all of the other GCC targets? If your patch changed the default behavior of the scheduler assumed by all other ports, you should fix the scheduler and modify the IA-64 port to get the behavior desired. Exactly. I think this is a serious regression, and I would like to consider our options. Daniel has suggested changing the default value of the max-sched-extend-regions-iters param to 1. However, I think we should conservatively change it to zero, for now, and then use a target macro to allow IA64 to set it to two, and other ports to gradually turn this on if useful. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Expansion of __builtin_frame_address
Richard Earnshaw wrote: The only chance you have for producing a backtrace() is to have unwinding information similar to that provided for exception unwinding. This would describe to the unwinder how that frame's code is laid out so that it can unpick it. I'd suggest we leave backtrace() aside, and just talk about __builtin_frame_address(0), which does have well-defined semantics. _b_f_a(0) is currently broken on ARM, and we all agree we should fix it. I mildly disagree with David's comment that: It seems like a bad idea to force every port to define INITIAL_FRAME_ADDRESS_RTX to avoid a penalty. in that I think the default should be working code, and Mark's change accomplishes that. Of course, if _b_f_a(0) can be implemented more efficiently on some target, there should be a hook to do that. And, I think it's reasonable to ask Mark to go through and add that optimization to ports that already work that way, so that his patch doesn't regress any target. (I'm not actually sure how _b_f_a(0) works on other targets, but not on ARM.) But, scrapping about the default probably isn't very productive. The important thing is to work out how _b_f_a(0) can be made to work on ARM. Richard, I can't tell from your comments how you think _b_f_a(0) should be implemented on ARM. We could use Mark's logic (forcing a hard frame pointer), but stuff it into INITIAL_FRAME_ADDRESS_RTX. We could also try to reimplement the thing you mentioned about using a pseudo, though I guess we'd need to work out why that was thought a bad idea before. What option do you suggest? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Expansion of __builtin_frame_address
Richard Sandiford wrote: Mark Mitchell [EMAIL PROTECTED] writes: I'd suggest we leave backtrace() aside, and just talk about __builtin_frame_address(0), which does have well-defined semantics. _b_f_a(0) is currently broken on ARM, and we all agree we should fix it. I don't want to fan the flames here, but I'm not sure either statement is true. Does __builtin_frame_address(0) really have well-defined semantics on a target for which the ABI does not specify a frame layout? I think that's Richard's point. Thanks for explaining; after the latest messages from you and Richard E., I understand the ARM doesn't have a standard frame layout. I had not realized that before. However, even though there's not an ARM ABI layout, if GCC uses a standard layout, then it would make sense for _b_f_a(0) to return a pointer to that frame. (_b_f_a(0) is a GCC extension, so it can have a GCC-specific definition.) If even that condition does not hold, then I agree that _b_f_a(0) should just have undefined behavior on ARM. We might even consider making it an error to use it. We should document in the manual that you can't use _b_f_a(0) reliably on some architectures. If, however, there are plausible semantics we like for _b_f_a(0) on ARM, then it doesn't seem to me that we should worry terribly much about pessimizing the code by requiring a hard frame pointer. Of course, it would be better not to do so -- but if the only functions affected are those that actually call _b_f_a(0), I doubt we'd be able to measure a change in any real-world program. Richard E. asked what possible uses this function might have. Obviously, GLIBC's backtrace() function is one, though I guess that's a weak example, in that we all agree one should really be using unwind information. (I still think it's somewhat valid, in that there are a lot of people building GCC-only applications, and if backtrace() worked in that case, it would be better than not working at all.) 
The other examples I can think of are also odd hacks; for example, checking for stack overwrites, manually poking one's own return address pointer, or manually grabbing saved registers from a caller, or some such. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Expansion of __builtin_frame_address
Daniel Jacobowitz wrote: On Sun, Jun 04, 2006 at 09:54:25AM -0700, Mark Mitchell wrote: Richard E. asked what possible uses this function might have. Obviously, GLIBC's backtrace() function is one, though I guess that's a weak example, in that we all agree one should really be using unwind information. Which there is no plausible way to do for the ARM EABI, due to ARM's other choices. If one compiled everything with -fexceptions, could you then use the ARM ABI EH unwind stuff? Or, is there just no way to do this in the ARM EABI? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Expansion of __builtin_frame_address
Daniel Jacobowitz wrote: On Sun, Jun 04, 2006 at 10:29:14AM -0700, Mark Mitchell wrote: Daniel Jacobowitz wrote: On Sun, Jun 04, 2006 at 09:54:25AM -0700, Mark Mitchell wrote: Richard E. asked what possible uses this function might have. Obviously, GLIBC's backtrace() function is one, though I guess that's a weak example, in that we all agree one should really be using unwind information. Which there is no plausible way to do for the ARM EABI, due to ARM's other choices. If one compiled everything with -fexceptions, could you then use the ARM ABI EH unwind stuff? Or, is there just no way to do this in the ARM EABI? The DWARF unwinding tables are descriptive; you can interpret them however necessary, for unwinding or for backtraces. But the ARM unwinding tables are more opaque. In that case, I guess the only open question is whether _b_f_a(0) has a sensible GCC-specific meaning, or whether we should just declare it invalid/undefined on ARM. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: [Bug middle-end/27590] [4.1/4.2 Regression] ICE when compiling catalina.jar from tomcat 5.0.30
Andrew Haley wrote: mmitchel at gcc dot gnu dot org writes: --- Comment #8 from mmitchel at gcc dot gnu dot org 2006-06-04 19:02 --- Java is not release critical. I protest. This is not a Java bug but an exception handling bug. Do you have a C++ test-case? I'm all for fixing the bug and there's plenty of time to get this into GCC 4.2. So, don't think this means that this bug can't be fixed; it just means I wouldn't hold up the release for it, at this point. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.2 Status Report (2006-06-04)
This status report has been a long time coming, for which I apologize. As fwprop is no longer on the table for 4.2, and as the vectorizer improvements seem to have stalled due to a combination of lack of review and Dorit's leave, I think it's time to declare 4.2 feature-complete. That leaves us looking at regressions. There are lots; presently 56 P1s. About half of those are new in 4.2. So, we're not ready to create a 4.2 branch. Therefore, we need to make the mainline open for regression-fixes only to force ourselves to attack the open bugs. Please consider the mainline operating under release-branch rules as of 12:01 AM Wednesday, California time. That will give everyone a few days to check in any in-progress bug-fixes that are not regressions. At this time, I don't think it makes sense to set a 4.2 target branch date. We have to see how fast the bug-fixing goes. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.2 Status Report (2006-06-04)
Joern RENNECKE wrote: In http://gcc.gnu.org/ml/gcc/2006-06/msg00120.html, you wrote: As fwprop is no longer on the table for 4.2, and as the vectorizer improvements seem to have stalled due to a combination of lack of review and Dorit's leave, I think it's time to declare 4.2 feature-complete. I am still waiting for review of my auto-increment patches, and for Berndt to complete the cross-jump part of the struct-equiv patches, so that I can post an updated patch for the if-conversion part. Depending on how quickly that goes, and how quickly other things go, those things may or may not make 4.2. We'll have to take it case by case. The patch queue also includes some patches for bugs that are not strictly speaking regressions. As usual, I think we should permit the inclusion of already submitted patches. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Generator programs can only be built with optimization enabled?
Typing make in $objdir/gcc (after a bootstrap) sometimes results in errors like: build/gencondmd.o: In function `VEC_rtx_heap_reserve': /net/sparrowhawk/scratch/mitchell/src/lto/gcc/rtl.h:195: undefined reference to `vec_heap_p_reserve' For an ordinary make the generator programs are built without optimization. But rtl.h uses DEF_VEC_*, the expansion of which includes inline functions. With optimization disabled, the compiler apparently emits out-of-line copies of the inline functions. The inline functions reference symbols (like vec_heap_p_reserve) that are not included in the build objects linked into the generator program. I'm using a version of mainline that's a few weeks old; is this something that has been recently fixed? Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Generator programs can only be built with optimization enabled?
David Edelsohn wrote: This is part of the new build infrastructure. One cannot simply go into $objdir/gcc and type make. One either needs to use the appropriate incantation at the top-level build directory or go into $objdir/gcc and type make CFLAGS='xxx', where 'xxx' matches the optimization options for the current bootstrap phase. That seems unfortunate, but so be it. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Generator programs can only be built with optimization enabled?
Paolo Bonzini wrote: This was caused by: 2006-01-22 Zack Weinberg [EMAIL PROTECTED] * genautomata.c: Include vec.h, not varray.h. The problem that Mark reported happens because (since always) the CFLAGS of the gcc directory are just -g, not -O2 -g. Optimized builds have (since always) been done only because the toplevel overrides the -g CFLAGS. So, when you type make in the gcc directory, it triggers a non-optimized build of everything (generator programs, compiler, driver), which in turn triggers PR18058. I think that, after Zack's change, the generator programs that include rtl.h should be linked with build/vec.o. That may not be necessary when optimizing, but it would avoid this problem. Do you agree? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.2 Status Report (2006-06-16)
[Jakub, see below for question.] There are presently 200 P3 or higher regressions open against 4.2. We remain in regression-only mode on the mainline. I have half a mind to see if I can bribe AJH to lock everyone in the conference hall until they have fixed at least 1 PR. :-) I'm not sure the number above is in and of itself terribly meaningful, in part because Volker has been filing many ICE-on-invalid-after-error-message PRs against the C++ front end. These don't really even show up for users in releases, due to the confused by earlier errors trick, so, although I've been marking these as P2, that might actually be an overly high priority. It offends my sensibilities that we crash in these cases, but it's hard to argue there's a lot of impact on users. What I do consider meaningful is that there are 37 P1s. Several of these (25938, 26175, 26477, 27890, 27984) are low-hanging fruit relating to installation. Jakub, I thought that you and I worked out a plan for the libgomp configuration issues; did that patch get checked in? Can any of these PRs be closed? Would folks please tackle some of the other P1s? There are a number of 4.2-only regressions involving wrong-code and ICEs on valid code. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: state of decimal float support in GCC
Janis Johnson wrote: Support in GCC 4.2 for decimal floating types is based on drafts of ISO/IEC WDTR 24732, which continues to change. There are some differences between the most recent draft N1176 and what GCC implements, and some other things that should probably change or at least be documented. I'd appreciate some guidance on these. 1. The draft TR specifies that an implementation define __STDC_DEC_FP__ as 1 to indicate conformance to the technical report. GCC currently defines this, but there are a couple of issues here. First is that GCC is only one piece of the implementation and the macro shouldn't be defined unless the C library also implements the library portion. The other issue is that the TR continues to change, and the macro shouldn't be defined until GCC and the relevant C library support the final version of the TR. I'd like to remove this from GCC, if it's OK to do that at this late stage. That seems a good conservative change. (This same problem has arisen in past with some of the IEEE macros, in the opposite direction; GLIBC has defined them, even when the compiler doesn't provide the full guarantees.) I think the only way to get this 100% right is to provide a configure option to GCC that says whether or not to define the macro, but, even then, you have to be multilib-aware, as a uClibc multilib might not get the macro, even while a GLIBC multilib does get the macro. If the standards permit defining the macros in library headers (rather than requiring them to be pre-defined by the compiler), then an easier solution would be to have GCC define __GCC_DEC_FP__, and then have the library do: #ifdef __GCC_DEC_FP__ /* This library is DFP-capable, and GCC is DFP-capable, so... */ #define __STDC_DEC_FP__ 1 #endif 3. Earlier drafts specified that a new header file decfloat.h define various macros for characteristics of decimal floating types. The latest draft puts these in float.h instead. 
I'm inclined to leave them in decfloat.h until the TR is approved, or at least more stable, and document that difference. The downside to that is that some people may want us to provide decfloat.h forever, for source compatibility with GCC 4.2. It's frustrating that we don't really know if/when the TR will actually be final; things might move again, for all we know. I think you should document the difference, and say that we expect to remove decfloat.h in a future release. I think your other documentation suggestions make sense. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Visibility and C++ Classes/Templates
Jason Merrill wrote: Nice to see this stuff getting improved! #pragma GCC visibility push(hidden) class __attribute__ ((visibility ("default"))) A { void f (); }; void A::f() { } Here I think we'd all agree that f should get default visibility. Agreed. class A { void f (); }; #pragma GCC visibility push(hidden) void A::f() { } #pragma GCC visibility pop This case is less clear; A does not have a specified visibility, but the context of f's definition does. However, we don't want to encourage this kind of code; the visibility should be specified as early as possible so that callers use the right calling convention. Waiting until the definition to specify visibility is bad practice. Also, the status quo is that f gets A's visibility. I would preserve that and possibly give a warning to tell the user that they might want to add __attribute__ ((visibility)) to the declaration of f in A. Agreed. Now, templates: template <class T> __attribute__ ((visibility ("hidden"))) T f(T); #pragma GCC visibility push(default) extern template int f(int); #pragma GCC visibility pop This could really go either way. It could be considered similar to the above case in that f<int> is in a way part of f<T>, but there isn't the same scoping relationship. Also, there isn't the declaration/definition problem, as the extern template directive is the first declaration of the instantiation. In this case I am inclined to respect the #pragma rather than the attribute on the template. I'd tend to say that the attribute wins, and that if you want to specify the visibility on the template instantiation, you must use the attribute on the instantiation, as you suggest: Using an attribute would be less ambiguous: extern template __attribute__ ((visibility ("default"))) int f(int); In a PR Geoff asked if we really want to allow different visibility for different instantiations. 
I think we do; perhaps one instantiation is part of the interface of an exported class, but we want other instantiations to be generated locally in each shared object. Agreed. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Visibility and C++ Classes/Templates
Ian Lance Taylor wrote: Don't you still have to deal with this case? #pragma GCC visibility push(hidden) template <class T> T f(T); #pragma GCC visibility pop ... #pragma GCC visibility push(default) extern template int f(int); #pragma GCC visibility pop Personally I wouldn't mind saying that the attribute always beats the pragma, but it seems to me that there is still the potential for ambiguity. I would treat that case as if the template had the attribute, and, therefore, ignore the pragma at the point of instantiation. My concern here is that template instantiation can happen at any time. I'm sure we all agree that the pragma should affect *implicit* instantiations; if you happened to say: #pragma GCC visibility push(default) int i = f<int>(int); #pragma GCC visibility pop we wouldn't want the visibility of i to affect f<int>. But, an explicit instantiation: template int f<int>(int); should really behave just like an implicit instantiation; it's just a manual way of saying instantiate here. And, extern template is a GNU extension which says there's an explicit instantiation elsewhere; you needn't bother implicitly instantiating here. I'm just not comfortable with the idea of #pragmas affecting instantiations. (I'm OK with them affecting specializations, though; in that case, the original template has basically no impact, so I think it's fine to treat the specialization case as if it were any other function.) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Visibility and C++ Classes/Templates
Jason Merrill wrote: Yep. I'm sympathetic to Mark's position, but still tend to believe that the #pragma should affect explicit instantiations. I don't feel strongly enough to care; let's do make sure, however, that we clearly document the precedence, so that people know what to expect. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: unable to detect exception model
Richard Guenther wrote: I'll go ahead and revert the ggc-page.c patch now. Thanks, I think that's the right call. I'm sorry I didn't spot this issue in my review. The idea you have is a good one, but it does look like some of the funny games we're playing get in the way. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Visibility and C++ Classes/Templates
Geoffrey Keating wrote: In the traditional declaration/definition model, if you try to change the linkage of something you get an error... Indeed, if you consider visibility to be an intrinsic property of the template (like its type, say), you could argue: (1) the template gets to specify the visibility (2) all instantiations (explicit or implicit) always get that visibility (3) if you want a different visibility, you must use an explicit specialization But, I think we all agree that's too restrictive; visibility is an extra-linguistic instruction about a low-level detail, beyond the scope of the language itself. So, I think that it's reasonable to allow the visibility specification on an explicit instantiation. I don't think a warning about a mismatch between the visibility specified by the template and the instantiation is particularly useful -- but maybe what we should do is try to discourage the use of the #pragma in favor of the attribute? (There are no scoping problems with attributes.) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: RFC: __cxa_atexit for mingw32
Brian Dessent wrote: is a good thing: replace an ISO standard-conformant and perfectly adequate atexit function already supplied by OS vendor with a new version, perhaps with some licensing strings attached. As a MinGW user, I would prefer not to see __cxa_atexit added to MinGW. I really want MinGW to provide the ability to link to MSVCRT: nothing more, nothing less. Cygwin is an excellent solution if I want a more UNIX-like environment. I think it would be better to adapt G++ to use whatever method Microsoft uses to handle static destructions. Ultimately, I would like to see G++ support the Microsoft C++ ABI -- unless we can convince Microsoft to support the cross-platform C++ ABI. :-) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: RFC: __cxa_atexit for mingw32
Joe Buck wrote: As I understand it, Microsoft has patented aspects of their C++ class layout. That might be, and we should investigate that before actually trying to implement a compatible layout, but it doesn't change my opinion about the original question regarding __cxa_atexit -- unless Microsoft's patents extend to destruction of global objects with static storage duration. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: RFC: __cxa_atexit for mingw32
Danny Smith wrote: I have a patch that allows use of atexit for destructors in the same way as __cxa_atexit in cp/decl.c and decl2.c and will submit in Stage1 next. That sounds great. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: A question about TYPE_ARG_TYPES
Daniel Berlin wrote: I believe it also happens with varargs functions in some cases, if there was nothing but a varargs parameter. This is the one and only case in which it should occur, but, yes, it is possible. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: A question about TYPE_ARG_TYPES
Kazu Hirata wrote: Hi Ian, I keep finding places in GCC sources that check whether a member of TYPE_ARG_TYPES is 0. For example, for (link = TYPE_ARG_TYPES (function_or_method_type); link && TREE_VALUE (link); link = TREE_CHAIN (link)) gen_type_die (TREE_VALUE (link), context_die); Notice that TREE_VALUE (link) is part of the loop condition. Now, do we ever allow a NULL in TYPE_ARG_TYPES? If so, what does that mean? My guess is that someone was trying to be cautious about encountering a NULL in TYPE_ARG_TYPES. (If that's the case, we should be using gcc_assert instead.) Just guessing here, but what happens with an old-style function definition in C? void f(); AFAIK, that gets TYPE_ARG_TYPES (...) == NULL, so we don't even get to evaluate TREE_VALUE (TYPE_ARG_TYPES (...)). On IRC, Daniel Berlin claims that there are some weird cases where TREE_VALUE (TYPE_ARG_TYPES (...)) is NULL. I'll keep putting gcc_assert to see what happens. That may be the difference between void f() (where TYPE_ARG_TYPES might be NULL) and void f(...) (where TREE_VALUE (TYPE_ARG_TYPES) would be NULL). The latter, as Daniel says, is not valid C, but perhaps we used to accept it. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: gcc 4.2 more strict check for function called through a non-compatible type
Ian Lance Taylor wrote: I realized that I am still not stating my position very clearly. I don't think we should make any extra effort to make this code work: after all, the code is undefined. I just think 1) we should not insert a trap; 2) we should not ICE. I agree. If the inlining thing is indeed a problem (and I can see how it could be, even though you could not immediately reproduce it), then we should mark the call as uninlinable. Disabling an optimization in the face of such a cast seems more user-friendly than inserting a trap. Since we know the code is undefined, we're not pessimizing correct code, so this is not a case where to support old code we'd be holding back performance for valid code. I also agree with Gaby that we should document this as an extension. If we go to the work of putting it back in, we should ensure that it continues to work for the foreseeable future. Part of that is writing down what we've decided. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Visibility and C++ Classes/Templates
Jason Merrill wrote: Hmm, I'm starting to be convinced that ignoring #pragma visibility for all template instantiations and specializations will be a simpler rule for users to understand. I think I argued for that earlier; in any case, I agree. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: gcc 4.2 more strict check for function called through a non-compatible type
Andrew Haley wrote: Yuri Pudgorodsky writes: We can say something like: In GNU C, you may cast a function pointer of one type to a function pointer of another type. If you use a function pointer to call a function, and the dynamic type of the function pointed to by the function pointer is not the same as indicated by the static type of the function pointer, then the results of the call are unspecified. In general, if the types are not too different s/not too different/compatible/ not too different has no meaning, whereas compatible is defined in Section 6.2.7. But, at least in C++, the official meaning of compatible is not the meaning we want. For example, int * and long * are not compatible -- but, in this context, we want to say that this will work. We need to do this because we use type-based alias analysis in gcc. Yes, I remember adding that feature. :-) :-) If we permit incompatible types to be casted in function calls, we make a hole in alias analysis. Yes, users who lie will lose. The only thing we're trying to do here is behave a bit more gracefully. Introducing a call to __builtin_trap is pretty brutal; instead, we want to say we can't promise this is going to work, but if you want to try, go ahead... -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Ben Elliston appointed DFP maintainer
The SC has appointed Ben Elliston as maintainer of the Decimal Floating-Point components of the compiler, including relevant portions of the front ends, libraries, etc. Ben, please update MAINTAINERS to reflect your expanded role. Thanks! -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: RFD: language hooks in GIMPLE / lang_flag?
Diego Novillo wrote: That's relevant at the language syntactic/semantic level of validating such things as parameter lists, but to GIMPLE two types are compatible if and only if they would produce the identical RTL. So two integer types are compatible if they have the same mode, precision, and bounds. Two FIELD_DECLs are compatible if they have the same positions, alignments, sizes, and compatible types, two record types are compatible if they have the same sizes and alignment and all fields are compatible. Etc. The issue of debugging information is very different but I don't think in the scope of this particular discussion. Yup. That's been the idea all along. We represent all the FE language semantics in GIMPLE, much like we expose ctor/dtor in C++ or EH or any of the other language mechanisms. Everything must be explicitly represented in the IL, totally independent from the input language. FWIW, I agree. However, I do not agree that two types are compatible iff they would produce identical RTL. GIMPLE should still know that int and long are distinct types (even if both 32 bits) since that permits alias analysis to do a better job. Similarly, struct S { int i; } and struct T { int j; } are not the same type. So, what should happen is that the front end should make these differences/similarities visible to the middle end via TYPE_ALIAS_SET, or some other mechanism *in the IL itself* rather than via a callback. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: gcc visibility used by moz
Jason Merrill wrote: It seems that you have a different mental model of type visibility. I've gotten a little lost in this thread. Is there a clear proposal for the semantics that we're leaning towards at this point? One meta-note is that we're not the first people to consider this. I wonder if the rules beginning at page 19 (Inter-DLL symbol visibility and linkage) of the ARM C++ ABI: http://www.arm.com/miscPDFs/8033.pdf might be helpful? This is really about mapping dllexport/dllimport onto ELF symbols, but there are some rules about how to decide whether members of classes are exported. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: RFD: language hooks in GIMPLE / lang_flag?
Richard Kenner wrote: FWIW, I agree. However, I do not agree that two types are compatible iff they would produce identical RTL. GIMPLE should still know that int and long are distinct types (even if both 32 bits) since that permits alias analysis to do a better job. Sure, but that's not what we currently use the compatible types hook for. What you're essentially saying is that (int *) and (long *) are different types, and that's correct. But if we have a cast from int to long or vice versa, that cast is not accomplishing anything and *could* be deleted. In RTL, sure. In GIMPLE, I don't think so, as if you do that you lose the type information about the result. But, I'm not a GIMPLE expert; maybe there's some magic way of handling this. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: RFD: language hooks in GIMPLE / lang_flag?
Richard Kenner wrote: In RTL, sure. In GIMPLE, I don't think so, as if you do that you lose the type information about the result. But, I'm not a GIMPLE expert; maybe there's some magic way of handling this. The type of the result is always the type of the LHS. OK. But, GIMPLE is also supposed to be type-safe, so I wouldn't think that int = long would be well-formed GIMPLE. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: LTO and Code Compaction \ Reverse Inlining \ Procedure Abstraction?
Miguel Angel wrote: Hello! I have a VERY simple example: int f1 (int i) {i = (i-7)/9+3; return i;} int f2 (int i) {i = (i-7)/9+3; return i;} It could be reduced to: int f1 (int i) {i = (i-7)/9+3; return i;} int f2 (int i) {return f1 (i);} Are there any ideas on how and where to add a target and language independent code compaction pass into gcc? Some people call this uninlining. I've also heard the term procedural abstraction. The generalization is to identify common code fragments that can be turned into functions. Then, replace the users of the common code with function calls. If we wanted to do this in GCC, it might well make sense to do this at the same place we presently do inlining. Some toolchains do it in the linker, at the level of assembly code. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: [lto] factor code common to all builtin_function
Rafael Espíndola wrote: I have a patch that factors code common to all builtin_function implementations. It is approved for trunk when we get to stage1. Are the developers involved in the lto branch interested in this patch? If so, I can port it. Thanks for the offer! Yes, I think that would be good. However, we can also wait until it goes into the mainline, and until we decide to merge the mainline to LTO. I don't think we need it on the LTO branch at this time. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.2 Status Report (2006-07-16)
At present, we have 160 serious regressions open against mainline (which will become 4.2). (I've downgraded many of Volker's reports about ICEs after valid error messages to P4, following Jason's recommendation. Upon reflection, I think that's the right thing to do; although robust error recovery is clearly a good thing, these ICEs don't have a substantial impact on most users.) Of the 160 regressions, 33 are P1s. As usual, a number are C++ issues. I intend to go after many of those personally in the near future. However, there are plenty of non-C++ P1s as well, so don't feel there's nothing for you non-C++ folks to do. Our historical standard for branching has been 100 regressions. I still think that's a reasonable target. The fact that we've still got a lot of issues on the mainline, even though we've been in regressions-only mode for quite a while, indicates that if we had a release branch we'd probably be having a very hard time getting 4.2 out -- and we'd be spending effort on 4.3. For those that think we're in Stage 3 for too long, please note that if every frequent GCC contributor fixed three regressions, we'd be under 100. We could do it tomorrow -- and certainly this week! -- with a concerted effort. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: LTO and Code Compaction \ Reverse Inlining \ Procedure Abstraction?
Rafael Espíndola wrote: Some people call this uninlining. I've also heard the term procedural abstraction. The generalization is to identify common code fragments that can be turned into functions. Then, replace the users of the common code with function calls. Is this the same as Code Factoring? http://gcc.gnu.org/projects/cfo.html Yes, that's another name. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Searching configured and relocated prefix.
Andrew Pinski wrote: On Jul 14, 2006, at 1:17 PM, Carlos O'Donell wrote: We currently search both the relocated compiler's prefix and the originally configured prefix. Should a relocated compiler be searching both directories? Yes because someone might have just relocated the compiler but not the rest of the tools. The main use of this is that you can untar a common sysroot and use different versions of the compiler which normally would be installed under that common location. There are benefits and costs in either direction. 1. If we search both locations (i.e., for a relocated compiler, search the configure-time prefix and the installation-time prefix), we get the following set of problems: (a) If the configure-time prefix exists (for unrelated reasons) the compiler finds files in the configure-time prefix, even though neither the system administrator nor the user expects that. (b) If the configure-time prefix does not exist, but is under an NFS mount, the compiler will cause automount traffic, NFS timeouts, etc. (c) In searching for files, the compiler will make a lot of stat calls, measurably slowing down a relocated compiler. 2. If we search only one location (i.e., for a relocated compiler, search only the installation-time prefix), we get a different set of issues: (a) As you say, a single sysroot (or other toolchain components) cannot as easily be shared across compiler installations. However, I think it's clear that the problems in (1) are more severe than the problems in (2), on several grounds: * The problems in (1) are demonstrably annoying to people; CodeSourcery has received complaints from several different customers about this issue. All of (1a), (1b), and (1c) have been reported to us. (1a) is particularly nasty; users got totally incorrect behavior out of the compiler because the compiler was searching a configure-time prefix that happened to contain unrelated files. 
* The problems in (2) can be worked around by (for example) using a symlink to place the sysroot in both installation prefixes. These are actions that can be taken by system administrators at installation-time; they have no effect on ordinary users. * The problems in (1) are due to an implicit behavior of the compiler that empirically violates the principle of least surprise. If you get a tarball full of binaries, and unpack it in /home/mitchell/install, why would you expect it to search /opt/distributor, /tmp/buildroot, etc.? * No non-GCC compiler searches a configure-time prefix. The only locations relevant are well-known paths (like /usr/include) and the installation-time prefix. So, GCC's model is confusing to users coming from other compilers. This is not a definitive argument, but it should carry weight unless there is some strong argument in favor of GCC's current behavior. * I suspect that the problems in (2) are relatively rare while the problems in (1) are relatively common. A lot of users download binary distributions and install them somewhere convenient; relatively few try to do complicated things involving partially shared installations, and those users are probably more expert. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Searching configured and relocated prefix.
Andrew Pinski wrote: I actually think the problems with 1 (b) are artificial and should not be taken into account. This is not a hypothetical or artificial issue -- as I said, all three problems I listed have been encountered by real users. I actually depend on a common sysroot already and it allows me to do development of a newer compiler much faster and it allows some of our game developers to be able to test a new compiler without having another copy of the SDK installed. Yes, that's clever. But, you can create a symbolic link to the sysroot from each installation with a single command. And, your installer for third-party developers can do that for you. 1(a) is not confusing at all, if people read the documentation like they should be doing, it should be clear to them that it looks at both locations. Most developers do not start by reading the entire compiler manual -- especially including information about building the compiler itself. The typical software developer probably doesn't have any idea what the configure-time prefix for the compiler might even mean. Their knowledge of compilers is probably that -c, -D, -I, -g, and -O2 are useful command-line options. And, if using an IDE, maybe not even that! And, there's no reason they should need more information than that; there's no reason compiler users should have to have any knowledge about how the compiler is put together. For 1(c), it is not going to slow it down that much, in fact it's just one or two extra Sorry, this is just incorrect. It's a significant issue (as much as 25% for some projects) as measured by actual customers in the field. play but that is a misconfiguration of their systems at that point and not really a GCC issue and should not be treated as such. A large class of users (most corporate developers, for example) run on systems they don't administer. We want the compiler to perform well on their systems, if possible. 
Even developers who do administer their own systems may not be expert administrators; I used to administer my own GNU/Linux box, but I didn't know about most of the options to the kernel or other parts of the system. And, many GCC users are running on Windows, where they have less control. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Searching configured and relocated prefix.
Andrew Pinski wrote: Yes, that's clever. But, you can create a symbolic link to the sysroot from each installation with a single command. And, your installer for third-party developers can do that for you. What is the equivalent of symbolic links on Windows (and I am not talking about Cygwin either)? That's a good question. In Vista, the answer is symbolic links: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/symbolic_links.asp Before Vista, there's no solution short of cp. However, you still have the --sysroot command-line option. And, if you're worried about Windows, see Paul's response; the problems I've described are particularly bad on Windows, and the developer-base there is often less used to GNU software, so the problems are even weirder. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Searching configured and relocated prefix.
Andrew Pinski wrote: Before Vista, there's no solution short of cp. However, you still have the --sysroot command-line option. And, if you're worried about Windows, see Paul's response; the problems I've described are particularly bad on Windows, and the developer-base there is often less used to GNU software, so the problems are even weirder. But isn't Paul's response a confirmation that it is a bug in Windows for not having a stat cache and not knowing when a drive map becomes valid? If so I would have hoped that CodeSourcery would have filed a bug with Microsoft already about how bad a performance problem it is, instead of now trying to work around it in real, already-working code. Are you suggesting that we ship software that performs poorly on one of the most popular systems actually in the field because, in the abstract, those systems could be better? I would expect that most large software applications (for *all* operating systems) contain comments like: /* On some versions of the OS, we have to do X to workaround Y. */ It's just cutting off our nose to spite our face to ship software that doesn't work well and tell users wait until your system distributor fixes your OS. Even for most GNU/Linux users, that would be untenable; they're not system hackers, and they only get to upgrade when RHEL or SuSE or Debian or ... distributes new packages. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Searching configured and relocated prefix.
Andrew Pinski wrote: On Jul 23, 2006, at 11:48 AM, Mark Mitchell wrote: Are you suggesting that we ship software that performs poorly on one of the most popular systems actually in the field because, in the abstract, those systems could be better? Maybe we just have to force the issue on people. One of the purposes of FSF GCC development is to get more people to use GCC. That's more likely to happen if we make GCC work well for them. We should not have a double standard here, just because it is a performance issue and other people are confused about the current behavior. We should not try so hard to be consistent that we do the wrong thing. It's good to have procedures and rules, but not to the point that they are absolutes. Decisions about these kinds of things are of necessity only semi-algorithmic. This isn't like a -f option which is a documented feature. This is the current behavior of the compiler, which I expect you'd find most people consider to be odd, despite the fact that there is some utility in the current approach. I think it would be wrong to make this change now (it's clearly neither Stage 3 nor regression-only material), but I see no reason not to make it in 4.2. /* On some versions of the OS, we have to do X to workaround Y. */ Yes but most of those because people don't think about filing bug reports. We need to file the bug reports with the OS. Sure, filing OS bug reports is good. It's just orthogonal; we still have to build software that works with the systems users are using. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: gcc-4.3 projects page?
Dan Kegel wrote: Is it time to create a GCC_4.3_Projects page like http://gcc.gnu.org/wiki/GCC_4.2_Projects ? I imagine several projects are already in progress, but not yet mentioned on the wiki... Yes, I've been thinking about doing that. It's fine with me if someone would like to create the page -- if not, I will take care of it shortly. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: A question about ARG_FINAL_P in the Java frontend.
Tom Tromey wrote: Kazu == Kazu Hirata [EMAIL PROTECTED] writes: Kazu I just noticed that the Java frontend has ARG_FINAL_P, which uses a Kazu bit in the TREE_LIST node that is pointed to from TYPE_ARG_TYPES. Kazu I am wondering if there is any way we could move this bit elsewhere. On the gcj-eclipse branch the code that uses ARG_FINAL_P is actually no longer used. It hasn't been deleted yet but it won't ever run. I'm hoping to merge this to trunk after 4.2 branches ... will that help? Yes. Kazu, I'd suggest you just ignore Java; you can still get proof-of-concept for tree-trimming without Java. The ECJ changes are going to be massive, and they're going to go in before we get our stuff ready to go in, so dealing with Java now is probably a waste of time; we'll have to regroup after ECJ goes in. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.2 Status Report (2006-07-31)
I'm getting a little bit depressed about progress towards GCC 4.2. On July 16, we had 160 serious regressions and 33 P1s. Today, 15 days later, we have 162 serious regressions and 29 P1s -- just about the same. Many of those P1s are middle-end problems that have been reported from compiling real code. I'm not particularly concerned about ICE-on-invalid regressions in the C++ front end, but I am worried about wrong code generation and ICEs on valid code (C and C++). Many of the P1s are 4.2-only regressions. Obviously, we'd all like to start thinking about GCC 4.3, but we need to make some headway on 4.2 first. So, I think we're still in a holding pattern: let's get the P1s fixed. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Eric Botcazou appointed RTL maintainer
Eric -- The GCC SC has appointed you an RTL maintainer. Congratulations! That means that you have maintainership of all machine-independent RTL optimization passes, like jump, CSE, GCSE, flow, sched2, shorten_branches, etc. This post doesn't cover back ends, dwarf2out.c, or other things that aren't optimization passes. I know that it's hard to be exactly sure where the boundaries lie, but I've every confidence you'll use your judgment well. Please feel free to ask if you're not sure. Please adjust MAINTAINERS accordingly -- and then please fix PRs and approve patches for same. :-) Thanks! -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Eric Botcazou appointed RTL maintainer
Eric Botcazou wrote: Obvious question: what of the RTL expander(s)? They're specifically excluded from your purview. (That's not a judgment on your competence; just that the definition we used when discussing your appointment restricted itself to RTL passes only.) Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: type consistency of gimple
Kenneth Zadeck wrote: I have had some discussions with Honza and Diego about the type consistency at the gimple level. They told me that Andrew was in the process of making gimple properly type consistent. I think that there is widespread consensus that GIMPLE should ideally be type-consistent, in the sense that (as you say) the only places where types change are via explicit type-casts (e.g., NOP_EXPRs). The same optimization that you want to perform at the LTO level (i.e., not writing out types for every node) is also one that would pay off in terms of memory savings at the TREE level. (In my opinion, it doesn't really matter if MODIFY_EXPR is treated as doing an implicit conversion; the important thing is that the set of places where implicit conversions are performed be both limited and documented. If we save tons of TREE nodes by saying that MODIFY_EXPR is defined to do an implicit conversion, as if the right-hand side had a NOP_EXPR to convert it to the type of the left-hand side, then that might be a perfectly valid memory optimization on TREE.) So, the question is one of timing: do we build LTO to work with the current not-type-consistent GIMPLE, or do we fix the compiler to generate type-consistent GIMPLE first and then go on to LTO? I think this is a question of priorities. It's relatively straightforward to fix the compiler to generate type-consistent GIMPLE: you write consistency-checking routines and then you just fix all the problems that arise, by inserting explicit type-conversions at the source of the offending inconsistency. However, while straightforward, that's probably person-months of effort. From an engineering point of view, it would probably be best to fix GIMPLE first; that has other positive side-effects, and it would avoid doing LTO work that might have to be undone. However, from a project-management point of view, it might be best to go for proof-of-concept on LTO first, writing out the types for nodes explicitly. 
I would assume that this wouldn't be a lot of additional effort; i.e., while it will waste a lot of space, it won't waste a lot of programmer time. You could also strike a middle ground: write the consistency checker (Danny may have already done this) as a separate pass and put it on the LTO branch. Run it before writing out LTO information. If the input is inconsistent, abort; otherwise, write out LTO information. That lets you write out the more compact representation, but still avoid trying to fix all of GIMPLE right now. (You just have to fix whatever is required for proof-of-concept programs to work.) It does mean, though, that you're essentially committing to the type-safety work before we can make LTO part of mainline, which is probably adding several person-months to the overall project. Or else you have to assume that you may have to go back and write out the type information later, in order to get LTO into mainline. So, I guess my inclination would be to just write out the type information now, and thereby avoid the dependency on fixing GIMPLE. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: type consistency of gimple
Daniel Berlin wrote: Mark Mitchell wrote: Kenneth Zadeck wrote: So, I guess my inclination would be to just write out the type information now, and thereby avoid the dependency on fixing GIMPLE. Please don't take this the wrong way, but this approach is the reason GIMPLE is not flat/tupelized, not type consistent *right now*, and why this discussion is even occurring at all. I agree with the thrust of your email. If engineering excellence were our primary goal, and we had a master engineering plan for GCC, and we all committed to following the master plan, we'd clearly put something like type-consistent GIMPLE early on the plan because everyone agrees it is The Right Thing from an engineering point of view. However, (a) all of the antecedents in the previous sentence are false, and (b) we are resource-limited. So, GCC development tends to let the pressure build on some set of infrastructure until it has been painfully obvious for some amount of time that it has to change. (In my experience, the same thing happens in developing proprietary software; convincing product management to let you spend significant time fixing something that's not on the next release's feature list requires some good salesmanship.) Demonstrating proof-of-concept for functionality is important because it (a) builds community interest (as you mention by saying how new code tends to appear immediately after something works), and (b) convinces people with resources (including companies and individuals that make non-sponsored contributions to the toolchain) that a particular technology has good potential RoI. I think the ultimate question here is the usual: is this the *best* use of the engineering time we have available? And that value-judgment is ultimately made by the people funding LTO; I haven't yet had a chance to talk to our customer about this issue. But, my personal opinion is that I would rather see LTO working, and not try to solve orthogonal problems along the way. 
-- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: type consistency of gimple
Diego Novillo wrote: If we had a GIMPLE type-system, we could allow the implicit type conversions. Right, I was trying to make this point earlier, but not being clear. It doesn't matter if every last conversion is explicit, as long as there are clear rules about where conversions may be implicit, and what the semantics of those conversions are. The question of where exactly to let implicit conversions occur can be driven by space considerations and convenience. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: And back to my original answer: it's up to each language to decide that. Hence my original question: is it legal or not? What did the C++ developers decide? The C++ standard implies that all pointer-to-object types have the same size and that all pointer-to-function types have the same size. (Technically, it doesn't say quite that; it says that you can convert T* -> U* -> T* and get the original value.) However, nothing in the standard says that pointer-to-object types must have the same size as pointer-to-function types. In theory, I believe that G++ should permit the sizes to be different. However, as far as I know, none of the G++ developers considered that possibility, which probably means that we have made the assumption that they are all the same size at some points. I would consider places where that assumption is made to be bugs in the front end. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: type consistency of gimple
Michael Matz wrote: pressure build on some set of infrastructure until it has been painfully obvious for some amount of time that it has to change. (In my experience, the same thing happens in developing proprietary software; convincing product management to let you spend significant time fixing something that's not on the next release's feature list requires some good salesmanship.) How true :) Nevertheless the goals for the FSF GCC should IMHO be purely based on rather technical arguments and considerations, not driven by paying customers. Even when I was contributing purely as a volunteer, I had motivations of my own, like wanting to use a particular feature in a program I was writing. I don't think we can realistically expect that the SC can set up a master plan for everyone to follow. The SC or the maintainers are in my opinion completely justified in blocking the inclusion of a technically inferior patch, even if it has some short-term benefit. There's no reason that any contributor should get to jam in a patch that's going to make things hard for everyone else in future. So, I'm not arguing that whoever gets there first wins, by any means. But, I don't think we can ignore the motivations of contributors, either; we've got to accept that they'll invest time/effort/money in GCC only to the extent they see return on that investment. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: However, the C++ definition has been amended at the last Lillehammer meeting to allow that cast as conditionally supported: either it is valid or it errors out; the compiler has to tell. Also, the mechanism to create multiple pointer sizes (__attribute__((mode))) is a GCC extension. I'm very suspicious of allowing users to specify this via attributes. Having pointers-to-objects or pointers-to-functions with different sizes (within one of those classes) seems problematic, but perhaps you can say more about how you expect this to work in the presence of conversions and such. I expected that what you were asking was whether the back end could reasonably say that function pointers had size four, say, while data pointers had size two. I think that's reasonable, but I don't find it nearly so reasonable to say that some int * pointers have size four while others have size two. But, maybe I just need appropriate motivation. Also, there is a nasty class of bugs in G++ stemming from the GCC attribute extensions because there are no well-defined rules about how to tell if two types with different attributes are the same or different, and if they are different what conversions (if any) can be used to convert back and forth. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
Andrew Pinski wrote: Aren't there some targets (like ia64-hpux) that support two different sizes of pointers Those are entirely separate ABIs, controlled by a command-line option. There are not multiple pointer sizes within any single program. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: I'm very suspicious of allowing users to specify this via attributes. Me too, but that's what extensions are all about. You never know what the user is going to need to do. Sorry, but I think that's far too casual. The long history of GCC extensions causing various kinds of problems is proof positive that new extensions should be added only with extreme care. We've already established the position that new extensions should come with a list of changes to the relevant language standards with the same rigor that would be used to modify the language itself. but I don't find it nearly so reasonable to say that some int * pointers have size four while others have size two. But, maybe I just need appropriate motivation. Well, it seems to work just fine in C. Well, I think it's in direct conflict with the C++ standard. If X is a 32-bit pointer type, x is a value of type X, and Y is a 16-bit pointer type, then: (X*)(Y*)x is supposed to get you back the value of x, but I don't see how that can work, in general. So, if you want to do this in C++, you need to work through the language standard and work out how having two distinct classes of pointers is going to work. I think that's doable, but not trivial. For example, you might make the 16-bit pointers what the standard calls "pointers", and then make the 32-bit pointers "big pointers". You could say that within a single class of pointers, all the usual pointer rules apply. Then, work through things like conversions between the two (is one direction implicit? are these static_casts or reinterpret_casts?), what mangling to use for big pointers, how to encode the types in RTTI, etc. If you really just need these things in a few places (like reset vectors), then I think you'd be better off with a __builtin_long_pointer() intrinsic (returning a 32-bit integer, not pointer) to use in the initializers. 
You only need two classes of pointers if you expect people to use the second class in non-trivial expressions, i.e., dereference them, perform pointer arithmetic on them, etc. There's also nothing inherently wrong with assembly code; if it's necessary to express the reset vector in assembly code, well, then, so be it. I can well see why doing it in C/C++ is nicer, but I don't think we should try to extend GNU C/C++ to encompass everything that can be done with an assembler. Also, there is a nasty class of bugs in G++ stemming from the GCC attribute extensions because there are no well-defined rules about how to tell if two types with different attributes are the same or different, and if they are different what conversions (if any) can be used to convert back and forth. We have target hooks for function attributes, no reason to avoid target hooks for data attributes. Sure -- but you still have to say what the semantics are! In my opinion, "it seems to work" arguments are far too loose for making semantic changes to the C++ front end. We've been burned too many times. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: type consistency of gimple
Kenneth Zadeck wrote: I am modifying my code so that there is a preprocessor flag, STUPID_TYPE_SYSTEM, that either writes or does not write the redundant type nodes. I think the macro name is needlessly negative, but I think the idea is fine. Could we just say something like EXPLICIT_TYPE_INFORMATION instead? I would suggest that we ask those with patches to strengthen the type system to contribute those patches to the lto branch and for Diego (who I believe has the last working type checker) to contribute that type checker to the lto branch. I agree. I think it's very desirable for the type-checker to be a separate pass so that we can run it at various points in the compilation to check for consistency; that will help us isolate problems. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: Well, I think it's in direct conflict with the C++ standard. If X is a 32-bit pointer type, and x is a value of type X, Y is a 16-bit pointer type, then: (X*)(Y*)x is supposed to get you back the value of x, but I don't see how that can work, in general. I made an error in the code above; I should have said (X)(Y)x, since I had already defined X and Y to be pointer types. Where in the standard does it say that? [expr.reinterpret.cast] A pointer to an object can be explicitly converted to a pointer to an object of different type. Except that converting an rvalue of type "pointer to T1" to the type "pointer to T2" (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified. The "except that" sentence implies the statement above, assuming that the pointed-to type does not have stricter alignment. So, if casting a 32-bit pointer to int to a 16-bit pointer to char and back does not always yield the same value, then something has to give. Fundamentally, pointer types in C++ are compound types determined solely by the pointed-to type; what you're doing (by adding attributes to the pointer) is adding a new operator for forming compound types. That's a language extension, so it needs to be specified. It's not enough just to tweak the back end to allow the mode attribute. You only need two classes of pointers if you expect people to use the second class in non-trivial expressions, i.e., dereference them, perform pointer arithmetic on them, etc. Like the m16c, which lets you put additional less-frequently used data in function memory? Perhaps a table of strings, or some CRC lookups? If you really need two classes of pointers, then, sure, you need them. All I did was ask whether or not you really need them and offer a possible solution if you *don't* need them. 
I am aware of near and far pointers in Borland (and other) compilers. That's good news; you may have an example to help work through the issues. That doesn't mean that there are no issues. You seem to be trying to convince me that this is a simple thing and that we should just do it and let the chips fall where they may. You might be right -- but since almost every other such change has led to trouble, I'm not inclined to take chances. Please do the work up front to specify how this interacts with all aspects of the language. That's user documentation we need anyhow. I do think this sounds like a useful feature. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: The "except that" sentence implies the statement above, assuming that the pointed-to type does not have stricter alignment. So, if casting a 32-bit pointer to int to a 16-bit pointer to char and back does not always yield the same value, then something has to give. reinterpret_cast doesn't require that the intermediate form have the same bit pattern. Exactly so. However, all valid pointers must be handles, so unless the 32-bit address space is sparse, something will go wrong. The way I read it, a pointer to an object can be converted to a pointer to a different type of object, but as reinterpret_cast already leaves the const qualifier alone, it seems to be focusing on the object's type, not the pointer's type. There's no distinction in ISO C++; every object type has exactly one associated pointer type. The point of reinterpret_cast is to let you convert A* to B* where A and B are unrelated object types. It's an operation on pointers, not general objects; for example, you can't do reinterpret_cast<double>(7), but you can do reinterpret_cast<double*>((int*)0). If you say C++ doesn't support them, I'll take it out and make it obvious that C++ doesn't support them, as long as C still supports them (because I use that feature a lot). I just don't want it to crash when the user does something that appears to be legal based on the gcc manual. Good call. I don't feel qualified to comment for C, but for C++, I think it's correct to say that we don't support them. I think we *could* support them, in theory, but that would be a good bit of work. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
Richard Kenner wrote: I'm very suspicious of allowing users to specify this via attributes. Having pointers-to-objects or pointers-to-functions with different sizes (within one of those classes) seems problematic, but perhaps you can say more about how you expect this to work in the presence of conversions and such. I think there's some confusion here. So you need to be able to express the interfaces to both of these and that requires both pointer sizes. The confusion is perhaps that you're thinking that my statement that we need to specify the semantics implies that I don't think it's a useful feature. I do think it's a useful feature, but I also think that you can't just drop it into C++ without thinking about all the consequences of that action. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: reinterpret_cast doesn't require that the intermediate form have the same bit pattern. Exactly so. However, all valid pointers must be handles, so unless the 32-bit address space is sparse, something will go wrong. I didn't help things here by saying handles; I meant handled. Sorry! I would go so far as to say that it's defined (hence supported) if the intermediate form is at least as many bits as the other types. I'm not sure if I understand. In ISO C++, it would be fine for char * to have more bits than all other pointers. The standard says X* -> Y* -> X* is value-preserving if Y has no stricter alignment than X. Since char has weak alignment requirements, Y can be char. Is that what you mean? In ISO C++, there's of course no notion of char *far or char *near; there's just char *. So, there's no way to directly map your intended type system onto the conversion sequence above. The spirit of the standard would seem to be that X* near -> X* far -> X* near be value-preserving, but to make no guarantees about X* far -> X* near -> X* far. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: The problem seems to revolve around casts. How about if I turn the abort into a useful message about casting between pointer sizes, and require the user to use a builtin to convert pointers? That's a good start -- but, at the very least, you still have to say what happens for type_info and define name-mangling. Your suggestion isn't going to be easy to implement, either; the front end probably has lots of places where it handles, for example, implicit conversions from Derived* to Base*, and it's going to be looking at the types of the pointed-to objects, but all it's going to check for the pointer types is that they are in fact POINTER_TYPE nodes. I think you really have to accept that the change you want to make goes to a relatively fundamental invariant of C++. It's not something you can do correctly as a quick change; you have to think through all the consequences. I'll again point out that in your reset-vector example you don't actually need any pointer operations. You could just as well do: typedef int ifunc __attribute__((mode(SI))); vects[A0_VEC] = __builtin_pointer32 (timer_a0_handler); I think you should consider that solution. It's not as pretty, for the programmer, but it's a lot less problematic from a language point of view. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: I'll again point out that in your reset-vector example you don't actually need any pointer operations. I'm not trying to dereference any of these nonstandard pointers. Good! In that case, you don't need them to be pointers at all. :-) I think you should just declare them as integer types. (You can give them pointer-sounding typedef names.) Then, provide builtins for converting to/from real pointers. The other things: assignment, copy, storage, cast to/from integers, etc. will then just work. Rarely, calling a function indirectly, but that would have to be specific to the target, and documented therein. Again, a built-in will work here. If you avoid trying to introduce multiple *pointer* types, and just treat these things as *integer* types, you avoid all of the hard language issues. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: Good! In that case, you don't need them to be pointers at all. :-) I think you should just declare them as integer types. That makes initializing function pointers messy. Besides, they're not integers. They're pointers. I'd *like* to be able to dereference them in some simple ways in the future; telling me to just use integers is a step backwards. Why don't I just use assembler, or just use C? That defeats the whole purpose of having a high-level language. I think we're going to have to agree to disagree. Sure, it would be nice if these things were pointers. I'd be happy to see a specification for how these alternative pointers work, and I'd be happy to consider a patch that made a serious stab at implementing that specification. However, I will reject any patch to support these alternative pointers in C++ until all the language issues have been resolved. I'm strongly opposed to adding more extensions to GNU C++ without thinking through all of their implications. We've suffered far too much pain for far too many years because of doing precisely this in the past. I would also argue against this extension in C at this point because users expect GNU C extensions to work in C++ as well. However, I think it would be presumptuous for me to try to reject supporting these pointers in GNU C; that's for the C maintainers to say. Since you seem to be hesitant (and, reasonably so, in my opinion!) to work on the language-design issues for C++, I would recommend the integer approach as a way of providing the functionality you need in the short term. Sorry, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
DJ Delorie wrote: At the moment, there are no language issues to resolve. No, rather, at the moment there is a class of programs which are accidentally accepted by the C++ front end and happen to do some of the things that you want. That is different from saying that the compiler supports the feature that you want. This is an undocumented extension. Note that the documentation for the mode attribute says: This in effect lets you request an integer or floating point type according to its width. (It does not say "pointer type".) It then goes on to say and `pointer' or `__pointer__' for the mode used to represent pointers. The use of the definite article indicates that there is in fact only one mode for pointers. The remaining problems aren't really language issues. I'm surprised that you believe this, in view of the concrete examples I've given. I'm not sure how to convince you that there's anything non-trivial to do here, but there is. So, unfortunately, I'm stuck: all I can do is tell you that the fact that this currently works at all in C++ is an accident, might go away at any point, and probably violates various assumptions in the front end. If you're willing to use integers, you'll have an easy time. If you want the more elegant language semantics of multiple pointer sizes, you'll have to take the time to read through the C++ standard and think about all the potential impact of the extension you're proposing. Or, you can decide to work on the middle end issues, try to get your patches accepted, and then come back to the front end issues later. I don't have any plans to aggressively go reject this code in the C++ front end, but I would vaguely like to clean up and specify how attributes interact with C++ in more detail, and that might result in problems for this usage. (We really need to specify how attributes interact with same-typed-ness, implicit conversions, etc.) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: does gcc support multiple sizes, or not?
Bernd Jendrissek wrote: May I jog your memory about named address spaces? Are near and far pointers something that might be able to be shoehorned into any [future] infrastructure for supporting these named address spaces? Same for DJ's oddball pointers - could they fit? Maybe so -- but that's another can of worms from a language design point of view... -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.2 Status Report (2006-08-22)
I'm pleased to note that there's been a noticeable decrease in open regressions relative to my previous report. The total number of regressions has fallen from 160 to 140, and the number of P1s has fallen from 29 to 21. In an effort to move us closer to the goal of 100 open regressions required to create the branch, CodeSourcery's GNU toolchain team will be having a GCC 4.2 regression hack-a-thon this coming Friday, August 25th. Our goal is to fix at least one P1 or P2 regression per Sourcerer. Please join us! Assuming that we continue to reduce the regression count, I'll start setting up the 4.3 process soon, so that we're ready to go before the 4.2 release branch is created. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
the abbreviation tables. I'm making the assumption that if f calls N functions, then they probably come from N object files. I have no data to back up that assumption. (There is nothing that says that you can only have one abbreviation table for all functions. You can equally well have one abbreviation table per function. In that mode, you trade space (more abbreviation tables, and the same abbreviation appearing in multiple tables) against the fact that you now only need to read the abbreviation tables you need. I'm not claiming this is a good idea.) I don't find this particular argument (that the abbreviation tables will double file I/O) very convincing. I don't think it's likely that the problem we're going to have with LTO is running out of *virtual* memory, especially as 64-bit hardware becomes nearly universal. The problem is going to be running out of physical memory (and thereby paging like crazy), or running out of D-cache. So, I'd assume you'd just read the tables as needed, and never bother discarding them. As long as there is reasonable locality of reference to abbreviation tables (i.e., you can arrange to hit object files in groups), then the cost here doesn't seem like it would be very big. 2) I PROMISED TO USE THE DWARF3 STACK MACHINE AND I DID NOT. I never imagined you doing this; as per above, I always expected that you would use DWARF tags for the expression nodes. I agree that the stack machine is ill-suited. 3) THERE IS NO COMPRESSION IN DWARF3. In 1 file per mode, zlib -9 compression is almost 6:1. In 1 function per mode, zlib -9 compression averages about 3:1. In my opinion, if you considered DWARF + zlib to be satisfactory, then I think that would be fine. For LTO, we're allowed to do whatever we want. I feel the same about your confession that you invented a new record form; if DWARF + extensions is a suitable format, that's fine. 
In other words, in principle, using a somewhat non-standard variant of DWARF for LTO doesn't seem evil to me -- if that met our needs. 2) LOCAL DECLARATIONS Mark was going to do all of the types and all of the declarations. His plan was to use the existing DWARF3 and enhance it where necessary, eventually replacing the GCC type trees with direct references to the DWARF3 symbol table. The types and global variables are likely OK, or at least Mark should be able to add any missing info. Yes, I agree that if you're not using DWARF for the function bodies, you probably want your own encoding for the local variables. We will also need to add other structures to the object files. We will need to have a version of the cgraph, in a separate section, that is in a form so that all of the cgraphs from all of the object files can be read and processed without looking at the actual function bodies. Definitely. function only calls other pure functions and so on... If we simply label the call graph with the locally pure and locally constant attributes, the closure phase can be done for all of the functions in the LTO compilation without having to reprocess their bodies. Virtually all interprocedural optimizations, including aliasing, can and must be structured this way. You could also label the function declarations. There's a decision to make here as to whether the nodes of the call graph are the same as the DWARF nodes for the functions themselves, or are instead separate entities (which, of course, point to those DWARF nodes). It would be nice, a priori, to have this information in the DWARF nodes because it would allow the debugger to show this information to users and to view it via DWARF readers. However, I can also imagine that it needs to be in the separate call graph. I have not done this because I do not rule the earth. That was not what I was assigned to do, and I agreed that DWARF3 sounded like a reasonable way to go. 
Now that I understand the details of DWARF3, I have changed my mind about the correct direction. Now is the time to make that change before there is a lot of infrastructure built that assumes the DWARF3 encoding. I think it's great that you're asking for feedback. My only feedback is that you may not need to make this decision *now*. We could conceivably wire this up, work on the other things (CFG, etc.) and return to the encoding issue. I'm vaguely in favor of that plan, just in that I'm eager to actually see us make something work. On the other hand, building up DWARF reading for this code only to chuck it later does seem wasteful. But, the DWARF reader is already there; it's mostly filling in some blanks. But, filling in blanks is always harder than one expects. So, I think this should really be your call: rework the format now, or later, as you think best. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote: Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double-the-I/O argument comes from. I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file specific data that you need to access. I have never depended on the kindness of strangers or the virtues of virtual memory. I fear the size of the virtual memory when we go to compile really large programs. I don't think we're going to blow out a 64-bit address space any time soon. Disks are big, but they are nowhere near *that* big, so it's going to be pretty hard for anyone to hand us that many .o files. And, there's no point manually reading/writing stuff (as opposed to mapping it into memory), unless we actually run out of address space. In fact, if you're going to design your own encoding formats, I would consider a format with self-relative pointers (or offsets from some fixed base) that you could just map into memory. It wouldn't be as compact as using compression, so the total number of bytes written when generating the object files would be bigger. But, it will be very quick to load it into memory. I guess my overriding concern is that we're focusing heavily on the data format here (DWARF? Something else? Memory-mappable? What compression scheme?) and we may not have enough data. I guess we just have to pick something and run with it. I think we should try to keep that code as separate as possible so that we can recover easily if whatever we pick turns out to be (another) bad choice. 
:-) One of the comments that was made by a person on the DWARF committee is that the abbrev tables really can be used for compression. If you have information that is really common to a bunch of records, you can build an abbrev entry with the common info in it. Yes. I was a little bit surprised that you don't seem to have seen much commonality. If you recorded most of the tree flags, and treated them as DWARF attributes, I'd expect you would see relatively many expressions of a fixed form. Like, there must be a lot of PLUS_EXPRs with TREE_USED set on them. But, I gather that you're trying to avoid recording some of these flags, hoping either that (a) they won't be needed, or (b) you can recreate them when reading the file. I think both (a) and (b) hold in many cases, so I think it's reasonable to assume we're writing out very few attributes. I had a discussion on chat today with drow and he indicated that you were busily adding all of the missing stuff here. "All" is an overstatement. :-) Sandra is busily adding missing stuff and I'll be working on the new APIs you need. I told him that I thought this was fine as long as there is not a temporal drift in information encoded for the types and decls between the time I write my stuff and when the types and decls are written. I'm not sure what this means. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Daniel Berlin wrote: On 8/31/06, Kenneth Zadeck [EMAIL PROTECTED] wrote: Mark Mitchell wrote: Kenneth Zadeck wrote: Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double-the-I/O argument comes from. I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file specific data that you need to access. I just do not know. I assume that you are right, that there is some clumping. But I am just not sure. I just want to point out that this argument (okay, cache locality) was used as a reason that the massive amount of open/seek/close behavior by Subversion's FSFS filesystem is a-ok. Here, we won't be making syscalls -- but we will be taking page faults if we go out of cache. I don't know what the consequences of page faults for files backed over NFS are, but if your object files are coming over NFS, your linker isn't going to go too fast anyhow. I would expect most users carefully use local disk for object files. Since we're descending into increasingly general arguments, let me say it more generally: we're optimizing before we've fully profiled. Kenny had a very interesting datapoint: that abbreviation tables tended to be about the size of a function. That's great information. All I'm suggesting is that this data doesn't necessarily imply that enabling random access to functions (as we all agree is necessary) implies a 2x I/O cost. It's only a 2x I/O cost if, every time you need to go look at a function, the abbreviation table has been paged out. I think we've gotten extremely academic here. 
As far as I can tell, Kenny has decided not to use DWARF, and nobody's trying to argue that he should, so we should probably just move on. My purpose in raising a few counterpoints is just to make sure that we're not overlooking anything obvious in favor of DWARF; since Kenny's already got that code written, it would be nice if we had a good reason not to start over. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote: I am not so concerned with running out of virtual address space as I am about being able to break this up so that it can be done in parallel, on a farm of machines. Otherwise, lto can never be part of anyone's compile/test loop. I think we just expanded the scope of work by an order of magnitude. :-) If you had just said that you wanted to support multi-threaded LTO, that would have been a big deal. But multiple machines with multiple address spaces trying to do LTO on one program is a really big deal. (Of course, there is a cheap hack way of doing what you want: run LTO on clumps of object files in parallel, and then just link the pre-optimized files together in the ordinary way.) I'd really like to see us inline a function before we even begin to have this conversation. :-) I have no idea how stable all the types and decls are over a compilation. I write my info pretty early, and I assume the types and decls are written pretty late in the compilation (otherwise you would not have address expressions for the debugger). If there has been any processing on these between when I write my stuff and when the types and decls get written, things may not match up. I don't think that this is an issue. The important information about types and declarations is stable. Things like "is this declaration used?" change over the course of the compilation, but that's not useful for DWARF anyhow -- and, in general, we don't write out information about types/declarations that are entirely unused. The key aspects (sizes/layouts/etc.) are fixed. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Daniel Jacobowitz wrote: On Thu, Aug 31, 2006 at 09:24:20AM -0700, Mark Mitchell wrote: Here, we won't be making syscalls Yes, you almost certainly will. OK, good point. In any case, my concern is that we're worrying a lot about on-disk encoding, but that there are lots of other hard problems we have to solve before we can make this work -- even independent of resource constraints. So, I suggest we choose an encoding that seems approximately reasonable, but not worry too much about exactly how optimal it is. I think that boils down to: Kenny, I think you should do what you think best, but without working too terribly hard. :-) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
mathieu lacage wrote: I have spent a considerable amount of time looking at this: the abbrev tables output by gcc are not totally random; their entries are sorted by their abbrev code. That is, the abbrev code of entry i+1 is higher than that of entry i. That's an interesting observation. Essentially, you've shown that, because the entries are sorted, the cost of finding an entry in an abbreviation table of size n can be cut from linear to logarithmic by using a binary search. Since, for LTO, we certainly can depend on the .o file being produced by GCC, we could depend on this behavior, even though it's not mandated by the DWARF standard. I think this is probably moot, since I believe that Kenny feels DWARF is not suitable for reasons other than the abbreviation table issue, but this is a clever technique. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.3 Projects Page
Since we're making some headway on GCC 4.2, it's now time to start thinking about GCC 4.3. As for the past couple of releases, let's start by trying to gather information about what people are planning to contribute for GCC 4.3. Please add your project page to the bottom of: http://gcc.gnu.org/wiki/GCC_4.3_Release_Planning In this cycle, I'm going to try to prioritize projects for Stage 1 by readiness (e.g., has your patch already been tested fully? is it current with mainline? has the patch already been approved?), and then by time of original submission to gcc-patches. I want to make sure that anything that fell through the cracks for 4.2 gets first priority for 4.3. Since the idea is that you prepare major patches for Stage 1 during the previous release cycle, if you haven't started writing it now, and you want it in 4.3, please type quickly. The 4.2 branch will not be created before September 18th, even if we make enough progress to make that possible, so that everyone has plenty of time to list their projects. The 4.2 branch may be created later than September 18th, if either (a) 4.2 isn't looking solid enough, or (b) we need more time to get 4.3 organized. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Projects Page
Joe Buck wrote: On Fri, Sep 01, 2006 at 03:56:30PM -0700, Mark Mitchell wrote: Please add your project page to the bottom of: http://gcc.gnu.org/wiki/GCC_4.3_Release_Planning BTW, that page provides a link to SampleProjectPage which does not exist. Thanks! I forgot which Wiki syntax I was using; that's now fixed. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Projects Page
Daniel Berlin wrote: On 9/1/06, Mark Mitchell [EMAIL PROTECTED] wrote: Joe Buck wrote: On Fri, Sep 01, 2006 at 03:56:30PM -0700, Mark Mitchell wrote: Please add your project page to the bottom of: http://gcc.gnu.org/wiki/GCC_4.3_Release_Planning BTW, that page provides a link to SampleProjectPage which does not exist. Thanks! I forgot which Wiki syntax I was using; that's now fixed. In order to make life even easier, i renamed the sample project page to end in Template. Thanks! -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: 4.1 status?
Kenny Simpson wrote: What is the status of the 4.1 branch? Any word on 4.1.2? My current plan is to do a 4.1.2 along with 4.2.0. My concern has been that with 4.2.0 moving slowly, trying to organize another release might just distract the developer community. However, I realize that's a pretty wide gap between 4.1.1 and 4.1.2. We could also do 4.1.2 sooner, and then do 4.1.3 along with 4.2.0. (I want to do a 4.1.x release along with 4.2.0 so as to avoid the problems we have had in the past with quality going backwards between releases from different branches.) I'm sure that, a priori, people would prefer a 4.1.2 release, but it does take effort. On the other hand, many 4.1 bugs are also in 4.2. Any thoughts? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: pr27650 - dllimport of virtual methods broken.
Carlos O'Donell wrote: Is any of you able to give some comments on pr27650 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27650 In particular I am interested in an opinion of Danny's fix. http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01504.html I definitely don't know enough about attributes and dllimport to comment. The fix works, but is it correct? I think the idea of the patch is correct: virtual functions shouldn't be marked dllimport because you need their addresses to be constants so that they can go in virtual tables. However, I think that the winnt.c change (which has already been checked in) shouldn't be necessary. It only makes sense if we set DECL_DLLIMPORT_P at one point, and then set it back to zero, indicating that we don't want to allow the function to be dllimport'ed. But, in that case, I don't think dllimport should still be on the DECL_ATTRIBUTES list. So, it seems like a band-aid. The cp/decl2.c change also seems less than ideal. The key invariant is that virtual functions can't be dllimport'd. So, I think we should mark them that way when they're declared, perhaps in grokfndecl or in cp_finish_decl. It could be that I'm missing something, though; Danny might want to debate my conclusions. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: pr27650 - dllimport of virtual methods broken.
Danny Smith wrote: The problem I had was with the second case below. We don't know if a method is implicitly virtual until search.c:look_for_overrides_r. Would it be better to unset DECL_DLLIMPORT_P (and remove the attribute as well) there? Ah, right, good point. I always forget that case, partly because I really think that processing should be done when the function is declared. We can know whether it's virtual at that point, so I think we should. But, that's not how things work now. :-( So, perhaps the best place would be in check_for_override. That's called for all methods when the class is complete. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Merging identical functions in GCC
Laurent GUERBY wrote: On Fri, 2006-09-15 at 13:54 -0700, Ian Lance Taylor wrote: Laurent GUERBY [EMAIL PROTECTED] writes: For code sections (I assume read-only), isn't the linker always able to merge identical ones? What can the compiler do better than the linker? The linker can't merge just any identical code section, it can only merge code sections where that is permitted. For example, consider: int foo() { return 0; } int bar() { return 0; } int quux(int (*pfn)()) { return pfn == foo; } Compile with -ffunction-sections. If the linker merges sections, that program will break. Indeed. The compiler could merge (or mark for the linker) static functions whose address is not taken; those are safe against this use. I wrote a tech report for a company where I used to work about this optimization about 10 years ago, in the context of C++ templates. (Unfortunately, it was internal-only, so I don't have a copy. Or, perhaps fortunately; I don't remember if I said stupid things...) Anyhow, I think that a combination of compiler/linker help and programmer help is useful. There are some cases where you can do this automatically, and others where you might need programmer help. Just as -ffast-math is useful, so might be -fmerge-functions or __attribute__((mergeable)), even if that resulted in non-standard handling of function pointers. We could certainly do this in the context of LTO. It might be a nice trick to do it in the general GCC back end, and then it would work both for a single module and for LTO; both cases are useful. It's also possible to do some of this in the front end; the paper I wrote talked about how you could notice that a particular template did not make any particular use of the type of the argument (i.e., that T* was treated the same, independent of T). I think that *all* of these places might be useful eventually: in the front end, the back end, and the linker. I'd be happy to start anywhere. :-) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Merging identical functions in GCC
Mike Stump wrote: On Sep 15, 2006, at 2:32 PM, Ross Ridge wrote: Also, I don't think it's safe if you merge only functions in COMDAT sections. Sure it is, one just needs to merge them as: variant1: nop variant2: nop variant3: nop [ ... ] this way important inequalities still work. Yes, that will work. But, in practice, people will also want the mode where you do not insert the nops, and just accept that some functions compare equal when they shouldn't. So, we should have a switch for that mode too. I think it's reasonable -- a priori -- to consider doing this optimization in both the compiler and in the linker. In the compiler, when generating a single object file, eliminate duplicates. In the linker, when linking stuff, we can do it again. I don't think we can know how much bang comes from either approach without measuring some sample programs. Pick a random application (maybe a KDE office application?) and measure how many functions, if any, in the final link image are duplicates. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Missing elements in VECTOR_CST
Andrew Pinski wrote: The documentation on VECTOR_CST is not clear about whether we can have missing elements, in that the remaining elements are zero. Right, we produce such VECTOR_CSTs for things like: #define vector __attribute__((vector_size(16) )) vector int a = {1, 2}; But is that valid? We currently produce a VECTOR_CST with just two elements instead of 4. Should we always have the same number of elements in a VECTOR_CST as there are elements in the vector type? I think it is reasonable for front ends to elide initializers and to follow the usual C semantics that elided initializers are (a) zero, if the constant is appearing as an initializer for static storage, or (b) unspecified, random values elsewhere. Requiring explicit zeros is just a way to take up memory. We clearly wouldn't want to do it for big arrays, and I think we might as well treat vectors the same way, since we already need logic to handle the implicit zeros in the back end for arrays and structures. The counter-argument is that front ends should be explicit. However, explicit doesn't need to mean verbose; as long as we specify the semantics I give above, eliding the elements is still perfectly clear. This is why PR 29091 is failing currently. output_constant assumes VECTOR_CSTs have the correct number of elements, but the C front end via digest_init creates a VECTOR_CST with only 2 elements. Thus, I think that output_constant should be changed to add the additional zeros. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Planning for GCC 4.2 branch
As I mentioned in passing last night, I'm reviewing the open GCC 4.2 PRs and catching up on the mailing list traffic today, with the intent of announcing a GCC 4.2 branch date later today together with thoughts about staging the GCC 4.3 contributions. I know that this is very short notice, but if anyone has input about whether or not we are ready to branch, that would be very helpful. Also helpful would be to add 4.3 project pages to the Wiki for any projects not already mentioned. The branch date will be no sooner than one week from today, so don't worry if you don't have time to get me input today. I will revise both the branch date and 4.3 staging in response to feedback; consider today's expected mail as a first try. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: pr27650 - dllimport of virtual methods broken.
Danny Smith wrote: cp/ChangeLog PR target/27650 * class.c (check_for_override): Remove dllimport from virtual methods. testsuite/Changelog PR target/27650 * g++.dg/ext/dllimport12.C: New file. OK, thanks. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.3 Platform List
It's a bit off topic, but as I'm thinking about GCC 4.3, I was reviewing the GCC 4.2 primary/secondary platform list, and I think it's a bit out of date. (See http://gcc.gnu.org/gcc-4.2/criteria.html.) The SC is responsible for setting the list for 4.3 -- but I think the SC would like the overall community's input. I'll start by proposing a set of changes that seems good to me, to get the ball rolling. I'm sure other people will suggest other changes. I'll try to bundle up the combined input, and forward it on to the SC. (In the hope of heading off speculation about whether I'm wearing an FSF or CodeSourcery hat in this context: (a) no CodeSourcery customer has asked me to make the suggestions I'm making, nor have I consulted with any customer about these suggestions, but (b) I'm inevitably influenced by the slice of the world with which I most often interact. So, I think I'm wearing my FSF hat -- but I certainly don't claim to be perfectly objective.) My proposed changes: 1. Replace arm-none-elf with arm-none-eabi. Most of the ARM community has switched to using the EABI. 2. Downgrade hppa2.0w-hp-hpux11.11 and powerpc-ibm-aix5.2.0.0 to secondary platforms. Update HP-UX to 11.31? Update AIX to 5.3? I like having these platforms in the list, in that the differences in object models tend to flush out bugs in GCC, but there doesn't seem to be as much interest in these systems from GCC developers as in other systems. 3. Update sparc-sun-solaris2.9 to sparc64-sun-solaris2.10? 4. Replace powerpc-apple-darwin with i686-apple-darwin. Apple's hardware switch would seem to make the PowerPC variant less interesting. 5. Add i686-mingw32 as a secondary platform. Reactions? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Platform List
Mark Mitchell wrote: My proposed changes: 1. Replace arm-none-elf with arm-none-eabi. Most of the ARM community has switched to using the EABI. 2. Downgrade hppa2.0w-hp-hpux11.11 and powerpc-ibm-aix5.2.0.0 to secondary platforms. Update HP-UX to 11.31? Update AIX to 5.3? I like having these platforms in the list, in that the differences in object models tend to flush out bugs in GCC, but there doesn't seem to be as much interest in these systems from GCC developers as in other systems. 3. Update sparc-sun-solaris2.9 to sparc64-sun-solaris2.10? 4. Replace powerpc-apple-darwin with i686-apple-darwin. Apple's hardware switch would seem to make the PowerPC variant less interesting. 5. Add i686-mingw32 as a secondary platform. I should also have added: 6. Move powerpc-unknown-linux-gnu from the secondary list to the primary list. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Platform List
Andrew Pinski wrote: On Wed, 2006-09-20 at 23:11 -0400, Mark Mitchell wrote: Reactions? Change powerpc-unknown-linux-gnu to powerpc64-unknown-linux-gnu so that we also require the 64-bit PowerPC to work. To be clear, you're suggesting that we say powerpc64-unknown-linux-gnu, but mean that both its 32-bit and 64-bit modes should work? That makes sense to me. What about MIPS/MIPS64? Also move powerpc64-unknown-linux-gnu or powerpc-linux-gnu to Primary if powerpc-aix is moving to secondary so we keep a PowerPC up as a primary target. Definitely; I'd confused myself. 5. Add i686-mingw32 as a secondary platform. Is i686-pc-cygwin just as important as mingw32 then? I wonder if you mean to ask whether mingw32 is as important as Cygwin, or the other way around? I think both are important, and about equally so. Cygwin is widely used by people used to GNU software when running on Windows and has a very active community. Windows (without Cygwin) is of course a widely-used operating system, and my perception is that a reasonable number of people are using GCC to build non-Cygwin Windows applications. However, I have no hard data. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Platform List
Andrew Pinski wrote: The last time a freebsd testresult was sent to the list from the mainline was in May, maybe that is a sign that we should downgrade it to secondary from primary. I personally have no opinion about FreeBSD; I don't feel I know enough to say anything sensible. However, the fact that there are no test results coming in does seem consistent with your suggestion. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Platform List
Jack Howarth wrote: Since Apple is committed (at least in their advertising) to provide 64-bit development tools for both PPC and Intel in Leopard, it would seem a tad premature to downgrade the powerpc-apple-darwin in favor of i686-apple-darwin for 4.3. I think maybe it's best, after my initial flurry of postings, not to respond directly -- I don't want to dominate the discussion. So, I'll just say that I think that's a perfectly reasonable suggestion, and step back. In a few days, I'll try to put together a summary of the opinions of the group. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.2 Status Report (2006-09-21)
I've reviewed the list of open PRs for 4.2. We've gotten a bit stuck: we're still at 116 PRs, with 22 P1s, which is about where we were a couple of weeks ago. (I'm going to try to beat down a few of the P1 C++ PRs tonight and tomorrow, but I doubt I'll get 16...) So, my plan is to branch as soon as we get to 100, but no sooner than September 28th. Please fix what you can; let's get this show on the road! (I'll send a separate mail about 4.3 -- but I may not be able to do that before tomorrow morning, as I want to spend some time thinking about all the projects.) Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
ARM Thumb-2 merge?
Paul -- In addition to the Thumb-2 bits, I assume you plan to merge the other ARM changes on the branch? Is that correct? (For example, what about the NEON bits?) Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
IPA branch
Jan -- I'm trying to plan for GCC 4.3 Stage 1. The IPA branch project is clearly a good thing, and you've been working on it for a long time, so I'd really like to get it into GCC 4.3. However, I'm a little concerned, in reading the project description, that it's not all that far along. I'm hoping that I'm just not reading the description well, and that you can explain things in a way that makes it obvious to me that the work is actually almost done. The Wiki page says the first part of the patches was already sent, but I can't tell how much that is, or how many of the required modification steps are already done. Have you completed all the work on the IPA branch itself, so that it's just a problem of merging? How much of the merging have you actually done? What version of mainline corresponds to the root of the IPA branch? Have maintainers with appropriate write privileges reviewed the patches? I'm not in any way trying to send a negative signal about this work. I have every hope that it will be merged soon. I just want to better understand the situation. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.3 Merge Plan
[N.B. -- We are not in Stage 1 yet!] I've categorized the projects submitted for 4.3 into Stage 1 and Stage 2 projects. The criteria I've used are (a) completeness (projects that are more complete should go in earlier), and (b) riskiness (projects that are riskier, affect more targets, etc., should go in earlier). I've also tried to give priority to projects that were submitted for 4.2, but didn't make it in. Feedback is welcome. As always, you still need to get your patches reviewed before check-in; the fact that your project is on the list doesn't imply that it will actually be accepted, though, of course, that is the hope. Here are some additional ground rules and notes: * Recruit a Reviewer If you can't approve your own contribution, please locate an appropriate reviewer now. Find a person or people who will commit to reviewing your patch, and put their name on your project Wiki page. Lining up a reviewer can be more important than finishing the code; in the past, we've had the problem of completed patches with no reviewer available. * Check-in Windows If you are contributing a project, and think you are almost ready to check in your code, please announce that fact. From the point of your announcement, you have a 72-hour window to get the contribution done. During your window, you can impose your own check-in rules on mainline. In general, please let people continue to check in as many things as possible -- but if you really need to lock everything down to test on multiple platforms, etc., you have the freedom to do that. Before you make your reservation, you should have merged, tested, and have had your patches reviewed. The reservation should just be for a final merge/test/check-in cycle. When you're done, please announce that the branch has reverted to normal rules. * Stage 2 Projects If you have a Stage 2 project that's ready, reviewed, and tested, you can check in early. 
For example, the Stage 2 list has architecture-specific work listed for ARM, ColdFire, and x86-64. If that work is ready, it's unlikely to affect the Stage 1 work, so it's fine for it to go in early. The reason it's in Stage 2 is that it wouldn't perturb me for it to go in during Stage 2; in contrast, the new dataflow stuff should go in soon so that we have time to fix any problems that arise and so that future work can build on that platform. However, if you are going to commit early, please announce that fact at least 72 hours before you actually do your check-in, so that people working on Stage 1 projects can ask you to wait, and please respect any such requests. * Internal Ordering I'm not going to try to order the Stage 1 projects. When you're ready, go ahead and commit. But, please do try to keep other developers informed of your intentions. * Submit Early Remember that Stage 1 isn't really the time to be doing major development; it's primarily the time to be *merging* major development. So, get your branches merged up to mainline, write documentation, do tests, and submit your patches. There's no reason a patch for Stage 1 can't be completed and submitted in Stage 2/Stage 3 of the previous release cycle. You can certainly submit your Stage 2 patch during Stage 1. * Uncategorized Projects I've left the Fixed-Point Arithmetic and Variadic Templates projects as uncategorized for the moment. The fact that these projects aren't (yet) on the merge plan doesn't mean that they're not nice projects. However, the former looks like a very substantial change, and isn't done yet, so my tentative feeling is that it should wait for GCC 4.4, by which point it will hopefully be more complete. The latter requires consensus on C++ 0x. I plan to summarize what I think the consensus is for the development list, and then ask for SC ratification. I'd be happy to see both of these projects in GCC 4.3 if all the pieces come together in time. 
-- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: GCC 4.3 Platform List
Mark Mitchell wrote: I have now reviewed the suggestions. Here is the mail that I plan to recommend to the SC. (Of course, I can't guarantee what the SC will do with it.) I've tried to take into account most of the feedback. However, I've tried to note all of these suggestions in my draft mail so that the SC will see all of the ideas, even those with which I don't agree. Please let me know if your recommendation isn't reflected in this mail, either in the Recommendations or in the Other Suggestions section -- that means that I unintentionally overlooked it. And, if you would like to address the SC directly, please let me know and I will be happy to pass along your message.

Recommendations
===============

I think that we should update the list of primary and secondary platforms for GCC 4.3. This section has my personal recommendations, based on feedback from the GCC developer mailing list. In the next section are other suggestions that I received, but which I don't personally endorse.

Primary Platforms
-----------------

* arm-eabi
* i386-unknown-freebsd
* i686-pc-linux-gnu
* i686-apple-darwin
* mipsisa64-elf
* powerpc64-unknown-linux-gnu
* sparc-sun-solaris2.10
* x86_64-unknown-linux-gnu

Secondary Platforms
-------------------

* hppa2.0w-hp-hpux11.23
* powerpc-ibm-aix5.2.0.0
* powerpc-apple-darwin
* i686-pc-cygwin
* i686-mingw32
* ia64-unknown-linux-gnu

Other Suggestions
=================

Here are the suggestions that I'm not endorsing personally, but which I would like to pass along:

* AVR as a secondary target
* Replace the MIPS bare metal platform with a GNU/Linux platform.
* Make both Intel and PowerPC Darwin primary platforms.

-- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: representation of struct field offsets
Sandra Loosemore wrote: I've been having a heck of a time figuring out how to translate the offsets for struct fields from the DWARF encoding back to GCC's internal encoding for the LTO project. Yes, that's a nasty bit. I think the DECL_FIELD_OFFSET/DECL_FIELD_BIT_OFFSET stuff is, quite simply, mis-designed. The way I think it should work is for DECL_FIELD_OFFSET to be the byte offset, and DECL_FIELD_BIT_OFFSET to be the bit offset, always less than BITS_PER_UNIT. But, that's not how it actually works. Instead, the BIT_OFFSET is kept below the alignment of the field, rather than BITS_PER_UNIT. The bit of dwarf2out.c that emits the offset for the field is add_data_member_location_attribute. It uses dwarf2out.c:field_byte_offset, which is the function that normalizes the weird GCC representation into the obvious one. I don't know why it's using a custom function; I would think it should just use tree.c:byte_position. The current DWARF code looks oddly heuristic. But that doesn't explain why you're not getting idempotent results. Are you going through the stor-layout.c:place_field routines when creating structure types? If so, I wouldn't; here, you know where stuff is supposed to go, so I would just put it there, and set DECL_FIELD_OFFSET, etc., accordingly. My bet is that you are not setting DECL_ALIGN, or that we have failed to set TYPE_ALIGN somewhere, and that, therefore, the heuristics in dwarf2out.c:field_byte_offset are getting confused. For example, simple_type_align_in_bits might not be working. I would probably step through field_byte_offset both when compiling C and in LTO mode, and try to see where it goes differently. It shouldn't be necessary as part of this work, but I can't see why we shouldn't just replace field_byte_offset with a use of byte_position. Does anyone else know? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: representation of struct field offsets
Chris Lattner wrote: An alternative design, which would save a field, is just to keep the offset of a field, in bits, from the start of the structure. Yes, that would also work. But, in many cases, you need the byte offset, so there's a time/space tradeoff. Also, because of GCC's internal representation of integers, you have to be careful that you have enough bits; for example, you need 72 bits to represent things in a 64-bit address space. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: representation of struct field offsets
Chris Lattner wrote: Also, because of GCC's internal representation of integers, you have to be careful that you have enough bits; for example, you need 72 bits to represent things in a 64-bit address space. Actually, just 67, right? Does GCC support structures whose size is greater than 2^61? I'm not sure -- but if it doesn't, it should. There are folks who like to make structures corresponding to the entire address space, and then poke at particular bytes by using fields. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: IPA branch
Razya Ladelsky wrote: Except for new optimizations, IPCP (currently on mainline) should also be transformed to SSA. IPCP in SSA code exists on the IPA branch, and will be submitted to GCC 4.3 after the IPA branch is committed and some testsuite regressions failing with IPCP+versioning+inlining are fixed. Is there a project page for this work? Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
GCC 4.3 project to merge representation changes
Kazu, Sandra -- I don't believe there is a GCC 4.3 project page to merge the work that you folks did on CALL_EXPRs and TYPE_ARG_TYPEs. Would one of you please create a Wiki page for that? Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713