Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/29/12, Michael Matz m...@suse.de wrote: On Sun, 27 May 2012, Gabriel Dos Reis wrote: people actually working on it and used to that style. We don't want to have a mixture of several different styles in the compiler. I (and I expect many others) don't want anyone working around the latter by going over the whole source base and reindent everything. Hence inventing a new coding standard for GCC-in-C++ (by reusing existing ones or doing something new) that isn't mostly the same as GCC-in-C isn't going to fly. if this coding standard is going to be adopted as a GNU coding convention, then you have to be flexible and allow yourself to see beyond the past written in C. You have to ask yourself: how do I want the codebase to look like in 10, 15, 20, 25 years. ... And thanks for making clear what the whole GCC-in-c++ stunt is about. ( ... ) Namely useless noise and source change activity for the sake of it. The conversion to C++ is not a stunt. It is an attempt to reduce the cost of developing GCC and to ease the path for more developers to contribute. I believe progress on those goals is necessary to the long-term health of GCC. Do you wish to see progress to those goals? If so, what you have us do differently? We need a coding standard for C++ if we are to use C++. A whole new coding standard would be disruptive, and so the proposals on the table are incremental changes to the existing C conventions. There have been discussions about potential future changes, more in line with industry practice, but they are not present proposals. That activity is part of the construction work. Any construction work is always going to have a few pardon the inconvenience signs. If there is anything we can do to reduce that, but still make progress, please let us know. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
Lawrence == Lawrence Crowl cr...@google.com writes: Lawrence On 5/24/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On May 24, 2012 Lawrence Crowl cr...@google.com wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. Lawrence, is there any chance you could just call it hash_table? After the conversion, we will be living most of the time in a typed world, so the typed_ prefix will be redundant if not confusing :-) Lawrence The name hash_table is already taken in libcpp/include/symtab.h. Lawrence Do you have any other suggestions? FWIW I think it would be fine if you wanted to rename the libcpp hash table to something else, say cpp_hash_table, to free up 'hash_table' for use in gcc. Tom
Re: [cxx-conversion] New Hash Table (issue6244048)
My main concern is that the precise collector we have in place now requires substantial care to use. It is supported by a tool that does not really understand C, let alone C++. We avoid the problems when annotations are unnecessary. We can do that in one of two ways, either by upgrading GTY to understand more of the language so that it infers the right things from regular code, or by changing the implementation strategy so that the garbage collector does not need type information. I prefer the latter approach, as it makes our experience closer to our customers' experience. Are you really saying that, because our customers might have different ways of doing garbage collection in their code, we should drop our implementation that might well be superior to theirs? That would be a weak argument IMO, all the more so that, in the end, this might degrade the GCC experience for them. -- Eric Botcazou
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 8:43 PM, Mike Stump mikest...@comcast.net wrote: On May 25, 2012, at 10:50 AM, Lawrence Crowl wrote: Diego and I looked long and hard at this issue. It all came down to a sequence of problems. First, libstdc++ isn't rigged for GTY, If portability to other C++ compilers wasn't a concern, we could extend out g++ to make supporting GTY better, so that we can simplify and refine the GTY stuff. I fear we need some light weight reflection, might make a nice language feature for a future C++ standard, if done well. Seconded. It points the finger of my #1 concern with the C++ conversion - our GC. We need a GC scheme that allows us to use standard library containers, and the scheme that was outlined earlier would work. Are the TR1 hash table implementations using any non-C++98/03 features? If not then I would suggest to use our TR1 hash tables. Richard.
Re: [cxx-conversion] New Hash Table (issue6244048)
On May 25, 2012, at 2:42 PM, Lawrence Crowl wrote: On 5/24/12, Jakub Jelinek ja...@redhat.com wrote: You haven't looked at the most important problem of that approach - code bloat. Are you claiming that the size of the binary is more important than run-time performance, type safety, and source code size? I just look at it as a nice commoning problem in search of a nice solution.
Re: [cxx-conversion] New Hash Table (issue6244048)
On May 26, 2012, at 11:25 PM, Gabriel Dos Reis wrote: how do I want the codebase to look like in 10, 15, 20, 25 years. Gee, if the next 25 look just like the last 25, it will look very similar to what it looks like today. :-)
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/24/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On May 24, 2012 Lawrence Crowl cr...@google.com wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. Lawrence, is there any chance you could just call it hash_table? After the conversion, we will be living most of the time in a typed world, so the typed_ prefix will be redundant if not confusing :-) The name hash_table is already taken in libcpp/include/symtab.h. Do you have any other suggestions? -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/28/12, Eric Botcazou ebotca...@adacore.com wrote: My main concern is that the precise collector we have in place now requires substantial care to use. It is supported by a tool that does not really understand C, let alone C++. We avoid the problems when annotations are unnecessary. We can do that in one of two ways, either by upgrading GTY to understand more of the language so that it infers the right things from regular code, or by changing the implementation strategy so that the garbage collector does not need type information. I prefer the latter approach, as it makes our experience closer to our customers' experience. Are you really saying that, because our customers might have different ways of doing garbage collection in their code, we should drop our implementation that might well be superior to theirs? That would be a weak argument IMO, all the more so that, in the end, this might degrade the GCC experience for them. No, I am suggesting that the current GTY approach requires more manual care than is desireable. There are several ways to fix that. One way is to improve GTY so that less manual effort is needed. However, any work we spend on GTY would not affect our customers except, perhaps, in the data size of the compiler. Another way is to use C++ containers that manage the storage for us, and to back that up with a conservative collector. If we do so, then we are operating in the same environment as our customers. We can fix performance issues in the compiler by generating better code. That better code helps our customers with their code. The intent in living as our customers do is to improve the overall experience of our customers. Having said that, let me be clear. There is no proposal as yet to change anything about garbage collection in gcc other than upgrade it as we need. (See Diego's vec patch.) We are just having a friendly philosophical discussion. :-) -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/29/12, Richard Guenther richard.guent...@gmail.com wrote: On May 25, 2012 Mike Stump mikest...@comcast.net wrote: On May 25, 2012, at 10:50 AM, Lawrence Crowl wrote: Diego and I looked long and hard at this issue. It all came down to a sequence of problems. First, libstdc++ isn't rigged for GTY, If portability to other C++ compilers wasn't a concern, we could extend out g++ to make supporting GTY better, so that we can simplify and refine the GTY stuff. I fear we need some light weight reflection, might make a nice language feature for a future C++ standard, if done well. Seconded. It points the finger of my #1 concern with the C++ conversion - our GC. We need a GC scheme that allows us to use standard library containers, and the scheme that was outlined earlier would work. Using the standard library containers is not a prerequisite to using C++, nor is it the source of primary benefit. You can build the compiler completely independently of the standard library. There are even good reasons to do so. More importantly, the standard library containers have different operations, different semantics, and different performance implications from the existing containers. Replacing any existing container uses with standard library uses is actually many tasks, which should be done incrementally after the infrastructure is in place. More importantly, some of that replacement would not be trivial, and the specialists in particular data structures might be better suited to it. Are the TR1 hash table implementations using any non-C++98/03 features? If not then I would suggest to use our TR1 hash tables. Almost by definition they are not using non-C++03 features. I do not think we should try to make that change in one step, but make it an eventual goal. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Sun, 27 May 2012, Gabriel Dos Reis wrote: people actually working on it and used to that style. We don't want to have a mixture of several different styles in the compiler. I (and I expect many others) don't want anyone working around the latter by going over the whole source base and reindent everything. Hence inventing a new coding standard for GCC-in-C++ (by reusing existing ones or doing something new) that isn't mostly the same as GCC-in-C isn't going to fly. if this coding standard is going to be adopted as a GNU coding convention, then you have to be flexible and allow yourself to see beyond the past written in C. You have to ask yourself: how do I want the codebase to look like in 10, 15, 20, 25 years. No, I don't have to ask that myself. I don't have to be flexible (just to be clear: I am flexible in many cases, just not because you tell me to be). And no, I don't have to allow myself to see beyond the past you imply. I am seeing beyond the past whenever I see fit. Thanks for fathering my feelings, but please stop doing so. And thanks for making clear what the whole GCC-in-c++ stunt is about. ( I didn't know beforehand that the C++ proponents for GCC set out to start a new GNU coding standard for C++ and wanted to make GCC the case at hand for that adventure including all kinds of niceties that c++ perhaps delivers. If that would have been their open goal from the start (which obviously it wasn't; rather it was hidden in feel-good rhetoric about how c++ would magically increase the contributor base, which of course will never happen; I mean how naive is that thinking, a new, more complex language, driving more contributors to a project; hello?) I would have opposed it even stronger from the start. Of course without any result (because a global reviewer is part of the club). You made a reference to the coding standard as to be influenced in the way it is because of the Lisp way, implying that this is the obsolete way, suggesting from wording that this is the old style that's somehow to be overcome. (No doubt you'll now claim that this is not what you said explicitely; which is literally true but doesn't matter because you're very well aware of the mechanics of your rhetoric): That's obviously complete bollocks. While the backend IR is lispy (and even that only on the outset) the coding standard itself has nothing to do with that whatsoever. Not even the RTL routines are written in a lispy style. And the tree/gimple routines obviously have never seen the honor of being read by you. Seeing lisp style in _those_ involves major inventarism. So, if you make the suggestion that I should think and program beyond my (implied) small cubicle of lispy influenced old-style GNU coding standard C code (so as to, in a way, expand my programming capabilities), then I can only say: Gabriel, thanks for your suggestion, but you obviously don't have the slightest idea what you're talking about. Why do you even have the guts to take part in this discussion (apart from being in the c++ fan-boy club, of course)? When was the last submission of yours to the code base in areas that we're talking about? ) Namely useless noise and source change activity for the sake of it. Only marginally interested in an answer, Michael.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 02:42:39PM -0700, Lawrence Crowl wrote: On 5/24/12, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 24, 2012 at 09:43:42AM -0700, Lawrence Crowl wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. You haven't looked at the most important problem of that approach - code bloat. Are you claiming that the size of the binary is more important than run-time performance, type safety, and source code size? Runtime performance goes in hand with the size of the binary, at least size of frequently used code. By converting just a couple of hash tables you can't really measure it, you'd need to convert a significant number of them, then you can see what effect it has on runtime performance. As said earlier, GCC has lots of hash tables, and many of them are used in performance critical code, by increasing the I-cache footprint of that performance criticial code there is risk of reducing performance. The common C++ programming techniques often lead to significant code bloat which really shouldn't be ignored. Jakub
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 05:43:20PM -0500, Gabriel Dos Reis wrote: On Fri, May 25, 2012 at 4:52 PM, Lawrence Crowl cr...@google.com wrote: Personally, I would rather see if we can take advantage of C++ features to reduce garbage and then use the Boehm collector. There is too much manual management with GTY, and I'd rather the compiler leverage mainstream practice rather than depart from it. I could not agree more. I could not agree less, replacing a nicely precise garbage collector for a conservative collector? Ugh. I realy want a deterministic compiler, not one where bugs won't be really reproduceable because with ASLR the conservative collector collects or not collects something at some point. Jakub
Re: [cxx-conversion] New Hash Table (issue6244048)
On 05/25/2012 01:10 PM, Paweł Sikora wrote: On Friday 25 of May 2012 11:50:13 Gabriel Dos Reis wrote: On Fri, May 25, 2012 at 11:44 AM, Paweł Sikorapl...@agmk.net wrote: so, why you just don't use the hash table implementation from libstdc++? we have agreed on C++03 as a bootstrap compiler. There is unfortunately no hash table in C++03. can't you use implementation from tr1 for c++98/03 mode? It would be interesting to see the result of using C++ standard hash tables and I hope somebody will finally try this. But I doubt the performance results will be better. C++ standrad hash tables is based on buckets usage. Libibery hashtables has no buckets. That was a major idea for such implementation more 20 years ago and including it in gcc (more 13 years ago). Hashtable without buckets permits 2-3 times more entries for the same space than hashtable with the buckets and that, I guess, compensates a slightly bigger collision rate when the buckets are not used. It also makes the table entry search code is very compat, simple, and fast.
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/28/12, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 25, 2012 at 02:42:39PM -0700, Lawrence Crowl wrote: On 5/24/12, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 24, 2012 at 09:43:42AM -0700, Lawrence Crowl wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. You haven't looked at the most important problem of that approach - code bloat. Are you claiming that the size of the binary is more important than run-time performance, type safety, and source code size? Runtime performance goes in hand with the size of the binary, at least size of frequently used code. Well, yes and no. We need to worry about total size, memory resident size, and cache resident size. The patch clearly increases total size, but I doubt that is much of a factor because most systems lazily load pages from the image. I don't think we have enough information to assess the memory resident size, and I don't think it matters because large compilations are large because the data space is much larger than the code space. The patch reduces the cache resident size because the dynamic path of instructions is shorter. By converting just a couple of hash tables you can't really measure it, you'd need to convert a significant number of them, then you can see what effect it has on runtime performance. Well, I have a performance improvement with eight of them, of which one isn't exercised in the bootstrap. As said earlier, GCC has lots of hash tables, and many of them are used in performance critical code, by increasing the I-cache footprint of that performance criticial code there is risk of reducing performance. The common C++ programming techniques often lead to significant code bloat which really shouldn't be ignored. But the patch potentially reduces the I-cache footprint. The new implementation eliminates pointer tests, it turns indirect function calls into direct function calls, it enables inlining those functions, etc. If the compiler is suboptimal in dealing with that, we should fix the compiler. (I think the compiler is behaving reasonably.) If changing a table does not deliver performance, we can choose not to convert that table. We do not need to convert all of them. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On Sat, May 26, 2012 at 10:21 PM, Michael Matz m...@suse.de wrote: Hi, On Fri, 25 May 2012, NightStrike wrote: This point of yours should be stressed. Using writing standards of one language to develop in another language is a fundamentally bad idea. C and C++ kind of look the same, but they are not: http://www.research.att.com/~bs/bs_faq.html#C-slash You wouldn't use the GNU C Coding conventions to write in tcl, and you shouldn't use them to write in C++. You should just create the GNU C++ Coding Standards new, and not base them off of the former. That is all nice and fine. For starting from scratch. But we aren't. We have an existing compiler written in a certain style. We have existing people actually working on it and used to that style. We don't want to have a mixture of several different styles in the compiler. I (and I expect many others) don't want anyone working around the latter by going over the whole source base and reindent everything. Hence inventing a new coding standard for GCC-in-C++ (by reusing existing ones or doing something new) that isn't mostly the same as GCC-in-C isn't going to fly. if this coding standard is going to be adopted as a GNU coding convention, then you have to be flexible and allow yourself to see beyond the past written in C. You have to ask yourself: how do I want the codebase to look like in 10, 15, 20, 25 years. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
Gabriel Dos Reis g...@integrable-solutions.net writes: if this coding standard is going to be adopted as a GNU coding convention, then you have to be flexible and allow yourself to see beyond the past written in C. You have to ask yourself: how do I want the codebase to look like in 10, 15, 20, 25 years. Er, sure, but that's obviously not an excuse for simply _ignoring_ the existing C standard, as nightstrike seemed to be suggesting. For the most part, the existing C standard (with the obvious generalizations where C++ is a superset) works perfectly fine for C++ too -- and yes, this is still true even if BS is annoyed by some people's use of C/C++ (I get BS's point, but it's silly to read too much into it). -miles -- We have met the enemy, and he is us. -- Pogo
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Fri, 25 May 2012, NightStrike wrote: This point of yours should be stressed. Using writing standards of one language to develop in another language is a fundamentally bad idea. C and C++ kind of look the same, but they are not: http://www.research.att.com/~bs/bs_faq.html#C-slash You wouldn't use the GNU C Coding conventions to write in tcl, and you shouldn't use them to write in C++. You should just create the GNU C++ Coding Standards new, and not base them off of the former. That is all nice and fine. For starting from scratch. But we aren't. We have an existing compiler written in a certain style. We have existing people actually working on it and used to that style. We don't want to have a mixture of several different styles in the compiler. I (and I expect many others) don't want anyone working around the latter by going over the whole source base and reindent everything. Hence inventing a new coding standard for GCC-in-C++ (by reusing existing ones or doing something new) that isn't mostly the same as GCC-in-C isn't going to fly. Ciao, Michael.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Thu, May 24, 2012 at 09:43:42AM -0700, Lawrence Crowl wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. You haven't looked at the most important problem of that approach - code bloat. hashtab.o .text is currently ~ 4KB on x86_64, in your version only very small part out of it is shared. In a cc1plus binary I count roughly 179 htab_create{,_ggc} calls, ignoring the first parameter (initial size) that is roughly 139 unique hash/eq/del fn combinations, thus you'll end up with over hundred copies of most of the hashtab code, in many of the cases performance critical and thus its I-cache friendliness is quite important. Static functions are also not acceptable as template arguments, so this patch externalizes the functions. To avoid potential name conflicts, the function names have been prefixed. Ugh. I guess the C++ way around this would be to put the functions into anonymous namespace. + /* Return the current size of this hash table. */ + + size_t size() + { +return htab-size; + } (and various other places) - formatting is wrong, missing space between (. +void +typed_htab Element, Hash, Equal, Remove, Allocator +::dispose () Do we want to format methods this way? Jakub
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 1:25 AM, Jakub Jelinek ja...@redhat.com wrote: Static functions are also not acceptable as template arguments, so this patch externalizes the functions. To avoid potential name conflicts, the function names have been prefixed. Ugh. I guess the C++ way around this would be to put the functions into anonymous namespace. Anonymous namespaces have the problem that they may interfere with bootstrap diff-compare. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
On Thu, May 24, 2012 at 4:55 PM, Diego Novillo dnovi...@google.com wrote: On 12-05-24 17:52 , Lawrence Crowl wrote: That said, I'll let y'all decide how much to put in any one piece. I favour the approach we are taking now. Each patch to cxx-conversion is a small incremental step. When we merge into trunk we'll merge the final product of each change, of course. But with the minimalistic approach, we can discuss and understand each change better. I agree. Lawrence satisfactorily answered my main concern with the patch he posted (the template argument); so I am board modulo the change on class name. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Fri, 25 May 2012, Jakub Jelinek wrote: + /* Return the current size of this hash table. */ + + size_t size() + { +return htab-size; + } (and various other places) - formatting is wrong, missing space between (. And it doesn't start at the first column, and type isn't on a separate line. I realize that this is a member method, hence indenting and C GNU coding standards conflict, but the latter do have some nice properties (for instance that 'grep ^func_name *.c' finds only the definition, not all the calls to func_name). I think we need a discussion about style, now that actually people are working on this. +void +typed_htab Element, Hash, Equal, Remove, Allocator +::dispose () Do we want to format methods this way? Don't know, but at least it starts at the first column, and grepping for ^:: will give only C++ methods, 'grep -B1 ^::' even including class. So it's no too horrible, despite the syntactic C++ noise. Ciao, Michael.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 7:17 AM, Michael Matz m...@suse.de wrote: Hi, On Fri, 25 May 2012, Jakub Jelinek wrote: + /* Return the current size of this hash table. */ + + size_t size() + { + return htab-size; + } (and various other places) - formatting is wrong, missing space between (. And it doesn't start at the first column, and type isn't on a separate line. I realize that this is a member method, hence indenting and C GNU coding standards conflict, but the latter do have some nice properties (for instance that 'grep ^func_name *.c' finds only the definition, not all the calls to func_name). I think we need a discussion about style, now that actually people are working on this. Note also the almost 2 decades of C++ style practice in our libstdc++ implementation. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Fri, 25 May 2012, Gabriel Dos Reis wrote: (and various other places) - formatting is wrong, missing space between (. And it doesn't start at the first column, and type isn't on a separate line. I realize that this is a member method, hence indenting and C GNU coding standards conflict, but the latter do have some nice properties (for instance that 'grep ^func_name *.c' finds only the definition, not all the calls to func_name). I think we need a discussion about style, now that actually people are working on this. Note also the almost 2 decades of C++ style practice in our libstdc++ implementation. That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Ciao, Michael.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 3:51 PM, Michael Matz m...@suse.de wrote: Hi, On Fri, 25 May 2012, Gabriel Dos Reis wrote: (and various other places) - formatting is wrong, missing space between (. And it doesn't start at the first column, and type isn't on a separate line. I realize that this is a member method, hence indenting and C GNU coding standards conflict, but the latter do have some nice properties (for instance that 'grep ^func_name *.c' finds only the definition, not all the calls to func_name). I think we need a discussion about style, now that actually people are working on this. Note also the almost 2 decades of C++ style practice in our libstdc++ implementation. That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Neither is the GNU C style, but we use it anyway. IMHO it'd be very strange to use one style in gcc itself, and another in libstdc++. Ciao! Steven
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 03:51:25PM +0200, Michael Matz wrote: On Fri, 25 May 2012, Gabriel Dos Reis wrote: That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Seconded. Jakub
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Fri, 25 May 2012, Steven Bosscher wrote: Note also the almost 2 decades of C++ style practice in our libstdc++ implementation. That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Neither is the GNU C style, but we use it anyway. IMHO it is. IMHO it'd be very strange to use one style in gcc itself, and another in libstdc++. We already have a style for GCC, which is different from libstdc++. And mixing two styles _within_ one project source base (for the moment thinking about GCC and libstdc++ as two different projects) would be even worse. Ciao, Michael.
Re: [cxx-conversion] New Hash Table (issue6244048)
On 12-05-25 09:56 , Jakub Jelinek wrote: On Fri, May 25, 2012 at 03:51:25PM +0200, Michael Matz wrote: On Fri, 25 May 2012, Gabriel Dos Reis wrote: That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Seconded. I don't care how ugly coding conventions look. I've dealt with many and they all have their ugly spots. I mostly care about consistency. I don't think we should deviate much from the established GNU standards (which are hideous, btw). Mostly because we are all used to them and it is going to be simpler to write new code that is similar to what is already there. Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). Lawrence, have you had a chance to update them with your latest edits? Diego.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 8:57 AM, Michael Matz m...@suse.de wrote: Hi, On Fri, 25 May 2012, Steven Bosscher wrote: Note also the almost 2 decades of C++ style practice in our libstdc++ implementation. That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Neither is the GNU C style, but we use it anyway. IMHO it is. IMHO it'd be very strange to use one style in gcc itself, and another in libstdc++. We already have a style for GCC, which is different from libstdc++. And mixing two styles _within_ one project source base (for the moment thinking about GCC and libstdc++ as two different projects) would be even worse. The existing style for GCC made sense when gcc was originally designed with a deliberate flavour of writing Lisp in C. However, after 25 years, the codebase has a distinctively different flavour -- one that isn't Lispy anymore. While adopting a new implementation language, it makes sense to revisit and revise the style. It also makse sense to look at existing C++ style within the GCC project. That isn't imposing by any stretch of imagination or hyperbole. It would be dismaying if we just kept the old lispy style just because grand'pa did it that way in C. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 9:15 AM, Diego Novillo dnovi...@google.com wrote: I don't care how ugly coding conventions look. I've dealt with many and they all have their ugly spots. I mostly care about consistency. Agreed that consistency is very important. (At at single time in a week, I deal with 3 different styles with 4 projects.) There are some nice properties with the existing GNU style; but they don't scale well with advanced scoping constructs (e.g. namespaces, classes, etc.) and we need to address those. Conversely the libstdc++ style good bits that we should look at. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 10:15:54AM -0400, Diego Novillo wrote: Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). That page is quite inconsistent. E.g. it first talks about retaining space before ( (which I really hope will be retained, it makes code more readable), but in the example it doesn't bother with it. And the long line example is horribly ugly, ending line with ( ? That ( should surely go on the next line after the few spaces. It should list also examples of when the first parameter isn't too long, but there are too many parameters and so some wrapping is needed, etc. Jakub
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, 25 May 2012, Diego Novillo wrote: I don't care how ugly coding conventions look. I've dealt with many and they all have their ugly spots. I mostly care about consistency. I don't think we should deviate much from the established GNU standards (which are hideous, btw). Mostly because we are all used to them and it is going to be simpler to write new code that is similar to what is already there. In addition, many people work on multiple components of the GNU toolchain and other pieces of the GNU system, and it's desirable to keep consistency with binutils, GDB, glibc etc. (The GNU Coding Standards do change over time ... for example, they now allow ranges of years in copyright notices, which I think it would be good for GCC and binutils to adopt as GDB and glibc have done, and, following GCC's lead, they no longer recommand `' quoting in diagnostics.) -- Joseph S. Myers jos...@codesourcery.com
Re: [cxx-conversion] New Hash Table (issue6244048)
On 05/25/2012 04:15 PM, Diego Novillo wrote: On 12-05-25 09:56 , Jakub Jelinek wrote: On Fri, May 25, 2012 at 03:51:25PM +0200, Michael Matz wrote: On Fri, 25 May 2012, Gabriel Dos Reis wrote: That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Seconded. I don't care how ugly coding conventions look. I've dealt with many and they all have their ugly spots. I mostly care about consistency. I don't think we should deviate much from the established GNU standards (which are hideous, btw). Mostly because we are all used to them and it is going to be simpler to write new code that is similar to what is already there. Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). Can emacs handle the indentation rules automatically? Bernd
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Fri, May 25, 2012 at 7:17 AM, Michael Matz m...@suse.de wrote: Hi, On Fri, 25 May 2012, Jakub Jelinek wrote: + /* Return the current size of this hash table. */ + + size_t size() + { +return htab-size; + } (and various other places) - formatting is wrong, missing space between (. And it doesn't start at the first column, and type isn't on a separate line. I don't think I'm going to follow this discussion in much detail and, more importantly, I don't know how representative may own contributions are as 'libstdc++ style', but I *always* have the return type on a separate line in my own patches. It's true however that we don't normally put a space between name and open round bracket, but hey, I don't think we (the libstdc++ people) are so in love with that: myself would approve right away a big mechanical patch adding spaces everywhere, to be clear. Paolo
Re: [cxx-conversion] New Hash Table (issue6244048)
Hi, On Fri, 25 May 2012, Diego Novillo wrote: That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Seconded. I don't care how ugly coding conventions look. Perhaps people who don't care about uglyness shouldn't then be defining the style? Half-way :-/ Ciao, Michael.
Re: [cxx-conversion] New Hash Table (issue6244048)
On 12-05-25 11:06 , Michael Matz wrote: Hi, On Fri, 25 May 2012, Diego Novillo wrote: That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Seconded. I don't care how ugly coding conventions look. Perhaps people who don't care about uglyness shouldn't then be defining the style? Half-way :-/ Who says I am? I *explicitly* said that Lawrence, Ian and Gaby are working on it. I do *not* care about C++ style, I care about all the other benefits it provides us. I will use whatever style y'all agree to. Diego.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 11:07 AM, Diego Novillo dnovi...@google.com wrote: On 12-05-25 11:06 , Michael Matz wrote: Hi, On Fri, 25 May 2012, Diego Novillo wrote: That's one of my fears, namely that those used to the libstdc++ style impose that on the compiler source base. Because IMHO the libstdc++ style isn't very appealing. Seconded. I don't care how ugly coding conventions look. Perhaps people who don't care about uglyness shouldn't then be defining the style? Half-way :-/ Who says I am? I *explicitly* said that Lawrence, Ian and Gaby are working on it. Didn't Ian already create a style for writing gold in c++?
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 9:33 AM, Paolo Carlini pcarl...@gmail.com wrote: Hi, On Fri, May 25, 2012 at 7:17 AM, Michael Matz m...@suse.de wrote: Hi, On Fri, 25 May 2012, Jakub Jelinek wrote: + /* Return the current size of this hash table. */ + + size_t size() + { + return htab-size; + } (and various other places) - formatting is wrong, missing space between (. And it doesn't start at the first column, and type isn't on a separate line. I don't think I'm going to follow this discussion in much detail and, more importantly, I don't know how representative may own contributions are as 'libstdc++ style', but I *always* have the return type on a separate line in my own patches. It's true however that we don't normally put a space between name and open round bracket, but hey, I don't think we (the libstdc++ people) are so in love with that: myself would approve right away a big mechanical patch adding spaces everywhere, to be clear. no, you don't want to do that :-) -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 11:44 AM, Paweł Sikora pl...@agmk.net wrote: so, why you just don't use the hash table implementation from libstdc++? we have agreed on C++03 as a bootstrap compiler. There is unfortunately no hash table in C++03. The page about not defining a new template was just when we were gathering all kinds of ideas. It is clearly too restrictive to be appropriate. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 12:50 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Fri, May 25, 2012 at 11:44 AM, Paweł Sikora pl...@agmk.net wrote: so, why you just don't use the hash table implementation from libstdc++? we have agreed on C++03 as a bootstrap compiler. There is unfortunately no hash table in C++03. That's a shame. C++11 makes life wonderful.
Re: [cxx-conversion] New Hash Table (issue6244048)
On Friday 25 of May 2012 11:50:13 Gabriel Dos Reis wrote: On Fri, May 25, 2012 at 11:44 AM, Paweł Sikora pl...@agmk.net wrote: so, why you just don't use the hash table implementation from libstdc++? we have agreed on C++03 as a bootstrap compiler. There is unfortunately no hash table in C++03. can't you use implementation from tr1 for c++98/03 mode?
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/25/12, Diego Novillo dnovi...@google.com wrote: Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). Lawrence, have you had a chance to update them with your latest edits? Not yet; they're on the desk in front of me. I'll do that ASAP. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/25/12, Paweł Sikora pl...@agmk.net wrote: On Friday 25 of May 2012 10:15:54 Diego Novillo wrote: Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). on this page we can read: The following features of the C++ language should in general be avoided in GCC code. (...) Defining new templates (use of existing templates, e.g., from the standard library, is fine). That rule came from Ian's approach with gold. It works well for a new project. It does not work well in gcc. so, why you just don't use the hash table implementation from libstdc++? Diego and I looked long and hard at this issue. It all came down to a sequence of problems. First, libstdc++ isn't rigged for GTY, so we would have to instrument the library. Second, the GTY parser doesn't handle C++, so we would have to redo the GTY parser. Third, PCH requires gc, which requires GTY, so we cannot easily get rid of gc. Fourth, any use of libstdc++ will make different tradeoffs in size and performance, so any switch would require both a change in form and a change in substance. We would simultaneously have to show benefit on two axes, which is a predictor of trouble. So, instead we chose to take a more incremental approach, exploiting C++ with the existing algorithms and requiring mostly mechanical changes in client code. Both the hash table and vector patches take that approach. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On May 25, 2012, at 10:50 AM, Lawrence Crowl wrote: Diego and I looked long and hard at this issue. It all came down to a sequence of problems. First, libstdc++ isn't rigged for GTY, If portability to other C++ compilers wasn't a concern, we could extend out g++ to make supporting GTY better, so that we can simplify and refine the GTY stuff. I fear we need some light weight reflection, might make a nice language feature for a future C++ standard, if done well.
Re: [cxx-conversion] New Hash Table (issue6244048)
On May 25, 2012, at 7:29 AM, Bernd Schmidt wrote: Can emacs handle the indentation rules automatically? I hope that any style picked, will have a no local mods required for emacs as the corner stone. One thing I hate about .md files, is they don't (last I checked) fulfill this requirement.
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/25/12, Michael Matz m...@suse.de wrote: On Fri, 25 May 2012, Jakub Jelinek wrote: + /* Return the current size of this hash table. */ + + size_t size() + { +return htab-size; + } (and various other places) - formatting is wrong, missing space between (. Yes, an omission that's a consequence of dealing with several coding conventions. And it doesn't start at the first column, and type isn't on a separate line. I realize that this is a member method, hence indenting and C GNU coding standards conflict, but the latter do have some nice properties (for instance that 'grep ^func_name *.c' finds only the definition, not all the calls to func_name). We can address this issue by requiring all function definitions to appear outside the class definition. The code is more verbose, but livable. I think we need a discussion about style, now that actually people are working on this. Yes. +void +typed_htab Element, Hash, Equal, Remove, Allocator +::dispose () Do we want to format methods this way? Don't know, but at least it starts at the first column, and grepping for ^:: will give only C++ methods, 'grep -B1 ^::' even including class. So it's no too horrible, despite the syntactic C++ noise. For template classes, the specification of an out-of-line member function is going to be long, so a single-line specification is likely to overflow often. Whatever we chose, it should be reasonably extended to multiple lines. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/25/12, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 25, 2012 at 10:15:54AM -0400, Diego Novillo wrote: Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). That page is quite inconsistent. E.g. it first talks about retaining space before ( (which I really hope will be retained, it makes code more readable), but in the example it doesn't bother with it. And the long line example is horribly ugly, ending line with ( ? That ( should surely go on the next line after the few spaces. It should list also examples of when the first parameter isn't too long, but there are too many parameters and so some wrapping is needed, etc. We will clean the page. There are technical guidelines that we can, and I think should, apply to the coding convention. In particular, I believe any coding convention should be consistent with the grammar of the language. Both the GNU style and the C++ Standard style break that guideline in different respects. For example, consider the following GNU code. int *p = *foo (3); Here, the convention implies that foo binds more tightly to the * than it does to the (. This implication is false. Now consider the C++ Standard equivalent. int* p = *foo(3); It fixes the binding of foo, but implies that * is syntactically closer to int than it is to p. That implication is false. Now consider a form that is consistent with the actual grammar of the language: int *p = *foo(3); Here there is no implication that foo binds more closely to either the * or the (, so the form is consistent with the grammar. We can go further and have the spacing reinforce the grammar, but lines will get longer. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/24/12, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 24, 2012 at 09:43:42AM -0700, Lawrence Crowl wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. You haven't looked at the most important problem of that approach - code bloat. Are you claiming that the size of the binary is more important than run-time performance, type safety, and source code size? hashtab.o .text is currently ~ 4KB on x86_64, in your version only very small part out of it is shared. In a cc1plus binary I count roughly 179 htab_create{,_ggc} calls, ignoring the first parameter (initial size) that is roughly 139 unique hash/eq/del fn combinations, thus you'll end up with over hundred copies of most of the hashtab code, First, the new hash table code is smaller because it avoid supporting special cases that we are not using in practice. What was formerly conditional is now unconditional, leading to tighter binaries for a given function. As you say, there will be more functions in the source, leading to an increased executable size. Setting aside run-time performance, why is that size important? in many of the cases performance critical and thus its I-cache friendliness is quite important. It is precisely the performance critical code that needs the template approach. The approach avoids two indirect calls, which go across translation units. The optimizer has no ability to inline the often trivial actions within those indirect calls. This problem is most severe in the traverse functions, because the optimize cannot see enough to do anything useful with the loop. The inliner also has no ability to elide comparisons for null function pointers. In short, with the libiberty code, the operations must span a large number of completely useless operations. Those operations disappear with the template approach. The result is that the working set in the cache is much smaller. That said, if you think code size is important for tables that are not performance critical, I can provide a version of the code that simply reuses libiberty and gets the type safety without the performance. I don't have a good intuition for which of those tables are performance-critical, so please give me an idea of which they are. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/25/12, Mike Stump mikest...@comcast.net wrote: On May 25, 2012, at 10:50 AM, Lawrence Crowl wrote: Diego and I looked long and hard at this issue. It all came down to a sequence of problems. First, libstdc++ isn't rigged for GTY, If portability to other C++ compilers wasn't a concern, we could extend out g++ to make supporting GTY better, so that we can simplify and refine the GTY stuff. Personally, I would rather see if we can take advantage of C++ features to reduce garbage and then use the Boehm collector. There is too much manual management with GTY, and I'd rather the compiler leverage mainstream practice rather than depart from it. I fear we need some light weight reflection, might make a nice language feature for a future C++ standard, if done well. There has been some talk of such a language feature. As yet, the only concrete proposal does not have wide support. I suspect it will take several more proposals before we hit upon one that will get wide support. If you have any good ideas, now would be a really excellent time to speak up! -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 4:16 PM, Lawrence Crowl cr...@google.com wrote: On 5/25/12, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 25, 2012 at 10:15:54AM -0400, Diego Novillo wrote: Lawrence, Ian and Gaby have been working on the proposed coding guidelines for C++ (http://gcc.gnu.org/wiki/CppConventions). That page is quite inconsistent. E.g. it first talks about retaining space before ( (which I really hope will be retained, it makes code more readable), but in the example it doesn't bother with it. And the long line example is horribly ugly, ending line with ( ? That ( should surely go on the next line after the few spaces. It should list also examples of when the first parameter isn't too long, but there are too many parameters and so some wrapping is needed, etc. We will clean the page. There are technical guidelines that we can, and I think should, apply to the coding convention. In particular, I believe any coding convention should be consistent with the grammar of the language. The problem with this is that it assumes that the grammar is a logical expression of the ideas and ideas of the language. When that is the case, that is fine. For C++ in particular, that is not the case. We have gone to extra length to fit ideas and ideals into something that has grown barnacles and then patch over those deficiencies by telling/teaching people to write in certain ways. The examples you gave below are great and in some sense illustrate what the C++ would have liked to happen (the ideas and ideals) as opposed to the gritty and unsavory details of the horror that is the C++ grammar. While we should pay attention, I think it would be a mistake to set coding conventions that mirror the horrors of the C++ grammar. Both the GNU style and the C++ Standard style break that guideline in different respects. For example, consider the following GNU code. int *p = *foo (3); Here, the convention implies that foo binds more tightly to the * than it does to the (. This implication is false. Now consider the C++ Standard equivalent. int* p = *foo(3); It fixes the binding of foo, but implies that * is syntactically closer to int than it is to p. That implication is false. Now consider a form that is consistent with the actual grammar of the language: int *p = *foo(3); Here there is no implication that foo binds more closely to either the * or the (, so the form is consistent with the grammar. C++ standard thinks of * binding to int because it puts emphasis on types -- and one of selling points of moving to C++ is that by exploiting the static typing we would eliminate bugs and runtime checks that are there only there because we could not get enough static assurances and be sufficiently expressive and elegant with the current implementation language. Consequently, I think we should retain the binding suggested by the C++ standard. I realize that the existing GNU C convention says the opposite -- but then it is written for C. We can go further and have the spacing reinforce the grammar, but lines will get longer. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 4:52 PM, Lawrence Crowl cr...@google.com wrote: Personally, I would rather see if we can take advantage of C++ features to reduce garbage and then use the Boehm collector. There is too much manual management with GTY, and I'd rather the compiler leverage mainstream practice rather than depart from it. I could not agree more. -- Gaby
Re: [cxx-conversion] New Hash Table (issue6244048)
On Fri, May 25, 2012 at 12:42 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: with the current implementation language. Consequently, I think we should retain the binding suggested by the C++ standard. I realize that the existing GNU C convention says the opposite -- but then it is written for C. This point of yours should be stressed. Using writing standards of one language to develop in another language is a fundamentally bad idea. C and C++ kind of look the same, but they are not: http://www.research.att.com/~bs/bs_faq.html#C-slash You wouldn't use the GNU C Coding conventions to write in tcl, and you shouldn't use them to write in C++. You should just create the GNU C++ Coding Standards new, and not base them off of the former. What I see happening in this and in other recent threads is people starting to apply C ideas to C++ code formatting. Aren't there other people out there that already did a pretty good job of painting that shed green? (I'm asking in earnest, I really don't know.) I know in the defense world that the JSF project did a very in depth and very strict standard (http://www.jsf.mil/downloads/down_documentation.htm), but that's as much as I know.
Re: [cxx-conversion] New Hash Table (issue6244048)
NightStrike nightstr...@gmail.com writes: You wouldn't use the GNU C Coding conventions to write in tcl, and you shouldn't use them to write in C++. You should just create the GNU C++ Coding Standards new, and not base them off of the former. ... r, just use some common sense, and change things which need changing. C++ and C are not the same, but they certainly have enough overlap that using the existing C conventions as a starting point makes a huge amount of sense... To just start new would either (1) end up adopting many of the same rules anyway, or (2) result in a huge amount of absolutely pointless inconsistency (where otherwise identical code ends up looking different). E.g., as a start, there's absolutely no reason to change most of the indentation or spacing rules, because the C rules work totally fine for C++ (with obvious extensions to handle new cases). You can argue about the details (should namespaces be indented or not?), but most of it is pretty obvious. -miles -- Egotist, n. A person of low taste, more interested in himself than in me.
Re: [cxx-conversion] New Hash Table (issue6244048)
On 5/24/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On May 24, 2012 Lawrence Crowl cr...@google.com wrote: Add a type-safe hash table, typed_htab. Uses of this table replace uses of libiberty's htab_t. The benefits include less boiler-plate code, full type safety, and improved performance. Lawrence, is there any chance you could just call it hash_table? After the conversion, we will be living most of the time in a typed world, so the typed_ prefix will be redundant if not confusing :-) I agree that the name is not great, and have no love of it. I'd like to get all the name suggestions on the table before I to a renaming pass. [...] The type-safe hash table is a template, taking the element type and the operational functions as template parameters. Passing the operational functions as template parameters, rather than function pointers, exposes the calls to inlining at -O2. A side effect is that declarations of hash tables move to after the declarations of those functions. A further side effect is that the control block shrinks from 108 bytes to 44 bytes. There is otherwise no effect on data size. Nice. Do you anticipate that at some point we could just pass function object classes as template arguments as opposed to functions? That of course would retain the nice property of inlining, but it would remove the need to make the function have an external linkage. The downside is that change would touch more places than your current patch does. Yes, we could do that. I'm a little reluctant to make too many steps in any one patch. As an example, we could move the control block into the variable for most uses of the table, but not all. Doing so would complicate this patch, and so I thought it best to make that change a separate issue. That said, I'll let y'all decide how much to put in any one piece. -- Lawrence Crowl
Re: [cxx-conversion] New Hash Table (issue6244048)
On 12-05-24 17:52 , Lawrence Crowl wrote: That said, I'll let y'all decide how much to put in any one piece. I favour the approach we are taking now. Each patch to cxx-conversion is a small incremental step. When we merge into trunk we'll merge the final product of each change, of course. But with the minimalistic approach, we can discuss and understand each change better. Diego.