On Sat, 11 Nov 2006, Paul Eggert wrote: > "Joel E. Denny" <[EMAIL PROTECTED]> writes: > > > The following patch (committed) creates a code_props structure and > > interface that (for now) encapsulates the first three fields above and > > functionality related to them. > > I'm afraid I'm going to have to object to this one. The basic idea is > OK, but there are too many changes here, and I can't say I agree with > them all. The changes slow Bison down in ways that I can measure
Thanks for reviewing the patch. Using GNU time 1.7, I just tried timing Bison on an example input grammar. On average, when Bison is compiled without my patch, I get 0.123s real time. With my patch, I get 0.124s. In both cases, I chose Bison's own parse-gram.y without my patch as the example input grammar. I understand that the performance of Bison-generated parsers should be optimal. I was not aware that the performance of Bison itself was this vital. If your results differ from mine, please post the test grammar so I can see what you're seeing. > How about if we back out this patch and then talk about the changes > one by one? Done. > The patch changes Bison to pass too many copies of structures around, > when we should be passing references to structures. I suspect this is > what hurts Bison's performance. I usually prefer that approach as well. Never mind performance, it's more consistent. I don't know why I didn't follow that here. > Code like this is unnecessarily verbose: > > rules[ruleno].action_location = > code_props_location_get (p->action_props); > > It's easier to read this: > > rules[ruleno].action_location = p->action_props->location; > > If we turn every '->' into a foo_bar_baz_boof_get, our code will get > harder to read, with little real benefit. I agree that your alternative is easier to read when a single example like the above is considered in isolation. C++ has an advantage here. However, I've worked on other C-based projects for which I was forbidden to use C++. In some of those projects, every function other than main is conceptually a method of some struct. Every method name contains the full namespace qualification as you see in all my method names for code_props. No code external to a particular struct's methods ever directly access the members of that struct. In such projects, I've found that the slight notational inconvenience is minor in comparison to the advantages of consistent data hiding and interface documentation. > I know why you encapsulated > the members inside getters; In the specific example of code_props_location_get, I think the aver inside it is a helpful sanity check during development. In my experience, it's painful to find and rewrite all appearances of A->location to code_props_location_get (A) when you suddenly realize you want some extra code like that at every access. > but to my mind the cost exceeds the > benefit, simply in terms of reading the code. If we tried abbreviated names like cprops_loc_get, would that address your concern at all? > The Doxygenated comments are hard to read and often don't offer much. > For example: > > /** > * \brief > * The value returned by \c code_props_is_value_used for this > * \c code_props. > */ > bool is_value_used; > > This comment (a) doesn't tell me anything useful, It tells developers working on the code_props implementation that this field will always have the same value as its getter, so they should read the documentation for the getter to learn more about it. That fact isn't true for all fields. > and (b) is hard to > read because of those annoying "\c"s. How about if we get rid of all > the \c occurrences, for starters? Just as @code tells Texinfo to style this as literal code, \c tells Doxygen that. In cases where literal code looks too similar to English, I find it also helps me to read the documentation. For me, the ALL_CAPS style seen in pre-Doxygen areas of Bison is harder to read and offers no way to distinguish between some_name and SOME_NAME, which might be two different symbols.
