Re: encapsulating code properties

Joel E. Denny Sat, 11 Nov 2006 23:44:45 -0800

On Sat, 11 Nov 2006, Paul Eggert wrote:

> "Joel E. Denny" <[EMAIL PROTECTED]> writes:
> 
> > The following patch (committed) creates a code_props structure and 
> > interface that (for now) encapsulates the first three fields above and 
> > functionality related to them.
> 
> I'm afraid I'm going to have to object to this one.  The basic idea is
> OK, but there are too many changes here, and I can't say I agree with
> them all.  The changes slow Bison down in ways that I can measure


Thanks for reviewing the patch.

Using GNU time 1.7, I just tried timing Bison on an example input grammar. 
On average, when Bison is compiled without my patch, I get 0.123s real 
time.  With my patch, I get 0.124s.  In both cases, I chose Bison's own 
parse-gram.y without my patch as the example input grammar.

I understand that the performance of Bison-generated parsers should be 
optimal.  I was not aware that the performance of Bison itself was this 
vital.  If your results differ from mine, please post the test grammar so 
I can see what you're seeing.

> How about if we back out this patch and then talk about the changes
> one by one?

Done.

> The patch changes Bison to pass too many copies of structures around,
> when we should be passing references to structures.  I suspect this is
> what hurts Bison's performance.

I usually prefer that approach as well.  Never mind performance, it's more 
consistent.  I don't know why I didn't follow that here.

> Code like this is unnecessarily verbose:
> 
>         rules[ruleno].action_location =
>           code_props_location_get (p->action_props);
> 
> It's easier to read this:
> 
>         rules[ruleno].action_location = p->action_props->location;
> 
> If we turn every '->' into a foo_bar_baz_boof_get, our code will get
> harder to read, with little real benefit.

I agree that your alternative is easier to read when a single example like 
the above is considered in isolation.  C++ has an advantage here.  
However, I've worked on other C-based projects for which I was forbidden 
to use C++.  In some of those projects, every function other than main is 
conceptually a method of some struct.  Every method name contains the full 
namespace qualification as you see in all my method names for code_props.  
No code external to a particular struct's methods ever directly access the 
members of that struct.  In such projects, I've found that the slight 
notational inconvenience is minor in comparison to the advantages of 
consistent data hiding and interface documentation.

> I know why you encapsulated
> the members inside getters;

In the specific example of code_props_location_get, I think the aver 
inside it is a helpful sanity check during development.  In my experience, 
it's painful to find and rewrite all appearances of A->location to 
code_props_location_get (A) when you suddenly realize you want some extra 
code like that at every access.

> but to my mind the cost exceeds the
> benefit, simply in terms of reading the code.

If we tried abbreviated names like cprops_loc_get, would that address your 
concern at all?

> The Doxygenated comments are hard to read and often don't offer much.
> For example:
> 
>   /**
>    * \brief
>    *   The value returned by \c code_props_is_value_used for this
>    *   \c code_props.
>    */
>   bool is_value_used;
> 
> This comment (a) doesn't tell me anything useful,

It tells developers working on the code_props implementation that this 
field will always have the same value as its getter, so they should read 
the documentation for the getter to learn more about it.  That fact isn't 
true for all fields.

> and (b) is hard to
> read because of those annoying "\c"s.  How about if we get rid of all
> the \c occurrences, for starters?

Just as @code tells Texinfo to style this as literal code, \c tells 
Doxygen that.  In cases where literal code looks too similar to English, I 
find it also helps me to read the documentation.  For me, the ALL_CAPS 
style seen in pre-Doxygen areas of Bison is harder to read and offers no 
way to distinguish between some_name and SOME_NAME, which might be two 
different symbols.

Re: encapsulating code properties

Reply via email to