Re: Merge Request - Temp_ComplexScripts into Trunk

Vincent Hennebert Mon, 24 Oct 2011 03:54:48 -0700

On 22/10/11 01:22, Glenn Adams wrote:
> inline
> 
> On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch <[email protected]
>> wrote:
> 
>>
>> Since Thunderhead also needs this feature we are willing to invest some
>> time into it too. Currently my team are telling me it would take 9 person
>> months to port this code into our branch of FOP, partly because of some
>> merge conflicts, but also partly because we are not comfortable with some
>> aspects of the code as it has already been pointed out. The estimate would
>> include the time to refactor long files into small ones, deal with the
>> variable names and restructuring the code.
>>
> 
> I would advise against this, since it would it is functionally unnecessary
> and since it will make future merges more difficult. I will be making
> additional changes as more features in this area are added.
> 
> I don't see what refactoring long files into small ones buys you. However,
> if you make a reasoned argument for factoring specific long files (i.e., why
> such factoring improves architecture, modularity, etc), rather than simply
> say "all long files must be refactored", then I will seriously discuss doing
> so post merge.


When I read this I’m really puzzled because that should really be the
other way around: what is the reason to keep those classes so big (and
that must be a really good one)? Most likely the Single Responsibility
Principle is being violated in those classes.

BidiUtil is a good example of this: it computes Bidi levels on the FO
tree, /and/ also re-orders areas on the area tree. Those two things
should most probably be done in two separate classes. Especially since
from the quick look I had they seem to be using two distinct sets or
private methods.


>> I appreciate your commitment to add comments to short identifiers
>> declarations, so at least it will be easier for the team to translate the
>> short variables to longer equivalents. Just so we are clear on what you
>> propose, do you mean this:
>>
>> int gi = 0; // Glyph Index
>>
> 
> Yes. Note that I already do this in most cases, such as:
> 
> private static void resolveExplicit ( int[] wca, int defaultLevel, int[] ea
> ) {
>     int[] es = new int [ MAX_LEVELS ];          /* embeddings stack */
>     int ei = 0;                                 /* embeddings stack index */
>     int ec = defaultLevel;                      /* current embedding level
> */
>     for ( int i = 0, n = wca.length; i < n; i++ ) {
>         int bc = wca [ i ];                     /* bidi class of current
> char */
>         int el;                                 /* embedding level to assign
> to current char */
>         switch ( bc ) {
>         case LRE:                               // start left-to-right
> embedding
>         case RLE:                               // start right-to-left
> embedding
>              case LRO:                               // start left-to-right
> override
>         case RLO:                               // start right-to-left
> override
>             {
>                 int en;                         /* new embedding level */
>                 if ( ( bc == RLE ) || ( bc == RLO ) ) {
>                     en = ( ( ec & ~OVERRIDE ) + 1 ) | 1;
>                 } else {
>                     en = ( ( ec & ~OVERRIDE ) + 2 ) & ~1;
>                 }
>                 if ( en < ( MAX_LEVELS + 1 ) ) {
>                     es [ ei++ ] = ec;
>                     if ( ( bc == LRO ) || ( bc == RLO ) ) {
>                         ec = en | OVERRIDE;
>                     } else {
>                         ec = en & ~OVERRIDE;
>                     }
>                 } else {
>                     // max levels exceeded, so don't change level or
> override
>                 }
>                 el = ec;
>                 break;
>             }
> ...
> 
> What I'm agreeing to do in the relative near future (after merge, before new
> release) is to add such comments to those places where I have not already
> done so, which are probably a minority of such cases.

This is good to hear, although it only marginally helps. Again, what’s
the rationale behind having 2 or 3 letter variables when every course
about programming will emphasise the importance of having reasonably
long, self-describing variable names? What amount of typing is there to
save when any modern IDE will auto-complete variable names?

Explaining the purpose of a variable in a comment creates two problems:
first, it forces the developer to constantly scroll from where the
variable is being used to where it is declared, in order to remember
what its purpose is. By doing so, they lose the context in which the
variable is used and have to read the code again. This makes it
extremely painful and difficult to understand the code.

But more importantly, there is no guarantee that the comment is
accurate. It’s notorious that comments tend to be left behind when
changes are made to the code. Which means that they quickly become
misleading or even plain wrong. Therefore people tend to not trust
comments, or even not read them at all.

Choosing a self-explaining variable name solves both problems, while
also forcing the programmer to re-think about the correctness of the
code.


Vincent

Re: Merge Request - Temp_ComplexScripts into Trunk

Reply via email to