Re: Merge Request - Temp_ComplexScripts into Trunk

Glenn Adams Wed, 26 Oct 2011 04:24:50 -0700

inline

On Wed, Oct 26, 2011 at 6:49 PM, Vincent Hennebert <[email protected]>wrote:

> On 21/10/11 08:29, Glenn Adams wrote:
> > On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock <[email protected]
> >wrote:
> >
> >>
> >> From the surface I would have been very much in favor of supporting a
> >> merge in the near future, however I have had the chance to review some
> >> areas of the complex script branch and I have some concerns.
> >> The treatment of Unicode Bidi spans from the creation of the FO Tree
> >> through to the construction of the Area Tree and I would have liked to
> >> have seen the complex scripts solution integrate the Unicode Bidi
> >> Algorithm more directly into the core process:  For example, the
> >> implementation performs a post process on the FO Tree to resolve the
> >> Bidi properties of FONodes relating to text. It would be preferable to
> >> see the construction of the FO Tree embracing this Bidi aspect:
> >> FONodes should be responsible for determining their own bidi state
> >> from the fo node semantics in context to the position in the tree.
> >> Such an implementation would immediately force the maintainer to
> >> consider how a change would effect the Bidi process.
> >>
> >
> > Please review XSL-FO 1.1 Section 5.8, and, in particular:
> >
> > "the algorithm is applied to a sequence of characters coming from the
> > content of one or more formatting objects. The sequence of characters is
> > created by processing a fragment of the formatting object tree. A
> *fragment* is
> > any contiguous sequence of children of some formatting object in the
> tree.
> > The sequence is created by doing a pre-order traversal of the fragment
> down
> > to the fo:character level."
> >
> > "the final, text reordering step is not done during refinement. Instead,
> the
> > XSL equivalent of re-ordering is done during area tree generation"
> >
> > The current implementation adheres to the XSL-FO specification in this
> > regard, while your suggestion that this behavior be isolated to
> individual
> > FONodes is contrary to the specification and does not permit correct
> > implementation of the functionality required.
>
> Section 5 of the XSL-FO 1.1 Recommendation starts with the following
> note:
> “Although the refinement process is described in a series of steps, this
> is solely for the convenience of exposition and does not imply they must
> be implemented as separate steps in any conforming implementation.
> A conforming implementation must only achieve the same effect.”
>
> So we are free to implement the algorithm any way we see adequate.
>

And I have done just that: implement the [Bidi] algorithm in a way I found
adequate.

> But even so, a fragment is “any contiguous sequence of children of some
> formatting object in the tree“. That formatting object doesn’t have to
> be a whole page-sequence and can as well be a single block or inline or
> anything else.
>
> Implementing Bidi in individual FONode sub-classes allows to keep the
> treatment encapsulated in each FO element, and adapt it to the specific
> semantics of that element.
>

I have not ruled this option out. I have stated previously that this may be
possible; however, there are possible problems with confining this behavior
to FONode classes such as the fact that what constitutes a "fragment" and a
"delimited text range" (see XSL-FO 1.1 Section 5.8) is not confined to FO
node boundaries.

> If this is done in a single BidiUtil class, all the behaviours that are
> specific to each element are mixed together. Implementation details that
> should be kept within individual classes are being exposed to the rest
> of them. Elements must be handled in lenghty sequences of if/else
> statement using ‘instanceof’ and casts to the concrete class.
>

In adding support to Bidi to FOP, I was faced with the problem of (1) not
knowing the implementation, and (2) desiring to minimize the point of
contact of my changes with existing code in order to better isolate
behavioral regressions.

I made the determination that it would be most convenient and expedient to
code the bidi level and order resolution functionality in a single set of
utility functions (with other helper classes, such as UnicodeBidiAlgorithm).
That approach has worked so far. However, that doesn't mean it has to remain
that way.

It is not a straightforward task to add Bidi support to a large
implementation like FOP which failed to take the needs of
internationalization into account in its design. As a consequence, adding
this support must be done delicately, and in stages.

> If a new FO element is being implemented this will be very likely that
> it will be forgotten to add the appropriate ‘if’ statement for that
> element in the BidiUtil class. If it’s not forgotten, it will be
> difficult to find out where to put that statement.
>

Although I have not added tests for right-to-left writing mode for all
existing layoutengine standard tests, I am in the process of doing just
that, and have already put a number of tests in place. Eventually, there
will be RTL WM tests for all these standard tests.

When a new FO element is implemented, it should be obvious to the
implementer that they need to test for RTL WM functionality. If necessary,
we can put into place a set of conditions that require us to evaluate
support for both RTL and vertical WMs prior to accepting new FO elements.

> Doing treatment specific to an object outside its implementation screams
> for trouble and regressions as soon as a change is made in one or the
> other place. Unless people are aware that they must keep an eye on that
> BidiUtil class, which is unlikely for newcomer who don’t know the code
> well.
>

By definition, a newcomer won't know the code well. I didn't know the code
well. I know it better now. One reason I have focused on adding many tests
is that I am, in general, dissatisfied with the existing coverage of testing
in FOP that could detect regressions. I have made a reasonable start on
adding RTL WM tests, and I expect to add more.

> This BidiUtil class has clearly been written in a procedural style, and
> there are reasons why that style was abandoned years ago in favour of an
> object-oriented paradigm, that allows to write more flexible,
> maintainable programs, and easier to understand by people who are not
> the original authors.
>

First, you are clearly wrong that procedural style has been abandoned. Every
Java project including FOP combines both OO and procedural coding styles.

I'm not interested in entertaining an ideological discussion of when it is
appropriate to use one or the other. I will simply maintain that both have
reasonable purposes, just like there are reasonable purposes for writing
declarative content versus procedural content.

I had a job to do, which was to add complex script support to a large,
relative mature body of code for which no design consideration was given to
supporting international text and other writing modes. The way I took to
solve that problem was practical, not theoretical. And it works.

The fact that it works now doesn't mean it can't be refactored over time to
better associate functionality with FONode class design. However, normally
one starts with working code, and then refactors it over time.

> I’d like to know what’s your view on this?
>

I would like to ask you a question? Do you believe it is necessary to insist
that new code be refactored prior to merging? Do you believe that new
contributions should meet some notion of 'perfection' with respect to your
coding standards prior to making use of those contributions?

> > I realize this is a complex subject area that requires considerable
> domain
> > knowledge, but you can take my word as a domain expert (having
> implemented
> > this functionality multiple times in commercial products) that the
> approach
> > I have taken is (1) the most consistent with the XSL-FO specification,
> (2)
> > the behavior required to achieve the desired functionality, and (3) the
> > minimum changes and points of dependency to existing code.
> >
> > In contrast, a more distributed approach such as you suggest would (1)
> > diverge from XSL-FO specified behavior, (2) increase and distribute the
> > number of points of interaction with existing code so as to make behavior
> > harder to understand, test, and debug, and, most telling, (3) not provide
> > any functional or performance advantage.
> >
> > Regards,
> > Glenn
>
>
> Thanks,
> Vincent
>

Re: Merge Request - Temp_ComplexScripts into Trunk

Reply via email to