Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-28 Thread Clay Leeds
 However, if someone actually renamed my variables after I have declared my 
 position, then I would interpret that as doing bad things to the code. In 
 fact, i would revert such a change.
 
 If folks aren't willing to respect my style of coding along with my promise 
 to document short names, then I will withdraw my request for a merge and 
 abrogate my ICLA. I would not wish my work to be used in a community that 
 does not have sufficient respect for personal coding styles of contributors 
 that do not in any way vary from documented conventions.

This is a long thread and my hope is that we will be able to resolve it 
amicably, and that our community can grow from this experience. 

The ComplexScripts functionality being merged is a valued and important 
addition to the Apache FOP Project. 

More important than the code to me personally, is the continued community and 
its growth. 

FOP is a community built around a common desire to create a robust and open 
source tool for generating XSL Formatting Output in various common formats for 
print, screen, archival and transfer purposes.  

There is an expressed concern about code readability and reusability. What 
concerns me most is that this may cause division.

I'm concerned about the level of compromise and intent. It may take more of a 
stretch on each or our parts to move forward.

Clay

My religion is simple. My religion is kindness.
- HH The Dalai Lama of Tibet



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-27 Thread Simon Pepping
   Ninth, spending time changing variable names is a waste of time when I
  could
   be working on adding support for other scripts.
 
  So someone else is going to have to waste all that time converting those
  names into more readable ones. That’s a bit unfair, isn’t it?
 
 
 I would advise against anyone wasting their time by changing my names.
 Indeed, I will likely react very negatively to such an attempt. What you
 want to do in your code is your business, but don't imagine you are going to
 start rewriting my code to meet your style. Or at least don't do so if you
 wish me to be a part of this team.
 
 I would take such an action as a direct affront.

This is a big no. At the moment you hand in your code to FOP, it
belongs to the community. Anyone can touch it. That is where team
membership kicks in. Team members trust each other not to do bad
things to the code.
 
 If in the indefinite future I am not working on this code, then feel free to
 change it as you like. In the mean time, I'd appreciate a little respect.

Respect yes, but not touching it, no.

Simon Pepping


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-27 Thread Glenn Adams
On Thu, Oct 27, 2011 at 3:41 PM, Simon Pepping spepp...@leverkruid.euwrote:

Ninth, spending time changing variable names is a waste of time when
 I
   could
be working on adding support for other scripts.
  
   So someone else is going to have to waste all that time converting
 those
   names into more readable ones. That’s a bit unfair, isn’t it?
  
 
  I would advise against anyone wasting their time by changing my names.
  Indeed, I will likely react very negatively to such an attempt. What you
  want to do in your code is your business, but don't imagine you are going
 to
  start rewriting my code to meet your style. Or at least don't do so if
 you
  wish me to be a part of this team.
 
  I would take such an action as a direct affront.

 This is a big no. At the moment you hand in your code to FOP, it
 belongs to the community. Anyone can touch it. That is where team
 membership kicks in. Team members trust each other not to do bad
 things to the code.
 
  If in the indefinite future I am not working on this code, then feel free
 to
  change it as you like. In the mean time, I'd appreciate a little respect.

 Respect yes, but not touching it, no.


I did not say or imply hands off, so of course I presume that anyone can
touch it. Why do you insist on reading me otherwise?

However, if someone actually renamed my variables after I have declared my
position, then I would interpret that as doing bad things to the code. In
fact, i would revert such a change.

If folks aren't willing to respect my style of coding along with my promise
to document short names, then I will withdraw my request for a merge and
abrogate my ICLA. I would not wish my work to be used in a community that
does not have sufficient respect for personal coding styles of contributors
that do not in any way vary from documented conventions.



 Simon Pepping



RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-27 Thread Eric Douglas
There is little room for individuality in a coding community.
I'd say it doesn't matter what you call your variables as long as
someone who's never seen the code can understand the purpose through the
name and/or comments, and they conform to any predefined naming standard
for the project.
Ages ago all variables had short names because disk space and/or memory
was at a premium.  Today that shouldn't be an excuse.
Try to avoid overly simplistic names like x1 unless they have overly
simplistic purpose (create, use, destroy within a 5 line method), and
try to avoid overly verbose names.
As long as what you write makes sense I'd agree with you no one should
change the code simply for the sake of personal preference.
You should certainly change it if you don't want someone else to change
it if someone unfamiliar with it can't understand it.
If they're changing your names that should be fine as long as they're
fixing a bug or enhancing something where the name change makes sense to
go with the new logic.
 



From: Glenn Adams [mailto:gl...@skynav.com] 
Sent: Thursday, October 27, 2011 3:56 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Merge Request - Temp_ComplexScripts into Trunk




On Thu, Oct 27, 2011 at 3:41 PM, Simon Pepping spepp...@leverkruid.eu
wrote:


   Ninth, spending time changing variable names is a waste of
time when I
  could
   be working on adding support for other scripts.
 
  So someone else is going to have to waste all that time
converting those
  names into more readable ones. That's a bit unfair, isn't
it?
 

 I would advise against anyone wasting their time by changing
my names.
 Indeed, I will likely react very negatively to such an
attempt. What you
 want to do in your code is your business, but don't imagine
you are going to
 start rewriting my code to meet your style. Or at least don't
do so if you
 wish me to be a part of this team.

 I would take such an action as a direct affront.


This is a big no. At the moment you hand in your code to FOP, it
belongs to the community. Anyone can touch it. That is where
team
membership kicks in. Team members trust each other not to do bad
things to the code.


 If in the indefinite future I am not working on this code,
then feel free to
 change it as you like. In the mean time, I'd appreciate a
little respect.


Respect yes, but not touching it, no.



I did not say or imply hands off, so of course I presume that anyone can
touch it. Why do you insist on reading me otherwise?

However, if someone actually renamed my variables after I have declared
my position, then I would interpret that as doing bad things to the
code. In fact, i would revert such a change.

If folks aren't willing to respect my style of coding along with my
promise to document short names, then I will withdraw my request for a
merge and abrogate my ICLA. I would not wish my work to be used in a
community that does not have sufficient respect for personal coding
styles of contributors that do not in any way vary from documented
conventions.
 


Simon Pepping





Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Vincent Hennebert
On 21/10/11 08:29, Glenn Adams wrote:
 On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock 
 peter.hanc...@gmail.comwrote:
 

 From the surface I would have been very much in favor of supporting a
 merge in the near future, however I have had the chance to review some
 areas of the complex script branch and I have some concerns.
 The treatment of Unicode Bidi spans from the creation of the FO Tree
 through to the construction of the Area Tree and I would have liked to
 have seen the complex scripts solution integrate the Unicode Bidi
 Algorithm more directly into the core process:  For example, the
 implementation performs a post process on the FO Tree to resolve the
 Bidi properties of FONodes relating to text. It would be preferable to
 see the construction of the FO Tree embracing this Bidi aspect:
 FONodes should be responsible for determining their own bidi state
 from the fo node semantics in context to the position in the tree.
 Such an implementation would immediately force the maintainer to
 consider how a change would effect the Bidi process.

 
 Please review XSL-FO 1.1 Section 5.8, and, in particular:
 
 the algorithm is applied to a sequence of characters coming from the
 content of one or more formatting objects. The sequence of characters is
 created by processing a fragment of the formatting object tree. A *fragment* 
 is
 any contiguous sequence of children of some formatting object in the tree.
 The sequence is created by doing a pre-order traversal of the fragment down
 to the fo:character level.
 
 the final, text reordering step is not done during refinement. Instead, the
 XSL equivalent of re-ordering is done during area tree generation
 
 The current implementation adheres to the XSL-FO specification in this
 regard, while your suggestion that this behavior be isolated to individual
 FONodes is contrary to the specification and does not permit correct
 implementation of the functionality required.

Section 5 of the XSL-FO 1.1 Recommendation starts with the following
note:
“Although the refinement process is described in a series of steps, this
is solely for the convenience of exposition and does not imply they must
be implemented as separate steps in any conforming implementation.
A conforming implementation must only achieve the same effect.”

So we are free to implement the algorithm any way we see adequate.

But even so, a fragment is “any contiguous sequence of children of some
formatting object in the tree“. That formatting object doesn’t have to
be a whole page-sequence and can as well be a single block or inline or
anything else.

Implementing Bidi in individual FONode sub-classes allows to keep the
treatment encapsulated in each FO element, and adapt it to the specific
semantics of that element.

If this is done in a single BidiUtil class, all the behaviours that are
specific to each element are mixed together. Implementation details that
should be kept within individual classes are being exposed to the rest
of them. Elements must be handled in lenghty sequences of if/else
statement using ‘instanceof’ and casts to the concrete class.

If a new FO element is being implemented this will be very likely that
it will be forgotten to add the appropriate ‘if’ statement for that
element in the BidiUtil class. If it’s not forgotten, it will be
difficult to find out where to put that statement.

Doing treatment specific to an object outside its implementation screams
for trouble and regressions as soon as a change is made in one or the
other place. Unless people are aware that they must keep an eye on that
BidiUtil class, which is unlikely for newcomer who don’t know the code
well.

This BidiUtil class has clearly been written in a procedural style, and
there are reasons why that style was abandoned years ago in favour of an
object-oriented paradigm, that allows to write more flexible,
maintainable programs, and easier to understand by people who are not
the original authors.

I’d like to know what’s your view on this?


 I realize this is a complex subject area that requires considerable domain
 knowledge, but you can take my word as a domain expert (having implemented
 this functionality multiple times in commercial products) that the approach
 I have taken is (1) the most consistent with the XSL-FO specification, (2)
 the behavior required to achieve the desired functionality, and (3) the
 minimum changes and points of dependency to existing code.
 
 In contrast, a more distributed approach such as you suggest would (1)
 diverge from XSL-FO specified behavior, (2) increase and distribute the
 number of points of interaction with existing code so as to make behavior
 harder to understand, test, and debug, and, most telling, (3) not provide
 any functional or performance advantage.
 
 Regards,
 Glenn


Thanks,
Vincent


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Vincent Hennebert
On 24/10/11 14:05, Glenn Adams wrote:
 On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl georg.datt...@geneon.dewrote:
 
 Hello Glenn,

 (2) there is no standard for symbol length documented in FOP practice
 or enforced by checkstyle; I decline to exchange my choice of symbols
 with longer symbols simply because you prefer it that way; I have
 offered to add comments to my uses, and that is the most I'm willing
 to do to address this matter;

 You probably spent more years programming than I am alive, so please excuse
 me if that’s a stupid question: What is the reasoning/advantage behind those
 short variable names?

 
 First, I don't use short names everywhere. Mostly I just use in local
 variables, but generally not as class variables.
 
 Second, I was trained in Physics and Mathematics, which uses short variable
 names (E = M C ^ 2).

Welcome to the Computer Science world, where longer variable names rule
because they allow to make a program easier to understand and maintain.

When I read the paper about the total-fit algorithm for breaking
paragraphs into line, I found that the numerous one-letter variable
names were an impediment to understanding it. It was difficult to
remember what concept each variable was associated to.


 Third, I started programming in the 1960s with BAL 360, APL, then FORTRAN
 IV. We use short names there.

Yes, it is very fortunate that the computer world has learnt from those
old days, and moved on to embrace better programming practices.


 Fourth, I happen to have a good memory and I have no trouble remembering the
 meaning of variable names.
 
 Fifth, I find that short names prevents making lines too long and gives me
 more room for comments.

By putting only one statement per line it is rare to bump into the 100
characters per line limit.


 Sixth, I am going to be maintaining this code. If anyone has a problem with
 specific code during a merge or regression, they merely need ask me.

As Simon has already pointed out, this is not the way it should be in an
open-source project.


 Seventh, that's just my style, and I assert it is as valid as doing it with
 long names.

When I joined the FOP project, I adjusted my style to follow the
project’s practices. It seemed obvious to me to do so, because
a consistent style within a project avoids unnecessary distraction when
wandering through its different parts. I would expect any contributor to
do the same.


 Eighth, asking me to adhere to an undocumented convention that is not
 otherwise enforced, and for which there is no evidence or analysis of having
 been previously followed in FOP contributions is unwarranted.

There is no documented convention simply because it has never occurred
to anybody in the past that short variable names could be an option.
This is this kind of unwritten convention that everybody takes for
granted.


 Ninth, spending time changing variable names is a waste of time when I could
 be working on adding support for other scripts.

So someone else is going to have to waste all that time converting those
names into more readable ones. That’s a bit unfair, isn’t it?


 I can probably throw in a few more random reasons, but this should be
 sufficient.
 
 I've offered to add comments, take it or leave it.

Would you at least agree to use more readable variable names in new
code? That would be of great help to people who will get involved in the
Bidi stuff in the future.


Thanks,
Vincent


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
inline

On Wed, Oct 26, 2011 at 6:49 PM, Vincent Hennebert vhenneb...@gmail.comwrote:

 On 21/10/11 08:29, Glenn Adams wrote:
  On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock peter.hanc...@gmail.com
 wrote:
 
 
  From the surface I would have been very much in favor of supporting a
  merge in the near future, however I have had the chance to review some
  areas of the complex script branch and I have some concerns.
  The treatment of Unicode Bidi spans from the creation of the FO Tree
  through to the construction of the Area Tree and I would have liked to
  have seen the complex scripts solution integrate the Unicode Bidi
  Algorithm more directly into the core process:  For example, the
  implementation performs a post process on the FO Tree to resolve the
  Bidi properties of FONodes relating to text. It would be preferable to
  see the construction of the FO Tree embracing this Bidi aspect:
  FONodes should be responsible for determining their own bidi state
  from the fo node semantics in context to the position in the tree.
  Such an implementation would immediately force the maintainer to
  consider how a change would effect the Bidi process.
 
 
  Please review XSL-FO 1.1 Section 5.8, and, in particular:
 
  the algorithm is applied to a sequence of characters coming from the
  content of one or more formatting objects. The sequence of characters is
  created by processing a fragment of the formatting object tree. A
 *fragment* is
  any contiguous sequence of children of some formatting object in the
 tree.
  The sequence is created by doing a pre-order traversal of the fragment
 down
  to the fo:character level.
 
  the final, text reordering step is not done during refinement. Instead,
 the
  XSL equivalent of re-ordering is done during area tree generation
 
  The current implementation adheres to the XSL-FO specification in this
  regard, while your suggestion that this behavior be isolated to
 individual
  FONodes is contrary to the specification and does not permit correct
  implementation of the functionality required.

 Section 5 of the XSL-FO 1.1 Recommendation starts with the following
 note:
 “Although the refinement process is described in a series of steps, this
 is solely for the convenience of exposition and does not imply they must
 be implemented as separate steps in any conforming implementation.
 A conforming implementation must only achieve the same effect.”

 So we are free to implement the algorithm any way we see adequate.


And I have done just that: implement the [Bidi] algorithm in a way I found
adequate.


 But even so, a fragment is “any contiguous sequence of children of some
 formatting object in the tree“. That formatting object doesn’t have to
 be a whole page-sequence and can as well be a single block or inline or
 anything else.

 Implementing Bidi in individual FONode sub-classes allows to keep the
 treatment encapsulated in each FO element, and adapt it to the specific
 semantics of that element.


I have not ruled this option out. I have stated previously that this may be
possible; however, there are possible problems with confining this behavior
to FONode classes such as the fact that what constitutes a fragment and a
delimited text range (see XSL-FO 1.1 Section 5.8) is not confined to FO
node boundaries.


 If this is done in a single BidiUtil class, all the behaviours that are
 specific to each element are mixed together. Implementation details that
 should be kept within individual classes are being exposed to the rest
 of them. Elements must be handled in lenghty sequences of if/else
 statement using ‘instanceof’ and casts to the concrete class.


In adding support to Bidi to FOP, I was faced with the problem of (1) not
knowing the implementation, and (2) desiring to minimize the point of
contact of my changes with existing code in order to better isolate
behavioral regressions.

I made the determination that it would be most convenient and expedient to
code the bidi level and order resolution functionality in a single set of
utility functions (with other helper classes, such as UnicodeBidiAlgorithm).
That approach has worked so far. However, that doesn't mean it has to remain
that way.

It is not a straightforward task to add Bidi support to a large
implementation like FOP which failed to take the needs of
internationalization into account in its design. As a consequence, adding
this support must be done delicately, and in stages.


 If a new FO element is being implemented this will be very likely that
 it will be forgotten to add the appropriate ‘if’ statement for that
 element in the BidiUtil class. If it’s not forgotten, it will be
 difficult to find out where to put that statement.


Although I have not added tests for right-to-left writing mode for all
existing layoutengine standard tests, I am in the process of doing just
that, and have already put a number of tests in place. Eventually, there
will be RTL WM tests for all these standard tests.


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Peter Hancock
 On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
 are you claiming my code is not maintainable by other developers? if so,
 then please prove it objectively; otherwise, let's stop talking about this,
 and move on with the merge vote
How would one go about proving objectively that code is not maintainable?

There are many aspects to writing maintainable code, spanning from the
synax level through to the structuring of classes  and modules
(packages).  Importantly we should encourage:
Code reuse - (using trusted libraries, applying the DRY principle)
hard to measure objectively
A consistent style - this may be an emergent aspect of a project and
choosing guidelines at the start or even retrospectively may be too
difficult, but we can largely infer the style from the current state.
An imperfect but consistent style is arguably favorable to
inconsistency.
Idiomatic language usage - applying common solutions that leverage the
constructs of, and the philosophies behind a language (e.g applying OO
design patterns in Java applications).

I find that writing code that is in keeping with a the style of a
project and using the language as recommended makes it easier to
distill the intention of a piece of code from the implementation and
can lead towards self-documenting code.

The inner workings of FOP are complex and I think that all efforts to
boost understandability are essential.

Peter

On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams gl...@skynav.com wrote:
 inline
 On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert vhenneb...@gmail.com
 wrote:

 On 24/10/11 14:05, Glenn Adams wrote:
  On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl
  georg.datt...@geneon.dewrote:
 
  Hello Glenn,
 
  (2) there is no standard for symbol length documented in FOP practice
  or enforced by checkstyle; I decline to exchange my choice of symbols
  with longer symbols simply because you prefer it that way; I have
  offered to add comments to my uses, and that is the most I'm willing
  to do to address this matter;
 
  You probably spent more years programming than I am alive, so please
  excuse
  me if that’s a stupid question: What is the reasoning/advantage behind
  those
  short variable names?
 
 
  First, I don't use short names everywhere. Mostly I just use in local
  variables, but generally not as class variables.
 
  Second, I was trained in Physics and Mathematics, which uses short
  variable
  names (E = M C ^ 2).

 Welcome to the Computer Science world, where longer variable names rule
 because they allow to make a program easier to understand and maintain.

 When I read the paper about the total-fit algorithm for breaking
 paragraphs into line, I found that the numerous one-letter variable
 names were an impediment to understanding it. It was difficult to
 remember what concept each variable was associated to.

 I had no trouble understanding it. In fact, I re-implemented it in Lisp
 (Scheme), and fixed a few issues in the process, which I reported to Don
 Knuth and for which he sent me a check for $2.56. See attached file. Note
 that I used long names for (structure) member names and dynamic variables,
 but often short names for local (lexical) variables in this code which I
 wrote 20 years ago. I haven't changed my style since then, and I don't
 intend to do so now.


  Third, I started programming in the 1960s with BAL 360, APL, then
  FORTRAN
  IV. We use short names there.

 Yes, it is very fortunate that the computer world has learnt from those
 old days, and moved on to embrace better programming practices.

 We are apparently in different generations, and this influences our
 thinking. I am not judging your style, but you seem to be quick to judge my
 style. Personally, I find ideology counterproductive.


  Fourth, I happen to have a good memory and I have no trouble remembering
  the
  meaning of variable names.
 
  Fifth, I find that short names prevents making lines too long and gives
  me
  more room for comments.

 By putting only one statement per line it is rare to bump into the 100
 characters per line limit.


  Sixth, I am going to be maintaining this code. If anyone has a problem
  with
  specific code during a merge or regression, they merely need ask me.

 As Simon has already pointed out, this is not the way it should be in an
 open-source project.


  Seventh, that's just my style, and I assert it is as valid as doing it
  with
  long names.

 When I joined the FOP project, I adjusted my style to follow the
 project’s practices. It seemed obvious to me to do so, because
 a consistent style within a project avoids unnecessary distraction when
 wandering through its different parts. I would expect any contributor to
 do the same.


  Eighth, asking me to adhere to an undocumented convention that is not
  otherwise enforced, and for which there is no evidence or analysis of
  having
  been previously followed in FOP contributions is unwarranted.

 There is no documented convention simply 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
On Wed, Oct 26, 2011 at 8:36 PM, Peter Hancock peter.hanc...@gmail.comwrote:

  On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
  are you claiming my code is not maintainable by other developers? if so,
  then please prove it objectively; otherwise, let's stop talking about
 this,
  and move on with the merge vote
 How would one go about proving objectively that code is not maintainable?


My point is that this is a subjective exercise we are having here, and not
particularly fruitful. It can be done objectively, or at least more
objectively if one wants to take the time to do so. For example, by defining
specific objective metrics and tools that measure those metrics against the
existing code base and against new code.

But instead of doing that, we are presently dealing with argument by
innuendo.



 There are many aspects to writing maintainable code, spanning from the
 synax level through to the structuring of classes  and modules
 (packages).  Importantly we should encourage:
 Code reuse - (using trusted libraries, applying the DRY principle)
 hard to measure objectively
 A consistent style - this may be an emergent aspect of a project and
 choosing guidelines at the start or even retrospectively may be too
 difficult, but we can largely infer the style from the current state.
 An imperfect but consistent style is arguably favorable to
 inconsistency.
 Idiomatic language usage - applying common solutions that leverage the
 constructs of, and the philosophies behind a language (e.g applying OO
 design patterns in Java applications).

 I find that writing code that is in keeping with a the style of a
 project and using the language as recommended makes it easier to
 distill the intention of a piece of code from the implementation and
 can lead towards self-documenting code.

 The inner workings of FOP are complex and I think that all efforts to
 boost understandability are essential.


It is rather ironic that I find myself being interpreted as somehow trying
to decrease coding understandability, or being interpreted as promoting
idiomatic or inconsistent usage. You should ask some of the hundred or so
developers who have worked under me their opinion about my code. You would
come to a different conclusion.

I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
where the author apparently did not know Java, and introduces the libc
functions strcmp, strcpy, and strlen, and which uses the Java char type
(within the String type) for coding tree pointers!

I also note the author of this file uses short names for (exposed,
non-private) instance variables, such as:

protected char[] lo;
protected char[] hi;
protected char[] eq;
protected char[] sc;
protected CharVector kv;

At least in my case, I use long names for instance variables, even when they
are private.

If you wanted to make a serious case against using short names, you would
start first by analyzing existing FOP usage and using such an analysis to
establish concrete metrics. That would allow objective comparisons to be
made.

G.



 Peter

 On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams gl...@skynav.com wrote:
  inline
  On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert vhenneb...@gmail.com
 
  wrote:
 
  On 24/10/11 14:05, Glenn Adams wrote:
   On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl
   georg.datt...@geneon.dewrote:
  
   Hello Glenn,
  
   (2) there is no standard for symbol length documented in FOP
 practice
   or enforced by checkstyle; I decline to exchange my choice of
 symbols
   with longer symbols simply because you prefer it that way; I have
   offered to add comments to my uses, and that is the most I'm willing
   to do to address this matter;
  
   You probably spent more years programming than I am alive, so please
   excuse
   me if that’s a stupid question: What is the reasoning/advantage
 behind
   those
   short variable names?
  
  
   First, I don't use short names everywhere. Mostly I just use in local
   variables, but generally not as class variables.
  
   Second, I was trained in Physics and Mathematics, which uses short
   variable
   names (E = M C ^ 2).
 
  Welcome to the Computer Science world, where longer variable names rule
  because they allow to make a program easier to understand and maintain.
 
  When I read the paper about the total-fit algorithm for breaking
  paragraphs into line, I found that the numerous one-letter variable
  names were an impediment to understanding it. It was difficult to
  remember what concept each variable was associated to.
 
  I had no trouble understanding it. In fact, I re-implemented it in Lisp
  (Scheme), and fixed a few issues in the process, which I reported to Don
  Knuth and for which he sent me a check for $2.56. See attached file. Note
  that I used long names for (structure) member names and dynamic
 variables,
  but often short names for local (lexical) variables in this code which I
  wrote 20 years ago. I haven't 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
BTW, sometimes I choose to use longer names for local variables: see my
reimplementation of number to string conversion in
o.a.f.util.NumberConverter, which is a new (and large) class I added in the
CS branch. I use a few short names here, but not as many as longer names. So
you can see that sometimes I find it useful to use longer names. This sort
of decision (when to use long or short) should be based on an author's
preferences, and not established by fiat.

Notice also the considerable use of nested classes (and interfaces), which
tends to make the file longer, but nevertheless encapsulates abstractions in
smaller units. True, this file could be sub-divided into smaller files, and
I may yet do that. However, I found it convenient to keep it in one file for
the initial implementation.

On Wed, Oct 26, 2011 at 8:54 PM, Glenn Adams gl...@skynav.com wrote:



 On Wed, Oct 26, 2011 at 8:36 PM, Peter Hancock peter.hanc...@gmail.comwrote:

  On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
  are you claiming my code is not maintainable by other developers? if so,
  then please prove it objectively; otherwise, let's stop talking about
 this,
  and move on with the merge vote
 How would one go about proving objectively that code is not maintainable?


 My point is that this is a subjective exercise we are having here, and not
 particularly fruitful. It can be done objectively, or at least more
 objectively if one wants to take the time to do so. For example, by defining
 specific objective metrics and tools that measure those metrics against the
 existing code base and against new code.

 But instead of doing that, we are presently dealing with argument by
 innuendo.



 There are many aspects to writing maintainable code, spanning from the
 synax level through to the structuring of classes  and modules
 (packages).  Importantly we should encourage:
 Code reuse - (using trusted libraries, applying the DRY principle)
 hard to measure objectively
 A consistent style - this may be an emergent aspect of a project and
 choosing guidelines at the start or even retrospectively may be too
 difficult, but we can largely infer the style from the current state.
 An imperfect but consistent style is arguably favorable to
 inconsistency.
 Idiomatic language usage - applying common solutions that leverage the
 constructs of, and the philosophies behind a language (e.g applying OO
 design patterns in Java applications).

 I find that writing code that is in keeping with a the style of a
 project and using the language as recommended makes it easier to
 distill the intention of a piece of code from the implementation and
 can lead towards self-documenting code.

 The inner workings of FOP are complex and I think that all efforts to
 boost understandability are essential.


 It is rather ironic that I find myself being interpreted as somehow trying
 to decrease coding understandability, or being interpreted as promoting
 idiomatic or inconsistent usage. You should ask some of the hundred or so
 developers who have worked under me their opinion about my code. You would
 come to a different conclusion.

 I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
 where the author apparently did not know Java, and introduces the libc
 functions strcmp, strcpy, and strlen, and which uses the Java char type
 (within the String type) for coding tree pointers!

 I also note the author of this file uses short names for (exposed,
 non-private) instance variables, such as:

 protected char[] lo;
 protected char[] hi;
 protected char[] eq;
 protected char[] sc;
 protected CharVector kv;

 At least in my case, I use long names for instance variables, even when
 they are private.

 If you wanted to make a serious case against using short names, you would
 start first by analyzing existing FOP usage and using such an analysis to
 establish concrete metrics. That would allow objective comparisons to be
 made.

 G.



 Peter

 On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams gl...@skynav.com wrote:
  inline
  On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert 
 vhenneb...@gmail.com
  wrote:
 
  On 24/10/11 14:05, Glenn Adams wrote:
   On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl
   georg.datt...@geneon.dewrote:
  
   Hello Glenn,
  
   (2) there is no standard for symbol length documented in FOP
 practice
   or enforced by checkstyle; I decline to exchange my choice of
 symbols
   with longer symbols simply because you prefer it that way; I have
   offered to add comments to my uses, and that is the most I'm
 willing
   to do to address this matter;
  
   You probably spent more years programming than I am alive, so please
   excuse
   me if that’s a stupid question: What is the reasoning/advantage
 behind
   those
   short variable names?
  
  
   First, I don't use short names everywhere. Mostly I just use in local
   variables, but generally not as class variables.
  
   Second, I was trained in 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
While you are at it, Peter, you may also take note that I have made liberal
use of *assert* in the file I reference below (NumberConverter). If we are
going to improve not only understandability but also real quality, how about
a campaign to maximize use of assertions to document code assumptions?

I notice that nobody has pointed out my liberal use of assertions as a
positive for understanding and quality, while instead focusing on the narrow
issue of short symbol name length as a negative. Given that symbol length
has no impact at the JVM layer (except for consuming more runtime memory
resources than shorter symbol names), perhaps we should focus on something
which does have an impact, such as runtime assertion testing.

On Wed, Oct 26, 2011 at 9:13 PM, Glenn Adams gl...@skynav.com wrote:

 BTW, sometimes I choose to use longer names for local variables: see my
 reimplementation of number to string conversion in
 o.a.f.util.NumberConverter, which is a new (and large) class I added in the
 CS branch. I use a few short names here, but not as many as longer names. So
 you can see that sometimes I find it useful to use longer names. This sort
 of decision (when to use long or short) should be based on an author's
 preferences, and not established by fiat.

 Notice also the considerable use of nested classes (and interfaces), which
 tends to make the file longer, but nevertheless encapsulates abstractions in
 smaller units. True, this file could be sub-divided into smaller files, and
 I may yet do that. However, I found it convenient to keep it in one file for
 the initial implementation.


 On Wed, Oct 26, 2011 at 8:54 PM, Glenn Adams gl...@skynav.com wrote:



 On Wed, Oct 26, 2011 at 8:36 PM, Peter Hancock 
 peter.hanc...@gmail.comwrote:

  On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
  are you claiming my code is not maintainable by other developers? if
 so,
  then please prove it objectively; otherwise, let's stop talking about
 this,
  and move on with the merge vote
 How would one go about proving objectively that code is not maintainable?


 My point is that this is a subjective exercise we are having here, and not
 particularly fruitful. It can be done objectively, or at least more
 objectively if one wants to take the time to do so. For example, by defining
 specific objective metrics and tools that measure those metrics against the
 existing code base and against new code.

 But instead of doing that, we are presently dealing with argument by
 innuendo.



 There are many aspects to writing maintainable code, spanning from the
 synax level through to the structuring of classes  and modules
 (packages).  Importantly we should encourage:
 Code reuse - (using trusted libraries, applying the DRY principle)
 hard to measure objectively
 A consistent style - this may be an emergent aspect of a project and
 choosing guidelines at the start or even retrospectively may be too
 difficult, but we can largely infer the style from the current state.
 An imperfect but consistent style is arguably favorable to
 inconsistency.
 Idiomatic language usage - applying common solutions that leverage the
 constructs of, and the philosophies behind a language (e.g applying OO
 design patterns in Java applications).

 I find that writing code that is in keeping with a the style of a
 project and using the language as recommended makes it easier to
 distill the intention of a piece of code from the implementation and
 can lead towards self-documenting code.

 The inner workings of FOP are complex and I think that all efforts to
 boost understandability are essential.


 It is rather ironic that I find myself being interpreted as somehow trying
 to decrease coding understandability, or being interpreted as promoting
 idiomatic or inconsistent usage. You should ask some of the hundred or so
 developers who have worked under me their opinion about my code. You would
 come to a different conclusion.

 I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
 where the author apparently did not know Java, and introduces the libc
 functions strcmp, strcpy, and strlen, and which uses the Java char type
 (within the String type) for coding tree pointers!

 I also note the author of this file uses short names for (exposed,
 non-private) instance variables, such as:

 protected char[] lo;
 protected char[] hi;
 protected char[] eq;
 protected char[] sc;
 protected CharVector kv;

 At least in my case, I use long names for instance variables, even when
 they are private.

 If you wanted to make a serious case against using short names, you would
 start first by analyzing existing FOP usage and using such an analysis to
 establish concrete metrics. That would allow objective comparisons to be
 made.

 G.



 Peter

 On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams gl...@skynav.com wrote:
  inline
  On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert 
 vhenneb...@gmail.com
  wrote:
 
  On 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Peter Hancock
 I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
 where the author apparently did not know Java, and introduces the libc
 functions strcmp, strcpy, and strlen, and which uses the Java char type
 (within the String type) for coding tree pointers!

My apprehension about certain areas of your code (and not the
majority!) stems from such examples, and the headaches theycan
bring.  This is old code that I had no influence over at the time and
I do not want it to have any bearing on where the  project is heading.

 If you wanted to make a serious case against using short names, you would
 start first by analyzing existing FOP usage and using such an analysis to
 establish concrete metrics.

I do not think I have focused on the length of variable or member
names have I?  I did a PhD in mathematics and I have a soft spot
for the aesthetic value of short names.  It is always pleasing to
distill a mathematical proof to the simplist form possible and
using consise variable naming is often a part of that.  That said, I
do not think that working codebenefits from this approach:
what can seem like an efficient and powerful piece of code when
written can prove to be an  overly difficult thing to read later.
Unlike yourself, apparently, my memory ain't so good and I benefit
from code that has clear intention.

Peter


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Peter Hancock
On Wed, Oct 26, 2011 at 2:13 PM, Glenn Adams gl...@skynav.com wrote:
 Notice also the considerable use of nested classes (and interfaces), which
 tends to make the file longer, but nevertheless encapsulates abstractions in
 smaller units. True, this file could be sub-divided into smaller files, and
 I may yet do that. However, I found it convenient to keep it in one file for
 the initial implementation.

I appreciate that Java does not always help us when striving for well
encapsulated code AND manageable file lengths!

I really do not think you implementation is fundamentally that far off
the mark and the amount of constructive attention it has received has
naturally been proportional to the quantity - something that is very
impressive!

Peter


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
On Wed, Oct 26, 2011 at 9:34 PM, Peter Hancock peter.hanc...@gmail.comwrote:

  I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
  where the author apparently did not know Java, and introduces the libc
  functions strcmp, strcpy, and strlen, and which uses the Java char type
  (within the String type) for coding tree pointers!

 My apprehension about certain areas of your code (and not the
 majority!) stems from such examples, and the headaches theycan
 bring.  This is old code that I had no influence over at the time and
 I do not want it to have any bearing on where the  project is heading.

  If you wanted to make a serious case against using short names, you would
  start first by analyzing existing FOP usage and using such an analysis to
  establish concrete metrics.

 I do not think I have focused on the length of variable or member
 names have I?  I did a PhD in mathematics and I have a soft spot
 for the aesthetic value of short names.  It is always pleasing to
 distill a mathematical proof to the simplist form possible and
 using consise variable naming is often a part of that.  That said, I
 do not think that working codebenefits from this approach:
 what can seem like an efficient and powerful piece of code when
 written can prove to be an  overly difficult thing to read later.
 Unlike yourself, apparently, my memory ain't so good and I benefit
 from code that has clear intention.


Yet you continue to imply that:

short variable names != clear intention

This I must disagree with. I could use long random names and obfuscate
intention. I can uses short names and document intention (in comments). I
have agreed to do the latter. Is that not enough?


 Peter



RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Eric Douglas
I haven't looked at the code in question on this particular discussion
so this is not to criticize.

Overly concise variables names should be acceptable within limited
scope.
Calling an ObjectOutputStream oos may be sufficient when it's created
and destroyed within one little method.
Using i or z may suffice as loop counters within a single simple method,
while you may want a longer name simply to track the loop if it gets
more complex nesting loops.
A project should have defined standards for meaningful variable naming,
particularly when they're declared at the class level or they're public,
protected, or passed in to the method.

The simplest readability standard is of course the layout.  Eclipse has
plenty of preferences and an option to export them.  Line wraps, comment
format, etc should be consistant within a project.

Of course if code must be reused it helps if standard naming can be
enforced by such as abstract methods and interfaces.

-Original Message-
From: Peter Hancock [mailto:peter.hanc...@gmail.com] 
Sent: Wednesday, October 26, 2011 9:34 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Merge Request - Temp_ComplexScripts into Trunk

 I wonder what you think about the code in 
 o.a.f.hyphenation.TernaryTree, where the author apparently did not 
 know Java, and introduces the libc functions strcmp, strcpy, and 
 strlen, and which uses the Java char type (within the String type) for
coding tree pointers!

My apprehension about certain areas of your code (and not the
majority!) stems from such examples, and the headaches theycan
bring.  This is old code that I had no influence over at the time and I
do not want it to have any bearing on where the  project is heading.

 If you wanted to make a serious case against using short names, you 
 would start first by analyzing existing FOP usage and using such an 
 analysis to establish concrete metrics.

I do not think I have focused on the length of variable or member
names have I?  I did a PhD in mathematics and I have a soft spot
for the aesthetic value of short names.  It is always pleasing to
distill a mathematical proof to the simplist form possible and
using consise variable naming is often a part of that.  That said, I
do not think that working codebenefits from this approach:
what can seem like an efficient and powerful piece of code when
written can prove to be an  overly difficult thing to read later.
Unlike yourself, apparently, my memory ain't so good and I benefit from
code that has clear intention.

Peter


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Vincent Hennebert
On 22/10/11 01:22, Glenn Adams wrote:
 inline
 
 On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch bowditch_ch...@hotmail.com
 wrote:
 

 Since Thunderhead also needs this feature we are willing to invest some
 time into it too. Currently my team are telling me it would take 9 person
 months to port this code into our branch of FOP, partly because of some
 merge conflicts, but also partly because we are not comfortable with some
 aspects of the code as it has already been pointed out. The estimate would
 include the time to refactor long files into small ones, deal with the
 variable names and restructuring the code.

 
 I would advise against this, since it would it is functionally unnecessary
 and since it will make future merges more difficult. I will be making
 additional changes as more features in this area are added.
 
 I don't see what refactoring long files into small ones buys you. However,
 if you make a reasoned argument for factoring specific long files (i.e., why
 such factoring improves architecture, modularity, etc), rather than simply
 say all long files must be refactored, then I will seriously discuss doing
 so post merge.

When I read this I’m really puzzled because that should really be the
other way around: what is the reason to keep those classes so big (and
that must be a really good one)? Most likely the Single Responsibility
Principle is being violated in those classes.

BidiUtil is a good example of this: it computes Bidi levels on the FO
tree, /and/ also re-orders areas on the area tree. Those two things
should most probably be done in two separate classes. Especially since
from the quick look I had they seem to be using two distinct sets or
private methods.


 I appreciate your commitment to add comments to short identifiers
 declarations, so at least it will be easier for the team to translate the
 short variables to longer equivalents. Just so we are clear on what you
 propose, do you mean this:

 int gi = 0; // Glyph Index

 
 Yes. Note that I already do this in most cases, such as:
 
 private static void resolveExplicit ( int[] wca, int defaultLevel, int[] ea
 ) {
 int[] es = new int [ MAX_LEVELS ];  /* embeddings stack */
 int ei = 0; /* embeddings stack index */
 int ec = defaultLevel;  /* current embedding level
 */
 for ( int i = 0, n = wca.length; i  n; i++ ) {
 int bc = wca [ i ]; /* bidi class of current
 char */
 int el; /* embedding level to assign
 to current char */
 switch ( bc ) {
 case LRE:   // start left-to-right
 embedding
 case RLE:   // start right-to-left
 embedding
  case LRO:   // start left-to-right
 override
 case RLO:   // start right-to-left
 override
 {
 int en; /* new embedding level */
 if ( ( bc == RLE ) || ( bc == RLO ) ) {
 en = ( ( ec  ~OVERRIDE ) + 1 ) | 1;
 } else {
 en = ( ( ec  ~OVERRIDE ) + 2 )  ~1;
 }
 if ( en  ( MAX_LEVELS + 1 ) ) {
 es [ ei++ ] = ec;
 if ( ( bc == LRO ) || ( bc == RLO ) ) {
 ec = en | OVERRIDE;
 } else {
 ec = en  ~OVERRIDE;
 }
 } else {
 // max levels exceeded, so don't change level or
 override
 }
 el = ec;
 break;
 }
 ...
 
 What I'm agreeing to do in the relative near future (after merge, before new
 release) is to add such comments to those places where I have not already
 done so, which are probably a minority of such cases.

This is good to hear, although it only marginally helps. Again, what’s
the rationale behind having 2 or 3 letter variables when every course
about programming will emphasise the importance of having reasonably
long, self-describing variable names? What amount of typing is there to
save when any modern IDE will auto-complete variable names?

Explaining the purpose of a variable in a comment creates two problems:
first, it forces the developer to constantly scroll from where the
variable is being used to where it is declared, in order to remember
what its purpose is. By doing so, they lose the context in which the
variable is used and have to read the code again. This makes it
extremely painful and difficult to understand the code.

But more importantly, there is no guarantee that the comment is
accurate. It’s notorious that comments tend to be left behind when
changes are made to the code. Which means that they quickly become
misleading or even plain wrong. Therefore people tend to not trust
comments, or even not read them 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Glenn Adams
Vincent,

We apparently disagree on whether coding should be based on ideology or on
practical results. You appear to favor the former, I favor the latter. I
think we will have to leave it at that. I'm not going to alter my
programming style in order to adhere to your notion of ideal programming
practice. It would be one thing if there was an established consensus on
these matters based on objective reasoning, but there is not:

(1) the limit on file length appear to be arbitrary, and not a result of an
explicit reasoned process; i had reasons for coding the way I did (including
the use of nested classes), and I feel no need to alter or justify that
process; if someone can make an objectively reasoned argument on a case by
case basis, then I'm willing to consider refactoring at some point in the
future when other more important tasks are completed; e.g., I would be
willing to break out the nested class UnicodeBidiAlgorithm from
BidiUtil.java into a separate file (which would in address your comment
about separating bidi level determination from reordering);

(2) there is no standard for symbol length documented in FOP practice or
enforced by checkstyle; I decline to exchange my choice of symbols with
longer symbols simply because you prefer it that way; I have offered to add
comments to my uses, and that is the most I'm willing to do to address this
matter;

Note that in both of these cases, I am offering to take concrete steps to
address your comments, though not necessarily in the manner or to the full
extent you would prefer. This is called compromise, an important aspect of
working in a team.

G.

On Mon, Oct 24, 2011 at 6:54 PM, Vincent Hennebert vhenneb...@gmail.comwrote:

 On 22/10/11 01:22, Glenn Adams wrote:
  inline
 
  On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch 
 bowditch_ch...@hotmail.com
  wrote:
 
 
  Since Thunderhead also needs this feature we are willing to invest some
  time into it too. Currently my team are telling me it would take 9
 person
  months to port this code into our branch of FOP, partly because of some
  merge conflicts, but also partly because we are not comfortable with
 some
  aspects of the code as it has already been pointed out. The estimate
 would
  include the time to refactor long files into small ones, deal with the
  variable names and restructuring the code.
 
 
  I would advise against this, since it would it is functionally
 unnecessary
  and since it will make future merges more difficult. I will be making
  additional changes as more features in this area are added.
 
  I don't see what refactoring long files into small ones buys you.
 However,
  if you make a reasoned argument for factoring specific long files (i.e.,
 why
  such factoring improves architecture, modularity, etc), rather than
 simply
  say all long files must be refactored, then I will seriously discuss
 doing
  so post merge.

 When I read this I’m really puzzled because that should really be the
 other way around: what is the reason to keep those classes so big (and
 that must be a really good one)? Most likely the Single Responsibility
 Principle is being violated in those classes.

 BidiUtil is a good example of this: it computes Bidi levels on the FO
 tree, /and/ also re-orders areas on the area tree. Those two things
 should most probably be done in two separate classes. Especially since
 from the quick look I had they seem to be using two distinct sets or
 private methods.


  I appreciate your commitment to add comments to short identifiers
  declarations, so at least it will be easier for the team to translate
 the
  short variables to longer equivalents. Just so we are clear on what you
  propose, do you mean this:
 
  int gi = 0; // Glyph Index
 
 
  Yes. Note that I already do this in most cases, such as:
 
  private static void resolveExplicit ( int[] wca, int defaultLevel, int[]
 ea
  ) {
  int[] es = new int [ MAX_LEVELS ];  /* embeddings stack */
  int ei = 0; /* embeddings stack index
 */
  int ec = defaultLevel;  /* current embedding
 level
  */
  for ( int i = 0, n = wca.length; i  n; i++ ) {
  int bc = wca [ i ]; /* bidi class of current
  char */
  int el; /* embedding level to
 assign
  to current char */
  switch ( bc ) {
  case LRE:   // start left-to-right
  embedding
  case RLE:   // start right-to-left
  embedding
   case LRO:   // start
 left-to-right
  override
  case RLO:   // start right-to-left
  override
  {
  int en; /* new embedding level */
  if ( ( bc == RLE ) || ( bc == RLO ) ) {
  en = ( ( ec  ~OVERRIDE ) + 1 ) | 1;
  } else {
  en 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Glenn Adams
On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl georg.datt...@geneon.dewrote:

 Hello Glenn,

  (2) there is no standard for symbol length documented in FOP practice
  or enforced by checkstyle; I decline to exchange my choice of symbols
  with longer symbols simply because you prefer it that way; I have
  offered to add comments to my uses, and that is the most I'm willing
  to do to address this matter;

 You probably spent more years programming than I am alive, so please excuse
 me if that’s a stupid question: What is the reasoning/advantage behind those
 short variable names?


First, I don't use short names everywhere. Mostly I just use in local
variables, but generally not as class variables.

Second, I was trained in Physics and Mathematics, which uses short variable
names (E = M C ^ 2).

Third, I started programming in the 1960s with BAL 360, APL, then FORTRAN
IV. We use short names there.

Fourth, I happen to have a good memory and I have no trouble remembering the
meaning of variable names.

Fifth, I find that short names prevents making lines too long and gives me
more room for comments.

Sixth, I am going to be maintaining this code. If anyone has a problem with
specific code during a merge or regression, they merely need ask me.

Seventh, that's just my style, and I assert it is as valid as doing it with
long names.

Eighth, asking me to adhere to an undocumented convention that is not
otherwise enforced, and for which there is no evidence or analysis of having
been previously followed in FOP contributions is unwarranted.

Ninth, spending time changing variable names is a waste of time when I could
be working on adding support for other scripts.

I can probably throw in a few more random reasons, but this should be
sufficient.

I've offered to add comments, take it or leave it.


RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Eric Douglas
Short variable names should use less memory, which is mostly irrelevant
these days.
In an open project where other people could be working on the same code
(or other code in the same package) it helps if all names are
consistant.
Personally I could never figure out what variable naming conventions
are.  Each class I write seems to provide reason to use an entirely new
convention.
As long as someone who has never seen your code before can determine the
purpose of each variable, I'd say you're good.
If that requires comments, definitely add comments.  When in doubt,
comment.



From: Glenn Adams [mailto:gl...@skynav.com] 
Sent: Monday, October 24, 2011 9:06 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Merge Request - Temp_ComplexScripts into Trunk



On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl georg.datt...@geneon.de
wrote:


Hello Glenn,


 (2) there is no standard for symbol length documented in FOP
practice
 or enforced by checkstyle; I decline to exchange my choice of
symbols
 with longer symbols simply because you prefer it that way; I
have
 offered to add comments to my uses, and that is the most I'm
willing
 to do to address this matter;


You probably spent more years programming than I am alive, so
please excuse me if that's a stupid question: What is the
reasoning/advantage behind those short variable names?



First, I don't use short names everywhere. Mostly I just use in local
variables, but generally not as class variables.

Second, I was trained in Physics and Mathematics, which uses short
variable names (E = M C ^ 2).

Third, I started programming in the 1960s with BAL 360, APL, then
FORTRAN IV. We use short names there.

Fourth, I happen to have a good memory and I have no trouble remembering
the meaning of variable names.

Fifth, I find that short names prevents making lines too long and gives
me more room for comments.

Sixth, I am going to be maintaining this code. If anyone has a problem
with specific code during a merge or regression, they merely need ask
me.

Seventh, that's just my style, and I assert it is as valid as doing it
with long names.

Eighth, asking me to adhere to an undocumented convention that is not
otherwise enforced, and for which there is no evidence or analysis of
having been previously followed in FOP contributions is unwarranted.

Ninth, spending time changing variable names is a waste of time when I
could be working on adding support for other scripts.

I can probably throw in a few more random reasons, but this should be
sufficient.

I've offered to add comments, take it or leave it.



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Pascal Sancho
Hi,

I'm not sure that short variables names affect readability when long
mathematical formulas are used.
sometimes, code concision can help in understanding what code does:
 depending on what you can read and understand at a glance.
Readability should be in a place between concision and verbose,
depending on the threated topic.

that can be discussed, but this should not prevent from merging GA's works.

+1 for merging it now.

Le 24/10/2011 15:26, Eric Douglas a écrit :
 Short variable names should use less memory, which is mostly irrelevant
 these days.
 In an open project where other people could be working on the same code
 (or other code in the same package) it helps if all names are consistant.
 Personally I could never figure out what variable naming conventions
 are.  Each class I write seems to provide reason to use an entirely new
 convention.
 As long as someone who has never seen your code before can determine the
 purpose of each variable, I'd say you're good.
 If that requires comments, definitely add comments.  When in doubt, comment.
 
 
 *From:* Glenn Adams [mailto:gl...@skynav.com]
 *Sent:* Monday, October 24, 2011 9:06 AM
 *To:* fop-dev@xmlgraphics.apache.org
 *Subject:* Re: Merge Request - Temp_ComplexScripts into Trunk
 
 
 On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl georg.datt...@geneon.de
 mailto:georg.datt...@geneon.de wrote:
 
 Hello Glenn,
 
  (2) there is no standard for symbol length documented in FOP practice
  or enforced by checkstyle; I decline to exchange my choice of symbols
  with longer symbols simply because you prefer it that way; I have
  offered to add comments to my uses, and that is the most I'm willing
  to do to address this matter;
 
 You probably spent more years programming than I am alive, so please
 excuse me if that’s a stupid question: What is the
 reasoning/advantage behind those short variable names?
 
 
 First, I don't use short names everywhere. Mostly I just use in local
 variables, but generally not as class variables.
 
 Second, I was trained in Physics and Mathematics, which uses short
 variable names (E = M C ^ 2).
 
 Third, I started programming in the 1960s with BAL 360, APL, then
 FORTRAN IV. We use short names there.
 
 Fourth, I happen to have a good memory and I have no trouble remembering
 the meaning of variable names.
 
 Fifth, I find that short names prevents making lines too long and gives
 me more room for comments.
 
 Sixth, I am going to be maintaining this code. If anyone has a problem
 with specific code during a merge or regression, they merely need ask me.
 
 Seventh, that's just my style, and I assert it is as valid as doing it
 with long names.
 
 Eighth, asking me to adhere to an undocumented convention that is not
 otherwise enforced, and for which there is no evidence or analysis of
 having been previously followed in FOP contributions is unwarranted.
 
 Ninth, spending time changing variable names is a waste of time when I
 could be working on adding support for other scripts.
 
 I can probably throw in a few more random reasons, but this should be
 sufficient.
 
 I've offered to add comments, take it or leave it.
 

-- 
Pascal


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Simon Pepping
On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
 Sixth, I am going to be maintaining this code. If anyone has a problem with
 specific code during a merge or regression, they merely need ask me.

That is a big no. There will always be a moment when someone else must
or wants to work on this code. FOP code cannot depend on a single
person, it must be maintainable by other developers.

Simon


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Glenn Adams
are you claiming my code is not maintainable by other developers? if so,
then please prove it objectively; otherwise, let's stop talking about this,
and move on with the merge vote

On Tue, Oct 25, 2011 at 1:21 AM, Simon Pepping spepp...@leverkruid.euwrote:

 On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
  Sixth, I am going to be maintaining this code. If anyone has a problem
 with
  specific code during a merge or regression, they merely need ask me.

 That is a big no. There will always be a moment when someone else must
 or wants to work on this code. FOP code cannot depend on a single
 person, it must be maintainable by other developers.

 Simon



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Glenn Adams
On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock peter.hanc...@gmail.comwrote:


 From the surface I would have been very much in favor of supporting a
 merge in the near future, however I have had the chance to review some
 areas of the complex script branch and I have some concerns.
 The treatment of Unicode Bidi spans from the creation of the FO Tree
 through to the construction of the Area Tree and I would have liked to
 have seen the complex scripts solution integrate the Unicode Bidi
 Algorithm more directly into the core process:  For example, the
 implementation performs a post process on the FO Tree to resolve the
 Bidi properties of FONodes relating to text. It would be preferable to
 see the construction of the FO Tree embracing this Bidi aspect:
 FONodes should be responsible for determining their own bidi state
 from the fo node semantics in context to the position in the tree.
 Such an implementation would immediately force the maintainer to
 consider how a change would effect the Bidi process.


Please review XSL-FO 1.1 Section 5.8, and, in particular:

the algorithm is applied to a sequence of characters coming from the
content of one or more formatting objects. The sequence of characters is
created by processing a fragment of the formatting object tree. A *fragment* is
any contiguous sequence of children of some formatting object in the tree.
The sequence is created by doing a pre-order traversal of the fragment down
to the fo:character level.

the final, text reordering step is not done during refinement. Instead, the
XSL equivalent of re-ordering is done during area tree generation

The current implementation adheres to the XSL-FO specification in this
regard, while your suggestion that this behavior be isolated to individual
FONodes is contrary to the specification and does not permit correct
implementation of the functionality required.

I realize this is a complex subject area that requires considerable domain
knowledge, but you can take my word as a domain expert (having implemented
this functionality multiple times in commercial products) that the approach
I have taken is (1) the most consistent with the XSL-FO specification, (2)
the behavior required to achieve the desired functionality, and (3) the
minimum changes and points of dependency to existing code.

In contrast, a more distributed approach such as you suggest would (1)
diverge from XSL-FO specified behavior, (2) increase and distribute the
number of points of interaction with existing code so as to make behavior
harder to understand, test, and debug, and, most telling, (3) not provide
any functional or performance advantage.

Regards,
Glenn


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Simon Pepping
On Thu, Oct 20, 2011 at 02:53:54PM +0100, Vincent Hennebert wrote:
 Here are the sizes of some new files:
 1075 src/java/org/apache/fop/fonts/GlyphSequence.java
 1089 src/java/org/apache/fop/fonts/GlyphProcessingState.java
 1269 
 src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
 2034 src/java/org/apache/fop/layoutmgr/BidiUtil.java
 3449 test/java/org/apache/fop/complexscripts/util/TTXFile.java
 
 This latter one contains more than 50 field
 declarations, and the Handler.startElement method alone is more than
 1800 lines long.
 
 Also, the o.a.f.fonts.truetype.TTFFile class has now grown to
 5502 lines. That’s 3 times its original size which was already too big.
 I regularly find myself looking at bits of this class, and I would be
 unable to do so on a 5500 line class.

I agree that these are undesirably long.
 
 I don’t think it needs to be explained why big classes are undesirable?
 
 Also, most files disable Checkstyle checks, the most important ones
 being line length and white space. Many files have too long lines which
 makes it a pain to read through, having to horizontally scroll all the
 time. We agreed on a certain coding style in the project and it would be
 good if new code could adhere to it.

I agree that this is not in compliance with the code style of the FOP
project.

 Speaking of variable names, here is a method picked from
 o.a.f.fonts.GlyphSequence:
 /**
  * Merge overlapping and abutting sub-intervals.
  */
 private static int[] mergeIntervals ( int[] ia ) {
 int ni = ia.length;
 int i, n, nm, is, ie;
 // count merged sub-intervals
 for ( i = 0, n = ni, nm = 0, is = ie = -1; i  n; i += 2 ) {
 int s = ia [ i + 0 ];
 int e = ia [ i + 1 ];
 if ( ( ie  0 ) || ( s  ie ) ) {
 is = s;
 ie = e;
 nm++;
 } else if ( s = is ) {
 if ( e  ie ) {
 ie = e;
 }
 }
 }
 int[] mi = new int [ nm * 2 ];
 // populate merged sub-intervals
 for ( i = 0, n = ni, nm = 0, is = ie = -1; i  n; i += 2 ) {
 int s = ia [ i + 0 ];
 int e = ia [ i + 1 ];
 int k = nm * 2;
 if ( ( ie  0 ) || ( s  ie ) ) {
 is = s;
 ie = e;
 mi [ k + 0 ] = is;
 mi [ k + 1 ] = ie;
 nm++;
 } else if ( s = is ) {
 if ( e  ie ) {
 ie = e;
 }
 mi [ k - 1 ] = ie;
 }
 }
 return mi;
 }
 

It could be refactored as follows:

/**
 * Merge overlapping and abutting sub-intervals.
 * @param inputArray The array to be merged
 */
private static int[] mergeIntervals ( int[] inputArray ) {
int numMerge = 0;
// count merged sub-intervals
for ( int i = 0, iCurrent = -1, iNext = -1; i  inputArray.length; i += 2 ) 
{
int current = inputArray [ i + 0 ];
int next = inputArray [ i + 1 ];
if ( ( iNext  0 ) || ( current  iNext ) ) {
iCurrent = current;
iNext = next;
numMerge++;
} else if ( current = iCurrent ) {
if ( next  iNext ) {
iNext = next;
}
}
}
int[] returnArray = new int [ numMerge * 2 ];
// populate merged sub-intervals
for ( int i = 0, numMerge2 = 0, iCurrent = -1, iNext = -1; i  
inputArray.length; i += 2 ) {
int current = inputArray [ i + 0 ];
int next = inputArray [ i + 1 ];
int numMerge2Square = numMerge2 * 2;
if ( ( iNext  0 ) || ( current  iNext ) ) {
iCurrent = current;
iNext = next;
returnArray [ numMerge2Square + 0 ] = iCurrent;
returnArray [ numMerge2Square + 1 ] = iNext;
numMerge2++;
} else if ( current = iCurrent ) {
if ( next  iNext ) {
iNext = next;
}
returnArray [ numMerge2Square - 1 ] = iNext;
}
}
return returnArray;
}

Many variables are now limited in scope to the loops. Only numMerge
and returnArray are visible outside the loop. The names make it
somewhat clearer what the code does (if I guessed the names right).

BTW Is there a requirement that the length of the input array is even?
If not, inputArray [ i + 1 ] will raise an exception if i == n-1.

So, yes, I agree with your objections against the actual format of the
code.

Simon


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Simon Pepping
I am pleased to learn that you are also in need of this new
functionality.

I share some of Vincent and Peter's concerns about technical points of
the code. On the other hand, this is the only implementation of
complex scripts we have, created by Glenn, in the style of Glenn. It
is an initial implementation, and it is normal that it requires
further work, maybe even design changes to make it more flexible. Does
keeping it in a branch make that further work easier? Merging it into
trunk will enhance its visibility, and make it available to more
users.

Simon

On Thu, Oct 20, 2011 at 02:02:10PM +0100, Chris Bowditch wrote:
 On 19/10/2011 19:32, Simon Pepping wrote:
 
 Hi Simon,
 
 I think you misunderstood my mail. I don't want to stop the merge. I
 simply thought it was an appropriate time to discuss some concerns
 that Vincent and Peter had identified. You are preaching to the
 converted about the need for supporting Complex scripts. It is an
 urgent requirement for us too.
 
 If we don't discuss our concerns over the code now, then when do we
 discuss it?
 
 Vincent and Peter will be replying to this thread shortly and will
 outline their primary concerns then.
 
 Thanks,
 
 Chris
 


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Chris Bowditch

On 21/10/2011 09:36, Simon Pepping wrote:

Hi Simon,


I am pleased to learn that you are also in need of this new
functionality.

I share some of Vincent and Peter's concerns about technical points of
the code. On the other hand, this is the only implementation of
complex scripts we have, created by Glenn, in the style of Glenn. It
is an initial implementation, and it is normal that it requires
further work, maybe even design changes to make it more flexible. Does
keeping it in a branch make that further work easier? Merging it into
trunk will enhance its visibility, and make it available to more
users.


I'm not opposing the merge, I simply saw it as an appropriate milesone 
at which to open the debate on our concerns. It feels like we are making 
some progress here, so thanks for helping the debate along. I would 
really like to see an acknowledgement from Glenn that there are some 
imperfections that need addressing. If I saw that then I would give my 
full backing, but even without that I would vote +0 for the merge for 
the reasons you highlight above.


Thanks,

Chris



Simon

On Thu, Oct 20, 2011 at 02:02:10PM +0100, Chris Bowditch wrote:

On 19/10/2011 19:32, Simon Pepping wrote:

Hi Simon,

I think you misunderstood my mail. I don't want to stop the merge. I
simply thought it was an appropriate time to discuss some concerns
that Vincent and Peter had identified. You are preaching to the
converted about the need for supporting Complex scripts. It is an
urgent requirement for us too.

If we don't discuss our concerns over the code now, then when do we
discuss it?

Vincent and Peter will be replying to this thread shortly and will
outline their primary concerns then.

Thanks,

Chris







Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Glenn Adams
Chris,

I would really like to see an acknowledgement from Glenn that there are
some imperfections that need addressing.
I wasn't aware I had given anyone the impression of presenting a perfect
submission. In fact, one of my favorite quotes is Voltaire's  *le mieux est
l'ennemi du bien* the best [perfect] is the enemy of the good, which a
former colleague Charlie Sandbank (now deceased) used to love to cite at
least once a day in ITU committee meetings.

My coding philosophy is by and large based on a step-wise refinement
process:

(1) make it (some subset of possible features) work
(2) make sure its correct (and regression testable) - or if following BDD,
then do this first
(3) make it (more) understandable/maintainable, i.e., re-factor, improve
comments, documentation, etc
(4) optionally, make it faster and smaller
(5) optionally, add more features
(6) go to (1)

Right now, I've finished steps (1) and (2) to a certain degree for an
initial set of features, a degree that I think is sufficient to merit moving
it into trunk. I have not yet seriously addressed (3) through (6). It would
seem strange to expect that all these points have been addressed at this
point in the process.

If this work goes into the trunk, it will provide greater exposure to help
drive and prioritize the remaining steps. There are trade-offs in time and
money about how to spend my effort on steps (3) to (6). I have a well
defined set of issues already waiting for item (5) [1], so, post merge, I
can spend my time on these features or work on (3) or (4), or could attempt
to divide my time between them. The bottom line is that it is a process that
started some time ago and will continue for an indefinite time into the
future. The current request for merging is just one step in that process.

I expect to be maintaining this code and feature set for the indefinite
future, according to the desires of my sponsor, Basis Technologies, who has
a definite interest in the production use of an internationalized FOP, as
well as others who have expressed a similar interest. As a consequence,
there is little chance that any other FOP dev is going to have to work on
this code any time soon. Of course, if they want to do so, I would certainly
welcome community contributions to additional features, testing,
optimization, documentation, etc., in this area. I have no personal need to
*own* this code if others wish to help, and I certainly encourage it;
however, at the same time, I do have a sponsor who is willing to continue
investing in improving FOP in this area, and that willingness should not be
discounted.

[1] http://skynav.trac.cvsdude.com/fop/report/1

Regarding the comments about line length, file length etc., I will note
that, at least with line length, I have maintained the existing (but
arbitrary) limit of 100 in existing files. In the case of new files, I have
chosen to use a longer line length that works better for my development
process. I use GNU EMACS on a wide-screen MacBook Pro that has an effective
width of 200 columns, and which, when this limit is exceeded, wraps the line
(on display only). I find that arbitrary line lengths like 100 curtail my
coding style, and as I've been coding for 40+ years, it's a pretty well
established style (though back in the days when I wrote Fortran IV using an IBM
026 card 
punchhttp://en.wikipedia.org/wiki/IBM_026#IBM_024.2C_026_Card_Punches,
I had to stick with 80 columns). [If line length followed Moore's Law, we
would be using lines of length 1760, starting from 1967 with 80 columns.]

By the way, I would note that the FOP Coding Conventions [2] do not specify
a maximum line length. In searching through the FOP DEV archives, I also
don't see an explicit discussion about line length (though I may have missed
it). What I do see is the initial introduction of the checkstyle target by
Jeremias in 2002 [4], where it appears he basically adjusted the checkstyle
rules to pass most of the extant code at that time.

[2] http://xmlgraphics.apache.org/fop/dev/conventions.html
[3]
http://marc.info/?l=fop-devw=2r=1s=%2Bcheckstyle+%2Bline+%2Blengthq=b
[4] http://marc.info/?l=fop-devm=103124169719869w=2

My opinion about the various length limits (parameter count, method length,
file length, line length) as reported by checkstyle is that these should
interpreted as guidelines, and not hard rules. In the case of existing
files, it makes sense to attempt to adhere to them when possible (and I have
done this), while recognizing that some exceptions are inevitable. In the
case of new files, I agree they are reasonable targets, but I would not
readily agree to slavish adherence. Some latitude should be afforded to
individual coders, particularly when they are adding a substantial amount of
code in new files.

Regarding identifier length, neither the FOP coding conventions [2] nor
checkstyle indicates or checks for any kind of limitation; so I will
respectfully decline to change my code based on other author's 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Clay Leeds
Quick question about this.

Please forgive my naïveté but, does this code affect processing
if you're not using ComplexScript support?

Thanks,

Clay

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Vincent Hennebert
On 21/10/11 09:36, Simon Pepping wrote:
 I am pleased to learn that you are also in need of this new
 functionality.
 
 I share some of Vincent and Peter's concerns about technical points of
 the code. On the other hand, this is the only implementation of
 complex scripts we have, created by Glenn, in the style of Glenn. It
 is an initial implementation, and it is normal that it requires
 further work, maybe even design changes to make it more flexible. Does
 keeping it in a branch make that further work easier? Merging it into
 trunk will enhance its visibility, and make it available to more
 users.

If it’s merged into Trunk, anyone who makes changes to the Trunk that
break the Complex Scripts feature will have to fix what is breaking. And
for the reasons I’ve given earlier, I believe that this would put too
much of a burden on developers.


 Simon
 
 On Thu, Oct 20, 2011 at 02:02:10PM +0100, Chris Bowditch wrote:
 On 19/10/2011 19:32, Simon Pepping wrote:

 Hi Simon,

 I think you misunderstood my mail. I don't want to stop the merge. I
 simply thought it was an appropriate time to discuss some concerns
 that Vincent and Peter had identified. You are preaching to the
 converted about the need for supporting Complex scripts. It is an
 urgent requirement for us too.

 If we don't discuss our concerns over the code now, then when do we
 discuss it?

 Vincent and Peter will be replying to this thread shortly and will
 outline their primary concerns then.

 Thanks,

 Chris


Vincent


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Chris Bowditch

On 21/10/2011 15:13, Glenn Adams wrote:

Chris,


Hi Glenn,



 I would really like to see an acknowledgement from Glenn that there 
are some imperfections that need addressing.


I wasn't aware I had given anyone the impression of presenting a 
perfect submission. In fact, one of my favorite quotes is Voltaire's 
/le mieux est l'ennemi du bien/ the best [perfect] is the enemy of 
the good, which a former colleague Charlie Sandbank (now deceased) 
used to love to cite at least once a day in ITU committee meetings.


Many thanks for taking the time to write this long e-mail. I do 
appreciate it. I reached this impression because I see Vincent and Peter 
giving feedback but I've yet to see any of the suggestions actioned. It 
could be that some of their suggestions aren't appropriate, but I do 
believe some of them are good points.




My coding philosophy is by and large based on a step-wise refinement 
process:


(1) make it (some subset of possible features) work
(2) make sure its correct (and regression testable) - or if following 
BDD, then do this first
(3) make it (more) understandable/maintainable, i.e., re-factor, 
improve comments, documentation, etc

(4) optionally, make it faster and smaller
(5) optionally, add more features
(6) go to (1)

Right now, I've finished steps (1) and (2) to a certain degree for an 
initial set of features, a degree that I think is sufficient to merit 
moving it into trunk. I have not yet seriously addressed (3) through 
(6). It would seem strange to expect that all these points have been 
addressed at this point in the process.


Thanks for clarifying. Just to be clear, I didn't mean to say that 
reaching the end of development was a pre-requisite to merging to trunk. 
It just seemed like a major milestone and therefore seemed like a 
suitable prompt for the opening of a discussion about our concerns.




If this work goes into the trunk, it will provide greater exposure to 
help drive and prioritize the remaining steps. There are trade-offs in 
time and money about how to spend my effort on steps (3) to (6). I 
have a well defined set of issues already waiting for item (5) [1], 
so, post merge, I can spend my time on these features or work on (3) 
or (4), or could attempt to divide my time between them. The bottom 
line is that it is a process that started some time ago and will 
continue for an indefinite time into the future. The current request 
for merging is just one step in that process.


That makes sense to me.



I expect to be maintaining this code and feature set for the 
indefinite future, according to the desires of my sponsor, Basis 
Technologies, who has a definite interest in the production use of an 
internationalized FOP, as well as others who have expressed a similar 
interest. As a consequence, there is little chance that any other FOP 
dev is going to have to work on this code any time soon. Of course, if 
they want to do so, I would certainly welcome community contributions 
to additional features, testing, optimization, documentation, etc., in 
this area. I have no personal need to *own* this code if others wish 
to help, and I certainly encourage it; however, at the same time, I do 
have a sponsor who is willing to continue investing in improving FOP 
in this area, and that willingness should not be discounted.


Since Thunderhead also needs this feature we are willing to invest some 
time into it too. Currently my team are telling me it would take 9 
person months to port this code into our branch of FOP, partly because 
of some merge conflicts, but also partly because we are not comfortable 
with some aspects of the code as it has already been pointed out. The 
estimate would include the time to refactor long files into small ones, 
deal with the variable names and restructuring the code.




[1] http://skynav.trac.cvsdude.com/fop/report/1

Regarding the comments about line length, file length etc., I will 
note that, at least with line length, I have maintained the existing 
(but arbitrary) limit of 100 in existing files. In the case of new 
files, I have chosen to use a longer line length that works better for 
my development process. I use GNU EMACS on a wide-screen MacBook Pro 
that has an effective width of 200 columns, and which, when this limit 
is exceeded, wraps the line (on display only). I find that arbitrary 
line lengths like 100 curtail my coding style, and as I've been coding 
for 40+ years, it's a pretty well established style (though back in 
the days when I wrote Fortran IV using an IBM 026 card punch 
http://en.wikipedia.org/wiki/IBM_026#IBM_024.2C_026_Card_Punches, I 
had to stick with 80 columns). [If line length followed Moore's Law, 
we would be using lines of length 1760, starting from 1967 with 80 
columns.]


I personally don't have a problem with line length, but I think the File 
length is a small issue that we would like to address. I think the code 
would be easier to maintain if the larger classes were broken 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Glenn Adams
inline

On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch bowditch_ch...@hotmail.com
 wrote:


 Since Thunderhead also needs this feature we are willing to invest some
 time into it too. Currently my team are telling me it would take 9 person
 months to port this code into our branch of FOP, partly because of some
 merge conflicts, but also partly because we are not comfortable with some
 aspects of the code as it has already been pointed out. The estimate would
 include the time to refactor long files into small ones, deal with the
 variable names and restructuring the code.


I would advise against this, since it would it is functionally unnecessary
and since it will make future merges more difficult. I will be making
additional changes as more features in this area are added.

I don't see what refactoring long files into small ones buys you. However,
if you make a reasoned argument for factoring specific long files (i.e., why
such factoring improves architecture, modularity, etc), rather than simply
say all long files must be refactored, then I will seriously discuss doing
so post merge.


 I appreciate your commitment to add comments to short identifiers
 declarations, so at least it will be easier for the team to translate the
 short variables to longer equivalents. Just so we are clear on what you
 propose, do you mean this:

 int gi = 0; // Glyph Index


Yes. Note that I already do this in most cases, such as:

private static void resolveExplicit ( int[] wca, int defaultLevel, int[] ea
) {
int[] es = new int [ MAX_LEVELS ];  /* embeddings stack */
int ei = 0; /* embeddings stack index */
int ec = defaultLevel;  /* current embedding level
*/
for ( int i = 0, n = wca.length; i  n; i++ ) {
int bc = wca [ i ]; /* bidi class of current
char */
int el; /* embedding level to assign
to current char */
switch ( bc ) {
case LRE:   // start left-to-right
embedding
case RLE:   // start right-to-left
embedding
 case LRO:   // start left-to-right
override
case RLO:   // start right-to-left
override
{
int en; /* new embedding level */
if ( ( bc == RLE ) || ( bc == RLO ) ) {
en = ( ( ec  ~OVERRIDE ) + 1 ) | 1;
} else {
en = ( ( ec  ~OVERRIDE ) + 2 )  ~1;
}
if ( en  ( MAX_LEVELS + 1 ) ) {
es [ ei++ ] = ec;
if ( ( bc == LRO ) || ( bc == RLO ) ) {
ec = en | OVERRIDE;
} else {
ec = en  ~OVERRIDE;
}
} else {
// max levels exceeded, so don't change level or
override
}
el = ec;
break;
}
...

What I'm agreeing to do in the relative near future (after merge, before new
release) is to add such comments to those places where I have not already
done so, which are probably a minority of such cases.

G.


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Simon Pepping
Jonathan,

Obviously, FOP's strongest supporters over the past years do not
require this new functionality. FOP needs the additional support of
new stakeholders of this new functionality. Could your teams test it
on their documents and report their findings to the fop-user email
list?

Simon Pepping

On Wed, Oct 19, 2011 at 03:20:40PM -0400, Jonathan Levinson wrote:
 We -- at InterSystems -- deploy an application that runs in upwards of 40 
 countries, using many of the languages for which complex script support is 
 required.
 
 We definitely need complex script support.  It is a requirement for us.


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Chris Bowditch

On 19/10/2011 19:32, Simon Pepping wrote:

Hi Simon,

I think you misunderstood my mail. I don't want to stop the merge. I 
simply thought it was an appropriate time to discuss some concerns that 
Vincent and Peter had identified. You are preaching to the converted 
about the need for supporting Complex scripts. It is an urgent 
requirement for us too.


If we don't discuss our concerns over the code now, then when do we 
discuss it?


Vincent and Peter will be replying to this thread shortly and will 
outline their primary concerns then.


Thanks,

Chris


Over the past ten years computing has pervaded life in all its facets,
and spread over the world. As a consequence computing is now used in
all languages and all scripts.

When I open my devanagari test file in emacs, it just works. When I
open it in firefox, it just works. The same when I open it in
LibreOffice Writer. I am sure that, if I would open it in *the* *Word*
processor, it would just work. When I process the file with FOP, it
does not work. With the complex scripts functionality, it works,
dependent on the use of supported or otherwise suitable fonts. (That
is true for all above applications, but apparently those come
configured with my system.)

So what does a person do who believes in the XML stack to maintain his
documentation, and wants to send his documents in Hindi to his
customers? See that XSL-FO with FOP is not a suitable solution for him
because Hindi uses a complex script?

FOP needs the complex scripts functionality to remain a player in the
global playing field.

This is for me the single overarching consideration to want this
functionality in FOP's trunk code, and in, say, half a year in a
release. All other considerations are minor, unless one wants to claim
that this code will block FOP's further development and maintenance in
the coming years.

Of course, not everybody needs this functionality, and there is a fear
of increased maintenance overhead. But the question is: For whom do we
develop FOP? Also for the large part of the world that uses complex
scripts?

With the development of the complex scripts functionality, Glenn Adams
and his sponsor Basis Technologies have created a new reality, which
is not going to go away. If this functionality does not end up in FOP,
it will probably live on elsewhere. If the FOP team is fine with that,
say no to the merge request, and feel comfortable with a trusted niche
application.

Simon Pepping

On Wed, Oct 19, 2011 at 09:50:24AM +0100, Chris Bowditch wrote:

On 18/10/2011 19:55, Simon Pepping wrote:

I merged the ComplexScripts branch into trunk. Result:

Hi Simon,

As well of the question of how to do the merge there is also the
question should we do the merge? Of course this is a valuable
feature to the community, and Glenn has invested a lot of time in
its development but is it truely production ready? I asked Vincent
to take a look at the branch earlier in the year as it's a feature
we also need, but he had several concerns that have not be
adequately answered. Take a look at comment #30;
https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30

I'm not sure why Vincent describes it as a brief look because he
spent several days on it. I also asked Peter to take a look and he
had similar concerns. 2 or 3 letter variable names are a barrier for
any committer wanting to maintain this code and I don't think it is
a sufficient argument to say that a pre-requisite to maintaining
this code is to be a domain expert. I would hope that any
experienced committer with a debugger should be able to solve some
bugs. Obviously certain problems will require domain expertise, but
the variables names are a key barrier to being able to maintain this
code.

I realise my comments might be a little controversial and I don't
mean any disrespect to Glenn or his work (which is largely
excellent), but we should at least discuss these topics before the
merge is completed.






Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Vincent Hennebert
The Complex Scripts feature is obviously a great enhancement and we
would all love to have it implemented in FOP. However, that should not
come at the expense of maintainability and the implementation of other
new features.

When I look at the code in the Temp_ComplexScripts branch, I have
serious concerns regarding the latter two points. I would oppose merging
the branch back to Trunk until those are resolved.

Here are the sizes of some new files:
1075 src/java/org/apache/fop/fonts/GlyphSequence.java
1089 src/java/org/apache/fop/fonts/GlyphProcessingState.java
1269 src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
2034 src/java/org/apache/fop/layoutmgr/BidiUtil.java
3449 test/java/org/apache/fop/complexscripts/util/TTXFile.java

This latter one contains more than 50 field
declarations, and the Handler.startElement method alone is more than
1800 lines long.

Also, the o.a.f.fonts.truetype.TTFFile class has now grown to
5502 lines. That’s 3 times its original size which was already too big.
I regularly find myself looking at bits of this class, and I would be
unable to do so on a 5500 line class.

I don’t think it needs to be explained why big classes are undesirable?

Also, most files disable Checkstyle checks, the most important ones
being line length and white space. Many files have too long lines which
makes it a pain to read through, having to horizontally scroll all the
time. We agreed on a certain coding style in the project and it would be
good if new code could adhere to it.

Speaking of variable names, here is a method picked from
o.a.f.fonts.GlyphSequence:
/**
 * Merge overlapping and abutting sub-intervals.
 */
private static int[] mergeIntervals ( int[] ia ) {
int ni = ia.length;
int i, n, nm, is, ie;
// count merged sub-intervals
for ( i = 0, n = ni, nm = 0, is = ie = -1; i  n; i += 2 ) {
int s = ia [ i + 0 ];
int e = ia [ i + 1 ];
if ( ( ie  0 ) || ( s  ie ) ) {
is = s;
ie = e;
nm++;
} else if ( s = is ) {
if ( e  ie ) {
ie = e;
}
}
}
int[] mi = new int [ nm * 2 ];
// populate merged sub-intervals
for ( i = 0, n = ni, nm = 0, is = ie = -1; i  n; i += 2 ) {
int s = ia [ i + 0 ];
int e = ia [ i + 1 ];
int k = nm * 2;
if ( ( ie  0 ) || ( s  ie ) ) {
is = s;
ie = e;
mi [ k + 0 ] = is;
mi [ k + 1 ] = ie;
nm++;
} else if ( s = is ) {
if ( e  ie ) {
ie = e;
}
mi [ k - 1 ] = ie;
}
}
return mi;
}

Now I fully appreciate that one has to have some knowledge of an area to
understand code relating to that area, but no level of expertise,
however high, will help me to quickly understand this code. This is just
too easy to mistake one variable for another one when they differ by
only one letter.

Combined, these would make the code very hard to maintain by anyone
other than the original author, and this is why I’m opposed to merging
it to Trunk in its current form. When I commit code I do my very best to
make it as easy to read and understand by other people, and I would
really appreciate it I could have the same in return.


Thanks,
Vincent


On 19/10/11 19:32, Simon Pepping wrote:
 Over the past ten years computing has pervaded life in all its facets,
 and spread over the world. As a consequence computing is now used in
 all languages and all scripts.
 
 When I open my devanagari test file in emacs, it just works. When I
 open it in firefox, it just works. The same when I open it in
 LibreOffice Writer. I am sure that, if I would open it in *the* *Word*
 processor, it would just work. When I process the file with FOP, it
 does not work. With the complex scripts functionality, it works,
 dependent on the use of supported or otherwise suitable fonts. (That
 is true for all above applications, but apparently those come
 configured with my system.)
 
 So what does a person do who believes in the XML stack to maintain his
 documentation, and wants to send his documents in Hindi to his
 customers? See that XSL-FO with FOP is not a suitable solution for him
 because Hindi uses a complex script?
 
 FOP needs the complex scripts functionality to remain a player in the
 global playing field.
 
 This is for me the single overarching consideration to want this
 functionality in FOP's trunk code, and in, say, half a year in a
 release. All other considerations are minor, unless one wants to claim
 that this code will block FOP's further development and maintenance in
 the coming years.
 
 Of course, not everybody needs this functionality, and there is a fear
 of increased maintenance overhead. But the question is: For 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Peter Hancock
This is a tough one.

The need for complex script support in FOP is likely high on the wish
list of any global supporter of FOP and it is certainly in the
interest of the project to support. The amount of work that Glenn has
done is considerable: the volume of code, the test coverage and the
obvious depth of domain knowledge.

From the surface I would have been very much in favor of supporting a
merge in the near future, however I have had the chance to review some
areas of the complex script branch and I have some concerns.
The treatment of Unicode Bidi spans from the creation of the FO Tree
through to the construction of the Area Tree and I would have liked to
have seen the complex scripts solution integrate the Unicode Bidi
Algorithm more directly into the core process:  For example, the
implementation performs a post process on the FO Tree to resolve the
Bidi properties of FONodes relating to text. It would be preferable to
see the construction of the FO Tree embracing this Bidi aspect:
FONodes should be responsible for determining their own bidi state
from the fo node semantics in context to the position in the tree.
Such an implementation would immediately force the maintainer to
consider how a change would effect the Bidi process.

The layout and area tree construction phase interact a little more
directly  with the Bidi process, however most of the Bidi work is
delegated to static methods of a utility class (..layoutmgr.BidiUtil,
also used heavily in the Bidi post-process of the FO Tree) and
consequently require many instanceof expressions because of
differences in Bidi behavior between the Area classes.  Areas should
be responsible for the character re-ordering process.

Having this functionality in FOP is desirable and I would encourage
steps to address these concerns, and those outlined by Chris and
Vincent.

Peter

On Thu, Oct 20, 2011 at 2:53 PM, Vincent Hennebert vhenneb...@gmail.com wrote:
 The Complex Scripts feature is obviously a great enhancement and we
 would all love to have it implemented in FOP. However, that should not
 come at the expense of maintainability and the implementation of other
 new features.

 When I look at the code in the Temp_ComplexScripts branch, I have
 serious concerns regarding the latter two points. I would oppose merging
 the branch back to Trunk until those are resolved.

 Here are the sizes of some new files:
 1075 src/java/org/apache/fop/fonts/GlyphSequence.java
 1089 src/java/org/apache/fop/fonts/GlyphProcessingState.java
 1269 
 src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
 2034 src/java/org/apache/fop/layoutmgr/BidiUtil.java
 3449 test/java/org/apache/fop/complexscripts/util/TTXFile.java

 This latter one contains more than 50 field
 declarations, and the Handler.startElement method alone is more than
 1800 lines long.

 Also, the o.a.f.fonts.truetype.TTFFile class has now grown to
 5502 lines. That’s 3 times its original size which was already too big.
 I regularly find myself looking at bits of this class, and I would be
 unable to do so on a 5500 line class.

 I don’t think it needs to be explained why big classes are undesirable?

 Also, most files disable Checkstyle checks, the most important ones
 being line length and white space. Many files have too long lines which
 makes it a pain to read through, having to horizontally scroll all the
 time. We agreed on a certain coding style in the project and it would be
 good if new code could adhere to it.

 Speaking of variable names, here is a method picked from
 o.a.f.fonts.GlyphSequence:
    /**
     * Merge overlapping and abutting sub-intervals.
     */
    private static int[] mergeIntervals ( int[] ia ) {
        int ni = ia.length;
        int i, n, nm, is, ie;
        // count merged sub-intervals
        for ( i = 0, n = ni, nm = 0, is = ie = -1; i  n; i += 2 ) {
            int s = ia [ i + 0 ];
            int e = ia [ i + 1 ];
            if ( ( ie  0 ) || ( s  ie ) ) {
                is = s;
                ie = e;
                nm++;
            } else if ( s = is ) {
                if ( e  ie ) {
                    ie = e;
                }
            }
        }
        int[] mi = new int [ nm * 2 ];
        // populate merged sub-intervals
        for ( i = 0, n = ni, nm = 0, is = ie = -1; i  n; i += 2 ) {
            int s = ia [ i + 0 ];
            int e = ia [ i + 1 ];
            int k = nm * 2;
            if ( ( ie  0 ) || ( s  ie ) ) {
                is = s;
                ie = e;
                mi [ k + 0 ] = is;
                mi [ k + 1 ] = ie;
                nm++;
            } else if ( s = is ) {
                if ( e  ie ) {
                    ie = e;
                }
                mi [ k - 1 ] = ie;
            }
        }
        return mi;
    }

 Now I fully appreciate that one has to have some knowledge of an area to
 understand code relating to that area, but no level of expertise,
 however high, will help me to quickly 

RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Jonathan Levinson
Hi Simon,

I've contacted my management and asked what our teams can do to help test.  I 
report to our development not to our quality departments, and I can't speak for 
our quality departments.  I've contacted our international teams about what 
they can do to help test.

The bottom-line to our testing is that when a new FOP is released, we test our 
software on the new FOP, and testing is done with every application that uses 
FOP.  With an international FOP, international applications would be tested.  
That much is guaranteed by our normal testing process.  It is a tribute to the 
quality of FOP that we have never had to report a FOP issue, even though our 
reports can be quite complicated.  But I take it you would like us to get 
involved with the testing effort before a new FOP is released.  When you are 
discussing our involvement, are you discussing our testing the FOP that results 
from the merger onto the trunk, once that is accomplished?   I understand you 
are ironing out the details of what that merger would look like. 

You said bug reports should go to fop-users, but isn't it the case that .fo 
attachments won't be accepted by fop-users?  Don't bug reports have to be 
created through Bugzilla?  We can discuss what we see in terms of bugs on 
fop-users, but if we can't provide .fo files, won't our discussion be less 
helpful?

Best Regards,
Jonathan Levinson
Senior Software Developer
Object Group
InterSystems
+1 617-621-0600
jonathan.levin...@intersystems.com


 -Original Message-
 From: Simon Pepping [mailto:spepp...@leverkruid.eu]
 Sent: Thursday, October 20, 2011 3:19 AM
 To: fop-dev@xmlgraphics.apache.org
 Subject: Re: Merge Request - Temp_ComplexScripts into Trunk
 
 Jonathan,
 
 Obviously, FOP's strongest supporters over the past years do not require this
 new functionality. FOP needs the additional support of new stakeholders of 
 this
 new functionality. Could your teams test it on their documents and report 
 their
 findings to the fop-user email list?
 
 Simon Pepping
 
 On Wed, Oct 19, 2011 at 03:20:40PM -0400, Jonathan Levinson wrote:
  We -- at InterSystems -- deploy an application that runs in upwards of 40
 countries, using many of the languages for which complex script support is
 required.
 
  We definitely need complex script support.  It is a requirement for us.



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Chris Bowditch

On 18/10/2011 19:55, Simon Pepping wrote:

I merged the ComplexScripts branch into trunk. Result:


Hi Simon,

As well of the question of how to do the merge there is also the 
question should we do the merge? Of course this is a valuable feature to 
the community, and Glenn has invested a lot of time in its development 
but is it truely production ready? I asked Vincent to take a look at the 
branch earlier in the year as it's a feature we also need, but he had 
several concerns that have not be adequately answered. Take a look at 
comment #30; https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30


I'm not sure why Vincent describes it as a brief look because he spent 
several days on it. I also asked Peter to take a look and he had similar 
concerns. 2 or 3 letter variable names are a barrier for any committer 
wanting to maintain this code and I don't think it is a sufficient 
argument to say that a pre-requisite to maintaining this code is to be a 
domain expert. I would hope that any experienced committer with a 
debugger should be able to solve some bugs. Obviously certain problems 
will require domain expertise, but the variables names are a key barrier 
to being able to maintain this code.


I realise my comments might be a little controversial and I don't mean 
any disrespect to Glenn or his work (which is largely excellent), but we 
should at least discuss these topics before the merge is completed.


Thanks

Chris



--- Merging r981451 through r1185769 into '.':

Summary of conflicts:
   Text conflicts: 58
   Tree conflicts: 126

Most tree conflicts are probably an artifact of subversion. See

svn info lib/xmlgraphics-commons-1.5svn.jar|tail -n 4

Tree conflict: local add, incoming add upon merge
   Source  left: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk/lib/xmlgraphics-commons-1.5svn.jar@981450
   Source right: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_ComplexScripts/lib/xmlgraphics-commons-1.5svn.jar@1185769

This will cause quite some work.

I also merged trunk into ComplexScripts. Result:

--- Merging r1177231 through r1185780 into '.':

Summary of conflicts:
   Text conflicts: 2
   Tree conflicts: 2

I resolved the text conflicts easily. Again the tree conflicts were
not real conflicts.

Both merges should result in the same code: trunk + ComplexScripts.

I did not commit the merge of trunk into ComplexScripts to the
repository. I do not think it would facilitate merging ComplexScripts
into trunk.

Simon

On Sat, Oct 15, 2011 at 06:17:49PM +0800, Glenn Adams wrote:

With this latest patch, I am satisfied that there is sufficient testing and
stability in the CS branch to support its merger into trunk. Therefore, I
request that such a merge be accomplished after applying patch 5 to the CS
branch as described below.






Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Glenn Adams
I provided a detailed comment on Vincent's brief review at:

https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c31

With the exception of the the following comment, the remaining comments are
editorial in nature or have no actionable response.

 How feasible is it to run the BIDI algorithm on a subset of the FO tree? If 
 I'm
 right, ATM it is launched on a whole page-sequence. When we refactor the code
 so that layout can be started before a whole page-sequence has been parsed, in
 order to avoid the infamous memory issue that a lot of users are running into,
 will that code allow to do it?

Regarding the point(s) at which the bidi algorithm is invoked, I agree there
are possible, alternative points of invocation. Some of which may distribute
the cost of its performance as opposed to concentrating that work in one
call. However, even if there are alternatives, it is not clear whether they
are desirable, and whether it would have any impact on memory usage or time
performance. Further, when we refactor the code is an unknown, future
possibility, and making a substantial change at this time in order to
perform certain optimizations that only come into play in the eventuality
that such code refactoring occurs seems to be an example of premature
optimization.

As regards to symbol name length, I would merely cite that the FOP Coding
Conventions http://xmlgraphics.apache.org/fop/dev/conventions.html do not
specify length of an identifier (either short or long) as an established
convention. Let's not use this valuable contribution as a *cause célèbre* to
argue or establish new conventions that do not exist. Otherwise we will soon
be arguing over whether i, j, and k are reasonable identifiers for
enumeration variables.

You ask about whether this merge is truly production ready? Is this a
requirement for doing a merge from a temporary branch? How would you define
truly production ready? Is FOP itself truly production ready? I've spend the
last couple of months creating over 500,000 tests. I wonder if any other
part of FOP is so thoroughly tested? Frankly speaking, I would like to put
even more tests into place, and I will over time, but I'm not going to do
that until the current work is merged.

If there is a substantial bug or regression that would be caused by this
merger, then I'd be happy to address that problem before merging. I look
forward to any report of this nature.

If folks wish to keep pushing the issue of symbol name length, then please
enumerate, by source file and line number each case to which is objected.
Also, please provide objective comparative data showing how the reported
data diverges from statistically similar usage elsewhere in FOP. Please also
define an objective measure of the level of domain knowledge needed for
working with specific code in this merge for which a comment about symbol
length is an issue. In particular, please show how a reader skilled in the
arts of the code being considered would not be able to readily infer the
meaning of a specific symbol in cases where I have not explicitly provided a
comment at the place of definition.

G.

On Wed, Oct 19, 2011 at 4:50 PM, Chris Bowditch
bowditch_ch...@hotmail.comwrote:

 On 18/10/2011 19:55, Simon Pepping wrote:

 I merged the ComplexScripts branch into trunk. Result:


 Hi Simon,

 As well of the question of how to do the merge there is also the question
 should we do the merge? Of course this is a valuable feature to the
 community, and Glenn has invested a lot of time in its development but is it
 truely production ready? I asked Vincent to take a look at the branch
 earlier in the year as it's a feature we also need, but he had several
 concerns that have not be adequately answered. Take a look at comment #30;
 https://issues.apache.org/**bugzilla/show_bug.cgi?id=**49687#c30https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30

 I'm not sure why Vincent describes it as a brief look because he spent
 several days on it. I also asked Peter to take a look and he had similar
 concerns. 2 or 3 letter variable names are a barrier for any committer
 wanting to maintain this code and I don't think it is a sufficient argument
 to say that a pre-requisite to maintaining this code is to be a domain
 expert. I would hope that any experienced committer with a debugger should
 be able to solve some bugs. Obviously certain problems will require domain
 expertise, but the variables names are a key barrier to being able to
 maintain this code.

 I realise my comments might be a little controversial and I don't mean any
 disrespect to Glenn or his work (which is largely excellent), but we should
 at least discuss these topics before the merge is completed.

 Thanks

 Chris



 --- Merging r981451 through r1185769 into '.':

 Summary of conflicts:
   Text conflicts: 58
   Tree conflicts: 126

 Most tree conflicts are probably an artifact of subversion. See

 svn info lib/xmlgraphics-commons-1.**5svn.jar|tail -n 4

 Tree 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Pascal Sancho
HI,

IMHO, Production ready should only cited before a FOP release, not for
a merge of branch to trunk.
At this stage, the only questions are about regression tests (and code
readability, since open source).

Merging the branch now should encourage more users to test these new
features and give feedback.

Le 19/10/2011 14:25, Glenn Adams a écrit :
 You ask about whether this merge is truly production ready? Is this a
 requirement for doing a merge from a temporary branch? How would you
 define truly production ready? Is FOP itself truly production
 ready? I've spend the last couple of months creating over 500,000 tests.
 I wonder if any other part of FOP is so thoroughly tested? Frankly
 speaking, I would like to put even more tests into place, and I will
 over time, but I'm not going to do that until the current work is merged.

-- 
Pascal


RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Jonathan Levinson
We -- at InterSystems -- deploy an application that runs in upwards of 40 
countries, using many of the languages for which complex script support is 
required.

We definitely need complex script support.  It is a requirement for us.

Thanks,
Jonathan Levinson
Senior Software Developer
Object Group
InterSystems
+1 617-621-0600
jonathan.levin...@intersystems.com


 -Original Message-
 From: Simon Pepping [mailto:spepp...@leverkruid.eu]
 Sent: Wednesday, October 19, 2011 2:32 PM
 To: fop-dev@xmlgraphics.apache.org
 Subject: Re: Merge Request - Temp_ComplexScripts into Trunk
 
 Over the past ten years computing has pervaded life in all its facets, and 
 spread
 over the world. As a consequence computing is now used in all languages and 
 all
 scripts.
 
 When I open my devanagari test file in emacs, it just works. When I open it in
 firefox, it just works. The same when I open it in LibreOffice Writer. I am 
 sure
 that, if I would open it in *the* *Word* processor, it would just work. When I
 process the file with FOP, it does not work. With the complex scripts
 functionality, it works, dependent on the use of supported or otherwise 
 suitable
 fonts. (That is true for all above applications, but apparently those come
 configured with my system.)
 
 So what does a person do who believes in the XML stack to maintain his
 documentation, and wants to send his documents in Hindi to his customers? See
 that XSL-FO with FOP is not a suitable solution for him because Hindi uses a
 complex script?
 
 FOP needs the complex scripts functionality to remain a player in the global
 playing field.
 
 This is for me the single overarching consideration to want this 
 functionality in
 FOP's trunk code, and in, say, half a year in a release. All other 
 considerations
 are minor, unless one wants to claim that this code will block FOP's further
 development and maintenance in the coming years.
 
 Of course, not everybody needs this functionality, and there is a fear of
 increased maintenance overhead. But the question is: For whom do we develop
 FOP? Also for the large part of the world that uses complex scripts?
 
 With the development of the complex scripts functionality, Glenn Adams and his
 sponsor Basis Technologies have created a new reality, which is not going to 
 go
 away. If this functionality does not end up in FOP, it will probably live on
 elsewhere. If the FOP team is fine with that, say no to the merge request, and
 feel comfortable with a trusted niche application.
 
 Simon Pepping
 
 On Wed, Oct 19, 2011 at 09:50:24AM +0100, Chris Bowditch wrote:
  On 18/10/2011 19:55, Simon Pepping wrote:
  I merged the ComplexScripts branch into trunk. Result:
 
  Hi Simon,
 
  As well of the question of how to do the merge there is also the
  question should we do the merge? Of course this is a valuable feature
  to the community, and Glenn has invested a lot of time in its
  development but is it truely production ready? I asked Vincent to take
  a look at the branch earlier in the year as it's a feature we also
  need, but he had several concerns that have not be adequately
  answered. Take a look at comment #30;
  https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30
 
  I'm not sure why Vincent describes it as a brief look because he
  spent several days on it. I also asked Peter to take a look and he had
  similar concerns. 2 or 3 letter variable names are a barrier for any
  committer wanting to maintain this code and I don't think it is a
  sufficient argument to say that a pre-requisite to maintaining this
  code is to be a domain expert. I would hope that any experienced
  committer with a debugger should be able to solve some bugs. Obviously
  certain problems will require domain expertise, but the variables
  names are a key barrier to being able to maintain this code.
 
  I realise my comments might be a little controversial and I don't mean
  any disrespect to Glenn or his work (which is largely excellent), but
  we should at least discuss these topics before the merge is completed.



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-18 Thread Simon Pepping
I merged the ComplexScripts branch into trunk. Result:

--- Merging r981451 through r1185769 into '.':

Summary of conflicts:
  Text conflicts: 58
  Tree conflicts: 126

Most tree conflicts are probably an artifact of subversion. See
svn info lib/xmlgraphics-commons-1.5svn.jar|tail -n 4
Tree conflict: local add, incoming add upon merge
  Source  left: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk/lib/xmlgraphics-commons-1.5svn.jar@981450
  Source right: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_ComplexScripts/lib/xmlgraphics-commons-1.5svn.jar@1185769

This will cause quite some work.

I also merged trunk into ComplexScripts. Result:

--- Merging r1177231 through r1185780 into '.':

Summary of conflicts:
  Text conflicts: 2
  Tree conflicts: 2

I resolved the text conflicts easily. Again the tree conflicts were
not real conflicts.

Both merges should result in the same code: trunk + ComplexScripts.

I did not commit the merge of trunk into ComplexScripts to the
repository. I do not think it would facilitate merging ComplexScripts
into trunk.

Simon

On Sat, Oct 15, 2011 at 06:17:49PM +0800, Glenn Adams wrote:
 With this latest patch, I am satisfied that there is sufficient testing and
 stability in the CS branch to support its merger into trunk. Therefore, I
 request that such a merge be accomplished after applying patch 5 to the CS
 branch as described below.