On 28 Oct 2008, at 13:53, Vincent Hennebert wrote:
A more sophisticated, perhaps overly sophisticated, solution can choose it by looking at the average box length: we can see how many average boxes
fit on a line (wordsPerLine) and compute:

   avgWord = avgBox + LineLayoutManager.DEFAULT_SPACE_WIDTH;
   idealDifference = iLineWidth - (avgWord * (wordsPerLine / 2));

I’m not sure I’m following you here. What’s the value of wordsPerLine?
Is it set manually to a value that’s considered to be a reasonable one?
Because if it’s computed automatically, the formula can be simplified:
   wordsPerLine = lineWidth / avgWord, so
   idealDifference = lineWidth - lineWidth / 2
                   = lineWidth / 2

I compute wordsPerLine as you wrote, but the result differs slightly from the simplified version because I use integers rather than floats, so wordsPerLine * avgWord may differ from lineWidth. But I realize this precision is unnecessary and probably useless.
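
To make the rounding concrete, here is a minimal Java sketch. The class and method names and the sample widths are illustrative assumptions, not FOP code; only the two formulas come from the discussion above.

```java
// Sketch only: names and sample values are assumptions, not FOP code.
final class IdealDifferenceSketch {

    // Original formula, evaluated with int arithmetic.
    static int viaWordsPerLine(int lineWidth, int avgBox, int spaceWidth) {
        int avgWord = avgBox + spaceWidth;
        int wordsPerLine = lineWidth / avgWord;   // integer division truncates
        return lineWidth - (avgWord * (wordsPerLine / 2)); // truncates again
    }

    // Simplified formula: lineWidth - lineWidth / 2.
    static int simplified(int lineWidth) {
        return lineWidth - lineWidth / 2;
    }

    public static void main(String[] args) {
        int lineWidth = 27540;   // assumed line width
        int spaceWidth = 250;    // assumed space width
        // An even and an odd wordsPerLine: the odd case drifts further.
        System.out.println(viaWordsPerLine(lineWidth, 2500, spaceWidth));
        System.out.println(viaWordsPerLine(lineWidth, 2250, spaceWidth));
        System.out.println(simplified(lineWidth));
    }
}
```

With these sample values the gap is small when wordsPerLine is even and larger when it is odd, because wordsPerLine / 2 is truncated as well.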
        

Anyway, the adjustment ratio is already a notion that is independent of the line width; that’s precisely the purpose of a ratio. In the case of
left-justified mode, the only available stretchability is due to the
space at the end of the line; the question is to determine up to how
much we accept that space to be...
Ok, by writing that I think I know what you mean now :-) But the issue
should probably be considered the other way around: the problem is not
so much the adjustment ratio as the amount of space allowed at the end
of the line. In the case of narrow columns, that “3 times the width of
a space character” is too big WRT the line width. Instead of having
a fixed value, it should be changed into a small proportion of the line
width.
Originally, that 3 * space-width value was probably chosen for
“normal” line widths, that is lines containing an optimal amount of
words. I’ve read somewhere that the optimal number of letters per line
is 60. Taking the Times font, the average width of lowercase letters is
459, so the optimal line width roughly is 459*60 = 27540. The width of
the space character is 250, so 3 times a space character at the end of
a line makes 2.7% of that line. So let’s go for an elastic space of 3%
of the line width, and then we can always choose the same adjustment ratio;
the number of active nodes would be “automatically” limited, whatever
the line width.

Good idea!
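
The arithmetic behind the 3% proposal can be checked with a short Java sketch. The constants are the figures quoted above (459 per letter, 250 per space, 60 letters per line); the class, the method names and the narrow-column width are assumptions made for illustration, not FOP API.

```java
// Sketch only: names are illustrative, not FOP API. Widths use the same
// units as the figures quoted in the message.
final class TrailingSpaceSketch {
    static final int AVG_LETTER_WIDTH = 459;        // Times, lowercase average
    static final int SPACE_WIDTH = 250;             // width of a space
    static final int OPTIMAL_LETTERS_PER_LINE = 60;

    // "Normal" line width: 60 average letters.
    static int optimalLineWidth() {
        return AVG_LETTER_WIDTH * OPTIMAL_LETTERS_PER_LINE;
    }

    // Share of the line taken by the fixed 3 * space-width allowance.
    static double fixedShare(int lineWidth) {
        return 3.0 * SPACE_WIDTH / lineWidth;
    }

    // Proposed allowance: 3% of the line width (integer arithmetic).
    static int proportionalStretch(int lineWidth) {
        return lineWidth * 3 / 100;
    }

    public static void main(String[] args) {
        int normal = optimalLineWidth();
        int narrow = normal / 3; // assumed narrow column, for comparison
        for (int width : new int[] {normal, narrow}) {
            System.out.printf("width %d: fixed share %.3f, 3%% allowance %d%n",
                    width, fixedShare(width), proportionalStretch(width));
        }
    }
}
```

For the narrow column the fixed allowance takes a much larger share of the line, while the proportional one shrinks with the column.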

The two-column case is not surprising: the columns are too narrow, which
makes line-breaking particularly challenging. The one-column
left-justified case surprises me a bit, however. I would have expected
that text could be broken without even needing hyphenation. I find it
a bit ironical that justifying text actually is easier for the
line-breaking algorithm...
At any rate, that adjustment ratio of 20 for the last run is surely too
much. It can probably be reduced to 5. Actually, I’m not even sure
a third run with a high adjustment ratio is desirable. Maybe we should
simply re-run the algorithm in forcing mode, and accept the underfull
lines that will be introduced.

I agree.
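
As a rough Java sketch of that control flow: the Attempt interface below is a hypothetical stand-in for one run of the breaking algorithm and does not correspond to any existing FOP class. It assumes the second run is the one that enables hyphenation, and that the same adjustment ratio is used for every run.

```java
// Sketch only: Attempt and breakParagraph are hypothetical, not FOP API.
final class BreakingRunsSketch {

    /** One run of the line breaker; returns true if breaks were found. */
    interface Attempt {
        boolean run(boolean hyphenate, double maxAdjustmentRatio, boolean force);
    }

    // Two normal runs, then a forced run instead of a third run
    // with a very high adjustment ratio.
    static String breakParagraph(Attempt attempt, double maxAdjustmentRatio) {
        if (attempt.run(false, maxAdjustmentRatio, false)) {
            return "first run";
        }
        if (attempt.run(true, maxAdjustmentRatio, false)) {
            return "second run";
        }
        // Forcing mode: accept whatever underfull lines result.
        attempt.run(true, maxAdjustmentRatio, true);
        return "forced run";
    }

    public static void main(String[] args) {
        // Dummy paragraph that can only be broken in forcing mode.
        Attempt hardParagraph = (hyphenate, ratio, force) -> force;
        System.out.println(breakParagraph(hardParagraph, 1.0));
    }
}
```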

If you could run statistics on more real-life documents (how often is
the first run without hyphenation sufficient, the third run required,
justified and left-aligned text, single / two-column on A4 paper, etc),
that would be fantastic.

I have already run these tests, but with paragraphs that are probably longer than normal. I'll send you more realistic reports asap, possibly covering the example FO files in the repository too.


Dario

