Hi, I was made aware of an issue with line wrapping when certain characters are separated by spaces (as shown in testA.pdf). After investigating, I found that it is related to the nextChar() method in the LineBreakStatus class. For the string "? ? ? ? ?", the spaces were being ignored and only the question marks were compared with each other, so every position was classed as a prohibited line break. This works for most ordinary characters because, for example, 'AA' returns an indirect line break from the pair table, whereas '..' or '??' never provides an opportunity to break.
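
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not FOP's code: the class name PairTableSketch, the three reduced classes (AL, EX, SP) and the tiny pair table are assumptions made up for illustration. Only the idea follows the description above: a space sets a flag but never becomes the left-hand class, so the pair lookup only ever sees the characters on either side of it.

```java
import java.util.ArrayList;
import java.util.List;

class PairTableSketch {
    // Hypothetical, heavily reduced set of line-break classes.
    enum Cls { AL, EX, SP }

    // Hypothetical pair-table outcomes (the real table has more).
    enum Action { INDIRECT, PROHIBITED }

    static Cls classify(char c) {
        if (c == ' ') {
            return Cls.SP;
        }
        if (c == '?' || c == '!') {
            return Cls.EX;
        }
        return Cls.AL; // everything else is treated as alphabetic here
    }

    // Toy pair table covering only the two cases discussed above.
    static Action pair(Cls left, Cls right) {
        if (right == Cls.EX) {
            return Action.PROHIBITED; // e.g. '??': never a break opportunity
        }
        return Action.INDIRECT;       // e.g. 'AA': break only across spaces
    }

    /** Returns the indices before which a line break would be allowed. */
    static List<Integer> breakOpportunities(String text) {
        List<Integer> result = new ArrayList<>();
        Cls left = null;
        boolean hadSpace = false;
        for (int i = 0; i < text.length(); i++) {
            Cls current = classify(text.charAt(i));
            if (left == null) {
                left = current; // first character: nothing to compare with
                continue;
            }
            if (current == Cls.SP) {
                hadSpace = true; // remembered, but 'left' is NOT updated
                continue;
            }
            // The space only matters when the pair is an indirect break.
            if (pair(left, current) == Action.INDIRECT && hadSpace) {
                result.add(i);
            }
            left = current;
            hadSpace = false;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(breakOpportunities("A A A")); // [2, 4]
        System.out.println(breakOpportunities("? ? ?")); // []
    }
}
```

With this toy table, "A A A" gets a break opportunity before each word after a space, while "? ? ?" gets none, because the lookup for each question mark is made against the previous question mark and comes back as prohibited.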
After reading through the method (and some of the UAX #14 specification), it
seemed that the fix would be to take spaces into account by assigning their
class to leftClass (the last character), as is done for every other character
the method encounters. While this does break several of our current unit and
render tests, it does seem to resolve the issue and also fixes / improves some
cases of previous line wrapping ([old]TestA -> [new]TestB and [old]TestC ->
[new]TestD). The proposed change is the line marked ++ below:

/* Check 2: current is a mandatory break or space? */
switch (currentClass) {
    case LineBreakUtils.LINE_BREAK_PROPERTY_BK:
    case LineBreakUtils.LINE_BREAK_PROPERTY_LF:
    case LineBreakUtils.LINE_BREAK_PROPERTY_NL:
    case LineBreakUtils.LINE_BREAK_PROPERTY_CR:
        // LB 6: Do not break before a hard break
        leftClass = currentClass;
        return PROHIBITED_BREAK;
    case LineBreakUtils.LINE_BREAK_PROPERTY_SP:
        // LB 7: Do not break before spaces ...
        // Zero-width spaces are in the pair-table (see below)
        hadSpace = true;
++      leftClass = currentClass;
        return PROHIBITED_BREAK;
    default:
        //nop
}

The reason I am posting this first rather than submitting a patch is that the
change affects current FOP output. Secondly, I was wondering whether anyone
familiar with the code could offer their input. I am aware of the hadSpace
variable, but it is only consulted when the break type is not prohibited. If
that boolean check is applied to the prohibited case as well, it really fouls
up the layout. Finally, the comment above the change is a bit confusing: it
mentions zero-width spaces being in the pair table, but if that is the case,
why are the spaces never compared?
Thanks,
Robert Meyer
<?xml version="1.0" encoding="UTF-8"?> <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:fox="http://xmlgraphics.apache.org/fop/extensions" xmlns:svg="http://www.w3.org/2000/svg"> <fo:layout-master-set> <fo:simple-page-master margin-right="1cm" margin-left="1cm" margin-bottom="0.3cm" margin-top="1cm" page-width="21cm" page-height="29.7cm" master-name="all"> <fo:region-body margin-left="0cm" margin-bottom="1cm" margin-right="0cm" margin-top="0cm"/> <fo:region-after extent="0.5cm"/> </fo:simple-page-master> </fo:layout-master-set> <fo:page-sequence format="1" id="th_default_sequence1" master-reference="all"> <fo:static-content flow-name="xsl-region-after"> <fo:block margin-right="0cm" margin-left="0cm" color="#009999" font-size="6pt" font-family="Helvetica">Created using Thunderhead. Visit www.thunderhead.com for more information. </fo:block> </fo:static-content> <fo:flow flow-name="xsl-region-body"> <fo:block> <fo:block>? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?</fo:block> </fo:block> </fo:flow> </fo:page-sequence> </fo:root>
Attachments (Adobe PDF documents): testA.pdf, testB.pdf, testC.pdf, testD.pdf
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:fox="http://xmlgraphics.apache.org/fop/extensions" xmlns:ff="http://xmlgraphics.apache.org/fop/extensions/forms">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="LetterPage" page-width="6.5in" page-height="1.4in">
      <fo:region-body region-name="PageBody" margin="0.1in" background-color="rgb(245,245,245)"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="LetterPage">
    <fo:flow flow-name="PageBody" font="12pt Arial">
      <fo:block border="2pt solid black" space-after="5pt"> The content of this block is split across multiple lines.The content of this block is split ..... .............................................. The content of this block is split across multiple lines. </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
