http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16870
Summary: Hyphenation bug including bugfix : sporadic mutilation
of hyphenated word
Product: Fop
Version: 0.20.4
Platform: PC
OS/Version: Windows NT/2K
Status: NEW
Severity: Normal
Priority: Other
Component: general
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]
Explanation of bug:
---
Under some circumstances (see below), hyphenated words are mutilated.
For example, the German word Altersvorsorge was SOMETIMES (but not very often)
hyphenated as rsvor-Altesorge.
Reason:
---
Xerces passes text to FOP through characters() calls, each of which exposes a
character buffer that is a 'view window' onto the current document. A single
word (like Altersvorsorge) can therefore be split across two calls to
characters(); in the given example, into Alte and rsvorsorge.
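The splitting can be imitated without a real parser. The following is a minimal sketch, not FOP or Xerces code: the class ChunkedCharactersDemo, its RecordingHandler, and the deliver() helper are made up for illustration, and the split position 4 is chosen only to match the example above. It calls a SAX handler's characters() twice on one buffer, the way a parser may when its buffer boundary falls inside a word.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.helpers.DefaultHandler;

public class ChunkedCharactersDemo {

    /** Records every characters() call as a separate chunk (hypothetical helper). */
    public static class RecordingHandler extends DefaultHandler {
        public final List<String> chunks = new ArrayList<>();

        @Override
        public void characters(char[] ch, int start, int length) {
            chunks.add(new String(ch, start, length));
        }
    }

    /** Simulates a parser whose buffer boundary falls at index splitAt of the text. */
    public static List<String> deliver(String text, int splitAt) {
        RecordingHandler handler = new RecordingHandler();
        char[] buf = text.toCharArray();
        handler.characters(buf, 0, splitAt);                    // first view window
        handler.characters(buf, splitAt, buf.length - splitAt); // second view window
        return handler.chunks;
    }

    public static void main(String[] args) {
        // Prints the two fragments a consumer would see for one word.
        System.out.println(deliver("Altersvorsorge", 4)); // [Alte, rsvorsorge]
    }
}
```

A consumer that treats each chunk as a complete word sees two words here, which is the situation described next.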
FOP adds the first part of the word to the pending areas. This happens in
org\apache\fop\layout\LineArea.java in the method addText(). Xerces delivers
the rest of the word in its second characters() call, which results in a second
call to addText().
In this second call (if hyphenation is set to true) the method doHyphenation()
(also in class LineArea) is called, and it completely ignores pending areas! So
the word fragment rsvorsorge alone is handed over to the hyphenation engine,
which hyphenates this fragment correctly.
The Hyphenator then determines that rsvor- is to be added to the current line
area. The next call to addText() checks whether there are any pending areas
(Alte in our example), prints them on the next line, and continues with the
rest of the current buffer (sorge [...] in the example).
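The sequence above can be replayed on plain strings. This is a simplified model, not the LineArea code: MutilationDemo and layoutWithBug() are hypothetical names, the break position 5 is an assumed hyphenation point inside the fragment, and the line break after the hyphen is left out so that the result can be compared with the mutilated word quoted at the top.

```java
public class MutilationDemo {

    /**
     * Models the faulty flow: the pending fragment is ignored while hyphenating,
     * then flushed in front of the remainder on the next addText() pass.
     * breakInFragment is an assumed hyphenation point inside the fragment.
     */
    public static String layoutWithBug(String pending, String fragment, int breakInFragment) {
        StringBuilder out = new StringBuilder();
        // Hyphenation sees only the fragment and puts its head on the current line.
        out.append(fragment, 0, breakInFragment).append('-');
        // The next pass prints the pending area first ...
        out.append(pending);
        // ... and then continues with the rest of the current buffer.
        out.append(fragment.substring(breakInFragment));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(layoutWithBug("Alte", "rsvorsorge", 5)); // rsvor-Altesorge
    }
}
```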
The bug occurs only in very few situations because it depends on
1) how often and with which buffer size the XML parser calls the characters()
method, so I think it definitely depends on the version of the XML parser
used
2) what the XML document looks like; an additional character/newline somewhere
BEFORE the mutilated word can change the calls to the characters() method.
MY CHANGES
--
I changed the internals of the method doHyphenation(). It now takes into
account any pending areas which may contain word fragments.
New Approach in doHyphenation:
1) Scan pending areas vector for pending text fragments, and remove them from
the pending areas vector
2) Concatenate result from 1) with the current word to be hyphenated in the
current char-buffer
3) call Hyphenator
4) use addWord to add pre-hyphen word fragment to current line area
5) Decision: is final hyphenation point somewhere in the pending area or in the
current char-buffer ?
5a) hyphenation point is somewhere in the pending area:
-- add the remaining characters of the pending text fragments back to the
pending areas vector (they will be printed on a new line (by addText())
together with the rest of the word which is in the current buffer). For this
task I used the existing addSpacedWord() method with the pending parameter set
to true.
5b) hyphenation point is somewhere in the current char buffer:
-- just return new position in current char buffer
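The steps above can be sketched on plain strings. This is only an illustration of the decision in step 5, not the actual patch: PendingAwareHyphenation, Result, and hyphenate() are hypothetical names, the pending areas vector is reduced to a single string, and breakInWord stands in for whatever position the Hyphenator would return.

```java
public class PendingAwareHyphenation {

    /** Outcome of one pass: text for the current line, text put back into pending, new buffer position. */
    public static final class Result {
        public final String currentLine;
        public final String newPending;
        public final int bufferPos;

        Result(String currentLine, String newPending, int bufferPos) {
            this.currentLine = currentLine;
            this.newPending = newPending;
            this.bufferPos = bufferPos;
        }
    }

    /**
     * pending: word fragment taken out of the pending areas (steps 1 and 2).
     * current: the rest of the word in the current char buffer.
     * breakInWord: assumed hyphenation point within pending + current (step 3).
     */
    public static Result hyphenate(String pending, String current, int breakInWord) {
        String word = pending + current;
        // Step 4: the part before the hyphenation point goes on the current line.
        String line = word.substring(0, breakInWord) + "-";
        if (breakInWord < pending.length()) {
            // Step 5a: the break lies inside the pending text, so the unused tail
            // goes back to pending and the buffer position does not advance.
            return new Result(line, pending.substring(breakInWord), 0);
        }
        // Step 5b: the break lies in the current buffer; report the new position.
        return new Result(line, "", breakInWord - pending.length());
    }

    public static void main(String[] args) {
        Result r = hyphenate("Alte", "rsvorsorge", 9);
        System.out.println(r.currentLine + " | rest starts at buffer index " + r.bufferPos);
    }
}
```

With the fragments from the example and an assumed break after Altersvor, the current line gets the intact prefix and the buffer resumes at sorge; with a break after Al, the tail te returns to pending and is printed with the rest of the word.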
I also changed the signature of doHyphenation():
A TextState parameter was added, because the addSpacedWord() method (used in
5a) needs the current textState.
The call to doHyphenation() in LineArea.addText() is modified:
The remaining width parameter now isn't reduced by the pendingWidth, because
doHyphenation now looks at pending areas itself:
    ret = this.doHyphenation(dataCopy, i, wordStart,
                             this.getContentWidth()
                             - (finalWidth
                                + spaceWidth
                                /*+ pendingWidth*/), textState);
I don't think it makes sense to include our XSL-FO documents to reproduce
the error: we use custom fonts, which will likely lead to a different
layout on your system, so the error will probably not occur.
Chris Wewerka
[EMAIL PROTECTED]
Munich, Germany