DO NOT REPLY [Bug 16870] New: - Hyphenation bug including bugfix : sporadic mutilation of hyphenated word

2003-02-07 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16870.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16870

Hyphenation bug including bugfix : sporadic mutilation of hyphenated word

   Summary: Hyphenation bug including bugfix : sporadic mutilation of hyphenated word
   Product: Fop
   Version: 0.20.4
  Platform: PC
OS/Version: Windows NT/2K
Status: NEW
  Severity: Normal
  Priority: Other
 Component: general
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]


Explanation of bug:
---
Under some circumstances (see below) hyphenated words are mutilated.

E.g. the German word Altersvorsorge was SOMETIMES (but not very often) 
hyphenated as rsvor-Altesorge.


Reason:
---
Xerces uses the characters() calls to hand FOP a character buffer which is 
a 'view window' on the current document. It can happen that one word 
(like Altersvorsorge) is split across two calls of characters(); in the 
given example: Alte and rsvorsorge.
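
A minimal sketch of the fragmentation, assuming a hypothetical handler named 
ChunkRecorder (it is not part of FOP). The two characters() calls are made by 
hand here to imitate what the parser may do; where a real parser splits the 
text depends on its internal buffering.

```java
import org.xml.sax.helpers.DefaultHandler;
import java.util.ArrayList;
import java.util.List;

// Hypothetical handler, not FOP code: records every chunk passed to characters().
class ChunkRecorder extends DefaultHandler {
    final List<String> chunks = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the window [start, start + length) of the buffer belongs to this call.
        chunks.add(new String(ch, start, length));
    }

    // Joining all recorded chunks restores the text as written in the document.
    String text() {
        return String.join("", chunks);
    }

    // Imitates a parser that delivers one word in two separate calls.
    static ChunkRecorder simulateSplit() {
        ChunkRecorder recorder = new ChunkRecorder();
        char[] first = "Alte".toCharArray();
        char[] second = "rsvorsorge".toCharArray();
        recorder.characters(first, 0, first.length);
        recorder.characters(second, 0, second.length);
        return recorder;
    }

    public static void main(String[] args) {
        ChunkRecorder recorder = simulateSplit();
        System.out.println(recorder.chunks + " -> " + recorder.text());
    }
}
```

A consumer that treats each chunk as a complete run of text sees two word 
fragments here rather than one word.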

FOP adds the first part of the word to the pending areas. This happens in 
org\apache\fop\layout\LineArea.java in the method addText(). Xerces delivers 
the rest of the word in its second characters() call, which results in a second 
call to addText(). 

In this second call (if hyphenation is set to true) the method doHyphenation() 
(also in class LineArea) is called, which completely ignores pending areas!!! So 
the word fragment rsvorsorge alone is handed over to the hyphenation engine, 
which does a correct job with this fragment.

The Hyphenator then determines that rsvor- is to be added to the current line 
area. 

The next call to addText() checks whether there are any pending areas (Alte in 
our example), prints them on the next line and continues with the rest of the 
current buffer (sorge [...] in the example).
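
The sequence above can be reproduced with a toy model. This is only an 
illustration of the ordering problem, not the LineArea code; the class 
MutilationDemo, the method layOut() and its parameters are made up for this 
sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the faulty ordering; none of these names exist in FOP.
class MutilationDemo {
    // pending:        fragment already stored in the pending areas ("Alte")
    // current:        fragment in the current char buffer ("rsvorsorge")
    // breakInCurrent: hyphenation point found inside the current fragment only
    static List<String> layOut(String pending, String current, int breakInCurrent) {
        List<String> lines = new ArrayList<>();
        // The pre-hyphen part of the current fragment goes onto the current
        // line, although the pending fragment should have come first.
        lines.add(current.substring(0, breakInCurrent) + "-");
        // The next addText() call flushes the pending fragment onto the next
        // line and continues with the rest of the buffer.
        lines.add(pending + current.substring(breakInCurrent));
        return lines;
    }

    public static void main(String[] args) {
        for (String line : layOut("Alte", "rsvorsorge", 5)) {
            System.out.println(line);
        }
    }
}
```

With the fragments from the example and a break after rsvor, the two lines 
come out as rsvor- and Altesorge, which matches the mutilated word reported 
above.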

So the reason this bug occurs only in very few situations is that it 
depends on 
1) how often and with which buffer size the XML parser calls the characters() 
method, so I think it definitely depends on the version of the XML parser 
used
2) what the XML document looks like; an additional character/newline somewhere 
BEFORE the mutilated word can change the calls to the characters() method.



MY CHANGES
--
I changed the internals of the method doHyphenation(). It now takes into 
account any pending areas which may contain word fragments. 

New Approach in doHyphenation:
1) Scan pending areas vector for pending text fragments, and remove them from 
the pending areas vector
2) Concatenate result from 1) with the current word to be hyphenated in the 
current char-buffer 
3) call Hyphenator
4) use addWord to add pre-hyphen word fragment to current line area
5) Decision: is final hyphenation point somewhere in the pending area or in the 
current char-buffer ?

5a) hyphenation point is somewhere in the pending area :
-- add the remaining characters of the pending text fragments back to the 
pending area vector (they will be printed on a new line (by addText()) together 
with the rest of the word which is in the current buffer). For this task I used 
the existing addSpacedWord() method with the pending parameter set to true.

5b) hyphenation point is somewhere in the current char buffer:
-- just return new position in current char buffer
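
A simplified sketch of steps 1) to 5b), working on plain strings instead of 
FOP's area objects. It is not the attached patch: the class 
PendingAwareHyphenation, the BreakFinder interface (a stand-in for the 
Hyphenator) and the Result type are assumptions made for illustration.

```java
// Simplified model of the new approach; all names here are hypothetical.
class PendingAwareHyphenation {
    // Stand-in for the hyphenator: offset of the chosen break in the word, or -1.
    interface BreakFinder {
        int lastBreak(String word);
    }

    static final class Result {
        final String lineText;        // text for the current line, hyphen included
        final String newPending;      // part of the pending fragment deferred again
        final int consumedFromBuffer; // characters used from the current buffer

        Result(String lineText, String newPending, int consumedFromBuffer) {
            this.lineText = lineText;
            this.newPending = newPending;
            this.consumedFromBuffer = consumedFromBuffer;
        }
    }

    static Result hyphenate(String pending, String current, BreakFinder finder) {
        // Steps 1 and 2: take the pending fragment and join it with the current word.
        String word = pending + current;
        // Step 3: ask the hyphenator about the whole word.
        int brk = finder.lastBreak(word);
        if (brk <= 0) {
            // No usable break: leave the pending fragment untouched.
            return new Result("", pending, 0);
        }
        // Step 4: the pre-hyphen part goes onto the current line.
        String lineText = word.substring(0, brk) + "-";
        if (brk < pending.length()) {
            // Step 5a: break lies inside the pending fragment; its rest stays pending.
            return new Result(lineText, pending.substring(brk), 0);
        }
        // Step 5b: break lies in the current buffer; report the new buffer position.
        return new Result(lineText, "", brk - pending.length());
    }

    public static void main(String[] args) {
        Result r = hyphenate("Alte", "rsvorsorge", w -> 6);
        System.out.println(r.lineText + " consumed=" + r.consumedFromBuffer);
    }
}
```

With a break at offset 6 of Altersvorsorge the current line receives Alters- 
and two characters of the current buffer are consumed (case 5b). With a break 
at offset 2 the line receives Al- and the fragment te stays pending (case 5a).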



I also changed the signature of doHyphenation():
A TextState parameter was added, because the addSpacedWord() method (used in 
5a) needs the current textState.


The call to doHyphenation() in LineArea.addText() is modified:
The remaining width parameter is no longer reduced by pendingWidth, because 
doHyphenation() now looks at the pending areas itself:

ret = this.doHyphenation(dataCopy, i, wordStart,
 this.getContentWidth()
 - (finalWidth
 + spaceWidth
 /*+ pendingWidth*/), textState);



I don't think it makes sense to include our XSL-FO documents to reproduce 
the error, because we use custom fonts, which will likely lead to a different 
layout on your system, so the error will probably not occur.




Chris Wewerka
[EMAIL PROTECTED]
Munich, Germany

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]




DO NOT REPLY [Bug 16870] - Hyphenation bug including bugfix : sporadic mutilation of hyphenated word

2003-02-07 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16870

Hyphenation bug including bugfix : sporadic mutilation of hyphenated word

[EMAIL PROTECTED] changed:

   What|Removed |Added

   Keywords||PatchAvailable





DO NOT REPLY [Bug 2106] - broken justification with numeric umlaut entities

2003-02-07 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=2106

broken justification with numeric umlaut entities

[EMAIL PROTECTED] changed:

   What|Removed |Added

 CC||[EMAIL PROTECTED]



--- Additional Comments From [EMAIL PROTECTED]  2003-02-07 11:25 ---
*** Bug 16870 has been marked as a duplicate of this bug. ***





DO NOT REPLY [Bug 16870] - Hyphenation bug including bugfix : sporadic mutilation of hyphenated word

2003-02-07 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16870

Hyphenation bug including bugfix : sporadic mutilation of hyphenated word





--- Additional Comments From [EMAIL PROTECTED]  2003-02-07 11:37 ---
Created an attachment (id=4773)
patch
