DO NOT REPLY [Bug 37329] New: - More readable error messages
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=37329. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

           Summary: More readable error messages
           Product: Fop
           Version: 1.0dev
          Platform: PC
        OS/Version: Windows 2000
            Status: NEW
          Severity: normal
          Priority: P2
         Component: fo tree
        AssignedTo: fop-dev@xmlgraphics.apache.org
        ReportedBy: [EMAIL PROTECTED]

I downloaded the latest FOP 1.0 trunk, built fop.jar with the supplied Ant script, and tested generating PDF files from XML/XSL sources (in comparison to FOP 0.20.5). After fixing some simpler errors (like "fo:region-before must be declared before fo:region-after"), none of our documents runs through because of this error:

javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException: Error(Unknown location): fo:table-cell is missing child elements. Required Content Model: marker* (%block;)+

I don't really understand what this error means. It would also be helpful to have the name of the source file included. As each of our XML/XSL documents references further XML/XSL documents (xsl:include ...), and all of them have many tables, it would be helpful to have the filename and the line number. Is this possible?

-- Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
DO NOT REPLY [Bug 37318] - fop.bat: NoClassDefFoundError: org/apache/fop/cli/Main
http://issues.apache.org/bugzilla/show_bug.cgi?id=37318

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 11:45 ---
Okay, is there a forum for this? Mailing lists are evil: just to ask one question I have to subscribe to the list and then receive any number of emails every day that don't interest me. A forum is much more convenient. Is there one?
DO NOT REPLY [Bug 37329] - More readable error messages
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 12:03 ---
xsl:include has nothing to do with XSL-FO or FOP; it is specific to XSLT. Generating a printed document (PDF or PS) from XML is typically a two-step process:

XML + XSL -> FO -> PDF

FOP is responsible for generating a PDF from FO. For the user's convenience, the interface to FOP will take XML + XSL files, which FOP passes off to an XSLT processor, e.g. Xalan, which generates the FO and passes it via SAX to FOP, which generates the PDF.

So in order to track down your problem when you have multiple XSL files, you will need to first run the XSL transform part of this process separately, using Xalan or similar, to generate one large FO file that has all your xsl:includes pulled in. It should then be easier to find the problem by looking at the FO file.

A polite request: problems/questions should be directed to the fop-user mailing list before they are raised as bugs. Thanks
DO NOT REPLY [Bug 37330] - [PATCH] FOP Bridges not properly registered with Batik
http://issues.apache.org/bugzilla/show_bug.cgi?id=37330

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 12:11 ---
Created an attachment (id=16851) --> (http://issues.apache.org/bugzilla/attachment.cgi?id=16851&action=view)
Fix bridge registration

This patch fixes the bridge registration by creating a subclass of BridgeContext and overriding registerBridges to replace the standard SVG bridges after the base class has done its registration.
DO NOT REPLY [Bug 37329] - More readable error messages
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 12:16 ---
Have you seen http://xmlgraphics.apache.org/fop/trunk/upgrading.html ? The upgrade issue you describe is explained in a fairly prominent position on that page.
DO NOT REPLY [Bug 37318] - fop.bat: NoClassDefFoundError: org/apache/fop/cli/Main
http://issues.apache.org/bugzilla/show_bug.cgi?id=37318

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 12:19 ---
No, there is no Apache-hosted forum for FOP. The fop-user list is there for that purpose. It is not a very high-volume list, as a quick check of the archives would reveal. And you can always unsubscribe once your issues are resolved.
DO NOT REPLY [Bug 37236] - [PATCH] Fix gradients and patterns
http://issues.apache.org/bugzilla/show_bug.cgi?id=37236

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #16837|0                           |1
        is obsolete|                            |

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 12:56 ---
Created an attachment (id=16853) --> (http://issues.apache.org/bugzilla/attachment.cgi?id=16853&action=view)
Update to gradient repeat, fixed createGraphics problem.

This patch is an update to 16837. It indirects the access to jpegCount so that all the PDFGraphics2D instances share a common count. This prevents inadvertent reuse of the wrong image.
DO NOT REPLY [Bug 37329] - More readable error messages
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 13:20 ---
If you mean this:

"While FOP 0.20.5 allowed you to have empty fo:table-cell elements, the new code will complain about that (unless relaxed validation is enabled) because the specification demands at least one block-level element ((%block;)+, see XSL-FO 1.0, 6.7.10) inside an fo:table-cell element."

The error still appears after adding -r (relaxed validation) to the fop.bat call.
Re: Leading/trailing space removal in LineLM
Manuel Mall wrote:

So we end up with only two cases to consider: preserve white space and remove white space around a line break created by the Knuth algorithm.

1. Preserve white space: IMO in this case the space itself is actually not a break opportunity, but there are now two break opportunities: one before the space and one after the space. That is, a sequence like 'abc&#x20;def' is more like 'abc&#x200b;&#xa0;&#x200b;def' or, in a more readable notation, 'abc<zwsp><nbsp><zwsp>def'. That is, our normal space becomes a non-breakable space flanked by zero-width spaces which represent the break opportunities. If this is correct the Knuth elements would look like: glue w=0 box w=0 pen +INFINITE glue w=space pen glue w=0 Is this sequence correct? The first and last glue represent the zwsp and are break opportunities. The box prevents the removal of the space if a break is created before the space. The penalty prevents the space from being considered as a break opportunity. Of course, as usual, these sequences are further complicated in the absence of justification and in the presence of border/padding.

I like your idea of expanding a preserved space into zwsps and nbsp; this allows us to forget alignments and borders / padding as we just have to insert the appropriate elements for the non-breaking space.
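As a rough illustration of the expansion discussed above, the following Python sketch rewrites each preserved space as a no-break space flanked by zero-width spaces. The function name `expand_preserved_spaces` is made up for this example and is not part of FOP; it only models the character-level idea, not the resulting Knuth elements.

```python
ZWSP = "\u200b"  # zero-width space: a break opportunity without width
NBSP = "\u00a0"  # no-break space: keeps its width, never a break itself


def expand_preserved_spaces(text: str) -> str:
    """Replace every U+0020 with ZWSP + NBSP + ZWSP (hypothetical helper)."""
    return text.replace(" ", ZWSP + NBSP + ZWSP)


expanded = expand_preserved_spaces("abc def")
# Show the code points so the invisible characters can be inspected.
print([hex(ord(ch)) for ch in expanded])
```

With two consecutive preserved spaces this naive version emits two adjacent zero-width spaces between the no-break spaces, which is one way to see why consecutive spaces need extra thought.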
The sequence is very good, as it has a couple of interesting properties:
- it interacts with the surrounding elements as just a single glue element
- if there are two (or more) consecutive, non-collapsed spaces the sequence has just 3 feasible breaks, not 4

However, I have a doubt: reading the Unicode document about line breaking, it seems to me that, regardless of the quantity of consecutive spaces, there is only *one* feasible break, after the last one (Unicode Standard Annex #14, section 2 Definitions, in particular the definitions of direct break and indirect break).

--- begin quoted text ---
Direct Break - a line break opportunity exists between two adjacent characters of the given line breaking classes. This is indicated in the rules below as B ÷ A, where B is the character class of the character before and A is the character class of the character after the break. If they are separated by one or more space characters, a break opportunity also exists after the last space. In the pair table, the optional space characters are not shown.

Indirect Break - a line break opportunity exists between two characters of the given line breaking classes only if they are separated by one or more spaces. In this case, a break opportunity exists after the last space. No break opportunity exists if the characters are immediately adjacent. This is indicated in the pair table below as B % A, where B is the character class of the character before and A is the character class of the character after the break. Even though space characters are not shown in the pair table, an indirect break can only occur if one or more spaces follow B. In the notation of the rules in Section 6, Line Breaking Algorithm, this would be represented as two rules: B × A and B SP+ ÷ A.
--- end quoted text ---

I still have not read the document from top to bottom, and I could have misunderstood even the sections I read :-), but I think this point must be clarified before we continue.

Regards
Luca
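To make the "only one feasible break per run of spaces" reading concrete, here is a small Python sketch. It assumes, purely for illustration, that every run of spaces sits between two characters that allow a break; it is not an implementation of UAX #14, and `breaks_after_space_runs` is a hypothetical name.

```python
from typing import List


def breaks_after_space_runs(text: str) -> List[int]:
    """Return the indices at which a new line may start after a run of spaces.

    Only the position after the *last* space of each run is reported,
    mirroring the quoted definition.
    """
    positions = []
    for i, ch in enumerate(text):
        is_last_space_of_run = (
            ch == " " and i + 1 < len(text) and text[i + 1] != " "
        )
        if is_last_space_of_run:
            positions.append(i + 1)
    return positions


print(breaks_after_space_runs("ab  cd e"))  # [4, 7]
```

The two spaces after "ab" yield a single opportunity (index 4), not two.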
DO NOT REPLY [Bug 37329] - More readable error messages
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 13:30 ---
"A polite request: problems/questions should be directed to the fop-user mailing list before they are raised as bugs. Thanks"

This is not a question, but a request for enhancement. The error message I posted above is an example. And it is an error message from FOP (as a FOP user, it doesn't matter to me if it uses Xalan or similar internally), so I ask if it would be possible to give some more information in the error message to identify which source files are concerned.
Re: Leading/trailing space removal in LineLM
On Wed, 2 Nov 2005 01:59 pm, Manuel Mall wrote:
On Wed, 2 Nov 2005 04:18 am, Simon Pepping wrote:
On Tue, Nov 01, 2005 at 11:40:42PM +0800, Manuel Mall wrote:

This is probably a question for Luca or Simon. <snip/>

Glue and penalty items are removed at the start of a line. This is part of the Knuth algorithm. It does not touch the matter of white-space-collapse. If there is whitespace that may not be removed/collapsed at the start of the line, it must be protected by a preceding zero-width box. I.o.w., the value of white-space-collapse needs to be taken into account at the phase of getNextKnuthElements.

Fair enough - I need some help with the Knuth elements then.

During getNextKnuth we need to only consider white-space-treatment, as white-space-collapse can be handled completely during refinement; that is, consecutive sequences of white space are either collapsed or not during refinement.

We can also limit white-space-treatment during getNextKnuth to any line breaks generated by the line breaking algorithm (Knuth algorithm). white-space-treatment around hard line breaks (linefeeds, start/end of a block) is handled during refinement.

We can also limit white-space-treatment during getNextKnuth to the values preserve vs. ignore-if... Other values are handled during refinement.

We can also treat the three different ignore-if... values, that is the values ignore-if-before-linefeed, ignore-if-after-linefeed and ignore-if-surrounding-linefeed, as just one case: 'delete all white space around a formatter generated break'.

So we end up with only two cases to consider: preserve white space and remove white space around a line break created by the Knuth algorithm.

1. Preserve white space: IMO in this case the space itself is actually not a break opportunity, but there are now two break opportunities: one before the space and one after the space. That is, a sequence like 'abc&#x20;def' is more like 'abc&#x200b;&#xa0;&#x200b;def' or, in a more readable notation, 'abc<zwsp><nbsp><zwsp>def'.
That is, our normal space becomes a non-breakable space flanked by zero-width spaces which represent the break opportunities. If this is correct the Knuth elements would look like: glue w=0 box w=0 pen +INFINITE glue w=space pen glue w=0 Is this sequence correct? The first and last glue represent the zwsp and are break opportunities. The box prevents the removal of the space if a break is created before the space. The penalty prevents the space from being considered as a break opportunity. Of course, as usual, these sequences are further complicated in the absence of justification and in the presence of border/padding.

2. Removal of white space: This is the current behaviour, but it works only for a single space and not for a sequence of spaces. Actually, because the algorithm removes leading glues/penalties, it is mainly a problem for trailing white space. I am not sure how to best tackle this. What comes to mind is:

a) Do the same as for leading glues/penalties at the end of the line. However, I am not sure how tricky it would be to determine the boundary, because any 'blocking boxes' (see 1. above) are only placed before but not after elements. This option suffers from the problem that it will not remove leading/trailing white space across inline boundaries with border/padding, as these generate zero-width boxes to block removal of the glue elements for the border/padding.

b) Do not generate individual Knuth sequences for each white space character but instead collect all consecutive white space and create one glue-penalty sequence for it. Again I am uncertain of the consequences of doing that. To do that correctly we would need to collect white space across inline boundaries. This firstly breaks the current getNextKnuth approach, which assumes each LM can generate its sequences without knowledge of its neighbours. It would also break the current area info structures, as a single Knuth element could now refer to text snippets from different LMs.

Comments please.
Simon

Thanks

Luca wrote a longer response to this but my mail reader doesn't like the character set (is that topical or what?). Anyway, at the end Luca asks the question about the UAX#14 line breaking algorithm and its handling of spaces. My answer to that is:
a) Yes, UAX#14 always breaks at the end of a sequence of spaces
b) But it also says that it assumes any trailing spaces in a line are being removed
This conflicts with XSL-FO, which can force spaces to be retained; therefore adjustments to the algorithm are necessary to cater for that. One possible adjustment is simply changing what is given to the algorithm as indicated above, i.e. <sp> becomes <zwsp><nbsp><zwsp>.

Manuel

Manuel

In case other people have the same problem with Luca's post, here is the content:

Start Luca's e-mail +

I like your idea of expanding a preserved space into zwsps and nbsp; this allows us to forget alignments and
DO NOT REPLY [Bug 37136] - external-graphic dimensions and rendering
http://issues.apache.org/bugzilla/show_bug.cgi?id=37136

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 14:45 ---
I noticed that we also have SVG images imported, and that code is causing this (same) error message:

<fo:instream-foreign-object>
  <svg:svg width="120mm" height="130mm" xmlns:xlink="http://www.w3.org/2000/svg">
    <svg:image width="120mm" height="130mm" xlink:href="{$chart_C_AssetSectorStructure2_H}"/>
  </svg:svg>
</fo:instream-foreign-object>

leads to:

javax.xml.transform.TransformerException: java.lang.RuntimeException: Some content could not fit into a line/page after 50 attempts. Giving up to avoid an endless loop.

I checked it: when I removed just this section of code, the message disappeared (and also the SVG ;-). Pasting it back reproduces the error message. Unfortunately, content-width="scale-to-fit" doesn't work with svg:svg and svg:image.
DO NOT REPLY [Bug 37136] - external-graphic dimensions and rendering
http://issues.apache.org/bugzilla/show_bug.cgi?id=37136

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 15:11 ---
Please see response #4 for the SVG example.
DO NOT REPLY [Bug 37136] - external-graphic dimensions and rendering
http://issues.apache.org/bugzilla/show_bug.cgi?id=37136

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 15:15 ---
Comment #4 is not a working example showing the problem. It is just a snippet from your XML file. For us to investigate this we need the XSL-FO file (as small as possible, just enough to show the problem) together with any related files (svg, jpeg, ...)
Re: Leading/trailing space removal in LineLM
Manuel Mall wrote:

Luca wrote a longer response to this but my mail reader doesn't like the character set (is that topical or what?).

Sorry, it looks really horrible ... still don't know what went wrong, but I won't do it again! :-)

Anyway, at the end Luca asks the question about the UAX#14 line breaking algorithm and its handling of spaces. My answer to that is:
a) Yes, UAX#14 always breaks at the end of a sequence of spaces
b) But it also says that it assumes any trailing spaces in a line are being removed
This conflicts with XSL-FO, which can force spaces to be retained; therefore adjustments to the algorithm are necessary to cater for that. One possible adjustment is simply changing what is given to the algorithm as indicated above, i.e. <sp> becomes <zwsp><nbsp><zwsp>.

Ok, so back to your previous message:

2. Removal of white space: This is the current behaviour, but it works only for a single space and not for a sequence of spaces. Actually, because the algorithm removes leading glues/penalties, it is mainly a problem for trailing white space. I am not sure how to best tackle this. What comes to mind is:

a) Do the same as for leading glues/penalties at the end of the line. However, I am not sure how tricky it would be to determine the boundary, because any 'blocking boxes' (see 1. above) are only placed before but not after elements. This option suffers from the problem that it will not remove leading/trailing white space across inline boundaries with border/padding, as these generate zero-width boxes to block removal of the glue elements for the border/padding.

b) Do not generate individual Knuth sequences for each white space character but instead collect all consecutive white space and create one glue-penalty sequence for it. Again I am uncertain of the consequences of doing that. To do that correctly we would need to collect white space across inline boundaries.
This firstly breaks the current getNextKnuth approach, which assumes each LM can generate its sequences without knowledge of its neighbours. It would also break the current area info structures, as a single Knuth element could now refer to text snippets from different LMs.

I'm not sure I follow you in all the details of white space handling, and here we have borders too ... :-)

I like b) most: after all, this is somewhat similar to the space resolution, as we have interactions between spaces coming from different nodes, and it's difficult to have each LM decide on its own. And I think we could find a way to keep the 1-1 relationship between AreaInfo objects and Positions.

I have tried to play with the elements, and here are a few results: I hope they can help!

At the moment, the sequence for a single space with borders and padding is:

1 glue w=endBP
2 penalty w=0
3 glue w=(spaceIPD - endBP - startBP)
4 box w=0
5 infinite penalty
6 glue w=startBP

total width = spaceIPD
if break at #2 = endBP / startBP

If we have two (or more) spaces, we could use the sequence:

1 glue w=endBP
2 penalty w=0
3 glue w=(- endBP - startBP)
4 glue w=spaceIPD1
5 glue w=spaceIPD2
6 box w=0
7 infinite penalty
8 glue w=startBP

total width = spaceIPD1 + spaceIPD2
if break at #2 = endBP / startBP

Glues #4 and #5 have a Position pointing to different AreaInfo objects (from different LMs). This should solve (?) the case of ignore-if-surrounding.
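The width bookkeeping for the two sequences above can be checked with a small Python model. This is only a sketch under stated assumptions: `Element`, `widths_around_break` and the two builder functions are hypothetical names, not FOP classes; penalties are treated as zero-width markers, and the model discards glue and penalty items after a break until the first box, as described earlier in this thread.

```python
from dataclasses import dataclass


@dataclass
class Element:
    kind: str            # "box", "glue" or "penalty"
    width: float = 0.0


def total_width(elements):
    """Natural width of the unbroken sequence (penalties add no width)."""
    return sum(e.width for e in elements if e.kind != "penalty")


def widths_around_break(elements, break_index):
    """Width left at the line end and width surviving at the next line start."""
    before = sum(e.width for e in elements[:break_index] if e.kind != "penalty")
    rest = elements[break_index + 1:]
    # Leading glue/penalty items are dropped until the first box.
    first_box = next((i for i, e in enumerate(rest) if e.kind == "box"), len(rest))
    after = sum(e.width for e in rest[first_box:] if e.kind != "penalty")
    return before, after


def single_space(space_ipd, end_bp, start_bp):
    return [
        Element("glue", end_bp),
        Element("penalty"),
        Element("glue", space_ipd - end_bp - start_bp),
        Element("box"),
        Element("penalty"),          # infinite penalty; its value is not modelled
        Element("glue", start_bp),
    ]


def two_spaces(space_ipd1, space_ipd2, end_bp, start_bp):
    return [
        Element("glue", end_bp),
        Element("penalty"),
        Element("glue", -end_bp - start_bp),
        Element("glue", space_ipd1),
        Element("glue", space_ipd2),
        Element("box"),
        Element("penalty"),          # infinite penalty; its value is not modelled
        Element("glue", start_bp),
    ]


seq = single_space(space_ipd=5.0, end_bp=2.0, start_bp=3.0)
print(total_width(seq))             # 5.0
print(widths_around_break(seq, 1))  # (2.0, 3.0)
```

Index 1 is the penalty listed as #2 above. With the sample values the model reproduces "total width = spaceIPD" and "endBP / startBP" for both sequences.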
If white-space-treatment is ignore-if-after, and we have two consecutive spaces, we could use the sequence:

1 glue w=endBP
2 penalty w=0
3 glue w=(spaceIPD - endBP)
4 penalty w=0
5 glue w=(spaceIPD - startBP)
6 box w=0
7 infinite penalty
8 glue w=startBP

total width = 2 * spaceIPD
if break at #2 = endBP / startBP
if break at #4 = endBP + spaceIPD / startBP

With three or more consecutive spaces:

1 glue w=endBP
2 penalty w=0
3 glue w=(spaceIPD - endBP)
4 penalty w=0
5 glue w=spaceIPD
6 penalty w=0
7 glue w=(spaceIPD - startBP)
8 box w=0
9 infinite penalty
10 glue w=startBP

total width = 3 * spaceIPD
if break at #2 = endBP / startBP
if break at #4 = endBP + spaceIPD / startBP
if break at #6 = endBP + 2 * spaceIPD / startBP

I did not find a sequence for ignore-if-before yet ...

Regards
Luca
Re: linefeed-treatment=preserve
On Nov 2, 2005, at 01:39, Manuel Mall wrote:
On Wed, 2 Nov 2005 05:14 am, Simon Pepping wrote:
On Tue, Nov 01, 2005 at 06:54:08PM +0100, Andreas L Delmelle wrote:
On Nov 1, 2005, at 16:03, Manuel Mall wrote:

Is it a)
<empty line>
line 1
<empty line>

This one, IMO.

I agree, a)

Same as Simon, Joerg and Andreas I thought a) as well but am now confused because both AntennaHouse and RenderX render:

FWIW: I'm sticking to my initial a). Furthermore, I tried following the current whitespace removal algorithm (fo.flow.Block.handleWhiteSpace()) for this particular case, and IIC, it is handled pretty well --IOW: both linefeeds are properly presented to the layout engine, but it's somewhere in layout that the decision is made to drop the trailing linefeed. Probably because the feasible break after 'line1' gets 'merged' with the forced line-break (or: what exactly does the algorithm do when it encounters a mere break possibility immediately followed by a forced line-break?)

Cheers,
Andreas
DO NOT REPLY [Bug 37329] - Relaxed validation NYI for fo:table-cells
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |ASSIGNED
            Summary|More readable error messages|Relaxed validation NYI for
                   |                            |fo:table-cells

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 18:24 ---
For the record:

1. I agree with Chris that FOP can't be expected to tell you precisely where in which source document the error lies (FOP is ultimately presented with the result of the XSL transform, so all the references to whatever source files you're using are lost at that point). FOP could add location info to the error message, but this makes little sense if the intermediate FO doesn't exist as a physical file.

2. The error message is, IMO, quite self-explanatory (at least for anyone who has had so much as a glance at the XSL-FO Recommendation). It simply means that an fo:table-cell should have 'zero-or-more markers' plus 'one-or-more block-level' descendants (according to the XSL-FO Rec.)

3. The relaxed validation doesn't work yet for this particular case... Either this is an oversight in the docs, or this is still TODO. I'll see if I can commit a fix for this ASAP.
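For anyone hunting for the offending cells in an intermediate FO file, a check along the following lines may help. It is a simplified sketch, not FOP's validation code: the function name is hypothetical, it only looks for a direct block-level child as listed by %block; in XSL 1.0, and it ignores marker ordering and neutral containers such as fo:wrapper.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
# The %block; entity of XSL 1.0.
BLOCK_LEVEL = {"block", "block-container", "table-and-caption", "table", "list-block"}
BLOCK_TAGS = {f"{{{FO_NS}}}{name}" for name in BLOCK_LEVEL}


def cells_without_blocks(fo_source: str):
    """Return 1-based positions (document order) of fo:table-cell elements
    that have no block-level child element."""
    root = ET.fromstring(fo_source)
    cells = root.iter(f"{{{FO_NS}}}table-cell")
    return [
        position
        for position, cell in enumerate(cells, start=1)
        if not any(child.tag in BLOCK_TAGS for child in cell)
    ]


sample = """<fo:table-row xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:table-cell><fo:block>ok</fo:block></fo:table-cell>
  <fo:table-cell/>
  <fo:table-cell>stray text</fo:table-cell>
</fo:table-row>"""
print(cells_without_blocks(sample))  # [2, 3]
```

The second and third cells are reported: one is empty and the other contains only text, which matches the two situations discussed in this bug.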
DO NOT REPLY [Bug 37329] - Relaxed validation NYI for fo:table-cells
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 19:24 ---
OK, it's fixed. Please do note that this will only lead to usable results if the cells are really supposed to be empty. Any text content of the cell will be silently ignored/dropped.
DO NOT REPLY [Bug 37329] - Relaxed validation NYI for fo:table-cells
http://issues.apache.org/bugzilla/show_bug.cgi?id=37329

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 19:34 ---
On second thought: dropping it silently seemed outright stupid, so added a little warning message
DO NOT REPLY [Bug 37136] - external-graphic dimensions and rendering
http://issues.apache.org/bugzilla/show_bug.cgi?id=37136

--- Additional Comments From [EMAIL PROTECTED] 2005-11-02 19:40 ---
In reply to comment #4: See recent post on fop-users: specify content-height and content-width on the fo:instream-foreign-object, not on the svg element.
Current FOText implementation + Refinement whitespace handling
Hi all,

(Manuel, I guess this is mostly directed to you, as you may already have been browsing the same classes...)

Just wandering a bit through the FOText source code (follow-up on Manuel's recent thread on whitespace handling), and I stumbled upon the following suspicious little detail: FOText has a static member 'lastFOTextProcessed', which doesn't seem to get cleared/flushed anywhere. The intention is quite clear, but the possible effects of the current implementation may turn out rather nasty. IIC, this is what the warning is about in the FOText javadoc as well as the TODO for that member variable.

Rough guess: since the variable doesn't get cleared, it always contains a reference to a char array containing the last portion of accumulated text (or, more precisely, a FOText instance carrying that reference, as well as one to the previous FOText etc.) --even after the document has finished, into the next run if within the same JVM (+ possible multi-thread mayhem?)

The TODO hints at a solution involving the page-sequence. I somehow feel that moving it to the block level would be enough... Logically, whitespace handling --which is one of the prime reasons of existence of this static variable-- deals with line-breaks, and start-block/end-block are implicit after- or before-eol.

To follow up on that last sentence, the current refinement whitespace handling works roughly as follows:

1. Add all text and inline children to the block, until the first non-inline child is encountered (or the block ends)
2. Recursively iterate over *all* text nodes anywhere in the block up to here, converting/removing any superfluous whitespace in the process

and (+/-) repeat the above for each uninterrupted sequence of text/inline children in the block.

Seems to work nicely, for the most part. Manuel already raised the issue of inappropriate inter-FO whitespace-collapsing, but I have another question.
Given this algorithm, and knowing that the inlines do not do any whitespace-handling themselves, what happens in the following case: fo:block fo:inline fo:block fo:inline fo:block ... ? My current best guess is that the inner block's underlying character sequence will be 'recursively' iterated over three times (?) That would be two too many, since all whitespace will have been collapsed the first time around. I'm still chewing on some ideas to move part of this to InlineLevel, so that ultimately, we can do away with the recursion and let each level handle its own small part. The higher level then chains these small parts together with its own character content. One way to make this happen would be to overload Block.handleWhiteSpace() to deal with an InlineLevel parameter. This has the advantage of the whitespace-related properties being easily available. The call to this overloaded method would be made from InlineLevel.endOfNode(). If you're still following, I'd use a CharIterator that iterates over regular characters, fo:characters (and possibly the first and last characters of any nested FO). This iterator can operate very easily on both inlines and blocks. I don't immediately see any need to iterate backwards, at least not during refinement. Big advantage here would precisely be that we can wait until Block.endOfNode() to deal with any white-space for the entire block (leading and trailing), the nested bits will already have performed their parts at that point, so it is done sooner and far more efficiently IIC (guaranteed only one pass per level, no matter how deep the nesting goes). Food for thought :-) Cheers, Andreas
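To make the "visit every character only once" goal concrete, here is a minimal sketch in Java. It is not FOP code: the `Inline` class, the `Collapser` class and the `collapse` method are hypothetical stand-ins invented for illustration, and the sketch only collapses runs of whitespace to a single space. The point it shows is that a single "last character was a space" flag, carried across node boundaries, lets nested content be handled in one pass regardless of nesting depth.

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceSketch {

    /** Hypothetical stand-in for an inline-level node: holds text fragments and nested inlines. */
    static final class Inline {
        final List<Object> children = new ArrayList<>(); // each child is a String or an Inline

        Inline add(Object child) {
            children.add(child);
            return this;
        }
    }

    /** Collapses whitespace runs in one pass; the flag carries state across node boundaries. */
    static final class Collapser {
        // Starts as true so that whitespace at the start of the block is dropped.
        private boolean lastWasSpace = true;
        private final StringBuilder out = new StringBuilder();

        void visit(Inline node) {
            for (Object child : node.children) {
                if (child instanceof Inline) {
                    visit((Inline) child);
                } else {
                    append((String) child);
                }
            }
        }

        private void append(String text) {
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                boolean space = c == ' ' || c == '\t' || c == '\n' || c == '\r';
                if (!space) {
                    out.append(c);
                } else if (!lastWasSpace) {
                    out.append(' '); // first space of a run is kept, the rest are skipped
                }
                lastWasSpace = space;
            }
        }

        String result() {
            int end = out.length();
            if (end > 0 && out.charAt(end - 1) == ' ') {
                end--; // drop the single trailing space at the end of the block
            }
            return out.substring(0, end);
        }
    }

    /** Treats the given node as the outermost block and returns its collapsed text. */
    static String collapse(Inline block) {
        Collapser collapser = new Collapser();
        collapser.visit(block);
        return collapser.result();
    }
}
```

In the proposal above the work would be triggered per level from endOfNode() rather than by one traversal from the top; the sketch only illustrates the state that has to be handed from one level to the next.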
Re: Unicode compliant Line Breaking
On Tue, Nov 01, 2005 at 11:17:08PM +0100, J.Pietschmann wrote:

Simon Pepping wrote: Is our current hyphenation method a subset of Unicode's method?

Umm. What's the relation between hyphenation and TR14 (except for handling soft hyphens)? I guess you confuse finding line breaks in general and line breaking due to hyphenation.

I mean, will our current method of finding possible line breaking points using the hyphenation tables be part of a TR14 compliant system to find line break opportunities?

Simon

-- Simon Pepping home page: http://www.leverkruid.nl
Re: Leading/trailing space removal in LineLM
On Wed, Nov 02, 2005 at 04:58:09PM +0100, Luca Furini wrote: Manuel Mall wrote: Luca wrote a longer response to this but my mail reader doesn't like the character set (is that topical or what?). Sorry, it looks really horrible ... still don't know what went wrong, but I won't do it again! :-) It is in the quoted-printable format, probably due to non-ascii or non-latin-1 characters in it, the TR14 symbols. Simon -- Simon Pepping home page: http://www.leverkruid.nl
Re: Unicode compliant Line Breaking
Simon Pepping wrote: I mean, will our current method of finding possible line breaking points using the hyphenation tables be part of a TR14 compliant system to find line break opportunities?

In some sense yes, but I'm not sure what you really mean. Currently, spaces and slashes (/) as well as hyphenation points are considered break opportunities. TR14 doesn't care about hyphenation but expands significantly on the other points. For example, in the string foo-bar the position after the dash is a break opportunity, as people usually expect, but in -1234 the position after the dash isn't a break opportunity, also as people usually expect. The TR encodes as much of such expectations as is possible with a limited context.

A few places in TextLayoutManager which use BREAK_CHARS will have to be changed, either keeping info from a previous scanning using a BreakIterator or something, or looking up the line break Unicode properties and looking up whether a break may occur in the line-break matrix. Hyphenation points are generated elsewhere and remain unaffected.

J.Pietschmann
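The foo-bar versus -1234 distinction can be sketched with a look at one neighbouring character. The Java below is a deliberately tiny illustration, not TR14 and not FOP code: the class and method names are made up, it knows only about spaces and the hyphen-minus, and a real implementation would look up line break classes in the pair table instead of hard-coding characters.

```java
import java.util.ArrayList;
import java.util.List;

class HyphenBreakSketch {

    /**
     * Returns every index i such that a line break is allowed between
     * text.charAt(i - 1) and text.charAt(i). Simplified rules (assumptions):
     * break after a run of spaces, and after '-' unless a digit or space follows.
     */
    static List<Integer> breakOpportunities(String text) {
        List<Integer> breaks = new ArrayList<>();
        for (int i = 1; i < text.length(); i++) {
            char prev = text.charAt(i - 1);
            char next = text.charAt(i);
            if (prev == ' ' && next != ' ') {
                breaks.add(i); // break after the last space of a run
            } else if (prev == '-' && next != ' ' && !Character.isDigit(next)) {
                breaks.add(i); // "foo-bar" may break after the dash, "-1234" may not
            }
        }
        return breaks;
    }
}
```

With these rules `breakOpportunities("foo-bar")` yields the single index 4 and `breakOpportunities("-1234")` yields an empty list, matching the two examples in the message.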
Re: zero width space
Manuel Mall wrote: That seems to be the consensus, that is consider ZWS for line breaking but then discard and don't give it to the renderers.

Renderers could deal with ZWS if the font would have a glyph for this character; unfortunately, that's not the case for the PDF standard fonts :-) Some fonts *do* have glyphs for various Unicode space characters, notably the fixed width spaces.

This leads to the question: Is a space a character? What *is* a character? The Unicode people had endless discussions about this. Spaces are exactly in the gray area between real characters which leave marks and layout control. Handling space characters in layout and discarding them before rendering has the distinctive advantage that they work for any font in any renderer (which can handle variable space areas properly, of course). OTOH, renderers which output a format which can handle the spaces itself, like a hypothetical HTML renderer, would better get the original character.

Manuel Mall wrote: Are there any other (unusual Unicode) characters which fall in the same category that is they influence layout decisions but should not be seen by the renderers?
* Unicode spaces
  + variable width spaces
    - ordinary space U+0020
    - ordinary non-breaking space U+00A0
  + fixed width spaces; potentially available in fonts and *may* be passed to renderers, *except* for U+200B
    - zero width space U+200B, may expand in justification (not implemented this way in FOP 0.20.5, which will haunt us)
    - zero width non breaking space, aka byte order mark U+FEFF, should now only be used as BOM (as the BOM is eaten by the XML parser, FOP could emit a deprecation warning)
    - en quad U+2000, according to my Unicode book *identical* to U+2002, *not* a 4en space (strange)
    - em quad U+2001, similar to U+2000
    - en space aka nut U+2002
    - em space aka mutton U+2003
    - three-per-em space aka thick space (1/3 em width) U+2004
    - four-per-em space aka mid space (1/4 em width) U+2005
    - six-per-em space (generally 1/6 em width) U+2006
    - figure space (font dependent) U+2007
    - punctuation space (as wide as a dot or comma) U+2008
    - thin space (1/5..1/8 em width) U+2009
    - hair space (1/10..1/16 em width) U+200A
    - narrow no-break space (probably 1/6 em width) U+202F
    - mathematical space U+205F
    - non breaking word joiner U+2060 replaces U+FEFF in text
    - ideographic space U+3000
    - OGHAM SPACE MARK U+1680 (odd stuff)
    - Note: ETHIOPIC WORDSPACE U+1361 leaves marks and is therefore not a space. At least I hope so.
  + see also http://en.wikipedia.org/wiki/Space_character http://www.alistapart.com/stories/emen/
* Other characters
  + Character shaping hints; they do not cause line breaks.
    - zero width joiner U+200D
    - zero width non-joiner U+200C (may probably also hint at preventing ligatures)
    - see http://en.wikipedia.org/wiki/Zero-width_joiner et al.
  + Soft hyphen U+00AD. Must be hidden if no line break follows.
  + Formatting characters. I'd say these characters should not occur in XSLFO source, because there are FO which represent the same functionality.
    - line separator U+2028, FOP 0.20.5 creates an unconditional line break regardless of any FO properties
    - paragraph separator U+2029
    - bidi control characters 200E-200F, 202A-202E
    - deprecated controls 206A-206F

J.Pietschmann
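A rough way to turn part of the list above into a lookup is sketched below in Java. The enum, its constant names and the `classify` method are invented for this illustration and are not FOP classes; the grouping covers only a subset of the code points listed (U+FEFF, U+2060 and U+1680 are left in the catch-all group) and makes no claim about what a layout engine should finally do with each group.

```java
class SpaceClassSketch {

    /** Hypothetical categories, loosely following the grouping in the list above. */
    enum Kind {
        VARIABLE_SPACE, FIXED_SPACE, ZERO_WIDTH_SPACE,
        SHAPING_HINT, SOFT_HYPHEN, FORMATTING, OTHER
    }

    static Kind classify(int codePoint) {
        switch (codePoint) {
            case 0x0020: // ordinary space
            case 0x00A0: // ordinary non-breaking space
                return Kind.VARIABLE_SPACE;
            case 0x200B: // zero width space: used for line breaking, no glyph expected
                return Kind.ZERO_WIDTH_SPACE;
            case 0x200C: // zero width non-joiner
            case 0x200D: // zero width joiner
                return Kind.SHAPING_HINT;
            case 0x00AD:
                return Kind.SOFT_HYPHEN;
            case 0x2028: // line separator
            case 0x2029: // paragraph separator
                return Kind.FORMATTING;
            case 0x202F: // narrow no-break space
            case 0x205F: // mathematical space
            case 0x3000: // ideographic space
                return Kind.FIXED_SPACE;
            default:
                break;
        }
        if (codePoint >= 0x2000 && codePoint <= 0x200A) {
            return Kind.FIXED_SPACE; // en quad through hair space
        }
        boolean bidiControl = (codePoint >= 0x200E && codePoint <= 0x200F)
                || (codePoint >= 0x202A && codePoint <= 0x202E);
        boolean deprecatedControl = codePoint >= 0x206A && codePoint <= 0x206F;
        if (bidiControl || deprecatedControl) {
            return Kind.FORMATTING;
        }
        return Kind.OTHER;
    }
}
```

A layout manager could then branch on the returned kind, for instance treating `ZERO_WIDTH_SPACE` as a break opportunity and dropping it before rendering, which is the consensus described earlier in the thread.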
Re: Leading/trailing space removal in LineLM
Manuel Mall wrote:

a) Yes UAX#14 always breaks at the end of a sequence of spaces
b) But it also says that it assumes any trailing spaces in a line are being removed

This conflicts with XSL-FO which can force spaces being retained therefore adjustments to the algorithm are necessary to cater for that.

Computing line breaking opportunities and discarding whitespace at the end (or beginning) of a line are different matters. If whitespace has to be retained, trailing spaces after a non-space string may simply mean the previous line breaking opportunity has to be used, because otherwise the string including the trailing spaces will overflow the line area. The trailing whitespace may also influence text justification.

J.Pietschmann
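The effect described here, that retained trailing spaces can push the break back to an earlier opportunity, can be shown with a toy measurement. The Java sketch below is an assumption-laden illustration rather than FOP's line layout: every character counts as one unit of width, break opportunities exist only after runs of U+0020, and the class and method names are hypothetical.

```java
class TrailingSpaceSketch {

    /**
     * Returns the index (exclusive) at which the first line ends, or 0 if not even
     * the first word fits. Break opportunities lie after each run of spaces.
     * If retainTrailingSpaces is true, the spaces before the break count towards
     * the line width; otherwise they are assumed to be discarded.
     */
    static int firstLineEnd(String text, int lineWidth, boolean retainTrailingSpaces) {
        int best = 0;
        int i = 0;
        while (i < text.length()) {
            int wordEnd = i;
            while (wordEnd < text.length() && text.charAt(wordEnd) != ' ') {
                wordEnd++;
            }
            int spaceEnd = wordEnd;
            while (spaceEnd < text.length() && text.charAt(spaceEnd) == ' ') {
                spaceEnd++;
            }
            // Width that must fit if the line is broken at spaceEnd.
            int measured = retainTrailingSpaces ? spaceEnd : wordEnd;
            if (measured > lineWidth) {
                break; // this opportunity overflows, keep the previous one
            }
            best = spaceEnd;
            i = spaceEnd;
        }
        return best;
    }
}
```

For the text "ab cd   ef" and a line width of 6, discarding trailing spaces lets the line end at index 8 (after "cd" and its spaces), while retaining them forces the earlier break at index 3, because "ab cd" plus three spaces is 8 units wide.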
Re: Leading/trailing space removal in LineLM
On Thu, 3 Nov 2005 06:03 am, J.Pietschmann wrote:

Manuel Mall wrote:

a) Yes UAX#14 always breaks at the end of a sequence of spaces
b) But it also says that it assumes any trailing spaces in a line are being removed

This conflicts with XSL-FO which can force spaces being retained therefore adjustments to the algorithm are necessary to cater for that.

Computing line breaking opportunities and discarding whitespace at the end (or beginning) of a line are different matters. If whitespace has to be retained, trailing spaces after a non-space string may simply mean the previous line breaking opportunity has to be used, because otherwise the string including the trailing spaces will overflow the line area. The trailing whitespace may also influence text justification.

Hmm, to me it appears that UNICODE and XSL-FO have slightly different models when it comes to white space in the context of line breaking which is causing the discussion here.

In UNICODE everything is based simply on the properties of the codepoint in question and its neighbour. In XSL-FO one can change the behaviour of a codepoint by setting those white space related XSL-FO properties. That is not a concept within UNICODE. If you want to retain white space in UNICODE you use a different codepoint. If you want to retain a space in XSL-FO you could use a different codepoint but more likely you set an XSL-FO property if you want this applied widely in your document.

If we want to 'marry' UNICODE linebreaking with XSL-FO white space handling we have this interaction to consider. One possible solution would be to replace spaces (U+0020) by different codepoints which resemble the behaviour modification imposed by any XSL-FO white space handling properties in effect. But I am not sure if this can be done in all cases. Otherwise we may have to modify the UNICODE line breaking algorithm to cater for the XSL-FO white space specialities.

J.Pietschmann

Manuel