Re: Sources for layout.jpg
--- The Web Maestro [EMAIL PROTECTED] wrote: Does anyone have any objections to my making these files LIVE on the FOP site (and replacing the current document.jpg image on the home page)? None whatsoever, but I'm going to act like I do in order to get something I want in exchange... No way, Clay! Not unless you refresh the team page so my change adding Renaud as a contributor appears!!! Thanks, Glen
this list is moving to fop-dev at xmlgraphics.apache.org
I am about to move this list to the xmlgraphics.apache.org TLP. All current subscribers will move with it, but you may need to update your incoming folder rules. Please pardon the inconvenience. Cheers, ...Roy T. Fielding, co-founder, The Apache Software Foundation ([EMAIL PROTECTED]) http://www.apache.org/ ([EMAIL PROTECTED]) http://roy.gbiv.com/
Got Plass' dissertation
I got my copy of Michael Plass' dissertation today. A cursory overview shows that this document will provide some insight on using the Knuth element model for page breaking, but it also makes clear that we still have to come up with solutions for certain tricky problems that he didn't have to deal with back then. At any rate, the dissertation seems more helpful than the two documents we found by Brüggemann-Klein, Klein and Wohlfeil. Jeremias Maerki
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Let me sum up this thread to see if I get the picture:
* Sun's codec [1] will not be integrated.
* Instead, Batik's transcoders will be used [2].
* Where and how these transcoders will be made available to FOP will be discussed next week [3].
* I'll start by implementing basic functionality for TIFF and PNG using Batik's codecs. This will be 1.3 compliant. I should be able to use the actual codecs without modifications.
* Additional functionality (like TIFF compressions) will be added via Java Image I/O (needs Java 1.4) or via JAI. This additional functionality will be stored in a separate directory (src/java-1.4).
* An alternative would be to create or integrate this functionality in the Batik code.
* I should check that the output isn't restricted to 72 dpi [4].
Any input to tell me if these assertions are true is welcome.
Regards, Renaud
[1] http://java.sun.com/developer/sampsource/jai/
[2] http://cvs.apache.org/viewcvs.cgi/xml-batik/sources/org/apache/batik/ext/awt/image/codec/tiff/
[3] http://mail-archives.eu.apache.org/mod_mbox/xmlgraphics-general/200503.mbox/[EMAIL PROTECTED]
[4] http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=10756
Re: Good job! / Re: Integration of TIFFRenderer in FOP
That's all correct, although the third point does not really have anything to do with the bitmap renderer.
On 10.03.2005 14:34:57 Renaud Richardet wrote:
Let me sum up this thread to see if I get the picture:
* Sun's codec [1] will not be integrated.
* Instead, Batik's transcoders will be used [2].
* Where and how these transcoders will be made available to FOP will be discussed next week [3].
* I'll start by implementing basic functionality for TIFF and PNG using Batik's codecs. This will be 1.3 compliant. I should be able to use the actual codecs without modifications.
* Additional functionality (like TIFF compressions) will be added via Java Image I/O (needs Java 1.4) or via JAI. This additional functionality will be stored in a separate directory (src/java-1.4).
* An alternative would be to create or integrate this functionality in the Batik code.
* I should check that the output isn't restricted to 72 dpi [4].
Any input to tell me if these assertions are true is welcome.
Regards, Renaud
[1] http://java.sun.com/developer/sampsource/jai/
[2] http://cvs.apache.org/viewcvs.cgi/xml-batik/sources/org/apache/batik/ext/awt/image/codec/tiff/
[3] http://mail-archives.eu.apache.org/mod_mbox/xmlgraphics-general/200503.mbox/[EMAIL PROTECTED]
[4] http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=10756
Jeremias Maerki
RE: Got Plass' dissertation
Jeremias Maerki wrote: I got my copy of Michael Plass' dissertation today. A cursory overview shows that this document will provide some insight on using the Knuth element model for page breaking, but it also makes clear that we still have to come up with solutions for certain tricky problems that he didn't have to deal with back then. At any rate, the dissertation seems more helpful than the two documents we found by Brüggemann-Klein, Klein and Wohlfeil.

Hi Jeremias: I got my copy yesterday and spent some time with it as well. I wanted to weave some thoughts about it into the thread from a few days ago that dealt with this same topic. In that thread, you raised the same two issues that I have been troubled about for page-breaking: 1) differing page IPDs, and 2) side-floats.

WRT differing page IPDs, I think it is probably very acceptable in almost every case to simply change the rules. IOW, if you want to use layout strategy X, you must do so on a page-sequence-master that has all page IPDs the same and that uses the same column-count. (Plass acknowledges that the problem to be solved is two-dimensional, but specifically says that his algorithm is one-dimensional.) Whether dealing with business forms (invoices, purchase orders, etc.) or high-end book publishing (I have experience with both), I cannot think of a use case for differing IPDs on pages in the same page-sequence. If the goal is 100% standard compliance, this isn't good enough. However, if the goal is to get some work done for a client, and it doesn't hinder future 100% compliance, then this would be the way I would address it.

Side-floats are another story. Plass addresses them by adding a new primitive to the box/glue/penalty model, called insert (page 15). However, I thought the following was pretty interesting as well, drawn from the last paragraph of his dissertation:

"There are other ways of incorporating approximations into the pagination routine; for instance, inset figures could be handled satisfactorily by doing the pagination as if they were full width figures with the same area as the original figures; afterwards the paragraphs could be re-broken in order to fit them around the figures. The optimizing line breaking algorithm would be helpful in this regard, since it could break the text on the page in such a way that it would come out to the right number of lines. Layout artists traditionally use approximate copyfitting techniques to do such tasks, so methods like this hold much promise."

So, if an insert requiring 4 square inches is required, and the IPD is six inches, add 2/3 of an inch to the BPD of the block, do your copyfitting, then come back and lay the block out properly later.

After thinking through all of these papers and ideas, I am more convinced than ever of the utility of pluggable layout. But I guess you guys like branches better :-) Victor Mote
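The arithmetic in the copyfitting example above can be checked with a few lines of Java. This is only a sketch of the equal-area approximation from the quoted passage; the class and method names are made up for illustration and come from neither FOP nor the dissertation.

```java
// Hypothetical helper, not part of FOP: treats an inset figure as a
// full-width figure of equal area, as in the approximation quoted from Plass.
class InsetApproximation {

    /**
     * Returns the extra block-progression-dimension (BPD) to reserve for an
     * inset of the given area when it is treated as spanning the full
     * inline-progression-dimension (IPD). Units: square inches and inches.
     */
    static double extraBpd(double insetArea, double ipd) {
        if (ipd <= 0) {
            throw new IllegalArgumentException("IPD must be positive");
        }
        return insetArea / ipd;
    }

    public static void main(String[] args) {
        // 4 square inches on a 6 inch wide block: 4 / 6 = 2/3 of an inch
        System.out.println(extraBpd(4.0, 6.0));
    }
}
```

The result is only the placeholder height used during pagination; the block still has to be re-broken around the real inset afterwards, as the quoted passage says.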
OpenOffice hyphenation in Java - YESSSSSS
Look what I've just found: http://linux.org.mt/projects/jtextcheck/jtextcheck-ooohyph-plugin/index.html It's LGPL (and so are OO's hyphenation patterns if I remember correctly) but if we provide a plug-in mechanism and host the actual adapter under the LGPL at http://offo.sourceforge.net/ we could provide the same hyphenation functionality as OO. Info on LGPL use by Apache software: http://wiki.apache.org/jakarta/Using_20LGPL_27d_20code http://wiki.apache.org/jakarta/LicenceIssues Now, can we please get rid of all hyphenation patterns in FOP? Please, please. Just joking, but I'd feel better. Jeremias Maerki
Re: master-reference '' for fo:page-sequence matches no simple-page-master or page-sequence-master
Quite happy to. Done. Glen --- Simon Pepping [EMAIL PROTECTED] wrote: Glen, On Wed, Mar 09, 2005 at 11:34:29AM -0800, Glen Mazza wrote: The property on fo:conditional-page-master-reference should be master-reference, not master-name [1]. I had this problem too the other day. Can you not throw some validation towards detecting a missing master-reference attribute? The idea that FOP goes searching for an empty master-reference name is not quite satisfactory. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Jeremias Maerki wrote: Thanks to Glen for raising the issue. The ideal approach is if Oleg would pack up his TIFFRenderer and donate it to the ASF accompanied with a software grant [1], but Oleg is a FOP committer and has a CLA on file. So if Oleg attaches a ZIP with the sources for the TIFFRenderer (ALv2 already applied) to a Bugzilla entry along with a note that we may include it in FOP, that's good enough for me. It's not that the thing is a big application in itself even though some people would argue works like Renaud's AWT patch and Oleg's TIFFRenderer must go/run through the Incubator.
To make things even more complicated, TIFFRenderer is just a thin wrapper around some weirdly licensed [1] codec sources from Sun, called Java Advanced Imaging API 1.1.1 Sample Source [2], which include some provisional bits of JAI. I'm not sure if we want to use it. What about using full-blown JAI?
[1] http://www.tkachenko.com/fop/JAI_1.1.1_sample_io_sourcecodelic.10_23_01.txt
[2] http://java.sun.com/developer/sampsource/jai/
-- Oleg Tkachenko http://blog.tkachenko.com Multiconn Technologies, Israel
Re: Good job! / Re: Integration of TIFFRenderer in FOP
That's no problem, I think, because Batik already has a TIFF encoder [3] in its codebase, and we can move this code to the common area and use that. Shouldn't be difficult to adjust. Otherwise, I'd rather use ImageIO even if it's only available in JDKs >=1.4.
[3] http://cvs.apache.org/viewcvs.cgi/xml-batik/sources/org/apache/batik/ext/awt/image/codec/tiff/
On 09.03.2005 11:30:51 Oleg Tkachenko wrote:
Jeremias Maerki wrote: Thanks to Glen for raising the issue. The ideal approach is if Oleg would pack up his TIFFRenderer and donate it to the ASF accompanied with a software grant [1], but Oleg is a FOP committer and has a CLA on file. So if Oleg attaches a ZIP with the sources for the TIFFRenderer (ALv2 already applied) to a Bugzilla entry along with a note that we may include it in FOP, that's good enough for me. It's not that the thing is a big application in itself even though some people would argue works like Renaud's AWT patch and Oleg's TIFFRenderer must go/run through the Incubator.
To make things even more complicated, TIFFRenderer is just a thin wrapper around some weirdly licensed [1] codec sources from Sun, called Java Advanced Imaging API 1.1.1 Sample Source [2], which include some provisional bits of JAI. I'm not sure if we want to use it. What about using full-blown JAI?
[1] http://www.tkachenko.com/fop/JAI_1.1.1_sample_io_sourcecodelic.10_23_01.txt
[2] http://java.sun.com/developer/sampsource/jai/
-- Oleg Tkachenko http://blog.tkachenko.com Multiconn Technologies, Israel
Jeremias Maerki
Re: Integration of TIFFRenderer in FOP
Glen Mazza wrote: Yeah, Peter makes me want to do that sometimes myself... ;) Glen Glen, It's not difficult. I can give you some tips off-line if you like. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: Integration of TIFFRenderer in FOP
Jeremias Maerki wrote: Relationship to which PDF renderer? The one that directly creates PDF (PDFRenderer) or the one that creates PDF through JPS (normal PrintRenderer as defined in the Wiki painting to a Graphics2D instance provided by JPS) using a StreamPrintService? That's the two choices. Obviously, you will be taking the latter approach. If you wait a bit (until the common components area is set up) you'll have a neatly separated package to create PDF using JPS because I'll be publishing my proof-of-concept JPS StreamPrintService which you can build on. Hmm, this gives me another thing to talk about over in XML Graphics General. On 09.03.2005 00:53:16 Peter B. West wrote: This approach is obviously of interest to me, and I will follow developments closely. The wiki page is very general. If the Java2DRenderer is to be the (abstract) technical foundation, what will the relationship to the concrete PDF renderer be? The wiki is vague on this point. Jeremias, That will be extremely useful. However, I was trying to clarify the situation of PDFRenderer. The impression I got from Renaud's comment was that the Java2DRenderer was to be the basis of all renderers. Hence my interest. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: Integration of TIFFRenderer in FOP
Peter, Then my comment gave you a wrong impression: the Java2DRenderer is the (abstract) base for all renderers that use the Java2D API for rendering. The reference renderer is still the PDFRenderer, which inherits from AbstractRenderer directly. Renaud
Re: Integration of TIFFRenderer in FOP
No, definitely not. From what I learned from you, that's what you intend to do. FOP pursues a different strategy. I believe that you can't get the same quality PDF with all cool features with a PDF renderer that operates with a Java2DRenderer as its base. On 09.03.2005 12:34:20 Peter B. West wrote: That will be extremely useful. However, I was trying to clarify the situation of PDFRenderer. The impression I got from Renaud's comment was that the Java2DRenderer was to be the basis of all renderers. Hence my interest. Jeremias Maerki
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Renaud Richardet wrote: Peter, let me answer your last mail [1] here: You are right that the wiki is still vague about the detailed implementation of the different renderers. Actually, I hadn't started to think about it until today. I will put my ideas on the wiki tomorrow. I would be happy if you could put your input there, too.
Renaud, I don't have particular input. I haven't given the rendering any detailed thought at all, apart from the perception fostered by the presence of PDFGraphics2D, PDFGraphicsConfiguration, PDFGraphicsDevice and similar classes in other contexts, that a mapping of the Area tree to Java Graphics2D output could be translated very directly into PDF (and other formats). If that necessarily involves the JPS, so be it. In order to flesh these notions out, I will be taking maximum advantage of the expertise of others, including yourself. In the meantime, I continue to work on the generation of the Area tree. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: Integration of TIFFRenderer in FOP
Renaud Richardet wrote: Peter, Then my comment gave you a wrong impression: the Java2DRenderer is the (abstract) base for all renderers that use the Java2D API for rendering. The reference renderer is still the PDFRenderer, which inherits from AbstractRenderer directly. Renaud Renaud, Understood. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: Good job! / Re: Integration of TIFFRenderer in FOP
I downloaded Sun's codecs [2] that Oleg used in his TIFFRenderer. Jeremias, do you mean that we can legally just put those in the FOP code? The following codecs are included in [2]:
- TIFF
- JPEG
- PNG
- BMP
So it should be possible to create a renderer for each of these file formats. But do we need them all? Do we also need GIF encoding ([2] only supports GIF decoding)? If yes, we'll have to use other libraries like the ACME Labs GIF encoder (right?). Besides, I haven't understood yet whether Oleg will donate his code to Apache.
Otherwise, I'd rather use ImageIO even if it's only available in JDKs >=1.4.
I thought FOP should be 1.3 compliant [3]? So how do we get around that?
Regards, Renaud
[3] http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgId=1332332
Re: Good job! / Re: Integration of TIFFRenderer in FOP
On 09.03.2005 12:51:11 Renaud Richardet wrote:
I downloaded Sun's codecs [2] that Oleg used in his TIFFRenderer. Jeremias, do you mean that we can legally just put those in the FOP code?
This would have to be checked. I'd rather not, especially when we already have PNG and TIFF codecs available under the Apache license.
The following codecs are included in [2]: - TIFF - JPEG - PNG - BMP So it should be possible to create a renderer for each of these file formats. But do we need them all? Do we also need GIF encoding ([2] only supports GIF decoding)? If yes, we'll have to use other libraries like the ACME Labs GIF encoder (right?).
I would like to suggest that you implement TIFF and PNG output using Batik's codecs.
Besides, I haven't understood yet whether Oleg will donate his code to Apache.
I have the impression that he wants to. There are simply a few issues to look at. Looking at the possible licensing issues, I'd suggest Oleg simply donates his own classes (not the codec) to the FOP project by applying the Apache license and posting them as a Bugzilla issue. You can then use these classes to implement output via Batik's codecs. Or you simply reimplement the same functionality without copy/paste. :-) As he said, it's only a thin wrapper. The key is to have codecs with the right licensing.
Otherwise, I'd rather use ImageIO even if it's only available in JDKs >=1.4.
I thought FOP should be 1.3 compliant [3]? So how do we get around that?
That's right. But nothing stops us from providing additional code that's JDK 1.4 dependent as long as it's not core functionality and it's in a separate directory (src/java-1.4).
Regards, Renaud
[3] http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgId=1332332
Jeremias Maerki
Re: Skype-conference on page-breaking?
Sounds to me like 2) is the way to go right now. This would mean minimal recreation of vertical boxes in case of changing available IPD. Sure, this is an exotic case but XSL-FO makes it possible, therefore we must be prepared for it. Thanks for the hints and the helpful example.
On 08.03.2005 19:43:57 Luca Furini wrote:
Jeremias Maerki wrote: Luca, do you think your total-fit approach may be written in a way to handle changing available IPDs and that look-ahead can be disabled to improve processing speed at the cost of optimal break decisions?
I think that a first-fit algorithm could be implemented in two different ways:
1) wait until the list of elements representing a whole page-sequence is collected, and call findBreakingPoints(); this method will call a different considerLegalBreak() method, much simpler and faster than Knuth's one.
2) start building pages little by little: the FlowLM returns elements to the PageLM as soon as one of its own children returns them.
Alternative 1) is much like the total-fit algorithm: breaks are computed at the end of each page-sequence; even if the evaluation method is much faster than Knuth's one, there could still be a long wait in order to get the whole list.
With alternative 2) the PageLM would behave much the same as it does now: as soon as a page is filled, it is possible to call addAreas. Note that the last elements in the partial sequence cannot be considered as feasible breaks. For example, if there is a block which creates 6 lines, the sequence will be something like: box box penalty(not infinite) box penalty(not infinite) box box and the evaluation must stop at the second penalty; only when some following elements are known will it be possible to decide whether the last two lines could be at the end of a page.
If the IPD is always the same, I think the two alternatives are equivalent, and the first one is better because it just needs a different considerLegalBreak() method; as the output file cannot be printed until the end of the process, the only advantage of 2) could be memory usage.
That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit.
If the IPD changes, I fear 2) must necessarily be used: if a block is split between pages with different IPDs, only a few lines need to be recreated. Using 1), the LineLM should know how wide the lines are, but this cannot be known as page breaking has not yet started. The check could be done before starting the layout phase: if there is a change, 2) is used, otherwise 1). Maybe the check could be even more sophisticated: for example, if the first page is different, but the following ones are equally wide, we could use 2) to create the first page and then switch to 1).
A remaining question mark is with side-floats as they influence the available IPD on a line-to-line basis.
This is a question mark for me too! :-)
One thing for a deluxe strategy for book-style documents is certainly alignment of lines between facing pages. But that's something that's not important at the moment.
I have created and implemented a new property right about this! :-)
I'd be very interested to hear what you think about the difficulty of changing available IPD. The more I think about it, however, the more I think the total-fit model gets too complicated for what we/I need right now. But I'm unsure here.
If changing IPD is really important and not just a theoretical possibility, we could start by implementing 2) and later add the check and algorithm 1): the getNextKnuthElements() method in the block-level LMs could be used in both cases. Regards Luca Jeremias Maerki
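The first-fit decision rule discussed in this message can be sketched in Java as follows. This is a minimal illustration under strong assumptions, not FOP code: the class FirstFitSketch, its Element type and the INFINITE threshold are all hypothetical, glue (stretch and shrink) is left out, and a box that is taller than a page is simply allowed to overflow. It does show the point made above: a penalty is only committed as a break once a later box is known not to fit, so the trailing elements of a partial sequence are never treated as a final break.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not FOP code: a first-fit break chooser over a
// simplified box/penalty sequence (no glue, no stretch or shrink).
class FirstFitSketch {

    // Assumed convention: penalties at or above this value forbid a break.
    static final int INFINITE = 1000;

    static final class Element {
        final boolean isBox;
        final int value; // height for a box, penalty value for a penalty

        private Element(boolean isBox, int value) {
            this.isBox = isBox;
            this.value = value;
        }

        static Element box(int height) { return new Element(true, height); }
        static Element penalty(int p) { return new Element(false, p); }

        boolean isLegalBreak() { return !isBox && value < INFINITE; }
    }

    /** Returns the indices of the penalties chosen as page breaks. */
    static List<Integer> findBreaks(List<Element> elements, int pageBpd) {
        List<Integer> breaks = new ArrayList<>();
        int used = 0;             // box height placed since the last chosen break
        int lastLegal = -1;       // most recent legal break on the current page
        int usedAtLastLegal = 0;  // value of 'used' when that break was seen
        for (int i = 0; i < elements.size(); i++) {
            Element e = elements.get(i);
            if (e.isBox) {
                if (used + e.value > pageBpd && lastLegal >= 0) {
                    // The box does not fit: commit the last legal break and
                    // carry the boxes after it over to the next page.
                    breaks.add(lastLegal);
                    used -= usedAtLastLegal;
                    lastLegal = -1;
                    usedAtLastLegal = 0;
                }
                used += e.value;
            } else if (e.isLegalBreak()) {
                // Only remember the candidate; it is decided later.
                lastLegal = i;
                usedAtLastLegal = used;
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        List<Element> block = Arrays.asList(
                Element.box(10), Element.box(10), Element.penalty(0),
                Element.box(10), Element.penalty(0),
                Element.box(10), Element.box(10));
        System.out.println(findBreaks(block, 35)); // prints [4]
        System.out.println(findBreaks(block, 25)); // prints [2, 4]
    }
}
```

Because the loop never looks further ahead than the current element, the same rule could be fed incrementally as in alternative 2), or run over a complete list as in alternative 1).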
Re: Good job! / Re: Integration of TIFFRenderer in FOP
--- Jeremias Maerki [EMAIL PROTECTED] wrote: Otherwise, I'd rather use ImageIO even if it's only available in JDKs >=1.4. I thought FOP should be 1.3 compliant [3]? So how do we get around that? That's right. But nothing stops us from providing additional code that's JDK 1.4 dependent as long as it's not core functionality and it's in a separate directory (src/java-1.4).
BTW, does it have to be in a separate directory? Can we keep it in the directory it would otherwise be in if FOP were 1.4-based but somehow alter the Ant scripts to help the 1.3-only users? Glen
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Jeremias Maerki wrote: That's no problem, I think, because Batik has a TIFF encoder [3] already in their codebase and we can move this code to the common area and use that. Shouldn't be difficult to adjust. Last time I checked Batik's TIFF encoder was kinda limited WRT some TIFF compressions, and that's the reason I used the codec from Sun. That would be really nice to fix Batik's codec instead. -- Oleg Tkachenko http://blog.tkachenko.com Multiconn Technologies, Israel
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Jeremias Maerki wrote: I would like to suggest that you implement TIFF and PNG output using Batik's codecs. Yep, that's the best solution. But please check that Batik's TIFF codec supports all TIFF compressions Sun's codec does. 2 years ago it was sort of limited, particularly wrt fax compressions. I have the impression that he wants to. There are simply a few issues to look at. Looking at possible licensing issue I'd suggest Oleg simply donates his own classes (not the codec) to the FOP project by applying the Apache license and posting them as a Bugzilla issue. Ok, I will anyway, just to be at least a bit helpful here :) -- Oleg Tkachenko http://blog.tkachenko.com Multiconn Technologies, Israel
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Yes, please, because it's a lot easier to handle inside an IDE. You simply define an additional source folder if you're on JDK 1.4, and you don't get compile errors on JDK 1.3.
On 09.03.2005 16:34:39 Glen Mazza wrote:
--- Jeremias Maerki [EMAIL PROTECTED] wrote: Otherwise, I'd rather use ImageIO even if it's only available in JDKs >=1.4. I thought FOP should be 1.3 compliant [3]? So how do we get around that? That's right. But nothing stops us from providing additional code that's JDK 1.4 dependent as long as it's not core functionality and it's in a separate directory (src/java-1.4).
BTW, does it have to be in a separate directory? Can we keep it in the directory it would otherwise be in if FOP were 1.4-based but somehow alter the Ant scripts to help the 1.3-only users? Glen
Jeremias Maerki
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Ah, there's the catch. Yes, CCITT4, which is not supported by the code in Batik, is particularly interesting. But still, I think we don't have to support everything under JDK 1.3. I wonder how many people under JDK 1.3 would need that particular compression type. And if they really do, they will then have several examples of how to adjust the bitmap renderer for themselves. And an additional JAI implementation is certainly not a big deal after we have the first one.
On 09.03.2005 16:38:33 Oleg Tkachenko wrote:
Jeremias Maerki wrote: That's no problem, I think, because Batik has a TIFF encoder [3] already in their codebase and we can move this code to the common area and use that. Shouldn't be difficult to adjust.
Last time I checked Batik's TIFF encoder was kinda limited WRT some TIFF compressions, and that's the reason I used the codec from Sun. That would be really nice to fix Batik's codec instead.
Jeremias Maerki
Re: Good job! / Re: Integration of TIFFRenderer in FOP
--- Jeremias Maerki [EMAIL PROTECTED] wrote: Ah, there's the catch. Yes, CCITT4 is particularly interesting which is not supported by the code in Batik. But still, I think we don't have to
I don't think we have to support everything under JDK 1.3. Or anything, for that matter. 1.3 users can remain on 0.20.5 IMO, optionally downloading Oleg's TIFF patch if they need to. Glen
Re: cvs commit: xml-fop/src/java/org/apache/fop/render/awt/viewer PreviewDialog.java
--- Jeremias Maerki [EMAIL PROTECTED] wrote: RFC 2045 [1] says this: (1) Private values (starting with X-) may be defined bilaterally between two cooperating agents without outside registration or standardization. Such values cannot be registered or standardized. So to be on the safe side we would need to rename application/awt to application/X-awt. Sounds good--and acceptable for purists as well. Glen [1] http://www.faqs.org/rfcs/rfc2045.html
Re: Good job! / Re: Integration of TIFFRenderer in FOP
On 9 March 05, at 01:12, Glen Mazza wrote: ...[Thanks also to Bertrand for sending Renaud our way. This is the second quality developer--Peter Herweg being the other--that we have gotten from him since I've been on this project.]..
You're welcome - and you don't even know how many people I sent your way that did not make it ;-) (just kidding - I'm happy to contribute, even if it's just helping convince people to jump in) -Bertrand
Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
On 08.03.2005 03:18:21 Renaud Richardet wrote: [snip] On Mon, 28 Feb 2005, Jeremias Maerki wrote:
AbstractRenderer: I moved what I could reuse from PDFRenderer to AbstractRenderer: renderTextDecorations(), handleRegionTraits(), and added the needed empty methods.
I think that was good although only time will tell if this will hold for all renderers to come.
Eventually, I didn't modify AbstractRenderer, PDFRenderer and PS Renderer at all. The implementation of AWTRenderer is close to the other renderers, so that putting some methods in AbstractRenderer should not be a big problem.
I agree.
Speaking of startVParea(), could we rename it to something more meaningful? Proposition: TransformPosition, or something like this.
Actually, I like startVParea() (or rather startViewportArea, as I would rather call it) because a new transformation matrix is necessary only for viewports.
startViewportArea() is fine for me.
I think the Java2D approach is not unlike the PDF/PS approach. Adobe was Sun's closest partner when they developed the Java2D API.
I implemented a simple .bmp rendering (BMPReader.java). If there's a better way to render .bmp (JAI?), let me know.
This should not be necessary. We have a BMP implementation in org.apache.fop.images. The BMP bitmaps should be loaded through that mechanism.
OK, now I see. But how can I get an awt.Image from a FopImage?
I've modified your patch to demonstrate, but it needs some additional work to handle the different color models. Probably the image package should be extended to provide the necessary information/methods.
BTW, using Graphics.create() you should be able to create a copy of the current Graphics2D object. By pushing the old one on a stack and overwriting the graphics member variable you should be able to create the same effect as with currentState.push()/saveGraphicsState() in PDFRenderer.startVParea() and currentState.pop()/restoreGraphicsState() in endVParea(). When leaving a VP area you can simply restore an older Graphics2D object from the stack and continue painting. This will undo any transformations and state changes done in the copy used within the VP area. See the second paragraph in the javadocs of java.awt.Graphics.
Thanks for the hint. I did just that in AWTGraphicsState (same as PDFState). It holds all the context (font, colour, stroke, transformation) of the current graphics, and can act as a stack, too. I created an interface (RendererState) that could be implemented by all xxxState of the renderers. To be discussed...
I moved the interface into the awt package for now since it contains Java2D-dependent methods and classes. I don't see right now how the other renderers would fit into the picture. The PDF renderer's state class is in the pdf package and should probably stay there.
I also added a Debug button on the AWTRenderer window, which outlines the blocks. This is just a test, and I would like to develop a full-fledged visual debugger [1].
That's an interesting feature.
If this code works for you, then I'll start to separate the Java2DRenderer and the AWTRenderer. Otherwise, please tell me how I can improve my code.
I liked the patch. It's a big step forward. Keep it up!
Renaud
[1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D
Jeremias Maerki
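The Graphics.create() hint in this message could look roughly like the following. This is a sketch only, not the AWTGraphicsState class from the patch: the class name GraphicsStackSketch and its methods are hypothetical, and it uses java.util.ArrayDeque, which is newer than the JDK 1.3/1.4 targets discussed on this list (java.util.Stack would be the period-appropriate choice). Graphics.create(), Graphics.dispose() and Graphics2D.transform() are standard Java2D calls.

```java
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, not the patch's AWTGraphicsState: saves and restores
// Graphics2D state around a viewport area by working on a copy.
class GraphicsStackSketch {
    private final Deque<Graphics2D> saved = new ArrayDeque<>();
    private Graphics2D graphics;

    GraphicsStackSketch(Graphics2D initial) {
        this.graphics = initial;
    }

    Graphics2D current() {
        return graphics;
    }

    /** Pushes the current Graphics2D and continues painting on a copy. */
    void startViewportArea(AffineTransform at) {
        saved.push(graphics);
        graphics = (Graphics2D) graphics.create(); // copy of the current state
        graphics.transform(at);                    // affects the copy only
    }

    /** Drops the copy and restores the Graphics2D saved on entry. */
    void endViewportArea() {
        graphics.dispose();
        graphics = saved.pop();
    }

    public static void main(String[] args) {
        BufferedImage page = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        GraphicsStackSketch state = new GraphicsStackSketch(page.createGraphics());
        state.startViewportArea(AffineTransform.getTranslateInstance(20, 30));
        System.out.println(state.current().getTransform().getTranslateX()); // prints 20.0
        state.endViewportArea();
        System.out.println(state.current().getTransform().isIdentity());    // prints true
    }
}
```

The copy carries font, colour, stroke and transform, so anything changed inside the viewport area disappears when the saved object is popped, which mirrors the push/pop pairing described for PDFRenderer.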
DO NOT REPLY [Bug 33760] - [Patch] current AWTRenderer
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=33760. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=33760
[EMAIL PROTECTED] changed:
      What    |Removed |Added
        Status|NEW     |RESOLVED
    Resolution|        |FIXED
--- Additional Comments From [EMAIL PROTECTED] 2005-03-08 12:53 ---
Patch applied with modifications. Thanks a lot!
-- Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
By the way, Renaud, you should sign and send an ICLA (Individual Contributor License Agreement) to the ASF if you haven't done so already. http://www.apache.org/licenses/#clas Just so we're on the safe side. After all, you're not just doing little bug fixes. Thanks! Jeremias Maerki
Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
Thanks for integrating my patch in FOP.
But how can I get an awt.Image from a FopImage? I've modified your patch to demonstrate, but it needs some additional work to handle the different color models. Probably the image package should be extended to provide the necessary information/methods.
OK, I've put that on my TODO list.
I created an interface (RendererState) that could be implemented by all xxxState of the renderers. To be discussed... I moved the interface into the awt package for now since it contains Java2D-dependent methods and classes. I don't see right now how the other renderers would fit into the picture.
Neither do I right now, but I think it will become clearer in the future.
The PDF renderer's state class is in the pdf package and should probably stay there.
Yes, in any case.
If this code works for you, then I'll start to separate the Java2DRenderer and the AWTRenderer. Otherwise, please tell me how I can improve my code. I liked the patch. It's a big step forward. Keep it up!
/me happy
By the way, Renaud, you should sign and send an ICLA (Individual Contributor License Agreement) to the ASF if you haven't done so already.
I just did it last week. Thanks for reminding me.
Renaud
RE: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
Renaud Richardet wrote: 1. FOray has factored the FOP font logic into a separate module, cleaned it up significantly, and made some modest improvements. A few weeks ago, I aXSL-ized it as well, which means that it is written to a (theoretically) independent interface: http://cvs.sourceforge.net/viewcvs.py/axsl/axsl/axsl-font/src/java/org/axsl/font/ I think there is general support within FOP to implement the FOray/aXSL font work in the FOP 1.0 code, but so far no one has actually taken the time to do it. If you get into messing with fonts at all, I highly recommend that FOray be implemented before doing anything else. I will be happy to support efforts to that end. From what I understand now, your approach sounds good to me. But I'm missing some major pieces of the picture ATM to start implementing your aXSL interface in FOP. Please let me come back to you once I feel more comfortable with the font mechanism. Sure. The main thing I wanted to alert you about is that right now, AFAIK, the HEAD code is pretty similar to the maintenance branch code and integration *should be* relatively easy. If improvements are made to the HEAD code first, then issues of merging, etc. crop up that make integration difficult. That is OK too -- I just want to make sure that if it is done that way, it is done that way on purpose. Please feel free to contact me offline if you have implementation questions. Victor Mote
DO NOT REPLY [Bug 32612] - [PATCH] refactoring of knuth line breaking code.
http://issues.apache.org/bugzilla/show_bug.cgi?id=32612 [EMAIL PROTECTED] changed: Status: NEW -> RESOLVED; Resolution: (none) -> FIXED. --- Additional Comments From [EMAIL PROTECTED] 2005-03-08 16:05 --- Applied. This was long due and it helps me while preparing for page breaking. Reference in the mailing list archives: http://marc.theaimsgroup.com/?t=11026228562r=1w=2
Integration of TIFFRenderer in FOP
Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html
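A rough sketch of the structure described above, for illustration only: an abstract base class holds the shared Java2D painting code and each subclass supplies its own output path. All class and method names here (Java2DRendererSketch, BitmapRendererSketch, paintPage, renderToImage, output) are hypothetical and not FOP's actual API. The sketch also assumes page coordinates are in points (1/72 inch), so a resolution other than 72 dpi is just a scale factor, and it uses current Java syntax rather than what FOP targeted in 2005.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical sketch; class and method names are not FOP's actual API.
abstract class Java2DRendererSketch {

    /** Shared painting logic: draws one page onto any Graphics2D target. */
    protected void paintPage(Graphics2D g, int widthPt, int heightPt) {
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, widthPt, heightPt);
        g.setColor(Color.BLACK);
        g.drawRect(10, 10, widthPt - 20, heightPt - 20); // stand-in for real content
    }

    /** Rasterises a page; coordinates are assumed to be points (1/72 inch). */
    protected BufferedImage renderToImage(int widthPt, int heightPt, int dpi) {
        double scale = dpi / 72.0;
        int w = (int) Math.round(widthPt * scale);
        int h = (int) Math.round(heightPt * scale);
        BufferedImage image = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.scale(scale, scale); // same painting code, any resolution
            paintPage(g, widthPt, heightPt);
        } finally {
            g.dispose();
        }
        return image;
    }

    /** The concrete output path that each subclass supplies. */
    abstract String output(int widthPt, int heightPt);
}

/** Hypothetical bitmap subclass; a real one would hand the image to an encoder. */
class BitmapRendererSketch extends Java2DRendererSketch {
    private final int dpi;

    BitmapRendererSketch(int dpi) {
        this.dpi = dpi;
    }

    @Override
    String output(int widthPt, int heightPt) {
        BufferedImage image = renderToImage(widthPt, heightPt, dpi);
        return image.getWidth() + "x" + image.getHeight();
    }
}

class Java2DRendererDemo {
    public static void main(String[] args) {
        // A 100pt x 50pt page at 144 dpi becomes a 200x100 pixel image.
        System.out.println(new BitmapRendererSketch(144).output(100, 50));
    }
}
```

A screen preview or a print path would be further subclasses that pass a different Graphics2D to the same paintPage method.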
Re: Integration of TIFFRenderer in FOP
On Mar 8, 2005, at 7:23 AM, Renaud Richardet wrote: snip I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. snip I think that would be great! Just be careful when you do that the output isn't restricted to 72 dpi (IIRC, I had some problems in the past when I tried to use Oleg's TIFFRenderer; I ended up using ImageMagick to complete my task). Cheers! Web Maestro Clay -- [EMAIL PROTECTED] - http://homepage.mac.com/webmaestro/ My religion is simple. My religion is kindness. - HH The 14th Dalai Lama of Tibet
Re: Integration of TIFFRenderer in FOP
Renaud Richardet wrote: I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Unfortunately I'm sort of busy currently. Go ahead, that will be a great renderer. -- Oleg Tkachenko http://blog.tkachenko.com Multiconn Technologies, Israel
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: Luca, do you think your total-fit approach may be written in a way to handle changing available IPDs and that look-ahead can be disabled to improve processing speed at the cost of optimal break decisions? I think that a first-fit algorithm could be implemented in two different ways: 1) wait until the list of elements representing a whole page-sequence is collected, and call findBreakingPoints(); this method will call a different considerLegalBreak() method, much simpler and faster than Knuth's. 2) start building pages little by little: the FlowLM returns elements to the PageLM as soon as one of its own children returns them. Alternative 1) is much like the total-fit algorithm: breaks are computed at the end of each page-sequence; even if the evaluation method is much faster than Knuth's, there could still be a long wait in order to get the whole list. With alternative 2) the PageLM would behave much the same as it does now: as soon as a page is filled, it is possible to call addAreas. Note that the last elements in the partial sequence cannot be considered as feasible breaks. For example, if there is a block which creates 6 lines, the sequence will be something like: box box penalty(not infinite) box penalty(not infinite) box box and the evaluation must stop at the second penalty; only when some following elements are known will it be possible to decide whether the last two lines could be at the end of a page. If the IPD is always the same, I think the two alternatives are equivalent, and the first one is better because it just needs a different considerLegalBreak() method; as the output file cannot be printed until the end of the process, the only advantage of 2) could be memory usage. That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit. If the IPD changes, I fear 2) must necessarily be used: if a block is split between pages with different IPD, only a few lines need to be recreated. Using 1), the LineLM would have to know how wide the lines are, but this cannot be known as page breaking has not yet started. The check could be done before starting the layout phase: if there is a change, 2) is used, otherwise 1). Maybe the check could be even more sophisticated: for example, if the first page is different, but the following ones are equally wide, we could use 2) to create the first page and then switch to 1). A remaining question mark is with side-floats as they influence the available IPD on a line-to-line basis. This is a question mark for me too! :-) One thing for a deluxe strategy for book-style documents is certainly alignment of lines between facing pages. But that's something that's not important at the moment. I have created and implemented a new property for exactly this! :-) I'd be very interested to hear what you think about the difficulty of changing available IPD. The more I think about it, however, the more I think the total-fit model gets too complicated for what we/I need right now. But I'm unsure here. If changing IPD is really important and not just a theoretical possibility, we could start implementing 2), and later add the check and algorithm 1): the getNextKnuthElements() in the block-level LM could be used in both cases. Regards Luca
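To make the first-fit idea concrete, here is a minimal sketch, assuming a much-simplified element model with only line boxes and penalties. BreakElem and FirstFitBreaker are hypothetical names, and this is not the considerLegalBreak() code under discussion. A page break is only committed at the most recent finite penalty, and only once a later box no longer fits. Everything after the last committed break therefore stays undecided until more elements arrive, which is the behaviour described for partial sequences.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified element: a line box or a penalty between lines. */
final class BreakElem {
    final boolean isBox;
    final int height;       // used by boxes only
    final boolean infinite; // used by penalties only: true forbids a break

    private BreakElem(boolean isBox, int height, boolean infinite) {
        this.isBox = isBox;
        this.height = height;
        this.infinite = infinite;
    }

    static BreakElem box(int height) {
        return new BreakElem(true, height, false);
    }

    static BreakElem penalty(boolean infinite) {
        return new BreakElem(false, 0, infinite);
    }
}

final class FirstFitBreaker {
    /** Returns the indices of the penalties chosen as page breaks. */
    static List<Integer> findBreaks(List<BreakElem> elems, int pageHeight) {
        List<Integer> breaks = new ArrayList<>();
        int used = 0;           // height placed on the current page so far
        int lastFeasible = -1;  // most recent finite penalty on this page
        int usedAtFeasible = 0; // value of 'used' when that penalty was seen
        for (int i = 0; i < elems.size(); i++) {
            BreakElem e = elems.get(i);
            if (e.isBox) {
                if (used + e.height > pageHeight && lastFeasible >= 0) {
                    // The box does not fit: break at the last feasible penalty
                    // and carry the boxes after it over to the new page.
                    breaks.add(lastFeasible);
                    used -= usedAtFeasible;
                    lastFeasible = -1;
                    usedAtFeasible = 0;
                }
                used += e.height; // with no feasible break the page overflows
            } else if (!e.infinite) {
                lastFeasible = i;
                usedAtFeasible = used;
            }
        }
        return breaks;
    }
}

class FirstFitDemo {
    public static void main(String[] args) {
        // box box penalty box penalty box box, every line 10 units high
        List<BreakElem> elems = Arrays.asList(
                BreakElem.box(10), BreakElem.box(10), BreakElem.penalty(false),
                BreakElem.box(10), BreakElem.penalty(false),
                BreakElem.box(10), BreakElem.box(10));
        System.out.println(FirstFitBreaker.findBreaks(elems, 40));
    }
}
```

With a page height of 40, the fifth box no longer fits, so the break falls at the second penalty (index 4) and the last two lines move to the next page together.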
Re: Integration of TIFFRenderer in FOP
Renaud Richardet wrote: Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html Renaud, This approach is obviously of interest to me, and I will follow developments closely. The wiki page is very general. If the Java2DRenderer is to be the (abstract) technical foundation, what will the relationship to the concrete PDF renderer be? The wiki is vague on this point. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: Integration of TIFFRenderer in FOP
How to leave this group. Please help me to unsubscribe. --- Peter B. West [EMAIL PROTECTED] wrote: Renaud Richardet wrote: Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html Renaud, This approach is obviously of interest to me, and I will follow developments closely. The wiki page is very general. If the Java2DRenderer is to be the (abstract) technical foundation, what will the relationship to the concrete PDF renderer be? The wiki is vague on this point. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: Integration of TIFFRenderer in FOP
Send an e-mail to: [EMAIL PROTECTED] Be sure to use the e-mail address from which you subscribed. On Mar 8, 2005, at 4:03 PM, vivek gupta wrote: How to leave this group. Please help me to unsubscribe. --- Peter B. West [EMAIL PROTECTED] wrote: Renaud Richardet wrote: Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html Renaud, This approach is obviously of interest to me, and I will follow developments closely. The wiki page is very general. If the Java2DRenderer is to be the (abstract) technical foundation, what will the relationship to the concrete PDF renderer be? The wiki is vague on this point. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/ Web Maestro Clay -- [EMAIL PROTECTED] - http://homepage.mac.com/webmaestro/ My religion is simple. My religion is kindness. - HH The 14th Dalai Lama of Tibet
Good job! / Re: Integration of TIFFRenderer in FOP
Team, Oleg's TIFF Renderer is under the Mozilla license[1], not the Apache one (also apparently some of the code is from Sun?). Is the former compatible with the latter? If not, I would like Oleg to switch the license on it before we proceed further in putting it into FOP. Renaud--thanks for your fantastic work with the AWT Renderer. You clearly have ace technical skills, enthusiasm, organization, and you write beautifully. You have a bright future ahead of you. [Thanks also to Bertrand for sending Renaud our way. This is the second quality developer--Peter Herweg being the other--that we have gotten from him since I've been on this project.] Regards, Glen [1] http://www.tkachenko.com/fop/tiffrenderer.html --- Renaud Richardet [EMAIL PROTECTED] wrote: Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html
Re: Integration of TIFFRenderer in FOP
Yeah, Peter makes me want to do that sometimes myself... ;) Glen --- vivek gupta [EMAIL PROTECTED] wrote: How to leave this group. Please help me to unsubscribe. --- Peter B. West [EMAIL PROTECTED] wrote: Renaud Richardet wrote: Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html Renaud, This approach is obviously of interest to me, and I will follow developments closely. The wiki page is very general. If the Java2DRenderer is to be the (abstract) technical foundation, what will the relationship to the concrete PDF renderer be? The wiki is vague on this point. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/
Re: cvs commit: xml-fop/src/java/org/apache/fop/render/awt/viewer PreviewDialog.java
The application/awt MIME type doesn't exist. I think Jeremias wanted this to be null instead for output types that lack a MIME type, correct? Thanks, Glen --- [EMAIL PROTECTED] wrote: +/** The MIME type for AWT-Rendering */ public static final String MIME_TYPE = "application/awt";
Good job! / Re: Integration of TIFFRenderer in FOP
Glen, Thanks for your mail. It's good you raised the legal issue. Peter, let me answer your last mail [1] here: You are right that the wiki is still vague about the detailed implementation of the different renderers. Actually, I hadn't started to think about it until today. I will put my ideas on the wiki tomorrow. I would be happy if you could add your input there, too. Regards, Renaud [1] http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=10759 -- renaud.richardet (at) gmail (dot) com +41 78 675 9501 www.oslutions.com
Re: cvs commit: xml-fop/src/java/org/apache/fop/render/awt/viewer PreviewDialog.java
No. It's quite ok like this. It is in line with my vision of how renderers should be made available to FOP in the future (dynamic registration like the FOP extensions). It's clear that the AWT preview doesn't produce a certain type of file that has an officially defined MIME type. But nobody is blocked from creating a new, special one for special purposes. Only if you want to be purist about MIME types is this probably suboptimal. The MIME type is ideal for choosing a renderer. JPS, for example, works in much the same way. RFC 2045 [1] says this: (1) Private values (starting with "X-") may be defined bilaterally between two cooperating agents without outside registration or standardization. Such values cannot be registered or standardized. So to be on the safe side we would need to rename application/awt to application/X-awt. [1] http://www.faqs.org/rfcs/rfc2045.html On 09.03.2005 01:31:24 Glen Mazza wrote: The application/awt MIME type doesn't exist. I think Jeremias wanted this to be null instead for output types that lack a MIME type, correct? Thanks, Glen --- [EMAIL PROTECTED] wrote: +/** The MIME type for AWT-Rendering */ public static final String MIME_TYPE = "application/awt";
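As an illustration of choosing a renderer by MIME type with dynamic registration, here is a minimal sketch. RendererRegistrySketch, RendererSketch and AwtPreviewSketch are hypothetical names, and none of this is FOP's actual registration mechanism. It assumes a private "X-" type is used for the AWT preview, as suggested above, and it compares MIME types case-insensitively. It is written in current Java (generics, method references) for brevity.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical minimal renderer contract, for illustration only. */
interface RendererSketch {
    String name();
}

/** Hypothetical stand-in for the AWT preview renderer. */
final class AwtPreviewSketch implements RendererSketch {
    @Override
    public String name() {
        return "awt-preview";
    }
}

/** Maps MIME types to renderer factories; renderers register themselves. */
final class RendererRegistrySketch {
    private final Map<String, Supplier<RendererSketch>> makers = new ConcurrentHashMap<>();

    void register(String mimeType, Supplier<RendererSketch> maker) {
        makers.put(normalize(mimeType), maker);
    }

    RendererSketch create(String mimeType) {
        Supplier<RendererSketch> maker = makers.get(normalize(mimeType));
        if (maker == null) {
            throw new IllegalArgumentException("No renderer registered for " + mimeType);
        }
        return maker.get();
    }

    /** MIME type and subtype names are matched case-insensitively. */
    private static String normalize(String mimeType) {
        return mimeType.trim().toLowerCase(Locale.ROOT);
    }
}

class RendererRegistryDemo {
    public static void main(String[] args) {
        RendererRegistrySketch registry = new RendererRegistrySketch();
        // Private, unregistered type marked with the "X-" prefix
        registry.register("application/X-awt", AwtPreviewSketch::new);
        System.out.println(registry.create("application/x-awt").name());
    }
}
```

Because the lookup key is just a string, a third-party renderer could register under its own private type without any change to the core.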
Re: Integration of TIFFRenderer in FOP
Relationship to which PDF renderer? The one that directly creates PDF (PDFRenderer) or the one that creates PDF through JPS (normal PrintRenderer as defined in the Wiki painting to a Graphics2D instance provided by JPS) using a StreamPrintService? Those are the two choices. Obviously, you will be taking the latter approach. If you wait a bit (until the common components area is set up) you'll have a neatly separated package to create PDF using JPS because I'll be publishing my proof-of-concept JPS StreamPrintService which you can build on. Hmm, this gives me another thing to talk about over in XML Graphics General. On 09.03.2005 00:53:16 Peter B. West wrote: This approach is obviously of interest to me, and I will follow developments closely. The wiki page is very general. If the Java2DRenderer is to be the (abstract) technical foundation, what will the relationship to the concrete PDF renderer be? The wiki is vague on this point. Jeremias Maerki
Re: Good job! / Re: Integration of TIFFRenderer in FOP
Thanks to Glen for raising the issue. The ideal approach is if Oleg would pack up his TIFFRenderer and donate it to the ASF accompanied with a software grant [1], but Oleg is a FOP committer and has a CLA on file. So if Oleg attaches a ZIP with the sources for the TIFFRenderer (ALv2 already applied) to a Bugzilla entry along with a note that we may include it in FOP, that's good enough for me. It's not that the thing is a big application in itself even though some people would argue works like Renaud's AWT patch and Oleg's TIFFRenderer must go/run through the Incubator. [1] http://www.apache.org/licenses/#grants On 09.03.2005 01:12:05 Glen Mazza wrote: Team, Oleg's TIFF Renderer is under the Mozilla license[1], not the Apache one (also apparently some of the code is from Sun?). Is the former compatible with the latter? If not, I would like Oleg to switch the license on it before we proceed further in putting it into FOP. Renaud--thanks for your fantastic work with the AWT Renderer. You clearly have ace technical skills, enthusiasm, organization, and you write beautifully. You have a bright future ahead of you. [Thanks also to Bertrand for sending Renaud our way. This is the second quality developer--Peter Herweg being the other--that we have gotten from him since I've been on this project.] Regards, Glen [1] http://www.tkachenko.com/fop/tiffrenderer.html --- Renaud Richardet [EMAIL PROTECTED] wrote: Oleg, I'm currently working on the AWTRenderer. The basic idea is to create a Java2DRenderer which provides the (abstract) technical foundation. Other renderers can subclass Java2DRenderer and provide the concrete output paths [1]. I think it would be a good idea to integrate your TIFFRenderer, as you propose in [2]. Would you like to integrate it yourself? Otherwise I would like to do it. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D [2] http://www.tkachenko.com/fop/fop.html Jeremias Maerki
Re: Skype-conference on page-breaking?
Thanks, Luca. I had a nice casual talk on the phone with Simon yesterday. Essentially, we only talked about very high-level stuff, especially the decision for a certain strategy (or two). You know I came up with the idea to create a simpler best-fit strategy with no look-ahead for invoice-style documents, but maybe it would be possible to design your obvious total-fit strategy in a way that it could be used as a best-fit without look-ahead. The problem, like I mentioned already, is the possible change of available IPD within a page-sequence, which results in possible back-tracking and recalculation of vertical boxes. Of course, if it's possible to stay with one page-breaking algorithm for all use cases that would be best (because of the reduced effort), but only if the algorithm is reasonably fast for invoice-style documents. I'm repeatedly confronted with certain speed requirements in this case. Since modern high-volume single-feed printers handle about 180 pages per minute (continuous feed systems handle over 4 times that speed, but I think that's neither relevant nor realistic here), FOP should be able to operate close to these 180 pages per minute for not too complex documents on a modern server. That means about 333 ms per page (60 s / 180). Not much. Of course, in such an environment it is possible to distribute the formatting process over several blade servers, but I had to realize that certain companies tend to prefer spending 100'000 dollars on a big server rather than spending a lot less on a much faster CPU-power-oriented setup. It seems to be hard to say good-bye to the old host systems. Well, that's just what reality looks like in my environment. Simon, for example, is much more interested in book-style documents where there are other requirements. Speed is not a big issue, but quality is. In the end, I think we need to rate the chosen approach from these two points of view.
These are quite contradictory requirements, and it seems important to me not to forget that here. Luca, do you think your total-fit approach may be written in a way to handle changing available IPDs and that look-ahead can be disabled to improve processing speed at the cost of optimal break decisions? If it's ok for you (and feasible) I'd like to integrate what you already have (in code) into that branch I was talking about. I would like to avoid recreating something you've already started, even if it doesn't work with the changes that happened in the last weeks. Even if we may create two different strategies I'm sure that certain parts will be shared by both approaches, like the creation of Knuth-style elements for the PageLM. Some more comments inline: On 04.03.2005 13:23:01 Luca Furini wrote: Jeremias Maerki wrote: Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. Ok, I'll try to. The main change in the LineLM is that the line breaking algorithm does not select only the node in activeList with fewest demerits: all the nodes whose demerits are <= a threshold are used to create LineBreakPositions, so for each paragraph there is a set of layout options (for example, a paragraph could create 8 to 10 lines, 9 being the layout with fewest demerits). Hmm, that's a feature that I would say is something that only book-style documents will need. Invoice-style documents could live without it.
According to the value of widows and orphans, the LineLM creates a sequence of elements: besides normal lines, represented by a box, there are optional lines, represented by box(0) penalty(inf,0) glue(0,1,0) box(0) and removable lines box(0) penalty(inf,0) glue(1,0,1) box(0) A few complications arise if not every possible layout allows breaks between lines, but they all can be solved using boxes, glues and penalties (for example, if a paragraph needs 3 or 4 lines, if it uses 3 it cannot be parted). Also something that's not all too important for invoice-style documents, although it can't hurt to have it. The BlockLM, and a block stacking LM in general, adds elements representing its children's spaces and keep condition, for example adding a 0 penalty or an infinite penalty according to child1.mustKeepWithNext(), child2.mustKeepWithPrevious() and this.mustKeepTogether(). That's certainly a must-have in any case. The PageLM, once it has the list of elements representing a whole page-sequence (or the content before a forced page break), calls the same breaking algorithm, using only a different selection method which leaves only one node in activeList. That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit. A remaining question mark is with side-floats as they influence
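The optional and removable line patterns quoted above can be written out as a small sketch. It reads glue(a,b,c) as (width, stretch, shrink) and penalty(inf,0) as (penalty value, width), both of which are assumptions about the notation, and it measures everything in whole lines. KnuthElemSketch and ParagraphElements are hypothetical names, not the LineLM code being discussed, and the penalties that express widows, orphans and keeps between normal lines are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical Knuth-style element; all sizes are counted in whole lines. */
final class KnuthElemSketch {
    enum Kind { BOX, GLUE, PENALTY }

    static final int INFINITE = Integer.MAX_VALUE;

    final Kind kind;
    final int width;
    final int stretch;
    final int shrink;
    final int penalty;

    private KnuthElemSketch(Kind kind, int width, int stretch, int shrink, int penalty) {
        this.kind = kind;
        this.width = width;
        this.stretch = stretch;
        this.shrink = shrink;
        this.penalty = penalty;
    }

    static KnuthElemSketch box(int width) {
        return new KnuthElemSketch(Kind.BOX, width, 0, 0, 0);
    }

    static KnuthElemSketch glue(int width, int stretch, int shrink) {
        return new KnuthElemSketch(Kind.GLUE, width, stretch, shrink, 0);
    }

    static KnuthElemSketch penalty(int value, int width) {
        return new KnuthElemSketch(Kind.PENALTY, width, 0, 0, value);
    }
}

final class ParagraphElements {
    /** Elements for a paragraph that can be set in minLines..maxLines lines. */
    static List<KnuthElemSketch> build(int minLines, int optLines, int maxLines) {
        if (minLines < 0 || minLines > optLines || optLines > maxLines) {
            throw new IllegalArgumentException("need 0 <= min <= opt <= max");
        }
        List<KnuthElemSketch> out = new ArrayList<>();
        for (int i = 0; i < minLines; i++) {
            out.add(KnuthElemSketch.box(1)); // normal line
        }
        for (int i = 0; i < optLines - minLines; i++) {
            addAdjustable(out, KnuthElemSketch.glue(1, 0, 1)); // removable line
        }
        for (int i = 0; i < maxLines - optLines; i++) {
            addAdjustable(out, KnuthElemSketch.glue(0, 1, 0)); // optional line
        }
        return out;
    }

    /** box(0) penalty(inf,0) glue box(0): the glue itself is never a break. */
    private static void addAdjustable(List<KnuthElemSketch> out, KnuthElemSketch glue) {
        out.add(KnuthElemSketch.box(0));
        out.add(KnuthElemSketch.penalty(KnuthElemSketch.INFINITE, 0));
        out.add(glue);
        out.add(KnuthElemSketch.box(0));
    }

    /** Natural height: the widths of all boxes and glues. */
    static int natural(List<KnuthElemSketch> elems) {
        int sum = 0;
        for (KnuthElemSketch e : elems) {
            if (e.kind != KnuthElemSketch.Kind.PENALTY) {
                sum += e.width;
            }
        }
        return sum;
    }

    /** Smallest height: natural height minus all available shrink. */
    static int shortest(List<KnuthElemSketch> elems) {
        int sum = natural(elems);
        for (KnuthElemSketch e : elems) {
            sum -= e.shrink;
        }
        return sum;
    }

    /** Largest height: natural height plus all available stretch. */
    static int longest(List<KnuthElemSketch> elems) {
        int sum = natural(elems);
        for (KnuthElemSketch e : elems) {
            sum += e.stretch;
        }
        return sum;
    }
}
```

For the 8-to-10-line paragraph mentioned earlier in the thread, build(8, 9, 10) yields eight normal boxes, one removable line and one optional line: a natural height of 9 lines that can shrink to 8 or stretch to 10.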
Re: Skype-conference on page-breaking?
I don't know why this is important to you but it's two to three months. On 04.03.2005 12:40:04 Peter B. West wrote: Jeremias Maerki wrote: Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. How short? Peter -- Peter B. West http://cv.pbw.id.au/ Project Folio http://defoe.sourceforge.net/folio/ Jeremias Maerki
Re: FOP at ApacheCon Europe 2005?
FYI, I've just given myself a shove, followed Bertrand's suggestion and submitted a session proposal for ApacheCon. I feel that our project should be present there. I was also thinking about something like "hidden treasures in the XML Graphics project" but I guess there's not so much meat on that bone to fill one hour. ApacheCon Europe 2005 CFP submission Submitter: Jeremias Maerki [EMAIL PROTECTED] Title: Apache FOP: Optimizing speed and memory consumption Level: Experienced Style: Orientation: Developer Duration: 60 Categories: Abstract: Apache FOP is the most popular XSL-FO implementation on the market. It is used in a wide variety of use cases to create documents in PDF, PostScript and other formats. This session will show a number of techniques to improve processing speed and hints on how to handle things like OutOfMemoryErrors. It will also contain a short info block about the state and the future of the project. On 12.02.2005 10:57:15 Bertrand Delacretaz wrote: On 8 Feb 05, at 19:29, Jeremias Maerki wrote: Most of you will probably have heard that ApacheCon Europe will be happening in July. I think it would be great if FOP would somehow be visible there. There's a call for participation ending 2005-03-04. Any ideas? A recurring question in my consulting work is "is FOP fast or what?" or, more precisely, "how to tune XSL-FO for FOP to run efficiently", mostly in view of avoiding memory bottlenecks. Me, I'm not using FOP hands-on enough these days to answer very precisely; I usually just tell them to test their performance on large documents very regularly during development, to avoid surprises. But maybe one of you FOP gurus could give a presentation with more precise information about this? Just my 2 cents. -Bertrand Jeremias Maerki
Re: FOP at ApacheCon Europe 2005?
Fantastic! I hope to be able to do the same someday. Glen --- Jeremias Maerki [EMAIL PROTECTED] wrote: FYI, I've just given myself a shove, followed Bertrand's suggestion and submitted a session proposal for ApacheCon. I feel that our project should be present there. I was also thinking about something like "hidden treasures in the XML Graphics project" but I guess there's not so much meat on that bone to fill one hour. ApacheCon Europe 2005 CFP submission Submitter: Jeremias Maerki [EMAIL PROTECTED] Title: Apache FOP: Optimizing speed and memory consumption Level: Experienced Style: Orientation: Developer Duration: 60 Categories: Abstract: Apache FOP is the most popular XSL-FO implementation on the market. It is used in a wide variety of use cases to create documents in PDF, PostScript and other formats. This session will show a number of techniques to improve processing speed and hints on how to handle things like OutOfMemoryErrors. It will also contain a short info block about the state and the future of the project. On 12.02.2005 10:57:15 Bertrand Delacretaz wrote: On 8 Feb 05, at 19:29, Jeremias Maerki wrote: Most of you will probably have heard that ApacheCon Europe will be happening in July. I think it would be great if FOP would somehow be visible there. There's a call for participation ending 2005-03-04. Any ideas? A recurring question in my consulting work is "is FOP fast or what?" or, more precisely, "how to tune XSL-FO for FOP to run efficiently", mostly in view of avoiding memory bottlenecks. Me, I'm not using FOP hands-on enough these days to answer very precisely; I usually just tell them to test their performance on large documents very regularly during development, to avoid surprises. But maybe one of you FOP gurus could give a presentation with more precise information about this? Just my 2 cents. -Bertrand Jeremias Maerki
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: I don't know why this is important to you Just curious. but it's two to three months. Ouch. Good luck. You might want to keep an eye on Folio. Peter On 04.03.2005 12:40:04 Peter B. West wrote: Jeremias Maerki wrote: Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. How short? -- Peter B. West http://cv.pbw.id.au/ Project Folio http://defoe.sourceforge.net/folio/
Re: future of FOP
Michael, if you follow the fop-dev mailing list you will realize that the development has not come to a standstill. It is true that the last release is almost two years old. We're in a redesign phase which tries to address exactly the issue of keeps, among other things. The redesign took a lot longer than anticipated. But we're on the right track, so we can start releasing again later this year, complete with keeps. If you can't work around the missing keeps (they work on table-rows) and you need an immediate solution, you will need to switch to a different solution for the time being. I understand that IBM is quite big in the document business. It would be very interesting if IBM committed to supporting FOP like they do for other open source projects here at the Apache Software Foundation. As far as I know IBM even has its own implementation of XSL-FO, although I don't know if it's actively maintained. On 07.03.2005 16:27:33 Michael Iwaniewicz wrote: Dear FOP developers, we are a big sw-development and decided recently to change our old bookmaster/afp based print component to XSL-FO. As part of our solution we started to use FOP but ran into formatting problems in the area of the keep-together and keep-with-next options. We got the impression that the FOP development came to a kind of standstill, since the current version is dated from 2003. I just wanted to ask you if our impression is correct. We now have to decide if we change from FOP to XEP or XSL-Formatter. Thanks for your help, Michael Michael Iwaniewicz CHIS Architecture Office: (43-1) 21145-6446 Mobile:(43) (0) 664-618-5839 Jeremias Maerki
Re: future of FOP
Jeremias Maerki wrote: snip/ I understand that IBM is quite big in the document business. It would be very interesting if IBM committed to supporting FOP like they do for other open source projects here at the Apache Software Foundation. As far as I know IBM even has its own implementation of XSL-FO although I don't know if it's actively maintained. I guess you mean the alphaworks XFC project? It is not maintained at all. I posted an "are you still alive?" question back in 2003, still waiting for a reply ;-) Chris
Re: FOP at ApacheCon Europe 2005?
Jeremias Maerki wrote: I was also thinking about something like "hidden treasures in the XML Graphics project" but I guess there's not so much meat on that bone to fill one hour. Well, there should be enough for an hour, at least in theory. I couldn't convince my boss (yet) that I have an important mission in Stuttgart in July. If I could, I'd probably talk about: - Handling fonts in Java, why the AWT font and text rendering subsystem is lame, and what FOP, Batik and perhaps others would expect from an API. - How to implement flowing text, line breaking and hyphenation efficiently; why the Java BreakIterator and other parts of the Java Unicode support sux0rs; what's behind TR14; Unicode normalization of text before looking it up in a dictionary, and efficient implementation of said dictionary for looking up all substrings in a word (using a trie, a PATRICIA tree or whatever) - Talk about the question why the algorithms aren't simply copied from Gecko (the Mozilla layout engine) Now that the deadline has been extended, I'll attempt it again. J.Pietschmann
Re: FOP at ApacheCon Europe 2005?
Cool, that would be great stuff. Let's hope your boss lets you off the leash. On 07.03.2005 23:57:50 J.Pietschmann wrote: Jeremias Maerki wrote: I was also thinking about something like hidden treasures in the XML Graphics project but I guess there's not so much meat on that bone to fill one hour. Well, there should be enough for an hour, at least in theory. I couldn't convince (yet) my boss that I have an important mission in Stuttgart in July. If I could, I'd probably talk about:
- Handling fonts in Java, why the AWT font and text rendering subsystem is lame, and what FOP, Batik and perhaps others would expect from an API.
- How to implement flowing text, line breaking and hyphenation efficiently; why the Java BreakIterator and other parts of the Java Unicode support sux0rs; what's behind TR14; Unicode normalization of text before looking it up in a dictionary, and efficient implementation of said dictionary for looking up all substrings in a word (using a trie, a PATRICIA tree or whatever)
- Talk about the question why the algorithms aren't simply copied from Gecko (the Mozilla layout engine)
Now that the deadline has been extended, I'll attempt it again. J.Pietschmann Jeremias Maerki
Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
I worked on my patch and tried to integrate your input. There are still many issues, but I think the basic structure is OK. You can find a patch attached to bug 33760. Comments inline: On Mon, 28 Feb 2005, Victor Mote wrote: 1. FOray has factored the FOP font logic into a separate module, cleaned it up significantly, and made some modest improvements. A few weeks ago, I aXSL-ized it as well, which means that it is written to a (theoretically) independent interface: http://cvs.sourceforge.net/viewcvs.py/axsl/axsl/axsl-font/src/java/org/axsl/font/ I think there is general support within FOP to implement the FOray/aXSL font work in the FOP 1.0 code, but so far no one has actually taken the time to do it. If you get into messing with fonts at all, I highly recommend that FOray be implemented before doing anything else. I will be happy to support efforts to that end. From what I understand now, your approach sounds good to me. But I'm missing some major pieces of the picture ATM to start implementing your aXSL interface in FOP. Please let me come back to you when I feel more comfortable with the font mechanism. On Mon, 28 Feb 2005, Jeremias Maerki wrote: AbstractRenderer: I moved what I could reuse from PDFRenderer to AbstractRenderer: renderTextDecorations(), handleRegionTraits(), and added the needed empty methods. I think that was good although only time will tell if this will hold for all renderers to come. In the end, I didn't modify AbstractRenderer, PDFRenderer and PSRenderer at all. The implementation of AWTRenderer is close to the other renderers, so putting some methods in AbstractRenderer should not be a big problem. Speaking of startVParea(), could we rename it to something more meaningful? Proposition: TransformPosition, or something like this. Actually, I like startVParea() (or rather startViewportArea like I would rather call it) because only for viewport a new transformation matrix is necessary. startViewportArea() is fine for me.
I think the Java2D approach is not unlike the PDF/PS approach. Adobe was Sun's closest partner when they developed the Java2D API. I implemented a simple .bmp rendering (BMPReader.java). If there's a better way to render .bmp (JAI?), let me know. This should not be necessary. We have a BMP implementation in org.apache.fop.images. The BMP bitmaps should be loaded through that mechanism. OK, now I see. But how can I get an awt.Image from a FopImage? BTW, using Graphics.create() you should be able to create a copy of the current Graphics2D object. By pushing the old one on a stack and overwriting the graphics member variable you should be able to create the same effect as with currentState.push()/saveGraphicsState() in PDFRenderer.startVParea() and currentState.pop()/restoreGraphicsState() in endVParea(). When leaving a VP area you can simply restore an older Graphics2D object from the stack and continue painting. This will undo any transformations and state changes done in the copy used within the VP area. See second paragraph in javadocs of java.awt.Graphics. Thanks for the hint. I did just that in AWTGraphicsState (same as PDFState). It holds all the context (font, colour, stroke, transformation) of the current graphics, and can act as a stack, too. I created an interface (RendererState) that could be implemented by all xxxState of the renderers. To be discussed... I also added a Debug button on the AWTRenderer window, which outlines the blocks. This is just a test, and I would like to develop a full-fledged visual debugger [1]. If this code works for you, then I'll start to separate the Java2DRenderer and the AWTRenderer. Otherwise, please tell me how I can improve my code. Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D
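The Graphics.create() hint in the message above can be sketched roughly as follows. This is not the AWTGraphicsState class from the patch; GraphicsStateStack and its method names are hypothetical, and the only library calls used are Graphics.create(), Graphics.dispose() and the standard Graphics2D transform methods.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper; not the AWTGraphicsState class from the patch. */
class GraphicsStateStack {
    private final Deque<Graphics2D> saved = new ArrayDeque<>();
    private Graphics2D current;

    GraphicsStateStack(Graphics2D initial) {
        this.current = initial;
    }

    Graphics2D current() {
        return current;
    }

    /** Comparable to saveGraphicsState(): keep the old context and continue on a copy. */
    void push() {
        saved.push(current);
        current = (Graphics2D) current.create();
    }

    /** Comparable to restoreGraphicsState(): drop the copy and resume the old context. */
    void pop() {
        current.dispose();
        current = saved.pop();
    }

    public static void main(String[] args) {
        BufferedImage page = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        GraphicsStateStack state = new GraphicsStateStack(page.createGraphics());

        state.push();                        // entering a viewport area
        state.current().translate(20, 30);   // affects only the copy
        System.out.println(state.current().getTransform().getTranslateX()); // prints 20.0

        state.pop();                         // leaving the viewport area
        System.out.println(state.current().getTransform().isIdentity());    // prints true
    }
}
```

Because every change is made on the copy, popping needs no inverse transform: the older Graphics2D object was never modified.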
DO NOT REPLY [Bug 33760] - [Patch] current AWTRenderer
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=33760. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=33760
[EMAIL PROTECTED] changed:
What                          |Removed |Added
Attachment #14371 is obsolete |0       |1
Attachment #14372 is obsolete |0       |1
--- Additional Comments From [EMAIL PROTECTED] 2005-03-08 03:25 ---
Created an attachment (id=14426) -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14426&action=view) patch against head for AWTRenderer
-- Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
DO NOT REPLY [Bug 33871] New: - problem with fo:simple-page-master or fo:page-sequence while creating PDF using FOP 0.20.5
http://issues.apache.org/bugzilla/show_bug.cgi?id=33871
Summary: problem with fo:simple-page-master or fo:page-sequence while creating PDF using FOP 0.20.5
Product: Fop
Version: 0.20.5
Platform: PC
OS/Version: Windows NT
Status: NEW
Severity: major
Priority: P3
Component: page-master/layout
AssignedTo: fop-dev@xml.apache.org
ReportedBy: [EMAIL PROTECTED]
I am generating PDF using XSLT and Formatting Objects. I am not getting any error, but earlier (with 0.20.4) the content was shown with a border; after the FOP version upgrade the border disappeared. Why is this happening? Has anything changed in FOP in this regard? I have upgraded FOP from 0.20.4 to 0.20.5 and I am using the jar files fop_020_5.jar and batik_fop_020_5.jar instead of fop17.jar, fop.jar and batik.jar. Have a look at this, which is what I use with 0.20.5 after the upgrade:
    <fo:layout-master-set>
      <fo:simple-page-master master-name="letter" page-height="{$pageHeight}in" page-width="{$pageWidth}in" margin-left="{$marginLeft}in" margin-right="{$marginRight}in">
        <fo:region-body margin="1in"/>
        <fo:region-before extent="1in" padding="0pt" margin-top="0.4in"/>
        <fo:region-after extent="1in" padding="0pt" margin-bottom="0.5in"/>
      </fo:simple-page-master>
    </fo:layout-master-set>
    <fo:page-sequence master-reference="letter">
      <fo:static-content flow-name="xsl-region-before">
        <fo:block>
          <xsl:call-template name="header"/>
        </fo:block>
      </fo:static-content>
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
An alternative link to the documents Renaud referred to: http://www.le-hacker.org/hacks/pagination/ There are a few other quite interesting documents on layout questions, particularly something about tables which might come in handy later. On 04.03.2005 02:14:56 Renaud Richardet wrote: Vincent Hennebert wrote: By looking for this reference, I found the following article: www.pi6.fernuni-hagen.de/publ/tr234.pdf It's entitled 'On the Pagination of Complex Documents' (actually it's also referencing Plass). There's another article, where the top level of the algorithm is presented (page 8). http://www.pi6.fernuni-hagen.de/publ/tr205.pdf Those 2 articles are summaries of a book. The link to order the book can be found at the bottom of http://web.informatik.uni-bonn.de/I/research/Pagination/ Renaud Jeremias Maerki
Re: Skype-conference on page-breaking?
Ok then, I'll call you Sunday evening 19.00 CET if nothing goes wrong. The others interested will find me in Skype. FYI, I'll be out of touch from later today until Sunday afternoon. On 03.03.2005 21:46:55 Simon Pepping wrote: On Thu, Mar 03, 2005 at 08:34:54PM +0100, Jeremias Maerki wrote: I've bought some SkypeOut credits now. Funny thing: It's cheaper to call Simon in the Netherlands than to call someone in Lucerne via PSTN. Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. Others are invited to listen in and contribute, of course. Max. number in the conference is four people with Skype. Sunday evening is OK. Monday and Tuesday after working hours is OK. I could be available from 16.00 hrs, but I would prefer after 19.00 hrs CET. There is no way I can do this at the office. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl Jeremias Maerki
Re: [XML Graphics - FOP Wiki] Updated: PageLayout
Jeremias Maerki wrote: However, Luca's example does not fully resolve in my brain. The penalty, for example, must not be infinite or it will not be eligible as a break possibility. A legal break is defined by Knuth as a number b such that either (i) x_b is a penalty item with p_b < infinity, or (ii) x_b is a glue item and x_(b-1) is a box item. As far as I can see Luca's example doesn't contain any legal break, but I guess that was simply an oversight. Oops, you are right, the penalty should have a 0 penalty value (or maybe greater, anyway not infinite), otherwise it is completely useless to set its width, as it will never be used! The big problem I still have with both your examples is that the table header is very special in terms of the standard Knuth model. This model doesn't allow for conditional items at the beginning of a line. What Luca did in his example looks to me like forcing the model to do something it wasn't designed for. Yes, line breaking has nothing analogous to the repeated header in a table, although the elements representing spaces in centered effect (which have effect only if there is a break between them) are somewhat similar. But I think that the main point is to have a representation whose height is correct: the representation is not the content, its only purpose is to allow the creation of pages satisfying the constraints (orphans, widows, keep, ...): what to put into the pages concerns the LMs. I'm a bit sceptical that the code will be able to identify such special conditions reliably. I think it would not be that difficult: for example, the penalty (whose width is header height + footer height) could have a conventional Position. In the addAreas phase, if the last position in the iterator is that particular Position, the LM will know that the table has been split between pages and, according to the value of table-omit-*-at-break, will add header and footer. Regards Luca
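The legal-break rule quoted from Knuth in this exchange is easy to check mechanically. The following is a minimal sketch only: the class LegalBreaks, its Element type and the INFINITE constant are invented for the illustration and do not correspond to FOP's Knuth element classes, and the widths are made-up numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of Knuth's legal-break rule; not FOP code. */
class LegalBreaks {
    enum Kind { BOX, GLUE, PENALTY }

    /** Stand-in for an infinite penalty value (an assumption of this sketch). */
    static final int INFINITE = Integer.MAX_VALUE;

    static final class Element {
        final Kind kind;
        final int width;
        final int penalty; // only meaningful for PENALTY elements

        private Element(Kind kind, int width, int penalty) {
            this.kind = kind;
            this.width = width;
            this.penalty = penalty;
        }

        static Element box(int width) { return new Element(Kind.BOX, width, 0); }
        static Element glue(int width) { return new Element(Kind.GLUE, width, 0); }
        static Element penalty(int width, int value) { return new Element(Kind.PENALTY, width, value); }
    }

    /** Returns the indices b that are legal breaks: a finite penalty, or glue directly after a box. */
    static List<Integer> legalBreaks(List<Element> seq) {
        List<Integer> result = new ArrayList<>();
        for (int b = 0; b < seq.size(); b++) {
            Element x = seq.get(b);
            boolean finitePenalty = x.kind == Kind.PENALTY && x.penalty < INFINITE;
            boolean glueAfterBox = x.kind == Kind.GLUE && b > 0 && seq.get(b - 1).kind == Kind.BOX;
            if (finitePenalty || glueAfterBox) {
                result.add(b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // box - penalty - glue - box, first with an infinite penalty, then with penalty value 0.
        List<Element> infinite = Arrays.asList(
                Element.box(100), Element.penalty(39, INFINITE), Element.glue(1), Element.box(80));
        List<Element> finite = Arrays.asList(
                Element.box(100), Element.penalty(39, 0), Element.glue(1), Element.box(80));
        System.out.println(legalBreaks(infinite)); // prints []
        System.out.println(legalBreaks(finite));   // prints [1]
    }
}
```

With the infinite penalty the sequence has no legal break at all, because the glue follows a penalty rather than a box. That is the oversight pointed out above, and setting the penalty value to 0 makes index 1 a legal break.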
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. I'm very interested in page breaking, and I would be happy to contribute. Unfortunately, I'm not much used to speaking English :-(, so I think I would be much more comfortable with the idea of communicating via written words! As I have said before (or maybe I forgot to ...) I have done a few experiments trying to use Knuth's algorithm in page breaking, and I have a working implementation which handles only some block level formatting objects (blocks and lists) and simplified documents (no footnotes or floats, at the moment, and pages with equal length and width), but it has some (I hope) interesting features: for example, it is able to adjust the number of lines used for each paragraph in order to both fill the pages and avoid orphans and widows. In a few words, using the box - penalty - glue model it is possible to represent paragraphs with an adjustable number of lines. I started working on it a few months ago, and I could not keep it updated with all the changes, but if you are interested I could try and recreate these features using the most recent code. Anyway, this could be done after we have reached a basic implementation. Regards Luca
Re: Skype-conference on page-breaking?
Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. On 04.03.2005 11:09:42 Luca Furini wrote: Jeremias Maerki wrote: Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. I'm very interested in page breaking, and I would be happy to contribute. Unfortunately, I'm not much used to speaking English :-(, so I think I would be much more comfortable with the idea of communicating via written words! As I have said before (or maybe I forgot to ...) I have done a few experiments trying to use Knuth's algorithm in page breaking, and I have a working implementation which handles only some block level formatting objects (blocks and lists) and simplified documents (no footnotes or floats, at the moment, and pages with equal length and width), but it has some (I hope) interesting features: for example, it is able to adjust the number of lines used for each paragraph in order to both fill the pages and avoid orphans and widows. In a few words, using the box - penalty - glue model it is possible to represent paragraphs with an adjustable number of lines. I started working on it a few months ago, and I could not keep it updated with all the changes, but if you are interested I could try and recreate these features using the most recent code. Anyway, this could be done after we have reached a basic implementation. Regards Luca Jeremias Maerki
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. How short? Peter -- Peter B. West http://cv.pbw.id.au/ Project Folio http://defoe.sourceforge.net/folio/
Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
While looking for material on page breaking I found several references to this document: http://wwwlib.umi.com/dissertations/fullcit/8124134 Does anyone know if it's worth ordering and waiting for it? The unfortunate thing is that they don't seem to have a PDF version that I could download immediately for a reasonable fee. Thanks, Jeremias Maerki
Re: [XML Graphics - FOP Wiki] Updated: PageLayout
I've tried to think this case through. Simon's suggestion for a special penalty is intriguing. It handles the case for table borders where we don't simply have an x or 0 situation (like penalty and glue provide), but we have an x or y situation. On the other hand, Luca is probably right that we should try to handle as much as possible with the existing elements. However, Luca's example does not fully resolve in my brain. The penalty, for example, must not be infinite or it will not be eligible as a break possibility. A legal break is defined by Knuth as a number b such that either (i) x_b is a penalty item with p_b < infinity, or (ii) x_b is a glue item and x_(b-1) is a box item. As far as I can see Luca's example doesn't contain any legal break, but I guess that was simply an oversight. The big problem I still have with both your examples is that the table header is very special in terms of the standard Knuth model. This model doesn't allow for conditional items at the beginning of a line. What Luca did in his example looks to me like forcing the model to do something it wasn't designed for. I'm a bit sceptical that the code will be able to identify such special conditions reliably. In this case I think we would have to introduce a special element that gets active if such a penalty was triggered on the last line/page (a carryover-penalty or something like that). In the end I think that all the elements after a chosen break may be invalid anyway if the available IPD changes (for example when page n+1 is landscape while page n was portrait). In this case all non-finalized elements have to be recalculated due to different break decisions in the line layout managers and different values coming from page-number(-citation) elements. [1] I wonder how other systems cope with this. Information on this seems to be very rare, at least when googling with the Knuth model in mind. Head smoking, digging deeper...
[1] http://wiki.apache.org/xmlgraphics-fop/PageLayout/KnuthElementEvaluation On 28.02.2005 10:15:39 Luca Furini wrote: Simon Pepping wrote:
+=== Space specifiers ===
+
+When the space specifiers resolve to zero around a page break, we are
+in the same situation as that of a word space in line breaking. It is
+represented by the sequence `box - glue - box`.
I add just a few thoughts about this subject. If there cannot be a break between the two blocks (the first has keep-with-next || the second has keep-with-previous || their block father has keep-together), the representation can be box - infinite penalty - glue - box.
+=== Possible page break between content elements ===
+
+Here the most general situation is that when the content is different
+with and without page break:
+ * content Cn when there is no page break,
+ * content Ca at the end of the page before the page break,
+ * content Cb at the start of the page after the page break.
+
+An example of this situation is a page break between table rows:
+
+{{{
+no page break:    page break:
+
+----------        ----------
+  row 1             row 1
+----------        ----------
+ border n          border a
+----------        ----------
+  row 2             footer
+----------        ----------
+                  page break
+                  ----------
+                    header
+                  ----------
+                   border b
+                  ----------
+                    row 2
+                  ----------
+}}}
+
+This situation cannot be dealt with using Knuth's box/glue/penalty
+model.
Maybe there is no need to create new kinds of elements (not that it's forbidden :-) , only the fewer they are, the simpler the algorithm is). Header and footer, with their borders, are duplicated around each break if table-omit-*-at-break is false; so, at each break the total height increases by (border a + footer + header + border b) and decreases by border n. Here is a different representation which uses normal penalties; there are two rows whose bpd is h1 and h2, header and footer with bpd hH and hF, border before the footer, border after the header and border between rows hA, hB and hN.
box(h1) - penalty(inf, hH + hA + hB + hF) - glue(hN) - box(h2) - box(hB + hF) - box(hH + hA) If there is no break, the overall bpd is (hH + hA + h1 + hN + h2 + hB + hF), otherwise the first piece has bpd (hH + hA + h1 + hB + hF) and the second one (hH + hA + h2 + hB + hF), and the border between the rows is ignored. The elements representing the header and its border are moved to the end of the sequence, but I don't think this could be a real problem: the TableLayoutManager would place them in the right place when adding areas. Regards Luca Jeremias Maerki
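The three bpd totals stated in Luca's message can be checked with a small calculation. The sketch below is an assumption-laden illustration: the class TableBreakHeights and its helpers are invented, the numbers are arbitrary, and the rules used (a penalty's width counts only when the break is taken there, and glue directly after a break is discarded) are the standard box/glue/penalty conventions rather than anything taken from FOP's code. The penalty value is left out because only widths matter for the arithmetic.

```java
/** Hypothetical check of the bpd arithmetic for a table split between two rows; not FOP code. */
class TableBreakHeights {
    enum Kind { BOX, GLUE, PENALTY }

    static final class Element {
        final Kind kind;
        final int width;

        Element(Kind kind, int width) {
            this.kind = kind;
            this.width = width;
        }
    }

    /** box(h1) - penalty(w = hH+hA+hB+hF) - glue(hN) - box(h2) - box(hB+hF) - box(hH+hA). */
    static Element[] sequence(int h1, int h2, int hH, int hF, int hA, int hB, int hN) {
        return new Element[] {
            new Element(Kind.BOX, h1),
            new Element(Kind.PENALTY, hH + hA + hB + hF),
            new Element(Kind.GLUE, hN),
            new Element(Kind.BOX, h2),
            new Element(Kind.BOX, hB + hF),
            new Element(Kind.BOX, hH + hA),
        };
    }

    /** No break: boxes and glue count, penalty widths do not. */
    static int heightWithoutBreak(Element[] seq) {
        int sum = 0;
        for (Element e : seq) {
            if (e.kind != Kind.PENALTY) {
                sum += e.width;
            }
        }
        return sum;
    }

    /** Part before a break at index b: everything before b, plus the penalty width if x_b is a penalty. */
    static int heightBeforeBreak(Element[] seq, int b) {
        int sum = 0;
        for (int i = 0; i < b; i++) {
            if (seq[i].kind != Kind.PENALTY) {
                sum += seq[i].width;
            }
        }
        if (seq[b].kind == Kind.PENALTY) {
            sum += seq[b].width;
        }
        return sum;
    }

    /** Part after a break at index b: glue and penalties right after the break are discarded. */
    static int heightAfterBreak(Element[] seq, int b) {
        int i = b + 1;
        while (i < seq.length && seq[i].kind != Kind.BOX) {
            i++;
        }
        int sum = 0;
        for (; i < seq.length; i++) {
            if (seq[i].kind != Kind.PENALTY) {
                sum += seq[i].width;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Arbitrary sample values: h1=100, h2=80, hH=20, hF=15, hA=2, hB=2, hN=1.
        Element[] seq = sequence(100, 80, 20, 15, 2, 2, 1);
        System.out.println(heightWithoutBreak(seq));   // prints 220 = hH+hA+h1+hN+h2+hB+hF
        System.out.println(heightBeforeBreak(seq, 1)); // prints 139 = hH+hA+h1+hB+hF
        System.out.println(heightAfterBreak(seq, 1));  // prints 119 = hH+hA+h2+hB+hF
    }
}
```

The numbers agree with the three formulas in the message, including the disappearance of hN when the break is taken.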
RE: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Jeremias Maerki wrote: While looking for material on page breaking I found several references to this document: http://wwwlib.umi.com/dissertations/fullcit/8124134 Does anyone know if it's worth ordering and waiting for it? The unfortunate thing is that they don't seem to have a PDF version that I could download immediately for a reasonable fee. Wow. This looks like it is very valuable. I have ordered it for my own use, and I'll be glad to give you a book review when it arrives to help you decide whether it is worthwhile for you or not. I am especially interested in the summary's comment: For certain simple badness functions, the pagination problem is NP-complete; Dealing with that challenge is the likely tricky spot in all of this. My intuition has always been that the page-breaking problem is much more complicated than the line-breaking one, partly because lines must be laid out to even think about page-breaking (and line lengths can change as they move around), partly because you are effectively working with changes in two dimensions instead of one, and partly because there seem to me to be a lot more variables in the problem. I am hoping to find some insight into the detection and workarounds for the NP-complete situations. Note that Stanford is Knuth's school, the date year is the same as that of Chapter 3 of Knuth's Digital Typography, and that the author is the co-author of that article. It may be possible to infer the same information from looking at the TeX source code. Also, another source of similar information would be Volume I of Knuth's Computers and Typesetting, aka The TeXbook. It is essentially a commentary on TeX, by Knuth. Chapter 15 is entitled How TeX Makes Lines into Pages. You guys are way ahead of me in terms of thinking about how to implement this stuff. 
As you know, my approach has been to leave this stuff for last, preferring instead to solve the outer-layer problems first, and provide for multiple implementations that can be improved in parallel. However, I have a great interest in your efforts, and will be glad to help any way that I can. And, FWIW, I think you are on the right general track, in this regard at least. Victor Mote
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
On 03.03.2005 16:19:24 Victor Mote wrote: Jeremias Maerki wrote: While looking for material on page breaking I found several references to this document: http://wwwlib.umi.com/dissertations/fullcit/8124134 Does anyone know if it's worth ordering and waiting for it? The unfortunate thing is that they don't seem to have a PDF version that I could download immediately for a reasonable fee. Wow. This looks like it is very valuable. I have ordered it for my own use, and I'll be glad to give you a book review when it arrives to help you decide whether it is worthwhile for you or not. I will probably order it anyway because I need it yesterday. :-) That's why I wanted the PDF version which they don't seem to have. Sigh. I am especially interested in the summary's comment: For certain simple badness functions, the pagination problem is NP-complete; Dealing with that challenge is the likely tricky spot in all of this. My intuition has always been that the page-breaking problem is much more complicated than the line-breaking one, partly because lines must be laid out to even think about page-breaking (and line lengths can change as they move around), partly because you are effectively working with changes in two dimensions instead of one, and partly because there seem to me to be a lot more variables in the problem. I am hoping to find some insight into the detection and workarounds for the NP-complete situations. I've found other references to discussion about communication between line and page breaking algorithms. But that stuff was mostly overview-style and written in a language I don't understand well: universitary language. That's probably the first time I regret stopping university after one semester because I hate mathematics. :-) Note that Stanford is Knuth's school, the date year is the same as that of Chapter 3 of Knuth's Digital Typography, and that the author is the co-author of that article. 
It may be possible to infer the same information from looking at the TeX source code. Also, another source of similar information would be Volume I of Knuth's Computers and Typesetting, aka The TeXbook. It is essentially a commentary on TeX, by Knuth. Chapter 15 is entitled How TeX Makes Lines into Pages. I haven't dared look into the TeX source code, yet, but I've read most of the chapter you mention. Didn't really help because there are many many TeX-specific things in there. You guys are way ahead of me in terms of thinking about how to implement this stuff. As you know, my approach has been to leave this stuff for last, preferring instead to solve the outer-layer problems first, and provide for multiple implementations that can be improved in parallel. However, I have a great interest in your efforts, and will be glad to help any way that I can. And, FWIW, I think you are on the right general track, in this regard at least. I very much hope so. But it becomes more and more apparent that this will be the greatest challenge in my programmer's life. Wow indeed. Jeremias Maerki
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Happy to see how you have reprioritized your efforts over the past two months [1], and much, much for the better. Glen --- Jeremias Maerki [EMAIL PROTECTED] wrote: I very much hope so. But it becomes more and more apparent that this will be the greatest challenge in my programmer's life. Wow indeed. Jeremias Maerki [1] http://marc.theaimsgroup.com/?l=fop-dev&m=110495579414655&w=2
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Actually, I haven't. The RTF edition idea (which I haven't discarded, yet, just postponed due to time shortage) is still very high on my private priority list. As you can guess there's currently another one which comes first. On 03.03.2005 18:31:12 Glen Mazza wrote: Happy to see how you have reprioritized your efforts over the past two months [1], and much, much for the better. Glen --- Jeremias Maerki [EMAIL PROTECTED] wrote: I very much hope so. But it becomes more and more apparent that this will be the greatest challenge in my programmer's life. Wow indeed. Jeremias Maerki [1] http://marc.theaimsgroup.com/?l=fop-dev&m=110495579414655&w=2 Jeremias Maerki
Re: Skype-conference on page-breaking?
I've bought some SkypeOut credits now. Funny thing: It's cheaper to call Simon in the Netherlands than to call someone in Lucerne via PSTN. Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. Others are invited to listen in and contribute, of course. Max. number in the conference is four people with Skype. On 01.03.2005 23:31:16 Jeremias Maerki wrote: Maybe I could hook you into a Skype conference by using SkypeOut. It's pretty cheap to call to the Netherlands. According to the FAQ this is possible. On 01.03.2005 22:26:50 Simon Pepping wrote: On Tue, Mar 01, 2005 at 03:09:46PM +0100, Jeremias Maerki wrote: To speed things up could we hold a conference (using Skype, for example) to discuss further details on page-breaking? I'd volunteer to sum up any results during that discussion for the archives. I have Finn on my Skype radar already. I do not have a broadband connection, and therefore no Skype or other VoIP. Jeremias Maerki Jeremias Maerki
Re: Skype-conference on page-breaking?
On Thu, Mar 03, 2005 at 08:34:54PM +0100, Jeremias Maerki wrote: I've bought some SkypeOut credits now. Funny thing: It's cheaper to call Simon in the Netherlands than to call someone in Lucerne via PSTN. Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. Others are invited to listen in and contribute, of course. Max. number in the conference is four people with Skype. Sunday evening is OK. Monday and Tuesday after working hours is OK. I could be available from 16.00 hrs, but I would prefer after 19.00 hrs CET. There is no way I can do this at the office. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
On Thu, Mar 03, 2005 at 08:19:24AM -0700, Victor Mote wrote: Jeremias Maerki wrote: While looking for material on page breaking I found several references to this document: http://wwwlib.umi.com/dissertations/fullcit/8124134 Does anyone know if it's worth ordering and waiting for it? The unfortunate thing is that they don't seem to have a PDF version that I could download immediately for a reasonable fee. Wow. This looks like it is very valuable. I have ordered it for my own use, and I'll be glad to give you a book review when it arrives to help you decide whether it is worthwhile for you or not. I do not know it. It sounds like I should buy it as well. Note that Stanford is Knuth's school, the date year is the same as that of Chapter 3 of Knuth's Digital Typography, and that the author is the co-author of that article. It may be possible to infer the same information from looking at the TeX source code. Also, another source of similar information would be Volume I of Knuth's Computers and Typesetting, aka The TeXbook. It is essentially a commentary on TeX, by Knuth. Chapter 15 is entitled How TeX Makes Lines into Pages. Note that The TeXbook is TeX's user guide. Yes, Knuth's users are quite advanced. It was my first book in the direction of computers, and one of the most inspirational I have read. The TeX program is described in 'TeX The Program'. That text is weaved into the program code according to Knuth's literate programming system. It can be freely extracted from the program code. A TeX distribution like TeXLive contains the tools to do this. I intend to do so soon. If you want to do it yourself, TeXLive is available from the TUG website, www.tug.org. The TeX source code itself is available from the CTAN repository, but I fear that you have to do some work to set up all the tools. It is up to yourself to decide whether knowing TeX's implementation is useful. It is a best-fit algorithm. There is no look-ahead. 
For example, TeX is not able to balance two facing pages (or two columns on a page, which for TeX is the same). I guess that a dissertation like that cited above contains much more information than implemented in TeX. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl
RE: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Simon Pepping wrote:
> Note that The TeXbook is TeX's user guide. Yes, Knuth's users are quite advanced. It was my first book in the direction of computers, and one of the most inspirational I have read. The TeX program is described in 'TeX The Program'. That text

Oops. You're right. That is volume 2 from the same Computers and Typesetting series.

> set up all the tools. It is up to you to decide whether knowing TeX's implementation is useful. It is a best-fit algorithm. There is no look-ahead. For example, TeX is not able to balance two facing pages (or two columns on a page, which for TeX is the same). I guess that a dissertation like the one cited above contains much more information than is implemented in TeX.

I'm not sure. The general TeX page-breaking algorithm is discussed in the paragraphs before Appendix A of Chapter 3 of Digital Typography. The general box/glue/penalty model is used, but only the current page is considered. So I think the difference between best-fit and total-fit (as described here anyway) is the amount of look-ahead itself, not so much the algorithm. This is why I thought Finn's (IIRC) idea of a variable look-ahead makes sense. A look-ahead of zero pages is a best-fit, a look-ahead of all pages is a total-fit. But the algorithm is the same.

Anyway, I agree that the paper is probably the best source, but wanted to give Jeremias some options.

Victor Mote
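The idea that best-fit and total-fit differ only in the amount of look-ahead can be sketched as follows. This is a minimal illustration, not TeX's or FOP's algorithm: it assumes unbreakable items with fixed heights (no glue, no penalties), takes the squared unused page height as a stand-in badness, and lets the last page run short at no cost. The names `page_cost`, `window_cost` and `paginate` are hypothetical.

```python
import math


def page_cost(heights, start, end, page_height):
    """Badness of one page holding heights[start:end] (assumed cost function)."""
    used = sum(heights[start:end])
    if used > page_height:
        return math.inf          # content overflows the page
    if end == len(heights):
        return 0                 # the last page may run short for free
    return (page_height - used) ** 2


def window_cost(heights, start, look_ahead, page_height):
    """Best (cost, end) for the page starting at `start`.

    The cost also counts the best layout of up to `look_ahead` following
    pages; anything beyond that horizon is ignored.
    """
    best = (math.inf, None)
    for end in range(start + 1, len(heights) + 1):
        cost = page_cost(heights, start, end, page_height)
        if cost == math.inf:
            break                # adding more items only overflows further
        if look_ahead > 0 and end < len(heights):
            cost += window_cost(heights, end, look_ahead - 1, page_height)[0]
        if cost < best[0]:
            best = (cost, end)
    return best


def paginate(heights, page_height, look_ahead):
    """Commit one page at a time, each chosen with the given look-ahead."""
    breaks, start = [], 0
    while start < len(heights):
        _, end = window_cost(heights, start, look_ahead, page_height)
        if end is None:
            raise ValueError("no feasible break from item %d" % start)
        breaks.append(end)
        start = end
    return breaks


heights = [5, 4, 3, 8]
print(paginate(heights, 10, look_ahead=0))  # best-fit: only the current page counts
print(paginate(heights, 10, look_ahead=3))  # horizon covers the whole document
```

With these four items and a page height of 10, a look-ahead of zero fills the first page as far as it can and strands the third item alone on page two, while any look-ahead of one page or more breaks earlier so that the second page is better filled. The procedure is the same in both cases; only the horizon changes.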
RE: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Victor Mote wrote: Oops. You're right. That is volume 2 from the same Computers and Typesetting series. Er, it is actually volume B. Victor Mote
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Hi Fop Team,

Simon Pepping wrote:
> On Thu, Mar 03, 2005 at 08:19:24AM -0700, Victor Mote wrote:
>> Jeremias Maerki wrote:
>>> While looking for material on page breaking I found several references to this document: http://wwwlib.umi.com/dissertations/fullcit/8124134 Does anyone know if it's worth ordering and waiting for it?

By looking for this reference, I found the following article: www.pi6.fernuni-hagen.de/publ/tr234.pdf

It's entitled 'On the Pagination of Complex Documents' (it also references Plass). I've read parts of this article and it seems interesting. It's well written, quite easy to understand, and provides a better algorithm than TeX's. It's also more recent. It disagrees with Plass that page layout is an NP-hard problem: whether it is depends on the formula used to estimate the badness of a layout. It proposes another formula which seems more reasonable and better corresponds to a reader's expectations. Perhaps it could provide a good basis?

I don't have Knuth's 'Digital Typography' (I'm considering purchasing it). It may be worthwhile to compare this article with what is in the book.

> The TeX program is described in 'TeX The Program'. That text is woven into the program code according to Knuth's literate programming system. It can be freely extracted from the program code.

I've already done it, and AFAICT it's not easy to find one's way through all the TeX stuff (actually nothing is easy with TeX ;-)). I can send you the PDF file if you want, feel free to ask me. Be warned, though: it's a 535-page (!) document that describes the entire TeX program. I've started looking into it, and, well, it's rather cryptic. It stays very close to the implementation, so it is difficult to get a general idea of what it's doing. But I'll investigate a bit more.

And, as Simon wrote, TeX is excellent at line-breaking but not as good at page-breaking. It was implemented in the early 80's when memory was expensive. However, it was written with typographic quality in mind, and that's why it may be a good idea to try getting some hints from it.

Vincent
Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems
Vincent Hennebert wrote:
> By looking for this reference, I found the following article: www.pi6.fernuni-hagen.de/publ/tr234.pdf It's entitled 'On the Pagination of Complex Documents' (actually it's also referencing Plass).

There's another article, where the top level of the algorithm is presented (page 8): http://www.pi6.fernuni-hagen.de/publ/tr205.pdf

Those 2 articles are summaries of a book. The link to order the book can be found at the bottom of http://web.informatik.uni-bonn.de/I/research/Pagination/

Renaud
DO NOT REPLY [Bug 33801] New: - FOP 0.20.5 AWTRenderer: Sometimes rendering fails in the middle of a table with no error message
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=33801. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=33801

Summary: FOP 0.20.5 AWTRenderer: Sometimes rendering fails in the middle of a table with no error message
Product: Fop
Version: 0.20.5
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: major
Priority: P2
Component: awt renderer
AssignedTo: fop-dev@xml.apache.org
ReportedBy: [EMAIL PROTECTED]

My French team and I are working on a Java project and we plan to embed FOP 0.20.5 for the print reports we need. We experience a problem when using the AWT renderer with documents that contain tables. Sometimes, a page is cut off in the middle of a table and its header (fo:region-before) is missing.

As far as we have investigated we can say that:

* We get neither an abnormal output message nor an exception stack trace.
* We only experience this problem with the AWT renderer. Using the PCL renderer works fine (but it lacks the multibyte character support that is critical for some of our translated releases).
* We only experience this problem on a real printer (we have tried several models from HP and OKI). We never reproduced the problem using a virtual printer such as Adobe Acrobat Distiller.
* With the attached FO document (bug.fo), we experience the problem 20% of the time when launching the printing from our application (we tried both the java.awt.print and javax.print APIs). When using the `fop.bat' script, we experience the problem 100% of the time with the same FO document:
    fop.bat -fo bug.fo -print
  We tried JDK 1.4.1_07 and 1.4.2_07.
* So far, every time it happened, tables were involved in the document.
* Running the same FOP command line on the same FO document several times may result in the bug occurring on different pages of the document. The printing does not systematically fail at the same point of the document.

I have made the following files available:

You can download http://www.guillaumeponce.org/fop/bug.fo.zip, the reference FO document we have used to investigate this problem (zipped).

You can also download http://www.guillaumeponce.org/fop/bug.pdf (12 MB). It is a scan of an actual 8-page paper printout. You can see that the problem I describe occurred on page 7 out of 8.

--
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
DO NOT REPLY [Bug 33801] - FOP 0.20.5 AWTRenderer: Sometimes rendering fails in the middle of a table with no error message
http://issues.apache.org/bugzilla/show_bug.cgi?id=33801

--- Additional Comments From [EMAIL PROTECTED] 2005-03-02 10:11 ---
Created an attachment (id=14386) -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14386&action=view)
Test case for the bug

    fop.bat -fo bug.fo -print

Rendering should fail.
Re: page-breaking strategies and performance
Jeremias Maerki wrote:

Hi Jeremias,

> I finally have Knuth's Digital Typography and let myself be enlightened by his well-written words. In [1] Simon outlined different strategies for page-breaking, obviously closely following the different approaches defined by Knuth. At first glance, I'd say that best-fit is probably the obvious strategy to select, especially if TeX is happy with it. Obviously, it can't find the optimal solution like this but the additional overhead (memory and CPU power) of a look-ahead/total-fit strategy is simply too much and unnecessary for things like invoices and insurance policies which are surely some of the most popular use cases of XSL-FO. Here, speed is extremely important. People writing documentation (maybe using DocBook) or glossy stock reports have additional requirements and don't mind the longer processing time and additional memory requirements. This leads me to the question if we shouldn't actually implement two page-breaking strategies (in the end, not both right now). For a speed-optimized algorithm, we could even think about ignoring side-floats.

We have dozens of customers using an XSL-FO solution, and I can confirm that invoices and insurance policies are a common use case for XSL-FO. A lot of companies have performance as a priority, and we have no one using side floats or even thinking about using them, so optimizing for speed by ignoring side floats sounds like a good idea! But this is just my 2 cents and may conflict with other people's wishes.

> Obviously, in this model we would have to make sure that we use a common model for both strategies. For example, we still have to make sure that the line layout gets information on the available IPD on each line, but probably this will not be a big problem to include later. An enhanced/adjusted box/glue/penalty model sounds like a good idea to me especially since Knuth hints at that in his book, too. There's also a question if part of the infrastructure from line breaking can be reused for page breaking, but I guess rather not.

Probably best to re-create an algorithm from scratch for page breaking, but line breaking can be reviewed for ideas.

> As for the plan to implement a new page-breaking mechanism: I've got to do it now. :-) I'm sorry if this may put some pressure on some of you. I'm also not sure if I'm fit already to tackle it, but I've got to do it anyway. Since I don't want to work with a series of patches like you guys did earlier, I'd like to create a branch to do that on as soon as we've agreed on a strategy. Any objections to that?

If we are going to branch the code for this then we need to make sure we have a plan to merge the branch back once we are confident in the new page breaking algorithm. This plan (which should be agreed before branching takes place) should include an acceptance procedure, e.g. will a single -1 be able to prevent the code being merged back? We don't want to end up with another alt-design, which eventually moved to SourceForge!!!

Chris
Re: page-breaking strategies and performance
--- Chris Bowditch [EMAIL PROTECTED] wrote:
>> As for the plan to implement a new page-breaking mechanism: I've got to do it now. :-) I'm sorry if this may put some pressure on some of you. I'm also not sure if I'm fit already to tackle it, but I've got to do it anyway. Since I don't want to work with a series of patches like you guys did earlier, I'd like to create a branch to do that on as soon as we've agreed on a strategy. Any objections to that?
> If we are going to branch the code for this then we need to make sure we have a plan to merge the branch back once we are confident in the new page breaking algorithm. This plan (which should be agreed before branching takes place) should include an acceptance procedure, e.g. will a single -1 be able to prevent the code being merged back? We don't want to end up with another alt-design, which eventually moved to SourceForge!!!
> Chris

Either way is fine with me, but Chris brings up a very valid point. If you can tolerate and keep up with my minor code housekeeping from time to time in some of the layout managers (currently mostly PSLM), feel free to work from HEAD directly instead if you wish.

Glen
DO NOT REPLY [Bug 33801] - FOP 0.20.5 AWTRenderer: Sometimes rendering fails in the middle of a table with no error message
http://issues.apache.org/bugzilla/show_bug.cgi?id=33801

--- Additional Comments From [EMAIL PROTECTED] 2005-03-02 13:39 ---
(In reply to comment #0)

I could also reproduce the bug using FOP 0.20.5 and the provided bug.fo. It occurred on page 2 out of 8.

> As far as we have investigated we can say that:
> * We have neither abnormal output message, nor exception stack trace.

Same here for me (also using the command line option -d).

> * We only experience this problem with the AWT renderer. Using the PCL renderer works fine (but it lacks multibyte characters support that is critical for some of our translated releases).

Haven't tried it.

> * We only experience this problem on a true printer (we have tried several models from HP and OKI). We never reproduced the problem using a fake printer such as Adobe Acrobat Distiller.

Same here: the output in the window of the AWT renderer looks fine. The problem only appears when printing.

If you (FOP Team) suspect that the problem lies in the AWT renderer, I could try to investigate.

HTH, Renaud
DO NOT REPLY [Bug 33808] New: - problem with large number-rows-spanned
http://issues.apache.org/bugzilla/show_bug.cgi?id=33808

Summary: problem with large number-rows-spanned
Product: Fop
Version: 0.20.5
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: pdf renderer
AssignedTo: fop-dev@xml.apache.org
ReportedBy: [EMAIL PROTECTED]

I am experiencing trouble when trying to create a table with FOP. The cell in my first table column has a number-rows-spanned value that is larger than the number of rows fitting on a page. This leads to several problems:

1. The cell border is drawn across the page footer (it does not end with the last table-row on the page, but is drawn into nowhere if you have even more rows).
2. The cell border for the (continued) first cell on the next page is missing completely.
3. An extra page is being added at the beginning of my document. (?!)

Is there a patch that corrects this problem?
Re: page-breaking strategies and performance
I'd rather not work on HEAD directly because after creating the basics for the new mechanism the whole thing will probably not work for some time (probably 2-4 weeks). But I'd like to be able to check in early so people can review. I expect that the lifetime of the branch will not exceed 8 weeks. So there's almost no chance that alt-design is repeated, especially since the basic LM infrastructure will not be altered big time and it looks like we are all going in the same direction for the new page-breaking. It's clear that it has to be done and it seems to be moving in the direction of a derived Knuth approach. It's much like the migration to the Knuth line breaking and it's mostly the block-level LMs that will be affected. People can continue to work on HEAD during that time as long as nothing serious is altered in the block-level LMs which would make merging difficult.

Before I can kick off we need to agree to the general approach for the algorithm and clear a few details so we are reasonably sure that it'll work. Once we have that the plan for the branch should not be a big deal if we take the above into account.

On 02.03.2005 13:16:42 Glen Mazza wrote:
> Either way is fine with me, but Chris brings up a very valid point. If you can tolerate and keep up with my minor code housekeeping from time to time in some of the layout managers (currently mostly PSLM), feel free to work from HEAD directly instead if you wish.

Jeremias Maerki
Re: page-breaking strategies and performance
Just a sanity check here, the XSL specification seems to suggest always the first-fit strategy for page breaking *except* where keeps are explicitly specified. Am I correct here? And, if so, is what you're planning going to result in an algorithm that will help us do this?

Thanks, Glen
Re: page-breaking strategies and performance
Where did you find such a suggestion? I'd be interested to know if there's a hint in this direction in the spec. I thought it was up to the implementation to decide the strategy.

I think the way we're now taking in our discussion suggests that we're not going to do a first-fit strategy at all. If we're really going down the two-strategy path we'll probably end up with a best-fit strategy and a total-fit or best-fit plus look-ahead. (See Simon's list [1]) But that's something we still need to figure out together.

[1] http://wiki.apache.org/xmlgraphics-fop/PageLayout

On 02.03.2005 14:48:17 Glen Mazza wrote:
> Just a sanity check here, the XSL specification seems to suggest always the first-fit strategy for page breaking *except* where keeps are explicitly specified. Am I correct here? And, if so, is what you're planning going to result in an algorithm that will help us do this?

Jeremias Maerki
Re: page-breaking strategies and performance
I'm unsure here. My interpretation comes from two places:

1.) Section 4.8, the last paragraph of [1]: "The area tree is constrained to satisfy all break conditions imposed. ***Each keep condition must also be satisfied***, except when this would cause a break condition or a stronger keep condition to fail to be satisfied." I.e., keep conditions need to be satisfied.

2.) The definitions of the three keep-[] properties [2] each have an initial value of "auto", meaning "There are no keep-[] conditions imposed by this property."

So by default, if the user does not explicitly specify keep properties, e.g., keep-together.within-page, no text, pictures, etc. are to be kept together on the same page, if they wouldn't already be so due to free-flowing (i.e., first-fit) text. Everything would become free-flowing in order to obey the stylesheet writer's specifications.

Just my $0.02.

Thanks, Glen

[1] http://www.w3.org/TR/2001/REC-xsl-20011015/slice4.html#keepbreak
[2] http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#keep-together

--- Jeremias Maerki [EMAIL PROTECTED] wrote:
> Where did you find such a suggestion? I'd be interested to know if there's a hint in this direction in the spec. I thought it was up to the implementation to decide the strategy. I think the way we're now taking in our discussion suggests that we're not going to do a first-fit strategy at all. If we're really going down the two-strategy path we'll probably end up with a best-fit strategy and a total-fit or best-fit plus look-ahead. (See Simon's list [1]) But that's something we still need to figure out together.

If we ever have multiple page-breaking options, it can be a user-defined configuration switch. No problem there.

Glen

> [1] http://wiki.apache.org/xmlgraphics-fop/PageLayout
Re: page-breaking strategies and performance
Thanks. I think this has only to do with the rules to handle keeps and breaks and how to resolve conflicts. I don't think, however, that these parts create a restriction which tells us what page-breaking strategy to pursue. We could probably run with a first-fit strategy and still fulfill the rules below if we accept a lot of backtracking. But as Simon suggested, this seems to be a poor approach. Keeps and breaks are only part of what a page breaking algorithm has to deal with. See [3].

[3] http://wiki.apache.org/xmlgraphics-fop/PageLayout/InfluencingFeatures

On 02.03.2005 16:44:17 Glen Mazza wrote:
> I'm unsure here. My interpretation comes from two places:
> 1.) Section 4.8, the last paragraph of [1]: The area tree is constrained to satisfy all break conditions imposed. ***Each keep condition must also be satisfied***, except when this would cause a break condition or a stronger keep condition to fail to be satisfied. i.e., keep conditions need to be satisfied.
> 2.) The definitions of the three keep-[] properties [2] each have a initial value of auto, meaning There are no keep-[] conditions imposed by this property.
> So by default, if the user does not explicitly specify keep properties, e.g., keep-together.within-page, no text, pictures, etc. are to be kept together on the same page, if they wouldn't already be so due to free-flowing (i.e., first-fit) text. Everything would become free-flowing in order to obey the stylesheet writer's specifications.
> Just my $0.02.
> Thanks, Glen
> [1] http://www.w3.org/TR/2001/REC-xsl-20011015/slice4.html#keepbreak
> [2] http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#keep-together

Jeremias Maerki
Re: page-breaking strategies and performance
Yes, I'm not in Simon's league here--I know very little about TeX--so I'll defer to you two on this issue. Just try to make sure that the final algorithm will help us support the keep-* properties.

Thanks, Glen
Re: page-breaking strategies and performance
On 02.03.2005 17:05:55 Glen Mazza wrote:
> Yes, I'm not in Simon's league here--I know very little about TeX--so I'll defer to you two on this issue.

I'm also still struggling. :-)

> Just try to make sure that the final algorithm will help us support the keep-* properties.

Yes, the algorithm MUST be able to handle full keep support (among other things). That's part of why we need a new approach. The present one doesn't quite fit the picture, yet. Thankfully, with the new design we don't have to again rewrite the whole FOP. The present approach was very good to point us in the right direction and most of the effort already invested is not lost. We just have to improve a specific part.

Jeremias Maerki
Re: cvs commit: xml-fop/src/java/org/apache/fop/fo/flow TableBody.java
On Tue, Mar 01, 2005 at 09:15:37PM -0800, Glen Mazza wrote:
> OH!!! <lightBulb state=on wattage=25/> Yes, you're right, Chris--now I see the issue. I implemented validation for about 80% of the FOs, but 80% is not 100%. fo:table-body never had any validation implemented, hence the NPE's that were occurring.

Your new validation code invalidates valid fo files. If you had run the layoutengine tests, you would have noticed. The test file table-body1.xml no longer passes. I have committed a correction. I have also made TableFooter use TableBody's validation code, as TableHeader does.

Regards, Simon

--
Simon Pepping
home page: http://www.leverkruid.nl
Re: cvs commit: xml-fop/src/java/org/apache/fop/fo/flow TableBody.java TableFooter.java
Thanks Simon.

Glen

--- [EMAIL PROTECTED] wrote:
> spepping    2005/03/02 13:03:25
>
> Modified: src/java/org/apache/fop/fo/flow TableBody.java TableFooter.java
> Log:
> Corrected a validation problem. Made TableFooter use TableBody's validation.
Re: cvs commit: xml-fop/src/java/org/apache/fop/fo/flow TableBody.java
Glen Mazza wrote:

Hi Glen,

> OH!!! <lightBulb state=on wattage=25/> Yes, you're right, Chris--now I see the issue. I implemented validation for about 80% of the FOs, but 80% is not 100%. fo:table-body never had any validation implemented, hence the NPE's that were occurring.

I'm glad this issue has finally been resolved, thanks for taking the time to research a bit deeper.

> Sorry, Jeremias, I thought you had just gratuitously *removed* the validation from fo:table-body -- I should have researched that it wasn't there to begin with.

Well, that would have been a bit rude, but like you said there wasn't any validation on fo:table-body. Now hopefully we are all a bit more comfortable with the situation.

> Thanks, Glen

Chris
page-breaking strategies and performance
I finally have Knuth's Digital Typography and have let myself be enlightened by his well-written words. In [1] Simon outlined different strategies for page-breaking, obviously closely following the different approaches defined by Knuth. At first glance, I'd say that best-fit is probably the obvious strategy to select, especially if TeX is happy with it. Obviously, it can't find the optimal solution like this, but the additional overhead (memory and CPU power) of a look-ahead/total-fit strategy is simply too much and unnecessary for things like invoices and insurance policies, which are surely some of the most popular use cases of XSL-FO. Here, speed is extremely important. People writing documentation (maybe using DocBook) or glossy stock reports have additional requirements and don't mind the longer processing time and additional memory requirements. This leads me to the question of whether we shouldn't actually implement two page-breaking strategies (in the end, not both right now). For a speed-optimized algorithm, we could even think about ignoring side-floats. Obviously, we would have to make sure that we use a common model for both strategies. For example, we still have to make sure that the line layout gets information on the available IPD on each line, but this will probably not be a big problem to include later. An enhanced/adjusted box/glue/penalty model sounds like a good idea to me, especially since Knuth hints at that in his book, too. There's also the question of whether part of the infrastructure from line breaking can be reused for page breaking, but I guess rather not. As for the plan to implement a new page-breaking mechanism: I've got to do it now. :-) I'm sorry if this may put some pressure on some of you. I'm also not sure if I'm fit to tackle it yet, but I've got to do it anyway. Since I don't want to work with a series of patches like you guys did earlier, I'd like to create a branch to do that on as soon as we've agreed on a strategy.
Any objections to that? [1] http://wiki.apache.org/xmlgraphics-fop/PageLayout Jeremias Maerki
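To illustrate the trade-off discussed in this message, here is a minimal sketch of a best-fit page breaker in Java. It is only an illustration under stated assumptions, not FOP code: the class name BestFitPageBreaker, the method breakPages, the flat arrays of element heights and break penalties, and the cost formula (unused space plus penalty) are all hypothetical simplifications. The point is that best-fit decides each page on its own and never revisits an earlier break, which is why it is cheap but cannot guarantee the optimum that a total-fit strategy would find.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical best-fit page breaker for illustration; not part of FOP. */
final class BestFitPageBreaker {

    /**
     * heights[i]   = block-progression dimension of element i
     * penalties[i] = cost of breaking after element i (0 = free, larger = discouraged)
     * Returns, for each page, the index one past its last element.
     */
    static List<Integer> breakPages(int[] heights, int[] penalties, int pageHeight) {
        List<Integer> pageEnds = new ArrayList<>();
        int start = 0;
        while (start < heights.length) {
            int used = 0;
            int bestEnd = start + 1;      // fallback: an oversized element gets its own page
            long bestCost = Long.MAX_VALUE;
            for (int end = start + 1; end <= heights.length; end++) {
                used += heights[end - 1];
                if (used > pageHeight) {
                    break;                // later candidates would overflow this page too
                }
                boolean lastPage = (end == heights.length);
                // Unused space is free on the last page; elsewhere it adds to the cost.
                long cost = lastPage ? 0 : (long) (pageHeight - used) + penalties[end - 1];
                if (cost <= bestCost) {   // on ties, prefer the fuller page
                    bestCost = cost;
                    bestEnd = end;
                }
            }
            // The decision for this page is final: no look-ahead, no backtracking.
            pageEnds.add(bestEnd);
            start = bestEnd;
        }
        return pageEnds;
    }

    public static void main(String[] args) {
        int[] heights   = {40, 40, 40, 40, 40};
        int[] penalties = { 0, 100, 0,  0,  0};  // breaking after element 1 is discouraged
        System.out.println(breakPages(heights, penalties, 100));
    }
}
```

A total-fit variant would instead keep several candidate breaks alive and minimise the summed cost over the whole document, which is where the extra memory and CPU cost mentioned above comes from.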
Skype-conference on page-breaking?
To speed things up could we hold a conference (using Skype, for example) to discuss further details on page-breaking? I'd volunteer to sum up any results during that discussion for the archives. I have Finn on my Skype radar already. Jeremias Maerki
Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
Victor and Jeremias, thanks for your input. Victor, I've checked out your aXSL. I'll study it and come back to you if I have questions. Jeremias wrote: Speaking of startVParea(), could we rename it to something more meaningful? Proposition: TransformPosition, or something like this. Deleted the methods moved to AbstractRenderer. Actually, I like startVParea() (or rather startViewportArea, as I would rather call it) because a new transformation matrix is necessary only for a viewport. I think when you port the matrix concatenation from the PDF renderer over to Java2D in startVParea() you will start to understand what's going on here. OK, thanks. That makes sense. fop.area.CTM: added two getters for e and f. If there's another way to get those values, please let me know. Normally, we use toArray() but I guess these two getters are ok and don't hurt, although I think they are not necessary because you need to use all other values in the CTM, too, to get the reference orientation stuff right. See above. OK, I'll use the available toArray() instead. The enclosed image doesn't have ipd/bpd either. Again: is this normal? I have a workaround in mind (getting those values through the FopImage), but it doesn't sound right. In this case it is probably better to fix the LMs. I've started doing that but haven't finished. ATM this is lower priority for me. I can send you my current code if you want to try to fix it. Shouldn't be so difficult. I would also prefer to fix the LMs. I don't want to go into it now (too complex for me ATM), but I'll come back to you later. renderTextDecoration(InlineArea) seems to work, even if it's not implemented?? Huh? It was you who moved the implementation up from PDFRenderer to AbstractRenderer. That's how you implemented it. Inheritance! I mean renderTextDecoration(InlineArea) from AbstractRenderer, which is empty ATM. Did you mean renderTextDecoration(Font fs, InlineArea Inline, int baseline, int startx) instead?
But I think I got it now: when I run examples/fo/basic/textdeko.fo, the underline of the sentence "This is a whole block wrapped in fo:inline with the property text-decoration=underline. Some more Text to get at least two lines." works ok. This is because the TextArea handles the underline (via renderTextDecoration(Font fs, InlineArea Inline, int baseline, int startx)) and renderTextDecoration(InlineArea) doesn't do anything. BTW, using Graphics.create() you should be able to create a copy of the current Graphics2D object. By pushing the old one on a stack and overwriting the graphics member variable you should be able to create the same effect as with currentState.push()/saveGraphicsState() in PDFRenderer.startVParea() and currentState.pop()/restoreGraphicsState() in endVParea(). When leaving a VP area you can simply restore an older Graphics2D object from the stack and continue painting. This will undo any transformations and state changes done in the copy used within the VP area. See the second paragraph in the javadocs of java.awt.Graphics. Sounds very good. Why haven't I thought of it? ;) Another thought: One of my low-priority tasks is to create a little application that renders a test suite with all of FOP's renderers, creating bitmap images for each generated document and ultimately creating a little website that lets us compare the output. PDFs and PS files can be converted to bitmaps using GhostScript. Maybe you might want to write such a thingy. I won't get to it before I get to updating the PS renderer to full quality. That would be good. Do you mean something like the Bitmap production you documented on FopAndJava2D [1]? This is what I intend to work on after the basic Java2DRenderer works. Thanks for your valuable comments. I'll work them out carefully and post an improved patch. Regards, Renaud [1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D
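The Graphics.create() suggestion quoted in this message can be sketched roughly as follows. This is a standalone illustration, not the actual renderer code: the class GraphicsStateStack and its methods startViewport, endViewport and current are hypothetical names, and only the java.awt calls (Graphics.create(), Graphics2D.transform(), Graphics.dispose()) are real API.

```java
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper showing save/restore of Java2D state via Graphics.create(). */
final class GraphicsStateStack {
    private final Deque<Graphics2D> saved = new ArrayDeque<>();
    private Graphics2D graphics;

    GraphicsStateStack(Graphics2D initial) {
        this.graphics = initial;
    }

    Graphics2D current() {
        return graphics;
    }

    /** Comparable to saveGraphicsState(): keep the old context and paint on a copy. */
    void startViewport(AffineTransform ctm) {
        saved.push(graphics);
        graphics = (Graphics2D) graphics.create(); // the copy starts with the same state
        graphics.transform(ctm);                   // affects the copy only
    }

    /** Comparable to restoreGraphicsState(): drop the copy, continue with the old context. */
    void endViewport() {
        graphics.dispose();
        graphics = saved.pop();
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        GraphicsStateStack state = new GraphicsStateStack(image.createGraphics());

        state.startViewport(AffineTransform.getTranslateInstance(10, 20));
        System.out.println("inside viewport: " + state.current().getTransform());

        state.endViewport();
        System.out.println("after viewport:  " + state.current().getTransform());
    }
}
```

Because the transformation is applied to the copy, popping the stack restores the previous transform, clip and paint settings without having to undo them one by one.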
Re: Skype-conference on page-breaking?
I would be pleased to listen. Renaud
RE: page-breaking strategies and performance
Jeremias Maerki wrote: processing time and additional memory requirements. This leads me to the question if we shouldn't actually implement two page-breaking strategies (in the end, not both right now). For a speed-optimized algorithm, we could even think about ignoring side-floats. Obviously, in this model we would have to make sure that we use a common model for both strategies. For example, we still have to make sure that the line layout gets information on the available IPD on each line, but probably this will not be a big problem to include later. This is an excellent idea. It has from time to time gone under the moniker LayoutStrategy or pluggable layout. To do it without duplicating everything requires that the other pieces of the system be modularized, the concerns separated so that they can be reused. The upside is tremendous and the cost pays for itself in developer productivity. Victor Mote
Re: [XML Graphics - FOP Wiki] Updated: PageLayout
Simon, I've tried to think your example through. If I read the spec right about space resolution then I get the impression that we may need to do more in this area than find a suitable box/glue/penalty combination. There may be several spaces which need to be taken into account during resolution. There's the precedence and the conditionality that needs to be evaluated. I think we may need to create special elements that can hold this information (or reference it). They need to be distinguishable so we can apply the resolution rules properly. I believe your example should then look like this:

- box
- penalty (w=0, p=infinite)
- space
- glue (w=0, y=0, z=0)
- space
- penalty (w=0, p=infinite)
- box

A more complex example would look like this:

<fo:block space-after="5pt">
  <fo:block>a line</fo:block>
  <fo:block space-after="3pt">blah blah</fo:block>
</fo:block>
<fo:block space-before="10pt">blah bla</fo:block>

- box (a line)
- box (blah blah)
- penalty (w=0, p=infinite)
- space (w=3pt, ref to the space property)
- penalty (w=0, p=infinite)
- space (w=5pt, ref to the space property)
- glue (w=0, y=0, z=0)
- space (w=10pt, ref to the space property)
- penalty (w=0, p=infinite)
- box

The algorithm would have to track down the space element before and after the break and then apply the space resolution rules. The space elements would behave much like glue elements. What do you think?

On 25.02.2005 22:50:17 SimonPepping wrote:
+=== Space specifiers ===
+
+When the space specifiers resolve to zero around a page break, we are
+in the same situation as that of a word space in line breaking. It is
+represented by the sequence `box - glue - box`.
+
+When the space specifiers do not resolve to zero around a page break,
+we are in the same situation as that of a word space in line breaking
+in the case of centered lines. It is represented by the sequence
+{{{
+box - infinite penalty - glue(ha) - zero penalty - glue(hn-ha-hb) - zero width box - infinite penalty - glue(hb) - box
+}}}
+where ha is the bpd of
+the space-after before the page break, hb is the bpd of the
+space-before after the page-break, hw is the space when there is no
+page break.

Jeremias Maerki
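The element sequence proposed in this message could be modelled along the following lines. This is a rough sketch under stated assumptions: the class SpaceElements, the Kind enum, the factory methods and spacesFrom are hypothetical names, box heights are left at zero for brevity, and the code only gathers the space elements on each side of a break. It does not attempt the actual XSL-FO space resolution (precedence, conditionality), which would be applied to the collected values afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical element model for page breaking; none of these names are FOP classes. */
final class SpaceElements {

    enum Kind { BOX, GLUE, PENALTY, SPACE }

    static final int INFINITE = Integer.MAX_VALUE;

    static final class Element {
        final Kind kind;
        final int width;    // w, in points (box heights are omitted in this sketch)
        final int penalty;  // p, only meaningful for PENALTY
        final String note;  // content label, or a stand-in for the space property reference

        Element(Kind kind, int width, int penalty, String note) {
            this.kind = kind;
            this.width = width;
            this.penalty = penalty;
            this.note = note;
        }
    }

    static Element box(String content)      { return new Element(Kind.BOX, 0, 0, content); }
    static Element glue()                   { return new Element(Kind.GLUE, 0, 0, ""); }
    static Element infinitePenalty()        { return new Element(Kind.PENALTY, 0, INFINITE, ""); }
    static Element space(int w, String ref) { return new Element(Kind.SPACE, w, 0, ref); }

    /** The element list from the more complex example in the message. */
    static List<Element> example() {
        return Arrays.asList(
            box("a line"), box("blah blah"),
            infinitePenalty(), space(3, "inner space-after"),
            infinitePenalty(), space(5, "outer space-after"),
            glue(),
            space(10, "space-before"), infinitePenalty(),
            box("blah bla"));
    }

    /** Walks from the break in direction step (-1 or +1) and collects space widths until a box. */
    static List<Integer> spacesFrom(List<Element> seq, int breakIndex, int step) {
        List<Integer> widths = new ArrayList<>();
        for (int i = breakIndex + step; i >= 0 && i < seq.size(); i += step) {
            Element e = seq.get(i);
            if (e.kind == Kind.BOX) {
                break;
            }
            if (e.kind == Kind.SPACE) {
                widths.add(e.width);
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        List<Element> seq = example();
        int breakIndex = 6; // position of the glue element, the only legal break here
        System.out.println("spaces before break: " + spacesFrom(seq, breakIndex, -1));
        System.out.println("spaces after break:  " + spacesFrom(seq, breakIndex, +1));
    }
}
```

Keeping the spaces as their own element kind, rather than folding them into glue, is what lets a later resolution step tell them apart from ordinary stretchable glue.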