Re: problem with identity comparison in types inheriting from org.apache.fop.fo.FONode
Sounds reasonable to me--I think I should have done the validation that way to begin with (IIRC, == is Not Good with Strings anyway, for the very reason you gave). I am surprised this problem did not occur before. I'll make the change next.

I am personally pleased that FOP 1.0 can now be activated by directly instantiating FOTreeBuilder, i.e. one can avoid the apps package completely. This is because FOTreeBuilder now has all its needed business logic within it. Still, you may wish to try this method[1] for your work (i.e., using the apps.Fop class instead of FOTreeBuilder directly). FOP.getDefaultHandler()[2] should do the same thing as your new SAXResult(FOTreeBuilder), but FOTreeBuilder is within FOP's black box and may not be available, could be renamed, etc., in the future.

Also, I have to update our embed page -- apparently the links to our examples are no longer working. [Team--viewSVN apparently does not have an annotate (can see line numbers) option -- G]

Thanks,
Glen

[1] http://tinyurl.com/8ofpc
[2] http://tinyurl.com/c7c3z

Nils Meier wrote:

> Hi
>
> I've tried to use FOP for this scenario:
>
> + I have a docbook DOM tree
> + I send it through a transformer using docbook-xsl to generate fo
> + The output of that transformation is 'piped' into the result new SAXResult(new FOTreeBuilder)
>
> so no files are generated - the FOTreeBuilder is fed directly from an in-memory DOM tree.
>
> Now the subtypes inheriting from FONode contain an identity check in:
>
>     protected void validateChildNode(Locator loc, String nsURI, String localName)
>             throws ValidationException {
>         if (nsURI == FO_URI ...
>
> This fails because FO_URI != nsURI *but* FO_URI.equals(nsURI). This is so because the generated FO document was dynamically created in my setup - apparently without String.intern() for the namespace URI.
>
> I'd propose to replace all those identity checks with .equals() to make this safe without relying on intern()-Strings - example:
>
>     public class Root extends FObj {
>         ...
>         protected void validateChildNode(Locator loc, String nsURI, String localName)
>                 throws ValidationException {
>             if (FO_URI.equals(nsURI)) {
>                 ...
>
> Does that sound reasonable? Affects around 30 types in org.apache.fop.fo.*
>
> Thanks
> Nils
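The failure described above can be reproduced in isolation. The following is a minimal sketch, not FOP code: the class name and the runtimeCopy() helper are invented for illustration, and FO_URI here is a local stand-in constant holding the XSL-FO namespace. A namespace string built at runtime is a different object from the constant, so == fails while equals() succeeds; calling intern() restores identity.

```java
/** Illustration only; not part of FOP. */
final class IdentityVsEquality {
    // Stand-in for FOP's FO_URI constant (a compile-time constant, hence interned).
    static final String FO_URI = "http://www.w3.org/1999/XSL/Format";

    /** Builds an equal but distinct String, as a DOM or transformer might at runtime. */
    static String runtimeCopy(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String nsURI = runtimeCopy(FO_URI);
        System.out.println(nsURI == FO_URI);          // false: different objects
        System.out.println(FO_URI.equals(nsURI));     // true: same characters
        System.out.println(nsURI.intern() == FO_URI); // true: intern() yields the pooled constant
    }
}
```

Writing the comparison as FO_URI.equals(nsURI), with the constant on the left as in the proposal, also stays safe if nsURI is ever null.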
Re: problem with identity comparison in types inheriting from org.apache.fop.fo.FONode
Andreas L. Delmelle wrote:

> > -----Original Message-----
> > From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
> >
> > Committers, I don't see a problem with what Nils proposes. Does anyone else? If not, I can do the change next week. If Nils already has a patch for us, even better. :-)
>
> Well, the whole idea behind using interned strings and the == operator is speed. As you both are probably well aware, using .equals() on interned strings is a lot slower than comparing them with ==.

Not necessarily--I suspect implementations of .equals() first check whether the two references are == and, if so, quickly return true before trying a character-by-character compare. You do have a point here though--this portion of the code is heavily exercised.

What I don't understand is why this problem hasn't happened before. If I run command-line, or even embedded, the String for the namespaceURI normally ends up being shared with the static FO_URI String created earlier. It's probably a poor programming practice to rely on that fact, but I still wonder why Nils' namespaceURI != FOP's FO_URI here.

> It may affect 'only' 30 types, but of these types, many can have an unlimited number of instances or children --think of fo:block/fo:inline...-- so this means that such comparisons could be made hundreds (maybe even thousands) of times.
>
> Another option: validateChildNode() is called from only one place, FOTreeBuilder.startElement(). At that point, we can also feed vCN() the parameter namespaceURI.intern() instead of just namespaceURI.

This could be slightly faster for some vCN()'s that compare against multiple URIs--but I would think .intern() is much slower than .equals() for the reason given above.

Glen
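The alternative raised in this message, interning once at the single call site so that the existing == checks keep working, might look roughly like the sketch below. All names here are hypothetical stand-ins rather than FOP's actual classes or signatures, and the sketch makes no claim about which variant is faster.

```java
/** Hypothetical sketch; these are not FOP's real classes or method signatures. */
final class InternAtCallSite {
    static final String FO_URI = "http://www.w3.org/1999/XSL/Format";

    /** Stand-in for a validateChildNode() that keeps the identity check. */
    static boolean isFoNamespace(String nsURI) {
        return nsURI == FO_URI;
    }

    /** Stand-in for the single caller: intern the URI once, then delegate. */
    static boolean startElement(String namespaceURI) {
        return isFoNamespace(namespaceURI.intern());
    }
}
```

The trade-off is one intern() lookup per element in exchange for leaving the roughly 30 validateChildNode() implementations untouched; switching them to equals() removes the dependency on interning altogether.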
Re: problem with identity comparison in types inheriting from org.apache.fop.fo.FONode
Jeremias Maerki wrote:

> On 26.06.2005 14:41:13 Glen Mazza wrote:
> <snip/>
> > > Well, the whole idea behind using interned strings and the == operator is speed. As you both are probably well aware, using .equals() on interned strings is a lot slower than comparing them with ==.
> >
> > Not necessarily--I suspect implementations of .equals() probably first check if they == each other, and if so quickly return true before trying a character-by-character compare.
>
> Glen is right. java.lang.String.equals() checks == as the first statement. So this change shouldn't have a big impact on performance. It's just an additional method call (which might even be inlined by the JIT).
>
> Jeremias Maerki

Thanks for checking.

BTW, Jeremias--about the recent warning you added to the code on ignoring a span attribute on an fo:static-content descendant. Keep in mind, it may end up *not* being ignored, for three reasons: (1) layout may someday allow fo:static-content to be redirected to the fo:region-body (where span values become relevant), although FOP currently raises an error when that occurs; (2) there are some XSL functions which allow you to reference the property value on that FO; and (3) the span attribute could be inherited by an fo:instream-foreign-object that an fo:block encloses.

Personally, I think this warning is not really needed (it is a given from the spec that multiple columns aren't supported in side regions), or would better be placed in layout (query the span attribute from SCLM and complain if not 1 or all).

Glen
Re: Validation: non-inherited properties on FOs they don't apply to
The recommendation allows any property to be on any FO (first sentence of section 5 of 1.1[1], but also somewhere in 1.0), regardless of its utility. So we don't need to mention this as a warning. We *might* want to do this in a few areas, though, where newbies might make very common mistakes (and be upset that FOP isn't working the way they think it should as a result).

Glen

[1] http://www.w3.org/TR/xsl11/#refinement

Jeremias Maerki wrote:

> While creating the check for the span attributes a few minutes ago, I wondered if we shouldn't actually warn about explicit non-inherited properties on FOs they don't apply to. We currently don't do that AFAICS. Am I right that this is a task that we shouldn't forget?
>
> Jeremias Maerki
Re: [NOTICE] Apache FOP moved from CVS to Subversion (SVN)
Peter B. West wrote:

> > You can always get the sources using the official command-line client that comes with Subversion:
> > http://subversion.tigris.org/
> > http://subversion.tigris.org/project_packages.html
>
> Which is what I had to do with BitKeeper, for which no client existed in NetBeans, Eclipse or any other widely used IDE. My only gripe is facing another learning curve for an SCM product whose basic design has already been superseded by the distributed design of BitKeeper, Monotone, Darcs, etc.
>
> Peter

I don't mind the change that much--TortoiseSVN is kind of neat.

Glen
Re: [NOTICE] SVN migration completed
Jeremias/Another committer,

For SVN access, I'm trying to use TortoiseSVN right now. (I can log into svn.apache.org using Putty without problem.) Also, I can easily check out FOP -- but it seems to be checking out the files as anonymous because I can't make any commits using Tortoise (I get 403 forbidden errors).

Does anyone know how I can check out with my username/password with TortoiseSVN so it will let me do commits? The manual[1] is not giving me any indication of how to do this.

Thanks,
Glen

[1] http://tortoisesvn.sourceforge.net/docs/release/TortoiseSVN_en.pdf

Jeremias Maerki wrote:

> The migration of the xml-fop CVS module to Subversion is completed. The CVS module is now read-only. All commits need to happen on SVN from now on.
>
> Base SVN URL for FOP:
> http://svn.apache.org/repos/asf/xmlgraphics/fop/
>
> FOP's trunk is at:
> http://svn.apache.org/repos/asf/xmlgraphics/fop/trunk/
>
> FOP's maintenance branch is at:
> http://svn.apache.org/repos/asf/xmlgraphics/fop/branches/fop-0_20_2-maintain/
>
> ViewCVS link:
> http://svn.apache.org/viewcvs.cgi/xmlgraphics/fop/
>
> More information on [EMAIL PROTECTED]:
> http://www.apache.org/dev/version-control.html
>
> A must-read, the Subversion 1.1 book:
> http://svnbook.red-bean.com/
>
> Subversion for CVS users:
> http://svnbook.red-bean.com/en/1.1/apa.html
>
> Next tasks:
> - Identify old branches and tags that we don't need and can remove.
> - Update our website to say that we're using Subversion now.
>
> If anyone has any problems, just shout!
>
> Have fun and hack away!
> Jeremias Maerki
Re: svn commit: r201562 - /xmlgraphics/fop/trunk/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java
Cool! https:// did it! Thanks, Clay (and also Jeremias for taking the time to give all the links)--this is so much fun, I think I'll finish up FOP tonight... ;-)

Glen

[EMAIL PROTECTED] wrote:

> Author: gmazza
> Date: Thu Jun 23 21:13:43 2005
> New Revision: 201562
>
> URL: http://svn.apache.org/viewcvs?rev=201562&view=rev
> Log:
> First SVN commit. Trivial formatting change.
>
> Modified:
>     xmlgraphics/fop/trunk/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java
>
> Modified: xmlgraphics/fop/trunk/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java
> URL: http://svn.apache.org/viewcvs/xmlgraphics/fop/trunk/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java?rev=201562&r1=201561&r2=201562&view=diff
> ==============================================================================
> --- xmlgraphics/fop/trunk/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java (original)
> +++ xmlgraphics/fop/trunk/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java Thu Jun 23 21:13:43 2005
> @@ -31,6 +31,7 @@
>  import org.apache.fop.fo.Constants;
>  import org.apache.fop.fo.flow.Marker;
>  import org.apache.fop.fo.flow.RetrieveMarker;
> +
>  import org.apache.fop.fo.pagination.Flow;
>  import org.apache.fop.fo.pagination.PageSequence;
>  import org.apache.fop.fo.pagination.Region;
Re: Foray's font subsystem for Fop
Jeremias Maerki wrote:

> Ideally, a font engine that is shared between two projects should be in a more or less neutral place, write-accessible by both parties, but as we've seen now there are personal dissonances. I think Victor said he didn't want to collaborate anymore:
> http://marc.theaimsgroup.com/?l=fop-dev&m=111263144615399&w=2
> http://marc.theaimsgroup.com/?l=fop-dev&m=111265443425492&w=2
>
> The problem comes up if Glen starts to veto using Victor's work, or if Victor can't or won't support our wishes anymore.

Again, I will stay out of it if someone else works with his code. I don't have time for the font work, but certainly recognize it needs improvement.

I'm in layout now--if I don't like the front end it doesn't matter (as much!) anymore, I am now past it. But we've had Victor's front-end architecture for the first of my two years here and my mathematical reduction of it for the second. Improvements on layout have been much more rapid in the second year.

I do appreciate your work here, Jeremias, and don't wish to add to your stress level on this. I won't interfere with the font work.

Regards,
Glen
Re: Fop Viewer
Thanks for helping our project, Richard! FOP/XSL is a fascinating mathematical equation, and the clearer and rawer we can make that equation, the more it will attract the best computer scientists and mathematicians from around the world to help us work on determining it. Reading your credentials on your website only further confirms this for me.

(Sorry though <red face/>, I hope someone else can help out with this patch--renderers have not been my focus for a long time...)

Regards,
Glen

[EMAIL PROTECTED] wrote:

> [EMAIL PROTECTED] writes:
> > I'm considering doing some work on the FOP viewer package.
>
> The first patch has now been submitted. The following improvements have been made:
>
> 1. Separated out the preview panels and control logic from the buttons and dialog to make it easier to use elsewhere. There's a new PreviewPanel class which contains the basic logic. The PreviewDialog surrounds this with the familiar controls.
> 2. Added continuous scrolling and continuous facing modes similar to those used by Acrobat Reader.
> 3. Added facility to drag the scroll area with any drag on that area.
> 4. Added fit-width and fit-to-page zoom options.
> 5. Added a load of mnemonics to menus.
>
> I still intend to:
>
> 1. Add (optional) thumbnail slider windows.
> 2. Add a ruler on the top and the left side of the preview frame showing inches / centimeters.
>
> Someone else needs to fix the non-English resource files.
>
> Comments are welcome,
> Richard
Disconnect PSLM from LayoutManager interface?
Team,

This issue is somewhat messy because it involves two parts--but I'll try to keep it succinct:

1.) The connections between PSLM and AbstractLayoutManager/LayoutManager have been so reduced that it would now appear to be a simpler and more robust design to make PSLM standalone (i.e., not even have it implement the LayoutManager interface). By extending ALM, PSLM is presently sitting on 250-300 LOC that it is not using (with the apparently erroneous exception of one empty method, the second issue below). None of the 15 methods in the LM interface are needed by PSLM today in order to do its duties--so I wonder if we should go ahead and remove it (pending #2 below).

2.) The only remaining obstacle to severing PSLM from LM is the method LayoutManager getTopLevelLM() within PSLM's PageBreaker. PSLM's implementation returns itself here, which therefore requires PSLM to be a LayoutManager.

The problem here is that the breaking mechanism doesn't appear to be doing much at all with the PSLM variable. Tracing the code, the only thing the PageBreaker is asking of the PSLM is a call to addAreas() -- which is defined to be empty for PSLM anyway. I don't see why the page breaking code needs the PSLM--looking at the PageBreaker code, it appears it gets everything it processes from the childFLM.

Can this getTopLevelLM() be rewritten to eliminate the need to supply the pslm object to the PageBreaker? Either (a) just return NULL in PSLM's implementation, (b) return the childFLM (similar to StaticContentBreaker's subclass, which returns the sclm for this method), or even (c) replace the method entirely with addAreasFromLM() (this apparently being the only thing requested after getTopLevelLM() anyway), for which PSLM can just return NULL?

We may decide, for various reasons (loss of symmetry perhaps), not to separate PSLM from LM anyway. And I certainly wouldn't want the code base deformed in order to artificially do this separation. But I also wouldn't want us to be *forced* to keep PSLM a LM due to what may be an incorrect or suboptimal implementation of getTopLevelLM() in the page breaking code.

Thoughts/Comments?

Thanks,
Glen
Re: Changing available BPD between pages
----- Original Message -----

> But since the PageSequenceMaster is a stateful beast that can currently only iterate in one direction I'm in trouble. Next week I'll research a good way to reset or run backwards the PageSequenceMaster so the already created PV in the Provider can ultimately be replaced with a PV for a blank page.

Yes, making PageSequenceMaster flexible--bidirectional, resettable, directly accessible in a static manner for any given page, etc.--would be a very nice idea, and would result in downstream PSLM logic being much easier. I recommend improving PSM first so your subsequent layout work will be simpler.

Glen
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr StaticContentLayoutManager.java LineLayoutManager.java AbstractLayoutManager.java TextLayoutManager.java LayoutManagerMapping.java ContentLayo
Luca,

Are you sure here? We had two versions of addALetterSpaceTo() -- the version in ILLM which takes a List (I didn't touch that one), and an old (?) version from AbstractLayoutManager that takes a KnuthElement. It is that latter version that I removed--it wasn't being called anywhere--not the former.

Glen

----- Original Message -----
From: Luca Furini [EMAIL PROTECTED]
To: fop-dev@xmlgraphics.apache.org
Sent: Friday, June 10, 2005 2:19 PM
Subject: Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr StaticContentLayoutManager.java LineLayoutManager.java AbstractLayoutManager.java TextLayoutManager.java LayoutManagerMapping.java ContentLayoutManager.java LeaderLayoutManager.java LayoutManager.java CharacterLayoutManager.java BlockLayoutManager.java FlowLayoutManager.java

> Thanks for your optimization work, Glen.
>
> Just a note: the method addALetterSpaceTo() is defined in the interface InlineLevelLayoutManager and is still used. It is called by LineLM.collectInlineKnuthElements(), if the last element returned by a child LM and the first returned by the next child LM are both boxes. So, the CharacterLM and LeaderLM (extending LeafNodeLM, which implements InlineLevelLM) should really implement it.
>
> For example, if we have
>
>     <fo:block>a <fo:character character="w"/>ord</fo:block>
>
> we must tell the CharacterLM that the w is followed by a letter space, as it is not a whole word.
>
> Regards
> Luca
Re: Consolidating LayoutManager's and AbstractBreaker's getNextKnuthElements() methods
----- Original Message -----

> I find it strange, Glen, that you don't care whether people use FOP or not.

I'm such a monster.

> You have worked hard on FOP over the last couple of years. Wouldn't you be disappointed if your work benefitted no one?
>
> Chris

My goals on FOP are to make XSL as open-source dominant as XSLT is, to make paying $4-5K/server CPU as ridiculous for an XSL processor as it presently is for an XSLT one. But such an architecture will take much time, and Jeremias is welcome to release as often as he'd like--including today, if he wants--while we get this long-term work hammered down. But right now, studying and learning layout is my goal.

Keep in mind, it is not just the immediate user base that matters--you also need to please the larger corporations (IBM, Sun) and technical organizations (W3C, OASIS and their member companies) for FOP to be successful, and they're a little more thorough on architecture (cf. the behavior of the Xerces and Xalan teams). You please the Big Boys, everything else tends to fall in line, just as it has in the past with most other Java/XML-based technologies.

Glen
Re: Consolidating LayoutManager's and AbstractBreaker's getNextKnuthElements() methods
Jeremias Maerki wrote:

> Given that the EOL phase for 1.3.1 ends March 2006 [1][2] and given FOP's estimated time for the next serious release, JDK 1.3 compatibility may really be no big concern. But I know that many people (mostly running server applications) are still stuck with JDK 1.3; we would do them a disservice by requiring 1.4 too soon. But leave the JDK 1.3 compatibility to me.
>
> [1] http://java.sun.com/j2se/1.3/download.html
> [2] Interesting enough is the fact that the last maintenance release (1.3.1_15) for JDK 1.3.1 dates back to December 2004. Quite recent, don't you think?

Yes, indeed quite recent--I did not know it was still being maintained by Sun. I won't revisit this issue until March 2006 at the earliest then.

(Very detailed response, BTW. You seem to know a thing or two about Java... ;-)

Glen
Consolidating LayoutManager's and AbstractBreaker's getNextKnuthElements() methods
Jeremias, perhaps Luca:

Is there a reason why we maintain separate getNextKnuthElement() methods within both a LayoutManager and its inner AbstractBreaker? Can they be consolidated into LayoutManager, so that we call getTopLevelLM().getNextKnuthElement() within the breaker code instead?

Three LM classes have these two duplicate methods: PSLM, SCLM, and BlockContainerLayoutManager. For SCLM and BCLM I cannot tell if they are mergeable--it looks doubtful to my eyes but I'm unsure. PSLM's two implementations, however, look somewhat strange: for PSLM, what else would activate PSLM.gNKE() other than its PageBreaker.gNKE()? If nothing, those two should at least be merged into PSLM's implementation to clarify -- I'll happily do so if I'm correct here.

Thanks,
Glen
Re: Markers: Determining the last generated area for a LM
Also, one more point--I think it may be a good idea for us to abstract out AreaTreeModel from PSLM and encapsulate it back into AreaTreeHandler (i.e. RootLayoutManager), including moving resolveRetrieveMarker() there. IIRC I was the guilty party who moved ATM into PSLM to begin with, quite erroneously thinking that ATH might be proven superfluous over time, and so trying to make direct ATM--PSLM linkages. ATH is here to stay, though, and resolveRetrieveMarker() is something that cycles through the results of several PSLM instances, so it seems more natural/intuitive to have it in the higher, root-level processing class here.

Thoughts?

Thanks,
Glen

Glen Mazza wrote:

> Jeremias,
>
> I think we do something like this for IDs already -- I wonder if we can use a similar approach here. We already have a PSLM.getFirstPVWithID() method, which, due to the (Map/List) data structure that contains this information in AreaTreeHandler, can probably be easily converted to a PSLM.getLastPVWithID(). Note that with this method, when we add PVs having a given ID, we don't need to send "is first" or "is last" indications; that is easily determinable from the List when it is complete for that property ID.
>
> Can we do a similar thing for markers? I.e., feed a data structure without needing to give first/last indications, and rely on the state of that structure to subsequently find out what is first/last?
>
> Thanks,
> Glen
>
> Jeremias Maerki wrote:
>
> > As you may have seen I've been working through the layoutengine testcases to fix various failures/bugs last week. One of the last problems that need to be fixed is markers. Markers already work fine under the new page breaking mechanism when an FO is not broken over the page/column boundaries.
> >
> > The problem is getting the two last booleans on getCurrentPV().addMarkers() right. Currently the calls are hardcoded to:
> >
> >     getCurrentPV().addMarkers(markers, true, true, false);
> >
> > and
> >
> >     getCurrentPV().addMarkers(markers, false, false, true);
> >
> > The isfirst and islast parameters must be set correctly. Currently, I don't see a reliable way to determine these values. For example, there's some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on the layout context but I found this doesn't work reliably.
> >
> > I've experimented with two other approaches, both of which were not good enough. One (flags on Position instances) failed because the first n elements at the beginning of the element list may be removed, which also removed the marker for the first element in the list. The other (counting Position instances) failed because the element list may be modified after the initial generation, thus throwing off counters. I discarded this mainly because I didn't want to make the code more complicated just to get the indices right again.
> >
> > The only thing that sounds worth pursuing right now is to do look-behind and look-ahead in the Position iterator, which is in a way extending the approach that is currently visible in AreaAdditionUtils. This approach checks whether the current LM changes or not.
> >
> > Maybe someone has another idea on how to approach this problem. I'll let it rest for a moment until I've made keeps and breaks work on tables.
> >
> > Jeremias Maerki
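The idea floated in this thread, recording occurrences without any first/last flags and deriving first/last from the finished structure, can be sketched generically. Everything below is hypothetical: PageIndexByKey, record(), firstPage() and lastPage() are invented names, page numbers stand in for PageViewports, and the code uses modern Java collections rather than anything from FOP. It does not address the harder problem described above, that element lists can change after generation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: records, per key, the pages on which that key appeared. */
final class PageIndexByKey {
    private final Map<String, List<Integer>> pagesByKey = new HashMap<>();

    /** Called in layout order; the caller passes no first/last indication. */
    void record(String key, int pageNumber) {
        pagesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(pageNumber);
    }

    /** First page recorded for the key, or -1 if the key was never seen. */
    int firstPage(String key) {
        List<Integer> pages = pagesByKey.get(key);
        return pages == null ? -1 : pages.get(0);
    }

    /** Last page recorded for the key, or -1 if the key was never seen. */
    int lastPage(String key) {
        List<Integer> pages = pagesByKey.get(key);
        return pages == null ? -1 : pages.get(pages.size() - 1);
    }
}
```

Because first and last are read from the list only once it is complete, a producer that spreads one FO over pages 3, 4 and 7 simply calls record() three times, and the answers fall out of the list's ends.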
Re: [Fwd: Layout simplifications]
Jeremias Maerki wrote:

> AbstractBreaker has maybe two or three methods in common with LayoutManager. Furthermore, I see the Breaker as something else than a layout manager. I think it would be confusing to merge the two concepts. The duplicate methods can probably be moved up to the base class.

I've been looking again at the code--I think I agree with you now. I like how the breakers accumulate all of the breaker-specific methods in one place in the class. It is cleaner (compared to having them mixed within the LM classes), and the code is actually quite efficiently written already.

As for possible duplicate methods such as getNextKnuthElements()--which is apparently different between the Breaker and the LM in some cases anyway--I notice that we already have a getTopLevelLM() which returns the LM getting processed. Where duplication occurs, we can just reference the method from there (i.e., getTopLevelLM().methodInTheLMClass()).

If helpful, we may also want to create more specialized LayoutManager interfaces (e.g., BreakableLM extending LayoutManager), and then have certain LMs implement it in addition to whatever they already extend. If we do that, then we'll have:

    BreakableLM getTopLevelLM();

instead, to call those only-a-few-LMs-have methods. But I currently don't see any need to alter the Breaker classes themselves.

Thanks,
Glen
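A rough sketch of the BreakableLM idea follows, under heavy assumptions: the interfaces below are drastically simplified stand-ins (the real LayoutManager interface is much larger), Knuth elements are represented as plain strings, and DemoFlowLM is an invented class. The point is only that narrowing the return type of getTopLevelLM() lets breaker code delegate to the LM instead of duplicating the method.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in; the real interface has many more methods. */
interface LayoutManager {
    String name();
}

/** Hypothetical narrower interface for the few LMs that take part in breaking. */
interface BreakableLM extends LayoutManager {
    List<String> getNextKnuthElements();
}

/** Stand-in for a breaker that delegates rather than duplicating the method. */
abstract class Breaker {
    /** Typed as BreakableLM so the breaker can call the LM's own method. */
    abstract BreakableLM getTopLevelLM();

    List<String> collectElements() {
        return getTopLevelLM().getNextKnuthElements();
    }
}

/** Invented example LM that produces a fixed element list. */
final class DemoFlowLM implements BreakableLM {
    public String name() {
        return "flow";
    }

    public List<String> getNextKnuthElements() {
        return Arrays.asList("box", "glue", "penalty");
    }
}
```

LMs that never take part in breaking would keep implementing only the base interface, so the extra method does not leak into them.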
Re: [Fwd: Layout simplifications]
Jeremias Maerki wrote:

> I admit that the AbstractBreaker was simply an artifact from merging in Luca's initial code and which later evolved a bit, but I began to like the distinction between the LMs and the breakers.

OK, we'll keep that distinction then. This would have been a very time-consuming code change for me to do, and if the team does not see much benefit to it, it is not worth the effort.

> And as I mentioned before, I prefer getting FOP near a developer release over making the code perfect. But I won't stand in your way if you want to spend countless hours which result in little more than perfectly looking code. There are so many other little things that still need to be done but which have a real impact on the end result. Just my honest, personal opinion.

Thanks for implying I can write perfect code. ;-)

Glen
[Fwd: Layout simplifications]
trying again...

-------- Original Message --------
Subject: Layout simplifications
Date: Mon, 16 May 2005 18:14:52 -0400
From: Glen Mazza [EMAIL PROTECTED]
To: fop-dev@xmlgraphics.apache.org

Team,

Currently the LM classes that use the Knuth breaking strategy employ the breaking via a nested (inner) class -- PageSequenceLayoutManager.PageBreaker, for example. This is causing some duplication in methods (getNextKnuthElements(), for example) and variables in each of the Breaker classes. Also, AbstractBreaker has to duplicate methods already available in AbstractLayoutManager because it does not extend it.

What I would like to do is the following (step-by-step, not proceeding until each stage works):

1.) Have AbstractBreaker extend AbstractLayoutManager (ALM).

2.) Remove the PageBreaker inner class from PSLM, and have PSLM directly extend AbstractBreaker. Refactor and remove any duplicate methods and variables.

3.) Rename AbstractBreaker to AbstractBreakingLayoutManager (or similar).

4.) One by one, remove the nested inner classes from those other breaking LM classes: removing any duplicate methods or functionality already available in the base class, and having those classes extend ABLM instead of ALM. (Only those LM classes which do breaking will extend ABLM--the rest will continue with ALM.)

5.) Refactor/simplify AbstractBreakingLayoutManager, taking advantage of methods already available in AbstractLayoutManager. Refactor/simplify the ABLM-extended classes, again resulting from insights during this process.

6.) (possibly) Create a BreakingLayoutManager interface (extending LayoutManager interface?), just to keep note of the delta between ABLM and ALM.

I think if we do this, it will allow for more simplifications and insights into the layout coding, and make it easier to understand. I don't think we can afford the duplication as-is--my experience so far with FOP is that simplifying engenders more simplifications, while lard ends up begetting more lard.

Thoughts/comments?

Thanks,
Glen
Layout simplifications
Team,

Currently the LM classes that use the Knuth breaking strategy employ the breaking via a nested (inner) class -- PageSequenceLayoutManager.PageBreaker, for example. This is causing some duplication in methods (getNextKnuthElements(), for example) and variables in each of the Breaker classes. Also, AbstractBreaker has to duplicate methods already available in AbstractLayoutManager because it does not extend it.

What I would like to do is the following (step-by-step, not proceeding until each stage works):

1.) Have AbstractBreaker extend AbstractLayoutManager (ALM).

2.) Remove the PageBreaker inner class from PSLM, and have PSLM directly extend AbstractBreaker. Refactor and remove any duplicate methods and variables.

3.) Rename AbstractBreaker to AbstractBreakingLayoutManager (or similar).

4.) One by one, remove the nested inner classes from those other breaking LM classes: removing any duplicate methods or functionality already available in the base class, and having those classes extend ABLM instead of ALM. (Only those LM classes which do breaking will extend ABLM--the rest will continue with ALM.)

5.) Refactor/simplify AbstractBreakingLayoutManager, taking advantage of methods already available in AbstractLayoutManager. Refactor/simplify the ABLM-extended classes, again resulting from insights during this process.

6.) (possibly) Create a BreakingLayoutManager interface (extending LayoutManager interface?), just to keep note of the delta between ABLM and ALM.

I think if we do this, it will allow for more simplifications and insights into the layout coding, and make it easier to understand. I don't think we can afford the duplication as-is--my experience so far with FOP is that simplifying engenders more simplifications, while lard ends up begetting more lard.

Thoughts/comments?

Thanks,
Glen
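The class layering proposed in steps 1-4 can be pictured with a toy hierarchy. This is only a sketch of the shape of the proposal: the Sketch* classes, their methods, and the integer "element counts" are all invented for illustration and do not correspond to FOP's actual AbstractLayoutManager or AbstractBreaker code.

```java
/** Hypothetical stand-in for AbstractLayoutManager: state shared by every LM. */
abstract class SketchAbstractLM {
    private boolean finished;

    boolean isFinished() {
        return finished;
    }

    void setFinished(boolean finished) {
        this.finished = finished;
    }
}

/** Stand-in for the proposed AbstractBreakingLayoutManager (steps 1 and 3). */
abstract class SketchAbstractBreakingLM extends SketchAbstractLM {
    /** Supplied by each concrete breaking LM; 0 means no more elements. */
    abstract int nextElementCount();

    /** Shared driver, written once instead of once per inner Breaker class. */
    int doLayout() {
        int total = 0;
        int count;
        while ((count = nextElementCount()) > 0) {
            total += count;
        }
        setFinished(true); // reuses base-class state rather than duplicating it
        return total;
    }
}

/** Stand-in for a breaking LM with no inner breaker class (step 2). */
final class SketchPageSequenceLM extends SketchAbstractBreakingLM {
    private int remaining = 3;

    int nextElementCount() {
        return remaining > 0 ? remaining-- : 0;
    }
}
```

Non-breaking LMs would continue to extend the plain base class, which is the delta that step 6's optional interface is meant to document.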
Re: First performance comparison
Thanks for taking the time to do this analysis. I was wondering where we were standing on performance. I think it is clear from the drop from 12 sec to 7.8 sec that keeping logging/stdout output reduced helps performance. Keeping quiet seems to be Xalan's approach as well.

I looked at our commercial competitors' sites to see where they are with logging. It appears RenderX doesn't log by default but it has a server-side EnMasse product[1] which does configurable logging. AntennaHouse apparently just uses stdout/stderr[2], but I don't know how much output it produces while running. Since the logging level is nonconfigurable, I would suspect not much.

Glen

[1] http://www.renderx.com/enmasseguide.html
[2] http://www.antennahouse.com/support/qa/QA-product.html#QA2003082202

Jeremias Maerki wrote:

> I've just run readme.fo (from the examples) through both 0.20.5 and CVS HEAD, 20 times in 1 thread, to satisfy my curiosity. I don't want to hide these numbers from you:
>
> 0.20.5 takes 6.3 seconds for that. CVS HEAD took over 12 seconds at the beginning, but while spitting out lots of debug messages. After converting the System.out calls to log.debug calls and setting the log level to SEVERE, the time went down to 7.8 seconds. A subjective impression I had was that CVS HEAD took longer to warm up (i.e. classloading plus initialization).
>
> readme.fo is a document that, except for references, looks fine in both versions, although CVS HEAD produces one page more (11 instead of 10). It looks like the line heights differ quite a bit.
>
> For all those who'd say now that the new FOP is too slow, I'm going somewhere else (or something like that), bear in mind this is all preliminary and based on non-optimized code and work in progress. I was simply curious and I'm sure others are, too. At least, we can say, it's not that bad and nothing is lost. :-)
>
> More later, probably with measurements with tables and on memory consumption.
>
> Jeremias Maerki
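One common way to keep debug output cheap when it is switched off is to guard message construction with a level check. The sketch below uses java.util.logging purely as an assumption (the message above mentions the SEVERE level but does not say which logging library was involved); the class, logger name and counter are invented for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustration only; names are invented and not taken from FOP. */
final class QuietLogging {
    private static final Logger LOG = Logger.getLogger("sketch.layout");

    /** Counts how many debug messages were actually built. */
    static int messagesBuilt = 0;

    /** Stands in for a debug message that is costly to assemble. */
    static String describe(int page) {
        messagesBuilt++;
        return "laying out page " + page;
    }

    static void layoutPages(int count) {
        for (int page = 1; page <= count; page++) {
            // The guard skips message construction when FINE is disabled.
            if (LOG.isLoggable(Level.FINE)) {
                LOG.fine(describe(page));
            }
        }
    }
}
```

With the logger set to SEVERE, layoutPages() never calls describe(), so none of the string building happens in the hot loop.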
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr/table TableStepper.java TableContentLayoutManager.java EffRow.java
[EMAIL PROTECTED] wrote: jeremias2005/05/12 07:13:45 Modified:src/java/org/apache/fop/layoutmgr/table Tag: Temp_KnuthStylePageBreaking TableStepper.java TableContentLayoutManager.java EffRow.java Log: Fix for ArrayIndexOutOfBoundsException when empty grid units are involved. Jeremias, I don't see this as a fix--you seem to be converting a RunTimeException into a logical error (system runs but you get bad output.) The latter is many more times harder and more stressful to fix because with an LE we have no idea where the problem is--FOTree, Layout, Renderers, PDF Library, user version of JDK/Adobe Acrobat, etc., etc. Converting RTE's into LE's IMO does not really create rigorous, robust, low-maintenance coding. +public GridUnit getGridUnit(int index) { +if (index = 0 index gridUnits.size()) { +return (GridUnit)gridUnits.get(index); +} else { +return null; +} +} If the caller is so incompetent to be requesting grid unit #42 for a system with only 10 grid units--shouldn't we have FOP to halt with the Array Index RTE so we can get that bug quickly identified and fixed? (Or, if we can't fix it immediately, put a Band-Aid fix in the caller instead of the callee?) The quiet returning of null thwarts that, and when users start complaining about bad output due to the LE, we won't know where the problem is to fix it. In addition to wearing out committers wading through the renderers and the PDF library when an RTE would have told them to quickly look at the FO package, we'll also have to ask the users a bunch of irrelevant questions such as their versions of Adobe Acrobat, etc. I don't see how an LE helps us here. I mentioned this to you earlier because of a odd change you made (line 98 of [1]) to the Span class to create an LE instead of an RTE should PSLM ask for an invalid column. I don't understand your rationale--if PSLM is asking for the wrong columns, the output will be messed up anyway. 
Best then to choose the solution--i.e., the RTE--that allows us to quickly zero in on the problem. I converted your change back[2, line 94/85] to explicitly throw an IllegalStateException, and it was good I did so--I later had an error in the PSLM coding, asked for an invalid column, and was quickly informed by the RTE what the problem was so I could immediately fix it. Thanks, Glen
[1] http://cvs.apache.org/viewcvs.cgi/xml-fop/src/java/org/apache/fop/area/Span.java?r1=1.6&r2=1.6.2.1&diff_format=h
[2] http://cvs.apache.org/viewcvs.cgi/xml-fop/src/java/org/apache/fop/area/Span.java?r1=1.6.2.1&r2=1.6.2.2&diff_format=h
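The trade-off argued in this message can be sketched in a few lines. This is only an illustration under stated assumptions, not FOP's actual EffRow code: the class RowSketch, its String grid units and both method names are hypothetical stand-ins. It contrasts a lenient accessor that returns null with a fail-fast accessor that stops at the faulty call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a table row; not FOP's EffRow. */
class RowSketch {
    private final List<String> gridUnits = new ArrayList<>();

    void add(String gridUnit) {
        gridUnits.add(gridUnit);
    }

    /** Lenient variant: an out-of-range index quietly yields null. */
    String getGridUnitOrNull(int index) {
        if (index >= 0 && index < gridUnits.size()) {
            return gridUnits.get(index);
        }
        return null;
    }

    /** Fail-fast variant: an out-of-range index is reported at the call site. */
    String getGridUnit(int index) {
        if (index < 0 || index >= gridUnits.size()) {
            throw new IndexOutOfBoundsException(
                    "Grid unit " + index + " requested, but the row has only "
                    + gridUnits.size());
        }
        return gridUnits.get(index);
    }

    public static void main(String[] args) {
        RowSketch row = new RowSketch();
        row.add("A");
        // The null is handed on and may only surface later as bad output.
        System.out.println(row.getGridUnitOrNull(42));
        try {
            row.getGridUnit(42);
        } catch (IndexOutOfBoundsException e) {
            // The exception names the bad request directly.
            System.out.println(e.getMessage());
        }
    }
}
```

If a caller legitimately needs to probe for empty grid units, the lenient variant can still exist under an explicit name, so that the choice is visible at the call site.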
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr PageSequenceLayoutManager.java
[EMAIL PROTECTED] wrote: gmazza 2005/05/12 17:54:14 Modified: src/java/org/apache/fop/layoutmgr Tag: Temp_KnuthStylePageBreaking PageSequenceLayoutManager.java Log: Copied the logic over incorrectly--fixed (even though IIRC RetrieveMarkers work currently anyway.) Correction: *don't* work.
Re: Add a list to MARC -- the Apache FOP lists
Excellent!!! Thanks Hank! FYI Team -- Our MARC Archives[1] are back and have been populated with the previous months' emails! Glen [1] http://marc.theaimsgroup.com/?w=2 --- Hank Leininger [EMAIL PROTECTED] wrote: On Fri, 22 Apr 2005, Glen Mazza wrote: [EMAIL PROTECTED] to [EMAIL PROTECTED] [1] fop-dev@xml.apache.org to fop-dev@xmlgraphics.apache.org [2] fop-cvs@xml.apache.org to fop-commits@xmlgraphics.apache.org [2] Would you be able to change your archives to read from these lists instead? OK, it looks like you guys carried our subscriptions over fine, but we were looking for the @xml.apache.org foo in mail headers to validate that mails were really coming from the lists, and not spam. I just updated the headers we look for, and have un-spam-flagged past fop- traffic, so the missing messages should show up under the existing fop-* archives shortly. If they don't please holler ;) Thanks, Hank Leininger [EMAIL PROTECTED]
Re: removing storing of unresolved idrefs in PageViewports
Andreas L. Delmelle wrote: A closer look at ATH and PV shows that the two above steps are: - add the unresolved area to the viewport stored in the VP as a Map id-List where the List contains references to the source areas (the link's hotspot, the page-number-citation...) - add the unresolved area to the ATH stored here as a Map id-List where the List contains references to the pageVPs that have an unresolved area with a given id Right now, it still seems quite sane to keep the distinction. In principle, the ATH only needs to know which VPs contain an unresolved area mapping to a certain id, where it is the VPs themselves that need the details about the areas...? OK, now I see the reason. Yes, it makes sense to retain the distinction. Thanks, Glen
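The two-level bookkeeping described above can be sketched as follows. This is a hypothetical illustration, not FOP's real AreaTreeHandler or PageViewport classes: PageVp, TreeHandler and the use of plain strings for areas are assumptions made for the example. Each page viewport keeps a map from id to the areas waiting on that id, while the handler only keeps a map from id to the page viewports concerned.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two maps; not FOP's real classes. */
class UnresolvedIdSketch {

    /** Stand-in for a page viewport: id -> areas on this page waiting for it. */
    static class PageVp {
        final String name;
        final Map<String, List<String>> unresolvedAreas = new HashMap<>();

        PageVp(String name) {
            this.name = name;
        }
    }

    /** Stand-in for the area tree handler: id -> page viewports waiting for it. */
    static class TreeHandler {
        final Map<String, List<PageVp>> waitingPages = new HashMap<>();

        void addUnresolved(String id, PageVp page, String area) {
            // Details about the area stay with the page viewport.
            page.unresolvedAreas.computeIfAbsent(id, k -> new ArrayList<>()).add(area);
            // The handler only records which viewports are affected.
            List<PageVp> pages = waitingPages.computeIfAbsent(id, k -> new ArrayList<>());
            if (!pages.contains(page)) {
                pages.add(page);
            }
        }

        /** Called once the id is known; returns how many areas were resolved. */
        int resolve(String id) {
            List<PageVp> pages = waitingPages.remove(id);
            if (pages == null) {
                return 0;
            }
            int count = 0;
            for (PageVp page : pages) {
                List<String> areas = page.unresolvedAreas.remove(id);
                if (areas != null) {
                    count += areas.size();
                }
            }
            return count;
        }
    }

    public static void main(String[] args) {
        TreeHandler handler = new TreeHandler();
        PageVp p1 = new PageVp("page 1");
        PageVp p2 = new PageVp("page 2");
        handler.addUnresolved("chapter2", p1, "link hotspot");
        handler.addUnresolved("chapter2", p1, "page-number-citation");
        handler.addUnresolved("chapter2", p2, "link hotspot");
        System.out.println(handler.resolve("chapter2"));
    }
}
```

Keeping the area details in the viewport means the handler never needs to look inside a page until the id is actually resolved.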
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr PageSequenceLayoutManager.java
Jeremias Maerki wrote: Glen, I'd like you to revert that one and take a different approach if any. No problem. I'll do that soon, but not tonight, because I'm quite tired. handleBreak does really handle break-before AND break-after, so the name was ok. Yes, you are right. Unfortunately I had only read the former PSLM.prepareNormalFlowArea(), and from that mistakenly concluded that break-before was its only use. What can be is that there is some left-over code from the previous approach. Yes, now we should remove the older code (pNFA() etc.). Last January PSLM was about 930 LOC, but soon it may be only 560 LOC! Thanks, Glen
Re: iStartOn = break-before?
OK, I'll update the variable with a comment explaining its meaning. Thanks, Glen Jeremias Maerki wrote: Using the official terms is usually a good idea but in this instance I'd leave it like it is. It's not just the break-before value. The value for startOn can also come from a break-after property. It simply indicates for a BlockSequence on what kind of page it should start after normalizing where the value originally came from (break-after or break-before). IMO using break-before here would actually be confusing and startOn is more descriptive. On 05.05.2005 06:00:56 Glen Mazza wrote: Team, The AbstractBreaker.BlockSequence has an iStartOn property, that from looking at PSLM, quite possibly just means the break-before trait. Are they synonyms? I would like us to use the official Rec term if at all possible. Thanks, Glen public class BlockSequence extends KnuthSequence { private int startOn; public BlockSequence(int iStartOn) { super(); startOn = iStartOn; } . Jeremias Maerki
iStartOn = break-before?
Team, The AbstractBreaker.BlockSequence has an iStartOn property, that from looking at PSLM, quite possibly just means the break-before trait. Are they synonyms? I would like us to use the official Rec term if at all possible. Thanks, Glen public class BlockSequence extends KnuthSequence { private int startOn; public BlockSequence(int iStartOn) { super(); startOn = iStartOn; } .
Re: Applying Finn Bock's patch (again) :-)
BTW, is the page breaking also using Knuth's algorithms, or is it from the research paper* that Jeremias ordered a few months back and was mentioning to us? I have been generically calling both the line- and page-breaking the Knuth code--I don't know how correct that is. Thanks, Glen * Also, would it help me to order that research paper--how helpful would it be in understanding our code? Luca Furini wrote: I realized just a few days ago that the breaking algorithm (in the BreakingAlgorithm class) is not fully patched with Finn's great refactoring of the Knuth code (bug 32612). I must admit that this is due to my laziness: when I was playing with Knuth's algorithm for page breaking I applied to my local copy of the code only the new restarting strategy, so, although Jeremias applied the patch before the branch, most benefits in performance and readability got lost in the merge. I have now applied the patch to the branch code: it needed some change in order to fit in the new classes, and I hope I did not introduce errors. :-) A few doubts / questions / comments: - the value BestRecord.INFINITE_DEMERITS: I'm not sure it must be +infinity; if it is a finite value it acts like a threshold, reducing the number of active nodes. On the other hand, the finite value should be carefully chosen, otherwise breakpoints with an allowed adjustment ratio could be later discarded because it has more than finite infinity demerits (this is something that Finn pointed out some time ago). What about a finite value computed according to the adjustment threshold and the line width? - in addition to Finn's restarting strategy, lastTooShort is resetted to null after the creation of new nodes: the newly created nodes are surely better than it, and a lastTooShort solution will be found later; it will most likely have more demerits (demerits always increase, when a new line / page is created), but it will be better anyway. 
- as now KnuthSequence extends ArrayList instead of LinkedList, a few more optimizations could be done here and there: using get() instead of ListIterators, for example. Regards Luca
Re: Release details
Gaywood, Mark wrote: Dear all, With your planned release of version 1, is there a list of anticipated conformance to 1.1 of the XSL:FO specifications? i.e. What you will and will not be supporting. Best regards and thank you in advance, Mark Gaywood Bookmarks are already in but otherwise no, we don't have a list. Glen
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr AbstractBreaker.java PageSequenceLayoutManager.java
Looks good! Glen

[EMAIL PROTECTED] wrote:

lfurini 2005/04/27 08:59:59

Modified: src/java/org/apache/fop/layoutmgr Tag: Temp_KnuthStylePageBreaking AbstractBreaker.java PageSequenceLayoutManager.java
Log:
Using a more clear boolean instead of an int, as suggested by Glen and Andreas

Revision  Changes  Path
No revision
No revision
1.1.2.6  +3 -3  xml-fop/src/java/org/apache/fop/layoutmgr/Attic/AbstractBreaker.java

Index: AbstractBreaker.java
===================================================================
RCS file: /home/cvs/xml-fop/src/java/org/apache/fop/layoutmgr/Attic/AbstractBreaker.java,v
retrieving revision 1.1.2.5
retrieving revision 1.1.2.6
diff -u -r1.1.2.5 -r1.1.2.6
--- AbstractBreaker.java 26 Apr 2005 16:39:12 - 1.1.2.5
+++ AbstractBreaker.java 27 Apr 2005 15:59:59 - 1.1.2.6
@@ -96,7 +96,7 @@
         return (blockLists.size() == 0);
     }

-    protected void startPart(BlockSequence list, int localPageNumber) {
+    protected void startPart(BlockSequence list, boolean bIsFirstPage) {
         //nop
     }

@@ -202,7 +202,7 @@
             System.out.println("PLM part: " + (p + 1) + ", break at position " + endElementIndex);

-            startPart(effectiveList, p+1);
+            startPart(effectiveList, (p == 0));

             int displayAlign = getCurrentDisplayAlign();

1.50.2.18  +8 -12  xml-fop/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java

Index: PageSequenceLayoutManager.java
===================================================================
RCS file: /home/cvs/xml-fop/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java,v
retrieving revision 1.50.2.17
retrieving revision 1.50.2.18
diff -u -r1.50.2.17 -r1.50.2.18
--- PageSequenceLayoutManager.java 26 Apr 2005 16:39:12 - 1.50.2.17
+++ PageSequenceLayoutManager.java 27 Apr 2005 15:59:59 - 1.50.2.18
@@ -193,7 +193,7 @@
         addAreas(alg, partCount, originalList, effectiveList);
     }

-    protected void startPart(BlockSequence list, int localPageNumber) {
+    protected void startPart(BlockSequence list, boolean bIsFirstPage) {
         if (curPage == null) {
             throw new IllegalStateException("curPage must not be null");
         } else {
@@ -203,16 +203,12 @@
             if (!firstPart) {
                 if (curFlowIdx < curPage.getCurrentSpan().getColumnCount()-1) {
                     curFlowIdx++;
-                } else if (localPageNumber == 1) {
-                    // this is the first page that will be created by
-                    // the current BlockSequence: it could have a break
-                    // condition that must be satisfied
-                    handleBreak(list.getStartOn());
-                } else {
-                    // this is NOT the first page that will be created by
-                    // the current BlockSequence: we simply need a new
-                    // page
-                    handleBreak(Constants.EN_PAGE);
+                } else {
+                    // if this is the first page that will be created by
+                    // the current BlockSequence, it could have a break
+                    // condition that must be satisfied;
+                    // otherwise, we simply need a new page
+                    handleBreak(bIsFirstPage ? list.getStartOn() : Constants.EN_PAGE);
                 }
             }
         }
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr AbstractBreaker.java PageSequenceLayoutManager.java
Oops... --- Glen Mazza [EMAIL PROTECTED] wrote: --- [EMAIL PROTECTED] wrote: -protected void startPart(BlockSequence list) { +protected void startPart(BlockSequence list, int localPageNumber) { boolean isFirstPageByBlock is probably better. The meaning of localPageNumber to indicate the first page created by a particular block I think will cause confusion, will cause confusion I think... Glen
Re: Requested MARC to update mailing archives
Did you also request the same from Eyebrowse? I can do so again if it would help. Glen --- Jeremias Maerki [EMAIL PROTECTED] wrote: I did on 2005-03-04 and got no response. Thanks for retrying. On 24.04.2005 23:43:05 Glen Mazza wrote: I don't know if someone else has done so earlier (I recall Chris raising the issue a few weeks back), but I sent an email yesterday to the MARC list[1] asking them to switch their FOP lists to our new @xmlgraphics addresses. I hope they will do so--our present archives lists aren't very good--there's no searching capability, etc. Glen [1] http://marc.theaimsgroup.com/ Jeremias Maerki
RE: Problems with break conditions and empty pages
Hi Andreas: --- Andreas L. Delmelle [EMAIL PROTECTED] wrote: [Glen:] With an FLM-centric approach, what I'm seeing is something like either of these two: (pseudocode) a) Within PSLM:

FlowLayoutManager flm = new FlowLayoutManager(simplePageMaster);
while ((pv = flm.getNextPageViewport()) != null) {
    addStaticContent(pv); // done for *every* PV
    areaTreeModel.renderPage(pv);
}

I don't quite get this... With the FLM controlling layout for a subset of the descendants of an fo:flow, I was thinking of the FLM controlling layout for *all* the FO descendants of the fo:flow. One and only one FLM instance for the fo:flow of an fo:page-sequence. It returns PageViewport instances to PSLM as it does its processing. I didn't exactly mean there should be a one-to-one relation between FLMs and those subsets (or even between the FLMs and the pages/page-masters). I'm unsure what you mean by subsets--while forced page-breaks may cause the Knuth algorithms to need to group up the PageViewports generated by the FLM into subsets for each subset to get optimized, that is IMO an FLM-specific implementation detail that PSLM ideally shouldn't need to be concerned about. More like: the FLM holds a reference to the entire set of descendants, which it may or may not be able to layout all on one page --depending on the properties of the page-master that is used to create the first page-viewport that the PSLM provides it with. IMO the FLM is a PageViewport-generating (and -returning) machine--with the number of pages it needs dependent upon the FLM's implementation, page breaking strategies, etc. My FLM may return 54 PV's to the PSLM for a given page-sequence -- yours might return 50 for example. If the content fits in one page --sufficiently low bpd or indefinite page-height-- no additional pages are requested from the PSLM. This is all FLM needs to do to create a fully initialized PV: PV pv = new PV(spm); Why bother having PSLM declare and feed it to FLM each time?
Just another unneeded moving part. FLM can fill up the BodyRegion and send it back to PSLM for it to do the StaticContent/SideRegions. If it doesn't fit in one page --bpd too high or forced page-breaks-- the FLM signals to the PSLM if and when it needs a new page, so that the PSLM can: a. finish up the current one -- instruct SCLMs to layout their content to the assigned region-viewports. (and if there's any content left in the flow) b. create a main viewport for the next page(s) Actually, PSLM doesn't do that anymore. PV creates itself internally, including its subareas. That's what allows us to discuss creating PV's in either FLM or PSLM. PSLM has way too much logic to handle than to get into the irrelevant minutiae of creating a PV. (You may be forgetting also that PSLM will also need to handle Flow Maps in the future.) c. instruct FLM to resume layout where it left off, handing it the freshly created viewport as a blank canvas This is what is driving me crazy: I said three weeks ago, let's call FLM BodyRegionLM instead, because it is just doing page-by-page layout, with everything being controlled by PSLM and you said no, no, no--it processes the entire Flow--so call it FLM. Now, you disagree in having FLM process the entire flow, you want FLM to be page-by-page, i.e., a BodyRegionLM but not called that. I can accept either implementation--although I do prefer having the FLM now--but its name should be consistent with what it does. It just creates the pages on-demand, using page-masters defined higher up, but *as needed* by the FLM. Be careful, never say pages unless you are talking about the physical medium. Say PageViewports/page-viewport-areas instead. Also, page-masters are not defined higher-up, they are defined *separately* (in fo:layout-master-set) and should be available for any LM that references their values. So, in a sense, the 'page-breaking' can also be considered to take place at levels far deeper than the FLM --fo:blocks with forced page-breaks.
That granularity I'm not at yet to be able to comment--I currently just prefer it to be outside of PSLM in FLM. To revisit the implementation of spans: snip/ This I also don't have an opinion on right now, other than to say that since fo:s-c can unfortunately be redirected to the fo:region-body, it will eventually need to handle columns as well. Your idea may very well take care of this issue--I haven't an alternative ATM. WRT implementing footnotes and floats, I see a few possibilities: 1) The FLM first performs layout, ignoring the footnotes/floats. I don't think it can ignore it, at the very least it will need the BPD that the separators take up in order to do its space calculations when footnotes/floats occur. I suspect the PSLM will pre-create each SCLM instance for footnote/float separators. (i.e., we don't create and initialize these objects for every page), so the FLM
RE: Problems with break conditions and empty pages
--- Andreas L. Delmelle [EMAIL PROTECTED] wrote: Indeed not. The FlowLM should definitely keep track of this, when applicable --in my description: the FlowLM would store the reference to its last processed descendant before the page overflow, and the PageSequenceLM, upon finishing one total page-vp, would simply instruct the FlowLM to 'continue, wherever it left off'. OK. Not a problem here. No, the 'M' is for 'Manager'... From the POV of the FlowLM, the constraints on the total layout dimension of the areas generated by its descendants is not something it *needs* to have any control over itself --let alone: create at will :-/ Well I think different FLM's would make different judgment calls in terms of page breaking mechanisms and/or column balancing, at least. ...and the fo:page-sequence-master is linked in too-many-ways-to-mention to the fo:page-sequence directly (not, or only indirectly to, the fo:flow). I don't think this matters all that much. Just another unneeded moving part. FLM can fill up the BodyRegion and send it back to PSLM for it to do the StaticContent/SideRegions. Or it can fill up the readily created BodyRegion, and signal the PSLM that the layout for the static content may begin... Hmm, maybe the real fun of it is: if you would really like to wait until the end of the PageSequence to layout the static content for all pages...? No...do it after each page is done...because that is how it is sent to the AreaTreeModel for rendering. However, and this is where the main issue will be, the FLM may need to juggle three or four PV's at a time while it is determining optimal Knuth page breaks, etc. Once all that is determined, *then* send them back one-by-one to PSLM. However, I will stay out of this issue for the moment and await comments from Luca and/or others on the team. This is outside my scope of knowledge. So, there will be a possible N flows for one page-sequence...?
All the more reason, it seems, to centralize the page-generation at the 'one' end of that branch... I think they will be used more commonly for fo:static-content objects (print the same information in both the region-before and region-after, etc.) Yes and no :-) It performs layout for the entire flow, on a page-by-page basis, in close co-operation with the PageSequenceLM --and the StaticContentLM, if necessary for static-content float/footnote separators. Start-pause-resume-pause-resume... It flows its content, bit by bit --reporting to PSLM, and handing over control every page or so. OK--sounds good. Thanks, Glen
RE: Problems with break conditions and empty pages
--- Andreas L. Delmelle [EMAIL PROTECTED] wrote: Hmm.. This does seem to be one of those situations where the logic could be placed anywhere. However, taking into account Luca's remarks, I would be inclined to see it as: The PSLM creates the page-viewports, and passes them on to 1. the FLM --which controls layout of a subset of the areas generated by descendants of the fo:flow (ultimately also floats/footnotes) 2. the SCLMs --which control layout of the areas generated by the descendants of the fo:static-contents (+ possible retrieved markers from the subset processed by the FLM) I like this. In this respect, the page-breaking logic is already at its most appropriate place. The FLM needs only a part of the total page-vp, so it makes sense to handle the creation/initialization of the viewport one level up. Actually, creating/initializing a PageViewport is not a big deal anymore--all PSLM does these days is: PageViewport pv = new PageViewport(simplePageMaster); That's it. The details of setting up region-reference-areas, page-reference-areas, etc., is all automatically done in PageViewport and related Area classes. So this one-line initialization can be done by FLM or PSLM wherever convenient. If we go this route--and it's primarily dependent on whether Luca is comfortable with it and sees sufficient benefits to it--the total number of pageViewports needed for a page-sequence would be a function of the page-breaking and layout strategy of the *FLM*, not the PSLM. For example, FLM may not immediately be able to support multiple columns and column balancing--output is single-column only. Once we have columns and column-balancing implemented in FLM, probably a different number of PV's would be needed for a given page-sequence. Or, different Knuth implementations in FLM result in a different PV count. Also, (say) Finn, via our pluggable layout manager mechanism, decides to implement a different FLM, again changing the number of PV's needed, etc., etc.
PSLM can remain the same regardless of FLM implementation. With an FLM-centric approach, what I'm seeing is something like either of these two: (pseudocode)

a) Within PSLM:

FlowLayoutManager flm = new FlowLayoutManager(simplePageMaster);
while ((pv = flm.getNextPageViewport()) != null) {
    addStaticContent(pv); // done for *every* PV
    areaTreeModel.renderPage(pv);
}

(getNextPageViewport() returns one PV object with its flow information populated.)

b) Or, have a push mechanism in PSLM:

FlowLayoutManager flm = new FlowLayoutManager(simplePageMaster);
flm.doLayout();

public void pageViewportFinished(pv) { // called by FLM
    addStaticContent(pv); // done for *every* PV
    areaTreeModel.renderPage(pv);
}

Also: PSLM needs to provide FLM the following: 1.) getBeforeFloatSeparator(); 2.) getFootnoteSeparator(); How these two are provided I'm not sure at the moment: have PSLM render these two by calling a SCLM, have FLM render them by calling a SCLM, etc. -- I don't know. Thanks, Glen
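The pull and push variants from this message can be put side by side in a small runnable sketch. Everything here is a stand-in and an assumption made for illustration: PageVp, FlowLm, PageSequenceLm and PageListener are not FOP classes, the "layout" simply produces a fixed number of viewports, and adding static content and rendering are reduced to setting a flag and appending to a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an FLM-driven page loop; not FOP's real classes. */
class FlowDrivenLayoutSketch {

    /** Stand-in for a page viewport. */
    static class PageVp {
        final int number;
        boolean staticContentAdded;

        PageVp(int number) {
            this.number = number;
        }
    }

    /** Callback used by the push variant. */
    interface PageListener {
        void pageViewportFinished(PageVp pv);
    }

    /** Stand-in FLM that pretends the flow needs a fixed number of viewports. */
    static class FlowLm {
        private final int pagesNeeded;
        private int produced;

        FlowLm(int pagesNeeded) {
            this.pagesNeeded = pagesNeeded;
        }

        /** Pull variant: next finished viewport, or null once the flow is consumed. */
        PageVp getNextPageViewport() {
            return produced < pagesNeeded ? new PageVp(++produced) : null;
        }

        /** Push variant: the FLM drives the loop and reports each finished viewport. */
        void doLayout(PageListener listener) {
            PageVp pv;
            while ((pv = getNextPageViewport()) != null) {
                listener.pageViewportFinished(pv);
            }
        }
    }

    /** Stand-in PSLM: identical whichever way the viewports arrive. */
    static class PageSequenceLm implements PageListener {
        final List<PageVp> rendered = new ArrayList<>();

        void runPull(FlowLm flm) {
            PageVp pv;
            while ((pv = flm.getNextPageViewport()) != null) {
                pageViewportFinished(pv);
            }
        }

        void runPush(FlowLm flm) {
            flm.doLayout(this);
        }

        @Override
        public void pageViewportFinished(PageVp pv) {
            pv.staticContentAdded = true; // stands in for addStaticContent(pv)
            rendered.add(pv);             // stands in for areaTreeModel.renderPage(pv)
        }
    }

    public static void main(String[] args) {
        PageSequenceLm pull = new PageSequenceLm();
        pull.runPull(new FlowLm(3));
        PageSequenceLm push = new PageSequenceLm();
        push.runPush(new FlowLm(2));
        System.out.println(pull.rendered.size() + " pulled, " + push.rendered.size() + " pushed");
    }
}
```

In both variants the per-viewport work lives in one method of the PSLM stand-in, which is the property the message argues for: the number of viewports depends only on the FLM.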
Requested MARC to update mailing archives
I don't know if someone else has done so earlier (I recall Chris raising the issue a few weeks back), but I sent an email yesterday to the MARC list[1] asking them to switch their FOP lists to our new @xmlgraphics addresses. I hope they will do so--our present archives lists aren't very good--there's no searching capability, etc. Glen [1] http://marc.theaimsgroup.com/
Re: Problems with break conditions and empty pages
--- Luca Furini [EMAIL PROTECTED] wrote: Break conditions in page breaking are quite similar to preserved linefeeds in line breaking: they divide a fo:page-sequence in smaller sequences, Another way of thinking about it would be that the array of page-viewport-areas returned by this FO is divided into smaller arrays, with each smaller array undergoing its own Knuth page breaking process. (I prefer to think of areas being divided rather than FO's.) Also, as food for thought, I wonder if the two methods Luca has mentioned should eventually be in FlowLayoutManager (FLM) instead. The break properties appear relevant only for fo:flow descendants. snip/ I don't know if the methods could be moved to the FLM: besides the break value, they depend on the current page number and this is known only by the PageSequenceLM. Actually, that is now available as a public accessor in the PageViewport object, so any LM working with one has access to the page number. And, within reason, accessors within PSLM could be used by FLM, which maintains a reference to its parent LM. the FLM is the immediate LM child of PSLM, so it should have everything that PSLM does (except for the static content, which I don't think we care about when it comes to page breaking anyway.) Ideally, FLM should be the topmost LM that handles the page breaking, no? I wonder if the Knuth code should be out of PSLM completely I need some more time to reflect on this idea, but I write a quick answer anyway. My first impression is that I would find somewhat strange that the *page* breaking is not in the *Page*SequenceLM! :-) Well, under our current philosophy, our LM's map to the formatting object (here, the page sequence), not the areas they generate. I was reminded a bit on that a few weeks ago by Andreas and Simon. You may recall, I recommended at the time that we have BodyRegionLM and a SideRegionLM instead of a FLM and a StaticContentLM. 
Under this scenario, PSLM controls complete page-by-page layout, and delegates to the BRLM and SRLM to do the body region or side areas. But if we have an FLM instead, my thinking is that it should perhaps process the entire fo:flow--including the creation of multiple page-viewport-areas in order to consume that flow. A more serious comment is that some formatting objects (footnotes and before floats) generates page-level-out-of-line-areas, whose placement, according to the recommendation (4.2.5), is controlled by the fo:page-sequence ancestor; I think this is because of the footnote and before-float separators (not the footnotes and before-floats themselves) which are defined in fo:static-content FO's under the page-sequences. The FLM somehow would have to be able to create these separators each time they are needed for each page. As for location controlled by the fo:page-sequence ancestor, that could simply mean that the fo:page-sequence defines the page margins and the side region dimensions. The footnote is just above the region-after, and before-floats are just below the region-before, hence the fo:page-sequence determines its location. This wouldn't necessarily mean that the actual layout of these objects needs to be done by the PSLM. so, if the PSLM must handle footnotes and before floats (influencing the available bpd for the normal areas) it must handle the whole page breaking process. Well, the available maximum bpd can be accessed from the area.BodyRegion child of the PageViewport--this value is calculated automatically upon initialization of a PageViewport. As you can see from section 6.10.1.3[1], these two areas consume space from the main-reference-area. So it appears that all that would be necessary is for the FLM to create a PageViewport, and if a flow has a before-float or footnote, reduce that bpd for the regular normal-reference-areas. (Also, to add the footnote/before-float separators in.) 
Actually, IMO right now, this work can be done by either PSLM or FLM. If the team's instincts are to remain with PSLM for this, that would be OK with me. Thanks, Glen [1] http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#pg-out-of-line
Re: Problems with break conditions and empty pages
Just so I understand how this is supposed to work, will someone please confirm my assumptions below: 1.) If FOP is processing a block on the middle of page 17 with a break-before value of even-page, FOP is supposed to render this block at the top of page 18 instead. and 2.) If FOP is processing a block on the middle of page *18* with a break-before value of even-page, FOP is supposed to render it at the top of page 20 instead. and 3.) The above processing is done only once for the fo:block with this property. I.e., assuming no child of fo:block has this property as well, if the block takes up multiple pages it will use pages 18-19-20-21-22..., for (1) above, and *not* 18-20-22-24... Thanks, Glen --- Luca Furini [EMAIL PROTECTED] wrote: It seems there is a bug affecting the creation of the right kind of page for documents containing blocks with break-* = odd-page or even-page. If break-before = odd-page *each* page with some content is odd; even pages are all empty. If break-before = even-page the content is placed only on even pages, while odd pages are empty; moreover, if the block with break-before is the first one in the document it is placed on the first page (which is odd!), without adding an empty page before. The same happens with break-after. I think this could depend on the conditions tested in the methods PSLM.needEmptyPage() and PSLM.needNewPage(); in particular, the first one should return false if the first page has already been created, while now it seems to return always true. I'll look at this again next week, obviously unless someone finds a fix before! :-) Regards Luca
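Assumptions 1.) and 2.) above can be written down as a small function, which may make them easier to confirm or refute. This is a sketch of the assumed behaviour only, not FOP code: BreakTargetSketch, BreakType and targetPage are hypothetical names, and the function covers only a block that starts in the middle of an already used page (it does not model the first-block case from Luca's report).

```java
/** Hypothetical sketch of the assumed break-before semantics; not FOP code. */
class BreakTargetSketch {

    enum BreakType { PAGE, EVEN_PAGE, ODD_PAGE }

    /**
     * Page on which a block should start if it is reached in the middle of
     * currentPage and carries the given break-before value.
     */
    static int targetPage(int currentPage, BreakType type) {
        int next = currentPage + 1;
        switch (type) {
            case EVEN_PAGE:
                // Skip one page if the next page would be odd.
                return (next % 2 == 0) ? next : next + 1;
            case ODD_PAGE:
                // Skip one page if the next page would be even.
                return (next % 2 != 0) ? next : next + 1;
            default:
                return next;
        }
    }

    public static void main(String[] args) {
        System.out.println(targetPage(17, BreakType.EVEN_PAGE)); // assumption 1.)
        System.out.println(targetPage(18, BreakType.EVEN_PAGE)); // assumption 2.)
    }
}
```

Under assumption 3.) this function would be consulted once per block carrying the property, with later pages of the same block simply following consecutively.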
Re: Problems with break conditions and empty pages
Also, as food for thought, I wonder if the two methods Luca has mentioned should eventually be in FlowLayoutManager (FLM) instead. The break properties appear relevant only for fo:flow descendants. Glen --- Glen Mazza [EMAIL PROTECTED] wrote: Just so I understand how this is supposed to work, will someone please confirm my assumptions below: 1.) If FOP is processing a block on the middle of page 17 with a break-before value of even-page, FOP is supposed to render this block at the top of page 18 instead. and 2.) If FOP is processing a block on the middle of page *18* with a break-before value of even-page, FOP is supposed to render it at the top of page 20 instead. and 3.) The above processing is done only once for the fo:block with this property. I.e., assuming no child of fo:block has this property as well, if the block takes up multiple pages it will use pages 18-19-20-21-22..., for (1) above, and *not* 18-20-22-24... Thanks, Glen --- Luca Furini [EMAIL PROTECTED] wrote: It seems there is a bug affecting the creation of the right kind of page for documents containing blocks with break-* = odd-page or even-page. If break-before = odd-page *each* page with some content is odd; even pages are all empty. If break-before = even-page the content is placed only on even pages, while odd pages are empty; moreover, if the block with break-before is the first one in the document it is placed on the first page (which is odd!), without adding an empty page before. The same happens with break-after. I think this could depend on the conditions tested in the methods PSLM.needEmptyPage() and PSLM.needNewPage(); in particular, the first one should return false if the first page has already been created, while now it seems to return always true. I'll look at this again next week, obviously unless someone finds a fix before! :-) Regards Luca
RE: Problems with break conditions and empty pages
--- Andreas L. Delmelle [EMAIL PROTECTED] wrote: -Original Message- From: Glen Mazza [mailto:[EMAIL PROTECTED] Hi Glen, Also, as food for thought, I wonder if the two methods Luca has mentioned should eventually be in FlowLayoutManager (FLM) instead. The break properties appear relevant only for fo:flow descendants. Interesting idea. The FLM may have more convenient access to the information needed to deal with exactly this type of situation... Well, the FLM is the immediate LM child of PSLM, so it should have everything that PSLM does (except for the static content, which I don't think we care about when it comes to page breaking anyway.) Ideally, FLM should be the topmost LM that handles the page breaking, no? I wonder if the Knuth code should be out of PSLM completely, i.e., have FLM have this method: PageViewport[] generatePages(), which would be called by PSLM, and once it returns, PSLM then takes care of static content before sending each page to the AreaTreeModel for rendering. (Or, have the FLM feed the pages back to PSLM one-by-one, after it finishes the flow for that page. Same principle here--the FLM would do the breaking.) Luca, Jeremias, WDYT? At the very least, it's worth considering moving part of the logic to FLM --say, storing a state variable indicating whether the last page-break was forced or not-- so the result of PSLM.needNewPage() would depend on FLM.needNewPage() which would in its turn depend on 'lastPBForced'. OTOH, this state variable could also be stored in the PSLM itself... Roughly, the logic could become something like

PSLM.needNewPage(int breakVal) {
    if ((curPage != null) && curPage.getPage().isEmpty()) {
        if (breakVal == PAGE) {
            return (currentPageNum == 1);
        } else {
            boolean evenPage = (currentPageNum % 2 == 0);
            return ((breakVal == (evenPage ? ODD_PAGE : EVEN_PAGE)) || lastPBForced);
        }
    } else {
        return true;
    }
}

The logic is not as much a concern to me as its location. This seems like it should ideally *all* be in FLM.
I would think FLM is to completely take care of the fo:flow, including making 47 pages if need be, doing the incrementing of columns within the span on each page, etc. PSLM would just add the static content after each page-viewport is returned to it by FLM. I wonder if PSLM should be so designed that if we had multiple ways to break up pages--it might mean multiple FLM implementations, but PSLM would be the same regardless. Thanks, Glen
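A minimal sketch of the division of labour suggested above, for illustration only. All type names here (SketchPage, FlowBreaker, PageSequenceDriver) are hypothetical stand-ins, not FOP classes, and the fixed lines-per-page rule is an assumed placeholder for the real (Knuth-based) breaking logic. The point is only that the flow side decides where pages break, while the page-sequence side adds static content to whatever pages come back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a page viewport; not a FOP class.
class SketchPage {
    final int number;
    final List<String> flowLines = new ArrayList<>();
    String staticContent = "";

    SketchPage(int number) {
        this.number = number;
    }
}

// Plays the role suggested for FLM: it alone decides where pages break.
class FlowBreaker {
    private final List<String> lines;
    private final int linesPerPage;

    FlowBreaker(List<String> lines, int linesPerPage) {
        if (linesPerPage <= 0) {
            throw new IllegalArgumentException("linesPerPage must be positive");
        }
        this.lines = new ArrayList<>(lines);
        this.linesPerPage = linesPerPage;
    }

    // Analogous to the proposed "PageViewport[] generatePages()".
    // Placeholder rule (assumption): a new page every linesPerPage lines.
    List<SketchPage> generatePages() {
        List<SketchPage> pages = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += linesPerPage) {
            SketchPage page = new SketchPage(pages.size() + 1);
            int end = Math.min(i + linesPerPage, lines.size());
            page.flowLines.addAll(lines.subList(i, end));
            pages.add(page);
        }
        return pages;
    }
}

// Plays the role suggested for PSLM: no breaking logic, only static content.
class PageSequenceDriver {
    List<SketchPage> layOut(FlowBreaker breaker, String header) {
        List<SketchPage> pages = breaker.generatePages();
        for (SketchPage page : pages) {
            page.staticContent = header + " " + page.number;
        }
        return pages;
    }
}

class PageSplitDemo {
    public static void main(String[] args) {
        FlowBreaker breaker = new FlowBreaker(Arrays.asList("a", "b", "c", "d", "e"), 2);
        for (SketchPage page : new PageSequenceDriver().layOut(breaker, "Page")) {
            System.out.println(page.staticContent + ": " + page.flowLines);
        }
    }
}
```

With this shape, swapping in a different breaking strategy would mean a different FlowBreaker, while PageSequenceDriver stays the same.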
Detach PSLM from LayoutManager?
Team, I would like to tighten up the PSLM a little bit more--namely, I'm inclined to have PSLM stop extending AbstractLayoutManager or even implementing the LayoutManager interface. With this change, PSLM will no longer need to have unused empty methods within it, and will be as robust, independent, and precisely coded for its task as our other standalone LM RootLayoutManager (aka AreaTreeHandler) is. Currently I see 4-6 methods that will no longer need to be implemented by PSLM under such a scenario: void setFObj(FObj obj); void setParent(LayoutManager lm); LayoutManager getParent(); void initialize(); boolean isBogus(); (possibly) boolean generatesInlineAreas(); (possibly) And as things become better clarified, it is possible even more methods will become obsolete for it. Once done, I would like to add a getPSLM() method to the LayoutManager interface and remove the following PSLM-only implemented methods from it: void addIDToPage(); PageViewport getCurrentPageViewport(); PageViewport resolveRefID(); Marker retrieveMarker(); LayoutManagerMaker getLayoutManagerMaker(); void addUnresolvedArea(); void addMarkerMap(); (I can't add a getPSLM() without the change at the top because of the circular reference between PSLM and LM that would result.) Henceforth, all calls from the various LM's to these methods will consist of getPSLM().addUnresolvedArea(), getPSLM().retrieveMarker(), etc., instead. This change will much better anchor and make more readable these function calls within the various LM's, because it will show that these methods are implemented in only one place (PSLM), and not the child LM itself. It will also reduce the demands on those wishing to extend LM by removing the need for them to trivially implement these methods (i.e., no more recursive getParentLM().retrieveMarker(), getParentLM().addUnresolvedArea() methods in these child LM's. This will be needed for only one method, the new getPSLM().) 
There is no rush on this--and this can easily wait until the Knuth work is better solidified. But I would like to know your thoughts and comments here--there may be other issues I had not thought of in proposing this change. Thanks, Glen
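To make the getPSLM() idea concrete, here is a rough sketch under simplifying assumptions. The types below (SketchPSLM, SketchLM, SketchFlowLM, SketchChildLM) are hypothetical and much smaller than the real interfaces; in particular, the marker map and the "chapter-title" name are invented for the example. The sketch only shows the shape of the proposal: the page-sequence manager is a standalone class that does not implement the LM interface, and the only call that still recurses up the LM tree is getPSLM().

```java
import java.util.HashMap;
import java.util.Map;

// Standalone page-sequence manager: deliberately NOT a SketchLM.
final class SketchPSLM {
    private final Map<String, String> markers = new HashMap<>();

    void addMarker(String name, String content) {
        markers.put(name, content);
    }

    String retrieveMarker(String name) {
        return markers.get(name);
    }
}

// Simplified LM interface: page-sequence services are reached via getPSLM().
interface SketchLM {
    SketchPSLM getPSLM();
}

// Topmost LM (the FLM role): holds the PSLM reference directly.
final class SketchFlowLM implements SketchLM {
    private final SketchPSLM pslm;

    SketchFlowLM(SketchPSLM pslm) {
        this.pslm = pslm;
    }

    @Override
    public SketchPSLM getPSLM() {
        return pslm;
    }
}

// Any descendant LM: the only recursive call left is getPSLM().
final class SketchChildLM implements SketchLM {
    private final SketchLM parent;

    SketchChildLM(SketchLM parent) {
        this.parent = parent;
    }

    @Override
    public SketchPSLM getPSLM() {
        return parent.getPSLM();
    }

    // The call site shows that the marker lookup lives in the PSLM.
    String resolveHeaderMarker() {
        return getPSLM().retrieveMarker("chapter-title");
    }
}

class GetPslmDemo {
    public static void main(String[] args) {
        SketchPSLM pslm = new SketchPSLM();
        pslm.addMarker("chapter-title", "Intro");
        SketchChildLM leaf = new SketchChildLM(new SketchChildLM(new SketchFlowLM(pslm)));
        System.out.println(leaf.resolveHeaderMarker());
    }
}
```

Someone extending the LM type in this sketch implements one pass-through method instead of one per page-sequence service.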
Re: cvs commit: xml-fop/src/java/org/apache/fop/layoutmgr FlowLayoutManager.java
Hi Luca, 1.) Can the corresponding setting of these values on fo:root (642-643 of [1]) in PSLM now be removed? (I think so...because what is set on fo:flow will be used instead of fo:root.) 2.) Also, does your change below need to be added to StaticContentLayoutManager as well? Many thanks, Glen [1] http://cvs.apache.org/viewcvs.cgi/xml-fop/src/java/org/apache/fop/layoutmgr/PageSequenceLayoutManager.java?view=annotate#642 --- [EMAIL PROTECTED] wrote: +// set layout dimensions + fobj.setLayoutDimension(PercentBase.BLOCK_IPD, context.getRefIPD()); + fobj.setLayoutDimension(PercentBase.BLOCK_BPD, context.getStackLimit().opt); +
Re: cvs commit: xml-fop/src/java/org....
Jeremias, I do not fully understand the business logic for tables--so what I am saying here may not be relevant. But if cell should *never* be null (i.e., the caller of this method is very sloppily written), please let the methods NPE, raise IndexOutOfBoundsException, IllegalStateException, etc., so we can immediately be informed of the caller's incompetence at the point of error and work on that right away. Make sure we don't quietly return null so that the problem will resurface several classes further downstream where it presumably would be much harder to track. If we have to put a temporary band-aid in, best to put it in the caller (i.e., have it not call the method if cell is null), not the callee. Thanks, Glen --- [EMAIL PROTECTED] wrote: +public BorderInfo getOriginalBorderInfoForCell(int side) { +if (cell != null) { +return cell.getCommonBorderPaddingBackground().getBorderInfo(side); +} else { +return null; +} +}
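A small sketch of the fail-fast alternative argued for above, assuming (as the message does) that a null cell can only mean a bug in the caller. The classes are hypothetical stand-ins, not the real FOP table classes, and the hasCell() guard is an invented name showing where a temporary band-aid would go on the caller side.

```java
// Hypothetical stand-ins; not the real FOP table classes.
final class SketchBorderInfo {
    final int width;

    SketchBorderInfo(int width) {
        this.width = width;
    }
}

final class SketchCell {
    // Dummy data for the example: the width is derived from the side index.
    SketchBorderInfo getBorderInfo(int side) {
        return new SketchBorderInfo(side * 100);
    }
}

final class SketchGridUnit {
    private final SketchCell cell; // assumed to be null only when the caller is buggy

    SketchGridUnit(SketchCell cell) {
        this.cell = cell;
    }

    // Caller-side guard for a temporary workaround.
    boolean hasCell() {
        return cell != null;
    }

    // Fail-fast variant: a missing cell is reported here, at the point of error,
    // instead of leaking a null to code several classes downstream.
    SketchBorderInfo getOriginalBorderInfoForCell(int side) {
        if (cell == null) {
            throw new IllegalStateException(
                    "getOriginalBorderInfoForCell called on a grid unit without a cell");
        }
        return cell.getBorderInfo(side);
    }
}

class FailFastDemo {
    public static void main(String[] args) {
        SketchGridUnit unit = new SketchGridUnit(new SketchCell());
        // The band-aid, if one is needed, lives in the caller.
        if (unit.hasCell()) {
            System.out.println(unit.getOriginalBorderInfoForCell(2).width);
        }
    }
}
```

The explicit exception gives a stack trace that points at the offending call rather than at a later, unrelated NullPointerException.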
Re: two more class renamings
Oops, make that three differences: their content models (child FO's that the spec says they can have) are slightly different. Glen --- Glen Mazza [EMAIL PROTECTED] wrote: --- The Web Maestro [EMAIL PROTECTED] wrote: or something. That way, it's all in one (since it can apparently be repurposed anyway, with fo:flow being stuck into fo:static-content, and Be careful here: fo:flow being placed into a side region, or fo:static-content being placed into the body region (or main reference area). We really need to start divorcing the fo:static-content/fo:flow terms from where they are usually placed on the paper. The two differences between fo:flow and fo:static-content are: 1. fo:static-content is to be repeated from its start on every page, and truncated if it doesn't fit. 2. fo:flow is not repeated, but additional pages are created until its contents are finished. The regions that these FO's are placed on are really not part of the equation. Glen
RE: two more class renamings
--- Andreas L. Delmelle [EMAIL PROTECTED] wrote: Sorry to be such a nitpick, but the 1.0 Rec. states literally: An fo:marker is only permitted as the descendant of an fo:flow. and An fo:retrieve-marker is only permitted as the descendant of an fo:static-content. Thanks for the correction--I had checked just the content model of fo:flow, which indicated fo:retrieve-marker would be allowed. It would be nice if those two statements above were duplicated in the parent's CM descriptions--that's where people normally go to see which child FO's are allowed/disallowed. Perhaps I'll make a request to the xsl editors ML sometime. Glen
Re: [VOTE] Release Transcoders
+1 --- Jeremias Maerki [EMAIL PROTECTED] wrote: +1 from me. The code has been stable over the last few months and seems to work well for most cases. I also don't see any other open issues preventing a release. On 25.03.2005 10:39:48 Jeremias Maerki wrote: Batik is currently preparing for a release (be that 1.5.1 or 1.6). Former Batik releases simply contained a fop-transcoder.jar without having been released properly. That's why I'm now calling for a formal vote for a release of our transcoders. The only thing we do is tag CVS HEAD, we don't publish the code ourselves. That also means that we're not talking about releasing the whole codebase here, only the transcoders plus their dependencies in our codebase (PDF lib, PS support code, fonts and images). When the vote passes this tag must be used by the Batik team to create their release. This should ensure that the Batik release contains only clean code (meaning without known IP issues and other showstoppers). Jeremias Maerki