Re: FOP Memory issues (fwd from fop-users)
Andreas L Delmelle writes:

> If I remember correctly, that was precisely the problem, since Cliff's report consists of one giant table. It's supposed to look like one uninterrupted flow, so figuring out where the page-sequences should end is next to impossible... (or IOW: sorting that out kind of defeats the purpose of using a formatter to compute the page-breaks) :/

That's exactly the same problem I ran up against. Hence my investigations into properties and memory consumption. On my current setup (as per the last patch under 41044) and using my current 16MB fo file, the object counts are as follows:

349576 instances of class org.apache.fop.fo.properties.CondLengthProperty
145647 instances of class org.apache.fop.fo.properties.KeepProperty
126237 instances of class org.xml.sax.helpers.LocatorImpl
116521 instances of class org.apache.fop.fo.properties.SpaceProperty
87814 instances of class java.lang.Object[]
87458 instances of class java.util.ArrayList
87394 instances of class org.apache.fop.fo.properties.CommonBorderPaddingBackground
87394 instances of class org.apache.fop.fo.properties.CommonBorderPaddingBackground$BorderInfo[]
87394 instances of class org.apache.fop.fo.properties.CondLengthProperty[]
81632 instances of class char[]
48548 instances of class org.apache.fop.fo.properties.LengthRangeProperty
42649 instances of class java.lang.String
38841 instances of class org.apache.fop.fo.properties.CommonMarginBlock
38839 instances of class org.apache.fop.datatypes.LengthBase
38839 instances of class org.apache.fop.fo.properties.PercentLength
38838 instances of class org.apache.fop.fo.flow.Block
38836 instances of class org.apache.fop.fo.flow.TableCell
9710 instances of class org.apache.fop.fo.flow.TableRow
5021 instances of class java.lang.Integer
4329 instances of class java.util.HashMap$Entry
1521 instances of class java.lang.Class
787 instances of class java.awt.Color
667 instances of class java.util.HashMap$Entry[]
658 instances of class java.util.HashMap
330 instances of class java.util.Hashtable$Entry
201 instances of class java.util.WeakHashMap$Entry
181 instances of class org.apache.fop.fo.properties.EnumProperty
160 instances of class java.util.concurrent.ConcurrentHashMap$HashEntry[]
160 instances of class java.util.concurrent.ConcurrentHashMap$Segment
160 instances of class java.util.concurrent.locks.ReentrantLock$NonfairSync
127 instances of class java.lang.String[]
116 instances of class java.util.LinkedHashMap$Entry
110 instances of class byte[]
...

As you can see, there's still a lot of value in making the compound properties reusable, which I still intend to do. This won't make it possible to handle arbitrary documents, but will at least raise the bar somewhat.

I'm also currently reading through Knuth's Digital Typography. Can anyone point out any sections I should pay particular attention to w.r.t. FOP's usage?

Regards,
Richard
Re: Announcements refused
Vincent Hennebert wrote:

> I can handle the docbook-apps list. BTW, my announcement for XMLGraphics Commons 1.1 never appeared on the Batik-users list. It did on Fop-users, although my Apache address isn't subscribed to this list. Any idea of what's wrong?

The moderator(s) of batik-users didn't moderate the message through while those of fop-users did. This is a bit strange because they are almost the same people (I'm one of them...). Let's blame it on Christmas :-|

To find out who the moderators of a mailing list are, see http://www.apache.org/dev/committers.html#mailing-list-moderators

If this happens again, feel free to contact me directly and tell me to pay more attention to the moderation requests.

--
Christian
Re: FOP Memory issues (fwd from fop-users)
On Tuesday 09 January 2007 18:13, [EMAIL PROTECTED] wrote:

> Andreas L Delmelle writes:
>> If I remember correctly, that was precisely the problem, since Cliff's report consists of one giant table. It's supposed to look like one uninterrupted flow, so figuring out where the page-sequences should end is next to impossible... (or IOW: sorting that out kind of defeats the purpose of using a formatter to compute the page-breaks) :/
>
> That's exactly the same problem I ran up against. Hence my investigations into properties and memory consumption. On my current setup (as per the last patch under 41044) and using my current 16MB fo file, the object counts are as follows:
>
> <snip object counts/>

Richard, very good stuff. I am trying to make sense of the numbers. Let me paraphrase the data: A table with 4 columns and 9710 rows. Each table cell contains a block and there is also a block around the table. This gives us (roughly) the 87394 CommonBorderPaddingBackground and CondLengthProperty[] instances. We also have an ArrayList (the property list) per formatting object and each ArrayList is backed by an Object[].

What are the 81632 instances of class char[]? I assume this is the text in the table cells. But why are there more than twice as many as there are table cells?

Your summary also shows 126237 instances of class org.xml.sax.helpers.LocatorImpl. I believe the only purpose of these helpers is to provide location information in error messages. Looks like a fairly expensive feature.

> As you can see, there's still a lot of value in getting the compound properties reusable - which I still intend on doing. This won't make it possible to handle arbitrary documents, but will at least raise the bar somewhat.

Based on your figures above, reusing identical compound property instances would certainly be a very useful improvement. A possible next step would be to reuse identical property lists. Especially in documents with lots of identically formatted table cells this would further reduce the memory footprint.

> I'm also currently reading through Knuth's Digital Typography. Can anyone point out any sections I should pay particular attention to w.r.t. FOP's usage?
>
> Regards,
> Richard

Manuel
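The paraphrase above can be checked with a little arithmetic. The sketch below uses only counts from the posted histogram. It assumes that every table cell, block and table row carries exactly one CommonBorderPaddingBackground; the message itself only mentions cells and blocks, so the rows are an assumption, and the class name is made up for illustration.

```java
// Sanity check of the paraphrase: do cells, blocks and rows together account
// for the 87394 CommonBorderPaddingBackground instances in the histogram?
final class InstanceCountCheck {
    // Counts copied from the posted histogram.
    static final int TABLE_CELLS = 38836;
    static final int BLOCKS = 38838;
    static final int TABLE_ROWS = 9710;
    static final int BORDER_PADDING_BACKGROUNDS = 87394;

    // Assumption: one instance per cell, per block and per row.
    static int expected() {
        return TABLE_CELLS + BLOCKS + TABLE_ROWS;
    }

    // Instances that the three FO types above do not explain.
    static int unexplained() {
        return BORDER_PADDING_BACKGROUNDS - expected();
    }

    // Average number of cells per row, rounded to the nearest integer.
    static long columns() {
        return Math.round((double) TABLE_CELLS / TABLE_ROWS);
    }

    public static void main(String[] args) {
        System.out.println("columns     = " + columns());
        System.out.println("expected    = " + expected());
        System.out.println("unexplained = " + unexplained());
    }
}
```

Under that assumption the three counts add up to 87384, leaving 10 instances for whatever other formatting objects the document contains, which is consistent with the "roughly" in the paraphrase.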
Re: FOP Memory issues (fwd from fop-users)
Perhaps the flyweight pattern, described by the GoF, may be of use to whoever is going to look into an implementation strategy. http://en.wikipedia.org/wiki/Flyweight_pattern gives a decent example.

amin

On 1/9/07, Manuel Mall [EMAIL PROTECTED] wrote:

> <snip/>
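As a minimal sketch of how the flyweight idea might apply to compound properties: the class below is a hypothetical, immutable stand-in (it is not FOP's CondLengthProperty, and the field names are invented). It assumes the property can be made immutable, which is what makes sharing one instance between many formatting objects safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable property used only to illustrate the flyweight
// pattern; it does not mirror any real FOP class.
final class CondLength {
    // Canonical instances, keyed by value. A strong map never releases
    // entries; a real cache would probably need weak references.
    private static final Map<CondLength, CondLength> CACHE = new HashMap<>();

    private final int lengthMpt;   // length in millipoints (assumed unit)
    private final boolean discard; // conditionality: discard or retain

    private CondLength(int lengthMpt, boolean discard) {
        this.lengthMpt = lengthMpt;
        this.discard = discard;
    }

    // Factory: returns the shared instance for equal values.
    static synchronized CondLength of(int lengthMpt, boolean discard) {
        CondLength candidate = new CondLength(lengthMpt, discard);
        CondLength cached = CACHE.putIfAbsent(candidate, candidate);
        return cached != null ? cached : candidate;
    }

    static synchronized int cacheSize() {
        return CACHE.size();
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CondLength)) {
            return false;
        }
        CondLength that = (CondLength) other;
        return lengthMpt == that.lengthMpt && discard == that.discard;
    }

    @Override
    public int hashCode() {
        return 31 * lengthMpt + (discard ? 1 : 0);
    }
}
```

With a factory like this, thousands of identically formatted table cells would all refer to the same few instances instead of each allocating its own.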
Re: FOP Memory issues (fwd from fop-users)
Richard wrote:

> <snip/>
> I'm also currently reading through Knuth's Digital Typography. Can anyone point out any sections I should pay particular attention to w.r.t. FOP's usage?

(Digital Typography caught my eye. I'll try to respond to the rest later.)

Chapter 3, Breaking Paragraphs Into Lines, is definitely THE chapter to read. The first 2 chapters deal with font rendering, which goes too far for us, but they may be interesting for your general knowledge. I haven't read the rest but it seems very TeX-oriented to me. Maybe interesting, but made obsolete by current font technology (Type1, OpenType) anyway.

I've considered describing the linebreaking algorithm in a less cryptic manner (variable names of more than one letter, mainly) on the Wiki [1], but have never had the time to do it. If you ever want to do that, that would be very valuable I think!

[1] http://wiki.apache.org/xmlgraphics-fop/KnuthsModel

Vincent
Re: FOP Memory issues (fwd from fop-users)
On Tue, 2007-01-09 at 16:52 +0100, Vincent Hennebert wrote:

> <snip/>
> I've considered describing the linebreaking algorithm in a less cryptic manner (variable names of more than one letter, mainly) on the Wiki [1], but have never had the time to do it. If you ever want to do that, that would be very valuable I think!
>
> [1] http://wiki.apache.org/xmlgraphics-fop/KnuthsModel
>
> Vincent

And I thought about describing Knuth and Plass' algorithm in a less cryptic manner, and from a slightly different point of view.

http://defoe.sourceforge.net/folio/knuth-plass.html

N.I.H.

Peter
Re: FOP Memory issues (fwd from fop-users)
On Jan 9, 2007, at 14:11, Manuel Mall wrote:

> <snip />
> What are the 81632 instances of class char[]? I assume this is the text in the table cells. But why are there more than twice as many as there are table cells?

Hehe, see my little remark about the TextLM... In its initialize() method (I think, will check later), the FOText's char array is copied (System.arraycopy()). That means there are currently two nearly identical char arrays alive for each text node in the page sequence :(

Cheers,
Andreas
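To illustrate why a copy doubles the number of live char arrays, here is a small sketch contrasting a copy made with System.arraycopy() with a view that shares the original array. It is not FOP's code: the class and method names are hypothetical, and java.nio.CharBuffer is simply one standard way to get a shared view.

```java
import java.nio.CharBuffer;

// Hypothetical helper contrasting copying with sharing; not taken from FOP.
final class TextSharingSketch {
    // Copying: allocates a second char[] that stays alive next to the source.
    static char[] copyOf(char[] source, int start, int length) {
        char[] copy = new char[length];
        System.arraycopy(source, start, copy, 0, length);
        return copy;
    }

    // Sharing: the returned CharSequence is a window onto the same array,
    // so no additional char[] is allocated.
    static CharSequence viewOf(char[] source, int start, int length) {
        return CharBuffer.wrap(source, start, length);
    }

    public static void main(String[] args) {
        char[] text = "table cell".toCharArray();
        char[] copy = copyOf(text, 6, 4);
        CharSequence view = viewOf(text, 6, 4);

        text[6] = 'b'; // modify the original array

        System.out.println(new String(copy)); // the copy is unaffected
        System.out.println(view);             // the view reflects the change
    }
}
```

The trade-off is that a shared view is only safe as long as nobody modifies the underlying array behind the reader's back, which may be the reason a defensive copy was made in the first place.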
Re: FOP Memory issues (fwd from fop-users)
On Jan 9, 2007, at 18:46, Andreas L Delmelle wrote:

> On Jan 9, 2007, at 14:11, Manuel Mall wrote:
>> <snip />
>> What are the 81632 instances of class char[]? I assume this is the text in the table cells. But why are there more than twice as many as there are table cells?
>
> Hehe, see my little remark about the TextLM... In its initialize() method (I think, will check later), the FOText's char array is copied (System.arraycopy()).

Sorry, my bad. Just realized that Richard's data indicates that the layout stage hasn't even been reached at that point...? But that also makes the picture somewhat more dramatic, because

> That means there are currently two nearly identical char arrays alive for each text node in the page sequence :(

this is still true, nonetheless, and that means that at layout stage there will be twice as many (16+ instances)

White-space nodes? Could also be an effect of the white-space collapsing. Long shot, but theoretically, 'white-space-handling' in FOText means 'replace FOText.ca with a copy minus a few characters', if the originals weren't GC'ed at the time of the snapshot...

>> A possible next step would be to reuse identical property lists. Especially in documents with lots of identically formatted table cells this would further reduce the memory footprint.

Property lists themselves are no longer alive in the snapshot, it seems. I don't suppose they are that much of a problem. They are meant to be in scope for only a very brief interval (best case: from PropertyListMaker.make() until FObj.bind()). Except for the child-parent relation, there are virtually no strong references to PropertyLists. There is one in the MainFOHandler(), to pass down as a parentPropertyList to the others.

Notable exceptions are:
- a table-column's PropertyList: could be needed to resolve calls to from-table-column()
- a retrieve-marker's PropertyList: is needed for deferred resolution of the marker properties

A possible improvement would be a subtype that collapses the tree of PropertyLists at a given point. Right now, every PropertyList holds a reference to its parent, and with that, to the whole ancestry...

Cheers,
Andreas
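The "collapsing" idea could look roughly like the sketch below. It is a deliberately simplified, hypothetical model: the class, its String-valued properties, and the rule that every property is inherited are all assumptions made for illustration, and FOP's real PropertyList API differs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified property list with a parent link. For brevity
// every property is treated as inherited, which is not true in XSL-FO.
class ChainedPropertyList {
    private final ChainedPropertyList parent;
    private final Map<String, String> specified = new HashMap<>();

    ChainedPropertyList(ChainedPropertyList parent) {
        this.parent = parent;
    }

    void put(String name, String value) {
        specified.put(name, value);
    }

    // Lookup walks up the ancestry until a value is found.
    String get(String name) {
        String value = specified.get(name);
        if (value == null && parent != null) {
            return parent.get(name);
        }
        return value;
    }

    ChainedPropertyList getParent() {
        return parent;
    }

    // Gathers all values visible from this list; nearer values win.
    private Map<String, String> flatten() {
        Map<String, String> result;
        if (parent == null) {
            result = new HashMap<String, String>();
        } else {
            result = parent.flatten();
        }
        result.putAll(specified);
        return result;
    }

    // Returns a parentless list with the same visible values, so that
    // keeping it alive no longer keeps the whole ancestry alive.
    ChainedPropertyList collapse() {
        ChainedPropertyList flat = new ChainedPropertyList(null);
        flat.specified.putAll(flatten());
        return flat;
    }
}
```

A list that must outlive its formatting object, such as the table-column or retrieve-marker cases mentioned above, could keep the collapsed form and let its ancestors be garbage collected.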
Re: Apache FOP 0.93 Released
Arnaud,

On Tue, Jan 09, 2007 at 01:31:13AM +0100, Arnaud HERITIER wrote:

> Hi Simon, Can you deploy your jars on a maven repository please. For example : http://people.apache.org/repo/m1-ibiblio-rsync-repository/fop/jars/

Please ask on [EMAIL PROTECTED]

There has been talk about Maven. I know Jeremias and Vincent have looked at it. The build script has a target for it. There is a POM file. But none of us have much experience with it, and we have little time. Therefore your best chance is a contribution by a user, such as yourself. Perhaps you can patch the build script to create such a deployment.

Regards, Simon

--
Simon Pepping
home page: http://www.leverkruid.eu
Re: FOP Memory issues
On Tue, Jan 09, 2007 at 04:50:14PM +, Peter B. West wrote:

> And I thought about describing Knuth and Plass' algorithm in a less cryptic manner, and from a slightly different point of view.
>
> http://defoe.sourceforge.net/folio/knuth-plass.html

Interesting page. I will study it in detail later. Nice background, fitting to the subject.

Simon

--
Simon Pepping
home page: http://www.leverkruid.eu