Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread richardw
Andreas L Delmelle writes:
  If I remember correctly, that was precisely the problem, since  
  Cliff's report consists of one giant table. It's supposed to look  
  like one uninterrupted flow, so figuring out where the page-sequences  
  should end is next to impossible... (or IOW: sorting that out kind of  
  defeats the purpose of using a formatter to compute the page-breaks) :/

That's exactly the same problem I ran up against. Hence my investigations
into properties and memory consumption. On my current setup (as per the
last patch under 41044) and using my current 16MB fo file, the object
counts are as follows:

349576 instances of class org.apache.fop.fo.properties.CondLengthProperty
145647 instances of class org.apache.fop.fo.properties.KeepProperty
126237 instances of class org.xml.sax.helpers.LocatorImpl
116521 instances of class org.apache.fop.fo.properties.SpaceProperty
87814 instances of class java.lang.Object[]
87458 instances of class java.util.ArrayList
87394 instances of class org.apache.fop.fo.properties.CommonBorderPaddingBackground
87394 instances of class org.apache.fop.fo.properties.CommonBorderPaddingBackground$BorderInfo[]
87394 instances of class org.apache.fop.fo.properties.CondLengthProperty[]
81632 instances of class char[]
48548 instances of class org.apache.fop.fo.properties.LengthRangeProperty
42649 instances of class java.lang.String
38841 instances of class org.apache.fop.fo.properties.CommonMarginBlock
38839 instances of class org.apache.fop.datatypes.LengthBase
38839 instances of class org.apache.fop.fo.properties.PercentLength
38838 instances of class org.apache.fop.fo.flow.Block
38836 instances of class org.apache.fop.fo.flow.TableCell
9710 instances of class org.apache.fop.fo.flow.TableRow
5021 instances of class java.lang.Integer
4329 instances of class java.util.HashMap$Entry
1521 instances of class java.lang.Class
787 instances of class java.awt.Color
667 instances of class java.util.HashMap$Entry[]
658 instances of class java.util.HashMap
330 instances of class java.util.Hashtable$Entry
201 instances of class java.util.WeakHashMap$Entry
181 instances of class org.apache.fop.fo.properties.EnumProperty
160 instances of class java.util.concurrent.ConcurrentHashMap$HashEntry[]
160 instances of class java.util.concurrent.ConcurrentHashMap$Segment
160 instances of class java.util.concurrent.locks.ReentrantLock$NonfairSync
127 instances of class java.lang.String[]
116 instances of class java.util.LinkedHashMap$Entry
110 instances of class byte[]
...

As you can see, there's still a lot of value in making the compound
properties reusable, which I still intend to do. This won't make it
possible to handle arbitrary documents, but it will at least raise the bar
somewhat.

I'm also currently reading through Knuth's Digital Typography. Can anyone
point out any sections I should pay particular attention to w.r.t. FOP's
usage?

Regards,

Richard



Re: Announcements refused

2007-01-09 Thread Christian Geisert
Vincent Hennebert schrieb:
 I can handle the docbook-apps list.
 
 BTW, my announcement for XMLGraphics Commons 1.1 never appeared on the
 Batik-users list. It did on Fop-users, although my Apache address isn't
 subscribed to this list. Any idea of what's wrong?

The moderator(s) of batik-users didn't let the message through
while those of fop-users did. This is a bit strange because they are
almost the same people (I'm one of them..).
Let's blame it on Christmas :-|

To find out who the moderators of a mailing list are, see
http://www.apache.org/dev/committers.html#mailing-list-moderators

If this happens again, feel free to contact me directly and tell me to
pay more attention to the moderation requests.

-- 
Christian


Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread Manuel Mall
On Tuesday 09 January 2007 18:13, [EMAIL PROTECTED] 
wrote:
 Andreas L Delmelle writes:
   If I remember correctly, that was precisely the problem, since
   Cliff's report consists of one giant table. It's supposed to look
   like one uninterrupted flow, so figuring out where the
   page-sequences should end is next to impossible... (or IOW:
   sorting that out kind of defeats the purpose of using a formatter
   to compute the page-breaks) :/

 That's exactly the same problem I ran up against. Hence my
 investigations into properties and memory consumption. On my current
 setup (as per the last patch under 41044) and using my current 16MB
 fo file, the object counts are as follows:

 349576 instances of class org.apache.fop.fo.properties.CondLengthProperty
 145647 instances of class org.apache.fop.fo.properties.KeepProperty
 126237 instances of class org.xml.sax.helpers.LocatorImpl
 116521 instances of class org.apache.fop.fo.properties.SpaceProperty
 87814 instances of class java.lang.Object[]
 87458 instances of class java.util.ArrayList
 87394 instances of class org.apache.fop.fo.properties.CommonBorderPaddingBackground
 87394 instances of class org.apache.fop.fo.properties.CommonBorderPaddingBackground$BorderInfo[]
 87394 instances of class org.apache.fop.fo.properties.CondLengthProperty[]
 81632 instances of class char[]
 48548 instances of class org.apache.fop.fo.properties.LengthRangeProperty
 42649 instances of class java.lang.String
 38841 instances of class org.apache.fop.fo.properties.CommonMarginBlock
 38839 instances of class org.apache.fop.datatypes.LengthBase
 38839 instances of class org.apache.fop.fo.properties.PercentLength
 38838 instances of class org.apache.fop.fo.flow.Block
 38836 instances of class org.apache.fop.fo.flow.TableCell
 9710 instances of class org.apache.fop.fo.flow.TableRow
 5021 instances of class java.lang.Integer
 4329 instances of class java.util.HashMap$Entry
 1521 instances of class java.lang.Class
 787 instances of class java.awt.Color
 667 instances of class java.util.HashMap$Entry[]
 658 instances of class java.util.HashMap
 330 instances of class java.util.Hashtable$Entry
 201 instances of class java.util.WeakHashMap$Entry
 181 instances of class org.apache.fop.fo.properties.EnumProperty
 160 instances of class java.util.concurrent.ConcurrentHashMap$HashEntry[]
 160 instances of class java.util.concurrent.ConcurrentHashMap$Segment
 160 instances of class java.util.concurrent.locks.ReentrantLock$NonfairSync
 127 instances of class java.lang.String[]
 116 instances of class java.util.LinkedHashMap$Entry
 110 instances of class byte[]
 ...


Richard,

Very good stuff. I am trying to make sense of the numbers. Let me
paraphrase the data:

A table with 4 columns and 9710 rows. Each table cell contains a block 
and there is also a block around the table. This gives us (roughly) the 
87394 CommonBorderPaddingBackground and CondLengthProperty[] instances. 
We also have an ArrayList (the property list) per formatting object and 
each ArrayList is backed by an Object[].
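
The tally above can be checked against the histogram with a little arithmetic. This is only a sketch: the class name CountCheck is made up, and the idea that every row, cell and block owns exactly one CommonBorderPaddingBackground is an assumption read off the figures, not something confirmed from the FOP source.

```java
final class CountCheck {
    // Figures copied from Richard's histogram.
    static final int TABLE_ROWS = 9710;
    static final int TABLE_CELLS = 38836;
    static final int BLOCKS = 38838;
    static final int BORDER_PADDING_BACKGROUNDS = 87394;

    /** Number of cells a completely filled table with this many columns would have. */
    static int expectedCells(int columns) {
        return columns * TABLE_ROWS;
    }

    /** Objects assumed (not verified) to own one CommonBorderPaddingBackground each. */
    static int rowsCellsAndBlocks() {
        return TABLE_ROWS + TABLE_CELLS + BLOCKS;
    }

    public static void main(String[] args) {
        System.out.println("4 columns * rows      = " + expectedCells(4));
        System.out.println("rows + cells + blocks = " + rowsCellsAndBlocks());
        System.out.println("not accounted for     = "
                + (BORDER_PADDING_BACKGROUNDS - rowsCellsAndBlocks()));
    }
}
```

Under that assumption the sum comes within a handful of objects of the reported 87394, which is why "roughly" seems fair.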

What are the 81632 instances of class char[]? I assume this is the text 
in the table cells. But why are there more than twice as many as there 
are table cells?

Your summary also shows 126237 instances of class
org.xml.sax.helpers.LocatorImpl. I believe the only purpose of these
helpers is to provide location information in error messages. Looks
like a fairly expensive feature.

 As you can see, there's still a lot of value in making the compound
 properties reusable, which I still intend to do. This won't make
 it possible to handle arbitrary documents, but it will at least raise
 the bar somewhat.


Based on your figures above, reusing identical compound property
instances would certainly be a very useful improvement. A possible next
step would be to reuse identical property lists. Especially in
documents with lots of identically formatted table cells, this would
further reduce the memory footprint.

 I'm also currently reading through Knuth's Digital Typography. Can
 anyone point out any sections I should pay particular attention to
 w.r.t. FOP's usage?

 Regards,

 Richard

Manuel


Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread Amin Ahmad

Perhaps the flyweight pattern, described by the GoF, may be of use to
whoever is going to look into an implementation strategy.
http://en.wikipedia.org/wiki/Flyweight_pattern gives a decent example.
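
To make the idea concrete, here is a minimal sketch of a flyweight-style cache for an immutable compound property. CondLength, its two fields and the factory method of() are hypothetical stand-ins invented for this example; they are not FOP's actual CondLengthProperty API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable stand-in for a compound property value. */
final class CondLength {
    // Canonical instances, keyed by value. Equal values share one object.
    private static final Map<CondLength, CondLength> CACHE = new HashMap<>();

    final int lengthMillipoints;
    final boolean discard;

    private CondLength(int lengthMillipoints, boolean discard) {
        this.lengthMillipoints = lengthMillipoints;
        this.discard = discard;
    }

    /** Returns the shared instance for these values, creating it on first use. */
    static synchronized CondLength of(int lengthMillipoints, boolean discard) {
        CondLength candidate = new CondLength(lengthMillipoints, discard);
        CondLength cached = CACHE.putIfAbsent(candidate, candidate);
        return cached != null ? cached : candidate;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CondLength)) {
            return false;
        }
        CondLength that = (CondLength) other;
        return lengthMillipoints == that.lengthMillipoints && discard == that.discard;
    }

    @Override
    public int hashCode() {
        return 31 * lengthMillipoints + (discard ? 1 : 0);
    }
}
```

Sharing is only safe if the instances are immutable, and this cache holds strong references forever, so a long-running process would probably want weak references instead.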

amin




Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread Vincent Hennebert
Richard a écrit :
 <snip/>
 I'm also currently reading through Knuth's Digital Typography. Can anyone
 point out any sections I should pay particular attention to w.r.t. FOP's
 usage?

(Digital Typography caught my eye. I'll try to respond to the rest
later.)

Chapter 3, Breaking Paragraphs Into Lines, is definitely THE chapter
to read.
The first 2 chapters deal with font rendering, which goes too far
for us, but they may be interesting for your general knowledge. I haven't
read the rest, but it seems very TeX-oriented to me. Maybe interesting,
but made obsolete by current font technology (Type1, OpenType) anyway.

I've considered describing the linebreaking algorithm in a less cryptic
manner (variable names of more than one letter, mainly) on the Wiki [1],
but have never had the time to do it. If you ever want to do that, it
would be very valuable, I think!

[1] http://wiki.apache.org/xmlgraphics-fop/KnuthsModel
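
As a taste of what such a description could look like, here is a minimal sketch of one building block of the Knuth/Plass model: the adjustment ratio of a candidate line made of boxes and glue, and the badness derived from it. The class names Item and LineFit are invented for this example and are not FOP's classes; penalties and the search for optimal breakpoints are left out, and the paper's rounding of badness to an integer is ignored.

```java
import java.util.List;

/** Hypothetical element: a box has only a width; glue can also stretch and shrink. */
final class Item {
    final double width;
    final double stretch;
    final double shrink;

    private Item(double width, double stretch, double shrink) {
        this.width = width;
        this.stretch = stretch;
        this.shrink = shrink;
    }

    static Item box(double width) {
        return new Item(width, 0, 0);
    }

    static Item glue(double width, double stretch, double shrink) {
        return new Item(width, stretch, shrink);
    }
}

final class LineFit {
    /** Ratio by which the glue must stretch (positive) or shrink (negative) to fill the line. */
    static double adjustmentRatio(List<Item> line, double targetWidth) {
        double naturalWidth = 0;
        double totalStretch = 0;
        double totalShrink = 0;
        for (Item item : line) {
            naturalWidth += item.width;
            totalStretch += item.stretch;
            totalShrink += item.shrink;
        }
        double difference = targetWidth - naturalWidth;
        if (difference == 0) {
            return 0;
        }
        double flexibility = difference > 0 ? totalStretch : totalShrink;
        if (flexibility == 0) {
            // Nothing to stretch or shrink: the line cannot be made to fit.
            return difference > 0 ? Double.POSITIVE_INFINITY : Double.NEGATIVE_INFINITY;
        }
        return difference / flexibility;
    }

    /** Badness grows with the cube of the ratio; shrinking past -1 is not allowed. */
    static double badness(double ratio) {
        if (ratio < -1) {
            return Double.POSITIVE_INFINITY;
        }
        return 100 * Math.pow(Math.abs(ratio), 3);
    }
}
```

For three boxes of width 30 separated by two glues of width 10 (stretch 5, shrink 3), the natural width is 110; fitting that into a line of 115 means stretching each glue by half of its stretchability.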

Vincent


Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread Peter B. West
On Tue, 2007-01-09 at 16:52 +0100, Vincent Hennebert wrote:
 Richard a écrit :
 <snip/>
  I'm also currently reading through Knuth's Digital Typography. Can anyone
  point out any sections I should pay particular attention to w.r.t. FOP's
  usage?
 
 (Digital Typography caught my eye. I'll try to respond to the rest
 later.)
 
 Chapter 3, Breaking Paragraphs Into Lines, is definitely THE chapter
 to read.
 The first 2 chapters deal with font rendering, which goes too far
 for us, but they may be interesting for your general knowledge. I haven't
 read the rest, but it seems very TeX-oriented to me. Maybe interesting,
 but made obsolete by current font technology (Type1, OpenType) anyway.
 
 I've considered describing the linebreaking algorithm in a less cryptic
 manner (variable names of more than one letter, mainly) on the Wiki [1],
 but have never had the time to do it. If you ever want to do that, it
 would be very valuable, I think!
 
 [1] http://wiki.apache.org/xmlgraphics-fop/KnuthsModel
 
 Vincent

And I thought about describing Knuth and Plass' algorithm in a less
cryptic manner, and from a slightly different point of view.
http://defoe.sourceforge.net/folio/knuth-plass.html

N.I.H.

Peter





Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread Andreas L Delmelle

On Jan 9, 2007, at 14:11, Manuel Mall wrote:


 <snip />
 What are the 81632 instances of class char[]? I assume this is the text
 in the table cells. But why are there more than twice as many as there
 are table cells?


Hehe, see my little remark about the TextLM... In its initialize()  
method (I think, will check later), the FOText's char array is copied  
(System.arraycopy()).


That means there are currently two nearly identical char arrays alive  
for each text node in the page sequence :(
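
To illustrate the difference, here is a small sketch contrasting a copy made with System.arraycopy() with a view that shares the original array. The class TextCopyDemo and its methods are made up for this example; they are not the TextLM or FOText code, and whether a shared view would be workable there is an open question.

```java
import java.nio.CharBuffer;

final class TextCopyDemo {
    /** Copying approach: a second array holding the same characters. */
    static char[] copyOf(char[] source, int start, int end) {
        char[] copy = new char[end - start];
        System.arraycopy(source, start, copy, 0, copy.length);
        return copy;
    }

    /** Sharing approach: a view onto the original array, no characters duplicated. */
    static CharBuffer viewOf(char[] source, int start, int end) {
        return CharBuffer.wrap(source, start, end - start);
    }

    public static void main(String[] args) {
        char[] text = "hello world".toCharArray();
        char[] copy = copyOf(text, 0, text.length);
        CharBuffer view = viewOf(text, 6, 11);
        System.out.println("copy shares storage: " + (copy == text));
        System.out.println("view shares storage: " + (view.array() == text));
        System.out.println("view content: " + view);
    }
}
```

A view like this only helps as long as nobody needs to modify the characters independently of the original.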


Cheers,

Andreas


Re: FOP Memory issues (fwd from fop-users)

2007-01-09 Thread Andreas L Delmelle

On Jan 9, 2007, at 18:46, Andreas L Delmelle wrote:


 On Jan 9, 2007, at 14:11, Manuel Mall wrote:

  <snip />
  What are the 81632 instances of class char[]? I assume this is the
  text in the table cells. But why are there more than twice as many
  as there are table cells?

 Hehe, see my little remark about the TextLM... In its initialize()
 method (I think, will check later), the FOText's char array is
 copied (System.arraycopy()).

Sorry, my bad. Just realized that Richard's data indicates that the
layout stage hasn't even been reached at that point...? But that also
makes the picture somewhat more dramatic, because

 That means there are currently two nearly identical char arrays
 alive for each text node in the page sequence :(

this is still true, nonetheless, and that means that at layout stage
there will be twice as many (16+ instances)


White-space nodes? Could also be an effect of the white-space  
collapsing. Long shot, but theoretically, 'white-space-handling' in  
FOText means 'replace FOText.ca with a copy minus a few characters',  
if the originals weren't GC'ed at the time of the snapshot...



 A possible next
 step would be to reuse identical property lists. Especially in
 documents with lots of identically formatted table cells this would
 further reduce the memory footprint.


Property lists themselves are no longer alive in the snapshot, it
seems. I don't suppose they are that much of a problem. They are
meant to be in scope for only a very brief interval (best case: from
PropertyListMaker.make() until FObj.bind()).
Except for the child-parent relation, there are virtually no strong
references to PropertyLists. There is one in the MainFOHandler(), to
pass down as a parentPropertyList to the others.

Notable exceptions are
- a table-column's PropertyList: could be needed to resolve calls to  
from-table-column()
- a retrieve-marker's PropertyList: is needed for deferred resolution  
of the marker properties


Possible improvement would be a subtype that collapses the tree of  
PropertyLists at a given point. Right now, every PropertyList holds a  
reference to its parent, and with that, to the whole ancestry...
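
A rough sketch of what I mean, with made-up classes: SimplePropertyList and CollapsedPropertyList are invented for this example and are much simpler than FOP's real PropertyList. In particular, the sketch treats every property as inherited, which is not true of XSL-FO.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical simplified property list: own values plus a link to the parent. */
class SimplePropertyList {
    protected final Map<String, String> values = new HashMap<>();
    private final SimplePropertyList parent;

    SimplePropertyList(SimplePropertyList parent) {
        this.parent = parent;
    }

    void put(String name, String value) {
        values.put(name, value);
    }

    /** Looks up a value locally, then walks up the ancestry. */
    String get(String name) {
        String value = values.get(name);
        if (value != null || parent == null) {
            return value;
        }
        return parent.get(name);
    }

    SimplePropertyList getParent() {
        return parent;
    }

    /** Copies ancestors' values first so that nearer values override them. */
    void copyResolvedInto(Map<String, String> target) {
        if (parent != null) {
            parent.copyResolvedInto(target);
        }
        target.putAll(values);
    }
}

/** Parentless snapshot: resolves everything once, then lets the ancestry go. */
final class CollapsedPropertyList extends SimplePropertyList {
    CollapsedPropertyList(SimplePropertyList source) {
        super(null);
        source.copyResolvedInto(values);
    }
}
```

A list collapsed like this keeps answering lookups the same way, but no longer pins its ancestors in memory, which would matter mostly for the long-lived cases such as table-column and retrieve-marker.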



Cheers,

Andreas



Re: Apache FOP 0.93 Released

2007-01-09 Thread Simon Pepping
Arnaud,

On Tue, Jan 09, 2007 at 01:31:13AM +0100, Arnaud HERITIER wrote:
 Hi Simon,
 
  Can you deploy your jars on a maven repository please.
 For example :
 http://people.apache.org/repo/m1-ibiblio-rsync-repository/fop/jars/

Please ask on [EMAIL PROTECTED]. There has been talk
about Maven. I know Jeremias and Vincent have looked at it. The build
script has a target for it. There is a POM file. But none of us have
much experience with it, and we have little time. Therefore your best
chance is a contribution by a user, such as yourself. Perhaps you can
patch the build script to create such a deployment.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu


Re: FOP Memory issues

2007-01-09 Thread Simon Pepping
On Tue, Jan 09, 2007 at 04:50:14PM +, Peter B. West wrote:
 
 And I thought about describing Knuth and Plass' algorithm in a less
 cryptic manner, and from a slightly different point of view.
 http://defoe.sourceforge.net/folio/knuth-plass.html

Interesting page. I will study it in detail later. Nice background,
fitting to the subject.

Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu