FOP and AreaTree serialization

2002-12-16 Thread Salonen, Aki
Hi!

I've been using FOP for inventory reporting in PDF format.
It works great for documents of fewer than 200 pages, provided
the server has enough memory.

We need to produce reports of up to 1000 pages
within a single page-sequence.

I would like to serialize the AreaTree to the filesystem
to prevent the server from running out of memory.

What is the best place to start?
I've noticed that the sources in CVS have AreaTree subclasses:

AreaTree.AreaTreeModel 
AreaTree.RenderPagesModel 
AreaTree.StorePagesModel 

Do these classes already provide support for caching the
area tree to the filesystem
during page-sequence rendering?

How about the latest 0.20.5rc release? I didn't find those classes there.

Any help appreciated.


Aki Salonen 
Wincor Nixdorf Oy 
Systems Analyst 
PL 160,  02601 ESPOO, Finland 
Visiting address: Majurinkatu 6 (Perkkaa)
Phone, direct: +358 10 511 5183,   Switchboard: +358 10 511 4040 
Fax: +358 10 511 5502 
E-mail: mailto:[EMAIL PROTECTED] 
web: http://www.wincor-nixdorf.com 

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]




Re: FOP and AreaTree serialization

2002-12-16 Thread Keiron Liddle
On Mon, 2002-12-16 at 14:32, Salonen, Aki wrote:
 Hi!
 
 I've been using FOP for Inventory reporting in PDF format.
 It works great with documents less that 200 pages when
 we have enough memory in server.
 
 We need to produce reports with pages up to 1000 pages
 and within one pagesequence.
 
 I would like to serialize AreaTree to filesystem
 to prevent server running out of memory.
 
 What is best place to start from?
 I've notice that sources in CVS have AreaTree subclasses:
 
 AreaTree.AreaTreeModel 
 AreaTree.RenderPagesModel 
 AreaTree.StorePagesModel 
 
 Does these classes already provide support for areatree
 to-filesystem-caching
 during pagesequence rendering?

Yes this is implemented, the CachedRenderPagesModel does it.
Caching only really helps when there are forward references. All other
pages are rendered and streamed out straight away.
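
The idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CachedRenderPagesModel code: PageCacheSketch, Page, addPage and resolveAll are hypothetical names, and "rendering" is reduced to recording the page number. Pages with nothing pending are streamed out at once; a page with unresolved forward references, and every page after it (to preserve page order), is serialized to a temporary file until the references are resolved.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; none of these names are FOP classes. */
final class PageCacheSketch {

    /** Stand-in for a laid-out page. */
    static final class Page implements Serializable {
        private static final long serialVersionUID = 1L;
        final int number;
        final boolean hasUnresolvedRefs;

        Page(int number, boolean hasUnresolvedRefs) {
            this.number = number;
            this.hasUnresolvedRefs = hasUnresolvedRefs;
        }
    }

    private final List<Integer> rendered = new ArrayList<>(); // stands in for the output stream
    private final Deque<Path> parked = new ArrayDeque<>();    // pages waiting on disk, in order

    /** Renders the page at once if possible; otherwise serializes it to a temp file. */
    void addPage(Page page) {
        if (!page.hasUnresolvedRefs && parked.isEmpty()) {
            rendered.add(page.number);   // nothing pending: stream it out and forget it
            return;
        }
        try {
            Path file = Files.createTempFile("page-" + page.number + "-", ".ser");
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(page);
            }
            parked.add(file);            // only the file path stays in memory
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called once forward references are resolved: reload and render in page order. */
    void resolveAll() {
        while (!parked.isEmpty()) {
            Path file = parked.poll();
            try {
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                    rendered.add(((Page) in.readObject()).number);
                }
                Files.delete(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    List<Integer> rendered() { return new ArrayList<>(rendered); }

    int parkedCount() { return parked.size(); }

    public static void main(String[] args) {
        PageCacheSketch model = new PageCacheSketch();
        model.addPage(new Page(1, false));
        model.addPage(new Page(2, true));   // e.g. a "page X of Y" reference
        model.addPage(new Page(3, false));
        System.out.println("rendered before resolve: " + model.rendered());
        model.resolveAll();
        System.out.println("rendered after resolve:  " + model.rendered());
    }
}
```

In a document without forward references nothing is ever parked, which matches the remark that caching only helps when forward references are present.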


 How about latest 0.20.5rc release. Didn't find those classes there.

It is not implemented in the releases; this functionality exists only in
the development code in CVS.

Your help with the development would be appreciated.

 Any help appreciated.







RE: FOP and AreaTree serialization

2002-12-16 Thread Salonen, Aki

I am not using any forward references.

I've debugged memory allocation and 
noticed that FOP allocates about 700 KB of memory
for each page processed. This remains allocated until the
end of the page-sequence.

If I've understood right, it's AreaTree that grows bigger and bigger.

Is there any way, or is it possible with the current design, to implement
explicit clearing of already handled/rendered objects/pages from the AreaTree
so that the GC can collect objects that are no longer needed?

I am interested in contributing work to FOP
to make it possible to process very long page-sequences.

Aki



-----Original Message-----
From: Keiron Liddle [mailto:[EMAIL PROTECTED]]
Sent: 16 December 2002 16:12
To: FOP
Subject: Re: FOP and AreaTree serialization


On Mon, 2002-12-16 at 14:32, Salonen, Aki wrote:
 Hi!
 
 I've been using FOP for Inventory reporting in PDF format.
 It works great with documents less that 200 pages when
 we have enough memory in server.
 
 We need to produce reports with pages up to 1000 pages
 and within one pagesequence.
 
 I would like to serialize AreaTree to filesystem
 to prevent server running out of memory.
 
 What is best place to start from?
 I've notice that sources in CVS have AreaTree subclasses:
 
 AreaTree.AreaTreeModel 
 AreaTree.RenderPagesModel 
 AreaTree.StorePagesModel 
 
 Does these classes already provide support for areatree
 to-filesystem-caching
 during pagesequence rendering?

Yes this is implemented, the CachedRenderPagesModel does it.
Caching only really helps when there are forward references. All other
pages are rendered and streamed out straight away.


 How about latest 0.20.5rc release. Didn't find those classes there.

It is not implemented in the releases, this functionality is only in the
development in cvs.

Your help with the development would be appreciated.

 Any help appreciated.







RE: FOP and AreaTree serialization

2002-12-16 Thread Keiron Liddle
On Mon, 2002-12-16 at 15:52, Salonen, Aki wrote:
 I am not using any forward references.
 
 I've debugged memory allocation and 
 noticed that FOP allocates about 700Kb of memory
 for each page processed. This remains allocated to the
 end of pagesequence.
 
 If I've understood right, it's AreaTree that grows bigger and bigger.

I think it is more likely a combination of the area tree, renderer and
things in the fo tree.

 Is there any way or is it possible with current design to implement
 explicit clearing of already handled/rendered objects/pages from AreaTree
 to enable GC to collect objects that are needed no more?

Not really. That is the whole idea of the current development: to make
this sort of thing work better.

 I am intrested for donating work for FOP
 to make possible processing very long pagesequences.

That would be great.



 Aki







Serialization

2002-03-12 Thread Peter B. West

Dear Fops,

I'm naïve about serialization (among other things.)  Can anyone tell me 
whether it is possible to serialize an instance of an inner class 
without implicitly serializing the containing class instance?

Peter






Re: Serialization

2002-03-12 Thread Keiron Liddle

Hi Peter,

An inner class is actually a separate class that holds a reference to 
the containing instance.
When the inner instance is serialized, that reference is serialized too 
(since I presume it is not transient).
If the class is static then it will not have a reference to the 
containing class, but that may not be suitable for other reasons.
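
A small demonstration of the point above, as a sketch only: Outer, Inner, Nested and canSerialize are illustrative names. Outer is deliberately not Serializable, so an attempt to serialize an Inner instance fails because the hidden reference to the containing instance is pulled in, while the static Nested class serializes on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class InnerClassSerializationDemo {

    /** Deliberately NOT Serializable, to make the hidden reference visible. */
    static class Outer {
        int count = 7;

        /** Non-static inner class: each instance holds a reference to its Outer. */
        class Inner implements Serializable {
            private static final long serialVersionUID = 1L;
            int value = 42;

            int outerCount() {
                return count;   // uses the enclosing instance
            }
        }

        /** Static nested class: no reference to any Outer instance. */
        static class Nested implements Serializable {
            private static final long serialVersionUID = 1L;
            int value = 42;
        }
    }

    /** Returns true if the object serializes, false on NotSerializableException. */
    static boolean canSerialize(Object o) {
        try {
            ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;       // something in the object graph is not Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Outer outer = new Outer();
        System.out.println("inner serializable:  " + canSerialize(outer.new Inner()));
        System.out.println("nested serializable: " + canSerialize(new Outer.Nested()));
    }
}
```

If Outer were made Serializable, the Inner instance would serialize, but the whole Outer instance would be written along with it.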

On 2002.03.13 02:44 Peter B. West wrote:
 Dear Fops,
 
 I'm naïve about serialization (among other things.)  Can anyone tell me 
 whether it is possible to serialize an instance of an inner class 
 without implicitly serializing the containing class instance?
 
 Peter





Re: Code Style, was Re: PDF serialization

2001-07-19 Thread Tore Engvig



On Thu, 19 Jul 2001, Mark Lillywhite wrote:

 Sure. Monday the 23rd evening (GMT) is codeformatting day unless somebody
 protests.
 
 Would it be possible to wait for a day or two before setting a firm
 date? I hope to have my pipelining stuff completed today and I would
 like to at least get an opinion on their suitability for inclusion
 before this goes ahead. While the changes aren't very big they will be
 very difficult to apply after the code is reformatted.

I guess it'll take more than a few days to look at it, so better to postpone
it a week. So Monday the 30th is codeformatting day!


Tore



 I'm just asking for a couple of day's grace, and the time of a committer
 to take a look at them and let me know if they are suitable or not.

 Regards
 Mark






Re: Code Style, was Re: PDF serialization

2001-07-18 Thread Tore Engvig



On Tue, 17 Jul 2001, Arved Sandstrom wrote:

 Hi, Tore

 Well, how about _you_ set the day, since you're the volunteer? :-)

Sure. Monday the 23rd evening (GMT) is codeformatting day unless somebody
protests.

Everyone who has uncommitted code should commit it by then, as the
diffs in your local copy will no longer match after the codeformat and cvs
will scream conflict.

I will tag the repository before beautifying the code (e.g. PRE_CODEFORMAT)
and also take a look at the long-form/short-form license.


Tore


 I tend to agree with your comments about cvswrappers, and you're right, why
 not try the logical first approach, which is that all committers make a
 point of running the code beautifier of their choice before making commits?
 We have not in a long time, if ever, put our foot down and said that this is
 the way it will be - it's been pretty lackadaisical so far. So, once you
 have reformatted everything, and recommitted, all committers would be on
 notice to adhere to code conventions.

 It would be useful if you can work in license long-form - short-form
 translation also.

 A tag is a good idea, I agree. Who knows what might happen?

 Thanks in advance for doing this.

 Regards,
 Arved

 Fairly Senior Software Type
 e-plicity (http://www.e-plicity.com)
 Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia






Re: Code Style, was Re: PDF serialization

2001-07-17 Thread Keiron Liddle


On Tue, 17 Jul 2001 01:22:38 Arved Sandstrom wrote:
 At 09:26 PM 7/16/01 +1000, Mark wrote:
  As an aside: is there a style guide for FOP code? I must say I find the
 style and layout very confusing and I'm happy to clean things up as I
 go.
 At the moment I can't even work out what tab stop size people are using.
 (In my own work I use 8 character tabs with 2 character indenting).
 
 Yep. http://xml.apache.org/source.html references the Sun Java coding 
 conventions, at http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html.
 
 An interesting point that comes out of the coding conventions is that
 while 
 Sun instructs us that tabs must be 8 spaces, they also state that the
 unit 
 of indentation is 4 spaces.
 
 Seems to me that the easiest way to avoid problems is never to leave tabs
 in 
 the code - I use a text editor that is set to 4 spaces per tab and 
 immediately translates tabs to spaces.
 
 We are well past the point where we should take a style tool to the whole
 codebase, strip out all tabs, and re-commit the whole thing. :-)

From what I have seen so far, we are never going to get a common code style.
I usually format the code, but that doesn't mean it will stay that way, and
there are a lot of files. E.g. Arved committed some of the Marker code
with tabs replacing the spaces (mustn't have been using the right text
editor).







Re: PDF serialization

2001-07-17 Thread Keiron Liddle

Mark,

I think the first step would be to change the PDF generation code so that
it can write out as it goes to a stream (and also do what it does now).
This would most likely be done by writing out each page after completion.
It may also (I'm not sure) require different tracking of objects such as
XObjects for images.
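
A minimal sketch of the write-as-you-go idea, under stated assumptions: StreamingPdfSketch and its methods are hypothetical names and not FOP's PDF classes, and the pages are empty. Each object is written to the stream as soon as it is complete and only its byte offset is kept; objects that depend on later information (here the page tree, which needs the full list of pages) get a reserved object number and are written last, followed by the cross-reference table.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch; not FOP's PDF library. */
final class StreamingPdfSketch {
    private final OutputStream out;
    private long position = 0;                             // bytes written so far
    private final List<Long> offsets = new ArrayList<>();  // offsets.get(n - 1) = offset of object n

    StreamingPdfSketch(OutputStream out) {
        this.out = out;
        write("%PDF-1.4\n");
    }

    /** Reserves an object number for something that can only be written later. */
    int reserve() {
        offsets.add(-1L);
        return offsets.size();
    }

    /** Writes one complete object now; only its byte offset is kept afterwards. */
    void writeObject(int number, String body) {
        offsets.set(number - 1, position);
        write(number + " 0 obj\n" + body + "\nendobj\n");
    }

    /** Writes the cross-reference table and trailer from the recorded offsets. */
    void finish(int rootNumber) {
        long xrefPosition = position;
        int size = offsets.size() + 1;                     // +1 for the free entry 0
        StringBuilder sb = new StringBuilder("xref\n0 " + size + "\n0000000000 65535 f \n");
        for (long offset : offsets) {
            if (offset < 0) {
                throw new IllegalStateException("reserved object never written");
            }
            // Each entry is exactly 20 bytes: offset, generation, 'n', two-byte line end.
            sb.append(String.format(Locale.ROOT, "%010d 00000 n \n", offset));
        }
        sb.append("trailer\n<< /Size ").append(size)
          .append(" /Root ").append(rootNumber).append(" 0 R >>\n")
          .append("startxref\n").append(xrefPosition).append("\n%%EOF\n");
        write(sb.toString());
    }

    private void write(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        try {
            out.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        position += bytes.length;
    }

    /** Builds a document of empty pages, writing each page as soon as it is complete. */
    static byte[] build(int pageCount) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        StreamingPdfSketch pdf = new StreamingPdfSketch(sink);
        int catalog = pdf.reserve();
        int pages = pdf.reserve();       // needs the full Kids list, so it is written last
        StringBuilder kids = new StringBuilder();
        for (int i = 0; i < pageCount; i++) {
            int page = pdf.reserve();
            pdf.writeObject(page, "<< /Type /Page /Parent " + pages
                    + " 0 R /Resources << >> /MediaBox [0 0 612 792] >>");
            kids.append(page).append(" 0 R ");
        }
        pdf.writeObject(pages, "<< /Type /Pages /Count " + pageCount + " /Kids [ " + kids + "] >>");
        pdf.writeObject(catalog, "<< /Type /Catalog /Pages " + pages + " 0 R >>");
        pdf.finish(catalog);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(build(3).length + " bytes written");
    }
}
```

The memory kept per object is one offset rather than the object itself. Shared resources such as image XObjects would still need some bookkeeping so that a later page can refer to an object that has already been written.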

From there the changes required to the PDFRenderer itself should not be too
difficult. 

The PDF spec should give you more information (and give you a headache or
two).

On Mon, 16 Jul 2001 13:26:13 Mark wrote:
 Hi ho fops and fopettes,
 
 I have had a look at the memory use of my modified-for-streaming FOP
 executable and determined that the primary memory constraint is now in
 the PDF renderer. Basically, because the PDF renderer keeps all objects
 in memory until the end, this obviously limits the size of the output
 document. I can't say if other renderers have this issue because I
 haven't looked.
 
 I have had a look at the PDF renderer and I think it should be possible
 to serialize it with only a minimal amount of data being kept in RAM.
 However this will require extensive changes to the PDF renderer, because
 there doesn't seem to be a single point at which I can cleanly introduce
 the changes. Presumably these changes will be backwards-compatible with
 the current FOP since I don't want to actually change the code path,
 just make it so that the results are written earlier.
 
 So I was wondering if there was someone I should be speaking with
 directly about the changes that I want to make, or if I should just
 barge on ahead and make them, damn the torpedoes pip pip old chap? I
 would prefer some guidance about this because I don't actually know
 anything about PDF, but the code is well commented.
 
 Obviously my primary interest is in the PDF renderer at this stage, but
 I'm interested to know if other renderer writers are interested in my
 work?
 
 As an aside: is there a style guide for FOP code? I must say I find the
 style and layout very confusing and I'm happy to clean things up as I
 go. At the moment I can't even work out what tab stop size people are
 using. (In my own work I use 8 character tabs with 2 character
 indenting).
 
 Cheers
 Mark






Re: PDF serialization

2001-07-17 Thread Jeremias Maerki

 FWIW the PCL renderer should not be keeping much in memory.

The same applies to the PostScript renderer and (I think) to the MIF
renderer, which was my starting point for the PostScript renderer.

Jeremias Märki

mailto:[EMAIL PROTECTED]

OUTLINE AG
Postfach 3954 - Rhynauerstr. 15 - CH-6002 Luzern
Fon +41 (41) 317 2020 - Fax +41 (41) 317 2029
Internet http://www.outline.ch






RE: PDF serialization

2001-07-17 Thread Mark



Hi,

On 16 Jul 2001 14:24:06 -0400, Art Welch wrote:
 FWIW the PCL renderer should not be keeping much in memory.

The PDF renderer seems to keep the whole PDF document in RAM until it's closed. But I have worked out a sneaky way to pipeline the PDF without major changes, thanks mostly to the thinking time I got while spending today removing six cubic meters of debris from my house renovations instead of working :)

 BTW earlier you had mentioned that the buf option slowed things down. I would expect that pipelining things would improve performance, has this been the case? If so, how much?

Yes, I found -bug to have a significant impact on performance after about 100 pages. Before 100 pages there did not appear to be a (subjective) impact. At this time I have not done any benchmarks, but subjectively it does seem faster with pipelining, probably because it's allocating a lot less RAM (10 MB instead of 64 MB for ~200-page documents).

 It seems to me that with the exception of a few details (like ID References, etc) FOP would be an ideal application for pipelining.

It seems pretty easy; given that I'm new to FOP, I have had quite a deal of success! Tonight I intend to apply my ideas to the PDF renderer, and if that works I'll look at the other renderers and then make a Plan to build a Patch.

 Keep up the good work,

Thanks for the encouragement!

Cheers
Mark



RE: PDF serialization

2001-07-17 Thread Mark



My apologies,

I said:
 Yes, I found -bug to have a significant impact on performance after
 about 100 pages. Before 100 pages there did

Of course, I meant -buf! Who put those two letters so close together anyway?

Cheers
Mark



Re: Code Style, was Re: PDF serialization

2001-07-17 Thread Arved Sandstrom

At 11:25 PM 7/17/01 +0200, Tore Engvig wrote:
[ SNIP ]
I guess we have to use codeformatters (eg astyle) before we check in our
code.

I happened to grab a copy of jIndent while it was still free. jIndent does
more than just codeformatting: it parses the code and is able to change a
lot of things (e.g. bracket style, max line length, etc).

I could reformat the whole repository with jIndent, then we could continue
with using astyle or some other free codeformatter as we go.

Just pick a day. It would be best if everybody has checked in their
changes before reformatting the whole source (and I guess I should tag
the repository before reformatting).

Hi, Tore

Well, how about _you_ set the day, since you're the volunteer? :-)

I tend to agree with your comments about cvswrappers, and you're right, why 
not try the logical first approach, which is that all committers make a 
point of running the code beautifier of their choice before making commits? 
We have not in a long time, if ever, put our foot down and said that this is 
the way it will be - it's been pretty lackadaisical so far. So, once you 
have reformatted everything, and recommitted, all committers would be on 
notice to adhere to code conventions.

It would be useful if you can work in license long-form - short-form 
translation also.

A tag is a good idea, I agree. Who knows what might happen?

Thanks in advance for doing this.

Regards,
Arved

Fairly Senior Software Type
e-plicity (http://www.e-plicity.com)
Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia






Re: Code Style, was Re: PDF serialization

2001-07-17 Thread Jeremias Maerki

May I suggest putting up a code conventions section in involved.xml? I
think a short notice with the most important rules (tabs to spaces, 4
spaces per tab, etc.) will suffice. That will make it easier to encourage
people to follow the conventions.

Jeremias Märki

mailto:[EMAIL PROTECTED]

OUTLINE AG
Postfach 3954 - Rhynauerstr. 15 - CH-6002 Luzern
Fon +41 (41) 317 2020 - Fax +41 (41) 317 2029
Internet http://www.outline.ch






PDF serialization

2001-07-16 Thread Mark



Hi ho fops and fopettes,

I have had a look at the memory use of my modified-for-streaming FOP executable and determined that the primary memory constraint is now in the PDF renderer. Basically, because the PDF renderer keeps all objects in memory until the end, this obviously limits the size of the output document. I can't say if other renderers have this issue because I haven't looked.

I have had a look at the PDF renderer and I think it should be possible to serialize it with only a minimal amount of data being kept in RAM. However this will require extensive changes to the PDF renderer, because there doesn't seem to be a single point at which I can cleanly introduce the changes. Presumably these changes will be backwards-compatible with the current FOP since I don't want to actually change the code path, just make it so that the results are written earlier.

So I was wondering if there was someone I should be speaking with directly about the changes that I want to make, or if I should just barge on ahead and make them, damn the torpedoes pip pip old chap? I would prefer some guidance about this because I don't actually know anything about PDF, but the code is well commented.

Obviously my primary interest is in the PDF renderer at this stage, but I'm interested to know if other renderer writers are interested in my work?

As an aside: is there a style guide for FOP code? I must say I find the style and layout very confusing and I'm happy to clean things up as I go. At the moment I can't even work out what tab stop size people are using. (In my own work I use 8 character tabs with 2 character indenting).

Cheers
Mark



RE: PDF serialization

2001-07-16 Thread Art Welch



FWIW the PCL renderer should not be keeping much in memory.

BTW earlier you had mentioned that the buf option slowed things down. I would
expect that pipelining things would improve performance; has this been the
case? If so, how much?

It seems to me that with the exception of a few details (like ID References,
etc.) FOP would be an ideal application for pipelining.

Keep up the good work,
Art

  -----Original Message-----
  From: Mark [mailto:[EMAIL PROTECTED]]
  Sent: Monday, July 16, 2001 7:26 AM
  To: [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]
  Subject: PDF serialization

  Hi ho fops and fopettes,

  I have had a look at the memory use of my modified-for-streaming FOP
  executable and determined that the primary memory constraint is now in the
  PDF renderer. Basically, because the PDF renderer keeps all objects in
  memory until the end, this obviously limits the size of the output
  document. I can't say if other renderers have this issue because I haven't
  looked.

  I have had a look at the PDF renderer and I think it should be possible to
  serialize it with only a minimal amount of data being kept in RAM. However
  this will require extensive changes to the PDF renderer, because there
  doesn't seem to be a single point at which I can cleanly introduce the
  changes. Presumably these changes will be backwards-compatible with the
  current FOP since I don't want to actually change the code path, just make
  it so that the results are written earlier.

  So I was wondering if there was someone I should be speaking with directly
  about the changes that I want to make, or if I should just barge on ahead
  and make them, damn the torpedoes pip pip old chap? I would prefer some
  guidance about this because I don't actually know anything about PDF, but
  the code is well commented.

  Obviously my primary interest is in the PDF renderer at this stage, but I'm
  interested to know if other renderer writers are interested in my work?

  As an aside: is there a style guide for FOP code? I must say I find the
  style and layout very confusing and I'm happy to clean things up as I go.
  At the moment I can't even work out what tab stop size people are using.
  (In my own work I use 8 character tabs with 2 character indenting).

  Cheers
  Mark


Code Style, was Re: PDF serialization

2001-07-16 Thread Arved Sandstrom

At 09:26 PM 7/16/01 +1000, Mark wrote:
 As an aside: is there a style guide for FOP code? I must say I find the
style and layout very confusing and I'm happy to clean things up as I go.
At the moment I can't even work out what tab stop size people are using.
(In my own work I use 8 character tabs with 2 character indenting).

Yep. http://xml.apache.org/source.html references the Sun Java coding 
conventions, at http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html.

An interesting point that comes out of the coding conventions is that while 
Sun instructs us that tabs must be 8 spaces, they also state that the unit 
of indentation is 4 spaces.

Seems to me that the easiest way to avoid problems is never to leave tabs in 
the code - I use a text editor that is set to 4 spaces per tab and 
immediately translates tabs to spaces.
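
For anyone without such an editor, the tab-to-space translation can be sketched in a few lines. This is an illustration only: TabExpander and expandTabs are made-up names, and the tab width is a parameter because the thread mentions both 8-column tabs (Sun's convention) and 4-column tabs (the editor setting above). Each tab is replaced by enough spaces to reach the next tab stop, so text after a tab stays in the same column.

```java
final class TabExpander {

    /** Replaces every tab in a single line with spaces up to the next tab stop. */
    static String expandTabs(String line, int tabWidth) {
        if (tabWidth <= 0) {
            throw new IllegalArgumentException("tabWidth must be positive");
        }
        StringBuilder sb = new StringBuilder(line.length());
        int column = 0;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                int spaces = tabWidth - (column % tabWidth);  // distance to next tab stop
                for (int k = 0; k < spaces; k++) {
                    sb.append(' ');
                }
                column += spaces;
            } else {
                sb.append(c);
                column++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Brackets make the resulting spacing visible on the console.
        System.out.println("[" + expandTabs("\tint x;", 4) + "]");
        System.out.println("[" + expandTabs("ab\tc", 4) + "]");
    }
}
```

Note that the same file expands differently at width 4 and width 8, which is exactly why mixed tabs and spaces make the layout look inconsistent from one editor to the next.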

We are well past the point where we should take a style tool to the whole 
codebase, strip out all tabs, and re-commit the whole thing. :-)

Regards,
Arved Sandstrom

Fairly Senior Software Type
e-plicity (http://www.e-plicity.com)
Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia






Re: Code Style, was Re: PDF serialization

2001-07-16 Thread Peter B. West



Arved Sandstrom wrote:

 At 09:26 PM 7/16/01 +1000, Mark wrote:
 
As an aside: is there a style guide for FOP code? I must say I find the
style and layout very confusing and I'm happy to clean things up as I go.
At the moment I can't even work out what tab stop size people are using.
(In my own work I use 8 character tabs with 2 character indenting).

 
 Yep. http://xml.apache.org/source.html references the Sun Java coding 
 conventions, at http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html.
 
 An interesting point that comes out of the coding conventions is that while 
 Sun instructs us that tabs must be 8 spaces, they also state that the unit 
 of indentation is 4 spaces.
 
 Seems to me that the easiest way to avoid problems is never to leave tabs in 
 the code - I use a text editor that is set to 4 spaces per tab and 
 immediately translates tabs to spaces.
 
 We are well past the point where we should take a style tool to the whole 
 codebase, strip out all tabs, and re-commit the whole thing. :-)


Arved,

Agreed.  From my own point of view, it would also be nice if, once the 
cleanup were done, committers could gradually re-arrange code to fit on 
a 79-column page (1 column is sacrificed to (X)Emacs.)  I appreciate 
that this is kind of old-fashioned, but more often than not, I work with 
multiple 80-column windows rather than a full-screen editor.

Peter
-- 
Peter B. West  [EMAIL PROTECTED]  http://powerup.com.au/~pbwest
Lord, to whom shall we go?

