Re: Retrieving Objects question

2011-06-09 Thread Michael Rubin

Thanks a lot for your reply Andreas. Yes, if all I had to do was move 
references around then my work would already be complete and submitted for 
review. However, the catch is that the Page objects also have Parent 
references, which need to be updated when the pages are moved from one page 
tree node to another. But since the pages have already been written out, this 
cannot be done. So the pages effectively become immovable (otherwise the Parent 
references go out of date and no longer match the Kids references, which is why 
acroread could not open the pages).
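
To make the Parent/Kids mismatch concrete, here is a rough sketch of a consistency check. It is not FOP code: the class name, the method and the two maps are all made up for illustration, and the references are plain strings such as "1 0 R".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical checker, not part of FOP. */
class ParentKidsCheckSketch {

    /**
     * kidsByNode maps a page tree node reference to the references in its Kids list.
     * parentByPage maps a page reference to the Parent reference written in that page.
     * Returns the pages whose written Parent does not match the node listing them.
     */
    static List<String> findMismatches(Map<String, List<String>> kidsByNode,
                                       Map<String, String> parentByPage) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : kidsByNode.entrySet()) {
            for (String kid : entry.getValue()) {
                String writtenParent = parentByPage.get(kid);
                // Kids that are not pages (no entry in parentByPage) are skipped.
                if (writtenParent != null && !writtenParent.equals(entry.getKey())) {
                    mismatches.add(kid);
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, List<String>> kids = new LinkedHashMap<>();
        kids.put("50 0 R", List.of("1 0 R", "2 0 R"));
        Map<String, String> parents = new LinkedHashMap<>();
        parents.put("1 0 R", "50 0 R");
        parents.put("2 0 R", "40 0 R"); // written out before the page was moved
        System.out.println(findMismatches(kids, parents));
    }
}
```

A page that was flushed with its old Parent and then listed under a different node shows up in the result, which is the situation described above.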

Delaying writing the page objects would mean the parent references can be 
updated correctly, and the problem would be solved. But, that has a potential 
memory usage toll.

Today I will continue with my attempt to link every page to a node of its own 
(stored in a flat list), then re-order the nodes according to the page index of 
the page inside. Then build up the balanced page tree from those nodes up. 
That's the plan anyway... (I'll also be interested time permitting in looking 
more closely at what happened when the 2 page sequences ended up with mixed up 
pages...)
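
Roughly, the plan above could look like the following sketch. It is only an illustration under some assumptions: TreeNodeSketch and BalancedPageTreeSketch are made-up names (not FOP's PDFPages), the page references are assumed to be already sorted by page index, and the Parent references discussed earlier are not modelled at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a page tree node; not FOP's PDFPages class. */
class TreeNodeSketch {
    final List<Object> kids = new ArrayList<>(); // page references (String) or TreeNodeSketch
    int count;                                   // number of pages below this node
}

class BalancedPageTreeSketch {
    static final int MAX_KIDS = 10; // the fan-out mentioned in this thread

    /** Builds the tree bottom-up from page references already sorted by page index. */
    static TreeNodeSketch build(List<String> pageRefsInOrder) {
        // Lowest level: one node per group of up to MAX_KIDS pages.
        List<TreeNodeSketch> level = new ArrayList<>();
        for (int i = 0; i < pageRefsInOrder.size(); i += MAX_KIDS) {
            int end = Math.min(i + MAX_KIDS, pageRefsInOrder.size());
            TreeNodeSketch node = new TreeNodeSketch();
            node.kids.addAll(pageRefsInOrder.subList(i, end));
            node.count = end - i;
            level.add(node);
        }
        if (level.isEmpty()) {
            return new TreeNodeSketch();
        }
        // Group nodes under parents until a single root remains.
        while (level.size() > 1) {
            List<TreeNodeSketch> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += MAX_KIDS) {
                int end = Math.min(i + MAX_KIDS, level.size());
                TreeNodeSketch parent = new TreeNodeSketch();
                for (TreeNodeSketch child : level.subList(i, end)) {
                    parent.kids.add(child);
                    parent.count += child.count;
                }
                parents.add(parent);
            }
            level = parents;
        }
        return level.get(0);
    }

    public static void main(String[] args) {
        List<String> refs = new ArrayList<>();
        for (int i = 1; i <= 101; i++) {
            refs.add(i + " 0 R");
        }
        TreeNodeSketch root = build(refs);
        System.out.println(root.count + " pages, " + root.kids.size() + " kids under root");
    }
}
```

Note that this variant gives every page the same depth, so for 101 pages the last page sits in its own lowest-level node. That is a simplification chosen for the sketch, not necessarily the shape the real code will produce.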

Thanks!

-Mike


On 08/06/11 20:14, Andreas L. Delmelle wrote:

On 08 Jun 2011, at 17:15, Michael Rubin wrote:

Hi Mike


Hello there. Thought I'd post an update. Admittedly I feel like I've found a 
bit of a catch-22 situation. I successfully completed my code to generate the 
balanced page tree on the fly and it works fine with a single page sequence. 
However, this morning I discovered that this code does not appear to work for 
multiple page sequences in a flow. (2x 101 page sequences, I got pages 1-9, 
102, 10-101 then 103-end in that order...) I guess this is a case where pages 
can arrive in a different order, which would explain why the current indexing / 
nulls system is there.

Ouch! I had not considered that to be the purpose. Without looking closer, I 
would say something like: page 10 contains a forward reference to page 102, and 
all pages in between are only flushed after the reference can be resolved 
(?)


(And shows that I am still learning the ropes as I go along...)

Yep, and also shows that I am not intimately familiar with *all* of the 
codebase myself. ;-)


So I re-examined trying to generate the page tree after the pages have been 
added into one big flat list. I can do this by, in PDFDocument.outputTrailer(), 
calling a method to balance the page tree before all the remaining objects are 
written out. This way pages can be attached to nodes, and the tree hierarchy 
built up to the root node. This is on paper a more elegant, efficient and 
easier solution to doing it on the fly. But I ran into the same problem again - 
the page objects are already written out.

OK, here may be a gap in my understanding of it so far, but...
Do you really _need_ the PDFPage object for some reason, or does its PDF 
reference suffice to build the page tree?
From what I know of PDF, that page tree would only contain the references to 
the actual page objects, no? As long as the PDFPages object is not written to 
the stream, you should be able to shuffle and play with the references all you 
want. All you need to keep track of, is to retain the natural order (= the 
page's index), as the object numbers will not necessarily reflect that.
Unless I am mistaken about this, I do not see a compelling reason *not* to 
write the PDFPage object to the stream as soon as it's finished. We keep a 
mapping of reference-to-index alive in the 'main' (temporary?) PDFPages object.
Note that notifyKidRegistered() only stores the reference; the natural index is 
translated into the position of the reference in the list. If you want to 
re-shape that into a structured tree/map, then by all means...

Perhaps there is still a catch --sounds too simple somehow... :-/


snip /
My current questions are:

-Why are the page objects flushed straight away? (Memory constraints?)

Very likely to save memory indeed. More with the intention of just flushing as soon 
as possible, to support full streaming processing if the document structure allows 
it. Theoretically, in a document consisting of single-page fo:page-sequences, without any 
cross-references, you should see relatively low memory usage even if the document is 
1+ pages, precisely because the pages are all written to the output immediately, long 
before the root page tree, which only retains their object references.


-Is it safe and wise to delay flushing the page objects until the end?

Safe? No issue here.
Wise? That would obviously depend on the context.
In documents with 1000s of pages, I can imagine we do not want to keep all of 
those pages in memory any longer than strictly necessary... I wouldn't mind too 
much if it were an option that users could switch on/off. However, if the 
process is hard coded as the *only* way FOP will render PDFs, such that it 
would affect *all* users, I am not so sure it is wise to do this.

snip

Re: Retrieving Objects question

2011-06-06 Thread Michael Rubin

Thanks for your reply Andreas.

Currently it is hardcoded to 10 nodes or leaves, but adding an xconf 
setting perhaps should be pretty easy and quick to do. However, having 
spoken to my manager, there isn't the business requirement currently to 
make it configurable, and given the current large array of options 
already available, the preference is to just keep it hardcoded for now. 
At the very least I'll make sure the maximum leaves / subnodes value is 
stored in a constant so if it is made configurable then only the 
constant needs to be paid attention to rather than multiple locations in 
the class.


As far as I can tell the page objects are kept alive anyway by the 
references in the document object itself (at least until the trailer is 
written). So me keeping references in the page tree object should not 
extend their life in any way.


Currently, if I take a 20 page document, then there are two sets of 10 
pages, one in each node, each node being a child of the root node. For 
the first 10 pages the kids list is something like {1 0 R, 2 0 R, 3 0 R, 
4 0 R, 5 0 R, 6 0 R, 7 0 R, 8 0 R, 9 0 R, 10 0 R} (object numbers not 
intended to be realistic for this example). But for the second 10 pages 
the kids list is {null, null, null, null, null, null, null, null, null, 
null, 11 0 R, 12 0 R, 13 0 R, 14 0 R, 15 0 R, 16 0 R, 17 0 R, 18 0 R, 19 
0 R, 20 0 R}, since the zero-based page index determines the position at 
which the page is placed in the list, with any unused earlier positions 
filled with null. So for a 10,000 page doc there are going to be a lot of 
nulls in the page tree. For now, making toPDFString() ignore the nulls 
rather than throw an exception gets round this and allows the document to 
be generated correctly. In my tests all the pages are produced in the 
correct order. I was wondering though whether there are any cases where 
the pages might not be passed in in the correct order (which might 
explain why the notifyKidRegistered() method was written the way it is), 
and if so, whether that has any implications for the way I have written 
the balanced page tree code updates.
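
For anyone following along, the index-based registration and the null-skipping output described above can be sketched like this. KidsListSketch and its methods are made-up names for illustration, and the square-bracket output is just how a PDF array is normally written; this is not the actual PDFPages code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a kids list; not FOP's PDFPages class. */
class KidsListSketch {
    private final List<String> kids = new ArrayList<>();

    /** Stores the reference at the page's zero-based index, padding with nulls. */
    void register(int pageIndex, String reference) {
        while (kids.size() <= pageIndex) {
            kids.add(null);
        }
        kids.set(pageIndex, reference);
    }

    /** Number of slots in the list, including null placeholders. */
    int slots() {
        return kids.size();
    }

    /** Renders the kids as a PDF-style array, skipping null placeholders. */
    String toKidsArray() {
        StringBuilder sb = new StringBuilder("[");
        for (String ref : kids) {
            if (ref == null) {
                continue;
            }
            if (sb.length() > 1) {
                sb.append(' ');
            }
            sb.append(ref);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        KidsListSketch node = new KidsListSketch();
        node.register(11, "12 0 R"); // arrives first, although it is the later page
        node.register(10, "11 0 R");
        System.out.println(node.slots() + " slots: " + node.toKidsArray());
    }
}
```

Placing by index keeps the natural page order even when pages are registered out of order, at the cost of the null padding mentioned above.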


Thanks.

-Mike

On 03/06/11 22:38, Andreas L. Delmelle wrote:

On 03 Jun 2011, at 10:54, Michael Rubin wrote:

Hi Mike


Thanks a lot for your reply last week Andreas. Sorry for the delay. Been away 
and offline... FYI to follow up on the work I was doing:

snip /

So for example a 101 page document will have a root PDFPages node with two 
sub-nodes underneath. The first will contain a count of 100, and have 10 
sub-nodes, each containing 10 pages. The second will simply contain 1 page. 
More new pages will get added to the second sub-node (moving pages down to new 
sub-nodes to avoid more than 10 pages per node) until its count reaches 100 
too, then another node is created. Once 10 nodes under the root exist (at 1000 
pages) they will get moved down below a new root level sub-node with a count of 
1000, and a new root level sub-node is created, and so on.

Cool! Impressive work. Will the number of pages per node be configurable?


Next task is to write a JUnit test since one appears not to exist... I guess 
remaining thoughts currently are:

- Wondering if keeping references to a page tree object's sub-nodes or leaves 
is the best way or can I improve it further? (Bearing in mind memory usage and 
performance.)

It depends a bit on whether you are thereby keeping PDFPage objects alive 
longer than necessary. The current design only stores the pages' referencePDF, 
so that seems safe.


- Was wondering if the trailer objects list is the right place to write the new 
sub-node PDFPages objects. (If I write an object to the objects list instead, 
using addObject() rather than addTrailerObject(), it gets written out too soon, 
before I have added all the pages.) But given how it writes the objects out 
before writing the xref and trailer it seems OK, and the result parses and shows 
fine in PDFBox/PDFDebugger and the evince PDF Reader in Ubuntu.

I would think that that is the correct place, although I must admit, I would 
have to check the PDF Spec to be certain.


- When registering the pages themselves via the notifyKidRegistered() method, it 
extracts the page index number and puts the reference at that index in the kids 
list, filling empty spaces ahead of it with nulls. So when counting kids and 
writing out the PDF code text I had to ignore nulls and 'gaps' in the kids list, 
since not all the kids are in the same list any more (they are spread across 
multiple page tree nodes). I was wondering why this method was written like 
this, and doesn't simply append new pages to the end of the list all the time.

AFAICT, what it is designed to do is make sure that the page is entered at the 
correct index in the list of kids. It would only create null entries if the 
list is not yet large enough. I have a feeling this is just by design, taking 
into account a single page tree node only (see the javadoc of the PDFPages 
class

Re: Retrieving Objects question

2011-06-03 Thread Michael Rubin
Thanks a lot for your reply last week Andreas. Sorry for the delay. Been 
away and offline... FYI to follow up on the work I was doing:


In the end I saw that references are indeed kept by the PDFDocument. So 
I decided it wouldn't do any harm (or take up any significant extra 
memory) to keep references to the objects themselves when I am 
constructing the balanced page tree. I have since modified PDFPages (plus 
a small change in PDFPage), and the first working draft was completed late 
yesterday. Each PDFPages object now keeps a list of either sub-nodes 
(PDFPages, managed internally via a recursive algorithm; external methods 
work as before to avoid regressions) or leaves (PDFPage), as well as the 
original kids list holding the PDF references to all children (each may 
be a PDFPage or a sub PDFPages object). This eliminates the overhead of 
looking up each object (potentially many times).

I have successfully run it with test .fo files of up to 10001 pages 
(each page just showing 'Page x/y', where x is the current page and y is 
the total page count; it takes a while with that many pages, which is not 
surprising), verifying that a balanced tree gets produced (and not a flat 
tree of one page tree object containing 10001 pages!). When each sub-node 
is created, the PDFFactory.makePages() method stores it in the trailer. 
That way the objects are all written out at the end, after I have added 
all the pages to the right places, just before the cross-reference table 
and trailer themselves are written. So now there are never more than 10 
pages or 10 PDFPages (sub-nodes) per PDFPages object (I never mix 
sub-nodes and leaves on the same node). The result is a structure similar 
to the page tree of the PDF 1.4 Reference document, generated 
automatically on the fly.


So for example a 101 page document will have a root PDFPages node with 
two sub-nodes underneath. The first will contain a count of 100, and 
have 10 sub-nodes, each containing 10 pages. The second will simply 
contain 1 page. More new pages will get added to the second sub-node 
(moving pages down to new sub-nodes to avoid more than 10 pages per 
node) until its count reaches 100 too, then another node is created. Once 
10 nodes under the root exist (at 1000 pages) they will get moved down 
below a new root level sub-node with a count of 1000, and a new root 
level sub-node is created, and so on.
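
As a quick sanity check on those thresholds, the number of page tree levels implied by the description can be worked out with a few lines. This is only arithmetic under the assumption of at most 10 kids per node; TreeDepthSketch and levelsNeeded are made-up names.

```java
/** Hypothetical helper, not part of FOP. */
class TreeDepthSketch {

    /** Smallest number of page tree levels whose capacity covers pageCount. */
    static int levelsNeeded(int pageCount, int maxKids) {
        int levels = 1;
        long capacity = maxKids; // one level of nodes holds up to maxKids pages
        while (capacity < pageCount) {
            capacity *= maxKids; // each extra level multiplies the capacity
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        for (int pages : new int[] {10, 100, 101, 1000, 1001, 10001}) {
            System.out.println(pages + " pages: " + levelsNeeded(pages, 10) + " level(s)");
        }
    }
}
```

With 10 kids per node, a new level is needed each time the page count passes 10, 100, 1000 and so on, which matches the 101 and 1000 page cases described above.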


Next task is to write a JUnit test since one appears not to exist... I 
guess remaining thoughts currently are:


- Wondering if keeping references to a page tree object's sub-nodes or 
leaves is the best way or can I improve it further? (Bearing in mind 
memory usage and performance.)
- Was wondering if the trailer objects list is the right place to write 
the new sub-node PDFPages objects. (If I write an object to the objects 
list instead, using addObject() rather than addTrailerObject(), it gets 
written out too soon, before I have added all the pages.) But given how 
it writes the objects out before writing the xref and trailer it seems 
OK, and the result parses and shows fine in PDFBox/PDFDebugger and the 
evince PDF Reader in Ubuntu.
- When registering the pages themselves via the notifyKidRegistered() 
method, it extracts the page index number and puts the reference at that 
index in the kids list, filling empty spaces ahead of it with nulls. So 
when counting kids and writing out the PDF code text I had to ignore 
nulls and 'gaps' in the kids list, since not all the kids are in the same 
list any more (they are spread across multiple page tree nodes). I was 
wondering why this method was written like this, and doesn't simply 
append new pages to the end of the list all the time.


Once testing is complete I'll submit the code internally for the in-team 
committers to review as I did with the 128 bit encryption work last month...


Thanks!

-Mike

On 25/05/11 21:57, Andreas L. Delmelle wrote:

On 25 May 2011, at 09:45, Michael Rubin wrote:

Hi Mike


Hello there. In the PDFPages class the kids are stored as reference
strings (e.g. 23 0 R). Each of these objects is a PDFPage object. Do
you know if there is a method somewhere that I can use to retrieve the
PDF Java object based on the reference string?

Not really, AFAIK. What you do have is various Collections of different 
subtypes of PDFObject, available by means of accessors on PDFDocument.
I guess the closest you would get without too much effort is to obtain the one 
you're interested in, then iterate over its elements and check 
PDFObject.referencePDF() against the lookup string. You do have to know the 
type(s) of object you need in advance, though...
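
The iterate-and-compare approach suggested above might look roughly like this. ReferenceableSketch and FakePageSketch are stand-ins invented for the sketch; the only thing borrowed from the thread is the idea of comparing referencePDF() with the lookup string, and how the candidate collection is obtained from PDFDocument is left out.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an object that can render its PDF reference. */
interface ReferenceableSketch {
    String referencePDF();
}

/** Hypothetical page object used only for the demonstration. */
class FakePageSketch implements ReferenceableSketch {
    private final int objectNumber;

    FakePageSketch(int objectNumber) {
        this.objectNumber = objectNumber;
    }

    @Override
    public String referencePDF() {
        return objectNumber + " 0 R";
    }
}

class ReferenceLookupSketch {

    /** Linear scan: returns the first candidate whose reference matches, or null. */
    static <T extends ReferenceableSketch> T findByReference(Iterable<T> candidates,
                                                             String reference) {
        for (T candidate : candidates) {
            if (reference.equals(candidate.referencePDF())) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<FakePageSketch> pages = Arrays.asList(new FakePageSketch(22), new FakePageSketch(23));
        System.out.println(findByReference(pages, "23 0 R") == pages.get(1));
    }
}
```

Each lookup is a linear scan, so repeating it for every kid gets expensive on large documents; that is one argument for keeping direct object references or a map instead.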


(I am aiming to add support for some of those kids being other PDFPages
nodes to create a more balanced page tree.)

Interesting. Looking forward to seeing more.


Regards

Andreas
---






Michael Rubin
Developer

T: +44 20 8238 7400
F: +44 20 8238 7401

mru...@thunderhead.com

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy

Retrieving Objects question

2011-05-25 Thread Michael Rubin

Hello there. In the PDFPages class the kids are stored as reference
strings (e.g. 23 0 R). Each of these objects is a PDFPage object. Do
you know if there is a method somewhere that I can use to retrieve the
PDF Java object based on the reference string?

(I am aiming to add support for some of those kids being other PDFPages
nodes to create a more balanced page tree.)

Thanks.

-Mike












Re: any HTML to PDF example within apache FOP?

2011-05-23 Thread Michael Rubin
        (javax.xml.transform.TransformerException e) {
            return null;
        }
        return (Document) domResult.getNode();

    }


    /*
     *  Apply FOP to XSL-FO input
     *
     *  @param foDocument  The XSL-FO input
     *
     *  @return byte[]  PDF result
     */
    private static byte[] fo2PDF(Document foDocument) {

        DocumentInputSource fopInputSource = new DocumentInputSource(
                foDocument);

        try {

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Logger log = new ConsoleLogger(ConsoleLogger.LEVEL_WARN);

            // Render the XSL-FO document to PDF into the in-memory stream
            Driver driver = new Driver(fopInputSource, out);
            driver.setLogger(log);
            driver.setRenderer(Driver.RENDER_PDF);
            driver.run();

            return out.toByteArray();

        } catch (Exception ex) {
            // Any failure is swallowed here; the caller only sees null
            return null;
        }
    }


    /*
     *  Create and return a Transformer for the specified stylesheet.
     *
     *  Based on the DOM2DOM.java example in the Xalan distribution.
     */
    private static Transformer getTransformer(String styleSheet) {

        try {

            TransformerFactory tFactory = TransformerFactory.newInstance();

            DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance();

            // XSLT stylesheets rely on namespaces, so the parser must be namespace-aware
            dFactory.setNamespaceAware(true);

            DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
            Document xslDoc = dBuilder.parse(styleSheet);
            DOMSource xslDomSource = new DOMSource(xslDoc);

            return tFactory.newTransformer(xslDomSource);

        }
        catch (javax.xml.transform.TransformerException e) {
            e.printStackTrace();
            return null;
        }
        catch (java.io.IOException e) {
            e.printStackTrace();
            return null;
        }
        catch (javax.xml.parsers.ParserConfigurationException e) {
            e.printStackTrace();
            return null;
        }
        catch (org.xml.sax.SAXException e) {
            e.printStackTrace();
            return null;
        }

    }

}



Kapil Garg












Re: Fop Memory Use

2011-05-18 Thread Michael Rubin

Just a wild thought. But is there a way you could possibly get the JVM to 
garbage collect between each run? Maybe that might free the memory up?
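
Something along these lines, as a rough sketch. BatchRunSketch and its Batch interface are made up for illustration and stand in for whatever method renders one chunk of pages. Note that System.gc() is only a request which the JVM is free to ignore, and it cannot reclaim anything that is still referenced.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical harness, not part of FOP. */
class BatchRunSketch {

    /** Hypothetical stand-in for one render run that returns its output bytes. */
    interface Batch {
        byte[] render();
    }

    /** Runs each batch in turn and asks the JVM to collect garbage in between. */
    static List<byte[]> runAll(List<Batch> batches) {
        List<byte[]> outputs = new ArrayList<>();
        for (Batch batch : batches) {
            outputs.add(batch.render());
            // Only a hint: objects that are still reachable will not be freed.
            System.gc();
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<Batch> batches = new ArrayList<>();
        batches.add(() -> new byte[] {1});
        batches.add(() -> new byte[] {2, 3});
        System.out.println(runAll(batches).size() + " outputs collected");
    }
}
```

If memory still grows with this in place, something is probably holding on to the renderer or its output between runs.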

Thanks.

-Mike

On 18/05/11 13:20, Eric Douglas wrote:

I am using Fop 1.0.
I tried using Fop to transform a single document.  When I got a little over 100 
pages my FO file was over 5 MB.  The transform crashed with a Java heap out of 
memory error.

I managed to break the input down, as I'm using embedded code generating the 
input programmatically, and the PDF output is a lot smaller.

So I'm currently transforming 10 pages at a time, setting the 
initial-page-number to the next sequence (1, 11, 21, etc).

Then I save all the generated PDFs in memory and merge them using pdfbox.  So 
far this is working great.

I tried to do the same thing with the PNGRenderer, just calling a method to 
transform 10 pages at a time and save the output images in an array.

The PNGRenderer is created locally in the method.  It should be getting 
released when the method ends but the java process never releases any memory.

I tested a 90 page report and the memory use was over 1 GB.  I tested on 
another machine where the memory limit is apparently lower and it crashed on 
page 24.

Everything about the method to render to PNG is the same as the method to 
render to PDF aside from the Renderer.
Is there a problem with this renderer, or something I need to do differently?










Re: Event broadcasting and listening question - solved!

2011-05-17 Thread Michael Rubin
Hello again. Thanks again to Jeremias for his invaluable help. Much 
appreciated. I believe the issue is now resolved. For the benefit of 
everyone else here is a summary of what I did:


1. Set up the listener, adapter and producer:
- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its 
javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package 
and added a corresponding entry to the XML.
- Created org.apache.fop.pdf.PDFEventListener interface containing just 
'void warnRevision3PermissionsIgnored(Object source);'.
- Created PDFLibraryEventAdapter in the org.apache.fop.render.pdf 
package, which implements PDFEventListener.


So the listener gets called from PDFEncryptionJCE.init() which hooks 
into the producer via the adaptor, thereby decoupling FOP's event 
subsystem from the PDF library.
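
For anyone wanting to picture the decoupling, here is a stripped-down sketch. All four types are stand-ins with made-up names (the "Sketch" suffix marks them as such); only the method name warnRevision3PermissionsIgnored comes from the real change, and the real producer is backed by FOP's event subsystem rather than a hand-written implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the library-side listener interface. */
interface PdfEventListenerSketch {
    void warnRevision3PermissionsIgnored(Object source);
}

/** Stand-in for the renderer-side event producer. */
interface PdfEventProducerSketch {
    void warnRevision3PermissionsIgnored(Object source);
}

/** Adapter: implements the library interface and forwards to the producer. */
class PdfLibraryEventAdapterSketch implements PdfEventListenerSketch {
    private final PdfEventProducerSketch producer;

    PdfLibraryEventAdapterSketch(PdfEventProducerSketch producer) {
        this.producer = producer;
    }

    @Override
    public void warnRevision3PermissionsIgnored(Object source) {
        producer.warnRevision3PermissionsIgnored(source);
    }
}

/** Library-side code: it only ever sees the listener interface. */
class EncryptionSketch {
    private final PdfEventListenerSketch listener;

    EncryptionSketch(PdfEventListenerSketch listener) {
        this.listener = listener;
    }

    void init() {
        listener.warnRevision3PermissionsIgnored(this);
    }
}

class EventDecouplingDemo {
    public static void main(String[] args) {
        List<Object> received = new ArrayList<>();
        PdfEventProducerSketch producer = source -> received.add(source);
        new EncryptionSketch(new PdfLibraryEventAdapterSketch(producer)).init();
        System.out.println(received.size() + " event(s) forwarded");
    }
}
```

The library-side class never refers to the producer type, so the PDF library stays usable without the event subsystem.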


2. Ensure the PDFDocument exists before init time:

PDFEncryptionJCE.make(): Added a PDFDocument parameter to the method. 
It calls setDocument() to set the PDFDocument before the init() method 
is called. The PDFDocument is passed in from 
PDFEncryptionManager.newInstance(), and comes from 
PDFDocument.setEncryption() (where 'this' is passed in). Compare this 
with the current code, where the document is set in 
PDFDocument.setEncryption() after init time.


3. Ensure the listener is set up before init time:

Moved the listener setup code from
PDFDocumentHandler.startDocument() to
PDFRenderingUtil.setupPDFDocument() (just above the setupPDFEncryption 
method call).


Now the PDF Document and its listener are available from within the 
init() method of PDFEncryptionJCE.
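
The ordering in steps 2 and 3 boils down to something like the sketch below. The three types are stand-ins with made-up names, not the real PDFDocument or PDFEncryptionJCE; the point is only that the factory attaches the document (which already carries its listener) before init() runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the library-side listener. */
interface InitListenerSketch {
    void warnRevision3PermissionsIgnored(Object source);
}

/** Stand-in for the document that owns the listener. */
class InitDocumentSketch {
    private InitListenerSketch listener;

    void setEventListener(InitListenerSketch listener) {
        this.listener = listener;
    }

    InitListenerSketch getEventListener() {
        return listener;
    }
}

/** Stand-in for the encryption object. */
class InitOrderSketch {
    private InitDocumentSketch document;

    /** Factory: the document is attached before init() runs. */
    static InitOrderSketch make(InitDocumentSketch document) {
        InitOrderSketch encryption = new InitOrderSketch();
        encryption.setDocument(document);
        encryption.init();
        return encryption;
    }

    void setDocument(InitDocumentSketch document) {
        this.document = document;
    }

    /** Fails loudly if the document has not been set yet. */
    private InitDocumentSketch getDocumentSafely() {
        if (document == null) {
            throw new IllegalStateException("document not set yet");
        }
        return document;
    }

    private void init() {
        InitListenerSketch listener = getDocumentSafely().getEventListener();
        if (listener != null) {
            listener.warnRevision3PermissionsIgnored(this);
        }
    }

    public static void main(String[] args) {
        List<Object> received = new ArrayList<>();
        InitDocumentSketch doc = new InitDocumentSketch();
        doc.setEventListener(source -> received.add(source)); // listener first
        make(doc);                                             // then make() and init()
        System.out.println(received.size() + " warning(s) raised during init");
    }
}
```

Calling make() without a document fails in getDocumentSafely(), which mirrors the failure I was seeing when the document was only set after init time.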


Thanks!

-Mike

Re: Event broadcasting and listening question

2011-05-16 Thread Michael Rubin

Thanks again Jeremias. Your help much appreciated.

I have made the PDFEncryptionJCE class pass itself as source into 
PDFEventListener.warnRevision3PermissionsIgnored() which gets passed 
onto the PDFEventProducer.


Yes, I am indeed calling 
PDFEventListener.warnRevision3PermissionsIgnored() from the 
PDFEncryptionJCE class. The call originates from the init() method. 
A bit of debugging and a fresh mind this morning revealed that 
getDocumentSafely() is throwing an exception because the returned 
document is null. (That exception was getting swallowed, and the 
InvocationTargetException I got at the end of Friday was thrown in its 
place.) So I think your last paragraph applies: PDFEncryptionManager 
will need to be modified to set the PDFDocument immediately, as you say. 
So my next step is to work out how I should do that...


Thanks!

-Mike

On 14/05/11 10:42, Jeremias Maerki wrote:

On 13.05.2011 17:06:56 Michael Rubin wrote:

Thanks for your reply. I have now added the getter and setter to
PDFDocument as shown below and added 'this.pdfDoc.setEventListener(new
PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));' to
PDFDocumentHandler.startDocument() (last line inside the try block). Now
I can get the listener from the PDFEncryptionJCE class. However what do
I do with it?

You call your PDFEventListener.warnRevision3PermissionsIgnored() method.


And how does this relate to the producer class and the
EventBroadcaster that I am trying to get hold of?

It doesn't. That's part of the decoupling. The PDFEncryptionJCE should
know nothing of the EventProducer or EventBroadcaster.

Maybe the attached UML helps a bit (I don't usually do UML so I'm not
sure I've made any mistakes).


In your first reply you said to create the Listener and Producer
interfaces. Based on the FontEvent* classes, both of these had an event
definition for their events. But in the latest reply you are saying not
to put my event in the new PDFEventListener? That would make it an empty
interface then if I understand right. So what would its purpose be? I
can't seem to do anything with it.

Sorry for the confusion here. I didn't remember that there was already
an EventProducer in org.apache.fop.render.pdf. So adding a new
EventProducer doesn't make much sense. Instead your
warnRevision3PermissionsIgnored() should be added to the existing one.
No new EventProducer is necessary (and I didn't notice that at first).


Following discussion with a colleague (Vincent) I left the listener
method in (but without the source) and made the call to that to kick off
the event. However, now I get an InvocationTargetException when I try to
get the PDFDocument in order to invoke the listener event method. Looking
at the stack trace, it happens when I call the
PDFDocument.getDocumentSafely() method. When debugging, the error seems
to occur in PDFEncryptionManager.newInstance(), on the 3rd line, which
calls makeMethod.invoke(...). (I attempted to run the Ant build script
and then refresh Eclipse but this didn't make any difference.)

I can't help much with that InvocationTargetException. Maybe if you
posted a patch so I could reproduce it


I will continue with this on Monday. Any further pointers in the
meantime very much appreciated.

There are two questions that come from my colleague:

1. What is the source object for? And do we need it referenced in the
Listener? Or just the producer?

My original idea was that this object gives the event handler a chance
to intercept and modify the object that is the event origin. In most
cases, it will certainly be ignored but someone might find it handy. And
to have it in the producer means you also have to have it in the
PDFEventListener. See also java.util.EventObject from which
org.apache.fop.events.Event is derived.
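
A tiny illustration of the source idea, using only java.util.EventObject from the JDK. WarningEvent and SourceEventSketch are made-up names for the sketch and are not FOP's Event class.

```java
import java.util.EventObject;

/** Hypothetical demo, not part of FOP. */
class SourceEventSketch {

    /** Minimal event type that just carries its source. */
    static class WarningEvent extends EventObject {
        private static final long serialVersionUID = 1L;

        WarningEvent(Object source) {
            super(source); // EventObject rejects a null source
        }
    }

    /** A handler can inspect (or ignore) the object the event came from. */
    static String describe(EventObject event) {
        return "event from " + event.getSource().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe(new WarningEvent(new StringBuilder("origin"))));
    }
}
```

Most handlers will ignore getSource(), but passing it along costs little and leaves the option open.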


2. Why should we get the PDFDocument object from the Encryption class?

It's a PDFObject, right? So it should already have the PDFDocument. That
makes access to the PDFEventListener easy.


Should the listener not be passed into the Encryption class via its
constructor rather than having to go fetch the listener?

Both are valid ways but since I expect the PDFDocument to already be set,
I see no point in giving more information that can otherwise be easily
accessed. Well, it could be that your event happens before the
PDFDocument is set on that object (see PDFDocument.setEncryption()). In
that case PDFEncryptionManager might have to be changed to pass in the
PDFDocument immediately. Or you pass in the PDFEventListener, although I
find the former more useful and flexible.


Thanks!

-Mike


On 12/05/11 21:29, Jeremias Maerki wrote:

On 12.05.2011 10:44:41 Michael Rubin wrote:

Thanks a lot for your response Jeremias. I have now done the following:

- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its 
javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added 
a corresponding entry to the xml. Removed the 
org.apache.fop.pdf.PDFEventProducer class

Re: Event broadcasting and listening question

2011-05-13 Thread Michael Rubin

Thanks for your reply. I have now added the getter and setter to 
PDFDocument as shown below and added 'this.pdfDoc.setEventListener(new 
PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));' to 
PDFDocumentHandler.startDocument() (last line inside the try block). Now 
I can get the listener from the PDFEncryptionJCE class. However what do 
I do with it? And how does this relate to the producer class and the 
EventBroadcaster that I am trying to get hold of?


In your first reply you said to create the Listener and Producer 
interfaces. Based on the FontEvent* classes, both of these had an event 
definition for their events. But in the latest reply you are saying not 
to put my event in the new PDFEventListener? That would make it an empty 
interface then if I understand right. So what would its purpose be? I 
can't seem to do anything with it.


Following discussion with a colleague (Vincent) I left the listener 
method in (but without the source) and made the call to that to kick off 
the event. However now I get an InvocationTargetException when I try to 
get the PDF Doc in order to invoke the listener event method. Looking at 
the stack trace it happens when I call the 
PDFDocument.getDocumentSafely() method. It seems when debugging to be 
PDFEncryptionManager.newInstance() where the error is occurring, the 3rd 
line calling makeMethod.invoke(...). (I attempted to run the build ant 
script and then refresh eclipse but this didn't make any difference.)


I will continue with this on Monday. Any further pointers in the 
meantime very much appreciated.


There are two questions that come from my colleague:

1. What is the source object for? And do we need it referenced in the 
Listener? Or just the producer?
2. Why should we get the PDFDocument object from the Encryption class? 
Should the listener not be passed into the Encryption class via its 
constructor rather than having to go fetch the listener?


Thanks!

-Mike


On 12/05/11 21:29, Jeremias Maerki wrote:

On 12.05.2011 10:44:41 Michael Rubin wrote:

Thanks a lot for your response Jeremias. I have now done the following:

- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its 
javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added 
a corresponding entry to the xml. Removed the 
org.apache.fop.pdf.PDFEventProducer class and xml.
- Created org.apache.fop.pdf.PDFEventListener interface containing just 'void 
warnRevision3PermissionsIgnored(Object source);'.
- Created PDFLibraryEventAdaptor in the org.apache.fop.render.pdf package that 
extends the PDFEventListener. (Currently just contains my new event. Should I 
also add the existing 2 render.pdf events to this class?)

Or do it the other way around: add your new event to PDFEventProducer.
Doesn't make sense to have two.

   I can also see how to obtain the PDFDocument object from the
PDFEncryptionJCE class via the getDocumentSafely() method. But I am not sure 
how to get the event broadcaster from that object. How is this done?

public class PDFDocument {

    [..]

    private PDFEventListener listener;

    [..]

    public void setListener(PDFEventListener listener) {
        this.listener = listener;
    }

    PDFEventListener getListener() {
        return this.listener;
    }

    [..]

That's the simplest way and should probably be sufficient. If we wanted
to get fancy, we could handle a List<PDFEventListener>.

In PDFDocumentHandler.startDocument():
this.pdfDoc.setEventListener(new 
PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));

So, the PDFDocument doesn't actually get an EventBroadcaster.
PDFDocument calls the PDFLibraryEventAdapter and that one in turn calls
the EventBroadcaster. Nicely decoupled.
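
The decoupling described here can be sketched without any FOP classes. Everything in the block is a stand-in made up for the illustration: Broadcaster plays the part of the EventBroadcaster, Document plays the part of PDFDocument, and the message text is invented. It is a sketch of the idea under those assumptions, not the actual FOP implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class AdapterSketch {

    /** Stand-in for the listener interface that lives in the PDF library. */
    interface PDFEventListener {
        void warnRevision3PermissionsIgnored(Object source);
    }

    /** Stand-in for the event broadcaster; here it only records messages. */
    static class Broadcaster {
        final List<String> received = new ArrayList<>();

        void broadcast(String message) {
            received.add(message);
        }
    }

    /** Glue on the renderer side: turns library callbacks into broadcasts. */
    static class PDFLibraryEventAdapter implements PDFEventListener {
        private final Broadcaster broadcaster;

        PDFLibraryEventAdapter(Broadcaster broadcaster) {
            this.broadcaster = broadcaster;
        }

        @Override
        public void warnRevision3PermissionsIgnored(Object source) {
            broadcaster.broadcast("revision 3 permissions ignored by " + source);
        }
    }

    /** Stand-in for PDFDocument: it knows the listener interface and nothing else. */
    static class Document {
        private PDFEventListener listener;

        void setListener(PDFEventListener listener) {
            this.listener = listener;
        }

        PDFEventListener getListener() {
            return listener;
        }
    }

    public static void main(String[] args) {
        Broadcaster broadcaster = new Broadcaster();
        Document doc = new Document();
        doc.setListener(new PDFLibraryEventAdapter(broadcaster));

        // Library-side code (for example the encryption class) sees only the listener.
        doc.getListener().warnRevision3PermissionsIgnored("PDFEncryptionJCE");
        System.out.println(broadcaster.received);
    }
}
```

The library-side classes never reference the broadcaster type, so they would still compile if the PDF code were moved into a separate module.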



Thanks!

-Mike


On 11/05/11 19:46, Jeremias Maerki wrote:

Hi Michael

Creating a new EventBroadcaster is obviously wrong. The idea is that the
user can get events for each FOP rendering run separately (unlike
logging where concurrent runs get mixed up). So you have to get hold of
that EventBroadcaster applicable to the current rendering run. Obviously,
you don't have access to the FOUserAgent in the PDF library. That is
intentional because the PDF library should remain reasonably independent
of as much FOP code as possible for the case that we ever factor it out
into a separate component/module or move it to XML Graphics Commons.
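
Why a freshly created broadcaster does not help can be shown with a small stand-in. The Broadcaster class below is hypothetical and much simpler than FOP's EventBroadcaster; it only assumes that listeners are registered per broadcaster instance, which is the point made above about events being separate for each rendering run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PerRunBroadcasterSketch {

    /** Hypothetical stand-in for a broadcaster tied to one rendering run. */
    static class Broadcaster {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }

        void broadcast(String message) {
            for (Consumer<String> listener : listeners) {
                listener.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        List<String> seenByUser = new ArrayList<>();

        // The user registers a listener on the broadcaster of the current run.
        Broadcaster runBroadcaster = new Broadcaster();
        runBroadcaster.addListener(seenByUser::add);

        // An event sent through a brand-new broadcaster reaches nobody.
        new Broadcaster().broadcast("lost");

        // Only events sent through the run's own broadcaster are delivered.
        runBroadcaster.broadcast("delivered");

        System.out.println(seenByUser);
    }
}
```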

My suggestion is to follow a similar path as done in
org.apache.fop.fonts: Create an interface for the events coming out of
the PDF library (see FontEventListener). Let's call it PDFEventListener
or something like that and put it in the org.apache.fop.pdf package.
Then move your PDFEventProducer (corresponds to FontEventProducer) into
org.apache.fop.render.pdf as this package makes the glue between FOP and
PDF output. Then create a PDFLibraryEventAdapter (implements PDFEventListener)
in the org.apache.fop.render.pdf package (corresponds to
FontEventAdapter). The PDFLibraryEventAdapter will get

Re: Event broadcasting and listening question

2011-05-12 Thread Michael Rubin

Thanks a lot for your response Jeremias. I have now done the following:

- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its 
javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added 
a corresponding entry to the xml. Removed the 
org.apache.fop.pdf.PDFEventProducer class and xml.
- Created org.apache.fop.pdf.PDFEventListener interface containing just 'void 
warnRevision3PermissionsIgnored(Object source);'.
- Created PDFLibraryEventAdaptor in the org.apache.fop.render.pdf package that 
extends the PDFEventListener. (Currently just contains my new event. Should I 
also add the existing 2 render.pdf events to this class?)

 I can also see how to obtain the PDFDocument object from the
PDFEncryptionJCE class via the getDocumentSafely() method. But I am not sure 
how to get the event broadcaster from that object. How is this done?

Thanks!

-Mike


On 11/05/11 19:46, Jeremias Maerki wrote:

Hi Michael

Creating a new EventBroadcaster is obviously wrong. The idea is that the
user can get events for each FOP rendering run separately (unlike
logging where concurrent runs get mixed up). So you have to get hold of
that EventBroadcaster applicable to the current rendering run. Obviously,
you don't have access to the FOUserAgent in the PDF library. That is
intentional because the PDF library should remain reasonably independent
of as much FOP code as possible for the case that we ever factor it out
into a separate component/module or move it to XML Graphics Commons.

My suggestion is to follow a similar path as done in
org.apache.fop.fonts: Create an interface for the events coming out of
the PDF library (see FontEventListener). Let's call it PDFEventListener
or something like that and put it in the org.apache.fop.pdf package.
Then move your PDFEventProducer (corresponds to FontEventProducer) into
org.apache.fop.render.pdf as this package makes the glue between FOP and
PDF output. Then create a PDFLibraryEventAdapter (implements PDFEventListener)
in the org.apache.fop.render.pdf package (corresponds to
FontEventAdapter). The PDFLibraryEventAdapter will get the
EventBroadcaster from the PDFDocumentHandler which is responsible for
instantiating the PDFDocument and PDFLibraryEventAdapter. The adapter is
then added as listener to a List<PDFEventListener> that you can add to
PDFDocument. From PDFEncryptionJCE you should have access to the
PDFDocument via the getDocumentSafely() method. That nicely decouples
FOP's event subsystem from the PDF library.

HTH

On 11.05.2011 15:47:49 Michael Rubin wrote:

Hello there. I have been busy implementing 128 bit PDF encryption for FOP. I 
have already got it working successfully but one issue remains that I have a 
question about.

In the org.apache.fop.pdf.PDFEncryptionJCE.init() method there is one place where I want 
to broadcast an event message. I looked at 
http://xmlgraphics.apache.org/fop/trunk/events.html to learn about events. However it 
just shows EventBroadcaster broadcaster = [get it from somewhere]; and 
doesn't show how I should be getting the broadcaster. After looking in the code in the 
AFP package for existing examples I put together the following which seems to work on 
testing:

FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent agent = fopFactory.newFOUserAgent();
EventBroadcaster eventBroadcaster = agent.getEventBroadcaster();
PDFEventProducer eventProducer =
        PDFEventProducer.Provider.get(eventBroadcaster);
eventProducer.warnRevision3PermissionsIgnored(this);

This creates a new FopFactory, from which I create a new FOUserAgent, from 
which I can get the event broadcaster to supply to my event producer. (I had to 
create a PDFEventProducer which extends EventProducer. Plus 
PDFEventProducer.xml which contains the message mapping.)

In this case the EventBroadcaster will be created new every time so I am not 
sure existing listeners will pick up. So is there a recommended way that I can 
get an existing event broadcaster to use? Or is the above way the correct way 
to do it after all?

Version of FOP is v1.0. Platform is Ubuntu Linux, running from within the 
Eclipse IDE.

Thanks!

-Mike





Michael Rubin
Developer

T: +44 20 8238 7400
F: +44 20 8238 7401

mru...@thunderhead.com

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us
immediately and then destroy it.








Jeremias Maerki





Event broadcasting and listening question

2011-05-11 Thread Michael Rubin

Hello there. I have been busy implementing 128 bit PDF encryption for FOP. I 
have already got it working successfully but one issue remains that I have a 
question about.

In the org.apache.fop.pdf.PDFEncryptionJCE.init() method there is one place where I want 
to broadcast an event message. I looked at 
http://xmlgraphics.apache.org/fop/trunk/events.html to learn about events. However it 
just shows EventBroadcaster broadcaster = [get it from somewhere]; and 
doesn't show how I should be getting the broadcaster. After looking in the code in the 
AFP package for existing examples I put together the following which seems to work on 
testing:

FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent agent = fopFactory.newFOUserAgent();
EventBroadcaster eventBroadcaster = agent.getEventBroadcaster();
PDFEventProducer eventProducer =
        PDFEventProducer.Provider.get(eventBroadcaster);
eventProducer.warnRevision3PermissionsIgnored(this);

This creates a new FopFactory, from which I create a new FOUserAgent, from 
which I can get the event broadcaster to supply to my event producer. (I had to 
create a PDFEventProducer which extends EventProducer. Plus 
PDFEventProducer.xml which contains the message mapping.)

In this case the EventBroadcaster will be created new every time so I am not 
sure existing listeners will pick up. So is there a recommended way that I can 
get an existing event broadcaster to use? Or is the above way the correct way 
to do it after all?

Version of FOP is v1.0. Platform is Ubuntu Linux, running from within the 
Eclipse IDE.

Thanks!

-Mike




