Re: Retrieving Objects question
Thanks a lot for your reply Andreas.

Yes, if all I had to do was move references around then my work would already be complete and submitted for review. However, the catch is that the Page objects also have Parent references, which also need to be updated when they get moved from one page tree node to another. But since they have been written out already this cannot be done. So the pages effectively become immovable (or else the parent references will not match the kids references as they will be out of date - which was why acroread could not open the pages).

Delaying writing the page objects would mean the parent references can be updated correctly, and the problem would be solved. But that has a potential memory usage toll.

Today I will continue with my attempt to link every page to a node of its own (stored in a flat list), then re-order the nodes according to the page index of the page inside. Then build up the balanced page tree from those nodes up. That's the plan anyway...

(I'll also be interested, time permitting, in looking more closely at what happened when the 2 page sequences ended up with mixed up pages...)

Thanks!

-Mike

On 08/06/11 20:14, Andreas L. Delmelle wrote:
> On 08 Jun 2011, at 17:15, Michael Rubin wrote:
>
> Hi Mike

>> Hello there. Thought I'd post an update. Admittedly I feel like I've found a bit of a catch 22 situation. I successfully completed my code to generate the balanced page tree on the fly and it works fine with a single page sequence. However, this morning I discovered that this code does not appear to work for multiple page sequences in a flow. (2x 101 page sequences, I got pages 1-9, 102, 10-101 then 103-end in that order...) I guess this is where pages can come in in a different order anyway then, and why the current indexing / nulls system is there.

> Ouch! I had not considered that to be the purpose.
> Without looking closer, I would say something like: page 10 contains a forward reference to page 102, and all pages in between are only flushed after the reference can be resolved (?)

>> (And shows that I am still learning the ropes as I go along...)

> Yep, and also shows that I am not intimately familiar with *all* of the codebase myself. ;-)

>> So I re-examined trying to generate the page tree after the pages have been added into one big flat list. I can do this by, in PDFDocument.outputTrailer(), calling a method to balance the page tree before all the remaining objects are written out. This way pages can be attached to nodes, and the tree hierarchy built up to the root node. This is on paper a more elegant, efficient and easier solution than doing it on the fly. But I ran into the same problem again - the page objects are already written out.

> OK, here may be a gap in my understanding of it so far, but... Do you really _need_ the PDFPage object for some reason, or does its PDF reference suffice to build the page tree? From what I know of PDF, that page tree would only contain the references to the actual page objects, no?
> As long as the PDFPages object is not written to the stream, you should be able to shuffle and play with the references all you want. All you need to keep track of is the natural order (= the page's index), as the object numbers will not necessarily reflect that.
> Unless I am mistaken about this, I do not see a compelling reason *not* to write the PDFPage object to the stream as soon as it's finished. We keep a mapping of reference-to-index alive in the 'main' (temporary?) PDFPages object. Note that notifyKidRegistered() only stores the reference; the natural index is translated into the position of the reference in the list. If you want to re-shape that into a structured tree/map, then by all means...
> Perhaps there is still a catch -- sounds too simple somehow... :-/
>
> <snip />

>> My current questions are:
>> - Why are the page objects flushed straight away? (Memory constraints?)

> Very likely to save memory indeed. More with the intention of just flushing as soon as possible, to support full streaming processing if the document structure allows it. Theoretically, in a document consisting of single-page fo:page-sequences, without any cross-references, you should see relatively low memory usage even if the document is 1+ pages, precisely because the pages are all written to the output immediately, long before the root page tree, which only retains their object references.

>> - Is it safe and wise to delay flushing the page objects until the end?

> Safe? No issue here.
> Wise? That would obviously depend on the context. In documents with 1000s of pages, I can imagine we do not want to keep all of those pages in memory any longer than strictly necessary...
> I wouldn't mind too much if it were an option that users could switch on/off. However, if the process is hard coded as the *only* way FOP will render PDFs, such that it would affect *all* users, I am not so sure it is wise to do this.
>
> snip
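For anyone following along, the "flat list first, tree afterwards" plan discussed above could be sketched roughly like this. This is not FOP code: `BalancedPageTreeSketch`, `Node`, `buildTree`, `collectPages` and `MAX_KIDS` are made-up names, page references are plain strings such as "12 0 R", and the sketch assumes every page has been registered before the tree is built. Note that each parent link is only assigned at the very end, which is exactly why it clashes with page objects that were already written out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BalancedPageTreeSketch {

    /** Hypothetical fan-out limit; the thread hardcodes 10. */
    public static final int MAX_KIDS = 10;

    /** A leaf holds one page reference; an inner node holds kids. */
    public static final class Node {
        public final String pageRef;                 // null for inner nodes
        public final List<Node> kids = new ArrayList<>();
        public Node parent;                          // assigned while building
        public int count;                            // number of pages below

        Node(String pageRef) {
            this.pageRef = pageRef;
            this.count = (pageRef == null) ? 0 : 1;
        }
    }

    /** Builds the tree bottom-up from references keyed by zero-based page index. */
    public static Node buildTree(Map<Integer, String> refsByIndex) {
        List<Node> level = new ArrayList<>();
        // TreeMap iteration restores the natural page order,
        // whatever order the pages arrived in.
        for (String ref : new TreeMap<Integer, String>(refsByIndex).values()) {
            level.add(new Node(ref));
        }
        if (level.isEmpty()) {
            return new Node(null);
        }
        do {
            // Group the current level into parents of at most MAX_KIDS kids.
            List<Node> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += MAX_KIDS) {
                Node parent = new Node(null);
                int end = Math.min(i + MAX_KIDS, level.size());
                for (Node kid : level.subList(i, end)) {
                    kid.parent = parent;
                    parent.kids.add(kid);
                    parent.count += kid.count;
                }
                parents.add(parent);
            }
            level = parents;
        } while (level.size() > 1);
        return level.get(0);
    }

    /** Collects page references in document order (depth-first, left to right). */
    public static void collectPages(Node node, List<String> out) {
        if (node.pageRef != null) {
            out.add(node.pageRef);
            return;
        }
        for (Node kid : node.kids) {
            collectPages(kid, out);
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> refs = new HashMap<>();
        // Register pages in reverse to mimic out-of-order arrival.
        for (int index = 100; index >= 0; index--) {
            refs.put(index, (index + 1) + " 0 R");
        }
        Node root = buildTree(refs);
        System.out.println("pages=" + root.count + " rootKids=" + root.kids.size());
    }
}
```

Keying the flat collection by page index (rather than appending in arrival order) is what keeps the mixed-up ordering seen with two page sequences from leaking into the tree.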
Re: Retrieving Objects question
Thanks for your reply Andreas.

Currently it is hardcoded to 10 nodes or leaves, but adding an xconf setting should be pretty easy and quick to do. However, having spoken to my manager, there isn't currently a business requirement to make it configurable, and given the large array of options already available, the preference is to just keep it hardcoded for now. At the very least I'll make sure the maximum leaves / subnodes value is stored in a constant, so if it is made configurable then only the constant needs attention rather than multiple locations in the class.

As far as I can tell the page objects are kept alive anyway by the references in the document object itself (at least until the trailer is written). So me keeping references in the page tree object should not extend their life in any way.

Currently, if I take a 20 page document, then there are two sets of 10 pages, one in each node, each node being a child of the root node. For the first 10 pages the kids list is something like {1 0 R, 2 0 R, 3 0 R, 4 0 R, 5 0 R, 6 0 R, 7 0 R, 8 0 R, 9 0 R, 10 0 R} (object numbers not intended to be realistic for this example). But for the second 10 pages the kids list is {null, null, null, null, null, null, null, null, null, null, 11 0 R, 12 0 R, 13 0 R, 14 0 R, 15 0 R, 16 0 R, 17 0 R, 18 0 R, 19 0 R, 20 0 R}, since the page index (which is zero based) makes the page get placed in that index position on the tree, any previous unused indexes being filled with null. So for a 10,000 page doc there are going to be a lot of nulls in the page tree. For now, setting toPDFString() to ignore the nulls rather than throw an exception gets round this and allows the document to be correctly generated. In my tests all the pages are produced in the correct order.

I was wondering though if there are any cases where the pages might not be passed in in the correct order (which might explain why the notifyKidRegistered() method was written the way it is), and if so whether that has any implications for the way I have written the balanced page tree code updates.

Thanks.

-Mike

On 03/06/11 22:38, Andreas L. Delmelle wrote:
> On 03 Jun 2011, at 10:54, Michael Rubin wrote:
>
> Hi Mike

>> Thanks a lot for your reply last week Andreas. Sorry for the delay. Been away and offline... FYI to follow up on the work I was doing:

> <snip />

>> So for example a 101 page document will have a root PDFPages node with two sub-nodes underneath. The first will contain a count of 100, and have 10 sub-nodes, each containing 10 pages. The second will simply contain 1 page. More new pages will get added to the second sub-node (moving pages down to new sub-nodes to avoid more than 10 pages per node) until its count reaches 100 too, then another node created. Once 10 nodes under the root exist (at 1000 pages) they will get moved down below a new root level sub-node with a count of 1000, and a new root level sub-node created, and so on.

> Cool! Impressive work. Will the number of pages per node be configurable?

>> Next task is to write a JUnit test since one appears not to exist... I guess remaining thoughts currently are:
>> - Wondering if keeping references to a page tree object's sub-nodes or leaves is the best way or can I improve it further? (Bearing in mind memory usage and performance.)

> It depends a bit on whether you are thereby keeping PDFPage objects alive longer than necessary. The current design only stores the pages' referencePDF, so that seems safe.

>> - Was wondering if the trailer objects list is the right place to write the new sub-node PDFPages objects. (But if writing an object to the objects list - addObject() instead of addTrailerObject() - it gets written out too soon before I have added all the pages.) But given how it writes the objects out before writing the xref and trailer it seems OK and parses and shows fine in PDFBox/PDFDebugger and the evince PDF Reader in ubuntu.

> I would think that that is the correct place, although I must admit, I would have to check the PDF Spec to be certain.

>> - When registering the pages themselves via the notifyKidRegistered() method it extracts the page index number and puts the reference at that index in the kids list, filling empty spaces ahead of it with nulls. So when counting kids and writing out the pdf code text I had to ignore nulls and 'gaps' in the kids list since not all the kids are in the same list any more (spread across multiple page tree nodes). I was wondering why this method was written like this, and doesn't simply append new pages to the end of the list all the time.

> AFAICT, what it is designed to do is make sure that the page is entered at the correct index in the list of kids. It would only create null entries if the list is not yet large enough. I have a feeling this is just by design, taking into account a single page tree node only (see the javadoc of the PDFPages class
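To illustrate the index-based registration and the null padding discussed in this message, here is a small stand-alone sketch. It only mimics the behaviour described above and is not the actual PDFPages code: `KidsListSketch`, `register`, `count` and `toKidsArray` are hypothetical names, and the references are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class KidsListSketch {

    /** Kid references by position; unused positions hold null. */
    private final List<String> kids = new ArrayList<>();

    /** Stores the reference at its zero-based page index, padding with nulls. */
    public void register(int pageIndex, String reference) {
        while (kids.size() <= pageIndex) {
            kids.add(null);          // fill the gap ahead of this index
        }
        kids.set(pageIndex, reference);
    }

    /** Counts only the kids that are really present. */
    public int count() {
        int present = 0;
        for (String kid : kids) {
            if (kid != null) {
                present++;
            }
        }
        return present;
    }

    /** Writes a PDF-style array, skipping the null padding. */
    public String toKidsArray() {
        StringJoiner joiner = new StringJoiner(" ", "[", "]");
        for (String kid : kids) {
            if (kid != null) {
                joiner.add(kid);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        KidsListSketch secondNode = new KidsListSketch();
        // Pages with global indexes 10 and 12 land in this node, out of order.
        secondNode.register(12, "13 0 R");
        secondNode.register(10, "11 0 R");
        System.out.println(secondNode.count() + " kids: " + secondNode.toKidsArray());
    }
}
```

Subtracting the node's first page index before calling `register` would avoid the padding altogether, at the cost of each node having to know its own offset; whether that fits the real class is something I have not checked.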
Re: Retrieving Objects question
Thanks a lot for your reply last week Andreas. Sorry for the delay. Been away and offline...

FYI to follow up on the work I was doing: In the end I saw that references are indeed kept by the PDFDocument. So I decided it wouldn't do any harm (or take up any significant extra memory) to keep references to the objects themselves when I am constructing the balanced page tree. I have since modified PDFPages (and made a small change in PDFPage), and the first working draft, completed late yesterday, keeps a list of sub-nodes (PDFPages, managed internally via a recursive algorithm - external methods work as before to avoid regressions) or leaves (PDFPage) as well as the original kids (may be a PDFPage or a sub PDFPages object) with PDF references to all children. This eliminates the overhead of looking up each object (potentially many times).

I have successfully run it with test .fo files up to 10001 pages (each just showing 'Page x/y' where x is the current page and y is the total page count; it takes a while with that many pages, which is no surprise), verifying that a balanced tree gets produced (and not a flat tree of one page tree object containing 10001 pages!).

When each subnode is created the PDFFactory.makePages() method stores it in the trailer. That way the objects are all written out at the end after I have added all the pages to the right places, just before the cross reference table and trailer themselves are written.

So now there are never more than 10 pages or 10 PDFPages (sub-nodes) per PDFPages object (I never mix sub-nodes and leaves on the same node). A similar structure to the page tree of the PDF 1.4 Reference document. Automatically generated on the fly.

So for example a 101 page document will have a root PDFPages node with two sub-nodes underneath. The first will contain a count of 100, and have 10 sub-nodes, each containing 10 pages. The second will simply contain 1 page. More new pages will get added to the second sub-node (moving pages down to new sub-nodes to avoid more than 10 pages per node) until its count reaches 100 too, then another node created. Once 10 nodes under the root exist (at 1000 pages) they will get moved down below a new root level sub-node with a count of 1000, and a new root level sub-node created, and so on.

Next task is to write a JUnit test since one appears not to exist...

I guess remaining thoughts currently are:
- Wondering if keeping references to a page tree object's sub-nodes or leaves is the best way or can I improve it further? (Bearing in mind memory usage and performance.)
- Was wondering if the trailer objects list is the right place to write the new sub-node PDFPages objects. (But if writing an object to the objects list - addObject() instead of addTrailerObject() - it gets written out too soon before I have added all the pages.) But given how it writes the objects out before writing the xref and trailer it seems OK and parses and shows fine in PDFBox/PDFDebugger and the evince PDF Reader in ubuntu.
- When registering the pages themselves via the notifyKidRegistered() method it extracts the page index number and puts the reference at that index in the kids list, filling empty spaces ahead of it with nulls. So when counting kids and writing out the pdf code text I had to ignore nulls and 'gaps' in the kids list since not all the kids are in the same list any more (spread across multiple page tree nodes). I was wondering why this method was written like this, and doesn't simply append new pages to the end of the list all the time.

Once testing is complete I'll submit the code internally for the in-team committers to review as I did with the 128 bit encryption work last month...

Thanks!

-Mike

On 25/05/11 21:57, Andreas L. Delmelle wrote:
> On 25 May 2011, at 09:45, Michael Rubin wrote:
>
> Hi Mike

>> Hello there. In the PDFPages class the kids are stored as reference strings (e.g. 23 0 R). Each of these objects are PDFPage objects. Do you know if there is a method somewhere that I can retrieve the PDF java object based on the reference string?

> Not really, AFAIK. What you do have is various Collections of different subtypes of PDFObject, available by means of accessors on PDFDocument. I guess the closest you would get without too much effort is to obtain the one you're interested in, then iterate over its elements and check PDFObject.referencePDF() against the lookup string.
> You do have to know the type(s) of object you need in advance, though...

>> (I am aiming to add support for some of those kids being other PDFPages nodes to create a more balanced page tree.)

> Interesting. Looking forward to seeing more.
>
> Regards
>
> Andreas

---
Michael Rubin
Developer
T: +44 20 8238 7400
F: +44 20 8238 7401
mru...@thunderhead.com

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy
Retrieving Objects question
Hello there.

In the PDFPages class the kids are stored as reference strings (e.g. 23 0 R). Each of these refers to a PDFPage object. Do you know if there is a method somewhere that lets me retrieve the PDF java object based on the reference string?

(I am aiming to add support for some of those kids being other PDFPages nodes to create a more balanced page tree.)

Thanks.

-Mike

Michael Rubin
Developer
T: +44 20 8238 7400
F: +44 20 8238 7401
mru...@thunderhead.com

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it.
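In the absence of a ready-made lookup method, one option is to scan a known collection and compare each object's reference string, or to build a map once if many lookups are needed. The sketch below shows both under stated assumptions: `PageLike` is a stand-in type invented for this example (not a FOP class), its `referencePDF()` simply formats "<number> 0 R", and the generation number is assumed to be 0.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ReferenceLookupSketch {

    /** Stand-in for a PDF object that can report its own reference string. */
    public static final class PageLike {
        private final int objectNumber;

        public PageLike(int objectNumber) {
            this.objectNumber = objectNumber;
        }

        /** Assumed format: object number, generation 0, then "R". */
        public String referencePDF() {
            return objectNumber + " 0 R";
        }
    }

    /** Linear scan: fine for a one-off lookup. */
    public static Optional<PageLike> findByReference(List<PageLike> pages, String reference) {
        for (PageLike page : pages) {
            if (page.referencePDF().equals(reference)) {
                return Optional.of(page);
            }
        }
        return Optional.empty();
    }

    /** Builds an index once, so repeated lookups avoid rescanning the list. */
    public static Map<String, PageLike> indexByReference(List<PageLike> pages) {
        Map<String, PageLike> index = new HashMap<>();
        for (PageLike page : pages) {
            index.put(page.referencePDF(), page);
        }
        return index;
    }

    public static void main(String[] args) {
        List<PageLike> pages = Arrays.asList(new PageLike(21), new PageLike(23));
        System.out.println(findByReference(pages, "23 0 R").isPresent());
        System.out.println(indexByReference(pages).containsKey("99 0 R"));
    }
}
```

The scan only works if you already know which collection holds the type of object you are after.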
Re: any HTML to PDF example within apache FOP?
(javax.xml.transform.TransformerException e) {
            return null;
        }
        return (Document) domResult.getNode();
    }

    /*
     * Apply FOP to XSL-FO input
     *
     * @param foDocument The XSL-FO input
     *
     * @return byte[] PDF result
     */
    private static byte[] fo2PDF(Document foDocument) {
        DocumentInputSource fopInputSource = new DocumentInputSource(
                foDocument);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Logger log = new ConsoleLogger(ConsoleLogger.LEVEL_WARN);
            Driver driver = new Driver(fopInputSource, out);
            driver.setLogger(log);
            driver.setRenderer(Driver.RENDER_PDF);
            driver.run();
            return out.toByteArray();
        } catch (Exception ex) {
            return null;
        }
    }

    /*
     * Create and return a Transformer for the specified stylesheet.
     *
     * Based on the DOM2DOM.java example in the Xalan distribution.
     */
    private static Transformer getTransformer(String styleSheet) {
        try {
            TransformerFactory tFactory = TransformerFactory.newInstance();
            DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance();
            dFactory.setNamespaceAware(true);
            DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
            Document xslDoc = dBuilder.parse(styleSheet);
            DOMSource xslDomSource = new DOMSource(xslDoc);
            return tFactory.newTransformer(xslDomSource);
        } catch (javax.xml.transform.TransformerException e) {
            e.printStackTrace();
            return null;
        } catch (java.io.IOException e) {
            e.printStackTrace();
            return null;
        } catch (javax.xml.parsers.ParserConfigurationException e) {
            e.printStackTrace();
            return null;
        } catch (org.xml.sax.SAXException e) {
            e.printStackTrace();
            return null;
        }
    }
}

Kapil Garg

Michael Rubin
Developer
Re: Fop Memory Use
Just a wild thought. But is there a way you could possibly get the JVM to garbage collect between each run? Maybe that might free the memory up?

Thanks.

-Mike

On 18/05/11 13:20, Eric Douglas wrote:
> I am using Fop 1.0.
> I tried using Fop to transform a single document. When I got a little over 100 pages my FO file was over 5 MB. The transform crashed with a Java heap out of memory error.
> I managed to break the input down, as I'm using embedded code generating the input programmatically, and the PDF output is a lot smaller. So I'm currently transforming 10 pages at a time, setting the initial-page-number to the next sequence (1, 11, 21, etc). Then I save all the generated PDFs in memory and merge them using pdfbox. So far this is working great.
> I tried to do the same thing with the PNGRenderer, just calling a method to transform 10 pages at a time and save the output images in an array. The PNGRenderer is created locally in the method. It should be getting released when the method ends but the java process never releases any memory. I tested a 90 page report and the memory use was over 1 GB. I tested on another machine where the memory limit is apparently lower and it crashed on page 24.
> Everything about the method to render to PNG is the same as the method to render to PDF aside from the Renderer.
> Is there a problem with this renderer or something I need to do differently?
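To make the suggestion a bit more concrete, here is a rough sketch of the batching loop with a garbage collection request between runs. It does not call FOP at all: the rendering step is just a placeholder comment, and `BatchRenderSketch`, `batchStartPages` and `usedHeapBytes` are names made up for this example. Bear in mind that `System.gc()` is only a request to the JVM, and that anything still reachable (for example rendered images kept in an array) cannot be reclaimed however often it is called.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchRenderSketch {

    /** First page number of each batch, e.g. 1, 11, 21 for batches of 10. */
    public static List<Integer> batchStartPages(int totalPages, int batchSize) {
        List<Integer> starts = new ArrayList<>();
        for (int first = 1; first <= totalPages; first += batchSize) {
            starts.add(first);
        }
        return starts;
    }

    /** Heap currently in use, as seen by the running JVM. */
    public static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        for (int firstPage : batchStartPages(90, 10)) {
            // Placeholder: render pages firstPage..firstPage+9 here and write
            // the result somewhere that does not keep it on the heap.
            System.gc(); // a hint only; the JVM may ignore it
            System.out.println("after batch starting at page " + firstPage
                    + ": " + usedHeapBytes() + " bytes in use");
        }
    }
}
```

Logging the used heap after each batch at least shows whether memory grows steadily, which would point at retained references rather than at collection timing.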
Re: Event broadcasting and listening question - solved!
Hello again.

Thanks again to Jeremias for his invaluable help. Much appreciated. I believe the issue is now resolved. For the benefit of everyone else here is a summary of what I did:

1. Set up the listener, adapter and producer:
- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added a corresponding entry to the xml.
- Created the org.apache.fop.pdf.PDFEventListener interface containing just 'void warnRevision3PermissionsIgnored(Object source);'.
- Created PDFLibraryEventAdapter in the org.apache.fop.render.pdf package, which implements PDFEventListener.
So the listener gets called from PDFEncryptionJCE.init(), and it hooks into the producer via the adapter, thereby decoupling FOP's event subsystem from the PDF library.

2. Ensure the PDFDocument exists before init time:
PDFEncryptionJCE.make(): Added a PDFDocument parameter to the method. It calls setDocument() to set the PDF document before the init() method is called. The PDFDocument is passed in from PDFEncryptionManager.newInstance(), and comes from PDFDocument.setEncryption() (where 'this' is passed in). Compare with the current code, where the document is set in PDFDocument.setEncryption() after init time.

3. Ensure the listener is set up before init time:
Moved the listener setup code from PDFDocumentHandler.startDocument() to PDFRenderingUtil.setupPDFDocument() (just above the setupPDFEncryption method call).

Now the PDF Document and its listener are available from within the init() method of PDFEncryptionJCE.

Thanks!

-Mike

On 16/05/11 09:38, Michael Rubin wrote:
> Thanks again Jeremias. Your help much appreciated.
>
> I have made the PDFEncryptionJCE class pass itself as source into PDFEventListener.warnRevision3PermissionsIgnored() which gets passed onto the PDFEventProducer.
>
> Yes I am indeed calling PDFEventListener.warnRevision3PermissionsIgnored() from the PDFEncryptionJCE class. The call is originating from the init() method. A bit of debugging and a fresh mind this morning revealed that getDocumentSafely() is throwing an exception as the returned document is null. (That was getting swallowed up and the InvocationTargetException thrown instead that I got at the end of Friday.) So I think your last paragraph is applicable in that PDFEncryptionManager will need to be modified to set the PDF immediately as you say. So my next step is to work out how I should do that...
>
> Thanks!
>
> -Mike
>
> On 14/05/11 10:42, Jeremias Maerki wrote:
>> On 13.05.2011 17:06:56 Michael Rubin wrote:

>>> Thanks for your reply. I have now added the getter and setter to PDFDocument as shown below and added 'this.pdfDoc.setEventListener(new PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));' to PDFDocumentHandler.startDocument() (last line inside the try block). Now I can get the listener from the PDFEncryptionJCE class. However what do I do with it?

>> You call your PDFEventListener.warnRevision3PermissionsIgnored() method.

>>> And how does this relate to the producer class and the EventBroadcaster that I am trying to get hold of?

>> It doesn't. That's part of the decoupling. The PDFEncryptionJCE should know nothing of the EventProducer or EventBroadcaster. Maybe the attached UML helps a bit (I don't usually do UML so I'm not sure I've made any mistakes).

>>> In your first reply you said to create the Listener and Producer interfaces. Based on the FontEvent* classes, both of these had an event definition for their events. But in the latest reply you are saying not to put my event in the new PDFEventListener? That would make it an empty interface then if I understand right. So what would its purpose be? I can't seem to do anything with it.

>> Sorry for the confusion here. I didn't remember that there was already an EventProducer in org.apache.fop.render.pdf. So adding a new EventProducer doesn't make much sense. Instead your warnRevision3PermissionsIgnored() should be added to the existing one. No new EventProducer is necessary (and I didn't notice that at first).

>>> Following discussion with a colleague (Vincent) I left the listener method in (but without the source) and made the call to that to kick off the event. However now I get an InvocationTargetException when I try to get the PDF Doc in order to invoke the listener event method. Looking at the stack trace it happens when I call the PDFDocument.getDocumentSafely() method. It seems when debugging to be PDFEncryptionManager.newInstance() where the error is occurring, the 3rd line calling makeMethod.invoke(...). (I attempted to run the build ant script and then refresh eclipse but this didn't make any difference.)

>> I can't help much with that InvocationTargetException. Maybe if you posted a patch so I could reproduce it

>>> I will continue with this on Monday. Any further pointers in the meantime very much appreciated. There are two questions that come from my colleague:
>>> 1. What is the source object for? And do we need
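The listener / adapter arrangement summarised in step 1 above can be shown in miniature as follows. This is a stripped-down sketch with invented stand-in types, not the real FOP classes: `Broadcaster` stands in for the per-run event broadcaster, `Document` and `Encryption` stand in for the PDF library side, and the event is reduced to a name string.

```java
import java.util.ArrayList;
import java.util.List;

public class EventAdapterSketch {

    /** Library-side interface: knows nothing about the host's event system. */
    public interface PdfEventListener {
        void warnRevision3PermissionsIgnored(Object source);
    }

    /** Stand-in for the host application's per-run event broadcaster. */
    public interface Broadcaster {
        void broadcast(String eventName, Object source);
    }

    /** Glue code on the host side: translates library callbacks into events. */
    public static final class LibraryEventAdapter implements PdfEventListener {
        private final Broadcaster broadcaster;

        public LibraryEventAdapter(Broadcaster broadcaster) {
            this.broadcaster = broadcaster;
        }

        @Override
        public void warnRevision3PermissionsIgnored(Object source) {
            broadcaster.broadcast("warnRevision3PermissionsIgnored", source);
        }
    }

    /** Library-side document: only ever sees the listener interface. */
    public static final class Document {
        private PdfEventListener listener;

        public void setEventListener(PdfEventListener listener) {
            this.listener = listener;
        }

        public PdfEventListener getEventListener() {
            return listener;
        }
    }

    /** Library-side component that raises the warning during init(). */
    public static final class Encryption {
        private final Document document;

        public Encryption(Document document) {
            this.document = document;
        }

        public void init() {
            PdfEventListener listener = document.getEventListener();
            if (listener != null) {
                listener.warnRevision3PermissionsIgnored(this);
            }
        }
    }

    /** Wires everything up and returns the event names the host received. */
    public static List<String> demo() {
        List<String> received = new ArrayList<>();
        Document document = new Document();
        document.setEventListener(
                new LibraryEventAdapter((name, source) -> received.add(name)));
        new Encryption(document).init();
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the indirection is that `Encryption` and `Document` compile without any reference to `Broadcaster`, so the library side could be moved elsewhere without dragging the event system along.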
Re: Event broadcasting and listening question
Thanks again Jeremias. Your help much appreciated.

I have made the PDFEncryptionJCE class pass itself as source into PDFEventListener.warnRevision3PermissionsIgnored() which gets passed onto the PDFEventProducer.

Yes I am indeed calling PDFEventListener.warnRevision3PermissionsIgnored() from the PDFEncryptionJCE class. The call is originating from the init() method. A bit of debugging and a fresh mind this morning revealed that getDocumentSafely() is throwing an exception as the returned document is null. (That was getting swallowed up and the InvocationTargetException thrown instead that I got at the end of Friday.) So I think your last paragraph is applicable in that PDFEncryptionManager will need to be modified to set the PDF immediately as you say. So my next step is to work out how I should do that...

Thanks!

-Mike

On 14/05/11 10:42, Jeremias Maerki wrote:
> On 13.05.2011 17:06:56 Michael Rubin wrote:

>> Thanks for your reply. I have now added the getter and setter to PDFDocument as shown below and added 'this.pdfDoc.setEventListener(new PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));' to PDFDocumentHandler.startDocument() (last line inside the try block). Now I can get the listener from the PDFEncryptionJCE class. However what do I do with it?

> You call your PDFEventListener.warnRevision3PermissionsIgnored() method.

>> And how does this relate to the producer class and the EventBroadcaster that I am trying to get hold of?

> It doesn't. That's part of the decoupling. The PDFEncryptionJCE should know nothing of the EventProducer or EventBroadcaster. Maybe the attached UML helps a bit (I don't usually do UML so I'm not sure I've made any mistakes).

>> In your first reply you said to create the Listener and Producer interfaces. Based on the FontEvent* classes, both of these had an event definition for their events. But in the latest reply you are saying not to put my event in the new PDFEventListener? That would make it an empty interface then if I understand right. So what would its purpose be? I can't seem to do anything with it.

> Sorry for the confusion here. I didn't remember that there was already an EventProducer in org.apache.fop.render.pdf. So adding a new EventProducer doesn't make much sense. Instead your warnRevision3PermissionsIgnored() should be added to the existing one. No new EventProducer is necessary (and I didn't notice that at first).

>> Following discussion with a colleague (Vincent) I left the listener method in (but without the source) and made the call to that to kick off the event. However now I get an InvocationTargetException when I try to get the PDF Doc in order to invoke the listener event method. Looking at the stack trace it happens when I call the PDFDocument.getDocumentSafely() method. It seems when debugging to be PDFEncryptionManager.newInstance() where the error is occurring, the 3rd line calling makeMethod.invoke(...). (I attempted to run the build ant script and then refresh eclipse but this didn't make any difference.)

> I can't help much with that InvocationTargetException. Maybe if you posted a patch so I could reproduce it

>> I will continue with this on Monday. Any further pointers in the meantime very much appreciated. There are two questions that come from my colleague:
>> 1. What is the source object for? And do we need it referenced in the Listener? Or just the producer?

> My original idea was that this object gives the event handler a chance to intercept and modify the object that is the event origin. In most cases, it will certainly be ignored but someone might find it handy. And to have it in the producer means you also have to have it in the PDFEventListener. See also java.util.EventObject from which org.apache.fop.events.Event is derived.

>> 2. Why should we get the PDFDocument object from the Encryption class?

> It's a PDFObject, right? So it should already have the PDFDocument. That makes access to the PDFEventListener easy.

>> Should the listener not be passed into the Encryption class via its constructor rather than having to go fetch the listener?

> Both are valid ways but since I expect the PDFDocument to already be set, I see no point in giving more information that can otherwise be easily accessed. Well, it could be that your event happens before the PDFDocument is set on that object (see PDFDocument.setEncryption()). In that case PDFEncryptionManager might have to be changed to pass in the PDFDocument immediately. Or you pass in the PDFEventListener, although I find the former more useful and flexible.

>> Thanks!
>>
>> -Mike
>>
>> On 12/05/11 21:29, Jeremias Maerki wrote:
>>> On 12.05.2011 10:44:41 Michael Rubin wrote:

>>>> Thanks a lot for your response Jeremias. I have now done the following:
>>>> - Added 'void warnRevision3PermissionsIgnored(Object source);' (and its javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added a corresponding entry to the xml. Removed the org.apache.fop.pdf.PDFEventProducer class
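The two problems described in this exchange (the document being null when init() runs, and the real exception hiding inside an InvocationTargetException) can be reproduced in isolation. The sketch below uses invented stand-in classes, not the FOP ones; only `java.lang.reflect.Method.invoke` and `InvocationTargetException.getCause()` are real JDK API. The `make` factory sets the document before calling `init()`, and the reflective caller unwraps the cause instead of reporting the wrapper.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class InitOrderSketch {

    /** Stand-in for the PDF document. */
    public static final class Document {
    }

    /** Stand-in for the encryption object created reflectively. */
    public static final class Encryption {
        private Document document;
        private boolean initialised;

        /** Factory: the document is set first, so init() can rely on it. */
        public static Encryption make(Document document) {
            Encryption encryption = new Encryption();
            encryption.setDocument(document);
            encryption.init();
            return encryption;
        }

        public void setDocument(Document document) {
            this.document = document;
        }

        Document getDocumentSafely() {
            if (document == null) {
                throw new IllegalStateException("document not set yet");
            }
            return document;
        }

        void init() {
            getDocumentSafely();   // fails if init() runs before setDocument()
            initialised = true;
        }

        public boolean isInitialised() {
            return initialised;
        }
    }

    /** Calls make() reflectively; returns the underlying failure, or null. */
    public static Throwable reflectiveMakeFailure(Document document) {
        try {
            Method make = Encryption.class.getMethod("make", Document.class);
            make.invoke(null, document);
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause();   // the exception thrown inside make()
        } catch (ReflectiveOperationException e) {
            return e;              // lookup or access problem
        }
    }

    public static void main(String[] args) {
        System.out.println("with document: " + reflectiveMakeFailure(new Document()));
        System.out.println("without document: " + reflectiveMakeFailure(null));
    }
}
```

Looking at `getCause()` first is usually the quickest way to see what a reflective call actually tripped over.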
Re: Event broadcasting and listening question
Thanks for your reply. I have now added the getter and setter to PDFDocument as shown below and added 'this.pdfDoc.setEventListener(new PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster());' to PDFDocumentHandler.startDocument() (last line inside the try block). Now I can get the listener from the PDFEncryptionJCE class. However what do I do with it? And how does this relate to the producer class and the EventBroadcaster that I am trying to get hold of? In your first reply you said to create the Listener and Producer interfaces. Based on the FontEvent* classes, both of these had an event definition for their events. But in the latest reply you are saying not to put my event in the new PDFEventListener? That would make it an empty interface then if I understand right. So what would its purpose be? I can't seem to do anything with it. Following discussion with a colleague (Vincent) I left the listener method in (but without the source) and made the call to that to kick off the event. However now I get an InvocationTargetException when I try to get the PDF Doc in order to invoke the listener event method. Looking at the stack trace it happens when I call the PDFDocument.getDocumentSafely() method. It seems when debugging to be PDFEncryptionManager.newInstance() where the error is occurring, the 3rd line calling makeMethod.invoke(...). (I attempted to run the build ant script and then refresh eclipse but this didn't make any difference.) I will continue with this on Monday. Any further pointers in the meantime very much appreciated. There are two questions that come from my colleague: 1. What is the source object for? And do we need it referenced in the Listener? Or just the producer? 2. Why should we get the PDFDocument object from the Encryption class? Should the listener not be passed into the Encryption class via its constructor rather than having to go fetch the listener? Thanks! 
-Mike

On 12/05/11 21:29, Jeremias Maerki wrote:

On 12.05.2011 10:44:41 Michael Rubin wrote:

Thanks a lot for your response Jeremias. I have now done the following:

- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added a corresponding entry to the xml. Removed the org.apache.fop.pdf.PDFEventProducer class and xml.
- Created org.apache.fop.pdf.PDFEventListener interface containing just 'void warnRevision3PermissionsIgnored(Object source);'.
- Created PDFLibraryEventAdaptor in the org.apache.fop.render.pdf package that extends the PDFEventListener. (Currently just contains my new event. Should I also add the existing 2 render.pdf events to this class?)

Or do it the other way around: add your new event to PDFEventProducer. Doesn't make sense to have two.

I can also see how to obtain the PDFDocument object from the PDFEncryptionJCE class via the getDocumentSafely() method. But I am not sure how to get the event broadcaster from that object. How is this done?

public class PDFDocument {
    [..]
    private PDFEventListener listener;
    [..]
    public void setListener(PDFEventListener listener) {
        this.listener = listener;
    }

    PDFEventListener getListener() {
        return this.listener;
    }
    [..]

That's the simplest way and should probably be sufficient. If we wanted to get fancy, we could handle a List<PDFEventListener>.

In PDFDocumentHandler.startDocument():

this.pdfDoc.setEventListener(new PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));

So, the PDFDocument doesn't actually get an EventBroadcaster. PDFDocument calls the PDFLibraryEventAdapter and that one in turn calls the EventBroadcaster. Nicely decoupled.

Thanks! -Mike

On 11/05/11 19:46, Jeremias Maerki wrote:

Hi Michael

Creating a new EventBroadcaster is obviously wrong. The idea is that the user can get events for each FOP rendering run separately (unlike logging where concurrent runs get mixed up).
So you have to get hold of that EventBroadcaster applicable to the current rendering run. Obviously, you don't have access to the FOUserAgent in the PDF library. That is intentional because the PDF library should remain reasonably independent of as much FOP code as possible for the case that we ever factor it out into a separate component/module or move it to XML Graphics Commons. My suggestion is to follow a similar path as done in org.apache.fop.fonts: Create an interface for the events coming out of the PDF library (see FontEventListener). Let's call it PDFEventListener or something like that and put it in the org.apache.fop.pdf package. Then move your PDFEventProducer (corresponds to FontEventProducer) into org.apache.fop.render.pdf as this package makes the glue between FOP and PDF output. Then create a PDFLibraryEventAdapter (implements PDFEventListener) in the org.apache.fop.render.pdf package (corresponds to FontEventAdapter). The PDFLibraryEventAdapter will get
Re: Event broadcasting and listening question
Thanks a lot for your response Jeremias. I have now done the following: - Added 'void warnRevision3PermissionsIgnored(Object source);' (and its javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added a corresponding entry to the xml. Removed the org.apache.fop.pdf.PDFEventProducer class and xml. - Created org.apache.fop.pdf.PDFEventListener interface containing just 'void warnRevision3PermissionsIgnored(Object source);'. - Created PDFLibraryEventAdaptor in the org.apache.fop.render.pdf package that extends the PDFEventListener. (Currently just contains my new event. Should I also add the existing 2 render.pdf events to this class?) I can also see how to obtain the PDFDocument object from the PDFEncryptionJCE class via the getDocumentSafely() method. But I am not sure how to get the event broadcaster from that object. How is this done? Thanks! -Mike On 11/05/11 19:46, Jeremias Maerki wrote: Hi Michael Creating a new EventBroadcaster is obviously wrong. The idea is that the user can get events for each FOP rendering run separately (unlike logging where concurrent runs get mixed up). So you have to get hold of that EventBroadcaster applicable to the current rendering run. Obviously, you don't have access to the FOUserAgent in the PDF library. That is intentional because the PDF library should remain reasonably independent of as much FOP code as possible for the case that we ever factor it out into a separate component/module or move it to XML Graphics Commons. My suggestion is to follow a similar path as done in org.apache.fop.fonts: Create an interface for the events coming out of the PDF library (see FontEventListener). Let's call it PDFEventListener or something like that and put it in the org.apache.fop.pdf package. Then move your PDFEventProducer (corresponds to FontEventProducer) into org.apache.fop.render.pdf as this package makes the glue between FOP and PDF output. 
Then create a PDFLibraryEventAdapter (implements PDFEventListener) in the org.apache.fop.render.pdf package (corresponds to FontEventAdapter). The PDFLibraryEventAdapter will get the EventBroadcaster from the PDFDocumentHandler which is responsible for instantiating the PDFDocument and PDFLibraryEventAdapter. The adapter is then added as listener to a List<PDFEventListener> that you can add to PDFDocument. From PDFEncryptionJCE you should have access to the PDFDocument via the getDocumentSafely() method. That nicely decouples FOP's event subsystem from the PDF library.

HTH

On 11.05.2011 15:47:49 Michael Rubin wrote:

Hello there. I have been busy implementing 128-bit PDF encryption for FOP. I have already got it working successfully but one issue remains that I have a question about. In the org.apache.fop.pdf.PDFEncryptionJCE.init() method there is one place where I want to broadcast an event message. I looked at http://xmlgraphics.apache.org/fop/trunk/events.html to learn about events. However it just shows

EventBroadcaster broadcaster = [get it from somewhere];

and doesn't show how I should be getting the broadcaster. After looking in the code in the AFP package for existing examples I put together the following, which seems to work on testing:

FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent agent = fopFactory.newFOUserAgent();
EventBroadcaster eventBroadcaster = agent.getEventBroadcaster();
PDFEventProducer eventProducer = PDFEventProducer.Provider.get(eventBroadcaster);
eventProducer.warnRevision3PermissionsIgnored(this);

This creates a new FopFactory, from which I create a new FOUserAgent, from which I can get the event broadcaster to supply to my event producer. (I had to create a PDFEventProducer which extends EventProducer. Plus PDFEventProducer.xml which contains the message mapping.) In this case the EventBroadcaster will be created new every time, so I am not sure existing listeners will pick up the event.
So is there a recommended way that I can get an existing event broadcaster to use? Or is the above way the correct way to do it after all? Version of FOP is v1.0. Platform is Ubuntu Linux, running from within the Eclipse IDE. Thanks! -Mike Michael Rubin Developer T: +44 20 8238 7400 F: +44 20 8238 7401 mru...@thunderhead.com The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Jeremias Maerki
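
The decoupling described in this reply can be illustrated with a small self-contained sketch. It does not use FOP at all: every type below (PdfEventListener, RunBroadcaster, LibraryEventAdapter, DocumentStub, EncryptionStub) is a hypothetical stand-in invented for illustration, and the real FOP classes may well look different. It only shows the shape of the idea: the library side knows a listener interface, the glue side adapts that interface to the per-run broadcaster, and the encryption code fetches the listener from its document.

```java
import java.util.ArrayList;
import java.util.List;

/** Self-contained sketch; every type here is a hypothetical stand-in, not FOP API. */
final class ListenerDecouplingSketch {

    /** Library side: the only event contract the "PDF library" knows about. */
    interface PdfEventListener {
        void warnRevision3PermissionsIgnored(Object source);
    }

    /** Application side: stand-in for a per-run event broadcaster. */
    static final class RunBroadcaster {
        final List<String> received = new ArrayList<>();

        void broadcast(String message) {
            received.add(message);
        }
    }

    /** Glue: implements the library interface and forwards to the broadcaster. */
    static final class LibraryEventAdapter implements PdfEventListener {
        private final RunBroadcaster broadcaster;

        LibraryEventAdapter(RunBroadcaster broadcaster) {
            this.broadcaster = broadcaster;
        }

        @Override
        public void warnRevision3PermissionsIgnored(Object source) {
            broadcaster.broadcast("revision 3 permissions ignored by "
                    + source.getClass().getSimpleName());
        }
    }

    /** Library side: the document stores a listener, never a broadcaster. */
    static final class DocumentStub {
        private PdfEventListener listener;

        void setListener(PdfEventListener listener) { this.listener = listener; }

        PdfEventListener getListener() { return listener; }
    }

    /** Library side: asks its document for the listener when it needs to warn. */
    static final class EncryptionStub {
        private final DocumentStub document;

        EncryptionStub(DocumentStub document) { this.document = document; }

        void init() {
            PdfEventListener listener = document.getListener();
            if (listener != null) { // no listener registered yet: stay silent
                listener.warnRevision3PermissionsIgnored(this);
            }
        }
    }

    public static void main(String[] args) {
        RunBroadcaster broadcaster = new RunBroadcaster();
        DocumentStub document = new DocumentStub();
        document.setListener(new LibraryEventAdapter(broadcaster));
        new EncryptionStub(document).init();
        System.out.println(broadcaster.received);
    }
}
```

The null check in EncryptionStub.init() is an assumption of this sketch, covering the case where the event fires before any listener has been set on the document.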
Event broadcasting and listening question
Hello there. I have been busy implementing 128-bit PDF encryption for FOP. I have already got it working successfully but one issue remains that I have a question about. In the org.apache.fop.pdf.PDFEncryptionJCE.init() method there is one place where I want to broadcast an event message. I looked at http://xmlgraphics.apache.org/fop/trunk/events.html to learn about events. However it just shows

EventBroadcaster broadcaster = [get it from somewhere];

and doesn't show how I should be getting the broadcaster. After looking in the code in the AFP package for existing examples I put together the following, which seems to work on testing:

FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent agent = fopFactory.newFOUserAgent();
EventBroadcaster eventBroadcaster = agent.getEventBroadcaster();
PDFEventProducer eventProducer = PDFEventProducer.Provider.get(eventBroadcaster);
eventProducer.warnRevision3PermissionsIgnored(this);

This creates a new FopFactory, from which I create a new FOUserAgent, from which I can get the event broadcaster to supply to my event producer. (I had to create a PDFEventProducer which extends EventProducer. Plus PDFEventProducer.xml which contains the message mapping.) In this case the EventBroadcaster will be created new every time, so I am not sure existing listeners will pick up the event.

So is there a recommended way that I can get an existing event broadcaster to use? Or is the above way the correct way to do it after all?

Version of FOP is v1.0. Platform is Ubuntu Linux, running from within the Eclipse IDE.

Thanks! -Mike

Michael Rubin, Developer
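
The concern raised in this post, that a freshly created broadcaster will not reach listeners registered elsewhere, can be shown with a minimal sketch. The Broadcaster and Agent classes below are hypothetical stand-ins written for this illustration only; they do not use or model the real FopFactory, FOUserAgent or EventBroadcaster. The sketch assumes only that each agent owns its own broadcaster and that a broadcaster delivers to the listeners registered on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-ins only; this does not use or model the real FOP classes. */
final class FreshBroadcasterSketch {

    /** Minimal broadcaster: delivers each message to its own listeners only. */
    static final class Broadcaster {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> listener) { listeners.add(listener); }

        void broadcast(String message) {
            for (Consumer<String> listener : listeners) {
                listener.accept(message);
            }
        }
    }

    /** Stand-in for a per-run agent that owns exactly one broadcaster. */
    static final class Agent {
        private final Broadcaster broadcaster = new Broadcaster();

        Broadcaster getBroadcaster() { return broadcaster; }
    }

    /** Returns what a listener registered on the run's agent actually receives. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        Agent runAgent = new Agent();
        runAgent.getBroadcaster().addListener(seen::add);

        // Approach from the question: build a brand-new agent at the call site.
        new Agent().getBroadcaster().broadcast("sent via fresh broadcaster");

        // Alternative: use the broadcaster that belongs to the current run.
        runAgent.getBroadcaster().broadcast("sent via run broadcaster");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch only the message sent through the run's own broadcaster reaches the registered listener; the one sent through the fresh instance has no listeners to deliver to.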