Quick JIRA question
Hi all

I recently discovered two JIRA issues logged against FOP that I think are actually one and the same issue in XGC (see: https://issues.apache.org/jira/browse/FOP-2343).

I still have to verify locally that my proposed fix in XGC would fix the issue in FOP, but was wondering how to proceed. Does one usually
a) create a new JIRA issue against XGC, and then link the other two to that one, or
b) reassign those issues to XGC (if that is even possible)?

TIA!

KR

Andreas
Re: fop-pdf-images question regarding pdfbox Version/Patch
The patches to pdfbox 1.8.4 were deemed unnecessary after changes were made to fop so that it no longer needs them. So they were removed with the upgrade to 1.8.5. We could create a branch, but right now it would be the same as the commit Simon mentions. If you have problems replacing 1.8.5 with 1.8.8, please ask on fop-users for a new branch, and if there is interest from other users we will create one.

On 1/22/15 10:05 AM, Simon Steiner wrote:
> Hi,
>
> For now you can check out revision 1593155 in svn to get the older version. You should use the mailing list or jira to report any issues.
>
> Thanks
>
> From: Kai Hofmann [mailto:powers...@web.de]
> Sent: 22 January 2015 08:14
> To: sstei...@apache.org; lberna...@apache.org
> Subject: fop-pdf-images question regarding pdfbox Version/Patch
>
>> Hello you two main committers on the famous fop-pdf-images project,
>>
>> I hope it is ok to bother you directly with very special questions:
>>
>> On 08 May 2014 the commit removed pdfbox 1.8.4 and installed pdfbox 1.8.5 - during this the patches for pdfbox vanished - are they no longer required? I ask because I will try to use pdfbox 1.8.8 (because of some problems I have with 1.8.5).
>>
>> Later in 2014, on 16 July, you moved forward to pdfbox 2.0 - I can't get this version working yet - so I would like to suggest that it would be helpful if you used branches, i.e. a branch for a fop-pdf-images version based on pdfbox 1.8.x would be very helpful while you are working on the pdfbox 2.0 version in trunk.
>>
>> Thanks and Greetings
>> Kai Hofmann
>>
>> --
>> Kai Hofmann
>> EMail: powers...@web.de
>> Bremen/Germany
Re: Question on FOP release schedule
Hi Jacopo,

Thanks for your contribution. I wonder why you ask for a bug fix release of 1.1? v2.0 will include all bug fixes since the v1.1 release, there are no plans for a 1.1.2 release or similar.

We are working towards a v2.0 release of FOP. We've just released XML Graphics Commons v2.0, but the FOP release is dependent on other libraries being released first.

Thanks,

Chris
Re: Question on FOP release schedule
Hi Chris,

the only reason I asked for a bug fix release for 1.1 is that it may be easier to approve/publish. But if you are already planning the new 2.0 release then great. Is there a tentative plan for it already?

Thanks,

Jacopo

On Oct 7, 2014, at 2:36 PM, Chris Bowditch bowditch_ch...@hotmail.com wrote:
> Hi Jacopo,
>
> Thanks for your contribution. I wonder why you ask for a bug fix release of 1.1? v2.0 will include all bug fixes since the v1.1 release; there are no plans for a 1.1.2 release or similar.
>
> We are working towards a v2.0 release of FOP. We've just released XML Graphics Commons v2.0, but the FOP release is dependent on other libraries being released first.
>
> Thanks,
> Chris
>
> On 03/10/2014 06:49, Jacopo Cappellato wrote:
>> Thank you Luis,
>>
>> I have attached to Jira a JUnit test for the CompareUtil.equal method that should prove the issue we are facing and should confirm that the fix I am proposing works.
>>
>> As regards the bug fix release, at the moment this is the only issue that I am aware of that is causing some pain to OFBiz, and having a bug fix release for it would be great; however I know that the release workflow requires a good amount of work and I am wondering if I or other OFBiz committers may be of any help in the release process (e.g. we could help the FOP community to maintain a release branch for 1.1 by backporting fixes to it and testing it). I am wide open to any suggestions.
>>
>> Of course OFBiz will upgrade to the new release 2.0 as soon as it is available and we will help you to test that as well. All in all I am just trying to give something back to the FOP community, since the OFBiz community has been a rather silent and passive user of your amazing tool :-)
>>
>> Regards,
>> Jacopo
>>
>> On Oct 3, 2014, at 1:15 AM, Luis Bernardo lmpmberna...@gmail.com wrote:
>>> I can apply your patch although I do not have the environment to test it.
>>>
>>> Regarding the question about a bug fix for 1.1, the answer is that there is nothing planned, but if there is interest from the FOP users I think that can be accommodated. Is there any other bug you would like to see fixed in a 1.1+ release?
>>>
>>> On 10/2/14, 7:22 PM, Jacopo Cappellato wrote:
>>>> Hi all,
>>>>
>>>> I am a committer for Apache OFBiz, a project that uses FOP 1.1 (thanks for this amazing product). I hope this is the right list to get some information about the release process and planning of Apache FOP.
>>>>
>>>> Apart from FOP 2.0, is there a plan to release a bug fix release for 1.1? For example, we may be specifically interested in getting a new release with this issue resolved: https://issues.apache.org/jira/browse/FOP-2157 (in the ticket I have attached a fix for the same).
>>>>
>>>> Is there something we could do to support you in the process?
>>>>
>>>> Thank you,
>>>> Jacopo
Question on FOP release schedule
Hi all, I am a committer for Apache OFBiz, a project that uses FOP 1.1 (thanks for this amazing product). I hope this is the right list to get some information about the release process and planning of Apache FOP. Apart from FOP 2.0, is there a plan to release a bug fix release for 1.1? For example, we may be specifically interested in getting a new release with this issue resolved: https://issues.apache.org/jira/browse/FOP-2157 (in the ticket I have attached a fix for the same). Is there something we could do to support you in the process? Thank you, Jacopo
Re: Question on FOP release schedule
I can apply your patch although I do not have the environment to test it.

Regarding the question about a bug fix for 1.1, the answer is that there is nothing planned, but if there is interest from the FOP users I think that can be accommodated. Is there any other bug you would like to see fixed in a 1.1+ release?
Re: Question on FOP release schedule
Thank you Luis,

I have attached to Jira a Junit test for the CompareUtil.equal method that should prove the issue we are facing and should confirm that the fix I am proposing should work ok.

As regards the bug fix release, at the moment this is the only issue that I am aware of that is causing some pain to OFBiz and having a bug fix release for it would be great; however I know that the release workflow requires a good amount of work and I am wondering if I or other OFBiz committers may be of any help in the release process (e.g. we could help the FOP community to maintain a release branch for 1.1 by backporting fixes to it and testing it). I am wide open to any suggestions.

Of course OFBiz will upgrade to the new release 2.0 as soon as this will be available and we will help you to test that as well. All in all I am just trying to give something back to the FOP community, since the OFBiz community has been a rather silent and passive user of your amazing tool :-)

Regards,

Jacopo
Re: Retrieving Objects question
Thanks a lot for your reply Andreas.

Yes, if all I had to do was move references around then my work would already be complete and submitted for review. However, the catch is that the Page objects also have Parent references, which also need to be updated when the pages get moved from one page tree node to another. But since the pages have been written out already, this cannot be done. So the pages effectively become immovable (or else the parent references will not match the kids references as they will be out of date - which was why acroread could not open the pages).

Delaying writing the page objects would mean the parent references can be updated correctly, and the problem would be solved. But that has a potential memory usage toll.

Today I will continue with my attempt to link every page to a node of its own (stored in a flat list), then re-order the nodes according to the page index of the page inside, and then build up the balanced page tree from those nodes up. That's the plan anyway... (Time permitting, I'll also be interested in looking more closely at what happened when the 2 page sequences ended up with mixed up pages...)

Thanks!

-Mike
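The plan of ordering a flat list by page index and then building the tree upward can be sketched roughly as follows. This is only an illustration under stated assumptions: `PageTreeNode` and `BalancedPageTree` are hypothetical stand-ins rather than FOP's `PDFPages` API, the limit of 10 kids per node is taken from elsewhere in this thread, and the /Parent bookkeeping that causes the actual difficulty is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a page tree node; this is not FOP's PDFPages class. */
final class PageTreeNode {
    /** References of leaf pages, e.g. "12 0 R"; empty on intermediate nodes. */
    final List<String> pageRefs = new ArrayList<>();
    /** Child nodes; empty on nodes that hold leaf pages. */
    final List<PageTreeNode> children = new ArrayList<>();
    /** Number of leaf pages below this node (what /Count would hold). */
    int count;
}

final class BalancedPageTree {
    /** Assumed maximum number of kids per node. */
    static final int MAX_KIDS = 10;

    private BalancedPageTree() { }

    /** Builds the tree bottom-up from page references already sorted by page index. */
    static PageTreeNode build(List<String> refsInPageOrder) {
        // Lowest level: group the leaf references into nodes of at most MAX_KIDS.
        List<PageTreeNode> level = new ArrayList<>();
        for (int i = 0; i < refsInPageOrder.size(); i += MAX_KIDS) {
            int end = Math.min(i + MAX_KIDS, refsInPageOrder.size());
            PageTreeNode node = new PageTreeNode();
            node.pageRefs.addAll(refsInPageOrder.subList(i, end));
            node.count = node.pageRefs.size();
            level.add(node);
        }
        if (level.isEmpty()) {
            return new PageTreeNode(); // empty document: a root without kids
        }
        // Keep grouping nodes under new parents until a single root remains.
        while (level.size() > 1) {
            List<PageTreeNode> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += MAX_KIDS) {
                int end = Math.min(i + MAX_KIDS, level.size());
                PageTreeNode parent = new PageTreeNode();
                for (PageTreeNode child : level.subList(i, end)) {
                    parent.children.add(child);
                    parent.count += child.count;
                }
                parents.add(parent);
            }
            level = parents;
        }
        return level.get(0);
    }
}
```

For 101 pages this yields a root with two kids whose counts are 100 and 1. Unlike the on-the-fly variant described in the thread, the lone last page here sits one level deeper, inside its own leaf node.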
Re: Retrieving Objects question
On 09 Jun 2011, at 09:49, Michael Rubin wrote:

Hi Mike

> Thanks a lot for your reply Andreas. Yes, if all I had to do was move references around then my work would already be complete and submitted for review. However, the catch is that the Page objects also have Parent references, which also need to be updated when they get moved from one page tree node to another. But since they have been written out already this cannot be done. So the pages effectively become immovable (or else the parent references will not match the kids references as they will be out of date - which was why acroread could not open the pages).

Aah, OK, now I understand that catch better. Thanks for clarifying!

In the meantime, I'll give it some more thought too, and if I find anything useful to add, I'll follow up here.

Regards

Andreas
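The catch with the Parent references can be illustrated with a tiny sketch. `PageStub` and `EarlyFlushDemo` are hypothetical names, not FOP classes, and the dictionary written here is deliberately incomplete; the only point is that text already appended to the output keeps the old /Parent value.

```java
/** Hypothetical stand-in for a page object; not FOP's PDFPage class. */
final class PageStub {
    private String parentRef;

    PageStub(String parentRef) {
        this.parentRef = parentRef;
    }

    void setParent(String parentRef) {
        this.parentRef = parentRef;
    }

    /** Serializes a deliberately incomplete page dictionary with its /Parent entry. */
    String toPDFString() {
        return "<< /Type /Page /Parent " + parentRef + " >>";
    }
}

final class EarlyFlushDemo {
    private EarlyFlushDemo() { }

    /** The page is written first and moved afterwards: the output keeps the stale parent. */
    static String flushThenMove() {
        StringBuilder output = new StringBuilder();
        PageStub page = new PageStub("3 0 R");
        output.append(page.toPDFString()); // written immediately
        page.setParent("9 0 R");           // too late, the text above is already out
        return output.toString();
    }

    /** Writing is delayed until the final parent is known: the output is consistent. */
    static String moveThenFlush() {
        StringBuilder output = new StringBuilder();
        PageStub page = new PageStub("3 0 R");
        page.setParent("9 0 R");
        output.append(page.toPDFString());
        return output.toString();
    }
}
```

The second variant is the "delay writing" option from the thread, with the memory cost that was pointed out there.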
Re: Retrieving Objects question
On 08 Jun 2011, at 17:15, Michael Rubin wrote:

Hi Mike

> Hello there. Thought I'd post an update. Admittedly I feel like I've found a bit of a catch 22 situation. I successfully completed my code to generate the balanced page tree on the fly and it works fine with a single page sequence. However, this morning I discovered that this code does not appear to work for multiple page sequences in a flow. (2x 101 page sequences, I got pages 1-9, 102, 10-101 then 103-end in that order...) I guess this is where pages can come in in a different order anyway then, and why the current indexing / nulls system is there.

Ouch! I had not considered that to be the purpose. Without looking closer, I would say something like: page 10 contains a forward reference to page 102, and all pages in between are only flushed after the reference can be resolved (?)

> (And shows that I am still learning the ropes as I go along...)

Yep, and also shows that I am not intimately familiar with *all* of the codebase myself. ;-)

> So I re-examined trying to generate the page tree after the pages have been added into one big flat list. I can do this by, in PDFDocument.outputTrailer(), calling a method to balance the page tree before all the remaining objects are written out. This way pages can be attached to nodes, and the tree hierarchy built up to the root node. This is on paper a more elegant, efficient and easier solution to doing it on the fly. But I ran into the same problem again - the page objects are already written out.

OK, here may be a gap in my understanding of it so far, but... Do you really _need_ the PDFPage object for some reason, or does its PDF reference suffice to build the page tree? From what I know of PDF, that page tree would only contain the references to the actual page objects, no? As long as the PDFPages object is not written to the stream, you should be able to shuffle and play with the references all you want. All you need to keep track of is the natural order (= the page's index), as the object numbers will not necessarily reflect that.

Unless I am mistaken about this, I do not see a compelling reason *not* to write the PDFPage object to the stream as soon as it's finished. We keep a mapping of reference-to-index alive in the 'main' (temporary?) PDFPages object. Note that notifyKidRegistered() only stores the reference; the natural index is translated into the position of the reference in the list. If you want to re-shape that into a structured tree/map, then by all means...

Perhaps there is still a catch --sounds too simple somehow... :-/

> <snip />
> My current questions are:
> - Why are the page objects flushed straight away? (Memory constraints?)

Very likely to save memory indeed. More with the intention of just flushing as soon as possible, to support full streaming processing if the document structure allows it. Theoretically, in a document consisting of single-page fo:page-sequences, without any cross-references, you should see relatively low memory usage even if the document is 1+ pages, precisely because the pages are all written to the output immediately, long before the root page tree, which only retains their object references.

> - Is it safe and wise to delay flushing the page objects until the end?

Safe? No issue here. Wise? That would obviously depend on the context. In documents with 1000s of pages, I can imagine we do not want to keep all of those pages in memory any longer than strictly necessary... I wouldn't mind too much if it were an option that users could switch on/off. However, if the process is hard coded as the *only* way FOP will render PDFs, such that it would affect *all* users, I am not so sure it is wise to do this.

> <snip />

Regards

Andreas
Re: Retrieving Objects question
On 08 Jun 2011, at 21:14, Andreas L. Delmelle wrote:

> <snip />
>> <snip />
>> My current questions are:
>> - Why are the page objects flushed straight away? (Memory constraints?)
>
> Very likely to save memory indeed. More with the intention of just flushing as soon as possible, to support full streaming processing if the document structure allows it. Theoretically, in a document consisting of single-page fo:page-sequences, without any cross-references, you should see relatively low memory usage even if the document is 1+ pages, precisely because the pages are all written to the output immediately, long before the root page tree, which only retains their object references.

^ Just felt this needed clarification: *PDF* object references (which, in Java, are merely Strings, not references to the PDFPage objects).
Re: Retrieving Objects question
Thanks for your reply Andreas.

Currently it is hardcoded to 10 nodes or leaves, but adding an xconf setting should be pretty easy and quick to do. However, having spoken to my manager, there isn't a business requirement currently to make it configurable, and given the large array of options already available, the preference is to just keep it hardcoded for now. At the very least I'll make sure the maximum leaves / subnodes value is stored in a constant, so if it is made configurable then only the constant needs to be paid attention to rather than multiple locations in the class.

As far as I can tell the page objects are kept alive anyway by the references in the document object itself (at least until the trailer is written). So me keeping references in the page tree object should not extend their life in any way.

Currently, if I take a 20 page document, then there are two sets of 10 pages, one in each node, each node being a child of the root node. For the first 10 pages the kids list is something like

{1 0 R, 2 0 R, 3 0 R, 4 0 R, 5 0 R, 6 0 R, 7 0 R, 8 0 R, 9 0 R, 10 0 R}

(object numbers not intended to be realistic for this example). But for the second 10 pages the kids list is

{null, null, null, null, null, null, null, null, null, null, 11 0 R, 12 0 R, 13 0 R, 14 0 R, 15 0 R, 16 0 R, 17 0 R, 18 0 R, 19 0 R, 20 0 R}

since the page index (which is zero based) makes the page get placed in that index position on the tree, any previous unused indexes being filled with null. So for a 10,000 page doc there are going to be a lot of nulls in the page tree. For now, setting toPDFString() to ignore the nulls rather than throw an exception gets round this and allows the document to be correctly generated. In my tests all the pages are produced in the correct order.

I was wondering though if there are any cases where the pages might not be passed in in the correct order (which might explain why the notifyKidsRegistered() method was written the way it is), and if so, whether that has any implications for the way I have written the balanced page tree code updates.

Thanks.

-Mike

On 03/06/11 22:38, Andreas L. Delmelle wrote:
> On 03 Jun 2011, at 10:54, Michael Rubin wrote:
>
> Hi Mike
>
>> Thanks a lot for your reply last week Andreas. Sorry for the delay. Been away and offline... FYI to follow up on the work I was doing:
>
> <snip />
>
>> So for example a 101 page document will have a root PDFPages node with two sub-nodes underneath. The first will contain a count of 100, and have 10 sub-nodes, each containing 10 pages. The second will simply contain 1 page. More new pages will get added to the second sub-node (moving pages down to new sub-nodes to avoid more than 10 pages per node) until its count reaches 100 too, then another node created. Once 10 nodes under the root exist (at 1000 pages) they will get moved down below a new root level sub-node with a count of 1000, and a new root level sub-node created, and so on.
>
> Cool! Impressive work. Will the number of pages per node be configurable?
>
>> Next task is to write a JUnit test since one appears not to exist... I guess remaining thoughts currently are:
>> - Wondering if keeping references to a page tree object's sub-nodes or leaves is the best way or can I improve it further? (Bearing in mind memory usage and performance.)
>
> It depends a bit on whether you are thereby keeping PDFPage objects alive longer than necessary. The current design only stores the pages' referencePDF, so that seems safe.
>
>> - Was wondering if the trailer objects list is the right place to write the new sub-node PDFPages objects. (But if writing an object to the objects list - addObject() instead of addTrailerObject() - it gets written out too soon before I have added all the pages.) But given how it writes the objects out before writing the xref and trailer it seems OK and parses and shows fine in PDFBox/PDFDebugger and the evince PDF Reader in Ubuntu.
>
> I would think that that is the correct place, although I must admit, I would have to check the PDF Spec to be certain.
>
>> - When registering the pages themselves via notifyKidsRegistered() method it extracts the page index number and puts the reference at that index in the kids list, filling empty spaces ahead of it with nulls. So when counting kids and writing out the pdf code text I had to ignore nulls and 'gaps' in the kids list since not all the kids are in the same list any more (spread across multiple page tree nodes). I was wondering why this method was written like this, and doesn't simply append new pages to the end of the list all the time.
>
> AFAICT, what it is designed to do is make sure that the page is entered at the correct index in the list of kids. It would only create null entries if the list is not yet large enough. I have a feeling this is just by design, taking into account a single page tree node only (see the javadoc of the PDFPages class...)
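The index-based registration and the null-skipping workaround discussed above can be sketched as follows. `PaddedKids` is a hypothetical stand-in, not the actual `PDFPages` code; it only mirrors the behaviour as it is described in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical stand-in for the kids list of one page tree node; not FOP's PDFPages class. */
final class PaddedKids {
    private final List<String> kids = new ArrayList<>();

    /** Stores the reference at the page's zero-based index, padding any gap with nulls. */
    void register(int pageIndex, String reference) {
        while (kids.size() <= pageIndex) {
            kids.add(null);
        }
        kids.set(pageIndex, reference);
    }

    /** Number of list slots, including null placeholders. */
    int slots() {
        return kids.size();
    }

    /** Number of real kids, ignoring null placeholders. */
    int count() {
        return (int) kids.stream().filter(Objects::nonNull).count();
    }

    /** Writes the kids as a PDF array, skipping null placeholders instead of failing on them. */
    String toKidsArray() {
        return kids.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.joining(" ", "[", "]"));
    }
}
```

Registering pages 11 to 20 (indexes 10 to 19) on a fresh node leaves 20 slots but only 10 real kids, which is the situation shown in the second kids list above.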
Re: Retrieving Objects question
On 06 Jun 2011, at 10:59, Michael Rubin wrote:

Hi Mike

> Thanks for your reply Andreas. Currently it is hardcoded to 10 nodes or leaves, but adding an xconf setting perhaps should be pretty easy and quick to do. However, having spoken to my manager, there isn't the business requirement currently to make it configurable, and given the current large array of options already available, the preference is to just keep it hardcoded for now. At the very least I'll make sure the maximum leaves / subnodes value is stored in a constant so if it is made configurable then only the constant needs to be paid attention to rather than multiple locations in the class.

OK, sounds good. I must admit, I was playing devil's advocate here, and did not see any immediate reason to be able to change it either, but you can probably bet your life that _someone_ is going to come up with this requirement as soon as the feature is discovered... :-)

> <snip />
> ... So for a 10,000 page doc there are going to be a lot of nulls in the page tree. For now setting the toPDFString() to ignore the nulls rather than throw an exception gets round this and allows the document to be correctly generated. In my tests all the pages are produced in the correct order. I was wondering though if there are any cases where the pages might not be passed in in the correct order (and hence might possibly explain why the notifyKidsRegistered() method was written in the way it is), and if so if that has any implications on the way I have written the balanced page tree code updates.

I think the original idea was that PDF would, in the long run, also be able to do out-of-order rendering (i.e. if page N in a document would be completely resolved, and thus could be rendered, before page N-1 --in that case, the null reference would be needed as a placeholder for the not-yet-finished page). At any rate, AFAIR, this was never actually implemented for PDF, so that explains why you see all pages in the correct order every time.

If it is cleaner to alter notifyKidRegistered() and avoid those nulls from being inserted in the first place, I would prefer that over just skipping them in toPDFString(). Not a must, though...

Regards

Andreas
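One way to avoid inserting nulls in the first place, while still tolerating pages that arrive out of order, is to key the references by page index in a sorted map. This is only a sketch of that idea: `OrderedKids` is a hypothetical class, and it is not how notifyKidRegistered() is implemented in FOP.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical null-free kids container; not FOP's PDFPages class. */
final class OrderedKids {
    /** Page references keyed by zero-based page index; TreeMap iterates in key order. */
    private final Map<Integer, String> refsByPageIndex = new TreeMap<>();

    /** Registers a page reference; arrival order does not matter and no padding is needed. */
    void register(int pageIndex, String reference) {
        refsByPageIndex.put(pageIndex, reference);
    }

    int count() {
        return refsByPageIndex.size();
    }

    /** Writes the kids as a PDF array in natural page order. */
    String toKidsArray() {
        return "[" + String.join(" ", refsByPageIndex.values()) + "]";
    }
}
```

The trade-off is a little bookkeeping per entry compared with a plain list, in exchange for never having to skip placeholders when counting or serializing.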
Re: Retrieving Objects question
Thanks a lot for your reply last week Andreas. Sorry for the delay. Been away and offline... FYI to follow up on the work I was doing: In the end I saw that references are indeed kept by the PDFDocument. So I decided it wouldn't do any harm (or take up any significant extra memory) to keep references to the objects themselves when I am constructing the balanced page tree. I have since modified PDFPages (and a small change in PDFPage) and the first working draft completed late yesterday keeps a list of sub-nodes (PDFPages, managed internally via a recursive algorithm - external methods work as before to avoid regressions) or leaves (PDFPage) as well as the original kids (may be a PDFPage or a sub PDFPages object) with PDF references to all children. This eliminates an overhead of looking up each object (potentially many times). I have successfully run it with test .fo files up to 10001 pages (each just showing 'Page x/y' where x is current page and y is total page count, takes a while with that many pages but not surprised) verifying that a balanced tree gets produced (and not a flat tree of one page tree object containing 10001 pages!). When each subnode is created the PDFFactory.makePages() method stores it in the trailer. That way the objects are all written out at the end after I have added all the pages to the right places, just before the cross reference table and trailer themselves are written. So now there are never more than 10 pages or 10 PDFPages (sub-nodes) per PDFPages object (I never mix sub-nodes and leaves on the same node). A similar structure to the page tree of the PDF 1.4 Reference document. Automatically generated on the fly. So for example a 101 page document will have a root PDFPages node with two sub-nodes underneath. The first will contain a count of 100, and have 10 sub-nodes, each containing 10 pages. The second will simply contain 1 page. 
More new pages will get added to the second sub-node (moving pages down to new sub-nodes to avoid more than 10 pages per node) until its count reaches 100 too, then another node created. Once 10 nodes under the root exist (at 1000 pages) they will get moved down below a new root level sub-node with a count of 1000, and a new root level sub-node created, and so on.

Next task is to write a JUnit test since one appears not to exist...

I guess remaining thoughts currently are:
- Wondering if keeping references to a page tree object's sub-nodes or leaves is the best way or can I improve it further? (Bearing in mind memory usage and performance.)
- Was wondering if the trailer objects list is the right place to write the new sub-node PDFPages objects. (But if writing an object to the objects list - addObject() instead of addTrailerObject() - it gets written out too soon before I have added all the pages.) But given how it writes the objects out before writing the xref and trailer it seems OK and parses and shows fine in PDFBox/PDFDebugger and the evince PDF Reader in Ubuntu.
- When registering the pages themselves via notifyKidsRegistered() method it extracts the page index number and puts the reference at that index in the kids list, filling empty spaces ahead of it with nulls. So when counting kids and writing out the pdf code text I had to ignore nulls and 'gaps' in the kids list since not all the kids are in the same list any more (spread across multiple page tree nodes). I was wondering why this method was written like this, and doesn't simply append new pages to the end of the list all the time.

Once testing is complete I'll submit the code internally for the in-team committers to review as I did with the 128 bit encryption work last month...

Thanks!

-Mike

On 25/05/11 21:57, Andreas L. Delmelle wrote:
> On 25 May 2011, at 09:45, Michael Rubin wrote:
>
> Hi Mike
>
>> Hello there. In the PDFPages class the kids are stored as reference strings (e.g. 23 0 R). Each of these objects is a PDFPage object. Do you know if there is a method somewhere that I can retrieve the PDF java object based on the reference string?
>
> Not really, AFAIK. What you do have is various Collections of different subtypes of PDFObject, available by means of accessors on PDFDocument. I guess the closest you would get without too much effort is to obtain the one you're interested in, then iterate over its elements and check PDFObject.referencePDF() against the lookup string. You do have to know the type(s) of object you need in advance, though...
>
>> (I am aiming to add support for some of those kids being other PDFPages nodes to create a more balanced page tree.)
>
> Interesting. Looking forward to seeing more.
>
> Regards
>
> Andreas

Michael Rubin
Developer
T: +44 20 8238 7400
F: +44 20 8238 7401
mru...@thunderhead.com
Re: Retrieving Objects question
On 03 Jun 2011, at 10:54, Michael Rubin wrote:

Hi Mike

> Thanks a lot for your reply last week Andreas. Sorry for the delay. Been away and offline... FYI to follow up on the work I was doing:

<snip/>

> So for example a 101 page document will have a root PDFPages node with two sub-nodes underneath. The first will contain a count of 100, and have 10 sub-nodes, each containing 10 pages. The second will simply contain 1 page. More new pages will get added to the second sub-node (moving pages down to new sub-nodes to avoid more than 10 pages per node) until its count reaches 100 too, then another node created. Once 10 nodes under the root exist (at 1000 pages) they will get moved down below a new root level sub-node with a count of 1000, and a new root level sub-node created, and so on.

Cool! Impressive work. Will the number of pages per node be configurable?

> Next task is to write a JUnit test since one appears not to exist... I guess remaining thoughts currently are:
> - Wondering if keeping references to a page tree object's sub-nodes or leaves is the best way or can I improve it further? (Bearing in mind memory usage and performance.)

It depends a bit on whether you are thereby keeping PDFPage objects alive longer than necessary. The current design only stores the pages' referencePDF, so that seems safe.

> - Was wondering if the trailer objects list is the right place to write the new sub-node PDFPages objects. (But if writing an object to the objects list - addObject() instead of addTrailerObject() - it gets written out too soon before I have added all the pages.) But given how it writes the objects out before writing the xref and trailer it seems OK and parses and shows fine in PDFBox/PDFDebugger and the evince PDF Reader in ubuntu.

I would think that that is the correct place, although I must admit, I would have to check the PDF Spec to be certain.

> - When registering the pages themselves via notifyKidsRegistered() method it extracts the page index number and puts the reference at that index in the kids list, filling empty spaces ahead of it with nulls. So when counting kids and writing out the pdf code text I had to ignore nulls and 'gaps' in the kids list since not all the kids are in the same list any more (spread across multiple page tree nodes). I was wondering why this method was written like this, and doesn't simply append new pages to the end of the list all the time.

AFAICT, what it is designed to do is make sure that the page is entered at the correct index in the list of kids. It would only create null entries if the list is not yet large enough. I have a feeling this is just by design, taking into account a single page tree node only (see the javadoc of the PDFPages class...)

Regards

Andreas
---
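The layout discussed in this message (at most 10 kids per node, never mixing leaves and sub-nodes on one node) can be sketched as follows. This is only a simplified bottom-up construction over a known list of page references, not the incremental algorithm Mike describes, and PageTreeNode and BalancedPageTree are hypothetical stand-ins rather than FOP's PDFPages and PDFPage classes. The limit of 10 is taken from the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a page tree entry; not FOP's PDFPages or PDFPage. */
final class PageTreeNode {
    final String ref;                                  // e.g. "23 0 R" for a leaf page, null for an intermediate node
    final List<PageTreeNode> kids = new ArrayList<>(); // stays empty for a leaf page
    int count;                                         // number of leaf pages at or below this node

    PageTreeNode(String ref) {
        this.ref = ref;
        this.count = (ref != null) ? 1 : 0;
    }

    void addKid(PageTreeNode kid) {
        kids.add(kid);
        count += kid.count;
    }
}

/** Assumption: all page references are known up front, so the tree can be built level by level. */
final class BalancedPageTree {
    static final int MAX_KIDS = 10; // limit mentioned in the thread

    static PageTreeNode build(List<String> pageRefs) {
        if (pageRefs.isEmpty()) {
            return new PageTreeNode(null); // empty root
        }
        // Level 0: one leaf per page reference.
        List<PageTreeNode> level = new ArrayList<>();
        for (String ref : pageRefs) {
            level.add(new PageTreeNode(ref));
        }
        // Group each level into parents of at most MAX_KIDS until one root remains.
        do {
            List<PageTreeNode> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += MAX_KIDS) {
                PageTreeNode parent = new PageTreeNode(null);
                int end = Math.min(i + MAX_KIDS, level.size());
                for (PageTreeNode kid : level.subList(i, end)) {
                    parent.addKid(kid);
                }
                parents.add(parent);
            }
            level = parents;
        } while (level.size() > 1);
        return level.get(0);
    }

    public static void main(String[] args) {
        List<String> refs = new ArrayList<>();
        for (int i = 1; i <= 101; i++) {
            refs.add(i + " 0 R");
        }
        PageTreeNode root = build(refs);
        System.out.println(root.count + " pages, " + root.kids.size() + " kids under the root");
    }
}
```

For 101 pages this gives a root with two sub-nodes, the first with a count of 100 and 10 sub-nodes of 10 pages each. Unlike the structure described in the thread, the remaining single page ends up wrapped in its own intermediate node, because every level is grouped uniformly.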
Retrieving Objects question
Hello there.

In the PDFPages class the kids are stored as reference strings (e.g. 23 0 R). Each of these objects is a PDFPage object. Do you know if there is a method somewhere with which I can retrieve the PDF Java object based on the reference string?

(I am aiming to add support for some of those kids being other PDFPages nodes to create a more balanced page tree.)

Thanks.

-Mike

Michael Rubin
Developer
T: +44 20 8238 7400
F: +44 20 8238 7401
mru...@thunderhead.com

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it.
Re: Retrieving Objects question
On 25 May 2011, at 09:45, Michael Rubin wrote:

Hi Mike

> Hello there. In the PDFPages class the kids are stored as reference strings (e.g. 23 0 R). Each of these objects are PDFPage objects. Do you know if there is a method somewhere that I can retrieve the PDF java object based on the reference string?

Not really, AFAIK. What you do have is various Collections of different subtypes of PDFObject, available by means of accessors on PDFDocument. I guess the closest you would get without too much effort is to obtain the one you're interested in, then iterate over its elements and check PDFObject.referencePDF() against the lookup string. You do have to know the type(s) of object you need in advance, though...

> (I am aiming to add support for some of those kids being other PDFPages nodes to create a more balanced page tree.)

Interesting. Looking forward to seeing more.

Regards

Andreas
---
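The lookup suggested in this reply (iterate over a collection and compare referencePDF() against the reference string) might look roughly like this. HasReference, StubPage and ReferenceLookup are hypothetical stand-ins so that the example compiles on its own; only the method name referencePDF() comes from the thread, and its String return type is an assumption here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for the part of PDFObject used here. */
interface HasReference {
    String referencePDF(); // assumed to return a string such as "23 0 R"
}

/** Hypothetical page object identified only by its object number. */
final class StubPage implements HasReference {
    private final int objectNumber;

    StubPage(int objectNumber) {
        this.objectNumber = objectNumber;
    }

    @Override
    public String referencePDF() {
        return objectNumber + " 0 R";
    }
}

final class ReferenceLookup {
    /** Linear scan: returns the first candidate whose reference string matches. */
    static <T extends HasReference> Optional<T> find(Iterable<T> candidates, String reference) {
        for (T candidate : candidates) {
            if (reference.equals(candidate.referencePDF())) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<StubPage> pages = Arrays.asList(new StubPage(22), new StubPage(23));
        System.out.println(find(pages, "23 0 R").isPresent());
    }
}
```

Each lookup is a full scan of the collection, which is the per-object overhead that keeping direct object references would avoid.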
Re: Event broadcasting and listening question - solved!
> [..] it referenced in the Listener? Or just the producer?

My original idea was that this object gives the event handler a chance to intercept and modify the object that is the event origin. In most cases it will certainly be ignored, but someone might find it handy. And to have it in the producer means you also have to have it in the PDFEventListener. See also java.util.EventObject, from which org.apache.fop.events.Event is derived.

> 2. Why should we get the PDFDocument object from the Encryption class?

It's a PDFObject, right? So it should already have the PDFDocument. That makes access to the PDFEventListener easy.

> Should the listener not be passed into the Encryption class via its constructor rather than having to go fetch the listener?

Both are valid ways, but since I expect the PDFDocument to already be set, I see no point in giving more information that can otherwise be easily accessed. Well, it could be that your event happens before the PDFDocument is set on that object (see PDFDocument.setEncryption()). In that case PDFEncryptionManager might have to be changed to pass in the PDFDocument immediately. Or you pass in the PDFEventListener, although I find the former more useful and flexible.

> Thanks! -Mike
Re: Event broadcasting and listening question
Thanks a lot for your response Jeremias. I have now done the following:

- Added 'void warnRevision3PermissionsIgnored(Object source);' (and its javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added a corresponding entry to the xml. Removed the org.apache.fop.pdf.PDFEventProducer class and xml.
- Created the org.apache.fop.pdf.PDFEventListener interface containing just 'void warnRevision3PermissionsIgnored(Object source);'.
- Created PDFLibraryEventAdaptor in the org.apache.fop.render.pdf package that implements the PDFEventListener. (Currently it just contains my new event. Should I also add the existing 2 render.pdf events to this class?)

I can also see how to obtain the PDFDocument object from the PDFEncryptionJCE class via the getDocumentSafely() method. But I am not sure how to get the event broadcaster from that object. How is this done?

Thanks!

-Mike
Re: Event broadcasting and listening question
On 12.05.2011 10:44:41 Michael Rubin wrote:

> Thanks a lot for your response Jeremias. I have now done the following:
> - Added 'void warnRevision3PermissionsIgnored(Object source);' (and its javadoc) to PDFEventProducer in the org.apache.fop.render.pdf package and added a corresponding entry to the xml. Removed the org.apache.fop.pdf.PDFEventProducer class and xml.
> - Created org.apache.fop.pdf.PDFEventListener interface containing just 'void warnRevision3PermissionsIgnored(Object source);'.
> - Created PDFLibraryEventAdaptor in the org.apache.fop.render.pdf package that extends the PDFEventListener. (Currently just contains my new event. Should I also add the existing 2 render.pdf events to this class?)

Or do it the other way around: add your new event to PDFEventProducer. It doesn't make sense to have two.

> I can also see how to obtain the PDFDocument object from the PDFEncryptionJCE class via the getDocumentSafely() method. But I am not sure how to get the event broadcaster from that object. How is this done?

    public class PDFDocument {
        [..]
        private PDFEventListener listener;
        [..]
        public void setListener(PDFEventListener listener) {
            this.listener = listener;
        }

        PDFEventListener getListener() {
            return this.listener;
        }
        [..]

That's the simplest way and should probably be sufficient. If we wanted to get fancy, we could handle a List<PDFEventListener>.

In PDFDocumentHandler.startDocument():

    this.pdfDoc.setListener(new PDFLibraryEventAdapter(getUserAgent().getEventBroadcaster()));

So, the PDFDocument doesn't actually get an EventBroadcaster. PDFDocument calls the PDFLibraryEventAdapter and that one in turn calls the EventBroadcaster. Nicely decoupled.

> Thanks! -Mike
Event broadcasting and listening question
Hello there.

I have been busy implementing 128 bit PDF encryption for FOP. I have already got it working successfully, but one issue remains that I have a question about.

In the org.apache.fop.pdf.PDFEncryptionJCE.init() method there is one place where I want to broadcast an event message. I looked at http://xmlgraphics.apache.org/fop/trunk/events.html to learn about events. However it just shows

    EventBroadcaster broadcaster = [get it from somewhere];

and doesn't show how I should be getting the broadcaster. After looking in the code in the AFP package for existing examples, I put together the following, which seems to work on testing:

    FopFactory fopFactory = FopFactory.newInstance();
    FOUserAgent agent = fopFactory.newFOUserAgent();
    EventBroadcaster eventBroadcaster = agent.getEventBroadcaster();
    PDFEventProducer eventProducer = PDFEventProducer.Provider.get(eventBroadcaster);
    eventProducer.warnRevision3PermissionsIgnored(this);

This creates a new FopFactory, from which I create a new FOUserAgent, from which I can get the event broadcaster to supply to my event producer. (I had to create a PDFEventProducer which extends EventProducer, plus PDFEventProducer.xml which contains the message mapping.)

In this case a new EventBroadcaster will be created every time, so I am not sure existing listeners will pick up the event. So is there a recommended way that I can get an existing event broadcaster to use? Or is the above way the correct way to do it after all?

Version of FOP is v1.0. Platform is Ubuntu Linux, running from within the Eclipse IDE.

Thanks!

-Mike
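The concern raised in this message, that a freshly created broadcaster will not reach listeners registered elsewhere, can be illustrated with a toy model. RunBroadcaster is a hypothetical stand-in and not FOP's EventBroadcaster; the only assumption is that listeners are registered per broadcaster instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in: each instance keeps its own list of listeners. */
final class RunBroadcaster {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    void broadcast(String message) {
        for (Consumer<String> listener : listeners) {
            listener.accept(message);
        }
    }
}

final class BroadcasterDemo {
    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();

        // The broadcaster of the current rendering run, with the user's listener attached.
        RunBroadcaster current = new RunBroadcaster();
        current.addListener(seen::add);

        // A brand-new broadcaster has no listeners, so this event reaches nobody.
        new RunBroadcaster().broadcast("permissions ignored");
        System.out.println("after new broadcaster: " + seen.size());

        // The same event sent through the run's own broadcaster is delivered.
        current.broadcast("permissions ignored");
        System.out.println("after current broadcaster: " + seen.size());
    }
}
```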
Re: Event broadcasting and listening question
Hi Michael

Creating a new EventBroadcaster is obviously wrong. The idea is that the user can get events for each FOP rendering run separately (unlike logging, where concurrent runs get mixed up). So you have to get hold of the EventBroadcaster that belongs to the current rendering run.

Obviously, you don't have access to the FOUserAgent in the PDF library. That is intentional: the PDF library should remain reasonably independent of as much FOP code as possible, in case we ever factor it out into a separate component/module or move it to XML Graphics Commons.

My suggestion is to follow a similar path as in org.apache.fop.fonts:

- Create an interface for the events coming out of the PDF library (see FontEventListener). Let's call it PDFEventListener or something like that and put it in the org.apache.fop.pdf package.
- Move your PDFEventProducer (corresponds to FontEventProducer) into org.apache.fop.render.pdf, as this package is the glue between FOP and PDF output.
- Create a PDFLibraryEventAdapter (implements PDFEventListener) in the org.apache.fop.render.pdf package (corresponds to FontEventAdapter). The PDFLibraryEventAdapter gets the EventBroadcaster from the PDFDocumentHandler, which is responsible for instantiating the PDFDocument and the PDFLibraryEventAdapter.
- Add the adapter as a listener to a ListPDFEventListener that you can add to PDFDocument. From PDFEncryptionJCE you should have access to the PDFDocument via the getDocumentSafely() method.

That nicely decouples FOP's event subsystem from the PDF library.

HTH
Jeremias Maerki
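The decoupling suggested in the reply above can be sketched in plain Java. This is only an illustration under stated assumptions, not FOP code: PDFEventListener, ListPDFEventListener and PDFLibraryEventAdapter are the names proposed in the thread, while RunBroadcaster, the method revision3PermissionsIgnored() and the message text are hypothetical stand-ins (the real EventBroadcaster and event producer APIs are not reproduced here).

```java
import java.util.ArrayList;
import java.util.List;

// Library side (would live in org.apache.fop.pdf): knows nothing about FOP's event subsystem.
interface PDFEventListener {
    void revision3PermissionsIgnored(Object source); // hypothetical method name
}

// Composite listener that forwards each call to every registered listener.
class ListPDFEventListener implements PDFEventListener {
    private final List<PDFEventListener> listeners = new ArrayList<>();

    void addListener(PDFEventListener listener) {
        listeners.add(listener);
    }

    @Override
    public void revision3PermissionsIgnored(Object source) {
        for (PDFEventListener listener : listeners) {
            listener.revision3PermissionsIgnored(source);
        }
    }
}

// Hypothetical, simplified stand-in for the per-run EventBroadcaster.
interface RunBroadcaster {
    void broadcast(String message);
}

// Glue side (would live in org.apache.fop.render.pdf): translates library
// callbacks into events on the broadcaster of the current rendering run.
class PDFLibraryEventAdapter implements PDFEventListener {
    private final RunBroadcaster broadcaster;

    PDFLibraryEventAdapter(RunBroadcaster broadcaster) {
        this.broadcaster = broadcaster;
    }

    @Override
    public void revision3PermissionsIgnored(Object source) {
        broadcaster.broadcast("Revision 3 permissions ignored by " + source);
    }
}

class PdfEventDemo {
    // Wires the pieces the way a document handler might, then fires one event.
    static List<String> run() {
        List<String> received = new ArrayList<>();
        RunBroadcaster broadcaster = received::add; // collects messages for this run only

        ListPDFEventListener documentListeners = new ListPDFEventListener();
        documentListeners.addListener(new PDFLibraryEventAdapter(broadcaster));

        // The encryption code would only ever see the PDFEventListener interface.
        documentListeners.revision3PermissionsIgnored("PDFEncryptionJCE");
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the indirection is that the library-side code depends only on PDFEventListener, so a broadcaster is never created inside the PDF package; whoever sets up the rendering run decides where the events go.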
Re: CID Font Question
Hi Mehdi

Interesting problem. Apparently, the overwhelming majority of TrueType fonts map glyph index 1 to .null and glyph index 2 to nonmarkingreturn (for carriage returns and such). Your version of the Frutiger 45 Light font apparently doesn't, but has space on glyph index 1.

Those three blind indices have been in FOP's codebase since the addition of CID subsets in 2001:
http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/org/apache/fop/render/pdf/fonts/MultiByteFont.java?r1=194167&r2=194168&pathrev=195822

Funny that something like this shows up after so much time. Anyway, since we only need to embed a .notdef as index 0 plus the glyphs we really need, I think it is safe to remove the fixed indices 1 and 2; the whole thing then seems to work for both kinds of fonts.

I'd say, put your patch in Bugzilla!

Jeremias Maerki
Re: CID Font Question
Excellent. I traced it back to the same commit, which, as you can imagine, perplexed me further. I'll post a fix soon.

Mehdi
CID Font Question
Hi Guys,

I found an issue with a TrueType font in PDF. I have attached a PDF with the possible bug (buggy.pdf) and one with my fix (fixed.pdf). The issue is that if you copy/paste the text from the normal-weighted font (top line) of the PDF, the (space) and ! (exclamation mark) characters are mapped to unicode index \u.

Initially I thought this was a bug in the font, so I looked at the cmap table in the font to see what unicode index these glyphs were mapped to, and I found that they were the 2nd and 3rd entries in the cmap table. This tickled my curiosity because all the fonts I remember (and I checked a couple to be sure) have the first 3 glyphs mapped to \u or \u, and their CID is .notdef. The BOLD version of the same font (in both PDFs) works fine, and as expected the first 3 glyphs are mapped to \u, \u and \u respectively.

I also checked the code base, and o.a.f.fonts.CIDSubset has the following lines of code:

    /**
     * Adds the initial 3 glyphs which are the same for all CID subsets.
     */
    public void setupFirstThreeGlyphs() {
        // Make sure that the 3 first glyphs are included
        usedGlyphs.put(new Integer(0), new Integer(0));
        usedGlyphsIndex.put(new Integer(0), new Integer(0));
        usedGlyphsCount++;
        usedGlyphs.put(new Integer(1), new Integer(1));
        usedGlyphsIndex.put(new Integer(1), new Integer(1));
        usedGlyphsCount++;
        usedGlyphs.put(new Integer(2), new Integer(2));
        usedGlyphsIndex.put(new Integer(2), new Integer(2));
        usedGlyphsCount++;
    }

So I checked the specification, and nowhere does it suggest that the first THREE are reserved. It does, however, say that CID 0 should be .notdef (see the quote below, p340 of the PDF spec):

    Every CIDFont must contain a glyph description for CID 0, which is
    analogous to the .notdef character name in simple fonts (see
    "Handling Undefined Characters" on page 355).

My question is this: is this a FOP bug, or is it a bug in the font we're using? If it's a FOP bug, I'd be more than happy to fix it (delete the 6 lines and change the method name). If, however, it's a font bug, then which spec should I be looking at? What is the bug? I should also mention that I started with the TTF spec, and it doesn't suggest that any glyphs are reserved.

Any help on this would be very much appreciated,

Mehdi

Attachments: fixed.pdf (Adobe PDF document), buggy.pdf (Adobe PDF document)
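As a rough illustration of the proposed fix (reserve only CID 0 and let every other glyph go through the normal mapping path), here is a minimal Java sketch. It is an assumption-laden stand-in, not FOP's CIDSubset: the class GlyphSubsetSketch, its methods and the Unicode bookkeeping are all hypothetical, and the idea that pre-reserved indices miss out on a Unicode mapping is one plausible reading of the symptom, not something confirmed in the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a CID subset; not FOP's actual CIDSubset.
class GlyphSubsetSketch {
    // original glyph index -> subset glyph index, in order of first use
    private final Map<Integer, Integer> usedGlyphs = new LinkedHashMap<>();
    // subset glyph index -> Unicode character, used later for text extraction
    private final Map<Integer, Character> unicodeBySubsetIndex = new LinkedHashMap<>();

    GlyphSubsetSketch() {
        // Only CID 0 (.notdef) is required; indices 1 and 2 are NOT reserved.
        usedGlyphs.put(0, 0);
    }

    // Returns the subset index for a glyph, registering it on first use.
    int mapGlyph(int originalIndex, char unicode) {
        Integer subsetIndex = usedGlyphs.get(originalIndex);
        if (subsetIndex == null) {
            subsetIndex = usedGlyphs.size();
            usedGlyphs.put(originalIndex, subsetIndex);
        }
        // Every glyph that is actually used records its character.
        unicodeBySubsetIndex.putIfAbsent(subsetIndex, unicode);
        return subsetIndex;
    }

    // Returns the recorded character, or U+0000 if none was recorded.
    char unicodeFor(int subsetIndex) {
        return unicodeBySubsetIndex.getOrDefault(subsetIndex, '\u0000');
    }

    int size() {
        return usedGlyphs.size();
    }

    public static void main(String[] args) {
        GlyphSubsetSketch subset = new GlyphSubsetSketch();
        // In the problematic font, the space glyph sits at original index 1.
        int spaceIndex = subset.mapGlyph(1, ' ');
        System.out.println(spaceIndex + " -> U+"
                + Integer.toHexString(subset.unicodeFor(spaceIndex)));
    }
}
```

With only index 0 pre-registered, a font that puts a real character at glyph index 1 or 2 is handled like any other glyph, while fonts that keep .null and nonmarkingreturn there simply never request those indices.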
Re: A question about working on apache fop
Dear Simon, dear Vincent, dear FOP developers,

after a lot of discussions (internal and external), considerations and testing we have finally made up our minds. We will implement a new engine based on open-source libraries like Cairo and Pango. The engine will be written in F# using Mono/.NET. It will be a kind of .NET XSL-FO processor.

Of course FOP will be one of our sources of inspiration, and I know that we have a lot of work to do in order to get equal or even better results.

Best regards
Martin Sievers

--
Diplom-Mathematiker Martin Sievers
Kompetenzzentrum für elektronische Erschließungs- und Publikationsverfahren in den Geisteswissenschaften
Universität Trier
Fachbereich II / Germanistik
Universitätsring 15
54286 Trier
Projects: Textgrid (Printmodul) / Workspace for Collaborative Editing
Room: DM333 (3.OG B)
Phone: 0651 201-3017
Fax: 0651 201-3589
Skype: martinsievers
E-Mail: siev...@uni-trier.de
http://www.kompetenzzentrum.uni-trier.de
Question on MimeConstants
Hello All,

I just downloaded the FOP jar file. Our project involves converting XML to PDF. While compiling your examples, we found that in

    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);

MIME_PDF is not defined in MimeConstants.java.

Please help with this one.

Thank you,
Dickson Robert
518-402-5404
RE: Question on MimeConstants
    public interface MimeConstants extends org.apache.xmlgraphics.util.MimeConstants {

Check out the other class(es). There's more than one set of MimeConstants combined here: FOP's MimeConstants extends org.apache.xmlgraphics.util.MimeConstants, so constants that are not declared in FOP's own file, such as MIME_PDF, are inherited from the parent interface.
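The mechanism behind that answer is ordinary Java interface inheritance: a constant declared in a parent interface is reachable through the child interface's name. The sketch below uses hypothetical stand-in interfaces (CommonsMimeConstants, FopMimeConstants, and the constant MIME_EXAMPLE_ONLY are made up for illustration); only the value "application/pdf" for MIME_PDF is taken as given.

```java
// Hypothetical stand-in for the parent interface (the role played by
// org.apache.xmlgraphics.util.MimeConstants).
interface CommonsMimeConstants {
    String MIME_PDF = "application/pdf";
}

// Hypothetical stand-in for the child interface (the role played by FOP's MimeConstants).
// It declares its own constants and inherits MIME_PDF from the parent.
interface FopMimeConstants extends CommonsMimeConstants {
    String MIME_EXAMPLE_ONLY = "application/x-example"; // made-up constant for illustration
}

class MimeConstantsDemo {
    static String pdfMimeType() {
        // Compiles even though MIME_PDF is not declared in FopMimeConstants itself.
        return FopMimeConstants.MIME_PDF;
    }

    public static void main(String[] args) {
        System.out.println(pdfMimeType());
    }
}
```

So a constant missing from the child's source file is not necessarily missing at compile time, provided the jar that contains the parent interface is on the classpath as well.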
Re: A question about working on apache fop
Hi Martin and Roland,

The FOP team is pleased to accept contributions to FOP, large and small. Contributions are best submitted as a patch attached to an issue in our bug tracking system, Bugzilla: http://issues.apache.org/bugzilla/enter_bug.cgi?product=Fop. If you plan a larger contribution, such as the one you intend to develop, it is useful to create a single Bugzilla issue and attach subsequent patches to it.

We will create a Subversion branch for your project, to which we will add your patches. We will also keep the branch up to date with respect to the main code (Subversion trunk), so that integration of your contributions with the main code is tested automatically. Our Ant build system allows one to test the code with JUnit, Checkstyle and FindBugs. We encourage contributors to run those tests. We also encourage them to create and submit test cases for their new functionality, features and bug fixes.

The use of a Subversion branch allows you to submit an early implementation to FOP and discard it later.

In view of the complexity of the work, it may be useful to create design documents first, along with theoretical considerations about the algorithms used. You may publish these in your own web space, but you may also use FOP's wiki, http://wiki.apache.org/xmlgraphics-fop/DeveloperPages, for that purpose.

The code in our Subversion repository is automatically synchronized to a git repository, git://git.apache.org/fop.git or https://github.com/apache/fop. See also http://wiki.apache.org/general/GitAtApache.

For larger contributions, the Apache Software Foundation (ASF) wishes to have a contributor license from all copyright owners (authors or their employers) of the submitted code, see http://www.apache.org/licenses/#clas. This enables the ASF to release the code under the Apache License, version 2. See also 'How can I contribute?', http://xmlgraphics.apache.org/fop/dev/faq.html#faq-N1000D, and some parts of the 'Guide for new committers', http://www.apache.org/dev/new-committers-guide.html.

All contributions must go via patches in Bugzilla, so that it is clear that you submitted them under your contributor license with the ASF. Therefore, if you maintain a public git branch or a public branch in another distributed VC system, we would not pull directly from it.

Your first steps would be anything you need to do to arrive at your first submitted patch: design, code, test, submit. You could open a Bugzilla issue early if you wish. You could create a wiki page for your project and add design documents to it.

We hope to see your contributions to FOP, to the benefit of all its users.

Best,
Simon Pepping
Re: A question about working on apache fop
Hello,

Welcome to the FOP project. In addition to Simon’s notes, I have a few more specific comments.

In principle I would be reluctant to work on any features that have not been standardized yet. The way they are specified may heavily impact the design, so you face the danger of having to throw away all your work and start again. This can be mitigated by a careful definition of the object model, abstractions and encapsulations, but the risk is still there.

As you mentioned, any non-standardized element should be defined in its own extension namespace and developed as a plug-in to FOP. This avoids conflicts and backwards incompatibilities with later versions of the Recommendation.

Any work on typesetting is inherently difficult, but FOP has a few additional challenges of its own. At the moment the layout engine is not really designed to be pluggable, with well-defined extension points. That would need to be addressed. Also, the data model it’s based on (inspired by Dr Knuth’s work on digital typography) is fundamentally limited and doesn’t work well for e.g. tables. Some time ago I started to work on the prototype of a new layout engine, but unfortunately I was sidetracked by more urgent work. I’m planning to restart this work at some point in the future.

The object-oriented paradigm applies well to XSL-FO. Implementing this paradigm in the layout engine is IMO the only way to keep it manageable and maintainable, especially if we are to add such complex new features (XSL-FO 1.1 alone is already extremely complex!). We’re not quite there yet, but this is doable.

In short, a lot of work and challenges, but this is a very interesting area, and your full-time developer will definitely be welcome.

Vincent
A question about working on apache fop
Dear developers, dear members of the Project Management,

we work on a research project called XML-Print at the University of Trier, Germany. The idea is to implement (or improve) an XML to PDF typesetter with an easy-to-use GUI which helps humanists to publish their critical editions, dictionaries etc. It will be part of the toolkit TextGrid Lab, which is a long-term project to develop a general framework containing different tools for collaborative work on digital documents (http://www.textgrid.de/en/startseite.html).

Having looked at existing approaches, FOP seems to be a stable and promising base to build on. However, there are some features missing, either not yet implemented in FOP itself or not even defined in XSL-FO 1.1. We therefore would have to implement features based on XSL-FO 1.1, but also on the requirements for XSL-FO 2.0 as described in http://www.w3.org/TR/xslfo20-req/. Among others we are especially interested in some elements mentioned in the current design draft, like

- marginalia (2.2.3)
- side-by-side flows (2.2.6)
- line numbering (2.2.7.1) **
- cross references (2.2.8) **

The line numbering will also involve some more complex issues, not only a simple line numbering of every n-th line. For example, there could be interactions between line numbers and marginalia, which have to be considered in the typesetting process.

We would also have to design and implement new layout features currently not mentioned in any XSL-FO design draft we have seen, like the usage of a complex bibliographic apparatus or a grid typesetting feature. There are also requirements for complex footnotes, which may lead to an extension of the currently available footnote mechanism in the XSL-FO standard.

At the current point in our work we wonder how we can use the current status of FOP, how we can embed our work into future releases and, last but not least, give some work back to the community. One developer would work full-time on FOP for at least one year.

Furthermore we would like to know if an early implementation of requirements -- using a separate namespace of course -- is somehow wanted and if there is any other group working on them.

What would be the next steps for us?

Thank you for any responses.

Best regards and Happy Holidays from
Martin Sievers and Roland Schwarz
A (long) question on implementing fo:retrieve-table-marker
I take as reference: http://www.w3.org/TR/xsl/

To begin with, I've built a simple example / test case related to my customer's needs but closely based on the one in 6.13.1.1.2: http://www.w3.org/TR/xsl/#d0e14681

This example looks good and, though not trivial to understand at first sight, it is a common-sense use of fo:retrieve-table-marker. It has a sort of conditional row that is displayed (or not) in the table footer and is defined just after table-body:

    <fo:marker marker-class-name="continued">
      <fo:table-row>
        <fo:table-cell>
          <fo:block>Table continued...</fo:block>
        </fo:table-cell>
      </fo:table-row>
    </fo:marker>

Now if I run my test case against the trunk of FOP I get this error message:

    GRAVE: Exception javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException: file:/D:/Java/fop-0.95beta/test-rtm.fo:533:112: Error(533/112): fo:retrieve-table-marker is not a valid child element of fo:table-header.

For the sake of clarity only, let's call this a bug, since fo:retrieve-table-marker *is* a valid child element of fo:table-header: http://www.w3.org/TR/xsl/#fo_retrieve-table-marker

So let's read the full stack trace and fix this by adding one more condition in org.apache.fop.fo.flow.table.TablePart.validateChildNode().

The next step is where I'm puzzled and don't know how to proceed, since the W3C recommendation and the example it gives seem inconsistent with each other, while the error message FOP returns seems consistent with the recommendation:

    GRAVE: Exception javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException: {http://www.w3.org/1999/XSL/Format}table-row is not a valid child of fo:marker! (See position 557:21)

This error is consistent with the W3C TR (http://www.w3.org/TR/xsl/#fo_marker), which says that the content of fo:marker is:

    (#PCDATA|%inline;|%block;)*

    %inline;  http://www.w3.org/TR/xsl/#inline.fo.list
    %block;   http://www.w3.org/TR/xsl/#block.fo.list

but it makes the example they give useless (indeed, neither inline nor block allows table-row, directly or indirectly).

And so I don't quite know where I'm allowed to go from here. (If I change the behaviour of org.apache.fop.fo.flow.Marker.validateChildNode() I may end up with something that works and fits the bill but is inconsistent with the TR.)

Any comment appreciated!

Jean-Francois El Fouly
Re: A (long) question on implementing fo:retrieve-table-marker
On Sep 2, 2008, at 11:21, Jean-François El Fouly wrote:

Just another hint I thought of:

> To begin with, I've built a simple example / test case related to my
> customer's needs but fairly based on the one in 6.13.1.1.2:
> http://www.w3.org/TR/xsl/#d0e14681
> This example looks good and though not trivial to understand at first
> sight, it is a common-sense use of fo:retrieve-table-marker. It has a
> sort of conditional row that is displayed (or not) in the table footer
> and is defined just after table-body:
>
>     <fo:marker marker-class-name="continued">
>       <fo:table-row>
>         <fo:table-cell>
>           <fo:block>Table continued...</fo:block>
>         </fo:table-cell>
>       </fo:table-row>
>     </fo:marker>

Given the problems you have run into so far, I'd advise first trying to get the retrieval of a simple fo:block into the header/footer working. That would already help most people who need this, I think. Once that works, extending the approach to also work with entire table-parts should prove a relatively simple exercise.

Cheers
Andreas
Re: A (long) question on implementing fo:retrieve-table-marker
On Sep 2, 2008, at 11:21, Jean-François El Fouly wrote:

... and another detail, to facilitate studying the process through debugging:

> <snip />
> file:/D:/Java/fop-0.95beta/test-rtm.fo:533:112: Error(533/112): fo:retrieve-tab
                                         ^^^

For your own convenience, trim that testcase down! ;-)

Seriously, try to generate extra space by means of space-before and/or space-after to generate multiple pages. Use an extra small page-size to limit the amount of content needed to trigger page-breaks... Generating dummy content, leading to a FO of 500+ lines, will not make the job any easier.

Something I've also found helpful, in case you're unaware of it: specify ids on the FOs to easily track the corresponding FObj and their related LayoutManager in a debugger.

Andreas
FOText question
Hi fopsters

I've just been working on changing FOText to work with a java.nio.CharBuffer (instead of a char[]). So far, while refactoring this, I already noticed that this makes the related code much more compact on our side. Another small improvement is that TextLayoutManager no longer duplicates the FOText's character array, but simply shares the same array that is backing the FOText's CharBuffer (a slight reduction of the memory footprint; could be a significant amount in large documents).

Now, while I'm at it, I'm wondering whether it would be a good idea to have FOText implement Java 1.4's java.lang.CharSequence interface. This would mean that FOText gets a few extra methods (charAt(), length() and subSequence()), allowing us to use it in other parts of the code in the same fashion as a String or StringBuffer. The toString() method would need to be altered to follow the definition as specified in the API docs (i.e. only output the text content).

Any opinions on this? (or more importantly: Any objections?)

Cheers
Andreas
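To make the idea concrete, here is a minimal sketch of a text holder that is backed by a java.nio.CharBuffer and implements java.lang.CharSequence. It is a hypothetical stand-in (TextNodeSketch is not FOText, and the real class has many more responsibilities); it only shows that the four CharSequence methods can delegate to the buffer and that wrapping shares the caller's array instead of copying it.

```java
import java.nio.CharBuffer;

// Hypothetical stand-in for a text node backed by a CharBuffer; not FOP's FOText.
final class TextNodeSketch implements CharSequence {
    private final CharBuffer buffer;

    TextNodeSketch(char[] data, int start, int length) {
        // wrap() does not copy: the buffer is a view onto the caller's array.
        this.buffer = CharBuffer.wrap(data, start, length);
    }

    @Override
    public int length() {
        return buffer.length(); // number of remaining chars in the view
    }

    @Override
    public char charAt(int index) {
        return buffer.charAt(index); // index is relative to the start of the view
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return buffer.subSequence(start, end); // another view, still no copy
    }

    @Override
    public String toString() {
        return buffer.toString(); // only the text content, as CharSequence requires
    }

    public static void main(String[] args) {
        char[] data = "Hello FOP".toCharArray();
        TextNodeSketch text = new TextNodeSketch(data, 6, 3);

        // Usable wherever a CharSequence is accepted, e.g. StringBuilder.append().
        StringBuilder sb = new StringBuilder("text=").append(text);
        System.out.println(sb);
    }
}
```

One consequence of sharing the array is that later changes to it are visible through the view, so code that hands out the backing array has to treat it as read-only or accept that behaviour.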
Re: FOText question
Andreas,

On 15.06.2008 at 13:25, Andreas Delmelle wrote:

> Now, while I'm at it, I'm wondering whether it would be a good idea to
> have FOText implement Java 1.4's java.lang.CharSequence interface. This
> would mean that FOText gets a few extra methods (charAt(), length() and
> subSequence()), allowing us to use it in other parts of the code in the
> same fashion as a String or StringBuffer. The toString() method would
> need to be altered to follow the definition as specified in the API docs
> (i.e. only output the text-content).
>
> Any opinions on this? (or more importantly: Any objections?)

Not at all. I am still working my way through the layout code, trying to get the alignmentContext to work, so any reduction or simplification in this respect would only make it easier.

As for the implementation of the interface: if subSequence() is never used, then it is probably a good idea to implement it by throwing an UnsupportedOperationException, as this operation seems very complex and there would be no point implementing it unless it is used.

> Cheers
> Andreas

Max
Question about FOP 0.94 and 0.20.5
I am trying to run two versions of FOP, 0.94 and 0.20.5, in the same application. I rebuilt FOP 0.94, renaming the packages (org.apache.fop to org.jcom.fop) and the jars so that the ClassLoader can load both libraries. As a result I get a class-definition-not-found error for FopFactory, along with other strange errors. Is it possible to run both versions together? Thanks in advance, Juanjo
Re: Question about FOP 0.94 and 0.20.5
On May 6, 2008, at 18:00, Juanjo Alejandro wrote: I am trying to run two versions of FOP, 0.94 and 0.20.5, in the same application. I rebuilt FOP 0.94, renaming the packages (org.apache.fop to org.jcom.fop) and the jars so that the ClassLoader can load both libraries. As a result I get a class-definition-not-found error for FopFactory, along with other strange errors. Is it possible to run both versions together? AFAIK, this will always cause trouble. The two versions are not meant to work together inside the same VM, and definitely not when loaded by the same ClassLoader. In any case, modifying the sources to try to work around this limitation would be a bit too invasive for my taste. A workaround you might consider is to set up a second container that hosts the old version. That one can then be phased out over time once you're certain that 0.94 suits your needs, and it is one sure way to keep both versions completely isolated from each other. HTH! Cheers Andreas
Yet another FOP + Unicode setup question
Greetings, I'm trying to get FOP to publish Russian from the Arial Unicode MS font. FOP v.94.

My environment settings:

# BEGIN writeEnvironmentReport($Revision: 1.29 $): Useful stuff found:
version.DOM.draftlevel=2.0fd
java.class.path=C:\Notification Manager\notificationmanagersrc.jar;C:\xalan\xalan.jar;C:\xalan\xercesImpl.jar;C:\FOP\fop.jar;C:\XML\docbook\xsl\extensions\xalan27.jar
version.JAXP=1.1 or higher
java.ext.dirs=C:\Program Files\Java\jre1.6.0_03\lib\ext;C:\WINXP\Sun\Java\lib\ext
version.xerces2=Xerces-J 2.7.1
version.xerces1=not-present
version.xalan2_2=Xalan Java 2.7.0
version.xalan1=not-present
version.ant=not-present
java.version=1.6.0_03
version.DOM=2.0
version.crimson=not-present
sun.boot.class.path=C:\Program Files\Java\jre1.6.0_03\lib\resources.jar;C:\Program Files\Java\jre1.6.0_03\lib\rt.jar;C:\Program Files\Java\jre1.6.0_03\lib\sunrsasign.jar;C:\Program Files\Java\jre1.6.0_03\lib\jsse.jar;C:\Program Files\Java\jre1.6.0_03\lib\jce.jar;C:\Program Files\Java\jre1.6.0_03\lib\charsets.jar;C:\Program Files\Java\jre1.6.0_03\classes
# BEGIN Listing XML-related jars in: foundclasses.java.class.path
xalan.jar-path=C:\xalan\xalan.jar
xercesImpl.jar-apparent.version=xercesImpl.jar from Xerces-J-bin.2.7.1
xercesImpl.jar-path=C:\xalan\xercesImpl.jar
#- END Listing XML-related jars in: foundclasses.java.class.path -
version.SAX=2.0
version.xalan2x=Xalan Java 2.7.0
#- END writeEnvironmentReport: Useful properties found: -
# YAHOO! Your environment seems to be OK.

My config font settings:

<fonts>
  <!-- automatically detect operating system installed fonts -->
  <auto-detect/>
</fonts>

And finally the command line I use (from the fop.bat file):

java %JAVAOPTS% -cp %LOCALCLASSPATH% org.apache.fop.cli.Main -c c:\fop\conf\fop.xconf %FOP_CMD_LINE_ARGS%

I get PDFs and for the most part FOP works fine. But I have a bit of XML ???/? that it just refuses to render... Any suggestions? Thanks, David White
Small question: line-layout and markers
Hi all Just a check to see if anyone knows off-hand how the availIPD.opt that is passed into Paragraph.startParagraph() in LineLM.collectInlineKnuthElements() would influence marker-retrieval. Reason I'm asking is that I noticed the TODO on collectInlineKnuthElements() to remove the availIPD parameter from its signature. So, I decided to try removing it and using the LayoutContext's stack limit instead. All tests pass, apart from markers_4 and markers_5b... Cheers Andreas
Re: FOrayFont integration in question
Hi All, So, not many opinions on this it seems. Thanks to Bertrand and Jeremias for their comments. I'll need to have a closer look at the current font library. As I was supposed to replace it with FOrayFont I have never studied it in detail yet. Then I'll see if it is best to keep it or to switch to a fork of FOrayFont. Although right now I've the feeling the former solution is preferable. My first two goals are to polish the removal of the XML metrics generation step (mainly add an optional parameter in the config file for specifying the name of the PFM file), and add support for AFM metrics files. Then... we'll see. Cheers, Vincent
FOrayFont integration in question
Hi all, Sorry for the long post, but I think this is an important one. I would like to have your feelings about the FOrayFont integration. Since I started to work on that (in July 2005), things have quite evolved and I'm starting to doubt that integrating FOrayFont really is a good thing for Fop. I've already discussed with some of you about this whole issue, but I think it might be worth summarizing the points, and making everyone aware of it. Because I've the feeling that whatever decision we make, this will be a difficult one. First, some progress informations about the integration: the PDF renderer works now with FOrayFont, and seems to run well. The other renderers are still to be adapted. There shouldn't be too much work for the Java2D-based ones, a bit more for the Postscript Renderer, and also for PCL and AFP (I can't evaluate how much there is to do for those ones as I know nothing about those formats). I estimate to about 5 days the amount of work to have a compilable thing. There should be no loss of feature; there is a known problem with the Postscript renderer (no way to know which fonts are used for a given page, so we have to embed all of the configured fonts in the header), but Jeremias is working on a two-pass system thanks to which this problem should be solved soon. For those who are not familiar with the FOrayFont architecture, here's a quick presentation: there is a separate project called aXSL [1] (also maintained by Victor) whose purpose is to define a standard API for several modules related to XSL-FO. The one we're interested in is aXSL-font, but there are also modules for dealing with graphics, manipulating the FO tree, the area tree, etc. The goal is to have standard interfaces shared by XSL-FO implementations. Provided that, of course, there are more than one implementation which implement aXSL. So FOrayFont is a particular implementation of aXSL-font. 
If Fop were using FOrayFont there would actually be almost only aXSL calls in the code.

[1] http://www.axsl.org/

Now, let me enumerate the pros and cons of the adaptation of FOrayFont to Fop:

Cons:
- After Bertrand's recent work on OTF support, the existing font library is not far from being as feature-complete as FOrayFont:
  - ToUnicode support is now available;
  - it seems easy to remove the XML metrics generation step (actually Jeremias told me he had already done it on his working copy).
- The old font support would have to be kept for use by Batik (PS and PDF transcoders), as the Batik people have strong feelings against external dependencies.
- FOrayFont introduces a new font-config file, which would disturb users (although I think it is better and more flexible than the current one).
- FOrayFont is mainly a one-man show, and it's not very good for Fop to have such a dependency. And as this is primarily Victor's baby, we can't just come in and ask for write access to the code or whatever. We must first show that our point of view is compatible with Victor's.
- However, it seems like we have difficulties understanding each other: each time I propose a change on the dev list, it triggers a lengthy discussion where we both try to explain our own point of view and understand the other's, without, I think, ever quite succeeding. Some combination of cultural gap and foreign-language issue hinders communication.
- As a consequence, proposing changes on the aXSL/FOray side to better suit our needs will require twice as much time and energy as making them on our own side.
- And given that the API isn't perfect yet, I'm a bit afraid of going that route. One major missing feature, for example, is the ability to cache information about fonts and retrieve it later; this is necessary for the XML area tree output or the CachedRenderPagesModel. There is simply no means in the API to get a font's identifier in order to retrieve it later without having to re-launch the whole resolution process.
- During the past year, growing technical disagreements have appeared; if we keep working together we might end up with something that satisfies neither of us, because of the many compromises we would have to make. That ranges from programming practices to API design decisions.
- As far as I know, FOray has never been used in production yet, and it may be unstable. There are currently not many testcases and, well, it's already not much fun to write testcases for one's own code, if I have to write testcases for others...

Now for the pros:
- It would be unfortunate to break the last bridge between Victor and Fop.
- I've myself already done quite a bit of work on FOrayFont, which would be basically lost.
- Despite existing problems, Victor brought quite a number of improvements to the font library, which would have to be redone. And he started from the 0.20.5 code, like we would if we were to go our own way (tell me if I'm wrong, but I don't think the font code changed
Re: FOrayFont integration in question
Hi Vincent and team, I won't comment on the whole thing as IMO there's one element which narrows down the choices a lot (but thanks for the detailed explanations): On 11/13/06, Vincent Hennebert [EMAIL PROTECTED] wrote: ...FOrayFont is mainly a one-man-show and it's not very good for Fop to have such a dependency... I think having such a dependency on a one-man show project for a key part of FOP would be a bad idea. Even if the one man was Knuth himself...well, maybe we'd make an exception for Knuth, but not for mere mortals. So, IMHO the only remaining options are either to fork aXSL, or to not use it in FOP. As you mention, FOP's font stuff is not far from aXSL's stuff today. One thing that you didn't mention is the handling of OpenType fonts with PostScript (CFF) outlines, does aXSL do that? -Bertrand
Question on protected Logger member in some classes
Hi, While trying to debug some changes I've made with the layout managers, I noticed that some classes declare their Logger as a protected rather than a private member. What is the rationale here: just easy inheritance of the logger later on, or something else? To me, the purpose of a logger is to identify exactly which class produces a given output, so the output should show that class's name and not the name of some parent class. What do you think? How about changing the current protected to private? Thank you, Andrejus
Re: Question on protected Logger member in some classes
You're right. These protected variables are sometimes not ideal. If you change anything to private while you work on the code, that's fine for me. On 03.10.2006 15:02:28 Andrejus Chaliapinas wrote: Hi, While trying to debug some changes I've made with the layout managers, I noticed that some classes declare their Logger as a protected rather than a private member. What is the rationale here: just easy inheritance of the logger later on, or something else? To me, the purpose of a logger is to identify exactly which class produces a given output, so the output should show that class's name and not the name of some parent class. What do you think? How about changing the current protected to private? Thank you, Andrejus Jeremias Maerki
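To illustrate the point being discussed, here is a small sketch. It uses java.util.logging only to stay self-contained (an assumption made for the example; FOP's own logging API may differ), and ParentLM and ChildLM are hypothetical class names. A protected logger created in the parent carries the parent's name even when a subclass logs through it, while a private static logger per class reports the class that actually produced the message:

```java
import java.util.logging.Logger;

// Hypothetical classes; java.util.logging stands in for whatever API the project uses.
class ParentLM {
    // Inherited by subclasses: everything logged through it is attributed to ParentLM
    protected final Logger sharedLog = Logger.getLogger(ParentLM.class.getName());
}

class ChildLM extends ParentLM {
    // Private per-class logger: messages are attributed to ChildLM itself
    private static final Logger LOG = Logger.getLogger(ChildLM.class.getName());

    String inheritedLoggerName() {
        return sharedLog.getName();
    }

    String ownLoggerName() {
        return LOG.getName();
    }
}

class LoggerNameDemo {
    public static void main(String[] args) {
        ChildLM child = new ChildLM();
        System.out.println("inherited logger: " + child.inheritedLoggerName());
        System.out.println("own logger:       " + child.ownLoggerName());
    }
}
```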
RE: Question on protected Logger member in some classes
You're right. These protected variables are sometimes not ideal. If you change anything to private while you work on the code, that's fine for me. Could you apply this patch to trunk? http://issues.apache.org/bugzilla/attachment.cgi?id=18956 Andrejus
Re: Question on protected Logger member in some classes
As soon as I have time and write access to the SVN repo. Here at the Cocoon GetTogether we're having some problems accessing SVN over HTTPS. If anyone else can do that in the meantime, that would be great. There are other patches also waiting to be processed. On 03.10.2006 16:05:48 Andrejus Chaliapinas wrote: You're right. These protected variables are sometimes not ideal. If you change anything to private while you work on the code, that's fine for me. Could you apply this patch to trunk? http://issues.apache.org/bugzilla/attachment.cgi?id=18956 Andrejus Jeremias Maerki
RE: Question on protected Logger member in some classes
If anyone else can do that in the meantime, that would be great. There are other patches also waiting to be processed. Probably you are the only man who could do that. There were no responses for last 4 hours. Hopefully tomorrow. Andrejus
Newbie question on Area Tree XML and testcases
Hello, I'm trying to understand a bit of FOP's development internals, so please correct me if I'm wrong. When I run a JUnit layout testcase, it generates an XML file in Area Tree XML form under test-results, from which I could produce a PDF file. But it's not clear to me how I would know in advance which values I should place into the testcase file to be checked during the test run. BTW, do you currently still support running JUnit v3.8.2 with JDK 1.3.x, or have you already switched to JUnit 4.1? Thanks to anyone who can shed more light on this. Andrejus
RE: Newbie question on Area Tree XML and testcases
Normally, I just construct the FO part in the test case and then run the thing once so I get the current area tree XML. I obviously get an error if I have no checks, because we don't want any test cases without checks. When I have a first area tree XML I create the first checks I can build. I don't have to have every check there from the beginning. I add more checks as I progress with the implementation/fix. So, while this is a test-first approach, it is not necessarily also a all-checks-first approach. :-) The test case grows with you. ;-)

But if you first construct the FO part and run the testcase without any checks, how can you be sure that the produced Area Tree is correct? And if you later want to expand your FO part, do you need to remove all checks again, run your testcase, get the new Area Tree and put the checks back along with additional ones? Sorry, I don't get it right now.

Also, I've noticed that the LayoutEngineTestSuite class contains this call: new File("test/layoutengine/disabled-testcase2filename.xsl") and my Eclipse Ant hangs on it, because it tries to prefix that name with the current Eclipse directory (d:\eclipse in my case) and not with the actual basedir of my project. So my proposal would be (of course, unless I'm missing some other setting somewhere) to make a small change to the main build.xml file and, instead of this part:

<target name="junit-layout-standard" depends="junit-compile" if="junit.present" description="Runs FOP's standard JUnit layout tests">
  <echo message="Running standard layout engine tests ${basedir}"/>
  <junit haltonfailure="${junit.haltonfailure}" fork="${junit.fork}" errorproperty="fop.junit.error" failureproperty="fop.junit.failure">

have this one (with basedir specified for the junit invocation):

<target name="junit-layout-standard" depends="junit-compile" if="junit.present" description="Runs FOP's standard JUnit layout tests">
  <echo message="Running standard layout engine tests ${basedir}"/>
  <junit dir="${basedir}" haltonfailure="${junit.haltonfailure}" fork="${junit.fork}" errorproperty="fop.junit.error" failureproperty="fop.junit.failure">

In that case it runs normally in my environment.
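The behaviour described above comes from how java.io.File treats relative pathnames: they are resolved against the JVM's working directory (the user.dir system property), not against any project directory. A small sketch, with RelativePathDemo and its resolve() helper being hypothetical names introduced only for this illustration:

```java
import java.io.File;

// Hypothetical demo class; not part of the FOP test suite.
class RelativePathDemo {

    // Resolving against an explicit base directory removes the dependency on user.dir
    static File resolve(File baseDir, String relativePath) {
        return new File(baseDir, relativePath);
    }

    public static void main(String[] args) {
        File relative = new File("test/layoutengine/disabled-testcase2filename.xsl");

        // A relative File is interpreted relative to the JVM's working directory
        System.out.println("user.dir    = " + System.getProperty("user.dir"));
        System.out.println("resolved to = " + relative.getAbsolutePath());

        // With an explicit base directory the result no longer depends on where the JVM started
        File base = new File(System.getProperty("user.dir"));
        System.out.println("explicit    = " + resolve(base, relative.getPath()).getAbsolutePath());
    }
}
```

This is why starting the test JVM in the project's basedir makes the relative name resolve as intended. Note that, per the Ant documentation, the dir attribute of the junit task only takes effect when the tests run in a forked VM.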
Re: Newbie question on Area Tree XML and testcases
Andrejus Chaliapinas wrote: Normally, I just construct the FO part in the test case and then run the thing once so I get the current area tree XML. I obviously get an error if I have no checks, because we don't want any test cases without checks. When I have a first area tree XML I create the first checks I can build. I don't have to have every check there from the beginning. I add more checks as I progress with the implementation/fix. So, while this is a test-first approach, it is not necessarily also a all-checks-first approach. :-) The test case grows with you. ;-) But if you first construct the FO part and run the testcase without any checks, how can you be sure that the produced Area Tree is correct? With an intimate knowledge of the XSL-FO spec! It's not always easy to know, and it takes time to work out. And if you later want to expand your FO part, do you need to remove all checks again, run your testcase, get the new Area Tree and put the checks back along with additional ones? Yes, that's right. <snip/> Chris
Re: Newbie question on Area Tree XML and testcases
Andrejus, I've created a project on sourceforge to generate pdf reports from JUnit tests. It's called JUnit PDF Report, an can be found here: http://junitpdfreport.sourceforge.net It uses FOP to render the JUnit XML docs into a PDF. Regards, Jan - Original Message - From: Andrejus Chaliapinas [EMAIL PROTECTED] To: fop-dev@xmlgraphics.apache.org Sent: Friday, September 29, 2006 4:47 PM Subject: Newbie question on Area Tree XML and testcases Hello, I'm trying a little bit understand FOP dev. internals and please correct me if I wrong. When I run Junit layout testcase - it generates me under test-results some XML file in form of Area Tree XML, from which I could produce PDF file. But for me it's not clear how I would know in advance which values should I place into testcase file to be checked during test run. BTW - do you currently still support Junit v3.8.2 to be run with JDK 1.3.x? Or you've already switched to Junit 4.1? Thank you for anyone who will get me more light on this. Andrejus
Re: Newbie question on Area Tree XML and testcases
On 29.09.2006 17:16:28 Andrejus Chaliapinas wrote: Normally, I just construct the FO part in the test case and then run the thing once so I get the current area tree XML. I obviously get an error if I have no checks, because we don't want any test cases without checks. When I have a first area tree XML I create the first checks I can build. I don't have to have every check there from the beginning. I add more checks as I progress with the implementation/fix. So, while this is a test-first approach, it is not necessarily also a all-checks-first approach. :-) The test case grows with you. ;-) But if you first construct FO part and run testcase without any checks - how you could be sure that produced Area Tree is correct? You check that by comparing the effective values with what you would expect from FO you've written. The easier the FO is the easier it is to determine the right checks. An example, if you have a plain fo:block with 12pt font. Given that you have a line-height of 1.2 (the default) you get 1.2 * 12pt = 14400mpt. You'll find that value in a bpd attribute later. And if then later want to expand your FO part - do you need again remove all checks, run your testcase, get new Area Tree and return checks back with additional ones? If you change so much that the whole area tree becomes different, yes. If you just change the color of a word only a little attribute in the tree is changed. Sorry, I don't get it right now. Also I've noticed that in LayoutEngineTestSuite class there is such call: new File(test/layoutengine/disabled-testcase2filename.xsl) and my Eclipse Ant hangs on that, cause tries to prefix that name with current eclipse directory (d:\eclipse in my case) and not with actual basedir of my project. 
So my proposal would be (of course, unless I'm missing some other setting somewhere) to make a small change to the main build.xml file and, instead of this part:

Are you not running the org.apache.fop.layoutengine.LayoutEngineTestSuite using Eclipse's own JUnit support? That's the easiest way. If you choose the default working directory, everything works fine and you really only run the tests you need. It seems to me you're running the junit tasks from within Eclipse.

<target name="junit-layout-standard" depends="junit-compile" if="junit.present" description="Runs FOP's standard JUnit layout tests">
  <echo message="Running standard layout engine tests ${basedir}"/>
  <junit haltonfailure="${junit.haltonfailure}" fork="${junit.fork}" errorproperty="fop.junit.error" failureproperty="fop.junit.failure">

have this one (with basedir specified for the junit invocation):

<target name="junit-layout-standard" depends="junit-compile" if="junit.present" description="Runs FOP's standard JUnit layout tests">
  <echo message="Running standard layout engine tests ${basedir}"/>
  <junit dir="${basedir}" haltonfailure="${junit.haltonfailure}" fork="${junit.fork}" errorproperty="fop.junit.error" failureproperty="fop.junit.failure">

In that case it runs normally in my environment.

I'll look into that. Jeremias Maerki
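The line-height arithmetic from the reply above (a 12pt font with the default line-height factor of 1.2, expressed in millipoints) can be written out as a short worked example. The class and method names are hypothetical and only illustrate the unit conversion, not FOP's internal code:

```java
// Hypothetical helper illustrating the pt-to-mpt arithmetic; not FOP code.
class LineHeightDemo {

    // 1pt = 1000mpt; round to the nearest millipoint to absorb floating-point noise
    static int toMillipoints(double points) {
        return (int) Math.round(points * 1000);
    }

    // The expected line height for a given font size and line-height factor
    static int lineHeightMpt(double fontSizePt, double lineHeightFactor) {
        return toMillipoints(fontSizePt * lineHeightFactor);
    }

    public static void main(String[] args) {
        // 1.2 * 12pt = 14.4pt = 14400mpt, the value to look for in a bpd attribute
        System.out.println(lineHeightMpt(12.0, 1.2) + "mpt");
    }
}
```

Working out a handful of such values by hand for a simple FO fragment is one way to decide which checks to write before trusting the generated area tree.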
RE: Newbie question on Area Tree XML and testcases
Andrejus, I've created a project on sourceforge to generate pdf reports from JUnit tests. It's called JUnit PDF Report, an can be found here: http://junitpdfreport.sourceforge.net It uses FOP to render the JUnit XML docs into a PDF. Regards, Jan Great, What about next step taking it inside same FOP Ant build process with one additional target for that (testcases-into-pdf)? It's quite native to be there as well (via fop.bat you just run: fop -atin infile -pdf outfile). What do you think? I always prefer to have everything for whole build process in one place (in this case great Eclipse IDE). Andrejus
RE: Newbie question on Area Tree XML and testcases
Are you not running the org.apache.fop.layoutengine.LayoutEngineTestSuite using Eclipse's own JUnit support? That's the easiest way. Choosing the default working directory, everything works fine and you really only run the tests you need. It seems to me you're running the junit tasks from within Eclipse. I have different JDK installed in different places and configured them for my Eclipse (which also uses different workspaces for different things as well). I also have separate Junit 3.8.2 actually outside Eclipse. So, in that sense it's not always default values/settings are involved. I'll look into that. Hopefully I don't bother you a lot ;) Andrejus
Re: Newbie question on Area Tree XML and testcases
I was actually replying to your mail because I have also tried to test my reports Area Tree. (but I forgot to mention it). It was too complex for me; so I ended up only testing if FOP actually generated my file. What about next step taking it inside same FOP Ant build process with one additional target for that (testcases-into-pdf)? The project comes with an ant task. So it would not be too hard to integrate it with an Ant build. Regards, Jan - Original Message - From: Andrejus Chaliapinas [EMAIL PROTECTED] To: fop-dev@xmlgraphics.apache.org; '[EMAIL PROTECTED]' [EMAIL PROTECTED] Sent: Friday, September 29, 2006 5:55 PM Subject: RE: Newbie question on Area Tree XML and testcases Andrejus, I've created a project on sourceforge to generate pdf reports from JUnit tests. It's called JUnit PDF Report, an can be found here: http://junitpdfreport.sourceforge.net It uses FOP to render the JUnit XML docs into a PDF. Regards, Jan Great, What about next step taking it inside same FOP Ant build process with one additional target for that (testcases-into-pdf)? It's quite native to be there as well (via fop.bat you just run: fop -atin infile -pdf outfile). What do you think? I always prefer to have everything for whole build process in one place (in this case great Eclipse IDE). Andrejus
Re: Newbie question on Area Tree XML and testcases
This is fixed in FOP Trunk now. Thanks for the suggestion. http://svn.apache.org/viewvc?view=rev&rev=451393 Stupid me applied it to a branch first. I guess I'm tired.

On 29.09.2006 17:16:28 Andrejus Chaliapinas wrote: <snip/> and my Eclipse Ant hangs on it, because it tries to prefix that name with the current Eclipse directory (d:\eclipse in my case) and not with the actual basedir of my project. So my proposal would be (of course, unless I'm missing some other setting somewhere) to make a small change to the main build.xml file and, instead of this part:

<target name="junit-layout-standard" depends="junit-compile" if="junit.present" description="Runs FOP's standard JUnit layout tests">
  <echo message="Running standard layout engine tests ${basedir}"/>
  <junit haltonfailure="${junit.haltonfailure}" fork="${junit.fork}" errorproperty="fop.junit.error" failureproperty="fop.junit.failure">

have this one (with basedir specified for the junit invocation):

<target name="junit-layout-standard" depends="junit-compile" if="junit.present" description="Runs FOP's standard JUnit layout tests">
  <echo message="Running standard layout engine tests ${basedir}"/>
  <junit dir="${basedir}" haltonfailure="${junit.haltonfailure}" fork="${junit.fork}" errorproperty="fop.junit.error" failureproperty="fop.junit.failure">

In that case it runs normally in my environment.

Jeremias Maerki
Tiny question: PropertySets.canHaveMarkers() ?
Hi all, While browsing a bit through the FOTree code, it appeared to me that the sole function of the class fo.PropertySets is in the canHaveMarkers() method. None of the other members are referenced anywhere in the code, so I was thinking maybe this class could be removed completely? I'd go about that by moving that method to FONode, have it return 'false', and override to return true on the FONode subclasses in question, unless anyone objects... Cheers, Andreas
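The refactoring proposed above could be sketched roughly as follows. Node, MarkerCapableNode and PlainNode are hypothetical stand-ins (the real FONode hierarchy is far larger, and this sketch makes no claim about which formatting objects actually allow markers); the point is only the pattern of a default in the base class that selected subclasses override:

```java
// Hypothetical stand-ins for FONode and its subclasses.
abstract class Node {
    // Default for all nodes: markers are not allowed
    boolean canHaveMarkers() {
        return false;
    }
}

// A node type that opts in by overriding the default
class MarkerCapableNode extends Node {
    @Override
    boolean canHaveMarkers() {
        return true;
    }
}

// A node type that simply inherits the default
class PlainNode extends Node {
}

class MarkerDemo {
    public static void main(String[] args) {
        Node[] nodes = { new MarkerCapableNode(), new PlainNode() };
        for (Node node : nodes) {
            System.out.println(node.getClass().getSimpleName() + ": " + node.canHaveMarkers());
        }
    }
}
```

With this shape the lookup class is no longer needed, since each node answers the question itself.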
Re: Tiny question: PropertySets.canHaveMarkers() ?
I don't see a problem doing that. It's good to get rid of obsolete code. Only gets in newbies' way. On 14.07.2006 19:32:40 Andreas L Delmelle wrote: Hi all, While browsing a bit through the FOTree code, it appeared to me that the sole function of the class fo.PropertySets is in the canHaveMarkers() method. None of the other members are referenced anywhere in the code, so I was thinking maybe this class could be removed completely? I'd go about that by moving that method to FONode, have it return 'false', and override to return true on the FONode subclasses in question, unless anyone objects... Jeremias Maerki
fop question
Hi, my OS is Windows XP, Simplified Chinese version. Today I tested fop-0.20.5 and converted \fop-0.20.5\examples\fo\advanced\cid-fonts.fo to PDF, but the Chinese and Japanese text is not converted correctly. Can you tell me why, and how I can convert it correctly? Thanks very much. The Java program I use is fop-0.20.5\examples\embedding\java\embedding\ExampleFO2PDF.java. Attachment: ResultFO2PDF.pdf (Adobe PDF document)
Re: Question about status of Bidi support, and Arabic/Persian shaping!
This is good news. I did not study the problems around non-LTR texts so I can't say anything useful about this. I think it is preferable to reuse code from the Java class library if it covers our requirements (among them: maintain JDK 1.3 compatibility). Sadly that means that java.text.Bidi is a little problematic, except if we say that Bidi support is only available under JDK 1.4, but that might complicate the implementation additionally. If we can reuse code from our sister project Batik, that's probably better. Introducing an external library will take careful consideration. We have to make sure, among other things, that the licensing situation is ok. I don't know of anyone currently working on a Bidi implementation. If someone does, I expect that person to speak up. Please also look through the fop-dev mailing list archives for any discussions around Bidi and UAX#14. I think the two aspects may be closely related.

On 05.05.2006 01:00:04 Kia Teymourian wrote: Dear developers, I'm interested in using FOP for creating documentation in complex text layout languages such as Arabic/Persian. I would like to offer my help with the implementation of the Bidi algorithm (Bug 32789). I searched the FOP-User mailing list and found this discussion: http://www.mail-archive.com/fop-users@xmlgraphics.apache.org/msg01575.html I've done the initial steps and set up a development environment, and could render a Unicode string with Arabic Unicode shaping. Here are the PDF output and the fo file: http://user.cs.tu-berlin.de/~kiat/fop/simple.pdf http://user.cs.tu-berlin.de/~kiat/fop/simple.fo This test PDF output shows that Arabic/Persian words need to be rendered with the ligature glyphs from the Unicode Arabic Presentation Forms. There is no problem with Arabic shaping in the text and RTF outputs. Is there anyone who works on a Bidi implementation? Should the Bidi implementation be an AWT-independent solution? Could we use JDK 1.4 and classes like java.text.Bidi or java.awt.font.TextLayout? I looked around for Bidi implementations and found FriBidi, http://fribidi.org/wiki/ which is implemented in C. The iText project also has some Bidi classes, like
class com.lowagie.text.pdf.ArabicLigaturizer
class com.lowagie.text.pdf.BidiLine
class com.lowagie.text.pdf.BidiOrder
Or we can use Batik classes like org.apache.batik.gvt.text.ArabicTextHandler.java. I think the first step should be the implementation of
org.apache.fop.fo.flow.BidiOverride.java
org.apache.fop.layoutmgr.inline.BidiLayoutManager.java
and then adding some classes for Arabic character shaping, like ArabicLigaturizer. The bidi-override implementation would be very useful for some other projects, to be able to create Persian/Arabic documentation with PDF output from DocBook files. Regards, Kia Teymourian

Jeremias Maerki
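For readers unfamiliar with java.text.Bidi, which the thread above mentions as a JDK 1.4 option, here is a minimal sketch of what it provides: it splits a paragraph into runs and assigns each run an embedding level (even for left-to-right, odd for right-to-left). BidiRunsDemo and describeRuns() are hypothetical names for this illustration; the sketch covers only the run analysis, not glyph shaping or ligatures, and says nothing about how FOP would integrate it:

```java
import java.text.Bidi;

// Hypothetical demo class; shows run analysis only, no shaping.
class BidiRunsDemo {

    // Lists each directional run as [start,limit)@level
    static String describeRuns(String text) {
        // Base direction is taken from the first strong character, defaulting to LTR
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            sb.append('[').append(bidi.getRunStart(i))
              .append(',').append(bidi.getRunLimit(i))
              .append(")@").append(bidi.getRunLevel(i))
              .append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Latin text with an embedded Arabic word (seen, lam, alef, meem)
        String mixed = "abc \u0633\u0644\u0627\u0645 def";
        System.out.println(describeRuns(mixed));
    }
}
```

A layout engine would still have to reorder the runs for display and substitute the contextual presentation forms, which is where classes such as the ligaturizers mentioned above would come in.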
Re: Question about status of Bidi support, and Arabic/Persian shaping!
On Fri, 2006-05-05 at 08:43 +0200, Jeremias Maerki wrote: This is good news. I did not study the problems around non-LTR texts so I can't say anything useful about this. I think it is preferrable to reuse code from the Java class library if it covers our requirements (among them: maintain JDK 1.3 compatibility). Sadly that means that java.text.Bidi is a little problematic, except if we say that Bidi support is only available under JDK 1.4 but that might complicate the implementation additionally. If we can reuse code from our sister project Batik, that's probably better. Introducing an external library will take careful consideration. We have to make sure, among other things, that the licensing situation is ok. I don't know of anyone currently working on a Bidi impementation. If someone does, I expect that person to speak up. I am. Using Java 1.5 for the basics, and Java 1.6 for kerning and ligatures. I suppose you were aware of that though. Peter
Re: Question about status of Bidi support, and Arabic/Persian shaping!
On 05.05.2006 10:51:35 Peter B. West wrote: On Fri, 2006-05-05 at 08:43 +0200, Jeremias Maerki wrote: This is good news. I did not study the problems around non-LTR texts so I can't say anything useful about this. I think it is preferrable to reuse code from the Java class library if it covers our requirements (among them: maintain JDK 1.3 compatibility). Sadly that means that java.text.Bidi is a little problematic, except if we say that Bidi support is only available under JDK 1.4 but that might complicate the implementation additionally. If we can reuse code from our sister project Batik, that's probably better. Introducing an external library will take careful consideration. We have to make sure, among other things, that the licensing situation is ok. I don't know of anyone currently working on a Bidi impementation. If someone does, I expect that person to speak up. I am. Using Java 1.5 for the basics, and Java 1.6 for kerning and ligatures. I suppose you were aware of that though. Well, the requirement for 1.5 or even 1.6 pretty much rules out your implementation for us. Jeremias Maerki
Re: Question about status of Bidi support, and Arabic/Persian shaping!
On Fri, 2006-05-05 at 10:59 +0200, Jeremias Maerki wrote: On 05.05.2006 10:51:35 Peter B. West wrote: On Fri, 2006-05-05 at 08:43 +0200, Jeremias Maerki wrote: This is good news. I did not study the problems around non-LTR texts so I can't say anything useful about this. I think it is preferrable to reuse code from the Java class library if it covers our requirements (among them: maintain JDK 1.3 compatibility). Sadly that means that java.text.Bidi is a little problematic, except if we say that Bidi support is only available under JDK 1.4 but that might complicate the implementation additionally. If we can reuse code from our sister project Batik, that's probably better. Introducing an external library will take careful consideration. We have to make sure, among other things, that the licensing situation is ok. I don't know of anyone currently working on a Bidi impementation. If someone does, I expect that person to speak up. I am. Using Java 1.5 for the basics, and Java 1.6 for kerning and ligatures. I suppose you were aware of that though. Well, the requirement for 1.5 or even 1.6 pretty much rules out your implementation for us. Jeremias Maerki :) Peter
Re: Question about status of Bidi support, and Arabic/Persian shaping!
Hi all, Jeremias Maerki [EMAIL PROTECTED] wrote on 05/05/2006 02:43:26 AM: I think it is preferable to reuse code from the Java class library if it covers our requirements (among them: maintain JDK 1.3 compatibility). Sadly that means that java.text.Bidi is a little problematic, except if we say that Bidi support is only available under JDK 1.4 but that might complicate the implementation additionally. java.awt.font.TextLayout does Bidi layout. This is how Batik gets its display order information. If we can reuse code from our sister project Batik, that's probably better. Introducing an external library will take careful consideration. We have to make sure, among other things, that the licensing situation is ok. You are more than welcome to use Batik code. However all of Batik's text handling is built around the JDK's AttributedString class, which may not be that compatible with FOP's text handling. Please also look through the fop-dev mailing list archives for any discussions around Bidi and UAX#14. I think the two aspects may be closely related. The two are mostly independent in Batik's implementation. On 05.05.2006 01:00:04 Kia Teymourian wrote: Dear developers, I’m interested to use FOP for creating Documentation with complex text layout languages such as Arabic/Persian. I would like to offer my help for the implementation of Bidi Algorithm (Bug 32789) I search on the FOP-User mailing list and found, this discussion. http://www.mail-archive.com/fop-users@xmlgraphics.apache.org/msg01575.html I’ve done the initial steps and establish myself a development environment, and could render a Unicode String for the Arabic Unicode shaping here is the PDF output and the fo file: http://user.cs.tu-berlin.de/~kiat/fop/simple.pdf http://user.cs.tu-berlin.de/~kiat/fop/simple.fo This test PDF output shows that the words in Arabic/Persian should be rendered for ligature glyphs characters Unicode Arabic Presentation form.
And there is no problem with the Arabic shaping for Text, RTF output. Is there anyone who works on Bidi implementation? Should the Bidi Implementation be an awt independent solution? Could we use jdk 1.4 and classes like java.text.Bidi or java.awt.font.TextLayout? I looked around for Bidi implementation and found Fribid, http://fribidi.org/wiki/ which is implemented in C. iText project has also some Bidi classes like class com.lowagie.text.pdf.ArabicLigaturizer class com.lowagie.text.pdf.BidiLine class com.lowagie.text.pdf.BidiOrder Or we can use Batik classes like org.apache.batik.gvt.text.ArabicTextHandler.java I think the first point should be the implementation of org.apache.fop.fo.flow.BidiOverride.java org.apache.fop.layoutmgr.inline.BidiLayoutManager.java And then add some classes for the Arabic character shaping like ArabicLigaturizer. The bidi-Override implementation is very useful for some other projects to be able to create Persian/Arabic Documentation with PDF output from DocBook files. Regards, Kia Teymourian Jeremias Maerki
Question about status of Bidi support, and Arabic/Persian shaping!
Dear developers, I’m interested in using FOP to create documentation in complex-text-layout languages such as Arabic/Persian. I would like to offer my help with the implementation of the Bidi algorithm (Bug 32789). I searched the FOP-User mailing list and found this discussion: http://www.mail-archive.com/fop-users@xmlgraphics.apache.org/msg01575.html I’ve done the initial steps and set up a development environment, and could render a Unicode string to test Arabic Unicode shaping. Here are the PDF output and the FO file: http://user.cs.tu-berlin.de/~kiat/fop/simple.pdf http://user.cs.tu-berlin.de/~kiat/fop/simple.fo This test PDF output shows that words in Arabic/Persian should be rendered with the ligature glyphs of the Unicode Arabic Presentation Forms. There is no problem with the Arabic shaping in the Text and RTF output. Is anyone working on a Bidi implementation? Should the Bidi implementation be an AWT-independent solution? Could we use JDK 1.4 and classes like java.text.Bidi or java.awt.font.TextLayout? I looked around for Bidi implementations and found FriBidi, http://fribidi.org/wiki/, which is implemented in C. The iText project also has some Bidi classes, such as com.lowagie.text.pdf.ArabicLigaturizer, com.lowagie.text.pdf.BidiLine and com.lowagie.text.pdf.BidiOrder. Or we could use Batik classes like org.apache.batik.gvt.text.ArabicTextHandler.java. I think the first step should be the implementation of org.apache.fop.fo.flow.BidiOverride.java and org.apache.fop.layoutmgr.inline.BidiLayoutManager.java, and then the addition of some classes for Arabic character shaping like ArabicLigaturizer. A bidi-override implementation would be very useful for other projects that need to create Persian/Arabic documentation with PDF output from DocBook files. Regards, Kia Teymourian
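For readers unfamiliar with java.text.Bidi, which the message above asks about, here is a minimal sketch of how it splits a mixed-direction string into directional runs. The class name BidiRuns and the sample string are made up for illustration, and the code uses generics for brevity, so it does not meet the JDK 1.3/1.4 constraint discussed in this thread. java.text.Bidi only resolves embedding levels; it does not perform Arabic shaping.

```java
import java.text.Bidi;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of FOP or Batik.
class BidiRuns {

    /** Returns one {start, limit, level} triple per directional run of the text. */
    static List<int[]> runs(String text) {
        // Base direction is taken from the first strong character, defaulting to LTR.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            result.add(new int[] {
                bidi.getRunStart(i), bidi.getRunLimit(i), bidi.getRunLevel(i)
            });
        }
        return result;
    }

    public static void main(String[] args) {
        // Latin text around an Arabic word (seen, lam, alef, meem).
        String mixed = "abc \u0633\u0644\u0627\u0645 def";
        for (int[] run : runs(mixed)) {
            // An odd level marks a right-to-left run.
            String direction = (run[2] & 1) == 1 ? "RTL" : "LTR";
            System.out.println(run[0] + ".." + run[1] + " level " + run[2] + " " + direction);
        }
    }
}
```

A layout manager would still have to reorder the runs for display and apply shaping separately, which is where classes such as ArabicLigaturizer or Batik's ArabicTextHandler come in.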
Re: Question about status of JEuclid and possible inclusion in FOP
Dear Max, results of our work (not the latest one, but still good enough for publishing) is available from cvs.sourceforge.net/cvsroot/jeuclid. Now I'm waiting for feedback from anyone of Apache FOP members on review the JEuclid code. At the moment, I have no idea whether some work has been done in this regard. Present status of our work (we still performing some regression tests, but its 99% already done) is the following: - we have merged JEuclid with FOP 0.20.5 (version available under sf.net is based on one of the pre 0.20.4). - we remove all references to AWT library, so now JEuclid is able to work on screenless servers without XWindows configuration. - support badly fonts (e.g. Arial Unicode MS, which we are enforced to use for MathML rendering). Due to last two issues we extended metric files with extended information about symbol location within the glyph. Plans for merging into 0.90 and later are not exists so far because we do have to improve the following functionality of the FOP itself to satisfy our needs (these features are done by us for FOP 0.20.5). 1. Support of embedded PDF's using itext library. 2. Support of the specific logging into files without any use of console (from my point of view, this is absolutely dummy requirement and we can use log4j, but I'm not in charge to modify it). 3. Support of Arial Unicode MS as TYPE1 font, and not as TYPE0 font. 4. Support of phantom attribute on page-sequence. To hide particular pages in PDF result file but take them into account in page numbering. 5. Support both JDK 1.3 and 1.4. This requires us to integrate support of SAXON parser for JDK 1.4 due to nasty bug in xalan shipped with JDK 1.4. However for JDK 1.3 we still have to keep support of xalan. 6. Improving text layout within the merged table cells for lists (I'guess that for FOP 0.90 it is not the issue anymore). 7. Implement specific DTD lookup algorithm (e.g. search for DTD not only in the XML location but also in current directory). 
From one hand we would like to have the JEuclid integrated into FOP ASAP, however, we cannot follow release schedule of the FOP due to very major changes of FOP due to features above. As far as I understood Jeremias, he recommended us wait until FOray will be integrated into main branch, but this work still not completed (or even canceled???). Concerning licensing issue. We have contacted the JEuclid original publishers and got their permission to re-publish JEuclid to the AFL license as required for FOP integration. Internally, from our side we got all the permissions to publish our work as well. We can commit all necessary statement as required by ASF once our work will be accepted for publication. Concerning your issues: - I would say this is good idea to include jeuclid into xmlgraphics. However, we need some FOP developers comments on the extended metric file. Otherwise, output will be useless in producing paper-ready PDF's. - We already doing that, however only for the TIFF format. This can be easily done by means of fo:external-graphics in the stylesheet (if I understood your work right). - This feature we are mostly interested in. At the moment this implemented by means of xsl:copy-of due to misinterpretation of character entities in xsl:copy implementation. You are absolutely free to re-use our results in any way under terms of AFL. Best regards, Gennadiy Saturday, April 29, 2006, 10:33:17 PM, you wrote: MB Dear Gennady, MB Dear developers, MB I've just recently played around with mathml and tried to include MB that in my fop documents. I've found several tools, and among others MB jeuclid. jeuclid is very complete, it is just missing a few adapter MB classes. I've written a small one to convert mml to svg and it works MB just fine. MB I've then found out that there was work done merging jeuclid into MB fop / xmlgraphics. What is the current status of this? What are the MB license / technical issues? Is this desired at all? 
MB Here is what I would like to see: MB - include jeuclid in xmlgraphics MB - add code to fop to support the inclusion of mml documents as MB external images. MB - add code to fop to support mml embedded within fo documents MB i would be willing to provide the first two items, if it is legal to MB do so... MB Max -- Best regards, Gennadiymailto:[EMAIL PROTECTED]
Re: Question about status of JEuclid and possible inclusion in FOP
I forgot to CC the jeuclid list when I replied to Max. Here's the link: http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200604.mbox/[EMAIL PROTECTED] Just in case... The whole thing about bringing JEuclid into the XML Graphics project is not that simple. As I said before, this would be a new subproject which means that it would have to go through the Incubator and not only the IP clearance process. However, without a live community (at least 2 active developers) around the thing it has practically no chance to succeed. I have to clarify my position here: I'm not actively using MathML and that's why I don't have much time to help here. I'm willing to help faciliating a migration if the preconditions are met. The problem is that I don't see they are. JEuclid is not unlike Barcode4J (my baby) which is also not going to come here mostly because there's no live developer community there. Last fall, I was hoping that JEuclid may gain some new momentum, but as far as I can see the whole thing was a single commit to CVS in December and that was it. Please be aware that such a migration is A LOT OF WORK and takes considerable energy. The legal work alone is a project in itself. So, IMO you should work towards a JEuclid release from within the SourceForge project. JEuclid is better served that way. People interested in MathML should gather there and strengthen JEuclid from within first. FOP can still use JEuclid for MathML handling. What I can help with right now is to do a review of JEuclid and update the MathML extension for FOP Trunk. This extension can either live here in FOP or in JEuclid. I don't care so much. When the JEuclid release is available and the FOP extension updated we can bundle JEuclid with FOP as we've already decided last year. Same for Barcode4J, BTW. If there are strong voices from within the XML Graphics project to adopt JEuclid we can reevaluate but until then I don't see JEuclid coming to the ASF. 
As a prebuilt JAR under a compatible license, yes, but not as source code. On 03.05.2006 13:17:04 Gennadiy Tsarenkov wrote: Dear Max, results of our work (not the latest one, but still good enough for publishing) is available from cvs.sourceforge.net/cvsroot/jeuclid. Now I'm waiting for feedback from anyone of Apache FOP members on review the JEuclid code. At the moment, I have no idea whether some work has been done in this regard. Present status of our work (we still performing some regression tests, but its 99% already done) is the following: - we have merged JEuclid with FOP 0.20.5 (version available under sf.net is based on one of the pre 0.20.4). - we remove all references to AWT library, so now JEuclid is able to work on screenless servers without XWindows configuration. - support badly fonts (e.g. Arial Unicode MS, which we are enforced to use for MathML rendering). Due to last two issues we extended metric files with extended information about symbol location within the glyph. Plans for merging into 0.90 and later are not exists so far because we do have to improve the following functionality of the FOP itself to satisfy our needs (these features are done by us for FOP 0.20.5). 1. Support of embedded PDF's using itext library. 2. Support of the specific logging into files without any use of console (from my point of view, this is absolutely dummy requirement and we can use log4j, but I'm not in charge to modify it). 3. Support of Arial Unicode MS as TYPE1 font, and not as TYPE0 font. 4. Support of phantom attribute on page-sequence. To hide particular pages in PDF result file but take them into account in page numbering. 5. Support both JDK 1.3 and 1.4. This requires us to integrate support of SAXON parser for JDK 1.4 due to nasty bug in xalan shipped with JDK 1.4. However for JDK 1.3 we still have to keep support of xalan. 6. Improving text layout within the merged table cells for lists (I'guess that for FOP 0.90 it is not the issue anymore). 7. 
Implement specific DTD lookup algorithm (e.g. search for DTD not only in the XML location but also in current directory). From one hand we would like to have the JEuclid integrated into FOP ASAP, however, we cannot follow release schedule of the FOP due to very major changes of FOP due to features above. As far as I understood Jeremias, he recommended us wait until FOray will be integrated into main branch, but this work still not completed (or even canceled???). Concerning licensing issue. We have contacted the JEuclid original publishers and got their permission to re-publish JEuclid to the AFL license as required for FOP integration. Internally, from our side we got all the permissions to publish our work as well. We can commit all necessary statement as required by ASF once our work will be accepted for publication. Concerning your issues: - I would say this is good idea to include jeuclid into xmlgraphics. However, we need some FOP developers
Re[2]: Question about status of JEuclid and possible inclusion in FOP
Hello Jeremias, I'm not saying that this is simple, I'm just saying pre-conditions, when we would be able to continue invest our efforts into integration into FOP. You are absolutely right, saying that there should be interest of the public community to such a work. At present, we get first feedback/requests on our work after about 5 month of absolute silence. This basically means that there is no demand on this feature. Either peoples are using other solutions or they are simply want to have additional feature in a list which says MathML support without any real background for that. Moreover, what I've learned so far is that FOP is not trying to get publication ready PDF's (what is my case) rather then readable XML. Correct me, if I'm wrong. The reasons for rare commits is that on our side there is more then one developer working on the project. While we do not have any feedback, we do not update public repository and just using our private one. There is another weak reason, why I cannot made release on sourceforge.net. I have only CVS commit rights to the JEuclid project and cannot make any releases. My perception is that we will continue making half-year commits to the sf.net repository with patched FOP and JEuclid which fully covers our needs. If somebody would be interested for somebody to integrate into FOP thunk, s/he is welcome. -- Best regards, Gennadiymailto:[EMAIL PROTECTED] Wednesday, May 3, 2006, 3:13:11 PM, you wrote: JM I forgot to CC the jeuclid list when I replied to Max. Here's the link: JM http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200604.mbox/[EMAIL PROTECTED] JM Just in case... JM The whole thing about bringing JEuclid into the XML Graphics project is JM not that simple. As I said before, this would be a new subproject which JM means that it would have to go through the Incubator and not only the IP JM clearance process. 
However, without a live community (at least 2 active JM developers) around the thing it has practically no chance to succeed. I JM have to clarify my position here: I'm not actively using MathML and JM that's why I don't have much time to help here. I'm willing to help JM faciliating a migration if the preconditions are met. The problem is JM that I don't see they are. JEuclid is not unlike Barcode4J (my baby) JM which is also not going to come here mostly because there's no live JM developer community there. Last fall, I was hoping that JEuclid may gain JM some new momentum, but as far as I can see the whole thing was a single JM commit to CVS in December and that was it. Please be aware that such a JM migration is A LOT OF WORK and takes considerable energy. The legal work JM alone is a project in itself. JM So, IMO you should work towards a JEuclid release from within the JM SourceForge project. JEuclid is better served that way. People JM interested in MathML should gather there and strengthen JEuclid from JM within first. FOP can still use JEuclid for MathML handling. What I can JM help with right now is to do a review of JEuclid and update the MathML JM extension for FOP Trunk. This extension can either live here in FOP or JM in JEuclid. I don't care so much. When the JEuclid release is available JM and the FOP extension updated we can bundle JEuclid with FOP as we've JM already decided last year. Same for Barcode4J, BTW. JM If there are strong voices from within the XML Graphics project to adopt JM JEuclid we can reevaluate but until then I don't see JEuclid coming to JM the ASF. As a prebuilt JAR under a compatible license, yes, but not as JM source code. JM On 03.05.2006 13:17:04 Gennadiy Tsarenkov wrote: Dear Max, results of our work (not the latest one, but still good enough for publishing) is available from cvs.sourceforge.net/cvsroot/jeuclid. Now I'm waiting for feedback from anyone of Apache FOP members on review the JEuclid code. 
At the moment, I have no idea whether some work has been done in this regard. Present status of our work (we still performing some regression tests, but its 99% already done) is the following: - we have merged JEuclid with FOP 0.20.5 (version available under sf.net is based on one of the pre 0.20.4). - we remove all references to AWT library, so now JEuclid is able to work on screenless servers without XWindows configuration. - support badly fonts (e.g. Arial Unicode MS, which we are enforced to use for MathML rendering). Due to last two issues we extended metric files with extended information about symbol location within the glyph. Plans for merging into 0.90 and later are not exists so far because we do have to improve the following functionality of the FOP itself to satisfy our needs (these features are done by us for FOP 0.20.5). 1. Support of embedded PDF's using itext library. 2. Support of the specific logging into files without any use of console (from my point of view,
Re: Question about status of JEuclid and possible inclusion in FOP
Hi Gennadiy On 03.05.2006 16:14:41 Gennadiy Tsarenkov wrote: Hello Jeremias, I'm not saying that this is simple, I'm just saying pre-conditions, when we would be able to continue invest our efforts into integration into FOP. You are absolutely right, saying that there should be interest of the public community to such a work. At present, we get first feedback/requests on our work after about 5 month of absolute silence. This basically means that there is no demand on this feature. Either peoples are using other solutions or they are simply want to have additional feature in a list which says MathML support without any real background for that. Moreover, what I've learned so far is that FOP is not trying to get publication ready PDF's (what is my case) rather then readable XML. Correct me, if I'm wrong. It's probably more so that people from the business document side are more likely to invest in FOP than those from the publishing department do. But the good news is that one of my clients requests some level of PDF/X support which clearly goes in the direction of publication ready PDFs. The reasons for rare commits is that on our side there is more then one developer working on the project. While we do not have any feedback, we do not update public repository and just using our private one. There is another weak reason, why I cannot made release on sourceforge.net. I have only CVS commit rights to the JEuclid project and cannot make any releases. Then maybe you should see if you can get admin privileges because you'll exclude probably more than 80% of the possible users if you don't do releases. Some people are simply not into getting source code from CVS. The same applies to FOP, BTW, because the license policy dictates that we shall not use unreleased JARs in our releases. Which means we're basically stuck with JEuclid 2.0 for now. 
My perception is that we will continue making half-year commits to the sf.net repository with patched FOP and JEuclid which fully covers our needs. Not a very promising prospect. That's certainly not how you can attract possible users. If somebody would be interested for somebody to integrate into FOP thunk, s/he is welcome. I'll handle that. snip/ Jeremias Maerki
Re: Question about status of JEuclid and possible inclusion in FOP
Dear Gennadiy, thank you for your very extensive answer. At least now I know what the current status is. So for now I'll use jeuclid externally to convert to svg and then include that into fop. Gennadiy Tsarenkov wrote: The reasons for rare commits is that on our side there is more then one developer working on the project. While we do not have any feedback, we do not update public repository and just using our private one. There is another weak reason, why I cannot made release on sourceforge.net. I have only CVS commit rights to the JEuclid project and cannot make any releases. I would like to see your changes in the jeuclid cvs repository. I have some minor additions to jeuclid for which I would like to provide patches. Thanks Max
Question about status of JEuclid and possible inclusion in FOP
Dear Gennady, Dear developers, I've just recently played around with mathml and tried to include that in my fop documents. I've found several tools, and among others jeuclid. jeuclid is very complete, it is just missing a few adapter classes. I've written a small one to convert mml to svg and it works just fine. I've then found out that there was work done merging jeuclid into fop / xmlgraphics. What is the current status of this? What are the license / technical issues? Is this desired at all? Here is what I would like to see: - include jeuclid in xmlgraphics - add code to fop to support the inclusion of mml documents as external images. - add code to fop to support mml embedded within fo documents i would be willing to provide the first two items, if it is legal to do so... Max
Re: Question about status of JEuclid and possible inclusion in FOP
On 29.04.2006 21:33:17 Max Berger wrote: Dear Gennady, Dear developers, I've just recently played around with mathml and tried to include that in my fop documents. I've found several tools, and among others jeuclid. jeuclid is very complete, it is just missing a few adapter classes. I've written a small one to convert mml to svg and it works just fine. FOP includes a MathML extension for JEuclid. See examples/mathml. I've then found out that there was work done merging jeuclid into fop / xmlgraphics. What is the current status of this? What are the license / technical issues? Is this desired at all? Well, there were talks but as happens so often, not enough energy was behind it. The ASF is also not going to adopt a codebase which is not actively maintained and supported. Here is what I would like to see: - include jeuclid in xmlgraphics I don't see this happening without a live community behind JEuclid. - add code to fop to support the inclusion of mml documents as external images. Easily done. Just implement an XMLHandler for MathML which converts the MathML to SVG. If it can optionally paint to a Graphics2D instance directly all the better (some Renderers provide a Graphics2DAdapter). - add code to fop to support mml embedded within fo documents Already available but the extension could profit from a touch-up if it provides an XMLHandler implementation. i would be willing to provide the first two items, if it is legal to do so... Perfectly legal. JEuclid is published under the Apache License. We can even ship it with FOP, which BTW we decided to do. But after the MathML discussion trailed off I left that be mostly because I had more important things on my list. I don't need MathML myself. I'm only available to facilitate the integration if there are people actively pushing it. Jeremias Maerki
Re: Another page-related question: page-position=last
Ah, so we need to define first, what we really want to expect. :-) Does the spec say anything about the expected behaviour? On 02.10.2005 00:57:07 J.Pietschmann wrote: Jeremias Maerki wrote: On 27.09.2005 16:38:23 Luca Furini wrote: [the usual layout oscillation/convergence problem] What is the expected output? In this case it has to generate a blank page IMO. The expected output is that there is some content (area with bpd0) on the last page, even if this sounds suboptimal. J.Pietschmann Jeremias Maerki
Re: Another page-related question: page-position=last
On Oct 3, 2005, at 16:11, Jeremias Maerki wrote: On 02.10.2005 00:57:07 J.Pietschmann wrote: Jeremias Maerki wrote: On 27.09.2005 16:38:23 Luca Furini wrote: [the usual layout oscillation/convergence problem] What is the expected output? In this case it has to generate a blank page IMO. The expected output is that there is some content (area with bpd0) on the last page, even if this sounds suboptimal. Ah, so we need to define first, what we really want to expect. :-) Does the spec say anything about the expected behaviour? I believe this can be (more or less) inferred from the fact that there are actually three sub-conditions, namely: page-position, odd-or-even and blank-or-not-blank. Given that: a) all three sub-conditions have to be true for the condition on the fo:conditional-page-master to be true (= to make it eligible for selection) b) the initial value for 'blank-or-not-blank' is 'any' Then I'd conclude that both the described expected outputs --blank page or one filled with some content-- are allowed in case there is no explicit blank-or-not-blank sub-condition specified on the fo:c-p-m in question (or an explicit any, which comes down to the same thing). If and only if you have a fo:c-p-m with both page-position=last and blank-or-not-blank=blank, the output of one blank last page is the only correct output. Same thing for Joerg's expectation, which is the only correct output in case you have page-position=last and blank-or-not-blank=not-blank. Note: in both cases I'm assuming this to be the only fo:c-p-m with a condition page-position=last. I'm not absolutely sure, so correct me if I'm wrong... For instance: I'm wondering whether the conditions *have* to be met, so that the layout-engine would, if necessary, have to perform all sorts of magic tricks to force the content to meet the criteria, or whether OTOH, the layout-engine only *has* to choose a particular fo:c-p-m if the criteria actually are met (?) Anyone? Cheers, Andreas
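The three sub-conditions discussed above can be sketched as a simple conjunction. This is only an illustration of the "eligible if and only if all sub-conditions hold" reading; it does not settle whether the layout engine must force the content to satisfy a condition. The class and method names are hypothetical and are not FOP's API. The allowed values follow the XSL 1.0 properties page-position (first, last, rest, any), odd-or-even (odd, even, any) and blank-or-not-blank (blank, not-blank, any).

```java
// Hypothetical model of the conditions on one conditional page master; not FOP code.
class PageMasterCondition {
    enum PagePosition { FIRST, LAST, REST, ANY }
    enum OddOrEven { ODD, EVEN, ANY }
    enum BlankOrNotBlank { BLANK, NOT_BLANK, ANY }

    private final PagePosition pagePosition;
    private final OddOrEven oddOrEven;
    private final BlankOrNotBlank blankOrNotBlank;

    PageMasterCondition(PagePosition pagePosition, OddOrEven oddOrEven,
                        BlankOrNotBlank blankOrNotBlank) {
        this.pagePosition = pagePosition;
        this.oddOrEven = oddOrEven;
        this.blankOrNotBlank = blankOrNotBlank;
    }

    /** All three sub-conditions must be true for the master to be eligible. */
    boolean isEligible(boolean isFirst, boolean isLast, int pageNumber, boolean isBlank) {
        return matchesPosition(isFirst, isLast)
            && matchesParity(pageNumber)
            && matchesBlank(isBlank);
    }

    private boolean matchesPosition(boolean isFirst, boolean isLast) {
        switch (pagePosition) {
            case FIRST: return isFirst;
            case LAST:  return isLast;
            case REST:  return !isFirst && !isLast;
            default:    return true; // ANY
        }
    }

    private boolean matchesParity(int pageNumber) {
        switch (oddOrEven) {
            case ODD:  return pageNumber % 2 != 0;
            case EVEN: return pageNumber % 2 == 0;
            default:   return true; // ANY
        }
    }

    private boolean matchesBlank(boolean isBlank) {
        switch (blankOrNotBlank) {
            case BLANK:     return isBlank;
            case NOT_BLANK: return !isBlank;
            default:        return true; // ANY, the initial value
        }
    }
}
```

With page-position=last and the default blank-or-not-blank=any, the check accepts both a blank and a non-blank last page, which matches the conclusion that either output is allowed when no explicit blank constraint is given.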
Re: Another page-related question: page-position=last
On Oct 3, 2005, at 21:22, J.Pietschmann wrote: Umm, emm, blank means no area at all on the page body, not even one with bpd=0. E.g. <fo:flow ...> <fo:block/> <fo:block break-before="page"/> </fo:flow> would create two non-blank pages. Or so I think. I see and fully agree, but IIRC, one of the ideas was to create a sort of dummy-area (internally, not corresponding to a FO in the document), so that the last page-master would always be used, even if the next-to-last page can hold all of the remaining content at a given point. AFAICT, this would *only* be acceptable if the page-master in question doesn't have an explicit non-blank constraint. In any case, you're right about the above example generating two non-blank (yet empty) pages. So, a page-master with explicit blank constraint can never be used for either of them. But now, I'm still wondering about the last question in my earlier post... :-/ If we have: <fo:conditional-page-master page-position="last" blank-or-not-blank="not-blank"/> a) Does this page-master have to be used, period? (= If there are no areas left to fill up the last page, layout needs to backtrack in order to satisfy both conditions.) b) Does it have to be used only in case both criteria are met at the same time? (= If there are no areas left to fill up an eventual last page, then this page-master is simply never eligible for selection.) I'm inclined to believe b), but again, not absolutely --or: absolutely not-- certain about this... Make it page-position=last and odd-or-even=odd: does that mean that we have to make sure that the page-sequence always contains an odd number of pages, or that that page-master is eligible for selection only if the last page turns out to have an odd number? Cheers, Andreas
Re: Another page-related question: page-position=last
Jeremias Maerki wrote: On 27.09.2005 16:38:23 Luca Furini wrote: [the usual layout oscillation/convergence problem] What is the expected output? In this case it has to generate a blank page IMO. The expected output is that there is some content (area with bpd0) on the last page, even if this sounds suboptimal. J.Pietschmann
Re: Another page-related question: page-position=last
Jeremias Maerki wrote: What is the expected output? In this case it has to generate a blank page IMO. Oh, right, I did not think of an empty page! :-) The problem is with the page x of y hack that won't work like this if the last empty block ends up on the second-to-last page. [...] What about the following approach? Run the breaker without special last-page handling, then inspect the allocated BPD for the last part. If it fits into the last page, just exchange the page-master (*) and paint it there. If it doesn't fit, paint it using the non-last page-master and add a blank page with the last page-master. If there's a box w=0 at the end of the element list, force a new part and paint that on the last page to handle the page x of y case. I think this would work with my idea too: in this case, if the last empty block and the difference in page bpd (that cannot be parted) do not fit in the non-last page under construction, they would be placed in a new page; so, a page-number-citation pointing to the empty block would return the last page-number. This would avoid the need to exchange page-masters, and to have a special handling for zero-width box at the end of the sequence. Regards Luca
Re: question
On 26.09.2005 19:09:59 Sergey Simonchik wrote: Hi, Letters A and B have same indent from the left edge of page in this example: ... <fo:block border-style="solid" border-width="20pt" border-color="red">A <fo:block border-style="solid" border-width="20pt" border-color="black">B</fo:block> </fo:block> ... Here behaviors of block-progression and inline-progression are different. Is it correct? Yes, that's right. In your example the start-indent and end-indent trait is the same for both blocks. This has to do with the rules established in 5.3.2 in XSL 1.0. Jeremias Maerki
Re: question
On 27.09.2005 09:25:15 Jeremias Maerki wrote:
> On 26.09.2005 19:09:59 Sergey Simonchik wrote:
>> Hi,
>> Letters A and B have same indent from the left edge of page in this example:
>> ...
>> <fo:block border-style="solid" border-width="20pt" border-color="red">A
>>   <fo:block border-style="solid" border-width="20pt" border-color="black">B</fo:block>
>> </fo:block>
>> ...
>> Here behaviors of block-progression and inline-progression are different. Is it correct?
>
> Yes, that's right. In your example the start-indent and end-indent traits are the same for both blocks. This has to do with the rules established in 5.3.2 in XSL 1.0.

I may need to add something to that. You can change the behaviour to what I assume you expect: just add margin="0pt" to your example.

<fo:block margin="0pt" border-style="solid" border-width="20pt" border-color="red">A
  <fo:block margin="0pt" border-style="solid" border-width="20pt" border-color="black">B</fo:block>
</fo:block>

Extensive documentation about indents can be found here: http://wiki.apache.org/xmlgraphics-fop/IndentInheritance

Jeremias Maerki
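As a rough worked example of why margin="0pt" changes the result, here is a small sketch. It assumes the rule from XSL 1.0 section 5.3.2 for a block that does not generate a reference area: with no margin specified, start-indent is simply inherited; with a margin specified, start-indent is the inherited start-indent plus the corresponding margin, padding and border width. The class and method names are made up for illustration and are not FOP code.

```java
final class IndentSketch {

    /**
     * Computes start-indent in points for a block that does not generate
     * a reference area (assumed reading of XSL 1.0, section 5.3.2).
     * A null marginStart means "no margin specified on this block".
     */
    static double startIndent(double inheritedStartIndent, Double marginStart,
                              double paddingStart, double borderStart) {
        if (marginStart == null) {
            // No margin given: the start-indent trait is inherited unchanged.
            return inheritedStartIndent;
        }
        // Margin given: indent accumulates margin, padding and border.
        return inheritedStartIndent + marginStart + paddingStart + borderStart;
    }

    public static void main(String[] args) {
        double flowIndent = 0.0;

        // Original example: 20pt borders, no margin specified.
        double outerA = startIndent(flowIndent, null, 0.0, 20.0);
        double innerA = startIndent(outerA, null, 0.0, 20.0);
        System.out.println(outerA + " " + innerA); // prints "0.0 0.0"

        // Modified example: margin="0pt" on both blocks.
        double outerB = startIndent(flowIndent, 0.0, 0.0, 20.0);
        double innerB = startIndent(outerB, 0.0, 0.0, 20.0);
        System.out.println(outerB + " " + innerB); // prints "20.0 40.0"
    }
}
```

Under that assumption both blocks keep a start-indent of 0pt in the original example, which is why A and B line up, while the margin="0pt" variant pushes the inner block's content 20pt further in.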
Re: Another page-related question: page-position=last
Jeremias Maerki wrote:
> It's an interesting idea. However, I suspect this will probably not be necessary. We should be able to make the breaker clever enough to handle this particular case.

When the page bpd depends on the page-masters, things become very strange. Not only is it difficult to implement the page-master choice, it is even difficult to understand what the expected result should be! :-)

For example: let's suppose the breaker is working, and it has to place the last 25 lines of a page-sequence. The page-master for the last page has a bpd allowing no more than 20 lines, while the other page-masters can contain up to 30 lines. What happens? If the breaker starts building a last page, it soon realizes that the page would not contain all the remaining content, so it would no longer be a last page. But if it starts building a non-last page, it reaches the end of the content and has to turn it into a last page, which is impossible.

What is the expected output? The only way I see to satisfy the property is to create two more pages: one non-last page, partially empty, with fewer than 25 lines (24 or fewer, if there are keeps, widows or orphans) and a last page with the remaining lines.

This sort of problem happens only if the last page is smaller than the previous ones: otherwise, the breaker can always try to build a non-last page, possibly moving all its content into a last page.

Now that I think of it ... an idea that could work, at least when the non-last pages all have the same bpd and the last page a smaller one, would be to slightly modify the elements appended at the end of the sequence, so that they have a width equal to the difference (nonLastBPD - lastBPD). This way, the last page created by the breaker will have an apparent width of nonLastBPD, but the content placed inside it will have an overall bpd equal to

nonLastBPD - (nonLastBPD - lastBPD) = lastBPD

What do you think?

Regards
Luca
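The padding idea above can be tried out on the 25-line example with a very small sketch. It measures bpd in whole lines and uses a greedy first-fit loop rather than the Knuth total-fit breaker, so it only illustrates the arithmetic; `LastPagePadding` and `paginate` are hypothetical names, not FOP classes.

```java
import java.util.ArrayList;
import java.util.List;

final class LastPagePadding {

    /**
     * Returns the number of content lines placed on each page when an
     * unbreakable padding of (nonLastCapacity - lastCapacity) lines is
     * appended to the content and every page is broken at nonLastCapacity.
     * The final entry is the last page.
     */
    static List<Integer> paginate(int lines, int nonLastCapacity, int lastCapacity) {
        if (nonLastCapacity <= 0 || lastCapacity < 0 || lastCapacity > nonLastCapacity) {
            throw new IllegalArgumentException("need 0 <= lastCapacity <= nonLastCapacity, nonLastCapacity > 0");
        }
        int padding = nonLastCapacity - lastCapacity;
        List<Integer> pages = new ArrayList<>();
        int remaining = lines;
        // While the rest of the content plus the padding does not fit,
        // fill one more non-last page.
        while (remaining + padding > nonLastCapacity) {
            int placed = Math.min(remaining, nonLastCapacity);
            pages.add(placed);
            remaining -= placed;
        }
        // What is left shares the last page with the padding, so it is
        // guaranteed to be at most lastCapacity lines.
        pages.add(remaining);
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(paginate(25, 30, 20)); // prints [25, 0]
        System.out.println(paginate(15, 30, 20)); // prints [15]
        System.out.println(paginate(45, 30, 20)); // prints [30, 15]
    }
}
```

With this greedy loop the 25-line case ends in a full non-last page followed by an empty last page; a total-fit breaker could distribute the lines differently, but the last page would still never hold more than lastBPD of content.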
RE: question
Thank you.

-Original Message-
From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
Sent: Tue 27.09.2005 12:12
To: fop-dev@xmlgraphics.apache.org
Cc:
Subject: Re: question

On 27.09.2005 09:25:15 Jeremias Maerki wrote:
> On 26.09.2005 19:09:59 Sergey Simonchik wrote:
>> Hi,
>> Letters A and B have same indent from the left edge of page in this example:
>> ...
>> <fo:block border-style="solid" border-width="20pt" border-color="red">A
>>   <fo:block border-style="solid" border-width="20pt" border-color="black">B</fo:block>
>> </fo:block>
>> ...
>> Here behaviors of block-progression and inline-progression are different. Is it correct?
>
> Yes, that's right. In your example the start-indent and end-indent traits are the same for both blocks. This has to do with the rules established in 5.3.2 in XSL 1.0.

I may need to add something to that. You can change the behaviour to what I assume you expect: just add margin="0pt" to your example.

<fo:block margin="0pt" border-style="solid" border-width="20pt" border-color="red">A
  <fo:block margin="0pt" border-style="solid" border-width="20pt" border-color="black">B</fo:block>
</fo:block>

Extensive documentation about indents can be found here: http://wiki.apache.org/xmlgraphics-fop/IndentInheritance

Jeremias Maerki
Re: Another page-related question: page-position=last
On 27.09.2005 16:38:23 Luca Furini wrote: Jeremias Maerki wrote: It's an interesting idea. However, I suspect this will probably not be necessary. We should be able to make the breaker clever enough to handle this particular case. When the page bpd depends on the page-masters, things becomes very strange. Not only it's difficult to implement the page-master choice, but even to understand what should be the expected result! :-) For example: let's suppose the breaker is working, and it has to place the last 25 lines of a page-sequence. The page-master for the last page has a bpd allowing no more than 20 lines, while the other page-masters can contain up to 30 lines. What happens? If the breaker starts building a last page it soon realizes that it would not contain all the remaining content, so it would be no more a last page. But if it starts building a non-last page, it reaches the end of the content, and has to turn it into a last page, which is impossible. What is the expected output? In this case it has to generate a blank page IMO. Note that this is a scenario that is important for Switzerland where we have Einzahlungsscheine (a preprinted form used for payments). These are often expected to be on the last page of the document. So far I always had to persuade my clients that putting the Einzahlungsschein on the first page is not so bad. :-) The problem is with the page x of y hack that won't work like this if the last empty block ends up on the second-to-last page. The only way I see to satisfy the property is to create two more pages: one non-last page, partially empty, with less than 25 lines (24 or fewer, if there are keeps, widows or orphans) and a last page with the remaining lines. That sounds suboptimal. The less breaks, the better. Better have a blank page. 
This sort of problems happens only if the last page is smaller than the previous ones: otherwise, the breaker can always try to build a non-last page, eventually moving all its content into a last page. Now I think of this ... an idea, that could work at least when the non-last pages have the same bpd and the last page a smaller one, could be to modify a little the elements appended at the end of the sequence, so that they have a width equal to the difference (nonLastBPD - lastBPD). This way, the last page created by the breaker will have an apparent width of nonLastBPD, but the content placed inside it will have an overall bpd equal to nonLastBPD - (nonLastBPD - lastBPD) = lastBPD What do you think? I don't get it, yet. What about the following approach? Run the breaker without special last-page handling, then inspect the allocated BPD for the last part. If it fits into the last page, just exchange the page-master (*) and paint it there. If it doesn't fit, paint it using the non-last page-master and add a blank page with the last page-master. If there's a box w=0 at the end of the element list, force a new part and paint that on the last page to handle the page x of y case. (*) Doesn't work if the available IPD is different. A restart will be necessary in this case which could result in an overflow of the last page which means that the content can still be reset to the old page-master and a blank last page has to be generated. Jeremias Maerki
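The post-processing approach proposed in this message (run the breaker normally, then look at the last part) boils down to a small decision. The sketch below encodes just that decision; the enum, the method and its parameters are hypothetical and do not correspond to existing FOP classes, and the trailing zero-width box for the "page x of y" case is only mentioned in a comment, not modelled.

```java
final class LastPageDecision {

    enum Action {
        /** Last part fits and IPD is unchanged: repaint it with the last page-master. */
        SWAP_PAGE_MASTER,
        /** Last part fits in BPD but IPD differs: line breaking must be redone. */
        RESTART_WITH_LAST_MASTER,
        /** Last part does not fit: keep the non-last master and add a blank last page. */
        APPEND_BLANK_LAST_PAGE
    }

    /**
     * Decides what to do with the last part after a normal breaker run.
     * All lengths use the same unit. A box of width 0 at the end of the
     * element list would additionally be forced onto the final page.
     */
    static Action decide(int lastPartBpd, int lastPageBpd, boolean sameIpd) {
        if (lastPartBpd > lastPageBpd) {
            return Action.APPEND_BLANK_LAST_PAGE;
        }
        return sameIpd ? Action.SWAP_PAGE_MASTER : Action.RESTART_WITH_LAST_MASTER;
    }

    public static void main(String[] args) {
        System.out.println(decide(25, 20, true));  // prints APPEND_BLANK_LAST_PAGE
        System.out.println(decide(15, 20, true));  // prints SWAP_PAGE_MASTER
        System.out.println(decide(15, 20, false)); // prints RESTART_WITH_LAST_MASTER
    }
}
```

Note that, as the footnote in the message says, a restart may itself overflow the last page; in that case the outcome would fall back to the blank-last-page branch, which this sketch does not simulate.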
Re: Another page-related question: page-position=last
It's an interesting idea. However, I suspect this will probably not be necessary. We should be able to make the breaker clever enough to handle this particular case. ATM, I don't have free brain capacity to dive into this (even though this is an important and long-awaited feature) but it may make sense to gather ideas and notes from the mailing list archives and put these on a Wiki page. I remember that several approaches have been discussed in the past even though they haven't had anything to do with our current breaking approach. On 23.09.2005 20:28:14 Andreas L Delmelle wrote: Hi, (Apologies for the many posts... I'm definitely on a roll :-)) Now that it has been made clear to me that the layout-engine first calculates *all* break-possibilities, IMO this also seems to make implementing page-position=last much, much easier. Assuming that no areas are generated until the full element list is known, I'm thinking once you have created the Knuth-element list, instead of starting at element index zero to generate the areas, it would also be possible to have the algorithm start at the last element in the list until we reach the first break-possibility that would make the content exceed the height page-master whose page-position is last, no? At the very least, it would enable us to mark the break-possibility at which the algorithm should consider the last page-master... Obviously, we can't complete the full iteration in that direction, or we would run into problems determining the difference between odd and even, but it seems possible to iterate backwards once and assign a sort of 'favorability degree' to each of break-possibilities --increase or decrease penalty values?--, taking into account that the last page *has* to start somewhere after the marked one --before is impossible, since it won't fit. The generated last page can be thrown away (or serialized to be re-used if the marked possibility just happens to coincide with the actual last page-break). 
Am I getting this correctly? Jeremias Maerki
Re: Another page-related question: page-position=last
On Sep 26, 2005, at 16:03, Jeremias Maerki wrote: It's an interesting idea. However, I suspect this will probably not be necessary. We should be able to make the breaker clever enough to handle this particular case. ATM, I don't have free brain capacity to dive into this (even though this is an important and long-awaited feature) but it may make sense to gather ideas and notes from the mailing list archives and put these on a Wiki page. I remember that several approaches have been discussed in the past even though they haven't had anything to do with our current breaking approach. OK, I was just probing... Since that 'spark' --referring to your recent response to my completely misguided idea on tables ;-)-- lit my fire, I'm beginning to see all sorts of possibilities. Some things are falling into place now --been re-reading the Wiki docs, and browsing through the Knuth-related code, and this all makes very much more sense now. (Haven't read Knuth's book/paper myself yet, so that was still a bit of a mystery to me) When I find some time, I'll see what ideas I can dig up from the archives, and how they may possibly relate (or be translated) to our current layout code. Cheers, Andreas
Re: inline alignment question
FWIW I just downloaded the evaluation version of the Antennahouse Formatter and it generates as per my option b). For all to see here is the fo:

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:svg="http://www.w3.org/2000/svg">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="normal" page-width="5in" page-height="5in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="normal" white-space-collapse="true">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="24pt">Start-p<fo:inline font-size="12pt" background-color="yellow" vertical-align="top">top<fo:inline font-size="18pt" background-color="red" vertical-align="bottom">p-bottom-g</fo:inline></fo:inline>g-End</fo:block>
      <fo:block font-size="24pt">Start<fo:inline font-size="12pt" background-color="yellow" vertical-align="top">top<fo:inline font-size="18pt" background-color="red" vertical-align="bottom">p-bottom-g</fo:inline></fo:inline>End</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>

and I'll attach both the RenderX and AntennaHouse pdf outputs (I did put those strange letters 'g' and 'p' in to be able to see the font descenders clearly). IMO, AntennaHouse got the vertical alignment right but, and this was discussed in another thread mainly between me and Finn, I think RenderX got the highlight correct, i.e. the inline with the word top has only the small line-height. Note also that in the AntennaHouse PDF the red background does not quite reach to the bottom of the descenders.

Now, the BIG question is, what should FOP do or 'what is the right way'?

Manuel

On Sun, 25 Sep 2005 09:49 am, Manuel Mall wrote: Andreas, let me start with thanking you for taking the time to look into this and respond so quickly. This is much appreciated. This post arrived here after midnight by which time I was sound asleep. I only got to look at it this morning. 
On Sun, 25 Sep 2005 12:39 am, Andreas L Delmelle wrote: On Sep 24, 2005, at 17:39, Manuel Mall wrote: On Sat, 24 Sep 2005 11:27 pm, Andreas L Delmelle wrote: Anyway, the full description would be: The alignment-baseline on the first inline is aligned with the before-edge baseline of the outer block. Now, IIC, this has an impact on its own after-edge baseline, which is then in its turn the basis for the alignment-baseline of the innermost inline (?) You are right - this is exactly the question: Does it have an impact on its after-edge baseline or not? Intuitively I would say YES but the spec says NO the baseline table is not recalculated (rescaled) when the font-size changes. I think I got it. Correct me if I'm wrong... The description in the Rec applies to cases where only the font-size changes, and all the alignment-related properties have a value of auto... The baseline-table seems to only be recalculated on a baseline-shift but not otherwise. ... but the baseline-shift value for the first inline is non-zero (value baseline is not equal to value 0), so the Rec says in 7.13.3: I think that is the core point. IMO the baseline-shift for the first inline is 0. Yes, there is a change of alignment-baseline but NO shift of any baseline. As you pointed out before vertical-align=top is equivalent to: alignment-baseline=before-edge alignment-adjust=auto baseline-shift=baseline dominant-baseline=auto And 7.13.3. says for baseline-shift=baseline: There is no baseline-shift; the dominant baseline remains in its original position. So neither changing the font-size nor changing the vertical-align to top or bottom involves a baseline-shift and therefore the original baseline-table stays in place. When the value of 'baseline-shift' is other than '0', then the baseline-table font-size component of the 'dominant-baseline' property is re-computed to use the 'font-size' applicable to the formatting object on which the non-zero 'baseline-shift' property is specified. 
So, the dominant-baseline *is* re-computed for the first inline it seems, although this isn't apparent if you look only at the description for the baseline value: There is no baseline shift; the dominant-baseline remains in its original position. Easy to get confused by this, but still, I think your original a) applies here. Problem solved? :-) I still think according to the spec its b). :-) Cheers, Andreas Manuel inline_vertical-align_3.xml.axf.pdf Description: Adobe PDF document inline_vertical-align_3.xml.xep.pdf Description: Adobe PDF document
Re: inline alignment question
On Sun, 25 Sep 2005 06:12 pm, Andreas L Delmelle wrote: On Sep 25, 2005, at 03:49, Manuel Mall wrote: snip/ ... but the baseline-shift value for the first inline is non-zero (value baseline is not equal to value 0), so the Rec says in 7.13.3: I think that is the core point. IMO the baseline-shift for the first inline is 0. Yes, there is a change of alignment-baseline but NO shift of any baseline. As you pointed out before vertical-align=top is equivalent to: alignment-baseline=before-edge alignment-adjust=auto baseline-shift=baseline dominant-baseline=auto And 7.13.3. says for baseline-shift=baseline: There is no baseline-shift; the dominant baseline remains in its original position. Exactly the source of the confusion. If it had been: alignment-baseline=before-edge baseline-shift=0 That would have been something different. As I understand, the baseline value for baseline-shift is precisely meant to distinguish between the 'no-shift' cases where the baseline-table component does or does not need to be re-computed... (baseline != 0) First, the baseline-table component is re-computed, and then 'the dominant-baseline remains in its original position' (= after the re-computation) This is where we disagree and I think I have the spec on my side as it says in 7.13.3 that a value of 0 is equivalent to baseline (baseline == 0). This is mentioned under both percentage and length. Therefore IMO the spec does not make a semantic distinction between a value of baseline and a value of 0. Actually I would argue that the computed value of the specified value baseline is 0. So neither changing the font-size nor changing the vertical-align to top or bottom involves a baseline-shift and therefore the original baseline-table stays in place. I hate to be a such pest, but I disagree :-P That's OK and no your are not a pest because if you are I would be an even worse pest (what a horrible thought...). But I really think this a quite important stuff. 
In the end positioning a glyph correctly, that is in accordance with the rules of the specification given a set of formatting instructions provided by the user, is the core function of a formatter. If we can't get that right why do we bother? And to get this right we have to understand it first which is what this exchange is all about. So keep coming... :-) BTW: Where's all the others? Ah well, it's weekend after all... :-) Cheers, Andreas Cheers Manuel
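To make the point of disagreement in this exchange concrete, the two readings of 7.13.3 can be written down side by side. This is only a sketch of the argument, not of FOP's property code: the class, the enum and the very small set of recognised zero values ("0", "0pt", "0%") are assumptions made for illustration.

```java
final class BaselineShiftReadings {

    enum Reading {
        /** Manuel's reading: the value "baseline" is equivalent to a shift of 0. */
        BASELINE_EQUALS_ZERO,
        /** Andreas's reading: "baseline" is distinct from 0 and triggers re-computation. */
        BASELINE_DIFFERS_FROM_ZERO
    }

    /**
     * Says whether the baseline-table font-size component would be
     * re-computed for the given baseline-shift value under each reading.
     * Only the literal zero values "0", "0pt" and "0%" are recognised.
     */
    static boolean recomputesBaselineTable(String baselineShift, Reading reading) {
        if (baselineShift.equals("baseline")) {
            return reading == Reading.BASELINE_DIFFERS_FROM_ZERO;
        }
        boolean isZero = baselineShift.equals("0")
                || baselineShift.equals("0pt")
                || baselineShift.equals("0%");
        // Both readings agree: any non-zero shift re-computes the table.
        return !isZero;
    }

    public static void main(String[] args) {
        // vertical-align="top" expands to baseline-shift="baseline".
        System.out.println(recomputesBaselineTable("baseline", Reading.BASELINE_EQUALS_ZERO));       // prints false
        System.out.println(recomputesBaselineTable("baseline", Reading.BASELINE_DIFFERS_FROM_ZERO)); // prints true
        System.out.println(recomputesBaselineTable("0pt", Reading.BASELINE_DIFFERS_FROM_ZERO));      // prints false
    }
}
```

The only input on which the two readings differ is the keyword "baseline" itself, which is exactly the case that vertical-align="top" and vertical-align="bottom" produce.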
Re: inline alignment question
On Sep 24, 2005, at 17:22, Manuel Mall wrote: On Sat, 24 Sep 2005 11:04 pm, Andreas L Delmelle wrote: It seems then that the vertical-align on the innermost inline actually refers to the after-edge, which IIC would be relative to the after-edge of its parent (and not that of the block ancestor). So I'd agree with your hunch and RenderX here... Andreas, thanks for the quick response. And yes I agree with your expansion. And yes this means aligned with the 'after-edge' baseline of the parent area. But what is the 'after-edge' baseline of the parent area? (Note it doesn't say aligned with the 'after-edge' of the parent is says aligned with the 'after-edge' baseline of the parent) Yeah, sorry, I was being too fast here and forgot a few important terms. Anyway, the full description would be: The alignment-baseline on the first inline is aligned with the before-edge baseline of the outer block. Now, IIC, this has an impact on its own after-edge baseline, which is then in its turn the basis for the alignment-baseline of the innermost inline (?) That was actually more the point I wanted to make. Cheers, Andreas
Re: inline alignment question
On Sep 24, 2005, at 17:39, Manuel Mall wrote: On Sat, 24 Sep 2005 11:27 pm, Andreas L Delmelle wrote: Anyway, the full description would be: The alignment-baseline on the first inline is aligned with the before-edge baseline of the outer block. Now, IIC, this has an impact on its own after-edge baseline, which is then in its turn the basis for the alignment-baseline of the innermost inline (?) You are right - this is exactly the question: Does it have an impact on its after-edge baseline or not? Intuitively I would say YES but the spec says NO the baseline table is not recalculated (rescaled) when the font-size changes. I think I got it. Correct me if I'm wrong... The description in the Rec applies to cases where only the font-size changes, and all the alignment-related properties have a value of auto... The baseline-table seems to only be recalculated on a baseline-shift but not otherwise. ... but the baseline-shift value for the first inline is non-zero (value baseline is not equal to value 0), so the Rec says in 7.13.3: When the value of 'baseline-shift' is other than '0', then the baseline-table font-size component of the 'dominant-baseline' property is re-computed to use the 'font-size' applicable to the formatting object on which the non-zero 'baseline-shift' property is specified. So, the dominant-baseline *is* re-computed for the first inline it seems, although this isn't apparent if you look only at the description for the baseline value: There is no baseline shift; the dominant-baseline remains in its original position. Easy to get confused by this, but still, I think your original a) applies here. Problem solved? :-) Cheers, Andreas
Re: inline alignment question
Andreas, let me start with thanking you for taking the time to look into this and respond so quickly. This is much appreciated. This post arrived here after midnight by which time I was sound asleep. I only got to look at it this morning. On Sun, 25 Sep 2005 12:39 am, Andreas L Delmelle wrote: On Sep 24, 2005, at 17:39, Manuel Mall wrote: On Sat, 24 Sep 2005 11:27 pm, Andreas L Delmelle wrote: Anyway, the full description would be: The alignment-baseline on the first inline is aligned with the before-edge baseline of the outer block. Now, IIC, this has an impact on its own after-edge baseline, which is then in its turn the basis for the alignment-baseline of the innermost inline (?) You are right - this is exactly the question: Does it have an impact on its after-edge baseline or not? Intuitively I would say YES but the spec says NO the baseline table is not recalculated (rescaled) when the font-size changes. I think I got it. Correct me if I'm wrong... The description in the Rec applies to cases where only the font-size changes, and all the alignment-related properties have a value of auto... The baseline-table seems to only be recalculated on a baseline-shift but not otherwise. ... but the baseline-shift value for the first inline is non-zero (value baseline is not equal to value 0), so the Rec says in 7.13.3: I think that is the core point. IMO the baseline-shift for the first inline is 0. Yes, there is a change of alignment-baseline but NO shift of any baseline. As you pointed out before vertical-align=top is equivalent to: alignment-baseline=before-edge alignment-adjust=auto baseline-shift=baseline dominant-baseline=auto And 7.13.3. says for baseline-shift=baseline: There is no baseline-shift; the dominant baseline remains in its original position. So neither changing the font-size nor changing the vertical-align to top or bottom involves a baseline-shift and therefore the original baseline-table stays in place. 
When the value of 'baseline-shift' is other than '0', then the baseline-table font-size component of the 'dominant-baseline' property is re-computed to use the 'font-size' applicable to the formatting object on which the non-zero 'baseline-shift' property is specified. So, the dominant-baseline *is* re-computed for the first inline it seems, although this isn't apparent if you look only at the description for the baseline value: There is no baseline shift; the dominant-baseline remains in its original position. Easy to get confused by this, but still, I think your original a) applies here. Problem solved? :-) I still think according to the spec its b). :-) Cheers, Andreas Manuel
Another page-related question: page-position=last
Hi, (Apologies for the many posts... I'm definitely on a roll :-)) Now that it has been made clear to me that the layout-engine first calculates *all* break-possibilities, IMO this also seems to make implementing page-position=last much, much easier. Assuming that no areas are generated until the full element list is known, I'm thinking once you have created the Knuth-element list, instead of starting at element index zero to generate the areas, it would also be possible to have the algorithm start at the last element in the list until we reach the first break-possibility that would make the content exceed the height page-master whose page-position is last, no? At the very least, it would enable us to mark the break-possibility at which the algorithm should consider the last page-master... Obviously, we can't complete the full iteration in that direction, or we would run into problems determining the difference between odd and even, but it seems possible to iterate backwards once and assign a sort of 'favorability degree' to each of break-possibilities --increase or decrease penalty values?--, taking into account that the last page *has* to start somewhere after the marked one --before is impossible, since it won't fit. The generated last page can be thrown away (or serialized to be re-used if the marked possibility just happens to coincide with the actual last page-break). Am I getting this correctly? Cheers, Andreas
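The backward pass described above can be sketched as follows. The example treats the element list as a plain array of bpd values and assumes a break is legal before every element, which is a strong simplification of a real Knuth box/glue/penalty list; `BackwardScan` and `earliestLastPageStart` are invented names.

```java
final class BackwardScan {

    /**
     * Walks the element list from its end and returns the smallest index i
     * such that elements i..end together still fit into lastPageBpd.
     * The last page cannot start before that index. Returns widths.length
     * if not even the final element fits.
     */
    static int earliestLastPageStart(int[] widths, int lastPageBpd) {
        int sum = 0;
        int start = widths.length;
        for (int i = widths.length - 1; i >= 0; i--) {
            sum += widths[i];
            if (sum > lastPageBpd) {
                break; // element i no longer fits on the last page
            }
            start = i;
        }
        return start;
    }

    public static void main(String[] args) {
        // 25 lines of height 1, last page holds 20 lines.
        int[] lines = new int[25];
        java.util.Arrays.fill(lines, 1);
        System.out.println(earliestLastPageStart(lines, 20)); // prints 5
    }
}
```

The returned index is the "marked" break possibility: a forward run of the breaker could then raise the penalty of, or rule out, any final break that lies before it.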
FOTree test question
Hi,

I can describe the case only, since I can't commit my changes yet (they still break a few layout tests), so please ask further if you don't have enough info to form an idea of my changes. Maybe better to wait to when I commit my stuff to a branch --tomorrow or Thursday--, so you get a clearer view, but anyway...

Sketch of the changes: In my local copy, I have added a new Maker subclass for the column-number property. It works in combination with an added columnIndex instance variable in the related FObjs (Table/TableBody/TableRow), which is updated in addChildNode() (if the child-node is a TableColumn/TableCell).

Very roughly, the new ColumnNumberProperty.make() does something like this:

    if( fo.getNameId() == FO_TABLE_CELL
            || fo.getNameId() == FO_TABLE_COLUMN ) {
        //return the parent's current columnIndex
    } else {
        //tried two things here:
        //1) return null
        //2) throw a PropertyException
    }

All seems to work nicely if I trace it using log.debug(), but I wanted to include some junit tests. Problem is that the tests fail because:
- either an exception is thrown because the property is specified on 'null'
- or the property evaluates to 'null' instead of the expected '1'.

The related fragments of the test-case look like:

    <fo:table-column column-width="100%">
      <test:assert property="column-number" expected="1" />
    </fo:table-column>
    ...
    <fo:table-cell>
      <test:assert property="column-number" expected="1" />
      <fo:block>cell content</fo:block>
    </fo:table-cell>

Any idea what I am missing/doing wrong? Any insights appreciated.

TIA!

Cheers,
Andreas
Re: FOTree test question
[Andreas] Very roughly, the new ColumnNumberProperty.make() does something like this: That should be ColumnNumberPropertyMaker in order to follow the naming of all the other custom makers. regards, finn
Re: FOTree test question
On Sep 13, 2005, at 22:06, Finn Bock wrote: [Andreas] Very roughly, the new ColumnNumberProperty.make() does something like this: That should be ColumnNumberPropertyMaker in order to follow the naming of all the other custom makers. OK, I'll keep that in mind... Shouldn't it be TableBorderPrecedenceMaker as well then? Cheers, Andreas
Another table-question: initial column-number for cells
Hi, The Rec describes the default value for the column-number on table-cells as: For the first table-cell in a table-row, the current column number is 1... My question: is this also true in case there was a row-spanning cell from the previous row that already occupies the first column? IOW: May we assume that in that case, the first free column-number may be used (taking into account spans from previous rows)? It's not all that difficult to track pending row-spans, so we could already take care of that when determining the initial value, and it makes more sense to me to interpret it that way, but just thought I'd ask... Cheers, Andreas
Re: Another table-question: initial column-number for cells
On Sep 12, 2005, at 11:49, Jeremias Maerki wrote: The current code works that way. The first free grid unit is used. OK. Thanks for the confirmation. FYI: I'm currently almost done with having the initial values set by the Property subsystem itself. (Finn's recent addition of TableBorderPrecedence gave me just the hint I needed to get the necessary understanding of the property system...) Expect more later today. Still have to remove some superfluous code (the question hasColumnNumber() for instance, will become irrelevant, since every TableCell/TableColumn will have its column-number correctly set by the time layout starts, apart from repeated columns. I'm still struggling with those...) Cheers, Andreas
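The "first free grid unit" behaviour confirmed here, taking pending row-spans into account, might be tracked along these lines. This is a standalone sketch, not the FOP table code: `ColumnTracker` and its methods are invented for illustration, and the check that every column covered by a column-span is free is deliberately left out.

```java
final class ColumnTracker {

    /** For each column: number of rows, counting the current one, it is still occupied. */
    private final int[] remaining;

    ColumnTracker(int columns) {
        this.remaining = new int[columns];
    }

    /** Call before adding the cells of each row, including the first. */
    void startRow() {
        for (int c = 0; c < remaining.length; c++) {
            if (remaining[c] > 0) {
                remaining[c]--; // one row of every pending span is used up
            }
        }
    }

    /**
     * Places a cell in the first free column of the current row and
     * returns its 1-based column-number.
     */
    int addCell(int colSpan, int rowSpan) {
        for (int c = 0; c < remaining.length; c++) {
            if (remaining[c] == 0) {
                if (c + colSpan > remaining.length) {
                    throw new IllegalStateException("cell does not fit into the table width");
                }
                for (int i = c; i < c + colSpan; i++) {
                    remaining[i] = rowSpan;
                }
                return c + 1;
            }
        }
        throw new IllegalStateException("no free column in this row");
    }

    public static void main(String[] args) {
        ColumnTracker t = new ColumnTracker(3);
        t.startRow();
        System.out.println(t.addCell(1, 2)); // prints 1 (spans into the next row)
        System.out.println(t.addCell(1, 1)); // prints 2
        t.startRow();
        System.out.println(t.addCell(1, 1)); // prints 2 (column 1 is still occupied)
    }
}
```

In the second row the first cell gets column-number 2 rather than 1, which is the interpretation of the Rec's "current column number" that the thread settles on.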
Re: Another table-question: initial column-number for cells
On Sep 12, 2005, at 12:01, Andreas L Delmelle wrote: FYI: I'm currently almost done with having the initial values set by the Property subsystem itself. (Finn's recent addition of TableBorderPrecedence gave me just the hint I needed to get the necessary understanding of the property system...) Expect more later today. Still have to remove some superfluous code (the question hasColumnNumber() for instance, will become irrelevant, since every TableCell/TableColumn will have its column-number correctly set by the time layout starts, apart from repeated columns. I'm still struggling with those...) Just a status-report: Will have to look again tomorrow... I'll have to dive deeper into the layout code first. Up until the FOTree, I got the column-numbers correctly assigned so far --only repeated columns will have to be handled by ColumnSetup for the time being. It seems it's going to take a little longer than I expected to make it pass all the layoutengine tests :-( Cheers, Andreas