Re: 0.95 Acrobat Performance Problems
I've just added support for forcing a single-byte encoding when using the font element without an XML font metric file in FOP Trunk:
http://svn.apache.org/viewvc?rev=731248&view=rev

So using current FOP Trunk you could do:

  <font embed-url="file:///opt/sigma_tomcat/webapps/sigma/WEB-INF/fonts/ocra.ttf" encoding-mode="single-byte">
    <font-triplet name="ocra" style="normal" weight="normal"/>
  </font>

This way, and for fonts larger than OCRA (i.e. fonts with characters outside the WinAnsi encoding), you'll have access to all glyphs in the font, which you don't have when working with an XML font metric file.

Jeremias Maerki

To unsubscribe, e-mail: fop-users-unsubscr...@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-h...@xmlgraphics.apache.org
Re: 0.95 Acrobat Performance Problems
> That's just a font metrics file. It could still be that your
> configuration file is set up so you get a CID font. Can we see the font
> configuration, too?

The font configuration should be attached (I used the 'Upload File...' button, but don't know where the file goes).

http://www.nabble.com/file/p21096930/configuration.xml configuration.xml

Thanks!!!
Ed

--
View this message in context: http://www.nabble.com/0.95---Acrobat-Performance-Problems-tp20774481p21096930.html
Sent from the FOP - Users mailing list archive at Nabble.com.
Re: 0.95 Acrobat Performance Problems
Just as I thought. You're doing this:

  <font kerning="yes" embed-url="file:///opt/sigma_tomcat/webapps/sigma/WEB-INF/fonts/ocra.ttf">
    <font-triplet name="ocra" style="normal" weight="normal"/>
  </font>

This bypasses the XML font metric files. Currently, it's not possible to tell FOP this way to use WinAnsi encoding. You have to create an XML font metrics file for each font as described in [1] (using -enc ansi, just as in FOP 0.20.5, although you'll have to recreate the files). Then you'll have to do this:

  <font metrics-url="file:///opt/sigma_tomcat/webapps/sigma/WEB-INF/fonts/ocra.ttf.xml" kerning="yes" embed-url="file:///opt/sigma_tomcat/webapps/sigma/WEB-INF/fonts/ocra.ttf">
    <font-triplet name="ocra" style="normal" weight="normal"/>
  </font>

HTH

[1] http://xmlgraphics.apache.org/fop/trunk/fonts.html#truetype-metrics

On 19.12.2008 20:20:01 egibler wrote:
> The font configuration should be attached (I used the 'Upload File...'
> button, but don't know where the file goes).

Jeremias Maerki
Re: 0.95 Acrobat Performance Problems
That's just a font metrics file. It could still be that your configuration file is set up so you get a CID font. Can we see the font configuration, too?

On 17.12.2008 17:04:58 egibler wrote:
> I got ahold of the configuration file for the font that is acting up.
> From what I can see, it is being defined as an ANSI font, but when the
> PDF is produced, it shows as CID.

Jeremias Maerki
Re: 0.95 Acrobat Performance Problems
I got ahold of the configuration file for the font that is acting up. From what I can see, it is being defined as an ANSI font, but when the PDF is produced, it shows as CID. Again, in version .2 it worked great. Is there anything that can or should be added / edited to the following (or elsewhere) that would get the font rendered as ANSI? I'd sure appreciate any suggestions.

Thanks!
Ed

  <?xml version="1.0" encoding="UTF-8"?>
  <font-metrics type="TRUETYPE">
    <font-name>OCRAExtended</font-name>
    <cap-height>648</cap-height>
    <x-height>0</x-height>
    <ascender>857</ascender>
    <descender>-176</descender>
    <bbox>
      <left>-4</left>
      <bottom>-176</bottom>
      <right>875</right>
      <top>857</top>
    </bbox>
    <flags>33</flags>
    <stemv>0</stemv>
    <italicangle>0</italicangle>
    <subtype>TRUETYPE</subtype>
    <singlebyte-extras>
      <encoding>WinAnsiEncoding</encoding>
      <first-char>0</first-char>
      <last-char>255</last-char>
      <widths>
        <char idx="0" wdt="1000"/>
        <char idx="1" wdt="1000"/>
        *** snipped out '2' to '254' ***
        <char idx="255" wdt="604"/>
      </widths>
    </singlebyte-extras>
  </font-metrics>

Jeremias Maerki-2 wrote:
> On 04.12.2008 23:25:57 Andreas Delmelle wrote:
>> On 04 Dec 2008, at 23:05, egibler wrote:
>>> I think we uncovered the culprit! The files we received in the 0.2
>>> version used an 'OCRAExtended' font, which was TrueType and
>>> Encoding: Ansi. The files we receive now with 0.95 have
>>> 'OCRAExtended', which is TrueType (CID) and Encoding: Identity-H.
>>> From what I've read, the CID font is basically for foreign languages
>>> with huge character sets. Is this correct?
>>
>> Basically, yes. But not just that. It doesn't really take an exotic
>> language for CID fonts to become interesting.
>>
>>> Is there a reason that the guys who make the PDF need this CID font
>>> in version 0.95? The only characters used are numbers (0-9).
>>
>> I'm not sure. I'll leave the detailed explanation to those who are
>> more familiar with that part of the code (if they decide to chime
>> in), but my best guess would be that the on-the-fly metrics
>> generation (introduced in later versions) somehow defaults to CID
>> font-metrics (?)
>
> Yes, we do CID fonts for TrueType fonts by default because that's the
> most versatile approach. I was not aware that this can slow down
> Acrobat's PDF concatenation this much. If Adobe had chosen to simply
> embed the font multiple times (as probably other tools would), the
> whole thing would be much faster. But the PDF file would obviously be
> bigger in the end.
>
> The thing with this choice is that we currently cannot switch from
> single byte handling to multi byte handling when necessary. The
> architecture of the font subsystem doesn't currently allow that. So we
> have to make a decision beforehand which approach to take. And
> currently, if you don't use an XML font metrics file, the choice is
> hard-coded to CID fonts. Maybe that needs to be made configurable at
> least. I've recorded an RFE in Bugzilla so this doesn't get forgotten:
> https://issues.apache.org/bugzilla/show_bug.cgi?id=46348
>
>>> I swapped the font out in my samples, and combining 200 one-page
>>> pdfs dropped from 30+ minutes to a few seconds. This would make all
>>> the difference to us, if it is doable from the developers'
>>> perspective. What do you think?
>>
>> Makes sense that this is such a drain. Using CID fonts means Acrobat
>> has to merge the glyph tables from 200 documents (and possibly change
>> references in the documents themselves too, accordingly).
>>
>> If my above assumption is correct, then the issue could be alleviated
>> on the producing end. Even if no longer necessary, it is still
>> possible to manually generate a font-metrics file for the given font
>> in Ansi encoding, and reference that in the config file.
>
> Right, that's one work-around: an XML font metrics file generated with
> -enc ansi. The other option, since you're talking about an OCR font:
> you could switch to the Type 1 variant, in which case you're
> automatically in the single byte realm and the problem would not
> appear.
>
>> If they do not immediately know how, refer them to:
>> http://xmlgraphics.apache.org/fop/0.95/fonts.html
>>
>> HTH!
>>
>> Cheers
>> Andreas
>
> Jeremias Maerki

--
View this message in context: http://www.nabble.com/0.95---Acrobat-Performance-Problems-tp20774481p21055068.html
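For anyone who wants to check which kind of XML font metrics file they actually have, here is a minimal sketch in Python. It assumes that single-byte metrics files carry a `singlebyte-extras` element (as in the file posted above) and that CID metrics files carry a `multibyte-extras` element instead; the function name `metrics_kind` is made up for this illustration. Note that this only inspects the metrics file. As the thread shows, the font configuration decides whether that file is used at all.

```python
import xml.etree.ElementTree as ET


def metrics_kind(xml_text):
    """Classify a FOP XML font metrics document.

    Returns a (kind, encoding) tuple where kind is 'single-byte',
    'cid', or 'unknown'. The element names are assumptions based on
    the metrics file quoted in this thread.
    """
    root = ET.fromstring(xml_text)
    if root.find("singlebyte-extras") is not None:
        # Single-byte files name their encoding, e.g. WinAnsiEncoding.
        encoding = root.findtext("singlebyte-extras/encoding", default="")
        return "single-byte", encoding
    if root.find("multibyte-extras") is not None:
        return "cid", ""
    return "unknown", ""


if __name__ == "__main__":
    sample = (
        '<font-metrics type="TRUETYPE">'
        "<font-name>OCRAExtended</font-name>"
        "<singlebyte-extras><encoding>WinAnsiEncoding</encoding></singlebyte-extras>"
        "</font-metrics>"
    )
    print(metrics_kind(sample))
```

A result of 'single-byte' here while the PDF still shows a CID font would suggest the metrics file is not being referenced from the font configuration.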
Re: 0.95 Acrobat Performance Problems
egibler wrote:
> Hi,
>
> I receive lots and lots of small (1-3 page) PDFs each day, combine
> them using Acrobat Professional, and print and mail them for clients.
> One client recently upgraded from FOP .2x to FOP 0.95, and the
> combining of his files has become nearly impossible. It'll put the
> first 10 pages together fairly quickly, but gets incrementally slower
> with subsequent additions until the process grinds to a halt before
> page 150 or so. He tends to send me batches of between 1500 and 3500
> pages, so I'm well shy of what I need to accomplish.
>
> The process worked great with the earlier version, and works fine with
> my other clients' PDFs that are created with various other tools.
>
> These files are invoices, batches of which regularly represent several
> million dollars, so they really have to be right, timely, and without
> duplicates or omissions.
>
> I'm not at all familiar with FOP - please forgive my ignorance. I
> hadn't heard of it until the problem occurred and I started doing a
> bit of research. From what I've read, there seems to be a fair amount
> of discussion related to memory issues and large PDFs. I'm wondering
> if anyone else has a situation similar to mine, and / or might be able
> to suggest what I might be able to do to get my production back on
> track.

I'm a little confused. While FOP does indeed have some known performance issues with multi-page documents, isn't the performance issue YOU'RE talking about inside Acrobat?

BugBear
Re: 0.95 Acrobat Performance Problems
Thanks for the response. My problem is absolutely in Acrobat, specifically in how Acrobat deals with the PDFs generated using FOP 0.95 (I'm pretty sure). I'm hoping there might be some setting, or known issue, in dealing with the two products in the fashion I'm doing it.

I combine somewhere between 20,000 and 30,000 PDF pages daily, and have done so for about three years. The process is quick and reliable. Last week, one of our clients migrated from FOP .2 to FOP .95, and absolutely every subsequent file sent by them has brought me to my knees. The remaining 90% of my work (coming from other clients and using other PDF generation tools) is flowing great, and I can rerun old FOP .2 files, which also still work great.

To me, the smoking gun is the upgrade, but I could be persuaded otherwise. It wouldn't be my first incorrect assumption. I'm hoping to find the cause, or at least some sort of workaround, or I'm afraid I'm going to have to turn away this client's business.

Does that make sense? I readily acknowledge my lack of experience with the FOP toolset. Any light shed on the issue would be most helpful.

Thanks,
ed

paul womack wrote:
> I'm a little confused. While FOP does indeed have some known
> performance issues with multi page documents, isn't the performance
> issue YOU'RE talking about inside Acrobat?

--
View this message in context: http://www.nabble.com/0.95---Acrobat-Performance-Problems-tp20774481p20777151.html
RE: 0.95 Acrobat Performance Problems
I would look for differences in the source PDFs between the two versions. Even opening a PDF in a text editor may show enough differences to explain the problem.

FOP 0.95 creates PDF v1.4 files. I don't remember what FOP 0.20.5 created. Maybe there's a difference in how the contents of the PDF are compressed/uncompressed that influences the ability to concatenate. There are filters that FOP applies to certain items in the PDF, like the FLATE filter on images, so maybe those are affecting the behavior.

Are files of similar content and page count approximately the same size between a source that works and a source that does not?

If you have to, you could write a program that uses iText to do the PDF concatenation instead of Acrobat, just to see if the problem is reproducible with other software.

-----Original Message-----
From: egibler
Sent: Monday, December 01, 2008 1:40 PM
To: fop-users@xmlgraphics.apache.org
Subject: Re: 0.95 Acrobat Performance Problems

> Thanks for the response. My problem is absolutely in Acrobat,
> specifically in how Acrobat deals with the PDFs generated using FOP
> 0.95 (I'm pretty sure).
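The "open it in a text editor" comparison suggested above can be roughly automated. The following Python sketch counts a few font-related name tokens in the raw bytes of a PDF. It assumes the font dictionaries are stored uncompressed, which is typical for PDF 1.4 output but not guaranteed in general; the marker lists and the function names are chosen for this illustration and are not exhaustive.

```python
# Name tokens that usually indicate a CID-keyed (multi-byte) font.
CID_MARKERS = (b"/CIDFontType2", b"/CIDFontType0", b"/Identity-H")
# Name tokens that usually indicate a single-byte encoded font.
SINGLE_BYTE_MARKERS = (b"/WinAnsiEncoding", b"/MacRomanEncoding")


def summarize_font_encoding(pdf_bytes):
    """Count CID-related and single-byte-related markers in raw PDF bytes."""
    return {
        "cid": sum(pdf_bytes.count(marker) for marker in CID_MARKERS),
        "single_byte": sum(pdf_bytes.count(marker) for marker in SINGLE_BYTE_MARKERS),
    }


def summarize_file(path):
    """Read a PDF from disk and summarize its font markers."""
    with open(path, "rb") as handle:
        return summarize_font_encoding(handle.read())


if __name__ == "__main__":
    import sys

    for pdf_path in sys.argv[1:]:
        print(pdf_path, summarize_file(pdf_path))
```

Running it on one file from a batch that merges quickly and one from a batch that does not should show whether the fonts differ in the way discussed elsewhere in this thread (single-byte versus CID).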
Re: 0.95 Acrobat Performance Problems
On 01 Dec 2008, at 20:39, egibler wrote:

Hi

<snip />
> To me, the smoking gun is the upgrade, but I could be persuaded
> otherwise. It wouldn't be my first incorrect assumption. I'm hoping to
> find the cause,

That would be helpful indeed. A lot has changed in the way the PDF is constructed between the two versions, so it's definitely possible that some of those changes are responsible. OTOH, without /any/ indication whatsoever of what precisely is going on, it will be difficult to address.

So, the most important question is probably: is there something special about those PDFs? Are there many images, custom fonts, tables ...? Anything that might give us a clue?

Apart from the mere theoretical possibility, you're the first to report such an issue (but this could be because there are only very few people who have the exact same setup and use Acrobat to merge PDFs generated by FOP; iText is among the most popular libraries used for this purpose).

> or at least some sort of work around, or I'm afraid I'm going to have
> to turn away this client's business.

That would be unfortunate, and is something we would like to help avoid, if only we knew where to start looking... :/

Cheers

Andreas