On 05.04.2019 at 22:44, William Pietri wrote:
Hi!

First off, thanks for an excellent library. It's been a pleasure to work with something where the many mysteries of PDFs are tucked away behind an easy-to-use interface.

What I'm working on is filling forms. In particular, large forms into which we sometimes need to put a variety of languages. This has mostly worked fine, but I've noticed a speed issue. As an example, take the US I-130: https://www.uscis.gov/system/files_force/files/form/i-130.pdf

If I go through this and fill every field with roman text using the default font, it takes circa 2 seconds, which is fine. If I fill it with an added Arabic font, it takes circa 7 seconds. And if I use a CJK font, it takes circa 140 seconds, which seems like a lot. This is with PDFBox 2.0.14 and the Oracle 1.8.201 JDK on Linux.

One interesting symptom when I do this is that I see this error message every time I fill a text field:

   Apr 05, 2019 8:16:51 AM
   org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
   INFO: OpenType Layout tables used in font NotoSansCJKsc-Medium are
   not implemented in PDFBox and will be ignored
   Apr 05, 2019 8:16:51 AM
   org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
   INFO: OpenType Layout tables used in font NotoSansCJKsc-Medium are
   not implemented in PDFBox and will be ignored
   Apr 05, 2019 8:16:52 AM
   org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
   INFO: OpenType Layout tables used in font NotoSansCJKsc-Medium are
   not implemented in PDFBox and will be ignored

That is harmless... these are some very advanced features, e.g. for Thai, Arabic and Indian fonts.

Could you post some code that takes 140 seconds? And also share the font, or mention how to get it (I have Windows 7 and 10).

Or does it work when you just loop through the text fields and call setValue()? And how did you use the CJK font: did you add a new font to /AcroForm/DR and replace the default appearance string in the fields?

Tilman





When I step through this in the debugger, I see the font being cached in pdmodel.PDResources.java:155. However, the next time through the loop, the cache is empty, because the PDResources object involved is a different one. Is that the intended behavior?

Yes, this is a convenience object. There is, however, font caching done in the resource cache (see the source code of getDefaultResources()).
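
To make the difference between the two kinds of caching concrete, here is a minimal plain-Java sketch. It is a toy model, not PDFBox code: the names Form, Resources, documentCache and fillFields are hypothetical, and it only assumes what is described above, namely that each call to getDefaultResources() hands out a fresh convenience wrapper. A cache kept inside the wrapper is thrown away with it, while a cache owned by the document survives from one call to the next.

```java
import java.util.HashMap;
import java.util.Map;

public class WrapperCacheSketch {
    // Counts how often the (simulated) expensive font parsing runs.
    static int parseCount = 0;

    /** Hypothetical stand-in for a convenience wrapper around a dictionary. */
    static final class Resources {
        final Map<String, String> dictionary;      // underlying data, shared by all wrappers
        final Map<String, Object> documentCache;   // shared cache, may be null
        final Map<String, Object> localCache = new HashMap<>(); // lost with this wrapper

        Resources(Map<String, String> dictionary, Map<String, Object> documentCache) {
            this.dictionary = dictionary;
            this.documentCache = documentCache;
        }

        Object getFont(String name) {
            Map<String, Object> cache = documentCache != null ? documentCache : localCache;
            // Parse only when the chosen cache has no entry yet.
            return cache.computeIfAbsent(name, n -> {
                parseCount++;
                return new Object();
            });
        }
    }

    /** Hypothetical stand-in for the form that owns the dictionary. */
    static final class Form {
        final Map<String, String> dictionary = new HashMap<>();
        final Map<String, Object> documentCache;

        Form(boolean useDocumentCache) {
            this.documentCache = useDocumentCache ? new HashMap<>() : null;
        }

        // Assumption of this sketch: a new wrapper object on every call.
        Resources getDefaultResources() {
            return new Resources(dictionary, documentCache);
        }
    }

    /** Simulates filling fieldCount fields and returns the number of font parses. */
    static int fillFields(Form form, int fieldCount) {
        parseCount = 0;
        for (int i = 0; i < fieldCount; i++) {
            form.getDefaultResources().getFont("F1");
        }
        return parseCount;
    }

    public static void main(String[] args) {
        Form form = new Form(true);
        Resources a = form.getDefaultResources();
        Resources b = form.getDefaultResources();
        System.out.println(a == b);                       // false: different wrappers
        System.out.println(a.dictionary == b.dictionary); // true: same underlying data
        System.out.println(fillFields(new Form(false), 3)); // 3: per-wrapper cache is lost
        System.out.println(fillFields(new Form(true), 3));  // 1: document cache is reused
    }
}
```

In this model, seeing a different object identity on every call is harmless by itself, because the wrapped data is the same. The cost only appears when the cached value lives in the short-lived wrapper rather than in the shared cache.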



One thing that might be related is that the default resources for a form don't seem to be saved even when set. For example, this code:

   System.out.println("acroForm.getDefaultResources() = " + acroForm.getDefaultResources());
   System.out.println("acroForm.getDefaultResources() = " + acroForm.getDefaultResources());
   System.out.println("acroForm.getDefaultResources() = " + acroForm.getDefaultResources());
   PDResources defaultResources = acroForm.getDefaultResources();
   acroForm.setDefaultResources(defaultResources);
   System.out.println("set default resources to " + defaultResources);
   System.out.println("acroForm.getDefaultResources() = " + acroForm.getDefaultResources());
   System.out.println("acroForm.getDefaultResources() = " + acroForm.getDefaultResources());
   System.out.println("acroForm.getDefaultResources() = " + acroForm.getDefaultResources());


will produce this output:

   acroForm.getDefaultResources() =
   org.apache.pdfbox.pdmodel.PDResources@36d64342
   acroForm.getDefaultResources() =
   org.apache.pdfbox.pdmodel.PDResources@39ba5a14
   acroForm.getDefaultResources() =
   org.apache.pdfbox.pdmodel.PDResources@511baa65
   set default resources to org.apache.pdfbox.pdmodel.PDResources@340f438e
   acroForm.getDefaultResources() =
   org.apache.pdfbox.pdmodel.PDResources@30c7da1e
   acroForm.getDefaultResources() =
   org.apache.pdfbox.pdmodel.PDResources@5b464ce8
   acroForm.getDefaultResources() =
   org.apache.pdfbox.pdmodel.PDResources@57829d67



I definitely am new to the code base, but that isn't what I expected.

If this is a bug with the cache, I'm glad to take a swing at fixing it, but I figured I'd make sure that I wasn't doing something egregiously wrong first.

Thanks,

William




