[jira] [Commented] (PDFBOX-2871) Performance issue when filling the first PDTextField of an AcroForm

Roberto Nibali (JIRA) Mon, 13 Jul 2015 15:49:46 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625519#comment-14625519
 ]


Roberto Nibali commented on PDFBOX-2871:
----------------------------------------

What about adding an even faster font fetching mechanism in the beginning?

{code:title=Faster font searching on filesystem|borderStyle=solid}
        // Needs java.io.File, java.util.HashMap and java.util.Map.
        // Obviously this will be replaced by FontDirFinder.determineDirFinder()
        String[] directories = {System.getProperty("user.home") + "/Library/Fonts/",
                "/Library/Fonts/",
                "/System/Library/Fonts/",
                "/Network/Library/Fonts/"};
        HashMap<String, String> allFiles = new HashMap<>();
        for (String directory : directories) {
            File dir = new File(directory);
            File[] files = dir.listFiles((path, name) -> {
                String nl = name.toLowerCase();
                return (nl.endsWith(".ttf") ||
                        nl.endsWith(".otf") ||
                        nl.endsWith(".pfb") ||
                        nl.endsWith(".ttc"));
            });
            if (files == null) {
                System.out.println("No fonts found in: " + directory);
                continue;
            }
            // This essentially is the FontFileFinder.find() part, albeit with
            // additional functionality to later identify the font specifics.
            for (File file : files) {
                String key = file.getPath();
                allFiles.put(key, key.substring(Math.max(0, key.length() - 3)).toUpperCase());
            }
        }
        System.out.println("Total Fonts: " + allFiles.size());
        for (Map.Entry<String, String> entry : allFiles.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
{code}

While the gain is marginal, it will probably shave another couple of ms off
per font. To really make things faster, the whole RandomAccessFile (RAF) part
of parsing the font entries could be moved to an NIO channel mechanism. Since
the parser currently often reads byte by byte using seek() and position(), I
reckon this could also improve speed. Still, I suspect it's more an exercise
for a rainy day than a crucial optimization, especially in the face of the
much bigger challenges that PDFBox still has to tackle.
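
To illustrate the NIO idea, here is a minimal sketch, not actual PDFBox code.
It reads the table directory of a single-font sfnt file (TTF/OTF) with two
bulk channel reads instead of many small seek-and-read calls. The class name
SfntTableDirectory and the helper readFully are made up for this example. It
assumes the standard sfnt layout (a 12-byte offset table followed by 16-byte
table records) and does not handle .ttc collections or Type 1 (.pfb) files.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of PDFBox or FontBox.
final class SfntTableDirectory {

    /** Returns the table tags of a single-font sfnt file using two bulk reads. */
    static List<String> readTableTags(Path fontFile) throws IOException {
        try (FileChannel channel = FileChannel.open(fontFile, StandardOpenOption.READ)) {
            // Offset table: sfntVersion (4 bytes), numTables (2), three search hints (2 each).
            ByteBuffer header = ByteBuffer.allocate(12).order(ByteOrder.BIG_ENDIAN);
            readFully(channel, header, 0);
            header.flip();
            header.getInt();                              // sfntVersion, ignored here
            int numTables = header.getShort() & 0xFFFF;   // unsigned 16-bit count

            // Each table record is 16 bytes: tag, checksum, offset, length.
            ByteBuffer records = ByteBuffer.allocate(numTables * 16).order(ByteOrder.BIG_ENDIAN);
            readFully(channel, records, 12);
            records.flip();

            List<String> tags = new ArrayList<>(numTables);
            byte[] tag = new byte[4];
            for (int i = 0; i < numTables; i++) {
                records.position(i * 16);                 // jump to the start of record i
                records.get(tag);
                tags.add(new String(tag, StandardCharsets.US_ASCII));
            }
            return tags;
        }
    }

    /** Fills the buffer from the given file position, failing on a truncated file. */
    private static void readFully(FileChannel channel, ByteBuffer buffer, long position)
            throws IOException {
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position + buffer.position());
            if (n < 0) {
                throw new EOFException("Unexpected end of font file");
            }
        }
    }
}
```

Calling SfntTableDirectory.readTableTags(path) on a regular TrueType file
should return tags such as "head" and "name". Whether this is measurably
faster than the current RAF-based parsing would have to be profiled.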

> Performance issue when filling the first PDTextField of an AcroForm
> -------------------------------------------------------------------
>
>                 Key: PDFBOX-2871
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2871
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 2.0.0
>            Reporter: Maruan Sahyoun
>            Assignee: John Hewson
>            Priority: Critical
>              Labels: Appearance
>             Fix For: 2.0.0
>
>         Attachments: PDTextField.pdf, ProfilingOutput.png
>
>
> When filling the first PDTextField in a form the performance is slow. All 
> other PDTextFields in the form are handled quickly.
> This code
> {code}
> PDTextField field = (PDTextField) doc.getDocumentCatalog().getAcroForm().getField("Textfield01");
> long start = System.nanoTime();
> field.setValue("ABCD");
> long end = System.nanoTime();
> double difference = (end - start)/1e6;
> System.out.println(difference);
> field = (PDTextField) doc.getDocumentCatalog().getAcroForm().getField("Textfield02");
> start = System.nanoTime();
> field.setValue("ABCD");
> end = System.nanoTime();
> difference = (end - start)/1e6;
> System.out.println(difference);
> {code}
> produces the following output
> {noformat}
> 9713.38
> 3.904
> {noformat}


