[ 
https://issues.apache.org/jira/browse/PDFBOX-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824466#comment-17824466
 ] 

Tilman Hausherr commented on PDFBOX-5781:
-----------------------------------------

Your patch is against an older code version. The current code is like this:
{code}
    private static String computeHash(byte[] ba)
    {
        CRC32 crc = new CRC32();
        crc.update(ba);
        long l = crc.getValue();
        return Long.toHexString(l);
    }
{code}
We got rid of SHA512 in PDFBOX-5727 because it was too slow.
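
For reference, the same CRC32 value can also be computed without holding the whole font in memory, which is the concern raised in the report. The following is only a minimal sketch: the class StreamingHash and the stream-based computeHash overload are hypothetical and not part of PDFBox. It assumes the caller can hand over the font's InputStream instead of a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical helper class for illustration only, not part of PDFBox.
class StreamingHash
{
    // Produces the same hex string as the byte[] version of computeHash,
    // but reads the stream in 8 KB chunks instead of one large array.
    public static String computeHash(InputStream is) throws IOException
    {
        CRC32 crc = new CRC32();
        try (CheckedInputStream cis = new CheckedInputStream(is, crc))
        {
            byte[] buffer = new byte[8192];
            while (cis.read(buffer) != -1)
            {
                // each read updates the checksum; the data itself is discarded
            }
        }
        return Long.toHexString(crc.getValue());
    }

    public static void main(String[] args) throws IOException
    {
        byte[] data = "example font bytes".getBytes(StandardCharsets.US_ASCII);

        // Reference value computed the in-memory way
        CRC32 reference = new CRC32();
        reference.update(data);

        String streamed = computeHash(new ByteArrayInputStream(data));
        System.out.println(streamed.equals(Long.toHexString(reference.getValue())));
    }
}
```

With this approach the peak memory use of the hashing step is bounded by the buffer size rather than by the font size. Whether it fits the surrounding code depends on whether the byte array is needed for anything else at that point.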

> OutOfMemoryError in FileSystemFontsProvider.scanFonts
> -----------------------------------------------------
>
>                 Key: PDFBOX-5781
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5781
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.30
>         Environment: Running inside a JVM with -Xmx512
> openjdk version "17.0.10" 2024-01-16 LTS
> OpenJDK Runtime Environment (build 17.0.10+13-LTS)
> OpenJDK 64-Bit Server VM (build 17.0.10+13-LTS, mixed mode, sharing)
> macOS Sonoma 14.3.1 (23D60)
>            Reporter: Kim Hage
>            Priority: Minor
>         Attachments: FileSystemFontProvider_OutOfMemoryError.stacktrace, 
> FileSystemFontProvider_use_DigestInputStream.patch
>
>
> We experienced an OutOfMemoryError when calling
> PDAcroForm.getDefaultResources().getFont(COSName); with COSName{Helv}.
> The reason seems to be that PDFBox initializes a FontCache when getFont is 
> called, and this scans _all_ fonts on the system. This also loads some large 
> system fonts (AppleColorEmoji is 189.9 MB). Each font gets copied into a 
> single large byte array at the location below, and this causes an 
> OutOfMemoryError at this point in the code.
> {{org.apache.pdfbox.pdmodel.font.FileSystemFontProvider#addTrueTypeFontImpl:773}}
> {code:java}
> InputStream is = ttf.getOriginalData();
> byte[] ba = IOUtils.toByteArray(is);
> is.close();
> String hash = computeHash(ba); {code}
> I think this could easily be fixed by using a DigestInputStream instead of a 
> byte array to compute hashes at this location. I have tested this locally and 
> it seemed to work. Please see the attached .patch file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
