[ 
https://issues.apache.org/jira/browse/PDFBOX-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510535#comment-17510535
 ] 

Ryan Jackson commented on PDFBOX-4559:
--------------------------------------

As of version 2.0.24, we have also seen issues with this method when it is 
called in parallel.

While we did not see the stack trace(s) described above, we saw issues with 
glyphs not being found when the following type of operation was performed:
{code:java}
        // The page number used with PDFTextStripper is one-based and inclusive.
        final int pageNumber = pageIndex + 1;

        PDFTextStripper textStripper = new PDFTextStripper();
        textStripper.setStartPage(pageNumber);
        textStripper.setEndPage(pageNumber);

        renderedImage = pdfRenderer.renderImageWithDPI(pageIndex, dpi,
                ImageType.RGB);
        pageText = textStripper.getText(pdfDocument);
{code}
While these two methods appear to be thread-safe at a high level, they both 
interact with the internal font system in a way that is not.

If the two calls at the end of this sample are not synchronized, all manner of 
warnings are logged as glyphs are requested, and the resulting image is often 
missing text.
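
For illustration, here is a minimal sketch of the kind of synchronization meant 
here. It assumes a single process-wide lock is acceptable; whether a narrower 
lock (for example, one per document) would be enough is not established in this 
issue. PdfBoxGuard is a hypothetical helper name and is not part of PDFBox.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical helper (not part of PDFBox): runs each submitted task while
 * holding one shared lock, so guarded tasks never overlap across threads.
 */
final class PdfBoxGuard {
    // Assumption: one lock for the whole JVM. This trades throughput for safety.
    private static final ReentrantLock LOCK = new ReentrantLock();

    private PdfBoxGuard() {
    }

    /** Runs the task under the shared lock and returns its result. */
    static <T> T call(Callable<T> task) throws Exception {
        LOCK.lock();
        try {
            return task.call();
        } finally {
            // Always release, even if the task throws.
            LOCK.unlock();
        }
    }
}
```

With this helper, the two calls at the end of the sample above would become:

    renderedImage = PdfBoxGuard.call(
            () -> pdfRenderer.renderImageWithDPI(pageIndex, dpi, ImageType.RGB));
    pageText = PdfBoxGuard.call(() -> textStripper.getText(pdfDocument));

This serializes rendering and text extraction, so it gives up most of the 
benefit of running them in parallel; it is a workaround, not a fix.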

I don't want to muddy the waters by adding yet another problem, but as Tilman 
has suggested, any thread-safety work here is likely to go fairly deep into the 
code.

> Parse error reading document from several threads
> -------------------------------------------------
>
>                 Key: PDFBOX-4559
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4559
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.15
>         Environment: Oracle Java 8 update125 on both Mac OS X and centos
>            Reporter: Jack
>            Priority: Major
>              Labels: concurrency, multithreading, type1, type1font
>         Attachments: test.pdf
>
>
> I got the following error while running simple parallel rendering code. 
> However, the error doesn't happen when I change parallelStream() to a 
> sequential stream(). Interestingly, both approaches render exactly the same 
> images. I saw a possibly related ticket, PDFBOX-3654, but it seems that issue 
> was fixed. Are there more related bugs?  
> *Sample code*:
> {code:java}
> PDDocument document = PDDocument.load(new File(pdfFilename));
> List<PDDocument> pdfPages = new Splitter().split(document);
> pdfPages.parallelStream().forEach(page -> {
>     try {
>         PDFRenderer renderer = new PDFRenderer(page);
>         // change dpi to your number
>         renderer.renderImageWithDPI(0, 180, ImageType.RGB);
>     } catch (IOException e) {
>         System.out.println(e);
>     }
>     try {
>         page.close();
>     } catch (IOException ignored) {
>     }
> });
> try {
>     document.close();
> } catch (IOException ignored) {
> }
> {code}
>  
> *Error log*:
> {noformat}
> ERROR [PDType1Font] Can't read the embedded Type1 font POAEND+Gotham-Book
> java.io.IOException: unexpected closing parenthesis
>  at org.apache.fontbox.type1.Type1Lexer.readToken(Type1Lexer.java:123) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Lexer.nextToken(Type1Lexer.java:75) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.readValue(Type1Parser.java:398) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.readOtherSubrs(Type1Parser.java:707) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.parseBinary(Type1Parser.java:550) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:64) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:85) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:262) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:62) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:869) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:505) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:479) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:152) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:265) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:314) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:243) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:229) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
> WARN [PDType1Font] Using fallback font Helvetica for POAEND+Gotham-Book
> {noformat}


