[ https://issues.apache.org/jira/browse/PDFBOX-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766473#comment-17766473 ]

ASF subversion and git services commented on PDFBOX-5682:
---------------------------------------------------------

Commit 1912401 from le...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1912401 ]

PDFBOX-5682: don't use hashCode as key to avoid collisions

> Long/permanent hang in PDFBox 3.x
> ---------------------------------
>
>                 Key: PDFBOX-5682
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5682
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Tim Allison
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>             Fix For: 3.0.1 PDFBox, 4.0.0
>
>
> I found two files in the regression tests where we're now getting timeouts at
> 3 minutes where we weren't before. Unfortunately, PDFBox's export:text works
> on both, so it is probably another structural feature, perhaps a problem in
> Tika?
>
> This file halts after printing out the header for Table 19 on page 46:
> https://corpora.tika.apache.org/base/docs/govdocs1/078/078656.pdf
> Pure PDFBox's export:text complains multiple times: "Page skipped due to an
> invalid or missing type null", but it does finish quickly.
>
> This file halts after extracting {{"854,793,592"}}:
> https://corpora.tika.apache.org/base/docs/commoncrawl3_refetched/G7/G7BO7PNCCREVF2BCY5YSYOPYDLMBYASY
> Pure PDFBox's export:text processes this without problem.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
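
For readers unfamiliar with the problem named in the commit message ("don't use hashCode as key to avoid collisions"), here is a minimal, generic Java sketch of why keying a map on `hashCode()` is risky. It is not the code from commit 1912401, which is not shown in this message. The class and method names (`HashKeyCollisionDemo`, `entriesKeyedByHash`, `entriesKeyedByObject`) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from PDFBox.
public class HashKeyCollisionDemo {

    // Risky: the map is keyed on the int hash, so two unequal objects
    // that share a hash code end up in the same entry.
    static int entriesKeyedByHash(List<String> items) {
        Map<Integer, String> byHash = new HashMap<>();
        for (String item : items) {
            // A colliding item silently overwrites the earlier one.
            byHash.put(item.hashCode(), item);
        }
        return byHash.size();
    }

    // Safer: the map is keyed on the object itself; HashMap falls back
    // to equals() when two keys have the same hash code.
    static int entriesKeyedByObject(List<String> items) {
        Map<String, String> byObject = new HashMap<>();
        for (String item : items) {
            byObject.put(item, item);
        }
        return byObject.size();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known pair of distinct strings
        // with the same hash code (2112).
        List<String> items = Arrays.asList("Aa", "BB");
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints "true"
        System.out.println(entriesKeyedByHash(items));          // prints "1"
        System.out.println(entriesKeyedByObject(items));        // prints "2"
    }
}
```

In the first method one of the two entries is lost without any error, which is the kind of silent mix-up a hash-code key invites. Whether that is exactly what triggered the hang reported in this issue is not stated in this message.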