[ https://issues.apache.org/jira/browse/PDFBOX-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720192#comment-17720192 ]
ASF subversion and git services commented on PDFBOX-5595:
---------------------------------------------------------
Commit 1909649 from [email protected] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1909649 ]
PDFBOX-5595: use null instead of COSNull for corrupt entries within a content
stream
> Slight regression on corrupt bug tracker file
> ---------------------------------------------
>
> Key: PDFBOX-5595
> URL: https://issues.apache.org/jira/browse/PDFBOX-5595
> Project: PDFBox
> Issue Type: Task
> Components: Parsing
> Affects Versions: 2.0.28, 3.0.0 PDFBox
> Reporter: Tim Allison
> Assignee: Andreas Lehmkühler
> Priority: Trivial
>
> I'm not sure this is a regression, and apologies if you already dealt with
> this before the release of 2.0.28. Also, as a warning, this file is corrupt.
>
> We used to get more text out of this file in 2.0.27 than we do now in 2.0.28:
> [https://corpora.tika.apache.org/base/docs/bug_trackers/evince/evince-395-0.zip-0.pdf]
>
> This file is derived from the evince bug tracker, which now eventually links to
> this issue:
> [https://gitlab.freedesktop.org/poppler/poppler/-/issues/323]
>
> This image from the poppler issue shows what we get with PDFBox 2.0.28 on the
> left, and 2.0.27 on the right.
>
> If the decision is "the file is corrupt -> not going to fix", I completely
> understand.
> !https://gitlab.gnome.org/GNOME/evince/uploads/0bc2302dbafc0bbc2110f0d42951428e/evince.JPG!