[ https://issues.apache.org/jira/browse/TIKA-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728527#comment-14728527 ]
Chris A. Mattmann commented on TIKA-1436:
-----------------------------------------
I tried applying this patch, but it doesn't apply cleanly to the 1.11 trunk. I
think it needs to be brought up to date:
{noformat}
[chipotle:~/tmp/tika1.11] mattmann% patch -p0 < ste-20140927.patch
patching file tika-core/src/main/java/org/apache/tika/sax/WriteOutContentHandler.java
patching file tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
Hunk #1 succeeded at 56 (offset 4 lines).
Hunk #2 succeeded at 146 with fuzz 2 (offset -12 lines).
patching file tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParserConfig.java
Hunk #1 succeeded at 156 with fuzz 1 (offset 12 lines).
can't find file to patch at input line 145
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|Index: tika-parsers/src/test/java/org/apache/tika/TikaTest.java
|===================================================================
|--- tika-parsers/src/test/java/org/apache/tika/TikaTest.java (revision 1627940)
|+++ tika-parsers/src/test/java/org/apache/tika/TikaTest.java (working copy)
--------------------------
File to patch:
Skip this patch? [y] y
Skipping patch.
3 out of 3 hunks ignored
patching file tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
Hunk #1 FAILED at 16.
Hunk #2 succeeded at 1000 with fuzz 1 (offset 89 lines).
1 out of 2 hunks FAILED -- saving rejects to file tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java.rej
patching file tika-parsers/src/test/java/org/apache/tika/parser/rtf/RTFParserTest.java
Hunk #1 succeeded at 81 (offset -10 lines).
Hunk #2 succeeded at 88 (offset -10 lines).
Hunk #3 succeeded at 99 (offset -10 lines).
Hunk #4 succeeded at 121 with fuzz 2 (offset -10 lines).
Hunk #5 succeeded at 165 (offset -10 lines).
Hunk #6 succeeded at 221 (offset -10 lines).
Hunk #7 FAILED at 587.
1 out of 7 hunks FAILED -- saving rejects to file tika-parsers/src/test/java/org/apache/tika/parser/rtf/RTFParserTest.java.rej
[chipotle:~/tmp/tika1.11] mattmann%
{noformat}
Also, reading the comments, I'm not sure what consensus was reached here. Jukka
and Tyler brought up some points, and it seems like you responded to them,
Stefano, but I didn't see their replies. Which conversation on the list are you
referencing?
Thanks.
> improvement to PDFParser
> ------------------------
>
> Key: TIKA-1436
> URL: https://issues.apache.org/jira/browse/TIKA-1436
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 1.6
> Reporter: Stefano Fornari
> Assignee: Chris A. Mattmann
> Labels: parser, pdf
> Fix For: 1.11
>
> Attachments: ste-20140927.patch
>
>
> With regard to the thread "[PDFParser] - read limited number of characters"
> on Mar 29, I would like to propose the attached patch. I noticed that in Tika
> 1.6 there has been some work on better handling of the
> WriteLimitReachedException condition, but I believe it could be improved
> further.