[jira] [Created] (PDFBOX-3801) PDFToImage : Can the output image be generated with specified size details?

2017-05-21 Thread Kamna (JIRA)
Kamna created PDFBOX-3801: - Summary: PDFToImage : Can the output image be generated with specified size details? Key: PDFBOX-3801 URL: https://issues.apache.org/jira/browse/PDFBOX-3801 Project: PDFBox

[jira] [Commented] (PDFBOX-2431) Rendering errors

2017-05-21 Thread Kamna (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019083#comment-16019083 ] Kamna commented on PDFBOX-2431: --- I get this error while converting PDF to image and the images on the pdf

[jira] [Commented] (PDFBOX-3330) Enhance and update PDFBox website & documentation

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018990#comment-16018990 ] Tilman Hausherr commented on PDFBOX-3330: - https://pdfbox.apache.org/2.0/getting-started.html

[jira] [Comment Edited] (PDFBOX-3712) PDFBox goes into an infinite loop with this PDF

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903436#comment-15903436 ] Tilman Hausherr edited comment on PDFBOX-3712 at 5/21/17 9:21 PM: -- The

[jira] [Commented] (PDFBOX-3797) Error reading stream, expected='endstream' actual='' with truncated file

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018921#comment-16018921 ] Andreas Lehmkühler commented on PDFBOX-3797: My recent changes don't fix the problem, now it

[jira] [Commented] (PDFBOX-3797) Error reading stream, expected='endstream' actual='' with truncated file

2017-05-21 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018919#comment-16018919 ] ASF subversion and git services commented on PDFBOX-3797: - Commit 1795713 from

[jira] [Commented] (PDFBOX-3797) Error reading stream, expected='endstream' actual='' with truncated file

2017-05-21 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018916#comment-16018916 ] ASF subversion and git services commented on PDFBOX-3797: - Commit 1795712 from

[jira] [Commented] (PDFBOX-3799) Problem in TextPosition's hashCode

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018906#comment-16018906 ] Tilman Hausherr commented on PDFBOX-3799: - Re your other wish, I prefer not, because this would

[jira] [Resolved] (PDFBOX-3799) Problem in TextPosition's hashCode

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-3799. - Resolution: Fixed Assignee: Tilman Hausherr Fix Version/s: 3.0.0

[jira] [Commented] (PDFBOX-3799) Problem in TextPosition's hashCode

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018903#comment-16018903 ] Tilman Hausherr commented on PDFBOX-3799: - Thank you for finding this. The lesson of this is that

[jira] [Comment Edited] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018855#comment-16018855 ] Andreas Lehmkühler edited comment on PDFBOX-3798 at 5/21/17 5:22 PM: -

[jira] [Commented] (PDFBOX-3799) Problem in TextPosition's hashCode

2017-05-21 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018898#comment-16018898 ] ASF subversion and git services commented on PDFBOX-3799: - Commit 1795711 from

[jira] [Commented] (PDFBOX-3799) Problem in TextPosition's hashCode

2017-05-21 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018896#comment-16018896 ] ASF subversion and git services commented on PDFBOX-3799: - Commit 1795710 from

[jira] [Commented] (PDFBOX-3797) Error reading stream, expected='endstream' actual='' with truncated file

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018886#comment-16018886 ] Andreas Lehmkühler commented on PDFBOX-3797: The regression was introduced with PDFBOX-3717

[jira] [Resolved] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler resolved PDFBOX-3798. Resolution: Fixed > Truncated file has first page empty >

Re: tika-eval

2017-05-21 Thread Andreas Lehmkuehler
On 17.02.2017 at 17:58, Allison, Timothy B. wrote: All, I finally got around to adding tika-eval[1] to Apache Tika. If you have any interest in comparing the output of different tools/versions/parameters on text extraction, give it a try. You don't need to use Tika or format the output

[jira] [Closed] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed PDFBOX-3800. --- Resolution: Not A Problem > I extract text of a pdf using PDFTextStripper and part of the

[jira] [Updated] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-3800: Attachment: PDFDebugger-screenshot.png Here's a screenshot of the PDFDebugger command line

[jira] [Commented] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018859#comment-16018859 ] Tilman Hausherr commented on PDFBOX-3800: - That is because "Mapping Twitter topic networks: ... "

[jira] [Commented] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018855#comment-16018855 ] Andreas Lehmkühler commented on PDFBOX-3798: Thanks [~tilman], but there is one regression,

[jira] [Commented] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018854#comment-16018854 ] Tilman Hausherr commented on PDFBOX-3798: - no regressions > Truncated file has first page empty

[jira] [Updated] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-3798: --- Fix Version/s: 3.0.0 2.0.7 > Truncated file has first page empty

[jira] [Commented] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018843#comment-16018843 ] ASF subversion and git services commented on PDFBOX-3798: - Commit 1795705 from

[jira] [Issue Comment Deleted] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Alexandre (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandre updated PDFBOX-3800: -- Comment: was deleted (was: I may provide another example of pdf if someone wants it.) > I extract

[jira] [Updated] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Alexandre (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandre updated PDFBOX-3800: -- Description: Hi, I am quite unfamiliar with PDFbox. Still, I spent some time trying to figure out to

[jira] [Commented] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Alexandre (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018842#comment-16018842 ] Alexandre commented on PDFBOX-3800: --- I may provide another example of pdf if someone wants it. > I

[jira] [Updated] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Alexandre (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandre updated PDFBOX-3800: -- Affects Version/s: 2.0.7 > I extract text of a pdf using PDFTextStripper and part of the text is

[jira] [Created] (PDFBOX-3800) I extract text of a pdf using PDFTextStripper and part of the text is missing.

2017-05-21 Thread Alexandre (JIRA)
Alexandre created PDFBOX-3800: - Summary: I extract text of a pdf using PDFTextStripper and part of the text is missing. Key: PDFBOX-3800 URL: https://issues.apache.org/jira/browse/PDFBOX-3800 Project:

[jira] [Assigned] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler reassigned PDFBOX-3798: -- Assignee: Andreas Lehmkühler > Truncated file has first page empty >

[jira] [Commented] (PDFBOX-3798) Truncated file has first page empty

2017-05-21 Thread Andreas Lehmkühler (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018823#comment-16018823 ] Andreas Lehmkühler commented on PDFBOX-3798: The regression was introduced with PDFBOX-3717

[jira] [Created] (PDFBOX-3799) Problem in TextPosition's hashCode

2017-05-21 Thread Miro Mannino (JIRA)
Miro Mannino created PDFBOX-3799: Summary: Problem in TextPosition's hashCode Key: PDFBOX-3799 URL: https://issues.apache.org/jira/browse/PDFBOX-3799 Project: PDFBox Issue Type: Bug