[jira] [Commented] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-18 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591829#comment-14591829 ] Andreas Meier commented on PDFBOX-2831: --- There are several positions in the file,

[jira] [Updated] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-18 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2831: -- Attachment: chya31marked.jpg ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction

[jira] [Commented] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-22 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595410#comment-14595410 ] Andreas Meier commented on PDFBOX-2831: --- Thanks ArrayIndexOutOfBoundsException in

[jira] [Commented] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-19 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593059#comment-14593059 ] Andreas Meier commented on PDFBOX-2831: --- Personally, I don't speak any Arabic

[jira] [Comment Edited] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-19 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593059#comment-14593059 ] Andreas Meier edited comment on PDFBOX-2831 at 6/19/15 6:05 AM:

[jira] [Commented] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-18 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591447#comment-14591447 ] Andreas Meier commented on PDFBOX-2831: --- I had to search for a file on the web,

[jira] [Updated] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-16 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2831: -- Description: PDFBox may fail on extraction of text in method mergeDiacritic(TextPosition

[jira] [Created] (PDFBOX-2831) ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text

2015-06-16 Thread Andreas Meier (JIRA)
Andreas Meier created PDFBOX-2831: - Summary: ArrayIndexOutOfBoundsException in mergeDiacritic() on extraction of text with diacritic text Key: PDFBOX-2831 URL: https://issues.apache.org/jira/browse/PDFBOX-2831

[jira] [Commented] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-23 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638657#comment-14638657 ] Andreas Meier commented on PDFBOX-2272: --- The patch

[jira] [Commented] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640162#comment-14640162 ] Andreas Meier commented on PDFBOX-2272: --- I did not attach the vertical.patch

[jira] [Updated] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2272: -- Attachment: (was: PDFTextStripper.java) Can't extract vertical text correctly

[jira] [Updated] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-23 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2272: -- Attachment: pdfbox_new_vertical_text_extraction.patch Can't extract vertical text correctly

[jira] [Created] (PDFBOX-2879) Wrong vertical text extraction for apache PDFBox 2.0.0

2015-07-13 Thread Andreas Meier (JIRA)
Andreas Meier created PDFBOX-2879: - Summary: Wrong vertical text extraction for apache PDFBox 2.0.0 Key: PDFBOX-2879 URL: https://issues.apache.org/jira/browse/PDFBOX-2879 Project: PDFBox

[jira] [Updated] (PDFBOX-2879) Wrong vertical text extraction for apache PDFBox 2.0.0

2015-07-13 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2879: -- Attachment: Test16.pdf Test15.pdf Test14.pdf Wrong vertical

[jira] [Updated] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-13 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2272: -- Attachment: PDFTextStripper.java PDFTextStripper.java that supports extraction of rotated

[jira] [Commented] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-13 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625868#comment-14625868 ] Andreas Meier commented on PDFBOX-2272: --- I made some small changes to the

[jira] [Updated] (PDFBOX-2252) PDFTextStripper has problem with bilingual documents

2015-07-16 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2252: -- Attachment: PDFTextStripper.java.patch PDFTextStripper has problem with bilingual documents

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with bilingual documents

2015-07-16 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629760#comment-14629760 ] Andreas Meier commented on PDFBOX-2252: --- I am currently reworking the

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with bilingual documents

2015-07-16 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629760#comment-14629760 ] Andreas Meier edited comment on PDFBOX-2252 at 7/16/15 1:56 PM:

[jira] [Commented] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-16 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629782#comment-14629782 ] Andreas Meier commented on PDFBOX-2272: --- Please have a look at Ticket PDFBOX-2252

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630834#comment-14630834 ] Andreas Meier edited comment on PDFBOX-2252 at 7/17/15 6:14 AM:

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630834#comment-14630834 ] Andreas Meier commented on PDFBOX-2252: --- After some investigation I can say:

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630834#comment-14630834 ] Andreas Meier edited comment on PDFBOX-2252 at 7/17/15 8:00 AM:

[jira] [Updated] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-20 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2252: -- Attachment: atest.pdf PDFTextStripper has problem with documents with mixed language

[jira] [Updated] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-20 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2252: -- Attachment: wikipedia_dl_lyric_test.pdf PDFTextStripper has problem with documents with mixed

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-20 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633613#comment-14633613 ] Andreas Meier commented on PDFBOX-2252: --- I created a small test: atest.pdf By the

[jira] [Issue Comment Deleted] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2252: -- Comment: was deleted (was: The (( occurs, because Adobe Reader notices the strong RTL

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634577#comment-14634577 ] Andreas Meier edited comment on PDFBOX-2252 at 7/21/15 6:23 AM:

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634582#comment-14634582 ] Andreas Meier commented on PDFBOX-2252: --- The (( occurs, because Adobe Reader notices

[jira] [Updated] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2252: -- Attachment: overlap.jpg PDFTextStripper has problem with documents with mixed language

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634565#comment-14634565 ] Andreas Meier edited comment on PDFBOX-2252 at 7/21/15 6:14 AM:

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634577#comment-14634577 ] Andreas Meier edited comment on PDFBOX-2252 at 7/21/15 9:43 AM:

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634577#comment-14634577 ] Andreas Meier edited comment on PDFBOX-2252 at 7/21/15 9:43 AM:

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634577#comment-14634577 ] Andreas Meier commented on PDFBOX-2252: --- Yes, numbers are written LTR

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-07-21 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634577#comment-14634577 ] Andreas Meier edited comment on PDFBOX-2252 at 7/21/15 6:24 AM:

[jira] [Commented] (PDFBOX-2272) Can't extract vertical text correctly

2015-07-15 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628012#comment-14628012 ] Andreas Meier commented on PDFBOX-2272: --- Sorry that I didn't answer yesterday, I

[jira] [Comment Edited] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-07 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946604#comment-14946604 ] Andreas Meier edited comment on PDFBOX-2998 at 10/7/15 9:54 AM: Thanks

[jira] [Updated] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-07 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2998: -- Attachment: DropCapSegmentation.jpg DropCapExample5.pdf

[jira] [Comment Edited] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-07 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946604#comment-14946604 ] Andreas Meier edited comment on PDFBOX-2998 at 10/7/15 9:58 AM: Thanks

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-07 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946604#comment-14946604 ] Andreas Meier commented on PDFBOX-2998: --- Thanks for pointing that out. > Enhance the text

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-07 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946324#comment-14946324 ] Andreas Meier commented on PDFBOX-2998: --- I just wanted to fuel the discussion with my snippet. My

[jira] [Comment Edited] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-07 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946324#comment-14946324 ] Andreas Meier edited comment on PDFBOX-2998 at 10/7/15 6:03 AM: I just

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-01 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939764#comment-14939764 ] Andreas Meier commented on PDFBOX-2998: --- I would neither call it "line finding" nor "proper text to

[jira] [Updated] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-01 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2998: -- Attachment: TextBehindText.pdf > Enhance the text extraction capabilities >

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-01 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939843#comment-14939843 ] Andreas Meier commented on PDFBOX-2998: --- You are right, first we need to get the lower level

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-28 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14910107#comment-14910107 ] Andreas Meier edited comment on PDFBOX-2252 at 9/28/15 6:56 AM: I tested

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-28 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14910107#comment-14910107 ] Andreas Meier commented on PDFBOX-2252: --- I tested the latest patch with the documents. There are

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-28 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933076#comment-14933076 ] Andreas Meier commented on PDFBOX-2252: --- Most of the files I see do not provide any article

[jira] [Created] (PDFBOX-2998) Document layout analysis tools needed

2015-09-28 Thread Andreas Meier (JIRA)
Andreas Meier created PDFBOX-2998: - Summary: Document layout analysis tools needed Key: PDFBOX-2998 URL: https://issues.apache.org/jira/browse/PDFBOX-2998 Project: PDFBox Issue Type: New

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-28 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933277#comment-14933277 ] Andreas Meier commented on PDFBOX-2252: --- I have already studied one of the two papers you posted

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-28 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933277#comment-14933277 ] Andreas Meier edited comment on PDFBOX-2252 at 9/28/15 1:23 PM: I have

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906244#comment-14906244 ] Andreas Meier commented on PDFBOX-2252: --- In fact I don't know if there are some built-in classes

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905915#comment-14905915 ] Andreas Meier edited comment on PDFBOX-2252 at 9/24/15 6:59 AM: Even if

[jira] [Comment Edited] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905915#comment-14905915 ] Andreas Meier edited comment on PDFBOX-2252 at 9/24/15 7:00 AM: Even if

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905915#comment-14905915 ] Andreas Meier commented on PDFBOX-2252: --- Even if the patch doesn't look like it, I had a hard time

[jira] [Updated] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-24 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2252: -- Attachment: BidiMirroring.txt PDFTextStripper.java.patch > PDFTextStripper has

[jira] [Commented] (PDFBOX-2252) PDFTextStripper has problem with documents with mixed language directions

2015-09-23 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904040#comment-14904040 ] Andreas Meier commented on PDFBOX-2252: --- I want to provide a new patch to address this problem,

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944999#comment-14944999 ] Andreas Meier commented on PDFBOX-2998: --- The question is, when is a group of textpositions forming

[jira] [Comment Edited] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944999#comment-14944999 ] Andreas Meier edited comment on PDFBOX-2998 at 10/6/15 1:09 PM: The

[jira] [Comment Edited] (PDFBOX-2998) Enhance the text extraction capabilities

2015-10-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944999#comment-14944999 ] Andreas Meier edited comment on PDFBOX-2998 at 10/6/15 1:14 PM: The

[jira] [Issue Comment Deleted] (PDFBOX-2998) Enhance the text extraction capabilities

2015-12-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2998: -- Comment: was deleted (was: I think it is the right place to comment. Writing an algorithm to

[jira] [Issue Comment Deleted] (PDFBOX-2998) Enhance the text extraction capabilities

2015-12-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-2998: -- Comment: was deleted (was: I think it is the right place to comment. Writing an algorithm to

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-12-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062122#comment-15062122 ] Andreas Meier commented on PDFBOX-2998: --- I think it is the right place to comment. Writing an

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-12-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062121#comment-15062121 ] Andreas Meier commented on PDFBOX-2998: --- I think it is the right place to comment. Writing an

[jira] [Commented] (PDFBOX-2998) Enhance the text extraction capabilities

2015-12-17 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062124#comment-15062124 ] Andreas Meier commented on PDFBOX-2998: --- I think it is the right place to comment. Writing an

[jira] [Comment Edited] (PDFBOX-4141) Suppress control characters?

2018-03-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389166#comment-16389166 ] Andreas Meier edited comment on PDFBOX-4141 at 3/7/18 7:23 AM: --- {quote}What

[jira] [Commented] (PDFBOX-4141) Suppress control characters?

2018-03-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389166#comment-16389166 ] Andreas Meier commented on PDFBOX-4141: --- {quote}What is the meaning of the table columns? Convert

[jira] [Commented] (PDFBOX-4141) Suppress control characters?

2018-03-15 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400154#comment-16400154 ] Andreas Meier commented on PDFBOX-4141: --- The last few days I searched for files like the one that

[jira] [Closed] (PDFBOX-4141) Suppress control characters?

2018-03-15 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier closed PDFBOX-4141. - Resolution: Feedback Received > Suppress control characters? > > >

[jira] [Created] (PDFBOX-4141) Suppress control characters?

2018-03-05 Thread Andreas Meier (JIRA)
Andreas Meier created PDFBOX-4141: - Summary: Suppress control characters? Key: PDFBOX-4141 URL: https://issues.apache.org/jira/browse/PDFBOX-4141 Project: PDFBox Issue Type: Improvement

[jira] [Updated] (PDFBOX-4141) Suppress control characters?

2018-03-05 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-4141: -- Attachment: Test_without_MW.txt Test_with_MW_linux.jpg

[jira] [Updated] (PDFBOX-4141) Suppress control characters?

2018-03-05 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Meier updated PDFBOX-4141: -- Attachment: Mapping_default_to_adobe.csv > Suppress control characters? >

[jira] [Comment Edited] (PDFBOX-4141) Suppress control characters?

2018-03-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387439#comment-16387439 ] Andreas Meier edited comment on PDFBOX-4141 at 3/6/18 8:13 AM: --- Thanks for

[jira] [Commented] (PDFBOX-4141) Suppress control characters?

2018-03-06 Thread Andreas Meier (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387439#comment-16387439 ] Andreas Meier commented on PDFBOX-4141: --- Thanks for the info Tilman. Overriding the characters in