[
https://issues.apache.org/jira/browse/PDFBOX-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447176#comment-17447176
]
zzz commented on PDFBOX-5327:
-----------------------------
Thanks very much, my friend. I have found the cause of the problem in the source code.
!image-2021-11-22-10-52-38-107.png!
A word separator should be added between "CONSIGN TO: " and "PACKING LIST",
but the first text ends with a space, which causes the condition
"!lastPosition.getTextPosition().getUnicode().endsWith(" ")" to be false.
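
The effect of that condition can be modeled in isolation. The following is only a minimal sketch, not the PDFBox source: the class name WordSeparatorSketch, the helper needsWordSeparator, and the coordinates are made up for illustration, and the gap test is a simplified assumption about how the stripper decides that a new word has started.

```java
public class WordSeparatorSketch {

    /**
     * Hypothetical stand-in for the check quoted above.
     * lastUnicode plays the role of lastPosition.getTextPosition().getUnicode().
     */
    public static boolean needsWordSeparator(String lastUnicode,
                                             float expectedStartOfNextWordX,
                                             float positionX) {
        // Assumption: a gap exists when the next glyph starts to the right
        // of where the next word would be expected to start.
        boolean gapDetected = expectedStartOfNextWordX < positionX;
        // The condition from the comment: no separator is added if the
        // previous text already ends with a space.
        boolean lastEndsWithSpace = lastUnicode != null && lastUnicode.endsWith(" ");
        return gapDetected && !lastEndsWithSpace;
    }

    public static void main(String[] args) {
        // Illustrative coordinates only; they are not taken from the attached PDF.
        // Previous text ends with a space: no separator is added.
        System.out.println(needsWordSeparator("CONSIGN TO: ", 100f, 250f));
        // Previous text ends with ':' instead: a separator is added.
        System.out.println(needsWordSeparator("CONSIGN TO:", 100f, 250f));
    }
}
```

Under these assumptions, the trailing space after "CONSIGN TO:" is enough to suppress the separator even though the horizontal gap to "PACKING LIST" is large.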
> Parse text from two rectangles to one rectangle
> -----------------------------------------------
>
> Key: PDFBOX-5327
> URL: https://issues.apache.org/jira/browse/PDFBOX-5327
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 2.0.24
> Reporter: zzz
> Priority: Major
> Attachments: 072B006805-P32939I-(2)(1).pdf, 1-1.png, 2-1.png, 3-1.png
>
>
> Rectangle(CONSIGN TO) and Rectangle(PACKING LIST) are treated as one
> Rectangle(CONSIGN TO PACKING LIST)