[ 
https://issues.apache.org/jira/browse/PDFBOX-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr reassigned PDFBOX-2792:
---------------------------------------

    Assignee: Tilman Hausherr

> Text extraction ignores bookmarks
> ---------------------------------
>
>                 Key: PDFBOX-2792
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2792
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.9, 2.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>
> As reported by Noam S. on the user mailing list:
> {quote}
> My problem is that when trying to getText(doc) from a certain section of the 
> PDF using setStartBookmark(item) and setEndBookmark(item), I get all the text 
> rather than just the text from the specified section.
> While trying to resolve this I realized that the writeText(doc, outputStream) 
> method always calls the resetEngine() method. That resets all the parameters 
> and deletes the bookmarks I set.
> {quote}
> The two lines that reset the bookmarks were added to resetEngine in 
> PDFBOX-1808 in [ https://svn.apache.org/r1553175 ] in an attempt to save some 
> memory.
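
The failure mode described above can be illustrated with a small, self-contained
sketch. This is a toy model, not the PDFBox source: ToyStripper, its page-number
"bookmarks", and the clearBookmarksOnReset flag are all hypothetical names
introduced for illustration, and the toy getText calls resetEngine directly
rather than going through writeText. It only shows why clearing user-set range
parameters inside a reset that runs at the start of every extraction makes the
setters ineffective.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the reported behaviour. NOT PDFBox code; all names are hypothetical. */
public class BookmarkResetSketch {

    /** Stand-in for a text stripper whose output range is limited by two bookmarks. */
    static class ToyStripper {
        private final boolean clearBookmarksOnReset;
        private Integer startBookmarkPage; // null means "no start bookmark set"
        private Integer endBookmarkPage;   // null means "no end bookmark set"

        ToyStripper(boolean clearBookmarksOnReset) {
            this.clearBookmarksOnReset = clearBookmarksOnReset;
        }

        void setStartBookmarkPage(int page) { startBookmarkPage = page; }

        void setEndBookmarkPage(int page) { endBookmarkPage = page; }

        /** Runs before every extraction; optionally discards the caller's bookmarks. */
        private void resetEngine() {
            if (clearBookmarksOnReset) {
                startBookmarkPage = null;
                endBookmarkPage = null;
            }
        }

        /** Returns the text of the selected pages (1-based, inclusive), one per line. */
        String getText(List<String> pages) {
            resetEngine();
            int first = (startBookmarkPage == null) ? 1 : startBookmarkPage;
            int last = (endBookmarkPage == null) ? pages.size() : endBookmarkPage;
            StringBuilder sb = new StringBuilder();
            for (int p = first; p <= last; p++) {
                sb.append(pages.get(p - 1)).append('\n');
            }
            return sb.toString();
        }
    }

    /** Sets a range of pages 2 to 3, then extracts with or without the clearing reset. */
    static String extract(boolean clearBookmarksOnReset) {
        List<String> pages = Arrays.asList("intro", "section A", "section B", "appendix");
        ToyStripper stripper = new ToyStripper(clearBookmarksOnReset);
        stripper.setStartBookmarkPage(2);
        stripper.setEndBookmarkPage(3);
        return stripper.getText(pages);
    }

    public static void main(String[] args) {
        System.out.print(extract(true));  // bookmarks wiped: all four pages are returned
        System.out.print(extract(false)); // bookmarks kept: only pages 2 and 3 are returned
    }
}
```

In this toy, dropping the two clearing assignments from resetEngine restores the
expected range. Whether the same change is the right fix in PDFBox itself is for
the issue to determine.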



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
