[jira] [Commented] (PDFBOX-3737) Add a method to process page directly

Tilman Hausherr (JIRA) Tue, 28 Mar 2017 11:51:53 -0700

    [ https://issues.apache.org/jira/browse/PDFBOX-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945724#comment-15945724 ]


Tilman Hausherr commented on PDFBOX-3737:
-----------------------------------------

That's a design decision. PDPage holds the page-related methods, i.e. reading 
and writing the page structures. Other jobs that access these structures, e.g. 
rendering and text extraction, live in separate classes. PDFTextStripper also 
has more options than just the start and end page.
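
For illustration, a minimal sketch of page-restricted extraction with one extra 
option set, assuming the PDFBox 2.0.x API. The class name PageRangeText and the 
helper extract are hypothetical names made up for this example; setStartPage, 
setEndPage, setSortByPosition and getText are existing PDFTextStripper methods.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class, not part of PDFBox.
public class PageRangeText {

    /**
     * Returns the text of pages startPage..endPage (1-based, inclusive).
     * sortByPosition is one of the additional PDFTextStripper options.
     */
    static String extract(PDDocument document, int startPage, int endPage,
                          boolean sortByPosition) throws IOException {
        if (startPage < 1 || endPage < startPage
                || endPage > document.getNumberOfPages()) {
            throw new IllegalArgumentException(
                    "invalid page range: " + startPage + "-" + endPage);
        }
        PDFTextStripper stripper = new PDFTextStripper();
        stripper.setStartPage(startPage);           // first page to extract
        stripper.setEndPage(endPage);               // last page to extract
        stripper.setSortByPosition(sortByPosition); // order text by position on the page
        return stripper.getText(document);
    }
}
```

A single page is then just extract(document, n, n, false). With 2.0.x the 
document would typically come from PDDocument.load(file) and be closed by the 
caller.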

Btw, you don't have to extend PDFTextStripper. That is only needed for 
low-level tasks such as getting the positions of individual glyphs. For simple 
text extraction you just create a PDFTextStripper object.
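
As a rough sketch of the low-level case, assuming the PDFBox 2.0.x API: a 
subclass can override the protected writeString(String, List<TextPosition>) 
hook to see each glyph's TextPosition. The class name GlyphPositionStripper and 
its getGlyphs accessor are hypothetical names for this example only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

// Hypothetical subclass, not part of PDFBox.
public class GlyphPositionStripper extends PDFTextStripper {

    // Glyphs collected from the pages processed so far.
    private final List<TextPosition> glyphs = new ArrayList<>();

    public GlyphPositionStripper() throws IOException {
        super();
    }

    @Override
    protected void writeString(String text, List<TextPosition> textPositions)
            throws IOException {
        // Remember the glyphs, then let the normal text output proceed.
        glyphs.addAll(textPositions);
        super.writeString(text, textPositions);
    }

    public List<TextPosition> getGlyphs() {
        return glyphs;
    }
}
```

After calling getText(document) on such an object, each collected TextPosition 
offers getUnicode(), getXDirAdj() and getYDirAdj(). When only the text itself is 
needed, a plain PDFTextStripper is enough.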


> Add a method to process page directly
> -------------------------------------
>
>                 Key: PDFBOX-3737
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3737
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>    Affects Versions: 2.0.0, 2.0.5
>            Reporter: Dewang Sun
>            Priority: Minor
>             Fix For: 2.0.6
>
>
> If you want to process a page, you need to extend *PDFTextStripper* and 
> invoke *setStartPage*, *setEndPage* and *processPage*. So why not add a 
> method to process a page directly?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

