[
https://issues.apache.org/jira/browse/PDFBOX-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr closed PDFBOX-4337.
-----------------------------------
Resolution: Won't Do
Closing; you can still comment.
> Could extract all elements(Text, Image, Table, etc) dynamically in sequence
> from pdf file
> ------------------------------------------------------------------------------------------
>
> Key: PDFBOX-4337
> URL: https://issues.apache.org/jira/browse/PDFBOX-4337
> Project: PDFBox
> Issue Type: Wish
> Reporter: RuhongCai
> Priority: Major
> Attachments: sample_pdf.pdf
>
>
> We are trying to compare two PDF files at run time and detect the insertions,
> deletions, and modifications between them.
> PDFBox works well for extracting the text of both files, but that is not
> enough for us.
> Is there an API in PDFBox, or any workaround, to read/extract all components
> (table, image, text, etc.) from a PDF file in sequence and return some
> related useful information?
> The attached sample file contains text, a table, and an image, and is not
> well formatted. Reading the elements/components in sequence
> would let us do the further comparison work.
> [^sample_pdf.pdf]
>
> Many thanks!
>
>
>
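The comparison step described in the quoted issue can be sketched independently of PDFBox. The following is a minimal sketch, assuming each PDF has already been reduced to an ordered list of element descriptors (strings such as "TEXT:Intro" or "TABLE:3x2"). How those descriptors are produced is not covered here. The class name ElementDiff, the method diff, and the descriptor format are all hypothetical and not part of PDFBox. The sketch uses a standard longest-common-subsequence diff to label each element as kept, deleted, or inserted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a PDFBox class: diffs two ordered lists of
// element descriptors that some earlier extraction step is assumed to provide.
final class ElementDiff {

    // Returns one operation per element: "KEEP x", "DELETE x" or "INSERT x".
    static List<String> diff(List<String> oldElems, List<String> newElems) {
        int n = oldElems.size();
        int m = newElems.size();

        // lcs[i][j] = length of the longest common subsequence of
        // oldElems[i..] and newElems[j..], filled from the end backwards.
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (oldElems.get(i).equals(newElems.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }

        // Walk both lists from the front, following the LCS table.
        List<String> ops = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (oldElems.get(i).equals(newElems.get(j))) {
                ops.add("KEEP " + oldElems.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                ops.add("DELETE " + oldElems.get(i));
                i++;
            } else {
                ops.add("INSERT " + newElems.get(j));
                j++;
            }
        }
        // Whatever remains in either list is a pure deletion or insertion.
        while (i < n) {
            ops.add("DELETE " + oldElems.get(i++));
        }
        while (j < m) {
            ops.add("INSERT " + newElems.get(j++));
        }
        return ops;
    }
}
```

For the lists [TEXT:Intro, TABLE:3x2, IMAGE:logo] and [TEXT:Intro, IMAGE:logo, TEXT:Footer], this yields KEEP TEXT:Intro, DELETE TABLE:3x2, KEEP IMAGE:logo, INSERT TEXT:Footer. A DELETE directly followed by an INSERT of the same element type could be reported as a modification in a later pass.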