Heh heh... It's rather the opposite: it's a Java library, and the command-line tools are there for convenience :-)

Tilman

On 31.10.2017 at 11:18, Lachezar Dobrev wrote:
   Ahh... You mean use the tool as a *ahm* tool?
   I'm so used to seeing these as parts of the command-line tools that
I've totally forgotten that their inner elements are suitable for use
in code. Thanks.

   I think I'm going to create a Writer implementation that throws an
exception if any non-whitespace character is written to it, and pass it
to writeText(PDDocument,Writer) so that processing is cancelled as soon
as non-whitespace text is found.
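
A minimal sketch of the Writer described above, assuming nothing beyond
the JDK. The class name NonWhitespaceDetectingWriter, the nested
TextFoundException and the containsText helper are hypothetical names
made up for illustration; they are not part of PDFBox. The idea would be
to pass an instance to PDFTextStripper.writeText(PDDocument,Writer) and
treat the exception as "this page has text".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical helper (not part of PDFBox): fails on the first non-whitespace character. */
class NonWhitespaceDetectingWriter extends Writer {

    /** Hypothetical marker exception, so callers can tell "text found" from a real I/O error. */
    static class TextFoundException extends IOException {
        TextFoundException() {
            super("non-whitespace text written");
        }
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        // All other write(...) overloads of java.io.Writer end up here.
        for (int i = off; i < off + len; i++) {
            if (!Character.isWhitespace(cbuf[i])) {
                throw new TextFoundException();
            }
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No underlying resource to release.
    }

    /** Convenience check for plain strings: true if s would trip the detector. */
    static boolean containsText(String s) {
        try (Writer w = new NonWhitespaceDetectingWriter()) {
            w.write(s);
            return false;
        } catch (TextFoundException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that Character.isWhitespace() does not count non-breaking spaces
(such as U+00A0) as whitespace, so a page containing only those would be
reported as having text. Whether that matters depends on the documents.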

2017-10-30 19:54 GMT+02:00 Tilman Hausherr <[email protected]>:
On 30.10.2017 at 16:52, Lachezar Dobrev wrote:
    I have been looking at it. I am actually using a similar approach
to read embedded bar-codes, but there I can test all images.
    The best I can see in ExtractImages is a way to check whether there
is only one image. However, I cannot check whether there is additional
text or other content, so I risk mistakenly skipping a page that has a
single logo (for instance) and lots of other text information.
    I tried looking at PDFTextStripper, but that is hard to follow.

That one is easy... just create the object, set the start and end page, and
then call getText() with the document.

Tilman


    Is there any sure(-ish) sign that there is text on a page that I can
use? Can I check for the existence of something that would tell me
that the page has content other than the single image?

2017-10-30 15:53 GMT+02:00 Tilman Hausherr <[email protected]>:
On 30.10.2017 at 14:04, Lachezar Dobrev wrote:
     I have to process PDF files that (supposedly) contain one big image
per page, produced by a document scanner. I'd like to avoid rendering
the PDF to an image in these cases, and use the underlying image
instead.
     I am not well-versed in all things PDF and have no idea how to
detect whether a page has content other than a single image.
     Please advise.

Please have a look at the ExtractImages.java source code. You can change
that one to your needs.

Tilman

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

