If you don't find a ready-made solution, you could run an OCR tool that reports
the x/y positions of words, such as 'cuneiform -l eng -f hocr', and then look
for holes on the page that contain no words.
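
As a rough illustration of the "look for holes" step, here is a minimal Python
sketch. It assumes the hOCR file marks text with elements of class ocrx_word or
ocr_line that carry a title="bbox x0 y0 x1 y1" attribute (which classes a given
OCR tool actually emits may differ), and it only looks for full-width
horizontal bands without text, which is a simplification of true hole
detection. The function names and the min_height threshold are made up for
this example.

```python
import re
from html.parser import HTMLParser

# hOCR stores geometry in the title attribute, e.g. title="bbox 50 40 950 80"
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class BoxCollector(HTMLParser):
    """Collect (x0, y0, x1, y1) boxes from hOCR elements of the given classes."""

    def __init__(self, classes=("ocrx_word", "ocr_line")):
        super().__init__()
        self.classes = set(classes)  # assumption: text elements use these classes
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css = set((attrs.get("class") or "").split())
        match = BBOX_RE.search(attrs.get("title") or "")
        if match and css & self.classes:
            self.boxes.append(tuple(int(v) for v in match.groups()))


def text_boxes(hocr):
    """Return the bounding boxes of all text elements in an hOCR string."""
    collector = BoxCollector()
    collector.feed(hocr)
    return collector.boxes


def empty_bands(boxes, page_height, min_height):
    """Return (top, bottom) horizontal bands, at least min_height pixels
    tall, that no text box overlaps."""
    bands = []
    cursor = 0  # lowest y coordinate covered by text so far
    for _, top, _, bottom in sorted(boxes, key=lambda b: b[1]):
        if top - cursor >= min_height:
            bands.append((cursor, top))
        cursor = max(cursor, bottom)
    if page_height - cursor >= min_height:
        bands.append((cursor, page_height))
    return bands


if __name__ == "__main__":
    # Hand-written sample, not real OCR output.
    sample = """
    <div class='ocr_page' title='bbox 0 0 1000 1400'>
      <span class='ocr_line' title='bbox 50 40 950 80'>Question text</span>
      <span class='ocr_line' title='bbox 50 700 950 740'>More text</span>
    </div>
    """
    print(empty_bands(text_boxes(sample), page_height=1400, min_height=200))
```

The bands are in the pixel coordinates of the image that was fed to the OCR
tool, so they could be used to crop candidate picture regions out of the page
image that pdfimages produced. A band may also be plain white space, so the
candidates would still need a check for non-blank pixels.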

________________________________
From: poppler <[email protected]> on behalf of Albretch 
Mueller <[email protected]>
Sent: Tuesday, September 3, 2019 11:36 AM
To: [email protected] <[email protected]>
Subject: [poppler] (preferably Linux-based, OS) utility to extract images from 
image-based pdf files ...

The output of pdfimages would be a whole-page image if the input is a
non-searchable, image-based PDF file. Take for example:

 https://www.nysedregents.org/ushistorygov/Archive/20000126exam.pdf

 Which utility would detect the cartoons on pages 6 and 7?

 lbrtchx
_______________________________________________
poppler mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/poppler