Hi Daniel,

The command you are using extracts the images contained in the PDF; it
doesn't render the PDF pages into images.

Use PDFToImage instead:
https://pdfbox.apache.org/2.0/commandline.html#pdftoimage
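
As a rough sketch, reusing the jar and PDF path from your message, the
call could look like the following. The output prefix "render-test", the
PNG format and the 150 dpi value are only assumptions for illustration;
adjust them as needed.

```shell
# Paths taken from the original message; PREFIX is a hypothetical output name.
JAR=pdfbox-app-2.0.26.jar
PDF=/external/gre_research_validity_data.pdf
PREFIX=/external/render-test

# PDFToImage renders whole pages, unlike ExtractImages, which only pulls out
# the embedded image objects. -format, -dpi and -prefix are PDFToImage options.
CMD="java -jar $JAR PDFToImage -format png -dpi 150 -prefix $PREFIX $PDF"
echo "$CMD"

# Run only when the jar and the PDF are actually present.
# $CMD is left unquoted on purpose so it splits into arguments
# (the paths above contain no spaces).
if [ -f "$JAR" ] && [ -f "$PDF" ]; then
    $CMD
fi
```

This should write one image per page, named after the prefix plus the
page number (render-test-1.png, render-test-2.png, and so on).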

BR
Maruan

On Tuesday, 02.08.2022 at 15:31 +0000, Daniel Earwicker wrote:
> Hi, this project looks perfect for my needs - converting PDF pages
> into images for easy rendering elsewhere. This is very much my first
> try so apologies in advance if this is a stupid question, but in the
> docs at https://pdfbox.apache.org/2.0/commandline.html I can't see
> any options that might improve the output.
> 
> Here's a side-by-side comparison, ExtractImages output on the left,
> and the PDF opened in Chrome on the right:
> 
> https://imgur.com/a/KgNAZQ2
> 
> The PDF is an example I got from:
> https://www.ets.org/Media/Tests/GRE/pdf/gre_research_validity_data.pdf
> 
> Just in case this is relevant, I ran it in a clean Debian container:
> 
>     docker run -it -v c:/Users/me:/external debian:bullseye-slim
> 
>     apt update
>     apt install openjdk-17-jre -y
>     apt install wget -y
>     wget https://dlcdn.apache.org/pdfbox/2.0.26/pdfbox-app-2.0.26.jar
> 
> and then tested with:
> 
>     java -jar pdfbox-app-2.0.26.jar ExtractImages -prefix /external/extract-test /external/gre_research_validity_data.pdf
> 
> The screenshot is of the resulting extract-test-2.jpg file.
> 
> There's obviously some problem with the colours, and there's also a
> lot of extra stuff in the page margins that Chrome somehow knows it
> ought to hide. Is there any way to configure this extraction process
> so that the image looks the way Chrome displays it? And would this
> kind of accurate rendering work for the majority of PDFs? (This is
> the first one I tried.) Thanks!

-- 
Maruan Sahyoun

FileAffairs GmbH
Josef-Schappe-Straße 21
40882 Ratingen

Tel: +49 (2102) 89497 88
Fax: +49 (2102) 89497 91
sahy...@fileaffairs.de
www.fileaffairs.de

Geschäftsführer: Maruan Sahyoun
Handelsregister: AG Düsseldorf, HRB 53837
UST.-ID: DE248275827
