To render the pages of a PDF as images, look at the source code of the
PDFToImage command line utility
(https://pdfbox.apache.org/2.0/commandline.html#pdftoimage). I adapted it
for my own PDF-to-image conversion. From a quick glance at my old code, I
think the class you want is org.apache.pdfbox.rendering.PDFRenderer.
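
Since you also mentioned base64-encoding images into the HTML, here is a
minimal sketch of that step, using only the JDK. The class name
PageImageHtml and the method toImgTag are made up for illustration, and the
blank image in main is just a stand-in for a rendered page. The comment in
main shows how I believe the page image would be obtained with PDFBox 2.0
(the file name and DPI value there are assumptions); check it against the
PDFToImage source before relying on it.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

public class PageImageHtml {

    // Encodes the image as PNG and wraps it in an <img> tag whose source
    // is a base64 data URI, so the HTML file needs no external image files.
    public static String toImgTag(BufferedImage image) throws IOException {
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", png)) {
            throw new IOException("No PNG writer available");
        }
        String base64 = Base64.getEncoder().encodeToString(png.toByteArray());
        return "<img src=\"data:image/png;base64," + base64 + "\"/>";
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a rendered page. With PDFBox 2.0 on the classpath,
        // the image would instead come from something like:
        //   PDDocument document = PDDocument.load(new File("input.pdf"));
        //   PDFRenderer renderer = new PDFRenderer(document);
        //   BufferedImage page =
        //       renderer.renderImageWithDPI(pageIndex, 150, ImageType.RGB);
        BufferedImage page = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        System.out.println(toImgTag(page));
    }
}
```

Calling toImgTag once per page index and concatenating the results would
give one image per page in a single HTML file, at the cost of a large file
for long documents.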

On Fri, Feb 2, 2018 at 3:04 AM, Serban Alexe <serban.al...@gmail.com> wrote:

> Thanks for the hints, I'll look into both of them.
>
> I'm aware that it's not possible to obtain something that looks like the
> original PDF; I'm rather aiming for something as close as possible, at
> least from the content perspective.
>
> *As an alternative*, I could settle for a solution that extracts each page
> from the PDF as an individual image. What options would I have in this
> case?
>
> Thanks.
>
>
>
> On 2018/02/01 16:14:00, Serban Alexe <s...@gmail.com> wrote:
> > Hello everybody,
> >
> > I need to write a Java class that converts a *.pdf* document to the html
> > format, preferably keeping the original formatting to the best extent
> > possible.
> > Also, I need to be able to extract the images (and preferably encode them
> > as base64 in the html file).
> >
> > *Can you please provide me some useful starting points and/or examples?*
> >
> > Through google search, I was able to find some limited functionality
> > examples. None of these deal with images, and also my guess is that they
> > refer to some older version of the PDFBox suite...
> >
> > Thank you,
> >
> > Serban
> >
>



-- 
"Hell hath no limits, nor is circumscrib'd In one self-place; but where we
are is hell, And where hell is, there must we ever be" --Christopher
Marlowe, *Doctor Faustus* (v. 121-24)
