Re: Convert PDF to HTML with PDFBox in a Java app - Need some introductory info & guidance

Serban Alexe Fri, 02 Feb 2018 00:06:07 -0800

Thanks for the hints, I'll look into both of them.

I'm aware that it's not possible to obtain something that looks exactly like
the original PDF. I'm aiming for something as close as possible, at least
in terms of content.


*As an alternative*, I could settle for a solution that extracts each page
from the PDF as an individual image. What options would I have in this
case?
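
For reference, one possible shape of the image route, as a minimal sketch
rather than a definitive implementation. It assumes each page has already
been rendered to a BufferedImage (in PDFBox 2.x, PDFRenderer's
renderImageWithDPI(pageIndex, dpi) returns one) and only covers writing those
images into an HTML file as base64 data URIs. The class and method names
(PageImageHtml, toDataUri, toHtml) are hypothetical, and only the JDK is used:

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import java.util.List;
import javax.imageio.ImageIO;

/** Hypothetical helper; the class and method names are illustrative only. */
final class PageImageHtml {

    /** Encodes one rendered page as PNG and wraps it in a base64 data URI. */
    static String toDataUri(BufferedImage page) throws IOException {
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer supports the image type.
        if (!ImageIO.write(page, "png", png)) {
            throw new IOException("No PNG writer available for this image type");
        }
        return "data:image/png;base64,"
                + Base64.getEncoder().encodeToString(png.toByteArray());
    }

    /** Builds a bare HTML document with one <img> element per page, in order. */
    static String toHtml(List<BufferedImage> pages) throws IOException {
        StringBuilder html = new StringBuilder("<!DOCTYPE html>\n<html><body>\n");
        for (int i = 0; i < pages.size(); i++) {
            html.append("<img alt=\"Page ").append(i + 1)
                .append("\" src=\"").append(toDataUri(pages.get(i)))
                .append("\">\n");
        }
        return html.append("</body></html>\n").toString();
    }
}
```

The string returned by toHtml can be written to a file as-is. Since every
page is inlined, the HTML file grows with the page count and the rendering
resolution.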

Thanks.



On 2018/02/01 16:14:00, Serban Alexe <s...@gmail.com> wrote:
> Hello everybody,
>
> I need to write a Java class that converts a *.pdf* document to the html
> format, preferably keeping the original formatting to the best extent
> possible.
> Also, I need to be able to extract the images (and preferably encode them
> as base64 in the html file).
>
> *Can you please provide me some useful starting points and/or examples?*
>
> Through google search, I was able to find some limited functionality
> examples. None of these deal with images, and also my guess is that they
> refer to some older version of the PDFBox suite...
>
> Thank you,
>
> Serban
