Hi,

You can either extract to HTML (for example, call ExtractText with the -html
option) or create your own logic. For the latter, take a look at
org.apache.pdfbox.util.PDFText2HTML as a starting point.
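
If you go the "own logic" route, the core of it is deciding which HTML tags to
wrap around each piece of extracted text. Below is a minimal sketch of that
step only, in plain Java without any PDFBox calls. The TextRun class and its
bold/italic flags are hypothetical names made up for this example. The sketch
assumes you have already collected the text pieces and their font properties
yourself (for instance in a subclass of the text stripper); how you do that is
not shown here, and this is not how PDFText2HTML itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class RunsToHtml {

    /** Hypothetical holder for one piece of text and its font flags. */
    static final class TextRun {
        final String text;
        final boolean bold;
        final boolean italic;

        TextRun(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Escapes the characters that have a special meaning in HTML text. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Wraps each run in b/i tags as needed and puts all runs in one paragraph. */
    static String toHtml(List<TextRun> runs) {
        StringBuilder html = new StringBuilder("<p>");
        for (TextRun run : runs) {
            if (run.bold) html.append("<b>");
            if (run.italic) html.append("<i>");
            html.append(escape(run.text));
            // Close in reverse order so the tags stay properly nested.
            if (run.italic) html.append("</i>");
            if (run.bold) html.append("</b>");
        }
        return html.append("</p>").toString();
    }

    public static void main(String[] args) {
        List<TextRun> runs = new ArrayList<TextRun>();
        runs.add(new TextRun("Total: ", true, false));
        runs.add(new TextRun("5 < 7", false, false));
        System.out.println(toHtml(runs));
    }
}
```

Keeping the tag logic separate from the extraction makes it easy to test on
its own and to extend later, for example with font sizes mapped to headings.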

There is also a project that converts PDF to SVG using PDFBox as a basis, which
might also serve as an example (https://bitbucket.org/petermr/pdftosvg).

BR
Maruan Sahyoun

On 07.05.2013 at 08:39, rahul bhalla <[email protected]> wrote:

> Is it possible to extract text from a PDF without losing the formatting?
> Or, when the text is extracted, can it be marked up with the tags we use in HTML?
> 
> Thanks
> 
> -- 
> Regards
> Rahul Bhalla
