Re: Copyright clairification for fo2html.xsl in Derby software.

Kim Haase Tue, 10 Jan 2012 10:15:39 -0800

I just tried it. Each line of the extracted text is wrapped in a paragraph tag, and each page is wrapped in div tags. That's all. No other HTML tags are used.


<p>Installing Java DB
</p>

<p>Java DB is installed automatically as part of the Java SE Development Kit (JDK).

</p>
<p>To obtain the JDK, navigate your web browser to
</p>

<p>http://www.oracle.com/technetwork/java/javase/downloads/ and click the Download JDK

</p>
<p>button. Follow the instructions on subsequent pages.
</p>
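
To make that structure concrete, here is a minimal Python sketch of the wrapping described above. It is not PDFBox's implementation: the function name `pages_to_html` and the list-of-lines input are assumptions made for illustration, and it puts the closing tag on the same line as the text, whereas the sample above shows it on the next line.

```python
from html import escape


def pages_to_html(pages):
    """Wrap each text line in <p>...</p> and each page in <div>...</div>.

    `pages` is a list of pages; each page is a list of extracted text lines.
    This input shape is hypothetical, chosen only for this sketch.
    """
    parts = []
    for lines in pages:
        parts.append("<div>")
        for line in lines:
            # Escape &, <, > so the extracted text cannot introduce other tags.
            parts.append(f"<p>{escape(line)}</p>")
        parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    page = [
        "Installing Java DB",
        "To obtain the JDK, navigate your web browser to",
    ]
    print(pages_to_html([page]))
```

Because every line becomes its own paragraph, a sentence that wraps across lines in the PDF ends up split across several paragraph tags, which matches what the sample shows.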


Kim

On 01/10/12 12:15 PM, Andrew McIntyre wrote:

On Tue, Jan 10, 2012 at 5:46 AM, Rick Hillegas wrote:


I ran a quick experiment: I removed fo2html.xsl and verified that I could
build the frames html docs. Here are some solutions listed in declining
order of effort:

<snip options>

Thanks,
-Rick


Another option would be to use PDFBox's ExtractText utility to convert
the PDFs generated by FOP into HTML:

http://pdfbox.apache.org/commandlineutilities/ExtractText.html

I haven't tried it yet, so I can't speak to its accuracy or
presentation, but it would be another easy solution, and it's
definitely licensed under the Apache License. :-)

- andrew
