New To Fop

2005-01-04 Thread Luke Shannon
Hello;

I need advice from more experienced FOP developers. The web project I am
contracting on needs to be able to generate a PDF version of various pages
the user may be browsing.

As of now the only input I have to work with is the HTML of the page being
displayed (the system can return it to me as a string during runtime).

Speed is a factor, so one requirement is that the system only creates a new
PDF document when the previously created one is out of sync with the content.
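
To make the "out of sync" requirement concrete, here is a minimal sketch,
assuming the check can be done by hashing the HTML string the system returns
and comparing it with the hash stored when the last PDF was built. The class
name, method names and the pageId key are hypothetical, chosen only for
illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Remembers, per page, a hash of the HTML the current PDF was built from. */
final class PdfStalenessCheck {
    // pageId -> hex SHA-256 digest of the HTML used for the last PDF
    private final Map<String, String> lastRendered = new ConcurrentHashMap<>();

    /** True if no PDF has been recorded for the page or its HTML has changed. */
    boolean needsNewPdf(String pageId, String html) {
        return !sha256(html).equals(lastRendered.get(pageId));
    }

    /** Call this after a PDF has been generated successfully from this HTML. */
    void markRendered(String pageId, String html) {
        lastRendered.put(pageId, sha256(html));
    }

    /** Hex-encoded SHA-256 digest of the text, encoded as UTF-8. */
    static String sha256(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        PdfStalenessCheck check = new PdfStalenessCheck();
        String html = "<html><body><p>v1</p></body></html>";
        System.out.println(check.needsNewPdf("home", html)); // nothing rendered yet
        check.markRendered("home", html);
        System.out.println(check.needsNewPdf("home", html)); // same HTML again
    }
}
```

The map lives in memory only, so every PDF would be rebuilt once after a
restart; persisting the hash next to the generated file would avoid that.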

I need to get this done fast. Can someone suggest what they think the best
strategy will be for me to create the document? Should I use an .fo input?
Transform the HTML into XML and process it with an XSL?

Any tips from someone who has done something similar would be very helpful
and appreciated.

With Regards,

Luke


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: New To Fop

2005-01-04 Thread Will Gilbert
Luke,
What you are looking to do comes up pretty often on this list so you 
will probably get quite a bit of help.

Here's mine:
What you will need to do is go from HTML into XML and then into FO. Once
in FO, FOP can render it quite quickly into a PDF; your browser can even
be used as the delivery mechanism.

I wrote a Java Servlet which is invoked via an HTML page link; the
link passes the necessary parameters. In your case that will be a
reference to the original HTML file.

The next step is not obvious, hence this e-mail. Not all HTML is
XML-ready: humans make mistakes which most browsers correct, unbalanced
and missing tags for example. Also, some tags need to be doctored; BR
and HR come to mind, as these have no closing tags. What I did here was
to use the Tidy engine/library to fix up my HTML into valid XML.
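
To see whether a given page needs cleaning up at all, a check along these
lines may help. It is only a sketch using the XML parser that ships with the
JDK, not the Tidy library itself, and it assumes the markup carries no
DOCTYPE or HTML-only entities such as &nbsp;. The class and method names are
made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Reports whether markup can be parsed as XML without any clean-up. */
final class XmlReadyCheck {

    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse problems quietly instead of printing them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignore warnings */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // unbalanced or missing tags end up here
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("XML parser could not be used", e);
        }
    }

    public static void main(String[] args) {
        // <br> without a closing tag is fine for a browser but not for XML.
        System.out.println(isWellFormedXml("<p>one<br>two</p>"));
        // The doctored form, with the tag closed, parses.
        System.out.println(isWellFormedXml("<p>one<br/>two</p>"));
    }
}
```

Markup that fails this check is what needs to go through Tidy before the
XSL step.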

Now the job gets pretty easy...
The next step is to develop an XSL transform which takes HTML tags and
creates FO XML. I have some transforms which I am very happy to share
with you, as will others. Nobody has a complete HTML-to-FO
implementation, as this would be huge, but you can get most of the
transform working quickly and then add to it as needed.
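
As a rough idea of what such a transform can look like, here is a small
sketch. It is not one of the transforms mentioned above: the stylesheet,
the page size and the class name are all assumptions made for illustration,
and it only handles h1, p, b/strong and br. It also assumes the cleaned-up
HTML has no XML namespace on its elements; if your input is namespaced
XHTML, the match patterns would need a prefix.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Turns a few well-formed HTML tags into XSL-FO with an embedded stylesheet. */
final class HtmlToFoSketch {
    // Illustrative stylesheet only; extend it tag by tag as needed.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "    xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/html'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + "    page-height='29.7cm' page-width='21cm'>"
        + "<fo:region-body margin='2cm'/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='body'/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='h1'>"
        + "<fo:block font-size='18pt' font-weight='bold'>"
        + "<xsl:apply-templates/></fo:block>"
        + "</xsl:template>"
        + "<xsl:template match='p'>"
        + "<fo:block space-after='6pt'><xsl:apply-templates/></fo:block>"
        + "</xsl:template>"
        + "<xsl:template match='b|strong'>"
        + "<fo:inline font-weight='bold'><xsl:apply-templates/></fo:inline>"
        + "</xsl:template>"
        + "<xsl:template match='br'><fo:block/></xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to well-formed HTML and returns the FO document. */
    static String toFo(String wellFormedHtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(wellFormedHtml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("HTML to FO transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toFo(
            "<html><body><h1>Report</h1><p>Hello <b>FOP</b><br/>world</p></body></html>"));
    }
}
```

Tags without a template fall through to XSLT's built-in rules, which keep
their text and drop the markup, so the output stays usable while the
stylesheet is still incomplete.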

Once you have the FO XML -- BOOM, a few lines of code later and you've 
got your PDF.

The servlet I wrote actually communicates back to the browser every
second and fakes an elapsed progress timer. We had to do this as we
originally were running on slow hardware and had very impatient users.
With our hardware these days the transform and PDF generation run so
quickly that the interaction is more of a nuisance than an aid. But at
the time that is what the boss wanted, so I wrote it.

--will



Re: New To Fop

2005-01-04 Thread Luke Shannon
Thanks Will. This is the sort of advice I was hoping for. From the little I
have played with FOP this makes sense. I would be interested in looking at
any code you would like to share.

Luke

- Original Message - 
From: Will Gilbert [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, January 04, 2005 11:30 AM
Subject: Re: New To Fop









Re: New To Fop

2005-01-04 Thread Luke Shannon
Hi Will;

> What I did here was
> to use the Tidy engine/library to fix up my HTML into valid XML.

Which package is this library you are referring to part of?

Thanks,

Luke

