Hi,
 
Leaving aside the PDF side of things, below there are some good intros to XML on
 
using TIDY to generate XML from HTML (Dreamweaver does it as well) and a cool tip on using
 
 
Colm  
 
 
 
 
-----Original Message-----
From: Colm Brazel [mailto:[EMAIL PROTECTED]
Sent: 18 June 2004 16:42
To: [EMAIL PROTECTED]
Subject: RE: [ cf-dev ] PDF 2 Text only

Alex,
 
How is your XSLT? I seem to be seeing it all over the place today. There are presentations at the coming CFUN04 by Michael Dinowitz and April Fleming on working with it. One of the outputs of FOP is text or RTF, and in the process it must convert the doc at some stage to XSL-FO using XSLT. I'm assuming the doc can then be converted by FOP to PDF, RTF or text. I may be incorrect in assuming the reverse, and as you suggest it might not be possible: that is, stepping back from the PDF to XSL-FO and then using the engine to output text or RTF. Having worked with this before, you probably know more about it than I :)
 
For sure, though, if you can find a way to change the doc into XML then, AFAIK, it is fairly straightforward to
use an XSLT transformation to do what you want. Apparently it's the in thing to grab anything in XML, whether it's emails from Google using XML or search applications: grab the XML, apply XSLT, and you can do anything with it, such as put it into a database or make an RTF of it.
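
To illustrate the "grab the XML, apply XSLT" step, here is a minimal sketch using only the XSLT support that ships with the JDK (javax.xml.transform), so it should also be callable from CFMX. The class name XmlToText, the stylesheet, and the <doc>/<para> element names are made up for the example; a real document would need its own stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToText {
    // Hypothetical stylesheet: writes each <para> of a <doc> as one line of plain text.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<xsl:for-each select=\"para\">"
      + "<xsl:value-of select=\"normalize-space(.)\"/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><para>First paragraph.</para><para>Second   paragraph.</para></doc>";
        System.out.print(transform(xml, XSLT));
    }
}
```

Swapping the stylesheet is all it takes to target a different output; the Java side stays the same.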
 
 
 "FOP uses the standard XSL-FO file format as input, lays the content out into pages, then renders it to the requested output. One great advantage to using XSL-FO as input is that XSL-FO is itself an XML file, which means that it can be conveniently created from a variety of sources. The most common method is to convert semantic XML to XSL-FO, using an XSLT transformation."
 
OT: Microblast html2text is excellent, but you need PDF to text :)
 
I'm looking forward to CFUN04 :)
 
 
 
Colm
 
 
-----Original Message-----
From: Alex Skinner [mailto:[EMAIL PROTECTED]
Sent: 18 June 2004 15:48
To: [EMAIL PROTECTED]
Subject: RE: [ cf-dev ] PDF 2 Text only

Colm,
 
Thanks for the reply, but I don't see how FOP is going to help. I have used it before and it helps with producing PDFs, but what I'm trying to do is take an existing PDF document and convert it into an XML document, HTML or straight text output.
 
Have you seen anything that does that?
 
Alex


From: Colm Brazel [mailto:[EMAIL PROTECTED]
Sent: 18 June 2004 15:11
To: [EMAIL PROTECTED]
Subject: RE: [ cf-dev ] PDF 2 Text only

FOP is what you need
 
The latest CFDJ has an interesting article on using the free Apache FOP (Formatting Objects Processor), which can be used with CFMX to create PDFs and other formats: http://xml.apache.org/fop/output.html. It's a Java application, and the article by Nate Nelson describes how to configure it for CFMX.
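
As the FOP description quoted earlier in the thread says, FOP takes XSL-FO as input, and the usual way to get there is an XSLT transformation from your own XML. A rough sketch of that first step, using only the JDK's javax.xml.transform, is below. The class name XmlToFo, the <doc>/<para> element names and the A4 page settings are assumptions for the example, and FOP itself is not called here; the resulting XSL-FO would be handed to FOP as described in the article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToFo {
    // Hypothetical stylesheet: puts each <para> of a <doc> into an fo:block on an A4 page.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
      + "<xsl:template match=\"/doc\">"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name=\"a4\" page-height=\"29.7cm\" page-width=\"21cm\" margin=\"2cm\">"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference=\"a4\">"
      + "<fo:flow flow-name=\"xsl-region-body\">"
      + "<xsl:for-each select=\"para\">"
      + "<fo:block><xsl:value-of select=\".\"/></fo:block>"
      + "</xsl:for-each>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms semantic XML into an XSL-FO document and returns it as a string.
    static String toXslFo(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXslFo("<doc><para>First paragraph.</para></doc>"));
    }
}
```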
 
 
WBR
 
Colm
-----Original Message-----
From: Alex Skinner [mailto:[EMAIL PROTECTED]
Sent: 18 June 2004 14:57
To: [EMAIL PROTECTED]
Subject: [ cf-dev ] PDF 2 Text only

Hi,
 
Has anyone got any experience of converting PDF docs into text or HTML on the fly with CFMX? Are there any free Java-based modules that can do this?
 
Cheers

Alex

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.657 / Virus Database: 422 - Release Date: 13/04/2004




avast! Antivirus: Outbound message clean.

Virus Database (VPS): 0425-1, 17/06/2004
Tested on: 18/06/2004 15:11:10
avast! is copyright (c) 2000-2003 ALWIL Software.




