Alex,
How is your XSLT? I seem to be seeing it all over the place today. There are
presentations at the coming CFUN04 by Michael Dinowitz and April Fleming on
working with it. One of the outputs of FOP is txt or RTF, and in the process
the doc must be converted at some stage to XSL-FO using XSLT. I'm assuming
FOP can then convert the doc to pdf, rtf or text. I may be incorrect in
assuming this, and it might not be possible, as you suggest, to step back
from the pdf to XSL-FO and then use the engine to output to txt or RTF.
Having worked with this before, you probably know more about it than I :)
For sure though, if you can find a way to change the doc into xml then afaik
it is fairly straightforward to use an XSLT transformation to do what you
want. Apparently it's the in thing to grab anything in xml, whether it's
emails from google using xml or search applications: grab the xml, apply
XSLT and you can do anything with it, put it into a database, make an rtf
of it.
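
As a rough illustration of the "grab the xml, apply XSLT" step, here is a
minimal sketch using the standard javax.xml.transform API that ships with
the JDK. The sample document, the stylesheet and the class name XsltToText
are made up for the example; it assumes the content is already available as
XML and does not cover getting it out of a pdf.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltToText {
    // Hypothetical sample input; any well-formed XML would do.
    static final String SAMPLE_XML =
        "<doc><title>Report</title>"
      + "<para>First line.</para><para>Second line.</para></doc>";

    // Hypothetical stylesheet: writes the title, then each paragraph,
    // one per line, as plain text.
    static final String TO_TEXT_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<xsl:value-of select=\"title\"/><xsl:text>&#10;</xsl:text>"
      + "<xsl:for-each select=\"para\">"
      + "<xsl:value-of select=\".\"/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the result as a string.
    static String transform(String xml, String xsl) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.print(transform(SAMPLE_XML, TO_TEXT_XSL));
    }
}
```

Running main prints the title followed by one paragraph per line. Swapping
in a different stylesheet changes the output format without touching the
Java code.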
"FOP uses the
standard XSL-FO file format as input, lays the content out into pages, then
renders it to the requested output. One great advantage to using XSL-FO as
input is that XSL-FO is itself an XML file, which means that it can be
conveniently created from a variety of sources. The most common method is to
convert semantic XML to XSL-FO, using an XSLT transformation."
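
The "semantic XML to XSL-FO" step described in that quote can be sketched
with the JDK's javax.xml.transform API alone. The input document, the
stylesheet and the class name XmlToFo are hypothetical, and the page
settings are arbitrary. The sketch stops at producing the XSL-FO text;
handing that to FOP for rendering is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToFo {
    // Hypothetical semantic XML input.
    static final String SAMPLE_XML =
        "<doc><para>First line.</para><para>Second line.</para></doc>";

    // Hypothetical stylesheet: wraps each <para> in an fo:block inside a
    // single page sequence with one simple page master.
    static final String TO_FO_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name=\"page\">"
      + "<fo:region-body margin=\"2cm\"/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference=\"page\">"
      + "<fo:flow flow-name=\"xsl-region-body\">"
      + "<xsl:for-each select=\"para\">"
      + "<fo:block><xsl:value-of select=\".\"/></fo:block>"
      + "</xsl:for-each>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the XSLT transformation and returns the XSL-FO document as text.
    static String toFo(String xml, String xsl) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(toFo(SAMPLE_XML, TO_FO_XSL));
    }
}
```

The result is itself an XML file, which is the point the quote makes: the
formatter only ever sees XSL-FO, so anything that can be expressed as XML
can be fed in through a stylesheet like this one.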
OT: Microblast html2text is excellent, but you need pdf to text :)
I'm looking forward to CFUN04 :)
Colm
Colm,
Thanks for the reply, but I don't see how FOP is going to help. I have used
it before and it helps with producing pdfs, but what I'm trying to do is
take an existing pdf document and convert it into an xml document, html or
straight text output.
Have you seen anything that does that?
Alex
FOP is what you need. In the latest CFDJ there's an interesting article on
using the free Apache FOP (Formatting Objects Processor), which can be used
with CFMX to create pdfs and other formats at no cost:
http://xml.apache.org/fop/output.html
It's a java application, and the article by Nate Nelson describes how to
configure it for CFMX.
WBR
Colm
Hi,
Has anyone got any experience of converting pdf docs into text or html on
the fly with cfmx? Are there any free java based modules that can do this?
Cheers
Alex
---
Outgoing mail is certified Virus Free.
Checked by AVG
anti-virus system (http://www.grisoft.com).
Version: 6.0.657 / Virus
Database: 422 - Release Date: 13/04/2004
avast!
Antivirus: Outbound message clean.
Virus Database (VPS): 0425-1,
17/06/2004
Tested on: 18/06/2004 15:11:10
avast! is
copyright (c) 2000-2003 ALWIL Software.