Alex,
How is your XSLT? I seem to be seeing it all over the place today. There are presentations at the coming CFUN04 by Michael Dinowitz and April Fleming on working with it.
One of the outputs of FOP is text or RTF. In the process the doc must be converted at some stage to XSL-FO using XSLT, and I'm assuming FOP can then convert it to PDF, RTF or text. I may be wrong in assuming this, and it might not be possible, as you suggest, to step back from the PDF to XSL-FO and then use the engine to output text or RTF. Having worked with this before, you probably know more about it than I do :)
For sure though, if you can find a way to change the doc into XML then, afaik, it is fairly straightforward to use an XSLT transformation to do what you want. Apparently it's the in thing to grab anything in XML, whether it's emails from Google using XML or search applications: grab the XML, apply XSLT, and you can do anything with it, put it into a database, make an RTF of it.
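To make the "grab the XML, apply XSLT" step concrete, here is a minimal sketch using only the XSLT support built into the JDK (javax.xml.transform). The <doc>/<para> input format, the stylesheet and the class name XmlToText are made up for illustration; real XML pulled out of a PDF would need its own stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToText {
    // Hypothetical stylesheet: writes each <para> of a <doc> as one line of plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<xsl:for-each select=\"para\">"
      + "<xsl:value-of select=\"normalize-space(.)\"/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given XSLT source to the given XML source and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for whatever XML you manage to extract.
        String xml = "<doc><para>First paragraph.</para>"
                   + "<para>Second paragraph.</para></doc>";
        System.out.print(transform(xml, STYLESHEET));
    }
}
```

Since CFMX runs on Java, a class along these lines could in principle be called from ColdFusion, but that wiring is not shown here. Note that this only covers the XML to text half; getting from an existing PDF to XML is the part still missing.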
"FOP uses the standard XSL-FO file format as input, lays the content out into pages, then
renders it to the requested output. One great advantage to using XSL-FO as
input is that XSL-FO is itself an XML file, which means that it can be
conveniently created from a variety of sources. The most common method is to
convert semantic XML to XSL-FO, using an XSLT transformation."
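As a rough illustration of the "semantic XML to XSL-FO" step the quote describes, here is a sketch that again uses only the JDK's built-in XSLT. The <doc>/<para> input, the page-master name "page" and the class name XmlToFo are assumptions for the example; the resulting XSL-FO would then be handed to FOP for rendering, which is not shown because it needs the FOP library itself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToFo {
    // Hypothetical stylesheet: wraps each <para> of a <doc> in an fo:block
    // inside a single, minimal page sequence.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name=\"page\">"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference=\"page\">"
      + "<fo:flow flow-name=\"xsl-region-body\">"
      + "<xsl:for-each select=\"para\">"
      + "<fo:block><xsl:value-of select=\"normalize-space(.)\"/></fo:block>"
      + "</xsl:for-each>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Converts the sample semantic XML into an XSL-FO document (as a string).
    static String toFo(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; prints the generated XSL-FO.
        System.out.println(toFo("<doc><para>First paragraph.</para></doc>"));
    }
}
```

The point of the sketch is only that XSL-FO is ordinary XML, so any XSLT engine can produce it; the layout and rendering to PDF or other formats is FOP's job.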
OT: Microblast html2text is excellent, but you need PDF to text :)
I'm looking forward to CFUN04 :)
Colm
Colm,
Thanks for the reply, but I don't see how FOP is going to help. I have used it before and it helps with producing PDFs, but what I'm trying to do is take an existing PDF document and convert it into an XML document, HTML or straight text output.
Have you seen anything that does that?
Alex
FOP is what you need. In the latest CFDJ there's an interesting article on using the free Apache FOP (Formatting Objects Processor), which can be used with CFMX to create PDFs and other formats: http://xml.apache.org/fop/output.html
It's a Java application, and the article by Nate Nelson describes how to configure it for CFMX.
WBR
Colm
Hi,
Has anyone got any experience of converting PDF docs into text or HTML on the fly with CFMX? Are there any free Java-based modules that can do this?
Cheers
Alex
---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.657 / Virus Database: 422 - Release Date: 13/04/2004
avast! Antivirus: Outbound message clean.
Virus Database (VPS): 0425-1, 17/06/2004
Tested on: 18/06/2004 15:11:10
avast! is copyright (c) 2000-2003 ALWIL Software.