At 00:58 15/11/2002, Michael Bardeggia wrote:

> From word to PDF...the documents look great.  So, I figured....lets
> convert the PDF into XML...and now I've downloaded Adobe Framemaker.
>
> Am I wrong to think that Framemaker can convert a PDF into XML?  Does
> anyone have any suggestions?

I don't believe that FrameMaker can
convert PDF to XML. In fact, my current version (not the latest)
only imports a PDF as a graphic object on the page.

In general, I would think that converting to
PDF would make the final step to XML harder.
Most of the structure of the document is
lost in PDF, which is more an "image" of the
document than a collection of words and
paragraphs.

If you have a *lot* of documents, you might want
to consider saving the Word documents as RTF,
and writing a Perl script to convert to
XML. This will need a fair bit of time and
technical skill, but will probably give the
best result.
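
As a rough illustration of the conversion step (in
Python rather than Perl, and assuming the paragraphs
and their style names have already been pulled out of
the RTF file), the core of such a script might look
something like the sketch below. The style names, the
element names and the paragraphs_to_xml function are
all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word paragraph style names to XML element names.
STYLE_TO_TAG = {
    "Heading 1": "title",
    "Normal": "para",
}


def paragraphs_to_xml(paragraphs):
    """Build a <document> of <section> elements from (style, text) pairs."""
    root = ET.Element("document")
    section = None
    for style, text in paragraphs:
        # Styles missing from the mapping fall back to a plain <para>.
        tag = STYLE_TO_TAG.get(style, "para")
        # Each title starts a new section; so does the very first paragraph.
        if tag == "title" or section is None:
            section = ET.SubElement(root, "section")
        ET.SubElement(section, tag).text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [
        ("Heading 1", "Introduction"),
        ("Normal", "First paragraph."),
        ("Heading 1", "Scope"),
        ("Body Text 2", "A paragraph in an unmapped style."),
    ]
    print(paragraphs_to_xml(sample))
```

A mapping like this only works when authors have
applied styles consistently. Paragraphs formatted by
hand all land in the fallback element, which is where
most of the effort tends to go.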

Be warned though: this is an extremely difficult
and costly task. Getting structure out of
unstructured Word documents is next to impossible.

(I should know, I spent 18 months writing a
fully automated system to take State legislation,
and convert it to XML. Worked in the end, but
it meant working very hard for the full 18 months.)

Cheers,
James


-------------------------
James Robertson
Step Two Designs Pty Ltd
Knowledge Management Consultancy, SGML & XML

Content Management Requirements Toolkit
112 CMS requirements, ready to cut-and-paste

http://www.steptwo.com.au/

--
http://cms-list.org/
trim your replies for good karma.
