"JJ" == Johnson, Jonathon W Mr CTR USA TRADOC USA <[email protected]>

    JJ> And as already mentioned, Word adds in A LOT of
    JJ> other <insert word of choice> code.

I've had reasonable luck using Mac OS X's TextEdit to do cleaner
conversions to HTML.  TextEdit can open Word docs natively and
save them as HTML.  You can specify the document type (I use
XHTML 1.0 Strict) and turn off CSS, which will give you a fairly
clean document to work with.  (My next steps usually involve
additional scripts or hand-processing to convert to LaTeX, but
you could probably do the same for wiki markup.)

I'm not sure what would happen with images in this case; all the
documents I've been unfortunate enough to receive in Word format
have been text only.

There are also tools such as antiword, and other word processors
(e.g., AbiWord), that can open Word documents and, presumably,
save them in other formats.  AbiWord also supports some
command-line conversion options.
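
As a rough sketch of the command-line route, assuming AbiWord is
installed and on the PATH: the wrapper below builds and runs an
`abiword --to=FORMAT` call.  The function names and the file name
in the usage line are made up for illustration, and the output
path it returns is only where I would expect AbiWord to write the
result (next to the source file, with the new extension).

```python
import shutil
import subprocess
from pathlib import Path


def conversion_command(source, fmt="html"):
    """Build an AbiWord command line that converts `source` to `fmt`."""
    return ["abiword", f"--to={fmt}", str(source)]


def convert(source, fmt="html"):
    """Run the conversion.

    Returns the path where the output is expected to appear, or None
    if no `abiword` executable can be found on the PATH.
    """
    if shutil.which("abiword") is None:
        return None
    subprocess.run(conversion_command(source, fmt), check=True)
    # Assumption: AbiWord writes next to the source, swapping the suffix.
    return Path(source).with_suffix("." + fmt)
```

Calling `convert("report.doc")` on a hypothetical report.doc would
then be the scripted equivalent of opening the file and using
"Save As" by hand.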

And, of course, you can always write some code to do some
additional cleanup of the generated HTML/wiki code from these
documents, but I suspect you're probably doomed to some amount of
tinkering no matter what you do; all you can do is minimize it
(and then avoid creating new documents that will require you to
repeat the process in the future).
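
For what it's worth, here is a minimal sketch of that kind of
cleanup code, using only Python's standard library.  The lists of
tags and attributes to drop are my own guesses at typical
Word-export clutter, not anything definitive, so adjust them to
whatever your converter actually emits.  Comments and the doctype
are silently discarded, since the parser's default handlers for
them do nothing.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these are the attributes and tags we treat as clutter.
DROP_ATTRS = {"class", "style", "lang"}
DROP_TAGS = {"span", "font", "o:p"}


class Cleaner(HTMLParser):
    """Re-emit HTML, skipping unwanted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _attrs(self, attrs):
        # Keep only attributes that are not in the drop list.
        return "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name not in DROP_ATTRS
        )

    def handle_starttag(self, tag, attrs):
        if tag not in DROP_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> stay self-closing.
        if tag not in DROP_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def clean_html(markup):
    """Return `markup` with the clutter tags and attributes removed."""
    parser = Cleaner()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Dropped tags lose only their own markup; the text inside them is
kept, so a paragraph wrapped in styled spans comes out as a plain
paragraph.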

   Claire

*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
  Claire M. Connelly                             [email protected]
  System Administrator, Dept. of Mathematics, Harvey Mudd College
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*


_______________________________________________
Discuss mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
