Hi Steve,
Have you already considered importing the Word document into
OpenOffice Writer and letting one of the OO->LaTeX converters do the
hard work?
(http://www.hj-gym.dk/~hj/writer2latex/ is one example; there might be
others.)
I have never tried or used one of those myself. However, I have heard
success stories -- might be worth a try.
Daniel
On 18.07.2008, at 21:28, Steve Litt wrote:
Hi all,
I have a 300 page book written in MS Word version 97, and I have to
convert it to LyX in order to make the second edition.
I'll accept all condolences now :-)
Believe it or not, the MS Word version was written very much in what
you guys would call WYSIWYM style. I had styles for everything --
almost no appearance was fine tuned. Obviously it's essential that all
those styles transfer over into the LyX version.
I'll accept all condolences now :-)
So here's my plan, unless someone else has a better idea.
First, I'll export to RTF.
I'll accept all condolences now :-)
Then in Vim I'll do this:
:%s/}/}\r/g
Now the rtf file will have lines that are somewhat recognizable as
markup.
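For anyone who would rather script this step than do it in Vim, here is
a minimal Python sketch of the same substitution. The function name and
the sample string are made up for illustration; like the Vim command, it
makes no attempt to treat escaped braces (\}) specially.

```python
def split_after_braces(rtf_text: str) -> str:
    """Put a line break after every closing brace, like :%s/}/}\\r/g in Vim."""
    return rtf_text.replace("}", "}\n")


if __name__ == "__main__":
    # Made-up fragment, only to show the effect on a one-line RTF blob.
    sample = r"{\rtf1{\stylesheet{\s1 heading 1;}{\s2 heading 2;}}}"
    print(split_after_braces(sample))
```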
Next I'll look at the \stylesheet part of the RTF, and make a list of
all paragraph and character styles, sort of like this:
\fs20 Normal
\s1 heading 1
\s2 heading 2
\cs10 \additive Default Paragraph Font
\s16 myparagraphstyle
\cs17 mycharstyle
\cs18 mycharstyle2
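Building that list by hand gets tedious, so here is a rough Python
sketch that pulls the style codes and names out of a stylesheet group.
It assumes every entry is a single flat {...;} group with no nested
braces, which a real Word 97 export may not honor, and the sample
string is invented from the list above. Entries without a \sN or \csN
code (such as the Normal line) are skipped.

```python
import re

# One stylesheet entry: a brace group with no nested braces, ending in ";".
ENTRY = re.compile(r"\{([^{}]*);\}")
# Paragraph (\sN) or character (\csN) style code.
CODE = re.compile(r"\\(cs|s)(\d+)")
# Control words such as \additive or \sbasedon0, plus the \* marker.
CONTROL = re.compile(r"\\(?:\*|[a-z]+-?\d* ?)")


def list_styles(stylesheet: str) -> dict[str, str]:
    """Map style codes like '\\s2' to style names like 'heading 2'."""
    styles = {}
    for body in ENTRY.findall(stylesheet):
        code = CODE.search(body)
        if code is None:
            continue  # no \sN or \csN in this entry
        name = CONTROL.sub("", body).strip()
        styles["\\" + code.group(1) + code.group(2)] = name
    return styles


if __name__ == "__main__":
    # Invented sample, modeled on the style list in this message.
    sample = (
        r"{\stylesheet{\fs20 Normal;}{\s1 heading 1;}{\s2 heading 2;}"
        r"{\cs10 \additive Default Paragraph Font;}"
        r"{\s16 myparagraphstyle;}{\cs17 mycharstyle;}}"
    )
    for code, name in list_styles(sample).items():
        print(code, name)
```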
Then, within Vim I'll run substitutions so that the text referred to by
the numbers such as \s2 is prepended with my own tags such as phdr2,
and better yet that text has a proper ending tag appended. This is not
so simple for three reasons:
1) There's always a bunch of gobbledygook between the \s2 and the text
to which it refers, and that must eventually be deleted.
2) There's often gobbledygook before the \s2, and that gobbledygook
must eventually be deleted.
3) It's MUCH harder to reliably put end tags at the end of the text to
which it refers. If I don't put end tags, that means I'll have a much
harder time converting it to LyX.
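One way around all three problems is to walk the RTF token by token
instead of using line-based substitutions. The Python sketch below is
only that, a sketch: it assumes it is run on the document body alone
(not the header or the stylesheet), that \pard resets the paragraph
style, and that \par ends a paragraph. The bracketed tag format and the
style-to-tag mapping are invented here; only the name phdr2 comes from
the plan above. The control words are left in place so the file can
still be re-imported.

```python
import re

# Control words, hex escapes (kept whole), other control symbols,
# braces, and runs of plain text.
TOKEN = re.compile(
    r"\\[a-z]+-?\d* ?|\\'[0-9a-fA-F]{2}|\\[^a-z]|[{}]|[^\\{}]+"
)
STYLE = re.compile(r"\\s(\d+) ?$")


def tag_paragraphs(rtf_body: str, tags: dict[str, str]) -> str:
    """Wrap the text of each styled paragraph in [tag]...[/tag]."""
    out, style, opened = [], None, None
    for tok in TOKEN.findall(rtf_body):
        word = tok.rstrip()
        m = STYLE.match(tok)
        if m:
            style = "s" + m.group(1)      # remember the paragraph style
        elif word == r"\pard":
            style = None                  # paragraph formatting reset
        elif word == r"\par":
            if opened:
                out.append(f"[/{opened}]")  # end tag goes before \par
            opened = None
        elif tok[0] not in "\\{}" and tok.strip():
            # First real text of the paragraph: start tag goes here,
            # after whatever control words preceded it.
            if opened is None and style in tags:
                opened = tags[style]
                out.append(f"[{opened}]")
        out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical mapping and sample body.
    tags = {"s1": "phdr1", "s2": "phdr2"}
    body = (r"\pard\plain \s2\sb240\keepn Chapter One\par "
            r"\pard\plain \fs20 Body text.\par ")
    print(tag_paragraphs(body, tags))
```

Inline character formatting inside a paragraph ends up between the two
tags untouched, because the start tag is only placed once per paragraph.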
Next, I'll re-import the rtf into MS Word. What should happen is it
re-imports the same as it originally was, only now it has my tags. From
there I should be able to export it to plain text, and use my tags to
create the LyX file with suitable scripts. Or maybe make scripts to
directly manipulate the RTF.
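As a rough idea of what such a script could look like, here is a Python
sketch that turns tagged plain-text lines into LyX layout blocks. It
assumes one paragraph per line and tags of the hypothetical form
[phdr2]...[/phdr2]; the tag-to-layout mapping is invented. It emits only
\begin_layout ... \end_layout fragments, so the result would still have
to be pasted into the body of an existing .lyx file, and it does not
escape backslashes or other characters that are special to LyX.

```python
import re

# A line that is entirely wrapped in one [tag]...[/tag] pair.
TAGGED = re.compile(r"^\[(\w+)\](.*)\[/\1\]$")


def to_lyx_body(lines, layouts, default="Standard"):
    """Emit one LyX layout block per non-empty input line."""
    blocks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        m = TAGGED.match(line)
        if m and m.group(1) in layouts:
            layout, text = layouts[m.group(1)], m.group(2)
        else:
            layout, text = default, line  # untagged or unknown tag
        blocks.append(f"\\begin_layout {layout}\n{text}\n\\end_layout\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical mapping from my tags to LyX layout names.
    layouts = {"phdr1": "Chapter", "phdr2": "Section"}
    text = ["[phdr2]Introduction[/phdr2]", "", "Some body text."]
    print(to_lyx_body(text, layouts))
```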
Of course, for all my custom character and paragraph styles, I'll need
to create those styles within LyX, in a blank document, before
appending the actual content.
Then comes the cleanup. Stuff like tables and images won't convert --
I'll need to manually do that cleanup and then run at least a rough
proofread.
The good news is, because the original document used styles for almost
every appearance, fine tuning won't be necessary (hooray for styles!).
I'd estimate this to be about a week's job. That's a lot of time, but
in the end I'll have converted a 300 page book, style for style, from
MS Word to LyX.
If anyone has a better idea for converting a 300 page MS Word document
to LyX, style for style and word for word, please let me know.
Thanks
SteveT
Steve Litt
Recession Relief Package
http://www.recession-relief.US