On 12 Jan, [EMAIL PROTECTED] wrote:
> This is semi-related...
Actually, it is very much related to the subject.
I think we should try to make the parser system
generic enough that you could also support converters
to/from obscure document formats like MS Word.
> What I need is some way for the teacher to convert this into HTML or PDF.
> In some cases, the formatting for these is fairly... uh... complex. So it'd
> be good if it could handle complicated formatting without bletching.
> Any advice? It'd be great if I could just upload the file into a directory
> and have it be converted and put somewhere on the web as HTML. It'd be acceptable
> if I could do it as a PDF. It'd be bletcherous if it wasn't a good conversion...
The parser system will probably have to wait until
Midgard 2.x, so for now you'll have to build the
conversion mechanism a bit differently.
The first (and probably easiest) way to do this is
to use Word's native 'Save as HTML' option. Of
course, the HTML it produces is far too messy to
be served on a site, so you'll want to clean it up
first, either manually or with a script. This will
be easier if you can get the authors to stick to
a few specific templates and formatting
conventions. To make this more automatic, you could
have a page in Midgard that receives the Word
document in HTML format, runs it through a script
that cleans it up, and then saves it to Midgard's
article database. You may also want to store the
original Word document as an attachment to that
article to make later editing easier.
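
As a rough idea of what such a clean-up script could
look like, here is a minimal sketch in Python using
only the standard library. It is not Midgard code:
the tag allowlist and the names WordHtmlCleaner and
clean_word_html are made up for illustration, and a
real script would be tuned to whatever templates the
authors use. It keeps a small set of structural tags,
drops all attributes except href on links, and throws
away the head, style blocks and Office-specific tags.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; adjust it to your own formatting conventions.
KEEP_TAGS = {"p", "h1", "h2", "h3", "ul", "ol", "li", "b", "i", "em",
             "strong", "a", "table", "tr", "td", "th", "br"}
KEEP_ATTRS = {"a": {"href"}}           # attributes that survive, per tag
SKIP_CONTENT = {"head", "style", "script", "xml"}  # dropped with contents
VOID_TAGS = {"br"}                     # tags that take no closing tag


class WordHtmlCleaner(HTMLParser):
    """Collects a stripped-down copy of the HTML that is fed in."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a SKIP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP_TAGS:
            return  # e.g. <span>, <font>, <o:p>: drop the tag, keep the text
        allowed = KEEP_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in KEEP_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_word_html(html):
    """Return html with only the allowlisted tags and attributes left."""
    cleaner = WordHtmlCleaner()
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)
```

The receiving page would call clean_word_html() on the
uploaded file and store the result as the article
body. Comments (including Word's conditional comments)
are dropped because the parser's default comment
handler ignores them. An allowlist seemed safer to me
than trying to enumerate everything Word can emit.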
The second way would be to have the authors just
send the Word document to the server, and do all
the conversions there. I've seen some Open Source
Word-to-HTML converters out there but I'm not
sure how good they are:
http://word2x.alcom.co.uk/
http://www.wvWare.com/
This would be easier for the authors but possibly
more complex to set up.
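
For the "upload into a directory and have it
converted" part, something like the following Python
sketch could be run from cron. Everything here is an
assumption for illustration: the function names, the
directory layout, and in particular the converter
command line (an executable called wvHtml taking an
input file and an output file name, writing into the
current directory). Check the converter's own
documentation before relying on that.

```python
import subprocess
from pathlib import Path


def pending_documents(upload_dir, output_dir):
    """Return (doc, html) path pairs for uploads lacking up-to-date HTML."""
    pairs = []
    for doc in sorted(Path(upload_dir).glob("*.doc")):
        html = Path(output_dir) / (doc.stem + ".html")
        # Convert when there is no output yet or the upload is newer.
        if not html.exists() or html.stat().st_mtime < doc.stat().st_mtime:
            pairs.append((doc, html))
    return pairs


def convert_all(upload_dir, output_dir, converter="wvHtml"):
    """Run the converter on every pending upload; return the HTML paths
    for which the converter exited successfully."""
    converted = []
    for doc, html in pending_documents(upload_dir, output_dir):
        # Assumed invocation: <converter> <input.doc> <output-name>,
        # run from inside output_dir so the result lands there.
        result = subprocess.run(
            [converter, str(doc.resolve()), html.name],
            cwd=str(html.parent),
        )
        if result.returncode == 0:
            converted.append(html)
    return converted
```

Calling convert_all("/path/to/uploads", "/path/to/htdocs")
every few minutes would pick up new or changed files
(both paths are placeholders). The output would most
likely still need a clean-up pass before it is fit to
publish.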
The third way would be to write a custom set of
VBA macros to handle it all. I discussed this
possibility with some developers at Stonesoft back
when we were implementing Midgard 0.1 there, but
nothing came of it (and these days we just use
Midgard's :F text parser).
The Word format is a difficult one, as it is
poorly supported in Open Source programs
(mostly due to the format itself, I'm sure!)
and Microsoft's own programs just can't
produce usable HTML. But hopefully these ideas
help a bit in pointing out where to look...
> Ben Garney
/Bergie
--
-- Henri Bergius -- +358 40 525 1334 -- [EMAIL PROTECTED] --
http://www.iki.fi/Henri.Bergius
--
This is The Midgard Project's mailing list. For more information,
please visit the project's web site at http://www.midgard-project.org