Abdelrazak Younes
Sun, 05 Feb 2012 01:15:40 -0800
On 04/02/2012 18:03, Rob Oakes wrote:
> Dear eLyXer Users and Developers,
>
> I'm still at work on the import/export module for Microsoft Word documents. I'm making pretty good progress: I've got a rough prototype that works pretty well, and I'm now starting to refine it.
>
> My approach up to now has been to use regular expressions to match portions of the document and then use a library to translate those to the corresponding Word XML structures. It's working pretty well with my simple test documents.
>
> Before going too far with this approach, though, I wanted to post (another) general query. In the eLyXer library, there is already a robust set of tools for converting LyX documents to HTML. Does anyone know if the library is written in such a way that getting a generic in-memory representation of the document would be possible?
>
> It would be awesome to re-use as much existing code as possible for the Word document export. That would allow me to support a broader range of features, and would give me a framework for working with maths.

Strong suggestion: use LyX proper. I am quite sure you already know this, because I saw some patches from you in this area, but I'll explain anyway: LyX's own HTML export is so good and fast because it effectively knows the in-memory representation of the document. You can't be faster or more accurate than that. I mean, unless you want to rewrite LyX in Python.

IIUC, you want a single Python module for both import and export. But I don't think this is a valid argument. As for the Word-to-LyX format conversion, if you want to use this epub library, there must be a way to use it from C++, I'm sure...

> Any thoughts, Alex (and others)? I've downloaded the sources and have begun to work through them, but before spending hours to days trying to wrap my head around them, I thought I would ask.

AFAIK, eLyXer doesn't construct a document model. So you'd better spend this time reading the C++ code for exporting to HTML/XHTML ;-)

Abdel.