stefano franchi <stefano.fran...@gmail.com> writes:

> Dear LyX devels,

Hi Stefano,

Thanks for a good summary of the discussion; I think you have
identified the main points. I have some comments below.

>
> given the intense discussion we have had in the last few days on this
> possible project, I thought I would briefly sum up some of the early
> conclusions (also because some items were discussed in private
> emails).
> (BTW: In the following I say "Word" for the sake of brevity. I
> actually mean Word XML | LibreOffice ODT)
>
> 1. One project or two?
>
> Is a LyX-->Word export a subset of the LyX<-->Word roundtrip?
>
> A. If the final output is Word, the conversion to Word is a subset of
> the roundtrip *if and only if* the XML output preserves LyX-only
> (non-LaTeX) information (e.g. tracked-changes, LyX-notes, etc).

This point needs to be clarified. If one needs a semantic export, this
is true, as all semantic information needs to be maintained in the
round trip as well as in the export. It is not true if the export only
has to *look like* the LaTeX output. What matters in these discussions
is semantics rather than formatting (with exceptions, e.g. italic or
bold words, which matter in articles containing species names that have
to be in italics).

Additionally: if it is a subset, perfect. But as in (B), the round trip
does not have to include everything, as there could be a semantic
exporter.

>
> B. If the final output is pdf, then it is not. It is not necessary to
> actually process the info in the .tex file with LaTeX (e.g.
> bibliography, and more). All we need to do is to make sure that the
> info that LaTeX will eventually need is preserved through the
> roundtrip. So, for a citation, we only need to make sure that when we
> go to Word we produce something like (made up XML):
> <citationCommand>
>        citet
>        <citationKey>
>             myBibkey
>        </citationKey>
> </citationCommand>
>
> and when we come back we reconstruct the proper LyX bib inset from
> that info. It will then be up to LaTeX to produce the actual citation
> and the corresponding reference.

Agreed.
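
To make the round trip concrete, here is a minimal sketch in Python,
assuming the made-up XML above were used as is. The function names
citation_to_xml and xml_to_citation are hypothetical, and the returned
(command, key) pair is only what one would need to rebuild the LyX
citation inset; none of this is existing LyX or Word code.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def citation_to_xml(command: str, key: str) -> str:
    """Encode a citation using the made-up element names from the example."""
    root = ET.Element("citationCommand")
    root.text = command  # e.g. "citet"
    ET.SubElement(root, "citationKey").text = key
    return ET.tostring(root, encoding="unicode")


def xml_to_citation(xml: str) -> Tuple[str, str]:
    """Recover (command, key); whitespace from pretty-printing is ignored."""
    root = ET.fromstring(xml)
    command = (root.text or "").strip()
    key = (root.findtext("citationKey") or "").strip()
    return command, key


xml = citation_to_xml("citet", "myBibkey")
assert xml_to_citation(xml) == ("citet", "myBibkey")
```

Note that LaTeX is never run here: the citation command and key are
simply carried through unchanged, which is all scenario B requires.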

>
> So scenario 2 is actually simpler, because we do not have a dependency
> on LaTeX at all.
> At the same time, scenario 1 is more important for those people who
> are likely to interact with Word users (see Juergen's comments, which
> I subscribe to).

I would suggest designing the round-trip export so that it can easily
be extended into a fully fledged semantic exporter.

>
> In general, then, we have overlapping projects with substantial
> difference sets:
> A - The LyX-only information that needs to be somehow encoded in the XML file
> B - The LaTeX-produced-only information that is missing from LyX
>
> Preserving LyX-only information in an XML file (A) strikes me as being
> substantially easier than producing the LyX-missing information (B)
> for the Word file. The latter requires TeX runs, the former does not.

I assume you are referring here to the second A and B (the difference
sets), since, if I understand correctly, the A and B under point 1 are
defined the other way round?


>
> 2. How to produce a Word output, that is, how to solve problem B above?
> Since TeX is basically required to process a LyX-produced .tex file,
> the following approaches are available (there may be more than three,
> but these have known and working implementations):
>
> a. Mimic a TeX run by running a TeX-like processor on the tex file,
> but target XML as output
> examples: LaTeXML
>
> b. Run LaTeX and process the resulting PDF or DVI file into XML
> examples: tex4ht
>
> c. Modify an existing TeX engine to target XML instead of pdf (or dvi)
> examples: XML from ConTeXt input in LuaTeX
>
> All three approaches are ambitious and have different shortcomings.
>
> (a) (Mimicking LaTeX) has the obvious problem that even once the basic
> LaTeX functionality is recovered, the LaTeX packages have to be
> basically recreated for the new engine. This is what happens in
> LaTeXML, where you have to write "bindings" for every package you need
> to support. At the moment, many packages are not supported, including
> biblatex, and from the little I have seen on their mailing list adding
> such support is not trivial.
> On the plus side, since XML is the target, all the formatting-only
> machinery of TeX can be ignored (well, in theory. Real world is messy)
>
> (b) This approach has the advantage of bringing in support for all
> LaTeX packages for free. However, parsing a DVI file with the goal of
> producing XML is not trivial given the completely different design
> goals of DVI vs. XML.
>
> (c) Finally, modifying an existing TeX engine (e.g. LuaTeX) may be the
> cleanest approach, at the price of much increased complexity.
>
> 3. Should LyX<-->Word conversion be direct or use an intermediary
> format (e.g. pandoc | mmd | etc.)?
>
> This question applies mostly to the roundtrip project. The consensus
> seems to be that it would be better to avoid yet another format and go
> for direct conversion. On the minus side, such an approach would make
> it impossible (well, more difficult) to switch back-ends for the round
> trip, if so desired (see Rainer's points)

Unless one defines a clear software interface that other converters can
use. Effectively, this could mean extending the LyX server to provide
the information needed by the converter. The parsing would then be done
in LyX (advantage: no worries about different .lyx formats) and the
conversion to docx in the external converter.
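
As a rough sketch of what such an interface could look like (in Python;
Node, DocumentSource, InMemorySource and convert are hypothetical names
chosen for illustration, not anything the LyX server provides today):
LyX would hand over a stream of semantic nodes, and a back-end would
consume them without ever seeing .lyx syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Protocol


@dataclass
class Node:
    """One semantic unit of a document (hypothetical model)."""
    kind: str  # e.g. "paragraph", "citation", "note"
    text: str = ""
    attrs: Dict[str, str] = field(default_factory=dict)


class DocumentSource(Protocol):
    """What a converter asks of LyX; parsing stays on the LyX side."""
    def nodes(self) -> Iterable[Node]: ...


class InMemorySource:
    """Stand-in for LyX, used only to exercise the interface."""
    def __init__(self, nodes: List[Node]) -> None:
        self._nodes = nodes

    def nodes(self) -> Iterable[Node]:
        return iter(self._nodes)


def convert(source: DocumentSource) -> List[str]:
    """Toy back-end: one output line per node."""
    lines = []
    for node in source.nodes():
        if node.kind == "citation":
            lines.append(f"[cite:{node.attrs['command']}:{node.attrs['key']}]")
        else:
            lines.append(node.text)
    return lines


doc = InMemorySource([
    Node("paragraph", "See"),
    Node("citation", attrs={"command": "citet", "key": "myBibkey"}),
])
assert convert(doc) == ["See", "[cite:citet:myBibkey]"]
```

Swapping back-ends would then mean writing another convert-like
function against the same DocumentSource, which would also address the
concern about being locked into one converter.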

>
>
>
> These seem to me to be the most important issues we face. I may be
> forgetting some important points. If so, please correct me.
> Comments of any kind are welcome.

One important point for the general design: the round-trip converters
should not depend heavily on the .lyx file format, which is likely (?)
to change to XML in the medium term.

Cheers,

Rainer

>
>
> Cheers,
>
> Stefano

-- 
Rainer M. Krug

email: RMKrug<at>gmail<dot>com
