On Sat, Sep 07, 2002 at 10:01:08PM +0300, Dekel Tsur wrote:
> If neither existing html->latex converter does a good job, I suggest you try
> to improve one of them rather than write a html->lyx converter.
Yes, but writing a lyx->xml convertor might prove useful. I have been
working terribly hard on a script that converts RTF to XML. Now I need
some type of utility to convert my resulting XML file to LyX. I can
simply write an xslt style sheet, but it would be better to have some
type of standard LyX XML document.
This needs explanation. Right now LyX stores its data in a text file.
You can use an xslt style sheet to convert from XML to text, but you
have to be very careful with spaces. The resulting style sheet can be
quite messy and tricky, since a missing line break or space changes the
way a LyX document presents text.
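
To illustrate the whitespace problem: pretty-printed XML carries its line breaks into the element text, so they have to be collapsed before the text is written into a line-oriented format such as a LyX file. The sketch below uses Python's standard xml.etree.ElementTree rather than xslt, purely for illustration. The `<cell>` fragment and the `clean_text` helper are hypothetical names, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, laid out the way a hand-edited XML file often is.
SAMPLE = """<cell>
cell one
</cell>"""


def clean_text(elem):
    """Collapse runs of whitespace in an element's text to single spaces."""
    return " ".join((elem.text or "").split())


cell = ET.fromstring(SAMPLE)
raw = cell.text              # still contains the line breaks from the XML layout
cleaned = clean_text(cell)   # safe to write into a line-oriented text file
```

An xslt style sheet has to do the same job itself, which is part of why it tends to get messy.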
So to make the conversion easier, there could be a standard LyX XML
document. I have included my own preliminary efforts below, to better
illustrate what I mean.
Of course, you still need to convert this LyX XML document to a true LyX file.
This means writing an xslt style sheet. The overall conversion process
would look like this:
html->standard LyX XML->file.lyx
Or, to convert the other way:
file.lyx->standard LyX XML->html
If you look at this closely, you will probably raise an objection
that this method doesn't save any work--in fact, it requires one extra
step!
However, what if we wrote standard convertors to take out the last step
in the first conversion, and the first step in the second conversion?
For example, you have an html file that you want to convert to LyX. You
merely need to write an xslt style sheet to convert to the standard LyX
XML format. The last step of actually converting to a LyX file would be
handled by a standard xslt style sheet, which I or somebody else could
write.
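
A rough sketch of what that standard last step might do, written in Python as a stand-in for the xslt style sheet. It assumes the LyX 1.x text layout in which header settings are `\name value` lines, each paragraph starts with a `\layout Name` line followed by a blank line, and the file ends with `\the_end`. The function name `lyx_xml_to_text` is hypothetical, and tables, insets and the leading `#LyX` comment line are ignored.

```python
import xml.etree.ElementTree as ET


def lyx_xml_to_text(xml_source):
    """Turn a small LyX XML document into LyX-style text (simplified)."""
    root = ET.fromstring(xml_source)
    lines = []
    # Each attribute of <definitions> becomes one "\name value" header line.
    definitions = root.find("definitions")
    if definitions is not None:
        for name, value in definitions.attrib.items():
            lines.append(f"\\{name} {value}")
    # Each <para> becomes a "\layout Type" line, a blank line, then its text.
    for para in root.iter("para"):
        layout = para.get("type", "standard").capitalize()
        text = " ".join((para.text or "").split())  # collapse stray whitespace
        lines.extend(["", f"\\layout {layout}", "", text])
    lines.append("\\the_end")
    return "\n".join(lines) + "\n"


# Cut-down input in the style of the sample attached to this message.
EXAMPLE = """<lyx_doc>
<definitions lyxformat="218" textclass="article"/>
<section number="false">
<para type="standard">standard text</para>
<para type="verse">verse</para>
</section>
</lyx_doc>"""

output = lyx_xml_to_text(EXAMPLE)
```

The point is that the spacing rules live in one place, so nobody converting to the standard LyX XML format has to think about them.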
Now consider that you want to convert from LyX to html. The first step
of converting to the standard LyX XML format is handled by a perl
script (or other convertor). The result is an XML document, and
converting XML to HTML is very easy.
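
The first step could look roughly like the following. This is a sketch in Python rather than perl, under the same assumptions about the LyX text layout (`\name value` header lines, `\layout Name` paragraphs, a final `\the_end`). The name `lyx_text_to_xml` and the sample input are hypothetical, and inline commands inside paragraphs are simply skipped.

```python
import xml.etree.ElementTree as ET


def lyx_text_to_xml(lyx_text):
    """Build a <lyx_doc> element from a simplified LyX-style text file."""
    root = ET.Element("lyx_doc")
    definitions = ET.SubElement(root, "definitions")
    para = None  # the <para> currently collecting text, if any
    for line in lyx_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the leading comment line
        if line == "\\the_end":
            break
        if line.startswith("\\layout "):
            # Start a new paragraph; the layout name becomes the type.
            para = ET.SubElement(root, "para")
            para.set("type", line.split(None, 1)[1].lower())
            para.text = ""
        elif line.startswith("\\"):
            if para is None:
                # Header line before the first paragraph: "\name value".
                name, _, value = line[1:].partition(" ")
                definitions.set(name, value)
            # Inline commands inside a paragraph are not handled here.
        elif para is not None:
            # Body text; join wrapped lines with a single space.
            para.text = (para.text + " " + line).strip()
    return root


LYX_TEXT = """#LyX file (simplified, hypothetical)
\\lyxformat 218
\\textclass article

\\layout Standard

standard text
on two lines
\\the_end
"""

doc = lyx_text_to_xml(LYX_TEXT)
xml_string = ET.tostring(doc, encoding="unicode")
```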
In other words, my method means that a user only has to convert between
XML formats. Anybody with XML knowledge would find this easy. You would
not have to have any knowledge of the actual LyX layout.
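
As an example of such an XML-to-XML step, here is a minimal sketch that turns `<para>` elements of the proposed LyX XML into HTML paragraphs. It uses Python's xml.etree.ElementTree in place of xslt. Mapping the `type` attribute onto an HTML `class` is just one possible choice, and `lyx_xml_to_html` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def lyx_xml_to_html(xml_source):
    """Convert the <para> elements of a LyX XML document to HTML <p> tags."""
    root = ET.fromstring(xml_source)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for para in root.iter("para"):
        p = ET.SubElement(body, "p")
        # Keep the LyX paragraph type so a CSS style sheet can act on it.
        p.set("class", para.get("type", "standard"))
        p.text = " ".join((para.text or "").split())
    return ET.tostring(html, encoding="unicode")


EXAMPLE = """<lyx_doc>
<section number="false">
<para type="standard">standard text</para>
<para type="verse">verse</para>
</section>
</lyx_doc>"""

html_text = lyx_xml_to_html(EXAMPLE)
```

Nothing in this step depends on how LyX lays out its own files.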
This method has potential for anyone who likes working in XML. For
example, I believe abiword stores its files in XML format. If you wanted
to convert an abiword document to LyX, you would merely have to convert
it to the standard LyX XML document, and then use a standard convertor
(the xslt stylesheet which I could write). Or, consider that you need to
convert an RTF file to LyX. In this case, you use my perl script to
convert the file to XML, and then write an xslt style sheet to convert
the document to the standard LyX XML format. The last step is handled by
the standard convertor.
I admit that most LyX users won't find anything advantageous in this
method. I know that there are already good convertors to convert from
LaTeX to html and to many other formats. However, using XML gives a user
more control over the actual form of the final LyX document. XML is
gaining popularity and it might be a good idea to provide a method for
XML users to convert to LyX.
Paul
<?xml version="1.0"?>
<!DOCTYPE lyx_doc>
<!--a very incomplete version of what a LyX document might look like if
converted to XML-->
<lyx_doc>
<definitions
lyxformat="218"
textclass="article"
language="english"
inputencoding="auto"
fontscheme="default"
graphics="default"
paperfontsize="default"
spacing="single"
papersize="Default"
paperpackage="a4"
use_geometry="0"
use_amsmath="0"
paperorientation="portrait"
secnumdepth="3"
tocdepth="3"
paragraph_separation="indent"
defskip="medskip"
quotes_language="english"
quotes_times="2"
papercolumns="1"
papersides="1"
paperpagestyle="default"
/>
<part>
<table
version="2"
rows="2"
columns="2"
rotate="false"
islongtable="false"
endhead="0"
endfirsthead="0"
endfoot="0"
endlastfoot="0"
>
<!--Don't know what to do with the formatting for each column.
Can't think of a good xml solution right now-->
<column nu="1"
alignment="center"
valignment="top"
leftline="true"
rightline="false"
width="" special=""
/>
<column nu="2"
alignment="center"
valignment="top"
leftline="true"
rightline="false"
width="" special=""
/>
<row topline="true"
bottomline="true"
newpage="false"
>
<cell multicolumn="0"
alignment="center"
valignment="top"
topline="true"
bottomline="false"
leftline="true"
rightline="false"
rotate="false"
usebox="none"
width=""
special=""
>
cell one
</cell>
</row>
</table>
</part>
<section number="false">
<para type="standard">standard text</para>
<para type="verse">verse</para>
<para type="standard" added_space_top="smallskip"> text</para>
</section>
</lyx_doc>
--
************************
*Paul Tremblay *
*[EMAIL PROTECTED]*
************************