Re: Progress on the MS Word to LyX conversion (xml)

2008-07-31 Thread Michael Wojcik

Manveru wrote:

> Have you ever merged XML? I tried - it is horrible work.


It depends entirely on how the XML document is formatted. There's 
nothing that prevents XML with sensible line breaks, for example.


I keep lots of XHTML documents in CVS. They're well-formatted, so 
merging works just fine.


--
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-31 Thread Michael Wojcik

Steve Litt wrote:


> Trouble is, replacing \begin..\end with <>...</> is a hack. LyX developers
> have defined the LyX native format so that \begin is always the first
> character on a line. There's no such requirement in XML, and if we require
> it, that's a hack. If we don't require it, LyX-XML parsing becomes a whole
> new level of difficulty.


It's not hard at all, with an XML parser. Actually, putting all XML 
elements on their own lines, with or without leading whitespace, can 
be done with a DFA (or anything equivalent, such as a regular 
expression); you don't even need a full-strength parser. If you want 
elements all on their own lines, pre-processing with a quick sed 
script would do that for you.
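
As a rough illustration of the point above (not from the thread), here is a minimal Python sketch of the regular-expression approach. The pattern and the function name `one_tag_per_line` are assumptions made for this example; the pattern deliberately ignores comments, CDATA sections and `>` inside attribute values, and stripping whitespace this way is only safe where whitespace in mixed content is not significant.

```python
import re

# Matches one XML tag (start, end, or empty-element).
# Assumption: no comments, CDATA, or '>' inside attribute values.
TAG = re.compile(r"<[^<>]+>")


def one_tag_per_line(xml_text: str) -> str:
    """Put every tag on its own line and drop whitespace-only lines."""
    # Surround each tag with newlines, then clean up the resulting lines.
    spaced = TAG.sub(lambda m: "\n" + m.group(0) + "\n", xml_text)
    lines = (line.strip() for line in spaced.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(one_tag_per_line("<doc><p>Hello <em>LyX</em> world</p></doc>"))
```

A real XML parser is still the safer choice once the input may contain comments or CDATA.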


I'm a toolsmith myself, and I write lots of tools, in lots of 
languages, for pre- and post-processing various file formats. I don't 
expect the switch to XML to cause me any problems, and to be honest 
I'm a bit puzzled by all the worrying.


--
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-31 Thread Abdelrazak Younes

Michael Wojcik wrote:

> I don't expect the
> switch to XML to cause me any problems, and to be honest I'm a bit
> puzzled by all the worrying.


/me too :-)

Abdel.



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-28 Thread Steve Litt
On Monday 28 July 2008 01:10, John McCabe-Dansted wrote:
> On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote:
> > To the discussion about data format preference:
> >
> > I am reading all your comments about XML, YAML and other suggested data
> > formats. This discussion reminds me of something about XML that almost
> > nobody remembers. How many LyX users work in large team projects? How
> > often do they have to merge text files from different branches? Have
> > you ever merged XML? I tried - it is horrible work.
>
> I don't see why it would be harder if we "just replace \begin...\end
> with <>...</>".

Trouble is, replacing \begin..\end with <>...</> is a hack. LyX developers 
have defined the LyX native format so that \begin is always the first 
character on a line. There's no such requirement in XML, and if we require 
it, that's a hack. If we don't require it, LyX-XML parsing becomes a whole 
new level of difficulty.

Like I said, nothing that XML->YAML and YAML->XML can't solve, but those would 
be required. Incidentally, I just heard there are already standalone programs 
that do those conversions, so before writing code myself, I'll investigate.

SteveT
 
Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-28 Thread G. Milde
On 28.07.08, Steve Litt wrote:
> On Monday 28 July 2008 01:10, John McCabe-Dansted wrote:
> > On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote:
> > > To the discussion about data format preference:
> > >
> > > ... Have you ever merged XML? I tried - it is horrible work.
> >
> > I don't see why it would be harder if we "just replace \begin...\end
> > with <>...</>".

> Trouble is, replacing \begin..\end with <>...</> is a hack. 
...
> There's no such requirement in XML, and if we require it, that's a 
> hack. 

I'd call it a layout convention.

IMO it is perfectly legal to define the lyx file format as

... uses XML ...
... is laid out in a manner to facilitate processing by tools that
operate on a line basis (grep, merge, sed, awk, ...) 
...
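
To make the layout-convention argument concrete (this example is not from the thread), here is a small Python sketch showing why one element per line suits line-based tools. The element names are hypothetical and do not reflect any actual LyX XML schema; `difflib` stands in for tools such as diff or merge.

```python
import difflib

# Hypothetical one-element-per-line XML; element names are illustrative only.
old = """<document>
<layout name="Standard">
<text>Hello world</text>
</layout>
</document>""".splitlines()

# Simulate an edit that touches a single element.
new = [line.replace("Hello world", "Hello LyX") for line in old]

# Keep only the changed lines, skipping the '---'/'+++' file headers
# and the '@@' hunk markers.
changes = [
    line
    for line in difflib.unified_diff(old, new, lineterm="", n=0)
    if line.startswith(("+", "-")) and not line.startswith(("---", "+++"))
]

if __name__ == "__main__":
    print("\n".join(changes))
```

Only the one edited line shows up in the diff. Had the whole document been written on a single line, any edit would have marked the entire file as changed.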

Günter




Re: Progress on the MS Word to LyX conversion (xml)

2008-07-28 Thread Abdelrazak Younes

G. Milde wrote:
> On 28.07.08, Steve Litt wrote:
> > On Monday 28 July 2008 01:10, John McCabe-Dansted wrote:
> > > On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote:
> > > > To the discussion about data format preference:
> > > >
> > > > ... Have you ever merged XML? I tried - it is horrible work.
> > >
> > > I don't see why it would be harder if we "just replace \begin...\end
> > > with <>...</>".
> >
> > Trouble is, replacing \begin..\end with <>...</> is a hack.
> ...
> > There's no such requirement in XML, and if we require it, that's a
> > hack.
>
> I'd call it a layout convention.
>
> IMO it is perfectly legal to define the lyx file format as
>
> ... uses XML ...
> ... is laid out in a manner to facilitate processing by tools that
> operate on a line basis (grep, merge, sed, awk, ...)
> ...


Right, but LyX should not depend on this human-friendly layout. IOW, LyX 
will be able to parse .lyx files that are not nicely formatted, but it 
will always output nicely formatted .lyx files.


We could add an option to lyx2lyx so that badly formatted LyX files 
generated by some external tool would be transformed into a nicely 
formatted .lyx file. See? I don't forecast any parsing problem :-)
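
The "accept any layout, always write a normalized one" idea can be sketched in a few lines of Python (this is an illustration, not lyx2lyx code). The function name `normalize` and the element names are hypothetical; `xml.etree.ElementTree.indent` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def normalize(xml_text: str) -> str:
    """Parse XML in any layout and re-serialize it with a fixed indentation."""
    root = ET.fromstring(xml_text)
    # Overwrites whitespace-only text between elements with uniform
    # indentation (Python 3.9+).
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


# Two hypothetical inputs with the same content but different layouts.
compact = "<document><layout name='Standard'><text>Hello</text></layout></document>"
spread = """<document>
      <layout name='Standard'>
   <text>Hello</text></layout>
</document>"""

if __name__ == "__main__":
    # Both layouts normalize to the same output.
    assert normalize(compact) == normalize(spread)
    print(normalize(compact))
```

Because both inputs serialize identically, a line-based diff or merge of the normalized files only sees real content changes.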


Abdel.



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-28 Thread Abdelrazak Younes

John McCabe-Dansted wrote:
> On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote:
> > To the discussion about data format preference:
> >
> > I am reading all your comments about XML, YAML and other suggested data
> > formats. This discussion reminds me of something about XML that almost
> > nobody remembers. How many LyX users work in large team projects? How
> > often do they have to merge text files from different branches?
> > Have you ever merged XML? I tried - it is horrible work.
>
> I don't see why it would be harder if we "just replace \begin...\end
> with <>...</>".
>
> > I think LyX cannot exist with an XML data format without built-in
> > document merge functionality.
>
> This would be nice in any case.


Shameless plug:

http://www.lyx.org/Donate#sponsorship

Abdel.



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-27 Thread José Matos
On Thursday 24 July 2008 13:07:19 Pavel Sanda wrote:
> frankly - these are nice dreams, but there is not manpower to do it.
> my feeling is that the xml-branch commit activity perfectly shows what will
> happen after the worst bugs will be repaired in xml merged trunk.
>
> or you have some particular developer in mind? :))

Last time that I remember lyx2lyx was also a nice dream. :-)

> pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-27 Thread Pavel Sanda
> On Thursday 24 July 2008 13:07:19 Pavel Sanda wrote:
> > frankly - these are nice dreams, but there is not manpower to do it.
> > my feeling is that the xml-branch commit activity perfectly shows what will
> > happen after the worst bugs will be repaired in xml merged trunk.
> >
> > or you have some particular developer in mind? :))
> 
> Last time that I remember lyx2lyx was also a nice dream. :-)

you wanted to say docbook ? :))

> > pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-27 Thread Manveru
To the discussion about data format preference:

I am reading all your comments about XML, YAML and other suggested data
formats. This discussion reminds me of something about XML that almost
nobody remembers. How many LyX users work in large team projects? How
often do they have to merge text files from different branches?
Have you ever merged XML? I tried - it is horrible work.

I think LyX cannot exist with an XML data format without built-in document
merge functionality, if anyone is thinking about professional usage of LyX.
I saw some discussions about it, but I do not know whether it is in LyX or
not. I do not need this feature yet.

YAML is an interesting idea; I saw it used in one of the Python frameworks
(I don't remember which one). But it stays in a niche. I don't see libraries
for YAML under active development right now.

-- 
Manveru
jabber: [EMAIL PROTECTED]
gg: 1624001
http://www.manveru.pl


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-27 Thread John McCabe-Dansted
On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote:
> To the discussion about data format preference:
>
> I am reading all your comments about XML, YAML and other suggested data
> formats. This discussion reminds me of something about XML that almost
> nobody remembers. How many LyX users work in large team projects? How
> often do they have to merge text files from different branches?
> Have you ever merged XML? I tried - it is horrible work.

I don't see why it would be harder if we "just replace \begin...\end
with <>...</>".

> I think LyX cannot exist with an XML data format without built-in document
> merge functionality.

This would be nice in any case.

-- 
John C. McCabe-Dansted
PhD Student
University of Western Australia


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-24 Thread Pavel Sanda
> what I claim is that we need better 
> script tools to handle lyx documents. Those tools should be stable across lyx 
> versions and should not depend on any particular file format.

frankly - these are nice dreams, but there is not manpower to do it.
my feeling is that the xml-branch commit activity perfectly shows what will
happen after the worst bugs will be repaired in xml merged trunk.

or you have some particular developer in mind? :))

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-24 Thread José Matos
On Wednesday 23 July 2008 19:24:16 Pavel Sanda wrote:
> this depends on what you master. i'm used on the bunch of small unix
> utilities so i gave that sed example. if you know python you will do in
> python. my point was not propose the best tools but to groan and moan about
> xml :)

FWIW this chunk is from one of my shell scripts:

echo $1
for i in {8..40}
do
    echo -n '.'
    w=`printf "%.2d0" $i`
    f="dfa-$1-$w.dat"
    ./dfa -s -w $w < $1.dat | ./join-lag.py -l $w -r $1.dates > $f
    cut -f1,2 $f | join -a1 dfa.dat - > tmp.dat
    mv tmp.dat dfa.dat
done

So as you can see I know more than python. :-)
And yes I know this only works with bash, and that is OK with me. :-)

My point is that it is all right to use the small tools of the trade, but we can 
do better because lyx documents are richer than just pure text.

I am not saying that your usage is wrong; what I claim is that we need better 
script tools to handle lyx documents. Those tools should be stable across lyx 
versions and should not depend on any particular file format.

> pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-24 Thread Christian Ridderström

On Wed, 23 Jul 2008, Steve Litt wrote:


> As a sed/awk/perl/ruby parser, I appreciate that very much.
>
> The more I think about it, the more I think I should make the XML->YAML 
> and YAML->XML converters. That way, if future generations of LyX project 
> programmers forget why it's important to space their XML "just so", it 
> won't matter. Also, I have a feeling that YAML will be much easier to 
> parse than either 1.5.x or XML.
>
> At first I'll do them in Ruby because Ruby has all that stuff built in 
> and easy to do.


Did you see José's post about how the lyx2lyx stuff is really inside a 
Python lib (module)?  You'd probably only need a different kind of wrapper 
that calls this module, instead of reinventing everything in Ruby.


/Christian

--
Christian Ridderström, +46-8-768 39 44    http://www.md.kth.se/~chr

Re: Progress on the MS Word to LyX conversion (xml)

2008-07-24 Thread Pavel Sanda
> what I claim is that we need better 
> script tools to handle lyx documents. Those tools should be stable across lyx 
> versions and should not depend of any particular file format.

frankly - these are nice dreams, but there is no manpower to do it.
my feeling is that the xml-branch commit activity perfectly shows what will
happen after the worst bugs are repaired in the xml-merged trunk.

or you have some particular developer in mind? :))

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote:
> > Pavel Sanda wrote:
> > Moreover, if you're editing by hand, you can use
> > something that recognizes XML.

> of course it will work, but it will take x-times more time.
> quite difference to write sed one-liner or start doing some
> xslt templating.

> pavel

Yeah, I think this was the point I was trying to get across. With the current 
format, you can do a lot with Vim. Or you can run through a series of small 
filters that do just one thing.

XML's a different animal. Without a parser, it's almost impossible to handle. 
With a parser, you're forced to work only within the language of that parser, 
and you're forced to make a monolithic solution that can't take advantage of 
Unix pipes and small executables that do one thing and do it well. You also 
forgo the ability to have a series of intermediate files, each serving as a 
test point to make sure things are still going well.

Also, an XML parser, especially a DOM one, makes READING XML very easy, but it 
does nothing for WRITING.

Pavel -- you and I and others like us need to start identifying parsing tools 
to at least partially compensate for the loss of our Unix based pipes with 
small filter executables. Theoretically, if one could read the XML into a DOM 
tree, tweak it in memory, and then write it back out, that would be at least 
somewhat doable, though nothing like the Awk and Perl techniques I'm used to.
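
For what it's worth, the read-tweak-write round trip described above might look roughly like this in Python, using the standard xml.etree.ElementTree module. It is only a sketch: the element and attribute names (document, layout, name) are made up for illustration, since the LyX XML dialect has not been defined yet.

```python
import xml.etree.ElementTree as ET

# Hypothetical LyX-like XML; the real format is not specified in this thread.
SAMPLE = """<document>
  <layout name="Section">Introduction</layout>
  <layout name="Standard">Some text.</layout>
  <layout name="Section">Methods</layout>
</document>"""


def promote_sections(xml_text):
    """Parse the XML, rename every Section layout to Chapter, serialize again."""
    root = ET.fromstring(xml_text)
    for layout in root.iter("layout"):
        if layout.get("name") == "Section":
            layout.set("name", "Chapter")
    return ET.tostring(root, encoding="unicode")


# Small demonstration on the sample document.
print(promote_sections(SAMPLE))
```

The tweak happens on the in-memory tree, so it does not care how the tags were spaced or wrapped in the input file.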

And once again, we need COMPLETE documentation on the XML dialect, and, like I 
said, I'm willing to help with that documentation.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote:
> by 'outside' i mean tweakings which i regularly do and watching users list
> power users do that too _and_ are happy about the current simplicity of
> format.

> tweaks like assembling of the whole file for various datasets, global
> changes of things (cf notes-mutate lfun i introduced lately), conversions
> and so on.

This works well for simple things but breaks badly when you try something a 
bit more complex.

> while you are right that xml could be better technology for internal
> lyx parsing (and i can understand your viewpoint as lyx2lyx fan:)
> this was not my mail about.

> > It is funny to see all this nostalgia around something that is/was a
> > nightmare.

> it has nothing to do with nostalgia, but speed of hacking around.

Not when the resulting file crashes lyx, something that should not ever happen 
but that it does now. First make it correct and then make it fast.

XML will not change the current status.

grep 'style name=Section' somefile.lyx

will still work, and it is not so different from what we have now. You need the '' 
already if you have spaces in your expression...

> pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
> On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote:
> > while you are right that xml could be better technology for internal
> > lyx parsing (and i can understand your viewpoint as lyx2lyx fan:)
> > this was not my mail about.
>
> > > It is funny to see all this nostalgia around something that is/was a
> > > nightmare.
>
> > it has nothing to do with nostalgia, but speed of hacking around.
>
> Not when the resulting file crashes lyx, something that should not ever
> happen but that it does now.

i've done incorrect file, it's my fault if lyx crashes. i take my 
responsibility,
no problem.
trial method is the fastest if you want something quickly.

> First make it correct and then make it fast.

i have exactly opposite view as far as the tweaking i was talking about
is concerned; i just need quickly output of something, may be i will throw
it away after few days.

or take Steve's example - if he takes your 'First make it correct and then
make it fast' it would take some two weeks to invent some beast to be 
correct in your sense. but then the whole point is lost, since after this
time he could do it manually.

i guess we can't agree on this, since i'm not talking about lyx internals,
while your job is to make lyx format conversions on lyx level... but this
is users list, not the devel one, so i feel free to speak this way :)

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote:
> i've done incorrect file, it's my fault if lyx crashes. i take my
> responsibility, no problem.
> trial method is the fastest if you want something quickly.

If LyX crashes that is a bug. LyX should never crash; it can refuse to 
load a file because it is invalid, or truncate it, but it should never 
crash.

In the whole picture our parser is one of our weak links so we should do 
something about it. Replace it in this case.

> > First make it correct and then make it fast.
>
> i have exactly opposite view as far as the tweaking i was talking about
> is concerned; i just need quickly output of something, may be i will throw
> it away after few days.
>
> or take Steve's example - if he takes your 'First make it correct and then
> make it fast' it would take some two weeks to invent some beast to be
> correct in your sense. but then the whole point is lost, since after this
> time he could do it manually.
>
> i guess we can't agree on this, since i'm not talking about lyx internals,
> while your job is to make lyx format conversions on lyx level... but this
> is users list, not the devel one, so i feel free to speak this way :)

Yes, I know but I can pretend otherwise. ;-)

> pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Manveru
Guys,

Have you even looked at TinyXML?

I once had a project where we used XML as a message passing protocol, and we
were using XSLT as a C++ code generator for the classes that handled the XML
and converted it to the data structures holding all the data we needed. This
freed us from portability problems (Little Endian, Big Endian), which is not
the case here. For an application like LyX a binary structure may be better
to handle - certainly much work to do. In our project we hadn't found any
known DOM useful for our purpose.

Cheers!
M.

2008/7/23 José Matos [EMAIL PROTECTED]:

> On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote:
> > i've done incorrect file, it's my fault if lyx crashes. i take my
> > responsibility, no problem.
> > trial method is the fastest if you want something quickly.
>
> If LyX crashes that is a bug. LyX should never crash; it can refuse to
> load a file because it is invalid, or truncate it, but it should never
> crash.
>
> In the whole picture our parser is one of our weak links so we should do
> something about it. Replace it in this case.
>
> > > First make it correct and then make it fast.
> >
> > i have exactly opposite view as far as the tweaking i was talking about
> > is concerned; i just need quickly output of something, may be i will
> > throw it away after few days.
> >
> > or take Steve's example - if he takes your 'First make it correct and
> > then make it fast' it would take some two weeks to invent some beast to
> > be correct in your sense. but then the whole point is lost, since after
> > this time he could do it manually.
> >
> > i guess we can't agree on this, since i'm not talking about lyx
> > internals, while your job is to make lyx format conversions on lyx
> > level... but this is users list, not the devel one, so i feel free to
> > speak this way :)
>
> Yes, I know but I can pretend otherwise. ;-)
>
> > pavel
>
> --
> José Abílio




-- 
Manveru
jabber: [EMAIL PROTECTED]
gg: 1624001
http://www.manveru.pl


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 18:21, José Matos wrote:

> Clearly you did not have to deal with the lyx file format like I did. :-)
> If your idea of a parser is a set of regexp's that is so 80's. ;-)
[clip]
> It is funny to see all this nostalgia around something that is/was a
> nightmare. If the syntax was so clear you would not have the problem of
> crashing lyx with a bad formed file (a file modified by scripts).

When the discussion reverts to "your thingamabob is from another 
decade/century so it must not be good by today's standards", you know that 
thingamabob is pretty darn good, or else there would have been a more 
powerful argument against it.

First of all, I understand *exactly* why an XML native format is an 
improvement for the LyX application. I'm limiting my point to the concept 
that something old has to be something bad.

Modern things are usually improvements, but often are not improvements in 
quality or usefulness. They can be improvements to profit margin (e.g. most 
MS Windows improvements), or marketing improvements (all the silly little 
expensive features thrown into basic family cars today), or improvements in 
restricting use (DRM), or improvements in price (crummy bicycles from 
Walmart). Sometimes older stuff has more quality or usefulness.

In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the 
philosophy of little executables that do one thing and do it right. Stdin, 
stdout and pipes were the glue language with which these little executables 
could be cascaded to produce a substantial result. This enabled 
logical-thinking non-developers, and also developers, to produce those 
substantial results in an hour, with perhaps the greatest encapsulation 
that's ever been achieved in the computer world. Each little executable has 
one input and one output, each being a measurable test point. For batch 
processes this programming technique is every bit as productive as it was 
39 years ago.

There may be things wrong with awking, seding and perling data into 
submission, but the age of these tools is not one of them.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 07:00, José Matos wrote:

> XML will not change the current status.
>
> grep 'style name=Section' somefile.lyx
>
> will still work and it not so different from what we have now. You need the
> '' already if you have spaces in your expression...

The trouble is, XML tags can be anywhere -- spacing and linefeeds are 
immaterial. That means you can no longer parse based on position, such as: 

/^begin_layout/

because technically the whole XML file could be in a single line. Or a single 
tag could be split between lines.

This problem is somewhat lessened by the fact that you could do the following 
in Vim/ex:

:%s/</\r</g
:%s/>/>\r/g
:g/^\s*$/d

I imagine you could do the same thing with sed, I just don't know how. The 
preceding would put every XML tag on its own line, and eliminate all blank 
lines, after which you could indeed parse based on linefeeds. The other 
problem, of course, is that angle brackets within the text would be linefed, 
which may or may not be a problem depending on the XML dialect you come up 
with.
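
The same idea (a line break before each tag opens, one after each tag closes, then dropping blank lines) can be sketched as a small Python function. This is an illustration, not a tested tool: it also strips leading and trailing whitespace from every line, it does not rejoin a tag that is already split across lines, and, as noted above, literal angle brackets in the text would be broken up too.

```python
import re


def one_tag_per_line(text):
    """Put each <...> tag on its own line and drop blank lines."""
    text = re.sub(r"<", "\n<", text)   # line break before every tag start
    text = re.sub(r">", ">\n", text)   # line break after every tag end
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


# Small demonstration; the element names are invented for the example.
SAMPLE = '<doc><layout name="Section">Intro</layout>\n\n</doc>'
print(one_tag_per_line(SAMPLE))
```

Feeding it sys.stdin.read() and writing the result to sys.stdout would turn it into a filter that can sit in front of awk or grep in a pipe.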

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote:
> > Pavel Sanda wrote:
> > Moreover, if you're editing by hand, you can use
> > something that recognizes XML.

> of course it will work, but it will take x-times more time.
> quite difference to write sed one-liner or start doing some
> xslt templating.

> pavel

Hi Pavel,

Perhaps our best hope of continuing tweakability of native LyX is to create 
1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can 
continue to be done in the 1.5.x format.

I'm presuming that the LyX developers will create the 1.5.x to XML converter 
so users can upgrade their old docs, and hopefully they would keep that 
converter updated for each new LyX version, so that you and I wouldn't need 
to worry about coding the 1.5.x to XML.

The only thing you and I would have to do is the XML to 1.5.x converter. I'm 
pretty darned good with C, and if necessary I can do C++ (but with a C 
accent). If we pick an XML parser with full schema/dtd capability, that 
doesn't have many dependencies, then if you know how to write 1.5.x, I can 
feed you whatever data is needed to write the 1.5.x.

There's another possibility that I think might be better. Using Ruby with 
REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml) if 
you could help me just a little bit with the return trip (YAML to XML). I 
think this would be EVEN BETTER than 1.5.x, because YAML was made for exactly 
what you and I want to do -- parsing with awk/sed/perl/grep/cut. It would 
also remove our responsibility to support 1.5.x syntax in the 22nd century.

Using YAML for tweaking, I think there may come a time when you and I would 
say "remember when we had to parse that nasty 1.5.x?"

I can begin this project as soon as the developers give me an XML def and an 
XML file. That way, once they actually specify what they're going to do, 
we'll have the technology for the XML-YAML-XML round trip, and only the 
details will require coding.

What do you think?

StevET

Steve Litt
Recession Relief Package
http://www.recession-relief.US


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Abdelrazak Younes

Steve Litt wrote:

> Perhaps our best hope of continuing tweakability of native LyX is to create
> 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
> continue to be done in the 1.5.x format.
>
> I'm presuming that the LyX developers will create the 1.5.x to XML converter
> so users can upgrade their old docs, and hopefully they would keep that
> converter updated for each new LyX version, so that you and I wouldn't need
> to worry about coding the 1.5.x to XML.


Yes, switching to XML doesn't mean abandoning lyx2lyx. The difference is 
that we will be able to use simpler XSL templates for the conversion. 
The advantage is that the XSL templates will be available to all, not 
being specific to python or lyx2lyx.


By the way, the switch to XML is not going to happen with 1.6 but with 
1.7, that is at least one year from now ;-)




> The only thing you and I would have to do is the XML to 1.5.x converter.


This will be provided by lyx2lyx too. 1.7-XML will export to all 1.x 
formats with x <= 6.



> I'm
> pretty darned good with C, and if necessary I can do C++ (but with a C
> accent). If we pick an XML parser with full schema/dtd capability, that
> doesn't have many dependencies, then if you know how to write 1.5.x, I can
> feed you whatever data is needed to write the 1.5.x.


As I said above, this 1.7 to 1.6 will be supported via a simple XSL 
stylesheet. It's really the other direction 1.6 to 1.7 that will be 
difficult to implement.
But hey, all help is welcome, the development of 1.7 is going to begin 
in a couple of months so if you want to have a say in the new XML 
format, come along on the devel list ;-)


Abdel.



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:33:16 Steve Litt wrote:
> The trouble is, XML tags can be anywhere -- spacing and linefeeds are
> immaterial. That means you can no longer parse based on position, such as:
>
> /^begin_layout/
>
> because technically the whole XML file could be in a single line. Or a
> single tag could be split between lines.

Since we control the format I am (almost) sure that we will choose a reader 
friendly output. There is no reason to do otherwise. In terms of size a blank 
or a newline are equivalent, so... :-)

That is why it will be business as usual. :-)
Not much will change in this regard.

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:20:59 Steve Litt wrote:
> When the discussion reverts to "your thingamabob is from another
> decade/century so it must not be good by today's standards", you know that
> thingamabob is pretty darn good, or else there would have been a more
> powerful argument against it.

Pavel is a developer just as I am. In this thread we have been teasing each 
other over this issue. In such cases this is an acceptable argument (IMO). ;-)

> First of all, I understand *exactly* why an XML native format is an
> improvement for the LyX application. I'm limiting my point to the concept
> that something old has to be something bad.

That is fair. :-)

> Modern things are usually improvements, but often are not improvements in
> quality or usefulness. They can be improvements to profit margin (e.g. most
> MS Windows improvements), or marketing improvements (all the silly little
> expensive features thrown into basic family cars today), or improvements in
> restricting use (DRM), or improvements in price (crummy bicycles from
> Walmart). Sometimes older stuff has more quality or usefulness.

All that is true, but in this case the lyx file format, and indirectly the lyx 
parser, went unchanged for a long time until 2002, not because they were 
perfect but because most developers were afraid to touch and break them. The 
format had been evolving over time and it was a mess, with places where 
whitespace was significant and others where it was not, for no good reason.

> In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the
> philosophy of little executables that do one thing and do it right. Stdin,
> stdout and pipes were the glue language with which these little executables
> could be cascaded to produce a substantial result. This enabled
> logical-thinking non-developers, and also developers, to produce those
> substantial results in an hour, with perhaps the greatest encapsulation
> that's ever been achieved in the computer world. Each little executable has
> one input and one output, each being a measurable test point. For batch
> processes this programming technique is every bit as productive as it was
> 39 years ago.

lyx2lyx, which lyx uses to convert between the different file formats, works 
on this principle: it acts as a filter, reading from stdin and writing the 
transformation to stdout.

Yet until now there has been no good way for an external program (script) 
other than lyx to check the validity of a lyx file. For me, at least, this is 
a serious shortcoming of our file format.
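
With an XML format, at least a basic external check becomes cheap. A minimal sketch in Python with the standard xml.etree.ElementTree module is below. Note the assumption: it only tests well-formedness (balanced, properly nested tags), not validity against a LyX schema or DTD, which does not exist yet and which this module would not check anyway.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Small demonstration with one good and one broken document.
for sample in ("<doc><layout/></doc>", "<doc><layout></doc>"):
    ok, message = check_well_formed(sample)
    print(sample, "->", "ok" if ok else "not well-formed: " + message)
```

A script could run such a check on a generated file before handing it to lyx, and stop with an error message instead of producing a file that lyx cannot load.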

> There may be things wrong with awking, seding and perling data into
> submission, but the age of these tools is not one of them.

If you add there the coreutils, like tail, cut, paste, merge and so on we can 
do things that spreadsheet programs can only dream of like processing Gigs of 
data with thousands of lines and columns. :-)

> SteveT

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:58:56 Steve Litt wrote:
> Hi Pavel,
>
> Perhaps our best hope of continuing tweakability of native LyX is to create
> 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
> continue to be done in the 1.5.x format.

I would advise against such a practice. I hope to explain why in the 
paragraphs below.

> I'm presuming that the LyX developers will create the 1.5.x to XML
> converter so users can upgrade their old docs, and hopefully they would
> keep that converter updated for each new LyX version, so that you and I
> wouldn't need to worry about coding the 1.5.x to XML.

Note that the conversion to xml will only happen after 1.6. I know that your 
argument remains unchanged with this shift; I just want to correct this before 
continuing.

With this said, lyx2lyx will be able to convert from pre-xml to xml and vice 
versa.

Our previous experience suggests, however, that while the forward translation 
is complete, the backwards translation sometimes results in truncation or in 
lots of ERT added to preserve the same structure.

For several reasons a transformation from X to X+1 and back again is not 
guaranteed to give the same document bit by bit. Note also that this is not an 
easy task in any way.

The next question is why do we need to manipulate lyx files with awk and 
friends? Is this not something that can, or should, be done by lyx?

I have generated lyx files with scripts that have been used in my PhD thesis 
(almost 40 pages were generated like this) so I can recognize advantages in 
manipulating lyx files with scripts, but in that case there are better tools 
than awk and sed.

That is also the reason why lyx2lyx is nowadays mostly a python library 
(LyX.py) and the script lyx2lyx is just a wrapper around the library.

> The only thing you and I would have to do is the XML to 1.5.x converter.
> I'm pretty darned good with C, and if necessary I can do C++ (but with a C
> accent). If we pick an XML parser with full schema/dtd capability, that
> doesn't have many dependencies, then if you know how to write 1.5.x, I can
> feed you whatever data is needed to write the 1.5.x.
>
> There's another possibility that I think might be better. Using Ruby with
> REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml)
> if you could help me just a little bit with the return trip (YAML to XML).
> I think this would be EVEN BETTER than 1.5.x, because YAML was made for
> exactly what you and I want to do -- parsing with awk/sed/perl/grep/cut. It
> would also remove our responsibility to support 1.5.x syntax in the 22nd
> century.
>
> Using YAML for tweaking, I think there may come a time when you and I would
> say "remember when we had to parse that nasty 1.5.x?"
>
> I can begin this project as soon as the developers give me an XML def and
> an XML file. That way, once they actually specify what they're going to do,
> we'll have the technology for the XML-YAML-XML round trip, and only the
> details will require coding.
>
> What do you think?
>
> StevET

You are welcome both to tell us your requirements around the future xml file 
format and to help us so that in the end we all have a better lyx. Really, all 
help is welcome.

> Steve Litt
> Recession Relief Package
> http://www.recession-relief.US

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 14:49:12 Manveru wrote:
> Guys,
>
> Have you even looked at TinyXML?

Thanks for the link. :-)
-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 11:21, José Matos wrote:

> > There may be things wrong with awking, seding and perling data into
> > submission, but the age of these tools is not one of them.
>
> If you add there the coreutils, like tail, cut, paste, merge and so on we
> can do things that spreadsheet programs can only dream of like processing
> Gigs of data with thousands of lines and columns. :-)

:-)  :-)  :-)

Check this out:

http://www.troubleshooters.cxm/lpm/200801/200801.htm

http://www.troubleshooters.cxm/lpm/200802/200802.htm


But seriously -- it's obvious that for the LyX application itself, XML is by 
far the best way to go, and I would never suggest rewriting LyX in awk :-). 
My interest is in quick writes/tweaks of LyX native format files in order to 
do things that LyX isn't equipped to do, like my VimOutliner to LyX script.

STeveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 11:05, José Matos wrote:
> On Wednesday 23 July 2008 15:33:16 Steve Litt wrote:
> > The trouble is, XML tags can be anywhere -- spacing and linefeeds are
> > immaterial. That means you can no longer parse based on position, such
> > as:
> >
> > /^begin_layout/
> >
> > because technically the whole XML file could be in a single line. Or a
> > single tag could be split between lines.
>
> Since we control the format I am (almost) sure that we will choose a reader
> friendly output. There is no reason to do otherwise. In terms of size a
> blank or a newline are equivalent, so... :-)
>
> That is why it will be business as usual. :-)
> Not much will change in this regard.

Thanks José,

As a sed/awk/perl/ruby parser, I appreciate that very much.

The more I think about it, the more I think I should make the XML->YAML and 
YAML->XML converters. That way, if future generations of LyX project 
programmers forget why it's important to space their XML "just so", it won't 
matter. Also, I have a feeling that YAML will be much easier to parse than 
either 1.5.x or XML.

The way I envision it, these two converters will be simple standalone commands 
implemented as filters (convert stdin to stdout), very few dependencies. They 
will comply with the Unix Philosophy (little apps that do one thing and do it 
well). Trivial to install. They will be simple enough to be maintained by one 
person. 

They will be encapsulated. They won't need to know about LyX other than its 
XML format, and LyX won't need to know about them. They can be included in 
the LyX distribution, or not.
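
A rough sketch of the XML-to-YAML direction follows, written here in Python with only the standard library rather than in Ruby with REXML. Everything specific in it is an assumption for illustration: the tag/attrs/text/children keys, the sample document, and the choice to quote scalars with json.dumps. The output is meant to be YAML-style and line-oriented, but it has not been checked against a YAML parser, and text that follows a child element is ignored.

```python
import json
import xml.etree.ElementTree as ET


def to_yaml_lines(elem, depth=0):
    """Render one element, and its children, as indented YAML-style lines."""
    pad = "  " * depth
    lines = [pad + "- tag: " + json.dumps(elem.tag)]
    if elem.attrib:
        lines.append(pad + "  attrs:")
        for key in sorted(elem.attrib):
            lines.append(pad + "    " + json.dumps(key) + ": "
                         + json.dumps(elem.attrib[key]))
    text = (elem.text or "").strip()
    if text:
        lines.append(pad + "  text: " + json.dumps(text))
    children = list(elem)
    if children:
        lines.append(pad + "  children:")
        for child in children:
            lines.extend(to_yaml_lines(child, depth + 2))
    return lines


# Small demonstration; the element names are invented for the example.
SAMPLE = '<doc><layout name="Section">Intro</layout></doc>'
print("\n".join(to_yaml_lines(ET.fromstring(SAMPLE))))
```

Each value ends up on its own line with a fixed key in front of it, which is the property that makes the result easy to pick apart with grep, awk or sed. The return trip (YAML to XML) is not shown.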

At first I'll do them in Ruby because Ruby has all that stuff built in and 
easy to do. Later, depending on performance and the percent of people who 
have Ruby installed, I can convert them to C. There's a C implementation of 
the same YAML parser/emitter that Ruby uses -- Syck. I'm pretty sure there 
are also C or C++ implementations of XML Parsers, although I don't know how 
well they do things like DTD/schema.

Thanks

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
> Perhaps our best hope of continuing tweakability of native LyX is to create 
> 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can 
> continue to be done in the 1.5.x format.

as others have written, 1.6 is still ok. for lyx files assembly you can still
make what you want in 1.6 format and lyx2lyx will convert for you to 1.7 etc.

next possibility is to stick with 1.6 as long as possible :)

> The only thing you and I would have to do is the XML to 1.5.x converter. I'm 

this will be part of the fileformat transition in lyx itself. moreover xml 
is not my religion, so i will try to keep myself as far as possible from any
xml related coding :D

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
> The next question is why do we need to manipulate lyx files with awk and 
> friends? Is this not something that can, or should, be done by lyx?

search and replace is one of the weak lyx parts and even if we get Tommaso
one day to put his stuff in, there are so many places where it's of no help.
just look at things like notes-mutate or graphics settings synchronization;
other nonimplemented things come to my mind.

> I have generated lyx files with scripts that have been used in my PhD thesis 
> (almost 40 pages were generated like this) so I can recognize advantages in 
> manipulating lyx files with scripts, but in that case there are better tools 
> than awk and sed.

this depends on what you master. i'm used to the bunch of small unix utilities
so i gave that sed example. if you know python you will do it in python. my point
was not to propose the best tools but to groan and moan about xml :)

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Andre Poenitz
On Wed, Jul 23, 2008 at 10:33:16AM -0400, Steve Litt wrote:
> On Wednesday 23 July 2008 07:00, José Matos wrote:
>
> > XML will not change the current status.
> >
> > grep 'style name=Section' somefile.lyx
> >
> > will still work and it not so different from what we have now. You need the
> > '' already if you have spaces in your expression...
>
> The trouble is, XML tags can be anywhere -- spacing and linefeeds are 
> immaterial. That means you can no longer parse based on position, such as: 
>
> /^begin_layout/
>
> because technically the whole XML file could be in a single line. Or a single 
> tag could be split between lines.

This is formally correct, but we would pretty precisely know how 
(100 - \epsilon)% of all .lyx files will be formatted because LyX
is more or less the only application that actually writes it.

Andre'


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Richard heck

Steve Litt wrote:

> On Wednesday 23 July 2008 07:00, José Matos wrote:
>
> > XML will not change the current status.
> >
> > grep 'style name=Section' somefile.lyx
> >
> > will still work and it not so different from what we have now. You need the
> > '' already if you have spaces in your expression...
>
> The trouble is, XML tags can be anywhere -- spacing and linefeeds are 
> immaterial. That means you can no longer parse based on position, such as: 
>
> /^begin_layout/
>
> because technically the whole XML file could be in a single line. Or a single 
> tag could be split between lines.

True of course, but in fact the file is likely to be nicely formatted.

rh

> This problem is somewhat lessened by the fact that you could do the following 
> in Vim/ex:
>
> :%s/</\r</g
> :%s/>/>\r/g
> :g/^\s*$/d
>
> I imagine you could do the same thing with sed, I just don't know how. The 
> preceding would put every XML tag on its own line, and eliminate all blank 
> lines, after which you could indeed parse based on linefeeds. The other 
> problem, of course, is that angle brackets within the text would be linefed, 
> which may or may not be a problem depending on the XML dialect you come up 
> with.
>
> SteveT
>
> Steve Litt
> Recession Relief Package
> http://www.recession-relief.US
  




Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Richard heck

Steve Litt wrote:
> Perhaps our best hope of continuing tweakability of native LyX is to create
> 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
> continue to be done in the 1.5.x format.

As always, LyX will have such converters, so old formats can be
imported/exported.


rh



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Richard heck

José Matos wrote:
> That is also the reason why lyx2lyx is nowadays mostly a python library
> (LyX.py) and the script lyx2lyx is just a wrapper around the library.

And let me add that anyone who wants to process LyX files on a regular
basis using external scripts would be well served to learn the basics of
this library. The interface is really very simple once you get the hang
of it.


rh



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Richard heck

Steve Litt wrote:
> At first I'll do them in Ruby because Ruby has all that stuff built in and
> easy to do. Later, depending on performance and the percent of people who
> have Ruby installed, I can convert them to C. There's a C implementation of
> the same YAML parser/emitter that Ruby uses -- Syck. I'm pretty sure there
> are also C or C++ implementations of XML Parsers, although I don't know how
> well they do things like DTD/schema.

At present, it's LyX policy that included things should be in Python,
since we require it anyway.


rh



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote:
  Pavel Sanda wrote:
  Moreover, if you're editing by hand, you can use
  something that recognizes XML.

 of course it will work, but it will take x-times more time.
 quite difference to write sed one-liner or start doing some
 xslt templating.

 pavel

Yeah, I think this was the point I was trying to get across. With the current 
format, you can do a lot with Vim. Or you can run through a series of small 
filters that do just one thing.

XML's a different animal. Without a parser, it's almost impossible to handle. 
With a parser, you're forced to work only within the language of that parser, 
and you're forced to make a monolithic solution that can't take advantage of 
Unix pipes and small executables that do one thing and do it well. You also 
forgo the ability to have a series of intermediate files, each serving as a 
test point to make sure things are still going well.

Also, an XML parser, especially a DOM one, makes READING XML very easy, but it 
does nothing for WRITING.

Pavel -- you and I and others like us need to start identifying parsing tools 
to at least partially compensate for the loss of our Unix based pipes with 
small filter executables. Theoretically, if one could read the XML into a DOM 
tree, tweak it in memory, and then write it back out, that would be at least 
somewhat doable, though nothing like the Awk and Perl techniques I'm used to.
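
A minimal sketch of that read, tweak, write-back round trip, in Python with the standard xml.etree.ElementTree module. The element name `layout` and the attribute `name` are made up for illustration only; no LyX XML dialect had been defined at this point in the thread, so treat the sample document as hypothetical.

```python
import xml.etree.ElementTree as ET


def rename_layout(xml_text, old, new):
    """Parse xml_text, rename matching layouts, and serialize it again."""
    root = ET.fromstring(xml_text)           # read into an in-memory tree
    for elem in root.iter("layout"):         # hypothetical element name
        if elem.get("name") == old:          # hypothetical attribute name
            elem.set("name", new)            # tweak in memory
    return ET.tostring(root, encoding="unicode")  # write back out as text


sample = (
    '<doc><layout name="Section">Intro</layout>'
    '<layout name="Standard">Text</layout></doc>'
)
result = rename_layout(sample, "Section", "Subsection")
print(result)
```

The same function could be wrapped to read stdin and write stdout, so it would still slot into a pipeline as one small filter among others.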

And once again, we need COMPLETE documentation on the XML dialect, and like I 
said, I'm willing to help with that documentation.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote:
 by 'outside' i mean tweakings which i regularly do and watching users list
 power users do that too _and_ are happy about the current simplicity of
 format.

 tweaks like assembling of the whole file for various datasets, global
 changes of things (cf notes-mutate lfun i introduced lately), conversions
 and so on.

This works well for simple things but breaks badly when you try something a 
bit more complex.

 while you are right that xml could be better technology for internal
 lyx parsing (and i can understand your viewpoint as lyx2lyx fan:)
 this was not my mail about.

  It is funny to see all this nostalgia around something that is/was a
  nightmare.

 it has nothing to do with nostalgia, but speed of hacking around.

Not when the resulting file crashes lyx, something that should never happen 
but that does happen now. First make it correct and then make it fast.

XML will not change the current status.

grep 'style name=Section' somefile.lyx

will still work and it is not so different from what we have now. You need the '' 
already if you have spaces in your expression...

 pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
 On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote:
  while you are right that xml could be better technology for internal
  lyx parsing (and i can understand your viewpoint as lyx2lyx fan:)
  this was not my mail about.
 
   It is funny to see all this nostalgia around something that is/was a
   nightmare.
 
  it has nothing to do with nostalgia, but speed of hacking around.
 
 Not when the resulting file crashes lyx, something that should not ever 
 happen 
 but that it does now.

if i've made an incorrect file, it's my fault if lyx crashes. i take
responsibility, no problem.
trial and error is the fastest method if you want something quickly.

 First make it correct and then make it fast.

i have exactly the opposite view as far as the tweaking i was talking about
is concerned; i just need quick output of something, maybe i will throw
it away after a few days.

or take Steve's example - if he takes your 'First make it correct and then
make it fast' it would take some two weeks to invent some beast that is
correct in your sense. but then the whole point is lost, since in that
time he could have done it manually.

i guess we can't agree on this, since i'm not talking about lyx internals,
while your job is to make lyx format conversions on lyx level... but this
is the users list, not the devel one, so i feel free to speak this way :)

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote:
 i've done incorrect file, it's my fault if lyx crashes. i take my
 responsibility, no problem.
 trial method is the fastest if you want something quickly.

If LyX crashes that is a bug. LyX should never crash; it can refuse to 
load a file because it is invalid, or truncate it, but it should never 
crash.

In the whole picture our parser is one of our weak links so we should do 
something about it. Replace it in this case.

  First make it correct and then make it fast.

 i have exactly the opposite view as far as the tweaking i was talking about
 is concerned; i just need quick output of something, maybe i will throw
 it away after a few days.

 or take Steve's example - if he takes your 'First make it correct and then
 make it fast' it would take some two weeks to invent some beast that is
 correct in your sense. but then the whole point is lost, since in that
 time he could have done it manually.

 i guess we can't agree on this, since i'm not talking about lyx internals,
 while your job is to make lyx format conversions on lyx level... but this
 is the users list, not the devel one, so i feel free to speak this way :)

Yes, I know but I can pretend otherwise. ;-)

 pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Manveru
Guys,

Have you even looked at TinyXML?

I had a project once where we used XML as a message passing protocol, and we
used XSLT as a C++ code generator for the classes that handled the XML and
converted it into the data structures holding all the data we needed. This
freed us from portability problems (Little Endian, Big Endian), which is not
the case here. For an application like LyX a binary structure may be better to
handle - certainly much work to do. In our project we did not find any known
DOM useful for our purpose.

Cheers!
M.

2008/7/23 José Matos [EMAIL PROTECTED]:

 On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote:
  i've done incorrect file, it's my fault if lyx crashes. i take my
  responsibility, no problem.
  trial method is the fastest if you want something quickly.

 If LyX crashes that is a bug. LyX should never crash; it can refuse to
 load a file because it is invalid, or truncate it, but it should never
 crash.

 In the whole picture our parser is one of our weak links so we should do
 something about it. Replace it in this case.

   First make it correct and then make it fast.
 
  i have exactly the opposite view as far as the tweaking i was talking about
  is concerned; i just need quick output of something, maybe i will throw
  it away after a few days.
 
  or take Steve's example - if he takes your 'First make it correct and then
  make it fast' it would take some two weeks to invent some beast that is
  correct in your sense. but then the whole point is lost, since in that
  time he could have done it manually.
 
  i guess we can't agree on this, since i'm not talking about lyx internals,
  while your job is to make lyx format conversions on lyx level... but this
  is the users list, not the devel one, so i feel free to speak this way :)

 Yes, I know but I can pretend otherwise. ;-)

  pavel

 --
 José Abílio




-- 
Manveru
jabber: [EMAIL PROTECTED]
gg: 1624001
http://www.manveru.pl


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 18:21, José Matos wrote:

 Clearly you did not have to deal with the lyx file format like I did. :-)
 If your idea of a parser is a set of regexps, that is so 80's. ;-)
[clip]
 It is funny to see all this nostalgia around something that is/was a
 nightmare. If the syntax was so clear you would not have the problem of
 crashing lyx with a badly formed file (a file modified by scripts).

When the discussion reverts to "your thingamabob is from another 
decade/century so it must not be good by today's standards", you know that 
thingamabob is pretty darn good, or else there would have been a more 
powerful argument against it.

First of all, I understand *exactly* why an XML native format is an 
improvement for the LyX application. I'm limiting my point to the concept 
that something old has to be something bad.

Modern things are usually improvements, but often are not improvements in 
quality or usefulness. They can be improvements to profit margin (e.g. most 
MS Windows improvements), or marketing improvements (all the silly little 
expensive features thrown into basic family cars today), or improvements in 
restricting use (DRM), or improvements in price (crummy bicycles from 
Walmart). Sometimes older stuff has more quality or usefulness.

In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the 
philosophy of little executables that do one thing and do it right. Stdin, 
stdout and pipes were the glue language with which these little executables 
could be cascaded to produce a substantial result. This enabled 
logical-thinking non-developers, and also developers, to produce those 
substantial results in an hour, with perhaps the greatest encapsulation 
that's ever been achieved in the computer world. Each little executable has 
one input and one output, each being a measurable test point. For batch 
processes this programming technique is every bit as productive as it was 
39 years ago.

There may be things wrong with awking, seding and perling data into 
submission, but the age of these tools is not one of them.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 07:00, José Matos wrote:

 XML will not change the current status.

 grep 'style name=Section' somefile.lyx

 will still work and it is not so different from what we have now. You need the
 '' already if you have spaces in your expression...

The trouble is, XML tags can be anywhere -- spacing and linefeeds are 
immaterial. That means you can no longer parse based on position, such as: 

/^begin_layout/

because technically the whole XML file could be in a single line. Or a single 
tag could be split between lines.

This problem is somewhat lessened by the fact that you could do the following 
in Vim/ex:

:%s/</\r</g
:%s/>/>\r/g
:g/^\s*$/d

I imagine you could do the same thing with sed, I just don't know how. The 
preceding would put every XML tag on its own line, and eliminate all blank 
lines, after which you could indeed parse based on linefeeds. The other 
problem, of course, is that angle brackets within the text would be linefed, 
which may or may not be a problem depending on the XML dialect you come up 
with.
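
For readers who prefer a script to editor commands, the same tag-per-line preprocessing can be sketched in Python. This is only an illustration: the function name and the sample document are invented here, and like the Vim commands it splits at every angle bracket, including any that occur inside comments, CDATA sections or attribute values. Unlike the Vim version it also trims leading and trailing whitespace on each line.

```python
import re


def one_tag_per_line(xml_text):
    """Put every <...> tag on its own line and drop blank lines."""
    broken = re.sub(r"<", "\n<", xml_text)   # line break before each '<'
    broken = re.sub(r">", ">\n", broken)     # line break after each '>'
    lines = [line.strip() for line in broken.splitlines()]
    return "\n".join(line for line in lines if line)  # skip empty lines


sample = '<doc><layout name="Section">Intro</layout></doc>'
print(one_tag_per_line(sample))
```

After this step a line-anchored pattern such as `^<layout` becomes usable again, at the cost of the caveat about angle brackets in text.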

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote:
  Pavel Sanda wrote:
  Moreover, if you're editing by hand, you can use
  something that recognizes XML.

 of course it will work, but it will take x-times more time.
 quite difference to write sed one-liner or start doing some
 xslt templating.

 pavel

Hi Pavel,

Perhaps our best hope of continuing tweakability of native LyX is to create 
1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can 
continue to be done in the 1.5.x format.

I'm presuming that the LyX developers will create the 1.5.x to XML converter 
so users can upgrade their old docs, and hopefully they would keep that 
converter updated for each new LyX version, so that you and I wouldn't need 
to worry about coding the 1.5.x to XML.

The only thing you and I would have to do is the XML to 1.5.x converter. I'm 
pretty darned good with C, and if necessary I can do C++ (but with a C 
accent). If we pick an XML parser with full schema/dtd capability, that 
doesn't have many dependencies, then if you know how to write 1.5.x, I can 
feed you whatever data is needed to write the 1.5.x.

There's another possibility that I think might be better. Using Ruby with 
REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml) if 
you could help me just a little bit with the return trip (YAML to XML). I 
think this would be EVEN BETTER than 1.5.x, because YAML was made for exactly 
what you and I want to do -- parsing with awk/sed/perl/grep/cut. It would 
also remove our responsibility to support 1.5.x syntax in the 22nd century.

Using YAML for tweaking, I think there may come a time when you and I would 
say "remember when we had to parse that nasty 1.5.x?"

I can begin this project as soon as the developers give me an XML def and an 
XML file. That way, once they actually specify what they're going to do, 
we'll have the technology for the XML-YAML-XML round trip, and only the 
details will require coding.

What do you think?

StevET

Steve Litt
Recession Relief Package
http://www.recession-relief.US


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Abdelrazak Younes

Steve Litt wrote:
> Perhaps our best hope of continuing tweakability of native LyX is to create
> 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
> continue to be done in the 1.5.x format.
>
> I'm presuming that the LyX developers will create the 1.5.x to XML converter
> so users can upgrade their old docs, and hopefully they would keep that
> converter updated for each new LyX version, so that you and I wouldn't need
> to worry about coding the 1.5.x to XML.

Yes, switching to XML doesn't mean abandoning lyx2lyx. The difference is 
that we will be able to use simpler XSL templates for the conversion. 
The advantage is that the XSL templates will be available to all, not 
being specific to python or lyx2lyx.

By the way, the switch to XML is not going to happen with 1.6 but with 
1.7, that is at least one year from now ;-)

> The only thing you and I would have to do is the XML to 1.5.x converter.

This will be provided by lyx2lyx too. 1.7-XML will export to all 1.x 
formats with x <= 6.

> I'm
> pretty darned good with C, and if necessary I can do C++ (but with a C
> accent). If we pick an XML parser with full schema/dtd capability, that
> doesn't have many dependencies, then if you know how to write 1.5.x, I can
> feed you whatever data is needed to write the 1.5.x.

As I said above, this 1.7 to 1.6 will be supported via a simple XSL 
stylesheet. It's really the other direction, 1.6 to 1.7, that will be 
difficult to implement.
But hey, all help is welcome. The development of 1.7 is going to begin 
in a couple of months, so if you want to have a say in the new XML 
format, come along on the devel list ;-)


Abdel.



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:33:16 Steve Litt wrote:
 The trouble is, XML tags can be anywhere -- spacing and linefeeds are
 immaterial. That means you can no longer parse based on position, such as:

 /^begin_layout/

 because technically the whole XML file could be in a single line. Or a
 single tag could be split between lines.

Since we control the format I am (almost) sure that we will choose a reader 
friendly output. There is no reason to do otherwise. In terms of size a blank 
or a newline are equivalent, so... :-)

That is why it will be business as usual. :-)
Not much will change in this regard.

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:20:59 Steve Litt wrote:
 When the discussion reverts to your thingamabob is from another
 decade/century so it must not be good by today's standards, you know that
 thingamabob is pretty darn good, or else there would have been a more
 powerful argument against it.

Pavel is a developer just as I am. In this thread we have been teasing each other 
over this issue. In such cases this is an acceptable argument (IMO). ;-)

 First of all, I understand *exactly* why an XML native format is an
 improvement for the LyX application. I'm limiting my point to the concept
 that something old has to be something bad.

That is fair. :-)

 Modern things are usually improvements, but often are not improvements in
 quality or usefulness. They can be improvements to profit margin (e.g. most
 MS Windows improvements), or marketing improvements (all the silly little
 expensive features thrown into basic family cars today), or improvements in
 restricting use (DRM), or improvements in price (crummy bicycles from
 Walmart). Sometimes older stuff has more quality or usefulness.

All that is true, but in this case the lyx file format, and indirectly the lyx 
parser, had not been changed for a long time until 2002, not because they were 
perfect but because most developers were afraid to touch and break them. The 
format had been evolving over time and it was a mess, with places where 
whitespace was significant and others where it was for no good reason.

 In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the
 philosophy of little executables that do one thing and do it right. Stdin,
 stdout and pipes were the glue language with which these little executables
 could be cascaded to produce a substantial result. This enabled
 logical-thinking non-developers, and also developers, to produce those
 substantial results in an hour, with perhaps the greatest encapsulation
 that's ever been achieved in the computer world. Each little executable has
 one input and one output, each being a measurable test point. For batch
 processes this programming technique is every bit as productive as it was
 39 years ago.

lyx2lyx, which lyx uses to convert between the different file formats, works 
on this principle: it acts as a filter, reading from stdin and writing the 
transformation to stdout.
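
The filter shape described here can be sketched in a few lines of Python. This is not lyx2lyx code and does not use its API; the function name is invented, and the only format knowledge assumed is that a layout starts with a `\begin_layout` line, as discussed earlier in the thread.

```python
import io
import sys


def rename_layout_filter(src, dst, old, new):
    """Copy src to dst line by line, renaming one layout on the way."""
    target = "\\begin_layout " + old
    for line in src:
        if line.rstrip("\n") == target:
            dst.write("\\begin_layout " + new + "\n")
        else:
            dst.write(line)               # everything else passes through


# Demonstration on an in-memory stream instead of a real pipe.
sample = "\\begin_layout Section\nIntro\n\\end_layout\n"
out = io.StringIO()
rename_layout_filter(io.StringIO(sample), out, "Section", "Subsection")
sys.stdout.write(out.getvalue())
```

Passing `sys.stdin` and `sys.stdout` instead of the in-memory streams turns the same function into a pipe-friendly command.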

Yet until now there has been no good way for an external program (script) 
other than lyx to check the validity of a lyx file. For me, at least, this is 
a serious shortcoming of our file format.
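
With an XML format, at least a basic external check becomes cheap. The sketch below uses Python's standard xml.etree.ElementTree parser and only tests well-formedness; it does not validate against a DTD or schema, which would need a separate tool. The sample documents and the `layout` element are hypothetical, since the LyX XML dialect did not exist yet.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = '<doc><layout name="Section">Intro</layout></doc>'
bad = '<doc><layout name="Section">Intro</doc>'   # layout is never closed
print(is_well_formed(good))
print(is_well_formed(bad))
```

A script that edits a file could run such a check on its own output before handing the file to LyX.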

 There may be things wrong with awking, seding and perling data into
 submission, but the age of these tools is not one of them.

If you add there the coreutils, like tail, cut, paste, merge and so on we can 
do things that spreadsheet programs can only dream of like processing Gigs of 
data with thousands of lines and columns. :-)

 SteveT

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:58:56 Steve Litt wrote:
 Hi Pavel,

 Perhaps our best hope of continuing tweakability of native LyX is to create
 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
 continue to be done in the 1.5.x format.

I would advise against such a practice. I hope to explain why in the paragraphs 
below.

 I'm presuming that the LyX developers will create the 1.5.x to XML
 converter so users can upgrade their old docs, and hopefully they would
 keep that converter updated for each new LyX version, so that you and I
 wouldn't need to worry about coding the 1.5.x to XML.

Note that the conversion to xml will only happen after 1.6. I know that your 
argument remains unchanged with this shift; I just want to correct this before 
continuing.

That said, lyx2lyx will be able to convert from pre-xml to xml and vice 
versa.

Our previous experience suggests, however, that while the forward translation is 
complete, the backwards translation sometimes results in truncation or in lots 
of ERT being added to preserve the same structure.

For several reasons a transformation from X to X+1 and back again is not 
guaranteed to give the same document bit for bit. Note also that this is not an 
easy task in any way.

The next question is: why do we need to manipulate lyx files with awk and 
friends? Is this not something that can and should be done by lyx?

I have generated lyx files with scripts that have been used in my PhD thesis 
(almost 40 pages were generated like this) so I can recognize advantages in 
manipulating lyx files with scripts, but in that case there are better tools 
than awk and sed.

That is also the reason why lyx2lyx is nowadays mostly a python library 
(LyX.py) and the script lyx2lyx is just a wrapper around the library.

 The only thing you and I would have to do is the XML to 1.5.x converter.
 I'm pretty darned good with C, and if necessary I can do C++ (but with a C
 accent). If we pick an XML parser with full schema/dtd capability, that
 doesn't have many dependencies, then if you know how to write 1.5.x, I can
 feed you whatever data is needed to write the 1.5.x.

 There's another possibility that I think might be better. Using Ruby with
 REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml)
 if you could help me just a little bit with the return trip (YAML to XML).
 I think this would be EVEN BETTER than 1.5.x, because YAML was made for
 exactly what you and I want to do -- parsing with awk/sed/perl/grep/cut. It
 would also remove our responsibility to support 1.5.x syntax in the 22nd
 century.

 Using YAML for tweaking, I think there may come a time when you and I would
 say remember when we had to parse that nasty 1.5.x?

 I can begin this project as soon as the developers give me an XML def and
 an XML file. That way, once they actually specify what they're going to do,
 we'll have the technology for the XML-YAML-XML round trip, and only the
 details will require coding.

 What do you think?

 StevET

You are welcome both to tell us your requirements around the future xml file 
format and to help us so that in the end we all have a better lyx. Really, all 
help is welcome.

 Steve Litt
 Recession Relief Package
 http://www.recession-relief.US

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 14:49:12 Manveru wrote:
 Guys,

 Have you even looked at TinyXML?

Thanks for the link. :-)
-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 11:21, José Matos wrote:

  There may be things wrong with awking, seding and perling data into
  submission, but the age of these tools is not one of them.

 If you add there the coreutils, like tail, cut, paste, merge and so on we
 can do things that spreadsheet programs can only dream of like processing
 Gigs of data with thousands of lines and columns. :-)

:-)  :-)  :-)

Check this out:

http://www.troubleshooters.cxm/lpm/200801/200801.htm

http://www.troubleshooters.cxm/lpm/200802/200802.htm


But seriously -- it's obvious that for the LyX application itself, XML is by 
far the best way to go, and I would never suggest rewriting LyX in awk :-). 
My interest is in quick writes/tweaks of LyX native format files in order to 
do things that LyX isn't equipped to do, like my VimOutliner to LyX script.

STeveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 11:05, José Matos wrote:
 On Wednesday 23 July 2008 15:33:16 Steve Litt wrote:
  The trouble is, XML tags can be anywhere -- spacing and linefeeds are
  immaterial. That means you can no longer parse based on position, such
  as:
 
  /^begin_layout/
 
  because technically the whole XML file could be in a single line. Or a
  single tag could be split between lines.

 Since we control the format I am (almost) sure that we will choose a reader
 friendly output. There is no reason to do otherwise. In terms of size a
 blank or a newline are equivalent, so... :-)

 That is why it will be business as usual. :-)
 Not much will change in this regard.

Thanks José,

As a sed/awk/perl/ruby parser, I appreciate that very much.

The more I think about it, the more I think I should make the XML-YAML and 
YAML-XML converters. That way, if future generations of LyX project 
programmers forget why it's important to space their XML just so, it won't 
matter. Also, I have a feeling that YAML will be much easier to parse than 
either 1.5.x or XML.

The way I envision it, these two converters will be simple standalone commands 
implemented as filters (convert stdin to stdout), very few dependencies. They 
will comply with the Unix Philosophy (little apps that do one thing and do it 
well). Trivial to install. They will be simple enough to be maintained by one 
person. 

They will be encapsulated. They won't need to know about LyX other than its 
XML format, and LyX won't need to know about them. They can be included in 
the LyX distribution, or not.

At first I'll do them in Ruby because Ruby has all that stuff built in and 
easy to do. Later, depending on performance and the percent of people who 
have Ruby installed, I can convert them to C. There's a C implementation of 
the same YAML parser/emitter that Ruby uses -- Syck. I'm pretty sure there 
are also C or C++ implementations of XML Parsers, although I don't know how 
well they do things like DTD/schema.

Thanks

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
 Perhaps our best hope of continuing tweakability of native LyX is to create 
 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can 
 continue to be done in the 1.5.x format.

as others have written, 1.6 is still ok. for assembling lyx files you can still
produce what you want in 1.6 format and lyx2lyx will convert it to 1.7 for you, etc.

next possibility is to stick with 1.6 as long as possible :)

 The only thing you and I would have to do is the XML to 1.5.x converter. I'm 

this will be part of the fileformat transition in lyx itself. moreover xml 
is not my religion, so i will try to keep myself as far as possible from any
xml related coding :D

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
 The next question is why do we need to manipulate lyx files with awk and 
 friends? Is not there something that can should be done by lyx?

search and replace is one of the weak parts of lyx, and even if we get Tommaso
one day to put his stuff in, there are so many places where it's of no help.
just look at things like notes-mutate or graphics settings synchronization;
other unimplemented things come to my mind.

 I have generated lyx files with scripts that have been used in my PhD thesis 
 (almost 40 pages were generated like this) so I can recognize advantages in 
 manipulating lyx files with scripts, but in that case there are better tools 
 than awk and sed.

this depends on what you master. i'm used to the bunch of small unix utilities
so i gave that sed example. if you know python you will do it in python. my point
was not to propose the best tools but to groan and moan about xml :)

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote:
> > Pavel Sanda wrote:
> > Moreover, if you're editing by hand, you can use
> > something that recognizes XML.
>
> of course it will work, but it will take x times longer.
> it's quite a difference whether you write a sed one-liner or start
> doing some xslt templating.
>
> pavel

Yeah, I think this was the point I was trying to get across. With the current 
format, you can do a lot with Vim. Or you can run through a series of small 
filters that do just one thing.

XML's a different animal. Without a parser, it's almost impossible to handle. 
With a parser, you're forced to work only within the language of that parser, 
and you're forced to make a monolithic solution that can't take advantage of 
Unix pipes and small executables that do one thing and do it well. You also 
forgo the ability to have a series of intermediate files, each serving as a 
test point to make sure things are still going well.

Also, an XML parser, especially a DOM one, makes READING XML very easy, but it 
does nothing for WRITING.

Pavel -- you and I and others like us need to start identifying parsing tools 
to at least partially compensate for the loss of our Unix based pipes with 
small filter executables. Theoretically, if one could read the XML into a DOM 
tree, tweak it in memory, and then write it back out, that would be at least 
somewhat doable, though nothing like the Awk and Perl techniques I'm used to.
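A minimal sketch of that read, tweak, write-back cycle, using Python's standard xml.etree.ElementTree module. The element and attribute names (document, layout, name) are invented for the example, since no LyX XML dialect has been specified yet.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the real LyX XML format is not defined yet.
SAMPLE = (
    '<document>'
    '<layout name="Section">Intro</layout>'
    '<layout name="Standard">Text</layout>'
    '</document>'
)


def rename_layout(xml_text, old, new):
    """Parse XML into a tree, rename matching layouts, serialize it again."""
    root = ET.fromstring(xml_text)
    for elem in root.iter("layout"):
        if elem.get("name") == old:
            elem.set("name", new)
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(rename_layout(SAMPLE, "Section", "Subsection"))
```

Because the function takes text in and gives text out, it can still be wrapped as one small step in a pipeline, with the intermediate output saved as a test point.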

And once again, we need COMPLETE documentation on the XML dialect, and, like I 
said, I'm willing to help with that documentation.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote:
> by 'outside' i mean the tweaks which i regularly do; watching the users
> list, power users do that too _and_ are happy about the current simplicity
> of the format.
>
> tweaks like assembling of the whole file for various datasets, global
> changes of things (cf notes-mutate lfun i introduced lately), conversions
> and so on.

This works well for simple things but breaks badly when you try something a 
bit more complex.

> while you are right that xml could be a better technology for internal
> lyx parsing (and i can understand your viewpoint as a lyx2lyx fan:)
> this was not what my mail was about.
>
> > It is funny to see all this nostalgia around something that is/was a
> > nightmare.
>
> it has nothing to do with nostalgia, but speed of hacking around.

Not when the resulting file crashes lyx, something that should never happen 
but that does happen now. First make it correct and then make it fast.

XML will not change the current status.

grep 

Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Pavel Sanda
> On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote:
> > while you are right that xml could be a better technology for internal
> > lyx parsing (and i can understand your viewpoint as a lyx2lyx fan:)
> > this was not what my mail was about.
> >
> > > It is funny to see all this nostalgia around something that is/was a
> > > nightmare.
> >
> > it has nothing to do with nostalgia, but speed of hacking around.
> 
> Not when the resulting file crashes lyx, something that should never 
> happen but that does happen now.

i've made an incorrect file, it's my fault if lyx crashes. i take
responsibility, no problem.
trial and error is the fastest method if you want something quickly.

> First make it correct and then make it fast.

i have exactly the opposite view as far as the tweaking i was talking about
is concerned; i just need quick output of something, maybe i will throw
it away after a few days.

or take Steve's example - if he takes your 'First make it correct and then
make it fast' it would take some two weeks to invent some beast that is 
correct in your sense. but then the whole point is lost, since in that
time he could have done it manually.

i guess we can't agree on this, since i'm not talking about lyx internals,
while your job is to make lyx format conversions on lyx level... but this
is the users list, not the devel one, so i feel free to speak this way :)

pavel


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote:
> i've made an incorrect file, it's my fault if lyx crashes. i take
> responsibility, no problem.
> trial and error is the fastest method if you want something quickly.

If LyX crashes, that is a bug. LyX should never crash: it can refuse to 
load a file because it is invalid, or truncate it, but it should never 
crash.

In the whole picture our parser is one of our weak links so we should do 
something about it. Replace it in this case.

> > First make it correct and then make it fast.
>
> i have exactly the opposite view as far as the tweaking i was talking about
> is concerned; i just need quick output of something, maybe i will throw
> it away after a few days.
>
> or take Steve's example - if he takes your 'First make it correct and then
> make it fast' it would take some two weeks to invent some beast that is
> correct in your sense. but then the whole point is lost, since in that
> time he could have done it manually.
>
> i guess we can't agree on this, since i'm not talking about lyx internals,
> while your job is to make lyx format conversions on lyx level... but this
> is the users list, not the devel one, so i feel free to speak this way :)

Yes, I know but I can pretend otherwise. ;-)

> pavel

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Manveru
Guys,

Have you even looked at TinyXML?

I once had a project where we used XML as a message passing protocol, with
XSLT as a C++ code generator for the classes that handled the XML and
converted it into the data structures holding all the data we needed. This
freed us from portability problems (Little Endian, Big Endian), which is not
the case here. For an application like LyX a binary structure may be better
to handle, though it is certainly a lot of work. In our project we did not
find any known DOM useful for our purpose.

Cheers!
M.

2008/7/23 José Matos <[EMAIL PROTECTED]>:

> On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote:
> > i've made an incorrect file, it's my fault if lyx crashes. i take
> > responsibility, no problem.
> > trial and error is the fastest method if you want something quickly.
>
> If LyX crashes, that is a bug. LyX should never crash: it can refuse to
> load a file because it is invalid, or truncate it, but it should never
> crash.
>
> In the whole picture our parser is one of our weak links so we should do
> something about it. Replace it in this case.
>
> > > First make it correct and then make it fast.
> >
> > i have exactly the opposite view as far as the tweaking i was talking
> > about is concerned; i just need quick output of something, maybe i will
> > throw it away after a few days.
> >
> > or take Steve's example - if he takes your 'First make it correct and
> > then make it fast' it would take some two weeks to invent some beast
> > that is correct in your sense. but then the whole point is lost, since
> > in that time he could have done it manually.
> >
> > i guess we can't agree on this, since i'm not talking about lyx
> > internals, while your job is to make lyx format conversions on lyx
> > level... but this is the users list, not the devel one, so i feel free
> > to speak this way :)
>
> Yes, I know but I can pretend otherwise. ;-)
>
> > pavel
>
> --
> José Abílio
>



-- 
Manveru
jabber: [EMAIL PROTECTED]
gg: 1624001
http://www.manveru.pl


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 18:21, José Matos wrote:

> Clearly you did not have to deal with the lyx file format like I did. :-)
> If your idea of a parser is a set of regexp's that is so 80's. ;-)
[clip]
> It is funny to see all this nostalgia around something that is/was a
> nightmare. If the syntax was so clear you would not have the problem of
> crashing lyx with a badly formed file (a file modified by scripts).

When the discussion reverts to "your thingamabob is from another 
decade/century so it must not be good by today's standards", you know that 
thingamabob is pretty darn good, or else there would have been a more 
powerful argument against it.

First of all, I understand *exactly* why an XML native format is an 
improvement for the LyX application. I'm limiting my point to the concept 
that something old has to be something bad.

Modern things are usually improvements, but often are not improvements in 
quality or usefulness. They can be improvements to profit margin (e.g. most 
MS Windows "improvements"), or marketing improvements (all the silly little 
expensive features thrown into basic family cars today), or improvements in 
restricting use (DRM), or improvements in price (crummy bicycles from 
Walmart). Sometimes older stuff has more quality or usefulness.

In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the 
philosophy of little executables that do one thing and do it right. Stdin, 
stdout and pipes were the glue language with which these little executables 
could be cascaded to produce a substantial result. This enabled 
logical-thinking non-developers, and also developers, to produce those 
substantial results in an hour, with perhaps the greatest encapsulation 
that's ever been achieved in the computer world. Each little executable has 
one input and one output, each being a measurable test point. For batch 
processes this "programming" technique is every bit as productive as it was 
39 years ago.

There may be things wrong with awking, seding and perling data into 
submission, but the age of these tools is not one of them.
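As a concrete picture of one such small filter, here is a hedged Python sketch. It relies only on the 1.5.x convention, mentioned earlier in this thread, that \begin_layout starts a line; the function names and the Section to Subsection tweak are made up for the example.

```python
import sys


def demote_sections(lines):
    """Yield lines unchanged, except Section layouts become Subsection."""
    for line in lines:
        if line.rstrip("\n") == "\\begin_layout Section":
            yield "\\begin_layout Subsection\n"
        else:
            yield line


def run(stream_in=sys.stdin, stream_out=sys.stdout):
    """One input, one output: read from stream_in, write to stream_out."""
    stream_out.writelines(demote_sections(stream_in))
```

Calling run() at the end of a script turns this into a filter that can sit in the middle of a pipeline, and the output of each stage can be saved to a file and inspected as a test point.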

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 07:00, José Matos wrote:

> XML will not change the current status.
>
> grep 

Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote:
> > Pavel Sanda wrote:
> > Moreover, if you're editing by hand, you can use
> > something that recognizes XML.
>
> of course it will work, but it will take x times longer.
> it's quite a difference whether you write a sed one-liner or start
> doing some xslt templating.
>
> pavel

Hi Pavel,

Perhaps our best hope of continuing tweakability of native LyX is to create 
1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can 
continue to be done in the 1.5.x format.

I'm presuming that the LyX developers will create the 1.5.x to XML converter 
so users can upgrade their old docs, and hopefully they would keep that 
converter updated for each new LyX version, so that you and I wouldn't need 
to worry about coding the 1.5.x to XML.

The only thing you and I would have to do is the XML to 1.5.x converter. I'm 
pretty darned good with C, and if necessary I can do C++ (but with a C 
accent). If we pick an XML parser with full schema/dtd capability, that 
doesn't have many dependencies, then if you know how to write 1.5.x, I can 
feed you whatever data is needed to write the 1.5.x.

There's another possibility that I think might be better. Using Ruby with 
REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml) if 
you could help me just a little bit with the return trip (YAML to XML). I 
think this would be EVEN BETTER than 1.5.x, because YAML was made for exactly 
what you and I want to do -- parsing with awk/sed/perl/grep/cut. It would 
also remove our responsibility to support 1.5.x syntax in the 22nd century.

Using YAML for tweaking, I think there may come a time when you and I would 
say "remember when we had to parse that nasty 1.5.x?"

I can begin this project as soon as the developers give me an XML def and an 
XML file. That way, once they actually specify what they're going to do, 
we'll have the technology for the XML->YAML->XML round trip, and only the 
details will require coding.

What do you think?

StevET

Steve Litt
Recession Relief Package
http://www.recession-relief.US


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Abdelrazak Younes

Steve Litt wrote:

Perhaps our best hope of continuing tweakability of native LyX is to create
1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
continue to be done in the 1.5.x format.

I'm presuming that the LyX developers will create the 1.5.x to XML converter
so users can upgrade their old docs, and hopefully they would keep that
converter updated for each new LyX version, so that you and I wouldn't need
to worry about coding the 1.5.x to XML.


Yes, switching to XML doesn't mean abandoning lyx2lyx. The difference is 
that we will be able to use simpler XSL templates for the conversion. 
The advantage is that the XSL templates will be available to all, rather 
than being specific to Python or lyx2lyx.


By the way, the switch to XML is not going to happen with 1.6 but with 
1.7, that is at least one year from now ;-)




The only thing you and I would have to do is the XML to 1.5.x converter.


This will be provided by lyx2lyx too. 1.7-XML will export to all 1.x 
formats with x <= 6.



I'm
pretty darned good with C, and if necessary I can do C++ (but with a C
accent). If we pick an XML parser with full schema/dtd capability, that
doesn't have many dependencies, then if you know how to write 1.5.x, I can
feed you whatever data is needed to write the 1.5.x.


As I said above, the 1.7 to 1.6 conversion will be supported via a simple 
XSL stylesheet. It's really the other direction, 1.6 to 1.7, that will be 
difficult to implement.
But hey, all help is welcome, the development of 1.7 is going to begin 
in a couple of months so if you want to have a say in the new XML 
format, come along on the devel list ;-)


Abdel.



Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:33:16 Steve Litt wrote:
> The trouble is, XML tags can be anywhere -- spacing and linefeeds are
> immaterial. That means you can no longer parse based on position, such as:
>
> /^begin_layout/
>
> because technically the whole XML file could be in a single line. Or a
> single tag could be split between lines.

Since we control the format I am (almost) sure that we will choose a reader 
friendly output. There is no reason to do otherwise. In terms of size a blank 
or a newline are equivalent, so... :-)

That is why it will be business as usual. :-)
Not much will change in this regard.
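To illustrate the point about reader-friendly output, here is a small Python sketch using the standard xml.dom.minidom module: once each element sits on its own line, a plain substring match per line behaves much like grep. The sample document and its element names are invented, since the future format is not defined.

```python
from xml.dom import minidom

# Hypothetical one-line document; the real format is still undecided.
SAMPLE = (
    '<document>'
    '<layout name="Section">Intro</layout>'
    '<layout name="Standard">Text</layout>'
    '</document>'
)


def grep_lines(xml_text, needle):
    """Pretty-print the XML, then return the lines containing needle."""
    pretty = minidom.parseString(xml_text).toprettyxml(indent="  ")
    return [line.strip() for line in pretty.splitlines() if needle in line]


if __name__ == "__main__":
    for line in grep_lines(SAMPLE, 'layout name="Section"'):
        print(line)
```

If the writer already emits one element per line, the pretty-printing step is unnecessary and grep on the file itself gives the same result.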

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:20:59 Steve Litt wrote:
> When the discussion reverts to "your thingamabob is from another
> decade/century so it must not be good by today's standards", you know that
> thingamabob is pretty darn good, or else there would have been a more
> powerful argument against it.

Pavel is a developer just as I am. In this thread we have been teasing each 
other over this issue. In such cases this is an acceptable argument (IMO). ;-)

> First of all, I understand *exactly* why an XML native format is an
> improvement for the LyX application. I'm limiting my point to the concept
> that something old has to be something bad.

That is fair. :-)

> Modern things are usually improvements, but often are not improvements in
> quality or usefulness. They can be improvements to profit margin (e.g. most
> MS Windows "improvements"), or marketing improvements (all the silly little
> expensive features thrown into basic family cars today), or improvements in
> restricting use (DRM), or improvements in price (crummy bicycles from
> Walmart). Sometimes older stuff has more quality or usefulness.

All that is true, but in this case the lyx file format, and indirectly the lyx 
parser, had not been changed for a long time until 2002, not because they were 
perfect but because most developers were afraid to touch and break them. The 
format had been evolving over time and it was a mess, with places where 
whitespaces were significant and others where they were for no good reason.

> In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the
> philosophy of little executables that do one thing and do it right. Stdin,
> stdout and pipes were the glue language with which these little executables
> could be cascaded to produce a substantial result. This enabled
> logical-thinking non-developers, and also developers, to produce those
> substantial results in an hour, with perhaps the greatest encapsulation
> that's ever been achieved in the computer world. Each little executable has
> one input and one output, each being a measurable test point. For batch
> processes this "programming" technique is every bit as productive as it was
> 39 years ago.

lyx2lyx, which lyx uses to convert between the different file formats, works 
on this principle: it acts as a filter, reading from stdin and writing the 
transformation to stdout.

Yet so far there is no good way for an external program (script) other than 
lyx to check the validity of a lyx file. For me, at least, this is a strong 
shortcoming of our file format.
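With an XML-based format, a first-level check from an external script becomes straightforward. The sketch below uses Python's standard xml.etree.ElementTree and only tests well-formedness; checking validity against a DTD or schema would need more than this, and no such schema exists yet for LyX.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Malformed input is reported instead of crashing the caller.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<a><b/></a>"))
    print(is_well_formed("<a><b></a>"))
```

A script that rewrites a file could run such a check on its own output before handing the file to lyx.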

> There may be things wrong with awking, seding and perling data into
> submission, but the age of these tools is not one of them.

If you add there the coreutils, like tail, cut, paste, merge and so on we can 
do things that spreadsheet programs can only dream of like processing Gigs of 
data with thousands of lines and columns. :-)

> SteveT

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 15:58:56 Steve Litt wrote:
> Hi Pavel,
>
> Perhaps our best hope of continuing tweakability of native LyX is to create
> 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can
> continue to be done in the 1.5.x format.

I would advise against such a practice. I hope to explain why in the 
paragraphs below.

> I'm presuming that the LyX developers will create the 1.5.x to XML
> converter so users can upgrade their old docs, and hopefully they would
> keep that converter updated for each new LyX version, so that you and I
> wouldn't need to worry about coding the 1.5.x to XML.

Note that the conversion to xml will only happen after 1.6. I know that your 
argument remains unchanged with this shift; I just want to correct this before 
continuing.

With this said, lyx2lyx will be able to convert from pre-xml to xml and 
vice-versa.

Our previous experience suggests, however, that while the forward translation 
is complete, the backwards translation sometimes results in truncation or in 
lots of ERT being added to preserve the same structure.

For several reasons a transformation from X to X+1 and back again is not 
guaranteed to give the same document bit by bit. Note also that this is not an 
easy task in any way.

The next question is why do we need to manipulate lyx files with awk and 
friends? Isn't that something that can, or should, be done by lyx?

I have generated lyx files with scripts that have been used in my PhD thesis 
(almost 40 pages were generated like this) so I can recognize advantages in 
manipulating lyx files with scripts, but in that case there are better tools 
than awk and sed.

That is also the reason why lyx2lyx is nowadays mostly a python library 
(LyX.py) and the script lyx2lyx is just a wrapper around the library.

> The only thing you and I would have to do is the XML to 1.5.x converter.
> I'm pretty darned good with C, and if necessary I can do C++ (but with a C
> accent). If we pick an XML parser with full schema/dtd capability, that
> doesn't have many dependencies, then if you know how to write 1.5.x, I can
> feed you whatever data is needed to write the 1.5.x.
>
> There's another possibility that I think might be better. Using Ruby with
> REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml)
> if you could help me just a little bit with the return trip (YAML to XML).
> I think this would be EVEN BETTER than 1.5.x, because YAML was made for
> exactly what you and I want to do -- parsing with awk/sed/perl/grep/cut. It
> would also remove our responsibility to support 1.5.x syntax in the 22nd
> century.
>
> Using YAML for tweaking, I think there may come a time when you and I would
> say "remember when we had to parse that nasty 1.5.x?"
>
> I can begin this project as soon as the developers give me an XML def and
> an XML file. That way, once they actually specify what they're going to do,
> we'll have the technology for the XML->YAML->XML round trip, and only the
> details will require coding.
>
> What do you think?
>
> StevET

You are welcome both to tell us your requirements around the future xml file 
format and to help us so that in the end we all have a better lyx. Really, all 
help is welcome.

> Steve Litt
> Recession Relief Package
> http://www.recession-relief.US

-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread José Matos
On Wednesday 23 July 2008 14:49:12 Manveru wrote:
> Guys,
>
> Have you even looked at TinyXML?

Thanks for the link. :-)
-- 
José Abílio


Re: Progress on the MS Word to LyX conversion (xml)

2008-07-23 Thread Steve Litt
On Wednesday 23 July 2008 11:21, José Matos wrote:

> > There may be things wrong with awking, seding and perling data into
> > submission, but the age of these tools is not one of them.
>
> If you add there the coreutils, like tail, cut, paste, merge and so on we
> can do things that spreadsheet programs can only dream of like processing
> Gigs of data with thousands of lines and columns. :-)

:-)  :-)  :-)

Check this out:

http://www.troubleshooters.cxm/lpm/200801/200801.htm

http://www.troubleshooters.cxm/lpm/200802/200802.htm


But seriously -- it's obvious that for the LyX application itself, XML is by 
far the best way to go, and I would never suggest rewriting LyX in awk :-). 
My interest is in quick writes/tweaks of LyX native format files in order to 
do things that LyX isn't equipped to do, like my VimOutliner to LyX script.

STeveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US


