Re: Progress on the MS Word to LyX conversion (xml)
Michael Wojcik wrote: I don't expect the switch to XML to cause me any problems, and to be honest I'm a bit puzzled by all the worrying. /me too :-) Abdel.
Re: Progress on the MS Word to LyX conversion (xml)
Steve Litt wrote: Trouble is, replacing \begin..\end with <>... is a hack. LyX developers have defined LyX native format as \begin always is the first character on a line. There's no such requirement in XML, and if we require it, that's a hack. If we don't require it, LyX-XML parsing becomes a whole new level of difficulty. It's not hard at all, with an XML parser. Actually, putting all XML elements on their own lines, with or without leading whitespace, can be done with a DFA (or anything equivalent, such as a regular expression); you don't even need a full-strength parser. If you want elements all on their own lines, pre-processing with a quick sed script would do that for you. I'm a toolsmith myself, and I write lots of tools, in lots of languages, for pre- and post-processing various file formats. I don't expect the switch to XML to cause me any problems, and to be honest I'm a bit puzzled by all the worrying. -- Michael Wojcik Micro Focus Rhetoric & Writing, Michigan State University
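A rough sketch of the "regular expression is enough" point above, in Python since LyX already requires it. The element names (`layout`, `inset`) are made up for illustration, because no LyX XML format has been defined yet. The regex assumes that no attribute value contains `>` and that there are no comments, CDATA sections or processing instructions, so treat it as a pre-processing hack rather than a parser.

```python
import re

# One XML tag: start tag, end tag, or empty-element tag.
# Assumption: no '>' inside attribute values, no comments/CDATA/PIs.
TAG = re.compile(r"<[^<>]+>")

def one_tag_per_line(xml_text: str) -> str:
    """Return xml_text with every tag moved onto its own line."""
    # Surround each tag with newlines, then drop the empty lines and the
    # leading/trailing whitespace that this creates.
    spaced = TAG.sub(lambda m: "\n" + m.group(0) + "\n", xml_text)
    lines = (line.strip() for line in spaced.splitlines())
    return "\n".join(line for line in lines if line)

# Hypothetical single-line document; the element names are illustrative only.
sample = '<layout name="Standard">this.</layout><layout name="Standard"><inset type="bibtex"/></layout>'
print(one_tag_per_line(sample))
```

Note that stripping whitespace around text changes the character data, which is harmless for line-based inspection with grep or awk but not something to write back into a document.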
Re: Progress on the MS Word to LyX conversion (xml)
Manveru wrote: Have you ever merge XML? I tried - it is horrible work. It depends entirely on how the XML document is formatted. There's nothing that prevents XML with sensible line breaks, for example. I keep lots of XHTML documents in CVS. They're well-formatted, so merging works just fine. -- Michael Wojcik Micro Focus Rhetoric & Writing, Michigan State University
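To illustrate why the formatting matters for merging, here is a small sketch using Python's standard `difflib`. The two tiny documents are invented for the example. With one element per line, a line-based tool sees only the changed paragraph; with everything on a single line, the whole document counts as changed, which is where merge conflicts come from.

```python
import difflib

def changed_lines(old: str, new: str) -> list[str]:
    """Return only the removed ('- ') and added ('+ ') lines between old and new."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [d for d in diff if d.startswith(("- ", "+ "))]

# Same content, two layouts (hypothetical element names).
old_pretty = "<doc>\n<p>first</p>\n<p>second</p>\n</doc>"
new_pretty = "<doc>\n<p>first</p>\n<p>2nd</p>\n</doc>"
old_flat = old_pretty.replace("\n", "")
new_flat = new_pretty.replace("\n", "")

print(changed_lines(old_pretty, new_pretty))  # only the edited paragraph
print(changed_lines(old_flat, new_flat))      # the entire document, twice
```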
Re: Progress on the MS Word to LyX conversion (xml)
John McCabe-Dansted wrote: On Fri, Jul 25, 2008 at 4:43 PM, Manveru<[EMAIL PROTECTED]> wrote: To the discussion about data format preference: I am reading all your comments about XML, YAML and other suggested data formats. And this discussion reminds me something about XML what almost nobody is remeber about. How many LyX user are working in large team projects? How often they have to merge text files from different branches? Have you ever merge XML? I tried - it is horrible work. I don't see why it would be harder if we "just replace \begin...\end with<>...". I think LyX cannot exist with XML data format without build-in document merge functionality. This would be nice in any case. Shameless plug: http://www.lyx.org/Donate#sponsorship Abdel.
Re: Progress on the MS Word to LyX conversion (xml)
G. Milde wrote: On 28.07.08, Steve Litt wrote: On Monday 28 July 2008 01:10, John McCabe-Dansted wrote: On Fri, Jul 25, 2008 at 4:43 PM, Manveru<[EMAIL PROTECTED]> wrote: To the discussion about data format preference: ... Have you ever merged XML? I tried - it is horrible work. I don't see why it would be harder if we "just replace \begin...\end with<>...". Trouble is, replacing \begin..\end with<>... is a hack. ... There's no such requirement in XML, and if we require it, that's a hack. I'd call it a layout convention. IMO it is perfectly legal to define the lyx file format as ... uses XML ... ... is laid out in a manner to facilitate processing by tools that operate on a line basis (grep, merge, sed, awk, ...) ... Right, but LyX should not depend on this human friendly format. IOW LyX will be able to parse non nicely formatted .lyx file but will always output nicely formatted .lyx file. We could add an option to lyx2lyx so that badly formatted LyX files generated by some external tool would be transformed into a nicely formatted .lyx file. See? I don't forecast any parsing problem :-) Abdel.
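The "accept anything, always write it nicely" idea can be sketched with the standard library alone. This is not how lyx2lyx does or will do it; `normalize` is a hypothetical helper, and `ET.indent` needs Python 3.9 or later. It also only re-indents whitespace-only gaps, so elements with mixed text and child content would need more care in a real format.

```python
import xml.etree.ElementTree as ET

def normalize(xml_text: str) -> str:
    """Parse XML in any layout and return it with one element per line."""
    root = ET.fromstring(xml_text)
    ET.indent(root, space="  ")  # rewrites whitespace-only text/tail nodes
    return ET.tostring(root, encoding="unicode")

# Badly laid out input, e.g. produced by some external tool.
messy = "<doc><p>a</p>\n\n   <p>b</p></doc>"
print(normalize(messy))
```

Running the output through `normalize` again gives the same text, which is the property a line-based merge tool needs.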
Re: Progress on the MS Word to LyX conversion (xml)
On 28.07.08, Steve Litt wrote: > On Monday 28 July 2008 01:10, John McCabe-Dansted wrote: > > On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote: > > > To the discussion about data format preference: > > > > > > ... Have you ever merged XML? I tried - it is horrible work. > > > > I don't see why it would be harder if we "just replace \begin...\end > > with <>...". > Trouble is, replacing \begin..\end with <>... is a hack. ... > There's no such requirement in XML, and if we require it, that's a > hack. I'd call it a layout convention. IMO it is perfectly legal to define the lyx file format as ... uses XML ... ... is laid out in a manner to facilitate processing by tools that operate on a line basis (grep, merge, sed, awk, ...) ... Günter
Re: Progress on the MS Word to LyX conversion (xml)
On Monday 28 July 2008 01:10, John McCabe-Dansted wrote: > On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote: > > To the discussion about data format preference: > > > > I am reading all your comments about XML, YAML and other suggested data > > formats. And this discussion reminds me something about XML what almost > > nobody is remeber about. How many LyX user are working in large team > > projects? How often they have to merge text files from different > > branches? Have you ever merge XML? I tried - it is horrible work. > > I don't see why it would be harder if we "just replace \begin...\end > with <>...". Trouble is, replacing \begin..\end with <>... is a hack. LyX developers have defined LyX native format as \begin always is the first character on a line. There's no such requirement in XML, and if we require it, that's a hack. If we don't require it, LyX-XML parsing becomes a whole new level of difficulty. Like I said, nothing that XML->YAML and YAML->XML can't solve, but those would be required. Incidentally, I just heard there are already standalone programs that do those conversions, so before writing code myself, I'll investigate. SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion (xml)
On Fri, Jul 25, 2008 at 4:43 PM, Manveru <[EMAIL PROTECTED]> wrote: > To the discussion about data format preference: > > I have been reading all your comments about XML, YAML and the other suggested data > formats, and the discussion reminds me of something about XML that almost > nobody remembers. How many LyX users work in large team > projects? How often do they have to merge text files from different branches? > Have you ever merged XML? I tried - it is horrible work. I don't see why it would be harder if we "just replace \begin...\end with <>...". > I think LyX cannot live with an XML data format without built-in document > merge functionality. This would be nice in any case. -- John C. McCabe-Dansted PhD Student University of Western Australia
Re: Progress on the MS Word to LyX conversion (xml)
To the discussion about data format preference: I have been reading all your comments about XML, YAML and the other suggested data formats, and the discussion reminds me of something about XML that almost nobody remembers. How many LyX users work in large team projects? How often do they have to merge text files from different branches? Have you ever merged XML? I tried - it is horrible work. I think LyX cannot live with an XML data format without built-in document merge functionality, at least if anyone is thinking about professional usage of LyX. I saw some discussions about it, but I do not know whether it is in LyX or not. I do not need this feature yet. YAML is an interesting idea; I saw it used in one of the Python frameworks (I don't remember which one). But it stays in a niche. I don't see libraries for YAML under active development right now. -- Manveru jabber: [EMAIL PROTECTED] gg: 1624001 http://www.manveru.pl
Re: Progress on the MS Word to LyX conversion (xml)
> On Thursday 24 July 2008 13:07:19 Pavel Sanda wrote: > > frankly - these are nice dreams, but there is no manpower to do it. > > my feeling is that the xml-branch commit activity perfectly shows what will > > happen after the worst bugs are repaired in the xml merged trunk. > > > > or do you have some particular developer in mind? :)) > > Last time that I remember lyx2lyx was also a nice dream. :-) you wanted to say docbook ? :)) > > pavel
Re: Progress on the MS Word to LyX conversion (xml)
On Thursday 24 July 2008 13:07:19 Pavel Sanda wrote: > frankly - these are nice dreams, but there is no manpower to do it. > my feeling is that the xml-branch commit activity perfectly shows what will > happen after the worst bugs are repaired in the xml merged trunk. > > or do you have some particular developer in mind? :)) Last time that I remember lyx2lyx was also a nice dream. :-) > pavel -- José Abílio
Re: Progress on the MS Word to LyX conversion
I understand DTD simplicity... but it is no longer fresh these days. Schema allows better understanding and can be processed by XSLT. 2008/7/23 John <[EMAIL PROTECTED]>: > On Wednesday 23 July 2008 08:04:59 am Steve Litt wrote: > > On Tuesday 22 July 2008 11:32, rgheck wrote: > > > Steve Litt wrote: > > > > I don't know how it will be after LyX goes XML, but right now at > 1.5.3, > > > > converting my LyX code to something else by parsing the LyX native > code > > > > would be trivial. > This is probably teaching Grandma to suck eggs - but > There is a very good set of XML utilities available in Linux which alloy > you > easily parse and transform .xml files into almost anything you want (using > xslt, sax, and friends. In openSUSE it is called xmlstarlet and comes with > the installation CDs or DVD. > These should make it easy to translate to and from LyX (when it finally > goes > fully XML). > > John O'Gorman > > > > > > My understanding is that, whatever happens with the LyX file format, we > > > want it to remain possible to do the sort of simple scripting we all > > > like to be able to do. The XML business is really just a matter of > > > replacing things like this: > > > > > > \begin_layout Standard > > > this. > > > \end_layout > > > > > > \begin_layout Standard > > > \begin_inset CommandInset bibtex > > > LatexCommand bibtex > > > bibfiles "/tmp/bib" > > > options "plain" > > > > > > \end_inset > > > > > > > > > \end_layout > > > > > > with things like this: > > > > > > > > > this. > > > > > > > > > > > > options="plain" > > > /> > > > > > > Just as easy to parse, I hope. Maybe even easier. > > > > > > That's not anything actually agreed or implemented > > > > It's not as easy to parse, but it's reasonable. If that's the extent of > the > > XMLization of LyX, it should still be somewhat tweakable with Vim, Perl, > > etc. > > > > The real problems come in when they do things in XML that would be > > denormalization in a database. 
Store the paragraphs one place, and then > > store the *number of paragraphs* somewhere else, so if you add a > paragraph > > and forget to increment the number, your doc no longer opens. > > > > Or treating the XML file like a relational database, where you have a > list > > of styles with numbered IDs one place, and then have those numbers > applied > > to paragraphs somewhere else. This is an excellent programming technique, > > but for the guy just trying to casually go in and tweak something, or > > casually trying to programmatically generate LyX data, it can be daunting > > indeed. Personally, I love having my style defs in the layout file and > > using the style names as their identifiers. > > > > Then there's this habit of people like OpenOffice, where the native > format > > is a Zip file unzipping to different directories, each containing XML > files > > and other types of files. Yeah, I just dare anyone to generate OpenOffice > > on the fly. > > > > I suggest that whatever you decide, you document the XML structure. I > don't > > mean document as in "it's open source, read the code". I mean document as > > in "Here is the data hierarchy, here is the high level data design, here > > are our reasons for doing it this way, here are the data > interdependencies, > > here are some tips for building LyX files programmatically and tweaking > > them either programmatically or with an editor. And here is a tutorial on > > building and tweaking LyX files without the LyX front end. > > > > I'm busy these days, but if you keep me in the loop I'll do at least a > good > > chunk of that documentation. > > > > One more thing -- if you're going XML and don't want to reinvent the > wheel, > > you'll be using someone else's XML parser. Please, please, PLEASE, don't > > make it some parser with tons of dependency so that the guy with a 2 year > > old distro can't compile LyX because of the XML parser. We already have > > enough problems with Qt dependencies. 
> > > > Thanks > > > > SteveT > > > > Steve Litt > > Recession Relief Package > > http://www.recession-relief.US > > > -- Manveru jabber: [EMAIL PROTECTED] gg: 1624001 http://www.manveru.pl
Re: Progress on the MS Word to LyX conversion (xml)
> what I claim is that we need better > script tools to handle lyx documents. Those tools should be stable across lyx > versions and should not depend on any particular file format. frankly - these are nice dreams, but there is no manpower to do it. my feeling is that the xml-branch commit activity perfectly shows what will happen after the worst bugs are repaired in the xml merged trunk. or do you have some particular developer in mind? :)) pavel
Re: Progress on the MS Word to LyX conversion (xml)
On Wed, 23 Jul 2008, Steve Litt wrote: As a sed/awk/perl/ruby parser, I appreciate that very much. The more I think about it, the more I think I should make the XML->YAML and YAML->XML converters. That way, if future generations of LyX project programmers forget why it's important to space their XML "just so", it won't matter. Also, I have a feeling that YAML will be much easier to parse than either 1.5.x or XML. At first I'll do them in Ruby because Ruby has all that stuff built in and easy to do. Did you see José's post about how the lyx2lyx stuff is really inside a Python lib (module)? You'd probably only need a different kind of wrapper that calls this module, instead of reinventing everything in Ruby. /Christian -- Christian Ridderström, +46-8-768 39 44http://www.md.kth.se/~chr
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 19:24:16 Pavel Sanda wrote:
> this depends on what you master. i'm used to the bunch of small unix
> utilities so i gave that sed example. if you know python you will do it in
> python. my point was not to propose the best tools but to groan and moan about
> xml :)

FWIW this chunk is from one of my shell scripts:

    echo $1
    for i in {8..40}
    do
        echo -n '.'
        w=`printf "%.2d0" $i`
        f="dfa-$1-$w.dat"
        ./dfa -s -w $w < $1.dat | ./join-lag.py -l $w -r $1.dates > $f
        cut -f1,2 $f | join -a1 dfa.dat - > tmp.dat
        mv tmp.dat dfa.dat
    done

So as you can see I know more than python. :-)
And yes I know this only works with bash, and that is OK with me. :-)

My point is that it is alright to use the small tools of the trade, but we can do better because lyx documents are richer than just pure text. I am not saying that your usage is wrong; what I claim is that we need better script tools to handle lyx documents. Those tools should be stable across lyx versions and should not depend on any particular file format.

> pavel

-- José Abílio
Re: Progress on the MS Word to LyX conversion (xml)
Steve Litt wrote: At first I'll do them in Ruby because Ruby has all that stuff built in and easy to do. Later, depending on performance and the percent of people who have Ruby installed, I can convert them to C. There's a C implementation of the same YAML parser/emitter that Ruby uses -- Syck. I'm pretty sure there are also C or C++ implementations of XML Parsers, although I don't know how well they do things like DTD/schema. At present, it's LyX policy that included things should be in Python, since we require it anyway. rh
Re: Progress on the MS Word to LyX conversion (xml)
José Matos wrote: That is also the reason why lyx2lyx is nowadays mostly a python library (LyX.py) and the script lyx2lyx is just a wrapper around the library. And let me add that anyone who wants to process LyX files on a regular basis using external scripts would be well served to learn the basics of this library. The interface is really very simple once you get the hang of it. rh
Re: Progress on the MS Word to LyX conversion (xml)
Steve Litt wrote: Perhaps our best hope of continuing tweakability of native LyX is to create 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can continue to be done in the 1.5.x format. As always, LyX will have such converters, so old formats can be imported/exported. rh
Re: Progress on the MS Word to LyX conversion (xml)
Steve Litt wrote: On Wednesday 23 July 2008 07:00, José Matos wrote: XML will not change the current status. grep '
Re: Progress on the MS Word to LyX conversion (xml)
On Wed, Jul 23, 2008 at 10:33:16AM -0400, Steve Litt wrote: > On Wednesday 23 July 2008 07:00, José Matos wrote: > > > XML will not change the current status. > > > > grep '
Re: Progress on the MS Word to LyX conversion (xml)
> The next question is why do we need to manipulate lyx files with awk and > friends? Is there not something that could or should be done by lyx? search and replace is one of the weak parts of lyx, and even if we get Tommaso one day to put his stuff in, there are so many places where it's of no help. just look at things like notes-mutate or graphics settings synchronization; other unimplemented things come to my mind. > I have generated lyx files with scripts that have been used in my PhD thesis > (almost 40 pages were generated like this) so I can recognize advantages in > manipulating lyx files with scripts, but in that case there are better tools > than awk and sed. this depends on what you master. i'm used to the bunch of small unix utilities so i gave that sed example. if you know python you will do it in python. my point was not to propose the best tools but to groan and moan about xml :) pavel
Re: Progress on the MS Word to LyX conversion (xml)
> Perhaps our best hope of continuing tweakability of native LyX is to create > 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can > continue to be done in the 1.5.x format. as others have written, 1.6 is still ok. for assembling lyx files you can still produce whatever you want in 1.6 format and lyx2lyx will convert it to 1.7 etc. for you. the next possibility is to stick with 1.6 as long as possible :) > The only thing you and I would have to do is the XML to 1.5.x converter. I'm this will be part of the fileformat transition in lyx itself. moreover xml is not my religion, so i will try to keep myself as far as possible from any xml related coding :D pavel
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 11:05, José Matos wrote: > On Wednesday 23 July 2008 15:33:16 Steve Litt wrote: > > The trouble is, XML tags can be anywhere -- spacing and linefeeds are > > immaterial. That means you can no longer parse based on position, such > > as: > > > > /^begin_layout/ > > > > because technically the whole XML file could be in a single line. Or a > > single tag could be split between lines. > > Since we control the format I am (almost) sure that we will choose a reader > friendly output. There is no reason to do otherwise. In terms of size a > blank or a newline are equivalent, so... :-) > > That is why it will be business as usual. :-) > Not much will change in this regard. Thanks José, As a sed/awk/perl/ruby parser, I appreciate that very much. The more I think about it, the more I think I should make the XML->YAML and YAML->XML converters. That way, if future generations of LyX project programmers forget why it's important to space their XML "just so", it won't matter. Also, I have a feeling that YAML will be much easier to parse than either 1.5.x or XML. The way I envision it, these two converters will be simple standalone commands implemented as filters (convert stdin to stdout), very few dependencies. They will comply with the Unix Philosophy (little apps that do one thing and do it well). Trivial to install. They will be simple enough to be maintained by one person. They will be encapsulated. They won't need to know about LyX other than its XML format, and LyX won't need to know about them. They can be included in the LyX distribution, or not. At first I'll do them in Ruby because Ruby has all that stuff built in and easy to do. Later, depending on performance and the percent of people who have Ruby installed, I can convert them to C. There's a C implementation of the same YAML parser/emitter that Ruby uses -- Syck. 
I'm pretty sure there are also C or C++ implementations of XML Parsers, although I don't know how well they do things like DTD/schema. Thanks SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
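For what it's worth, the XML->YAML direction of such a filter is small even without Ruby. The sketch below is in Python (the language LyX already depends on) and uses only the standard library; it is not SteveT's converter, and the output layout (`tag`, `attrs`, `text`, `children` keys) is an assumption made up for this example. Strings are quoted with `json.dumps`, on the understanding that JSON-style double-quoted strings are also acceptable YAML scalars; text that follows a child element (the "tail") is ignored, so this is lossy for mixed content and should be checked against a real YAML parser before anyone relies on it.

```python
import json
import xml.etree.ElementTree as ET

def element_to_yaml(elem, depth=0):
    """Yield YAML-style lines for elem and its descendants (tail text is dropped)."""
    first = "  " * depth + "- "   # list-item marker for this element
    rest = "  " * depth + "  "    # indentation for the element's other keys
    yield f"{first}tag: {json.dumps(elem.tag)}"
    if elem.attrib:
        yield f"{rest}attrs:"
        for name, value in elem.attrib.items():
            yield f"{rest}  {json.dumps(name)}: {json.dumps(value)}"
    text = (elem.text or "").strip()
    if text:
        yield f"{rest}text: {json.dumps(text)}"
    if len(elem):
        yield f"{rest}children:"
        for child in elem:
            yield from element_to_yaml(child, depth + 2)

def xml_to_yaml(xml_text: str) -> str:
    """Convert an XML document string to the YAML-style outline above."""
    return "\n".join(element_to_yaml(ET.fromstring(xml_text)))

# Hypothetical LyX-like input; the real XML format is not defined yet.
print(xml_to_yaml('<doc><layout name="Standard">this.</layout></doc>'))
```

Because the parser does the reading, the result is the same whether the input XML is on one line or many, which is the point of putting a converter between LyX and the line-based tools.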
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 11:21, José Matos wrote: > > There may be things wrong with awking, seding and perling data into > > submission, but the age of these tools is not one of them. > > If you add there the coreutils, like tail, cut, paste, merge and so on we > can do things that spreadsheet programs can only dream of like processing > Gigs of data with thousands of lines and columns. :-) :-) :-) :-) Check this out: http://www.troubleshooters.cxm/lpm/200801/200801.htm http://www.troubleshooters.cxm/lpm/200802/200802.htm But seriously -- it's obvious that for the LyX application itself, XML is by far the best way to go, and I would never suggest rewriting LyX in awk :-). My interest is in quick writes/tweaks of LyX native format files in order to do things that LyX isn't equipped to do, like my VimOutliner to LyX script. STeveT Steve Litt Recession Relief Package http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 14:49:12 Manveru wrote: > Guys, > > Have you even looked at TinyXML? Thanks for the link. :-) -- José Abílio
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 15:58:56 Steve Litt wrote: > Hi Pavel, > > Perhaps our best hope of continuing tweakability of native LyX is to create > 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can > continue to be done in the 1.5.x format. I would advise against such a practice. I hope to explain why in the paragraphs below. > I'm presuming that the LyX developers will create the 1.5.x to XML > converter so users can upgrade their old docs, and hopefully they would > keep that converter updated for each new LyX version, so that you and I > wouldn't need to worry about coding the 1.5.x to XML. Note that the conversion to xml will only happen after 1.6. I know that your argument remains unchanged by this shift; I just want to correct it before continuing. With this said, lyx2lyx will be able to convert from pre-xml to xml and vice-versa. Our previous experience suggests, however, that while the forward translation is complete, the backward translation sometimes results in truncation or in lots of ERT being added to preserve the same structure. For several reasons a transformation from X to X+1 and back again is not guaranteed to give the same document bit by bit. Note also that this is not an easy task in any way. The next question is why do we need to manipulate lyx files with awk and friends? Is there not something that could or should be done by lyx? I have generated lyx files with scripts that have been used in my PhD thesis (almost 40 pages were generated like this) so I can recognize advantages in manipulating lyx files with scripts, but in that case there are better tools than awk and sed. That is also the reason why lyx2lyx is nowadays mostly a python library (LyX.py) and the script lyx2lyx is just a wrapper around the library. > The only thing you and I would have to do is the XML to 1.5.x converter. > I'm pretty darned good with C, and if necessary I can do C++ (but with a C > accent). 
If we pick an XML parser with full schema/dtd capability, that > doesn't have many dependencies, then if you know how to write 1.5.x, I can > feed you whatever data is needed to write the 1.5.x. > > There's another possibility that I think might be better. Using Ruby with > REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml) > if you could help me just a little bit with the return trip (YAML to XML). > I think this would be EVEN BETTER than 1.5.x, because YAML was made for > exactly what you and I want to do -- parsing with awk/sed/perl/grep/cut. It > would also remove our responsibility to support 1.5.x syntax in the 22nd > century. > > Using YAML for tweaking, I think there may come a time when you and I would > say "remember when we had to parse that nasty 1.5.x?" > > I can begin this project as soon as the developers give me an XML def and > an XML file. That way, once they actually specify what they're going to do, > we'll have the technology for the XML->YAML->XML round trip, and only the > details will require coding. > > What do you think? > > StevET You are welcome both to tell us your requirements around the future xml file format and to help us so that in the end we all have a better lyx. Really, all help is welcome. > Steve Litt > Recession Relief Package > http://www.recession-relief.US -- José Abílio
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 15:20:59 Steve Litt wrote: > When the discussion reverts to "your thingamabob is from another > decade/century so it must not be good by today's standards", you know that > thingamabob is pretty darn good, or else there would have been a more > powerful argument against it. Pavel is a developer just as I am. In this thread we have been teasing each other over this issue. In such cases this is an acceptable argument (IMO). ;-) > First of all, I understand *exactly* why an XML native format is an > improvement for the LyX application. I'm limiting my point to the concept > that something old has to be something bad. That is fair. :-) > Modern things are usually improvements, but often are not improvements in > quality or usefulness. They can be improvements to profit margin (e.g. most > MS Windows "improvements"), or marketing improvements (all the silly little > expensive features thrown into basic family cars today), or improvements in > restricting use (DRM), or improvements in price (crummy bicycles from > Walmart). Sometimes older stuff has more quality or usefulness. All that is true, but in this case the lyx file format, and indirectly the lyx parser, went unchanged for a long time, until 2002, not because they were perfect but because most developers were afraid to touch them and break them. The format had been evolving over time and it was a mess, with places where whitespace was significant and others where it was not, for no good reason. > In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the > philosophy of little executables that do one thing and do it right. Stdin, > stdout and pipes were the glue language with which these little executables > could be cascaded to produce a substantial result. This enabled > logical-thinking non-developers, and also developers, to produce those > substantial results in an hour, with perhaps the greatest encapsulation > that's ever been achieved in the computer world. 
Each little executable has > one input and one output, each being a measurable test point. For batch > processes this "programming" technique is every bit as productive as it was > 39 years ago. lyx2lyx that lyx uses to convert between the different file formats works using this principle, it acts as a filter receiving from stdin and writing the transformation in stdout. Yet until now there is not a good way to have an external program (script) other than lyx to check the validity of a lyx file. For me, at least, this is a strong shortcoming of our file format. > There may be things wrong with awking, seding and perling data into > submission, but the age of these tools is not one of them. If you add there the coreutils, like tail, cut, paste, merge and so on we can do things that spreadsheet programs can only dream of like processing Gigs of data with thousands of lines and columns. :-) > SteveT -- José Abílio
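On the point about an external script checking a file: once the format is XML, a well-formedness check (which is weaker than validation against a DTD or schema) needs nothing beyond the Python standard library. The sketch below is an illustration under that assumption, not part of lyx2lyx; `check_well_formed` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text: str):
    """Return (True, "ok") if xml_text parses, else (False, error description)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return False, f"line {line}, column {column}: {err}"
    return True, "ok"

print(check_well_formed("<doc><p>fine</p></doc>"))
print(check_well_formed("<doc><p>unclosed</doc>"))

# As a stdin filter one could call, for example:
#   ok, message = check_well_formed(sys.stdin.read())
```

A script that generates or tweaks files could run this before handing the result to LyX, which would catch the "file modified by scripts crashes lyx" class of problem early.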
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 15:33:16 Steve Litt wrote: > The trouble is, XML tags can be anywhere -- spacing and linefeeds are > immaterial. That means you can no longer parse based on position, such as: > > /^begin_layout/ > > because technically the whole XML file could be in a single line. Or a > single tag could be split between lines. Since we control the format I am (almost) sure that we will choose a reader friendly output. There is no reason to do otherwise. In terms of size a blank or a newline are equivalent, so... :-) That is why it will be business as usual. :-) Not much will change in this regard. -- José Abílio
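The difference between the two positions can be shown in a few lines of Python. A line-anchored pattern, the XML analogue of `/^begin_layout/`, only works while the layout convention holds, whereas a parser does not care. The `layout` element name is invented for the example.

```python
import re
import xml.etree.ElementTree as ET

def count_by_line_regex(text: str) -> int:
    """Count <layout ...> tags that start a line (depends on file layout)."""
    return len(re.findall(r"^<layout\b", text, flags=re.MULTILINE))

def count_by_parser(text: str) -> int:
    """Count layout elements with a real XML parser (layout-independent)."""
    return sum(1 for _ in ET.fromstring(text).iter("layout"))

pretty = "<doc>\n<layout>a</layout>\n<layout>b</layout>\n</doc>"
flat = pretty.replace("\n", "")  # same document on a single line

print(count_by_line_regex(pretty), count_by_parser(pretty))
print(count_by_line_regex(flat), count_by_parser(flat))
```

So as long as LyX always writes the reader-friendly layout, the positional scripts keep working; the parser route is the fallback for files that come from somewhere else.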
Re: Progress on the MS Word to LyX conversion (xml)
Steve Litt wrote: Perhaps our best hope of continuing tweakability of native LyX is to create 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can continue to be done in the 1.5.x format. I'm presuming that the LyX developers will create the 1.5.x to XML converter so users can upgrade their old docs, and hopefully they would keep that converter updated for each new LyX version, so that you and I wouldn't need to worry about coding the 1.5.x to XML. Yes, switching to XML doesn't mean abandoning lyx2lyx. The difference is that we will be able to use simpler XSL templates for the conversion. The advantage is that the XSL templates will be available to all, not being specific to python or lyx2lyx. By the way, the switch to XML is not going to happen with 1.6 but with 1.7, that is at least one year from now ;-) The only thing you and I would have to do is the XML to 1.5.x converter. This will be provided by lyx2lyx too. 1.7-XML will export to all 1.x formats with x <= 6. I'm pretty darned good with C, and if necessary I can do C++ (but with a C accent). If we pick an XML parser with full schema/dtd capability, that doesn't have many dependencies, then if you know how to write 1.5.x, I can feed you whatever data is needed to write the 1.5.x. As I said above, this 1.7 to 1.6 conversion will be supported via a simple XSL stylesheet. It's really the other direction, 1.6 to 1.7, that will be difficult to implement. But hey, all help is welcome; the development of 1.7 is going to begin in a couple of months, so if you want to have a say in the new XML format, come along to the devel list ;-) Abdel.
Re: Progress on the MS Word to LyX conversion (xml)
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote: > > Pavel Sanda wrote: > > Moreover, if you're editing by hand, you can use > > something that recognizes XML. > > of course it will work, but it will take x-times more time. > quite difference to write sed one-liner or start doing some > xslt templating. > > pavel Hi Pavel, Perhaps our best hope of continuing tweakability of native LyX is to create 1.5.x to XML and XML to 1.5.x converters. Then all the parsing/tweaking can continue to be done in the 1.5.x format. I'm presuming that the LyX developers will create the 1.5.x to XML converter so users can upgrade their old docs, and hopefully they would keep that converter updated for each new LyX version, so that you and I wouldn't need to worry about coding the 1.5.x to XML. The only thing you and I would have to do is the XML to 1.5.x converter. I'm pretty darned good with C, and if necessary I can do C++ (but with a C accent). If we pick an XML parser with full schema/dtd capability, that doesn't have many dependencies, then if you know how to write 1.5.x, I can feed you whatever data is needed to write the 1.5.x. There's another possibility that I think might be better. Using Ruby with REXML, I could convert the XML to YAML (http://en.wikipedia.org/wiki/Yaml) if you could help me just a little bit with the return trip (YAML to XML). I think this would be EVEN BETTER than 1.5.x, because YAML was made for exactly what you and I want to do -- parsing with awk/sed/perl/grep/cut. It would also remove our responsibility to support 1.5.x syntax in the 22nd century. Using YAML for tweaking, I think there may come a time when you and I would say "remember when we had to parse that nasty 1.5.x?" I can begin this project as soon as the developers give me an XML def and an XML file. That way, once they actually specify what they're going to do, we'll have the technology for the XML->YAML->XML round trip, and only the details will require coding. What do you think? 
SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 07:00, José Matos wrote: > XML will not change the current status. > > grep '
Re: Progress on the MS Word to LyX conversion (xml)
On Tuesday 22 July 2008 18:21, José Matos wrote: > Clearly you did not had to deal with the lyx file format like I did. :-) > If your idea of a parser is a set of regexp's that is so 80's. ;-) [clip] > It is funny to see all this nostalgia around something that is/was a > nightmare. If the syntax was so clear you would not have the problem of > crashing lyx with a bad formed file (a file modified by scripts). When the discussion reverts to "your thingamabob is from another decade/century so it must not be good by today's standards", you know that thingamabob is pretty darn good, or else there would have been a more powerful argument against it. First of all, I understand *exactly* why an XML native format is an improvement for the LyX application. I'm limiting my point to the concept that something old has to be something bad. Modern things are usually improvements, but often are not improvements in quality or usefulness. They can be improvements to profit margin (e.g. most MS Windows "improvements"), or marketing improvements (all the silly little expensive features thrown into basic family cars today), or improvements in restricting use (DRM), or improvements in price (crummy bicycles from Walmart). Sometimes older stuff has more quality or usefulness. In 1969 and the early 1970's, Ken Thompson and the gang made Unix with the philosophy of little executables that do one thing and do it right. Stdin, stdout and pipes were the glue language with which these little executables could be cascaded to produce a substantial result. This enabled logical-thinking non-developers, and also developers, to produce those substantial results in an hour, with perhaps the greatest encapsulation that's ever been achieved in the computer world. Each little executable has one input and one output, each being a measurable test point. For batch processes this "programming" technique is every bit as productive as it was 39 years ago. 
There may be things wrong with awking, seding and perling data into submission, but the age of these tools is not one of them. SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion (xml)
Guys, Have you even looked at TinyXML? I once had a project where we used XML as a message-passing protocol, and we used XSLT as a C++ code generator for the classes that handled the XML and converted it into the data structures holding all the data we needed. This freed us from portability problems (little-endian vs. big-endian), which is not the case here. For an application like LyX a binary structure may be better to handle, though it would certainly be much work to do. In our project we did not find any known DOM useful for our purpose. Cheers! M. 2008/7/23 José Matos <[EMAIL PROTECTED]>: > On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote: > > i've done incorrect file, it's my fault if lyx crashes. i take my > > responsibility, no problem. > > trial method is the fastest if you want something quickly. > > If LyX crashes that is a bug. LyX should not ever crash, it can refused to > load a file because it is invalid, or to truncate it but it should not ever > crash. > > In the whole picture our parser is one of our weak links so we should do > something about it. Replace it in this case. > > > > First make it correct and then make it fast. > > > > i have exactly oposite view as far as the tweaking i was talking about > > is concerned; i just need quickly output of something, may be i will > throw > > it away after few days. > > > > or take Steve's example - if he takes your 'First make it correct and > then > > make it fast' it would take some two weaks to invent some beast to be > > correct in your sense. but then the whole point is lost, since after this > > time he could do it manually. > > > > i guess we can't agree on this, since i'm not talking about lyx > internals, > > while your job is to make lyx format conversions on lyx level... but this > > is users list, not the the devel one, so i feel free to speak this way :) > > Yes, I know but I can pretend otherwise. ;-) > > > pavel > > -- > José Abílio > -- Manveru jabber: [EMAIL PROTECTED] gg: 1624001 http://www.manveru.pl
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 12:19:16 Pavel Sanda wrote: > i've done incorrect file, it's my fault if lyx crashes. i take my > responsibility, no problem. > trial method is the fastest if you want something quickly. If LyX crashes, that is a bug. LyX should never crash: it can refuse to load a file because it is invalid, or truncate it, but it should never crash. In the whole picture our parser is one of our weak links, so we should do something about it: in this case, replace it. > > First make it correct and then make it fast. > > i have exactly oposite view as far as the tweaking i was talking about > is concerned; i just need quickly output of something, may be i will throw > it away after few days. > > or take Steve's example - if he takes your 'First make it correct and then > make it fast' it would take some two weaks to invent some beast to be > correct in your sense. but then the whole point is lost, since after this > time he could do it manually. > > i guess we can't agree on this, since i'm not talking about lyx internals, > while your job is to make lyx format conversions on lyx level... but this > is users list, not the the devel one, so i feel free to speak this way :) Yes, I know but I can pretend otherwise. ;-) > pavel -- José Abílio
Re: Progress on the MS Word to LyX conversion (xml)
> On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote: > > while you are right that xml could be better technology for internal > > lyx parsing (and i can understand your viewpoint as lyx2lyx fan:) > > this was not my mail about. > > > > > It is funny to see all this nostalgia around something that is/was a > > > nightmare. > > > > it has nothing to do with nostalgia, but speed of hacking around. > > Not when the resulting file crashes lyx, something that should not ever > happen > but that it does now. i've made an incorrect file, so it's my fault if lyx crashes. i take responsibility, no problem. trial and error is the fastest method if you want something quickly. > First make it correct and then make it fast. i have exactly the opposite view as far as the tweaking i was talking about is concerned; i just need some output quickly, and maybe i will throw it away after a few days. or take Steve's example - if he followed your 'First make it correct and then make it fast', it would take some two weeks to invent some beast that is correct in your sense. but then the whole point is lost, since in that time he could do it manually. i guess we can't agree on this, since i'm not talking about lyx internals, while your job is to make lyx format conversions on the lyx level... but this is the users list, not the devel one, so i feel free to speak this way :) pavel
Re: Progress on the MS Word to LyX conversion (xml)
On Wednesday 23 July 2008 00:19:09 Pavel Sanda wrote: > by 'outside' i mean tweakings which i regularly do and watching users list > power users do that too _and_ are happy about the current simplicity of > format. > > tweaks like assembling of the whole file for various datasets, global > changes of things (cf notes-mutate lfun i introduced lately), conversions > and so on. This works well for simple things but breaks badly when you try something a bit more complex. > while you are right that xml could be better technology for internal > lyx parsing (and i can understand your viewpoint as lyx2lyx fan:) > this was not my mail about. > > > It is funny to see all this nostalgia around something that is/was a > > nightmare. > > it has nothing to do with nostalgia, but speed of hacking around. Not when the resulting file crashes lyx, something that should not ever happen but that it does now. First make it correct and then make it fast. XML will not change the current status. grep '
Re: Progress on the MS Word to LyX conversion (xml)
On Tuesday 22 July 2008 19:24, Pavel Sanda wrote: > > Pavel Sanda wrote: > > Moreover, if you're editing by hand, you can use > > something that recognizes XML. > > of course it will work, but it will take x-times more time. > quite difference to write sed one-liner or start doing some > xslt templating. > > pavel Yeah, I think this was the point I was trying to get across. With the current format, you can do a lot with Vim. Or you can run through a series of small filters that do just one thing. XML's a different animal. Without a parser, it's almost impossible to handle. With a parser, you're forced to work only within the language of that parser, and you're forced to make a monolithic solution that can't take advantage of Unix pipes and small executables that do one thing and do it well. You also forgo the ability to have a series of intermediate files, each serving as a test point to make sure things are still going well. Also, an XML parser, especially a DOM one, makes READING XML very easy, but it does nothing for WRITING. Pavel -- you and I and others like us need to start identifying parsing tools to at least partially compensate for the loss of our Unix based pipes with small filter executables. Theoretically, if one could read the XML into a DOM tree, tweak it in memory, and then write it back out, that would be at least somewhat doable, though nothing like the Awk and Perl techniques I'm used to. And once again, we need COMPLETE documentation on the XML dialect, and like I said, I'm willing to help with that documentation. SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
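The read, tweak, write-back approach mentioned at the end of this message can be sketched in a few lines of Python using the standard library's ElementTree. This is only a sketch under loud assumptions: no LyX XML dialect has been specified in this thread, so the `document` and `layout` elements, the `name` attribute, and the `rename_layout` helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL dialect: element and attribute names are invented here,
# because no LyX XML format has actually been defined.
SAMPLE = (
    '<document>'
    '<layout name="Standard">this.</layout>'
    '<layout name="Standard">that.</layout>'
    '</document>'
)

def rename_layout(xml_text, old, new):
    """Parse the XML, rename every matching layout in memory, serialize it again."""
    root = ET.fromstring(xml_text)
    for elem in root.iter("layout"):
        if elem.get("name") == old:
            elem.set("name", new)
    # encoding="unicode" makes tostring() return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")

tweaked = rename_layout(SAMPLE, "Standard", "Quote")
print(tweaked)
```

A function like this could be wrapped to read standard input and write standard output, so it would still behave as one small filter in a pipeline, with each intermediate file remaining a test point.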
Re: Progress on the MS Word to LyX conversion (xml)
> Pavel Sanda wrote: > Moreover, if you're editing by hand, you can use > something that recognizes XML. of course it will work, but it will take x-times more time. quite difference to write sed one-liner or start doing some xslt templating. pavel
Re: Progress on the MS Word to LyX conversion (xml)
> On Tuesday 22 July 2008 22:54:14 Pavel Sanda wrote: > > > > now you are joking right? :) i just see all the bugs just because '>' is > > redirection. and imho manually generate \begin_layout Standard is more > > simpler > > then typing . > > You are welcome to reimplement lyx in shell, good luck. :-) > > > now imagine those regexps where you need to escape all those \" > > > > in conclusion xml will be pain for people trying to use .lyx files > > directly with scripts etc. > > Clearly you did not had to deal with the lyx file format like I did. :-) > If your idea of a parser is a set of regexp's that is so 80's. ;-) clearly you haven't understood my point. i was not talking at all about lyx internal parsing, but about 'outside' usage. by 'outside' i mean the tweaks which i regularly do, and watching the users list, power users do that too _and_ are happy about the current simplicity of the format. tweaks like assembling the whole file for various datasets, global changes of things (cf. the notes-mutate lfun i introduced lately), conversions and so on. while you are right that xml could be a better technology for internal lyx parsing (and i can understand your viewpoint as a lyx2lyx fan:) this was not what my mail was about. > It is funny to see all this nostalgia around something that is/was a > nightmare. it has nothing to do with nostalgia, but with the speed of hacking around. pavel
Re: Progress on the MS Word to LyX conversion
On Wednesday 23 July 2008 08:04:59 am Steve Litt wrote: > On Tuesday 22 July 2008 11:32, rgheck wrote: > > Steve Litt wrote: > > > I don't know how it will be after LyX goes XML, but right now at 1.5.3, > > > converting my LyX code to something else by parsing the LyX native code > > > would be trivial. This is probably teaching Grandma to suck eggs, but there is a very good set of XML utilities available on Linux which allow you to easily parse and transform .xml files into almost anything you want (using XSLT, SAX, and friends). In openSUSE it is called xmlstarlet and comes with the installation CDs or DVD. These should make it easy to translate to and from LyX (when it finally goes fully XML). John O'Gorman > > > > My understanding is that, whatever happens with the LyX file format, we > > want it to remain possible to do the sort of simple scripting we all > > like to be able to do. The XML business is really just a matter of > > replacing things like this: > > > > \begin_layout Standard > > this. > > \end_layout > > > > \begin_layout Standard > > \begin_inset CommandInset bibtex > > LatexCommand bibtex > > bibfiles "/tmp/bib" > > options "plain" > > > > \end_inset > > > > > > \end_layout > > > > with things like this: > > > > > > this. > > > > > > > > > /> > > > > Just as easy to parse, I hope. Maybe even easier. > > > > That's not anything actually agreed or implemented > > It's not as easy to parse, but it's reasonable. If that's the extent of the > XMLization of LyX, it should still be somewhat tweakable with Vim, Perl, > etc. > > The real problems come in when they do things in XML that would be > denormalization in a database. Store the paragraphs one place, and then > store the *number of paragraphs* somewhere else, so if you add a paragraph > and forget to increment the number, your doc no longer opens. 
> > Or treating the XML file like a relational database, where you have a list > of styles with numbered IDs one place, and then have those numbers applied > to paragraphs somewhere else. This is an excellent programming technique, > but for the guy just trying to casually go in and tweak something, or > casually trying to programmatically generate LyX data, it can be daunting > indeed. Personally, I love having my style defs in the layout file and > using the style names as their identifiers. > > Then there's this habit of people like OpenOffice, where the native format > is a Zip file unzipping to different directories, each containing XML files > and other types of files. Yeah, I just dare anyone to generate OpenOffice > on the fly. > > I suggest that whatever you decide, you document the XML structure. I don't > mean document as in "it's open source, read the code". I mean document as > in "Here is the data hierarchy, here is the high level data design, here > are our reasons for doing it this way, here are the data interdependencies, > here are some tips for building LyX files programmatically and tweaking > them either programmatically or with an editor. And here is a tutorial on > building and tweaking LyX files without the LyX front end. > > I'm busy these days, but if you keep me in the loop I'll do at least a good > chunk of that documentation. > > One more thing -- if you're going XML and don't want to reinvent the wheel, > you'll be using someone else's XML parser. Please, please, PLEASE, don't > make it some parser with tons of dependency so that the guy with a 2 year > old distro can't compile LyX because of the XML parser. We already have > enough problems with Qt dependencies. > > Thanks > > SteveT > > Steve Litt > Recession Relief Package > http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion (xml)
José Matos wrote: now imagine those regexps where you need to escape all those \" in conclusion xml will be pain for people trying to use .lyx files directly with scripts etc. Clearly you did not had to deal with the lyx file format like I did. :-) If your idea of a parser is a set of regexp's that is so 80's. ;-) In fairness, I think he was talking about little hacked scripts to do the kind of search-and-replace that isn't possible yet in LyX itself. So you don't really have a parser in that case. Just a very long string. ;-) This seems to me like the debate between strong and bold. I want to parse the lyx file on a content based stream, not just a set of lines. After the change to xml the regularity will still be there with the added bonus that finally it will be consistent. We took 6 years to clean the lyx format to a reasonable state and we are still not there yet. So, Jose, are we ever actually going to do this? If so, then it seems to me we ought to decide to do it, halt other development for the few weeks it would take, and do it. I don't think it would really be that hard to have it working. The existing parser could be tweaked for the short term. It's already capable of dealing with tabulars, and those are written as XML already. Longer term, we'd prefer libxml2 or something---SAX, I assume, rather than DOM---, but that could be done after the format had stabilized. Yeah, I know, wrong list. rh
Re: Progress on the MS Word to LyX conversion (xml)
Pavel Sanda wrote: Steve Litt wrote: this. Just as easy to parse, I hope. Maybe even easier. now you are joking right? :) i just see all the bugs just because '>' is redirection. Only in the shell, right? now imagine those regexps where you need to escape all those \" There's lots of that in LyX now. But it's easy to deal with in Python, at least, via the r'' quoter. And in Perl, you have qr//. So the quotes aren't really a problem. Moreover, if you're editing by hand, you can use something that recognizes XML. But, well, XML isn't exactly around the corner, anyway, so far as I can tell. rh
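As a small illustration of the point about Python's r'' quoter: in a raw string the backslash is literal to Python, so only the regex-level escape remains. The sample text below is the bibtex inset quoted elsewhere in this thread; the pattern names `LAYOUT_RE` and `BIBFILES_RE` are made up for this sketch.

```python
import re

# Raw strings: Python leaves the backslashes alone, so "\\" here is simply
# the regex escape for one literal backslash.
LAYOUT_RE = re.compile(r'^\\begin_layout (\S+)$', re.MULTILINE)
# Single-quoted raw string, so the double quotes need no escaping at all.
BIBFILES_RE = re.compile(r'^bibfiles "([^"]*)"$', re.MULTILINE)

sample = r'''\begin_layout Standard
\begin_inset CommandInset bibtex
LatexCommand bibtex
bibfiles "/tmp/bib"
options "plain"
\end_inset
\end_layout
'''

layout_names = LAYOUT_RE.findall(sample)
bibfiles = BIBFILES_RE.findall(sample)
print(layout_names, bibfiles)

# Without the raw prefix, the same pattern needs twice as many backslashes.
same_pattern = (r'\\begin_layout' == '\\\\begin_layout')
```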
Re: Progress on the MS Word to LyX conversion (xml)
On Tuesday 22 July 2008 22:54:14 Pavel Sanda wrote: > > now you are joking right? :) i just see all the bugs just because '>' is > redirection. and imho manually generate \begin_layout Standard is more > simpler > then typing . You are welcome to reimplement lyx in shell, good luck. :-) > now imagine those regexps where you need to escape all those \" > > in conclusion xml will be pain for people trying to use .lyx files > directly with scripts etc. Clearly you did not have to deal with the lyx file format like I did. :-) If your idea of a parser is a set of regexps, that is so 80's. ;-) This seems to me like the debate between strong and bold. I want to parse the lyx file as a content-based stream, not just a set of lines. After the change to xml the regularity will still be there, with the added bonus that it will finally be consistent. We took 6 years to clean the lyx format to a reasonable state and we are still not there yet. It is funny to see all this nostalgia around something that is/was a nightmare. If the syntax were so clear you would not have the problem of crashing lyx with a badly formed file (a file modified by scripts). > pavel -- José Abílio
Re: Progress on the MS Word to LyX conversion
On Tuesday 22 July 2008 21:04:59 Steve Litt wrote: > One more thing -- if you're going XML and don't want to reinvent the wheel, > you'll be using someone else's XML parser. Please, please, PLEASE, don't > make it some parser with tons of dependency so that the guy with a 2 year > old distro can't compile LyX because of the XML parser. We already have > enough problems with Qt dependencies. The idea is to have a DTD to describe the XML and to use a standard parser like libxml2. This should meet both criteria. :-) > Thanks > > SteveT -- José Abílio
Re: Progress on the MS Word to LyX conversion (xml)
> Steve Litt wrote: > > this. > > > > > > > Just as easy to parse, I hope. Maybe even easier. now you are joking, right? :) i can already see all the bugs just because '>' is redirection. and imho manually generating \begin_layout Standard is simpler than typing . now imagine those regexps where you need to escape all those \" in conclusion, xml will be a pain for people trying to use .lyx files directly with scripts etc. pavel
Re: Progress on the MS Word to LyX conversion
On Tuesday 22 July 2008 11:32, rgheck wrote: > Steve Litt wrote: > > I don't know how it will be after LyX goes XML, but right now at 1.5.3, > > converting my LyX code to something else by parsing the LyX native code > > would be trivial. > > My understanding is that, whatever happens with the LyX file format, we > want it to remain possible to do the sort of simple scripting we all > like to be able to do. The XML business is really just a matter of > replacing things like this: > > \begin_layout Standard > this. > \end_layout > > \begin_layout Standard > \begin_inset CommandInset bibtex > LatexCommand bibtex > bibfiles "/tmp/bib" > options "plain" > > \end_inset > > > \end_layout > > with things like this: > > > this. > > > > > > > Just as easy to parse, I hope. Maybe even easier. > > That's not anything actually agreed or implemented It's not as easy to parse, but it's reasonable. If that's the extent of the XMLization of LyX, it should still be somewhat tweakable with Vim, Perl, etc. The real problems come in when they do things in XML that would be denormalization in a database. Store the paragraphs one place, and then store the *number of paragraphs* somewhere else, so if you add a paragraph and forget to increment the number, your doc no longer opens. Or treating the XML file like a relational database, where you have a list of styles with numbered IDs one place, and then have those numbers applied to paragraphs somewhere else. This is an excellent programming technique, but for the guy just trying to casually go in and tweak something, or casually trying to programmatically generate LyX data, it can be daunting indeed. Personally, I love having my style defs in the layout file and using the style names as their identifiers. Then there's this habit of people like OpenOffice, where the native format is a Zip file unzipping to different directories, each containing XML files and other types of files. Yeah, I just dare anyone to generate OpenOffice on the fly. 
I suggest that whatever you decide, you document the XML structure. I don't mean document as in "it's open source, read the code". I mean document as in "Here is the data hierarchy, here is the high level data design, here are our reasons for doing it this way, here are the data interdependencies, here are some tips for building LyX files programmatically and tweaking them either programmatically or with an editor. And here is a tutorial on building and tweaking LyX files without the LyX front end. I'm busy these days, but if you keep me in the loop I'll do at least a good chunk of that documentation. One more thing -- if you're going XML and don't want to reinvent the wheel, you'll be using someone else's XML parser. Please, please, PLEASE, don't make it some parser with tons of dependency so that the guy with a 2 year old distro can't compile LyX because of the XML parser. We already have enough problems with Qt dependencies. Thanks SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion
Steve Litt wrote: I don't know how it will be after LyX goes XML, but right now at 1.5.3, converting my LyX code to something else by parsing the LyX native code would be trivial. My understanding is that, whatever happens with the LyX file format, we want it to remain possible to do the sort of simple scripting we all like to be able to do. The XML business is really just a matter of replacing things like this: \begin_layout Standard this. \end_layout \begin_layout Standard \begin_inset CommandInset bibtex LatexCommand bibtex bibfiles "/tmp/bib" options "plain" \end_inset \end_layout with things like this: this. Just as easy to parse, I hope. Maybe even easier. That's not anything actually agreed or implemented rh
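To make concrete the "simple scripting" that the current format allows, here is a minimal line-oriented scanner in Python. It relies only on the property discussed in this thread, that `\begin_layout` and `\end_layout` start a line; the `layouts` helper is a hypothetical name, and this is a sketch of the idea rather than a complete parser for .lyx files.

```python
# Sample taken from the message above (blank lines inside the inset kept).
SAMPLE = r"""\begin_layout Standard
this.
\end_layout

\begin_layout Standard
\begin_inset CommandInset bibtex
LatexCommand bibtex
bibfiles "/tmp/bib"
options "plain"

\end_inset


\end_layout
"""

def layouts(lyx_text):
    """Yield (layout_name, body_lines) for each top-level \\begin_layout block."""
    name, body, depth = None, [], 0
    for line in lyx_text.splitlines():
        if line.startswith("\\begin_layout "):
            if depth == 0:
                name, body = line.split(" ", 1)[1], []
            else:
                body.append(line)  # nested layout: keep it as part of the body
            depth += 1
        elif line.startswith("\\end_layout") and depth > 0:
            depth -= 1
            if depth == 0:
                yield name, body
            else:
                body.append(line)
        elif depth > 0:
            body.append(line)

blocks = list(layouts(SAMPLE))
for layout_name, body in blocks:
    print(layout_name, len(body))
```

The depth counter is there because layouts can appear inside insets; a scanner that ignored nesting would close a block too early.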
Re: Progress on the MS Word to LyX conversion
On Tuesday 22 July 2008 06:32, Christian Ridderström wrote: > On Mon, 21 Jul 2008, Steve Litt wrote: > > This morning I got an acceptably tagged text file out of MS Word. From > > that moment on, things got much easier. > > Congratulations! > > I put a reference to your post on a wiki page, giving others that need to > do this a starting point. (If you want to summarize how you did it and > post the relevant scripts on the wiki, I can help you with it). Here's the > page: > http://wiki.lyx.org/Tools/Word2LyXConversionProcess > > While doing this, I found this page: Thanks Christian! One use for the new page is showing people how to convert word to LyX while preserving all styles. Perhaps an even greater use for this page is showing people the mess they'll get themselves into by using MS Word to write a book. I don't know how it will be after LyX goes XML, but right now at 1.5.3, converting my LyX code to something else by parsing the LyX native code would be trivial. Thanks SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
Re: Progress on the MS Word to LyX conversion
On Mon, 21 Jul 2008, Steve Litt wrote: This morning I got an acceptably tagged text file out of MS Word. From that moment on, things got much easier. Congratulations! I put a reference to your post on a wiki page, giving others that need to do this a starting point. (If you want to summarize how you did it and post the relevant scripts on the wiki, I can help you with it). Here's the page: http://wiki.lyx.org/Tools/Word2LyXConversionProcess While doing this, I found this page: http://wiki.lyx.org/Tools/Word2LyXMacro Maybe it can help you with the tables if nothing else? /Christian -- Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr
Progress on the MS Word to LyX conversion
This morning I got an acceptably tagged text file out of MS Word. From that moment on, things got much easier. I made a perl script to remove end tags, and instead put start tags on all lines between a start and end. It also made sure there were no interlinking tag sets. It also put all the start tags in the same format and easily parsable. I hadn't thought to do that when converting out of MS Word -- I had bigger fish to fry at the time. I hadn't marked Normal paragraphs, so my program had to deduce which lines weren't marked already, and put a b_pstyle_normal::: start tag on them. Armed with proper start tags on every line (which is actually a paragraph), it was pretty easy to pipe that through something that added the \begin_layout Whatever and \end_layout commands. At this point I have NOT removed the start or end tags -- I want some redundancy for checking. I also added a little C program to get rid of the '\015' characters that DOS put in. I made a layout with dummy styles for each style I used (sort -u came in very handy for this). Anyway, my program can make the body of a LyX file, and all the Part/Chapter/Section etc works perfectly, and it seems like all the other paragraph styles are working. It's basically a pipeline of little filters creating a LyX file from the text file, and I can do it over and over to my heart's content. I imagine tomorrow I'll add the code to handle character styles, and start making my layout file create effects that look how they're supposed to. That will help in looking at the produced PDF (it already produces a PDF, so the basic code is correct). Bottom line, I now have a text file with tags representing all my document's original style, and I've created perl, awk, sed and C code to convert it to a LyX document with my styles preserved. Anyway, thanks for all the help. SteveT Steve Litt Recession Relief Package http://www.recession-relief.US
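The pipeline described in this post could be sketched roughly as follows. This is not the actual perl, awk, sed and C code, which was not posted. The only detail taken from the post is the `b_pstyle_normal:::` start tag; the general `b_pstyle_<name>:::` pattern, the `LAYOUTS` mapping to LyX layout names, and all function names are assumptions made for illustration, and Python is used instead of the original tools.

```python
import re

# ASSUMPTION: every start tag looks like b_pstyle_<name>::: at line start.
TAG_RE = re.compile(r"^b_pstyle_(\w+):::(.*)$")
# ASSUMPTION: hypothetical mapping from tag names to LyX layout names.
LAYOUTS = {"normal": "Standard", "chapter": "Chapter", "section": "Section"}

def strip_cr(text):
    """Drop the '\\015' (carriage return) characters left by DOS line endings."""
    return text.replace("\r", "")

def tag_untagged(lines):
    """Put a b_pstyle_normal::: start tag on every line that has no tag yet."""
    for line in lines:
        yield line if TAG_RE.match(line) else "b_pstyle_normal:::" + line

def to_lyx_body(lines):
    """Wrap each tagged line (one paragraph per line) in a layout block."""
    out = []
    for line in lines:
        m = TAG_RE.match(line)
        style, text = m.group(1), m.group(2)
        layout = LAYOUTS.get(style, "Standard")
        out.append("\\begin_layout %s\n%s\n\\end_layout\n" % (layout, text))
    return "\n".join(out)

def convert(text):
    """Chain the small filters: strip CRs, tag plain lines, emit the LyX body."""
    lines = [l for l in strip_cr(text).split("\n") if l.strip()]
    return to_lyx_body(tag_untagged(lines))

result = convert("b_pstyle_chapter:::Intro\r\nPlain text\r\n")
print(result)
```

Each function here stands in for one filter of the pipeline, so the output of any stage can be inspected on its own, in the spirit of the original design.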