Re: Document versions
On Wed, Aug 30, 2000 at 08:28:04PM +0200, Matej Cepl wrote:
> Let me transfer a discussion from lyx-docs. I think that it may be
> much more interesting here.
>
> On Tue, 18 May 1999 08:39:17 -0700 Amir Karger wrote on the lyx-docs
> list:
>
> > > I was just thinking about comparing two documents and seeing
> > > additions/deletions, like diff does for text file.
> >
> > This sounds Hard. Do tools like this exist for, say, HTML? If so,
> > we could probably steal them.

There is already a tool for comparing LaTeX files:

$TEXMF/latex/changebar/chbar.sh

This script "take two LaTeX files and produce a third which has
changebars highlighting the difference between them."
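To make the changebar idea concrete, here is a rough Python sketch of what "produce a third file with changebars" could look like. It is not how chbar.sh works internally; it assumes the changebar package's \cbstart and \cbend macros, compares whole lines with the standard-library difflib module, and the function name add_changebars is made up for illustration.

```python
import difflib


def add_changebars(old_lines, new_lines):
    """Return new_lines with changed or added runs wrapped in \\cbstart/\\cbend."""
    out = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(new_lines[j1:j2])
        elif tag in ("replace", "insert"):
            # Text that is new or changed in the second file gets a bar.
            out.append("\\cbstart")
            out.extend(new_lines[j1:j2])
            out.append("\\cbend")
        # "delete": the text is absent from the new file, so this sketch
        # simply leaves it out without marking the spot.
    return out


old = ["First paragraph.", "Second paragraph.", "Third paragraph."]
new = ["First paragraph.", "Second paragraph, revised.", "Third paragraph."]
print("\n".join(add_changebars(old, new)))
```

Comparing line by line keeps the sketch short, but it inherits the word-wrap problem discussed elsewhere in this thread: re-wrapping a paragraph marks every line of it as changed.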
Re: Document versions
Let me transfer a discussion from lyx-docs. I think that it may be
much more interesting here.

On Tue, 18 May 1999 08:39:17 -0700 Amir Karger wrote on the lyx-docs
list:

> > I was just thinking about comparing two documents and seeing
> > additions/deletions, like diff does for text file.
>
> This sounds Hard. Do tools like this exist for, say, HTML? If so,
> we could probably steal them.

I would like to pass along some information about ndiff (from the
Python 1.5.2 distribution -- see attached). Does it make any sense to
you (IMJAL - I am just a lawyer, no programmer)?

> - word wrap. I did some work on a perl diff (Algorithm::Diff in
> CPAN). We had talked about the possibility of a "word-based" diff.
> In fact, I think someone (Jean-Marc?) said a wdiff already exists.
> Alternatively, we could make all paragraphs into one line and then
> run the diff.

Another option, if you want to display the differences as a LyX
document, is to divide the whole document into one-word-per-line
format and then compare with regular diff.

> - character formatting. Ouch. This is actually several problems.
> (1) add an italicized word to a regular paragraph. (2) add a word
> (in italics) to an italicized paragraph. (3) change a word from
> regular print to italics. Um, I suppose you could remove all
> character formatting and just compare text, which would be better
> than nothing.

In my experience formatting is not as important as content (actually,
I am glad that checking differences in formatting is now optional in
Word).

> Ah. I've been thinking of doing the diff outside of LyX, of some
> version of a diff on the text of a LyX file. To do it within LyX
> has a different set of problems. For example, you have character
> formatting information on each character, making comparison easier.
> But you'll need to steal the GNUdiff algorithm and put it into LyX.
> Ugh.

Much better IMHO than making diffs on two files is some kind of
mechanism which records revisions while they are made.
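The one-word-per-line idea can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses the standard-library difflib module from a current Python 3 rather than the attached ndiff script or GNU diff, it ignores character formatting entirely, and the function name word_diff is made up.

```python
import difflib


def word_diff(old_text, new_text):
    """Compare two texts word by word, ignoring how the lines are wrapped."""
    old_words = old_text.split()  # split() treats newlines like any other space
    new_words = new_text.split()
    # Drop the "? " hint lines; only removed, added and common words remain.
    return [entry for entry in difflib.ndiff(old_words, new_words)
            if not entry.startswith("? ")]


old = "The quick brown fox\njumps"
new = "The quick\nred fox jumps over"
for entry in word_diff(old, new):
    print(entry)
```

Each printed entry is one word with a "- ", "+ " or two-space prefix, so moving a line break changes nothing in the report, which is the point of comparing words instead of lines.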
Actually, I almost never use the "Compare versions" feature in Word
(I am sorry for talking so much about Word -- I really prefer LyX and
real programs to toys, honestly!), or I use it only when necessary
(= our client is stupid and I haven't been successful in explaining
what the revisions are about). I know that it is much more work for
LyX programmers (than just throwing something into diff), but I am
afraid that diff is The Bad Thing for everything other than computer
programs and similar stuff.

Any comments?

Matej

#! /usr/bin/env python

# Module ndiff version 1.4.0
# Released to the public domain 27-Mar-1999,
# by Tim Peters ([EMAIL PROTECTED]).

# Provided as-is; use at your own risk; no warranty; no promises; enjoy!

"""ndiff [-q] file1 file2
    or
ndiff (-r1 | -r2) < ndiff_output > file1_or_file2

Print a human-friendly file difference report to stdout.  Both inter-
and intra-line differences are noted.  In the second form, recreate file1
(-r1) or file2 (-r2) on stdout, from an ndiff report on stdin.

In the first form, if -q ("quiet") is not specified, the first two lines
of output are

-: file1
+: file2

Each remaining line begins with a two-letter code:

    "- "    line unique to file1
    "+ "    line unique to file2
    "  "    line common to both files
    "? "    line not present in either input file

Lines beginning with "? " attempt to guide the eye to intraline
differences, and were not present in either input file.  These lines can
be confusing if the source files contain tab characters.

The first file can be recovered by retaining only lines that begin with
"  " or "- ", and deleting those 2-character prefixes; use ndiff with -r1.

The second file can be recovered similarly, but by retaining only "  "
and "+ " lines; use ndiff with -r2; or, on Unix, the second file can be
recovered by piping the output through

    sed -n '/^[+ ] /s/^..//p'

See module comments for details and programmatic interface.
"""

__version__ = 1, 4, 0

# SequenceMatcher tries to compute a "human-friendly diff" between
# two sequences (chiefly picturing a file as a sequence of lines,
# and a line as a sequence of characters, here).  Unlike e.g. UNIX(tm)
# diff, the fundamental notion is the longest *contiguous* & junk-free
# matching subsequence.  That's what catches peoples' eyes.  The
# Windows(tm) windiff has another interesting notion, pairing up elements
# that appear uniquely in each sequence.  That, and the method here,
# appear to yield more intuitive difference reports than does diff.  This
# method appears to be the least vulnerable to synching up on blocks
# of "junk lines", though (like blank lines in ordinary text files,
# or maybe "<P>" lines in HTML files).  That may be because this is
# the only method of the 3 that has a *concept* of "junk" <wink>.
#
# Note that ndiff makes no claim to produce a *minimal* diff.  To the
# contrary, minimal diffs are often counter-intuitive, because they
# synch up anywhere possible, sometimes
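The two-letter codes and the recovery rule described in the docstring above can be tried out with a short sketch. It assumes a current Python 3, whose standard-library difflib module provides ndiff() and restore() functions using the same report codes; it does not run the attached 1.4.0 script itself, and the helper name recover is made up for illustration.

```python
import difflib

old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "gamma\n", "delta\n"]

# Each report line starts with "- ", "+ ", "  " or "? ".
report = list(difflib.ndiff(old, new))
print("".join(report), end="")


def recover(report, keep):
    """Keep common lines plus those tagged `keep`, then drop the 2-char prefix."""
    return [line[2:] for line in report if line[:2] in ("  ", keep)]


# Recovering by hand, as the docstring describes, matches difflib.restore().
print(recover(report, "- ") == list(difflib.restore(report, 1)))
print(recover(report, "+ ") == list(difflib.restore(report, 2)))
```

Because the report contains both inputs in full, it can serve as a crude single-file record of two versions of a document, though it still says nothing about when or why a change was made.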