Re: Document versions

2000-09-01 Thread Dekel Tsur

On Wed, Aug 30, 2000 at 08:28:04PM +0200, Matej Cepl wrote:
> Let me transfer a discussion from lyx-docs. I think, that it may be 
> much more interesting here.
> 
> On Tue, 18 May 1999 08:39:17 -0700 Amir Karger wrote on
> lyx-docs list:
> 
> > > I was just thinking about comparing two documents and seeing
> > > additions/deletions, like diff does for text file.
> 
> > This sounds Hard. Do tools like this exist for, say, HTML? If so,
> > we could probably steal them. 

There is already a tool for comparing LaTeX files:
$TEXMF/latex/changebar/chbar.sh
This script will "take two LaTeX files and produce a third which
has changebars highlighting the difference between them."



Re: Document versions

2000-08-30 Thread Matej Cepl

Let me transfer a discussion from lyx-docs. I think that it may be 
much more interesting here.

On Tue, 18 May 1999 08:39:17 -0700 Amir Karger wrote on the
lyx-docs list:

> > I was just thinking about comparing two documents and seeing
> > additions/deletions, like diff does for text file.

> This sounds Hard. Do tools like this exist for, say, HTML? If so,
> we could probably steal them. 

I would like to point out ndiff (from the Python 1.5.2 distribution; 
see attached). Does it make any sense to you? (IMJAL: I am just a 
lawyer, not a programmer.)
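
For readers without the attachment at hand, here is a minimal sketch of the kind of report ndiff produces. It assumes a modern Python, where the standard library's difflib module offers an ndiff function with the same line codes ("- ", "+ ", "  ", "? "). The sample lines and the helper names report and recover are made up for illustration.

```python
import difflib

# Two tiny "files", each a list of lines (made-up sample text).
old = ["The quick brown fox\n", "jumps over the lazy dog.\n"]
new = ["The quick brown fox\n", "leaps over the lazy dog.\n", "The end.\n"]


def report(a, b):
    """Return an ndiff-style report as a list of lines."""
    return list(difflib.ndiff(a, b))


def recover(delta, which):
    """Rebuild file 1 (which=1) or file 2 (which=2) from a report."""
    return list(difflib.restore(delta, which))


delta = report(old, new)
print("".join(delta), end="")
```

Either input can be rebuilt from the report alone, which is what the -r1 and -r2 options of the attached script are described as doing.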

> - word wrap. I did some work on a perl diff (Algorithm::Diff in
> CPAN). We had talked about the possibility of a "word-based" diff.
> In fact, I think someone (Jean-Marc?) said a wdiff already exists.
> Alternatively, we could make all paragraphs into one line and then
> run the diff. If you want to display the differences as a LyX

Or divide the whole document into one-word-per-line format and then 
compare with regular diff.
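
A rough sketch of the word-based idea, assuming Python's difflib is available: each paragraph is split into words before comparing, so a change in where the lines wrap does not show up as a difference. The function name word_diff and the sample sentences are invented for this example.

```python
import difflib


def word_diff(old_text, new_text):
    """Compare two paragraphs word by word, ignoring where lines wrap."""
    old_words = old_text.split()
    new_words = new_text.split()
    changes = []
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep only the words that were replaced, deleted or inserted.
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


old_par = "The contract shall be\nsigned by both parties."
new_par = "The contract shall be signed\nby all parties."
print(word_diff(old_par, new_par))
```

Only the both/all replacement is reported here; the moved line break is invisible because the words are compared, not the lines.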

> - character formatting. Ouch. This is actually several problems.
> (1) add an italicized word to a regular paragraph. (2) add a word
> (in italics) to an italicized paragraph. (3) change a word from
> regular print to italics. Um, I suppose you could remove all
> character formatting and just compare text, which would be better
> than nothing.

In my experience formatting is not as important as content 
(actually, I am glad that checking differences in formatting is now 
optional in Word).
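
A small sketch of "remove all character formatting and just compare text". It assumes, without having checked against any particular LyX version, that formatting switches sit on their own lines starting with a backslash command such as \emph. The command list, the function name plain_words and the sample lines are all illustrative.

```python
# Assumed formatting commands; the real LyX file format may use others.
FORMAT_COMMANDS = ("\\emph", "\\series", "\\shape")


def plain_words(lines):
    """Return the words of a paragraph, skipping formatting-only lines."""
    words = []
    for line in lines:
        if line.lstrip().startswith(FORMAT_COMMANDS):
            continue  # a formatting switch, not text
        words.extend(line.split())
    return words


regular = ["This is important text.\n"]
italic = ["This is \n", "\\emph on\n", "important\n",
          "\\emph default\n", " text.\n"]

# Case (3) from the list above: a word changed from regular to italics.
print(plain_words(regular) == plain_words(italic))
```

With the formatting lines dropped, the two versions compare as equal, so only changes to the content would be reported.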

> Ah. I've been thinking of doing the diff outside of LyX, of some
> version of a diff on the text of a LyX file. To do it within LyX
> has a different set of problems. For example, you have character
> formatting information on each character, making comparison easier.
> But you'll need to steal the GNUdiff algorithm and put it into LyX.
> Ugh. 

Much better, IMHO, than making diffs of two files is some kind of 
mechanism that records revisions while they are made. Actually, I 
almost never use the "Compare versions" feature in Word (I am sorry 
for talking so much about Word -- I really prefer LyX and real 
programs to toys, honestly!), or I use it only when necessary 
(= our client is stupid and I haven't been successful in explaining 
what the revisions are about).
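
To make "records revisions while they are made" concrete, here is a very small sketch of one way it could work: every insertion or deletion is logged as it happens, so the changes never have to be guessed afterwards by a diff. This is not how Word or LyX does it; the class names, fields and sample text are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    kind: str      # "insert" or "delete"
    position: int  # character offset at the time of the edit
    text: str      # the inserted or removed characters
    author: str


@dataclass
class TrackedText:
    text: str = ""
    log: list = field(default_factory=list)

    def insert(self, position, new_text, author):
        self.text = self.text[:position] + new_text + self.text[position:]
        self.log.append(Revision("insert", position, new_text, author))

    def delete(self, position, length, author):
        removed = self.text[position:position + length]
        self.text = self.text[:position] + self.text[position + length:]
        self.log.append(Revision("delete", position, removed, author))

    def original(self):
        """Undo the log in reverse to get the text before any revision."""
        text = self.text
        for rev in reversed(self.log):
            if rev.kind == "insert":
                end = rev.position + len(rev.text)
                text = text[:rev.position] + text[end:]
            else:
                text = text[:rev.position] + rev.text + text[rev.position:]
        return text


doc = TrackedText("The buyer pays.")
doc.delete(4, 5, "matej")          # remove "buyer"
doc.insert(4, "seller", "matej")   # type "seller" in its place
print(doc.text)
print(doc.original())
```

The log holds exactly what the author did, in order, which is the information a client would want to review.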

I know that it is much more work for LyX programmers (than just 
throwing something at diff), but I am afraid that diff is The Bad 
Thing for everything other than computer programs and similar 
stuff.

Any comments?

Matej



#! /usr/bin/env python

# Module ndiff version 1.4.0
# Released to the public domain 27-Mar-1999,
# by Tim Peters ([EMAIL PROTECTED]).

# Provided as-is; use at your own risk; no warranty; no promises; enjoy!

"""ndiff [-q] file1 file2
or
ndiff (-r1 | -r2) < ndiff_output > file1_or_file2

Print a human-friendly file difference report to stdout.  Both inter-
and intra-line differences are noted.  In the second form, recreate file1
(-r1) or file2 (-r2) on stdout, from an ndiff report on stdin.

In the first form, if -q ("quiet") is not specified, the first two lines
of output are

-: file1
+: file2

Each remaining line begins with a two-letter code:

"- "    line unique to file1
"+ "    line unique to file2
"  "    line common to both files
"? "    line not present in either input file

Lines beginning with "? " attempt to guide the eye to intraline
differences, and were not present in either input file.  These lines can
be confusing if the source files contain tab characters.

The first file can be recovered by retaining only lines that begin with
"  " or "- ", and deleting those 2-character prefixes; use ndiff with -r1.

The second file can be recovered similarly, but by retaining only "  "
and "+ " lines; use ndiff with -r2; or, on Unix, the second file can be
recovered by piping the output through

sed -n '/^[+ ] /s/^..//p'

See module comments for details and programmatic interface.
"""

__version__ = 1, 4, 0

# SequenceMatcher tries to compute a "human-friendly diff" between
# two sequences (chiefly picturing a file as a sequence of lines,
# and a line as a sequence of characters, here).  Unlike e.g. UNIX(tm)
# diff, the fundamental notion is the longest *contiguous* & junk-free
# matching subsequence.  That's what catches peoples' eyes.  The
# Windows(tm) windiff has another interesting notion, pairing up elements
# that appear uniquely in each sequence.  That, and the method here,
# appear to yield more intuitive difference reports than does diff.  This
# method appears to be the least vulnerable to synching up on blocks
# of "junk lines", though (like blank lines in ordinary text files,
# or maybe "<P>" lines in HTML files).  That may be because this is
# the only method of the 3 that has a *concept* of "junk" <wink>.
#
# Note that ndiff makes no claim to produce a *minimal* diff.  To the
# contrary, minimal diffs are often counter-intuitive, because they
# synch up anywhere possible, sometimes 
