Re: diff regexps

2006-07-12 Thread Oleg Goldshmidt
Gilad Ben-Yossef [EMAIL PROTECTED] writes:

 The problem is that I tried various combinations and none worked:

 diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I
 '\$Id' -I '\$Header' -I '\$Date' -I '\$Source' -I '\$Auther'

Try double quotes?

Here is a diff of a file from two different CVS branches:

$ diff -Nur {prototype,exceptions}/src/clone.cc | grep \$Id
-   $Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;
+   $Id: clone.cc,v 1.25.2.1 2006/07/10 08:16:40 olegg Exp $;
$ diff -Nur -I\$Id {prototype,exceptions}/src/clone.cc | grep \$Id
$
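
The experiment above can be reproduced without a CVS checkout. This is only a sketch: the scratch files a.c and b.c and their contents are made up for illustration, and GNU diff and grep are assumed. The two files differ only in their $Id line.

```shell
#!/bin/bash
# Hypothetical scratch files that differ only in the CVS $Id keyword line.
tmp=$(mktemp -d)
printf '%s\n' '/* $Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $ */' 'int x;' > "$tmp/a.c"
printf '%s\n' '/* $Id: clone.cc,v 1.25.2.1 2006/07/10 08:16:40 olegg Exp $ */' 'int x;' > "$tmp/b.c"

# Plain diff reports the keyword line on both sides.
plain=$(diff -u "$tmp/a.c" "$tmp/b.c" | grep -c '\$Id')

# With -I, a hunk whose changed lines all match the regexp is dropped.
# The shell removes the backslash, so diff receives the regexp $Id.
ignored=$(diff -u -I \$Id "$tmp/a.c" "$tmp/b.c" | grep -c '\$Id')

echo "keyword lines without -I: $plain, with -I: $ignored"
rm -rf "$tmp"
```

The counts are taken with grep -c so that the result is visible even when diff prints nothing at all.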

-- 
Oleg Goldshmidt | [EMAIL PROTECTED] | http://www.goldshmidt.org

=
To unsubscribe, send mail to [EMAIL PROTECTED] with
the word unsubscribe in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]



Re: diff regexps

2006-07-12 Thread Oleg Goldshmidt
Gilad Ben-Yossef [EMAIL PROTECTED] writes:

 diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I
 '\$Id' -I '\$Header' -I '\$Date' -I '\$Source' -I '\$Auther'

I forgot to mention the obvious in my previous response: if you are
comparing two versions under CVS control (which is not the case for
you, I gather), then there are additional CVS-specific options to
exclude keyword diffs:

$ cvs diff -u -r1.27 clone.cc | grep \$Id
-   $Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;
+   $Id: clone.cc,v 1.25.2.1 2006/07/10 08:16:40 olegg Exp $;
$ cvs diff -u -r1.27 -kk clone.cc | grep \$Id
$

See info cvs.

-- 
Oleg Goldshmidt | [EMAIL PROTECTED] | http://www.goldshmidt.org




Re: diff regexps

2006-07-12 Thread Shachar Shemesh

Oleg Goldshmidt wrote:


 Try double quotes?
  

It makes no sense:

$ echo '\$Id'
\$Id
$ echo \$Id
$Id
You really want the former, as $ has special meaning in a regexp, and
therefore is supposed to need a backslash before it if used literally.
I'm not saying you are not right, just that it's strange that this is 
the case.
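
To see exactly which string each quoting style hands to the called program, printf can stand in for diff. This is a small sketch, assuming a POSIX-style shell such as bash; the variable names are made up for illustration.

```shell
#!/bin/bash
# printf '%s\n' prints its argument exactly as the called program receives it.
single=$(printf '%s\n' '\$Id')   # single quotes: the backslash is kept
bare=$(printf '%s\n' \$Id)       # unquoted: the shell removes the backslash
double=$(printf '%s\n' "\$Id")   # double quotes: the shell removes it as well

printf 'single: %s\nbare:   %s\ndouble: %s\n' "$single" "$bare" "$double"
```

Only the single-quoted form delivers a backslash to the program, so only there does the regexp engine see an escaped dollar.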


 Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html





Re: diff regexps

2006-07-12 Thread Ehud Karni
On Wed, 12 Jul 2006 10:57:35 Oleg Goldshmidt wrote:

 Gilad Ben-Yossef writes:

  The problem is that I tried various combinations and none worked:
 
  diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I
  '\$Id' -I '\$Header' -I '\$Date' -I '\$Source' -I '\$Auther'

 Try double quotes?

 Here is a diff of a file from two different CVS branches:

 $ diff -Nur {prototype,exceptions}/src/clone.cc | grep \$Id
 -   $Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;
 +   $Id: clone.cc,v 1.25.2.1 2006/07/10 08:16:40 olegg Exp $;
 $ diff -Nur -I\$Id {prototype,exceptions}/src/clone.cc | grep \$Id
 $

In bash you can use \$VAR or '$VAR' (i.e. you need not escape the $
when it is between apostrophes).
There is an even more exotic form, $'string', in which bash itself
(NOT the called application) interprets the backslash escapes in the string.

Run the following script to see the differences:

#! /bin/bash -ex

VAR="example  1 \\ \$ \' \" \134 \044 \047 \042"

echo -E "$VAR"
echo -E '$VAR'
echo -E $'$VAR'
echo -E '$'"'$VAR'"
eval echo -E '$'"'$VAR'"
eval UNESC1='$'"'$VAR'"    # unescaped var
echo -E $UNESC1 | $UNESC2
echo -E $UNESC
echo -E \$VAR
echo -E '\$VAR'
echo -E $'\$VAR'

On Wed, 12 Jul 2006 11:23:02 Shachar Shemesh wrote:

  Try double quotes?
 
 It makes no sense:
  $ echo '\$Id'
  \$Id
  $ echo \$Id
  $Id
 You really want the former, as $ has special meaning in a regexp, and
 therefore is supposed to need a backslash before it if used literally.
 I'm not saying you are not right, just that it's strange that this is
 the case.

Shachar, you are wrong. Gilad wants the `diff' program to see $ID not
\$ID which is what '\$ID' gives to the application (diff does not
substitute $ID with its environment value, bash does it).

Ehud.


--
 Ehud Karni   Tel: +972-3-7966-561  /\
 Mivtach - Simon  Fax: +972-3-7966-667  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D http://www.keyserver.net/Better Safe Than Sorry




Re: diff regexps

2006-07-12 Thread Shachar Shemesh

Ehud Karni wrote:

 Shachar, you are wrong. Gilad wants the `diff' program to see $ID not
 \$ID which is what '\$ID' gives to the application (diff does not
 substitute $ID with its environment value, bash does it).
  
Last time I checked, $ in regexp meant match end of line. '$Id' 
would mean, if I understand this correctly, an Id coming AFTER the end 
of the line (an impossible combination, I know, but still). If I want 
grep to understand a literal $, I need to pass it a \$, which I can 
do either by doing \\\$Id or '\$Id'.


I stand by my original statement.

 Ehud.
  

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html





Re: diff regexps

2006-07-12 Thread Ehud Karni
On Wed, 12 Jul 2006 12:11:40 +0300, Shachar Shemesh wrote:

 Ehud Karni wrote:
  Shachar, you are wrong. Gilad wants the `diff' program to see $ID not
  \$ID which is what '\$ID' gives to the application (diff does not
  substitute $ID with its environment value, bash does it).
 
 Last time I checked, $ in regexp meant match end of line. '$Id'
 would mean, if I understand this correctly, an Id coming AFTER the end
 of the line (an impossible combination, I know, but still). If I want
 grep to understand a literal $, I need to pass it a \$, which I can
 do either by doing \\\$Id or '\$Id'.

 I stand by my original statement.

You are right.

I'll say it again: YOU ARE RIGHT !  I take my statement back.

I think a better way to pass the $Id would be '[$]Id'; then you don't
have to worry about who is eating the backslashes (and how many of them).
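
A quick check of the bracket form, as a sketch: grep stands in for the regexp matching, and the sample line is copied from the earlier examples in this thread.

```shell
#!/bin/bash
# A sample keyword line taken from the examples in this thread.
line='$Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;'

# Inside a bracket expression $ is an ordinary character, so the same
# pattern matches a literal dollar in basic and in extended mode.
bre=$(printf '%s\n' "$line" | grep -c '[$]Id')
ere=$(printf '%s\n' "$line" | grep -E -c '[$]Id')

echo "basic: $bre  extended: $ere"
```

Because no backslash is involved, the pattern looks the same whether it is written inside single quotes or double quotes.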

Ehud.


--
 Ehud Karni   Tel: +972-3-7966-561  /\
 Mivtach - Simon  Fax: +972-3-7966-667  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D http://www.keyserver.net/Better Safe Than Sorry




Re: diff regexps

2006-07-12 Thread Adam Morrison
On Wed, Jul 12, 2006 at 12:23:44PM +0300, Ehud Karni wrote:

  Last time I checked, $ in regexp meant match end of line. '$Id'
  would mean, if I understand this correctly, an Id coming AFTER the end
  of the line (an impossible combination, I know, but still). If I want
  grep to understand a literal $, I need to pass it a \$, which I can
  do either by doing \\\$Id or '\$Id'.
 
  I stand by my original statement.
 
 You are right.
 
 I'll say it again: YOU ARE RIGHT !  I take my statement back.

Not quite.  There are basic regular expressions and extended regular
expressions.  The $ means end of string only when used as an anchor.
In basic regular expressions, the $ is an anchor only if used at the
end of the regular expression.  For extended regular expressions, what
Shachar said is essentially correct.

Apparently, diff uses basic regular expressions.

bash-3.00$ cat t1
$Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;
bash-3.00$ grep '$Id' t1
$Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;
bash-3.00$ grep -E '$Id' t1
bash-3.00$
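
The other half of the rule, that a trailing $ is an anchor even in a basic regexp, can be checked the same way. This is a sketch assuming GNU grep; the variable names are made up for illustration.

```shell
#!/bin/bash
line='$Id: clone.cc,v 1.27 2006/07/10 08:30:21 olegg Exp $;'

# A trailing $ in a basic regexp is an anchor: the line does not end
# right after "Exp ", so nothing matches.
anchored=$(printf '%s\n' "$line" | grep -c 'Exp $')

# An escaped dollar is a literal dollar in both basic and extended mode.
lit_bre=$(printf '%s\n' "$line" | grep -c 'Exp \$')
lit_ere=$(printf '%s\n' "$line" | grep -E -c 'Exp \$')

echo "anchored: $anchored  literal (basic): $lit_bre  literal (extended): $lit_ere"
```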






Re: diff regexps

2006-07-12 Thread Gilad Ben-Yossef

Adam Morrison wrote:



 Apparently, diff uses basic regular expressions.


Indeed, that was also why my regexp did not work - I was trying to use extended regular expressions, whereas diff only
supports basic ones.


Luckily, for GNU versions of grep/diff, the main difference between basic and extended regexps is that some special
characters of extended mode (namely {, }, |, (, ), + and ?) need to be prefixed with a backslash for their special meaning to
be used.  And, backwards from what you would normally expect, a $ in the middle of a basic regexp is just a dollar, not a marker for end of line.


In the end, this is the diff line used:

# diff   -pBbNaur -X dontdiff this_kernel/ that_kernel/ -I 
'$Id\|$Header\|$Date\|$Source\|$Author\|$Revision'

Note that this did *not* weed out all uses of the CVS keywords, only those that happened in a block of changes where all
the changed lines matched the regexp, so $Log ... $ lines, for example, could not be caught using this technique.
Luckily, after this diff line the number of files with those lines was small enough to allow manual trimming by using an
exclude-by-file technique.
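
The limitation can be seen on a small scale. This is only a sketch: the files, the make_file helper and their contents are made up for illustration, and GNU diff is assumed. The $Id line changes on its own, while the $Revision line sits directly next to a real code change.

```shell
#!/bin/bash
# Hypothetical scratch files; names and contents are made up for illustration.
tmp=$(mktemp -d)

make_file() {   # $1 = revision number, $2 = value of the real code line
    printf '/* $Id: demo.c,v %s 2006/07/10 08:30:21 olegg Exp $ */\n' "$1"
    i=1
    while [ "$i" -le 20 ]; do      # filler keeps the two changes far apart
        printf 'int filler%s;\n' "$i"
        i=$((i + 1))
    done
    printf '/* $Revision: %s $ */\n' "$1"
    printf 'int count = %s;\n' "$2"
}
make_file 1.1 1 > "$tmp/a.c"
make_file 1.2 2 > "$tmp/b.c"

out=$(diff -u -I '$Id\|$Revision' "$tmp/a.c" "$tmp/b.c")

# The $Id line changes alone, so its block of changes is dropped entirely.
id_lines=$(printf '%s\n' "$out" | grep -c '^[-+]/\* \$Id')
# The $Revision line is part of a block with a real change, so it is printed.
rev_lines=$(printf '%s\n' "$out" | grep -c '^[-+]/\* \$Revision')

echo "\$Id lines shown: $id_lines  \$Revision lines shown: $rev_lines"
rm -rf "$tmp"
```

Keyword lines that share a block with real changes survive the filter, which is why some manual cleanup was still needed.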


Thanks very much to Ehud, Shachar, Oleg, Adam and everyone else!

Gilad



--
Gilad Ben-Yossef [EMAIL PROTECTED]
Codefidence. A name you can trust(tm)
Web: http://codefidence.com  | SIP: [EMAIL PROTECTED]
IL: +972.3.7515563 ext. 201  | Fax:+972.3.7515503
US: +1.212.2026643 ext. 201  | Cel:   +972.52.8260388

Resistence was futile.
-- Danny Getz, 2004.




diff regexps

2006-07-11 Thread Gilad Ben-Yossef

Howdie,

I am trying to diff a vanilla vs. a vendor supplied linux kernel source 
tree.


Some dweeb at the vendor has put the Linux kernel into CVS, causing 
every line with $Id, $Revision, $Date, $Source etc. in the Linux kernel 
to mutate.


I'm trying to generate a diff that doesn't include these random changes 
but does include the real changes that occur between the two trees.


Now, diff has an option, -I, via which one can provide a regexp, and blocks of
changed lines that all match the regexp will not get to the output.


The problem is that I tried various combinations and none worked:

diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I 
'\$Id' -I '\$Header' -I '\$Date' -I '\$Source' -I '\$Auther'


and also:

diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I 
'\$Id|\$Header|\$Date|\$Source|\$Auther'


but to no avail. Anyone care to help a poor regexp-challenged bugger?
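
One likely reason the second form fails can be sketched with grep standing in for diff's -I matching. That substitution is an assumption based on the thread's conclusion that diff uses GNU basic regexps; diff itself is not run here, and the sample line is made up for illustration.

```shell
#!/bin/bash
# A made-up keyword line of the kind CVS expands.
line='/* $Header: /cvs/linux/demo.c,v 1.1 2006/07/10 08:30:21 olegg Exp $ */'

# Extended syntax: | separates alternatives, so the line matches.
ere=$(printf '%s\n' "$line" | grep -E -c '\$Id|\$Header')
# Basic syntax: the same | is an ordinary character, so nothing matches.
bre=$(printf '%s\n' "$line" | grep -c '\$Id|\$Header')
# Basic syntax with the GNU \| extension matches again.
bre_alt=$(printf '%s\n' "$line" | grep -c '\$Id\|\$Header')

echo "extended: $ere  basic: $bre  basic with \\|: $bre_alt"
```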

Thanks,
Gilad






Re: diff regexps

2006-07-11 Thread Omer Zak
Hello Gilad,
You are making a big jump here.  Let's try to walk a small step at a
time.

1. Start with diffing two files i.e. don't use the -X options or the
   -pBbNaur which requires several searches through man diff to verify.
   After the two-file compare works, try to add back those options.
2. Check the way you enter regular expressions by, for example,
   echo '\$Header' and egrepping your files with the same regular
   expression.
   (From my check, under bash and using egrep, '\$Header' is the correct
   way to match a line with the string '$Header' - so you got that part
   right.)
3. You wrote '\$Auther' - shouldn't it be '\$Author'?
4. If you did not solve the problem yourself by now, show the actual
   output from diff'ing two files with the unwanted blocks.
   Also specify exactly which shell you used (I tested under
   bash 2.05b.0(1)-release).
   We'll know better where to look for the problem.
  --- Omer

On Tue, 2006-07-11 at 21:09 +0300, Gilad Ben-Yossef wrote:
 Howdie,
 
 I am trying to diff a vanilla vs. a vendor supplied linux kernel source 
 tree.
 
 Some dweeb at the vendor has put the Linux kernel into CVS, causing 
 every line with $Id, $Revision, $Date, $Source etc. in the Linux kernel 
 to mutate.
 
 I'm trying to generate a diff that doesn't include these random changes 
 but does include the real changes that occur between the two trees.
 
 Now, diff has an option, -I, via which one can provide a regexp, and blocks of
 changed lines that all match the regexp will not get to the output.
 
 The problem is that I tried various combinations and none worked:
 
 diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I 
 '\$Id' -I '\$Header' -I '\$Date' -I '\$Source' -I '\$Auther'
 
 and also:
 
 diff -X ti_dontdiff  -pBbNaur -X dontdiff this-kernel/ that-kernel/ -I 
 '\$Id|\$Header|\$Date|\$Source|\$Auther'
 
 but to no avail. Anyone care to help a poor regexp-challenged bugger?
 
 Thanks,
 Gilad

