Re: Regular Expression Trouble

2008-08-28 Thread Peter Boosten


Paul Chvostek wrote:

  This is an attempt to isolate every MAC address that
 appears and then sort and count them to see who is having
 trouble or, in some cases, is causing trouble.
 
 Then you still may want to use awk for some of that...
 
   cat /var/log/dhcpd.log | \
   sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p' | \
   awk '
{ a[$1]++; }
END {
 for(i in a){
  printf("%7.0f\t%s\n", a[i], i);
 }
}
   ' | sort -nr
 

Euhmm:

sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p' /var/log/dhcpd.log|\
sort | uniq -c

:-)
Peter
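
One caveat: sort | uniq -c lists the addresses in lexical order. To rank them busiest first, as the sort -nr in the quoted pipeline does, a second sort is needed. A minimal sketch, where the helper name count_macs and the sample lines are made up for illustration:

```shell
# count_macs: hypothetical helper; reads log lines on stdin and prints
# "count MAC" pairs, most frequent first.
count_macs() {
    sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p' |
        sort | uniq -c | sort -nr
}

# Made-up sample data in the dhcpd syslog format discussed in this thread.
printf '%s\n' \
    'Aug 26 20:45:36 dh1 dhcpd: DHCPDISCOVER from 00:12:f0:88:97:d6 via eth0' \
    'Aug 26 20:45:37 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 via 10.198.71.246' \
    'Aug 26 20:45:38 dh1 dhcpd: DHCPDISCOVER from 00:16:d3:9e:eb:74 via eth0' |
    count_macs
```

The address seen twice comes out first; uniq -c pads the count with leading spaces.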

-- 
http://www.boosten.org
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Regular Expression Trouble

2008-08-27 Thread Wayne Sierke
On Tue, 2008-08-26 at 22:12 -0500, Martin McCormick wrote:
 I am trying to isolate only the MAC addresses that appear in
 dhcpd logs.
 For anyone who is interested, the sed construct that should do
 this looks like:
 
  sed 's/.*\([[ your regular expression ]]\).*/\1/' 
 
 The \1 tells sed to only print what matched and skip all the rest.
 
   I am doing something wrong with the regular expression
 that is supposed to recognise a MAC address. MAC addresses look
 like 5 pairs of hex digits followed by :'s and then a 6TH pair
 to end the string.
 
   I have tried:
 
 [[:xdigit:][:xdigit:][:punct:]
 
 Sorry. It won't all fit on a line, but there should be a string
 of 5 pairs and the : and then the 6TH pair followed by the
 closing ] so the expression ends with ]]
 
 One should also be able to put:
 
 [[:xdigit:][:xdigit:][:punct:]]\{5,5\}[[:xdigit:][:xdigit]]
 
 Any ideas as to what else I can try?
 
There have already been good suggestions for you to choose from. I'll
just add my bucketful to the TIMTOWTDI pool:

Since you weren't specific about the format of the log data that you're
attempting to parse (keep that in mind for future questions):

# ifconfig | grep ether
ether 02:00:20:75:43:34
ether 00:40:05:10:b9:79
# ifconfig | sed -nE 's/.*ether (([[:xdigit:]]{2}:){5}[[:xdigit:]]{2}).*/\1/p'
02:00:20:75:43:34
00:40:05:10:b9:79
# ifconfig | sed -nE 's/.*ether ([[:xdigit:]:]+).*/\1/p'
02:00:20:75:43:34
00:40:05:10:b9:79
# ifconfig | sed -nE 's/.*ether ([0-9a-f:]+).*/\1/p'
02:00:20:75:43:34
00:40:05:10:b9:79
# ifconfig | sed -nE '/ether/s/.*([0-9a-f:]{17}).*/\1/p'
02:00:20:75:43:34
00:40:05:10:b9:79

And then there's:

# ifconfig | grep ether | cut -d ' ' -f 2
02:00:20:75:43:34
00:40:05:10:b9:79

But my preference would be:

# ifconfig | awk '/ether/ {print $2}'
02:00:20:75:43:34
00:40:05:10:b9:79


Wayne





Re: Regular Expression Trouble

2008-08-27 Thread Martin McCormick
My thanks to several people who have provided great suggestions
and an apology for not being clear on the log data I am mining
for MAC addresses. It is syslog and a typical line looks like:

Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6
(peaster-laptop) via 10.198.71.246 

That was one line broken to aid in emailing, but that's what
types of lines are involved. The MAC appears at different field
locations depending on the type of event being logged so awk is
perfect for certain types of lines, but it misses others and no
one awk expression gets them all.

This is an attempt to isolate every MAC address that
appears and then sort and count them to see who is having
trouble or, in some cases, is causing trouble.

The sed pattern matching system is interesting because I
can think of several similar situations in which the data are
there but there is no guarantee where on a given line it sits
and grep or sed usually will pull in the whole line containing
the desired data which means that one must further parse things
to get what is wanted.

Martin McCormick


Re: Regular Expression Trouble

2008-08-27 Thread Jonathan McKeown
On Wednesday 27 August 2008 15:25:02 Martin McCormick wrote:

   The sed pattern matching system is interesting because I
 can think of several similar situations in which the data are
 there but there is no guarantee where on a given line it sits
 and grep or sed usually will pull in the whole line containing
 the desired data which means that one must further parse things
 to get what is wanted.

Hi Martin

Look at grep -o which only outputs the bit that matched the regexp. Using 
egrep, you can look for exactly two hex digits and a colon, repeated exactly 
five times, and followed by exactly two hex digits:

egrep -o '([[:xdigit:]]{2}:){5}[[:xdigit:]]{2}' inputfile

will parse inputfile and output all the MAC addresses it finds, one per line 
(if it finds more than one on an input line, it'll match them and print them 
on separate output lines), and nothing else.
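
To illustrate the point about several matches on one input line, here is a small sketch with made-up data. It uses grep -E in place of egrep; the helper name extract_macs is hypothetical, and -o is an extension offered by GNU and FreeBSD grep rather than, as far as I know, a POSIX option:

```shell
# extract_macs: hypothetical helper; prints every MAC address found on
# stdin, one per output line.
extract_macs() {
    grep -Eo '([[:xdigit:]]{2}:){5}[[:xdigit:]]{2}'
}

# The first (made-up) line carries two addresses, the second none.
printf '%s\n' \
    'relay 00:12:f0:88:97:d6 forwarded for 00:16:D3:9E:EB:74' \
    'Aug 26 20:45:36 dh1 dhcpd: no free leases' |
    extract_macs
```

Because [[:xdigit:]] covers both cases, the upper-case address is picked up too, while the timestamp, with only three pairs, is not.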

Jonathan


Re: Regular Expression Trouble

2008-08-27 Thread Wayne Sierke
On Wed, 2008-08-27 at 08:25 -0500, Martin McCormick wrote:
 My thanks to several people who have provided great suggestions
 and an apology for not being clear on the log data I am mining
 for MAC addresses. It is syslog and a typical line looks like:
 
 Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6
 (peaster-laptop) via 10.198.71.246 
 
 That was one line broken to aid in emailing, but that's what
 types of lines are involved. The MAC appears at different field
 locations depending on the type of event being logged so awk is
 perfect for certain types of lines, but it misses others and no
 one awk expression gets them all.

The way to deal with that is to specify a pattern to match something
that distinguishes each form of log line that you want to extract from.
With the following (contrived) log data:

Aug 26 20:45:36 dh1 dhcpd: DHCPDISCOVER from 00:12:f0:88:97:d6 
(peaster-laptop) via eth0
Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 
(peaster-laptop) via 10.198.71.246 

use awk with a script such as:

awk '/DHCPDISCOVER/ {print $8} /DHCPACK/ {print $10}' logfile
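
The same pattern-action pairs can do the counting as well. A sketch using the contrived data above, with each entry on a single line, and assuming the MAC address really is field 8 or field 10 for these two message types (count_by_type is a made-up name):

```shell
# count_by_type: hypothetical helper; tallies MAC addresses using the
# per-message-type field positions from the awk one-liner above.
count_by_type() {
    awk '
        /DHCPDISCOVER/ { seen[$8]++ }
        /DHCPACK/      { seen[$10]++ }
        END { for (mac in seen) printf("%d\t%s\n", seen[mac], mac) }
    ' | sort -nr
}

printf '%s\n' \
    'Aug 26 20:45:36 dh1 dhcpd: DHCPDISCOVER from 00:12:f0:88:97:d6 (peaster-laptop) via eth0' \
    'Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 (peaster-laptop) via 10.198.71.246' |
    count_by_type
```

Other message types would each need their own pattern and field number.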


Wayne




Re: Regular Expression Trouble

2008-08-27 Thread Paul Chvostek
Hi Martin.

On Wed, Aug 27, 2008 at 08:25:02AM -0500, Martin McCormick wrote:
 
 Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6
 (peaster-laptop) via 10.198.71.246 
 
 That was one line broken to aid in emailing, but that's what
 types of lines are involved. The MAC appears at different field
 locations depending on the type of event being logged so awk is
 perfect for certain types of lines, but it misses others and no
 one awk expression gets them all.

While I agree with others that awk should be used with explicit
recognition of the particular lines, you can still snatch everything
with sed if you want to.  In FreeBSD, sed supports extended regex, so:

sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p'

The -n option tells sed not to print the line unless instructed to
explicitly, and the p modifier at the end is that instruction.  As
for the regex ... well, that's straightforward enough.
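
The effect of -n and p is easiest to see side by side. A small sketch with two made-up input lines, only one of which contains a MAC address (the names sample and re are just for illustration):

```shell
# Two made-up log lines; only the first contains a MAC address.
sample() {
    printf '%s\n' \
        'dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 via 10.198.71.246' \
        'dhcpd: Wrote 0 deleted host decls to leases file.'
}

re='s/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/'

# Without -n, sed prints every line, so the line that did not match
# passes through unchanged.
sample | sed -E "$re"

# With -n and the p flag, only lines where the substitution succeeded
# are printed.
sample | sed -nE "${re}p"
```

The first command prints two lines, the second only the address.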

   This is an attempt to isolate every MAC address that
 appears and then sort and count them to see who is having
 trouble or, in some cases, is causing trouble.

Then you still may want to use awk for some of that...

cat /var/log/dhcpd.log | \
sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p' | \
awk '
 { a[$1]++; }
 END {
  for(i in a){
   printf("%7.0f\t%s\n", a[i], i);
  }
 }
' | sort -nr

You can join the lines into a single command line if you like, or toss
it as-is into a tiny shell script.  Awk is forgiving about whitespace.

You should theoretically be able to feed the same regex to awk, but I've
found that awk's eregex support sometimes doesn't work as I'd expect.
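
One way to sidestep differences in awk's regex support is to avoid interval expressions such as {2} altogether, since some awk implementations do not handle them. A sketch, not tested against every awk, that checks each whitespace-separated field and therefore assumes the MAC address always stands alone as its own field (macs_by_field is a made-up name):

```shell
# macs_by_field: hypothetical helper; prints any field that looks like
# six colon-separated pairs of lowercase hex digits.
macs_by_field() {
    awk '{
        # Colon-joined pairs that are 17 characters long: exactly six pairs.
        for (i = 1; i <= NF; i++)
            if (length($i) == 17 &&
                $i ~ /^[0-9a-f][0-9a-f](:[0-9a-f][0-9a-f])+$/)
                print $i
    }'
}

printf '%s\n' \
    'Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 (peaster-laptop) via 10.198.71.246' |
    macs_by_field
```

The length test is what rejects the timestamp, which has the right shape but only three pairs.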

Hope this helps.

p

-- 
  Paul Chvostek [EMAIL PROTECTED]
  it.canada    http://www.it.ca/



Re: Regular Expression Trouble

2008-08-27 Thread Martin McCormick
Paul Chvostek writes:
 While I agree with others that awk should be used with explicit
 recognition of the particular lines, you can still snatch everything
 with sed if you want to.  In FreeBSD, sed supports extended regex, so:
 
 sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p'
 
 The -n option tells sed not to print the line unless instructed to
 explicitly, and the p modifier at the end is that instruction.  As
 for the regex ... well, that's straightforward enough.
 
This is an attempt to isolate every MAC address that
  appears and then sort and count them to see who is having
  trouble or, in some cases, is causing trouble.
 
 Then you still may want to use awk for some of that...

Actually, I have been using awk, but maybe not as
efficiently as I could be, judging from the responses. I was
hoping that the sed script would recognize 6 pairs of hex digits
connected by :'s no matter where they appeared in a line and
give me just that pattern match as, in this case, I don't care
why the MAC address printed, only that it did and having nothing
but MAC's makes the rest of the sorting and counting trivial.

Other helpful examples not quoted but much appreciated. . .

 Hope this helps.

It helps a lot. Awk is one of those things that one can use for
years and still not exploit all the good things it has. I am
amazed even after years of using UNIX how much genius is packed
into the basic system.

One last sed observation. I did fail to use the -E flag
so sed didn't know it should be using extended RE's.
I will give your examples a try for both sed and awk and  see
what new capabilities I can come up with.
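
For comparison, here is the same expression in both syntaxes. Without -E, sed uses basic regular expressions, in which grouping and intervals need backslashes; with -E they are written bare. A sketch using the sample log line from this thread (the variable names are arbitrary):

```shell
line='Aug 26 20:45:36 dh1 dhcpd: DHCPACK on 10.198.67.116 to 00:12:f0:88:97:d6 (peaster-laptop) via 10.198.71.246'

# Basic RE (no -E): \( \) group and \{ \} repeat.
bre=$(printf '%s\n' "$line" |
    sed -n 's/.*\([0-9a-f]\{2\}\(:[0-9a-f]\{2\}\)\{5\}\).*/\1/p')

# Extended RE (-E): the same pattern without the backslashes.
ere=$(printf '%s\n' "$line" |
    sed -nE 's/.*([0-9a-f]{2}(:[0-9a-f]{2}){5}).*/\1/p')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
```

Both forms should print the same address.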

Again, a thousand thanks to you and everyone else for
your answers and patience. It is good to see many different ways
of solving the same problem.

Martin McCormick


Regular Expression Trouble

2008-08-26 Thread Martin McCormick
I am trying to isolate only the MAC addresses that appear in
dhcpd logs.
For anyone who is interested, the sed construct that should do
this looks like:

 sed 's/.*\([[ your regular expression ]]\).*/\1/' 

The \1 tells sed to only print what matched and skip all the rest.

I am doing something wrong with the regular expression
that is supposed to recognise a MAC address. MAC addresses look
like 5 pairs of hex digits followed by :'s and then a 6TH pair
to end the string.

I have tried:

[[:xdigit:][:xdigit:][:punct:]

Sorry. It won't all fit on a line, but there should be a string
of 5 pairs and the : and then the 6TH pair followed by the
closing ] so the expression ends with ]]

One should also be able to put:

[[:xdigit:][:xdigit:][:punct:]]\{5,5\}[[:xdigit:][:xdigit]]

Any ideas as to what else I can try?

What happens is I get single characters per line that look like
the first or maybe the last character in that line, but
certainly nothing useful or nothing that remotely looks like a
MAC address.

Any ideas as to what's wrong with the regular
expression?

Many thanks.

Martin McCormick WB5AGZ  Stillwater, OK 
Systems Engineer
OSU Information Technology Department Telecommunications Services Group


Re: Regular Expression Trouble

2008-08-26 Thread Paul A. Procacci

Martin McCormick wrote:

I am trying to isolate only the MAC addresses that appear in
dhcpd logs.
For anyone who is interested, the sed construct that should do
this looks like:

 sed 's/.*\([[ your regular expression ]]\).*/\1/' 


The \1 tells sed to only print what matched and skip all the rest.

I am doing something wrong with the regular expression
that is supposed to recognise a MAC address. MAC addresses look
like 5 pairs of hex digits followed by :'s and then a 6TH pair
to end the string.

I have tried:

[[:xdigit:][:xdigit:][:punct:]

Sorry. It won't all fit on a line, but there should be a string
of 5 pairs and the : and then the 6TH pair followed by the
closing ] so the expression ends with ]]

One should also be able to put:

[[:xdigit:][:xdigit:][:punct:]]\{5,5\}[[:xdigit:][:xdigit]]

Any ideas as to what else I can try?

What happens is I get single characters per line that look like
the first or maybe the last character in that line, but
certainly nothing useful or nothing that remotely looks like a
MAC address.

Any ideas as to what's wrong with the regular
expression?

Many thanks.

Martin McCormick WB5AGZ  Stillwater, OK 
Systems Engineer

OSU Information Technology Department Telecommunications Services Group
  


I don't have a separate dhcp log and you didn't make it clear if you do, 
but I do have something similar written for awk that parses the system 
log file.


awk ' /DHCPREQUEST/ { print $10 } '  /var/log/messages

Maybe that will help.

~Paul


Re: Regular Expression Trouble

2008-08-26 Thread Bill Campbell
On Tue, Aug 26, 2008, Martin McCormick wrote:
I am trying to isolate only the MAC addresses that appear in
dhcpd logs.
For anyone who is interested, the sed construct that should do
this looks like:

 sed 's/.*\([[ your regular expression ]]\).*/\1/' 

The \1 tells sed to only print what matched and skip all the rest.


I just tried this, and it worked:

sed -n 's;.* to \([0-9:a-z]*\) via.*;\1;p' logfile

It would have been easier in perl or python where one could use
the pattern '.* to (\S+) via.*'.

Bill
-- 
INTERNET:   [EMAIL PROTECTED]  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice:  (206) 236-1676  Mercer Island, WA 98040-0820
Fax:(206) 232-9186

My reading of history convinces me that most bad government results
from too much government. --Thomas Jefferson.


Re: Regular Expression Trouble

2008-08-26 Thread N. Raghavendra
At 2008-08-26T22:12:19-05:00, Martin McCormick wrote:

 I am trying to isolate only the MAC addresses that appear in
 dhcpd logs.
 For anyone who is interested, the sed construct that should do
 this looks like:

It'd be better if you post a few relevant lines of the log file.
Pending that, I suggest awk(1) in case the log file format is similar
to the following snippet of `dhcpd.leases' on an OpenBSD server:

% cat dhcpd.leases
lease 192.168.10.216 {
  starts 3 2008/07/16 23:17:29;
  ends 4 2008/07/17 00:17:29;
  tstp 4 2008/07/17 00:17:29;
  binding state free;
  hardware ethernet 00:1f:c6:81:66:a6;
}
lease 192.168.10.65 {
  starts 4 2008/07/17 11:15:48;
  ends 5 2008/07/18 11:15:48;
  tstp 5 2008/07/18 11:15:48;
  binding state free;
  hardware ethernet 00:16:d3:9e:eb:74;
}

% awk '/hardware ethernet/ { print substr($3, 4, 14) }' dhcpd.leases
1f:c6:81:66:a6
16:d3:9e:eb:74
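
A variant that does not depend on character offsets: strip the trailing semicolon with sub() and print the whole third field, which gives all six octets. This is only a sketch; it assumes every `hardware ethernet' line has the layout shown in the snippet above, and lease_macs is a made-up name:

```shell
# lease_macs: hypothetical helper; prints the full hardware address from
# each "hardware ethernet" line of a dhcpd.leases-style file on stdin.
lease_macs() {
    awk '/hardware ethernet/ { sub(/;$/, "", $3); print $3 }'
}

printf '%s\n' \
    'lease 192.168.10.216 {' \
    '  binding state free;' \
    '  hardware ethernet 00:1f:c6:81:66:a6;' \
    '}' |
    lease_macs
```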

Raghavendra.

-- 
N. Raghavendra [EMAIL PROTECTED] | http://www.retrotexts.net/
Harish-Chandra Research Institute   | http://www.mri.ernet.in/
See message headers for contact and OpenPGP information.



Re: Regular Expression Trouble

2005-12-12 Thread Martin McCormick
Parv writes:
For even finer results, use word boundaries ...

  egrep '\bIN[^[:alnum:]]+A\b' file
  egrep '\<IN[[:space:]]+A\>'  file

I sincerely thank all for your examples which I have saved for
future reference.  The word boundary test appears to work perfectly,
but after looking at all the other examples, they should work also
giving living proof that in UNIX, there are many perfectly valid ways
to solve the same problem.  Again, many thanks.

Martin McCormick WB5AGZ  Stillwater, OK 
OSU Information Technology Department Network Operations Group
.-- -... . .- --. --..


Regular Expression Trouble

2005-12-09 Thread Martin McCormick
After reading  a bit about extended regular expressions and
having a few actually work correctly in sed scripts, I tried one in
egrep and it isn't working although there are no errors.

I was hoping to get only the A records from a dns zone file so
the expression I used is:

egrep [[:space:]IN[:space:]A[:space:]] zone_file > h0

Had it worked, all that would have appeared in the h0 file was
all the A or Address records.  Instead, I get those plus almost
everything else in the file.  It is obviously not filtering correctly.
Plain grep produces a 0-length output file so that is not too useful
either.  Putting double quotes around the RE didn't help either.

It seems to match almost everything.

Thanks for any good ideas.

Martin McCormick WB5AGZ  Stillwater, OK 
OSU Information Technology Department Network Operations Group
.-- -... . .- --. --..


Re: Regular Expression Trouble

2005-12-09 Thread Andrew P.
On 12/10/05, Martin McCormick [EMAIL PROTECTED] wrote:
 After reading  a bit about extended regular expressions and
 having a few actually work correctly in sed scripts, I tried one in
 egrep and it isn't working although there are no errors.

 I was hoping to get only the A records from a dns zone file so
 the expression I used is:

 egrep [[:space:]IN[:space:]A[:space:]] zone_file > h0

 Had it worked, all that would have appeared in the h0 file was
 all the A or Address records.  Instead, I get those plus almost
 everything else in the file.  It is obviously not filtering correctly.
 Plain grep produces a 0-length output file so that is not too useful
 either.  Putting double quotes around the RE didn't help either.

 It seems to match almost everything.

 Thanks for any good ideas.

 Martin McCormick WB5AGZ  Stillwater, OK
 OSU Information Technology Department Network Operations Group
 .-- -... . .- --. --..


Isn't it a character class consisting of a white-space
and three letters?


Re: Regular Expression Trouble

2005-12-09 Thread Glenn Dawson

At 02:12 PM 12/9/2005, Martin McCormick wrote:

After reading  a bit about extended regular expressions and
having a few actually work correctly in sed scripts, I tried one in
egrep and it isn't working although there are no errors.

I was hoping to get only the A records from a dns zone file so
the expression I used is:

egrep [[:space:]IN[:space:]A[:space:]] zone_file > h0

Had it worked, all that would have appeared in the h0 file was
all the A or Address records.  Instead, I get those plus almost
everything else in the file.  It is obviously not filtering correctly.
Plain grep produces a 0-length output file so that is not too useful
either.  Putting double quotes around the RE didn't help either.

It seems to match almost everything.

Thanks for any good ideas.


try:

egrep IN[^[:alnum:]]+A zone_file

-Glenn



Martin McCormick WB5AGZ  Stillwater, OK
OSU Information Technology Department Network Operations Group
.-- -... . .- --. --..




Re: Regular Expression Trouble

2005-12-09 Thread Parv
in message [EMAIL PROTECTED],
wrote Glenn Dawson thusly...

 At 02:12 PM 12/9/2005, Martin McCormick wrote:
 
 I was hoping to get only the A records from a dns zone file so
 the expression I used is:
 
 egrep [[:space:]IN[:space:]A[:space:]] zone_file > h0
 
 Had it worked, all that would have appeared in the h0 file was
 all the A or Address records.  Instead, I get those plus almost
 everything else in the file.
 try:
 
 egrep IN[^[:alnum:]]+A zone_file

For even finer results, use word boundaries ...

  egrep '\bIN[^[:alnum:]]+A\b' file
  egrep '\<IN[[:space:]]+A\>'  file
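
A quick way to see what the word boundary buys you is a zone file that also holds AAAA records. The sketch below uses made-up records; \< and \> are extensions supported by GNU grep and, as far as I know, FreeBSD's grep, rather than part of POSIX:

```shell
# Made-up zone records: one A, one AAAA, one MX.
zone() {
    printf '%s\n' \
        'www   IN  A     192.0.2.1' \
        'www   IN  AAAA  2001:db8::1' \
        'mail  IN  MX    10 mail.example.org.'
}

# No boundary after the A: the AAAA record matches as well.
zone | grep -E 'IN[[:space:]]+A'

# With word boundaries only the plain A record is left.
zone | grep -E '\<IN[[:space:]]+A\>'
```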


  - Parv

-- 



Re: Regular Expression Trouble

2005-12-09 Thread Tim Hammerquist
Martin McCormick wrote:
 After reading  a bit about extended regular expressions and
 having a few actually work correctly in sed scripts, I tried one in
 egrep and it isn't working although there are no errors.
 
 I was hoping to get only the A records from a dns zone file so
 the expression I used is:
 
 egrep [[:space:]IN[:space:]A[:space:]] zone_file > h0

If you're using vi, put your cursor on that very first '[' and
bounce on the % key for a while; see if anything occurs to you.

If not, you could probably use a refresher course on that little
sub-syntax of regular expressions called character classes.

 It seems to match almost everything.

The regex [[:space:]IN[:space:]A[:space:]] is composed
entirely of one large character class.  Classes are logical sets
and have no interest in the order or quantity of their contents,
so it's reduced to [[:space:]AIN].  When fed to egrep, it
says, match any line which contains any of these 4 entities.
Most lines will contain a space character, if not the 3 letters,
so your output makes sense.

As [:space:] only has meaning inside brackets itself, it's
important to enclose each individual occurrence inside its
own additional set of brackets.

egrep '[[:space:]]IN[[:space:]]A[[:space:]]' zone_file > h0
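
The difference is easy to check on a few made-up records. In this sketch the single bracket expression matches every line, while the corrected pattern keeps only the A record. The patterns are quoted so the shell cannot treat the brackets as a filename glob:

```shell
# Made-up zone records for illustration.
zone() {
    printf '%s\n' \
        'www   IN A 192.0.2.1' \
        'mail  IN MX 10 mail.example.org.' \
        '$TTL 3600'
}

# One big character class: any line containing whitespace, A, I or N
# matches, so -c counts all of them.
zone | grep -E -c '[[:space:]IN[:space:]A[:space:]]'

# Three separate classes with literals between them: only " IN A " matches.
zone | grep -E '[[:space:]]IN[[:space:]]A[[:space:]]'
```

Note that the corrected pattern expects exactly one whitespace character on each side of IN and A; a + after each class would allow runs of spaces or tabs.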

HTH,
Tim Hammerquist