Re: Doing both regex match and assignment within a If loop?

2013-03-29 Thread Steven D'Aprano
On Thu, 28 Mar 2013 21:00:44 -0700, Victor Hooi wrote:


> Is it possible to somehow test for a match, as well as do assignment of
> the re match object to a variable?


mo = expression.match(line)
if mo:
    ...


Many problems become trivial when we stop trying to fit everything into a 
single line :-)


> if expression1.match(line) = results:
>     results.groupsdict()...
>
> Obviously the above won't work - however, is there a Pythonic way to
> tackle this?

Yes. Stop trying to fit everything into a single line :-)

I would approach the problem like this:


LOOKUP_TABLE = {expression1: do_something,
                expression2: do_something_else,
                expression3: function3,
                expression4: function4,  # etc.
                }

with open('log.txt') as f:
    for line in f:
        for expr, func in LOOKUP_TABLE.items():
            mo = expr.match(line)
            if mo:
                func(line, mo)
                break
        else:
            # If we get here, we never reached the break.
            raise SomeException


If you don't like having that many top level functions, you could make 
them methods of a class.
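
As a rough, self-contained sketch of the table-driven approach above: the two patterns, the handler names handle_error and handle_login, and the dispatch helper are all hypothetical, chosen only to make the example runnable, and LookupError stands in for SomeException.

```python
import re

# Hypothetical handlers: each receives the line and its match object.
def handle_error(line, mo):
    return ('error', mo.group('message'))

def handle_login(line, mo):
    return ('login', mo.group('user'))

# Hypothetical patterns, for illustration only.
LOOKUP_TABLE = {
    re.compile(r'ERROR (?P<message>.*)'): handle_error,
    re.compile(r'LOGIN (?P<user>\w+)'): handle_login,
}

def dispatch(line):
    # Each expression is matched once; the match object is kept in mo
    # and handed to the handler, so nothing is matched twice.
    for expr, func in LOOKUP_TABLE.items():
        mo = expr.match(line)
        if mo:
            return func(line, mo)
    raise LookupError('no expression matched %r' % line)

print(dispatch('ERROR disk full'))
```

Dictionaries preserve insertion order in Python 3.7 and later, so the expressions are tried in the order listed. On older versions, a list of (expression, function) pairs gives a defined order.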


If you only have two or three expressions to test, and the body of each 
if clause is small, it's probably too much effort to write functions for 
each one. In that case, I'd stick to the slightly more verbose form:

with open('log.txt') as f:
    for line in f:
        mo = expression1.match(line)
        if mo:
            do_this()
            do_that()
            continue
        mo = expression2.match(line)
        if mo:
            do_something_else()
            continue
        mo = expression3.match(line)
        if mo:
            fe()
            fi()
            fo()
            fum()
            continue
        # None of the expressions matched this line.
        raise SomeException





> What I'm trying to avoid is this:
>
> if expression1.match(line):
>     results = expression1.match(line)
>
> which I assume would call the regex match against the line twice

Correct.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Doing both regex match and assignment within a If loop?

2013-03-29 Thread Peter Otten
Victor Hooi wrote:

> Hi,
>
> I have a logline that I need to test against multiple regexes. E.g.:
>
> import re
>
> expression1 = re.compile(r'')
> expression2 = re.compile(r'')
>
> with open('log.txt') as f:
>     for line in f:
>         if expression1.match(line):
>             # Do something - extract fields from line.
>         elif expression2.match(line):
>             # Do something else - extract fields from line.
>         else:
>             # Oh noes! Raise exception.
>
> However, in the Do something section - I need access to the match object
> itself, so that I can strip out certain fields from the line.
>
> Is it possible to somehow test for a match, as well as do assignment of
> the re match object to a variable?
>
> if expression1.match(line) = results:
>     results.groupsdict()...
>
> Obviously the above won't work - however, is there a Pythonic way to
> tackle this?
>
> What I'm trying to avoid is this:
>
> if expression1.match(line):
>     results = expression1.match(line)
>
> which I assume would call the regex match against the line twice - and
> when I'm dealing with a huge amount of log lines, slow things down.

(1)
for line in f:
    match = expression1.match(line)
    if match:
        # ...
        continue
    match = expression2.match(line)
    if match:
        # ...
        continue
    raise NothingMatches

(2)
import re

class Matcher:
    def __call__(self, expr, line):
        result = self.match = expr.match(line)
        return result
    def __getattr__(self, name):
        return getattr(self.match, name)

match = Matcher()

for line in f:
    if match(expression1, line):
        print(match.groupdict())
    elif match(expression2, line):
        print(match.group(1))
    else:
        raise NothingMatches




Re: Doing both regex match and assignment within a If loop?

2013-03-29 Thread Alain Ketterlin
Victor Hooi victorh...@gmail.com writes:

> expression1 = re.compile(r'')
> expression2 = re.compile(r'')
[...]

Just a quick remark: regular expressions are pretty powerful at
representing alternatives. You could just stick everything inside a
single re, as in '...|...'

Then use the returned match to check which alternative was recognized
(make sure you have at least one group in each alternative).

> Is it possible to somehow test for a match, as well as do assignment
> of the re match object to a variable?

Yes, use '...(...)...' and MatchObject.group(). See the other messages.

-- Alain.


Re: Doing both regex match and assignment within a If loop?

2013-03-29 Thread Arnaud Delobelle
On Friday, 29 March 2013, Alain Ketterlin wrote:

> Victor Hooi victorh...@gmail.com writes:
>
>> expression1 = re.compile(r'')
>> expression2 = re.compile(r'')
> [...]
>
> Just a quick remark: regular expressions are pretty powerful at
> representing alternatives. You could just stick everything inside a
> single re, as in '...|...'
>
> Then use the returned match to check which alternative was recognized
> (make sure you have at least one group in each alternative).


Yes, and for extra ease/clarity you can name these alternatives
('(?P<name>pattern)'). Then you can do

if m.group('case1'):
    ...
elif m.group('case2'):
    ...
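
A minimal sketch of the single-pattern idea with named alternatives, assuming two hypothetical log formats. LINE_RE, classify, and all group names are invented for illustration. It tests `is not None` rather than truthiness, because a group that matched the empty string is falsy.

```python
import re

# Hypothetical log formats combined into one pattern; each alternative
# is wrapped in its own named group so we can tell which one matched.
LINE_RE = re.compile(
    r'(?P<error>ERROR (?P<message>.*))'
    r'|(?P<login>LOGIN (?P<user>\w+))'
)

def classify(line):
    m = LINE_RE.match(line)  # the line is matched only once
    if m is None:
        raise ValueError('no alternative matched %r' % line)
    if m.group('error') is not None:
        return ('error', m.group('message'))
    elif m.group('login') is not None:
        return ('login', m.group('user'))

print(classify('LOGIN alice'))
```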

-- 
Arnaud


Re: Doing both regex match and assignment within a If loop?

2013-03-29 Thread Neil Cerutti
On 2013-03-29, Alain Ketterlin al...@dpt-info.u-strasbg.fr wrote:
> Victor Hooi victorh...@gmail.com writes:
>
>> expression1 = re.compile(r'')
>> expression2 = re.compile(r'')
> [...]
>
> Just a quick remark: regular expressions are pretty powerful at
> representing alternatives. You could just stick everything
> inside a single re, as in '...|...'
>
> Then use the returned match to check which alternative was
> recognized (make sure you have at least one group in each
> alternative).

Yes, but in a Python program it's more straightforward to program
in Python. ;)

But this is from a grade A regex avoider, so take it with a small
chunk of sodium.

>> Is it possible to somehow test for a match, as well as do assignment
>> of the re match object to a variable?

One way to attack this problem that's not yet been explicitly
mentioned is to match using a generator function:

def match_each(s, re_seq):
    for r in re_seq:
        yield r.match(s)

And later something like:

for match in match_each(s, (expression1, expression2, expression3)):
    if match:
        print(match.groups())  # etc...
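
Note that the loop above visits every expression, including those after a successful match. Because the generator is lazy, stopping at the first hit takes only a small helper. The sketch below assumes a hypothetical first_match function and two made-up patterns.

```python
import re

def match_each(s, re_seq):
    for r in re_seq:
        yield r.match(s)

def first_match(s, re_seq):
    # next() pulls results lazily, so expressions after the first
    # successful match are never tried. Returns None if nothing matches.
    return next((m for m in match_each(s, re_seq) if m), None)

# Hypothetical patterns, for illustration only.
numbers = re.compile(r'\d+')
words = re.compile(r'[a-z]+')

match = first_match('abc 123', (numbers, words))
if match:
    print(match.group())
```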

-- 
Neil Cerutti


Re: Doing both regex match and assignment within a If loop?

2013-03-29 Thread Mitya Sirenef

On 03/29/2013 04:27 AM, Peter Otten wrote:

> (2)
>
> import re
>
> class Matcher:
>     def __call__(self, expr, line):
>         result = self.match = expr.match(line)
>         return result
>     def __getattr__(self, name):
>         return getattr(self.match, name)


Perhaps it's a little simpler to do this?


        self.match = expr.match(line)
        return self.match


 -m


--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Frisbeetarianism is the belief that when you die, your soul goes up on
the roof and gets stuck.  George Carlin



Doing both regex match and assignment within a If loop?

2013-03-28 Thread Victor Hooi
Hi,

I have a logline that I need to test against multiple regexes. E.g.:

import re

expression1 = re.compile(r'')
expression2 = re.compile(r'')

with open('log.txt') as f:
    for line in f:
        if expression1.match(line):
            # Do something - extract fields from line.
        elif expression2.match(line):
            # Do something else - extract fields from line.
        else:
            # Oh noes! Raise exception.

However, in the Do something section - I need access to the match object 
itself, so that I can strip out certain fields from the line.

Is it possible to somehow test for a match, as well as do assignment of the re 
match object to a variable?

if expression1.match(line) = results:
    results.groupsdict()...

Obviously the above won't work - however, is there a Pythonic way to tackle 
this?

What I'm trying to avoid is this:

if expression1.match(line):
    results = expression1.match(line)

which I assume would call the regex match against the line twice - and when I'm 
dealing with a huge amount of log lines, slow things down.

Cheers,
Victor


Re: Doing both regex match and assignment within a If loop?

2013-03-28 Thread Chris Rebert
On Thu, Mar 28, 2013 at 9:00 PM, Victor Hooi victorh...@gmail.com wrote:
> Hi,
>
> I have a logline that I need to test against multiple regexes. E.g.:
>
> import re
>
> expression1 = re.compile(r'')
> expression2 = re.compile(r'')
>
> with open('log.txt') as f:
>     for line in f:
>         if expression1.match(line):
>             # Do something - extract fields from line.
>         elif expression2.match(line):
>             # Do something else - extract fields from line.
>         else:
>             # Oh noes! Raise exception.
>
> However, in the Do something section - I need access to the match object
> itself, so that I can strip out certain fields from the line.
>
> Is it possible to somehow test for a match, as well as do assignment of the
> re match object to a variable?
>
> if expression1.match(line) = results:
>     results.groupsdict()...

AFAIK, not without hacks and/or being unidiomatic.

> Obviously the above won't work - however, is there a Pythonic way to tackle
> this?
>
> What I'm trying to avoid is this:
>
> if expression1.match(line):
>     results = expression1.match(line)
>
> which I assume would call the regex match against the line twice - and when
> I'm dealing with a huge amount of log lines, slow things down.

def process(line):
    match = expr1.match(line)
    if match:
        # ...extract fields…
        return something
    match = expr2.match(line)
    if match:
        # ...extract fields…
        return something
    # etc…
    raise SomeError()  # Oh noes!

with open('log.txt') as f:
    for line in f:
        results = process(line)


If you choose to further move the extractor snippets into their own
functions, then you can do:


# these could be lambdas if they're simple enough
def case1(match):
    ...  # extract fields from the match
def case2(match):
    ...  # extract fields from the match
# etc...

REGEX_EXTRACTOR_PAIRS = [
    (re.compile(r''), case1),
    (re.compile(r''), case2),
    # etc...
]

def process(line):
    for regex, extractor in REGEX_EXTRACTOR_PAIRS:
        match = regex.match(line)
        if match:
            return extractor(match)
    raise SomeError()

This second option is likely somewhat less performant, but it
definitely saves on repetition.

Cheers,
Chris