Re: Why is regexp not working?

2014-07-05 Thread Denis McMahon
On Fri, 04 Jul 2014 14:27:12 +0200, Florian Lindner wrote:

> self.regexps = [r"it (?P<coupling_iterations>\d+) .* dt complete yes | write-iteration-checkpoint |",
>                 r"it (?P<it_read_ahead>\d+) read ahead"]

My first thought is what is the effect of '|' as the last character in 
the regex?
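
One way to check is a quick experiment. This is only a sketch: the pattern r"abc|" is a made-up stand-in, not the pattern from the original code. It suggests that a trailing '|' leaves an empty final alternative, so the pattern can succeed without consuming anything.

```python
import re

# Hypothetical toy pattern: the trailing '|' adds an empty alternative.
trailing = re.compile(r"abc|")
m = trailing.match("xyz")
# The match succeeds, but it is zero-length.
print(m is not None, repr(m.group(0)), m.span())

# With the pipe escaped, '|' is a literal character and "xyz" no longer matches.
escaped = re.compile(r"abc\|")
print(escaped.match("xyz"))
print(escaped.match("abc|").group(0))
```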

-- 
Denis McMahon, denismfmcma...@gmail.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Why is regexp not working?

2014-07-04 Thread Florian Lindner
Hello,

I have this piece of code:

    def _split_block(self, block):
        cre = [re.compile(r, flags = re.MULTILINE) for r in self.regexps]
        block = "".join(block)
        print(block)
        print("---")
        for regexp in cre:
            match = regexp.match(block)
            for grp in regexp.groupindex:
                data = match.group(grp) if match else None
                self.data[grp].append(data)


block is a list of strings, each terminated by \n. self.regexps is:


    self.regexps = [r"it (?P<coupling_iterations>\d+) .* dt complete yes | write-iteration-checkpoint |",
                    r"it (?P<it_read_ahead>\d+) read ahead"]


If I run my program, the output looks like this:


it 1 ahadf dt complete yes | write-iteration-checkpoint |
Timestep completed

---
it 1 read ahead
it 2 ahgsaf dt complete yes | write-iteration-checkpoint |
Timestep completed

---
it 4 read ahead
it 3 dfdsag dt complete yes | write-iteration-checkpoint |
Timestep completed

---
it 9 read ahead
it 4 dsfdd dt complete yes | write-iteration-checkpoint |
Timestep completed

---
it 16 read ahead
---
{'it_read_ahead': [None, '1', '4', '9', '16'], 'coupling_iterations': ['1', 
None, None, None, None]}

it_read_ahead is always matched when it should be (in all blocks but the
first). But why does the regexp containing coupling_iterations match only
in the first block?

I tried different combinations of re.match vs. re.search, with and
without re.MULTILINE.

Thanks!
Florian



Re: Why is regexp not working?

2014-07-04 Thread MRAB

On 2014-07-04 13:27, Florian Lindner wrote:

> Hello,
>
> I have this piece of code:
>
>     def _split_block(self, block):
>         cre = [re.compile(r, flags = re.MULTILINE) for r in self.regexps]
>         block = "".join(block)
>         print(block)
>         print("---")
>         for regexp in cre:
>             match = regexp.match(block)
>             for grp in regexp.groupindex:
>                 data = match.group(grp) if match else None
>                 self.data[grp].append(data)
>
>
> block is a list of strings, each terminated by \n. self.regexps is:
>
>
>     self.regexps = [r"it (?P<coupling_iterations>\d+) .* dt complete yes | write-iteration-checkpoint |",
>                     r"it (?P<it_read_ahead>\d+) read ahead"]
>
>
> If I run my program, the output looks like this:
>
>
> it 1 ahadf dt complete yes | write-iteration-checkpoint |
> Timestep completed
>
> ---
> it 1 read ahead
> it 2 ahgsaf dt complete yes | write-iteration-checkpoint |
> Timestep completed
>
> ---
> it 4 read ahead
> it 3 dfdsag dt complete yes | write-iteration-checkpoint |
> Timestep completed
>
> ---
> it 9 read ahead
> it 4 dsfdd dt complete yes | write-iteration-checkpoint |
> Timestep completed
>
> ---
> it 16 read ahead
> ---
> {'it_read_ahead': [None, '1', '4', '9', '16'], 'coupling_iterations': ['1',
> None, None, None, None]}
>
> it_read_ahead is always matched when it should be (in all blocks but the
> first). But why does the regexp containing coupling_iterations match only
> in the first block?
>
> I tried different combinations of re.match vs. re.search, with and
> without re.MULTILINE.


The character '|' is a metacharacter that separates alternatives. For
example, the regex 'a|b' will match 'a' or 'b'.

Your first regex ends with '|', which means that its last alternative is
empty, so it will match an empty string at the start of the target string
whenever the other alternatives fail there.
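
The effect can be reproduced with the second block from the output quoted above. This is a minimal sketch: the "fixed" pattern (pipes escaped as \|, a leading ^, and search() instead of match()) is one possible correction and is an assumption, not something confirmed in the thread.

```python
import re

# Second block from the posted output, joined into one string.
block = (
    "it 1 read ahead\n"
    "it 2 ahgsaf dt complete yes | write-iteration-checkpoint |\n"
    "Timestep completed\n"
)

# Pattern as posted: the unescaped '|' characters split it into three
# alternatives, and the last one is empty.
posted = re.compile(
    r"it (?P<coupling_iterations>\d+) .* dt complete yes | write-iteration-checkpoint |",
    flags=re.MULTILINE,
)
m_posted = posted.match(block)
# The match succeeds with an empty string, so the named group is None.
print(repr(m_posted.group(0)), m_posted.group("coupling_iterations"))

# Assumed fix: escape the literal pipes, anchor at a line start with '^'
# (which re.MULTILINE makes match after every newline), and use search()
# so that lines other than the first are considered.
fixed = re.compile(
    r"^it (?P<coupling_iterations>\d+) .* dt complete yes \| write-iteration-checkpoint \|",
    flags=re.MULTILINE,
)
m_fixed = fixed.search(block)
print(m_fixed.group("coupling_iterations"))
```

Note that match() only ever tries position 0 of the string, even with re.MULTILINE, which is why search() is used in the second half.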
