On 05/03/13 01:22, Charles Leviton wrote:

I have some confusion regarding when findall returns a list of strings
and when it returns a list of tuples.
Would appreciate an explanation.

re is not my strongest suit but I'll have a go.

My understanding of how findall works is that it returns a list of matches. With no groups in the pattern, each entry is the whole matched string. With exactly one group, each entry is the string matched by that group (a plain string, not a one-element tuple). With two or more groups, each entry is a tuple containing one string per group.

>>> import re
>>> s1 = '<td>1</td><td>Michael</td><td>Jessica</td>'
>>> re.findall(r'<td>(\w{1,})',s1)
['1', 'Michael', 'Jessica']
>>> strlist = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', s1)
>>> strlist
[('1', 'Michael', 'Jessica')]

In the first example you define a single group, so you get the three separate matches in a list, i.e. 3 separate entries, each a single string. In the second you define 3 groups within your pattern, and re locates only one occurrence of the whole pattern, so it returns a single entry which is a tuple of the 3 group items.
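To see the three cases side by side, here is a small sketch run against the same string. The variable names (no_groups, one_group, two_groups) are just mine for illustration, and this is only my reading of how findall behaves, not an official statement of it.

```python
import re

s1 = '<td>1</td><td>Michael</td><td>Jessica</td>'

# No groups: each entry is the whole text matched by the pattern.
no_groups = re.findall(r'<td>\w+', s1)

# Exactly one group: each entry is that group's text, as a plain string.
one_group = re.findall(r'<td>(\w+)', s1)

# Two or more groups: each entry is a tuple with one string per group.
two_groups = re.findall(r'<td>(\w)(\w)', s1)

print(no_groups)   # ['<td>1', '<td>Michael', '<td>Jessica']
print(one_group)   # ['1', 'Michael', 'Jessica']
print(two_groups)  # [('M', 'i'), ('J', 'e')]
```

So the shape of each entry depends on how many groups the pattern has, while the length of the list depends on how many times the pattern matches.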

Consider now a slightly forced example:

>>> re.findall(r'<?td>(\w)(\w)',s1)
[('M', 'i'), ('J', 'e')]
>>>

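This uses two groups, so each match comes back as a tuple of two elements.
And because the pattern matches twice we get two tuples. (The first cell, '1', is skipped because it has only one word character and the pattern needs two.)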
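If the changing return shape is a nuisance, one option (my suggestion, not something from the original question) is to use re.finditer and ask each match object for what you want. A rough sketch, with names of my own choosing:

```python
import re

s1 = '<td>1</td><td>Michael</td><td>Jessica</td>'

# m.groups() always returns a tuple, even when the pattern has one group.
always_tuples = [m.groups() for m in re.finditer(r'<td>(\w+)', s1)]

# m.group(0) is the whole matched text, however many groups there are.
whole_matches = [m.group(0) for m in re.finditer(r'<td>(\w)(\w)', s1)]

print(always_tuples)  # [('1',), ('Michael',), ('Jessica',)]
print(whole_matches)  # ['<td>Mi', '<td>Je']
```

That way the code decides the shape of the result rather than the number of groups in the pattern.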

I'm sure somebody else will give a more lucid explanation...

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/

_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
