bruce wrote:

> Hi. I have a chunk of text code, which has multiple lines.
>
> I'd like to do a regex, find a pattern, and in the line that matches the
> pattern, mod the line. Sounds simple.
>
> I've created a test regex. However, after spending time/google.. can't
> quite figure out how to then get the "complete" line containing the
> returned regex/pattern.
>
> Pretty sure this is simple, and i'm just missing something.
>
> my test "text" and regex are:
>
>
> s='''
> <td valign="top" colspan="1"><b><a href="#"
> id='CourseId10795788|ACCT2081|002_005_006' style="font-weight:bold;"
> onclick='ShowSeats(this);return false;' alt="Click for Class Availability"
> title="Click for Class Availability">ACCT2081</a></b></td>'''
>
>
> pattern = re.compile(r'Course\S+|\S+\|')
> aa= pattern.search(s).group()
> print "sss"
> print aa
>
> so, once I get the group, I'd like to use the returned match to then get
> the complete line..
>
> pointers/thoughts!! (no laughing!!)

Are you sure you are processing text rather than structured data? HTML
doesn't have the notion of a "line". To extract information from HTML,
tools like Beautiful Soup are better suited than regular expressions:

import re

import bs4

s = ...  # the HTML from your post

# Name the parser explicitly; "html.parser" ships with Python.
soup = bs4.BeautifulSoup(s, "html.parser")

# Find every <a> whose id matches the pattern, then walk up to the <td>.
for a in soup.find_all("a", id=re.compile(r"Course\S+\|\S+\|")):
    print(a["id"])
    print(a.text)
    print(a.parent.parent["colspan"])
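
If you do want to stay with regular expressions and treat the text as
lines, here is a minimal sketch using only the standard library. The
helper names matching_lines() and line_of_match() are made up for this
example, and I am assuming the line breaks in s are exactly as they
appear in your post:

```python
import re

# Sample text copied from the question; line breaks as posted.
s = '''
<td valign="top" colspan="1"><b><a href="#"
id='CourseId10795788|ACCT2081|002_005_006' style="font-weight:bold;"
onclick='ShowSeats(this);return false;' alt="Click for Class Availability"
title="Click for Class Availability">ACCT2081</a></b></td>'''

pattern = re.compile(r"Course\S+\|\S+\|")


def matching_lines(text, regex):
    """Return every line of text in which regex matches somewhere."""
    return [line for line in text.splitlines() if regex.search(line)]


def line_of_match(text, match):
    """Return the complete line containing an existing match object."""
    # Start just after the newline before the match (0 if there is none).
    start = text.rfind("\n", 0, match.start()) + 1
    # End at the first newline after the match, or at the end of the text.
    end = text.find("\n", match.end())
    if end == -1:
        end = len(text)
    return text[start:end]


lines = matching_lines(s, pattern)   # option 1: test line by line
m = pattern.search(s)                # option 2: search once, then expand
full_line = line_of_match(s, m)
print(full_line)
```

Once you have the complete line you can modify it with ordinary string
methods or re.sub(). Keep in mind that this only works as long as the
markup happens to be wrapped the way you expect, which is why I'd still
prefer a parser for HTML.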