Re: Weird problem matching with REs

2011-05-29 Thread Thomas 'PointedEars' Lahn
Andrew Berg wrote: > On 2011.05.29 10:19 AM, Roy Smith wrote: >> Named after the governor of Tarsus IV? > Judging by the graphic at http://kodos.sourceforge.net/help/kodos.html , > it's named after the Simpsons character. I don't think that's a coincidence; both are from other planets and both

Re: Weird problem matching with REs

2011-05-29 Thread Chris Angelico
On Mon, May 30, 2011 at 2:16 AM, Andrew Berg wrote: >> Also, be sure to >> use a raw string when composing REs, so you don't run into backslash >> issues. > How would I do that when grabbing strings from a config file (via the > configparser module)? Or rather, if I have a predefined variable > co
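On the raw-string question quoted above: as far as I know, the r"" prefix only changes how Python parses string literals in source code, so a pattern read from a file already holds its backslashes literally. A minimal sketch, assuming a hypothetical [app] section and "pattern" key (neither comes from the original post), with interpolation switched off so that a '%' in a pattern is not treated specially:

```python
import configparser
import re

# Hypothetical config contents; in real use this would live in a file.
CONFIG_TEXT = r"""
[app]
pattern = x264\.(\d+)\.exe
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

# The value read from the config keeps its backslashes as-is.
pattern_from_file = parser["app"]["pattern"]

# The same pattern written as a raw string literal in source code.
pattern_literal = r"x264\.(\d+)\.exe"
same = pattern_from_file == pattern_literal

# The config value can be handed straight to the re module.
match = re.search(pattern_from_file, "x264.1995.exe")
version = match.group(1)
```

So a raw string is only needed when the pattern is typed into the .py file itself.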

Re: Weird problem matching with REs

2011-05-29 Thread John S
On May 29, 12:16 pm, Andrew Berg wrote: > > I've been meaning to learn how to use parenthesis groups. > > Also, be sure to > > use a raw string when composing REs, so you don't run into backslash > > issues. > > How would I do that when grabbing strings from a config file (via the > configparser m

Re: Weird problem matching with REs

2011-05-29 Thread Andrew Berg
On 2011.05.29 10:48 AM, John S wrote: > Dots don't match end-of-line-for-your-current-OS is how I think of > it. IMO, the docs should say the dot matches any character except a line feed ('\n'), since that is more accurate. > True, malformed > HTML can throw you off, but they can also throw a parse

Re: Weird problem matching with REs

2011-05-29 Thread John S
On May 29, 10:35 am, Andrew Berg wrote: > On 2011.05.29 09:18 AM, Steven D'Aprano wrote:> >> What makes you think it > shouldn't match? > > > > AFAIK, dots aren't supposed to match carriage returns or any other > > > whitespace characters. > > I got things mixed up there (was thinking whitespace

Re: Weird problem matching with REs

2011-05-29 Thread Andrew Berg
On 2011.05.29 10:19 AM, Roy Smith wrote: > Named after the governor of Tarsus IV? Judging by the graphic at http://kodos.sourceforge.net/help/kodos.html , it's named after the Simpsons character. -- http://mail.python.org/mailman/listinfo/python-list

Re: Weird problem matching with REs

2011-05-29 Thread Roy Smith
Andrew Berg wrote: > Kodos is written in Python and uses Python's regex engine. In fact, it > is specifically intended to debug Python regexes. Named after the governor of Tarsus IV?

Re: Weird problem matching with REs

2011-05-29 Thread Andrew Berg
On 2011.05.29 09:18 AM, Steven D'Aprano wrote: > >> What makes you think it shouldn't match? > > > > AFAIK, dots aren't supposed to match carriage returns or any other > > whitespace characters. > > They won't match *newlines* \n unless you pass the DOTALL flag, but they > do match whitespace: >
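The behaviour described in the quoted reply can be checked directly. A small sketch using only the standard re module; the test strings are made up for illustration:

```python
import re

# By default '.' matches spaces, tabs and carriage returns ...
matches_space = re.match(r"a.b", "a b") is not None
matches_tab = re.match(r"a.b", "a\tb") is not None
matches_cr = re.match(r"a.b", "a\rb") is not None

# ... but it does not match a line feed,
matches_newline = re.match(r"a.b", "a\nb") is not None

# unless the DOTALL flag is passed.
matches_newline_dotall = re.match(r"a.b", "a\nb", re.DOTALL) is not None
```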

Re: Weird problem matching with REs

2011-05-29 Thread Steven D'Aprano
On Sun, 29 May 2011 08:41:16 -0500, Andrew Berg wrote: > On 2011.05.29 08:09 AM, Steven D'Aprano wrote: [...] > Kodos is written in Python and uses Python's regex engine. In fact, it > is specifically intended to debug Python regexes. Fair enough. >> Secondly, you probably should use a proper HT

Re: Weird problem matching with REs

2011-05-29 Thread Andrew Berg
On 2011.05.29 08:09 AM, Steven D'Aprano wrote: > On Sun, 29 May 2011 06:45:30 -0500, Andrew Berg wrote: > > > I have an RE that should work (it even works in Kodos [1], but not in my > > code), but it keeps failing to match characters after a newline. > > Not all regexes are the same. Different reg

Re: Weird problem matching with REs

2011-05-29 Thread Andrew Berg
On 2011.05.29 08:00 AM, Ben Finney wrote: > You are aware that most text-emitting processes on Windows, and Internet > text protocols like the HTTP standard, use the two-character “CR LF” > sequence (U+000C U+000A) for terminating lines? Yes, but I was not having trouble with just '\n' before, and

Re: Weird problem matching with REs

2011-05-29 Thread Steven D'Aprano
On Sun, 29 May 2011 06:45:30 -0500, Andrew Berg wrote: > I have an RE that should work (it even works in Kodos [1], but not in my > code), but it keeps failing to match characters after a newline. Not all regexes are the same. Different regex engines accept different symbols, and sometimes behav

Re: Weird problem matching with REs

2011-05-29 Thread Ben Finney
Ben Finney writes: > the two-character “CR LF” sequence (U+000C U+000A) > http://en.wikipedia.org/wiki/Newline As detailed in that Wikipedia article, the characters are of course U+000D U+000A.
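The code points are easy to confirm from Python, and the same check hints at why a dot in front of "\n" can make a pattern match on CR LF text: the dot may simply be consuming the carriage return. A rough sketch, assuming a made-up page fragment (the sample string is not from the original post):

```python
import re

# Code points for carriage return, line feed and form feed.
code_points = {
    name: "U+%04X" % ord(ch)
    for name, ch in (("CR", "\r"), ("LF", "\n"), ("FF", "\f"))
}

# Hypothetical fragment of a page served with CR LF line endings.
sample = "revision\r\n1995\r\n"

# Fails: there is a '\r' between "revision" and '\n'.
strict = re.search(r"revision\n[0-9]{4}", sample)

# Succeeds: '.' matches the '\r'.
with_dot = re.search(r"revision.\n[0-9]{4}", sample)

# Saying what is meant: an optional CR followed by LF.
explicit = re.search(r"revision\r?\n[0-9]{4}", sample)
```

Writing `\r?\n` keeps the pattern working for both LF-only and CR LF input without relying on the dot.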

Re: Weird problem matching with REs

2011-05-29 Thread Ben Finney
Andrew Berg writes: > I was able to make a regex that matches in my code, but it shouldn't: > http://x264.nl/x264/64bit/8bit_depth/revision.\n{1,3}[0-9]{4}.\n{1,3}/x264.\n{1,3}.\n{1,3}.exe > I have to add a dot before each "\n". There is no character not > accounted for before those newlines, but

Weird problem matching with REs

2011-05-29 Thread Andrew Berg
I have an RE that should work (it even works in Kodos [1], but not in my code), but it keeps failing to match characters after a newline. I'm writing a little program that scans the webpage of an arbitrary application and gets the newest version advertised on the page. test3.py: > # -*- coding: