Re: string encoding regex problem

2014-08-23 Thread Peter Otten
Philipp Kraus wrote:
> I have created a short script:
>
> #!/usr/bin/env python
>
> import re, urllib2
>
> def URLReader(url) :
>     f = urllib2.urlopen(url)
>     data = f.read()
>     f.close()
>     return data
>
> print re.match( "\.*\<\/small\>",
>     URLReader("http://sour

Re: string encoding regex problem

2014-08-23 Thread Philipp Kraus
Hi,

On 2014-08-16 09:01:57 +, Peter Otten said:
> Philipp Kraus wrote:
> > The code works till last week correctly, I don't change the pattern.
>
> Websites' contents and structure change sometimes.
>
> > My question is, can it be a problem with string encoding?
>
> Your regex is all-ascii. So an encod

Re: string encoding regex problem

2014-08-16 Thread Peter Otten
Philipp Kraus wrote:
> The code works till last week correctly, I don't change the pattern.

Websites' contents and structure change sometimes.

> My question is, can it be a problem with string encoding?

Your regex is all-ascii. So an encoding problem is very unlikely.

> found = re.search( "
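The point about an all-ASCII regex can be checked directly. What follows is a minimal Python 3 sketch, assuming a made-up HTML fragment: the version string `1.56.0` and the word `Größe` are illustrative and not taken from the SourceForge page. The same bytes pattern finds the same group whether the page is encoded as UTF-8 or Latin-1, because ASCII characters have identical byte values in both.

```python
import re

# Hypothetical page fragment; not the real SourceForge markup.
html = 'Größe: 70 MB <a title="/boost/1.56.0/">download</a>'

# An all-ASCII pattern, compiled as bytes so it can run on undecoded data.
pattern = re.compile(rb'title="/boost/(.*?)/"')

results = {}
for encoding in ("utf-8", "latin-1"):
    raw = html.encode(encoding)      # bytes, as a raw HTTP body would be
    found = pattern.search(raw)
    results[encoding] = found.group(1) if found else None

print(results)
```

If a pattern like this stops matching, a change in the page markup is therefore a more likely cause than the encoding.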

Re: string encoding regex problem

2014-08-15 Thread Steven D'Aprano
Philipp Kraus wrote:
> The code works till last week correctly, I don't change the pattern. My
> question is, can it be
> a problem with string encoding? Did I mask the question mark and quotes
> correctly?

If you didn't change the code, how could the *exact same code* not mask the question mark

Re: string encoding regex problem

2014-08-15 Thread Roy Smith
Philipp Kraus wrote:
> The code works till last week correctly, I don't change the pattern.

OK, so what did you change? Can you go back to last week's code and compare it to what you have now to see what changed?

> My question is, can it be a problem with string encoding? Did I

Re: string encoding regex problem

2014-08-15 Thread Philipp Kraus
On 2014-08-16 00:48:46 +, Roy Smith said:
> Philipp Kraus wrote:
> > found = re.search( "http://sourceforge.net/projects/boost/files/boost/") )
> > if found == None :
> >     raise MyError.StopError("Boost Download URL not found")
> >
> > But found is always None, so I cannot get the correc

Re: string encoding regex problem

2014-08-15 Thread Roy Smith
Philipp Kraus wrote:
> found = re.search( " href=\"/projects/boost/files/latest/download\?source=files\"
> title=\"/boost/(.*)",
> Utilities.URLReader("http://sourceforge.net/projects/boost/files/boost/")
> )
> if found == None :
>     raise MyError.StopError("Boost Download U

string encoding regex problem

2014-08-15 Thread Philipp Kraus
Hello,

I have defined a function with:

def URLReader(url) :
    try :
        f = urllib2.urlopen(url)
        data = f.read()
        f.close()
    except Exception, e :
        raise MyError.StopError(e)
    return data

which gets the HTML source code from a URL. I use this to get a part of a HTML
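The function in the post is Python 2 (`urllib2`, `except Exception, e`). For readers on Python 3, a rough equivalent might look like the sketch below. `StopError` here is a hypothetical stand-in for `MyError.StopError`, whose definition is not shown in the thread, and the narrower `except OSError` is a choice of this sketch rather than something the original does.

```python
import urllib.request


class StopError(Exception):
    """Hypothetical stand-in for MyError.StopError from the post."""


def url_reader(url):
    """Return the raw body (bytes) of `url`, or raise StopError on failure."""
    try:
        # The context manager closes the response even if read() fails.
        with urllib.request.urlopen(url) as f:
            return f.read()
    except OSError as e:  # urllib.error.URLError is a subclass of OSError
        raise StopError(e) from e


# Usage with a data: URL, so the example runs without network access.
body = url_reader("data:text/plain,hello")
print(body)
```

Note that `read()` returns bytes in Python 3, so a regex applied to the result needs either a bytes pattern or an explicit `decode()` first.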

Re: regex (?!..) problem

2009-10-06 Thread Hans Mulder
Stefan Behnel wrote:
> Wolfgang Rohdewald wrote:
> > I want to match a string only if a word (C1 in this example) appears
> > at most once in it.
>
> def match(s):
>     if s.count("C1") > 1:
>         return None
>     return s
>
> If this doesn't fit your requirements, you may want to provide some

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, MRAB wrote:
> "(?!.*?(C1).*?\1)" will succeed only if ".*?(C1).*?\1" has failed,
> in which case the group (group 1) will be undefined (no capture).

I see. I should have moved the (C1) out of this expression anyway:

>>> re.match(r'L(?P<tile>..)(?!.*?(?P=tile).*?(?P=tile))(

Re: regex (?!..) problem

2009-10-05 Thread MRAB
Wolfgang Rohdewald wrote:
> On Monday 05 October 2009, MRAB wrote:
> > You're currently looking for one that's not followed by another;
> > the solution is to check first whether there are two:
> >
> > >>> re.match(r'(?!.*?C1.*?C1)(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
> > Traceback (most recent call last
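The "check first whether there are two" idea can be tried in isolation. The sketch below uses the pattern quoted above under Python 3; the helper name `first_c1` and the second sample string (with only one C1) are assumptions added for illustration. When the lookahead rejects the string, `re.match` returns `None`, which is why calling `.groups()` on the result raises a traceback in the quoted session.

```python
import re

# Pattern from the thread: reject up front if two C1 occur anywhere.
AT_MOST_ONCE = re.compile(r'(?!.*?C1.*?C1)(.*?C1)')


def first_c1(s):
    """Return the text up to the first C1, or None if C1 occurs
    more than once or not at all."""
    m = AT_MOST_ONCE.match(s)
    return m.group(1) if m else None


twice = first_c1('C1b1b1b1 b3b3b3b3 C1C2C3')   # two C1: rejected
once = first_c1('C1b1b1b1 b3b3b3b3 C2C3')      # hypothetical input, one C1
print(twice, once)
```

Because the pattern still requires one C1 after the lookahead, a string with no C1 at all is also rejected; whether that is wanted depends on how "at most once" is read.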

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, MRAB wrote:
> You're currently looking for one that's not followed by another;
> the solution is to check first whether there are two:
>
> >>> re.match(r'(?!.*?C1.*?C1)(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
>
> Traceback (most recent call last):
>   File ""

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, Carl Banks wrote:
> Why do you have to use a regexp at all?

Not one but many, with arbitrary content. Please read my answer to Stefan.

Take a look at the regexes I am using:
http://websvn.kde.org/trunk/playground/games/kmj/src/predefined.py?view=markup

moreover they are

Re: regex (?!..) problem

2009-10-05 Thread MRAB
Wolfgang Rohdewald wrote:
> Hi,
>
> I want to match a string only if a word (C1 in this example) appears
> at most once in it. This is what I tried:
>
> re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
> ('C1b1b1b1 b3b3b3b3 C1', '')
> re.match(r'(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups(

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, Stefan Behnel wrote:
> Wolfgang Rohdewald wrote:
> > I want to match a string only if a word (C1 in this example)
> > appears at most once in it.
>
> def match(s):
>     if s.count("C1") > 1:
>         return None
>     return s
>
> If this doesn't fit y

Re: regex (?!..) problem

2009-10-05 Thread Carl Banks
On Oct 4, 11:17 pm, Wolfgang Rohdewald wrote:
> On Monday 05 October 2009, Carl Banks wrote:
> > What you're not realizing is that if a regexp search comes to a
> > dead end, it won't simply return "no match". Instead it'll throw
> > away part of the match, and backtrack to a previously-match

Re: regex (?!..) problem

2009-10-04 Thread Stefan Behnel
Wolfgang Rohdewald wrote:
> I want to match a string only if a word (C1 in this example) appears
> at most once in it.

def match(s):
    if s.count("C1") > 1:
        return None
    return s

If this doesn't fit your requirements, you may want to provide some more details.

Stefan

Re: regex (?!..) problem

2009-10-04 Thread Wolfgang Rohdewald
On Monday 05 October 2009, Carl Banks wrote:
> What you're not realizing is that if a regexp search comes to a
> dead end, it won't simply return "no match". Instead it'll throw
> away part of the match, and backtrack to a previously-matched
> variable-length subexpression, such as ".*?", and t
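The backtracking Carl Banks describes can be observed with the two patterns from the original post. This is a small Python 3 sketch; the variable names are illustrative, and the printed results are the ones reported in the thread.

```python
import re

s = 'C1b1b1b1 b3b3b3b3 C1C2C3'

# (.*?C1) first stops at the first C1, but the lookahead then fails because
# another C1 follows. The engine backtracks and lets .*? grow until it
# reaches a C1 that has no further C1 after it.
with_lookahead = re.match(r'(.*?C1)((?!.*C1))', s).groups()
print(with_lookahead)      # ('C1b1b1b1 b3b3b3b3 C1', '')

# Without the lookahead nothing forces backtracking, so the lazy .*?
# stays empty and the match ends at the first C1.
without_lookahead = re.match(r'(.*?C1)', s).groups()
print(without_lookahead)   # ('C1',)
```

So the lookahead does not make the match fail when a second C1 exists; it only pushes the match forward to the last C1.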

Re: regex (?!..) problem

2009-10-04 Thread Carl Banks
On Oct 4, 9:34 pm, Wolfgang Rohdewald wrote:
> Hi,
>
> I want to match a string only if a word (C1 in this example) appears
> at most once in it. This is what I tried:
>
> >>> re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
> ('C1b1b1b1 b3b3b3b3 C1', '')
> >>> re.match(r'(.*?C1)'

Re: regex (?!..) problem

2009-10-04 Thread n00m
Why not check it simply by "count()"?

>>> s = '1234C156789'
>>> s.count('C1')
1
>>>

regex (?!..) problem

2009-10-04 Thread Wolfgang Rohdewald
Hi,

I want to match a string only if a word (C1 in this example) appears at most once in it. This is what I tried:

>>> re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
('C1b1b1b1 b3b3b3b3 C1', '')
>>> re.match(r'(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups()
('C1',)

but this sho

Re: regex problem ..

2008-12-17 Thread Steve Holden
Analog Kid wrote:
> Hi guys:
> Thanks for your responses. Points taken. Basically, I am looking for a
> combination of the following ...
> [^\w] and %(?!20) ... How do I do this in a single RE?
>
> Thanks for all your help.
> Regards,
> AK
>
> On Mon, Dec 15, 2008 at 10:54 PM, Steve Holden

Re: regex problem ..

2008-12-17 Thread Analog Kid
Hi guys:
Thanks for your responses. Points taken. Basically, I am looking for a combination of the following ...
[^\w] and %(?!20) ... How do I do this in a single RE?

Thanks for all your help.
Regards,
AK

On Mon, Dec 15, 2008 at 10:54 PM, Steve Holden wrote:
> Analog Kid wrote:
> > Hi All:
> >

Re: regex problem ..

2008-12-15 Thread Steve Holden
Analog Kid wrote:
> Hi All:
> I am new to regular expressions in general, and not just re in python.
> So, apologies if you find my question stupid :) I need some help with
> forming a regex. Here is my scenario ...
> I have strings coming in from a list, each of which I want to check
> against a r

Re: regex problem ..

2008-12-15 Thread Tino Wildenhain
Analog Kid wrote: Hi All: I am new to regular expressions in general, and not just re in python. So, apologies if you find my question stupid :) I need some help with forming a regex. Here is my scenario ... I have strings coming in from a list, each of which I want to check against a regular

regex problem ..

2008-12-15 Thread Analog Kid
Hi All: I am new to regular expressions in general, and not just re in python. So, apologies if you find my question stupid :) I need some help with forming a regex. Here is my scenario ... I have strings coming in from a list, each of which I want to check against a regular expression and see whet

Re: regex problem with re and fnmatch

2007-11-21 Thread Fabian Braennstroem
Hi John,

John Machin wrote on 11/20/2007 09:40 PM:
> On Nov 21, 8:05 am, Fabian Braennstroem wrote:
> > Hi,
> >
> > I would like to use re to search for lines in a file with
> > the word "README_x.org", where x is any number.
> > E.g. the structure would look like this:
> > [[file

Re: regex problem with re and fnmatch

2007-11-20 Thread John Machin
On Nov 21, 8:05 am, Fabian Braennstroem wrote:
> Hi,
>
> I would like to use re to search for lines in a file with
> the word "README_x.org", where x is any number.
> E.g. the structure would look like this:
> [[file:~/pfm_v99/README_1.org]]
>
> I tried to use these kind of mat
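One way to express "README_x.org, where x is any number" is sketched below in Python 3. The pattern is an assumed reading of the request ("README_", one or more digits, ".org") and is not quoted from any reply in the thread; the second and third sample lines are hypothetical.

```python
import re

# Assumed reading of the request: "README_", one or more digits, ".org".
# The dot is escaped so it matches a literal "." only.
README_ORG = re.compile(r'README_\d+\.org')

lines = [
    '[[file:~/pfm_v99/README_1.org]]',   # example line from the post
    '[[file:~/pfm_v99/README.org]]',     # hypothetical: no number
    'plain text without a link',         # hypothetical
]

# search() scans the whole line, so the surrounding [[file:...]] is ignored.
hits = [line for line in lines if README_ORG.search(line)]
print(hits)
```

Using `search` rather than `match` matters here, since `match` only succeeds at the start of the line.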

regex problem with re and fnmatch

2007-11-20 Thread Fabian Braennstroem
root, name)
if fnmatch.fnmatch(str(files), "README*"):
    print "File Found"
    print str(files)
    break

As soon as it finds the file, it should stop the searching process; but there is the

Re: newb: Simple regex problem headache

2007-09-21 Thread Erik Jones
On Sep 21, 2007, at 4:04 PM, crybaby wrote:
> import re
>
> s1 =' 25000 '
> s2 = ' 5.5910 '
>
> mypat = re.compile('[0-9]*(\.[0-9]*|$)')
> rate= mypat.search(s1)
> print rate.group()
>
> rate=mypat.search(s2)
> print rate.group()
> rate = mypat.search(s1)
> price = float(rate.group())
> print pri

Re: newb: Simple regex problem headache

2007-09-21 Thread Ian Clark
crybaby wrote:
> import re
>
> s1 =' 25000 '
> s2 = ' 5.5910 '
>
> mypat = re.compile('[0-9]*(\.[0-9]*|$)')
> rate= mypat.search(s1)
> print rate.group()
>
> rate=mypat.search(s2)
> print rate.group()
> rate = mypat.search(s1)
> price = float(rate.group())
> print price
>
> I get an error when

Re: newb: Simple regex problem headache

2007-09-21 Thread chris . monsanto
On Sep 21, 5:04 pm, crybaby wrote:
> import re
>
> s1 =' 25000 '
> s2 = ' 5.5910 '
>
> mypat = re.compile('[0-9]*(\.[0-9]*|$)')
> rate= mypat.search(s1)
> print rate.group()
>
> rate=mypat.search(s2)
> print rate.group()
> rate = mypat.search(s1)
> price = float(rate.group())
>

newb: Simple regex problem headache

2007-09-21 Thread crybaby
import re

s1 =' 25000 '
s2 = ' 5.5910 '

mypat = re.compile('[0-9]*(\.[0-9]*|$)')
rate= mypat.search(s1)
print rate.group()

rate=mypat.search(s2)
print rate.group()
rate = mypat.search(s1)
price = float(rate.group())
print price

I get an error when it hits the whole number, that is in this forma
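The error text is cut off in this listing, but the behaviour of the posted pattern on the two strings as they appear here can be reproduced. The Python 3 sketch below shows that the pattern finds only an empty string in `s1`, which `float()` cannot convert. The replacement pattern `number` is one possible alternative offered as an assumption, not a fix quoted from the thread.

```python
import re

s1 = ' 25000 '
s2 = ' 5.5910 '

# Pattern from the post: the digits are optional and the group can match
# just the end of the string, so on s1 (digits followed by a space) the
# only match is the empty string at the very end of the input.
original = re.compile(r'[0-9]*(\.[0-9]*|$)')
print(repr(original.search(s1).group()))   # ''
print(repr(original.search(s2).group()))   # '5.5910'

# One possible alternative (an assumption of this sketch): require at
# least one digit and make the fractional part optional.
number = re.compile(r'[0-9]+(?:\.[0-9]+)?')
prices = [float(number.search(s).group()) for s in (s1, s2)]
print(prices)   # [25000.0, 5.591]
```

Calling `float('')` raises `ValueError`, which is consistent with the failure on the whole number described in the post.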

Re: regex problem

2006-11-22 Thread bearophileHUGS
> > line I am trying to match is
> > 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1
> >
> > regex I have written is
> > re.compile
> > (r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)')
> >
> > I am trying to extract 0.0011 value from

Re: regex problem

2006-11-22 Thread km
Hi Tim,

Oof! That's true! Thanks a lot. Is there any tool to simplify building the regex?

regards,
KM

On 11/23/06, Tim Chase wrote:
> line I am trying to match is
> 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1
>
> regex I have written is
>

Re: regex problem

2006-11-22 Thread Tim Chase
> line I am trying to match is
> 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1
>
> regex I have written is
> re.compile
> (r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)')
>
> I am trying to extract 0.0011 value from the above line

regex problem

2006-11-22 Thread km
Hi all,

line I am trying to match is
1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1

regex I have written is
re.compile
(r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)')

I am trying to extract 0.0011 value from the above line. why

Re: regex problem

2005-07-27 Thread Odd-R.
On 2005-07-26, Duncan Booth wrote:
> >>> rx1=re.compile(r"""\b\d{4}(?:-\d{4})?,""")
> >>> rx1.findall("1234,-,4567,")
> ['1234,', '-,', '4567,']

Thanks all for good advice. However this last expression also matches the first four digits when the input is more than

Re: regex problem

2005-07-26 Thread John Machin
Duncan Booth wrote:
> John Machin wrote:
>
> > So here's the mean lean no-flab version -- you don't even need the
> > parentheses (sorry, Thomas).
> >
> > >>> rx1=re.compile(r"""\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,""")
> > >>> rx1.findall("1234,-,4567,")
> > ['1234,', '-,', '4567,']
>

Re: regex problem

2005-07-26 Thread Duncan Booth
John Machin wrote:
> So here's the mean lean no-flab version -- you don't even need the
> parentheses (sorry, Thomas).
>
> >>> rx1=re.compile(r"""\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,""")
> >>> rx1.findall("1234,-,4567,")
> ['1234,', '-,', '4567,']

No flab? What about all that repeti

Re: regex problem

2005-07-26 Thread John Machin
Odd-R. wrote:
> Input is a string of four digit sequences, possibly
> separated by a -, for instance like this
>
> "1234,-,4567,"
>
> My regular expression is like this:
>
> rx1=re.compile(r"""\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z""")
>
> When running rx1.findall("1234,-,4567,

Re: regex problem

2005-07-26 Thread Thomas Guettler
On Tue, 26 Jul 2005 09:57:23 +, Odd-R. wrote:
> Input is a string of four digit sequences, possibly
> separated by a -, for instance like this
>
> "1234,-,4567,"
>
> My regular expression is like this:
>
> rx1=re.compile(r"""\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z""")

Hi, try it

regex problem

2005-07-26 Thread Odd-R.
Input is a string of four digit sequences, possibly separated by a -, for instance like this

"1234,-,4567,"

My regular expression is like this:

rx1=re.compile(r"""\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z""")

When running rx1.findall("1234,-,4567,") I only get the last match as t
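The "only the last match" result follows from how `findall` treats a repeated capturing group. The Python 3 sketch below uses a hypothetical input string, since the sample in the post appears partly garbled in this listing. The second pattern is the compact non-capturing form quoted elsewhere in the thread.

```python
import re

# Hypothetical input; the sample in the post is partly garbled in this archive.
text = "1234,5678-9012,3456,"

# Anchored pattern from the post: one match spans the whole string, and the
# repeated capturing group keeps only its last repetition.
anchored = re.compile(r"\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z")
last_only = anchored.findall(text)
print(last_only)     # ['3456,']

# Unanchored, non-capturing form: each item is a separate match, so
# findall returns all of them.
items = re.compile(r"\b\d{4}(?:-\d{4})?,")
all_items = items.findall(text)
print(all_items)     # ['1234,', '5678-9012,', '3456,']
```

The anchored pattern is still useful for validating the whole string; a common arrangement is to validate with it first and then extract the items with the unanchored one.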