Re: string encoding regex problem

2014-08-23 Thread Philipp Kraus
Hi, On 2014-08-16 09:01:57 +, Peter Otten said: Philipp Kraus wrote: The code works till last week correctly, I don't change the pattern. Websites' contents and structure change sometimes. My question is, can it be a problem with string encoding? Your regex is all-ascii. So an

Re: string encoding regex problem

2014-08-23 Thread Peter Otten
Philipp Kraus wrote: I have create a short script: - #!/usr/bin/env python import re, urllib2 def URLReader(url) : f = urllib2.urlopen(url) data = f.read() f.close() return data print re.match( \small\ \.*\\/small\,

Re: string encoding regex problem

2014-08-16 Thread Peter Otten
Philipp Kraus wrote: The code works till last week correctly, I don't change the pattern. Websites' contents and structure change sometimes. My question is, can it be a problem with string encoding? Your regex is all-ascii. So an encoding problem is very unlikely. found = re.search( a
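Peter's point can be checked with a small sketch. It assumes Python 3 rather than the Python 2 of the thread, and the sample markup and the version number 1.56.0 are hypothetical stand-ins, since the real page is not shown here. An all-ASCII pattern finds the same text whether the page bytes are UTF-8 or Latin-1, so a failed match more likely points at changed markup than at encoding.

```python
import re

# Hypothetical stand-in for the downloaded page; the real markup is not in the thread.
SAMPLE_HTML = (
    '<p>Gr\u00f6\u00dfe: 80 MB</p>'
    '<a href="/projects/boost/files/latest/download?source=files" '
    'title="/boost/1.56.0/boost_1_56_0.zip">'
)

# All-ASCII bytes pattern, modelled on the one quoted in the thread.
ASCII_PATTERN = (
    rb'<a href="/projects/boost/files/latest/download\?source=files" '
    rb'title="/boost/(.*?)/'
)


def find_version(raw_bytes):
    """Return the captured version string, or None if the pattern does not match."""
    found = re.search(ASCII_PATTERN, raw_bytes)
    return None if found is None else found.group(1).decode("ascii")


# The same ASCII pattern matches under both encodings of the page.
results = {codec: find_version(SAMPLE_HTML.encode(codec))
           for codec in ("utf-8", "latin-1")}
print(results)

# If the site changes its markup, the unchanged pattern simply stops matching.
changed = find_version(b'<a class="download" href="/elsewhere">')
print(changed)
```

Testing with `is None` rather than `== None` is the usual idiom for the no-match case.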

string encoding regex problem

2014-08-15 Thread Philipp Kraus
Hello, I have defined a function with: def URLReader(url) : try : f = urllib2.urlopen(url) data = f.read() f.close() except Exception, e : raise MyError.StopError(e) return data which get the HTML source code from an URL. I use this to get a part of a HTML

Re: string encoding regex problem

2014-08-15 Thread Roy Smith
In article lsm8ic$j90$1...@online.de, Philipp Kraus philipp.kr...@flashpixx.de wrote: found = re.search( "<a href=\"/projects/boost/files/latest/download\?source=files\" title=\"/boost/(.*)", Utilities.URLReader("http://sourceforge.net/projects/boost/files/boost/") ) if found == None :

Re: string encoding regex problem

2014-08-15 Thread Philipp Kraus
On 2014-08-16 00:48:46 +, Roy Smith said: In article lsm8ic$j90$1...@online.de, Philipp Kraus philipp.kr...@flashpixx.de wrote: found = re.search( "<a href=\"/projects/boost/files/latest/download\?source=files\" title=\"/boost/(.*)",

Re: string encoding regex problem

2014-08-15 Thread Roy Smith
In article lsmeej$49n$1...@online.de, Philipp Kraus philipp.kr...@flashpixx.de wrote: The code works till last week correctly, I don't change the pattern. OK, so what did you change? Can you go back to last week's code and compare it to what you have now to see what changed? My question

Re: string encoding regex problem

2014-08-15 Thread Steven D'Aprano
Philipp Kraus wrote: The code works till last week correctly, I don't change the pattern. My question is, can it be a problem with string encoding? Did I mask the question mark and quotes correctly? If you didn't change the code, how could the *exact same code* not mask the question mark

Re: regex (?!..) problem

2009-10-06 Thread Hans Mulder
Stefan Behnel wrote: Wolfgang Rohdewald wrote: I want to match a string only if a word (C1 in this example) appears at most once in it. def match(s): if s.count("C1") > 1: return None return s If this doesn't fit your requirements, you may want to provide some

Re: regex (?!..) problem

2009-10-05 Thread Carl Banks
On Oct 4, 9:34 pm, Wolfgang Rohdewald wolfg...@rohdewald.de wrote: Hi, I want to match a string only if a word (C1 in this example) appears at most once in it. This is what I tried: re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups() ('C1b1b1b1 b3b3b3b3 C1', '')

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, Carl Banks wrote: What you're not realizing is that if a regexp search comes to a dead end, it won't simply return no match. Instead it'll throw away part of the match, and backtrack to a previously-matched variable-length subexpression, such as .*?, and try
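The backtracking Carl describes can be reproduced with the pattern and input from the original post. The sketch below assumes Python 3; the alternative pattern at the end is an illustration, not something proposed in the thread.

```python
import re

text = "C1b1b1b1 b3b3b3b3 C1C2C3"

# The lazy .*? first stops at the first C1. The lookahead (?!.*C1) fails there
# because another C1 follows, so the engine backtracks, lets .*? grow up to the
# second C1, and the lookahead then succeeds.
m = re.match(r"(.*?C1)((?!.*C1))", text)
print(m.groups())

# Illustrative alternative: put the lookahead at the start, where re.match
# leaves nothing to backtrack into. It rejects any string with two C1s.
at_most_once = re.compile(r"(?!.*C1.*C1)")
print(at_most_once.match(text))
print(at_most_once.match("C1b1b1b1 b3b3b3b3 C2C3") is not None)
```

Note that `.` does not match newlines by default, so the alternative only covers single-line input unless `re.DOTALL` is added.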

Re: regex (?!..) problem

2009-10-05 Thread Stefan Behnel
Wolfgang Rohdewald wrote: I want to match a string only if a word (C1 in this example) appears at most once in it. def match(s): if s.count("C1") > 1: return None return s If this doesn't fit your requirements, you may want to provide some more details. Stefan --

Re: regex (?!..) problem

2009-10-05 Thread Carl Banks
On Oct 4, 11:17 pm, Wolfgang Rohdewald wolfg...@rohdewald.de wrote: On Monday 05 October 2009, Carl Banks wrote: What you're not realizing is that if a regexp search comes to a dead end, it won't simply return no match. Instead it'll throw away part of the match, and backtrack to a

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, Stefan Behnel wrote: Wolfgang Rohdewald wrote: I want to match a string only if a word (C1 in this example) appears at most once in it. def match(s): if s.count("C1") > 1: return None return s If this doesn't fit your

Re: regex (?!..) problem

2009-10-05 Thread Wolfgang Rohdewald
On Monday 05 October 2009, MRAB wrote: (?!.*?(C1).*?\1) will succeed only if .*?(C1).*?\1 has failed, in which case the group (group 1) will be undefined (no capture). I see. I should have moved the (C1) out of this expression anyway: re.match(r'L(?P<tile>..)(?!.*?(?P=tile).*?(?P=tile))(.*?

regex (?!..) problem

2009-10-04 Thread Wolfgang Rohdewald
Hi, I want to match a string only if a word (C1 in this example) appears at most once in it. This is what I tried: re.match(r'(.*?C1)((?!.*C1))','C1b1b1b1 b3b3b3b3 C1C2C3').groups() ('C1b1b1b1 b3b3b3b3 C1', '') re.match(r'(.*?C1)','C1b1b1b1 b3b3b3b3 C1C2C3').groups() ('C1',) but this should

Re: regex (?!..) problem

2009-10-04 Thread n00m
Why not check it simply by count()? s = '1234C156789' s.count('C1') 1
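The count() suggestion made by n00m and Stefan Behnel can be sketched as a small helper. The function name `match_at_most_once` and its `word` parameter are made up for illustration; Python 3 is assumed.

```python
def match_at_most_once(s, word="C1"):
    """Return s if word occurs at most once in it, otherwise None."""
    # str.count counts non-overlapping occurrences of the substring.
    if s.count(word) > 1:
        return None
    return s


print(match_at_most_once("1234C156789"))
print(match_at_most_once("C1b1b1b1 b3b3b3b3 C1C2C3"))
```

This avoids the regex engine entirely, which sidesteps the backtracking behaviour discussed earlier in the thread.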

Re: regex problem ..

2008-12-17 Thread Steve Holden
Analog Kid wrote: Hi guys: Thanks for your responses. Points taken. Basically, I am looking for a combination of the following ... [^\w] and %(?!20) ... How do I do this in a single RE? Thanks for all you help. Regards, AK On Mon, Dec 15, 2008 at 10:54 PM, Steve Holden

Re: regex problem ..

2008-12-17 Thread Analog Kid
Hi guys: Thanks for your responses. Points taken. Basically, I am looking for a combination of the following ... [^\w] and %(?!20) ... How do I do this in a single RE? Thanks for all you help. Regards, AK On Mon, Dec 15, 2008 at 10:54 PM, Steve Holden st...@holdenweb.com wrote: Analog Kid
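One possible reading of the question is "flag any non-word character, but let the escape %20 through". Under that assumption (the thread does not spell out the exact requirement), the two pieces can be joined with alternation once % is excluded from the character class. The name `SPECIAL` is hypothetical, and Python 3 is assumed.

```python
import re

# [^\w%]    any non-word character other than '%'
# %(?!20)   a '%' that is not the start of '%20'
SPECIAL = re.compile(r"[^\w%]|%(?!20)")

for candidate in ("hello%20world", "hello world", "100%", "a%2b"):
    found = SPECIAL.search(candidate)
    print(candidate, "->", found.group() if found else None)
```

Without removing % from the class, `[^\w]` would match the % of %20 on its own and the lookahead branch would never matter.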

Re: regex problem ..

2008-12-15 Thread Tino Wildenhain
Analog Kid wrote: Hi All: I am new to regular expressions in general, and not just re in python. So, apologies if you find my question stupid :) I need some help with forming a regex. Here is my scenario ... I have strings coming in from a list, each of which I want to check against a regular

regex problem ..

2008-12-15 Thread Analog Kid
Hi All: I am new to regular expressions in general, and not just re in python. So, apologies if you find my question stupid :) I need some help with forming a regex. Here is my scenario ... I have strings coming in from a list, each of which I want to check against a regular expression and see

Re: regex problem ..

2008-12-15 Thread Steve Holden
Analog Kid wrote: Hi All: I am new to regular expressions in general, and not just re in python. So, apologies if you find my question stupid :) I need some help with forming a regex. Here is my scenario ... I have strings coming in from a list, each of which I want to check against a

Re: regex problem with re and fnmatch

2007-11-21 Thread Fabian Braennstroem
Hi John, John Machin schrieb am 11/20/2007 09:40 PM: On Nov 21, 8:05 am, Fabian Braennstroem [EMAIL PROTECTED] wrote: Hi, I would like to use re to search for lines in a files with the word README_x.org, where x is any number. E.g. the structure would look like this:

regex problem with re and fnmatch

2007-11-20 Thread Fabian Braennstroem
*): print File Found print str(files) break As soon as it finds the file, it should stop the searching process; but there is the same matching problem like above. Does anyone have any suggestions about the regex problem? Greetings

Re: regex problem with re and fnmatch

2007-11-20 Thread John Machin
On Nov 21, 8:05 am, Fabian Braennstroem [EMAIL PROTECTED] wrote: Hi, I would like to use re to search for lines in a files with the word README_x.org, where x is any number. E.g. the structure would look like this: [[file:~/pfm_v99/README_1.org]] I tried to use these kind of matchings: #
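A minimal sketch of the search Fabian describes, assuming Python 3 and that "any number" means one or more decimal digits. The sample lines other than the README_1.org link are invented for illustration, and the pattern actually proposed in the thread is cut off above.

```python
import re

# \d+ requires at least one digit; the dot is escaped so it matches literally.
README_LINK = re.compile(r"README_\d+\.org")

lines = [
    "[[file:~/pfm_v99/README_1.org]]",   # structure quoted in the thread
    "[[file:~/pfm_v99/README_x.org]]",   # hypothetical: no number, should not match
    "[[file:~/pfm_v99/README_42.org]]",  # hypothetical: multi-digit number
    "some unrelated text",
]

hits = [line for line in lines if README_LINK.search(line)]
print(hits)
```

`re.search` is used rather than `re.match` because the file name sits in the middle of the line, not at its start.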

newb: Simple regex problem headache

2007-09-21 Thread crybaby
import re s1 ='&nbsp;25000&nbsp;' s2 = '&nbsp;5.5910&nbsp;' mypat = re.compile('[0-9]*(\.[0-9]*|$)') rate= mypat.search(s1) print rate.group() rate=mypat.search(s2) print rate.group() rate = mypat.search(s1) price = float(rate.group()) print price I get an error when it hits the whole number, that
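A sketch of what appears to go wrong, assuming Python 3 and that the sample strings carried HTML `&nbsp;` entities around the number. Because `[0-9]*` may match nothing, the original pattern can only succeed on the whole number by matching the empty string at the very end, and `float('')` raises ValueError. The replacement pattern below is one possible fix, not necessarily the one given in the replies.

```python
import re

s1 = '&nbsp;25000&nbsp;'
s2 = '&nbsp;5.5910&nbsp;'

# Original pattern: every part is optional, so on s1 it matches '' at the end.
original = re.compile(r'[0-9]*(\.[0-9]*|$)')
print(repr(original.search(s1).group()))

# Possible fix: require at least one digit, with an optional fractional part.
number = re.compile(r'[0-9]+(?:\.[0-9]+)?')

prices = [float(number.search(s).group()) for s in (s1, s2)]
print(prices)
```

If a string might contain no digits at all, `number.search(s)` returns None and should be checked before calling `.group()`.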

Re: newb: Simple regex problem headache

2007-09-21 Thread chris . monsanto
On Sep 21, 5:04 pm, crybaby [EMAIL PROTECTED] wrote: import re s1 ='&nbsp;25000&nbsp;' s2 = '&nbsp;5.5910&nbsp;' mypat = re.compile('[0-9]*(\.[0-9]*|$)') rate= mypat.search(s1) print rate.group() rate=mypat.search(s2) print rate.group() rate = mypat.search(s1) price = float(rate.group())

Re: newb: Simple regex problem headache

2007-09-21 Thread Ian Clark
crybaby wrote: import re s1 ='&nbsp;25000&nbsp;' s2 = '&nbsp;5.5910&nbsp;' mypat = re.compile('[0-9]*(\.[0-9]*|$)') rate= mypat.search(s1) print rate.group() rate=mypat.search(s2) print rate.group() rate = mypat.search(s1) price = float(rate.group()) print price I get an error when

Re: newb: Simple regex problem headache

2007-09-21 Thread Erik Jones
On Sep 21, 2007, at 4:04 PM, crybaby wrote: import re s1 ='&nbsp;25000&nbsp;' s2 = '&nbsp;5.5910&nbsp;' mypat = re.compile('[0-9]*(\.[0-9]*|$)') rate= mypat.search(s1) print rate.group() rate=mypat.search(s2) print rate.group() rate = mypat.search(s1) price = float(rate.group()) print

regex problem

2006-11-22 Thread km
Hi all, line is am trying to match is 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1 regex i have written is re.compile (r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)') I am trying to extract 0.0011 value from the above line. why

Re: regex problem

2006-11-22 Thread Tim Chase
line is am trying to match is 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1 regex i have written is re.compile (r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)') I am trying to extract 0.0011 value from the above line. why

Re: regex problem

2006-11-22 Thread km
HI Tim, oof! thats true! thanks a lot. Is there any tool to simplify building the regex ? regards, KM On 11/23/06, Tim Chase [EMAIL PROTECTED] wrote: line is am trying to match is 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1 regex i have written is

Re: regex problem

2006-11-22 Thread bearophileHUGS
line is am trying to match is 1959400|Q2BYK3|Q2BYK3_9GAMM Hypothetical outer membra29.90.00011 1 regex i have written is re.compile (r'(\d+?)\|((P|O|Q)\w{5})\|\w{3,6}\_\w{3,5}\s+?.{25}\s{3}(\d+?\.\d)\s+?(\d\.\d+?)') I am trying to extract 0.0011 value from the above line.

Re: regex problem

2005-07-27 Thread Odd-R.
On 2005-07-26, Duncan Booth [EMAIL PROTECTED] wrote: rx1=re.compile(r\b\d{4}(?:-\d{4})?,) rx1.findall(1234,-,4567,) ['1234,', '-,', '4567,'] Thanks all for good advice. However this last expression also matches the first four digits when the input is more than four digits. To

regex problem

2005-07-26 Thread Odd-R.
Input is a string of four digit sequences, possibly separated by a -, for instance like this 1234,-,4567, My regular expression is like this: rx1=re.compile(r\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z) When running rx1.findall(1234,-,4567,) I only get the last match as the

Re: regex problem

2005-07-26 Thread Thomas Guettler
Am Tue, 26 Jul 2005 09:57:23 + schrieb Odd-R.: Input is a string of four digit sequences, possibly separated by a -, for instance like this 1234,-,4567, My regular expression is like this: rx1=re.compile(r\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z) Hi, try it without \A and

Re: regex problem

2005-07-26 Thread John Machin
Odd-R. wrote: Input is a string of four digit sequences, possibly separated by a -, for instance like this 1234,-,4567, My regular expression is like this: rx1=re.compile(r\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z) When running rx1.findall(1234,-,4567,) I only get

Re: regex problem

2005-07-26 Thread Duncan Booth
John Machin wrote: So here's the mean lean no-flab version -- you don't even need the parentheses (sorry, Thomas). rx1=re.compile(r\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,) rx1.findall(1234,-,4567,) ['1234,', '-,', '4567,'] No flab? What about all that repetition of \d? A less
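The difference between the anchored, repeated-group pattern and the unanchored one can be sketched as follows. Python 3 is assumed, and the input string is a hypothetical example, because the sample input shown in the archive appears to have lost characters. The compact `\d{4}` form is the one quoted from Duncan Booth elsewhere in this thread.

```python
import re

# Hypothetical input: four-digit sequences, one of them a dash-separated pair.
data = "1234,5678-9012,4567,"

# Anchored pattern with a repeated capturing group: the whole string is one
# match, and the group keeps only its last repetition.
anchored = re.compile(r"\A(\b\d{4},|\b\d{4}-\d{4},)*\Z")
print(anchored.findall(data))

# Unanchored and without a repeated group: findall returns every item.
compact = re.compile(r"\b\d{4}(?:-\d{4})?,")
print(compact.findall(data))
```

The `(?:...)` group is non-capturing, so `findall` returns the full matched text instead of the contents of a group.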

Re: regex problem

2005-07-26 Thread John Machin
Duncan Booth wrote: John Machin wrote: So here's the mean lean no-flab version -- you don't even need the parentheses (sorry, Thomas). rx1=re.compile(r\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,) rx1.findall(1234,-,4567,) ['1234,', '-,', '4567,'] No flab? What about all that