Re: Issue with regular expressions

2008-04-29 Thread harvey . thomas
On Apr 29, 2:46 pm, Julien [EMAIL PROTECTED] wrote: Hi, I'm fairly new in Python and I haven't used the regular expressions enough to be able to achieve what I want. I'd like to select terms in a string, so I can then do a search in my database. query = '    some words  with and without  

Re: Matching XML Tag Contents with Regex

2007-12-11 Thread harvey . thomas
On Dec 11, 4:05 pm, Chris [EMAIL PROTECTED] wrote: I'm trying to find the contents of an XML tag. Nothing fancy. I don't care about parsing child tags or anything. I just want to get the raw text. Here's my script: import re data = <?xml version='1.0'?> <body> <div class='default'

Re: just a bug (was: xml.dom.minidom: how to preserve CRLF's inside CDATA?)

2007-05-25 Thread harvey . thomas
On May 25, 12:03 pm, sim.sim [EMAIL PROTECTED] wrote: On 25 May, 12:45, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote: In [EMAIL PROTECTED], sim.sim wrote: Below the code that tries to parse a well-formed xml, but it fails with error message: not well-formed (invalid token): line

Re: xml.dom.minidom: how to preserve CRLF's inside CDATA?

2007-05-22 Thread harvey . thomas
On May 22, 2:45 pm, sim.sim [EMAIL PROTECTED] wrote: Hi all. i'm faced to trouble using minidom: #i have a string (xml) within CDATA section, and the section includes \r\n: iInStr = '<?xml version="1.0"?>\n<Data><![CDATA[BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n]]></Data>\n' #After i create DOM-object,

Re: XML Parsing

2007-03-28 Thread harvey . thomas
On Mar 28, 10:51 am, Diez B. Roggisch [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: I want to parse this XML file: <?xml version="1.0" ?> <text> <text:one> <file>filename</file> <contents> Hello </contents> </text:one> <text:two> <file>filename2</file> <contents> Hello2 </contents>

Re: Match 2 words in a line of file

2007-01-19 Thread harvey . thomas
Rickard Lindberg wrote: I see two potential problems with the non regex solutions. 1) Consider a line: foo (bar). When you split it you will only get two strings, as split by default only splits the string on white space characters. Thus 'bar' in words will return false, even though bar is
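The first problem described above can be reproduced in a few lines. This is only a sketch: the helper name has_words is hypothetical and does not come from the thread, and the word-boundary regex is one possible fix, not necessarily the one proposed in the full post.

```python
import re

line = "foo (bar)"

# str.split() only splits on whitespace, so the parentheses stay attached.
words = line.split()
print(words)            # ['foo', '(bar)']
print("bar" in words)   # False

def has_words(text, *targets):
    """Return True if every target occurs in text as a whole word.

    Hypothetical helper: \b anchors each target at word boundaries,
    so punctuation around the word no longer hides it.
    """
    return all(
        re.search(r"\b" + re.escape(target) + r"\b", text)
        for target in targets
    )

print(has_words(line, "foo", "bar"))   # True
print(has_words("foobar", "foo"))      # False: 'foo' is not a whole word here
```

re.escape keeps the search literal even if a target contains regex metacharacters.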

Re: One more regular expressions question

2007-01-18 Thread harvey . thomas
Victor Polukcht wrote: My pattern now is: (?P<var1>[^(]+)(?P<var2>\d+)\)\s+(?P<var3>\d+) And i expect to get: var1 = Unassigned Number var2 = 1 var3 = 32 I'm sure my regexp is incorrect, but can't understand where exactly. Regex.debug shows that even the first block is incorrect. Thanks

Re: re.sub and empty groups

2007-01-16 Thread harvey . thomas
Hugo Ferreira wrote: Hi! I'm trying to do a search-replace in places where some groups are optional... Here's an example: re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola").groups() ('ola', None) re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola|").groups() ('ola', '')
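The two results quoted above can be checked directly. The rest of the post is truncated, so this sketch only illustrates the difference between an unmatched optional group (None) and a matched but empty one (''). Using the default argument of groups() to normalise the two cases is a suggestion here, not something taken from the thread.

```python
import re

# Same pattern as in the quoted post, with the string quotes restored.
pattern = re.compile(r"Image:([^\|]+)(?:\|(.*))?")

# No '|' in the input: the optional group never participates.
print(pattern.match("Image:ola").groups())     # ('ola', None)

# A trailing '|': the group participates and matches the empty string.
print(pattern.match("Image:ola|").groups())    # ('ola', '')

# groups(default=...) substitutes a value for groups that did not match.
print(pattern.match("Image:ola").groups(default=""))   # ('ola', '')
```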

Re: Insert characters into string based on re ?

2006-10-13 Thread harvey . thomas
Matt wrote: I am attempting to reformat a string, inserting newlines before certain phrases. For example, in formatting SQL, I want to start a new line at each JOIN condition. Noting that strings are immutable, I thought it best to split the string at the key points, then join with '\n'.
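As a rough sketch of the task described above, re.sub can insert the newline in one pass instead of splitting and re-joining. The SQL string and the list of JOIN prefixes are made up for illustration, and this is not necessarily the approach given in the full reply.

```python
import re

# Hypothetical input; the original post's SQL is not shown in the snippet.
sql = ("SELECT a.id, b.name FROM a JOIN b ON a.id = b.a_id "
       "LEFT JOIN c ON c.id = b.c_id")

# Match the whitespace before a JOIN keyword, with an optional prefix
# (assumed set: LEFT, RIGHT, INNER), and capture the keyword itself.
join_re = re.compile(r"\s+((?:LEFT |RIGHT |INNER )?JOIN\b)")

# Replace the preceding whitespace with a newline and keep the keyword.
# sub() returns a new string, so immutability is not an obstacle.
formatted = join_re.sub(r"\n\1", sql)
print(formatted)
```

The pattern is case-sensitive as written; passing re.IGNORECASE to re.compile would also catch lowercase keywords.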

Re: Need a Regular expression to remove a char for Unicode text

2006-10-13 Thread harvey . thomas
శ్రీనివాస wrote: Hai friends, Can any one tell me how can i remove a character from a Unicode text. కల్‌హార is a Telugu word in Unicode. Here i want to remove '' but not replace with a zero width char. And one more thing, if any whitespaces are there before and after '' char, the text should