Hi Michael,
Would a non-regex approach be any help? I can think of a way that uses string methods:
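space = " "
stringStuff = "Stuff  with   multiple    spaces"
indexN = 0
ranges = []
while 1:
    try:
        # find the next space at or after indexN
        indexN = stringStuff.index(space, indexN)
    except ValueError:
        # no spaces left, so we're done scanning
        break
    # walk indexT forward to the end of this run of spaces
    indexT = indexN
    while indexT < len(stringStuff) and stringStuff[indexT] == space:
        indexT += 1
    # only runs of two or more spaces need collapsing
    if indexT - indexN > 1:
        ranges.append((indexN, indexT))
    indexN = indexT

# strings are immutable, so rebuild the string slice by slice;
# working from the end keeps the earlier indexes valid
ranges.reverse()
for (low, high) in ranges:
    stringStuff = stringStuff[:low] + space + stringStuff[high:]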
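
If you don't need to know where the runs of spaces are, a shorter take on the same idea might be to split on single spaces and throw away the empty pieces. This is only a sketch: collapse_spaces is a name I made up, and unlike the loop above it also drops leading and trailing spaces. Newlines are left alone, so blank lines survive.

```python
def collapse_spaces(text):
    # Splitting on a single space leaves an empty string for every
    # extra space in a run; dropping those and rejoining with one
    # space collapses each run.
    pieces = [piece for piece in text.split(" ") if piece]
    return " ".join(pieces)

print(collapse_spaces("Stuff  with   multiple    spaces"))
```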
HTH
Liam Clarke
On Tue, 4 Jan 2005 15:39:18 -0800, Michael Powe <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm having erratic results with a regex. I'm hoping someone can
> pinpoint the problem.
>
> This function removes HTML formatting codes from a text email that is
> poorly exported -- it is supposed to be a text version of an HTML
> mailing, but it's basically just a text version of the HTML page. I'm
> not after anything elaborate, but it has gotten to be a bit of an
> itch. ;-)
>
> def parseFile(inFile) :
>     import re
>     bSpace = re.compile("^ ")
>     multiSpace = re.compile(r"\s\s+")
>     nbsp = re.compile(r"&nbsp;")
>     HTMLRegEx = re.compile(
>         r"(<|&lt;)/?((!--.*--)|(STYLE.*STYLE)|(P|BR|b|STRONG))/?(>|&gt;)",
>         re.I)
>
>     f = open(inFile,"r")
>     lines = f.readlines()
>     newLines = []
>     for line in lines :
>         line = HTMLRegEx.sub(' ',line)
>         line = bSpace.sub('',line)
>         line = nbsp.sub(' ',line)
>         line = multiSpace.sub(' ',line)
>         newLines.append(line)
>     f.close()
>     return newLines
>
> Now, the main issue I'm looking at is with the multiSpace regex. When
> applied, this removes some blank lines but not others. I don't want
> it to remove any blank lines, just contiguous multiple spaces in a
> line.
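
A likely cause: \s matches any whitespace, newlines included, and every line from readlines() still carries its line ending. So a line with whitespace just before the newline (a trailing space, or a "\r\n" ending) has its line break swallowed by \s\s+, while a bare "\n" is a single character and escapes. A rough sketch of the difference, assuming you only care about spaces and tabs (old_multi and new_multi are names I picked for illustration):

```python
import re

old_multi = re.compile(r"\s\s+")      # the pattern from the function
new_multi = re.compile(r"[ \t]{2,}")  # spaces/tabs only, never "\n"

line = "some  text \n"   # note the trailing space before the newline
print(repr(old_multi.sub(" ", line)))  # 'some text '   (newline gone)
print(repr(new_multi.sub(" ", line)))  # 'some text \n' (newline kept)
```

Swapping the character class like this leaves the rest of the function unchanged.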
>
> BTB, this also illustrates a difference between Python and Perl -- in
> Perl, I can change "line" and it automatically changes the entry in
> the array; this doesn't work in Python. A bit annoying, actually.
> ;-)
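
In Python, assigning to the loop variable only rebinds the name "line" to a new string; the list still holds the old one. If you do want to change the list in place, one common way is to assign back through the index. A small sketch with made-up sample data:

```python
lines = ["  one\n", "two  \n"]

# Rebinding the loop variable leaves the list untouched...
for line in lines:
    line = line.strip()

# ...so assign back through the index instead.
for i, line in enumerate(lines):
    lines[i] = line.strip()

# Or build a new list in one go with a list comprehension.
stripped = [line.strip() for line in ["  one\n", "two  \n"]]
```

Building newLines the way the function already does is fine too; it just costs a second list.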
>
> Thanks for any help. If there's a better way to do this, I'm open to
> suggestions in that regard, too.
>
> mp
> _______________________________________________
> Tutor maillist - [email protected]
> http://mail.python.org/mailman/listinfo/tutor
>
--
'There is only one basic human right, and that is to do as you damn well please.
And with it comes the only basic human duty, to take the consequences.'