Re: [Tutor] clean text

A.T.Hofkamp Tue, 19 May 2009 04:29:05 -0700

spir wrote:

def _cleanRepr(text):
        ''' text with control chars replaced by repr() equivalent '''
        chars = []
        for char in text:
                n = ord(char)
                if (n < 32) or (n > 126 and n < 160):
                        char = repr(char)[1:-1]
                chars.append(char)
        return ''.join(chars)


But what else can I do?

You seem to break the string down into single characters, replace a few of them, and then build the whole string back up.

Maybe you can insert larger chunks of text that do not need modification, i.e. something like


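chars = []
start = 0
for idx, char in enumerate(text):
    n = ord(char)
    if n < 32 or 126 < n < 160:
        # copy the untouched chunk before this char, then its escape
        chars.append(text[start:idx])
        chars.append(repr(char)[1:-1])
        start = idx + 1
# copy whatever remains after the last replaced char
chars.append(text[start:])
return ''.join(chars)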

An alternative to the above is to keep track of the first occurrence of each of the chars you want to split on (after some 'start' position), and compute the next point to break the string as the min of all those positions instead of slowly 'walking' to it by testing each character separately.

That would reduce the number of iterations you do in the loop, at the cost of maintaining a large number of positions of the next breaking point.
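
A rough sketch of that alternative, in case it helps. The function name clean_repr_find and the next_pos dictionary are my own inventions for illustration, and I have not measured whether this is actually faster than the per-character loop for typical input:

```python
def clean_repr_find(text):
    """Replace control chars by their repr() escape, jumping from one
    break point to the next instead of testing every character."""
    # Characters to replace: same ranges as the test 'n < 32 or 126 < n < 160'.
    targets = [chr(n) for n in list(range(32)) + list(range(127, 160))]

    # First occurrence of each target char; chars that never occur are dropped.
    next_pos = {}
    for c in targets:
        pos = text.find(c)
        if pos != -1:
            next_pos[c] = pos

    chunks = []
    start = 0
    while next_pos:
        # The next break point is the smallest of all tracked positions.
        char, idx = min(next_pos.items(), key=lambda item: item[1])
        chunks.append(text[start:idx])      # untouched chunk
        chunks.append(repr(char)[1:-1])     # escaped control char
        start = idx + 1
        # Only the char just handled needs a new position; the others
        # all lie beyond idx and are still valid.
        pos = text.find(char, start)
        if pos == -1:
            del next_pos[char]
        else:
            next_pos[char] = pos
    chunks.append(text[start:])             # tail after the last break point
    return ''.join(chunks)
```

The loop now runs once per replaced character rather than once per character of text, but the setup does one find() for each of the 65 candidate chars, so it probably only pays off for long strings with few control chars.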



Albert
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
