On Fri, May 30, 2008 at 4:45 AM, Edward K. Ream <[EMAIL PROTECTED]> wrote:

>     5859    0.084    0.000    0.165    0.000 leoAtFile.py:1079(readEndNode)
>       43    0.531    0.012    6.564    0.153 leoAtFile.py:744(scanText4)

Judging by the cumulative time taken by scanText4 (6.564 s over 43
calls), rewriting it to use mxTextTools does seem likely to help.

Some easier optimizations you may also want to try in this function:

        while at.errors == 0 and not at.done:
            s = at.readLine(theFile)
            self.lineNumber += 1
            if len(s) == 0: break
            kind = at.sentinelKind4(s)
            # g.trace(at.sentinelName(kind),s.strip())
            if kind == at.noSentinel:
                i = 0
            else:
                i = at.skipSentinelStart4(s,0)
            func = at.dispatch_dict[kind]
            func(s,i)

Store the attributes used inside the loop in local variables before
entering it:

skind = at.sentinelKind4
noSent = at.noSentinel
skips = at.skipSentinelStart4
ddict = at.dispatch_dict

I suppose the same could be done for lineNumber: count in a local and
write it back to the object after the loop.
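
To make the idea concrete, here is a rough sketch of the loop with the
lookups hoisted. FakeAt is a hypothetical stand-in I made up so the
snippet runs on its own; only the attribute names used in the loop come
from the code above, and the sentinel test inside it is a placeholder,
not Leo's real logic. It is written for current Python rather than the
version Leo used at the time.

```python
class FakeAt:
    """Hypothetical stand-in for the real reader object (assumption)."""
    noSentinel = 0
    otherSentinel = 1

    def __init__(self, lines):
        self.lines = list(lines)
        self.pos = 0
        self.errors = 0
        self.done = False
        self.lineNumber = 0
        self.seen = []
        self.dispatch_dict = {
            self.noSentinel: self.readNormalLine,
            self.otherSentinel: self.readSentinel,
        }

    def readLine(self, theFile):
        # Returns "" at end of input, like file.readline().
        if self.pos >= len(self.lines):
            return ""
        s = self.lines[self.pos]
        self.pos += 1
        return s

    def sentinelKind4(self, s):
        # Placeholder test, not the real sentinel parser.
        return self.otherSentinel if s.startswith("#@") else self.noSentinel

    def skipSentinelStart4(self, s, i):
        return i + 2

    def readNormalLine(self, s, i):
        self.seen.append(("text", s[i:]))

    def readSentinel(self, s, i):
        self.seen.append(("sentinel", s[i:]))


def scan(at, theFile=None):
    # Look each attribute up once instead of on every iteration.
    readLine = at.readLine
    skind = at.sentinelKind4
    noSent = at.noSentinel
    skips = at.skipSentinelStart4
    ddict = at.dispatch_dict
    lineNumber = at.lineNumber
    # errors and done may be changed by the handlers, so they are
    # still read from the object on every pass.
    while at.errors == 0 and not at.done:
        s = readLine(theFile)
        lineNumber += 1
        if not s:
            break
        kind = skind(s)
        i = 0 if kind == noSent else skips(s, 0)
        ddict[kind](s, i)
    # Write the counter back once the loop is finished.
    at.lineNumber = lineNumber
```

One caveat: if any handler reads lineNumber from the object while the
loop is running (for an error message, say), the local counter would
leave it stale, so that part of the change needs checking against the
real handlers.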

I suppose the repeated toUnicode calls in readLine slow it down too.
Perhaps there should be a "short circuit" for files that are plain
ASCII.
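
A minimal sketch of what such a short circuit might look like, in
current Python (bytes.isascii needs 3.7 or later). make_line_converter
is a name I invented for illustration; it is not part of Leo, and it
does not reproduce what toUnicode actually does. The point is only that
the ASCII check happens once per file instead of once per line. Whether
it is actually faster would have to be measured.

```python
def make_line_converter(raw, encoding="utf-8"):
    """Choose the per-line conversion once, based on the whole file.

    raw is the complete file content as bytes.
    """
    if raw.isascii():
        # Plain ASCII: no need to consult the configured encoding.
        return lambda line: line.decode("ascii")
    # Anything else goes through the general path.
    return lambda line: line.decode(encoding, errors="replace")


raw = b"plain ascii line\n"
convert = make_line_converter(raw)
line = convert(raw)
```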

It may also be significantly faster to convert the whole file in one
swoop, as opposed to doing it line-by-line.
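
Roughly what I have in mind, again as a sketch in current Python and
not Leo code: decode the whole buffer once, then hand out lines from
the decoded text. decode_then_split is a made-up name, and the utf-8
default is an assumption; the real encoding would come from wherever
Leo gets it now.

```python
import io


def decode_then_split(raw, encoding="utf-8"):
    """Decode the whole file once, then split it into lines."""
    text = raw.decode(encoding, errors="replace")
    # StringIO splits on "\n" only and keeps the line endings,
    # which matches what reading line by line would return.
    return io.StringIO(text).readlines()


lines = decode_then_split("h\u00e9llo\nworld\n".encode("utf-8"))
```

The scanning loop could then iterate over the resulting list instead of
calling readLine and converting each line separately.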

-- 
Ville M. Vainio - vivainio.googlepages.com
blog=360.yahoo.com/villevainio - g[mail | talk]='vivainio'
