On 21/03/2016 12:34, BartC wrote:
On 21/03/2016 02:21, Terry Reedy wrote:

Of course you can.  But you cannot write in a crippled Python subset and
fairly claim that the result represents idiomatic Python code.

For Python I would have used a table of 0..255 functions, indexed by the
ord() code of each character.

But that's a hell of a lot of infrastructure to set up, only to find
out that Python's function call overheads mean it's not much faster (or
maybe it's slower) than using an if-elif chain.

I've tried it now and it's a bit slower (perhaps 5%). But once done, I think it looks better structured, and does away with a few globals.

I won't post the code as it will only get picked on for some irrelevant detail! But the main readtoken() routine now looks like this:

def readtoken(psource):
    global lxsptr, lxsymbol
    lxsubcode = 0

    while True:
        c = psource[lxsptr]
        lxsptr += 1
        d = ord(c)
        if d < 256:
            lxsymbol = disptable[d](psource, c)
        else:
            lxsymbol = fn_error(psource, c)

        if lxsymbol != skip_sym:
            break

(This assumes input is ASCII or UTF-8. For Unicode, the 256 changes to 128, and the call to fn_error() is replaced by something that deals with a Unicode token starter, which is most likely to be an error in the case of C source input.)
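For anyone curious how such a 256-entry dispatch table might be set up, here is a minimal sketch. The handler names (fn_skip, fn_name, fn_number, fn_error) and the symbol values they return are illustrative stand-ins, not the actual tokenizer's functions:

```python
# Hypothetical handlers: each takes the source and the current
# character, and returns a token symbol.
def fn_error(source, c):
    raise SyntaxError("unexpected character: %r" % c)

def fn_skip(source, c):
    return "skip_sym"       # whitespace: caller loops again

def fn_name(source, c):
    return "name_sym"       # would scan an identifier here

def fn_number(source, c):
    return "number_sym"     # would scan a numeric literal here

# Default every slot to the error handler, then fill in the
# slots for known token-starting characters.
disptable = [fn_error] * 256
for ch in " \t\r\n":
    disptable[ord(ch)] = fn_skip
for ch in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_":
    disptable[ord(ch)] = fn_name
for ch in "0123456789":
    disptable[ord(ch)] = fn_number
```

The one-off table build costs nothing per token; the per-token cost is one list index plus one function call, which is where Python's call overhead eats into the gains.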

(While I'm here, something that came up yesterday: why hasn't Python fixed the bug it seems to have inherited from C, where:

  a << b + c

is evaluated as 'a << (b+c)'? That cost me half an hour to sort out! << and >> scale numbers just like * and /, so should have the same precedence.)
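A quick demonstration of the precedence in question, since it's easy to check at the REPL:

```python
a, b, c = 1, 2, 3

# '+' binds tighter than '<<' in Python, as in C:
print(a << b + c)     # parsed as a << (b + c) = 1 << 5 = 32
print((a << b) + c)   # explicit grouping gives (1 << 2) + 3 = 7
```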

--
Bartc