Thanks for all the replies. I wasn't aware of some of these alternatives. Most of them seem to transform Python code or bytecode into another language. I was already well aware of Cython. On the Nuitka blog, I notice it says "Compiling takes a lot [sic] time, ...". Compyler seems to generate assembly and then parse that assembly to produce a Windows exe. Berp turns Python into Haskell, not directly into machine code.

The closest thing to mine seems to be Psyco, but it tries to do something more ambitious: it analyzes the program while it's running to create specialized versions of certain functions. High memory usage seems to be an issue with Psyco.

My approach is to simply translate the bytecode into raw machine code as directly as possible, quickly and without using much memory. Basically I was going for a solution with no significant drawbacks. It was also meant to be very easy to maintain. The machine code is generated with a series of functions that very closely mirrors AT&T syntax (same as the default syntax for the GNU assembler) with some convenience functions that make it look like some kind of high-level assembly. For example, here is the implementation for LOAD_GLOBAL:

@hasname
def _op_LOAD_GLOBAL(f,name):
    return (
        f.stack.push_tos(True) + [
        ops.push(address_of(name)),
        ops.push(GLOBALS)
    ] + call('PyDict_GetItem') +
        if_eax_is_zero([
            discard_stack_items(1),
            ops.push(BUILTINS)
        ] + call('PyDict_GetItem') +
            if_eax_is_zero([
                discard_stack_items(1),
                ops.push(pyinternals.raw_addresses[
                    'GLOBAL_NAME_ERROR_MSG']),
                ops.push(pyinternals.raw_addresses['PyExc_NameError'])
            ] + call('format_exc_check_arg') + [
                goto(f.end)
            ])
        ) + [
        discard_stack_items(2)
    ])

To make sense of it, you just need to ignore the square brackets and plus signs (they are there to build a list that gets joined into one byte string at the very end) and imagine it's assembly code. I should probably just write a variadic function or use operator overloading to make this syntactically clearer. Any time a new version of Python is released, you would just run diff on Python-X.X/Python/ceval.c and see which op codes need updating. You wouldn't need to make a broad series of changes because of a minor change in Python's syntax or bytecode.
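As a rough illustration of the list-and-join scheme, and of what a variadic helper could look like, here is a minimal self-contained sketch. It is not the project's code: push_imm32, add_esp_imm8, seq and assemble are hypothetical names, and discard_stack_items here is only a stand-in that assumes 4-byte stack slots. The byte encodings are the standard 32-bit x86 ones for `push imm32` and `add esp, imm8`.

```python
import struct

def push_imm32(value):
    # x86 `push imm32`: opcode 0x68 followed by a little-endian 32-bit value
    return b'\x68' + struct.pack('<I', value)

def add_esp_imm8(n):
    # x86 `add esp, imm8`: bytes 83 C4 followed by a signed 8-bit immediate
    return b'\x83\xc4' + struct.pack('<b', n)

def discard_stack_items(count):
    # Stand-in helper (assumption): drop `count` 4-byte slots from the stack
    return add_esp_imm8(4 * count)

def seq(*parts):
    # Hypothetical variadic helper: accepts byte strings and (nested) lists
    # of byte strings, and returns one flat list of fragments
    out = []
    for part in parts:
        if isinstance(part, (bytes, bytearray)):
            out.append(bytes(part))
        else:
            out.extend(seq(*part))
    return out

def assemble(fragments):
    # The single join "at the very end" that turns fragments into machine code
    return b''.join(fragments)

code = assemble(seq(
    push_imm32(0x1000),
    [push_imm32(0x2000), discard_stack_items(2)],
))
print(code.hex())
```

With a helper like seq, nested blocks can be passed as plain arguments, so the square brackets and plus signs mostly disappear from the op code implementations.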

And that's just one of the larger op code implementations. Here's the one for LOAD_CONST:

@hasconst
def _op_LOAD_CONST(f,const):
    return f.stack.push_tos() + [f.stack.push(address_of(const))]


Anyway, it was certainly interesting working on this. I'll probably at least implement looping and arithmetic so I can have something meaningful to benchmark.
