What version of CPython did you try that with? The latest py3k branch?

I had a quick look at 3.2, 2.5 and 2.7 and got the impression that the relative savings are larger when the interpreter loop is faster: the fewer instructions there are in the loop, the bigger the difference a 3-instruction saving makes.

The NEXTARG macro is the same in all three versions:
#define NEXTARG() (next_instr += 2, (next_instr[-1]<<8) + next_instr[-2])
and the compiler compiles this to two separate fetches.

I found that my compiler (gcc) generates better code if we use a short.
If I force it to, it emits a single "movswl" instruction that does both fetches at once.
That saves two instructions already.

This implies that on little-endian machines, changing just one line of code in ceval.c would already save a few percent:
#define NEXTARG()       (next_instr += 2, *(short *)&next_instr[-2])
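
To make the comparison concrete, here is a minimal standalone sketch of the two fetch strategies, outside of ceval.c. The function names fetch_bytewise and fetch_u16_le are made up for this illustration and do not exist in CPython. The sketch also departs from the one-liner above in two ways, both of which are my assumptions about what a safe version would need: it uses an unsigned 16-bit type, because a plain short sign-extends any argument of 0x8000 or above, and it uses memcpy instead of a pointer cast, to sidestep alignment and strict-aliasing problems. Optimizing compilers typically turn such a fixed-size memcpy into a single load, but I have not measured that here.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise decode, equivalent to the existing NEXTARG macro:
 * low byte first, then the high byte shifted up by 8.
 * Works regardless of host byte order. */
static unsigned int fetch_bytewise(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    *pp = p + 2;
    return ((unsigned int)p[1] << 8) + p[0];
}

/* Single 16-bit load (hypothetical helper, not CPython code).
 * Only gives the right answer on little-endian hosts, because the
 * bytecode stores the low byte first.  uint16_t keeps the value
 * unsigned, so 0xffff stays 65535 instead of becoming -1. */
static unsigned int fetch_u16_le(const unsigned char **pp)
{
    uint16_t v;
    memcpy(&v, *pp, sizeof v);  /* safe for unaligned addresses */
    *pp += 2;
    return v;
}
```

On a little-endian machine both functions return 0x1234 for the byte pair 0x34, 0x12 and advance the pointer by two. A big-endian build would have to keep the byte-wise version, for example behind a byte-order check at compile time.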

- Jurjen
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev