On 3/30/18 6:41 AM, bartc wrote:
On 27/03/2018 04:49, Richard Damon wrote:
On 3/26/18 8:46 AM, bartc wrote:

Hence my testing with CPython 3.6 rather than with something like PyPy, which can give meaningless results because, for example, real code doesn't repeatedly execute the same pointless fragment millions of times. But a real context is too complicated to set up.

The bigger issue is that these sorts of micro-measurements aren't actually that good at measuring real quantitative performance costs. They can often give qualitative indications, but given the way modern computers work, the processing environment is extremely important to performance, so these sorts of isolated measurements can often be misleading. The problem is that if you measure operation a, then measure operation b, and assume that doing a then b in a loop will take a time of a+b, you will quite often be significantly wrong, as cache performance can drastically affect things. Thus you really need to do performance testing as part of a practical-sized exercise, not a micro one, in order to get a real measurement.
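One rough way to check the a+b point on your own machine is sketched below. The two fragments, the setup string, and the helper name time_fragment are made up for illustration and are not from this thread. Whether the combined time lands above or below the sum will depend on the interpreter build and the hardware.

```python
import timeit

def time_fragment(stmt, setup="x = 12345; s = 'abc'", number=200_000, repeat=5):
    """Return the best-of-`repeat` time in seconds for `number` runs of stmt."""
    # Taking the minimum reduces noise from other processes on the machine.
    return min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))

# Two arbitrary, hypothetical fragments standing in for "operation a" and "operation b".
frag_a = "y = x * 3 + 1"
frag_b = "t = s.upper()"

t_a = time_fragment(frag_a)
t_b = time_fragment(frag_b)
# Both fragments in the same timed loop body, a followed by b.
t_ab = time_fragment(frag_a + "\n" + frag_b)

print(f"a alone:      {t_a:.4f} s")
print(f"b alone:      {t_b:.4f} s")
print(f"a + b (sum):  {t_a + t_b:.4f} s")
print(f"a then b:     {t_ab:.4f} s")
```

Note that each separate measurement also pays for timeit's own loop overhead, so the sum counts that overhead twice. That alone is one reason the figures need not add up.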

That might apply to native code, where the timing behaviour of a complicated chip like x86 might be unintuitive.

But my comments were specifically about byte-code executed with CPython. Then the behaviour is a level or two removed from the hardware and with slightly different characteristics.

(Since the program you are actually executing is the interpreter, not the Python program, which is merely data. And whatever aggressive optimisations are done to the interpreter code, they are not affected by the Python program being run.)

But cache behavior may very well still influence it. A small section of byte code may exercise only a small part of the interpreter, which might then live entirely (or mostly) in cache and thus run faster, while a broader program uses more of the interpreter and may no longer fit in the cache. In some ways this effect can be much amplified compared with fully compiled code, as very small changes in byte code can have much bigger effects on what gets accessed. You probably do get less opportunity for things to speed up by combining pieces, but still plenty of opportunity to get slowdowns.

Another factor you run into is lookup time: the mere presence of lots of other code in the test module, even if it is not executing, can affect the speed at which the test runs.
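A minimal sketch for probing the lookup point follows. It assumes that "lots of other code" can be approximated by padding the module namespace with unused names, which is only one possible reading. The names SOURCE, build_namespace, and best_time are hypothetical. The same function is timed in a small and in a padded global namespace. The difference may well be negligible on a given machine.

```python
import timeit

# Hypothetical test code: work() looks up the global name `helper` on every iteration.
SOURCE = """
def helper(n):
    return n + 1

def work():
    total = 0
    for i in range(1000):
        total += helper(i)
    return total
"""

def build_namespace(extra_names):
    """Exec SOURCE in a fresh globals dict padded with `extra_names` unused names."""
    ns = {}
    for i in range(extra_names):
        ns[f"unused_{i}"] = i  # stands in for other, non-executing module content
    exec(SOURCE, ns)
    return ns

def best_time(ns, number=200, repeat=5):
    """Best-of-`repeat` time in seconds for `number` calls of work()."""
    return min(timeit.repeat(ns["work"], number=number, repeat=repeat))

small_ns = build_namespace(0)
large_ns = build_namespace(50_000)

t_small = best_time(small_ns)
t_large = best_time(large_ns)

print(f"small module namespace:  {t_small:.4f} s")
print(f"padded module namespace: {t_large:.4f} s")
```

This only exercises global-name lookup in a larger dict. It says nothing about the effect of extra bytecode or extra objects on the processor caches, which would need a realistically sized program to observe.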

--
Richard Damon
