Eddie Elizondo <eduardo.elizondoru...@gmail.com> added the comment:

> I'm somewhat puzzled how a version that does no more work and has no jumps is 
> slower.

Oh I think I know why there's some confusion. I've updated the PR from the 
initial version (which is probably the one that you saw). The branching does 
less work in Py_INCREF and Py_DECREF for all instances marked with the immortal 
bit by exiting early.
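
To make the early exit concrete, here is a minimal sketch rather than the PR's actual code. `toy_object`, `TOY_IMMORTAL_BIT`, `toy_incref` and `toy_decref` are hypothetical stand-ins, and keeping the flag in a high bit of the refcount is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject; not CPython's definition. */
typedef struct {
    ptrdiff_t refcnt;
} toy_object;

/* Assumption: the immortal flag lives in a high bit of the refcount. */
#define TOY_IMMORTAL_BIT ((ptrdiff_t)1 << (sizeof(ptrdiff_t) * 8 - 4))

static int toy_is_immortal(const toy_object *op)
{
    assert(op != NULL);
    return (op->refcnt & TOY_IMMORTAL_BIT) != 0;
}

static void toy_incref(toy_object *op)
{
    if (toy_is_immortal(op)) {
        return;  /* early exit: the object header is never written */
    }
    op->refcnt++;
}

/* Returns 1 when the caller should deallocate the object, 0 otherwise. */
static int toy_decref(toy_object *op)
{
    if (toy_is_immortal(op)) {
        return 0;  /* early exit: immortal objects are never freed */
    }
    return --op->refcnt == 0;
}
```

Calling `toy_incref` or `toy_decref` on an object whose refcount carries `TOY_IMMORTAL_BIT` leaves the field untouched, while a regular object is counted as usual.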

In the latest change, I added the immortal bit to a bunch of "known" immortal 
objects. For instance:
* All static types (e.g. PyType_Type)
* All small ints (-5 to 256)
* The following singletons: Py_True, Py_False, Py_None
* Every object on the heap once the runtime is initialized (in pymain_main)

Example 1)
```
PyObject _Py_NoneStruct = {
  _PyObject_EXTRA_INIT
#ifdef Py_IMMORTAL_OBJECTS
  _Py_IMMORTAL_BIT,
#else
  1,
#endif  /* Py_IMMORTAL_OBJECTS */
  &_PyNone_Type
};
```

Example 2)
```
static int
pymain_main(_PyArgv *args)
{
    PyStatus status = pymain_init(args);
    if (_PyStatus_IS_EXIT(status)) {
        pymain_free();
        return status.exitcode;
    }
    if (_PyStatus_EXCEPTION(status)) {
        pymain_exit_error(status);
    }

#ifdef Py_IMMORTAL_OBJECTS
    /* Most of the objects alive at this point will stay alive throughout the
     * lifecycle of the runtime. Immortalize to avoid the GC and refcnt costs */
    _PyGC_ImmortalizeHeap();
#endif  /* Py_IMMORTAL_OBJECTS */
    return Py_RunMain();
}
```
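
As a rough picture of what immortalizing the heap could look like, the sketch below walks a list of tracked objects and sets the bit on each one. It is not the implementation of `_PyGC_ImmortalizeHeap`. The singly linked `toy_gc_object` list, `TOY_IMMORTAL_BIT` and `toy_immortalize_heap` are all hypothetical names, and storing the flag in the refcount is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracked object; the real GC keeps its own list structure. */
typedef struct toy_gc_object {
    struct toy_gc_object *next;  /* next tracked object, NULL at the end */
    ptrdiff_t refcnt;
} toy_gc_object;

/* Assumption: the immortal flag lives in a high bit of the refcount. */
#define TOY_IMMORTAL_BIT ((ptrdiff_t)1 << (sizeof(ptrdiff_t) * 8 - 4))

/* Marks every object in the list starting at `head` as immortal and
 * returns how many objects were newly marked. */
static size_t toy_immortalize_heap(toy_gc_object *head)
{
    size_t marked = 0;
    for (toy_gc_object *op = head; op != NULL; op = op->next) {
        assert(op->refcnt >= 0);  /* refcounts are non-negative in this sketch */
        if ((op->refcnt & TOY_IMMORTAL_BIT) == 0) {
            op->refcnt |= TOY_IMMORTAL_BIT;
            marked++;
        }
    }
    return marked;
}
```

Running it a second time over the same list marks nothing new, so a one-time call after initialization is enough in this toy model.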


Therefore, Py_INCREF and Py_DECREF are now cheaper for things like 
`Py_RETURN_NONE` and for every other use of these immortal objects.

Let me know if that explains it! I could also send you a patch of the 
branch-less version so you can compare them.



> but making the object header immutable prevents changes like

Why would it prevent it? These changes are not mutually exclusive; you can 
still have an immortal bit by:
1) Using a bit from `gc_bits`. Currently you only need 2 bits for the GC. Even 
with a `short` you'll have space for the immortal bit.
2) Using a bit from the ob_refcnt. Separately, using this allows us to 
experiment with branch-less, test-less code by using saturating adds. For 
example:

```
/* Branch-less incref with saturated add (x86, GCC/Clang extended asm).
 * Sketch only: assumes a 32-bit ob_refcnt. */
#define PY_REFCNT_MAX ((int)(((unsigned int)-1)>>1))
#define _Py_INCREF(op) ({                                   \
    __asm__ (                                               \
        "addl $0x1, %[refcnt]\n\t"                          \
        "cmovol %[refcnt_max], %[refcnt]"                   \
        : [refcnt] "+r" (((PyObject *)(op))->ob_refcnt)     \
        : [refcnt_max] "r" (PY_REFCNT_MAX)                  \
        : "cc"                                              \
    );})
```
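
For comparison, the same saturating semantics can be written in portable C. This is a sketch under assumptions: `toy_saturating_incref` is a hypothetical helper working on a plain `int` count, and nothing guarantees that a compiler lowers it to a conditional move instead of a branch.

```c
#include <assert.h>
#include <limits.h>

/* Saturating increment: the count sticks at INT_MAX instead of wrapping,
 * so an object whose count starts at INT_MAX behaves as immortal. */
static int toy_saturating_incref(int refcnt)
{
    assert(refcnt >= 0);
    /* Adds 1 unless the count is already at the maximum, so the
     * addition can never overflow. */
    return refcnt + (refcnt != INT_MAX);
}
```

A count of `INT_MAX - 1` moves to `INT_MAX` and then stays there on every later call.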

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40255>
_______________________________________