Eddie Elizondo <eduardo.elizondoru...@gmail.com> added the comment:
> I'm somewhat puzzled how a version that does no more work and has no jumps is
> slower.

Oh, I think I know why there's some confusion. I've updated the PR since the initial version (which is probably the one that you saw). The branching version does less work in Py_INCREF and Py_DECREF for all instances marked with the immortal bit, because it exits early.

In the latest change, I added the immortal bit to a bunch of "known" immortal objects. For instance:
* All static types (e.g. PyType_Type, etc.)
* All small ints (-5 to 256)
* The following singletons: Py_True, Py_False, Py_None
* The heap after the runtime is initialized (in pymain_main)

Example 1)
```
PyObject _Py_NoneStruct = {
  _PyObject_EXTRA_INIT
#ifdef Py_IMMORTAL_OBJECTS
  _Py_IMMORTAL_BIT,
#else
  1,
#endif  /* Py_IMMORTAL_OBJECTS */
  &_PyNone_Type
};
```

Example 2)
```
static int
pymain_main(_PyArgv *args)
{
    PyStatus status = pymain_init(args);
    if (_PyStatus_IS_EXIT(status)) {
        pymain_free();
        return status.exitcode;
    }
    if (_PyStatus_EXCEPTION(status)) {
        pymain_exit_error(status);
    }

#ifdef Py_IMMORTAL_OBJECTS
    /* Most of the objects alive at this point will stay alive throughout the
     * lifecycle of the runtime. Immortalize to avoid the GC and refcnt costs */
    _PyGC_ImmortalizeHeap();
#endif  /* Py_IMMORTAL_OBJECTS */
    return Py_RunMain();
}
```

Therefore, you are now making Py_INCREF and Py_DECREF cheaper for things like `Py_RETURN_NONE` and a bunch of other instances. Let me know if that explains it! I could also send you a patch of the branch-less version so you can compare them.

> but making the object header immutable prevents changes like

Why would it prevent it? These changes are not mutually exclusive; you can still have an immortal bit by:
1) Using a bit from `gc_bits`. Currently you only need 2 bits for the GC. Even with a `short` you'll have space for the immortal bit.
2) Using a bit from the ob_refcnt.

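To make option 2 a bit more concrete, here is a minimal sketch (not the code in the PR) of reserving one bit of the reference count as the immortal flag and exiting early in incref/decref. The `demo_*` names, the struct layout, and the choice of bit are all assumptions made up for illustration; they are not CPython's actual definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for PyObject; only the refcount matters here. */
typedef struct {
    intptr_t refcnt;      /* plays the role of ob_refcnt */
    int dealloc_calls;    /* counts how often "deallocation" was triggered */
} demo_object;

/* Assumed flag position: the second-highest bit, so the value stays positive. */
#define DEMO_IMMORTAL_BIT \
    ((intptr_t)1 << (sizeof(intptr_t) * CHAR_BIT - 2))

static int demo_is_immortal(const demo_object *op) {
    return (op->refcnt & DEMO_IMMORTAL_BIT) != 0;
}

static void demo_incref(demo_object *op) {
    if (demo_is_immortal(op)) {
        return;           /* early exit: the header is never written */
    }
    op->refcnt++;
}

static void demo_decref(demo_object *op) {
    if (demo_is_immortal(op)) {
        return;           /* immortal objects are never deallocated */
    }
    assert(op->refcnt > 0);
    if (--op->refcnt == 0) {
        op->dealloc_calls++;   /* a real runtime would free the object here */
    }
}
```

With this layout an object initialized with `DEMO_IMMORTAL_BIT` as its refcount keeps exactly that value no matter how many incref/decref calls it sees, which is the property the static initializer in Example 1 relies on.
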
Separately, using this allows us to experiment with branch-less and test-less code by using saturated adds. For example:
```
/* Branch-less incref with saturated add.
 * x86 only: addl/cmovol operate on 32-bit operands, so this sketch
 * assumes a 32-bit ob_refcnt. */

/* Largest positive int; the unsigned cast avoids an arithmetic shift of -1. */
#define PY_REFCNT_MAX ((int)(((unsigned int)-1) >> 1))

#define _Py_INCREF(op) ({                                   \
    __asm__ (                                               \
        "addl $0x1, %[refcnt]\n\t"                          \
        "cmovol %[refcnt_max], %[refcnt]"                   \
        : [refcnt] "+r" (((PyObject *)(op))->ob_refcnt)     \
        : [refcnt_max] "r" (PY_REFCNT_MAX)                  \
        : "cc"                                              \
    );})
```

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40255>
_______________________________________