New submission from Raymond Hettinger:

A number of fine-grained methods in Objects/listobject.c use PyList_Check(). They include PyList_Size, PyList_GetItem, PyList_SetItem, PyList_Insert, and PyList_Append.
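As a rough illustration of the check these functions perform, and of the exact-match test that the rest of this message compares it with, here is a minimal standalone C sketch. It does not use Python.h: every `mock_` name is a hypothetical stand-in rather than the CPython definition, and the only detail borrowed from CPython is that the list-subclass flag is bit 25 of the type's flags word.

```c
#include <stddef.h>

/* Hypothetical stand-ins for PyTypeObject and PyObject (not CPython's layout). */
typedef struct mock_type {
    unsigned long flags;            /* plays the role of tp_flags */
} mock_type;

typedef struct mock_object {
    size_t refcnt;                  /* plays the role of ob_refcnt */
    const mock_type *type;          /* plays the role of ob_type */
} mock_object;

/* Same bit position as CPython's Py_TPFLAGS_LIST_SUBCLASS. */
#define MOCK_TPFLAGS_LIST_SUBCLASS (1UL << 25)

static const mock_type mock_list_type    = { MOCK_TPFLAGS_LIST_SUBCLASS };
static const mock_type mock_list_subtype = { MOCK_TPFLAGS_LIST_SUBCLASS };
static const mock_type mock_dict_type    = { 0 };

/* Subclass-aware check: load the type pointer, then load that type's
   flags. The second load depends on the result of the first. */
static int mock_list_check(const mock_object *op)
{
    return (op->type->flags & MOCK_TPFLAGS_LIST_SUBCLASS) != 0;
}

/* Exact check: load the type pointer and compare it with a known
   address. Only one memory access is needed. */
static int mock_list_check_exact(const mock_object *op)
{
    return op->type == &mock_list_type;
}

/* Fast path: try the exact comparison first and fall back to the
   flag test only when the object is not an exact list. */
static int mock_list_check_fast(const mock_object *op)
{
    return mock_list_check_exact(op) || mock_list_check(op);
}
```

The combined check accepts exactly the same objects as the flag test alone; the only intended difference is that an exact list is recognized without touching the type object's memory.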

The PyList_Check() works by making two sequentially dependent memory fetches:

    movq    8(%rdi), %rax
    testb   $2, 171(%rax)
    je      L1645

This patch proposes a fast path for the common case of an exact match, using PyList_CheckExact() as an early-out before the PyList_Check() test:

    leaq    _PyList_Type(%rip), %rdx    # parallelizable
    movq    8(%rdi), %rax               # only 1 memory access
    cmpq    %rdx, %rax                  # macro-fusion
    je      L1604                       # early-out
    testb   $2, 171(%rax)               # fallback to 2nd memory access
    je      L1604

This technique won't help outside of Objects/listobject.c because the initial LEA instruction becomes a MOV for the global offset table, nullifying the advantage.

----------
assignee: serhiy.storchaka
components: Interpreter Core
files: list_check_fastpath.diff
keywords: patch
messages: 258918
nosy: rhettinger, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Faster type checking in listobject.c
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file41713/list_check_fastpath.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26201>
_______________________________________