[Python-Dev] Re: radix tree arena map for obmalloc
[Neil Schemenauer]
> ...
> BTW, the current radix tree doesn't even require that pools are
> aligned to POOL_SIZE. We probably want to keep pools aligned
> because other parts of obmalloc rely on that.

obmalloc relies on it heavily. Another radix tree could map block addresses to all the necessary info in a current pool header, but that could become gigantic. For example, even a current "teensy" 256 KiB arena can be carved into on the order of 16K 16-byte blocks (the smallest size class). And finding a pool header now is unbeatably cheap: just clear the last 12 address bits, and you're done.

> Here is the matchup of the radix tree vs the current
> address_in_range() approach.
>
> - nearly the same in terms of performance. It might depend on OS
>   and workload but based on my testing on Linux, they are very
>   close. Would be good to do more testing but I think the radix
>   tree is not going to be faster, only slower.

People should understand that the only point to these things is to determine whether a pointer passed to a free() or realloc() spelling was obtained from an obmalloc pool or from the system malloc family. So they're invoked once near the very starts of those two functions, and that's all.

Both ways are much more expensive than finding a pool header (which is just clearing some trailing address bits).

The current way reads an "arena index" out of the pool header, uses that to index the file static `arenas` vector to get a pointer to an arena descriptor, then reads the arena base address out of the descriptor. That's used to determine whether the original address is contained in the arena. The primary change my PR makes is to read the arena index from the start of the _page_ the address belongs to instead (the pool header address is irrelevant to this, apart from that a pool header is aligned to the first page in a pool).
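To make the steps of the current check concrete, here is a rough Python model of them. It is only a sketch under stated assumptions: the names `Arena`, `pool_header_address` and `address_in_range`, the dict standing in for raw memory, and the arbitrary "garbage" default are all illustrative, not the C code in obmalloc.

```python
POOL_SIZE = 4096            # 4 KiB pools: the low 12 address bits are the offset
ARENA_SIZE = 256 * 1024     # the current "teensy" arena size


class Arena:
    """Stand-in for an arena descriptor; only the base address matters here."""

    def __init__(self, base):
        self.base = base


def pool_header_address(addr):
    # Clear the last 12 address bits to reach the start of the pool.
    return addr & ~(POOL_SIZE - 1)


def address_in_range(addr, arenas, arena_index_at):
    """Model of the current check.

    arena_index_at maps a pool header address to the arena index stored
    there. For memory obmalloc does not own, the "index" is whatever happens
    to be in memory; the model uses an arbitrary default (12345) for that.
    """
    index = arena_index_at.get(pool_header_address(addr), 12345)
    if index >= len(arenas):
        return False
    base = arenas[index].base
    return base <= addr < base + ARENA_SIZE


arenas = [Arena(0x10000000)]
# Every pool in arena 0 records arena index 0 in its header.
arena_index_at = {arenas[0].base + i * POOL_SIZE: 0
                  for i in range(ARENA_SIZE // POOL_SIZE)}

print(hex(pool_header_address(0x10003A78)))
print(address_in_range(0x10003A78, arenas, arena_index_at))
print(address_in_range(0x7FFF0010, arenas, arena_index_at))
```

Note that even when the value read for a foreign address happens to be a valid index, the base-address comparison still rejects it, which is why reading whatever is in memory there is harmless in this model.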
The radix tree picks bits out of the address three times to index into a 3-level (but potentially broad) tree, ending with a node containing info about the only two possible arenas the original address may belong to. Then that info is used to check.

The number of loads is essentially the same, but the multiple levels of indexing in the tree is a teensy bit more expensive because it requires more bit-fiddling. I spent hours, in all, dreaming up a way to make the _seemingly_ more complex final "so is the address in one of those two arenas or not?" check about as cheap as the current way. But Neil didn't see any significant timing difference after making that change, which was mildly disappointing but not really surprising: arithmetic is just plain cheap compared to reading up memory.

> - radix tree uses a bit more memory overhead. Maybe 1 or 2 MiB on a
>   64-bit OS. The radix tree uses more as memory use goes up but it
>   is a small fraction of total used memory. The extra memory use is
>   the main downside at this point, I think.

I'd call it the only downside. But nobody yet has quantified how bad it can get.

> - the radix tree doesn't read uninitialized memory. The current
>   address_in_range() approach has worked very well but is relying on
>   some assumptions about the OS (how it maps pages into the program
>   address space). This is the only aspect where the radix tree is
>   clearly better. I'm not sure this matters enough to offset the
>   extra memory use.

I'm not worried about that. The only real assumption here is that if an OS supports some notion of "pages" at all, then for any address for which the program has read/write access (which are the only kinds of addresses that can be sanely passed to free/realloc), the OS allows the same access to the entire page containing that address. In two decades we haven't seen an exception to that yet, right? It's hard to imagine a HW designer thinking "I know! Let's piss away more transistors on finer-grained control nobody has asked for, and slow down memory operations even more checking for that." ;-)

> - IMHO, the radix tree code is a bit simpler than Tim's
>   obmalloc-big-pool code.

Absolutely so. There's another way to look at this: if Vladimir Marangozov (obmalloc's original author) had used an arena radix tree from the start, would someone now get anywhere proposing a patch to change it to the current scheme? I'd expect a blizzard of -1 votes, starting with mine ;-)

> ...
> My feeling right now is that Tim's obmalloc-big-pool is the best
> design at this point. Using 8 KB or 16 KB pools seems to be better
> than 4 KB. The extra complexity added by Tim's change is not so
> nice. obmalloc is already extremely subtle and obmalloc-big-pool
> makes it more so.

Moving to bigger pools and bigger arenas are pretty much no-brainers for us, but unless pool size is increased there's no particular reason to pursue either approach - "ain't broke, don't fix".

Larry Hastings started a "The untuned tunable parameter ARENA_SIZE" thread here about two years ago, where he got a blizzard of
[Python-Dev] Re: radix tree arena map for obmalloc
Here are benchmark results for 64 MB arenas and 16 kB pools. I ran without the --fast option and on a Linux machine in single user mode. The "base" column is the obmalloc-big-pools branch with ARENA_SIZE = 64 MB and POOL_SIZE = 16 kB. The "radix" column is obmalloc_radix_tree (commit 5e00f6041) with the same arena and pool sizes.

+-------------------------+------------------+-----------------------------+
| Benchmark               | base (16kB/64MB) | radix (16KB/64MB)           |
+=========================+==================+=============================+
| 2to3                    | 290 ms           | 292 ms: 1.00x slower (+0%)  |
| crypto_pyaes            | 114 ms           | 116 ms: 1.02x slower (+2%)  |
| django_template         | 109 ms           | 106 ms: 1.03x faster (-3%)  |
| dulwich_log             | 75.2 ms          | 74.5 ms: 1.01x faster (-1%) |
| fannkuch                | 454 ms           | 449 ms: 1.01x faster (-1%)  |
| float                   | 113 ms           | 111 ms: 1.01x faster (-1%)  |
| hexiom                  | 9.45 ms          | 9.47 ms: 1.00x slower (+0%) |
| json_dumps              | 10.6 ms          | 11.1 ms: 1.04x slower (+4%) |
| json_loads              | 24.4 us          | 25.2 us: 1.03x slower (+3%) |
| logging_simple          | 8.19 us          | 8.37 us: 1.02x slower (+2%) |
| mako                    | 15.1 ms          | 15.1 ms: 1.01x slower (+1%) |
| meteor_contest          | 98.3 ms          | 97.1 ms: 1.01x faster (-1%) |
| nbody                   | 142 ms           | 140 ms: 1.02x faster (-2%)  |
| nqueens                 | 93.8 ms          | 93.0 ms: 1.01x faster (-1%) |
| pickle                  | 8.89 us          | 8.85 us: 1.01x faster (-0%) |
| pickle_dict             | 17.9 us          | 18.2 us: 1.01x slower (+1%) |
| pickle_list             | 2.68 us          | 2.64 us: 1.01x faster (-1%) |
| pidigits                | 182 ms           | 184 ms: 1.01x slower (+1%)  |
| python_startup_no_site  | 5.31 ms          | 5.33 ms: 1.00x slower (+0%) |
| raytrace                | 483 ms           | 476 ms: 1.02x faster (-1%)  |
| regex_compile           | 167 ms           | 169 ms: 1.01x slower (+1%)  |
| regex_dna               | 170 ms           | 171 ms: 1.01x slower (+1%)  |
| regex_effbot            | 2.70 ms          | 2.75 ms: 1.02x slower (+2%) |
| regex_v8                | 21.1 ms          | 21.3 ms: 1.01x slower (+1%) |
| scimark_fft             | 368 ms           | 371 ms: 1.01x slower (+1%)  |
| scimark_monte_carlo     | 103 ms           | 101 ms: 1.02x faster (-2%)  |
| scimark_sparse_mat_mult | 4.31 ms          | 4.27 ms: 1.01x faster (-1%) |
| spectral_norm           | 131 ms           | 135 ms: 1.03x slower (+3%)  |
+-------------------------+------------------+-----------------------------+
[Python-Dev] Re: radix tree arena map for obmalloc
On 2019-06-14, Tim Peters wrote:
> However, last I looked there Neil was still using 4 KiB obmalloc
> pools, all page-aligned. But using much larger arenas (16 MiB, 16
> times bigger than my branch, and 64 times bigger than Python currently
> uses).

I was testing it versus your obmalloc-big-pool branch and trying to make it a fair comparison. You are correct: 4 KiB pools and 16 MiB arenas. Maybe I should test with 16 KiB pools and 16 MiB arenas. That seems a more optimized setting for current machines and workloads.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SUN6QZQKRPI5WQZKSBZFSLBNG4MMV3YH/
[Python-Dev] Re: radix tree arena map for obmalloc
On 2019-06-15, Inada Naoki wrote:
> Oh, do you mean your branch doesn't have headers in each page?

That's right. Each pool still has a header but pools can be larger than the page size. Tim's obmalloc-big-pool idea writes something to the head of each page within a pool. The radix tree doesn't need that and actually doesn't care about OS page size.

BTW, the current radix tree doesn't even require that pools are aligned to POOL_SIZE. We probably want to keep pools aligned because other parts of obmalloc rely on that.

Here is the matchup of the radix tree vs the current address_in_range() approach.

- nearly the same in terms of performance. It might depend on OS and workload but based on my testing on Linux, they are very close. Would be good to do more testing but I think the radix tree is not going to be faster, only slower.

- radix tree uses a bit more memory overhead. Maybe 1 or 2 MiB on a 64-bit OS. The radix tree uses more as memory use goes up but it is a small fraction of total used memory. The extra memory use is the main downside at this point, I think.

- the radix tree doesn't read uninitialized memory. The current address_in_range() approach has worked very well but is relying on some assumptions about the OS (how it maps pages into the program address space). This is the only aspect where the radix tree is clearly better. I'm not sure this matters enough to offset the extra memory use.

- IMHO, the radix tree code is a bit simpler than Tim's obmalloc-big-pool code. That's not a big deal though as long as the code works and is well commented (which Tim's code is).

My feeling right now is that Tim's obmalloc-big-pool is the best design at this point. Using 8 KB or 16 KB pools seems to be better than 4 KB. The extra complexity added by Tim's change is not so nice. obmalloc is already extremely subtle and obmalloc-big-pool makes it more so.
Regards,
Neil

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZAPSJB6TOODRBRCF3T3CXMYSX3FLWDDI/
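For readers who want a feel for the lookup being compared in this thread, here is a toy Python model of a 3-level radix tree over arena addresses. It is a sketch under loud assumptions, not the code on the obmalloc_radix_tree branch: the bit widths, the dict-based nodes, and the names `Leaf`, `mark_arena` and `address_in_arena` are all made up for illustration. What it does try to show is the shape of the idea: pick bits out of the address three times, land on a leaf, and use the leaf's info about the (at most two) arenas overlapping that address range, without ever reading the memory the address points at.

```python
ARENA_BITS = 20                      # 1 MiB arenas (illustrative)
ARENA_SIZE = 1 << ARENA_BITS
LEVEL_BITS = 10                      # address bits consumed per level (illustrative)
LEVEL_MASK = (1 << LEVEL_BITS) - 1


class Leaf:
    """Info about one ARENA_SIZE-aligned address range."""

    def __init__(self):
        self.starts_at = None   # offset where an arena begins in this range
        self.ends_at = 0        # offset where an arena from the previous range ends


root = {}   # nested dicts stand in for arrays of child pointers


def _indexes(addr):
    chunk = addr >> ARENA_BITS               # which aligned range the address is in
    i3 = chunk & LEVEL_MASK
    i2 = (chunk >> LEVEL_BITS) & LEVEL_MASK
    i1 = chunk >> (2 * LEVEL_BITS)
    return i1, i2, i3


def _leaf(addr, create=False):
    i1, i2, i3 = _indexes(addr)
    if create:
        return root.setdefault(i1, {}).setdefault(i2, {}).setdefault(i3, Leaf())
    return root.get(i1, {}).get(i2, {}).get(i3)


def mark_arena(base):
    """Record an arena of ARENA_SIZE bytes starting at base (need not be aligned)."""
    offset = base & (ARENA_SIZE - 1)
    _leaf(base, create=True).starts_at = offset
    if offset:   # an unaligned arena spills into the next aligned range
        _leaf(base + ARENA_SIZE, create=True).ends_at = offset


def address_in_arena(addr):
    leaf = _leaf(addr)
    if leaf is None:
        return False
    offset = addr & (ARENA_SIZE - 1)
    return offset < leaf.ends_at or (
        leaf.starts_at is not None and offset >= leaf.starts_at)


mark_arena(0x7F0000040000)   # covers [0x7F0000040000, 0x7F0000140000)
print(address_in_arena(0x7F0000040000), address_in_arena(0x7F000013FFFF))
print(address_in_arena(0x7F000003FFFF), address_in_arena(0x7F0000140000))
```

The memory cost discussed above shows up here as the nodes created by `mark_arena`; in this model they are never freed.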
[Python-Dev] Re: Who uses libpython38.a on Windows?
On 2019-06-14 21:53, Steve Dower wrote:
> One of the most annoying steps in building the Windows installers is generating the libpython38.a file. It's annoying, because it requires having "generic enough" MinGW tools to ensure that the file is compatible with whatever version of MinGW might be trying to build against the regular Windows distribution.
>
> I would like to stop shipping this file in 3.8 and instead put the steps into the docs to show people how to generate them themselves (with the correct version of their tools):
>
>     gendef python38.dll > tmp.def
>     dlltool --dllname python38.dll --def tmp.def --output-lib libpython38.a -m i386:x86-64
>
> (Obviously the commands themselves are not complicated if you already have gendef and dlltool, but currently a normal CPython build system does not have these.)
>
> Before just doing this, I wanted to put out a request for information:
>
> * Do you rely (or know anyone who relies) on libpython38.a on Windows?
> * Are you able to add the two commands above to your build? If not, why not?

I'm able to build the regex module without it; in fact, I believe I've been able to do so since Python 3.5!

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/A7IXOTARTYJUNSCFAU3YY2VOVILC4EBY/
[Python-Dev] Re: radix tree arena map for obmalloc
[Inada Naoki, to Neil S]
> Oh, do you mean your branch doesn't have headers in each page?

That's probably right ;-) Neil is using a new data structure, a radix tree implementing a sparse set of arena addresses. Within obmalloc pools, which can be of any multiple-of-4KiB (on a 64-bit box) size, every byte beyond the pool header is usable for user data. In my patch, there is no new data structure, but it needs to store an "arena index" at the start of every page (every 4K bytes) within a pool.

I certainly _like_ Neil's code better. It's clean rather than excruciatingly tricky. The question is whether that's enough to justify the memory burden of an additional data structure (which can potentially grow very large). So I've been working with Neil to see whether it's possible to make it faster than my branch, to give it another selling point people actually care about ;-)

Should also note that Neil's approach never needs to read uninitialized memory, so we could throw away decades of (e.g.) valgrind pain caused by the current approach (which my patch builds on).

> https://bugs.python.org/issue32846
>
> As far as I remember, this bug was caused by cache thrashing (page
> header is aligned by 4K, so cache line can conflict often.)
> Or this bug can be caused by O(N) free() which is fixed already.

I doubt that report is relevant here, but anyone is free to try it with Neil's branch.

    https://github.com/nascheme/cpython/tree/obmalloc_radix_tree

However, last I looked there Neil was still using 4 KiB obmalloc pools, all page-aligned. But using much larger arenas (16 MiB, 16 times bigger than my branch, and 64 times bigger than Python currently uses).

But the `O(N) free()` fix may very well be relevant. To my eyes, while there was plenty of speculation in that bug report, nobody actually dug in deep to nail a specific cause. A quick try just now on my branch (which includes the `O(N) free()` fix) on Terry Reedy's simple code in that report shows much improved behavior, until I run out of RAM.

For example, roughly 4.3 seconds to delete 40 million strings in a set, and 9.1 to delete 80 million in a set. Not really linear, but very far from quadratic. In contrast, Terry saw nearly a quadrupling of delete time when moving from 32M to 64M strings.

So more than one thing was going on there, but it looks likely that the major pain was caused by quadratic-time arena list sorting.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GXNDFA7YO6NP3IWIW4IIYX5XEIOW2FJH/
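The "not really linear, but very far from quadratic" remark can be checked with a little arithmetic. The sketch below estimates the exponent k in time ~ n**k from two measurements. The helper name `growth_exponent` is made up here, and the 32M/64M case uses an exact 4x ratio as a stand-in, since the message only says "nearly a quadrupling".

```python
import math


def growth_exponent(n1, t1, n2, t2):
    """Exponent k in t ~ n**k implied by two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Timings quoted in the message: ~4.3 s for 40 million strings, ~9.1 s for 80 million.
k_branch = growth_exponent(40e6, 4.3, 80e6, 9.1)

# Assumed stand-in for "nearly a quadrupling" when going from 32M to 64M strings.
k_report = growth_exponent(32e6, 1.0, 64e6, 4.0)

print(f"branch: n**{k_branch:.2f}, original report: about n**{k_report:.2f}")
```

An exponent a little above 1 matches "not really linear", while a doubling of input that quadruples the time corresponds to an exponent of 2.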
[Python-Dev] Re: radix tree arena map for obmalloc
Oh, do you mean your branch doesn't have headers in each page?

https://bugs.python.org/issue32846

As far as I remember, this bug was caused by cache thrashing (page header is aligned by 4K, so cache line can conflict often.) Or this bug can be caused by O(N) free() which is fixed already.

I'll look at it next week.

On Sat, Jun 15, 2019 at 3:54 AM Neil Schemenauer wrote:
>
> I've been working on this idea for a couple of days. Tim Peters has
> been helping me out and I think it has come far enough to get some
> more feedback. It is not yet a good replacement for the current
> address_in_range() test. However, performance wise, it is very
> close. Tim figures we are not done optimizing it yet so maybe it
> will get better.
>
> Code is available on my github branch:
>
> https://github.com/nascheme/cpython/tree/obmalloc_radix_tree
>
> Tim's "obmalloc-big-pools" is what I have been comparing it to. It
> seems 8 KB pools are faster than 4 KB. I applied Tim's arena
> thrashing fix (bpo-37257) to both branches. Some rough (--fast)
> pyperformance benchmark results are below.
[Python-Dev] Who uses libpython38.a on Windows?
One of the most annoying steps in building the Windows installers is generating the libpython38.a file. It's annoying, because it requires having "generic enough" MinGW tools to ensure that the file is compatible with whatever version of MinGW might be trying to build against the regular Windows distribution.

I would like to stop shipping this file in 3.8 and instead put the steps into the docs to show people how to generate them themselves (with the correct version of their tools):

    gendef python38.dll > tmp.def
    dlltool --dllname python38.dll --def tmp.def --output-lib libpython38.a -m i386:x86-64

(Obviously the commands themselves are not complicated if you already have gendef and dlltool, but currently a normal CPython build system does not have these.)

Before just doing this, I wanted to put out a request for information:

* Do you rely (or know anyone who relies) on libpython38.a on Windows?
* Are you able to add the two commands above to your build? If not, why not?

Thanks,
Steve

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BYU35PWDNJ54COLNCFCSY3MCFYPF4KUK/
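For anyone wondering what "adding the two commands to your build" might look like from a Python build script, here is a minimal sketch. It mirrors the two commands exactly as written in the message. The function names `import_lib_commands` and `generate_import_lib` are hypothetical, and it assumes gendef and dlltool are on PATH and that python38.dll is in the current directory.

```python
import shutil
import subprocess


def import_lib_commands(dll="python38.dll", lib="libpython38.a",
                        def_file="tmp.def", machine="i386:x86-64"):
    """Return the two commands from the message as argument lists."""
    gendef = ["gendef", dll]   # its stdout is redirected to def_file when run
    dlltool = ["dlltool", "--dllname", dll, "--def", def_file,
               "--output-lib", lib, "-m", machine]
    return gendef, dlltool


def generate_import_lib(dll="python38.dll", lib="libpython38.a",
                        def_file="tmp.def", machine="i386:x86-64"):
    """Run both steps, failing early if either tool is missing."""
    gendef, dlltool = import_lib_commands(dll, lib, def_file, machine)
    for tool in (gendef[0], dlltool[0]):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    with open(def_file, "w") as out:
        # Equivalent of the shell redirect: gendef python38.dll > tmp.def
        subprocess.run(gendef, stdout=out, check=True)
    subprocess.run(dlltool, check=True)


if __name__ == "__main__":
    for command in import_lib_commands():
        print(" ".join(command))
```

Keeping command construction separate from execution makes it easy to log or dry-run the steps on a machine that lacks the MinGW tools.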
[Python-Dev] radix tree arena map for obmalloc
I've been working on this idea for a couple of days. Tim Peters has been helping me out and I think it has come far enough to get some more feedback. It is not yet a good replacement for the current address_in_range() test. However, performance wise, it is very close. Tim figures we are not done optimizing it yet so maybe it will get better.

Code is available on my github branch:

    https://github.com/nascheme/cpython/tree/obmalloc_radix_tree

Tim's "obmalloc-big-pools" is what I have been comparing it to. It seems 8 KB pools are faster than 4 KB. I applied Tim's arena thrashing fix (bpo-37257) to both branches. Some rough (--fast) pyperformance benchmark results are below.

+-------------------------+--------------------+-----------------------------+
| Benchmark               | obmalloc-big-pools | obmalloc_radix              |
+=========================+====================+=============================+
| crypto_pyaes            | 168 ms             | 170 ms: 1.01x slower (+1%)  |
| hexiom                  | 13.7 ms            | 13.6 ms: 1.01x faster (-1%) |
| json_dumps              | 15.9 ms            | 15.6 ms: 1.02x faster (-2%) |
| json_loads              | 36.9 us            | 37.1 us: 1.01x slower (+1%) |
| meteor_contest          | 141 ms             | 139 ms: 1.02x faster (-2%)  |
| nqueens                 | 137 ms             | 140 ms: 1.02x slower (+2%)  |
| pickle_dict             | 26.2 us            | 25.9 us: 1.01x faster (-1%) |
| pickle_list             | 3.91 us            | 3.94 us: 1.01x slower (+1%) |
| python_startup_no_site  | 8.00 ms            | 7.78 ms: 1.03x faster (-3%) |
| regex_dna               | 246 ms             | 241 ms: 1.02x faster (-2%)  |
| regex_v8                | 29.6 ms            | 30.0 ms: 1.01x slower (+1%) |
| richards                | 93.9 ms            | 92.7 ms: 1.01x faster (-1%) |
| scimark_fft             | 525 ms             | 531 ms: 1.01x slower (+1%)  |
| scimark_sparse_mat_mult | 6.32 ms            | 6.24 ms: 1.01x faster (-1%) |
| spectral_norm           | 195 ms             | 198 ms: 1.02x slower (+2%)  |
| sqlalchemy_imperative   | 49.5 ms            | 50.5 ms: 1.02x slower (+2%) |
| sympy_expand            | 691 ms             | 695 ms: 1.01x slower (+1%)  |
| unpickle_list           | 5.09 us            | 5.32 us: 1.04x slower (+4%) |
| xml_etree_parse         | 213 ms             | 215 ms: 1.01x slower (+1%)  |
| xml_etree_generate      | 134 ms             | 136 ms: 1.01x slower (+1%)  |
| xml_etree_process       | 103 ms             | 104 ms: 1.01x slower (+1%)  |
+-------------------------+--------------------+-----------------------------+

Not significant (34): 2to3; chameleon; chaos; deltablue; django_template; dulwich_log; fannkuch; float; go; html5lib; logging_format; logging_silent; logging_simple; mako; nbody; pathlib; pickle; pidigits; python_startup; raytrace; regex_compile; regex_effbot; scimark_lu; scimark_monte_carlo; scimark_sor; sqlalchemy_declarative; sqlite_synth; sympy_integrate; sympy_sum; sympy_str; telco; unpack_sequence; unpickle; xml_etree_iterparse

Message archived at https://mail.python.org/a
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (2019-06-07 - 2019-06-14)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    7022 (+15)
  closed 42013 (+70)
  total  49035 (+85)

Open issues with patches: 2826

Issues opened (54)
==================

#36607: asyncio.all_tasks() crashes if asyncio is used in multiple thr
https://bugs.python.org/issue36607  reopened by asvetlov

#36888: Create a way to check that the parent process is alive for dea
https://bugs.python.org/issue36888  reopened by vstinner

#37136: Travis CI: Documentation tests fails with Sphinx 2.1
https://bugs.python.org/issue37136  reopened by njs

#37200: PyType_GenericAlloc might over-allocate memory
https://bugs.python.org/issue37200  opened by nascheme

#37201: fix test_distutils failures for Windows ARM64
https://bugs.python.org/issue37201  opened by Paul Monson

#37205: time.perf_counter() is not system-wide on Windows, in disagree
https://bugs.python.org/issue37205  opened by kh90909

#37206: Incorrect application of Argument Clinic to dict.pop()
https://bugs.python.org/issue37206  opened by rhettinger

#37207: Use PEP 590 vectorcall to speed up calls to range(), list() an
https://bugs.python.org/issue37207  opened by Mark.Shannon

#37208: Weird exception behaviour in ProcessPoolExecutor
https://bugs.python.org/issue37208  opened by Iceflower

#37209: Add what's new entries for pickle enhancements
https://bugs.python.org/issue37209  opened by pitrou

#37211: obmalloc: eliminate limit on pool size
https://bugs.python.org/issue37211  opened by tim.peters

#37212: ordered keyword arguments in unittest.mock.call repr and error
https://bugs.python.org/issue37212  opened by xtreak

#37214: Add new EncodingWarning warning category: emitted when the loc
https://bugs.python.org/issue37214  opened by vstinner

#37218: Default hmac.new() digestmod has not been removed from documen
https://bugs.python.org/issue37218  opened by Alex.Willmer

#37220: test_idle crash on Windows 2.7 when run with -R:
https://bugs.python.org/issue37220  opened by zach.ware

#37221: PyCode_New API change breaks backwards compatibility policy
https://bugs.python.org/issue37221  opened by ncoghlan

#37222: urllib missing voidresp breaks CacheFTPHandler
https://bugs.python.org/issue37222  opened by danh

#37224: test__xxsubinterpreters failed on AMD64 Windows8.1 Refleaks 3.
https://bugs.python.org/issue37224  opened by vstinner

#37225: Document BaseException constructor
https://bugs.python.org/issue37225  opened by Hong Xu

#37226: Asyncio Fatal Error on SSL Transport - IndexError Deque Index
https://bugs.python.org/issue37226  opened by ben.brown

#37228: UDP sockets created by create_datagram_endpoint() allow by def
https://bugs.python.org/issue37228  opened by Jukka Väisänen

#37231: Optimize calling special methods
https://bugs.python.org/issue37231  opened by jdemeyer

#37232: Parallel compilation fails because of low ulimit.
https://bugs.python.org/issue37232  opened by kulikjak

#37233: Use _PY_FASTCALL_SMALL_STACK for method_vectorcall
https://bugs.python.org/issue37233  opened by jdemeyer

#37235: urljoin behavior unclear/not following RFC 3986
https://bugs.python.org/issue37235  opened by Matthew Kenigsberg

#37236: fix test_complex for Windows arm64
https://bugs.python.org/issue37236  opened by Paul Monson

#37237: python 2.16 from source on Ubuntu 18.04
https://bugs.python.org/issue37237  opened by Jilguero ostras

#37242: sub-process would be terminated when registered finalizers ar
https://bugs.python.org/issue37242  opened by mrqianjinsi

#37243: test_sendfile in asyncio crashes when os.sendfile() is not sup
https://bugs.python.org/issue37243  opened by Michael.Felt

#37244: test_multiprocessing_forkserver: test_resource_tracker() faile
https://bugs.python.org/issue37244  opened by vstinner

#37245: Azure Pipeline 3.8 CI: multiple tests hung and timed out on ma
https://bugs.python.org/issue37245  opened by vstinner

#37246: http.cookiejar.DefaultCookiePolicy should use current timestam
https://bugs.python.org/issue37246  opened by xtreak

#37247: swap distutils build_ext and build_py commands to allow proper
https://bugs.python.org/issue37247  opened by jlvandenhout

#37248: support conversion of `func(**{} if a else b)`
https://bugs.python.org/issue37248  opened by Shen Han

#37250: C files generated by Cython set tp_print to NULL: PyTypeObject
https://bugs.python.org/issue37250  opened by vstinner

#37251: Mocking a MagicMock with a function spec results in an AsyncMo
https://bugs.python.org/issue37251  opened by jcline

#37252: devpoll test failures on Solaris
https://bugs.python.org/issue37252  opened by kulikjak

#37254: POST large file to server (using http.server.CGIHTTPRequestHan
https://bugs.python.org/issue37254  opened by shajianrui

#37256: urllib.request.Request documentation erroneously refers to the
https://bugs.python.org/issue3
[Python-Dev] Re: PyAPI_FUNC() is needed to private APIs?
On 2019-06-13 18:03, Inada Naoki wrote:
> We don't provide method calling API which uses optimization same to
> LOAD_METHOD. Which may be like this:
>
> /* methname is Unicode, nargs > 0, and args[0] is self. */
> PyObject_VectorCallMethod(PyObject *methname, PyObject **args,
>                           Py_ssize_t nargs, PyObject *kwds)

I agree that this would be useful. Minor nitpick: we spell "Vectorcall" with a lower-case "c". There should also be a _Py_Identifier variant, _PyObject_VectorcallMethodId.

The implementation should be like vectorcall_method from Objects/typeobject.c except that _PyObject_GetMethod should be used instead of lookup_method() (the difference is that the code for special methods like __add__ only looks at the attributes of the type, not the instance).

> (Would you try adding this? Or may I?)

Of course you may. Just let me know if you're working on it.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FLF74RH3XO4BYOTW2CRRD2GO23P2YUOO/
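The type-versus-instance distinction mentioned above is visible from pure Python, without touching the C API. The following is only an illustration of that lookup difference (the class `Widget` and its attributes are invented for the example); it does not call or model the proposed C function itself.

```python
class Widget:
    def __len__(self):
        return 1

    def size(self):
        return "type-level size"


w = Widget()
# Instance attributes that shadow the class-level definitions.
w.__len__ = lambda: 99
w.size = lambda: "instance-level size"

type_only = len(w)            # special-method path: consults type(w) only
normal_dunder = w.__len__()   # ordinary attribute lookup: sees the instance attribute
normal_call = w.size()        # ordinary method call: the instance attribute wins

print(type_only, normal_dunder, normal_call)
```

A general method-calling API has to behave like the last two lines, which is why a lookup that only inspects the type is not suitable for it.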