[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
Larry Hastings added the comment: One final aside. Let me preface this by saying: I'm not proposing the following for graphlib.TopologicalSort. It's obviously too late to change that object this much. It's just something I'm thinking about--maybe I'll use this in my own library. Where we are now, the graphlib.TopologicalSort object is simultaneously a static representation of a graph--which the object doesn't provide a public API for you to examine!--and a stateful single-use iterator over that graph. My proposed change to the API seems to increase the tension between these two sets of semantics. Perhaps a better set of semantics, that more successfully maps my use case to the graph object, would be as follows: * you can add() nodes and edges at any time. * get_ready() always returns the current list of nodes with no prececessors. There is no internal "we already told you about this one" state. * replace done() with remove(), which removes the node and all edges from/to it from the graph. * static_order() is still fine. I think this would make it easy to reason about the object's behavior, and would be a better match to my use case where you're continually adding (and removing?) nodes, not just during an initial "populate the graph" phase. Again, not appropriate to consider for graphlib.TopologicalSort. -- ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
Larry Hastings added the comment: I agree that the API should have as few surprises as possible. AFAICT you haven't made any terminal pronouncements like "it's impossible to add this feature without too many unacceptable surprises", so I'll proceed assuming we can find an API (and semantics) that we're all happy with. I've come around on the idea that forgetting "done" nodes is too surprising. The least surprising behavior is: once we've added a node to the graph, the graph remembers it. Now I'll propose a second simple rule, which you've already touched on: you can't add a dependency after a node has been returned by get_ready(). Attempting this always throws an exception; which exception specifically TBD. "Tough beans". I think those two rules are easy enough to remember and to reason about. It meaningfully expands the utility of the library with minimal surprise. The library behaves identically for users of the existing API but permits new use cases too. Adding this functionality would therefore mean fewer users would discover too late their use case isn't supported. As far as my "long-lived graph objects will consume more and more memory over time" caveat, there's a better solution than "forget": graph.remove(*nodes). (My version already has a "remove" method, and I forgot that graphlib's doesn't.) Allowing the user to remove a node from the graph gives them explicit control, and the semantics should be obvious and unsurprising; if you--the user--remove a node, and later you--the user--re-add that node to the graph, it behaves identically to any other node the graph has never seen before. Removing a node intuitively removes all edges to that node. Two notes on "remove" if we decide to go that route. First, I'd ensure you can remove a node at any time. Nodes have three externally visible states wrt TopologicalSort: 1) added but not published by get_ready, 2) published by get_ready but not returned using done, and 3) done. You should be able to remove a node in any of those three states. Removing a node in 2) should be equivalent to calling done before calling remove; that is, if you're removing the node anyway, you don't need to call done. Second, the current underlying implementation would make remove really slow. Nodes don't remember their predecessors, only their successors, so removing a node would be O(n). If we expected remove to get a lot of use, we'd probably want to change how the graph is stored to speed up this operation. -- ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
Larry Hastings added the comment: I'm not sure I follow you. What do you suggest graphlib is guessing at? I thought we were discussing what graphlib's (documented, intentional) behavior should be; if there was any guessing going on, it was us doing it, guessing at what behavior the user would prefer. I appreciate you corresponding on the issue, but I'm having difficulty understanding what light you're shining on the problem. -- ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
Larry Hastings added the comment: Having slept on it, I think the footgun is slightly bigger than I gave it credit for. Also the problem of nodes lingering forever is easily solved: give the user control. My next iteration will keep the done nodes around, but I'll also add a forget() method that compels the graph to forget all done nodes. -- ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
Larry Hastings added the comment: > Assuming we do want to be able to add() after a get_ready(), is there > a reason that "forgetting" already-produced nodes is the correct > behavior, as opposed to remembering all nodes ever added, and > raising iff the addition creates a cycle among all nodes ever > added or depends on an already-yielded node? I'm not sure "correct" applies here, because I don't have a sense that one behavior is conceptually more correct than the other. But in implementing my API change, forgetting about the done nodes seemed more practical. The "benefit" to remembering done nodes: the library can ignore dependencies to them in the future, forever. Adding a dependency to a node that's already been marked as "done" doesn't make much conceptual sense to me, but as a practical thing maybe it's useful? I'm not sure, but it doesn't seem all that useful. I can only come up with a marginal reason why remembering done nodes is useful. Let's say all your tasks fan out from some fundamental task, like "download master list of work" or something. One of your tasks might discover additional tasks that need to run, and conceptually those tasks might depend on your "download master list of work" task. If the graph remembers the done list forever, then adding that dependency is harmless. If the graph forgets about done nodes, then adding that dependency could re-introduce that task to the graph, which could goof things up. So maybe it's a little bit of a footgun? But on the other hand: you already know you're running, and you're a task that was dependent on the master list of work, which means you implicitly know that dependency has been met. So just skip adding the redundant dependency and you're fine. On the other hand, forgetting about the nodes has a definite practical benefit: the graph consumes less memory. If you use a graph object for a long time, the list of done nodes it's holding references to would continue to grow and grow and grow. If we forget about done nodes, we free up all that memory, and done membership testing maybe gets faster. I guess I'm not married to the behavior. If someone had a great conceptual or practical reason why remembering the done nodes forever was better, I'd be willing to listen to reason. -- ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
Larry Hastings added the comment: I'm using my graph library to manage a list of tasks that need doing in some sort of proper order. One task may spawn other tasks at runtime, and we don't necessarily know what the tasks will be until runtime. It's way more convenient to simply add such tasks on demand, rather than trying to preemptively pre-add all such possible tasks before preparing the graph, or creating additional graphs. For example, consider a tool that downloads and digests zip files full of music from an online store. Your initial tasks might represent "download the zip file", then "decompress and examine the contents of the zip file". You could then iterate over the contents of the zip file, adding different tasks based on what you find--one pipeline of tasks for media files (FLAC/MP3/OGG/etc), another for the playlist, a third if you don't *find* a playlist, a fourth for image files, etc. (Not that you'd necessarily write such a tool this way, but it's at least plausible.) The new nodes needn't be connected to the existing nodes for this to still be useful. You could reuse the same graph object for all your tasks. -- ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47145] Improve graphlib.TopologicalSort by removing the prepare step
New submission from Larry Hastings : I've maintained my own topological sort class for a while now. The API I came up with (or was it Tim and I?) was adopted for graphlib.TopologicalSort(). Some of my method names are slightly different, but the APIs are so similar, I can do a little renaming and run Lib/test/test_graphlib using my implementation. (For clarity I'm going to write the rest of this issue using the names from graphlib.) Recently I made an improvement to my version: it no longer has the modal behavior where there's a "before prepare" phase where you add nodes and an "after prepare" phase where you can call get_ready and done. Instead, in my new and improved version the graph is always maintained in a "ready" state. You can call add, get_ready, and done in any order, as much as you like. prepare doesn't do anything besides the cycle check--but read on! This approach required tweaking some semantics behind the scenes. My graph object now maintains an internal "dirty" flag. It's set to True after any edges are added, and only gets cleared after checking for cycles. prepare runs this cycle check, but get_ready *also* runs this cycle check. (Though in both cases, only when the dirty flag is set, of course.) In theory this means you no longer need the prepare call. However, it's still useful, because you can call it to check for a CycleError. So, on the one hand, this means that get_ready can throw a CycleError, which it never did before. On the other hand, it won't throw for existing users of the library, because they never add edges after their prepare call. So this semantic change shouldn't break existing users, assuming they're not misusing the existing API. Another wrinkle: adding a dependency on a node that's already been handed out by get_ready (but hasn't been returned by done) is ignored. Again, not something existing users of graphlib can do, so this shouldn't break code. There's one final semantic change I made worth knowing about: when you return a node via done, the graph simply forgets about it. This means you could add it a second time (!) and it could go for a complete ride through the graph again, wh! This also means that when all nodes have been returned by done, the graph is essentially returned to its initial pristine state. This too seems completely harmless, because with the existing library it's illegal to add nodes to a graph after calling prepare, so nobody's doing it, so it won't break any code. I'm happy to contribute my version. But I was lazy and didn't worry about speed or memory implementation; it's strewn with sets and things. I figured I could always optimize it later. But "later" could become "now" if we were interested in trying to merge this behavior into Python's graphlib. Interested? -- assignee: larry components: Library (Lib) messages: 416182 nosy: eric.smith, larry, pablogsal, tim.peters priority: normal severity: normal stage: needs patch status: open title: Improve graphlib.TopologicalSort by removing the prepare step type: enhancement versions: Python 3.11 ___ Python tracker <https://bugs.python.org/issue47145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: Ooh, good one. I don't know anybody in the Django project to contact though. Anybody have any leads? -- ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: I heard back from both Samuel Colvin (Pydantic) and Sebastián RamÃrez (FastAPI). They said neither of them use update_wrapper with partial objects. They also took a peek in a bunch of other projects (FastAPI, Typer, SQLModel, Asyncer, SQLAlchemy, Trio, and AnyIO) and nobody was doing it. So honestly it seems like nobody (but me!) calls update_wrapper on partial objects, and we can just fix it. Graham, any final thoughts before we start pulling levers and merging PRs? For now I just want to fix this bug. I'm in favor of re-engineering the relevant objects so they write their own __signature__ objects, so inspect.Signature doesn't have to understand the internals of objects from other modules. But maybe we save that for another day. -- ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: > Performance wise... The SHA series have hardware acceleration on > modern CPUs and SoCs. External libraries such as OpenSSL are in a > position to provide implementations that make use of that. Same with > the Linux Kernel CryptoAPI (https://bugs.python.org/issue47102). > > Hardware accelerated SHAs are likely faster than blake3 single core. > And certainly more efficient in terms of watt-secs/byte. I don't know if OpenSSL currently uses the Intel SHA1 extensions. A quick google suggests they added support in 2017. And: * I'm using a recent CPU that AFAICT supports those extensions. (AMD 5950X) * My Python build with BLAKE3 support is using the OpenSSL implementation of SHA1 (_hashlib.openssl_sha1), which I believe is using the OpenSSL provided by the OS. (I haven't built my own OpenSSL or anything.) * I'm using a recent operating system release (Pop!_OS 21.10), which currently has OpenSSL version 1.1.1l-1ubuntu1.1 installed. * My Python build with BLAKE3 doesn't support multithreaded hashing. * In that Python build, BLAKE3 is roughly twice as fast as SHA1 for non-trivial workloads. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: I can't answer why the Rust one is so much larger--that's a question for Jack. But the blake3-py you built might (should?) have support for SIMD extensions. See the setup.py for how that works; it appears to at least try to use the SIMD extensions on x86 POSIX (32- and 64-bit), x86_64 Windows, and 64-bit ARM POSIX. If you were really curious, you could run some quick benchmarks, then hack your local setup.py to not attempt adding support for those (see "portable code only" in setup.py) and do a build, and run your benchmarks again. If BLAKE3 got a lot slower, yup, you (initially) built it with SIMD extension support. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: The Rust version is already quite "lean". And it can be much faster than the C version, because it supports internal multithreading. Even without multithreading I bet it's at least a hair faster. Also, Jack has independently written a Python package based around the C version: https://github.com/oconnor663/blake3-py/tree/master/c_impl so my making one would be redundant. I have no interest in building standalone BLAKE3 PyPI packages for Raspberry Pi or Android. My goal was for BLAKE3 to be one of the "included batteries" in Python--which would have meant it would, eventually, be available on the Raspberry Pi and Android builds that way. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Ok, I give up. -- resolution: -> rejected stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: > Given that I don't want to see us gain new vendored copies of > significant but non-critical third party hash code in our tree > (Modules/_blake3/impl/ in PR 31686) for anything but a known > fixed term need (ex: the sha2 libtomcrypt code is gone from > our tree as was clearly going to happen from the start), > the only way I think we should include blake3 support is if > there is already a plan for that code to leave our tree in > the future with a high probability of success. You've said what you want, but not why. It sounds like you are against merging the BLAKE3 PR containing its own impl. Why? -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Jack: I've updated the PR, improving compatibility with the "blake3" package on PyPI. I took your notes, and also looked at the C module you wrote. The resulting commit is here: https://github.com/python/cpython/pull/31686/commits/37ce72b0444ad63fd1989ad36be5f7790e51f4f1 Specifically: * derive_key_context is now a string, which I internally encode into UTF-8. * I added the AUTO member to the module, set to -1. * max_threads may not be zero; it can be >= 1 or AUTO. * I added the reset() method. Some additional thoughts, both on what I did and on what you did: * In your new() method, your error string says "keys must be 32 bytes". I went with "key must be exactly 32 bytes"; the name of the parameter is "key", and I think "exactly" makes the message clearer. * In my new() method, I complain if the derive_key_context is zero-length. In your opinion, is that a good idea, or is that overly fussy? * In your copy() method, you hard-code Blake3Type. It's considered good form to use type(self) here, in case the user is calling copy on a user-created subclass of Blake3Type. * In your copy() method, if the original has a lock created, you create a lock in the copy too. I'm not sure why you bother; I leave the lock member uninitialized, and let the existing logic create the lock in the copy on demand. * In the Blake3_methods array, you list the "update" method twice. I suspect this is totally harmless, but it's unnecessary. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: You make a good point. I filed a separate bug (#46846) suggesting that partial objects should set their own annotations and signature. I agree that objects performing such magic should take care of these details themselves, rather than requiring the inspect module to have workarounds based on deep knowledge of these other modules' inner workings. -- ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: Nobody I've nosied on this issue recently has expressed any opinion on the matter. I'm gonna try one more person: Graham Dumpleton, the maintainer of "wrapt", Python's premier function-wrapping. Graham, care to express any opinions about this issue? Can we fix it without causing widespread wailing and gnashing of teeth? Do you think people are depending on the current how-can-you-describe-it-as-anything-but-broken behavior? -- nosy: +grahamd ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: I only did my experiments with the _wrappers.c Christian gave me. If that's not the shipping version of wrapt, and wrapt doesn't exhibit this behavior in 3.10, then that's good. I agree you can wait to address this behavior until it affects code you actually plan to ship. -- ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: Graham: I'm happy to help. I can write a more elaborate explanation of what's happening, or answer questions, whatever you like. I'm a fan of wrapt so I'd certainly like to see this issue resolved. And, seeing as you're the author and maintainer of wrapt, I assure you you've got more battle-won experience with object proxying than most developers--myself included. You're *absolutely* the right person to ameliorate this issue! -- ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: (It's also possible your workaround where wrapt explicitly defines an "__annotations__" getset on every child of ObjectProxy is the right fix. I don't know; I don't know the fine points of attribute access, like when does it access getsets on the type, vs getsets on base types, vs setattro, etc.) -- ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: This isn't a CPython bug. It's a change in CPython behavior that wrapt needs to accommodate. In particular, it isn't 'causing a regression for C subclasses of heap types when the parent class has an "__annotations__" descriptor', nor do child classes have any difficulty inheriting the descriptor of their parent classes. That's unsurprising; after all, if I had broken child classes inheriting the descriptors of their parent classes, a lot more would have broken than just wrapt. The problem is in WraptObjectProxy_setattro(). (Which--just to drive my point home--*is* getting called when you set __annotations__ on one of wrapt's various proxy objects.) WraptObjectProxy_setattro() proxies setattr calls for wrapped objects to the original object--if "o" is a wrapt proxy object wrapping "fn", and you run "o.__annotations__ = x", it should actually execute "fn.__annotations__ = x" under the covers. Except WraptObjectProxy_setattro() executes *this* code first, starting at line 1531 in my copy of _wrapped.c: if (PyObject_HasAttr((PyObject *)Py_TYPE(self), name)) return PyObject_GenericSetAttr((PyObject *)self, name, value); If the *type* has the attribute, then it doesn't proxy the setattr to the wrapped object. Instead it does a "generic setattr" on the object itself. PyObject_HasAttr works by attempting a getattr on the type. If that getattr call succeeds, PyObject_HasAttr returns true. The type here is FunctionWrapper (WraptFunctionWrapper_Type). Since we're now looking it up on this type object, we use the type of the type object, which is "type", to access the attribute. And getting the "__annotations__" attribute from an object of type "type" means calling type_get_annotations(), a new descriptor which ensures that the annotations dict always exists, which means the HasAttr call succeeds and returns true. In short, this change to the semantics of the "__annotations__" attribute means wrapt no longer proxies the setattr to the underlying wrapped object when setting the "__annotations__" attribute on *any* of its objects. In my opinion, wrapt needs to accommodate this new behavior. In my testing I changed the above code to this: if (!annotations_str) { annotations_str = PyUnicode_InternFromString("__annotations__"); } if (PyObject_RichCompareBool(name, annotations_str, Py_NE) && PyObject_HasAttr((PyObject *)Py_TYPE(self), name)) return PyObject_GenericSetAttr((PyObject *)self, name, value); I also declared static PyObject *annotations_str = NULL; at the top of the function. With that change in place, the tests now passed. My hunch is, this approach is more or less what wrapt should do. It *might* be undersophisticated; it's possible that there are classes out there playing their own weird descriptor tricks with the "__annotations__" attribute. Perhaps the fix needs to be on a case-by-case basis, based on the type of the wrapped object. Anyway this is obviously up to Graham, which is for the best anyway--he has far more experience than I do with this sort of object proxying wizardry. -- resolution: -> third party stage: test needed -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46930] Iterating over cls.__dict__ in classmethod causes RuntimeError when printing __annotations__
Larry Hastings added the comment: When accessing __annotations__ *in a class without annotations*, and *for the first time*. And your workaround seems reasonable. -- ___ Python tracker <https://bugs.python.org/issue46930> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Right, and I did say "(or BDFL)". Apparently you didn't bother to consult with the BDFL in advance, or at least not in the usual public venues--I haven't found a record of such a conversation on the bpo issue, nor in python-dev. BTW you simultaneously proposed adding SHA3/SHAKE. The total kloc for all this work was over 26k; you didn't mention any discomfort with the size of these patches at the time in public correspondance. In fact, quite the opposite. On 2016/05/28 you said: > I also don't get your obsession with lines of code. The gzip and expat > are far bigger than the KeccakCodePackage. https://mail.python.org/archives/list/python-...@python.org/message/3YHVN2I74UQC36AVY5BGRJJUE4PMU6GX/ -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Jack O'Connor: > Was any of the experimental C extension code [...] useful to you? > I was wondering if it could be possible to copy blake3module.c from > there verbatim. I concede I didn't even look at it. The glue code to mate the library with the CPython runtime wasn't the hard part. And, Python has its own way of working (Argument Clinic etc). I did what seemed easiest, which was to start with CPython's BLAKE2 support and hack it up. Given that I had it working in maybe half a day, this seems reasonable. I should take a peek at your experimental build though, just to confirm I got the interfaces right. > The setup.py build there also has working Windows and NEON support. The CPython build process doesn't use setup.py to build extensions on Windows. I haven't looked recently, but back in the day they were built like any other DLL, using its own "project file". This will require someone to use the MSVS GUI to create a new project, add files, etc etc. I assume they can just use the .asm files for the SIMD extensions so hopefully this will be straightforward. As for NEON support, the technique I used theoretically should Just Work. I look forward to hearing back from someone building on ARM. (I have a cross-compiler setup for building for MiSTer, which is an ARM platform with NEON support. But the cross-compiler is a royal PITA to use... it's a Docker image, and requires a bunch of manual configuration, and I haven't touched any of that in more than a year.) You seem to have done at least a partial code review of my PR, given your comments to follow. I appreciate it! I tried to add you as a "reviewer" but GitHub wouldn't permit it--I assume this is some sort of permissions problem. Still, it'd be nice if you were able to do future code reviews using the GitHub review tool; are you permitted to use that for my PR? > - My derive_key_context parameter requires a string and refuses to > accept bytes. This is consistent with our Rust and C APIs (though the C > API does include a _raw version specifically for bindings, which we're > using here). I was considering going the other way with it actually, requiring bytes. Note that Python has first-class support for hard-coded bytes strings: b = blake3.blake3(derive_key_context=b"My Funny Valentine (1984)") The C interface takes "char *", not a "wchar_t *", and this seemed like the most direct and relatable way to reflect that. But I'm not militant about this, and I'm willing to change the interface to require an actual string (as in Unicode). I note that your C API already dictates that Unicode be encoded as UTF-8, so we can do that, and if the encoding fails the user can deal with it. > - I've included an `AUTO` constant that provides a special value (-1) > for the `max_threads` argument, and I explicitly don't support > `max_threads=0`. I can do that too; again, I prefer the 0 there, but I'm not militant about it. However, it would make sense to me if you had that constant somewhere in the BLAKE3 C .h files, which AFAICT you currently don't. > - I enforce that the `data` arguments are positional-only and that the > other keyword arguments are keyword-only. I think this is consistent > with the rest of hashlib. I suspect hashlib is mostly like that, due to being chock full of legacy code. But I don't see why that's necessary. I think permitting "data" to be a named argument is fine. So unless you have a strong conviction about it--which I bet you don't--I'll leave it as positional-or-keyword. There are rare circumstances where positional-only arguments are useful; this isn't one of them. > - I include a `.reset()` method. I don't mind adding that. > - Unrelated to tests: I haven't made any attempt to zero memory in my > `dealloc` function. But if that's what other hashlib functions do, > then I'm certainly in favor of doing it here too. I inherited that from the BLAKE2 code I carved up to make the BLAKE3 version. And yeah, it made sense to me, so I kept it. Christian Heimes: > GH-31686 is a massive patch set. I'm feeling uncomfortable adding > such much new code for a new hashing algorithm. Did you ask the > Steering Council for approval? I didn't. Like most hashing algorithms, BLAKE3 doesn't allocate memory and doesn't perform any I/O. All it does is meditate on the data you pass in, and write to various pre-allocated fixed-size buffers. As large codebases go this seems pretty harmless, almost inert. The Modules/_blake3/impl directory is about 32kloc. I note that the Modules/_blake2/impl directory you checked in in 2016 is about 21kloc, and you didn't require Steering Council (or BDFL) approval fo
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Also, for what it's worth: I just ran my checksum benchmarker using a freshly built python a la my PR. Here are my results when hashing 462183782 bytes (dicey-dungeons-linux64.zip): hash algorithm timebytes/sec size hash - -- -- - blake3 0.18976406 2435570699 64 0c72d0a07ba767b75b0c99fed38fda... sha1 0.21692419 2130623562 40 9fb83614394cd3b39e42b1d9f84a3c... sha256 0.26399648 1750719480 64 2320129f2545ff8606d3db1d706d86... sha224 0.28097475 1644929957 56 133b5da8d8b387f2bcfd69b0c73ed8... md4 0.34185237 1351998195 32 dea7585ea9fa4520687bab1dc671858e blake2b 0.53724666 860282275 128 e3653f33858a83b386c2fe865280a1... md5 0.58128106 795112407 32 299440e1968cf8f8abc288bac8c0a4fa sha512_224 0.64589952 715566066 56 413d48b782f114870ef80815540678... sha384 0.64645893 714946859 96 b1c1cd96cef79c15f2171b8aa81304... sha512 0.65424513 706438241 128 e7d0cec3fe8b73d1534a7bdb484176... sha512_256 0.68371638 675987586 64 3f58faba70cea4d6ea8a8371e71bbb... md5-sha1 0.80361958 575127576 72 299440e1968cf8f8abc288bac8c0a4... shake_128 0.84424524 547452041 64 c62a813897b81f67822fc07115deae... blake2s 0.85661793 539544839 64 cb8bd19c6ca446bbf7a8abbec61dc5... sha3_224 0.95759645 482649850 56 6f96d117c7fcbcd802b222854db644... shake_256 1.0152032 455262322 64 2d9f9dafe0ddf792c6407910946845... sha3_256 1.015744455019929 64 cc5d55fe0ac31f6e335da1bc6abaf3... sha3_384 1.3235858 349190644 96 13206910ff231fe51a38fe637ded30... sm3 1.4478934 319211203 64 021cd913540d95b13a03342b54f80d... ripemd160 1.4737549 313609670 40 1a956000b88267ec8fc23327d22548... sha3_512 1.9131832 241578418 128 e84b9f499b013956f6f36c93234ca3... "time" is wall time in seconds. So, yes, BLAKE3 was the fastest hash algorithm available on my machine--2.4GB/sec! (I'm a little surprised by that result actually. My CPU is pretty modern, so I assume it has the SHA1 extensions. And I further assume we're using OpenSSL, and OpenSSL will use those extensions when available. If BLAKE3 is *still* faster that OpenSSL, well! hats off to the BLAKE3 team. And that just goes to show you how useful SIMD extensions are!) -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Okay, so. Here's a PR that adds BLAKE3 support to hashlib. The code was straightforward; I just took the BLAKE2 module and modified it to only have one algorithm. I also copied over the whole C directory tree from BLAKE3, which is totally okay fine by their license. I didn't have to modify it a bit. The tricky part was building it. Building extension modules is done inside setup.py using distutils (or the modern replacement), and in order to get those sexy SIMD implementations I had to compile assembly-language files, or C files with extra flags. What I did works great on my Linux box, and I have my fingers crossed that it'll work on other POSIX platforms. I wrote a lng comment in setup.py explaining what I did and why. Steve: I didn't do anything for Windows support. Do you have time to take a pass at it? Or if you know another core dev who does Windows build stuff, please nosy them. Whoever does the work. I suggest you read https://github.com/BLAKE3-team/BLAKE3/blob/master/c/README.md for a little guidance on how to build BLAKE3 on Windows with SIMD support. Also, I see you now build Python for Windows on ARM! Does this mean Python can use BLAKE3's NEON support? Maybe it's time to find out! Get hyped! Jack and I corresponded last year (or maybe 2020?) about what the API should look like. The idea is, Jack also maintains a BLAKE3 Python extension on pypi, written in Rust. Our goal was to make the two types behave identically, so that it could be like the old stringio / cstringio situation. You can use the built-in one for convenience, but also you can install the Rust version from PyPI for even more performance. Jack, it wouldn't hurt my feelings overly much if you checked my PR to make sure I got the interface right. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Change by Larry Hastings : -- pull_requests: +29805 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/31686 ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: I emailed the Pydantic and FastAPI guys and didn't hear back. Given what you found on their issue trackers, I think it's unlikely that they care a lot about this issue (but were too busy to reply). It's far more likely that they don't care. Doing a little research (git blame), it looks like the "follow the wrapped chain to find the original signature" work was done by Nick Coghlan about nine years ago; he touched both functools.update_wrapper and the inspect module. The only other people to touch the code recently are Yuri and Batuhan. I've nosied Nick and Batuhan (already looped in Yuri), just to ping them and see if they have any strong opinions. If nobody has anything remarkable to say about it, honestly we probably *can* just merge your patch, Ofey. I see your name is now marked with a star; are you now authorized to contribute to CPython? (Note that I only glanced at your patch so far; if we were going to merge this bugfix I would of course first do a full review.) -- nosy: +BTaskaya, ncoghlan ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28824] os.environ should preserve the case of the OS keys ?
Change by Larry Hastings : -- nosy: -larry, loewis ___ Python tracker <https://bugs.python.org/issue28824> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46847] functools.update_wrapper doesn't understand partial objects and annotations
New submission from Larry Hastings : functools.update_wrapper currently copies over every attribute listed in the "assigned" parameter, which defaults to WRAPPER_ASSIGNMENTS, which means it copies the wrapped function's __annotations__ to the wrapper. This is slightly wrong if the wrapper occludes an annotated parameter: def foo(a: int, b: str, c: float): print(a, b, c) import functools foo_a = functools.partial(foo, 3) functools.update_wrapper(foo_a, foo) print(foo_a.__annotations__) In this case, foo_a.__annotations__ contains an annotation for a parameter named "a", even though foo_a doesn't have a parameter named "a". This problem occurred to me just after I filed #46846; the two issues are definitely related. -- components: Library (Lib) messages: 413898 nosy: larry, rhettinger priority: normal severity: normal stage: test needed status: open title: functools.update_wrapper doesn't understand partial objects and annotations type: behavior versions: Python 3.11 ___ Python tracker <https://bugs.python.org/issue46847> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46846] functools.partial objects should set __signature__ and _annotations__
New submission from Larry Hastings : I ran across an interesting bug in issue #46761. If you call functools.update_wrapper on a functools.partial object, inspect.signature will return the wrong (original) signature for the partial object. We're still figuring that one out. And, of course, it's telling that the bug has been there for a long time. I suspect this isn't something that has inconvenienced a lot of people. But: I suggest that it's time functools.partial participated in signature stuff. Specifically, I think functools.partial should generate a new and correct __signature__ for the partial object. And I propose it should also generate a new and correct __annotations__ for the partial, by removing all entries for parameters that are filled in by the partial object. Right now inspect.signature has special support for functools.partial objects. It finds the underlying function, and . Which means there's code in both modules that has to understand the internals of partial objects. Just from a code hygiene perspective, it'd be better if all that logic lived under functools. I wonder if functools.partial objects should generally do a better job of impersonating the original function. Should they adopt the same __name__? __file__? __qualname__? My intuition is, it'd be nice if it did. But I might be forgetting something important. (I suspect everything I said about functools.partial also applies to functools.partialmethod.) -- components: Library (Lib) messages: 413897 nosy: larry, rhettinger priority: normal severity: normal stage: test needed status: open title: functools.partial objects should set __signature__ and _annotations__ type: enhancement versions: Python 3.11 ___ Python tracker <https://bugs.python.org/issue46846> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: Okay, so, I considered the problem for a while, and I have a reasonable theory about what follow_wrapper_chains was for in the first place. If you have a generic decorator, like functools.cache(), it usually looks like this: def my_decorator(fn): def generic_version(*args, **kwargs): args, kwargs = do_something(args, kwargs) return fn(*args, **kwargs) return generic_version @my_decorator def add_five(i): return i+5 If you take the signature of add_five, you'd get (*args, **kwargs), because that's the signature of the wrapper function returned by the decorator. The decorator doesn't change the parameters of the function, but because of how decorators work it can occlude the proper function signature. In that instance, follow_wrapper_chains does the right thing, and as a result you get a precise function signature. (Of course, it would do the wrong thing if your hand-written decorator *also* behaved like a partial application, adding in its own hard-coded arguments, so that the resulting function signature changed.) Still, obviously it's doing the wrong thing when it comes to functools.partial() functions. My suspicion is that I'm the rare individual who actually uses update_wrapper on a functools.partial object. So maybe we have the situation here where, yeah, it's a bug, and we can fix it without causing further breakage. Maybe we can loop in someone who works on a popular runtime function introspection library (FastAPI, Pydantic) to see if they have any take on it. -- ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: Ofey, I appreciate your enthusiasm, but you should probably slow down. Fixing the bug is probably going to be the easy part here. But we're not to that stage quite yet. First, we need to determine * why the code behaves like this--is this behavior a genuine bug, or is it actually a bugfix for some worse behavior? * will fixing the bug cause problems for Python users? and if so, can we still fix the bug while mitigating the damage to people who are unfortunately depending on the bug? The next step is not to write a bugfix for this exact behavior, it's to determine why the code is the way it is. If it was genuinely just a mistake, and we can simply fix it and people will thank us, then we may have a use for your patch. But, generally, people who work on Python are smart, and they don't tend to commit dumb mistakes, so we can't simply assume it's a simple bug and fix it. -- ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46807] Wrong class __annotations__ when field name and type are equal
Larry Hastings added the comment: Yeah, it's the same behavior. In that class, you have defined Str to be "", then you reference Str in the annotation, and its value is "". Whatever you set Str to in the example in [21], that's the value of the annotation. GvR closed the previous report as "not a bug", so I'm gonna do that here too. -- ___ Python tracker <https://bugs.python.org/issue46807> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
Larry Hastings added the comment: Yury, Ka-Ping, can you guys shed any light on this? Your names are still on inspect.py. -- nosy: +ping, yselivanov ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: Just checking--I can liberally pull code from https://github.com/BLAKE3-team/BLAKE3 yes? -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46783] Add a new feature to enumerate(iterable, start=0) built-in function
Change by Larry Hastings : -- components: -Argument Clinic nosy: -larry ___ Python tracker <https://bugs.python.org/issue46783> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: I thought someone volunteered to do it--if that's not happening, I could take a look at it next week. Shouldn't be too hard... unless I have to touch autoconf, which I only barely understand. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46761] functools.update_wrapper breaks the signature of functools.partial objects
New submission from Larry Hastings : It's considered good hygiene to use functools.update_wrapper() to make your wrapped functions look like the original. However, when using functools.partial() to pre-supply arguments to a function, if you then call functools.update_wrapper() to update that partial object, inspect.signature() returns the *original* function's signature, not the *wrapped* function's signature. To be precise: if you wrap a function with functools.partial() to pre-provide arguments, then immediately call inspect.signature() on that partial object, it returns the correct signature with the pre-filled parameters removed. If you then call functools.update_wrapper() to update the partial from the original function, inspect.signature() now returns the *wrong* signature. I looked into it a little. The specific thing changing inspect.signature()'s behavior is the '__wrapped__' attribute added by functools.update_wrapper(). By default inspect.signature() will unwrap partial objects, but only if it has a '__wrapped__' attribute. This all looks pretty deliberate. And it seems like there was some thought given to this wrinkle; inspect.signature() takes a "follow_wrapper_chains" parameter the user can supply to control this behavior. But the default is True, meaning that by default it unwraps partial objects if they have a '__wrapped__'. I admit I don't have any context for this. Why do we want inspect.signature() to return the wrong signature by default? -- components: Library (Lib) files: update_wrapper.breaks.partial.signature.test.py messages: 413299 nosy: larry priority: normal severity: normal stage: test needed status: open title: functools.update_wrapper breaks the signature of functools.partial objects type: behavior versions: Python 3.10, Python 3.11, Python 3.7, Python 3.8, Python 3.9 Added file: https://bugs.python.org/file50625/update_wrapper.breaks.partial.signature.test.py ___ Python tracker <https://bugs.python.org/issue46761> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: I assume by "intrinsics" you mean using the GCC SIMD stuff, not like inlining memcpy() or something. My assumption is yes, that's fine, we appear to already be using them for BLAKE2. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: > In setup.py I assume that the target platform of the build is the same as the > current interpreter's platform. If this is included in CPython, it won't be using setup.py, so this isn't a concern. I don't think there's a way to use setup.py to cross-compile, so I'm not sure this ever was a concern. > - Compiling assembly files. AFAICT Python currently ships exactly one assembly file, "Modules/_decimal/libmpdec/vcdiv64.asm", which is only built on Windows. It would be a brave new world of configure.ac hacking to build assembly language files on POSIX platforms. As a first pass I say we merge the reference C implementation. Maybe someday we could add the SIMD assembly language stuff--or use the one built in to OpenSSL (if they ever add BLAKE3). > I assume we don't want to check in the .obj files? Correct, we don't. > - blake3module.c contains an awful lot of gotos to handle allocation failure > cases. Works for me, please keep it. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46314] Stray RESUME opcode for unused lambda
Change by Larry Hastings : -- components: -2to3 (2.x to 3.x conversion tool), Argument Clinic, Build, C API, Cross-Build, Demos and Tools, Distutils, Documentation, Extension Modules, FreeBSD, IDLE, IO, Installation, Library (Lib), Parser, Regular Expressions, SSL, Subinterpreters, Tests, Tkinter, Unicode, Windows, XML, asyncio, ctypes, email, macOS nosy: -larry ___ Python tracker <https://bugs.python.org/issue46314> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39298] add BLAKE3 to hashlib
Larry Hastings added the comment: So, can we shoot for adding this to 3.11? Jack, do you consider the code is in good shape? I'd be up for shepherding it along in the process. In particular, I can contribute the bindings so BLAKE3 is a first-class citizen of hashlib. -- ___ Python tracker <https://bugs.python.org/issue39298> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46347] memory leak in PyEval_EvalCodeEx
Larry Hastings added the comment: (Sorry--it'll leak "kwnames", "newargs", and "defaults".) -- ___ Python tracker <https://bugs.python.org/issue46347> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46347] memory leak in PyEval_EvalCodeEx
Larry Hastings added the comment: The function will still leak "kwnames" and "default" if creating the "func" object fails. Admittedly that would only happen in a low-memory condition which is a) rare and b) probably only happens just before the interpreter completely dies, so it's not worth addressing during today's mild emergency. -- nosy: +larry ___ Python tracker <https://bugs.python.org/issue46347> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43424] Document the `controller.name` field in `webbrowser` module
Change by Larry Hastings : -- components: +Documentation nosy: -larry ___ Python tracker <https://bugs.python.org/issue43424> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43424] Document the `controller.name` field in `webbrowser` module
Change by Larry Hastings : -- components: +Library (Lib) -Argument Clinic ___ Python tracker <https://bugs.python.org/issue43424> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43424] Document the `controller.name` field in `webbrowser` module
Change by Larry Hastings : -- components: -2to3 (2.x to 3.x conversion tool), Build, C API, Cross-Build, Demos and Tools, Distutils, Documentation, Extension Modules, FreeBSD, IDLE, IO, Installation, Interpreter Core, Library (Lib), Parser, Regular Expressions, SSL, Subinterpreters, Tests, Tkinter, Unicode, Windows, XML, asyncio, ctypes, email, macOS ___ Python tracker <https://bugs.python.org/issue43424> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue19838] test.test_pathlib.PosixPathTest.test_touch_common fails on FreeBSD with ZFS
Larry Hastings added the comment: I can confirm that the behavior is fixed in ZFS on Linux. My test case C program now prints "Everything is okay." when run on a ZFS partition on Linux, and test_touch_common from the current tree passes every time. ZFS fixing this was the best possible outcome. I'll go ahead and just close it now--why wait! If somebody confirms that the test still fails on FreeBSD, please open a new issue. Thanks for checking in, Irit! -- stage: patch review -> resolved status: pending -> closed ___ Python tracker <https://bugs.python.org/issue19838> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: (Preferably not using "tox". I don't know that tool.) -- ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: I got the PR locally, but the command-line you gave me fails, tox can't find python3.10 despite it being on the path. Rather than force me to reverse-engineer and recreate your build environment, can you give me a straightforward command-line or shell script to reproduce the problem? -- ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45319] Possible regression in __annotations__ descr for heap type subclasses
Larry Hastings added the comment: I finally have some bandwidth to look at this--sorry for being a bit slow. I wasn't able to reproduce, because the patch didn't apply cleanly. I downloaded the patch ( https://patch-diff.githubusercontent.com/raw/GrahamDumpleton/wrapt/pull/187.patch ) and ran against current wrapt main, and about 50% of the "hunks" failed to apply. -- ___ Python tracker <https://bugs.python.org/issue45319> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31000] Test failure in resource module on ZFS
Larry Hastings added the comment: FWIW the test still fails in exactly the same way. This was building with main, on Pop!_OS 21.04 64-bit. -- ___ Python tracker <https://bugs.python.org/issue31000> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45020] Freeze all modules imported during startup.
Larry Hastings added the comment: Nope. On Windows, os.path is "ntpath". -- ___ Python tracker <https://bugs.python.org/issue45020> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45392] docstring of "type" could use an update
Larry Hastings added the comment: Removing it makes sense to me. Not sure what I was thinking, way back when. Thanks for catching--and volunteering to fix--this! -- ___ Python tracker <https://bugs.python.org/issue45392> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45020] Freeze all modules imported during startup.
Larry Hastings added the comment: > What the two approaches have in common is that they require rebuilding the > python binary whenever you edit any of the changed modules. I heard somewhere > (I'm sorry, I honestly don't recall who said it first, possibly Eric himself) > that Jeethu's approach was rejected because of that. My dim recollection was that Jeethu's approach wasn't explicitly rejected, more that the community was more "conflicted" than "strongly interested", so I lost interest, and nobody else followed up. > I don't understand entirely why Jeethu's prototype had part written in C. My theory: it's easier to serialize C objects from C. It's maybe even slightly helpful? But it made building a pain. And yeah it just doesn't seem necessary. The code generator will be tied to the C representation no matter how you do it, so you might as well write it in a nice high-level language. > I never ran it so I don't know what the generated code looked like, [...] You can see an example of Jeethu's serialized objects here: https://raw.githubusercontent.com/python/cpython/267c93d61db9292921229fafd895b5ff9740b759/Python/frozenmodules.c Yours is generally more readable because you're using the new named structure initializers syntax. Though Jeethu's code is using some symbolic constants (e.g. PyUnicode_1BYTE_KIND) where you're just printing the actual value. > > With that in place, it'd be great to pre-cache all the .py files > > automatically read in at startup. > > *All* the .py files? I think the binary bloat cause by deep-freezing the > entire stdlib would be excessive. I did say "all the .py files automatically read in at startup". In current trunk, there are 32 modules in sys.module at startup (when run non-interactively), and by my count 13 of those are written in Python. If we go with Eric's approach, that means we'd turn those .pyc files into static data. My quick experiment suggests that'd be less than 300k. On my 64-bit Linux system, a default build of current trunk (configure && make -j) yields a 23mb python executable, and a 44mb libpython3.11.a. If I build without -g, they are 4.3mb and 7mb respectively. So this speedup would add another 2.5% to the size of a stripped build. If even that 300k was a concern, the marshal approach would also permit us to compile all the deep-frozen modules into a separate shared library and unload it after we're done. I don't know what the runtime impact of "deep-freeze" is, but it seems like it'd be pretty minimal. You're essentially storing these objects in C static data instead of the heap, which should be about the same. Maybe it'd mean the code objects for the module bodies would stick around longer than they otherwise would? But that doesn't seem like it'd add that much overhead. It's interesting to think about applying these techniques to the entire standard library, but as you suggest that would probably be wasteful. On the other hand: if we made a viable tool that could consume some arbitrary set of .py files and produce a C file, and said C file could then be compiled into a shared library, end users could enjoy this speedup over the subset of the standard library their program used, and perhaps even their own source tree(s). -- ___ Python tracker <https://bugs.python.org/issue45020> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45020] Freeze all modules imported during startup.
Larry Hastings added the comment: There should be a boolean flag that enables/disables cached copies of .py files from Lib/. You should be able to turn it off with either an environment variable or a command-line option, and when it's off it skips all the internal cached stuff and uses the normal .py / .pyc machinery. With that in place, it'd be great to pre-cache all the .py files automatically read in at startup. As for changes to the build process: the most analogous thing we have is probably Argument Clinic. For what it's worth, Clinic hasn't been very well integrated into the CPython build process. There's a pseudotarget that runs it for you in the Makefile, but it's only ever run manually, and I'm not sure there's *any* build automation for Windows developers. AFAIK it hasn't really been a problem. But then I'm not sure this is a very good analogy--the workflow for making Clinic changes is very different from people hacking on Lib/*.py. It might be sensible to add a mechanism that checks whether or not the pre-cached modules are current. Store a hash for each cached module and check that they all match. This could then be part of the release process, run from a GitHub hook, etc. -- ___ Python tracker <https://bugs.python.org/issue45020> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45020] Freeze all modules imported during startup.
Larry Hastings added the comment: Since nobody's said so in so many words (so far in this thread anyway): the prototype from Jeethu Rao in 2018 was a different technology than what Eric is doing. The "Programs/_freeze_importlib.c" Eric's playing with essentially inlines a .pyc file as C static data. The Jeethu Rao approach is more advanced: instead of serializing the objects, it stores the objects from the .pyc file as pre-initialized C static objects. So it saves the un-marshalling step, and therefore should be faster. To import the module you still need to execute the module body code object though--that seems unavoidable. The python-dev thread covers nearly everything I remember about this. The one thing I guess I never mentioned is that building and working with the prototype was frightful; it had both Python code and C code, and it was fragile and hard to get working. My hunch at the time was that it shouldn't be so fragile; it should be possible to write the converter in Python: read in .pyc file, generate .c file. It might have to make assumptions about the internal structure of the CPython objects it instantiates as C static data, but since we'd ship the tool with CPython this should be only a minor maintenance issue. In experimenting with the prototype, I observed that simply calling stat() to ensure the frozen .py file hadn't changed on disk lost us about half the performance win from this approach. I'm not much of a systems programmer, but I wonder if there are (system-proprietary?) library calls one could make to get the stat info for all files in a single directory all at once that might be faster overall. (Of course, caching this information at startup might make for a crappy experience for people who edit Lib/*.py files while the interpreter is running.) One more observation about the prototype: it doesn't know how to deal with any mutable types. marshal.c can deal with list, dict, and set. Does this matter? ISTM the tree of objects under a code object will never have a reference to one of these mutable objects, so it's probably already fine. Not sure what else I can tell you. It gave us a measurable improvement in startup time, but it seemed fragile, and it was annoying to work with/on, so after hacking on it for a week (at the 2018 core dev sprint in Redmond WA) I put it aside and moved on to other projects. -- ___ Python tracker <https://bugs.python.org/issue45020> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44879] How to insert newline characters as normal characters while input()?
Larry Hastings added the comment: This is not a bug, you are asking for programming help. Please don't use the Python issue tracker for programming help. You won't get any, you'll just waste people's time. -- components: -Argument Clinic, FreeBSD, IO, Interpreter Core, Windows nosy: -koobs, paul.moore, prasechen, steve.dower, tim.golden, zach.ware resolution: -> not a bug stage: -> resolved status: open -> closed type: crash -> versions: -Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue44879> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue25782] CPython hangs on error __context__ set to the error itself
Change by Larry Hastings : -- nosy: -larry ___ Python tracker <https://bugs.python.org/issue25782> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22018] signal.set_wakeup_fd() should accept sockets on Windows
Change by Larry Hastings : -- nosy: -larry ___ Python tracker <https://bugs.python.org/issue22018> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36216] CVE-2019-9636: urlsplit does not handle NFKC normalization
Change by Larry Hastings : -- nosy: -larry ___ Python tracker <https://bugs.python.org/issue36216> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37923] Combining typing.get_type_hints and inspect.signature
Change by Larry Hastings : -- nosy: -larry ___ Python tracker <https://bugs.python.org/issue37923> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37923] Combining typing.get_type_hints and inspect.signature
Larry Hastings added the comment: "Type hints" and "annotations" aren't the same thing. And type hints are more opinionated about the values of annotations than would be appropriate for the inspect module. For example, typing.get_type_hints() wraps strings with ForwardRef, turns None into NoneType, and it ignores objects with a "__no_type_check__" attribute. So, I think it might be okay to *add* a *new* function to the typing module that was equivalent to inspect.signature() (and typing.signature() seems like a good name here). I don't use type hints, so I don't have a strong opinion on that either way. But you definitely shouldn't modify inspect.signature() so it produces type hints. -- ___ Python tracker <https://bugs.python.org/issue37923> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43987] Add "Annotations Best Practices" to HOWTO
Larry Hastings added the comment: Thanks to Jelle for an enormous amount of reviewing and re-reviewing! That was just spiffy. -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue43987> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43987] Add "Annotations Best Practices" to HOWTO
Larry Hastings added the comment: New changeset 49b26fa517165f991c35a4afcbef1fcb26836bec by larryhastings in branch 'master': bpo-43987: Add "Annotations Best Practices" HOWTO doc. (#25746) https://github.com/python/cpython/commit/49b26fa517165f991c35a4afcbef1fcb26836bec -- ___ Python tracker <https://bugs.python.org/issue43987> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: Ah, I see. So it wasn't a Windows thing, it was a "we run the test multiple times and that particular test assumed it was only run once" thing. And reflink testing is guaranteed to run every test multiple times. -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: Obviously the original assertion failure ("AssertionError: False is not true") wasn't caused by the refleaks. So I'm still puzzled about why that test worked on POSIX and failed on Windows. I admit I was pulling some wacky import shenanigans in the test, and the import_fresh_module change is an improvement. But still... what was going on?! -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: You want a separate PR for the refleak fix, or should I roll them both up into one? -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: Eek! I swear I did a refleak check and got a clean bill of health. Again, sorry about this! -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Change by Larry Hastings : -- pull_requests: +24442 stage: resolved -> patch review pull_request: https://github.com/python/cpython/pull/25752 ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: Gee whiz! Sorry about that. I swear, it works fine on my machine. I'll incorporate import_helper.import_fresh_module helper into the test as you suggest, and once I get it to work I'll send you a PR. I don't know how to test fixing this failure, though, because again it already works fine on my machine. Why would it fail on Windows? Are there dramatic differences under the covers with respect to the import machinery and caching and such between POSIX and Windows? -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43987] Add "Annotations Best Practices" to HOWTO
Change by Larry Hastings : -- nosy: +Jelle Zijlstra ___ Python tracker <https://bugs.python.org/issue43987> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43987] Add "Annotations Best Practices" to HOWTO
Change by Larry Hastings : -- keywords: +patch pull_requests: +24436 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/25746 ___ Python tracker <https://bugs.python.org/issue43987> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43987] Add "Annotations Best Practices" to HOWTO
New submission from Larry Hastings : Dealing with annotations is complicated. I think there should be a section of the Python documentation describing best practices for working with annotations. So, here goes. The best spot I found for it was as a new HOWTO. I added links to that HOWTO to a couple relevant glossary definitions, and one in the documentation for inspect.get_annotations(). I'm not sure if it makes sense to add any other links; I considered adding links to the HOWTO to where the __annotations__ attribute is defined on functions, modules, and classes, in reference/datamodel.rst, but it just didn't seem like a good idea. -- assignee: larry components: Documentation messages: 392419 nosy: larry priority: normal severity: normal stage: needs patch status: open title: Add "Annotations Best Practices" to HOWTO type: enhancement versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue43987> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: Thanks for your feedback, everybody! It's now checked in. -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: New changeset 74613a46fc79cacc88d3eae4105b12691cd4ba20 by larryhastings in branch 'master': bpo-43817: Add inspect.get_annotations(). (#25522) https://github.com/python/cpython/commit/74613a46fc79cacc88d3eae4105b12691cd4ba20 -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43951] Two minor fixes for C module object
Larry Hastings added the comment: Thanks for the review, Victor! -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue43951> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: Thanks for your feedback and reviews, everybody! Python just got an eensy teensy tiny bit better. -- assignee: -> larry resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: New changeset 2f2b69855d6524e15d12c15ddc0adce629e7de84 by larryhastings in branch 'master': bpo-43901: Lazy-create an empty annotations dict in all unannotated user classes and modules (#25623) https://github.com/python/cpython/commit/2f2b69855d6524e15d12c15ddc0adce629e7de84 -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42904] get_type_hints does not provide localns for classes
Larry Hastings added the comment: Thank you for your in-depth and thoughtful reply! I think that APIs should behave in a predictable way. So, for my use case, I tell the user that I'm passing "globals" and "locals" into eval()--and I think I'd have to have a *very* compelling reason to swap them. Since I don't have the backwards-compatibility problem you do, I think I should keep it simple and predictable and not swap them. In reference to your example, I think it's natural enough that the A defined inside B eclipses the module-level A. That's what the user would *expect* to happen. If the user really wants to reference the module-level A, they have lots of options: * rename one or the other A to something else * establish an alias at module scope, and use that * explicitly say globals()['A'] So I'm not worried about it. -- ___ Python tracker <https://bugs.python.org/issue42904> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42904] get_type_hints does not provide localns for classes
Larry Hastings added the comment: kj: I just added support for default locals to inspect.get_annotation(), and I basically copied-and-pasted your dict(vars(base)) approach. Is this "surprising, but required" behavior due specifically to this being a backwards-incompatible change? inspect.get_annotations() will be a new function in Python 3.10, which means I have no existing users to worry about. Does that mean I don't have the problem you're solving by reversing the order of namespace lookup, and I can just pass globals and locals in like normal? -- ___ Python tracker <https://bugs.python.org/issue42904> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: Ie debated about this with myself (and with a friend!) over the last few days, and I've concluded that evaluating strings by default is too opinionated for the standard library. I've changed both inspect.signature() and inspect.get_annotations() so eval_str is False by default (and removed the ONLY_IF_STRINGIZED logic entirely). Both functions will still raise an exception if eval_str=True, but this will only happen if the user explicitly requests evaluating the strings. I've already updated the PR. -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43954] Possible missing word on unittest doc
Change by Larry Hastings : -- nosy: -larry ___ Python tracker <https://bugs.python.org/issue43954> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: I think the PR is in good shape. If anybody has the time for a review, I'd appreciate it! -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43941] Unit test failure in test_gdb only with -O0
Larry Hastings added the comment: Did that. Well, technically, I replaced two lines with the equivalent: % CFLAGS='-O0 -g3' ./configure I do see this in the Makefile: CONFIGURE_CFLAGS= -O0 -g3 so, I didn't screw it up! The test still failed: = FAIL: test_wrapper_call (test.test_gdb.PyBtTests) -- Traceback (most recent call last): File "/home/larry/src/python/buildtrunk/Lib/test/test_gdb.py", line 947, in test_wrapper_call self.assertRegex(gdb_output, AssertionError: Regex didn't match: ", v=) at Python/bltinmodule.c:1207\n1207\t{\nBreakpoint 2: file Objects/descrobject.c, line 1386.\n\nBreakpoint 2, wrapper_call (wp=, args=, kwds=) at Objects/descrobject.c:1386\n1386\t{\nTraceback (most recent call first):\n \n File "", line 4, in __init__\n File "", line 7, in \n' -- -- ___ Python tracker <https://bugs.python.org/issue43941> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43941] Unit test failure in test_gdb only with -O0
Larry Hastings added the comment: I'm on "Pop!_OS 20.10". Current versions: * Linux kernel 5.11.0-7612 * gcc version 10.2.0 (Ubuntu 10.2.0-13ubuntu1) * GNU gdb (Ubuntu 9.2-0ubuntu2) 9.2 % ./python -m sysconfig | grep CFLAGS BASECFLAGS = "-Wno-unused-result -Wsign-compare" CFLAGS = "-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O0 -Wall" CFLAGSFORSHARED = "" CFLAGS_ALIASING = "" CFLAGS_NODIST = "" CONFIGURE_CFLAGS = "" CONFIGURE_CFLAGS_NODIST = "-std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden" EXTRA_CFLAGS = "" PY_BUILTIN_MODULE_CFLAGS = "-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O0 -Wall -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I. -I./Include -DPy_BUILD_CORE_BUILTIN" PY_CFLAGS = "-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O0 -Wall" PY_CFLAGS_NODIST = "-std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal" PY_CORE_CFLAGS = "-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O0 -Wall -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I. -I./Include -DPy_BUILD_CORE" PY_STDMODULE_CFLAGS = "-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O0 -Wall -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I. -I./Include" -- ___ Python tracker <https://bugs.python.org/issue43941> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43951] Two minor fixes for C module object
Change by Larry Hastings : -- keywords: +patch pull_requests: +24349 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/25658 ___ Python tracker <https://bugs.python.org/issue43951> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43951] Two minor fixes for C module object
New submission from Larry Hastings : While working on another issue, I noticed two minor nits in the C implementation of the module object. Both are related to getting a module's name. -- First, the C function module_dir() (module.__dir__) starts by ensuring the module dict is valid. If the module dict is invalid, it wants to format an exception using the name of the module, which it gets from PyModule_GetName(). However, PyModule_GetName() gets the name of the module from the dict. So getting the name in this circumstance will never succeed. When module_dir() wants to format the error but can't get the name, it knows that PyModule_GetName() must have already raised an exception. So it leaves that exception alone and returns an error. The end result is that the exception raised here is kind of useless and misleading: dir(module) on a module with no __dict__ raises SystemError("nameless module"). I changed the code to actually raise the exception it wanted to raise, just without a real module name: TypeError(".__dict__ is not a dictionary"). This seems more useful, and would do a better job putting the programmer who encountered this on the right track of figuring out what was going on. -- Second, the C API function PyModule_GetNameObject() checks to see if the module has a dict. If m->md_dict is not NULL, it calls _PyDict_GetItemIdWithError(). However, it's possible for m->md_dict to be None. And if you call _PyDict_GetItemIdWithError(Py_None, ...) it will *crash*. Unfortunately, this crash was due to my own bug in the other branch. Fixing my code made the crash go away. I assert that this is still possible at the API level. The fix is easy: add a PyDict_Check() to PyModule_GetNameObject(). Unfortunately, I don't know how to add a unit test for this. Having changed module_dir() above, I can't find any other interfaces callable from Python that eventually call PyModule_GetNameObject(). So I don't know how to trick the runtime into reproducing this error. -- assignee: larry components: Interpreter Core messages: 392055 nosy: larry priority: normal severity: normal stage: needs patch status: open title: Two minor fixes for C module object type: behavior versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue43951> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: One final thought on that idea. Part of the reason why I didn't go back to the drawing board and re-think the API was because I thought folks were pushing back on the idea of *default* behavior possibly raising exceptions. If I misread it, and the pushback was "we want real Python values when we can get them, and strings when we can't, and it's acceptable if we have to explicitly request that behavior from inspect.signature() and inspect.get_annotations()", then hot diggity!, maybe we've finally found a compromise we can all agree on. -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: Just to be clear: I would *not* want this new mode to be the *default* behavior. So far I think ONLY_IF_STRINGIZED is the best compromise for default behavior. -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: I like Eric's suggestion best of all. I'd be willing to add a "silence errors on a case-by-case basis" flag to inspect.signature(). I imagine that would add a new field to the Parameter object (as Guido suggested) indicating which objects failed. Would that be acceptable all around? -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: > There may be a (deliberate? :-) misunderstanding. When I wrote about > "you" inspecting code by a "3rd party" I meant that as a symmetric > relationship -- the "you" could be a library and from the library's > POV the "3rd party" could be you (or me). I wasn't deliberately misunderstanding you. And I understand what you're saying. But the relationship isn't perfectly symmetric from a pragmatic perspective. Let's say I write some code, and I also call into a third-party library. I've annotated one of my objects, and the third-party library calls inspect.signature() on my object. If my annotations are strings, and they aren't eval()uatable, then boom! it raises an exception, showing me a bug in my code, and I can fix it straight away. On the other hand: let's say I write some code, and I call into a third-party library. If the third-party library has an annotated object, and I call inspect.signature() on that object, and the annotations happen to be strings that aren't evaluatable, boom! it's showing me a bug in the third-party library. Now I have a much larger headache: I have to notify this third-party vendor, convince them to fix the bug, wait for a new release, etc etc etc. (And in the meantime maybe I add eval_str=False to my inspect.signature() call.) It's my assumption that the former scenario is far more likely than the latter, which is good news because it's way easier to deal with. The latter scenario is plausible, and much more inconvenient, but I think it's far less likely. If I'm wrong, and there are lots of third-party libraries with stringized annotations that could have errors lurking in them, I'd very much like to know that. -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: > I use inspect.signature for getting information about callables > (third-party and first-party) in my type checker: > https://github.com/quora/pyanalyze/blob/master/pyanalyze/arg_spec.py#L436. > In that context, I'd much rather get string annotations that I can > process myself later than get an exception if the annotations aren't > valid at runtime. This use case is what the eval_str parameter is for. Since you're dealing specifically with type hints (as opposed to annotations generally), you don't really care whether you get strings or valid objects--you need to handle both. So in Python 3.10+ you can--and should!--call inspect.get_annotations() and inspect.signature() with eval_str=False. If you do that you won't have any trouble. Since I keep getting new proposals on how to suppress eval() errors in inspect.signature(), I think I need to zoom out and talk about why I don't want to do it at all. Forgive me, but this is long--I'm gonna start from first principles. Changing inspect.signature() so it calls eval() on annotations is a change in 3.10. And, due to the fact that it's possible (how likely? nobody seems to know) that there are malformed string annotations lurking in user code, this has the possibility of being a breaking change. In order to handle this correctly, I think you need to start with a more fundamental question: are *stringized* annotations supposed to be a hidden implementation detail, and Python should present annotations as real values whether or not they were internally stringized? Or: if the user adds "from __future__ import annotation" to their module, is this them saying they explicitly want their annotations as strings and they should always see them as strings? (Again, this is specifically when the *language* turns your annotations into strings with the "from __future" import. I think if the user enters annotations as strings, the language should preserve those annotations as strings, and that includes the library.) It seems to me that the first answer is right. PEP 563 talks a lot about "here's how you turn your stringized annotations back into objects". In particular, it recommends calling typing.get_type_hints(), which AFAIK has always called eval() to turn string annotations back into objects. It's worth noting here that, in both 3.9 and in the current Python trunk, typing.get_type_hints() doesn't catch exceptions. Also, during the development of Python 3.10, during a time when stringized annotations had become the default behavior, inspect.signature() was changed to call typing.get_type_hints(). Presumably to have typing.get_type_hints() handle the tricky work of calling eval(). (I had problems with this specific approach. "annotations" and "type hints" aren't the same thing, so having inspect.signature() e.g. wrap some annotations with Optional, and change None to NoneType, was always a mistake. Also, typing.get_type_hints() was changed at this time to catch "Exception" and suppress *all* errors raised during the eval() call. Happily both these changes have since been backed out.) >From that perspective, *not* having inspect.signature() turn stringized >annotations back into strings from the very beginning was a bug. And changing >inspect.signature() in Python 3.10 so it calls eval() on string annotations is >a bug fix. So far folks seem to agree--the pushback I'm getting is regarding >details of my approach, not on the idea of doing it at all. Now we hit our second question. It's possible that inspect.signature() can't eval() every stringized annotation back into a Python value, due to a malformed annotation expression that wasn't caught at compile-time. This means inspect.signature() calls that worked in 3.9 could potentially start failing in 3.10. How should we address it? >From the perspective of "string annotations are a hidden implementation >detail, and users want to see real objects", I think the fact that it wasn't >already failing was also a bug. If your annotations are malformed, surely you >want to know about it. If you have a malformed annotation, and you don't turn >on stringized annotations, you get an exception when the annotation expression >is evaluated at module import time, in any Python version up to an including >3.10. Stringized annotations delays this evaluation to when the annotations >are examined--which means the exception for a malformed expression gets >delayed too. Which in turn means, from my perspective, the fact that >inspect.signature() didn't raise on malformed annotation expressions was a bug. I just don't agree that silently changing the failed eval()'d string annotation i
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: > I'm not a big user of the inspect module, but I always thought that > its use was so that you could look at a function (or other object) > *produced by a 3rd party* and learn something about it. That's interesting! I always thought its main use was the opposite--it was so third parties could introspect *your* functions. Like, Pydantic examines your function's signature so it can wrap it with runtime validation. (Though in this specific case, I believe Pydantic examines __annotations__ directly.) I also dimly recall a Python library that built COM bindings around your code, but I can't find it right now. I can't remember ever writing application code that used inspect.signature() to inspect library functions--but I've certainly written library functions that used inspect.signature() to inspect application code. > Asking for the signature is one such operation, and I'd be rather > upset if I couldn't inspect the annotation if that 3rd party happened > to have added a bad string annotation. Do you think that's likely? I'd be quite surprised if it was common for third party libraries to have string annotations--either manual or using automatic stringizing--in the first place. I thought string annotations were really only used as part of type hints, and the type-hinted code was all internal code bases from large organizations, not third-party libraries. But I concede I know little about it. Following on to that, the annotations would have to be bad, which means again either someone made a typo or it was done deliberately. The latter reason would be simply obnoxious if done by a third-party library--third-party libraries should not have the circular imports problem used to justifiy such practices. I suppose third-party libraries are just as error-prone as anybody else, so if they were manually stringizing their annotations, it could happen there. Which I agree would be annoying, just like any bug in a third-party library. But I wouldn't agree this specific error is so special that we need to suppress it. -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43817] Add inspect.get_annotations()
Larry Hastings added the comment: I keep thinking about it, and I think letting inspect.get_annotations() and inspect.signature() raise exceptions is the right API choice. I note that that's what typing.get_type_hints() did in Python 3.9. During the development of Python 3.10, this was changed; typing.get_type_hints() started catching Exception and returning the original annotation dictionary (with un-eval'd strings). This change has been reverted, and as of the current moment typing.get_type_hints() no longer catches exceptions on eval(). I think there are two types of people who will have string annotations that throw an exception when eval'd: 1) People who made a typo or other mistake. 2) People who deliberately use undefined identifiers in their annotations, due to circular import / circular dependencies in their code bases. The people who are saying "just catch the exception and let it pass silently" seem to be in group 2. I suggest that people in group 2 are sophisticated enough to start passing in eval_str=False to inspect.signature(). And I think people in group 1 would want to be alerted to their mistake, rather than have the library silently catch the error and mysteriously change its behavior. -- ___ Python tracker <https://bugs.python.org/issue43817> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Change by Larry Hastings : -- keywords: +patch pull_requests: +24322 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/25623 ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43941] Unit test failure in test_gdb only with -O0
New submission from Larry Hastings : Recent Python source trees fail a regression test in test_gdb. Oddly, the behavior only appears up when optimization is turned off. To reproduce: % git clone cpython buildtrunk % cd buildtrunk % ./configure % vi Makefile # change "-O3" to "-O0" !! % make -j % ./python -m test -v test_gdb You'll be rewarded with one failure: == FAIL: test_wrapper_call (test.test_gdb.PyBtTests) -- Traceback (most recent call last): File "/home/larry/src/python/buildtrunk/Lib/test/test_gdb.py", line 947, in test_wrapper_call self.assertRegex(gdb_output, AssertionError: Regex didn't match: ", v=) at Python/bltinmodule.c:1207\n1207\t{\nBreakpoint 2: file Objects/descrobject.c, line 1386.\n\nBreakpoint 2, wrapper_call (wp=, args=, kwds=) at Objects/descrobject.c:1386\n1386\t{\nTraceback (most recent call first):\n \n File "", line 4, in __init__\n File "", line 7, in \n' -- I just reproduced this with version 3c4850e22239426e250ff43308e4802dc582 . Note that if you don't change the Makefile, and leave the optimization level at "-O3", you won't see this test failure. Pablo, I nosied you just to get it on your radar. Good luck with getting Python 3.10 over the finish line! -- components: Library (Lib) keywords: 3.10regression messages: 391878 nosy: larry, pablogsal priority: normal severity: normal stage: needs patch status: open title: Unit test failure in test_gdb only with -O0 type: behavior versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue43941> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: > Functions don't store __annotations__ in their __dict__, it is a > separate slot named func_annotations (see funcobject.c). I guess > that's because the __dict__ is purely for user-defined function > attributes. I brought up functions because I'm now proposing to make classes and modules behave more like functions in this one way, that they too will now lazy-create an empty annotations dict every time. But yes, function objects don't store __annotations__ in their __dict__, and best practices for 3.9 and before was to use fn.__annotations__ or getattr() when getting the annotations from a function object, and 3.10 will not change that best practice. I don't know specifically why the implementor made that design choice, but I might have done that too. Most function objects don't have a __dict__ so this is almost always cheaper overall. And recreating the dict seems harmless. (Just to round the bases on this topic: I don't think you should be permitted to delete __annotations__, like most other metadata items on functions / classes / modules. But I don't propose to change that for 3.10.) > So if you look in __dict__ it will be like it's still Python 3.9, > but if you're using the attribute (the recommended approach for code > that only cares about 3.10) it'll be as if it always existed. Sounds > pretty compatible to me. Yes, exactly. That was the thing I finally figured out this afternoon. Sorry for being a slow learner. Again, this approach will change the semantics around deleting annotations on class and module objects. Deleting them won't be permanent--if you delete one, then ask for it, a fresh one will be created. But that seems harmless. > So, honestly I don't understand what your concern with the lazy > approach is. Was your design based on having a bit in the > class/module object (outside its __dict__) saying "I already > lazily created one"? Or am I missing something? My concern is that always lazy-creating on demand will change user-visible behavior. Consider this code: class C: a:int=3 del C.__annotations__ print(C.__annotations__) In 3.9, that throws an AttributeError, because C no longer has an '__annotations__' attribute. If I change Python 3.10 so that classes and modules *always* lazy-create __annotations__ if they don't have them, then this code will succeed and print an empty dict. That's a user-visible change, and I was hoping to avoid those entirely. Is it a breaking change? I doubt it. Is it an important change? It doesn't seem like it. I bring it up just in the interests of considering every angle. But I don't think this is important at all, and I think always lazy-creating annotations dicts on classes and modules is the right approach. -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43901] Lazy-create an empty annotations dict in all unannotated user classes and modules
Larry Hastings added the comment: Hmm. Sorry for the stream-of-consciousness thought process here, but this approach adds wrinkles too. Function objects from the very beginning have lazy-created their annotations dict if it's not set. Which means this works fine: while True: del fn.__annotations__ print(fn.__annotations__) You can do that all day. The function object will *always* create a new annotations object if it doesn't have one. del fn.__annotations__ always works, and accessing fn.__annotations__ always succeeds. It doesn't retain any state of "I already lazily created one, I shouldn't create another". If I add getsets to classes and modules so they lazy-create annotations on demand if they otherwise aren't set, then either a) they need an extra bit set somewhere that says "I lazily created an empty annotations dict once, I shouldn't do it again", or b) they will duplicate this behavior that functions display. a) would better emulate existing semantics; b) would definitely be a user-visible breaking change. There's actually a test in the Lib/test/test_opcodes (iirc) that explicitly tests deleting __annotations__, then checks that modifying the annotations dict throws an exception. I haven't done any investigating to see if this check was the result of a regression, and there was a bpo issue, and there was a valid use case, etc etc etc. My instinct is that deleting o.__annotations__ is not an important or widely-used use case. In fact I plan to recommend against it in my "best practices" documentation. So either approach should be acceptable. Which means I should go with the simpler one, the one that will duplicate the function object's always-recreate-annotations-dicts behavior. -- ___ Python tracker <https://bugs.python.org/issue43901> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com