Re: [Python-Dev] [python-committers] Enabling depreciation warnings feature code cutoff
I also feel this decision was a mistake. If there's a consensus to revert, I'm happy to draft a PEP.

Alex

On Nov 6, 2017 1:58 PM, "Neil Schemenauer" wrote:

> On 2017-11-06, Nick Coghlan wrote:
> > Gah, seven years on from Python 2.7's release, I still get caught by
> > that. I'm tempted to propose we reverse that decision and go back to
> > enabling them by default :P
>
> Either enable them by default or make them really easy to enable for
> development environments. I think some setting of the PYTHONWARNINGS
> environment variable should do it. It is not obvious to me how to do
> it though. Maybe there should be an environment variable that does
> it more directly. E.g.
>
>     PYTHONWARNDEPRECATED=1
>
> Another idea is to have venv turn them on by default or, based on
> a command-line option, do it. Or, maybe the unit testing frameworks
> should turn on the warnings when they run.
>
> The current "disabled by default" behavior is obviously not working
> very well. I had them turned on for a while and found quite a
> number of warnings in what are otherwise high-quality Python
> packages. Obviously the vast majority of developers don't have them
> turned on.
>
> Regards,
>
> Neil
> ___
> python-committers mailing list
> python-committ...@python.org
> https://mail.python.org/mailman/listinfo/python-committers
> Code of Conduct: https://www.python.org/psf/codeofconduct/

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
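For readers wondering how to turn the warnings back on today: a minimal sketch, assuming the standard `warnings` filter syntax. Setting `PYTHONWARNINGS=default::DeprecationWarning` in the environment (or passing `-W default::DeprecationWarning`) should re-enable them for one process. The names `old_api` and `collect_deprecations` below are hypothetical and only illustrate the in-process equivalent; nothing here was proposed in the thread.

```python
import warnings


def old_api():
    """Hypothetical deprecated function, used only for illustration."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def collect_deprecations(func):
    """Call func() with DeprecationWarning enabled and return the warnings seen."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" avoids the once-per-location suppression, so every call
        # is recorded; the environment-variable form above uses "default".
        warnings.simplefilter("always", DeprecationWarning)
        func()
    return [w for w in caught if issubclass(w.category, DeprecationWarning)]


if __name__ == "__main__":
    for w in collect_deprecations(old_api):
        print(f"{w.category.__name__}: {w.message}")
```

A test runner could wrap each test in something like `collect_deprecations`, which is roughly the "unit testing frameworks should turn on the warnings" idea from the quoted message.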
Re: [Python-Dev] [python-committers] Cherry picker bot deployed in CPython repo
This is a great UX win for our development process. Thanks for making this happen!

Alex

On Tue, Sep 5, 2017 at 9:10 PM, Mariatta Wijaya wrote:

> Hi,
>
> The cherry picker bot has just been deployed to the CPython repo, codenamed
> miss-islington.
>
> miss-islington made the very first backport PR for CPython and became a
> first-time GitHub contributor: https://github.com/python/cpython/pull/3369
>
> GitHub repo: https://github.com/python/miss-islington
>
> What is this?
> =============
>
> As part of our workflow, quite often changes made on the master branch
> need to be backported to the earlier versions (for example: from master to
> 3.6 and 2.7).
>
> Previously the backport had to be done manually by either a core developer
> or the original PR author.
>
> With the bot, the backport PR is created automatically after the PR has
> been merged. A core developer will need to review the backport PR.
>
> The issue was tracked in https://github.com/python/core-workflow/issues/8
>
> How it works
> ============
>
> 1. If a PR needs to be backported to one of the maintenance branches, a
>    core developer should apply the "needs backport to X.Y" label. Do this
>    **before** you merge the PR.
>
> 2. Merge the PR.
>
> 3. miss-islington will leave a comment on the PR, saying it is working on
>    backporting the PR.
>
> 4. If there's no merge conflict, the PR should be created momentarily.
>
> 5. Review the backport PR created by miss-islington and merge it when
>    you're ready.
>
> Merge Conflicts / Problems?
> ===========================
>
> In case of merge conflicts, or if a backport PR was not created within 2
> minutes, it likely failed and you should do the backport manually.
>
> Manual backport can be done using cherry_picker:
> https://pypi.org/project/cherry-picker/
>
> Older merged PRs not yet backported?
> ====================================
>
> At the moment, those need to be backported manually.
>
> Don't want PR to be backported by a bot?
> ========================================
>
> My recommendation is to apply the "needs backport to X.Y" label **after**
> the PR has been merged. The label is still useful to remind ourselves that
> this PR still needs backporting.
>
> Who is Miss Islington?
> ======================
>
> I found out from Wikipedia that Miss Islington is the name of the witch in
> Monty Python and The Holy Grail.
>
> miss-islington has not signed the CLA!
> ======================================
>
> A core dev can ignore the warning and merge the PR anyway.
>
> Thanks!
>
> Mariatta Wijaya

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: D1B3 ADC0 E023 8CA6
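The label convention in step 1 above can be illustrated with a small sketch. This is not how miss-islington itself is implemented; `backport_branches` is a hypothetical helper, and the only thing taken from the announcement is the "needs backport to X.Y" label wording.

```python
import re

# Label wording taken from the announcement: "needs backport to X.Y".
_BACKPORT_LABEL = re.compile(r"^needs backport to (\d+\.\d+)$")


def backport_branches(labels):
    """Return the maintenance branches named by backport labels, in order."""
    branches = []
    for label in labels:
        match = _BACKPORT_LABEL.match(label.strip())
        if match:
            branches.append(match.group(1))
    return branches


if __name__ == "__main__":
    print(backport_branches(["needs backport to 3.6", "type-bug"]))
```

Labels that do not follow the convention are simply ignored, which matches the announcement's point that the label drives the whole process.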
Re: [Python-Dev] Backport ssl.MemoryBIO on Python 2.7?
I'm +1 on this, I even wrote the patch: https://bugs.python.org/issue22559 :-)

If you're interested in making sure that still applies and tests still pass, I'd be a big fan. In addition to all the benefits you mentioned, it also substantially reduces the diff between 2.7 and 3.x (or at least it did when I originally wrote it).

Cheers,
Alex

On Tue, May 23, 2017 at 8:46 PM, Victor Stinner wrote:

> Hi,
>
> Would you be ok to backport ssl.MemoryBIO and ssl.SSLObject on Python
> 2.7? I can do the backport.
>
> https://docs.python.org/dev/library/ssl.html#ssl.MemoryBIO
>
> Cory Benfield told me that it's a blocking issue for him to implement
> his PEP 543 -- A Unified TLS API for Python 2.7:
>
> https://www.python.org/dev/peps/pep-0543/
>
> And I expect that if a new cool TLS API happens, people will want to
> use it on Python 2.7-3.6, not only on Python 3.7. Security evolves
> more quickly than the current Python release process, and people want
> to keep their applications secure.
>
> From what I understood, he wants to first implement an abstract
> MemoryBIO API (http://sans-io.readthedocs.io/ like API? I'm not sure
> about that), and then implement a socket/FD based on top of that.
> Maybe later, some implementations might have a fast-path using
> socket/FD directly.
>
> He described his PEP to me and I strongly support it (sorry, I missed it
> when he posted it on python-dev), but we decided (Guido van Rossum,
> Christian Heimes, Cory Benfield and me, see the tweet below) to not
> put this in the stdlib right now, but spend more time on testing it on
> Twisted, asyncio, requests, etc. So publishing an implementation on
> PyPI was proposed instead. It seems like we agreed on a smooth plan
> (or am I wrong, Cory?).
>
> https://twitter.com/VictorStinner/status/865467388141027329
>
> I'm quite sure that Twisted will love MemoryBIO on Python 2.7 as well,
> to implement TLS, especially on Windows using IOCP. Currently,
> external libraries (C extensions) are required.
>
> I'm not sure if PEP 466 should be amended for that? Is a new PEP
> really needed? MemoryBIO/SSLObject are tiny. Nick (Coghlan): what do
> you think?
>
> https://www.python.org/dev/peps/pep-0466/
>
> Victor

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: D1B3 ADC0 E023 8CA6
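For context on what is being backported: a minimal sketch of the Python 3 `ssl.MemoryBIO` / `ssl.SSLObject` API as it exists in 3.5+, assuming nothing about the eventual 2.7 patch. It starts a TLS handshake entirely in memory, with no socket involved; the function name `client_hello_bytes` is hypothetical.

```python
import ssl


def client_hello_bytes(server_hostname):
    """Start a TLS handshake in memory and return the first bytes to send."""
    context = ssl.create_default_context()
    incoming = ssl.MemoryBIO()  # bytes received from the peer are written here
    outgoing = ssl.MemoryBIO()  # bytes to send to the peer are read from here
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the handshake cannot finish until the peer replies.
        pass
    return outgoing.read()


if __name__ == "__main__":
    print(len(client_hello_bytes("example.org")), "bytes ready to send")
```

Because the caller decides how those bytes travel, the same object can sit on top of IOCP, an event loop, or a plain socket, which is the property Victor's message highlights for Twisted.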
Re: [Python-Dev] Request for pronouncement on PEP 493 (HTTPS verification backport guidance)
Hi all,

While I appreciate the vote of confidence from everyone, I'm not interested in being the BDFL-delegate for this. I don't think it's a good idea, and I'm not willing to put further time into it. If he's interested, Donald Stufft would make a good choice for delegate.

Really do appreciate everyone's confidence.

Cheers,
Alex

On Mon, Nov 23, 2015 at 2:35 PM, Christian Heimes wrote:

> On 2015-11-17 01:00, Guido van Rossum wrote:
> > Hm, making Christian the BDFL-delegate would mean two out of three
> > authors *and* the BDFL-delegate all working for Red Hat, which clearly
> > has a stake (and IIUC has already committed to this approach ahead of
> > PEP approval). So then it would look like this is just rubber-stamping
> > Red Hat's internal decision process (if it's a process -- sounds more
> > like an accident :-).
> >
> > So, Alex, do you want to approve this PEP?
>
> I haven't read this thread until now. Independently from your objection
> I have raised the same concern with Nick today. I'd be willing to BDFL
> the PEP but I'd rather have somebody outside of Red Hat. Alex is a great
> choice.
>
> In the same mail I sent Nick a quick review of the latest PEP version in
> private.
>
> 1) The example implementation of the function doesn't check
> sys.flags.ignore_environment. Internally CPython has a specialized getenv
> function that ignores env vars with the PYTHON prefix when the flag is set.
> PYTHON* env vars aren't removed from os.environ. Modules have to check
> the flag.
>
> 2) The PEP is rather Linux-centric. What's the recommended path to the
> config file on other platforms like BSD (/usr/local/etc/ is preferred
> for additional dependencies on FreeBSD), OSX and Windows?
>
> 3) What's the interaction between the location of the config file and
> virtual envs? Would it make sense to search for the file in a venv's
> etc/ first and then dispatch to global /etc/? That way venvs can
> influence the setting, too.
>
> 4) It makes sense to make the cert-verification.cfg file future-proof
> and reserve it for other cert-related configuration in the future. For
> example it could be used to define new contexts, set protocols, ciphers
> or hashes for cert pinning. It should be enough to say that CPython
> reserves the right to add more sections and keys later.
>
> 5) I'm not particularly fond of the section name [https]. For one, it is
> ambiguous because it doesn't distinguish between server certs and client
> certs. It's also not correct. The default context is used for other
> protocols like imap, smtp etc. over TLS.
>
> Christian

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084
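Christian's point 1 can be sketched as follows. This is only an illustration of the check he describes, not the PEP's reference implementation: `read_python_env` is a hypothetical helper, and the variable name `PYTHONHTTPSVERIFY` in the usage line is an assumption that does not appear in this thread.

```python
import os
import sys


def read_python_env(name, default=None):
    """Read a PYTHON* environment variable, honouring the -E / -I flags.

    os.environ still contains PYTHON* variables when the interpreter was
    started with -E, so library code has to check the flag itself.
    """
    if sys.flags.ignore_environment:
        return default
    return os.environ.get(name, default)


if __name__ == "__main__":
    # PYTHONHTTPSVERIFY is used here only as an example name (assumption).
    print(read_python_env("PYTHONHTTPSVERIFY", default="<unset>"))
```

Run with `python -E` the helper should always return the default, whatever the environment contains.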
Re: [Python-Dev] Python 2.x and 3.x use survey, 2014 edition
Ben Finney <ben+python at benfinney.id.au> writes:
> Rather, the claim is that *if* one's code base doesn't migrate to Python
> 3, it will be decreasingly supported by the PSF and the Python community
> at large.

The PSF doesn't support any versions of Python. We have effectively no involvement in the development of Python the language, or CPython. We certainly don't care what version of Python you use.

Members of the python-dev list, or the CPython core development team, probably have opinions, but that doesn't make them the opinion of the PSF.

Alex
Re: [Python-Dev] PEP 481 - Migrate Some Supporting Repositories to Git and Github
On Sun Nov 30 2014 at 10:28:50 AM Brett Cannon <br...@python.org> wrote:
> Why specifically? Did you have a web UI for reviewing patches previously?
> Do you have CI set up for patches now and didn't before? What features did
> you specifically gain from the switch to GitHub that you didn't have
> before? IOW was it the magic of GitHub or some technical solution that you
> got as part of the GitHub package and thus could theoretically be
> replicated on python.org?
>
> -Brett

Previously someone looking for a review (read: any non-committer) would export a diff from their VCS, upload it as a patch to trac, and then reviewers could leave comments as trac comments. CPython's present process is a clear improvement, insofar as Rietveld allows inline commenting, but it is otherwise basically the same.

By contrast, the GitHub process does not require a patch author to leave their workflow: they simply git push to update a patch.

We now also have CI for PRs, but that's a recent addition.

It's not magic, it's a good UX :-)

Alex
Re: [Python-Dev] PEP 481 - Migrate Some Supporting Repositories to Git and Github
Donald Stufft <donald at stufft.io> writes:
> [words words words]

I strongly support this PEP. I'd like to share two pieces of information. Both of these are personal anecdotes:

For the past several years, I've been a contributor on two major projects using mercurial, CPython and PyPy. PyPy has a strong culture of in-repo branching: basically all contributors regularly make branches in the main repo for their work, and we're very free in giving people commit rights, so almost everyone working on PyPy in any way has this level of access. This workflow works ok. I don't love it as much as git, but it's fine, it's not an impediment to my work.

By contrast, CPython does not have this type of workflow; there are almost no in-tree branches besides the 2.7, 3.4, etc. ones. Despite being a regular hg user for years, I have no idea how to create a local-only branch, or a branch which is pushed to a remote (to use the git term). I also don't generally commit my own work to CPython, even though I have push privileges, because I prefer to *always* get code review on my work. As a result, I use a git mirror of CPython to do all my work, and generate patches from that.

The conclusion I draw from this is that hg's workflow is probably fine if you're a committer on the project, or don't ever need to maintain multiple patches concurrently (and thus can just leave everything uncommitted in the repo). However, the hg workflow seems extremely deficient for non-committer contributors.

The second experience I have is that of Django's migration to git and GitHub. For a long time we were on SVN, and we were very resistant to moving to DVCS in general, and GitHub in particular. Multiple times I said that I didn't see how exporting a patch and uploading it to trac was more difficult than sending a pull request. That was very wrong on my part.

My primary observation is not about new contributors though, it's actually about the behavior of core developers. Before we were on GitHub, it was fairly rare for core developers to ask for reviews for anything besides *gigantic* patches; we'd mostly just commit stuff to trunk. Since the switch to GitHub, I've seen that core developers are *far* more likely to ask for reviews of their work before merging.

Big +1 from me, thanks for writing this up Donald,
Alex
Re: [Python-Dev] PEP476: Enabling certificate validation by default
Guido van Rossum <guido at python.org> writes:
> OK, I'll hold off a bit on approving the PEP, but my intention is to
> approve it. Go Alex go!

A patch for the environmental variable overrides on Windows has landed; thanks Benjamin!

Alex
Re: [Python-Dev] PEP476: Enabling certificate validation by default
Done and done.

Alex

On Fri, Sep 19, 2014 at 4:13 PM, Guido van Rossum <gu...@python.org> wrote:

> +1 on Nick's suggestion. (Might also mention that this is the reason why
> both functions should exist and have compatible signatures.)
>
> Also please, please, please add explicit mention of Python 2.7, 3.4 and
> 3.5 in the Abstract (for example in the 3rd paragraph of the abstract).
>
> On Fri, Sep 19, 2014 at 3:52 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
>
> > On 20 September 2014 08:34, Alex Gaynor <alex.gay...@gmail.com> wrote:
> > > Pushed a new version which I believe addresses all of these. I added an
> > > example of opting-out with urllib.urlopen, let me know if there's any
> > > other APIs you think I should show an example with.
> >
> > It would be worth explicitly stating the process global monkeypatching
> > hack:
> >
> >     import ssl
> >     ssl._create_default_https_context = ssl._create_unverified_context
> >
> > Adding that hack to sitecustomize allows corporate sysadmins that can
> > update their standard operating environment more easily than they can
> > fix invalid certificate infrastructure to work around the problem on
> > behalf of their users. It also helps out users that will be able to deal
> > with such broken infrastructure without updating each and every one of
> > their scripts.
> >
> > It's deliberately ugly because it's a genuinely bad idea that folks
> > should want to avoid using, but as a matter of practical reality,
> > corporate IT departments are chronically understaffed, and often fully
> > committed to fighting the crisis du jour, without sufficient time being
> > available for regular infrastructure maintenance tasks.
> >
> > Regards,
> > Nick.
> >
> > --
> > Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
>
> --
> --Guido van Rossum (python.org/~guido)

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084
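To see what the process-global hack Nick describes actually changes, here is a minimal sketch, assuming a Python version where `ssl._create_default_https_context` exists (3.4.3+ / 2.7.9+). Both underscore-prefixed names are private stdlib hooks quoted in the message; the helper names `unverified_https_default` and `default_https_verify_mode` are hypothetical, and the context manager merely scopes the patch so the sketch cleans up after itself.

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https_default():
    """Temporarily apply the process-global hack, then restore the old factory."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        ssl._create_default_https_context = original


def default_https_verify_mode():
    """Report the verify_mode of the context the stdlib would build by default."""
    return ssl._create_default_https_context().verify_mode


if __name__ == "__main__":
    print("normal: ", default_https_verify_mode())
    with unverified_https_default():
        print("patched:", default_https_verify_mode())
```

In a real `sitecustomize` the assignment would be permanent for the process, which is exactly why the message calls it a genuinely bad idea outside of broken-infrastructure workarounds.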
Re: [Python-Dev] PEP476: Enabling certificate validation by default
That sounds reasonable to me -- at this point I don't expect this to make it into 3.4.2. Nick has some working code on the ticket: http://bugs.python.org/issue22417; it's mostly missing documentation.

Alex

On Sat, Sep 20, 2014 at 9:46 AM, Guido van Rossum <gu...@python.org> wrote:

> Nice. I just realized the release candidate for 3.4.2 is really close
> (RC1 Monday, final Oct 6, see PEP 429). What's your schedule for 3.4? I
> see no date for 2.7.9 yet (but that could just be that PEP 373 hasn't
> been updated).
>
> What about the Apple and Microsoft issues Christian pointed out?
>
> Regarding the approval process, I want to get this into 2.7 and 3.4, but
> I want it done right, and I'm not convinced that the implementation is
> sufficiently worked out. I don't want you to feel rushed, and I don't
> want you to feel that you can't start coding until the PEP is approved,
> but I also feel that I want to see more working code and some beta
> testing before it goes live.
>
> Perhaps I should just approve the PEP but separately get to approve the
> code? (Others will have to review it for correctness -- but I want to
> understand and review the API.)
>
> On Sat, Sep 20, 2014 at 8:54 AM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>
> > Done and done.
> >
> > Alex
>
> --
> --Guido van Rossum (python.org/~guido)

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084
[Python-Dev] PEP476: Enabling certificate validation by default
Hi all,

I've just updated the PEP to reflect the API suggestions from Nick, and the fact that the necessary changes to urllib have landed.

I think this is ready for pronouncement, Guido?

Cheers,
Alex
Re: [Python-Dev] PEP476: Enabling certificate validation by default
Pushed a new version which I believe addresses all of these. I added an example of opting-out with urllib.urlopen, let me know if there's any other APIs you think I should show an example with.

On Fri, Sep 19, 2014 at 3:06 PM, Guido van Rossum <gu...@python.org> wrote:

> The PEP doesn't specify any of the API changes for Python 2.7. I feel it
> is necessary for the PEP to show a few typical code snippets using urllib
> in Python 2.7 and how one would modify these in order to disable the cert
> checking.
>
> There are also a few typos; especially this paragraph puzzled me:
>
> > This will be acheived by adding a new
> > ``ssl._create_default_https_context`` function, which is the same as
> > ``ssl.create_default``. ``http.client`` can then replace it's usage of
> > ``ssl._create_stdlib_context`` with the new
> > ``ssl._create_default_https_context``.
>
> (1) spelling: it's achieved, not acheived
> (2) method name: it's ssl.create_default_context, not ssl.create_default
> (3) There's not enough whitespace (in the rendered HTML on
>     legacy.python.org) before http.client -- I kept reading it as
>     "... which is the same as ssl.create_default.http.client ..."
> (4) There's no mention of the Python 2 equivalent of http.client.
>
> Finally, it's kind of non-obvious in the PEP that this affects Python
> 2.7.X (I guess the one after the next) as well as 3.4 and 3.5.
>
> On Fri, Sep 19, 2014 at 9:53 AM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>
> > Hi all,
> >
> > I've just updated the PEP to reflect the API suggestions from Nick, and
> > the fact that the necessary changes to urllib were landed.
> >
> > I think this is ready for pronouncement, Guido?
> >
> > Cheers,
> > Alex
>
> --
> --Guido van Rossum (python.org/~guido)

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084
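Since the quoted PEP paragraph hinges on the new default being "the same as" `ssl.create_default_context`, a short sketch of what that function gives you may help. It assumes a current Python 3; `describe` is a hypothetical helper, and `ssl._create_unverified_context` is the private stdlib function named elsewhere in this thread.

```python
import ssl


def describe(context):
    """Summarise the two settings the PEP discussion is about."""
    return {
        "verify_mode": context.verify_mode,
        "check_hostname": context.check_hostname,
    }


# What HTTPS clients get under the PEP's proposal.
verifying = describe(ssl.create_default_context())

# What the opt-out path produces.
unverified = describe(ssl._create_unverified_context())

if __name__ == "__main__":
    print("verifying: ", verifying)
    print("unverified:", unverified)
```

The verifying context should require a certificate and check the hostname, while the unverified one does neither.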
Re: [Python-Dev] Proposed schedule for 3.4.2
Guido van Rossum <guido at python.org> writes:

Would you be willing to officially pronounce on PEP-476 in the context of 3.4.x, so we can get it into the release, and then we can defer on officially approving it for 2.7.X until we figure out all the moving pieces?

Cheers,
Alex
Re: [Python-Dev] Proposed schedule for 3.4.2
*Shifts uncomfortably* it looks like presently there's not a good way to change anything about the SSL configuration for urllib.request.urlopen. It does not take a `context` argument, as the http.client API does (https://docs.python.org/3/library/urllib.request.html#module-urllib.request), and instead takes the cafile, capath, cadefault args.

This would need to be updated first. Once it *did* take such an argument, this would be accomplished by:

    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE  # or ssl.CERT_OPTIONAL
    urllib.request.urlopen("https://something-i-apparently-dont-care-much-about", context=context)

Alex

On Mon, Sep 8, 2014 at 10:35 AM, Guido van Rossum <gu...@python.org> wrote:

> I will pronounce for 3.4 once you point me to the documentation that
> explains how to disable cert validation for an example program that
> currently pulls down an https URL using urlopen. Without adding package
> dependencies.
>
> On Mon, Sep 8, 2014 at 10:25 AM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>
> > Guido van Rossum <guido at python.org> writes:
> >
> > Would you be willing to officially pronounce on PEP-476 in the context
> > of 3.4.x, so we can get it into the release, and then we can defer on
> > officially approving it for 2.7.X until we figure out all the moving
> > pieces?
> >
> > Cheers,
> > Alex
>
> --
> --Guido van Rossum (python.org/~guido)

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084
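As a runnable sketch of the opt-out described above, assuming a Python version in which `urllib.request.urlopen` accepts `context` (which was not yet the case when this message was written): the attribute that controls hostname checking is `check_hostname`, and it has to be switched off before `verify_mode` can be lowered to `CERT_NONE`. The function names are hypothetical.

```python
import ssl
import urllib.request


def make_unverified_context():
    """Build an SSLContext that performs no certificate or hostname checks."""
    context = ssl.create_default_context()
    # Order matters: hostname checking must be disabled before
    # verify_mode may be set to CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_without_verification(url, timeout=10):
    """Fetch url with verification disabled. Open to MITM; illustration only."""
    context = make_unverified_context()
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

Keeping the context construction in its own function makes the insecure choice explicit and easy to grep for.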
Re: [Python-Dev] PEP 476: Enabling certificate validation by default!
Ethan Furman <ethan at stoneleaf.us> writes:
> I apologize if I missed this point, but if we have the source code then it
> is possible to go in and directly modify the application/utility to be
> able to talk over https to a router with an invalid certificate? This is
> an option when creating the ssl_context?
>
> --
> ~Ethan~

Yes, it's totally possible to create (and pass to ``http.client``) an ``SSLContext`` which doesn't verify various things. My proposal is only about changing what happens when you don't explicitly pass a context.

Cheers,
Alex
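A minimal sketch of what passing an explicit context to ``http.client`` might look like for Ethan's router case. The helper name `https_connection`, its parameters, and the host `router.invalid` are hypothetical; constructing the connection object does not open a socket.

```python
import http.client
import ssl


def https_connection(host, cafile=None, verify=True):
    """Return an HTTPSConnection using an explicitly chosen SSLContext.

    verify=True with cafile set trusts only the given CA bundle (useful for
    an internal CA); verify=False disables all checks.
    """
    context = ssl.create_default_context(cafile=cafile)
    if not verify:
        # check_hostname must be cleared before lowering verify_mode.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return http.client.HTTPSConnection(host, context=context)


if __name__ == "__main__":
    conn = https_connection("router.invalid", verify=False)
    print(conn.host)
```

Where the device's certificate can be exported, passing it via `cafile` keeps verification on and is usually the better of the two options.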
Re: [Python-Dev] PEP 476: Enabling certificate validation by default!
Guido van Rossum <guido at python.org> writes:
> OK, that changes my position for 2.7 (but not for 3.5). I had assumed
> there was a way to disable the cert check by changing one parameter to
> the urlopen() call. (And I had wanted to add that there should be a clear
> FAQ about the subject.) If this isn't possible that changes the
> situation. (But I still think that once we do have that simple change
> option we should do it, in a later 2.7 upgrade.) I apologize for speaking
> before I had read all facts, and I'll await what you and Nick come up
> with.
>
> --Guido

This probably doesn't surprise anyone, but I'm more than happy to do the backporting work for httplib, and any other modules which need SSLContext support; does this require an additional PEP, or does it fit under PEP 466 or PEP 476?

Alex
Re: [Python-Dev] PEP 476: Enabling certificate validation by default!
Antoine Pitrou <solipsis at pitrou.net> writes:
> And how many people are using Twisted as an HTTPS client?
> (compared to e.g. Python's httplib, and all the third-party libraries
> building on it?)

I don't think anyone could give an honest estimate of these counts; however, there are two factors to bear in mind: a) it's extremely strongly recommended to use requests to make any HTTP requests, precisely because httplib is negligent in certificate and hostname checking by default; b) we're talking about Python 3, which has fewer users than Python 2.

> > Furthermore, disabling verification is a nonsensical thing to do with TLS.
>
> It's not. For example, if you have an expired cert, all you can do
> AFAIK is to disable verification.

It really is a nonsensical operation: accepting any random TLS certificate without pinning or use of a certificate authority makes a valid connection completely indistinguishable from a MITM attack.

If I were the emperor of the universe (or even just Python ;-)) I wouldn't allow this operation at all; however, I'm not, and you can still disable any and all verification. It just requires you to pass a different argument, which doesn't seem overly burdensome.

This whole scenario seems to be predicated on a situation where: you have a peer whose certificate you can't change, and you have a piece of code you can't change, and you're going to upgrade your Python installation, and you want to talk to this peer, and you need to use an encrypted channel, but don't really care if it's being MITM'd. It doesn't seem to me that it is reasonably Python's responsibility to deal with the fact that you have no ability to upgrade any of your infrastructure, except your Python version.

Alex
Re: [Python-Dev] PEP 476: Enabling certificate validation by default!
The Windows certificate store is used by ``load_default_certs``:

* https://github.com/python/cpython/blob/master/Lib/ssl.py#L379-L381
* https://docs.python.org/3.4/library/ssl.html#ssl.enum_certificates

Cheers,
Alex
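A small sketch for poking at this locally, assuming Python 3.6+ for `ssl.PROTOCOL_TLS_CLIENT`. `ssl.enum_certificates` only exists on Windows, so the second helper returns an empty list elsewhere; both helper names are hypothetical.

```python
import ssl
import sys


def default_store_stats():
    """Load the platform's default trust roots and report what was loaded."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_default_certs()
    return context.cert_store_stats()


def windows_root_certificates():
    """Return raw entries from the Windows "ROOT" store, or [] off Windows."""
    if sys.platform != "win32":
        return []
    return ssl.enum_certificates("ROOT")


if __name__ == "__main__":
    print(default_store_stats())
    print(len(windows_root_certificates()), "certificates in the ROOT store")
```

On platforms where OpenSSL loads certificates lazily from a directory, the reported counts may be lower than the number of trusted roots actually available.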
[Python-Dev] PEP 476: Enabling certificate validation by default!
Hi all, I've just submitted PEP 476, on enabling certificate validation by default for HTTPS clients in Python. Please have a look and let me know what you think. PEP text follows. Alex --- PEP: 476 Title: Enabling certificate verification by default for stdlib http clients Version: $Revision$ Last-Modified: $Date$ Author: Alex Gaynor alex.gay...@gmail.com Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 28-August-2014 Abstract Currently when a standard library http client (the ``urllib`` and ``http`` modules) encounters an ``https://`` URL it will wrap the network HTTP traffic in a TLS stream, as is necessary to communicate with such a server. However, during the TLS handshake it will not actually check that the server has an X509 certificate is signed by a CA in any trust root, nor will it verify that the Common Name (or Subject Alternate Name) on the presented certificate matches the requested host. The failure to do these checks means that anyone with a privileged network position is able to trivially execute a man in the middle attack against a Python application using either of these HTTP clients, and change traffic at will. This PEP proposes to enable verification of X509 certificate signatures, as well as hostname verification for Python's HTTP clients by default, subject to opt-out on a per-call basis. Rationale = The S in HTTPS stands for secure. When Python's users type HTTPS they are expecting a secure connection, and Python should adhere to a reasonable standard of care in delivering this. Currently we are failing at this, and in doing so, APIs which appear simple are misleading users. When asked, many Python users state that they were not aware that Python failed to perform these validations, and are shocked. 
The popularity of ``requests`` (which enables these checks by default) demonstrates that these checks are not overly burdensome in any way, and the fact that it is widely recommended as a major security improvement over the standard library clients demonstrates that many expect a higher standard for security by default from their tools.

The failure of various applications to note Python's negligence in this matter is a source of *regular* CVE assignment [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_.

.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-4340
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-3533
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5822
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5825
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-1909
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2037
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2073
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2191
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-4111
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-6396
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-6444

Technical Details
=================

Python would use the system provided certificate database on all platforms. Failure to locate such a database would be an error, and users would need to explicitly specify a location to fix it.

This can be achieved by simply replacing the use of ``ssl._create_stdlib_context`` with ``ssl.create_default_context`` in ``http.client``.

Trust database
--------------

This PEP proposes using the system-provided certificate database. Previous discussions have suggested bundling Mozilla's certificate database and using that by default.
This was decided against for several reasons:

* Using the platform trust database imposes a lower maintenance burden on the Python developers -- shipping our own trust database would require doing a release every time a certificate was revoked.
* Linux vendors, and other downstreams, would unbundle the Mozilla certificates, resulting in a more fragmented set of behaviors.
* Using the platform stores makes it easier to handle situations such as corporate internal CAs.

Backwards compatibility
-----------------------

This change will have the appearance of causing some HTTPS connections to break, because they will now raise an Exception during handshake.

This is misleading, however: in fact these connections are presently failing silently. An HTTPS URL indicates an expectation of confidentiality and authentication. The fact that Python does not actually verify that the user's request has been made is a bug; further: "Errors should never pass silently."

Nevertheless, users who have a need to access servers with self-signed or incorrect certificates would be able to do so by providing a context with custom trust roots or which disables validation (documentation should strongly recommend the former where possible).

Users will also be able to add necessary certificates to system trust stores in order to trust them globally
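As an illustration of the per-call opt-out described under "Backwards compatibility", here is a minimal sketch of the two kinds of context a caller could build. It assumes a Python 3 where ``ssl.create_default_context`` is available; the helper names and the ``cafile`` argument value are hypothetical and are not part of the PEP text.

```python
import ssl


def make_verified_context(cafile=None):
    """Context that verifies certificates and hostnames.

    `cafile` is a hypothetical path to an extra CA bundle (for example a
    corporate internal CA); with None, the platform trust store is used.
    """
    return ssl.create_default_context(cafile=cafile)


def make_unverified_context():
    """Context that disables validation (the discouraged opt-out)."""
    ctx = ssl.create_default_context()
    # check_hostname has to be switched off before verify_mode can be
    # set to CERT_NONE, otherwise the ssl module raises ValueError.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

Either context could then be passed for a single request, e.g. ``urllib.request.urlopen(url, context=ctx)``, which keeps the opt-out scoped to one call rather than to the whole process.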
Re: [Python-Dev] PEP 476: Enabling certificate validation by default!
Thanks for the rapid feedback everyone! I want to summarize the action items and discussion points that have come up so far:

To add to the PEP:

* Emit a warning in 3.4.next for cases that would raise an Exception in 3.5
* Clearly state that the existing OpenSSL environment variables will be respected for setting the trust root

Discussion points:

* Disabling verification entirely externally to the program, through a CLI flag or environment variable. I'm pretty down on this idea: the problem you hit is that it's a pretty blunt instrument to swing, and it's almost impossible to imagine it not hitting things it shouldn't; it's far too likely to be used in applications that make two sets of outbound connections: 1) to some internal service which you want to disable verification on, and 2) some external service which needs strong validation. A global flag causes the latter to fail silently when subjected to a MITM attack, and that's exactly what we're trying to avoid. It also makes things much harder for library authors: I write an API client for some API, and make TLS connections to it. I want those to be verified by default. I can't even rely on the httplib defaults, because someone might disable them from the outside.

Cheers,
Alex
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
Donald Stufft donald at stufft.io writes: For the record I’ve had all of the problems that Nick states and I’m +1 on this change. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA I've hit basically every problem everyone here has stated, and in no uncertain terms am I completely opposed to deprecating anything. The Python 2 to 3 migration is already hard enough, and already proceeding far too slowly for many of our tastes. Making that migration even more complex would drive me to the point of giving up. Alex
Re: [Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!
Antoine Pitrou antoine at python.org writes: No, IIRC there shouldn't be a cycle. It's just complicated in a different way than 3.x Regards Antoine. Indeed, you're right, this is just differently convoluted so no leak (not that I would call collected by a normal GC a leak :-)). That said, I've hit another issue, with SNI callbacks. The first argument to an SNI callback is the socket. The callback is set up by some C code, which right now has access to only the _socket.socket object, not the ssl.SSLSocket object, which is what the public API needs there. Possible solutions are: * Pass the SSLObject *in addition* to the _socket.socket object to the C code. This generates some additional divergence from the Python3 code, but is probably basically straightforward. * Try to refactor the socket code in the same way as Python3 did, so we can pass *only* the SSLObject here. This is some nasty scope creep for PEP466, but would make the overall _ssl.c diff smaller. * Some super sweet and simple thing I haven't thought of yet. Thoughts? By way of a general status update, the only failing tests left are this, and a few things about SSLError's str(), so this will hopefully be ready to upload any day now for review. Cheers, Alex PS: Please review and merge http://bugs.python.org/issue22023 :-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!
Antoine Pitrou antoine at python.org writes: You mean for use with SSL_set_app_data? Yes, if you look in ``_servername_callback``, you can see where it uses ``SSL_get_app_data`` and then reads ``ssl->Socket``, which is supposed to be the same object that's returned by ``context.wrap_socket()``. Alex
[Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!
Hi all,

I've been happily working on the SSL module backports for Python2 (pursuant to PEP466), and I've hit something of a snag:

In Python3, the SSLSocket keeps a weak reference to the underlying socket, rather than a strong reference, as Python2 uses. Unfortunately, due to the way sockets work in Python2, this doesn't work: on Python2, _socketobject composes around _real_socket from the _socket module, whereas on Python3, it subclasses _socket.socket. Since you now have a Python-level class, you can weak reference it.

The question is:

a) Should we backport weak referencing _socket.sockets (changing the structure of the module seems overly invasive, albeit completely backwards compatible)?
b) Does anyone know why weak references are used in the first place? The commit message just alludes to fixing a leak with no reference to an issue.

Anyone who's interested in the state of the branch can see it at: github.com/alex/cpython on the backport-ssl branch. Note that many many tests are still failing, and you'll need to apply the patch from http://bugs.python.org/issue22023 to get it to work.

Thanks,
Alex

PS: Any help in getting http://bugs.python.org/issue22023 landed would be very much appreciated.
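The point about Python-level classes being weak-referenceable can be sketched with a stand-in type. This is only an analogy, not the socket code itself: a plain ``list`` plays the role of a C-level type without weak reference support, and the ``Wrapped`` subclass and ``can_weakref`` helper are hypothetical names introduced for the example.

```python
import weakref


class Wrapped(list):
    """Python-level subclass; its instances gain a __weakref__ slot."""


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        # Raised for types that do not support weak references.
        return False
    return True
```

Under these assumptions a bare ``list`` instance cannot be weakly referenced, while an instance of the subclass can, which mirrors the difference between composing around a C-level object and subclassing it.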
[Python-Dev] Network Security Backport Status
Hi all,

I wanted to bring everyone up to speed on the status of PEP 466, what's been completed, and what's left to do.

First the completed stuff:

* hmac.compare_digest
* hashlib.pbkdf2_hmac

Both are backported, and I've added support to use them in Django, so users should start seeing these benefits just as soon as we get a Python release into their hands.

Now the uncompleted stuff:

* Persistent file descriptor for ``os.urandom``
* SSL module

It's the SSL module that I'll spend the rest of this email talking about. Backporting the features from the Python3 version of this module has proven more difficult than I had expected. This is primarily because the stdlib took a maintenance strategy that was different from what most Python projects have done for their 2/3 support: multiple independent codebases.

I've tried a few different strategies for the backport, none of which has worked:

* Copying the ``ssl.py``, ``test_ssl.py``, and ``_ssl.c`` files from Python3 and trying to port all the code.
* Copying just ``test_ssl.py`` and then copying individual chunks/functions as necessary to get stuff to pass.
* Manually doing stuff.

All of these proved to be a massive undertaking, and made it too easy to accidentally introduce breaking changes.

I've come up with a new approach, which I believe is most likely to be successful, but I'll need help to implement it. The idea is to find the most recent commit which is a parent of both the ``2.7`` and ``default`` branches. Then take every single change to an ``ssl`` related file on the ``default`` branch, and attempt to replay it on the ``2.7`` branch. Require manual review on each commit to make sure it compiles, and to ensure it doesn't make any backwards incompatible changes.

I think this provides the most iterative and guided approach to getting this done. I can do all the work of reviewing each commit, but I need some help from a mercurial expert to automate the cherry-picking/rebasing of every single commit.
What do folks think? Does this approach make sense? Anyone willing to help with the mercurial scripting? Cheers, Alex
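For readers unfamiliar with the two functions listed as already backported, here is a minimal sketch of how they are typically combined. The helper names, the salt size, and the iteration count are assumptions made for illustration; they are not taken from the Django change mentioned above.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Derive a key from a password with PBKDF2-HMAC-SHA256.

    The iteration count and the 16-byte salt are illustrative choices.
    """
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, derived


def verify_password(password, salt, expected, iterations=100_000):
    """Check a password against a stored key in constant time."""
    _, derived = hash_password(password, salt, iterations)
    # compare_digest avoids leaking, through timing, how many leading
    # bytes of the two values match.
    return hmac.compare_digest(derived, expected)
```

The constant-time comparison matters because an ordinary ``==`` on bytes can return early at the first mismatching byte.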
Re: [Python-Dev] [Python-checkins] cpython: Remove the redundant and poorly worded warning message.
Hi python-dev and Raymond,

I think this change is a considerable usability regression for the documentation. Right now the warnings about CSPRNGs are hidden in the introductory paragraph, which users are likely to skip. I agree that there's no need to repeat the same advice twice, but I'd much rather we kept the .. warning:: version, so users are more likely to actually read it.

Also, there are a few errors in your commit message. First, we can reasonably assert that urandom provides an acceptable CSPRNG, mostly because it does on every platform I'm aware of. Second, urandom is still a pseudo-random number generator; it is cryptographically secure, not "more random". Wikipedia does a good job laying out the necessary properties for a CSPRNG: https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator#Requirements

Cheers,
Alex

On Sat, May 10, 2014 at 2:05 PM, raymond.hettinger python-check...@python.org wrote:

> http://hg.python.org/cpython/rev/b466dc34b86e
> changeset: 90618:b466dc34b86e
> parent: 90616:ce070040e1a6
> user: Raymond Hettinger pyt...@rcn.com
> date: Sat May 10 14:05:28 2014 -0700
> summary:
>   Remove the redundant and poorly worded warning message.
>
> The paragraph above already says, clearly and correctly, that "However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes."
>
> Also we should make any promises about SystemRandom or os.urandom() being cryptographically secure (they may be, but be can't validate that promise). Further, those are actual random number generators not psuedo-random number generators.
>
> files:
>   Doc/library/random.rst | 6 ------
>   1 files changed, 0 insertions(+), 6 deletions(-)
>
> diff --git a/Doc/library/random.rst b/Doc/library/random.rst
> --- a/Doc/library/random.rst
> +++ b/Doc/library/random.rst
> @@ -43,12 +43,6 @@
>  uses the system function :func:`os.urandom` to generate random numbers
>  from sources provided by the operating system.
>
> -.. warning::
> -
> -   The pseudo-random generators of this module should not be used for
> -   security purposes. Use :func:`os.urandom` or :class:`SystemRandom` if
> -   you require a cryptographically secure pseudo-random number generator.
> -
>
>  Bookkeeping functions:
>  ----------------------
>
> --
> Repository URL: http://hg.python.org/cpython
> ___
> Python-checkins mailing list
> python-check...@python.org
> https://mail.python.org/mailman/listinfo/python-checkins

--
I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire)
The people's good is the highest law. -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084
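The distinction the removed warning was drawing can be shown in a few lines. This is a minimal sketch with hypothetical helper names: the default ``random`` generator is fully determined by its seed, while ``os.urandom`` and ``random.SystemRandom`` draw from the operating system's source.

```python
import os
import random


def insecure_token(nbytes=16):
    """Bytes from the default generator: reproducible once seeded."""
    return bytes(random.getrandbits(8) for _ in range(nbytes))


def secure_token(nbytes=16):
    """Bytes from the operating system's CSPRNG."""
    return os.urandom(nbytes)


# SystemRandom offers the familiar random API on top of os.urandom.
_system_random = random.SystemRandom()


def secure_choice(seq):
    """Pick an element of seq using the OS-backed generator."""
    return _system_random.choice(seq)
```

Seeding the default generator twice with the same value yields the same "token" both times, which is exactly why it should not be used for secrets.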
Re: [Python-Dev] PEP 466 (round 5): selected network security enhancements for Python 2.7
This mostly looks good to me, however I'm not sure I understand the point of this sentence: Rather, it is intended to send a clear signal to potential corporate contributors that the core development team are willing to accept offers of corporate assistance in putting this policy into effect [...]. It's fairly evident to me that the folks most likely to actually do the work of implementing this are myself and Donald. This PEP really has nothing to do with corporate contribution, so I think this sentence ought to be removed. Alex
Re: [Python-Dev] PEP 466 (round 4): Python 2.7 network security enhancements
A casual glance at https://github.com/kennethreitz/requests/blob/master/requests/packages/urllib3/util.py#L610, which is probably the most widely used consumer of these APIs outside the stdlib itself, suggests that if these names were to suddenly show up, everything would continue to work just fine, with the advantage of being able to explicitly specify some options. All of which is to say: I don't think this is a real concern. Alex
Re: [Python-Dev] PEP 466 (round 4): Python 2.7 network security enhancements
At this point I think this PEP has become a little too vague and abstract, and I think we'd probably be better served by getting more concrete:

Problem: Some of Python 2's modules which are fundamentally necessary for interop with the broader internet, and the security thereof, are missing really important features. Right now Python 2 has a policy of getting absolutely no new features.

Solution: We're going to ignore that policy for a couple of pretty important features to that end.

Here's my proposed list of such features:

* hmac
  * constant_time_compare
* os
  * Persistent FD for os.urandom()
* ssl
  * SNI
  * SSLContext
  * A giant suite of constants from OpenSSL
  * The functions for checking a hostname against a certificate
  * The functions for finding the platform's certificate store

Alex
Re: [Python-Dev] PEP 466: Proposed policy change for handling network security enhancements
Thanks for putting this together Nick. I suspect it goes without saying that I'm wildly +1 on this as a whole. I'm in favor of leaving it somewhat implicit as to exactly which networking modules concern the health of the internet as a whole. Alex
Re: [Python-Dev] GC pauses in CPython
Maciej Fijalkowski fijall at gmail.com writes:

> HI
>
> I'm working on an incremental GC for PyPy. How do I measure GC pauses in CPython? (that is, the circular reference searching stuff)
>
> Cheers,
> fijal

For what it's worth I threw together some code that might be helpful: http://bpaste.net/show/140334/

If someone was interested it might be a cool idea to properly organize this and find a place to expose VM statistics like this. It'd also probably be useful to use sane units, and maybe (it's unclear to me) exclude some amount of finalizations (ideally I think you'd ignore user __del__ functions, but keep the bits of C code that decref other things and actually call free()).

Alex
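One way to measure cyclic GC pauses on a current CPython is sketched below. This is not the code behind the paste link above (its contents are not reproduced here); it is an assumption-laden alternative built on ``gc.callbacks``, a CPython hook available since Python 3.3. The ``GCPauseRecorder`` class name is hypothetical.

```python
import gc
import time


class GCPauseRecorder:
    """Record the wall-clock duration of each cyclic GC pass.

    Relies on gc.callbacks, whose callables receive a phase string
    ("start" or "stop") and an info dict that includes "generation".
    """

    def __init__(self):
        self.pauses = []  # list of (generation, seconds) tuples
        self._started = None

    def __call__(self, phase, info):
        if phase == "start":
            self._started = time.perf_counter()
        elif phase == "stop" and self._started is not None:
            elapsed = time.perf_counter() - self._started
            self.pauses.append((info["generation"], elapsed))
            self._started = None

    def __enter__(self):
        gc.callbacks.append(self)
        return self

    def __exit__(self, *exc_info):
        gc.callbacks.remove(self)
        return False
```

Used as ``with GCPauseRecorder() as rec: ...``, the ``rec.pauses`` list afterwards holds one entry per collection that ran inside the block. The timings include any finalizers run during the pass, so they do not address the question of excluding user ``__del__`` methods raised in the message.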
Re: [Python-Dev] performance of {} versus dict()
Stefan Behnel stefan_ml at behnel.de writes: Right. If that makes a difference, it's another bug. Stefan It's fixed, with, I will note, fewer lines of code than many messages in this thread: https://bitbucket.org/pypy/pypy/changeset/c30cb1dcb7a9adc32548fd14274e4995 Alex
Re: [Python-Dev] [compatibility-sig] do all VMs implement the ast module? (was: Re: AST optimizer implemented in Python)
Brett Cannon brett at python.org writes: Time to ask the other VMs what they are currently doing (the ast module came into existence in Python 2.6 so all the VMs should be answer the question since Jython is in alpha for 2.7 compatibility). As far as I know PyPy supports the ast module, and produces ASTs that are the same as CPython's. That said I do regard this as an implementation detail, further I'm guessing this is the context of the AST optimizer thread, and though I have neither the time nor the inclination to wade into that, put me down as -1 a) everything proposed there is possible, b) making this a front-and-center API makes it really easy to shoot themselves in the foot, by doing things like breaking Python with invalid optimizations (hint: almost every optimization proposed in that thread is invalid in the general case). Alex ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Requesting pronouncement on PEP 0424
On Mon, Jul 30, 2012 at 9:51 AM, Guido van Rossum gu...@python.org wrote: Also, I have a few content quibbles: - Is it really worth flagging a negative return value with ValueError? I'd just as well clip this to zero. What's the worry? That the computed value is wrong? But it's only meant to be a hint, and why would -1 be any more wrong than e.g. 10? This was done for consistency with len(), I'm not particularly attached to any behavior. - Did you mean to define operator.length_hint()? Of course :) - The default can be zero with no semantic impact, so I think there's no need to require the caller to specify a default. I suppose that's fair. - Most importantly: calling len(obj) and catching TypeError can only be a substitute for the real implementation, which IMO ought to check for the presence of a tp_len slot. Alas, checking hasattr(obj, '__len__') doesn't quite cut it either, since this returns true for a class object that defines a __len__ method for its instances (the class itself doesn't have a length). Still, I worry that calling len(obj) and catching all TypeErrors overspecifies the desired behavior; what I *want* to happen is to check if there is a __len__ method, and if so, call it and let any exceptions bubble through. It may be best to add a comment explaining that am implementation doesn't have to follow the letter of the Python code in the PEP, in particular, if obj *has* a __len__() method but calling it raises an exception, then length_hint(obj) may (ought to?) pass this exception on instead of calling obj.__length_hint__(). Seems reasonable, rather than try to spec that out precisely in the pseudocode (aka Python ;)) a note like you suggest sounds good. -- --Guido van Rossum (python.org/~guido) Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire) The people's good is the highest law. 
-- Cicero
Re: [Python-Dev] Requesting pronouncement on PEP 0424
Guido van Rossum guido at python.org writes: Looks good to me, so accepted. But why isn't it visible on python.org/dev/peps/ yet? I just realized the text in the python.org repo did not match what I had locally. I've pushed what I intended to be the latest text, if everyone could take a new look at that I would be very grateful. Sorry for the mixup. Alex
[Python-Dev] Requesting pronouncement on PEP 0424
Hi all, The discussion on PEP 0424 seems to have subsided (and I haven't gotten angry emails in a week!). So I would like to request a BDFL or BDFP pronouncement on PEP 0424, text available here: http://hg.python.org/peps/file/tip/pep-0424.txt Alex
Re: [Python-Dev] A new JIT compiler for a faster CPython?
That's not, strictly speaking, true. Mozilla added a method-JIT (Jaegermonkey) and then added another one (IonMonkey) because their tracing JIT (Tracemonkey) was bad. There's no fundamental reason that tracing has to only cover loops, indeed PyPy's tracing has been generalized to compile individual functions, recursion, etc. And any profiling JIT, in practice, needs a compile heuristic for how many calls must occur before a unit is compiled, even the Hotspot JVM has one. Alex
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Victor Stinner victor.stinner at gmail.com writes:

> Example:
>
>     a = GETLOCAL(0); # a
>     if (a == NULL) /* error */
>     b = GETLOCAL(1); # b
>     if (b == NULL) /* error */
>     return PyNumber_Add(a, b);
>
> I don't expect to run a program 10x faster, but I would be happy if I can run arbitrary Python code 25% faster.
>
> --
>
> Specialization / tracing JIT can be seen as another project, or at least added later.
>
> Victor

This is almost exactly what Unladen Swallow originally did. First, LLVM will not do all of the optimizations you are expecting it to do out of the box. It will still have all the stack accesses, and it will have all of the ref counting operations. You can get a small speed boost from removing the interpretation dispatch overhead, but you also explode your memory usage, and the speedups are tiny.

Please, learn from Unladen Swallow and others' experiences, otherwise they're for naught, and frankly we (python-dev) waste a lot of time.

Alex
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
I've updated the PEP to reflect the discussion. There are two major changes:

1) NotImplemented may be used by __length_hint__ to indicate that there is no finite length hint available.
2) Callers of operator.length_hint() must provide their own default value; this is also required by the current C _PyObject_LengthHint implementation.

There are no provisions for infinite iterators; that is not within the scope of this proposal.

Alex
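The two changes above can be sketched with the ``operator.length_hint`` function as it later shipped in Python 3.4 (where, as far as I know, the ``default`` argument ended up optional with a value of 0, unlike the requirement stated in this message). The ``LazyStream`` class is a hypothetical example of the "lazy stream that cached itself" case raised earlier in the thread.

```python
import operator


class LazyStream:
    """Iterable that only knows its size once it has been evaluated."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = None

    def evaluate(self):
        """Consume the source once and cache the items."""
        if self._cache is None:
            self._cache = list(self._source)
        return self._cache

    def __iter__(self):
        return iter(self.evaluate())

    def __length_hint__(self):
        # Before evaluation there is no sensible estimate, so signal
        # "no hint available" and let the caller's default apply.
        if self._cache is None:
            return NotImplemented
        return len(self._cache)
```

With this class, ``operator.length_hint(stream, 10)`` falls back to the caller's default of 10 until ``evaluate()`` has run, and reports the cached length afterwards.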
[Python-Dev] PEP 0424: A method for exposing a length hint
Hi all,

I've just submitted a PEP proposing making __length_hint__ a public API for users to define and other VMs to implement:

PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date$
Author: Alex Gaynor alex.gay...@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4

Abstract
========

CPython currently defines an ``__length_hint__`` method on several types, such as various iterators. This method is then used by various other functions (such as ``map``) to presize lists based on the estimate returned by ``__length_hint__``. Types which are not sized, and thus should not define ``__len__``, but which can estimate or compute a size (such as many iterators), can then define ``__length_hint__``.

Proposal
========

This PEP proposes formally documenting ``__length_hint__`` for other interpreters and non-standard library Python to implement.

``__length_hint__`` must return an integer, and is not required to be accurate. It may return a value that is either larger or smaller than the actual size of the container. It may raise a ``TypeError`` if a specific instance cannot have its length estimated. It may not return a negative value.

Rationale
=========

Being able to pre-allocate lists based on the expected size, as estimated by ``__length_hint__``, can be a significant optimization. CPython has been observed to run some code faster than PyPy, purely because of this optimization being present.

Open questions
==============

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in its constructor for supplying a length hint?
* Should a function be added either to ``builtins`` or some other module which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``?

Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8

Alex
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson benja...@python.orgwrote: 2012/7/14 Alex Gaynor alex.gay...@gmail.com: Proposal This PEP proposes formally documenting ``__length_hint__`` for other interpreter and non-standard library Python to implement. ``__length_hint__`` must return an integer, and is not required to be accurate. It may return a value that is either larger or smaller than the actual size of the container. It may raise a ``TypeError`` if a specific instance cannot have its length estimated. It may not return a negative value. And what happens if you return a negative value? ValueError, the same as with len. Rationale = Being able to pre-allocate lists based on the expected size, as estimated by ``__length_hint__``, can be a significant optimization. CPython has been observed to run some code faster than PyPy, purely because of this optimization being present. Open questions == There are two open questions for this PEP: * Should ``list`` expose a kwarg in it's constructor for supplying a length hint. * Should a function be added either to ``builtins`` or some other module which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``. Let's try to keep this as limited as possible for a public API. Sounds reasonable to me! Should we just go ahead and strip those out now? -- Regards, Benjamin Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire) The people's good is the highest law. -- Cicero ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sat, Jul 14, 2012 at 10:16 PM, Nick Coghlan ncogh...@gmail.com wrote:

> On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson benja...@python.org wrote:
> > > Open questions
> > > ==============
> > >
> > > There are two open questions for this PEP:
> > >
> > > * Should ``list`` expose a kwarg in it's constructor for supplying a length hint.
> > > * Should a function be added either to ``builtins`` or some other module which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
> >
> > Let's try to keep this as limited as possible for a public API.
>
> Length hints are very useful for *any* container implementation, whether those containers are in the standard library or not. Just as we exposed operator.index when __index__ was added, we should expose an operator.length_hint function with the following semantics:
>
>     def length_hint(obj):
>         """Return an estimate of the number of items in obj.
>
>         This is useful for presizing containers when building from an iterable.
>
>         If the object supports len(), the result will be exact. Otherwise, it
>         may over or underestimate by an arbitrary amount. The result will be
>         an integer >= 0.
>         """
>         try:
>             return len(obj)
>         except TypeError:
>             try:
>                 get_hint = obj.__length_hint__
>             except AttributeError:
>                 return 0
>             hint = get_hint()
>             if not isinstance(hint, int):
>                 raise TypeError("Length hint must be an integer, not %r" % type(hint))
>             if hint < 0:
>                 raise ValueError("Length hint (%r) must be >= 0" % hint)
>             return hint
>
> There's no reason to make pure Python container implementations reimplement all that for themselves.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

Sounds reasonable to me. The only issue with your pseudocode (err... I mean Python ;)) is that there's no way for __length_hint__ to specify that a particular instance can't have a length hint computed, e.g. imagine some sort of lazy stream that cached itself, and only wanted to offer a length hint if it had already been evaluated. Without an exception to raise, it has to return whatever the magic value for length_hint is (in your impl it appears to be 0; the current _PyObject_LengthHint method in CPython has a required `default` parameter). The PEP proposes using TypeError for that. Anyways, that code looks good, do you want to add it to the PEP?

Alex

--
I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire)
The people's good is the highest law. -- Cicero
Re: [Python-Dev] [compatibility-sig] making sure importlib.machinery.SourceLoader doesn't throw an exception if bytecode is not supported by a VM
For PyPy: I'm not an expert in our import, but from looking at the source 1) imp.cache_from_source is unimplemented, it's an AttributeError. 2) sys.dont_write_bytecode is always false, we don't respect that flag (we really should IMO, but it's not a high priority for me, or anyone else apparently) Alex ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [compatibility-sig] making sure importlib.machinery.SourceLoader doesn't throw an exception if bytecode is not supported by a VM
On Tue, Jun 12, 2012 at 11:47 AM, Brett Cannon br...@python.org wrote: On Tue, Jun 12, 2012 at 12:38 PM, Alex Gaynor alex.gay...@gmail.comwrote: For PyPy: I'm not an expert in our import, but from looking at the source 1) imp.cache_from_source is unimplemented, it's an AttributeError. Well, you will have it come Python 3.3 one way or another. =) Sure, I'm not totally up to speed on the py3k effort. 2) sys.dont_write_bytecode is always false, we don't respect that flag (we really should IMO, but it's not a high priority for me, or anyone else apparently) But doesn't PyPy read and write .pyc files ( http://doc.pypy.org/en/latest/config/objspace.usepycfiles.html suggests you do)? So I would assume you are not affected by this. Jython and IronPython, though, would be (I think). This is a compile time option, not a runtime option. However, it looks like I lied, someone did implement it correctly, so we have the same behavior as CPython. Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire) The people's good is the highest law. -- Cicero ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
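For readers following the thread, the two checks under discussion can be sketched like this. It is a minimal illustration, assuming a current Python 3 where the function lives at `importlib.util.cache_from_source` (the thread refers to the older `imp.cache_from_source`); the helper name `bytecode_path_or_none` is made up for this example.

```python
import importlib.util
import sys


def bytecode_path_or_none(source_path):
    """Return the cache path for source_path, or None if bytecode is not written."""
    if sys.dont_write_bytecode:
        # The interpreter was asked not to write bytecode files.
        return None
    try:
        return importlib.util.cache_from_source(source_path)
    except NotImplementedError:
        # Raised when the implementation has no bytecode cache tag.
        return None
```

A loader written along these lines degrades to "no bytecode" instead of raising on a VM that does not support it.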
Re: [Python-Dev] backporting stdlib 2.7.x from pypy to cpython
Eric Snow ericsnowcurrently at gmail.com writes:

> Nick's option 2 would be an improvement, but I imagine that option 3 would have been the most effective by far. Of course, the key thing is how closely the various implementors would follow the new list. Only they could say, though Frank Wierzbicki seemed positive about it.
>
> -eric

I'm +1 on such a list. I don't have the time to follow every single thread on python-dev, and I'm sure I miss a lot of things; having a dedicated place for things I know are relevant to my work would be a great help.

Alex
Re: [Python-Dev] Add a frozendict builtin type
Nick Coghlan ncoghlan at gmail.com writes: I'm pretty sure the PyPy jit can already pick up and optimise cases where a dict goes read-only (i.e. stops being modified). No, it doesn't. We handle cases like a type's dict, or a module's dict, by having them use a different internal implementation (while, of course, still being dicts at the Python level). We do *not* handle the case of trying to figure out whether a Python object is immutable in any way. Alex ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
Brett Cannon brett at python.org writes:

> IOW you want the sys.modules case fast, which I will never be able to match compared to C code since that is pure execution with no I/O.

Sure you can: have a really fast Python VM.

Constructive: if you can run this code under PyPy it'd be easy to just:

    $ pypy -mtimeit "import struct"
    $ pypy -mtimeit -s "import importlib" "importlib.import_module('struct')"

Or whatever the right API is.

Alex
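The same comparison can be run from inside Python with the `timeit` module. This is a rough sketch only: the function name `time_cached_imports` and the default iteration count are assumptions, and the numbers it produces will vary by interpreter and machine.

```python
import timeit


def time_cached_imports(number=100000):
    """Time the import statement against importlib for an already-imported module."""
    # After the first iteration both statements hit the sys.modules cache.
    statement_time = timeit.timeit("import struct", number=number)
    function_time = timeit.timeit(
        "importlib.import_module('struct')",
        setup="import importlib",
        number=number,
    )
    return statement_time, function_time
```

Running it under both CPython and PyPy would give the side-by-side figures the message asks for.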
Re: [Python-Dev] Hash collision security issue (now public)
A few thoughts on this:

a) This is not a new issue; I'm curious what the new interest in it is.

b) Whatever the solution to this is, it is *not* CPython specific; any decision should be reflected in the Python language spec, IMO. If CPython has the semantic that dicts aren't vulnerable to hash collisions, then users *will* rely on this, and another implementation having a different (valid) behavior opens users up to security issues.

c) I'm not convinced a randomized hash is appropriate for the default dict, for a number of reasons: it's a performance hit on every dict operation, using a per-process seed means you can't compute the hash of an object at Python's compile time, and a per-dict seed inhibits a bunch of other optimizations. These may not be relevant to CPython, but they are to PyPy and probably the invoke-dynamic work on Jython (pursuant to point b).

Therefore I think these should be considered application issues. Since request limiting is difficult and error prone, I'd recommend the Python stdlib include a non-hash-based map (such as a binary tree).

Alex
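To make the last suggestion concrete, here is a minimal sketch of a map whose lookups depend on key ordering rather than on hash values. Note the substitution: it uses a sorted list with binary search (`bisect`) instead of the binary tree mentioned in the message, so lookups are O(log n) but inserts are O(n) because of list shifting. The class name `SortedMap` is hypothetical, and keys are assumed to be mutually orderable.

```python
import bisect


class SortedMap:
    """Comparison-based mapping; colliding hashes cannot degrade its lookups."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        # Binary search for the position where key is, or would be inserted.
        index = bisect.bisect_left(self._keys, key)
        found = index < len(self._keys) and self._keys[index] == key
        return index, found

    def __setitem__(self, key, value):
        index, found = self._find(key)
        if found:
            self._values[index] = value
        else:
            self._keys.insert(index, key)
            self._values.insert(index, value)

    def __getitem__(self, key):
        index, found = self._find(key)
        if not found:
            raise KeyError(key)
        return self._values[index]

    def __len__(self):
        return len(self._keys)
```

A balanced tree would keep inserts logarithmic as well; the point here is only that no hash function is involved.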
Re: [Python-Dev] RFC: Add a new builtin strarray type to Python?
There are a number of issues that are being conflated by this thread.

1) Should str += str be fast? In my opinion, the answer is an obvious and resounding no. Strings are immutable, thus repeated string addition is O(n**2). This is a natural and obvious conclusion. Attempts to change this are only truly possible on CPython, and thus create a worse environment for other Pythons, as well as being quite misleading, since they'll be extremely brittle. It's worth noting that, to my knowledge, JVMs haven't attempted hacks like this.

2) Should we have a mutable string? Personally I think this question just misses the point. No one actually wants a mutable string; the closest thing anyone asks for is faster string building, which can be solved by a far more specialized thing (see (3)) without all the API hangups of "What methods mutate?", "Should it have every str method?", or "Is it a drop-in replacement?".

3) And, finally, the question that prompted this entire thing: can we have a better way of incremental string building than the current list + str.join method? Personally I think unless your interest is purely in getting the most possible speed out of Python, the current idiom is probably acceptable. That said, if you want to get the most possible speed, a StringBuilder in the vein PyPy offers is the only sane way. It's able to be faster because it has very few ways to interact with it, and once you're done it reuses its buffer to create the Python-level string object, which is to say there's no need to copy it at the end.

As I said, unless your interest is maximum performance, there's nothing wrong with the current idiom, and we'd do well to educate our users, rather than have more hacks.

Alex
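For reference, the "list + str.join" idiom referred to in point 3 looks roughly like this, next to the `io.StringIO` alternative from the standard library. Both function names are made up for this illustration, and neither is the PyPy StringBuilder mentioned above.

```python
import io


def build_with_join(pieces):
    """Collect fragments in a list and join once at the end."""
    chunks = []
    for piece in pieces:
        chunks.append(piece)
    # A single join avoids the repeated copying of chained += on str.
    return "".join(chunks)


def build_with_stringio(pieces):
    """Same result using an in-memory text buffer."""
    buffer = io.StringIO()
    for piece in pieces:
        buffer.write(piece)
    return buffer.getvalue()
```

Both variants do linear total work in the size of the output, regardless of which Python implementation runs them.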
Re: [Python-Dev] Possible optimization for LOAD_FAST ?
On Tue, Jan 4, 2011 at 10:20 AM, Nick Coghlan ncogh...@gmail.com wrote: On Wed, Jan 5, 2011 at 1:58 AM, Guido van Rossum gu...@python.org wrote: On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord fuzzy...@voidspace.org.uk wrote: I think someone else pointed this out, but replacing builtins externally to a module is actually common for testing. In particular replacing the open function, but also other builtins, is often done temporarily to replace it with a mock. It seems like this optimisation would break those tests. Hm, I already suggested to make an exception for open, (and one should be added for __import__) but if this is done for other builtins that is indeed a problem. Can you point to example code doing this? I've seen it done to write tests for simple CLI behaviour by mocking input() and print() (replacing sys.stdin and sys.stdout instead is far more common, but replacing the functions works too). If compile() accepted a blacklist of builtins that it wasn't allowed to optimise, then that should deal with the core of the problem as far as testing goes. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia Ugh, I can't be the only one who finds these special cases to be a little nasty? Special cases aren't special enough to break the rules. Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire) The people's good is the highest law. -- Cicero Code can always be simpler than you think, but never as simple as you want -- Me ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
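The testing pattern described in the quoted messages, replacing a builtin from outside the module under test, can be sketched as below. It assumes a current Python 3 with `unittest.mock` in the standard library (at the time of the thread, mock was a third-party package); `read_greeting` and `check_with_mocked_open` are hypothetical names.

```python
from unittest import mock


def read_greeting(path):
    """Code under test: it looks up the builtin open() at call time."""
    with open(path) as handle:
        return handle.read().strip()


def check_with_mocked_open():
    """Temporarily replace builtins.open, as a test would."""
    fake_open = mock.mock_open(read_data="hello\n")
    with mock.patch("builtins.open", fake_open):
        result = read_greeting("/no/such/file")
    fake_open.assert_called_once_with("/no/such/file")
    return result
```

An optimization that resolved `open` when `read_greeting` was compiled would never see the replacement, which is the breakage being discussed.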
Re: [Python-Dev] Possible optimization for LOAD_FAST ?
Cesare Di Mauro cesare.di.mauro at gmail.com writes:

> 2010/12/28 Lukas Lueg lukas.lueg at googlemail.com
>
>> Consider the following code:
>>
>>     def foobar(x):
>>         for i in range(5):
>>             x[i] = i
>>
>> The bytecode in python 2.7 is the following:
>>
>>       2           0 SETUP_LOOP              30 (to 33)
>>                   3 LOAD_GLOBAL              0 (range)
>>                   6 LOAD_CONST               1 (5)
>>                   9 CALL_FUNCTION            1
>>                  12 GET_ITER
>>                  13 FOR_ITER                16 (to 32)
>>                  16 STORE_FAST               1 (i)
>>
>>       3          19 LOAD_FAST                1 (i)
>>                  22 LOAD_FAST                0 (x)
>>                  25 LOAD_FAST                1 (i)
>>                  28 STORE_SUBSCR
>>                  29 JUMP_ABSOLUTE           13
>>                  32 POP_BLOCK
>>                  33 LOAD_CONST               0 (None)
>>                  36 RETURN_VALUE
>>
>> Can't we optimize the LOAD_FAST in lines 19 and 25 to a single load and put the reference twice on the stack? There is no way that the reference of i might change in between the two lines. Also, the load_fast in line 22 to reference x could be taken out of the loop as x will always point to the same object
>
> Yes, you can, but you need:
>
> - a better AST evaluator (to mark symbols/variables with proper attributes);
> - a better optimizer (usually located in compile.c) which has a global vision (not limited to single instructions and/or single expressions).
>
> It's not that simple, and the results aren't guaranteed to be good.
>
> Also, consider that Python, as a dynamic-and-not-statically-compiled language, needs to find a good trade-off between compilation time and execution.
>
> Just to be clear, a C program is usually compiled once, then executed, so you can spend even *hours* to better optimize the final binary code.
>
> With a dynamic language, usually the code is compiled and then executed as needed, in realtime. So it is neither practical nor desirable to wait too long before execution begins (the startup problem).
>
> Python stays in a gray area, because modules are usually compiled once (when they are first used), and executed many times, but it isn't the only case.
>
> You cannot assume that optimization techniques used on other (static) languages can be used/ported in Python.
>
> Cesare

No, it's singularly impossible to prove that any global load will be any given value at compile time. Any optimization based on this premise is wrong.

Alex
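The closing point about global loads can be illustrated with a small sketch. The names `first_three` and `demo` are invented here; the example rebinds `range` in the function's global namespace between two calls, which is exactly the kind of change a compile-time assumption about a global could not anticipate.

```python
def first_three():
    # `range` is looked up at call time: globals first, then builtins.
    return list(range(3))


def demo():
    before = first_three()
    # Shadow the builtin in the function's own global namespace.
    first_three.__globals__["range"] = lambda n: ["spam"] * n
    try:
        after = first_three()
    finally:
        # Remove the shadowing name so the builtin is visible again.
        del first_three.__globals__["range"]
    restored = first_three()
    return before, after, restored
```

The bytecode of `first_three` is identical across all three calls, yet the middle call yields a different value.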
[Python-Dev] Very Strange Argument Handling Behavior
Hi all,

I ran into the following behavior while making sure Django works correctly on PyPy. The following behavior was observed in all tested versions of CPython (2.5, 3.1):

    >>> def f(**kwargs):
    ...     print(kwargs)
    ...
    >>> kwargs = {1: 3}
    >>> dict({}, **kwargs)
    {1: 3}
    >>> f(**kwargs)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: f() keywords must be strings

This behavior seems pretty strange to me; indeed, PyPy gives the TypeError for both attempts. I just wanted to confirm that it was in fact intentional.

Thanks,
Alex

--
I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire
The people's good is the highest law. -- Cicero
Code can always be simpler than you think, but never as simple as you want -- Me
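For code that has to behave the same on every implementation, one way to sidestep the question is to avoid `**` unpacking for mappings that may have non-string keys. The sketch below is an illustration with made-up names (`merge_options`, `accepts_keywords`, `rejects_non_string_keywords`); it relies only on `dict.update` and on Python-level functions rejecting non-string keyword names.

```python
def merge_options(base, extra):
    """Merge two mappings without going through keyword-argument unpacking."""
    merged = dict(base)
    merged.update(extra)  # works for any hashable keys, not just strings
    return merged


def accepts_keywords(**kwargs):
    return kwargs


def rejects_non_string_keywords(mapping):
    """Return True if unpacking mapping as keyword arguments raises TypeError."""
    try:
        accepts_keywords(**mapping)
    except TypeError:
        return True
    return False
```

Using `merge_options({}, {1: 3})` gives the same result as the `dict({}, **kwargs)` call in the session above without depending on how a given interpreter treats non-string keywords.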
Re: [Python-Dev] O(1) random access to deque? (Re: patch to make list.pop(0) work in O(1) time)
On Wed, Jan 27, 2010 at 4:50 PM, Nick Coghlan ncogh...@gmail.com wrote: Steve Howell wrote: There is also the possibility that my initial patch can be refined by somebody smarter than myself to eliminate the particular tradeoff. In fact, Antoine Pitrou already suggested an approach, although I agree that it kind of pushes the boundary of sanity. :) I'm actually wondering if you could apply some of the implementation strategies discussed here to grant O(1) random access to arbitrary elements of a deque. I haven't looked at the deque code in a long time, but I believe the memory structure is already larger than that for a basic list. Reworking the way that extra space is used may be a more fruitful strategy than trying to convince anyone that it is worth changing the list implementation for this corner case. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia --- ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/alex.gaynor%40gmail.com I don't see how that's possible. The linked list is a pretty well known data structure and arbitrary lookups are O(n) in it. Using the unrolled-linked-list data structure python uses you can make it faster by a constant factor, but not O(1). There are other structures like skip-lists that have O(log n) arbitrary lookups though. If someone could make an O(1) linked-list I'd love to see it :) Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire The people's good is the highest law. -- Cicero Code can always be simpler than you think, but never as simple as you want -- Me ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
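The constant-factor argument can be made visible with a simplified model of an unrolled linked list. This is not CPython's deque implementation; `Block`, `build` and `lookup` are hypothetical names, and the lookup returns the number of block hops alongside the value so the cost is easy to inspect.

```python
class Block:
    """One node of an unrolled linked list holding several items."""

    def __init__(self, items):
        self.items = items
        self.next = None


def build(values, block_size):
    """Chain values into blocks of at most block_size items."""
    head = tail = None
    for start in range(0, len(values), block_size):
        node = Block(values[start:start + block_size])
        if head is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head


def lookup(head, index):
    """Return (value, hops); hops grows with index / block_size."""
    hops = 0
    node = head
    while node is not None:
        if index < len(node.items):
            return node.items[index], hops
        index -= len(node.items)
        node = node.next
        hops += 1
    raise IndexError("index out of range")
```

With blocks of 10 items, reaching index 95 takes 9 hops instead of 95, which is faster by a constant factor but still linear in the index.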
Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
On Wed, Jan 27, 2010 at 11:30 PM, Steve Howell showel...@yahoo.com wrote: --- On Wed, 1/27/10, Raymond Hettinger raymond.hettin...@gmail.com wrote: From: Raymond Hettinger raymond.hettin...@gmail.com * the current design encourages people to use the right data structure for a given need. the proposed change makes the trades-off murky and implementation dependent. Are you saying that the current slowness of list for prepops helps people to choose more appropriate data structures? Really you have to know a lot more in order to make the right choices. that's not good for usability. we want tools that are easy to use correctly/well. If you want tools that are easy to use correctly, make them bug-free and document their behavior. If you want tools that are easy to use well, then make them perform better. I am not sure how my patch contradicts either of these goals. You keep making the argument that deque is a better alternative to list in many situations. I actually agree with you. Most programming problems are best modelled by a queue. I am not sure why Python lists get all the syntax sugar and promotion over deque, when in reality, Python lists implement a pretty useless data structure. Python lists are a glorification of a C array built on top of a memory-upward-biased memory allocator. As such, they optimize list appends (good) but fail awfully on list prepops (bad). They are much better as stacks than queues, even though queues are more useful for the most common programming known to man--work through a work queue and delete tasks when they are done. It is not surprising that Python lists are starting to show their lack of versatility in 2010. They're based on 1970's technology. Python lists are really just a thin encapsulation of C arrays built on top of an asymmetrical memory manager. In 2010 you could improve Python lists by releasing from the constraints of 1970s semantics. 
But I am starting to think a more modern approach would be to take more useful data structures like deques and give them more sugar. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/alex.gaynor%40gmail.com Python lists implement a pretty useless data structure It's very difficult for ideas to gain traction when they contain such useless, and obviously wrong, rhetoric. There's an enormous body of code out there that begs to disagree with this assertion. Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire The people's good is the highest law. -- Cicero Code can always be simpler than you think, but never as simple as you want -- Me ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
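The work-queue use case mentioned in the quoted message is what `collections.deque` already covers. A minimal sketch, with `drain` as a made-up helper name:

```python
from collections import deque


def drain(tasks):
    """Process tasks in FIFO order, removing each one as it is handled."""
    queue = deque(tasks)
    done = []
    while queue:
        # popleft() is O(1); list.pop(0) shifts every remaining element.
        done.append(queue.popleft())
    return done
```

The list-based equivalent, `while tasks: tasks.pop(0)`, gives the same order but does quadratic work on long queues under the list implementation being debated.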
Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
On Mon, Jan 25, 2010 at 7:32 PM, Michael Foord fuzzy...@voidspace.org.uk wrote:

> On 26/01/2010 00:28, Christian Heimes wrote:
>
>> Michael Foord wrote:
>>
>>> How great is the complication? Making list.pop(0) efficient sounds like a worthy goal, particularly given that the reason you don't use it is because you *know* it is inefficient (so the fact that you don't use it isn't evidence that it isn't wanted - merely evidence that you had to work around the known inefficiency).
>>
>> The implementation must be changed in at least four places:
>>
>> * The PyListObject struct gets an additional pointer that stores a reference to the head. I would keep the head (element 0) of the list in **ob_item and the reference to the malloc()ed array in a new pointer *ob_allocated.
>> * PyList_New() stores the pointer to the allocated memory in op->ob_allocated and sets op->ob_item = op->ob_allocated
>> * listpop() moves the op->ob_item pointer by one for the special case of pop(0)
>> * list_resize() should occasionally compact the free space before the head with memcpy() if it gets too large.
>>
>> listinsert() could be optimized for 0 if the list has some free space in front of the header, too.
>>
>> I favor this approach over an integer offset because it doesn't change the semantics of ob_item.
>>
>> Christian
>
> Well, on the face of it this doesn't sound like a huge increase in complexity. Not that I'm qualified to judge.
>
> Michael
>
> --
> http://www.ironpythoninaction.com/
> http://www.voidspace.org.uk/blog
>
> READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

Does anyone know if any other language's automatic array (or whatever they call it) special-cases pop(0) like this?

Alex

--
I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire
The people's good is the highest law. -- Cicero
Code can always be simpler than you think, but never as simple as you want -- Me
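The scheme outlined in the quoted message can be modelled in pure Python to see how the pieces fit. This is a sketch, not the proposed C patch: it uses an integer head offset rather than the pointer arithmetic the message favors, and the class name `HeadOffsetList`, the method `pop_front`, and the compaction threshold are all assumptions.

```python
class HeadOffsetList:
    """Toy list that pops from the front by advancing a head offset."""

    def __init__(self, items=()):
        self._buffer = list(items)
        self._head = 0  # index of logical element 0 inside the buffer

    def __len__(self):
        return len(self._buffer) - self._head

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return self._buffer[self._head + index]

    def append(self, item):
        self._buffer.append(item)

    def pop_front(self):
        if len(self) == 0:
            raise IndexError("pop from empty list")
        item = self._buffer[self._head]
        self._buffer[self._head] = None  # drop the reference early
        self._head += 1
        # Occasionally compact the dead space before the head.
        if self._head > 16 and self._head * 2 > len(self._buffer):
            del self._buffer[:self._head]
            self._head = 0
        return item
```

Each compaction copies at most as many live items as were popped since the previous one, so front pops stay amortized O(1) in this model.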
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Fri, Jan 22, 2010 at 1:07 PM, Collin Winter collinwin...@google.com wrote:

> Hey Jake,
>
> On Thu, Jan 21, 2010 at 10:48 AM, Jake McGuire mcgu...@google.com wrote:
>
>> On Thu, Jan 21, 2010 at 10:19 AM, Reid Kleckner r...@mit.edu wrote:
>>
>>> On Thu, Jan 21, 2010 at 12:27 PM, Jake McGuire mcgu...@google.com wrote:
>>>
>>>> On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com wrote:
>>>>
>>>>> Profiling
>>>>> ---------
>>>>>
>>>>> Unladen Swallow integrates with oProfile 0.9.4 and newer [#oprofile]_ to support assembly-level profiling on Linux systems. This means that oProfile will correctly symbolize JIT-compiled functions in its reports.
>>>>
>>>> Do the current python profiling tools (profile/cProfile/pstats) still work with Unladen Swallow?
>>>
>>> Sort of. They disable the use of JITed code, so they don't quite work the way you would want them to. Checking tstate->c_tracefunc every line generated too much code. They still give you a rough idea of where your application hotspots are, though, which I think is acceptable.
>>
>> Hmm. So cProfile doesn't break, but it causes code to run under a completely different execution model so the numbers it produces are not connected to reality?
>>
>> We've found the call graph and associated execution time information from cProfile to be extremely useful for understanding performance issues and tracking down regressions. Giving that up would be a huge blow.
>
> FWIW, cProfile's call graph information is still perfectly accurate, but you're right: turning on cProfile does trigger execution under a different codepath. That's regrettable, but instrumentation-based profiling is always going to introduce skew into your numbers. That's why we opted to improve oProfile, since we believe sampling-based profiling to be a better model.
>
> Profiling was problematic to support in machine code because in Python, you can turn profiling on from user code at arbitrary points. To correctly support that, we would need to add lots of hooks to the generated code to check whether profiling is enabled, and if so, call out to the profiler. Those "is profiling enabled now?" checks are (almost) always going to be false, which means we spend cycles for no real benefit.
>
> Can YouTube use oProfile for profiling, or is instrumented profiling critical? oProfile does have its downsides for profiling user code: you see all the C-language support functions, not just the pure-Python functions. That extra data might be useful, but it's probably more information than most people want. YouTube might want it, though.
>
> Assuming YouTube can't use oProfile as-is, there are some options:
>
> - Write a script around oProfile's reporting tool to strip out all C functions from the report. Enhance oProfile to fix any deficiencies compared to cProfile's reporting.
> - Develop a sampling profiler for Python that only samples pure-Python functions, ignoring C code (but including JIT-compiled Python code).
> - Add the necessary profiling hooks to JITted code to better support cProfile, but add a command-line flag (something explicit like -O3) that removes the hooks and activates the current behaviour (or something even more restrictive, possibly).
> - Initially compile Python code without the hooks, but have a trip-wire set to detect the installation of profiling hooks. When profiling hooks are installed, purge all machine code from the system and recompile all hot functions to include the profiling hooks.
>
> Thoughts?
>
> Collin Winter

What about making profiling something more tied to the core VM? So profiling is either enabled or disabled for the course of the run of the application, not something that can be enabled or disabled arbitrarily. This way there's no overhead in JIT compiled code without profiling, and profiling has no worse overhead than it would in the VM loop. It's a slightly different semantic to profiling, but I wonder whether there's really any value to the other way?

Alex

--
I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire
The people's good is the highest law. -- Cicero
Code can always be simpler than you think, but never as simple as you want -- Me
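The "turn profiling on from user code at arbitrary points" behaviour that the thread is weighing looks like this with the standard `cProfile` and `pstats` modules. The function names `work` and `profile_second_call` are invented for the example; only the second call to `work` runs while the profiler is active.

```python
import cProfile
import io
import pstats


def work(n):
    return sum(i * i for i in range(n))


def profile_second_call():
    """Enable the profiler midway through a run and return its report."""
    work(1000)  # runs before profiling is switched on
    profiler = cProfile.Profile()
    profiler.enable()  # profiling starts at an arbitrary point in user code
    work(2000)
    profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats()
    return report.getvalue()
```

Under the alternative suggested in the reply, the enable and disable calls would go away and the choice would be made once for the whole process, for example at interpreter startup.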
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Thu, Jan 21, 2010 at 3:00 PM, Steve Steiner (listsin) list...@integrateddevcorp.com wrote:

> On Jan 21, 2010, at 3:20 PM, Collin Winter wrote:
>
>> Hey Greg,
>>
>> On Wed, Jan 20, 2010 at 10:54 PM, Gregory P. Smith g...@krypto.org wrote:
>>
>>> +1
>>>
>>> My biggest concern is memory usage but it sounds like addressing that is already in your mind. I don't so much mind an additional up front constant and per-line-of-code hit for instrumentation but leaks are unacceptable. Any instrumentation data or jit caches should be managed (and tunable at run time when possible and it makes sense).
>>
>> Reducing memory usage is a high priority. One thing being worked on right now is to avoid collecting runtime data for functions that will never be considered hot. That's one leak in the current implementation.
>
> Me, personally, I'd rather that you give me the profile information to make my own decisions, give me an @hot decorator to flag things that I want to be sped up, and let me switch the heat profiling gymnastics out of the runtime when I don't want them.
>
> That way, I can run a profile if I want to get the info to flag the things that are important, but a normal run doesn't waste a lot of time or energy doing something I don't want it to do during a regular run.
>
> Ideally, I could pre-JIT as much as possible on compile so that I could precompile my whole app and pay the minimum JIT god's penalty at runtime.
>
> Yes, sometimes I'd like to run on full automatic, but not often. I run a *lot* of quick little scripts that do a few intense things once or in a tight loop. I know where the hotspots are, and I want them compiled before they're *ever* run.

Unfortunately that model doesn't work particularly well with a JIT. The point of a JIT is that it can respond to runtime feedback, and take advantage of run time data. If you were to precompile it you'd lose interpreter overhead, and nothing else, because you can't do things like embed pointers to data in the assembly.

Alex

P.S.: Sorry to anyone who I personally sent that message to, stupid reply to all not being the default...

> 99% of the time, I don't need a runtime babysitter, I need a performance boost in known places, right away and without any load or runtime penalty to go along with it.
>
> S

--
I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire
The people's good is the highest law. -- Cicero
Code can always be simpler than you think, but never as simple as you want -- Me
Re: [Python-Dev] PyCon Keynote
On Thu, Jan 21, 2010 at 9:19 PM, Jesse Noller jnol...@gmail.com wrote: On Thu, Jan 21, 2010 at 6:16 PM, s...@pobox.com wrote: How about explaining why you're not going to give Collin a pony? Skip You're on to something, but the question is: 1 How do we get a pony to atlanta 2 Later deliver it to Mountain View 3 Get it to review patches? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/alex.gaynor%40gmail.com A Pony reviewing patches? That's absurd. Clearly we should review patches ourselves and pray that the Pony doesn't decide to smite us. Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Voltaire The people's good is the highest law. -- Cicero Code can always be simpler than you think, but never as simple as you want -- Me ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com