Re: [Python-Dev] Discussion overload
Yes, it most certainly was. :( Sorry about that!

From: Guido van Rossum
Date: Thursday, June 16, 2016 at 8:25 PM
To: Kevin Ollivier
Cc: Python Dev
Subject: Re: [Python-Dev] Discussion overload

More likely your post was too long... :-(

On Thu, Jun 16, 2016 at 7:00 PM, Kevin Ollivier wrote:

Hi Guido,

From: Guido van Rossum
Date: Thursday, June 16, 2016 at 5:27 PM
To: Kevin Ollivier
Cc: Python Dev
Subject: Re: [Python-Dev] Discussion overload

Hi Kevin,

I often feel the same way. Are you using GMail? It combines related messages in threads and lets you mute threads. I often use this feature so I can manage my inbox. (I presume other mailers have the same features, but I don't know if all of them do.) There are also many people who read the list on a website, e.g. gmane. (Though I think that sometimes the delays incurred there add to the noise -- e.g. when a decision is reached on the list sometimes people keep responding to earlier threads.)

I fear I did quite a poor job of making my point. :( I've been on open source mailing lists since the late 90s, so I've learned strategies for dealing with mailing list overload. I've got my mail folders, my mail rules, etc. Having been on many mailing lists over the years, I've seen many productive discussions and many unproductive ones, and over time you start to see patterns. You also see what happens to those communities over time.

On the mailing lists where discussions become these unwieldy floods with 30-40 posts a day on one topic, over time what I have seen is that that rapid fire of posts generally does not lead to better decisions being made. In fact, usually it is the opposite. Faster discussions are not usually better discussions, and the chances of that gem of knowledge getting lost in the flood of posts is much greater.

The more long-term consequence is that people start hesitating to bring up ideas, sometimes even very good ones, simply because even the discussion of them gets to be so draining that it's better to just leave things be. As an example, I do have work to do :) and I know if I was the one who had wanted to propose a fix for os.urandom or what have you, waking up to 30 messages I need to read to get caught up each day would be a pretty disheartening prospect, and possibly not even possible with my work obligations. It raises the bar to participating, in a way.

Perhaps some of this is inherent in mailing list discussions, but really in my experience, just a conscious decision on the part of contributors to slow down the discussion and "think more, write less" can do quite a lot to ensure the discussion is in fact a better one. I probably should have taken more time to write my initial message, in fact, in order to better coalesce my points into something more succinct and clearly understandable. I somehow managed to convince people I need to learn mail management strategies. :)

Anyway, that is just my $0.02 on the matter. With inflation it accounts for less every day, so make of it what you will. :P

Thanks,

Kevin

--Guido (don't get me started on top-posting :-)

On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier wrote:

Hi all,

Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic.

There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign.

The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own.

For someone like me, a new joiner, seeing this makes me feel like wanting to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at
Re: [Python-Dev] Discussion overload
More likely your post was too long... :-(

On Thu, Jun 16, 2016 at 7:00 PM, Kevin Ollivier <kevin-li...@theolliviers.com> wrote:

> Hi Guido,
>
> From: Guido van Rossum <gu...@python.org>
> Date: Thursday, June 16, 2016 at 5:27 PM
> To: Kevin Ollivier
> Cc: Python Dev
> Subject: Re: [Python-Dev] Discussion overload
>
> Hi Kevin,
>
> I often feel the same way. Are you using GMail? It combines related messages in threads and lets you mute threads. I often use this feature so I can manage my inbox. (I presume other mailers have the same features, but I don't know if all of them do.) There are also many people who read the list on a website, e.g. gmane. (Though I think that sometimes the delays incurred there add to the noise -- e.g. when a decision is reached on the list sometimes people keep responding to earlier threads.)
>
> I fear I did quite a poor job of making my point. :( I've been on open source mailing lists since the late 90s, so I've learned strategies for dealing with mailing list overload. I've got my mail folders, my mail rules, etc. Having been on many mailing lists over the years, I've seen many productive discussions and many unproductive ones, and over time you start to see patterns. You also see what happens to those communities over time.
>
> On the mailing lists where discussions become these unwieldy floods with 30-40 posts a day on one topic, over time what I have seen is that that rapid fire of posts generally does not lead to better decisions being made. In fact, usually it is the opposite. Faster discussions are not usually better discussions, and the chances of that gem of knowledge getting lost in the flood of posts is much greater. The more long-term consequence is that people start hesitating to bring up ideas, sometimes even very good ones, simply because even the discussion of them gets to be so draining that it's better to just leave things be.
>
> As an example, I do have work to do :) and I know if I was the one who had wanted to propose a fix for os.urandom or what have you, waking up to 30 messages I need to read to get caught up each day would be a pretty disheartening prospect, and possibly not even possible with my work obligations. It raises the bar to participating, in a way.
>
> Perhaps some of this is inherent in mailing list discussions, but really in my experience, just a conscious decision on the part of contributors to slow down the discussion and "think more, write less" can do quite a lot to ensure the discussion is in fact a better one.
>
> I probably should have taken more time to write my initial message, in fact, in order to better coalesce my points into something more succinct and clearly understandable. I somehow managed to convince people I need to learn mail management strategies. :)
>
> Anyway, that is just my $0.02 on the matter. With inflation it accounts for less every day, so make of it what you will. :P
>
> Thanks,
>
> Kevin
>
> --Guido (don't get me started on top-posting :-)
>
> On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier <kevin-li...@theolliviers.com> wrote:
>
>> Hi all,
>>
>> Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic.
>>
>> There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign.
>>
>> The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own.
>>
>> For someone like me, a new joiner, seeing this makes me feel like wanting to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at some point what inevitably happens is that the project starts to lose members who feel that even just trying to follow the discussions is eating up too much of their time. It really can
Re: [Python-Dev] Discussion overload
Hi Guido,

From: Guido van Rossum
Date: Thursday, June 16, 2016 at 5:27 PM
To: Kevin Ollivier
Cc: Python Dev
Subject: Re: [Python-Dev] Discussion overload

Hi Kevin,

I often feel the same way. Are you using GMail? It combines related messages in threads and lets you mute threads. I often use this feature so I can manage my inbox. (I presume other mailers have the same features, but I don't know if all of them do.) There are also many people who read the list on a website, e.g. gmane. (Though I think that sometimes the delays incurred there add to the noise -- e.g. when a decision is reached on the list sometimes people keep responding to earlier threads.)

I fear I did quite a poor job of making my point. :( I've been on open source mailing lists since the late 90s, so I've learned strategies for dealing with mailing list overload. I've got my mail folders, my mail rules, etc. Having been on many mailing lists over the years, I've seen many productive discussions and many unproductive ones, and over time you start to see patterns. You also see what happens to those communities over time.

On the mailing lists where discussions become these unwieldy floods with 30-40 posts a day on one topic, over time what I have seen is that that rapid fire of posts generally does not lead to better decisions being made. In fact, usually it is the opposite. Faster discussions are not usually better discussions, and the chances of that gem of knowledge getting lost in the flood of posts is much greater. The more long-term consequence is that people start hesitating to bring up ideas, sometimes even very good ones, simply because even the discussion of them gets to be so draining that it's better to just leave things be.

As an example, I do have work to do :) and I know if I was the one who had wanted to propose a fix for os.urandom or what have you, waking up to 30 messages I need to read to get caught up each day would be a pretty disheartening prospect, and possibly not even possible with my work obligations. It raises the bar to participating, in a way.

Perhaps some of this is inherent in mailing list discussions, but really in my experience, just a conscious decision on the part of contributors to slow down the discussion and "think more, write less" can do quite a lot to ensure the discussion is in fact a better one. I probably should have taken more time to write my initial message, in fact, in order to better coalesce my points into something more succinct and clearly understandable. I somehow managed to convince people I need to learn mail management strategies. :)

Anyway, that is just my $0.02 on the matter. With inflation it accounts for less every day, so make of it what you will. :P

Thanks,

Kevin

--Guido (don't get me started on top-posting :-)

On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier wrote:

Hi all,

Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic.

There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign.

The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own.

For someone like me, a new joiner, seeing this makes me feel like wanting to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at some point what inevitably happens is that the project starts to lose members who feel that even just trying to follow the discussions is eating up too much of their time. It really can suck the energy right out of a community. I don't want to see that happen to Python. I had a blast at PyCon, my first, and I really came away feeling more than ever that the community you have here is really special. The one problem I felt concerned about though, was that the core
Re: [Python-Dev] Discussion overload
Hi Kevin,

I often feel the same way. Are you using GMail? It combines related messages in threads and lets you mute threads. I often use this feature so I can manage my inbox. (I presume other mailers have the same features, but I don't know if all of them do.) There are also many people who read the list on a website, e.g. gmane. (Though I think that sometimes the delays incurred there add to the noise -- e.g. when a decision is reached on the list sometimes people keep responding to earlier threads.)

--Guido (don't get me started on top-posting :-)

On Thu, Jun 16, 2016 at 12:22 PM, Kevin Ollivier <kevin-li...@theolliviers.com> wrote:

> Hi all,
>
> Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic.
>
> There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign. The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own.
>
> For someone like me, a new joiner, seeing this makes me feel like wanting to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at some point what inevitably happens is that the project starts to lose members who feel that even just trying to follow the discussions is eating up too much of their time. It really can suck the energy right out of a community. I don't want to see that happen to Python. I had a blast at PyCon, my first, and I really came away feeling more than ever that the community you have here is really special. The one problem I felt concerned about though, was that the core dev community risked a sense of paralysis caused by having too many cooks in the kitchen and too much worry about the potential unseen ramifications of changing things. That creates a sort of paralysis and difficulty achieving consensus on anything that, eventually, causes projects to slowly decline and be disrupted by a more agile alternative.
>
> Please consider taking a step back from this issue. Take a deep breath, and consider responding more slowly and letting people's points stew in your head for a day or two first. (Including this one pls. :) Python will not implode if you don't get that email out right away. If I understand what I've read of this torrent of messages correctly, we don't even know if there's a single real world use case where a user of os.urandom is hitting the same problem CPython did, so we don't even know if the blocking at startup issue is actually even happening in any real world Python code out there. It's clearly far from a rampant problem, in any case. Stop and think about that for a second. This is, in practice, potentially a complete non-issue. Fixing it in any number of ways may potentially change things for no one at all. You could even introduce a real problem while trying to fix a hypothetical one. There are more than enough real problems to deal with, so why push hypothetical problems to the top of your priority list?
>
> It's too easy to get caught up in the abstract nature of problems and to lose sight of the real people and code behind them, or sometimes, the lack thereof. Be practical, be pragmatic. Before you hit that reply button, think - in a practical sense, of all the things I could be doing right now, is this discussion the place where my involvement could generate the greatest positive impact for the project? Is this the biggest and most substantial problem the project should be focusing on right now? Projects and developers who know how to manage focus go on to achieve the greatest things, in my experience.
>
> Having been critical, I will end with a compliment. :) It is nice to see that with only a couple small exceptions, this discussion has remained very civil and respectful, which should be expected, but I know from experience that far too often these discussions start to take a nasty tone as people get frustrated. This
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
Yes, 'secrets' is one-liners. However, it might grow a few more lines around the blocking in getrandom() on Linux. But still, not more than a few.

But the reason it should be on PyPI is so that programs can have a uniform API across various Python versions. There's no real reason that someone stuck on Python 2.7 or 3.3 shouldn't be able to include the future-style:

import secrets
answer = secrets.token_bytes(42)

On Jun 16, 2016 4:53 PM, "Nick Coghlan" wrote:

> On 16 June 2016 at 13:09, Barry Warsaw wrote:
> > On Jun 16, 2016, at 01:01 PM, David Mertz wrote:
> >
> >> It seems to me that backporting 'secrets' and putting it on Warehouse would
> >> be a lot more productive than complaining about 3.5.2 reverting to (almost)
> >> the behavior of 2.3-3.4.
> >
> > Very wise suggestion indeed. We have all kinds of stdlib modules backported
> > and released as third party packages. Why not secrets too? If such were on
> > PyPI, I'd happily package it up for the Debian ecosystem. Problem solved.
>
> The secrets module is just a collection of one-liners pulling together
> other stdlib components that have been around for years - the main
> problem it aims to address is one of discoverability (rather than one
> of code complexity), while also eliminating the "simulation is in the
> standard library, secrecy requires a third party module" discrepancy
> in the long term.
>
> Once you're aware the problem exists, the easiest way to use it in a
> version independent manner is to just copy the relevant snippet into
> your own project's utility library - adding an entire new dependency
> to your project just for those utility functions would be overkill.
>
> If you *do* add a dependency, you'd typically be better off with
> something more comprehensive and tailored to the particular problem
> domain you're dealing with, like passlib or cryptography or
> itsdangerous.
>
> Cheers,
> Nick.
>
> P.S. Having the secrets module available on PyPI wouldn't *hurt*, I
> just don't think it would help much.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
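[Editor's note: Nick's "collection of one-liners" point can be made concrete. Below is a minimal sketch of the kind of snippet a project stuck on 2.7/3.3 might vendor into its own utility library; the function names deliberately mirror the 3.6 secrets API, but this is an illustration, not the stdlib implementation.]

```python
import base64
import binascii
import os


def token_bytes(nbytes=32):
    # Cryptographically strong random bytes, straight from the OS RNG.
    return os.urandom(nbytes)


def token_hex(nbytes=32):
    # Hex-encoded variant: two hex digits per byte.
    # binascii.hexlify works on both Python 2.7 and 3.x.
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_urlsafe(nbytes=32):
    # URL-safe base64 with the padding stripped, as in the 3.6 module.
    return base64.urlsafe_b64encode(token_bytes(nbytes)).rstrip(b"=").decode("ascii")
```

This is exactly the trade-off Nick describes: a dozen lines copied locally versus a new dependency for the whole project.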
Re: [Python-Dev] PEP 487: Simpler customization of class creation
On Fri, Jun 17, 2016 at 2:36 AM, Nick Coghlan wrote:
> On 16 June 2016 at 14:17, Martin Teichmann wrote:
>
> An implementation like PyPy, with an inherently ordered standard dict
> implementation, can just rely on that rather than being obliged to
> switch to their full collections.OrderedDict type.

I didn't know that PyPy has actually implemented packed ordered dicts!

https://morepypy.blogspot.ru/2015/01/faster-more-memory-efficient-and-more.html
https://mail.python.org/pipermail/python-dev/2012-December/123028.html

This old idea by Raymond Hettinger is vastly superior to __definition_order__ duct tape (now that PyPy has validated it). It also gives kwarg order for free, which is important in many metaprogramming scenarios. Not to mention memory usage reduction and dict operations speedup...
Re: [Python-Dev] PEP 487: Simpler customization of class creation
On Thu, Jun 16, 2016 at 3:36 PM, Nick Coghlan wrote:
> I don't think that's a side note, I think it's an important point (and
> relates to one of Nikita's questions as well): we have the option of
> carving out certain aspects of PEP 520 as CPython implementation
> details.
>
> In particular, the language level guarantee can be that "class
> statements set __definition_order__ by default, but may not do so when
> using a metaclass that returns a custom namespace from __prepare__",
> with the implementation detail that CPython does that by using
> collections.OrderedDict for the class namespace by default.
>
> An implementation like PyPy, with an inherently ordered standard dict
> implementation, can just rely on that rather than being obliged to
> switch to their full collections.OrderedDict type.

Excellent point from you both. :) I'll rework PEP 520 accordingly (to focus on __definition_order__). At that point I expect the definition order part of PEP 487 could be dropped (as redundant).

> However, I don't think we should leave the compile-time vs runtime
> definition order question as an implementation detail - I think we
> should be explicit that the definition order attribute captures the
> runtime definition order, with conditionals, loops and reassignment
> being handled accordingly.

Yeah, I'll make that clear. We can discuss these changes in a separate thread once I've updated PEP 520. So let's focus back on the rest of PEP 487! :)

-eric
Re: [Python-Dev] PEP 487: Simpler customization of class creation
On 16 June 2016 at 14:17, Martin Teichmann wrote:
> As a side note, you propose to use OrderedDict as the class definition
> namespace, and this is exactly how I implemented it. Nonetheless, I
> would like to keep this fact as an implementation detail, such that
> other implementations of Python (PyPy comes to mind) or even CPython
> at a later time may switch to a different way to implement this
> feature. I am thinking especially about the option to determine the
> __attribute_order__ already at compile time. Sure, this would mean that
> someone could trick us by dynamically changing the order of attribute
> definition, but I would document that as an abuse of the functionality
> with undocumented outcome.

I don't think that's a side note, I think it's an important point (and relates to one of Nikita's questions as well): we have the option of carving out certain aspects of PEP 520 as CPython implementation details.

In particular, the language level guarantee can be that "class statements set __definition_order__ by default, but may not do so when using a metaclass that returns a custom namespace from __prepare__", with the implementation detail that CPython does that by using collections.OrderedDict for the class namespace by default.

An implementation like PyPy, with an inherently ordered standard dict implementation, can just rely on that rather than being obliged to switch to their full collections.OrderedDict type.

However, I don't think we should leave the compile-time vs runtime definition order question as an implementation detail - I think we should be explicit that the definition order attribute captures the runtime definition order, with conditionals, loops and reassignment being handled accordingly.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
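[Editor's note: the mechanism under discussion can be sketched with the metaclass machinery that already exists. A metaclass whose __prepare__ returns an OrderedDict observes the runtime definition order itself; the `_order` attribute below is an arbitrary illustrative name, where PEP 520 proposes `__definition_order__`.]

```python
from collections import OrderedDict


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body executes inside this mapping, so its insertion
        # order *is* the runtime definition order (conditionals, loops
        # and reassignment included).
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Record the order, skipping dunder entries the compiler adds
        # (e.g. __module__, __qualname__).
        cls._order = tuple(k for k in namespace if not k.startswith("__"))
        return cls


class Example(metaclass=OrderedMeta):
    b = 1
    a = 2

    def method(self):
        return self.b
```

Here `Example._order` captures `b`, `a`, `method` in that order. On an implementation with an inherently ordered plain dict (PyPy then, CPython 3.7+ now), the OrderedDict in __prepare__ becomes unnecessary, which is precisely Nick's point about keeping it an implementation detail.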
Re: [Python-Dev] PEP 487: Simpler customization of class creation
Hi Eric, hi List,

> I'd be glad to give feedback on this, probably later today or
> tomorrow. In particular, I'd like to help resolve the intersection
> with PEP 520. :)

Thanks in advance! Let me already elaborate on the differences, so that others can follow: You chose the name "__definition_order__", I chose "__attribute_order__", I am fine with either, what are other people's opinions?

The bigger difference is actually the path to inclusion into Python: my idea is to first make it a standard library feature, with the later option to put it into the C core, while you want to put the feature directly into the C core. Again I'm fine with either, as long as the feature is eventually in.

As a side note, you propose to use OrderedDict as the class definition namespace, and this is exactly how I implemented it. Nonetheless, I would like to keep this fact as an implementation detail, such that other implementations of Python (PyPy comes to mind) or even CPython at a later time may switch to a different way to implement this feature. I am thinking especially about the option to determine the __attribute_order__ already at compile time. Sure, this would mean that someone could trick us by dynamically changing the order of attribute definition, but I would document that as an abuse of the functionality with undocumented outcome.

Greetings

Martin
Re: [Python-Dev] PEP 487: Simpler customization of class creation
On Thu, Jun 16, 2016 at 12:56 PM, Martin Teichmann wrote:
> I am looking forward to a lot of comments on this!

I'd be glad to give feedback on this, probably later today or tomorrow. In particular, I'd like to help resolve the intersection with PEP 520. :)

-eric
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 June 2016 at 13:09, Barry Warsaw wrote:
> On Jun 16, 2016, at 01:01 PM, David Mertz wrote:
>
>> It seems to me that backporting 'secrets' and putting it on Warehouse would
>> be a lot more productive than complaining about 3.5.2 reverting to (almost)
>> the behavior of 2.3-3.4.
>
> Very wise suggestion indeed. We have all kinds of stdlib modules backported
> and released as third party packages. Why not secrets too? If such were on
> PyPI, I'd happily package it up for the Debian ecosystem. Problem solved.

The secrets module is just a collection of one-liners pulling together other stdlib components that have been around for years - the main problem it aims to address is one of discoverability (rather than one of code complexity), while also eliminating the "simulation is in the standard library, secrecy requires a third party module" discrepancy in the long term.

Once you're aware the problem exists, the easiest way to use it in a version independent manner is to just copy the relevant snippet into your own project's utility library - adding an entire new dependency to your project just for those utility functions would be overkill.

If you *do* add a dependency, you'd typically be better off with something more comprehensive and tailored to the particular problem domain you're dealing with, like passlib or cryptography or itsdangerous.

Cheers,
Nick.

P.S. Having the secrets module available on PyPI wouldn't *hurt*, I just don't think it would help much.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016, at 01:01 PM, David Mertz wrote:
> It seems to me that backporting 'secrets' and putting it on Warehouse would
> be a lot more productive than complaining about 3.5.2 reverting to (almost)
> the behavior of 2.3-3.4.

Very wise suggestion indeed. We have all kinds of stdlib modules backported and released as third party packages. Why not secrets too? If such were on PyPI, I'd happily package it up for the Debian ecosystem. Problem solved.

But I'm *really* going to try to disengage from this discussion until Nick's PEP is posted.

Cheers,
-Barry
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 June 2016 at 18:03, Nick Coghlanwrote: > On 16 June 2016 at 09:39, Paul Moore wrote: >> I'm willing to accept the view of the security experts that there's a >> problem here. But without a clear explanation of the problem, how can >> a non-specialist like myself have an opinion? (And I hope the security >> POV isn't "you don't need an opinion, just do as we say"). > > If you're not writing Linux (and presumably *BSD) scripts and > applications that run during system initialisation or on embedded ARM > hardware with no good sources of randomness, then there's zero chance > of any change made in relation to this affecting you (Windows and Mac > OS X are completely immune, since they don't allow Python scripts to > run early enough in the boot sequence for there to ever be a problem). Understood. I could quite happily ignore this thread for all the impact it will have on me. However, I've seen enough of these debates (and witnessed the frustration of the security advocates) that I want to try to understand the issues better - as much as anything so that I don't end up adding uninformed opposition to these threads (in my day job, unfortunately, security is generally the excuse for all sorts of counter-productive rules, and never offers any practical benefits that I am aware of, so I'm predisposed to rejecting arguments based on security - that background isn't accurate in this environment and I'm actively trying to counter it). 
> The only question at hand is what CPython should do in the case where > the operating system *does* let Python scripts run before the system > random number generator is ready, and the application calls a security > sensitive API that relies on that RNG: > > - throw BlockingIOError (so the script developer knows they have a > potential problem to fix) > - block (so the script developer has a system hang to debug) > - return low quality random data (so the script developer doesn't even > know they have a potential problem) > > The last option is the status quo, and has a remarkable number of > vocal defenders. Understood. It seems to me that there are two arguments here - backward compatibility (which is always a pressure, but sometimes applied too vigorously and not always consistently) and "we've always done it that way" (aka "people will have to consider what happens when they run under 3.4 anyway, so how will changing help?"). Judging backward compatibility is always a matter of trade-offs, hence my interest in the actual benefits. > The second option is what we changed the behaviour to in 3.5 as a side > effect of switching to a syscall to save a file descriptor (and *also* > inadvertently made a gating requirement for CPython starting at all, > without which I'd be very surprised if anyone actually noticed the > potentially blocking behaviour in os.urandom itself) OK, so (given that the issue of CPython starting at all was an accidental, and now corrected, side effect) why is this so bad? Maybe not in a minor release, but at least for 3.6? How come this has caused such a fuss? I genuinely don't understand why people see blocking as such an issue (and as far as I can tell, Ted Tso seems to agree). The one case where this had an impact was a quickly fixed bug - so as far as I can tell, the risk of problems caused by blocking is purely hypothetical. 
> The first option is the one I'm currently writing a PEP for, since it > makes the longstanding advice to use os.urandom() as the low level > random data API for security sensitive operations unequivocally > correct (as it will either do the right thing, or throw an exception > which the developer can handle as appropriate for their particular > application) In my code, I typically prefer Python to make detailed decisions for me (e.g. requests follows redirects by default, it doesn't expect me to do so manually). Now certainly this is a low-level interface so the rules are different, but I don't see why blocking by default isn't "unequivocally correct" in the same way that it is on other platforms, rather than raising an exception and requiring the developer to do the wait manually. (What else would they do - fall back to insecure data? I thought the point here was that that's the wrong thing to do?) Having a blocking default with a non-blocking version seems just as arguable, and has the advantage that naive users (I don't even know if we're allowing for naive users here) won't get an unexpected exception and handle it badly because they don't know what to do (a sadly common practice in my experience). OK. Guido has pronounced, you're writing a PEP. None of this debate is really constructive any more. But I still don't understand the trade-offs, which frustrates me. Surely security isn't so hard that it can't be explained in a way that an interested layman like myself can follow? :-( Paul
[Python-Dev] Discussion overload
Hi all, Recent joiner here, I signed up after PyCon made me want to get more involved and have been lurking. I woke up this morning again to about 30 new messages in my inbox, almost all of which revolve around the os.urandom blocking discussion. There are just about hourly new posts showing up on this topic. There is such a thing as too much of a good thing. Discussion of issues is certainly good, but so far since joining this list I am seeing too much discussion happening too fast, and as someone who has been involved in open source for approaching two decades now, frankly, that is not really a good sign. The discussions are somewhat overlapping as so many people write back so quickly, there are multiple sub-discussions happening at once, and really at this point I'm not sure how much new each message is really adding, if anything at all. It seems to me the main solutions to this problem have all been identified, as have the tradeoffs of each. The discussion is now mostly at a point where people are just repeatedly debating (or promoting) the merits of their preferred solution and tradeoff. It is even spawning more abstract sub-discussions about things like project compatibility policies. This discussion has really taken on a life of its own. For someone like me, a new joiner, seeing this makes me want to simply unsubscribe. I've been on mailing lists where issues get debated endlessly, and at some point what inevitably happens is that the project starts to lose members who feel that even just trying to follow the discussions is eating up too much of their time. It really can suck the energy right out of a community. I don't want to see that happen to Python. I had a blast at PyCon, my first, and I really came away feeling more than ever that the community you have here is really special. 
The one problem I felt concerned about though, was that the core dev community risked a sense of paralysis caused by having too many cooks in the kitchen and too much worry about the potential unseen ramifications of changing things. That sort of paralysis and difficulty achieving consensus on anything is what, eventually, causes projects to slowly decline and be disrupted by a more agile alternative. Please consider taking a step back from this issue. Take a deep breath, and consider responding more slowly and letting people's points stew in your head for a day or two first. (Including this one pls. :) Python will not implode if you don't get that email out right away. If I understand what I've read of this torrent of messages correctly, we don't even know if there's a single real world use case where a user of os.urandom is hitting the same problem CPython did, so we don't even know if the blocking at startup issue is actually even happening in any real world Python code out there. It's clearly far from a rampant problem, in any case. Stop and think about that for a second. This is, in practice, potentially a complete non-issue. Fixing it in any number of ways may potentially change things for no one at all. You could even introduce a real problem while trying to fix a hypothetical one. There are more than enough real problems to deal with, so why push hypothetical problems to the top of your priority list? It's too easy to get caught up in the abstract nature of problems and to lose sight of the real people and code behind them, or sometimes, the lack thereof. Be practical, be pragmatic. Before you hit that reply button, think - in a practical sense, of all the things I could be doing right now, is this discussion the place where my involvement could generate the greatest positive impact for the project? Is this the biggest and most substantial problem the project should be focusing on right now? 
Projects and developers who know how to manage focus go on to achieve the greatest things, in my experience. Having been critical, I will end with a compliment. :) It is nice to see that with only a couple small exceptions, this discussion has remained very civil and respectful, which should be expected, but I know from experience that far too often these discussions start to take a nasty tone as people get frustrated. This is one of the things I really do love about the Python community, and it's one reason I want to see both the product and community grow and succeed even more. That, in fact, is why I'm choosing to write this message first rather than simply unsubscribe. Kevin
[Python-Dev] PEP 487: Simpler customization of class creation
Hi list, using metaclasses in Python is a very flexible method of customizing class creation, yet this customization comes at a cost: once you want to combine two classes with different metaclasses, you run into problems. This is why I proposed PEP 487 (see https://github.com/tecki/peps/blob/pep487/pep-0487.txt, which I also attached here for ease of discussion), which proposes a simple hook into class creation that one can override in subclasses so that sub-subclasses get customized accordingly. Otherwise, the standard Python inheritance rules apply (super() and the MRO). I also proposed to store the order in which attributes in classes are defined. This is exactly the same as PEP 520, discussed here recently, just that unfortunately we chose different names, but I am open to suggestions for better names. After having gotten good feedback on python-ideas (see https://mail.python.org/pipermail/python-ideas/2016-February/038305.html) and from IPython traitlets as a potential user of the feature (see https://mail.scipy.org/pipermail/ipython-dev/2016-February/017066.html, and the code at https://github.com/tecki/traitlets/commits/pep487) I implemented a pure Python version of this PEP, to be introduced into the standard library. I also wrote a proof-of-concept for another potential user of this feature, django forms, at https://github.com/tecki/django/commits/no-metaclass. The code to be introduced into the standard library can be found at https://github.com/tecki/cpython/commits/pep487 (sorry for using github, I'll submit something using hg once I understand that toolchain). It introduces a new metaclass types.Type which contains the machinery, and a new base class types.Object which uses said metaclass. The naming was chosen to clarify the intention that eventually those classes may be implemented in C and replace type and object. As above, I am open to better naming. 
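For readers new to the proposal, the hook can be sketched in a few lines. This uses the semantics described in the PEP (as later available natively; the pure-Python version reaches it through types.Object instead); PluginBase and registry are illustrative names, not from the PEP:

```python
# A minimal sketch of the __init_subclass__ hook PEP 487 proposes:
# the base class is notified whenever a subclass is created.
class PluginBase:
    registry = []

    def __init_subclass__(cls, **kwargs):
        # Standard inheritance rules apply, so cooperative super() works.
        super().__init_subclass__(**kwargs)
        PluginBase.registry.append(cls)

class MyPlugin(PluginBase):
    pass

print(MyPlugin in PluginBase.registry)  # True
```

No metaclass is involved, so two libraries can each use this pattern without any conflict.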
As a second step, I let abc.ABCMeta inherit from said types.Type, such that an ABC may also use the features of my metaclass, without the need to define a new mixing metaclass. I am looking forward to a lot of comments on this! Greetings Martin The proposed PEP for discussion: PEP: 487 Title: Simpler customisation of class creation Version: $Revision$ Last-Modified: $Date$ Author: Martin Teichmann, Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 27-Feb-2015 Python-Version: 3.6 Post-History: 27-Feb-2015, 5-Feb-2016 Replaces: 422 Abstract Currently, customising class creation requires the use of a custom metaclass. This custom metaclass then persists for the entire lifecycle of the class, creating the potential for spurious metaclass conflicts. This PEP proposes to instead support a wide range of customisation scenarios through a new ``__init_subclass__`` hook in the class body, a hook to initialize attributes, and a way to keep the order in which attributes are defined. Those hooks should at first be defined in a metaclass in the standard library, with the option that this metaclass eventually becomes the default ``type`` metaclass. The new mechanism should be easier to understand and use than implementing a custom metaclass, and thus should provide a gentler introduction to the full power of Python's metaclass machinery. Background == Metaclasses are a powerful tool to customize class creation. They have, however, the problem that there is no automatic way to combine metaclasses. If one wants to use two metaclasses for a class, a new metaclass combining those two needs to be created, typically manually. This need often comes as a surprise to a user: inheriting from two base classes coming from two different libraries suddenly raises the necessity to manually create a combined metaclass, where typically one is not interested in those details about the libraries at all. 
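The conflict described in the Background can be reproduced in a few lines (MetaA and MetaB stand in for the two libraries' metaclasses):

```python
# Two independent metaclasses, as two unrelated libraries might define them.
class MetaA(type): pass
class MetaB(type): pass

class A(metaclass=MetaA): pass
class B(metaclass=MetaB): pass

try:
    # There is no automatic way to combine MetaA and MetaB:
    class C(A, B): pass
except TypeError as exc:
    print(exc)  # metaclass conflict: ...

# The manual workaround the PEP wants to make unnecessary -
# hand-writing a combined metaclass:
class MetaAB(MetaA, MetaB): pass
class C(A, B, metaclass=MetaAB): pass
```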
This becomes even worse if one library starts to make use of a metaclass which it has not done before. While the library itself continues to work perfectly, suddenly every code combining those classes with classes from another library fails. Proposal While there are many possible ways to use a metaclass, the vast majority of use cases falls into just three categories: some initialization code running after class creation, the initialization of descriptors and keeping the order in which class attributes were defined. Those three use cases can easily be performed by just one metaclass. If this metaclass is put into the standard library, and all libraries that wish to customize class creation use this very metaclass, no combination of metaclasses is necessary anymore. Said metaclass should live in the ``types`` module under the name ``Type``. This should hint to the user that in the future, this metaclass may become the default metaclass ``type``. The three use cases are achieved as follows: 1. The metaclass contains an ``__init_subclass__`` hook that initializes all subclasses of
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Thu, Jun 16, 2016 at 10:26:22AM -0700, Nick Coghlan wrote: > meta-guidance. However, there are multiple levels of improvement being > pursued here, since developer ignorance of security concerns and > problematic defaults at the language level is a chronic problem rather > than an acute one (and one that affects all languages, not just > Python). For a while Christian Heimes has speculated on Twitter about writing a Secure Programming HOWTO. At the last language summit in Montreal, I told him I'd be happy to do the actual writing and editing if given a detailed outline. (I've missed having an ongoing writing project since ceasing to write the "What's New", but have no ideas for anything to write about.) That offer is still open, if Christian or someone else wants to produce an outline. --amk
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16 2016, Nick Coghlan wrote: > On 16 June 2016 at 09:39, Paul Moore wrote: >> I'm willing to accept the view of the security experts that there's a >> problem here. But without a clear explanation of the problem, how can >> a non-specialist like myself have an opinion? (And I hope the security >> POV isn't "you don't need an opinion, just do as we say"). > > If you're not writing Linux (and presumably *BSD) scripts and > applications that run during system initialisation or on embedded ARM > hardware with no good sources of randomness, then there's zero chance > of any change made in relation to this affecting you (Windows and Mac > OS X are completely immune, since they don't allow Python scripts to > run early enough in the boot sequence for there to ever be a problem). > > The only question at hand is what CPython should do in the case where > the operating system *does* let Python scripts run before the system > random number generator is ready, and the application calls a security > sensitive API that relies on that RNG: > > - throw BlockingIOError (so the script developer knows they have a > potential problem to fix) > - block (so the script developer has a system hang to debug) > - return low quality random data (so the script developer doesn't even > know they have a potential problem) > > The last option is the status quo, and has a remarkable number of > vocal defenders. *applaud* Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
Re: [Python-Dev] PEP 520: Ordered Class Definition Namespace
On Thu, Jun 16, 2016 at 5:11 AM, Nikita Nemkin wrote: > I'll reformulate my argument: > > Ordered class namespaces are a minority use case that's already covered > by existing language features (custom metaclasses) and doesn't warrant > the extension of the language (i.e. making OrderedDict a builtin type). > This is about Python-the-Language, not CPython-the-runtime. So your main objection is that OrderedDict would effectively become part of the language definition? Please elaborate on why this is a problem. > The simple answer is "don't do that", i.e. don't pile an ordered metaclass > on top of another metaclass. Such use case is hypothetical anyway. It isn't hypothetical. It's a concrete problem that folks have run into enough that it's been a point of discussion on several occasions and the motivation for several PEPs. > All explicit assignments in the class body can be detected statically. > Implicit assignments via locals(), sys._getframe() etc. can't be detected, > BUT they are unlikely to have a meaningful order! > It's reasonable to exclude them from __definition_order__. Yeah, it's reasonable to exclude them. However, in cases where I've done so I would have wanted them included in the definition order. That said, explicitly setting __definition_order__ in the class body would be enough to address that corner case. > > This also applies to documentation tools. If there really was a need, > they could have easily extracted static order, solving 99.% of > the problem. You mean that they have the opportunity to do something like AST traversal to extract the definition order? I expect the definition order isn't important enough to them to do that work. However, if the language provided the definition order to them for free then they'd use it. > >> The rationale for "Why not make this configurable, rather than >> switching it unilaterally?" 
is that it's actually *simpler* overall to >> just make it the default - we can then change the documentation to say >> "class bodies are evaluated in a collections.OrderedDict instance by >> default" and record the consequences of that, rather than having to >> document yet another class customisation mechanism. > > It would have been a "simpler" default if it was the core dict that > became ordered. Instead, it brings in a 3rd party (OrderedDict). Obviously if dict preserved insertion order then we'd use that instead of OrderedDict. There have been proposals along those lines in the past but at the end of the day someone has to do the work. Since we can use OrderedDict right now and there's no ordered dict in sight, it makes the choice rather easy. :) Ultimately the cost of defaulting to OrderedDict is not significant, neither to the language definition nor to run-time performance. Furthermore, defaulting to OrderedDict (per the PEP) makes things possible right now that aren't otherwise a possibility. > >> It also eliminates boilerplate from class decorator usage >> instructions, where people have to write "to use this class decorator, >> you must also specify 'namespace=collections.OrderedDict' in your >> class header" > > Statically inferred __definition_order__ would work here. > Order-dependent decorators don't seem to be important enough > to worry about their usability. Please be careful about discounting seemingly unimportant use cases. There's a decent chance they are important to someone. In this case that someone is (at least) myself. :) My main motivation for PEP 520 is exactly writing a class decorator that would rely on access to the definition order. Such a decorator (which could also be used stand-alone) cannot rely on every possible class it might encounter to explicitly expose its definition order. 
-eric
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 June 2016 at 10:40, Nathaniel Smith wrote: > On Jun 16, 2016 10:01 AM, "David Mertz" wrote: >> Python 3.6 is introducing a NEW MODULE, with new APIs. The 'secrets' >> module is the very first time that Python has ever really explicitly >> addressed cryptography in the standard library. > > This is completely, objectively untrue. If you look up os.urandom in the > official manual for the standard library, then it has always stated > explicitly, as the very first line, that os.urandom returns "a string of n > random bytes suitable for cryptographic use." This is *exactly* the same > explicit guarantee that the secrets module makes. The motivation for adding > the secrets module was to make this functionality easier to find and more > convenient to use (e.g. by providing convenience functions for getting > random strings of ASCII characters), not to suddenly start addressing > cryptographic concerns for the first time. An analogy that occurred to me that may help some folks: secrets is a higher level API around os.urandom and some other standard library features (like base64 and binascii.hexlify) in the same way that shutil and pathlib are higher level APIs that aggregate other os module functions with other parts of the standard library. The existence of those higher level APIs doesn't make the lower level building blocks redundant. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
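Nick's analogy can be made concrete. The convenience functions in secrets are, roughly, thin wrappers over os.urandom plus base64/binascii; this is an illustrative sketch, not the actual stdlib implementation:

```python
import base64
import binascii
import os

# Rough equivalents of secrets.token_bytes / token_hex / token_urlsafe,
# assembled from the lower-level building blocks they aggregate.
def token_bytes(nbytes=32):
    return os.urandom(nbytes)

def token_hex(nbytes=32):
    return binascii.hexlify(os.urandom(nbytes)).decode('ascii')

def token_urlsafe(nbytes=32):
    tok = base64.urlsafe_b64encode(os.urandom(nbytes))
    return tok.rstrip(b'=').decode('ascii')

print(len(token_hex(16)))  # 32
```

This is also the sort of "copy whichever one-liner you actually need" backport Nick mentions for earlier Python versions.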
Re: [Python-Dev] PEP 520: Ordered Class Definition Namespace
Thanks for raising these good points, Nikita. I'll make sure the PEP reflects this discussion. (inline responses below...) -eric On Tue, Jun 14, 2016 at 3:41 AM, Nikita Nemkin wrote: > Is there any rationale for rejecting alternatives like: > > 1. Adding standard metaclass with ordered namespace. > 2. Adding `namespace` or `ordered` args to the default metaclass. We already have a metaclass-based solution: __prepare__(). Unfortunately, this opt-in option means that the definition order isn't preserved by default, which means folks can't rely on access to the definition order. This is effectively no different from the status quo. Furthermore, there's a practical problem with requiring the use of metaclasses to achieve some particular capability: metaclass conflicts. PEPs 422 and 487 exist, in large part, as a response to specific feedback from users about problems they've had with metaclasses. While the key objective of PEP 520 is preserving the class definition order, it also helps make it less necessary to write a metaclass. > 3. Making compiler fill in __definition_order__ for every class > (just like __qualname__) without touching the runtime. This is a great idea. I'd support any effort to do so. But keep in mind that how we derive __definition_order__ isn't as important as that it's always there. So the use of OrderedDict for the implementation isn't necessary. Instead, it's the implementation I've taken. If we later switch to using the compiler to get the definition order, then great! > ? > > To me, any of the above seems preferred to complicating > the core part of the language forever. What specific complication are you expecting? Like nearly all of Python's "power tools", folks won't need to know about the changes from this PEP in order to use the language. Then when they need the new functionality, it will be ready for them to use. 
Furthermore, as far as changes to the language go, this change is quite simple and straightforward (consider other recent changes, e.g. async). It is arguably a natural step and fills in some of the information that Python currently throws away. Finally, I've gotten broad support for the change from across the community (both on the mailing lists and in personal correspondence), from the time I first introduced the idea several years ago. > > The vast majority of Python classes don't care about their member > order, this is minority use case receiving majority treatment. The problem is that there isn't any other recourse available to code that wishes to determine the definition order of an arbitrary class. This is an obstacle to code that I personally want to write (hence my interest). > > Also, wiring OrderedDict into class creation means elevating it > from a peripheral utility to indispensable built-in type. Note that as of 3.5 CPython's OrderedDict *is* a builtin type (though exposed via the collections module rather than the builtins module). However, you're right that this change would mean OrderedDict would now be used by the interpreter in all implementations of Python 3.6+. Some of the other implementers from which I've gotten feedback have indicated this isn't a problem.
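The opt-in __prepare__() approach that PEP 520 aims to make the default looks roughly like this; OrderedMeta and Example are illustrative names, with __definition_order__ spelled as in the PEP:

```python
from collections import OrderedDict

# Sketch of the metaclass-based status quo: the class body executes in
# whatever mapping __prepare__ returns, so an OrderedDict records the
# order in which attributes are assigned.
class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls.__definition_order__ = tuple(namespace)
        return cls

class Example(metaclass=OrderedMeta):
    x = 1
    y = 2

# Filter out the implicit __module__/__qualname__ entries:
print([n for n in Example.__definition_order__ if not n.startswith('__')])  # ['x', 'y']
```

The PEP's point is that any class wanting this today must commit to a metaclass (with the attendant conflict risk), whereas making the ordered namespace the default gives every class a __definition_order__ for free.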
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016 10:01 AM, "David Mertz" wrote: > Python 3.6 is introducing a NEW MODULE, with new APIs. The 'secrets' module is the very first time that Python has ever really explicitly addressed cryptography in the standard library. This is completely, objectively untrue. If you look up os.urandom in the official manual for the standard library, then it has always stated explicitly, as the very first line, that os.urandom returns "a string of n random bytes suitable for cryptographic use." This is *exactly* the same explicit guarantee that the secrets module makes. The motivation for adding the secrets module was to make this functionality easier to find and more convenient to use (e.g. by providing convenience functions for getting random strings of ASCII characters), not to suddenly start addressing cryptographic concerns for the first time. (Will try to address other more nuanced points later.) -n
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 June 2016 at 10:01, David Mertz wrote: > It seems to me that backporting 'secrets' and putting it on Warehouse would > be a lot more productive than complaining about 3.5.2 reverting to (almost) > the behavior of 2.3-3.4. "Let Flask/Django/passlib/cryptography/whatever handle the problem rather than rolling your own" is already the higher level meta-guidance. However, there are multiple levels of improvement being pursued here, since developer ignorance of security concerns and problematic defaults at the language level is a chronic problem rather than an acute one (and one that affects all languages, not just Python). In that context, the main benefit of the secrets module is as a deterrent against people reaching for the reproducible simulation focused random module to implement security sensitive operations. By offering both secrets and random in the standard library, we help make it clear that secrecy and simulation are *not the same problem*, even though they both involve random numbers. Folks that learn Python 3.6 first and then later start supporting earlier versions are likely to be more aware of the difference, and hence go looking for "What's the equivalent of the secrets module on earlier Python versions?" (at which point they can just copy whichever one-liner they actually need into their particular application - just as not every 3 line function needs to be a builtin, not every 3 line function needs to be a module on PyPI) The os.urandom proposal is aimed more at removing any remaining equivocation from the longstanding "Use os.urandom() for security sensitive operations in Python" advice - it's for the benefit of folks that are *already* attempting to do the right thing given the tools they have available. 
The sole source of that equivocation is that in some cases, at least on Linux, and potentially on *BSD (although we haven't seen a confirmed reproducer there), os.urandom() may return results that are sufficiently predictable to be inappropriate for use in security sensitive applications. At the moment, determining whether or not you're risking exposure to that problem requires that you know a whole lot about Linux (and *BSD, where even we haven't been able to determine the level of exposure on embedded systems), and also about how ``os.urandom()`` is implemented on different platforms. My proposal is that we do away with the requirement for all that assumed knowledge and instead say "Are you using os.urandom(), random.SystemRandom(), or an API in the secrets module? Are you using Python 3.6+? Did it raise BlockingIOError? No? Then you're fine". The vast majority of Python developers will thus be free to remain entirely ignorant of these platform specific idiosyncrasies, while those that have a potential need to know will get an exception from the interpreter that they can then feed into a search engine and get pointed in the right direction. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
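The secrecy-versus-simulation split Nick describes can be shown in a couple of lines of each (secrets requires Python 3.6+):

```python
import random
import secrets

# Simulation: seeded, reproducible sequences are the whole point.
rng = random.Random(42)
print(rng.random() == random.Random(42).random())  # True

# Secrecy: unpredictable tokens, and deliberately no seeding API.
token = secrets.token_hex(16)
print(len(token))  # 32 hex characters
```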
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 June 2016 at 09:39, Paul Moore wrote: > I'm willing to accept the view of the security experts that there's a > problem here. But without a clear explanation of the problem, how can > a non-specialist like myself have an opinion? (And I hope the security > POV isn't "you don't need an opinion, just do as we say"). If you're not writing Linux (and presumably *BSD) scripts and applications that run during system initialisation or on embedded ARM hardware with no good sources of randomness, then there's zero chance of any change made in relation to this affecting you (Windows and Mac OS X are completely immune, since they don't allow Python scripts to run early enough in the boot sequence for there to ever be a problem). The only question at hand is what CPython should do in the case where the operating system *does* let Python scripts run before the system random number generator is ready, and the application calls a security sensitive API that relies on that RNG: - throw BlockingIOError (so the script developer knows they have a potential problem to fix) - block (so the script developer has a system hang to debug) - return low quality random data (so the script developer doesn't even know they have a potential problem) The last option is the status quo, and has a remarkable number of vocal defenders. 
The second option is what we changed the behaviour to in 3.5 as a side effect of switching to a syscall to save a file descriptor (and *also* inadvertently made a gating requirement for CPython starting at all, without which I'd be very surprised if anyone actually noticed the potentially blocking behaviour in os.urandom itself) The first option is the one I'm currently writing a PEP for, since it makes the longstanding advice to use os.urandom() as the low level random data API for security sensitive operations unequivocally correct (as it will either do the right thing, or throw an exception which the developer can handle as appropriate for their particular application) Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
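Under the first option, code that today calls os.urandom() blindly could handle the new exception explicitly. This is a sketch of the API shape being proposed, not released behavior; startup_seed is an illustrative name:

```python
import os

def startup_seed(nbytes=16):
    # Under the proposal, os.urandom() would raise BlockingIOError when
    # the system RNG is not yet initialized (early boot on Linux); on a
    # normally running system it simply returns random bytes as always.
    try:
        return os.urandom(nbytes)
    except BlockingIOError:
        # Application-specific decision: wait and retry, log, or abort.
        raise SystemExit("system entropy pool not yet initialized")

print(len(startup_seed()))  # 16
```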
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Thu, Jun 16, 2016 at 11:58 AM, Nathaniel Smith wrote:
> [...] no one else be able to predict what session cookie I sent [...] In
> python 2.3-3.5, the most correct way to write this code is to use
> os.urandom. The question in this thread is whether we should break that in
> 3.6, so that conscientious users are forced to switch existing code over to
> using the secrets module if they want to continue to get the most correct
> available behavior, or whether we should preserve that in 3.6, so that code
> like my hypothetical web app that was correct on 2.3-3.5 remains correct on
> 3.6

This is kinda silly. Unless you specifically wrote your code for Python 3.5.1, and NOT for 2.3.x through 3.4.x, your code is NO WORSE in 3.5.2 than it has been under all those prior versions. The cases where the behavior in everything other than 3.5.0-3.5.1 is suboptimal are *extremely limited*, as you understand (things that run in Python very early in the boot process, and only on recent versions of Linux, no other OS). This does not even remotely describe the web-server-with-cookies example that you outline.

Python 3.6 is introducing a NEW MODULE, with new APIs. The 'secrets' module is the very first time that Python has ever really explicitly addressed cryptography in the standard library. Yes, there have been third-party modules and libraries, but any cryptographic application of Python prior to 'secrets' is very much roll-your-own and know-what-you-are-doing.

Yes, there has been a history of telling people to "use os.urandom()" on StackOverflow and places like that. That's about the best advice that was available prior to 3.6. Adding a new module and API is specifically designed to allow for a better answer, otherwise there'd be no reason to include it. And that advice that's been on StackOverflow and wherever has been subject to the narrow, edge-case flaw we've discussed here for at least a decade without anyone noticing or caring.
It seems to me that backporting 'secrets' and putting it on Warehouse would be a lot more productive than complaining about 3.5.2 reverting to (almost) the behavior of 2.3-3.4.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
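A backported 'secrets' could be a very thin shim over os.urandom. The sketch below is purely hypothetical: the names mirror the 3.6 module, but this is an illustration of the idea, not the actual module or any real PyPI backport.

```python
import binascii
import os

# Hypothetical shim mirroring the 3.6 'secrets' API for older
# Pythons; default size is an assumption for illustration.
DEFAULT_ENTROPY = 32

def token_bytes(nbytes=None):
    # Defer to the kernel CSPRNG, exactly as 'secrets' does.
    return os.urandom(nbytes if nbytes is not None else DEFAULT_ENTROPY)

def token_hex(nbytes=None):
    # Hex-encoded variant, convenient for cookies and URLs.
    return binascii.hexlify(token_bytes(nbytes)).decode('ascii')
```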
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
On Thu, Jun 16, 2016, at 07:34, Donald Stufft wrote:
> python-dev tends to favor not breaking “working” code over securing
> existing APIs, even if “working” is silently doing the wrong thing
> in a security context. This is particularly frustrating when it
> comes to security because security is by its nature the act of
> taking code that would otherwise execute and making it error,
> ideally only in bad situations, but this “security’s purpose is to
> make things break” nature clashes with python-dev’s default of
> not breaking “working” code in a way that is personally draining
> to me.

I was almost about to reply with "Maybe what we need is a new zen of python", then I checked. It turns out we already have "Errors should never pass silently", which fits *perfectly* in this situation.

So what's needed is a change to the attitude that if an error passes silently, making it no longer pass silently is a backward compatibility break. This isn't Java, where the exceptions not thrown by an API are part of that API's contract. We're free to throw new exceptions in a new version of Python.
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 June 2016 at 16:58, Nathaniel Smith wrote:
> The word "cryptographic" here is a bit of a red herring. The guarantee that
> a CSPRNG makes is that the output should be *unguessable by third parties*.
> There are plenty of times when this is what you need even when you aren't
> using actual cryptography. For example, when someone logs into a web app, I
> may want to send back a session cookie so that I can recognize this person
> later without making them reauthenticate all the time. For this to work
> securely, it's extremely important that no one else be able to predict what
> session cookie I sent, because if you can guess the cookie then you can
> impersonate the user.
>
> In python 2.3-3.5, the most correct way to write this code is to use
> os.urandom. The question in this thread is whether we should break that in
> 3.6, so that conscientious users are forced to switch existing code over to
> using the secrets module if they want to continue to get the most correct
> available behavior, or whether we should preserve that in 3.6, so that code
> like my hypothetical web app that was correct on 2.3-3.5 remains correct on
> 3.6 (with the secrets module being a more friendly wrapper that we recommend
> for new code, but with no urgency about porting existing code to it).

While your example is understandable and clear, it's also a bit of a red herring. Nobody's setting up a web session cookie during the first moments of Linux boot (are they?), so os.urandom is perfectly OK in all cases here. We have a new API in 3.6 that might better express the *intent* of generating a secret token, but (cryptographic) correctness is the same either way for this example.

As someone who isn't experienced in crypto, I genuinely don't have the slightest idea of what sort of program we're talking about that is written in Python, runs in the early stages of OS startup, and needs crypto-strength random numbers.
So I can't reason about whether the proposed solutions are sensible. Would such programs be used in a variety of environments with different Python versions? Would the developers be non-specialists? Which of the mistakes being made that result in a vulnerability is the easiest to solve (move the code to run later, modify the Python code, require a fixed version of Python)? How severe is the security hole compared to others (for example, users with weak passwords)? What attacks are possible, and what damage could be done? (I know that in principle any security hole needs to be plugged, but I work in an environment where production services with a password of "password" exist, and applying system security patches is treated as a "think about it when things are quiet" activity - so forgive me if I don't immediately understand why obscure vulnerabilities are important.)

I'm willing to accept the view of the security experts that there's a problem here. But without a clear explanation of the problem, how can a non-specialist like myself have an opinion? (And I hope the security POV isn't "you don't need an opinion, just do as we say").

Paul
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016 1:23 AM, "Stefan Krah" wrote:
>
> Nathaniel Smith writes:
> > In practice, your proposal means that ~all existing code that uses
> > os.urandom becomes incorrect and should be switched to either secrets
> > or random. This is *far* more churn for end-users than Nick's
> > proposal.
>
> This should only concern code that a) was specifically written for
> 3.5.0/3.5.1 and b) implements a serious cryptographic application
> in Python.
>
> I think b) is not a good idea anyway due to timing and side channel
> attacks and the lack of secure wiping of memory. Such applications
> should be written in C, where one does not have to predict the
> behavior of multiple layers of abstractions.

This is completely unhelpful. Firstly because it's an argument that os.urandom and the secrets module shouldn't exist, which doesn't tell us much about what their behavior should be given that they do exist, and secondly because it fundamentally misunderstands why they exist.

The word "cryptographic" here is a bit of a red herring. The guarantee that a CSPRNG makes is that the output should be *unguessable by third parties*. There are plenty of times when this is what you need even when you aren't using actual cryptography. For example, when someone logs into a web app, I may want to send back a session cookie so that I can recognize this person later without making them reauthenticate all the time. For this to work securely, it's extremely important that no one else be able to predict what session cookie I sent, because if you can guess the cookie then you can impersonate the user.

In python 2.3-3.5, the most correct way to write this code is to use os.urandom.
The question in this thread is whether we should break that in 3.6, so that conscientious users are forced to switch existing code over to using the secrets module if they want to continue to get the most correct available behavior, or whether we should preserve that in 3.6, so that code like my hypothetical web app that was correct on 2.3-3.5 remains correct on 3.6 (with the secrets module being a more friendly wrapper that we recommend for new code, but with no urgency about porting existing code to it).

-n
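The session-cookie pattern under discussion can be sketched as follows; the function name and token size are illustrative choices, not anything prescribed in the thread:

```python
import binascii
import os

def new_session_cookie():
    # The pattern Nathaniel describes: on 2.3-3.5 the most correct
    # way to make an unguessable session token is os.urandom.
    # 16 random bytes (128 bits) is an illustrative size.
    return binascii.hexlify(os.urandom(16)).decode('ascii')
```

On 3.6+ the same thing is spelled `secrets.token_hex(16)`, which is the "more friendly wrapper" referred to above.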
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
On 16 June 2016 at 05:50, Paul Moore wrote:
> On 16 June 2016 at 12:34, Donald Stufft wrote:
>> [1] I don’t think using os.urandom is incorrect to use for security
>> sensitive applications and I think it’s a losing battle for Python to
>> try and fight the rest of the world that urandom is not the right
>> answer here.
>>
>> [2] python-dev tends to favor not breaking “working” code over securing
>> existing APIs, even if “working” is silently doing the wrong thing in a
>> security context. This is particularly frustrating when it comes to
>> security because security is by its nature the act of taking code that
>> would otherwise execute and making it error, ideally only in bad
>> situations, but this “security’s purpose is to make things break”
>> nature clashes with python-dev’s default of not breaking “working”
>> code in a way that is personally draining to me.
>
> Should I take it from these two statements that you do not believe
> that providing *new* APIs that provide better security compared to a
> backward compatible but flawed existing implementation is a reasonable
> approach? And specifically that you don't agree with the decision to
> provide the new "secrets" module as the recommended interface for
> getting secure random numbers from Python?
>
> One of the aspects of this debate that I'm unclear about is what role
> the people arguing that os.urandom must change see for the new secrets
> module.

The secrets module is great for new code that gets to ignore any version of Python older than 3.6 - it's the "solve this problem for the next generation of developers" answer. All of the complicated "this API is safe for that purpose, this API isn't" discussions get replaced by "do the obvious thing" (i.e. use random for simulations, secrets for security).
The os.urandom() debate is about taking the current obvious (because that's what the entire security community is telling you to do) low level way to do it and categorically eliminating any and all caveats on its correctness. Not "it's correct if you use these new flags that are incompatible with older Python versions". Not "it's not correct anymore, use a different API". Just "it's correct, and the newer your Python runtime, the more correct it is".

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Thu, Jun 16, 2016 at 1:04 PM, Donald Stufft wrote:
> In my opinion, this is a usability issue as well. You have a ton of third
> party documentation and effort around “just use urandom” for Cryptographic
> random which is generally the right (and best!) answer except for this one
> little niggle on a Linux platform where /dev/urandom *may* produce
> predictable bytes (but usually doesn’t).

Why not consider opt-out behavior with environment variables? E.g. people who don't care about crypto mumbojumbo and want fast interpreter startup could just set PYTHONWEAKURANDOM=y or PYTHONFASTURANDOM=y. That way there's no need to change the API of os.urandom() and users have a clear and easy path to get the old behavior.

Thanks,
--
Ionel Cristian Mărieș, http://blog.ionelmc.ro
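The opt-out check Ionel proposes would be a one-liner inside the interpreter. The sketch below is entirely hypothetical: PYTHONWEAKURANDOM is his suggested name, not a real interpreter flag in any CPython release, and the accepted values are an assumption:

```python
import os

def wants_weak_urandom(environ=os.environ):
    # Hypothetical opt-out: mirrors how CPython treats other
    # PYTHON* switches, where any non-empty value enables them.
    # The variable name and value set here are illustrative only.
    return environ.get('PYTHONWEAKURANDOM', '').lower() in ('y', 'yes', '1')
```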
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Thu, Jun 16, 2016, at 10:04, Barry Warsaw wrote:
> On Jun 16, 2016, at 09:51 AM, Random832 wrote:
>
> >On Thu, Jun 16, 2016, at 04:03, Barry Warsaw wrote:
> >> *If* we can guarantee that os.urandom() will never block or raise an
> >> exception when only poor entropy is available, then it may be indeed
> >> indistinguishably backward compatible for most if not all cases.
> >
> >Why can't we exclude cases when only poor entropy is available from
> >"most if not all cases"?
>
> Because if it blocks or raises a new exception on poor entropy it's an
> API break.

Yes, but in only very rare cases. Which, as I *just said*, makes it backwards compatible for "most" cases.
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
> I also think it’s a great module for providing defaults that we can’t
> provide in os.urandom, like the number of bytes that are considered
> “secure” [1]. What I don’t think is that the secrets module means that
> all of a sudden os.urandom is no longer an API that is primarily used
> in a security sensitive context

Not all of a sudden. However, I guess things will change in the future. If we want the secrets module to be the first and only place where crypto goes, we should work towards that goal. It needs proper communication, marketing etc. Deprecation periods can be years long. This change (whatever form it will take) can be carried out over 3 or 4 releases when the ultimate goal is made clear to everybody reading the docs.

OTOH I don't know whether long deprecation periods are necessary here at all. Other industries are very sensitive to fast changes. Furthermore, next generations will be taught using the new way, so the Python community should not be afraid of some changes, because most of them are for the better.

On 16.06.2016 15:02, Donald Stufft wrote:
> I think that os.urandom is the most obvious thing that someone will
> reach for given:
>
> * Pages upon pages of documentation both inside the Python community
>   and outside saying “use urandom”.
> * The sheer bulk of existing code that is already out there using
>   os.urandom for its cryptographic properties.

That's maybe you. However, as stated before, I am not an expert in this field. So, when I need to, I would first start researching the current state of the art in Python. If the docs say: use the secrets module (e.g. near os.urandom), I would happily comply -- especially when there's a reasonable explanation. That's from a newbie's point of view.

Best,
Sven
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016, at 09:51 AM, Random832 wrote:

>On Thu, Jun 16, 2016, at 04:03, Barry Warsaw wrote:
>> *If* we can guarantee that os.urandom() will never block or raise an
>> exception when only poor entropy is available, then it may be indeed
>> indistinguishably backward compatible for most if not all cases.
>
>Why can't we exclude cases when only poor entropy is available from
>"most if not all cases"?

Because if it blocks or raises a new exception on poor entropy it's an API break.

Cheers,
-Barry
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Thu, Jun 16, 2016, at 04:03, Barry Warsaw wrote:
> *If* we can guarantee that os.urandom() will never block or raise an
> exception when only poor entropy is available, then it may be indeed
> indistinguishably backward compatible for most if not all cases.

Why can't we exclude cases when only poor entropy is available from "most if not all cases"?
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
> On Jun 16, 2016, at 8:50 AM, Paul Moore wrote:
>
> On 16 June 2016 at 12:34, Donald Stufft wrote:
>> [1] I don’t think using os.urandom is incorrect to use for security
>> sensitive applications and I think it’s a losing battle for Python to
>> try and fight the rest of the world that urandom is not the right
>> answer here.
>>
>> [2] python-dev tends to favor not breaking “working” code over securing
>> existing APIs, even if “working” is silently doing the wrong thing in a
>> security context. This is particularly frustrating when it comes to
>> security because security is by its nature the act of taking code that
>> would otherwise execute and making it error, ideally only in bad
>> situations, but this “security’s purpose is to make things break”
>> nature clashes with python-dev’s default of not breaking “working”
>> code in a way that is personally draining to me.
>
> Should I take it from these two statements that you do not believe
> that providing *new* APIs that provide better security compared to a
> backward compatible but flawed existing implementation is a reasonable
> approach? And specifically that you don't agree with the decision to
> provide the new "secrets" module as the recommended interface for
> getting secure random numbers from Python?
>
> One of the aspects of this debate that I'm unclear about is what role
> the people arguing that os.urandom must change see for the new secrets
> module.
>
> Paul

I think the new secrets module is great, particularly for functions other than secrets.token_bytes. If that's all the secrets module was, then I'd argue it shouldn't exist, because we already have os.urandom. IOW I think it solves a different problem than os.urandom. If all you need is cryptographically random bytes, I think that os.urandom is the most obvious thing that someone will reach for given:

* Pages upon pages of documentation both inside the Python community and outside saying “use urandom”.
* The sheer bulk of existing code that is already out there using os.urandom for its cryptographic properties.

I also think it's a great module for providing defaults that we can't provide in os.urandom, like the number of bytes that are considered “secure” [1]. What I don't think is that the secrets module means that all of a sudden os.urandom is no longer an API that is primarily used in a security sensitive context [2], and thus that we should willfully choose to use a subpar interface to the same CSPRNG when the OS provides us a better one [3], because one small edge case *might* break in a loud and obvious way for the minority of people using this API in a non security sensitive context while leaving the majority of people using this API possibly getting silently insecure behavior from it.

[1] Of course, what is considered secure is going to be application dependent, but secrets can give a pretty good approximation for the general case.

[2] This is one of the things that really gets me about this: it's not like folks on my side are saying we need to break the pickle module because it's possible to use it insecurely. That would be silly, because one of the primary use cases for that module is using it in a context that is not security sensitive. However, os.urandom is, to the best of my ability to determine and reason, almost always used in a security sensitive context, and thus should make security sensitive trade-offs in its API.

[3] Thus it's still a small wrapper around OS provided APIs, so we're not asking for os.py to implement some great big functionality; we're just asking for it to provide a thin shim over a better interface to the same thing.

—
Donald Stufft
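The "secure default" point above is concrete in the 3.6 module: secrets picks a sensible token size when the caller does not specify one, which os.urandom cannot do (it always requires an explicit byte count).

```python
import secrets  # Python 3.6+

# token_bytes() with no argument uses the module's default entropy
# (32 bytes in CPython), sparing the caller the "how many bytes is
# secure?" decision that os.urandom forces on them.
token = secrets.token_bytes()

# os.urandom has no such default: os.urandom() is a TypeError,
# you must always pass a size.
```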
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
Cory Benfield writes:
> python-dev cannot wash its hands of the security decision here. As I’ve
> said many times, I’m pleased to see the decision makers have not done
> that: while I don’t agree with their decision, I totally respect that it
> was theirs to make, and they made it with all of the facts.

I think the sysadmin's responsibility still plays a major role here. If a Linux system crucially relies on the quality of /dev/urandom, it should be possible to insert a small C program (call it ensure_random) into the boot sequence that does *exactly* what Python did in the bug report: block until entropy is available.

Well, it *was* possible with SysVinit ... :)

Python is not the only application that needs a secure /dev/urandom.

Stefan Krah
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
On 16 June 2016 at 12:34, Donald Stufft wrote:
> [1] I don’t think using os.urandom is incorrect to use for security
> sensitive applications and I think it’s a losing battle for Python to
> try and fight the rest of the world that urandom is not the right
> answer here.
>
> [2] python-dev tends to favor not breaking “working” code over securing
> existing APIs, even if “working” is silently doing the wrong thing in a
> security context. This is particularly frustrating when it comes to
> security because security is by its nature the act of taking code that
> would otherwise execute and making it error, ideally only in bad
> situations, but this “security’s purpose is to make things break”
> nature clashes with python-dev’s default of not breaking “working”
> code in a way that is personally draining to me.

Should I take it from these two statements that you do not believe that providing *new* APIs that provide better security compared to a backward compatible but flawed existing implementation is a reasonable approach? And specifically that you don't agree with the decision to provide the new "secrets" module as the recommended interface for getting secure random numbers from Python?

One of the aspects of this debate that I'm unclear about is what role the people arguing that os.urandom must change see for the new secrets module.

Paul
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
On Thu, Jun 16, 2016 at 03:24:33PM +0300, Barry Warsaw wrote:
> Except that I disagree. I think os.urandom's original intent, as
> documented in Python 3.4, is to provide a thin layer over /dev/urandom,
> with all that implies, and with the documented quality caveats. I know
> as a Linux developer that if I need to know the details of that, I can
> `man urandom` and read the gory details. In Python 3.5, I can't do that
> any more.

If Python were to document os.urandom as providing a thin wrapper over /dev/urandom as implemented on Linux, and also document os.getrandom as providing a thin wrapper over getrandom(2) as implemented on Linux, and then say that the best emulation of those two interfaces will be provided on other operating systems, and that today the best practice is to call getrandom with the flags set to zero (or defaulted out), that would certainly make me very happy. I could imagine that some people might complain that it is too Linux-centric, or that it is not adhering to Python's design principles, but it makes a lot of sense to me as a Linux person. :-)

Cheers,

- Ted
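Ted's "getrandom with flags defaulted out" best practice translates directly to Python code. A small sketch, noting that os.getrandom() only landed in Python 3.6 and is Linux-only, so the hasattr fallback is needed on other versions and platforms:

```python
import os

def best_random(n):
    # Prefer getrandom(2) with default flags=0, which blocks only
    # until the kernel entropy pool has been initialised once, then
    # behaves like /dev/urandom thereafter.
    if hasattr(os, 'getrandom'):
        return os.getrandom(n)  # os.getrandom(size, flags=0)
    # Fallback for pre-3.6 Pythons and non-Linux platforms.
    return os.urandom(n)
```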
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
On Jun 16, 2016, at 07:34 AM, Donald Stufft wrote:

>Well, I don’t think that for os.urandom someone using it for security is
>running “counter to its original intent”, given that in general urandom’s
>purpose is for cryptographic random. Someone *may* be using it for something
>other than that, but it’s pretty explicitly there for security sensitive
>applications.

Except that I disagree. I think os.urandom's original intent, as documented in Python 3.4, is to provide a thin layer over /dev/urandom, with all that implies, and with the documented quality caveats. I know as a Linux developer that if I need to know the details of that, I can `man urandom` and read the gory details. In Python 3.5, I can't do that any more.

>Right. I personally often fall towards securing the *existing* APIs and
>adding new, insecure APIs that are obviously so in cases where we can
>reasonably do that.

Sure, and I personally fall on the side of maintaining stable, backward compatible APIs, adding new, better, more secure APIs to address deficiencies in real-world use cases. That's because when we break APIs, even with the best of intentions, it breaks people's code in ways and places that we can't predict, and which are very often very difficult to discover.

I guess it all comes down to who's yelling at you. ;)

Cheers,
-Barry

P.S. These discussions do not always end in despair. Witness PEP 493.
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
> On 16 Jun 2016, at 09:19, Stefan Krah wrote:
>
> This should only concern code that a) was specifically written for
> 3.5.0/3.5.1 and b) implements a serious cryptographic application
> in Python.
>
> I think b) is not a good idea anyway due to timing and side channel
> attacks and the lack of secure wiping of memory. Such applications
> should be written in C, where one does not have to predict the
> behavior of multiple layers of abstractions.

No, it concerns code that generates its random numbers from Python. For example, you may want to use AES GCM to encrypt a file at rest. AES GCM requires the use of a nonce, and has only one rule about this nonce: you MUST NOT, under any circumstances, re-use a nonce/key combination. If you do, AES GCM fails catastrophically (I cannot emphasise this enough: re-using a nonce/key combination in AES GCM totally destroys all the properties the algorithm provides)[0].

You can use a C implementation of all of the AES logic, including offload to your x86 CPU with its fancy AES GCM instruction set. However, you *need* to provide a nonce: AES GCM can't magically guess what it is, and it needs to be communicated in some way for the decryption[1]. In situations where you do not have an easily available nonce (you do have one for TLS, for example), you will need to provide one, and the logical and obvious thing to do is to use a random number. Your Python application needs to obtain that random number, and the safest way to do it is via os.urandom().

This is the problem with this argument: we cannot wave our hands and say "os.urandom can be as unsafe as we want because crypto code must not be written in Python". Even if we never implement an algorithm in Python (and I agree with you that crypto primitives in general should not be implemented in Python, for the exact reasons you suggest), most algorithms require the ability to be provided with good random numbers by their callers.
As long as crypto algorithms require good nonces, Python needs access to a secure CSPRNG. Kernel CSPRNGs are *strongly* favoured for many reasons that I won't go into here, so os.urandom is our winner. python-dev cannot wash its hands of the security decision here. As I've said many times, I'm pleased to see the decision makers have not done that: while I don't agree with their decision, I totally respect that it was theirs to make, and they made it with all of the facts.

Cory

[0]: Someone will *inevitably* point out that other algorithms resist nonce misuse somewhat better than this. While that's true, it's a) not relevant, because some standards require use of the non-NMR algorithms, and b) unhelpful, because even if we could switch, we'd need access to the better primitives, which we don't have.

[1]: Again, to head off some questions at the pass: the reason nonces are usually provided by the user of the algorithm is that sometimes they're generated semi-deterministically. For example, TLS generates a unique key for each session (again, requiring randomness, but that's neither here nor there), and so TLS can use deterministic *but non-repeated* nonces, which in practice it derives from record numbers. Because you have two options (re-use keys with random nonces, or random keys with deterministic nonces), a generic algorithm implementation does not constrain your choice of nonce.
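The nonce-generation step Cory describes is the exact point where os.urandom's quality matters. A minimal sketch, assuming the common choice of a random 96-bit (12-byte) GCM nonce when no deterministic counter is available:

```python
import os

def new_gcm_nonce():
    # AES-GCM's one rule: never reuse a nonce with the same key.
    # When nonces are drawn randomly, each one must come from a
    # CSPRNG; a predictable nonce source risks exactly the
    # catastrophic nonce/key reuse described above.
    return os.urandom(12)  # 96 bits, the conventional GCM nonce size
```

This is all the "crypto in Python" that is needed to trigger the problem: the AES logic itself can live entirely in C, but the nonce still comes from os.urandom.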
Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
> On Jun 16, 2016, at 7:07 AM, Barry Warsaw wrote:
>
> On Jun 16, 2016, at 06:04 AM, Donald Stufft wrote:
>
>> Regardless of what we document it as, people are going to use os.urandom
>> for cryptographic purposes, because everyone who doesn’t keep up on
>> exactly what modules are being added to Python but has any idea about
>> cryptography at all is going to look for a Python interface to urandom.
>> That doesn’t even begin to touch the thousands upon thousands of uses
>> that already exist in the wild that are assuming that os.urandom will
>> always give them cryptographic random, who now *need* to write this as:
>
> [...]
>
>> Frankly, I think it’s a disservice to Python developers to leave in this
>> footgun.
>
> This really gets to the core of our responsibility to our users. Let's
> start by acknowledging that good-willed people can have different
> opinions on this, and that we all want to do what's best for our users,
> although we may have different definitions of "what's best”.

Yes, I don't think anyone is being malicious :) That's why I qualified my statement with "I think": I don't believe that whether or not this particular choice is a disservice is a fundamental property of the universe, but rather my opinion, influenced by my priorities.

> Since this topic comes up over and over again, it's worth exploring in
> more detail. Here's my take on it in this context.
>
> We have a responsibility to provide stable, well-documented, obvious APIs
> to our users to provide functionality that is useful and appropriate to
> the best of our abilities.
>
> We have a responsibility to provide secure implementations of that
> functionality wherever possible.
>
> It's in the conflict between these two responsibilities that these heated
> discussions and differences of opinions come up.
This conflict is exposed in > the os.urandom() debate because the first responsibility informs us that > backward compatibility is more important to maintain because it provides > stability and predictability. The second responsibility urges us to favor > retrofitting increased security into APIs that for practicality purposes are > being used counter to our original intent. Well, I don’t think that for os.urandom someone using it for security is running “counter to its original intent”, given that in general urandom’s purpose is for cryptographic random. Someone *may* be using it for something other than that, but it’s pretty explicitly there for security sensitive applications. > > It's not that you think backward compatibility is unimportant, or that I think > improving security has no value. In the messy mudpit of the middle, we can't > seem to have both, as much as I'd argue that providing new, better APIs can > give us edible cake. Right. I personally often fall towards securing the *existing* APIs and adding new, insecure APIs that are obviously so in cases where we can reasonably do that. That’s largely because given an API that’s being used both in security sensitive applications and ones that aren’t, the “failure” to be properly secure is almost always a silent failure, while the “failure” for applications that don’t need that security is almost always obvious and immediate. Taking os.urandom as an example, the failure case here for the security side is that you get some bytes that are, to some degree, predictable. There is nobody alive who can look at some bytes and go “oh yep, those bytes are predictable, we’re using the wrong API”, thus basically anyone “incorrectly” [1] using this API for security sensitive applications is going to have it just silently doing the wrong thing.
On the flip side, if someone is using this API and what they care about is it not blocking, ever, and always giving them some sort of random-ish number no matter how predictable it is, then both of the proposed failure cases are fairly noticeable (to varying degrees): either it blocks long enough to matter for those people and they notice and dig in, or it raises an exception and they notice and dig in. In both cases they get some indication that something is wrong. > > Coming down on either side has its consequences, both known and unintended, > and I think in these cases consensus can't be reached. It's for these reasons > that we have RMs and BDFLs to break the tie. We must lay out our arguments > and trust our Larrys, Neds, and Guidos to make the right --or at least *a*-- > decision on a case-by-case basis, and if not agree then accept. Right. I’ve tried not to be the one who keeps pushing for this even after a decree, partially because it’s draining to me to argue for the security side with python-dev [2] and partially because it was ruled on and I lost. However, if there continues to be discussion I’ll continue to advocate for what I think is right :) [1] I don’t think it is incorrect to use os.urandom for security sensitive applications
Re: [Python-Dev] PEP 520: Ordered Class Definition Namespace
I'll reformulate my argument: Ordered class namespaces are a minority use case that's already covered by existing language features (custom metaclasses) and doesn't warrant the extension of the language (i.e. making OrderedDict a builtin type). This is about Python-the-Language, not CPython-the-runtime. If you disagree with this premise, there's no point arguing about the alternatives. That being said, below are the answers to your objections to specific alternatives. On Thu, Jun 16, 2016 at 1:30 AM, Nick Coghlan wrote: > On 14 June 2016 at 02:41, Nikita Nemkin wrote: > > Adding metaclasses to an existing class can break compatibility with > third party subclasses, so making it possible for people to avoid that > while still gaining the ability to implicitly expose attribute > ordering to class decorators and other potentially interested parties > is a recurring theme behind this PEP and also PEPs 422 and 487. The simple answer is "don't do that", i.e. don't pile an ordered metaclass on top of another metaclass. Such a use case is hypothetical anyway. Also, the namespace argument to the default metaclass doesn't cause conflicts. >> 3. Making compiler fill in __definition_order__ for every class >> (just like __qualname__) without touching the runtime. >> ? > > Class scopes support conditionals and loops, so we can't necessarily > be sure what names will be assigned without running the code. It's > also possible to make attribute assignments via locals() that are > entirely opaque to the compiler, but visible to the interpreter at > runtime. All explicit assignments in the class body can be detected statically. Implicit assignments via locals(), sys._getframe() etc. can't be detected, BUT they are unlikely to have a meaningful order! It's reasonable to exclude them from __definition_order__. This also applies to documentation tools. If there really was a need, they could have easily extracted the static order, solving 99% of the problem.
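The "custom metaclasses" alternative referred to above can be sketched like this (the metaclass name and attribute-recording details are illustrative, not taken from the PEP): a metaclass whose __prepare__ returns an OrderedDict captures the order of class-body assignments with no language change at all.

```python
from collections import OrderedDict

class OrderedMeta(type):
    # __prepare__ supplies the mapping the class body executes in;
    # returning an OrderedDict records assignment order.
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Record the definition order (this includes compiler-set names
        # such as __module__ and __qualname__ as well).
        cls.__definition_order__ = tuple(namespace)
        return cls

class Point(metaclass=OrderedMeta):
    x = 0
    y = 0
```

Here `Point.__definition_order__` lists `x` before `y`, matching the order they appear in the class body.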
> The rationale for "Why not make this configurable, rather than > switching it unilaterally?" is that it's actually *simpler* overall to > just make it the default - we can then change the documentation to say > "class bodies are evaluated in a collections.OrderedDict instance by > default" and record the consequences of that, rather than having to > document yet another class customisation mechanism. It would have been a "simpler" default if it was the core dict that became ordered. Instead, it brings in a 3rd party (OrderedDict). Documenting an extra metaclass or an extra type kwarg would hardly take more space. And it's NOT yet another mechanism. It's the good old metaclass mechanism. > It also eliminates boilerplate from class decorator usage > instructions, where people have to write "to use this class decorator, > you must also specify 'namespace=collections.OrderedDict' in your > class header" Statically inferred __definition_order__ would work here. Order-dependent decorators don't seem to be important enough to worry about their usability.
Re: [Python-Dev] Why does base64 return bytes?
On Wed, 15 Jun 2016 11:51:05 +1200, Greg Ewing wrote: > R. David Murray wrote: > > The fundamental purpose of the base64 encoding is to take a series > > of arbitrary bytes and reversibly turn them into another series of > > bytes in which the eighth bit is not significant. > > No, it's not. If that were its only purpose, it would be > called base128, and the RFC would describe it purely in > terms of bit patterns and not mention characters or > character sets at all. Sorry, you are correct. IMO it is to encode the data to a representation that consists of a limited subset of printable (makes marks on paper or screen) characters (which is an imprecise term); ie: data that will not be interpreted as having control information by most programs processing the data stream as either human-readable or raw bytes. The rest of the argument still applies, specifically the part about wire encoding to seven bit bytes being the currently-most-used[*] and backward-compatible use case. And I say this despite the fact that the email package currently handles everything as surrogate-escaped text and so does in fact decode the output of base64.encode to ASCII and only later re-encodes it. That's a design issue in the email package deriving from the fact that bytes and string used to be the same thing in python2. It might some day get corrected, but probably won't be, and it is a legacy of *not* making the distinction between bytes and string. --David [*] Yes, this is changing, I already said that :)
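The behavior being debated is easy to observe directly: b64encode maps arbitrary bytes to *bytes* whose values all fall in a small, printable, ASCII-safe subset, and the mapping is fully reversible.

```python
import base64

payload = bytes(range(256))          # arbitrary binary data, all 256 byte values
encoded = base64.b64encode(payload)  # still bytes, but 7-bit clean

# Every output byte is in the base64 alphabet (A-Z, a-z, 0-9, +, /, =),
# so the eighth bit is never set and no control bytes appear.
assert isinstance(encoded, bytes)
assert all(b < 128 for b in encoded)
assert base64.b64decode(encoded) == payload  # reversible
```

Decoding `encoded` to an `ascii` str is always safe, which is exactly why the "should it return str?" question keeps coming up.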
[Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)
On Jun 16, 2016, at 06:04 AM, Donald Stufft wrote: >Regardless of what we document it as, people are going to use os.urandom for >cryptographic purposes because for everyone who doesn’t keep up on exactly >what modules are being added to Python who has any idea about cryptography at >all is going to look for a Python interface to urandom. That doesn’t even >begin to touch the thousands upon thousands of uses that already exist in the >wild that are assuming that os.urandom will always give them cryptographic >random, who now *need* to write this as: [...] >Frankly, I think it’s a disservice to Python developers to leave in this >footgun. This really gets to the core of our responsibility to our users. Let's start by acknowledging that good-willed people can have different opinions on this, and that we all want to do what's best for our users, although we may have different definitions of "what's best". Since this topic comes up over and over again, it's worth exploring in more detail. Here's my take on it in this context. We have a responsibility to provide stable, well-documented, obvious APIs to our users to provide functionality that is useful and appropriate to the best of our abilities. We have a responsibility to provide secure implementations of that functionality wherever possible. It's in the conflict between these two responsibilities that these heated discussions and differences of opinions come up. This conflict is exposed in the os.urandom() debate because the first responsibility informs us that backward compatibility is more important to maintain because it provides stability and predictability. The second responsibility urges us to favor retrofitting increased security into APIs that for practicality purposes are being used counter to our original intent. It's not that you think backward compatibility is unimportant, or that I think improving security has no value. 
In the messy mudpit of the middle, we can't seem to have both, as much as I'd argue that providing new, better APIs can give us edible cake. Coming down on either side has its consequences, both known and unintended, and I think in these cases consensus can't be reached. It's for these reasons that we have RMs and BDFLs to break the tie. We must lay out our arguments and trust our Larrys, Neds, and Guidos to make the right --or at least *a*-- decision on a case-by-case basis, and if not agree then accept. Cheers, -Barry
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
> On Jun 16, 2016, at 4:46 AM, Barry Warsaw wrote: > > We can educate them through documentation, but I don't think it's appropriate > to retrofit existing APIs to different behavior based on those faulty > assumptions, because that has other negative effects, such as breaking the > promises we make to experienced and knowledgeable developers. You can’t document your way out of a usability problem, in the same way that while it was true that urllib was *documented* to not verify certificates by default, that didn’t matter because a large set of users used it like it did anyways. In my opinion, this is a usability issue as well. You have a ton of third party documentation and effort around “just use urandom” for cryptographic random, which is generally the right (and best!) answer except for this one little niggle on a Linux platform where /dev/urandom *may* produce predictable bytes (but usually doesn’t). That documentation typically doesn’t go into telling people about this small niggle because prior to getrandom(0) there wasn’t much they could do about it except use /dev/random, which is bad in every situation other than early boot cryptographic keys. Regardless of what we document it as, people are going to use os.urandom for cryptographic purposes, because everyone who doesn’t keep up on exactly what modules are being added to Python but has any idea about cryptography at all is going to look for a Python interface to urandom. That doesn’t even begin to touch the thousands upon thousands of uses that already exist in the wild that are assuming that os.urandom will always give them cryptographic random, who now *need* to write this as:

    try:
        from secrets import token_bytes
    except ImportError:
        from os import urandom as token_bytes

in order to get the best cryptographic random available to them on their system, which assumes they’re even going to notice at all that there’s a new secrets module, and requires each and every use of os.urandom to change.
Honestly, I think that the first sentence in the documentation should most obviously be the most pertinent one, and the first sentence here is “Return a string of n random bytes suitable for cryptographic use.” The bit about how the exact quality depends on the OS and documenting what device it uses is, to my eyes, obviously a hedge to say “Hey, if this gives you bad random it’s your OS’s fault, not ours; we can’t produce good random where your OS can’t give us some” and to give people a suggestion of where to look to determine if they’re going to get good random or not. I do not think “uses /dev/urandom” is, or should be considered, a core part of this API; it already doesn’t use /dev/urandom on Windows where it doesn’t exist, nor does it use /dev/urandom in 3.5+ if it can help it. Using getrandom(0), or using getrandom(GRND_NONBLOCK) and raising an exception on EAGAIN, is still accessing the urandom CSPRNG with the same general runtime characteristics as /dev/urandom, outside of cases where it’s not safe to actually use /dev/urandom. Frankly, I think it’s a disservice to Python developers to leave in this footgun. — Donald Stufft
Re: [Python-Dev] Smoothing the transition from Python 2 to 3
Nick Coghlan writes: > - even if there is a test suite, sufficiently pervasive [str/bytes] > type ambiguity may make it difficult to use for fault isolation Difficult yes, but I would argue that that difficulty is inherent[1]. Ie, if it's pervasive, the fault should be isolated to the whole module. Such a fault *will* regress, often in the exact same place, but if not there, elsewhere due to the same ambiguity. That was my experience in both GNU Emacs and Mailman. In GNU Emacs's case there's a paired, much more successful (in respect of encoding problems) experience with XEmacs to compare.[2] We'll see how things go in Mailman 3 (which uses a nearly completely rewritten email package), but I'll bet the experience there is even more successful.[3] If you're looking for a band-aid that will get you back running asap, then you're better off bisecting the change history than going through a slew of warnings one-by-one, as a recent error is likely due to a recent change. If Neil still wants to go ahead, more power to him. I don't know everything. It's just that my experience in this area is sufficiently extensive and sufficiently bad that it's worth repeating (just this once!) Footnotes: [1] Or as Brooks would have said, "of the essence". [2] GNU Emacs has a multilingualization specialist in Ken Handa whose day job is writing multilingualization libraries, so their encoding detection, accuracy of implementation, and codec coverage is and always was better than XEmacs's. I'm referring here to internal bugs in the Lisp primitives dealing with text, as well as the difficulty of writing applications that handled both internal text and external bytes without confusing them. [3] Though not strictly comparable to the XEmacs experience, due to (1) being a second implementation, not a parallel implementation, and (2) the Internet environment being much more standard conformant, even in email, these days.
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016, at 01:40 AM, Larry Hastings wrote: >As Robert Collins points out, this does change the behavior ever-so-slightly >from 3.4; Ah yes, I misunderstood Robert's point. >if urandom is initialized, and the kernel has the getrandom system call, >getrandom() will give us the bytes we asked for and we won't open and read >from /dev/urandom. In this state os.urandom() behaves ever-so-slightly >differently: > > * os.urandom() will now work in chroot environments where /dev/urandom >doesn't exist. > * If Python runs in a chroot environment with a fake /dev/urandom, >we'll ignore that and use the kernel's urandom device. > * If the sysadmin changed what the systemwide /dev/urandom points to, >we'll ignore that and use the kernel's urandom device. > >But os.urandom() is documented as calling getrandom() when available in >3.5... though doesn't detail how it calls it or what it uses the result for. >Anyway, I feel these differences were minor, and covered by the documented >change in 3.5, so I thought it was reasonable and un-broken. > >If this isn't backwards-compatible enough to suit you, please speak up now! It does seem like a narrow corner case, which of course means *someone* will be affected by it . I'll leave it up to you, though it should at least be clearly documented. Let's hope the googles will also help our hypothetical future head-scratcher. Cheers, -Barry ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016, at 12:36 AM, Nathaniel Smith wrote: >Basically this is a question of whether we should make an (unlikely) error >totally invisible to the user, and "errors should never pass silently" is >right there in the Zen of Python :-). I'd phrase it differently though. To me, it comes down to hand-holding our users who for whatever reason, don't use the appropriate APIs for what they're trying to accomplish. We can educate them through documentation, but I don't think it's appropriate to retrofit existing APIs to different behavior based on those faulty assumptions, because that has other negative effects, such as breaking the promises we make to experienced and knowledgeable developers. To me, the better policy is to admit our mistake in 3.5.0 and 3.5.1, restore pre-existing behavior, accurately document the trade-offs, and provide a clear, better upgrade path for our users. We've done this beautifully and effectively via the secrets module in Python 3.6. Cheers, -Barry
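The "better upgrade path" referred to here is concretely small. A sketch of what 3.6 code looks like (the variable names are mine): the secrets module always draws from the best available OS randomness source, so callers no longer need to reason about os.urandom's platform quirks.

```python
import secrets

# Cryptographically strong material without touching os.urandom directly.
key = secrets.token_bytes(32)      # 32 random bytes, e.g. for a key
token = secrets.token_urlsafe(16)  # text-safe random token, e.g. for URLs
```

For non-security randomness (simulations, sampling), the random module remains the right tool; secrets is specifically for secrets.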
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 06/16/2016 01:03 AM, Barry Warsaw wrote: *If* we can guarantee that os.urandom() will never block or raise an exception when only poor entropy is available, then it may be indeed indistinguishably backward compatible for most if not all cases. I stepped through the code that shipped in 3.5.2rc1. It only ever calls getrandom() with the GRND_NONBLOCK flag. If getrandom() returns -1 and errno is EAGAIN it falls back to /dev/urandom--I actually simulated this condition in gdb and watched it open /dev/urandom. I didn't see any code for raising an exception or blocking when only poor entropy is available. As Robert Collins points out, this does change the behavior ever-so-slightly from 3.4; if urandom is initialized, and the kernel has the getrandom system call, getrandom() will give us the bytes we asked for and we won't open and read from /dev/urandom. In this state os.urandom() behaves ever-so-slightly differently: * os.urandom() will now work in chroot environments where /dev/urandom doesn't exist. * If Python runs in a chroot environment with a fake /dev/urandom, we'll ignore that and use the kernel's urandom device. * If the sysadmin changed what the systemwide /dev/urandom points to, we'll ignore that and use the kernel's urandom device. But os.urandom() is documented as calling getrandom() when available in 3.5... though doesn't detail how it calls it or what it uses the result for. Anyway, I feel these differences were minor, and covered by the documented change in 3.5, so I thought it was reasonable and un-broken. If this isn't backwards-compatible enough to suit you, please speak up now! //arry/ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
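The logic Larry steps through above can be sketched in Python (a sketch only — CPython's actual implementation is in C, and the helper name here is mine; os.getrandom() and os.GRND_NONBLOCK are the wrappers later exposed in Python 3.6): try the non-blocking getrandom() first, and fall back to reading /dev/urandom on EAGAIN or where the syscall is unavailable.

```python
import errno
import os

def urandom_like(n):
    # Non-blocking getrandom() first (Linux 3.17+; wrapper added in 3.6).
    if hasattr(os, "getrandom"):
        try:
            return os.getrandom(n, os.GRND_NONBLOCK)
        except OSError as e:
            # EAGAIN: urandom pool not yet initialized; fall through to
            # the device. Any other error is a real failure.
            if e.errno != errno.EAGAIN:
                raise
    # Fallback path: read the device node directly.
    with open("/dev/urandom", "rb") as f:
        return f.read(n)
```

Note how this sketch makes the behavioral differences Larry lists visible: once getrandom() succeeds, `/dev/urandom` is never opened, so whatever is (or isn't) mounted at that path no longer matters.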
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016, at 12:53 AM, Nathaniel Smith wrote: >> We have introduced churn. Predicting a future SO question such as "Can >> os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier, >> yes possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of >> the 3.5.x series, and yes possibly in Python 3.6 and beyond". > >It also depends on the kernel version, since it will never block on >old kernels that are missing getrandom(), but it might block on future >kernels if Linux's /dev/urandom ever becomes blocking. (Ted's said >that this is not going to happen now, but the only reason it isn't was >that he tried to make the change and it broke some distros that are >still in use -- so it seems entirely possible that it will happen a >few years from now.) Right; I noticed this and had it in my copious notes for my follow up but forgot to mention it. Thanks! >This is not an accurate docstring, though. The more accurate docstring >for your proposed behavior would be: [...] >You should never use this function. If you need unguessable random >bytes, then the 'secrets' module is always a strictly better choice -- >unlike this function, it always uses the best available source of >cryptographic randomness, even on Linux. Alternatively, if you need >random bytes but it doesn't matter whether other people can guess >them, then the 'random' module is always a strictly better choice -- >it will be faster, as well as providing useful features like >deterministic seeding. Note that I was talking about 3.5.x, where we don't have the secrets module. I'd quibble about the admonition about never using the function. It *can* be useful if the trade-offs are appropriate for your application (e.g. "almost always random enough, but maybe not though at least you won't block and you'll get back something"). Getting the words right is useful, but I agree that we should be strongly recommending crypto applications use the secrets module in Python 3.6. 
>In practice, your proposal means that ~all existing code that uses os.urandom >becomes incorrect and should be switched to either secrets or random. This is >*far* more churn for end-users than Nick's proposal. I disagree. We have a clear upgrade path for end-users. If you're using os.urandom() in pre-3.5 and understand what you're getting (or not getting as the case may be), you will continue to get or not get exactly the same bits in 3.5.x (where x >= 2). No changes to your code are necessary. This is also the case in 3.6 but there you can do much better by porting your code to the new secrets module. Go do that! >...Anyway, since there's clearly going to be at least one PEP about this, >maybe we should stop rehashing bits and pieces of the argument in these long >threads that most people end up skipping and then rehashing again later? Sure, I'll try. ;) Cheers, -Barry
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 16, 2016, at 07:26 PM, Robert Collins wrote: >Which is a contract change. Someone testing in E.g. a chroot could have a >different device on /dev/urandom, and now they will need to intercept >syscalls for the same effect. Personally I think this is fine, but assuming >i see Barry's point correctly, it is indeed but the same as it was. It's true there could be a different device on /dev/urandom, but by my reading of the getrandom() manpage I think that *should* be transparent since By default, getrandom() draws entropy from the /dev/urandom pool. This behavior can be changed via the flags argument. and we don't pass the GRND_RANDOM flag to getrandom(). Cheers, -Barry
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
Nathaniel Smith pobox.com> writes: > In practice, your proposal means that ~all existing code that uses > os.urandom becomes incorrect and should be switched to either secrets > or random. This is *far* more churn for end-users than Nick's > proposal. This should only concern code that a) was specifically written for 3.5.0/3.5.1 and b) implements a serious cryptographic application in Python. I think b) is not a good idea anyway due to timing and side channel attacks and the lack of secure wiping of memory. Such applications should be written in C, where one does not have to predict the behavior of multiple layers of abstractions. Stefan Krah
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 15, 2016, at 11:52 PM, Larry Hastings wrote: >Well, 3.5.2 hasn't happened yet. So if you see it as still being broken, >please speak up now. In discussion with other Ubuntu developers, several salient points were raised. The documentation for os.urandom() in 3.5.2rc1 doesn't make sense: On Linux, getrandom() syscall is used if available and the urandom entropy pool is initialized (getrandom() does not block). On a Unix-like system this will query /dev/urandom. Perhaps better would be: The getrandom() syscall (with the GRND_NONBLOCK flag) is used where it is available and the urandom entropy pool is initialized. When getrandom() returns EAGAIN because of insufficient entropy, fall back to reading from /dev/urandom. On other Unix-like systems, where the getrandom() syscall is unavailable, this will query /dev/urandom. It's actually a rather twisty maze of code to verify these claims, and I'm nearly certain we don't have any tests to guarantee this is what actually happens in those cases, so there are many caveats. This means that an experienced developer can no longer just `man urandom` to understand the unique operational behavior of os.urandom() on their platform, but instead would be forced to actually read our code to find out what's actually happening when/if things break. It is unacceptable if any new exceptions are raised when insufficient entropy is available. Python 3.4 essentially promises that "if only crap entropy is available, you'll get crap, but at least it won't block and no exceptions are raised". Proper backward compatibility requires the same in 3.5 and beyond. Are we sure that's still the case? Using the system call *may* be faster in the we-have-good-entropy case, but it will definitely be slower in the we-don't-have-good-entropy case (because of the fallback logic). Maybe that doesn't matter in practice, but it's worth noting. >Why do you call it only "semi-fixed"?
As far as I understand it, the >semantics of os.urandom() in 3.5.2rc1 are indistinguishable from reading from >/dev/urandom directly, except it may not need to use a file handle. Semi-fixed because os.urandom() will still not be strictly backward compatible between Python 3.5.2 and 3.4. *If* we can guarantee that os.urandom() will never block or raise an exception when only poor entropy is available, then it may be indeed indistinguishably backward compatible for most if not all cases. Cheers, -Barry ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Wed, Jun 15, 2016 at 11:45 PM, Barry Warsawwrote: > On Jun 15, 2016, at 01:01 PM, Nick Coghlan wrote: > >>No, this is a bad idea. Asking novice developers to make security >>decisions they're not yet qualified to make when it's genuinely >>possible for us to do the right thing by default is the antithesis of >>good security API design, and os.urandom() *is* a security API >>(whether we like it or not - third party documentation written by the >>cryptographic software development community has made it so, since >>it's part of their guidelines for writing security sensitive code in >>pure Python). > > Regardless of what third parties have said about os.urandom(), let's look at > what *we* have said about it. Going back to pre-churn 3.4 documentation: > > os.urandom(n) > Return a string of n random bytes suitable for cryptographic use. > > This function returns random bytes from an OS-specific randomness > source. The returned data should be unpredictable enough for cryptographic > applications, though its exact quality depends on the OS > implementation. On a Unix-like system this will query /dev/urandom, and on > Windows it will use CryptGenRandom(). If a randomness source is not found, > NotImplementedError will be raised. > > For an easy-to-use interface to the random number generator provided by > your platform, please see random.SystemRandom. > > So we very clearly provided platform-dependent caveats on the cryptographic > quality of os.urandom(). We also made a strong claim that there's a direct > connection between os.urandom() and /dev/urandom on "Unix-like system(s)". > > We broke that particular promise in 3.5. and semi-fixed it 3.5.2. 
> >>Adding *new* APIs is also a bad idea, since "os.urandom() is the right >>answer on every OS except Linux, and also the best currently available >>answer on Linux" has been the standard security advice for generating >>cryptographic secrets in pure Python code for years now, so we should >>only change that guidance if we have extraordinarily compelling >>reasons to do so, and we don't. > > Disagree. > > We have broken one long-term promise on os.urandom() ("On a Unix-like system > this will query /dev/urandom") and changed another ("should be unpredictable > enough for cryptographic applications, though its exact quality depends on OS > implementations"). > > We broke the experienced Linux developer's natural and long-standing link > between the API called os.urandom() and /dev/urandom. This breaks pre-3.5 > code that assumes read-from-/dev/urandom semantics for os.urandom(). > > We have introduced churn. Predicting a future SO question such as "Can > os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier, yes > possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of the > 3.5.x series, and yes possibly in Python 3.6 and beyond". It also depends on the kernel version, since it will never block on old kernels that are missing getrandom(), but it might block on future kernels if Linux's /dev/urandom ever becomes blocking. (Ted's said that this is not going to happen now, but the only reason it isn't was that he tried to make the change and it broke some distros that are still in use -- so it seems entirely possible that it will happen a few years from now.) > We have a better answer for "cryptographically appropriate" use cases in > Python 3.6 - the secrets module. Trying to make os.urandom() "the right > answer on every OS" weakens the promotion of secrets as *the* module to use > for cryptographically appropriate use cases. 
> > IMHO it would be better to leave os.urandom() well enough alone, except for > the documentation which should effectively say, a la 3.4: > > os.urandom(n) > Return a string of n random bytes suitable for cryptographic use. > > This function returns random bytes from an OS-specific randomness > source. The returned data should be unpredictable enough for cryptographic > applications, though its exact quality depends on the OS > implementation. On a Unix-like system this will query /dev/urandom, and on > Windows it will use CryptGenRandom(). If a randomness source is not found, > NotImplementedError will be raised. > > Cryptographic applications should use the secrets module for stronger > guaranteed sources of randomness. > > For an easy-to-use interface to the random number generator provided by > your platform, please see random.SystemRandom. This is not an accurate docstring, though. The more accurate docstring for your proposed behavior would be: os.urandom(n) Return a string of n bytes that will usually, but not always, be suitable for cryptographic use. This function returns random bytes from an OS-specific randomness source. On non-Linux OSes, this uses the best available source of randomness, e.g. CryptGenRandom() on Windows and /dev/urandom on OS X, and thus will be strong enough
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
Nathaniel Smith pobox.com> writes: > On Wed, Jun 15, 2016 at 10:25 PM, Theodore Ts'o mit.edu> wrote: > > In practice, those Python invocations which are exposed to hostile input > > are those that are started while the network is up. > > Not sure what you mean about the vast majority of Python invocations > being launched by the web browser? "Python invocations which are exposed to hostile input". ;) Stefan Krah ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Wed, Jun 15, 2016 at 10:25 PM, Theodore Ts'o wrote: > On Wed, Jun 15, 2016 at 04:12:57PM -0700, Nathaniel Smith wrote: >> - It's not exactly true that the Python interpreter doesn't need >> cryptographic randomness to initialize SipHash -- it's more that >> *some* Python invocations need unguessable randomness (to first >> approximation: all those which are exposed to hostile input), and some >> don't. And since the Python interpreter has no idea which case it's >> in, and since it's unacceptable for it to break invocations that don't >> need unguessable hashes, then it has to err on the side of continuing >> without randomness. All that's fine. > > In practice, those Python invocations which are exposed to hostile input > are those that are started while the network is up. The vast > majority of time, they are launched by the web browser --- and if this > happens after a second or so of the system getting networking > interrupts, (a) getrandom won't block, and (b) /dev/urandom and > getrandom will be initialized. Not sure what you mean about the vast majority of Python invocations being launched by the web browser? But anyway, sure, usually this isn't an issue. This is just a discussion about what to do in the unlikely case when it actually has become an issue, and it's hard to be certain that this will *never* happen. E.g. it's entirely plausible that someone will write some cloud-init plugin that exposes an HTTP server or something. People do all kinds of weird things in VMs these days... Basically this is a question of whether we should make an (unlikely) error totally invisible to the user, and "errors should never pass silently" is right there in the Zen of Python :-). > Also, I wish people would say that this is only an issue on Linux. > Again, FreeBSD's /dev/urandom will block as well if it is > uninitialized. It's just that in practice, for both Linux and > FreeBSD, we try very hard to make sure /dev/urandom is fully > initialized by the time it matters. 
It's just that so far, it's only > on Linux when there was an attempt to use Python in the early init > scripts, and in a VM and in a system where everything is modularized > such that the deadlock became visible. > > >> (I guess the way to implement this would be for the SipHash >> initialization code -- which runs very early -- to set some flag, and >> then we expose that flag in sys._something, and later in the startup >> sequence check for it after the warnings module is functional. >> Exposing the flag at the Python level would also make it possible for >> code like cloud-init to do its own explicit check and respond >> appropriately.) > > I really don't think it's that big of a deal in *practice*, but > if you really are concerned about the very remote possibility that a > Python invocation could start in early boot, and *then* also stick > around for the long term, and *then* be exposed to hostile input --- > what if you set the flag, and then later on, after N minutes, either > automatically, or via some trigger such as cloud-init --- try and see > if /dev/urandom is initialized (even a few seconds later, so long as > the init scripts are hanging, it should be initialized) and have Python > hash all of its dicts, or maybe just the non-system dicts (since those > are presumably the ones most likely to be exposed to hostile input). I don't think this is technically doable. There's no global list of hash tables, and Python exposes the actual hash values to user code with some guarantee that they won't change. -n -- Nathaniel J. Smith -- https://vorpus.org
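[Editorial aside: the guarantee Nathaniel mentions is easy to see directly. The SipHash seed is chosen once at interpreter startup and never changes within a process, and PYTHONHASHSEED pins it across processes. A small demonstration; `hash_of_hello` is a hypothetical helper name:

```python
import os
import subprocess
import sys

def hash_of_hello(seed):
    """Run a fresh interpreter with PYTHONHASHSEED=seed, report hash('hello')."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    out = subprocess.run([sys.executable, "-c", "print(hash('hello'))"],
                         capture_output=True, text=True, env=env)
    return out.stdout.strip()

# Within one process the hash never changes -- user code may rely on this,
# which is why dicts cannot be transparently re-hashed mid-run.
assert hash('hello') == hash('hello')

# Pinning the seed makes separate processes agree; with the default
# randomized seed, separate runs will usually disagree.
print(hash_of_hello("0") == hash_of_hello("0"))
```

Re-seeding "later, once /dev/urandom is ready" would break every stored hash value, which is the technical obstacle above.]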
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 16 Jun 2016 6:55 PM, "Larry Hastings" wrote: > > > Why do you call it only "semi-fixed"? As far as I understand it, the semantics of os.urandom() in 3.5.2rc1 are indistinguishable from reading from /dev/urandom directly, except it may not need to use a file handle. Which is a contract change. Someone testing in, e.g., a chroot could have a different device on /dev/urandom, and now they will need to intercept syscalls for the same effect. Personally I think this is fine, but assuming I see Barry's point correctly, it is indeed not the same as it was. -rob
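[Editorial aside: the distinction Rob draws is observable from Python. On 3.5.2+/Linux, os.urandom() may be the getrandom() syscall and never open the device node at all, while the pre-3.5 behavior was an actual read of /dev/urandom, which a chroot or mount namespace can redirect. A sketch of the two paths:

```python
import os

# Path 1: the os.urandom() API -- on 3.5.2+/Linux this may be the
# getrandom() syscall and never open /dev/urandom at all.
via_api = os.urandom(16)
assert isinstance(via_api, bytes) and len(via_api) == 16

# Path 2: an explicit device read, which *is* affected by whatever is
# mounted at /dev/urandom (the chroot scenario above).
if os.path.exists("/dev/urandom"):
    with open("/dev/urandom", "rb") as f:
        via_device = f.read(16)
    assert len(via_device) == 16
```

Both return random bytes in normal operation; only path 2 can be intercepted by replacing the device, which is the contract change being discussed.]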
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On 06/15/2016 11:45 PM, Barry Warsaw wrote: So we very clearly provided platform-dependent caveats on the cryptographic quality of os.urandom(). We also made a strong claim that there's a direct connection between os.urandom() and /dev/urandom on "Unix-like system(s)". We broke that particular promise in 3.5 and semi-fixed it in 3.5.2. Well, 3.5.2 hasn't happened yet. So if you see it as still being broken, please speak up now. Why do you call it only "semi-fixed"? As far as I understand it, the semantics of os.urandom() in 3.5.2rc1 are indistinguishable from reading from /dev/urandom directly, except it may not need to use a file handle. //arry/
Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
On Jun 15, 2016, at 01:01 PM, Nick Coghlan wrote: >No, this is a bad idea. Asking novice developers to make security >decisions they're not yet qualified to make when it's genuinely >possible for us to do the right thing by default is the antithesis of >good security API design, and os.urandom() *is* a security API >(whether we like it or not - third party documentation written by the >cryptographic software development community has made it so, since >it's part of their guidelines for writing security sensitive code in >pure Python). Regardless of what third parties have said about os.urandom(), let's look at what *we* have said about it. Going back to pre-churn 3.4 documentation: os.urandom(n) Return a string of n random bytes suitable for cryptographic use. This function returns random bytes from an OS-specific randomness source. The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation. On a Unix-like system this will query /dev/urandom, and on Windows it will use CryptGenRandom(). If a randomness source is not found, NotImplementedError will be raised. For an easy-to-use interface to the random number generator provided by your platform, please see random.SystemRandom. So we very clearly provided platform-dependent caveats on the cryptographic quality of os.urandom(). We also made a strong claim that there's a direct connection between os.urandom() and /dev/urandom on "Unix-like system(s)". We broke that particular promise in 3.5 and semi-fixed it in 3.5.2. >Adding *new* APIs is also a bad idea, since "os.urandom() is the right >answer on every OS except Linux, and also the best currently available >answer on Linux" has been the standard security advice for generating >cryptographic secrets in pure Python code for years now, so we should >only change that guidance if we have extraordinarily compelling >reasons to do so, and we don't. Disagree. 
We have broken one long-term promise on os.urandom() ("On a Unix-like system this will query /dev/urandom") and changed another ("should be unpredictable enough for cryptographic applications, though its exact quality depends on OS implementations"). We broke the experienced Linux developer's natural and long-standing link between the API called os.urandom() and /dev/urandom. This breaks pre-3.5 code that assumes read-from-/dev/urandom semantics for os.urandom(). We have introduced churn. Predicting a future SO question such as "Can os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier, yes possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of the 3.5.x series, and yes possibly in Python 3.6 and beyond". We have a better answer for "cryptographically appropriate" use cases in Python 3.6 - the secrets module. Trying to make os.urandom() "the right answer on every OS" weakens the promotion of secrets as *the* module to use for cryptographically appropriate use cases. IMHO it would be better to leave os.urandom() well enough alone, except for the documentation which should effectively say, a la 3.4: os.urandom(n) Return a string of n random bytes suitable for cryptographic use. This function returns random bytes from an OS-specific randomness source. The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation. On a Unix-like system this will query /dev/urandom, and on Windows it will use CryptGenRandom(). If a randomness source is not found, NotImplementedError will be raised. Cryptographic applications should use the secrets module for stronger guaranteed sources of randomness. For an easy-to-use interface to the random number generator provided by your platform, please see random.SystemRandom. 
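[Editorial aside: the secrets module Barry refers to (new in 3.6) packages exactly this guidance behind use-case-named functions. A minimal sketch of its API:

```python
import secrets

# All of these draw from the OS randomness source, with an API named
# for the security use case rather than the underlying mechanism:
key = secrets.token_bytes(16)       # 16 raw random bytes
hexkey = secrets.token_hex(16)      # 16 bytes rendered as 32 hex digits
url = secrets.token_urlsafe(16)     # URL-safe base64 text
pick = secrets.choice(["a", "b", "c"])  # unbiased selection

assert len(key) == 16
assert len(hexkey) == 32
```

Leaving os.urandom() as the documented-with-caveats primitive and pointing cryptographic users here is the division of labor being proposed.]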
Cheers, -Barry 
Re: [Python-Dev] Why does base64 return bytes?
Steven D'Aprano wrote: I'm satisfied that the choice made by Python is the right choice, and that it meets the spirit (if, arguably, not the letter) of the RFC. IMO it meets the letter (if you read it a certain way) but *not* the spirit. -- Greg 