jbk python-dev-requ...@python.org wrote:
>Send Python-Dev mailing list submissions to
>    python-dev@python.org
>
>To subscribe or unsubscribe via the World Wide Web, visit
>    http://mail.python.org/mailman/listinfo/python-dev
>or, via email, send a message with subject or body 'help' to
>    python-dev-requ...@python.org
>
>You can reach the person managing the list at
>    python-dev-ow...@python.org
>
>When replying, please edit your Subject line so it is more specific
>than "Re: Contents of Python-Dev digest..."
>
>
>Today's Topics:
>
>   1. Re: Status of the fix for the hash collision vulnerability
>      (Gregory P. Smith)
>   2. Re: Status of the fix for the hash collision vulnerability
>      (Barry Warsaw)
>   3. Re: Sphinx version for Python 2.x docs (Éric Araujo)
>   4. Re: Status of the fix for the hash collision vulnerability
>      (mar...@v.loewis.de)
>   5. Re: Status of the fix for the hash collision vulnerability
>      (Guido van Rossum)
>   6. Re: [Python-checkins] cpython: add test, which was missing
>      from d64ac9ab4cd0 (Nick Coghlan)
>   7. Re: Status of the fix for the hash collision vulnerability
>      (Terry Reedy)
>   8. Re: Status of the fix for the hash collision vulnerability
>      (Jack Diederich)
>   9. Re: cpython: Implement PEP 380 - 'yield from' (closes #11682)
>      (Nick Coghlan)
>  10. Re: Status of the fix for the hash collision vulnerability
>      (Nick Coghlan)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Fri, 13 Jan 2012 19:06:00 -0800
>From: "Gregory P. Smith" <g...@krypto.org>
>Cc: python-dev@python.org
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID:
>    <cage7pnkkhw-_wqiuqc9bhqxnou77f+eprs_q3nqmycstm3j...@mail.gmail.com>
>Content-Type: text/plain; charset="iso-8859-1"
>
>On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith <g...@krypto.org> wrote:
>
>>
>> On Fri, Jan 13, 2012 at 5:38 PM, Guido van Rossum <gu...@python.org> wrote:
>>
>>> On Fri, Jan 13, 2012 at 5:17 PM, Antoine Pitrou <solip...@pitrou.net> wrote:
>>>
>>>> On Thu, 12 Jan 2012 18:57:42 -0800
>>>> Guido van Rossum <gu...@python.org> wrote:
>>>> > Hm... I started out as a big fan of the randomized hash, but thinking more
>>>> > about it, I actually believe that the chances of some legitimate app having
>>>> > >1000 collisions are way smaller than the chances that somebody's code will
>>>> > break due to the variable hashing.
>>>>
>>>> Breaking due to variable hashing is deterministic: you notice it as
>>>> soon as you upgrade (and then you use PYTHONHASHSEED to disable
>>>> variable hashing). That seems better than unpredictable breaking when
>>>> some legitimate collision chain happens.
>>>
>>>
>>> Fair enough. But I'm now uncomfortable with turning this on for bugfix
>>> releases. I'm fine with making this the default in 3.3, just not in 3.2,
>>> 3.1 or 2.x -- it will break too much code and organizations will have to
>>> roll back the release or do extensive testing before installing a bugfix
>>> release -- exactly what we *don't* want for those.
>>>
>>> FWIW, I don't believe in the SafeDict solution -- you never know which
>>> dicts you have to change.
>>>
>>>
>> Agreed.
>>
>> Of the three options Victor listed only one is good.
>>
>> I don't like *SafeDict*. *-1*. It puts the onus on the coder to
>> always get everything right with regards to data that came from outside the
>> process never ending up hashed in a non-safe dict or set *anywhere*.
>> "Safe" needs to be the default option for all hash tables.
>>
>> I don't like the "*too many hash collisions*" exception. *-1*. It
>> provides non-deterministic application behavior for data driven
>> applications with no way for them to predict when it'll happen or where and
>> prepare for it. It may work in practice for many applications but is simply
>> odd behavior.
>>
>> I do like *randomly seeding the hash*. *+1*. This is easy. It can easily
>> be back ported to any Python version.
>>
>> It is perfectly okay to break existing users who had anything depending on
>> ordering of internal hash tables. Their code was already broken. We
>> *will* provide a flag and/or environment variable that can be set to turn the
>> feature off at their own peril which they can use in their test harnesses
>> that are stupid enough to use doctests with order dependencies.
>>
>
>What an implementation looks like:
>
> http://pastebin.com/9ydETTag
>
>some stuff to be filled in, but this is all that is really required. add
>logic to allow a particular seed to be specified or forced to 0 from the
>command line or environment. add the logic to grab random bytes. add the
>autoconf glue to disable it. done.
>
>-gps
>
>
>> This approach worked fine for Perl 9 years ago.
>> https://rt.perl.org/rt3//Public/Bug/Display.html?id=22371
>>
>> -gps
>>
>
>------------------------------
>
>Message: 2
>Date: Sat, 14 Jan 2012 04:19:38 +0100
>From: Barry Warsaw <ba...@python.org>
>To: python-dev@python.org
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID: <20120114041938.098fd14b@rivendell>
>Content-Type: text/plain; charset=US-ASCII
>
>On Jan 13, 2012, at 05:38 PM, Guido van Rossum wrote:
>
>>On Fri, Jan 13, 2012 at 5:17 PM, Antoine Pitrou <solip...@pitrou.net> wrote:
>>
>>> Breaking due to variable hashing is deterministic: you notice it as
>>> soon as you upgrade (and then you use PYTHONHASHSEED to disable
>>> variable hashing). That seems better than unpredictable breaking when
>>> some legitimate collision chain happens.
>>
>>
>>Fair enough. But I'm now uncomfortable with turning this on for bugfix
>>releases. I'm fine with making this the default in 3.3, just not in 3.2,
>>3.1 or 2.x -- it will break too much code and organizations will have to
>>roll back the release or do extensive testing before installing a bugfix
>>release -- exactly what we *don't* want for those.
>
>+1
>
>-Barry
>
>
>------------------------------
>
>Message: 3
>Date: Sat, 14 Jan 2012 04:24:52 +0100
>From: Éric Araujo <mer...@netwok.org>
>To: <python-dev@python.org>
>Subject: Re: [Python-Dev] Sphinx version for Python 2.x docs
>Message-ID: <ff8dc5d4bd1c5d3583c3ff9c18e24...@netwok.org>
>Content-Type: text/plain; charset=UTF-8; format=flowed
>
>Hi Sandro,
>
>Thanks for getting the ball rolling on this. One style for markup, one
>Sphinx version to code our extensions against and one location for the
>documenting guidelines will make our work a bit easier.
>
>> During the build process, there are some warnings that I can
>> understand:
>I assume you mean "can't", as you later ask how to fix them. As a
>general rule, they're only warnings, so they don't break the build, only
>some links or stylings, so I think it's okay to ignore them *right now*.
>
>> Doc/glossary.rst:520: WARNING: unknown keyword: nonlocal
>That's a mistake I made in cefe4f38fa0e. This sentence should be
>removed.
>
>> Doc/library/stdtypes.rst:2372: WARNING: more than one target found for
>> cross-reference u'next':
>Need to use :meth:`.next` to let Sphinx find the right target (more info
>on request :)
>
>> Doc/library/sys.rst:651: WARNING: unknown keyword: None
>Should use ``None``.
>
>> Doc/reference/datamodel.rst:1942: WARNING: unknown keyword: not in
>> Doc/reference/expressions.rst:1184: WARNING: unknown keyword: is not
>I don't know if these should work (i.e. create a link to the appropriate
>language reference section) or abuse the markup (there are "not" and
>"in" keywords, but no "not in" keyword -- use ``not in``). I'd say ignore
>them.
>
>Cheers
>
>
>------------------------------
>
>Message: 4
>Date: Sat, 14 Jan 2012 04:45:57 +0100
>From: mar...@v.loewis.de
>To: python-dev@python.org
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID:
>    <20120114044557.horde.mzdrbfnncxdpepp1qvb0...@webmail.df.eu>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed; DelSp=Yes
>
>> What an implementation looks like:
>>
>> http://pastebin.com/9ydETTag
>>
>> some stuff to be filled in, but this is all that is really required.
>
>I think this statement (and the patch) is wrong. You also need to change
>the byte string hashing, at least for 2.x. This I consider the biggest
>flaw in that approach - other people may have written string-like objects
>which continue to compare equal to a string but now hash differently.
>
>Regards,
>Martin
>
>
>------------------------------
>
>Message: 5
>Date: Fri, 13 Jan 2012 20:00:54 -0800
>From: Guido van Rossum <gu...@python.org>
>To: "Gregory P. Smith" <g...@krypto.org>
>Cc: Antoine Pitrou <solip...@pitrou.net>, python-dev@python.org
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID:
>    <cap7+vjl+qrz0oiqblpcg3qxvqzljboemqpeqykiidigc2un...@mail.gmail.com>
>Content-Type: text/plain; charset="iso-8859-1"
>
>On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith <g...@krypto.org> wrote:
>
>> It is perfectly okay to break existing users who had anything depending on
>> ordering of internal hash tables. Their code was already broken. We
>> *will* provide a flag and/or environment variable that can be set to turn the
>> feature off at their own peril which they can use in their test harnesses
>> that are stupid enough to use doctests with order dependencies.
>
>
>No, that is not how we usually take compatibility between bugfix releases.
>"Your code is already broken" is not an argument to break forcefully what
>worked (even if by happenstance) before. The difference between CPython and
>Jython (or between different CPython feature releases) also isn't relevant
>-- historically we have often bent over backwards to avoid changing
>behavior that was technically undefined, if we believed it would affect a
>significant fraction of users.
>
>I don't think anyone doubts that this will break lots of code (at least,
>the arguments I've heard have been "their code is broken", not "nobody does
>that").
>
>> This approach worked fine for Perl 9 years ago.
>> https://rt.perl.org/rt3//Public/Bug/Display.html?id=22371
>>
>
>I don't know what the Perl attitude about breaking undefined behavior
>between micro versions was at the time. But ours is pretty clear -- don't
>do it.
>
>--
>--Guido van Rossum (python.org/~guido)
>
>------------------------------
>
>Message: 6
>Date: Sat, 14 Jan 2012 15:16:32 +1000
>From: Nick Coghlan <ncogh...@gmail.com>
>To: python-dev@python.org
>Cc: python-check...@python.org
>Subject: Re: [Python-Dev] [Python-checkins] cpython: add test, which
>    was missing from d64ac9ab4cd0
>Message-ID:
>    <cadisq7fcjlgkrjqeqbhb0onu9eilnhhovtozrdznstdvjzx...@mail.gmail.com>
>Content-Type: text/plain; charset=ISO-8859-1
>
>On Sat, Jan 14, 2012 at 5:39 AM, benjamin.peterson
><python-check...@python.org> wrote:
>> http://hg.python.org/cpython/rev/be85914b611c
>> changeset:   74363:be85914b611c
>> parent:      74361:609482c6710e
>> user:        Benjamin Peterson <benja...@python.org>
>> date:        Fri Jan 13 14:39:38 2012 -0500
>> summary:
>>  add test, which was missing from d64ac9ab4cd0
>
>Ah, that's where that came from, thanks.
>
>I still haven't fully trained myself to use hg import instead of
>patch, which would avoid precisely this kind of error :P
>
>Cheers,
>Nick.
>
>--
>Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
>
>------------------------------
>
>Message: 7
>Date: Sat, 14 Jan 2012 00:43:04 -0500
>From: Terry Reedy <tjre...@udel.edu>
>To: python-dev@python.org
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID: <jer4lp$qe4$1...@dough.gmane.org>
>Content-Type: text/plain; charset=UTF-8; format=flowed
>
>On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
>
>> It is perfectly okay to break existing users who had anything depending
>> on ordering of internal hash tables. Their code was already broken.
>
>Given that the doc says "Return the hash value of the object", I do not
>think we should be so hard-nosed. The above clearly implies that there
>is such a thing as *the* Python hash value for an object. And indeed,
>that has been true across many versions. If we had written "Return a
>hash value for the object, which can vary from run to run", the case
>would be different.
>
>--
>Terry Jan Reedy
>
>
>------------------------------
>
>Message: 8
>Date: Sat, 14 Jan 2012 01:24:54 -0500
>From: Jack Diederich <jackd...@gmail.com>
>To: Guido van Rossum <gu...@python.org>
>Cc: Python Dev <Python-Dev@python.org>
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID:
>    <CACLn2+3Z1EW8Rxox7Zif=20p2sdhxyhv+wo6dhxkkno09+-...@mail.gmail.com>
>Content-Type: text/plain; charset=ISO-8859-1
>
>On Thu, Jan 12, 2012 at 9:57 PM, Guido van Rossum <gu...@python.org> wrote:
>> Hm... I started out as a big fan of the randomized hash, but thinking more
>> about it, I actually believe that the chances of some legitimate app having
>> >1000 collisions are way smaller than the chances that somebody's code will
>> break due to the variable hashing.
>
>Python's dicts are designed to avoid hash conflicts by resizing and
>keeping the available slots bountiful. 1000 conflicts sounds like a
>number that couldn't be hit accidentally unless you had a single dict
>using a terabyte of RAM (i.e. if Titus Brown doesn't object, we're
>good). The hashes also look to exploit cache locality but that is
>very unlikely to get one thousand conflicts by chance. If you get
>that many there is an attack.
>
>> This is depending on how the counting is done (I didn't look at MAL's
>> patch), and assuming that increasing the hash table size will generally
>> reduce collisions if items collide but their hashes are different.
>
>The patch counts conflicts on an individual insert and not lifetime
>conflicts. Looks sane to me.
>
>> That said, even with collision counting I'd like a way to disable it without
>> changing the code, e.g. a flag or environment variable.
>
>Agreed. Paranoid people can turn the behavior off and if it ever were
>to become a problem in practice we could point people to a solution.
>
>-Jack
>
>
>------------------------------
>
>Message: 9
>Date: Sat, 14 Jan 2012 16:53:39 +1000
>From: Nick Coghlan <ncogh...@gmail.com>
>To: Georg Brandl <g.bra...@gmx.net>
>Cc: python-dev@python.org
>Subject: Re: [Python-Dev] cpython: Implement PEP 380 - 'yield from'
>    (closes #11682)
>Message-ID:
>    <CADiSq7dA6P8U3_MiweM9=s-q49+y0KndeQX=zngwog-dz-h...@mail.gmail.com>
>Content-Type: text/plain; charset=ISO-8859-1
>
>On Sat, Jan 14, 2012 at 1:17 AM, Georg Brandl <g.bra...@gmx.net> wrote:
>> On 01/13/2012 12:43 PM, nick.coghlan wrote:
>>> diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst
>>
>> There should probably be a "versionadded" somewhere on this page.
>
>Good catch, I added versionchanged notes to this page, simple_stmts
>and the StopIteration entry in the library reference.
>
>>>  PEP 3155: Qualified name for classes and functions
>>>  ==================================================
>>
>> This looks like a spurious (and syntax-breaking) change.
>
>Yeah, it was an error I introduced last time I merged from default. Fixed.
>
>>> diff --git a/Grammar/Grammar b/Grammar/Grammar
>>> -argument: test [comp_for] | test '=' test  # Really [keyword '='] test
>>> +argument: (test) [comp_for] | test '=' test  # Really [keyword '='] test
>>
>> This looks like a change without effect?
>
>Fixed.
>
>It was a lingering after-effect of Greg's original patch (which also
>modified the function call syntax to allow "yield from" expressions
>with extra parens). I reverted the change to the function call syntax,
>but forgot to ditch the added parens while doing so.
>
>>> diff --git a/Include/genobject.h b/Include/genobject.h
>>>
>>> -     /* List of weak reference. */
>>> -     PyObject *gi_weakreflist;
>>> +        /* List of weak reference. */
>>> +        PyObject *gi_weakreflist;
>>>  } PyGenObject;
>>
>> While these change tabs into spaces, it should be 4 spaces, not 8.
>
>Fixed.
>
>>> +PyAPI_FUNC(int) PyGen_FetchStopIterationValue(PyObject **);
>>
>> Does this API need to be public? If yes, it needs to be documented.
>
>Hmm, good point - that one needs a bit of thought, so I've put it on
>the tracker: http://bugs.python.org/issue13783
>
>(that issue also covers your comments regarding the docstring for this
>function and whether or not we even need the StopIteration instance
>creation API)
>
>>> -#define CALL_FUNCTION        131     /* #args + (#kwargs<<8) */
>>> -#define MAKE_FUNCTION        132     /* #defaults + #kwdefaults<<8 +
>>> #annotations<<16 */
>>> -#define BUILD_SLICE  133     /* Number of items */
>>> +#define CALL_FUNCTION   131     /* #args + (#kwargs<<8) */
>>> +#define MAKE_FUNCTION   132     /* #defaults + #kwdefaults<<8 +
>>> #annotations<<16 */
>>> +#define BUILD_SLICE     133     /* Number of items */
>>
>> Not sure putting these and all the other cosmetic changes into an already
>> big patch is such a good idea...
>
>I agree, but it's one of the challenges of a long-lived branch like
>the PEP 380 one (I believe some of these cosmetic changes started life
>in Greg's original patch and separating them out would have been quite
>a pain). Anyone that wants to see the gory details of the branch
>history can take a look at my bitbucket repo:
>
>https://bitbucket.org/ncoghlan/cpython_sandbox/changesets/tip/branch%28%22pep380%22%29
>
>>> diff --git a/Objects/abstract.c b/Objects/abstract.c
>>> --- a/Objects/abstract.c
>>> +++ b/Objects/abstract.c
>>> @@ -2267,7 +2267,6 @@
>>>
>>>      func = PyObject_GetAttrString(o, name);
>>>      if (func == NULL) {
>>> -        PyErr_SetString(PyExc_AttributeError, name);
>>>          return 0;
>>>      }
>>>
>>> @@ -2311,7 +2310,6 @@
>>>
>>>      func = PyObject_GetAttrString(o, name);
>>>      if (func == NULL) {
>>> -        PyErr_SetString(PyExc_AttributeError, name);
>>>          return 0;
>>>      }
>>>      va_start(va, format);
>>
>> These two changes also look suspiciously unrelated?
>
>IIRC, I removed those lines while working on the patch because the
>message they produce (just the attribute name) is worse than the one
>produced by the call to PyObject_GetAttrString (which also includes
>the type of the object being accessed). Leaving the original
>exceptions alone helped me track down some failures I was getting at
>the time.
>
>I've now made the various CallMethod helper APIs in abstract.c (1
>public, 3 private) consistently leave the GetAttr exception alone and
>added an explicit C API note to NEWS.
>
>(Vaguely related tangent: the new code added by the patch probably has
>a few parts that could benefit from the new GetAttrId private API)
>
>>> diff --git a/Objects/genobject.c b/Objects/genobject.c
>>> +        } else {
>>> +            PyObject *e = PyStopIteration_Create(result);
>>> +            if (e != NULL) {
>>> +                PyErr_SetObject(PyExc_StopIteration, e);
>>> +                Py_DECREF(e);
>>> +            }
>>
>> Wouldn't PyErr_SetObject(PyExc_StopIteration, value) suffice here
>> anyway?
>
>I think you're right - so noted in the tracker issue about the C API additions.
>
>Thanks for the thorough review, a fresh set of eyes is very helpful :)
>
>Cheers,
>Nick.
>
>--
>Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
>
>------------------------------
>
>Message: 10
>Date: Sat, 14 Jan 2012 17:01:48 +1000
>From: Nick Coghlan <ncogh...@gmail.com>
>To: Jack Diederich <jackd...@gmail.com>
>Cc: Guido van Rossum <gu...@python.org>, Python Dev
>    <Python-Dev@python.org>
>Subject: Re: [Python-Dev] Status of the fix for the hash collision
>    vulnerability
>Message-ID:
>    <cadisq7cmnjm8meehktfja5ss+k0z8u_cf7tmmucn56dwozv...@mail.gmail.com>
>Content-Type: text/plain; charset=ISO-8859-1
>
>On Sat, Jan 14, 2012 at 4:24 PM, Jack Diederich <jackd...@gmail.com> wrote:
>>> This is depending on how the counting is done (I didn't look at MAL's
>>> patch), and assuming that increasing the hash table size will generally
>>> reduce collisions if items collide but their hashes are different.
>>
>> The patch counts conflicts on an individual insert and not lifetime
>> conflicts.  Looks sane to me.
>
>Having a hard limit on the worst-case behaviour certainly sounds like
>an attractive prospect. And there's nothing to worry about in terms of
>secrecy or sufficient randomness - by default, attackers cannot
>generate more than 1000 hash collisions in one lookup, period.
>
>>> That said, even with collision counting I'd like a way to disable it without
>>> changing the code, e.g. a flag or environment variable.
>>
>> Agreed.  Paranoid people can turn the behavior off and if it ever were
>> to become a problem in practice we could point people to a solution.
>
>Does MAL's patch allow the limit to be set on a per-dict basis
>(including setting it to None to disable collision limiting
>completely)? If people have data sets that need to tolerate that kind
>of collision level (and haven't already decided to move to a data
>structure other than the builtin dict), then it may make sense to
>allow them to remove the limit when using trusted input.
>
>For maintenance versions though, it would definitely need to be
>possible to switch it off without touching the code.
>
>Cheers,
>Nick.
>
>--
>Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
>
>------------------------------
>
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>
>
>End of Python-Dev Digest, Vol 102, Issue 35
>*******************************************
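
For readers following the collision-counting idea from Messages 8 and 10 (count conflicts per individual insert or lookup, cap them at 1000 by default, and let a caller set the limit to None for trusted input), here is a minimal sketch in Python. It is not MAL's patch and not how CPython's dict is implemented: LimitedDict, CollisionLimitError and Colliding are hypothetical names made up for this illustration, and the table uses simple linear probing where CPython uses a different probing scheme.

```python
class CollisionLimitError(Exception):
    """Hypothetical error: one probe sequence exceeded the collision limit."""


class LimitedDict:
    """Toy open-addressing table (linear probing, no deletion).

    collision_limit is per table; None disables the check entirely.
    """

    def __init__(self, collision_limit=1000):
        self._slots = [None] * 8          # size stays a power of two
        self._used = 0
        self.collision_limit = collision_limit

    def _find(self, slots, key):
        """Return the index holding key, or the free slot where it belongs."""
        mask = len(slots) - 1
        i = hash(key) & mask
        collisions = 0                    # counted per operation, not lifetime
        while slots[i] is not None and slots[i][0] != key:
            collisions += 1
            limit = self.collision_limit
            if limit is not None and collisions > limit:
                raise CollisionLimitError("more than %d collisions" % limit)
            i = (i + 1) & mask
        return i

    def __setitem__(self, key, value):
        i = self._find(self._slots, key)
        if self._slots[i] is None:
            self._used += 1
        self._slots[i] = (key, value)
        if self._used * 3 >= len(self._slots) * 2:   # keep load under 2/3
            self._resize()

    def __getitem__(self, key):
        i = self._find(self._slots, key)
        if self._slots[i] is None:
            raise KeyError(key)
        return self._slots[i][1]

    def _resize(self):
        new = [None] * (len(self._slots) * 2)
        for entry in self._slots:
            if entry is not None:
                new[self._find(new, entry[0])] = entry
        self._slots = new


class Colliding:
    """Key type whose instances all hash alike, standing in for hostile input."""

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return isinstance(other, Colliding) and self.n == other.n


if __name__ == "__main__":
    table = LimitedDict(collision_limit=10)
    try:
        for n in range(20):
            table[Colliding(n)] = n
    except CollisionLimitError as exc:
        print("insert refused:", exc)
```

Because the counter resets on every operation, ordinary keys never come near the limit, while a long chain of identical hashes is refused as soon as a single probe sequence grows past it. Passing collision_limit=None corresponds to the per-dict opt-out Nick asks about for trusted data.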