[Python-Dev] Extent of post-rc churn
http://midwinter.com/~larry/3.4.status/merge.status.html lists enough changes that it sounds more like a bugfix release than just a few last tweaks after the rc. It would probably help if the what's-new-in-rc2 document explicitly mentioned that asyncio is new and provisional in 3.4, and listed its changes in a separate subsection, so that the "final tweaks to something I might already be using" section would be less intimidating.

-jJ
[Python-Dev] DB-API v2.1 or v3 [inspired by: python 3 niggle: None < 1 raises TypeError]
I personally regret that sorting isn't safe, but that ship has sailed. There is a practical benefit in making None compare to everything, just as C and Java do with null pointers -- but it is too late to do that by default. Adding a keyword to sorted might be nice -- but then shouldn't it also be added to other sorts, and maybe to max/min? It might just be trading one sort of mess for another. What *can* reasonably be changed is the DB-API. Why not just specify that the DB type objects themselves should handle comparison to None? http://www.python.org/dev/peps/pep-0249/#type-objects-and-constructors

-jJ
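(For reference, the workaround available today without any new keyword to sorted is a key function that decides where None values go; putting them first here is just an assumption for illustration.)

    def none_first(value):
        # Group None before everything else, then compare the real values.
        return (value is not None, value)

    rows = [3, None, 1, None, 2]
    print(sorted(rows, key=none_first))   # [None, None, 1, 2, 3]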
Re: [Python-Dev] PEP 460 reboot
On Tue, Jan 14, 2014 at 3:06 PM, Guido van Rossum gu...@python.org wrote:

> Personally I wouldn't add any words suggesting or referring to the option of creating another class for this purpose. You wouldn't recommend subclassing dict for constraining the types of keys or values, would you?

Yes, and it is so clear that I suspect I'm missing some context for your question. Do I recommend that each individual application should create new concrete classes instead of just using the builtins? No. When trying to understand (learn about) the text/binary distinction, I do recommend pretending that they are represented by separate classes. Limits on the values in a bytearray are NOT the primary reason for this; the primary reason is that operations like the literal representation or the capitalize method are arbitrary nonsense unless the data happens to be representing ASCII.

    sound_sample.capitalize()  -- syntactically valid, but semantic garbage
    header.capitalize()        -- OK, which implies that the data is an instance of something more specific than bytes

Would I recommend subclassing dict if I wanted to constrain the key types? Yes -- though MutableMapping (fewer gates to guard) or the upcoming TransformDict would probably be better still. The existing dict implementation itself effectively uses (hidden, quasi-)subclasses to restrict the types of keys, strictly for efficiency (the lookdict* variants).

-jJ
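(To make the subclassing aside concrete: a minimal sketch of constraining key types behind MutableMapping -- the "fewer gates to guard" option -- rather than subclassing dict directly. The class name and the error message are invented for illustration.)

    from collections.abc import MutableMapping

    class StrKeyDict(MutableMapping):
        """Mapping that only accepts str keys; values are unrestricted."""

        def __init__(self):
            self._data = {}

        def __setitem__(self, key, value):
            if not isinstance(key, str):
                raise TypeError("keys must be str, got %r" % type(key))
            self._data[key] = value

        def __getitem__(self, key):
            return self._data[key]

        def __delitem__(self, key):
            del self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)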
Re: [Python-Dev] [Python-checkins] cpython: Close #19762: Fix name of _get_traces() and _get_object_traceback() function
Why are these functions (get_traces and get_object_traceback) private? (1) Is the whole module provisional? At one point, I had thought so, but I don't see that in the PEP or implementation. (I'm not sure that it should be provisional, but I want to be sure that the decision is intentional.) (2) This implementation does lock in certain choices about the nature of traces. (What data to include for analysis vs excluding to save memory; which events are tracked separately and which combined into a single total; organizing the data that is saved in a hash by certain keys; etc) While I would prefer more flexibility, the existing code provides a reasonable default, and I can't forsee changing traces so much that these functions *can't* be reasonably supported unless the rest of the module API changes too. (3) get_object_traceback is the killer app that justifies the specific data-collection choices Victor made; if it isn't public, the implementation starts to look overbuilt. (4) get_traces is about the only way to get at even the all the data that *is* stored, prior to additional summarization. If it isn't public, those default summarization options become even more locked in.. -jJ On Mon, Nov 25, 2013 at 3:34 AM, victor.stinner python-check...@python.orgwrote: http://hg.python.org/cpython/rev/2e2ec595dc58 changeset: 87551:2e2ec595dc58 user:Victor Stinner victor.stin...@gmail.com date:Mon Nov 25 09:33:18 2013 +0100 summary: Close #19762: Fix name of _get_traces() and _get_object_traceback() function name in their docstring. Patch written by Vajrasky Kok. files: Modules/_tracemalloc.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c --- a/Modules/_tracemalloc.c +++ b/Modules/_tracemalloc.c @@ -1018,7 +1018,7 @@ } PyDoc_STRVAR(tracemalloc_get_traces_doc, -get_traces() - list\n +_get_traces() - list\n \n Get traces of all memory blocks allocated by Python.\n Return a list of (size: int, traceback: tuple) tuples.\n @@ -1083,7 +1083,7 @@ } PyDoc_STRVAR(tracemalloc_get_object_traceback_doc, -get_object_traceback(obj)\n +_get_object_traceback(obj)\n \n Get the traceback where the Python object obj was allocated.\n Return a tuple of (filename: str, lineno: int) tuples.\n -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org https://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
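(For context on point (3): a minimal sketch of the usage pattern that makes get_object_traceback the killer app, written against the public names the module eventually exposed rather than the underscore-prefixed C-level functions discussed in this thread.)

    import tracemalloc

    tracemalloc.start(25)        # keep up to 25 frames per allocation traceback

    data = [object() for _ in range(1000)]   # an allocation we want to explain

    tb = tracemalloc.get_object_traceback(data)
    if tb is not None:           # None if the object was allocated before start()
        for line in tb.format():
            print(line)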
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
On Wed, Oct 30, 2013 at 6:02 AM, Victor Stinner victor.stin...@gmail.com wrote: 2013/10/30 Jim J. Jewett jimjjew...@gmail.com: Well, unless I missed it... I don't see how to get anything beyond the return value of get_traces, which is a (time-ordered?) list of allocation size with then-current call stack. It doesn't mention any attribute for indicating that some entries are de-allocations, let alone the actual address of each allocation. get_traces() does return the traces of the currently allocated memory blocks. It's not a log of alloc/dealloc calls. The list is not sorted. If you want a sorted list, use take_snapshot.statistics('lineno') for example. Any list is sorted somehow; I had assumed that it was defaulting to order-of-creation, though if you use a dict internally, that might not be the case. If you return it as a list instead of a dict, but that list is NOT in time-order, that is worth documenting Also, am I misreading the documentation of get_traces() function? Get traces of memory blocks allocated by Python. Return a list of (size: int, traceback: tuple) tuples. traceback is a tuple of (filename: str, lineno: int) tuples. So it now sounds like you don't bother to emit de-allocation events because you just remove the allocation from your internal data structure. In other words, you provide a snapshot, but not a history -- except that the snapshot isn't complete either, because it only shows things that appeared after a certain event (the most recent enablement). I still don't see anything here(*) that requires even saving the address, let alone preventing re-use. (*) get_object_traceback(obj) might require a stored address for efficiency, but the base functionality of getting traces doesn't. I still wouldn't worry about address re-use though, because the address should not be re-used until the object has been deleted -- and is no longer available to be passed to get_object_traceback. So the worst that can happen is that an object which was not traced might return a bogus answer instead of failing. In that case, I would expect disabling (and filtering) to stop capturing new allocation events for me, but I would still expect tracemalloc to do proper internal maintenance. tracemalloc has an important overhead in term of performances and memory. The purpose of disable() is to... disable the module, to remove completely the overhead. ... Why would you like to keep traces and disable the module? Because of that very overhead. I think my use typical use case would be similar to Kristján Valur's, but I'll try to spell it out in more detail here. (1) Whoa -- memory hog! How can I fix this? (2) I know -- track all allocations, with a traceback showing why they were made. (At a minimum, I would like to be able to subclass your tool to do this -- preferably without also keeping the full history in memory.) (3) Oh, maybe I should skip the ones that really are temporary and get cleaned up. (You make this easy by handling the de-allocs, though I'm not sure those events get exposed to anyone working at the python level, as opposed to modifying and re-compiling.) (4) hmm... still too big ... I should use filters. (But will changing those filters while tracing is enabled mess up your current implementation?) (5) Argh. What I really want is to know what gets allocated at times like XXX. I can do that if times-like-XXX only ever occur once per process. I *might* be able to do it with filters. But I would rather do it by saying trace on and trace off. 
Maybe even with a context manager around the suspicious places. (6) Then, at the end of the run, I would say give me the info about how much was allocated when tracing was on. Some of that might be going away again when tracing is off, but at least I know what is making the allocations in the first place. And I know that they're sticking around long enough. Under your current proposal, step (5) turns into set filters trace on ... get_traces serialize to some other storage trace off and step (6) turns into read in from that other storage I just made up on the fly, and do my own summarizing, because my format is almost by definition non-standard. This complication isn't intolerable, but neither is it what I expect from python. And it certainly isn't what I expect from a binary toggle like enable/disable. (So yes, changing the name to clear_traces would help, because I would still be disappointed, but at least I wouldn't be surprised.) Also, if you do stick with the current limitations, then why even have get_traces, as opposed to just take_snapshot? Is there some difference between them, except that a snapshot has some convenience methods and some simple metadata? Later, he wrote: I don't see why disable() would return data. disable is indeed a bad name for something that returns data. The only reason to return data from
[Python-Dev] PEP 454 (tracemalloc) disable == clear?
> reset() function: Clear traces of memory blocks allocated by Python.

Does this do anything besides clear? If not, why not just re-use the 'clear' name from dicts?

> disable() function: Stop tracing Python memory allocations and clear traces of memory blocks allocated by Python.

I would expect disable() to stop tracing, but I would not expect it to clear out the traces it had already captured. If it has to do that, please put in some sample code showing how to save the current traces before disabling.

-jJ
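(A minimal sketch of the requested sample code, using the names the module ended up with -- start()/stop() rather than the PEP draft's enable()/disable() -- and taking a snapshot before stopping so the captured traces survive; the filename is made up.)

    import tracemalloc

    tracemalloc.start()
    # ... run the code being investigated ...

    # Save what has been captured *before* disabling, since stopping
    # tracing also clears the internal traces.
    snapshot = tracemalloc.take_snapshot()
    snapshot.dump("traces.bin")
    tracemalloc.stop()

    # Later, possibly in a different process:
    old = tracemalloc.Snapshot.load("traces.bin")
    for stat in old.statistics("lineno")[:10]:
        print(stat)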
Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
On 11/20/12, Daniel Holth dho...@gmail.com wrote: On Tue, Nov 20, 2012 at 3:58 PM, Jim J. Jewett jimjjew...@gmail.com wrote: Vinay Sajip reworded the 'Provides-Dist' definition to explicitly say: The use of multiple names in this field *must not* be used for bundling distributions together. It is intended for use when projects are forked and merged over time ... (1) Then how *should* the bundle-of-several-components case be represented? The useful way to bundle a bunch of things would be to just include them all in an executable folder or zipfile with __main__.py. PEP 426 and the package database would not get involved. The bundle would be distributed as an application you can download and use, not as an sdist on PyPI. When I look at, for example, twisted, there are some fairly fine distinctions. I can imagine some people wanting to handle each little piece differently, since that is the level at which they would be replaced by a more efficient implementation. That doesn't mean that someone using the default should have to manage 47 separate little packages individually. Also note that ZODB is mentioned as a bundling example in the current (2012-11-14) PEP. What does the PEP recommend that they do? Stop including transaction? Keep including it but stop 'Provides-Dist'-ing it? The current PEP also specifies that This field must include the project identified in the Name field, followed by the version : Name (Version). but the examples do not always include version. Why is the MUST there? Is there some way to distinguish between concrete and abstract provisions? For example, if MyMail (2012.11.10) includes 'Provides-Dist: email', does that really get parsed as 'Provides-Dist: email (2012.11.10)'? (2) How is 'Provides-Dist' different from 'Obsoletes-Dist'? The only difference I can see is that it may be a bit more polite to people who do want to install multiple versions of a (possibly abstract) package. The intent of Provides and Obsoletes is different. Obsoletes would not satisfy a requirement during dependency resolution. The RPM guide explains a similar system: As best I can understand, Obsoletes means Go ahead and uninstall that other package. Saying that *without* providing the same functionality seems like a sneaky spelling of Please break whatever relies on that other package. I'm willing to believe that there is a more useful meaning. I'm also willing to believe that they are logically redundant but express different intentions. The current wording doesn't tell me which is true. (Admittedly, that is arguably an upstream bug with other package systems, but you should still either fix it or explicitly delegate the definitions.) And as long as I'm asking for clarification, can foopkg-3.4 obsolete foopgk3.2? If not, is it a semantics problem, or just not idiomatic? If so, does it have a precise meaning, such as no longer interoperates with? And now that I've looked more carefully ... Can a Key: Value pair be continued onto another line? The syntax description under Metadata Files does not say so, but later text suggests that either leading whitespace or a leading tab specifically (from the example code) will work. (And is description a special case?) Is the payload assumed to be utf8 text? Can it be itself a mime message? Are there any restrictions on 'Name'? e.g., Can the name include spaces? line breaks? Must it be a valid python identifier? A valid python qualname? 'Version' says that it must be in the format specified in PEP 386. Unfortunately, it doesn't say which part of 386. 
Do you mean that it must be acceptable to verlib.NormalizedVersion without first having to call suggest_normalized_version? 'Summary' specifies that it must be one line. Is there a character limit, or do you just mean no line breaks? Do you want to add a Should be less than 80 characters or some such, based on typical tool presentation? Would it be worth repeating the advice that longer descriptions should go in the payload, after all headers? (Otherwise, they have to find 'Description' *and* notice that it is deprecated and figure out what to do instead.) Under 'Description', it isn't entirely clear whether what terminates the field. Multiple paragraphs suggests that there can be multiple lines, but I'm guessing that -- in practice -- they have to be a single logical line, with all but the first starting with whitespace. Under 'Classifier', is PEP 301 really the current authority for classifiers? I would prefer at least a reference to http://pypi.python.org/pypi?%3Aaction=list_classifiers demonstrating which classifiers are currently meaningful. Under 'Requires-Dist', there is an unclosed parenthesis. Does the 'Setup-Requires-Dist' set implicitly include the 'Requires-Dist' set, or should a package be listed both ways if it is required at both setup and runtime? The Summary of Differences from PEP 345 mentions changes to Requires-Dist, but I don't
Re: [Python-Dev] [Python-checkins] cpython: Close #15387: inspect.getmodulename() now uses a new
Why is inspect.getmoduleinfo() deprecated? Is it just to remove circular dependencies? FWIW, I much prefer an API like: tell_me_about(object) to one like: for test_data in (X, Y, Z): usable = tester(object, test_data) if valid(usable): return possible_results[test_data] and to me, inspect.getmoduleinfo(path) looks like the first, while checking the various import.machinery.*SUFFIXES looks like the second. -jJ On 7/18/12, nick.coghlan python-check...@python.org wrote: http://hg.python.org/cpython/rev/af7961e1c362 changeset: 78161:af7961e1c362 user:Nick Coghlan ncogh...@gmail.com date:Wed Jul 18 23:14:57 2012 +1000 summary: Close #15387: inspect.getmodulename() now uses a new importlib.machinery.all_suffixes() API rather than the deprecated inspect.getmoduleinfo() files: Doc/library/importlib.rst | 13 - Doc/library/inspect.rst| 15 --- Lib/importlib/machinery.py | 4 Lib/inspect.py | 11 +-- Misc/NEWS | 3 +++ 5 files changed, 40 insertions(+), 6 deletions(-) diff --git a/Doc/library/importlib.rst b/Doc/library/importlib.rst --- a/Doc/library/importlib.rst +++ b/Doc/library/importlib.rst @@ -533,12 +533,23 @@ .. attribute:: EXTENSION_SUFFIXES - A list of strings representing the the recognized file suffixes for + A list of strings representing the recognized file suffixes for extension modules. .. versionadded:: 3.3 +.. func:: all_suffixes() + + Returns a combined list of strings representing all file suffixes for + Python modules recognized by the standard import machinery. This is a + helper for code which simply needs to know if a filesystem path + potentially represents a Python module (for example, + :func:`inspect.getmodulename`) + + .. versionadded:: 3.3 + + .. class:: BuiltinImporter An :term:`importer` for built-in modules. All known built-in modules are diff --git a/Doc/library/inspect.rst b/Doc/library/inspect.rst --- a/Doc/library/inspect.rst +++ b/Doc/library/inspect.rst @@ -198,9 +198,18 @@ .. function:: getmodulename(path) Return the name of the module named by the file *path*, without including the - names of enclosing packages. This uses the same algorithm as the interpreter - uses when searching for modules. If the name cannot be matched according to the - interpreter's rules, ``None`` is returned. + names of enclosing packages. The file extension is checked against all of + the entries in :func:`importlib.machinery.all_suffixes`. If it matches, + the final path component is returned with the extension removed. + Otherwise, ``None`` is returned. + + Note that this function *only* returns a meaningful name for actual + Python modules - paths that potentially refer to Python packages will + still return ``None``. + + .. versionchanged:: 3.3 + This function is now based directly on :mod:`importlib` rather than the + deprecated :func:`getmoduleinfo`. .. function:: ismodule(object) diff --git a/Lib/importlib/machinery.py b/Lib/importlib/machinery.py --- a/Lib/importlib/machinery.py +++ b/Lib/importlib/machinery.py @@ -13,3 +13,7 @@ from ._bootstrap import ExtensionFileLoader EXTENSION_SUFFIXES = _imp.extension_suffixes() + +def all_suffixes(): +Returns a list of all recognized module suffixes for this process +return SOURCE_SUFFIXES + BYTECODE_SUFFIXES + EXTENSION_SUFFIXES diff --git a/Lib/inspect.py b/Lib/inspect.py --- a/Lib/inspect.py +++ b/Lib/inspect.py @@ -450,8 +450,15 @@ def getmodulename(path): Return the module name for a given file, or None. 
-info = getmoduleinfo(path) -if info: return info[0] +fname = os.path.basename(path) +# Check for paths that look like an actual module file +suffixes = [(-len(suffix), suffix) +for suffix in importlib.machinery.all_suffixes()] +suffixes.sort() # try longest suffixes first, in case they overlap +for neglen, suffix in suffixes: +if fname.endswith(suffix): +return fname[:neglen] +return None def getsourcefile(object): Return the filename that can be used to locate an object's source. diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -41,6 +41,9 @@ Library --- +- Issue #15397: inspect.getmodulename() is now based directly on importlib + via a new importlib.machinery.all_suffixes() API. + - Issue #14635: telnetlib will use poll() rather than select() when possible to avoid failing due to the select() file descriptor limit. -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe:
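(A short illustration of the APIs discussed in this thread: importlib.machinery.all_suffixes() and inspect.getmodulename() are the real 3.3 interfaces; the example paths are made up.)

    import inspect
    import importlib.machinery

    print(importlib.machinery.all_suffixes())
    # e.g. ['.py', '.pyc', '.so', ...] depending on the platform and build

    print(inspect.getmodulename("/tmp/project/widget.py"))   # 'widget'
    print(inspect.getmodulename("/tmp/project/notes.txt"))    # None (not a module suffix)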
[Python-Dev] PEP 362 minor nits
I've limited this to minor issues, but kept python-dev in the loop because some are questions, rather than merely editorial. Based on: http://hg.python.org/peps/file/tip/pep-0362.txt view pep-0362.txt @ 4466:659639095ace Committing the latest changes to PEP 362 on behalf of Yury Selivanov. author Larry Hastings la...@hastings.org dateTue, 19 Jun 2012 02:38:15 -0700 (3 hours ago) parents c1f693b39292 == 44 * return_annotation : object 45 The annotation for the return type of the function if specified. 46 If the function has no annotation for its return type, this 47 attribute is not set. I don't think you need the if specified, given the next line. Similar comments around line 89 (Parameter.default) and 93 (Parameter.annotation). 48 * parameters : OrderedDict 49 An ordered mapping of parameters' names to the corresponding 50 Parameter objects (keyword-only arguments are in the same order 51 as listed in ``code.co_varnames``). Are you really sure you want to promise the keyword-only order in the PEP? [BoundArguments] 139 * arguments : OrderedDict 140 An ordered, mutable mapping of parameters' names to arguments' values. 141 Does not contain arguments' default values. I think 141 should be reworded, but I'm not certain my wording doesn't have similar problems, so I merely offer it: arguments contains only explicitly bound parameters; parameters for which the binding relied on a default value do not appear in arguments. 142 * args : tuple 143 Tuple of positional arguments values. Dynamically computed from 144 the 'arguments' attribute. 145 * kwargs : dict 146 Dict of keyword arguments values. Dynamically computed from 147 the 'arguments' attribute. Do you want to specify which will contain the normal parameters, that could be called either way? My naive assumption would be that as much as possible gets shoved into args, but once a positional parameter is left to default, remaining parameters are stuck in kwargs. 172 - If the object is not callable - raise a TypeError 173 174 - If the object has a ``__signature__`` attribute and if it 175 is not ``None`` - return a shallow copy of it Should these two be reversed? 183 - If the object is a method or a classmethod, construct and return 184 a new ``Signature`` object, with its first parameter (usually 185 ``self`` or ``cls``) removed 187 - If the object is a staticmethod, construct and return 188 a new ``Signature`` object I would reverse these two, to make it clear that a staticmethod is not treated as a method. 194 - If the object is a class or metaclass: 195 196 - If the object's type has a ``__call__`` method defined in 197 its MRO, return a Signature for it 198 199 - If the object has a ``__new__`` method defined in its class, 200 return a Signature object for it 201 202 - If the object has a ``__init__`` method defined in its class, 203 return a Signature object for it What happens if it inherits a __new__ or __init__ from something more derived than object? 207 Note, that I would remove the comma. 235 Some functions may not be introspectable 236 237 238 Some functions may not be introspectable in certain implementations of 239 Python. For example, in CPython, builtin functions defined in C provide 240 no metadata about their arguments. Adding support for them is out of 241 scope for this PEP. Ideally, it would at least be possible to manually construct a signature, and register them in some central location. (Similar to what is done with pickle or copy.) Checking that location would then have to be an early step in the signature algorithm. 
-jJ
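(A sketch of the "manually construct a signature" idea from the note above, using the Signature and Parameter constructors that PEP 362 itself defines; attaching it via __signature__ is how the lookup algorithm would find it, though builtins currently refuse new attributes, and a central registry remained out of scope.)

    from inspect import Signature, Parameter

    # Hand-written signature for a C-level callable such as len(obj),
    # which provides no argument metadata of its own.
    len_sig = Signature([Parameter("obj", Parameter.POSITIONAL_ONLY)])

    # len.__signature__ = len_sig   # would be picked up by signature(len),
    #                               # but builtins reject new attributes
    print(len_sig)                  # e.g. "(obj, /)" on current versions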
Re: [Python-Dev] PEP 362 minor nits
On Tue, Jun 19, 2012 at 11:53 AM, Yury Selivanov yselivanov...@gmail.com wrote: Based on: http://hg.python.org/peps/file/tip/pep-0362.txt view pep-0362.txt @ 4466:659639095ace == 142 * args : tuple 143 Tuple of positional arguments values. Dynamically computed from 144 the 'arguments' attribute. 145 * kwargs : dict 146 Dict of keyword arguments values. Dynamically computed from 147 the 'arguments' attribute. Do you want to specify which will contain the normal parameters, that could be called either way? My naive assumption would be that as much as possible gets shoved into args, but once a positional parameter is left to default, remaining parameters are stuck in kwargs. Correct, we push as much as possible to 'args'. Only var_keyword and keyword_only args go to 'kwargs'. But the words positional and keyword more refer to what particularly *args and **kwargs do, disconnected from the Signature's parameters. Which is why there is some ambiguity, and I wondered if you were intentionally leaving it open or not. def f(a): pass s=signature(f) ba1=s.bind(1) Now which of the following are true? # Ambiguous parameters to args ba.args==(1,) and ba.kwargs=={} # or ambiguous parameters to kwargs ba.args=() and ba.kwargs={a:1} Does it matter how the argument was bound? As in, would ba2=s.bind(a=2) produce a different answer? If as much as possible goes to args, then: def g(a=1, b=2, c=3): pass s=signature(g) ba=s.bind(a=10, c=13) would imply ba.args == (10,) and ba.kwargs={c:13} True because a can be written positionally, but c can't unless b is, and b shouldn't be because it relied on the default value. 172 - If the object is not callable - raise a TypeError 173 174 - If the object has a ``__signature__`` attribute and if it 175 is not ``None`` - return a shallow copy of it Should these two be reversed? Do you have a use-case? Not really; the only cases that come to mind are cases where it makes sense to look at an explicit signature attribute, instead of calling the factory. 183 - If the object is a method or a classmethod, construct and return 184 a new ``Signature`` object, with its first parameter (usually 185 ``self`` or ``cls``) removed 187 - If the object is a staticmethod, construct and return 188 a new ``Signature`` object I would reverse these two, to make it clear that a staticmethod is not treated as a method. It's actually not how it's implemented. ... But that's an implementation detail, the algorithm in the PEP just shows the big picture (is it OK?). Right; implementing it in the other order is fine, so long as the actual tests for methods exclude staticmethods. But for someone trying to understand it, staticmethods sound like a kind of method, and I would expect them to be included in something that handles methods, unless they were already excluded by a prior clause. 194 - If the object is a class or metaclass: 195 196 - If the object's type has a ``__call__`` method defined in 197 its MRO, return a Signature for it 198 199 - If the object has a ``__new__`` method defined in its class, 200 return a Signature object for it 201 202 - If the object has a ``__init__`` method defined in its class, 203 return a Signature object for it What happens if it inherits a __new__ or __init__ from something more derived than object? What do you mean by more derived than object? class A: def __init__(self): pass class B(A): ... Because of the distinction between in its MRO and in its class, it looks like the signature of A is based on its __init__, but the signature of subclass B is not. 
-jJ
Re: [Python-Dev] PEP 362 minor nits
On Tue, Jun 19, 2012 at 2:10 PM, Yury Selivanov yselivanov...@gmail.com wrote: On 2012-06-19, at 12:33 PM, Jim Jewett wrote: On Tue, Jun 19, 2012 at 11:53 AM, Yury Selivanov yselivanov...@gmail.com wrote: Based on: http://hg.python.org/peps/file/tip/pep-0362.txt view pep-0362.txt @ 4466:659639095ace == 142 * args : tuple 143 Tuple of positional arguments values. Dynamically computed from 144 the 'arguments' attribute. 145 * kwargs : dict 146 Dict of keyword arguments values. Dynamically computed from 147 the 'arguments' attribute. Correct, we push as much as possible to 'args'. [examples to clarify] OK, I would just add a sentence and commented example then, something like. Arguments which could be passed as part of either *args or **kwargs will be included only in the args attribute. In the following example: def g(a=1, b=2, c=3): pass s=signature(g) ba=s.bind(a=10, c=13) ba.args (10,) ba.kwargs {'c': 13} Parameter a is part of args, because it can be. Parameter c must be passed as a keyword, because (earlier) parameter b is not being passed an explicit value. I can tweak the PEP to make it more clear for those who don't know that staticmethods are not exactly methods, but do we really need that? I would prefer it, if only because it surprised me. When do distinguish between methods, staticmethod isn't usually the odd man out. And I also agree that the implementation doesn't need to change (except to add a comment), only the PEP. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 362: 4th edition
On Sat, Jun 16, 2012 at 11:27 AM, Nick Coghlan ncogh...@gmail.com wrote: On Sat, Jun 16, 2012 at 1:56 PM, Jim J. Jewett jimjjew...@gmail.com wrote: *Every* Parameter attribute is optional, even name. (Think of builtins, even if they aren't automatically supported yet.) So go ahead and define some others that are sometimes useful. Add only stuff we know is interesting and useful. Agreed, but it doesn't have to be useful in all cases, or even available on all Signatures; if users are already prepared for missing data, it is enough that the attribute be well-defined, and be useful when it does appear. That said, it looks like is_implemented isn't sufficiently well-defined. - kind - name (should be given meaningful content, even for POSITIONAL_ONLY parameters) I agree that it *should* be given meaningful content, but I don't think the Parameter (or Signature) should be blocked without it. I also don't think that a documentation-only name that cannot be used for keyword calls should participate in equality. The existence of the parameter should participate, and its annotation is more important than usual, but its name is not. - default (may be missing, since None is allowed as a default value) - annotation (may be missing, since None is allowed as an annotation) Position is also important, but I'm not certain whether it should be represented in the Parameter, or only in the Signature. copy(source, target) copy(target, source) have different signatures, but I'm not sure whether it would be appropriate to reuse the same parameter objects. Instead of defining a BoundArguments class, just return a copy of the Signature, with value attributes added to the Parameters. No, the BoundArguments class is designed to be easy to feed to a function call as f(*args, **kwds) Why does that take a full class, as opposed to a method returning a tuple and a dict? Use subclasses to distinguish the parameter kind. Please, no, using subclasses when there is no behavioural change is annoying. A **kwargs argument is very different from an ordinary parameter. Its name doesn't matter (and therefore should not be considered in __eq__), it can only appear once per signature, and the possible location of its appearance is different. It is formatted differently (which I would prefer to do in the Parameter, rather than in Signature). It also holds very different data, and must be treated specially by several Signature methods, particularly when either validating or binding. (It is bound to a Mapping, rather than to a single value, so you have to keep it around longer and use a different bind method.) A Signature object has the following public attributes and methods: The more I try to work with it, the more I want direct references to the two special arguments (*args, **kwargs) if they exist. FWIW, the current bind logic to find them -- particularly kwargs -- seems contorted, compared to self.kwargsparameter. (3rd edition) * is_keyword_only : bool ... * is_args : bool ... * is_kwargs : bool ... (4th edition) ... Parameter.POSITIONAL_ONLY ... ... Parameter.POSITIONAL_OR_KEYWORD ... ... Parameter.KEYWORD_ONLY ... ... Parameter.VAR_POSITIONAL ... ... Parameter.VAR_KEYWORD ... This set has already grown, and I can think of others I would like to use. (Pseudo-parameters, such as a method's self instance, or an auxiliary variable.) No. This is the full set of binding behaviours. self is just an ordinary POSITIONAL_OR_KEYWORD argument (or POSITIONAL_ONLY, in some builtin cases). 
Or no longer a parameter at all, once the method is bound. Except it sort of still is. Same for the space parameter in PyPy. I don't expect the stdlib implementation to support them initially, but I don't want it to get in the way, either. A supposedly closed set gets in the way. I'm not sure if positional parameters should also check position, or if that can be left to the Signature. Positional parameters don't know their relative position, so it *has* to be left to the signature. But perhaps they *should* know their relative position. Also, positional_only, *args, and **kwargs should be able to remove name from the list of compared attributes. -jJ
Re: [Python-Dev] PEP 362: 4th edition
On Mon, Jun 18, 2012 at 10:37 AM, Yury Selivanov yselivanov...@gmail.com wrote: Jim, On 2012-06-18, at 3:08 AM, Jim Jewett wrote: On Sat, Jun 16, 2012 at 11:27 AM, Nick Coghlan ncogh...@gmail.com wrote: On Sat, Jun 16, 2012 at 1:56 PM, Jim J. Jewett jimjjew...@gmail.com wrote: Instead of defining a BoundArguments class, just return a copy of the Signature, with value attributes added to the Parameters. No, the BoundArguments class is designed to be easy to feed to a function call as f(*args, **kwds) Why does that take a full class, as opposed to a method returning a tuple and a dict? Read this thread, please: http://mail.python.org/pipermail/python-dev/2012-June/12.html I reread that. I still don't see why it needs to be an instance of a specific independent class, as opposed to a Signature method that returns a (tuple of) a tuple and a dict. ((arg1, arg2, arg3...), {key1: val2, key2: val2}) Use subclasses to distinguish the parameter kind. Please, no, using subclasses when there is no behavioural change is annoying. [Examples of how the kinds of parameters are qualitatively different.] A **kwargs argument is very different from an ordinary parameter. Its name doesn't matter (and therefore should not be considered in __eq__), The importance of its name depends hugely on the use context. In some it may be very important. The name of kwargs can only be for documentation purposes. Like an annotation or a docstring, it won't affect the success of an attempted call. Annotations are kept because (often) their entire purpose is to document the signature. But docstrings are being dropped, because they often serve other purposes. I've had far more use for docstrings than for the names of positional-only parameters. (In fact, knowing the name of a positional-only parameter has sometimes been an attractive nuisance.) And it is treated specially, along with the *args. Right -- but this was in response to Nick's claim that the distinctions should not be represented as a subclass, because the behavior wasn't different. I consider different __eq__ implementations or formatting concers to be sufficient on their own; I also consider different possible use locations and counts, different used-by-the-system attributes (name), or different value types (object vs collection) to be sufficiently behavioral. A Signature object has the following public attributes and methods: The more I try to work with it, the more I want direct references to the two special arguments (*args, **kwargs) if they exist. FWIW, the current bind logic to find them -- particularly kwargs -- seems contorted, compared to self.kwargsparameter. Well, 'self.kwargsparameter' will break 'self.parameters' collection, unless you want one parameter to be in two places. Correct; it should be redundant. Signature.kwargsparameter should be the same object that occurs as the nth element of Signature.parameters.values(). It is just more convenient to retrieve the parameter directly than it is to iterate through a collection inspecting each element for the value of a specific attribute. In fact, the check types example (in the PEP) is currently shorter and easier to read with 'Signature.parameters' than with dedicated property for '**kwargs' parameter. Agreed; the short-cuts *args and **kwargs are only useful because they are special; they aren't needed when you're doing the same thing to all parameters regardless of type. 
> And if after all you need direct references to *args or **kwargs - write a little helper, which finds them in 'Signature.parameters'.

Looking at http://bugs.python.org/review/15008/diff/5143/Lib/inspect.py you already need one in _bind; it is just that saving the info when you pass it isn't too bad if you're already iterating through the whole collection anyhow.

>> Also, positional_only, *args, and **kwargs should be able to remove name from the list of compared attributes.

> I still believe that in most contexts the name of a parameter matters (even if it's **kwargs). Besides, how can we make __eq__ configurable?

__eq__ could consult an _eq_fields attribute to see which other attributes matter -- but it makes more sense for that to be a (sub-)class property.

-jJ
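(A sketch of that "little helper", using the Parameter kinds from the 4th edition of the PEP; the helper name is invented.)

    from inspect import signature, Parameter

    def var_parameters(func):
        """Return the *args and **kwargs Parameters, or None where absent."""
        var_pos = var_kw = None
        for param in signature(func).parameters.values():
            if param.kind is Parameter.VAR_POSITIONAL:
                var_pos = param
            elif param.kind is Parameter.VAR_KEYWORD:
                var_kw = param
        return var_pos, var_kw

    def f(a, *args, **kwargs):
        pass

    print(var_parameters(f))   # the 'args' and 'kwargs' Parameter objects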
Re: [Python-Dev] (time) PEP 418 glossary V2
On Tue, Apr 24, 2012 at 6:38 AM, Victor Stinner victor.stin...@gmail.com wrote: Monotonic - This is a particularly tricky term, as there are several subtly incompatible definitions in use. Is it a definition for the glossary? One use case for a PEP is that someone who does *not* have a background in the area wants to start learning about it. Even excluding the general service of education, these people can be valuable contributors, because they have a fresh perspective. They will almost certainly waste some time retracing dead ends, but I would prefer it be out of a need to prove things to themselves, instead of just because they misunderstood. Given the amount of noise we already went through arguing over what Monotonic should mean, I think we have an obligation to provide these people with a heads-up, even if we don't end up using the term ourselves. And I think we *will* use the terms ourselves, if only as some of the raw os_clock_* choices. C++ followed the mathematical definition ... a monotonic clock only promises not to go backwards. ... additional guarantees, some ... required by the POSIX Confession: I based the above statements strictly on posts to python-dev, from people who seemed to have some experience caring about clock details. I did not find the relevant portions of either specification.[1] Every time I started to search, I got pulled back to other tasks, and the update was just delayed even longer. I still felt it was worth consolidating the state of the discussion. Anyone who feels confident in this domain is welcome to correct me, and encouraged to send replacement text. [1] Can I assume that Victor's links here are the relevant ones, or is someone aware of additional/more complete references for these specifications? http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3128.html#time.clock.monotonic http://pubs.opengroup.org/onlinepubs/95399/basedefs/time.h.html The tradeoffs often include lack of a defined Epoch_ or mapping to `Civil Time`_, I don't know any monotonic with a defined epoch or mappable to the civil time. The very basic seconds (not even milliseconds) since the beginning of 1970 fits that definition, but doesn't seem to fit what most people mean by Monotonic Clock. I'm still a little fuzzy on *why* it shouldn't count as a monotonic clock. Is it technically valid, but a lousy implementation because of insufficient precision or resolution? Is it because the functions used in practice (on a modern OS) to retrieve timestamps don't guarantee to ignore changes to the system clock? and being more expensive (in `Latency`_, power usage, or duration spent within calls to the clock itself) to use. CLOCK_MONOTONIC and CLOCK_REALTIME have the same performances on Linux and FreeBSD. Why would a monotonic clock be more expensive? For example, the clock may represent (a constant multiplied by) ticks of a specific quartz timer on a specific CPU core, and calls would therefore require synchronization between cores. I don't think that synchronizing a counter between CPU cores is something expensive. See the following tables for details: http://www.python.org/dev/peps/pep-0418/#performance Synchronization is always relatively expensive. How expensive depends on a lot of things decides before python was installed. Looking at the first table there (Linux 3.3 with Intel Core i7-2600 at 3.40GHz (8 cores)), CLOCK_MONOTONIC can be hundreds of times slower than time(), and over 50 times slower than CLOCK_MONOTONIC_COARSE. 
I would assume that CLOCK_MONOTONIC_COARSE meets the technical requirements for a monotonic clock, but does less well at meeting the actual expectations for some combination of precision/stability/resolution.

> CLOCK_MONOTONIC and CLOCK_REALTIME use the same hardware clocksource and so have the same latency depending on the hardware.

Is this a rule of thumb, or a requirement of some standard? Does the fact that Windows, Mac OS X, and GNU/Hurd don't support CLOCK_MONOTONIC indicate that there is a (perhaps informal?) specification that none of their clocks meet, or does it only indicate that they didn't like the name?

-jJ
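(A rough way to reproduce the latency comparison being discussed, using time.get_clock_info() from 3.3; the absolute numbers depend entirely on the OS and the selected clocksource.)

    import time
    import timeit

    for name in ("time", "monotonic"):
        info = time.get_clock_info(name)
        per_call = timeit.timeit("time.%s()" % name, setup="import time",
                                 number=1000000) / 1e6
        print("%-9s resolution=%g adjustable=%s ~%.0f ns/call"
              % (name, info.resolution, info.adjustable, per_call * 1e9))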
Re: [Python-Dev] [Python-checkins] peps: Note that ImportError will no longer be raised due to a missing __init__.py
On Thu, Apr 19, 2012 at 18:56, eric.smith wrote:

> +Note that an ImportError will no longer be raised for a directory
> +lacking an ``__init__.py`` file. Such a directory will now be imported
> +as a namespace package, whereas in prior Python versions an
> +ImportError would be raised.

Given that there is no way to modify the __path__ of a namespace package (short of restarting python?), *should* it be an error if there is exactly one directory? Or is that just a case of "other tools out there didn't happen to install them"?

-jJ
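(A small way to observe the behaviour in question -- assuming a directory named 'pkg' is on sys.path and contains no __init__.py; the name is made up.)

    import importlib

    pkg = importlib.import_module("pkg")
    print(pkg.__path__)                     # a _NamespacePath, not a plain list
    print(getattr(pkg, "__file__", None))   # None: there is no __init__.py to point at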
[Python-Dev] (time) PEP 418 glossary V2
Glossary Absolute Time - A measurement of time since a specific Epoch_, typically far in the past. Civil Time is the most common example. Typically contrasted with a `Duration`_, as (now - epoch) is generally much larger than any duration that can be appropriately measured with the clock in question. Accuracy The amount of deviation of measurements by a given instrument from true values. See also the wikipedia article on `Accuracy and precision http://en.wikipedia.org/wiki/Accuracy_and_precision`_. Inaccuracy in clocks may be caused by lack of `Precision`_, by `Drift`_, or by an incorrect initial setting of the clock (e.g., timing of threads is inherently inaccurate because perfect synchronization in resetting counters is quite difficult). Adjusted Resetting a clock, presumably to the correct time. This may be done either with a `Step`_ or with `Slew`_. Adjusting a clock normally makes it more accurate with respect to the `Absolute Time`_. The cost is that any durations currently being measured will show a `Bias`_. (17 ticks is not the same Duration_ as 17 ticks plus an adjustment.) Bias Lack of accuracy that is systematically in one direction, as opposed to random errors. When a clock is `Adjusted`_, durations overlapping the adjustment will show a Bias. Civil Time -- Time of day; external to the system. 10:45:13am is a Civil time; A Duration_ like 45 seconds is not a Civil time. Provided by existing functions ``time.localtime()`` and ``time.gmtime()``, which are not changed by this PEP. Clock - An instrument for measuring time. Different clocks have different characteristics; for example, a clock with nanosecond Precision_ may start to Drift_ after a few minutes, while a less precise clock remained accurate for days. This PEP is primarily concerned with clocks which use a unit of seconds, rather than years, or arbitrary units such as a Tick_. Counter --- A clock which increments each time a certain event occurs. A counter is strictly monotonic in the mathematical sense, but does not meet the typical definitions of Monotonic_ when used of a computer clock. It can be used to generate a unique (and ordered) timestamp, but these timestamps cannot be mapped to `Civil Time`_; Tick_ creation may well be bursty, with several advances in the same millisecond followed by several days without any advance. CPU Time A measure of how much CPU effort has been spent on a certain task. CPU seconds are often normalized (so that a variable number can occur in the same actual second). CPU seconds can be important when profiling, but they do not map directly to user response time, nor are they directly comparable to (real time) seconds. Drift - The accumulated error against true time, as defined externally to the system. Drift may be due to imprecision, or to a difference between the average rate at which clock time advances and that of real time. Drift does not include intentional adjustments, but clocks providing `Absolute Time`_ will eventually have to be Adjusted_ to compensate for drift. Duration Elapsed time. The difference between the starting and ending times. Also called Relative Time. Normally contrasted with `Absolute Time`_. While a defined Epoch_ technically creates an implicit duration, this duration is normally too large to be of practical use. Computers can often supply a clock with better Precision_ or higher Resolution_ if they do not have to guarantee meaningful comparisons to any times not generated by the clock itself. Epoch - The reference point of a clock. 
For clocks providing `Civil Time`_, this is often midnight as the day (and year) rolled over to January 1, 1970. A Monotonic_ clock will typically have an undefined epoch (represented as None). Latency --- Delay. By the time a call to a clock function returns, `Real Time`_ has advanced, possibly by more than the precision of the clock. Monotonic - This is a particularly tricky term, as there are several subtly incompatible definitions in use. C++ followed the mathematical definition, so that a monotonic clock only promises not to go backwards. In practice, that is not sufficient to be useful, and no Operating System provides such a weak guarantee. Most discussions of a Monotonic *Clock* will also assume several additional guarantees, some of which are explicitly required by the POSIX specification. Within this PEP (and Python), the intended meaning is closer to the characteristics expected of a monotonic clock in practice. In addition to not moving backward, a Monotonic Clock should also be Steady_, and should be convertible to a unit of seconds. The tradeoffs often include lack of a defined Epoch_ or mapping to `Civil Time`_, and being more expensive (in `Latency`_, power usage, or duration spent within calls to the clock itself) to use. For example, the clock may represent (a constant multiplied
[Python-Dev] PEP 418 glossary
I believe PEP 418 (or at least the discussion) would benefit greatly from a glossary to encourage people to use the same definitions. This is arguably the Definitions section, but it should move either near the end or (preferably) ahead of the Functions. It also needs to be greatly expanded. Here is my strawman proposal, which does use slightly different definitions than the current PEP even for some terms that the PEP does define: Accuracy: Is the answer correct? Any clock will eventually drift; if a clock is intended to match Civil Time, it will need to be adjusted back to the true time. Adjusted: Resetting a clock to the correct time. This may be done either with a Step or by Slewing. Civil Time: Time of day; external to the system. 10:45:13am is a Civil time; 45 seconds is not. Provided by existing function time.localtime() and time.gmtime(). Not changed by this PEP. Clock: An instrument for measuring time. Different clocks have different characteristics; for example, a clock with nanonsecond precision may start to drift after a few minutes, while a less precise clock remained accurate for days. This PEP is primarily concerned with clocks which use a unit of seconds. Clock_Monotonic: The characteristics expected of a monotonic clock in practice. In addition to being monotonic, the clock should also be steady and have relatively high precision, and should be convertible to a unit of seconds. The tradeoffs often include lack of a defined epoch or mapping to Civil Time, and being more expensive (in latency, power usage, or duration spent within calls to the clock itself) to use. For example, the clock may represent (a constant multiplied by) ticks of a specific quartz timer on a specific CPU core, and calls would therefore require synchronization between cores. The original motivation for this PEP was to provide a cross-platform name for requesting a clock_monotonic clock. Counter: A clock which increments each time a certain event occurs. A counter is strictly monotonic, but not clock_monotonic. It can be used to generate a unique (and ordered) timestamp, but these timestamps cannot be mapped to civil time; tick creation may well be bursty, with several advances in the same millisecond followed by several days without any advance. CPU Time: A measure of how much CPU effort has been spent on a certain task. CPU seconds are often normalized (so that a variable number can occur in the same actual second). CPU seconds can be important when profiling, but they do not map directly to user response time, nor are they directly comparable to (real time) seconds. time.clock() is deprecated because it returns real time seconds on Windows, but CPU seconds on unix, which prevents a consistent cross-platform interpretation. Duration: Elapsed time. The difference between the starting and ending times. A defined epoch creates an implicit (and usually large) duration. More precision can generally be provided for a relatively small duration. Drift: The accumulated error against true time, as defined externally to the system. Epoch: The reference point of a clock. For clocks providing civil time, this is often midnight as the day (and year) rolled over to January 1, 1970. For a clock_monotonic clock, the epoch may be undefined (represented as None). Latency: Delay. By the time a clock call returns, the real time has advanced, possibly by more than the precision of the clock. Microsecond: 1/1,000,000 of a second. Fast enough for most -- but not all -- profiling uses. Millisecond: 1/1,000 of a second. 
More than adequate for most end-to-end UI measurements, but often too coarse for profiling individual functions. Monotonic: Moving in at most one direction; for clocks, that direction is forward. A (nearly useless) clock that always returns exactly the same time is technically monotonic. In practice, most uses of monotonic with respect to clocks actually refer to a stronger set of guarantees, as described under clock_monotonic Nanosecond 1/1,000,000,000 of a second. The smallest unit of resolution -- and smaller than the actual precision -- available in current mainstream operating systems. Precision: Significant Digits. What is the smallest duration that the clock can distinguish? This differs from resolution in that a difference greater than the minimum precision is actually meaningful. Process Time: Time elapsed since the process began. It is typically measured in CPU time rather than real time, and typically does not advance while the process is suspended. Real Time: Time in the real world. This differs from Civil time in that it is not adjusted, but they should otherwise advance in lockstep. It is not related to the real time of Real Time [Operating] Systems. It is sometimes called wall clock time to avoid that ambiguity; unfortunately, that introduces different ambiguities. Resolution:
[Python-Dev] Who are the decimal volunteers? Re: [Python-checkins] cpython: Resize the coefficient to MPD_MINALLOC also if the requested size is below
I remember that one of the concerns with cdecimal was whether it could be maintained by anyone except Stefan (and a few people who were already overcommitted). If anyone (including absolute newbies) wants to step up, now would be a good time to get involved. A few starter questions, whose answer it would be good to document: Why is there any need for MPD_MINALLOC at all for (immutable) numbers? I suspect that will involve fleshing out some of the memory management issues around dynamic decimals, as touched on here: http://www.bytereef.org/mpdecimal/doc/libmpdec/memory.html#static-and-dynamic-decimals On Mon, Apr 9, 2012 at 3:33 PM, stefan.krah python-check...@python.org wrote: http://hg.python.org/cpython/rev/170bdc5c798b changeset: 76197:170bdc5c798b parent: 76184:02ecb8261cd8 user: Stefan Krah sk...@bytereef.org date: Mon Apr 09 20:47:57 2012 +0200 summary: Resize the coefficient to MPD_MINALLOC also if the requested size is below MPD_MINALLOC. Previously the resize was skipped as a micro optimization. files: Modules/_decimal/libmpdec/mpdecimal.c | 36 -- 1 files changed, 20 insertions(+), 16 deletions(-) diff --git a/Modules/_decimal/libmpdec/mpdecimal.c b/Modules/_decimal/libmpdec/mpdecimal.c --- a/Modules/_decimal/libmpdec/mpdecimal.c +++ b/Modules/_decimal/libmpdec/mpdecimal.c @@ -480,17 +480,20 @@ { assert(!mpd_isconst_data(result)); /* illegal operation for a const */ assert(!mpd_isshared_data(result)); /* illegal operation for a shared */ - + assert(MPD_MINALLOC = result-alloc); + + nwords = (nwords = MPD_MINALLOC) ? MPD_MINALLOC : nwords; + if (nwords == result-alloc) { + return 1; + } if (mpd_isstatic_data(result)) { if (nwords result-alloc) { return mpd_switch_to_dyn(result, nwords, status); } - } - else if (nwords != result-alloc nwords = MPD_MINALLOC) { - return mpd_realloc_dyn(result, nwords, status); - } - - return 1; + return 1; + } + + return mpd_realloc_dyn(result, nwords, status); } /* Same as mpd_qresize, but the complete coefficient (including the old @@ -500,20 +503,21 @@ { assert(!mpd_isconst_data(result)); /* illegal operation for a const */ assert(!mpd_isshared_data(result)); /* illegal operation for a shared */ - - if (mpd_isstatic_data(result)) { - if (nwords result-alloc) { - return mpd_switch_to_dyn_zero(result, nwords, status); - } - } - else if (nwords != result-alloc nwords = MPD_MINALLOC) { - if (!mpd_realloc_dyn(result, nwords, status)) { + assert(MPD_MINALLOC = result-alloc); + + nwords = (nwords = MPD_MINALLOC) ? MPD_MINALLOC : nwords; + if (nwords != result-alloc) { + if (mpd_isstatic_data(result)) { + if (nwords result-alloc) { + return mpd_switch_to_dyn_zero(result, nwords, status); + } + } + else if (!mpd_realloc_dyn(result, nwords, status)) { return 0; } } mpd_uint_zero(result-data, nwords); - return 1; } -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython (3.2): attempt to fix asyncore buildbot failure
What does this verify? My assumption from the name (test_quick_connect) and the context (an asynchronous server) is that it is verifying the server can handle a certain level of load. Refusing the sockets should then be a failure, or at least a skipped test. Would the below fail even if asyncore.loop were taken out of the threading.Thread target altogether? On Fri, Mar 23, 2012 at 10:10 AM, giampaolo.rodola python-check...@python.org wrote: http://hg.python.org/cpython/rev/2db4e916245a changeset: 75901:2db4e916245a branch: 3.2 parent: 75897:b97964af7299 user: Giampaolo Rodola' g.rod...@gmail.com date: Fri Mar 23 15:07:07 2012 +0100 summary: attempt to fix asyncore buildbot failure files: Lib/test/test_asyncore.py | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Lib/test/test_asyncore.py b/Lib/test/test_asyncore.py --- a/Lib/test/test_asyncore.py +++ b/Lib/test/test_asyncore.py @@ -741,11 +741,15 @@ for x in range(20): s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + s.settimeout(.2) s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0)) - s.connect(server.address) - s.close() - + try: + s.connect(server.address) + except socket.error: + pass + finally: + s.close() class TestAPI_UseSelect(BaseTestAPI): use_poll = False -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
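If the intent really is a load test, one way to make the failure mode explicit instead of silently passing -- a sketch only, reusing names from the quoted test and assuming that skipping is the desired outcome when the server cannot keep up:

    import socket
    import struct
    import unittest

    def quick_connect(address, attempts=20):
        # Open and immediately close many connections; if the server
        # refuses one, surface that as a skip instead of swallowing it.
        for _ in range(attempts):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(0.2)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                         struct.pack('ii', 1, 0))
            try:
                s.connect(address)
            except socket.error as exc:
                raise unittest.SkipTest("server refused a connection: %s" % exc)
            finally:
                s.close()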
Re: [Python-Dev] [Python-checkins] cpython (2.7): Fixes Issue 14234: fix for the previous commit, keep compilation when
Does this mean that if Python is updated before expat, python will compile out the expat randomization, and therefore not use if even after expat is updated? -jJ On Thu, Mar 15, 2012 at 2:01 PM, benjamin.peterson python-check...@python.org wrote: http://hg.python.org/cpython/rev/ada6bfbeceb8 changeset: 75699:ada6bfbeceb8 branch: 2.7 user: Gregory P. Smith g...@krypto.org date: Wed Mar 14 18:12:23 2012 -0700 summary: Fixes Issue 14234: fix for the previous commit, keep compilation when using --with-system-expat working when the system expat does not have salted hash support. files: Modules/expat/expat.h | 2 ++ Modules/pyexpat.c | 5 + 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/Modules/expat/expat.h b/Modules/expat/expat.h --- a/Modules/expat/expat.h +++ b/Modules/expat/expat.h @@ -892,6 +892,8 @@ XML_SetHashSalt(XML_Parser parser, unsigned long hash_salt); +#define XML_HAS_SET_HASH_SALT /* Python Only: Defined for pyexpat.c. */ + /* If XML_Parse or XML_ParseBuffer have returned XML_STATUS_ERROR, then XML_GetErrorCode returns information about the error. */ diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -1302,8 +1302,13 @@ else { self-itself = XML_ParserCreate(encoding); } +#if ((XML_MAJOR_VERSION = 2) (XML_MINOR_VERSION = 1)) || defined(XML_HAS_SET_HASH_SALT) + /* This feature was added upstream in libexpat 2.1.0. Our expat copy + * has a backport of this feature where we also define XML_HAS_SET_HASH_SALT + * to indicate that we can still use it. */ XML_SetHashSalt(self-itself, (unsigned long)_Py_HashSecret.prefix); +#endif self-intern = intern; Py_XINCREF(self-intern); #ifdef Py_TPFLAGS_HAVE_GC -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Close #14205: dict lookup raises a RuntimeError if the dict is modified during
I do not believe the change set below is valid. As I read it, the new test verifies that one particular type of Nasty key will provoke a RuntimeError -- but that particular type already did so, by hitting the recursion limit. (It doesn't even really mutate the dict.) Meanwhile, the patch throws out tests for several different types of mutations that have caused problems -- even segfaults -- in the past, even after the dict implementation code was already fixed. Changing these tests to assertRaises would be fine, but they should all be kept; if nothing else, they test whether you've caught all mutation avenues. -jJ On Mon, Mar 5, 2012 at 7:13 PM, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/934aaf2191d0 changeset: 75445:934aaf2191d0 user: Victor Stinner victor.stin...@gmail.com date: Tue Mar 06 01:03:13 2012 +0100 summary: Close #14205: dict lookup raises a RuntimeError if the dict is modified during a lookup. if you want to make a sandbox on top of CPython, you have to fix segfaults so let's fix segfaults! files: Lib/test/crashers/nasty_eq_vs_dict.py | 47 -- Lib/test/test_dict.py | 22 +- Lib/test/test_mutants.py | 291 -- Misc/NEWS | 5 +- Objects/dictobject.c | 18 +- 5 files changed, 31 insertions(+), 352 deletions(-) diff --git a/Lib/test/crashers/nasty_eq_vs_dict.py b/Lib/test/crashers/nasty_eq_vs_dict.py deleted file mode 100644 --- a/Lib/test/crashers/nasty_eq_vs_dict.py +++ /dev/null @@ -1,47 +0,0 @@ -# from http://mail.python.org/pipermail/python-dev/2001-June/015239.html - -# if you keep changing a dictionary while looking up a key, you can -# provoke an infinite recursion in C - -# At the time neither Tim nor Michael could be bothered to think of a -# way to fix it. - -class Yuck: - def __init__(self): - self.i = 0 - - def make_dangerous(self): - self.i = 1 - - def __hash__(self): - # direct to slot 4 in table of size 8; slot 12 when size 16 - return 4 + 8 - - def __eq__(self, other): - if self.i == 0: - # leave dict alone - pass - elif self.i == 1: - # fiddle to 16 slots - self.__fill_dict(6) - self.i = 2 - else: - # fiddle to 8 slots - self.__fill_dict(4) - self.i = 1 - - return 1 - - def __fill_dict(self, n): - self.i = 0 - dict.clear() - for i in range(n): - dict[i] = i - dict[self] = OK! - -y = Yuck() -dict = {y: OK!} - -z = Yuck() -y.make_dangerous() -print(dict[z]) diff --git a/Lib/test/test_dict.py b/Lib/test/test_dict.py --- a/Lib/test/test_dict.py +++ b/Lib/test/test_dict.py @@ -379,7 +379,7 @@ x.fail = True self.assertRaises(Exc, d.pop, x) - def test_mutatingiteration(self): + def test_mutating_iteration(self): # changing dict size during iteration d = {} d[1] = 1 @@ -387,6 +387,26 @@ for i in d: d[i+1] = 1 + def test_mutating_lookup(self): + # changing dict during a lookup + class NastyKey: + mutate_dict = None + + def __hash__(self): + # hash collision! + return 1 + + def __eq__(self, other): + if self.mutate_dict: + self.mutate_dict[self] = 1 + return self == other + + d = {} + d[NastyKey()] = 0 + NastyKey.mutate_dict = d + with self.assertRaises(RuntimeError): + d[NastyKey()] = None + def test_repr(self): d = {} self.assertEqual(repr(d), '{}') diff --git a/Lib/test/test_mutants.py b/Lib/test/test_mutants.py deleted file mode 100644 --- a/Lib/test/test_mutants.py +++ /dev/null @@ -1,291 +0,0 @@ -from test.support import verbose, TESTFN -import random -import os - -# From SF bug #422121: Insecurities in dict comparison. - -# Safety of code doing comparisons has been an historical Python weak spot. 
-# The problem is that comparison of structures written in C *naturally* -# wants to hold on to things like the size of the container, or the -# biggest containee so far, across a traversal of the container; but -# code to do containee comparisons can call back into Python and mutate -# the container in arbitrary ways while the C loop is in midstream. If the -# C code isn't extremely paranoid about digging things out of memory on -# each trip, and artificially boosting refcounts for the duration, anything -# from infinite loops to OS crashes can result (yes, I use Windows wink). -# -# The other problem is that code designed to provoke a weakness is usually -# white-box code, and so catches only the particular vulnerabilities the -# author knew
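For contrast with the single NastyKey case that was kept, here is the kind of mutation the removed tests exercised, in minimal form: a key whose __eq__ genuinely resizes the dict in the middle of a lookup. The class name and constants are illustrative, not taken from the removed files, and the exact exception depends on the interpreter version:

    class ResizingKey:
        # A key whose __eq__ grows the victim dict, forcing a resize
        # while a lookup is still walking the old table.
        victim = None

        def __hash__(self):
            return 1          # force collisions so __eq__ gets called

        def __eq__(self, other):
            if ResizingKey.victim is not None:
                d, ResizingKey.victim = ResizingKey.victim, None
                for i in range(32):      # enough inserts to trigger a resize
                    d[i] = i
            return self is other

    d = {ResizingKey(): "original"}
    ResizingKey.victim = d
    try:
        d[ResizingKey()]      # lookup of a distinct, colliding key
    except (RuntimeError, KeyError) as exc:
        # With the change above this should surface as an exception
        # rather than a crash or silent corruption.
        print(type(exc).__name__)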
Re: [Python-Dev] [Python-checkins] peps: Switch back to named functions, since the Ellipsis version degenerated badly
On Wed, Feb 22, 2012 at 10:22 AM, nick.coghlan python-check...@python.org wrote: + in x = weakref.ref(target, report_destruction) + def report_destruction(obj): print({} is being destroyed.format(obj)) +If the repetition of the name seems especially annoying, then a throwaway +name like ``f`` can be used instead:: + in x = weakref.ref(target, f) + def f(obj): + print({} is being destroyed.format(obj)) I still feel that the helper function (or class) is subordinate, and should be indented. Thinking of in ... as a decorator helps, but makes it seem that the helper function is the important part (which it sometimes is...) I understand that adding a colon and indent has its own problems, but ... I'm not certain this is better, and I am certain that the desire for indentation is strong enough to at least justify discussion in the PEP. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for new dictionary implementation
On Fri, Feb 17, 2012 at 1:50 AM, Martin v. Löwis mar...@v.loewis.de wrote: Good idea. However, how do you track per-dict how large the table is? [Or, rather, what is the highest index needed to store any values that are actually set for this instance.] To determine whether it needs to grow the array, it needs to find out how large the array is, no? So: how do you do that? Ah, now I understand; you do need a single ssize_t either on the dict or at the head of the values array to indicate how many slots it has actually allocated. It *may* also be worthwhile to add a second ssize_t to indicate how many are currently in use, for faster results in case of len. But the dict is guaranteed to have at least one free slot, so that extra index will never make the allocation larger than the current code. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
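As a sanity check on the bookkeeping being discussed (one ssize_t for allocated slots, growth driven by whatever index the shared key table hands out), here is a toy pure-Python model -- not the C implementation, just the shape of it, using None as a "no value" sentinel for simplicity:

    class SharedKeys:
        # Toy model of a shared key table: each key maps to a slot index.
        def __init__(self):
            self.index = {}            # key -> slot number, shared by instances

        def slot_for(self, key):
            # Hand out the next slot the first time a key is seen.
            return self.index.setdefault(key, len(self.index))

    class SplitDict:
        def __init__(self, shared):
            self.shared = shared
            self.values = []           # len(self.values) plays the ssize_t role

        def __setitem__(self, key, value):
            slot = self.shared.slot_for(key)
            while len(self.values) <= slot:    # grow only when this instance needs it
                self.values.append(None)
            self.values[slot] = value

        def __getitem__(self, key):
            slot = self.shared.index[key]      # KeyError if no instance ever used it
            if slot >= len(self.values) or self.values[slot] is None:
                raise KeyError(key)
            return self.values[slot]

        def __len__(self):
            return sum(v is not None for v in self.values)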
Re: [Python-Dev] PEP for new dictionary implementation
On Thu, Feb 16, 2012 at 4:34 PM, Martin v. Löwis mar...@v.loewis.de wrote: Am 16.02.2012 19:24, schrieb Jim J. Jewett: PEP author Mark Shannon wrote (in http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt): ... allows ... (the ``__dict__`` attribute of an object) to share keys with other attribute dictionaries of instances of the same class. Is the same class a deliberate restriction, or just a convenience of implementation? It's about the implementation: the class keeps a pointer to the key set. A subclass has a separate pointer for that. I would prefer to see that reason in the PEP; after a few years, I have trouble finding email, even when I remember reading the conversation. Have you timed not storing the hash (in the dict) at all, at least for (unicode) str-only dicts? Going to the string for its own cached hash breaks locality a bit more, but saves 1/3 of the memory for combined tables, and may make a big difference for classes that have relatively few instances. I'd be in favor of that, but it is actually an unrelated change: whether or not you share key sets is unrelated to whether or not str-only dicts drop the cached hash. Except that the biggest arguments against it are that it breaks cache locality, and it changes the dictentry struct -- which this patch already does anyway. Given a dict, it may be tricky to determine whether or not it is str-only, i.e. what layout to use. Isn't that exactly the same determination needed when deciding whether or not to use lookdict_unicode? (It would make the switch to the more general lookdict more expensive, as that would involve a new allocation.) Reduction in memory use is directly related to the number of dictionaries with shared keys in existence at any time. These dictionaries are typically half the size of the current dictionary implementation. How do you measure that? The limit for huge N across huge numbers of dicts should be 1/3 (because both hashes and keys are shared); I assume that gets swamped by object overhead in typical small dicts. It's more difficult than that. He also drops the smalltable (which I think is a good idea), so accounting how this all plays together is tricky. All the more reason to explain in the PEP how he measured or approximated it. If a table is split the values in the keys table are ignored, instead the values are held in a separate array. If they're just dead weight, then why not use them to hold indices into the array, so that values arrays only have to be as long as the number of keys, rather than rounding them up to a large-enough power-of-two? (On average, this should save half the slots.) Good idea. However, how do you track per-dict how large the table is? Why would you want to? The per-instance array needs to be at least as large as the highest index used by any key for which it has a value; if the keys table gets far larger (or even shrinks), that doesn't really matter to the instance. What does matter to the instance is getting a value of its own for a new (to it) key -- and then the keys table can tell it which index to use, which in turn tells it whether or not it needs to grow the array. Are are you thinking of len(o.__dict__), which will indeed be a bit slower? That will happen with split dicts and potentially missing values, regardless of how much memory is set aside (or not) for the missing values. 
-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
I realize that _Py_Identifier is a private name, and that PEP 3131 requires anything (except test cases) in the standard library to stick with ASCII ... but somehow, that feels like too long of a chain. I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier, or at least a comment stating that Identifiers will (per PEP 3131) always be ASCII -- preferably with an assert to back that up. -jJ On Sat, Feb 4, 2012 at 7:46 PM, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/d2c1521ad0a1 changeset: 74772:d2c1521ad0a1 user: Victor Stinner victor.stin...@haypocalc.com date: Sun Feb 05 01:45:45 2012 +0100 summary: _Py_Identifier are always ASCII strings files: Objects/unicodeobject.c | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1744,9 +1744,8 @@ _PyUnicode_FromId(_Py_Identifier *id) { if (!id-object) { - id-object = PyUnicode_DecodeUTF8Stateful(id-string, - strlen(id-string), - NULL, NULL); + id-object = unicode_fromascii((unsigned char*)id-string, + strlen(id-string)); if (!id-object) return NULL; PyUnicode_InternInPlace(id-object); -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
In http://mail.python.org/pipermail/python-dev/2012-January/115350.html, Mark Shannon wrote: The minimal proposed change of seeding the hash from a global value (a single memory read and an addition) will have such a minimal performance effect that it will be undetectable even on the most noise-free testing environment. (1) Is it established that this (a single initial add, with no per-loop operations) would be sufficient? I thought that was in the gray area of We don't yet have a known attack, but there are clearly safer options. (2) Even if the direct cost (fetch and add) were free, it might be expensive in practice. The current hash function is designed to send similar strings (and similar numbers) to similar hashes. (2a) That guarantees they won't (initially) collide, even in very small dicts. (2b) It keeps them nearby, which has an effect on cache hits. The exact effect (and even direction) would of course depend on the workload, which makes me distrust micro-benchmarks. If this were a problem in practice, I could understand accepting a little slowdown as the price of safety, but ... it isn't. Even in theory, the only way to trigger this is to take unreasonable amounts of user input and turn it directly into an unreasonable number of keys (as opposed to values, or list elements) placed in the same dict (as opposed to a series of smaller dicts). -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
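For concreteness, the worst case behind all of this is easy to reproduce with a key type whose hash is constant; a small demonstration (timings are illustrative only, and time.perf_counter assumes Python 3.3+):

    import time

    class FixedHash(str):
        # Every instance hashes the same, so inserts degrade to linear probing.
        def __hash__(self):
            return 42

    def fill(n, key_type):
        keys = [key_type("k%d" % i) for i in range(n)]
        start = time.perf_counter()
        d = {}
        for k in keys:
            d[k] = None
        return time.perf_counter() - start

    for n in (1000, 2000, 4000):
        print(n, "normal %.4fs" % fill(n, str),
              "colliding %.4fs" % fill(n, FixedHash))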
[Python-Dev] Proposed PEP on concurrent programming support
(I've added back python-ideas, because I think that is still the appropriate forum.) A new suite type - the ``transaction`` will be added to the language. The suite will have the semantics discussed above: modifying an object in the suite will trigger creation of a thread-local shallow copy to be used in the Transaction. Further modifications of the original will cause all existing copies to be discarded and the transaction to be restarted. ... How will you know that an object has been modified? The only ways I can think of are (1) Timestamp every object -- or at least every mutable object -- and hope that everybody agrees on which modifications should count. (2) Make two copies of every object you're using in the suite; at the end, compare one of them to both the original and the one you were operating on. With this solution, you can decide for youself what counts as a modification, but it still isn't straightforward; I would consider changing a value to be changing a dict, even though nothing in the item (header) itself changed. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
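Of the two options listed, (1) is the easier one to sketch; a toy version-counter model of "did anyone modify this underneath me?" (entirely illustrative -- the PEP proposes syntax, not this API, and this sketch only shows detection, not write-back):

    class Versioned:
        # Wrap a value and bump a counter on every write, so a transaction
        # can ask afterwards whether anything changed underneath it.
        def __init__(self, value):
            self._value = value
            self._version = 0

        def read(self):
            return self._value, self._version

        def write(self, value):
            self._value = value
            self._version += 1

    def run_transaction(shared, body, retries=10):
        # 'shared' maps names to Versioned objects; 'body' works on plain values.
        for _ in range(retries):
            snapshot = {name: obj.read() for name, obj in shared.items()}
            result = body({name: val for name, (val, _) in snapshot.items()})
            if all(shared[name].read()[1] == ver
                   for name, (_, ver) in snapshot.items()):
                return result          # nothing moved underneath us: commit
        raise RuntimeError("transaction kept being invalidated")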
Re: [Python-Dev] That depends on what the meaning of is is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)
On Mon, Jan 2, 2012 at 7:16 PM, PJ Eby p...@telecommunity.com wrote: On Mon, Jan 2, 2012 at 4:07 PM, Jim Jewett jimjjew...@gmail.com wrote: But the public header file http://hg.python.org/cpython/file/3ed5a6030c9b/Include/dictobject.h defines the typedef structs for PyDictEntry and _dictobject. What is the purpose of the requiring a real dict without also promising what the header file promises? Er, just because it's in the .h doesn't mean it's in the public API. But in any event, if you're actually serious about this, I'd just point out that: 1. The struct layout doesn't guarantee anything about insertion or lookup algorithms, My concern was about your suggestion of changing the data structure to accommodate some other algorithm -- particularly if it meant that the data would no longer be stored entirely in an array of PyDictEntry. That shouldn't be done lightly even between major versions, and certainly should not be done in a bugfix (or security-only) release. Are you seriously writing code that relies on the C structure layout of dicts? The first page of search results for PyDictEntry suggested that others are. (The code I found did seem to be for getting data from a python dict into some other language, rather than for wsgi.) Because really, that was SO not the point of the dict type requirement. It was so that you could use Python's low-level *API* calls, not muck about with the data structure directly. Would it be too late to clarify that in the PEP itself? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] That depends on what the meaning of is is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)
On Mon, Jan 2, 2012 at 1:16 AM, PJ Eby p...@telecommunity.com wrote: On Sun, Jan 1, 2012 at 10:28 PM, Jim Jewett jimjjew...@gmail.com wrote: Given the wording requiring a real dictionary, I would have assumed that it was OK (if perhaps not sensible) to do pointer arithmetic and access the keys/values/hashes directly. (Though if the breakage was between python versions, I would feel guilty about griping too loudly.) If you're going to be a language lawyer about it, I would simply point out that all the spec requires is that type(env) is dict -- it says nothing about how Python defines type or is or dict. So, you're on your own with that one. ;-) But the public header file http://hg.python.org/cpython/file/3ed5a6030c9b/Include/dictobject.h defines the typedef structs for PyDictEntry and _dictobject. What is the purpose of the requiring a real dict without also promising what the header file promises? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
Steven D'Aprano (in http://mail.python.org/pipermail/python-dev/2011-December/115162.html) wrote: By compile-time, do you mean when the byte-code is compiled, i.e. just before runtime, rather than a switch when compiling the Python executable from source? No. I really mean when the C code is initially compiled to produce a Python executable. The only reason we're worrying about this is that an adversary may force worst-case performance. If the Python instance isn't a server, or at least isn't exposed to untrusted clients, then even a single extra if test is unjustified overhead. Adding overhead to every string hash or every dict lookup is bad. That said, adding some overhead (only) to dict lookups *that already hit half a dozen consecutive collisions* probably is reasonable, because that won't happen very often with normal data. (6 collisions can't happen at all unless there are already at least 6 entries, so small dicts are safe; with at least 1/3 of the slots empty, it should happen only 1/729 of the time for worst-size larger dicts.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
Paul McMillan in http://mail.python.org/pipermail/python-dev/2012-January/115183.html wrote: Guido van Rossum wrote: Hm. I'm not sure I like the idea of extra arithmetic for every character being hashed. the collision generator doesn't necessarily vary the length of the string. Additionally, if we don't vary based on all the letters in the string, an attacker can fix the characters that we do use and generate colliding strings around them. If the new hash algorithm doesn't kick in before, say, 32 characters, then most currently hashed strings will not be affected. And if the attacker has to add 32 characters to every key, it reduces the this can be done with only N bytes uploaded risk. (The same logic would apply to even longer prefixes, except that an attacker might more easily find short-enough strings that collide.) We could also consider a less computationally expensive operation than the modulo for calculating the lookup index, like simply truncating to the correct number of bits. Given that the modulo is always 2^N, how is that different? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
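On the last point: for power-of-two table sizes the two reductions pick the same slot, which is presumably why the question is asked; a two-line check:

    # dict tables are always a power of two, so masking == taking the modulo.
    size = 8
    for h in (17, 123456789, hash("example") & 0xFFFFFFFF):
        assert (h % size) == (h & (size - 1))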
[Python-Dev] Hash collision security issue (now public)
Victor Stinner wrote in http://mail.python.org/pipermail/python-dev/2012-January/115198.html If we want to protect a website against this attack for example, we must suppose that the attacker can inject arbitrary data and can get (indirectly) the result of hash(str) (e.g. with the representation of a dict in a traceback, with a JSON output, etc.). (1) Is it common to hash non-string input? Because generating integers that collide for certain dict sizes is pretty easy... (2) Would it make sense for traceback printing to sort dict keys? (Any site worried about this issue should already be hiding tracebacks from untrusted clients, but the cost of this extra protection may be pretty small, given that tracebacks shouldn't be printed all that often in the first place.) (3) Should the docs for json.encoder.JSONEncoder suggest sort_keys=True? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
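On point (3), the knob already exists; whether the docs should recommend it is the open question:

    import json

    data = {"b": 1, "a": 2, "c": 3}
    print(json.dumps(data))                  # follows the dict's own iteration order
    print(json.dumps(data, sort_keys=True))  # {"a": 2, "b": 1, "c": 3}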
[Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
In http://mail.python.org/pipermail/python-dev/2011-December/115172.html, P. J. Eby wrote: On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull stephen at xemacs.org wrote: While the dictionary probe has to start with a hash for backward compatibility reasons, is there a reason the overflow strategy for insertion has to be buckets containing lists? How about double-hashing, etc? This won't help, because the keys still have the same hash value. ANYTHING you do to them after they're generated will result in them still colliding. The *only* thing that works is to change the hash function in such a way that the strings end up with different hashes in the first place. Otherwise, you'll still end up with (deliberate) collisions. Well, there is nothing wrong with switching to a different hash function after N collisions, rather than in the first place. The perturbation effectively does this by shoving the high-order bits through the part of the hash that survives the mask. (Well, technically, you could use trees or some other O(log n) data structure as a fallback once you have too many collisions, for some value of too many. Seems a bit wasteful for the purpose, though.) Your WSGI specification http://www.python.org/dev/peps/pep-0333/ requires using a real dictionary for compatibility; storing some of the values outside the values array would violate that. Do you consider that obsolete? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
On Sun, Jan 1, 2012 at 8:04 PM, Christian Heimes li...@cheimes.de wrote: Am 02.01.2012 01:37, schrieb Jim Jewett: Well, there is nothing wrong with switching to a different hash function after N collisions, rather than in the first place. The perturbation effectively does by shoving the high-order bits through the part of the hash that survives the mask. Except that it won't work or slow down every lookup of missing keys? It's absolutely crucial that the lookup time is kept as fast as possible. It will only slow down missing keys that themselves hit more than N collisions. Or were you assuming that I meant to switch the whole table, rather than just that one key? I agree that wouldn't work. You can't just change the hash algorithm in the middle of the work without a speed impact on lookups. Right -- but there is nothing wrong with modifying the lookdict (and insert_clean) functions to do something different after the Nth collision than they did after the N-1th. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
On Sun, Jan 1, 2012 at 10:00 PM, PJ Eby p...@telecommunity.com wrote: On Sun, Jan 1, 2012 at 7:37 PM, Jim Jewett jimjjew...@gmail.com wrote: Well, there is nothing wrong with switching to a different hash function after N collisions, rather than in the first place. The perturbation effectively does by shoving the high-order bits through the part of the hash that survives the mask. Since these are true hash collisions, they will all have the same high order bits. So, the usefulness of the perturbation is limited mainly to the common case where true collisions are rare. That is only because the perturb is based solely on the hash. Switching to an entirely new hash after the 5th collision (for a given lookup) would resolve that (after the 5th collision); the question is whether or not the cost is worthwhile. (Well, technically, you could use trees or some other O log n data structure as a fallback once you have too many collisions, for some value of too many. Seems a bit wasteful for the purpose, though.) Your WSGI specification http://www.python.org/dev/peps/pep-0333/ requires using a real dictionary for compatibility; storing some of the values outside the values array would violate that. When I said use some other data structure, I was referring to the internal implementation of the dict type, not to user code. The only user-visible difference (even at C API level) would be the order of keys() et al. Given the wording requiring a real dictionary, I would have assumed that it was OK (if perhaps not sensible) to do pointer arithmetic and access the keys/values/hashes directly. (Though if the breakage was between python versions, I would feel guilty about griping too loudly.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
In http://mail.python.org/pipermail/python-dev/2011-December/115138.html, Christian Heimes pointed out that ... we don't have to alter the outcome of hash ... We just need to reduce the chance that an attacker can produce collisions in the dict (and set?) I'll state it more strongly. hash probably should not change (at least for this), but we may want to consider a different conflict resolution strategy when the first slot is already filled. Remember that there was a fair amount of thought and timing effort put into selecting the current strategy; it is deliberately sub-optimal for random input, in order to do better with typical input. http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictnotes.txt If there is a change, it would currently be needed in three places for each of set and dict (the lookdict functions and insertdict_clean). It may be worth adding some macros just to keep those six in sync. Once those macros are in place, that allows a compile-time switch. My personal opinion is that accepting *and parsing* enough data for this to be a problem is enough of an edge case that I don't want normal dicts slowed down at all for this; I would therefore prefer that the change be restricted to such a compile-time switch, with current behavior the default. http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictobject.c#l571 583 for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) { 584 i = (i << 2) + i + perturb + 1; PERTURB_SHIFT is already a private #define to 5; per dictnotes, 4 and 6 perform almost as well. Someone worried can easily make that change today, and be protected from generic anti-Python attacks. I believe the salt suggestions are equivalent to replacing perturb = hash; with something like perturb = hash + salt; Changing i = (i << 2) + i + perturb + 1; would allow effectively replacing the initial hash, but risks spoiling performance in the non-adversary case. Would there be objections to replacing those two lines with something like: for (perturb = FIRST_PERTURB(hash, key); ep->me_key != NULL; perturb = NEXT_PERTURB(hash, key, perturb)) { i = NEXT_SLOT(i, perturb); The default macro definitions should keep things as they are: #define FIRST_PERTURB(hash, key) hash #define NEXT_PERTURB(hash, key, perturb) perturb >> PERTURB_SHIFT #define NEXT_SLOT(i, perturb) (i << 2) + i + perturb + 1 while allowing #ifdefs for (slower but) safer things like adding a salt, or even using alternative hashes. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
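A rough Python model of what the macro hooks above would permit: the same probe sequence as today by default, with a hook that only fires after the Nth collision. The threshold and the salted re-seed are invented for illustration, not a concrete proposal:

    PERTURB_SHIFT = 5
    MAX_PLAIN_COLLISIONS = 6          # illustrative threshold

    def probe_slots(h, mask, salt=0):
        # Same sequence as dictobject.c until the Nth collision, then a
        # hypothetical salted re-seed of the perturbation, so engineered
        # collisions stop steering the probe order.
        i = h & mask
        perturb = h
        collisions = 0
        while True:
            yield i & mask
            collisions += 1
            if collisions == MAX_PLAIN_COLLISIONS:
                perturb = (h ^ salt) * 1000003 & (2 ** 64 - 1)
            i = (i << 2) + i + perturb + 1
            perturb >>= PERTURB_SHIFT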
[Python-Dev] A new dict for Xmas?
Greg Ewing wrote: Mark Shannon wrote: I have a new dict implementation which allows sharing of keys between objects of the same class. We already have the __slots__ mechanism for memory savings. Have you done any comparisons with that? You can't make Python programmers use slots, neither can you automatically change existing programs. The automatic change is exactly what a dictionary upgrade provides. I haven't read your patch in detail yet, but it sounds like you're replacing the array of keys + array of values with just an array of values, and getting the numerical index from a single per-class array of keys. That would normally be sensible (so thanks!), but it isn't a drop-in replacement. If you have a Data class intended to take arbitrary per-instance attributes, it just forces them all to keep resizing up, even though individual instances would be small with the current dict. How is this more extreme than replacing a pure dict with some auto-calculated slots and an other_attrs dict that would normally remain empty? [It may be harder to implement, because of the difficulty of calculating the slots in advance ... but I don't see it as any worse, once implemented.] Of course, maybe your shared dict just points to sequential array positions (rather than matching the key position) ... in which case, it may well beat slots, though the the Data class would still be a problem. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
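For comparison with the __slots__ route: the saving exists today, but only for classes that opt in and give up arbitrary attributes (sizes below vary by version and platform):

    import sys

    class WithDict:
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    class WithSlots:
        __slots__ = ("a", "b", "c")
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    d, s = WithDict(), WithSlots()
    print(sys.getsizeof(d) + sys.getsizeof(d.__dict__))  # instance plus its dict
    print(sys.getsizeof(s))                              # no __dict__ at all
    d.extra = 4       # still allowed: this is what the Data class relies on
    # s.extra = 4     # AttributeError -- the trade-off __slots__ imposes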
Re: [Python-Dev] PyUnicodeObject / PyASCIIObject questions
On Tue, Dec 13, 2011 at 2:55 AM, Martin v. Löwis mar...@v.loewis.de wrote: (1) Why is PyObject_HEAD used instead of PyObject_VAR_HEAD? The unicode object is not a var object. In a var object, tp_itemsize gives the element size, which is not possible for unicode objects, since the itemsize may vary by instance. In addition, not all instances have the items after the base object (plus the size of the base object in tp_basicsize is also not always correct). That makes perfect sense. Any chance of adding the rationale to the code? Either inline, such as changing unicodeobject.h line 291 from PyObject_HEAD to something like: PyObject_HEAD /* Not VAR_HEAD, because tp_itemsize varies, and data may be elsewhere. */ or in the large comments around line 288: Note that Strings use PyObject_HEAD and a length field instead of PyObject_VAR_HEAD, because the tp_itemsize varies by instance, and the actual data is not always immediately after the PyASCIIObject header. (2) Why does PyASCIIObject have a wstr member, and why does PyCompactUnicodeObject have wstr_length? As best I can tell from the PEP or header file, wstr is only meaningful when either: No. wstr is most of all relevant if someone calls PyUnicode_AsUnicode(AndSize); any unicode object might get the wstr pointer filled out at some point. I am willing to believe that requests for a wchar_t (or utf-8 or System Locale charset) representation are common enough to justify caching the data after the first request. But then why throw it away in the first place? Wouldn't programs that create unicode from wchar_t data also be the most likely to request wchar_t data back? wstr_length is only relevant if wstr is not NULL. For a pure ASCII string (and also for Latin-1 and other BMP strings), the wstr length will always equal the canonical length (number of code points). wstr_length != length exactly when: 2==sizeof(wchar_t) PyUnicode_4BYTE_KIND == PyUnicode_KIND( str ) which can sometimes be eliminated at compile-time, and always by string creation time. In all other cases, (wstr_length == length), and wstr can be generated by widening the data without having to inspect it. Is it worth eliminating wstr_length (or even wstr) in those cases, or is that too much complexity? (3) I would feel much less nervous if the remaining 4 values of PyUnicode_Kind were explicitly reserved, and the macros raised an error when they showed up. ... If people use C, they can construct all kinds of illegal ... kind values: many places will either work incorrectly, or have an assertion in debug mode already if an unexpected kind is encountered. What I'm asking is that (1) The other values be documented as reserved, rather than as illegal. (2) The macros produce an error rather than silently corrupting data. This allows at least the possibility of a later change such that (3) The macros handle the new values correctly, if only by delegating back to type-supplied functions. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PyUnicodeObject / PyASCIIObject questions
(see http://www.python.org/dev/peps/pep-0393/ and http://hg.python.org/cpython/file/6f097ff9ac04/Include/unicodeobject.h ) typedef struct { PyObject_HEAD Py_ssize_t length; Py_hash_t hash; struct { unsigned int interned:2; unsigned int kind:2; /* now 3 in implementation */ unsigned int compact:1; unsigned int ascii:1; unsigned int ready:1; } state; wchar_t *wstr; } PyASCIIObject; typedef struct { PyASCIIObject _base; Py_ssize_t utf8_length; char *utf8; Py_ssize_t wstr_length; } PyCompactUnicodeObject; typedef struct { PyCompactUnicodeObject _base; union { void *any; Py_UCS1 *latin1; Py_UCS2 *ucs2; Py_UCS4 *ucs4; } data; } PyUnicodeObject; (1) Why is PyObject_HEAD used instead of PyObject_VAR_HEAD? It is because of the names (.length vs .size), or a holdover from when unicode (as opposed to str) did not expect to be compact, or is there a deeper reason? (2) Why does PyASCIIObject have a wstr member, and why does PyCompactUnicodeObject have wstr_length? As best I can tell from the PEP or header file, wstr is only meaningful when either: (2a) wstr is shared with (and redundant to) the canonical representation -- which will therefore not be ASCII. So wstr (and wstr_length) shouldn't need to be represented explicitly, and certainly not in the PyASCIIObject base. or (2b) The string is a Legacy String (and PyUnicode_READY has not been called). Because it is a Legacy String, the object header must already be a full PyUnicodeObject, and the wstr fields could at least be stored there. I'm also not sure why wstr can't be stored in the existing .data member -- once PyUnicode_READY is called, it will either be there (shared) or be discarded. Are there other times when the wstr will be explicitly re-filled and cached? (3) I would feel much less nervous if the remaining 4 values of PyUnicode_Kind were explicitly reserved, and the macros raised an error when they showed up. (Better still would be to allow other values, and to have the macros delegate to some attribute on the (sub) type object.) Discussion on py-ideas strongly suggested that people should not be rolling their own string string representations, and that it won't really save as much as people think it will, etc ... but I'm not sure that saying do it without inheritance is the best solution -- and that is what treating kind as an exhaustive list does. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Case consistency [was: Re: [Python-checkins] cpython: Cleanup code: remove int/long idioms and simplify a while statement.]
Is there a reason to check for if s[:5] == 'pass ' or s[:5] == 'PASS ': instead of if s[:5].lower() == 'pass ' ? If so, it should be documented; otherwise, I would rather see the more inclusive form, which would also allow things like 'Pass ' -jJ On Sun, Oct 23, 2011 at 4:21 PM, florent.xicluna python-check...@python.org wrote: http://hg.python.org/cpython/rev/67053b135ed9 changeset: 73076:67053b135ed9 user: Florent Xicluna florent.xicl...@gmail.com date: Sun Oct 23 22:11:00 2011 +0200 summary: Cleanup code: remove int/long idioms and simplify a while statement. diff --git a/Lib/ftplib.py b/Lib/ftplib.py --- a/Lib/ftplib.py +++ b/Lib/ftplib.py @@ -175,10 +175,8 @@ # Internal: sanitize a string for printing def sanitize(self, s): - if s[:5] == 'pass ' or s[:5] == 'PASS ': - i = len(s) - while i > 5 and s[i-1] in {'\r', '\n'}: - i = i-1 + if s[:5] in {'pass ', 'PASS '}: + i = len(s.rstrip('\r\n')) s = s[:5] + '*'*(i-5) + s[i:] return repr(s) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
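A sketch of the variant being asked about -- same behavior as the committed code for 'pass ' and 'PASS ', plus mixed-case forms:

    def sanitize(self, s):
        # More inclusive form: also masks "Pass ", "PaSS ", etc.
        if s[:5].lower() == 'pass ':
            i = len(s.rstrip('\r\n'))
            s = s[:5] + '*' * (i - 5) + s[i:]
        return repr(s)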
Re: [Python-Dev] [Python-checkins] cpython: Enhance Py_ARRAY_LENGTH(): fail at build time if the argument is not an array
On Wed, Sep 28, 2011 at 8:07 PM, Benjamin Peterson benja...@python.org wrote: 2011/9/28 victor.stinner python-check...@python.org: http://hg.python.org/cpython/rev/36fc514de7f0 changeset: 72512:36fc514de7f0 ... Thanks Rusty Russell for having written these amazing C macros! Do we really need a new file? Why not pyport.h where other compiler stuff goes? I would expect pyport to contain only system-specific macros. These seem more universal. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: #11572: improvements to copy module tests along with removal of old test suite
Why was the old test suite removed? Even if everything is covered by the test file (and that isn't clear from this checkin), I don't see anything wrong with a quick test that doesn't require loading the whole testing apparatus. (I would have no objection to including a comment saying that the majority of the tests are in the test file; I just wonder why they have to be removed entirely.) On Fri, Aug 5, 2011 at 5:06 PM, sandro.tosi python-check...@python.org wrote: http://hg.python.org/cpython/rev/74e79b2c114a changeset: 71749:74e79b2c114a user: Sandro Tosi sandro.t...@gmail.com date: Fri Aug 05 23:05:35 2011 +0200 summary: #11572: improvements to copy module tests along with removal of old test suite files: Lib/copy.py | 65 --- Lib/test/test_copy.py | 168 - 2 files changed, 95 insertions(+), 138 deletions(-) diff --git a/Lib/copy.py b/Lib/copy.py --- a/Lib/copy.py +++ b/Lib/copy.py @@ -323,68 +323,3 @@ # Helper for instance creation without calling __init__ class _EmptyClass: pass - -def _test(): - l = [None, 1, 2, 3.14, 'xyzzy', (1, 2), [3.14, 'abc'], - {'abc': 'ABC'}, (), [], {}] - l1 = copy(l) - print(l1==l) - l1 = map(copy, l) - print(l1==l) - l1 = deepcopy(l) - print(l1==l) - class C: - def __init__(self, arg=None): - self.a = 1 - self.arg = arg - if __name__ == '__main__': - import sys - file = sys.argv[0] - else: - file = __file__ - self.fp = open(file) - self.fp.close() - def __getstate__(self): - return {'a': self.a, 'arg': self.arg} - def __setstate__(self, state): - for key, value in state.items(): - setattr(self, key, value) - def __deepcopy__(self, memo=None): - new = self.__class__(deepcopy(self.arg, memo)) - new.a = self.a - return new - c = C('argument sketch') - l.append(c) - l2 = copy(l) - print(l == l2) - print(l) - print(l2) - l2 = deepcopy(l) - print(l == l2) - print(l) - print(l2) - l.append({l[1]: l, 'xyz': l[2]}) - l3 = copy(l) - import reprlib - print(map(reprlib.repr, l)) - print(map(reprlib.repr, l1)) - print(map(reprlib.repr, l2)) - print(map(reprlib.repr, l3)) - l3 = deepcopy(l) - print(map(reprlib.repr, l)) - print(map(reprlib.repr, l1)) - print(map(reprlib.repr, l2)) - print(map(reprlib.repr, l3)) - class odict(dict): - def __init__(self, d = {}): - self.a = 99 - dict.__init__(self, d) - def __setitem__(self, k, i): - dict.__setitem__(self, k, i) - self.a - o = odict({A : B}) - x = deepcopy(o) - print(o, x) - -if __name__ == '__main__': - _test() diff --git a/Lib/test/test_copy.py b/Lib/test/test_copy.py --- a/Lib/test/test_copy.py +++ b/Lib/test/test_copy.py @@ -17,7 +17,7 @@ # Attempt full line coverage of copy.py from top to bottom def test_exceptions(self): - self.assertTrue(copy.Error is copy.error) + self.assertIs(copy.Error, copy.error) self.assertTrue(issubclass(copy.Error, Exception)) # The copy() method @@ -54,20 +54,26 @@ def test_copy_reduce_ex(self): class C(object): def __reduce_ex__(self, proto): + c.append(1) return def __reduce__(self): - raise support.TestFailed(shouldn't call this) + self.fail(shouldn't call this) + c = [] x = C() y = copy.copy(x) - self.assertTrue(y is x) + self.assertIs(y, x) + self.assertEqual(c, [1]) def test_copy_reduce(self): class C(object): def __reduce__(self): + c.append(1) return + c = [] x = C() y = copy.copy(x) - self.assertTrue(y is x) + self.assertIs(y, x) + self.assertEqual(c, [1]) def test_copy_cant(self): class C(object): @@ -91,7 +97,7 @@ hello, hello\u1234, f.__code__, NewStyle, range(10), Classic, max] for x in tests: - self.assertTrue(copy.copy(x) is x, repr(x)) + self.assertIs(copy.copy(x), x) def 
test_copy_list(self): x = [1, 2, 3] @@ -185,9 +191,9 @@ x = [x, x] y = copy.deepcopy(x) self.assertEqual(y, x) - self.assertTrue(y is not x) - self.assertTrue(y[0] is not x[0]) - self.assertTrue(y[0] is y[1]) + self.assertIsNot(y, x) + self.assertIsNot(y[0], x[0]) + self.assertIs(y[0], y[1]) def test_deepcopy_issubclass(self): # XXX Note: there's no way to test the TypeError coming out of @@
Re: [Python-Dev] [Python-checkins] cpython: Remove mention of medical condition from the test suite.
If you're going to get rid of the pun, you might as well change the whole sentence... On Sun, Jul 3, 2011 at 1:22 PM, georg.brandl python-check...@python.org wrote: http://hg.python.org/cpython/rev/76452b892838 changeset: 71146:76452b892838 parent: 71144:ce52310f61a0 user: Georg Brandl ge...@python.org date: Sun Jul 03 19:22:42 2011 +0200 summary: Remove mention of medical condition from the test suite. files: Lib/test/test_csv.py | 8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/Lib/test/test_csv.py b/Lib/test/test_csv.py --- a/Lib/test/test_csv.py +++ b/Lib/test/test_csv.py @@ -459,20 +459,20 @@ '5', '6']]) def test_quoted_quote(self): - self.readerAssertEqual('1,2,3,I see, said the blind man,as he picked up his hammer and saw', + self.readerAssertEqual('1,2,3,I see, said the happy man,as he picked up his hammer and saw', [['1', '2', '3', - 'I see, said the blind man', + 'I see, said the happy man', 'as he picked up his hammer and saw']]) def test_quoted_nl(self): input = '''\ 1,2,3,I see, -said the blind man,as he picked up his +said the happy man,as he picked up his hammer and saw 9,8,7,6''' self.readerAssertEqual(input, [['1', '2', '3', - 'I see,\nsaid the blind man', + 'I see,\nsaid the happy man', 'as he picked up his\nhammer and saw'], ['9','8','7','6']]) -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: #6771: Move wrapper function into __init__ and eliminate wrapper module
Does this really need to be a bare except? On Sat, Jun 18, 2011 at 8:21 PM, r.david.murray python-check...@python.org wrote: http://hg.python.org/cpython/rev/9c96c3adbcd1 changeset: 70867:9c96c3adbcd1 user: R David Murray rdmur...@bitdance.com date: Sat Jun 18 20:21:09 2011 -0400 summary: #6771: Move wrapper function into __init__ and eliminate wrapper module Andrew agreed in the issue that eliminating the module file made sense. Wrapper has only been exposed as a function, and so there is no (easy) way to access the wrapper module, which in any case only had the one function in it. Since __init__ already contains a couple wrapper functions, it seems to make sense to just move wrapper there instead of importing it from a single function module. files: Lib/curses/__init__.py | 46 +++- Lib/curses/wrapper.py | 50 -- Misc/NEWS | 4 ++ 3 files changed, 49 insertions(+), 51 deletions(-) diff --git a/Lib/curses/__init__.py b/Lib/curses/__init__.py --- a/Lib/curses/__init__.py +++ b/Lib/curses/__init__.py @@ -13,7 +13,6 @@ __revision__ = $Id$ from _curses import * -from curses.wrapper import wrapper import os as _os import sys as _sys @@ -57,3 +56,48 @@ has_key except NameError: from has_key import has_key + +# Wrapper for the entire curses-based application. Runs a function which +# should be the rest of your curses-based application. If the application +# raises an exception, wrapper() will restore the terminal to a sane state so +# you can read the resulting traceback. + +def wrapper(func, *args, **kwds): + Wrapper function that initializes curses and calls another function, + restoring normal keyboard/screen behavior on error. + The callable object 'func' is then passed the main window 'stdscr' + as its first argument, followed by any other arguments passed to + wrapper(). + + + try: + # Initialize curses + stdscr = initscr() + + # Turn off echoing of keys, and enter cbreak mode, + # where no buffering is performed on keyboard input + noecho() + cbreak() + + # In keypad mode, escape sequences for special keys + # (like the cursor keys) will be interpreted and + # a special value like curses.KEY_LEFT will be returned + stdscr.keypad(1) + + # Start color, too. Harmless if the terminal doesn't have + # color; user can test with has_color() later on. The try/catch + # works around a minor bit of over-conscientiousness in the curses + # module -- the error return from C start_color() is ignorable. + try: + start_color() + except: + pass + + return func(stdscr, *args, **kwds) + finally: + # Set everything back to normal + if 'stdscr' in locals(): + stdscr.keypad(0) + echo() + nocbreak() + endwin() diff --git a/Lib/curses/wrapper.py b/Lib/curses/wrapper.py deleted file mode 100644 --- a/Lib/curses/wrapper.py +++ /dev/null @@ -1,50 +0,0 @@ -curses.wrapper - -Contains one function, wrapper(), which runs another function which -should be the rest of your curses-based application. If the -application raises an exception, wrapper() will restore the terminal -to a sane state so you can read the resulting traceback. - - - -import curses - -def wrapper(func, *args, **kwds): - Wrapper function that initializes curses and calls another function, - restoring normal keyboard/screen behavior on error. - The callable object 'func' is then passed the main window 'stdscr' - as its first argument, followed by any other arguments passed to - wrapper(). 
- - - try: - # Initialize curses - stdscr = curses.initscr() - - # Turn off echoing of keys, and enter cbreak mode, - # where no buffering is performed on keyboard input - curses.noecho() - curses.cbreak() - - # In keypad mode, escape sequences for special keys - # (like the cursor keys) will be interpreted and - # a special value like curses.KEY_LEFT will be returned - stdscr.keypad(1) - - # Start color, too. Harmless if the terminal doesn't have - # color; user can test with has_color() later on. The try/catch - # works around a minor bit of over-conscientiousness in the curses - # module -- the error return from C start_color() is ignorable. - try: - curses.start_color() - except: - pass - - return func(stdscr, *args, **kwds) - finally: - # Set everything back to normal - if 'stdscr' in locals(): - stdscr.keypad(0) - curses.echo() - curses.nocbreak() - curses.endwin() diff --git a/Misc/NEWS
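The narrower form being asked about would only swallow the error that start_color() itself raises on terminals without color support, rather than every exception; a sketch:

    import curses

    try:
        curses.start_color()
    except curses.error:
        pass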
Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib.
Can you clarify (preferably in the commit message as well) exactly *why* these largefile tests are useless? For example, is there another test that covers this already? -jJ On 5/7/11, nadeem.vawda python-check...@python.org wrote: http://hg.python.org/cpython/rev/201dcfc56e86 changeset: 69886:201dcfc56e86 branch: 2.7 parent: 69881:a0147a1f1776 user:Nadeem Vawda nadeem.va...@gmail.com date:Sat May 07 11:28:03 2011 +0200 summary: Issue #11277: Remove useless test from test_zlib. files: Lib/test/test_zlib.py | 42 --- 1 files changed, 0 insertions(+), 42 deletions(-) diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py --- a/Lib/test/test_zlib.py +++ b/Lib/test/test_zlib.py @@ -72,47 +72,6 @@ zlib.crc32('spam', (2**31))) -# Issue #11277 - check that inputs of 2 GB (or 1 GB on 32 bits system) are -# handled correctly. Be aware of issues #1202. We cannot test a buffer of 4 GB -# or more (#8650, #8651 and #10276), because the zlib stores the buffer size -# into an int. -class ChecksumBigBufferTestCase(unittest.TestCase): -if sys.maxsize _4G: -# (64 bits system) crc32() and adler32() stores the buffer size into an -# int, the maximum filesize is INT_MAX (0x7FFF) -filesize = 0x7FFF -else: -# (32 bits system) On a 32 bits OS, a process cannot usually address -# more than 2 GB, so test only 1 GB -filesize = _1G - -@unittest.skipUnless(mmap, mmap() is not available.) -def test_big_buffer(self): -if sys.platform[:3] == 'win' or sys.platform == 'darwin': -requires('largefile', - 'test requires %s bytes and a long time to run' % - str(self.filesize)) -try: -with open(TESTFN, wb+) as f: -f.seek(self.filesize-4) -f.write(asdf) -f.flush() -m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) -try: -if sys.maxsize _4G: -self.assertEqual(zlib.crc32(m), 0x709418e7) -self.assertEqual(zlib.adler32(m), -2072837729) -else: -self.assertEqual(zlib.crc32(m), 722071057) -self.assertEqual(zlib.adler32(m), -1002962529) -finally: -m.close() -except (IOError, OverflowError): -raise unittest.SkipTest(filesystem doesn't have largefile support) -finally: -unlink(TESTFN) - - class ExceptionTestCase(unittest.TestCase): # make sure we generate some expected errors def test_badlevel(self): @@ -595,7 +554,6 @@ def test_main(): run_unittest( ChecksumTestCase, -ChecksumBigBufferTestCase, ExceptionTestCase, CompressTestCase, CompressObjectTestCase -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?) Is this ASCII restriction (as opposed to even UTF8) really needed? Or are you just saying that we need to create an ASCII name for passing to C? -jJ On 5/7/11, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/eb003c3d1770 changeset: 69889:eb003c3d1770 user:Victor Stinner victor.stin...@haypocalc.com date:Sat May 07 12:46:05 2011 +0200 summary: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII The name must be encodable to ASCII because dynamic module must have a function called PyInit_NAME, they are written in C, and the C language doesn't accept non-ASCII identifiers. files: Python/importdl.c | 40 +- 1 files changed, 25 insertions(+), 15 deletions(-) diff --git a/Python/importdl.c b/Python/importdl.c --- a/Python/importdl.c +++ b/Python/importdl.c @@ -20,31 +20,36 @@ const char *pathname, FILE *fp); #endif -/* name should be ASCII only because the C language doesn't accept non-ASCII - identifiers, and dynamic modules are written in C. */ - PyObject * _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp) { -PyObject *m; +PyObject *m = NULL; #ifndef MS_WINDOWS PyObject *pathbytes; #endif +PyObject *nameascii; char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext; dl_funcptr p0; PyObject* (*p)(void); struct PyModuleDef *def; -namestr = _PyUnicode_AsString(name); -if (namestr == NULL) -return NULL; - m = _PyImport_FindExtensionObject(name, path); if (m != NULL) { Py_INCREF(m); return m; } +/* name must be encodable to ASCII because dynamic module must have a + function called PyInit_NAME, they are written in C, and the C language + doesn't accept non-ASCII identifiers. */ +nameascii = PyUnicode_AsEncodedString(name, ascii, NULL); +if (nameascii == NULL) +return NULL; + +namestr = PyBytes_AS_STRING(nameascii); +if (namestr == NULL) +goto error; + lastdot = strrchr(namestr, '.'); if (lastdot == NULL) { packagecontext = NULL; @@ -60,34 +65,33 @@ #else pathbytes = PyUnicode_EncodeFSDefault(path); if (pathbytes == NULL) -return NULL; +goto error; p0 = _PyImport_GetDynLoadFunc(shortname, PyBytes_AS_STRING(pathbytes), fp); Py_DECREF(pathbytes); #endif p = (PyObject*(*)(void))p0; if (PyErr_Occurred()) -return NULL; +goto error; if (p == NULL) { PyErr_Format(PyExc_ImportError, dynamic module does not define init function (PyInit_%s), shortname); -return NULL; +goto error; } oldcontext = _Py_PackageContext; _Py_PackageContext = packagecontext; m = (*p)(); _Py_PackageContext = oldcontext; if (m == NULL) -return NULL; +goto error; if (PyErr_Occurred()) { -Py_DECREF(m); PyErr_Format(PyExc_SystemError, initialization of %s raised unreported exception, shortname); -return NULL; +goto error; } /* Remember pointer to module init function. */ @@ -101,12 +105,18 @@ Py_INCREF(path); if (_PyImport_FixupExtensionObject(m, name, path) 0) -return NULL; +goto error; if (Py_VerboseFlag) PySys_FormatStderr( import %U # dynamically loaded from %R\n, name, path); +Py_DECREF(nameascii); return m; + +error: +Py_DECREF(nameascii); +Py_XDECREF(m); +return NULL; } #endif /* HAVE_DYNAMIC_LOADING */ -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Userlist.copy() wasn't returning a UserList.
Do you also want to assert that u is not v, or would that sort of copy be acceptable by some subclasses? On 5/5/11, raymond.hettinger python-check...@python.org wrote: http://hg.python.org/cpython/rev/f20373fcdde5 changeset: 69865:f20373fcdde5 user:Raymond Hettinger pyt...@rcn.com date:Thu May 05 14:34:35 2011 -0700 summary: Userlist.copy() wasn't returning a UserList. files: Lib/collections/__init__.py | 2 +- Lib/test/test_userlist.py | 6 ++ 2 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/collections/__init__.py b/Lib/collections/__init__.py --- a/Lib/collections/__init__.py +++ b/Lib/collections/__init__.py @@ -887,7 +887,7 @@ def pop(self, i=-1): return self.data.pop(i) def remove(self, item): self.data.remove(item) def clear(self): self.data.clear() -def copy(self): return self.data.copy() +def copy(self): return self.__class__(self) def count(self, item): return self.data.count(item) def index(self, item, *args): return self.data.index(item, *args) def reverse(self): self.data.reverse() diff --git a/Lib/test/test_userlist.py b/Lib/test/test_userlist.py --- a/Lib/test/test_userlist.py +++ b/Lib/test/test_userlist.py @@ -52,6 +52,12 @@ return str(key) + '!!!' self.assertEqual(next(iter(T((1,2, 0!!!) +def test_userlist_copy(self): +u = self.type2test([6, 8, 1, 9, 1]) +v = u.copy() +self.assertEqual(u, v) +self.assertEqual(type(u), type(v)) + def test_main(): support.run_unittest(UserListTest) -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
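A sketch of the extra assertion being asked about, on top of the test in the patch (assuming the same UserListTest harness, where self.type2test is UserList):

    def test_userlist_copy(self):
        u = self.type2test([6, 8, 1, 9, 1])
        v = u.copy()
        self.assertEqual(u, v)
        self.assertEqual(type(u), type(v))
        self.assertIsNot(u, v)        # the additional identity requirement in question
        u.append(7)
        self.assertNotEqual(u, v)     # a shared-storage "copy" would fail here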
Re: [Python-Dev] [Python-checkins] cpython: PyGILState_Ensure(), PyGILState_Release(), PyGILState_GetThisThreadState() are
Would it be a problem to make them available a no-ops? On 4/26/11, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/75503c26a17f changeset: 69584:75503c26a17f user:Victor Stinner victor.stin...@haypocalc.com date:Tue Apr 26 23:34:58 2011 +0200 summary: PyGILState_Ensure(), PyGILState_Release(), PyGILState_GetThisThreadState() are not available if Python is compiled without threads. files: Include/pystate.h | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Include/pystate.h b/Include/pystate.h --- a/Include/pystate.h +++ b/Include/pystate.h @@ -73,9 +73,9 @@ struct _frame *frame; int recursion_depth; char overflowed; /* The stack has overflowed. Allow 50 more calls - to handle the runtime error. */ -char recursion_critical; /* The current calls must not cause - a stack overflow. */ +to handle the runtime error. */ +char recursion_critical; /* The current calls must not cause +a stack overflow. */ /* 'tracing' keeps track of the execution depth when tracing/profiling. This is to prevent the actual trace/profile code from being recorded in the trace/profile. */ @@ -158,6 +158,8 @@ enum {PyGILState_LOCKED, PyGILState_UNLOCKED} PyGILState_STATE; +#ifdef WITH_THREAD + /* Ensure that the current thread is ready to call the Python C API, regardless of the current state of Python, or of its thread lock. This may be called as many times as desired @@ -199,6 +201,8 @@ */ PyAPI_FUNC(PyThreadState *) PyGILState_GetThisThreadState(void); +#endif /* #ifdef WITH_THREAD */ + /* The implementation of sys._current_frames() Returns a dict mapping thread id to that thread's current frame. */ -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
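A rough sketch of the "available as no-ops" idea (hypothetical, not what the checkin does), so extension code could keep calling these unconditionally in a threadless build:

    #ifndef WITH_THREAD
    /* Hypothetical no-op fallbacks for --without-threads builds. */
    #define PyGILState_Ensure()              PyGILState_LOCKED
    #define PyGILState_Release(state)        ((void)(state))
    #define PyGILState_GetThisThreadState()  PyThreadState_Get()
    #endif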
Re: [Python-Dev] [Python-checkins] cpython (3.2): Issue #11919: try to fix test_imp failure on some buildbots.
This seems to be changing what is tested -- are you saying that filenames with an included directory name are not intended to be supported? On 4/25/11, antoine.pitrou python-check...@python.org wrote: http://hg.python.org/cpython/rev/2f2c7eb27437 changeset: 69556:2f2c7eb27437 branch: 3.2 parent: 69554:77cf9e4b144b user:Antoine Pitrou solip...@pitrou.net date:Mon Apr 25 21:39:49 2011 +0200 summary: Issue #11919: try to fix test_imp failure on some buildbots. files: Lib/test/test_imp.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_imp.py b/Lib/test/test_imp.py --- a/Lib/test/test_imp.py +++ b/Lib/test/test_imp.py @@ -171,8 +171,9 @@ support.rmtree(test_package_name) def test_issue9319(self): +path = os.path.dirname(__file__) self.assertRaises(SyntaxError, - imp.find_module, test/badsyntax_pep3120) + imp.find_module, badsyntax_pep3120, [path]) class ReloadTests(unittest.TestCase): -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] clarification: subset vs equality Re: [Python-checkins] peps: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements
On 4/4/11, brett.cannon python-check...@python.org wrote: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements +Abstract + + +The Python standard library under CPython contains various instances +of modules implemented in both pure Python and C. This PEP requires +that in these instances that both the Python and C code *must* be +semantically identical (except in cases where implementation details +of a VM prevents it entirely). It is also required that new C-based +modules lacking a pure Python equivalent implementation get special +permissions to be added to the standard library. I think it is worth stating explicitly that the C version can be even a strict subset. It is OK for the accelerated C code to rely on the common python version; it is just the reverse that is not OK. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
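The "C relies on the common Python version" pattern, sketched with a made-up module pair (the stdlib's heapq/_heapq works this way): the pure Python file is complete on its own, and the optional accelerator overrides whatever subset it implements.

    # mylib.py -- reference implementation, always importable
    def frobnicate(x):
        "Pure Python version; the accelerator may replace this."
        return x * 2

    try:
        from _mylib import frobnicate   # hypothetical C accelerator, a strict subset
    except ImportError:
        pass                            # no accelerator built: keep the Python version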
Re: [Python-Dev] [Python-checkins] r87980 - in python/branches/py3k/Lib/importlib: _bootstrap.py abc.py
Why? Are annotations being deprecated in general? Or are these particular annotations no longer accurate? -jJ On Wed, Jan 12, 2011 at 9:31 PM, raymond.hettinger python-check...@python.org wrote: Author: raymond.hettinger Date: Thu Jan 13 03:31:25 2011 New Revision: 87980 Log: Issue 10899: Remove function type annotations from the stdlib Modified: python/branches/py3k/Lib/importlib/_bootstrap.py python/branches/py3k/Lib/importlib/abc.py Modified: python/branches/py3k/Lib/importlib/_bootstrap.py == --- python/branches/py3k/Lib/importlib/_bootstrap.py (original) +++ python/branches/py3k/Lib/importlib/_bootstrap.py Thu Jan 13 03:31:25 2011 @@ -345,7 +345,7 @@ class SourceLoader(_LoaderBasics): - def path_mtime(self, path:str) - int: + def path_mtime(self, path): Optional method that returns the modification time for the specified path. @@ -354,7 +354,7 @@ raise NotImplementedError - def set_data(self, path:str, data:bytes) - None: + def set_data(self, path, data): Optional method which writes data to a file path. Implementing this method allows for the writing of bytecode files. Modified: python/branches/py3k/Lib/importlib/abc.py == --- python/branches/py3k/Lib/importlib/abc.py (original) +++ python/branches/py3k/Lib/importlib/abc.py Thu Jan 13 03:31:25 2011 @@ -18,7 +18,7 @@ Abstract base class for import loaders. @abc.abstractmethod - def load_module(self, fullname:str) - types.ModuleType: + def load_module(self, fullname): Abstract method which when implemented should load a module. raise NotImplementedError @@ -28,7 +28,7 @@ Abstract base class for import finders. @abc.abstractmethod - def find_module(self, fullname:str, path:[str]=None) - Loader: + def find_module(self, fullname, path=None): Abstract method which when implemented should find a module. raise NotImplementedError @@ -47,7 +47,7 @@ @abc.abstractmethod - def get_data(self, path:str) - bytes: + def get_data(self, path): Abstract method which when implemented should return the bytes for the specified path. raise NotImplementedError @@ -63,19 +63,19 @@ @abc.abstractmethod - def is_package(self, fullname:str) - bool: + def is_package(self, fullname): Abstract method which when implemented should return whether the module is a package. raise NotImplementedError @abc.abstractmethod - def get_code(self, fullname:str) - types.CodeType: + def get_code(self, fullname): Abstract method which when implemented should return the code object for the module raise NotImplementedError @abc.abstractmethod - def get_source(self, fullname:str) - str: + def get_source(self, fullname): Abstract method which should return the source code for the module. raise NotImplementedError @@ -94,7 +94,7 @@ @abc.abstractmethod - def get_filename(self, fullname:str) - str: + def get_filename(self, fullname): Abstract method which should return the value that __file__ is to be set to. raise NotImplementedError @@ -117,11 +117,11 @@ - def path_mtime(self, path:str) - int: + def path_mtime(self, path): Return the modification time for the path. raise NotImplementedError - def set_data(self, path:str, data:bytes) - None: + def set_data(self, path, data): Write the bytes to the path (if possible). Any needed intermediary directories are to be created. If for some @@ -170,7 +170,7 @@ raise NotImplementedError @abc.abstractmethod - def source_path(self, fullname:str) - object: + def source_path(self, fullname): Abstract method which when implemented should return the path to the source code for the module. 
raise NotImplementedError @@ -279,19 +279,19 @@ return code_object @abc.abstractmethod - def source_mtime(self, fullname:str) - int: + def source_mtime(self, fullname): Abstract method which when implemented should return the modification time for the source of the module. raise NotImplementedError @abc.abstractmethod - def bytecode_path(self, fullname:str) - object: + def bytecode_path(self, fullname): Abstract method which when implemented should return the path to the bytecode for the module. raise NotImplementedError @abc.abstractmethod - def write_bytecode(self, fullname:str, bytecode:bytes) - bool: + def write_bytecode(self, fullname, bytecode): Abstract method which when
Re: [Python-Dev] [Python-checkins] r87523 - python/branches/py3k/Doc/tutorial/interpreter.rst
It might still be worth saying something like: Note that this python file does something subtly different; the details are not included in this tutorial. On Tue, Dec 28, 2010 at 4:18 AM, georg.brandl python-check...@python.org wrote: Author: georg.brandl Date: Tue Dec 28 10:18:24 2010 New Revision: 87523 Log: Remove confusing paragraph -- this is relevant only to advanced users anyway and does not belong into the tutorial. Modified: python/branches/py3k/Doc/tutorial/interpreter.rst Modified: python/branches/py3k/Doc/tutorial/interpreter.rst == --- python/branches/py3k/Doc/tutorial/interpreter.rst (original) +++ python/branches/py3k/Doc/tutorial/interpreter.rst Tue Dec 28 10:18:24 2010 @@ -58,14 +58,6 @@ ``python -m module [arg] ...``, which executes the source file for *module* as if you had spelled out its full name on the command line. -Note that there is a difference between ``python file`` and ``python -file``. In the latter case, input requests from the program, such as calling -``sys.stdin.read()``, are satisfied from *file*. Since this file has already -been read until the end by the parser before the program starts executing, the -program will encounter end-of-file immediately. In the former case (which is -usually what you want) they are satisfied from whatever file or device is -connected to standard input of the Python interpreter. - When a script file is used, it is sometimes useful to be able to run the script and enter interactive mode afterwards. This can be done by passing :option:`-i` before the script. (This does not work if the script is read from standard ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] __file__ and bytecode-only
I understand the need to ship without source -- but why does that require supporting .pyc (or .pyo) -only? Couldn't vendors just replace the real .py files with empty files? Then no one would need the extra stat call, and no one would be bitten by orphaned .pyc files after a rename. [Yes, zips could still allow unmatched names; yes, it would be helpful if a tool were available to sync the last-modification time; yes a deprecation release should still be needed.] -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] codecs.open [was: PEP 385: the eol-type issue]
M.-A. Lemburg wrote: ... and because of this, the feature is already available if you use codecs.open() instead of the built-in open(): Neil Hodgson asked: So should I not add an issue for the basic open because codecs.open should be used for this case? In python 3, why does codecs.open even still exist? As best I can tell, codecs.open should be the same as regular open, but for a unicode file -- and all text files are treated as unicode in python 3.0 So at this point, are there any differences beyond: (a) The builtin open doesn't work on multi-byte line-endings other than the multi-character CRLF. (In other words, it goes by the traditional Operating System conventions developed when a char was a byte, but the Unicode standard allows for a few more possibilities, which are currently rare in practice.) (b) The codecs version is much slower, because it hasn't seen the optimization effort. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
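Difference (a) is easy to demonstrate, assuming a file containing a U+2028 LINE SEPARATOR: the built-in open() only breaks lines on \n, \r and \r\n, while codecs.open() reads through str.splitlines(), which also honours the extra Unicode line endings.

    import codecs

    with open("demo.txt", "w", encoding="utf-8") as f:
        f.write("one\u2028two\n")

    print(len(open("demo.txt", encoding="utf-8").readlines()))         # 1 line
    print(len(codecs.open("demo.txt", encoding="utf-8").readlines()))  # 2 lines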
[Python-Dev] standard library mimetypes module pathologically broken?
[It may be worth creating a patch; I think most of these comments would be better on the bug-tracker.] (1) In a few cases, it looked like you were changing parameter names between files and filenames. This might break code that was calling it with keyword arguments -- as I typically would for this type of function. (1a) If you are going to change the .sig, you might as well do it right, and make the default be knownfiles rather than the empty tuple. (2) The comment about why inited was set true at the beginning of the function instead of the end should probably be kept, or at least reworded. (3) Default values: (3a) Why the list of known files going back to Apache 1.2, in that order? Is there any risk in using too *new* of a MimeTypes file? I would assume that the goal is to pick up whatever changes the user has made locally, but in that case, it still makes sense to have the newest file be the last one read, in case Apache has made bugfixes. (3b) Also, this would improve cross-platform consistency; if I read that correctly, the Apache files will override the python defaults on unix or a mac, but not on windows. That will change the results on the majority of items in _common_types. (application vs text, whether to put an x- in front of the word pict.) (3c) rtf is listed in non-standard, but http://www.iana.org/assignments/media-types/ does define it. (Though whether to guess application vs text is not defined, and python chooses differently from apache.) (3d) jpg is listed as non-standard. It turns out that this is just for the inverse mapping, where image/jpg is non-standard (for image/jpeg) but that is worth a comment. (see #5) (3e) In _types_map, the lines marked duplicates are duplicate keys, not duplicate values; it would be more clear to also comment out the (first) line itself, instead of just marking it a duplicate. (Or better yet, to mention that it is just being added for the inverse mapping, if that is the case.) (4) Why bother to lazyinit?Is there any sane usecase for a MimeTypes that hasn't been inited? I see value in not reading the default files, but none in not reading at least the files that were asked for. I could see value in only partial initialization if there were several long steps, but right now, initialization is all-or-nothing. If the thing is useless without an init, then it makes sense to just get done it immediately and skip the later checks; anyone who could have actually saved time should just remove the import. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
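On point (1), the sort of caller a parameter rename would break, using the pre-patch names (the path is illustrative):

    import mimetypes

    # Keyword callers break if "files" quietly becomes "filenames" (or vice versa):
    mimetypes.init(files=["/etc/mime.types"])
    db = mimetypes.MimeTypes(filenames=("/etc/mime.types",), strict=True)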
[Python-Dev] PEP 384: Defining a Stable ABI
Martin v. Löwis wrote: - PyGetSetDef (name, get, set, doc, closure) Is it fully decided that the generally-unused closure parameter will stay until python 4? The accessor macros to these fields (Py_REFCNT, Py_TYPE, Py_SIZE) are also available to applications. There have been several experiments in memory management, ranging from not bothering to change the refcount on permanent objects like None, to proxying objects across multiple threads or processes. I also believe (but don't remember for sure) that some of the proposed Unicode (or String?) optimizations changed the memory layout a bit. So far, these have all been complicated (or slow) enough that they didn't get integrated, but if it ever happens ... I don't think it would justify python 4.0 New Python versions may introduce new slot ids, but slot ids will never be recycled. Slots may get deprecated, but continue to be supported throughout Python 3.x. Weren't there already a few ready for deprecation? Do you really want to commit to them forever? Even if you aren't willing to settle for less than 3.x from now on, it might make sense to at least start with 3.2, rather than 3.0. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
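For reference, the struct being discussed, filled in the way most extensions do it; the final closure slot is the generally-unused member in question (it lets one getter serve several attributes, but is almost always NULL). ThingObject is purely illustrative.

    static PyObject *
    thing_get_size(PyObject *self, void *closure)   /* closure: the rarely-used slot */
    {
        return PyLong_FromSsize_t(((ThingObject *)self)->size);
    }

    static PyGetSetDef thing_getset[] = {
        {"size", thing_get_size, NULL, PyDoc_STR("number of items"), NULL},
        {NULL}   /* sentinel */
    };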
[Python-Dev] PEP 383 and GUI libraries
(sent only to python-dev, as I am not a subscriber of tahoe-dev) Zooko wrote: [Tahoe] currently uses utf-8 for its internal storage (note: nothing to do with reading or writing files from external sources -- only for storing filenames in the decentralized storage system which is accessed by Tahoe clients), and we can't start putting non-utf-8-valid sequences in the filename slot because other Tahoe clients would then get a UnicodeDecodeError exception when trying to read those directories. So what do you do when someone has an existing file whose name is supposed to be in utf-8, but whose actual bytes are not valid utf-8? If you have somehow solved that problem, then you're already done -- the PEP's encoding is a no-op on anything that isn't already invalid unicode. If you have not solved that problem, then those clients will already be getting a UnicodeDecodeError; all the PEP does is make it at least possible for them to recover. ... Requirement 1 (unicode): Each filename that you see needs to be valid unicode (it is stored internally in utf-8). (repeating) What does Tahoe do if this is violated? Do you throw an exception right there and not let them copy the file to tahoe? If so, then that same error correction means that utf8b will never differ from utf-8, and you have nothing to worry about. Requirement 2 (faithful if unicode): Doesn't the PEP meet this? Requirement 3 (no file left behind): Doesn't the PEP also meet this? I thought the concern was just that the name used would not be valid unicode, unless the original name was itself valid unicode. Possible Requirement 4 (faithful bytes if not unicode, a.k.a. round-tripping): Doesn't the PEP also support this? (Only) the invalid bytes get escaped and therefore must be unescaped, but the escapement is reversible. 3. (handling collisions) In either case 2.a or 2.b the resulting unicode string may already be present in the directory. This collision is what the use of half-surrogates (as the escape characters) avoids. Such collisions can't be present unless the data was invalid unicode, in which case it was the result of an escapement (unless something other than python is creating new invalid filenames). -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
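The round-trip behaviour (requirement 4) in a few lines, using the error handler name as it eventually landed (surrogateescape; the PEP discussion calls it utf-8b): only the invalid bytes are escaped, and the escaping reverses exactly.

    raw = b"caf\xe9.txt"                           # latin-1 bytes, not valid UTF-8
    name = raw.decode("utf-8", "surrogateescape")  # 'caf\udce9.txt' -- lone surrogate marks the bad byte
    assert name.encode("utf-8", "surrogateescape") == raw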
[Python-Dev] #!/usr/bin/env python -- python3 where applicable
Jared Grubb wrote: Ok, so if I understand, the situation is: * python points to 2.x version * python3 points to 3.x version * need to be able to run certain 3k scripts from cmdline (since we're talking about shebangs) using Python3k even though python points to 2.x So, if I got the situation right, then do these same scripts understand that PYTHONPATH and PYTHONHOME and all the others are also probably pointing to 2.x code? Would it make sense to introduce PYTHON2PATH and PYTHON3PATH (or even PYTHON27PATH and PYTHON 32PATH) et al? Or is this an area where we just figure that whoever moved the file locations around for distribution can hardcode things properly? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] package resources [was: setuptools has divided the Python community]
At 11:27 PM 3/26/2009 +, Paul Moore wrote: What I'd really like is essentially some form of virtual filesystem access to stuff addressed relative to a Python package name, P.J. Eby responded: Note that relative to a *Python package name* isn't quite as useful, due to namespace packages. To be unambiguous as to the targeted resource, one needs to be able to reference a specific project, and that requires you to go off the name of a module *within* a package. For example, 'zope.somemodule' rather than just 'zope'. I would expect it to be *most* important then. If I know for sure that an entire package is all together in a single directory, I can just use that directory. If I want all xxx files used by zope, then ... I *do* want information on the duplicates, and the multiple locations. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
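For comparison, the closest thing the stdlib already offers is addressed by module/package name rather than by directory, which is exactly where the namespace-package ambiguity bites (the resource name here is made up):

    import pkgutil

    # Looks up defaults.cfg relative to wherever zope.somemodule actually lives
    # (directory, zip, ...); with a namespace package, plain "zope" would be ambiguous.
    data = pkgutil.get_data("zope.somemodule", "defaults.cfg")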
[Python-Dev] return from a generator [was:PEP 380 (yield from a subgenerator) comments]
On Thu, Mar 26, 2009 at 4:19 PM, P.J. Eby wrote: What I don't like is the confusion of adding return values to generators, at least using the 'return' statement. At Fri Mar 27 04:39:48 CET 2009, Guido van Rossum replied: I'm +1 on yield from and +0 on return values in generators.

    def g():
        yield 42
        return 43

    for x in g():
        print x   # probably expected to print 42 and then 43

I still don't see why it needs to be a return statement. Why not make the intent of g explicit, by writing either

    def g():
        yield 42
        yield 43

or

    def g():
        yield 42
        raise StopIteration(43)

-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] wait time [was: Ext4 data loss]
It is starting to look as though flush (and close?) should take an optional wait parameter, to indicate how much re-assurance you're willing to wait for. It also looks like we can't know enough to predict all sensible symbolic constants -- so instead use a floating point numeric value.

    f.flush(wait=0)   == current behavior
    f.flush(wait=1)   == Do everything you can. On a Mac, this would apparently mean (everything up to and including) fcntl(fd, F_FULLSYNC)
    f.flush(wait=0.5) == somewhere in between, depending on the operating system and file system and disk drive and other stuff the developer won't know in advance.

The exact interpretation of intermediate values might depend on the installation or even change over time; the only invariant would be that higher values are at least as safe, and lower values are at least as fast. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] wait time [was: Ext4 data loss]
On 3/12/09, Martin v. Löwis mar...@v.loewis.de wrote: It is starting to look as though flush (and close?) should take an optional wait parameter, to indicate how much re-assurance you're willing to wait for. Unfortunately, such a thing would be unimplementable on most of today's operating systems. What am I missing?

    _file = file
    class file(_file):
        ...
        def flush(self, wait=0):
            super().flush(self)
            if wait < 0.25:
                return
            if wait < 0.5 and os.fdatasync:
                os.fdatasync(self.fileno())
                return
            os.fsync(self.fileno())
            if wait < 0.75:
                return
            if os.ffullsync:
                os.ffullsync(self.fileno())

(To be honest, I'm not even seeing why it couldn't be done in Objects/fileobject.c, though I realize extension modules would need to go through the python interface to take advantage of it.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] sure [was: Ext4 data loss]
[new name instead of wait -- but certainty is too long, patience too hard to spell, etc...]

    class file(_file):
        ...
        def flush(self, sure=0):
            super().flush(self)
            if sure < 0.25:
                return
            if sure < 0.5 and os.fdatasync:
                os.fdatasync(self.fileno())
            ...

Steven D'Aprano asked: Why are you giving the user the illusion of fine control by making the wait parameter a continuous variable and then using it as if it were a discrete variable? We don't know how many possible values there will be, or whether they will be affected by environmental settings. Developers will not always know what sort of systems users will have, but they can indicate (with a ratio) where in the range (slow+safe):(fast+risky) they rate this particular flush. Before this discussion, I knew about sync, but had not paid attention even to datasync, let alone fullsync. I have no idea which additional options may be relevant in the future, or on smaller devices or other storage media. I do expect specific intermediate values (such as 0.3) to be interpreted differently on a laptop than on a desktop. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Integrate BeautifulSoup into stdlib?
Michael Foord wrote: Chris Withers wrote: ... love to see ... but ... not optimistic - python to grow a decent, cross platform, package management system As stated, this may be impossible, because of the difference in what a package should mean on Windows vs Unix. If you just mean a way to add python packages from pypi as with EasyInstall, then maybe. - the standard library to actually shrink to a point where only libraries that are not released elsewhere are included In some environments, each new component must be approved. Once python is approved, the standard library is OK, but adding 7 packages from pypi requires 7 more sets of approvals. On the other hand, if there were a way to say The PSF explicitly endorses Packages X, Y, and Z as worthy of the stdlib; they are distributed separately for administrative reasons, then the initial request could be for Python plus officially endorsed addons That said, it may make sense to just give greater prominence to existing repackagers, such as ActiveState or Enthought. If a library is well maintained then there seems to be little point in moving it into the standard library The official endorsement is in many cases more important than shared distribution. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] html5lib/BeautifulSoup (was: Integrate lxml into the stdlib? (was: Integrate BeautifulSoup into stdlib?))
Stefan Behnel wrote: I would have a hard time feeling happy if a real-world HTML parser was added to the stdlib that provides a totally different interface than the best (and fastest) XML library that the stdlib currently has. I doubt there would be any objection to someone contributing wrappers for upgrades, but I wouldn't count on them being used. lxml may well be the best choice for xml. BeautifulSoup and html5lib wouldn't even exist if that actually mattered for most of *their* use cases. Think of them more as pre-processors, like tidylib. If enough web pages were even valid HTML (let alone valid and well-formed XML), no one would have bothered to write these libraries. BeautifulSoup has the advantage of being long-proven in practice, for ugly html. (You mention an lxml feature with a similar intent, but for lxml, it is one of several addon features; for BeautifulSoup, this is the whole point.) html5lib does not have as long of a history, but it does have the advantage of being almost an endorsed standard. Much of HTML 5 is documenting the workarounds that browser makers already actually employ to handle erroneous input, so that the complexities can at least stop compounding. html5lib is intended as a reference implementation, and the w3c editor has used it to motivate changes in the specification draft. (This may make it unsuitable for inclusion in the stdlib today, because of timing issues.) In other words, it isn't just the heuristics of one particular development team; it is (modulo bugs, and after official publication) the heuristics that the major web browser makers have agreed to treat as correct in the future. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] #ifdef __cplusplus?
Alexander Belopolsky wrote: 4. Should exported symbols be always declared in headers or is it ok to just declare them as extern in .c files where they are used? Is the concern that moving them to a header makes them part of the API? In other words, does replacing

    PyObject *
    PyFile_FromString(char *name, char *mode)
    {
        extern int fclose(FILE *);
        ...
    }

with #include <stdio.h> mean that the stdio.h needs to be included from then on, even if PyFile_FromString stops relying upon it? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Merging flow
Nick Coghlan wrote: For now it looks like we might have to maintain 3.0 manually, with svnmerge only helping out for trunk-2.6 and trunk-py3k Does it make the bookkeeping horrible if you merge from trunk straight to 3.0, and then blocked svnmerged changes from propagating? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Looking for VCS usage scenarios
David Ripton wrote:

    Time for average user to check out Python sources with bzr: 10 minutes
    Time for average user to check out Python sources with git or hg: 1 minute
    Time for average user's trivial patch to be reviewed and committed: 1 year
    I love DVCS as much as the next guy, but checkout time is so not the bottleneck for this use case.

I think Paul's point is that he wants to support people who have not previously planned to contribute to python. Writing the patch may be a matter of minutes, once they implement the fix for themselves. Downloading a new VCS is a major commitment of time and disk space. (And is there setup, and dealing with proxies?) It doesn't take as long (calendar) as waiting for the review, but it takes long enough (clock) that people may not bother to do it. And if they don't, what was the point of switching to a DVCS? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] www.python.org/doc and docs.python.org hotfixed
For the search engine issue, is there any way we can tell robots to ignore the rewrite rules so they see the broken links? (although even that may not be ideal, since what we really want is to tell the robot the link is broken, and provide the new alternative) I may be missing something obvious, but isn't this the exact intent of HTTP response code 301 Moved Permanently http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.2 -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] syntax change justification
Nick Coghlan's explanation of what justifies a syntax change (most of message http://mail.python.org/pipermail/python-dev/2008-October/082831.html ) should probably be added to the standard docs/FAQs somewhere. At the moment, I'm not sure exactly where, though. At the moment, the Developer FAQ (http://www.python.org/dev/faq/) is mostly about using specific tools (rather than design philosophy), and Nick's explanation may be too detailed for the current Explanations section of www.python.org/dev/ Possibly as a Meta-PEP? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] backporting tests [was: [Python-checkins] r66863 - python/trunk/Modules/posixmodule.c]
In http://mail.python.org/pipermail/python-dev/2008-October/082994.html Martin v. Löwis wrote: So 2.6.0 will contain a lot of tests that have never been tested in a wide variety of systems. Some are incorrect, and get fixed in 2.6.1, and stay fixed afterwards. This is completely different from somebody introducing a new test in 2.6.4. It means that there are more failures in a maintenance release, not less as in the first case. If 2.6.1 has some (possibly accidental, but exposed to the users) behavior that is not a clear bug, it should be kept through 2.6.x. You may well want to change it in 2.7, but not in 2.6.4. Adding a test to 2.6.2 ensures that the behavior will not silently disappear because of an unrelated bugfix in 2.6.3. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Advice on numbers.py implementation of binary mixins.
Raymond Hettinger wrote: PEP-3141 outlines an approach to writing binary operators to allow the right operand to override the operation if the left operand inherits the operation from the ABC. Here is my first approximation at how to write them for the Integral mixins:

    class Integral(Rational):
        def __and__(self, other):
            if isinstance(other, (type(self), int, long)):   # XXX
                return int(self) & int(other)

I think for this mixin, it doesn't matter whether other is an Integral instance; it matters whether it has a more specific solution. So instead of checking with isinstance, check whether its __rand__ method is Integral.__rand__. I think you also may want to guard against incomplete right-hand operations, by doing something like replacing the simple return NotImplemented with

    try:
        val = other.__rand__(self)
        if val is not NotImplemented:
            return val
    except (TypeError, AttributeError):
        pass
    # Use the generic fallback after all
    return int(self) & int(other)

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Proposal: add odict to collections
The odict (as proposed here, ordered on time of key insertion) looks like a close match to the dlict needed by some of the optimization proposals. http://python.org/dev/peps/pep-0267/ -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Betas today - I hope
On 6/12/08, Nick Coghlan [EMAIL PROTECTED] wrote: documentation patch for the language reference ... following categories: ... 2. Method lookup MAY bypass __getattribute__, shadowing the attribute in the instance dictionary MAY have ill effects. (slots such as __enter__ and __exit__ that are looked up via normal attribute lookup in CPython will fit into this category) Should this category really be enumerated? I thought that was the default meaning of __name__, so the real clarification is: (1) Requiring that the specific names in category 1 MUST be treated this way. (2) Mentioning __*__ and listing any known exceptions. (Can next be treated this way despite the lack of __*__? Is it forbidden to treat __context__ this way?) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
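Category 2 is already observable behaviour in CPython; a small demonstration of the instance-dict shadow being ignored for the implicit lookup:

    class C:
        def __len__(self):
            return 1

    c = C()
    c.__len__ = lambda: 2    # shadow the slot in the instance dictionary
    print(len(c))            # 1 -- the implicit special-method lookup bypasses the instance
    print(c.__len__())       # 2 -- an explicit attribute access still sees the shadow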
[Python-Dev] Mini-Pep: An Empty String ABC
So, apart from compatibility purposes, what is the point currently of *not* directly subclassing str? To provide your own storage format, such as a view into existing data. Whether or not this is actually practical is a different question; plenty of C code tends to assume it can use the internals of str directly, which breaks on even some subclasses. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PEP 8 vs PEP 371: Additional Discussion
Guido van Rossum wrote: I consider multiprocessing a new API -- while it bears a superficial resemblance with the threading API the similarities are just that, and it should not be constrained by mistakes in that API. The justification for including it is precisely that it is *not* a new API. For multiple processes in general, there are competing APIs, which may well be better. The advantage of this API is that (in many cases) it is a drop-in replacement for threading. If that breaks, then there really isn't any reason to include it in the stdlib yet. This doesn't prevent changing the joint API to conform with PEP 8. But why clean this module while leaving the rest of the stdlib? Because there is a volunteer only makes sense if changes to the other modules would also be welcomed. Is there some reason to believe that changes in the threading API are much less disruptive than changes elsewhere in the stdlib? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Iterable String Redux (aka String ABC)
On 5/27/08, Benji York wrote: Guido van Rossum wrote: Armin Ronacher wrote: Basically *the* problematic situation with iterable strings is something like a `flatten` function that flattens out every iterable object except of strings. I'm not against this, but so far I've not been able to come up with a good set of methods to endow the String ABC with. Another problem is that not everybody draws the line in the same place -- how should instances of bytes, bytearray, array.array, memoryview (buffer in 2.6) be treated? Maybe the opposite approach would be more fruitful. Flattening is about removing nested containers, so perhaps there should be an ABC that things like lists and tuples provide, but strings don't. No idea what that might be. It isn't really stringiness that matters, it is that you have to terminate even though you still have an iterable container. The test is roughly (1==len(v) and v[0]==v), except that you want to stop a layer sooner. Guido had at least a start in Searchable, back when ABC were still in the sandbox: http://svn.python.org/view/sandbox/trunk/abc/abc.py?rev=55321&view=auto Searchable represented the fact that (x in c) != (x in iter(c)) because of sequence searches like ("Error" in results) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
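For concreteness, the kind of flatten being talked about, with the string special case spelled out by hand -- exactly the test an ABC (or a Searchable/atomic marker) would replace:

    def flatten(items):
        for v in items:
            if isinstance(v, str):    # stop a layer sooner for string-like leaves
                yield v
                continue
            try:
                it = iter(v)
            except TypeError:
                yield v               # not iterable at all: a plain leaf
            else:
                for x in flatten(it):
                    yield x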
[Python-Dev] Encoding detection in the standard library?
David Wolever wrote: IMO, encoding estimation is something that many web programs will have to deal with, so it might as well be built in; I would prefer the option to run `text=input.encode('guess')` (or something similar) than relying on an external dependency or worse yet using a hand-rolled algorithm The (still draft) html5 spec is trying to get error-correction standardized, so it includes all sort of if this fails, do X. Encoding detection will be standardized, so there will be an external standard that we can reference. http://dev.w3.org/html5/spec/Overview.html#determining Note that this portion of the spec is probably not stable yet, as there was some new analysis on which wrong answers provided better results on real world web pages. e.g., http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-March/014127.html http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-March/014190.html There was also a recent analysis of how many characters it takes to sniff successfully X% of the time on today's web, though I'm not finding it at the moment. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] unscriptable?
I dispute this. Indices aren't necessarily numeric (think of an A-Z file). Python has recently added an __index__ slot which means "as an integer, and I really am an integer; I'm not just rounding like int(3.4) would do". So in the context of python, an index is numeric, whereas subscript has already been used for hashtables with arbitrary keys. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
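The distinction in a few lines: __index__ means "use me as an exact integer" wherever a numeric index is required, while dict subscripting accepts arbitrary hashable keys (the Nth class is illustrative).

    class Nth:
        def __init__(self, n): self.n = n
        def __index__(self): return self.n

    letters = ["a", "b", "c", "d"]
    print(letters[Nth(2)])   # 'c' -- sequences demand something integer-like
    # letters[2.5] raises TypeError: float deliberately has no __index__
    print({"A": 1}["A"])     # dict subscripts are arbitrary hashable keys, not indices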
[Python-Dev] windows (was: how to easily consume just the parts of eggs that are good for you)
Are the Linux users happy with having a Python package manager that ignores RPM/apt? Why should Windows users be any happier? Because, as you noted, the add/remove programs application is severely limited. I've read one too many Windows is so broken that people who use it obviously don't care about doing things right postings this week I'm honestly not sure that such fine-grained control is the right user interface, particularly for a non-shared system. But even if it were, Windows doesn't really have it, and it isn't so valuable that a solution which works only for python could do much better than the existing 3rd-party setup tools. As a windows user, I don't want python packages showing up in the add/remove programs list, because it won't help me, and will make the few times I do use that tool even more awkward. That said, I agree that if python does package management, offering windows users the choice of using that application is probably a good idea. The catch is that package managers seem to offer far more fine-grained power (even without dependency resolution) than windows. Duplicating this would add lots of complexity just for windows -- and still might not be all that useful. I'm already used to looking for an uninstall.exe in the directory of anything I can actually uninstall, and accepting that most things just don't go away cleanly. As a programmer, this feels wrong, but ... it is probably a good tradeoff for the time I don't want to spend maintaining things. If I really wanted a fancy tool that took care of dependencies and alternate versions, I would be willing to run something python-specific, or to treat each package as a subcomponent that I managed through Change an existing program applied to python. But realistically, I don't see such a tool being used often enough to justify inclusion in the core. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Proposal: from __future__ import unicode_string_literals
Maybe it's not apparent to people that hasn't developed in that kind of environment, and I'm sorry I'm not able to make this clearer. I think I understand the issue. Some contributors will be running under 2.6, others will be running under 3.0. Either the code forks, or one of them is working with (and developing patches against) the result of a compilation step, instead of against the original source code. For example, if I'm using the (real source) py2.6 code, and I create a patch that works for me, it is ready for testing and submission. If I'm using the (generated) py3 code, then I first have to get a copy of the (source) 2.6, figure out how I *would* have written it there, then keep tweaking it so that the generator eventually puts out ... what I had originally written by hand. My (working in 3.0) task would be easier if there is also a 3to2 (so that I can treat my own code as the source), but then entire files will do flip-flops on a regular basis (depening on which version was generated), unless 2to3 and 3to2 somehow create a perfect round-trip. And that compile step -- it can be automated, but I suspect most python projects don't yet have a good place to put the hooks, simply because they haven't needed to in the past. The end result is that the barrier to contributions becomes much higher for people working in at least one of the versions. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] r61709 - python/trunk/Doc/library/functions.rst python/trunk/Doc/library/future_builtins.rst python/trunk/Doc/library/python.rst
What is the precise specification of the builtin print function. Does it call str, or does it just behave as if the builtin str had been called? In 2.5, the print statement ignores any overrides of the str builtin, but I'm not sure whether a _function_ should -- and I do think it should be specified. -jJ On 3/21/08, georg.brandl [EMAIL PROTECTED] wrote: New Revision: 61709 == +++ python/trunk/Doc/library/functions.rst Fri Mar 21 20:37:57 2008 @@ -817,6 +817,33 @@ ... +.. function:: print([object, ...][, sep=' '][, end='\n'][, file=sys.stdout]) ... + All non-keyword arguments are converted to strings like :func:`str` does ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
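The ambiguity can be shown directly (the override is purely illustrative); current CPython converts at the C level, so the shadowed builtin is ignored -- which is the behaviour the docs arguably ought to spell out one way or the other:

    import builtins

    builtins.str = lambda obj: "SHADOWED"   # illustrative override of the builtin

    print(42)        # CPython prints 42: the conversion does not consult the name "str"
    print(str(42))   # prints SHADOWED: an explicit call does go through name lookup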
[Python-Dev] windows standard [was: PEP 365 (Adding the pkg_resources module)]
Terry Reedy The standard (and to me, preferable) way of dealing with such things is to have an 'installation manager' that can reinstall as well as delete and that has a check box for various things to delete. This is what Python needs. Paul Moore: I'd dispute strongly that this is a standard. It may be preferable, but I'm not sure where you see evidence of it being a standard. When I install a large program (such as developer tools, or python itself) on Windows, I expect a choice of default or custom. When I choose custom, I expect a list of components, which can be chosen, not chosen, or mixed (meaning that it has subcomponents, only some of which are chosen). The whole thing only shows up once in Add/Remove programs. If I select it, I do get options to Change or Repair. These let me change my mind on which subcomponents are installed. If I install python and then separately install Zope, it may or may not make sense for Zope to be listed separately as a program to Add or Remove. It does not make sense (to me anyhow) have several individual packages within Zope each listed independently at the Windows level. (Though, to be fair, many (non-python) applications *do* make more than one entry.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] logging shutdown (was: Re: r61431 - python/trunk/Doc/library/logging.rst)
On 3/19/08, Vinay Sajip [EMAIL PROTECTED] wrote: I think (repeatedly) testing an app through IDLE is a reasonable use case. [other threads may still have references to loggers or handlers] Would it be reasonable for shutdown to remove logging from sys.modules, so that a rerun has some chance of succeeding via its own import? I'm not sure that would be enough in the scenario I mentioned above - would removing a module from sys.modules be a guarantee of removing it from memory? No. It will explicitly not be removed from memory while anything holds a live reference. Removing it from sys.modules just means that the next time a module does import logging, the logging initialization code will run again. It is true that this could cause contention if the old version is still holding an exclusive lock on some output file. It's safer, in my view, for the developer of an application to do cleanup of their app if they want to test repeatedly in IDLE. Depending on the issue just fixed, the app may not have a clean shutdown. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] logging shutdown (was: Re: [Python-checkins] r61431 - python/trunk/Doc/library/logging.rst)
I think (repeatedly) testing an app through IDLE is a reasonable use case. Would it be reasonable for shutdown to remove logging from sys.modules, so that a rerun has some chance of succeeding via its own import? -jJ On 3/16/08, vinay.sajip [EMAIL PROTECTED] wrote: Author: vinay.sajip Date: Sun Mar 16 22:35:58 2008 New Revision: 61431 Modified: python/trunk/Doc/library/logging.rst Log: Clarified documentation on use of shutdown(). Modified: python/trunk/Doc/library/logging.rst == --- python/trunk/Doc/library/logging.rst(original) +++ python/trunk/Doc/library/logging.rstSun Mar 16 22:35:58 2008 @@ -732,7 +732,8 @@ .. function:: shutdown() Informs the logging system to perform an orderly shutdown by flushing and - closing all handlers. + closing all handlers. This should be called at application exit and no + further use of the logging system should be made after this call. .. function:: setLoggerClass(klass) ___ Python-checkins mailing list [EMAIL PROTECTED] http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
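The suggestion in concrete form (whether it is actually enough when other threads still hold handler references is the open question):

    import logging, sys

    logging.shutdown()                  # flush and close all handlers
    sys.modules.pop("logging", None)    # proposed: forget the module object...
    import logging                      # ...so the next run re-initializes from scratch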
[Python-Dev] Py_CLEAR to avoid crashes
A simple way to do this would be to push objects whose refcounts had reached 0 onto a list instead of finalizing them immediately, and have PyEval_EvalFrameEx periodically swap in a new to-delete list and delete the objects on the old one. Some of the memory management threads discussed something similar to this, and pointed to IBM papers on Java. By adding states like tenatively finalizable, the cost of using multiple processors was reduced. The down side is that objects which could be released (and recycled) immediately won't be -- which slows down a fair number of real-world programs that are used to the CPython refcount model. If the resource not being immediately released is scarce (such as file handles), it gets even worse. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
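The scarce-resource cost in miniature: with CPython's immediate refcounting the handle is released at the del, while a deferred to-delete list would keep it open until the next drain.

    f = open("scratch.tmp", "w")
    f.write("data")
    del f
    # Today: the last reference dies here, so the buffer is flushed and the OS
    # handle closed immediately.  Under a deferred-finalization scheme both
    # would wait for the eval loop to drain its to-delete list.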
Re: [Python-Dev] [Python-3000] Rounding Decimals
On 1/12/08, Guido van Rossum [EMAIL PROTECTED] wrote: On Jan 12, 2008 5:09 PM, Jeffrey Yasskin [EMAIL PROTECTED] wrote: During the discussion about the new Rational implementation (http://bugs.python.org/issue1682), Guido and Raymond decided that Decimal should not implement the new Real ABC from PEP 3141. So I've closed http://bugs.python.org/issue1623 and won't be pursuing any of the extra rounding methods mentioned on this thread. Well, I didn't really decide anything. I suggested that if the developers of Decimal preferred it, it might be better if Decimal did not implement the Real ABC, and Raymond said he indeed preferred it. I read his objections slightly differently. He is very clear that Decimal itself should be restricted to the standard, and therefore should not natively implement the extensions. But he also said that it might be reasonable for another package to subset or superset it in a friendlier way. numbers.py is a different module, which must be explicitly imported. If the objection is that decimal.Decimal(43.2).imag would work (instead of throwing an exception) only when numbers.py has already been imported, then ... well, that confusion is inherent in the abstract classes. Or is the problem that it *still* wouldn't work, without help from the decimal module itself? In that case, 3rd party registration is fairly limited, and this might be a good candidate for trying to figure out ABCs and adapters *should* work together. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
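The third-party registration path mentioned above, for concreteness: register() only affects isinstance/issubclass checks and adds no attributes, which is why cooperation from the decimal module itself is the sticking point.

    import numbers
    from decimal import Decimal

    numbers.Real.register(Decimal)                     # registration by a third party
    print(isinstance(Decimal("43.2"), numbers.Real))   # True once registered
    # ...but whether Decimal("43.2").imag then works still depends entirely on
    # what the decimal module itself chooses to implement.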
[Python-Dev] Coverity Scan, Python upgraded to rung 2
Neal Norwitz wrote: For codeobject.c, line 327 should not be reachable. ... Christian Heimes wrote: Please suppress the warning. I removed the last two lines and GCC complained ... Either way, it would be worth adding a comment to the source code so this doesn't come up again. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Non-string keys in namespace dicts
PJE wrote: Isn't the simplest way to cache attribute lookups to just have a cache dictionary in the type, and update that dictionary whenever a change is made to a superclass? That's essentially how __slotted__ attribute changes on base classes work now, isn't it? Neil Toronto wrote: The nice thing about caching pointers to dict entries is that they don't change as often as values do. Is this really true for namespaces? I was thinking that the typical namespace usage is a bunch of inserts (possibly with lookups mixed in), followed by never changing it again until it is deallocated. There are fewer ways to invalidate an entry pointer: inserting set, resize, clear, and delete. I'm not sure how to resize without an inserting set. I'm not sure I've ever seen clear on a namespace. (I have seen it on regular dicts being used as a namespace, such as tcl config options.) I have seen deletes (deleting a temp name) and non-inserting sets ... but they're both rare enough that letting them force the slow path might be a good trade, if the optimization is otherwise simpler. Rare updating also means it's okay to invalidate the entire cache rather than single entries. Changing __bases__ seems to do that already. (See http://svn.python.org/view/python/trunk/Objects/typeobject.c?rev=59106&view=markup functions like update_subclasses.) So I think an alternate version of PJE's question would be: Why not just extend that existing mechanism to work on non-slot, non-method attributes? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] SSL 1.7
Bill Janssen wrote: One thing to watch out for: ssl.SSLError can't inherit from socket.error, as it does in 2.6+, Why not? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
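For context, a sketch of what the 2.6+ arrangement buys callers: because ssl.SSLError inherits from socket.error (which is an alias of OSError on Python 3.3+), one except clause covers both plain-socket and TLS failures:

    import socket
    import ssl

    print(issubclass(ssl.SSLError, socket.error))   # True on 2.6+ / 3.x

    try:
        raise ssl.SSLError(1, "handshake failure")
    except socket.error as exc:                      # catches the SSL error too
        print("caught via socket.error:", exc)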
[Python-Dev] urllib exception compatibility
urllib goes to some trouble to ensure that it raises IOError, even when the underlying exception comes from another module.[*] I'm wondering if it would make sense to just have those modules' exceptions inherit from IOError. In particular, should socket.error, ftp.Error and httplib.HTTPException (used in Py3K) inherit from IOError? I'm also wondering whether it would be acceptable to change the details of the exceptions. For example, could raise IOError, ('ftp error', msg), sys.exc_info()[2] be reworded, or is there too much risk that someone is checking for an errno of 'ftp error'? [*] This isn't a heavily tested path; some actually fail with a TypeError since 2.5, because IOError no longer accepts argument tuples longer than 3. http://bugs.python.org/issue1209 Fortunately, this makes me less worried about changing the contents of the specific attributes to something more useful... -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
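A sketch of the hierarchy change being proposed; the FTPError name here is hypothetical (today's ftplib.Error does not inherit from IOError in the versions under discussion):

    # Hypothetical: if the lower-level exception were an IOError subclass,
    # urllib would not need to catch it and re-raise a hand-built IOError.
    class FTPError(IOError):
        pass

    try:
        raise FTPError("ftp error", "login failed")
    except IOError as exc:
        print("caught as IOError:", exc.args)   # ('ftp error', 'login failed')

Callers who already catch IOError would keep working; the question above is whether anyone depends on the exact args/errno of the rewrapped exception.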
[Python-Dev] PEP 362: Signature objects
Brett Cannon wrote: A Signature object has the following structure attributes: * name : str Name of the function. This is not fully qualified because function objects for methods do not know the class they are contained within. This makes functions and methods indistinguishable from one another when passed to decorators, preventing proper creation of a fully qualified name. (1) Would this change with the new static __class__ attribute used for the new super? (2) What about functions without a name? Do you want to say str or NoneType, or is that assumed? (3) Is the Signature object live or frozen? (name is writable ... will the Signature object reflect the new name, or the name in use at the time it was created?) * var_annotations: dict(str, object) Dict that contains the annotations for the variable parameters. The keys are of the variable parameter with values of the Is there a special key for the ``->`` return annotation, or is that available as a separate property? The structure of the Parameter object is: * name : (str | tuple(str)) The name of the parameter as a string if it is not a tuple. If the argument is a tuple then a tuple of strings is used. What is used for unnamed arguments (typically provided by C)? I like None, but I see the arguments for both and missing attribute. * position : int The position of the parameter within the signature of the function (zero-indexed). For keyword-only parameters the position value is arbitrary while not conflicting with positional parameters. Is this just a property/alias for signature.parameters.index(self) ? What should a parameter object not associated with a specific signature return? -1, None, or missing attribute? Is there a way to get the associated Signature, or is it compiled out when the Signature and its child Parameters are first constructed? (I think the position property is the only attribute that would use it, unless you want some of the other attributes -- like annotations -- to be live.) ... I would also like to see a * value : object attribute; this would be missing on most functions, but might be filled in on a Signature representing a closure, or an execution frame. When to construct the Signature object? --- The Signature object can either be created in an eager or lazy fashion. In the eager situation, the object can be created during creation of the function object. Since most code doesn't need it, I would expect it to be optimized out at least as often as docstrings are. In the lazy situation, one would pass a function object to a function and that would generate the Signature object and store it to ``__signature__`` if needed, and then return the value of ``__signature__``. Why store it? Do you expect many use cases to need the signature more than once (but not to save it themselves)? If there is a __signature__ attribute on an object, you have to specify whether it can be replaced, which parts of it are writable, how that will affect the function's own behavior, etc. I also suspect it might become a source of heisenbugs, like the reference leaks that were really DUMMY items in a dict. If the Signature is just a snapshot no longer attached to the original function, then people won't expect changes to the Signature to affect the callable. Should ``Signature.bind`` return Parameter objects as keys? (see above) If a Signature is a snapshot (rather than a live part of the function), then it might make more sense to just add a value attribute to Parameter objects. Provide a mapping of parameter name to Parameter object? 
While providing access to the parameters in order is handy, it might also be beneficial to provide a way to retrieve Parameter objects from a Signature object based on the parameter's name. Which style of access (sequential/iteration or mapping) will influence how the parameters are stored internally and whether __getitem__ accepts strings or integers. I think it should accept both. What storage mechanism to use is an internal detail that should be left to the implementation. I wouldn't expect Signature inspection to be inside a tight loop anyhow, unless it were part of a Generic Function dispatch engine ... and those authors (just PJE?) can optimize on what they actually need. Remove ``has_*`` attributes? If an EAFP approach to the API is taken, Please leave them; it is difficult to catch Exceptions in a list comprehension. Have ``var_args`` and ``_var_kw_args`` default to ``None``? Makes sense to me, particularly since it should probably be consistent with function name, and that should probably be None. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev
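For reference, a short sketch of this kind of introspection using the inspect.signature API that eventually grew out of PEP 362 (attribute names below follow the final API, not the draft quoted above):

    import inspect

    def greet(name, greeting="hello", *args, punctuation="!", **kwargs) -> str:
        return "%s, %s%s" % (greeting, name, punctuation)

    sig = inspect.signature(greet)
    print(list(sig.parameters))        # ['name', 'greeting', 'args', 'punctuation', 'kwargs']
    param = sig.parameters["punctuation"]
    print(param.kind, param.default)   # keyword-only parameter, default "!"
    print(sig.return_annotation)       # the "->" annotation is a separate attribute

    bound = sig.bind("world", punctuation="?")
    bound.apply_defaults()
    print(bound.arguments)             # ordered mapping of parameter name -> bound value

In the final API the parameters ended up exposed as an ordered mapping keyed by name, and bind() returns name-keyed arguments rather than Parameter-keyed ones.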
Re: [Python-Dev] Add a -z interpreter flag to execute a zip file
On 7/14/07, Andy C [EMAIL PROTECTED] wrote: On 7/13/07, Jim Jewett [EMAIL PROTECTED] wrote: while I think it would be a bad practice to import __main__, I have seen it recommended as the right place to store global (cross-module) settings. Where? People use __main__.py now? No; they don't use a file. It is treated as a strictly dynamic scratchpad, and they do things like:

    import __main__
    __main__.DEBUGLEVEL = 5

    if __main__.myvar:
        ...

-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Add a -z interpreter flag to execute a zip file
Andy C wrote: ... a .zip file with a __zipmain__.py module at its root? Why not just an __init__.py, which you would normally execute if you tried to import/run a directory? * Magically looking at the first argument to see if it's a zip file seems problematic to me. I'd rather be explicit with the -z flag. Likewise, I'd rather be explicit and call it __zipmain__ rather than __main__. Treating zip files (and only zip files) as a special case equivalent to uncompressed files seems like a wart; I would prefer not to special-case zips any more than they already are. If anything, I would like to see the -m option enhanced so that if it gets a recognized collection file type (including a directory or zip), it does the right thing. Whether that actually makes sense, or defeats the purpose of the -m shortcut, I'm not sure. [on using __main__ instead of __init__ or __zipmain__] __main__.py? : ) If someone tries to import __main__ from another module in the program, won't that result in an infinite loop? It doesn't today; it does use circular imports, which can be a problem. while I think it would be a bad practice to import __main__, I have seen it recommended as the right place to store global (cross-module) settings. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
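For context, a sketch of the behaviour that eventually shipped (Python 2.6+): a __main__.py at the root of the archive is the entry point, with no extra flag and no __zipmain__:

    import subprocess
    import sys
    import zipfile

    # Build a tiny "executable" archive with __main__.py at its root.
    with zipfile.ZipFile("app.zip", "w") as zf:
        zf.writestr("__main__.py", 'print("running from inside the zip")\n')

    # Equivalent to typing "python app.zip" at a shell prompt.
    subprocess.call([sys.executable, "app.zip"])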
[Python-Dev] [RFC] urlparse - parse query facility
a) import cgi and call cgi module's parse_qs. [circular imports] or b) Implement a stand-alone query parsing facility in urlparse *AS IN* cgi module. Assuming (b), please remove the (code for the) parsing from the cgi module, and just import it back from urlparse (or urllib). Since cgi already imports urllib (which imports urlparse), this isn't adding any dependencies -- but it keeps the code in a single location. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
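For reference, this is essentially how it later landed: the query-string parser lives in the URL-parsing module and cgi keeps only a deprecated wrapper. A quick sketch with the Python 3 names (urllib.parse there; urlparse in the 2.x line discussed above):

    from urllib.parse import parse_qs

    # The same helper the cgi module used to own; one implementation, one home.
    print(parse_qs("a=1&a=2&b=3"))   # {'a': ['1', '2'], 'b': ['3']}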
[Python-Dev] svn viewer confused
Choosing a revision, such as http://svn.python.org/view/python/trunk/Objects/?rev=55606&sortby=date&view=log does not lead to the correct generated page; it either times out or generates a much older changelog. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Wither PEP 335 (Overloadable Boolean Operators)?
Greg, If you do update this PEP, please update the __not__ portion as well, at least regarding possible return values. It currently says that __not__ can return NotImplemented, which falls back to the current semantics. (Why? to override an explicit __not__? Then why not just put the current semantics on __object__, and override by calling that directly?) It does not yet say what will happen for objects that return something else outside of {True, False}, such as:

    class AntiBool(object):
        def __not__(self):
            return self

Is that OK, because not not X should now be spelled bool(x), and you haven't allowed the overriding of __bool__? (And, if so, how does that work in Py3K?) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
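For contrast, a sketch of the current (non-PEP-335) rules: ``not`` funnels through __bool__ (__nonzero__ in 2.x) and always produces a real bool, so nothing like AntiBool is expressible today:

    class Weird(object):
        def __bool__(self):       # __nonzero__ in Python 2
            return False          # must return a bool; returning self raises TypeError

    w = Weird()
    print(not w)          # True -- 'not' simply negates bool(w)
    print(type(not w))    # <class 'bool'>, never an instance of Weird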
[Python-Dev] The docs, reloaded
Martin v. Löwis schrieb: That docutils happens to be written in Python should make little difference - it's *not* part of the Python language project, and is just a tool for us, very much like latex and latex2html. Not entirely. When I first started looking at python, I read a lot of documentation. Now I don't read it so much; the time when I could easily suggest doc changes without explicitly setting time aside has passed. At that time, the barriers to submitting were fairly large; these are the ones I remember: (1) Not realizing that I *could* submit changes, and they would be welcome. (2) Not understanding it well enough to document it correctly. (3) Not having easy access to the source -- I didn't want to retype it, or to edit html only to find out it was maintained in some other format. Even once I found the cvs repository, the docs weren't in the main area. (4) Not having an easy way to submit the changes quickly. (5) Wanting to check my work, in case I was wrong. I have no idea how to fix (1) and (2). Putting them on a wiki improves the situation with (3) and (4). (5) is at least helped by keeping the formatting requirements as simple as possible (not sure if ReST does this or not) and by letting me verify them before I submit. Getting docutils is already a barrier; I would like to see a stdlib module (not a script hidden off to the side) for verification and conversion. I don't think I installed docutils myself until I started to write a PEP. But once I did download and install and figure out how to call it ... at least it generally worked, and ran with something (python) I was already using. Getting a 3rd party tool that ends up requiring fourth party tools (such as LaTeX, but then I need a viewer, or the old toolchain that required me to install Perl) ... took longer than my attention span. This was despite the fact that I had already used all the needed tools in previous years; they just weren't installed on the machines I had at the time ... and installing them on windows was something that would *probably* work *eventually*. If I had been new to programming, it would have been even more intimidating. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] updated PEP3125, Remove Backslash Continuation
Major rewrite. The inside-a-string continuation is separated from the general continuation. The alternatives section is expanded to also list Andrew Koenig's improved inside-expressions variant, since that is a real contender. If anyone feels I haven't acknowledged their concerns, please tell me.

--

PEP: 3125
Title: Remove Backslash Continuation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007, 04-May-2007

Abstract
========

Python initially inherited its parsing from C.  While this has been generally useful, there are some remnants which have been less useful for python, and should be eliminated.  This PEP proposes elimination of terminal ``\`` as a marker for line continuation.

Motivation
==========

One goal for Python 3000 should be to simplify the language by removing unnecessary or duplicated features.  There are currently several ways to indicate that a logical line is continued on the following physical line.  The other continuation methods are easily explained as a logical consequence of the semantics they provide; ``\`` is simply an escape character that needs to be memorized.

Existing Line Continuation Methods
==================================

Parenthetical Expression - ([{}])
---------------------------------

Open a parenthetical expression.  It doesn't matter whether people view the line as continuing; they do immediately recognize that the expression needs to be closed before the statement can end.

An example using each of (), [], and {}::

    def fn(long_argname1,
           long_argname2):
        settings = {"background": "random noise",
                    "volume": "barely audible"}
        restrictions = ["Warrantee void if used",
                        "Notice must be received by yesterday",
                        "Not responsible for sales pitch"]

Note that it is always possible to parenthesize an expression, but it can seem odd to parenthesize an expression that needs them only for the line break::

    assert val > 4, (
        "val is too small")

Triple-Quoted Strings
---------------------

Open a triple-quoted string; again, people recognize that the string needs to finish before the next statement starts. ::

    banner_message = """
        Satisfaction Guaranteed,
        or DOUBLE YOUR MONEY BACK!!!

        some minor restrictions apply"""

Terminal ``\`` in the general case
----------------------------------

A terminal ``\`` indicates that the logical line is continued on the following physical line (after whitespace).  There are no particular semantics associated with this.  This form is never required, although it may look better (particularly for people with a C language background) in some cases::

    assert val > 4, \
           "val is too small"

Also note that the ``\`` must be the final character in the line.  If your editor navigation can add whitespace to the end of a line, that invisible change will alter the semantics of the program.  Fortunately, the typical result is only a syntax error, rather than a runtime bug::

    assert val > 4, \ 
           "val is too small"

    SyntaxError: unexpected character after line continuation character

This PEP proposes to eliminate this redundant and potentially confusing alternative.

Terminal ``\`` within a string
------------------------------

A terminal ``\`` within a single-quoted string, at the end of the line.  This is arguably a special case of the terminal ``\``, but it is a special case that may be worth keeping. ::

    "abd\
 def"  ==  'abd def'

+ Many of the objections to removing ``\`` termination were really just objections to removing it within literal strings; several people clarified that they want to keep this literal-string usage, but don't mind losing the general case.
+ The use of ``\`` for an escape character within strings is well known.

- But note that this particular usage is odd, because the escaped character (the newline) is invisible, and the special treatment is to delete the character.  That said, the ``\`` of ``\(newline)`` is still an escape which changes the meaning of the following character.

Alternate Proposals
===================

Several people have suggested alternative ways of marking the line end.  Most of these were rejected for not actually simplifying things.

The one exception was to let any unfinished expression signify a line continuation, possibly in conjunction with increased indentation.  This is attractive because it is a generalization of the rule for parentheses.  The