[Python-Dev] Extent of post-rc churn
http://midwinter.com/~larry/3.4.status/merge.status.html lists enough changes that it sounds more like a bugfix release than just a few last tweaks after the rc. It would probably help if the what's-new-in-rc2 document explicitly mentioned that asyncio is new and provisional in 3.4, and listed its changes in a separate subsection, so that the "final tweaks to something I might already be using" section would be less intimidating.

-jJ
[Python-Dev] DB-API v2.1 or v3 [inspired by: python 3 niggle: None < 1 raises TypeError]
I personally regret that sorting isn't safe, but that ship has sailed. There is a practical benefit in making None compare to everything, just as C and Java do with null pointers -- but it is too late to do that by default. Adding a keyword to sorted might be nice -- but then shouldn't it also be added to other sorts, and maybe to max/min? It might just be trading one sort of mess for another. What *can* reasonably be changed is the DB-API. Why not just specify that the DB type objects themselves should handle comparison to None? http://www.python.org/dev/peps/pep-0249/#type-objects-and-constructors

-jJ
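(For reference, the workaround available today without any new keyword to sorted is a key function that decides where None values go; putting them first here is just an assumption for illustration.)

    def none_first(value):
        # Group None before everything else, then compare the real values.
        return (value is not None, value)

    rows = [3, None, 1, None, 2]
    print(sorted(rows, key=none_first))   # [None, None, 1, 2, 3]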
Re: [Python-Dev] PEP 460 reboot
On Tue, Jan 14, 2014 at 3:06 PM, Guido van Rossum gu...@python.org wrote:

> Personally I wouldn't add any words suggesting or referring to the option of creating another class for this purpose. You wouldn't recommend subclassing dict for constraining the types of keys or values, would you?

Yes, and it is so clear that I suspect I'm missing some context for your question. Do I recommend that each individual application should create new concrete classes instead of just using the builtins? No. When trying to understand (learn about) the text/binary distinction, I do recommend pretending that they are represented by separate classes. Limits on the values in a bytearray are NOT the primary reason for this; the primary reason is that operations like the literal representation or the capitalize method are arbitrary nonsense unless the data happens to be representing ASCII.

    sound_sample.capitalize()  -- syntactically valid, but semantic garbage
    header.capitalize()        -- OK, which implies that the data is an instance of something more specific than bytes

Would I recommend subclassing dict if I wanted to constrain the key types? Yes -- though MutableMapping (fewer gates to guard) or the upcoming TransformDict would probably be better still. The existing dict implementation itself effectively uses (hidden, quasi-)subclasses to restrict the types of keys, strictly for efficiency (the lookdict* variants).

-jJ
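(To make the subclassing aside concrete: a minimal sketch of constraining key types behind MutableMapping -- the "fewer gates to guard" option -- rather than subclassing dict directly. The class name and the error message are invented for illustration.)

    from collections.abc import MutableMapping

    class StrKeyDict(MutableMapping):
        """Mapping that only accepts str keys; values are unrestricted."""

        def __init__(self):
            self._data = {}

        def __setitem__(self, key, value):
            if not isinstance(key, str):
                raise TypeError("keys must be str, got %r" % type(key))
            self._data[key] = value

        def __getitem__(self, key):
            return self._data[key]

        def __delitem__(self, key):
            del self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)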
Re: [Python-Dev] [Python-checkins] cpython: Close #19762: Fix name of _get_traces() and _get_object_traceback() function
Why are these functions (get_traces and get_object_traceback) private? (1) Is the whole module provisional? At one point, I had thought so, but I don't see that in the PEP or implementation. (I'm not sure that it should be provisional, but I want to be sure that the decision is intentional.) (2) This implementation does lock in certain choices about the nature of traces. (What data to include for analysis vs excluding to save memory; which events are tracked separately and which combined into a single total; organizing the data that is saved in a hash by certain keys; etc) While I would prefer more flexibility, the existing code provides a reasonable default, and I can't forsee changing traces so much that these functions *can't* be reasonably supported unless the rest of the module API changes too. (3) get_object_traceback is the killer app that justifies the specific data-collection choices Victor made; if it isn't public, the implementation starts to look overbuilt. (4) get_traces is about the only way to get at even the all the data that *is* stored, prior to additional summarization. If it isn't public, those default summarization options become even more locked in.. -jJ On Mon, Nov 25, 2013 at 3:34 AM, victor.stinner python-check...@python.orgwrote: http://hg.python.org/cpython/rev/2e2ec595dc58 changeset: 87551:2e2ec595dc58 user:Victor Stinner victor.stin...@gmail.com date:Mon Nov 25 09:33:18 2013 +0100 summary: Close #19762: Fix name of _get_traces() and _get_object_traceback() function name in their docstring. Patch written by Vajrasky Kok. files: Modules/_tracemalloc.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c --- a/Modules/_tracemalloc.c +++ b/Modules/_tracemalloc.c @@ -1018,7 +1018,7 @@ } PyDoc_STRVAR(tracemalloc_get_traces_doc, -get_traces() - list\n +_get_traces() - list\n \n Get traces of all memory blocks allocated by Python.\n Return a list of (size: int, traceback: tuple) tuples.\n @@ -1083,7 +1083,7 @@ } PyDoc_STRVAR(tracemalloc_get_object_traceback_doc, -get_object_traceback(obj)\n +_get_object_traceback(obj)\n \n Get the traceback where the Python object obj was allocated.\n Return a tuple of (filename: str, lineno: int) tuples.\n -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org https://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
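(For context on point (3): a minimal sketch of the usage pattern that makes get_object_traceback the killer app, written against the public names the module eventually exposed rather than the underscore-prefixed C-level functions discussed in this thread.)

    import tracemalloc

    tracemalloc.start(25)        # keep up to 25 frames per allocation traceback

    data = [object() for _ in range(1000)]   # an allocation we want to explain

    tb = tracemalloc.get_object_traceback(data)
    if tb is not None:           # None if the object was allocated before start()
        for line in tb.format():
            print(line)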
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
On Wed, Oct 30, 2013 at 6:02 AM, Victor Stinner victor.stin...@gmail.com wrote: 2013/10/30 Jim J. Jewett jimjjew...@gmail.com: Well, unless I missed it... I don't see how to get anything beyond the return value of get_traces, which is a (time-ordered?) list of allocation size with then-current call stack. It doesn't mention any attribute for indicating that some entries are de-allocations, let alone the actual address of each allocation. get_traces() does return the traces of the currently allocated memory blocks. It's not a log of alloc/dealloc calls. The list is not sorted. If you want a sorted list, use take_snapshot.statistics('lineno') for example. Any list is sorted somehow; I had assumed that it was defaulting to order-of-creation, though if you use a dict internally, that might not be the case. If you return it as a list instead of a dict, but that list is NOT in time-order, that is worth documenting Also, am I misreading the documentation of get_traces() function? Get traces of memory blocks allocated by Python. Return a list of (size: int, traceback: tuple) tuples. traceback is a tuple of (filename: str, lineno: int) tuples. So it now sounds like you don't bother to emit de-allocation events because you just remove the allocation from your internal data structure. In other words, you provide a snapshot, but not a history -- except that the snapshot isn't complete either, because it only shows things that appeared after a certain event (the most recent enablement). I still don't see anything here(*) that requires even saving the address, let alone preventing re-use. (*) get_object_traceback(obj) might require a stored address for efficiency, but the base functionality of getting traces doesn't. I still wouldn't worry about address re-use though, because the address should not be re-used until the object has been deleted -- and is no longer available to be passed to get_object_traceback. So the worst that can happen is that an object which was not traced might return a bogus answer instead of failing. In that case, I would expect disabling (and filtering) to stop capturing new allocation events for me, but I would still expect tracemalloc to do proper internal maintenance. tracemalloc has an important overhead in term of performances and memory. The purpose of disable() is to... disable the module, to remove completely the overhead. ... Why would you like to keep traces and disable the module? Because of that very overhead. I think my use typical use case would be similar to Kristján Valur's, but I'll try to spell it out in more detail here. (1) Whoa -- memory hog! How can I fix this? (2) I know -- track all allocations, with a traceback showing why they were made. (At a minimum, I would like to be able to subclass your tool to do this -- preferably without also keeping the full history in memory.) (3) Oh, maybe I should skip the ones that really are temporary and get cleaned up. (You make this easy by handling the de-allocs, though I'm not sure those events get exposed to anyone working at the python level, as opposed to modifying and re-compiling.) (4) hmm... still too big ... I should use filters. (But will changing those filters while tracing is enabled mess up your current implementation?) (5) Argh. What I really want is to know what gets allocated at times like XXX. I can do that if times-like-XXX only ever occur once per process. I *might* be able to do it with filters. But I would rather do it by saying trace on and trace off. 
Maybe even with a context manager around the suspicious places. (6) Then, at the end of the run, I would say give me the info about how much was allocated when tracing was on. Some of that might be going away again when tracing is off, but at least I know what is making the allocations in the first place. And I know that they're sticking around long enough. Under your current proposal, step (5) turns into set filters trace on ... get_traces serialize to some other storage trace off and step (6) turns into read in from that other storage I just made up on the fly, and do my own summarizing, because my format is almost by definition non-standard. This complication isn't intolerable, but neither is it what I expect from python. And it certainly isn't what I expect from a binary toggle like enable/disable. (So yes, changing the name to clear_traces would help, because I would still be disappointed, but at least I wouldn't be surprised.) Also, if you do stick with the current limitations, then why even have get_traces, as opposed to just take_snapshot? Is there some difference between them, except that a snapshot has some convenience methods and some simple metadata? Later, he wrote: I don't see why disable() would return data. disable is indeed a bad name for something that returns data. The only reason to return data from
[Python-Dev] PEP 454 (tracemalloc) disable == clear?
> reset() function: Clear traces of memory blocks allocated by Python.

Does this do anything besides clear? If not, why not just re-use the 'clear' name from dicts?

> disable() function: Stop tracing Python memory allocations and clear traces of memory blocks allocated by Python.

I would expect disable() to stop tracing, but I would not expect it to clear out the traces it had already captured. If it has to do that, please put in some sample code showing how to save the current traces before disabling.

-jJ
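(A minimal sketch of the requested sample code, using the names the module ended up with -- start()/stop() rather than the PEP draft's enable()/disable() -- and taking a snapshot before stopping so the captured traces survive; the filename is made up.)

    import tracemalloc

    tracemalloc.start()
    # ... run the code being investigated ...

    # Save what has been captured *before* disabling, since stopping
    # tracing also clears the internal traces.
    snapshot = tracemalloc.take_snapshot()
    snapshot.dump("traces.bin")
    tracemalloc.stop()

    # Later, possibly in a different process:
    old = tracemalloc.Snapshot.load("traces.bin")
    for stat in old.statistics("lineno")[:10]:
        print(stat)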
Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
On 11/20/12, Daniel Holth dho...@gmail.com wrote: On Tue, Nov 20, 2012 at 3:58 PM, Jim J. Jewett jimjjew...@gmail.com wrote: Vinay Sajip reworded the 'Provides-Dist' definition to explicitly say: The use of multiple names in this field *must not* be used for bundling distributions together. It is intended for use when projects are forked and merged over time ... (1) Then how *should* the bundle-of-several-components case be represented? The useful way to bundle a bunch of things would be to just include them all in an executable folder or zipfile with __main__.py. PEP 426 and the package database would not get involved. The bundle would be distributed as an application you can download and use, not as an sdist on PyPI. When I look at, for example, twisted, there are some fairly fine distinctions. I can imagine some people wanting to handle each little piece differently, since that is the level at which they would be replaced by a more efficient implementation. That doesn't mean that someone using the default should have to manage 47 separate little packages individually. Also note that ZODB is mentioned as a bundling example in the current (2012-11-14) PEP. What does the PEP recommend that they do? Stop including transaction? Keep including it but stop 'Provides-Dist'-ing it? The current PEP also specifies that This field must include the project identified in the Name field, followed by the version : Name (Version). but the examples do not always include version. Why is the MUST there? Is there some way to distinguish between concrete and abstract provisions? For example, if MyMail (2012.11.10) includes 'Provides-Dist: email', does that really get parsed as 'Provides-Dist: email (2012.11.10)'? (2) How is 'Provides-Dist' different from 'Obsoletes-Dist'? The only difference I can see is that it may be a bit more polite to people who do want to install multiple versions of a (possibly abstract) package. The intent of Provides and Obsoletes is different. Obsoletes would not satisfy a requirement during dependency resolution. The RPM guide explains a similar system: As best I can understand, Obsoletes means Go ahead and uninstall that other package. Saying that *without* providing the same functionality seems like a sneaky spelling of Please break whatever relies on that other package. I'm willing to believe that there is a more useful meaning. I'm also willing to believe that they are logically redundant but express different intentions. The current wording doesn't tell me which is true. (Admittedly, that is arguably an upstream bug with other package systems, but you should still either fix it or explicitly delegate the definitions.) And as long as I'm asking for clarification, can foopkg-3.4 obsolete foopgk3.2? If not, is it a semantics problem, or just not idiomatic? If so, does it have a precise meaning, such as no longer interoperates with? And now that I've looked more carefully ... Can a Key: Value pair be continued onto another line? The syntax description under Metadata Files does not say so, but later text suggests that either leading whitespace or a leading tab specifically (from the example code) will work. (And is description a special case?) Is the payload assumed to be utf8 text? Can it be itself a mime message? Are there any restrictions on 'Name'? e.g., Can the name include spaces? line breaks? Must it be a valid python identifier? A valid python qualname? 'Version' says that it must be in the format specified in PEP 386. Unfortunately, it doesn't say which part of 386. 
Do you mean that it must be acceptable to verlib.NormalizedVersion without first having to call suggest_normalized_version? 'Summary' specifies that it must be one line. Is there a character limit, or do you just mean no line breaks? Do you want to add a Should be less than 80 characters or some such, based on typical tool presentation? Would it be worth repeating the advice that longer descriptions should go in the payload, after all headers? (Otherwise, they have to find 'Description' *and* notice that it is deprecated and figure out what to do instead.) Under 'Description', it isn't entirely clear whether what terminates the field. Multiple paragraphs suggests that there can be multiple lines, but I'm guessing that -- in practice -- they have to be a single logical line, with all but the first starting with whitespace. Under 'Classifier', is PEP 301 really the current authority for classifiers? I would prefer at least a reference to http://pypi.python.org/pypi?%3Aaction=list_classifiers demonstrating which classifiers are currently meaningful. Under 'Requires-Dist', there is an unclosed parenthesis. Does the 'Setup-Requires-Dist' set implicitly include the 'Requires-Dist' set, or should a package be listed both ways if it is required at both setup and runtime? The Summary of Differences from PEP 345 mentions changes to Requires-Dist, but I don't
Re: [Python-Dev] [Python-checkins] cpython: Close #15387: inspect.getmodulename() now uses a new
Why is inspect.getmoduleinfo() deprecated? Is it just to remove circular dependencies? FWIW, I much prefer an API like: tell_me_about(object) to one like: for test_data in (X, Y, Z): usable = tester(object, test_data) if valid(usable): return possible_results[test_data] and to me, inspect.getmoduleinfo(path) looks like the first, while checking the various import.machinery.*SUFFIXES looks like the second. -jJ On 7/18/12, nick.coghlan python-check...@python.org wrote: http://hg.python.org/cpython/rev/af7961e1c362 changeset: 78161:af7961e1c362 user:Nick Coghlan ncogh...@gmail.com date:Wed Jul 18 23:14:57 2012 +1000 summary: Close #15387: inspect.getmodulename() now uses a new importlib.machinery.all_suffixes() API rather than the deprecated inspect.getmoduleinfo() files: Doc/library/importlib.rst | 13 - Doc/library/inspect.rst| 15 --- Lib/importlib/machinery.py | 4 Lib/inspect.py | 11 +-- Misc/NEWS | 3 +++ 5 files changed, 40 insertions(+), 6 deletions(-) diff --git a/Doc/library/importlib.rst b/Doc/library/importlib.rst --- a/Doc/library/importlib.rst +++ b/Doc/library/importlib.rst @@ -533,12 +533,23 @@ .. attribute:: EXTENSION_SUFFIXES - A list of strings representing the the recognized file suffixes for + A list of strings representing the recognized file suffixes for extension modules. .. versionadded:: 3.3 +.. func:: all_suffixes() + + Returns a combined list of strings representing all file suffixes for + Python modules recognized by the standard import machinery. This is a + helper for code which simply needs to know if a filesystem path + potentially represents a Python module (for example, + :func:`inspect.getmodulename`) + + .. versionadded:: 3.3 + + .. class:: BuiltinImporter An :term:`importer` for built-in modules. All known built-in modules are diff --git a/Doc/library/inspect.rst b/Doc/library/inspect.rst --- a/Doc/library/inspect.rst +++ b/Doc/library/inspect.rst @@ -198,9 +198,18 @@ .. function:: getmodulename(path) Return the name of the module named by the file *path*, without including the - names of enclosing packages. This uses the same algorithm as the interpreter - uses when searching for modules. If the name cannot be matched according to the - interpreter's rules, ``None`` is returned. + names of enclosing packages. The file extension is checked against all of + the entries in :func:`importlib.machinery.all_suffixes`. If it matches, + the final path component is returned with the extension removed. + Otherwise, ``None`` is returned. + + Note that this function *only* returns a meaningful name for actual + Python modules - paths that potentially refer to Python packages will + still return ``None``. + + .. versionchanged:: 3.3 + This function is now based directly on :mod:`importlib` rather than the + deprecated :func:`getmoduleinfo`. .. function:: ismodule(object) diff --git a/Lib/importlib/machinery.py b/Lib/importlib/machinery.py --- a/Lib/importlib/machinery.py +++ b/Lib/importlib/machinery.py @@ -13,3 +13,7 @@ from ._bootstrap import ExtensionFileLoader EXTENSION_SUFFIXES = _imp.extension_suffixes() + +def all_suffixes(): +Returns a list of all recognized module suffixes for this process +return SOURCE_SUFFIXES + BYTECODE_SUFFIXES + EXTENSION_SUFFIXES diff --git a/Lib/inspect.py b/Lib/inspect.py --- a/Lib/inspect.py +++ b/Lib/inspect.py @@ -450,8 +450,15 @@ def getmodulename(path): Return the module name for a given file, or None. 
-info = getmoduleinfo(path) -if info: return info[0] +fname = os.path.basename(path) +# Check for paths that look like an actual module file +suffixes = [(-len(suffix), suffix) +for suffix in importlib.machinery.all_suffixes()] +suffixes.sort() # try longest suffixes first, in case they overlap +for neglen, suffix in suffixes: +if fname.endswith(suffix): +return fname[:neglen] +return None def getsourcefile(object): Return the filename that can be used to locate an object's source. diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -41,6 +41,9 @@ Library --- +- Issue #15397: inspect.getmodulename() is now based directly on importlib + via a new importlib.machinery.all_suffixes() API. + - Issue #14635: telnetlib will use poll() rather than select() when possible to avoid failing due to the select() file descriptor limit. -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe:
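(A short illustration of the APIs discussed in this thread: importlib.machinery.all_suffixes() and inspect.getmodulename() are the real 3.3 interfaces; the example paths are made up.)

    import inspect
    import importlib.machinery

    print(importlib.machinery.all_suffixes())
    # e.g. ['.py', '.pyc', '.so', ...] depending on the platform and build

    print(inspect.getmodulename("/tmp/project/widget.py"))   # 'widget'
    print(inspect.getmodulename("/tmp/project/notes.txt"))    # None (not a module suffix)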
[Python-Dev] PEP 362 minor nits
I've limited this to minor issues, but kept python-dev in the loop because some are questions, rather than merely editorial. Based on: http://hg.python.org/peps/file/tip/pep-0362.txt view pep-0362.txt @ 4466:659639095ace Committing the latest changes to PEP 362 on behalf of Yury Selivanov. author Larry Hastings la...@hastings.org dateTue, 19 Jun 2012 02:38:15 -0700 (3 hours ago) parents c1f693b39292 == 44 * return_annotation : object 45 The annotation for the return type of the function if specified. 46 If the function has no annotation for its return type, this 47 attribute is not set. I don't think you need the if specified, given the next line. Similar comments around line 89 (Parameter.default) and 93 (Parameter.annotation). 48 * parameters : OrderedDict 49 An ordered mapping of parameters' names to the corresponding 50 Parameter objects (keyword-only arguments are in the same order 51 as listed in ``code.co_varnames``). Are you really sure you want to promise the keyword-only order in the PEP? [BoundArguments] 139 * arguments : OrderedDict 140 An ordered, mutable mapping of parameters' names to arguments' values. 141 Does not contain arguments' default values. I think 141 should be reworded, but I'm not certain my wording doesn't have similar problems, so I merely offer it: arguments contains only explicitly bound parameters; parameters for which the binding relied on a default value do not appear in arguments. 142 * args : tuple 143 Tuple of positional arguments values. Dynamically computed from 144 the 'arguments' attribute. 145 * kwargs : dict 146 Dict of keyword arguments values. Dynamically computed from 147 the 'arguments' attribute. Do you want to specify which will contain the normal parameters, that could be called either way? My naive assumption would be that as much as possible gets shoved into args, but once a positional parameter is left to default, remaining parameters are stuck in kwargs. 172 - If the object is not callable - raise a TypeError 173 174 - If the object has a ``__signature__`` attribute and if it 175 is not ``None`` - return a shallow copy of it Should these two be reversed? 183 - If the object is a method or a classmethod, construct and return 184 a new ``Signature`` object, with its first parameter (usually 185 ``self`` or ``cls``) removed 187 - If the object is a staticmethod, construct and return 188 a new ``Signature`` object I would reverse these two, to make it clear that a staticmethod is not treated as a method. 194 - If the object is a class or metaclass: 195 196 - If the object's type has a ``__call__`` method defined in 197 its MRO, return a Signature for it 198 199 - If the object has a ``__new__`` method defined in its class, 200 return a Signature object for it 201 202 - If the object has a ``__init__`` method defined in its class, 203 return a Signature object for it What happens if it inherits a __new__ or __init__ from something more derived than object? 207 Note, that I would remove the comma. 235 Some functions may not be introspectable 236 237 238 Some functions may not be introspectable in certain implementations of 239 Python. For example, in CPython, builtin functions defined in C provide 240 no metadata about their arguments. Adding support for them is out of 241 scope for this PEP. Ideally, it would at least be possible to manually construct a signature, and register them in some central location. (Similar to what is done with pickle or copy.) Checking that location would then have to be an early step in the signature algorithm. 
-jJ
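(A sketch of the "manually construct a signature" idea from the note above, using the Signature and Parameter constructors that PEP 362 itself defines; attaching it via __signature__ is how the lookup algorithm would find it, though builtins currently refuse new attributes, and a central registry remained out of scope.)

    from inspect import Signature, Parameter

    # Hand-written signature for a C-level callable such as len(obj),
    # which provides no argument metadata of its own.
    len_sig = Signature([Parameter("obj", Parameter.POSITIONAL_ONLY)])

    # len.__signature__ = len_sig   # would be picked up by signature(len),
    #                               # but builtins reject new attributes
    print(len_sig)                  # e.g. "(obj, /)" on current versions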
Re: [Python-Dev] PEP 362 minor nits
On Tue, Jun 19, 2012 at 11:53 AM, Yury Selivanov yselivanov...@gmail.com wrote: Based on: http://hg.python.org/peps/file/tip/pep-0362.txt view pep-0362.txt @ 4466:659639095ace == 142 * args : tuple 143 Tuple of positional arguments values. Dynamically computed from 144 the 'arguments' attribute. 145 * kwargs : dict 146 Dict of keyword arguments values. Dynamically computed from 147 the 'arguments' attribute. Do you want to specify which will contain the normal parameters, that could be called either way? My naive assumption would be that as much as possible gets shoved into args, but once a positional parameter is left to default, remaining parameters are stuck in kwargs. Correct, we push as much as possible to 'args'. Only var_keyword and keyword_only args go to 'kwargs'. But the words positional and keyword more refer to what particularly *args and **kwargs do, disconnected from the Signature's parameters. Which is why there is some ambiguity, and I wondered if you were intentionally leaving it open or not. def f(a): pass s=signature(f) ba1=s.bind(1) Now which of the following are true? # Ambiguous parameters to args ba.args==(1,) and ba.kwargs=={} # or ambiguous parameters to kwargs ba.args=() and ba.kwargs={a:1} Does it matter how the argument was bound? As in, would ba2=s.bind(a=2) produce a different answer? If as much as possible goes to args, then: def g(a=1, b=2, c=3): pass s=signature(g) ba=s.bind(a=10, c=13) would imply ba.args == (10,) and ba.kwargs={c:13} True because a can be written positionally, but c can't unless b is, and b shouldn't be because it relied on the default value. 172 - If the object is not callable - raise a TypeError 173 174 - If the object has a ``__signature__`` attribute and if it 175 is not ``None`` - return a shallow copy of it Should these two be reversed? Do you have a use-case? Not really; the only cases that come to mind are cases where it makes sense to look at an explicit signature attribute, instead of calling the factory. 183 - If the object is a method or a classmethod, construct and return 184 a new ``Signature`` object, with its first parameter (usually 185 ``self`` or ``cls``) removed 187 - If the object is a staticmethod, construct and return 188 a new ``Signature`` object I would reverse these two, to make it clear that a staticmethod is not treated as a method. It's actually not how it's implemented. ... But that's an implementation detail, the algorithm in the PEP just shows the big picture (is it OK?). Right; implementing it in the other order is fine, so long as the actual tests for methods exclude staticmethods. But for someone trying to understand it, staticmethods sound like a kind of method, and I would expect them to be included in something that handles methods, unless they were already excluded by a prior clause. 194 - If the object is a class or metaclass: 195 196 - If the object's type has a ``__call__`` method defined in 197 its MRO, return a Signature for it 198 199 - If the object has a ``__new__`` method defined in its class, 200 return a Signature object for it 201 202 - If the object has a ``__init__`` method defined in its class, 203 return a Signature object for it What happens if it inherits a __new__ or __init__ from something more derived than object? What do you mean by more derived than object? class A: def __init__(self): pass class B(A): ... Because of the distinction between in its MRO and in its class, it looks like the signature of A is based on its __init__, but the signature of subclass B is not. 
-jJ
Re: [Python-Dev] PEP 362 minor nits
On Tue, Jun 19, 2012 at 2:10 PM, Yury Selivanov yselivanov...@gmail.com wrote: On 2012-06-19, at 12:33 PM, Jim Jewett wrote: On Tue, Jun 19, 2012 at 11:53 AM, Yury Selivanov yselivanov...@gmail.com wrote: Based on: http://hg.python.org/peps/file/tip/pep-0362.txt view pep-0362.txt @ 4466:659639095ace == 142 * args : tuple 143 Tuple of positional arguments values. Dynamically computed from 144 the 'arguments' attribute. 145 * kwargs : dict 146 Dict of keyword arguments values. Dynamically computed from 147 the 'arguments' attribute. Correct, we push as much as possible to 'args'. [examples to clarify] OK, I would just add a sentence and commented example then, something like. Arguments which could be passed as part of either *args or **kwargs will be included only in the args attribute. In the following example: def g(a=1, b=2, c=3): pass s=signature(g) ba=s.bind(a=10, c=13) ba.args (10,) ba.kwargs {'c': 13} Parameter a is part of args, because it can be. Parameter c must be passed as a keyword, because (earlier) parameter b is not being passed an explicit value. I can tweak the PEP to make it more clear for those who don't know that staticmethods are not exactly methods, but do we really need that? I would prefer it, if only because it surprised me. When do distinguish between methods, staticmethod isn't usually the odd man out. And I also agree that the implementation doesn't need to change (except to add a comment), only the PEP. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 362: 4th edition
On Sat, Jun 16, 2012 at 11:27 AM, Nick Coghlan ncogh...@gmail.com wrote: On Sat, Jun 16, 2012 at 1:56 PM, Jim J. Jewett jimjjew...@gmail.com wrote: *Every* Parameter attribute is optional, even name. (Think of builtins, even if they aren't automatically supported yet.) So go ahead and define some others that are sometimes useful. Add only stuff we know is interesting and useful. Agreed, but it doesn't have to be useful in all cases, or even available on all Signatures; if users are already prepared for missing data, it is enough that the attribute be well-defined, and be useful when it does appear. That said, it looks like is_implemented isn't sufficiently well-defined. - kind - name (should be given meaningful content, even for POSITIONAL_ONLY parameters) I agree that it *should* be given meaningful content, but I don't think the Parameter (or Signature) should be blocked without it. I also don't think that a documentation-only name that cannot be used for keyword calls should participate in equality. The existence of the parameter should participate, and its annotation is more important than usual, but its name is not. - default (may be missing, since None is allowed as a default value) - annotation (may be missing, since None is allowed as an annotation) Position is also important, but I'm not certain whether it should be represented in the Parameter, or only in the Signature. copy(source, target) copy(target, source) have different signatures, but I'm not sure whether it would be appropriate to reuse the same parameter objects. Instead of defining a BoundArguments class, just return a copy of the Signature, with value attributes added to the Parameters. No, the BoundArguments class is designed to be easy to feed to a function call as f(*args, **kwds) Why does that take a full class, as opposed to a method returning a tuple and a dict? Use subclasses to distinguish the parameter kind. Please, no, using subclasses when there is no behavioural change is annoying. A **kwargs argument is very different from an ordinary parameter. Its name doesn't matter (and therefore should not be considered in __eq__), it can only appear once per signature, and the possible location of its appearance is different. It is formatted differently (which I would prefer to do in the Parameter, rather than in Signature). It also holds very different data, and must be treated specially by several Signature methods, particularly when either validating or binding. (It is bound to a Mapping, rather than to a single value, so you have to keep it around longer and use a different bind method.) A Signature object has the following public attributes and methods: The more I try to work with it, the more I want direct references to the two special arguments (*args, **kwargs) if they exist. FWIW, the current bind logic to find them -- particularly kwargs -- seems contorted, compared to self.kwargsparameter. (3rd edition) * is_keyword_only : bool ... * is_args : bool ... * is_kwargs : bool ... (4th edition) ... Parameter.POSITIONAL_ONLY ... ... Parameter.POSITIONAL_OR_KEYWORD ... ... Parameter.KEYWORD_ONLY ... ... Parameter.VAR_POSITIONAL ... ... Parameter.VAR_KEYWORD ... This set has already grown, and I can think of others I would like to use. (Pseudo-parameters, such as a method's self instance, or an auxiliary variable.) No. This is the full set of binding behaviours. self is just an ordinary POSITIONAL_OR_KEYWORD argument (or POSITIONAL_ONLY, in some builtin cases). 
Or no longer a parameter at all, once the method is bound. Except it sort of still is. Same for the space parameter in PyPy. I don't expect the stdlib implementation to support them initially, but I don't want it to get in the way, either. A supposedly closed set gets in the way. I'm not sure if positional parameters should also check position, or if that can be left to the Signature. Positional parameters don't know their relative position, so it *has* to be left to the signature. But perhaps they *should* know their relative position. Also, positional_only, *args, and **kwargs should be able to remove name from the list of compared attributes. -jJ
Re: [Python-Dev] PEP 362: 4th edition
On Mon, Jun 18, 2012 at 10:37 AM, Yury Selivanov yselivanov...@gmail.com wrote: Jim, On 2012-06-18, at 3:08 AM, Jim Jewett wrote: On Sat, Jun 16, 2012 at 11:27 AM, Nick Coghlan ncogh...@gmail.com wrote: On Sat, Jun 16, 2012 at 1:56 PM, Jim J. Jewett jimjjew...@gmail.com wrote: Instead of defining a BoundArguments class, just return a copy of the Signature, with value attributes added to the Parameters. No, the BoundArguments class is designed to be easy to feed to a function call as f(*args, **kwds) Why does that take a full class, as opposed to a method returning a tuple and a dict? Read this thread, please: http://mail.python.org/pipermail/python-dev/2012-June/12.html I reread that. I still don't see why it needs to be an instance of a specific independent class, as opposed to a Signature method that returns a (tuple of) a tuple and a dict. ((arg1, arg2, arg3...), {key1: val2, key2: val2}) Use subclasses to distinguish the parameter kind. Please, no, using subclasses when there is no behavioural change is annoying. [Examples of how the kinds of parameters are qualitatively different.] A **kwargs argument is very different from an ordinary parameter. Its name doesn't matter (and therefore should not be considered in __eq__), The importance of its name depends hugely on the use context. In some it may be very important. The name of kwargs can only be for documentation purposes. Like an annotation or a docstring, it won't affect the success of an attempted call. Annotations are kept because (often) their entire purpose is to document the signature. But docstrings are being dropped, because they often serve other purposes. I've had far more use for docstrings than for the names of positional-only parameters. (In fact, knowing the name of a positional-only parameter has sometimes been an attractive nuisance.) And it is treated specially, along with the *args. Right -- but this was in response to Nick's claim that the distinctions should not be represented as a subclass, because the behavior wasn't different. I consider different __eq__ implementations or formatting concers to be sufficient on their own; I also consider different possible use locations and counts, different used-by-the-system attributes (name), or different value types (object vs collection) to be sufficiently behavioral. A Signature object has the following public attributes and methods: The more I try to work with it, the more I want direct references to the two special arguments (*args, **kwargs) if they exist. FWIW, the current bind logic to find them -- particularly kwargs -- seems contorted, compared to self.kwargsparameter. Well, 'self.kwargsparameter' will break 'self.parameters' collection, unless you want one parameter to be in two places. Correct; it should be redundant. Signature.kwargsparameter should be the same object that occurs as the nth element of Signature.parameters.values(). It is just more convenient to retrieve the parameter directly than it is to iterate through a collection inspecting each element for the value of a specific attribute. In fact, the check types example (in the PEP) is currently shorter and easier to read with 'Signature.parameters' than with dedicated property for '**kwargs' parameter. Agreed; the short-cuts *args and **kwargs are only useful because they are special; they aren't needed when you're doing the same thing to all parameters regardless of type. 
> And if after all you need direct references to *args or **kwargs - write a little helper, which finds them in 'Signature.parameters'.

Looking at http://bugs.python.org/review/15008/diff/5143/Lib/inspect.py you already need one in _bind; it is just that saving the info when you pass it isn't too bad if you're already iterating through the whole collection anyhow.

>> Also, positional_only, *args, and **kwargs should be able to remove name from the list of compared attributes.

> I still believe that in most contexts the name of a parameter matters (even if it's **kwargs). Besides, how can we make __eq__ configurable?

__eq__ could consult an _eq_fields attribute to see which other attributes matter -- but it makes more sense for that to be a (sub-)class property.

-jJ
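(A sketch of that "little helper", using the Parameter kinds from the 4th edition of the PEP; the helper name is invented.)

    from inspect import signature, Parameter

    def var_parameters(func):
        """Return the *args and **kwargs Parameters, or None where absent."""
        var_pos = var_kw = None
        for param in signature(func).parameters.values():
            if param.kind is Parameter.VAR_POSITIONAL:
                var_pos = param
            elif param.kind is Parameter.VAR_KEYWORD:
                var_kw = param
        return var_pos, var_kw

    def f(a, *args, **kwargs):
        pass

    print(var_parameters(f))   # the 'args' and 'kwargs' Parameter objects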
Re: [Python-Dev] (time) PEP 418 glossary V2
On Tue, Apr 24, 2012 at 6:38 AM, Victor Stinner victor.stin...@gmail.com wrote: Monotonic - This is a particularly tricky term, as there are several subtly incompatible definitions in use. Is it a definition for the glossary? One use case for a PEP is that someone who does *not* have a background in the area wants to start learning about it. Even excluding the general service of education, these people can be valuable contributors, because they have a fresh perspective. They will almost certainly waste some time retracing dead ends, but I would prefer it be out of a need to prove things to themselves, instead of just because they misunderstood. Given the amount of noise we already went through arguing over what Monotonic should mean, I think we have an obligation to provide these people with a heads-up, even if we don't end up using the term ourselves. And I think we *will* use the terms ourselves, if only as some of the raw os_clock_* choices. C++ followed the mathematical definition ... a monotonic clock only promises not to go backwards. ... additional guarantees, some ... required by the POSIX Confession: I based the above statements strictly on posts to python-dev, from people who seemed to have some experience caring about clock details. I did not find the relevant portions of either specification.[1] Every time I started to search, I got pulled back to other tasks, and the update was just delayed even longer. I still felt it was worth consolidating the state of the discussion. Anyone who feels confident in this domain is welcome to correct me, and encouraged to send replacement text. [1] Can I assume that Victor's links here are the relevant ones, or is someone aware of additional/more complete references for these specifications? http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3128.html#time.clock.monotonic http://pubs.opengroup.org/onlinepubs/95399/basedefs/time.h.html The tradeoffs often include lack of a defined Epoch_ or mapping to `Civil Time`_, I don't know any monotonic with a defined epoch or mappable to the civil time. The very basic seconds (not even milliseconds) since the beginning of 1970 fits that definition, but doesn't seem to fit what most people mean by Monotonic Clock. I'm still a little fuzzy on *why* it shouldn't count as a monotonic clock. Is it technically valid, but a lousy implementation because of insufficient precision or resolution? Is it because the functions used in practice (on a modern OS) to retrieve timestamps don't guarantee to ignore changes to the system clock? and being more expensive (in `Latency`_, power usage, or duration spent within calls to the clock itself) to use. CLOCK_MONOTONIC and CLOCK_REALTIME have the same performances on Linux and FreeBSD. Why would a monotonic clock be more expensive? For example, the clock may represent (a constant multiplied by) ticks of a specific quartz timer on a specific CPU core, and calls would therefore require synchronization between cores. I don't think that synchronizing a counter between CPU cores is something expensive. See the following tables for details: http://www.python.org/dev/peps/pep-0418/#performance Synchronization is always relatively expensive. How expensive depends on a lot of things decides before python was installed. Looking at the first table there (Linux 3.3 with Intel Core i7-2600 at 3.40GHz (8 cores)), CLOCK_MONOTONIC can be hundreds of times slower than time(), and over 50 times slower than CLOCK_MONOTONIC_COARSE. 
I would assume that CLOCK_MONOTONIC_COARSE meets the technical requirements for a monotonic clock, but does less well at meeting the actual expectations for some combination of precision/stability/resolution.

> CLOCK_MONOTONIC and CLOCK_REALTIME use the same hardware clocksource and so have the same latency depending on the hardware.

Is this a rule of thumb, or a requirement of some standard? Does the fact that Windows, Mac OS X, and GNU/Hurd don't support CLOCK_MONOTONIC indicate that there is a (perhaps informal?) specification that none of their clocks meet, or does it only indicate that they didn't like the name?

-jJ
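(A rough way to reproduce the latency comparison being discussed, using time.get_clock_info() from 3.3; the absolute numbers depend entirely on the OS and the selected clocksource.)

    import time
    import timeit

    for name in ("time", "monotonic"):
        info = time.get_clock_info(name)
        per_call = timeit.timeit("time.%s()" % name, setup="import time",
                                 number=1000000) / 1e6
        print("%-9s resolution=%g adjustable=%s ~%.0f ns/call"
              % (name, info.resolution, info.adjustable, per_call * 1e9))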
Re: [Python-Dev] [Python-checkins] peps: Note that ImportError will no longer be raised due to a missing __init__.py
On Thu, Apr 19, 2012 at 18:56, eric.smith wrote:

> +Note that an ImportError will no longer be raised for a directory
> +lacking an ``__init__.py`` file. Such a directory will now be imported
> +as a namespace package, whereas in prior Python versions an
> +ImportError would be raised.

Given that there is no way to modify the __path__ of a namespace package (short of restarting python?), *should* it be an error if there is exactly one directory? Or is that just a case of "other tools out there didn't happen to install them"?

-jJ
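(A small way to observe the behaviour in question -- assuming a directory named 'pkg' is on sys.path and contains no __init__.py; the name is made up.)

    import importlib

    pkg = importlib.import_module("pkg")
    print(pkg.__path__)                     # a _NamespacePath, not a plain list
    print(getattr(pkg, "__file__", None))   # None: there is no __init__.py to point at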
[Python-Dev] (time) PEP 418 glossary V2
Glossary Absolute Time - A measurement of time since a specific Epoch_, typically far in the past. Civil Time is the most common example. Typically contrasted with a `Duration`_, as (now - epoch) is generally much larger than any duration that can be appropriately measured with the clock in question. Accuracy The amount of deviation of measurements by a given instrument from true values. See also the wikipedia article on `Accuracy and precision http://en.wikipedia.org/wiki/Accuracy_and_precision`_. Inaccuracy in clocks may be caused by lack of `Precision`_, by `Drift`_, or by an incorrect initial setting of the clock (e.g., timing of threads is inherently inaccurate because perfect synchronization in resetting counters is quite difficult). Adjusted Resetting a clock, presumably to the correct time. This may be done either with a `Step`_ or with `Slew`_. Adjusting a clock normally makes it more accurate with respect to the `Absolute Time`_. The cost is that any durations currently being measured will show a `Bias`_. (17 ticks is not the same Duration_ as 17 ticks plus an adjustment.) Bias Lack of accuracy that is systematically in one direction, as opposed to random errors. When a clock is `Adjusted`_, durations overlapping the adjustment will show a Bias. Civil Time -- Time of day; external to the system. 10:45:13am is a Civil time; A Duration_ like 45 seconds is not a Civil time. Provided by existing functions ``time.localtime()`` and ``time.gmtime()``, which are not changed by this PEP. Clock - An instrument for measuring time. Different clocks have different characteristics; for example, a clock with nanosecond Precision_ may start to Drift_ after a few minutes, while a less precise clock remained accurate for days. This PEP is primarily concerned with clocks which use a unit of seconds, rather than years, or arbitrary units such as a Tick_. Counter --- A clock which increments each time a certain event occurs. A counter is strictly monotonic in the mathematical sense, but does not meet the typical definitions of Monotonic_ when used of a computer clock. It can be used to generate a unique (and ordered) timestamp, but these timestamps cannot be mapped to `Civil Time`_; Tick_ creation may well be bursty, with several advances in the same millisecond followed by several days without any advance. CPU Time A measure of how much CPU effort has been spent on a certain task. CPU seconds are often normalized (so that a variable number can occur in the same actual second). CPU seconds can be important when profiling, but they do not map directly to user response time, nor are they directly comparable to (real time) seconds. Drift - The accumulated error against true time, as defined externally to the system. Drift may be due to imprecision, or to a difference between the average rate at which clock time advances and that of real time. Drift does not include intentional adjustments, but clocks providing `Absolute Time`_ will eventually have to be Adjusted_ to compensate for drift. Duration Elapsed time. The difference between the starting and ending times. Also called Relative Time. Normally contrasted with `Absolute Time`_. While a defined Epoch_ technically creates an implicit duration, this duration is normally too large to be of practical use. Computers can often supply a clock with better Precision_ or higher Resolution_ if they do not have to guarantee meaningful comparisons to any times not generated by the clock itself. Epoch - The reference point of a clock. 
For clocks providing `Civil Time`_, this is often midnight as the day (and year) rolled over to January 1, 1970. A Monotonic_ clock will typically have an undefined epoch (represented as None). Latency --- Delay. By the time a call to a clock function returns, `Real Time`_ has advanced, possibly by more than the precision of the clock. Monotonic - This is a particularly tricky term, as there are several subtly incompatible definitions in use. C++ followed the mathematical definition, so that a monotonic clock only promises not to go backwards. In practice, that is not sufficient to be useful, and no Operating System provides such a weak guarantee. Most discussions of a Monotonic *Clock* will also assume several additional guarantees, some of which are explicitly required by the POSIX specification. Within this PEP (and Python), the intended meaning is closer to the characteristics expected of a monotonic clock in practice. In addition to not moving backward, a Monotonic Clock should also be Steady_, and should be convertible to a unit of seconds. The tradeoffs often include lack of a defined Epoch_ or mapping to `Civil Time`_, and being more expensive (in `Latency`_, power usage, or duration spent within calls to the clock itself) to use. For example, the clock may represent (a constant multiplied
[Python-Dev] PEP 418 glossary
I believe PEP 418 (or at least the discussion) would benefit greatly from a glossary to encourage people to use the same definitions. This is arguably the Definitions section, but it should move either near the end or (preferably) ahead of the Functions. It also needs to be greatly expanded. Here is my strawman proposal, which does use slightly different definitions than the current PEP even for some terms that the PEP does define: Accuracy: Is the answer correct? Any clock will eventually drift; if a clock is intended to match Civil Time, it will need to be adjusted back to the true time. Adjusted: Resetting a clock to the correct time. This may be done either with a Step or by Slewing. Civil Time: Time of day; external to the system. 10:45:13am is a Civil time; 45 seconds is not. Provided by existing function time.localtime() and time.gmtime(). Not changed by this PEP. Clock: An instrument for measuring time. Different clocks have different characteristics; for example, a clock with nanonsecond precision may start to drift after a few minutes, while a less precise clock remained accurate for days. This PEP is primarily concerned with clocks which use a unit of seconds. Clock_Monotonic: The characteristics expected of a monotonic clock in practice. In addition to being monotonic, the clock should also be steady and have relatively high precision, and should be convertible to a unit of seconds. The tradeoffs often include lack of a defined epoch or mapping to Civil Time, and being more expensive (in latency, power usage, or duration spent within calls to the clock itself) to use. For example, the clock may represent (a constant multiplied by) ticks of a specific quartz timer on a specific CPU core, and calls would therefore require synchronization between cores. The original motivation for this PEP was to provide a cross-platform name for requesting a clock_monotonic clock. Counter: A clock which increments each time a certain event occurs. A counter is strictly monotonic, but not clock_monotonic. It can be used to generate a unique (and ordered) timestamp, but these timestamps cannot be mapped to civil time; tick creation may well be bursty, with several advances in the same millisecond followed by several days without any advance. CPU Time: A measure of how much CPU effort has been spent on a certain task. CPU seconds are often normalized (so that a variable number can occur in the same actual second). CPU seconds can be important when profiling, but they do not map directly to user response time, nor are they directly comparable to (real time) seconds. time.clock() is deprecated because it returns real time seconds on Windows, but CPU seconds on unix, which prevents a consistent cross-platform interpretation. Duration: Elapsed time. The difference between the starting and ending times. A defined epoch creates an implicit (and usually large) duration. More precision can generally be provided for a relatively small duration. Drift: The accumulated error against true time, as defined externally to the system. Epoch: The reference point of a clock. For clocks providing civil time, this is often midnight as the day (and year) rolled over to January 1, 1970. For a clock_monotonic clock, the epoch may be undefined (represented as None). Latency: Delay. By the time a clock call returns, the real time has advanced, possibly by more than the precision of the clock. Microsecond: 1/1,000,000 of a second. Fast enough for most -- but not all -- profiling uses. Millisecond: 1/1,000 of a second. 
More than adequate for most end-to-end UI measurements, but often too coarse for profiling individual functions. Monotonic: Moving in at most one direction; for clocks, that direction is forward. A (nearly useless) clock that always returns exactly the same time is technically monotonic. In practice, most uses of monotonic with respect to clocks actually refer to a stronger set of guarantees, as described under clock_monotonic Nanosecond 1/1,000,000,000 of a second. The smallest unit of resolution -- and smaller than the actual precision -- available in current mainstream operating systems. Precision: Significant Digits. What is the smallest duration that the clock can distinguish? This differs from resolution in that a difference greater than the minimum precision is actually meaningful. Process Time: Time elapsed since the process began. It is typically measured in CPU time rather than real time, and typically does not advance while the process is suspended. Real Time: Time in the real world. This differs from Civil time in that it is not adjusted, but they should otherwise advance in lockstep. It is not related to the real time of Real Time [Operating] Systems. It is sometimes called wall clock time to avoid that ambiguity; unfortunately, that introduces different ambiguities. Resolution:
[Python-Dev] Who are the decimal volunteers? Re: [Python-checkins] cpython: Resize the coefficient to MPD_MINALLOC also if the requested size is below
I remember that one of the concerns with cdecimal was whether it could be maintained by anyone except Stefan (and a few people who were already overcommitted). If anyone (including absolute newbies) wants to step up, now would be a good time to get involved. A few starter questions, whose answer it would be good to document: Why is there any need for MPD_MINALLOC at all for (immutable) numbers? I suspect that will involve fleshing out some of the memory management issues around dynamic decimals, as touched on here: http://www.bytereef.org/mpdecimal/doc/libmpdec/memory.html#static-and-dynamic-decimals On Mon, Apr 9, 2012 at 3:33 PM, stefan.krah python-check...@python.org wrote: http://hg.python.org/cpython/rev/170bdc5c798b changeset: 76197:170bdc5c798b parent: 76184:02ecb8261cd8 user: Stefan Krah sk...@bytereef.org date: Mon Apr 09 20:47:57 2012 +0200 summary: Resize the coefficient to MPD_MINALLOC also if the requested size is below MPD_MINALLOC. Previously the resize was skipped as a micro optimization. files: Modules/_decimal/libmpdec/mpdecimal.c | 36 -- 1 files changed, 20 insertions(+), 16 deletions(-) diff --git a/Modules/_decimal/libmpdec/mpdecimal.c b/Modules/_decimal/libmpdec/mpdecimal.c --- a/Modules/_decimal/libmpdec/mpdecimal.c +++ b/Modules/_decimal/libmpdec/mpdecimal.c @@ -480,17 +480,20 @@ { assert(!mpd_isconst_data(result)); /* illegal operation for a const */ assert(!mpd_isshared_data(result)); /* illegal operation for a shared */ - + assert(MPD_MINALLOC = result-alloc); + + nwords = (nwords = MPD_MINALLOC) ? MPD_MINALLOC : nwords; + if (nwords == result-alloc) { + return 1; + } if (mpd_isstatic_data(result)) { if (nwords result-alloc) { return mpd_switch_to_dyn(result, nwords, status); } - } - else if (nwords != result-alloc nwords = MPD_MINALLOC) { - return mpd_realloc_dyn(result, nwords, status); - } - - return 1; + return 1; + } + + return mpd_realloc_dyn(result, nwords, status); } /* Same as mpd_qresize, but the complete coefficient (including the old @@ -500,20 +503,21 @@ { assert(!mpd_isconst_data(result)); /* illegal operation for a const */ assert(!mpd_isshared_data(result)); /* illegal operation for a shared */ - - if (mpd_isstatic_data(result)) { - if (nwords result-alloc) { - return mpd_switch_to_dyn_zero(result, nwords, status); - } - } - else if (nwords != result-alloc nwords = MPD_MINALLOC) { - if (!mpd_realloc_dyn(result, nwords, status)) { + assert(MPD_MINALLOC = result-alloc); + + nwords = (nwords = MPD_MINALLOC) ? MPD_MINALLOC : nwords; + if (nwords != result-alloc) { + if (mpd_isstatic_data(result)) { + if (nwords result-alloc) { + return mpd_switch_to_dyn_zero(result, nwords, status); + } + } + else if (!mpd_realloc_dyn(result, nwords, status)) { return 0; } } mpd_uint_zero(result-data, nwords); - return 1; } -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython (3.2): attempt to fix asyncore buildbot failure
What does this verify? My assumption from the name (test_quick_connect) and the context (an asynchronous server) is that it is verifying the server can handle a certain level of load. Refusing the sockets should then be a failure, or at least a skipped test. Would the below fail even if asyncore.loop were taken out of the threading.Thread target altogether? On Fri, Mar 23, 2012 at 10:10 AM, giampaolo.rodola python-check...@python.org wrote: http://hg.python.org/cpython/rev/2db4e916245a changeset: 75901:2db4e916245a branch: 3.2 parent: 75897:b97964af7299 user: Giampaolo Rodola' g.rod...@gmail.com date: Fri Mar 23 15:07:07 2012 +0100 summary: attempt to fix asyncore buildbot failure files: Lib/test/test_asyncore.py | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Lib/test/test_asyncore.py b/Lib/test/test_asyncore.py --- a/Lib/test/test_asyncore.py +++ b/Lib/test/test_asyncore.py @@ -741,11 +741,15 @@ for x in range(20): s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + s.settimeout(.2) s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0)) - s.connect(server.address) - s.close() - + try: + s.connect(server.address) + except socket.error: + pass + finally: + s.close() class TestAPI_UseSelect(BaseTestAPI): use_poll = False -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
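If the intent really is a load test, one way to make the failure mode explicit instead of silently passing -- a sketch only, reusing names from the quoted test and assuming that skipping is the desired outcome when the server cannot keep up:

    import socket
    import struct
    import unittest

    def quick_connect(address, attempts=20):
        # Open and immediately close many connections; if the server
        # refuses one, surface that as a skip instead of swallowing it.
        for _ in range(attempts):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(0.2)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                         struct.pack('ii', 1, 0))
            try:
                s.connect(address)
            except socket.error as exc:
                raise unittest.SkipTest("server refused a connection: %s" % exc)
            finally:
                s.close()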
Re: [Python-Dev] [Python-checkins] cpython (2.7): Fixes Issue 14234: fix for the previous commit, keep compilation when
Does this mean that if Python is updated before expat, python will compile out the expat randomization, and therefore not use if even after expat is updated? -jJ On Thu, Mar 15, 2012 at 2:01 PM, benjamin.peterson python-check...@python.org wrote: http://hg.python.org/cpython/rev/ada6bfbeceb8 changeset: 75699:ada6bfbeceb8 branch: 2.7 user: Gregory P. Smith g...@krypto.org date: Wed Mar 14 18:12:23 2012 -0700 summary: Fixes Issue 14234: fix for the previous commit, keep compilation when using --with-system-expat working when the system expat does not have salted hash support. files: Modules/expat/expat.h | 2 ++ Modules/pyexpat.c | 5 + 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/Modules/expat/expat.h b/Modules/expat/expat.h --- a/Modules/expat/expat.h +++ b/Modules/expat/expat.h @@ -892,6 +892,8 @@ XML_SetHashSalt(XML_Parser parser, unsigned long hash_salt); +#define XML_HAS_SET_HASH_SALT /* Python Only: Defined for pyexpat.c. */ + /* If XML_Parse or XML_ParseBuffer have returned XML_STATUS_ERROR, then XML_GetErrorCode returns information about the error. */ diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -1302,8 +1302,13 @@ else { self-itself = XML_ParserCreate(encoding); } +#if ((XML_MAJOR_VERSION = 2) (XML_MINOR_VERSION = 1)) || defined(XML_HAS_SET_HASH_SALT) + /* This feature was added upstream in libexpat 2.1.0. Our expat copy + * has a backport of this feature where we also define XML_HAS_SET_HASH_SALT + * to indicate that we can still use it. */ XML_SetHashSalt(self-itself, (unsigned long)_Py_HashSecret.prefix); +#endif self-intern = intern; Py_XINCREF(self-intern); #ifdef Py_TPFLAGS_HAVE_GC -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Close #14205: dict lookup raises a RuntimeError if the dict is modified during
I do not believe the change set below is valid. As I read it, the new test verifies that one particular type of Nasty key will provoke a RuntimeError -- but that particular type already did so, by hitting the recursion limit. (It doesn't even really mutate the dict.) Meanwhile, the patch throws out tests for several different types of mutations that have caused problems -- even segfaults -- in the past, even after the dict implementation code was already fixed. Changing these tests to assertRaises would be fine, but they should all be kept; if nothing else, they test whether you've caught all mutation avenues. -jJ On Mon, Mar 5, 2012 at 7:13 PM, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/934aaf2191d0 changeset: 75445:934aaf2191d0 user: Victor Stinner victor.stin...@gmail.com date: Tue Mar 06 01:03:13 2012 +0100 summary: Close #14205: dict lookup raises a RuntimeError if the dict is modified during a lookup. if you want to make a sandbox on top of CPython, you have to fix segfaults so let's fix segfaults! files: Lib/test/crashers/nasty_eq_vs_dict.py | 47 -- Lib/test/test_dict.py | 22 +- Lib/test/test_mutants.py | 291 -- Misc/NEWS | 5 +- Objects/dictobject.c | 18 +- 5 files changed, 31 insertions(+), 352 deletions(-) diff --git a/Lib/test/crashers/nasty_eq_vs_dict.py b/Lib/test/crashers/nasty_eq_vs_dict.py deleted file mode 100644 --- a/Lib/test/crashers/nasty_eq_vs_dict.py +++ /dev/null @@ -1,47 +0,0 @@ -# from http://mail.python.org/pipermail/python-dev/2001-June/015239.html - -# if you keep changing a dictionary while looking up a key, you can -# provoke an infinite recursion in C - -# At the time neither Tim nor Michael could be bothered to think of a -# way to fix it. - -class Yuck: - def __init__(self): - self.i = 0 - - def make_dangerous(self): - self.i = 1 - - def __hash__(self): - # direct to slot 4 in table of size 8; slot 12 when size 16 - return 4 + 8 - - def __eq__(self, other): - if self.i == 0: - # leave dict alone - pass - elif self.i == 1: - # fiddle to 16 slots - self.__fill_dict(6) - self.i = 2 - else: - # fiddle to 8 slots - self.__fill_dict(4) - self.i = 1 - - return 1 - - def __fill_dict(self, n): - self.i = 0 - dict.clear() - for i in range(n): - dict[i] = i - dict[self] = OK! - -y = Yuck() -dict = {y: OK!} - -z = Yuck() -y.make_dangerous() -print(dict[z]) diff --git a/Lib/test/test_dict.py b/Lib/test/test_dict.py --- a/Lib/test/test_dict.py +++ b/Lib/test/test_dict.py @@ -379,7 +379,7 @@ x.fail = True self.assertRaises(Exc, d.pop, x) - def test_mutatingiteration(self): + def test_mutating_iteration(self): # changing dict size during iteration d = {} d[1] = 1 @@ -387,6 +387,26 @@ for i in d: d[i+1] = 1 + def test_mutating_lookup(self): + # changing dict during a lookup + class NastyKey: + mutate_dict = None + + def __hash__(self): + # hash collision! + return 1 + + def __eq__(self, other): + if self.mutate_dict: + self.mutate_dict[self] = 1 + return self == other + + d = {} + d[NastyKey()] = 0 + NastyKey.mutate_dict = d + with self.assertRaises(RuntimeError): + d[NastyKey()] = None + def test_repr(self): d = {} self.assertEqual(repr(d), '{}') diff --git a/Lib/test/test_mutants.py b/Lib/test/test_mutants.py deleted file mode 100644 --- a/Lib/test/test_mutants.py +++ /dev/null @@ -1,291 +0,0 @@ -from test.support import verbose, TESTFN -import random -import os - -# From SF bug #422121: Insecurities in dict comparison. - -# Safety of code doing comparisons has been an historical Python weak spot. 
-# The problem is that comparison of structures written in C *naturally* -# wants to hold on to things like the size of the container, or the -# biggest containee so far, across a traversal of the container; but -# code to do containee comparisons can call back into Python and mutate -# the container in arbitrary ways while the C loop is in midstream. If the -# C code isn't extremely paranoid about digging things out of memory on -# each trip, and artificially boosting refcounts for the duration, anything -# from infinite loops to OS crashes can result (yes, I use Windows wink). -# -# The other problem is that code designed to provoke a weakness is usually -# white-box code, and so catches only the particular vulnerabilities the -# author knew
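For contrast with the single NastyKey case that was kept, here is the kind of mutation the removed tests exercised, in minimal form: a key whose __eq__ genuinely resizes the dict in the middle of a lookup. The class name and constants are illustrative, not taken from the removed files, and the exact exception depends on the interpreter version:

    class ResizingKey:
        # A key whose __eq__ grows the victim dict, forcing a resize
        # while a lookup is still walking the old table.
        victim = None

        def __hash__(self):
            return 1          # force collisions so __eq__ gets called

        def __eq__(self, other):
            if ResizingKey.victim is not None:
                d, ResizingKey.victim = ResizingKey.victim, None
                for i in range(32):      # enough inserts to trigger a resize
                    d[i] = i
            return self is other

    d = {ResizingKey(): "original"}
    ResizingKey.victim = d
    try:
        d[ResizingKey()]      # lookup of a distinct, colliding key
    except (RuntimeError, KeyError) as exc:
        # With the change above this should surface as an exception
        # rather than a crash or silent corruption.
        print(type(exc).__name__)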
Re: [Python-Dev] [Python-checkins] peps: Switch back to named functions, since the Ellipsis version degenerated badly
On Wed, Feb 22, 2012 at 10:22 AM, nick.coghlan python-check...@python.org wrote: + in x = weakref.ref(target, report_destruction) + def report_destruction(obj): print({} is being destroyed.format(obj)) +If the repetition of the name seems especially annoying, then a throwaway +name like ``f`` can be used instead:: + in x = weakref.ref(target, f) + def f(obj): + print({} is being destroyed.format(obj)) I still feel that the helper function (or class) is subordinate, and should be indented. Thinking of in ... as a decorator helps, but makes it seem that the helper function is the important part (which it sometimes is...) I understand that adding a colon and indent has its own problems, but ... I'm not certain this is better, and I am certain that the desire for indentation is strong enough to at least justify discussion in the PEP. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for new dictionary implementation
On Fri, Feb 17, 2012 at 1:50 AM, Martin v. Löwis mar...@v.loewis.de wrote: Good idea. However, how do you track per-dict how large the table is? [Or, rather, what is the highest index needed to store any values that are actually set for this instance.] To determine whether it needs to grow the array, it needs to find out how large the array is, no? So: how do you do that? Ah, now I understand; you do need a single ssize_t either on the dict or at the head of the values array to indicate how many slots it has actually allocated. It *may* also be worthwhile to add a second ssize_t to indicate how many are currently in use, for faster results in case of len. But the dict is guaranteed to have at least one free slot, so that extra index will never make the allocation larger than the current code. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
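As a sanity check on the bookkeeping being discussed (one ssize_t for allocated slots, growth driven by whatever index the shared key table hands out), here is a toy pure-Python model -- not the C implementation, just the shape of it, using None as a "no value" sentinel for simplicity:

    class SharedKeys:
        # Toy model of a shared key table: each key maps to a slot index.
        def __init__(self):
            self.index = {}            # key -> slot number, shared by instances

        def slot_for(self, key):
            # Hand out the next slot the first time a key is seen.
            return self.index.setdefault(key, len(self.index))

    class SplitDict:
        def __init__(self, shared):
            self.shared = shared
            self.values = []           # len(self.values) plays the ssize_t role

        def __setitem__(self, key, value):
            slot = self.shared.slot_for(key)
            while len(self.values) <= slot:    # grow only when this instance needs it
                self.values.append(None)
            self.values[slot] = value

        def __getitem__(self, key):
            slot = self.shared.index[key]      # KeyError if no instance ever used it
            if slot >= len(self.values) or self.values[slot] is None:
                raise KeyError(key)
            return self.values[slot]

        def __len__(self):
            return sum(v is not None for v in self.values)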
Re: [Python-Dev] PEP for new dictionary implementation
On Thu, Feb 16, 2012 at 4:34 PM, Martin v. Löwis mar...@v.loewis.de wrote: Am 16.02.2012 19:24, schrieb Jim J. Jewett: PEP author Mark Shannon wrote (in http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt): ... allows ... (the ``__dict__`` attribute of an object) to share keys with other attribute dictionaries of instances of the same class. Is the same class a deliberate restriction, or just a convenience of implementation? It's about the implementation: the class keeps a pointer to the key set. A subclass has a separate pointer for that. I would prefer to see that reason in the PEP; after a few years, I have trouble finding email, even when I remember reading the conversation. Have you timed not storing the hash (in the dict) at all, at least for (unicode) str-only dicts? Going to the string for its own cached hash breaks locality a bit more, but saves 1/3 of the memory for combined tables, and may make a big difference for classes that have relatively few instances. I'd be in favor of that, but it is actually an unrelated change: whether or not you share key sets is unrelated to whether or not str-only dicts drop the cached hash. Except that the biggest arguments against it are that it breaks cache locality, and it changes the dictentry struct -- which this patch already does anyway. Given a dict, it may be tricky to determine whether or not it is str-only, i.e. what layout to use. Isn't that exactly the same determination needed when deciding whether or not to use lookdict_unicode? (It would make the switch to the more general lookdict more expensive, as that would involve a new allocation.) Reduction in memory use is directly related to the number of dictionaries with shared keys in existence at any time. These dictionaries are typically half the size of the current dictionary implementation. How do you measure that? The limit for huge N across huge numbers of dicts should be 1/3 (because both hashes and keys are shared); I assume that gets swamped by object overhead in typical small dicts. It's more difficult than that. He also drops the smalltable (which I think is a good idea), so accounting how this all plays together is tricky. All the more reason to explain in the PEP how he measured or approximated it. If a table is split the values in the keys table are ignored, instead the values are held in a separate array. If they're just dead weight, then why not use them to hold indices into the array, so that values arrays only have to be as long as the number of keys, rather than rounding them up to a large-enough power-of-two? (On average, this should save half the slots.) Good idea. However, how do you track per-dict how large the table is? Why would you want to? The per-instance array needs to be at least as large as the highest index used by any key for which it has a value; if the keys table gets far larger (or even shrinks), that doesn't really matter to the instance. What does matter to the instance is getting a value of its own for a new (to it) key -- and then the keys table can tell it which index to use, which in turn tells it whether or not it needs to grow the array. Are are you thinking of len(o.__dict__), which will indeed be a bit slower? That will happen with split dicts and potentially missing values, regardless of how much memory is set aside (or not) for the missing values. 
-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
I realize that _Py_Identifier is a private name, and that PEP 3131 requires anything (except test cases) in the standard library to stick with ASCII ... but somehow, that feels like too long of a chain. I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier, or at least a comment stating that Identifiers will (per PEP 3131) always be ASCII -- preferably with an assert to back that up. -jJ On Sat, Feb 4, 2012 at 7:46 PM, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/d2c1521ad0a1 changeset: 74772:d2c1521ad0a1 user: Victor Stinner victor.stin...@haypocalc.com date: Sun Feb 05 01:45:45 2012 +0100 summary: _Py_Identifier are always ASCII strings files: Objects/unicodeobject.c | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1744,9 +1744,8 @@ _PyUnicode_FromId(_Py_Identifier *id) { if (!id-object) { - id-object = PyUnicode_DecodeUTF8Stateful(id-string, - strlen(id-string), - NULL, NULL); + id-object = unicode_fromascii((unsigned char*)id-string, + strlen(id-string)); if (!id-object) return NULL; PyUnicode_InternInPlace(id-object); -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
In http://mail.python.org/pipermail/python-dev/2012-January/115350.html, Mark Shannon wrote: The minimal proposed change of seeding the hash from a global value (a single memory read and an addition) will have such a minimal performance effect that it will be undetectable even on the most noise-free testing environment. (1) Is it established that this (a single initial add, with no per-loop operations) would be sufficient? I thought that was in the gray area of We don't yet have a known attack, but there are clearly safer options. (2) Even if the direct cost (fetch and add) were free, it might be expensive in practice. The current hash function is designed to send similar strings (and similar numbers) to similar hashes. (2a) That guarantees they won't (initially) collide, even in very small dicts. (2b) It keeps them nearby, which has an effect on cache hits. The exact effect (and even direction) would of course depend on the workload, which makes me distrust micro-benchmarks. If this were a problem in practice, I could understand accepting a little slowdown as the price of safety, but ... it isn't. Even in theory, the only way to trigger this is to take unreasonable amounts of user input and turn it directly into an unreasonable number of keys (as opposed to values, or list elements) placed in the same dict (as opposed to a series of smaller dicts). -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
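For concreteness, the worst case behind all of this is easy to reproduce with a key type whose hash is constant; a small demonstration (timings are illustrative only, and time.perf_counter assumes Python 3.3+):

    import time

    class FixedHash(str):
        # Every instance hashes the same, so inserts degrade to linear probing.
        def __hash__(self):
            return 42

    def fill(n, key_type):
        keys = [key_type("k%d" % i) for i in range(n)]
        start = time.perf_counter()
        d = {}
        for k in keys:
            d[k] = None
        return time.perf_counter() - start

    for n in (1000, 2000, 4000):
        print(n, "normal %.4fs" % fill(n, str),
              "colliding %.4fs" % fill(n, FixedHash))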
[Python-Dev] Proposed PEP on concurrent programming support
(I've added back python-ideas, because I think that is still the appropriate forum.) A new suite type - the ``transaction`` will be added to the language. The suite will have the semantics discussed above: modifying an object in the suite will trigger creation of a thread-local shallow copy to be used in the Transaction. Further modifications of the original will cause all existing copies to be discarded and the transaction to be restarted. ... How will you know that an object has been modified? The only ways I can think of are (1) Timestamp every object -- or at least every mutable object -- and hope that everybody agrees on which modifications should count. (2) Make two copies of every object you're using in the suite; at the end, compare one of them to both the original and the one you were operating on. With this solution, you can decide for youself what counts as a modification, but it still isn't straightforward; I would consider changing a value to be changing a dict, even though nothing in the item (header) itself changed. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
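Of the two options listed, (1) is the easier one to sketch; a toy version-counter model of "did anyone modify this underneath me?" (entirely illustrative -- the PEP proposes syntax, not this API, and this sketch only shows detection, not write-back):

    class Versioned:
        # Wrap a value and bump a counter on every write, so a transaction
        # can ask afterwards whether anything changed underneath it.
        def __init__(self, value):
            self._value = value
            self._version = 0

        def read(self):
            return self._value, self._version

        def write(self, value):
            self._value = value
            self._version += 1

    def run_transaction(shared, body, retries=10):
        # 'shared' maps names to Versioned objects; 'body' works on plain values.
        for _ in range(retries):
            snapshot = {name: obj.read() for name, obj in shared.items()}
            result = body({name: val for name, (val, _) in snapshot.items()})
            if all(shared[name].read()[1] == ver
                   for name, (_, ver) in snapshot.items()):
                return result          # nothing moved underneath us: commit
        raise RuntimeError("transaction kept being invalidated")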
Re: [Python-Dev] That depends on what the meaning of is is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)
On Mon, Jan 2, 2012 at 7:16 PM, PJ Eby p...@telecommunity.com wrote: On Mon, Jan 2, 2012 at 4:07 PM, Jim Jewett jimjjew...@gmail.com wrote: But the public header file http://hg.python.org/cpython/file/3ed5a6030c9b/Include/dictobject.h defines the typedef structs for PyDictEntry and _dictobject. What is the purpose of the requiring a real dict without also promising what the header file promises? Er, just because it's in the .h doesn't mean it's in the public API. But in any event, if you're actually serious about this, I'd just point out that: 1. The struct layout doesn't guarantee anything about insertion or lookup algorithms, My concern was about your suggestion of changing the data structure to accommodate some other algorithm -- particularly if it meant that the data would no longer be stored entirely in an array of PyDictEntry. That shouldn't be done lightly even between major versions, and certainly should not be done in a bugfix (or security-only) release. Are you seriously writing code that relies on the C structure layout of dicts? The first page of search results for PyDictEntry suggested that others are. (The code I found did seem to be for getting data from a python dict into some other language, rather than for wsgi.) Because really, that was SO not the point of the dict type requirement. It was so that you could use Python's low-level *API* calls, not muck about with the data structure directly. Would it be too late to clarify that in the PEP itself? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] That depends on what the meaning of is is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)
On Mon, Jan 2, 2012 at 1:16 AM, PJ Eby p...@telecommunity.com wrote: On Sun, Jan 1, 2012 at 10:28 PM, Jim Jewett jimjjew...@gmail.com wrote: Given the wording requiring a real dictionary, I would have assumed that it was OK (if perhaps not sensible) to do pointer arithmetic and access the keys/values/hashes directly. (Though if the breakage was between python versions, I would feel guilty about griping too loudly.) If you're going to be a language lawyer about it, I would simply point out that all the spec requires is that type(env) is dict -- it says nothing about how Python defines type or is or dict. So, you're on your own with that one. ;-) But the public header file http://hg.python.org/cpython/file/3ed5a6030c9b/Include/dictobject.h defines the typedef structs for PyDictEntry and _dictobject. What is the purpose of the requiring a real dict without also promising what the header file promises? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
Steven D'Aprano (in http://mail.python.org/pipermail/python-dev/2011-December/115162.html) wrote: By compile-time, do you mean when the byte-code is compiled, i.e. just before runtime, rather than a switch when compiling the Python executable from source? No. I really mean when the C code is initially compiled to produce a Python executable. The only reason we're worrying about this is that an adversary may force worst-case performance. If the Python instance isn't a server, or at least isn't exposed to untrusted clients, then even a single extra if test is unjustified overhead. Adding overhead to every string hash or every dict lookup is bad. That said, adding some overhead (only) to dict lookups *that already hit half a dozen consecutive collisions* probably is reasonable, because that won't happen very often with normal data. (6 collisions can't happen at all unless there are already at least 6 entries, so small dicts are safe; with at least 1/3 of the slots empty, it should happen only 1/729 of the time for worst-size larger dicts.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
Paul McMillan in http://mail.python.org/pipermail/python-dev/2012-January/115183.html wrote: Guido van Rossum wrote: Hm. I'm not sure I like the idea of extra arithmetic for every character being hashed. the collision generator doesn't necessarily vary the length of the string. Additionally, if we don't vary based on all the letters in the string, an attacker can fix the characters that we do use and generate colliding strings around them. If the new hash algorithm doesn't kick in before, say, 32 characters, then most currently hashed strings will not be affected. And if the attacker has to add 32 characters to every key, it reduces the this can be done with only N bytes uploaded risk. (The same logic would apply to even longer prefixes, except that an attacker might more easily find short-enough strings that collide.) We could also consider a less computationally expensive operation than the modulo for calculating the lookup index, like simply truncating to the correct number of bits. Given that the modulo is always 2^N, how is that different? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
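On the last point: for power-of-two table sizes the two reductions pick the same slot, which is presumably why the question is asked; a two-line check:

    # dict tables are always a power of two, so masking == taking the modulo.
    size = 8
    for h in (17, 123456789, hash("example") & 0xFFFFFFFF):
        assert (h % size) == (h & (size - 1))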
[Python-Dev] Hash collision security issue (now public)
Victor Stinner wrote in http://mail.python.org/pipermail/python-dev/2012-January/115198.html If we want to protect a website against this attack for example, we must suppose that the attacker can inject arbitrary data and can get (indirectly) the result of hash(str) (e.g. with the representation of a dict in a traceback, with a JSON output, etc.). (1) Is it common to hash non-string input? Because generating integers that collide for certain dict sizes is pretty easy... (2) Would it make sense for traceback printing to sort dict keys? (Any site worried about this issue should already be hiding tracebacks from untrusted clients, but the cost of this extra protection may be pretty small, given that tracebacks shouldn't be printed all that often in the first place.) (3) Should the docs for json.encoder.JSONEncoder suggest sort_keys=True? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
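On point (3), the knob already exists; whether the docs should recommend it is the open question:

    import json

    data = {"b": 1, "a": 2, "c": 3}
    print(json.dumps(data))                  # follows the dict's own iteration order
    print(json.dumps(data, sort_keys=True))  # {"a": 2, "b": 1, "c": 3}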
[Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
In http://mail.python.org/pipermail/python-dev/2011-December/115172.html, P. J. Eby wrote: On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull stephen at xemacs.org wrote: While the dictionary probe has to start with a hash for backward compatibility reasons, is there a reason the overflow strategy for insertion has to be buckets containing lists? How about double-hashing, etc? This won't help, because the keys still have the same hash value. ANYTHING you do to them after they're generated will result in them still colliding. The *only* thing that works is to change the hash function in such a way that the strings end up with different hashes in the first place. Otherwise, you'll still end up with (deliberate) collisions. Well, there is nothing wrong with switching to a different hash function after N collisions, rather than in the first place. The perturbation effectively does this by shoving the high-order bits through the part of the hash that survives the mask. (Well, technically, you could use trees or some other O(log n) data structure as a fallback once you have too many collisions, for some value of too many. Seems a bit wasteful for the purpose, though.) Your WSGI specification http://www.python.org/dev/peps/pep-0333/ requires using a real dictionary for compatibility; storing some of the values outside the values array would violate that. Do you consider that obsolete? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
On Sun, Jan 1, 2012 at 8:04 PM, Christian Heimes li...@cheimes.de wrote: Am 02.01.2012 01:37, schrieb Jim Jewett: Well, there is nothing wrong with switching to a different hash function after N collisions, rather than in the first place. The perturbation effectively does by shoving the high-order bits through the part of the hash that survives the mask. Except that it won't work or slow down every lookup of missing keys? It's absolutely crucial that the lookup time is kept as fast as possible. It will only slow down missing keys that themselves hit more than N collisions. Or were you assuming that I meant to switch the whole table, rather than just that one key? I agree that wouldn't work. You can't just change the hash algorithm in the middle of the work without a speed impact on lookups. Right -- but there is nothing wrong with modifying the lookdict (and insert_clean) functions to do something different after the Nth collision than they did after the N-1th. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
On Sun, Jan 1, 2012 at 10:00 PM, PJ Eby p...@telecommunity.com wrote: On Sun, Jan 1, 2012 at 7:37 PM, Jim Jewett jimjjew...@gmail.com wrote: Well, there is nothing wrong with switching to a different hash function after N collisions, rather than in the first place. The perturbation effectively does by shoving the high-order bits through the part of the hash that survives the mask. Since these are true hash collisions, they will all have the same high order bits. So, the usefulness of the perturbation is limited mainly to the common case where true collisions are rare. That is only because the perturb is based solely on the hash. Switching to an entirely new hash after the 5th collision (for a given lookup) would resolve that (after the 5th collision); the question is whether or not the cost is worthwhile. (Well, technically, you could use trees or some other O log n data structure as a fallback once you have too many collisions, for some value of too many. Seems a bit wasteful for the purpose, though.) Your WSGI specification http://www.python.org/dev/peps/pep-0333/ requires using a real dictionary for compatibility; storing some of the values outside the values array would violate that. When I said use some other data structure, I was referring to the internal implementation of the dict type, not to user code. The only user-visible difference (even at C API level) would be the order of keys() et al. Given the wording requiring a real dictionary, I would have assumed that it was OK (if perhaps not sensible) to do pointer arithmetic and access the keys/values/hashes directly. (Though if the breakage was between python versions, I would feel guilty about griping too loudly.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Hash collision security issue (now public)
In http://mail.python.org/pipermail/python-dev/2011-December/115138.html, Christian Heimes pointed out that ... we don't have to alter the outcome of hash ... We just need to reduce the chance that an attacker can produce collisions in the dict (and set?) I'll state it more strongly. hash probably should not change (at least for this), but we may want to consider a different conflict resolution strategy when the first slot is already filled. Remember that there was a fair amount of thought and timing effort put into selecting the current strategy; it is deliberately sub-optimal for random input, in order to do better with typical input. http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictnotes.txt If there is a change, it would currently be needed in three places for each of set and dict (the lookdict functions and insertdict_clean). It may be worth adding some macros just to keep those six in sync. Once those macros are in place, that allows a compile-time switch. My personal opinion is that accepting *and parsing* enough data for this to be a problem is enough of an edge case that I don't want normal dicts slowed down at all for this; I would therefore prefer that the change be restricted to such a compile-time switch, with current behavior the default. http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictobject.c#l571 583 for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) { 584 i = (i << 2) + i + perturb + 1; PERTURB_SHIFT is already a private #define to 5; per dictnotes, 4 and 6 perform almost as well. Someone worried can easily make that change today, and be protected from generic anti-Python attacks. I believe the salt suggestions are equivalent to replacing perturb = hash; with something like perturb = hash + salt; Changing i = (i << 2) + i + perturb + 1; would allow effectively replacing the initial hash, but risks spoiling performance in the non-adversary case. Would there be objections to replacing those two lines with something like: for (perturb = FIRST_PERTURB(hash, key); ep->me_key != NULL; perturb = NEXT_PERTURB(hash, key, perturb)) { i = NEXT_SLOT(i, perturb); The default macro definitions should keep things as they are: #define FIRST_PERTURB(hash, key) hash #define NEXT_PERTURB(hash, key, perturb) perturb >> PERTURB_SHIFT #define NEXT_SLOT(i, perturb) (i << 2) + i + perturb + 1 while allowing #ifdefs for (slower but) safer things like adding a salt, or even using alternative hashes. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
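A rough Python model of what the macro hooks above would permit: the same probe sequence as today by default, with a hook that only fires after the Nth collision. The threshold and the salted re-seed are invented for illustration, not a concrete proposal:

    PERTURB_SHIFT = 5
    MAX_PLAIN_COLLISIONS = 6          # illustrative threshold

    def probe_slots(h, mask, salt=0):
        # Same sequence as dictobject.c until the Nth collision, then a
        # hypothetical salted re-seed of the perturbation, so engineered
        # collisions stop steering the probe order.
        i = h & mask
        perturb = h
        collisions = 0
        while True:
            yield i & mask
            collisions += 1
            if collisions == MAX_PLAIN_COLLISIONS:
                perturb = (h ^ salt) * 1000003 & (2 ** 64 - 1)
            i = (i << 2) + i + perturb + 1
            perturb >>= PERTURB_SHIFT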
[Python-Dev] A new dict for Xmas?
Greg Ewing wrote: Mark Shannon wrote: I have a new dict implementation which allows sharing of keys between objects of the same class. We already have the __slots__ mechanism for memory savings. Have you done any comparisons with that? You can't make Python programmers use slots, neither can you automatically change existing programs. The automatic change is exactly what a dictionary upgrade provides. I haven't read your patch in detail yet, but it sounds like you're replacing the array of keys + array of values with just an array of values, and getting the numerical index from a single per-class array of keys. That would normally be sensible (so thanks!), but it isn't a drop-in replacement. If you have a Data class intended to take arbitrary per-instance attributes, it just forces them all to keep resizing up, even though individual instances would be small with the current dict. How is this more extreme than replacing a pure dict with some auto-calculated slots and an other_attrs dict that would normally remain empty? [It may be harder to implement, because of the difficulty of calculating the slots in advance ... but I don't see it as any worse, once implemented.] Of course, maybe your shared dict just points to sequential array positions (rather than matching the key position) ... in which case, it may well beat slots, though the the Data class would still be a problem. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
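For comparison with the __slots__ route: the saving exists today, but only for classes that opt in and give up arbitrary attributes (sizes below vary by version and platform):

    import sys

    class WithDict:
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    class WithSlots:
        __slots__ = ("a", "b", "c")
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    d, s = WithDict(), WithSlots()
    print(sys.getsizeof(d) + sys.getsizeof(d.__dict__))  # instance plus its dict
    print(sys.getsizeof(s))                              # no __dict__ at all
    d.extra = 4       # still allowed: this is what the Data class relies on
    # s.extra = 4     # AttributeError -- the trade-off __slots__ imposes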
Re: [Python-Dev] PyUnicodeObject / PyASCIIObject questions
On Tue, Dec 13, 2011 at 2:55 AM, Martin v. Löwis mar...@v.loewis.de wrote: (1) Why is PyObject_HEAD used instead of PyObject_VAR_HEAD? The unicode object is not a var object. In a var object, tp_itemsize gives the element size, which is not possible for unicode objects, since the itemsize may vary by instance. In addition, not all instances have the items after the base object (plus the size of the base object in tp_basicsize is also not always correct). That makes perfect sense. Any chance of adding the rationale to the code? Either inline, such as changing unicodeobject.h line 291 from PyObject_HEAD to something like: PyObject_HEAD /* Not VAR_HEAD, because tp_itemsize varies, and data may be elsewhere. */ or in the large comments around line 288: Note that Strings use PyObject_HEAD and a length field instead of PyObject_VAR_HEAD, because the tp_itemsize varies by instance, and the actual data is not always immediately after the PyASCIIObject header. (2) Why does PyASCIIObject have a wstr member, and why does PyCompactUnicodeObject have wstr_length? As best I can tell from the PEP or header file, wstr is only meaningful when either: No. wstr is most of all relevant if someone calls PyUnicode_AsUnicode(AndSize); any unicode object might get the wstr pointer filled out at some point. I am willing to believe that requests for a wchar_t (or utf-8 or System Locale charset) representation are common enough to justify caching the data after the first request. But then why throw it away in the first place? Wouldn't programs that create unicode from wchar_t data also be the most likely to request wchar_t data back? wstr_length is only relevant if wstr is not NULL. For a pure ASCII string (and also for Latin-1 and other BMP strings), the wstr length will always equal the canonical length (number of code points). wstr_length != length exactly when: 2==sizeof(wchar_t) PyUnicode_4BYTE_KIND == PyUnicode_KIND( str ) which can sometimes be eliminated at compile-time, and always by string creation time. In all other cases, (wstr_length == length), and wstr can be generated by widening the data without having to inspect it. Is it worth eliminating wstr_length (or even wstr) in those cases, or is that too much complexity? (3) I would feel much less nervous if the remaining 4 values of PyUnicode_Kind were explicitly reserved, and the macros raised an error when they showed up. ... If people use C, they can construct all kinds of illegal ... kind values: many places will either work incorrectly, or have an assertion in debug mode already if an unexpected kind is encountered. What I'm asking is that (1) The other values be documented as reserved, rather than as illegal. (2) The macros produce an error rather than silently corrupting data. This allows at least the possibility of a later change such that (3) The macros handle the new values correctly, if only by delegating back to type-supplied functions. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PyUnicodeObject / PyASCIIObject questions
(see http://www.python.org/dev/peps/pep-0393/ and http://hg.python.org/cpython/file/6f097ff9ac04/Include/unicodeobject.h ) typedef struct { PyObject_HEAD Py_ssize_t length; Py_hash_t hash; struct { unsigned int interned:2; unsigned int kind:2; /* now 3 in implementation */ unsigned int compact:1; unsigned int ascii:1; unsigned int ready:1; } state; wchar_t *wstr; } PyASCIIObject; typedef struct { PyASCIIObject _base; Py_ssize_t utf8_length; char *utf8; Py_ssize_t wstr_length; } PyCompactUnicodeObject; typedef struct { PyCompactUnicodeObject _base; union { void *any; Py_UCS1 *latin1; Py_UCS2 *ucs2; Py_UCS4 *ucs4; } data; } PyUnicodeObject; (1) Why is PyObject_HEAD used instead of PyObject_VAR_HEAD? It is because of the names (.length vs .size), or a holdover from when unicode (as opposed to str) did not expect to be compact, or is there a deeper reason? (2) Why does PyASCIIObject have a wstr member, and why does PyCompactUnicodeObject have wstr_length? As best I can tell from the PEP or header file, wstr is only meaningful when either: (2a) wstr is shared with (and redundant to) the canonical representation -- which will therefore not be ASCII. So wstr (and wstr_length) shouldn't need to be represented explicitly, and certainly not in the PyASCIIObject base. or (2b) The string is a Legacy String (and PyUnicode_READY has not been called). Because it is a Legacy String, the object header must already be a full PyUnicodeObject, and the wstr fields could at least be stored there. I'm also not sure why wstr can't be stored in the existing .data member -- once PyUnicode_READY is called, it will either be there (shared) or be discarded. Are there other times when the wstr will be explicitly re-filled and cached? (3) I would feel much less nervous if the remaining 4 values of PyUnicode_Kind were explicitly reserved, and the macros raised an error when they showed up. (Better still would be to allow other values, and to have the macros delegate to some attribute on the (sub) type object.) Discussion on py-ideas strongly suggested that people should not be rolling their own string string representations, and that it won't really save as much as people think it will, etc ... but I'm not sure that saying do it without inheritance is the best solution -- and that is what treating kind as an exhaustive list does. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Case consistency [was: Re: [Python-checkins] cpython: Cleanup code: remove int/long idioms and simplify a while statement.]
Is there a reason to check for if s[:5] == 'pass ' or s[:5] == 'PASS ': instead of if s[:5].lower() == 'pass ' ? If so, it should be documented; otherwise, I would rather see the more inclusive form, which would also allow things like 'Pass ' -jJ On Sun, Oct 23, 2011 at 4:21 PM, florent.xicluna python-check...@python.org wrote: http://hg.python.org/cpython/rev/67053b135ed9 changeset: 73076:67053b135ed9 user: Florent Xicluna florent.xicl...@gmail.com date: Sun Oct 23 22:11:00 2011 +0200 summary: Cleanup code: remove int/long idioms and simplify a while statement. diff --git a/Lib/ftplib.py b/Lib/ftplib.py --- a/Lib/ftplib.py +++ b/Lib/ftplib.py @@ -175,10 +175,8 @@ # Internal: sanitize a string for printing def sanitize(self, s): - if s[:5] == 'pass ' or s[:5] == 'PASS ': - i = len(s) - while i > 5 and s[i-1] in {'\r', '\n'}: - i = i-1 + if s[:5] in {'pass ', 'PASS '}: + i = len(s.rstrip('\r\n')) s = s[:5] + '*'*(i-5) + s[i:] return repr(s) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
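A sketch of the variant being asked about -- same behavior as the committed code for 'pass ' and 'PASS ', plus mixed-case forms:

    def sanitize(self, s):
        # More inclusive form: also masks "Pass ", "PaSS ", etc.
        if s[:5].lower() == 'pass ':
            i = len(s.rstrip('\r\n'))
            s = s[:5] + '*' * (i - 5) + s[i:]
        return repr(s)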
Re: [Python-Dev] [Python-checkins] cpython: Enhance Py_ARRAY_LENGTH(): fail at build time if the argument is not an array
On Wed, Sep 28, 2011 at 8:07 PM, Benjamin Peterson benja...@python.org wrote: 2011/9/28 victor.stinner python-check...@python.org: http://hg.python.org/cpython/rev/36fc514de7f0 changeset: 72512:36fc514de7f0 ... Thanks Rusty Russell for having written these amazing C macros! Do we really need a new file? Why not pyport.h where other compiler stuff goes? I would expect pyport to contain only system-specific macros. These seem more universal. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: #11572: improvements to copy module tests along with removal of old test suite
Why was the old test suite removed? Even if everything is covered by the test file (and that isn't clear from this checkin), I don't see anything wrong with a quick test that doesn't require loading the whole testing apparatus. (I would have no objection to including a comment saying that the majority of the tests are in the test file; I just wonder why they have to be removed entirely.) On Fri, Aug 5, 2011 at 5:06 PM, sandro.tosi python-check...@python.org wrote: http://hg.python.org/cpython/rev/74e79b2c114a changeset: 71749:74e79b2c114a user: Sandro Tosi sandro.t...@gmail.com date: Fri Aug 05 23:05:35 2011 +0200 summary: #11572: improvements to copy module tests along with removal of old test suite files: Lib/copy.py | 65 --- Lib/test/test_copy.py | 168 - 2 files changed, 95 insertions(+), 138 deletions(-) diff --git a/Lib/copy.py b/Lib/copy.py --- a/Lib/copy.py +++ b/Lib/copy.py @@ -323,68 +323,3 @@ # Helper for instance creation without calling __init__ class _EmptyClass: pass - -def _test(): - l = [None, 1, 2, 3.14, 'xyzzy', (1, 2), [3.14, 'abc'], - {'abc': 'ABC'}, (), [], {}] - l1 = copy(l) - print(l1==l) - l1 = map(copy, l) - print(l1==l) - l1 = deepcopy(l) - print(l1==l) - class C: - def __init__(self, arg=None): - self.a = 1 - self.arg = arg - if __name__ == '__main__': - import sys - file = sys.argv[0] - else: - file = __file__ - self.fp = open(file) - self.fp.close() - def __getstate__(self): - return {'a': self.a, 'arg': self.arg} - def __setstate__(self, state): - for key, value in state.items(): - setattr(self, key, value) - def __deepcopy__(self, memo=None): - new = self.__class__(deepcopy(self.arg, memo)) - new.a = self.a - return new - c = C('argument sketch') - l.append(c) - l2 = copy(l) - print(l == l2) - print(l) - print(l2) - l2 = deepcopy(l) - print(l == l2) - print(l) - print(l2) - l.append({l[1]: l, 'xyz': l[2]}) - l3 = copy(l) - import reprlib - print(map(reprlib.repr, l)) - print(map(reprlib.repr, l1)) - print(map(reprlib.repr, l2)) - print(map(reprlib.repr, l3)) - l3 = deepcopy(l) - print(map(reprlib.repr, l)) - print(map(reprlib.repr, l1)) - print(map(reprlib.repr, l2)) - print(map(reprlib.repr, l3)) - class odict(dict): - def __init__(self, d = {}): - self.a = 99 - dict.__init__(self, d) - def __setitem__(self, k, i): - dict.__setitem__(self, k, i) - self.a - o = odict({A : B}) - x = deepcopy(o) - print(o, x) - -if __name__ == '__main__': - _test() diff --git a/Lib/test/test_copy.py b/Lib/test/test_copy.py --- a/Lib/test/test_copy.py +++ b/Lib/test/test_copy.py @@ -17,7 +17,7 @@ # Attempt full line coverage of copy.py from top to bottom def test_exceptions(self): - self.assertTrue(copy.Error is copy.error) + self.assertIs(copy.Error, copy.error) self.assertTrue(issubclass(copy.Error, Exception)) # The copy() method @@ -54,20 +54,26 @@ def test_copy_reduce_ex(self): class C(object): def __reduce_ex__(self, proto): + c.append(1) return def __reduce__(self): - raise support.TestFailed(shouldn't call this) + self.fail(shouldn't call this) + c = [] x = C() y = copy.copy(x) - self.assertTrue(y is x) + self.assertIs(y, x) + self.assertEqual(c, [1]) def test_copy_reduce(self): class C(object): def __reduce__(self): + c.append(1) return + c = [] x = C() y = copy.copy(x) - self.assertTrue(y is x) + self.assertIs(y, x) + self.assertEqual(c, [1]) def test_copy_cant(self): class C(object): @@ -91,7 +97,7 @@ hello, hello\u1234, f.__code__, NewStyle, range(10), Classic, max] for x in tests: - self.assertTrue(copy.copy(x) is x, repr(x)) + self.assertIs(copy.copy(x), x) def 
test_copy_list(self): x = [1, 2, 3] @@ -185,9 +191,9 @@ x = [x, x] y = copy.deepcopy(x) self.assertEqual(y, x) - self.assertTrue(y is not x) - self.assertTrue(y[0] is not x[0]) - self.assertTrue(y[0] is y[1]) + self.assertIsNot(y, x) + self.assertIsNot(y[0], x[0]) + self.assertIs(y[0], y[1]) def test_deepcopy_issubclass(self): # XXX Note: there's no way to test the TypeError coming out of @@
Re: [Python-Dev] [Python-checkins] cpython: Remove mention of medical condition from the test suite.
If you're going to get rid of the pun, you might as well change the whole sentence... On Sun, Jul 3, 2011 at 1:22 PM, georg.brandl python-check...@python.org wrote: http://hg.python.org/cpython/rev/76452b892838 changeset: 71146:76452b892838 parent: 71144:ce52310f61a0 user: Georg Brandl ge...@python.org date: Sun Jul 03 19:22:42 2011 +0200 summary: Remove mention of medical condition from the test suite. files: Lib/test/test_csv.py | 8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/Lib/test/test_csv.py b/Lib/test/test_csv.py --- a/Lib/test/test_csv.py +++ b/Lib/test/test_csv.py @@ -459,20 +459,20 @@ '5', '6']]) def test_quoted_quote(self): - self.readerAssertEqual('1,2,3,I see, said the blind man,as he picked up his hammer and saw', + self.readerAssertEqual('1,2,3,I see, said the happy man,as he picked up his hammer and saw', [['1', '2', '3', - 'I see, said the blind man', + 'I see, said the happy man', 'as he picked up his hammer and saw']]) def test_quoted_nl(self): input = '''\ 1,2,3,I see, -said the blind man,as he picked up his +said the happy man,as he picked up his hammer and saw 9,8,7,6''' self.readerAssertEqual(input, [['1', '2', '3', - 'I see,\nsaid the blind man', + 'I see,\nsaid the happy man', 'as he picked up his\nhammer and saw'], ['9','8','7','6']]) -- Repository URL: http://hg.python.org/cpython ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: #6771: Move wrapper function into __init__ and eliminate wrapper module
Does this really need to be a bare except? On Sat, Jun 18, 2011 at 8:21 PM, r.david.murray python-check...@python.org wrote: http://hg.python.org/cpython/rev/9c96c3adbcd1 changeset: 70867:9c96c3adbcd1 user: R David Murray rdmur...@bitdance.com date: Sat Jun 18 20:21:09 2011 -0400 summary: #6771: Move wrapper function into __init__ and eliminate wrapper module Andrew agreed in the issue that eliminating the module file made sense. Wrapper has only been exposed as a function, and so there is no (easy) way to access the wrapper module, which in any case only had the one function in it. Since __init__ already contains a couple wrapper functions, it seems to make sense to just move wrapper there instead of importing it from a single function module. files: Lib/curses/__init__.py | 46 +++- Lib/curses/wrapper.py | 50 -- Misc/NEWS | 4 ++ 3 files changed, 49 insertions(+), 51 deletions(-) diff --git a/Lib/curses/__init__.py b/Lib/curses/__init__.py --- a/Lib/curses/__init__.py +++ b/Lib/curses/__init__.py @@ -13,7 +13,6 @@ __revision__ = $Id$ from _curses import * -from curses.wrapper import wrapper import os as _os import sys as _sys @@ -57,3 +56,48 @@ has_key except NameError: from has_key import has_key + +# Wrapper for the entire curses-based application. Runs a function which +# should be the rest of your curses-based application. If the application +# raises an exception, wrapper() will restore the terminal to a sane state so +# you can read the resulting traceback. + +def wrapper(func, *args, **kwds): + Wrapper function that initializes curses and calls another function, + restoring normal keyboard/screen behavior on error. + The callable object 'func' is then passed the main window 'stdscr' + as its first argument, followed by any other arguments passed to + wrapper(). + + + try: + # Initialize curses + stdscr = initscr() + + # Turn off echoing of keys, and enter cbreak mode, + # where no buffering is performed on keyboard input + noecho() + cbreak() + + # In keypad mode, escape sequences for special keys + # (like the cursor keys) will be interpreted and + # a special value like curses.KEY_LEFT will be returned + stdscr.keypad(1) + + # Start color, too. Harmless if the terminal doesn't have + # color; user can test with has_color() later on. The try/catch + # works around a minor bit of over-conscientiousness in the curses + # module -- the error return from C start_color() is ignorable. + try: + start_color() + except: + pass + + return func(stdscr, *args, **kwds) + finally: + # Set everything back to normal + if 'stdscr' in locals(): + stdscr.keypad(0) + echo() + nocbreak() + endwin() diff --git a/Lib/curses/wrapper.py b/Lib/curses/wrapper.py deleted file mode 100644 --- a/Lib/curses/wrapper.py +++ /dev/null @@ -1,50 +0,0 @@ -curses.wrapper - -Contains one function, wrapper(), which runs another function which -should be the rest of your curses-based application. If the -application raises an exception, wrapper() will restore the terminal -to a sane state so you can read the resulting traceback. - - - -import curses - -def wrapper(func, *args, **kwds): - Wrapper function that initializes curses and calls another function, - restoring normal keyboard/screen behavior on error. - The callable object 'func' is then passed the main window 'stdscr' - as its first argument, followed by any other arguments passed to - wrapper(). 
- - - try: - # Initialize curses - stdscr = curses.initscr() - - # Turn off echoing of keys, and enter cbreak mode, - # where no buffering is performed on keyboard input - curses.noecho() - curses.cbreak() - - # In keypad mode, escape sequences for special keys - # (like the cursor keys) will be interpreted and - # a special value like curses.KEY_LEFT will be returned - stdscr.keypad(1) - - # Start color, too. Harmless if the terminal doesn't have - # color; user can test with has_color() later on. The try/catch - # works around a minor bit of over-conscientiousness in the curses - # module -- the error return from C start_color() is ignorable. - try: - curses.start_color() - except: - pass - - return func(stdscr, *args, **kwds) - finally: - # Set everything back to normal - if 'stdscr' in locals(): - stdscr.keypad(0) - curses.echo() - curses.nocbreak() - curses.endwin() diff --git a/Misc/NEWS
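The narrower form being asked about would only swallow the error that start_color() itself raises on terminals without color support, rather than every exception; a sketch:

    import curses

    try:
        curses.start_color()
    except curses.error:
        pass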
Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib.
Can you clarify (preferably in the commit message as well) exactly *why* these largefile tests are useless? For example, is there another test that covers this already? -jJ On 5/7/11, nadeem.vawda python-check...@python.org wrote: http://hg.python.org/cpython/rev/201dcfc56e86 changeset: 69886:201dcfc56e86 branch: 2.7 parent: 69881:a0147a1f1776 user:Nadeem Vawda nadeem.va...@gmail.com date:Sat May 07 11:28:03 2011 +0200 summary: Issue #11277: Remove useless test from test_zlib. files: Lib/test/test_zlib.py | 42 --- 1 files changed, 0 insertions(+), 42 deletions(-) diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py --- a/Lib/test/test_zlib.py +++ b/Lib/test/test_zlib.py @@ -72,47 +72,6 @@ zlib.crc32('spam', (2**31))) -# Issue #11277 - check that inputs of 2 GB (or 1 GB on 32 bits system) are -# handled correctly. Be aware of issues #1202. We cannot test a buffer of 4 GB -# or more (#8650, #8651 and #10276), because the zlib stores the buffer size -# into an int. -class ChecksumBigBufferTestCase(unittest.TestCase): -if sys.maxsize _4G: -# (64 bits system) crc32() and adler32() stores the buffer size into an -# int, the maximum filesize is INT_MAX (0x7FFF) -filesize = 0x7FFF -else: -# (32 bits system) On a 32 bits OS, a process cannot usually address -# more than 2 GB, so test only 1 GB -filesize = _1G - -@unittest.skipUnless(mmap, mmap() is not available.) -def test_big_buffer(self): -if sys.platform[:3] == 'win' or sys.platform == 'darwin': -requires('largefile', - 'test requires %s bytes and a long time to run' % - str(self.filesize)) -try: -with open(TESTFN, wb+) as f: -f.seek(self.filesize-4) -f.write(asdf) -f.flush() -m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) -try: -if sys.maxsize _4G: -self.assertEqual(zlib.crc32(m), 0x709418e7) -self.assertEqual(zlib.adler32(m), -2072837729) -else: -self.assertEqual(zlib.crc32(m), 722071057) -self.assertEqual(zlib.adler32(m), -1002962529) -finally: -m.close() -except (IOError, OverflowError): -raise unittest.SkipTest(filesystem doesn't have largefile support) -finally: -unlink(TESTFN) - - class ExceptionTestCase(unittest.TestCase): # make sure we generate some expected errors def test_badlevel(self): @@ -595,7 +554,6 @@ def test_main(): run_unittest( ChecksumTestCase, -ChecksumBigBufferTestCase, ExceptionTestCase, CompressTestCase, CompressObjectTestCase -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?) Is this ASCII restriction (as opposed to even UTF8) really needed? Or are you just saying that we need to create an ASCII name for passing to C? -jJ On 5/7/11, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/eb003c3d1770 changeset: 69889:eb003c3d1770 user:Victor Stinner victor.stin...@haypocalc.com date:Sat May 07 12:46:05 2011 +0200 summary: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII The name must be encodable to ASCII because dynamic module must have a function called PyInit_NAME, they are written in C, and the C language doesn't accept non-ASCII identifiers. files: Python/importdl.c | 40 +- 1 files changed, 25 insertions(+), 15 deletions(-) diff --git a/Python/importdl.c b/Python/importdl.c --- a/Python/importdl.c +++ b/Python/importdl.c @@ -20,31 +20,36 @@ const char *pathname, FILE *fp); #endif -/* name should be ASCII only because the C language doesn't accept non-ASCII - identifiers, and dynamic modules are written in C. */ - PyObject * _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp) { -PyObject *m; +PyObject *m = NULL; #ifndef MS_WINDOWS PyObject *pathbytes; #endif +PyObject *nameascii; char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext; dl_funcptr p0; PyObject* (*p)(void); struct PyModuleDef *def; -namestr = _PyUnicode_AsString(name); -if (namestr == NULL) -return NULL; - m = _PyImport_FindExtensionObject(name, path); if (m != NULL) { Py_INCREF(m); return m; } +/* name must be encodable to ASCII because dynamic module must have a + function called PyInit_NAME, they are written in C, and the C language + doesn't accept non-ASCII identifiers. */ +nameascii = PyUnicode_AsEncodedString(name, ascii, NULL); +if (nameascii == NULL) +return NULL; + +namestr = PyBytes_AS_STRING(nameascii); +if (namestr == NULL) +goto error; + lastdot = strrchr(namestr, '.'); if (lastdot == NULL) { packagecontext = NULL; @@ -60,34 +65,33 @@ #else pathbytes = PyUnicode_EncodeFSDefault(path); if (pathbytes == NULL) -return NULL; +goto error; p0 = _PyImport_GetDynLoadFunc(shortname, PyBytes_AS_STRING(pathbytes), fp); Py_DECREF(pathbytes); #endif p = (PyObject*(*)(void))p0; if (PyErr_Occurred()) -return NULL; +goto error; if (p == NULL) { PyErr_Format(PyExc_ImportError, dynamic module does not define init function (PyInit_%s), shortname); -return NULL; +goto error; } oldcontext = _Py_PackageContext; _Py_PackageContext = packagecontext; m = (*p)(); _Py_PackageContext = oldcontext; if (m == NULL) -return NULL; +goto error; if (PyErr_Occurred()) { -Py_DECREF(m); PyErr_Format(PyExc_SystemError, initialization of %s raised unreported exception, shortname); -return NULL; +goto error; } /* Remember pointer to module init function. */ @@ -101,12 +105,18 @@ Py_INCREF(path); if (_PyImport_FixupExtensionObject(m, name, path) 0) -return NULL; +goto error; if (Py_VerboseFlag) PySys_FormatStderr( import %U # dynamically loaded from %R\n, name, path); +Py_DECREF(nameascii); return m; + +error: +Py_DECREF(nameascii); +Py_XDECREF(m); +return NULL; } #endif /* HAVE_DYNAMIC_LOADING */ -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Userlist.copy() wasn't returning a UserList.
Do you also want to assert that u is not v, or would that sort of copy be acceptable by some subclasses? On 5/5/11, raymond.hettinger python-check...@python.org wrote: http://hg.python.org/cpython/rev/f20373fcdde5 changeset: 69865:f20373fcdde5 user:Raymond Hettinger pyt...@rcn.com date:Thu May 05 14:34:35 2011 -0700 summary: Userlist.copy() wasn't returning a UserList. files: Lib/collections/__init__.py | 2 +- Lib/test/test_userlist.py | 6 ++ 2 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/collections/__init__.py b/Lib/collections/__init__.py --- a/Lib/collections/__init__.py +++ b/Lib/collections/__init__.py @@ -887,7 +887,7 @@ def pop(self, i=-1): return self.data.pop(i) def remove(self, item): self.data.remove(item) def clear(self): self.data.clear() -def copy(self): return self.data.copy() +def copy(self): return self.__class__(self) def count(self, item): return self.data.count(item) def index(self, item, *args): return self.data.index(item, *args) def reverse(self): self.data.reverse() diff --git a/Lib/test/test_userlist.py b/Lib/test/test_userlist.py --- a/Lib/test/test_userlist.py +++ b/Lib/test/test_userlist.py @@ -52,6 +52,12 @@ return str(key) + '!!!' self.assertEqual(next(iter(T((1,2, 0!!!) +def test_userlist_copy(self): +u = self.type2test([6, 8, 1, 9, 1]) +v = u.copy() +self.assertEqual(u, v) +self.assertEqual(type(u), type(v)) + def test_main(): support.run_unittest(UserListTest) -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
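A sketch of the extra assertion being asked about, on top of the test in the patch (assuming the same UserListTest harness, where self.type2test is UserList):

    def test_userlist_copy(self):
        u = self.type2test([6, 8, 1, 9, 1])
        v = u.copy()
        self.assertEqual(u, v)
        self.assertEqual(type(u), type(v))
        self.assertIsNot(u, v)        # the additional identity requirement in question
        u.append(7)
        self.assertNotEqual(u, v)     # a shared-storage "copy" would fail here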
Re: [Python-Dev] [Python-checkins] cpython: PyGILState_Ensure(), PyGILState_Release(), PyGILState_GetThisThreadState() are
Would it be a problem to make them available a no-ops? On 4/26/11, victor.stinner python-check...@python.org wrote: http://hg.python.org/cpython/rev/75503c26a17f changeset: 69584:75503c26a17f user:Victor Stinner victor.stin...@haypocalc.com date:Tue Apr 26 23:34:58 2011 +0200 summary: PyGILState_Ensure(), PyGILState_Release(), PyGILState_GetThisThreadState() are not available if Python is compiled without threads. files: Include/pystate.h | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Include/pystate.h b/Include/pystate.h --- a/Include/pystate.h +++ b/Include/pystate.h @@ -73,9 +73,9 @@ struct _frame *frame; int recursion_depth; char overflowed; /* The stack has overflowed. Allow 50 more calls - to handle the runtime error. */ -char recursion_critical; /* The current calls must not cause - a stack overflow. */ +to handle the runtime error. */ +char recursion_critical; /* The current calls must not cause +a stack overflow. */ /* 'tracing' keeps track of the execution depth when tracing/profiling. This is to prevent the actual trace/profile code from being recorded in the trace/profile. */ @@ -158,6 +158,8 @@ enum {PyGILState_LOCKED, PyGILState_UNLOCKED} PyGILState_STATE; +#ifdef WITH_THREAD + /* Ensure that the current thread is ready to call the Python C API, regardless of the current state of Python, or of its thread lock. This may be called as many times as desired @@ -199,6 +201,8 @@ */ PyAPI_FUNC(PyThreadState *) PyGILState_GetThisThreadState(void); +#endif /* #ifdef WITH_THREAD */ + /* The implementation of sys._current_frames() Returns a dict mapping thread id to that thread's current frame. */ -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
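A rough sketch of the "available as no-ops" idea (hypothetical, not what the checkin does), so extension code could keep calling these unconditionally in a threadless build:

    #ifndef WITH_THREAD
    /* Hypothetical no-op fallbacks for --without-threads builds. */
    #define PyGILState_Ensure()              PyGILState_LOCKED
    #define PyGILState_Release(state)        ((void)(state))
    #define PyGILState_GetThisThreadState()  PyThreadState_Get()
    #endif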
Re: [Python-Dev] [Python-checkins] cpython (3.2): Issue #11919: try to fix test_imp failure on some buildbots.
This seems to be changing what is tested -- are you saying that filenames with an included directory name are not intended to be supported? On 4/25/11, antoine.pitrou python-check...@python.org wrote: http://hg.python.org/cpython/rev/2f2c7eb27437 changeset: 69556:2f2c7eb27437 branch: 3.2 parent: 69554:77cf9e4b144b user:Antoine Pitrou solip...@pitrou.net date:Mon Apr 25 21:39:49 2011 +0200 summary: Issue #11919: try to fix test_imp failure on some buildbots. files: Lib/test/test_imp.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_imp.py b/Lib/test/test_imp.py --- a/Lib/test/test_imp.py +++ b/Lib/test/test_imp.py @@ -171,8 +171,9 @@ support.rmtree(test_package_name) def test_issue9319(self): +path = os.path.dirname(__file__) self.assertRaises(SyntaxError, - imp.find_module, test/badsyntax_pep3120) + imp.find_module, badsyntax_pep3120, [path]) class ReloadTests(unittest.TestCase): -- Repository URL: http://hg.python.org/cpython ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] clarification: subset vs equality Re: [Python-checkins] peps: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements
On 4/4/11, brett.cannon python-check...@python.org wrote: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements +Abstract + + +The Python standard library under CPython contains various instances +of modules implemented in both pure Python and C. This PEP requires +that in these instances that both the Python and C code *must* be +semantically identical (except in cases where implementation details +of a VM prevents it entirely). It is also required that new C-based +modules lacking a pure Python equivalent implementation get special +permissions to be added to the standard library. I think it is worth stating explicitly that the C version can be even a strict subset. It is OK for the accelerated C code to rely on the common python version; it is just the reverse that is not OK. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
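The "C relies on the common Python version" pattern, sketched with a made-up module pair (the stdlib's heapq/_heapq works this way): the pure Python file is complete on its own, and the optional accelerator overrides whatever subset it implements.

    # mylib.py -- reference implementation, always importable
    def frobnicate(x):
        "Pure Python version; the accelerator may replace this."
        return x * 2

    try:
        from _mylib import frobnicate   # hypothetical C accelerator, a strict subset
    except ImportError:
        pass                            # no accelerator built: keep the Python version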
Re: [Python-Dev] [Python-checkins] r87980 - in python/branches/py3k/Lib/importlib: _bootstrap.py abc.py
Why? Are annotations being deprecated in general? Or are these particular annotations no longer accurate? -jJ On Wed, Jan 12, 2011 at 9:31 PM, raymond.hettinger python-check...@python.org wrote: Author: raymond.hettinger Date: Thu Jan 13 03:31:25 2011 New Revision: 87980 Log: Issue 10899: Remove function type annotations from the stdlib Modified: python/branches/py3k/Lib/importlib/_bootstrap.py python/branches/py3k/Lib/importlib/abc.py Modified: python/branches/py3k/Lib/importlib/_bootstrap.py == --- python/branches/py3k/Lib/importlib/_bootstrap.py (original) +++ python/branches/py3k/Lib/importlib/_bootstrap.py Thu Jan 13 03:31:25 2011 @@ -345,7 +345,7 @@ class SourceLoader(_LoaderBasics): - def path_mtime(self, path:str) - int: + def path_mtime(self, path): Optional method that returns the modification time for the specified path. @@ -354,7 +354,7 @@ raise NotImplementedError - def set_data(self, path:str, data:bytes) - None: + def set_data(self, path, data): Optional method which writes data to a file path. Implementing this method allows for the writing of bytecode files. Modified: python/branches/py3k/Lib/importlib/abc.py == --- python/branches/py3k/Lib/importlib/abc.py (original) +++ python/branches/py3k/Lib/importlib/abc.py Thu Jan 13 03:31:25 2011 @@ -18,7 +18,7 @@ Abstract base class for import loaders. @abc.abstractmethod - def load_module(self, fullname:str) - types.ModuleType: + def load_module(self, fullname): Abstract method which when implemented should load a module. raise NotImplementedError @@ -28,7 +28,7 @@ Abstract base class for import finders. @abc.abstractmethod - def find_module(self, fullname:str, path:[str]=None) - Loader: + def find_module(self, fullname, path=None): Abstract method which when implemented should find a module. raise NotImplementedError @@ -47,7 +47,7 @@ @abc.abstractmethod - def get_data(self, path:str) - bytes: + def get_data(self, path): Abstract method which when implemented should return the bytes for the specified path. raise NotImplementedError @@ -63,19 +63,19 @@ @abc.abstractmethod - def is_package(self, fullname:str) - bool: + def is_package(self, fullname): Abstract method which when implemented should return whether the module is a package. raise NotImplementedError @abc.abstractmethod - def get_code(self, fullname:str) - types.CodeType: + def get_code(self, fullname): Abstract method which when implemented should return the code object for the module raise NotImplementedError @abc.abstractmethod - def get_source(self, fullname:str) - str: + def get_source(self, fullname): Abstract method which should return the source code for the module. raise NotImplementedError @@ -94,7 +94,7 @@ @abc.abstractmethod - def get_filename(self, fullname:str) - str: + def get_filename(self, fullname): Abstract method which should return the value that __file__ is to be set to. raise NotImplementedError @@ -117,11 +117,11 @@ - def path_mtime(self, path:str) - int: + def path_mtime(self, path): Return the modification time for the path. raise NotImplementedError - def set_data(self, path:str, data:bytes) - None: + def set_data(self, path, data): Write the bytes to the path (if possible). Any needed intermediary directories are to be created. If for some @@ -170,7 +170,7 @@ raise NotImplementedError @abc.abstractmethod - def source_path(self, fullname:str) - object: + def source_path(self, fullname): Abstract method which when implemented should return the path to the source code for the module. 
raise NotImplementedError @@ -279,19 +279,19 @@ return code_object @abc.abstractmethod - def source_mtime(self, fullname:str) - int: + def source_mtime(self, fullname): Abstract method which when implemented should return the modification time for the source of the module. raise NotImplementedError @abc.abstractmethod - def bytecode_path(self, fullname:str) - object: + def bytecode_path(self, fullname): Abstract method which when implemented should return the path to the bytecode for the module. raise NotImplementedError @abc.abstractmethod - def write_bytecode(self, fullname:str, bytecode:bytes) - bool: + def write_bytecode(self, fullname, bytecode): Abstract method which when
Re: [Python-Dev] [Python-checkins] r87523 - python/branches/py3k/Doc/tutorial/interpreter.rst
It might still be worth saying something like: Note that this python file does something subtly different; the details are not included in this tutorial. On Tue, Dec 28, 2010 at 4:18 AM, georg.brandl python-check...@python.org wrote: Author: georg.brandl Date: Tue Dec 28 10:18:24 2010 New Revision: 87523 Log: Remove confusing paragraph -- this is relevant only to advanced users anyway and does not belong into the tutorial. Modified: python/branches/py3k/Doc/tutorial/interpreter.rst Modified: python/branches/py3k/Doc/tutorial/interpreter.rst == --- python/branches/py3k/Doc/tutorial/interpreter.rst (original) +++ python/branches/py3k/Doc/tutorial/interpreter.rst Tue Dec 28 10:18:24 2010 @@ -58,14 +58,6 @@ ``python -m module [arg] ...``, which executes the source file for *module* as if you had spelled out its full name on the command line. -Note that there is a difference between ``python file`` and ``python -file``. In the latter case, input requests from the program, such as calling -``sys.stdin.read()``, are satisfied from *file*. Since this file has already -been read until the end by the parser before the program starts executing, the -program will encounter end-of-file immediately. In the former case (which is -usually what you want) they are satisfied from whatever file or device is -connected to standard input of the Python interpreter. - When a script file is used, it is sometimes useful to be able to run the script and enter interactive mode afterwards. This can be done by passing :option:`-i` before the script. (This does not work if the script is read from standard ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] __file__ and bytecode-only
I understand the need to ship without source -- but why does that require supporting .pyc (or .pyo) -only? Couldn't vendors just replace the real .py files with empty files? Then no one would need the extra stat call, and no one would be bitten by orphaned .pyc files after a rename. [Yes, zips could still allow unmatched names; yes, it would be helpful if a tool were available to sync the last-modification time; yes a deprecation release should still be needed.] -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] codecs.open [was: PEP 385: the eol-type issue]
M.-A. Lemburg wrote: ... and because of this, the feature is already available if you use codecs.open() instead of the built-in open(): Neil Hodgson asked: So should I not add an issue for the basic open because codecs.open should be used for this case? In python 3, why does codecs.open even still exist? As best I can tell, codecs.open should be the same as regular open, but for a unicode file -- and all text files are treated as unicode in python 3.0 So at this point, are there any differences beyond: (a) The builtin open doesn't work on multi-byte line-endings other than the multi-character CRLF. (In other words, it goes by the traditional Operating System conventions developed when a char was a byte, but the Unicode standard allows for a few more possibilities, which are currently rare in practice.) (b) The codecs version is much slower, because it hasn't seen the optimization effort. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
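Difference (a) is easy to demonstrate, assuming a file containing a U+2028 LINE SEPARATOR: the built-in open() only breaks lines on \n, \r and \r\n, while codecs.open() reads through str.splitlines(), which also honours the extra Unicode line endings.

    import codecs

    with open("demo.txt", "w", encoding="utf-8") as f:
        f.write("one\u2028two\n")

    print(len(open("demo.txt", encoding="utf-8").readlines()))         # 1 line
    print(len(codecs.open("demo.txt", encoding="utf-8").readlines()))  # 2 lines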
[Python-Dev] standard library mimetypes module pathologically broken?
[It may be worth creating a patch; I think most of these comments would be better on the bug-tracker.] (1) In a few cases, it looked like you were changing parameter names between files and filenames. This might break code that was calling it with keyword arguments -- as I typically would for this type of function. (1a) If you are going to change the .sig, you might as well do it right, and make the default be knownfiles rather than the empty tuple. (2) The comment about why inited was set true at the beginning of the function instead of the end should probably be kept, or at least reworded. (3) Default values: (3a) Why the list of known files going back to Apache 1.2, in that order? Is there any risk in using too *new* of a MimeTypes file? I would assume that the goal is to pick up whatever changes the user has made locally, but in that case, it still makes sense to have the newest file be the last one read, in case Apache has made bugfixes. (3b) Also, this would improve cross-platform consistency; if I read that correctly, the Apache files will override the python defaults on unix or a mac, but not on windows. That will change the results on the majority of items in _common_types. (application vs text, whether to put an x- in front of the word pict.) (3c) rtf is listed in non-standard, but http://www.iana.org/assignments/media-types/ does define it. (Though whether to guess application vs text is not defined, and python chooses differently from apache.) (3d) jpg is listed as non-standard. It turns out that this is just for the inverse mapping, where image/jpg is non-standard (for image/jpeg) but that is worth a comment. (see #5) (3e) In _types_map, the lines marked duplicates are duplicate keys, not duplicate values; it would be more clear to also comment out the (first) line itself, instead of just marking it a duplicate. (Or better yet, to mention that it is just being added for the inverse mapping, if that is the case.) (4) Why bother to lazyinit?Is there any sane usecase for a MimeTypes that hasn't been inited? I see value in not reading the default files, but none in not reading at least the files that were asked for. I could see value in only partial initialization if there were several long steps, but right now, initialization is all-or-nothing. If the thing is useless without an init, then it makes sense to just get done it immediately and skip the later checks; anyone who could have actually saved time should just remove the import. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
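On point (1), the sort of caller a parameter rename would break, using the pre-patch names (the path is illustrative):

    import mimetypes

    # Keyword callers break if "files" quietly becomes "filenames" (or vice versa):
    mimetypes.init(files=["/etc/mime.types"])
    db = mimetypes.MimeTypes(filenames=("/etc/mime.types",), strict=True)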
[Python-Dev] PEP 384: Defining a Stable ABI
Martin v. Löwis wrote: - PyGetSetDef (name, get, set, doc, closure) Is it fully decided that the generally-unused closure parameter will stay until python 4? The accessor macros to these fields (Py_REFCNT, Py_TYPE, Py_SIZE) are also available to applications. There have been several experiments in memory management, ranging from not bothering to change the refcount on permanent objects like None, to proxying objects across multiple threads or processes. I also believe (but don't remember for sure) that some of the proposed Unicode (or String?) optimizations changed the memory layout a bit. So far, these have all been complicated (or slow) enough that they didn't get integrated, but if it ever happens ... I don't think it would justify python 4.0 New Python versions may introduce new slot ids, but slot ids will never be recycled. Slots may get deprecated, but continue to be supported throughout Python 3.x. Weren't there already a few ready for deprecation? Do you really want to commit to them forever? Even if you aren't willing to settle for less than 3.x from now on, it might make sense to at least start with 3.2, rather than 3.0. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
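For reference, the struct being discussed, filled in the way most extensions do it; the final closure slot is the generally-unused member in question (it lets one getter serve several attributes, but is almost always NULL). ThingObject is purely illustrative.

    static PyObject *
    thing_get_size(PyObject *self, void *closure)   /* closure: the rarely-used slot */
    {
        return PyLong_FromSsize_t(((ThingObject *)self)->size);
    }

    static PyGetSetDef thing_getset[] = {
        {"size", thing_get_size, NULL, PyDoc_STR("number of items"), NULL},
        {NULL}   /* sentinel */
    };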
[Python-Dev] PEP 383 and GUI libraries
(sent only to python-dev, as I am not a subscriber of tahoe-dev) Zooko wrote: [Tahoe] currently uses utf-8 for its internal storage (note: nothing to do with reading or writing files from external sources -- only for storing filenames in the decentralized storage system which is accessed by Tahoe clients), and we can't start putting non-utf-8-valid sequences in the filename slot because other Tahoe clients would then get a UnicodeDecodeError exception when trying to read those directories. So what do you do when someone has an existing file whose name is supposed to be in utf-8, but whose actual bytes are not valid utf-8? If you have somehow solved that problem, then you're already done -- the PEP's encoding is a no-op on anything that isn't already invalid unicode. If you have not solved that problem, then those clients will already be getting a UnicodeDecodeError; all the PEP does is make it at least possible for them to recover. ... Requirement 1 (unicode): Each filename that you see needs to be valid unicode (it is stored internally in utf-8). (repeating) What does Tahoe do if this is violated? Do you throw an exception right there and not let them copy the file to tahoe? If so, then that same error correction means that utf8b will never differ from utf-8, and you have nothing to worry about. Requirement 2 (faithful if unicode): Doesn't the PEP meet this? Requirement 3 (no file left behind): Doesn't the PEP also meet this? I thought the concern was just that the name used would not be valid unicode, unless the original name was itself valid unicode. Possible Requirement 4 (faithful bytes if not unicode, a.k.a. round-tripping): Doesn't the PEP also support this? (Only) the invalid bytes get escaped and therefore must be unescaped, but the escapement is reversible. 3. (handling collisions) In either case 2.a or 2.b the resulting unicode string may already be present in the directory. This collision is what the use of half-surrogates (as the escape characters) avoids. Such collisions can't be present unless the data was invalid unicode, in which case it was the result of an escapement (unless something other than python is creating new invalid filenames). -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
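The round-trip behaviour (requirement 4) in a few lines, using the error handler name as it eventually landed (surrogateescape; the PEP discussion calls it utf-8b): only the invalid bytes are escaped, and the escaping reverses exactly.

    raw = b"caf\xe9.txt"                           # latin-1 bytes, not valid UTF-8
    name = raw.decode("utf-8", "surrogateescape")  # 'caf\udce9.txt' -- lone surrogate marks the bad byte
    assert name.encode("utf-8", "surrogateescape") == raw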
[Python-Dev] #!/usr/bin/env python -- python3 where applicable
Jared Grubb wrote: Ok, so if I understand, the situation is: * python points to 2.x version * python3 points to 3.x version * need to be able to run certain 3k scripts from cmdline (since we're talking about shebangs) using Python3k even though python points to 2.x So, if I got the situation right, then do these same scripts understand that PYTHONPATH and PYTHONHOME and all the others are also probably pointing to 2.x code? Would it make sense to introduce PYTHON2PATH and PYTHON3PATH (or even PYTHON27PATH and PYTHON 32PATH) et al? Or is this an area where we just figure that whoever moved the file locations around for distribution can hardcode things properly? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] package resources [was: setuptools has divided the Python community]
At 11:27 PM 3/26/2009 +, Paul Moore wrote: What I'd really like is essentially some form of virtual filesystem access to stuff addressed relative to a Python package name, P.J. Eby responded: Note that relative to a *Python package name* isn't quite as useful, due to namespace packages. To be unambiguous as to the targeted resource, one needs to be able to reference a specific project, and that requires you to go off the name of a module *within* a package. For example, 'zope.somemodule' rather than just 'zope'. I would expect it to be *most* important then. If I know for sure that an entire package is all together in a single directory, I can just use that directory. If I want all xxx files used by zope, then ... I *do* want information on the duplicates, and the multiple locations. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
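For comparison, the closest thing the stdlib already offers is addressed by module/package name rather than by directory, which is exactly where the namespace-package ambiguity bites (the resource name here is made up):

    import pkgutil

    # Looks up defaults.cfg relative to wherever zope.somemodule actually lives
    # (directory, zip, ...); with a namespace package, plain "zope" would be ambiguous.
    data = pkgutil.get_data("zope.somemodule", "defaults.cfg")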
[Python-Dev] return from a generator [was:PEP 380 (yield from a subgenerator) comments]
On Thu, Mar 26, 2009 at 4:19 PM, P.J. Eby wrote: What I don't like is the confusion of adding return values to generators, at least using the 'return' statement. At Fri Mar 27 04:39:48 CET 2009, Guido van Rossum replied: I'm +1 on yield from and +0 on return values in generators.

    def g():
        yield 42
        return 43

    for x in g():
        print x   # probably expected to print 42 and then 43

I still don't see why it needs to be a return statement. Why not make the intent of g explicit, by writing either

    def g():
        yield 42
        yield 43

or

    def g():
        yield 42
        raise StopIteration(43)

-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] wait time [was: Ext4 data loss]
It is starting to look as though flush (and close?) should take an optional wait parameter, to indicate how much re-assurance you're willing to wait for. It also looks like we can't know enough to predict all sensible symbolic constants -- so instead use a floating point numeric value.

    f.flush(wait=0)   == current behavior
    f.flush(wait=1)   == Do everything you can. On a Mac, this would apparently mean (everything up to and including) fcntl(fd, F_FULLSYNC)
    f.flush(wait=0.5) == somewhere in between, depending on the operating system and file system and disk drive and other stuff the developer won't know in advance.

The exact interpretation of intermediate values might depend on the installation or even change over time; the only invariant would be that higher values are at least as safe, and lower values are at least as fast. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] wait time [was: Ext4 data loss]
On 3/12/09, Martin v. Löwis mar...@v.loewis.de wrote: It is starting to look as though flush (and close?) should take an optional wait parameter, to indicate how much re-assurance you're willing to wait for. Unfortunately, such a thing would be unimplementable on most of today's operating systems. What am I missing?

    _file = file
    class file(_file):
        ...
        def flush(self, wait=0):
            super().flush(self)
            if wait < 0.25:
                return
            if wait < 0.5 and os.fdatasync:
                os.fdatasync(self.fileno())
                return
            os.fsync(self.fileno())
            if wait < 0.75:
                return
            if os.ffullsync:
                os.ffullsync(self.fileno())

(To be honest, I'm not even seeing why it couldn't be done in Objects/fileobject.c, though I realize extension modules would need to go through the python interface to take advantage of it.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] sure [was: Ext4 data loss]
[new name instead of wait -- but certainty is too long, patience too hard to spell, etc...]

    class file(_file):
        ...
        def flush(self, sure=0):
            super().flush(self)
            if sure < 0.25:
                return
            if sure < 0.5 and os.fdatasync:
                os.fdatasync(self.fileno())
            ...

Steven D'Aprano asked: Why are you giving the user the illusion of fine control by making the wait parameter a continuous variable and then using it as if it were a discrete variable? We don't know how many possible values there will be, or whether they will be affected by environmental settings. Developers will not always know what sort of systems users will have, but they can indicate (with a ratio) where in the range (slow+safe):(fast+risky) they rate this particular flush. Before this discussion, I knew about sync, but had not paid attention even to datasync, let alone fullsync. I have no idea which additional options may be relevant in the future, or on smaller devices or other storage media. I do expect specific intermediate values (such as 0.3) to be interpreted differently on a laptop than on a desktop. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Integrate BeautifulSoup into stdlib?
Michael Foord wrote: Chris Withers wrote: ... love to see ... but ... not optimistic - python to grow a decent, cross platform, package management system As stated, this may be impossible, because of the difference in what a package should mean on Windows vs Unix. If you just mean a way to add python packages from pypi as with EasyInstall, then maybe. - the standard library to actually shrink to a point where only libraries that are not released elsewhere are included In some environments, each new component must be approved. Once python is approved, the standard library is OK, but adding 7 packages from pypi requires 7 more sets of approvals. On the other hand, if there were a way to say The PSF explicitly endorses Packages X, Y, and Z as worthy of the stdlib; they are distributed separately for administrative reasons, then the initial request could be for Python plus officially endorsed addons That said, it may make sense to just give greater prominence to existing repackagers, such as ActiveState or Enthought. If a library is well maintained then there seems to be little point in moving it into the standard library The official endorsement is in many cases more important than shared distribution. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] html5lib/BeautifulSoup (was: Integrate lxml into the stdlib? (was: Integrate BeautifulSoup into stdlib?))
Stefan Behnel wrote: I would have a hard time feeling happy if a real-world HTML parser was added to the stdlib that provides a totally different interface than the best (and fastest) XML library that the stdlib currently has. I doubt there would be any objection to someone contributing wrappers for upgrades, but I wouldn't count on them being used. lxml may well be the best choice for xml. BeautifulSoup and html5lib wouldn't even exist if that actually mattered for most of *their* use cases. Think of them more as pre-processors, like tidylib. If enough web pages were even valid HTML (let alone valid and well-formed XML), no one would have bothered to write these libraries. BeautifulSoup has the advantage of being long-proven in practice, for ugly html. (You mention an lxml feature with a similar intent, but for lxml, it is one of several addon features; for BeautifulSoup, this is the whole point.) html5lib does not have as long of a history, but it does have the advantage of being almost an endorsed standard. Much of HTML 5 is documenting the workarounds that browser makers already actually employ to handle erroneous input, so that the complexities can at least stop compounding. html5lib is intended as a reference implementation, and the w3c editor has used it to motivate changes in the specification draft. (This may make it unsuitable for inclusion in the stdlib today, because of timing issues.) In other words, it isn't just the heuristics of one particular development team; it is (modulo bugs, and after official publication) the heuristics that the major web browser makers have agreed to treat as correct in the future. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] #ifdef __cplusplus?
Alexander Belopolsky wrote: 4. Should exported symbols be always declared in headers or is it ok to just declare them as extern in .c files where they are used? Is the concern that moving them to a header makes them part of the API? In other words, does replacing

    PyObject *
    PyFile_FromString(char *name, char *mode)
    {
        extern int fclose(FILE *);
        ...
    }

with #include <stdio.h> mean that the stdio.h needs to be included from then on, even if PyFile_FromString stops relying upon it? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Merging flow
Nick Coghlan wrote: For now it looks like we might have to maintain 3.0 manually, with svnmerge only helping out for trunk-2.6 and trunk-py3k Does it make the bookkeeping horrible if you merge from trunk straight to 3.0, and then blocked svnmerged changes from propagating? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Looking for VCS usage scenarios
David Ripton wrote:

    Time for average user to check out Python sources with bzr: 10 minutes
    Time for average user to check out Python sources with git or hg: 1 minute
    Time for average user's trivial patch to be reviewed and committed: 1 year
    I love DVCS as much as the next guy, but checkout time is so not the bottleneck for this use case.

I think Paul's point is that he wants to support people who have not previously planned to contribute to python. Writing the patch may be a matter of minutes, once they implement the fix for themselves. Downloading a new VCS is a major commitment of time and disk space. (And is there setup, and dealing with proxies?) It doesn't take as long (calendar) as waiting for the review, but it takes long enough (clock) that people may not bother to do it. And if they don't, what was the point of switching to a DVCS? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] www.python.org/doc and docs.python.org hotfixed
For the search engine issue, is there any way we can tell robots to ignore the rewrite rules so they see the broken links? (although even that may not be ideal, since what we really want is to tell the robot the link is broken, and provide the new alternative) I may be missing something obvious, but isn't this the exact intent of HTTP response code 301 Moved Permanently http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.2 -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] syntax change justification
Nick Coghlan's explanation of what justifies a syntax change (most of message http://mail.python.org/pipermail/python-dev/2008-October/082831.html ) should probably be added to the standard docs/FAQs somewhere. At the moment, I'm not sure exactly where, though. At the moment, the Developer FAQ (http://www.python.org/dev/faq/) is mostly about using specific tools (rather than design philosophy), and Nick's explanation may be too detailed for the current Explanations section of www.python.org/dev/ Possibly as a Meta-PEP? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] backporting tests [was: [Python-checkins] r66863 - python/trunk/Modules/posixmodule.c]
In http://mail.python.org/pipermail/python-dev/2008-October/082994.html Martin v. Löwis wrote: So 2.6.0 will contain a lot of tests that have never been tested in a wide variety of systems. Some are incorrect, and get fixed in 2.6.1, and stay fixed afterwards. This is completely different from somebody introducing a new test in 2.6.4. It means that there are more failures in a maintenance release, not less as in the first case. If 2.6.1 has some (possibly accidental, but exposed to the users) behavior that is not a clear bug, it should be kept through 2.6.x. You may well want to change it in 2.7, but not in 2.6.4. Adding a test to 2.6.2 ensures that the behavior will not silently disappear because of an unrelated bugfix in 2.6.3. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Advice on numbers.py implementation of binary mixins.
Raymond Hettinger wrote: PEP-3141 outlines an approach to writing binary operators to allow the right operand to override the operation if the left operand inherits the operation from the ABC. Here is my first approximation at how to write them for the Integral mixins:

    class Integral(Rational):
        def __and__(self, other):
            if isinstance(other, (type(self), int, long)):   # XXX
                return int(self) & int(other)

I think for this mixin, it doesn't matter whether other is an Integral instance; it matters whether it has a more specific solution. So instead of checking with isinstance, check whether its __rand__ method is Integral.__rand__. I think you also may want to guard against incomplete right-hand operations, by doing something like replacing the simple return NotImplemented with

    try:
        val = other.__rand__(self)
        if val is not NotImplemented:
            return val
    except (TypeError, AttributeError):
        pass
    # Use the generic fallback after all
    return int(self) & int(other)

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Proposal: add odict to collections
The odict (as proposed here, ordered on time of key insertion) looks like a close match to the dlict needed by some of the optimization proposals. http://python.org/dev/peps/pep-0267/ -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Betas today - I hope
On 6/12/08, Nick Coghlan [EMAIL PROTECTED] wrote: documentation patch for the language reference ... following categories: ... 2. Method lookup MAY bypass __getattribute__, shadowing the attribute in the instance dictionary MAY have ill effects. (slots such as __enter__ and __exit__ that are looked up via normal attribute lookup in CPython will fit into this category) Should this category really be enumerated? I thought that was the default meaning of __name__, so the real clarification is: (1) Requiring that the specific names in category 1 MUST be treated this way. (2) Mentioning __*__ and listing any known exceptions. (Can next be treated this way despite the lack of __*__? Is it forbidden to treat __context__ this way?) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
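Category 2 is already observable behaviour in CPython; a small demonstration of the instance-dict shadow being ignored for the implicit lookup:

    class C:
        def __len__(self):
            return 1

    c = C()
    c.__len__ = lambda: 2    # shadow the slot in the instance dictionary
    print(len(c))            # 1 -- the implicit special-method lookup bypasses the instance
    print(c.__len__())       # 2 -- an explicit attribute access still sees the shadow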
[Python-Dev] Mini-Pep: An Empty String ABC
So, apart from compatibility purposes, what is the point currently of *not* directly subclassing str? To provide your own storage format, such as a view into existing data. Whether or not this is actually practical is a different question; plenty of C code tends to assume it can use the internals of str directly, which breaks on even some subclasses. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PEP 8 vs PEP 371: Additional Discussion
Guido van Rossum wrote: I consider multiprocessing a new API -- while it bears a superficial resemblance with the threading API the similarities are just that, and it should not be constrained by mistakes in that API. The justification for including it is precisely that it is *not* a new API. For multiple processes in general, there are competing APIs, which may well be better. The advantage of this API is that (in many cases) it is a drop-in replacement for threading. If that breaks, then there really isn't any reason to include it in the stdlib yet. This doesn't prevent changing the joint API to conform with PEP 8. But why clean this module while leaving the rest of the stdlib? Because there is a volunteer only makes sense if changes to the other modules would also be welcomed. Is there some reason to believe that changes in the threading API are much less disruptive than changes elsewhere in the stdlib? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Iterable String Redux (aka String ABC)
On 5/27/08, Benji York wrote: Guido van Rossum wrote: Armin Ronacher wrote: Basically *the* problematic situation with iterable strings is something like a `flatten` function that flattens out every iterable object except of strings. I'm not against this, but so far I've not been able to come up with a good set of methods to endow the String ABC with. Another problem is that not everybody draws the line in the same place -- how should instances of bytes, bytearray, array.array, memoryview (buffer in 2.6) be treated? Maybe the opposite approach would be more fruitful. Flattening is about removing nested containers, so perhaps there should be an ABC that things like lists and tuples provide, but strings don't. No idea what that might be. It isn't really stringiness that matters, it is that you have to terminate even though you still have an iterable container. The test is roughly (1==len(v) and v[0]==v), except that you want to stop a layer sooner. Guido had at least a start in Searchable, back when ABC were still in the sandbox: http://svn.python.org/view/sandbox/trunk/abc/abc.py?rev=55321&view=auto Searchable represented the fact that (x in c) != (x in iter(c)) because of sequence searches like ("Error" in results) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
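For concreteness, the kind of flatten being talked about, with the string special case spelled out by hand -- exactly the test an ABC (or a Searchable/atomic marker) would replace:

    def flatten(items):
        for v in items:
            if isinstance(v, str):    # stop a layer sooner for string-like leaves
                yield v
                continue
            try:
                it = iter(v)
            except TypeError:
                yield v               # not iterable at all: a plain leaf
            else:
                for x in flatten(it):
                    yield x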
[Python-Dev] Encoding detection in the standard library?
David Wolever wrote: IMO, encoding estimation is something that many web programs will have to deal with, so it might as well be built in; I would prefer the option to run `text=input.encode('guess')` (or something similar) than relying on an external dependency or worse yet using a hand-rolled algorithm The (still draft) html5 spec is trying to get error-correction standardized, so it includes all sort of if this fails, do X. Encoding detection will be standardized, so there will be an external standard that we can reference. http://dev.w3.org/html5/spec/Overview.html#determining Note that this portion of the spec is probably not stable yet, as there was some new analysis on which wrong answers provided better results on real world web pages. e.g., http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-March/014127.html http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-March/014190.html There was also a recent analysis of how many characters it takes to sniff successfully X% of the time on today's web, though I'm not finding it at the moment. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] unscriptable?
I dispute this. Indices aren't necessarily numeric (think of an A-Z file). Python has recently added an __index__ slot which means "as an integer, and I really am an integer; I'm not just rounding like int(3.4) would do". So in the context of python, an index is numeric, whereas subscript has already been used for hashtables with arbitrary keys. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
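The distinction in a few lines: __index__ means "use me as an exact integer" wherever a numeric index is required, while dict subscripting accepts arbitrary hashable keys (the Nth class is illustrative).

    class Nth:
        def __init__(self, n): self.n = n
        def __index__(self): return self.n

    letters = ["a", "b", "c", "d"]
    print(letters[Nth(2)])   # 'c' -- sequences demand something integer-like
    # letters[2.5] raises TypeError: float deliberately has no __index__
    print({"A": 1}["A"])     # dict subscripts are arbitrary hashable keys, not indices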
[Python-Dev] windows (was: how to easily consume just the parts of eggs that are good for you)
Are the Linux users happy with having a Python package manager that ignores RPM/apt? Why should Windows users be any happier? Because, as you noted, the add/remove programs application is severely limited. I've read one too many Windows is so broken that people who use it obviously don't care about doing things right postings this week I'm honestly not sure that such fine-grained control is the right user interface, particularly for a non-shared system. But even if it were, Windows doesn't really have it, and it isn't so valuable that a solution which works only for python could do much better than the existing 3rd-party setup tools. As a windows user, I don't want python packages showing up in the add/remove programs list, because it won't help me, and will make the few times I do use that tool even more awkward. That said, I agree that if python does package management, offering windows users the choice of using that application is probably a good idea. The catch is that package managers seem to offer far more fine-grained power (even without dependency resolution) than windows. Duplicating this would add lots of complexity just for windows -- and still might not be all that useful. I'm already used to looking for an uninstall.exe in the directory of anything I can actually uninstall, and accepting that most things just don't go away cleanly. As a programmer, this feels wrong, but ... it is probably a good tradeoff for the time I don't want to spend maintaining things. If I really wanted a fancy tool that took care of dependencies and alternate versions, I would be willing to run something python-specific, or to treat each package as a subcomponent that I managed through Change an existing program applied to python. But realistically, I don't see such a tool being used often enough to justify inclusion in the core. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Proposal: from __future__ import unicode_string_literals
Maybe it's not apparent to people that hasn't developed in that kind of environment, and I'm sorry I'm not able to make this clearer. I think I understand the issue. Some contributors will be running under 2.6, others will be running under 3.0. Either the code forks, or one of them is working with (and developing patches against) the result of a compilation step, instead of against the original source code. For example, if I'm using the (real source) py2.6 code, and I create a patch that works for me, it is ready for testing and submission. If I'm using the (generated) py3 code, then I first have to get a copy of the (source) 2.6, figure out how I *would* have written it there, then keep tweaking it so that the generator eventually puts out ... what I had originally written by hand. My (working in 3.0) task would be easier if there is also a 3to2 (so that I can treat my own code as the source), but then entire files will do flip-flops on a regular basis (depening on which version was generated), unless 2to3 and 3to2 somehow create a perfect round-trip. And that compile step -- it can be automated, but I suspect most python projects don't yet have a good place to put the hooks, simply because they haven't needed to in the past. The end result is that the barrier to contributions becomes much higher for people working in at least one of the versions. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] r61709 - python/trunk/Doc/library/functions.rst python/trunk/Doc/library/future_builtins.rst python/trunk/Doc/library/python.rst
What is the precise specification of the builtin print function. Does it call str, or does it just behave as if the builtin str had been called? In 2.5, the print statement ignores any overrides of the str builtin, but I'm not sure whether a _function_ should -- and I do think it should be specified. -jJ On 3/21/08, georg.brandl [EMAIL PROTECTED] wrote: New Revision: 61709 == +++ python/trunk/Doc/library/functions.rst Fri Mar 21 20:37:57 2008 @@ -817,6 +817,33 @@ ... +.. function:: print([object, ...][, sep=' '][, end='\n'][, file=sys.stdout]) ... + All non-keyword arguments are converted to strings like :func:`str` does ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
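The ambiguity can be shown directly (the override is purely illustrative); current CPython converts at the C level, so the shadowed builtin is ignored -- which is the behaviour the docs arguably ought to spell out one way or the other:

    import builtins

    builtins.str = lambda obj: "SHADOWED"   # illustrative override of the builtin

    print(42)        # CPython prints 42: the conversion does not consult the name "str"
    print(str(42))   # prints SHADOWED: an explicit call does go through name lookup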
[Python-Dev] windows standard [was: PEP 365 (Adding the pkg_resources module)]
Terry Reedy The standard (and to me, preferable) way of dealing with such things is to have an 'installation manager' that can reinstall as well as delete and that has a check box for various things to delete. This is what Python needs. Paul Moore: I'd dispute strongly that this is a standard. It may be preferable, but I'm not sure where you see evidence of it being a standard. When I install a large program (such as developer tools, or python itself) on Windows, I expect a choice of default or custom. When I choose custom, I expect a list of components, which can be chosen, not chosen, or mixed (meaning that it has subcomponents, only some of which are chosen). The whole thing only shows up once in Add/Remove programs. If I select it, I do get options to Change or Repair. These let me change my mind on which subcomponents are installed. If I install python and then separately install Zope, it may or may not make sense for Zope to be listed separately as a program to Add or Remove. It does not make sense (to me anyhow) have several individual packages within Zope each listed independently at the Windows level. (Though, to be fair, many (non-python) applications *do* make more than one entry.) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] logging shutdown (was: Re: r61431 - python/trunk/Doc/library/logging.rst)
On 3/19/08, Vinay Sajip [EMAIL PROTECTED] wrote: I think (repeatedly) testing an app through IDLE is a reasonable use case. [other threads may still have references to loggers or handlers] Would it be reasonable for shutdown to remove logging from sys.modules, so that a rerun has some chance of succeeding via its own import? I'm not sure that would be enough in the scenario I mentioned above - would removing a module from sys.modules be a guarantee of removing it from memory? No. It will explicitly not be removed from memory while anything holds a live reference. Removing it from sys.modules just means that the next time a module does import logging, the logging initialization code will run again. It is true that this could cause contention if the old version is still holding an exclusive lock on some output file. It's safer, in my view, for the developer of an application to do cleanup of their app if they want to test repeatedly in IDLE. Depending on the issue just fixed, the app may not have a clean shutdown. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] logging shutdown (was: Re: [Python-checkins] r61431 - python/trunk/Doc/library/logging.rst)
I think (repeatedly) testing an app through IDLE is a reasonable use case. Would it be reasonable for shutdown to remove logging from sys.modules, so that a rerun has some chance of succeeding via its own import? -jJ On 3/16/08, vinay.sajip [EMAIL PROTECTED] wrote: Author: vinay.sajip Date: Sun Mar 16 22:35:58 2008 New Revision: 61431 Modified: python/trunk/Doc/library/logging.rst Log: Clarified documentation on use of shutdown(). Modified: python/trunk/Doc/library/logging.rst == --- python/trunk/Doc/library/logging.rst(original) +++ python/trunk/Doc/library/logging.rstSun Mar 16 22:35:58 2008 @@ -732,7 +732,8 @@ .. function:: shutdown() Informs the logging system to perform an orderly shutdown by flushing and - closing all handlers. + closing all handlers. This should be called at application exit and no + further use of the logging system should be made after this call. .. function:: setLoggerClass(klass) ___ Python-checkins mailing list [EMAIL PROTECTED] http://mail.python.org/mailman/listinfo/python-checkins ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
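The suggestion in concrete form (whether it is actually enough when other threads still hold handler references is the open question):

    import logging, sys

    logging.shutdown()                  # flush and close all handlers
    sys.modules.pop("logging", None)    # proposed: forget the module object...
    import logging                      # ...so the next run re-initializes from scratch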
[Python-Dev] Py_CLEAR to avoid crashes
A simple way to do this would be to push objects whose refcounts had reached 0 onto a list instead of finalizing them immediately, and have PyEval_EvalFrameEx periodically swap in a new to-delete list and delete the objects on the old one. Some of the memory management threads discussed something similar to this, and pointed to IBM papers on Java. By adding states like tenatively finalizable, the cost of using multiple processors was reduced. The down side is that objects which could be released (and recycled) immediately won't be -- which slows down a fair number of real-world programs that are used to the CPython refcount model. If the resource not being immediately released is scarce (such as file handles), it gets even worse. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
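The scarce-resource cost in miniature: with CPython's immediate refcounting the handle is released at the del, while a deferred to-delete list would keep it open until the next drain.

    f = open("scratch.tmp", "w")
    f.write("data")
    del f
    # Today: the last reference dies here, so the buffer is flushed and the OS
    # handle closed immediately.  Under a deferred-finalization scheme both
    # would wait for the eval loop to drain its to-delete list.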
Re: [Python-Dev] [Python-3000] Rounding Decimals
On 1/12/08, Guido van Rossum [EMAIL PROTECTED] wrote: On Jan 12, 2008 5:09 PM, Jeffrey Yasskin [EMAIL PROTECTED] wrote: During the discussion about the new Rational implementation (http://bugs.python.org/issue1682), Guido and Raymond decided that Decimal should not implement the new Real ABC from PEP 3141. So I've closed http://bugs.python.org/issue1623 and won't be pursuing any of the extra rounding methods mentioned on this thread. Well, I didn't really decide anything. I suggested that if the developers of Decimal preferred it, it might be better if Decimal did not implement the Real ABC, and Raymond said he indeed preferred it. I read his objections slightly differently. He is very clear that Decimal itself should be restricted to the standard, and therefore should not natively implement the extensions. But he also said that it might be reasonable for another package to subset or superset it in a friendlier way. numbers.py is a different module, which must be explicitly imported. If the objection is that decimal.Decimal(43.2).imag would work (instead of throwing an exception) only when numbers.py has already been imported, then ... well, that confusion is inherent in the abstract classes. Or is the problem that it *still* wouldn't work, without help from the decimal module itself? In that case, 3rd party registration is fairly limited, and this might be a good candidate for trying to figure out ABCs and adapters *should* work together. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
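The third-party registration path mentioned above, for concreteness: register() only affects isinstance/issubclass checks and adds no attributes, which is why cooperation from the decimal module itself is the sticking point.

    import numbers
    from decimal import Decimal

    numbers.Real.register(Decimal)                     # registration by a third party
    print(isinstance(Decimal("43.2"), numbers.Real))   # True once registered
    # ...but whether Decimal("43.2").imag then works still depends entirely on
    # what the decimal module itself chooses to implement.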
[Python-Dev] Coverity Scan, Python upgraded to rung 2
Neal Norwitz wrote: For codeobject.c, line 327 should not be reachable. ... Christian Heimes wrote: Please suppress the warning. I removed the last two lines and GCC complained ... Either way, it would be worth adding a comment to the source code so this doesn't come up again. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Non-string keys in namespace dicts
PJE wrote: Isn't the simplest way to cache attribute lookups to just have a cache dictionary in the type, and update that dictionary whenever a change is made to a superclass? That's essentially how __slotted__ attribute changes on base classes work now, isn't it? Neil Toronto wrote: The nice thing about caching pointers to dict entries is that they don't change as often as values do. Is this really true for namespaces? I was thinking that the typical namespace usage is a bunch of inserts (possibly with lookups mixed in), followed by never changing it again until it is deallocated. There are fewer ways to invalidate an entry pointer: inserting set, resize, clear, and delete. I'm not sure how to resize without an inserting set. I'm not sure I've ever seen clear on a namespace. (I have seen it on regular dicts being used as a namespace, such as tcl config options.) I have seen deletes (deleting a temp name) and non-inserting sets ... but they're both rare enough that letting them force the slow path might be a good trade, if the optimization is otherwise simpler. Rare updating also means it's okay to invalidate the entire cache rather than single entries. Changing __bases__ seems to do that already. (See http://svn.python.org/view/python/trunk/Objects/typeobject.c?rev=59106&view=markup functions like update_subclasses.) So I think an alternate version of PJE's question would be: Why not just extend that existing mechanism to work on non-slot, non-method attributes? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] SSL 1.7
Bill Janssen wrote: One thing to watch out for: ssl.SSLError can't inherit from socket.error, as it does in 2.6+, Why not? -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
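For context, a sketch of what the 2.6+ arrangement buys callers: because ssl.SSLError inherits from socket.error (which is an alias of OSError on Python 3.3+), one except clause covers both plain-socket and TLS failures:

    import socket
    import ssl

    print(issubclass(ssl.SSLError, socket.error))   # True on 2.6+ / 3.x

    try:
        raise ssl.SSLError(1, "handshake failure")
    except socket.error as exc:                      # catches the SSL error too
        print("caught via socket.error:", exc)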
[Python-Dev] urllib exception compatibility
urllib goes to some trouble to ensure that it raises IOError, even when the underlying exception comes from another module.[*] I'm wondering if it would make sense to just have those modules' exceptions inherit from IOError. In particular, should socket.error, ftp.Error and httplib.HTTPException (used in Py3K) inherit from IOError? I'm also wondering whether it would be acceptable to change the details of the exceptions. For example, could raise IOError, ('ftp error', msg), sys.exc_info()[2] be reworded, or is there too much risk that someone is checking for an errno of 'ftp error'? [*] This isn't a heavily tested path; some actually fail with a TypeError since 2.5, because IOError no longer accepts argument tuples longer than 3. http://bugs.python.org/issue1209 Fortunately, this makes me less worried about changing the contents of the specific attributes to something more useful... -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
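A sketch of the hierarchy change being proposed; the FTPError name here is hypothetical (today's ftplib.Error does not inherit from IOError in the versions under discussion):

    # Hypothetical: if the lower-level exception were an IOError subclass,
    # urllib would not need to catch it and re-raise a hand-built IOError.
    class FTPError(IOError):
        pass

    try:
        raise FTPError("ftp error", "login failed")
    except IOError as exc:
        print("caught as IOError:", exc.args)   # ('ftp error', 'login failed')

Callers who already catch IOError would keep working; the question above is whether anyone depends on the exact args/errno of the rewrapped exception.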
[Python-Dev] PEP 362: Signature objects
Brett Cannon wrote: A Signature object has the following structure attributes: * name : str Name of the function. This is not fully qualified because function objects for methods do not know the class they are contained within. This makes functions and methods indistinguishable from one another when passed to decorators, preventing proper creation of a fully qualified name. (1) Would this change with the new static __class__ attribute used for the new super? (2) What about functions without a name? Do you want to say str or NoneType, or is that assumed? (3) Is the Signature object live or frozen? (name is writable ... will the Signature object reflect the new name, or the name in use at the time it was created?) * var_annotations: dict(str, object) Dict that contains the annotations for the variable parameters. The keys are of the variable parameter with values of the Is there a special key for the ``->`` return annotation, or is that available as a separate property? The structure of the Parameter object is: * name : (str | tuple(str)) The name of the parameter as a string if it is not a tuple. If the argument is a tuple then a tuple of strings is used. What is used for unnamed arguments (typically provided by C)? I like None, but I see the arguments for both and missing attribute. * position : int The position of the parameter within the signature of the function (zero-indexed). For keyword-only parameters the position value is arbitrary while not conflicting with positional parameters. Is this just a property/alias for signature.parameters.index(self) ? What should a parameter object not associated with a specific signature return? -1, None, or missing attribute? Is there a way to get the associated Signature, or is it compiled out when the Signature and its child Parameters are first constructed? (I think the position property is the only attribute that would use it, unless you want some of the other attributes -- like annotations -- to be live.) ... I would also like to see a * value : object attribute; this would be missing on most functions, but might be filled in on a Signature representing a closure, or an execution frame. When to construct the Signature object? --- The Signature object can either be created in an eager or lazy fashion. In the eager situation, the object can be created during creation of the function object. Since most code doesn't need it, I would expect it to be optimized out at least as often as docstrings are. In the lazy situation, one would pass a function object to a function and that would generate the Signature object and store it to ``__signature__`` if needed, and then return the value of ``__signature__``. Why store it? Do you expect many use cases to need the signature more than once (but not to save it themselves)? If there is a __signature__ attribute on an object, you have to specify whether it can be replaced, which parts of it are writable, how that will affect the function's own behavior, etc. I also suspect it might become a source of heisenbugs, like the reference leaks that were really DUMMY items in a dict. If the Signature is just a snapshot no longer attached to the original function, then people won't expect changes to the Signature to affect the callable. Should ``Signature.bind`` return Parameter objects as keys? (see above) If a Signature is a snapshot (rather than a live part of the function), then it might make more sense to just add a value attribute to Parameter objects. Provide a mapping of parameter name to Parameter object? 
While providing access to the parameters in order is handy, it might also be beneficial to provide a way to retrieve Parameter objects from a Signature object based on the parameter's name. Which style of access (sequential/iteration or mapping) will influence how the parameters are stored internally and whether __getitem__ accepts strings or integers. I think it should accept both. What storage mechanism to use is an internal detail that should be left to the implementation. I wouldn't expect Signature inspection to be inside a tight loop anyhow, unless it were part of a Generic Function dispatch engine ... and those authors (just PJE?) can optimize on what they actually need. Remove ``has_*`` attributes? If an EAFP approach to the API is taken, Please leave them; it is difficult to catch Exceptions in a list comprehension. Have ``var_args`` and ``_var_kw_args`` default to ``None``? Makes sense to me, particularly since it should probably be consistent with function name, and that should probably be None. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev
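For reference, a short sketch of this kind of introspection using the inspect.signature API that eventually grew out of PEP 362 (attribute names below follow the final API, not the draft quoted above):

    import inspect

    def greet(name, greeting="hello", *args, punctuation="!", **kwargs) -> str:
        return "%s, %s%s" % (greeting, name, punctuation)

    sig = inspect.signature(greet)
    print(list(sig.parameters))        # ['name', 'greeting', 'args', 'punctuation', 'kwargs']
    param = sig.parameters["punctuation"]
    print(param.kind, param.default)   # keyword-only parameter, default "!"
    print(sig.return_annotation)       # the "->" annotation is a separate attribute

    bound = sig.bind("world", punctuation="?")
    bound.apply_defaults()
    print(bound.arguments)             # ordered mapping of parameter name -> bound value

In the final API the parameters ended up exposed as an ordered mapping keyed by name, and bind() returns name-keyed arguments rather than Parameter-keyed ones.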
Re: [Python-Dev] Add a -z interpreter flag to execute a zip file
On 7/14/07, Andy C [EMAIL PROTECTED] wrote: On 7/13/07, Jim Jewett [EMAIL PROTECTED] wrote: while I think it would be a bad practice to import __main__, I have seen it recommended as the right place to store global (cross-module) settings. Where? People use __main__.py now? No; they don't use a file. It is treated as a strictly dynamic scratchpad, and they do things like:

    import __main__
    __main__.DEBUGLEVEL = 5

    if __main__.myvar:
        ...

-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Add a -z interpreter flag to execute a zip file
Andy C wrote: ... a .zip file with a __zipmain__.py module at its root? Why not just an __init__.py, which you would normally execute if you tried to import/run a directory? * Magically looking at the first argument to see if it's a zip file seems problematic to me. I'd rather be explicit with the -z flag. Likewise, I'd rather be explicit and call it __zipmain__ rather than __main__. Treating zip files (and only zip files) as a special case equivalent to uncompressed files seems like a wart; I would prefer not to special-case zips any more than they already are. If anything, I would like to see the -m option enhanced so that if it gets a recognized collection file type (including a directory or zip), it does the right thing. Whether that actually makes sense, or defeats the purpose of the -m shortcut, I'm not sure. [on using __main__ instead of __init__ or __zipmain__] __main__.py? : ) If someone tries to import __main__ from another module in the program, won't that result in an infinite loop? It doesn't today; it does use circular imports, which can be a problem. while I think it would be a bad practice to import __main__, I have seen it recommended as the right place to store global (cross-module) settings. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
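For context, a sketch of the behaviour that eventually shipped (Python 2.6+): a __main__.py at the root of the archive is the entry point, with no extra flag and no __zipmain__:

    import subprocess
    import sys
    import zipfile

    # Build a tiny "executable" archive with __main__.py at its root.
    with zipfile.ZipFile("app.zip", "w") as zf:
        zf.writestr("__main__.py", 'print("running from inside the zip")\n')

    # Equivalent to typing "python app.zip" at a shell prompt.
    subprocess.call([sys.executable, "app.zip"])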
[Python-Dev] [RFC] urlparse - parse query facility
a) import cgi and call cgi module's parse_qs. [circular imports] or b) Implement a stand-alone query parsing facility in urlparse *AS IN* cgi module. Assuming (b), please remove the (code for the) parsing from the cgi module, and just import it back from urlparse (or urllib). Since cgi already imports urllib (which imports urlparse), this isn't adding any dependencies -- but it keeps the code in a single location. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
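For reference, this is essentially how it later landed: the query-string parser lives in the URL-parsing module and cgi keeps only a deprecated wrapper. A quick sketch with the Python 3 names (urllib.parse there; urlparse in the 2.x line discussed above):

    from urllib.parse import parse_qs

    # The same helper the cgi module used to own; one implementation, one home.
    print(parse_qs("a=1&a=2&b=3"))   # {'a': ['1', '2'], 'b': ['3']}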
[Python-Dev] svn viewer confused
Choosing a revision, such as http://svn.python.org/view/python/trunk/Objects/?rev=55606&sortby=date&view=log does not lead to the correct generated page; it either times out or generates a much older changelog. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Wither PEP 335 (Overloadable Boolean Operators)?
Greg, If you do update this PEP, please update the __not__ portion as well, at least regarding possible return values. It currently says that __not__ can return NotImplemented, which falls back to the current semantics. (Why? to override an explicit __not__? Then why not just put the current semantics on __object__, and override by calling that directly?) It does not yet say what will happen for objects that return something else outside of {True, False}, such as:

    class AntiBool(object):
        def __not__(self):
            return self

Is that OK, because not not X should now be spelled bool(x), and you haven't allowed the overriding of __bool__? (And, if so, how does that work in Py3K?) -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
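For contrast, a sketch of the current (non-PEP-335) rules: ``not`` funnels through __bool__ (__nonzero__ in 2.x) and always produces a real bool, so nothing like AntiBool is expressible today:

    class Weird(object):
        def __bool__(self):       # __nonzero__ in Python 2
            return False          # must return a bool; returning self raises TypeError

    w = Weird()
    print(not w)          # True -- 'not' simply negates bool(w)
    print(type(not w))    # <class 'bool'>, never an instance of Weird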
[Python-Dev] The docs, reloaded
Martin v. Löwis schrieb: That docutils happens to be written in Python should make little difference - it's *not* part of the Python language project, and is just a tool for us, very much like latex and latex2html. Not entirely. When I first started looking at python, I read a lot of documentation. Now I don't read it so much; the time when I could easily suggest doc changes without explicitly setting time aside has passed. At that time, the barriers to submitting were fairly large; these are the ones I remember: (1) Not realizing that I *could* submit changes, and they would be welcome. (2) Not understanding it well enough to document it correctly. (3) Not having easy access to the source -- I didn't want to retype it, or to edit html only to find out it was maintained in some other format. Even once I found the cvs repository, the docs weren't in the main area. (4) Not having an easy way to submit the changes quickly. (5) Wanting to check my work, in case I was wrong. I have no idea how to fix (1) and (2). Putting them on a wiki improves the situation with (3) and (4). (5) is at least helped by keeping the formatting requirements as simple as possible (not sure if ReST does this or not) and by letting me verify them before I submit. Getting docutils is already a barrier; I would like to see a stdlib module (not a script hidden off to the side) for verification and conversion. I don't think I installed docutils myself until I started to write a PEP. But once I did download and install and figure out how to call it ... at least it generally worked, and ran with something (python) I was already using. Getting a 3rd party tool that ends up requiring fourth party tools (such as LaTeX, but then I need a viewer, or the old toolchain that required me to install Perl) ... took longer than my attention span. This was despite the fact that I had already used all the needed tools in previous years; they just weren't installed on the machines I had at the time ... and installing them on windows was something that would *probably* work *eventually*. If I had been new to programming, it would have been even more intimidating. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] updated PEP3125, Remove Backslash Continuation
Major rewrite. The inside-a-string continuation is separated from the general continuation. The alternatives section is expanded to also list Andrew Koenig's improved inside-expressions variant, since that is a real contender. If anyone feels I haven't acknowledged their concerns, please tell me.

--

PEP: 3125
Title: Remove Backslash Continuation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007, 04-May-2007

Abstract
========

Python initially inherited its parsing from C.  While this has been generally useful, there are some remnants which have been less useful for python, and should be eliminated.  This PEP proposes elimination of terminal ``\`` as a marker for line continuation.

Motivation
==========

One goal for Python 3000 should be to simplify the language by removing unnecessary or duplicated features.  There are currently several ways to indicate that a logical line is continued on the following physical line.  The other continuation methods are easily explained as a logical consequence of the semantics they provide; ``\`` is simply an escape character that needs to be memorized.

Existing Line Continuation Methods
==================================

Parenthetical Expression - ([{}])
---------------------------------

Open a parenthetical expression.  It doesn't matter whether people view the line as continuing; they do immediately recognize that the expression needs to be closed before the statement can end.

An example using each of (), [], and {}::

    def fn(long_argname1,
           long_argname2):
        settings = {"background": "random noise",
                    "volume": "barely audible"}
        restrictions = ["Warrantee void if used",
                        "Notice must be received by yesterday",
                        "Not responsible for sales pitch"]

Note that it is always possible to parenthesize an expression, but it can seem odd to parenthesize an expression that needs them only for the line break::

    assert val > 4, (
        "val is too small")

Triple-Quoted Strings
---------------------

Open a triple-quoted string; again, people recognize that the string needs to finish before the next statement starts. ::

    banner_message = """
        Satisfaction Guaranteed,
        or DOUBLE YOUR MONEY BACK!!!

        some minor restrictions apply"""

Terminal ``\`` in the general case
----------------------------------

A terminal ``\`` indicates that the logical line is continued on the following physical line (after whitespace).  There are no particular semantics associated with this.  This form is never required, although it may look better (particularly for people with a C language background) in some cases::

    assert val > 4, \
           "val is too small"

Also note that the ``\`` must be the final character in the line.  If your editor navigation can add whitespace to the end of a line, that invisible change will alter the semantics of the program.  Fortunately, the typical result is only a syntax error, rather than a runtime bug::

    assert val > 4, \ 
           "val is too small"

    SyntaxError: unexpected character after line continuation character

This PEP proposes to eliminate this redundant and potentially confusing alternative.

Terminal ``\`` within a string
------------------------------

A terminal ``\`` within a single-quoted string, at the end of the line.  This is arguably a special case of the terminal ``\``, but it is a special case that may be worth keeping. ::

    "abd\
 def"  ==  'abd def'

+ Many of the objections to removing ``\`` termination were really just objections to removing it within literal strings; several people clarified that they want to keep this literal-string usage, but don't mind losing the general case.
+ The use of ``\`` for an escape character within strings is well known.

- But note that this particular usage is odd, because the escaped character (the newline) is invisible, and the special treatment is to delete the character.  That said, the ``\`` of ``\(newline)`` is still an escape which changes the meaning of the following character.

Alternate Proposals
===================

Several people have suggested alternative ways of marking the line end.  Most of these were rejected for not actually simplifying things.

The one exception was to let any unfinished expression signify a line continuation, possibly in conjunction with increased indentation.  This is attractive because it is a generalization of the rule for parentheses.  The