Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread Stephen J. Turnbull
PJ Eby writes:

  By clear, I mean free of prior assumptions.

Ah, well, I guess I've just run into a personal limitation.  I can't
imagine thinking that is free of prior assumptions.  Not my
ownwink/, and not by others, either.

So, unfortunately, I was left with the conventional opposition in
thinking: clear vs. muddy.  That impression was only strengthened
by the phrase vs. simply copying fields from distro tools.

  Put together, the phrase clear thinking on concrete use cases means
  (at least to me), dropping all preconceptions of the existing design
  and starting over from square one, to ask how best the problem may be
  solved, using specific examples as a guide rather than using
  generalities.

Sure, but ISTM that's the opposite of what you've actually been doing,
at least in terms of contributing to my understanding.  One obstacle
to discussion you have contributed to overcoming in my thinking is the
big generality that the packager (ie, the person writing the metadata)
is in a position to recommend good behavior to the installation
tool, vs. being in a position to point out relevant considerations
for users and tools installing the packager's product.

Until that generality is formulated and expressed, it's very difficult
to see why the examples and particular solutions to use cases that
various proponents have described fail to address some real problems.
It was difficult for me to see, at first, what distinction was
actually being made.

Specifically, I thought that the question about Obsoletes vs.
Obsoleted-By was about which package should be considered
authoritative about obsolescence.  That is a reasonable distinction
for that particular discussion, but there is a deeper, and general,
principle behind that.  Namely, metadata is descriptive, not
prescriptive.  Of course once one understands that principle, the
names of the fields don't matter so much, but it is helpful for
naive users of the metadata if the field names strongly connote
description of the package rather than behavior of the tool.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Serhiy Storchaka

On 10.12.12 05:30, Raymond Hettinger wrote:

On Dec 9, 2012, at 10:03 PM, MRAB pyt...@mrabarnett.plus.com
mailto:pyt...@mrabarnett.plus.com wrote:

What happens when a key is deleted? Will that leave an empty slot in
the entry array?

Yes.  See the __delitem__() method in the pure python implemention
at http://code.activestate.com/recipes/578375


You can move the last entry on freed place. This will preserve 
compactness of entries array and simplify and speedup iterations and 
some other operations.


def __delitem__(self, key, hashvalue=None):
if hashvalue is None:
hashvalue = hash(key)
found, i = self._lookup(key, hashvalue)
if found:
index = self.indices[i]
self.indices[i] = DUMMY
self.size -= 1
if index != size:
lasthash = self.hashlist[index]
lastkey = self.keylist[index]
found, j = self._lookup(lastkey, lasthash)
assert found
assert i != j
self.indices[j] = index
self.hashlist[index] = lasthash
self.keylist[index] = lastkey
self.valuelist[index] = self.valuelist[size]
index = size
self.hashlist[index] = UNUSED
self.keylist[index] = UNUSED
self.valuelist[index] = UNUSED
else:
raise KeyError(key)


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Serhiy Storchaka

On 10.12.12 09:48, Christian Heimes wrote:

On the other hand every lookup and collision checks needs an additional
multiplication, addition and pointer dereferencing:

   entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx


I think that main slowdown will be in getting index from array with 
variable size of elements. It requires conditional jump which is not 
such cheap as additional or shifting.


switch (self-index_size) {
case 1: idx = ((uint8_t *)self-indices)[i]; break;
case 2: idx = ((uint16_t *)self-indices)[i]; break;
...
}

I think that variable-size indices don't worth effort.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Mark Shannon

On 10/12/12 01:44, Raymond Hettinger wrote:

The current memory layout for dictionaries is
unnecessarily inefficient.  It has a sparse table of
24-byte entries containing the hash value, key pointer,
and value pointer.

Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.


What minimum size and resizing factor do you propose for the entries array?

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Do more at compile time; less at runtime

2012-12-10 Thread Victor Stinner
Do know my registervm project? I try to emit better bytecode. See the
implementation on

http://hg.python.org/sandbox/registervm

See for example the documentation:

hg.python.org/sandbox/registervm/file/tip/REGISTERVM.txt
 http://hg.python.org/sandbox/registervm

I implemented other optimisations. See also WPython project which uses also
more efficient bytecode.

http://code.google.com/p/wpython/

Victor
Le 9 déc. 2012 23:23, Mark Shannon m...@hotpy.org a écrit :

 Hi all,

 The current CPython bytecode interpreter is rather more complex than it
 needs to be. A number of bytecodes could be eliminated and a few more
 simplified by moving the work involved in handling compound statements
 (loops, try-blocks, etc) from the interpreter to the compiler.

 This simplest example of this is the while loop...
 while cond:
body

 This currently compiled as

 start:
if not cond goto end
body
goto start
 end:

 but it could be compiled as

 goto test:
 start:
 body
 if cond goto start

 which eliminates one instruction per iteration.

 A more complex example is a return in a try-finally block.

 try:
 part1
 if cond:
 return X
 part2
 finally:
 part3

 Currently, handling the return is complex and involves pseudo
 exceptions, but if part3 were duplicated by the compiler, then the RETURN
 bytecode could just perform a simple return.
 The code above would be compiled thus...

 PUSH_BLOCK try
 part1
 if not X goto endif
 push X
 POP_BLOCK
 part3duplicated
 RETURN_VALUE
 endif:
 part2
 POP_BLOCK
 part3duplicated

 The changes I am proposing are:

 Allow negative line deltas in the lnotab array (bytecode deltas would
 remain non-negative)
 Remove the SETUP_LOOP, BREAK and CONTINUE bytecodes
 Simplify the RETURN bytecode
 Eliminate pseudo exceptions from the interpreter
 Simplify (or perhaps eliminate) SETUP_TRY, END_FINALLY, END_WITH.
 Reverse the sense of the FOR_ITER bytecode (ie. jump on not-exhausted)


 The net effect of these changes would be:
 Reduced code size and reduced code complexity.
 A small (1-5%)? increase in speed, due the simplification of the
 bytecodes and a very small change in the number of bytecodes executed.
 A small change in the static size of the bytecodes (-2% to +2%)?

 Although this is a quite intrusive change, I think it is worthwhile as it
 simplifies ceval.c considerably.
 The interpreter has become rather convoluted and any simplification has to
 be a good thing.

 I've already implemented negative line deltas and the transformed while
 loop: 
 https://bitbucket.org/**markshannon/cpython-lnotab-**signedhttps://bitbucket.org/markshannon/cpython-lnotab-signed
 I'm currently working on the block unwinding.

 So,
 Good idea? Bad idea?
 Should I write a PEP or is the bug tracker sufficient?

 Cheers,
 Mark.







 __**_
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/**mailman/listinfo/python-devhttp://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: http://mail.python.org/**mailman/options/python-dev/**
 victor.stinner%40gmail.comhttp://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Armin Rigo
Hi Raymond,

On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
 Instead, the data should be organized as follows:

 indices =  [None, 1, None, None, None, 0, None, 2]
 entries =  [[-9092791511155847987, 'timmy', 'red'],
 [-8522787127447073495, 'barry', 'green'],
 [-6480567542315338377, 'guido', 'blue']]

As a side note, your suggestion also enables order-preserving
dictionaries: iter() would automatically yield items in the order they
were inserted, as long as there was no deletion.  People will
immediately start relying on this feature...  and be confused by the
behavior of deletion. :-/


A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Distutils] Is is worth disentangling distutils?

2012-12-10 Thread Daniel Holth
On Mon, Dec 10, 2012 at 2:22 AM, Antonio Cavallo a.cava...@cavallinux.euwrote:

 Hi,
 I wonder if is it worth/if there is any interest in trying to clean up
 distutils: nothing in terms to add new features, just a *major* cleanup
 retaining the exact same interface.


 I'm not planning anything like *adding features* or rewriting rpm/rpmbuild
 here, simply cleaning up that un-holy code mess. Yes it served well, don't
 get me wrong, and I think it did work much better than anything it was
 meant to replace it.

 I'm not into the py3 at all so I wonder how possibly it could fit/collide
 into the big plan.

 Or I'll be wasting my time?


It has been tried before. IIUC the nature of distutils and extending
distutils is that client code depends on the entire tangle. If you try to
clean it up you will break backwards compatibility.

distutils2 is designed to break backwards compatibility with distutils and
is essentially a cleaned up distutils.

Have you tried Bento? http://bento.readthedocs.org/en/latest/
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]

2012-12-10 Thread Toshio Kuratomi
On Sun, Dec 09, 2012 at 01:51:09PM +1100, Chris Angelico wrote:
 On Sun, Dec 9, 2012 at 1:11 PM, Steven D'Aprano st...@pearwood.info wrote:
  Why would a software package called Spam install a top-level module called
  Jam rather than Spam? Isn't the whole point of Python packages to solve
  this namespace problem?
 
 That would require/demand that the software package MUST define a
 module with its own name, and MUST NOT define any other top-level
 modules, and also that package names MUST be unique. (RFC 2119
 keywords.) That would work, as long as those restrictions are
 acceptable.
 
/me notes that setuptools itself is an example of a package that violates
this rule )setuptools and pkg_resources).

No objections to That would work, as long as those restrictions are
acceptable that seems to sum up where we're at.

-Toshio


pgpC3zr5ZrDtv.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]

2012-12-10 Thread Toshio Kuratomi
On Sun, Dec 09, 2012 at 12:48:45AM -0500, PJ Eby wrote:
 
 Do any of the distro folks know of a Python project tagged as
 conflicting with another for their distro, where the conflict does
 *not* involve any files in conflict?
 
In Fedora we do work to avoid most types of Conflicts (backporting fixes,
etc) but I can give some examples of where Conflivts could have been used in
the past:

In docutils prior to the latest release, certain portions of docutils was
broken if pyxml was installed (since pyxml replaces certain stdlib xml.*
functionaltiy).  So older docutils versions could have had a Conflicts:
PyXML. Nick has since provided a technique for docutils to use that loads
from the stdlib first and only goes to PyXML if the functionality is not
available there.

Various libraries in web stacks have had bugs that prevent the propser
functioning of the web framework at the top level.  In case of major issues
(security, unable to startup), these top level frameworks could use
versioned Conflicts to prevent installation.  For instance:  TurboGears
might have a Conflicts: CherryPy  2.3.1 

Note, though, that if parallel installable versions and selection of the
proper versions from that work, then this type of Conflict wouldn't be
necessary.  Instead you'd have versioned Requires: instead.

-Toshio


pgpb06wUdfxFB.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Barry Warsaw
On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote:

It's hard to predict how the extra CPU instructions are going to affect
the performance on different CPUs and scenarios. My guts tell me that
your proposal is going to perform better on modern CPUs with lots of L1
and L2 cache and in an average case scenario. A worst case scenario with
lots of collisions might be measurable slower for large dicts and small
CPU cache.

I'd be interested to see what effect this has on memory constrained platforms
such as many current ARM applications (mostly likely 32bit for now).  Python's
memory consumption is an overheard complaint for folks developing for those
platforms.

Cheers,
-Barry
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Antoine Pitrou
Le Mon, 10 Dec 2012 10:40:30 +0100,
Armin Rigo ar...@tunes.org a écrit :
 Hi Raymond,
 
 On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
 raymond.hettin...@gmail.com wrote:
  Instead, the data should be organized as follows:
 
  indices =  [None, 1, None, None, None, 0, None, 2]
  entries =  [[-9092791511155847987, 'timmy', 'red'],
  [-8522787127447073495, 'barry', 'green'],
  [-6480567542315338377, 'guido', 'blue']]
 
 As a side note, your suggestion also enables order-preserving
 dictionaries: iter() would automatically yield items in the order they
 were inserted, as long as there was no deletion.  People will
 immediately start relying on this feature...  and be confused by the
 behavior of deletion. :-/

If that's really an issue, we can deliberately scramble the iteration
order a bit :-) (of course it might negatively impact HW prefetching)

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Steven D'Aprano

On 10/12/12 20:40, Armin Rigo wrote:


As a side note, your suggestion also enables order-preserving
dictionaries: iter() would automatically yield items in the order they
were inserted, as long as there was no deletion.  People will
immediately start relying on this feature...  and be confused by the
behavior of deletion. :-/



If we want to avoid the attractive nuisance of iteration order being
almost, but not quite, order-preserving, there is a simple fix: when
iterating over the dict, instead of starting at the start of the table,
start at some arbitrary point in the middle and wrap around. That will
increase the cost of iteration slightly, but avoid misleading behaviour.

I think all we need do is change the existing __iter__ method from this:

def __iter__(self):
for key in self.keylist:
if key is not UNUSED:
yield key


to this:

# Untested!
def __iter__(self):
# Choose an arbitrary starting point.
# 103 is chosen because it is prime.
n = (103 % len(self.keylist)) if self.keylist else 0
for key in self.keylist[n:] + self.keylist[:n]:
# I presume the C version will not duplicate the keylist.
if key is not UNUSED:
yield key

This mixes the order of iteration up somewhat, just enough to avoid
misleading people into thinking that this is order-preserving. The
order will depend on the size of the dict, not the history. For
example, with keys a, b, c, ... added in that order, the output is:

len=1  key=a
len=2  key=ba
len=3  key=bca
len=4  key=dabc
len=5  key=deabc
len=6  key=bcdefa
len=7  key=fgabcde
len=8  key=habcdefg
len=9  key=efghiabcd

which I expect is mixed up enough to discourage programmers from
jumping to conclusions about dicts having guaranteed order.



--
Steven
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby
On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Le Mon, 10 Dec 2012 10:40:30 +0100,
 Armin Rigo ar...@tunes.org a écrit :
 Hi Raymond,

 On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
 raymond.hettin...@gmail.com wrote:
  Instead, the data should be organized as follows:
 
  indices =  [None, 1, None, None, None, 0, None, 2]
  entries =  [[-9092791511155847987, 'timmy', 'red'],
  [-8522787127447073495, 'barry', 'green'],
  [-6480567542315338377, 'guido', 'blue']]

 As a side note, your suggestion also enables order-preserving
 dictionaries: iter() would automatically yield items in the order they
 were inserted, as long as there was no deletion.  People will
 immediately start relying on this feature...  and be confused by the
 behavior of deletion. :-/

 If that's really an issue, we can deliberately scramble the iteration
 order a bit :-) (of course it might negatively impact HW prefetching)

On the other hand, this would also make a fast ordered dictionary
subclass possible, just by not using the free list for additions,
combined with periodic compaction before adds or after deletes.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread PJ Eby
On Sun, Dec 9, 2012 at 10:38 PM, Andrew McNabb amcn...@mcnabbs.org wrote:
 On Fri, Dec 07, 2012 at 05:02:26PM -0500, PJ Eby wrote:
 If the packages have files in conflict, they won't be both installed.
 If they don't have files in conflict, there's nothing important to be
 informed of.  If one is installing pexpect-u, then one does not need
 to discover that it is a successor of pexpect.

 In the specific case of pexpect and pexpect-u, the files don't actually
 conflict.  The pexpect package includes a pexpect.py file, while
 pexpect-u includes a pexpect/ directory.  These conflict, but not in
 the easily detectable sense.

Excellent!  A concrete non-file use case.  Setuptools handles this
particular scenario by including a list of top-level module or package
names, but newer tools ought to look out for this scenario, too.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Antoine Pitrou
Le Mon, 10 Dec 2012 11:16:49 -0500,
PJ Eby p...@telecommunity.com a écrit :

 On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou
 solip...@pitrou.net wrote:
  Le Mon, 10 Dec 2012 10:40:30 +0100,
  Armin Rigo ar...@tunes.org a écrit :
  Hi Raymond,
 
  On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
  raymond.hettin...@gmail.com wrote:
   Instead, the data should be organized as follows:
  
   indices =  [None, 1, None, None, None, 0, None, 2]
   entries =  [[-9092791511155847987, 'timmy', 'red'],
   [-8522787127447073495, 'barry', 'green'],
   [-6480567542315338377, 'guido', 'blue']]
 
  As a side note, your suggestion also enables order-preserving
  dictionaries: iter() would automatically yield items in the order
  they were inserted, as long as there was no deletion.  People will
  immediately start relying on this feature...  and be confused by
  the behavior of deletion. :-/
 
  If that's really an issue, we can deliberately scramble the
  iteration order a bit :-) (of course it might negatively impact HW
  prefetching)
 
 On the other hand, this would also make a fast ordered dictionary
 subclass possible, just by not using the free list for additions,
 combined with periodic compaction before adds or after deletes.

I suspect that's what Raymond has in mind :-)

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Brett Cannon
On Mon, Dec 10, 2012 at 10:28 AM, Barry Warsaw ba...@python.org wrote:

 On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote:

 It's hard to predict how the extra CPU instructions are going to affect
 the performance on different CPUs and scenarios. My guts tell me that
 your proposal is going to perform better on modern CPUs with lots of L1
 and L2 cache and in an average case scenario. A worst case scenario with
 lots of collisions might be measurable slower for large dicts and small
 CPU cache.

 I'd be interested to see what effect this has on memory constrained
 platforms
 such as many current ARM applications (mostly likely 32bit for now).
  Python's
 memory consumption is an overheard complaint for folks developing for those
 platforms.


Maybe Kristjan can shed some light on how this would have helped him on the
ps3 work he has been doing for Dust 514 as he has had memory issues there.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread PJ Eby
On Mon, Dec 10, 2012 at 3:27 AM, Stephen J. Turnbull step...@xemacs.org wrote:
 PJ Eby writes:

   By clear, I mean free of prior assumptions.

 Ah, well, I guess I've just run into a personal limitation.  I can't
 imagine thinking that is free of prior assumptions.  Not my
 ownwink/, and not by others, either.

I suppose I should have said, free of *known* prior assumptions,
since the trick to suspending assumptions is to find the ones you
*have*.

The deeper assumptions, alas, can usually only be found by clashing
opinions with others...  then stepping back and going, wait...  what
does he/she believe that's *different* from what I believe, that
allows them to have that differing opinion?

And then that's how you find out what it is that *you're* assuming,
that you didn't know you were assuming.  ;-)  (Not to mention what the
other person is.)


   Put together, the phrase clear thinking on concrete use cases means
   (at least to me), dropping all preconceptions of the existing design
   and starting over from square one, to ask how best the problem may be
   solved, using specific examples as a guide rather than using
   generalities.

 Sure, but ISTM that's the opposite of what you've actually been doing,
 at least in terms of contributing to my understanding.  One obstacle
 to discussion you have contributed to overcoming in my thinking is the
 big generality that the packager (ie, the person writing the metadata)
 is in a position to recommend good behavior to the installation
 tool, vs. being in a position to point out relevant considerations
 for users and tools installing the packager's product.

Right, but I started from a concrete scenario I wanted to avoid, which
led me to question the assumption that those fields were actually
useful.  As soon as I began questioning *that* assumption and asking
for use cases (2 years ago, in the last PEP 345 revision discussion),
it became apparent to me that there was something seriously wrong with
the conflicts and obsoletes fields, as they had almost no real utility
as they were defined and understood at that point.

 Until that generality is formulated and expressed, it's very difficult
 to see why the examples and particular solutions to use cases that
 various proponents have described fail to address some real problems.

Unfortunately, it's a chicken-and-egg problem: until you know what
assumptions are being made, you can't formulate them.  It's an
iterative process of exposing assumptions, until you succeed in
actually communicating.  ;-)

Heck, even something as simple as my assumptions about what clear
thinking meant and what I was trying to say has taken some back and
forth to clarify.  ;-)

 It was difficult for me to see, at first, what distinction was
 actually being made.

 Specifically, I thought that the question about Obsoletes vs.
 Obsoleted-By was about which package should be considered
 authoritative about obsolescence.  That is a reasonable distinction
 for that particular discussion, but there is a deeper, and general,
 principle behind that.  Namely, metadata is descriptive, not
 prescriptive.

Actually, the principle I was clinging to for *both* fields was not
giving project authors authority over other people's projects.  It's
fine for metadata to be prescriptive (e.g. requirements), it's just
that it should be prescriptive *only* for that project in isolation.
(In the broader sense, it also applies to the distro situation: the
project author doesn't really have authority over the distro, either,
so it can only be a suggestion there, as well.)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread Toshio Kuratomi
On Fri, Dec 7, 2012 at 10:46 PM, PJ Eby p...@telecommunity.com wrote:


 In any case, as I said before, I don't have an issue with the fields
 all being declared as being for informational purposes only.  My issue
 is only with recommendations for automated tool behavior that permit
 one project's author to exercise authority over another project's
 installation.


Skipping over a lot of other replies between you and I because I think that
we disagree on a lot but that's all moot if we agree here.

I have no problems with Obsoletes, Conflicts, Requires, and Provides types
of fields are marked informational.  In fact, there are many cases where
packages are overzealous in their use of Requires right now that cause
distributions to patch the dependency information in the package metadata.

-Toshio
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Christian Heimes
Am 10.12.2012 16:28, schrieb Barry Warsaw:
 I'd be interested to see what effect this has on memory constrained platforms
 such as many current ARM applications (mostly likely 32bit for now).  Python's
 memory consumption is an overheard complaint for folks developing for those
 platforms.

I had ARM platforms in mind, too. Unfortunately we don't have any ARM
platforms for testing and performance measurement. Even Trent's
Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
that's all.

Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
Cortex-A9) and similar.

Christian
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread Paul Moore
On 10 December 2012 16:35, Toshio Kuratomi a.bad...@gmail.com wrote:
 I have no problems with Obsoletes, Conflicts, Requires, and Provides types
 of fields are marked informational.  In fact, there are many cases where
 packages are overzealous in their use of Requires right now that cause
 distributions to patch the dependency information in the package metadata.

Given the endless debate on these fields, and the fact that it pretty
much all seems to be about what happens when tools enforce them, I'm
+1 on this. Particularly as these fields were not the focus of this
change to the spec in any case.

Paul.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Kristján Valur Jónsson
Indeed, I had to change the dict tuning parameters to make dicts behave 
reasonably on the PS3.
Interesting stuff indeed.

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Barry Warsaw
 Sent: 10. desember 2012 15:28
 To: python-dev@python.org
 Subject: Re: [Python-Dev] More compact dictionaries with faster iteration
 

 I'd be interested to see what effect this has on memory constrained
 platforms such as many current ARM applications (mostly likely 32bit for
 now).  Python's memory consumption is an overheard complaint for folks
 developing for those platforms.
 
 Cheers,
 -Barry
 


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Barry Warsaw
On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:

I had ARM platforms in mind, too. Unfortunately we don't have any ARM
platforms for testing and performance measurement. Even Trent's
Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
that's all.

I have a few ARM machines that I can use for testing, though I can't provide
external access to them.

* http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm
  (Which oops, I see is down.  Why did I not get notifications about that?)

* I have a Nexus 7 flashed with Ubuntu 12.10 (soon to be 13.04).

* Pandaboard still sitting in its box. ;)

Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
Cortex-A9) and similar.

Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of
getting them up and running.  I am not offering for *that*. ;)

Cheers,
-Barry


signature.asc
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Armin Rigo
Hi Philip,

On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote:
 On the other hand, this would also make a fast ordered dictionary
 subclass possible, just by not using the free list for additions,
 combined with periodic compaction before adds or after deletes.

Technically, I could see Python switching to ordered dictionaries
everywhere.  Raymond's insight suddenly makes it easy for CPython and
PyPy, and at least Jython could use the LinkedHashMap class (although
this would need checking with Jython guys).  I'd vaguely argue that
dictionary orders are one of the few non-reproducible factors in a
Python program, so it might be a good thing.  But only vaguely ---
maybe I got far too used to random orders over time...


A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread R. David Murray
On Mon, 10 Dec 2012 16:06:23 +, krist...@ccpgames.com wrote:
 Indeed, I had to change the dict tuning parameters to make dicts
 behave reasonably on the PS3.

 Interesting stuff indeed.

Out of both curiosity and a possible application of my own for the
information, what values did you end up with?

--David
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Is is worth disentangling distutils?

2012-12-10 Thread Antonio Cavallo

Hi,
I wonder if is it worth/if there is any interest in trying to clean up 
distutils: nothing in terms to add new features, just a *major* cleanup 
retaining the exact same interface.



I'm not planning anything like *adding features* or rewriting 
rpm/rpmbuild here, simply cleaning up that un-holy code mess. Yes it 
served well, don't get me wrong, and I think it did work much better 
than anything it was meant to replace it.


I'm not into the py3 at all so I wonder how possibly it could 
fit/collide into the big plan.


Or I'll be wasting my time?

Thanks

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Is is worth disentangling distutils?

2012-12-10 Thread Brian Curtin
On Mon, Dec 10, 2012 at 1:22 AM, Antonio Cavallo
a.cava...@cavallinux.eu wrote:
 I'm not into the py3 at all so I wonder how possibly it could fit/collide
 into the big plan.

 Or I'll be wasting my time?

If you're not doing it on Python 3 then you are wasting your time.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby
On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo ar...@tunes.org wrote:
 On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote:
 On the other hand, this would also make a fast ordered dictionary
 subclass possible, just by not using the free list for additions,
 combined with periodic compaction before adds or after deletes.

 Technically, I could see Python switching to ordered dictionaries
 everywhere.  Raymond's insight suddenly makes it easy for CPython and
 PyPy, and at least Jython could use the LinkedHashMap class (although
 this would need checking with Jython guys).

What about IronPython?

Also, note that using ordered dictionaries carries a performance cost
for dictionaries whose keys change a lot.  This probably wouldn't
affect most dictionaries in most programs, because module and object
dictionaries generally don't delete and re-add a lot of keys very
often. But in cases where a dictionary is used as a queue or stack or
something of that sort, the packing costs could add up.  Under the
current scheme, as long as collisions were minimal, the contents
wouldn't be repacked very often.

Without numbers it's hard to say for certain, but the advantage to
keeping ordered dictionaries a distinct type is that the standard
dictionary type can then get that extra bit of speed in exchange for
dropping the ordering requirement.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Antoine Pitrou
On Mon, 10 Dec 2012 11:53:17 -0500
Barry Warsaw ba...@python.org wrote:
 On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:
 
 I had ARM platforms in mind, too. Unfortunately we don't have any ARM
 platforms for testing and performance measurement. Even Trent's
 Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
 that's all.
 
 I have a few ARM machines that I can use for testing, though I can't provide
 external access to them.
 
 * http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm
   (Which oops, I see is down.  Why did I not get notifications about that?)

Because buildbot.python.org is waiting for someone (mail.python.org,
perhaps) to accept its SMTP requests. Feel free to ping the necessary
people ;-)

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Emacs users: hg-tools-grep

2012-12-10 Thread Brandon W Maister

 P.S. Who wants to abuse Jono and Matthew's copyright again and provide a
 shudder git version?


Oh, I do!

I also feel weird about adding a copyright to this, but how will other
people feel comfortable using it if I don't?

Also I put it in github, in case people want to fix it:
https://github.com/quodlibetor/git-tools.el

;; Copyright (c) 2012 Brandon W Maister
;;
;; Permission is hereby granted, free of charge, to any person obtaining
;; a copy of this software and associated documentation files (the
;; Software), to deal in the Software without restriction, including
;; without limitation the rights to use, copy, modify, merge, publish,
;; distribute, sublicense, and/or sell copies of the Software, and to
;; permit persons to whom the Software is furnished to do so, subject to
;; the following conditions:
;;
;; The above copyright notice and this permission notice shall be
;; included in all copies or substantial portions of the Software.
;;
;; THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

;; This code is based on hg-tools.el, which in turn is based on bzr-tools.el
;; Copyright (c) 2008-2012 Jonathan Lange, Matthew Lefkowitz, Barry A.
Warsaw

(provide 'git-tools)

(defconst git-tools-grep-command
  git ls-files -z | xargs -0 grep -In %s
  The command used for grepping files using git. See `git-tools-grep'.)

;; Run 'code' at the root of the branch which dirname is in.
(defmacro git-tools-at-branch-root (dirname rest code)
  `(let ((default-directory (locate-dominating-file (expand-file-name
,dirname) .git))) ,@code))


(defun git-tools-grep (expression dirname)
  Search a branch for `expression'. If there's a C-u prefix, prompt for
`dirname'.
  (interactive
   (let* ((string (read-string Search for: ))
  (dir (if (null current-prefix-arg)
   default-directory
 (read-directory-name (format Search for %s in: 
string)
 (list string dir)))
  (git-tools-at-branch-root dirname
   (grep-find (format git-tools-grep-command (shell-quote-argument
expression)



On Sat, Dec 8, 2012 at 4:51 PM, Barry Warsaw ba...@python.org wrote:

 Hark fellow Emacsers.  All you unenlightened heathens can stop reading now.

 A few years ago, my colleague Jono Lange wrote probably the best little
 chunk
 of Emacs lisp ever.  `M-x bzr-tools-grep` lets you easily search a Bazaar
 repository for a case-sensitive string, providing you with a nice *grep*
 buffer which you can scroll through.  When you find a code sample you want
 to
 look at, C-c C-c visits the file and plops you right at the matching line.
 You *only* grep through files under version control, so you get to ignore
 generated files, and compilation artifacts, etc.

 Of course, this doesn't help you for working on the Python code base,
 because
 Mercurial.  I finally whipped up this straight up rip of Jono's code to
 work
 with hg.  I'm actually embarrassed to put a copyright on this thing, and
 would
 happily just donate it to Jono, drop it in Python's Misc directory, or
 slip it
 like a lump of coal into the xmas stocking of whoever wants to maintain
 it
 for the next 20 years.

 But anyway, it's already proven enormously helpful to me, so here it is.

 Cheers,
 -Barry

 P.S. Who wants to abuse Jono and Matthew's copyright again and provide a
 shudder git version?

 ;; Copyright (c) 2012 Barry A. Warsaw
 ;;
 ;; Permission is hereby granted, free of charge, to any person obtaining
 ;; a copy of this software and associated documentation files (the
 ;; Software), to deal in the Software without restriction, including
 ;; without limitation the rights to use, copy, modify, merge, publish,
 ;; distribute, sublicense, and/or sell copies of the Software, and to
 ;; permit persons to whom the Software is furnished to do so, subject to
 ;; the following conditions:
 ;;
 ;; The above copyright notice and this permission notice shall be
 ;; included in all copies or substantial portions of the Software.
 ;;
 ;; THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 ;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 ;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 ;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
 ;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 ;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
 ;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

 ;; This code is based on bzr-tools.el
 ;; Copyright (c) 2008-2009 Jonathan Lange, Matthew Lefkowitz

 (provide 

Re: [Python-Dev] Emacs users: hg-tools-grep

2012-12-10 Thread Barry Warsaw
On Dec 10, 2012, at 03:02 PM, Brandon W Maister wrote:

I also feel weird about adding a copyright to this, but how will other
people feel comfortable using it if I don't?

Nice!  I've been told that Emacs 23 is probably a minimum requirement because
locate-dominating-file is missing in = Emacs 22.  I'm using Emacs 24.2 almost
everywhere these days.

-Barry


signature.asc
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Terry Reedy

On 12/10/2012 1:38 PM, PJ Eby wrote:

On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo ar...@tunes.org wrote:

On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote:

On the other hand, this would also make a fast ordered dictionary
subclass possible, just by not using the free list for additions,
combined with periodic compaction before adds or after deletes.


Technically, I could see Python switching to ordered dictionaries
everywhere.  Raymond's insight suddenly makes it easy for CPython and
PyPy, and at least Jython could use the LinkedHashMap class (although
this would need checking with Jython guys).


What about IronPython?

Also, note that using ordered dictionaries carries a performance cost
for dictionaries whose keys change a lot.  This probably wouldn't
affect most dictionaries in most programs, because module and object
dictionaries generally don't delete and re-add a lot of keys very
often. But in cases where a dictionary is used as a queue or stack or
something of that sort, the packing costs could add up.  Under the
current scheme, as long as collisions were minimal, the contents
wouldn't be repacked very often.

Without numbers it's hard to say for certain, but the advantage to
keeping ordered dictionaries a distinct type is that the standard
dictionary type can then get that extra bit of speed in exchange for
dropping the ordering requirement.


I think that there should be a separate OrderedDict class (or subclass) 
and that the dict docs should warn (if not now) that while iterating a 
dict *may*, in some circumstances, give items in insertion *or* sort 
order, it *will not* in all cases. Example of the latter:


 d = {8:0, 9:0, 15:0, 13:0, 14:0}
 for k in d: print(k)

8
9
13
14
15

If one entered the keys in sorted order, as might be natural, one might 
mistakenly think that insertion order was being reproduced.


--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Tim Delaney
On 11 December 2012 05:01, Armin Rigo ar...@tunes.org wrote:

 Hi Philip,

 On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote:
  On the other hand, this would also make a fast ordered dictionary
  subclass possible, just by not using the free list for additions,
  combined with periodic compaction before adds or after deletes.

 Technically, I could see Python switching to ordered dictionaries
 everywhere.  Raymond's insight suddenly makes it easy for CPython and
 PyPy, and at least Jython could use the LinkedHashMap class (although
 this would need checking with Jython guys).  I'd vaguely argue that
 dictionary orders are one of the few non-reproducible factors in a
 Python program, so it might be a good thing.  But only vaguely ---
 maybe I got far too used to random orders over time...


Whilst I think Python should not move to ordered dictionaries everywhere, I
would say there is an argument (no pun intended) for making **kwargs a
dictionary that maintains insertion order *if there are no deletions*. It
sounds like we could get that for free with this implementation, although
from another post IronPython might not have something suitable.

I think there are real advantages to doing so - a trivial one being the
ability to easily initialise an ordered dictionary from another ordered
dictionary.

I could also see an argument for having this property for all dicts. There
are many dictionaries that are never deleted from (probably most dict
literals) and it would be nice to have an iteration order for them that
matched the source code.

However if deletions occur all bets would be off. If you need to maintain
insertion order in the face of deletions, use an explicit ordereddict.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Guido, Dropbox, and Python

2012-12-10 Thread Terry Reedy
For those who have not heard, Guido left Google Friday and starts at 
Dropbox in January. (I hope you enjoy the break in between ;-).


https://twitter.com/gvanrossum/status/277126763295944705
https://tech.dropbox.com/2012/12/welcome-guido/

My question, Guido, is how this will affect Python development, and in 
particular, your work on async. If not proprietary info, does or will 
Dropbox use Python3?


--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Guido, Dropbox, and Python

2012-12-10 Thread Guido van Rossum
On Mon, Dec 10, 2012 at 1:51 PM, Terry Reedy tjre...@udel.edu wrote:
 For those who have not heard, Guido left Google Friday and starts at Dropbox
 in January. (I hope you enjoy the break in between ;-).

Thank you, I am already enjoying it!

 https://twitter.com/gvanrossum/status/277126763295944705
 https://tech.dropbox.com/2012/12/welcome-guido/

 My question, Guido, is how this will affect Python development, and in
 particular, your work on async. If not proprietary info, does or will
 Dropbox use Python3?

It should not change a thing. I don't know whether Dropbox uses Python
3 but I don't think it will make any difference. Of course I have not
yet started working at Dropbox so I will be able to give better
answers in a few months.

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Andrew Svetlov
On Mon, Dec 10, 2012 at 11:29 PM, Tim Delaney tim.dela...@aptare.com wrote:
 On 11 December 2012 05:01, Armin Rigo ar...@tunes.org wrote:

 Hi Philip,

 On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote:
  On the other hand, this would also make a fast ordered dictionary
  subclass possible, just by not using the free list for additions,
  combined with periodic compaction before adds or after deletes.

 Technically, I could see Python switching to ordered dictionaries
 everywhere.  Raymond's insight suddenly makes it easy for CPython and
 PyPy, and at least Jython could use the LinkedHashMap class (although
 this would need checking with Jython guys).  I'd vaguely argue that
 dictionary orders are one of the few non-reproducible factors in a
 Python program, so it might be a good thing.  But only vaguely ---
 maybe I got far too used to random orders over time...


 Whilst I think Python should not move to ordered dictionaries everywhere, I
 would say there is an argument (no pun intended) for making **kwargs a
 dictionary that maintains insertion order *if there are no deletions*. It
 sounds like we could get that for free with this implementation, although
 from another post IronPython might not have something suitable.

Please, no. dict and kwargs should be unordered without any assumptions.

 I think there are real advantages to doing so - a trivial one being the
 ability to easily initialise an ordered dictionary from another ordered
 dictionary.

It can be done with adding short-circuit for OrderedDict class to
accept another OrderedDict instance.

 I could also see an argument for having this property for all dicts. There
 are many dictionaries that are never deleted from (probably most dict
 literals) and it would be nice to have an iteration order for them that
 matched the source code.

 However if deletions occur all bets would be off. If you need to maintain
 insertion order in the face of deletions, use an explicit ordereddict.

 Tim Delaney

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com




--
Thanks,
Andrew Svetlov
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby
On Mon, Dec 10, 2012 at 4:29 PM, Tim Delaney tim.dela...@aptare.com wrote:
 Whilst I think Python should not move to ordered dictionaries everywhere, I
 would say there is an argument (no pun intended) for making **kwargs a
 dictionary that maintains insertion order *if there are no deletions*.

Oooh.  Me likey.  There have been many times where I've wished kwargs
were ordered when designing an API.

(Oddly, I don't remember any one of the APIs specifically, so don't
ask me for a good example.  I just remember a bunch of different
physical locations where I was when I thought, Ooh, what if I
could...  no, that's not going to work.)

One other useful place for ordered dictionaries is class definitions
processed by class decorators: no need to write a metaclass just to
know what order stuff was defined in.

  It sounds like we could get that for free with this implementation, although
 from another post IronPython might not have something suitable.

Actually, IronPython may already have ordered dictionaries by default; see:

  http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html

It's described as an implementation detail that may change, perhaps
that could be changed to being unchanging.  ;-)


 I think there are real advantages to doing so - a trivial one being the 
 ability
 to easily initialise an ordered dictionary from another ordered dictionary.

Or to merge two of them together, either at creation or .update().

I'm really starting to wonder if it might not be worth paying the
compaction overhead to just make all dictionaries ordered, all the
time.  The problem is that if you are always adding new keys and
deleting old ones (as might be done in a LRU cache, a queue, or other
things like that) you'll probably see a lot of compaction overhead
compared to today's dicts.

OTOH...  with a good algorithm for deciding *when* to compact, we can
actually make the amortized cost O(1) or so, so maybe that's not a big
deal.  The cost to do a compaction is at worst, the current size of
the table.  So if you wait until a table has twice as many entries
(more precisely, until the index of the last entry is twice what it
was at last compaction), you will amortize the compaction cost down to
one entry move per add, or O(1).

That would handle the case of a cache or queue, but I'm not sure how
it would work with supersized dictionaries that are then deleted down
to a fraction of their original size.  I suppose if you delete your
way down to half the entries being populated, then you end up with two
moves per delete, or O(2).  (Yes, I know that's not a valid O number.)

So, offhand, it seems pretty doable, and unlikely to significantly
change worst-case performance even for pathological use cases.  (Like
using a dict when you should be using a deque.)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby
On Mon, Dec 10, 2012 at 5:14 PM, Andrew Svetlov
andrew.svet...@gmail.com wrote:
 Please, no. dict and kwargs should be unordered without any assumptions.

Why?
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread fwierzbi...@gmail.com
On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo ar...@tunes.org wrote:
 Technically, I could see Python switching to ordered dictionaries
 everywhere.  Raymond's insight suddenly makes it easy for CPython and
 PyPy, and at least Jython could use the LinkedHashMap class (although
 this would need checking with Jython guys).
I honestly hope this doesn't happen - we use ConcurrentHashMap for our
dictionaries (which lack ordering) and I'm sure getting it to preserve
insertion order would cost us.

-Frank
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 2:48 AM, Christian Heimes christ...@python.org wrote:

 On the other hand every lookup and collision checks needs an additional
 multiplication, addition and pointer dereferencing:
 
  entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx


Currently, the dict implementation allows alternative lookup
functions based on whether the keys are all strings.
The choice of lookup function is stored in a function pointer.
That lets each lookup use the currently active lookup
function without having to make any computations or branches.

Likewise, the lookup functions could be swapped between
char, short, long, and ulong index sizes during the resize step.
IOW, the selection only needs to be made once per resize,
not one per lookup.


Raymond___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread fwierzbi...@gmail.com
On Mon, Dec 10, 2012 at 3:13 PM, fwierzbi...@gmail.com
fwierzbi...@gmail.com wrote:
 On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo ar...@tunes.org wrote:
 Technically, I could see Python switching to ordered dictionaries
 everywhere.  Raymond's insight suddenly makes it easy for CPython and
 PyPy, and at least Jython could use the LinkedHashMap class (although
 this would need checking with Jython guys).
 I honestly hope this doesn't happen - we use ConcurrentHashMap for our
 dictionaries (which lack ordering) and I'm sure getting it to preserve
 insertion order would cost us.
I just found this http://code.google.com/p/concurrentlinkedhashmap/ so
maybe it wouldn't be all that bad. I still personally like the idea of
leaving basic dict order undetermined (there is already an OrderedDict
if you need it right?) But if ConcurrentLinkedHashMap is as good as is
suggested on that page then Jython doesn't need to be the thing that
blocks the argument.

-Frank
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 4:06 AM, Mark Shannon m...@hotpy.org wrote:

 Instead, the 24-byte entries should be stored in a
 dense table referenced by a sparse table of indices.
 
 What minimum size and resizing factor do you propose for the entries array?

There are many ways to do this.  I don't know which is best.
The proof-of-concept code uses the existing list resizing logic.
Another approach is to pre-allocate the two-thirds maximum
(This is simple and fast but gives the smallest space savings.)
A hybrid approach is to allocate in two steps (1/3 and then 2/3
if needed).  

Since hash tables aren't a new problem, there may
already be published research on the best way to handle
the entries array. 


Raymond



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Mark Shannon



On 10/12/12 23:39, Raymond Hettinger wrote:


On Dec 10, 2012, at 4:06 AM, Mark Shannon m...@hotpy.org
mailto:m...@hotpy.org wrote:


Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.


What minimum size and resizing factor do you propose for the entries
array?


There are many ways to do this.  I don't know which is best.
The proof-of-concept code uses the existing list resizing logic.
Another approach is to pre-allocate the two-thirds maximum


What do you mean by maximum?


(This is simple and fast but gives the smallest space savings.)
A hybrid approach is to allocate in two steps (1/3 and then 2/3
if needed).


I think you need to do some more analysis on this. It is possible that 
there is some improvement to be had from your approach, but I don't 
think the improvements will be as large as you have claimed.


I suspect that you may be merely trading performance for reduced memory
use, which can be done much more easily by reducing the minimum size
and increasing the load factor.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 7:04 PM, Mark Shannon m...@hotpy.org wrote:

 Another approach is to pre-allocate the two-thirds maximum
 (This is simple and fast but gives the smallest space savings.)
 
 What do you mean by maximum?

A dict with an index table size of 8 gets resized when it is two-thirds full,
so the maximum number of entries is 5.  If you pre-allocate five entries
for the initial dict, you've spent 5 * 24 bytes + 8 bytes for the indices
for a total of 128 bytes.  This compares to the current table of 8 * 24 bytes
totaling 192 bytes.   

Many other strategies are possible.  The proof-of-concept code 
uses the one employed by regular python lists. 
Their growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, 
This produces nice memory savings for entry lists.

If you have a suggested allocation pattern or other 
constructive suggestion, it would be would welcome.  
Further sniping and unsubstantiated FUD would not.


Raymond
 ___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Trent Nelson
On Mon, Dec 10, 2012 at 08:53:17AM -0800, Barry Warsaw wrote:
 On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:
 
 I had ARM platforms in mind, too. Unfortunately we don't have any ARM
 platforms for testing and performance measurement. Even Trent's
 Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
 that's all.

 Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
 Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
 Cortex-A9) and similar.
 
 Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of
 getting them up and running.  I am not offering for *that*. ;)

If someone donates the hardware, I'll take care of everything else.

Trent.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 4:02 AM, Serhiy Storchaka storch...@gmail.com wrote:

 On 10.12.12 05:30, Raymond Hettinger wrote:
 On Dec 9, 2012, at 10:03 PM, MRAB pyt...@mrabarnett.plus.com
 mailto:pyt...@mrabarnett.plus.com wrote:
 What happens when a key is deleted? Will that leave an empty slot in
 the entry array?
 Yes.  See the __delitem__() method in the pure python implemention
 at http://code.activestate.com/recipes/578375
 
 You can move the last entry on freed place. This will preserve compactness of 
 entries array and simplify and speedup iterations and some other operations.
 
 def __delitem__(self, key, hashvalue=None):
if hashvalue is None:
hashvalue = hash(key)
found, i = self._lookup(key, hashvalue)
if found:
index = self.indices[i]
self.indices[i] = DUMMY
self.size -= 1
if index != size:
lasthash = self.hashlist[index]
lastkey = self.keylist[index]
found, j = self._lookup(lastkey, lasthash)
assert found
assert i != j
self.indices[j] = index
self.hashlist[index] = lasthash
self.keylist[index] = lastkey
self.valuelist[index] = self.valuelist[size]
index = size
self.hashlist[index] = UNUSED
self.keylist[index] = UNUSED
self.valuelist[index] = UNUSED
else:
raise KeyError(key)


That is a clever improvement.   Thank you.

Using your idea (plus some tweaks) cleans-up the code a bit
(simplifying iteration, simplifying the resizing logic, and eliminating the 
UNUSED constant).
I'm updating the posted code to reflect your suggestion.

Thanks again,


Raymond

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] where is the python import implemented

2012-12-10 Thread Isml
Hi, everyone,
 I am testing modifying the pyc file when it is imported. As I know, there is three situation:
 1、runing in the python.exe
 eg: python.exe test.pyc
 in this situation, I find the sourceon line 1983 in file pythonrun.c
 2、import the pyc from a zip file
 I find the source on line 1132 in zipimport.c
 3、do a normal import
 eg: two file : main.py and testmodule.py
 and in main.py:
 import testmodule

 in this situation, I can not find the source code how python implement it. I test a wrong format pyc, and got a error "ImportError: bad magic number",and I search "bad magic number" in the source code, I find it is in importlib/_bootstrap.py(line 815),but when I modify this error info(eg: test bad magic) and run again, nothing is changed. It seems that the file is not the correct position.___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no

2012-12-10 Thread Antoine Pitrou
On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
gregory.p.smith python-check...@python.org wrote:
   Using 'long double' to force this structure to be worst case aligned is no
 longer required as of Python 2.5+ when the gc_refs changed from an int (4
 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes.
 
 The use of a 'long double' triggered a warning by Clang trunk's
 Undefined-Behavior Sanitizer as on many platforms a long double requires
 16-byte alignment but the Python memory allocator only guarantees 8 byte
 alignment.
 
 So our code would allocate and use these structures with technically improper
 alignment.  Though it didn't matter since the 'dummy' field is never used.
 This silences that warning.
 
 Spelunking into code history, the double was added in 2001 to force better
 alignment on some platforms and changed to a long double in 2002 to appease
 Tru64.  That issue should no loner be present since the upgrade from int to
 Py_ssize_t where the minimum structure size increased to 16 (unless anyone
 knows of a platform where ssize_t is 4 bytes?)

What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).

 We can probably get rid of the double and this union hack all together today.
 That is a slightly more invasive change that can be left for later.

How do you suggest to get rid of it? Some platforms still have strict
alignment rules and we must enforce that PyObjects (*) are always
aligned to the largest possible alignment, since a PyObject-derived
struct can hold arbitrary C types.

(*) GC-enabled PyObjects, anyway. Others will be naturally aligned
thanks to the memory allocator.


What's more, I think you shouldn't be doing this kind of change in a
bugfix release. It might break compiled C extensions since you are
changing some characteristics of object layout (although you would
probably only break those extensions which access the GC header, which
is probably not many of them). Resource consumption improvements
generally go only into the next feature release.

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no

2012-12-10 Thread Antoine Pitrou
On Tue, 11 Dec 2012 08:16:27 +0100
Antoine Pitrou solip...@pitrou.net wrote:

 On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
 gregory.p.smith python-check...@python.org wrote:
Using 'long double' to force this structure to be worst case aligned is no
  longer required as of Python 2.5+ when the gc_refs changed from an int (4
  bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes.
  
  The use of a 'long double' triggered a warning by Clang trunk's
  Undefined-Behavior Sanitizer as on many platforms a long double requires
  16-byte alignment but the Python memory allocator only guarantees 8 byte
  alignment.
  
  So our code would allocate and use these structures with technically 
  improper
  alignment.  Though it didn't matter since the 'dummy' field is never used.
  This silences that warning.
  
  Spelunking into code history, the double was added in 2001 to force better
  alignment on some platforms and changed to a long double in 2002 to appease
  Tru64.  That issue should no loner be present since the upgrade from int to
  Py_ssize_t where the minimum structure size increased to 16 (unless anyone
  knows of a platform where ssize_t is 4 bytes?)
 
 What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
 
  We can probably get rid of the double and this union hack all together 
  today.
  That is a slightly more invasive change that can be left for later.
 
 How do you suggest to get rid of it? Some platforms still have strict
 alignment rules and we must enforce that PyObjects (*) are always
 aligned to the largest possible alignment, since a PyObject-derived
 struct can hold arbitrary C types.

Ok, I hadn't seen your proposal. I find it reasonable:

“A more correct non-hacky alternative if any alignment issues are still
found would be to use a compiler specific alignment declaration on the
structure and determine which value to use at configure time.”


However, the commit is still problematic, and I think it should be
reverted. We can't remove the alignment hack just because it seems to
be useless on x86(-64).

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com