date:20121210

Re: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no

2012-12-10 Thread Antoine Pitrou

On Tue, 11 Dec 2012 08:16:27 +0100
Antoine Pitrou  wrote:

> On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
> gregory.p.smith  wrote:
> >   Using 'long double' to force this structure to be worst case aligned is no
> > longer required as of Python 2.5+ when the gc_refs changed from an int (4
> > bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes.
> > 
> > The use of a 'long double' triggered a warning by Clang trunk's
> > Undefined-Behavior Sanitizer as on many platforms a long double requires
> > 16-byte alignment but the Python memory allocator only guarantees 8 byte
> > alignment.
> > 
> > So our code would allocate and use these structures with technically 
> > improper
> > alignment.  Though it didn't matter since the 'dummy' field is never used.
> > This silences that warning.
> > 
> > Spelunking into code history, the double was added in 2001 to force better
> > alignment on some platforms and changed to a long double in 2002 to appease
> > Tru64.  That issue should no loner be present since the upgrade from int to
> > Py_ssize_t where the minimum structure size increased to 16 (unless anyone
> > knows of a platform where ssize_t is 4 bytes?)
> 
> What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
> 
> > We can probably get rid of the double and this union hack all together 
> > today.
> > That is a slightly more invasive change that can be left for later.
> 
> How do you suggest to get rid of it? Some platforms still have strict
> alignment rules and we must enforce that PyObjects (*) are always
> aligned to the largest possible alignment, since a PyObject-derived
> struct can hold arbitrary C types.

Ok, I hadn't seen your proposal. I find it reasonable:

“A more correct non-hacky alternative if any alignment issues are still
found would be to use a compiler specific alignment declaration on the
structure and determine which value to use at configure time.”


However, the commit is still problematic, and I think it should be
reverted. We can't remove the alignment hack just because it seems to
be useless on x86(-64).

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no

2012-12-10 Thread Antoine Pitrou

On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
gregory.p.smith  wrote:
>   Using 'long double' to force this structure to be worst case aligned is no
> longer required as of Python 2.5+ when the gc_refs changed from an int (4
> bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes.
> 
> The use of a 'long double' triggered a warning by Clang trunk's
> Undefined-Behavior Sanitizer as on many platforms a long double requires
> 16-byte alignment but the Python memory allocator only guarantees 8 byte
> alignment.
> 
> So our code would allocate and use these structures with technically improper
> alignment.  Though it didn't matter since the 'dummy' field is never used.
> This silences that warning.
> 
> Spelunking into code history, the double was added in 2001 to force better
> alignment on some platforms and changed to a long double in 2002 to appease
> Tru64.  That issue should no loner be present since the upgrade from int to
> Py_ssize_t where the minimum structure size increased to 16 (unless anyone
> knows of a platform where ssize_t is 4 bytes?)

What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).

> We can probably get rid of the double and this union hack all together today.
> That is a slightly more invasive change that can be left for later.

How do you suggest to get rid of it? Some platforms still have strict
alignment rules and we must enforce that PyObjects (*) are always
aligned to the largest possible alignment, since a PyObject-derived
struct can hold arbitrary C types.

(*) GC-enabled PyObjects, anyway. Others will be naturally aligned
thanks to the memory allocator.

What's more, I think you shouldn't be doing this kind of change in a
bugfix release. It might break compiled C extensions since you are
changing some characteristics of object layout (although you would
probably only break those extensions which access the GC header, which
is probably not many of them). Resource consumption improvements
generally go only into the next feature release.

Regards

Antoine.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] where is the python "import" implemented

2012-12-10 Thread Isml

Hi, everyone,
    I am testing modifying the pyc file when it is imported. As I know, there is three situation:
    1、runing in the python.exe
 eg: python.exe test.pyc
    in this situation, I find the source on line 1983 in file pythonrun.c
    2、import the pyc from a zip file
    I find the source on line 1132 in zipimport.c
    3、do a normal import
    eg: two file : main.py and testmodule.py
    and in main.py:
    import testmodule
 
    in this situation, I can not find the source code how python implement it. I test a wrong format pyc, and got a error "ImportError: bad magic number"，and I search "bad magic number" in the source code,  I find it is in importlib/_bootstrap.py(line 815)，but when I modify this error info(eg: test bad magic) and run again, nothing is changed. It seems that the file is not the correct position.___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger


On Dec 10, 2012, at 4:02 AM, Serhiy Storchaka  wrote:

> On 10.12.12 05:30, Raymond Hettinger wrote:
>> On Dec 9, 2012, at 10:03 PM, MRAB > > wrote:
>>> What happens when a key is deleted? Will that leave an empty slot in
>>> the entry array?
>> Yes.  See the __delitem__() method in the pure python implemention
>> at http://code.activestate.com/recipes/578375
> 
> You can move the last entry on freed place. This will preserve compactness of 
> entries array and simplify and speedup iterations and some other operations.
> 
> def __delitem__(self, key, hashvalue=None):
>if hashvalue is None:
>hashvalue = hash(key)
>found, i = self._lookup(key, hashvalue)
>if found:
>index = self.indices[i]
>self.indices[i] = DUMMY
>self.size -= 1
>if index != size:
>lasthash = self.hashlist[index]
>lastkey = self.keylist[index]
>found, j = self._lookup(lastkey, lasthash)
>assert found
>assert i != j
>self.indices[j] = index
>self.hashlist[index] = lasthash
>self.keylist[index] = lastkey
>self.valuelist[index] = self.valuelist[size]
>index = size
>self.hashlist[index] = UNUSED
>self.keylist[index] = UNUSED
>self.valuelist[index] = UNUSED
>else:
>raise KeyError(key)


That is a clever improvement.   Thank you.

Using your idea (plus some tweaks) cleans-up the code a bit
(simplifying iteration, simplifying the resizing logic, and eliminating the 
UNUSED constant).
I'm updating the posted code to reflect your suggestion.

Thanks again,


Raymond

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Trent Nelson

On Mon, Dec 10, 2012 at 08:53:17AM -0800, Barry Warsaw wrote:
> On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:
> 
> >I had ARM platforms in mind, too. Unfortunately we don't have any ARM
> >platforms for testing and performance measurement. Even Trent's
> >Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
> >that's all.

> >Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
> >Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
> >Cortex-A9) and similar.
> 
> Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of
> getting them up and running.  I am not offering for *that*. ;)

If someone donates the hardware, I'll take care of everything else.

Trent.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 7:04 PM, Mark Shannon  wrote:

>> Another approach is to pre-allocate the two-thirds maximum
>> (This is simple and fast but gives the smallest space savings.)
> 
> What do you mean by maximum?

A dict with an index table size of 8 gets resized when it is two-thirds full,
so the maximum number of entries is 5.  If you pre-allocate five entries
for the initial dict, you've spent 5 * 24 bytes + 8 bytes for the indices
for a total of 128 bytes.  This compares to the current table of 8 * 24 bytes
totaling 192 bytes.   

Many other strategies are possible.  The proof-of-concept code 
uses the one employed by regular python lists. 
Their growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, 
This produces nice memory savings for entry lists.

If you have a suggested allocation pattern or other 
constructive suggestion, it would be would welcome.  
Further sniping and unsubstantiated FUD would not.

Raymond
 ___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Mark Shannon




On 10/12/12 23:39, Raymond Hettinger wrote:


On Dec 10, 2012, at 4:06 AM, Mark Shannon mailto:m...@hotpy.org>> wrote:


Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.


What minimum size and resizing factor do you propose for the entries
array?


There are many ways to do this.  I don't know which is best.
The proof-of-concept code uses the existing list resizing logic.
Another approach is to pre-allocate the two-thirds maximum


What do you mean by maximum?


(This is simple and fast but gives the smallest space savings.)
A hybrid approach is to allocate in two steps (1/3 and then 2/3
if needed).


I think you need to do some more analysis on this. It is possible that 
there is some improvement to be had from your approach, but I don't 
think the improvements will be as large as you have claimed.


I suspect that you may be merely trading performance for reduced memory
use, which can be done much more easily by reducing the minimum size
and increasing the load factor.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger


On Dec 10, 2012, at 1:38 PM, PJ Eby  wrote:

> Without numbers it's hard to say for certain, but the advantage to
> keeping ordered dictionaries a distinct type is that the standard
> dictionary type can then get that extra bit of speed in exchange for
> dropping the ordering requirement.

I expect that dicts and OrderedDicts will remain separate 
for reasons of speed, space, and respect for people's mental models.


Raymond___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 4:06 AM, Mark Shannon  wrote:

>> Instead, the 24-byte entries should be stored in a
>> dense table referenced by a sparse table of indices.
> 
> What minimum size and resizing factor do you propose for the entries array?

There are many ways to do this.  I don't know which is best.
The proof-of-concept code uses the existing list resizing logic.
Another approach is to pre-allocate the two-thirds maximum
(This is simple and fast but gives the smallest space savings.)
A hybrid approach is to allocate in two steps (1/3 and then 2/3
if needed).  

Since hash tables aren't a new problem, there may
already be published research on the best way to handle
the entries array. 

Raymond

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread fwierzbi...@gmail.com

On Mon, Dec 10, 2012 at 3:13 PM, fwierzbi...@gmail.com
 wrote:
> On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo  wrote:
>> Technically, I could see Python switching to ordered dictionaries
>> everywhere.  Raymond's insight suddenly makes it easy for CPython and
>> PyPy, and at least Jython could use the LinkedHashMap class (although
>> this would need checking with Jython guys).
> I honestly hope this doesn't happen - we use ConcurrentHashMap for our
> dictionaries (which lack ordering) and I'm sure getting it to preserve
> insertion order would cost us.
I just found this http://code.google.com/p/concurrentlinkedhashmap/ so
maybe it wouldn't be all that bad. I still personally like the idea of
leaving basic dict order undetermined (there is already an OrderedDict
if you need it right?) But if ConcurrentLinkedHashMap is as good as is
suggested on that page then Jython doesn't need to be the thing that
blocks the argument.

-Frank
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Raymond Hettinger

On Dec 10, 2012, at 2:48 AM, Christian Heimes  wrote:

> On the other hand every lookup and collision checks needs an additional
> multiplication, addition and pointer dereferencing:
> 
>  entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx

Currently, the dict implementation allows alternative lookup
functions based on whether the keys are all strings.
The choice of lookup function is stored in a function pointer.
That lets each lookup use the currently active lookup
function without having to make any computations or branches.

Likewise, the lookup functions could be swapped between
char, short, long, and ulong index sizes during the resize step.
IOW, the selection only needs to be made once per resize,
not one per lookup.

Raymond___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread fwierzbi...@gmail.com

On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo  wrote:
> Technically, I could see Python switching to ordered dictionaries
> everywhere.  Raymond's insight suddenly makes it easy for CPython and
> PyPy, and at least Jython could use the LinkedHashMap class (although
> this would need checking with Jython guys).
I honestly hope this doesn't happen - we use ConcurrentHashMap for our
dictionaries (which lack ordering) and I'm sure getting it to preserve
insertion order would cost us.

-Frank
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby

On Mon, Dec 10, 2012 at 5:14 PM, Andrew Svetlov
 wrote:
> Please, no. dict and kwargs should be unordered without any assumptions.

Why?
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby

On Mon, Dec 10, 2012 at 4:29 PM, Tim Delaney  wrote:
> Whilst I think Python should not move to ordered dictionaries everywhere, I
> would say there is an argument (no pun intended) for making **kwargs a
> dictionary that maintains insertion order *if there are no deletions*.

Oooh.  Me likey.  There have been many times where I've wished kwargs
were ordered when designing an API.

(Oddly, I don't remember any one of the APIs specifically, so don't
ask me for a good example.  I just remember a bunch of different
physical locations where I was when I thought, "Ooh, what if I
could...  no, that's not going to work.")

One other useful place for ordered dictionaries is class definitions
processed by class decorators: no need to write a metaclass just to
know what order stuff was defined in.

>  It sounds like we could get that for free with this implementation, although
> from another post IronPython might not have something suitable.

Actually, IronPython may already have ordered dictionaries by default; see:

  http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html

It's described as an implementation detail that may change, perhaps
that could be changed to being unchanging.  ;-)

> I think there are real advantages to doing so - a trivial one being the 
> ability
> to easily initialise an ordered dictionary from another ordered dictionary.

Or to merge two of them together, either at creation or .update().

I'm really starting to wonder if it might not be worth paying the
compaction overhead to just make all dictionaries ordered, all the
time.  The problem is that if you are always adding new keys and
deleting old ones (as might be done in a LRU cache, a queue, or other
things like that) you'll probably see a lot of compaction overhead
compared to today's dicts.

OTOH...  with a good algorithm for deciding *when* to compact, we can
actually make the amortized cost O(1) or so, so maybe that's not a big
deal.  The cost to do a compaction is at worst, the current size of
the table.  So if you wait until a table has twice as many entries
(more precisely, until the index of the last entry is twice what it
was at last compaction), you will amortize the compaction cost down to
one entry move per add, or O(1).

That would handle the case of a cache or queue, but I'm not sure how
it would work with supersized dictionaries that are then deleted down
to a fraction of their original size.  I suppose if you delete your
way down to half the entries being populated, then you end up with two
moves per delete, or O(2).  (Yes, I know that's not a valid O number.)

So, offhand, it seems pretty doable, and unlikely to significantly
change worst-case performance even for pathological use cases.  (Like
using a dict when you should be using a deque.)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Andrew Svetlov

On Mon, Dec 10, 2012 at 11:29 PM, Tim Delaney  wrote:
> On 11 December 2012 05:01, Armin Rigo  wrote:
>>
>> Hi Philip,
>>
>> On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby  wrote:
>> > On the other hand, this would also make a fast ordered dictionary
>> > subclass possible, just by not using the free list for additions,
>> > combined with periodic compaction before adds or after deletes.
>>
>> Technically, I could see Python switching to ordered dictionaries
>> everywhere.  Raymond's insight suddenly makes it easy for CPython and
>> PyPy, and at least Jython could use the LinkedHashMap class (although
>> this would need checking with Jython guys).  I'd vaguely argue that
>> dictionary orders are one of the few non-reproducible factors in a
>> Python program, so it might be a good thing.  But only vaguely ---
>> maybe I got far too used to random orders over time...
>
>
> Whilst I think Python should not move to ordered dictionaries everywhere, I
> would say there is an argument (no pun intended) for making **kwargs a
> dictionary that maintains insertion order *if there are no deletions*. It
> sounds like we could get that for free with this implementation, although
> from another post IronPython might not have something suitable.
>
Please, no. dict and kwargs should be unordered without any assumptions.

> I think there are real advantages to doing so - a trivial one being the
> ability to easily initialise an ordered dictionary from another ordered
> dictionary.
>
It can be done with adding short-circuit for OrderedDict class to
accept another OrderedDict instance.

> I could also see an argument for having this property for all dicts. There
> are many dictionaries that are never deleted from (probably most dict
> literals) and it would be nice to have an iteration order for them that
> matched the source code.
>
> However if deletions occur all bets would be off. If you need to maintain
> insertion order in the face of deletions, use an explicit ordereddict.
>
> Tim Delaney
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com
>



--
Thanks,
Andrew Svetlov
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Guido, Dropbox, and Python

2012-12-10 Thread Guido van Rossum

On Mon, Dec 10, 2012 at 1:51 PM, Terry Reedy  wrote:
> For those who have not heard, Guido left Google Friday and starts at Dropbox
> in January. (I hope you enjoy the break in between ;-).

Thank you, I am already enjoying it!

> https://twitter.com/gvanrossum/status/277126763295944705
> https://tech.dropbox.com/2012/12/welcome-guido/
>
> My question, Guido, is how this will affect Python development, and in
> particular, your work on async. If not proprietary info, does or will
> Dropbox use Python3?

It should not change a thing. I don't know whether Dropbox uses Python
3 but I don't think it will make any difference. Of course I have not
yet started working at Dropbox so I will be able to give better
answers in a few months.

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Guido, Dropbox, and Python

2012-12-10 Thread Terry Reedy

For those who have not heard, Guido left Google Friday and starts at 
Dropbox in January. (I hope you enjoy the break in between ;-).


https://twitter.com/gvanrossum/status/277126763295944705
https://tech.dropbox.com/2012/12/welcome-guido/

My question, Guido, is how this will affect Python development, and in 
particular, your work on async. If not proprietary info, does or will 
Dropbox use Python3?


--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Tim Delaney

On 11 December 2012 05:01, Armin Rigo  wrote:

> Hi Philip,
>
> On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby  wrote:
> > On the other hand, this would also make a fast ordered dictionary
> > subclass possible, just by not using the free list for additions,
> > combined with periodic compaction before adds or after deletes.
>
> Technically, I could see Python switching to ordered dictionaries
> everywhere.  Raymond's insight suddenly makes it easy for CPython and
> PyPy, and at least Jython could use the LinkedHashMap class (although
> this would need checking with Jython guys).  I'd vaguely argue that
> dictionary orders are one of the few non-reproducible factors in a
> Python program, so it might be a good thing.  But only vaguely ---
> maybe I got far too used to random orders over time...
>

Whilst I think Python should not move to ordered dictionaries everywhere, I
would say there is an argument (no pun intended) for making **kwargs a
dictionary that maintains insertion order *if there are no deletions*. It
sounds like we could get that for free with this implementation, although
from another post IronPython might not have something suitable.

I think there are real advantages to doing so - a trivial one being the
ability to easily initialise an ordered dictionary from another ordered
dictionary.

I could also see an argument for having this property for all dicts. There
are many dictionaries that are never deleted from (probably most dict
literals) and it would be nice to have an iteration order for them that
matched the source code.

However if deletions occur all bets would be off. If you need to maintain
insertion order in the face of deletions, use an explicit ordereddict.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Terry Reedy

On 12/10/2012 1:38 PM, PJ Eby wrote:

On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo  wrote:

On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby  wrote:

On the other hand, this would also make a fast ordered dictionary
subclass possible, just by not using the free list for additions,
combined with periodic compaction before adds or after deletes.

Technically, I could see Python switching to ordered dictionaries
everywhere.  Raymond's insight suddenly makes it easy for CPython and
PyPy, and at least Jython could use the LinkedHashMap class (although
this would need checking with Jython guys).

What about IronPython?

Also, note that using ordered dictionaries carries a performance cost
for dictionaries whose keys change a lot.  This probably wouldn't
affect most dictionaries in most programs, because module and object
dictionaries generally don't delete and re-add a lot of keys very
often. But in cases where a dictionary is used as a queue or stack or
something of that sort, the packing costs could add up.  Under the
current scheme, as long as collisions were minimal, the contents
wouldn't be repacked very often.

Without numbers it's hard to say for certain, but the advantage to
keeping ordered dictionaries a distinct type is that the standard
dictionary type can then get that extra bit of speed in exchange for
dropping the ordering requirement.

I think that there should be a separate OrderedDict class (or subclass) 
and that the dict docs should warn (if not now) that while iterating a 
dict *may*, in some circumstances, give items in insertion *or* sort 
order, it *will not* in all cases. Example of the latter:

>>> d = {8:0, 9:0, 15:0, 13:0, 14:0}
>>> for k in d: print(k)

8
9
13
14
15

If one entered the keys in sorted order, as might be natural, one might 
mistakenly think that insertion order was being reproduced.

--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Emacs users: hg-tools-grep

2012-12-10 Thread Barry Warsaw

On Dec 10, 2012, at 03:02 PM, Brandon W Maister wrote:

>I also feel weird about adding a copyright to this, but how will other
>people feel comfortable using it if I don't?

Nice!  I've been told that Emacs 23 is probably a minimum requirement because
locate-dominating-file is missing in <= Emacs 22.  I'm using Emacs 24.2 almost
everywhere these days.

-Barry

signature.asc
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Emacs users: hg-tools-grep

2012-12-10 Thread Brandon W Maister

>
> P.S. Who wants to abuse Jono and Matthew's copyright again and provide a
>  git version?
>

Oh, I do!

I also feel weird about adding a copyright to this, but how will other
people feel comfortable using it if I don't?

Also I put it in github, in case people want to fix it:
https://github.com/quodlibetor/git-tools.el

;; Copyright (c) 2012 Brandon W Maister
;;
;; Permission is hereby granted, free of charge, to any person obtaining
;; a copy of this software and associated documentation files (the
;; "Software"), to deal in the Software without restriction, including
;; without limitation the rights to use, copy, modify, merge, publish,
;; distribute, sublicense, and/or sell copies of the Software, and to
;; permit persons to whom the Software is furnished to do so, subject to
;; the following conditions:
;;
;; The above copyright notice and this permission notice shall be
;; included in all copies or substantial portions of the Software.
;;
;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

;; This code is based on hg-tools.el, which in turn is based on bzr-tools.el
;; Copyright (c) 2008-2012 Jonathan Lange, Matthew Lefkowitz, Barry A.
Warsaw

(provide 'git-tools)

(defconst git-tools-grep-command
  "git ls-files -z | xargs -0 grep -In %s"
  "The command used for grepping files using git. See `git-tools-grep'.")

;; Run 'code' at the root of the branch which dirname is in.
(defmacro git-tools-at-branch-root (dirname &rest code)
  `(let ((default-directory (locate-dominating-file (expand-file-name
,dirname) ".git"))) ,@code))

(defun git-tools-grep (expression dirname)
  "Search a branch for `expression'. If there's a C-u prefix, prompt for
`dirname'."
  (interactive
   (let* ((string (read-string "Search for: "))
  (dir (if (null current-prefix-arg)
   default-directory
 (read-directory-name (format "Search for %s in: "
string)
 (list string dir)))
  (git-tools-at-branch-root dirname
   (grep-find (format git-tools-grep-command (shell-quote-argument
expression)

On Sat, Dec 8, 2012 at 4:51 PM, Barry Warsaw  wrote:

> Hark fellow Emacsers.  All you unenlightened heathens can stop reading now.
>
> A few years ago, my colleague Jono Lange wrote probably the best little
> chunk
> of Emacs lisp ever.  `M-x bzr-tools-grep` lets you easily search a Bazaar
> repository for a case-sensitive string, providing you with a nice *grep*
> buffer which you can scroll through.  When you find a code sample you want
> to
> look at, C-c C-c visits the file and plops you right at the matching line.
> You *only* grep through files under version control, so you get to ignore
> generated files, and compilation artifacts, etc.
>
> Of course, this doesn't help you for working on the Python code base,
> because
> Mercurial.  I finally whipped up this straight up rip of Jono's code to
> work
> with hg.  I'm actually embarrassed to put a copyright on this thing, and
> would
> happily just donate it to Jono, drop it in Python's Misc directory, or
> slip it
> like a lump of coal into the xmas stocking of whoever wants to "maintain"
> it
> for the next 20 years.
>
> But anyway, it's already proven enormously helpful to me, so here it is.
>
> Cheers,
> -Barry
>
> P.S. Who wants to abuse Jono and Matthew's copyright again and provide a
>  git version?
>
> ;; Copyright (c) 2012 Barry A. Warsaw
> ;;
> ;; Permission is hereby granted, free of charge, to any person obtaining
> ;; a copy of this software and associated documentation files (the
> ;; "Software"), to deal in the Software without restriction, including
> ;; without limitation the rights to use, copy, modify, merge, publish,
> ;; distribute, sublicense, and/or sell copies of the Software, and to
> ;; permit persons to whom the Software is furnished to do so, subject to
> ;; the following conditions:
> ;;
> ;; The above copyright notice and this permission notice shall be
> ;; included in all copies or substantial portions of the Software.
> ;;
> ;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> ;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> ;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> ;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
> ;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
> ;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
> ;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>
> ;; This code is based on bzr-tools.el
> ;; Copyright (c) 2008-2009

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Antoine Pitrou

On Mon, 10 Dec 2012 11:53:17 -0500
Barry Warsaw  wrote:
> On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:
> 
> >I had ARM platforms in mind, too. Unfortunately we don't have any ARM
> >platforms for testing and performance measurement. Even Trent's
> >Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
> >that's all.
> 
> I have a few ARM machines that I can use for testing, though I can't provide
> external access to them.
> 
> * http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm
>   (Which oops, I see is down.  Why did I not get notifications about that?)

Because buildbot.python.org is waiting for someone (mail.python.org,
perhaps) to accept its SMTP requests. Feel free to ping the necessary
people ;-)

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby

On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo  wrote:
> On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby  wrote:
>> On the other hand, this would also make a fast ordered dictionary
>> subclass possible, just by not using the free list for additions,
>> combined with periodic compaction before adds or after deletes.
>
> Technically, I could see Python switching to ordered dictionaries
> everywhere.  Raymond's insight suddenly makes it easy for CPython and
> PyPy, and at least Jython could use the LinkedHashMap class (although
> this would need checking with Jython guys).

What about IronPython?

Also, note that using ordered dictionaries carries a performance cost
for dictionaries whose keys change a lot.  This probably wouldn't
affect most dictionaries in most programs, because module and object
dictionaries generally don't delete and re-add a lot of keys very
often. But in cases where a dictionary is used as a queue or stack or
something of that sort, the packing costs could add up.  Under the
current scheme, as long as collisions were minimal, the contents
wouldn't be repacked very often.

Without numbers it's hard to say for certain, but the advantage to
keeping ordered dictionaries a distinct type is that the standard
dictionary type can then get that extra bit of speed in exchange for
dropping the ordering requirement.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Is is worth disentangling distutils?

2012-12-10 Thread Brian Curtin

On Mon, Dec 10, 2012 at 1:22 AM, Antonio Cavallo
 wrote:
> I'm not into the py3 at all so I wonder how possibly it could fit/collide
> into the big plan.
>
> Or I'll be wasting my time?

If you're not doing it on Python 3 then you are wasting your time.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Is is worth disentangling distutils?

2012-12-10 Thread Antonio Cavallo


Hi,
I wonder if is it worth/if there is any interest in trying to "clean" up 
distutils: nothing in terms to add new features, just a *major* cleanup 
retaining the exact same interface.



I'm not planning anything like *adding features* or rewriting 
rpm/rpmbuild here, simply cleaning up that un-holy code mess. Yes it 
served well, don't get me wrong, and I think it did work much better 
than anything it was meant to replace it.


I'm not into the py3 at all so I wonder how possibly it could 
fit/collide into the big plan.


Or I'll be wasting my time?

Thanks

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread R. David Murray

On Mon, 10 Dec 2012 16:06:23 +, krist...@ccpgames.com wrote:
> Indeed, I had to change the dict tuning parameters to make dicts
> behave reasonably on the PS3.
>
> Interesting stuff indeed.

Out of both curiosity and a possible application of my own for the
information, what values did you end up with?

--David
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Armin Rigo

Hi Philip,

On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby  wrote:
> On the other hand, this would also make a fast ordered dictionary
> subclass possible, just by not using the free list for additions,
> combined with periodic compaction before adds or after deletes.

Technically, I could see Python switching to ordered dictionaries
everywhere.  Raymond's insight suddenly makes it easy for CPython and
PyPy, and at least Jython could use the LinkedHashMap class (although
this would need checking with Jython guys).  I'd vaguely argue that
dictionary orders are one of the few non-reproducible factors in a
Python program, so it might be a good thing.  But only vaguely ---
maybe I got far too used to random orders over time...

A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Barry Warsaw

On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:

>I had ARM platforms in mind, too. Unfortunately we don't have any ARM
>platforms for testing and performance measurement. Even Trent's
>Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
>that's all.

I have a few ARM machines that I can use for testing, though I can't provide
external access to them.

* http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm
  (Which oops, I see is down.  Why did I not get notifications about that?)

* I have a Nexus 7 flashed with Ubuntu 12.10 (soon to be 13.04).

* Pandaboard still sitting in its box. ;)

>Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
>Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
>Cortex-A9) and similar.

Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of
getting them up and running.  I am not offering for *that*. ;)

Cheers,
-Barry

signature.asc
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Kristján Valur Jónsson

Indeed, I had to change the dict tuning parameters to make dicts behave 
reasonably on the PS3.
Interesting stuff indeed.

> -Original Message-
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames@python.org] On Behalf Of Barry Warsaw
> Sent: 10. desember 2012 15:28
> To: python-dev@python.org
> Subject: Re: [Python-Dev] More compact dictionaries with faster iteration
> 

> I'd be interested to see what effect this has on memory constrained
> platforms such as many current ARM applications (mostly likely 32bit for
> now).  Python's memory consumption is an overheard complaint for folks
> developing for those platforms.
> 
> Cheers,
> -Barry
> 


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread Paul Moore

On 10 December 2012 16:35, Toshio Kuratomi  wrote:
> I have no problems with Obsoletes, Conflicts, Requires, and Provides types
> of fields are marked informational.  In fact, there are many cases where
> packages are overzealous in their use of Requires right now that cause
> distributions to patch the dependency information in the package metadata.

Given the endless debate on these fields, and the fact that it pretty
much all seems to be about what happens when tools enforce them, I'm
+1 on this. Particularly as these fields were not the focus of this
change to the spec in any case.

Paul.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Christian Heimes

Am 10.12.2012 16:28, schrieb Barry Warsaw:
> I'd be interested to see what effect this has on memory constrained platforms
> such as many current ARM applications (mostly likely 32bit for now).  Python's
> memory consumption is an overheard complaint for folks developing for those
> platforms.

I had ARM platforms in mind, too. Unfortunately we don't have any ARM
platforms for testing and performance measurement. Even Trent's
Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
that's all.

Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
Cortex-A9) and similar.

Christian
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread Toshio Kuratomi

On Fri, Dec 7, 2012 at 10:46 PM, PJ Eby  wrote:

>
> In any case, as I said before, I don't have an issue with the fields
> all being declared as being for informational purposes only.  My issue
> is only with recommendations for automated tool behavior that permit
> one project's author to exercise authority over another project's
> installation.

Skipping over a lot of other replies between you and I because I think that
we disagree on a lot but that's all moot if we agree here.

I have no problems with Obsoletes, Conflicts, Requires, and Provides types
of fields are marked informational.  In fact, there are many cases where
packages are overzealous in their use of Requires right now that cause
distributions to patch the dependency information in the package metadata.

-Toshio
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread PJ Eby

On Mon, Dec 10, 2012 at 3:27 AM, Stephen J. Turnbull  wrote:
> PJ Eby writes:
>
>  > By "clear", I mean "free of prior assumptions".
>
> Ah, well, I guess I've just run into a personal limitation.  I can't
> imagine thinking that is "free of prior assumptions".  Not my
> own, and not by others, either.

I suppose I should have said, "free of *known* prior assumptions",
since the trick to suspending assumptions is to find the ones you
*have*.

The deeper assumptions, alas, can usually only be found by clashing
opinions with others...  then stepping back and going, "wait...  what
does he/she believe that's *different* from what I believe, that
allows them to have that differing opinion?"

And then that's how you find out what it is that *you're* assuming,
that you didn't know you were assuming.  ;-)  (Not to mention what the
other person is.)

>  > Put together, the phrase "clear thinking on concrete use cases" means
>  > (at least to me), "dropping all preconceptions of the existing design
>  > and starting over from square one, to ask how best the problem may be
>  > solved, using specific examples as a guide rather than using
>  > generalities."
>
> Sure, but ISTM that's the opposite of what you've actually been doing,
> at least in terms of contributing to my understanding.  One obstacle
> to discussion you have contributed to overcoming in my thinking is the
> big generality that the packager (ie, the person writing the metadata)
> is in a position to recommend "good behavior" to the installation
> tool, vs. being in a position to point out "relevant considerations"
> for users and tools installing the packager's product.

Right, but I started from a concrete scenario I wanted to avoid, which
led me to question the assumption that those fields were actually
useful.  As soon as I began questioning *that* assumption and asking
for use cases (2 years ago, in the last PEP 345 revision discussion),
it became apparent to me that there was something seriously wrong with
the conflicts and obsoletes fields, as they had almost no real utility
as they were defined and understood at that point.

> Until that generality is formulated and expressed, it's very difficult
> to see why the examples and particular solutions to use cases that
> various proponents have described fail to address some real problems.

Unfortunately, it's a chicken-and-egg problem: until you know what
assumptions are being made, you can't formulate them.  It's an
iterative process of exposing assumptions, until you succeed in
actually communicating.  ;-)

Heck, even something as simple as my assumptions about what "clear
thinking" meant and what I was trying to say has taken some back and
forth to clarify.  ;-)

> It was difficult for me to see, at first, what distinction was
> actually being made.
>
> Specifically, I thought that the question about "Obsoletes" vs.
> "Obsoleted-By" was about which package should be considered
> authoritative about obsolescence.  That is a reasonable distinction
> for that particular discussion, but there is a deeper, and general,
> principle behind that.  Namely, "metadata is descriptive, not
> prescriptive."

Actually, the principle I was clinging to for *both* fields was not
giving project authors authority over other people's projects.  It's
fine for metadata to be prescriptive (e.g. requirements), it's just
that it should be prescriptive *only* for that project in isolation.
(In the broader sense, it also applies to the distro situation: the
project author doesn't really have authority over the distro, either,
so it can only be a suggestion there, as well.)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Brett Cannon

On Mon, Dec 10, 2012 at 10:28 AM, Barry Warsaw  wrote:

> On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote:
>
> >It's hard to predict how the extra CPU instructions are going to affect
> >the performance on different CPUs and scenarios. My guts tell me that
> >your proposal is going to perform better on modern CPUs with lots of L1
> >and L2 cache and in an average case scenario. A worst case scenario with
> >lots of collisions might be measurable slower for large dicts and small
> >CPU cache.
>
> I'd be interested to see what effect this has on memory constrained
> platforms
> such as many current ARM applications (mostly likely 32bit for now).
>  Python's
> memory consumption is an overheard complaint for folks developing for those
> platforms.
>

Maybe Kristjan can shed some light on how this would have helped him on the
ps3 work he has been doing for Dust 514 as he has had memory issues there.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Antoine Pitrou

Le Mon, 10 Dec 2012 11:16:49 -0500,
PJ Eby  a écrit :

> On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou
>  wrote:
> > Le Mon, 10 Dec 2012 10:40:30 +0100,
> > Armin Rigo  a écrit :
> >> Hi Raymond,
> >>
> >> On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
> >>  wrote:
> >> > Instead, the data should be organized as follows:
> >> >
> >> > indices =  [None, 1, None, None, None, 0, None, 2]
> >> > entries =  [[-9092791511155847987, 'timmy', 'red'],
> >> > [-8522787127447073495, 'barry', 'green'],
> >> > [-6480567542315338377, 'guido', 'blue']]
> >>
> >> As a side note, your suggestion also enables order-preserving
> >> dictionaries: iter() would automatically yield items in the order
> >> they were inserted, as long as there was no deletion.  People will
> >> immediately start relying on this "feature"...  and be confused by
> >> the behavior of deletion. :-/
> >
> > If that's really an issue, we can deliberately scramble the
> > iteration order a bit :-) (of course it might negatively impact HW
> > prefetching)
> 
> On the other hand, this would also make a fast ordered dictionary
> subclass possible, just by not using the free list for additions,
> combined with periodic compaction before adds or after deletes.

I suspect that's what Raymond has in mind :-)

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread PJ Eby

On Sun, Dec 9, 2012 at 10:38 PM, Andrew McNabb  wrote:
> On Fri, Dec 07, 2012 at 05:02:26PM -0500, PJ Eby wrote:
>> If the packages have files in conflict, they won't be both installed.
>> If they don't have files in conflict, there's nothing important to be
>> informed of.  If one is installing pexpect-u, then one does not need
>> to discover that it is a successor of pexpect.
>
> In the specific case of pexpect and pexpect-u, the files don't actually
> conflict.  The pexpect package includes a "pexpect.py" file, while
> pexpect-u includes a "pexpect/" directory.  These conflict, but not in
> the easily detectable sense.

Excellent!  A concrete non-file use case.  Setuptools handles this
particular scenario by including a list of top-level module or package
names, but newer tools ought to look out for this scenario, too.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread PJ Eby

On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou  wrote:
> Le Mon, 10 Dec 2012 10:40:30 +0100,
> Armin Rigo  a écrit :
>> Hi Raymond,
>>
>> On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
>>  wrote:
>> > Instead, the data should be organized as follows:
>> >
>> > indices =  [None, 1, None, None, None, 0, None, 2]
>> > entries =  [[-9092791511155847987, 'timmy', 'red'],
>> > [-8522787127447073495, 'barry', 'green'],
>> > [-6480567542315338377, 'guido', 'blue']]
>>
>> As a side note, your suggestion also enables order-preserving
>> dictionaries: iter() would automatically yield items in the order they
>> were inserted, as long as there was no deletion.  People will
>> immediately start relying on this "feature"...  and be confused by the
>> behavior of deletion. :-/
>
> If that's really an issue, we can deliberately scramble the iteration
> order a bit :-) (of course it might negatively impact HW prefetching)

On the other hand, this would also make a fast ordered dictionary
subclass possible, just by not using the free list for additions,
combined with periodic compaction before adds or after deletes.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Steven D'Aprano


On 10/12/12 20:40, Armin Rigo wrote:


As a side note, your suggestion also enables order-preserving
dictionaries: iter() would automatically yield items in the order they
were inserted, as long as there was no deletion.  People will
immediately start relying on this "feature"...  and be confused by the
behavior of deletion. :-/



If we want to avoid the attractive nuisance of iteration order being
almost, but not quite, order-preserving, there is a simple fix: when
iterating over the dict, instead of starting at the start of the table,
start at some arbitrary point in the middle and wrap around. That will
increase the cost of iteration slightly, but avoid misleading behaviour.

I think all we need do is change the existing __iter__ method from this:

def __iter__(self):
for key in self.keylist:
if key is not UNUSED:
yield key


to this:

# Untested!
def __iter__(self):
# Choose an arbitrary starting point.
# 103 is chosen because it is prime.
n = (103 % len(self.keylist)) if self.keylist else 0
for key in self.keylist[n:] + self.keylist[:n]:
# I presume the C version will not duplicate the keylist.
if key is not UNUSED:
yield key

This mixes the order of iteration up somewhat, just enough to avoid
misleading people into thinking that this is order-preserving. The
order will depend on the size of the dict, not the history. For
example, with keys a, b, c, ... added in that order, the output is:

len=1  key=a
len=2  key=ba
len=3  key=bca
len=4  key=dabc
len=5  key=deabc
len=6  key=bcdefa
len=7  key=fgabcde
len=8  key=habcdefg
len=9  key=efghiabcd

which I expect is mixed up enough to discourage programmers from
jumping to conclusions about dicts having guaranteed order.



--
Steven
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Antoine Pitrou

Le Mon, 10 Dec 2012 10:40:30 +0100,
Armin Rigo  a écrit :
> Hi Raymond,
> 
> On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
>  wrote:
> > Instead, the data should be organized as follows:
> >
> > indices =  [None, 1, None, None, None, 0, None, 2]
> > entries =  [[-9092791511155847987, 'timmy', 'red'],
> > [-8522787127447073495, 'barry', 'green'],
> > [-6480567542315338377, 'guido', 'blue']]
> 
> As a side note, your suggestion also enables order-preserving
> dictionaries: iter() would automatically yield items in the order they
> were inserted, as long as there was no deletion.  People will
> immediately start relying on this "feature"...  and be confused by the
> behavior of deletion. :-/

If that's really an issue, we can deliberately scramble the iteration
order a bit :-) (of course it might negatively impact HW prefetching)

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Barry Warsaw

On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote:

>It's hard to predict how the extra CPU instructions are going to affect
>the performance on different CPUs and scenarios. My guts tell me that
>your proposal is going to perform better on modern CPUs with lots of L1
>and L2 cache and in an average case scenario. A worst case scenario with
>lots of collisions might be measurable slower for large dicts and small
>CPU cache.

I'd be interested to see what effect this has on memory constrained platforms
such as many current ARM applications (mostly likely 32bit for now).  Python's
memory consumption is an overheard complaint for folks developing for those
platforms.

Cheers,
-Barry
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]

2012-12-10 Thread Toshio Kuratomi

On Sun, Dec 09, 2012 at 12:48:45AM -0500, PJ Eby wrote:
> 
> Do any of the distro folks know of a Python project tagged as
> conflicting with another for their distro, where the conflict does
> *not* involve any files in conflict?
> 
In Fedora we do work to avoid most types of Conflicts (backporting fixes,
etc) but I can give some examples of where Conflivts could have been used in
the past:

In docutils prior to the latest release, certain portions of docutils was
broken if pyxml was installed (since pyxml replaces certain stdlib xml.*
functionaltiy).  So older docutils versions could have had a Conflicts:
PyXML. Nick has since provided a technique for docutils to use that loads
from the stdlib first and only goes to PyXML if the functionality is not
available there.

Various libraries in web stacks have had bugs that prevent the propser
functioning of the web framework at the top level.  In case of major issues
(security, unable to startup), these top level frameworks could use
versioned Conflicts to prevent installation.  For instance:  TurboGears
might have a Conflicts: CherryPy < 2.3.1 

Note, though, that if parallel installable versions and selection of the
proper versions from that work, then this type of Conflict wouldn't be
necessary.  Instead you'd have versioned Requires: instead.

-Toshio

pgpb06wUdfxFB.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]

2012-12-10 Thread Toshio Kuratomi

On Sun, Dec 09, 2012 at 01:51:09PM +1100, Chris Angelico wrote:
> On Sun, Dec 9, 2012 at 1:11 PM, Steven D'Aprano  wrote:
> > Why would a software package called "Spam" install a top-level module called
> > "Jam" rather than "Spam"? Isn't the whole point of Python packages to solve
> > this namespace problem?
> 
> That would require/demand that the software package MUST define a
> module with its own name, and MUST NOT define any other top-level
> modules, and also that package names MUST be unique. (RFC 2119
> keywords.) That would work, as long as those restrictions are
> acceptable.
> 
/me notes that setuptools itself is an example of a package that violates
this rule )setuptools and pkg_resources).

No objections to "That would work, as long as those restrictions are
acceptable."... that seems to sum up where we're at.

-Toshio


pgpC3zr5ZrDtv.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Distutils] Is is worth disentangling distutils?

2012-12-10 Thread Daniel Holth

On Mon, Dec 10, 2012 at 2:22 AM, Antonio Cavallo wrote:

> Hi,
> I wonder if is it worth/if there is any interest in trying to "clean" up
> distutils: nothing in terms to add new features, just a *major* cleanup
> retaining the exact same interface.
>
>
> I'm not planning anything like *adding features* or rewriting rpm/rpmbuild
> here, simply cleaning up that un-holy code mess. Yes it served well, don't
> get me wrong, and I think it did work much better than anything it was
> meant to replace it.
>
> I'm not into the py3 at all so I wonder how possibly it could fit/collide
> into the big plan.
>
> Or I'll be wasting my time?
>

It has been tried before. IIUC the nature of distutils and extending
distutils is that client code depends on the entire tangle. If you try to
clean it up you will break backwards compatibility.

distutils2 is designed to break backwards compatibility with distutils and
is essentially a cleaned up distutils.

Have you tried Bento? http://bento.readthedocs.org/en/latest/
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Alexey Kachayev

Hi!

2012/12/10 Armin Rigo 

> Hi Raymond,
>
> On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
>  wrote:
> > Instead, the data should be organized as follows:
> >
> > indices =  [None, 1, None, None, None, 0, None, 2]
> > entries =  [[-9092791511155847987, 'timmy', 'red'],
> > [-8522787127447073495, 'barry', 'green'],
> > [-6480567542315338377, 'guido', 'blue']]
>
> As a side note, your suggestion also enables order-preserving
> dictionaries: iter() would automatically yield items in the order they
> were inserted, as long as there was no deletion.  People will
> immediately start relying on this "feature"...  and be confused by the
> behavior of deletion. :-/
>

I'm not sure about "relying" cause currently Python supports this feature
only with OrderedDict object. So it's common for python developers do not
rely on inserting ordering when using generic dict.


>
> A bientôt,
>
> Armin.
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/kachayev%40gmail.com
>



-- 
Kind regards,
Alexey S. Kachayev,
CTO at Kitapps Inc.
--
http://codemehanika.org
Skype: kachayev
Tel: +380-996692092
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Armin Rigo

Hi Raymond,

On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger
 wrote:
> Instead, the data should be organized as follows:
>
> indices =  [None, 1, None, None, None, 0, None, 2]
> entries =  [[-9092791511155847987, 'timmy', 'red'],
> [-8522787127447073495, 'barry', 'green'],
> [-6480567542315338377, 'guido', 'blue']]

As a side note, your suggestion also enables order-preserving
dictionaries: iter() would automatically yield items in the order they
were inserted, as long as there was no deletion.  People will
immediately start relying on this "feature"...  and be confused by the
behavior of deletion. :-/

A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Do more at compile time; less at runtime

2012-12-10 Thread Victor Stinner

Do know my registervm project? I try to emit better bytecode. See the
implementation on

http://hg.python.org/sandbox/registervm

See for example the documentation:

hg.python.org/sandbox/registervm/file/tip/REGISTERVM.txt
 

I implemented other optimisations. See also WPython project which uses also
more efficient bytecode.

http://code.google.com/p/wpython/

Victor
Le 9 déc. 2012 23:23, "Mark Shannon"  a écrit :

> Hi all,
>
> The current CPython bytecode interpreter is rather more complex than it
> needs to be. A number of bytecodes could be eliminated and a few more
> simplified by moving the work involved in handling compound statements
> (loops, try-blocks, etc) from the interpreter to the compiler.
>
> This simplest example of this is the while loop...
> while cond:
>body
>
> This currently compiled as
>
> start:
>if not cond goto end
>body
>goto start
> end:
>
> but it could be compiled as
>
> goto test:
> start:
> body
> if cond goto start
>
> which eliminates one instruction per iteration.
>
> A more complex example is a return in a try-finally block.
>
> try:
> part1
> if cond:
> return X
> part2
> finally:
> part3
>
> Currently, handling the return is complex and involves "pseudo
> exceptions", but if part3 were duplicated by the compiler, then the RETURN
> bytecode could just perform a simple return.
> The code above would be compiled thus...
>
> PUSH_BLOCK try
> part1
> if not X goto endif
> push X
> POP_BLOCK
> part3   <<< duplicated
> RETURN_VALUE
> endif:
> part2
> POP_BLOCK
> part3   <<< duplicated
>
> The changes I am proposing are:
>
> Allow negative line deltas in the lnotab array (bytecode deltas would
> remain non-negative)
> Remove the SETUP_LOOP, BREAK and CONTINUE bytecodes
> Simplify the RETURN bytecode
> Eliminate "pseudo exceptions" from the interpreter
> Simplify (or perhaps eliminate) SETUP_TRY, END_FINALLY, END_WITH.
> Reverse the sense of the FOR_ITER bytecode (ie. jump on not-exhausted)
>
>
> The net effect of these changes would be:
> Reduced code size and reduced code complexity.
> A small (1-5%)? increase in speed, due the simplification of the
> bytecodes and a very small change in the number of bytecodes executed.
> A small change in the static size of the bytecodes (-2% to +2%)?
>
> Although this is a quite intrusive change, I think it is worthwhile as it
> simplifies ceval.c considerably.
> The interpreter has become rather convoluted and any simplification has to
> be a good thing.
>
> I've already implemented negative line deltas and the transformed while
> loop: 
> https://bitbucket.org/**markshannon/cpython-lnotab-**signed
> I'm currently working on the block unwinding.
>
> So,
> Good idea? Bad idea?
> Should I write a PEP or is the bug tracker sufficient?
>
> Cheers,
> Mark.
>
>
>
>
>
>
>
> __**_
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/**mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/**mailman/options/python-dev/**
> victor.stinner%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Mark Shannon


On 10/12/12 01:44, Raymond Hettinger wrote:

The current memory layout for dictionaries is
unnecessarily inefficient.  It has a sparse table of
24-byte entries containing the hash value, key pointer,
and value pointer.

Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.


What minimum size and resizing factor do you propose for the entries array?

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Serhiy Storchaka


On 10.12.12 09:48, Christian Heimes wrote:

On the other hand every lookup and collision checks needs an additional
multiplication, addition and pointer dereferencing:

   entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx


I think that main slowdown will be in getting index from array with 
variable size of elements. It requires conditional jump which is not 
such cheap as additional or shifting.


switch (self->index_size) {
case 1: idx = ((uint8_t *)self->indices)[i]; break;
case 2: idx = ((uint16_t *)self->indices)[i]; break;
...
}

I think that variable-size indices don't worth effort.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Serhiy Storchaka


On 10.12.12 05:30, Raymond Hettinger wrote:

On Dec 9, 2012, at 10:03 PM, MRAB mailto:pyt...@mrabarnett.plus.com>> wrote:

What happens when a key is deleted? Will that leave an empty slot in
the entry array?

Yes.  See the __delitem__() method in the pure python implemention
at http://code.activestate.com/recipes/578375


You can move the last entry on freed place. This will preserve 
compactness of entries array and simplify and speedup iterations and 
some other operations.


def __delitem__(self, key, hashvalue=None):
if hashvalue is None:
hashvalue = hash(key)
found, i = self._lookup(key, hashvalue)
if found:
index = self.indices[i]
self.indices[i] = DUMMY
self.size -= 1
if index != size:
lasthash = self.hashlist[index]
lastkey = self.keylist[index]
found, j = self._lookup(lastkey, lasthash)
assert found
assert i != j
self.indices[j] = index
self.hashlist[index] = lasthash
self.keylist[index] = lastkey
self.valuelist[index] = self.valuelist[size]
index = size
self.hashlist[index] = UNUSED
self.keylist[index] = UNUSED
self.valuelist[index] = UNUSED
else:
raise KeyError(key)


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]

2012-12-10 Thread Stephen J. Turnbull

PJ Eby writes:

 > By "clear", I mean "free of prior assumptions".

Ah, well, I guess I've just run into a personal limitation.  I can't
imagine thinking that is "free of prior assumptions".  Not my
own, and not by others, either.

So, unfortunately, I was left with the conventional opposition in
thinking: "clear" vs. "muddy".  That impression was only strengthened
by the phrase "vs. simply copying fields from distro tools."

 > Put together, the phrase "clear thinking on concrete use cases" means
 > (at least to me), "dropping all preconceptions of the existing design
 > and starting over from square one, to ask how best the problem may be
 > solved, using specific examples as a guide rather than using
 > generalities."

Sure, but ISTM that's the opposite of what you've actually been doing,
at least in terms of contributing to my understanding.  One obstacle
to discussion you have contributed to overcoming in my thinking is the
big generality that the packager (ie, the person writing the metadata)
is in a position to recommend "good behavior" to the installation
tool, vs. being in a position to point out "relevant considerations"
for users and tools installing the packager's product.

Until that generality is formulated and expressed, it's very difficult
to see why the examples and particular solutions to use cases that
various proponents have described fail to address some real problems.
It was difficult for me to see, at first, what distinction was
actually being made.

Specifically, I thought that the question about "Obsoletes" vs.
"Obsoleted-By" was about which package should be considered
authoritative about obsolescence.  That is a reasonable distinction
for that particular discussion, but there is a deeper, and general,
principle behind that.  Namely, "metadata is descriptive, not
prescriptive."  Of course once one understands that principle, the
names of the fields don't matter so much, but it is helpful for
"naive" users of the metadata if the field names strongly connote
description of the package rather than behavior of the tool.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

50 matches

Mail list logo