On Thu, 2006-03-30 at 20:23 +1000, Nick Coghlan wrote: > Robert Brewer wrote: > > Nick Coghlan wrote: > >> There are three big use cases: > >> > >> dict.keys > >> dict.values > >> dict.items > >> > >> Currently these all return lists, which may be expensive in > >> terms of copying. They all have iter* variants which while > >> memory efficient, are far less convenient to work with. > > > > I'm still wondering what "far less convenient" means. Is it simply the 4 > > extra key presses? I find the iter* variants to be a great solution. > > An iterator has some serious limitations as a view of a container: > > 1. Can only iterate once
Call the iter generator again to get a fresh iter. > 2. Can't check number of items len( dict ) > 3. Truth value testing doesn't work What does true and false for an iter mean? > 4. Containment tests don't work Put the iter down and use the view. 'x' in dict (1,'abc') in dict.items 'albatross' in dict.values > 5. String representation is well-nigh useless print list( mydata ) > 6. Value-based comparison doesn't work You mean d1 = {... d2 = {... d1.keys() == d2.keys() Eww, you are depending on a lot on the internal operation of the dict. It is possiable to generate different orders for the same key set (hash tables start at 8 slots; any 5 items x & 0x7 == y & 0x7 will collide ... Watch: >>> d = {} >>> d[0] = 0 >>> d[8] = 8 >>> d.keys() [0, 8] >>> d = {} >>> d[8] = 8 >>> d[0] = 0 >>> d.keys() [8, 0] > > The source object supports all of these things, but the iterator doesn't. > This > is why iteritems and friends are poor substitutes for their list based > equivalents in many situations. I feel strongly about this. Sorry in advance about the rant ... What you describe in this paragraph is a special case. A dict.key isn't a list; while it was implemeneted as a list, and the list form is currently understood, it isn't really a list because dict's have no ordering. That we call it a list is a product of our own inertia At risk of being sacrcastic the semantics (however not the performance) of dict.keys is basically: random.shuffle( list(dict.iter()) ) Basically, part of what I'm advocating for is the functions associated with a datastore to do the minimum amount of work to support their underlying semantics. There seemed to be a concensus in the community on the size of the view proposal, and I'm reimplementing the PEP to reflect that. But what I can't resolve is the other anciliary issue: "To list or iter." I'm not yet ready to resolve that issue. The views don't resolve it either, and by their nature are biased towards the iter approach. They provide __iter__ because its light weight to do, but there is no way a light weight view can provide you with ordering information from an unordered datastore. Now, as a means of resolving this conflict, I'm open to the notion of a view implementing both __iter__ and an explicit .list method to avoid any extra overhead in generating a list from an iter instead of directly from the dict as we do now. Personally, I think returning a list as a default is disgusting. Given any two operations, one which is cheap and the other expensive, I feel its the expensive one that your should have to work for. We have two choices: * Force the iter club to build that nasty list and iterate off of it. There are times when you *can't* do this ... when your dict is big and memory is limited. The integral( d_time, memory ) is a lot worse here, and this will be a killer on memory constrained devices. * Force the list club to get a nasty iter that they have to build a list from. The cost: list( ... ) -- they have the same overall O(n) cost, perhaps with a small constant overhead, they had before. (Lots of pain upfront, or a little pain as you go, oh, I can see all sorts of classical economic "utility of" arguments. The value of your n-st byte of ram vs. your m-th.) If given the requirement that the language implement only one, I feel that it must be an iter. If the programmer feels that the iter is a bad substitute for the list, then they can cast to a list. If the programmer feels that the list is a bad substitute for the iter, well, they could cast to a list when their machine is done paging or their telephone reboots. Insistence on a list represents a certain degree of hubris with respect to the available resources. Some users, particularly the embedded market, have significant space constrains. Cheers - Adam _______________________________________________ Python-3000 mailing list Python-3000@python.org http://mail.python.org/mailman/listinfo/python-3000 Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com