Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-11 Thread Nick Coghlan
Anthony Baxter wrote:
On Thursday 10 March 2005 17:29, Raymond Hettinger wrote:
Or the implementation can have a switch to choose between keep-first
logic or replace logic.
The latter seems a bit odd to me.  The key position would be determined
by the first encountered while the value would be determined by the last
encountered.  Starting with [(10, v1), (20, v2), (10.0, v3)], the
ordered dictionary's items would look like [(10, v3), (20, v2)].

Or, alternately, we keep the last occurence, and move it to the new position.
There's a wide number of different use cases, each with a slightly different
final result, and for this reason alone I'm -0 on it for the library. 

Anthony
But surely such use cases could be more easily handled by subclassing from 
collections.OrderedDict and tweaking the semantics than by having to implement 
an ordered mapping from scratch.

Would the default semantics below really be that suprising?
An ordered dictionary remembers the order in which keys are first seen and, 
when iterated over, returns entries based on that order. This applies to direct 
iteration, iteration over values and (key, value) pairs, to the list-producing 
methods (i.e. keys(), values() and items()) and to any other operations that 
involve implicit iteration (e.g. converting to a string representation). 
Overwriting an entry replaces its value, but does not affect its position in the 
key order. Removing an entry (using 'del') _does_ remove it from the key order. 
Accordingly, if the entry is later recreated, it will then occur last in the key 
order.

This behaviour is analagous to that of a list constructed using only 
list.append() to add items (indeed, the key order can be thought of as a list 
constructed in this fashion, with keys appended to the list when they are first 
encountered).

An ordered dictionary provides a sort() method. The sort operation is applied to 
the key ordering and affects future iteration over the dictionary. Again, this 
is analagous to sorting a list.

For instance, to convert a standard dictionary to an ordered dictionary using a 
specific key function:

  x = collections.OrderedDict(sorted(d.itervalues(), key=keyfunc))
Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Michael Hudson
Delaney, Timothy C (Timothy) [EMAIL PROTECTED] writes:

 Set: Items are iterated over in the order that they are added. Adding an
 item that compares equal to one that is already in the set does not
 replace the item already in the set, and does not change the iteration
 order. Removing an item, then re-adding it moves the item to the end of
 the iteration order.

Well, this could be satisfied by an append_new operation on lists,
right (thinking of Common Lisps #'cl:pushnew)?  Complexity not that
great, of course, but I've written code like:

if a not in l:
l.append(a)

and not suffered that badly for it before now...

 Dict: Keys are iterated over in the order that they are added. Setting a
 value using a key that compares equal to one already in the dict
 replaces the value, but not the key, and does not change the iteration
 order. Removing a key (and value) then re-adding it moves the key to the
 end of the iteration order.

And these are what CL types call association lists, in effect.

Cheers,
mwh

-- 
  #ifndef P_tmpdir
  printf( Go buy a better computer );
  exit( ETHESKYISFALLINGANDIWANTMYMAMA );
 -- Dimitri Maziuk on writing secure code, asr
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Nick Coghlan
Delaney, Timothy C (Timothy) wrote:
OTOH, ordered set and ordered dict implies different things to
different people - usually sorted rather than the order things were
put in. Perhaps temporally-ordered ;)
OTGH*, I would expect an OrderedDict / OrderedSet to have 'add to the end' 
semantics, but also provide a 'sort()' method so that the ordering could be 
changed at a later date.

IOW, by default the ordering is temporal. Sorting the ordered dict/set changes 
the iteration order for the current contents. Further additions are still added 
in temporal order until such time as the dict/set is sorted again.

The parallels are to using list.append() to build a list, and list.sort() to 
order the current contents (in fact, a simplistic approach could use that exact 
technique to remember the order of keys, at the cost of doubling key storage 
requirements).

Cheers,
Nick.
*OTGH: On the gripping hand :)
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Barry Warsaw
On Wed, 2005-03-09 at 19:39, Tommy Burnette wrote:
 I'd say I'm +0.  fwiw- I've been using a locally-rolled OrderedDict
 implementation for the last 5-6 years in which insertion order is the
 only order respected.  I use it all over the place (in a code base of
 ~60k lines of python code).
 
 so there's another use case for you.  bust as you say, easy to do
 yourself... 

I'm -0 on adding it to the stdlib, but mostly because I don't like the
name, and I suspect it's going to be one of those nuggets lurking in the
standard library that few people will use, tending either to overlook it
or just roll their own anyway because the standard one doesn't have
quite the right semantics.

FWIW, email.Message.Message /exposes/ an ordered dictionary-like
interface even though it's implemented as a simple list.  It was
considered at the time that the number of headers in an email message
wouldn't be so large that anything else would be worth the complexity. 
I think that still holds, for the original uses cases at least.

-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread BJörn Lindqvist
I would LOVE for **kwargs to be an ordered dict. It would allow me to
write code like this:

.class MyTuple:
.def __init__(self, **kwargs):
.self.__dict__ = ordereddict(kwargs)
.
.def __iter__(self):
.for k, v in self.__dict__.items():
.yield v
.
.t = MyTuple(r = 99, g = 12, b = 4)
.r, g, b = t
.print r, g, b

I know it goes beyond the original intention of the proposal, but I
figure I'd mention it anyway because the unorder of **kwargs has been
bugging me alot.

-- 
mvh Björn
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Anthony Baxter
On Thursday 10 March 2005 17:29, Raymond Hettinger wrote:
 Or the implementation can have a switch to choose between keep-first
 logic or replace logic.

 The latter seems a bit odd to me.  The key position would be determined
 by the first encountered while the value would be determined by the last
 encountered.  Starting with [(10, v1), (20, v2), (10.0, v3)], the
 ordered dictionary's items would look like [(10, v3), (20, v2)].

Or, alternately, we keep the last occurence, and move it to the new position.
There's a wide number of different use cases, each with a slightly different
final result, and for this reason alone I'm -0 on it for the library. 

Anthony

-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-10 Thread Raymond Hettinger
[BJörn Lindqvist]
 I would LOVE for **kwargs to be an ordered dict. It would allow me to
 write code like this:
 
 .class MyTuple:
 .def __init__(self, **kwargs):
 .self.__dict__ = ordereddict(kwargs)

This doesn't work.  The kwargs are already turned into a regular
dictionary before ordereddict sees it.


Raymond Hettinger

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
Delaney, Timothy C (Timothy) wrote:
Does anyone else think it would be worthwhile adding these to
collections, or should I just make a cookbook entry?
-0 for the addition to the collections module, -1 on the specific
name.
Java made a *big* mistake by putting Hash into the names of these
things. From the outside, it is primarily a Dictionary; only when
you look closer you see that this specific dictionary requires
hashable keys (as opposed to, say, the C++ std::map, which requires
sortable keys).
So consequently, the data type should be called OrderedDictionary,
and its cookbook recipe is
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Steven Bethard
Thomas Heller [EMAIL PROTECTED] wrote:
 [About an ordered dictionary]
[snip] 
 I cannot understand why people are against adding it to stdlib (after
 the name, the implementation, and the exact place have been decided).
 It's certainly a useful data type, isn't it?

Well, that was basically the question I posed.  So far I've seen only
one use for it, and that one is better served by adding a function to
itertools.  What use do you have for it other than filtering
duplicates from a list while retaining order?

Steve
-- 
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread James Y Knight
On Mar 9, 2005, at 4:19 PM, Thomas Heller wrote:
If this were the only use case, you are right.  I cannot remember the
use case I once had right now, and probably the code has been rewritten
anyway.  But I assume use cases exist, otherwise there weren't so many
recipes for the ordered dictionary.
I use ordered dictionaries for testing. With an ordered dict I can 
string compare the output of my program to what is expected. Without an 
ordered dict, I'd have to re-parse the output and order it, which would 
require some complicated code that's just as likely to be wrong as the 
code I'm trying to test.

James
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
Thomas Heller wrote:
I cannot understand why people are against adding it to stdlib (after
the name, the implementation, and the exact place have been decided).
It's certainly a useful data type, isn't it?
It depends on the precise semantics. You often want a dictionary where
the keys come out in some order - but that is rarely the order in which
they were added to the dictionary. Most frequently, you want the keys
sorted, according to some criteria. If not that, I would assume that you
typically have the order of keys determined before even filling the
dictionary, in which case you can do
for k in keys_in_preferred_order:
v = hashtable[k]
process(k,v)
I remember having needed that once in the past 15 years (in Smalltalk
at the time), so I wrote an OrderedDictionary for Smalltalk/V (which
didn't have it). It took me an hour or so.
I don't recall what precisely it was that I needed it for, and I cannot
think of any use case for the data type right now.
So I'm -0 on adding the data type: I have a vague feeling it is needed,
but rarely, and I don't know precisely what for.
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread John Williams
Steven Bethard wrote:
 Thomas Heller [EMAIL PROTECTED] wrote:

 [About an ordered dictionary]


 Well, that was basically the question I posed.  So far I've seen only
 one use for it, and that one is better served by adding a function to
 itertools.  What use do you have for it other than filtering
 duplicates from a list while retaining order?

 Steve
Using a LinkedHashMap generally cuts down in the amount of apparent 
randomness in a program.  This is especially helpful when it comes time 
to debug a really complicated program by diffing log files, since it 
prevents slightly different maps from having wildly different iteration 
orders.  Often using a plain HashMap can introduce enough randomness to 
make two otherwise similar log files nearly impossible to compare.

The other use case I have is for dealing with data where the iteration 
order doesn't matter to the program but it does matter to users.  For 
instance, consider the ConfigParser's write method.  Any ordering of 
values in the output is functionally equivalent, but the original data 
is likely to have come from a file that was arranged in some meaningful 
order, and it would be nice to preserve that order, especially if it can 
be done with no extra effort.

--jw
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Martin v. Löwis
John Williams wrote:
The other use case I have is for dealing with data where the iteration 
order doesn't matter to the program but it does matter to users.  For 
instance, consider the ConfigParser's write method.  Any ordering of 
values in the output is functionally equivalent, but the original data 
is likely to have come from a file that was arranged in some meaningful 
order, and it would be nice to preserve that order, especially if it can 
be done with no extra effort.
That is a good example, IMO. But then, in the specific case, the
dictionary for each section is created deeply inside ConfigParser,
with the lines
cursect = {'__name__': sectname}
self._sections[sectname] = cursect
So this uses a builtin dict, and there is no easy way to override it
for the application.
Of course, given your reasoning, ConfigParser *should* use an
OrderedDictionary (probably both for cursect and for self._sections),
in which case this would all be transparent to the application.
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Tommy Burnette

I'd say I'm +0.  fwiw- I've been using a locally-rolled OrderedDict
implementation for the last 5-6 years in which insertion order is the
only order respected.  I use it all over the place (in a code base of
~60k lines of python code).

so there's another use case for you.  bust as you say, easy to do
yourself... 

=?ISO-8859-1?Q?Martin_v._L=F6wis?= writes:
| Thomas Heller wrote:
|  I cannot understand why people are against adding it to stdlib (after
|  the name, the implementation, and the exact place have been decided).
|  It's certainly a useful data type, isn't it?
| 
| It depends on the precise semantics. You often want a dictionary where
| the keys come out in some order - but that is rarely the order in which
| they were added to the dictionary. Most frequently, you want the keys
| sorted, according to some criteria. If not that, I would assume that you
| typically have the order of keys determined before even filling the
| dictionary, in which case you can do
| 
| for k in keys_in_preferred_order:
|  v = hashtable[k]
|  process(k,v)
| 
| I remember having needed that once in the past 15 years (in Smalltalk
| at the time), so I wrote an OrderedDictionary for Smalltalk/V (which
| didn't have it). It took me an hour or so.
| 
| I don't recall what precisely it was that I needed it for, and I cannot
| think of any use case for the data type right now.
| 
| So I'm -0 on adding the data type: I have a vague feeling it is needed,
| but rarely, and I don't know precisely what for.
| 
| Regards,
| Martin
| ___
| Python-Dev mailing list
| Python-Dev@python.org
| http://mail.python.org/mailman/listinfo/python-dev
| Unsubscribe: http://mail.python.org/mailman/options/python-dev/tommy%40ilm.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Aahz
On Wed, Mar 09, 2005, Tommy Burnette wrote:

 I'd say I'm +0.  fwiw- I've been using a locally-rolled OrderedDict
 implementation for the last 5-6 years in which insertion order is the
 only order respected.  I use it all over the place (in a code base of
 ~60k lines of python code).
 
 so there's another use case for you.  bust as you say, easy to do
 yourself... 

Gee, I just found out I could have used an OrderedDict today.  (We're
using a dict that we're now having to add an auxilliary list to to track
when keys are added.)  (This isn't particularly an argument in favor of
adding OrderedDict to stdlib, but it's another use case.  Each dict key
contains a dict value; the subkeys from later-added keys are supposed to
override earlier subkeys.  The original implementation relied on subkeys
being unique, but that doesn't work for our new business requirements.)
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death.  --GvR
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Aahz
On Wed, Mar 09, 2005, Raymond Hettinger wrote:
 [Aahz]

 Gee, I just found out I could have used an OrderedDict today.  (We're
 using a dict that we're now having to add an auxilliary list to to track
 when keys are added.)  (This isn't particularly an argument in favor of
 adding OrderedDict to stdlib, but it's another use case.  Each dict key
 contains a dict value; the subkeys from later-added keys are supposed to
 override earlier subkeys.  The original implementation relied on subkeys
 being unique, but that doesn't work for our new business requirements.)
 
 If I read the proposal correctly, order would be determined by when the
 key was first encountered.  Presumably, that would mean that the related
 value would also be the first encountered (not overridden by later-added
 keys as dictated by your business requirements).  

Hm  Guess this means we need a PEP!
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death.  --GvR
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Delaney, Timothy C (Timothy)
Jeff Bauer wrote:

 I'm not specifically lobbying for its inclusion in
 stdlib, but I often find an ordered dict useful when I
 want both ordered and random access, e.g. situations:
 
- database table fields/attributes
- drop down field selectors

Yep - these are also cases that are familiar to me - it's the type of
thing you don't think about until you actually need it.

 In both cases, I could get by with other techniques, but I
 would use stdlib ordered dicts if they were available.
 I also prefer the term ordered dict to LinkedHashXXX.

You may notice the subject is LinkedHashXXX *equivalents* ;) There is no
way I would ever propose adding them with those names.

OTOH, ordered set and ordered dict implies different things to
different people - usually sorted rather than the order things were
put in. Perhaps temporally-ordered ;)

BTW, just to clarify the semantics:

Set: Items are iterated over in the order that they are added. Adding an
item that compares equal to one that is already in the set does not
replace the item already in the set, and does not change the iteration
order. Removing an item, then re-adding it moves the item to the end of
the iteration order.

Dict: Keys are iterated over in the order that they are added. Setting a
value using a key that compares equal to one already in the dict
replaces the value, but not the key, and does not change the iteration
order. Removing a key (and value) then re-adding it moves the key to the
end of the iteration order.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Delaney, Timothy C (Timothy)
Steven Bethard wrote:

 def filterdups(iterable):
  seen = set()
  for item in iterable:
  if item not in seen:
  seen.add(item)
  yield item
 
 Adding this to, say, itertools would cover all my use cases.  And as
 long as you don't have too many duplicates, filterdups as above should
 keep memory consumption down better.

Thinking about this further - memory usage would be almost identical. By
the time you completed the iterable, you would have built up exactly the
same set internally - although probably not as memory efficient since it
would be being built piecemeal. OTOH, an ordered set has a bit of extra
memory for maintaining the order, so it's going to be pretty close.

The only thing this gains you (and it's significant) is the ability to
work on any iterable lazily.

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Steven Bethard
Delaney, Timothy C (Timothy) [EMAIL PROTECTED] wrote:
 Steven Bethard wrote:
 
  def filterdups(iterable):
   seen = set()
   for item in iterable:
   if item not in seen:
   seen.add(item)
   yield item
 
  Adding this to, say, itertools would cover all my use cases.  And as
  long as you don't have too many duplicates, filterdups as above should
  keep memory consumption down better.
 
 Thinking about this further - memory usage would be almost identical. By
 the time you completed the iterable, you would have built up exactly the
 same set internally - although probably not as memory efficient since it
 would be being built piecemeal. OTOH, an ordered set has a bit of extra
 memory for maintaining the order, so it's going to be pretty close.

Definitely true that if you iterate through your entire iterable, it
doesn't gain you anything over an OrderedSet.  OTOH, if you only end
up looking at the first N elements of the iterable, then this would be
more efficient than putting the entire iterable into an OrderedSet and
taking the first N from there.  Of course you can put only the first
elements into the OrderedSet, but note that you can't just put in the
first N; you have to add elements of the iterable into the OrderedSet
until it has len(obj) == N.  Not that this should be more than a few
lines of code, but it's code that you wouldn't have to write with
fitlerdups.

That said, you're right of course that filterdups is certainly not a
big win over OrderedSet.  I was really just trying to point out that
if we were just trying to provide a solution to the
filtering-duplicates-while-maintaining-order problem that OrderedSet
wasn't the only path to that goal.  I think since then there have been
a number of other reasonable cases suggested for wanting an
OrderedSet, e.g.:

* A DB result set that is indexed by a key but also maintains row
order [Eli Stevens, Jeff Bauer]

* A config-file data structure that indexes by option names but
maintains the order of elements read from a config file [John
Williams]

* Drop-down field selectors indexed by both name and sequence position
[Jeff Bauer]

So I'm now probably +0.5 on adding an OrderedSet to collections.  Not
that my opinion is particularly influential, of course. ;-)

Steve
-- 
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-09 Thread Raymond Hettinger
  If I read the proposal correctly, order would be determined by when
the
  key was first encountered.  Presumably, that would mean that the
related
  value would also be the first encountered (not overridden by
later-added
  keys as dictated by your business requirements).
 
 Hm  Guess this means we need a PEP!

Or the implementation can have a switch to choose between keep-first
logic or replace logic.

The latter seems a bit odd to me.  The key position would be determined
by the first encountered while the value would be determined by the last
encountered.  Starting with [(10, v1), (20, v2), (10.0, v3)], the
ordered dictionary's items would look like [(10, v3), (20, v2)].  


Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Brett C.
Steven Bethard wrote:
Delaney, Timothy C (Timothy) [EMAIL PROTECTED] wrote:
The perennial how do I remove duplicates from a list topic came up on
c.l.py and in the discussion I mentioned the java 1.5 LinkedHashSet and
LinkedHashMap. I'd thought about proposing these before, but couldn't
think of where to put them. It was pointed out that the obvious place
would be the collections module.
For those who don't know, LinkedHashSet and LinkedHashMap are simply
hashed sets and maps that iterate in the order that the keys were added
to the set/map. I almost invariably use them for the above scenario -
removing duplicates without changing order.
Does anyone else think it would be worthwhile adding these to
collections, or should I just make a cookbook entry?

I guess I'm -0 on this.
Though I was the one that suggested that collections is the right
place to put them, I'm not really certain how much we gain by
including them.  I too would only ever use them for removing
duplicates from a list.  But if we're trying to provide a solution to
this problem, I'd rather see an iterable-friendly one.  See a previous
thread on this issue[1] where I suggest something like:
def filterdups(iterable):
 seen = set()
 for item in iterable:
 if item not in seen:
 seen.add(item)
 yield item
Adding this to, say, itertools would cover all my use cases.  And as
long as you don't have too many duplicates, filterdups as above should
keep memory consumption down better.
I am -1 on adding LinkedHash*.  While I can understand wanting to get rid of 
duplicates easily and wanting a good solutoin, Steven's snippet of code shows 
rolling your own solution is easy.

Plus this can even be simplified down to a one-liner using itertools already::
  itertools.ifilterfalse(lambda item, _set=set():
   (item in _set) or (_set.add(item) and False),
 iterable)
I don't think it is the prettiest solution, but it does show that coming up 
with one is not hard nor restricted to only a single solution that requires 
knowing some Python idiom (well, mine does for the default arg to the lambda, 
but Steven's doesn't).

The last thing I want to happen is for Python's stdlib to grow every possible 
data structure out there like Java seems to have.  I don't ever want to have to 
think about what *variant* of a data structure I should use, only what *type* 
of data structure (and no, I don't count collections.deque and Queue.Queue an 
overlap since the latter is meant more for thread communication, at least in my 
mind).

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] LinkedHashSet/LinkedHashMap equivalents

2005-03-08 Thread Delaney, Timothy C (Timothy)
Greg Ward wrote:

 I'll attach another approach to the same problem, an ordered
 dictionary object.  I believe the semantics of
 adding/readding/deleting keys is the same as java.util.LinkedHashMap
 -- certainly it seems the most sensible and easy-to-implement
 semantics. 

That's essentially the same approach I used - I didn't get around to
properly implementing a doubly-linked list between the keys - I just
maintained a list with the correct order. If I'm going to write a
Cookbook recipe though I should probably do it properly ...

Personally, I think it would be quite nice if all python dictionaries
had these semantics - it would be nice to be able to say that items will
be returned in the order they're inserted - but I wouldn't want to incur
the additional cost in such a fundamental data structure.

One thing I did notice - dict.__repr__ and dict.__str__ don't produce
strings in the iteration order of subclasses - they always use the
arbitrary dict iteration order. Is this intentional?

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com