Re: best way to compare contents of 2 lists?

2009-04-27 Thread ajaksu
On Apr 24, 4:41 pm, norseman norse...@hughes.net wrote:
 (How do I) ...explain these?
[...]
 I can get the sashimi and take it home and BBQ it, I can roast it, I can
 steam it, I can wok it,..., but the other is what it is. (greasy)

 Besides, who says I like your cooking?  :)

Err... http://mail.python.org/pipermail/python-list/2008-September/679360.html
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-25 Thread Esmail

norseman wrote:


Of course each method has its time and place of use and Python has some 
well oiled list search and list maintain routines of its own. List 
comparisons are most accurate when using presorted ones. (Some things 
don't lend themselves to sorting very well. Like paragraphs, books, 
chapters, computer programs, manuals, etc... These need the searchers 
(equivalent to the Unix diff) for checking equivalence.)



HTH

Steve


Thanks Steve for bringing up various items to consider for these sort
of tasks, very helpful indeed.

Best,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread bearophileHUGS
Esmail:
 oh, I forgot to mention that each list may contain duplicates.

Comparing the sorted lists is a possible O(n ln n) solution:

a.sort()
b.sort()
a == b

Another solution is to use frequency dicts, O(n):

from itertools import defaultdict
d1 = defaultdict(int)
for el in a:
d1[el] += 1
d2 = defaultdict(int)
for el in b:
d2[el] += 1
d1 == d2

As the arrays (Python lists) become large the second solution may
become faster.

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Piet van Oostrum
 John Yeung gallium.arsen...@gmail.com (JY) wrote:

JY It takes care of the duplicates, but so does your initial solution,
JY which I like best:

 sorted(a)==sorted(b)

JY This is concise, clear, and in my opinion, the most Pythonic.  It may
JY well even be the fastest.  (If you didn't have to match up the numbers
JY of duplicates, the set solution would be most Pythonic and probably
JY fastest.)

But it may fail if the list cannot be sorted because it contains
incomparable elements:

 a = [1, 2, 3+4j]
 b = [2, 3+4j, 1]
 set(a) == set(b)
True
 sorted(a)==sorted(b)
Traceback (most recent call last):
  File stdin, line 1, in module
TypeError: no ordering relation is defined for complex numbers

In Python 3 there are even more incomparable objects pairs.
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Steven D'Aprano
On Thu, 23 Apr 2009 21:51:42 -0400, Esmail wrote:

 set(a) == set(b)# test if a and b have the same elements
 
 # check that each list has the same number of each element # i.e.   
 [1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2] for elem in set(a):
   a.count(elem) == b.count(elem)
 
 Ah .. this part would take care of different number of duplicates in the
 lists. Cool.

At significant cost of extra work.

Counting the number of times a single element occurs in the list is O(N). 
Counting the number of times every element occurs in the list is O(N**2). 
Sorting is O(N*log N), so for large lists, sorting will probably be much 
cheaper.


-- 
Steven
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Esmail

John Yeung wrote:

 so does your initial solution,
which I like best:


sorted(a)==sorted(b)


This is concise, clear, and in my opinion, the most Pythonic.  It may
well even be the fastest.  


Great .. I can live with that :-)

--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Esmail

MRAB wrote:


You could use Raymond Hettinger's Counter class:

http://code.activestate.com/recipes/576611/

on both lists and compare them for equality.


thanks for the pointer, I'll study the code provided.

Esmail

--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread bearophileHUGS
Arnaud Delobelle:
 Thanks to the power of negative numbers, you only need one dict:

 d = defaultdict(int)
 for x in a:
     d[x] += 1
 for x in b:
     d[x] -= 1
 # a and b are equal if d[x]==0 for all x in d:
 not any(d.itervalues())

Very nice, I'll keep this for future use.
Someday I'll have to study this new new kind of numbers that can
represent borrowed items too.

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Arnaud Delobelle
On Apr 24, 7:12 am, bearophileh...@lycos.com wrote:
[...]
 Another solution is to use frequency dicts, O(n):

 from itertools import defaultdict
 d1 = defaultdict(int)
 for el in a:
     d1[el] += 1
 d2 = defaultdict(int)
 for el in b:
     d2[el] += 1
 d1 == d2

Thanks to the power of negative numbers, you only need one dict:

d = defaultdict(int)
for x in a:
d[x] += 1
for x in b:
d[x] -= 1
# a and b are equal if d[x]==0 for all x in d:
not any(d.itervalues())

--
Arnaud
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Esmail

Thanks all, after reading all the posting and suggestions for
alternatives, I think I'll be going with

sorted(a)==sorted(b)

it seems fast, intuitive and clean and can deal with
duplicates too.

Best,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Esmail

Piet van Oostrum wrote:

John Yeung gallium.arsen...@gmail.com (JY) wrote:



JY It takes care of the duplicates, but so does your initial solution,
JY which I like best:



sorted(a)==sorted(b)



JY This is concise, clear, and in my opinion, the most Pythonic.  It may
JY well even be the fastest.  (If you didn't have to match up the numbers
JY of duplicates, the set solution would be most Pythonic and probably
JY fastest.)


But it may fail if the list cannot be sorted because it contains
incomparable elements:


Ah .. good point .. something I had not considered. Lucky for me in
this case it's not an issue, but something I'll keep in mind for the
future.

--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Terry Reedy

Steven D'Aprano wrote:

On Thu, 23 Apr 2009 21:51:42 -0400, Esmail wrote:


set(a) == set(b)# test if a and b have the same elements

# check that each list has the same number of each element # i.e.   
[1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2] for elem in set(a):

  a.count(elem) == b.count(elem)

Ah .. this part would take care of different number of duplicates in the
lists. Cool.


At significant cost of extra work.

Counting the number of times a single element occurs in the list is O(N). 
Counting the number of times every element occurs in the list is O(N**2). 


A frequency dict should be O(n) also, and hence faster than sorting.

Sorting is O(N*log N), so for large lists, sorting will probably be much 
cheaper.





--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread norseman

Esmail wrote:

What is the best way to compare the *contents* of two different
lists regardless of their respective order? The lists will have
the same number of items, and be of the same type.

E.g. a trivial example (my lists will be larger),

a=[1, 2, 3]

b=[2, 3, 1]

should yield true if a==b

I suppose I could sort them and then compare them. I.e.,

sorted(a)==sorted(b)


I am wondering if there is a more efficient/preferred way to do so.

Thanks,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list


=
Technically, ==  is reserved for identical, as in byte for byte same

-If sorted(listA) == sorted(listB):
-  etc
Is the preferred.

While there are other ways, the ones I know of are much more computer 
intensive.


Things like:
  get (next) line from file1
  if checking it against each line of file2 yields a found
loop
  else
...
are a real I/O killers.
Indexes are good but introduce a separate maintenance issue.


By the way - does original order of original files count?
If not, sorting AND KEEPING can speed up future things.

Of course each method has its time and place of use and Python has some 
well oiled list search and list maintain routines of its own. List 
comparisons are most accurate when using presorted ones. (Some things 
don't lend themselves to sorting very well. Like paragraphs, books, 
chapters, computer programs, manuals, etc... These need the searchers 
(equivalent to the Unix diff) for checking equivalence.)



HTH

Steve
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread Steven D'Aprano
On Fri, 24 Apr 2009 10:39:39 -0700, norseman wrote:

 Technically, ==  is reserved for identical, as in byte for byte same

Really? Then how do you explain these?

 u'abc' == 'abc'
True
 1 == 1.0
True
 2L == 2
True

 import decimal
 decimal.Decimal('42') == 42
True


Here's one to think about:

 d1 = {-1: None, -2: None}
 d2 = {-2: None, -1: None}
 print d1, d2
{-2: None, -1: None} {-1: None, -2: None}
 d1 == d2
True

If d1 and d2 are equal and both are dictionaries with the same keys and 
same values, why do they print in different orders?



-- 
Steven
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-24 Thread norseman

Steven D'Aprano wrote:

On Fri, 24 Apr 2009 10:39:39 -0700, norseman wrote:


Technically, ==  is reserved for identical, as in byte for byte same


Really? Then how do you explain these?


u'abc' == 'abc'

True

1 == 1.0

True

2L == 2

True

import decimal
decimal.Decimal('42') == 42

True


Here's one to think about:


d1 = {-1: None, -2: None}
d2 = {-2: None, -1: None}
print d1, d2

{-2: None, -1: None} {-1: None, -2: None}

d1 == d2

True

If d1 and d2 are equal and both are dictionaries with the same keys and 
same values, why do they print in different orders?





=
(How do I) ...explain these?
  Python cooks the test function.
  That explains a few things I thought were kinda weird.

...one to think about:
Either someone has a programming glitch or else likes stack use.
LoadA push, read-LILO, readB, compare, loop to read-LILO.
And they decided to just pre-load as a matter of course, but maybe 
forgot to read-LILO in print... ie.. didn't finish cooking print.

(or left it out as a reminder or)

Technically, ==  is reserved for identical, as in byte for byte same
(when == is used to check identical, uncooked)


As a matter of self - I don't favor cooking. I prefer 'cooked' things 
have a different name so I can get what I want, when I want.

Choice on wall:
  fish, deep fried in light beer batter
  sashimi

in case you didn't notice - cooking takes longer :)

I can get the sashimi and take it home and BBQ it, I can roast it, I can 
steam it, I can wok it,..., but the other is what it is. (greasy)


Besides, who says I like your cooking?  :)

Steve
--
http://mail.python.org/mailman/listinfo/python-list


best way to compare contents of 2 lists?

2009-04-23 Thread Esmail

What is the best way to compare the *contents* of two different
lists regardless of their respective order? The lists will have
the same number of items, and be of the same type.

E.g. a trivial example (my lists will be larger),

a=[1, 2, 3]

b=[2, 3, 1]

should yield true if a==b

I suppose I could sort them and then compare them. I.e.,

sorted(a)==sorted(b)


I am wondering if there is a more efficient/preferred way to do so.

Thanks,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list


RE: best way to compare contents of 2 lists?

2009-04-23 Thread Hans DushanthaKumar
'set' comes to mind, though I'm not sure if there are performance
inplications with large amopunts of data.

 a=[1, 2, 3]
 b=[2, 3, 1]
 set(a) == set(b)
True



Cheers,
Hans

-Original Message-
From: python-list-bounces+hans.dushanthakumar=hcn.com...@python.org
[mailto:python-list-bounces+hans.dushanthakumar=hcn.com...@python.org]
On Behalf Of Esmail
Sent: Friday, 24 April 2009 11:31 AM
To: python-list@python.org
Subject: best way to compare contents of 2 lists?

What is the best way to compare the *contents* of two different
lists regardless of their respective order? The lists will have
the same number of items, and be of the same type.

E.g. a trivial example (my lists will be larger),

a=[1, 2, 3]

b=[2, 3, 1]

should yield true if a==b

I suppose I could sort them and then compare them. I.e.,

sorted(a)==sorted(b)


I am wondering if there is a more efficient/preferred way to do so.

Thanks,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-23 Thread David Robinow
On Thu, Apr 23, 2009 at 9:31 PM, Esmail ebo...@hotmail.com wrote:
 What is the best way to compare the *contents* of two different
 lists regardless of their respective order? The lists will have
 the same number of items, and be of the same type.

 E.g. a trivial example (my lists will be larger),

 a=[1, 2, 3]

 b=[2, 3, 1]

 should yield true if a==b

 I suppose I could sort them and then compare them. I.e.,

 sorted(a)==sorted(b)


 I am wondering if there is a more efficient/preferred way to do so.

 Thanks,
 Esmail

 --
 http://mail.python.org/mailman/listinfo/python-list


set(a) == set(b)# test if a and b have the same elements

# check that each list has the same number of each element
# i.e.[1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2]
for elem in set(a):
  a.count(elem) == b.count(elem)
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-23 Thread Esmail

Esmail wrote:

What is the best way to compare the *contents* of two different
lists regardless of their respective order? The lists will have
the same number of items, and be of the same type.

E.g. a trivial example (my lists will be larger),

a=[1, 2, 3]

b=[2, 3, 1]

should yield true if a==b

I suppose I could sort them and then compare them. I.e.,

sorted(a)==sorted(b)


I am wondering if there is a more efficient/preferred way to do so.


oh, I forgot to mention that each list may contain duplicates. So I
suppose this means I would have to sort the two lists and compare them
afterall, unless there's another way?


--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-23 Thread Esmail

David Robinow wrote:

On Thu, Apr 23, 2009 at 9:31 PM, Esmail ebo...@hotmail.com wrote:

What is the best way to compare the *contents* of two different
lists regardless of their respective order? The lists will have
the same number of items, and be of the same type.

E.g. a trivial example (my lists will be larger),

a=[1, 2, 3]

b=[2, 3, 1]

should yield true if a==b

I suppose I could sort them and then compare them. I.e.,

sorted(a)==sorted(b)


I am wondering if there is a more efficient/preferred way to do so.

Thanks,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list



set(a) == set(b)# test if a and b have the same elements

# check that each list has the same number of each element
# i.e.[1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2]
for elem in set(a):
  a.count(elem) == b.count(elem)


Ah .. this part would take care of different number of duplicates
in the lists. Cool.

thanks,
Esmail


--
http://mail.python.org/mailman/listinfo/python-list



--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-23 Thread MRAB

Esmail wrote:

Esmail wrote:

What is the best way to compare the *contents* of two different
lists regardless of their respective order? The lists will have
the same number of items, and be of the same type.

E.g. a trivial example (my lists will be larger),

a=[1, 2, 3]

b=[2, 3, 1]

should yield true if a==b

I suppose I could sort them and then compare them. I.e.,

sorted(a)==sorted(b)


I am wondering if there is a more efficient/preferred way to do so.


oh, I forgot to mention that each list may contain duplicates. So I
suppose this means I would have to sort the two lists and compare them
afterall, unless there's another way?


You could use Raymond Hettinger's Counter class:

http://code.activestate.com/recipes/576611/

on both lists and compare them for equality.
--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-23 Thread Esmail

Hans DushanthaKumar wrote:
'set' comes to mind, 


Yes, I thought about that, but I wasn't sure how to potentially
deal with different number of duplicates in the lists. The post
by David seems to address that issue nicely.


though I'm not sure if there are performance
inplications with large amopunts of data.


Yes, I wonder about these sort of things too, not knowing too
much about how Python does things internally.

Thanks for the suggestion,
Esmail

--
http://mail.python.org/mailman/listinfo/python-list


Re: best way to compare contents of 2 lists?

2009-04-23 Thread John Yeung
Esmail ebo...@hotmail.com wrote:
 What is the best way to compare the *contents* of two different
 lists regardless of their respective order? The lists will have
 the same number of items, and be of the same type.

Best can mean different things.  Fastest?  Shortest code?  Most
readable?

 David Robinow wrote:
  set(a) == set(b)    # test if a and b have the same elements

  # check that each list has the same number of each element
  # i.e.    [1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2]
  for elem in set(a):
    a.count(elem) == b.count(elem)

 Ah .. this part would take care of different number of duplicates
 in the lists. Cool.

It takes care of the duplicates, but so does your initial solution,
which I like best:

 sorted(a)==sorted(b)

This is concise, clear, and in my opinion, the most Pythonic.  It may
well even be the fastest.  (If you didn't have to match up the numbers
of duplicates, the set solution would be most Pythonic and probably
fastest.)

John
--
http://mail.python.org/mailman/listinfo/python-list