Re: Efficient Rank Ordering of Nested Lists
On Aug 4, 7:29 pm, [EMAIL PROTECTED] (Alex Martelli) wrote: Cousin Stanley [EMAIL PROTECTED] wrote: ... for i , item in reversed( enumerate( sorted( single_list ) ) ) : ... TypeError: argument to reversed() must be a sequence Oops, right. Well then, aux_seq = list(enumerate(sorted(single_list))) for i, item in reversed(aux_seq): ... or the like. Alex The particular embedded list I have been tinkering with is 43 x 3884 and the dictionary approach (as you can probably guess) is a big improvement over the linear one. It also has the added benefit of being compatible with Python24. If I'm not mistaken the binary approach suggested by Neil requires Python25. Since in some situations I don't have control over the version available I will go with a variant of the dictionary recipe. -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
On Aug 3, 8:38 am, [EMAIL PROTECTED] (Alex Martelli) wrote: [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Each use of sortedList.index is O(N) [where N is len(singleList)], and you have N such uses in the map in the inner function, so this approach is O(N squared). Neil's suggestion to use bisect replaces the O(N) .index with an O(log N) search, so the overall performance is O(N log N) [[and you can't do better than that, big-O wise, because the sort step is also O(N log N)]]. beginner's advice to use a dictionary is also good and may turn out to be faster, just because dicts are SO fast in Python -- but you need to try and measure both alternatives. One way to use a dict (warning, untested code): def rankList(singleList): d = {} for i, item in reversed(enumerate(sorted(singleList))): d[item] = i return [d[item] for item in singleList] If you find the for-clause too rich in functionality, you can of course split it up a bit; but note that you do need the 'reversed' to deal with the corner case of duplicate items (without it, you'd end up with 1s instead of 0s for the third one of the sublists in your example). If speed is of the essence you might try to measure what happens if you replace the returned expression with map(d.__getitem__, singleList), but I suspect the list comprehension is faster as well as clearer. Another potential small speedup is to replace the first 3 statements with just one: d = dict((item,i) for i,item in reversed(enumerate(sorted(singleList but THIS density of functionality is a bit above my personal threshold of comfort (sparse is better than dense:-). Alex Thanks for breaking it down so thoroughly. I try the different recipes and see which gives the best trade offs between readability and performance. Agreed that dict() approach looks promising. -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
On Aug 3, 9:20 am, [EMAIL PROTECTED] wrote: On Aug 2, 10:20 pm, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Isn't there something wrong with the ordering? Pablo's answers are: [ 1, 2, 3, 4, 5 ] == [0, 1, 2, 3, 4] correct [ 3, 1, 5, 2, 4 ] == [2, 0, 4, 1, 3] wrong? [ -1.1, 2.2, 0, -1.1, 13 ] == [0, 3, 2, 0, 4] wrong? Doing it in my head I get: [ 3, 1, 5, 2, 4 ] == [ 1, 3, 0, 4, 2 ] [ -1.1, 2.2, 0, -1.1, 13 ] == [0, 3, 2, 1, 4] What gives? Did I misunderstand what rank ordering (handling ties as well) means? There are many ways to skin this cat. I am considering the following approach: http://en.wikipedia.org/wiki/Rank_order#Standard_competition_ranking_.28.221224.22_ranking.29 -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
On Aug 3, 5:53 am, Neil Cerutti [EMAIL PROTECTED] wrote: On 2007-08-03, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Use binary search instead of linear. import bisect import functools def rankLists(nestedList): def rankList(singleList): sortedList = list(sorted(singleList)) return map(functools.partial(bisect.bisect_left, sortedList), singleList) return map(rankList, nestedList) -- Neil Cerutti Facts are stupid things. --Ronald Reagan Very clean and precisely what I was looking for. Thanks for expanding my knowledge of the language. -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
beginner's advice to use a dictionary is also good and may turn out to be faster, just because dicts are SO fast in Python -- but you need to try and measure both alternatives. One way to use a dict ( warning, untested code ) : def rank_list( single_list ) : d = { } for i , item in reversed( enumerate( sorted( single_list ) ) ) : d[ item ] = i return [ d[ item ] for item in single_list ] Alex I tested it but I don't know how to fix it :-) Both Pablo's original version and Neil's version work but I get the following traceback from your version Traceback (most recent call last): File list_nested_rank.py, line 100, in module test_01( list_source ) File list_nested_rank.py, line 87, in test_01 list_print( this_func( list_source ) ) File list_nested_rank.py, line 62, in rank_lists_02 return map( rank_list , nested_list ) File list_nested_rank.py, line 56, in rank_list for i , item in reversed( enumerate( sorted( single_list ) ) ) : TypeError: argument to reversed() must be a sequence -- Stanley C. Kitching Human Being Phoenix, Arizona == Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News== http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups = East and West-Coast Server Farms - Total Privacy via Encryption = -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
Cousin Stanley [EMAIL PROTECTED] wrote: ... for i , item in reversed( enumerate( sorted( single_list ) ) ) : ... TypeError: argument to reversed() must be a sequence Oops, right. Well then, aux_seq = list(enumerate(sorted(single_list))) for i, item in reversed(aux_seq): ... or the like. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
Cousin Stanley [EMAIL PROTECTED] wrote: ... for i , item in reversed( enumerate( sorted( single_list ) ) ) : ... TypeError: argument to reversed() must be a sequence Oops, right. Well then, aux_seq = list(enumerate(sorted(single_list))) for i, item in reversed(aux_seq): ... Alex That fixed it. Thanks -- Stanley C. Kitching Human Being Phoenix, Arizona == Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News== http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups = East and West-Coast Server Farms - Total Privacy via Encryption = -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Each use of sortedList.index is O(N) [where N is len(singleList)], and you have N such uses in the map in the inner function, so this approach is O(N squared). Neil's suggestion to use bisect replaces the O(N) .index with an O(log N) search, so the overall performance is O(N log N) [[and you can't do better than that, big-O wise, because the sort step is also O(N log N)]]. beginner's advice to use a dictionary is also good and may turn out to be faster, just because dicts are SO fast in Python -- but you need to try and measure both alternatives. One way to use a dict (warning, untested code): def rankList(singleList): d = {} for i, item in reversed(enumerate(sorted(singleList))): d[item] = i return [d[item] for item in singleList] If you find the for-clause too rich in functionality, you can of course split it up a bit; but note that you do need the 'reversed' to deal with the corner case of duplicate items (without it, you'd end up with 1s instead of 0s for the third one of the sublists in your example). If speed is of the essence you might try to measure what happens if you replace the returned expression with map(d.__getitem__, singleList), but I suspect the list comprehension is faster as well as clearer. Another potential small speedup is to replace the first 3 statements with just one: d = dict((item,i) for i,item in reversed(enumerate(sorted(singleList but THIS density of functionality is a bit above my personal threshold of comfort (sparse is better than dense:-). Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
On 2007-08-03, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Use binary search instead of linear. import bisect import functools def rankLists(nestedList): def rankList(singleList): sortedList = list(sorted(singleList)) return map(functools.partial(bisect.bisect_left, sortedList), singleList) return map(rankList, nestedList) -- Neil Cerutti Facts are stupid things. --Ronald Reagan -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
On Aug 2, 10:20 pm, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Isn't there something wrong with the ordering? Pablo's answers are: [ 1, 2, 3, 4, 5 ] == [0, 1, 2, 3, 4] correct [ 3, 1, 5, 2, 4 ] == [2, 0, 4, 1, 3] wrong? [ -1.1, 2.2, 0, -1.1, 13 ] == [0, 3, 2, 0, 4] wrong? Doing it in my head I get: [ 3, 1, 5, 2, 4 ] == [ 1, 3, 0, 4, 2 ] [ -1.1, 2.2, 0, -1.1, 13 ] == [0, 3, 2, 1, 4] What gives? Did I misunderstand what rank ordering (handling ties as well) means? -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
On Aug 2, 8:20 pm, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Indexing the sorted list with a dictionary will speed it up a little. -- http://mail.python.org/mailman/listinfo/python-list
Efficient Rank Ordering of Nested Lists
A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via: def rankLists(nestedList): def rankList(singleList): sortedList = list(singleList) sortedList.sort() return map(sortedList.index, singleList) return map(rankList, nestedList) unranked = [ [ 1, 2, 3, 4, 5 ], [ 3, 1, 5, 2, 4 ], [ -1.1, 2.2, 0, -1.1, 13 ] ] print rankLists(unranked) [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]] This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? -- http://mail.python.org/mailman/listinfo/python-list