Vito De Tullio wrote: > Michael Selik wrote: > >>> > I need to scan a list of strings. If one of the elements matches the >>> > beginning of a search keyword, that element needs to snap to the front >>> > of the list. >>> >>> I know this post regards the function passing, but, on you specific >>> problem, >>> can't you just ... sort the list with a custom key? >> >> If the number of matches is small relative to the size of the list, I'd >> expect the sort would be slower than most of the other suggestions. > > umm... I take the liberty to set up a little test > > $ cat test_sort_prefix.py > #!/usr/bin/env python3 > > from sys import argv > from timeit import timeit > > > def sort_in_place(l): > def key(e): > return not e.startswith('yes') > l.sort(key=key) > > > def original_solution(mylist): > for i in range(len(mylist)): > if(mylist[i].startswith('yes')): > mylist[:] = [mylist[i]] + mylist[:i] + mylist[i+1:]
original_solution() reverses the order of the matches while sort_in_place() doesn't. If the order is not relevant I'd try something like def exchange(items): pos = 0 for index, item in enumerate(items): if item.startswith("yes"): items[pos], items[index] = item, items[pos] pos += 1 > def main(): > if argv[1] == 'sort_in_place': > f = sort_in_place > elif argv[1] == 'original_solution': > f = original_solution > else: > raise Exception() > > nomatches = int(argv[2]) > matches = int(argv[3]) > > l = ["no_" for _ in range(nomatches)] + ["yes_" for _ in > range(matches)] Python's timsort knows how to deal with (partially) sorted lists. I suggest that you add random.seed(42) # same list for all script invocations random.shuffle(l) here to spread the matches somewhat over the list. > print(timeit("f(l)", number=100, globals={"f": f, "l": l})) Remember that f(l) modifies l, so all but the first invocation will see a list where all matches are at the beginning. In other words: in 99 out of 100 runs the list is already sorted. While adding the overhead of copying the list I would expect the results of timeit("f(l[:])", ...) to be more realistic. > if __name__ == '__main__': > main() > $ ./test_sort_prefix.py sort_in_place 1000000 3 > 33.260575089996564 > $ ./test_sort_prefix.py original_solution 1000000 3 > 35.93009542999789 > $ > > > in my tests there is no sensible difference between the two solutions... > and the number of matches is risible :) Indeed, as the number of matches rises your sort() approach will likely fare better. Your conclusions may be correct, but the benchmark to support them is flawed. -- https://mail.python.org/mailman/listinfo/python-list