> On Sep 19, 2019, at 12:07, Richard Higginbotham <higgi...@gmail.com> wrote:
> 
> Its faster than the built in set type.

Not according to timeit. If you want to dispute that, you need a proper 
benchmark, not just repeating the claims from your inaccurate one.

I’m not sure what your relationtype is supposed to do (especially given than 1 
and 2 produce the same values), and all versions seem like unusual use cases 
rather than anything approaching realistic, but I left that all alone and just 
replaced your benchmarks with timeit, repeated them enough times to actually be 
meaningful, and changed the test to the operation you claim it’s about rather 
than a different one. The output pretty clearly shows that a pre-existing set 
beats a pre-sorted list, and building a set on the fly beats sorting a list on 
the fly, and even comparing blatantly unfairly with building a set on the fly 
vs. using a pre-sorted list, the set still wins except in the smallest test.

I didn’t include tests against more realistic datasets where your algorithm 
does significantly worse; this shows that even your chosen input is slower with 
your algorithm if you measure it properly.

And, while we’re at it, I added a version of your algorithm that works with 
arbitrary iterables and also takes a key function (which you always want when 
you’re doing anything related to sorting in Python), and it’s about 10-20% 
faster than your list version, which shows that you aren’t getting any 
performance benefit from restricting yourself to lists only.

See https://pastebin.com/BC2zgtN6 for the code.

> Uniqueness of values isn't a requirement.

Of course. But non-unique values make the set algorithm faster, and they make 
your algorithm slower. And nearly-unique values don’t affect the set time, but 
they make your algorithm slower. So you can’t just ignore it.

And, again, you have to decide what happens with non-unique values, how many 
times they appear in the output.

> The values need to be comparable so that they can be sorted. This allows a 
> linear time complexity.

Except that sorting is log-linear, not linear, so it doesn’t.

Meanwhile, set actually _does_ allow linear time complexity.

Again, none of this means your code is useless. I think a step-compare 
intersection is useful enough to belong in itertools, at least as a recipe. But 
it’s not faster than set, or simpler, or anything else you’re arguing.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/B5NTB5GPZLQDECNZUX2CVHVJIB724MNL/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to