[Python-ideas] Re: Need a core developer sponsor for a PEP

2019-09-19 Thread Philippe Prados
Thanks Andrew for spending so much time answering me. I publish a new version of the proposition with a synthesis of remarks, and a separation between strong and optional proposition. * You need __ror__ as well as __or__. No, in this sit

[Python-ideas] Re: Add a slot for "keys" in tp_as_mapping

2019-09-19 Thread Greg Ewing
Serhiy Storchaka wrote: PyDict_Merge() is called every time when you have a call like `f(a=1, **kw)` So would not be worth to add slots for keys (and maybe for values and items) to the tp_as_mapping structure? Only if looking up those methods is taking a substantial proportion of time. Measu

[Python-ideas] Re: Need a core developer sponsor for a PEP

2019-09-19 Thread Steven D'Aprano
On Thu, Sep 19, 2019 at 09:03:06AM +0200, Philippe Prados wrote: > * You need __ror__ as well as __or__. > No, in this situation, Python auto invoke ``__or__`` in case of ``__ror__``. I don't think that Python will invoke ``__or__`` if the ``__ror__`` method is missing. Testing in Python 3.8:

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Steven D'Aprano
On Thu, Sep 19, 2019 at 03:26:27PM +1000, Chris Angelico wrote: > On Thu, Sep 19, 2019 at 3:08 PM Richard Higginbotham > wrote: > > > > I'm not sure if there is any interest by others but I have frequently come > > across cases where I would like to compare items in one list in another > > simi

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Chris Angelico
On Thu, Sep 19, 2019 at 10:32 PM Steven D'Aprano wrote: > > On Thu, Sep 19, 2019 at 03:26:27PM +1000, Chris Angelico wrote: > > On Thu, Sep 19, 2019 at 3:08 PM Richard Higginbotham > > wrote: > > > > > > I'm not sure if there is any interest by others but I have frequently > > > come across cas

[Python-ideas] Re: Skip modules by default in star-import

2019-09-19 Thread Joao S. O. Bueno
Just because du to the side tracking it is looking like "it might be a nice idea": IMHO, this is not. and I second Brendan's arguments - This fix too litle, and normally just for beginners who happen to use * imports in code that will selfon be put to critical use. (and even if it does, the extra

[Python-ideas] Re: Skip modules by default in star-import

2019-09-19 Thread Neil Girdhar
I think I agree with Serhiy, but I don't think Anders' suggestion is workable since it's very common in __init__.py to do exactly this (import * from local modules in order to re-export them), and you definitely don't want to explitly foo=foo for everything. That's the whole reason you "import

[Python-ideas] Re: Need a core developer sponsor for a PEP

2019-09-19 Thread Andrew Barnert via Python-ideas
Steven already answered many of these, so I’ll just snip the ones he didn’t. On Sep 19, 2019, at 00:03, Philippe Prados wrote: > > > * The static types in typing are not instances of type, so you need to work > out what to do with them. > I do not understand the remark. My patch of 'mypy' acc

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Andrew Barnert via Python-ideas
On Sep 18, 2019, at 12:29, Richard Higginbotham wrote: > > I'm not sure if there is any interest by others but I have frequently come > across cases where I would like to compare items in one list in another > similar to relational algebra. For anyone who doesn’t want to read this whole reply:

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
Its faster than the built in set type. Uniqueness of values isn't a requirement. The values need to be comparable so that they can be sorted. This allows a linear time complexity. That's basically the only restraint on the values. Here are some speed test results and pseudocode. def naive(a,b):

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
The value of the order is actually the critical part. If they are both ordered I can complete all comparisons in one pass over both lists at the same time. Lets think about in A and not B a = [1,2,3,6] b = [1,4,5,6,7] start at a[0] and b[0], 0 -- a[i] == b[j] so increment both indexes. 1 -- a[i]

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Chris Angelico
On Fri, Sep 20, 2019 at 5:08 AM Richard Higginbotham wrote: > > Its faster than the built in set type. Uniqueness of values isn't a > requirement. The values need to be comparable so that they can be sorted. > This allows a linear time complexity. > Note that the linear time complexity assumes

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Chris Angelico
On Fri, Sep 20, 2019 at 5:11 AM Richard Higginbotham wrote: > The original use case was detecting new files in various directories. Each > directory was expected to have millions of 4KB files and maybe at most 100k > of new files. This was in around 2002 and it was a huge time savings for me. >

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Andrew Barnert via Python-ideas
> On Sep 19, 2019, at 12:07, Richard Higginbotham wrote: > > Its faster than the built in set type. Not according to timeit. If you want to dispute that, you need a proper benchmark, not just repeating the claims from your inaccurate one. I’m not sure what your relationtype is supposed to do (

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Andrew Barnert via Python-ideas
On Sep 19, 2019, at 12:09, Richard Higginbotham wrote: > > It might / probably? even be adapted to be used in the built in set > operations. How? I’m not sure if you get how sets work, and maybe you think they’re based on a red-black tree or skiplist or some other logarithmic collection, like

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
>> For example are the file names in A in B and if so return a new list with >> those items. Long story short, I wrote some functions to do that. They are >> quite simple and fast (due to timsort in no small part). >> Even the plain python code is faster than the built in set functions (afaik). >

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Chris Angelico
On Fri, Sep 20, 2019 at 6:19 AM Richard Higginbotham wrote: > >c = [val for val in a if val in filterset] > This is the crux I think. The complexity is the number of elements in > filterset. Since filterset has to be check for every value in a its O(len(a) > * len(b) * filter search time). You w

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
That's exactly what I expected. It's the opposite though. The time to check one list for all the elements in another list was far longer. Granted this was a Linux box using ext2. The other helpful element is that the filesystem returned the files in the proper order. One of the tests runs I ment

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Chris Angelico
On Fri, Sep 20, 2019 at 6:31 AM Richard Higginbotham wrote: > > That's exactly what I expected. It's the opposite though. The time to check > one list for all the elements in another list was far longer. Granted this > was a Linux box using ext2. The other helpful element is that the filesystem

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
I meant a near linear time complexity. n log n is very very close to n even when n is large. It's not constant, if it were it could be dropped entirely... It's pretty close though. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Musil
On Thu, Sep 19, 2019 at 10:18 PM Richard Higginbotham wrote: > I'm sorry I haven't used the mail list before and didn't send my comments > to the list. Please see my response to Chris for some example times. I'm > reluctant to use anything less than about a ms for timing purposes. There's > too m

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Musil
On Thu, Sep 19, 2019 at 11:58 PM Richard Musil wrote: > I did not verify the results (I hope they are correct), but the timing > suggest that the set operation is always faster. When however we add the > set setup to the mix, the build up time, which is O(n + m) becomes more > significant than ac

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Dominik Vilsmeier
It might be interesting to note that Numpy provides various set routines operating on their arrays (and hence lists as well, by conversion): https://docs.scipy.org/doc/numpy/reference/routines.set.html For [intersection](https://docs.scipy.org/doc/numpy/reference/generated/numpy.intersect1d.htm

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Andrew Barnert via Python-ideas
On Sep 19, 2019, at 13:18, Richard Higginbotham wrote: >>> For example are the file names in A in B and if so return a new list with >>> those items. Long story short, I wrote some functions to do that. They are >>> quite simple and fast (due to timsort in no small part). >>> Even the plain pyt

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Andrew Barnert via Python-ideas
On Sep 19, 2019, at 15:18, Richard Musil wrote: > > Ok, I misread the original code, the lists were not sorted in the previous > results (and their should be). So with the correction to have them sorted, I think to be fair, you want to show it _both_ ways, just as you’re showing sets both with

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
I think there is a problem here using timeit for this type of problem, look at the large amount of time used by the set init time. Noticing this I changed my sorts to in place and got better results. I assume there is some copying of locals or other overhead but I can't explain the results. htt

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Richard Higginbotham
It's not really constant though. Look at all of the test results here. Richard Musil in the following post shows one value of A=1000,B=1 with .034 ms and the next point A=1,B=5 If the hash were constant time and A increased by 10x then the time should be 0.34ms but its not. It's 0.54

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Andrew Barnert via Python-ideas
On Sep 19, 2019, at 19:18, Richard Higginbotham wrote: > > It's not really constant though. It’s really hard to have a discussion when all of your posts are all replies, but you don’t give us any clue what you’re replying to. There are multiple responses from multiple people since your last em

[Python-ideas] Re: Set operations with Lists

2019-09-19 Thread Tim Peters
[Richard Higginbotham ] > If you start including the time it takes to convert lists to sets its even > more > pronounced. hash values can collide and the bigger the data set the > more likely it is to happen. That's not so. Python's hash tables dynamically resize so that the load factor never ex