Re: Implement QuerySet.__contains__?

2020-06-05 Thread Shai Berger
On Fri, 5 Jun 2020 13:58:47 +0200 Johan Schiff wrote: > I still think the wrong call was made here [...] > *(Assuming queryset should be evaluated is probably correct in > most cases, but sometimes adds a big performance hit when the code > goes into production and the dataset grows - code that

Re: Implement QuerySet.__contains__?

2020-06-05 Thread Tim Graham
(You can reopen https://code.djangoproject.com/ticket/24141 rather than creating a new ticket.) On Friday, June 5, 2020 at 9:06:48 AM UTC-4, Aymeric Augustin wrote: > > If people agree QuerySet.contains() should be added, how do we move > forward? > > > Yes, if there's no major new argument in

Re: Implement QuerySet.__contains__?

2020-06-05 Thread Aymeric Augustin
> If people agree QuerySet.contains() should be added, how do we move forward? Yes, if there's no major new argument in a couple days, we can assume consensus on adding .contains(). * Create a ticket in Trac. Point to this discussion and to the previous ticket where .contains() was suggested.

Re: Implement QuerySet.__contains__?

2020-06-05 Thread Johan Schiff
Shai does make a good point about changing a well documented behaviour. That argument wins me over. Adding .contains() and updating the documentation goes a long way to make it easier for developers. Best case would be that Dajngo does not make assumptions about what the developer wants, but to

Re: Implement QuerySet.__contains__?

2020-06-05 Thread אורי
Hi, I'm thinking, maybe instead of: if obj in queryset: Users should write: if obj in list(queryset): Which I understand is already working? Why does the first line (without list()) should work anyway? And if users want performance, why not write: if queryset.filter(pk=obj.pk).exists():

Re: Implement QuerySet.__contains__?

2020-06-05 Thread Adam Johnson
I'm with Shai, against changing bool() and len() behaviour. I think the "fetch the whole queryset" behaviour is normally helpful for beginners, who write things like: if qs: for x in qs: ... # really common for anyone new to Python: for i in len(qs): do_something(qs[i]) We have

Re: Implement QuerySet.__contains__?

2020-06-05 Thread Shai Berger
On Tue, 2 Jun 2020 20:57:54 +0200 Aymeric Augustin wrote: > > We're talking about a trade-off between preserving optimisations in > existing code bases and expertise of advanced users versus doing the > right thing by default for less experienced users. I disagree. The suggestion is to make

Re: Implement QuerySet.__contains__?

2020-06-03 Thread Johan Schiff
To answer my own question: No, I wasn't doing it correctly. I should have done a sanity check before posting. New timeit code and results at bottom, now using a smaller dataset (~900 objects). Notable: - An .exists() query is about 1/100 the time of full fetch in this case. This

Re: Implement QuerySet.__contains__?

2020-06-03 Thread Johan Schiff
Thanks for great info. First, I'm leaning towards Aymeric's proposition here. I do recognize that there is a lot to consider. This seems to be important: 1. Developers must be able to explicitly choose methods to optimize for their environment. (Considering database latency, dataset size,

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Roger Gammans
On Tue, 2020-06-02 at 11:31 -0700, Javier Buzzi wrote: > ps @roger > > >>> timeit.timeit('m.GL.objects.filter(pk=x.pk)', setup='import > myapp.models as m;x = m.GL.objects.all()[324]', number=100) > 0.05818330496549606 > > is not doing anything, add a `.exists()` or `len(..)` or something to >

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Aymeric Augustin
> On 2 Jun 2020, at 19:42, Tim Graham wrote: > > And here's some past discussion: > https://code.djangoproject.com/ticket/24141 - contains() method for QuerySets > (closed as needsinfo due to no mailing list discussion to find a consensus) > https://github.com/django/django/pull/3906 -

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Javier Buzzi
My 2 cents, I think @johan's suggestion makes sense. if obj in queryset: It's very pythonic. it should do what __len__ does and cache it, if you want the single quick db query you can always use exists(). ps @roger >>> timeit.timeit('m.GL.objects.filter(pk=x.pk)', setup='import myapp.models

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Tim Graham
And here's some past discussion: https://code.djangoproject.com/ticket/24141 - contains() method for QuerySets (closed as needsinfo due to no mailing list discussion to find a consensus) https://github.com/django/django/pull/3906 - Efficient QuerySet.__contains__ (closed as wontfix due to the

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Roger Gammans
Hi, Here are some performance numbers against a local SQLite in case any one is interested. GL has about 40,000 records for reference.324 was just a random number chosen, of different 'in's >>> sys.version'3.7.2rc1 (default, Dec 12 2018, 06:25:49) \n[GCC 8.2.0]'>>> django.__version__'3.0.6'>>>

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Tim Graham
It may help to know that QuerySet.__contains__() was implemented until Django 1.6 when chunked reads were removed from QuerySet iteration: https://github.com/django/django/commit/70679243d1786e03557c28929f9762a119e3ac14 On Tuesday, June 2, 2020 at 7:43:25 AM UTC-4, Aymeric Augustin wrote: > >

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Aymeric Augustin
> On 2 Jun 2020, at 13:30, Florian Apolloner wrote: > > On Tuesday, June 2, 2020 at 11:28:34 AM UTC+2, Adam Johnson wrote: > If you already fetched the queryset, `if obj in queryset` will make a new > database query > > I don't see why we couldn't implement __contains__ to do a walk through >

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Florian Apolloner
On Tuesday, June 2, 2020 at 11:28:34 AM UTC+2, Adam Johnson wrote: > > If you already fetched the queryset, `if obj in queryset` will make a new >> database query > > > I don't see why we couldn't implement __contains__ to do a walk through > _result_cache if it has been fetched? > we are doing

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Adam Johnson
> > If you already fetched the queryset, `if obj in queryset` will make a new > database query I don't see why we couldn't implement __contains__ to do a walk through _result_cache if it has been fetched? On Tue, 2 Jun 2020 at 10:13, Aymeric Augustin < aymeric.augus...@polytechnique.org> wrote:

Re: Implement QuerySet.__contains__?

2020-06-02 Thread Aymeric Augustin
Hello Johan, You explained the upside. There's one downside to be aware of. If you already fetched the queryset, `if obj in queryset` will make a new database query, which will be slower than walking through the already fetched data (unless the queryset is really large and the database really