Hi, please run print(qs.query) and share the generated SQL.
On Wednesday, January 13, 2021 at 11:25:06 PM UTC+5:30 [email protected] wrote:

> Hi all,
>
> I wanted to cross-post my question / problem regarding Django's ORM
> `annotate` performance. Not sure if I should post it here or on the Django
> developers mailing list, but I wanted to start here.
>
> https://stackoverflow.com/questions/65506731/django-orm-annotate-performance
>
> ---
>
> I'm using Django and Django REST Framework at work and we've been having
> some performance issues with a couple of endpoints lately. We started by
> making sure that the SQL part is optimized: no unnecessary N+1 queries,
> indexes where possible, etc.
>
> Looking at the database part itself, it seems to be very fast (3 SQL
> queries total, under a second), even with larger datasets, but the API
> endpoint still took >5 seconds to return. I started profiling the Python
> code using a couple of different tools and the majority of time is always
> spent inside the `annotate` and `set_group_by` functions in Django.
>
> [image: le0oG.png]
>
> I tried Googling about `annotate` and performance, looking at Django docs,
> but there's no mention of it being a 'costly' operation, especially when
> used with the `F` function.
>
> The `annotate` part of the code looks something like this:
>
>     qs = qs.annotate(
>         foo_name=models.F("foo__core__name"),
>         foo_birth_date=models.F("foo__core__birth_date"),
>         bar_name=models.F("bar__core__name"),
>         spam_id=models.F("baz__spam_id"),
>         spam_name=models.F("baz__spam__core__name"),
>         spam_start_date=models.F("baz__spam__core__start_date"),
>         eggs_id=models.F("baz__spam__core___eggs_id"),
>         eggs_name=models.F("baz__spam__eggs__core___name"),
>     )
>
>     qs = (
>         qs.order_by("foo_id", "eggs_id", "-spam_start_date", "bar_name")
>         .values(
>             "foo_name",
>             "foo_birth_date",
>             "bar_name",
>             "spam_id",
>             "spam_name",
>             "eggs_id",
>             "eggs_name",
>         )
>         .distinct()
>     )
>
> The query is quite big and spans multiple relationships, so I was sure that
> the problem was database related, but it doesn't seem to be. All the
> `select_related` and `prefetch_related` are there, indexes too.
>
> I tried rewriting the code without `annotate` at all, but it didn't seem
> to help. I started wondering whether the time spent in `annotate` is really
> a red herring and it's only how the profiler sees it, but all profilers I
> tried showed the same thing.
>
> While I feel like I know Django quite well and had success optimising API
> endpoints before, I'm not sure what 'thread' to pull in this case. I tried
> looking at Django internals, especially around `annotate` and
> `set_group_by`, but couldn't pinpoint the time spent there. My last-ditch
> effort will be trying to rewrite those couple of endpoints with raw SQL,
> but I'd very much like to avoid that.
>
> All help will be much appreciated : )

