On Wednesday, February 5, 2014 8:01:53 PM UTC, Anssi Kääriäinen wrote:
>
> Something like this might work:
>
> Transaction.objects.values_list('member_id').annotate(total=Sum('amount')).filter(total__gt=0).aggregate(Sum('total'))
>
This didn't work - it produced a "SELECT FROM" query, which obviously
didn't work - tried adding 'amount' to the values_list, but that didn't
help either. Eventually got it to work by using .only('member', 'amount')
instead, and it *was* fast, but it didn't quite do what we need: we only
want the *outstanding* credit/debt, i.e. some member might have many
transactions, but they all add up to zero - so instead of increasing both
total_debt and total_credit by that value, it shouldn't affect the totals.
Which is why I wanted to total up by members first... any idea how to get
that to work ?
>
> That is, don't start from Member, Django isn't smart enough to get rid of
> the non-necessary joins. Instead go directly for the query you wrote in
> SQL. In the ORM .values() is the way to control GROUP BY clause. Getting
> the order of .values() and .annotate() right is important, and it is
> sometimes hard to figure out how to do it correctly. Also, if you have some
> extra ordering going on in the query just issue an empty .order_by() call -
> that will get rid of all ordering.
>
> - Anssi
>
> On Wednesday, February 5, 2014 8:11:29 PM UTC+2, ST wrote:
>>
>> Hi,
>>
>> I'm trying to optimize the run-time of getting total credit and debt
>> values out of our database. Ideally I'd like to formulate it as a Django
>> query. This is the raw SQL query I have, which produces the right answer
>> and is very fast (milliseconds):
>>
>> SELECT sum(tg.total) FROM
>> (
>> SELECT sum(t.amount) AS total, t.member_id AS member_id
>> FROM club_transaction AS t
>> WHERE t.member_id IS NOT NULL
>> GROUP BY t.member_id
>> ) AS tg
>> WHERE tg.total < 0
>>
>> (plus a second query for > 0)
>>
>> My Django implementation was:
>>
>> m = Member.objects.annotate(balance=Sum('transaction__amount'))
>> m_debt = m.filter(balance__lt=0).aggregate(total=Sum('balance'))
>> m_credit = m.filter(balance__gt=0).aggregate(total=Sum('balance'))
>>
>> which looks a lot nicer, is easier to understand and maintain.
>>
>> However, it results in the following SQL query (slightly redacted):
>>
>> SELECT SUM(balance) FROM
>> (
>> SELECT "club_member"."id" AS "id", {all the other fields},
>> SUM("club_transaction"."amount") AS "balance"
>> FROM "club_member"
>> LEFT OUTER JOIN "auth_user" ON ("club_member"."user_id" =
>> "auth_user"."id")
>> LEFT OUTER JOIN "club_transaction" ON ("club_member"."id" =
>> "club_transaction"."member_id")
>> GROUP BY "club_member"."id", {all the other fields},
>> "auth_user"."last_name", "auth_user"."first_name"
>> HAVING SUM("club_transaction"."amount") < 0
>> ORDER BY "auth_user"."last_name" ASC, "auth_user"."first_name" ASC
>> ) subquery
>>
>> (again, plus another one for > 0)
>> which is very slow (almost 1.5 seconds).
>>
>> How can I construct a Django query which doesn't request (and group by)
>> all the unnecessary other fields ?
>> I already tried playing around with only() and values() but never got it
>> to work.
>>
>> Looking forward to your responses!
>>
>> best regards,
>> ST
>>
>
--
You received this message because you are subscribed to the Google Groups
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-users/3e239afe-037c-4ace-b9eb-ac2404f75981%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.