Though it doesn’t directly answer your query, you might be interested in this 
package:
https://github.com/burke-software/django-report-builder


From: [email protected] [mailto:[email protected]] On 
Behalf Of Samuel Abels
Sent: Monday, November 6, 2017 1:33 PM
To: Django users
Subject: Equivalent of multi-table JOIN (another post on reverse select_related)

I am working on a reporting feature that allows users for querying arbitrary 
models and fields, and present the result as a table. For example, consider the 
following object model:

Package
   |
   v
Device <- Component
   ^         ^
   |         |
   |---- Interface <---2--- Connection
            ^^^
           / | \
          /  |  \
         /   |   \
  Sampling  IP   Policy

(The dash is the direction of a ForeignKey.)
To produce a report, I chose a three-step process:

1. The user interfaces returns a list of fields to be included in the report, 
such as

args = dict('Device.hostname__contains': 'localhost', 
'Package.name__icontains': 'unix', 'IP.address__contains': '192')

2. Given the list of args, find the shortest path that connects all required 
models. For the example above, the result is a tuple:

path = Device, Package, Interface, IP

3. In theory, I could now perform the following SQL request:

SELECT * FROM myapp_device d
LEFT JOIN myapp_package pa ON pa.device_id=d.id
LEFT JOIN myapp_interface ifc ON ifc.device_id=d.id
LEFT JOIN myapp_ip ip ON ip.interface_id=ifc.id;

But of course, I want to avoid the raw SQL. I considered the following options:

- Using Device.objects.select_related() does not work, because Device has a 1:n 
relation to Package (and also to Unit), which Django's select_related() does 
not support.

- Using prefetch_related() does not work, because it prefetches everything, 
which is too much in our case (>100 million rows if a user queries on all 
tables), and it does not provide us with a total of the number of rows 
selected. In practice, I want to count(*) everything for displaying a total, 
and fetch only a subset, using LIMIT.

Our tests showed that the raw SQL query with LEFT JOIN is fast enough for 
production, regardless of what fields and objects are being queried. The 
craziest query I built took about 20 seconds, which is ok for what we are 
trying to do.

Any other options?

-Samuel
--
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
[email protected]<mailto:[email protected]>.
To post to this group, send email to 
[email protected]<mailto:[email protected]>.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com<https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com?utm_medium=email&utm_source=footer>.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/296e72112fa946b6b3f13cf52b96303d%40ISS1.ISS.LOCAL.
For more options, visit https://groups.google.com/d/optout.

Reply via email to