Thank you for your response. I have tested out the patch in the linked
thread and it works very well on a bunch of complex queries that I
have tested, improving both the planning time significantly and the
execution time drastically.

I have also read through the entire linked discussion thread as well
as a few other large threads linked from it, and found the discussion
very interesting.

I don't believe that all such queries are "poorly-written". As was
discussed in the other threads, the reason these types of self-joins
can occur is when you use SQL views. You can create a library of
reusable views that are small, easy-to-understand and readable. Then
you build them up into bigger views, and finally query from them. But
then you end up with lots of (hidden) self-joins. The alternative is
to copy&paste the shared logic from the views into all of the queries.

I understand the need to be conservative about which optimizations to
apply in order to not waste time looking for opportunities that don't
exist. One idea I had that I didn't see mentioned is the following
heuristic: Only if a query references an SQL view (or multiple views),
then try to apply the self_join_removal optimization. This should be
enough, because as you say, no human would intentionally write such a
query. Queries generated by ORMs were also discussed, so I believe it
might also be beneficial to consider queries that contain inner
SELECTs.


Reply via email to