Hi Vik, Thanks for your feedback, very helpful.
I modified your query slightly, this will return all vacancy templates and all level 1 vacancies which arent templates, and does so in about ~800-900ms less, an great improvement on the original query. SELECT "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated", "Vacancy"."CustomAccess", "Department"."Name" as "Department", list("Occupation"."Name") as "Occupation", "Vacancy"."PositionNo", "Vacancy"."Template" FROM "Vacancy" JOIN "CategoryOption_TableRow" as "c_50" ON ( "c_50"."Category_TableID"= 50 AND "c_50"."RowID" = "Vacancy"."ID" AND "c_50"."CategoryOptionID"=19205) LEFT JOIN "CategoryOption_TableRow" as "c_22" ON ( "c_22"."RowID" = "Vacancy"."ID" AND "c_22"."Category_TableID" = 22) LEFT JOIN "CategoryOption" as "Occupation" ON ( "Occupation"."ID" = "c_22"."CategoryOptionID") LEFT JOIN "TableRow_TableRow" as "t_33" ON ( "t_33"."Table1RowID" = "Vacancy"."ID" AND "t_33"."Table_TableID" = 33) LEFT JOIN "Department" ON ( "Department"."ID" = "t_33"."Table2RowID" AND "Department"."Active" = 't' AND "Department"."ClientID" = 263) WHERE "Vacancy"."ClientID" = 263 AND NOT EXISTS ( SELECT 1 FROM "Vacancy" as "v" JOIN "CategoryOption_TableRow" "ct126" on ( "ct126"."Category_TableID" = 126 AND "RowID" = "v"."ID") WHERE "v"."Template" AND "ct126"."CategoryOptionID" IN (34024,35254,35255,35256) AND "v"."ID" = "Vacancy"."ID") AND ("Vacancy"."Template" OR ("Vacancy"."Template" = 'f' AND "Vacancy"."Level" = 1)) GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated", "Vacancy"."CustomAccess", "Department"."Name", "Vacancy"."PositionNo", "Vacancy"."Template" On Wed, Aug 12, 2015 at 9:35 PM, Vik Fearing-3 [via PostgreSQL] < ml-node+s1045698n5861873...@n5.nabble.com> wrote: > On 08/12/2015 04:34 AM, robbyc wrote: > > Hi, > > > > I am new to optimizing queries and i'm getting a slow running time > > (~1.5secs) with the following SQL: > > Before mucking about with work_mem and indexes, the first thing to do is > rewrite this query correctly. Here are just some of the things wrong > with the query as written: > > * You're doing a DISTINCT on the same set of columns also in a GROUP BY. > This is redundant and causes needless deduplication. > > * You're joining two GROUPed BY then DISTINCTed queries using the UNION > operator which will do yet another pass for deduplication. > > * You've got the entire query repeated for just a simple difference in > the global WHERE clause. These can be merged. > > * You've kept LEFT JOINs in the subquery but you don't use any values > from them. These can be safely removed altogether. > > * You're using a NOT IN clause which is almost never what you want. Use > NOT EXISTS instead. > > What is this list() function? How is it defined? Can it be replaced > with string_agg()? > > You're not doing yourself any favors at all with all this quoting and > mixed case stuff. > > Here is a rewritten version, please let me know how it performs: > > SELECT "Vacancy"."ID", > "Vacancy"."JobTitle", > "Vacancy"."DateCreated", > "Vacancy"."CustomAccess", > "Department"."Name" as "Department", > list("Occupation"."Name") as "Occupation", > "Vacancy"."PositionNo", > "Vacancy"."Template" > FROM "Vacancy" > JOIN "CategoryOption_TableRow" as "c_50" ON ( > "c_50"."Category_TableID"= 50 > AND "c_50"."RowID" = "Vacancy"."ID" > AND "c_50"."CategoryOptionID"=19205) > LEFT JOIN "CategoryOption_TableRow" as "c_22" ON ( > "c_22"."RowID" = "Vacancy"."ID" > AND "c_22"."Category_TableID" = 22) > LEFT JOIN "CategoryOption" as "Occupation" ON ( > "Occupation"."ID" = "c_22"."CategoryOptionID") > LEFT JOIN "TableRow_TableRow" as "t_33" ON ( > "t_33"."Table1RowID" = "Vacancy"."ID" > AND "t_33"."Table_TableID" = 33) > LEFT JOIN "Department" ON ( > "Department"."ID" = "t_33"."Table2RowID" > AND "Department"."Active" = 't' > AND "Department"."ClientID" = 263) > WHERE "Vacancy"."ClientID" = 263 > AND NOT EXISTS ( > SELECT 1 > FROM "Vacancy" as _Vacancy > JOIN "CategoryOption_TableRow" "ct126" on ( > "ct126"."Category_TableID" = 126 > AND "RowID" = _Vacancy."ID") > WHERE _Vacancy."Template" > AND "ct126"."CategoryOptionID" IN (34024,35254,35255,35256) > AND _Vacancy."ID" = "Vacancy"."ID") > AND ("Vacancy"."Template" = 't' OR "Vacancy"."Level" = 1) > GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated", > "Vacancy"."CustomAccess", "Department"."Name", > "Vacancy"."PositionNo", "Vacancy"."Template" > > > > SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle", > > "Vacancy"."DateCreated", "Vacancy"."CustomAccess" > > , "Department"."Name" as "Department", list("Occupation"."Name") as > > "Occupation", "Vacancy"."PositionNo" > > , "Vacancy"."Template" from > > "Vacancy" > > LEFT JOIN "CategoryOption_TableRow" as "c_22" > > ON ("c_22"."RowID" = "Vacancy"."ID" > > and "c_22"."Category_TableID" = 22) > > LEFT JOIN "CategoryOption" as "Occupation" > > ON ("Occupation"."ID" = "c_22"."CategoryOptionID") > > LEFT JOIN "TableRow_TableRow" as "t_33" > > ON ("t_33"."Table1RowID" = "Vacancy"."ID" > > and "t_33"."Table_TableID" = 33 ) > > LEFT JOIN "Department" > > ON ("Department"."ID" = "t_33"."Table2RowID" and > > "Department"."Active" = 't' and "Department" > > ."ClientID" = 263) > > JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"= > 50 > > and "c_50"."RowID" = "Vacancy" > > ."ID" and "c_50"."CategoryOptionID"=19205) > > WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT > > DISTINCT("Vacancy"."ID") > > FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on > > ("ct126"."Category_TableID" = 126 > > and "RowID" = "Vacancy"."ID") > > left join "Workflow" on ("Workflow"."VacancyID" = > "Vacancy"."ID" > > and "Workflow"."Level" > > = 1) > > left join "CategoryOption_TableRow" "c30" on ("c30"."RowID" > = > > "Workflow"."ID" and "c30"."Category_TableID" > > = 30 and "c30"."CategoryOptionID" = 21923) > > WHERE "Template" AND "ct126"."CategoryOptionID" > > IN(34024,35254,35255,35256)) and "Vacancy" > > ."Template" = 't' > > GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated", > > "Vacancy"."CustomAccess", "Department" > > ."Name", "Vacancy"."PositionNo", "Vacancy"."Template" > > UNION > > SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle", > > "Vacancy"."DateCreated", "Vacancy"."CustomAccess" > > , "Department"."Name" as "Department", list("Occupation"."Name") as > > "Occupation", "Vacancy"."PositionNo" > > , "Vacancy"."Template" from > > "Vacancy" > > LEFT JOIN "CategoryOption_TableRow" as "c_22" > > ON ("c_22"."RowID" = "Vacancy"."ID" > > and "c_22"."Category_TableID" = 22) > > LEFT JOIN "CategoryOption" as "Occupation" > > ON ("Occupation"."ID" = "c_22"."CategoryOptionID") > > LEFT JOIN "TableRow_TableRow" as "t_33" > > ON ("t_33"."Table1RowID" = "Vacancy"."ID" > > and "t_33"."Table_TableID" = 33 ) > > LEFT JOIN "Department" > > ON ("Department"."ID" = "t_33"."Table2RowID" and > > "Department"."Active" = 't' and "Department" > > ."ClientID" = 263) > > JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"= > 50 > > and "c_50"."RowID" = "Vacancy" > > ."ID" and "c_50"."CategoryOptionID"=19205) > > WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT > > DISTINCT("Vacancy"."ID") > > FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on > > ("ct126"."Category_TableID" = 126 > > and "RowID" = "Vacancy"."ID") > > left join "Workflow" on ("Workflow"."VacancyID" = > "Vacancy"."ID" > > and "Workflow"."Level" > > = 1) > > left join "CategoryOption_TableRow" "c30" on ("c30"."RowID" > = > > "Workflow"."ID" and "c30"."Category_TableID" > > = 30 and "c30"."CategoryOptionID" = 21923) > > WHERE "Template" AND "ct126"."CategoryOptionID" > > IN(34024,35254,35255,35256)) and "Vacancy" > > ."Template" <> 't' AND "Vacancy"."Level" = 1 > > GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated", > > "Vacancy"."CustomAccess", "Department" > > ."Name", "Vacancy"."PositionNo", "Vacancy"."Template" > > ORDER BY "JobTitle" > > > > Running explain analyze gives me the following information: > > http://explain.depesz.com/s/pdC <http://explain.depesz.com/s/pdC> > > > > For a total runtime: 2877.157 ms > > > > If i remove the left joins on Department and TableRow_TableRow this > reduces > > the run time by about a third. > > Additionally removing CategoryOption and CategoryOption_TableRow joins > > further reduces by a about a third. > > > > Given that i need both these joins for the information retrieved by > them, > > what would be the best way to re-factor this query so it runs faster? > > > > Looking at the output of explain analyze the hash aggregates and sort > seem > > to be the primary issue. > > > > Thanks in advance > > > -- > Vik Fearing +33 6 46 75 15 36 > http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support > > > -- > Sent via pgsql-performance mailing list ([hidden email] > <http:///user/SendEmail.jtp?type=node&node=5861873&i=0>) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > > > ------------------------------ > If you reply to this email, your message will be added to the discussion > below: > http://postgresql.nabble.com/Slow-Query-tp5861835p5861873.html > To unsubscribe from Slow Query, click here > <http://postgresql.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=5861835&code=cm9iY2FtcGJlbGw3M0BnbWFpbC5jb218NTg2MTgzNXwxOTc1MDc2ODM4> > . > NAML > <http://postgresql.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> > -- Regards Robert Campbell +61412062971 robcampbel...@gmail.com -- View this message in context: http://postgresql.nabble.com/Slow-Query-tp5861835p5861961.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.