Hello, I'm having trouble with the performance from a query used to create a materialized view.
I need to be able to build the keyword_performance_flat_matview view in around 2-4 hours overnight. However, it currently takes in excess of 24 hours. I'm wondering if there is anything I can do to improve the performance? As you can see below it's a big query, and I didn't want to overwhelm everyone with the schema, so let me know what bits you might need to help! Any help improving the performance will be greatly appreciated. Thank you. EngineYard M3 Extra Large instance: Intel Xeon CPU E5-2670 v2 @ 2.50GHz (quad core) 15 GB Ram PostgreSQL 9.3.6 queries: create materialized view campaign_category_lookup as select scenario_campaigns.id as campaign_id, join_website_budget_labels.label_id as category_id from scenario_campaigns inner join scenario_campaign_vendor_instances on scenario_campaign_vendor_instances.campaign_id = scenario_campaigns.id inner join join_website_budget_labels_campaign_vendor_instances_1f8636 on join_website_budget_labels_campaign_vendor_instances_1f8636.vendor_instance_id = scenario_campaign_vendor_instances.id inner join join_website_budget_labels on join_website_budget_labels_campaign_vendor_instances_1f8636.website_budget_label_id = join_website_budget_labels.id inner join account_website_budgets on join_website_budget_labels.budget_id = account_website_budgets.id and account_website_budgets.state = 'approved' where scenario_campaign_vendor_instances.campaign_id = scenario_campaigns.id and account_website_budgets.start_date <= current_date and account_website_budgets.end_date >= current_date order by approved_at desc; create materialized view temp_keyword_perf_unaggr as select account_websites.id as website_id, account_websites.namespace as website_namespace, scenario_keyword_vendor_instances.inventory_disabled as inventory_disabled, scenario_keyword_vendor_instances.condition_disabled as condition_disabled, scenario_ad_groups.campaign_id, scenario_ad_groups.id as ad_group_id, scenario_keywords.id as keyword_id, scenario_keyword_texts.value as keyword_name, scenario_keyword_performances.*, (select category_id from campaign_category_lookup where campaign_category_lookup.campaign_id = scenario_ad_groups.campaign_id limit 1) as category_id from scenarios inner join account_websites on scenarios.website_id = account_websites.id inner join scenario_campaigns on scenario_campaigns.scenario_id = scenarios.id inner join scenario_ad_groups on scenario_ad_groups.campaign_id = scenario_campaigns.id inner join scenario_keywords on scenario_keywords.ad_group_id = scenario_ad_groups.id inner join scenario_keyword_texts on scenario_keyword_texts.id = scenario_keywords.text_id left outer join scenario_keyword_vendor_instances on scenario_keyword_vendor_instances.keyword_id = scenario_keywords.id left outer join scenario_keyword_performances on scenario_keyword_performances.api_id = scenario_keyword_vendor_instances.api_id and scenario_keyword_performances.date >= (date_trunc('month', now()) - '1 month'::interval)::date -- start of previous month where scenarios.deleted_at is null and scenario_keyword_texts.value is not null and account_websites.active = 't'; create materialized view keyword_performance_flat_matview as select website_id, website_namespace, campaign_id, ad_group_id, keyword_name, keyword_id, network, device, inventory_disabled, condition_disabled, category_id, date, sum(impressions) as impressions, sum(clicks) as clicks, sum(conv_one) as conv_one, sum(conv_many) as conv_many, sum(cost) as cost, sum(conv_value) as conv_value, avg(avg_position) as avg_position from temp_keyword_perf_unaggr group by website_id, website_namespace, campaign_id, ad_group_id, keyword_id, keyword_name, device, network, inventory_disabled, condition_disabled, category_id, date; Explain output for temp_keyword_perf_unaggr: Merge Right Join (cost=8796955.87..1685073792.18 rows=296873848 width=213) Merge Cond: (scenario_keyword_performances.api_id = scenario_keyword_vendor_instances.api_id) -> Index Scan using index_keyword_performances_on_vendor_instance_id_and_date on scenario_keyword_performances (cost=0.44..203167.46 rows=392586 width=144) Index Cond: (date >= ((date_trunc('month'::text, now()) - '1 mon'::interval))::date) -> Materialize (cost=8796955.43..8883724.51 rows=17353816 width=77) -> Sort (cost=8796955.43..8840339.97 rows=17353816 width=77) Sort Key: scenario_keyword_vendor_instances.api_id -> Hash Join (cost=2755544.36..5939172.05 rows=17353816 width=77) Hash Cond: (scenario_keywords.text_id = scenario_keyword_texts.id) -> Hash Right Join (cost=2171209.00..4417042.21 rows=17353816 width=48) Hash Cond: (scenario_keyword_vendor_instances.keyword_id = scenario_keywords.id) -> Seq Scan on scenario_keyword_vendor_instances (cost=0.00..821853.20 rows=33362520 width=14) -> Hash (cost=1827291.60..1827291.60 rows=16931312 width=38) -> Hash Join (cost=219154.58..1827291.60 rows=16931312 width=38) Hash Cond: (scenario_keywords.ad_group_id = scenario_ad_groups.id) -> Seq Scan on scenario_keywords (cost=0.00..946491.60 rows=32550260 width=12) -> Hash (cost=186041.43..186041.43 rows=1712492 width=30) -> Hash Join (cost=6569.88..186041.43 rows=1712492 width=30) Hash Cond: (scenario_ad_groups.campaign_id = scenario_campaigns.id) -> Seq Scan on scenario_ad_groups (cost=0.00..133539.47 rows=3292247 width=8) -> Hash (cost=5596.79..5596.79 rows=77847 width=26) -> Hash Join (cost=100.50..5596.79 rows=77847 width=26) Hash Cond: (scenario_campaigns.scenario_id = scenarios.id) -> Seq Scan on scenario_campaigns (cost=0.00..4156.60 rows=149660 width=8) -> Hash (cost=85.98..85.98 rows=1161 width=26) -> Hash Join (cost=16.43..85.98 rows=1161 width=26) Hash Cond: (scenarios.website_id = account_websites.id) -> Seq Scan on scenarios (cost=0.00..50.32 rows=2032 width=8) Filter: (deleted_at IS NULL) -> Hash (cost=12.92..12.92 rows=281 width=22) -> Seq Scan on account_websites (cost=0.00..12.92 rows=281 width=22) Filter: active -> Hash (cost=292793.16..292793.16 rows=14352816 width=37) -> Seq Scan on scenario_keyword_texts (cost=0.00..292793.16 rows=14352816 width=37) Filter: (value IS NOT NULL) SubPlan 1 -> Limit (cost=0.28..5.63 rows=1 width=4) -> Index Scan using campaign_category_lookup_campaign_id_idx on campaign_category_lookup (cost=0.28..10.99 rows=2 width=4) Index Cond: (campaign_id = scenario_ad_groups.campaign_id) -- [image: photo] *Tom McLoughlin* Software Developer | m: 08 8224 1711 e: t...@dynamiccreative.com | w: http://www.dynamiccreative.com/ <https://www.facebook.com/pages/Dynamic-Creative/464934570224209?utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature> <https://twitter.com/dynamiccreative?utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature> <http://www.linkedin.com/company/2805415?trk=vsrp_companies_res_name&trkInfo=VSRPsearchId%3A995331391047787965%2CVSRPtargetId%3A2805415%2CVSRPcmpt%3Aprimary&utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature> We're looking for Developers, from Senior to Graduate <http://www.seek.com.au/job/29260404?pos=8&type=standout&engineConfig=&tier=no_tier&whereid=&utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature> Check out our FREE eBook: Google Shopping Best Practices for PLA's <http://resources.dynamiccreative.com/google-shopping-7-best-practices?utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature> Please Note: Our team is not available 10am to 10.30am CST every Thursday due to our weekly company meeting <http://www.dynamiccreative.com/contact-us?utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature> The latest from our Team Blogs -PLA’s (Product Listing Ads) vs Google Shopping Campaigns <http://blog.dynamiccreative.com/blog/plas-product-listing-ads-vs-google-shopping-campaigns?utm_source=WiseStamp&utm_medium=email&utm_term=&utm_content=&utm_campaign=signature>