On Oct 20, 2017, at 3:31 PM, Justin Pryzby <pry...@telsasoft.com> wrote: > > On Fri, Oct 20, 2017 at 03:08:26PM -0800, Israel Brewster wrote: >> Summary: the following query takes around 12 seconds on my test machine. On >> my production machine, it's at half an hour and counting. What's going on? >> >> which, when run on my test server, has this explain analyze output: >> https://explain.depesz.com/s/4piv <https://explain.depesz.com/s/4piv>. >> Around 12 second runtime, which isn't too bad (in the grand scheme of >> things), although there is probably room for improvement. > > Are these cast to ::date cast is really needed (Alternately, do you have an > index on column::date ?) > |WHERE outtime::date>='2017-01-01' > |ON outtime::date BETWEEN oag_schedules.startdate > |AND outtime::date BETWEEN oag_batches.eff_from
Possibly not, but outtime is a timestamp while startdate and eff_from are simple date fields. I was being explicit :-) and no, I don't have an index on column::date, that's certainly something I can try to optimize things. > The problem is clearly here: > > Merge Join (cost=30,604.12..31,301.12 ROWS=1 width=76) (actual > time=1,153.883..9,812.434 ROWS=3,420,235 loops=1) > Merge Cond: ((((oag_schedules.flightnum)::text) = > (legdetail.flightnum)::text) AND ((oag_schedules.origin)::text = > (legdetail.legfrom)::text)) > Join Filter: (((legdetail.outtime)::date >= oag_schedules.startdate) AND > ((legdetail.outtime)::date <= COALESCE(oag_schedules.enddate, > 'infinity'::date)) AND (date_part('isodow'::text, > ((legdetail.outtime)::date)::timestamp without time zone) = ANY > ((oag_schedules.frequency)::double precision[]))) > ROWS REMOVED BY JOIN FILTER: 6822878 > > Can you send "explain" (not explain analyze) for the production server? Not for the full query - it only just completed, after 70 minutes or so, and I wasn't running under EXPLAIN ANALYZE. Running with a shorter date range of only 7 days, as you suggest below: https://explain.depesz.com/s/r80j <https://explain.depesz.com/s/r80j> I notice that this is a COMPLETELY different query plan from what I got on my test machine, but I'm not sure if that's a good thing or not. The end result is obviously not. > > And \d for those tables. Sure: flightlogs=# \d legdetail Table "public.legdetail" Column | Type | Modifiers ------------+--------------------------+-------------------------------------------------------- id | integer | not null default nextval('legdetail_id_seq'::regclass) logid | integer | flightnum | character varying(32) | legfrom | character varying(6) | legto | character varying(6) | pax | integer | default 0 cargo | integer | default 0 legdate | date | default now() outtime | timestamp with time zone | not null ontime | timestamp with time zone | offtime | timestamp with time zone | intime | timestamp with time zone | blocktime | interval | flighttime | interval | mail | integer | default 0 csom | character(1) | nonrevpax | smallint | legsic | character varying(128) | nonrevfrt | integer | ti | boolean | default false Indexes: "legdetail_pkey" PRIMARY KEY, btree (id) "csom_idx" btree (csom) "flightnum_idx" btree (flightnum) "legdate_idx" btree (legdate) "legfrom_idx" btree (legfrom) "legto_idx" btree (legto) "logid_idx" btree (logid) "outtime_idx" btree (outtime) Foreign-key constraints: "logfk" FOREIGN KEY (logid) REFERENCES logs(id) Triggers: legdetail_audit AFTER UPDATE ON legdetail FOR EACH ROW WHEN (old.* <> new.*) EXECUTE PROCEDURE changeaudit() update_hours_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE update_hours() update_logtime_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE update_logtime() update_status_legs AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE acstatupdate_loc() flightlogs=# \d oag_batches Foreign table "public.oag_batches" Column | Type | Modifiers | FDW Options ----------+--------------------------+-----------+-------------------------- batchid | uuid | not null | (column_name 'batchid') name | character varying(256) | not null | (column_name 'name') created | timestamp with time zone | | (column_name 'created') eff_from | date | | (column_name 'eff_from') eff_to | date | | (column_name 'eff_to') Server: localhost FDW Options: (schema_name 'public', table_name 'batches') flightlogs=# \d oag_schedules Foreign table "public.oag_schedules" Column | Type | Modifiers | FDW Options --------------+------------------------+-----------+------------------------------ id | uuid | not null | (column_name 'id') batchid | uuid | | (column_name 'batchid') company | character varying(3) | | (column_name 'company') flightnum | integer | not null | (column_name 'flightnum') startdate | date | | (column_name 'startdate') enddate | date | | (column_name 'enddate') frequency | integer[] | | (column_name 'frequency') origin | character varying(3) | | (column_name 'origin') depart_time | time without time zone | | (column_name 'depart_time') destination | character varying(3) | | (column_name 'destination') arrival_time | time without time zone | | (column_name 'arrival_time') equipment | character varying(3) | | (column_name 'equipment') flight_type | character(1) | | (column_name 'flight_type') num_seats | smallint | | (column_name 'num_seats') Server: localhost FDW Options: (schema_name 'public', table_name 'schedules') > > And/or EXPLAIN ANALYZE for a query with shorter date range on production (to > confirm it has a similar problem in rowcount estimation). > > You can try munging the query to move/change the "Join Filter" components of > the query to see which one is contributing most to the rowcount estimate being > off by a factor of 3e6. > > Justin