I may (or may not) have found the solution: a reindex on the 3 tables fixed the query plan. Now I can plan to reindex only the involved indexes at the start of the data import procedure.
On Wed, Oct 31, 2012 at 11:55 AM, Vincenzo Melandri <vmelan...@imolinfo.it>wrote: > Hi all :) > > I'm here again. > This time I'll provide more details (explain analyze, data-type, and > indexes), hope it will be enough :) > > The query that is performing a plan that i do not understand is the > following: > -------------------- > select [some fields from all 3 tables] > from > DATA_SEQUENCES > join SUBSCRIPTION on > SUBSCRIPTION.key1 = DATA_SEQUENCES.key1 AND > SUBSCRIPTION.key2 = DATA_SEQUENCES.key2 > join people on > people.key1 = SUBSCRIPTION.people_key1 AND > people.key2 = SUBSCRIPTION.people_key2 > WHERE DATA_SEQUENCES.import_id = 1351674661 > -------------------- > > This is the explain analyze: > > -------------------- > Merge Join (cost=2902927.01..2973307.79 rows=790371 width=240) (actual > time=40525.439..40525.439 rows=0 loops=1) > Merge Cond: ((people.key1 = subscription.people_key1) AND (people.key2 = > subscription.people_key2)) > -> Sort (cost=2885618.73..2904468.49 rows=7539905 width=240) (actual > time=40525.268..40525.268 rows=1 loops=1) > Sort Key: people.key1, people.key2 > Sort Method: external merge Disk: 466528kB > -> Seq Scan on people (cost=0.00..323429.05 rows=7539905 > width=240) (actual time=0.029..5193.057 rows=7539469 loops=1) > -> Sort (cost=17308.28..17318.76 rows=4193 width=16) (actual > time=0.167..0.167 rows=0 loops=1) > Sort Key: subscription.people_key1, subscription.people_key2 > Sort Method: quicksort Memory: 25kB > -> Nested Loop (cost=0.00..17055.99 rows=4193 width=16) (actual > time=0.154..0.154 rows=0 loops=1) > -> Seq Scan on data_sequences (cost=0.00..150.15 rows=39 > width=16) (actual time=0.154..0.154 rows=0 loops=1) > Filter: (import_id = 1351674661) > -> Index Scan using xpksubscription on subscription > (cost=0.00..431.86 rows=108 width=16) (never executed) > Index Cond: ((subscription.key1 = data_sequences.key1) > AND (subscription.key2 = data_sequences.key2)) > Total runtime: 40600.815 ms > -------------------- > > All the key, key2, and relative foreign keys are int4. Import_id is a > bigint. > I'm not reporting the full create table script 'cause people and > subscription both have lots of fields. I know this can be wrong (lots of > field on big table), but this is an environment born something like 20 > years ago and not intended from the start for such a big data volume. > > I have the following indexes: > > on People: > CREATE UNIQUE INDEX people_pkey ON people USING btree (key1, key2) > CREATE INDEX people_pkey_hash_loc ON people USING hash (key1); > CREATE INDEX people_pkey_hash_id ON people USING hash (key2); > > on Subscription: > CREATE UNIQUE INDEX subscription_pkey ON subscription USING btree (key1, > key2) > CREATE INDEX subscription_fk_people ON subscription USING btree > (people_key1, people_key2) > > on Data_sequences: > create index data_sequences_key on data_sequences USING btree (key1, key2); > create index data_sequences_id on data_sequences USING btree (import_id); > > What i don't understand is WHY the seq scan on people, and how can I cast > the import_id to make it use the index on data_sequences (another useless > seq scan). > Mind that when I run this explain analyze there were no records on > data_sequences. So all the time (40 seconds!) is for the useless seq scan > on people. Both people and subscription have lots of records (10-20.000.000 > range). > I'm running 8.4 (haven't tested it on 9.2.1 yet, but we planned to upgrade > ASAP cause we have other queries which will benefit from the > index-only-scan new feature). > > > Thank you in advance, > -- > Vincenzo. > -- Vincenzo. Imola Informatica Ai sensi del D.Lgs. 196/2003 si precisa che le informazioni contenute in questo messaggio sono riservate ed a uso esclusivo del destinatario. Pursuant to Legislative Decree No. 196/2003, you are hereby informed that this message contains confidential information intended only for the use of the addressee.