Re: [PERFORM] same plan, add 1 condition, 1900x slower
On Fri, 2005-11-11 at 10:53 -0500, Tom Lane wrote: After re-reading your explanation of what you're doing with the data, I thought of a possible explanation. Is the source value exactly correlated with the external_id_map primary key? Sort of. In this case, at the beginning of external_id_map, yes, though further down the table they're not. For example, if we got new subjects from 'SCH' at this point, they'd get assigned external_id_map.target_id (the primary key) values that are totally unrelated to what the current set are (the values in the external_id_map primary key just come off of a sequence that we use for everything). Right now though, since the 'SCH' data came in a contiguous chunk right at the beginning and hasn't changed or grown since then, the correlation is pretty exact, I think. It's true that there are no 'SCH' rows in the table after the first contiguous set (when I get back to work I'll check exactly what row that is). It's interesting that there are these correlations in the the data that didn't exist at all in my mental model. what you need to do is incorporate the source value into the external_id_map index key somehow. Then the index scan would be able to realize that there is no possibility of finding another row with source = 'SCH'. The simplest way is just to make a 2-column index I thought that's what I had done with the external_id_map_source_target_id index: statgen= \d util.external_id_map Table util.external_id_map Column | Type | Modifiers ---+---+--- source_id | character varying(32) | not null source| character(3) | not null target_id | bigint| not null Indexes: external_id_map_primary_key PRIMARY KEY, btree (target_id) external_id_map_source_source_id_unique UNIQUE, btree (source, source_id) external_id_map_source btree (source) external_id_map_source_target_id btree (source, target_id) Foreign-key constraints: external_id_map_source FOREIGN KEY (source) REFERENCES util.source(id) So if I understand your suggestion correctly, we're back to the why isn't this query using index foo FAQ. For the external_id_map table, the statistics target for source is 200; the other two columns are at the default level because I didn't think of them as being very interesting statistics-wise. I suppose I should probably go ahead and raise the targets for every column of that table; I expect the planning time is negligible, and our queries tend to be large data-wise. Beyond that, I'm not sure how else to encourage the use of that index. If I changed that index to be (target_id, source) would it make a difference? Thanks for your help, Mitch ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Faster db architecture for a twisted table.
On Sat, 2005-12-03 at 23:00 +, Rodrigo Madera wrote: CREATE TABLE person( id bigint PRIMARY KEY, first_name TEXT, age INT, mother bigint REFERENCES person, father biging REFERENCES person, siblings array of bigints (don't remember the syntax, but you get the point) ); Well, this is ok, but imagine a search for brothers of person id 34. We would have to search inside the record's 'siblings' array. Is this a bad design? is this going to be slow? Well, I don't know how close this example is to your actual problem, but the siblings array is redundant, AFAICS. If you got rid of it, you could query for full sibling brothers with something like (not tested): select bro.* from person p inner join person bro on (p.mother = bro.mother) AND (p.father = bro.father) where bro.sex='M' and p.id=34 ...assuming you added a sex field, which you would need in any case to query for brothers. You could query for half-siblings by changing the AND into an OR, I think. Mitch ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Simple Join
On Wed, 2005-12-14 at 17:47 -0500, Tom Lane wrote: That plan looks perfectly fine to me. You could try forcing some other choices by fooling with the planner enable switches (eg set enable_seqscan = off) but I doubt you'll find much improvement. There are too many rows being pulled from ordered_products to make an index nestloop a good idea. Well, I'm no expert either, but if there was an index on ordered_products (paid, suspended_sub, id) it should be mergejoinable with the index on to_ship.ordered_product_id, right? Given the conditions on paid and suspended_sub. If you (Kevin) try adding such an index, ideally it would get used given that you're only pulling out a small fraction of the rows in to_ship. If it doesn't get used, then I had a similar issue with 8.0.3 where an index that was mergejoinable (only because of the restrictions in the where clause) wasn't getting picked up. Mitch Kevin Brown wrote: CREATE TABLE to_ship ( id int8 NOT NULL DEFAULT nextval(('to_ship_seq'::text)::regclass), ordered_product_id int8 NOT NULL, bounced int4 NOT NULL DEFAULT 0, operator_id varchar(20) NOT NULL, timestamp timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone, CONSTRAINT to_ship_pkey PRIMARY KEY (id), CONSTRAINT to_ship_ordered_product_id_fkey FOREIGN KEY (ordered_product_id) REFERENCES ordered_products (id) ON UPDATE RESTRICT ON DELETE RESTRICT ) WITHOUT OIDS; CREATE TABLE ordered_products ( id int8 NOT NULL DEFAULT nextval(('ordered_products_seq'::text)::regclass), order_id int8 NOT NULL, product_id int8 NOT NULL, recipient_address_id int8 NOT NULL, hide bool NOT NULL DEFAULT false, renewal bool NOT NULL DEFAULT false, timestamp timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone, operator_id varchar(20) NOT NULL, suspended_sub bool NOT NULL DEFAULT false, quantity int4 NOT NULL DEFAULT 1, price_paid numeric NOT NULL, tax_paid numeric NOT NULL DEFAULT 0, shipping_paid numeric NOT NULL DEFAULT 0, remaining_issue_obligation int4 NOT NULL DEFAULT 0, parent_product_id int8, delivery_method_id int8 NOT NULL, paid bool NOT NULL DEFAULT false, CONSTRAINT ordered_products_pkey PRIMARY KEY (id), CONSTRAINT ordered_products_order_id_fkey FOREIGN KEY (order_id) REFERENCES orders (id) ON UPDATE RESTRICT ON DELETE RESTRICT, CONSTRAINT ordered_products_parent_product_id_fkey FOREIGN KEY (parent_product_id) REFERENCES ordered_products (id) ON UPDATE RESTRICT ON DELETE RESTRICT, CONSTRAINT ordered_products_recipient_address_id_fkey FOREIGN KEY (recipient_address_id) REFERENCES addresses (id) ON UPDATE RESTRICT ON DELETE RESTRICT ) WITHOUT OIDS; === The two indexes that should matter === CREATE INDEX ordered_product_id_index ON to_ship USING btree (ordered_product_id); CREATE INDEX paid_index ON ordered_products USING btree (paid); ordered_products.id is a primary key, so it should have an implicit index. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster