Re: [PERFORM] Speedup hint needed, if available? :)
Mario Splivalo wrote: Hello again. I have to track user subscriptions to certain mailinglists, and I also need to track credits users have on those mailinglists. On one side I have procedures that add credits, on other side I have procedures that subtract available credits. Add/subtract is pretty intensive, around 30-50 adds per minute (usualy 10 or 100 credits), and around 200-500 subtracts per minute (usualy by one or two credits). I have created table user_subscriptions to track user subscriptions to certain mailing list. I have derived subscription_id as primary key. I have two other tables, user_subscription_credits_given, and _credits_taken, wich track credits for subscription added or subtracted to or from certain subscription. I created those two tables so I could eliminate a lot of UPDATES on user_subscriptions table (if I were to have a column 'credits' in that table). It sounds to me like you have decided beforehand that the obvious solution (update a credit field in the user_subscriptions table) is not going to perform well. Have you tried it? How does it perform? If it does indeed give you performance problems, you could instead run some kind of batch job to update the credits field (and delete the /given/taken records). Finally: You could refactor the query to get rid of the union: SELECT u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from, u.valid_to, ( SELECT sum(credits) FROM credits_given WHERE subscription_id = u.subscription_id ) - ( SELECT sum(credits) FROM credits_taken WHERE subscription_id = u.subscription_id) ) AS credits FROM user_subscriptions u WHERE u.user_id = 1 (Not tested). You will probably need a COALESCE around each of the subqueries to avoid problems with nulls. rantThe sum of an empty set of numbers is 0. The conjunction of an empty set of booleans is true. The SQL standard somehow manages to get this wrong/rant /Nis ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Speedup hint needed, if available? :)
Hello again. I have to track user subscriptions to certain mailinglists, and I also need to track credits users have on those mailinglists. On one side I have procedures that add credits, on other side I have procedures that subtract available credits. Add/subtract is pretty intensive, around 30-50 adds per minute (usualy 10 or 100 credits), and around 200-500 subtracts per minute (usualy by one or two credits). I have created table user_subscriptions to track user subscriptions to certain mailing list. I have derived subscription_id as primary key. I have two other tables, user_subscription_credits_given, and _credits_taken, wich track credits for subscription added or subtracted to or from certain subscription. I created those two tables so I could eliminate a lot of UPDATES on user_subscriptions table (if I were to have a column 'credits' in that table). user_subscriptions table is projected to have around 100.000 rows, and _credits_given/_credits_taken table is projected to have around 10.000.000 rows. Now, I have filled the tables with test data, and the query results is kind of poor. It takes almost 50 seconds to get the data for the particular subscription. Now, is there a way to speed this up, or I need different approach? Here is the DDL/DML: CREATE TABLE user_subscriptions ( subscription_id int4 NOT NULL DEFAULT nextval('user_subscriptions_id_seq'::regclass), user_id int4 NOT NULL, mailinglist_id int4 NOT NULL, valid_from timestamptz NOT NULL, valid_to timestamptz, CONSTRAINT user_subscriptions_pkey PRIMARY KEY (subscription_id) ); CREATE TABLE user_subscription_credits_given ( subscription_id int4 NOT NULL, credits int4 NOT NULL, CONSTRAINT user_subscription_credits_given_fk__subscription_id FOREIGN KEY (subscription_id) REFERENCES user_subscriptions (subscription_id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ); CREATE INDEX fki_user_subscriptions_fk__mailinglist_id ON user_subscriptions USING btree (mailinglist_id); CREATE INDEX fki_user_subscriptions_fk__users_id ON user_subscriptions USING btree (user_id); CREATE INDEX fki_user_subscription_credits_given_fk__subscription_id ON user_subscription_credits_given USING btree (subscription_id); CREATE INDEX fki_user_subscription_credits_taken_fk__subscription_id ON user_subscription_credits_taken USING btree (subscription_id); Here is the query which gets information on particular user, shows subscriptions to mailinglists and available credits on those mailinglists: SELECT u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from, u.valid_to, sum(credits.credits_given - credits.credits_taken)::integer AS credits FROM user_subscriptions u LEFT JOIN (SELECT user_subscription_credits_given.subscription_id, user_subscription_credits_given.credits AS credits_given, 0 AS credits_taken FROM user_subscription_credits_given UNION ALL SELECT user_subscription_credits_taken.subscription_id, 0 AS credits_given, user_subscription_credits_taken.credits AS credits_taken FROM user_subscription_credits_taken) credits ON u.subscription_id = credits.subscription_id where u.user_id = 1 GROUP BY u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from, u.valid_to And here is the 'explain analyze' of the above query: QUERY PLAN -- HashAggregate (cost=200079055.24..200079055.28 rows=2 width=36) (actual time=56527.153..56527.163 rows=2 loops=1) - Nested Loop Left Join (cost=200033690.72..200078931.34 rows=8260 width=36) (actual time=0.432..54705.844 rows=275366 loops=1) Join Filter: (outer.subscription_id = inner.subscription_id) - Index Scan using fki_user_subscriptions_fk__users_id on user_subscriptions u (cost=0.00..3.03 rows=2 width=28) (actual time=0.030..0.055 rows=2 loops=1) Index Cond: (user_id = 1) - Materialize (cost=200033690.72..200045984.63 rows=825991 width=12) (actual time=0.043..22404.107 rows=826032 loops=2) - Subquery Scan credits (cost=1.00..200028830.73 rows=825991 width=12) (actual time=0.050..31500.589 rows=826032 loops=1) - Append (cost=1.00..200020570.82 rows=825991 width=8) (actual time=0.041..22571.540 rows=826032 loops=1) - Subquery Scan *SELECT* 1 (cost=1.00..11946.96 rows=78148 width=8) (actual time=0.031..1226.640 rows=78148 loops=1) - Seq Scan on user_subscription_credits_given (cost=1.00..11165.48 rows=78148 width=8) (actual time=0.022..404.253 rows=78148 loops=1) - Subquery Scan *SELECT* 2 (cost=1.00..100018623.86 rows=747843 width=8) (actual time=0.032..12641.705 rows=747884
Re: [PERFORM] Speedup hint needed, if available? :)
Mario Splivalo [EMAIL PROTECTED] writes: Here is the query which gets information on particular user, shows subscriptions to mailinglists and available credits on those mailinglists: SELECT u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from, u.valid_to, sum(credits.credits_given - credits.credits_taken)::integer AS credits FROM user_subscriptions u LEFT JOIN (SELECT user_subscription_credits_given.subscription_id, user_subscription_credits_given.credits AS credits_given, 0 AS credits_taken FROM user_subscription_credits_given UNION ALL SELECT user_subscription_credits_taken.subscription_id, 0 AS credits_given, user_subscription_credits_taken.credits AS credits_taken FROM user_subscription_credits_taken) credits ON u.subscription_id = credits.subscription_id where u.user_id = 1 GROUP BY u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from, u.valid_to Do you have realistic test data? The EXPLAIN shows that this is pulling 275366 of the 826032 rows in the two tables, which seems like rather a lot for a single user. If it's reasonable that the query needs to fetch one-third of the data, then you should resign yourself to it taking awhile :-( If the expected number of matching rows were much smaller, it would make sense to use indexscans over the two big tables, but unfortunately existing PG releases don't know how to generate an indexscan join with a UNION ALL in between :-(. FWIW, 8.2 will be able to do it. In current releases the only thing I can suggest is to merge user_subscription_credits_given and user_subscription_credits_taken into one table so you don't need the UNION ALL. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Speedup hint needed, if available? :)
On Tue, 2006-05-30 at 11:05 -0400, Tom Lane wrote: Do you have realistic test data? The EXPLAIN shows that this is pulling 275366 of the 826032 rows in the two tables, which seems like rather a lot for a single user. If it's reasonable that the query needs to fetch one-third of the data, then you should resign yourself to it taking awhile :-( I'd say so, yes. The user_subscription table now has only six rows, but the number of actions (giving/taking credits) for a user could easily be as high as 50.000. If the expected number of matching rows were much smaller, it would make sense to use indexscans over the two big tables, but unfortunately existing PG releases don't know how to generate an indexscan join with a UNION ALL in between :-(. FWIW, 8.2 will be able to do it. In current releases the only thing I can suggest is to merge user_subscription_credits_given and user_subscription_credits_taken into one table so you don't need the UNION ALL. See, that's an idea! :) Thnx, I'll try that. Is it inapropriate to ask about rough estimate on availableness of 8.2? :) Mario -- Mario Splivalo Mob-Art [EMAIL PROTECTED] I can do it quick, I can do it cheap, I can do it well. Pick any two. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster