subject:"\[PERFORM\] Speedup hint needed, if available\? \:\)"

Re: [PERFORM] Speedup hint needed, if available? :)

2006-05-31 Thread Nis Jorgensen


Mario Splivalo wrote:

Hello again.

I have to track user subscriptions to certain mailinglists, and I also
need to track credits users have on those mailinglists. On one side I
have procedures that add credits, on other side I have procedures that
subtract available credits. Add/subtract is pretty intensive, around
30-50 adds per minute (usualy 10 or 100 credits), and around 200-500
subtracts per minute (usualy by one or two credits).

I have created table user_subscriptions to track user subscriptions to
certain mailing list. I have derived subscription_id as primary key. I
have two other tables, user_subscription_credits_given, and
_credits_taken, wich track credits for subscription added or subtracted
to or from certain subscription. I created those two tables so I could
eliminate a lot of UPDATES on user_subscriptions table (if I were to
have a column 'credits' in that table). 


It sounds to me like you have decided beforehand that the obvious
solution (update a credit field in the user_subscriptions table) is not 
going to perform well. Have you tried it? How does it perform?


If it does indeed give you performance problems, you could instead run 
some kind of batch job to update the credits field (and delete the 
/given/taken records).


Finally: You could refactor the query to get rid of the union:

SELECT u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from,
u.valid_to, (
  SELECT sum(credits)
  FROM credits_given
  WHERE subscription_id = u.subscription_id
) - (
  SELECT sum(credits)
  FROM credits_taken
  WHERE subscription_id = u.subscription_id)
) AS credits
FROM user_subscriptions u
WHERE u.user_id = 1

(Not tested).

You will probably need a COALESCE around each of the subqueries to avoid
problems with nulls. rantThe sum of an empty set of numbers is 0. The
conjunction of an empty set of booleans is true. The SQL standard 
somehow manages to get this wrong/rant


/Nis



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

[PERFORM] Speedup hint needed, if available? :)

2006-05-30 Thread Mario Splivalo

Hello again.

I have to track user subscriptions to certain mailinglists, and I also
need to track credits users have on those mailinglists. On one side I
have procedures that add credits, on other side I have procedures that
subtract available credits. Add/subtract is pretty intensive, around
30-50 adds per minute (usualy 10 or 100 credits), and around 200-500
subtracts per minute (usualy by one or two credits).

I have created table user_subscriptions to track user subscriptions to
certain mailing list. I have derived subscription_id as primary key. I
have two other tables, user_subscription_credits_given, and
_credits_taken, wich track credits for subscription added or subtracted
to or from certain subscription. I created those two tables so I could
eliminate a lot of UPDATES on user_subscriptions table (if I were to
have a column 'credits' in that table). user_subscriptions table is
projected to have around 100.000 rows, and _credits_given/_credits_taken
table is projected to have around 10.000.000 rows. 

Now, I have filled the tables with test data, and the query results is
kind of poor. It takes almost 50 seconds to get the data for the
particular subscription. Now, is there a way to speed this up, or I need
different approach?

Here is the DDL/DML:

CREATE TABLE user_subscriptions
(
  subscription_id int4 NOT NULL DEFAULT
nextval('user_subscriptions_id_seq'::regclass),
  user_id int4 NOT NULL,
  mailinglist_id int4 NOT NULL,
  valid_from timestamptz NOT NULL,
  valid_to timestamptz,
  CONSTRAINT user_subscriptions_pkey PRIMARY KEY (subscription_id)
);

CREATE TABLE user_subscription_credits_given
(
  subscription_id int4 NOT NULL,
  credits int4 NOT NULL,
  CONSTRAINT user_subscription_credits_given_fk__subscription_id FOREIGN
KEY (subscription_id)
  REFERENCES user_subscriptions (subscription_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION
);


CREATE INDEX fki_user_subscriptions_fk__mailinglist_id
  ON user_subscriptions
  USING btree
  (mailinglist_id);

CREATE INDEX fki_user_subscriptions_fk__users_id
  ON user_subscriptions
  USING btree
  (user_id);

CREATE INDEX fki_user_subscription_credits_given_fk__subscription_id
  ON user_subscription_credits_given
  USING btree
  (subscription_id);

CREATE INDEX fki_user_subscription_credits_taken_fk__subscription_id
  ON user_subscription_credits_taken
  USING btree
  (subscription_id);


Here is the query which gets information on particular user, shows
subscriptions to mailinglists and available credits on those
mailinglists:

SELECT u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from,
u.valid_to, sum(credits.credits_given - credits.credits_taken)::integer
AS credits
FROM user_subscriptions u
LEFT JOIN 
(SELECT user_subscription_credits_given.subscription_id,
user_subscription_credits_given.credits AS credits_given, 0 AS
credits_taken
   FROM user_subscription_credits_given
UNION ALL 
 SELECT user_subscription_credits_taken.subscription_id, 0 AS
credits_given, user_subscription_credits_taken.credits AS credits_taken
   FROM user_subscription_credits_taken) credits 
ON u.subscription_id = credits.subscription_id
where
u.user_id = 1
GROUP BY u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from,
u.valid_to

And here is the 'explain analyze' of the above query:


QUERY PLAN
--
 HashAggregate  (cost=200079055.24..200079055.28 rows=2 width=36)
(actual time=56527.153..56527.163 rows=2 loops=1)
   -  Nested Loop Left Join  (cost=200033690.72..200078931.34 rows=8260
width=36) (actual time=0.432..54705.844 rows=275366 loops=1)
 Join Filter: (outer.subscription_id =
inner.subscription_id)
 -  Index Scan using fki_user_subscriptions_fk__users_id on
user_subscriptions u  (cost=0.00..3.03 rows=2 width=28) (actual
time=0.030..0.055 rows=2 loops=1)
   Index Cond: (user_id = 1)
 -  Materialize  (cost=200033690.72..200045984.63 rows=825991
width=12) (actual time=0.043..22404.107 rows=826032 loops=2)
   -  Subquery Scan credits
(cost=1.00..200028830.73 rows=825991 width=12) (actual
time=0.050..31500.589 rows=826032 loops=1)
 -  Append  (cost=1.00..200020570.82
rows=825991 width=8) (actual time=0.041..22571.540 rows=826032 loops=1)
   -  Subquery Scan *SELECT* 1
(cost=1.00..11946.96 rows=78148 width=8) (actual
time=0.031..1226.640 rows=78148 loops=1)
 -  Seq Scan on
user_subscription_credits_given  (cost=1.00..11165.48
rows=78148 width=8) (actual time=0.022..404.253 rows=78148 loops=1)
   -  Subquery Scan *SELECT* 2
(cost=1.00..100018623.86 rows=747843 width=8) (actual
time=0.032..12641.705 rows=747884

Re: [PERFORM] Speedup hint needed, if available? :)

2006-05-30 Thread Tom Lane

Mario Splivalo [EMAIL PROTECTED] writes:
 Here is the query which gets information on particular user, shows
 subscriptions to mailinglists and available credits on those
 mailinglists:

 SELECT u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from,
 u.valid_to, sum(credits.credits_given - credits.credits_taken)::integer
 AS credits
 FROM user_subscriptions u
 LEFT JOIN 
   (SELECT user_subscription_credits_given.subscription_id,
 user_subscription_credits_given.credits AS credits_given, 0 AS
 credits_taken
FROM user_subscription_credits_given
   UNION ALL 
  SELECT user_subscription_credits_taken.subscription_id, 0 AS
 credits_given, user_subscription_credits_taken.credits AS credits_taken
FROM user_subscription_credits_taken) credits 
   ON u.subscription_id = credits.subscription_id
 where
   u.user_id = 1
 GROUP BY u.subscription_id, u.user_id, u.mailinglist_id, u.valid_from,
 u.valid_to

Do you have realistic test data?  The EXPLAIN shows that this is pulling
275366 of the 826032 rows in the two tables, which seems like rather a
lot for a single user.  If it's reasonable that the query needs to fetch
one-third of the data, then you should resign yourself to it taking
awhile :-(

If the expected number of matching rows were much smaller, it would
make sense to use indexscans over the two big tables, but unfortunately
existing PG releases don't know how to generate an indexscan join
with a UNION ALL in between :-(.  FWIW, 8.2 will be able to do it.
In current releases the only thing I can suggest is to merge
user_subscription_credits_given and user_subscription_credits_taken
into one table so you don't need the UNION ALL.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Speedup hint needed, if available? :)

2006-05-30 Thread Mario Splivalo

On Tue, 2006-05-30 at 11:05 -0400, Tom Lane wrote:

 Do you have realistic test data?  The EXPLAIN shows that this is pulling
 275366 of the 826032 rows in the two tables, which seems like rather a
 lot for a single user.  If it's reasonable that the query needs to fetch
 one-third of the data, then you should resign yourself to it taking
 awhile :-(

I'd say so, yes. The user_subscription table now has only six rows, but
the number of actions (giving/taking credits) for a user could easily be
as high as 50.000. 

 If the expected number of matching rows were much smaller, it would
 make sense to use indexscans over the two big tables, but unfortunately
 existing PG releases don't know how to generate an indexscan join
 with a UNION ALL in between :-(.  FWIW, 8.2 will be able to do it.
 In current releases the only thing I can suggest is to merge
 user_subscription_credits_given and user_subscription_credits_taken
 into one table so you don't need the UNION ALL.

See, that's an idea! :) Thnx, I'll try that.

Is it inapropriate to ask about rough estimate on availableness of
8.2? :)

Mario
-- 
Mario Splivalo
Mob-Art
[EMAIL PROTECTED]

I can do it quick, I can do it cheap, I can do it well. Pick any two.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Speedup hint needed, if available? :)

[PERFORM] Speedup hint needed, if available? :)

Re: [PERFORM] Speedup hint needed, if available? :)

Re: [PERFORM] Speedup hint needed, if available? :)

4 matches

Site Navigation

Mail list logo

Footer information