Re: [PERFORM] Query/database optimization

2006-08-02 Thread Eugeny N Dzhurinsky
On Tue, Aug 01, 2006 at 11:15:11PM -0400, Tom Lane wrote:
 Eugeny N Dzhurinsky [EMAIL PROTECTED] writes:
  [slow query]
 The bulk of your time seems to be going into this indexscan:
   -  Index Scan using 
  task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 
  width=51) (actual time=2.785..2.785 rows=0 loops=4161)
 Index Cond: (outer.customer_id = 
  ts.customer_id)
 Filter: 
  ((get_available_pages(task_id, customer_id)  0) AND 
  ((get_check_period(task_id, next_check) IS NULL) OR 
  ((date_part('epoch'::text, get_check_period(task_id, next_check)) - 
  date_part('epoch'::text, (timenow())::timestamp without time zone))  
  3600::double precision)) AND (status  1) AND start_time)::time with 
  time zone  ('now'::text)::time(6) with time zone) AND ((stop_time)::time 
  with time zone  ('now'::text)::time(6) with time zone)) OR ((start_time IS 
  NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan)))
 SubPlan
   -  Unique  (cost=2.02..2.03 
  rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1)
  ...
 
 I kinda doubt that the index search itself is that slow --- doubtless
 the problem comes from having to evaluate that filter condition on a lot
 of rows.  How fast are those functions you're calling?

Well, not really fast, especially get_available_pages

there is special table with history of changes, and there is a view for latest
changes per task, and this function selects all records from a view for given
ID, then calculates sum of pages of tasks and then calculates number of
available pages as number of allowed pages deduct number of processed pages.

probably there is bottleneck in this view selection?

-- 
Eugene N Dzhurinsky

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] Query/database optimization

2006-08-01 Thread Eugeny N Dzhurinsky
Hello, I have a query:

explain analyze select tu.url_id, tu.url, coalesce(sd.recurse, 100), case when
COALESCE(get_option('use_banner')::integer,0) = 0 then 0 else ts.use_banner
end as use_banner, ts.use_cookies, ts.use_robots,  ts.includes, ts.excludes,
ts.track_domain, ts.task_id,get_available_pages(ts.task_id,ts.customer_id),
ts.redirects from task_url tu  inner join task_scheduler ts on
tu.task_id=ts.task_id  inner join  (subscription s inner join subscription_dic
sd  on sd.id=s.dict_id )  on s.customer_id=ts.customer_id  inner join customer
c on c.customer_id=ts.customer_id AND c.active  WHERE
get_available_pages(ts.task_id,ts.customer_id)  0 AND
((get_option('expired_users')::integer = 0) OR (isfinite(last_login) AND
extract('day' from current_timestamp - last_login)::integer =
coalesce(get_option('expired_users')::integer,100))) AND  ((s.status is null
AND ts.customer_id is null)  OR s.status  0) AND
(get_check_period(ts.task_id,ts.next_check) is null OR
(unix_timestamp(get_check_period(ts.task_id,ts.next_check)) -
unix_timestamp(timenow())  3600)) AND ts.status  1 AND ((ts.start_time 
current_time AND ts.stop_time  current_time)  OR (ts.start_time is null AND
ts.stop_time is null))  AND tu.url_id = 1  AND ts.customer_id not in (select
distinct customer_id from task_scheduler where status = 1)  order by
ts.next_check is not null, unix_timestamp(ts.next_check) -
unix_timestamp(timenow()) limit 10;

which produces this query plan:
 Limit  (cost=2874.98..2874.99 rows=2 width=88) (actual 
time=11800.535..11800.546 rows=3 loops=1)
   -  Sort  (cost=2874.98..2874.99 rows=2 width=88) (actual 
time=11800.529..11800.532 rows=3 loops=1)
 Sort Key: (ts.next_check IS NOT NULL), (date_part('epoch'::text, 
ts.next_check) - date_part('epoch'::text, (timenow())::timestamp without time 
zone))
 -  Nested Loop  (cost=4.37..2874.97 rows=2 width=88) (actual 
time=10249.115..11800.486 rows=3 loops=1)
   -  Nested Loop  (cost=4.37..2868.87 rows=2 width=55) (actual 
time=10247.721..11796.303 rows=3 loops=1)
 Join Filter: (inner.id = outer.dict_id)
 -  Nested Loop  (cost=2.03..2865.13 rows=2 width=55) 
(actual time=10247.649..11796.142 rows=3 loops=1)
   Join Filter: (((inner.status IS NULL) AND 
(outer.customer_id IS NULL)) OR (inner.status  0))
   -  Nested Loop  (cost=2.03..2858.34 rows=2 
width=55) (actual time=10247.583..11795.936 rows=3 loops=1)
 -  Seq Scan on customer c  (cost=0.00..195.71 
rows=231 width=4) (actual time=0.082..154.344 rows=4161 loops=1)
   Filter: (active AND isfinite(last_login) 
AND ((date_part('day'::text, (('now'::text)::timestamp(6) with time zone - 
(last_login)::timestamp with time zone)))::integer = 150))
 -  Index Scan using 
task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 
width=51) (actual time=2.785..2.785 rows=0 loops=4161)
   Index Cond: (outer.customer_id = 
ts.customer_id)
   Filter: ((get_available_pages(task_id, 
customer_id)  0) AND ((get_check_period(task_id, next_check) IS NULL) OR 
((date_part('epoch'::text, get_check_period(task_id, next_check)) - 
date_part('epoch'::text, (timenow())::timestamp without time zone))  
3600::double precision)) AND (status  1) AND start_time)::time with time 
zone  ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time 
zone  ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND 
(stop_time IS NULL))) AND (NOT (hashed subplan)))
   SubPlan
 -  Unique  (cost=2.02..2.03 rows=1 
width=4) (actual time=0.617..0.631 rows=3 loops=1)
   -  Sort  (cost=2.02..2.03 
rows=1 width=4) (actual time=0.613..0.617 rows=3 loops=1)
 Sort Key: customer_id
 -  Index Scan using 
task_scheduler_istatus on task_scheduler  (cost=0.00..2.01 rows=1 width=4) 
(actual time=0.044..0.580 rows=3 loops=1)
   Index Cond: (status 
= 1)
   -  Index Scan using subscription_icustomer_id on 
subscription s  (cost=0.00..3.38 rows=1 width=12) (actual time=0.035..0.041 
rows=1 loops=3)
 Index Cond: (outer.customer_id = 
s.customer_id)
 -  Materialize  (cost=2.34..2.65 rows=31 width=8) (actual 
time=0.008..0.027 rows=6 loops=3)
   -  Seq Scan on subscription_dic sd  
(cost=0.00..2.31 rows=31 width=8) (actual time=0.013..0.034 rows=6 loops=1)
   -  Index Scan using task_url_storage_task_id on task_url tu  
(cost=0.00..3.03 rows=1 width=37) (actual 

Re: [PERFORM] Query/database optimization

2006-08-01 Thread Tom Lane
Eugeny N Dzhurinsky [EMAIL PROTECTED] writes:
 [slow query]

The bulk of your time seems to be going into this indexscan:

  -  Index Scan using 
 task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 
 width=51) (actual time=2.785..2.785 rows=0 loops=4161)
Index Cond: (outer.customer_id = 
 ts.customer_id)
Filter: ((get_available_pages(task_id, 
 customer_id)  0) AND ((get_check_period(task_id, next_check) IS NULL) OR 
 ((date_part('epoch'::text, get_check_period(task_id, next_check)) - 
 date_part('epoch'::text, (timenow())::timestamp without time zone))  
 3600::double precision)) AND (status  1) AND start_time)::time with 
 time zone  ('now'::text)::time(6) with time zone) AND ((stop_time)::time 
 with time zone  ('now'::text)::time(6) with time zone)) OR ((start_time IS 
 NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan)))
SubPlan
  -  Unique  (cost=2.02..2.03 rows=1 
 width=4) (actual time=0.617..0.631 rows=3 loops=1)
 ...

I kinda doubt that the index search itself is that slow --- doubtless
the problem comes from having to evaluate that filter condition on a lot
of rows.  How fast are those functions you're calling?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly