Re: [PERFORM] Index problem

2003-09-25 Thread Rigmor Ukuhe
> > What causes this behaviour? is there any workaround? Suggestions?
> >
>
> How many rows are there in the table, and can you post the
> 'explain analyze' for both queries after doing a 'vacuum verbose analyze
> [tablename]'?

There are about 2500 rows in that table.

1st query explain analyze: Seq Scan on PRIORITY_STATISTICS
(cost=0.00..491.44 rows=127 width=12) (actual time=98.58..98.58 rows=0
loops=1)
Total runtime: 98.74 msec

2nd query explain analyze: NOTICE:  QUERY PLAN:

Index Scan using PRIORITY_STATISTICS_reo_id, PRIORITY_STATISTICS_reo_id,
[snip -- the same index name repeated 57 times in all, once per OR arm]
PRIORITY_STATISTICS_reo_id on PRIORITY_STATISTICS  (cost=0.00..394.06
rows=102 width=12) (actual time=20.93..20.93 rows=0 loops=1)
Total runtime: 21.59 msec

Any help?

Rigmor


>
> Cheers
>
> Matt
>
>
>


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Index problem

2003-09-25 Thread Matt Clark
> There are about 2500 rows in that table.
>
> 1st query explain analyze: Seq Scan on PRIORITY_STATISTICS
> (cost=0.00..491.44 rows=127 width=12) (actual time=98.58..98.58 rows=0
> loops=1)
> Total runtime: 98.74 msec
>
> 2nd query explain analyze: NOTICE:  QUERY PLAN:
>
> Index Scan using PRIORITY_STATISTICS_reo_id, PRIORITY_STATISTICS_reo_id,
[snip]
> PRIORITY_STATISTICS_reo_id on PRIORITY_STATISTICS  (cost=0.00..394.06
> rows=102 width=12) (actual time=20.93..20.93 rows=0 loops=1)
> Total runtime: 21.59 msec

With only 2500 rows the planner could be deciding that it's going to have to
read every disk block to do an index scan anyway, so it might as well do a
sequential scan.  If the pages are in fact in the kernel cache then the
compute time will dominate, not the IO time, so it ends up looking like a bad
plan, but it's probably not really such a bad plan...

Is your effective_cache_size set to something sensibly large?

You could also try decreasing cpu_index_tuple_cost and cpu_tuple_cost.  These
will affect all your queries though, so what you gain on one might be lost on
another.
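Matt's suggestions can be tried per-session before touching postgresql.conf; a minimal sketch (the parameter values are illustrative, not recommendations -- size effective_cache_size to your actual kernel cache):

```sql
-- Try planner settings for this connection only, then re-run the slow query
SET effective_cache_size = 50000;   -- in 8 kB pages; ~400 MB here
SET cpu_index_tuple_cost = 0.0005;  -- default is 0.001
SET cpu_tuple_cost = 0.005;         -- default is 0.01
EXPLAIN ANALYZE SELECT ... ;        -- your original query goes here
RESET ALL;                          -- back to the postgresql.conf values
```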

Matt





[PERFORM] performance hit when joining with a view?

2003-09-25 Thread Palle Girgensohn
Hi!

A performance question:

I have some tables:

  Table "public.person"
 Column  |   Type   |  Modifiers
--+--+---
userid   | text | not null
giver| text |
first_name   | text |
last_name| text |
email| text |
default_language | text | default 'sv'
created  | timestamp with time zone | default now()
created_by   | text |
Indexes: person_pkey primary key btree (userid),
  person_index unique btree (userid),
  person_giver_idx btree (giver)
Foreign-key constraints: pp_fk9 FOREIGN KEY (giver) REFERENCES 
providers(giver) ON UPDATE CASCADE ON DELETE CASCADE,
 pp_fk2 FOREIGN KEY (created_by) REFERENCES 
person(userid) ON UPDATE CASCADE ON DELETE SET NULL

Table "public.wiol"
Column  | Type |  Modifiers
-+-+---
userid  | text| not null
course_id   | integer |
login_ts| timestamp without time zone | default now()
latest_event_ts | timestamp without time zone | default now()
Foreign-key constraints: pp_fk2 FOREIGN KEY (course_id) REFERENCES 
course(id) ON UPDATE CASCADE ON DELETE CASCADE,
 pp_fk1 FOREIGN KEY (userid) REFERENCES 
person(userid) ON UPDATE CASCADE ON DELETE CASCADE

and a view:

  View "public.person_wiol_view"
 Column  |   Type   | Modifiers
--+--+-
userid   | text |
giver| text |
first_name   | text |
last_name| text |
email| text |
default_language | text |
created  | timestamp with time zone |
created_by   | text |
course_id| integer  |
View definition: SELECT p.userid, p.giver, p.first_name, p.last_name, p.email, 
p.default_language, p.created, p.created_by, w.course_id FROM (person p 
LEFT JOIN wiol w ON ((p.userid = w.userid)));

Now, with about 30000 tuples in person and about 40 in wiol, executing a 
left outer join with the view gives horrible performance:

explain  analyze select p.pim_id, p.recipient, p.sender, p.message, p.ts, 
p.type, case when sender.userid is not null then sender.first_name || ' ' 
|| sender.last_name else null end as sender_name, sender.course_id is not 
null as is_online from pim p left outer join person_wiol_view sender on 
(sender.userid = p.sender) where p.recipient = 'axto6551' and p.type >= 0 
limit 1;
QUERY PLAN
---
--
Limit  (cost=0.00..1331.26 rows=1 width=180) (actual time=866.14..1135.65 
rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..1331.26 rows=1 width=180) (actual 
time=866.13..1135.63 rows=2 loops=1)
Join Filter: ("inner".userid = "outer".sender)
->  Seq Scan on pim p  (cost=0.00..0.00 rows=1 width=112) (actual 
time=0.05..0.18 rows=2 loops=1)
  Filter: ((recipient = 'axto6551'::text) AND ("type" >= 0))
->  Materialize  (cost=956.15..956.15 rows=30009 width=68) (actual 
time=369.33..437.86 rows=22045 loops=2)
  ->  Hash Join  (cost=0.00..956.15 rows=30009 width=68) 
(actual time=0.45..605.21 rows=30013 loops=1)
Hash Cond: ("outer".userid = "inner".userid)
->  Seq Scan on person p  (cost=0.00..806.09 
rows=30009 width=32) (actual time=0.16..279.28 rows=30009 loops=1)
->  Hash  (cost=0.00..0.00 rows=1 width=36) (actual 
time=0.13..0.13 rows=0 loops=1)
  ->  Seq Scan on wiol w  (cost=0.00..0.00 rows=1 
width=36) (actual time=0.02..0.09 rows=8 loops=1)
Total runtime: 1143.93 msec
(12 rows)

but rewriting the question with an explicit join uses the indices, and runs 
*much* faster:

explain  analyze select p.pim_id, p.recipient, p.sender, p.message, p.ts, 
p.type, case when sender.userid is not null then sender.first_name || ' ' 
|| sender.last_name else null end as sender_name, w.course_id is not null 
as is_online from pim p left outer join person sender on (sender.userid = 
p.sender) left join wiol w on (w.userid=sender.userid) where p.recipient = 
'axto6551' and p.type >= 0  limit 1;
QUERY PLAN
---
-
Limit  (cost=0.00..6.03 rows=

Re: [PERFORM] Index problem

2003-09-25 Thread Tom Lane
"Rigmor Ukuhe" <[EMAIL PROTECTED]> writes:
>>> What causes this behaviour? is there any workaround? Suggestions?

At some point the planner is going to decide that one seqscan is cheaper
than repeated indexscans.  At some point it'll be right ... but in this
case it seems its relative cost estimates are off a bit.  You might try
reducing random_page_cost to bring them more into line with reality.
(But keep in mind that the reality you are measuring appears to be
small-table-already-fully-cached reality.  On a large table you might
find that small random_page_cost isn't such a hot idea after all.)
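Tom's random_page_cost experiment can also be scoped to a single session so it doesn't affect other queries; a sketch (the value 2 is illustrative):

```sql
-- Lower the assumed cost of a random page fetch (default is 4)
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT ... ;        -- the query that was picking a seqscan
SET random_page_cost TO DEFAULT;    -- undo the experiment
```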

regards, tom lane



Re: [PERFORM] performance hit when joining with a view?

2003-09-25 Thread Tom Lane
Palle Girgensohn <[EMAIL PROTECTED]> writes:
> View definition: SELECT p.userid, p.giver, p.first_name, p.last_name, p.email, 
> p.default_language, p.created, p.created_by, w.course_id FROM (person p 
> LEFT JOIN wiol w ON ((p.userid = w.userid)));

>  explain  analyze select p.pim_id, p.recipient, p.sender, p.message, p.ts, 
> p.type, case when sender.userid is not null then sender.first_name || ' ' 
> || sender.last_name else null end as sender_name, sender.course_id is not 
> null as is_online from pim p left outer join person_wiol_view sender on 
> (sender.userid = p.sender) where p.recipient = 'axto6551' and p.type >= 0 
> limit 1;

> explain  analyze select p.pim_id, p.recipient, p.sender, p.message, p.ts, 
> p.type, case when sender.userid is not null then sender.first_name || ' ' 
> || sender.last_name else null end as sender_name, w.course_id is not null 
> as is_online from pim p left outer join person sender on (sender.userid = 
> p.sender) left join wiol w on (w.userid=sender.userid) where p.recipient = 
> 'axto6551' and p.type >= 0  limit 1;

These are not actually the same query.  In the former case the implicit
parenthesization of the joins is
pim left join (person left join wiol)
whereas in the latter case the implicit parenthesization is left-to-right:
(pim left join person) left join wiol
Since the only restriction conditions you have provided are on pim, the
first parenthesization implies forming the entire join of person and
wiol :-(.

If you were using plain joins then the two queries would be logically
equivalent, but outer joins are in general not associative, so the
planner will not consider re-ordering them.
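The two implicit parenthesizations can be written out explicitly; a sketch using the tables from Palle's post, with the column lists elided:

```sql
-- View-based query: pim LEFT JOIN (person LEFT JOIN wiol)
-- forces the full person/wiol join to be formed first
SELECT ...
FROM pim p
LEFT JOIN (person s LEFT JOIN wiol w ON w.userid = s.userid)
       ON s.userid = p.sender;

-- Rewritten query: (pim LEFT JOIN person) LEFT JOIN wiol
-- restricts on pim first, so the later joins touch few rows
SELECT ...
FROM (pim p LEFT JOIN person s ON s.userid = p.sender)
LEFT JOIN wiol w ON w.userid = s.userid;
```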

There is some work in 7.4 to make the planner smarter about outer joins,
but offhand I don't think any of it will improve results for this
particular example.

I have seen some academic papers about how to prove that a particular
pair of outer join operators can safely be swapped (as I think is true
in this example).  Some knowledge of that sort may eventually get into
the planner, but it ain't there now.

regards, tom lane



Re: [PERFORM] upping checkpoints on production server

2003-09-25 Thread Robert Treat
On Wed, 2003-09-24 at 17:57, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
> > In .conf file I have default checkpoints set to 3, but I noticed that in
> > my pg_xlog directory I always seem to have at least 8 log files. Since
> > this is more than the suggested 7, I'm wondering if this means I ought
> > to bump my checkpoint segments up to 4?
> 
> Hm.  What is the typical delta in the mod times of the log files?  It
> sounds like you are in a regime where checkpoints are always triggered
> by checkpoint_segments and never by checkpoint_timeout, in which case
> increasing the former might be a good idea.  Or decrease the latter,
> but that could put a drag on performance.
> 

# ls -lht /var/lib/pgsql/data/pg_xlog/
total 129M
-rw---1 postgres postgres  16M Sep 25 11:12 006E0059
-rw---1 postgres postgres  16M Sep 25 11:12 006E005A
-rw---1 postgres postgres  16M Sep 25 11:08 006E0058
-rw---1 postgres postgres  16M Sep 25 11:05 006E005F
-rw---1 postgres postgres  16M Sep 25 11:02 006E005E
-rw---1 postgres postgres  16M Sep 25 10:59 006E005D
-rw---1 postgres postgres  16M Sep 25 10:55 006E005B
-rw---1 postgres postgres  16M Sep 25 10:51 006E005C

#ls -lht /var/lib/pgsql/data/pg_xlog/
total 129M
-rw---1 postgres postgres  16M Sep 25 10:52 006E0054
-rw---1 postgres postgres  16M Sep 25 10:51 006E0053
-rw---1 postgres postgres  16M Sep 25 10:49 006E0052
-rw---1 postgres postgres  16M Sep 25 10:45 006E0059
-rw---1 postgres postgres  16M Sep 25 10:40 006E0057
-rw---1 postgres postgres  16M Sep 25 10:37 006E0058
-rw---1 postgres postgres  16M Sep 25 10:33 006E0056
-rw---1 postgres postgres  16M Sep 25 10:29 006E0055






from the 7.4 docs:

 "Checkpoints are fairly expensive because they force all dirty kernel
buffers to disk using the operating system sync() call. Busy servers may
fill checkpoint segment files too quickly, causing excessive
checkpointing." 

it goes on to mention checkpoint_warning, which I don't have in 7.3, but
I think this is a case where I'd likely see those warnings. The server
in question has a fairly high write/read ratio and is fairly busy (over
100 tps iirc). 

since more often than not I don't make it to 5 minutes, seems like
upping checkpoint segments is the way to go, right?
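The change described above amounts to a small postgresql.conf edit; a sketch with illustrative values (each WAL segment is 16 MB, and 7.x keeps at most 2 * checkpoint_segments + 1 files in pg_xlog):

```
# checkpoint after more WAL, so a busy write load checkpoints less often
checkpoint_segments = 10    # was 3; allows up to 21 segments (~336 MB) on disk
checkpoint_timeout = 300    # seconds; leave the time-based trigger at default
```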

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




[PERFORM] TPC-R benchmarks

2003-09-25 Thread Oleg Lebedev
I am running TPC-R benchmarks with a scale factor of 1, which corresponds
to approximately 1 GB database size, on PostgreSQL 7.3.4 installed on
CygWin on Windows XP. I dedicated 128 MB of shared memory to my postgres
installation.
Most of the queries were able to complete in a matter of minutes, but
query 17 was taking hours and hours. The query is shown below. Is there
any way to optimize it?

select sum(l_extendedprice) / 7.0 as avg_yearly
from lineitem, part
where p_partkey = l_partkey
  and p_brand = 'Brand#11'
  and p_container = 'SM PKG'
  and l_quantity < (
select 0.2 * avg(l_quantity)
from lineitem
where l_partkey = p_partkey
  );

Thanks.

Oleg

*

This e-mail may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network. 

*





Re: [PERFORM] TPC-R benchmarks

2003-09-25 Thread Jenny Zhang
I am running TPC-H with scale factor of 1 on RedHat7.2 with the kernel
2.5.74.  Q17 can always finish in about 7 seconds on my system.  The
execution plan is:

 Aggregate  (cost=780402.43..780402.43 rows=1 width=48)
   ->  Nested Loop  (cost=0.00..780397.50 rows=1973 width=48)
 Join Filter: ("inner".l_quantity < (subplan))
 ->  Seq Scan on part  (cost=0.00..8548.00 rows=197 width=12)
   Filter: ((p_brand = 'Brand#31'::bpchar) AND (p_container
= 'LG CASE'::bpchar))
 ->  Index Scan using i_l_partkey on lineitem 
(cost=0.00..124.32 rows=30 width=36)
   Index Cond: ("outer".p_partkey = lineitem.l_partkey)
 SubPlan
   ->  Aggregate  (cost=124.40..124.40 rows=1 width=11)
 ->  Index Scan using i_l_partkey on lineitem 
(cost=0.00..124.32 rows=30 width=11)
   Index Cond: (l_partkey = $0)
(11 rows)

Hope this helps,
Jenny
On Thu, 2003-09-25 at 12:40, Oleg Lebedev wrote:
> I am running TPC-R benchmarks with a scale factor of 1, which correspond
> to approximately 1 GB database size on PostgreSQL 7.3.4 installed on
> CygWin on Windows XP. I dedicated 128 MB of shared memory to my postrges
> installation.
> Most of the queries were able to complete in a matter of minutes, but
> query 17 was taking hours and hours. The query is show below. Is there
> any way to optimize it ?
>  
> select
>  sum(l_extendedprice) / 7.0 as avg_yearly
> from
>  lineitem,
>  part
> where
>  p_partkey = l_partkey
>  and p_brand = 'Brand#11'
>  and p_container = 'SM PKG'
>  and l_quantity < (
>   select
>0.2 * avg(l_quantity)
>   from
>lineitem
>   where
>l_partkey = p_partkey
>  );
>  
> Thanks.
>  
> Oleg




Re: [PERFORM] TPC-R benchmarks

2003-09-25 Thread Oleg Lebedev
Seems like in your case postgres uses the i_l_partkey index on the lineitem
table. I have a foreign key constraint defined between the lineitem and
part tables, but didn't create any special indexes. Here is my query plan:

   ->  Aggregate  (cost=1517604222.32..1517604222.32 rows=1 width=31)
 ->  Hash Join  (cost=8518.49..1517604217.39 rows=1969 width=31)
   Hash Cond: ("outer".l_partkey = "inner".p_partkey)
   Join Filter: ("outer".l_quantity < (subplan))
   ->  Seq Scan on lineitem  (cost=0.00..241889.15
rows=6001215 widt
h=27)
   ->  Hash  (cost=8518.00..8518.00 rows=197 width=4)
 ->  Seq Scan on part  (cost=0.00..8518.00 rows=197
width=4)

   Filter: ((p_brand = 'Brand#11'::bpchar) AND
(p_contai
ner = 'SM PKG'::bpchar))
   SubPlan
 ->  Aggregate  (cost=256892.28..256892.28 rows=1
width=11)
   ->  Seq Scan on lineitem  (cost=0.00..256892.19
rows=37 w
idth=11)
 Filter: (l_partkey = $0)

-Original Message-
From: Jenny Zhang [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 25, 2003 3:33 PM
To: Oleg Lebedev
Cc: [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [PERFORM] TPC-R benchmarks


I am running TPC-H with scale factor of 1 on RedHat7.2 with the kernel
2.5.74.  Q17 can always finish in about 7 seconds on my system.  The
execution plan is:


 Aggregate  (cost=780402.43..780402.43 rows=1 width=48)
   ->  Nested Loop  (cost=0.00..780397.50 rows=1973 width=48)
 Join Filter: ("inner".l_quantity < (subplan))
 ->  Seq Scan on part  (cost=0.00..8548.00 rows=197 width=12)
   Filter: ((p_brand = 'Brand#31'::bpchar) AND (p_container
= 'LG CASE'::bpchar))
 ->  Index Scan using i_l_partkey on lineitem 
(cost=0.00..124.32 rows=30 width=36)
   Index Cond: ("outer".p_partkey = lineitem.l_partkey)
 SubPlan
   ->  Aggregate  (cost=124.40..124.40 rows=1 width=11)
 ->  Index Scan using i_l_partkey on lineitem 
(cost=0.00..124.32 rows=30 width=11)
   Index Cond: (l_partkey = $0)
(11 rows)

Hope this helps,
Jenny
On Thu, 2003-09-25 at 12:40, Oleg Lebedev wrote:
> I am running TPC-R benchmarks with a scale factor of 1, which 
> correspond to approximately 1 GB database size on PostgreSQL 7.3.4 
> installed on CygWin on Windows XP. I dedicated 128 MB of shared memory

> to my postrges installation. Most of the queries were able to complete

> in a matter of minutes, but query 17 was taking hours and hours. The 
> query is show below. Is there any way to optimize it ?
>  
> select
>  sum(l_extendedprice) / 7.0 as avg_yearly
> from
>  lineitem,
>  part
> where
>  p_partkey = l_partkey
>  and p_brand = 'Brand#11'
>  and p_container = 'SM PKG'
>  and l_quantity < (
>   select
>0.2 * avg(l_quantity)
>   from
>lineitem
>   where
>l_partkey = p_partkey
>  );
>  
> Thanks.
>  
> Oleg




Re: [PERFORM] TPC-R benchmarks

2003-09-25 Thread Jenny Zhang
The index is created by:
create index i_l_partkey on lineitem (l_partkey);
I do not have any foreign key defined.  Does the spec require foreign
keys?

When you create a foreign key reference, does PG create an index
automatically?

Can you try with the index?

Jenny
On Thu, 2003-09-25 at 14:39, Oleg Lebedev wrote:
> Seems like in your case postgres uses an i_l_partkey index on lineitem
> table. I have a foreign key constraint defined between the lineitem and
> part table, but didn't create any special indexes. Here is my query plan:
> 
>->  Aggregate  (cost=1517604222.32..1517604222.32 rows=1 width=31)
>  ->  Hash Join  (cost=8518.49..1517604217.39 rows=1969 width=31)
>Hash Cond: ("outer".l_partkey = "inner".p_partkey)
>Join Filter: ("outer".l_quantity < (subplan))
>->  Seq Scan on lineitem  (cost=0.00..241889.15
> rows=6001215 widt
> h=27)
>->  Hash  (cost=8518.00..8518.00 rows=197 width=4)
>  ->  Seq Scan on part  (cost=0.00..8518.00 rows=197
> width=4)
> 
>Filter: ((p_brand = 'Brand#11'::bpchar) AND
> (p_contai
> ner = 'SM PKG'::bpchar))
>SubPlan
>  ->  Aggregate  (cost=256892.28..256892.28 rows=1
> width=11)
>->  Seq Scan on lineitem  (cost=0.00..256892.19
> rows=37 w
> idth=11)
>  Filter: (l_partkey = $0)
> 
> -Original Message-
> From: Jenny Zhang [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, September 25, 2003 3:33 PM
> To: Oleg Lebedev
> Cc: [EMAIL PROTECTED];
> [EMAIL PROTECTED]
> Subject: Re: [PERFORM] TPC-R benchmarks
> 
> 
> I am running TPC-H with scale factor of 1 on RedHat7.2 with the kernel
> 2.5.74.  Q17 can always finish in about 7 seconds on my system.  The
> execution plan is:
> 
> 
>  Aggregate  (cost=780402.43..780402.43 rows=1 width=48)
>->  Nested Loop  (cost=0.00..780397.50 rows=1973 width=48)
>  Join Filter: ("inner".l_quantity < (subplan))
>  ->  Seq Scan on part  (cost=0.00..8548.00 rows=197 width=12)
>Filter: ((p_brand = 'Brand#31'::bpchar) AND (p_container
> = 'LG CASE'::bpchar))
>  ->  Index Scan using i_l_partkey on lineitem 
> (cost=0.00..124.32 rows=30 width=36)
>Index Cond: ("outer".p_partkey = lineitem.l_partkey)
>  SubPlan
>->  Aggregate  (cost=124.40..124.40 rows=1 width=11)
>  ->  Index Scan using i_l_partkey on lineitem 
> (cost=0.00..124.32 rows=30 width=11)
>Index Cond: (l_partkey = $0)
> (11 rows)
> 
> Hope this helps,
> Jenny
> On Thu, 2003-09-25 at 12:40, Oleg Lebedev wrote:
> > I am running TPC-R benchmarks with a scale factor of 1, which 
> > correspond to approximately 1 GB database size on PostgreSQL 7.3.4 
> > installed on CygWin on Windows XP. I dedicated 128 MB of shared memory
> 
> > to my postrges installation. Most of the queries were able to complete
> 
> > in a matter of minutes, but query 17 was taking hours and hours. The 
> > query is show below. Is there any way to optimize it ?
> >  
> > select
> >  sum(l_extendedprice) / 7.0 as avg_yearly
> > from
> >  lineitem,
> >  part
> > where
> >  p_partkey = l_partkey
> >  and p_brand = 'Brand#11'
> >  and p_container = 'SM PKG'
> >  and l_quantity < (
> >   select
> >0.2 * avg(l_quantity)
> >   from
> >lineitem
> >   where
> >l_partkey = p_partkey
> >  );
> >  
> > Thanks.
> >  
> > Oleg




[PERFORM] Indices arent being used

2003-09-25 Thread rantunes
Hi guys,

I'm running a data warehouse benchmark (APB-1) on PostgreSQL. The objective is
to choose which of the two main DBs (PostgreSQL, MySQL) is faster. I've run
into a small problem which I hope can be resolved here.

I'm trying to speed up this query:

select count(*) from actvars, prodlevel where
actvars.product_level=prodlevel.code_level and
prodlevel.division_level='OY3S5LAPALL6';

ACTVARS is a fact table of approximately 16 million rows; PRODLEVEL has 20700
rows. Both have btree indices.

I executed the query and it took almost half an hour to run. The same query on
MySQL returned after 6 seconds. As you can see, there is a large difference
between the execution times.

After running an explain:

Aggregate  (cost=3123459.62..3123459.62 rows=1 width=32)
   ->  Merge Join  (cost=3021564.79..3119827.17 rows=1452981 width=32)
 Merge Cond: ("outer".product_level = "inner".code_level)
 ->  Sort  (cost=3020875.00..3060938.81 rows=16025523 width=16)
   Sort Key: actvars.product_level
   ->  Seq Scan on actvars  (cost=0.00..365711.23 rows=16025523
width=16)
 ->  Sort  (cost=689.79..694.48 rows=1877 width=16)
   Sort Key: prodlevel.code_level
   ->  Seq Scan on prodlevel  (cost=0.00..587.75 rows=1877 width=16)
 Filter: (division_level = 'OY3S5LAPALL6'::bpchar)

I found that the indices weren't being used.

The database has been vacuumed and ANALYZE has been run.

I tried disabling seqscan so as to force index usage. The planner then uses
index scans, but the query still takes a very long time to execute.

Any suggestions on resolving this would be appreciated.

P.S.: I'm running PostgreSQL 7.3.2
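A hedged sketch of the first diagnostic steps for this query -- the index name is hypothetical, and note that a mismatch between the types of product_level and code_level would also defeat the indexes (TIP 9 below):

```sql
-- Ensure the join column on the 16M-row fact table has a usable index,
-- refresh statistics, then get real timings from the executor
CREATE INDEX i_actvars_product_level ON actvars (product_level);
ANALYZE actvars;

EXPLAIN ANALYZE
SELECT count(*)
FROM actvars, prodlevel
WHERE actvars.product_level = prodlevel.code_level
  AND prodlevel.division_level = 'OY3S5LAPALL6';
```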






Re: [PERFORM] Indices arent being used

2003-09-25 Thread Rod Taylor
> I'm running a data warehouse benchmark (APB-1) on PostgreSQL. The objective is
> to choose which of the two main DBs (PostgreSQL, MySQL) is faster. I've run
> into a small problem which I hope can be resolved here.
> 
> I'm trying to speed up this query:
> 
> select count(*) from actvars, prodlevel where
> actvars.product_level=prodlevel.code_level and
> prodlevel.division_level='OY3S5LAPALL6';

How about EXPLAIN ANALYZE output?

> ACTVARS is a fact table of approximately 16 million rows; PRODLEVEL has 20700
> rows. Both have btree indices.

> The database has been vacuumed and analyze has been executed.

The usual postgresql.conf adjustments have also been made?
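The "usual adjustments" Rod alludes to might look like this for a 7.3-era machine -- a sketch with illustrative values assuming roughly 1 GB of RAM, not recommendations:

```
shared_buffers = 8192           # 64 MB, up from the tiny shipped default
sort_mem = 16384                # kB per sort; large joins/sorts need more
effective_cache_size = 65536    # in 8 kB pages; ~512 MB of expected OS cache
```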





Re: [PERFORM] TPC-R benchmarks

2003-09-25 Thread Josh Berkus
Jenny,

> create index i_l_partkey on lineitem (l_partkey);
> I do not have any foreign key defined.  Does the spec require foreign
> keys?
>
> When you create a foreign key reference, does PG create an index
> automatically?

No.  An index is not required to enforce a foreign key, and is sometimes not 
useful (for example, FK fields with only 3 possible values).

So it may be that you need to create an index on that field.
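Concretely, that would be the same index Jenny described, followed by a statistics refresh so the planner notices it (a sketch):

```sql
-- The FK constraint alone gives you no index on the referencing column
CREATE INDEX i_l_partkey ON lineitem (l_partkey);
ANALYZE lineitem;
```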


-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] TPC-R benchmarks

2003-09-25 Thread Tom Lane
Oleg Lebedev <[EMAIL PROTECTED]> writes:
> Seems like in your case postgres uses an i_l_partkey index on lineitem
> table. I have a foreign key constraint defined between the lineitem and
> part table, but didn't create any special indexes. Here is my query plan:

The planner is obviously unhappy with this plan (note the large cost
numbers), but it can't find a way to do better.  An index on
lineitem.l_partkey would help, I think.

The whole query seems like it's written in a very inefficient fashion;
couldn't the estimation of '0.2 * avg(l_quantity)' be amortized across
multiple join rows?  But I dunno whether the TPC rules allow for
significant manual rewriting of the given query.
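One way to amortize the threshold computation that Tom describes, if the TPC rules permitted such a rewrite, would be a per-partkey derived table; a hedged sketch (qty_threshold is an invented name):

```sql
-- Compute 0.2 * avg(l_quantity) once per part instead of once per join row
SELECT sum(l.l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem l
JOIN part p ON p.p_partkey = l.l_partkey
JOIN (SELECT l_partkey, 0.2 * avg(l_quantity) AS qty_threshold
      FROM lineitem
      GROUP BY l_partkey) t ON t.l_partkey = l.l_partkey
WHERE p.p_brand = 'Brand#11'
  AND p.p_container = 'SM PKG'
  AND l.l_quantity < t.qty_threshold;
```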

regards, tom lane
