[PERFORM] Planner doesn't look at LIMIT?
Hello, I have PostgreSQL 8.0.3 running on a "workstation" with 768 MB of RAM, under FreeBSD. And I have a 47-million-row table:

qnex=# explain select * from log;
                             QUERY PLAN
------------------------------------------------------------------
 Seq Scan on log  (cost=0.00..1741852.36 rows=47044336 width=180)
(1 row)

...which is joined with a few smaller ones, like:

qnex=# explain select * from useragents;
                             QUERY PLAN
------------------------------------------------------------------
 Seq Scan on useragents  (cost=0.00..9475.96 rows=364896 width=96)
(1 row)

Relevant settings:

shared_buffers = 5000
random_page_cost = 3
work_mem = 102400
effective_cache_size = 6

Now, if I do a SELECT:

qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=15912.20..15912.31 rows=1 width=272)
   ->  Hash Join  (cost=15912.20..5328368.96 rows=47044336 width=272)
         Hash Cond: ("outer".useragent_id = "inner".useragent_id)
         ->  Seq Scan on log  (cost=0.00..1741852.36 rows=47044336 width=180)
         ->  Hash  (cost=9475.96..9475.96 rows=364896 width=96)
               ->  Seq Scan on useragents  (cost=0.00..9475.96 rows=364896 width=96)
(6 rows)

Or:

qnex=# EXPLAIN SELECT * FROM log NATURAL LEFT JOIN useragents LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=15912.20..15912.31 rows=1 width=272)
   ->  Hash Left Join  (cost=15912.20..5328368.96 rows=47044336 width=272)
         Hash Cond: ("outer".useragent_id = "inner".useragent_id)
         ->  Seq Scan on log  (cost=0.00..1741852.36 rows=47044336 width=180)
         ->  Hash  (cost=9475.96..9475.96 rows=364896 width=96)
               ->  Seq Scan on useragents  (cost=0.00..9475.96 rows=364896 width=96)
(6 rows)

Time: 2.688 ms

...the query seems to last forever (it's hashing 47 million rows!)
If I set enable_hashjoin=false:

qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..3.07 rows=1 width=272) (actual time=74.214..74.216 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..144295895.01 rows=47044336 width=272) (actual time=74.204..74.204 rows=1 loops=1)
         ->  Seq Scan on log  (cost=0.00..1741852.36 rows=47044336 width=180) (actual time=23.270..23.270 rows=1 loops=1)
         ->  Index Scan using useragents_pkey on useragents  (cost=0.00..3.02 rows=1 width=96) (actual time=50.867..50.867 rows=1 loops=1)
               Index Cond: ("outer".useragent_id = useragents.useragent_id)
 Total runtime: 74.483 ms

...which is way faster. Of course, if I do:

qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents
       WHERE logid = (SELECT logid FROM log LIMIT 1);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.04..6.09 rows=1 width=272) (actual time=61.403..61.419 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..0.04 rows=1 width=4) (actual time=0.029..0.032 rows=1 loops=1)
           ->  Seq Scan on log  (cost=0.00..1741852.36 rows=47044336 width=4) (actual time=0.023..0.023 rows=1 loops=1)
   ->  Index Scan using log_pkey on log  (cost=0.00..3.02 rows=1 width=180) (actual time=61.316..61.319 rows=1 loops=1)
         Index Cond: (logid = $0)
   ->  Index Scan using useragents_pkey on useragents  (cost=0.00..3.02 rows=1 width=96) (actual time=0.036..0.042 rows=1 loops=1)
         Index Cond: ("outer".useragent_id = useragents.useragent_id)
 Total runtime: 61.741 ms
(9 rows)

...it is fast as well. I tried tweaking cpu_*, work_mem, effective_cache and so on, but without any luck. The 47-million-row table is huge compared to useragents (I actually need to join log with three useragents-like tables, and create a view out of it). I also tried using LEFT/RIGHT JOINs instead of (inner) JOINs... Of course the database is freshly vacuum analyzed, and statistics are set at 50. My view of the problem is that the planner ignores the "LIMIT" part.
It assumes it _needs_ to return all 47 million rows joined with the useragents table, so the hash join looks like the only sane approach. But chances are that even with something like LIMIT 20, the nested loop would be much faster. Any ideas how to make it work (other than rewriting the query to use subselects, using explicit id rows, or disabling hashjoin completely)? Or is this a bug?

Regards,
   Dawid
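The arithmetic behind this expectation can be sketched in a few lines. The scaling rule below is a simplification of the planner's actual LIMIT handling (charge the startup cost plus a pro-rated share of the run cost), using the cost numbers from the EXPLAIN output above:

```python
# Sketch: how LIMIT 1 should discount the two plans' estimated costs.
# Numbers come from the EXPLAIN output above; the pro-rating rule is a
# simplified model of what the planner does, not its exact code.

def limit_cost(startup, total, rows, n):
    """Cost of fetching n of `rows` rows: startup plus a pro-rated run cost."""
    return startup + (total - startup) * min(n / rows, 1.0)

rows = 47044336

# Hash join: high startup (it must hash useragents first), huge total.
hash_join = limit_cost(15912.20, 5328368.96, rows, 1)

# Nested loop: zero startup, even bigger total -- but LIMIT 1 barely samples it.
nest_loop = limit_cost(0.00, 144295895.01, rows, 1)

print(round(hash_join, 2))  # 15912.31, matching the Limit node in the hash plan
print(round(nest_loop, 2))  # 3.07, matching the enable_hashjoin=off plan
```

So the planner's own numbers already say the nested loop should win by a factor of several thousand under LIMIT 1; the question in this thread is why it is not being chosen.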
Re: [PERFORM] Planner doesn't look at LIMIT?
Which row do you want? Do you want a row at random? I presume you want the N latest rows? In that case you should use an ORDER BY on an indexed field; the serial primary key will do nicely (ORDER BY id DESC). It's indexed, so the query will use the index and it will fly.

> Any ideas how to make it work (other than rewriting the query to use
> subselects, using explicit id rows, or disabling hashjoin completely)?
> Or is this a bug?
>
> Regards,
>    Dawid

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Planner doesn't look at LIMIT?
Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1;

> Limit  (cost=15912.20..15912.31 rows=1 width=272)
>   ->  Hash Join  (cost=15912.20..5328368.96 rows=47044336 width=272)

> If I set enable_hashjoin=false:

> qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents LIMIT 1;

> Limit  (cost=0.00..3.07 rows=1 width=272) (actual time=74.214..74.216 rows=1 loops=1)
>   ->  Nested Loop Left Join  (cost=0.00..144295895.01 rows=47044336 width=272) (actual time=74.204..74.204 rows=1 loops=1)

This is quite strange. The nestloop plan definitely should be preferred in the context of the LIMIT, considering that it has far lower estimated cost. And it is preferred in simple tests for me. It seems there must be something specific to your installation that's causing the planner to go wrong. Can you develop a self-contained test case that behaves this way for you?

I recall we saw a similar complaint a month or two back, but the complainant never followed up with anything useful for tracking down the problem.

			regards, tom lane
[PERFORM] Another index question
Hi all, I am trying to do an update on a table, but so far I can't seem to come up with a usable index.

After my last question/thread, the user 'PFC' recommended I store whether a file is to be backed up as either 't'(rue), 'f'(alse) or 'i'(nherit), to speed up changing files and sub-directories under a given directory when it is toggled. I've more or less finished implementing this and it is certainly a LOT faster, but I am hoping to make it just a little faster still with an index.

Tom Lane pointed out to me that I needed 'text_pattern_ops' on my 'file_parent_dir' column in the index if I wanted to do pattern matching (the C locale wasn't set). Now I have added an additional condition, and I think this might be my problem. Here is a sample query I am trying to create my index for:

UPDATE file_info_2 SET file_backup='i' WHERE file_backup!='i' AND file_parent_dir='/';

This would be an example of someone changing the backup state of the root of a partition. It could also be:

UPDATE file_info_2 SET file_backup='i' WHERE file_backup!='i' AND file_parent_dir='/usr';

if, for example, the user was toggling the backup state of the '/usr' directory.

I suspected that because I was using "file_backup!='i'" I might be running into the same problem as before, so I tried creating the index:

tle-bu=> CREATE INDEX file_info_2_mupdate_idx ON file_info_2 (file_backup bpchar_pattern_ops, file_parent_dir text_pattern_ops);

tle-bu=> EXPLAIN ANALYZE UPDATE file_info_2 SET file_backup='i' WHERE file_backup!='i' AND file_parent_dir~'^/';
                               QUERY PLAN
------------------------------------------------------------------------
 Seq Scan on file_info_2  (cost=0.00..13379.38 rows=1 width=134) (actual time=1623.819..1624.087 rows=4 loops=1)
   Filter: ((file_backup <> 'i'::bpchar) AND (file_parent_dir ~ '^/'::text))
 Total runtime: 1628.053 ms
(3 rows)

This index wasn't used, though, even when I set 'enable_seqscan' to 'OFF'. The column 'file_backup' is 'char(1)' and the column 'file_parent_dir' is 'text'.
tle-bu=> \d file_info_2; \di file_info_2_mupdate_idx;

            Table "public.file_info_2"
     Column      |     Type     |          Modifiers
-----------------+--------------+------------------------------
 file_group_name | text         |
 file_group_uid  | integer      | not null
 file_mod_time   | bigint       | not null
 file_name       | text         | not null
 file_parent_dir | text         | not null
 file_perm       | integer      | not null
 file_size       | bigint       | not null
 file_type       | character(1) | not null
 file_user_name  | text         |
 file_user_uid   | integer      | not null
 file_backup     | character(1) | not null default 'i'::bpchar
 file_display    | character(1) | not null default 'i'::bpchar
 file_restore    | character(1) | not null default 'i'::bpchar
Indexes:
    "file_info_2_mupdate_idx" btree (file_backup bpchar_pattern_ops, file_parent_dir text_pattern_ops)
    "file_info_2_supdate_idx" btree (file_parent_dir, file_name, file_type)

              List of relations
 Schema |          Name           | Type  |  Owner  |    Table
--------+-------------------------+-------+---------+-------------
 public | file_info_2_mupdate_idx | index | madison | file_info_2
(1 row)

Could it be that there needs to be a certain number of rows matching "file_backup!='i'" before the planner will use the index? I have also tried not defining an op class on both columns (and on one at a time), but I can't seem to figure this out. As always, thank you!

Madison
Re: [PERFORM] Planner doesn't look at LIMIT?
Dawid Kuroczko wrote:
> work_mem = 102400
> ...I tried tweaking cpu_*, work_mem, effective_cache and so on, but
> without any luck.

I'm hoping you didn't tweak it enough! I posted something similar to this a while ago, but haven't since got around to figuring out a useful test case to send to the list. Try reducing your work_mem down to 1000 or so and things should start doing what you expect.

Sam
Solved (was: Re: [PERFORM] Another index question)
Line noise, sorry... After posting I went back to reading the pgsql docs and saw the query:

SELECT am.amname AS index_method,
       opc.opcname AS opclass_name,
       opr.oprname AS opclass_operator
  FROM pg_am am, pg_opclass opc, pg_amop amop, pg_operator opr
 WHERE opc.opcamid = am.oid
   AND amop.amopclaid = opc.oid
   AND amop.amopopr = opr.oid
 ORDER BY index_method, opclass_name, opclass_operator;

which lists all the operator classes. I noticed none of the opclass operators supported '!=', so I wondered if that was simply an unindexable (is that a word?) operator. So I tried creating the index:

tle-bu=> CREATE INDEX file_info_2_mupdate_idx ON file_info_2 (file_backup, file_parent_dir text_pattern_ops);

And changing my query to:

tle-bu=> EXPLAIN ANALYZE UPDATE file_info_2 SET file_backup='i' WHERE file_backup='t' OR file_backup='f' AND file_parent_dir~'^/';
                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using file_info_2_mupdate_idx, file_info_2_mupdate_idx on file_info_2  (cost=0.00..10.04 rows=1 width=134) (actual time=0.112..0.718 rows=4 loops=1)
   Index Cond: ((file_backup = 't'::bpchar) OR ((file_backup = 'f'::bpchar) AND (file_parent_dir ~>=~ '/'::text) AND (file_parent_dir ~<~ '0'::text)))
   Filter: ((file_backup = 't'::bpchar) OR ((file_backup = 'f'::bpchar) AND (file_parent_dir ~ '^/'::text)))
 Total runtime: 60.359 ms
(4 rows)

Bingo! Hopefully someone might find this useful in the archives. :p

Madison
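One subtlety is visible in the Index Cond above: SQL's AND binds tighter than OR, so without parentheses the rewritten WHERE clause is evaluated as file_backup='t' OR (file_backup='f' AND file_parent_dir ~ '^/'). Since every file_parent_dir presumably starts with '/', the two readings coincide in practice, but a quick analogue in Python (which has the same AND/OR precedence; the sample values are hypothetical, not rows from the real table) shows where they would differ:

```python
import re

def unparenthesized(backup, parent_dir):
    # Mirrors: file_backup='t' OR file_backup='f' AND file_parent_dir ~ '^/'
    # -- AND binds tighter than OR, in SQL as in Python.
    return backup == 't' or backup == 'f' and re.search('^/', parent_dir) is not None

def intended(backup, parent_dir):
    # Mirrors: (file_backup='t' OR file_backup='f') AND file_parent_dir ~ '^/'
    return (backup == 't' or backup == 'f') and re.search('^/', parent_dir) is not None

# For rows whose parent dir starts with '/' (i.e. every real path) they agree:
assert unparenthesized('t', '/usr') == intended('t', '/usr') == True
assert unparenthesized('i', '/usr') == intended('i', '/usr') == False

# They diverge only on a path not starting with '/', which shouldn't occur here:
assert unparenthesized('t', 'relative') != intended('t', 'relative')
```

Harmless in this schema, but worth parenthesizing explicitly if such paths could ever appear.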
Re: [PERFORM] Planner doesn't look at LIMIT?
On 7/22/05, Tom Lane <[EMAIL PROTECTED]> wrote:
> Dawid Kuroczko <[EMAIL PROTECTED]> writes:
> > qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1;
> > Limit  (cost=15912.20..15912.31 rows=1 width=272)
> >   ->  Hash Join  (cost=15912.20..5328368.96 rows=47044336 width=272)
>
> This is quite strange. The nestloop plan definitely should be preferred
> in the context of the LIMIT, considering that it has far lower estimated
> cost. And it is preferred in simple tests for me. It seems there must
> be something specific to your installation that's causing the planner to
> go wrong. Can you develop a self-contained test case that behaves this
> way for you?

Why, certainly. I also tested it on Gentoo Linux with PostgreSQL 8.0.1 (yeah, a bit older one), and the behaviour is the same. The test looks like this:

-- First, let's make a "small" lookup table -- 40 rows.
CREATE TABLE lookup (
    lookup_id serial PRIMARY KEY,
    value integer NOT NULL
);
INSERT INTO lookup (value) SELECT * FROM generate_series(1, 40);
VACUUM ANALYZE lookup;

-- Then let's make a huge data table...
CREATE TABLE huge_data (
    huge_data_id serial PRIMARY KEY,
    lookup_id integer NOT NULL
);
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM lookup;
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; --    800 000
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; --  1 600 000
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; --  3 200 000
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; --  6 400 000
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; -- 12 800 000

-- You may want to put ANALYZE and EXPLAIN between each of these
-- steps. In my case, at 12.8 mln rows PostgreSQL seems to go for hashjoin
-- in each case. YMMV, so you may try to push it up to 1024 mln rows.
INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; -- 25 600 000
ANALYZE huge_data;
EXPLAIN SELECT * FROM huge_data NATURAL JOIN lookup LIMIT 1;

My EXPLAIN from Linux (SMP P-III box), with PostgreSQL 8.0.1, while building this test case:

qnex=# EXPLAIN SELECT * FROM huge_data NATURAL JOIN lookup LIMIT 1;
                               QUERY PLAN
------------------------------------------------------------------------
 Limit  (cost=0.00..3.21 rows=1 width=12)
   ->  Nested Loop  (cost=0.00..19557596.04 rows=6094777 width=12)
         ->  Seq Scan on huge_data  (cost=0.00..95372.42 rows=6399942 width=8)
         ->  Index Scan using lookup_pkey on lookup  (cost=0.00..3.02 rows=1 width=8)
               Index Cond: ("outer".lookup_id = lookup.lookup_id)
(5 rows)

Time: 4,333 ms

qnex=# INSERT INTO huge_data (lookup_id) SELECT lookup_id FROM huge_data; -- 12 800 000
INSERT 0 640
Time: 501014,692 ms
qnex=# ANALYZE huge_data;
ANALYZE
Time: 4243,453 ms
qnex=# EXPLAIN SELECT * FROM huge_data NATURAL JOIN lookup LIMIT 1;
                               QUERY PLAN
------------------------------------------------------------------------
 Limit  (cost=11719.00..11719.09 rows=1 width=12)
   ->  Hash Join  (cost=11719.00..1212739.73 rows=12800185 width=12)
         Hash Cond: ("outer".lookup_id = "inner".lookup_id)
         ->  Seq Scan on huge_data  (cost=0.00..190747.84 rows=12800184 width=8)
         ->  Hash  (cost=5961.00..5961.00 rows=40 width=8)
               ->  Seq Scan on lookup  (cost=0.00..5961.00 rows=40 width=8)
(6 rows)

Regards,
   Dawid
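The growth of the test table follows a simple doubling schedule: one bulk INSERT ... SELECT FROM lookup, then each INSERT ... SELECT FROM huge_data doubles the row count. A small sketch of that schedule (the 400,000 initial size is an assumption inferred from the row counts in the comments above, not stated in the message):

```python
# Sketch: the doubling schedule used by the test case above.
# initial_rows is whatever the first INSERT ... SELECT FROM lookup loads;
# each subsequent INSERT ... SELECT FROM huge_data doubles the table.

def doubling_schedule(initial_rows, doublings):
    sizes = [initial_rows]
    for _ in range(doublings):
        sizes.append(sizes[-1] * 2)  # self-insert doubles the row count
    return sizes

# With a hypothetical 400,000-row initial load, six doublings reach 25.6M rows,
# matching the per-step comments in the test script.
print(doubling_schedule(400_000, 6))
# -> [400000, 800000, 1600000, 3200000, 6400000, 12800000, 25600000]
```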
Re: [PERFORM] Planner doesn't look at LIMIT?
I wrote:
> Dawid Kuroczko <[EMAIL PROTECTED]> writes:
>> qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1;
>> Limit  (cost=15912.20..15912.31 rows=1 width=272)
>>   ->  Hash Join  (cost=15912.20..5328368.96 rows=47044336 width=272)

>> If I set enable_hashjoin=false:
>> qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents LIMIT 1;
>> Limit  (cost=0.00..3.07 rows=1 width=272) (actual time=74.214..74.216 rows=1 loops=1)
>>   ->  Nested Loop Left Join  (cost=0.00..144295895.01 rows=47044336 width=272) (actual time=74.204..74.204 rows=1 loops=1)

> This is quite strange. The nestloop plan definitely should be preferred
> in the context of the LIMIT, considering that it has far lower estimated
> cost. And it is preferred in simple tests for me.

After a suitable period of contemplating my navel, I figured out what is going on here: the total costs involved are large enough that the still-fairly-high startup cost of the hash is disregarded by compare_fuzzy_path_costs(), and so the nestloop is discarded as not having any significant potential advantage in startup time.

I think that this refutes the original scheme of using the same fuzz factor for both startup and total cost comparisons, and therefore propose the attached patch.

Comments?

			regards, tom lane

*** src/backend/optimizer/util/pathnode.c.orig	Fri Jul 15 13:09:25 2005
--- src/backend/optimizer/util/pathnode.c	Fri Jul 22 12:08:25 2005
***************
*** 98,157 ****
  static int
  compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion)
  {
-     Cost        fuzz;
-
      /*
!      * The fuzz factor is set at one percent of the smaller total_cost,
!      * but not less than 0.01 cost units (just in case total cost is
!      * zero).
       *
       * XXX does this percentage need to be user-configurable?
       */
-     fuzz = Min(path1->total_cost, path2->total_cost) * 0.01;
-     fuzz = Max(fuzz, 0.01);
-
      if (criterion == STARTUP_COST)
      {
!         if (Abs(path1->startup_cost - path2->startup_cost) > fuzz)
!         {
!             if (path1->startup_cost < path2->startup_cost)
!                 return -1;
!             else
!                 return +1;
!         }

          /*
           * If paths have the same startup cost (not at all unlikely),
           * order them by total cost.
           */
!         if (Abs(path1->total_cost - path2->total_cost) > fuzz)
!         {
!             if (path1->total_cost < path2->total_cost)
!                 return -1;
!             else
!                 return +1;
!         }
      }
      else
      {
!         if (Abs(path1->total_cost - path2->total_cost) > fuzz)
!         {
!             if (path1->total_cost < path2->total_cost)
!                 return -1;
!             else
!                 return +1;
!         }

          /*
           * If paths have the same total cost, order them by startup cost.
           */
!         if (Abs(path1->startup_cost - path2->startup_cost) > fuzz)
!         {
!             if (path1->startup_cost < path2->startup_cost)
!                 return -1;
!             else
!                 return +1;
!         }
      }
      return 0;
  }
--- 98,138 ----
  static int
  compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion)
  {
      /*
!      * We use a fuzz factor of 1% of the smaller cost.
       *
       * XXX does this percentage need to be user-configurable?
       */
      if (criterion == STARTUP_COST)
      {
!         if (path1->startup_cost > path2->startup_cost * 1.01)
!             return +1;
!         if (path2->startup_cost > path1->startup_cost * 1.01)
!             return -1;

          /*
           * If paths have the same startup cost (not at all unlikely),
           * order them by total cost.
           */
!         if (path1->total_cost > path2->total_cost * 1.01)
!             return +1;
!         if (path2->total_cost > path1->total_cost * 1.01)
!             return -1;
      }
      else
      {
!         if (path1->total_cost > path2->total_cost * 1.01)
!             return +1;
!         if (path2->total_cost > path1->total_cost * 1.01)
!             return -1;

          /*
           * If paths have the same total cost, order them by startup cost.
           */
!         if (path1->startup_cost > path2->startup_cost * 1.01)
!             return +1;
!         if (path2->startup_cost > path1->startup_cost * 1.01)
!             return -1;
      }
      return 0;
  }
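The effect of the patch can be paraphrased in a few lines of Python (an illustrative sketch of the before/after logic, not the planner's code), using the startup/total costs of the two plans from this thread. The return convention mirrors the C function: -1 if path1 is cheaper, +1 if path2 is cheaper, 0 if fuzzily equal.

```python
# Sketch of compare_fuzzy_path_costs() for criterion == STARTUP_COST,
# before and after the patch, applied to the costs from this thread.

def old_compare_startup(s1, t1, s2, t2):
    # Pre-patch: a single absolute fuzz factor, 1% of the smaller *total* cost.
    fuzz = max(min(t1, t2) * 0.01, 0.01)
    if abs(s1 - s2) > fuzz:
        return -1 if s1 < s2 else +1
    if abs(t1 - t2) > fuzz:          # startups "equal": fall through to totals
        return -1 if t1 < t2 else +1
    return 0

def new_compare_startup(s1, t1, s2, t2):
    # Post-patch: a relative 1% fuzz on each cost actually being compared.
    if s1 > s2 * 1.01: return +1
    if s2 > s1 * 1.01: return -1
    if t1 > t2 * 1.01: return +1
    if t2 > t1 * 1.01: return -1
    return 0

nest = (0.00, 144295895.01)     # nested loop: (startup, total)
hashj = (15912.20, 5328368.96)  # hash join:   (startup, total)

# Old code: fuzz = 1% of 5,328,368.96, about 53,283 -- which swallows the whole
# 15,912 startup-cost difference, so the comparison falls through to total cost
# and the hash join wins even on the STARTUP_COST criterion.
print(old_compare_startup(*nest, *hashj))  # prints 1 (hash join preferred)

# New code: 15912.20 > 0.00 * 1.01, so the nested loop's startup advantage is
# seen and it wins, as it should under a LIMIT.
print(new_compare_startup(*nest, *hashj))  # prints -1 (nested loop preferred)
```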
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
Here is the SCSI output:

Web Server:

SCSI subsystem driver Revision: 1.00
megaraid: v1.18j (Release Date: Mon Jul  7 14:39:55 EDT 2003)
megaraid: found 0x1028:0x000f:idx 0:bus 4:slot 3:func 0
scsi0 : Found a MegaRAID controller at 0xf883f000, IRQ: 18
scsi0 : Enabling 64 bit support
megaraid: [412W:H406] detected 1 logical drives
megaraid: supports extended CDBs.
megaraid: channel[1] is raid.
megaraid: channel[2] is raid.
scsi0 : LSI Logic MegaRAID 412W 254 commands 15 targs 5 chans 7 luns

Database Server:

SCSI subsystem driver Revision: 1.00
megaraid: v1.18j (Release Date: Mon Jul  7 14:39:55 EDT 2003)
megaraid: found 0x101e:0x1960:idx 0:bus 5:slot 0:func 0
scsi0 : Found a MegaRAID controller at 0xf883f000, IRQ: 21
scsi0 : Enabling 64 bit support
megaraid: [196T:3.33] detected 1 logical drives
megaraid: supports extended CDBs.
megaraid: channel[1] is raid.
megaraid: channel[2] is raid.
scsi0 : LSI Logic MegaRAID 196T 254 commands 15 targs 5 chans 7 luns
Starting timer : 0 0
blk: queue c5f2d218, I/O limit 4095Mb (mask 0x)
scsi0: scanning virtual channel 0 for logical drives.
  Vendor: MegaRAID  Model: LD 0 RAID5 86G  Rev: 196T
  Type:   Direct-Access                    ANSI SCSI revision: 02
Starting timer : 0 0

The webserver is a 1U and it actually performs better on IO than the database server, even though the database server is running 6 disks versus 3. The database server is a PE (PowerEdge) 6600.

Database Server IO:

[EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda
/dev/sda:
 Timing buffer-cache reads:   1888 MB in  2.00 seconds = 944.00 MB/sec
 Timing buffered disk reads:    32 MB in  3.06 seconds =  10.46 MB/sec

Second Database Server IO:

[EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda
/dev/sda:
 Timing buffer-cache reads:   1816 MB in  2.00 seconds = 908.00 MB/sec
 Timing buffered disk reads:    26 MB in  3.11 seconds =   8.36 MB/sec

Which is just horrible.

Sincerely,
Joshua D. Drake

Patrick Welche wrote:
> On Thu, Jul 21, 2005 at 09:19:04PM -0700, Luke Lonergan wrote:
>> Joshua,
>>
>> On 7/21/05 7:53 PM, "Joshua D. Drake" <[EMAIL PROTECTED]> wrote:
>>> Well I know that isn't true, at least not with ANY of the Dells my
>>> customers have purchased in the last 18 months. They are still really,
>>> really slow.
>>
>> That's too bad, can you cite some model numbers? SCSI?
>
> I would be interested too, given
> http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=30531
>
> Cheers,
> Patrick

-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
Re: [HACKERS] [PERFORM] Planner doesn't look at LIMIT?
On Fri, 2005-07-22 at 12:20 -0400, Tom Lane wrote:
> I think that this refutes the original scheme of using the same fuzz
> factor for both startup and total cost comparisons, and therefore
> propose the attached patch.
>
> Comments?

Looks good. I think it explains a few other weird perf reports also.

Best Regards, Simon Riggs
Re: [HACKERS] [PERFORM] Planner doesn't look at LIMIT?
Simon Riggs <[EMAIL PROTECTED]> writes:
> Looks good. I think it explains a few other weird perf reports also.

Could be. I went back to look at Sam Mason's report of about three weeks ago, and it definitely seems to explain his issue. The "fuzzy cost comparison" logic is new in 8.0, so it hasn't had all that much testing...

			regards, tom lane
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
On a single spindle:

$ time dd if=/dev/zero of=bigfile bs=8k count=200
200+0 records in
200+0 records out

real    2m8.569s
user    0m0.725s
sys     0m19.633s

None of my drives are partitioned big enough for me to create 2x RAM-sized files on a single disk. I have 16GB RAM and only 36GB drives. But here are some numbers for my 12-disk lvm2 striped volume:

$ time dd if=/dev/zero of=bigfile3 bs=8k count=400
400+0 records in
400+0 records out

real    1m17.059s
user    0m1.479s
sys     0m41.293s

Mark

On Thu, 21 Jul 2005 16:14:47 -0700 "Luke Lonergan" <[EMAIL PROTECTED]> wrote:
> Cool!
>
> At what rate does your disk setup write sequential data, e.g.:
>   time dd if=/dev/zero of=bigfile bs=8k count=50
>
> (sized for 2x RAM on a system with 2GB)
>
> BTW - the Compaq smartarray controllers are pretty broken on Linux from a
> performance standpoint in our experience. We've had disastrously bad
> results from the SmartArray 5i and 6 controllers on kernels from 2.4 ->
> 2.6.10, on the order of 20MB/s.
>
> For comparison, the results on our dual opteron with a single LSI SCSI
> controller with software RAID0 on a 2.6.10 kernel:
>
> [EMAIL PROTECTED] dbfast]$ time dd if=/dev/zero of=bigfile bs=8k count=50
> 50+0 records in
> 50+0 records out
>
> real    0m24.702s
> user    0m0.077s
> sys     0m8.794s
>
> Which calculates out to about 161MB/s.
>
> - Luke
>
> On 7/21/05 2:55 PM, "Mark Wong" <[EMAIL PROTECTED]> wrote:
>
> > I just ran through a few tests with the v14 patch against 100GB of data
> > from dbt3 and found a 30% improvement; 3.6 hours vs 5.3 hours. Just to
> > give a few details, I only loaded data and started a COPY in parallel
> > for each of the data files:
> > http://www.testing.osdl.org/projects/dbt3testing/results/fast_copy/
> >
> > Here's a visual of my disk layout, for those familiar with the database
> > schema:
> > http://www.testing.osdl.org/projects/dbt3testing/results/fast_copy/layout-dev4-010-dbt3.html
> >
> > I have 6 arrays of fourteen 15k rpm drives in a split-bus configuration
> > attached to a 4-way itanium2 via 6 compaq smartarray pci-x controllers.
> >
> > Let me know if you have any questions.
> >
> > Mark
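The throughput arithmetic behind figures like the "about 161MB/s" above is just bytes written over wall-clock time. A small sketch (the 500,000-block count is an assumption chosen to match "bs=8k, sized for 2x RAM on a system with 2GB", i.e. 4 GB written; dd's own record counts in the quoted output look truncated):

```python
# Sketch: deriving MB/s from a dd run -- bs * count bytes over elapsed seconds.

def dd_throughput_mb_s(block_size, block_count, seconds):
    """Decimal megabytes per second for a dd write of block_count blocks."""
    return block_size * block_count / seconds / 1_000_000

# Hypothetical run: 500,000 blocks of 8 kB (4 GB total) in 24.702 s of wall time.
rate = dd_throughput_mb_s(8192, 500_000, 24.702)
print(round(rate))  # roughly 166 MB/s, in the ballpark of the quoted 161 MB/s
```

Small differences from the quoted figure come down to rounding and whether one counts 8k as 8000 or 8192 bytes.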
Re: [PERFORM] Planner doesn't look at LIMIT?
On 7/22/05, Tom Lane <[EMAIL PROTECTED]> wrote:
> > This is quite strange. The nestloop plan definitely should be preferred
> > in the context of the LIMIT, considering that it has far lower estimated
> > cost. And it is preferred in simple tests for me.
>
> After a suitable period of contemplating my navel, I figured out
> what is going on here: the total costs involved are large enough that
> the still-fairly-high startup cost of the hash is disregarded by
> compare_fuzzy_path_costs(), and so the nestloop is discarded as not
> having any significant potential advantage in startup time.
>
> I think that this refutes the original scheme of using the same fuzz
> factor for both startup and total cost comparisons, and therefore
> propose the attached patch.
>
> Comments?

Works great!!! With LIMIT below 4 000 000 rows (it's a 47-million-row table) it prefers nested loops; above that it starts to introduce merge joins.

Regards,
   Dawid