Re: [PERFORM] Shared buffers, Sort memory, Effective Cache Size

2004-04-27 Thread Manfred Koizar
On Wed, 21 Apr 2004 10:01:30 -0700, Qing Zhao [EMAIL PROTECTED]
wrote:
I have recently configured my PG7.3 on a G5 (8GB RAM) with
shmmax set to 512MB and shared_buffers=5, sort_mem=4096,
and effective_cache_size = 1.  It seems to be working great so far, but
I am wondering if I should make effective_cache_size larger myself.

Yes, much larger!  And while you are at it make shared_buffers smaller.
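For scale, a rough sketch of the direction (illustrative values only, not
tuned figures; on 7.3 both settings are counted in 8KB pages):

shared_buffers = 10000          # ~80MB; modest, per the advice above
effective_cache_size = 655360   # ~5GB; roughly what the OS should cache on an 8GB box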

Servus
 Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Atesz
Hi,

You should try the following queries:

select support_person_id from ticket_crm_map where crm_id = 7 GROUP BY
support_person_id;
select support_person_id from ticket_crm_map where crm_id = 1 GROUP BY
support_person_id;

They can use the 'ticket_crm_map_crm_id_suppid' index.

Generally, Postgres can use a k-column index when the columns in your
conditions form a prefix of the index's column list.
For example:
CREATE INDEX test_idx on test(col1,col2,col3,col4);
SELECT * FROM test WHERE col1=3 AND col2=13;  -- This can use the index.

But the next queries cannot make full use of the index (col3 alone, or
col2 without col1, is not a prefix of the index columns):
SELECT * FROM test WHERE col1=3 AND col3=13;
SELECT * FROM test WHERE col2=3;

If you have problems with seq scans or sorts, you can disable them
globally or per session:
SET enable_seqscan=0;
SET enable_sort = 0;
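If you only need to influence a single query, the settings can also be
scoped to one transaction; a minimal sketch (on 7.3 or later, SET LOCAL
reverts automatically at commit):

BEGIN;
SET LOCAL enable_seqscan = 0;  -- affects only this transaction
SELECT support_person_id FROM ticket_crm_map WHERE crm_id = 7
GROUP BY support_person_id;
COMMIT;  -- enable_seqscan reverts to its previous value here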

Regards, Antal Attila



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Join problem

2004-04-27 Thread Silke Trissl
Hi,
I came across a very intriguing thing:
I had to join two tables and in both tables I wanted to restrict the 
result set by some (text/varchar) attributes.

Here is an example:
Table item (147 000 entries):

    Column    |          Type          | Modifiers
--------------+------------------------+-----------
 id           | integer                | not null
 description  | text                   |
 comment      | text                   | not null
 order_id     | integer                |

Table orders (210 000 entries):

    Column    |          Type          | Modifiers
--------------+------------------------+-----------
 order_id     | integer                |
 order_name   | character varying(255) |

The tables have 147 000 and 210 000 entries, respectively.
First I tried the following query, which took ages:
(Query 1)
EXPLAIN ANALYZE
SELECT item.id
FROM item, orders
WHERE orders.order_name ~* 'Smit'
  AND item.description ~* 'CD'
  and orders.order_id = item.order_id;

I found out that changing the operator from '~*' to '=' for
item.description brought a great boost in performance (425 secs down to
1 sec!), but not in estimated cost (query plans at the end).

(Query 2)
 EXPLAIN ANALYZE
SELECT item.id
FROM item, orders
WHERE orders.order_name ~* 'Smit'
  AND item.description = 'CD'
  and orders.order_id = item.order_id;
The main difference was that Query 2 used a Hash Join instead of the
Nested Loop, so I disabled the option 'NESTED LOOP' and got a time for
Query 1 similar to that of Query 2.

Can anyone tell me why the Hash Join is chosen in one case, while in the
other the much worse Nested Loop is preferred?
And my second question: is there any possibility to execute the first
query without disabling the Nested Loop first, but still get the good
performance of the Hash Join?

Many thanks in advance for your help or suggestions
Silke
QUERY PLANS:
#
Query 1:
                                 QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..28836.75 rows=1 width=4) (actual time=65350.780..452130.702 rows=6 loops=1)
   Join Filter: ("inner".order_id = "outer".order_id)
   ->  Seq Scan on item  (cost=0.00..28814.24 rows=1 width=8) (actual time=33.180..1365.190 rows=716 loops=1)
         Filter: (description ~* 'CD'::text)
   ->  Seq Scan on orders  (cost=0.00..22.50 rows=1 width=4) (actual time=21.644..629.500 rows=18 loops=716)
         Filter: ((order_name)::text ~* 'Smith'::text)
 Total runtime: 452130.782 ms
###

Query 2:
                                 QUERY PLAN
--------------------------------------------------------------------------
 Hash Join  (cost=22.50..28840.44 rows=4 width=4) (actual time=1187.798..1187.798 rows=0 loops=1)
   Hash Cond: ("outer".order_id = "inner".order_id)
   ->  Seq Scan on item  (cost=0.00..28814.24 rows=733 width=8) (actual time=542.737..542.737 rows=0 loops=1)
         Filter: (description = 'CD'::text)
   ->  Hash  (cost=22.50..22.50 rows=1 width=4) (actual time=645.042..645.042 rows=0 loops=1)
         ->  Seq Scan on orders  (cost=0.00..22.50 rows=1 width=4) (actual time=22.373..644.996 rows=18 loops=1)
               Filter: ((order_name)::text ~* 'Smith'::text)
 Total runtime: 1187.865 ms


Query 1 with 'set enable_nestloop to false':
                                 QUERY PLAN
--------------------------------------------------------------------------
 Hash Join  (cost=22.50..28836.75 rows=1 width=4) (actual time=1068.593..2003.330 rows=6 loops=1)
   Hash Cond: ("outer".order_id = "inner".order_id)
   ->  Seq Scan on item  (cost=0.00..28814.24 rows=1 width=8) (actual time=33.347..1357.073 rows=716 loops=1)
         Filter: (description ~* 'CD'::text)
   ->  Hash  (cost=22.50..22.50 rows=1 width=4) (actual time=645.287..645.287 rows=0 loops=1)
         ->  Seq Scan on orders  (cost=0.00..22.50 rows=1 width=4) (actual time=22.212..645.239 rows=18 loops=1)
               Filter: ((order_name)::text ~* 'Smith'::text)
 Total runtime: 2003.409 ms
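
P.S.: One thing I notice in the plans is that the scan on item is
estimated at 1 row but actually returns 716 rows; maybe that misestimate
is what makes the Nested Loop look cheap. Would raising the statistics
target for the column help? A sketch of what I mean (untested):

ALTER TABLE item ALTER COLUMN description SET STATISTICS 100;  -- larger sample for the planner
ANALYZE item;                                                  -- refresh the statistics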

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Weird context-switching issue on Xeon patch for 7.4.1

2004-04-27 Thread Josh Berkus
Dave,

 Are you testing this with Tom's code? You need to do a baseline
 measurement with 10 and then increase it; you will still get lots of cs,
 but it will be less.

No, that was just a test of 1000 straight up. Tom outlined a method, but I
didn't see any code that would help me find a better level, other than just
trying each +100 increase one at a time. This would take days of testing
...
-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] OT: Help with performance problems

2004-04-27 Thread scott.marlowe
On Mon, 26 Apr 2004, Rob Fielding wrote:

 scott.marlowe wrote:
  On Fri, 23 Apr 2004, Chris Hoover wrote:
  
  
 DB's on Powervaults 220S using raid 5 (over 6 disks)
  
  
  What controller is this, the adaptec?  We've found it to be slower than 
  the LSI megaraid based controller, but YMMV.
 
 Wow, really? You got any more details of the chipset, mobo and kernel 
 driver ?

We're running on a Dell 2650; the controller is the U320 LSI MegaRAID 2
channel (they only make the one that I know of right now).  Don't know my
mobo chipset offhand, but might be able to find out which one Dell includes
on the 2650.  The kernel driver is the latest megaraid2 driver as of about
Feb this year.

 I've been driven to my wits' end wrestling with an LSI MegaRAID 320-1
 controller on a supermicro board all weekend. I just couldn't get
 anything more than 10MB/sec out of it with megaraid driver v1 OR v2 in
 Linux 2.4.26, nor the version in 2.6.6-rc2. After 2 days of humming the
 Adaptec mantra I gave in and switched the array straight onto the
 onboard Adaptec 160 controller (same cable and everything). Software
 RAID 5 gets me over 40MB/sec for a nominal CPU hit - more than 4 times
 what I could get out of the MegaRAID controller :( Even the 2nd SCSI-2
 channel gets 40MB/sec max (pg_xlog :)
 
 And HOW LONG does it take to detect drives during POST... oh, never
 mind ... I really just wanna rant :) There should be a free counseling
 service for enraged sysops.

I wonder if your controller is broken or something?  Or maybe it's in a
PCI slot that has to share IRQs or something.  I've had great luck with
SuperMicro mobos in the past (we're talking dual PPro 200 mobos, so
seriously, IN THE PAST here... )  Hell, my dual PPro 200 with an old
MegaRAID 428 got 18 megs a second transfer rate, no problem.

Have you tried that LSI card in another machine / mobo combo?  Can you
disable the onboard Adaptec?  We have on our Dell 2650s: the only active
controllers are the onboard IDE and the add-in LSI-320-2 controller.

We're running ours with 128 megs of cache (I think; could be 64) set to
write back.  I think our throughput on a RAID-1 pair was somewhere around
40+ megs a second for reads with bonnie++.  With RAID-5 it was not really
much faster at reads (something like 60 megs a second), but it was much
more scalable under heavy parallel read/write access for PostgreSQL.

Have you updated the BIOS on the mobo to see if that helps?  I'm just 
throwing darts at the wall here.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Weird context-switching issue on Xeon patch for 7.4.1

2004-04-27 Thread Dave Cramer
Josh,

I think you can safely increase by orders of magnitude here instead of
by +100. My wild-ass guess is that the sweet spot is where the spin time
is approximately the time it takes to consume the resource, so if you
have a really fast machine the spin count should be higher.

Also, you have to take your memory bus speed into consideration: with
the pause instruction inserted in the loop, the timing is now dependent
on memory speed.

But... you need a baseline first.

Dave
On Tue, 2004-04-27 at 14:05, Josh Berkus wrote:
 Dave,
 
  Are you testing this with Tom's code? You need to do a baseline
  measurement with 10 and then increase it; you will still get lots of cs,
  but it will be less.
 
 No, that was just a test of 1000 straight up. Tom outlined a method, but I
 didn't see any code that would help me find a better level, other than just
 trying each +100 increase one at a time. This would take days of testing
 ...
-- 
Dave Cramer
519 939 0336
ICQ # 14675561


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Weird context-switching issue on Xeon patch for 7.4.1

2004-04-27 Thread Josh Berkus
Dave,

 But... you need a baseline first.

A baseline on CS?  I have that.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] Simple join in PostgreSQL takes too long

2004-04-27 Thread Vitaly Belman
Hello pgsql-performance,

  I discussed the whole subject for some time in DevShed and didn't
  achieve much (as for results). I wonder if any of you guys can help
  out:

  http://forums.devshed.com/t136202/s.html

Regards,
 Vitaly Belman
 
 ICQ: 1912453
 AIM: VitalyB1984
 MSN: [EMAIL PROTECTED]
 Yahoo!: VitalyBe


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Simple join in PostgreSQL takes too long

2004-04-27 Thread Nick Barr
Vitaly Belman wrote:
Hello pgsql-performance,
  I discussed the whole subject for some time in DevShed and didn't
  achieve much (as for results). I wonder if any of you guys can help
  out:
  http://forums.devshed.com/t136202/s.html
So cutting and pasting:
- SCHEMA -
CREATE TABLE bv_bookgenres (
book_id INT NOT NULL,
genre_id INT NOT NULL
);
CREATE TABLE bv_genre_children (
genre_id INT,
genre_child_id INT
);
---
- QUERY -
select DISTINCT
  book_id
from
  bv_bookgenres,
  bv_genre_children
WHERE
  bv_bookgenres.genre_id = bv_genre_children.genre_child_id AND
  bv_genre_children.genre_id = 1
LIMIT 10
-
- EXPLAIN ANALYZE -
                                 QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=6503.51..6503.70 rows=10 width=4) (actual time=703.000..703.000 rows=10 loops=1)
   ->  Unique  (cost=6503.51..6738.20 rows=12210 width=4) (actual time=703.000..703.000 rows=10 loops=1)
         ->  Sort  (cost=6503.51..6620.85 rows=46937 width=4) (actual time=703.000..703.000 rows=24 loops=1)
               Sort Key: bv_bookgenres.book_id
               ->  Merge Join  (cost=582.45..2861.57 rows=46937 width=4) (actual time=46.000..501.000 rows=45082 loops=1)
                     Merge Cond: ("outer".genre_id = "inner".genre_child_id)
                     ->  Index Scan using genre_id on bv_bookgenres  (cost=0.00..1462.84 rows=45082 width=8) (actual time=0.000..158.000 rows=45082 loops=1)
                     ->  Sort  (cost=582.45..598.09 rows=6256 width=2) (actual time=46.000..77.000 rows=49815 loops=1)
                           Sort Key: bv_genre_children.genre_child_id
                           ->  Index Scan using genre_id2 on bv_genre_children  (cost=0.00..187.98 rows=6256 width=2) (actual time=0.000..31.000 rows=6379 loops=1)
                                 Index Cond: (genre_id = 1)
 Total runtime: 703.000 ms
---

- CONF SETTINGS -
shared_buffers = 1000   # min 16, at least max_connections*2, 8KB each
sort_mem = 1
#work_mem = 1024# min 64, size in KB
#maintenance_work_mem = 16384   # min 1024, size in KB
#max_stack_depth = 2048 # min 100, size in KB
-
Have you VACUUM ANALYZEd recently? If not, do that, then rerun the
EXPLAIN ANALYZE.

You might wanna bump shared_buffers. You have 512MB RAM, right? You
probably want to raise shared_buffers, restart PG, then run a VACUUM
ANALYZE. Then rerun the EXPLAIN ANALYZE.

If that doesn't help, try doing a:
ALTER TABLE bv_genre_children ALTER COLUMN genre_child_id SET STATISTICS 100;

followed by a:
VACUUM ANALYZE bv_genre_children;
You might also want to tweak the effective_cache_size parameter in
postgresql.conf, but I am unsure how this works on Windows. Does Windows
have a kernel disk cache, anyone?
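
As a very rough sketch (illustrative values only, assuming the 512MB box
mentioned above and 8KB pages):

# postgresql.conf -- illustrative, not a tuned figure
effective_cache_size = 32768    # 32768 x 8KB = 256MB, a guess at the OS disk cache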


HTH
Nick


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Simple join in PostgreSQL takes too long

2004-04-27 Thread Atesz
Hi,

You can try some variations:

SELECT
  book_id
FROM
  bv_bookgenres, bv_genre_children
WHERE
  bv_bookgenres.genre_id = bv_genre_children.genre_child_id AND
  bv_genre_children.genre_id = 1
GROUP BY book_id
LIMIT 10

The next query works when the subquery returns a single row (for
example, if genre_id = 1 has exactly one child in 'bv_genre_children'):

SELECT
  book_id
FROM
  bv_bookgenres
WHERE
  bv_bookgenres.genre_id = (SELECT genre_child_id FROM bv_genre_children
                            WHERE genre_id = 1)
GROUP BY book_id
LIMIT 10

You may also need an index. Try these with EXPLAIN!

CREATE INDEX bookgenres_genre_id_book_id ON bv_bookgenres(genre_id, book_id);
or
CREATE INDEX bookgenres_book_id_genre_id ON bv_bookgenres(book_id, genre_id);

CREATE INDEX genre_children_genre_id ON bv_genre_children(genre_id);

Regards, Antal Attila



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Simple join in PostgreSQL takes too long

2004-04-27 Thread Rod Taylor
On Tue, 2004-04-27 at 17:27, Vitaly Belman wrote:
 Hello pgsql-performance,
 
   I discussed the whole subject for some time in DevShed and didn't
   achieve much (as for results). I wonder if any of you guys can help
   out:
 
   http://forums.devshed.com/t136202/s.html

You're taking the wrong approach. Rather than using the select query to
ensure that book_id is distinct, add a constraint to the table so that it
is guaranteed:

CREATE UNIQUE INDEX bv_bookgenres_unq ON bv_bookgenres(book_id,
genre_id);

Now you can do a simple join (drop the DISTINCT keyword) and achieve the
same results:
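
A sketch of the simplified query, with the table names used earlier in
the thread:

SELECT book_id
FROM bv_bookgenres, bv_genre_children
WHERE bv_bookgenres.genre_id = bv_genre_children.genre_child_id
  AND bv_genre_children.genre_id = 1
LIMIT 10;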

The point is that a book cannot be of a certain genre more than once.

Without the distinct, this should take a matter of a few milliseconds to
execute.



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Tom Lane
[EMAIL PROTECTED] writes:
 ... Wouldn't the most efficient plan be to scan the index regardless
 of crm_id because the only columns needed are in the index?

No.  People coming from other databases often have the misconception
that queries can be answered by looking only at an index.  That is never
true in Postgres because row validity info is only stored in the table;
so we must always visit the table entry to make sure the row is still
valid/visible for the current query.

Accordingly, columns added to the index that aren't constrained by the
WHERE clause are not very useful ...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [JDBC] [PERFORM] is a good practice to create an index on the

2004-04-27 Thread Dave Cramer
AFAIK, oids aren't used for anything internally, so duplicates don't
really matter. Besides, what would you do about duplicate oids?

The best suggestion is of course his last: don't use them.


On Mon, 2004-04-26 at 22:48, Christopher Kings-Lynne wrote:
  I am using the oid of the table as the main key and I've found that it is
  not indexed (maybe because I have declared another primary key in the
  table).
  
  Is it good practice to create an index like this on the oid of a table?
  CREATE INDEX idoid ON annuncio400 USING btree (oid);
 
 Yes it is - in fact you really should add a unique index, not just a 
 normal index, as you want to enforce uniqueness of the oid column.  It 
 is theoretically possible to end up with duplicate oids in wraparound 
 situations.
 
 Even better though is to not use oids at all, of course...
 
 Chris
 
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faqs/FAQ.html
 
 
 
 
 
-- 
Dave Cramer
519 939 0336
ICQ # 14675561


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [JDBC] [PERFORM] is a good practice to create an index on the

2004-04-27 Thread Christopher Kings-Lynne
 AFAIK, oids aren't used for anything internally, so duplicates don't
 really matter. Besides, what would you do about duplicate oids?
If he's using them _externally_, then he does have to worry about 
duplicates.
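
If he does keep using them, enforcing uniqueness at the database level is
the safe option; a sketch (index name illustrative, table name taken from
earlier in the thread):

CREATE UNIQUE INDEX annuncio400_oid_key ON annuncio400 (oid);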

Chris
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [JDBC] [PERFORM] is a good practice to create an index on the

2004-04-27 Thread Christopher Kings-Lynne
 I am going to use them as primary key of the table, so I'll surely need
 them unique :)
Eduoardo, I REALLY suggest you don't use them at all.  You should make a 
primary key like this:

CREATE TABLE blah (
  id SERIAL PRIMARY KEY,
  ...
);
Also note that by default, OIDs are NOT dumped by pg_dump.  You will 
need to add extra switches to your pg_dump backup to ensure that they are.
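
For example, something like this (a sketch; -o/--oids is the pg_dump
switch that includes OIDs, and 'mydb' is a placeholder database name):

pg_dump -o mydb > mydb_with_oids.sql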

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster