Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-24 Thread Andrus
Tomas, OK, what was the number of unused pointer items in the VACUUM output? I posted it in this thread: VACUUM FULL ANALYZE VERBOSE; ... INFO: free space map contains 14353 pages in 314 relations DETAIL: A total of 2 page slots are in use (including overhead). 89664 page slots are requ

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-24 Thread tv
>> Given the fact that the performance issues are caused by bloated tables >> and / or slow I/O subsystem, moving to a similar system won't help I >> guess. > > I have run VACUUM FULL ANALYZE VERBOSE > and set MAX_FSM_PAGES = 15 > > So there is no bloat except pg_shdepend indexes which shou

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-24 Thread Andrus
Tomas, Let's suppose you set a reasonable value (say 8096) instead of 2GB. That gives about 160MB. Anyway this depends - if you have a lot of slow queries caused by on-disk sorts / hash tables, use a higher value. Otherwise leave it as it is. Probably product orders table is frequently joined

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Tomas Vondra
Scott, thank you. > work_mem = 512 This is very easy to try. You can change work_mem for just a single session, and this can in some cases help performance quite a bit, and in others not at all. I would not recommend having it lower than at least 4MB on a server like that unless you have a lo

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Tomas Vondra
My test computer has PostgreSQL 8.3, 4 GB RAM, SSD disks, Intel X2Extreme CPU. So it is much faster than this prod server. No idea how to emulate this environment. I can create a new db on the prod server as a copy of the old one, but this can be used late at night only. Well, a faster but comparable system may not

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Andrus
Scott, thank you. > work_mem = 512 This is very easy to try. You can change work_mem for just a single session, and this can in some cases help performance quite a bit, and in others not at all. I would not recommend having it lower than at least 4MB on a server like that unless you have a lo

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Andrus
I guess you have backups - take them, restore the database on a different machine (preferably with the same / similar hw config) and tune the queries on it. After restoring all the tables / indexes will be 'clean' (not bloated), so you'll see if performing VACUUM FULL / CLUSTER is the right solut

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Scott Carey
> Approaches which probably do not change performance: > 6. Upgrade to 8.4 or to 8.3.5 Both of these will improve performance a little, even with the same query plan and same data. I would expect about a 10% improvement for 8.3.x on most memory bound select queries. 8.4 won't be out for a few

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Tomas Vondra
Risky to try in prod server. Requires creating randomly distributed product_id testcase to measure difference. What should I do next? I guess you have backups - take them, restore the database on a different machine (preferably with the same / similar hw config) and tune the queries on it.

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-23 Thread Andrus
You could try writing a plpgsql function which would generate the data set. Or you could use your existing data set. Creating 3.5 million rows using a stored procedure is probably slow. Probably it would be better and faster to use some random() and generate_series() trick. In this case others can try it a

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-22 Thread PFC
Thank you very much for the great sample. I tried to create a testcase from this to match the production db: 1.2 million orders; 3.5 million order details; 13400 products with char(20) primary keys containing EAN-13 codes; mostly data from the last 3 years; every order usually has 1..3 detail lines; same product

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-22 Thread Andrus
You could perhaps run a little check on the performance of the RAID, is it better than linux software RAID ? Does it leverage NCQ appropriately when running queries in parallel ? I was told that this RAID is software RAID. I have no experience what to check. This HP server was installed 3 years

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-22 Thread PFC
On Fri, 21 Nov 2008 21:07:02 +0100, Tom Lane <[EMAIL PROTECTED]> wrote: PFC <[EMAIL PROTECTED]> writes: Index on orders_products( product_id ) and orders_products( order_id ): => Same plan Note that in this case, a smarter planner would use the new index to perform a BitmapAn

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Scott Carey
> If it's not a million rows, then the table is bloated. Try (as postgres > or some other db superuser) "vacuum full pg_shdepend" and a "reindex > pg_shdepend". reindex table pg_shdepend

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
If it's not a million rows, then the table is bloated. Try (as postgres or some other db superuser) "vacuum full pg_shdepend" and a "reindex pg_shdepend". reindex table pg_shdepend causes error ERROR: shared table "pg_shdepend" can only be reindexed in stand-alone mode vacuum full verbose pg_

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Tomas Vondra
2. Run the following commands periodically in this order: VACUUM FULL; vacuum full pg_shdepend; CLUSTER rid on (toode); CLUSTER dok on (kuupaev); REINDEX DATABASE mydb; REINDEX SYSTEM mydb; ANALYZE; Are all those commands required or can something be left out? Running CLUSTER after VACUUM FULL

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Tomas Vondra
Thank you. My 8.1.4 postgresql.conf does not contain such option. So vacuum_cost_delay is off probably. Since doc does not recommend any value, I planned to use 2000 Will value of 30 allow other clients to work when VACUUM FULL is running ? No, as someone already noted the VACUUM FULL is bloc

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Alvaro Herrera
Andrus wrote: >> So I gather you're not doing any vacuuming, eh? > > Log files for every day are full of garbage messages below. > So I hope that vacuum is running well, isn't it ? This does not really mean that autovacuum has done anything in the databases. If the times are consistently separat

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
Alvaro, 1. vacuum_cost_delay does not affect vacuum full 2. vacuum full is always blocking, regardless of settings So the only way is to disable other database access if vacuum full is required. So I gather you're not doing any vacuuming, eh? Log files for every day are full of garbage messages

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Tom Lane
PFC <[EMAIL PROTECTED]> writes: > Index on orders_products( product_id ) and orders_products( order_id ): > => Same plan > Note that in this case, a smarter planner would use the new index to > perform a BitmapAnd before hitting the heap to get the rows. Considering that the query h

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Alvaro Herrera
Andrus wrote: > Will value of 30 allow other clients to work when VACUUM FULL is running ? 1. vacuum_cost_delay does not affect vacuum full 2. vacuum full is always blocking, regardless of settings So I gather you're not doing any vacuuming, eh? -- Alvaro Herrera

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
Alvaro, Are you really using vacuum_cost_delay=2000? If so, therein lies your problem. That's a silly value to use for that variable. Useful values are in the 20-40 range probably, or maybe 10-100 being extremely generous. Thank you. My 8.1.4 postgresql.conf does not contain such option. So

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Alvaro Herrera
Andrus wrote: > I discovered vacuum_cost_delay=2000 option. Will this remove blocking > issue and allow vacuum full to work ? No. Are you really using vacuum_cost_delay=2000? If so, therein lies your problem. That's a silly value to use for that variable. Useful values are in the 20-40 range

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Alan Hodgson
On Friday 21 November 2008, "Andrus" <[EMAIL PROTECTED]> wrote: > Those commands cause server probably to stop responding to other client > like vacuum full pg_shdepend > did. > > Should vacuum_cost_delay = 2000 allow other users to work when running > those commands ? Any vacuum full or cluster w

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
How to vacuum full pg_shdepend automatically so that other users can work at the same time? Your table is horribly bloated. You must use VACUUM FULL + REINDEX (as superuser) on it; unfortunately, however, it is blocking. Therefore, you should wait for Sunday night to do this, when no one will notic

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread PFC
The log file suggests that mostly only these queries are slow: SELECT ... FROM dok JOIN rid USING (dokumnr) JOIN ProductId USING (ProductId) WHERE rid.ProductId LIKE :p1 || '%' AND dok.SaleDate>=:p2 :p1 and :p2 are parameters that differ between queries. dok contains several years of d

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
Tomas, Thank you. Just the most important points: 1) "dok" table contains 1235086 row versions in 171641 pages (with 8kB pages this means 1.4 GB of data), but there are 1834279 unused item pointers (i.e. about 60% of the space is wasted) 2) "rid" table contains 3275189 rows in 165282 (wit

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread PFC
Server has 2 GB RAM. It has SATA RAID 0,1 integrated controller (1.5Gbps) and SAMSUNG HD160JJ mirrored disks. You could perhaps run a little check on the performance of the RAID, is it better than linux software RAID ? Does it leverage NCQ appropriately when running queries in para

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread PFC
How to vacuum full pg_shdepend automatically so that other users can work at the same time? Your table is horribly bloated. You must use VACUUM FULL + REINDEX (as superuser) on it; unfortunately, however, it is blocking. Therefore, you should wait for Sunday night to do this, when noo

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
Richard, Thank you. Try "SELECT count(*) FROM pg_shdepend". This query returns 3625 and takes 35 seconds to run. If it's not a million rows, then the table is bloated. Try (as postgres or some other db superuser) "vacuum full pg_shdepend" and a "reindex pg_shdepend". vacuum full verbose

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Richard Huxton
Andrus wrote: >> - what's the size of the dataset relative to the RAM ? > > Db size is 7417 MB > relevant table sizes in desc by size order: > > 140595 dok 2345 MB > 2 1214 pg_shdepend 2259 MB > 6 123

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread tv
Just the most important points: 1) "dok" table contains 1235086 row versions in 171641 pages (with 8kB pages this means 1.4 GB of data), but there are 1834279 unused item pointers (i.e. about 60% of the space is wasted) 2) "rid" table contains 3275189 rows in 165282 (with 8kB pages this means
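
The figures quoted for "dok" above can be sanity-checked with a little arithmetic. The following is only a rough sketch, not something posted in the thread: it assumes the default 8 kB page is 8192 bytes, and it assumes the "about 60%" figure was estimated as the share of unused item pointers among all item pointers, which is only a proxy for wasted space. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the VACUUM VERBOSE figures quoted for "dok".
# Assumptions (not stated in the thread): an 8 kB PostgreSQL page is 8192
# bytes, and "wasted" is approximated by the share of unused item pointers
# among all item pointers (row versions + unused pointers).

PAGE_SIZE_BYTES = 8192


def table_size_gb(pages: int, page_size: int = PAGE_SIZE_BYTES) -> float:
    """On-disk heap size in decimal gigabytes."""
    return pages * page_size / 1e9


def unused_pointer_fraction(row_versions: int, unused_pointers: int) -> float:
    """Share of item pointers that no longer point at a row version."""
    return unused_pointers / (row_versions + unused_pointers)


dok_size = table_size_gb(171641)
dok_unused = unused_pointer_fraction(1235086, 1834279)
print(f"dok: {dok_size:.2f} GB on disk, {dok_unused:.0%} of item pointers unused")
```

Under these assumptions the numbers agree with the post: roughly 1.4 GB on disk and close to 60% of the item pointers unused.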

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
PFC, thank you. OK so vmstat says you are IO-bound, this seems logical if the same plan has widely varying timings... Let's look at the usual suspects : - how many dead rows in your tables ? are your tables data, or bloat ? (check vacuum verbose, etc) set search_path to firma2,public; vacuu

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-21 Thread Andrus
Richard, In addition to "top" below, you'll probably find "vmstat 5" useful. Thank you. During this query run (65 sec), vmstat 5 shows big values in bi,cs and wa columns: procs ---memory-- ---swap-- -io --system-- cpu r b swpd free buff cache si

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread PFC
OK so vmstat says you are IO-bound, this seems logical if the same plan has widely varying timings... Let's look at the usual suspects : - how many dead rows in your tables ? are your tables data, or bloat ? (check vacuum verbose, etc) - what's the size of the dataset re

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Richard Huxton
Andrus wrote: > Richard, > >> At a quick glance, the plans look the same to me. The overall costs are >> certainly identical. That means whatever is affecting the query times it >> isn't the query plan. >> >> So - what other activity is happening on this machine? Either other >> queries are taking

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Andrus
Richard, At a quick glance, the plans look the same to me. The overall costs are certainly identical. That means whatever is affecting the query times it isn't the query plan. So - what other activity is happening on this machine? Either other queries are taking up noticeable resources, or some

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Richard Huxton
Andrus wrote: > Query below seems to use indexes everywhere in most optimal way. > dokumnr column is of type int > > Speed of this query varies rapidly: > > In live db fastest response I have got is 8 seconds. > Re-running same query after 10 seconds may take 60 seconds. > Re-running it again af

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Andrus
Just a question, what are you doing with the 20.000 result rows ? Those rows represent monthly sales data of one item. They are used as follows: 1. Detailed sales report for month. This report can be browsed on screen for monthly sales and ordering analysis. 2. Total reports. In those reports,

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-19 Thread PFC
Query below seems to use indexes everywhere in most optimal way. dokumnr column is of type int Speed of this query varies rapidly: In live db fastest response I have got is 8 seconds. Re-running same query after 10 seconds may take 60 seconds. Re-running it again after 10 seconds may take 114

[PERFORM] Hash join on int takes 8..114 seconds

2008-11-19 Thread Andrus
Query below seems to use indexes everywhere in most optimal way. dokumnr column is of type int Speed of this query varies rapidly: In live db fastest response I have got is 8 seconds. Re-running same query after 10 seconds may take 60 seconds. Re-running it again after 10 seconds may take 114 s