Re: [PERFORM] n00b autovacuum question
On 18/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote: > Antoine wrote: > > Hi, > > I have enabled the autovacuum daemon, but occasionally still get a > > message telling me I need to run vacuum when I access a table in > > pgadmin. > > pgAdmin notices a discrepancy between real rowcount and estimated > rowcount and thus suggests to run vacuum/analyze; it won't examine > autovacuum rules so it might warn although autovac is running ok. > > If you're sure autovacuum is running fine, just dismiss the message. I guess that is my problem - I a not sure it is running fine. The process is definitely running but I am getting lots of complaints about performance. This probably has lots to do with crap code and not much to do with the database but I am still searching the maintenance avenue... We have a massive project coming up and I want to go for Postgres (the boss wants Oracle). If I can't get my stuff together I am not sure my arguments will stick... problem is I don't really have the time to experiment properly. Cheers Antoine -- This is where I should put some witty comment. ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] partitioning
Hi, Is there any work on the cards for implementing other partitioning strategies? I see mysql 5.1 will have support for hashes and stuff but didn't see anything in the todos for postgres. Cheers Antoine -- This is where I should put some witty comment. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] database model tshirt sizes
We have size and color in the product table itself. It is really an attribute of the product. If you update the availability of the product often, I would split out the quantity into a separate table so that you can truncate and update as needed. Patrick Hatcher Development Manager Analytics/MIO Macys.com "NbForYou" <[EMAIL PROTECTED] .com> To Sent by: pgsql-performance cc [EMAIL PROTECTED] .org Subject [PERFORM] database model tshirt sizes 03/18/06 07:03 AM Hello, Does anybody know how to build a database model to include sizes for rings, tshirts, etc? the current database is built like: table product = productid int8 PK productname charvar(255) quantity int4 what i want now is that WHEN (not all products have multiple sizes) there are multiple sizes available. The sizes are stored into the database. I was wondering to include a extra table: table sizes: productid int8 FK size varchar(100) but then i have a quantity problem. Because now not all size quantities can be stored into this table, because it allready exist in my product table. How do professionals do it? How do they make their model to include sizes if any available? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] database model tshirt sizes
another approach would be: table product: productid int8 PK productname charvar(255) table versions productid int8 FK versionid int8 PK size color ... quantity int4 an example would be then: table product: - productid: 123, productname: 'nice cotton t-shirt' - productid: 442, productname: 'another cotton t-shirt' table versions: - productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11 - productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1 - productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4 - productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9 - productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0 that way you can have more than 1 quantity / color / size combination per product and still have products that come in one size. so instead of only using a 2nd table for cases where more than one size is available, you would always use a 2nd table. this probably reduces your code complexity quite a bit and only needs 1 JOIN. - thomas - Original Message - From: "Patrick Hatcher" <[EMAIL PROTECTED]> To: "NbForYou" <[EMAIL PROTECTED]> Cc: ; <[EMAIL PROTECTED]> Sent: Sunday, March 19, 2006 2:59 PM Subject: Re: [PERFORM] database model tshirt sizes We have size and color in the product table itself. It is really an attribute of the product. If you update the availability of the product often, I would split out the quantity into a separate table so that you can truncate and update as needed. Patrick Hatcher Development Manager Analytics/MIO Macys.com "NbForYou" <[EMAIL PROTECTED] .com> To Sent by: pgsql-performance cc [EMAIL PROTECTED] .org Subject [PERFORM] database model tshirt sizes 03/18/06 07:03 AM Hello, Does anybody know how to build a database model to include sizes for rings, tshirts, etc? the current database is built like: table product = productid int8 PK productname charvar(255) quantity int4 what i want now is that WHEN (not all products have multiple sizes) there are multiple sizes available. The sizes are stored into the database. I was wondering to include a extra table: table sizes: productid int8 FK size varchar(100) but then i have a quantity problem. Because now not all size quantities can be stored into this table, because it allready exist in my product table. How do professionals do it? How do they make their model to include sizes if any available? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] n00b autovacuum question
Antoine wrote: On 18/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote: Antoine wrote: Hi, I have enabled the autovacuum daemon, but occasionally still get a message telling me I need to run vacuum when I access a table in pgadmin. Bring up the postgresql.conf editor on that server, and watch if pgadmin complains. Regards, Andreas ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] n00b autovacuum question
On 19/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote: > Antoine wrote: > > On 18/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote: > > > >>Antoine wrote: > >> > >>>Hi, > >>>I have enabled the autovacuum daemon, but occasionally still get a > >>>message telling me I need to run vacuum when I access a table in > >>>pgadmin. > > Bring up the postgresql.conf editor on that server, and watch if pgadmin > complains. Hi, I am not sure I understand what "bring up" means. Could you explain? Thanks Antoine -- This is where I should put some witty comment. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
pgsql-performance@postgresql.org
Kenji, On 3/17/06 4:08 PM, "Kenji Morishige" <[EMAIL PROTECTED]> wrote: > Thanks guys, I'm studying each of your responses and am going to start to > experiement. I notice that no one asked you about your disk bandwidth - the Adaptec 2200S is a "known bad" controller - the bandwidth to/from in RAID5 is about 1/2 to 1/3 of a single disk drive, which is far too slow for a 10GB database, and IMO should disqualify a RAID adapter from being used at all. Without fixing this, I'd suggest that all of the other tuning described here will have little value, provided your working set is larger than your RAM. You should test the I/O bandwidth using these simple tests: time bash -c "dd if=/dev/zero of=bigfile bs=8k count=100 && sync" then: time dd if=bigfile of=/dev/null bs=8k You should get on the order of 150MB/s on four disk drives in RAID5. And before people jump in about "random I/O", etc, the sequential scan test will show whether the controller is just plain bad very quickly. If it can't do sequential fast, it won't do seeks fast either. - Luke ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] database model tshirt sizes
So a default value for all products would be size:"all" for example, the same tshirt shop also sells cdroms. It size attribute would be to place it to be :"all". (because we cannot place an uniqe index on null values) But the industry evolves and so in time the same cdrom is now available for pc and playstation. So i would like to have it as 1 productid but with different attributes: pc (with quantity 5) and playstation (with quantity 3). So when I do an insert for this 2 products with 1 productid it would be like: insert into versions (productid,size,quantity) values (345,'pc',5); insert into versions (productid,size,quantity) values (345,'playstation',3); if however the product existed we get an error: because the default value version "all" did also exist and is now obsolete population versions: productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11 productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1 productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4 productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9 productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0 productid: 345, versionid: 3, color: null, size: 'all', quantity: 15 productid: 345, versionid: 3, color: null, size: 'pc', quantity: 5 productid: 345, versionid: 3, color: null, size: 'playstation', quantity: 3 WOULD HAVE TO BE: population versions: productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11 productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1 productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4 productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9 productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0 productid: 345, versionid: 3, color: null, size: 'pc', quantity: 5 productid: 345, versionid: 3, color: null, size: 'playstation', quantity: 3 ALSO: what is versionid used for? - Original Message - From: <[EMAIL PROTECTED]> To: "NbForYou" <[EMAIL PROTECTED]> Cc: ; <[EMAIL PROTECTED]> Sent: Sunday, March 19, 2006 3:37 PM Subject: Re: [PERFORM] database model tshirt sizes another approach would be: table product: productid int8 PK productname charvar(255) table versions productid int8 FK versionid int8 PK size color ... quantity int4 an example would be then: table product: - productid: 123, productname: 'nice cotton t-shirt' - productid: 442, productname: 'another cotton t-shirt' table versions: - productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11 - productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1 - productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4 - productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9 - productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0 that way you can have more than 1 quantity / color / size combination per product and still have products that come in one size. so instead of only using a 2nd table for cases where more than one size is available, you would always use a 2nd table. this probably reduces your code complexity quite a bit and only needs 1 JOIN. - thomas - Original Message - From: "Patrick Hatcher" <[EMAIL PROTECTED]> To: "NbForYou" <[EMAIL PROTECTED]> Cc: ; <[EMAIL PROTECTED]> Sent: Sunday, March 19, 2006 2:59 PM Subject: Re: [PERFORM] database model tshirt sizes We have size and color in the product table itself. It is really an attribute of the product. If you update the availability of the product often, I would split out the quantity into a separate table so that you can truncate and update as needed. Patrick Hatcher Development Manager Analytics/MIO Macys.com "NbForYou" <[EMAIL PROTECTED] .com> To Sent by: pgsql-performance cc [EMAIL PROTECTED] .org Subject [PERFORM] database model tshirt sizes 03/18/06 07:03 AM Hello, Does anybody know how to build a database model to include sizes for rings, tshirts, etc? the current database is built like: table product = productid int8 PK productname charvar(255) quantity int4 what i want now is that WHEN (not all products have multiple sizes) there are multiple sizes available. The sizes are stored into the database. I was wondering to include a extra table: table sizes: productid int8 FK size varchar(100) but then i have a quantity problem. Because now not all size quantities can be stored into this table, because it allready exist in my product table. How do professionals do it? How do they make their model to include sizes if any av
pgsql-performance@postgresql.org
-Original Message- From: "Luke Lonergan"<[EMAIL PROTECTED]> Sent: 19/03/06 16:26:58 To: "Kenji Morishige"<[EMAIL PROTECTED]>, "Claus Guttesen"<[EMAIL PROTECTED]> Cc: "pgsql-performance@postgresql.org" Subject: Re: [PERFORM] Best OS & Configuration for Dual Xeon w/4GB & > I notice that no one asked you about your disk bandwidth - the Adaptec 2200S >is a "known bad" controller - Agreed - We have a couple at work which got relagated to use in 'toy' boxes when we realised how bad they were, long before they ever saw any production use. Regards, Dave -Unmodified Original Message- Kenji, On 3/17/06 4:08 PM, "Kenji Morishige" <[EMAIL PROTECTED]> wrote: > Thanks guys, I'm studying each of your responses and am going to start to > experiement. I notice that no one asked you about your disk bandwidth - the Adaptec 2200S is a "known bad" controller - the bandwidth to/from in RAID5 is about 1/2 to 1/3 of a single disk drive, which is far too slow for a 10GB database, and IMO should disqualify a RAID adapter from being used at all. Without fixing this, I'd suggest that all of the other tuning described here will have little value, provided your working set is larger than your RAM. You should test the I/O bandwidth using these simple tests: time bash -c "dd if=/dev/zero of=bigfile bs=8k count=100 && sync" then: time dd if=bigfile of=/dev/null bs=8k You should get on the order of 150MB/s on four disk drives in RAID5. And before people jump in about "random I/O", etc, the sequential scan test will show whether the controller is just plain bad very quickly. If it can't do sequential fast, it won't do seeks fast either. - Luke ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] partial indexes and inference
I have a case where it seems the planner should be able to infer more from its partial indexes than it is doing. Observe: px=# select version(); version PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2 (1 row) px=# \d pxmdvalue Table "store.pxmdvalue" Column | Type | Modifiers +--+--- entityid | bigint | not null fieldid| integer | not null value | text | not null datatypeid | integer | not null tsi| tsvector | Indexes: "pxmdvalue_pk" PRIMARY KEY, btree (entityid, fieldid) "pxmdvalue_atom_val_idx" btree (value) WHERE datatypeid = 22 "pxmdvalue_bigint_val_idx" btree ((value::bigint)) WHERE datatypeid = 43 "pxmdvalue_datatypeid_idx" btree (datatypeid) "pxmdvalue_int_val_idx" btree ((value::integer)) WHERE datatypeid = 16 "pxmdvalue_str32_val0_idx" btree (lower(value)) WHERE datatypeid = 2 AND octet_length(value) < 2700 "pxmdvalue_str32_val1_idx" btree (lower(value) text_pattern_ops) WHERE datatypeid = 2 AND octet_length(value) < 2700 "pxmdvalue_str_val0_idx" btree (lower(value)) WHERE datatypeid = 85 AND octet_length(value) < 2700 "pxmdvalue_str_val1_idx" btree (lower(value) text_pattern_ops) WHERE datatypeid = 85 AND octet_length(value) < 2700 "pxmdvalue_time_val_idx" btree (px_text2timestamp(value)) WHERE datatypeid = 37 px=# explain analyse select * from pxmdvalue where datatypeid = 43 and fieldid = 857 and cast(value as bigint) = '1009'; QUERY PLAN Bitmap Heap Scan on pxmdvalue (cost=2143.34..2685.74 rows=1 width=245) (actual time=144.411..144.415 rows=1 loops=1) Recheck Cond: (((value)::bigint = 1009::bigint) AND (datatypeid = 43)) Filter: (fieldid = 857) -> BitmapAnd (cost=2143.34..2143.34 rows=138 width=0) (actual time=144.394..144.394 rows=0 loops=1) -> Bitmap Index Scan on pxmdvalue_bigint_val_idx (cost=0.00..140.23 rows=1758 width=0) (actual time=0.021..0.021 rows=2 loops=1) Index Cond: ((value)::bigint = 1009::bigint) -> Bitmap Index Scan on pxmdvalue_datatypeid_idx (cost=0.00..2002.85 rows=351672 width=0) (actual time=144.127..144.127 rows=346445 loops=1) Index Cond: (datatypeid = 43) Total runtime: 144.469 ms (9 rows) px=# drop index pxmdvalue_datatypeid_idx; DROP INDEX px=# explain analyse select * from pxmdvalue where datatypeid = 43 and fieldid = 857 and cast(value as bigint) = '1009'; QUERY PLAN - Index Scan using pxmdvalue_bigint_val_idx on pxmdvalue (cost=0.00..6635.06 rows=1 width=245) (actual time=0.018..0.022 rows=1 loops=1) Index Cond: ((value)::bigint = 1009::bigint) Filter: (fieldid = 857) Total runtime: 0.053 ms (4 rows) Notice the two bitmap index scans in the first version of the query. The one that hits the pxmdvalue_bigint_val_idx actually subsumes the work of the second one, as it is a partial index on the same condition that the second bitmap scan is checking. So that second bitmap scan is a complete waste of time and effort, afaict. When I remove the pxmdvalue_datatypeid_idx index, to prevent it using that second bitmap scan, the resulting query is much faster, although its estimated cost is rather higher. Any clues, anyone? Is this indeed a limitation of the query planner, in that it doesn't realise that the partial index is all it needs here? Or is something else going on that is leading the cost estimation astray? Tim -- --- Tim Allen [EMAIL PROTECTED] Proximity Pty Ltd http://www.proximity.com.au/ ---(end of broadcast)--- TIP 6: explain analyze is your friend