Re: [PERFORM] n00b autovacuum question

2006-03-19 Thread Antoine
On 18/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote:
> Antoine wrote:
> > Hi,
> > I have enabled the autovacuum daemon, but occasionally still get a
> > message telling me I need to run vacuum when I access a table in
> > pgadmin.
>
> pgAdmin notices a discrepancy between real rowcount and estimated
> rowcount and thus suggests to run vacuum/analyze; it won't examine
> autovacuum rules so it might warn although autovac is running ok.
>
> If you're sure autovacuum is running fine, just dismiss the message.

I guess that is my problem - I a not sure it is running fine. The
process is definitely running but I am getting lots of complaints
about performance. This probably has lots to do with crap code and not
much to do with the database but I am still searching the maintenance
avenue... We have a massive project coming up and I want to go for
Postgres (the boss wants Oracle). If I can't get my stuff together I
am not sure my arguments will stick... problem is I don't really have
the time to experiment properly.
Cheers
Antoine

--
This is where I should put some witty comment.

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] partitioning

2006-03-19 Thread Antoine
Hi,
Is there any work on the cards for implementing other partitioning
strategies? I see mysql 5.1 will have support for hashes and stuff but
didn't see anything in the todos for postgres.
Cheers
Antoine
--
This is where I should put some witty comment.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] database model tshirt sizes

2006-03-19 Thread Patrick Hatcher
We have size and color in the product table itself.  It is really an
attribute of the product.  If you update the availability of the product
often, I would split out the quantity into a separate table so that you can
truncate and update as needed.

Patrick Hatcher
Development Manager  Analytics/MIO
Macys.com


   
 "NbForYou"
 <[EMAIL PROTECTED] 
 .com>  To 
 Sent by:
 pgsql-performance  cc 
 [EMAIL PROTECTED] 
 .org  Subject 
   [PERFORM] database model tshirt 
   sizes   
 03/18/06 07:03 AM 
   
   
   
   
   




Hello,

Does anybody know how to build a database model to include sizes for rings,
tshirts, etc?


the current database is built like:

table product
=

productid int8 PK
productname charvar(255)
quantity int4


what i want now is that WHEN (not all products have multiple sizes) there
are multiple sizes available. The sizes are stored into the database. I was
wondering to include a extra table:

table sizes:

productid int8 FK
size varchar(100)


but then i have a quantity problem. Because now not all size quantities can
be stored into this table, because it allready exist in my product table.

How do professionals do it? How do they make their model to include sizes
if any available?


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] database model tshirt sizes

2006-03-19 Thread me

another approach would be:

table product:

productid int8 PK
productname charvar(255)


table versions

productid int8 FK
versionid int8 PK
size
color
...
quantity int4


an example would be then:

table product:
- productid: 123, productname: 'nice cotton t-shirt'
- productid: 442, productname: 'another cotton t-shirt'

table versions:
- productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11
- productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1
- productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4
- productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9
- productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0


that way you can have more than 1 quantity / color / size combination per 
product and still have products that come in one size. so instead of only 
using a 2nd table for cases where more than one size is available, you would 
always use a 2nd table. this probably reduces your code complexity quite a 
bit and only needs 1 JOIN.


- thomas



- Original Message - 
From: "Patrick Hatcher" <[EMAIL PROTECTED]>

To: "NbForYou" <[EMAIL PROTECTED]>
Cc: ; 
<[EMAIL PROTECTED]>

Sent: Sunday, March 19, 2006 2:59 PM
Subject: Re: [PERFORM] database model tshirt sizes



We have size and color in the product table itself.  It is really an
attribute of the product.  If you update the availability of the product
often, I would split out the quantity into a separate table so that you 
can

truncate and update as needed.

Patrick Hatcher
Development Manager  Analytics/MIO
Macys.com



"NbForYou"
<[EMAIL PROTECTED]
.com>  To
Sent by:  
pgsql-performance  cc
[EMAIL PROTECTED]
.org  Subject
  [PERFORM] database model tshirt
  sizes
03/18/06 07:03 AM









Hello,

Does anybody know how to build a database model to include sizes for 
rings,

tshirts, etc?


the current database is built like:

table product
=

productid int8 PK
productname charvar(255)
quantity int4


what i want now is that WHEN (not all products have multiple sizes) there
are multiple sizes available. The sizes are stored into the database. I 
was

wondering to include a extra table:

table sizes:

productid int8 FK
size varchar(100)


but then i have a quantity problem. Because now not all size quantities 
can

be stored into this table, because it allready exist in my product table.

How do professionals do it? How do they make their model to include sizes
if any available?


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match





---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] n00b autovacuum question

2006-03-19 Thread Andreas Pflug

Antoine wrote:

On 18/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote:


Antoine wrote:


Hi,
I have enabled the autovacuum daemon, but occasionally still get a
message telling me I need to run vacuum when I access a table in
pgadmin.


Bring up the postgresql.conf editor on that server, and watch if pgadmin 
complains.


Regards,
Andreas

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] n00b autovacuum question

2006-03-19 Thread Antoine
On 19/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote:
> Antoine wrote:
> > On 18/03/06, Andreas Pflug <[EMAIL PROTECTED]> wrote:
> >
> >>Antoine wrote:
> >>
> >>>Hi,
> >>>I have enabled the autovacuum daemon, but occasionally still get a
> >>>message telling me I need to run vacuum when I access a table in
> >>>pgadmin.
>
> Bring up the postgresql.conf editor on that server, and watch if pgadmin
> complains.

Hi,
I am not sure I understand what "bring up" means. Could you explain?
Thanks
Antoine

--
This is where I should put some witty comment.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


pgsql-performance@postgresql.org

2006-03-19 Thread Luke Lonergan
Kenji,


On 3/17/06 4:08 PM, "Kenji Morishige" <[EMAIL PROTECTED]> wrote:

> Thanks guys, I'm studying each of your responses and am going to start to
> experiement.

I notice that no one asked you about your disk bandwidth - the Adaptec 2200S
is a "known bad" controller - the bandwidth to/from in RAID5 is about 1/2 to
1/3 of a single disk drive, which is far too slow for a 10GB database, and
IMO should disqualify a RAID adapter from being used at all.

Without fixing this, I'd suggest that all of the other tuning described here
will have little value, provided your working set is larger than your RAM.

You should test the I/O bandwidth using these simple tests:
  time bash -c "dd if=/dev/zero of=bigfile bs=8k count=100 && sync"

then:
  time dd if=bigfile of=/dev/null bs=8k

You should get on the order of 150MB/s on four disk drives in RAID5.

And before people jump in about "random I/O", etc, the sequential scan test
will show whether the controller is just plain bad very quickly.  If it
can't do sequential fast, it won't do seeks fast either.

- Luke



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] database model tshirt sizes

2006-03-19 Thread NbForYou

So a default value for all products would be size:"all"

for example, the same tshirt shop also sells cdroms.

It size attribute would be to place it to be :"all". (because we cannot 
place an uniqe index on null values)


But the industry evolves and so in time the same cdrom is now available for 
pc and playstation.


So i would like to have it as 1 productid but with different attributes: pc 
(with quantity 5) and playstation (with quantity 3).
So when I do an insert for this 2 products with 1 productid it would be 
like:


insert into versions (productid,size,quantity) values (345,'pc',5);
insert into versions (productid,size,quantity) values (345,'playstation',3);

if however the product existed we get an error:

because the default value version "all" did also exist and is now obsolete

population versions:


productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11
productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1
productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4
productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9
productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0
productid: 345, versionid: 3, color: null, size: 'all', quantity: 15
productid: 345, versionid: 3, color: null, size: 'pc', quantity: 5
productid: 345, versionid: 3, color: null, size: 'playstation', quantity: 3

WOULD HAVE TO BE:

population versions:


productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11
productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1
productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4
productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9
productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0
productid: 345, versionid: 3, color: null, size: 'pc', quantity: 5
productid: 345, versionid: 3, color: null, size: 'playstation', quantity: 3

ALSO:

what is versionid used for?


- Original Message - 
From: <[EMAIL PROTECTED]>

To: "NbForYou" <[EMAIL PROTECTED]>
Cc: ; 
<[EMAIL PROTECTED]>

Sent: Sunday, March 19, 2006 3:37 PM
Subject: Re: [PERFORM] database model tshirt sizes



another approach would be:

table product:

productid int8 PK
productname charvar(255)


table versions

productid int8 FK
versionid int8 PK
size
color
...
quantity int4


an example would be then:

table product:
- productid: 123, productname: 'nice cotton t-shirt'
- productid: 442, productname: 'another cotton t-shirt'

table versions:
- productid: 123, versionid: 1, color: 'black', size: 'all', quantity: 11
- productid: 442, versionid: 2, color: 'yellow', size: 'l', quantity: 1
- productid: 442, versionid: 2, color: 'yellow', size: 's', quantity: 4
- productid: 442, versionid: 2, color: 'red', size: 'xl', quantity: 9
- productid: 442, versionid: 2, color: 'blue', size: 's', quantity: 0


that way you can have more than 1 quantity / color / size combination per 
product and still have products that come in one size. so instead of only 
using a 2nd table for cases where more than one size is available, you 
would always use a 2nd table. this probably reduces your code complexity 
quite a bit and only needs 1 JOIN.


- thomas



- Original Message - 
From: "Patrick Hatcher" <[EMAIL PROTECTED]>

To: "NbForYou" <[EMAIL PROTECTED]>
Cc: ; 
<[EMAIL PROTECTED]>

Sent: Sunday, March 19, 2006 2:59 PM
Subject: Re: [PERFORM] database model tshirt sizes



We have size and color in the product table itself.  It is really an
attribute of the product.  If you update the availability of the product
often, I would split out the quantity into a separate table so that you 
can

truncate and update as needed.

Patrick Hatcher
Development Manager  Analytics/MIO
Macys.com



"NbForYou"
<[EMAIL PROTECTED]
.com>  To
Sent by:  
pgsql-performance  cc
[EMAIL PROTECTED]
.org  Subject
  [PERFORM] database model tshirt
  sizes
03/18/06 07:03 AM









Hello,

Does anybody know how to build a database model to include sizes for 
rings,

tshirts, etc?


the current database is built like:

table product
=

productid int8 PK
productname charvar(255)
quantity int4


what i want now is that WHEN (not all products have multiple sizes) there
are multiple sizes available. The sizes are stored into the database. I 
was

wondering to include a extra table:

table sizes:

productid int8 FK
size varchar(100)


but then i have a quantity problem. Because now not all size quantities 
can

be stored into this table, because it allready exist in my product table.

How do professionals do it? How do they make their model to include sizes
if any av

pgsql-performance@postgresql.org

2006-03-19 Thread Dave Page

-Original Message-
From: "Luke Lonergan"<[EMAIL PROTECTED]>
Sent: 19/03/06 16:26:58
To: "Kenji Morishige"<[EMAIL PROTECTED]>, "Claus Guttesen"<[EMAIL PROTECTED]>
Cc: "pgsql-performance@postgresql.org"
Subject: Re: [PERFORM] Best OS & Configuration for Dual Xeon w/4GB &

> I notice that no one asked you about your disk bandwidth - the Adaptec 2200S
>is a "known bad" controller - 

Agreed - We have a couple at work which got relagated to use in 'toy' boxes 
when we realised how bad they were, long before they ever saw any production 
use.

Regards, Dave

-Unmodified Original Message-
Kenji,


On 3/17/06 4:08 PM, "Kenji Morishige" <[EMAIL PROTECTED]> wrote:

> Thanks guys, I'm studying each of your responses and am going to start to
> experiement.

I notice that no one asked you about your disk bandwidth - the Adaptec 2200S
is a "known bad" controller - the bandwidth to/from in RAID5 is about 1/2 to
1/3 of a single disk drive, which is far too slow for a 10GB database, and
IMO should disqualify a RAID adapter from being used at all.

Without fixing this, I'd suggest that all of the other tuning described here
will have little value, provided your working set is larger than your RAM.

You should test the I/O bandwidth using these simple tests:
  time bash -c "dd if=/dev/zero of=bigfile bs=8k count=100 && sync"

then:
  time dd if=bigfile of=/dev/null bs=8k

You should get on the order of 150MB/s on four disk drives in RAID5.

And before people jump in about "random I/O", etc, the sequential scan test
will show whether the controller is just plain bad very quickly.  If it
can't do sequential fast, it won't do seeks fast either.

- Luke



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] partial indexes and inference

2006-03-19 Thread Tim Allen
I have a case where it seems the planner should be able to infer more 
from its partial indexes than it is doing. Observe:


px=# select version();
version

 PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2
(1 row)

px=# \d pxmdvalue
  Table "store.pxmdvalue"
   Column   |   Type   | Modifiers
+--+---
 entityid   | bigint   | not null
 fieldid| integer  | not null
 value  | text | not null
 datatypeid | integer  | not null
 tsi| tsvector |
Indexes:
"pxmdvalue_pk" PRIMARY KEY, btree (entityid, fieldid)
"pxmdvalue_atom_val_idx" btree (value) WHERE datatypeid = 22
"pxmdvalue_bigint_val_idx" btree ((value::bigint)) WHERE datatypeid 
= 43

"pxmdvalue_datatypeid_idx" btree (datatypeid)
"pxmdvalue_int_val_idx" btree ((value::integer)) WHERE datatypeid = 16
"pxmdvalue_str32_val0_idx" btree (lower(value)) WHERE datatypeid = 
2 AND octet_length(value) < 2700
"pxmdvalue_str32_val1_idx" btree (lower(value) text_pattern_ops) 
WHERE datatypeid = 2 AND octet_length(value) < 2700
"pxmdvalue_str_val0_idx" btree (lower(value)) WHERE datatypeid = 85 
AND octet_length(value) < 2700
"pxmdvalue_str_val1_idx" btree (lower(value) text_pattern_ops) 
WHERE datatypeid = 85 AND octet_length(value) < 2700
"pxmdvalue_time_val_idx" btree (px_text2timestamp(value)) WHERE 
datatypeid = 37


px=# explain analyse select * from pxmdvalue where datatypeid = 43 and 
fieldid = 857 and cast(value as bigint) = '1009';


QUERY PLAN

 Bitmap Heap Scan on pxmdvalue  (cost=2143.34..2685.74 rows=1 
width=245) (actual time=144.411..144.415 rows=1 loops=1)

   Recheck Cond: (((value)::bigint = 1009::bigint) AND (datatypeid = 43))
   Filter: (fieldid = 857)
   ->  BitmapAnd  (cost=2143.34..2143.34 rows=138 width=0) (actual 
time=144.394..144.394 rows=0 loops=1)
 ->  Bitmap Index Scan on pxmdvalue_bigint_val_idx 
(cost=0.00..140.23 rows=1758 width=0) (actual time=0.021..0.021 rows=2 
loops=1)

   Index Cond: ((value)::bigint = 1009::bigint)
 ->  Bitmap Index Scan on pxmdvalue_datatypeid_idx 
(cost=0.00..2002.85 rows=351672 width=0) (actual time=144.127..144.127 
rows=346445 loops=1)

   Index Cond: (datatypeid = 43)
 Total runtime: 144.469 ms
(9 rows)

px=# drop index pxmdvalue_datatypeid_idx;
DROP INDEX
px=# explain analyse select * from pxmdvalue where datatypeid = 43 and 
fieldid = 857 and cast(value as bigint) = '1009';

   QUERY PLAN
-
 Index Scan using pxmdvalue_bigint_val_idx on pxmdvalue 
(cost=0.00..6635.06 rows=1 width=245) (actual time=0.018..0.022 rows=1 
loops=1)

   Index Cond: ((value)::bigint = 1009::bigint)
   Filter: (fieldid = 857)
 Total runtime: 0.053 ms
(4 rows)



Notice the two bitmap index scans in the first version of the query. The 
one that hits the pxmdvalue_bigint_val_idx actually subsumes the work of 
the second one, as it is a partial index on the same condition that the 
second bitmap scan is checking. So that second bitmap scan is a complete 
waste of time and effort, afaict. When I remove the 
pxmdvalue_datatypeid_idx index, to prevent it using that second bitmap 
scan, the resulting query is much faster, although its estimated cost is 
rather higher.


Any clues, anyone? Is this indeed a limitation of the query planner, in 
that it doesn't realise that the partial index is all it needs here? Or 
is something else going on that is leading the cost estimation astray?


Tim

--
---
Tim Allen  [EMAIL PROTECTED]
Proximity Pty Ltd  http://www.proximity.com.au/

---(end of broadcast)---
TIP 6: explain analyze is your friend