subject:"\[PERFORM\] CPU 0.1% IOWAIT 99% for decisonnal queries"

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Patrick Vedrines





Hello Gustavo,

Your question seems to say that you suspect a disk 
issue, and a few hours later, Simon told me "Sounds like your 
disks/layout/something is pretty sick".
To be clear in my own mind about it, I've just 
copyed (time cp) the "aggregate" table files (4 Gb) from one disk to an another 
one: it takes 22 mn !(3 Mb/s).
That seems to demonstrate that Postgres is not the 
cause of this issue. 
I've just untrusted to my system engineerthe 
analysis of my disks...

In case I would have to change my disks, do you 
have any performance figures related tothe types you mentionned (reiserfs 
vs ext3) ?
I don't use RAID since the security is not a 
concern.

Thank a lot for your help !

Patrick

  Hi 
  Patrick, How is configured your disk array? Do you 
  have a Perc 4?Tip: Use reiserfs instead ext3, raid 0+1 and deadline 
  I/O scheduler in kernel linux 2.6

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Patrick Vedrines






Hello Simon,

 Sounds like your disks/layout/something is 
pretty sick. You don't mention I/O bandwidth, controller or RAID, so you 
should look more into those topics.Well seen ! (as we say in 
France).
As I said to Gustavo, your last suspicion took me 
into a simple disk test: I've just copyed (time cp) the "aggregate" table files 
(4 Gb) from one disk to an another one: it takes 22 mn !(3 Mb/s).
That seems to demonstrate that Postgres is not the 
cause of this issue. 
I've just untrusted to my system engineerthe 
analysis of my disks...

 Setting shared_buffers that high will do you 
no good at all, as Richard suggests.I (and my own tests) agree my 
you.
 You've got 1.5Gb of shared_buffers and 
 2Gb data. In 8.0, the scan will hardly use the cache at all, nor 
will it ever, since the data is bigger than the cache. Notably, the scan 
of B should NOT spoil the cache for AAre you sure of that ? Is Postgres able 
to say to OS: "don't use the cache for this query"?
 Priming the cache is quite hard...but not 
impossible. What will kill you on a shared_buffers that big is the 
bgwriter, which you should turn off by setting bgwriter_maxpages = 
0
Is bgwriter concerned as my application applyes 
only SELECT ?
  On the other hand...just go for 
more RAM, as you suggest...but you should create a RAMdisk, rather than 
use too large shared_buffersthat way your data is always in RAM, 
rather than maybe in RAM.I am not an Linux expert: Is it possible 
(and how) to create a RAMdisk ?



Thank a lot for your help !

Patrick

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Patrick Vedrines

Hello Richard,

 Perhaps look into clustering the tables.
Good idea : I will try to go further into this way.

  There is no index on the aggregate table since the criterias, their
  number and their scope are freely choosen by the customers.

 Hmm... not convinced this is a good idea.
Long days ago, when my application used Informix, I've try to index the
aggregate table: It was a nightmare to manage all these indexes (and their
volume) for a uncertain benefit.

 If you don't have any indexes and the table isn't clustered then PG has
 no choice but to scan the entire table for every query. As you note,
 that's going to destroy your cache. You can increase the RAM but sooner
 or later, you'll get the same problem.
I agree with you : You remarks take me not to rely to the cache features.

 3. Split your tables into two - common fields, uncommon fields, that way
 filtering on the common fields might take less space.
 4. Split your tables by date, one table per month or year. Then re-write
 your customers' queries on-the-fly to select from the right table. Will
 only help with queries on date of course.
That forces me to rewrite my query generator which is already a very complex
program (in fact the heart of the system)

 5. Place each database on its own machine or virtual machine so they
 don't interfere with each other.
I'm afraid I don't have the money for that. As Simon and Gustavo suggested,
I will check my SCSI disks first.


Thank a lot for your advises !

Amicalement,

Patrick



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Gustavo Franklin Nóbrega - Planae





Good day Patrick!
 
 I can help you to design you disk layout for better perform and
security. Please, tell me how many disks (and some specs, like capacity
and RPM).

 If you want to know more, there is a very interesting article abou
benckmark filesystem ( http://linuxgazette.net/102/piszcz.html ). In
this article, ReiserFS 3.6, JFS and XFS are in the same level at top
depending your application, and ext3 is more slow than others. I
believe that version 4 of the ReiserFS is better that version 3.6, but
I could not still test it.

 Raid0 (striping, more at
http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel0-c.html) or
Raid 0+1 (stripping + mirror, more at
http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html) is
very insteresting to postgresql. Raid0 provides only performance, and
Raid 0+1 provides performance and security. Take a look at this
articles and think about to use Raid
(http://www.pcguide.com/ref/hdd/perf/raid/).

 I'm glad to help. Best regards!
Atenciosamente,

Gustavo Franklin Nbrega
Infraestrutura e Banco de Dados
Planae Tecnologia da Informao
(+55) 14 3106-3514
http://www.planae.com.br




Patrick Vedrines wrote:

  
  
  
  
  
  Hello Gustavo,
  
  Your question seems to say that you
suspect a disk issue, and a few hours later, Simon told me "Sounds like
your disks/layout/something is pretty sick".
  To be clear in my own mind about it,
I've just copyed (time cp) the "aggregate" table files (4 Gb) from one
disk to an another one: it takes 22 mn !(3 Mb/s).
  That seems to demonstrate that
Postgres is not the cause of this issue. 
  I've just untrusted to my system
engineerthe analysis of my disks...
  
  In case I would have to change my
disks, do you have any performance figures related tothe types you
mentionned (reiserfs vs ext3) ?
  I don't use RAID since the security
is not a concern.
  
  Thank a lot for your help !
  
  Patrick
  
  

Hi Patrick,

 How is configured your disk array? Do you have a Perc 4?

Tip: Use reiserfs instead ext3, raid 0+1 and deadline I/O scheduler in
kernel linux 2.6

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Patrick Vedrines

Great !

I'm not an expert but as far as I know, my 15
databases are spread over 4 SCSI RAID disks 73 GB 10K RPM mounted under ext3
mode.
I remember that they where provided by DELL under
RAID5 and I asked my system engineer forswitching them to standard SCSI
because I don't care about security but only about speed and capacity ( maybe
this switch was not set properly at this time...).

Thank you for these interesting links: I 've sent
them to my system engineer with my two hands !

Amicalement

Patrick

- Original Message -
From:
Gustavo
Franklin Nóbrega - Planae
To: Patrick Vedrines
Cc: performance pgsql
Sent: Thursday, March 24, 2005 12:52
PM
Subject: Re: [PERFORM] CPU 0.1% IOWAIT
99% for decisonnal queries
Good day Patrick!
I can help you to design you disk layout for better perform and security.
Please, tell me how many disks (and some specs, like capacity and
RPM). If you want to know more, there is a very
interesting article abou benckmark filesystem ( http://linuxgazette.net/102/piszcz.html
). In this article, ReiserFS 3.6, JFS and XFS are in the same level at top
depending your application, and ext3 is more slow than others. I believe that
version 4 of the ReiserFS is better that version 3.6, but I could not still
test it. Raid0 (striping, more at http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel0-c.html)
or Raid 0+1 (stripping + mirror, more at http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html)
is very insteresting to postgresql. Raid0 provides only performance, and Raid
0+1 provides performance and security. Take a look at this articles and think
about to use Raid (http://www.pcguide.com/ref/hdd/perf/raid/).
I'm glad to help. Best regards!Atenciosamente,

Gustavo Franklin Nóbrega
Infraestrutura e Banco de Dados
Planae Tecnologia da Informação
(+55) 14 3106-3514
http://www.planae.com.br

Patrick Vedrines wrote:

Hello Gustavo,

Your question seems to say that you suspect a
disk issue, and a few hours later, Simon told me "Sounds like your
disks/layout/something is pretty sick".
To be clear in my own mind about it, I've just
copyed (time cp) the "aggregate" table files (4 Gb) from one disk to an
another one: it takes 22 mn !(3 Mb/s).
That seems to demonstrate that Postgres is not
the cause of this issue.
I've just untrusted to my system
engineerthe analysis of my disks...

In case I would have to change my disks, do you
have any performance figures related tothe types you mentionned
(reiserfs vs ext3) ?
I don't use RAID since the security is not a
concern.

Thank a lot for your help !

Patrick

Hi
Patrick, How is configured your disk array? Do
you have a Perc 4?Tip: Use reiserfs instead ext3, raid 0+1 and
deadline I/O scheduler in kernel linux 2.6

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Gustavo Franklin Nóbrega - Planae

Hi!

I have a Dell PowerEdge 2600 with a Perc 4/DI and 4 scsi disks
35GB. I have made a array raid 0+1 with 4 disks, because is mission
critical application. But, for your, you can configure a raid0, thats
is faster than raid5 for 4 disks.

Ask to your system enginner what is distribution of Linux is used,
kernel version and if he does a kernel tuning for your hardware and
application.

Best regards.
Atenciosamente,

Gustavo Franklin Nbrega
Infraestrutura e Banco de Dados
Planae Tecnologia da Informao
(+55) 14 3106-3514
http://www.planae.com.br

Patrick Vedrines wrote:

Great !

I'm not an expert but as far as I
know, my 15 databases are spread over 4 SCSI RAID disks 73 GB 10K RPM
mounted under ext3 mode.
I remember that they where provided
by DELL under RAID5 and I asked my system engineer forswitching them
to standard SCSI because I don't care about security but only about
speed and capacity ( maybe this switch was not set properly at this
time...).

Thank you for these interesting
links: I 've sent them to my system engineer with my two hands !

Amicalement

Patrick

-
Original Message -
From:
Gustavo Franklin Nbrega - Planae

To:
Patrick Vedrines
Cc:
performance pgsql
Sent:
Thursday, March 24, 2005 12:52 PM
Subject:
Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Good day Patrick!

I can help you to design you disk layout for better perform and
security. Please, tell me how many disks (and some specs, like capacity
and RPM).

If you want to know more, there is a very interesting article abou
benckmark filesystem ( http://linuxgazette.net/102/piszcz.html
). In this article, ReiserFS 3.6, JFS and XFS are in the same level at
top depending your application, and ext3 is more slow than others. I
believe that version 4 of the ReiserFS is better that version 3.6, but
I could not still test it.

Raid0 (striping, more at http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel0-c.html)
or Raid 0+1 (stripping + mirror, more at http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html)
is very insteresting to postgresql. Raid0 provides only performance,
and Raid 0+1 provides performance and security. Take a look at this
articles and think about to use Raid (http://www.pcguide.com/ref/hdd/perf/raid/).

I'm glad to help. Best regards!
Atenciosamente,

Gustavo Franklin Nbrega
Infraestrutura e Banco de Dados
Planae Tecnologia da Informao
(+55) 14 3106-3514
http://www.planae.com.br

Patrick Vedrines wrote:

Hello Gustavo,

Your question seems to say that
you suspect a disk issue, and a few hours later, Simon told me "Sounds
like your disks/layout/something is pretty sick".
To be clear in my own mind about
it, I've just copyed (time cp) the "aggregate" table files (4 Gb) from
one disk to an another one: it takes 22 mn !(3 Mb/s).
That seems to demonstrate that
Postgres is not the cause of this issue.
I've just untrusted to my system
engineerthe analysis of my disks...

In case I would have to change
my disks, do you have any performance figures related tothe types you
mentionned (reiserfs vs ext3) ?
I don't use RAID since the
security is not a concern.

Thank a lot for your help !

Patrick

Hi Patrick,

How is configured your disk array? Do you have a Perc 4?

Tip: Use reiserfs instead ext3, raid 0+1 and deadline I/O scheduler in
kernel linux 2.6

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-24 Thread Simon Riggs

On Thu, 2005-03-24 at 10:48 +0100, Patrick Vedrines wrote:
  You've got 1.5Gb of shared_buffers and  2Gb data. In 8.0, the scan
 will
  hardly use the cache at all, nor will it ever, since the data is
 bigger
  than the cache. Notably, the scan of B should NOT spoil the cache
 for A
 Are you sure of that ? Is Postgres able to say to OS: don't use the
 cache for this query?

PostgreSQL 8.0 has the ARC algorithm which prevents cache spoiling of
the shared_buffers, but has no direct influence over the OS cache.

  Priming the cache is quite hard...but not impossible.
  What will kill you on a shared_buffers that big is the bgwriter,
 which
  you should turn off by setting bgwriter_maxpages = 0
 Is bgwriter concerned as my application applyes only SELECT ?

With very large shared_buffers the bgwriter's default settings are a
problem. You don't need it, so I suggest turning it off.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

[PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-22 Thread Patrick Vedrines




Hi everyone,

I'm developping a web decisonnal application based 
on 
-Red Hat 3 ES
-Postgresql 8.0.1 
-Dell poweredge 2850, Ram 2Gb, 2 procs, 3 Ghz, 1Mb 
cache and 4 disks ext3 10,000 r/mn
I am alone in the box and there is not any 
crontab.

I have 2 databases (A and B) with exactly the same 
schemas: 
-one main table called "aggregate" having no 
indexes and supporting only SELECT statements (loaded one time a month with a 
new bundle of datas). Row size # 200 bytes (50 columns of type char(x) or 
integer)
-and several small 'reference' tables not shown by 
the following example for clarity reasons.
-Database A : aggregate contains 2,300,000 records 
( 500 Mb)

-DatabaseB : aggregate contains 9,000,000 
records ( 2 Gb)

There is no index on the aggregate table since the 
criterias, their number and their scope are freely choosen by the 
customers.

The query :
  select 
sum(ca)   from 
aggregate   
where (issue_date = '2004-01' and issue_date = '2004-02' 
);
takes 5s on database A ( 5mn30s* the first time, 
probably to fill the cache) 
and 21mn* on 
database B (whatever it is the first time or not).

explain shows sequential scan of 
course:
Aggregate 
(cost=655711.85..655711.85 rows=1 width=4) - Seq Scan 
on "aggregate" (cost=0.00..647411.70 rows=3320060 
width=4) Filter: 
((issue_date = '2004-01'::bpchar) AND (issue_date = 
'2004-02'::bpchar))

*Here is the 'top' display for these response 
times:
91 processes: 90 sleeping, 1 running, 0 zombie, 0 
stoppedCPU states: cpu user 
nice system irq softirq 
iowait 
idle 
total 0,0% 0,0% 
0,2% 0,1% 0,0% 48,6% 
51,0% 
cpu00 0,0% 0,0% 
0,0% 0,0% 0,0% 0,0% 
100,0% 
cpu01 0,0% 0,0% 
1,0% 0,0% 0,0% 99,0% 
0,0% 
cpu02 0,0% 0,0% 
0,0% 0,5% 0,0% 
0,0% 
99,5% 
cpu03 0,0% 0,0% 
0,0% 0,0% 0,0% 95,5% 4,5%Mem: 
2061424k av, 2043944k used, 17480k 
free, 0k shrd, 6104k 
buff 
1551692k actv, 172496k in_d, 30452k in_cSwap: 2096440k 
av, 0k used, 2096440k 
free 
1792852k cached
 PID USER PRI 
NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU 
COMMAND21983 postgres 20 0 9312 9312 8272 
D 0,2 
0,4 0:00 1 postmaster 1 
root 15 0 488 
488 432 S 0,0 0,0 
0:06 2 init 2 
root RT 0 
0 0 0 SW 0,0 
0,0 0:00 0 migration/0

For the 5s response time, the 'top' command shows 
0% iowait and 25% cpu.


- I guess this is a cache issue but how can I 
manage/control it ? 
Is Postgres managing it's own cache or does it use the OS cache ?

- Is using the cache is a good approach? 
It does not seem to work for large databases : I tryed several different 
values for postgres.conf and /proc/sys/kernel/shmmax without detecting any 
response time enhancement (For example : shared_buffers = 19, sort_mem 
= 4096, effective_cache_size = 37000 andkernel/shmmax=12 
)
Do I have to upgrade the RAM to 6Gb or/and buy faster HD(of what 
type?) ?
Moreover, a query on database B will destroy the cache previously build for 
database A, increasing the response time for the next query on database A. And I 
have in fact 15 databases !


- In my case, what should be the best parameters combination between 
postgres.conf and /proc/sys/kernel/shmmax ?

- is there a way to reduce the size of the "aggregate" table files (1Gb + 
1Gb + 1 Gb + 0.8Gb = 3.8Gb for the "aggregate" table instead of 2Gb = 200 * 
9,000,000 records) by playing with the data types or others parameters 
(fillfactor ?). 
Vacuum (even full) seems to be useless since the aggregate table supports 
only 'copy aggregate from' and 'select'.

-is it possible to define a sort of RAM filesystem (as itexists 
in DOS/Windows) which I could create and populate my databases into 
?...since the databases does not support updates for this 
application.

Sorry for my naive questions and my poor english but any help or advise 
will be greatly appreciated !

Patrick Vedrines

PS (maybe of interest for some users like me) : 
I created a partition on a new similar disk but on the last cylinders (near 
the periphery) and copied the database B into it: the response timeis 25% 
faster (i.e. 15mn instead of 21mn). But 15 mn is still too long for my customers 
(5 mn would be nice).

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-22 Thread Gustavo F Nobrega - Planae





Hi Patrick,

 How is configured your disk array? Do you have a Perc 4?

Tip: Use reiserfs instead ext3, raid 0+1 and deadline I/O scheduler in
kernel linux 2.6
Atenciosamente,

Gustavo Franklin Nbrega
Infra-Estrutura e Banco de Dados
Planae Tecnologia da Informao
(+55) 14 3224-3066 Ramal 209
www.planae.com.br



Patrick Vedrines wrote:

  
  
  
  Hi everyone,
  
  I'm developping a web decisonnal
application based on 
  -Red Hat 3 ES
  -Postgresql 8.0.1 
  -Dell poweredge 2850, Ram 2Gb, 2
procs, 3 Ghz, 1Mb cache and 4 disks ext3 10,000 r/mn
  I am alone in the box and there is
not any crontab.
  
  I have 2 databases (A and B) with
exactly the same schemas: 
  -one main table called "aggregate"
having no indexes and supporting only SELECT statements (loaded one
time a month with a new bundle of datas). Row size # 200 bytes (50
columns of type char(x) or integer)
  -and several small 'reference'
tables not shown by the following example for clarity reasons.
  -Database A : aggregate contains
2,300,000 records ( 500 Mb)
  
  -DatabaseB : aggregate contains
9,000,000 records ( 2 Gb)
  
  
  There is no index on the aggregate
table since the criterias, their number and their scope are freely
choosen by the customers.
  
  The query :
select sum(ca) 
  from aggregate 
  where (issue_date = '2004-01' and issue_date =
'2004-02' );
  takes 5s on database A ( 5mn30s* the
first time, probably to fill the cache) 
  and 21mn* on database B (whatever it is the first time or not).
  
  explain shows sequential scan of
course:
  Aggregate
(cost=655711.85..655711.85 rows=1 width=4)
 - Seq Scan on "aggregate" (cost=0.00..647411.70 rows=3320060
width=4)
 Filter: ((issue_date = '2004-01'::bpchar) AND (issue_date
= '2004-02'::bpchar))
  
  *Here is the 'top' display for these
response times:
  91 processes: 90 sleeping, 1
running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
 total 0,0% 0,0% 0,2% 0,1% 0,0% 48,6% 51,0%
 cpu00 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0%
 cpu01 0,0% 0,0% 1,0% 0,0% 0,0% 99,0% 0,0%
 cpu02 0,0% 0,0% 0,0% 0,5% 0,0% 0,0% 99,5%
 cpu03 0,0% 0,0% 0,0% 0,0% 0,0% 95,5% 4,5%
Mem: 2061424k av, 2043944k used, 17480k free, 0k shrd,
6104k buff
 1551692k actv, 172496k in_d, 30452k in_c
Swap: 2096440k av, 0k used, 2096440k free
1792852k cached
   PID USER PRI NI SIZE RSS
SHARE STAT %CPU %MEM TIME CPU COMMAND
21983 postgres 20 0 9312 9312 8272 D 0,2 0,4 0:00 1 postmaster
 1 root 15 0 488 488 432 S 0,0 0,0 0:06 2 init
 2 root RT 0 0 0 0 SW 0,0 0,0 0:00 0
migration/0
  
  
  For the 5s response time, the 'top'
command shows 0% iowait and 25% cpu.
  
  
  - I guess this is a cache issue but
how can I manage/control it ? 
  Is Postgres managing it's own cache or does it use the OS cache ?
  
  - Is using the cache is a good approach? 
  It does not seem to work for large databases : I tryed several
different values for postgres.conf and /proc/sys/kernel/shmmax without
detecting any response time enhancement (For example : shared_buffers =
19, sort_mem = 4096, effective_cache_size = 37000
andkernel/shmmax=12 )
  Do I have to upgrade the RAM to 6Gb or/and buy faster HD(of
what type?) ?
  Moreover, a query on database B will destroy the cache
previously build for database A, increasing the response time for the
next query on database A. And I have in fact 15 databases !
  
  
  - In my case, what should be the best parameters combination
between postgres.conf and /proc/sys/kernel/shmmax ?
  
  
  - is there a way to reduce the size of the "aggregate" table
files (1Gb + 1Gb + 1 Gb + 0.8Gb = 3.8Gb for the "aggregate" table
instead of 2Gb = 200 * 9,000,000 records) by playing with the data
types or others parameters (fillfactor ?). 
  Vacuum (even full) seems to be useless since the aggregate table
supports only 'copy aggregate from' and 'select'.
  
  -is it possible to define a sort of RAM filesystem (as
itexists in DOS/Windows) which I could create and populate my
databases into ?...since the databases does not support updates for
this application.
  
  Sorry for my naive questions and my poor english but any help or
advise will be greatly appreciated !
  
  Patrick Vedrines
  
  PS (maybe of interest for some users like me) : 
  I created a partition on a new similar disk but on the last
cylinders (near the periphery) and copied the database B into it: the
response timeis 25% faster (i.e. 15mn instead of 21mn). But 15 mn is
still too long for my customers (5 mn would be nice).

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-22 Thread Richard Huxton

Patrick Vedrines wrote:
Hi everyone,
I'm developping a web decisonnal application based on -Red Hat 3 ES 
-Postgresql 8.0.1 -Dell poweredge 2850, Ram 2Gb, 2 procs, 3 Ghz, 1Mb
cache and 4 disks ext3 10,000 r/mn I am alone in the box and there is
not any crontab.

I have 2 databases (A and B) with exactly the same schemas: -one main
table called aggregate having no indexes and supporting only SELECT
statements (loaded one time a month with a new bundle of datas).
Perhaps look into clustering the tables.
 Row
size # 200 bytes (50 columns of type char(x) or integer) -and several
small 'reference' tables not shown by the following example for
clarity reasons. -Database A : aggregate contains 2,300,000 records (
500 Mb) -Database B : aggregate contains 9,000,000 records ( 2 Gb)
There is no index on the aggregate table since the criterias, their
number and their scope are freely choosen by the customers.
Hmm... not convinced this is a good idea.
The query : select  sum(ca) from aggregate where  (issue_date =
'2004-01' and issue_date = '2004-02' ); takes 5s on database A (
5mn30s* the first time, probably to fill the cache) and  21mn* on
database B (whatever it is the first time or not).
Because A fits in the cache and B doesn't.
- I guess this is a cache issue but how can I manage/control it ? Is
Postgres managing it's own cache or does it use the OS cache ?
Both
- Is using the cache is a good approach? It does not seem to work for
large databases : I tryed several different values for postgres.conf
and /proc/sys/kernel/shmmax without detecting any response time
enhancement (For example : shared_buffers = 19 , sort_mem = 4096
, effective_cache_size = 37000 and kernel/shmmax=12 ) Do I
have to upgrade the RAM to 6Gb or/and buy faster HD (of what type?) ?
 Moreover, a query on database B will destroy the cache previously
build for database A, increasing the response time for the next query
on database A. And I have in fact 15 databases !
If you don't have any indexes and the table isn't clustered then PG has 
no choice but to scan the entire table for every query. As you note, 
that's going to destroy your cache. You can increase the RAM but sooner 
or later, you'll get the same problem.

- In my case, what should be the best parameters combination between
postgres.conf and /proc/sys/kernel/shmmax ?
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
http://www.powerpostgresql.com/PerfList
- is there a way to reduce the size of the aggregate table files
(1Gb + 1Gb + 1 Gb + 0.8Gb = 3.8Gb for the aggregate table instead
of 2Gb = 200 * 9,000,000 records) by playing with the data types or
others parameters (fillfactor ?). Vacuum (even full) seems to be
useless since the aggregate table supports only 'copy aggregate from'
and 'select'.
You can replace int4 with int2 and so on (where possible) but that will 
only delay problems.

- is it possible to define a sort of RAM filesystem (as it exists in
DOS/Windows) which I could create and populate my databases into ?
...since the databases does not support updates for this application.
Won't help - your cache is already doing that. Some things you can do 
(in order of effort)

1. Cluster the large tables
2. Analyse your customers' queries and try a couple of indexes - some 
choices will be more common than others.
3. Split your tables into two - common fields, uncommon fields, that way 
filtering on the common fields might take less space.
4. Split your tables by date, one table per month or year. Then re-write 
your customers' queries on-the-fly to select from the right table. Will 
only help with queries on date of course.
5. Place each database on its own machine or virtual machine so they 
don't interfere with each other.

I'd start with items 1,2 and see if that helps though.
PS - it might make sense to have an unusually large shared_mem for PG, 
but I'm not familiar enough with the changes in the cache handling in 
8.0 to say for sure.
PPS - there are more changes coming for 8.1, but I know even less about 
those.

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

2005-03-22 Thread Simon Riggs

On Tue, 2005-03-22 at 19:08 +0100, Patrick Vedrines wrote:
 I have 2 databases (A and B) with exactly the same schemas: 
 -one main table called aggregate having no indexes and supporting
 only SELECT statements (loaded one time a month with a new bundle of
 datas). Row size # 200 bytes (50 columns of type char(x) or integer) 
 -and several small 'reference' tables not shown by the following
 example for clarity reasons.
 -Database A : aggregate contains 2,300,000 records ( 500 Mb)
 -Database B : aggregate contains 9,000,000 records ( 2 Gb)

 (For example : shared_buffers = 19 , sort_mem = 4096 ,
 effective_cache_size = 37000 and kernel/shmmax=12 )
 Do I have to upgrade the RAM to 6Gb or/and buy faster HD (of what
 type?) ?

Setting shared_buffers that high will do you no good at all, as Richard
suggests.

You've got 1.5Gb of shared_buffers and  2Gb data. In 8.0, the scan will
hardly use the cache at all, nor will it ever, since the data is bigger
than the cache. Notably, the scan of B should NOT spoil the cache for
A...

Priming the cache is quite hard...but not impossible.

What will kill you on a shared_buffers that big is the bgwriter, which
you should turn off by setting bgwriter_maxpages = 0

 PS (maybe of interest for some users like me) : 
 I created a partition on a new similar disk but on the last cylinders
 (near the periphery) and copied the database B into it: the response
 time is 25% faster (i.e. 15mn instead of 21mn). But 15 mn is still too
 long for my customers (5 mn would be nice).

Sounds like your disks/layout/something is pretty sick. You don't
mention I/O bandwidth, controller or RAID, so you should look more into
those topics.

On the other hand...just go for more RAM, as you suggest...but you
should create a RAMdisk, rather than use too large
shared_buffersthat way your data is always in RAM, rather than maybe
in RAM.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

[PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

Re: [PERFORM] CPU 0.1% IOWAIT 99% for decisonnal queries

11 matches

Site Navigation

Mail list logo

Footer information