Hi all,

We are operating a 1.5GB postgresql database for a year and we have problems for nearly a month. Usually everything is OK with the database, queries are executed fast even if they are complicated but sometimes and for half an hour, we have a general slow down.


The server is a dedicated quad xeon with 4GB and a RAID1 array for the
system and a RAID10 one for postgresql data. We have very few
updates/inserts/deletes during the day.

Postgresql version is 7.4.8.

- the database is vacuumed, analyzed regularly (but we are not using
autovacuum) and usually performs pretty well ;
- IOs are OK, the database is entirely in RAM (see top.txt and
iostat.txt attached) ;
- CPUs are usually 25% idle, load is never really growing and its
maximum is below 5 ;
- I attached two plans for a simple query, the one is what we have when
the server is fast, the other when we have a slow down: it's exactly the
same plan but, as you can see it, the time to fetch the first row from
indexes is quite high for the slow query ;
- during this slow down, queries that usually take 500ms can take up to
60 seconds (and sometimes even more) ;
- we have up to 130 permanent connections from our apache servers during
this slow down as we have a lot of apache children waiting for a response.

I attached a vmstat output. Context switches are quite high but I'm not
sure it can be called a context switch storm and that this is the cause
of our problem.

Thanks for any advice or idea to help us understanding this problem and hopefully solve it.

Regards,

--
Guillaume
[EMAIL PROTECTED] root]# iostat 10

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               7.20         0.00        92.00          0        920
sda1              0.00         0.00         0.00          0          0
sda2              6.40         0.00        78.40          0        784
sda3              0.00         0.00         0.00          0          0
sda4              0.00         0.00         0.00          0          0
sda5              0.00         0.00         0.00          0          0
sda6              0.80         0.00        13.60          0        136
sdb               5.00         0.00       165.60          0       1656
sdb1              5.00         0.00       165.60          0       1656

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.30         0.00        20.80          0        208
sda1              0.00         0.00         0.00          0          0
sda2              0.70         0.00         9.60          0         96
sda3              0.00         0.00         0.00          0          0
sda4              0.00         0.00         0.00          0          0
sda5              0.00         0.00         0.00          0          0
sda6              0.60         0.00        11.20          0        112
sdb               5.40         0.00       173.60          0       1736
sdb1              5.40         0.00       173.60          0       1736

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.20         0.00        28.00          0        280
sda1              0.00         0.00         0.00          0          0
sda2              2.20         0.00        28.00          0        280
sda3              0.00         0.00         0.00          0          0
sda4              0.00         0.00         0.00          0          0
sda5              0.00         0.00         0.00          0          0
sda6              0.00         0.00         0.00          0          0
sdb               5.20         0.00       171.20          0       1712
sdb1              5.20         0.00       171.20          0       1712

                                                            QUERY PLAN          
                                                  
----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..13.52 rows=2 width=1119) (actual time=0.154..0.167 
rows=1 loops=1)
   ->  Index Scan using pk_newslang on newslang nl  (cost=0.00..3.87 rows=1 
width=1004) (actual time=0.053..0.055 rows=1 loops=1)
         Index Cond: (((codelang)::text = 'FRA'::text) AND (3498704 = numnews))
   ->  Nested Loop Left Join  (cost=0.00..9.61 rows=2 width=119) (actual 
time=0.050..0.059 rows=1 loops=1)
         ->  Index Scan using pk_news on news n  (cost=0.00..3.31 rows=2 
width=98) (actual time=0.021..0.023 rows=1 loops=1)
               Index Cond: (numnews = 3498704)
         ->  Index Scan using pk_photo on photo p  (cost=0.00..3.14 rows=1 
width=25) (actual time=0.021..0.025 rows=1 loops=1)
               Index Cond: (p.numphoto = "outer".numphoto)
 Total runtime: 0.320 ms        
(9 rows)

                                                             QUERY PLAN         
                                                    
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..13.52 rows=2 width=1119) (actual 
time=155.286..155.305 rows=1 loops=1)
   ->  Index Scan using pk_newslang on newslang nl  (cost=0.00..3.87 rows=1 
width=1004) (actual time=44.575..44.579 rows=1 loops=1)
         Index Cond: (((codelang)::text = 'FRA'::text) AND (3498704 = numnews))
   ->  Nested Loop Left Join  (cost=0.00..9.61 rows=2 width=119) (actual 
time=110.648..110.660 rows=1 loops=1)
         ->  Index Scan using pk_news on news n  (cost=0.00..3.31 rows=2 
width=98) (actual time=0.169..0.174 rows=1 loops=1)
               Index Cond: (numnews = 3498704)
         ->  Index Scan using pk_photo on photo p  (cost=0.00..3.14 rows=1 
width=25) (actual time=110.451..110.454 rows=1 loops=1)
               Index Cond: (p.numphoto = "outer".numphoto)
 Total runtime: 155.514 ms
(9 rows)


 17:08:41  up 19 days, 15:16,  1 user,  load average: 4.03, 4.26, 4.36
288 processes: 285 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   59.0%    0.0%    8.8%   0.2%     0.0%    0.0%   31.9%
           cpu00   52.3%    0.0%   13.3%   0.9%     0.0%    0.0%   33.3%
           cpu01   65.7%    0.0%    7.6%   0.0%     0.0%    0.0%   26.6%
           cpu02   58.0%    0.0%    7.6%   0.0%     0.0%    0.0%   34.2%
           cpu03   60.0%    0.0%    6.6%   0.0%     0.0%    0.0%   33.3%
Mem:  3857224k av, 3495880k used,  361344k free,       0k shrd,   92160k buff
                   2374048k actv,  463576k in_d,   37708k in_c
Swap: 4281272k av,   25412k used, 4255860k free                 2173392k cached

[EMAIL PROTECTED] root]# vmstat 10
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 4  0  25412 250160  91736 2170944    0    1     3     3    4     3  5  3  3  0
 7  0  25412 247160  91736 2171024    0    0     0   116  588  4483 65  4 31  0
 3  0  25412 235456  91752 2171132    0    0     0   129  491  3670 70  4 26  0
 5  0  25412 233696  91760 2171216    0    0     0   152  530  4768 61  4 34  0
 5  0  25412 233248  91768 2171232    0    0     0   183  624  5379 59  5 36  0
 9  0  25412 195332  91788 2171304    0    0     0   127  541  4811 58  5 37  0

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Reply via email to