Re: [PERFORM] Large number of tables slow insert

2008-08-26 Thread Matthew Wakeling

On Sat, 23 Aug 2008, Loic Petit wrote:

> I use PostgreSQL 8.3.1-1 to store a lot of data coming from a large
> number of sensors. In order to have good performance when querying by
> timestamp on each sensor, I partitioned my measures table per sensor.
> Thus I created a lot of tables.


As far as I can see, you are having performance problems as a direct 
result of this design decision, so it may be wise to reconsider. If you 
have an index on both the sensor identifier and the timestamp, it should 
perform reasonably well. It would scale a lot better with thousands of 
sensors too.
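
For the single-table approach, a minimal sketch (table and column names
here are invented for illustration, not taken from your post):

-- one measures table for all sensors, with a composite index
create table measures (
    sensor_id integer     not null,
    ts        timestamptz not null,
    value     double precision
);
create index measures_sensor_ts_idx on measures (sensor_id, ts);

-- a per-sensor, per-time-range query can then use a single index scan:
select * from measures
where sensor_id = 42
  and ts >= '2008-08-01' and ts < '2008-08-24'
order by sensor_id, ts;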


Matthew

--
And why do I do it that way? Because I wish to remain sane. Um, actually,
maybe I should just say I don't want to be any worse than I already am.
- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Large number of tables slow insert

2008-08-26 Thread Scott Marlowe
On Tue, Aug 26, 2008 at 6:50 AM, Matthew Wakeling [EMAIL PROTECTED] wrote:
> On Sat, 23 Aug 2008, Loic Petit wrote:
>> I use PostgreSQL 8.3.1-1 to store a lot of data coming from a large
>> number of sensors. In order to have good performance when querying by
>> timestamp on each sensor, I partitioned my measures table per sensor.
>> Thus I created a lot of tables.
>
> As far as I can see, you are having performance problems as a direct result
> of this design decision, so it may be wise to reconsider. If you have an
> index on both the sensor identifier and the timestamp, it should perform
> reasonably well. It would scale a lot better with thousands of sensors too.

Properly partitioned, I'd expect one big table to outperform 3,000
sparsely populated tables.
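
For the record, "properly partitioned" here would mean something like
time-range partitions over one logical table (8.3-style inheritance),
rather than one table per sensor. A rough sketch, with illustrative
names and ranges:

create table measures (
    sensor_id integer     not null,
    ts        timestamptz not null,
    value     double precision
);
-- one child table per month; the CHECK constraint lets the planner
-- skip partitions when constraint_exclusion is on
create table measures_2008_08 (
    check (ts >= '2008-08-01' and ts < '2008-09-01')
) inherits (measures);
create index measures_2008_08_sensor_ts_idx
    on measures_2008_08 (sensor_id, ts);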

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Autovacuum does not stay turned off

2008-08-26 Thread Jerry Champlin
We are attempting to turn off autovacuum but it keeps coming back.  We 
can't afford the performance hit from vacuum while end users are 
accessing our system.


Postgresql Version:  8.3.3
OS:  Linux 2.6.18-53.el5PAE #1 SMP

Running PostgreSQL setting:

sspg=# show autovacuum;
 autovacuum
------------
 off
(1 row)

pg.log Log Entries:
2008-08-26 15:24:50 GMT LOG:  autovacuum launcher started
-- and then we manually kill it
postgres 32371  0.0  0.1 1133768 23572 ?   Ss   15:16   0:00 
postgres: autovacuum worker process   rollup_data_store

# kill 32371
2008-08-26 15:24:53 GMT FATAL:  terminating autovacuum process due to
administrator command


Does anyone know what will cause this behavior for autovacuum?

Thank you in advance

-Jerry


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovacuum does not stay turned off

2008-08-26 Thread hubert depesz lubaczewski
On Tue, Aug 26, 2008 at 09:27:48AM -0600, Jerry Champlin wrote:
> Does anyone know what will cause this behavior for autovacuum?

http://www.postgresql.org/docs/current/interactive/runtime-config-autovacuum.html
- autovacuum_freeze_max_age
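
In other words: even with autovacuum = off, the launcher still starts
workers to prevent transaction ID wraparound once a database crosses
autovacuum_freeze_max_age (the 8.3 default is 200 million). A quick way
to check how close you are:

show autovacuum_freeze_max_age;
select datname, age(datfrozenxid) from pg_database order by 2 desc;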

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: [EMAIL PROTECTED] / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovacuum does not stay turned off

2008-08-26 Thread Andrew Sullivan
On Tue, Aug 26, 2008 at 09:27:48AM -0600, Jerry Champlin wrote:
> Does anyone know what will cause this behavior for autovacuum?

You're probably approaching the wraparound limit in some database.  

If you think you can't afford the overhead when users are accessing
the system, when are you vacuuming?

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Query w/ empty result set with LIMIT orders of magnitude slower than without

2008-08-26 Thread Frank Joerdens
It seems to me that the planner makes a very poor decision with this
particular query:

--- snip ---
woome=# explain analyze SELECT webapp_invite.id,
webapp_invite.person_id, webapp_invite.session_id,
webapp_invite.created, webapp_invite.text,
webapp_invite.subject, webapp_invite.email,
webapp_invite.batch_seen, webapp_invite.woouser,
webapp_invite.accepted, webapp_invite.declined,
webapp_invite.deleted, webapp_invite.local_start_time FROM
webapp_invite INNER JOIN webapp_person ON
(webapp_invite.person_id = webapp_person.id) INNER JOIN
webapp_person T3 ON (webapp_invite.person_id = T3.id) WHERE
webapp_person.is_suspended = false  AND T3.is_banned = false
AND webapp_invite.woouser = 'suggus'  ORDER BY
webapp_invite.id DESC LIMIT 5;

 QUERY PLAN
---
 Limit  (cost=0.00..3324.29 rows=5 width=44) (actual time=2545.137..2545.137 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..207435.61 rows=312 width=44) (actual time=2545.135..2545.135 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..204803.04 rows=322 width=48) (actual time=2545.133..2545.133 rows=0 loops=1)
               ->  Index Scan Backward using webapp_invite_pkey on webapp_invite  (cost=0.00..201698.51 rows=382 width=44) (actual time=2545.131..2545.131 rows=0 loops=1)
                     Filter: ((woouser)::text = 'suggus'::text)
               ->  Index Scan using webapp_person_pkey on webapp_person t3  (cost=0.00..8.11 rows=1 width=4) (never executed)
                     Index Cond: (t3.id = webapp_invite.person_id)
                     Filter: (NOT t3.is_banned)
         ->  Index Scan using webapp_person_pkey on webapp_person  (cost=0.00..8.16 rows=1 width=4) (never executed)
               Index Cond: (webapp_person.id = webapp_invite.person_id)
               Filter: (NOT webapp_person.is_suspended)
 Total runtime: 2545.284 ms
(12 rows)
--- snap ---

because if I just remove the LIMIT, it runs like the wind:

--- snip ---
woome=# explain analyze SELECT webapp_invite.id,
webapp_invite.person_id, webapp_invite.session_id,
webapp_invite.created, webapp_invite.text,
webapp_invite.subject, webapp_invite.email,
webapp_invite.batch_seen, webapp_invite.woouser,
webapp_invite.accepted, webapp_invite.declined,
webapp_invite.deleted, webapp_invite.local_start_time FROM
webapp_invite INNER JOIN webapp_person ON
(webapp_invite.person_id = webapp_person.id) INNER JOIN
webapp_person T3 ON (webapp_invite.person_id = T3.id) WHERE
webapp_person.is_suspended = false  AND T3.is_banned = false
AND webapp_invite.woouser = 'suggus'  ORDER BY
webapp_invite.id DESC;

QUERY PLAN
---
 Sort  (cost=7194.46..7195.24 rows=312 width=44) (actual time=0.141..0.141 rows=0 loops=1)
   Sort Key: webapp_invite.id
   Sort Method:  quicksort  Memory: 25kB
   ->  Nested Loop  (cost=12.20..7181.53 rows=312 width=44) (actual time=0.087..0.087 rows=0 loops=1)
         ->  Nested Loop  (cost=12.20..4548.96 rows=322 width=48) (actual time=0.085..0.085 rows=0 loops=1)
               ->  Bitmap Heap Scan on webapp_invite  (cost=12.20..1444.44 rows=382 width=44) (actual time=0.084..0.084 rows=0 loops=1)
                     Recheck Cond: ((woouser)::text = 'suggus'::text)
                     ->  Bitmap Index Scan on webapp_invite_woouser_idx  (cost=0.00..12.10 rows=382 width=0) (actual time=0.081..0.081 rows=0 loops=1)
                           Index Cond: ((woouser)::text = 'suggus'::text)
               ->  Index Scan using webapp_person_pkey on webapp_person t3  (cost=0.00..8.11 rows=1 width=4) (never executed)
                     Index Cond: (t3.id = webapp_invite.person_id)
                     Filter: (NOT t3.is_banned)
         ->  Index Scan using webapp_person_pkey on webapp_person  (cost=0.00..8.16 rows=1 width=4) (never executed)
               Index Cond: (webapp_person.id = webapp_invite.person_id)
               Filter: (NOT webapp_person.is_suspended)
 Total runtime: 0.295 ms
(16 rows)
--- snap ---

And for this particular filter, the result set is empty to boot, so
the LIMIT doesn't even do anything.

Does this behaviour make sense to anyone? Can I force the planner
somehow to be smarter about it?

Thanks!

Frank

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] select on 22 GB table causes "An I/O error occured while sending to the backend." exception

2008-08-26 Thread henk de wit

Hi,

We're currently having a problem with queries on a medium-sized table. This
table is 22GB in size (via select pg_size_pretty(pg_relation_size('table'));).
It has 7 indexes, which bring the total size of the table to 35 GB (measured
with pg_total_relation_size).

On this table we're inserting records with a relatively low frequency of +-
6~10 per second. We're using PG 8.3.1 on a machine with two dual core 2.4GHz
Xeon CPUs, 16 GB of memory and Debian Linux. The machine is completely devoted
to PG, nothing else runs on the box.

Lately we're getting a lot of exceptions from the Java process that does these 
inserts: "An I/O error occured while sending to the backend." No other
information is provided with this exception (besides the stack trace of 
course). The pattern is that for about a minute, almost every insert to this 22 
GB table results in this exception. After this minute everything is suddenly 
fine and PG happily accepts all inserts again. We tried to nail the problem 
down, and it seems that every time this happens, a select query on this same 
table is in progress. This select query starts right before the insert problems 
begin and most often right after this select query finishes executing, inserts 
are fine again. Sometimes though inserts only fail in the middle of the 
execution of this select query. E.g. if the select query starts at 12:00 and 
ends at 12:03, inserts fail from 12:01 to 12:02.

We have spent a lot of hours getting to the bottom of this, but our ideas
for fixing this problem are more or less exhausted at the moment.

I wonder if anyone recognizes this problem and could give some pointers to 
stuff that we could investigate next. 

Thanks a lot in advance.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovacuum does not stay turned off

2008-08-26 Thread Jerry Champlin
This makes sense.  What queries can I run to see how close to the limit 
we are?  We need to determine if we should stop the process which 
updates and inserts into this table until after the critical time this 
afternoon when we can perform the required maintenance on this table.


hubert depesz lubaczewski wrote:
> On Tue, Aug 26, 2008 at 09:27:48AM -0600, Jerry Champlin wrote:
>> Does anyone know what will cause this behavior for autovacuum?
>
> http://www.postgresql.org/docs/current/interactive/runtime-config-autovacuum.html
> - autovacuum_freeze_max_age
>
> depesz

Andrew Sullivan wrote:
> On Tue, Aug 26, 2008 at 09:27:48AM -0600, Jerry Champlin wrote:
>> Does anyone know what will cause this behavior for autovacuum?
>
> You're probably approaching the wraparound limit in some database.
>
> If you think you can't afford the overhead when users are accessing
> the system, when are you vacuuming?
>
> A
We are changing the table structure tonight.  These two tables are very
highly updated.  The goal is to use autovacuum but not have it take 10
days to run on a 13-million-row table.


Thanks

-Jerry


Re: [PERFORM] Query w/ empty result set with LIMIT orders of magnitude slower than without

2008-08-26 Thread Tom Lane
Frank Joerdens [EMAIL PROTECTED] writes:
> It seems to me that the planner makes a very poor decision with this
> particular query:

Try increasing the stats on woouser.  You need it to make a smaller
estimate of the number of matching rows here:

>    ->  Index Scan Backward using webapp_invite_pkey on webapp_invite
>        (cost=0.00..201698.51 rows=382 width=44) (actual time=2545.131..2545.131 rows=0 loops=1)
>          Filter: ((woouser)::text = 'suggus'::text)

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query w/ empty result set with LIMIT orders of magnitude slower than without (SOLVED, pls disregard)

2008-08-26 Thread Frank Joerdens
Eh, there was a spurious join in that query, created by an ORM, which
apparently messed things up. Sorry for the noise. This abstracted
version of the original query, which does the same, is fast:

woome=# EXPLAIN ANALYZE
SELECT *
FROM webapp_invite i
INNER JOIN webapp_person p ON (i.id = p.id)
WHERE p.is_suspended = false
AND p.is_banned = false
AND i.woouser = 'suggus'
ORDER BY i.id DESC LIMIT 5;

QUERY PLAN
---
 Limit  (cost=4549.51..4549.52 rows=5 width=238) (actual time=0.071..0.071 rows=0 loops=1)
   ->  Sort  (cost=4549.51..4549.58 rows=31 width=238) (actual time=0.070..0.070 rows=0 loops=1)
         Sort Key: i.id
         Sort Method:  quicksort  Memory: 25kB
         ->  Nested Loop  (cost=12.20..4548.99 rows=31 width=238) (actual time=0.036..0.036 rows=0 loops=1)
               ->  Bitmap Heap Scan on webapp_invite i  (cost=12.20..1444.45 rows=382 width=44) (actual time=0.034..0.034 rows=0 loops=1)
                     Recheck Cond: ((woouser)::text = 'suggus'::text)
                     ->  Bitmap Index Scan on webapp_invite_woouser_idx  (cost=0.00..12.10 rows=382 width=0) (actual time=0.032..0.032 rows=0 loops=1)
                           Index Cond: ((woouser)::text = 'suggus'::text)
               ->  Index Scan using webapp_person_pkey on webapp_person p  (cost=0.00..8.11 rows=1 width=194) (never executed)
                     Index Cond: (p.id = i.id)
                     Filter: ((NOT p.is_suspended) AND (NOT p.is_banned))
 Total runtime: 0.183 ms
(13 rows)

Time: 1.114 ms

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovacuum does not stay turned off

2008-08-26 Thread hubert depesz lubaczewski
On Tue, Aug 26, 2008 at 10:45:31AM -0600, Jerry Champlin wrote:
> This makes sense.  What queries can I run to see how close to the limit
> we are?  We need to determine if we should stop the process which
> updates and inserts into this table until after the critical time this
> afternoon when we can perform the required maintenance on this table.

select datname, age(datfrozenxid) from pg_database;

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: [EMAIL PROTECTED] / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] select on 22 GB table causes "An I/O error occured while sending to the backend." exception

2008-08-26 Thread Mark Lewis
On Tue, 2008-08-26 at 18:44 +0200, henk de wit wrote:
> Hi,
>
> We're currently having a problem with queries on a medium-sized table. This
> table is 22GB in size (via select pg_size_pretty(pg_relation_size('table'));).
> It has 7 indexes, which bring the total size of the table to 35 GB (measured
> with pg_total_relation_size).
>
> On this table we're inserting records with a relatively low frequency of +-
> 6~10 per second. We're using PG 8.3.1 on a machine with two dual core 2.4GHz
> Xeon CPUs, 16 GB of memory and Debian Linux. The machine is completely
> devoted to PG, nothing else runs on the box.
>
> Lately we're getting a lot of exceptions from the Java process that does
> these inserts: "An I/O error occured while sending to the backend." No other
> information is provided with this exception (besides the stack trace of
> course). The pattern is that for about a minute, almost every insert to this
> 22 GB table results in this exception. After this minute everything is
> suddenly fine and PG happily accepts all inserts again. We tried to nail the
> problem down, and it seems that every time this happens, a select query on
> this same table is in progress. This select query starts right before the
> insert problems begin and most often right after this select query finishes
> executing, inserts are fine again. Sometimes though inserts only fail in the
> middle of the execution of this select query. E.g. if the select query starts
> at 12:00 and ends at 12:03, inserts fail from 12:01 to 12:02.
>
> We have spent a lot of hours getting to the bottom of this, but our ideas
> for fixing this problem are more or less exhausted at the moment.
>
> I wonder if anyone recognizes this problem and could give some pointers to
> stuff that we could investigate next.
>
> Thanks a lot in advance.

If the select returns a lot of data and you haven't enabled cursors (by
calling setFetchSize), then the entire SQL response will be loaded in
memory at once, so there could be an out-of-memory condition on the
client.
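
(Server-side, setFetchSize amounts to cursor-style fetching; a plain SQL
sketch of the equivalent, with a hypothetical table name:)

begin;
declare big_cur cursor for select * from big_table;
fetch 1000 from big_cur;  -- repeat until it returns no rows
close big_cur;
commit;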

Or if the select uses sorts and PG thinks it has access to more sort
memory than is actually available on the system (due to ulimits,
physical memory restrictions, etc) then you could run into problems that
look like out-of-memory errors on the server.

It could also be something else entirely; exceeding your max
connections, something like that.
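
(A quick sanity check for that from psql:)

show max_connections;
select count(*) from pg_stat_activity;  -- current backends vs. the limit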

A really good place to start would be to enable tracing on the JDBC
driver.  Look at the docs for the PostgreSQL JDBC driver for how to
enable logging; that should give you a much better picture of exactly
where and what is failing.

If the issue is server-side, then you will also want to look at the
PostgreSQL logs on the server; anything as serious as a backend aborting
should write an entry in the log.

-- Mark

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovacuum does not stay turned off

2008-08-26 Thread Jerry Champlin
Thanks for the help.  The applied solution follows.  We will be taking a
number of maintenance steps to manage these very high-update tables,
which I will summarize later, as I suspect we are not the only ones with
this challenge.


http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND
http://www.postgresql.org/docs/current/interactive/catalog-pg-autovacuum.html

data_store=# SELECT relname, oid, age(relfrozenxid) FROM pg_class WHERE 
relkind = 'r';

...
hour_summary| 16392 | 252934596
percentile_metadata | 20580 | 264210966
(51 rows)

data_store=# insert into pg_autovacuum values 
(16392,false,35000,2,35000,1,200,200,35000,5);

INSERT 0 1
data_store=# insert into pg_autovacuum values 
(20580,false,35000,2,35000,1,200,200,35000,5);

INSERT 0 1
data_store=#
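
(For reference, a commented reading of those values; if I read the 8.3
pg_autovacuum catalog docs correctly, the columns are, in order:)

-- vacrelid, enabled,
-- vac_base_thresh, vac_scale_factor,
-- anl_base_thresh, anl_scale_factor,
-- vac_cost_delay, vac_cost_limit,
-- freeze_min_age, freeze_max_age
insert into pg_autovacuum values
(16392, false, 35000, 2, 35000, 1, 200, 200, 35000, 5);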


hubert depesz lubaczewski wrote:
> On Tue, Aug 26, 2008 at 10:45:31AM -0600, Jerry Champlin wrote:
>> This makes sense.  What queries can I run to see how close to the limit
>> we are?  We need to determine if we should stop the process which
>> updates and inserts into this table until after the critical time this
>> afternoon when we can perform the required maintenance on this table.
>
> select datname, age(datfrozenxid) from pg_database;
>
> Best regards,
>
> depesz


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] select on 22 GB table causes "An I/O error occured while sending to the backend." exception

2008-08-26 Thread Scott Marlowe
On Tue, Aug 26, 2008 at 10:44 AM, henk de wit [EMAIL PROTECTED] wrote:
> Hi,
>
> We're currently having a problem with queries on a medium-sized table. This
> table is 22GB in size (via select pg_size_pretty(pg_relation_size('table'));).
> It has 7 indexes, which bring the total size of the table to 35 GB (measured
> with pg_total_relation_size).
>
> On this table we're inserting records with a relatively low frequency of +-
> 6~10 per second. We're using PG 8.3.1 on a machine with two dual core 2.4GHz
> Xeon CPUs, 16 GB of memory and Debian Linux. The machine is completely
> devoted to PG, nothing else runs on the box.
>
> Lately we're getting a lot of exceptions from the Java process that does
> these inserts: "An I/O error occured while sending to the backend." No other
> information is provided with this exception (besides the stack trace of
> course).

What do your various logs (pgsql, application, etc...) have to say?
Can you read a java stack trace?  Sometimes slogging through them will
reveal some useful information.

> The pattern is that for about a minute, almost every insert to this 22 GB
> table results in this exception. After this minute everything is suddenly
> fine and PG happily accepts all inserts again. We tried to nail the problem
> down, and it seems that every time this happens, a select query on this same
> table is in progress. This select query starts right before the insert
> problems begin and most often right after this select query finishes
> executing, inserts are fine again. Sometimes though inserts only fail in the
> middle of the execution of this select query. E.g. if the select query starts
> at 12:00 and ends at 12:03, inserts fail from 12:01 to 12:02.

Sounds to me like your connections are timing out (what's your timeout
in JDBC set to?)

A likely cause is that you're getting big checkpoint spikes.  What
does vmstat 10 say during these spikes?  If you're running the
sysstat service with data collection then sar can tell you a lot.

If it is a checkpoint issue then you need more aggressive bgwriter
settings, and possibly more bandwidth on your storage array.

Note that you can force a checkpoint from a superuser account at the
command line.  You can always force one and see what happens to
performance during it.  You'll need to wait a few minutes or so
between runs to see an effect.
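
For reference, the knobs involved would be something like the following
(8.3 names; the values are purely illustrative, not recommendations):

-- postgresql.conf
--   checkpoint_completion_target = 0.7  -- spread checkpoint writes out
--   bgwriter_delay = 100ms              -- wake the bgwriter more often
--   bgwriter_lru_maxpages = 500         -- allow more pages per round
--   bgwriter_lru_multiplier = 4.0

-- and forcing a checkpoint by hand (as a superuser) is just:
CHECKPOINT;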

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query w/ empty result set with LIMIT orders of magnitude slower than without

2008-08-26 Thread Frank Joerdens
On Tue, Aug 26, 2008 at 5:53 PM, Tom Lane [EMAIL PROTECTED] wrote:
> Frank Joerdens [EMAIL PROTECTED] writes:
>> It seems to me that the planner makes a very poor decision with this
>> particular query:
>
> Try increasing the stats on woouser.  You need it to make a smaller
> estimate of the number of matching rows here:

Way cool, this

alter table webapp_invite alter column woouser set statistics 50;

did the trick, even without cleaning up the spurious join.
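
(One note for the archives: the new target only affects statistics
gathered by the next ANALYZE, so it needs to be followed by something
like

analyze webapp_invite;

before the planner sees the better estimate.)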

Thanks!

Frank

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] select on 22 GB table causes "An I/O error occured while sending to the backend." exception

2008-08-26 Thread henk de wit
> If the select returns a lot of data and you haven't enabled cursors (by
> calling setFetchSize), then the entire SQL response will be loaded in
> memory at once, so there could be an out-of-memory condition on the
> client.

I hear you. This is absolutely not the case though. There is no other exception
anywhere besides "An I/O error occured while sending to the backend." The
select query eventually returns something between 10 and 50 rows. Also, the
select query runs from another machine than the one that issues the inserts to
the database. I failed to mention in the opening post that simultaneously
with this select/insert query for this single 22 GB table, thousands if not
tens of thousands of other queries are hitting other tables in the same
database. There are about 70 other tables, with a combined size of about 40 GB.
None of those 70 other tables has a size anywhere near that 22GB of the
problematic table. There is never even a single problem of this kind with any
of those other tables.

When doing research on the net, it became clear that a lot of these "An I/O
error..." exceptions are caused by malfunctioning switches or routers in the
network between the application server(s) and the database. In our case this
can hardly be true. As mentioned, a great number of other queries are hitting
the database. Some of these are very small (execution times of about 10 ms),
while others are quite large (hundreds of lines of SQL with over 12 joins and
an execution time of several minutes). Not a single one of those results in
this I/O error.

> It could also be something else entirely; exceeding your max
> connections, something like that.

We indeed ran into that, but I think more as collateral damage. When this
single select query for the 22 GB table is executing and those inserts start to
fail, this also starts holding up things. As a result the 6 writes per second
start queuing up and requesting more and more connections from our connection
pool (DBCP as supplied by Apache Tomcat). We had the maximum of our connection
pool set to too high a value, and after a while Postgres responded with a
message that the connection limit was exceeded. We thereafter lowered the max
of our connection pool and didn't see that particular message anymore.

So it seems likely that "An I/O error occured while sending to the backend."
does not mean "connection limit exceeded", since the latter message is given
explicitly when that is the case.

> A really good place to start would be to enable tracing on the JDBC
> driver.

That's a good idea indeed. I'll try to enable this and see what it does.

> If the issue is server-side, then you will also want to look at the
> PostgreSQL logs on the server; anything as serious as a backend aborting
> should write an entry in the log.

We studied the PG logs extensively but unfortunately could not really detect 
anything that could point to the problem there.



Re: [PERFORM] select on 22 GB table causes "An I/O error occured while sending to the backend." exception

2008-08-26 Thread henk de wit
> What do your various logs (pgsql, application, etc...) have to say?

There is hardly anything helpful in the pgsql log. The application log
doesn't mention anything either. We log a great deal of information in
our application, but there's nothing out of the ordinary there, although
there's of course always a chance that somewhere we missed something.

> Can you read a java stack trace?  Sometimes slogging through them will
> reveal some useful information.

I can read a java stack trace very well, I'm primarily a Java developer. The 
stack trace is the following one:

org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:218)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:350)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:304)
        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:102)
        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:102)


Our code simply executes the same statement.executeUpdate() that it does
about 500,000 times during a business day. As soon as this select query
is hitting this 22 GB table then there's a chance that 'suddenly' all
these utterly simple insert queries start failing. The insert query is
nothing special either. It's just an "INSERT INTO ... VALUES (...)"
type of thing. The select query can actually be a few different kinds
of queries, but basically the common thing between them is reading from
this 22 GB table. In fact, our system administrator just told me that
even the DB backup is able to trigger this behaviour. As soon as the
backup process is reading from this 22 GB table, the inserts on it
-may- start to fail.

> Sounds to me like your connections are timing out (what's your timeout
> in JDBC set to?)

There's no explicit timeout being set. Queries can theoretically execute
for hours. In some rare cases, some queries indeed run for that long.
 
> A likely cause is that you're getting big checkpoint spikes.  What
> does vmstat 10 say during these spikes?

It's hard to reproduce the problem. We're trying to simulate it on our
testing servers but haven't been successful yet. The problem typically
lasts for only a minute at a time on the production server and there's
no saying when it will occur again. Of course we could try to provoke it
by running this select query continuously, but for a production server
it's not an easy decision to actually do that. So basically all we were
able to do for now is investigate the logs afterwards. I'll try to run
vmstat when the problem happens while I'm at the console, though.

> If you're running the
> sysstat service with data collection then sar can tell you a lot.

Ok, I'm not a big PG expert so I'll have to look into what that means exactly 
;) Thanks for the advice though.

Kind regards,
Henk

[PERFORM] control the number of clog files and xlog files

2008-08-26 Thread Duan Ligong

Hi,

Would you be so kind as to answer the following questions:

- Is there any way to control the number of clog files and xlog files?
I have run into an issue where there are too many clog files under the
pg_clog/ directory, which occupy more space than I can tolerate.

- What determines the number of clog files? What determines the
number of xlog files?

- I understand pg_xlog is used to record WAL, but what is pg_clog
used for? Is it used to record some meta-information on the xlog?

- What effect does deleting the clog and xlog files have?
Will it cause PostgreSQL to stop abnormally?

# I am not sure this is the right list for such a question, but
since I posted it to the ADMIN list and no one gave me an answer,
I want to try this list.


Best regards
Gera



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] control the number of clog files and xlog files

2008-08-26 Thread Alvaro Herrera
Duan Ligong wrote:

> Would you be so kind as to answer the following questions:
>
> - Is there any way to control the number of clog files and xlog files?
> I have run into an issue where there are too many clog files under the
> pg_clog/ directory, which occupy more space than I can tolerate.

pg_clog files are controlled by tuple freezing, which is done by vacuum,
and it depends on the vacuum_freeze_min_age parameter and, for
autovacuum, autovacuum_freeze_max_age.  Please read

http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html
and
http://www.postgresql.org/docs/8.3/interactive/runtime-config-client.html#GUC-VACUUM-FREEZE-MIN-AGE
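
Putting that together, a sketch of how you would shrink pg_clog (the
value is illustrative; lowering the horizon trades clog space for more
freezing work by vacuum):

-- postgresql.conf: lower the freeze horizon, e.g.
--   vacuum_freeze_min_age = 10000000

-- then vacuum each database so datfrozenxid can advance and old
-- pg_clog segments become truncatable:
VACUUM;

-- verify how far back the oldest unfrozen transaction reaches:
SELECT datname, age(datfrozenxid) FROM pg_database;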


> - What determines the number of clog files? What determines the
> number of xlog files?

The number of xlog files will depend on checkpoints.  You need to
restrict checkpoint_segments to control this.  Note that this can have a
serious performance impact.
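
As a rough worked example: 8.3 normally keeps at most
(2 + checkpoint_completion_target) * checkpoint_segments + 1 segments of
16 MB each, so with the defaults (checkpoint_segments = 3,
checkpoint_completion_target = 0.5) that is (2 + 0.5) * 3 + 1 = 8.5,
i.e. roughly 8-9 files or about 130-140 MB in pg_xlog.

-- the current settings can be checked with:
SHOW checkpoint_segments;
SHOW checkpoint_completion_target;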

> - I understand pg_xlog is used to record WAL, but what is pg_clog
> used for? Is it used to record some meta-information on the xlog?

clog is the commit log, i.e. it records which transactions have been
committed and which have been aborted.  You cannot delete these files
unless you want to corrupt your database.

> - What effect does deleting the clog and xlog files have?
> Will it cause PostgreSQL to stop abnormally?

Your data will be corrupt.  It may continue to work for a while, and
suddenly stop working at a future time.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance