Re: [PERFORM] JDBC question for PG 8.3.9
On Wed, Apr 14, 2010 at 7:10 PM, Craig Ringer cr...@postnewspapers.com.au wrote:

> On 15/04/10 04:49, Dave Crooke wrote:
>
>> Hi folks
>>
>> I am using PG 8.3 from Java. I am considering a performance tweak which will involve holding about 150 java.sql.PreparedStatement objects open against a single PGSQL connection. Is this safe?
>>
>> I know that MySQL does not support prepared statements /per se/, and so their implementation of PreparedStatement is nothing more than some client-side convenience code that knows how to escape and format constants for you. Is this the case for PG, or does the PG JDBC driver do the real thing?
>
> Pg supports real server-side prepared statements, as does the JDBC driver.
>
> IIRC (and I can't say this with 100% certainty without checking the sources or a good look at TFM) the PostgreSQL JDBC driver initially does only a client-side prepare. However, if the PreparedStatement is re-used more than a certain number of times (five by default?) it switches to server-side prepared statements.

This is partially true. The driver uses an unnamed prepared statement on the server.

> This has actually caused a bunch of performance complaints on the jdbc list, because the query plan may change at that switch-over point, since with a server-side prepared statement Pg no longer has a specific value for each parameter and may pick a more generic plan.

This is a limitation of the server, not the driver.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
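The switch-over count mentioned above is, as far as I know, exposed by the PostgreSQL JDBC driver as the prepareThreshold connection parameter. The sketch below only builds a connection URL carrying that parameter; the class name, host and database name are hypothetical, and the exact semantics of particular threshold values should be checked against the driver documentation for the version in use.

```java
// A minimal sketch, not driver code: appends the prepareThreshold
// connection parameter to a PostgreSQL JDBC URL.
final class PrepareThresholdUrl {

    /** Returns jdbcUrl with prepareThreshold appended, using '?' or '&' as needed. */
    static String withPrepareThreshold(String jdbcUrl, int threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("threshold must be >= 0");
        }
        char separator = jdbcUrl.indexOf('?') >= 0 ? '&' : '?';
        return jdbcUrl + separator + "prepareThreshold=" + threshold;
    }

    public static void main(String[] args) {
        // Hypothetical host and database name, for illustration only.
        System.out.println(withPrepareThreshold("jdbc:postgresql://localhost/mydb", 5));
    }
}
```

Setting the threshold on the URL applies it to every statement created from that connection, which is probably the simplest way to experiment with the plan change described above without touching application code.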
[PERFORM] 8.3.9 - latency spikes with Linux (and tuning for consistently low latency)
Hi,

we are seeing latency spikes in the 2-3 second range (sometimes 8-10s) for queries that usually take 3-4ms on our systems and I am running out of things to try to get rid of them. Perhaps someone here has more ideas - here's a description of the systems and what I've tried with no impact at all:

2 x 6-core Opterons (2431)
32GB RAM
2 SATA disks (WD1500HLFS) in software RAID-1
Linux 2.6.26 64 bit (Debian kernel)
PostgreSQL 8.3.9 (Debian package)
FS mounted with option noatime
vm.dirty_ratio = 80
3 DB clusters, 2 of which are actively used, all on the same RAID-1 FS

fsync=off
shared_buffers=5GB (database size is ~4.7GB on disk right now)
temp_buffers=50MB
work_mem=500MB
wal_buffers=256MB (*)
checkpoint_segments=256 (*)
commit_delay=10 (*)
autovacuum=off (*)

(*) added while testing, no change w.r.t. the spikes seen at all

The databases have moderate read load (no burst load, typical web backend) and somewhat regular write load (updates in batches, always single-row update/delete/inserts using the primary key, 90% updates, a few 100s to 1000s rows together, without explicit transactions/locking).

This is how long the queries take (seen from the client):

Thu Apr 15 18:16:14 CEST 2010  real 0m0.004s
Thu Apr 15 18:16:15 CEST 2010  real 0m0.004s
Thu Apr 15 18:16:16 CEST 2010  real 0m0.003s
Thu Apr 15 18:16:17 CEST 2010  real 0m0.005s
Thu Apr 15 18:16:18 CEST 2010  real 0m0.068s
Thu Apr 15 18:16:19 CEST 2010  real 0m0.004s
Thu Apr 15 18:16:20 CEST 2010  real 0m0.005s
Thu Apr 15 18:16:21 CEST 2010  real 0m0.235s
Thu Apr 15 18:16:22 CEST 2010  real 0m0.005s
Thu Apr 15 18:16:23 CEST 2010  real 0m3.006s  == !
Thu Apr 15 18:16:27 CEST 2010  real 0m0.004s
Thu Apr 15 18:16:28 CEST 2010  real 0m0.084s
Thu Apr 15 18:16:29 CEST 2010  real 0m0.003s
Thu Apr 15 18:16:30 CEST 2010  real 0m0.005s
Thu Apr 15 18:16:32 CEST 2010  real 0m0.038s
Thu Apr 15 18:16:33 CEST 2010  real 0m0.005s
Thu Apr 15 18:16:34 CEST 2010  real 0m0.005s

The spikes aren't periodic, i.e. not every 10, 20, 30 seconds or 5 minutes etc., they seem completely random... PostgreSQL also reports (due to log_min_duration_statement=1000) small bursts of queries that take much longer than they should:

[nothing for a few minutes]
2010-04-15 16:50:03 CEST LOG: duration: 8995.934 ms statement: select ...
2010-04-15 16:50:04 CEST LOG: duration: 3383.780 ms statement: select ...
2010-04-15 16:50:04 CEST LOG: duration: 3328.523 ms statement: select ...
2010-04-15 16:50:05 CEST LOG: duration: 1120.108 ms statement: select ...
2010-04-15 16:50:05 CEST LOG: duration: 1079.879 ms statement: select ...
[nothing for a few minutes]

(explain analyze yields 5-17ms for the above queries)

Things I've tried apart from the PostgreSQL parameters above:

- switching from ext3 with default journal settings to data=writeback
- switching to ext2
- vm.dirty_background_ratio set to 1, 10, 20, 60
- vm.dirty_expire_centisecs set to 3000 (default), 864 (1 day)
- fsync on
- some unofficial Debian 2.6.32 kernel and ext3 with data=writeback (because of http://lwn.net/Articles/328363/ although it seems to address fsync latency and not read latency)
- running irqbalance

All these had no visible impact on the latency spikes.

I can also exclude faulty hardware with some certainty (since we have 12 identical systems with this problem).

I am suspecting some strange software RAID or kernel problem, unless the default bgwriter settings can actually cause selects to get stuck for so long when there are too many dirty buffers (I hope not). Unless I'm missing something, I only have a non-RAID setup or ramdisks (tmpfs), or SSDs left to try to get rid of these, so any suggestion will be greatly appreciated. Generally, I'd be very interested in hearing how people tune their databases and their hardware/Linux for consistently low query latency (esp. when everything should fit in memory).

Regards,
Marinos
Re: [PERFORM] 8.3.9 - latency spikes with Linux (and tuning for consistently low latency)
Marinos Yannikos m...@geizhals.at writes:

> we are seeing latency spikes in the 2-3 second range (sometimes 8-10s) for queries that usually take 3-4ms on our systems and I am running out of things to try to get rid of them.

Have you checked whether the spikes correlate with checkpoints? Turn on log_checkpoints and watch for a while. If so, fooling with the checkpoint parameters might give some relief. However, 8.3 already has the spread-checkpoint code so I'm not sure how much more win can be had there.

More generally, you should watch vmstat/iostat output and see if you can correlate the spikes with I/O activity, CPU peaks, etc.

A different line of thought is that maybe the delays have to do with lock contention --- log_lock_waits might help you identify that.

> fsync=off

That's pretty scary.

> work_mem=500MB

Yipes. I don't think you have enough RAM for that to be safe.

> commit_delay=10 (*)

This is probably not a good idea.

> autovacuum=off (*)

Nor this.

regards, tom lane
Re: [PERFORM] 8.3.9 - latency spikes with Linux (and tuning for consistently low latency)
Tom Lane t...@sss.pgh.pa.us wrote:

> Have you checked whether the spikes correlate with checkpoints? Turn on log_checkpoints and watch for a while. If so, fooling with the checkpoint parameters might give some relief.

If that by itself doesn't do it, I've found that making the background writer more aggressive can help. We've had good luck with:

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

If you still have checkpoint-related latency issues, you could try scaling back shared_buffers, letting the OS cache handle more of the data.

Also, if you have a RAID controller with a battery-backed RAM cache, make sure it is configured for write-back.

-Kevin
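To put the bgwriter_lru_maxpages suggestion in perspective, here is a rough sketch of the write ceiling that setting implies. It assumes the stock bgwriter_delay of 200ms and the standard 8kB block size, neither of which is stated in the thread, and it computes only an upper bound: the background writer will normally write fewer pages per round than the maximum. The class and method names are made up for illustration.

```java
// A back-of-the-envelope sketch: the most data the background writer's LRU
// scan could write per second for a given bgwriter_lru_maxpages.
final class BgwriterCeiling {

    /** Upper bound in MiB/s: pages per round * block size * rounds per second. */
    static double maxMiBPerSecond(int lruMaxPages, int delayMillis, int blockSizeBytes) {
        double roundsPerSecond = 1000.0 / delayMillis;
        return lruMaxPages * (double) blockSizeBytes * roundsPerSecond / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // 200 ms delay and 8192-byte blocks are assumed defaults, not values from the thread.
        System.out.printf("maxpages=100:  %.2f MiB/s%n", maxMiBPerSecond(100, 200, 8192));
        System.out.printf("maxpages=1000: %.2f MiB/s%n", maxMiBPerSecond(1000, 200, 8192));
    }
}
```

Under those assumptions, raising the limit from 100 to 1000 pages lifts the ceiling from roughly 4 MiB/s to roughly 39 MiB/s, which gives a feel for how much more cleaning work the background writer is being allowed to take off the checkpoint.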
Re: [PERFORM] Autovacuum with cost_delay does not complete on one solaris 5.10 machine
Josh Berkus wrote:

> Basically, vacuuming of a table which normally takes about 20 minutes interactively with vacuum_cost_delay set to 20 had not completed after 14 hours. When I trussed it, I saw activity which indicated to me that autovacuum was doing a pollsys, presumably for cost_limit, every data page.
>
> Autovacuum was running with vacuum_cost_limit = 200 and autovacuum_vacuum_cost_delay = 20, which I believe is the default for 8.3.
>
> Truss output:
>
> pollsys(0xFD7FFFDF83E0, 0, 0xFD7FFFDF8470, 0x) = 0

So what is it polling? Please try truss -v pollsys; is there a way in Solaris to report what each file descriptor is pointing to? (In linux I'd look at /proc/pid/fd)

We don't call pollsys anywhere. Something in Solaris must be doing it under the hood.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
[PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a big query result set
I have followed the instructions below to no avail. Any thoughts?

http://jdbc.postgresql.org/documentation/83/query.html#query-with-cursor

This is what happens when I reduce the fetch_size to 50 ... stops after about 950msec and 120 fetches (6k rows)

13:59:56,054 [PerfDataMigrator] ERROR com.hyper9.storage.sample.persistence.PersistenceManager:3216 - Unexpected error while migrating sample data: 6000
org.postgresql.util.PSQLException: ERROR: portal C_14 does not exist
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
    at org.postgresql.core.v3.QueryExecutorImpl.fetch(QueryExecutorImpl.java:1527)
    at org.postgresql.jdbc2.AbstractJdbc2ResultSet.next(AbstractJdbc2ResultSet.java:1843)
    at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:169)
    at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:169)
    at com.hyper9.storage.sample.persistence.PersistenceManager$Migrator.run(PersistenceManager.java:3156)
    at java.lang.Thread.run(Thread.java:619)

Cheers
Dave

On Thu, Apr 15, 2010 at 2:42 PM, Dave Crooke dcro...@gmail.com wrote:

> Hey folks
>
> I am trying to do a full table scan on a large table from Java, using a straightforward select * from foo. I've run into these problems:
>
> 1. By default, the PG JDBC driver attempts to suck the entire result set into RAM, resulting in *java.lang.OutOfMemoryError* ... this is not cool, in fact I consider it a serious bug (even MySQL gets this right ;-) I am only testing with a 9GB result set, but production needs to scale to 200GB or more, so throwing hardware at it is not feasible.
>
> 2. I tried using the official taming method, namely *java.sql.Statement.setFetchSize(1000)* and this makes it blow up entirely with an error I have no context for, as follows (the number C_10 varies, e.g. C_12 last time) ...
>
> org.postgresql.util.PSQLException: ERROR: portal C_10 does not exist
>     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
>     at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
>     at org.postgresql.core.v3.QueryExecutorImpl.fetch(QueryExecutorImpl.java:1527)
>     at org.postgresql.jdbc2.AbstractJdbc2ResultSet.next(AbstractJdbc2ResultSet.java:1843)
>
> This is definitely a bug :-)
>
> Is there a known workaround for this ... will updating to a newer version of the driver fix this?
>
> Is there a magic incantation of JDBC calls that will tame it?
>
> Can I cast the objects to PG specific types and access a hidden API to turn off this behaviour?
>
> If the only workaround is to explicitly create a cursor in PG, is there a good example of how to do this from Java?
>
> Cheers
> Dave
Re: [PERFORM] Autovacuum with cost_delay does not complete on one solaris 5.10 machine
Alvaro Herrera alvhe...@commandprompt.com writes:

> We don't call pollsys anywhere. Something in Solaris must be doing it under the hood.

pg_usleep calls select(), and some googling indicates that select() is implemented as pollsys() on recent Solaris versions. So Josh's assumption that those are delay calls seems plausible. But it shouldn't be sleeping after each page with normal cost_delay parameters, should it?

regards, tom lane
Re: [PERFORM] Autovacuum with cost_delay does not complete on one solaris 5.10 machine
Tom Lane wrote:

> Alvaro Herrera alvhe...@commandprompt.com writes:
>
>> We don't call pollsys anywhere. Something in Solaris must be doing it under the hood.
>
> pg_usleep calls select(), and some googling indicates that select() is implemented as pollsys() on recent Solaris versions. So Josh's assumption that those are delay calls seems plausible. But it shouldn't be sleeping after each page with normal cost_delay parameters, should it?

Certainly not ... The only explanation would be that the cost balance gets over the limit very frequently. So one of the params would have to be abnormally high (vacuum_cost_page_hit, vacuum_cost_page_miss, vacuum_cost_page_dirty).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] stats collector suddenly causing lots of IO
Chris li...@deksai.com writes:

> I have a lot of centos servers which are running postgres. Postgres isn't used that heavily on any of them, but lately, the stats collector process keeps causing tons of IO load. It seems to happen only on servers with centos 5. The versions of postgres that are running are:
>
> 8.1.18
> 8.2.6
> 8.3.1
> 8.3.5
> 8.3.6
> 8.3.7
> 8.3.8
> 8.3.9
> 8.4.2
> 8.4.3

Do these different server versions really all show the problem to the same extent? I'd expect 8.4.x in particular to be cheaper than the older branches. Are their pgstat.stat files all of similar sizes? (Note that 8.4.x keeps pgstat.stat under $PGDATA/pg_stat_tmp/ whereas in earlier versions it was under $PGDATA/global/.)

If your applications create/use/drop a lot of tables (perhaps temp tables) then bloat of the pgstat.stat file is to be expected, but it should get cleaned up by vacuum (including autovacuum). What is your vacuuming policy on these servers ... do you use autovacuum?

regards, tom lane
Re: [PERFORM] 8.3.9 - latency spikes with Linux (and tuning for consistently low latency)
Marinos Yannikos wrote:

> vm.dirty_ratio = 80

This is tuned the opposite direction of what you want. The default tuning in the generation of kernels you're using is:

/proc/sys/vm/dirty_ratio = 10
/proc/sys/vm/dirty_background_ratio = 5

And those should be considered upper limits if you want to tune for latency. Unfortunately, even 5% will still allow 1.6GB of dirty data to queue up without being written given 32GB of RAM, which is still plenty to lead to a multi-second pause at times.

> 3 DB clusters, 2 of which are actively used, all on the same [software] RAID-1 FS

So your basic problem here is that you don't have enough disk I/O to support this load. You can tune it all day and that fundamental issue will never go away. You'd need a battery-backed write controller capable of hardware RAID to even have a shot at supporting a system with this much RAM without long latency pauses. I'd normally break out the WAL onto a separate volume too.

> [nothing for a few minutes]
> 2010-04-15 16:50:03 CEST LOG: duration: 8995.934 ms statement: select ...
> 2010-04-15 16:50:04 CEST LOG: duration: 3383.780 ms statement: select ...
> 2010-04-15 16:50:04 CEST LOG: duration: 3328.523 ms statement: select ...
> 2010-04-15 16:50:05 CEST LOG: duration: 1120.108 ms statement: select ...
> 2010-04-15 16:50:05 CEST LOG: duration: 1079.879 ms statement: select ...
> [nothing for a few minutes]

Guessing five minutes each time? You should turn on log_checkpoints to be sure, but I'd bet money that's the interval, and that these are checkpoint spikes. If the checkpoint log entry shows up at about the same time as all these queries that were blocking behind it, that's what you've got.

> shared_buffers=5GB (database size is ~4.7GB on disk right now)

The best shot you have at making this problem a little better just with software tuning is to reduce this to something much smaller; 128MB - 256MB would be my starting suggestion. Make sure checkpoint_segments is still set to a high value.

The other thing you could try is to tune like this:

checkpoint_segments=256
checkpoint_timeout=20min

Which would get you 4X as much checkpoint spreading as you have now.

> fsync=off

This is just generally a bad idea.

> work_mem=500MB
> wal_buffers=256MB (*)
> commit_delay=10 (*)

That's way too big a value for work_mem; there's no sense making wal_buffers bigger than 16MB; and you shouldn't ever adjust commit_delay. It's a mostly broken feature that might even introduce latency issues in your situation. None of these are likely related to your problem today though.

> I am suspecting some strange software RAID or kernel problem, unless the default bgwriter settings can actually cause selects to get stuck for so long when there are too many dirty buffers (I hope not).

This is fairly simple: your kernel is configured to allow the system to cache hundreds of megabytes, if not gigabytes, of writes. There is no way to make that go completely away because the Linux kernel has an unfortunate design in terms of being low latency. I've written two papers in this area:

http://www.westnet.com/~gsmith/content/linux-pdflush.htm
http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm

And I doubt I could get the worst case on these tuned down to under a second using software RAID without a proper disk controller. Periodically, the database must get everything in RAM flushed out to disk, and the only way to make that happen instantly is for there to be a hardware write cache to dump it into, and the most common way to get one of those is to buy a hardware RAID card.

> Unless I'm missing something, I only have a non-RAID setup or ramdisks (tmpfs), or SSDs left to try to get rid of these

Battery-backed write caching controller, and then re-tune afterwards. Nothing else will improve your situation very much. SSDs have their own issues under heavy writes and the RAID has nothing to do with your problem. If this is disposable data and you can run from a RAM disk, now that would work, but now you've got some serious work to do in order to make that persistent.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.us
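The dirty-memory arithmetic in this reply can be checked with a small sketch. The 32GB of RAM and the 5% and 80% ratios come from the thread; the 50 MB/s sustained write rate is purely an assumed figure for illustration, and the class and method names are made up. As far as I understand it, the kernel applies the ratio to somewhat less than total RAM, so these are upper-bound estimates rather than exact limits.

```java
// A rough sketch of how much unwritten data a given dirty ratio allows,
// and how long draining it might take at an assumed disk write rate.
final class DirtyMemoryBudget {

    /** Dirty data allowed, in GB, if the ratio applied to all of RAM. */
    static double dirtyLimitGb(double ramGb, int ratioPercent) {
        return ramGb * ratioPercent / 100.0;
    }

    /** Seconds needed to write out dirtyGb at a sustained rate in MB/s. */
    static double drainSeconds(double dirtyGb, double writeMbPerSecond) {
        return dirtyGb * 1024.0 / writeMbPerSecond;
    }

    public static void main(String[] args) {
        double assumedWriteRate = 50.0; // MB/s, an assumption, not a measurement
        for (int ratio : new int[] {5, 10, 80}) {
            double gb = dirtyLimitGb(32.0, ratio);
            System.out.printf("ratio=%d%%: %.1f GB dirty, ~%.0f s to drain%n",
                    ratio, gb, drainSeconds(gb, assumedWriteRate));
        }
    }
}
```

Even at 5% this works out to 1.6 GB, which at the assumed rate would take on the order of half a minute to flush; at the poster's setting of 80% the figure is 25.6 GB. That is the reasoning behind treating the kernel defaults as upper limits when tuning for latency.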
Re: [PERFORM] Autovacuum with cost_delay does not complete on one solaris 5.10 machine
Josh Berkus j...@agliodbs.com writes:

>> But it shouldn't be sleeping after each page with normal cost_delay parameters, should it?
>
> Right, that's why I find this puzzling. If the problem was easier to reproduce it would be easier to analyze.

The behavior would be explained if VacuumCostLimit were getting set to zero (or some unreasonably small value) in the autovac worker process. I looked at the autovac code that manages that, and it seems complicated enough that a bug wouldn't surprise me in the least. I especially note that wi_cost_limit is explicitly initialized to zero, rather than something sane; and that table_recheck_autovac falls back to setting vac_cost_limit from the previous value of VacuumCostLimit ... which is NOT constant but in general is left over from the previously processed table. One should also keep in mind that SIGHUP processing might reload VacuumCostLimit from GUC values. So I think that area needs a closer look.

Josh, are you sure that both servers are identical in terms of both GUC-related and per-table autovacuum settings?

regards, tom lane
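The effect of an unreasonably small cost limit can be illustrated with a deliberately simplified model of the cost-based delay. This is a sketch, not the server's actual code: it charges a fixed cost per page, sleeps for a fixed delay whenever the accumulated balance reaches the limit, and ignores the way the real implementation scales the sleep time. The limit of 200 and delay of 20ms come from the thread; the per-page cost of 10 (which I believe is the default vacuum_cost_page_miss) and the 100,000-page table are assumptions.

```java
// A simplified model of cost-based vacuum delay: accumulate a cost per page
// and sleep whenever the running balance reaches the limit.
final class VacuumCostModel {

    /** Total time spent sleeping, in milliseconds, while processing the given pages. */
    static long totalSleepMillis(long pages, int costPerPage, int costLimit, int delayMillis) {
        long balance = 0;
        long slept = 0;
        for (long page = 0; page < pages; page++) {
            balance += costPerPage;
            if (balance >= costLimit) {
                slept += delayMillis; // the real server would call pg_usleep here
                balance = 0;
            }
        }
        return slept;
    }

    public static void main(String[] args) {
        long pages = 100_000; // hypothetical table size
        System.out.println("limit=200: " + totalSleepMillis(pages, 10, 200, 20) + " ms asleep");
        System.out.println("limit=1:   " + totalSleepMillis(pages, 10, 1, 20) + " ms asleep");
    }
}
```

With the normal limit the model sleeps once every 20 pages, for 100 seconds in total; with a limit of 1 it sleeps after every page, for 2,000 seconds. A sleep per page matches the truss pattern reported earlier in the thread, which is why a corrupted VacuumCostLimit is a plausible suspect.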
Re: [PERFORM] Autovacuum with cost_delay does not complete on one solaris 5.10 machine
> Josh, are you sure that both servers are identical in terms of both GUC-related and per-table autovacuum settings?

I should check per-table. GUC, yes, because the company has source management for config files.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
[PERFORM] SOLVED: Re: HELP: How to tame the 8.3.x JDBC driver with a big query result set
When a connection is used for both reading and writing, a commit() also destroys any open cursors. Simple workaround - use two connections.

See full discussion on JDBC list.

Cheers
Dave

On Thu, Apr 15, 2010 at 3:01 PM, Dave Crooke dcro...@gmail.com wrote:

> I have followed the instructions below to no avail. Any thoughts?
>
> http://jdbc.postgresql.org/documentation/83/query.html#query-with-cursor
>
> [...]
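The two-connection workaround can be sketched with plain java.sql calls. This is a minimal illustration under stated assumptions, not the code from the original application: the class and method names are made up, the insert statement is assumed to have one placeholder per selected column, and it uses try-with-resources, which is newer than the Java 6 runtime seen in the stack trace. Per the driver documentation linked in the thread, cursor-based fetching also requires autocommit to be off and a forward-only result set on the reading connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// A sketch of the workaround: read through a cursor on one connection and
// write (and commit) on a second one, so commits never close the read cursor.
final class CursorCopy {

    /** Streams rows from readConn and inserts them via writeConn; returns rows copied. */
    static long copy(Connection readConn, Connection writeConn,
                     String selectSql, String insertSql,
                     int fetchSize, int commitEvery) throws SQLException {
        if (readConn == writeConn) {
            throw new IllegalArgumentException("use two separate connections");
        }
        if (fetchSize <= 0 || commitEvery <= 0) {
            throw new IllegalArgumentException("fetchSize and commitEvery must be positive");
        }
        readConn.setAutoCommit(false);   // cursor-based fetch needs an open transaction
        writeConn.setAutoCommit(false);
        long rows = 0;
        try (Statement read = readConn.createStatement(
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
             PreparedStatement write = writeConn.prepareStatement(insertSql)) {
            read.setFetchSize(fetchSize); // fetch in chunks instead of all at once
            try (ResultSet rs = read.executeQuery(selectSql)) {
                int columns = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    for (int i = 1; i <= columns; i++) {
                        write.setObject(i, rs.getObject(i));
                    }
                    write.addBatch();
                    if (++rows % commitEvery == 0) {
                        write.executeBatch();
                        writeConn.commit(); // only the writing connection commits
                    }
                }
            }
            write.executeBatch();
            writeConn.commit();
        }
        readConn.rollback(); // end the read-only transaction
        return rows;
    }
}
```

A call might look like copy(readConn, writeConn, "select * from foo", "insert into foo_copy values (?, ?, ?)", 1000, 10000), where foo_copy and the three placeholders are hypothetical. The guard at the top exists because passing the same connection twice would reproduce the "portal does not exist" error: the first commit would close the cursor that the read loop depends on.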