Re: [PERFORM] Sorted group by

2010-08-11 Thread Matthew Wakeling
Original query: explain analyse select * from tracker where objectid < 120; QUERY PLAN --- Index Scan using tracker_objectid on tracker (cost=0.00..915152.62 rows=3684504 width=33) (

Re: [PERFORM] Sorted group by

2010-08-10 Thread Matthew Wakeling
On Tue, 10 Aug 2010, Thomas Kellerer wrote: No. It's built in (8.4) and it's called Windowing functions: http://www.postgresql.org/docs/8.4/static/tutorial-window.html http://www.postgresql.org/docs/8.4/static/functions-window.html SELECT group, last_value(value) over(ORDER BY number) FROM table
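A sketch of the windowing approach Thomas describes, against a hypothetical table `tbl(grp, number, value)` (the real schema is not shown in the thread). Note that `last_value()` with the default window frame only sees rows up to the current one, so `first_value()` over a descending order is the safer way to express "value at the highest number":

```sql
-- For each group, take the value belonging to the highest "number".
SELECT DISTINCT grp,
       first_value(value) OVER (PARTITION BY grp ORDER BY number DESC)
FROM tbl;
```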

[PERFORM] Sorted group by

2010-08-10 Thread Matthew Wakeling
I'm trying to eke a little bit more performance out of an application, and I was wondering if there was a better way to do the following: I am trying to retrieve, for many sets of rows grouped on a couple of fields, the value of an ungrouped field where the row has the highest value in anoth
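One common PostgreSQL idiom for "the row with the highest value in another field, per group" is `DISTINCT ON` (a sketch against a hypothetical table `t(a, b, extra, v)`, since the real schema is truncated above):

```sql
-- Keep, for each (a, b) group, the row with the largest v, and return
-- the otherwise-ungrouped column "extra" from that winning row.
SELECT DISTINCT ON (a, b) a, b, extra
FROM t
ORDER BY a, b, v DESC;
```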

Re: [PERFORM] Advice configuring ServeRAID 8k for performance

2010-08-06 Thread Matthew Wakeling
On Thu, 5 Aug 2010, Scott Marlowe wrote: RAID6 is basically RAID5 with a hot spare already built into the array. On Fri, 6 Aug 2010, Pierre C wrote: As others said, RAID6 is RAID5 + a hot spare. No. RAID6 is NOT RAID5 plus a hot spare. RAID5 uses a single parity datum (XOR) to ensure protec

Re: [PERFORM] Testing Sandforce SSD

2010-07-26 Thread Matthew Wakeling
On Mon, 26 Jul 2010, Greg Smith wrote: Matthew Wakeling wrote: Does your latency graph really have milliseconds as the y axis? If so, this device is really slow - some requests have a latency of more than a second! Have you tried that yourself? If you generate one of those with standard

Re: [PERFORM] Testing Sandforce SSD

2010-07-26 Thread Matthew Wakeling
On Sun, 25 Jul 2010, Yeb Havinga wrote: Graph of TPS at http://tinypic.com/r/b96aup/3 and latency at http://tinypic.com/r/x5e846/3 Does your latency graph really have milliseconds as the y axis? If so, this device is really slow - some requests have a latency of more than a second! Matthew

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-12 Thread Matthew Wakeling
On Sat, 10 Jul 2010, Tom Lane wrote: Doesn't pgpool do this? No, and in fact that's exactly why the proposed implementation isn't ever going to be in core: it's not possible to do it portably. I'm surprised. Doesn't apache httpd do this? Does it have to do a whole load of non-portable stuff?

Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Matthew Wakeling
On Fri, 9 Jul 2010, Kevin Grittner wrote: Interesting idea. As far as I can see, you are suggesting solving the too many connections problem by allowing lots of connections, but only allowing a certain number to do anything at a time? Right. I think in some situations, this arrangement would

Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Matthew Wakeling
On Fri, 9 Jul 2010, Kevin Grittner wrote: Any thoughts on the "minimalist" solution I suggested a couple weeks ago?: http://archives.postgresql.org/pgsql-hackers/2010-06/msg01385.php http://archives.postgresql.org/pgsql-hackers/2010-06/msg01387.php So far, there has been no comment by anyone...

Re: [PERFORM] Two different execution plan for the same request

2010-07-07 Thread Matthew Wakeling
On Wed, 7 Jul 2010, JOUANIN Nicolas (44) wrote: It seems to work fine (same execution plan and less duration) after: - setting default_statistics_target to 100 - full vacuum with analyze Don't do VACUUM FULL. Matthew
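The usual advice behind "don't do VACUUM FULL" (in the 8.x era it takes an exclusive lock and bloats indexes) is a plain vacuum plus a statistics refresh, along these lines:

```sql
-- Plain VACUUM reclaims space without the exclusive lock or index
-- bloat that 8.x-era VACUUM FULL causes.
VACUUM ANALYZE;

-- The statistics-target change discussed above, for the session:
SET default_statistics_target = 100;
```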

Re: [PERFORM] how to (temporarily) disable/minimize benefits of disk block cache or postgresql shared buffer

2010-07-05 Thread Matthew Wakeling
On Fri, Jul 2, 2010 at 8:30 AM, Craig Ringer wrote: Yeah, if you're in a weird virtualized environment like that you're likely to have problems... On Sat, 3 Jul 2010, Rajesh Kumar Mallah wrote: Thanks for thinking about it. I do not understand why you feel OpenVz is weird. At the most its not ver

Re: [PERFORM] Highly Efficient Custom Sorting

2010-07-02 Thread Matthew Wakeling
On Fri, Jul 2, 2010 at 12:08 AM, Tom Lane wrote: I'm guessing from tea leaves, but the impression I got from Eliot's description is that he's using plpgsql functions as sort comparators. It's not surprising that that sucks performance-wise compared to having the equivalent logic in C/C++ functio

Re: [PERFORM] ideal storage configuration

2010-06-30 Thread Matthew Wakeling
On Tue, 29 Jun 2010, Samuel Gendler wrote: The copy statements execute in a small fraction of the minute in which they occur. I'm going to ask a silly question here. If the system is already coping quite well with the load, then why are you changing it? All old data gets removed by dropping

Re: [PERFORM] Write performance

2010-06-24 Thread Matthew Wakeling
On Thu, 24 Jun 2010, Janning wrote: We have a 12 GB RAM machine with intel i7-975 and using 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" Those discs are 1.5TB, not 1.5GB. One disk for the system and WAL etc. and one SW RAID-0 with two disks for postgresql data. Our database is a

Re: [PERFORM] WAL+Os on a single disk

2010-06-24 Thread Matthew Wakeling
On Wed, 23 Jun 2010, Scott Marlowe wrote: We have a 12 x 600G hot swappable disk system (raid 10) and 2 internal disks (2x 146G) Does it make sense to put the WAL and OS on the internal disks So for us, the WAL and OS and logging on the same data set works well. Generally, it is recommended

Re: [PERFORM] raid10 write performance

2010-06-23 Thread Matthew Wakeling
On Wed, 23 Jun 2010, Ivan Voras wrote: On 06/23/10 14:00, Florian Weimer wrote: Barrier support on RAID10 seems to require some smallish amount of non-volatile storage which supports a high number of write operations per second, so a software-only solution might not be available. If I understa

Re: [PERFORM] B-Heaps

2010-06-18 Thread Matthew Wakeling
On Fri, 18 Jun 2010, Robert Haas wrote: On Tue, Jun 15, 2010 at 8:23 AM, Matthew Wakeling wrote: Absolutely, and I said in http://archives.postgresql.org/pgsql-performance/2010-03/msg00272.php but applied to the Postgres B-tree indexes instead of heaps. This is an interesting idea. I would

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-18 Thread Matthew Wakeling
Dimitri Fontaine wrote: Well I guess I'd prefer a per-transaction setting Not possible, as many others have said. As soon as you make an unsafe transaction, all the other transactions have nothing to rely on. On Thu, 17 Jun 2010, Pierre C wrote: A per-table (or per-index) setting makes more

Re: [PERFORM] Parallel queries for a web-application |performance testing

2010-06-17 Thread Matthew Wakeling
On Wed, 16 Jun 2010, Balkrishna Sharma wrote: Hello, I will have a web application with postgres 8.4+ as backend. At any given time, there will be a max of 1000 parallel web-users interacting with the database (read/write). I wish to do performance testing of 1000 simultaneous read/write to the da

Re: [PERFORM] B-Heaps

2010-06-15 Thread Matthew Wakeling
On Mon, 14 Jun 2010, Eliot Gable wrote: Just curious if this would apply to PostgreSQL: http://queue.acm.org/detail.cfm?id=1814327 Absolutely, and I said in http://archives.postgresql.org/pgsql-performance/2010-03/msg00272.php but applied to the Postgres B-tree indexes instead of heaps. It's a

Re: [PERFORM] slow query performance

2010-06-11 Thread Matthew Wakeling
On Fri, 11 Jun 2010, Kenneth Marshall wrote: If you check the archives, you will see that this is not easy to do because of the effects of caching. Indeed. If you were to take the value at completely face value, a modern hard drive is capable of transferring sequential pages somewhere between

Re: [PERFORM] Weird XFS WAL problem

2010-06-04 Thread Matthew Wakeling
On Thu, 3 Jun 2010, Greg Smith wrote: And it's also quite reasonable for a RAID controller to respond to that "flush the whole cache" call by flushing its cache. Remember that the RAID controller is presenting itself to the OS as a large disc, and hiding the individual discs from the OS. Why s

Re: [PERFORM] slow query

2010-06-04 Thread Matthew Wakeling
On Thu, 3 Jun 2010, Anj Adu wrote: http://explain.depesz.com/s/kHa I'm interested in why the two partitions dev4_act_dy_fact and dev4_act_dy_fact_2010_05_t3 are treated so differently. I'm guessing that the former is the parent and the latter the child table? When accessing the parent table

Re: [PERFORM] Weird XFS WAL problem

2010-06-03 Thread Matthew Wakeling
On Thu, 3 Jun 2010, Craig James wrote: Also, are barriers *on* on the RAID1 mount and off on the RAID10 one? It was the barriers. "barrier=1" isn't just a bad idea on ext4, it's a disaster. This worries me a little. Does your array have a battery-backed cache? If so, then it should be fast

Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present

2010-06-03 Thread Matthew Wakeling
On Wed, 2 Jun 2010, Jori Jovanovich wrote: (2) Making the query faster by making the string match LESS specific (odd, seems like it should be MORE) No, that's the way round it should be. The LIMIT changes it all. Consider if you have a huge table, and half of the entries match your WHERE claus
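Matthew's point can be illustrated with a hypothetical table (all names invented): with a LIMIT, a plan that walks an index in the ORDER BY order can stop as soon as it has enough matches, so a *common* match condition finishes sooner than a rare one.

```sql
-- Common pattern: scanning the ts index in ORDER BY order finds
-- 10 matching rows almost immediately and stops early.
SELECT * FROM event WHERE msg LIKE '%common%' ORDER BY ts DESC LIMIT 10;

-- Rare pattern: the same plan may scan most of the index before it
-- collects 10 rows, so a different index (or plan) wins instead.
SELECT * FROM event WHERE msg LIKE '%rare%' ORDER BY ts DESC LIMIT 10;
```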

Re: [PERFORM] Overusing 1 CPU

2010-06-02 Thread Matthew Wakeling
On Wed, 2 Jun 2010, Mozzi wrote: This box is basically idle @ the moment as it is still in testing yet top shows high usage on just 1 of the cores. First port of call: What process is using the CPU? Run top on a fairly wide terminal and use the "c" button to show the full command line. Matth

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-06-01 Thread Matthew Wakeling
On Tue, 1 Jun 2010, Stephen Frost wrote: * Matthew Wakeling (matt...@flymine.org) wrote: The major case I found when writing pl/pgsql was when trying to build arrays row by row. AFAIK when I tried it, adding a row to an array caused the whole array to be copied, which put a bit of a damper on

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-06-01 Thread Matthew Wakeling
On Fri, 28 May 2010, Merlin Moncure wrote: At best, if you are a ninja with the marginally documented backend api, you will create code that goes about as fast as your pl/pgsql function for 10 times the amount of input work, unless there are heavy amounts of 'other than sql' code in your function

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-06-01 Thread Matthew Wakeling
On Sun, 23 May 2010, David Jarvis wrote: The measurement table indexes (on date and weather station) were not being used because the only given date ranges (e.g., 1900 - 2009) were causing the planner to do a full table scan, which is correct. I wonder if you might see some benefit from CLUSTER
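The CLUSTER suggestion would look roughly like this (the index name is invented; the table is the `climate.measurement` table from the thread):

```sql
-- Rewrite the table in index order so rows for the same station and
-- date range sit physically together, cutting random I/O on range scans.
CLUSTER climate.measurement USING measurement_station_taken_idx;
-- CLUSTER rewrites the table, so refresh the statistics afterwards.
ANALYZE climate.measurement;
```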

Re: [PERFORM] prepared query performs much worse than regular query

2010-05-21 Thread Matthew Wakeling
On Fri, 21 May 2010, Richard Yen wrote: Any ideas why the query planner chooses a different query plan when using prepared statements? This is a FAQ. Preparing a statement makes Postgres create a plan, without knowing the values that you will plug in, so it will not be as optimal as if the v
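A minimal sketch of the FAQ (table and column names invented): the prepared plan must work for *any* parameter value, so the planner cannot use value-specific statistics, whereas a literal lets it see exactly how selective the condition is.

```sql
-- Generic plan: $1 is unknown at plan time, so the planner assumes
-- an average selectivity rather than the statistics for value 42.
PREPARE q(int) AS SELECT * FROM orders WHERE customer_id = $1;
EXECUTE q(42);

-- Interpolated literal: the planner can consult the column statistics
-- for the actual value and may pick a very different plan.
SELECT * FROM orders WHERE customer_id = 42;
```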

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Matthew Wakeling
Regarding the leap year problem, you might consider creating a modified day of year field, which always assumes that the year contains a leap day. Then a given number always resolves to a given date, regardless of year. If you then partition (or index) on that field, then you may get a benefit.
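A sketch of that idea (column and index names invented, table from the thread): compute a day-of-year that pretends every year contains a Feb 29, so each number resolves to the same calendar date in every year.

```sql
-- "Leap-normalised" day of year: in a non-leap year, days from Mar 1
-- onwards (doy >= 60) shift up by one, so e.g. Mar 1 is always 61.
ALTER TABLE climate.measurement ADD COLUMN leap_doy int;
UPDATE climate.measurement SET leap_doy =
    extract(doy FROM taken)::int
    + CASE WHEN extract(doy FROM taken) >= 60
                AND NOT (extract(year FROM taken)::int % 4 = 0
                         AND (extract(year FROM taken)::int % 100 <> 0
                              OR extract(year FROM taken)::int % 400 = 0))
           THEN 1 ELSE 0 END;
CREATE INDEX measurement_leap_doy_idx ON climate.measurement (leap_doy);
```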

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Matthew Wakeling
On Fri, 21 May 2010, Yeb Havinga wrote: For time based data I would for sure go for year based indexing. On the contrary, most of the queries seem to be over many years, but rather restricting on the time of year. Therefore, partitioning by month or some other per-year method would seem sensi

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Matthew Wakeling
On Thu, 20 May 2010, David Jarvis wrote: I took out the date conditions: SELECT m.* FROM climate.measurement m WHERE m.category_id = 1 and m.station_id = 2043 This uses the station indexes: Yes, because there is only one station_id selected. That's exactly what an index is for. Then c

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Matthew Wakeling
On Wed, 19 May 2010, David Jarvis wrote: extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND That portion of the WHERE clause cannot use an index on m.taken. Postgres does not look inside functions (like extract) to see if something indexable is present. To get an index to work, you c
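The two indexable rewrites Matthew alludes to, using the thread's `climate.measurement` table (the index name is invented):

```sql
-- Not indexable: extract() hides the column from a plain index on taken.
SELECT * FROM climate.measurement m
WHERE extract(YEAR FROM m.taken) BETWEEN 1900 AND 2009;

-- Indexable equivalent: a plain range condition on the column itself.
SELECT * FROM climate.measurement m
WHERE m.taken >= DATE '1900-01-01' AND m.taken < DATE '2010-01-01';

-- Or build an expression index so the original form becomes indexable.
CREATE INDEX measurement_taken_year_idx
    ON climate.measurement ((extract(YEAR FROM taken)));
```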

Re: [PERFORM] merge join killing performance

2010-05-19 Thread Matthew Wakeling
On Wed, 19 May 2010, Scott Marlowe wrote: It's apparently estimating (wrongly) that the merge join won't have to scan very much of "files" before it can stop because it finds an eid value larger than any eid in the other table.  So the issue here is an inexact stats value for the max eid. I wan

Re: [PERFORM] merge join killing performance

2010-05-18 Thread Matthew Wakeling
On Tue, 18 May 2010, Scott Marlowe wrote: Aggregate (cost=902.41..902.42 rows=1 width=4) -> Merge Join (cost=869.97..902.40 rows=1 width=4) Merge Cond: (f.eid = ev.eid) -> Index Scan using files_eid_idx on files f (cost=0.00..157830.39 rows=3769434 width=8) Okay, that's we

Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-19 Thread Matthew Wakeling
On Sat, 17 Apr 2010, Віталій Тимчишин wrote: As for making the planner more clever, maybe it is possible to introduce a division into "fast queries" and "long queries", so that if after fast planning the cost is greater than some configurable threshold, advanced planning techniques (or settings) are used. As

Re: [PERFORM] LIMIT causes planner to do Index Scan using a less optimal index

2010-04-07 Thread Matthew Wakeling
On Wed, 7 Apr 2010, sherry.ctr@faa.gov wrote: Please just let me know if Postgres can do this kind of index or not. create index idx1 on tb1(col1, col2) Then later we can find it is useful or useless. Have you tried it? Grzegorz Jaśkiewicz wrote: something like this: create index id

Re: [PERFORM] experiments in query optimization

2010-03-31 Thread Matthew Wakeling
On Tue, 30 Mar 2010, Faheem Mitha wrote: work_mem = 1 GB (see diag.{tex/pdf}). Sure, but define sane setting, please. I guess part of the point is that I'm trying to keep memory low You're trying to keep memory usage low, but you have work_mem set to 1GB? Matthew

Re: [PERFORM] Performance regarding LIKE searches

2010-03-30 Thread Matthew Wakeling
On Mon, 29 Mar 2010, randa...@bioinfo.wsu.edu wrote: WHERE ... lower(n.name) LIKE 'Scaffold:scaffold_163:1000..1199%' ... I'm sure you noticed that this is never going to return any rows? Matthew

Re: [PERFORM] Optimizer showing wrong rows in plan

2010-03-29 Thread Matthew Wakeling
On Mon, 29 Mar 2010, Tadipathri Raghu wrote: As per the documentation, one page is 8kb, and when I create a table with int as one column its 4 bytes. If I insert 2000 rows, it should be in one page only as its 8kb, but its extending vastly more than expected. Example shown below, taking the previous example

Re: [PERFORM] memory question

2010-03-25 Thread Matthew Wakeling
On Wed, 24 Mar 2010, Campbell, Lance wrote: I have 24 Gig of memory on my server... Our server manager seems to think that I have way too much memory. He thinks that we only need 5 Gig. Your organisation probably spent more money getting your server manager to investigate how much RAM you need

Re: [PERFORM] GiST index performance

2010-03-22 Thread Matthew Wakeling
On Mon, 22 Mar 2010, Yeb Havinga wrote: Yes, that is certainly a factor. For example, the page size for bioseg which we use here is 130 entries, which is very excessive, and doesn't allow very deep trees. On the other hand, it means that a single disc seek performs quite a lot of work. Yeah,

Re: [PERFORM] GiST index performance

2010-03-22 Thread Matthew Wakeling
On Sat, 20 Mar 2010, Yeb Havinga wrote: The gist virtual pages would then match more the original blocksizes that were used in Guttman's R-tree paper (first google result, then figure 4.5). Since the nature/characteristics of the underlying datatypes and keys is not changed, it might be that with

Re: [PERFORM] too complex query plan for not exists query and multicolumn indexes

2010-03-22 Thread Matthew Wakeling
On Fri, 19 Mar 2010, Stephen Frost wrote: ...it has to go to an external on-disk sort (see later on, and how to fix that). This was covered on this list a few months ago, in http://archives.postgresql.org/pgsql-performance/2009-08/msg00184.php and http://archives.postgresql.org/pgsql-performa

Re: [PERFORM] pg_dump far too slow

2010-03-18 Thread Matthew Wakeling
On Sun, 14 Mar 2010, David Newall wrote: nohup time pg_dump -f database.dmp -Z9 database I presumed pg_dump was CPU-bound because of gzip compression, but a test I ran makes that seem unlikely... There was some discussion about this a few months ago at http://archives.postgresql.org/pgsql-

Re: [PERFORM] GiST index performance

2010-03-15 Thread Matthew Wakeling
On Thu, 25 Feb 2010, Bruce Momjian wrote: Was there every any conclusion on this issue? Not really. Comments inline: Matthew Wakeling wrote: Revisiting the thread a month back or so, I'm still investigating performance problems with GiST indexes in Postgres. Looking at

Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Matthew Wakeling
On Mon, 15 Mar 2010, Tom Lane wrote: For an example like this one, you have to keep in mind that the toast-table rows for the large bytea value have to be marked deleted, too. Also, since I/O happens in units of pages, the I/O volume to delete a tuple is just as much as the I/O to create it. (T

[PERFORM] Dell PERC H700/H800

2010-02-11 Thread Matthew Wakeling
Just a heads up - apparently the more recent Dell RAID controllers will no longer recognise hard discs that weren't sold through Dell. http://www.channelregister.co.uk/2010/02/10/dell_perc_11th_gen_qualified_hdds_only/ As one of the comments points out, that kind of makes them no longer SATA

Re: [PERFORM] Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-04 Thread Matthew Wakeling
On Thu, 4 Feb 2010, Amitabh Kant wrote: On Wed, Feb 3, 2010 at 10:05 PM, Ivan Voras wrote: If you can, add another 2 drives in RAID 1 and move+symlink the pg_xlog directory to the new array. Can't do anything about this server now, but would surely keep in mind before upgrading other servers.

Re: [PERFORM] System overload / context switching / oom, 8.3

2010-02-03 Thread Matthew Wakeling
On Tue, 2 Feb 2010, Rob wrote: pg 8.3.9, Debian Etch, 8gb ram, quadcore xeon, megaraid (more details at end) ~240 active databases, 800+ db connections via tcp. Linux 2.6.18-6-686-bigmem #1 SMP Thu Nov 5 17:30:05 UTC 2009 i686 GNU/Linux (Debian Etch) 8 GB RAM 4 Quad Core Intel(R) Xeon(R) CPU

Re: [PERFORM] Should the optimiser convert a CASE into a WHERE if it can?

2010-01-27 Thread Matthew Wakeling
On Wed, 27 Jan 2010, Віталій Тимчишин wrote: How about SELECT SUM (case when id > 120 and id < 121 then 1 end) from tbl_tracker; That is very interesting. * All the functions should be noop for null input Alas, not true for COUNT(*), AVG(), etc. Matthew

Re: [PERFORM] test send (recommended by Dave Page)

2010-01-27 Thread Matthew Wakeling
On Wed, 27 Jan 2010, Mark Steben wrote: Subject: [PERFORM] test send (recommended by Dave Page) Hi all - sorry to create additional email 'noise' but I've been trying to post a rather long query to the pgsql-performance user list. Dave thought that it might have been bounced due to the lengt

Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Matthew Wakeling
On Wed, 27 Jan 2010, Thom Brown wrote: Had a quick look at a benchmark someone put together of MySQL vs PostgreSQL, and while PostgreSQL is generally faster, I noticed the bulk delete was very slow: http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html Is this normal? On the contrar

Re: [PERFORM] Should the optimiser convert a CASE into a WHERE if it can?

2010-01-26 Thread Matthew Wakeling
On Tue, 26 Jan 2010, Richard Neill wrote: SELECT SUM (case when id > 120 and id < 121 then 1 else 0 end) from tbl_tracker; Explain shows that this does a sequential scan. I'd defer to Tom on this one, but really, for Postgres to work this out, it would have to peer deep into the myst
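The transformation being discussed, written out (using the thread's own table and literals):

```sql
-- The CASE form forces a sequential scan: every row is examined and
-- contributes either 1 or 0 to the sum.
SELECT SUM(CASE WHEN id > 120 AND id < 121 THEN 1 ELSE 0 END)
FROM tbl_tracker;

-- The equivalent WHERE form lets the planner use an index on id.
SELECT COUNT(*) FROM tbl_tracker WHERE id > 120 AND id < 121;
```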

Re: [PERFORM] splitting data into multiple tables

2010-01-26 Thread Matthew Wakeling
On Mon, 25 Jan 2010, nair rajiv wrote: I am working on a project that will take out structured content from wikipedia and put it in our database... there is a table which will approximately have 5 crore entries after data harvesting. Have you asked the Wikimedia Foundation if they mind you con

Re: [PERFORM] splitting data into multiple tables

2010-01-26 Thread Matthew Wakeling
On Mon, 25 Jan 2010, Viji V Nair wrote: I think this wont help that much if you have a single machine. Partition the table and keep the data in different nodes. Have a look at the tools like pgpool.II So partitioning. You have three choices: 1. Use a single table 2. Partition the table on the
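In the 8.4 era, option 2 means inheritance plus CHECK constraints (all names below are invented for illustration):

```sql
-- Parent table plus one child partition per date range.
CREATE TABLE articles (id bigint, title text, fetched date);
CREATE TABLE articles_2010_01 (CHECK (fetched >= DATE '2010-01-01'
                                  AND fetched <  DATE '2010-02-01'))
    INHERITS (articles);

-- With constraint exclusion enabled, a query filtering on "fetched"
-- skips child tables whose CHECK constraint rules them out.
SET constraint_exclusion = on;
SELECT count(*) FROM articles WHERE fetched >= DATE '2010-01-15';
```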

Re: [PERFORM] Sql result b where condition

2010-01-25 Thread Matthew Wakeling
On Mon, 25 Jan 2010, A. Kretschmer wrote: In response to ramasubramanian : Please, create a new mail for a new topic and don't hijack other threads. Even more so - this isn't probably the right mailing list for generic sql help questions. select ENAME,ORIG_SALARY from employee where (ename

Re: [PERFORM] Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL

2010-01-25 Thread Matthew Wakeling
On Mon, 25 Jan 2010, Richard Huxton wrote: OK - so the first query processes 19,799 rows in 31,219 ms (about 1.5ms per row) The second processes 2,606 rows in 3,813 ms (about 1.3ms per row). Agreed. One query is faster than the other because it has to do an eighth the amount of work. Matth

Re: [PERFORM] Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL

2010-01-22 Thread Matthew Wakeling
On Fri, 22 Jan 2010, Tory M Blue wrote: But the same sql that returns maybe 500 rows is pretty fast, it's the return of 10K+ rows that seems to stall and is CPU Bound. Okay, so you have two differing cases. Show us the EXPLAIN ANALYSE for both of them, and we will see what the difference is.

Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-21 Thread Matthew Wakeling
On Thu, 21 Jan 2010, Greg Smith wrote: In the attachement you'll find 2 screenshots perfmon34.png and perfmon35.png (I hope 2x14 kb is o.k. for the mailing list). I don't think they made it to the list? No, it seems that no emails with image attachments ever make it through the list server.

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-21 Thread Matthew Wakeling
On Wed, 20 Jan 2010, Greg Smith wrote: Basically, to an extent, that's right. However, when you get 16 drives or more into a system, then it starts being an issue. I guess if I test a system with *only* 16 drives in it one day, maybe I'll find out. *Curious* What sorts of systems have you tr

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-20 Thread Matthew Wakeling
On Fri, 15 Jan 2010, Greg Smith wrote: It seems to me that CFQ is simply bandwidth limited by the extra processing it has to perform. I'm curious what you are doing when you see this. 16 disc 15kRPM RAID0, when using fadvise with more than 100 simultaneous 8kB random requests. I sent an emai

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Matthew Wakeling
On Fri, 15 Jan 2010, Craig James wrote: That's the perception I get. CFQ is the default scheduler, but in most systems I have seen, it performs worse than the other three schedulers, all of which seem to have identical performance. I would avoid anticipatory on a RAID array though. I thought

Re: [PERFORM] new server I/O setup

2010-01-15 Thread Matthew Wakeling
On Fri, 15 Jan 2010, Fernando Hevia wrote: I was wondering if disabling the bbu cache on the RAID 1 array would make any difference. All 256MB would be available for the random I/O on the RAID 10. That would be pretty disastrous, to be honest. The benefit of the cache is not only that it smoot

Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-15 Thread Matthew Wakeling
On Thu, 14 Jan 2010, fka...@googlemail.com wrote: Nevertheless: If your explanation covers all what can be said about it then replacing the hard disk by a faster one should increase the performance here (I'll try to check that out). Probably. However, it is worth you running the test again, and

Re: [PERFORM] new server I/O setup

2010-01-15 Thread Matthew Wakeling
On Thu, 14 Jan 2010, Scott Marlowe wrote: I've just received this new server: 1 x XEON 5520 Quad Core w/ HT 8 GB RAM 1066 MHz 16 x SATA II Seagate Barracuda 7200.12 3ware 9650SE w/ 256MB BBU 2 discs in RAID 1 for OS + pg_xlog partitioned with ext2. 12 discs in RAID 10 for postgres data, sole par

Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-15 Thread Matthew Wakeling
On Thu, 14 Jan 2010, fka...@googlemail.com wrote: The data needs to be written first to the WAL, in order to provide crash-safety. So you're actually writing 1600MB, not 800. I understand. So the actual throughput is 32MB/s which is closer to 43 MB/s, of course. Can I verify that by temporaril

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Matthew Wakeling
On Thu, 14 Jan 2010, Greg Smith wrote: Andy Colson wrote: So if there is very little io, or if there is way way too much, then the scheduler really doesn't matter. So there is a slim middle ground where the io is within a small percent of the HD capacity where the scheduler might make a diffe

Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Matthew Wakeling
On Thu, 14 Jan 2010, tom wrote: i have a db-table "data_measurand" with about 60,000,000 (60 million) rows and the following query takes about 20-30 seconds (with psql): mydb=# select count(*) from data_measurand; count -- 60846187 (1 row) Sounds pretty reasonable to me. Looking at you

Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Matthew Wakeling
On Thu, 14 Jan 2010, fka...@googlemail.com wrote: This takes about 50s, so, 800MB/50s = 16MB/s. However the harddisk (sata) could write 43 MB/s in the worst case! Why is write performance limited to 16 MB/s? Several reasons: The data needs to be written first to the WAL, in order to provide

Re: [PERFORM] performance config help

2010-01-12 Thread Matthew Wakeling
On Tue, 12 Jan 2010, Bob Dusek wrote: Each of the concurrent clients does a series of selects, inserts, updates, and deletes. The requests would generally never update or delete the same rows in a table. However, the requests do generally write to the same tables. And, they are all reading fro

Re: [PERFORM] performance config help

2010-01-12 Thread Matthew Wakeling
On Mon, 11 Jan 2010, Bob Dusek wrote: How do I learn more about the actual lock contention in my db? Lock contention makes some sense. Each of the 256 requests are relatively similar. So, I don't doubt that lock contention could be an issue. I just don't know how to observe it or correct

Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Matthew Wakeling
On Mon, 11 Jan 2010, Mathieu De Zutter wrote: > seq_page_cost = 0.1 > random_page_cost = 0.1 So if this query usually does *not* hit the cache, it will be probably faster if I leave it like that? While testing a query I execute it that much that it's always getting into the cache. However, s

Re: [PERFORM] Air-traffic benchmark

2010-01-07 Thread Matthew Wakeling
On Thu, 7 Jan 2010, Gurgel, Flavio wrote: If one single query execution had a step that brought a page to the buffercache, it's enough to increase another step speed and change the execution plan, since the data access in memory is (usually) faster then disk. Postgres does not change a query

Re: [PERFORM] Digesting explain analyze

2010-01-07 Thread Matthew Wakeling
On Thu, 7 Jan 2010, Jesper Krogh wrote: If disk seeks are killing you a kinda crazy idea would be to duplicate the table - clustering one by (id1) and the other one by an index on (id2) and unioning the results of each. That's doubling the disk space needs for the table. Is there any odds that

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Matthew Wakeling
On Wed, 6 Jan 2010, Dmitri Girski wrote: On the other hand, if I use ip addresses this should not attract any possible issues with DNS, right? Not true. It is likely that the server program you are connecting to will perform a reverse DNS lookup to work out who the client is, for logging or

Re: [PERFORM] Idea how to get rid of Bitmap Heap Scan

2009-12-18 Thread Matthew Wakeling
On Fri, 18 Dec 2009, Michael N. Mikhulya wrote: The problem here is that we are forced to fetch "files" in Bitmap Heap Scan. But actually there is no need for the whole "files" record. The necessary data is only "files" ids. The idea is to avoid fetching data from "files" table, and get the ids

Re: [PERFORM] Fw: Help me put 2 Gigs of RAM to use

2009-12-10 Thread Matthew Wakeling
On Thu, 10 Dec 2009, Mark Stosberg wrote: What I'm noticing is that the while the FreeBSD server has 4 Gigs of memory, there are rarely every more than 2 in use-- the memory use graphs as being rather constant. My goal is to make good use of those 2 Gigs of memory to improve performance and reduc

Re: [PERFORM] Optimizing Bitmap Heap Scan.

2009-12-08 Thread Matthew Wakeling
On Tue, 8 Dec 2009, niraj patel wrote: Thanks very much for the analysis. It does takes 17 sec to execute when data is not in cache. It sounds like the table is already very much ordered by the workspaceid, otherwise this would have taken much longer. What I would like to ask can partitionin

Re: [PERFORM] SSD + RAID

2009-12-08 Thread Matthew Wakeling
On Fri, 13 Nov 2009, Greg Smith wrote: In order for a drive to work reliably for database use such as for PostgreSQL, it cannot have a volatile write cache. You either need a write cache with a battery backup (and a UPS doesn't count), or to turn the cache off. The SSD performance figures you

Re: [PERFORM] Optimizing Bitmap Heap Scan.

2009-12-08 Thread Matthew Wakeling
On Tue, 8 Dec 2009, niraj patel wrote: Group (cost=509989.19..511518.30 rows=9 width=10) (actual time=1783.102..2362.587 rows=261 loops=1) -> Sort (cost=509989.19..510753.74 rows=305821 width=10) (actual time=1783.097..2121.378 rows=272211 loops=1) Sort Key: topfamilyid

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Matthew Wakeling
On Tue, 1 Dec 2009, Jean-Michel Pouré wrote: PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. Huh? At what point does the planner carry over previous plans and use them to further optimise

Re: [PERFORM] Query times change by orders of magnitude as DB ages

2009-11-26 Thread Matthew Wakeling
On Wed, 25 Nov 2009, Grzegorz Jaśkiewicz wrote: the out of order data layout is primary reason for index bloat. And that happens , and gets worse over time once data is more and more distributed. ("random" deletes, etc). That's not index bloat. Sure, having the table not in the same order as

Re: [PERFORM] Query times change by orders of magnitude as DB ages

2009-11-25 Thread Matthew Wakeling
On Wed, 25 Nov 2009, Richard Neill wrote: On Sun, 22 Nov 2009, Richard Neill wrote: Worse still, doing a cluster of most of the tables and vacuum full analyze Why are you doing a vacuum full? That command is not meant to be used except in the most unusual of circumstances, as it causes bloat

Re: [PERFORM] Strange performance degradation

2009-11-25 Thread Matthew Wakeling
On Tue, 24 Nov 2009, Denis Lussier wrote: Bouncing the app will roll back the transactions. Depends on the application. Some certainly use a shutdown hook to flush data out to a database cleanly. Obviously if you kill -9 it, then all bets are off. Matthew -- Software suppliers are trying t

[PERFORM] RAID card recommendation

2009-11-24 Thread Matthew Wakeling
We're about to purchase a new server to store some of our old databases, and I was wondering if someone could advise me on a RAID card. We want to make a 6-drive SATA RAID array out of 2TB drives, and it will be RAID 5 or 6 because there will be zero write traffic. The priority is stuffing as

Re: [PERFORM] Strange performance degradation

2009-11-24 Thread Matthew Wakeling
On Tue, 24 Nov 2009, Denis Lussier wrote: IMHO the client application is already confused and it's in Prod. Shouldn't he perhaps terminate/abort the IDLE connections in Prod and work on correcting the problem so it doesn't occur in Dev/Test?? The problem is, the connection isn't just IDLE - it

Re: [PERFORM] [GENERAL] Strange performance degradation

2009-11-24 Thread Matthew Wakeling
On Mon, 23 Nov 2009, Lorenzo Allegrucci wrote: Anyway, how can I get rid those "idle in transaction" processes? Can I just kill -15 them or is there a less drastic way to do it? Are you crazy? Sure, if you want to destroy all of the changes made to the database in that transaction and thorough
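For anyone hitting the same situation, a sketch of the non-drastic route (8.4-era catalog columns; in 9.2+ procpid became pid and current_query became query; terminating still rolls back the open transaction):

```sql
-- Find backends stuck "idle in transaction"
SELECT procpid, usename, query_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';

-- On 8.4+, cleanly terminate one of them (12345 is a hypothetical PID);
-- its open transaction is rolled back, not committed
SELECT pg_terminate_backend(12345);
```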

Re: [PERFORM] Query times change by orders of magnitude as DB ages

2009-11-23 Thread Matthew Wakeling
On Sun, 22 Nov 2009, Richard Neill wrote: Worse still, doing a cluster of most of the tables and vacuum full analyze Why are you doing a vacuum full? That command is not meant to be used except in the most unusual of circumstances, as it causes bloat to indexes. If you have run a cluster co
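A sketch of the usual alternatives being alluded to (table and index names hypothetical):

```sql
-- Instead of VACUUM FULL: rewrite the table in index order,
-- which rebuilds its indexes at the same time
CLUSTER mytable USING mytable_pkey;

-- Or, if only the indexes have become bloated:
REINDEX TABLE mytable;

-- Either way, refresh planner statistics afterwards
ANALYZE mytable;
```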

Re: [PERFORM] Why is the query not using the index for sorting?

2009-11-23 Thread Matthew Wakeling
On Sun, 22 Nov 2009, Jonathan Blitz wrote: I have a table with a number of columns. I perform: Select * from table order by a,b. There is an index on a,b which is clustered (as well as indexes on a and b alone). I have issued the cluster and analyze commands. Did you analyse *after* creati

Re: [PERFORM] Strange performance degradation

2009-11-20 Thread Matthew Wakeling
On Fri, 20 Nov 2009, Lorenzo Allegrucci wrote: performance is degrading... In normal conditions the postgres process uses about 3% of cpu time but when is in "degraded" conditions it can use up to 25% of cpu time. You don't really give enough information to determine what is going on here.

Re: [PERFORM] SSD + RAID

2009-11-20 Thread Matthew Wakeling
On Thu, 19 Nov 2009, Greg Smith wrote: This is why turning the cache off can tank performance so badly--you're going to be writing a whole 128K block no matter what if it's force to disk without caching, even if it's just to write a 8K page to it. Theoretically, this does not need to be the ca

Re: [PERFORM] Too much blocks read

2009-11-18 Thread Matthew Wakeling
On Wed, 18 Nov 2009, Waldomiro wrote: So, I need to retrieve only the last value for some key. That key has about 20.000 tuples in this table. SELECT field1 FROM table_7milions WHERE field1 = 'my_key' ORDER BY field1 DESC LIMIT 1 What's the point of this query? You are forcing Postgresql to r
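Presumably the intent was to order by some *other* column; a sketch of the query that would make sense, assuming a hypothetical timestamp column field2 alongside the key:

```sql
-- With an index on (field1, field2), this can stop after a single
-- index probe instead of scanning all ~20,000 rows for the key
CREATE INDEX idx_7milions_key_ts ON table_7milions (field1, field2);

SELECT field1, field2
FROM table_7milions
WHERE field1 = 'my_key'
ORDER BY field2 DESC
LIMIT 1;
```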

Re: [PERFORM] Weird index or sort behaviour

2009-11-18 Thread Matthew Wakeling
On Sat, 14 Nov 2009, Tom Lane wrote: Matthew Wakeling writes: [ discussion about applying materialize to a mergejoin's inner indexscan ] I have finally gotten round to doing something about this, and applied the attached patch to CVS HEAD. Could you test it on your problem case to see

Re: [PERFORM] database size growing continously

2009-10-29 Thread Matthew Wakeling
On Thu, 29 Oct 2009, Josh Rovero wrote: Do you ever "vacuum full" to reclaim empty record space? Unless you expect the size of the database to permanently decrease by a significant amount, that is a waste of time, and may cause bloat in indexes. In this case, since the space will be used agai

Re: [PERFORM] Postgresql optimisation

2009-10-28 Thread Matthew Wakeling
On Wed, 28 Oct 2009, Dave Dutcher wrote: Also if you switch to truncate then you should ANALYSE the tables after you finish inserting. Note that VACUUM ANALYSE is not necessary after a truncate/insert because there should be no dead tuples to vacuum. Perhaps reading the other replies in the th
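A sketch of the truncate-then-reload pattern under discussion (table and column names hypothetical):

```sql
BEGIN;
TRUNCATE staging;                       -- unlike DELETE, leaves no dead tuples
INSERT INTO staging (id, val)
    SELECT id, val FROM source_table;   -- bulk reload
COMMIT;

-- No VACUUM needed, but the planner still wants fresh statistics
ANALYZE staging;
```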

Re: [PERFORM] Issues with \copy from file

2009-10-19 Thread Matthew Wakeling
On Sun, 18 Oct 2009, Scott Marlowe wrote: You can only write data then commit it so fast to one drive, and that speed is usually somewhere in the megabyte per second range. 450+150 in 5 minutes is 120 Megs per second, that's pretty fast, but is likely the max speed of a modern super fast 15k rpm

Re: [PERFORM] Query performance

2009-10-12 Thread Matthew Wakeling
On Mon, 12 Oct 2009, Grzegorz Jaśkiewicz wrote: try setting work_mem to higher value. As postgresql will fallback to disc sorting if the content doesn't fit in work_mem, which it probably doesn't (8.4+ show the memory usage for sorting, which your explain doesn't have). For reference, here's
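A sketch of how one might experiment with this per session (the value and the query are illustrative only):

```sql
-- Raise the per-sort memory budget for this session alone
SET work_mem = '128MB';

-- Re-run the query; on 8.4+ EXPLAIN ANALYZE reports whether each sort
-- ran in memory ("Sort Method: quicksort") or spilled to disk
EXPLAIN ANALYZE
SELECT customer_id, sum(amount)
FROM orders            -- hypothetical table
GROUP BY customer_id
ORDER BY customer_id;
```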

Re: [PERFORM] Query performance

2009-10-12 Thread Matthew Wakeling
On Mon, 12 Oct 2009, S Arvind wrote: I can understand left join, actually can anyone tell me why the sort operation is carried out and what Materialize means... Can anyone explain me the mentioned plan with reason(s)? Merge Left Join  (cost=62451.86..67379.08 rows=286789 width=0)   Merge Cond
