Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On Mon, May 16, 2011 at 4:31 PM, Robert Haas robertmh...@gmail.com wrote:
> On Mon, May 16, 2011 at 4:19 AM, Robert Klemme shortcut...@googlemail.com wrote:
>>> - If the planner chooses a Bitmap Index Scan, it effectively scans the
>>> index to figure out which table blocks to read, and then reads those
>>> table blocks in block number order, so that the I/O is sequential,
>>> with skips.
>> Are these two separate phases (i.e. first scan the index completely,
>> then access the table)?
> Yes.

So a single query will only ever access one of the two at a time.

Separating indexes and tables might not be a good idea generally. Richard Foote has an excellent article about Oracle, but I assume at least a few things apply to PostgreSQL as well - it's at least worth checking PostgreSQL's access patterns against it:

http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/

I would rather try to separate data by the nature and frequency of accesses. One reasonable separation would be to leave all frequently accessed tables *and* their indexes on local RAID and move less frequently accessed data to the SAN. This separation is easy to identify if you have separate tables for current and historic data.

> Yeah, I think the idea of putting tables and indexes in separate
> tablespaces is mostly to bring more I/O bandwidth to bear on the same
> data.

Richard commented on that as well; I believe it was in

http://richardfoote.wordpress.com/2008/04/18/separate-indexes-from-tables-some-thoughts-part-ii-there-there/

The main point is that you do not benefit from the larger I/O bandwidth if access patterns do not permit parallel access to both disks (e.g. because you first need to read index blocks in order to know which table blocks to read). The story might be different, though, if you have a lot of concurrent accesses. But even then, if the table is a hotspot, chances are that the index blocks are cached and you only need physical I/O for table blocks...

> But there are other reasonable things you might do also - e.g. put the
> indexes on an SSD, and the tables on a spinning disk, figuring that the
> SSD is less reliable but you can always rebuild the index if you need
> to...

Richard commented on that theory as well:

http://richardfoote.wordpress.com/2008/05/02/indexes-in-their-own-tablespace-recoverability-advantages-get-back/

The point: if you do the math, you might find that lost indexes lead to so much downtime that you don't want to risk it, and the rebuild isn't all that simple (in terms of time). For a reasonably sized database, recovery might be significantly faster than rebuilding.

> Also, a lot of people have reported big speedups from putting pg_xlog
> on a dedicated RAID 1 pair, or moving the PostgreSQL logs off the data
> partition. So those sorts of divisions should be considered also.

Now, this is something I'd seriously consider, because access patterns on pg_xlog are vastly different from those on table and index data! So you want pg_xlog on a device with high reliability and high write speed.

> Your idea of dividing things by access frequency is another good
> thought.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
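[Editor's note: the tablespace split discussed above can be sketched in SQL roughly as follows; all paths and object names here are illustrative, not taken from the thread.]

```sql
-- Create tablespaces backed by different devices (hypothetical mount points):
CREATE TABLESPACE local_raid LOCATION '/mnt/raid10/pgdata';
CREATE TABLESPACE san_bulk   LOCATION '/mnt/san/pgdata';

-- Keep the hot table and its index on local RAID, push history to the SAN.
-- Note: SET TABLESPACE takes an exclusive lock and copies the files.
ALTER TABLE orders         SET TABLESPACE local_raid;
ALTER INDEX orders_pkey    SET TABLESPACE local_raid;
ALTER TABLE orders_history SET TABLESPACE san_bulk;
```

pg_xlog itself is not a tablespace; it is typically relocated by stopping the server and replacing the directory with a symlink to the dedicated device.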
Re: [PERFORM] hash semi join caused by IN (select ...)
Hi,

>> select from t1 left join t2 WHERE id IN (select )
> Does it work as expected with one less join? If so, try increasing
> join_collapse_limit ...

That did the trick - thanks a lot. I only had to increase join_collapse_limit a bit, and now I get an almost perfect plan. Instead of hash-joining all the data, the planner generates nested-loop joins with index scans on just the few rows I fetch.

Using = ANY(array(select ...)) also seems to work; I wonder which one works better. Does ANY(ARRAY(...)) force the optimizer to plan the subquery separately from the main query?

Thanks, Clemens
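[Editor's note: a minimal sketch of the two approaches discussed above; table and column names are made up for illustration.]

```sql
-- Raising join_collapse_limit (default 8) lets the planner reorder larger
-- join trees, at the cost of more planning time:
SET join_collapse_limit = 12;

EXPLAIN ANALYZE
SELECT t1.*
FROM t1
LEFT JOIN t2 ON t2.t1_id = t1.id
WHERE t1.id IN (SELECT t1_id FROM t3 WHERE flag);

-- The alternative rewrite from the thread: ARRAY(SELECT ...) evaluates the
-- subquery once, up front, and hands the outer plan a constant array.
SELECT t1.*
FROM t1
LEFT JOIN t2 ON t2.t1_id = t1.id
WHERE t1.id = ANY (ARRAY(SELECT t1_id FROM t3 WHERE flag));
```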
Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On 05/17/2011 03:00 PM, Robert Klemme wrote:
> The main point is that you do not benefit from the larger IO bandwidth
> if access patterns do not permit parallel access to both disks (e.g.
> because you first need to read index blocks in order to know the table
> blocks to read).

This makes me wonder if Pg attempts to pre-fetch blocks of interest for areas where I/O needs can be known in advance, while there's still other work or other I/O to do. For example, pre-fetching for the next iteration of a nested loop while still executing the prior one. Is it even possible?

I'm guessing not, because (AFAIK) Pg uses only synchronous blocking I/O, and with that there isn't really a way to pre-fetch without threads or helper processes. Linux (at least) supports buffered async I/O, so it'd be possible to submit such prefetch requests ... on modern Linux kernels. Portably doing so, though - not so much.

--
Craig Ringer
Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
2011/5/17 Craig Ringer cr...@postnewspapers.com.au:
> This makes me wonder if Pg attempts to pre-fetch blocks of interest for
> areas where I/O needs can be known in advance, while there's still
> other work or other I/O to do. For example, pre-fetching for the next
> iteration of a nested loop while still executing the prior one. Is it
> even possible?
>
> I'm guessing not, because (AFAIK) Pg uses only synchronous blocking
> I/O, and with that there isn't really a way to pre-fetch without
> threads or helper processes. Linux (at least) supports buffered async
> I/O, so it'd be possible to submit such prefetch requests ... on modern
> Linux kernels. Portably doing so, though - not so much.

Prefetching is used in the bitmap heap scan. The GUC effective_io_concurrency lets you increase the prefetch window.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
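[Editor's note: a small sketch of the setting Cédric mentions. The value is illustrative; a common rule of thumb is the number of drives in the array, and 0 disables prefetching.]

```sql
-- effective_io_concurrency controls how many concurrent read-ahead
-- requests a bitmap heap scan issues (via posix_fadvise on Linux):
SET effective_io_concurrency = 4;

-- Verify the current value:
SHOW effective_io_concurrency;
```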
[PERFORM] Fill Factor
Hello,

How does fillfactor impact query performance?

I have two cases. One is an operational table: for each insert there is an update, and this table must sustain approx. 1,000 inserts per second and 1,000 updates per second (to the same inserted row). Is it necessary to change the fillfactor?

The other case is a table that has few inserts (statistics) but thousands or millions of updates. In this case, is it necessary to change the fillfactor?

Thanks!
Re: [PERFORM] Fill Factor
On Tue, May 17, 2011 at 6:59 AM, Anibal David Acosta a...@devshock.com wrote:
> Hello,
> How does fillfactor impact query performance?

Fillfactor tells the db how much empty space to leave in each table page when inserting rows. If you set it to 90%, then 10% of the space in the table will be available for updates to use for new data. Combined with pg 8.3+ HOT updates, this free space allows updates to non-indexed fields to be close to free, because the index for that row needs no updates if the new datum for that row fits in the same 8k pg block.

> I have two cases. One is an operational table: for each insert there is
> an update, and this table must sustain approx. 1,000 inserts per second
> and 1,000 updates per second (to the same inserted row).

If you could combine the insert and update into one action, that would really be preferable.

> Is it necessary to change the fillfactor?

Not necessary, but possibly better for performance.

> The other case is a table that has few inserts (statistics) but
> thousands or millions of updates. In this case, is it necessary to
> change the fillfactor?

Actually, updates are when a lower fillfactor is most useful, but it doesn't need to be really low; anything below 95% is likely more than you need. It really depends on your access patterns. If you're updating 20% of a table at a time, then a fillfactor of ~80% might be the best fit. Whether or not the updates fit under the HOT umbrella, lowering the fillfactor enough to allow the updates to happen in place, without adding pages to the table files, is usually a win.
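[Editor's note: a minimal sketch of setting fillfactor; the table and its columns are made up for illustration.]

```sql
-- Leave 10% free space in each heap page for in-page (HOT) updates:
CREATE TABLE counters (
    id   bigint PRIMARY KEY,
    hits bigint NOT NULL
) WITH (fillfactor = 90);

-- Changing it later only affects pages written from then on; rewrite the
-- table (exclusive lock!) to apply the new setting everywhere:
ALTER TABLE counters SET (fillfactor = 90);
VACUUM FULL counters;
```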
Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On Tue, May 17, 2011 at 11:47 AM, Craig Ringer cr...@postnewspapers.com.au wrote:
> This makes me wonder if Pg attempts to pre-fetch blocks of interest for
> areas where I/O needs can be known in advance, while there's still
> other work or other I/O to do. For example, pre-fetching for the next
> iteration of a nested loop while still executing the prior one. Is it
> even possible?
>
> I'm guessing not, because (AFAIK) Pg uses only synchronous blocking
> I/O, and with that there isn't really a way to pre-fetch without
> threads or helper processes. Linux (at least) supports buffered async
> I/O, so it'd be possible to submit such prefetch requests ... on modern
> Linux kernels. Portably doing so, though - not so much.

There is a much more serious obstacle than the merely technical one (if that was one at all): prefetching is only reasonable if you can predict which data you need with high probability (say >= 80%). If you can't, you'll have much more I/O than without prefetching, and overall performance likely suffers. Naturally, that probability depends on the data at hand and the access pattern. As Cédric wrote, there seems to be at least one case where it's done.

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
Re: [PERFORM] Fill Factor
2011/5/17 Scott Marlowe scott.marl...@gmail.com:
> On Tue, May 17, 2011 at 6:59 AM, Anibal David Acosta a...@devshock.com wrote:
>> Is it necessary to change the fillfactor?
> Not necessary, but possibly better for performance.

That depends on the delete ratio, too. Without deletes, I am unsure a reduced fillfactor will have a good impact in the long term.

> Actually, updates are when a lower fillfactor is most useful, but it
> doesn't need to be really low; anything below 95% is likely more than
> you need. It really depends on your access patterns. If you're updating
> 20% of a table at a time, then a fillfactor of ~80% might be the best
> fit. Whether or not the updates fit under the HOT umbrella, lowering
> the fillfactor enough to allow the updates to happen in place, without
> adding pages to the table files, is usually a win.

And one possible way to help adjust the fillfactor is to watch the relation size. Sometimes reducing the fillfactor a lot (to 60-80%) is good: the table stays stuck at some XX MB and pages are well reused.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
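[Editor's note: "watching the relation size" as Cédric suggests can be done with the built-in size functions; the relation name is illustrative.]

```sql
-- If the fillfactor is right for the workload, heap_size should plateau
-- as updates start reusing freed space instead of extending the file:
SELECT pg_size_pretty(pg_relation_size('counters'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('counters')) AS with_indexes;
```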
Re: [PERFORM] Using pgiosim realistically
On Mon, May 16, 2011 at 01:54:06PM -0400, Jeff wrote:
> Yep - you need multiple threads to get max throughput of your io.

I am running:

  ~/pgiosim -c -b 100G -v -t4 file[0-9]*

Will each thread move 100GB of data? I am seeing:

  158.69%, 4260 read, 0 written, 3407.64kB/sec 425.95 iops

Maybe the completion-target percentage is off because of the threads?

--
-- rouilj
John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111
Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan
On May 16, 2011, at 10:46 AM, Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Mon, May 16, 2011 at 12:49 AM, Jesper Krogh jes...@krogh.cc wrote:
>>> Ok, it may not work as well with indexes, since having 1% in cache
>>> may very well mean that 90% of all requested blocks are there... for
>>> tables it should be more trivial.
>> Tables can have hot spots, too. Consider a table that holds calendar
>> reservations. Reservations can be inserted, updated, deleted. But
>> typically, the most recent data will be what is most actively
>> modified, and the older data will be relatively more (though not
>> completely) static, and less frequently accessed. Such examples are
>> common in many real-world applications.
> Yes. I'm not convinced that measuring the fraction of a table or index
> that's in cache is really going to help us much. Historical cache hit
> rates might be useful, but only to the extent that the incoming query
> has a similar access pattern to those in the (recent?) past. It's not
> an easy problem.
>
> I almost wonder if we should not try to measure this at all, but
> instead let the DBA set a per-table or per-index number to use,
> analogous to the override we added recently for column n-distinct
> statistics ...

I think the challenge there would be how to define the scope of the hot spot. Is it the last X pages? The last X serial values? Something like correlation?

Hmm... it would be interesting if we had average relation access times for each stats bucket on a per-column basis; that would give the planner a better idea of how much IO overhead there would be for a given WHERE clause.

--
Jim C. Nasby, Database Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
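[Editor's note: the per-column n-distinct override Tom refers to (added in PostgreSQL 9.0) looks like this; the table and column names are illustrative.]

```sql
-- Fix the n_distinct estimate when ANALYZE's sample gets it wrong:
ALTER TABLE reservations ALTER COLUMN room_id SET (n_distinct = 500);

-- A negative value means a fraction of the total row count (-1 .. 0):
ALTER TABLE reservations ALTER COLUMN guest_id SET (n_distinct = -0.05);

-- The override takes effect at the next ANALYZE:
ANALYZE reservations;
```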
Re: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Hi Jim

You actually made me think about the schema Michel and I are using:

> And KVP is? ;)

CREATE TABLE mykvpstore (
    id bigint PRIMARY KEY
);

CREATE TABLE kvp (
    id bigint REFERENCES mykvpstore(id),
    key text NOT NULL,
    value text
); -- with index on key

And the table with the associative array type (hstore) is:

CREATE TABLE myhstore (
    id bigint PRIMARY KEY,
    kvps hstore NOT NULL
); -- with GiST index on kvps

It seems to me that the mykvpstore-kvp design also carries some overhead.

And yes, we have no clue what keys to anticipate, except for some common ones like 'name': the use case is coming from OpenStreetMap (http://wiki.openstreetmap.org/wiki/Database_schema).

Yours, Stefan

2011/5/17 Jim Nasby j...@nasby.net:
> On May 16, 2011, at 8:47 AM, Merlin Moncure wrote:
>> On Sat, May 14, 2011 at 5:10 AM, Stefan Keller sfkel...@gmail.com wrote:
>>> Hi, I am conducting a benchmark to compare a KVP table vs. hstore, and
>>> got bad hstore performance results when the no. of records is greater
>>> than about 500'000.
>>>
>>> CREATE TABLE kvp ( id SERIAL PRIMARY KEY, key text NOT NULL, value text );
>>> -- with index on key
>>> CREATE TABLE myhstore ( id SERIAL PRIMARY KEY, obj hstore NOT NULL );
>>> -- with GIST index on obj
>>>
>>> Does anyone have experience with that?
>> hstore is not really designed for large-ish sets like that.
> And KVP is? ;)
>
> IIRC hstore ends up just storing everything as text, with pointers to
> know where things start and end. There's no real indexing inside
> hstore, so basically the only thing it can do is scan the entire
> hstore.
>
> That said, I would strongly reconsider using KVP for anything except
> the most trivial of data sets. It is *extremely* inefficient. Do you
> really have absolutely no idea what *any* of your keys will be? Even if
> you need to support a certain amount of non-deterministic stuff, I
> would put everything you possibly can into real fields and only use KVP
> or hstore for things that you really didn't anticipate.
>
> Keep in mind that for every *value*, your overhead is 24 bytes for the
> heap header, 2+ varlena bytes in the heap, plus the length of the key.
> In the index you're looking at 6+ bytes of overhead, 1+ byte for
> varlena, plus the length of the key. The PK will cost you an additional
> 16-24 bytes, depending on alignment. So that's a *minimum* of ~50 bytes
> per value, and realistically the overhead will be closer to 65-70
> bytes, *per value*. Unless your values are decent-sized strings, the
> overhead is going to be many times larger than the actual data!
>
> --
> Jim C. Nasby, Database Architect
> j...@nasby.net
> 512.569.9461 (cell)
> http://jim.nasby.net
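[Editor's note: for comparison, typical lookups against the two designs above might look as follows; the key/value literals are made up, loosely in the OpenStreetMap spirit of the thread.]

```sql
-- KVP design: one row per key/value pair, filtered via the index on key:
SELECT k.id
FROM kvp k
WHERE k.key = 'name' AND k.value = 'Zurich';

-- hstore design: the GiST index supports the containment operator @>,
-- so one row holds all pairs and the whole predicate is one probe:
SELECT h.id
FROM myhstore h
WHERE h.kvps @> 'name=>Zurich';
```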