Re: [PERFORM] Having some problems with concurrent COPY commands
On 10/12/2015 11:14 PM, Shaun Thomas wrote:
> On Mon, Oct 12, 2015 at 1:28 PM, Andres Freund wrote:
>> Any chance you could provide profiles of such a run?
>
> This is as simple as I could make it reliably. With one copy running, the thread finishes in about 1 second. With 2, it's 1.5s each, and with all 4, it's a little over 3s for each according to the logs. I have log_min_duration_statement set to 1000, so it's pretty obvious. The scary part is that it's not even scaling linearly; performance is actually getting *worse* with each subsequent thread.
>
> Regarding performance, all of this fits in memory. The tables are only 100k rows with the COPY statement. The machine itself has 8 CPUs and 32GB of RAM, so it's not an issue of hardware. So far as I can tell, it happens on every version I've tested, from 9.2 to 9.4.
>
> I also take back what I said about wal_level. Setting it to minimal does nothing. Disabling archive_mode and setting max_wal_senders to 0 also does nothing. With 4 concurrent processes, each takes 3 seconds, for a total of 12 seconds to import 400k rows, when it would take 4 seconds to do it sequentially. Sketchy.

I was not able to reproduce that behaviour on my laptop. I bumped the number of rows in your script by 10x, to make it run a bit longer. Attached is the script I used. The total wallclock time the COPYs take on 9.4 is about 8 seconds for a single COPY, and 12 seconds for 4 concurrent COPYs. So it's not scaling as well as you might hope, but it's certainly not worse-than-serial either, as you're seeing.

If you're seeing this on 9.2 and 9.4 alike, it can't be related to the XLogInsert scaling patch, although you might've found a case where that patch didn't help where it should've.

I ran "perf" to profile the test case, and it looks like about 80% of the CPU time is spent in the b-tree comparison function. That doesn't leave much scope for XLogInsert scalability to matter one way or another. I have no explanation for what you're seeing, though.
A bad spinlock implementation perhaps? Anything special about the hardware at all? Can you profile it on your system? Which collation?

- Heikki

[Attachment: launch4.sh - Bourne shell script]

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] New server: SSD/RAID recommendations?
On 07/07/2015 10:59 PM, Graeme B. Bell wrote:
> Cache flushing isn't an atomic operation though. Even if the ordering is right, you are likely to have a partial fsync on the disk when the lights go out - isn't your FS still corrupt?

If the filesystem is worth its salt, no. Journaling filesystems, for example, rely on the journal to work around that problem, and there are other mechanisms. PostgreSQL has exactly the same problem and uses the WAL to solve it.

- Heikki
Re: [PERFORM] New server: SSD/RAID recommendations?
On 07/07/2015 09:01 PM, Wes Vaske (wvaske) wrote:
> Regarding "lie about their fsync status": this is mostly semantics, but it might help google searches on the issue. A drive doesn't support fsync(); that's a filesystem/kernel process. A drive will do a FLUSH CACHE. Before the low-numbered 2.6 kernels, the fsync() call wouldn't send any ATA or SCSI command to flush the disk cache. Whereas, AFAICT, modern kernels and filesystem versions *will* do this. When 'sync' is called, the filesystem will issue the appropriate command to the disk to flush the write cache. For ATA, this is "FLUSH CACHE" (E7h). To check support for the command use:
>
> [root@postgres ~]# smartctl --identify /dev/sdu | grep "FLUSH CACHE"
>       83     13      1   FLUSH CACHE EXT supported
>       83     12      1   FLUSH CACHE supported
>       86     13      1   FLUSH CACHE EXT supported
>       86     12      1   FLUSH CACHE supported
>
> The 1s in the 3rd column represent SUPPORTED for the feature listed in the last column.

Right. To be precise, the problem isn't that the drive lies about fsync(); it lies about FLUSH CACHE instead. Search-and-replace fsync() with FLUSH CACHE, and the same question remains: when the drive breaks its promise wrt. FLUSH CACHE, does it nevertheless guarantee that the order in which the data is eventually flushed to disk is consistent with the order in which the data and FLUSH CACHE commands were sent to the drive? That's an important distinction, because it makes the difference between "the most recent data the application saved might be lost even though the FLUSH CACHE command returned" and "your filesystem is corrupt".

- Heikki
Re: [PERFORM] New server: SSD/RAID recommendations?
On 07/07/2015 05:15 PM, Wes Vaske (wvaske) wrote:
> The M500/M550/M600 are consumer-class drives that don't have power protection for all inflight data* (like the Samsung 8x0 series and the Intel 3x0/5x0 series). The M500DC has full power protection for inflight data and is an enterprise-class drive (like the Samsung 845DC or Intel S3500/S3700 series). So any drive without the capacitors to protect inflight data will suffer from data loss if you're using the disk write cache and you pull the power.

Wow, I would be pretty angry if I installed an SSD in my desktop, and it lost a file that I saved just before pulling the power plug.

> *Big addendum: There are two issues on power loss that will mess with Postgres: Data Loss and Data Corruption. The Micron consumer drives will have power loss protection against Data Corruption, and the enterprise drive will have power loss protection against BOTH. https://www.micron.com/~/media/documents/products/white-paper/wp_ssd_power_loss_protection.pdf The Data Corruption problem is only an issue in non-SLC NAND, but it's industry wide. And even though some drives will protect against that, the protection of inflight data that's been fsync'd is more important, and should disqualify *any* consumer drives from *any* company from consideration for use with Postgres.

So it lies about fsync()... The next question is, does it nevertheless enforce the correct ordering of persisting fsync'd data? If you write to file A and fsync it, then write to another file B and fsync it too, is it guaranteed that if B is persisted, A is as well? Because if it isn't, you can end up with filesystem (or database) corruption anyway.

- Heikki
Re: [PERFORM] optimization join on random value
On 05/04/2015 12:23 AM, Anton Bushmelov wrote:
> Hello gurus of postgres, is it possible to tune a query with a join on a random string? I know that it is not a real-life example, but I need it for tests.
>
> soe=# explain
> soe-# SELECT ADDRESS_ID,
> soe-#        CUSTOMER_ID,
> soe-#        DATE_CREATED,
> soe-#        HOUSE_NO_OR_NAME,
> soe-#        STREET_NAME,
> soe-#        TOWN,
> soe-#        COUNTY,
> soe-#        COUNTRY,
> soe-#        POST_CODE,
> soe-#        ZIP_CODE
> soe-# FROM ADDRESSES
> soe-# WHERE customer_id = trunc( random()*45000) ;
>                               QUERY PLAN
> ------------------------------------------------------------------
>  Seq Scan on addresses  (cost=0.00..165714.00 rows=22500 width=84)
>    Filter: ((customer_id)::double precision = trunc((random() * 45000::double precision)))
> (2 rows)

There are two problems here that prevent the index from being used:

1. random() is volatile, so it's recalculated for each row.
2. For the comparison, customer_id is cast to a float, while the index is on the bigint value.

To work around the first problem, put the random() call inside a subquery. And for the second problem, cast to bigint:

SELECT ... FROM addresses
WHERE customer_id = (SELECT random()*45000)::bigint

- Heikki
Re: [PERFORM] Reverse Key Index
On 02/26/2015 12:31 AM, Josh Berkus wrote:
> On 02/14/2015 10:35 AM, Sven R. Kunze wrote:
>> Thanks for the immediate reply. I understand the use case is quite limited. On the other hand, I see potential when it comes to applications which use PostgreSQL. There, programmers would have to change a lot of code to tweak existing (and more importantly, working) queries to hash/reverse an id column first. Using ORMs would make this change even more painful and maybe even impossible. When reading https://richardfoote.wordpress.com/2008/01/14/introduction-to-reverse-key-indexes-part-i/ carefully, it also seems to work with index scans partially, in the case of equality comparisons.
>
> Seems like a good use for SP-GiST. Go for it!

A b-tree opclass that just compares from right-to-left would work just as well, and perform better.

- Heikki
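Lacking such an opclass in core, a workaround available today is an expression index on the reversed key. This is a sketch, not something from the thread: the table and column names are hypothetical, and it relies on the reverse() function (available since PostgreSQL 9.1). Queries must repeat the same expression to hit the index:

```sql
-- Hypothetical table with sequentially-assigned ids.
CREATE TABLE events (id bigint, payload text);

-- Index on the right-to-left text form of the key: consecutive ids no
-- longer land on the same b-tree page, which is the point of a reverse
-- key index.
CREATE INDEX events_id_rev_idx ON events (reverse(id::text));

-- Equality lookups must use the same expression to use the index:
SELECT * FROM events WHERE reverse(id::text) = reverse('12345');
```

Note that, as with Oracle's reverse key indexes, this only helps equality lookups; range scans on id cannot use the reversed index.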
Re: [PERFORM] Number of Columns and Update
On 12/22/2014 10:53 PM, Robert DiFalco wrote:
> This may fall into the category of over-optimization but I've become curious. I have a user table with about 14 columns that are all 1:1 data - so they can't be normalized. When I insert a row, all columns need to be set. But when I update, I sometimes only update 1-2 columns at a time. Does the number of columns impact update speed? For example: UPDATE users SET email = ? WHERE id = ?;

Yes, the number of columns in the table matters. The update is just as expensive regardless of how many of the columns you update. When a row is updated, PostgreSQL creates a new version of the whole row. The new row version takes more space when the table has more columns, leading to more bloat in the table, which generally slows things down. In most applications the difference isn't big enough to matter, but it can be significant if you have very wide rows and you update a lot.

PostgreSQL 9.4 made an improvement here. In earlier versions, the new row version was included in its entirety in the WAL record, which added overhead. In 9.4, any columns at the beginning or end of the row that are not modified are left out of the WAL record, as long as the new row version is stored on the same page as the old one (which is common). For updating a single column, or a few columns that are next to each other, that's the same as saying that only the modified part of the row is WAL-logged.

> I can easily break this up into logical tables like user_profile, user_credential, user_contact_info, user_summary, etc., with each table only having 1-4 columns. But with the multiple tables I would often be joining them to bring back a collection of columns.

That would help with the above-mentioned issues, but dealing with multiple tables certainly adds a lot of overhead too. Most likely you're better off just having the single table, after all.
- Heikki
Re: [PERFORM] Small performance regression in 9.2 has a big impact
On 11/25/2014 10:36 PM, Scott Marlowe wrote:
> OK, so there's a simple set of tree functions we use at work. They're quite fast in 8.4, and they've gotten about 40% slower in 9.2. They're a simple mix of sql and plpgsql functions which are at http://pastebin.com/SXTnNhd5 and which I've attached. Here's a test query:
>
> select tree_ancestor_keys('00010001');
>
> According to explain analyze on both 8.4 and 9.2 they have the same plan. However, on the same machine the query is about 40% slower on 9.2. Note we're not hitting the disks, or even buffers here. It's pure in-memory plpgsql and sql that we're running.
>
> explain analyze select tree_ancestor_keys('00010001') from generate_series(1,1000);
>
> On 8.4 this runs in about 280 to 300 ms. (You can run it once and get the same diff; it's just easier to see with the generate_series forcing it to run 1000 times, to kind of even out the noise.) On 9.2, same machine, clean fresh dbs etc., it runs in ~400 ms. And that difference seems to be there on all plpgsql and sql functions. In our application, these tree functions get called millions and millions of times a day, and a 40% performance penalty is a pretty big deal. We're already using the trick of telling the query planner that this function will return 1 row, with "alter function ... rows 1" etc. That helps a lot, but it doesn't fix this underlying performance issue. Server versions are 8.4.22 (last, I think) and 9.2.9. If anyone has any suggestions I'd love to hear them.

I don't know why this regressed between those versions, but looking at the functions, there's some low-hanging fruit:

1. tree_ancestor_keys() could use UNION ALL instead of UNION. (I believe no duplicates are expected here, although I'm not 100% sure.)

2. tree_ancestor_keys() calculates tree_level($1) every time it recurses. It would be cheaper to calculate it once, and pass it as an argument.
Put together:

CREATE FUNCTION tree_ancestor_keys(bit varying, integer, integer) RETURNS SETOF bit varying
    LANGUAGE sql IMMUTABLE STRICT
    AS $_$
  select tree_ancestor_key($1, $2)
  union all
  select tree_ancestor_keys($1, $2 + 1, $3) where $2 < $3
$_$;

CREATE OR REPLACE FUNCTION tree_ancestor_keys(bit varying, integer) RETURNS SETOF bit varying
    LANGUAGE sql IMMUTABLE STRICT
    AS $_$
  select tree_ancestor_keys($1, $2 + 1, tree_level($1))
$_$;

These changes make your test query go about 2x faster on my laptop, with git master. I'm sure you could optimize the functions further, but those at least seem like fairly safe and simple changes.

- Heikki
Re: [PERFORM] auto vaccum is dying
On 10/02/2014 07:43 AM, Rodrigo Barboza wrote:
> Hello, I have a table that receives lots of updates and inserts. Autovacuum is always being cancelled on that table. One day the database went on standby and I had to act manually to recover. What should I do to avoid autovacuum cancellation?

Cancellation happens when you run a command that requires a stronger lock on the table, like ALTER or TRUNCATE. Plain UPDATEs or INSERTs will not cause cancellations. There must be something else going on that's causing the cancellations.

- Heikki
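When it happens again, the blocker is usually visible in pg_locks while the offending command runs. A sketch of a diagnostic query (the table name is a placeholder; on 9.2+ the activity column is called "query"):

```sql
-- Show who holds or is waiting for locks on the busy table. An ALTER,
-- TRUNCATE or similar will show up holding an AccessExclusiveLock.
SELECT l.pid, l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'my_busy_table'::regclass;
```

The server log also records each cancellation ("canceling autovacuum task"), so checking what else was running at those timestamps can identify the culprit.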
Re: [PERFORM] Which update action quicker?
On 09/23/2014 11:37 PM, Emi Lu wrote:
> Hello list, for a big table with more than 1,000,000 records, may I know which update is quicker please?
>
> (1) update t1 set c1 = a.c1 from a where pk and t1.c1 <> a.c1;
>     ...
>     update t1 set c_N = a.c_N from a where pk and t1.c_N <> a.c_N;
>
> (2) update t1 set c1 = a.c1, c2 = a.c2, ..., c_N = a.c_N
>     from a where pk
>     AND (t1.c1 <> a.c1 OR t1.c2 <> a.c2 ... t1.c_N <> a.c_N)

Probably (2). <> is not indexable, so each update will have to perform a sequential scan of the table. With (2), you only need to scan it once; with (1) you have to scan it N times. Also, method (1) will update the same row multiple times, if it needs to have more than one column updated.

> Or is there some other quicker way to do the update?

If a large percentage of the table needs to be updated, it can be faster to create a new table, insert all the rows with the right values, drop the old table and rename the new one in its place. All in one transaction.

- Heikki
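The rewrite approach from the last paragraph looks roughly like this. This is a sketch, not from the thread: column names and constraints are placeholders, and any indexes, foreign keys and privileges on the old table must be re-created on the new one before the rename:

```sql
BEGIN;
-- Build the replacement table with the corrected column values.
CREATE TABLE t1_new AS
    SELECT t1.pk, a.c1, a.c2 /* ..., a.c_N */
    FROM t1 JOIN a USING (pk);

-- Re-create constraints and indexes here, e.g.:
ALTER TABLE t1_new ADD PRIMARY KEY (pk);

-- Swap the new table into place atomically.
DROP TABLE t1;
ALTER TABLE t1_new RENAME TO t1;
COMMIT;
```

Because DDL is transactional in PostgreSQL, concurrent readers see either the old table or the new one, never a half-finished state; the swap does take an exclusive lock for its duration, though.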
Re: [PERFORM] Aggregating tsqueries
On 09/17/2014 07:56 AM, Alexander Hill wrote:
> Hello, I have a table of tree nodes with a tsquery column. To get a subtree's tsquery, I need to OR all of its nodes' tsqueries together. I defined a custom aggregate using tsquery_or:
>
> CREATE AGGREGATE tsquery_or_agg (tsquery)
> (
>     sfunc = tsquery_or,
>     stype = tsquery
> );
>
> but I've found that tsquery_or_agg(query) is about a hundred times slower than this:
>
> ('(' || string_agg(query::text, ')|(') || ')')::tsquery
>
> That works perfectly, so I'm happy to continue doing it, but I'm curious to know why the difference is so great and if anything can be done about it?

string_agg's state transition function uses a buffer that's expanded as needed; at every step, the next string is simply appended to the buffer. Your custom aggregate is less efficient because at every step it allocates a new tsquery object and copies both the old result and the next source tsquery into it. That's much more expensive.

If you're not shy of writing C code, you could write a more efficient version of tsquery_or_agg too, using a similar technique.

- Heikki
Re: [PERFORM] Postgres Replaying WAL slowly
On 06/29/2014 03:43 PM, Soni M wrote:
> top and sar say 100% cpu usage of one core, no sign of I/O wait.

Hmm, I wonder what it's doing then... If you have perf installed on the system, you can do "perf top" to get a quick overview of where the CPU time is spent.

- Heikki
Re: [PERFORM] Postgres Replaying WAL slowly
On 06/30/2014 05:46 PM, Soni M wrote:
> Here's what 'perf top' said on the streaming replica:
>
> Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
>  95.97%  postgres  [.] 0x002210f3

Ok, so it's stuck doing something... Can you get a build with debug symbols installed, so that we could see the function name?

- Heikki
Re: [PERFORM] Postgres Replaying WAL slowly
On 06/29/2014 11:14 AM, Soni M wrote:
> Everything works fine until on Thursday we have high load on the master, and after that every streaming replica lags further behind the master, even at night and on weekends when all server load is low. But the Slony slave is OK throughout.

What does 'top' on the standby say? Is the startup process using 100% of (one) CPU replaying records, or is it waiting for I/O? How large is the database, does it fit in RAM? Any clues in the system or PostgreSQL logs?

- Heikki
Re: [PERFORM] Planner doesn't take indexes into account
On 05/28/2014 12:59 PM, Grzegorz Olszewski wrote:
> random_page_cost = 4.0
> seq_page_cost = 1.0
>
> There are about 500,000 rows and about 500 new rows each business day. About 96% of rows meet the given conditions, that is, the count should be about 480,000.

When such a large percentage of the rows match, a sequential scan is indeed a better plan than an index scan. Sequential access is much faster than random access.

- Heikki
Re: [PERFORM] NFS, file system cache and shared_buffers
On 05/27/2014 02:06 PM, Albe Laurenz wrote:
> I just learned that NFS does not use a file system cache on the client side. On the other hand, PostgreSQL relies on the file system cache for performance, because beyond a certain amount of shared_buffers performance will suffer. Together these things seem to indicate that you cannot get good performance with a large database over NFS, since you can't leverage memory speed. Now I wonder if there are any remedies (CacheFS?) and what experiences people have had with the performance of large databases over NFS.

I have no personal experience with NFS, but it sounds like a higher-than-usual shared_buffers value would be good.

- Heikki
Re: [PERFORM] Revisiting disk layout on ZFS systems
On 04/28/2014 06:47 PM, Karl Denninger wrote:
> What I am curious about, however, is the xlog - that appears to suffer pretty badly from the 128k record size, although it compresses even more materially: 1.94x (!) The files in the xlog directory are large (16MB each), so at first blush a larger record size for that storage area would seem to help. It appears that instead it hurts.

The WAL is fsync'd frequently. My guess is that that causes a lot of extra work to repeatedly recompress the same data, or something like that.

- Heikki
Re: [PERFORM] Revisiting disk layout on ZFS systems
On 04/28/2014 09:07 PM, Karl Denninger wrote:
>> The WAL is fsync'd frequently. My guess is that that causes a lot of extra work to repeatedly recompress the same data, or something like that.
>
> It shouldn't, as ZFS re-writes on change, and what's showing up is not high I/O *count* but rather percentage-busy, which implies lots of head movement (that is, lots of sub-allocation-unit writes).

That sounds consistent with frequent fsyncs.

> Isn't WAL essentially sequential writes during normal operation?

Yes, it's totally sequential. But it's fsync'd at every commit, which means a lot of small writes.

- Heikki
Re: [PERFORM] tsearch2, large data and indexes
On 04/24/2014 10:57 PM, Sergey Konoplev wrote:
> On Thu, Apr 24, 2014 at 5:34 AM, Ivan Voras ivo...@freebsd.org wrote:
>> On 24 April 2014 13:34, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>>> As the docs say, the GIN index does not store the weights. As such, there is no need to strip them. A recheck would be necessary if your query needs the weights, precisely because the weights are not included in the index. (In the OP's query, it's the ranking that was causing the detoasting.)
>>
>> Thanks! My problem is that I actually need the ranking. My queries can return a large number of documents (tens of thousands) but I usually need only the first couple of pages of the most relevant results (e.g. 50-100 records). With PostgreSQL and tsearch2, this means that the tens of thousands of documents found via the index are then detoasted and ranked.
>
> Heikki, what about the "GIN improvements part 3: ordering in index" patch, was it committed? http://www.postgresql.org/message-id/flat/capphfduwvqv5b0xz1dzuqaw29erdculzp2wotfjzdbs7bhp...@mail.gmail.com

Nope, it wasn't committed.

- Heikki
Re: [PERFORM] tsearch2, large data and indexes
On 04/24/2014 01:56 AM, Sergey Konoplev wrote:
> On Wed, Apr 23, 2014 at 4:08 AM, Ivan Voras ivo...@freebsd.org wrote:
>> Ok, I found out what is happening, quoting from the documentation: "GIN indexes are not lossy for standard queries, but their performance depends logarithmically on the number of unique words. (However, GIN indexes store only the words (lexemes) of tsvector values, and not their weight labels. Thus a table row recheck is needed when using a query that involves weights.)" My query doesn't have weights, but the tsvector in the table has them - I take it this is what is meant by "involves weights". So... there's really no way for tsearch2 to produce results based on the index alone, without a recheck? This is... limiting.
>
> My guess is that you could use the strip() function [1] to get rid of weights in your table or, what would probably be better, in your index only, by using expressions in it and in the query.

As the docs say, the GIN index does not store the weights. As such, there is no need to strip them. A recheck would be necessary if your query needs the weights, precisely because the weights are not included in the index. (In the OP's query, it's the ranking that was causing the detoasting.)

- Heikki
Re: [PERFORM] tsearch2, large data and indexes
On 04/22/2014 10:57 AM, Ivan Voras wrote:
> On 22 April 2014 08:40, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>> On 04/20/2014 02:15 AM, Ivan Voras wrote:
>>> More details: after thinking about it some more, it might have something to do with tsearch2 and indexes: the large data in this case is a tsvector, indexed with GIN, and the query plan involves a re-check condition. The query is of the form: SELECT simple_fields FROM table WHERE fts @@ to_tsquery('...'). Does the re-check condition mean that the original tsvector data is always read from the table in addition to the index?
>>
>> Yes, if the re-check condition involves the fts column. I don't see why you would have a re-check condition with a query like that, though. Are there some other WHERE-conditions that you didn't show us?
>
> Yes, I've read about tsearch2 and GIN indexes and there shouldn't be a recheck condition - but there is. This is the query:
>
> SELECT documents.id, title, raw_data, q, ts_rank(fts_data, q, 4) AS rank, html_filename
> FROM documents, to_tsquery('document') AS q
> WHERE fts_data @@ q
> ORDER BY rank DESC LIMIT 25;

It's the ranking that's causing the detoasting. ts_rank(fts_data, q, 4) has to fetch the contents of the fts_data column.

Sorry, I was confused earlier: the "Recheck Cond:" line is always there in the EXPLAIN output of bitmap index scans, even if the recheck condition is never executed at runtime. That's because the executor has to be prepared to run the recheck condition if the bitmap grows large enough to become lossy, so that it only stores the page numbers of matching tuples, not the individual tuples.

- Heikki
Re: [PERFORM] Workaround: Planner preference for tsquery filter vs. GIN index in fast text search
On 04/20/2014 07:46 AM, Oleg Bartunov wrote:
> btw, 9.4 should be wiser in the case of rare+common terms, thanks to the GIN fast scan feature.

Indeed, although we didn't actually do anything to the planner to make it understand when fast scan helps. Doing something about cost estimation is still on the 9.4 Open Items list, but I don't have any ideas on what to do about it, and I haven't heard anything from Alexander about it either. That means that the cost estimation issue Laurence saw is going to be even worse in 9.4, because GIN is going to be faster than a seq scan in more cases than before, and the planner doesn't know about it.

- Heikki
Re: [PERFORM] tsearch2, large data and indexes
On 04/20/2014 02:15 AM, Ivan Voras wrote:
> Hello, if a table contains simple fields as well as large (hundreds of KiB) text fields, will accessing only the simple fields cause the entire record data, including the large fields, to be read and unpacked? (e.g. SELECT int_field FROM table_with_large_text)

No.

> More details: after thinking about it some more, it might have something to do with tsearch2 and indexes: the large data in this case is a tsvector, indexed with GIN, and the query plan involves a re-check condition. The query is of the form: SELECT simple_fields FROM table WHERE fts @@ to_tsquery('...'). Does the re-check condition mean that the original tsvector data is always read from the table in addition to the index?

Yes, if the re-check condition involves the fts column. I don't see why you would have a re-check condition with a query like that, though. Are there some other WHERE-conditions that you didn't show us?

The large fields are stored in the toast table. You can check if the toast table is accessed with a query like this:

select * from pg_stat_all_tables
where relid = (select reltoastrelid from pg_class where relname = 'table');

Run that before and after your query, and see if the numbers change.

- Heikki
Re: [PERFORM] unneeded joins on view
On 04/16/2014 06:13 PM, Linos wrote:
> I thought that PostgreSQL would optimize out joins on columns I don't ask for when I use the view, but it doesn't. This query:

It doesn't, because that would be wrong: it still has to check that the tables have a matching row (or multiple matching rows). If you use LEFT JOINs instead, and have a unique index on all the ID columns, then the planner can do what you expected and leave out the joins.

- Heikki
Re: [PERFORM] Getting query plan alternatives from query planner?
On 04/14/2014 09:36 PM, Stefan Keller wrote:
> Who's the elephant in the room who is reluctant to introduce explicit hints?

Please read some of the previous discussions on this, like this one, in this very same thread: http://www.postgresql.org/message-id/15381.1395410...@sss.pgh.pa.us

I'd like to have explicit hints, *of the kind explained in that message*: hints that tell the planner what the data distribution is like, hints to override the statistics and heuristics used by the planner.

- Heikki
Re: [PERFORM] Batch update query performance
On 04/07/2014 03:06 PM, Albe Laurenz wrote:
> Hans Drexler wrote:
>> Postgres needs close to 50 minutes to process the same query on the same data. Sometimes, Postgres needs more than 2 hours. The application performs an update query on every row of the table. The exact SQL of this query is: update t67cdi_nl_cmp_descr set is_grc_002='Y' [...] We tried removing all indexes. That reduces the runtime to ~3 minutes. When we start to put indexes back, the run time of the query increases again with each index added.
>
> Do I read that right that the duration of the update is reduced from 50 or 120 minutes to 3 when you drop all the indexes?

If that's true, you might be able to drop and re-create the indexes as part of the same transaction, and come out ahead. DROP/CREATE INDEX is transactional in PostgreSQL, so you can do:

BEGIN;
DROP INDEX index1;
...
DROP INDEX index15;
UPDATE t67cdi_nl_cmp_descr SET is_grc_002='Y';
CREATE INDEX index1 ...;
...
CREATE INDEX index15 ...;
COMMIT;

This will take an AccessExclusiveLock on the table, though, so the table will be inaccessible to concurrent queries while it's running.

Actually, since you are effectively rewriting the table anyway, you could create a new table with the same structure, insert all rows from the old table with is_grc_002 set to 'Y', drop the old table, and rename the new table into its place.

Do all the rows really need to be updated? If some of the rows already have is_grc_002='Y', you can avoid rewriting those rows by adding a WHERE clause: WHERE NOT is_grc_002='Y' OR is_grc_002 IS NULL.

You could also play tricks with partitioning. Don't store the is_grc_002 column in the table at all. Instead, create two tables, one for the rows that implicitly have is_grc_002='Y' and another for all the other rows. Then create a view on the union of the two tables, which adds the is_grc_002 column.
Instead of doing a full-table update, you can just alter the view to display is_grc_002='Y' for both tables (and add a new table to hold new rows with is_grc_002<>'Y'). Hypothesis: we have tried many things to solve this problem ourselves, but to no avail so far. Our hypothesis is that Postgres creates new records for all rows and then needs to update all 15 indexes to make them point to the new rows. There does not seem to be a way to avoid that. Question: - Is our hypothesis correct? - Can the forum please advise us on possible ways to make the query faster? Your hypothesis may be correct. Yeah, sounds about right. A full-table UPDATE like that is pretty much the worst-case scenario for PostgreSQL's MVCC system, unfortunately. - Heikki
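A minimal sketch of the two-tables-plus-view trick described above (the helper table and column names here are hypothetical, loosely modeled on the thread; the real table has more columns):

```sql
-- Rows whose flag is implicitly 'Y': the flag is not stored at all.
CREATE TABLE cmp_descr_grc_y (id integer PRIMARY KEY, descr text);

-- All other rows store the flag explicitly.
CREATE TABLE cmp_descr_other (id integer PRIMARY KEY, descr text,
                              is_grc_002 char(1) NOT NULL);

-- The view re-adds the implicit column, so readers see one "table".
CREATE VIEW t67cdi_nl_cmp_descr AS
    SELECT id, descr, 'Y'::char(1) AS is_grc_002 FROM cmp_descr_grc_y
    UNION ALL
    SELECT id, descr, is_grc_002 FROM cmp_descr_other;
```

Setting every row to 'Y' then becomes a matter of redefining the view (or moving rows from one table to the other), instead of rewriting every tuple and updating all 15 indexes.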
Re: [PERFORM] SSI slows down over time
On 04/06/2014 05:25 AM, Ryan Johnson wrote: I've tried linux perf, but all it says is that lots of time is going to LWLock (but callgraph tracing doesn't work in my not-bleeding-edge kernel). Make sure you compile with the -fno-omit-frame-pointer flag. - Heikki
Re: [PERFORM] PGSQL, checkpoints, and file system syncs
On 04/03/2014 08:39 PM, Reza Taheri wrote: Hello PGSQL performance community, You might remember that I pinged you in July 2012 to introduce the TPC-V benchmark. I am now back with more data, and a question about checkpoints. As far as the plans for the benchmark, we are hoping to release a benchmarking kit for multi-VM servers this year (and of course one can always simply configure it to run on one database) I am now dealing with a situation of performance dips when checkpoints complete. To simplify the discussion, I have reproduced the problem on a single VM/single database. Complete config info is in the attached files. Briefly, it is a 6-vCPU VM with 91G of memory, and 70GB in PGSQL shared buffers. The host has 512GB of memory and 4 sockets of Westmere (E7-4870) processors with HT enabled. The data tablespace is on an ext4 file system on a (virtual) disk which is striped on 16 SSD drives in RAID 0. This is obviously overkill for the load we are putting on this one VM, but in the usual benchmarking config, the 16 SSDs are shared by 24 VMs. Log is on an ext3 file system on 4 spinning drives in RAID 1. We are running PGSQL version 9.2 on RHEL 6.4; here are some parameters of interest (postgresql.conf is in the attachment): checkpoint_segments = 1200 checkpoint_timeout = 360s checkpoint_completion_target = 0.8 wal_sync_method = open_datasync wal_buffers = 16MB wal_writer_delay = 10ms effective_io_concurrency = 10 effective_cache_size = 1024MB When running tests, I noticed that when a checkpoint completes, we have a big burst of writes to the data disk. The log disk has a very steady write rate that is not affected by checkpoints except for the known phenomenon of more bytes in each log write when a new checkpoint period starts. In a multi-VM config with all VMs sharing the same data disks, when these write bursts happen, all VMs take a hit. So I set out to see what causes this write burst. 
After playing around with PGSQL parameters and observing its behavior, it appears that the bursts aren't produced by the database engine; they are produced by the file system. I suspect PGSQL has to issue a sync(2)/fsync(2)/sync_file_range(2) system call at the completion of the checkpoint to ensure that all blocks are flushed to disk before creating a checkpoint marker. To test this, I ran a loop to call sync(8) once a second. The graphs in file run280.mht have the throughput, data disk activity, and checkpoint start/completion timestamps for the baseline case. You can see that the checkpoint completion, the write burst, and the throughput dip all occur at the same time, so much so that it is hard to see the checkpoint completion line under the graph of writes. It looks like the file system does a mini flush every 30 seconds. The file run274.mht is the case with sync commands running in the background. You can see that everything is more smooth. Is there something I can set in the PGSQL parameters or in the file system parameters to force a steady flow of writes to disk rather than waiting for a sync system call? Mounting with commit=1 did not make a difference. Try setting the vm.dirty_bytes sysctl. Something like 256MB might be a good starting point. This comes up fairly often, see e.g.: http://www.postgresql.org/message-id/flat/27c32fd4-0142-44fe-8488-9f366dc75...@mr-paradox.net - Heikki
Re: [PERFORM] Suboptimal query plan when using expensive BCRYPT functions
On 03/22/2014 02:59 AM, Erik van Zijst wrote: Is there any way I can get postgres to perform the hash calculations on the *result* of the other parts of the where clause, instead of the other way around? Or else rewrite the query? The planner doesn't know that the crypt function is expensive. That can be fixed with ALTER FUNCTION crypt(text, text) COST <high value>. Even with that, I'm not sure if the planner is smart enough to optimize the query the way you'd want, but it's worth a try. - Heikki
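Spelled out, the ALTER FUNCTION suggestion looks like this (the cost figure is an arbitrary illustration; COST is a unitless multiple of cpu_operator_cost, and this assumes pgcrypto's crypt() is installed):

```sql
-- Mark crypt() as expensive, so the planner prefers to evaluate the
-- cheap WHERE clauses first and call crypt() on as few rows as possible.
ALTER FUNCTION crypt(text, text) COST 1000;
```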
Re: [PERFORM] not using my GIN index in JOIN expression
On 02/27/2014 04:06 PM, Jean-Max Reymond wrote: I am running the last version of PostgreSQL 9.3.3 I have two tables detm and corm and a lot of data in the column cormdata of corm table (1.4 GB). I have a GIN index on cormdata:

CREATE INDEX ix_corm_fulltext_cormdata ON corm
  USING gin (to_tsvector('french'::regconfig, cormdata))
  WHERE cormishtml IS FALSE AND length(cormdata) > 2;

select distinct b.detmmailid from corm b where (to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and b.cormishtml is false and length(b.cormdata) > 2)

is very fast and uses the GIN index:

HashAggregate (cost=2027.72..2031.00 rows=328 width=52)
  -> Bitmap Heap Scan on corm b (cost=24.25..2026.35 rows=548 width=52)
       Recheck Cond: ((to_tsvector('french'::regconfig, cormdata) @@ to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) AND (length(cormdata) > 2))
       -> Bitmap Index Scan on ix_corm_fulltext_cormdata (cost=0.00..24.11 rows=548 width=0)
            Index Cond: (to_tsvector('french'::regconfig, cormdata) @@ to_tsquery('mauritanie'::text))

With a join on another table detm, the GIN index is not used:

explain select distinct a.detmmailid from detm a JOIN corm b on a.detmmailid = b.detmmailid where ((to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and b.cormishtml is false and length(b.cormdata) > 2) OR ( detmobjet ~* 'mauritanie' ))

HashAggregate (cost=172418.27..172423.98 rows=571 width=52)
  -> Hash Join (cost=28514.92..172416.85 rows=571 width=52)
       Hash Cond: (b.detmmailid = a.detmmailid)
       Join Filter: (((to_tsvector('french'::regconfig, b.cormdata) @@ to_tsquery('mauritanie'::text)) AND (b.cormishtml IS FALSE) AND (length(b.cormdata) > 2)) OR (a.detmobjet ~* 'mauritanie'::text))
       -> Seq Scan on corm b (cost=0.00..44755.07 rows=449507 width=689)
       -> Hash (cost=19322.74..19322.74 rows=338574 width=94)
            -> Seq Scan on detm a (cost=0.00..19322.74 rows=338574 width=94)

If I remove OR ( detmobjet ~* 'mauritanie' ) in the select, the GIN index is used: explain select distinct a.detmmailid from detm a JOIN 
corm b on a.detmmailid = b.detmmailid where ((to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and b.cormishtml is false and length(b.cormdata) > 2))

HashAggregate (cost=4295.69..4301.17 rows=548 width=52)
  -> Nested Loop (cost=24.67..4294.32 rows=548 width=52)
       -> Bitmap Heap Scan on corm b (cost=24.25..2026.35 rows=548 width=52)
            Recheck Cond: ((to_tsvector('french'::regconfig, cormdata) @@ to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) AND (length(cormdata) > 2))
            -> Bitmap Index Scan on ix_corm_fulltext_cormdata (cost=0.00..24.11 rows=548 width=0)
                 Index Cond: (to_tsvector('french'::regconfig, cormdata) @@ to_tsquery('mauritanie'::text))
       -> Index Only Scan using pkey_detm on detm a (cost=0.42..4.13 rows=1 width=52)
            Index Cond: (detmmailid = b.detmmailid)

How can I force the use of the GIN index? thanks for your tips, The problem with the OR detmobjet ~* 'mauritanie' restriction is that the rows that match that condition cannot be found using the GIN index. I think you'd want the system to fetch all the rows that match the other condition using the GIN index, and do something else to find the other rows. The planner should be able to do that if you rewrite the query as a UNION:

select a.detmmailid from detm a JOIN corm b on a.detmmailid = b.detmmailid
where (to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie')
       and b.cormishtml is false and length(b.cormdata) > 2)
union
select a.detmmailid from detm a JOIN corm b on a.detmmailid = b.detmmailid
where detmobjet ~* 'mauritanie'

Note that that will not return rows in 'detm' that have no matching rows in the 'corm' table, even if they match the detmobjet ~* 'mauritanie' condition. That's what your original query also did, but if that's not what you want, leave out the JOIN from the second part of the union. - Heikki
Re: [PERFORM] DB size and TABLE sizes don't seem to add up
On 02/18/2014 12:14 AM, David Wall wrote: I am running PG 9.2.4 and I am trying to figure out why my database size shows one value, but the sum of my total relation sizes is so much less. Basically, I'm told my database is 188MB, but the sum of my total relation sizes adds up to just 8.7MB, which is 1/20th of the reported total. Where is the 19/20th of my data then? We do make significant use of large objects, so I suspect it's in there. Is there a relation size query that would include the large object data associated with any OIDs in those tables? You can use select pg_total_relation_size('pg_largeobject') to get the total size of the large objects. Attributing large objects to the tables that refer them is more difficult. For a single table, something like this: select sum(pg_column_size(lo.data)) from lotest_stash_values t, pg_largeobject lo where lo.loid = t.loid; Replace lotest_stash_values with the table's name and t.loid with the name of the OID column. - Heikki
Re: [PERFORM] Strange performance boost with random()
On 02/10/2014 09:52 PM, M Putz wrote: Hello, While analyzing performance, we encountered the following phenomenon, SELECT sum(pow(.5*generate_series,.5)) FROM generate_series(1,100); is much much (a hundred times) slower than SELECT sum(pow(random()*generate_series,.5)) FROM generate_series(1,100); and asymptotic difference is even more astounding. This seems counter-intuitive, considering the cost of an additional random() call instead of a constant factor. What are the reasons for this strange performance boost? Different data type. The first uses numeric, which is pretty slow for doing calculations. random() returns a double, which makes the pow and sum also use double, which is a lot faster. To see the effect, try these variants: SELECT sum(pow(.5::float8 * generate_series,.5)) FROM generate_series(1,100); SELECT sum(pow(random()::numeric * generate_series,.5)) FROM generate_series(1,100); - Heikki
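The type resolution at play here can be confirmed with pg_typeof: a bare decimal literal such as .5 defaults to numeric, while random() returns double precision, and the result type of pow() follows its arguments:

```sql
SELECT pg_typeof(.5);                 -- numeric
SELECT pg_typeof(random());           -- double precision
SELECT pg_typeof(pow(.5, .5));        -- numeric: the slow path
SELECT pg_typeof(pow(random(), .5));  -- double precision: the fast path
```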
Re: [PERFORM] Increasing query time after updates
On 01/21/2014 08:26 AM, Katharina Koobs wrote: Hi, We have a PostgreSQL DB, version 8.4 on a Suse Linux system. Every night a script runs with several updates and inserts. The query time at day increases after approximately 3 weeks from a few minutes to about an hour. Does it get gradually slower every day, or suddenly jump from few minutes to one hour after three weeks? The former would suggest some kind of bloating or fragmentation, while the latter would suggest a change in a query plan (possibly still caused by bloating). Does the database size change over time? After export, drop and import the DB the query time is again at a few minutes. We have tested vacuum full, vacuum analyze and reindex and get no improvement. Has anyone an idea why the queries are getting slower and slower? One theory is that the tables are initially more or less ordered by one column, but get gradually shuffled by the updates. Exporting and importing would load the data back in order. However, a blow to that theory is that a pg_dump + reload will load the tuples in roughly the same physical order, but perhaps you used something else for the export+import. You could try running CLUSTER on any large tables. Since version 9.0, VACUUM FULL does more or less the same as CLUSTER, ie. rewrites the whole table, but in 8.4 it's different. Thank you so much for your help! The DB configuration: Virtual server, 7GB RAM, DB size = 16GB shared_buffers = 1024MB temp_buffers = 32MB work_mem = 8MB checkpoint_segments = 20 effective_cache_size = 512MB max_locks_per_transaction = 256 With 7GB of RAM, you might want to raise effective_cache_size to something like 4GB. It doesn't allocate anything, but tells PostgreSQL how much memory it can expect the operating system to use as buffer cache, which can influence query plans. I doubt it makes any difference for the problem you're seeing, but just as general advice.. 8.4 is quite old by now, and will no longer be supported by the community after July 2014. 
You'll have to upgrade pretty soon anyway, so you might as well upgrade now and see if it helps. - Heikki
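A sketch of the CLUSTER suggestion above, with hypothetical table and index names (note that CLUSTER holds an ACCESS EXCLUSIVE lock on the table for the duration of the rewrite):

```sql
-- Rewrite the table in the physical order of the given index,
-- removing dead-tuple bloat at the same time.
CLUSTER big_table USING big_table_pkey;

-- Refresh planner statistics after the rewrite.
ANALYZE big_table;
```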
Re: [PERFORM] Postgresql in a Virtual Machine
On 25.11.2013 22:01, Lee Nguyen wrote: Hi, Having attended a few PGCons, I've always heard the remark from a few presenters and attendees that Postgres shouldn't be run inside a VM. That bare metal is the only way to go. Here at work we were entertaining the idea of running our Postgres database on our VM farm alongside our application vm's. We are planning to run a few Postgres synchronous replication nodes. Why shouldn't we run Postgres in a VM? What are the downsides? Does anyone have any metrics or benchmarks with the latest Postgres? I've also heard people say that they've seen PostgreSQL perform worse in a VM. In the performance testing that we've done in VMware, though, we haven't seen any big impact. So I guess the answer is that it depends on the specific configuration of CPU, memory, disks and the software. Synchronous replication is likely going to be the biggest bottleneck by far, unless it's mostly read-only. I don't know if virtualization will have a measurable impact on network latency, which is what matters for synchronous replication. So, I'd suggest that you try it yourself, and see how it performs. And please report back to the list, I'd also love to see some numbers! - Heikki
Re: [PERFORM] Index on a range array
On 13.08.2013 23:47, Daniel Cristian Cruz wrote: Hello, I'm trying to simplify a schema, where I had many ranges floating around. My idea is to put them all in an array field and query like this:

SELECT event.* FROM event JOIN participant_details USING (participant_id)
WHERE tsrange(event.start, event.end) && ANY (participant_details.periods);

periods is tsrange[]. I've tried it and it worked, but without indexes. I've tried something, but didn't found anything... Does someone know how to index this kind of field (tsrange[])? From the docs I learn that there is some GIST magic, but I would need to code in C. Is that true? Yeah. It might be somewhat tricky to write an efficient GIST implementation for this anyway. What you'd really want to do is to index each value in the array separately, which is more like what GIN does. With the partial match infrastructure in GIN, it might be possible to write a GIN implementation that can speed up range overlap queries. However, that certainly requires C coding too. A couple of alternatives come to mind: You could create the index on just the min and max values of the periods, and in the query check for overlap with that. If there typically aren't big gaps between the periods of each participant, that might work well. Or you could split the range of expected timestamps into discrete steps, for example at one-day granularity. Create a function to convert a range into an array of steps, e.g. convert each range into an array of days that the range overlaps with. Create a GIN index on that array, and use it in the query. Something like this:

-- Returns an int representing the day the given timestamp falls into
create function epochday(timestamp) returns int4 as $$
  select extract (epoch from $1)::int4/(24*3600)
$$ language sql immutable;

-- Same for a range. Returns an array of ints representing all the
-- days that the given range overlaps with.
create function epochdays(tsrange) returns integer[] as $$
  select array_agg(g)
  from generate_series(epochday(lower($1)), epochday(upper($1))) g
$$ language sql immutable;

-- Same for an array of ranges. Returns an array of ints representing
-- all the days that overlap with any of the given timestamp ranges
create function epochdays(ranges tsrange[]) returns integer[] as $$
declare
  r tsrange;
  result integer[];
begin
  foreach r in array ranges loop
    result = result || (select array_agg(g)
                        from generate_series(epochday(lower(r)),
                                             epochday(upper(r))) g);
  end loop;
  return result;
end;
$$ language plpgsql immutable;

-- Create the index on that:
create index period_days on participant_details using gin (epochdays(periods));

-- Query like this:
SELECT event.* FROM event JOIN participant_details USING (participant_id)
-- This WHERE-clause is for correctness:
WHERE tsrange(event.start, event.end) && ANY (participant_details.periods)
-- and this is to make use of the index:
AND epochdays(tsrange(event.start, event.end)) && epochdays(participant_details.periods);

- Heikki
Re: [PERFORM] In progress INSERT wrecks plans on table
On 03.05.2013 15:41, Simon Riggs wrote: What appears to be happening is we're spending a lot of time in TransactionIdIsInProgress() so we can set hints and then when we find it is still in progress we then spend more time in XidIsInSnapshot() while we check that it is still invisible to us. Even if the transaction we see repeatedly ends, we will still pay the cost in XidIsInSnapshot repeatedly as we execute. Given that code path, I would expect it to suck worse on a live system with many sessions, and even worse with many subtransactions. (1) A proposed fix is attached, but it's only a partial one and barely tested. Deeper fixes might be (2) to sort the xid array if we call XidIsInSnapshot too many times in a transaction. I don't think that is worth it, because a long running snapshot may be examined many times, but is unlikely to see multiple in-progress xids repeatedly. Whereas your case seems reasonably common. Yeah, sorting would be a waste of time most of the time. Instead of adding a new cache field, how about just swapping the matched XID to the beginning of the array? Did you have some simple performance test script for this? - Heikki
Re: [PERFORM] In progress INSERT wrecks plans on table
On 06.05.2013 04:51, Mark Kirkwood wrote: On 05/05/13 00:49, Simon Riggs wrote: On 3 May 2013 13:41, Simon Riggs si...@2ndquadrant.com wrote: (3) to make the check on TransactionIdIsInProgress() into a heuristic, since we don't *need* to check that, so if we keep checking the same xid repeatedly we can reduce the number of checks or avoid xids that seem to be long running. That's slightly more coding than my quick hack here but seems worth it. I think we need both (1) and (3) but the attached patch does just (1). This is a similar optimisation to the one I introduced for TransactionIdIsKnownCompleted(), except this applies to repeated checking of as yet-incomplete xids, and to bulk concurrent transactions. ISTM we can improve performance of TransactionIdIsInProgress() by caching the procno of our last xid. Mark, could you retest with both these patches? Thanks. Thanks Simon, will do and report back. Did anyone ever try (3) ? I'm not sure if this is the same idea as (3) above, but ISTM that HeapTupleSatisfiesMVCC doesn't actually need to call TransactionIdIsInProgress(), because it checks XidInMVCCSnapshot(). The comment at the top of tqual.c says: * NOTE: must check TransactionIdIsInProgress (which looks in PGXACT array) * before TransactionIdDidCommit/TransactionIdDidAbort (which look in * pg_clog). Otherwise we have a race condition: we might decide that a * just-committed transaction crashed, because none of the tests succeed. * xact.c is careful to record commit/abort in pg_clog before it unsets * MyPgXact->xid in PGXACT array. That fixes that problem, but it also * means there is a window where TransactionIdIsInProgress and * TransactionIdDidCommit will both return true. If we check only * TransactionIdDidCommit, we could consider a tuple committed when a * later GetSnapshotData call will still think the originating transaction * is in progress, which leads to application-level inconsistency. 
The * upshot is that we gotta check TransactionIdIsInProgress first in all * code paths, except for a few cases where we are looking at * subtransactions of our own main transaction and so there can't be any * race condition. If TransactionIdIsInProgress() returns true for a given XID, then surely it was also running when the snapshot was taken (or had not even begun yet). In which case the XidInMVCCSnapshot() call will also return true. Am I missing something? There's one little problem: we currently only set the hint bits when TransactionIdIsInProgress() returns false. If we do that earlier, then even though HeapTupleSatisfiesMVCC works correctly thanks to the XidInMVCCSnapshot call, other HeapTupleSatisfies* functions that don't call XidInMVCCSnapshot might see the tuple as committed or aborted too early, if they see the hint bit as set while the transaction is still in-progress according to the proc array. Would have to check all the callers of those other HeapTupleSatisfies* functions to verify if that's OK. - Heikki
Re: [PERFORM] Check Pointer
On 30.05.2013 15:09, itishree sukla wrote: In our server Check pointer process is consuming 8 GB of memory, what could be the possible reason? Can any one please help. Are you sure you're measuring the memory correctly? The RES field in top output, for example, includes shared memory, ie. the whole buffer cache. Shared memory isn't really consumed by the checkpointer process, but shared by all postgres processes. - Heikki
Re: [PERFORM] Thinking About Correlated Columns (again)
On 15.05.2013 18:31, Shaun Thomas wrote: I've seen conversations on this since at least 2005. There were even proposed patches every once in a while, but never any consensus. Anyone care to comment? Well, as you said, there has never been any consensus. There are basically two pieces to the puzzle: 1. What metric do you use to represent correlation between columns? 2. How do you collect that statistic? Based on the prior discussions, collecting the stats seems to be tricky. It's not clear for which combinations of columns it should be collected (all possible combinations? That explodes quickly...), or how it can be collected without scanning the whole table. I think it would be pretty straightforward to use such a statistic, once we have it. So perhaps we should get started by allowing the DBA to set a correlation metric manually, and use that in the planner. - Heikki
Re: [PERFORM] Predicate information in EXPLAIN Command
On 14.05.2013 12:23, Sameer Thakur wrote: Hello, I am trying to find predicate information for a given SQL query plan as provided by Oracle using DBMS_XPLAN. I am looking at the EXPLAIN command for getting this query plan information, with no luck so far. Does the EXPLAIN command provide predicate information? Sure. For example:

postgres=# explain select * from a where id = 123;
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on a  (cost=0.00..40.00 rows=12 width=4)
   Filter: (id = 123)
(2 rows)

The predicate is right there on the Filter line. Likewise for a join:

postgres=# explain select * from a, b where a.id = b.id;
                         QUERY PLAN
-------------------------------------------------------------
 Hash Join  (cost=64.00..134.00 rows=2400 width=8)
   Hash Cond: (a.id = b.id)
   ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
   ->  Hash  (cost=34.00..34.00 rows=2400 width=4)
         ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
(5 rows)

The join predicate is on the Hash Cond line. - Heikki
[PERFORM] Re: [PERFORM] Question about network bandwidth usage between PostgreSQL’s client and server
On 25.04.2013 02:56, Kelphet Xiong wrote: In all the experiments, the lineitem and partsupp tables reside in memory because there is no io activities observed from iotop. Since there is enough network bandwidth (1Gb/s or 128MB/s) between client and server, I would like to know what determines the data transferring rate or the network bandwidth usage between a client and a server when network bandwidth is enough. Since there's enough network bandwidth available, the bottleneck is elsewhere. I don't know what it is in your example - maybe it's the I/O capacity, or CPU required to process the result in the server before it's sent over the network. It could also be in the client, on how fast it can process the results coming from the server. I'd suggest running 'top' on the server while the query is executed, and keeping an eye on the CPU usage. If it's pegged at 100%, the bottleneck is the server's CPU. For example, given that the size of each tuple of lineitem table is 88% of that of partsupp, why the average network usage for sequential scan of lineitem table is only 50% that of partsupp table? And why the average network usage of their join is higher than that of sequential scan of lineitem but lower than that of sequential scan of partsupp table? Here's a wild guess: the query on lineitem is bottlenecked by CPU usage in the server. A lot of CPU could be spent on converting the date fields from on-disk format to the text representation that's sent over the network; I've seen that conversion use up a lot of CPU time on some test workloads. Try leaving out the date columns from the query to test that theory. If that's the bottleneck, you could try fetching the result in binary format, that should consume less CPU in the server. You didn't mention what client library you're using, but e.g with libpq, see the manual on PQexecParams on how to set the result format. 
- Heikki
Re: [PERFORM] Performance with the new security release
On 22.04.2013 19:48, Anne Rosset wrote: Hi, We are seeing some overall performance degradation in our application since we installed the security release. Other commits were also done at the same time in the application so we don't know yet if the degradation has any relationship with the security release. While we are digging into this, I would like to know if it is possible that the release has some impact on performance. After reading this: "It was created as a side effect of a refactoring effort to make establishing new connections to a PostgreSQL server faster, and the associated code more maintainable.", I am thinking it is quite possible. I doubt that particular commit, the one that fixed the security issue, could cause any meaningful slowdown. But it's not impossible that some other fix included in the release would cause a regression, although we try to be careful to avoid that. If you narrow the culprit down to the new PostgreSQL version, we're going to need more details to find the root cause. - Heikki
Re: [PERFORM] Index usage for tstzrange?
On 21.03.2013 06:07, Vasilis Ventirozos wrote: On Thu, Mar 21, 2013 at 5:58 AM, Tom Lane t...@sss.pgh.pa.us wrote: What I find more disturbing is that this is what I get from the example in HEAD:

regression=# explain SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-01-01 00:10:00');
ERROR:  XX000: type 1184 is not a range type
LOCATION:  range_get_typcache, rangetypes.c:1451

Haven't traced through it to determine exactly what's happening, but isn't this a legitimate usage? And if it isn't, surely a more user-facing error ought to be getting thrown somewhere upstream of here. It is a legit usage, this is from a test I did myself (9.2.3):

test=# explain SELECT * FROM a WHERE ts <@ tstzrange('2013-01-01','2013-04-01 00:10:00');
                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on a  (cost=0.00..23.75 rows=1 width=44)
   Filter: (ts <@ '[2013-01-01 00:00:00+02,2013-04-01 00:10:00+03)'::tstzrange)

Looks like the range type cost estimation patch broke this, back in August already. The case of var <@ constant, where constant is a range and var is an element, that's broken. The cost estimation function, rangesel(), incorrectly assumes that the 'var' is always a range type. It's a bit worrying that no-one noticed until now. I'll add a test for that operator to the rangetypes regression test. The immediate fix is attached, but this made me realize that rangesel() is still missing estimation for the element <@ range operator. It shouldn't be hard to implement, I'm pretty sure we have all the statistics we need for that. - Heikki

diff --git a/src/backend/utils/adt/rangetypes_selfuncs.c b/src/backend/utils/adt/rangetypes_selfuncs.c
index 76dc913..c450c6a 100644
--- a/src/backend/utils/adt/rangetypes_selfuncs.c
+++ b/src/backend/utils/adt/rangetypes_selfuncs.c
@@ -154,8 +154,6 @@ rangesel(PG_FUNCTION_ARGS)
 		}
 	}
 
-	typcache = range_get_typcache(fcinfo, vardata.vartype);
-
 	/*
 	 * OK, there's a Var and a Const we're dealing with here. We need the
 	 * Const to be of same range type as the column, else we can't do anything
@@ -169,6 +167,8 @@ rangesel(PG_FUNCTION_ARGS)
 	 */
 	if (operator == OID_RANGE_CONTAINS_ELEM_OP)
 	{
+		typcache = range_get_typcache(fcinfo, vardata.vartype);
+
 		if (((Const *) other)->consttype == typcache->rngelemtype->type_id)
 		{
 			RangeBound	lower,
 						upper;
@@ -185,6 +185,8 @@
 	}
 	else
 	{
+		typcache = range_get_typcache(fcinfo, ((Const *) other)->consttype);
+
 		if (((Const *) other)->consttype == vardata.vartype)
 			constrange = DatumGetRangeType(((Const *) other)->constvalue);
 	}
Re: [PERFORM] BUG: endless lseek(.., SEEK_END) from select queries on x64 builds
On 22.02.2013 20:10, Markus Schulz wrote: On Friday, February 22, 2013, 14:35:25, Heikki Linnakangas wrote: You could check what the generic plan looks like by taking the query used in the java program, with the parameter markers, and running EXPLAIN on that. How can I do this? I've tried the following in my ejb-test-function:

String query = ... entitymanager.createNativeQuery(query)...;
entitymanager.createNativeQuery("EXPLAIN ANALYZE " + query)...;

but the second createNativeQuery call runs fast every time and shows the same plan, while the first hangs after the fourth call to this function.

You can take the query, replace the ? parameter markers with $1, $2, and so forth, and explain it with psql like this:

prepare foo (text) as select * from mytable where id = $1;
explain analyze execute foo ('foo');

On 9.2, though, this will explain the specific plan for those parameters, so it might not be any different from what you already EXPLAINed. - Heikki
Re: [PERFORM] BUG: endless lseek(.., SEEK_END) from select queries on x64 builds
On 22.02.2013 10:25, Markus Schulz wrote: i can run the query four times with good performance and after that postgresql starts with the strange lseek() behavior. By default, the JDBC driver re-plans the prepared statement for the first 4 invocations of the query. On the fifth invocation, it switches to using a generic plan, which will be reused on subsequent invocations. See http://jdbc.postgresql.org/documentation/head/server-prepare.html. The generic plan seems to perform much worse in this case. You can disable that mechanism and force re-planning the query every time by setting the prepareThreshold=0 parameter on the data source. You could check what the generic plan looks like by taking the query used in the java program, with the parameter markers, and running EXPLAIN on that. PostgreSQL version 9.2 might work better in this case. It has some smarts in the server to generate parameter-specific plans even when prepared statements are used, if the planner thinks a specific plan will be faster. - Heikki
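For reference, prepareThreshold can be set directly on the JDBC connection URL. A minimal sketch (host and database names here are hypothetical placeholders, not from the thread):

```java
// Sketch: force PgJDBC to re-plan on every execution by putting
// prepareThreshold=0 on the connection URL. Host/db names are made up.
public class PrepareThresholdExample {
    static String buildUrl(String host, String db) {
        // prepareThreshold=0 disables the switch to a generic server-side plan
        return "jdbc:postgresql://" + host + "/" + db + "?prepareThreshold=0";
    }

    public static void main(String[] args) {
        String url = buildUrl("localhost", "mydb");
        // With the driver on the classpath, one would then connect with:
        // java.sql.DriverManager.getConnection(url, "user", "password");
        System.out.println(url);
    }
}
```

The same parameter can also be set per-statement via the driver's PGStatement interface, but the URL form affects all statements on the connection, which matches the "on the data source" suggestion above.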
Re: [PERFORM] PG_XLOG 27028 files running out of space
On 14.02.2013 12:49, Tory M Blue wrote: My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what happened; this has been running flawlessly for years. I do have archiving turned on and run an archive command every 10 minutes. I'm not sure how to go about cleaning this up. I got the DB back up, but I've only got 6gb free on this drive and it's going to blow up if I can't relieve some of the stress from this directory, which is over 220gb. What are my options?

You'll need to delete some of the oldest xlog files to release disk space. But first you need to make sure you don't delete any files that are still needed, and figure out what got you into this situation in the first place. You say that you run an archive command every 10 minutes. What do you mean by that? archive_command specified in postgresql.conf is executed automatically by the system, so you don't need to and should not run that manually. After archive_command has run successfully, and the system doesn't need the WAL file for recovery anymore (ie. after the next checkpoint), the system will delete the archived file to release disk space. Clearly that hasn't been working in your system for some reason. If archive_command doesn't succeed, ie. it returns a non-zero return code, the system will keep retrying forever until it succeeds, without deleting the file. Have you checked the logs for any archive_command errors?

To get out of the immediate trouble, run pg_controldata, and make note of this line:

Latest checkpoint's REDO WAL file: 00010001

Anything older than that file is not needed for recovery. You can delete those, if you have them safely archived. - Heikki
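Since WAL segment names are zero-padded hex, plain string ordering matches their age, so "older than the REDO WAL file" can be checked with a string comparison. A sketch with hypothetical file names (the real REDO name comes from pg_controldata, and files should only ever be removed if they have been archived):

```shell
# Sketch: flag pg_xlog files that sort strictly before the latest
# checkpoint's REDO WAL file. File names below are hypothetical examples;
# the redo value would come from:
#   pg_controldata "$PGDATA" | grep 'REDO WAL file'
redo="000000010000000000000010"
candidates=""
for f in 00000001000000000000000A 000000010000000000000010 000000010000000000000011; do
  # lexicographic comparison works because names are fixed-width hex
  if [[ "$f" < "$redo" ]]; then
    candidates="$candidates $f"
    echo "older than checkpoint redo: $f"
  fi
done
```

The contrib tool pg_archivecleanup automates the same ordering logic for archive directories.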
Re: [PERFORM] Slow Query Help
On 05.02.2013 05:45, Will Platnick wrote: We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of active queries has risen significantly, especially during our peak time where lots of users are logging in. According to New Relic, this query is now taking up the most amount of time during peak activity, and my pg_stat_activity and slow log sampling agree. We have 3 DB servers referenced here: production running 9.2.2, a semi-idle one (idle except for replication when I ran the test) running 9.2.2, and 9.1.3 completely idle with an old dump restored.

The only thing that stands out is that it always checks both indexes for matches. Since you only want a single row as a result, it seems like it would be better to first check one index, and only check the other one if there's no match. Rewriting the query with UNION should do that:

SELECT id, username, password, email, verified, timezone FROM users WHERE lower(username) = 'randomuser'
UNION ALL
SELECT id, username, password, email, verified, timezone FROM users WHERE lower(email) = 'randomuser'
LIMIT 1;

Also, if you can assume that email addresses always contain the @-character, you could take advantage of that and only do the lower(email) = 'randomuser' search if there is one. - Heikki
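The @-character heuristic can even be folded into the same UNION ALL query, so the email arm can never match a search string without an @ in it. A sketch, assuming the same users table as above:

```sql
-- Sketch: skip the email probe entirely when the search term
-- cannot be an email address (contains no '@').
SELECT id, username, password, email, verified, timezone
  FROM users WHERE lower(username) = 'randomuser'
UNION ALL
SELECT id, username, password, email, verified, timezone
  FROM users WHERE lower(email) = 'randomuser'
   AND position('@' in 'randomuser') > 0
LIMIT 1;
```

With a parameterized query the literal 'randomuser' would be the same bound parameter in all three places, so the position() test is evaluated once as a constant.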
Re: [PERFORM] Analyze and default_statistics_target
On 21.01.2013 17:29, AJ Weber wrote: I was under the impression that the default_statistics_target was a percentage of rows to analyze. Maybe this is not the case? Nope. I ran an analyze during a quiet point last night and for a few of my large tables, I didn't get what I consider a reasonable sampling of rows. When running with verbose enabled, it appeared that a maximum of 24000 rows were being analyzed, including on tables exceeding 4-8mm rows. My default_statistics_target = 80. Shouldn't I be analyzing a larger percentage of these big tables? Analyze only needs a fairly small random sample of the rows in the table to get a picture of what the data looks like. Compare with e.g. opinion polls; you only need to sample a few thousand people to get a result with a reasonable error bound. That's for estimating the histogram. Estimating ndistinct is a different story, and it's well-known that the estimates of ndistinct are sometimes wildly wrong. What is the unit of measure used for default_statistics_target? It's the number of entries stored in the histogram and most-common-values list in pg_statistics. See also http://www.postgresql.org/docs/devel/static/planner-stats.html: The amount of information stored in pg_statistic by ANALYZE, in particular the maximum number of entries in the most_common_vals and histogram_bounds arrays for each column, can be set on a column-by-column basis using the ALTER TABLE SET STATISTICS command, or globally by setting the default_statistics_target configuration variable. The default limit is presently 100 entries. - Heikki
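The opinion-poll point can be demonstrated without PostgreSQL at all: the error of estimating a proportion from a random sample depends on the sample size, not the table size. An illustrative sketch (the table contents and 30% figure are made up for the demonstration):

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# A hypothetical one-million-row "table" where 30% of rows match a predicate.
table = [1] * 300_000 + [0] * 700_000

def estimate(sample_size):
    """Estimate the matching fraction from a random sample, the way ANALYZE
    builds statistics from a sample rather than scanning every row."""
    sample = random.sample(table, sample_size)
    return sum(sample) / sample_size

# A sample in the tens of thousands pins the true fraction (0.3) down to a
# fraction of a percent, even though it covers only 3% of the "table".
print(round(estimate(30_000), 2))
```

Make the table ten times larger and the same 30k-row sample gives essentially the same accuracy, which is why ANALYZE's sample size scales with the statistics target rather than with the table.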
Re: [PERFORM] Partition insert trigger using C language
On 10.01.2013 21:48, Matheus de Oliveira wrote: I have made a small modification to keep the plans, and it got from 33957.768ms to 43782.376ms. If I'm reading results.txt correctly, the avg runtimes are:

C and SPI_execute_with_args: 58567.708 ms
C and SPI_(prepare/keepplan/execute_plan): 43782.376 ms
C and heap_insert: 33957.768 ms

So switching to prepared plans helped quite a lot, but it's still slower than direct heap_inserts. One thing that caught my eye:

CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
RETURNS trigger
LANGUAGE C VOLATILE STRICT
AS 'partition_insert_trigger_spi','partition_insert_trigger_spi'
SET DateStyle TO 'ISO';

Calling a function with SET options has a fair amount of overhead, to set/restore the GUC on every invocation. That should be avoided in a performance-critical function like this. - Heikki
Re: [PERFORM] Partition insert trigger using C language
On 10.01.2013 20:45, Matheus de Oliveira wrote: Inspired by Charles' thread and the work of Emmanuel [1], I have made some experiments trying to create a trigger to make partitioning using C language. The first attempt was not good: I tried to use SPI [2] to create a query to insert into the correct child table, but it showed almost no improvement compared with the PL/pgSQL code. The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_plan). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. - Heikki
Re: [PERFORM] Partition insert trigger using C language
On 10.01.2013 21:11, Matheus de Oliveira wrote: On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_plan). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. Yeah. I thought about that, but the problem was that I assumed the INSERTs came with random dates, so in the worst scenario I would have to keep the plans of all of the child partitions. Am I wrong? But thinking better, even with hundreds of partitions, it wouldn't use too much memory/resource, would it? Right, a few hundred saved plans would probably still be ok. And if that ever becomes a problem, you could keep the plans in an LRU list and only keep the last 100 plans or so. - Heikki
Re: [PERFORM] FW: performance issue with a 2.5gb joinded table
On 03.01.2013 15:30, Daniel Westermann wrote: What additionally makes me wonder is that the same table in oracle is taking much less space than in postgresql:

SQL> select sum(bytes) from dba_extents where segment_name = 'TEST1';
SUM(BYTES)
----------
1610612736

select pg_relation_size('mgmtt_own.test1');
pg_relation_size
------------------
2502082560
(1 row)

(sysdba@[local]:) [bi_dwht] \d+ mgmtt_own.test1
Table mgmtt_own.test1
Column                       | Type          | Modifiers | Storage | Description
-----------------------------+---------------+-----------+---------+------------
slsales_batch                | numeric(8,0)  |           | main    |
slsales_checksum             | numeric(8,0)  |           | main    |
slsales_reg_id               | numeric(8,0)  |           | main    |
slsales_prod_id              | numeric(8,0)  |           | main    |
slsales_date_id              | numeric(8,0)  |           | main    |
slsales_pos_id               | numeric(8,0)  |           | main    |
slsales_amt_sales_gross      | numeric(16,6) |           | main    |
slsales_amt_sales_discount   | numeric(16,6) |           | main    |
slsales_units_sales_gross    | numeric(8,0)  |           | main    |
slsales_amt_returns          | numeric(16,6) |           | main    |
slsales_amt_returns_discount | numeric(16,6) |           | main    |
slsales_units_returns        | numeric(8,0)  |           | main    |
slsales_amt_est_winnings     | numeric(16,6) |           | main    |
Indexes:
    itest1 btree (slsales_date_id) CLUSTER, tablespace mgmtt_idx
    itest2 btree (slsales_prod_id), tablespace mgmtt_idx
Has OIDs: no
Tablespace: mgmtt_dat

One difference is that numerics are stored more tightly packed on Oracle. Which is particularly good for Oracle, as they don't have other numeric data types than number. On PostgreSQL, you'll want to use int4 for ID fields where possible. An int4 always takes up 4 bytes, while a numeric holding an integer value in the same range is typically 5-9 bytes. - Heikki
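The per-datum difference can be checked directly with pg_column_size(); a quick sketch (the sample value is arbitrary):

```sql
-- Sketch: compare the stored size of the same integer value in both types.
SELECT pg_column_size(12345678::int4)    AS int4_bytes,    -- always 4
       pg_column_size(12345678::numeric) AS numeric_bytes; -- varlena header
                                                           -- plus digits,
                                                           -- larger than 4
```

Multiplied over thirteen numeric columns and millions of rows, that per-value overhead accounts for much of the size gap seen above.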
Re: [PERFORM] Postgres configuration for 8 CPUs, 6 GB RAM
On 27.11.2012 09:47, Syed Asif Tanveer wrote: I am using PostgreSQL 9.1.5 for data warehousing and OLAP purposes. Data size is around 100 GB and I have tuned my PostgreSQL accordingly, still I am facing performance issues. The query performance is too low despite tables being properly indexed and vacuumed and analyzed on a regular basis. CPU usage never exceeded 15% even at peak usage times. Kindly guide me through if there are any mistakes in setting configuration parameters. Below are my system specs, and please find attached my postgresql configuration parameters for the current system. The configuration looks OK to me at a quick glance. I'd suggest looking at the access plans of the queries that are too slow (ie. EXPLAIN ANALYZE). How low is too low, and how fast do the queries need to be? What kind of an I/O system does the server have? See also https://wiki.postgresql.org/wiki/Slow_Query_Questions - Heikki
Re: [PERFORM] How to keep queries low latency as concurrency increases
On 25.11.2012 18:30, Catalin Iacob wrote: So it seems we're just doing too many connections and too many queries. Each page view from a user translates to multiple requests to the application server, and each of those translates to a connection and at least a few queries (which are done in middleware and therefore happen for each and every request). One pgbouncer can handle lots of concurrent idle connections and lots of queries/second, but our 9000 queries/second seem to push it too much. The longer term solution for us would probably be to do fewer connections (by doing fewer Django requests for a page) and fewer queries; before our deadline we were just searching for a short term solution to handle an expected traffic spike. The typical solution to that is caching, see https://docs.djangoproject.com/en/1.4/topics/cache/. - Heikki
Re: [PERFORM] Hints (was Poor performance using CTE)
On 22.11.2012 02:53, Jeff Janes wrote: That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. When I run into unexpectedly poor performance, I have an intuitive enough feel for my own data that I know what plan it ought to be using. Figuring out why it is not using it is very hard. For one thing, EXPLAIN tells you about the winning plan, but there is no visibility into what ought to be the winning plan but isn't, so no way to see why it isn't. So you first have to use our existing non-hint hints (enable_*, doing weird things with cost_*, CTE stuff) to trick it into using the plan I want it to use, before I can figure out why it isn't using it, and before I could figure out what hints of the style you are suggesting to supply to get it to use it. I'm sure that happens too, but my gut feeling is that more often the EXPLAIN ANALYZE output reveals a bad estimate somewhere in the plan, and the planner chooses a bad plan based on the bad estimate. If you hint the planner by giving a better estimate for where the estimator got it wrong, the planner will choose the desired plan. - Heikki
Re: [PERFORM] fast read of binary data
On 22.11.2012 09:54, Eildert Groeneveld wrote: ok, I did have a look at the libpq library, and you are right, there is a way to obtain binary data from the backend through PQexecParams:

res = PQexecParams(conn,
                   "DECLARE myportal CURSOR FOR select genotype_bits from v_genotype_data",
                   0,           /* zero params */
                   NULL,        /* let the backend deduce param type */
                   paramValues,
                   NULL,        /* don't need param lengths since text */
                   NULL,        /* default to all text params */
                   1);          /* ask for binary results */

genotype_bits is defined as bit varying in the backend. When writing the results:

for (i = 0; i < PQntuples(res); i++)
{
    for (j = 0; j < nFields; j++)
        fwrite(PQgetvalue(res, i, j), 10, 1, f);
}

it is clear that the results are NOT in binary format, e.g. (eno,snp): od -b junk | head 000 061 060 061 060 061 060 061 060 061 060 061 060 061 060 061 060

You need to ask for binary results when you execute the FETCH statements. Asking for binary results on the DECLARE CURSOR statement has no effect, as DECLARE CURSOR doesn't return any results; it's the FETCH that follows that returns the result set. - Heikki
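A hedged sketch of that fix in libpq terms (the cursor name follows the example above; fetch size is arbitrary, and this fragment assumes an open connection):

```c
/* Sketch: request binary results on the FETCH, which is the statement
 * that actually returns rows from the cursor. The final PQexecParams
 * argument is resultFormat: 1 = binary. */
#include <libpq-fe.h>

PGresult *
fetch_binary(PGconn *conn)
{
    return PQexecParams(conn,
                        "FETCH 1000 FROM myportal",
                        0,                 /* no parameters */
                        NULL, NULL, NULL, NULL,
                        1);                /* resultFormat: binary */
}
```

With binary results, PQgetvalue() returns the raw datum bytes and PQgetlength() gives the true length, which matters for bit varying since the data is no longer NUL-terminated text.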
Re: [PERFORM] Poor performance using CTE
On 21.11.2012 01:53, Tom Lane wrote: I think the more interesting question is what cases wouldn't be covered by such a rule. Typically you need to use OFFSET 0 in situations where the planner has guessed wrong about costs or rowcounts, and I think people are likely using WITH for that as well. Should we be telling people that they ought to insert OFFSET 0 in WITH queries if they want to be sure there's an optimization fence? Yes, I strongly feel that we should. Writing a query using WITH often makes it more readable. It would be a shame if people have to refrain from using it, because the planner treats it as an optimization fence. - Heikki
Re: [PERFORM] Poor performance using CTE
On 21.11.2012 17:42, Gavin Flower wrote: On 22/11/12 04:32, Andres Freund wrote: On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote: I wasn't talking about removing it. My point was that if the optimization fence around CTEs is removed a lot of people will need to rework apps where they have used them for that purpose. And I continue to think that spelling it OFFSET 0 is horribly obscure. +1 FWIW, I'm happy with OFFSET 0. Granted, it's pretty obscure, but that's what we've historically recommended, and it's pretty ugly to have to specify a fence like that in the first place. Whenever you have to resort to it, you ought to have a comment in the query explaining why you need to force the planner like that, anyway.

WITH foo AS (SELECT ...) (barrier=on|off)?
9.3 introduces the syntax, defaulting to on; 9.4 switches the default to off.
WITH foo AS (SELECT ...) (fence=on|off)?
WITH foo AS (SELECT ...) (optimisation_fence=on|off)?

If we are to invent a new syntax for this, can we please come up with something that's more widely applicable than just the WITH syntax. Something that you could use to replace OFFSET 0 in a subquery, too. - Heikki
Re: [PERFORM] Hints (was Poor performance using CTE)
On 21.11.2012 15:42, Kevin Grittner wrote: Better, IMV, would be to identify what sorts of hints people actually find useful, and use that as the basis for TODO items for optimizer improvement as well as inventing clear ways to specify the desired coercion. I liked the suggestion that a CTE which didn't need to be materialized because of side-effects or multiple references have a keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to specify that. Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for a INNER JOIN b ON a.id = b.id, the user could tell the planner that there are only 10 rows that match the a.id = b.id qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. - Heikki
Re: [PERFORM] postgres 8.4, COPY, and high concurrency
On 13.11.2012 21:13, Jon Nelson wrote: I was working on a data warehousing project where a fair number of files could be COPY'd more or less directly into tables. I have a somewhat nice machine to work with, and I ran on 75% of the cores I have (75% of 32 is 24). Performance was pretty bad. With 24 processes going, each backend (in COPY) spent 98% of its time in semop (as identified by strace). I tried larger and smaller shared buffers, all sorts of other tweaks, until I tried reducing the number of concurrent processes from 24 to 4. Disk I/O went up (on average) at least 10X and strace reports that the top system calls are write (61%), recvfrom (25%), and lseek (14%) - pretty reasonable IMO. Given that each COPY is into its own, newly-made table with no indices or foreign keys, etc, I would have expected the interaction among the backends to be minimal, but that doesn't appear to be the case. What is the likely cause of the semops? I'd guess it's lock contention on WALInsertLock. That means the system is experiencing lock contention on generating WAL records for the insertions. If that theory is correct, you ought to get a big gain if you have wal_level=minimal, and you create or truncate the table in the same transaction with the COPY. That allows the system to skip WAL-logging the COPY. Or you could upgrade to 9.2. The WAL-logging of bulk COPY was optimized in 9.2; it should help precisely the scenario you're facing. - Heikki
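A sketch of the pattern that allows skipping the WAL-logging (assuming wal_level=minimal and archiving off; table and file names are made up):

```sql
-- Sketch: create the target table in the same transaction as the COPY.
-- With wal_level=minimal, PostgreSQL can then skip WAL-logging the COPY
-- and just fsync the table at commit.
BEGIN;
CREATE TABLE staging_1 (id integer, payload text);
COPY staging_1 FROM '/path/to/data1.csv' CSV;
COMMIT;
```

For a pre-existing table, a TRUNCATE inside the same transaction as the COPY gets the same exemption.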
Re: [PERFORM] dbt2 performance regresses from 9.1.6 to 9.2.1
On 05.11.2012 16:32, Claudio Freire wrote: On Sun, Nov 4, 2012 at 7:23 PM, Dong Ye y...@vmware.com wrote: You are using prepared statements; this makes me think that this regression might be due to support for parameter-specific plans for prepared statements. [1] Can you run the test on both versions without prepared statements and see if the regression remains? Without prepared statements, we got 48837.33 avg notpm with 9.1.6 and 43264.54 avg notpm with 9.2.1. notps over time shows the slowdown of 9.2.1 is evident during the entire course of the run. Their profiles are posted on http://pgsql.privatepaste.com/b770f72967 (9.1.6) and http://pgsql.privatepaste.com/6fa8b7f174 (9.2.1). You know... it does look as if 9.2.1 is generating a lot more pressure on the memory allocator (AllocSetAlloc notably higher). Did you check the access plans of the queries? The 9.2 planner might choose a slightly worse plan. Or perhaps index-only scans are hurting performance with the DBT-2 queries. - Heikki
Re: [PERFORM] Replaying 48 WAL files takes 80 minutes
On 30.10.2012 10:50, Albe Laurenz wrote: Why does WAL replay read much more than it writes? I thought that pretty much every block read during WAL replay would also get dirtied and hence written out. Not necessarily. If a block is modified and written out of the buffer cache before the next checkpoint, the latest version of the block is already on disk. On replay, the redo routine reads the block, sees that the change was applied, and does nothing. I wonder why the performance is good in the first few seconds. Why should exactly the pages that I need in the beginning happen to be in cache? This is probably because of full_page_writes=on. When replay has a full page image of a block, it doesn't need to read the old contents from disk. It can just blindly write the image to disk. Writing a block to disk also puts that block in the OS cache, so this also efficiently warms the cache from the WAL. Hence in the beginning of replay, you just write a lot of full page images to the OS cache, which is fast, and you only start reading from disk after you've filled up the OS cache. If this theory is true, you should see a pattern in the I/O stats, where in the first seconds there is no I/O, but the CPU is 100% busy while it reads from WAL and writes out the pages to the OS cache. After the OS cache fills up with the dirty pages (up to dirty_ratio, on Linux), you will start to see a lot of writes. As the replay progresses, you will see more and more reads, as you start to get cache misses. - Heikki
Re: [PERFORM] deadlock_timeout affect on performance
On 01.10.2012 19:49, pg noob wrote: Hi all, I have a question about the deadlock_timeout in regards to performance. Right now we have this timeout set at its default of 1s. My understanding of it is that this means that every 1 second the server will check for deadlocks. Not quite. It means that when a backend gets blocked, waiting on a lock, it will check for deadlocks after waiting for 1 second. When no backend is waiting for a lock, there are no deadlock checks regardless of deadlock_timeout. What I am wondering is how much of a performance improvement we would expect to get if this was raised to 30 seconds? Is it negligible or could it be a substantial performance improvement on a busy system? We very rarely have deadlocks and waiting 30 seconds to discover one doesn't seem too bad. It's almost certainly negligible. If you regularly have deadlocks, it might even be better for performance to make the timeout shorter than 1 s, so that deadlocks are detected earlier, and backends will spend less time deadlocked and more time doing real work. Although I doubt it will make any meaningful difference either way. - Heikki
Re: [PERFORM] exponential performance decrease in ISD transaction
On 31.08.2012 15:27, John Nash wrote: Program 1: dbtransfromfile: this program creates a simple table consisting of a single int column. After the creation, the program inserts 1000 tuples in the table, which are never deleted; after that the program reads a transaction pattern from a given file and executes it a number of times determined when the program is launched. The transaction we are launching (INSERT/SELECT/DELETE) is the following:

insert into T_TEST values (1);
select * from T_TEST where c1=1000;
delete from T_TEST where c1=1;
commit;

Sounds like the table keeps growing when rows are inserted and subsequently deleted. PostgreSQL doesn't immediately remove deleted tuples from the underlying file, but simply marks them as deleted. The rows are not physically removed until autovacuum kicks in and cleans them up, or the table is vacuumed manually. I'd suggest creating an index on t_test(c1), if there isn't one already. It's not helpful when the table is small, but when the table is bloated with all the dead tuples from the deletions, it should help to keep the access fast despite the bloat. - Heikki
Re: [PERFORM] Postgress is taking lot of CPU on our embedded hardware.
On 27.01.2012 15:34, Jayashankar K B wrote: Hi, We are having an embedded system with a freescale m68k architecture based micro-controller, 256MB RAM running a customized version of Slackware 12 linux. It's relatively modest hardware. Fascinating! We have installed postgres 9.1 as our database engine. While testing, we found that the Postgres operations take more than 70% of CPU and the average also stays above 40%. This is suffocating the various other processes running on the system. A couple of them are very critical ones. The testing involves inserting a bulk number of records (approx. 1 records having between 10 and 20 columns). Please let us know how we can reduce CPU usage for postgres. The first step would be to figure out where all the time is spent. Are there unnecessary indexes you could remove? Are you using INSERT statements or COPY? Sending the data in binary format instead of text might shave some cycles. If you can run something like oprofile on the system, that would be helpful to pinpoint the expensive part. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] regarding CLUSTER and HUGE work_mem / maintenance_work_mem
On 27.01.2012 19:43, Jon Nelson wrote: Let's say I have a 7GB table with 3-4 indices for a total of 10-12GB. Furthermore, let's say I have a machine with sufficient memory for me to set the work_mem and maintenance_work_mem to 20GB (just for this session). When I issue a CLUSTER using one of the indices, I see PostgreSQL (by way of strace) performing an index scan which amounts to large quantities of random I/O. In my case, that means it takes a very, very long time. PostgreSQL is largely at defaults, except for a 2GB shared_buffers and a few unrelated changes. The system itself has 32GB of physical RAM and has plenty free. Why didn't PostgreSQL just read the table into memory (and the interesting index) as a sequential scan, sort, and then write it out? It seems like there would be more than enough memory for that. The sequential I/O rate on this machine is 50-100x the random I/O rate. I'm using 8.4.10 (with the 'inet' de-toasting patch) on Scientific Linux 6.1. The support for doing a seqscan+sort in CLUSTER was introduced in version 9.1. Before that, CLUSTER always did an indexscan. See release notes: http://www.postgresql.org/docs/9.1/static/release-9-1.html#AEN107416 -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Postgress is taking lot of CPU on our embedded hardware.
On 27.01.2012 20:30, Jayashankar K B wrote: Hi Heikki Linnakangas: We are using a series of Insert statements to insert the records into the database. Sending data in binary is not an option as the module that writes into the DB has been finalized. We do not have control over that. That certainly limits your options. Please let me know how we can proceed. On the net I couldn't get hold of any good example where Postgres has been used on a limited hardware system. I don't think there's anything particular in postgres that would make it a poor choice on a small system, as far as CPU usage is concerned anyway. But inserting rows in a database is certainly slower than, say, writing them into a flat file. At what rate are you doing the INSERTs? And how fast would they need to be? Remember that it's normal that while the INSERTs are running, postgres will use all the CPU it can to process them as fast as possible. So the question is, at what rate do they need to be processed to meet your target. Lowering the process priority with 'nice' might help too, to give the other important processes priority over postgres. The easiest way to track down where the time is spent would be to run a profiler, if that's possible on your platform. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Cursor fetch performance issue
On 24.01.2012 23:34, Tony Capobianco wrote: Here's the explain: pg=# explain select getMemberAdminPrevious_sp(247815829, 1,'test.em...@hotmail.com', 'Email', 'Test'); QUERY PLAN -- Result (cost=0.00..0.26 rows=1 width=0) (1 row) Time: 1.167 ms That's not very helpful. We'd need to see the plan of the query within the function, not the plan on invoking the function. The auto_explain contrib module with auto_explain_log_nested_statements=on might be useful to get that. There was discussion of 'LIKE' v. '=' and wildcard characters are not being entered into the $1 parameter. This is not generating a sql string. I feel it's something to do with the fetch of the refcursor. The cursor is a larger part of a function: CREATE OR REPLACE FUNCTION PUBLIC.GETMEMBERADMINPREVIOUS_SP2 ( p_memberid IN numeric, p_websiteid IN numeric, p_emailaddress IN varchar, p_firstname IN varchar, p_lastname IN varchar) RETURNS refcursor AS $$ DECLARE ref refcursor; l_sysdateid numeric; BEGIN l_sysdateid := sysdateid(); if (p_memberid != 0) then if (p_emailaddress IS NOT NULL) then OPEN ref FOR SELECT m.memberid, m.websiteid, m.emailaddress, m.firstname, m.lastname, m.regcomplete, m.emailok FROM members m WHERE m.emailaddress LIKE p_emailaddress AND m.changedate_id < l_sysdateid ORDER BY m.emailaddress, m.websiteid; end if; end if; Return ref; EXCEPTION WHEN NO_DATA_FOUND THEN Return null; END; $$ LANGUAGE 'plpgsql'; The theory that the query takes a long time because LIKE p_emailaddress is not optimizable by the planner seems the most likely to me. If you don't actually use any wildcards in the email, try replacing LIKE with =. If you do, then you can try the OPEN ref FOR EXECUTE syntax. That way the query is re-planned every time, and the planner can take advantage of the parameter value. That enables it to use an index on the email address column when there aren't in fact any wildcards in the value, and also estimate the selectivities better, which can lead to a better plan.
Like this: CREATE OR REPLACE FUNCTION public.getmemberadminprevious_sp2(p_memberid numeric, p_websiteid numeric, p_emailaddress character varying, p_firstname character varying, p_lastname character varying) RETURNS refcursor LANGUAGE plpgsql AS $function$ DECLARE ref refcursor; l_sysdateid numeric; BEGIN l_sysdateid := sysdateid(); if (p_memberid != 0) then if (p_emailaddress IS NOT NULL) then OPEN ref FOR EXECUTE $query$ SELECT m.memberid, m.websiteid, m.emailaddress, m.firstname, m.lastname, m.regcomplete, m.emailok FROM members m WHERE m.emailaddress LIKE $1 AND m.changedate_id < $2 ORDER BY m.emailaddress, m.websiteid; $query$ USING p_emailaddress, l_sysdateid; end if; end if; Return ref; EXCEPTION WHEN NO_DATA_FOUND THEN Return null; END; $function$ -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] autovacuum, exclude table
On 12.12.2011 16:25, Anibal David Acosta wrote: I have a couple of tables with about 400 million records each, increasing by about 5 million per day. I think that disabling autovac over those tables, and enabling daily manual vacuum (in some idle hour) will be better. Am I right? Possibly. If the system is otherwise idle, it sounds sensible to do routine maintenance at that time. Is it possible to exclude autovacuum over some tables? Sure, see http://www.postgresql.org/docs/9.1/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS ALTER TABLE foo SET (autovacuum_enabled=false, toast.autovacuum_enabled = false); It might be better, though, to leave autovacuum enabled, and just do the additional manual VACUUM in the idle period. If the daily manual VACUUM is enough to keep the bloat within the autovacuum thresholds, autovacuum will never kick in. If it's not enough, then you probably want autovacuum to run. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] SSL encryption makes bytea transfer slow
On 04.11.2011 10:43, Albe Laurenz wrote: Marti Raudsepp wrote: Disabling OpenSSL compression in the source (which is possible since OpenSSL 1.0.0) does not give me any performance improvement. If it doesn't give you any performance improvement then you haven't disabled compression. Modern CPUs can easily saturate 1 GbitE with AES256-encrypted connections. Compression is usually the bottleneck, at 20-30 MB/s. Hmm, my knowledge of OpenSSL is so little that it is well possible that I did it wrong. I have attached the small patch I used; can you see where I went wrong? That only works with OpenSSL 1.0.0 - did you upgrade? I thought you were using 0.9.7a earlier. FWIW, it would be better to test #ifdef SSL_OP_NO_COMPRESSION directly, rather than the version number. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] SSL encryption makes bytea transfer slow
On 28.10.2011 14:02, Albe Laurenz wrote: We selected a 30MB bytea with psql connected with -h localhost and found that it makes a huge difference whether we have SSL encryption on or off. Without SSL the SELECT finished in about a second, with SSL it took over 23 seconds (measured with \timing in psql). During that time, the CPU is 100% busy. All data are cached in memory. Is this difference as expected? I tried to reproduce that, but only saw about 4x difference in the timing, not 23x. $ PGSSLMODE=disable ~/pgsql.master/bin/psql -h localhost postgres psql (9.2devel) Type help for help. postgres=# \o foo postgres=# \timing Timing is on. postgres=# SELECT repeat(xx,65536)::bytea FROM (SELECT string_agg(lpad(to_hex(x),2, '0' ),'') AS xx FROM generate_series(0,255) x) AS xx; Time: 460,782 ms $ PGSSLMODE=require ~/pgsql.master/bin/psql -h localhost postgres psql (9.2devel) SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256) Type help for help. postgres=# \o foo postgres=# \timing Timing is on. postgres=# SELECT repeat(xx,65536)::bytea FROM (SELECT string_agg(lpad(to_hex(x),2, '0' ),'') AS xx FROM generate_series(0,255) x) AS xx; Time: 1874,276 ms oprofile suggests that all that overhead is coming from compression. Apparently SSL does compression automatically. 
Oprofile report of the above test case with SSL enabled:

samples  %        image name             symbol name
28177    74.4753  libz.so.1.2.3.4        /usr/lib/libz.so.1.2.3.4
1814      4.7946  postgres               byteain
1459      3.8563  libc-2.13.so           __memcpy_ssse3_back
1437      3.7982  libcrypto.so.0.9.8     /usr/lib/libcrypto.so.0.9.8
896       2.3682  postgres               hex_encode
304       0.8035  vmlinux-3.0.0-1-amd64  clear_page_c
271       0.7163  libc-2.13.so           __strlen_sse42
222       0.5868  libssl.so.0.9.8        /usr/lib/libssl.so.0.9.8

And without:

samples  %        image name             symbol name
1601     27.4144  postgres               byteain
865      14.8116  postgres               hex_encode
835      14.2979  libc-2.13.so           __memcpy_ssse3_back
290       4.9658  vmlinux-3.0.0-1-amd64  clear_page_c
280       4.7945  libc-2.13.so           __strlen_sse42
184       3.1507  vmlinux-3.0.0-1-amd64  page_fault
174       2.9795  vmlinux-3.0.0-1-amd64  put_mems_allowed

Maybe your data is very expensive to compress for some reason? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] : PG9.0 - Checkpoint tuning and pg_stat_bgwriter
On 04.10.2011 13:50, Venkat Balaji wrote: I have got a situation where in i see the production system is loaded with the checkpoints and at-least 1000+ buffers are being written for every checkpoint. 1000 buffers isn't very much, that's only 8 MB, so that's not alarming itself. I am thinking of increasing the checkpoint_segments. Below are our current settings - checkpoint_segments = 8 checkpoint_timeout = 5 mins checkpoint_completion_target = 0.5 bgwriter_delay = 200ms bgwriter_lru_maxpages = 100 bgwriter_lru_multiplier = 2 Looking forward for suggestions. Yep, increase checkpoint_segments. And you probably want to raise checkpoint_timeout too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
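As a rough sketch of what that increase might look like in postgresql.conf (the values below are illustrative assumptions, not figures from this thread; the right numbers depend on write volume, acceptable recovery time, and available WAL disk space):

```
checkpoint_segments = 32            # up from 8; fewer WAL-triggered checkpoints
checkpoint_timeout = 15min          # up from 5min; checkpoints happen less often
checkpoint_completion_target = 0.7  # spread each checkpoint's writes over more of the interval
```

After such a change, comparing checkpoints_timed against checkpoints_req in pg_stat_bgwriter is a reasonable way to verify that checkpoints are now mostly triggered by the timeout rather than by running out of WAL segments.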
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
On 14.09.2011 09:39, Stefan Keller wrote: Should I open a ticket? What ticket? With whom? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Query performance issue
On 31.08.2011 12:00, Jayadevan M wrote: Hello all, I have a query which takes about 20 minutes to execute and retrieves 2000-odd records. The explain for the query is pasted here http://explain.depesz.com/s/52f The same query, with similar data structures/indexes and data comes back in 50 seconds in Oracle. We just ported the product to PostgreSQL and are testing it. Any input on what to look for? Possible relevant parameters are shared_buffers = 4GB temp_buffers = 8MB work_mem = 96MB maintenance_work_mem = 1GB effective_cache_size = 8GB default_statistics_target = 50 It is a machine with 16 GB RAM. Please run EXPLAIN ANALYZE on the query and post that, it's hard to say what's wrong from just the query plan, without knowing where the time is actually spent. And the schema of the tables involved, and any indexes on them. (see also http://wiki.postgresql.org/wiki/SlowQueryQuestions) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Infinite Cache
On 05.07.2011 16:35, Shaun Thomas wrote: I'd say it's probably safe enough these days. But it's also one of those exclusive selling points they're using right now to garner EDB customers. So I doubt it'll be released any time *soon*, though it may make it eventually. I doubt the community would want it even if it was open sourced. As an open source project, what would probably make more sense is a similar caching mechanism built into the kernel, somewhere between the filesystem cache and user-space. That way you could use it with any application that benefits from the kind of large cache that Infinite Cache provides. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Order of tables
On 28.04.2011 12:20, Rishabh Kumar Jain wrote: How must the tables be ordered in the list of tables in the FROM clause? There is no difference in performance, if that's what you mean. (If not, then the pgsql-novice or pgsql-sql mailing list would've been more appropriate.) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] C on Client versus C on Server
On 02.04.2011 21:52, Eliot Gable wrote: 1) Each select statement from the User Client C Module would be a separate transaction which would drastically increase transaction overhead for the whole set of requests. You could wrap the statements in BEGIN-COMMIT in the client code. 2) Writing the billing data at the end would mean that I not only have to pull all the data down to the User Client, I must also push the data back up to the server for writing the billing records. Yeah, that sounds right. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance Test for PostgreSQL9
On 28.02.2011 11:10, Selva manickaraja wrote: OK, somehow I got these modules installed. Finally I successfully built and installed PostgreSQL! I must thank you guys so much for helping. Now coming to the real issue of the matter. According to the documentation, gmake installcheck can be run in various directories. However, it seems to be local only. Can these tests be run locally but stress test a database on a remote machine? This way I don't need to go on building PostgreSQL from source on every new db server. I will wait for your reply. Try PGHOST=servername make installcheck -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] inheritance: planning time vs children number vs column number
On 28.02.2011 11:38, Marc Cousin wrote: I've been facing a very large (more than 15 seconds) planning time in a partitioned configuration. The amount of partitions wasn't completely crazy, around 500, not in the thousands. The problem was that there were nearly 1000 columns in the parent table (very special use case, there is a reason for this application having this many columns). The check constraint was extremely simple (for each child, 1 column = 1 constant, always the same column). As I was surprised by this very large planning time, I have been trying to study the variation of planning time against several parameters: - number of columns - number of children tables - constraint exclusion's value (partition or off) What (I think) I measured is that the planning time seems to be O(n^2) for the number of columns, and O(n^2) for the number of children tables. Constraint exclusion had a limited impact on planning time (it added between 20% and 100% planning time when there were many columns). Testing here with a table with 1000 columns and 100 partitions, about 80% of the planning time is looking up the statistics on attribute width, to calculate average tuple width. I don't see O(n^2) behavior, though; it seems linear. I'd like to know if this is a known behavior? And if I could mitigate it somehow? I'm out of ideas on how to make it faster, I'm afraid. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] application of KNN code to US zipcode searches?
On 17.02.2011 17:20, Mark Stosberg wrote: I thought the benefit of KNN was that you could retrieve the rows in distance order, so that a query for the closest 20 locations (for example) would be very fast. I wouldn't have expected it to be helpful when you're selecting all the rows regardless of distance. Kevin, Thanks for the feedback. You are right that my reduced test case wasn't a good approximation. I added a limit, to simulate finding the 100 zipcodes closest to 90210. Below I compare 4 approaches to the same query: 1. Cube search 2. Earth Distance Search 3. Simple point distance (no index) 4. Simple point distance (KNN) Now KNN benchmarks to be almost 100x faster! That's very promising. Then there's only the issue that simple point distance is not expected to be a good enough approximation of earth-distances. Perhaps that can be solved by pre-computing coordinates based on the lat/long pairs, much like the map projections used to present a curved surface on a flat map? Given that it's OK to be a few miles off, it seems we have some leeway here. Recommendations? The existing opclasses only support distance-to-a-point, but I believe the KNN GiST code is flexible enough that it could be used for distance to the edge of a shape as well. Someone just needs to write the operators and support functions. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
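For reference, a KNN-style query of the kind benchmarked here orders by the distance operator so the GiST index can return rows in distance order; the table and column names below are made up for illustration, and the coordinates are only roughly those of zipcode 90210:

```sql
-- Assumes a GiST index on the point column, e.g.:
--   CREATE INDEX zips_location_gist ON zips USING gist (location);
SELECT zipcode
FROM zips
ORDER BY location <-> point '(-118.41, 34.10)'  -- KNN: nearest first
LIMIT 100;
```

With the KNN GiST support, the planner can satisfy the ORDER BY ... LIMIT directly from the index traversal instead of computing the distance for every row and sorting.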
Re: [PERFORM] Field wise checking the performance.
On 14.02.2011 14:06, dba wrote: I have two identical tables, but the widths of the fields are different. I need to know whether changing from varchar(100) to varchar(30) will increase performance, or if it's just about memory access. It will make no difference. The max length is just a constraint on what values can be stored; it doesn't affect how the strings are stored. In both cases, the strings are stored in a variable-length format. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
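One quick way to see that the declared maximum doesn't change the stored format is to compare on-disk sizes directly (an illustrative check, not from the thread):

```sql
SELECT pg_column_size('abcde'::varchar(30))  AS v30,
       pg_column_size('abcde'::varchar(100)) AS v100;
-- Both columns report the same size: the length limit is a constraint
-- on allowed values, not part of the storage format.
```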
Re: [PERFORM] Simple (hopefully) throughput question?
On 03.11.2010 17:52, Nick Matheson wrote: We have an application that needs to do bulk reads of ENTIRE Postgres tables very quickly (i.e. select * from table). We have observed that such sequential scans run two orders of magnitude slower than observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is due to the storage overhead we have observed in Postgres. In the example below, it takes 1 GB to store 350 MB of nominal data. However that suggests we would expect to get 35 MB/s bulk read rates. Observations using iostat and top during these bulk reads suggest that the queries are CPU bound, not I/O bound. In fact, repeating the queries yields similar response times. Presumably if it were an I/O issue the response times would be much shorter the second time through with the benefit of caching. We have tried these simple queries using psql, JDBC, pl/java stored procedures, and libpq. In all cases the client code ran on the same box as the server. We have experimented with Postgres 8.1, 8.3 and 9.0. Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Useless sort by
On 13/09/10 19:48, Tom Lane wrote: Gaetano Mendolamend...@gmail.com writes: Of course I'm not suggesting to take away the sort by and give the user an unsorted result, I'm asking why the the optimizer in cases like: select unique(a) from v_table_with_order_by; doesn't takes away the order by inside the view and puts it back rewriting the query like this: select unique(a) from v_table_without_order_by order by a; That changes the order in which the rows are fed to unique(a). The principal real-world use for a non-top-level ORDER BY is exactly to determine the order in which rows are fed to a function, so we will have a revolt on our hands if we break that. You could check for volatile functions. I think this could be done safely. However, it doesn't seem worthwhile, it would be a fair amount of code, and it's not usually a good idea to put an ORDER BY in a view or subquery anyway unless you also have volatile functions in there, or you want to coerce the optimizer to choose a certain plan. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] now() gives same time within the session
On 12/07/10 14:15, A. Kretschmer wrote: Use timeofday() instead, now() returns the transaction starting time. timeofday() is a legacy function kept only for backwards-compatibility. It returns a string, which is quite awkward. Use clock_timestamp() instead. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
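The difference between the two is easy to see inside a single transaction (illustrative):

```sql
BEGIN;
SELECT now();              -- transaction start time; identical for every
                           -- call within this transaction
SELECT pg_sleep(1);
SELECT now();              -- same value as the first call
SELECT clock_timestamp();  -- actual wall-clock time; advances between calls
COMMIT;
```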
Re: [PERFORM] Analysis Function
On 11/06/10 23:38, David Jarvis wrote: I added an explicit cast in the SQL: dateserial(extract(YEAR FROM m.taken)::int,'||p_month1||','||p_day1||') d1, dateserial(extract(YEAR FROM m.taken)::int,'||p_month2||','||p_day2||') d2 The function now takes three integer parameters; there was no performance loss. We had a little chat about this with Magnus. It's pretty surprising that there's no built-in function to do this, we should consider adding one. We could have a function like: construct_timestamp(year int4, month int4, date int4, hour int4, minute int4, second int4, milliseconds int4, timezone text) Now that we have named parameter notation, callers can use it to conveniently fill in only the fields needed: SELECT construct_timestamp(year := 1999, month := 10, date := 22); -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
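Lacking such a built-in constructor, one workaround is to format the parts into a string and parse it back; to_timestamp(text, text) is a standard function, and the rest of this sketch is illustrative:

```sql
-- Build 1999-10-22 from integer parts
SELECT to_timestamp(1999::text || '-' || 10::text || '-' || 22::text,
                    'YYYY-MM-DD');
```

This round-trips through text, which is exactly the overhead a native construct_timestamp(...) function would avoid.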
Re: [PERFORM] Analysis Function
On 11/06/10 11:25, David Jarvis wrote: Datum dateserial (PG_FUNCTION_ARGS) { int32 p_year = PG_GETARG_INT32(0); int32 p_month = PG_GETARG_INT32(1); int32 p_day = PG_GETARG_INT32(2); DateADT d = date2j (p_year, p_month, p_day) - POSTGRES_EPOCH_JDATE; PG_RETURN_DATEADT(d); } Compiles without errors or warnings. The function is integrated as follows: CREATE OR REPLACE FUNCTION dateserial(integer, integer, integer) RETURNS text AS 'ymd.so', 'dateserial' LANGUAGE 'c' IMMUTABLE STRICT COST 1; However, when I try to use it, the database segfaults: select dateserial( 2007, 1, 3 ) Any ideas why? The C function returns a DateADT, which is a typedef for int32, but the CREATE FUNCTION statement claims that it returns 'text'. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
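Given that mismatch, the declaration presumably needs to name the type the C function actually returns, along these lines (a sketch based on the function shown above):

```sql
CREATE OR REPLACE FUNCTION dateserial(integer, integer, integer)
RETURNS date  -- matches the DateADT the C code returns, instead of text
AS 'ymd.so', 'dateserial'
LANGUAGE c IMMUTABLE STRICT;
```

Declaring the wrong return type makes PostgreSQL interpret the returned Datum as a pointer to a text varlena, which is why dereferencing it crashes the backend.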
Re: [PERFORM] Autovaccum settings while Bulk Loading data
On 10/06/10 11:47, Ambarish Bhattacharya wrote: It would be helpful if you can suggest me the appropriate Autovacuum settings for handling this large data as my autovacuum setting is hanging the entire process. What do you mean by hanging the entire process? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Autovaccum settings while Bulk Loading data
Please keep the mailing list CC'd, so that others can help. On 10/06/10 15:30, Ambarish Bhattacharya wrote: On 10/06/10 11:47, Ambarish Bhattacharya wrote: It would be helpful if you can suggest me the appropriate Autovacuum settings for handling this large data as my autovacuum setting is hanging the entire process. What do you mean by hanging the entire process? Hanging the entire process means... the autovacuum and auto analyze start, and after that there is no activity I can see in the postgres log related to the bulk loading, and when I checked the postgres processes from the task manager I could see a few postgres processes still running that had to be killed from there... normal shutdown is not happening in this case... You'll have to provide a lot more details if you want people to help you. How do you bulk load the data? What kind of log messages do you normally get in the PostgreSQL log related to bulk loading? Autovacuum or autoanalyze should not interfere with loading data, even if it runs simultaneously. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Need to increase performance of a query
On 10/06/10 23:08, Anne Rosset wrote: Heikki Linnakangas wrote: On 10/06/10 22:47, Craig James wrote: Postgres normally doesn't index NULL values even if the column is indexed, so it has to do a table scan when your query includes an IS NULL condition. That was addressed in version 8.3. 8.3 and upwards can use an index for IS NULL. I believe the NULLs were stored in the index in earlier releases too, they just couldn't be searched for. I am using postgres 8.3.6. So why doesn't it use my index? Well, apparently the planner doesn't think it would be any cheaper. I wonder if this helps: CREATE INDEX item_rank_project_id ON item_rank(project_id, rank, pf_id); And make sure you drop any of the indexes that are not being used, to make sure the planner doesn't choose them instead. (You should upgrade to 8.3.11, BTW. There's been a bunch of bug-fixes in-between, though I don't know if any are related to this, but there's other important fixes there) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] FSM - per database or per installation?
Craig James wrote: Are the FSM parameters for each database, or the entire Postgres system? In other words, if I have 100 databases, do I need to increase max_fsm_pages and max_fsm_relations by a factor of 100, or keep them the same as if I just have one database? I suspect they're per-database, i.e. as I add databases, I don't have to increase the FSM parameters, but the documentation isn't 100% clear on this point. It's per cluster, ie *not* per-database. The parameter is gone in 8.4, BTW. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] SSD + RAID
Craig James wrote: I've wondered whether this would work for a read-mostly application: Buy a big RAM machine, like 64GB, with a crappy little single disk. Build the database, then make a really big RAM disk, big enough to hold the DB and the WAL. Then build a duplicate DB on another machine with a decent disk (maybe a 4-disk RAID10), and turn on WAL logging. The system would be blazingly fast, and you'd just have to be sure before you shut it off to shut down Postgres and copy the RAM files back to the regular disk. And if you didn't, you could always recover from the backup. Since it's a read-mostly system, the WAL logging bandwidth wouldn't be too high, so even a modest machine would be able to keep up. Should work, but I don't see any advantage over attaching the RAID array directly to the 1st machine with the RAM and turning synchronous_commit=off. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] SSD + RAID
Merlin Moncure wrote: 2009/11/13 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Laszlo Nagy wrote: * I need at least 32GB disk space. So DRAM based SSD is not a real option. I would have to buy 8x4GB memory, costs a fortune. And then it would still not have redundancy. At 32GB database size, I'd seriously consider just buying a server with a regular hard drive or a small RAID array for redundancy, and stuffing 16 or 32 GB of RAM into it to ensure everything is cached. That's tried and tested technology. lots of ram doesn't help you if: *) your database gets written to a lot and you have high performance requirements When all the (hot) data is cached, all writes are sequential writes to the WAL, with the occasional flushing of the data pages at checkpoint. The sequential write bandwidth of SSDs and HDDs is roughly the same. I presume the fsync latency is a lot higher with HDDs, so if you're running a lot of small write transactions, and don't want to risk losing any recently committed transactions by setting synchronous_commit=off, the usual solution is to get a RAID controller with a battery-backed up cache. With a BBU cache, the fsync latency should be in the same ballpark as with SSDs. *) your data is important Huh? The data is safely on the hard disk in case of a crash. The RAM is just for caching. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] SSD + RAID
Merlin Moncure wrote: On Sat, Nov 14, 2009 at 6:17 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: lots of ram doesn't help you if: *) your database gets written to a lot and you have high performance requirements When all the (hot) data is cached, all writes are sequential writes to the WAL, with the occasional flushing of the data pages at checkpoint. The sequential write bandwidth of SSDs and HDDs is roughly the same. I presume the fsync latency is a lot higher with HDDs, so if you're running a lot of small write transactions, and don't want to risk losing any recently committed transactions by setting synchronous_commit=off, the usual solution is to get a RAID controller with a battery-backed up cache. With a BBU cache, the fsync latency should be in the same ballpark as with SSDs. BBU raid controllers might only give better burst performance. If you are writing data randomly all over the volume, the cache will overflow and performance will degrade. We're discussing a scenario where all the data fits in RAM. That's what the large amount of RAM is for. The only thing that's being written to disk is the WAL, which is sequential, and the occasional flush of data pages from the buffer cache at checkpoints, which doesn't happen often and will be spread over a period of time. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] SSD + RAID
Laszlo Nagy wrote: * I need at least 32GB disk space. So DRAM based SSD is not a real option. I would have to buy 8x4GB memory, costs a fortune. And then it would still not have redundancy. At 32GB database size, I'd seriously consider just buying a server with a regular hard drive or a small RAID array for redundancy, and stuffing 16 or 32 GB of RAM into it to ensure everything is cached. That's tried and tested technology. I don't know how you came to the 32 GB figure, but keep in mind that administration is a lot easier if you have plenty of extra disk space for things like backups, dumps+restore, temporary files, upgrades etc. So if you think you'd need 32 GB of disk space, I'm guessing that 16 GB of RAM would be enough to hold all the hot data in cache. And if you choose a server with enough DIMM slots, you can expand easily if needed. Just my 2 cents, I'm not really an expert on hardware.. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] maintain_cluster_order_v5.patch
ph...@apra.asso.fr wrote: Hi Jeff, If you can help (either benchmark work or C coding), try reviving the features by testing them and merging them with the current tree. OK, that's the rule of the game in such a community. I am not a good C writer, but I will see what I could do. The FSM rewrite in 8.4 opened up more options for implementing this. The patch used to check the index for the block the nearest key is stored in, read that page in, and insert there if there's enough free space on it. with the new FSM, you can check how much space there is on that particular page before fetching it. And if it's full, the new FSM data structure can be searched for a page with enough free space as close as possible to the old page, although there's no interface to do that yet. A completely different line of attack would be to write a daemon that concurrently moves tuples in order to keep the table clustered. It would interfere with UPDATEs and DELETEs, and ctids of the tuples would change, but for many use cases it would be just fine. We discussed a utility like that as a replacement for VACUUM FULL on hackers a while ago, see thread Feedback on getting rid of VACUUM FULL. A similar approach would work here, the logic for deciding which tuples to move and where would just be different. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] vacuumdb command
soorjith p wrote:
> I used the vacuumdb command, but I can't see VACUUM in its output. The last
> part of the output is:
>
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 0 unused item pointers.
> 1 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> INFO: free space map contains 768 pages in 392 relations
> DETAIL: A total of 6720 page slots are in use (including overhead).
> 6720 page slots are required to track all free space.
> Current limits are: 153600 page slots, 1000 relations, using 965 kB.
>
> I think that if the process is complete, the last part of the output is
> VACUUM. Does this mean the process is not complete?

No. It is complete.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] updating a row in a table with only one row
Robert Haas wrote:
> On Fri, Oct 2, 2009 at 9:54 AM, Merlin Moncure <mmonc...@gmail.com> wrote:
>> On Fri, Oct 2, 2009 at 4:18 AM, Michal Vitecek <f...@mageo.cz> wrote:
>>> Hello everyone,
>>>
>>> I'm using PostgreSQL 8.3.8 running on a server with 2 Xeon CPUs, 4GB RAM,
>>> 4+2 disks in RAID 5 and CentOS 5.3. There's only one database, which
>>> dumped with pg_dump takes ~0.5GB. There are ~100 tables in the database,
>>> and one of them (tableOne) always contains only a single row. There's one
>>> index on it. However, performing an update on the single row (which
>>> occurs every 60 secs) takes a considerably long time -- around 200ms. The
>>> system is not loaded in any way. The table definition is:
>>>
>>> CREATE TABLE tableOne (
>>>   value1 BIGINT NOT NULL,
>>>   value2 INTEGER NOT NULL,
>>>   value3 INTEGER NOT NULL,
>>>   value4 INTEGER NOT NULL,
>>>   value5 INTEGER NOT NULL
>>> );
>>> CREATE INDEX tableOne_index1 ON tableOne (value5);
>>>
>>> And the SQL query to update the _only_ row in the above table is
>>> ('value5' can't be used to identify the row as I don't know it at the time):
>>>
>>> UPDATE tableOne SET value1 = newValue1, value2 = newValue2, value5 = newValue5;
>>>
>>> And this is what EXPLAIN says on the above SQL query:
>>>
>>> DB=# EXPLAIN UPDATE tableOne SET value1 = newValue1, value2 = newValue2, value5 = newValue5;
>>> LOG: duration: 235.948 ms statement: EXPLAIN UPDATE tableOne SET value1 = newValue1, value2 = newValue2, value5 = newValue5;
>>>                QUERY PLAN
>>> Seq Scan on jackpot (cost=0.00..1.01 rows=1 width=14)
>>> (1 row)
>>>
>>> What takes PostgreSQL so long? I guess I could add a fake 'id' column and
>>> create an index on it to identify the single row, but still -- the time
>>> seems quite ridiculous to me.
>>
>> It is ridiculous. Your problem is almost certainly dead rows. I can't
>> recall (and I can't find the info anywhere) whether the HOT feature
>> requires an index to be active -- I think it does. If so, creating a
>> dummy field and indexing it should resolve the problem. Can you confirm
>> the dead row issue by running VACUUM VERBOSE and creating the index?
>> Please respond with your results, I'm curious.
> Also, is autovacuum on? Have you measured iowait?
>
> Since he's updating all the fields in the table, an index will certainly
> ensure that HOT does not apply, no?

An extra index shouldn't hurt if you don't update the indexed dummy column. But the existing tableOne_index1 will cause HOT to not apply if value5 is updated. I'd suggest dropping it (and not creating any other indexes either); it won't do any good on a table with only one row anyway.

If the table is indeed bloated, VACUUM FULL should shrink it back. I wonder how it got to be that way, though. Autovacuum should keep a table like that in check.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
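Concretely, the suggestion above amounts to something like this (a sketch, using the table and index names from the original post):

```sql
-- Drop the only index: with no index on value5, updates can use HOT
-- (heap-only tuples), avoiding index maintenance and heap bloat.
DROP INDEX tableOne_index1;

-- Shrink the already-bloated single-row table back down:
VACUUM FULL tableOne;

-- Sanity check: a one-row table should now occupy a single 8kB page.
SELECT pg_relation_size('tableOne');
```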
Re: [PERFORM] Persistent Plan Cache
Tom Lane wrote:
> Joshua Rubin <jru...@esoft.com> writes:
>> We have a very large, partitioned table that we often need to query from
>> new connections, but frequently with similar queries. We have constraint
>> exclusion on to take advantage of the partitioning. This also makes query
>> planning more expensive. As a result, the CPU is fully loaded, all the
>> time, preparing queries, many of which have been prepared, identically,
>> by other connections.
>
> If you're depending on constraint exclusion, it's hard to see how plan
> caching could help you at all. The generated plan needs to vary depending
> on the actual WHERE-clause parameters.

That's what the OP really should've complained about. If we addressed that, so that a generic plan was created that determines which child tables can be excluded at run time, there would be no need for the persistent plan cache.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
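To illustrate Tom's point (a sketch; the table and column names are hypothetical): constraint exclusion only prunes partitions when the planner can see a literal value, so a parameterized prepared statement defeats it on the versions discussed here:

```sql
-- Plan-time exclusion works: only the matching child table appears
-- in the plan, because the literal can be compared against each
-- partition's CHECK constraint.
EXPLAIN SELECT count(*) FROM events WHERE logdate = DATE '2009-10-15';

-- With a prepared statement, the value is unknown at plan time, so
-- the generic plan has to cover every partition.
PREPARE q(date) AS SELECT count(*) FROM events WHERE logdate = $1;
EXPLAIN EXECUTE q(DATE '2009-10-15');
```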
Re: [PERFORM] Persistent Plan Cache
Joshua Rubin wrote:
> We hardcode the parts of the WHERE clause so that the prepared plan will
> not vary among the possible partitions of the table. The only values that
> are bound would not affect the planner's choice of table.

Then you would benefit from using prepared statements in the client, and/or connection pooling to avoid having to re-prepare because of reconnecting.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
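For instance (a sketch; all names hypothetical), with the partition-selecting clause hardcoded and only the non-partitioning value bound, the plan is built once per connection and reused:

```sql
-- Prepared once per connection. The hardcoded date pins the partition
-- at prepare time (constraint exclusion still applies); the bound
-- parameter doesn't affect which child table is chosen.
PREPARE hot_events(int) AS
  SELECT * FROM events
  WHERE logdate = DATE '2009-10-15'   -- hardcoded: selects the partition
    AND severity >= $1;               -- bound per call

EXECUTE hot_events(3);
```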
Re: [PERFORM] partitioning max() sql not using index
Kevin Kempter wrote:
> Hi all,
> I have a large table (2 billion rows) that's partitioned by date, based on
> an epoch int value. We're running a select max(id), where id is the PK. I
> have a PK index on each of the partitions, and no indexes at all on the
> base table. If I hit a partition table directly I get an index scan as
> expected:

The planner isn't smart enough to create the plan you're expecting. There was discussion and even a patch posted recently about that:

http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php

It seems the thread petered out, but the concept seems sane.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
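Until the planner learns to do this itself, a common workaround (a sketch; the partition names are hypothetical) is to take the max from each partition explicitly, so each branch can use that partition's PK index, and combine the per-partition results:

```sql
-- Each inner max(id) uses the partition's PK index (the min/max
-- optimization turns it into a backward index scan with LIMIT 1);
-- the outer max just combines one row per partition.
SELECT max(id) FROM (
  SELECT max(id) AS id FROM measurements_2009_08
  UNION ALL
  SELECT max(id) AS id FROM measurements_2009_09
  UNION ALL
  SELECT max(id) AS id FROM measurements_2009_10
) AS per_partition;
```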