[PERFORM] transaction ID wrap limit
Hi all,

A vacuum full command logs the message:

    ... LOG: transaction ID wrap limit is 1073822617, limited by database A

Some time ago, the same vacuum full logged:

    ... LOG: transaction ID wrap limit is 2147484148, limited by database A

What causes that difference in the limit? Should I set or optimize something on my PostgreSQL server?

TIA,
Sabin
Re: [PERFORM] New to PostgreSQL, performance considerations
Sorry for the delay in responding; I had familial obligations.

As a matter of fact, I am spending a decent amount of time on this. I don't usually pore through documentation for compilers and OSes to the degree I have been since this thread started, nor do I usually try to get access to the HW I'm presently tracking down. I'll post my thoughts re: detailed analysis of gcc/g++ compiler options later today or tomorrow as work schedule allows.

Why this is worth it:
1= Any gains from setup and configuration are the cheapest ones available once we codify how to obtain them.
2= Any public database or knowledge about how best to set up, configure, and test pg is very good for the community.
3= Developers need to know and agree on proper procedure and technique for generating results for discussion, or we end up wasting a lot of time.
4= Measuring and documenting pg performance means we know where best to allocate resources for improving pg, or where using pg is (in)appropriate compared to competitors.

Potential performance gains are not the only value of this thread.

Ron Peacetree

At 12:33 PM 12/16/2006, Michael Stone wrote:
> On Sat, Dec 16, 2006 at 10:53:21AM -0500, Ron wrote:
>> The most important gain IMO is Knowledge, and I'd say there is still more to learn and/or verify IMHO. YMMV.
> Well, I think there are other areas where I can spend my time where potential gains are more likely. YMMV (although, I note, you don't seem to be spending much of your own time testing this).
Re: [PERFORM] transaction ID wrap limit
Sabin Coanda [EMAIL PROTECTED] writes:
> A vacuum full command logs the message:
>     ... LOG: transaction ID wrap limit is 1073822617, limited by database A
> Some time ago, the vacuum full logged:
>     ... LOG: transaction ID wrap limit is 2147484148, limited by database A
> What causes that difference in the limit?

The limit is *supposed* to advance. The fact that it jumped this much in one step suggests you're not vacuuming often enough :-(

			regards, tom lane
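[Editor's note: for readers who want to check how close each database is to forcing that limit, a minimal sketch follows. It assumes an 8.1/8.2-era catalog layout (pg_database.datfrozenxid); the output is only illustrative.]

-- Hedged sketch: show how far each database's frozen XID has aged.
-- A database-wide VACUUM (not necessarily FULL) advances datfrozenxid,
-- which in turn moves the "transaction ID wrap limit" reported in the log.
SELECT datname, age(datfrozenxid) AS frozen_xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;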
[PERFORM] Query plan changing when queried data does not
I have a database that has 3 tables with a relatively small number of records in each (see schema/counts below). These 3 tables are loaded when the db is created and there are never any updates or deletes on them. The database does have many other tables.

ds_tables 132 rows, ds_types 281 rows, ds_columns 2191 rows

When I run the query below on a new database where all the other tables are empty except for the 3 tables that contain static information, the query time is ~200ms. When I run the same query on a production database that has many records in the other tables (3 static tables the same), the query takes ~7 seconds. When I run it again on a test database with more data than the production database, but with a different distribution of data, ~1.2 seconds. I have seen this query take as much as 25 seconds because all seq scans were used.

Vacuum full analyze and reindex on ONLY the 3 static tables reduced the query time. All queries were run on the same computer with the same postgresql.conf, i.e. no configuration changes.

I am trying to find out what causes the query on the production database to be so much slower. The plan is changing (index vs sequential scan) when the data remains the same. Why would the existence of data in other tables affect the query performance on the 3 static tables? Why does vacuum full and reindex make a difference if the 3 tables are never updated or records deleted?

Using postgresql 8.1.3.

QUERY

explain analyze select ds_tables.name as table_name, ds_columns.name as column_name
from ds_tables
left join ds_columns on ds_tables.classid = ds_columns.classid
left join ds_types on ds_columns.typeid = ds_types.typeid
where ds_types.name like 'OMWeakObjRef%'
and lower(ds_columns.name) not in ('owner','ownergroup','generatedby','originator','extendedby','audituser','settingclassdef','itemowner','srcobj','dstobj','srcweakobj','dstweakobj','notificationcreateuser','metadataowner','rpcdef','settingdef','settingparent','taskacceptuser','workobj','testref')
and lower(ds_tables.name) not in ('ds_omdatatest','ds_ommessage','ds_omusersetting','ds_omloginsession','ds_omclassdef','ds_omuser','ds_omusergroupsetting','ds_omtestobject','ds_omhomedirectory')
and lower(ds_tables.name) like 'ds_om%';

RESULT USING NEW DATABASE

 Nested Loop  (cost=34.48..73.15 rows=1 width=64) (actual time=0.897..42.562 rows=55 loops=1)
   ->  Nested Loop  (cost=34.48..61.38 rows=2 width=48) (actual time=0.782..41.378 rows=61 loops=1)
         ->  Bitmap Heap Scan on ds_types  (cost=2.02..9.63 rows=1 width=16) (actual time=0.160..0.707 rows=130 loops=1)
               Filter: (name ~~ 'OMWeakObjRef%'::text)
               ->  Bitmap Index Scan on ds_types_name_key  (cost=0.00..2.02 rows=4 width=0) (actual time=0.124..0.124 rows=130 loops=1)
                     Index Cond: ((name >= 'OMWeakObjRef'::text) AND (name < 'OMWeakObjRef='::text))
         ->  Bitmap Heap Scan on ds_columns  (cost=32.46..51.64 rows=9 width=64) (actual time=0.301..0.307 rows=0 loops=130)
               Recheck Cond: (ds_columns.typeid = "outer".typeid)
               Filter: ((lower(name) <> 'owner'::text) AND (lower(name) <> 'ownergroup'::text) AND (lower(name) <> 'generatedby'::text) AND (lower(name) <> 'originator'::text) AND (lower(name) <> 'extendedby'::text) AND (lower(name) <> 'audituser'::text) AND (lower(name) <> 'settingclassdef'::text) AND (lower(name) <> 'itemowner'::text) AND (lower(name) <> 'srcobj'::text) AND (lower(name) <> 'dstobj'::text) AND (lower(name) <> 'srcweakobj'::text) AND (lower(name) <> 'dstweakobj'::text) AND (lower(name) <> 'notificationcreateuser'::text) AND (lower(name) <> 'metadataowner'::text) AND (lower(name) <> 'rpcdef'::text) AND (lower(name) <> 'settingdef'::text) AND (lower(name) <> 'settingparent'::text) AND (lower(name) <> 'taskacceptuser'::text) AND (lower(name) <> 'workobj'::text) AND (lower(name) <> 'testref'::text))
               ->  Bitmap Index Scan on ds_columns_pkey  (cost=0.00..32.46 rows=9 width=0) (actual time=0.293..0.293 rows=3 loops=130)
                     Index Cond: (ds_columns.typeid = "outer".typeid)
   ->  Index Scan using ds_tables_pkey on ds_tables  (cost=0.00..5.87 rows=1 width=48) (actual time=0.012..0.014 rows=1 loops=61)
         Index Cond: (ds_tables.classid = "outer".classid)
         Filter: ((lower(name) <> 'ds_omdatatest'::text) AND (lower(name) <> 'ds_ommessage'::text) AND (lower(name) <> 'ds_omusersetting'::text) AND (lower(name) <> 'ds_omloginsession'::text) AND (lower(name) <> 'ds_omclassdef'::text) AND (lower(name) <> 'ds_omuser'::text) AND (lower(name) <> 'ds_omusergroupsetting'::text) AND (lower(name) <> 'ds_omtestobject'::text) AND (lower(name) <> 'ds_omhomedirectory'::text) AND (lower(name) ~~ 'ds_om%'::text))
 Total runtime: 191.034 ms
(15 rows)

RESULT USING PRODUCTION DATABASE

 Nested Loop  (cost=27.67..69.70 rows=1 width=46) (actual time=12.433..6905.152 rows=55 loops=1)
   Join Filter: ("inner".typeid = "outer".typeid)
   ->  Index Scan using ds_types_name_key on ds_types  (cost=0.00..5.57 rows=1
Re: [PERFORM] Query plan changing when queried data does not
Harry Hehl [EMAIL PROTECTED] writes:
> Why does vacuum full and reindex make a difference if the 3 tables are never updated or records deleted?

Probably because you did an ANALYZE somewhere and updated the planner's stats.

I think your major problem is poor estimation of the ds_tables result:

> ->  Seq Scan on ds_tables  (cost=0.00..59.80 rows=1 width=48) (actual time=15.208..15.968 rows=121 loops=1)
>       Filter: ((lower(name) <> 'ds_omdatatest'::text) AND (lower(name) <> 'ds_ommessage'::text) AND (lower(name) <> 'ds_omusersetting'::text) AND (lower(name) <> 'ds_omloginsession'::text) AND (lower(name) <> 'ds_omclassdef'::text) AND (lower(name) <> 'ds_omuser'::text) AND (lower(name) <> 'ds_omusergroupsetting'::text) AND (lower(name) <> 'ds_omtestobject'::text) AND (lower(name) <> 'ds_omhomedirectory'::text) AND (lower(name) ~~ 'ds_om%'::text))

If you have an index on lower(name) then ANALYZE will collect statistics on it, and you'd get an estimate of the result size that was better than random chance ... but I bet you have no such index. You might get some improvement from raising the default statistics target, too.

			regards, tom lane
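[Editor's note: for anyone following along, a sketch of what those two suggestions might look like in SQL. The table and column names are taken from the original post; the statistics value of 100 is only an illustration, not a recommendation.]

-- Expression index so ANALYZE gathers stats on lower(name) and the planner
-- can estimate the lower(name) <> '...' / LIKE filters better.
CREATE INDEX ds_tables_lower_name_idx  ON ds_tables (lower(name));
CREATE INDEX ds_columns_lower_name_idx ON ds_columns (lower(name));

-- Optionally raise the statistics target for the column (or raise
-- default_statistics_target in postgresql.conf), then re-analyze.
ALTER TABLE ds_tables ALTER COLUMN name SET STATISTICS 100;
ANALYZE ds_tables;
ANALYZE ds_columns;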
Re: [PERFORM] opportunity to benchmark a quad core Xeon
On Sat, 16 Dec 2006, Arjen van der Meijden wrote:
> On 16-12-2006 4:24 Jeff Frost wrote:
>> We can add more RAM and drives for testing purposes. Can someone suggest what benchmarks with what settings would be desirable to see how this system performs? I don't believe I've seen any postgres benchmarks done on a quad xeon yet.
> We've done our standard benchmark on a dual X5355: http://tweakers.net/reviews/661
> Verdict is that for price/performance ratio you're better off with a 5160, but in absolute performance it does win.

Arjen,

Have you guys run your benchmark on a quad opteron board yet? I'm curious how the dual quad-core Intels compare to a quad dual-core Opteron.

-- 
Jeff Frost, Owner	[EMAIL PROTECTED]
Frost Consulting, LLC	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
Re: [PERFORM] Insertion to temp table deteriorating over time
I have an update on this. The reason I couldn't reproduce this problem was the way I was creating the temp table in my tests. I was using:

CREATE TEMP TABLE tmp (LIKE perm);

This did not exhibit the performance degradation over time. However, the way our application creates this table (something I should have observed sooner, no doubt) is:

CREATE TEMP TABLE tmp AS SELECT column-list FROM perm LIMIT 0;

This, on its own, is not enough to reproduce the problem. Next imagine that perm is actually a view, which is defined as a UNION ALL SELECT from several other views, and those views are also defined as UNION ALL SELECTs from a bunch of permanent tables. All views have insert rules redirecting rows according to some criteria. The whole structure is pretty convoluted.

I can fix this problem by using CREATE TEMP TABLE ... LIKE instead of CREATE TEMP TABLE ... AS. I'm still curious about the root cause. From the docs, I see that CREATE TABLE AS evaluates the query just once to create the table, but based on what I'm seeing, I'm wondering whether that's truly the case. Are there any known issues with CREATE TABLE AS when the table you're creating is temporary and you're selecting from a view?

Steve

On 12/15/06, Tom Lane [EMAIL PROTECTED] wrote:
> Steven Flatt [EMAIL PROTECTED] writes:
>> I've been trying to reproduce the problem for days now :). I've done pretty much exactly what you describe below, but I can't reproduce the problem on any of our lab machines. Something is indeed special in this environment.
> Yuck. You could try strace'ing the problem backend and see if anything is visibly different between fast and slow operation. I don't suppose you have oprofile on that machine, but if you did it'd be even better.
>
> 			regards, tom lane
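[Editor's note: the two forms being compared differ in how the temp table's column definitions are derived. A minimal illustration, assuming perm here stands for a plain table and with hand-written column names in place of the poster's column-list:]

-- LIKE copies column names, types, and type modifiers straight from the
-- source relation's catalog entry; no query is planned or executed.
CREATE TEMP TABLE tmp_like (LIKE perm);

-- AS ... LIMIT 0 plans and runs a (zero-row) query and derives the temp
-- table's column types from the query's output columns instead.
CREATE TEMP TABLE tmp_as AS SELECT col1, col2 FROM perm LIMIT 0;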
Re: [PERFORM] Insertion to temp table deteriorating over time
Steven Flatt [EMAIL PROTECTED] writes:
> I can fix this problem by using CREATE TEMP TABLE ... LIKE instead of CREATE TEMP TABLE ... AS.

That seems ... um ... bizarre. Now are you able to put together a self-contained test case? Seems like we could have two independent bugs here: first, why (and how) is the temp table different, and second, how does that result in the observed performance problem.

			regards, tom lane
Re: [PERFORM] Advice on selecting good values for work_mem?
In response to Tom Lane [EMAIL PROTECTED]:
> Bill Moran [EMAIL PROTECTED] writes:
>> Does the creation of a temp file trigger any logging?
> No; but it wouldn't be hard to add some if you wanted. I'd do it at deletion, not creation, so you could log the size the file reached. See FileClose() in src/backend/storage/file/fd.c.

Is this along the lines of what you were thinking? Is this acceptable to get pulled into the tree? (Maintaining local patches sucks ;) I've only been using this patch a day and I'm already giddy about how much it helps tuning work memory sizes ...

-- 
Bill Moran
Collaborative Fusion Inc.

*** fd.c.prev	Mon Dec 18 16:09:51 2006
--- fd.c	Mon Dec 18 16:09:31 2006
***************
*** 939,944 ****
--- 939,945 ----
  FileClose(File file)
  {
  	Vfd		   *vfdP;
+ 	struct stat filestats;
  
  	Assert(FileIsValid(file));
***************
*** 968,973 ****
--- 969,982 ----
  	{
  		/* reset flag so that die() interrupt won't cause problems */
  		vfdP->fdstate &= ~FD_TEMPORARY;
+ 		/* stat by name: the descriptor may already be closed at this point */
+ 		if (stat(vfdP->fileName, &filestats) == 0)
+ 			ereport(WARNING,
+ 					(errmsg("A temporary file of %lu bytes was used",
+ 							(unsigned long) filestats.st_size),
+ 					 errhint("You may need to increase work_mem.")));
+ 		else
+ 			elog(LOG, "could not stat \"%s\": %m", vfdP->fileName);
  		if (unlink(vfdP->fileName))
  			elog(LOG, "failed to unlink \"%s\": %m", vfdP->fileName);
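[Editor's note: a quick way to exercise a patch along these lines is to force a sort that cannot fit in work_mem, so FileClose() ends up reporting the spill file. A hedged sketch only; the numbers are arbitrary.]

-- Shrink work_mem for this session (the value is in kB here), then sort more
-- rows than will fit in memory, which should produce a temp file on close.
SET work_mem = 1024;
EXPLAIN ANALYZE
SELECT g FROM generate_series(1, 1000000) AS g ORDER BY g DESC;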
Re: [PERFORM] Advice on selecting good values for work_mem?
In response to Stephen Frost [EMAIL PROTECTED]:
> * Bill Moran ([EMAIL PROTECTED]) wrote:
>> What I'm fuzzy on is how to discretely know when I'm overflowing work_mem? Obviously, if work_mem is exhausted by a particular query, temp files will be created and performance will begin to suck.
>
> I don't believe this is necessarily *always* the case. There are instances in PostgreSQL where it will just continue to allocate memory beyond the work_mem setting. This is usually due to poor statistics (you changed the data in the table dramatically and haven't run analyze, or you never ran analyze on the table at all, or the statistics-gathering values are set too low to capture enough information about the data, etc.). It would be nice if it was possible to have this detected and logged, or similar.
>
> Additionally, work_mem isn't actually a per-query thing, aiui, it's more like a per-node-in-the-planner thing. That is to say that if you have multiple sorts going on, or a sort and a hash, *both* of those expect to be able to use up to work_mem amount of memory.

I'm aware of that. It's one of the reasons I asked about monitoring its usage. I mean, if I could be sure that each process only used work_mem amount of space, it would be pretty easy to run some calculations and go to management and say, "these servers need X amount of RAM for optimal performance" ... As it is, I'm trying to find the most complex queries and estimate how many joins and sorts there are and how much that's going to add up to.

It'd be nice to be able to crank up the debugging and have postgresql say:

  QUERY 0: total work_mem: bytes
    JOIN 0: x bytes
    JOIN 1: y bytes
    ...

Perhaps it's in there somewhere ... I haven't experimented with cranking the logging up to maximum yet. If it's missing, I'm hoping to have some time to add it. Adding debugging to PostgreSQL is a pretty easy way to learn how the code fits together ...

> Also, another point you might want to consider how to handle is that work_mem has no bearing on libpq, and I don't recall there being a way to constrain libpq's memory usage. This has been an issue for me just today when I forgot a couple of parameters to a join, which caused a cartesian-product result and ended up running the box out of memory. Sure, it's my fault, and unlikely to happen in an application, but it still sucks. :) It also managed to run quickly enough that I didn't notice what was happening. :/ Of course, the server side didn't need much memory at all to generate that result. Also, libpq stores everything in *its* memory before passing it to the client. An example scenario of this being kind of an issue is psql: you need double the memory size of a given result because the result is first completely grabbed and stored in libpq and then sent to your pager (eg: less), which then sucks it all into memory again.

In applications (and I guess psql, though I never think of it, and it'd be nice to have as a configurable option if it isn't already...) you can use cursors to limit the amount of memory libpq uses.

In our case, the database servers are always dedicated, and the application side always runs on a different server. This is both a blessing and a curse: on the one hand, I don't have to worry about any client apps eating up RAM on the DB server. On the other hand, last week we found a place where a query with lots of joins was missing a key WHERE clause: it was pulling something like 10X the number of records it needed, then limiting it further on the client side. Optimizing this sort of thing is something I enjoy.

> As these are new things (both the temp file creation logging and the work_mem overflow detection, I believe), this discussion is probably more appropriate for -hackers.

True. It started out here because I wasn't sure that the stuff didn't already exist, and I was curious how others were doing it. When I've had some more opportunity to investigate work_mem monitoring, I'll start the discussion back up on -hackers.

>> That leads to my other question. Assuming I've got lots of connections (which I do), how can I determine if work_mem is too high? Do server processes allocate it even if they don't actually use it? Is the only way to find out to reduce it and see when it starts to be a problem? If so, that leads back to my first question: how can I be sure whether temp files were created or not?
>
> Yeah, look for swappiness... It'd be nice to be able to get memory statistics on queries which have been run, though...

>> My goal is to set work_mem as small as is possible for the most common queries, then force the developers to use "set work_mem to x" to adjust it for big queries.
>
> Sounds like an excellent plan. Be careful though: work_mem settings can affect query plans, and they may discover that if set high enough the planner will, for example, do a hashjoin which is much faster than sorting and merge-joining, but takes
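[Editor's note: the "set work_mem per big query" approach discussed above can be scoped to a single transaction so the larger setting cannot leak into other work. A sketch only; the 262144 kB value and the query are purely illustrative.]

BEGIN;
SET LOCAL work_mem = 262144;   -- ~256MB expressed in kB; reverts at COMMIT/ROLLBACK
SELECT g % 1000 AS bucket, count(*)
FROM generate_series(1, 2000000) AS g
GROUP BY 1
ORDER BY 1;
COMMIT;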
Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2
On Fri, 2006-12-15 at 10:57 -0500, Tom Lane wrote:
> Peter Eisentraut [EMAIL PROTECTED] writes:
>> On Friday, 15 December 2006 11:28, Simon Riggs wrote:
>>> Until we work out a better solution we can fix this in two ways:
>>> 1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...
>>> 2. enable_analyze_timer = off | on (default)  (USERSET)
>> The second one is enough in my mind.
>
> I don't see any point in either one. If you're not going to collect timing data then the only useful info EXPLAIN ANALYZE could provide is knowledge of which rowcount estimates are badly off ... and to get that, you have to wait for the query to finish, which may well be impractical even without the gettimeofday overhead.

On a different part of this thread, you say:

On Fri, 2006-12-15 at 09:56 -0500, Tom Lane wrote:
> The fundamental problem with it was the assumption that different executions of a plan node will have the same timing. That's not true, in fact not even approximately true.

It doesn't make sense to me to claim that the timing is so important that we cannot do without it, while at the same time saying that it is so highly variable that successive executions aren't even approximately the same.

> We had discussed upthread the idea of having an option to issue a NOTICE as soon as any actual rowcount exceeds the estimate by some configurable percentage, and that seems to me to be a much more useful response to the problem of "E.A. takes too long" than removing gettimeofday. One thing that's not too clear to me though is how the NOTICE would identify the node at which the rowcount was exceeded...

We'd have to output the whole EXPLAIN as a NOTICE for it to make any sense. If we can't do without the timings, then half an EXPLAIN would be even worse.

We'd also need to take account of non-linear nodes: Hash nodes react badly beyond a certain point, HashAgg even worse; Sort performs poorly after the end of memory, as does Materialize. Other nodes are more linear, so they would need a different percentage. I don't like the sound of a whole gaggle of GUCs to describe that. Any ideas?

-- 
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
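[Editor's note: a rough, hedged way to see how large the per-row gettimeofday overhead being debated is on a given machine is to compare, in psql, the bare query's wall-clock time with the same query under EXPLAIN ANALYZE; the query below is arbitrary, and the difference only approximates the instrumentation cost.]

\timing
-- Bare execution time of the query itself:
SELECT count(*) FROM generate_series(1, 1000000) AS g;
-- Same query with per-row, per-node timing instrumentation added:
EXPLAIN ANALYZE SELECT count(*) FROM generate_series(1, 1000000) AS g;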
Re: [PERFORM] Insertion to temp table deteriorating over time
Please ignore my post from earlier today. As strange as it sounds, changing CREATE TEMP TABLE ... AS to CREATE TEMP TABLE ... LIKE appeared to fix my performance problem because things errored out so quickly (and silently, in my test program). After checking the pgsql logs, it became clear to me that you can't use LIKE on a view. Duh.

Moving forward, I have also discovered that our temp table did in fact have a varchar column (with no specified limit on the varchar). With this in mind, I could easily reproduce the problem on a temp table with one column. So...

Issue #1: (I'm assuming there's a reasonable explanation for this.) If I create a temp table with a single varchar column (or text column), do 100 inserts to that table, copy to a permanent table, truncate the temp table, and repeat, the time required for the 100 inserts grows almost linearly. Maybe the data is treated as large objects. Note that if I change the column type to varchar(SOME_LIMIT), integer, timestamptz, interval, etc., performance does not degrade. Also note that if I do not use a temp table (but do use a varchar column), inserts are slower (as expected) but do not degrade over time. So this seems to be specific to temp tables with varchar/text column(s).

Issue #2: As I said earlier, the temp table is created via:

CREATE TEMP TABLE tmp AS SELECT column-list FROM perm LIMIT 0;

where perm is a view defined as follows:

View definition:
 SELECT column-list FROM view2 JOIN tbl USING (col1, col2) WHERE some-conditions
 UNION ALL
 SELECT column-list FROM view3 JOIN tbl USING (col1, col2) WHERE some-conditions;

Now the varchar columns that end up in the perm view come from the tbl table, where they are defined as varchar(40). Somehow the 40 limit is lost when constructing the view. After a little more testing, I found that this only occurs when you are creating a view (i.e. CREATE TABLE ... AS does not exhibit the problem) and that the UNION ALL clause must be present. This looks like a bug. I know this is Postgres 7.4.6 and I haven't been able to verify with a later version of Postgres, but does this look familiar to anyone?

Steve
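[Editor's note: a self-contained sketch of the typmod-loss part of this (Issue #2), using hypothetical table and view names; whether the view column really shows up as bare varchar will depend on the server version being tested.]

-- Hypothetical reproduction: a varchar(40) column exposed through a UNION ALL view.
CREATE TABLE tbl_repro (id int, name varchar(40));

CREATE VIEW v_repro AS
    SELECT name FROM tbl_repro
    UNION ALL
    SELECT name FROM tbl_repro;

-- On the poster's 7.4.6 server the view column reportedly comes back as
-- unconstrained varchar rather than varchar(40); compare the two:
\d tbl_repro
\d v_repro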
Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2
Simon Riggs [EMAIL PROTECTED] writes:
> On Fri, 2006-12-15 at 09:56 -0500, Tom Lane wrote:
>> The fundamental problem with it was the assumption that different executions of a plan node will have the same timing. That's not true, in fact not even approximately true.
> It doesn't make sense to me to claim that the timing is so important that we cannot do without it, while at the same time saying that it is so highly variable that successive executions aren't even approximately the same.

Huh? What I said was that successive executions of the same plan node may take considerably different amounts of time, and the proposed sampling patch failed to handle that situation with acceptable accuracy.

			regards, tom lane