Re: [HACKERS] SKIP LOCKED DATA
That is a quite useful feature for implementing things like message queues on top of the database. For now it is possible to work around the lack of such a feature in Postgres using the current advisory lock implementation (pg_try_advisory_xact_lock to determine whether somebody has already acquired a lock on a particular row). So I'm not sure this is an urgent matter.

Best regards,
Ilya

On Mon, Jan 16, 2012 at 6:33 AM, Tom Lane wrote:
> It sounds to me like "silently give the wrong answers". Are you sure
> there are not better, more deterministic ways to solve your problem?
>
> (Or in other words: the fact that Oracle has it isn't enough to persuade
> me it's a good idea.)
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
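The advisory-lock workaround Ilya describes (in SQL, roughly a query filtering on `pg_try_advisory_xact_lock(id)`) can be sketched with an in-memory try-lock model. Everything below is illustrative — invented names, not PostgreSQL's actual locking code:

```python
import threading

# Hypothetical in-memory stand-in for pg_try_advisory_xact_lock: each row id
# maps to a lock that a worker either acquires immediately or skips.
class AdvisoryLocks:
    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def try_lock(self, row_id):
        with self._guard:
            lock = self._locks.setdefault(row_id, threading.Lock())
        # Non-blocking acquire: False means someone else already holds it,
        # so the caller skips this row instead of waiting on it.
        return lock.acquire(blocking=False)

def claim_jobs(locks, job_ids):
    """Return the jobs this worker claims, skipping rows already locked."""
    return [job for job in job_ids if locks.try_lock(job)]

locks = AdvisoryLocks()
first = claim_jobs(locks, [1, 2, 3])       # claims all three
second = claim_jobs(locks, [1, 2, 3, 4])   # sees 1-3 as taken, claims only 4
```

This is the same "skip rows somebody else is working on" behavior that SKIP LOCKED DATA would provide natively, just emulated at the application level.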
Re: [HACKERS] Group commit, revised
On 16.01.2012 00:42, Peter Geoghegan wrote:
> I've also attached the results of a pgbench-tools driven benchmark,
> which are quite striking (just the most relevant image - e-mail me
> privately if you'd like a copy of the full report, as I don't want to
> send a large PDF file to the list as a courtesy to others). Apart from
> the obvious improvement in throughput, there is also a considerable
> improvement in average and worst latency at all client counts.

Impressive results. How about uploading the PDF to the community wiki?

> The main way that I've added value here is by refactoring and fixing
> bugs. There were some tricky race conditions that caused the regression
> tests to fail for that early draft patch, but there were a few other
> small bugs too. There is an unprecedented latch pattern introduced by
> this patch: two processes (the WAL Writer and any given connection
> backend) have a mutual dependency. Only one may sleep, in which case
> the other is responsible for waking it. Much of the difficulty in
> debugging this patch, and much of the complexity that I've added, came
> from preventing both from sleeping simultaneously, even in the face of
> various known failure modes like postmaster death. If that happens, it
> does of course represent a denial of service, so that's something
> reviewers are going to want to scrutinise heavily. I suppose that sync
> rep should be fine here, as it waits on walsenders, but I haven't
> comprehensively tested the patch, so there may be undiscovered
> unpleasant interactions with other xlog-related features. I can report
> that it passes the regression tests successfully, and on an apparently
> consistent basis - I battled with intermittent failures for a time.

I think it might be simpler if it wasn't the background writer that's responsible for "driving" the group commit queue, but the backends themselves. When a flush request comes in, you join the queue, and if someone else is already doing the flush, sleep until the driver wakes you up.
If no-one is doing the flush yet (ie. the queue is empty), start doing it yourself. You'll need a state variable to keep track of who's driving the queue, but otherwise I think it would be simpler, as there would be no dependency on the WAL writer.

I tend to think of the group commit facility as a bus. Passengers can hop on board at any time, and they take turns at driving the bus. When the first passenger hops in, there is no driver, so he takes the driver's seat. When the next passenger hops in, he sees that someone is driving the bus already, so he sits down, places a big sign on his forehead stating the bus stop where he wants to get off, and goes to sleep. When the driver has reached his own bus stop, he wakes up all the passengers who wanted to get off at the same stop or any of the earlier stops [1]. He also wakes up the passenger whose bus stop is farthest from the current stop, and gets off the bus. The woken-up passengers who have already reached their stops can immediately get off the bus, and the one who has not notices that no-one is driving the bus anymore, so he takes the driver's seat.

[1] In a real bus, a passenger would not be happy to be woken up too late and find himself at the next stop instead of the one where he wanted to get off, but for group commit, that is fine.

In this arrangement, you could use the per-process semaphore for the sleeps/wakeups, instead of latches. I'm not sure if there's any difference, but semaphores are more tried and tested, at least.

> The benefits (and, admittedly, the controversies) of this patch go
> beyond mere batching of commits: it also largely, though not entirely,
> obviates the need for user backends to directly write/flush WAL, and
> the WAL Writer may never sleep if it continually finds work to do -
> wal_writer_delay is obsoleted, as are commit_siblings and commit_delay.

wal_writer_delay is still needed for controlling how often asynchronous commits are flushed to disk.

> Auxiliary processes cannot use group commit.
> The changes made prevent them from availing of
> commit_siblings/commit_delay parallelism, because it doesn't exist
> anymore.

Auxiliary processes have PGPROC entries too. Why can't they participate?

> Group commit is sometimes throttled, which seems appropriate - if a
> backend requests that the WAL Writer flush an LSN deemed too far from
> the known flushed point, that request is rejected and the backend goes
> through another path, where XLogWrite() is called.

Hmm, if the backend doing the big flush gets the WALWriteLock before a bunch of group committers, the group committers will have to wait until the big flush is finished anyway. I presume the idea of the throttling is to avoid the situation where a bunch of small commits needs to wait for a huge flush to finish. Perhaps the big flusher should always join the queue, but use some heuristic to first flush up to the previous commit request, to wake up others quickly, and do another flush to flush
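Heikki's bus analogy can be sketched as a single-threaded toy model. The class and field names are invented for illustration, and the real patch's latch/semaphore machinery is deliberately ignored — this only captures the queue handoff protocol:

```python
# Toy model of the "bus" group commit protocol: backends request a flush up to
# their commit LSN ("bus stop"); the first one becomes the driver, later ones
# queue up; after a flush the driver wakes satisfied waiters and hands the
# wheel to the farthest remaining one.
class GroupCommitBus:
    def __init__(self):
        self.queue = []        # pending commit LSNs of sleeping passengers
        self.driver = None     # LSN of the backend currently driving
        self.flushed = 0       # highest LSN known flushed to disk

    def request_flush(self, lsn):
        if self.driver is None:
            self.driver = lsn          # empty bus: take the driver's seat
        else:
            self.queue.append(lsn)     # someone is driving: sit down, sleep

    def drive(self):
        """Flush up to the driver's stop, wake satisfied passengers, and
        hand the wheel to the farthest remaining one."""
        self.flushed = max(self.flushed, self.driver)
        woken = [lsn for lsn in self.queue if lsn <= self.flushed]
        self.queue = [lsn for lsn in self.queue if lsn > self.flushed]
        self.driver = max(self.queue) if self.queue else None
        if self.driver is not None:
            self.queue.remove(self.driver)
        return woken

bus = GroupCommitBus()
bus.request_flush(10)   # first passenger: becomes driver
bus.request_flush(5)    # already satisfied by the pending flush to 10
bus.request_flush(20)   # needs a later flush
woken = bus.drive()     # flushes to 10, wakes the LSN-5 waiter,
                        # hands the wheel to the LSN-20 waiter
```

Note how the passenger at stop 5 is woken "late" (the flush went past his stop), which is exactly the footnote's point: overshooting the requested LSN is harmless for group commit.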
Re: [HACKERS] Standalone synchronous master
On Mon, Jan 16, 2012 at 7:01 AM, Jeff Janes wrote:
> On Fri, Jan 13, 2012 at 10:12 AM, Kevin Grittner wrote:
>> Jeff Janes wrote:
>>
>>> I don't understand why this is controversial.
>>
>> I'm having a hard time seeing why this is considered a feature. It
>> seems to me what is being proposed is a mode with no higher
>> integrity guarantee than asynchronous replication, but latency
>> equivalent to synchronous replication.
>
> There are never 100% guarantees. You could always have two
> independent failures (the WAL disk of the master and of the slave)
> nearly simultaneously.
>
> If you look at weaker guarantees, then with asynchronous replication
> you are almost guaranteed to lose transactions on a fail-over of a
> busy server, and with the proposed option you are almost guaranteed
> not to, as long as disconnections are rare.

Yes. The proposed mode guarantees that you don't lose transactions when a single failure happens, but asynchronous replication doesn't. So the proposed one has the benefit of reducing the risk of data loss to a certain extent. OTOH, when more than one failure happens, in the proposed mode, you may lose transactions. For example, imagine the case where the standby crashes, the standalone master runs for a while, and then its database gets corrupted. In this case, you would lose any transactions committed while the standalone master was running. So, if you want to avoid such data loss, you can use synchronous replication mode. OTOH, if you can endure the data loss caused by a double failure for some reason (e.g., you use reliable hardware...) but not that caused by a single failure, and want to improve availability (i.e., prevent transactions from being blocked after a single failure happens), the proposed one is a good option to use. I believe that some people need this proposed replication mode.
Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] logging in high performance systems.
On 11/23/2011 09:28 PM, Theo Schlossnagle wrote:
> The first thing I did was add hook points where immediate statement
> logging happens "pre_exec" and those that present duration "post_exec".
> These should, with optimization turned on, have only a few instructions
> of impact when no hooks are registered (we could hoist the branch
> outside the function call if that were identified as an issue).
> https://github.com/postwait/postgres/commit/62bb9dfa2d373618f10e46678612720a3a01599a

Theo: could you e-mail the current "pre-flight and post-flight logging hooks" code as a patch to the list? I put it into the CF app so we don't forget to take a look at this, but from a policy point of view an official patch submission to the list is needed here. I've looked at the code on github enough to know we probably want the sort of registration scheme you've proposed no matter what ends up getting logged.

The majority of Theo's patch is concerned with how to register and manage multiple hook consumers around statements that are being executed. The other one we got from Martin hooks all log output and doesn't worry about the multiple-consumer problem. There is an important distinction to make here. Statement logging is one of the largest producers of logging data you'll want to optimize on a high-performance system. The decision about whether to log or not may need to be made by the hook. Something that hooks EmitErrorReport has already lost an important opportunity to decide whether the system is perhaps too busy to worry about logging right now at all; you've already paid a chunk of the logging overhead getting the log message to it. I think both areas are going to be important to hook eventually.

The meeting to discuss this area at our last BWPUG group happened, and we made some good progress mapping out what various people would like to see here. I'm holding the token for the job of summarizing all that usefully for the community.
I'll get back to that as soon as I can now that the January CF is rolling along, hopefully later this week.

-- 
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
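The registration scheme under discussion — essentially free when nothing is registered, with the log/no-log decision made inside the hook itself — can be sketched like this. The function names are invented for illustration, not the symbols from Theo's patch:

```python
# Minimal sketch of a pre_exec/post_exec hook registry. When the hook lists
# are empty, executing a statement costs only two empty-loop checks, which
# mirrors the "few instructions of impact when no hooks are registered" goal.
_pre_exec_hooks = []
_post_exec_hooks = []

def register_pre_exec(hook):
    _pre_exec_hooks.append(hook)

def register_post_exec(hook):
    _post_exec_hooks.append(hook)

def execute_statement(sql, run):
    # Each hook decides for itself whether to log - e.g. a hook could bail
    # out early if the system is too busy, which is the opportunity lost by
    # hooking only at EmitErrorReport time.
    for hook in _pre_exec_hooks:
        hook(sql)
    result = run(sql)
    for hook in _post_exec_hooks:
        hook(sql, result)
    return result

log = []
register_pre_exec(lambda sql: log.append(("pre", sql)))
register_post_exec(lambda sql, result: log.append(("post", sql, result)))
result = execute_statement("SELECT 1", run=lambda sql: "one row")
```

Multiple consumers are handled naturally: each registration appends to the list, and every registered hook fires in order.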
Re: [HACKERS] SKIP LOCKED DATA
On Mon, Jan 16, 2012 at 5:33 AM, Tom Lane wrote:
> Thomas Munro writes:
>> I am wondering out loud whether I am brave enough to try to propose
>> SKIP LOCKED DATA support and would be grateful for any feedback and/or
>> {en|dis}couragement. I don't see it on the todo list, and didn't find
>> signs of others working on this (did I miss something?), but there are
>> examples of users asking for this feature (by various names) on the
>> mailing lists. Has the idea already been rejected, is it
>> fundamentally infeasible for some glaring reason, or far too
>> complicated for new players?
>
> It sounds to me like "silently give the wrong answers". Are you sure
> there are not better, more deterministic ways to solve your problem?

The name is misleading. It means "open a cursor on a query; when you fetch, if you see a locked row, return quickly from the fetch". The idea is that if it's locked it is still being written and therefore not interesting (yet).

Sounds like a reasonable request.

-- 
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] CommitFest 2012-01 kick-off
Despite the best attempts of all my neighbors to distract me, as if there were something more important going on Sunday afternoon in Baltimore than preparing for the CommitFest, I've tried to get all the loose patches onto the CommitFest app and get the early reviews accounted for on there too. CF 2012-01 is now officially started.

We have 95 patches in this one, with 6 of those getting an early commit. This is not unprecedented. At this point last year, CF 2011-01 had 96 patches including 8 early commits, and eventually 69 of those were committed. That CF committed its last and most contentious patch (Sync Rep) on March 6th of 2011. Seems likely we're staring at about 6 weeks for this last CF this year too.

The main difference to be aware of is that there's less diversity in authors this time. The nudging suggestion we make that every patch submitter should review at least one patch has helped keep the reviewers proportional to the submissions as they grow. The problem with this CF is that there are only 39 authors involved, so even if every one of them did two reviews that still wouldn't cut it. We've got two people who were involved in submitting more than 10 patches (Simon and Peter Eisentraut), three who submitted 5 or more (Noah, Robert Haas, and myself), and several other multiple submitters. Keeping up with feedback and revving your own patches while usefully reviewing other people's as well is tough if you have more than one of each happening all at once, and we have a lot of community members in that position right now.

I personally am also concerned about the possible dependencies and coupling among all the performance and performance-instrumentation patches. I count about 30 of those floating around, and 9.2 performance features have certainly been stacking on each other usefully so far. So many performance improvements that we may not be able to absorb them all at once... there are worse problems to have.
-- 
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
Re: [HACKERS] Measuring relation free space
On Sat, Jan 14, 2012 at 02:40:46PM -0500, Jaime Casanova wrote:
> On Sat, Jan 14, 2012 at 6:26 AM, Noah Misch wrote:
>>
>> - pgstattuple() and relation_free_space() should emit the same number,
>>   even if that means improving pgstattuple() at the same time.
>
> yes, i just wanted to understand which one was more accurate and
> why... and give the opportunity for anyone to point my error if any

pgstattuple()'s decision to treat half-dead pages like deleted pages is better. That transient state can only end in the page's deletion.

I don't know about counting non-leaf pages, but I personally wouldn't revisit pgstattuple()'s decision there. In the indexes I've briefly surveyed, the ratio of leaf pages to non-leaf pages is 100:1 or better. No amount of bloat in that 1% will matter. Feel free to make the argument if you think otherwise, though; I've only taken a brief look at the topic.

>> - relation_free_space() belongs in the pgstattuple extension, because
>>   its role is cheaper access to a single pgstattuple() metric.
>
> oh! right! so, what about just fixing the free_percent that
> pgstattuple is providing

If pgstattuple() meets your needs, you might indeed no longer need any patch.
Re: [HACKERS] New replication mode: write
On Fri, Jan 13, 2012 at 9:27 PM, Fujii Masao wrote:
> On Fri, Jan 13, 2012 at 7:30 PM, Simon Riggs wrote:
>> On Fri, Jan 13, 2012 at 9:15 AM, Simon Riggs wrote:
>>> On Fri, Jan 13, 2012 at 7:41 AM, Fujii Masao wrote:
>>> Thought? Comments?
>>>
>>> This is almost exactly the same as my patch series
>>> "syncrep_queues.v[1,2].patch" earlier this year. Which I know because
>>> I was updating that patch myself last night for 9.2. I'm about half
>>> way through doing that, since you and I agreed in Ottawa I would do
>>> this. Perhaps it is better if we work together?
>>
>> I think this comment is mostly pointless. We don't have time to work
>> together and there's no real reason to. You know what you're doing, so
>> I'll leave you to do it.
>>
>> Please add the Apply mode.
>
> OK, will do.

Done. Attached is the updated version of the patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 1559,1567 **** SET ENABLE_SEQSCAN TO OFF;
  Specifies whether transaction commit will wait for WAL records
  to be written to disk before the command returns a success
! indication to the client. Valid values are on,
! local, and off. The default, and safe, value
! is on. When off, there can be a delay between when success
  is reported to the client and when the transaction is really
  guaranteed to be safe against a server crash. (The maximum
  delay is three times .) Unlike
--- 1559,1567 ----
  Specifies whether transaction commit will wait for WAL records
  to be written to disk before the command returns a success
! indication to the client. Valid values are on, write,
! apply, local, and off. The default, and safe,
! value is on. When off, there can be a delay between when success
  is reported to the client and when the transaction is really
  guaranteed to be safe against a server crash. (The maximum
  delay is three times .)
  Unlike
***
*** 1579,1589 **** SET ENABLE_SEQSCAN TO OFF;
  If  is set, this parameter
  also controls whether or not transaction commit will wait for
  the transaction's WAL records to be flushed to disk and replicated
! to the standby server. The commit wait will last until a reply from
! the current synchronous standby indicates it has written the commit
! record of the transaction to durable storage. If synchronous
  replication is in use, it will normally be sensible either to wait
! both for WAL records to reach both the local and remote disks, or
  to allow the transaction to commit asynchronously. However, the special
  value local is available for transactions that wish to wait for
  local flush to disk, but not synchronous replication.
--- 1579,1600 ----
  If  is set, this parameter
  also controls whether or not transaction commit will wait for
  the transaction's WAL records to be flushed to disk and replicated
! to the standby server. When on, the commit wait will last
! until a reply from the current synchronous standby indicates it has flushed
! the commit record of the transaction to durable storage. This will
! avoids any data loss unless the database cluster of both primary and
! standby gets corrupted simultaneously. When write,
! the commit wait will last until a reply from the current synchronous
! standby indicates it has received the commit record of the transaction
! to memory. Normally this causes no data loss at the time of failover.
! However, if both primary and standby crash, and the database cluster of
! the primary gets corrupted, recent committed transactions might
! be lost. When apply, the commit will wait until the current
! synchronous standby has replayed the committed changes successfully.
! This guarantees that any transactions are visible on the synchronous
! standby when they are committed.
  If synchronous replication is in use, it will normally be sensible either to wait
!
! for both local flush and replication of WAL records, or
  to allow the transaction to commit asynchronously. However, the special
  value local is available for transactions that wish to wait for
  local flush to disk, but not synchronous replication.

*** a/doc/src/sgml/high-availability.sgml
--- b/doc/src/sgml/high-availability.sgml
***
*** 1011,1016 **** primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
--- 1011,1039 ----
+ Setting synchronous_commit to write will
+ cause each commit to wait for confirmation that the standby has received
+ the commit
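My reading of the doc changes in this patch is that each synchronous_commit value waits on a progressively larger set of events before the commit returns. This toy model captures that reading — the event names are invented, and the authoritative semantics are those of the patch, not this sketch:

```python
# Toy model of what each proposed synchronous_commit setting waits for before
# returning success to the client. Event names are invented stand-ins:
#   local_flush   - commit record flushed to the primary's disk
#   standby_write - standby has received the record into memory
#   standby_flush - standby has flushed the record to its disk
#   standby_apply - standby has replayed the record (visible to queries)
WAIT_EVENTS = {
    "off":   set(),
    "local": {"local_flush"},
    "write": {"local_flush", "standby_write"},
    "on":    {"local_flush", "standby_flush"},
    "apply": {"local_flush", "standby_flush", "standby_apply"},
}

def commit_returns(mode, completed_events):
    """True once every event the mode waits on has happened."""
    return WAIT_EVENTS[mode] <= completed_events
```

So "write" trades a little durability (a crash of both nodes plus primary corruption can lose recently committed transactions) for not having to wait on the standby's disk, while "apply" adds the guarantee that committed transactions are already visible on the standby.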
Re: [HACKERS] Generate call graphs in run-time
On 01/09/2012 10:00 PM, Joel Jacobson wrote:
> Because of this I decided to sample data in run-time to get a real-life
> picture of the system.
> Any functions not being called in production are not that important to
> include in the documentation anyway.

FWIW I have a similar problem - with a similar solution. I'll try to describe it, in case there is some synergy to be leveraged :)

> In pgstat_init_function_usage(), each call to a function will add the
> caller->called oid pair unless already set in the hash.
> When pgstat_end_function_usage() is called, the current function oid is
> removed from the List of parent oids.
> In AtEOXact_PgStat(), called upon commit/rollback, the call graph is
> sorted and written to the log unless already seen in the session.
> The variables are reset in pgstat_clear_snapshot().

My approach was to add a parent oid to the per-backend function stats structure - PgStat_BackendFunctionEntry. Also, I changed the hash key for that structure to an (oid, parent) pair. This means that within the backend the function usage is always tracked with the context of the calling function. This has the nice property that you get the per-parent usage stats as well. Also, the additional lists for parent tracking are avoided. During pgstat_report_stat() the call graph (with stats) is output to the logs and the statistics are uploaded to the collector -- with the parent oid removed.

> This functionality is probably something one would like to enable only
> temporarily in the production environment.
> A new configuration parameter would therefore be good, just like
> track_functions. Perhaps track_callgraphs?

I opted to always track the function parents -- this is very cheap. The logging on the other hand is quite heavy and needs to be explicitly configured (I used a log_usage_stats GUC). There is a patch for this and we do use it in production for occasional troubleshooting and dependency analysis.
Can't attach it immediately though -- it has some extra cruft in it that needs to be cleaned up.

> Instead of writing the call graphs to the postgres log, it would be more
> useful to let the statistics collector keep track of the call graphs, to
> allow easier access than having to parse through the log file.

Indeed. Something like a pg_stat_user_function_details view would be very useful. Something along the lines of:

     Column     |  Type  |
 ---------------+--------+
  funcid        | oid    |
  parentfuncid  | oid    |  <-- new
  schemaname    | name   |
  funcname      | name   |
  calls         | bigint |
  total_time    | bigint |
  self_time     | bigint |

And then rewrite pg_stat_user_functions by aggregating the detailed view. That'd make the individual pg_stat_get_function* functions a bit slower, but that is probably a non-issue - at least not if the pg_stat_user_functions view is rewritten to use a SRF.

regards,
Martin
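Martin's scheme — keying per-backend stats by an (oid, parent) pair and collapsing the parent away when aggregating — can be sketched as follows. Names and structure are illustrative, not the actual PgStat_BackendFunctionEntry code:

```python
from collections import defaultdict

# Sketch: track function usage keyed by (funcid, parent), where parent is the
# calling function's oid, or None at top level. Aggregating over parents then
# recovers the plain per-function view (like pg_stat_user_functions).
class CallStats:
    def __init__(self):
        self.entries = defaultdict(lambda: {"calls": 0, "total_time": 0})
        self.stack = []  # current call stack of function oids

    def enter(self, funcid):
        parent = self.stack[-1] if self.stack else None
        self.stack.append(funcid)
        return (funcid, parent)   # hash key includes the caller's context

    def leave(self, key, elapsed):
        self.stack.pop()
        entry = self.entries[key]
        entry["calls"] += 1
        entry["total_time"] += elapsed

    def aggregate(self):
        """Collapse the parent away, as when uploading to the collector."""
        calls = defaultdict(int)
        for (funcid, _parent), entry in self.entries.items():
            calls[funcid] += entry["calls"]
        return dict(calls)

stats = CallStats()
k_f = stats.enter(100)     # top-level call to function 100
k_g = stats.enter(200)     # 100 calls 200
stats.leave(k_g, 5)
stats.leave(k_f, 12)
k_top = stats.enter(200)   # 200 called directly at top level
stats.leave(k_top, 3)
```

The two calls to function 200 land in two distinct entries, one per caller, which is exactly the per-parent detail the proposed pg_stat_user_function_details view would expose.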
Re: [HACKERS] age(xid) on hot standby
Excerpts from Peter Eisentraut's message of Sun Jan 15 10:00:03 -0300 2012:
>
> On Wed, 2011-12-28 at 14:35 -0500, Tom Lane wrote:
>> Alvaro Herrera writes:
>>> Excerpts from Peter Eisentraut's message of Wed Dec 28 15:04:09 -0300 2011:
>>>> On a hot standby, this fails with:
>>>> ERROR: cannot assign TransactionIds during recovery
>>
>>> I think we could just have the xid_age call
>>> GetCurrentTransactionIdIfAny, and if that returns InvalidXid, use
>>> ReadNewTransactionId instead. That xid_age assigns a transaction seems
>>> more of an accident than really intended.
>>
>> The trouble with using ReadNewTransactionId is that it makes the results
>> volatile, not stable as the function is declared to be.
>
> Could we alleviate that problem with some caching within the function?

Maybe if we have it be invalidated at transaction end, that could work. So each new transaction would get a fresh value. If you had a long-running transaction the cached value would get behind, but maybe this is not a problem, or we could design some protection against it. For the check_postgres case I imagine it opens a new connection on each round, so this would not be a problem.

-- 
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers
On Sun, Jan 15, 2012 at 11:03 AM, Hitoshi Harada wrote:
> On Sat, Jan 14, 2012 at 6:48 AM, Robert Haas wrote:
>> On Sat, Jan 14, 2012 at 5:25 AM, Hitoshi Harada wrote:
>>> The patch looks ok, though I wonder if we could have a way to release
>>> the lock on the namespace well before the end of the transaction.
>>
>> Well, that would kind of miss the point, wouldn't it? I mean, the race
>> is that the process dropping the schema might not see the newly
>> created object. The only way to prevent that is to hold some sort of
>> lock until the newly created object is visible, which doesn't happen
>> until commit.
>>
>>> I know it's a limited situation, though. Maybe the right way would be
>>> to check the namespace at the end of the transaction if any object is
>>> created, rather than locking it.
>>
>> And then what? If you find that you created an object in a namespace
>> that's been subsequently dropped, what do you do about that? As far
>> as I can see, your only real choice would be to roll back the
>> transaction that uncovers this problem, which is likely to produce
>> more rollbacks than just letting the deadlock detector sort it out.
>
> Right. I thought this patch introduced a lock on the schema in SELECT,
> but actually we'd been locking the schema with SELECT since before the
> patch.

No, we lock the table with SELECT, not the schema. This doesn't add any additional locking for DML; nor is any needed, AFAICS.

> So the behavior becomes consistent (between SELECT and CREATE) now
> with it. And I agree that my objection is pointless after looking more.

OK. Thanks for your review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] inconsistent comparison of CHECK constraints
While reviewing Nikhil Sontakke's fix for the inherited constraints open item we have, I noticed that MergeWithExistingConstraint and MergeConstraintsIntoExisting are using rather different mechanisms to compare equality of the constraint expressions; the former does this:

	{
		Datum		val;
		bool		isnull;

		val = fastgetattr(tup, Anum_pg_constraint_conbin,
						  conDesc->rd_att, &isnull);
		if (isnull)
			elog(ERROR, "null conbin for rel %s",
				 RelationGetRelationName(rel));
		if (equal(expr, stringToNode(TextDatumGetCString(val))))
			found = true;
	}

So plain string comparison of the node's string representation. MergeConstraintsIntoExisting is instead doing this:

	if (acon->condeferrable != bcon->condeferrable ||
		acon->condeferred != bcon->condeferred ||
		strcmp(decompile_conbin(a, tupleDesc),
			   decompile_conbin(b, tupleDesc)) != 0)

where decompile_conbin is defined roughly as

	expr = DirectFunctionCall2(pg_get_expr, attr,
							   ObjectIdGetDatum(con->conrelid));
	return TextDatumGetCString(expr);

So it is first decompiling the node into its source representation, then comparing that. Do we care about this? If so, which version is preferable?

I also noticed that MergeConstraintsIntoExisting is doing a scan on conrelid and then manually filtering for conname, which seems worse than the other code that's just using conname/connamespace as scankey. The latter is probably better on tables with tons of constraints.

-- 
Álvaro Herrera
Re: [HACKERS] age(xid) on hot standby
Alvaro Herrera writes:
> Excerpts from Peter Eisentraut's message of Sun Jan 15 10:00:03 -0300 2012:
>> On Wed, 2011-12-28 at 14:35 -0500, Tom Lane wrote:
>>> The trouble with using ReadNewTransactionId is that it makes the results
>>> volatile, not stable as the function is declared to be.

>> Could we alleviate that problem with some caching within the function?

> Maybe if we have it be invalidated at transaction end, that could work.
> So each new transaction would get a fresh value.

Yeah, I think that would work.

> If you had a long running transaction the cached value would get behind,
> but maybe this is not a problem or we could design some protection
> against it.

Nobody has complained about the fact that age()'s reference point remains fixed throughout a transaction on the master, so I don't see why we'd not be happy with that behavior on a standby.

			regards, tom lane
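The caching scheme being agreed on here — one reference xid per transaction, invalidated at transaction end — might look like this toy model. Names are invented, and ReadNewTransactionId is modeled as a plain counter:

```python
# Sketch: on a hot standby, age() cannot assign a real xid, so it caches the
# "next xid" value once per transaction and reuses it, keeping the function
# stable within a transaction even as new xids appear.
class Standby:
    def __init__(self):
        self.next_xid = 1000     # stand-in for ReadNewTransactionId()
        self._cached_ref = None  # per-transaction cached reference point

    def reference_xid(self):
        if self._cached_ref is None:
            self._cached_ref = self.next_xid   # first call this transaction
        return self._cached_ref

    def age(self, xid):
        return self.reference_xid() - xid

    def end_transaction(self):
        self._cached_ref = None  # next transaction gets a fresh value

s = Standby()
a1 = s.age(900)      # caches reference xid 1000
s.next_xid = 1050    # new xids appear on the primary meanwhile
a2 = s.age(900)      # unchanged: stable within the transaction
s.end_transaction()
a3 = s.age(900)      # fresh reference point picked up
```

This mirrors Tom's point: the reference point staying fixed for the life of a transaction is the same behavior age() already has on the master.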
Re: [HACKERS] reprise: pretty print viewdefs
On 01/13/2012 02:50 PM, Andrew Dunstan wrote:
> On 01/13/2012 12:31 AM, Hitoshi Harada wrote:
>> So my conclusion is it's better than nothing, but we could do a better
>> job here. From a timeline perspective, it'd be ok to apply this patch
>> and improve more later in 9.3+.
>
> I agree, let's look at the items other than the target list during 9.3.
> But I do think this addresses the biggest pain point.

Actually, it turns out to be very simple to add wrapping logic for the FROM clause, as in the attached updated patch, and I think we should do that for this round.

cheers

andrew

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 13758,13764 **** SELECT pg_type_is_visible('myschema.widget'::regtype);
  pg_get_viewdef(view_name, pretty_bool)
  text
! get underlying SELECT command for view (deprecated)
  
  pg_get_viewdef(view_oid)
--- 13758,13765 ----
  pg_get_viewdef(view_name, pretty_bool)
  text
! get underlying SELECT command for view,
! lines with fields are wrapped to 80 columns if pretty_bool is true (deprecated)
  
  pg_get_viewdef(view_oid)
***
*** 13768,13774 **** SELECT pg_type_is_visible('myschema.widget'::regtype);
  pg_get_viewdef(view_oid, pretty_bool)
  text
! get underlying SELECT command for view
  
  pg_options_to_table(reloptions)
--- 13769,13782 ----
  pg_get_viewdef(view_oid, pretty_bool)
  text
! get underlying SELECT command for view,
! lines with fields are wrapped to 80 columns if pretty_bool is true
!
! pg_get_viewdef(view_oid, wrap_int)
! text
! get underlying SELECT command for view,
! wrapping lines with fields as specified, pretty printing is implied
  
  pg_options_to_table(reloptions)

*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***
*** 73,78 ****
--- 73,80 ----
  #define PRETTYFLAG_PAREN 1
  #define PRETTYFLAG_INDENT 2
+ #define PRETTY_WRAP_DEFAULT 79
+ 
  /* macro to test if pretty action needed */
  #define PRETTY_PAREN(context) ((context)->prettyFlags & PRETTYFLAG_PAREN)
  #define PRETTY_INDENT(context) ((context)->prettyFlags & PRETTYFLAG_INDENT)
***
*** 136,141 **** static SPIPlanPtr plan_getrulebyoid = NULL;
--- 138,144 ----
  static const char *query_getrulebyoid = "SELECT * FROM pg_catalog.pg_rewrite WHERE oid = $1";
  static SPIPlanPtr plan_getviewrule = NULL;
  static const char *query_getviewrule = "SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND rulename = $2";
+ static int pretty_wrap = PRETTY_WRAP_DEFAULT;
  
  /* GUC parameters */
  bool quote_all_identifiers = false;
***
*** 381,386 **** pg_get_viewdef_ext(PG_FUNCTION_ARGS)
--- 384,406 ----
  }
  
  Datum
+ pg_get_viewdef_wrap(PG_FUNCTION_ARGS)
+ {
+ 	/* By OID */
+ 	Oid			viewoid = PG_GETARG_OID(0);
+ 	int			wrap = PG_GETARG_INT32(1);
+ 	int			prettyFlags;
+ 	char	   *result;
+ 
+ 	/* calling this implies we want pretty printing */
+ 	prettyFlags = PRETTYFLAG_PAREN | PRETTYFLAG_INDENT;
+ 	pretty_wrap = wrap;
+ 	result = pg_get_viewdef_worker(viewoid, prettyFlags);
+ 	pretty_wrap = PRETTY_WRAP_DEFAULT;
+ 	PG_RETURN_TEXT_P(string_to_text(result));
+ }
+ 
+ Datum
  pg_get_viewdef_name(PG_FUNCTION_ARGS)
  {
  	/* By qualified name */
***
*** 3013,3018 **** get_target_list(List *targetList, deparse_context *context,
--- 3033,3039 ----
  	char	   *sep;
  	int			colno;
  	ListCell   *l;
+ 	bool		last_was_multiline = false;
  
  	sep = " ";
  	colno = 0;
***
*** 3021,3026 **** get_target_list(List *targetList, deparse_context *context,
--- 3042,3051 ----
  	TargetEntry *tle = (TargetEntry *) lfirst(l);
  	char	   *colname;
  	char	   *attname;
+ 	StringInfoData targetbuf;
+ 	int			leading_nl_pos = -1;
+ 	char	   *trailing_nl;
+ 	int			pos;
  
  	if (tle->resjunk)
  		continue;		/* ignore junk entries */
***
3030,3035 get_target_list(List *targetList, deparse_context *context, --- 3055,3069 colno++; /* + * Put the new field spec into targetbuf so we can + * decide after we've got it whether or not it needs + * to go on a new line. + */ + + initStringInfo(&targetbuf); + context->buf = &targetbuf; + + /* * We special-case Var nodes rather than using get_rule_expr. This is * needed because get_rule_expr will display a whole-row Var as * "foo.*", which is the preferred notation in most contexts, but at *** *** 3063,3070 get_target_list(List *targetList, deparse_context *context, if (colname) /* resname could be NULL */ { if (attname == NULL || strcmp(attname, colname) != 0) ! appendStringInfo(buf, " AS %s", quote_identifier(colname))
Re: [HACKERS] inconsistent comparison of CHECK constraints
Alvaro Herrera writes: > While reviewing Nikhil Sontakke's fix for the inherited constraints open > item we have, I noticed that MergeWithExistingConstraint and > MergeConstraintsIntoExisting are using rather different mechanism to > compare equality of the constraint expressions; the former does this: > if (equal(expr, stringToNode(TextDatumGetCString(val > So plain string comparison of the node's string representation. No, that's *not* a "plain string comparison", and if it were it would be wrong. This is doing equal() on the node trees, which is in fact the correct implementation. (If we just compared the nodeToString strings textually, then the result would depend on fields that equal() is designed to ignore, such as source-line locations. And we definitely want this comparison to ignore those.) > MergeConstraintsIntoExisting is instead doing this: > if (acon->condeferrable != bcon->condeferrable || > acon->condeferred != bcon->condeferred || > strcmp(decompile_conbin(a, tupleDesc), >decompile_conbin(b, tupleDesc)) != 0) That's kind of a crock, but it's necessary because it's trying to detect equivalence of constraint expressions belonging to different tables, which could have different physical column numbers as noted by the comment. So I don't see a way to reduce it to a simple equal(). But for constraints applicable to just one table, equal() should be preferred as it's simpler and more reliable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
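Tom's point, that equal() compares the node trees field by field while deliberately ignoring fields such as parse locations, can be illustrated with a toy sketch. The structs and helpers below are hypothetical stand-ins, not the real PostgreSQL node code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a parse node: the same operator, possibly parsed at
 * different character offsets in the original statement. */
typedef struct ToyNode
{
    const char *opname;     /* semantically significant */
    int         location;   /* source offset: semantically irrelevant */
} ToyNode;

/* Like the real equal(): field-by-field comparison that skips 'location'. */
static bool
toy_equal(const ToyNode *a, const ToyNode *b)
{
    return strcmp(a->opname, b->opname) == 0;
}

/* Naive textual comparison of the serialized form, which would wrongly
 * report two semantically identical constraints as different whenever
 * their source locations differ. */
static bool
toy_string_equal(const ToyNode *a, const ToyNode *b)
{
    char sa[64], sb[64];

    snprintf(sa, sizeof(sa), "{OP %s :location %d}", a->opname, a->location);
    snprintf(sb, sizeof(sb), "{OP %s :location %d}", b->opname, b->location);
    return strcmp(sa, sb) == 0;
}
```

Two CHECK constraints written at different positions in their CREATE TABLE statements serialize differently but compare equal structurally, which is why equal() on the stringToNode() result is the right tool for same-table comparison.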
Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace
Excerpts from Jaime Casanova's message of Mon Jan 16 03:23:30 -0300 2012: > On Tue, Dec 13, 2011 at 12:29 PM, Julien Tachoires wrote: > > > > OK, considering that, I don't see any way to handle the case raised by > > Jaime :( > > Did you consider what Álvaro suggested? anyway, it seems it's too late for > this commitfest. > are you intending to resume work on this for the next cycle? > do we consider this as a useful thing? The remaining bits shouldn't be too hard. In case Julien is not interested in the task, I have added a link to this discussion in the TODO item. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Standalone synchronous master
On Sun, Jan 15, 2012 at 5:01 PM, Jeff Janes wrote: > Since synchronous_standby_names cannot be changed without bouncing the > server, we do not provide the tools for an external tool to make this > change cleanly. Yes, it can. It's PGC_SIGHUP. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checkpoint sync pause
On Mon, Jan 16, 2012 at 2:57 AM, Greg Smith wrote: > ... > 2012-01-16 02:39:01.184 EST [25052]: DEBUG: checkpoint sync: number=34 > file=base/16385/11766 time=0.006 msec > 2012-01-16 02:39:01.184 EST [25052]: DEBUG: checkpoint sync delay: seconds > left=3 > 2012-01-16 02:39:01.284 EST [25052]: DEBUG: checkpoint sync delay: seconds > left=2 > 2012-01-16 02:39:01.385 EST [25052]: DEBUG: checkpoint sync delay: seconds > left=1 > 2012-01-16 02:39:01.860 EST [25052]: DEBUG: checkpoint sync: number=35 > file=global/12007 time=375.710 msec > 2012-01-16 02:39:01.860 EST [25052]: DEBUG: checkpoint sync delay: seconds > left=3 > 2012-01-16 02:39:01.961 EST [25052]: DEBUG: checkpoint sync delay: seconds > left=2 > 2012-01-16 02:39:02.061 EST [25052]: DEBUG: checkpoint sync delay: seconds > left=1 > 2012-01-16 02:39:02.161 EST [25052]: DEBUG: checkpoint sync: number=36 > file=base/16385/11754 time=0.008 msec > 2012-01-16 02:39:02.555 EST [25052]: LOG: checkpoint complete: wrote 2586 > buffers (63.1%); 1 transaction log file(s) added, 0 removed, 0 recycled; > write=2.422 s, sync=13.282 s, total=16.123 s; sync files=36, longest=1.085 > s, average=0.040 s > > No docs yet, really need a better guide to tuning checkpoints as they exist > now before there's a place to attach a discussion of this to. Yeah, I think this is an area where a really good documentation patch might help more users than any code we could write. On the technical end, I dislike this a little bit because the parameter is clearly something some people are going to want to set, but it's not at all clear what value they should set it to and it has complex interactions with the other checkpoint settings - and the user's hardware configuration. If there's no way to make it more self-tuning, then perhaps we should just live with that, but it would be nice to come up with something more user-transparent. 
Also, I am still struggling with what the right benchmarking methodology even is to judge whether any patch in this area "works". Can you provide more details about your test setup? Just one random thought: I wonder if it would make sense to cap the delay after each sync to the time spent performing that sync. That would make the tuning of the delay less sensitive to the total number of files, because we won't unnecessarily wait after each sync when they're not actually taking any time to complete. It's probably easier to estimate the number of segments that are likely to contain lots of dirty data than to estimate the total number of segments that you might have touched at least once since the last checkpoint, and there's no particular reason to think the latter is really what you should be tuning on anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
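Robert's capping idea can be sketched in a few lines of C (the names here are illustrative, not from the actual patch): after each file sync, sleep for at most as long as the sync itself took, so files that were already clean contribute essentially no extra delay.

```c
#include <assert.h>

/* Sleep budget after one sync: never more than the sync itself took.
 * All times in seconds; configured_delay mirrors a hypothetical GUC. */
static double
capped_sync_delay(double configured_delay, double sync_duration)
{
    return sync_duration < configured_delay ? sync_duration : configured_delay;
}

/* Total pause added across a checkpoint's sync phase, given the
 * measured duration of each individual sync. */
static double
total_sync_pause(double configured_delay, const double *durations, int n)
{
    double total = 0.0;

    for (int i = 0; i < n; i++)
        total += capped_sync_delay(configured_delay, durations[i]);
    return total;
}
```

With the log excerpt above, the 0.006 msec syncs would add essentially nothing, while the 375 msec sync would still be followed by the full configured pause, making the total checkpoint stretch depend mostly on the number of genuinely dirty files.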
Re: [HACKERS] pgstat documentation tables
Bruce had a patch to turn SGML descriptions of system views into comments via some Perl program or something. He posted it many moons ago and I haven't seen an updated version. Bruce, do you have something to say on this topic? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?
On Sat, Dec 31, 2011 at 8:26 AM, Simon Riggs wrote: > On Fri, Dec 30, 2011 at 10:20 PM, Peter Eisentraut wrote: >> On ons, 2011-08-24 at 11:24 -0700, Daniel Farina wrote: >>> I was poking around at tablecmds and index.c and wonder if a similar >>> two-pass approach as used by CREATE INDEX CONCURRENTLY can be used to >>> create a DROP INDEX CONCURRENTLY, and if there would be any interest >>> in accepting such a patch. >> >> Hmm, it seems I just independently came up with this same concept. My >> problem is that if a CREATE INDEX CONCURRENTLY fails, you need an >> exclusive lock on the table just to clean that up. If the table is >> under constant load, you can't easily do that. So a two-pass DROP INDEX >> CONCURRENTLY might have been helpful for me. > > Here's a patch for this. Please review. I don't see how setting indisvalid to false helps with this, because IIUC when a session sees indisvalid = false, it is supposed to avoid using the index for queries but still make new index entries when a write operation happens - but to drop an index, I think you'd need to get into a state where no one was using the index for anything at all. Maybe we need to change indisvalid to a "char" and make it three valued: c = being created currently, v = valid, d = being dropped concurrently, or something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
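The three-valued scheme Robert sketches could look roughly like this. It is purely illustrative: indisvalid and indisready are real catalog columns, but this enum and these predicates are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-valued replacement for the boolean indisvalid,
 * stored as a "char" as suggested in the message above. */
typedef enum IndexLifecycle
{
    INDEX_BEING_CREATED = 'c',  /* CREATE INDEX CONCURRENTLY in progress */
    INDEX_VALID         = 'v',  /* normal, fully usable index */
    INDEX_BEING_DROPPED = 'd'   /* DROP INDEX CONCURRENTLY in progress */
} IndexLifecycle;

/* May the planner use this index to answer queries? */
static bool
index_usable_for_scans(IndexLifecycle state)
{
    return state == INDEX_VALID;
}

/* Must writers still maintain entries in this index?  While being
 * created: yes, so the concurrent build can catch up.  While being
 * dropped: no, which is exactly the "no one uses it for anything"
 * state that DROP INDEX CONCURRENTLY needs to reach. */
static bool
index_maintained_on_write(IndexLifecycle state)
{
    return state == INDEX_BEING_CREATED || state == INDEX_VALID;
}
```

The asymmetry between the two predicates is the whole point: 'c' means "maintain but don't scan", while 'd' means "neither scan nor maintain", which a single boolean cannot express.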
Re: [HACKERS] New replication mode: write
On Mon, Jan 16, 2012 at 12:45 PM, Fujii Masao wrote: > Done. Attached is the updated version of the patch. Thanks. I'll review this first, but can't start immediately. Please expect something back in 2 days. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace
Hi, 2012/1/16 Alvaro Herrera : > > Excerpts from Jaime Casanova's message of Mon Jan 16 03:23:30 -0300 2012: >> On Tue, Dec 13, 2011 at 12:29 PM, Julien Tachoires wrote: >> > >> > OK, considering that, I don't see any way to handle the case raised by >> > Jaime :( >> >> Did you consider what Álvaro suggested? anyway, it seems it's too late for >> this commitfest. Not yet. >> are you intending to resume work on this for the next cycle? >> do we consider this as a useful thing? That's a good question. If the answer is "yes", I'll continue on this work. > > The remaining bits shouldn't be too hard. In case Julien is not > interested in the task, I have added a link to this discussion in the > TODO item. > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Caching for stable expressions with constant arguments v6
Hi list, Here's v6 of my expression caching patch. The only change in v6 is added expression cost estimation in costsize.c. I'm setting per-tuple cost of CacheExpr to 0 and moving sub-expression tuple costs into the startup cost. As always, this work is also available from my Github "cache" branch: https://github.com/intgr/postgres/commits/cache This patch was marked as "Returned with Feedback" from the 2011-11 commitfest. I expected to get to tweak the patch in response to feedback before posting to the next commitfest. But said feedback didn't arrive, and with me being on vacation, I missed the 2012-01 CF deadline. :( I will add it to the 2012-01 commitfest now, I hope that's OK. If not, feel free to remove it and I'll put it in 2012-Next. PS: Jaime, have you had a chance to look at the patch? Even if you're not done with the review, I'd be glad to get some comments earlier. And thanks for reviewing. Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Caching for stable expressions with constant arguments v6
On Mon, Jan 16, 2012 at 12:06 PM, Marti Raudsepp wrote: > > I will add it to the 2012-01 commitfest now, I hope that's OK. If not, > feel free to remove it and I'll put it in 2012-Next. > i'm not the CF manager so he can disagree with me... but IMHO your patch has been almost complete since last CF and seems something we want... i made some performance test that didn't give me good results last time but didn't show them and wasn't able to reproduce so maybe it was something in my machine... so, i would say add it and if the CF manager disagrees he can say so -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
On Sun, Jan 15, 2012 at 3:45 AM, Magnus Hagander wrote: > On Sun, Jan 15, 2012 at 09:37, Greg Smith wrote: >> On 01/15/2012 03:17 AM, Magnus Hagander wrote: >>> And FWIW, I'd find it a lot more useful for the CF app to have the >>> ability to post *reviews* in it, that would end up being properly >>> threaded. >> >> Next you'll be saying we should have some sort of web application to help >> with the whole review process, show the work on an integrated sort of Review >> Board or something. What crazy talk. > > Well, it's early in the morning for being sunday, I blame that. > >> My contribution toward patch review ease for this week is that peg is quite >> a bit smarter about referring to the correct part of the origin git repo >> now, when you've been working on a branch for a while then create a new one: >> https://github.com/gregs1104/peg > > Being able to refer to a git branch is one of those things that have > been on the todo list for the cf app since pgcon last year... Do we have an actual written TODO list for the cf app somewhere? If not, I think creating one would be a good idea. I realize I've been remiss in addressing some of the things people want, but the lack of any centralized place where such items are collected doesn't make it simpler. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
On Mon, Jan 16, 2012 at 18:57, Robert Haas wrote: > On Sun, Jan 15, 2012 at 3:45 AM, Magnus Hagander wrote: >> On Sun, Jan 15, 2012 at 09:37, Greg Smith wrote: >>> On 01/15/2012 03:17 AM, Magnus Hagander wrote: And FWIW, I'd find it a lot more useful for the CF app to have the ability to post *reviews* in it, that would end up being properly threaded. >>> >>> Next you'll be saying we should have some sort of web application to help >>> with the whole review process, show the work on an integrated sort of Review >>> Board or something. What crazy talk. >> >> Well, it's early in the morning for being sunday, I blame that. >> >>> My contribution toward patch review ease for this week is that peg is quite >>> a bit smarter about referring to the correct part of the origin git repo >>> now, when you've been working on a branch for a while then create a new one: >>> https://github.com/gregs1104/peg >> >> Being able to refer to a git branch is one of those things that have >> been on the todo list for the cf app since pgcon last year... > > Do we have an actual written TODO list for the cf app somewhere? If > not, I think creating one would be a good idea. I realize I've been > remiss in addressing some of the things people want, but the lack of > any centralized place where such items are collected doesn't make it > simpler. I don't think so - I've been keeping mine in your mailbox ;) A simple wiki page is probably enough - going for an actual tracker or anything seems vastly overkill... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat documentation tables
On sön, 2012-01-15 at 12:20 -0500, Tom Lane wrote: > Magnus Hagander writes: > > Right now we have a single table on > > http://www.postgresql.org/docs/devel/static/monitoring-stats.html#MONITORING-STATS-VIEWS > > that lists all our statistics views ... > > I'd like to turn that into one table for each view, > > Please follow the style already used for system catalogs; ie I think > there should be a summary table with one entry per view, and then a > separate description and table-of-columns for each view. If the tables had proper table and column comments, you might even be able to generate the SGML documentation automatically. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result metadata
On ons, 2012-01-11 at 17:16 -0300, Alvaro Herrera wrote: > > I propose to add two functions to the result object: > > > > .colnames() returns a list of column names (strings) > > .coltypes() returns a list of type OIDs (integers) > > No typmods? Didn't think about that, but could be added using similar interface and code. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result metadata
On ons, 2012-01-11 at 22:52 +0100, Dimitri Fontaine wrote: > Peter Eisentraut writes: > > .colnames() returns a list of column names (strings) > > .coltypes() returns a list of type OIDs (integers) > > > > I just made that up because there is no guidance in the other standard > > PLs for this sort of thing, AFAICT. > > What about having the same or comparable API as in psycopg or DB API > > http://initd.org/psycopg/docs/cursor.html > > You could expose a py.description structure? I deliberately chose not to do that, because the PL/Python API is intentionally totally different from the standard DB-API, and mixing in some semi-conforming look-alike would be quite confusing from both ends. I think we should stick with the PL/Python API being a small layer on top of SPI, and let the likes of plpydbapi handle the rest. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat documentation tables
On Mon, Jan 16, 2012 at 19:41, Peter Eisentraut wrote: > On sön, 2012-01-15 at 12:20 -0500, Tom Lane wrote: >> Magnus Hagander writes: >> > Right now we have a single table on >> > http://www.postgresql.org/docs/devel/static/monitoring-stats.html#MONITORING-STATS-VIEWS >> > that lists all our statistics views ... >> > I'd like to turn that into one table for each view, >> >> Please follow the style already used for system catalogs; ie I think >> there should be a summary table with one entry per view, and then a >> separate description and table-of-columns for each view. > > If the tables had proper table and column comments, you might even be > able to generate the SGML documentation automatically. Well, I'd expect some of those columns to get (at least over time) significantly more detailed information than they have now. Certainly more than you'd put in comments in the catalogs. And having some sort of combination there seems to overcomplicate things... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup option for handling symlinks
On sön, 2012-01-08 at 22:22 +0100, Magnus Hagander wrote: > On Sun, Jan 8, 2012 at 21:53, Peter Eisentraut wrote: > > I've recently had a possible need for telling pg_basebackup how to > > handle symlinks in the remote data directory, instead of ignoring them, > > which is what currently happens. Possible options were recreating the > > symlink locally (pointing to a file on the local system) or copying the > > file the symlink points to. This is mainly useful in scenarios where > > configuration files are symlinked from the data directory. Has anyone > > else had the need for this? Is it worth pursuing? > > Yes. > > I came up to the same issue though - in one case it would've been best > to copy the link, in the other case it would've been best to copy the > contents of the file :S Couldn't decide which was most important... > Maybe a switch would be needed? Yes. Do we need to preserve the third behavior of ignoring symlinks? tar has an -h option that causes symlinks to be followed. The default there is to archive the symlink itself. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?
On mån, 2012-01-16 at 11:17 -0500, Robert Haas wrote: > I don't see how setting indisvalid to false helps with this, because > IIUC when a session sees indisvalid = false, it is supposed to avoid > using the index for queries but still make new index entries when a > write operation happens - but to drop an index, I think you'd need to > get into a state where no one was using the index for anything at all. ISTM that one would need to set indisready to false instead. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign key locks, 2nd attempt
On 15.01.2012 06:49, Alvaro Herrera wrote: --- 164,178 #define HEAP_HASVARWIDTH 0x0002 /* has variable-width attribute(s) */ #define HEAP_HASEXTERNAL 0x0004 /* has external stored attribute(s) */ #define HEAP_HASOID 0x0008 /* has an object-id field */ ! #define HEAP_XMAX_KEYSHR_LOCK 0x0010 /* xmax is a key-shared locker */ #define HEAP_COMBOCID 0x0020 /* t_cid is a combo cid */ #define HEAP_XMAX_EXCL_LOCK 0x0040 /* xmax is exclusive locker */ ! #define HEAP_XMAX_IS_NOT_UPDATE 0x0080 /* xmax, if valid, is only a locker. ! * Note this is not set unless ! * XMAX_IS_MULTI is also set. */ ! ! #define HEAP_LOCK_BITS(HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_IS_NOT_UPDATE | \ !HEAP_XMAX_KEYSHR_LOCK) #define HEAP_XMIN_COMMITTED 0x0100 /* t_xmin committed */ #define HEAP_XMIN_INVALID 0x0200 /* t_xmin invalid/aborted */ #define HEAP_XMAX_COMMITTED 0x0400 /* t_xmax committed */ HEAP_XMAX_IS_NOT_UPDATE is a pretty opaque name for that. From the name, I'd say that a DELETE would set that, but the comment says it has to do with locking. After going through all the combinations in my mind, I think I finally understood it: HEAP_XMAX_IS_NOT_UPDATE is set if xmax is a multi-xact that represents only locking xids, no updates. How about calling it "HEAP_XMAX_LOCK_ONLY", and setting it whenever xmax is only a locker, whether or not xmax is a multi-xid? - I haven't done anything about exposing FOR KEY UPDATE to the SQL level. There clearly isn't consensus about exposing this; in fact there isn't consensus on exposing FOR KEY SHARE, but I haven't changed that from the previous patch, either. I think it would be useful to expose it. Not that anyone should be using them in an application (or would it be useful?), but I feel it could make testing significantly easier. - pg_upgrade bits are missing. I guess we'll need to rewrite pg_multixact contents in pg_upgrade. Is the page format backwards-compatible? Why are you renaming HeapTupleHeaderGetXmax() into HeapTupleHeaderGetRawXmax()?
Any current callers of HeapTupleHeaderGetXmax() should already check that HEAP_XMAX_IS_MULTI is not set. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
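For readers following along, the bit layout under discussion can be checked in isolation. This sketch uses Heikki's proposed HEAP_XMAX_LOCK_ONLY name in place of HEAP_XMAX_IS_NOT_UPDATE; it is a standalone illustration, not the actual htup.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Infomask bits as defined in the patch hunk quoted above,
 * with the suggested rename applied. */
#define HEAP_XMAX_KEYSHR_LOCK  0x0010   /* xmax is a key-shared locker */
#define HEAP_XMAX_EXCL_LOCK    0x0040   /* xmax is exclusive locker */
#define HEAP_XMAX_LOCK_ONLY    0x0080   /* xmax locked but did not update */

#define HEAP_LOCK_BITS (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_LOCK_ONLY | \
                        HEAP_XMAX_KEYSHR_LOCK)

/* True when xmax locked the tuple but did not update or delete it,
 * under the renamed-flag semantics discussed in this thread. */
static bool
xmax_is_lock_only(uint16_t infomask)
{
    return (infomask & HEAP_XMAX_LOCK_ONLY) != 0;
}
```

Because the three lock bits occupy distinct positions, HEAP_LOCK_BITS can be used to mask out all locking state at once without disturbing the commit/invalid hint bits at 0x0100 and above.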
Re: [HACKERS] Why is CF 2011-11 still listed as "In Progress"?
On Sat, Jan 14, 2012 at 8:58 PM, Greg Smith wrote: > I would have sworn I left this next to the bike shed...from the crickets > chirping I guess not. I did complete bumping forward the patches that > slipped through the November CF the other day, and it's properly closed now. > > As for CF 2012-01, I had thought Robert Haas was going to run that one. My > saying that is not intended to put him on the hook. Last year, I tried to take a fairly active role in getting the CommitFest wrapped up in a reasonable period of time, and that didn't really get a lot of support, so I'm not particularly inclined to do it again. I have long felt that it's important, especially for non-committers, not to go too long between CommitFests, because that means going a long time without much opportunity to get things committed, or even to get feedback. And even for committers, it's not particularly productive to sit around for a long time with the tree closed to new work. Virtually nobody wants to grow a gigantic patch before seeing any of it go in; the chance of rejection is way too high. At least, that's my view. But, I've noticed that nothing good comes of me pressing my own view too hard. Either we as a community value having the CommitFest wrap up in a reasonable period of time, or we don't. If we do, then let's make it happen together. If we don't, then let's resign ourselves now to the fact that 9.2 will not hit the shelves for a very long time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?
On Mon, Jan 16, 2012 at 2:06 PM, Peter Eisentraut wrote: > On mån, 2012-01-16 at 11:17 -0500, Robert Haas wrote: >> I don't see how setting indisvalid to false helps with this, because >> IIUC when a session sees indisvalid = false, it is supposed to avoid >> using the index for queries but still make new index entries when a >> write operation happens - but to drop an index, I think you'd need to >> get into a state where no one was using the index for anything at all. > > ISTM that one would need to set indisready to false instead. Maybe we should set both to false? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign key locks, 2nd attempt
Excerpts from Heikki Linnakangas's message of lun ene 16 16:17:42 -0300 2012: > > On 15.01.2012 06:49, Alvaro Herrera wrote: > > --- 164,178 > > #define HEAP_HASVARWIDTH0x0002/* has variable-width > > attribute(s) */ > > #define HEAP_HASEXTERNAL0x0004/* has external stored > > attribute(s) */ > > #define HEAP_HASOID0x0008/* has an object-id field */ > > ! #define HEAP_XMAX_KEYSHR_LOCK0x0010/* xmax is a key-shared locker > > */ > > #define HEAP_COMBOCID0x0020/* t_cid is a combo cid */ > > #define HEAP_XMAX_EXCL_LOCK0x0040/* xmax is exclusive locker > > */ > > ! #define HEAP_XMAX_IS_NOT_UPDATE0x0080/* xmax, if valid, is only a > > locker. > > ! * Note this is not set unless > > ! * XMAX_IS_MULTI is also set. */ > > ! > > ! #define HEAP_LOCK_BITS(HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_IS_NOT_UPDATE > > | \ > > ! HEAP_XMAX_KEYSHR_LOCK) > > #define HEAP_XMIN_COMMITTED0x0100/* t_xmin committed */ > > #define HEAP_XMIN_INVALID0x0200/* t_xmin invalid/aborted */ > > #define HEAP_XMAX_COMMITTED0x0400/* t_xmax committed */ > > HEAP_XMAX_IS_NOT_UPDATE is a pretty opaque name for that. From the name, > I'd say that a DELETE would set that, but the comment says it has to do > with locking. After going through all the combinations in my mind, I > think I finally understood it: HEAP_XMAX_IS_NOT_UPDATE is set if xmax is > a multi-xact, that represent only locking xids, no updates. How about > calling it "HEAP_XMAX_LOCK_ONLY", and setting it whenever whether or not > xmax is a multi-xid? Hm, sounds like a good idea. Will do. > > - I haven't done anything about exposing FOR KEY UPDATE to the SQL > > level. There clearly isn't consensus about exposing this; in fact > > there isn't consensus on exposing FOR KEY SHARE, but I haven't > > changed that from the previous patch, either. > > I think it would be useful to expose it. Not that anyone should be using > them in an application (or would it be useful?), but I feel it could > make testing significantly easier. 
Okay, two votes in favor; I'll go do that too. > > - pg_upgrade bits are missing. > > I guess we'll need to rewrite pg_multixact contents in pg_upgrade. Is > the page format backwards-compatible? It's not. I haven't worked out what pg_upgrade needs to do, honestly. I think we should just not copy old pg_multixact files when upgrading across this patch. I was initially thinking that pg_multixact should return the empty set if asked for members of a multi that preceded the freeze point. That way, I thought, we would just never try to access a page that originated in the older version (assuming the freeze point is set to "current" whenever pg_upgrade runs). However, as things currently stand, accessing an old multi raises an error. So maybe we need a scheme a bit more complex to handle this. > Why are you renaming HeapTupleHeaderGetXmax() into > HeapTupleHeaderGetRawXmax()? Any current callers of > HeapTupleHeaderGetXmax() should already check that HEAP_XMAX_IS_MULTI is > not set. I had this vague impression that it'd be better to break existing callers so that they ensure they decide between HeapTupleHeaderGetRawXmax and HeapTupleHeaderGetUpdateXid. Noah suggested changing the macro name, too. It's up to each caller to decide what semantics they want. Most want the latter; and callers outside core are more likely to want that one. If we kept the old name, they would get the wrong value. If we want to keep the original name, it's the same to me. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result metadata
Peter Eisentraut writes: > I deliberately chose not to do that, because the PL/Python API is > intentionally totally different from the standard DB-API, and mixing in > some semi-conforming look-alike would be quite confusing from both ends. Fair enough. > I think we should stick with the PL/Python API being a small layer on > top of SPI, and let the likes of plpydbapi handle the rest. I'm discovering that, and again, fair enough :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_stat_database deadlock counter
Attached patch adds a counter for number of deadlocks in a database to pg_stat_database. While not enough to diagnose a problem on its own, this is an easy way to get an indicator for when you need to go look in the logs for details. Overhead should be very small - one counter per database is not enough to bloat the statsfile, and if you have enough deadlocks that the sending of the messages actually causes a performance overhead, you have a bigger problem... Comments? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ *** a/doc/src/sgml/monitoring.sgml --- b/doc/src/sgml/monitoring.sgml *** *** 283,289 postgres: user database host --- 283,290 read requests avoided by finding the block already in buffer cache), number of rows returned, fetched, inserted, updated and deleted, the total number of queries canceled due to conflict with recovery (on ! standby servers), total number of deadlocks detected, and time of ! last statistics reset. *** a/src/backend/catalog/system_views.sql --- b/src/backend/catalog/system_views.sql *** *** 574,579 CREATE VIEW pg_stat_database AS --- 574,580 pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted, pg_stat_get_db_conflict_all(D.oid) AS conflicts, + pg_stat_get_db_deadlocks(D.oid) AS deadlocks, pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset FROM pg_database D; *** a/src/backend/postmaster/pgstat.c --- b/src/backend/postmaster/pgstat.c *** *** 286,291 static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len); --- 286,292 static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len); static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len); static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len); + static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len); /* *** *** 1339,1344 pgstat_report_recovery_conflict(int reason) --- 1340,1364 pgstat_send(&msg, sizeof(msg)); } + /* + * 
pgstat_report_deadlock() - + * + * Tell the collector about a deadlock detected. + * + */ + void + pgstat_report_deadlock(void) + { + PgStat_MsgDeadlock msg; + + if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts) + return; + + pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_DEADLOCK); + msg.m_databaseid = MyDatabaseId; + pgstat_send(&msg, sizeof(msg)); + } + /* -- * pgstat_ping() - * *** *** 3185,3190 PgstatCollectorMain(int argc, char *argv[]) --- 3205,3214 pgstat_recv_recoveryconflict((PgStat_MsgRecoveryConflict *) &msg, len); break; + case PGSTAT_MTYPE_DEADLOCK: + pgstat_recv_deadlock((PgStat_MsgDeadlock *) &msg, len); + break; + default: break; } *** *** 3266,3271 pgstat_get_db_entry(Oid databaseid, bool create) --- 3290,3296 result->n_conflict_snapshot = 0; result->n_conflict_bufferpin = 0; result->n_conflict_startup_deadlock = 0; + result->n_deadlocks = 0; result->stat_reset_timestamp = GetCurrentTimestamp(); *** *** 4403,4408 pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len) --- 4428,4449 } /* -- + * pgstat_recv_deadlock() - + * + * Process as DEADLOCK message. + * -- + */ + static void + pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len) + { + PgStat_StatDBEntry *dbentry; + + dbentry = pgstat_get_db_entry(msg->m_databaseid, true); + + dbentry->n_deadlocks++; + } + + /* -- * pgstat_recv_funcstat() - * * Count what the backend has done. 
*** a/src/backend/storage/lmgr/deadlock.c --- b/src/backend/storage/lmgr/deadlock.c *** *** 938,943 DeadLockReport(void) --- 938,945 pgstat_get_backend_current_activity(info->pid, false)); } + pgstat_report_deadlock(); + ereport(ERROR, (errcode(ERRCODE_T_R_DEADLOCK_DETECTED), errmsg("deadlock detected"), *** a/src/backend/utils/adt/pgstatfuncs.c --- b/src/backend/utils/adt/pgstatfuncs.c *** *** 78,83 extern Datum pg_stat_get_db_conflict_snapshot(PG_FUNCTION_ARGS); --- 78,84 extern Datum pg_stat_get_db_conflict_bufferpin(PG_FUNCTION_ARGS); extern Datum pg_stat_get_db_conflict_startup_deadlock(PG_FUNCTION_ARGS); extern Datum pg_stat_get_db_conflict_all(PG_FUNCTION_ARGS); + extern Datum pg_stat_get_db_deadlocks(PG_FUNCTION_ARGS); extern Datum pg_stat_get_db_stat_reset_time(PG_FUNCTION_ARGS); extern Datum pg_stat_get_bgwriter_timed_checkpoints(PG_FUNCTION_ARGS); *** *** 1275,1280 pg_stat_get_db_conflict_all(PG_FUNCTION_ARGS) --- 1276
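The patch text above is truncated, but the user-visible effect is a new column on the existing view. A sketch of how an administrator might use it once a patch along these lines is applied (the deadlocks column name is the one added by the patch; the query itself is illustrative):

```sql
-- Per-database deadlock counts, highest first, with the time the
-- counters were last reset for context.
SELECT datname, deadlocks, stats_reset
FROM pg_stat_database
ORDER BY deadlocks DESC;
```

A rising count here is only the indicator; as the patch author notes, the details still live in the server logs.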
Re: [HACKERS] SKIP LOCKED DATA
Thomas Munro writes: > A common usage for this is to increase parallelism in systems with > multiple workers taking jobs from a queue. I've used it for this > purpose myself on another RDBMS, having seen it recommended for some > types of work queue implementation. It may have other uses. To attack this problem in PostgreSQL we also have PGQ, which is part of Skytools; Skytools3 has support for cooperative consumers. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Warning in views.c
Seems 1575fbcb caused this warning: view.c: In function ‘DefineVirtualRelation’: view.c:105:6: warning: variable ‘namespaceId’ set but not used [-Wunused-but-set-variable] Attached seems to be the easy fix - or am I missing something obvious? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c index c3520ae..f895488 100644 --- a/src/backend/commands/view.c +++ b/src/backend/commands/view.c @@ -102,7 +102,6 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace, List *options) { Oid viewOid; - Oid namespaceId; LOCKMODE lockmode; CreateStmt *createStmt = makeNode(CreateStmt); List *attrList; @@ -167,8 +166,7 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace, * namespace is temporary. */ lockmode = replace ? AccessExclusiveLock : NoLock; - namespaceId = - RangeVarGetAndCheckCreationNamespace(relation, lockmode, &viewOid); + (void) RangeVarGetAndCheckCreationNamespace(relation, lockmode, &viewOid); if (OidIsValid(viewOid) && replace) {
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
On 1/14/12 8:44 PM, Greg Smith wrote: > Second, e-mail provides some level of validation that patches being > submitted are coming from the person they claim. We currently reject > patches that are only shared with the community on the web, via places > like github. The process around this mailing list tries to make it > clear sending patches to here is a code submission under the PostgreSQL > license. And e-mail nowadays keeps increasing the number of checks that > confirm it's coming from the person it claims sent it. I can go check > into the DKIM credentials your Gmail message to the list contained if > I'd like, to help confirm it really came from your account. E-mail > headers are certainly not perfectly traceable and audit-able, but they > are far better than what you'd get from a web submission. Little audit > trail there beyond "came from this IP address".

Putting submitters aside, I have to say based on teaching people how to use the CF stuff on Thursday night that the process of submitting a review of a patch is VERY unintuitive, or in the words of one reviewer "astonishingly arcane". Summing up:

1. Log into CF. Claim the patch by editing it.
2. Write a review and email it to pgsql-hackers.
3. Dig the messageID out of your sent mail.
4. Add a comment to the patch, type "Review" with the messageID, and ideally a short summary comment of the review.
5. Edit the patch to change its status as well as to remove yourself as reviewer if you plan to do no further review.

There are so many things wrong with this workflow I wouldn't know where to start. The end result, though, is that it strongly discourages the occasional reviewer by making the review process cumbersome and confusing. I'll also point out that the process for *applying* a patch, if you don't subscribe to hackers and keep archives around on your personal machine for months, is also very cumbersome and error-prone. Copy and paste from a web page? Really?
Certainly we could spend the next 6 years incrementally improving the CF app "in our spare time". But it might be better to look at the code development tools which are already available? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
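Step 3 of the workflow described above — digging the Message-ID out of a sent mail — is the part people trip over most. For anyone who keeps a copy of sent mail on disk, a sketch of one way to extract it (the sample message below is a stand-in for your real sent-mail file):

```shell
# Create a stand-in for a saved copy of a review mail; in real use you
# would point the grep/sed at the file your MUA saved.
cat > review.eml <<'EOF'
From: reviewer@example.com
Message-ID: <CAFj8pRD1234@mail.example.com>
Subject: Review: some patch

Looks good.
EOF

# Pull out just the Message-ID value, stripping the header name and
# the surrounding angle brackets.
grep -i -m1 '^Message-ID:' review.eml | sed -e 's/^[Mm]essage-[Ii][Dd]:[[:space:]]*<//' -e 's/>[[:space:]]*$//'
```

That bare value is what the CF app's comment form expects.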
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
Excerpts from Josh Berkus's message of lun ene 16 17:48:41 -0300 2012: > Putting submitters aside, I have to say based on teaching people how to > use the CF stuff on Thursday night that the process of submitting a > review of a patch is VERY unintuitive, or in the words of one reviewer > "astonishingly arcane". Summing up: > > 1. Log into CF. Claim the patch by editing it. > > 2. Write a review and email it to pgsql-hackers. > > 3. Dig the messageID out of your sent mail. > > 4. Add a comment to the patch, type "Review" with the messageID, and > ideally a short summary comment of the review. > > 5. Edit the patch to change its status as well as to remove yourself as > reviewer if you plan to do no further review. > > There are so many things wrong with this workflow I wouldn't know where > to start. Other than having to figure out Message-Ids, which most MUAs seem to hide as much as possible, is there anything here of substance? I mean, if getting a message-id from Gmail is all that complicated, please complain to Google. I mean, is email arcane? Surely not. Are summary lines arcane? Give me a break. So the only real complaint point here is message-id, which normally people don't care about and don't even know exists. So they have to learn about it. Let's keep in mind that pgsql-hackers email is our preferred form of communication. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
> I mean, is email arcane? Surely not. Are summary lines arcane? Give > me a break. So the only real complaint point here is message-id, which > normally people don't care about and don't even know exists. So > they have to learn about it. The complaint is that the reviewer is expected to use two different and wholly incompatible methods of communication, each of which requires a separate registration, to post the review. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] SKIP LOCKED DATA
On 1/15/12 3:01 PM, Thomas Munro wrote: > 5. Probably many other things that I'm not aware of right now and > won't discover until I dig/ask further and/or run into a brick wall! > > Useful? Doable? Useful, yes. Harder than it looks, probably. I tried to mock up a version of this years ago for a project where I needed it, and ran into all kinds of race conditions. Anyway, if it could be made to work, this is extremely useful for any application which needs queueing behavior (which is a large plurality, if not a majority, of applications). -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
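For the queueing case being discussed, the appeal is that a worker can claim the next unclaimed row without blocking behind other workers. A sketch of how that might look with the proposed clause — the syntax here mirrors Oracle's SKIP LOCKED and is hypothetical for PostgreSQL at this point, and the jobs table is purely illustrative:

```sql
BEGIN;

-- Claim the oldest pending job, skipping any rows other workers
-- already hold row locks on (hypothetical SKIP LOCKED DATA clause).
SELECT id, payload
  FROM jobs
 WHERE status = 'pending'
 ORDER BY id
 LIMIT 1
   FOR UPDATE SKIP LOCKED DATA;

-- ...process the job, then mark it done under the same row lock...
UPDATE jobs SET status = 'done' WHERE id = 42;  -- the id returned above

COMMIT;
```

Without the clause, the SELECT ... FOR UPDATE would block on whichever row the previous worker claimed, serializing the workers — which is exactly the race-condition-prone behavior a hand-rolled mock-up runs into.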
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
On Mon, Jan 16, 2012 at 1:10 PM, Alvaro Herrera wrote: > > Excerpts from Josh Berkus's message of lun ene 16 17:48:41 -0300 2012: > >> Putting submitters aside, I have to say based on teaching people how to >> use the CF stuff on Thursday night that the process of submitting a >> review of a patch is VERY unintuitive, or in the words of one reviewer >> "astonishingly arcane". Summing up: >> >> 1. Log into CF. Claim the patch by editing it. >> >> 2. Write a review and email it to pgsql-hackers. >> >> 3. Dig the messageID out of your sent mail. >> >> 4. Add a comment to the patch, type "Review" with the messageID, and >> ideally a short summary comment of the review. >> >> 5. Edit the patch to change its status as well as to remove yourself as >> reviewer if you plan to do no further review. >> >> There are so many things wrong with this workflow I wouldn't know where >> to start. > > Other than having to figure out Message-Ids, which most MUAs seem to > hide as much as possible, is there anything here of substance? I find it an annoyance, but can't get too worked up over it. > I mean, > if getting a message-id from Gmail is all that complicated, please > complain to Google. But after digging the message-id out of gmail and entering it into the commitfest app, the resulting link is broken because the email has not yet shown up in the archives. So now I have to wonder if I did something wrong, and keep coming back every few hours to see if it will start working. > > I mean, is email arcane? Surely not. Are summary lines arcane? The way you have to set them is pretty arcane. Again, I can't get too worked up over it, but if it were made simpler I'd be happier. Cheers, Jeff
Re: [HACKERS] Why is CF 2011-11 still listed as "In Progress"?
On 1/16/12 11:42 AM, Robert Haas wrote: > But, I've noticed that nothing good comes of me pressing my own view > too hard. Either we as a community value having the CommitFest wrap > up in a reasonable period of time, or we don't. Reality is, alas, not nearly so binary as this, and therein lie the delays. While almost everyone agrees that ending the Commitfests on time is a good thing, almost everyone has at least one patch they would extend the CF in order to get done. This is the fundamental scheduling struggle of every single software project I've ever worked on, so I don't see why we would expect it to be different on PostgreSQL just because we adopted the CF model. The benefit of the CF process is that it makes it *visible* when we're getting behind. But it doesn't stop us from doing so. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
Excerpts from Jeff Janes's message of lun ene 16 18:37:59 -0300 2012: > > I mean, > > if getting a message-id from Gmail is all that complicated, please > > complain to Google. > > But after digging the message-id out of gmail and entering it into the > commitfest app, the resulting link is broken because the email has not > yet shown up in the archives. So now I have to wonder if I did > something wrong, and keep coming back every few hours to see if it will > start working. Hours? Unless a message is delayed for moderation, it should show up in archives within ten minutes. If you have problems finding emails after that period, by all means complain. Now that we've moved archives to a new host, perhaps we could rerun the archive script every two minutes instead of ten. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
On 01/16/2012 03:48 PM, Josh Berkus wrote: 3. Dig the messageID out of your sent mail. 4. Add a comment to the patch, type "Review" with the messageID, and ideally a short summary comment of the review. This is the time consuming part that would benefit the most from some automation. The message-id digging is an obvious sore spot, which is why I focused on improvements to eliminate so much of that first in my suggestions. The problem is that we don't actually want every message sent to the list on a thread to appear on the CF summary, and writing that short summary content is an important step. Archived messages deemed notable enough that someone linked the two are the only ones that appear in the patch history. That makes it possible to come up to speed on the most interesting history points of a patch in a reasonable period of time--even if you missed the earlier discussion. I think any of the other alternatives we might adopt would end up associating all of the e-mail history around a patch. That's the firehose, and spraying the CF app with it makes the whole thing a lot less useful. I don't think this is an unsolvable area to improve. It's been stuck behind the major postgresql.org site overhaul, which is done now. Adding some web service style APIs to probe the archives for message IDs by a) ancestor and b) author would make it possible to sand off a whole lot of rough edges here. While it's annoying in its current form, doing all my work based on message IDs has been a huge improvement over the old approach, where URLs into the archives were date based and not always permanent. I'll also point out that the process for *applying* a patch, if you don't subscribe to hackers and keep archives around on your personal machine for months, is also very cumbersome and error-prone. Copy and paste from a web page? Really? 
The most reasonable answer to this is for people to publish a git repo URL in addition to the "official" submission of changes to the list in patch form. And momentum toward doing that just keeps going up, even among longer term contributors who weren't git advocates at all a year ago during the transition. I nudged Simon that way and he's pushing branches for major patches but not small ones yet, it looks like Andrew fully embraced bitbucket recently, etc. We're 16 months into git adoption. I'm pretty happy with how well that's going. We don't need to add infrastructure to enable people to push code to github and link to their branch comparison repo viewer as a way to view the patch; that's already available to anyone who wants it. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Re: [HACKERS] Review: Non-inheritable check constraints
Excerpts from Nikhil Sontakke's message of vie dic 23 01:02:10 -0300 2011: > FWIW, here's a quick fix for the issue that Robert pointed out. Thanks, applied. > Again it's > a variation of the first issue that he reported. We have two functions > which try to merge existing constraints: > > MergeWithExistingConstraint() in heap.c > MergeConstraintsIntoExisting() in tablecmds.c > > Nice names indeed :) Yeah, I added a comment about the other one on each of them. > I have also tried to change the error message as per Alvaro's suggestions. > I will really try to see if we have other issues. Really cannot have Robert > reporting all the bugs and getting annoyed, but there are lotsa variations > possible with inheritance.. So did you find anything? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Non-inheritable check constraints
Excerpts from Nikhil Sontakke's message of vie dic 23 01:02:10 -0300 2011: > I have also tried to change the error message as per Alvaro's suggestions. > I will really try to see if we have other issues. Really cannot have Robert > reporting all the bugs and getting annoyed, but there are lotsa variations > possible with inheritance.. BTW thank you for doing the work on this. Probably the reason no one bothers too much with inheritance is that even something that's seemingly simple such as what you tried to do here has all these little details that's hard to get right. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inconsistent comparison of CHECK constraints
Excerpts from Tom Lane's message of lun ene 16 12:44:57 -0300 2012: > Alvaro Herrera writes: > > While reviewing Nikhil Sontakke's fix for the inherited constraints open > > item we have, I noticed that MergeWithExistingConstraint and > > MergeConstraintsIntoExisting are using rather different mechanism to > > compare equality of the constraint expressions; the former does this: > > > if (equal(expr, stringToNode(TextDatumGetCString(val > > > So plain string comparison of the node's string representation. > > No, that's *not* a "plain string comparison", and if it were it would be > wrong. This is doing equal() on the node trees, which is in fact the > correct implementation. Aha, that makes sense. > > MergeConstraintsIntoExisting is instead doing this: > > > if (acon->condeferrable != bcon->condeferrable || > > acon->condeferred != bcon->condeferred || > > strcmp(decompile_conbin(a, tupleDesc), > >decompile_conbin(b, tupleDesc)) != 0) > > That's kind of a crock, but it's necessary because it's trying to detect > equivalence of constraint expressions belonging to different tables, > which could have different physical column numbers as noted by the > comment. So I don't see a way to reduce it to a simple equal(). > But for constraints applicable to just one table, equal() should be > preferred as it's simpler and more reliable. It makes plenty of sense too. I've left the two separate implementations alone. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IDLE in transaction introspection
On Sun, Jan 15, 2012 at 4:28 AM, Greg Smith wrote: > On 01/12/2012 11:57 AM, Scott Mead wrote: > >> Pretty delayed, but please find the attached patch that addresses all the >> issues discussed. >> > > The docs on this v4 look like they suffered a patch order problem here. > In the v3, you added a whole table describing the pg_stat_activity > documentation in more detail than before. v4 actually tries to remove > those new docs, a change which won't even apply as they don't exist > upstream. > > My guess is you committed v3 to somewhere, applied the code changes for > v4, but not the documentation ones. It's easy to do that and end up with a > patch that removes a bunch of docs the previous patch added. I have to be > careful to always do something like "git diff origin/master" to avoid this > class of problem, until I got into that habit I did this sort of thing > regularly. > > gak Okay, I'll fix that and the rules.out regression that I missed --Scott OpenSCG, http://www.openscg.com > > -- > Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD > PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com > >
Re: [HACKERS] Review of: pg_stat_statements with query tree normalization
Excerpts from Daniel Farina's message of dom ene 15 08:41:55 -0300 2012: > Onto the mechanism: the patch is both a contrib and changes to > Postgres. The changes to postgres are mechanical in nature, but > voluminous. The key change is to not only remember the position of > Const nodes in the query tree, but also their length, and this change > is really extensive although repetitive. It was this swathe of change > that bitrotted the source, when new references to some flex constructs > were added. Every such reference needs to explicitly refer to > '.begins', which is the beginning position of the const -- this used > to be the only information tracked. Because .length was never > required by postgres in the past, it fastidiously bookkeeps that > information without ever referring to it internally: only > pg_stat_statements does. I wonder if it would make sense to split out those changes from the patch, including a one-member struct definition to the lexer source, which could presumably be applied in advance of the rest of the patch. That way, if other parts of the main patch are contentious, the tree doesn't drift under you. (Or rather, it still drifts, but you no longer care because your bits are already in.) -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] automating CF submissions (was xlog location arithmetic)
On 01/16/2012 05:25 PM, Greg Smith wrote: The most reasonable answer to this is for people to publish a git repo URL in addition to the "official" submission of changes to the list in patch form. And momentum toward doing that just keeps going up, even among longer term contributors who weren't git advocates at all a year during the transition. I nudged Simon that way and he's pushing branches for major patches but not small ones yet, it looks like Andrew fully embraced bitbucket recently, etc. If we're going to do that, the refspec to be pulled needs to be a tag, I think, not just a branch, and people would have to get into the habit of tagging commits and explicitly pushing tags. I probably should be doing that, and it is now built into the buildfarm client release mechanism, but I usually don't when just publishing dev work. Guess I need to start. I'll probably use tag names like branch-MMDDHHMM. I certainly like the idea of just being able to pull in a tag from a remote instead of applying a patch. (BTW, I use both bitbucket and github. They both have advantages.) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
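The tag-and-push-explicitly habit described above can be sketched in a few commands. This is a self-contained demo (repository layout, tag name, and file contents are all illustrative); in real use you would run only the tag/push/fetch commands inside your own patch repository:

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

git init -q --bare upstream.git          # stands in for your public repo
git init -q patchwork && cd patchwork    # stands in for your dev checkout
git config user.email dev@example.com
git config user.name "Dev"
echo "patch contents" > feature.c
git add feature.c && git commit -qm "group commit rework"
git remote add origin ../upstream.git

# Tag the exact submission point and push the tag explicitly --
# a plain "git push" would not transfer it.
git tag groupcommit-201201162030
git push -q origin HEAD groupcommit-201201162030

# A reviewer can now fetch exactly that tag rather than a moving branch:
git ls-remote origin refs/tags/groupcommit-201201162030
```

Fetching a tag instead of a branch tip is what makes the submission reproducible: the branch can keep moving while the tag pins what was actually reviewed.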
Re: [HACKERS] Review of: pg_stat_statements with query tree normalization
On 01/16/2012 06:19 PM, Alvaro Herrera wrote: I wonder if it would make sense to split out those changes from the patch, including a one-member struct definition to the lexer source, which could presumably be applied in advance of the rest of the patch. That way, if other parts of the main patch are contentious, the tree doesn't drift under you. (Or rather, it still drifts, but you no longer care because your bits are already in.) The way this was packaged up was for easier reviewer consumption, just pull down the whole thing and run with it. I was already thinking that if we've cleared the basics with a positive review and are moving more toward commit, it would be better to have it split into three pieces: -Core parsing changes -pg_stat_statements changes -Test programs And then work through those in that order. Whether or not the test programs even go into core as contrib code is a useful open question. While Peter had a version of this that worked completely within the boundaries of an extension, no one was really happy with that. At a minimum the .length changes really need to land in 9.2 to enable this feature to work well. As Daniel noted, it's a lot of code changes, but not a lot of code complexity. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review of: pg_stat_statements with query tree normalization
Alvaro Herrera writes: > Excerpts from Daniel Farina's message of dom ene 15 08:41:55 -0300 2012: >> Onto the mechanism: the patch is both a contrib and changes to >> Postgres. The changes to postgres are mechanical in nature, but >> voluminous. The key change is to not only remember the position of >> Const nodes in the query tree, but also their length, and this change >> is really extensive although repetitive. > I wonder if it would make sense to split out those changes from the > patch, including a one-member struct definition to the lexer source, > which could presumably be applied in advance of the rest of the patch. > That way, if other parts of the main patch are contentious, the tree > doesn't drift under you. (Or rather, it still drifts, but you no longer > care because your bits are already in.) Well, short of seeing an acceptable patch for the larger thing, I don't want to accept a patch to add that field to Const, because I think it's a kluge. I'm still feeling that there must be a better way ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review of: pg_stat_statements with query tree normalization
On 16 January 2012 23:43, Greg Smith wrote: > While Peter had a version of this that worked completely within the > boundaries of an extension, no one was really happy with that. At a minimum > the .length changes really need to land in 9.2 to enable this feature to > work well. As Daniel noted, it's a lot of code changes, but not a lot of > code complexity. Right. As I've said in earlier threads, we're mostly just making the YYLTYPE representation closer to that of the default, which has the following fields: first_line, first_column, last_line and last_column. I just want two fields. I think we've already paid most of the price that this imposes, by using the @n feature in the first place. Certainly, I couldn't isolate any additional overhead. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is CF 2011-11 still listed as "In Progress"?
On 01/16/2012 02:42 PM, Robert Haas wrote: But, I've noticed that nothing good comes of me pressing my own view too hard. Either we as a community value having the CommitFest wrap up in a reasonable period of time, or we don't. If we do, then let's make it happen together. If we don't, then let's resign ourselves now to the fact that 9.2 will not hit the shelves for a very long time. I think this is getting more predictable simply based on having some history. The trail blazing you led here for some time didn't know what was and wasn't possible yet. I feel that the basic shape of things, while still fuzzy in spots, is a lot more clear now. We need to have schedule goals. There needs to be a date where we switch toward a more aggressive "is someone going to commit this soon?" stance on things that are still open. At that point, someone needs to be the person not afraid to ramp up pressure toward returning things that just aren't going to make it to commit quality. That's a thankless task that rarely leaves anyone involved happy. But this project won't easily move to "ship on this date" instead of "ship when it's ready", and I don't want that to change. There's two sides to that. The quality control on most of the 100% date driven release software I use is terrible. The way the CF schedule is lined up now, there's enough margin in the schedule that we can handle some drift there, while still keeping the quality standards high. The currently open CF is probably going to take at least 6 weeks. That doesn't change the fact that hard decisions about return vs. continue toward possible commit should be accelerating by or before the 4 week mark. The other side to this is that when some big and/or hard features land does impact PostgreSQL's adoption. 
To close some of them, you almost need the sort of focus that only seems to come from recognizing you're past the original goal date, this one big thing is holding up forward progress, and everyone who can should be pushing on that usefully to wrap it up. Could the last 9.1 CF have closed up 1 to 2 weeks earlier if Sync Rep had been bumped? Probably. Was it worth going off-schedule by that much so that it did ship in 9.1? I think so. But with every day marching past the original goal, the thinking should turn toward what simplified subset is commit quality. If there's not a scaled down feature some trimming might extract and get to commit quality--which was the case with how Sync Rep ended up being committed--that one needs to close. The couple of weeks that 9.1 slipped past its target is about as bad as we can let this go now. I made one big mistake for the 2011-11 CF that I want to learn how to avoid next time. When we went past the schedule goal--closing on 12/15--I got most of the patches closed out. What I should have done at that point is push toward alpha3 release, even though there were ~5 things still open. The fact that some patches in that CF are still being tinkered with shouldn't delay a date-driven alpha drop. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
[HACKERS] Should we add crc32 in libpgport?
I have been working with xlogdump and noticed that unfortunately it cannot be installed without access to a postgres build directory, which makes the exported functionality in src/include/utils/pg_crc.h useless unless one has access to pg_crc.o -- which would only happen if a build directory is lying around. Yet, pg_crc.h is *installed* in server/utils, at least from my Debian package. Worse yet, those crc implementations are the same but ever-so-slightly different (hopefully in an entirely non-semantically-important way). On more inspection, I also realized that the hstore and ltree contribs *also* have crc32 implementations, dating back to 2006 and 2002, respectively. So I think the following actions might make sense: * stop the distribution of pg_crc.h XOR include the crc tables into libpgport.a necessary to make them work * Utilize the canonical pgport implementation of crc in both contribs I tried my very best (mostly git log and reading the linked discussion in the archives, as well as searching the archives) to find any explanation why this is anything but an oversight that seems to consistently result in less-desirable engineering in anything that is compiled separately from Postgres but intends to link against it when it comes to computing a CRC. Copying CRC32 implementations everywhere is not the worst thing, but I find it inadequately explained why it's necessary for now, at least. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review of: pg_stat_statements with query tree normalization
On 16 January 2012 23:53, Tom Lane wrote: > Well, short of seeing an acceptable patch for the larger thing, I don't > want to accept a patch to add that field to Const, because I think it's > a kluge. I'm still feeling that there must be a better way ... What does an acceptable patch look like? Does your objection largely hinge on the fact that the serialisation is performed after the re-writing stage rather on the raw parse tree, or is it something else? Despite my full plate this commitfest, I am determined that this feature be available in 9.2, as I believe that it is very important. Instrumentation of queries is something that it just isn't possible to do well right now, with each of the available third party solutions or pg_stat_statements. That really needs to change. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review of: pg_stat_statements with query tree normalization
On Mon, Jan 16, 2012 at 3:53 PM, Tom Lane wrote: > Well, short of seeing an acceptable patch for the larger thing, I don't > want to accept a patch to add that field to Const, because I think it's > a kluge. I'm still feeling that there must be a better way ... Hm. Maybe it is tractable to find the position of the lexeme immediately after the Const? Outside of the possible loss of whitespace (is that a big deal? I'm not sure) that could do the trick...after all, the entire lexeme stream is annotated with the beginning position, if memory serves, and that can be related in some way to the end position of the previous lexeme, sort-of. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should we add crc32 in libpgport?
On Mon, Jan 16, 2012 at 4:56 PM, Daniel Farina wrote: > I have been working with xlogdump and noticed that unfortunately it > cannot be installed without access to a postgres build directory, > which makes the exported functionality in src/include/utils/pg_crc.h > useless unless one has access to pg_crc.o -- which would only happen > if a build directory is lying around. Yet, pg_crc.h is *installed* in > server/utils, at least from my Debian package. Worse yet, those crc > implementations are the same but ever-so-slightly different (hopefully > in an entirely non-semantically-important way). Out-of-order editing snafu. "Worse yet, those crc implementations are the..." should come after the note about there being two additional crc implementations in the postgres contribs. Looking back on it, it's obvious why those contribs had it in the first place: because they started external, and needed CRC, and the most expedient thing to do is include yet another implementation. So arguably this problem has occurred three times: in xlogdump, and in pre-contrib versions of hstore, and ltree. It's just the latter two sort of get a free pass by the virtue of having access to the postgres build directory as contribs in this day and age. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is CF 2011-11 still listed as "In Progress"?
On Mon, Jan 16, 2012 at 7:01 PM, Greg Smith wrote: > I think this is getting more predictable simply based on having some > history. The trail blazing you led here for some time didn't know what was > and wasn't possible yet. I feel that the basic shape of things, while still > fuzzy in spots, is a lot more clear now. > > We need to have schedule goals. There needs to be a date where we switch > toward a more aggressive "is someone going to commit this soon?" stance on > things that are still open. At that point, someone needs to be the person > not afraid to ramp up pressure toward returning things that just aren't > going to make it to commit quality. That's a thankless task that rarely > leaves anyone involved happy. There's some value in deciding early that there's no hope for a given patch, because it frees up resources, of which we do not have an unlimited supply, to deal with other patches. I have mixed feelings about the whole 4-weeks vs. 6-weeks thing with respect to the final CommitFest. If everyone signed up to review as many patches as they submitted, then on day one of the CommitFest every patch would have a reviewer, and if anyone who didn't submit patches signed up as well, some patches would have more than one. In a week, they'd all have an initial review, and in two weeks they'd all have had two reviews, and anything that wasn't ready at that point could be punted while the committers ground through the rest. In fact, we have a gigantic number of patches that have no reviewer assigned, and as the CommitFest goes on and on, the number of people who still have the energy to keep slogging through the pile is going to steadily diminish. So when we say that we're going to let the CommitFest go on for 6 weeks, what we're really saying is that we'd like to reward the few brave souls who will actually keep plugging at this for 4 weeks by asking them to go another 2 weeks after that. 
Or in some cases, what we're saying is that we'd like to give patch authors another two weeks to finish work that should really have been done before submitting the patch in the first place. Now, on the flip side, I think that if we get the CommitFest done in 6 weeks, that will be almost as good as getting it done in 4 weeks, because the last two release cycles I've put huge amounts of energy into trying to get the release stable enough to release before July and August roll around and everybody disappears. It didn't work, either time. If that's not going to happen anyway, then there's not really much point in getting stressed about another week or two. On the other hand, speaking only for myself, if I review and/or commit 15 patches in the next month - i.e. three times the number I've submitted, and note that most of what I've submitted for this CommitFest is pretty simple stuff - I'm not going to be very enthusiastic about taking another two weeks to pick up 7 or 8 more. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checkpoint sync pause
On 01/16/2012 11:00 AM, Robert Haas wrote: Also, I am still struggling with what the right benchmarking methodology even is to judge whether any patch in this area "works". Can you provide more details about your test setup? The "test" setup is a production server with a few hundred users at peak workload, reading and writing to the database. Each RAID controller (couple of them with their own tablespaces) has either 512MB or 1GB of battery-backed write cache. The setup that leads to the bad situation happens like this: -The steady stream of backend writes that happen between checkpoints has filled up most of the OS write cache. A look at /proc/meminfo shows around 2.5GB "Dirty:" -Since we have shared_buffers set to 512MB to try and keep checkpoint storms from being too bad, there might be 300MB of dirty pages involved in the checkpoint. The write phase dumps this all into Linux's cache. There's now closer to 3GB of dirty data there. @64GB of RAM, this is still only 4.7% though--just below the effective lower range for dirty_background_ratio. Linux is perfectly content to let it all sit there. -Sync phase begins. Between absorption and the new checkpoint writes, there are >300 segments to sync waiting here. -The first few syncs force data out of Linux's cache and into the BBWC. Some of these return almost instantly. Others block for a moderate number of seconds. That's not necessarily a showstopper, on XFS at least. So long as the checkpointer is not being given all of the I/O in the system, the fact that it's stuck waiting for a sync doesn't mean the server is unresponsive to the needs of other backends.
Early data might look like this: DEBUG: Sync #1 time=21.969000 gap=0.00 msec DEBUG: Sync #2 time=40.378000 gap=0.00 msec DEBUG: Sync #3 time=12574.224000 gap=3007.614000 msec DEBUG: Sync #4 time=91.385000 gap=2433.719000 msec DEBUG: Sync #5 time=2119.122000 gap=2836.741000 msec DEBUG: Sync #6 time=67.134000 gap=2840.791000 msec DEBUG: Sync #7 time=62.005000 gap=3004.823000 msec DEBUG: Sync #8 time=0.004000 gap=2818.031000 msec DEBUG: Sync #9 time=0.006000 gap=3012.026000 msec DEBUG: Sync #10 time=302.75 gap=3003.958000 msec [Here 'gap' is a precise measurement of how close the sync pause feature is working, with it set to 3 seconds. This is from an earlier version of this patch. All the timing issues I used to measure went away in the current implementation because it doesn't have to worry about doing background writer LRU work anymore, with the checkpointer split out] But after a few hundred of these, every downstream cache is filled up. The result is seeing more really ugly sync times, like #164 here: DEBUG: Sync #160 time=1147.386000 gap=2801.047000 msec DEBUG: Sync #161 time=0.004000 gap=4075.115000 msec DEBUG: Sync #162 time=0.005000 gap=2943.966000 msec DEBUG: Sync #163 time=962.769000 gap=3003.906000 msec DEBUG: Sync #164 time=45125.991000 gap=3033.228000 msec DEBUG: Sync #165 time=4.031000 gap=2818.013000 msec DEBUG: Sync #166 time=212.537000 gap=3039.979000 msec DEBUG: Sync #167 time=0.005000 gap=2820.023000 msec ... DEBUG: Sync #355 time=2.55 gap=2806.425000 msec LOG: Sync 355 files longest=45125.991000 msec average=1276.177977 msec At the same time #164 is happening, that 45 second long window, a pile of clients will get stuck where they can't do any I/O. The RAID controller that used to have a useful mix of data is now completely filled with >=512MB of random writes. It's now failing to write as fast as new data is coming in. Eventually that leads to pressure building up in Linux's cache. 
Now you're in the bad place: dirty_background_ratio is crossed, Linux is now worried about spooling all cached writes to disk as fast as it can, the checkpointer is sync'ing its own important data to disk as fast as it can too, and all caches are inefficient because they're full. To recreate a scenario like this, I've realized the benchmark needs to have a couple of characteristics: -It has to focus on transaction latency instead of throughput. We know that doing syncs more often will lower throughput due to reduced reordering etc. -It cannot run at maximum possible speed all the time. It needs to be the case that the system keeps up with the load during the rest of the time, but the sync phase of checkpoints causes I/O to queue faster than it's draining, thus saturating all caches and then blocking backends. Ideally, "Dirty:" in /proc/meminfo will reach >90% of the dirty_background_ratio trigger line around the same time the sync phase starts. -There should be a lot of clients doing a mix of work. The way Linux I/O works, the scheduling for readers vs. writers is complicated, and this is one of the few areas where things like CFQ vs. Deadline matter. I've realized now one reason I never got anywhere with this while running pgbench tests is that pgbench always runs at 100%
Re: [HACKERS] age(xid) on hot standby
Excerpts from Tom Lane's message of lun ene 16 12:27:03 -0300 2012: > > Alvaro Herrera writes: > > Excerpts from Peter Eisentraut's message of dom ene 15 10:00:03 -0300 2012: > >> On ons, 2011-12-28 at 14:35 -0500, Tom Lane wrote: > >>> The trouble with using ReadNewTransactionId is that it makes the results > >>> volatile, not stable as the function is declared to be. > > >> Could we alleviate that problem with some caching within the function? > > > Maybe if we have it be invalidated at transaction end, that could work. > > So each new transaction would get a fresh value. > > Yeah, I think that would work. So who's going to work on a patch? Peter, are you? If not, we should add it to the TODO list. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checkpoint sync pause
On 1/16/12 5:59 PM, Greg Smith wrote: > > What I think is needed instead is a write-heavy benchmark with a think > time in it, so that we can dial the workload up to, say, 90% of I/O > capacity, but that spikes to 100% when checkpoint sync happens. Then > rearrangements in syncing that reduces caching pressure should be > visible as a latency reduction in client response times. My guess is > that dbt-2 can be configured to provide such a workload, and I don't see > a way forward here except for me to fully embrace that and start over > with it. You can do this with custom pgbench workloads, thanks to random and sleep functions. Somebody went and made pgbench programmable, I don't remember who. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
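[Editor's note: for illustration, a custom pgbench script along those lines might look like the following. This is a hedged sketch using pgbench's \setrandom and \sleep meta-commands against the standard pgbench_accounts table; the sleep length is arbitrary and would need tuning until steady-state I/O sits at roughly the 90% target Greg describes.]

```sql
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom delta -5000 5000
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
END;
\sleep 50 ms
```

Run with something like pgbench -f thinktime.sql -c 32 -T 900 (the file name is hypothetical), then adjust the sleep and client count until the system keeps up between checkpoints but saturates during the sync phase.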
Re: [HACKERS] [v9.2] sepgsql's DROP Permission checks
On Tue, Jan 10, 2012 at 7:51 AM, Kohei KaiGai wrote: > The attached patch adds OAT_DROP object-access-hook around permission > checks of object deletion. > Due to the previous drop statement reworks, the number of places to > put this hook is limited to these six points: RemoveObjects, > RemoveRelations, ATExecDropColumn, dropdb, DropTableSpace and > shdepDropOwned(). > > In sepgsql side, it checks {drop} permission for each object types, > and {remove_name} permission to the schema that owning the object > being removed. I'm still considering whether the drop permission > should be applied on objects being removed in cascade mode. > It is not difficult to implement. We can determine the bahavior on > object deletion with same manner of creation; that saves contextual > information using ProcessUtility hook. > > At this moment, the current proposed patch does not apply checks on > cascaded deletion, according to SQL behavior. However, my concern is > that user can automatically have right to remove all the objects > underlying a partidular schema being removable, even if individual > tables or functions are not able to be removed. > > So, my preference is sepgsql references dependency tables to check > {drop} permissions of involved objects, not only the target object. Hmm. I kind of wonder if we shouldn't just have the OAT_DROP hook get invoked for every actual drop, rather than only for the top-level object. It's somewhat appealing to have the hook more-or-less match up the permissions checks, as you have it here, but in general it seems more useful to have a callback for each thing dropped than to have a callback for each thing named to be dropped. What is your opinion? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should we add crc32 in libpgport?
On Mon, Jan 16, 2012 at 8:09 PM, Daniel Farina wrote: > On Mon, Jan 16, 2012 at 4:56 PM, Daniel Farina wrote: >> I have been working with xlogdump and noticed that unfortunately it >> cannot be installed without access to a postgres build directory, >> which makes the exported functionality in src/include/utils/pg_crc.h >> useless unless one has access to pg_crc.o -- which would only happen >> if a build directory is lying around. Yet, pg_crc.h is *installed* in >> server/utils, at least from my Debian package. Worse yet, those crc >> implementations are the same but ever-so-slightly different (hopefully >> in an entirely non-semantically-important way). > > Out-of-order editing snafu. "Worse yet, those crc implementations are > the..." should come after the note about there being two additional > crc implementations in the postgres contribs. > > Looking back on it, it's obvious why those contribs had it in the > first place: because they started external, and needed CRC, and the > most expedient thing to do is include yet another implementation. So > arguably this problem has occurred three times: in xlogdump, and in > pre-contrib versions of hstore, and ltree. It's just the latter two > sort of get a free pass by the virtue of having access to the postgres > build directory as contribs in this day and age. I think you make a compelling case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should we add crc32 in libpgport?
On Mon, Jan 16, 2012 at 6:18 PM, Robert Haas wrote: > I think you make a compelling case. That's enough for me to just do it. Expect a patch soon. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is CF 2011-11 still listed as "In Progress"?
On 01/16/2012 08:27 PM, Robert Haas wrote: the last two release cycles I've put huge amounts of energy into trying to get the release stable enough to release before July and August roll around and everybody disappears. It didn't work, either time. If that's not going to happen anyway, then there's not really much point in getting stressed about another week or two. Adjusting that expectation is another side to pragmatism based on recent history I think needs to be acknowledged, but is unlikely to be improved on. 9.0 shipped on September 20. 9.1 shipped on September 11. If we say the last CF of each release is unlikely to wrap up before early March each year, that's 6 months of "settling" time between major feature freeze and release. So far that seems to result in stable releases to be proud of, on a predictable enough yearly schedule. Trying to drop the settling time has been frustrating for you, difficult to accomplish, and I'm unsure it's even necessary. Yes, there are some users of PostgreSQL who feel the yearly release cycle is too slow. As I was saying upthread, I don't see any similarly complicated projects doing better whose QA hasn't suffered for it. Are there any examples of serious database software that navigate the new features vs. low bug count trade-off as well as PostgreSQL, while also releasing more often? The one thing that really wasn't acceptable was holding off all new development during the entire freeze period. Branching 9.2 much earlier, then adding the June CommitFest last year, seems to have released a lot of the pressure there. Did it push back the 9.1 release or drop its quality level? Those two things are not decoupled. I think we'd need to look at "fixes backported to 9.1 after 9.2 was branched" to see how much benefit there was to holding off release until September, instead of the July/August time-frame you were pushing for. Could 9.1 have shipped in July and kept the same quality level? 
My guess is that the additional delay had some value for smoking bugs out. Would have to actually look at the commit history more closely to have an informed opinion on that. I find your tone during this thread a bit strange. I see the way you in particular have pushed on formalizing the CommitFest process the last few years to be a big success. I've been staring at the approaching work left on 9.2, finding a successful track record that outlines a game plan for what's left, even seeing enough data for rough metrics on how long things should take. That's a huge step forward for everyone compared to the state of things a few years ago, where the state of the art was a patch queue in everyone's mailbox, and new submitters had no idea when they'd get feedback. Your hoping that it was possible to get releases out in the summer of each year hasn't worked out so far. I know that was frustrating for you, but I certainly don't see that as a failure; just something we've now seen enough feedback on to acknowledge and accept as impractical. If the flood of last minute submissions right before the freeze submission deadline takes 6 weeks to clear now, that still seems a whole lot better than what I remember of 8.3 and 8.4 development. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is CF 2011-11 still listed as "In Progress"?
On Mon, Jan 16, 2012 at 10:00 PM, Greg Smith wrote: > On 01/16/2012 08:27 PM, Robert Haas wrote: >> the last two release cycles I've put huge amounts of energy >> into trying to get the release stable enough to release before July >> and August roll around and everybody disappears. It didn't work, >> either time. If that's not going to happen anyway, then there's not >> really much point in getting stressed about another week or two. > > Adjusting that expectation is another side to pragmatism based on recent > history I think needs to be acknowledged, but is unlikely to be improved on. > 9.0 shipped on September 20. 9.1 shipped on September 11. If we say the > last CF of each release is unlikely to wrap up before early March each year, > that's 6 months of "settling" time between major feature freeze and release. > So far that seems to result in stable releases to be proud of, on a > predictable enough yearly schedule. Trying to drop the settling time has > been frustrating for you, difficult to accomplish, and I'm unsure it's even > necessary. > > Yes, there are some users of PostgreSQL who feel the yearly release cycle is > too slow. As I was saying upthread, I don't see any similarly complicated > projects doing better whose QA hasn't suffered for it. Are there any > examples of serious database software that navigate the new features vs. low > bug count trade-off as well as PostgreSQL, while also releasing more often? Totally agreed, on all of the above. > The one thing that really wasn't acceptable was holding off all new > development during the entire freeze period. Branching 9.2 much earlier, > then adding the June CommitFest last year, seems to have released a lot of > the pressure there. Also agreed. > Did it push back the 9.1 release or drop its quality > level? Those two things are not decoupled. 
> I think we'd need to look at > "fixes backported to 9.1 after 9.2 was branched" to see how much benefit > there was to holding off release until September, instead of the July/August > time-frame you were pushing for. Could 9.1 have shipped in July and kept > the same quality level? My guess is that the additional delay had some > value for smoking bugs out. Would have to actually look at the commit > history more closely to have an informed opinion on that. Having looked over the commit history just now and thought about it a bit, I don't think it either pushed back the 9.1 release or dropped its quality level. We were still fixing bugs over the summer, and most of those wouldn't have been found any sooner if we hadn't branched. Actually, I think 9.1 has probably been the most solid release of the three I've been around long enough to remember; and maybe the best feature set, too. > I find your tone during this thread a bit strange. I see the way you in > particular have pushed on formalizing the CommitFest process the last few > years to be a big success. Thanks; I appreciate the sentiment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should we add crc32 in libpgport?
Daniel Farina writes: > Copying CRC32 implementations everywhere is not the worst thing, but I > find it inadequately explained why it's necessary for now, at least. Agreed, but I don't care for your proposed solution (put it in libpgport) because that assumes a fact not in evidence, namely that external projects have access to libpgport either. Is it possible to put enough stuff in pg_crc.h so that external code could just include that, perhaps after an extra #define to enable extra code? In the worst case we could just do #ifdef PROVIDE_CRC_IMPLEMENTATION ... current contents of pg_crc.c ... #endif but perhaps there's some intermediate possibility that's less ugly. As for whether we could drop the existing near-duplicate code in contrib/, I think we'd first have to convince ourselves that it was functionally identical, because otherwise replacing those versions would break existing ltree and hstore indexes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Arithmetic operators for macaddr type
On Tue, Dec 13, 2011 at 2:16 PM, Brendan Jurd wrote:
> On 12 December 2011 15:59, Pavel Stehule wrote:
>> 2011/12/12 Brendan Jurd :
>>> I just bumped into a situation where I wanted to do a little macaddr
>>> arithmetic in postgres. I note that the inet type has support for
>>> bitwise AND, OR and NOT, as well as subtraction, but macaddr has none
>>> of the above.
>>
>> +1
>
> Here is a patch for $SUBJECT. I merely added support for ~, & and |
> operators for the macaddr type. The patch itself is rather trivial,
> and includes regression tests and a doc update.

The patch looks fine except that it uses the OIDs already used in pg_proc.h. Attached is the updated version of the patch, which fixes the above problem.

> For the documentation, I did think about adding a new table for the
> macaddr operators, but in the end decided it would probably be an
> overkill.

Agreed.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 8300,8306 ****
      The macaddr type also supports the standard relational
      operators (>, <=, etc.) for
!     lexicographical ordering.
--- 8300,8308 ----
      The macaddr type also supports the standard relational
      operators (>, <=, etc.) for
!     lexicographical ordering, and the bitwise arithmetic operators
!     (~, & and |)
!     for NOT, AND and OR.

*** a/src/backend/utils/adt/mac.c
--- b/src/backend/utils/adt/mac.c
***************
*** 242,247 **** hashmacaddr(PG_FUNCTION_ARGS)
--- 242,300 ----
  }

  /*
+  * Arithmetic functions: bitwise NOT, AND, OR.
+  */
+ Datum
+ macaddr_not(PG_FUNCTION_ARGS)
+ {
+     macaddr *addr = PG_GETARG_MACADDR_P(0);
+     macaddr *result;
+
+     result = (macaddr *) palloc(sizeof(macaddr));
+     result->a = ~addr->a;
+     result->b = ~addr->b;
+     result->c = ~addr->c;
+     result->d = ~addr->d;
+     result->e = ~addr->e;
+     result->f = ~addr->f;
+     PG_RETURN_MACADDR_P(result);
+ }
+
+ Datum
+ macaddr_and(PG_FUNCTION_ARGS)
+ {
+     macaddr *addr1 = PG_GETARG_MACADDR_P(0);
+     macaddr *addr2 = PG_GETARG_MACADDR_P(1);
+     macaddr *result;
+
+     result = (macaddr *) palloc(sizeof(macaddr));
+     result->a = addr1->a & addr2->a;
+     result->b = addr1->b & addr2->b;
+     result->c = addr1->c & addr2->c;
+     result->d = addr1->d & addr2->d;
+     result->e = addr1->e & addr2->e;
+     result->f = addr1->f & addr2->f;
+     PG_RETURN_MACADDR_P(result);
+ }
+
+ Datum
+ macaddr_or(PG_FUNCTION_ARGS)
+ {
+     macaddr *addr1 = PG_GETARG_MACADDR_P(0);
+     macaddr *addr2 = PG_GETARG_MACADDR_P(1);
+     macaddr *result;
+
+     result = (macaddr *) palloc(sizeof(macaddr));
+     result->a = addr1->a | addr2->a;
+     result->b = addr1->b | addr2->b;
+     result->c = addr1->c | addr2->c;
+     result->d = addr1->d | addr2->d;
+     result->e = addr1->e | addr2->e;
+     result->f = addr1->f | addr2->f;
+     PG_RETURN_MACADDR_P(result);
+ }
+
+ /*
+  * Truncation function to allow comparing mac manufacturers.
   * From suggestion by Alex Pilosov
   */

*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
***************
*** 1116,1121 **** DESCR("greater than");
--- 1116,1128 ----
  DATA(insert OID = 1225 ( ">=" PGNSP PGUID b f f 829 829 16 1223 1222 macaddr_ge scalargtsel scalargtjoinsel ));
  DESCR("greater than or equal");
+ DATA(insert OID = 3141 ( "~" PGNSP PGUID l f f 0 829 829 0 0 macaddr_not - - ));
+ DESCR("bitwise not");
+ DATA(insert OID = 3142 ( "&" PGNSP PGUID b f f 829 829 829 0 0 macaddr_and - - ));
+ DESCR("bitwise and");
+ DATA(insert OID = 3143 ( "|" PGNSP PGUID b f f 829 829 829 0 0 macaddr_or - - ));
+ DESCR("bitwise or");
+
  /* INET type (these also support CIDR via implicit cast) */
  DATA(insert OID = 1201 ( "=" PGNSP PGUID b t t 869 869 16 1201 1202 network_eq eqsel eqjoinsel ));
  DESCR("equal");

*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 2039,2044 **** DATA(insert OID = 834 ( macaddr_ge PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16
--- 2039,2047 ----
  DATA(insert OID = 835 ( macaddr_ne PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "829 829" _null_ _null_ _null_ _null_ macaddr_ne _null_ _null_ _null_ ));
  DATA(insert OID = 836 ( macaddr_cmp PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 23 "829 829" _null_ _null_ _null_ _null_ macaddr_cmp _null_ _null_ _null_ ));
  DESCR("less-equal-greater");
+ DATA(insert OID = 3144 ( macaddr_not PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 829 "829" _null_ _null_ _null_ _null_ macaddr_not _null_ _null_ _null_ ));
+ DATA(insert OID = 3145 ( macaddr_and PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 829 "829 829" _null_ _null_ _null_ _null_ macaddr_and _null_ _null_ _null_ ));
+ DATA(insert OID = 3146 ( macaddr_or PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 829 "829 829" _null_ _null_ _null_ _nu
Re: [HACKERS] foreign key locks, 2nd attempt
On 16.01.2012 21:52, Alvaro Herrera wrote: Excerpts from Heikki Linnakangas's message of lun ene 16 16:17:42 -0300 2012: On 15.01.2012 06:49, Alvaro Herrera wrote: - pg_upgrade bits are missing. I guess we'll need to rewrite pg_multixact contents in pg_upgrade. Is the page format backwards-compatible? It's not. I haven't worked out what pg_upgrade needs to do, honestly. I think we should just not copy old pg_multixact files when upgrading across this patch. Sorry, I meant whether the *data* page format is backwards-compatible? The multixact page format clearly isn't. I was initially thinking that pg_multixact should return the empty set if asked for members of a multi that preceded the freeze point. That way, I thought, we would just never try to access a page that originated in the older version (assuming the freeze point is set to "current" whenever pg_upgrade runs). However, as things currently stand, accessing an old multi raises an error. So maybe we need a scheme a bit more complex to handle this. Hmm, could we create new multixact files filled with zeros, covering the range that was valid in the old cluster? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Non-inheritable check constraints
>> I have also tried to change the error message as per Alvaro's
>> suggestions.
>>
>> I will really try to see if we have other issues. Really cannot have
>> Robert reporting all the bugs and getting annoyed, but there are lotsa
>> variations possible with inheritance..
>
> So did you find anything?

Nothing much as yet. I think we are mostly there with the code, but I will keep trying some more variations along the lines of what Robert tried out, whenever I get the time next.

Regards,
Nikhils
Re: [HACKERS] Review: Non-inheritable check constraints
>> I will really try to see if we have other issues. Really cannot have
>> Robert reporting all the bugs and getting annoyed, but there are lotsa
>> variations possible with inheritance..
>
> BTW thank you for doing the work on this. Probably the reason no one
> bothers too much with inheritance is that even something that's
> seemingly simple such as what you tried to do here has all these little
> details that's hard to get right.

Thanks Alvaro. Yeah, inheritance is "semantics" quicksand :) Appreciate all the help from you, Robert and Alex Hunsaker on this.

Regards,
Nikhils
Re: [HACKERS] WAL Restore process during recovery
On Mon, Jan 16, 2012 at 2:06 AM, Simon Riggs wrote:
> WALRestore process asynchronously executes restore_command while
> recovery continues working.
>
> Overlaps downloading of next WAL file to reduce time delays in file
> based archive recovery.
>
> Handles cases of file-only and streaming/file correctly.

Though I've not reviewed the patch deeply yet, I observed the following two problems when I tested it.

When I set up streaming replication + archive (i.e., restore_command is set) and started the standby, I got the following error:

    FATAL:  all AuxiliaryProcs are in use
    LOG:  walrestore process (PID 18839) exited with exit code 1

When I started an archive recovery without setting restore_command, it successfully finished.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] psql case preserving completion
On Thu, Jan 12, 2012 at 5:29 AM, Peter Eisentraut wrote:
> In psql, the tab completion always converts key words to upper case. In
> practice, I and I think most users type in lower case. So then you end
> up with commands looking like this:
>
> => alter TABLE foo add CONSTRAINT bar check (a > 0);
>
> To address this, I have implemented a slightly different completion mode
> that looks at the word being completed and converts the completed word
> to the case of the original word. (Well, it looks at the first letter.)
>
> In fact, since almost all completions in psql are of this nature, I made
> this the default mode for COMPLETE_WITH_CONST and COMPLETE_WITH_LIST and
> added a new macro COMPLETE_WITH_LIST_CS that uses the old case-sensitive
> behavior. The latter is used mainly for completing backslash commands.
>
> After playing with this a little, I find the behavior more pleasing.
> Less yelling. ;-)
>
> Patch attached.

When I tested the patch, "create ta" was unexpectedly completed to "create TABLE", though "alter ta" was correctly completed to "alter table". As far as I read the patch, you seem to have forgotten to change create_or_drop_command_generator() or something.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
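The completion rule Peter describes ("look at the first letter of the word being completed and match its case") can be sketched in a few lines. This is an illustration of the intended behavior only; the function name and keyword list are made up for the sketch, while the real change lives in psql's tab-complete.c behind COMPLETE_WITH_LIST and the new case-sensitive COMPLETE_WITH_LIST_CS:

```python
# Sketch of case-preserving keyword completion: match candidates
# case-insensitively, then return them in the case style of what the
# user typed (judged by the first letter, as in the patch description).

KEYWORDS = ["TABLE", "TABLESPACE", "TRIGGER", "CONSTRAINT"]

def complete(prefix, keywords=KEYWORDS):
    """Return completions for prefix, cased like the user's input."""
    matches = [kw for kw in keywords if kw.upper().startswith(prefix.upper())]
    if not prefix or prefix[0].isupper():
        return matches                       # keep canonical upper case
    return [kw.lower() for kw in matches]    # user typed lower case

print(complete("ta"))  # ['table', 'tablespace']
print(complete("TA"))  # ['TABLE', 'TABLESPACE']
```

Fujii's bug report fits this model: if one code path (here, the hypothetical equivalent of create_or_drop_command_generator()) still returns the canonical upper-case match regardless of the typed prefix, "create ta" completes to "create TABLE" while other paths behave correctly.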