Re: [HACKERS] loss of transactions in streaming replication
On Thu, Oct 20, 2011 at 9:51 PM, Fujii Masao wrote: > On Thu, Oct 20, 2011 at 1:05 AM, Robert Haas wrote: >> OK, so this is an artifact of the changes to make libpq communication >> bidirectional. But I'm still confused about where the error is coming >> from. In your OP, you wrote "In 9.2dev and 9.1, when walreceiver >> detects an error while sending data to WAL stream, it always emits >> ERROR even if there are data available in the receive buffer." So >> that implied to me that this is only going to trigger if you have a >> shutdown together with an awkwardly-timed error. But your scenario >> for reproducing this problem doesn't seem to involve an error. > > Yes, my scenario doesn't cause any real error. My original description was > misleading. The following would be closer to the truth: > > "In 9.2dev and 9.1, when walreceiver detects the termination of replication > connection while sending data to WAL stream, it always emits ERROR > even if there are data available in the receive buffer." Ah, OK. I think I now agree that this is a bug and that we should fix and back-patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] loss of transactions in streaming replication
On Thu, Oct 20, 2011 at 1:05 AM, Robert Haas wrote: > OK, so this is an artifact of the changes to make libpq communication > bidirectional. But I'm still confused about where the error is coming > from. In your OP, you wrote "In 9.2dev and 9.1, when walreceiver > detects an error while sending data to WAL stream, it always emits > ERROR even if there are data available in the receive buffer." So > that implied to me that this is only going to trigger if you have a > shutdown together with an awkwardly-timed error. But your scenario > for reproducing this problem doesn't seem to involve an error. Yes, my scenario doesn't cause any real error. My original description was misleading. The following would be closer to the truth: "In 9.2dev and 9.1, when walreceiver detects the termination of replication connection while sending data to WAL stream, it always emits ERROR even if there are data available in the receive buffer." Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Hello, > > Robert Haas writes: > >> - Why does the second byte need special handling for 0xED and 0xF4? > > > > http://www.faqs.org/rfcs/rfc3629.html > > > > See section 4 in particular. The underlying requirement is to disallow > > multiple representations of the same Unicode code point. The special handling skips the utf8 code regions corresponds to the regions U+D800 - U+DFFF and U+11 - U+11 in ucs-4. The former is reserved for use with the UTF-16 encoding form as surrougate pairs and do not directly represent characters as described in section 3 of rfc3629. The latter is the region which is out of the utf-8 range by the definition described also in the same section. former> The definition of UTF-8 prohibits encoding character former> numbers between U+D800 and U+DFFF, which are reserved for former> use with the UTF-16 encoding form (as surrogate pairs) former> and do not directly represent characters. latter> In UTF-8, characters from the U+..U+10 range (the latter> UTF-16 accessible range) are encoded using sequences of 1 latter> to 4 octets. # However, I wrote this exception simplly mimicked the # pg_utf8_validator()'s behavior at the beginning. This must be the basis of the behavior of pg_utf8_verifier(), and pg_utf8_increment() has taken over it. It may be good to describe this origin of the special handling as comment of these functions to avoid this sort of confusion. > I'm still confused. The input string is already known to be valid > UTF-8, so the second byte (if there is one) must be between 0x80 and > 0xBF. Therefore it will be neither 0xED nor 0xF4. -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ProcessStandbyHSFeedbackMessage can make global xmin go backwards
I wrote: > ProcessStandbyHSFeedbackMessage has a race condition: it thinks it can > call GetOldestXmin and then the result will politely hold still while > it considers what to do next. But in fact, whoever has the oldest xmin > could exit their transaction, allowing the global minimum to advance. > If a VACUUM process then inspects the ProcArray, it could compute an > oldest xmin that is newer than the value that > ProcessStandbyHSFeedbackMessage installs just afterwards. So much > for keeping the data the standby wanted. Actually, it's worse than that. Clamping the walsender's xmin to GetOldestXmin() is useless, and if it weren't useless, this code would be completely wrong even discounting the bugs I pointed out already. Consider what it is we're trying to do here: we'd like to prevent VACUUMs on the master from deleting dead rows with xmax newer than the xmin the standby is requesting. Setting the walsender's xmin will only affect VACUUMs launched after that moment; anything that's already running will have already determined its threshold xmin, and perhaps already removed rows. You can't retroactively fix that. So clamping the walsender's xmin to GetOldestXmin doesn't actually do anything for data integrity; it only ensures that the value of GetOldestXmin doesn't go backwards on successive calls. But that's possible anyway, as the comments for that function already note. What's worse is that the only thing that is prevented from going backwards is the value of GetOldestXmin with allDbs = true. But that's only used by vacuums on shared catalogs. If we wanted to prevent the values seen by vacuums on non-shared tables from going backwards, we'd have to do some much-more-complex calculation that identified the largest value of GetOldestXmin with allDbs = false, across all databases. And that would *still* be wrong, because then the walsender would only be protecting data that is newer than the newest such value, which might allow data in other databases to be reclaimed too early. You could only really make this work by maintaining per-database xmins, which the standby isn't sending and there's no place to put into the walsender's ProcArray entry either. So I've concluded that there's just no point in the GetOldestXmin clamping, and we should just apply the xmin value we get from the standby if it passes basic sanity checks (the epoch test), and hope that we're not too late to prevent loss of the data the standby wanted. This line of reasoning also suggests that it's a pretty bad idea for the walsender to invalidate its existing xmin if it gets a transiently bogus (out of epoch) value from the standby. I think it should sit on the xmin it has, instead, to avoid potential loss of needed data. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI implementation question
On Thu, Oct 20, 2011 at 07:33:59AM -0500, Kevin Grittner wrote: > Dan Ports wrote: > > The part that's harder is building the list of potential conflicts > > that's used to identify safe snapshots for r/o transactions. That > > (currently) has to happen atomically taking the snapshot. > > That's not obvious to me; could you elaborate on the reasons? If the > commit sequence number is incremented under cover of an existing > ProcArrayLock, and the current value is assigned to a snapshot under > cover of same, acquiring SerializableXactHashLock after we get the > snapshot seems to work for SxactGlobalXmin and WritableSxactCount, > AFAICS. Well, whenever a r/w transaction commits or aborts, we need to check whether that caused any concurrent r/o transactions to have a known-safe or unsafe snapshot. The way that happens now is that, when a r/o transaction starts, it scans the list of r/w transactions and adds a pointer to itself in their sxact->possibleUnsafeConflicts. When one of them commits, it scans the list of possible conflicts and does the appropriate thing. That's not ideal because: - it requires modifing another transaction's sxact when registering a serializable transaction, so that means taking SerializableXactHashLock exclusive. - the set of concurrent transactions used to identify the possible conflicts needs to match the one used to build the snapshot. Otherwise, a transaction might commit between when the snapshot is take and when we find possible conflicts. (Holding SerializableXactHashLock prevents this.) > Yeah, I don't see how we can avoid taking a LW lock to get a > serializable transaction which needs a SERIALIZABLEXACT set up, but > it might be possible and worthwhile to split the uses of > SerializableXactHashLock so that it's not such a hotspot. Oh, right, one other problem is that the sxact free list is also protected by SerializableXactHashLock, so allocating from it requires locking. That one could be fixed by protecting it with its own lock, or (better yet) eventually moving to a lock-free implementation. In general, the contention problem is that SerializableXactHashLock basically protects all SSI state except the predicate locks themselves (notably, the dependency graph). This isn't partitioned at all, so looking at or modifying a single sxact requires locking the whole graph. I'd like to replace this with something finer-grained. Dan -- Dan R. K. Ports MIT CSAILhttp://drkp.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXECUTE tab completion
On 10/20/2011 09:53 PM, Tom Lane wrote: With that change, the correct test at line 795 would become else if (pg_strcasecmp(prev_wd, "DROP") == 0&& prev2_wd[0] == '\0') I've committed this --- please adjust the EXECUTE patch to match. Thanks for cleaning up the code to some sanity, I should have done so myself when I noticed the problem. A new version is attached. Best regards, Andreas -- Andreas Karlsson diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c new file mode 100644 index abf9bc7..ee63198 *** a/src/bin/psql/tab-complete.c --- b/src/bin/psql/tab-complete.c *** static const SchemaQuery Query_for_list_ *** 588,593 --- 588,598 " FROM pg_catalog.pg_available_extensions "\ " WHERE substring(pg_catalog.quote_ident(name),1,%d)='%s' AND installed_version IS NULL" + #define Query_for_list_of_prepared_statements \ + " SELECT pg_catalog.quote_ident(name) "\ + " FROM pg_catalog.pg_prepared_statements "\ + " WHERE substring(pg_catalog.quote_ident(name),1,%d)='%s'" + /* * This is a list of all "things" in Pgsql, which can show up after CREATE or * DROP; and there is also a query to get a list of them. *** psql_completion(char *text, int start, i *** 1640,1645 --- 1645,1656 COMPLETE_WITH_LIST(list_CSV); } + /* EXECUTE */ + /* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */ + else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 && + prev2_wd[0] == '\0') + COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements); + /* CREATE DATABASE */ else if (pg_strcasecmp(prev3_wd, "CREATE") == 0 && pg_strcasecmp(prev2_wd, "DATABASE") == 0) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on documentation builds on OSX w/ macports
--On 20. Oktober 2011 02:02:09 +0200 Florian Pflug wrote: In the mean time, the modified ports are all contained in the attached tar.bz2, should any of ye fellow OSX users want to try them out before that. Simply extract that archive, and add file:// to /opt/local/etc/macports/sources.conf. After that, port install openjade docbook-sgml-4.2 should give you a working docbook SGML environment. Very cool! Will test it tomorrow... -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXECUTE tab completion
I wrote: > What I suggest is that we redefine previous_word() as returning an empty > string, not NULL, anytime there is no such preceding word. This is > better than the current behavior because (a) it's less surprising and > (b) it's not ambiguous. Right now the caller can't tell the difference > between "DROP" and "DROP DROP DROP". > With that change, the correct test at line 795 would become > else if (pg_strcasecmp(prev_wd, "DROP") == 0 && > prev2_wd[0] == '\0') I've committed this --- please adjust the EXECUTE patch to match. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] DROP statement reworks
On Thu, Oct 20, 2011 at 10:49 AM, Kohei KaiGai wrote: >>> part-3: drop statement reworks for other object classes >> >> This is going to need some rebasing. >> > OK, I rebased it. > > This patch includes bug-fix when we tried to drop non-existence > operator family with IF EXISTS. I'm thinking we should probably pull that part out and do it separately. Seems like it should probably be back-patched. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXECUTE tab completion
Alvaro Herrera writes: > Excerpts from Josh Kupershmidt's message of mié oct 19 23:50:58 -0300 2011: >> I assume this is an accepted quirk of previous_word() since we have >> this existing similar code: > Maybe both are wrong, though the DROP case seems to work so maybe it's > just dead code. I looked at the code more closely and I now see what Josh saw and why this code is like this. If you take a desultory look at previous_word() you will come away with the impression that it returns NULL when asked for a word before the first word on the line. But in reality, it returns NULL only if there is nothing before the current point. Otherwise, you get duplicates of the first word on the line. For example in "DROP TABLE " we'll get prev_wd = TABLE prev2_wd = DROP prev3_wd = DROP prev4_wd = DROP prev5_wd = DROP prev6_wd = DROP I think this is just a plain old coding bug: the stop-test assumes that end is reset to -1 each time through the outer loop, but it isn't. However, we can't just fix the bug, because if previous_word() actually starts to return NULLs like its author seems to have intended, the strcasecmp calls that use its output will dump core. What I suggest is that we redefine previous_word() as returning an empty string, not NULL, anytime there is no such preceding word. This is better than the current behavior because (a) it's less surprising and (b) it's not ambiguous. Right now the caller can't tell the difference between "DROP" and "DROP DROP DROP". With that change, the correct test at line 795 would become else if (pg_strcasecmp(prev_wd, "DROP") == 0 && prev2_wd[0] == '\0') (or any other way you care to write a test for empty string). It looks like there is only one place in the file that is actually expecting a null result in prev_wd, so there wouldn't be much collateral damage to fix. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXECUTE tab completion
Excerpts from Josh Kupershmidt's message of mié oct 19 23:50:58 -0300 2011: > On Wed, Oct 19, 2011 at 10:40 PM, Tom Lane wrote: > > Josh Kupershmidt writes: > >> Incidentally, I was wondering what the heck was up with a clause like this: > >> else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 && > >> pg_strcasecmp(prev2_wd, "EXECUTE") == 0) > > > > Hmm, maybe || was meant not && ? It seems pretty unlikely that the > > above test would ever trigger on valid SQL input. > > Well, changing '&&' to '||' breaks the stated comment of the patch, namely: > /* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */ > > I assume this is an accepted quirk of previous_word() since we have > this existing similar code: > > /* DROP, but watch out for DROP embedded in other commands */ > /* complete with something you can drop */ > else if (pg_strcasecmp(prev_wd, "DROP") == 0 && > pg_strcasecmp(prev2_wd, "DROP") == 0) Maybe both are wrong, though the DROP case seems to work so maybe it's just dead code. This was introduced in commit 90725929465474648de133d216b873bdb69fe357: *** *** 674,685 psql_completion(char *text, int start, int end) else if (pg_strcasecmp(prev_wd, "CREATE") == 0) matches = completion_matches(text, create_command_generator); ! /* DROP, except ALTER (TABLE|DOMAIN|GROUP) sth DROP */ /* complete with something you can drop */ else if (pg_strcasecmp(prev_wd, "DROP") == 0 && !pg_strcasecmp(prev3_wd, "TABLE") != 0 && !pg_strcasecmp(prev3_wd, "DOMAIN") != 0 && !pg_strcasecmp(prev3_wd, "GROUP") != 0) matches = completion_matches(text, drop_command_generator); /* ALTER */ --- 674,683 else if (pg_strcasecmp(prev_wd, "CREATE") == 0) matches = completion_matches(text, create_command_generator); ! /* DROP, but watch out for DROP embedded in other commands */ /* complete with something you can drop */ else if (pg_strcasecmp(prev_wd, "DROP") == 0 && !pg_strcasecmp(prev2_wd, "DROP") == 0) matches = completion_matches(text, drop_command_generator); /* ALTER */ -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI implementation question
Dan Ports wrote: > I think it would be fairly sensible to push some of this into > ProcArray, actually. The commit sequence numbers just involve > assigning/incrementing a global counter when taking a snapshot and > finishing a transaction -- that's not too much work, doesn't > require any additional locking beyond ProcArrayLock, and isn't too > tied to SSI. (I could imagine it being useful for other purposes, > though I'm not going to make that argument too seriously without > actually having one in mind.) So far it sounds like we're on the same page. > SxactGlobalXmin and WritableSxactCount are obviously more > SSI-specific, but I think we can come up with something reasonable > to do with them. Agreed. > The part that's harder is building the list of potential conflicts > that's used to identify safe snapshots for r/o transactions. That > (currently) has to happen atomically taking the snapshot. That's not obvious to me; could you elaborate on the reasons? If the commit sequence number is incremented under cover of an existing ProcArrayLock, and the current value is assigned to a snapshot under cover of same, acquiring SerializableXactHashLock after we get the snapshot seems to work for SxactGlobalXmin and WritableSxactCount, AFAICS. > We'll probably have to do this in some significantly different way, > but I haven't quite worked out what it is yet. Well, transaction startup needs some rearrangement, clearly. So far nothing fundamental has caught my attention beyond the commit sequence number changes. > On the bright side, if we can address these three issues, we > shouldn't need to take SerializableXactHashLock at all when > starting a transaction. (Realistically, we might have to take it or > some other lock shared to handle one of them -- but I really want > starting a serializable xact to not take any exclusive locks.) Yeah, I don't see how we can avoid taking a LW lock to get a serializable transaction which needs a SERIALIZABLEXACT set up, but it might be possible and worthwhile to split the uses of SerializableXactHashLock so that it's not such a hotspot. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers