Re: [HACKERS] loss of transactions in streaming replication

2011-10-20 Thread Robert Haas
On Thu, Oct 20, 2011 at 9:51 PM, Fujii Masao  wrote:
> On Thu, Oct 20, 2011 at 1:05 AM, Robert Haas  wrote:
>> OK, so this is an artifact of the changes to make libpq communication
>> bidirectional.  But I'm still confused about where the error is coming
>> from.  In your OP, you wrote "In 9.2dev and 9.1, when walreceiver
>> detects an error while sending data to WAL stream, it always emits
>> ERROR even if there are data available in the receive buffer."  So
>> that implied to me that this is only going to trigger if you have a
>> shutdown together with an awkwardly-timed error.  But your scenario
>> for reproducing this problem doesn't seem to involve an error.
>
> Yes, my scenario doesn't cause any real error. My original description was
> misleading. The following would be closer to the truth:
>
>    "In 9.2dev and 9.1, when walreceiver detects the termination of replication
>    connection while sending data to WAL stream, it always emits ERROR
>    even if there are data available in the receive buffer."

Ah, OK.  I think I now agree that this is a bug and that we should fix
and back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] loss of transactions in streaming replication

2011-10-20 Thread Fujii Masao
On Thu, Oct 20, 2011 at 1:05 AM, Robert Haas  wrote:
> OK, so this is an artifact of the changes to make libpq communication
> bidirectional.  But I'm still confused about where the error is coming
> from.  In your OP, you wrote "In 9.2dev and 9.1, when walreceiver
> detects an error while sending data to WAL stream, it always emits
> ERROR even if there are data available in the receive buffer."  So
> that implied to me that this is only going to trigger if you have a
> shutdown together with an awkwardly-timed error.  But your scenario
> for reproducing this problem doesn't seem to involve an error.

Yes, my scenario doesn't cause any real error. My original description was
misleading. The following would be closer to the truth:

"In 9.2dev and 9.1, when walreceiver detects the termination of replication
connection while sending data to WAL stream, it always emits ERROR
even if there are data available in the receive buffer."

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-10-20 Thread Kyotaro HORIGUCHI
Hello,

> > Robert Haas  writes:
> >> - Why does the second byte need special handling for 0xED and 0xF4?
> >
> > http://www.faqs.org/rfcs/rfc3629.html
> >
> > See section 4 in particular.  The underlying requirement is to disallow
> > multiple representations of the same Unicode code point.

 The special handling skips the utf8 code regions corresponds to
the regions U+D800 - U+DFFF and U+11 - U+11 in ucs-4. The
former is reserved for use with the UTF-16 encoding form as
surrougate pairs and do not directly represent characters as
described in section 3 of rfc3629. The latter is the region which
is out of the utf-8 range by the definition described also in the
same section.

former> The definition of UTF-8 prohibits encoding character
former> numbers between U+D800 and U+DFFF, which are reserved for
former> use with the UTF-16 encoding form (as surrogate pairs)
former> and do not directly represent characters.

latter> In UTF-8, characters from the U+..U+10 range (the
latter> UTF-16 accessible range) are encoded using sequences of 1
latter> to 4 octets.

# However, I wrote this exception simplly mimicked the
# pg_utf8_validator()'s behavior at the beginning.


This must be the basis of the behavior of pg_utf8_verifier(), and
pg_utf8_increment() has taken over it. It may be good to describe
this origin of the special handling as comment of these functions
to avoid this sort of confusion.


> I'm still confused.  The input string is already known to be valid
> UTF-8, so the second byte (if there is one) must be between 0x80 and
> 0xBF.  Therefore it will be neither 0xED nor 0xF4.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ProcessStandbyHSFeedbackMessage can make global xmin go backwards

2011-10-20 Thread Tom Lane
I wrote:
> ProcessStandbyHSFeedbackMessage has a race condition: it thinks it can
> call GetOldestXmin and then the result will politely hold still while
> it considers what to do next.  But in fact, whoever has the oldest xmin
> could exit their transaction, allowing the global minimum to advance.
> If a VACUUM process then inspects the ProcArray, it could compute an
> oldest xmin that is newer than the value that
> ProcessStandbyHSFeedbackMessage installs just afterwards.  So much
> for keeping the data the standby wanted.

Actually, it's worse than that.  Clamping the walsender's xmin to
GetOldestXmin() is useless, and if it weren't useless, this code would
be completely wrong even discounting the bugs I pointed out already.

Consider what it is we're trying to do here: we'd like to prevent
VACUUMs on the master from deleting dead rows with xmax newer than the
xmin the standby is requesting.  Setting the walsender's xmin will only
affect VACUUMs launched after that moment; anything that's already
running will have already determined its threshold xmin, and perhaps
already removed rows.  You can't retroactively fix that.  So clamping
the walsender's xmin to GetOldestXmin doesn't actually do anything for
data integrity; it only ensures that the value of GetOldestXmin doesn't
go backwards on successive calls.  But that's possible anyway, as the
comments for that function already note.

What's worse is that the only thing that is prevented from going
backwards is the value of GetOldestXmin with allDbs = true.  But that's
only used by vacuums on shared catalogs.  If we wanted to prevent
the values seen by vacuums on non-shared tables from going backwards,
we'd have to do some much-more-complex calculation that identified
the largest value of GetOldestXmin with allDbs = false, across all
databases.  And that would *still* be wrong, because then the walsender
would only be protecting data that is newer than the newest such value,
which might allow data in other databases to be reclaimed too early.
You could only really make this work by maintaining per-database xmins,
which the standby isn't sending and there's no place to put into the
walsender's ProcArray entry either.

So I've concluded that there's just no point in the GetOldestXmin
clamping, and we should just apply the xmin value we get from the
standby if it passes basic sanity checks (the epoch test), and hope that
we're not too late to prevent loss of the data the standby wanted.

This line of reasoning also suggests that it's a pretty bad idea for the
walsender to invalidate its existing xmin if it gets a transiently bogus
(out of epoch) value from the standby.  I think it should sit on the
xmin it has, instead, to avoid potential loss of needed data.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-20 Thread Dan Ports
On Thu, Oct 20, 2011 at 07:33:59AM -0500, Kevin Grittner wrote:
> Dan Ports  wrote:
> > The part that's harder is building the list of potential conflicts
> > that's used to identify safe snapshots for r/o transactions. That
> > (currently) has to happen atomically taking the snapshot.
>  
> That's not obvious to me; could you elaborate on the reasons?  If the
> commit sequence number is incremented under cover of an existing
> ProcArrayLock, and the current value is assigned to a snapshot under
> cover of same, acquiring SerializableXactHashLock after we get the
> snapshot seems to work for SxactGlobalXmin and WritableSxactCount,
> AFAICS.

Well, whenever a r/w transaction commits or aborts, we need to check
whether that caused any concurrent r/o transactions to have a
known-safe or unsafe snapshot. The way that happens now is that, when a
r/o transaction starts, it scans the list of r/w transactions and adds
a pointer to itself in their sxact->possibleUnsafeConflicts. When one
of them commits, it scans the list of possible conflicts and does the
appropriate thing.

That's not ideal because:

  - it requires modifing another transaction's sxact when registering a
serializable transaction, so that means taking
SerializableXactHashLock exclusive.

  - the set of concurrent transactions used to identify the possible
conflicts needs to match the one used to build the snapshot.
Otherwise, a transaction might commit between when the snapshot is
take and when we find possible conflicts. (Holding
SerializableXactHashLock prevents this.)

> Yeah, I don't see how we can avoid taking a LW lock to get a
> serializable transaction which needs a SERIALIZABLEXACT set up, but
> it might be possible and worthwhile to split the uses of
> SerializableXactHashLock so that it's not such a hotspot.

Oh, right, one other problem is that the sxact free list is also
protected by SerializableXactHashLock, so allocating from it requires
locking. That one could be fixed by protecting it with its own lock, or
(better yet) eventually moving to a lock-free implementation.

In general, the contention problem is that SerializableXactHashLock
basically protects all SSI state except the predicate locks themselves
(notably, the dependency graph). This isn't partitioned at all, so
looking at or modifying a single sxact requires locking the whole
graph. I'd like to replace this with something finer-grained.

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-20 Thread Andreas Karlsson

On 10/20/2011 09:53 PM, Tom Lane wrote:

With that change, the correct test at line 795 would become



 else if (pg_strcasecmp(prev_wd, "DROP") == 0&&
  prev2_wd[0] == '\0')


I've committed this --- please adjust the EXECUTE patch to match.


Thanks for cleaning up the code to some sanity, I should have done so 
myself when I noticed the problem.


A new version is attached.

Best regards,
Andreas

--
Andreas Karlsson
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
new file mode 100644
index abf9bc7..ee63198
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** static const SchemaQuery Query_for_list_
*** 588,593 
--- 588,598 
  "   FROM pg_catalog.pg_available_extensions "\
  "  WHERE substring(pg_catalog.quote_ident(name),1,%d)='%s' AND installed_version IS NULL"
  
+ #define Query_for_list_of_prepared_statements \
+ " SELECT pg_catalog.quote_ident(name) "\
+ "   FROM pg_catalog.pg_prepared_statements "\
+ "  WHERE substring(pg_catalog.quote_ident(name),1,%d)='%s'"
+ 
  /*
   * This is a list of all "things" in Pgsql, which can show up after CREATE or
   * DROP; and there is also a query to get a list of them.
*** psql_completion(char *text, int start, i
*** 1640,1645 
--- 1645,1656 
  		COMPLETE_WITH_LIST(list_CSV);
  	}
  
+ /* EXECUTE */
+ 	/* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */
+ 	else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 &&
+ 			 prev2_wd[0] == '\0')
+ 		COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements);
+ 
  	/* CREATE DATABASE */
  	else if (pg_strcasecmp(prev3_wd, "CREATE") == 0 &&
  			 pg_strcasecmp(prev2_wd, "DATABASE") == 0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on documentation builds on OSX w/ macports

2011-10-20 Thread Bernd Helmle



--On 20. Oktober 2011 02:02:09 +0200 Florian Pflug  wrote:


In the mean time, the modified ports are all contained in the
attached tar.bz2, should any of ye fellow OSX users want to try them
out before that.

Simply extract that archive, and add
  file://
to /opt/local/etc/macports/sources.conf. After that,
  port install openjade docbook-sgml-4.2
should give you a working docbook SGML environment.


Very cool! Will test it tomorrow...

--
Thanks

Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-20 Thread Tom Lane
I wrote:
> What I suggest is that we redefine previous_word() as returning an empty
> string, not NULL, anytime there is no such preceding word.  This is
> better than the current behavior because (a) it's less surprising and
> (b) it's not ambiguous.  Right now the caller can't tell the difference
> between "DROP" and "DROP DROP DROP".

> With that change, the correct test at line 795 would become

> else if (pg_strcasecmp(prev_wd, "DROP") == 0 &&
>  prev2_wd[0] == '\0')

I've committed this --- please adjust the EXECUTE patch to match.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] DROP statement reworks

2011-10-20 Thread Robert Haas
On Thu, Oct 20, 2011 at 10:49 AM, Kohei KaiGai  wrote:
>>> part-3: drop statement reworks for other object classes
>>
>> This is going to need some rebasing.
>>
> OK, I rebased it.
>
> This patch includes bug-fix when we tried to drop non-existence
> operator family with IF EXISTS.

I'm thinking we should probably pull that part out and do it
separately.  Seems like it should probably be back-patched.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-20 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Josh Kupershmidt's message of mié oct 19 23:50:58 -0300 2011:
>> I assume this is an accepted quirk of previous_word() since we have
>> this existing similar code:

> Maybe both are wrong, though the DROP case seems to work so maybe it's
> just dead code.

I looked at the code more closely and I now see what Josh saw and why
this code is like this.  If you take a desultory look at previous_word()
you will come away with the impression that it returns NULL when asked
for a word before the first word on the line.  But in reality, it
returns NULL only if there is nothing before the current point.
Otherwise, you get duplicates of the first word on the line.  For
example in "DROP TABLE " we'll get
prev_wd = TABLE
prev2_wd = DROP
prev3_wd = DROP
prev4_wd = DROP
prev5_wd = DROP
prev6_wd = DROP
I think this is just a plain old coding bug: the stop-test assumes that
end is reset to -1 each time through the outer loop, but it isn't.

However, we can't just fix the bug, because if previous_word() actually
starts to return NULLs like its author seems to have intended, the
strcasecmp calls that use its output will dump core.

What I suggest is that we redefine previous_word() as returning an empty
string, not NULL, anytime there is no such preceding word.  This is
better than the current behavior because (a) it's less surprising and
(b) it's not ambiguous.  Right now the caller can't tell the difference
between "DROP" and "DROP DROP DROP".

With that change, the correct test at line 795 would become

else if (pg_strcasecmp(prev_wd, "DROP") == 0 &&
 prev2_wd[0] == '\0')

(or any other way you care to write a test for empty string).  It looks
like there is only one place in the file that is actually expecting a
null result in prev_wd, so there wouldn't be much collateral damage
to fix.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-20 Thread Alvaro Herrera

Excerpts from Josh Kupershmidt's message of mié oct 19 23:50:58 -0300 2011:
> On Wed, Oct 19, 2011 at 10:40 PM, Tom Lane  wrote:
> > Josh Kupershmidt  writes:
> >> Incidentally, I was wondering what the heck was up with a clause like this:
> >>     else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 &&
> >>              pg_strcasecmp(prev2_wd, "EXECUTE") == 0)
> >
> > Hmm, maybe || was meant not && ?  It seems pretty unlikely that the
> > above test would ever trigger on valid SQL input.
> 
> Well, changing '&&' to '||' breaks the stated comment of the patch, namely:
>   /* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */
> 
> I assume this is an accepted quirk of previous_word() since we have
> this existing similar code:
> 
> /* DROP, but watch out for DROP embedded in other commands */
> /* complete with something you can drop */
> else if (pg_strcasecmp(prev_wd, "DROP") == 0 &&
>  pg_strcasecmp(prev2_wd, "DROP") == 0)

Maybe both are wrong, though the DROP case seems to work so maybe it's
just dead code.  This was introduced in commit
90725929465474648de133d216b873bdb69fe357:

***
*** 674,685  psql_completion(char *text, int start, int end)
else if (pg_strcasecmp(prev_wd, "CREATE") == 0)
matches = completion_matches(text, create_command_generator);
  
! /* DROP, except ALTER (TABLE|DOMAIN|GROUP) sth DROP */
/* complete with something you can drop */
else if (pg_strcasecmp(prev_wd, "DROP") == 0 &&
!pg_strcasecmp(prev3_wd, "TABLE") != 0 &&
!pg_strcasecmp(prev3_wd, "DOMAIN") != 0 &&
!pg_strcasecmp(prev3_wd, "GROUP") != 0)
matches = completion_matches(text, drop_command_generator);
  
  /* ALTER */
--- 674,683 
else if (pg_strcasecmp(prev_wd, "CREATE") == 0)
matches = completion_matches(text, create_command_generator);
  
! /* DROP, but watch out for DROP embedded in other commands */
/* complete with something you can drop */
else if (pg_strcasecmp(prev_wd, "DROP") == 0 &&
!pg_strcasecmp(prev2_wd, "DROP") == 0)
matches = completion_matches(text, drop_command_generator);
  
  /* ALTER */


-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-20 Thread Kevin Grittner
Dan Ports  wrote:
 
> I think it would be fairly sensible to push some of this into
> ProcArray, actually. The commit sequence numbers just involve
> assigning/incrementing a global counter when taking a snapshot and
> finishing a transaction -- that's not too much work, doesn't
> require any additional locking beyond ProcArrayLock, and isn't too
> tied to SSI. (I could imagine it being useful for other purposes,
> though I'm not going to make that argument too seriously without
> actually having one in mind.)
 
So far it sounds like we're on the same page.
 
> SxactGlobalXmin and WritableSxactCount are obviously more
> SSI-specific, but I think we can come up with something reasonable
> to do with them.
 
Agreed.
 
> The part that's harder is building the list of potential conflicts
> that's used to identify safe snapshots for r/o transactions. That
> (currently) has to happen atomically taking the snapshot.
 
That's not obvious to me; could you elaborate on the reasons?  If the
commit sequence number is incremented under cover of an existing
ProcArrayLock, and the current value is assigned to a snapshot under
cover of same, acquiring SerializableXactHashLock after we get the
snapshot seems to work for SxactGlobalXmin and WritableSxactCount,
AFAICS.
 
> We'll probably have to do this in some significantly different way,
> but I haven't quite worked out what it is yet.
 
Well, transaction startup needs some rearrangement, clearly.  So far
nothing fundamental has caught my attention beyond the commit
sequence number changes.
 
> On the bright side, if we can address these three issues, we
> shouldn't need to take SerializableXactHashLock at all when
> starting a transaction. (Realistically, we might have to take it or
> some other lock shared to handle one of them -- but I really want
> starting a serializable xact to not take any exclusive locks.)
 
Yeah, I don't see how we can avoid taking a LW lock to get a
serializable transaction which needs a SERIALIZABLEXACT set up, but
it might be possible and worthwhile to split the uses of
SerializableXactHashLock so that it's not such a hotspot.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers