date:20131213

On 2013-12-13 20:58:24 +1300, David Rowley wrote:
 On Wed, Dec 11, 2013 at 1:11 PM, Robert Haas robertmh...@gmail.com wrote:
 This introduced a new compiler warning on the visual studios build:
   d:\postgres\b\src\backend\utils\cache\relcache.c(3958): warning C4715:
 'RelationGetIndexAttrBitmap' : not all control paths return a value
 [D:\Postgres\b\postgres.vcxproj]
 
 The attached patch fixes it.

I thought we'd managed to get elog(ERROR) properly annotated as noreturn
on msvc as well?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Bug in visibility map WAL-logging

2013-12-13 Thread Heikki Linnakangas


lazy_vacuum_page() does this:

1. START_CRIT_SECTION()
2. Remove dead tuples from the page, marking the itemid's unused.
3. MarkBufferDirty
4. if all remaining tuples on the page are visible to everyone, set the 
all-visible flag on the page, and call visibilitymap_set() to set the VM 
bit.
5 visibilitymap_set() writes a WAL record about setting the bit in the 
visibility map.

6. write the WAL record of removing the dead tuples.
7. END_CRIT_SECTION().

See the problem? Setting the VM bit is WAL-logged first, before the 
removal of the tuples. If you stop recovery between the two WAL records, 
the page's VM bit in the VM map will be set, but the dead tuples are 
still on the page.


This bug was introduced in Feb 2013, by commit 
fdf9e21196a6f58c6021c967dc5776a16190f295, so it's present in 9.3 and master.


The fix seems quite straightforward, we just have to change the order of 
those two records. I'll go commit that.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Simon Riggs

On 12 December 2013 21:58, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

 On Thu, Dec 12, 2013 at 3:42 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:

 On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
 
  On 12 December 2013 15:19, Simon Riggs si...@2ndquadrant.com wrote:
 
   Don't panic guys! I meant UTC offset only. And yes, it may not be
   needed, will check.
 
  Checked, all non-UTC TZ offsets work without further effort here.
 

 Thanks!


 Reviewing the committed patch I noted that the CheckForStandbyTrigger()
 after the delay was removed.

 If we promote the standby during the delay and don't check the trigger
 immediately after the delay, then we will replay undesired WALs records.

 The attached patch add this check.

I removed it because it was after the pause. I'll replace it, but
before the pause.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

Hi,

On 2013-12-12 19:35:36 -0800, Christophe Pettus wrote:
 On Dec 12, 2013, at 6:41 PM, Andres Freund and...@2ndquadrant.com wrote:
 
  Christophe: are there any unusual ERROR messages preceding the crash,
  possibly some minutes before?
 
 Interestingly, each spinlock PANIC is *followed*, about one minute later (+/- 
 five seconds) by a canceling statement due to statement timeout on that 
 exact query.  The queries vary enough in text that it is unlikely to be a 
 coincidence.
 
 There are a *lot* of canceling statement due to statement timeout messages, 
 which is interesting, because:

Tom, could this be caused by c357be2cd9434c70904d871d9b96828b31a50cc5?
Specifically the added CHECK_FOR_INTERRUPTS() in handle_sig_alarm()?
ISTM nothing is preventing us from jumping out of code holding a
spinlock?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Time-Delayed Standbys

On 2013-12-13 11:56:47 +, Simon Riggs wrote:
 On 12 December 2013 21:58, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  Reviewing the committed patch I noted that the CheckForStandbyTrigger()
  after the delay was removed.
 
  If we promote the standby during the delay and don't check the trigger
  immediately after the delay, then we will replay undesired WALs records.
 
  The attached patch add this check.
 
 I removed it because it was after the pause. I'll replace it, but
 before the pause.

Doesn't after the pause make more sense? If somebody promoted while we
were waiting, we want to recognize that before rolling forward? The wait
can take a long while after all?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL: better default ciphersuite

2013-12-13 Thread Marko Kreen

On Thu, Dec 12, 2013 at 09:18:03PM -0500, Peter Eisentraut wrote:
 On Thu, 2013-12-12 at 12:30 +0200, Marko Kreen wrote:
  First, if there is explicit wish to keep RC4/SEED in play, I'm fine
  with HIGH:MEDIUM:!aNULL as new default.  Clarity-wise, it's still
  much better than current value.  And this value will result *exactly*
  same list in same order as current value.
 
 If we have to make a change, I'd go for that, but I'm not convinced that
 this is necessarily clearer.

Yeah, the clarity argument is getting thinner...

And my latest patch was for HIGH:MEDIUM:+3DES:!aNULL.

I still think it's better to have positive statements there -
gimme this and that - instad badly-named 'DEFAULT' and then
lot's of negatives applied to it.  But it's not that straightforward
anymore - the +3DES breaks the leave everything to OpenSSL angle.

But we do need to change default suite list to have one that works
well with prefer-server-ciphers option, which means it should contain
at least the +3DES workaround.  Client that don't want AES256 are
reasonable as AES256 does not have any practical advantages over AES128.

I don't think just reverting the default is good idea - we should then
add documentation to option that if you flip this, add such fixes
to cipher list.  Which seems silly.

And not documenting anything and just leaving matters to admins
seems bad idea too - they are not in better position to do such
research than we are now.

So I think we can pick good default, now, and everybody will benefit.


For fun, how to go overboard on the issue - Mozilla recommendations
for TLS setup on their infrastructure:

  https://wiki.mozilla.org/Security/Server_Side_TLS

It also discusses various issues with TLS, so it's good read.

-- 
marko



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Simon Riggs

On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-13 11:56:47 +, Simon Riggs wrote:
 On 12 December 2013 21:58, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  Reviewing the committed patch I noted that the CheckForStandbyTrigger()
  after the delay was removed.
 
  If we promote the standby during the delay and don't check the trigger
  immediately after the delay, then we will replay undesired WALs records.
 
  The attached patch add this check.

 I removed it because it was after the pause. I'll replace it, but
 before the pause.

 Doesn't after the pause make more sense? If somebody promoted while we
 were waiting, we want to recognize that before rolling forward? The wait
 can take a long while after all?

That would change the way pause currently works, which is OOS for that patch.

I'm happy to discuss such a change, but if agreed, it would need to
apply in all cases, not just this one.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] patch: option --if-exists for pg_dump

2013-12-13 Thread Pavel Stehule

Hello

I am sending a rebased patch.

Now dump generated with --if-exists option is readable by pg_restore

Regards

Pavel
commit 19f21165343b1aaa6cc21d381b84e3c0ce6e3b46
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Fri Dec 13 14:02:46 2013 +0100

--if-exists option for pg_dump

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8d45f24..18f7346 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -650,6 +650,16 @@ PostgreSQL documentation
  /varlistentry
 
  varlistentry
+  termoption--if-exists/option/term
+  listitem
+   para
+It use conditional commands (with literalIF EXISTS/literal
+clause) for cleaning database schema.
+   /para
+  /listitem
+ /varlistentry
+
+ varlistentry
   termoption--disable-dollar-quoting//term
   listitem
para
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 5c6a101..7e5134e 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -270,6 +270,16 @@ PostgreSQL documentation
  /varlistentry
 
  varlistentry
+  termoption--if-exists/option/term
+  listitem
+   para
+It use conditional commands (with literalIF EXISTS/literal
+clause) for cleaning database schema.
+   /para
+  /listitem
+ /varlistentry
+
+ varlistentry
   termoption--disable-dollar-quoting//term
   listitem
para
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 6927968..83f7216 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -113,6 +113,7 @@ typedef struct _restoreOptions
 	char	   *superuser;		/* Username to use as superuser */
 	char	   *use_role;		/* Issue SET ROLE to this */
 	int			dropSchema;
+	int			if_exists;
 	const char *filename;
 	int			dataOnly;
 	int			schemaOnly;
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 63a8009..23dc77e 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -55,7 +55,8 @@ static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
 	 const int compression, ArchiveMode mode, SetupWorkerPtr setupWorkerPtr);
 static void _getObjectDescription(PQExpBuffer buf, TocEntry *te,
 	  ArchiveHandle *AH);
-static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt, bool isData, bool acl_pass);
+static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt,
+ bool isData, bool acl_pass);
 static char *replace_line_endings(const char *str);
 static void _doSetFixedOutputState(ArchiveHandle *AH);
 static void _doSetSessionAuth(ArchiveHandle *AH, const char *user);
@@ -234,6 +235,7 @@ RestoreArchive(Archive *AHX)
 	bool		parallel_mode;
 	TocEntry   *te;
 	OutputContext sav;
+	
 
 	AH-stage = STAGE_INITIALIZING;
 
@@ -2938,9 +2940,35 @@ _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
 		strcmp(type, OPERATOR CLASS) == 0 ||
 		strcmp(type, OPERATOR FAMILY) == 0)
 	{
-		/* Chop DROP  off the front and make a modifiable copy */
-		char	   *first = pg_strdup(te-dropStmt + 5);
-		char	   *last;
+		char	*first;
+		char	*last;
+
+		/* try to search IF EXISTS in DROP command */
+		if (strstr(te-dropStmt, IF EXISTS) != NULL)
+		{
+			char buffer[40];
+			size_t   l;
+
+			/* IF EXISTS clause should be optional, check it*/
+			snprintf(buffer, sizeof(buffer), DROP %s IF EXISTS, type);
+			l = strlen(buffer);
+
+			if (strncmp(te-dropStmt, buffer, l) == 0)
+			{
+/* append command type to target type */
+appendPQExpBufferStr(buf, type);
+
+/* skip first n chars, and create a modifieble copy */
+first = pg_strdup(te-dropStmt + l);
+			}
+			else
+/* DROP IF EXISTS pattern is not appliable on dropStmt */
+first = pg_strdup(te-dropStmt + 5);
+		}
+
+		else
+			/* IF EXISTS clause was not used, simple solution */
+			first = pg_strdup(te-dropStmt + 5);
 
 		/* point to last character in string */
 		last = first + strlen(first) - 1;
@@ -2961,7 +2989,8 @@ _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
 }
 
 static void
-_printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt, bool isData, bool acl_pass)
+_printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt, bool isData,
+  bool acl_pass)
 {
 	/* ACLs are dumped only during acl pass */
 	if (acl_pass)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 224e8cb..0f602c9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -140,6 +140,7 @@ static int	no_security_labels = 0;
 static int	no_synchronized_snapshots = 0;
 static int	no_unlogged_table_data = 0;
 static int	serializable_deferrable = 0;
+static int	if_exists = 0;
 
 
 static void help(const char *progname);
@@ -346,6 +347,7 @@ main(int argc, char **argv)

Re: [HACKERS] Time-Delayed Standbys

On 2013-12-13 13:09:13 +, Simon Riggs wrote:
 On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-13 11:56:47 +, Simon Riggs wrote:
  On 12 December 2013 21:58, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
   Reviewing the committed patch I noted that the CheckForStandbyTrigger()
   after the delay was removed.
  
   If we promote the standby during the delay and don't check the trigger
   immediately after the delay, then we will replay undesired WALs records.
  
   The attached patch add this check.
 
  I removed it because it was after the pause. I'll replace it, but
  before the pause.
 
  Doesn't after the pause make more sense? If somebody promoted while we
  were waiting, we want to recognize that before rolling forward? The wait
  can take a long while after all?
 
 That would change the way pause currently works, which is OOS for that patch.

But this feature isn't pause itself - it's imo something
independent. Note that we currently
a) check pause again after recoveryApplyDelay(),
b) do check for promotion if the sleep in recoveryApplyDelay() is
   interrupted. So not checking after the final sleep seems confusing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changeset Extraction Interfaces

2013-12-13 Thread Robert Haas

On Thu, Dec 12, 2013 at 1:52 PM, Andres Freund and...@2ndquadrant.com wrote:
 Puh. I honestly have zero confidence in DBAs making an informed decision
 about something like this. Honestly, for a replication solution, how
 often do you think this will be an issue?

If you imagine a scenario where somebody establishes a replication
slot and then keeps it forever, not often.  But if you're trying to do
something more ad hoc, where replication slots might be used just for
short periods of time and then abandoned, I think it could come up
pretty frequently.  Generally, I think you're being too dismissive of
the stuff I'm complaining about here.  If we just can't get this, well
then I suppose we can't.  But I think the amount of time that it takes
Hot Standby to open for connections is an issue, precisely because
it's got to wait until certain criteria are met before it can
establish a snapshot, and sometimes that takes an unpleasantly long
time.  I think it unlikely that we can export that logic to this case
also and experience no pain as a result.

In fact, I think that even restricting things to streaming changes
from transactions started after we initiate replication is going to be
an annoying amount of delay for some purposes.  People will accept it
because, no matter how you slice it, this is an awesome new
capability.  Full stop.  That having been said, I don't find it at all
hard to imagine someone wanting to jump into the replication stream at
an arbitrary point in time and see changes from every transaction that
*commits* after that point, even if it began earlier, or even to see
changes from transactions that have not yet committed as they happen.
I realize that's asking for a pony, and I'm not saying you have to go
off and do that right now in order for this to move forward, or indeed
that it will ever happen at all.  What I am saying is that I find it
entirely likely that people are going to push the limits of this
thing, that this is one of the limits I expect them to push, and that
the more we can do to put policy in the hands of the user without
pre-judging the sanity of what they're trying to do, the happier we
(and our users) will be.

  It's not too difficult to provide an option to do that. What I've been
  thinking of was to correlate the confirmation of consumption with the
  transaction the SRF is running in. So, confirm the data as consumed if
  it commits, and don't if not. I think we could do that relatively easily
  by registering a XACT_EVENT_COMMIT.

 That's a bit too accident-prone for my taste.  I'd rather the DBA had
 some equivalent of peek_at_replication(nchanges int).

 One point for my suggested behaviour is that it closes a bigger
 racecondition. Currently as soon as start_logical_replication() has
 finished building the tuplestore it marks the endposition as
 received. But we very well can fail before the user has received all
 those changes.

Right.  I think your idea is good, but maybe there should also be a
version of the function that never confirms receipt even if the
transaction commits.  That would be useful for ad-hoc poking at the
queue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical changeset generation v6.8

2013-12-13 Thread David Rowley

On Sat, Dec 14, 2013 at 12:12 AM, Andres Freund and...@2ndquadrant.comwrote:

 On 2013-12-13 20:58:24 +1300, David Rowley wrote:
  On Wed, Dec 11, 2013 at 1:11 PM, Robert Haas robertmh...@gmail.com
 wrote:
  This introduced a new compiler warning on the visual studios build:
d:\postgres\b\src\backend\utils\cache\relcache.c(3958): warning C4715:
  'RelationGetIndexAttrBitmap' : not all control paths return a value
  [D:\Postgres\b\postgres.vcxproj]
 
  The attached patch fixes it.

 I thought we'd managed to get elog(ERROR) properly annotated as noreturn
 on msvc as well?


It looks like this is down to the elog macro, where the elevel is being
assigned to a variable elevel_ then we're only doing pg_unreachable(); if
elevel_ = ERROR. The compiler must not be confident enough to optimise out
the if condition even though the elevel is not changed after it is set from
the constant.

Regards

David Rowley


 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services

Re: [HACKERS] patch: make_timestamp function

On Fri, Dec 13, 2013 at 7:09 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 I though about it. Then there are two questions

 a) Could we have a make_timetz function?

 b) What type we use for timezone?



I just think in a function that returns the timestamp with timezone based
on the current 'timezone' setting.

fabrizio=# show timezone;
  TimeZone
-
 Brazil/East
(1 row)

fabrizio=# select '2013-12-13 11:29:45.786937'::timestamptz;
  timestamptz
---
 2013-12-13 11:29:45.786937-02
(1 row)

fabrizio=# set timezone to 'UTC';
SET
fabrizio=# select '2013-12-13 11:29:45.786937'::timestamptz;
  timestamptz
---
 2013-12-13 11:29:45.786937+00
(1 row)

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello

[HACKERS] [bug fix] multibyte messages are displayed incorrectly on the client

2013-12-13 Thread MauMau


Hello,

The attached patch fixes incorrect message output on the client side.  I 
guess this problem can happen with any major release.  Could you review 
this?



[Problem]
When the client's locale differs from the server's message locale, the 
messages generated on the server are converted appropriately and sent to the 
client.  For example, if the server runs on Linux with lc_messages = 
'ja_JP.UTF-8' in postgresql.conf, and you run psql on Windows where the 
system locale is SJIS, Japanese messages are converted from UTF-8 to SJIS on 
the server and sent to psql.  psql can display those SJIS messages 
correctly.  This is no problem.


However, this desirable behavior holds true only after the database session 
is established.  The error messages during session establishment are 
displayed incorrectly, and you cannot recognize the message contents.  For 
example, run psql -d postgres -U non-existent-username.  The displayed 
message is unrecognizable.



[Cause]
While the session is being established, the server cannot use the client 
encoding for message conversion yet, because it cannot access system 
catalogs to retrieve conversion functions.  So, the server sends messages to 
the client without conversion.  In the above example, the server sends 
Japanese UTF-8 messages to psql, which expects those messages in SJIS.



[Fix]
Disable message localization during session startup.  In other words, 
messages are output in English until the database session is established.



Regards
MauMau


no_localize_message_in_startup.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Simon Riggs

On 13 December 2013 13:22, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-13 13:09:13 +, Simon Riggs wrote:
 On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-13 11:56:47 +, Simon Riggs wrote:
  On 12 December 2013 21:58, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
   Reviewing the committed patch I noted that the 
   CheckForStandbyTrigger()
   after the delay was removed.
  
   If we promote the standby during the delay and don't check the trigger
   immediately after the delay, then we will replay undesired WALs records.
  
   The attached patch add this check.
 
  I removed it because it was after the pause. I'll replace it, but
  before the pause.
 
  Doesn't after the pause make more sense? If somebody promoted while we
  were waiting, we want to recognize that before rolling forward? The wait
  can take a long while after all?

 That would change the way pause currently works, which is OOS for that patch.

 But this feature isn't pause itself - it's imo something
 independent. Note that we currently
 a) check pause again after recoveryApplyDelay(),
 b) do check for promotion if the sleep in recoveryApplyDelay() is
interrupted. So not checking after the final sleep seems confusing.

I'm proposing the attached patch.

This patch implements a consistent view of recovery pause, which is
that when paused, we don't check for promotion, during or immediately
after. That is user noticeable behaviour and shouldn't be changed
without thought and discussion on a separate thread with a clear
descriptive title. (I might argue in favour of it myself, I'm not yet
decided).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


snippet.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2013-12-13 Thread Craig Ringer

Hi all

Here's an updated version of the updatable security barrier views patch
that's proposed as the next stage of progressing row-security toward
useful and maintainable inclusion in core.

If updatable security barrier views are available then the row-security
code won't have to play around with remapping vars and attrs during
query planning when it injects its subquery late. We'll simply stop the
planner flattening an updatable security barrier view, and for
row-security we'll dynamically inject one during rewriting when we see a
relation that has a row-security attribute.

This patch just implements updatable security barrier views. It is a
work in progress with at least two known crash bugs and limited testing.
I'd greatly appreciate comments (and advice) from those who are
interested in the problem domain as we proceed further into work on 9.4.

The patch is attached; it's on top of
46328916eefc5f9eaf249518e96f68afcd35923b, current head. It doens't yet
touch the documentation, but the only change needed should be to remove
the restriction on security_barrier views from the definition of what a
simply updatable view is.


There are couple of known issues: a crash with INSERT ... RETURNING;
DEFAULT handling through views isn't implemented yet; and a crash I just
found on UPDATE of a view that re-orders the original table columns. As
a result it doesn't survive make check yet.

I'm still working on fixing these issues and on finding more.
Suggestions/comments would be appreciated. I'll post specifics of the
INSERT ... RETURNING one soon, as I'm increasingly stuck on it.


Simple demo:

-- The 'id' is 'integer' not 'serial' because of the limitation with
-- DEFAULT mentioned below.

CREATE TABLE t (id integer primary key, secret text);

INSERT INTO t(id, secret)
SELECT x, 'secret'||x FROM generate_series(1,100) x;

CREATE VIEW t_even AS
SELECT id, secret FROM t WHERE id % 2 = 0;

CREATE VIEW t_even_sb WITH (security_barrier) AS
SELECT id, secret FROM t WHERE id % 2 = 0;

CREATE VIEW t_even_check WITH (check_option = 'cascaded') AS
SELECT id, secret FROM t WHERE id % 2 = 0;

CREATE VIEW t_even_check_sb WITH (check_option = 'cascaded',
security_barrier) AS
SELECT id, secret FROM t WHERE id % 2 = 0;


You'll find that the same f_leak tests used in the original read
security barrier views work here, too.



-- Subsets of cols work:

CREATE VIEW just_id AS SELECT id FROM t;
INSERT INTO just_id(id) VALUES (101);

CREATE VIEW just_id_sb WITH (security_barrier) AS
SELECT id FROM t;
INSERT INTO just_id_sb(id) VALUES (101);


-- Reversed column-lists work:

CREATE VIEW reversed AS SELECT secret, id FROM t;
INSERT INTO reversed(id, secret) VALUES (102, 'fred');

CREATE VIEW reversed_sb WITH (security_barrier) AS
SELECT secret, id FROM t;
INSERT INTO reversed_sb(id, secret) VALUES (102, 'fred');

-- WITH CHECK OPTION is working

postgres=# INSERT INTO t_even_check(id, secret) values (296, 'blah');
INSERT 0 1
postgres=# INSERT INTO t_even_check(id, secret) values (297, 'blah');
ERROR:  new row violates WITH CHECK OPTION for view t_even_check
DETAIL:  Failing row contains (297, blah).


postgres=# INSERT INTO t_even_check_sb(id, secret) values (298, 'blah');
INSERT 0 1
postgres=# INSERT INTO t_even_check_sb(id, secret) values (299, 'blah');
ERROR:  new row violates WITH CHECK OPTION for view t_even_check_sb
DETAIL:  Failing row contains (299, blah).






-- 3-col views are OK with various permutations

CREATE TABLE t3 ( id integer primary key, aa text, bb text );

CREATE VIEW t3_bai AS SELECT bb, aa, id FROM t3;
INSERT INTO t3_bai VALUES ('bb','aa',3);
UPDATE t3_bai SET bb = 'whatever' WHERE id = 3 RETURNING *;
DELETE FROM t3_bai RETURNING *;


CREATE VIEW t3_bai_sb WITH (security_barrier)
AS SELECT bb, aa, id FROM t3;
INSERT INTO t3_bai_sb VALUES ('bb','aa',3);

-- This crashes, with or without RETURNING. Bug in re-ord
-- UPDATE t3_bai_sb SET bb = 'whatever' WHERE id = 3 RETURNING *;

-- This is OK:
DELETE FROM t3_bai_sb RETURNING *;




-- 










Known issues:

DEFAULT injection doesn't occur correctly through the view. Needs some
changes in the rewriter where expand_target_list is called. Haven't
investigated in detail yet. Causes inserts through views to fail if
there's a default not null constraint, among other issues.

Any INSERT with a RETURNING clause through a view causes a crash (simple
VALUES clauses) or fails with no relation entry for relid 1 (INSERT
... SELECT). UPDATE and DELETE is fine. Seems to be doing subquery
pull-up, producing a simple result sub-plan that incorrectly has a Var
reference but doesn't perform any scan. Issue traced to plan_subquery,
but no deeper yet.


Privilege enforcement has not yet been through a thorough test matrix.


UPDATE of a subset of columns fails. E.g.:

CREATE VIEW just_secret AS SELECT secret FROM t;
UPDATE just_secret SET secret = 'fred';



Known failing queries. Note that it doesn't matter what you choose from
RETURNING, any reference to a result relation

Re: [HACKERS] Time-Delayed Standbys

On Fri, Dec 13, 2013 at 11:44 AM, Simon Riggs si...@2ndquadrant.com wrote:

On 13 December 2013 13:22, Andres Freund and...@2ndquadrant.com wrote:
On 2013-12-13 13:09:13 +, Simon Riggs wrote:
On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com
wrote:
On 2013-12-13 11:56:47 +, Simon Riggs wrote:
On 12 December 2013 21:58, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
Reviewing the committed patch I noted that the
CheckForStandbyTrigger()
after the delay was removed.

If we promote the standby during the delay and don't check the
trigger
immediately after the delay, then we will replay undesired WALs
records.

The attached patch add this check.

I removed it because it was after the pause. I'll replace it, but
before the pause.

Doesn't after the pause make more sense? If somebody promoted while
we
were waiting, we want to recognize that before rolling forward? The
wait
can take a long while after all?

That would change the way pause currently works, which is OOS for that
patch.

But this feature isn't pause itself - it's imo something
independent. Note that we currently
a) check pause again after recoveryApplyDelay(),
b) do check for promotion if the sleep in recoveryApplyDelay() is
interrupted. So not checking after the final sleep seems confusing.

I'm proposing the attached patch.

This patch implements a consistent view of recovery pause, which is
that when paused, we don't check for promotion, during or immediately
after. That is user noticeable behaviour and shouldn't be changed
without thought and discussion on a separate thread with a clear
descriptive title. (I might argue in favour of it myself, I'm not yet
decided).

In my previous message [1] I attach a patch equal to your ;-)

Regards,

[1]
http://www.postgresql.org/message-id/CAFcNs+qD0AJ=qzhsHD9+v_Mhz0RTBJ=cJPCT_T=ut_jvvnc...@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

Re: [HACKERS] Changeset Extraction Interfaces

On 2013-12-13 08:30:41 -0500, Robert Haas wrote:
 On Thu, Dec 12, 2013 at 1:52 PM, Andres Freund and...@2ndquadrant.com wrote:
  Puh. I honestly have zero confidence in DBAs making an informed decision
  about something like this. Honestly, for a replication solution, how
  often do you think this will be an issue?
 
 If you imagine a scenario where somebody establishes a replication
 slot and then keeps it forever, not often.  But if you're trying to do
 something more ad hoc, where replication slots might be used just for
 short periods of time and then abandoned, I think it could come up
 pretty frequently.

But can you imagine those users needing an exported snapshot? I can
think of several short-lived usages, but all of those are unlikely to
need a consistent view of the overall database. And those are less
likely to be full blown replication solutions.
I.e. it's not the DBA making that decision but the developer making the
decision based on whether he requires the snapshot or not.

 Generally, I think you're being too dismissive of the stuff I'm
 complaining about here.  If we just can't get this, well then I
 suppose we can't.

I think I am just scared of needing to add more features before getting
the basics done and in consequence overrunning 9.4...

 But I think the amount of time that it takes
 Hot Standby to open for connections is an issue, precisely because
 it's got to wait until certain criteria are met before it can
 establish a snapshot, and sometimes that takes an unpleasantly long
 time.

The HS situation is quite different though. There the problem is hit on
every restart, not just the initial start; Suboverflowed xl_running
xacts can delay HS startup for a long time; and we don't start up at the
exact point we know a xl_running_xacts just has been logged.

FWIW I think we need to fix HS to store it's visibility state at
restartpoints, so we get rid of that problem. Not sure who will have
time to do that tho.

 That having been said, I don't find it at all
 hard to imagine someone wanting to jump into the replication stream at
 an arbitrary point in time and see changes from every transaction that
 *commits* after that point, even if it began earlier, or even to see
 changes from transactions that have not yet committed as they happen.

I agree they are desirable, but all those will require additional state
to be kept about running transactions. That's not saying it
shouldn't/cannot be done, to the contrary, just that it requires some
engineering effort.

 I realize that's asking for a pony, and I'm not saying you have to go
 off and do that right now in order for this to move forward, or indeed
 that it will ever happen at all.  What I am saying is that I find it
 entirely likely that people are going to push the limits of this
 thing, that this is one of the limits I expect them to push, and that
 the more we can do to put policy in the hands of the user without
 pre-judging the sanity of what they're trying to do, the happier we
 (and our users) will be.

Completely agreed. I really think this the most basic building block,
missing many important features. And we'll be busy for some time adding
those.

  One point for my suggested behaviour is that it closes a bigger
  racecondition. Currently as soon as start_logical_replication() has
  finished building the tuplestore it marks the endposition as
  received. But we very well can fail before the user has received all
  those changes.
 
 Right.  I think your idea is good, but maybe there should also be a
 version of the function that never confirms receipt even if the
 transaction commits.  That would be useful for ad-hoc poking at the
 queue.

Ok, that sounds easy enough, maybe
pg_decoding_slot_get_[binary_]changes()
pg_decoding_slot_peek_[binary_]changes()
?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logging WAL when updating hintbit

2013-12-13 Thread Heikki Linnakangas


On 12/13/2013 07:55 AM, Sawada Masahiko wrote:

On Fri, Dec 13, 2013 at 1:51 PM, Dilip kumar dilip.ku...@huawei.com wrote:

On 04 December 2013, Sawada Masahiko Wrote


I attached the patch which have modified based on Robert suggestion,
and fixed typo.


I have reviewed the modified patch and I have some comments..

1. Patch need to be rebased (failed applying on head)

2. crc field should be at end in ControlFileData struct, because for crc 
calculation, it directly take the offset of crc field in ControlFileData.

 /* CRC of all above ... MUST BE LAST! */
 pg_crc32crc;
+
+   /* Enable logging WAL when updating hint bits */
+   boolwal_log_hintbits;
} ControlFileData;
3. wal_log_hintbits field should be printed in PrintControlValues function.


Actually, no. It's reset anyway like wal_level and some other settings, 
so it's not an interesting value to print out.


Thanks for the review!


I have modified the patch base on your comment, and I attached the v7 patch.


Thanks, committed with some minor changes:

The amount of extra WAL-logging with wal_log_hintbits=on is almost, but 
not exactly the same as with checksums enabled. With checksums enabled, 
visibilitymap_set() creates a full-page image of the heap page, but with 
wal_log_hintbits=on, it does not. For pg_rewind's purposes, that's 
correct, but I feel that it would be better if the decision on whether 
to WAL-log or not was exactly the same in both cases. One reason is that 
you could then use wal_log_hintbits=on to evaluate how big the impact on 
WAL volume would be if you had checksums enabled. I committed it that way.


OTOH, for pg_rewind's purposes, there's actually no need to take a full 
page image when a hint bit is set. A small WAL-record indicating that 
the page was modified would be enough. It's particularly strange to 
insist that hint bit updates create full-page images, when you have 
full_page_writes=off so that normal updates don't create them.


We're a bit schizophrenic with full page writes also when data checksums 
are enabled, though. If I'm reading the code correctly, you can turn 
full_page_writes=off even when data checksums are enabled, which exposes 
you to torn page problems. Which might be OK under some special 
circumstances. But you'll still get full-page images of hint bits!


I think it's a bit excessive to require wal_level  minimal if you set 
wal_log_hintbits=on. It's a bit silly to ask for wal_log_hintbits=on 
without archiving, but I also don't see a big reason to throw an error. 
It would be annoying, if you want to e.g temporarily set 
wal_level=minimal when you do a big data load, and re-initialize the 
standby afterwards. Now you'd also need to remember to turn 
wal_log_hintbits=off temporarily, and remember to turn it back on 
afterwards. So I just removed that check.


I'm not totally satisfied with the name of the GUC, wal_log_hintbits. 
The term hint bits doesn't mean anything to most people. I couldn't 
come up with anything better, but if someone has a better suggestion, we 
can still change it.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

Andres Freund and...@2ndquadrant.com writes:
 Tom, could this be caused by c357be2cd9434c70904d871d9b96828b31a50cc5?
 Specifically the added CHECK_FOR_INTERRUPTS() in handle_sig_alarm()?
 ISTM nothing is preventing us from jumping out of code holding a
 spinlock?

Hm ... what should stop it is that ImmediateInterruptOK wouldn't be
set while we're messing with any spinlocks.  Except that ProcessInterrupts
doesn't check that gating condition :-(.  I think you're probably right:
what should be in the interrupt handler is something like
if (ImmediateInterruptOK) CHECK_FOR_INTERRUPTS();

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ruleutils vs. empty targetlists

Dean Rasheed dean.a.rash...@gmail.com writes:
 I can't think of any practical uses for this kind of query, so I don't
 think it's worth worrying too much about its results until/unless
 someone comes up with a real use-case.

 However, given that we currently support queries like select distinct
 * from nocols (albeit with rather odd results), I don't think we
 should start throwing new errors for them. Perhaps the actual risk of
 a backwards-compatibility break is small, but so too is any benefit
 from adding such new errors.

 So +1 for the patch as-is, with no new errors.

How about as-is in the back branches, and throw the new errors only
in HEAD?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logging WAL when updating hintbit

Heikki Linnakangas hlinnakan...@vmware.com writes:
 I'm not totally satisfied with the name of the GUC, wal_log_hintbits. 

Me either; at the very least, it's short an underscore: wal_log_hint_bits
would be more readable.  But how about just wal_log_hints?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ruleutils vs. empty targetlists

2013-12-13 Thread Dean Rasheed

On 13 December 2013 15:07, Tom Lane t...@sss.pgh.pa.us wrote:
 Dean Rasheed dean.a.rash...@gmail.com writes:
 I can't think of any practical uses for this kind of query, so I don't
 think it's worth worrying too much about its results until/unless
 someone comes up with a real use-case.

 However, given that we currently support queries like select distinct
 * from nocols (albeit with rather odd results), I don't think we
 should start throwing new errors for them. Perhaps the actual risk of
 a backwards-compatibility break is small, but so too is any benefit
 from adding such new errors.

 So +1 for the patch as-is, with no new errors.

 How about as-is in the back branches, and throw the new errors only
 in HEAD?


Seems reasonable.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

On 2013-12-13 09:52:06 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Tom, could this be caused by c357be2cd9434c70904d871d9b96828b31a50cc5?
  Specifically the added CHECK_FOR_INTERRUPTS() in handle_sig_alarm()?
  ISTM nothing is preventing us from jumping out of code holding a
  spinlock?
 
 Hm ... what should stop it is that ImmediateInterruptOK wouldn't be
 set while we're messing with any spinlocks.  Except that ProcessInterrupts
 doesn't check that gating condition :-(.

It really can't, right? Otherwise explicit CHECK_FOR_INTERRUPTS()s in
normal code wouldn't do much anymore since ImmediateInterruptOK is so
seldomly set. The control flow around signal handling always drives me
crazy.

 I think you're probably right:
 what should be in the interrupt handler is something like
 if (ImmediateInterruptOK) CHECK_FOR_INTERRUPTS();

Yea, that sounds right. Or just don't set process interrupts there, it
doesn't seem to be required for correctness?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

Andres Freund and...@2ndquadrant.com writes:
 On 2013-12-13 09:52:06 -0500, Tom Lane wrote:
 I think you're probably right:
 what should be in the interrupt handler is something like
 if (ImmediateInterruptOK) CHECK_FOR_INTERRUPTS();

 Yea, that sounds right. Or just don't set process interrupts there, it
 doesn't seem to be required for correctness?

It is if we need to break out of a wait-for-lock ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

On 2013-12-13 10:30:48 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-12-13 09:52:06 -0500, Tom Lane wrote:
  I think you're probably right:
  what should be in the interrupt handler is something like
  if (ImmediateInterruptOK) CHECK_FOR_INTERRUPTS();
 
  Yea, that sounds right. Or just don't set process interrupts there, it
  doesn't seem to be required for correctness?
 
 It is if we need to break out of a wait-for-lock ...

Right, that uses MyProc-sem and not MyProc-procLatch...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

On closer inspection, I'm thinking that actually it'd be a good idea if
handle_sig_alarm did what we do in, for example, HandleCatchupInterrupt:
it should save, clear, and restore ImmediateInterruptOK, so as to make
the world safe for timeout handlers to do things that might include a
CHECK_FOR_INTERRUPTS.

And while we're on the subject ... isn't bgworker_die() utterly and
completely broken?  That unconditional elog(FATAL) means that no process
using that handler can do anything remotely interesting, like say touch
shared memory.

I didn't find any other similar hazards in a quick look through all our
signal handlers.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

On 2013-12-13 11:26:44 -0500, Tom Lane wrote:
 On closer inspection, I'm thinking that actually it'd be a good idea if
 handle_sig_alarm did what we do in, for example, HandleCatchupInterrupt:
 it should save, clear, and restore ImmediateInterruptOK, so as to make
 the world safe for timeout handlers to do things that might include a
 CHECK_FOR_INTERRUPTS.

Shouldn't the HOLD_INTERRUPTS() in handle_sig_alarm() prevent any
eventual ProcessInterrupts() in the timeout handlers from doing anything
harmful?
Even if so, making sure ImmediateInterruptOK is preserved seems worthwile
anyway.

 And while we're on the subject ... isn't bgworker_die() utterly and
 completely broken?  That unconditional elog(FATAL) means that no process
 using that handler can do anything remotely interesting, like say touch
 shared memory.

Yes, looks broken to me.

 I didn't find any other similar hazards in a quick look through all our
 signal handlers.

One thing I randomly noticed just now is the following in
RecoveryConflictInterrupt():
elog(FATAL, unrecognized conflict mode: %d,
 (int) reason);
obviously that's not really ever going to hit, but it should either be a
PANIC or an Assert() for the reasons you cite.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

Christophe Pettus x...@thebuild.com writes:
 Yes, that's what is happening there (I had to check with the client's 
 developers).  It's possible that the one-minute repeat is due to the 
 application reissuing the query, rather than specifically related to the 
 spinlock issue.  What this does reveal is that all the spinlock issues have 
 been on long-running queries, for what it is worth.

Please apply commit 478af9b79770da43a2d89fcc5872d09a2d8731f8 and see
if that doesn't fix it for you.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing

On 2013-12-13 13:39:20 -0300, Alvaro Herrera wrote:
 Here's cache code with LRU superpowers (ahem.)

Heh.

 I settled on 256 as number of entries because it's in the same ballpark
 as MaxHeapTuplesPerPage which seems a reasonable guideline to follow.

Sounds ok.

 I considered the idea of avoiding palloc/pfree for cache entries
 entirely, instead storing them in a static array which is referenced
 from the dlist; unfortunately that doesn't work because each cache entry
 is variable size, depending on number of members.  We could try to work
 around that and allocate a large shared array for members, but that
 starts to smell of over-engineering, so I punted.

Good plan imo.

 *** 1326,1331  mXactCacheGetBySet(int nmembers, MultiXactMember *members)
 --- 1331,1337 
   if (memcmp(members, entry-members, nmembers * 
 sizeof(MultiXactMember)) == 0)
   {
   debug_elog3(DEBUG2, CacheGet: found %u, entry-multi);
 + dlist_move_head(MXactCache, iter.cur);
   return entry-multi;
   }
   }

That's only possible because we immediately abort the loop, otherwise
we'd corrupt the iterator. Maybe that deserves a comment.

 + 
 + dlist_move_head(MXactCache, iter.cur);
 + 

Heh. I forgot that we already had that bit; I was wondering whether you
had to forgot to include it in the patch ;)

   static char *
 --- 1420,1435 
   /* mXactCacheGetBySet assumes the entries are sorted, so sort them */
   qsort(entry-members, nmembers, sizeof(MultiXactMember), 
 mxactMemberComparator);
   
 ! dlist_push_head(MXactCache, entry-node);
 ! if (MXactCacheMembers++ = MAX_CACHE_ENTRIES)
 ! {
 ! dlist_node *node;
 ! 
 ! node = dlist_tail_node(MXactCache);
 ! dlist_delete(dlist_tail_node(MXactCache));
 ! MXactCacheMembers--;
 ! pfree(dlist_container(mXactCacheEnt, node, node));
 ! }
   }

Duplicate dlist_tail_node(). Maybe add a debug_elog3(.. CacheGet:
pruning %u from cache)?

I wondered before if we shouldn't introduce a layer above dlists, that
support keeping track of the number of elements, and maybe also have
support for LRU behaviour. Not as a part this patch, just generally.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Christophe Pettus


On Dec 13, 2013, at 8:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Please apply commit 478af9b79770da43a2d89fcc5872d09a2d8731f8 and see
 if that doesn't fix it for you.

Great, thanks.  Would the statement_timeout firing invoke this path?  (I'm 
wondering why this particular installation was experiencing this.)

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

Andres Freund and...@2ndquadrant.com writes:
 On 2013-12-13 11:26:44 -0500, Tom Lane wrote:
 On closer inspection, I'm thinking that actually it'd be a good idea if
 handle_sig_alarm did what we do in, for example, HandleCatchupInterrupt:
 it should save, clear, and restore ImmediateInterruptOK, so as to make
 the world safe for timeout handlers to do things that might include a
 CHECK_FOR_INTERRUPTS.

 Shouldn't the HOLD_INTERRUPTS() in handle_sig_alarm() prevent any
 eventual ProcessInterrupts() in the timeout handlers from doing anything
 harmful?

Sorry, I misspoke there.  The case I'm worried about is doing something
like a wait for lock, which would unconditionally set and then reset
ImmediateInterruptOK.  That's not very plausible perhaps, but on the other
hand we are calling DeadLockCheck() in there, and who knows what future
timeout handlers might try to do?

BTW, I'm about to go put a HOLD_INTERRUPTS/RESUME_INTERRUPTS into
HandleCatchupInterrupt and HandleNotifyInterrupt too, for essentially the
same reason.  At least the first of these *does* include semaphore ops,
so I think it's theoretically vulnerable to losing control if a timeout
occurs while it's waiting for a semaphore.  There's probably no real bug
today because I don't think we enable catchup interrupts at any point
where a timeout would be active, but that doesn't sound terribly
future-proof.  If a timeout did happen, holding off interrupts would have
the effect of postponing the query cancel till we're done with the catchup
interrupt, which seems reasonable.

 One thing I randomly noticed just now is the following in
 RecoveryConflictInterrupt():
   elog(FATAL, unrecognized conflict mode: %d,
(int) reason);
 obviously that's not really ever going to hit, but it should either be a
 PANIC or an Assert() for the reasons you cite.

Yeah, PANIC there seems good.  I also thought about using
START_CRIT_SECTION/END_CRIT_SECTION instead of
HOLD_INTERRUPTS/RESUME_INTERRUPTS in these signal handlers.  That would
both hold off interrupts and cause any elog(ERROR/FATAL) within the
handler to be promoted to PANIC.  But I'm not sure that'd be a net
stability improvement...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

Christophe Pettus x...@thebuild.com writes:
 On Dec 13, 2013, at 8:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Please apply commit 478af9b79770da43a2d89fcc5872d09a2d8731f8 and see
 if that doesn't fix it for you.

 Great, thanks.  Would the statement_timeout firing invoke this path?  (I'm 
 wondering why this particular installation was experiencing this.)

Yeah, the problem is that either statement_timeout or lock_timeout
could cause control to be taken away from code that thinks it's
straight-line code and so doesn't have provision for getting cleaned
up at transaction abort.  Spinlocks certainly fall in that category.
I'm afraid other weird failures are possible, though I'm not sure
what.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

On 2013-12-13 12:19:56 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Shouldn't the HOLD_INTERRUPTS() in handle_sig_alarm() prevent any
  eventual ProcessInterrupts() in the timeout handlers from doing anything
  harmful?
 
 Sorry, I misspoke there.  The case I'm worried about is doing something
 like a wait for lock, which would unconditionally set and then reset
 ImmediateInterruptOK.

I sure hope we're not going to introduce more paths that do this, but I
am not going to bet on it...

I remember trying to understand why the deadlock detector is safe doing
as it does when I was all green and was trying to understand the HS patch
and it drove me nuts.

 BTW, I'm about to go put a HOLD_INTERRUPTS/RESUME_INTERRUPTS into
 HandleCatchupInterrupt and HandleNotifyInterrupt too, for essentially the
 same reason.

Sounds good, both already do a ProcessInterrupts() at their end, so the
holdoff shouldn't lead to absorbed interrupts.

I wonder what to do about bgworker's bgworker_die()? I don't really see
how that can be fixed without breaking the API?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Extension Templates S03E11

2013-12-13 Thread Jeff Davis

On Wed, 2013-12-11 at 20:49 +0100, Dimitri Fontaine wrote:
 Robert Haas robertmh...@gmail.com writes:
  I strongly agree.  PostgreSQL has succeeded because we try not to do
  things at all until we're sure we know how to do them right.
 
 I still agree to the principle, or I wouldn't even try. Not in details,
 because the current design passed all the usual criteria a year ago.
 
   http://www.postgresql.org/message-id/6466.1354817...@sss.pgh.pa.us

For what it's worth, I think the idea of extension templates has good
conceptual integrity. Extensions are external blobs. To make them work
more smoothly in several ways, we move them into the catalog. They have
pretty much the same upsides and downsides of our existing extensions,
aside from issues directly related to filesystem vs. catalog.

Stephen had some legitimate concerns. I don't entirely agree, but they
are legitimate concerns, and we don't want to just override them.

At the same time, I'm skeptical of the alternatives Stephen offered
(though I don't think he intended them as a full proposal).

So right now I'm discouraged about the whole idea of installing
extensions using SQL. I don't see a lot of great options. On top of
that, the inability to handle native code limits the number of
extensions that could make use of such a facility, which dampens my
enthusiasm.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: make_timestamp function

2013-12-13 Thread Martijn van Oosterhout

On Thu, Dec 12, 2013 at 08:50:26PM -0200, Fabrízio de Royes Mello wrote:
 On Thu, Dec 12, 2013 at 3:11 PM, Pavel Stehule pavel.steh...@gmail.comwrote:
 
  Hello
 
  this patch try to complete a set of functions make_date and make_timestamp.
 
 
 Could we have the 'make_timestamptz' function too?

Wouldn't this just be:

SELECT make_timestamp(...) at time zone 'foo';

(assuming make_timestamp actually returns a timestamp and not a
timestamptz).

or do you mean something else?

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature

Re: [HACKERS] stuck spinlock

Andres Freund and...@2ndquadrant.com writes:
 I wonder what to do about bgworker's bgworker_die()? I don't really see
 how that can be fixed without breaking the API?

IMO it should be flushed and bgworkers should use the same die() handler
as every other backend, or else one like the one in worker_spi, which just
sets a flag for testing later.  If we try to change the signal handling
contracts, 80% of backend code will be unusable in bgworkers, which is not
where we want to be I think.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

On 2013-12-13 12:54:09 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I wonder what to do about bgworker's bgworker_die()? I don't really see
  how that can be fixed without breaking the API?
 
 IMO it should be flushed and bgworkers should use the same die() handler
 as every other backend, or else one like the one in worker_spi, which just
 sets a flag for testing later.

Agreed on not going forward like now, but I don't really see how they
could usefully use die(). I think we should just mandate that every
bgworker conneced to shared memory registers a sigterm handler - we
could put a check into BackgroundWorkerUnblockSignals(). We should leave
the current handler in for unconnected one though...
bgworkers are supposed to be written as a loop around procLatch, so
adding a !got_sigterm, probably isn't too hard.

It sucks that people might have bgworkers out there that don't register
their own sigterm handlers, but adding a sigterm handler will be
backward compatible and it's in the example bgworker, so it's probably
not too bad.

 If we try to change the signal handling
 contracts, 80% of backend code will be unusable in bgworkers, which is not
 where we want to be I think.

Yea, I think that's out of the question.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: make_timestamp function

On Fri, Dec 13, 2013 at 3:53 PM, Martijn van Oosterhout klep...@svana.org
wrote:

 On Thu, Dec 12, 2013 at 08:50:26PM -0200, Fabrízio de Royes Mello wrote:
  On Thu, Dec 12, 2013 at 3:11 PM, Pavel Stehule pavel.steh...@gmail.com
wrote:
 
   Hello
  
   this patch try to complete a set of functions make_date and
make_timestamp.
  
  
  Could we have the 'make_timestamptz' function too?

 Wouldn't this just be:

 SELECT make_timestamp(...) at time zone 'foo';

 (assuming make_timestamp actually returns a timestamp and not a
 timestamptz).

 or do you mean something else?


Your example will convert the timestamp into time zone defined by 'at time
zone...'.

I think the goal of the make_date/time/timestamp function series is build
a date/time/timestamp from scratch, so the use of 'make_timestamptz' is to
build a specific timestamp with timezone and don't convert it.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello

Re: [HACKERS] autovacuum_work_mem

2013-12-13 Thread Bruce Momjian

On Wed, Dec 11, 2013 at 10:35:32AM -0800, Josh Berkus wrote:
 On 12/11/2013 09:57 AM, Robert Haas wrote:
  I don't agree with that assessment.  Anything that involves changing
  the scheduling of autovacuum is a major project that will legitimately
  provoke much controversy.  Extensive testing will be needed to prove
  that the new algorithm doesn't perform worse than the current
  algorithm in any important cases.  I have my doubts about whether that
  can be accomplished in an entire release cycle, let alone 2-3 days.
  In contrast, the patch proposed does something that is easy to
  understand, clearly safe, and an improvement over what we have now.
 
 +1
 
 There is an inherent tuning and troubleshooting challenge in anything
 involving a feedback loop.

We have avoided feedback loops in the past.  I think eventually we are
going to need to tackle them, but it is a big job, and vacuum memory
usage would be at the bottom of my list of feedback loop tasks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autovacuum_work_mem

2013-12-13 Thread Heikki Linnakangas


On 12/13/2013 08:24 PM, Bruce Momjian wrote:

On Wed, Dec 11, 2013 at 10:35:32AM -0800, Josh Berkus wrote:

On 12/11/2013 09:57 AM, Robert Haas wrote:

I don't agree with that assessment.  Anything that involves changing
the scheduling of autovacuum is a major project that will legitimately
provoke much controversy.  Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases.  I have my doubts about whether that
can be accomplished in an entire release cycle, let alone 2-3 days.
In contrast, the patch proposed does something that is easy to
understand, clearly safe, and an improvement over what we have now.


+1

There is an inherent tuning and troubleshooting challenge in anything
involving a feedback loop.


We have avoided feedback loops in the past.  I think eventually we are
going to need to tackle them, but it is a big job, and vacuum memory
usage would be at the bottom of my list of feedback loop tasks.


I haven't been following this thread in detail, but would it help if we 
implemented a scheme to reduce (auto)vacuum's memory usage? Such schemes 
have been discussed in the past, packing the list of dead items more 
tightly.


I guess you'd still want to have a limit on autovacuum's memory usage. A 
much lower limit than you'd want to allow for one-off CREATE INDEX 
operations and such.


(Personally, I don't care whether we add this new option or not. And -1 
for feedback loops.)


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Robert Haas

On Fri, Dec 13, 2013 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 And while we're on the subject ... isn't bgworker_die() utterly and
 completely broken?  That unconditional elog(FATAL) means that no process
 using that handler can do anything remotely interesting, like say touch
 shared memory.

Yeah, but for the record (since I see I got cc'd here), that's not my
fault.  I moved it into bgworker.c, but it's been like that since
Alvaro's original commit of the bgworker facility
(da07a1e856511dca59cbb1357616e26baa64428e).

While I was developing the shared memory message queueing stuff, I
experimented using die() as the signal handler and didn't have very
good luck.  I can't remember exactly what wasn't working any more,
though.  I agree that it would be good if we can make that work.
Right now we've got other modules growing warts like
WalRcvImmediateInterruptOK, which doesn't seem good.

It seems to me that we should change every place that temporarily
changes ImmediateInterruptOK to restore the original value instead of
making assumptions about what it must have been.
ClientAuthentication(), md5_crypt_verify(), PGSemaphoreLock() and
WalSndLoop() all have this disease.

I also really wonder if notify and catchup interrupts ought to be
taught to respect ImmediateInterruptOK, instead of having their own
switches for the same thing.  Right now there are an awful lot of
places that do this:

ImmediateInterruptOK = false;   /* not idle anymore */
DisableNotifyInterrupt();
DisableCatchupInterrupt();

...and that doesn't seem like a good thing.  Heaven forfend someone
were to do only two out of the three.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Robert Haas

On Fri, Dec 13, 2013 at 1:15 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-13 12:54:09 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I wonder what to do about bgworker's bgworker_die()? I don't really see
  how that can be fixed without breaking the API?

 IMO it should be flushed and bgworkers should use the same die() handler
 as every other backend, or else one like the one in worker_spi, which just
 sets a flag for testing later.

 Agreed on not going forward like now, but I don't really see how they
 could usefully use die(). I think we should just mandate that every
 bgworker conneced to shared memory registers a sigterm handler - we
 could put a check into BackgroundWorkerUnblockSignals(). We should leave
 the current handler in for unconnected one though...
 bgworkers are supposed to be written as a loop around procLatch, so
 adding a !got_sigterm, probably isn't too hard.

I think the !got_sigterm thing is complete bunk.  If a background
worker is running SQL queries, it really ought to honor a query cancel
or sigterm at the next CHECK_FOR_INTERRUPTS().  But the default
background worker handler for SIGUSR1 just sets the process latch, and
worker_spi's sigterm handler just sets a private variable got_sigterm.
 So ProcessInterrupts() will never get called, and if it did it
wouldn't do anything anyway. That's really pretty horrible, because it
means that the query worker_spi runs can't be interrupted short of a
SIGQUIT.  So I think worker_spi is really a very bad example of how to
do this right.  In the as-yet-uncommitted test-shm-mq-v1.patch, I did
this:

+static void
+handle_sigterm(SIGNAL_ARGS)
+{
+   int save_errno = errno;
+
+   if (MyProc)
+   SetLatch(MyProc-procLatch);
+
+   if (!proc_exit_inprogress)
+   {
+   InterruptPending = true;
+   ProcDiePending = true;
+   }
+
+   errno = save_errno;
+}

...but I'm not 100% sure that's right, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autovacuum_work_mem

2013-12-13 Thread Alvaro Herrera

Heikki Linnakangas escribió:

 I haven't been following this thread in detail, but would it help if
 we implemented a scheme to reduce (auto)vacuum's memory usage? Such
 schemes have been discussed in the past, packing the list of dead
 items more tightly.

Well, it would help some, but it wouldn't eliminate the problem
completely.  Autovacuum scales its memory usage based on the size of the
table.  There will always be a table so gigantic that a maximum
allocated memory is to be expected; and DBAs will need a way to limit
the memory consumption even if we pack dead items more densely.

I was playing with keeping item pointers for each page in a bitmapset.
This was pretty neat and used a lot less memory than currently, except
that I needed to allocate a large chunk of memory and then have
bitmapsets use words within that large allocation space.  It turned out
to be too ugly so I abandoned it.  With the varbit encoding thingy in
the recent GIN patchset, maybe it would be workable.

I think a more dense packing is desired regardless of this patch.
Maybe we can have a generic module for item pointer arrays which could
be useful for both autovacuum and GIN.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Extension Templates S03E11

2013-12-13 Thread Stephen Frost

* Jeff Davis (pg...@j-davis.com) wrote:
 For what it's worth, I think the idea of extension templates has good
 conceptual integrity. Extensions are external blobs. To make them work
 more smoothly in several ways, we move them into the catalog. They have
 pretty much the same upsides and downsides of our existing extensions,
 aside from issues directly related to filesystem vs. catalog.

I've never particularly liked the idea that extensions are external
blobs, to be honest.

 Stephen had some legitimate concerns. I don't entirely agree, but they
 are legitimate concerns, and we don't want to just override them.
 
 At the same time, I'm skeptical of the alternatives Stephen offered
 (though I don't think he intended them as a full proposal).

It was more thoughts on how I'd expect these things to work.  I've also
been talking to David about what he'd like to see done with PGXN and his
thinking was a way to automate creating RPMs and DEBs based on PGXN spec
files, but he points out that the challenge there is dealing with
external dependencies.

 So right now I'm discouraged about the whole idea of installing
 extensions using SQL. I don't see a lot of great options. On top of
 that, the inability to handle native code limits the number of
 extensions that could make use of such a facility, which dampens my
 enthusiasm.

Yeah, having looked at PGXN, it turns out that some 80+% of the
extensions there have .c code included, something well beyond what I was
expecting, but most of those cases also look to have external
dependencies (eg: FDWs), which really makes me doubt this notion that
they could be distributed independently and outside of the OS packaging
system (or that it would be a particularly good idea to even try...).

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Logging WAL when updating hintbit

2013-12-13 Thread Bruce Momjian

On Fri, Dec 13, 2013 at 10:14:06AM -0500, Tom Lane wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
  I'm not totally satisfied with the name of the GUC, wal_log_hintbits. 
 
 Me either; at the very least, it's short an underscore: wal_log_hint_bits
 would be more readable.  But how about just wal_log_hints?

Is wal_log redundant (two logs)?  How about wal_record_hit_bits?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Alvaro Herrera

Robert Haas escribió:
 On Fri, Dec 13, 2013 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  And while we're on the subject ... isn't bgworker_die() utterly and
  completely broken?  That unconditional elog(FATAL) means that no process
  using that handler can do anything remotely interesting, like say touch
  shared memory.
 
 Yeah, but for the record (since I see I got cc'd here), that's not my
 fault.  I moved it into bgworker.c, but it's been like that since
 Alvaro's original commit of the bgworker facility
 (da07a1e856511dca59cbb1357616e26baa64428e).

I see the blame falls on me ;-)  I reckon I blindly copied this
stuff from elsewhere without thinking very much about it.  As noted
upthread, even the example code uses a different handler for SIGTERM.
There wasn't much else that we could do; simply letting the generic code
run without any SIGTERM installed didn't seem the right thing to do.
(You probably recall that the business of starting workers with signals
blocked was installed later.)

I found a few workers in github in a quick search:
https://github.com/umitanuki/mongres/blob/master/mongres.c
https://github.com/markwkm/pg_httpd/blob/master/pg_httpd.c
https://github.com/ibarwick/config_log/blob/master/config_log.c
https://github.com/gleu/stats_recorder/blob/master/stats_recorder_spi.c
https://github.com/michaelpq/pg_workers/blob/master/kill_idle/kill_idle.c
https://github.com/le0pard/pg_web/blob/master/src/pg_web.c

Not a single one of the uses bgworker_die() -- they all follow
worker_spi's lead of setting a got_sigterm flag and SetLatch().

If there was a way for raising an #error at compile time whenever a
worker relies on the existing signal handler, I would vote for doing
that.  (But then I have no idea how to do such a thing.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Extension Templates S03E11

2013-12-13 Thread Dimitri Fontaine

Stephen Frost sfr...@snowman.net writes:
 Stephen had some legitimate concerns. I don't entirely agree, but they
 are legitimate concerns, and we don't want to just override them.

The main disturbing concern for me is to do with pg_restore and major
upgrades, where the blob we have in the catalogs might not parse
correctly into the new version.

Of course, it's easy enough to fix either the source database or the
pg_restore text itself, as it's just plain SQL in there.

 At the same time, I'm skeptical of the alternatives Stephen offered
 (though I don't think he intended them as a full proposal).

I began working out a full proposal out of them, and here's the most
important conclusions I can offer from that work:

  - The main constraint we have to work against is that no matter what,
extension objects are not going to be selected for pg_dump.

That basically means that the only model of extensions we accept is
ever going to be the contrib model, whereas my current attemps are
meant to evolve the extension model way beyond contrib.

The tension we have there is extremely hard to resolve, which
explains the strange idea of storing SQL blobs in the catalogs.

  - While it's possible to work out some new versioned container
objects, I see mainly 3 consequences of doing so:

 1. We're going to kill extensions, which design would be limited to
only care about contribs: no data, code in C, any single version
of the extension is intended to work against only one major
version of PostgreSQL.

Guess what, I know of no extension maintained by those rules
outside of contribs.

The third rule is the easy one to work around, of course, except
if you consider the pg_restore behaviour, framed as a bug by
Stephen.

 2. The new thing would seamlessly allow for data only extensions,
such as census or geolocation, etc, because we would actually
backup that data.

 3. The separation of concerns in between the extension author who
maintains the install and upgrade scripts and the DBA who
applies them is going to be pushed entirely to the client
software, and that sounds way more fragile that what we have
now.

So I see 3 solutions ahead of us: we allow extensions to manage more
than the contrib model, a third-party software is going to work around
it to make extensions usable for non-contrib things, or we're providing
a new kind of container in core and kill extensions.

Of course, none of them are exclusive, and I think realistically we're
going to have to live with at least 2 of those alternatives in a near
future.

 It was more thoughts on how I'd expect these things to work.  I've also
 been talking to David about what he'd like to see done with PGXN and his
 thinking was a way to automate creating RPMs and DEBs based on PGXN spec
 files, but he points out that the challenge there is dealing with
 external dependencies.

With all due respect, we don't live in a world where its customary to
have root privileges on your production service anymore.

 So right now I'm discouraged about the whole idea of installing
 extensions using SQL. I don't see a lot of great options. On top of
 that, the inability to handle native code limits the number of
 extensions that could make use of such a facility, which dampens my
 enthusiasm.

Rejoice! Have a look at the documentation for dynamic_library_path. Any
distribution or packaging software would trivially be able to make it so
that the modules (.so) are loaded from a non-system place.

Only missing is another path GUC to allow PostgreSQL to search for the
extension control files in non-system places too, meanwhile it's already
possible to edit the file system privileges.

 Yeah, having looked at PGXN, it turns out that some 80+% of the
 extensions there have .c code included, something well beyond what I was

I call that a chicken and eggs problem.

Every serious PostgreSQL user will have stored procedure code to
maintain, where “serious” means that PostgreSQL is considered a part of
the application platform.

The only reason why this code is not organized as an extension today is
because extensions are restricted to the contrib model. There's
absolutely no surprise in discovering that current extensions in the
wild are about all fitting the only currently supported model.

 expecting, but most of those cases also look to have external
 dependencies (eg: FDWs), which really makes me doubt this notion that
 they could be distributed independently and outside of the OS packaging
 system (or that it would be a particularly good idea to even try...).

It seems to me that if dynamic_library_path and the system dynamic
loader are both looking at the same places, then storing modules and
their dependencies there is going to be all you need to make it work.

The main packaging system (debian and red hat) have automatic dependency
tracking

Re: [HACKERS] stuck spinlock

Robert Haas robertmh...@gmail.com writes:
 It seems to me that we should change every place that temporarily
 changes ImmediateInterruptOK to restore the original value instead of
 making assumptions about what it must have been.

No, that's backwards.  The problem isn't that it could be sane to enter,
say, PGSemaphoreLock with ImmediateInterruptOK already true; to get
there, you'd have had to pass through boatloads of code in which it
patently isn't safe for that to be the case.  Rather, the problem is
that once you get there it might *still* be unsafe to throw an error.
HOLD/RESUME_INTERRUPTS are designed to handle exactly that problem.

The only other way we could handle it would be if every path from (say)
HandleNotifyInterrupt down to PGSemaphoreLock passed a bool flag to tell
it don't turn on ImmediateInterruptOK; which is pretty unworkable.

 I also really wonder if notify and catchup interrupts ought to be
 taught to respect ImmediateInterruptOK, instead of having their own
 switches for the same thing.

They're not switches for the same thing though; the effects are
different, and in fact there are places that do and should flip only some
of these, PGSemaphoreLock being just the most obvious one.  I agree that
it might be possible to simplify things, but it would take more thought
than you seem to have put into it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: make_timestamp function

=?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
 I think the goal of the make_date/time/timestamp function series is build
 a date/time/timestamp from scratch, so the use of 'make_timestamptz' is to
 build a specific timestamp with timezone and don't convert it.

Yeah; we don't really want to incur an extra timezone rotation just to get
to a timestamptz.  However, it's not clear to me if make_timestamptz()
needs to have an explicit zone parameter or not.  It could just assume
that you meant the active timezone.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: make_timestamp function

On Fri, Dec 13, 2013 at 5:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com
writes:
  I think the goal of the make_date/time/timestamp function series is
build
  a date/time/timestamp from scratch, so the use of 'make_timestamptz' is
to
  build a specific timestamp with timezone and don't convert it.

 Yeah; we don't really want to incur an extra timezone rotation just to get
 to a timestamptz.  However, it's not clear to me if make_timestamptz()
 needs to have an explicit zone parameter or not.  It could just assume
 that you meant the active timezone.


+1. And if you want a different timezone you can just set the 'timezone'
GUC.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello

Re: [HACKERS] patch: make_timestamp function

2013-12-13 Thread Andrew Dunstan



On 12/13/2013 02:35 PM, Tom Lane wrote:

=?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com writes:

I think the goal of the make_date/time/timestamp function series is build
a date/time/timestamp from scratch, so the use of 'make_timestamptz' is to
build a specific timestamp with timezone and don't convert it.

Yeah; we don't really want to incur an extra timezone rotation just to get
to a timestamptz.  However, it's not clear to me if make_timestamptz()
needs to have an explicit zone parameter or not.  It could just assume
that you meant the active timezone.





Why not overload the function, with one version having the explicit TZ 
param?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: make_timestamp function

2013-12-13 Thread Jim Nasby


On 12/13/13 1:49 PM, Fabrízio de Royes Mello wrote:


On Fri, Dec 13, 2013 at 5:35 PM, Tom Lane t...@sss.pgh.pa.us 
mailto:t...@sss.pgh.pa.us wrote:
 
  =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com 
mailto:fabriziome...@gmail.com writes:
   I think the goal of the make_date/time/timestamp function series is build
   a date/time/timestamp from scratch, so the use of 'make_timestamptz' is to
   build a specific timestamp with timezone and don't convert it.
 
  Yeah; we don't really want to incur an extra timezone rotation just to get
  to a timestamptz.  However, it's not clear to me if make_timestamptz()
  needs to have an explicit zone parameter or not.  It could just assume
  that you meant the active timezone.
 

+1. And if you want a different timezone you can just set the 'timezone' GUC.


Why wouldn't we have a version that optionally accepts the timezone? That 
mirrors what you can currently do with a cast from text, and having to set the 
GUC if you need a different TZ would be a real PITA.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing

2013-12-13 Thread Alvaro Herrera

Andres Freund wrote:
 On 2013-12-12 18:24:34 -0300, Alvaro Herrera wrote:

  +   else if (TransactionIdIsValid(update_xid)  !has_lockers)
  +   {
  +   /*
  +* If there's a single member and it's an update, pass it back 
  alone
  +* without creating a new Multi.  (XXX we could do this when 
  there's a
  +* single remaining locker, too, but that would complicate the 
  API too
  +* much; moreover, the case with the single updater is more
  +* interesting, because those are longer-lived.)
  +*/
  +   Assert(nnewmembers == 1);
  +   *flags |= FRM_RETURN_IS_XID;
  +   if (update_committed)
  +   *flags |= FRM_MARK_COMMITTED;
  +   xid = update_xid;
  +   }
 
 Afaics this will cause HEAP_KEYS_UPDATED to be reset, is that
 problematic? I don't really remember what it's needed for TBH...

So we do reset HEAP_KEYS_UPDATED, and in general that bit seems critical
for several things.  So it should be kept when a Xmax is carried over
from the pre-frozen version of the tuple.  But while reading through
that, I realize that we should also be keeping HEAP_HOT_UPDATED in that
case.  And particularly we should never clear HEAP_ONLY_TUPLE.

So I think heap_execute_freeze_tuple() is wrong, because it's resetting
the whole infomask to zero, and then setting it to only the bits that
heap_prepare_freeze_tuple decided that it needed set.  That seems bogus
to me.  heap_execute_freeze_tuple() should only clear a certain number
of bits, and then possibly set some of the same bits; but the remaining
flags should remain untouched.  So HEAP_KEYS_UPDATED, HEAP_UPDATED and
HEAP_HOT_UPDATED should be untouched by heap_execute_freeze_tuple;
heap_prepare_freeze_tuple needn't worry about querying those bits at
all.

Only when FreezeMultiXactId returns FRM_INVALIDATE_XMAX, and when the
Xmax is not a multi and it gets removed, should those three flags be
removed completely.

HEAP_ONLY_TUPLE should be untouched by the freezing protocol.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Merlin Moncure

On Fri, Dec 13, 2013 at 12:32 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Dec 13, 2013 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 And while we're on the subject ... isn't bgworker_die() utterly and
 completely broken?  That unconditional elog(FATAL) means that no process
 using that handler can do anything remotely interesting, like say touch
 shared memory.

 Yeah, but for the record (since I see I got cc'd here), that's not my
 fault.  I moved it into bgworker.c, but it's been like that since
 Alvaro's original commit of the bgworker facility
 (da07a1e856511dca59cbb1357616e26baa64428e).


Is this an edge case or something that will hit a lot of users?
Arbitrary server panics seems pretty serious...

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Christophe Pettus


On Dec 13, 2013, at 1:49 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Is this an edge case or something that will hit a lot of users?

My understanding (Tom can correct me if I'm wrong, I'm sure) is that it is an 
issue for servers on 9.3.2 where there are a lot of query cancellations due to 
facilities like statement_timeout or lock_timeout that cancel a query 
asynchronously.  I assume pg_cancel_backend() would apply as well.

We've only seen it on one client, and that client had a *lot* (thousands on 
thousands) of statement_timeout cancellations.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] stuck spinlock

2013-12-13 Thread Christophe Pettus


On Dec 13, 2013, at 8:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Please apply commit 478af9b79770da43a2d89fcc5872d09a2d8731f8 and see
 if that doesn't fix it for you.

It appears to fix it.  Thanks!

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Reference to parent query from ANY sublink

2013-12-13 Thread Antonin Houska

On 12/12/2013 04:36 PM, Tom Lane wrote:
 BTW, on further thought, I'm afraid this is a bigger can of worms than
 it appears.  The remarks above presume that the subquery is simple enough
 to be pulled up, which is the case in this example.  It might not be too
 hard to make that case work.  But what if the subquery *isn't* simple
 enough to be pulled up --- for instance, it includes grouping or
 aggregation?  Then there's no way to unify its WHERE clause with the upper
 semijoin qual.  At the very least, this breaks the uniqueify-then-do-a-
 plain-join implementation strategy for semijoins.

After having thought about it further, I think I understand.

 So I'm now thinking this patch isn't worth pursuing.  Getting all the
 corner cases right would be a significant amount of work, and in the
 end it would only benefit strangely-written queries.

Originally it seemed to me that I just (luckily) found a new opportunity
for the existing infrastructure. To change the infrastructure because of
this small feature would be exactly the opposite. Thanks for having
taken a look at it.

// Antonin Houska (Tony)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE