Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Boszormenyi Zoltan

On 2013-11-27 20:49, Alvaro Herrera wrote:

Boszormenyi Zoltan wrote:


If you consider all these:

- certain combinations of query and DECLARE stmt flags fail;
- adding NO SCROLL is breaking backward compatibility;
- the readahead code has to really know whether the cursor is
   scrollable so it can behave just like the server;

then returning the SCROLL / NO SCROLL flag in the command tag is
not a bad solution in my view. In fact, this was the only workable
solution I could come up with to make it work reliably when neither
SCROLL nor NO SCROLL is specified by the application.

Would it work to have a function of some sort to which you give a cursor
name and it returns whether it is scrollable or not?


D'oh. Yes, that would also work. Thanks for the idea. :-)
I will implement it and adapt my remaining patches.
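
For what it's worth, the server already exposes this information through the pg_cursors system view, so a dedicated function could presumably report the same thing. A sketch of the idea (the cursor name is a made-up example):

```sql
-- pg_cursors already reports scrollability for open cursors;
-- "mycur" is a placeholder name.
BEGIN;
DECLARE mycur CURSOR FOR SELECT relname FROM pg_class;
SELECT is_scrollable FROM pg_cursors WHERE name = 'mycur';
COMMIT;
```

A client-side library such as ecpg could run a query like this after opening a cursor whose scrollability it could not determine from the DECLARE statement alone.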

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-11-27 Thread Pavel Stehule
Hello


2013/11/27 Tom Lane 

> Dean Rasheed  writes:
> > Actually the IF EXISTS in DROP TABLE now applies to the schema as
> > well. Unfortunately there is currently no consistency across the
> > various DROP commands --- some tolerate a non-existent schema, while
> > others error out.
>
> Yeah.  I think now that we've had this discussion, we should make them
> all tolerate a non-existent schema.  I'm fine with having that happen
> over a series of patches rather than all at once though.
>
> > Also amongst those that tolerate a non-existent
> > schema, the resulting notices are not consistent --- some report the
> > schema-qualified object name, while others just report the local
> > object name.
>
> Less excited about this part, but on the whole I'd vote for the "schema
> "no_such_schema" does not exist" wording in cases where the schema isn't
> there.  The other way is less specific for no very good reason.
>

Can this behavior be used (see the attached patch, please)?

If it is correct, I'll work on a second patch that unifies the checks and
notices for the other DROP IF EXISTS statements.

Regards

Pavel



>
> regards, tom lane
>
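
If I understand the proposal correctly, the behavior under discussion would look roughly like this (schema and object names are made up; the notice wording follows Tom's suggestion above):

```sql
-- Proposed behavior sketch: DROP ... IF EXISTS tolerates a missing
-- schema and names it specifically in the notice.
DROP TRIGGER IF EXISTS some_trigger ON no_such_schema.some_table;
-- NOTICE:  schema "no_such_schema" does not exist, skipping
```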
commit a2de2b402247fded7feba90bf299e91ee14aacca
Author: Pavel Stehule 
Date:   Thu Nov 28 08:39:43 2013 +0100

initial

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index 9011190..234673f 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -942,26 +942,30 @@ get_object_address_relobject(ObjectType objtype, List *objname,
 
 		/* Extract relation name and open relation. */
 		relname = list_truncate(list_copy(objname), nnames - 1);
-		relation = heap_openrv(makeRangeVarFromNameList(relname),
-			   AccessShareLock);
-		reloid = RelationGetRelid(relation);
+		relation = heap_openrv_extended(makeRangeVarFromNameList(relname),
+			   AccessShareLock,
+			   missing_ok);
+
+		reloid = (relation != NULL) ? RelationGetRelid(relation) : InvalidOid;
 
 		switch (objtype)
 		{
 			case OBJECT_RULE:
 address.classId = RewriteRelationId;
-address.objectId = get_rewrite_oid(reloid, depname, missing_ok);
+address.objectId = OidIsValid(reloid) ?
+	get_rewrite_oid(reloid, depname, missing_ok) : InvalidOid;
 address.objectSubId = 0;
 break;
 			case OBJECT_TRIGGER:
 address.classId = TriggerRelationId;
-address.objectId = get_trigger_oid(reloid, depname, missing_ok);
+address.objectId = OidIsValid(reloid) ?
+	get_trigger_oid(reloid, depname, missing_ok) : InvalidOid;
 address.objectSubId = 0;
 break;
 			case OBJECT_CONSTRAINT:
 address.classId = ConstraintRelationId;
-address.objectId =
-	get_relation_constraint_oid(reloid, depname, missing_ok);
+address.objectId = OidIsValid(reloid) ?
+	get_relation_constraint_oid(reloid, depname, missing_ok) : InvalidOid;
 address.objectSubId = 0;
 break;
 			default:
@@ -975,7 +979,9 @@ get_object_address_relobject(ObjectType objtype, List *objname,
 		/* Avoid relcache leak when object not found. */
 		if (!OidIsValid(address.objectId))
 		{
-			heap_close(relation, AccessShareLock);
+			if (relation != NULL)
+heap_close(relation, AccessShareLock);
+
 			relation = NULL;	/* department of accident prevention */
 			return address;
 		}
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index b32ad3a..191aa17 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/pg_class.h"
 #include "catalog/pg_proc.h"
 #include "commands/defrem.h"
+#include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "parser/parse_type.h"
@@ -32,6 +33,12 @@
 static void does_not_exist_skipping(ObjectType objtype,
 		List *objname, List *objargs);
 
+static void prepare_notice_recheck_parent(const char *message,
+		List *objname,
+const char **msg,
+char **name,
+char **args);
+
 /*
  * Drop one or more objects.
  *
@@ -125,6 +132,53 @@ RemoveObjects(DropStmt *stmt)
 }
 
 /*
+ * Prepare text of "does not exist, skipping" message for triggers and rules,
+ * where we check the schema name first, then the table name, and only then
+ * raise a message about the missing object.
+ */
+static void
+prepare_notice_recheck_parent(const char *message, List *objname,
+			const char **msg,
+			char **name,
+			char **args)
+{
+	RangeVar	*rel;
+
+	rel = makeRangeVarFromNameList(list_truncate(list_copy(objname),
+			list_length(objname) - 1));
+
+	if (rel->schemaname != NULL && !OidIsValid(LookupNamespaceNoError(rel->schemaname)))
+	{
+		*msg = gettext_noop("schema \"%s\" does not exist, skipping");
+		*name = rel->schemaname;
+		*args = NULL;
+	}
+	else if (!OidIsValid(RangeVarGetRelid(rel, NoLock, true)))
+	{
+		*msg = gettext_noop("relation \"%s\" does not exist, skipping");
+		*name = rel->relname;
+		*args = NULL;
+	}
+	else
+	{
+		StringInfoData 

Re: [HACKERS] GIN improvements part 1: additional information

2013-11-27 Thread Alexander Korotkov
On Wed, Nov 27, 2013 at 1:14 AM, Heikki Linnakangas  wrote:

> On 11/26/13 15:34, Alexander Korotkov wrote:
>
>> What's your plans about GIN now? I tried to rebase packed posting lists
>> with head. But I found that you've changed interface of placeToPage
>> function. That's conflicts with packed posting lists, because
>> dataPlaceToPageLeaf needs not only offset number to describe place to
>> insert item pointer. Do you like to commit rework of handling GIN
>> incomplete splits before?
>>
>
> Yeah, I'm planning to get back to this patch after committing the
> incomplete splits patch. I think the refactoring of the WAL-logging that I
> did in that patch will simplify this patch, too. I'll take a look at
> Michael's latest comments on the incomplete splits patch tomorrow, so I
> should get back to this on Thursday or Friday.


Should I try to rebase this patch now, or do you plan to do it yourself? Some
changes, like the "void *insertdata" argument, make me think you have a
particular plan for rebasing this patch, but I didn't quite follow it.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections

2013-11-27 Thread Rajeev rastogi
On 28 November 2013, Amit Kapila wrote:
> >> Further Review of this patch:
> >> b. Why display 'First log segment after reset' in the 'Current
> >> pg_control values' section, now that the same information
> >> is available in the new section "Values to be used after reset:"?
> >
> > It may not always be the same. Suppose the user has given a new value
> > of segment number and TLI; then it will be different. Otherwise it
> > will be the same. So I have changed the code so that the value of the
> > XLOG segment # read from the checkpoint record gets printed as part
> > of the current values, and any further new value gets printed in the
> > values to be used after reset (this will always be at least one more
> > than the current value), because of the following code in
> > FindEndOfXLOG:
> >
> >     xlogbytepos = newXlogSegNo * ControlFile.xlog_seg_size;
> >     newXlogSegNo = (xlogbytepos + XLogSegSize - 1) / XLogSegSize;
> >     newXlogSegNo++;
> 
> It can be different, but I don't think we should display it in both
> sections because:
> a. it doesn't belong to the control file;
> b. if you carefully look at the original link
> (http://www.postgresql.org/message-id/1283277511-sup-2...@alvh.no-ip.org),
> these values are not displayed in the pg_control values section.
> 
> So I suggest it is better to remove it from the pg_control values section.

Done as per suggestion.

> 
> Few new comments:
> 
> 1.
> Formatting for 'OldestMulti's DB' and 'OldestXID's DB' is not
> consistent with the other values; try printing it to check.

Changed to align output.

> 2.
> +It will print values in two sections. In first section it will
> print all original/guessed values
> +and in second section all values to be used after reset. In
> second section filename will be
> +always printed as segment number will be at-least one more than
> current. Also if any other parameter
> +is given using appropriate switch, then those new values will be
> also printed.
> 
> a. the length of newly added lines is not in sync with the previous
> text; try to keep line length under 80 chars.
> b. I think there is no need for the text starting at "In second
> section"; you can remove all text after that point.

Done.

> 3.
> ! printf(_("  -n   no update, just show extracted control values (for testing) and to be reset values of parameters(if given)\n"));
> I find this line too long and elaborate; try to make it shorter.

Changed accordingly.
 

Please provide your opinion.

Thanks and Regards,
Kumar Rajeev Rastogi


pg_resetxlogsectionV4.patch
Description: pg_resetxlogsectionV4.patch



Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections

2013-11-27 Thread Amit Kapila
On Tue, Nov 26, 2013 at 5:33 PM, Rajeev rastogi
 wrote:
> On 26 November 2013, Amit Kapila wrote:
>> Further Review of this patch:
>> b. Why display 'First log segment after reset' in the 'Current
>> pg_control values' section, now that the same information
>> is available in the new section "Values to be used after reset:"?
>
> It may not always be the same. Suppose the user has given a new value of
> segment number and TLI; then it will be different. Otherwise it will be
> the same. So I have changed the code so that the value of the XLOG
> segment # read from the checkpoint record gets printed as part of the
> current values, and any further new value gets printed in the values to
> be used after reset (this will always be at least one more than the
> current value), because of the following code in FindEndOfXLOG:
>
>     xlogbytepos = newXlogSegNo * ControlFile.xlog_seg_size;
>     newXlogSegNo = (xlogbytepos + XLogSegSize - 1) / XLogSegSize;
>     newXlogSegNo++;

It can be different, but I don't think we should display it in both
sections because:
a. it doesn't belong to the control file;
b. if you carefully look at the original link
(http://www.postgresql.org/message-id/1283277511-sup-2...@alvh.no-ip.org),
these values are not displayed in the pg_control values section.

So I suggest it is better to remove it from the pg_control values section.

Few new comments:

1.
Formatting for 'OldestMulti's DB' and 'OldestXID's DB' is not
consistent with the other values; try printing it to check.

2.
+It will print values in two sections. In first section it will
+print all original/guessed values and in second section all values
+to be used after reset. In second section filename will be always
+printed as segment number will be at-least one more than current.
+Also if any other parameter is given using appropriate switch, then
+those new values will be also printed.

a. the length of newly added lines is not in sync with the previous
text; try to keep line length under 80 chars.
b. I think there is no need for the text starting at "In second
section"; you can remove all text after that point.

3.
! printf(_("  -n   no update, just show extracted control values (for testing) and to be reset values of parameters(if given)\n"));
I find this line too long and elaborate; try to make it shorter.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Get more from indices.

2013-11-27 Thread Etsuro Fujita
I wrote:
> Kyotaro HORIGUCHI wrote:
> > > * In get_relation_info(), the patch determines the branch condition
> > > "keyattno != ObjectIdAttributeNumber".  I fail to understand the
> > > meaning of this branch condition.  Could you explain about it?

> > Literally answering, it means oid cannot be NULL (if it exists).

> Understood.  Thank you for the detailed information.  But I'm not sure
> it's worth complicating the code.  What use cases are you thinking of?

Having said that, I've reconsidered this, and have started to think it would
be better for all system attributes to be processed in the same way as user
attributes, because that makes the code simpler.  Yes, as you mention, it's
not necessarily guaranteed that system attributes have the uniqueness
property in general, but that's another problem.

Thanks,

Best regards,
Etsuro Fujita





Re: [HACKERS] Put json type into alphabetical order in manual table

2013-11-27 Thread Peter Eisentraut
Fixed.





Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 03:36:12PM -0800, Kevin Grittner wrote:
> Tom Lane  wrote:
> 
> > I'm inclined to agree with Kevin that this behavior is wrong and
> > should be fixed (and back-patched), so far as pg_dumpall is concerned.
> > pg_dumpall's charter is to be able to recreate a database cluster's
> > contents in a virgin installation, but it's failing to honor that
> > contract if the cluster has any ALTER DATABASE SET default_read_only
> > settings.  Similarly, I think it's reasonable to try to make pg_upgrade
> > cope with the case.
> >
> > I also agree with *not* changing pg_dump, since it is not the charter
> > of pg_dump to recreate a whole cluster, and the objection about possibly
> > restoring into a database that was meant to be protected by this setting
> > seems to have some force.
> 
> For example, I have seen dumps accidentally restored to the
> postgres database on multiple occasions.  You might, for example,
> flag the postgres database with this, and thereby block such
> accidents.  The patch as it stands would allow pg_dumpall to
> replicate such a cluster, flag and all.  Without the patch you get
> many errors.
> 
> It is also much easier to work around with pg_dump output.  You
> could get a psql connection to a database, set this off for the
> connection, and use \i to read the pg_dump output file.  Or you
> could concatenate a SET statement in front of the pg_dump output
> when piping it in.  There is no correspondingly easy solution for
> pg_dumpall.

OK, I have read through this again, and now I see the idea is that we
are trying to have pg_dumpall override the default_transaction_read_only
value, but keep it for pg_dump restores.  That is a pretty tricky
use-case, so I think we need to mention this in the C comments.

You and Tom think we should backpatch, and it seems only I am against
it, so I suppose you can move forward.  We are packaging soon for a minor
release.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
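
The pg_dump workaround Kevin describes above — concatenating a SET statement in front of the dump before piping it in — might look like this (file and database names are placeholders; the actual psql step is commented out since it needs a running server):

```shell
# Stand-in for real pg_dump output, for illustration only.
echo 'SELECT 1;' > dump.sql

# Concatenate a SET statement in front of the dump.
{ echo 'SET default_transaction_read_only = off;'; cat dump.sql; } > restore.sql

head -n 1 restore.sql   # the override now precedes the dump

# psql mydb < restore.sql   # actual restore (requires a server)
```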




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 03:36:12PM -0800, Kevin Grittner wrote:
> Of the people who posted on this thread supporting that, I think
> Tom said it best:
> 
> Tom Lane  wrote:
> 
> > I'm inclined to agree with Kevin that this behavior is wrong and
> > should be fixed (and back-patched), so far as pg_dumpall is concerned.
> > pg_dumpall's charter is to be able to recreate a database cluster's
> > contents in a virgin installation, but it's failing to honor that
> > contract if the cluster has any ALTER DATABASE SET default_read_only
> > settings.  Similarly, I think it's reasonable to try to make pg_upgrade
> > cope with the case.
> >
> > I also agree with *not* changing pg_dump, since it is not the charter
> > of pg_dump to recreate a whole cluster, and the objection about possibly
> > restoring into a database that was meant to be protected by this setting
> > seems to have some force.
> 
> For example, I have seen dumps accidentally restored to the
> postgres database on multiple occasions.  You might, for example,
> flag the postgres database with this, and thereby block such
> accidents.  The patch as it stands would allow pg_dumpall to
> replicate such a cluster, flag and all.  Without the patch you get
> many errors.
> 
> It is also much easier to work around with pg_dump output.  You
> could get a psql connection to a database, set this off for the
> connection, and use \i to read the pg_dump output file.  Or you
> could concatenate a SET statement in front of the pg_dump output
> when piping it in.  There is no correspondingly easy solution for
> pg_dumpall.

Well, I can understand that, but part of the argument was that
default_transaction_read_only is not part of the database, but rather
just the transaction default:

> Karsten wrote:
> Maybe I am splitting hairs but setting transactions to readonly
> per default does not mean the database *as such* is to be readonly.
> It literally applies to the *default* state of transactions (as
> opposed to the ONLY state of transactions). No more, no less.

I ask again!

> What is the logic that has us setting statement_timeout in
> pg_dump but default_transaction_read_only in pg_dumpall?

Why can't I get an answer to that question?  Is it that
statement_timeout is less likely to lead to a restore failure?  Are all
the other settings output from pg_dump safe?  Is only
default_transaction_read_only a problem?  Whatever the answer is, the
patch should explain why we are singling out
default_transaction_read_only for pg_dumpall use and everything else is
in pg_dump.

Why does it feel like I am going around in circles here?  I feel like I
am reliving the materialized view record comparison thread all over
again.  :-(

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Tom Lane
I wrote:
> We could still do this if we were willing to actually reject requests
> for session level locks on fast-path-eligible lock types.  At the moment
> that costs us nothing really.  If anyone ever did have a use case, we
> could consider adding the extra logic to support it.

Nope, that *still* doesn't work, because in non-allLocks mode the main
loop won't clear any locks that have been promoted from fastpath to
regular.  Sigh.  For the moment I'm proposing that we just re-fetch
the list header after acquiring the lock.  The attached patch is slightly
more verbose than that, because I took the opportunity to reformulate the
while() loop as a for() loop and thereby eliminate some goto's.

regards, tom lane

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 072b276..75df457 100644
*** a/src/backend/storage/lmgr/lock.c
--- b/src/backend/storage/lmgr/lock.c
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2098,2116 
  	{
  		LWLockId	partitionLock = FirstLockMgrLock + partition;
  		SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
  
! 		proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
! 			 offsetof(PROCLOCK, procLink));
! 
! 		if (!proclock)
  			continue;			/* needn't examine this partition */
  
  		LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
! 		while (proclock)
  		{
  			bool		wakeupNeeded = false;
- 			PROCLOCK   *nextplock;
  
  			/* Get link first, since we may unlink/delete this proclock */
  			nextplock = (PROCLOCK *)
--- 2098,2134 
  	{
  		LWLockId	partitionLock = FirstLockMgrLock + partition;
  		SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
+ 		PROCLOCK   *nextplock;
  
! 		/*
! 		 * If the proclock list for this partition is empty, we can skip
! 		 * acquiring the partition lock.  This optimization is trickier than
! 		 * it looks, because another backend could be in process of adding
! 		 * something to our proclock list due to promoting one of our
! 		 * fast-path locks.  However, any such lock must be one that we
! 		 * decided not to delete above, so it's okay to skip it again now;
! 		 * we'd just decide not to delete it again.  We must, however, be
! 		 * careful to re-fetch the list header once we've acquired the
! 		 * partition lock, to be sure we see the up to date version.
! 		 *
! 		 * XXX This argument assumes that the locallock table correctly
! 		 * represents all of our fast-path locks.  While allLocks mode
! 		 * guarantees to clean up all of our normal locks regardless of the
! 		 * locallock situation, we lose that guarantee for fast-path locks.
! 		 * This is not ideal.
! 		 */
! 		if (SHMQueueNext(procLocks, procLocks,
! 		 offsetof(PROCLOCK, procLink)) == NULL)
  			continue;			/* needn't examine this partition */
  
  		LWLockAcquire(partitionLock, LW_EXCLUSIVE);
  
! 		for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
! 			   offsetof(PROCLOCK, procLink));
! 			 proclock;
! 			 proclock = nextplock)
  		{
  			bool		wakeupNeeded = false;
  
  			/* Get link first, since we may unlink/delete this proclock */
  			nextplock = (PROCLOCK *)
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2123,2129 
  
  			/* Ignore items that are not of the lockmethod to be removed */
  			if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
! goto next_item;
  
  			/*
  			 * In allLocks mode, force release of all locks even if locallock
--- 2141,2147 
  
  			/* Ignore items that are not of the lockmethod to be removed */
  			if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
! continue;
  
  			/*
  			 * In allLocks mode, force release of all locks even if locallock
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2139,2145 
  			 * holdMask == 0 and are therefore recyclable
  			 */
  			if (proclock->releaseMask == 0 && proclock->holdMask != 0)
! goto next_item;
  
  			PROCLOCK_PRINT("LockReleaseAll", proclock);
  			LOCK_PRINT("LockReleaseAll", lock, 0);
--- 2157,2163 
  			 * holdMask == 0 and are therefore recyclable
  			 */
  			if (proclock->releaseMask == 0 && proclock->holdMask != 0)
! continue;
  
  			PROCLOCK_PRINT("LockReleaseAll", proclock);
  			LOCK_PRINT("LockReleaseAll", lock, 0);
*** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2168,2176 
  		lockMethodTable,
  		LockTagHashCode(&lock->tag),
  		wakeupNeeded);
- 
- 	next_item:
- 			proclock = nextplock;
  		}		/* loop over PROCLOCKs within this partition */
  
  		LWLockRelease(partitionLock);
--- 2186,2191 
*** PostPrepare_Locks(TransactionId xid)
*** 3142,3160 
  	{
  		LWLockId	partitionLock = FirstLockMgrLock + partition;
  		SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
  
! 		proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
! 			 offsetof(PROCLOCK, procLink));
! 
! 		if (!proclock)
  			continue;			/* needn't examine this partitio

Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 1:09 AM, Peter Geoghegan  wrote:
> This, along with the already-discussed "attempted to update invisible
> tuple" forms a full account of unexpected ERRORs seen during the
> extended run of the test case, so far.

Actually, it was slightly misleading of me to say it's the same
test-case; in fact, this time I ran each pgbench run with a variable,
random number of seconds between 2 and 20 inclusive (as opposed to
always 2 seconds). If you happen to need help recreating this, I am
happy to give it.

-- 
Peter Geoghegan




[HACKERS] Proposed feature: Selective Foreign Keys

2013-11-27 Thread Tom Dunstan
Hi all!

The Problem
-
One case that traditional SQL doesn't handle very well is when you have a child 
entity which can be attached to a number of different parent entities. Examples 
might be comments, tags or file attachments - we might have 20 different 
entities in the system that we would like our users to be able to add comments 
to, but the existing solutions for mapping this all have downsides.

Existing solution 1: Join tables ahoy
If I can have a list of comments on every other object in the system, and I 
want to have referential integrity, then the obvious thing to do is create a 
join table between each entity and the comments table.
Pros:
 - Straight forward, traditional object-with-collection-of-child SQL structure
Cons:
 - If a parent object gets deleted here, we can't use foreign keys to delete 
e.g. a child comment, so we'll have to either explicitly do it as part of our 
delete logic or have a cleanup process to catch orphans. Or do a dance with 
delete triggers on the join tables deleting the comment.
 - For n entities requiring comments in the system, we need n join tables. If 
we want both comments and e.g. tags and likes on all of our entities, we now 
have 3n join tables for what should be some relatively self-contained 
on-the-side data - this could be more tables than the entire rest of the system
 - It's difficult to create any kind of self-contained component for building 
applications in this scenario, as it will need to know about every other entity 
in the system, or be able to poke around inside whatever ORM or data access 
system that you have to work out what join tables it needs when running queries.

Existing solution 2: Enter the matrix
Instead of having n join tables, let's just mash them all together, with a 
column per parent object, and a check constraint to force exactly one of those 
columns to be set.
Pros:
 - Less bloat in the number of tables
Cons:
 - Doesn't solve orphan problem
 - Addition of a new entity which needs comments and we now have to add another 
column onto it, potentially rewriting the whole thing
 - Ugly

Existing solution 3: Embed the matrix
Kinda like the dependency matrix table, except that all the columns referencing 
potential parent objects go into the comment table instead.
Pros:
 - Everything contained in the comment table
 - No orphans, since cascaded deletes will now delete the actual comment
Cons:
 - Comment table now has references to every single type that it may be 
attached to
 - Addition of a new entity and we probably have to rewrite the comment table 
now

Existing solution 4: Abandon ye all referential integrity
Have a column indicating parent type and another one for the id. In the case of 
comments this would be directly on the comment table itself. In the case of 
something like tags that we might expect to be shared between entities, it 
would be in a single join table. 
Pros:
 - Pretty self-contained
 - Data model which has neither lots of empty columns nor lots of tables
 - Can make new entities "commentable" without rewriting anything
 - Because it's self-contained, can build application components that don't 
need to know much about the rest of your system. For example this is the 
approach that the grails taggable and commentable plugins take.
Cons:
 - No referential integrity, since we can't have a single column pointing to 
different tables with existing foreign key infrastructure
 - Since there's no real db support for doing things this way, existing ORMs 
etc don't really know how to use a single column to join against multiple 
different tables based on a discriminator or 'switch' column.

Existing solution 5: Everything's a thing
Make your entity hierarchy have a base level object which can have comments 
attached, and then everything that you need to be "commentable" has to extend 
that. You can do that in an ORM, or with table inheritance in the database.
Pros:
 - Single top-level thing to hang your data on
Cons:
 - You've polluted your object hierarchy just to hang some stuff off of the end 
of it rather than it being driven by behaviours
 - You're going to be paying a performance penalty - everything that extends 
that base level object will now join against it incessantly, and you now have a 
global id sequence or whatever that you may not want.

Basically none of the above handle the situation very well. The cleanest is 
solution 4, but lack of RI sucks.

Feature Proposal: Selective foreign keys.
-
Allow foreign keys to have where clauses. The above comment example using 
solution 4 might then look like then following:

CREATE TABLE comment (
  id bigserial primary key,
  content text not null,
  parent_entity regclass not null,
  parent_id int8
);
ALTER TABLE comment ADD CONSTRAINT comment_blog_fk FOREIGN KEY (parent_id) 
REFERENCES blog(id) WHERE (parent_entity = 'blog');
ALTER TABLE comment ADD CONSTRAINT comment_event_fk FOREIGN KEY (parent

Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Peter Eisentraut
On 11/26/13, 5:14 PM, Kevin Grittner wrote:
> I happened to build in a shell that was still set up for the clang
> address sanitizer, and got the attached report.  On a rerun it was
> repeatable.  XLogInsert() seems to read past the end of a variable
> allocated on the stack in doPickSplit(). I haven't tried to analyze
> it past that, since this part of the code is unfamiliar to me.

I also see that.  It only happens in 64-bit builds.




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 18:18:02 -0500, Noah Misch wrote:
> > I think the likelihood of the problem affecting !all-visible pages is
> > close to zero. Each vacuum will try to clean those, so they surely will
> > get vacuumed at some point. I think the only way that could happen is if
> > the ConditionalLockBufferForCleanup() fails in each vacuum. And that
> > seems a bit unlikely.
> 
> The page could have sat all-visible (through multiple XID epochs, let's say)
> until a recent UPDATE.

Good point.

> > > At first, I supposed we could offer a tool to blindly freeze such tuples.
> > > However, there's no guarantee that they are in harmony with recent
> > > changes to the database; transactions that wrongly considered those
> > > tuples invisible may have made decisions incompatible with their
> > > existence.  For example, reviving such a tuple could violate a UNIQUE
> > > constraint if the user had already replaced the missing row manually.
> > 
> > Good point, although since they are all on all-visible pages sequential
> > scans will currently already find those. It's primarily index scans that
> > won't. So it's not really reviving them...
> 
> True.  Since a dump/reload of the database would already get the duplicate key
> violation, the revival is not making anything clearly worse.  And if we hope
> for manual repair, many DBAs just won't do that at all.

Especially if it involves compiling C code...

> > The primary reason why I think it might be a good idea to "revive"
> > automatically is, that an eventual full-table/freeze vacuum will
> > currently delete them which seems bad.
> 
> Will it?  When the page became all-visible, the tuples were all hinted.  They
> will never be considered dead.  Every 2B transactions, they will alternate
> between live and not-yet-committed.

Good point again. And pretty damn good luck.

Although 9.3+ multixact rows look like they could return _DEAD since
we'll do a TransactionIdDidAbort() (via GetUpdateXid()) and
TransactionIdDidCommit() in there and we don't set XMAX_COMMITTED hint
bits for XMAX_IS_MULTI rows. As an additional problem, once multixact.c
has pruned the old multis away we'll get errors from there on.
But that's less likely to affect many rows.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Tom Lane
Dimitri Fontaine  writes:
> Atri Sharma  writes:
>> Can we add a function to the FDW API to define a SQL to foreign server
>> side conversion?

> I think that's not tenable. Even if you limit the discussion to the
> postgres_fdw, some queries against a past version will stop working
> against a new version (keyword changes, catalogs, default settings, etc.).

> I don't think you want to embed a full parser of every supported FOREIGN
> version of PostgreSQL inside the postgres_fdw code, so I think the text
> of the view needs to be an opaque string.

I'm not real sure what this'd buy us that wouldn't be done as well or
better by creating a view on the remote side.  (IOW, there's nothing
that says that the remote object backing a foreign table can't be a
view.)

regards, tom lane




Re: [HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Tom Lane
Andres Freund  writes:
> On 2013-11-27 17:25:44 -0500, Tom Lane wrote:
>> Or we
>> could add a restriction to EligibleForRelationFastPath that restricts
>> the fast-path mechanism to non-session locks, in which case we'd not
>> need to make the zeroing contingent on allLocks either.  I don't think
>> we take any fast-path-eligible locks in session mode anyway, so this
>> wouldn't be giving up any functionality on that end.

> That seems like the best thing to do to me.

I poked a bit more at this and realized that what I had in mind above
doesn't work: a session lock isn't a distinct lock type, just a different
owner, and moreover usually we ask for a session lock on a rel that we
already have locked normally.  So it doesn't work to say "skip trying fast
path if sessionLock" --- we typically won't get that far in the code.
We could conceivably make it work if we were willing to forcibly promote
an existing fastpath lock to regular when a session lock gets asked for,
but that's complication I don't want much, especially since it would go
completely untested in normal use.  (I verified the claim above that we
never take session locks on any fast-path-eligible lock types.)

We could still do this if we were willing to actually reject requests
for session level locks on fast-path-eligible lock types.  At the moment
that costs us nothing really.  If anyone ever did have a use case, we
could consider adding the extra logic to support it.

regards, tom lane




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Atri Sharma  writes:
> Can we add a function to the FDW API to define a SQL to foreign server
> side conversion?

I think that's not tenable. Even if you limit the discussion to the
postgres_fdw, some queries against a past version will stop working
against a new version (keyword changes, catalogs, default settings, etc.).

I don't think you want to embed a full parser of every supported FOREIGN
version of PostgreSQL inside the postgres_fdw code, so I think the text
of the view needs to be an opaque string.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Andres Freund
On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote:
> Andres Freund  wrote:
> > What do you mean with "how far back"?
> 
> What back-patching will be needed for a fix?  It sounds like 9.3?

Yep.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Kevin Grittner
Andres Freund  wrote:
> On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote:

>> ... however, I have not been able to trigger that Assert even with
>> gdb breakpoints at what I think are the right spots.  Any
>> suggestions?  How far back is it true that the above
>> HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
>> but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
>> tuple structure can return InvalidTransactionId?
>
> What do you mean with "how far back"?

What back-patching will be needed for a fix?  It sounds like 9.3?

> Afaics you need a multixact consisting out of a) the updater and b) a
> lock. That's probably easiest to get if you update a row in one session
> without changing the primary key, and then key-share lock it in
> another. Or the other way round.
> Then abort the updater.

Thanks!  I'll keep trying to generate a failure at that point.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Kevin Grittner
Bruce Momjian  wrote:
> On Wed, Nov 27, 2013 at 06:05:13AM -0800, Kevin Grittner wrote:
>> Bruce Momjian  wrote:

>>> I am confused what we are patching.  Are we patching pg_dump,
>>> pg_dumpall, or both?
>>
>> Just pg_dumpall.c.
>
> OK, there was a pg_dump patch earlier which we are not using now.

Right, that was v1, early in the discussion.  This is v3, based on
responding to the discussion on this thread.

> What is the logic that has us setting statement_timeout in
> pg_dump but default_transaction_read_only in pg_dumpall?

Of the people who posted on this thread supporting that, I think
Tom said it best:

Tom Lane  wrote:

> I'm inclined to agree with Kevin that this behavior is wrong and
> should be fixed (and back-patched), so far as pg_dumpall is concerned.
> pg_dumpall's charter is to be able to recreate a database cluster's
> contents in a virgin installation, but it's failing to honor that
> contract if the cluster has any ALTER DATABASE SET default_read_only
> settings.  Similarly, I think it's reasonable to try to make pg_upgrade
> cope with the case.
>
> I also agree with *not* changing pg_dump, since it is not the charter
> of pg_dump to recreate a whole cluster, and the objection about possibly
> restoring into a database that was meant to be protected by this setting
> seems to have some force.

For example, I have seen dumps accidentally restored to the
postgres database on multiple occasions.  You might, for example,
flag the postgres database with this, and thereby block such
accidents.  The patch as it stands would allow pg_dumpall to
replicate such a cluster, flag and all.  Without the patch you get
many errors.

It is also much easier to work around with pg_dump output.  You
could get a psql connection to a database, set this off for the
connection, and use \i to read the pg_dump output file.  Or you
could concatenate a SET statement in front of the pg_dump output
when piping it in.  There is no correspondingly easy solution for
pg_dumpall.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MultiXact pessimization in 9.3

2013-11-27 Thread Tom Lane
Andres Freund  writes:
> On 2013-11-27 19:24:35 -0300, Alvaro Herrera wrote:
>> The other idea is to just not backpatch this.

> I think backpatching is a good idea, I have seen GetMultiXactIdMembers()
> + slru code take up 80% cpu in strange workloads. But it possibly might
> be a good idea to wait till after the next point release to give people
> at least a minimal chance of catching problems.

Agreed on both counts.  We're close enough now to Monday's wrap that we
should avoid anything that risks destabilizing 9.3.x, unless it's to fix
a serious bug.  AIUI this is just a performance issue, so let's wait till
after 9.3.2 is done to push in the fix.  But since it is a performance
regression from pre-9.3, never back-patching the fix at all isn't very
attractive.

regards, tom lane




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Andres Freund
On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote:
> Andres Freund  wrote:
> 
> > HeapTupleHeaderGetUpdateXid() ignores aborted updaters
> > and returns InvalidTransactionId in that case, but
> > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...
> 
> That sure *sounds* like it should cause a problem for this code in
> CheckForSerializableConflictOut():

Yea. IMNSHO the current state is a API design flaw. We really should be
returning the aborted xid since that's what happens for non-multixact
ids.

>     htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
>     switch (htsvResult)
>     {
>     [ ... ]
>     case HEAPTUPLE_DELETE_IN_PROGRESS:
>     xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
>     break;
>     [ ... ]
>     }
>     Assert(TransactionIdIsValid(xid));
> 
> ... however, I have not been able to trigger that Assert even with
> gdb breakpoints at what I think are the right spots.  Any
> suggestions?  How far back is it true that the above
> HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
> but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
> tuple structure can return InvalidTransactionId?

What do you mean with "how far back"?

> Is there some
> other condition (besides a ROLLBACK of an UPDATE on the tuple being
> read) which needs to be met?  Is any particular timing necessary?

Afaics you need a multixact consisting out of a) the updater and b) a
lock. That's probably easiest to get if you update a row in one session
without changing the primary key, and then key-share lock it in
another. Or the other way round.
Then abort the updater.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Noah Misch
On Wed, Nov 27, 2013 at 10:43:05PM +0100, Andres Freund wrote:
> On 2013-11-27 14:53:27 -0500, Noah Misch wrote:
> > How would you characterize the chances of this happening with default
> > *vacuum_freeze_*_age settings?  Offhand, it seems you would need to 
> > encounter
> > this bug during each of ~10 generations of autovacuum_freeze_max_age before
> > the old rows actually become invisible.
> 
> I think realistically, to actually trigger the bug, it needs to happen
> quite a bit more often. But in some workloads it's pretty darn easy to
> hit. E.g. if significant parts of the table are regularly deleted, lots,
> if not most, of your vacuums will spuriously increase relfrozenxid above
> the actual value. Each time only by a small amount, but due to that
> small increase there never will be an actual full table vacuum since
> freeze_table_age will never even remotely be reached.

That makes sense.

> > Maximizing detection is valuable, and the prognosis for automated repair is
> > poor.  I would want a way to extract tuples having xmin outside the range of
> > CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible
> > page.
> 
> I think the likelihood of the problem affecting !all-visible pages is
> close to zero. Each vacuum will try to clean those, so they surely will
> get vacuumed at some point. I think the only way that could happen is if
> the ConditionalLockBufferForCleanup() fails in each vacuum. And that
> seems a bit unlikely.

The page could have sat all-visible (through multiple XID epochs, let's say)
until a recent UPDATE.

> > At first, I supposed we could offer a tool to blindly freeze such tuples.
> > However, there's no guarantee that they are in harmony with recent changes 
> > to
> > the database; transactions that wrongly considered those tuples invisible 
> > may
> > have made decisions incompatible with their existence.  For example, 
> > reviving
> > such a tuple could violate a UNIQUE constraint if the user had already
> > replaced the missing row manually.
> 
> Good point, although since they are all on all-visible pages sequential
> scans will currently already find those. It's primarily index scans that
> won't. So it's not really reviving them...

True.  Since a dump/reload of the database would already get the duplicate key
violation, the revival is not making anything clearly worse.  And if we hope
for manual repair, many DBAs just won't do that at all.

> The primary reason why I think it might be a good idea to "revive"
> automatically is, that an eventual full-table/freeze vacuum will
> currently delete them which seems bad.

Will it?  When the page became all-visible, the tuples were all hinted.  They
will never be considered dead.  Every 2B transactions, they will alternate
between live and not-yet-committed.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Tom Lane
Peter Eisentraut  writes:
> On 11/27/13, 3:47 PM, Tom Lane wrote:
>> Given these considerations, I think it'd be better to allow explicit
>> application control over whether read-ahead happens for a particular
>> query.  And I have no problem whatsoever with requiring that the cursor
>> be explicitly marked SCROLL or NO SCROLL before read-ahead will occur.

> Well, technically, unspecified means NO SCROLL according to the SQL
> standard.  A lot of applications in ECPG are ported from other systems,
> which might make that assumption.  It wouldn't be very nice to have to
> change all that.

Hm.  So you're suggesting that ECPG fix this problem by inserting an
explicit NO SCROLL clause into translated DECLARE CURSOR commands, if
there's not a SCROLL clause?

That would solve the problem of the ECPG library not being sure which
behavior applies, but it might break existing apps that were unknowingly
relying on a simple cursor being scrollable.  OTOH any such app would be
subject to breakage anyway as a result of planner changes, so it's hard to
complain against this, as long as it's happening in a major version
update.

I'm for it.

regards, tom lane




Re: [HACKERS] MultiXact bugs

2013-11-27 Thread Kevin Grittner
Andres Freund  wrote:

> HeapTupleHeaderGetUpdateXid() ignores aborted updaters
> and returns InvalidTransactionId in that case, but
> HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS...

That sure *sounds* like it should cause a problem for this code in
CheckForSerializableConflictOut():

    htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
    switch (htsvResult)
    {
    [ ... ]
    case HEAPTUPLE_DELETE_IN_PROGRESS:
    xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
    break;
    [ ... ]
    }
    Assert(TransactionIdIsValid(xid));

... however, I have not been able to trigger that Assert even with
gdb breakpoints at what I think are the right spots.  Any
suggestions?  How far back is it true that the above
HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS
but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same
tuple structure can return InvalidTransactionId?  Is there some
other condition (besides a ROLLBACK of an UPDATE on the tuple being
read) which needs to be met?  Is any particular timing necessary?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-27 Thread Andres Freund
On 2013-11-27 13:57:52 -0300, Alvaro Herrera wrote:
> Per bug report by J Smith in
> cadfupgc5bmtv-yg9znxv-vcfkb+jprqs7m2oesqxam_4z1j...@mail.gmail.com
> diagnosed by Andres Freund.

Alvaro, do you see a way this could actually have caused J.'s problems?
I thought about a few, but each turned about to not really be
possible...
I can easily see why we would uselessly wait or worse, but the
pg_subtrans errors I can't really explain with it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MultiXact pessimization in 9.3

2013-11-27 Thread Andres Freund
On 2013-11-27 19:24:35 -0300, Alvaro Herrera wrote:
> One other thought is that MultiXactIdIsRunning and GetMultiXactIdMembers
> are public functions, so this patch would represent an API change in
> 9.3.  I doubt any external modules would be relying on these functions,
> but there's so much care and thought put into avoiding API changes on
> released versions that I'm nervous about doing it here.  So I think we'd
> need to provide a compatibility shim to avoid that.

-0.5 for providing compatibility shims for this. There really doesn't
seem to be legitimate use for that api outside the guts of heapam.c
besides crude debugging hacks. So it seems like wasted effort.

> (I generally dislike keeping compatibility stuff forever, so I would
> provide these backward-compatible functions in 9.3 only.  Anyone using them
> would have to fix the code and recompile for 9.4+.  This means a #ifdef
> in code meant to work on top of both 9.3 and 9.4.  Anyone opines
> otherwise?)

If at all, we definitely should only do it for 9.3.

> The other idea is to just not backpatch this.

I think backpatching is a good idea, I have seen GetMultiXactIdMembers()
+ slru code take up 80% cpu in strange workloads. But it possibly might
be a good idea to wait till after the next point release to give people
at least a minimal chance of catching problems.

> Other than that, this patch implements the optimization suggested here.

Great!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Andres Freund
On 2013-11-27 17:25:44 -0500, Tom Lane wrote:
> Or we
> could add a restriction to EligibleForRelationFastPath that restricts
> the fast-path mechanism to non-session locks, in which case we'd not
> need to make the zeroing contingent on allLocks either.  I don't think
> we take any fast-path-eligible locks in session mode anyway, so this
> wouldn't be giving up any functionality on that end.

That seems like the best thing to do to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] overflow checks optimized away

2013-11-27 Thread Bruce Momjian
On Mon, Sep  9, 2013 at 12:21:56PM -0400, Robert Haas wrote:
> On Sat, Sep 7, 2013 at 6:55 PM, Greg Stark  wrote:
> >> Should these patches be applied?
> >
> > I have a copy of the program and was going to take care of this.
> 
> When?

2.5 months later, status report?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 04:44:02PM -0500, Bruce Momjian wrote:
> I could live with this:
> 
>  Issuing ROLLBACK outside of a transaction
>  block has no effect except emitting a warning.

Proposed doc patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/ref/abort.sgml b/doc/src/sgml/ref/abort.sgml
new file mode 100644
index f3a2fa8..ce70e7f
*** a/doc/src/sgml/ref/abort.sgml
--- b/doc/src/sgml/ref/abort.sgml
*** ABORT [ WORK | TRANSACTION ]
*** 63,69 

  

!Issuing ABORT outside of a transaction block has no effect.

   
  
--- 63,70 

  

!Issuing ABORT outside of a transaction block
!has no effect except emitting a warning.

   
  
diff --git a/doc/src/sgml/ref/rollback.sgml b/doc/src/sgml/ref/rollback.sgml
new file mode 100644
index 4f79621..3465d51
*** a/doc/src/sgml/ref/rollback.sgml
--- b/doc/src/sgml/ref/rollback.sgml
*** ROLLBACK [ WORK | TRANSACTION ]
*** 60,66 
  

 Issuing ROLLBACK outside of a transaction
!block has no effect.

   
  
--- 60,66 
  

 Issuing ROLLBACK outside of a transaction
!block has no effect except emitting a warning.

   
  
diff --git a/doc/src/sgml/ref/set.sgml b/doc/src/sgml/ref/set.sgml
new file mode 100644
index 5a84f69..6ef06e6
*** a/doc/src/sgml/ref/set.sgml
--- b/doc/src/sgml/ref/set.sgml
*** SET [ SESSION | LOCAL ] TIME ZONE { 
Specifies that the command takes effect for only the current
transaction.  After COMMIT or ROLLBACK,
!   the session-level setting takes effect again.  This has no effect
!   outside of a transaction block.
   
  
 
--- 110,117 
   
Specifies that the command takes effect for only the current
transaction.  After COMMIT or ROLLBACK,
!   the session-level setting takes effect again.  Issuing this
!   outside of a transaction block has no effect except emitting a warning.
   
  
 
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
new file mode 100644
index a33190c..541a50b
*** a/doc/src/sgml/ref/set_constraints.sgml
--- b/doc/src/sgml/ref/set_constraints.sgml
*** SET CONSTRAINTS { ALL | 
 This command only alters the behavior of constraints within the
!current transaction.  This has no effect outside of a transaction block.

   
  
--- 99,106 
  

 This command only alters the behavior of constraints within the
!current transaction.  Issuing this outside of a transaction block
!has no effect except emitting a warning.

   
  
diff --git a/doc/src/sgml/ref/set_transaction.sgml b/doc/src/sgml/ref/set_transaction.sgml
new file mode 100644
index e90ff4a..1597657
*** a/doc/src/sgml/ref/set_transaction.sgml
--- b/doc/src/sgml/ref/set_transaction.sgml
*** SET SESSION CHARACTERISTICS AS TRANSACTI
*** 185,191 

 If SET TRANSACTION is executed without a prior
 START TRANSACTION or BEGIN,
!it will have no effect.

  

--- 185,191 

 If SET TRANSACTION is executed without a prior
 START TRANSACTION or BEGIN,
!it has no effect except emitting a warning.

  




[HACKERS] Another bug introduced by fastpath patch

2013-11-27 Thread Tom Lane
In LockReleaseAll, we have this coding:

for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
{
LWLockId    partitionLock = FirstLockMgrLock + partition;
SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
 offsetof(PROCLOCK, procLink));

if (!proclock)
continue;    /* needn't examine this partition */

LWLockAcquire(partitionLock, LW_EXCLUSIVE);

... process the proclock list ...

LWLockRelease(partitionLock);
}    /* loop over partitions */


That is, we check the proclock list head before acquiring the matching
partitionLock, and assume we can skip scanning this partition if the list
is empty.  Once upon a time that was safe, but since the fast-path patch
it's not very safe, because some other backend could be in process of
promoting one of our fast-path locks (and hence changing the proclock
list).  If we could be sure that we had removed *all* our fastpath locks
in the earlier loop then it'd be okay, but because LockReleaseAll's name
is a bit of a misnomer, I don't think we can assume that.

The simplest fix is just to move the fetch of the list header down
past the LWLockAcquire.  This is pretty annoying, though, because in
typical small transactions that will require a lot of additional
partition lock acquire/releases.

I think it might be safe to check the list header, skip the partition
if it's null, but if it isn't then acquire the lock *and re-fetch the
list header*.  The idea here is that if there is someone else busy
promoting one of our fastpath locks, it must be a lock we'd decided not
to delete in the previous loop, and so we would again decide not to
delete it in the main loop.  Therefore, it doesn't matter if we decide
to skip a partition microseconds before somebody changes the list header
from null to not-null.  However, once we've decided to process a
partition, we'd better get an up-to-date view of the list state.

This does do some damage to the original concept that allLocks mode
would guarantee to clean up all locks of the target lockmethod even
if the locallock table was damaged.  I wonder whether we shouldn't
get rid of the existing logic about fastpath in the loop over locallocks
entirely, and replace it with code that just zeroes the fastpath array
when lockmethodid == DEFAULT_LOCKMETHOD and allLocks.  That would likely
be faster than what you've got here as well as more robust.  Or we
could add a restriction to EligibleForRelationFastPath that restricts
the fast-path mechanism to non-session locks, in which case we'd not
need to make the zeroing contingent on allLocks either.  I don't think
we take any fast-path-eligible locks in session mode anyway, so this
wouldn't be giving up any functionality on that end.

Comments?

regards, tom lane




Re: [HACKERS] MultiXact pessimization in 9.3

2013-11-27 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Correct.  The only difficulty here is that we would need to pass down
> the fact that we know for certain that this is only a locking Multixact.
> There are some callers that go to it indirectly via MultiXactIdWait or
> MultiXactIdExpand, but now that I look I think it's fine for those to
> pass false (i.e. assume there might be an update and disable the
> optimization), since those aren't hot compared to the other cases.
> 
> This patch implements this idea, but I haven't tested it much beyond
> compiling and ensuring it passes the existing tests.

.. and it turns out it doesn't work.  To be really effective, we need
MultiXactIdIsRunning to be passed down the flag too, so it can pass it
to GetMultiXactIdMembers.

One other thought is that MultiXactIdIsRunning and GetMultiXactIdMembers
are public functions, so this patch would represent an API change in
9.3.  I doubt any external modules would be relying on these functions,
but there's so much care and thought put into avoiding API changes on
released versions that I'm nervous about doing it here.  So I think we'd
need to provide a compatibility shim to avoid that.

(I generally dislike keeping compatibility stuff forever, so I would
provide these backward-compatible functions in 9.3 only.  Anyone using them
would have to fix the code and recompile for 9.4+.  This means a #ifdef
in code meant to work on top of both 9.3 and 9.4.  Anyone opines
otherwise?)

The other idea is to just not backpatch this.

Other than that, this patch implements the optimization suggested here.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/contrib/pgrowlocks/pgrowlocks.c
--- b/contrib/pgrowlocks/pgrowlocks.c
***
*** 168,174  pgrowlocks(PG_FUNCTION_ARGS)
  
  allow_old = !(infomask & HEAP_LOCK_MASK) &&
  	(infomask & HEAP_XMAX_LOCK_ONLY);
! nmembers = GetMultiXactIdMembers(xmax, &members, allow_old);
  if (nmembers == -1)
  {
  	values[Atnum_xids] = "{0}";
--- 168,175 
  
  allow_old = !(infomask & HEAP_LOCK_MASK) &&
  	(infomask & HEAP_XMAX_LOCK_ONLY);
! nmembers = GetMultiXactIdMembers(xmax, &members, allow_old,
!  false);
  if (nmembers == -1)
  {
  	values[Atnum_xids] = "{0}";
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 3987,3993  l3:
  			 * the case, HeapTupleSatisfiesUpdate would have returned
  			 * MayBeUpdated and we wouldn't be here.
  			 */
! 			nmembers = GetMultiXactIdMembers(xwait, &members, false);
  
  			for (i = 0; i < nmembers; i++)
  			{
--- 3987,3995 
  			 * the case, HeapTupleSatisfiesUpdate would have returned
  			 * MayBeUpdated and we wouldn't be here.
  			 */
! 			nmembers =
! GetMultiXactIdMembers(xwait, &members, false,
! 	  HEAP_XMAX_IS_LOCKED_ONLY(infomask));
  
  			for (i = 0; i < nmembers; i++)
  			{
***
*** 4008,4014  l3:
  }
  			}
  
! 			pfree(members);
  		}
  
  		/*
--- 4010,4017 
  }
  			}
  
! 			if (members)
! pfree(members);
  		}
  
  		/*
***
*** 4157,4163  l3:
   * been the case, HeapTupleSatisfiesUpdate would have returned
   * MayBeUpdated and we wouldn't be here.
   */
! nmembers = GetMultiXactIdMembers(xwait, &members, false);
  
  if (nmembers <= 0)
  {
--- 4160,4168 
   * been the case, HeapTupleSatisfiesUpdate would have returned
   * MayBeUpdated and we wouldn't be here.
   */
! nmembers =
! 	GetMultiXactIdMembers(xwait, &members, false,
! 		  HEAP_XMAX_IS_LOCKED_ONLY(infomask));
  
  if (nmembers <= 0)
  {
***
*** 4626,4632  l5:
  		 * MultiXactIdExpand if we weren't to do this, so this check is not
  		 * incurring extra work anyhow.
  		 */
! 		if (!MultiXactIdIsRunning(xmax))
  		{
  			if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
  TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
--- 4631,4637 
  		 * MultiXactIdExpand if we weren't to do this, so this check is not
  		 * incurring extra work anyhow.
  		 */
! 		if (!MultiXactIdIsRunning(xmax, HEAP_XMAX_IS_LOCKED_ONLY(old_infomask)))
  		{
  			if (HEAP_XMAX_IS_LOCKED_ONLY(old_infomask) ||
  TransactionIdDidAbort(MultiXactIdGetUpdateXid(xmax,
***
*** 5217,5223  GetMultiXactIdHintBits(MultiXactId multi, uint16 *new_infomask,
  	 * We only use this in multis we just created, so they cannot be values
  	 * pre-pg_upgrade.
  	 */
! 	nmembers = GetMultiXactIdMembers(multi, &members, false);
  
  	for (i = 0; i < nmembers; i++)
  	{
--- 5222,5228 
  	 * We only use this in multis we just created, so they cannot be values
  	 * pre-pg_upgrade.
  	 */
! 	nmembers = GetMultiXactIdMembers(multi, &members, false, false);
  
  	for (i = 0; i < nmembers; i++)
  	{
***
*** 5293,5299  MultiXactIdGetUpdateXid(TransactionId xmax, uint16 t_infomask)
 

Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 03:59:31PM -0500, Robert Haas wrote:
> On Tue, Nov 26, 2013 at 5:02 PM, Bruce Momjian  wrote:
> > On Tue, Nov 26, 2013 at 01:58:04PM -0300, Alvaro Herrera wrote:
> >> Bruce Momjian escribió:
> >> > On Tue, Nov 26, 2013 at 11:22:39AM -0500, Tom Lane wrote:
> >>
> >> > > > Uh, I ended up mentioning "no effect" to highlight it does nothing,
> >> > > > rather than mention a warning.  Would people prefer I say "warning"? 
> >> > > >  Or
> >> > > > should I say "issues a warning because it has no effect" or 
> >> > > > something?
> >> > > > It is easy to change.
> >> > >
> >> > > I'd revert the change Robert highlights above.  ISTM you've changed the
> >> > > code to match the documentation; why would you then change the docs?
> >> >
> >> > Well, I did it to make it consistent.  The question is what to write for
> >> > _all_ of the new warnings, including SET.  Do we say "warning", do we
> >> > > > say "it has no effect", or do we say both?  The ABORT is just one case
> >> > of that.
> >>
> >> Maybe "it emits a warning and otherwise has no effect"?  Emitting a
> >> warning is certainly not doing nothing; as has been pointed out in the
> >> SSL renegotiation thread, it might cause the log to fill disk.
> >
> > OK, doc patch attached.
> 
> Seems broadly reasonable, but I'd use "no other effect" throughout.

That sounds awkward, e.g.:

 Issuing ROLLBACK outside of a transaction
 block emits a warning but has no other effect.

I could live with this:

 Issuing ROLLBACK outside of a transaction
 block has no effect except emitting a warning.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Andres Freund
On 2013-11-27 15:29:24 -0500, Noah Misch wrote:
> > If you are confident that neither of these is a real risk, I'll
> > relax about this.
> 
> If there is a real risk, I'm not seeing it.

Me neither.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 14:53:27 -0500, Noah Misch wrote:
> How would you characterize the chances of this happening with default
> *vacuum_freeze_*_age settings?  Offhand, it seems you would need to encounter
> this bug during each of ~10 generations of autovacuum_freeze_max_age before
> the old rows actually become invisible.

I think realistically, to actually trigger the bug, it needs to happen
quite a bit more often. But in some workloads it's pretty darn easy to
hit. E.g. if significant parts of the table are regularly deleted, lots,
if not most, of your vacuums will spuriously increase relfrozenxid above
the actual value. Each time only by a small amount, but due to that
small increase there never will be an actual full table vacuum since
freeze_table_age will never even remotely be reached.
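The failure mode Andres describes can be sketched as a toy simulation — Python rather than the backend's C, with invented constants and deliberately simplified bookkeeping, not PostgreSQL's real freeze logic:

```python
# Toy model: each buggy partial vacuum nudges relfrozenxid upward without
# actually freezing tuples, so the table's apparent xid age never reaches
# the threshold that would force a full-table (freeze) vacuum.
FREEZE_TABLE_AGE = 150  # illustrative stand-in for vacuum_freeze_table_age

def run_vacuums(n_vacuums, xids_per_cycle, spurious_advance):
    current_xid = 0
    relfrozenxid = 0      # what the catalogs *claim* has been frozen
    true_frozen_xid = 0   # what has *actually* been frozen
    full_scans = 0
    for _ in range(n_vacuums):
        current_xid += xids_per_cycle
        if current_xid - relfrozenxid >= FREEZE_TABLE_AGE:
            full_scans += 1                   # full-table vacuum: really freezes
            relfrozenxid = true_frozen_xid = current_xid
        else:
            relfrozenxid += spurious_advance  # the bug: advance without freezing
    return full_scans, current_xid - true_frozen_xid

full_scans, true_age = run_vacuums(100, 10, 10)
# full_scans stays 0 while the true unfrozen age grows without bound
```

With the spurious advance keeping pace with xid consumption, the trigger condition is never met, matching the "never even remotely be reached" observation above.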

The client that made me look into the issue noticed problems on
pg_attribute - presumably because of temporary table usage primarily
affecting the tail end of pg_attribute.

> On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
> > With regard to fixing things up, ISTM the best bet is heap_prune_chain()
> > so far. That's executed by vacuum and by opportunistic pruning and we
> > know we have the appropriate locks there. Looks relatively easy to fix
> > up things there. Not sure if there are any possible routes to WAL log
> > this but using log_newpage()?
> > I am really not sure what the best course of action is :(
> 
> Maximizing detection is valuable, and the prognosis for automated repair is
> poor.  I would want a way to extract tuples having xmin outside the range of
> CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible
> page.

I think the likelihood of the problem affecting !all-visible pages is
close to zero. Each vacuum will try to clean those, so they surely will
get vacuumed at some point. I think the only way that could happen is if
the ConditionalLockBufferForCleanup() fails in each vacuum. And that
seems a bit unlikely.

> At first, I supposed we could offer a tool to blindly freeze such tuples.
> However, there's no guarantee that they are in harmony with recent changes to
> the database; transactions that wrongly considered those tuples invisible may
> have made decisions incompatible with their existence.  For example, reviving
> such a tuple could violate a UNIQUE constraint if the user had already
> replaced the missing row manually.

Good point, although since they are all on all-visible pages sequential
scans will currently already find those. It's primarily index scans that
won't. So it's not really reviving them...
The primary reason why I think it might be a good idea to "revive"
automatically is that an eventual full-table/freeze vacuum will
currently delete them, which seems bad.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Bruce Momjian
On Wed, Nov 27, 2013 at 06:05:13AM -0800, Kevin Grittner wrote:
> Bruce Momjian  wrote:
> > On Tue, Nov 26, 2013 at 03:25:44PM -0800, Kevin Grittner wrote:
> >> Bruce Momjian  wrote:
> >>
> >>> How are we handling breakage of pg_dump, not pg_dumpall?
> >>
> >> That was discussed.  Do you have something to add?
> >
> > I am confused what we are patching.  Are we patching pg_dump,
> > pg_dumpall, or both?
> 
> Just pg_dumpall.c.

OK, there was a pg_dump patch earlier which we are not using now.

> > Are we propagating other settings from pg_dump to pg_dumpall,
> > like statement_timeout?
> 
> pg_dumpall output sets up the global objects (including their
> properties) and then does a \connect to each database, followed by
> the same output that pg_dump would generate for that database. 
> That includes the SET statements for statement_timeout, etc.  The
> patch does nothing to change what objects or properties the
> pg_dumpall output tries to set up, it just sets a property *on the
> current connection* to allow those statements to run without error.

What is the logic that has us setting statement_timeout in pg_dump but
default_transaction_read_only in pg_dumpall?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL

2013-11-27 Thread Bruce Momjian
On Sat, Jan 12, 2013 at 02:14:03PM -0500, Kevin Grittner wrote:
> Amit Kapila wrote:
> > On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
> 
> >> Surely VACUUM FULL should rebuild the visibility map, and make
> >> tuples in the new relation all-visible, no?
> 
> Certainly it seems odd to me that VACUUM FULL leaves the table
> in a less-well maintained state in terms of visibility than a
> "normal" vacuum. VACUUM FULL should not need to be followed by
> another VACUUM.
> 
> > I think it cannot made all visible.
> 
> I don't think all tuples in the relation are necessarily visible to
> all transactions, but the ones which are should probably be flagged
> that way.

I have developed the attached proof-of-concept patch to fix the problem
of having no visibility map after CLUSTER or VACUUM FULL.  I tested with
these queries:

CREATE TABLE test(x INT PRIMARY KEY);
INSERT INTO test VALUES (1);
VACUUM FULL test; -- or CLUSTER
SELECT relfilenode FROM pg_class WHERE relname = 'test';
 relfilenode
-
   16399

Then 'ls -l data/base/16384/16399*' to see the *_vm file.  I am not sure
how to test that the vm contents are valid.

This patch is fairly tricky because our CLUSTER/VACUUM FULL behavior
does not do writes through the shared buffer cache, as outlined in this
C comment block:

 * We can't use the normal heap_insert function to insert into the new
 * heap, because heap_insert overwrites the visibility information.
 * We use a special-purpose raw_heap_insert function instead, which
 * is optimized for bulk inserting a lot of tuples, knowing that we have
 * exclusive access to the heap.  raw_heap_insert builds new pages in
 * local storage.  When a page is full, or at the end of the process,
 * we insert it to WAL as a single record and then write it to disk
 * directly through smgr.  Note, however, that any data sent to the new
 * heap's TOAST table will go through the normal bufmgr.

I originally tried to do this higher up in the stack but ran into
problems because I couldn't access the new heap page so I had to do it
at the non-shared-buffer page level.  I reused the lazy vacuum routines.

I need to know this is the right approach, and need to know what things
are wrong or missing.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
new file mode 100644
index 951894c..c01a6a8
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
***
*** 107,112 
--- 107,114 
  #include "access/rewriteheap.h"
  #include "access/transam.h"
  #include "access/tuptoaster.h"
+ #include "access/visibilitymap.h"
+ #include "commands/vacuum.h"
  #include "storage/bufmgr.h"
  #include "storage/smgr.h"
  #include "utils/memutils.h"
*** typedef OldToNewMappingData *OldToNewMap
*** 172,177 
--- 174,180 
  
  /* prototypes for internal functions */
  static void raw_heap_insert(RewriteState state, HeapTuple tup);
+ static void update_page_vm(Relation relation, Page page, BlockNumber blkno);
  
  
  /*
*** end_heap_rewrite(RewriteState state)
*** 281,286 
--- 284,290 
  		RelationOpenSmgr(state->rs_new_rel);
  
  		PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
+ 		update_page_vm(state->rs_new_rel, state->rs_buffer, state->rs_blockno);
  
  		smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno,
     (char *) state->rs_buffer, true);
*** raw_heap_insert(RewriteState state, Heap
*** 633,638 
--- 637,643 
  			RelationOpenSmgr(state->rs_new_rel);
  
  			PageSetChecksumInplace(page, state->rs_blockno);
+ 			update_page_vm(state->rs_new_rel, page, state->rs_blockno);
  
  			smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
  	   state->rs_blockno, (char *) page, true);
*** raw_heap_insert(RewriteState state, Heap
*** 677,679 
--- 682,704 
  	if (heaptup != tup)
  		heap_freetuple(heaptup);
  }
+ 
+ static void
+ update_page_vm(Relation relation, Page page, BlockNumber blkno)
+ {
+ 	Buffer		vmbuffer = InvalidBuffer;
+ 	TransactionId visibility_cutoff_xid;
+ 
+ 	visibilitymap_pin(relation, blkno, &vmbuffer);
+ 	Assert(BufferIsValid(vmbuffer));
+ 
+ 	if (!visibilitymap_test(relation, blkno, &vmbuffer) &&
+ 		heap_page_is_all_visible(relation, InvalidBuffer, page,
+  &visibility_cutoff_xid))
+ 	{
+ 		PageSetAllVisible(page);
+ 		visibilitymap_set(relation, blkno, InvalidBlockNumber,
+ 		  InvalidXLogRecPtr, vmbuffer, visibility_cutoff_xid);
+ 	}
+ 	ReleaseBuffer(vmbuffer);
+ }
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
new file mode 100644
index 7f40d89..a42511b
*** a/src/backend/access/heap/visibilitymap.c
--- b/src/backend/access/heap/visibili

Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Peter Eisentraut
On 11/27/13, 3:47 PM, Tom Lane wrote:
> Given these considerations, I think it'd be better to allow explicit
> application control over whether read-ahead happens for a particular
> query.  And I have no problem whatsoever with requiring that the cursor
> be explicitly marked SCROLL or NO SCROLL before read-ahead will occur.

Well, technically, unspecified means NO SCROLL according to the SQL
standard.  A lot of applications in ECPG are ported from other systems,
which might make that assumption.  It wouldn't be very nice to have to
change all that.





Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Peter Eisentraut
On 11/27/13, 2:49 PM, Alvaro Herrera wrote:
> Would it work to have a function of some sort to which you give a cursor
> name and it returns whether it is scrollable or not?

That might make sense.  I think this case is similar to the question
whether a view is updatable.  You wouldn't put that information in the
CREATE VIEW command tag.





Re: [HACKERS] bytea_ouput = escape vs encode(byte, 'escape')

2013-11-27 Thread David Johnston
Jim Nasby-2 wrote
> I'm wondering why bytea_output = escape produces different output than
> encode(byte, 'escape') does. Is this intentional? If so, why?
> 
> cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
>   cr  |  lf  
> --+--
>  \x0d | \x0a
> (1 row)
> 
> cnuapp_prod@postgres=# set bytea_output = escape;
> SET
> cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
>   cr  |  lf  
> --+--
>  \015 | \012
> (1 row)
> 
> cnuapp_prod@postgres=# select encode(e'\r'::bytea,'escape') AS cr,
> encode(e'\n'::bytea, 'escape') AS lf;
>  cr | lf 
> +
>  \r |   +
> | 
> (1 row)
> 
> cnuapp_prod@postgres=# 

encode takes a bytea and provides what it would be as text (using the
specified encoding to perform the conversion).

The "bytea" output examples are simply the contents of the
byte array, without any supposition as to what those bytes represent.  It is
strictly a serialization format and not an encoding/decoding of the
contents.

In this example the two "functions" are acting as paired input/output.

I'm thinking the direction you are assuming from the word "encode" is
confusing you - as it did me at first.
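The asymmetry can be sketched outside the server. This Python sketch (function names invented for illustration) follows the documented rules: the bytea output serialization octal-escapes every non-printable byte, while encode(..., 'escape') produces text and escapes only NUL bytes, high-bit-set bytes, and backslashes, letting CR and LF through literally:

```python
def bytea_output_escape(b: bytes) -> str:
    # serialization: every non-printable byte becomes a \ooo octal escape
    out = []
    for byte in b:
        if byte == 0x5c:              # backslash is doubled
            out.append('\\\\')
        elif 0x20 <= byte <= 0x7e:    # printable ASCII passes through
            out.append(chr(byte))
        else:
            out.append('\\%03o' % byte)
    return ''.join(out)

def encode_escape(b: bytes) -> str:
    # text conversion: only NUL, high-bit bytes, and backslash are escaped;
    # control characters such as CR and LF pass through literally
    out = []
    for byte in b:
        if byte == 0x5c:
            out.append('\\\\')
        elif byte == 0 or byte >= 0x80:
            out.append('\\%03o' % byte)
        else:
            out.append(chr(byte))
    return ''.join(out)
```

This reproduces the thread's observation: the first function renders b'\r' as \015, while the second hands back a literal carriage return.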

David J.








Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread David Fetter
On Thu, Nov 28, 2013 at 01:29:46AM +0530, Atri Sharma wrote:
> On Thu, Nov 28, 2013 at 12:54 AM, David Fetter  wrote:
> > On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
> >> Shigeru Hanada escribió:
> >>
> >> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> >> > about interaction between client.  Besides it, I think it would be
> >> > nice to allow arbitrary FDW as backend of dblink interface like this:
> >> >
> >> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> >> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> >> > syntax') as t(/* record type definition */...);
> >> >
> >> > This provides a way to execute query without defining foreign table.
> >>
> >> Seems to me that if you want to read remote tables without creating a
> >> foreign table, you could define them locally using something like the
> >> WITH syntax and then use them normally in the rest of the query.
> >
> > WITH, or SRF, or whatever, the point is that we need to be able to
> > specify what we're sending--probably single opaque strings delimited
> > just as we do other strings--and what we might get back--errors only,
> > rows, [sets of] refcursors are the ones I can think of offhand.
> 
> +1
> 
> The input-output formats need to be defined clearly.
> 
> How about sending parse trees? Is it even possible?

I don't see why parse trees wouldn't be something that could
eventually be sent to other PostgreSQL servers, but I see that whole
discussion as orthogonal to this one.

My point here is that there needs to be an "escape to native" system
available in SQL so people can communicate directly with the remote
systems in the systems' own languages.  It's a little bit analogous to
making assembler available from C, or C from HLLs.

> > What we can't do is assume that our parser needs to, or even could, in
> > principle, understand these things in more detail than that.
> 
> Agreed.
> 
> I wonder if it's possible to give this task to the FDW implementing
> authority instead, and get FDW to translate to the required format.

I don't know that the FDW would necessarily need to get involved
except in the sense of "full recognition before processing."
http://langsec.org/occupy/

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-27 Thread Robert Haas
On Tue, Nov 26, 2013 at 5:02 PM, Bruce Momjian  wrote:
> On Tue, Nov 26, 2013 at 01:58:04PM -0300, Alvaro Herrera wrote:
>> Bruce Momjian escribió:
>> > On Tue, Nov 26, 2013 at 11:22:39AM -0500, Tom Lane wrote:
>>
>> > > > Uh, I ended up mentioning "no effect" to highlight it does nothing,
>> > > > rather than mention a warning.  Would people prefer I say "warning"?  
>> > > > Or
>> > > > should I say "issues a warning because it has no effect" or something?
>> > > > It is easy to change.
>> > >
>> > > I'd revert the change Robert highlights above.  ISTM you've changed the
>> > > code to match the documentation; why would you then change the docs?
>> >
>> > Well, I did it to make it consistent.  The question is what to write for
>> > _all_ of the new warnings, including SET.  Do we say "warning", do we
>> > say "it has no effect", or do we say both?  The ABORT is just one case
>> > of that.
>>
>> Maybe "it emits a warning and otherwise has no effect"?  Emitting a
>> warning is certainly not doing nothing; as has been pointed out in the
>> SSL renegotiation thread, it might cause the log to fill disk.
>
> OK, doc patch attached.

Seems broadly reasonable, but I'd use "no other effect" throughout.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] bytea_ouput = escape vs encode(byte, 'escape')

2013-11-27 Thread Jim Nasby
I'm wondering why bytea_output = escape produces different output than 
encode(byte, 'escape') does. Is this intentional? If so, why?

cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
  cr  |  lf  
--+--
 \x0d | \x0a
(1 row)

cnuapp_prod@postgres=# set bytea_output = escape;
SET
cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
  cr  |  lf  
--+--
 \015 | \012
(1 row)

cnuapp_prod@postgres=# select encode(e'\r'::bytea,'escape') AS cr, 
encode(e'\n'::bytea, 'escape') AS lf;
 cr | lf 
+
 \r |   +
| 
(1 row)

cnuapp_prod@postgres=# 
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Tom Lane
Boszormenyi Zoltan  writes:
> If you consider all these:

> - certain combinations of query and DECLARE stmt flags fail;
> - adding NO SCROLL is breaking backward compatibility;
> - the readahead code has to really know whether the cursor is
>scrollable so it can behave just like the server;

If you're claiming that readahead inside ECPG will behave absolutely
transparently in all cases, I think that's bogus anyway.  Consider for
example a query that will get a divide-by-zero error when it computes
the 110th row --- but the application stops after fetching 100 rows.
Everything's fine, until you insert some readahead logic.  Queries
containing volatile functions might also not be happy about readahead.

Given these considerations, I think it'd be better to allow explicit
application control over whether read-ahead happens for a particular
query.  And I have no problem whatsoever with requiring that the cursor
be explicitly marked SCROLL or NO SCROLL before read-ahead will occur.
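Tom's scenario is easy to model in miniature. A Python sketch (illustrative, not the ECPG code): the cursor's 110th row raises an error when computed, so fetching 100 rows one at a time succeeds, while a 256-row readahead buffer surfaces the error to an application that never asked for row 110:

```python
def rows():
    # server-side cursor whose 110th row raises a division-by-zero error
    for i in range(1, 1000):
        yield 100 // (110 - i)

def fetch(cursor, n, readahead=1):
    # pull `readahead` rows at a time from the cursor, return the first n
    buf = []
    while len(buf) < n:
        for _ in range(readahead):
            buf.append(next(cursor))
    return buf[:n]

plain = fetch(rows(), 100)                 # no readahead: row 110 never computed

try:
    fetch(rows(), 100, readahead=256)      # readahead buffers past row 110
    readahead_failed = False
except ZeroDivisionError:
    readahead_failed = True                # error surfaces despite n == 100
```

The same asymmetry applies to volatile functions: readahead changes how many rows are evaluated, which is observable behavior.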

regards, tom lane




Re: [HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread Kevin Grittner
AK  wrote:

> the documentation is not incorrect, it is just incomplete.

Yeah, that's exactly how I saw it.  :-)

Docs changed on master only.  Thanks for the report!

--

Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Noah Misch
On Wed, Nov 27, 2013 at 11:38:23AM -0800, Kevin Grittner wrote:
> Noah Misch  wrote:
> > The threat is that rounding the read size up to the next MAXALIGN
> > would cross into an unreadable memory page, resulting in a
> > SIGSEGV.  Every palloc chunk has MAXALIGN'd size under the hood,
> > so the excess read of "toDelete" cannot cause a SIGSEGV.  For a
> > stack variable, it depends on the ABI.  I'm not aware of an ABI
> > where the four bytes past the end of this stack variable could be
> > unreadable, which is not to claim I'm well-read on the topic.  We
> > should fix this in due course on code hygiene grounds, but I
> > would not back-patch it.
> 
> If you're sure.  I hadn't worked through the code, but had two
> concerns (neither of which was about a SIGSEGV):
> 
> (1)  That multiple MAXALIGNs of shorter values could push the
> structure into overlap with the next thing on the stack, allowing
> one or the other to get stepped on.

These are out-of-bounds reads only, not writes.  Also, the excess doesn't
accumulate that way; the code reads beyond any particular stack variable or
palloc chunk by no more than 7 bytes.
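For reference, the rounding in question can be sketched as a Python rendering of the backend's MAXALIGN macro, assuming the common 8-byte maximum alignment; the over-read past any single object is bounded by alignment minus one bytes:

```python
MAXIMUM_ALIGNOF = 8  # typical 64-bit platform

def MAXALIGN(n: int) -> int:
    # round n up to the next multiple of MAXIMUM_ALIGNOF
    return (n + MAXIMUM_ALIGNOF - 1) & ~(MAXIMUM_ALIGNOF - 1)

# the excess read past an object of size n never exceeds 7 bytes
excess = [MAXALIGN(n) - n for n in range(1, 100)]
```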

> (2)  That the CRC calculation might pick up uninitialized data
> which was not actually going to match what was used during
> recovery, leading to "end of recovery" on replay.

The CRC calculation does pick up unspecified bytes, but we copy those same
bytes into the actual WAL record.  The effect is similar to that of
unspecified pad bytes in the middle of xlrec structures.
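A minimal illustration of why that is harmless — Python with zlib's CRC standing in for the WAL CRC, and an invented buffer layout: as long as the very bytes that were checksummed (pads included) are the bytes copied into the record, replay recomputes an identical CRC no matter what the pads contained:

```python
import zlib

# two "real" fields with unspecified pad bytes between them
record_image = bytes([0x12, 0x34]) + b'\xcc\xcc' + bytes([0x56, 0x78])

crc_at_insert = zlib.crc32(record_image)   # CRC taken over pads too
copied_to_wal = bytes(record_image)        # same bytes written out, pads and all
crc_at_replay = zlib.crc32(copied_to_wal)  # recomputed at recovery: matches
```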

> If you are confident that neither of these is a real risk, I'll
> relax about this.

If there is a real risk, I'm not seeing it.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Atri Sharma
On Thu, Nov 28, 2013 at 12:54 AM, David Fetter  wrote:
> On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
>> Shigeru Hanada escribió:
>>
>> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially
>> > about interaction between client.  Besides it, I think it would be
>> > nice to allow arbitrary FDW as backend of dblink interface like this:
>> >
>> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
>> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote
>> > syntax') as t(/* record type definition */...);
>> >
>> > This provides a way to execute query without defining foreign table.
>>
>> Seems to me that if you want to read remote tables without creating a
>> foreign table, you could define them locally using something like the
>> WITH syntax and then use them normally in the rest of the query.
>
> WITH, or SRF, or whatever, the point is that we need to be able to
> specify what we're sending--probably single opaque strings delimited
> just as we do other strings--and what we might get back--errors only,
> rows, [sets of] refcursors are the ones I can think of offhand.

+1

The input-output formats need to be defined clearly.

How about sending parse trees? Is it even possible?

> What we can't do is assume that our parser needs to, or even could, in
> principle, understand these things in more detail than that.

Agreed.

I wonder if it's possible to give this task to the FDW implementing
authority instead, and get FDW to translate to the required format.




-- 
Regards,

Atri
l'apprenant




Re: [HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread AK
I concur - the documentation is not incorrect, it is just incomplete.







Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Noah Misch
How would you characterize the chances of this happening with default
*vacuum_freeze_*_age settings?  Offhand, it seems you would need to encounter
this bug during each of ~10 generations of autovacuum_freeze_max_age before
the old rows actually become invisible.

On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
> With regard to fixing things up, ISTM the best bet is heap_prune_chain()
> so far. That's executed by vacuum and by opportunistic pruning and we
> know we have the appropriate locks there. Looks relatively easy to fix
> up things there. Not sure if there are any possible routes to WAL log
> this but using log_newpage()?
> I am really not sure what the best course of action is :(

Maximizing detection is valuable, and the prognosis for automated repair is
poor.  I would want a way to extract tuples having xmin outside the range of
CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible page.  At
first, I supposed we could offer a tool to blindly freeze such tuples.
However, there's no guarantee that they are in harmony with recent changes to
the database; transactions that wrongly considered those tuples invisible may
have made decisions incompatible with their existence.  For example, reviving
such a tuple could violate a UNIQUE constraint if the user had already
replaced the missing row manually.  A module that offers "SELECT * FROM
rows_wrongly_invisible('anytable')" would aid manual cleanup efforts.
freeze_if_wrongly_invisible(tid) would be useful, too.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Alvaro Herrera
Boszormenyi Zoltan escribió:

> If you consider all these:
> 
> - certain combinations of query and DECLARE stmt flags fail;
> - adding NO SCROLL is breaking backward compatibility;
> - the readahead code has to really know whether the cursor is
>   scrollable so it can behave just like the server;
> 
> then returning the SCROLL / NO SCROLL flag in the command tag is
> not a bad solution in my view. In fact, this was the only workable
> solution I could come up with to make it work reliably when neither
> SCROLL nor NO SCROLL is specified by the application.

Would it work to have a function of some sort to which you give a cursor
name and it returns whether it is scrollable or not?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread Kevin Grittner
AK  wrote:

> The following is not precise:
>
> 13.2.1. Read Committed Isolation Level
>
> "Also note that two successive SELECT commands can see different data, even
> though they are within a single transaction, if other transactions commit
> changes during execution of the first SELECT."
>
> I think it should be re-worded as follows
>
> "Also note that two successive SELECT commands can see different data, even
> though they are within a single transaction, if other transactions commit
> after the first SELECT starts, and before the second SELECT starts."
>
> The reason: there could be other DML running between these two SELECTs.

That's a fair point.  I'll commit something to the master branch.
I don't think this rises to the level of a documentation bug that
needs to be back-patched.
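The behavior under discussion can be modeled with a toy statement-level snapshot — Python, deliberately ignoring locking and real MVCC details:

```python
# Under Read Committed, each statement sees the latest committed data,
# so two SELECTs in one transaction can differ whenever another
# transaction commits between the start of the first and the second.
committed = {'x': 1}

def select(key):
    return committed[key]    # fresh statement-level snapshot

first = select('x')          # transaction T1's first SELECT: sees 1
committed['x'] = 2           # another transaction commits in between
second = select('x')         # T1's second SELECT: sees 2
```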

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Boszormenyi Zoltan

On 2013-11-27 19:16, Tom Lane wrote:

Boszormenyi Zoltan  writes:

On 2013-11-23 22:01, Tom Lane wrote:

Boszormenyi Zoltan  writes:

Attached is the patch that modified the command tag returned by
the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR"
or "DECLARE NO SCROLL CURSOR" depending on the cursor's
scrollable flag that can be determined internally even if neither is
asked explicitly.

This does not strike me as an acceptable change.  It will break any code
that's expecting the existing command tag, for little or no benefit
to most applications.  Even if backwards compatibility were of no concern,
I'm not convinced it's a good thing to expose the backend's internal
choices about query plans used for cursors, which is what this is
basically doing.

I saw code in the backend allowing a cursor to be scrollable, although
it was not declared as such. How about ripping that out?

That also fails the unnecessary-backwards-compatibility-break test.


If you read the rest of the mail, it turns out it wasn't a serious question.

Getting the SCROLL / NO SCROLL flags from the preprocessor is no problem.
The only problem is when it's unspecified.

Treating it as NO SCROLL (or adding it to the DECLARE command behind
the application's back) would break apps that want to scroll backward.
(Not ABSOLUTE.)

On the other hand, what problems would one face when adding SCROLL
implicitly if it's unspecified? It's not a workable solution either, see below.

As the documentation suggests, for an application that wants to use
UPDATE/DELETE WHERE CURRENT OF ..., it's highly recommended
that the cursor be for a FOR UPDATE query. Watch this scenario:

zozo=> begin;
BEGIN
zozo=> declare mycur cursor for select * from t1 for update;
DECLARE CURSOR
zozo=> fetch from mycur;
 id | t
----+---
  1 | a
(1 row)

zozo=> fetch from mycur;
 id | t
----+---
  2 | b
(1 row)

zozo=> update t1 set t=t||'_x' where current of mycur;
UPDATE 1
zozo=> fetch from mycur;
 id | t
----+---
  3 | c
(1 row)

zozo=> delete from t1 where current of mycur;
DELETE 1
zozo=> move absolute 0 in mycur;
MOVE 0
zozo=> fetch from mycur;
 id | t
----+---
  1 | a
(1 row)

zozo=> fetch from mycur;
 id | t
----+---
(0 rows)

Although the server complains about MOVE BACKWARD, it does not
complain about MOVE ABSOLUTE, even though that is clearly moving
backward. The cursor position is tracked in the backend in a long
variable and it has not overflowed. This is also legacy behaviour;
changing it would break backward compatibility.

The other problem I see is with the documentation: it says that
the INSENSITIVE keyword is just a placeholder and that all cursors are
insensitive. That is clearly false: moving back to the start, previously
existing rows won't show up again. It's not strictly a sensitive cursor,
either, because the row with id=2 would show up with the new value of "t".
This neither-sensitive-nor-insensitive behaviour is what the SQL
standard calls an "asensitive" cursor. It would be worth a doc change.
This is what's written in 9.3:

"
If the cursor's query includes FOR UPDATE or FOR SHARE, then returned rows are locked at 
the time they are first fetched, in the same way as for a regular SELECT 
 command with these 
options. In addition, the returned rows will be the most up-to-date versions; therefore 
these options provide the equivalent of what the SQL standard calls a "sensitive cursor". 
(Specifying INSENSITIVE together with FOR UPDATE or FOR SHARE is an error.)

"
( http://www.postgresql.org/docs/9.3/interactive/sql-declare.html )

However, if the cursor is declared without FOR UPDATE, the
SCROLL keyword (explicit, or implicit if the query is simple),
scrolling backward and DML with WHERE CURRENT OF are all allowed.
In this case, the cursor is really insensitive: FETCH statements
after MOVE ABSOLUTE 0 return all rows with their original data.

This is just to show that adding SCROLL behind the application's
back is also pointless. If the query (which can also be a prepared
statement in ECPG) contains FOR UPDATE, adding SCROLL to the
DECLARE statement would make it fail.

If you consider all these:

- certain combinations of query and DECLARE stmt flags fail;
- adding NO SCROLL is breaking backward compatibility;
- the readahead code has to really know whether the cursor is
  scrollable so it can behave just like the server;

then returning the SCROLL / NO SCROLL flag in the command tag is
not a bad solution in my view. In fact, this was the only workable
solution I could come up with to make it work reliably when neither
SCROLL nor NO SCROLL is specified by the application.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



Re: [HACKERS] logical changeset generation v6.7

2013-11-27 Thread Fabrízio de Royes Mello
On Thu, Nov 14, 2013 at 11:46 AM, Andres Freund 
wrote:
>
> On 2013-11-12 18:50:33 +0100, Andres Freund wrote:
> > > You've actually changed the meaning of this section (and not in a
good way):
> > >
> > >  be set at server start. wal_level must be set
> > > -to archive or hot_standby to allow
> > > -connections from standby servers.
> > > +to archive, hot_standby or
logical
> > > +to allow connections from standby servers.
> > >
> > > I think that the previous text meant that you needed archive - or, if
> > > you want to allow connections, hot_standby.  The new text loses that
> > > nuance.
> >
> > Yea, that's because it was lost on me in the first place...
>
> I think that's because the nuance isn't actually in the text - note that
> it is talking about max_wal_senders and talking about connections
> *from*, not *to* standby servers.
> I've reformulated the wal_level paragraph and used "or higher" in
> several places now.
>
> Ok, so here's a rebased version of this. I tried to fix all the issues
> you mentioned, and it's based on the split off IsSystemRelation() patch,
> I've sent yesterday (included here).
>

Hello,

I'm trying to apply the patches but show some warnings/errors:

$ gunzip -c
/home/fabrizio/Downloads/0002-wal_decoding-Add-wal_level-logical-and-log-data-requ.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0005-wal_decoding-Introduce-wal-decoding-via-catalog-time.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0006-wal_decoding-Implement-VACUUM-FULL-CLUSTER-support-v.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0007-wal_decoding-Only-peg-the-xmin-horizon-for-catalog-t.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0011-wal_decoding-pg_recvlogical-Introduce-pg_receivexlog.patch.gz
| git apply -
error: patch failed: src/bin/pg_basebackup/streamutil.c:210
error: src/bin/pg_basebackup/streamutil.c: patch does not apply

The others apply correctly. The permission warning must be fixed and
0011 must be rebased.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Kevin Grittner
Noah Misch  wrote:

> (Kevin, I saw no attachment.)

Apologies.  Trying again.

> The threat is that rounding the read size up to the next MAXALIGN
> would cross into an unreadable memory page, resulting in a
> SIGSEGV.  Every palloc chunk has MAXALIGN'd size under the hood,
> so the excess read of "toDelete" cannot cause a SIGSEGV.  For a
> stack variable, it depends on the ABI.  I'm not aware of an ABI
> where the four bytes past the end of this stack variable could be
> unreadable, which is not to claim I'm well-read on the topic.  We
> should fix this in due course on code hygiene grounds, but I
> would not back-patch it.

If you're sure.  I hadn't worked through the code, but had two
concerns (neither of which was about a SIGSEGV):

(1)  That multiple MAXALIGNs of shorter values could push the
structure into overlap with the next thing on the stack, allowing
one or the other to get stepped on.

(2)  That the CRC calculation might pick up uninitialized data
which was not actually going to match what was used during
recovery, leading to "end of recovery" on replay.

If you are confident that neither of these is a real risk, I'll
relax about this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

==7232==ERROR: AddressSanitizer: stack-buffer-overflow on address 
0x7fffbcb2e834 at pc 0xef0335 bp 0x7fffbcb29b30 sp 0x7fffbcb29b28
READ of size 1 at 0x7fffbcb2e834 thread T0
#0 0xef0334 in XLogInsert 
/home/kgrittn/pg/master/src/backend/access/transam/xlog.c:1040
#1 0xd53792 in doPickSplit 
/home/kgrittn/pg/master/src/backend/access/spgist/spgdoinsert.c:1391
#2 0xd0d9f1 in spgdoinsert 
/home/kgrittn/pg/master/src/backend/access/spgist/spgdoinsert.c:2008
#3 0xcbb4d1 in spginsert 
/home/kgrittn/pg/master/src/backend/access/spgist/spginsert.c:238
#4 0x46f9de3 in FunctionCall6Coll 
/home/kgrittn/pg/master/src/backend/utils/fmgr/fmgr.c:1436
#5 0xad43fb in index_insert 
/home/kgrittn/pg/master/src/backend/access/index/indexam.c:223
#6 0x2122fcd in ExecInsertIndexTuples 
/home/kgrittn/pg/master/src/backend/executor/execUtils.c:1104
#7 0x228413f in ExecInsert 
/home/kgrittn/pg/master/src/backend/executor/nodeModifyTable.c:274
#8 0x227fba8 in ExecModifyTable 
/home/kgrittn/pg/master/src/backend/executor/nodeModifyTable.c:1014
#9 0x2026b03 in ExecProcNode 
/home/kgrittn/pg/master/src/backend/executor/execProcnode.c:377
#10 0x1fef534 in ExecutePlan 
/home/kgrittn/pg/master/src/backend/executor/execMain.c:1474
#11 0x1fee488 in standard_ExecutorRun 
/home/kgrittn/pg/master/src/backend/executor/execMain.c:308
#12 0x1fec7e9 in ExecutorRun 
/home/kgrittn/pg/master/src/backend/executor/execMain.c:256
#13 0x34f05ab in ProcessQuery 
/home/kgrittn/pg/master/src/backend/tcop/pquery.c:185
#14 0x34e8c3a in PortalRunMulti 
/home/kgrittn/pg/master/src/backend/tcop/pquery.c:1279
#15 0x34e1b9c in PortalRun 
/home/kgrittn/pg/master/src/backend/tcop/pquery.c:816
#16 0x34b6721 in exec_simple_query 
/home/kgrittn/pg/master/src/backend/tcop/postgres.c:1054
#17 0x34b1420 in PostgresMain 
/home/kgrittn/pg/master/src/backend/tcop/postgres.c:3998
#18 0x2f1f925 in BackendRun 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:4085
#19 0x2f1b830 in BackendStartup 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:3774
#20 0x2efcc96 in ServerLoop 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:1585
#21 0x2ef13ee in PostmasterMain 
/home/kgrittn/pg/master/src/backend/postmaster/postmaster.c:1240
#22 0x24dc3c3 in main /home/kgrittn/pg/master/src/backend/main/main.c:196
#23 0x2b89d526e76c in __libc_start_main 
/build/buildd/eglibc-2.15/csu/libc-start.c:226
#24 0x4dc3cc in _start ??:?

Address 0x7fffbcb2e834 is located in stack of thread T0 at offset 2644 in frame
#0 0xd2b5ef in doPickSplit 
/home/kgrittn/pg/master/src/backend/access/spgist/spgdoinsert.c:680

  This frame has 57 object(s):
[32, 40) ''
[96, 104) ''
[160, 168) ''
[224, 232) ''
[288, 296) ''
[352, 356) ''
[416, 417) ''
[480, 481) ''
[544, 545) 'insertedNew'
[608, 632) 'in'
[672, 720) 'out'
[768, 776) 'procinfo'
[832, 833) 'includeNew'
[896, 900) 'i'
[960, 964) 'max'
[1024, 1028) 'n'
[1088, 1096) 'innerTuple'
[1152, 1160) 'node'
[1216, 1224) 'nodes'
[1280, 1284) 'newInnerBuffer'
[1344, 1348) 'newLeafBuffer'
[1408, 1416) 'heapPtrs'
[1472, 1480) 'leafPageSelect'
[1536, 1544) 'leafSizes'
[1600, 1608) 'toDelete'
[1664, 1672) 'toInsert'
[1728, 1730) 'redirectTuplePos'
[1792, 1796) 'startOffsets'
[1856, 1864) 'newLeafs'
[1920, 1924) 'spaceToDelete'
[1984, 1988) 'currentFreeSpace'
[2048, 2052) 'totalLeafSizes'
[2112, 2113) 'allTheSame'
[2176, 2496) 'rdata'
[2528, 2532) 'nRdata'
[2592, 2644) 'xlrec' <== Memory access at 

[HACKERS] Should we improve documentation on isolation levels?

2013-11-27 Thread AK
I am not sure if I am posting in the right place - correct me if I am wrong.
The following is not precise:

13.2.1. Read Committed Isolation Level

"Also note that two successive SELECT commands can see different data, even
though they are within a single transaction, if other transactions commit
changes during execution of the first SELECT."

I think it should be re-worded as follows

"Also note that two successive SELECT commands can see different data, even
though they are within a single transaction, if other transactions commit
after the first SELECT starts, and before the second SELECT starts."

The reason: there could be other DML running between these two SELECTs.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Should-we-improve-documentation-on-isolation-levels-tp5780629.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread David Fetter
On Wed, Nov 27, 2013 at 10:29:34AM -0300, Alvaro Herrera wrote:
> Shigeru Hanada wrote:
> 
> > SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> > about interaction between client.  Besides it, I think it would be
> > nice to allow arbitrary FDW as backend of dblink interface like this:
> > 
> > postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> > postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> > syntax') as t(/* record type definition */...);
> > 
> > This provides a way to execute query without defining foreign table.
> 
> Seems to me that if you want to read remote tables without creating a
> foreign table, you could define them locally using something like the
> WITH syntax and then use them normally in the rest of the query.

WITH, or SRF, or whatever, the point is that we need to be able to
specify what we're sending--probably single opaque strings delimited
just as we do other strings--and what we might get back--errors only,
rows, [sets of] refcursors are the ones I can think of offhand.

What we can't do is assume that our parser needs to, or even could, in
principle, understand these things in more detail than that.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 1:09 AM, Peter Geoghegan  wrote:
> Since it took me a relatively long time to recreate this, it may not
> be trivial to do so. Unless you don't think it's useful to do so, I'm
> going to give this test a full 24 hours, just in case it shows up
> anything else like this.

I see a further, distinct error message this morning:

"ERROR:  unrecognized heap_lock_tuple status: 1"

This is a would-be "attempted to lock invisible tuple" error, but with
the error raised by some heap_lock_tuple() call site, unlike the
previous situation where heap_lock_tuple() raised the error directly.
Since with the most recent revision, we handle this (newly possible)
return code in the new ExecLockHeapTupleForUpdateSpec() function, that
just leaves EvalPlanQualFetch() as a plausible place to see it, given
the codepaths exercised in the test case.

-- 
Peter Geoghegan




Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Tom Lane
Boszormenyi Zoltan  writes:
> On 2013-11-23 22:01, Tom Lane wrote:
>> Boszormenyi Zoltan  writes:
>>> Attached is the patch that modified the command tag returned by
>>> the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR"
>>> or "DECLARE NO SCROLL CURSOR" depending on the cursor's
>>> scrollable flag that can be determined internally even if neither is
>>> asked explicitly.

>> This does not strike me as an acceptable change.  It will break any code
>> that's expecting the existing command tag, for little or no benefit
>> to most applications.  Even if backwards compatibility were of no concern,
>> I'm not convinced it's a good thing to expose the backend's internal
>> choices about query plans used for cursors, which is what this is
>> basically doing.

> I saw code in the backend allowing a cursor to be scrollable, although
> it was not declared as such. How about ripping that out?

That also fails the unnecessary-backwards-compatibility-break test.

regards, tom lane




Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Tom Lane
Kevin Grittner  writes:
> Tom Lane  wrote:
>> 3. Establish a coding rule that if you catch an error with
>> PG_TRY() and don't re-throw it, you have to unblock signals in
>> your PG_CATCH block.

> Could that be done in the PG_END_TRY macro?

Interesting idea ... [ thinks for a bit ... ] but I'm not sure it's really
a great fix.  PG_END_TRY would have to unblock *all* signals, I think,
and it's not clear you want that in that mechanism.  Consider a PG_TRY
executing inside a signal handler.  It would be semantically reasonable
to use sigsetjmp(foo,1) in PG_TRY, but as I said, I'm trying to avoid
that on performance grounds.

In any case, after sleeping on it I've realized that the HOLD_INTERRUPTS
dance I proposed adding to handle_sig_alarm() is necessary but not
sufficient to deal with interruptions from SIGINT.  The real reason that
we need HOLD_INTERRUPTS there is that the signal handler may be executing
while ImmediateInterruptsOK is true.  So if a random SIGINT happened to
arrive while we're updating the timeout data structures, we could lose
control and leave partially updated (ie, totally corrupted) timeout info.
Adding HOLD_INTERRUPTS prevents that case.  But, imagine that SIGINT
happens just as we enter handle_sig_alarm(), before we can do
HOLD_INTERRUPTS.  Then the SIGINT handler could longjmp out of the SIGALRM
handler; the data structures are OK, but we've effectively lost that
interrupt, and no new one is scheduled.  This isn't much of a problem
for the existing timer event handlers, because we'd just cancel them
anyway on the way out of the (sub)transaction, cf LockErrorCleanup.
But it could be a problem for future usages of the timer infrastructure.

In the case where we longjmp all the way out to the idle loop, we cancel
all pending timeouts anyway in postgres.c's sigsetjmp cleanup stanza.
It would be reasonable to add a PG_SETMASK(&UnBlockSig) in the same area.

I am thinking that what we want to do is add a signal unblock step to
(sub)transaction abort, too, as well as a call to timeout.c to let it
re-establish its timer interrupt in case one got lost.  This would fix
the problem as long as you assume that anyone catching an arbitrary error
condition will do a subtransaction abort to get back into a good state.
But we're requiring that already.

regards, tom lane




Re: [HACKERS] Handling GIN incomplete splits

2013-11-27 Thread Heikki Linnakangas

On 11/22/13 15:04, Michael Paquier wrote:

2) post recovery cleanup:
- OK, so roughly the soul of this patch is to change the update
mechanism for a left child gin page so as the parent split is always
done first before any new data is inserted in this child. And this
ensures that we can remove the xlog cleanup mechanism for gin page
splits in the same fashion as gist... xlog redo mechanism is then
adapted according to that.



- I did some tests with the patch:
-- Index creation time
vanilla: 3266.834
with the two patches: 3412.473 ms


Hmm. I didn't expect any change in index creation time. Is that 
repeatable, or within the margin of error?



2-1) In ginFindParents, is the case where the stack has no parent
possible (aka the stack is the root itself)? Shouldn't this code path
check if root is NULL or not?


No. A root split doesn't need to find the parent (because there isn't 
any), so we never call ginFindParents with a stack containing just the 
root. If you look at the only caller of ginFindParents, it's quite clear 
that stack->parent always exists.



2-2) Not sure that this structure is in-line with the project policy:
struct
{
BlockNumber left;
BlockNumber right;
} children;
Why not adding a complementary structure in gin_private.h doing that?
It could be used as well in ginxlogSplit to specify a left/right
family of block numbers.


I don't think we have a project policy against in-line structs. Might be 
better style to not do it, anyway, though.


Come to think of it, that children variable mustn't be declared inside 
the if-statement; it's no longer in scope by the time the XLogInsert is 
done, so &children was pointing to a potentially uninitialized piece of 
stack. Valgrind would've complained.


I ended up using BlockIdData[2] for that.

Committed, thanks for the review!

- Heikki




Re: [HACKERS] Handling GIN incomplete splits

2013-11-27 Thread Jeff Janes
On Wed, Nov 13, 2013 at 8:49 AM, Heikki Linnakangas  wrote:

> Here's another part of my crusade against xlog cleanup routines. This
> series of patches gets rid of the gin_cleanup() function, which is
> currently used to finish splits of GIN b-tree pages, if the system crashes
> (or an error occurs) between splitting a page and inserting its downlink to
> the parent.
>
> The first three patches just move code around. IMHO they make the code
> more readable, so they should be committed in any case. The meat is in the
> fourth patch.
>
> Thoughts, objections?
>
> Alexander, I'm sorry if this conflicts with your GIN patches. Feel free to
> post the latest versions of your patches against the current master,
> ignoring patches. I can fix the bitrot. That said, I think these
> refactorings will make your code look a little bit nicer too, so you might
> want to rebase because of that anyway.


Hi Heikki,

The commit 04eee1fa9ee80dabf7 of this series causes a self-deadlock in the
LWLock code during the operation below, with it trying to take
an LW_EXCLUSIVE on a high, even-numbered lockid when it already holds the
same lockid.

CREATE INDEX planet_osm_ways_nodes ON planet_osm_ways USING gin (nodes)
 WITH (FASTUPDATE=OFF);

It happens pretty reliably using osm2pgsql.

I will try to come up with a simple reproducible demonstration, and stack
trace, over the weekend.

Cheers,

Jeff


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Atri Sharma
On Wed, Nov 27, 2013 at 11:08 PM, Dimitri Fontaine
 wrote:
> Atri Sharma  writes:
>> This would work,but how can we do it for FDWs which do not parse SQL?
>> Am I missing something here?
>
> Worst case:
>
>   CREATE FOREIGN VIEW foo
>   AS $$
> whatever syntax is accepted on the other side


That doesn't sound like a very good idea.

Can we add a function to the FDW API to define a SQL to foreign server
side conversion?

I am just musing though.

Regards,

Atri



-- 
Regards,

Atri
l'apprenant




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Atri Sharma  writes:
> This would work,but how can we do it for FDWs which do not parse SQL?
> Am I missing something here?

Worst case:

  CREATE FOREIGN VIEW foo
  AS $$
whatever syntax is accepted on the other side
  $$;

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Atri Sharma
> I guess the view query would have to be validated by the FDW, which
> would just receive a text.

+1

This is not exactly in context, but David Fetter and I recently
discussed how we could do a similar thing.

This would work,but how can we do it for FDWs which do not parse SQL?

Am I missing something here?

Regards,

Atri


-- 
Regards,

Atri
l'apprenant




Re: [HACKERS] lock on object is already held

2013-11-27 Thread Tom Lane
Daniel Wood  writes:
> Does the original version of my stress test not repro the problem on 9.2?

[ tries it ... ]  No, it doesn't, or at least the MTBF is a couple orders
of magnitude better than on 9.3.

Another odd thing (seen with my short version as well as your original)
is that 9.3/HEAD run the test case enormously faster than 9.2 and 9.1
do.  The older versions seem to spend a lot of time sleeping, which
I don't understand.

> Why does LockAcquireExtended() test for "nLocks == 0" in the "if
> (dontWait)" block before calling RemoveLocalLock()?

Looks like a useless test to me --- we wouldn't be here at all if nLocks
had been positive to start with, and there's nothing in between that
could raise the count.  On the other hand, removing a LOCALLOCK that
did have positive count would be disastrous.  Maybe what would be
more appropriate is an Assert(nLocks == 0) in RemoveLocalLock().

regards, tom lane




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Shigeru Hanada  writes:
> I'm sorry but I don't see the point here.  Do you mean that user
> executes CREATE FOREIGN VIEW in advance and uses the view in a

Yes that's what I mean.

> I think it's nice to support executing ad-hoc remote query written in
> the syntax which is valid only on remote data source through FDW, and
> at the moment dblink interface seems feasible for that purpose.

I guess the view query would have to be validated by the FDW, which
would just receive a text.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-27 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Andres Freund wrote:

> This seems simple to handle by adding the check you propose to the loop.
> Basically if the xmax doesn't match the xmin, we reached the end,
> there's nothing more to lock and we can return success without any
> further work:

As mentioned in the thread for bug #8434, the complete working patch for
this is attached.

> > b) Check whether a chain element actually aborted - currently we're
> >only doing that in the HEAP_KEYS_UPDATED updated case, but that seems
> >wrong (we can't check for committed tho!).
> 
> Let me point out that this is exactly the same code that would be
> affected by my proposed fix for #8434, which would have this check the
> updateXid in all cases, not only in KEYS_UPDATED as currently.

I posted a patch for this problem in the thread about #8434.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
commit 1a4cadc729e724fa1aa6f260a4988b3615fccd1b
Author: Alvaro Herrera 
Date:   Wed Nov 27 12:12:42 2013 -0300

Compare new tuple's Xmin to previous Xmax while following an update chain

Not doing so causes us to traverse an update chain that has been broken
by concurrent page pruning.  All other code that traverses update chains
is careful to do this check, so replicate it here too.  Failure to do so
leads to erroneous CLOG, subtrans or multixact lookups.

Per bug report by J Smith in
cadfupgc5bmtv-yg9znxv-vcfkb+jprqs7m2oesqxam_4z1j...@mail.gmail.com
diagnosed by Andres Freund.

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c3b2108..5eb45ff 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4819,6 +4819,7 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
 old_infomask;
 	TransactionId xmax,
 new_xmax;
+	TransactionId priorXmax = InvalidTransactionId;
 
 	ItemPointerCopy(tid, &tupid);
 
@@ -4844,6 +4845,18 @@ l4:
 		CHECK_FOR_INTERRUPTS();
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 
+		/*
+		 * Check the tuple XMIN against prior XMAX, if any.  If we reached
+		 * the end of the chain, we're done, so return success.
+		 */
+		if (TransactionIdIsValid(priorXmax) &&
+			!TransactionIdEquals(HeapTupleHeaderGetXmin(mytup.t_data),
+ priorXmax))
+		{
+			UnlockReleaseBuffer(buf);
+			return HeapTupleMayBeUpdated;
+		}
+
 		old_infomask = mytup.t_data->t_infomask;
 		xmax = HeapTupleHeaderGetRawXmax(mytup.t_data);
 
@@ -4944,6 +4957,7 @@ l4:
 		}
 
 		/* tail recursion */
+		priorXmax = HeapTupleHeaderGetUpdateXid(mytup.t_data);
 		ItemPointerCopy(&(mytup.t_data->t_ctid), &tupid);
 		UnlockReleaseBuffer(buf);
 	}



Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Noah Misch
On Wed, Nov 27, 2013 at 06:23:38AM -0800, Kevin Grittner wrote:
> Andres Freund  wrote:
> > On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
> >
> >> I happened to build in a shell that was still set up for the clang
> >> address sanitizer, and got the attached report.  On a rerun it was

(Kevin, I saw no attachment.)

> >> repeatable.  XLogInsert() seems to read past the end of a variable
> >> allocated on the stack in doPickSplit(). I haven't tried to analyze
> >> it past that, since this part of the code is unfamiliar to me.
> >
> > Yea, I've seen that one before as well and planned to report it at some
> > point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
> > up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
> > toDelete.
> 
> Have you established whether having the CRC calculation read past
> the end of the buffer can cause problems on recovery or standby
> systems?  Should we try to get this fixed by Monday?

The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic.  We should fix
this in due course on code hygiene grounds, but I would not back-patch it.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Modify the DECLARE CURSOR command tag depending on the scrollable flag

2013-11-27 Thread Boszormenyi Zoltan

2013-11-23 22:01 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan  writes:

Attached is the patch that modified the command tag returned by
the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR"
or "DECLARE NO SCROLL CURSOR" depending on the cursor's
scrollable flag that can be determined internally even if neither is
asked explicitly.

This does not strike me as an acceptable change.  It will break any code
that's expecting the existing command tag, for little or no benefit
to most applications.  Even if backwards compatibility were of no concern,
I'm not convinced it's a good thing to expose the backend's internal
choices about query plans used for cursors, which is what this is
basically doing.


I saw code in the backend allowing a cursor to be scrollable, although
it was not declared as such. How about ripping that out?

That way there would be no incentive for lazy SQL coding using simple cursors.

You can argue that it would also break application compatibility, but
on the other hand, such code has always been buggy and should be fixed.


It is expected by the ECPG cursor readahead code.

And that doesn't sound like a sufficient excuse.  You should only assume a
cursor is scrollable if SCROLL was specified in the cursor declaration
command, which it'd seem to me is something ECPG would or easily could
know about commands it issues.


Yes, it can and I have a patch in the series passing this info to ecpglib.

I am also arguing for backward compatibility from a different angle:
this small backend change would still allow using simple cursors
in ECPG while using the cursor readahead.

And it's not the first time drivers have had to adapt to new PostgreSQL major
versions.
If it were, I wouldn't have the courage to set a precedent either.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/





Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Shigeru Hanada
2013/11/27 Dimitri Fontaine :
> Alvaro Herrera  writes:
>> Seems to me that if you want to read remote tables without creating a
>> foreign table, you could define them locally using something like the
>> WITH syntax and then use them normally in the rest of the query.
>
> I guess what's needed here is a kind of barrier that allows pushing a
> whole arbitrary subquery (with joins and quals and whatnot) down to the
> remote side.

Yes, a big problem is how to avoid parsing the remote query in the PG
context.  A bare query string (other than a string literal) is always
parsed by the PG parser, but the remote side may have different syntax
and semantics; as Dimitri says, we need to pass the whole arbitrary
query string to the remote side as-is.

> My current thinking about how to solve that would be to add a notion of
> FOREIGN VIEW in our system, which would basically implement that barrier
> and send the view definition on the remote, with known quals values as
> constants, or something like that.

I'm sorry, but I don't see the point here.  Do you mean that the user
executes CREATE FOREIGN VIEW in advance and uses the view in a
subsequent query?  Or do you mean allowing new syntax like WITH alias AS
FOREIGN VIEW (remote query)?

I think it would be nice to support executing an ad-hoc remote query,
written in syntax that is valid only on the remote data source, through
an FDW, and at the moment the dblink interface seems feasible for that
purpose.

-- 
Shigeru HANADA




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Merlin Moncure
On Wed, Nov 27, 2013 at 4:20 AM, Shigeru Hanada
 wrote:
> Hi Merlin,
>
> 2013/11/22 Merlin Moncure :
>> On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
>>  wrote:
>>> 2013/11/22 Tom Lane :
 Merlin Moncure  writes:
> On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian  wrote:
>> I know join pushdowns seem insignificant, but it helps to restrict what
>> data must be passed back because you would only pass back joined rows.

> By 'insignificant' you mean 'necessary to do any non-trivial real
> work'.   Personally, I'd prefer it if FDW was extended to allow
> arbitrary parameterized queries like every other database connectivity
> API ever made ever.

 [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
 following the SQL standard's SQL/MED design, which does not do it
 like that.
>>>
>>> Pass-through mode mentioned in SQL/MED standard might be what he wants.
>>
>> happen to have a link handy?
>
> The SQL/MED standard doesn't say much about PASS THROUGH mode,
> especially about the interaction with the client.  Besides that, I
> think it would be nice to allow an arbitrary FDW as the backend of
> the dblink interface, like this:
>
> postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> syntax') as t(/* record type definition */...);
>
> This provides a way to execute a query without defining a foreign table.

yeah.  (thanks for indulging -- this is barely on topic I guess).

if it were possible to create a supporting function (say, fdw_link)
that could somehow interface with a previously established server, it
could probably be worked out.  Then all FDWs could leverage
parameterization without having to copy and paste the pgsql-fdw qual
push code.  But that would be a fairly large break from the rest of
the FDW syntax, and having to define the record at each call site is
admittedly a bit of a headache.

Hm, another way to think about this would be to somehow abstract the
qual push into a library so that it could be accessed by other FDWs if
they opted in.  This would address my chief complaint that only the
pgsql-fdw (the only database for which we already have an in-core high
quality connection api) driver could tap the excellent work you've
done.  If this were even possible, it would probably result in more
fdw API changes.

If my:

SELECT * FROM big_sql_server_foreign_table WHERE id = x;

was fast, that'd be pretty nice.

merlin




Re: [HACKERS] Name type in postgres

2013-11-27 Thread Tom Lane
mohsen soodkhah mohammadi  writes:
> I want to understand: can we have a Name without a null terminator?

No.  Possibly that was the original intention, but for many years the
system has enforced that Name contains at least one trailing null byte, ie
the maximum usable length of identifiers is NAMEDATALEN-1.  So the code
you quote is valid.

regards, tom lane




[HACKERS] [HITB-Announce] #HITB2014AMS Call for Papers Now Open

2013-11-27 Thread Hafez Kamal

Hi everyone - The Call for Papers for the 5th annual HITB Security
Conference in Amsterdam is now open. #HITB2014AMS takes place at the
Beurs van Berlage from the 27th - 30th of May 2014. The official
conference hotel for the event is the Hilton DoubleTree.

As always we start with 2-days of hands on technical trainings followed
by a 2-day triple track conference. However, 2014 also sees the
introduction of a brand new addition to the HITB line up - the HITB
Haxpo or hacker expo!

With support and backing from iAmsterdam, HITB Haxpo is going to be
unlike anything that's been done before. Think IT security exhibition
meets Makerfaire with a generous touch of HITBSecConf flavor thrown in
for good measure.

What exactly does that mean? Imagine an area dedicated to hackerspaces;
makers with 3D printers, laser cutters and other fabrication goodies
coupled with TOOOL's Lock Picking Village, HITB and Mozilla's
HackWEEKDAY developer hackathon, a hackerspaces challenge featuring LEGO
Mindstorms EV3, our Capture the Flag 'live hacking' competition and
topped off by a 3 day IT exhibition featuring Microsoft and Google as
the main anchors. The best part? Entrance to the 3-day Haxpo is
COMPLETELY FREE!

As 2014 marks our 5th year anniversary of HITB in EU, #HITB2014AMS will
be a very special edition conference with an ALL WOMEN keynote line up:

Opening Keynote (29th May 2014):


Katie Moussouris (Lead, Microsoft Security Response Center)
Kristin Lovejoy (GM, IBM Security Services Division)

Closing Keynote (29th May 2014):
===

Mischel Kwon (Former Director, US Computer Emergency Response Team)

Opening Keynote (30th May 2014):


Pamela Fusco (Chief Information Security Officer, Apollo Group)
Jennifer Steffens (Chief Executive Officer, IOActive)

Closing Keynote (30th May 2014):


Jaya Baloo (Chief Information Security Officer, KPN Telecom)

---

We're looking for talks that are highly technical, but most importantly,
material which is new, fresh and hasn't been presented previously.

Submission Deadlines:

   Round #1 selection: 1st week January 2014
   Round #2 selection: 1st Week of February 2014


CFP submissions will be evaluated in 2 rounds. If all slots are filled
in the first selection round, we will close CFP early so DON'T DELAY
SUBMITTING

HITB CFP: http://cfp.hackinthebox.org/

Event Website: http://haxpo.nl || http://conference.hitb.org
(Opens 1st Dec)

===

Each accepted submission will entitle the speaker(s) to
accommodation for 3 nights / 4 days and travel expense reimbursement
up to EUR1200.00 per speaking slot.

Topics of interest include, but are not limited to the following:

  Cloud Security
  File System Security
  3G/4G/WIMAX Security
  SS7/GSM/VoIP Security
  Security of Medical Devices
  Critical Infrastructure Security
  Smartphone / Mobile Security
  Smart Card and Physical Security
  Network Protocols, Analysis and Attacks
  Applications of Cryptographic Techniques
  Side Channel Analysis of Hardware Devices
  Analysis of Malicious Code / Viruses / Malware
  Data Recovery, Forensics and Incident Response
  Hardware based attacks and reverse engineering
  Windows / Linux / OS X / *NIX Security Vulnerabilities
  Next Generation Exploit and Exploit Mitigation Techniques
  NFC, WLAN, GPS, HAM Radio, Satellite, RFID and Bluetooth Security

WHITE PAPER: If your presentation is short listed for inclusion into the
conference program, a technical white paper must also be provided for
review (3000 - 5000 words).

Your submissions will be reviewed by The HITB CFP Review Committee:

Charlie Miller (formerly Principal Research Consultant, Accuvant Labs)
Katie Moussouris, Senior Security Strategist, Microsoft
Itzik Kotler, Chief Technology Officer, Security Art
Cesar Cerrudo, Chief Technology Officer, IOActive
Jeremiah Grossman, Founder, Whitehat Security
Andrew Cushman, Senior Director, Microsoft
Saumil Shah, Founder CEO Net-Square
Thanh 'RD' Nguyen, THC, VNSECURITY
Alexander Kornburst, Red Database
Fredric Raynal, QuarksLab
Shreeraj Shah, Founder, BlueInfy
Emmanuel Gadaix, Founder, TSTF
Andrea Barisani, Inverse Path
Philippe Langlois, TSTF
Ed Skoudis, InGuardians
Haroon Meer, Thinkst
Chris Evans, Google
Raoul Chiesa, TSTF/ISECOM
rsnake, SecTheory
Gal Diskin, Intel
Skyper, THC

Note: We do not accept product or vendor related pitches. If you would
like to showcase your company's products or technology at HITB Haxpo
(which also has its own set of speaking slots), please email
i...@haxpo.nl or conferencei...@hackinthebox.org

Regards,
Hafez Kamal
Hack in The Box (M) Sdn. Bhd
36th Floor, Menara Maxis
Kuala Lumpur City Centre
50088 Kuala Lumpur, Malaysia
Tel: +603-26157299
Fax: +603-26150088





Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2013-11-27 Thread Amit Kapila
On Wed, Nov 27, 2013 at 7:35 PM, Robert Haas  wrote:
> On Wed, Nov 27, 2013 at 12:56 AM, Amit Kapila  wrote:
>>> The basic idea is that you use a rolling hash function to divide up
>>> the history data into chunks of a given average size.  So we scan the
>>> history data, compute a rolling hash value at each point, and each
>>> time the bottom N bits are zero, we consider that to be the end of a
>>> chunk.  We enter all the chunks into a hash table.  The chunk size
>>> will vary, but on the average, given a reasonably well-behaved rolling
>>> hash function (the pglz one probably doesn't qualify) it'll happen
>>> every 2^N bytes, so perhaps for this purpose we'd choose N to be
>>> between 3 and 5.  Then, we scan the input we want to compress and
>>> divide it into chunks in the same way.  Chunks that don't exist in the
>>> history data get copied to the output, while those that do get
>>> replaced with a reference to their position in the history data.
>>
>> I think this idea looks better than the current one and it will
>> definitely improve some of the cases, but I'm not sure we can win in
>> all cases.  We have tried a similar idea (reduce the size of the hash
>> and eventually the comparison) by adding every 10th byte to the
>> history lookup table rather than every byte. It improved most cases
>> but not all cases (the "hundred tiny fields, all changed" and
>>  "hundred tiny fields, half changed" tests were still slow).
>> Patch and results are at link (refer approach-1):
>> http://www.postgresql.org/message-id/001f01ce1c14$d3af0770$7b0d1650$@kap...@huawei.com
>
> What you did there will, I think, tend to miss a lot of compression
> opportunities.  Suppose for example that the old tuple is
> ABCDEFGHIJKLMNOP and the new tuple is xABCDEFGHIJKLMNOP.  After
> copying one literal byte we'll proceed to copy 9 more, missing the
> fact that there was a long match available after the first byte.

That is right, but one idea behind trying that out was to see if we
could reduce CPU usage at the cost of compression ratio; however, we
found that it didn't completely eliminate the problem.

> The
> advantage of the fingerprinting technique is that it's supposed to be
> resistant to that sort of thing.

Okay, one question that arises here is whether it can be better in
terms of CPU usage compared to using a hash function for every 10th
byte.  If you have a feeling that it can improve the situation, I can
try a prototype implementation to check the results.


>> Now the tough question is what are the possible options for this patch
>> and which one to pick:
>> a. optimize encoding technique, so that it can improve results in most
>> cases even if not all.
>> b. have a table level option or guc to enable/disable WAL compression.
>> c. use some heuristics to check if chances of compression are good,
>> then only perform compression.
>> 1. apply this optimization for tuple size > 128 and < 2000
>> 2. apply this optimization if number of modified columns are less
>> than 25% (some threshold number) of total columns.
>> I think we can get modified columns from target entry and use
>> it if triggers haven't changed that tuple. I remember
>> earlier there were concerns that this value can't be trusted
>> completely, but I think using it as a heuristic is not a
>> problem, even if this number is not right in some cases.
>> d. forget about this optimization and reject the patch.
>> I think doing options 'b' and 'c' together can make this
>> optimization usable in cases where it is actually useful.
>
> I agree that we probably want to do (b), and I suspect we want both a
> GUC and a reloption, assuming that can be done relatively cleanly.
>
> However, I think we should explore (a) more before we explore (c).

Sure, but to explore (a), the scope is a bit bigger. We have the below
options to explore (a):
1. try to optimize the existing algorithm as used in the patch, which
we have tried, but of course we can spend some more time to see if
anything more can be tried out.
2. try fingerprint technique as suggested by you above.
3. try some other standard methods like vcdiff, lz4 etc.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Kevin Grittner
Tom Lane  wrote:

> 3. Establish a coding rule that if you catch an error with
> PG_TRY() and don't re-throw it, you have to unblock signals in
> your PG_CATCH block.

Could that be done in the PG_END_TRY macro?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2013-11-27 Thread Kevin Grittner
Andres Freund  wrote:
> On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
>
>> I happened to build in a shell that was still set up for the clang
>> address sanitizer, and got the attached report.  On a rerun it was
>> repeatable.  XLogInsert() seems to read past the end of a variable
>> allocated on the stack in doPickSplit(). I haven't tried to analyze
>> it past that, since this part of the code is unfamiliar to me.
>
> Yea, I've seen that one before as well and planned to report it at some
> point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
> up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
> toDelete.

Have you established whether having the CRC calculation read past
the end of the buffer can cause problems on recovery or standby
systems?  Should we try to get this fixed by Monday?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Extension Templates S03E11

2013-11-27 Thread Vik Fearing
On 11/26/2013 10:07 PM, Dimitri Fontaine wrote:
> What I now think we should do is only grant superusers the privileges to
> install an extension from a template they own or is owned by another
> superuser.

Say what?  Superusers bypass all privileges by definition.

-- 
Vik





Re: [HACKERS] new unicode table border styles for psql

2013-11-27 Thread Pavel Stehule
2013/11/26 Pavel Stehule 

>
>
>
> 2013/11/26 Peter Eisentraut 
>
>> On 11/22/13, 3:26 AM, Pavel Stehule wrote:
>> > website is related to patch for 9.3 (I add note there)
>> >
>> > patch for 9.4 is fixed - and now with small doc
>>
>> I think it would help if we considered the new border styles and the new
>> line styles separately.
>>
>> I don't find the new border styles to be particularly useful.  They just
>> use up vertical screen space, which is usually more precious than
>> horizontal space.  Is there a situation where you would find these styles
>> to be more useful than the existing ones?  Keep in mind that pset is
>> usually set permanently, so it wouldn't be practical to use a different
>> border style depending on how the query results shape up (like \x auto).
>>
>> Now for the linestyles.  I can see how some of them are attractive, but
>> several of them have poor aesthetics, I think.  I don't see a reason to
>> accept 7 new styles just for fun.  If I had to choose, I'd consider
>> -double1 and -double4 to be acceptable.
>>
>
I am sending a reduced patch.

I added double1, double4 and double5, renamed to double1, double2 and
double3.

Support for borders 3 and 4 has been removed.

postgres=# \pset linestyle double1 \pset border 2 \l
Line style (linestyle) is double1.
Border style (border) is 2.
   List of databases
┌───┬──┬──┬─┬─┬───┐
│   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   │
╞═══╪══╪══╪═╪═╪═══╡
│ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   │
│ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   │
│   │  │  │ │ │
postgres=CTc/postgres │
│ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   │
│   │  │  │ │ │
postgres=CTc/postgres │
└───┴──┴──┴─┴─┴───┘
(3 rows)

postgres=# \pset linestyle double2 \pset border 2 \l
Line style (linestyle) is double2.
Border style (border) is 2.
   List of databases
╔═══╤══╤══╤═╤═╤═══╗
║   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   ║
╟───┼──┼──┼─┼─┼───╢
║ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   ║
║ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   ║
║   │  │  │ │ │
postgres=CTc/postgres ║
║ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres   ║
║   │  │  │ │ │
postgres=CTc/postgres ║
╚═══╧══╧══╧═╧═╧═══╝
(3 rows)

postgres=# \pset linestyle double3 \pset border 2 \l
Line style (linestyle) is double3.
Border style (border) is 2.
   List of databases
╔═══╦══╦══╦═╦═╦═══╗
║   Name║  Owner   ║ Encoding ║   Collate   ║Ctype║   Access
privileges   ║
╠═══╬══╬══╬═╬═╬═══╣
║ postgres  ║ postgres ║ UTF8 ║ en_US.UTF-8 ║ en_US.UTF-8
║   ║
║ template0 ║ postgres ║ UTF8 ║ en_US.UTF-8 ║ en_US.UTF-8 ║
=c/postgres   ║
║   ║  ║  ║ ║ ║
postgres=CTc/postgres ║
║ template1 ║ postgres ║ UTF8 ║ en_US.UTF-8 ║ en_US.UTF-8 ║
=c/postgres   ║
║   ║  ║  ║ ║ ║
postgres=CTc/postgres ║
╚═══╩══╩══╩═╩═╩═══╝
(3 rows)

Pavel

>
>
commit 47b3258db6e3e82399417f57ad551ad740348635
Author: Pavel Stehule 
Date:   Wed Nov 27 15:09:01 2013 +0100

initial

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 264cfe6..6e74690 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2088,9 +2088,10 @@ lo_import 152801
   
   Sets the border line drawing style to one
   of ascii, old-ascii
-  or unicode.
-  Unique abbreviations are allowed.  (That would mean one
-  letter is enough.)
+  or unicode, double1,
+  double2 or double3
+  (unicode styles). Unique abbreviations are allowed.
+  (That would mean one letter is enough.)
   The default setting is ascii.
   This option only affects the aligned and
   wrapped output formats.
diff --git a/src

Re: [HACKERS] Shave a few instructions from child-process startup sequence

2013-11-27 Thread Robert Haas
On Tue, Nov 26, 2013 at 10:12 PM, Gurjeet Singh  wrote:
> On Tue, Nov 26, 2013 at 9:42 PM, Peter Eisentraut  wrote:
>>
>> src/backend/postmaster/postmaster.c:2255: indent with spaces.
>> +else
>> src/backend/postmaster/postmaster.c:2267: indent with spaces.
>> +break
>
>
> Not sure how that happened! Attached is the updated patch.

This is a performance patch, so it should come with benchmark results
demonstrating that it accomplishes its intended purpose.  I don't see
any.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] pg_upgrade ?deficiency

2013-11-27 Thread Kevin Grittner
Bruce Momjian  wrote:
> On Tue, Nov 26, 2013 at 03:25:44PM -0800, Kevin Grittner wrote:
>> Bruce Momjian  wrote:
>>
>>> How are we handling breakage of pg_dump, not pg_dumpall?
>>
>> That was discussed.  Do you have something to add?
>
> I am confused what we are patching.  Are we patching pg_dump,
> pg_dumpall, or both?

Just pg_dumpall.c.

> Can I see the full patch?

It was attached to this post:

http://www.postgresql.org/message-id/1385225082.8248.yahoomail...@web162901.mail.bf1.yahoo.com

> Are we propagating other settings from pg_dump to pg_dumpall,
> like statement_timeout?

pg_dumpall output sets up the global objects (including their
properties) and then does a \connect to each database, followed by
the same output that pg_dump would generate for that database. 
That includes the SET statements for statement_timeout, etc.  The
patch does nothing to change what objects or properties the
pg_dumpall output tries to set up, it just sets a property *on the
current connection* to allow those statements to run without error.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2013-11-27 Thread Robert Haas
On Wed, Nov 27, 2013 at 12:56 AM, Amit Kapila  wrote:
>> - There is a comment "TODO: It would be nice to behave like the
>> history and the source strings were concatenated, so that you could
>> compress using the new data, too."  If we're not already doing that,
>> then how are we managing to compress WAL by more than one-quarter in
>> the "hundred tiny fields, all changed" case?
>
> The algorithm is not doing concatenation of the history and source
> strings; the hash table is formed just with history data, and then
> later, if a match is not found, the data is added to the history, so
> this can be the reason for the above result.

From the compressor's point of view, that's pretty much equivalent to
behaving as if those strings were concatenated.

The point is that there's a difference between using the old tuple's
history entries to compress the new tuple and using the new tuple's
own history to compress it.  The former is delta-compression, which is
what we're supposedly doing here.  The latter is just plain
compression.  That doesn't *necessarily* make it a bad idea, but they
are clearly two different things.

>> The basic idea is that you use a rolling hash function to divide up
>> the history data into chunks of a given average size.  So we scan the
>> history data, compute a rolling hash value at each point, and each
>> time the bottom N bits are zero, we consider that to be the end of a
>> chunk.  We enter all the chunks into a hash table.  The chunk size
>> will vary, but on the average, given a reasonably well-behaved rolling
>> hash function (the pglz one probably doesn't qualify) it'll happen
>> every 2^N bytes, so perhaps for this purpose we'd choose N to be
>> between 3 and 5.  Then, we scan the input we want to compress and
>> divide it into chunks in the same way.  Chunks that don't exist in the
>> history data get copied to the output, while those that do get
>> replaced with a reference to their position in the history data.
>
> I think this idea looks better than the current one and it will
> definitely improve some of the cases, but I'm not sure we can win in
> all cases.  We have tried a similar idea (reduce the size of the hash
> and eventually the comparison) by adding every 10th byte to the
> history lookup table rather than every byte. It improved most cases
> but not all cases (the "hundred tiny fields, all changed" and
> "hundred tiny fields, half changed" tests were still slow).
> Patch and results are at link (refer approach-1):
> http://www.postgresql.org/message-id/001f01ce1c14$d3af0770$7b0d1650$@kap...@huawei.com

What you did there will, I think, tend to miss a lot of compression
opportunities.  Suppose for example that the old tuple is
ABCDEFGHIJKLMNOP and the new tuple is xABCDEFGHIJKLMNOP.  After
copying one literal byte we'll proceed to copy 9 more, missing the
fact that there was a long match available after the first byte.  The
advantage of the fingerprinting technique is that it's supposed to be
resistant to that sort of thing.

> Now the tough question is what are the possible options for this patch
> and which one to pick:
> a. optimize encoding technique, so that it can improve results in most
> cases even if not all.
> b. have a table level option or guc to enable/disable WAL compression.
> c. use some heuristics to check if chances of compression are good,
> then only perform compression.
> 1. apply this optimization for tuple size > 128 and < 2000
> 2. apply this optimization if number of modified columns are less
> than 25% (some threshold number) of total columns.
> I think we can get modified columns from target entry and use
> it if triggers haven't changed that tuple. I remember
> earlier there were concerns that this value can't be trusted
> completely, but I think using it as a heuristic is not a
> problem, even if this number is not right in some cases.
> d. forget about this optimization and reject the patch.
> I think doing options 'b' and 'c' together can make this
> optimization usable in cases where it is actually useful.

I agree that we probably want to do (b), and I suspect we want both a
GUC and a reloption, assuming that can be done relatively cleanly.

However, I think we should explore (a) more before we explore (c).   I
think there's a good chance that we can reduce the CPU overhead of
this enough to feel comfortable having it enabled by default.  If we
proceed with heuristics as in approach (c), I don't think that's the
end of the world, but I think there will be more corner cases where we
lose and have to fiddle things manually.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Dimitri Fontaine
Alvaro Herrera  writes:
> Seems to me that if you want to read remote tables without creating a
> foreign table, you could define them locally using something like the
> WITH syntax and then use them normally in the rest of the query.

I guess what's needed here is a kind of barrier that allows pushing a
whole arbitrary subquery (with joins and quals and whatnot) down to the
remote side.

My current thinking about how to solve that would be to add a notion of
FOREIGN VIEW in our system, which would basically implement that barrier
and send the view definition on the remote, with known quals values as
constants, or something like that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Handling GIN incomplete splits

2013-11-27 Thread Heikki Linnakangas

On 11/22/13 15:04, Michael Paquier wrote:

On Thu, Nov 21, 2013 at 9:44 PM, Heikki Linnakangas
 wrote:

Here's a new version. To ease the review, I split the remaining patch again
into two, where the first patch is just yet more refactoring.

I also fixed some bugs in WAL logging and replay that I bumped into while
testing.

Cool. Here is the review of the two remaining patches.
1) More refactoring, general remarks:
- Code compiles without warnings
- Passes make check
- If I got it correctly... this patch separates the part managing data
to be inserted from ginbtree. I can understand the meaning behind that
if we consider that GinBtree is used only to define methods and search
conditions (flags and keys). However I am wondering if this does not
make the code more complicated...


Well, I think it's an improvement in readability to separate the 
insertion payload from the search information. With the second patch it 
becomes more important, because if an incompletely split page is 
encountered while you're inserting a value, you need to insert the 
downlink for the old incomplete split first, and then continue the 
insertion of the original value where you left off. That "context 
switching" becomes a lot easier when the value you're inserting is 
passed around separately.



Particularly the flag isDelete that
is only passed to ginxlogInsert merits a comment IMO. Note that I
haven't read the 2nd patch when writing that :)


Yeah, that gets removed in the second patch, which changes the WAL 
record format. :-)



1-2) Could it be possible to change the variable name of
"GinBtreeEntryInsertData *entry" in entryIsEnoughSpace? entry->entry
is kind of hard to follow... One idea would be to rename it to insertEntry.
Another idea would be to rename entry in GinBtreeEntryInsertData to
entryData or entryTuple.


ok, renamed the variable to insertData.


1-3) This block of code is present two times:
+   if (!isleaf)
+   {
+   PostingItem *pitem = GinDataPageGetPostingItem(lpage, off);
+   PostingItemSetBlockNumber(pitem, updateblkno);
+   }
Should the update of a downlink to point to the next page be a
separate function?


Doesn't seem any better to me.

Thanks, committed this refactoring patch now. Will continue on the main 
patch.


- Heikki




Re: [HACKERS] Traffic jams in fn_extra

2013-11-27 Thread Greg Stark
On Sun, Nov 24, 2013 at 4:21 PM, Simon Riggs  wrote:

> Why do you need to do this dance with fn_extra?
>
> It's possible to allocate a hash table in a Transaction-lifetime
> memory context on first call into a function then cache things there.
>

fn_extra gives a handle per function call site.

It sounds to me like the complexity is coming not from having many Postgres
functions but from having lots of infrastructure backing up those
functions. So if many of their Postgres functions call a C function to do
various things, and that C function wants to cache something somewhere
related to the object it has been passed, then the natural thing to do is
to have the Postgres function pass fn_extra down to the C function. But if
they have many such C functions it gets a bit tricky.


But you could declare fn_extra to be a hash table yourself since it's your
Postgres function that gets to decide how fn_extra is going to be used. You
could then pass that hash table down to the various C functions to cache
state. However that might still be a bit odd. If you call the same C
function twice from the same Postgres function it'll get the same hash
table for both calls. fn_extra is per Postgres function call site.


-- 
greg


Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Alvaro Herrera
Shigeru Hanada escribió:

> SQL/MED standard doesn't say much about PASS THROUGH mode, especially
> about interaction between client.  Besides it, I think it would be
> nice to allow arbitrary FDW as backend of dblink interface like this:
> 
> postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
> postgres=> SELECT * FROM dblink('con1', 'some query written in remote
> syntax') as t(/* record type definition */...);
> 
> This provides a way to execute query without defining foreign table.

Seems to me that if you want to read remote tables without creating a
foreign table, you could define them locally using something like the
WITH syntax and then use them normally in the rest of the query.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 14:45:25 +0200, Heikki Linnakangas wrote:
> On 11/27/13 14:11, Andres Freund wrote:
> >I don't think this warning is likely to be hit as the code stands -
> >heap_page_prune() et al. will have removed all dead tuples already,
> >right? And so has_dead_tuples won't be set.
> 
> It might be a good idea to add such a warning to heap_page_prune(). Or also
> emit the warning in lazy_scan_heap() if heap_page_prune() returned > 0.

> >Independent from this, ISTM we should add a
> >else if (PageIsAllVisible(page) && all_visible)
> >to those checks.
> 
> Can you elaborate, where should that be added?

I was thinking of adding such a warning below
elog(WARNING, "page containing dead tuples is marked as all-visible in 
relation \"%s\" page %u",..)
but cannot warn against that because GetOldestXmin() can go backwards...

I think it's probably sufficient to set has_dead_tuples = true in the
ItemIdIsDead() branch in lazy_scan_heap(). That should catch relevant
actions from heap_page_prune().

Besides not warning against deletions from heap_page_prune(), the
current warning logic is also buggy for tables without indexes...

/*
 * If there are no indexes then we can vacuum the page right now
 * instead of doing a second scan.
 */
if (nindexes == 0 &&
vacrelstats->num_dead_tuples > 0)
{
/* Remove tuples from heap */
lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
has_dead_tuples = false;

That happens before the
else if (PageIsAllVisible(page) && has_dead_tuples)
check.

With regard to fixing things up, ISTM the best bet is heap_prune_chain()
so far. That's executed by vacuum and by opportunistic pruning and we
know we have the appropriate locks there. Looks relatively easy to fix
up things there. Not sure if there are any possible routes to WAL-log
this other than using log_newpage()?
I am really not sure what the best course of action is :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 14:11, Andres Freund wrote:

On 2013-11-27 13:56:58 +0200, Heikki Linnakangas wrote:

Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.



Hmm. If a page has its visibility-map flag set, but contains a tuple that
appears to be dead because you've wrapped around, vacuum will give a
warning:  "page containing dead tuples is marked as all-visible in relation
\"%s\" page %u".


I don't think this warning is likely to be hit as the code stands -
heap_page_prune() et al. will have removed all dead tuples already,
right? And so has_dead_tuples won't be set.


It might be a good idea to add such a warning to heap_page_prune(). Or 
also emit the warning in lazy_scan_heap() if heap_page_prune() returned > 0.



Independent from this, ISTM we should add a
else if (PageIsAllVisible(page) && all_visible)
to those checks.


Can you elaborate, where should that be added?

- Heikki




Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-11-27 Thread Pavel Stehule
I'll prepare patch


2013/11/27 Tom Lane 

> Dean Rasheed  writes:
> > Actually the IF EXISTS in DROP TABLE now applies to the schema as
> > well. Unfortunately there is currently no consistency across the
> > various DROP commands --- some tolerate a non-existent schema, while
> > others error out.
>
> Yeah.  I think now that we've had this discussion, we should make them
> all tolerate a non-existent schema.  I'm fine with having that happen
> over a series of patches rather than all at once though.
>
> > Also amongst those that tolerate a non-existent
> > schema, the resulting notices are not consistent --- some report the
> > schema-qualified object name, while others just report the local
> > object name.
>
> Less excited about this part, but on the whole I'd vote for the "schema
> "no_such_schema" does not exist" wording in cases where the schema isn't
> there.  The other way is less specific for no very good reason.
>
> regards, tom lane
>


Re: [HACKERS] PL/Python: domain over array support

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 14:15, Marko Kreen wrote:

On Tue, Nov 26, 2013 at 07:12:00PM -0200, Rodolfo Campero wrote:

2013/11/26 Heikki Linnakangas 

Oops, sorry about that. Fixed.


Maybe be you forgot to modify
plpython_types_3.out


Yes.  Heikki, please fix plpython_types_3.out too.

See attached patch.


Ah, sorry. Committed..

- Heikki




Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Andres Freund
On 2013-11-27 13:56:58 +0200, Heikki Linnakangas wrote:
> Ok, committed and backpatched that.

Thanks.

> >I wonder if we need to integrate any mitigating logic? Currently the
> >corruption may only become apparent long after it occurred, that's
> >pretty bad. And instructing people to run a vacuum after the upgrade will
> >cause the corrupted data to be lost if the tuples are already 2^31 XIDs in the past.
> 
> Ugh :-(. Running vacuum after the upgrade is the right thing to do to
> prevent further damage, but you're right that it will cause any
> already-wrapped around data to be lost forever. Nasty.

> >But integrating logic to fix things into heap_page_prune() looks
> >somewhat ugly as well.
> 
> I think any mitigating logic we might add should go into vacuum. It should
> be possible for a DBA to run a command, and after it's finished, be
> confident that you're safe. That means vacuum.

Well, heap_page_prune() is the first thing that's executed by
lazy_scan_heap(), that's why I was talking about it. So anything we do
need to happen in there or before.

> >Afaics the likelihood of the issue occurring on non-all-visible pages is
> >pretty low, since they'd need to be skipped due to lock contention
> >repeatedly.

> Hmm. If a page has its visibility-map flag set, but contains a tuple that
> appears to be dead because you've wrapped around, vacuum will give a
> warning:  "page containing dead tuples is marked as all-visible in relation
> \"%s\" page %u".

I don't think this warning is likely to be hit as the code stands -
heap_page_prune() et al. will have removed all dead tuples already,
right? And so has_dead_tuples won't be set.

Independent from this, ISTM we should add a
   else if (PageIsAllVisible(page) && all_visible)
to those checks.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PL/Python: domain over array support

2013-11-27 Thread Marko Kreen
On Tue, Nov 26, 2013 at 07:12:00PM -0200, Rodolfo Campero wrote:
> 2013/11/26 Heikki Linnakangas 
> > Oops, sorry about that. Fixed.
> 
> Maybe be you forgot to modify
> plpython_types_3.out

Yes.  Heikki, please fix plpython_types_3.out too.

See attached patch.

-- 
marko

diff --git a/src/pl/plpython/expected/plpython_types_3.out b/src/pl/plpython/expected/plpython_types_3.out
index 25331f2..e104356 100644
--- a/src/pl/plpython/expected/plpython_types_3.out
+++ b/src/pl/plpython/expected/plpython_types_3.out
@@ -664,6 +664,9 @@ SELECT * FROM test_type_conversion_array_error();
 ERROR:  return value of function with array return type is not a Python sequence
 CONTEXT:  while creating return value
 PL/Python function "test_type_conversion_array_error"
+--
+-- Domains over arrays
+--
 CREATE DOMAIN ordered_pair_domain AS integer[] CHECK (array_length(VALUE,1)=2 AND VALUE[1] < VALUE[2]);
 CREATE FUNCTION test_type_conversion_array_domain(x ordered_pair_domain) RETURNS ordered_pair_domain AS $$
 plpy.info(x, type(x))



Re: [HACKERS] Incomplete freezing when truncating a relation during vacuum

2013-11-27 Thread Heikki Linnakangas

On 11/27/13 11:15, Andres Freund wrote:

On 2013-11-27 11:01:55 +0200, Heikki Linnakangas wrote:

On 11/27/13 01:21, Andres Freund wrote:

On 2013-11-26 13:32:44 +0100, Andres Freund wrote:

This seems to be the case since
b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
scan_all to determine whether we can set new_frozen_xid. That's a slight
pessimization when we scan a relation fully without explicitly scanning
it in its entirety, but given this isn't the first bug around
scanned_pages/rel_pages I'd rather go that way. The aforementioned
commit wasn't primarily concerned with that.
Alternatively we could just compute new_frozen_xid et al before the
lazy_truncate_heap.


I've gone for the latter in this preliminary patch. Not increasing
relfrozenxid after an initial data load seems like a bit of a shame.

I wonder if we should just do scan_all || vacrelstats->scanned_pages <
vacrelstats->rel_pages?


Hmm, you did (scan_all || vacrelstats->scanned_pages <
vacrelstats->rel_pages) for relminmxid, and just (vacrelstats->scanned_pages
< vacrelstats->rel_pages) for relfrozenxid. That was probably not what you
meant to do, the thing you did for relfrozenxid looks good to me.


I said it's a preliminary patch ;). Really, I wasn't sure which of the
two to go for.


Does the attached look correct to you?


Looks good.


Ok, committed and backpatched that.


I wonder if we need to integrate any mitigating logic? Currently the
corruption may only become apparent long after it occurred, that's
pretty bad. And instructing people to run a vacuum after the upgrade will
cause the corrupted data to be lost if the tuples are already 2^31 XIDs in the past.


Ugh :-(. Running vacuum after the upgrade is the right thing to do to 
prevent further damage, but you're right that it will cause any 
already-wrapped around data to be lost forever. Nasty.



But integrating logic to fix things into heap_page_prune() looks
somewhat ugly as well.


I think any mitigating logic we might add should go into vacuum. It 
should be possible for a DBA to run a command, and after it's finished, 
be confident that you're safe. That means vacuum.



Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.


Hmm. If a page has its visibility-map flag set, but contains a tuple 
that appears to be dead because you've wrapped around, vacuum will give 
a warning:  "page containing dead tuples is marked as all-visible in 
relation \"%s\" page %u". So I think if a manual VACUUM FREEZE passes 
without giving that warning, that vacuum hasn't destroyed any data. So 
we could advise to take a physical backup of the data directory, and run 
a manual VACUUM FREEZE on all databases. If it doesn't give a warning, 
you're safe from that point onwards. If it does, you'll want to recover 
from an older backup, or try to manually salvage just the lost rows from 
the backup, and re-index. Ugly, but hopefully rare in practice.


Unfortunately that doesn't mean that you haven't already lost some data. 
Wrap-around could've already happened, and vacuum might already have run 
and removed some rows. You'll want to examine your logs and grep for 
that warning.


- Heikki




Re: [HACKERS] [PATCH] Add transforms feature

2013-11-27 Thread Hannu Krosing
On 11/15/2013 05:04 PM, Dimitri Fontaine wrote:
> Hi,
>
> Peter Eisentraut  writes:
>> Rebased patch.  No changes except that merge conflicts were resolved,
>> and I had to add some Data::Dumper tweaks to the regression tests so
>> that the results came out in  consistent order on different versions of
>> Perl.

> On the higher level design, the big question here is about selective
> behavior. As soon as you CREATE TRANSFORM FOR hstore LANGUAGE plperl
> then any plperl function will now receive its hstore arguments as a
> proper perl hash rather than a string.
>
> Any pre-existing plperl function with hstore arguments or return type
> then needs to be upgraded to handle the new types nicely, and some of
> those might not be under the direct control of the DBA running the
> CREATE TRANSFORM command, when using some plperl extensions for example.
>
> A mechanism allowing for the transform to only be used in some functions
> but not others might be useful. The simplest such mechanism I can think
> of is modeled against the PL/Java classpath facility as specified in the
> SQL standard: you attach a classpath per schema.
If we start adding granularity, then why not go all the way?

I mean, we could do it in the following way

1) create named transforms

CREATE [DEFAULT] TRANSFORM <transform_name> FOR <type> LANGUAGE <language>
(...details...);

2) use it when declaring a function

CREATE FUNCTION <fname> (
IN <argname> <type> [[USING] [TRANSFORM] <transform_name>],
INOUT <argname> <type> [[USING] [IN] [TRANSFORM] <transform_name>]
[[USING] [OUT] [TRANSFORM] <transform_name>],
OUT <argname> <type> [[USING] [TRANSFORM] <transform_name>],
 ... 
) LANGUAGE <language> $$
<function body>
$$;

This approach allows full flexibility in using "old" packages, especially
if we define the old transform behaviour as the "DEFAULT TRANSFORM".

Default transforms also allow an easy way to rewrite the current type I/O
conversions between languages into transforms.

There are immediately a few transforms that I would find useful

A) pass field data to language as pairs of (typeoid, typebin)

this is useful for speed, especially if you do not want to use many
of the passed arguments on most invocations

B) pass field data in as (typeoid, typebin), except do not de-toast
values but pass in the toast ids, so the function is free to use only
parts of toasted values as it needs

C) pass field data in as string, probably the default behaviour for
languages like pl/tcl and pl/sh

D) and then of course just having a sensible transforms for extension
types like the current patch provides.

> Worst case, that I really don't think we need, would be addressing that
> per-argument:
>
>   CREATE FUNCTION foo (hash hstore WITH TRANSFORM, kv hstore) …
>
> I certainly hope we don't need that, and sure can't imagine use cases
> for that level of complexity at the time of writing this review.
>
A typical use case would be to have a "short" hstore always passed in as
a dictionary and have another, possibly large, hstore passed in as a
toast pointer.

And if we want to have all type conversions between postgres and PLs
rewritten as transforms, then we do need named transforms, not just one
per (pl, type) pair.

Also, if we allow flexibility, then it is probably a good idea to
implement full flexibility first and then look at making usage easy
after that, instead of adding flexibility in small steps.

Once we have per-argument transforms in place, we can look at setting
per-schema defaults for ease of use.

As a large part of this is actually abstracting I/O conversions out of
PL function code, I think we should look at allowing the conversion
functions to be written in the target PL language in addition to C.

I'll see if I can resurrect my patch for support of "cstring" and
"internal" types in pl/python
function defs for this.

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ "is not a valid Win32 application"

2013-11-27 Thread Rajeev rastogi
On 27 November 2013, Naoya Anzai wrote:
> Hi, Rajeev
> 
> > > I tested the latest patch. My observation is:
> > >   If we give relative data directory path while registering the
> > > service, then service start fails.
> > >   But same works if the data directory is absolute path.
> > >
> > >   Looks like an existing issue. May be we need to internally
> convert
> > > relative data path to absolute.
> >
> > Since the mentioned issue is an existing issue and not because of
> this patch.
> > So can we take that as separate defect and fix. If so, then I can
> move
> > this patch to "ready for committer".
> 
> I think so too.
> When booting via the Service, the CurrentDirectory seems to be
> "C:/Windows/system32". So, you have to set a relative data directory
> path whose starting point is "C:/Windows/system32".
> 

OK. Then I am moving it to "ready for committer".




Re: [HACKERS] Get more from indices.

2013-11-27 Thread Etsuro Fujita
Kyotaro HORIGUCHI wrote:
> > * In get_relation_info(), the patch determines the branch condition
> > "keyattno != ObjectIdAttributeNumber".  I fail to understand the
> > meaning of this branch condition.  Could you explain about it?

> Literally answering, it means oid cannot be NULL (if it exists).

Understood.  Thank you for the detailed information.  But I'm not sure it's
worth complicating the code.  What use cases are you thinking of?

Thanks,

Best regards,
Etsuro Fujita





Re: [HACKERS] Status of FDW pushdowns

2013-11-27 Thread Shigeru Hanada
Hi Merlin,

2013/11/22 Merlin Moncure :
> On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada
>  wrote:
>> 2013/11/22 Tom Lane :
>>> Merlin Moncure  writes:
 On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian  wrote:
> I know join pushdowns seem insignificant, but it helps to restrict what
> data must be passed back because you would only pass back joined rows.
>>>
 By 'insignificant' you mean 'necessary to do any non-trivial real
 work'.   Personally, I'd prefer it if FDW was extended to allow
 arbitrary parameterized queries like every other database connectivity
 API ever made ever.
>>>
>>> [ shrug... ]  So use dblink.  For better or worse, the FDW stuff is
>>> following the SQL standard's SQL/MED design, which does not do it
>>> like that.
>>
>> Pass-through mode mentioned in SQL/MED standard might be what he wants.
>
> happen to have a link handy?

SQL/MED standard doesn't say much about PASS THROUGH mode, especially
about interaction between client.  Besides it, I think it would be
nice to allow arbitrary FDW as backend of dblink interface like this:

postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote
syntax') as t(/* record type definition */...);

This provides a way to execute query without defining foreign table.
-- 
Shigeru HANADA




Re: [HACKERS] Assertions in PL/PgSQL

2013-11-27 Thread Pavel Stehule
2013/11/27 Peter Eisentraut 

> On Tue, 2013-11-19 at 10:40 -0500, Robert Haas wrote:
> > I think the goal was to get to RAISE ASSERT
> > WHEN ...; then, if assertions are off, you do nothing; if they're on,
> > you error.  IF condition THEN RAISE..." isn't a suitable surrogate in
> > that case because you incur the overhead of testing the condition
> > regardless.
>
> So if I do RAISE ASSERT WHEN condition and assertions are off, then
> condition wouldn't even be evaluated?  But what about RAISE NOTICE WHEN,
> when log_min_messages is error?  What about the side effects of the
> format string?  This is all just getting too weird.
>
> I don't see anything wrong with considering a separate ASSERT command
> with its own semantics, like in many other programming languages.
>
>
> My objection against the ASSERT command was just one - it was too simple
(compared to the cost of a possible collision from introducing a new, widely
used keyword).

I can live with an ASSERT statement - but I expect, at minimum, a
possibility to specify a level (failure, tracing, ...) and a message
related to the assert. An assert with only an expression is not enough.

Regards

Pavel


[HACKERS] Name type in postgres

2013-11-27 Thread mohsen soodkhah mohammadi
Hello,
I want to understand: can we have a Name without a null terminator?
In src/backend/utils/adt/name.c, the nameout() function does:
Name s = PG_GETARG_NAME(0);
PG_RETURN_CSTRING(pstrdup(NameStr(*s)));
What does the pstrdup() function do?
Does it create a string from the name with a null terminator?


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-11-27 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 1:20 AM, Andres Freund  wrote:
> Even the "deadlock detected" errors might be a fkey-locking issue. Bug
> #8434, but that's really hard to know without more details.

Thanks, I was aware of that but didn't make the connection.

I've written a test-case that is designed to exercise one case that
deadlocks like crazy - deadlocking is the expected, correct behavior.
The deadlock errors are not in themselves suspicious. Actually, if
anything I find it suspicious that there aren't more deadlocks.


-- 
Peter Geoghegan




Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ "is not a valid Win32 application"

2013-11-27 Thread Naoya Anzai
Hi, Rajeev

> > I tested the latest patch. My observation is:
> > If we give relative data directory path while registering the
> > service, then service start fails.
> > But same works if the data directory is absolute path.
> > 
> > Looks like an existing issue. May be we need to internally
> > convert relative data path to absolute.
> 
> Since the mentioned issue is an existing issue and not because of this patch.
> So can we take that as separate defect and fix. If so, then I can 
> move this patch to "ready for committer".

I think so too. 
When booting via the Service, the CurrentDirectory seems to be
"C:/Windows/system32". So, you have to set a relative data directory
path whose starting point is "C:/Windows/system32".

> 
> Thanks and Regards,
> Kumar Rajeev Rastogi
> 
> 

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Soft, Ltd.
E-Mail: anzai-na...@mxu.nes.nec.co.jp
---




  1   2   >