Re: [HACKERS] PoC: Partial sort

2013-12-31 Thread David Rowley
On Tue, Dec 31, 2013 at 2:41 PM, Andreas Karlsson andr...@proxel.se wrote:

 On 12/29/2013 08:24 AM, David Rowley wrote:

 If it was possible to devise some way to reuse any
 previous tuplesortstate perhaps just inventing a reset method which
 clears out tuples, then we could see performance exceed the standard
 seqscan -> sort. The code the way it is seems to look up the sort
 functions from the syscache for each group then allocate some sort
 space, so quite a bit of time is also spent in palloc0() and pfree()

 If it was not possible to do this then maybe adding a cost to the number
 of sort groups would be better so that the optimization is skipped if
 there are too many sort groups.


 It should be possible. I have hacked a quick proof of concept for reusing
 the tuplesort state. Can you try it and see if the performance regression
 is fixed by this?

 One thing which has to be fixed with my patch is that we probably want to
 close the tuplesort once we have returned the last tuple from ExecSort().

 I have attached my patch and the incremental patch on Alexander's patch.
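
In outline, the reuse being prototyped looks like this (a sketch only;
tuplesort_reset() is the hypothetical new entry point, everything else is
the existing tuplesort API):

    /* once, when the sort node starts up */
    tuplesortstate = tuplesort_begin_heap(tupDesc, numCols, sortColIdx,
                                          sortOperators, collations,
                                          nullsFirst, work_mem, false);

    /* for each sort group: throw away the stored tuples but keep the
     * comparators and sort memory already set up, avoiding the per-group
     * syscache lookups and palloc0()/pfree() churn */
    tuplesort_reset(tuplesortstate);    /* hypothetical */
    /* ... feed the group's tuples, tuplesort_performsort(), emit ... */

    /* once, at node shutdown */
    tuplesort_end(tuplesortstate);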


Thanks, the attached is about 5 times faster than it was previously with my
test case upthread.

The times now look like:

No pre-sortable index:
Total runtime: 86.278 ms

With pre-sortable index with partial sorting:
Total runtime: 47.500 ms

With the query where there is no index the sort remained in memory.

I spent some time trying to find a case where the partial sort is slower
than the seqscan -> sort. The only places partial sort seems slower are
when the number of estimated sort groups is around the crossover point
where the planner would be starting to think about performing a seqscan ->
sort instead. I'm thinking right now that it's not worth raising the costs
around this as the partial sort is less likely to become a disk sort than
the full sort is.

I'll keep going with trying to break it.

Regards

David Rowley



 --
 Andreas Karlsson



Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2013-12-31 Thread Simon Riggs
On 30 December 2013 19:52, Christian Kruse christ...@2ndquadrant.com wrote:

 I created two patches..

Patches are related but separate, so should be tracked on separate
threads. Please add them to the CF app also.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-31 Thread Heikki Linnakangas

On 12/31/2013 09:18 AM, Peter Geoghegan wrote:

On Sun, Dec 29, 2013 at 9:09 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

While mulling this over further, I had an idea about this: suppose we
marked the tuple in some fashion that indicates that it's a promise
tuple.  I imagine an infomask bit, although the concept makes me wince
a bit since we don't exactly have bit space coming out of our ears
there.  Leaving that aside for the moment, whenever somebody looks at
the tuple with a mind to calling XactLockTableWait(), they can see
that it's a promise tuple and decide to wait on some other heavyweight
lock instead.  The simplest thing might be for us to acquire a
heavyweight lock on the promise tuple before making index entries for
it, and then have callers always wait on that instead of
transitioning from the tuple lock to the xact lock.


Yeah, that seems like it should work. You might not even need an infomask
bit for that; just take the other heavyweight lock always before calling
XactLockTableWait(), whether it's a promise tuple or not. If it's not,
acquiring the extra lock is a waste of time but if you're going to sleep
anyway, the overhead of one extra lock acquisition hardly matters.


Are you suggesting that I lock the tuple only (say, through a special
LockPromiseTuple() call), or lock the tuple *and* call
XactLockTableWait() afterwards? You and Robert don't seem to be in
agreement about which here.


I meant the latter, ie. grab the new kind of lock first, then check if 
the tuple is still there, and then call XactLockTableWait() as usual.



The inserter has to acquire the heavyweight lock before releasing the buffer
lock, because otherwise another inserter (or deleter or updater) might see
the tuple, acquire the heavyweight lock, and fall to sleep on
XactLockTableWait(), before the inserter has grabbed the heavyweight lock.
If that race condition happens, you have the original problem again, ie. the
updater unnecessarily waits for the inserting transaction to finish, even
though it already killed the tuple it inserted.


Right. Can you suggest a workaround to the above problems?


Umm, I did, in the next paragraph ;-) :


That seems easy to avoid. If the heavyweight lock uses the
transaction id as the key, just like
XactLockTableInsert/XactLockTableWait, you can acquire it before
doing the insertion.


Let me elaborate on that. The idea is to have a new heavy-weight lock that's 
just like the transaction lock used by 
XactLockTableInsert/XactLockTableWait, but separate from that. Let's 
call it PromiseTupleInsertionLock. The insertion procedure in INSERT ... 
ON DUPLICATE looks like this:


1. PromiseTupleInsertionLockAcquire(my xid)
2. Insert heap tuple
3. Insert index tuples
4. Check if conflict happened. Kill the already-inserted tuple on conflict.
5. PromiseTupleInsertionLockRelease(my xid)

IOW, the only change to the current patch is that you acquire the new 
kind of lock before starting the insertion, and you release it after 
you've killed the tuple, or you know you're not going to kill it.
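
In code form, both sides would look roughly like this (a sketch only; the
PromiseTupleInsertionLock* calls are the proposed new lock, not anything
that exists today, and tuple_still_there stands in for the re-check):

    /* inserter */
    PromiseTupleInsertionLockAcquire(GetCurrentTransactionId());
    /* ... insert heap tuple, insert index tuples, check for conflict,
     * kill the already-inserted tuple on conflict ... */
    PromiseTupleInsertionLockRelease(GetCurrentTransactionId());

    /* waiter, instead of going straight to XactLockTableWait(xwait): */
    PromiseTupleInsertionLockAcquire(xwait);    /* returns once the inserter
                                                 * has kept or killed its
                                                 * promise tuple */
    PromiseTupleInsertionLockRelease(xwait);
    if (tuple_still_there)                      /* hypothetical re-check */
        XactLockTableWait(xwait);               /* normal wait for the whole
                                                 * transaction */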


- Heikki




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-31 Thread Peter Geoghegan
On Tue, Dec 31, 2013 at 12:52 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 1. PromiseTupleInsertionLockAcquire(my xid)
 2. Insert heap tuple
 3. Insert index tuples
 4. Check if conflict happened. Kill the already-inserted tuple on conflict.
 5. PromiseTupleInsertionLockRelease(my xid)

 IOW, the only change to the current patch is that you acquire the new kind
 of lock before starting the insertion, and you release it after you've
 killed the tuple, or you know you're not going to kill it.

Where does row locking fit in there? - you may need to retry when that
part is incorporated, of course. What if you have multiple promise
tuples from a contended attempt to insert a single slot, or multiple
broken promise tuples across multiple slots or even multiple commands
in the same xact?

-- 
Peter Geoghegan




[HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Christian Kruse
Hi there,

I created a patch improving the log_lock_wait messages by adding
relation infos (name and OID) as well as tuple infos (if present –
ctid and, if available, the tuple itself) in the context.

Sample output (log_lock_waits=on required):

session 1:
CREATE TABLE foo (val integer);
INSERT INTO foo (val) VALUES (1);
BEGIN;
UPDATE foo SET val = 3;

session 2:
BEGIN;
UPDATE foo SET val = 2;

Output w/o patch:

LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms

Output with patch:

LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
CONTEXT:  relation name: foo (OID 16385)
tuple (ctid (0,1)): (1)

I am not really sure where to put the functions. Currently they are
located in backend/storage/lmgr/lmgr.c because XactLockTableWait() is
located there, too. What do you think?

I also created a test spec for easy creation of the log output;
however, I was not able to provide an expected file since the process
IDs vary from test run to test run.

Regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index f8545c1..cfa49c2 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2702,9 +2702,11 @@ l1:
 
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
+			XactLockTableWaitSetupErrorContextCallback(relation, tp);
 			/* wait for multixact */
 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
 			NULL, infomask);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,10 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
+			XactLockTableWaitSetupErrorContextCallback(relation, tp);
+
 			XactLockTableWait(xwait);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3254,9 +3259,11 @@ l2:
 			TransactionId update_xact;
 			int			remain;
 
+			XactLockTableWaitSetupErrorContextCallback(relation, oldtup);
 			/* wait for multixact */
 			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
 			infomask);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3337,10 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
+XactLockTableWaitSetupErrorContextCallback(relation, oldtup);
+
 XactLockTableWait(xwait);
+XactLockTableWaitCleanupErrorContextCallback();
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4398,7 +4408,11 @@ l3:
 		RelationGetRelationName(relation))));
 }
 else
+{
+	XactLockTableWaitSetupErrorContextCallback(relation, tuple);
 	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	XactLockTableWaitCleanupErrorContextCallback();
+}
 
 /* if there are updates, follow the update chain */
 if (follow_updates &&
@@ -4453,7 +4467,11 @@ l3:
 		RelationGetRelationName(relation))));
 }
 else
+{
+	XactLockTableWaitSetupErrorContextCallback(relation, tuple);
 	XactLockTableWait(xwait);
+	XactLockTableWaitCleanupErrorContextCallback();
+}
 
 /* if there are updates, follow the update chain */
 if (follow_updates &&
@@ -5139,9 +5157,14 @@ l4:
 
 	if (needwait)
 	{
+		XactLockTableWaitSetupErrorContextCallback(rel, mytup);
+
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
 		XactLockTableWait(members[i].xid);
 		pfree(members);
+
+		XactLockTableWaitCleanupErrorContextCallback();
+
 		goto l4;
 	}
 	if (res != HeapTupleMayBeUpdated)
@@ -5199,8 +5222,13 @@ l4:
  needwait);
 if (needwait)
 {
+	XactLockTableWaitSetupErrorContextCallback(rel, mytup);
+
 	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
 	XactLockTableWait(rawxmax);
+
+	XactLockTableWaitCleanupErrorContextCallback();
+
 	goto l4;
 }
 if (res != HeapTupleMayBeUpdated)
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index a452fea..7f79615 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -165,7 +165,11 @@ top:
 		{
 			/* Have to wait for the other guy ... */
 			_bt_relbuf(rel, buf);
+
+			XactLockTableWaitSetupErrorContextCallback(rel, NULL);
 			XactLockTableWait(xwait);
+			XactLockTableWaitCleanupErrorContextCallback();
+
 			/* start over... */
 			_bt_freestack(stack);
 			goto top;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7ad9720..6e80e37 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2297,7 +2297,11 @@ IndexBuildHeapScan(Relation heapRelation,
 			 * Must drop the lock on the buffer before we wait
 			 */
 			LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
+
+			

Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2013-12-31 Thread Christian Kruse
Hi,

On 31/12/13 08:48, Simon Riggs wrote:
  I created two patches..
 
 Patches are related but separate, so should be tracked on separate
 threads.

[x] Done (in 20131231091244.gb25...@defunct.ch)

 Please add them to the CF app also.

[x] Done. I modified the existing commitfest entry
(https://commitfest.postgresql.org/action/patch_view?id=1350) and
the second, new one is located here:

https://commitfest.postgresql.org/action/patch_view?id=1351

Regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-31 Thread Peter Geoghegan
On Tue, Dec 31, 2013 at 12:52 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Are you suggesting that I lock the tuple only (say, through a special
 LockPromiseTuple() call), or lock the tuple *and* call
 XactLockTableWait() afterwards? You and Robert don't seem to be in
 agreement about which here.

 I meant the latter, ie. grab the new kind of lock first, then check if the
 tuple is still there, and then call XactLockTableWait() as usual.

I don't follow this either. Through what exact mechanism does the
waiter know that there was a wait on the
PromiseTupleInsertionLockAcquire() call, and so it should not wait on
XactLockTableWait()? Does whatever mechanism you have in mind not have
race conditions?


-- 
Peter Geoghegan




[HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Pavel Stehule
Hello

I am looking on this patch

ALTER TABLE foo SET (ext.somext.do_replicate=true);

Why is there a fixed prefix "ext"?

This feature is similar to attaching setting to function

CREATE OR REPLACE FUNCTION ... SET var = ...;

We can use someprefix.someguc without problems there.

Regards

Pavel


Re: [HACKERS] Proposal: variant of regclass

2013-12-31 Thread Pavel Stehule
2013/12/31 Tatsuo Ishii is...@postgresql.org

  On 12/31/2013 02:38 AM, Tatsuo Ishii wrote:
  Before proceeding the work, I would like to make sure that followings
  are complete list of new functions. Inside parentheses are
  corresponding original functions.
 
  toregproc (regproc)
  toregoper (regoper)
  toregclass (regclass)
  toregtype (regtype)
 
  Pardon the bikeshedding, but those are hard to read for me.  I would
  prefer to go with the to_timestamp() model and add an underscore to
  those names.

 I have no problem with adding "to_". Objection?


I like to_reg* too

Regards

Pavel



 Best regards,
 --
 Tatsuo Ishii
 SRA OSS, Inc. Japan
 English: http://www.sraoss.co.jp/index_en.php
 Japanese: http://www.sraoss.co.jp





Re: [HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Fabrízio de Royes Mello
On Tue, Dec 31, 2013 at 9:08 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hello

 I am looking on this patch

 ALTER TABLE foo SET (ext.somext.do_replicate=true);

 Why is there fixed prefix ext ?

 This feature is similar to attaching setting to function

 CREATE OR REPLACE FUNCTION ... SET var = ...;

 We can use someprefix.someguc without problems there.


Hi,

We use the prefix ext (aka namespace) to distinguish these options which
are related to extensions.

Have you seen the previous thread [1] ?

Regards,

[1]
http://www.postgresql.org/message-id/CAFcNs+rqCq1H5eXW-cvdti6T-xo2STEkhjREx=odmakk5ti...@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Pavel Stehule
2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com


 On Tue, Dec 31, 2013 at 9:08 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
 
  Hello
 
  I am looking on this patch
 
  ALTER TABLE foo SET (ext.somext.do_replicate=true);
 
  Why is there fixed prefix ext ?
 
  This feature is similar to attaching setting to function
 
  CREATE OR REPLACE FUNCTION ... SET var = ...;
 
  We can use someprefix.someguc without problems there.
 

 Hi,

 We use the prefix ext (aka namespace) to distinguish these options which
 are related to extensions.

 Have you seen the previous thread [1] ?


yes, but I don't understand why it is necessary. I use an analogy with
custom GUCs - and there we don't use a similar prefix. All that is
required is some prefix - any prefix - and it can contain a dot.

Regards

Pavel



 Regards,

 [1]
 http://www.postgresql.org/message-id/CAFcNs+rqCq1H5eXW-cvdti6T-xo2STEkhjREx=odmakk5ti...@mail.gmail.com

 --
 Fabrízio de Royes Mello
 Consultoria/Coaching PostgreSQL
  Timbira: http://www.timbira.com.br
  Blog sobre TI: http://fabriziomello.blogspot.com
  Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
  Twitter: http://twitter.com/fabriziomello



Re: [HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Fabrízio de Royes Mello
On Tue, Dec 31, 2013 at 9:38 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:


 2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com


 On Tue, Dec 31, 2013 at 9:08 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:
 
  Hello
 
  I am looking on this patch
 
  ALTER TABLE foo SET (ext.somext.do_replicate=true);
 
  Why is there fixed prefix ext ?
 
  This feature is similar to attaching setting to function
 
  CREATE OR REPLACE FUNCTION ... SET var = ...;
 
  We can use someprefix.someguc without problems there.
 

 Hi,

 We use the prefix ext (aka namespace) to distinguish these options
which are related to extensions.

 Have you seen the previous thread [1] ?


 yes, but I don't understand why it is necessary? I use a analogy with
custom GUC - and there we don't use similar prefix. Only any prefix is
required - and it can contain a dot.


We use the namespace ext so that the internal code
(src/backend/access/common/reloptions.c) skips some validations and stores
the custom GUC.

Do you think we don't need to use the ext namespace?


Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Pavel Stehule
2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com


 On Tue, Dec 31, 2013 at 9:38 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
 
 
  2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com
 
 
  On Tue, Dec 31, 2013 at 9:08 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  
   Hello
  
   I am looking on this patch
  
   ALTER TABLE foo SET (ext.somext.do_replicate=true);
  
   Why is there fixed prefix ext ?
  
   This feature is similar to attaching setting to function
  
   CREATE OR REPLACE FUNCTION ... SET var = ...;
  
   We can use someprefix.someguc without problems there.
  
 
  Hi,
 
  We use the prefix ext (aka namespace) to distinguish these options
 which are related to extensions.
 
  Have you seen the previous thread [1] ?
 
 
  yes, but I don't understand why it is necessary? I use a analogy with
 custom GUC - and there we don't use similar prefix. Only any prefix is
 required - and it can contain a dot.
 

  We use the namespace ext to the internal code
 (src/backend/access/common/reloptions.c) skip some validations and store
 the custom GUC.

 Do you think we don't need to use the ext namespace?


yes - there should be the same mechanism as we use for GUCs

Pavel




 Regards,


 --
 Fabrízio de Royes Mello
 Consultoria/Coaching PostgreSQL
  Timbira: http://www.timbira.com.br
  Blog sobre TI: http://fabriziomello.blogspot.com
  Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
  Twitter: http://twitter.com/fabriziomello



Re: [HACKERS] PoC: Partial sort

2013-12-31 Thread Alvaro Herrera
David Rowley wrote:

 I was about to test it tonight, but I'm having trouble getting the patch to
 compile... I'm really wondering which compiler you are using as it seems
 you're declaring your variables in some strange places. See nodeSort.c
 line 101. These variables are declared after there has been an if statement
 in the same scope. That's not valid in C. (The patch did however apply
 without any complaints).

AFAIR C99 allows mixed declarations and code.  Visual Studio only
implements C89 though, which is why it fails to compile there.
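
For anyone who wants to check their compiler, here is a minimal example of
the construct in question (not from the patch, just an illustration):

    #include <stdio.h>

    int
    main(void)
    {
        int     a = 1;

        printf("%d\n", a);      /* a statement ... */
        int     b = a + 1;      /* ... then a declaration: accepted in C99
                                 * and later, rejected by a C89 compiler
                                 * such as Visual Studio's */
        printf("%d\n", b);
        return 0;
    }

Moving the declaration of b up to the top of the block makes it compile
either way, which is what the project style asks for anyway.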

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Fabrízio de Royes Mello
On Tue, Dec 31, 2013 at 10:37 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com

 On Tue, Dec 31, 2013 at 9:38 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:
 
  2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com
 
  On Tue, Dec 31, 2013 at 9:08 AM, Pavel Stehule 
pavel.steh...@gmail.com wrote:
  
   Hello
  
   I am looking on this patch
  
   ALTER TABLE foo SET (ext.somext.do_replicate=true);
  
   Why is there fixed prefix ext ?
  
   This feature is similar to attaching setting to function
  
   CREATE OR REPLACE FUNCTION ... SET var = ...;
  
   We can use someprefix.someguc without problems there.
  
 
  Hi,
 
  We use the prefix ext (aka namespace) to distinguish these options
which are related to extensions.
 
  Have you seen the previous thread [1] ?
 
 
  yes, but I don't understand why it is necessary? I use a analogy with
custom GUC - and there we don't use similar prefix. Only any prefix is
required - and it can contain a dot.
 

 We use the namespace ext to the internal code
(src/backend/access/common/reloptions.c) skip some validations and store
the custom GUC.

 Do you think we don't need to use the ext namespace?


 yes - there be same mechanism as we use for GUC


If we go that way then we can expand the use of this patch to store
custom GUCs for functions also, and we can write a function (like
current_setting) to get specific GUC values, like:

ALTER TABLE foo SET (myextension.option=on);

SELECT current_setting('foo'::regclass, 'myextension.option');

Comments?

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Simon Riggs
On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com wrote:

 Output w/o patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms

 Output with patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

That is useful info.

I think the message should be changed to say this only, without a context line

LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
tuple (0,1) after 11688.720 ms

My reason is that pid 24774 was waiting for a *tuple lock* and it was
eventually granted, so that's what it should say. ALL locks wait for
the current transaction that holds the lock to complete, but only
tuple locks currently refer to a lock wait as a wait for a
transaction. That is confusing for users, as well as being
inconsistent with other existing messages.

Replacing the old text with the new info would represent a net
improvement in usefulness and understandability.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Store Extension Options

2013-12-31 Thread Pavel Stehule
2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com


 On Tue, Dec 31, 2013 at 10:37 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
 
  2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com
 
  On Tue, Dec 31, 2013 at 9:38 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  
   2013/12/31 Fabrízio de Royes Mello fabriziome...@gmail.com
  
   On Tue, Dec 31, 2013 at 9:08 AM, Pavel Stehule 
 pavel.steh...@gmail.com wrote:
   
Hello
   
I am looking on this patch
   
ALTER TABLE foo SET (ext.somext.do_replicate=true);
   
Why is there fixed prefix ext ?
   
This feature is similar to attaching setting to function
   
CREATE OR REPLACE FUNCTION ... SET var = ...;
   
We can use someprefix.someguc without problems there.
   
  
   Hi,
  
   We use the prefix ext (aka namespace) to distinguish these options
 which are related to extensions.
  
   Have you seen the previous thread [1] ?
  
  
   yes, but I don't understand why it is necessary? I use a analogy with
 custom GUC - and there we don't use similar prefix. Only any prefix is
 required - and it can contain a dot.
  
 
  We use the namespace ext to the internal code
 (src/backend/access/common/reloptions.c) skip some validations and store
 the custom GUC.
 
  Do you think we don't need to use the ext namespace?
 
 
  yes - there be same mechanism as we use for GUC
 

 If we going to that way then we can expand the use of this patch to store
 custom GUCs to functions also, and we can wrote a function (like
 current_setting) to get specific GUC values, like:

 ALTER TABLE foo SET (myextension.option=on);

 SELECT current_setting('foo'::regclass, 'myextension.option');


I like it

Pavel



 Comments?

 --
 Fabrízio de Royes Mello
 Consultoria/Coaching PostgreSQL
  Timbira: http://www.timbira.com.br
  Blog sobre TI: http://fabriziomello.blogspot.com
  Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
  Twitter: http://twitter.com/fabriziomello



Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com wrote:
 Output with patch:
 
 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

 That is useful info.

 I think the message should be changed to say this only, without a context line

 LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
 tuple (0,1) after 11688.720 ms

 My reason is that pid 24774 was waiting for a *tuple lock* and it was
 eventually granted, so thats what it should say.

No, that wasn't what it was waiting for, and lying to the user like that
isn't going to make things better.

Christian's idea of a context line seems plausible to me.  I don't
care for this implementation too much --- a global variable?  Ick.
Make a wrapper function for XactLockTableWait instead, please.
And I'm not real sure that showing the whole tuple contents is a good
thing; I can see people calling that a security leak, not to mention
that the performance consequences could be dire.
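
For concreteness, the wrapper might look something like this (a sketch;
XactLockTableWaitWithInfo, XLTWInfo, and xltw_error_callback are invented
names, while ErrorContextCallback and error_context_stack are the backend's
existing error-context mechanism):

    typedef struct XLTWInfo
    {
        Relation    rel;
        HeapTuple   tuple;
    } XLTWInfo;

    void
    XactLockTableWaitWithInfo(Relation rel, HeapTuple tuple, TransactionId xid)
    {
        ErrorContextCallback cb;
        XLTWInfo    info;

        info.rel = rel;
        info.tuple = tuple;

        /* push a context callback for the duration of the wait */
        cb.callback = xltw_error_callback;  /* prints relation/tuple info */
        cb.arg = &info;
        cb.previous = error_context_stack;
        error_context_stack = &cb;

        XactLockTableWait(xid);

        error_context_stack = cb.previous;
    }

No global variable is needed: the info lives on the stack for exactly as
long as the wait.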

regards, tom lane




Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Simon Riggs
On 31 December 2013 16:36, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com wrote:
 Output with patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

 That is useful info.

 I think the message should be changed to say this only, without a context 
 line

 LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
 tuple (0,1) after 11688.720 ms

 My reason is that pid 24774 was waiting for a *tuple lock* and it was
 eventually granted, so thats what it should say.

 No, that wasn't what it was waiting for, and lying to the user like that
 isn't going to make things better.

"Like that" is ambiguous, and I don't understand you or what you are
objecting to.

When we run SELECT ... FOR SHARE we are attempting to lock rows. Why
is the transaction we are waiting for important when we wait to lock
rows, but when we wait to lock relations it isn't?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] fix_PGSTAT_NUM_TABENTRIES_macro patch

2013-12-31 Thread Tom Lane
Mark Dilger markdil...@yahoo.com writes:
 In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro
 attempts to subtract off the size of the PgStat_MsgTabstat
 struct up to the m_entry[] field.  This macro was correct up
 until the fields m_block_read_time and m_block_write_time
 were added to that struct, as the macro was not changed to
 include their size.

Yeah, that's a bug.

 This patch fixes the macro.

I'm inclined to think that we should look for a less breakable way to
manage the message size limit; if Robert missed this issue in that patch,
other people are going to miss it in future patches.  The existing coding
isn't really right anyway when you consider possible alignment padding.
(I think the present struct definitions probably don't involve any
padding, but that's not a very future-proof assumption either.)  It'd be
better to somehow drive the calculation from offsetof(m_entry).  I'm not
seeing any way to do that that doesn't require some notational changes
in the code, though.  One way would be to rejigger it like this:

#define PGSTAT_MAX_MSG_SIZE 1000

typedef union PgStat_MsgTabstat
{
struct {
PgStat_MsgHdr hdr;
Oid   databaseid;
int   nentries;
int   xact_commit;
int   xact_rollback;
PgStat_Counter block_read_time;/* times in microseconds */
PgStat_Counter block_write_time;
PgStat_TableEntry entry[FLEXIBLE_ARRAY_MEMBER];
} m;
char padding[PGSTAT_MAX_MSG_SIZE];/* pad sizeof this union to target */
} PgStat_MsgTabstat;

#define PGSTAT_NUM_TABENTRIES ((PGSTAT_MAX_MSG_SIZE - \
	offsetof(PgStat_MsgTabstat, m.entry)) / sizeof(PgStat_TableEntry))

but then we'd have to run around and replace m_hdr with m.hdr etc
in the code referencing the message types that use this mechanism.
Kind of annoying.
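
To see the arithmetic work, here is a self-contained toy version of that
pattern (mock structs, not the real pgstat definitions):

    #include <stdio.h>
    #include <stddef.h>

    #define MOCK_MAX_MSG_SIZE 1000

    typedef struct { int msg_type; int msg_size; } MockHdr;
    typedef struct { unsigned id; long counters[12]; } MockTabEntry;

    typedef union MockMsgTabstat
    {
        struct {
            MockHdr     hdr;
            unsigned    databaseid;
            int         nentries;
            MockTabEntry entry[1];      /* variable-length in reality */
        } m;
        char    padding[MOCK_MAX_MSG_SIZE]; /* pad sizeof to the target */
    } MockMsgTabstat;

    /* capacity falls out of offsetof: add fields to m and it self-adjusts */
    #define MOCK_NUM_TABENTRIES \
        ((MOCK_MAX_MSG_SIZE - offsetof(MockMsgTabstat, m.entry)) / \
         sizeof(MockTabEntry))

    int
    main(void)
    {
        printf("%d entries fit in %d bytes\n",
               (int) MOCK_NUM_TABENTRIES, MOCK_MAX_MSG_SIZE);
        return 0;
    }

Adding a field to the inner struct moves offsetof and the entry count
adjusts itself, so there is no second macro left to forget.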

 As an aside, the PGSTAT_MSG_PAYLOAD define from which
 these field sizes are being subtracted is a bit of a WAG, if
 I am reading the code correctly, in which case whether the
 two additional fields are subtracted from that WAG is perhaps
 not critical.  But the code is certainly easier to understand if
 the macro matches the definition of the struct.

Yeah, it seems unlikely that exceeding the 1000-byte target, even by quite
a lot, would cause any performance problem on most platforms --- a quick
check says that the MTU on the loopback interface is near 16K on the
machines I have handy.  So that dampens my enthusiasm for making invasive
code changes to ensure we meet the target exactly.  Still, the next
oversight of this sort might introduce a larger error.

Does anyone have any other ideas about making this coding more robust?

regards, tom lane




Re: [HACKERS] fix_PGSTAT_NUM_TABENTRIES_macro patch

2013-12-31 Thread Mark Dilger


I don't care for the places in the code that say things like

    foo - sizeof(int)

where int refers implicitly to a specific variable or struct field, but
you have to remember that and change it by hand if you change the
type of the variable or struct.

But this sort of code is quite common in postgres, and not
at all unique to src/include/pgstat.h.  I did not try to tackle
that project-wide, as it would turn a one-line patch (which has
a good chance of being accepted) into a huge patch that
would likely be rejected.

On the other hand, if the community feels this kind of code
cleanup is needed, I might jump in.


Mark Dilger
 

On Tuesday, December 31, 2013 9:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
Mark Dilger markdil...@yahoo.com writes:
 In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro
 attempts to subtract off the size of the PgStat_MsgTabstat
 struct up to the m_entry[] field.  This macro was correct up
 until the fields m_block_read_time and m_block_write_time
 were added to that struct, as the macro was not changed to
 include their size.

Yeah, that's a bug.

 This patch fixes the macro.

I'm inclined to think that we should look for a less breakable way to
manage the message size limit; if Robert missed this issue in that patch,
other people are going to miss it in future patches.  The existing coding
isn't really right anyway when you consider possible alignment padding.
(I think the present struct definitions probably don't involve any
padding, but that's not a very future-proof assumption either.)  It'd be
better to somehow drive the calculation from offsetof(m_entry).  I'm not
seeing any way to do that that doesn't require some notational changes
in the code, though.  One way would be to rejigger it like this:

#define PGSTAT_MAX_MSG_SIZE 1000

typedef union PgStat_MsgTabstat
{
    struct {
        PgStat_MsgHdr hdr;
        Oid           databaseid;
        int           nentries;
        int           xact_commit;
        int           xact_rollback;
        PgStat_Counter block_read_time;    /* times in microseconds */
        PgStat_Counter block_write_time;
        PgStat_TableEntry entry[FLEXIBLE_ARRAY_MEMBER];
    } m;
    char padding[PGSTAT_MAX_MSG_SIZE];    /* pad sizeof this union to target */
} PgStat_MsgTabstat;

#define PGSTAT_NUM_TABENTRIES ((PGSTAT_MAX_MSG_SIZE - \
	offsetof(PgStat_MsgTabstat, m.entry)) / sizeof(PgStat_TableEntry))

but then we'd have to run around and replace m_hdr with m.hdr etc
in the code referencing the message types that use this mechanism.
Kind of annoying.

 As an aside, the PGSTAT_MSG_PAYLOAD define from which
 these field sizes are being subtracted is a bit of a WAG, if
 I am reading the code correctly, in which case whether the
 two additional fields are subtracted from that WAG is perhaps
 not critical.  But the code is certainly easier to understand if
 the macro matches the definition of the struct.

Yeah, it seems unlikely that exceeding the 1000-byte target, even by quite
a lot, would cause any performance problem on most platforms --- a quick
check says that the MTU on the loopback interface is near 16K on the
machines I have handy.  So that dampens my enthusiasm for making invasive
code changes to ensure we meet the target exactly.  Still, the next
oversight of this sort might introduce a larger error.

Does anyone have any other ideas about making this coding more robust?

            regards, tom lane




Re: [HACKERS] fix_PGSTAT_NUM_TABENTRIES_macro patch

2013-12-31 Thread Tom Lane
Mark Dilger markdil...@yahoo.com writes:
 I don't care for the places in the code that say things like

     foo - sizeof(int)

 where int refers implicitly to a specific variable or struct field, but
 you have to remember that and change it by hand if you change the
 type of the variable or struct.

 But this sort of code is quite common in postgres, and not
 at all unique to src/include/pgstat.h.

Really?  I think we're using offsetof for this type of thing in most
places.

regards, tom lane




[HACKERS] proposal: persistent plpgsql plugin info - field plugin_info for plpgsql_function structure

2013-12-31 Thread Pavel Stehule
Hello

I am working on plpgsql_check and I would like to write a protection against
repeated checks - so I need to mark a compiled (cached) function. Now, a
plpgsql extension can store its own data in the exec state, and I would like
something similar for plpgsql_function. I believe it can be useful for any
plpgsql extension that collects data per signature (where recreating the
function means a refresh is necessary).
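
Concretely, the proposal amounts to one reserved pointer on the
compiled-function struct, mirroring the one PLpgSQL_execstate already has
(a sketch; placement within the struct is immaterial):

    typedef struct PLpgSQL_function
    {
        /* ... existing fields ... */

        void       *plugin_info;    /* reserved for use by optional plugin;
                                     * unlike the execstate field, this one
                                     * lives as long as the compiled (cached)
                                     * function does */
    } PLpgSQL_function;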

Ideas, objections?

Regards

Pavel


Re: [HACKERS] proposal: persistent plpgsql plugin info - field plugin_info for plpgsql_function structure

2013-12-31 Thread Alvaro Herrera
Pavel Stehule wrote:
 Hello
 
 I am working on plpgsql_check and I would to write a protection against
 repeated check - so I need to mark a compiled (cached) function. Now,
 plpgsql extension can store own data to exec state, and I would to some
 similar for plpgsql_function. I believe so it can be useful for any plpgsql
 extension that collects data per signature (and recreating) means so
 refresh is necessary.

I'm not sure I understand this.  Do you want to avoid running the
checker if a previous run was seen as successful and function has not
changed?  Suppose the function depends on a table.  I invoke the check
(it succeeds), then drop the table, then invoke the check again.  What
should happen?  Conversely, if I invoke the check and it fails, then I
create the table and invoke the check again, what should happen?  How
does this idea of yours know when to invalidate the cached result of the
check when unrelated objects, which the function depends on, are
dropped/created/modified?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] fix_PGSTAT_NUM_TABENTRIES_macro patch

2013-12-31 Thread Mark Dilger
A quick grep through the code reveals lots of examples,
so I'll just paste the first ones I notice.  There are 
references to sizeof(Oid), sizeof(uint32), sizeof(bool),
and sizeof(uint8) that clearly refer to fields in structs that
the macros refer to implicitly, but there is no way for the
compiler to detect if you change the struct but not the
macro.

I see these as similar to what I was talking about in
src/include/pgstat.h.


Mark Dilger


src/include/pg_attribute.h:
--

#define ATTRIBUTE_FIXED_PART_SIZE \
    (offsetof(FormData_pg_attribute,attcollation) + sizeof(Oid))


src/include/access/tuptoaster.h:
-

#define MaximumBytesPerTuple(tuplesPerPage) \
    MAXALIGN_DOWN((BLCKSZ - \
                   MAXALIGN(SizeOfPageHeaderData + (tuplesPerPage) * sizeof(ItemIdData))) \
                  / (tuplesPerPage))

#define TOAST_MAX_CHUNK_SIZE    \
    (EXTERN_TUPLE_MAX_SIZE -    \
 MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)) -  \
 sizeof(Oid) -  \
 sizeof(int32) -    \
 VARHDRSZ)

src/include/access/htup.h:
--

#define HeapTupleHeaderGetOid(tup) \
( \
    ((tup)->t_infomask & HEAP_HASOID) ? \
    *((Oid *) ((char *)(tup) + (tup)->t_hoff - sizeof(Oid))) \
    : \
    InvalidOid \
)

#define HeapTupleHeaderSetOid(tup, oid) \
do { \
    Assert((tup)->t_infomask & HEAP_HASOID); \
    *((Oid *) ((char *)(tup) + (tup)->t_hoff - sizeof(Oid))) = (oid); \
} while (0)

#define MINIMAL_TUPLE_OFFSET \
    ((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) / \
     MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_PADDING \
    ((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) % \
     MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_DATA_OFFSET \
    offsetof(MinimalTupleData, t_infomask2)

#define SizeOfHeapDelete    (offsetof(xl_heap_delete, all_visible_cleared) + \
                             sizeof(bool))

#define SizeOfHeapHeader    (offsetof(xl_heap_header, t_hoff) + sizeof(uint8))








On Tuesday, December 31, 2013 10:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
Mark Dilger markdil...@yahoo.com writes:
 I don't care for the places in the code that say things like

     foo - sizeof(int)

 where int refers implicitly to a specific variable or struct field, but
 you have to remember that and change it by hand if you change the
 type of the variable or struct.

 But this sort of code is quite common in postgres, and not
 at all unique to src/include/pgstat.h.

Really?  I think we're using offsetof for this type of thing in most

places.

            regards, tom lane



Re: [HACKERS] proposal: persistent plpgsql plugin info - field plugin_info for plpgsql_function structure

2013-12-31 Thread Pavel Stehule
2013/12/31 Alvaro Herrera alvhe...@2ndquadrant.com

 Pavel Stehule wrote:
  Hello
 
  I am working on plpgsql_check and I would to write a protection against
  repeated check - so I need to mark a compiled (cached) function. Now,
  plpgsql extension can store own data to exec state, and I would to some
  similar for plpgsql_function. I believe so it can be useful for any
 plpgsql
  extension that collects data per signature (and recreating) means so
  refresh is necessary.

 I'm not sure I understand this.  Do you want to avoid running the
 checker if a previous run was seen as successful and function has not
 changed?  Suppose the function depends on a table.  I invoke the check
 (it succeeds), then drop the table, then invoke the check again.  What
 should happen?  Conversely, if I invoke the check and it fails, then I
 create the table and invoke the check again, what should happen?  How
 does this idea of yours know when to invalidate the cached result of the
 check when unrelated objects, which the function depends on, are
 dropped/created/modified?


plpgsql_check is designed for an interactive (active) mode and a passive
mode. In interactive mode, enhanced checking is started by explicit request -
by explicitly calling the plpgsql_check function - this feature is taken from
patches that I sent to the mailing list. In this mode a check is always
executed (when checking is enabled) - and it is called repeatedly (whenever
the user requests it).

Passive mode is taken from the plpgsql_lint extension. It is a plpgsql
extension based on the begin_func callback. plpgsql_lint doesn't support a
fatal_errors option - every detected error is fatal and stops execution.
plpgsql_check allows fatal_errors = false (plpgsql_check raises warnings
only), and I am searching for a way to minimize repeated warning messages.
That is one motivation. A second motivation is increasing the speed of
regression tests by removing repeated checks. A good idea is a function that
removes the mark from a compiled function - so anybody can recheck without
leaving the session.

The requested feature doesn't help me implement this concept 100%, but it
helps with checking whether I have already worked with some instance of a
function or not. Inside core the implementation is cheap. Outside core it is
magic with a hash and checking the transaction id (about 200 lines). When I
worked on an extension for coverage calculation I had to solve the same
task, so I think this variable can be useful generally for similar tasks.


[HACKERS] proposal: multiple read-write masters in a cluster with wal-streaming synchronization

2013-12-31 Thread Mark Dilger
This is not entirely pie in the sky, but feel free to tell me why this is
crazy.  I have had this idea for several years, but have not seen anyone
else suggest it, nor any arguments why it would not work.

If we had 64-bit Oids, we could reserve the top 16 bits (for instance) to
indicate a server ID.  Each server in a cluster could be configured with a
unique serverID in [0..65535].  Every table, index, etc that is created on a
server could be assigned an Oid congruent to this serverID modulus 65536.
Each server would still have a total of 2^48 Oids that it could assign
before reaching Oid exhaustion, which is 64k times more Oids than the
current design.  (The 16bit / 48bit split is arbitrary, and could be
8bit / 56bit or whatever.)


Any INSERT, UPDATE, DELETE, INDEX, REINDEX, DROP that was run on a server
could be rejected if the object has the serverId of a different server in
the cluster.  The serverId could be kept at all times in memory, and Oids
for database objects are always looked up when handling these SQL operations
anyway, so there should be effectively zero overhead for rejecting invalid
operations that attempt to change on server X a table/index/whatever which
belongs to server Y.  I don't see how to run such a setup in serializable
mode, but with read committed it should be ok for server X to change its own
tables based on what it perceives to be the current state of data on server
X and Y (reading local copies of tables from Y) at the same time that server
Y changes its own tables based on what it perceives to be the current state
of data on server Y and X (reading local copies of tables from X).  The
inconsistencies possible from these actions seem to me the same as the
inconsistencies possible when the two operations are happening on the same
server simultaneously in read committed mode.
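
As a standalone sketch of the ownership arithmetic (all names here are made
up for illustration; this is not proposing an actual API):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    typedef uint64_t Oid64;         /* hypothetical widened Oid */

    #define SERVER_ID_BITS  16
    #define LOCAL_OID_BITS  (64 - SERVER_ID_BITS)
    #define LOCAL_OID_MASK  ((UINT64_C(1) << LOCAL_OID_BITS) - 1)
    #define MY_SERVER_ID    ((uint16_t) 42) /* per-server configuration */

    static uint16_t
    oid_server_id(Oid64 oid)
    {
        return (uint16_t) (oid >> LOCAL_OID_BITS);
    }

    static Oid64
    make_oid(uint16_t server_id, uint64_t local_oid)
    {
        return ((Oid64) server_id << LOCAL_OID_BITS) |
               (local_oid & LOCAL_OID_MASK);
    }

    /* the near-zero-overhead rejection test for writes */
    static bool
    write_allowed(Oid64 oid)
    {
        return oid_server_id(oid) == MY_SERVER_ID;
    }

    int
    main(void)
    {
        Oid64       mine = make_oid(MY_SERVER_ID, 12345);
        Oid64       theirs = make_oid(7, 12345);

        printf("mine belongs to server %u, writable here: %d\n",
               (unsigned) oid_server_id(mine), (int) write_allowed(mine));
        printf("theirs belongs to server %u, writable here: %d\n",
               (unsigned) oid_server_id(theirs), (int) write_allowed(theirs));
        return 0;
    }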


WAL records from each server should only ever change objects belonging to
that server, so with some effort it might be possible for every server in
the cluster to replay the WAL records from every other server without
collisions.  (This is the point where my idea might be crazy -- I'm not sure
how difficult/impossible it would be to merge WAL information from multiple
sources).

The total amount of WAL data that needs to be replayed on any given server
would be proportional to the total amount of data changes cluster-wide,
which is no different than the current state of affairs; it just happens to
all come from a single master in the current design.

The performance problems that at first seem inevitable in this design owing
to the need for each server to wait for wal replay from other servers to
avoid data inconsistencies would only actually happen if updates were
frequently based on reads from other servers.  If the vast majority of data
changes were not based on the contents of remotely owned
tables/indexes/sequences and such, then changes to local data could proceed
without stalling.

Therefore, this approach would probably be most appropriate for highly
partitioned data, with each partition being owned by one server in the
cluster, and modifications based on data from more than one partition being
rare.  (SELECTs based on data from multiple partitions should be just fine.)

If you think about it, that's just an extension of the current
architecture.  Currently, all data is partitioned into a single partition
belonging to the master, with zero partitions on the slaves, and all updates
are performed based on a single partition, that being the one on the master.
It just isn't obvious that this is the design, because it is a degenerate
case of the more general case that I am describing.


Application software that wants to run INSERTs, UPDATEs, etc. would have to
connect to the appropriate server for the table in question, but that might
not be too bad if the data is partitioned anyway -- just connect to the
server with the partition you want to change.

Thanks in advance for having even read this far.

Mark Dilger


Re: [HACKERS] proposal: multiple read-write masters in a cluster with wal-streaming synchronization

2013-12-31 Thread Alvaro Herrera
Mark Dilger wrote:
 This is not entirely pie in the sky, but feel free to tell me why this is 
 crazy.

Have you seen http://wiki.postgresql.org/wiki/BDR ?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: persistent plpgsql plugin info - field plugin_info for plpgsql_function structure

2013-12-31 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 Requested feature doesn't help me implement this concept 100%, but helps
 with check If I worked with some instance of function or not. And inside
 core a implementation is cheap. Outside core it is a magic with hash and
 checking transaction id (about 200 lines). When I worked on extension for
 coverage calculation I had to solve same task, so I think so this variable
 can be useful generally for similar tasks.

Are you proposing a reserved-for-plugins void* in struct
PLpgSQL_function similar to the existing one in struct PLpgSQL_execstate?

If so, while it sounds harmless in itself, I think your argument above is
actually the strongest reason *not* to do it.  The existing plpgsql plugin
infrastructure is incapable of supporting more than one plugin at a time,
and the more attractive we make it, the more likely we're going to have
conflicts.  It was never meant to support anything but the plpgsql
debugger.  Before we start aiding and abetting the development of other
plugins, we need a design that allows more than one of them to be
installed.

regards, tom lane




Re: [HACKERS] Get more from indices.

2013-12-31 Thread Tom Lane
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
 [ pathkey_and_uniqueindx_v7_20131203.patch ]

I started to look at this patch.  I don't understand the reason for the
foreach loop in index_pathkeys_are_extensible (and the complete lack of
comments in the patch isn't helping).  Isn't it sufficient to check that
the index is unique/immediate/allnotnull and its ordering is a prefix
of query_pathkeys?  If not, what's the rationale for the specific tests
being made on the pathkeys --- this code doesn't make much sense to me.

regards, tom lane




Re: [HACKERS] proposal: multiple read-write masters in a cluster with wal-streaming synchronization

2013-12-31 Thread Mark Dilger
The BDR documentation 
http://wiki.postgresql.org/images/7/75/BDR_Presentation_PGCon2012.pdf
says,

    Physical replication forces us to use just one
    node: multi-master required for write scalability

    Physical replication provides best read scalability

I am inclined to agree with the second statement, but
I think my proposal invalidates the first statement, at
least for a particular rigorous partitioning over which
server owns which data.

In my own workflow, I load lots of data from different
sources.  The partition the data loads into depends on
which source it came from, and it is never mixed or
cross referenced in any operation that writes the data.
It is only mixed in the sense that applications query
data from multiple sources.

So for me, multi-master with physical replication seems
possible, and would presumably provide the best
read scalability.  I doubt that I am in the only database
user who has this kind of workflow.

The alternatives are ugly.  I can load data from separate
sources into separate database servers *without* replication
between them, but then the application layer has to 
emulate queries across the data.  (Yuck.)  Or I can use
logical replication such as BDR, but then the servers
are spending more effort than with physical replication,
so I get less bang for the buck when I purchase more
servers to add to the cluster.  Or I can use FDW to access
data from other servers, but that means the same data
may be pulled across the wire arbitrarily many times, with
corresponding impact on the bandwidth. 

Am I missing something here?  Does BDR really provide
an equivalent solution?


Second, it seems that BDR leaves to the client the responsibility
for making schemas the same everywhere.  Perhaps this is just
a limitation of the implementation so far, which will be resolved
in the future?



On Tuesday, December 31, 2013 12:33 PM, Alvaro Herrera 
alvhe...@2ndquadrant.com wrote:
 
Mark Dilger wrote:

 This is not entirely pie in the sky, but feel free to tell me why this is 
 crazy.

Have you seen http://wiki.postgresql.org/wiki/BDR ?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Peter Geoghegan
On Tue, Dec 31, 2013 at 1:12 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 I created a patch improving the log_lock_wait messages by adding
 relation infos (name and OID) as well as tuple infos (if present –
 ctid and, if available, the tuple itself) in the context.

I think that this is a worthwhile effort, but like Tom I don't think
that it's universally true that these waiters are waiting on a tuple
lock. Most obviously, the XactLockTableWait() call you've modified
within nbtinsert.c is not. In actuality, it's waiting on what I
informally refer to as a "value lock" for a value being inserted into
a unique index. Unlike the tuple locking code, a LockTuple() call in
advance of the XactLockTableWait() call, to arbitrate ordering, is not
present. No tuple is locked at all.

ISTM that you should be printing just the value and the unique index
there, and not any information about the tuple proper. Doing any less
could be highly confusing. You should do an analysis of where this
kind of exception applies. For better direction about where that new
infrastructure ought to live, you might find some useful insight from
commit 991f3e5ab3f8196d18d5b313c81a5f744f3baaea.

-- 
Peter Geoghegan




[HACKERS] What exactly is drop-index-concurrently-1.spec trying to test?

2013-12-31 Thread Tom Lane
Peter pointed out in
http://www.postgresql.org/message-id/527c0fe9.7000...@gmx.net
that Kyotaro-san's patch to treat unique indexes as satisfying any sort
condition that they are a prefix of broke the drop-index-concurrently-1
isolation test.  The latest iterations of the patch respond to that by
changing the expected output.  However, that seems rather wrongheaded,
because AFAICT the intent of that part of the test is to demonstrate that
a seqscan has particular behavior; so if the planner starts generating an
indexscan instead, the test no longer proves anything of the kind.

What I'm wondering though is what's the point of testing that a concurrent
DROP INDEX doesn't affect a seqscan?  That seems kinda silly, so it's
tempting to address the patch's problem by just removing the steps
involving the getrow_seq query, rather than hacking it up enough so we'd
still get a seqscan plan.

What's more, the expected output shows that the other sessions are
blocking the DROP INDEX CONCURRENTLY until they complete, which means if
there *were* any odd effects like that, this test likely would fail to
expose them anyway.  I'd have thought the test would be designed to allow
the DROP to complete and then re-test that the results of the prepared
query are still sane, but it does no such thing.

So, in short, exactly what is it that this is supposed to be testing?

regards, tom lane




Re: [HACKERS] proposal: persistent plpgsql plugin info - field plugin_info for plpgsql_function structure

2013-12-31 Thread Pavel Stehule
2013/12/31 Tom Lane t...@sss.pgh.pa.us

 Pavel Stehule pavel.steh...@gmail.com writes:
  Requested feature doesn't help me implement this concept 100%, but helps
  with check If I worked with some instance of function or not. And inside
  core a implementation is cheap. Outside core it is a magic with hash and
  checking transaction id (about 200 lines). When I worked on extension for
  coverage calculation I had to solve same task, so I think so this
 variable
  can be useful generally for similar tasks.

 Are you proposing a reserved-for-plugins void* in struct
 PLpgSQL_function similar to the existing one in struct PLpgSQL_execstate?

 If so, while it sounds harmless in itself, I think your argument above is
 actually the strongest reason *not* to do it.  The existing plpgsql plugin
 infrastructure is incapable of supporting more than one plugin at a time,
 and the more attractive we make it, the more likely we're going to have
 conflicts.  It was never meant to support anything but the plpgsql
 debugger.  Before we start aiding and abetting the development of other
 plugins, we need a design that allows more than one of them to be
 installed.


ok, what can we do better?

Can a solution be callbacks on plpgsql_HashTableInsert and
plpgsql_HashTableDelete? With these callbacks a relation between a function
and some plugin data can be implemented more simply.

Regards

Pavel




 regards, tom lane



Re: [HACKERS] PoC: Partial sort

2013-12-31 Thread Andreas Karlsson

On 12/28/2013 04:51 PM, Alexander Korotkov wrote:

I've compiled it with clang. Yeah, there were mixed declarations. I've
rechecked it with gcc, now it gives no warnings. I didn't try it with
visual studio, but I hope it will be OK.


I looked at this version of the patch and noticed a possibility for
improvement. You could decrement the bound for the tuplesort after every
completed sort. Otherwise the optimizations for small limits won't apply
to partial sort.
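
Something along these lines at the start of each per-group sort, assuming
the node tracks how many tuples it has already returned (the field names
are hypothetical; tuplesort_set_bound() is the existing bounded-sort entry
point):

    if (node->bounded)
        tuplesort_set_bound(tuplesortstate,
                            node->bound - node->tuplesEmitted);

That way a LIMIT 10 query that has already produced 8 tuples from earlier
groups only needs a bound of 2 for the remaining ones.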


--
Andreas Karlsson




Re: [HACKERS] proposal: persistent plpgsql plugin info - field plugin_info for plpgsql_function structure

2013-12-31 Thread Amit Kapila
On Wed, Jan 1, 2014 at 12:52 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2013/12/31 Alvaro Herrera alvhe...@2ndquadrant.com

 I'm not sure I understand this.  Do you want to avoid running the
 checker if a previous run was seen as successful and function has not
 changed?  Suppose the function depends on a table.  I invoke the check
 (it succeeds), then drop the table, then invoke the check again.  What
 should happen?  Conversely, if I invoke the check and it fails, then I
 create the table and invoke the check again, what should happen?  How
 does this idea of yours know when to invalidate the cached result of the
 check when unrelated objects, which the function depends on, are
 dropped/created/modified?


 plpgsql_check is designed for interactive (active) mode and passive mode. In
 interactive mode a enhanced checking is started by explicit request -
 explicit using plpgsql_check function - this feature is taken from patches
 that I sent to mailing list. In this mode a check is executed always (when
 checking is enabled) - and it is called repeatedly (when user requests it).

 Passive mode is taken from plpgsql_lint extension. It is plpgsql extension
 based on begin_func callback. plpgsql_lint doesn't support fatal_errors
 option - every detected error is fatal, that stops execution. plpgsql_check
 allows fatal_errors = false (plpgsql_check raises warnings only], and I am
 searching way how to minimize repeated warning messages. It is one
 motivation. Second motivation is increasing speed of regression tests by
 removing repeated check. Good idea is a function that removes mark from
 compiled function - so anybody can do recheck without leaving of session.

Will adding different modes to the plpgsql_check functionality really
help? Is that going to overcome the concerns raised by the committer,
which I understand were mainly about code duplication and improvements in
messages and code comments - or am I missing something?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [BUG FIX] Compare returned value by socket() against PGINVALID_SOCKET instead of 0

2013-12-31 Thread Amit Kapila
On Wed, Dec 25, 2013 at 6:05 PM, Joel Jacobson j...@trustly.com wrote:
 Hi,

 I've tried to fix some bugs reported by Andrey Karpov in an article at
 http://www.viva64.com/en/b/0227/

 The value returned by socket() is unsigned on Windows and thus cannot
 be checked for being less than zero to detect an error; instead
 PGINVALID_SOCKET should be used, which is hard-coded to -1 on
 non-windows platforms.

Though I have not verified all the instances you have fixed in your patch,
I feel the general direction of the fix is right.
I think you can upload your patch for the next CommitFest.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

