Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Alexander Korotkov
On Tue, Mar 11, 2014 at 5:19 AM, Peter Geoghegan p...@heroku.com wrote:

 On Mon, Mar 10, 2014 at 4:19 AM, Alexander Korotkov
 aekorot...@gmail.com wrote:
  Here it is.

 So it looks like what you have here is analogous to the other problems
 that I fixed with both GiST and GIN. That isn't surprising, and this
 does fix my test-case. I'm not terribly happy about the lack of
 explanation for the hashing in that loop, though. Why use COMP_CRC32()
 at all, for one thing?

 Why do this for non-primitive jsonb hashing?

 COMP_CRC32(stack->hash_state, PATH_SEPARATOR, 1);

 Where PATH_SEPARATOR is:

 #define PATH_SEPARATOR ("\0")

 Actually, come to think of it, why not just use one hashing function
 everywhere? i.e., jsonb_hash(PG_FUNCTION_ARGS)? It's already very
 similar. Pretty much every hash operator support function 1 (i.e. a
 particular type's hash function) is implemented with hash_any(). Can't
 we just do the same here? In any case it isn't obvious why the
 requirements for those two things (the hashing mechanism used by the
 jsonb_hash_ops GIN opclass, and the hash operator class support
 function 1 hash function) cannot be the same thing.


It's because the CRC32 interface allows incremental calculation, while hash_any
requires a single chunk of memory. I don't think that unfolding everything is a
good idea. But we could implement an incremental interface for hash_any.
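For illustration, here is a minimal standalone sketch of the interface difference being discussed. The start/update/finish shape loosely mirrors PostgreSQL's INIT_CRC32()/COMP_CRC32()/FIN_CRC32() macros, but the table-driven CRC-32 below is an independent toy implementation, not code from the tree; hash_any() is represented only by the one-shot wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bytewise CRC-32 (IEEE polynomial 0xEDB88320), illustration only. */
static uint32_t crc32_table[256];

static void
crc32_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		crc32_table[i] = c;
	}
}

/* Incremental interface: state survives across arbitrarily many chunks. */
static uint32_t
crc32_start(void)
{
	return 0xFFFFFFFFu;
}

static uint32_t
crc32_comp(uint32_t state, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
		state = crc32_table[(state ^ *p++) & 0xFF] ^ (state >> 8);
	return state;
}

static uint32_t
crc32_finish(uint32_t state)
{
	return state ^ 0xFFFFFFFFu;
}

/*
 * One-shot interface, the constraint hash_any() imposes: the whole key
 * must already sit in one contiguous buffer.
 */
static uint32_t
crc32_oneshot(const void *data, size_t len)
{
	return crc32_finish(crc32_comp(crc32_start(), data, len));
}
```

Hashing a nested value incrementally just means calling crc32_comp() once per element as a traversal loop visits it, whereas the one-shot style would force "unfolding" the whole value into a flat buffer first, which is the cost Alexander wants to avoid.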

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-11 Thread Amit Kapila
On Mon, Mar 10, 2014 at 5:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 09/03/14 12:15, Amit Kapila wrote:
 [...] due to which the message it displays seems to be
 incomplete.  The message it displays is as below:

 LOG:  process 1800 still waiting for ShareLock on transaction 679 after 1014.000 ms
 CONTEXT:  while attempting to lock in relation public.idx_t1 of database postgres

 Well, there is no tuple information available, so we have to leave it out.

 Here it is not clear what it attempts to *lock in*.

 Ok, I reworded the message as you suggested below. Should make this
 clearer as well.

As per your recent changes, wherever we have tuple info it will be used in
the message; otherwise it will just use the relation info. So the message
difference will be as below, which I think is okay.

Places where tuple info not available

LOG:  process 5788 still waiting for ShareLock on transaction 679 after 1014.000 ms
CONTEXT:  while attempting to operate in relation public.idx_t1 of database postgres

Places where tuple info available

LOG:  process 5788 still waiting for ShareLock on transaction 686 after 1014.000 ms
CONTEXT:  while attempting to operate on tuple (0,1) with values (1, aaa, bbb) in relation public.t1 of database postgres

Actually, I had thought of passing tuple info from the places where it
is currently not passed, but it would need some extra code which I think
is not worth it just for message information.  For example, in case of
_bt_doinsert(), if we have to pass index tuple info, we might need to
change the prototype of XactLockTableWaitWithInfo() such that it can
accept both an IndexTuple and a HeapTuple and then use them inside the
function accordingly.  Also, I think it is okay not to display the index
tuple here, as it is still not visible.  For the other cases where we
are using a DirtySnapshot, things can be more complex in identifying
whether the tuple can be logged.  So I think it's better to leave
logging as you have done in the patch.


 Can you explain why you prefer the WithInfo naming? Just curious, it
 seemed to me the more descriptive name should have been preferred.

Actually, apart from MultiXactIdWaitWithErrorContext(), I had
thought of the options below:

1. Change the prototype of MultiXactIdWait()/XactLockTableWait()
to pass rel and tuple info, and add new functions like
XactLockTableWait_Internal() which do the actual work; but to achieve
that we would need to call the internal function from some places where
the top-level one is not called.  Such a coding convention is used in a
few places (heap_beginscan_internal).

2. Name the new functions MultiXactIdWaitExtended()/XactLockTableWaitExtended()
or MultiXactIdWaitEx()/XactLockTableWaitEx().
There are already some similar functions (ReadBufferExtended,
relation_openrv_extended).

3. MultiXactIdWaitWithInfo()/XactLockTableWaitWithInfo()

Earlier I found option 3 to be a good choice, but thinking about it
again, I am now leaning towards option 2.


 6.
 @@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation
 relation, int lockmode,
   if (TransactionIdIsValid(SnapshotDirty.xmax))
   {
   ReleaseBuffer(buffer);
 - XactLockTableWait(SnapshotDirty.xmax);
 + XactLockTableWaitWithInfo(relation, &tuple,
 +  SnapshotDirty.xmax);

 In function EvalPlanQualFetch(), we are using SnapshotDirty to fetch
 the tuple, so I think there is a chance that it will log a tuple which
 otherwise would not be visible.  I don't think this is right.

 Hm, after checking source I tend to agree. I replaced it with NULL.

Today, looking into it again, I found that the function
heap_lock_updated_tuple_rec() uses SnapshotAny to fetch the tuple,
and I think for this case also it's not safe to log the tuple.

Could you please check (if you are comfortable doing so) wherever
this patch passes a tuple, whether it is okay to pass it based on
visibility rules; otherwise I will verify it again.

 8.
  void
  WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
  {
 - List   *l;
 + List   *l;

 Extra spacing not needed.

 What is the policy to do here?

The simple rule I follow is: don't touch code which has no relation
to the current patch.

 [...] and recently this is used in function SnapBuildFindSnapshot().

 Hm, should we add the context there, too?

I don't think you can use it there, as there is no relation or tuple
information available.  This is a unique kind of usage for this API,
where it is used to wait for a transaction to complete without operating
on a relation/tuple, whereas in all other places it is used to wait
while some other transaction is operating on a tuple.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-11 Thread Christian Kruse
Hi,

On 11/03/14 13:23, Amit Kapila wrote:
 [… snip …]
 So I think it's better to leave logging it as you have done in
 patch.

Agreed.

 […]
 2. Name new functions as  
 MultiXactIdWaitExtended()/XactLockTableWaitExtended()
 or MultiXactIdWaitEx()/XactLockTableWaitEx().
 You can find some other similar functions (ReadBufferExtended,
 relation_openrv_extended)
 
 3. MultiXactIdWaitWithInfo()/XactLockTableWaitWithInfo()
 
 Earlier I found option 3 as a good choice, but now again thinking on
 it I am leaning towards option 2.

Changing it once again will only cause more work and won't make a big
difference, so I suggest sticking with the current function names.

 Today, looking into it again, I found that the function
 heap_lock_updated_tuple_rec() uses SnapshotAny to fetch the tuple,
 and I think for this case also it's not safe to log the tuple.
 
 Could you please check (if you are comfortable doing so) wherever
 this patch passes a tuple, whether it is okay to pass it based on
 visibility rules; otherwise I will verify it again.

I think I got all places, but it would be nice to have a confirmation.

  [… policy regarding whitespaces …]
 The simple rule I follow is don't touch the code which has no relation
 to current patch.

OK. Thanks.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 71ec740..bcc656a 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithInfo(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2714,8 +2716,8 @@ l1:
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithInfo(relation, &tp, (MultiXactId) xwait,
+	MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2741,7 +2743,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, &tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3266,8 +3268,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, &remain,
-			infomask);
+			MultiXactIdWaitWithInfo(relation, &oldtup,
+	   (MultiXactId) xwait, mxact_status, &remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3306,7 +3308,7 @@ l2:
 			/*
 			 * There was no UPDATE in the MultiXact; or it aborted. No
 			 * TransactionIdIsInProgress() call needed here, since we called
-			 * MultiXactIdWait() above.
+			 * MultiXactIdWaitWithInfo() above.
 			 */
 			if (!TransactionIdIsValid(update_xact) ||
 TransactionIdDidAbort(update_xact))
@@ -3341,7 +3343,7 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
-XactLockTableWait(xwait);
+XactLockTableWaitWithInfo(relation, &oldtup, xwait);
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4409,7 +4411,8 @@ l3:
 		RelationGetRelationName(relation))));
 }
 else
-	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	MultiXactIdWaitWithInfo(relation, tuple,
+(MultiXactId) xwait, status, NULL, infomask);
 
 /* if there are updates, follow the update chain */
 if (follow_updates &&
@@ -4464,7 +4467,7 @@ l3:
 		RelationGetRelationName(relation))));
 }
 else
-	XactLockTableWait(xwait);
+	XactLockTableWaitWithInfo(relation, tuple, xwait);
 
 /* if there are updates, follow the update chain */
 if (follow_updates &&
@@ -5151,7 +5154,7 @@ l4:
 	if (needwait)
 	{
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-		XactLockTableWait(members[i].xid);
+		XactLockTableWaitWithInfo(rel, NULL, members[i].xid);
 		pfree(members);
 		goto l4;
 	}
@@ -5211,7 +5214,7 @@ l4:
 if (needwait)
 {
 	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-	XactLockTableWait(rawxmax);
+	XactLockTableWaitWithInfo(rel, NULL, rawxmax);
 	goto l4;
 }
 if (res != HeapTupleMayBeUpdated)
@@ -6169,6 +6172,33 @@ MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 }
 
 /*
+ * MultiXactIdWaitWithInfo
+ *		Sets up the error context callback for reporting tuple
+ *		information and relation for a lock to acquire
+ *
+ * Use this instead of calling MultiXactIdWait()
+ */
+static void
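The hunk above is cut off, but its shape is clear from the comment: push an error context callback carrying the relation/tuple info, call the plain wait function, pop the callback. That push/call/pop pattern can be sketched in standalone C. Everything below (the ContextCallback struct, the function names, the report buffer) is an invented analog of PostgreSQL's ErrorContextCallback machinery, not the actual patch code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A linked stack of context callbacks, like PG's error_context_stack. */
typedef struct ContextCallback
{
	struct ContextCallback *previous;
	void		(*callback) (void *arg);
	void	   *arg;
} ContextCallback;

static ContextCallback *context_stack = NULL;
static char report_buf[256];

/* Emit a log line, then let each stacked callback add a CONTEXT line. */
static void
emit_report(const char *msg)
{
	snprintf(report_buf, sizeof(report_buf), "LOG: %s", msg);
	for (ContextCallback *c = context_stack; c; c = c->previous)
		c->callback(c->arg);
}

typedef struct WaitInfo
{
	const char *relname;
} WaitInfo;

static void
wait_info_callback(void *arg)
{
	WaitInfo   *info = arg;
	size_t		used = strlen(report_buf);

	snprintf(report_buf + used, sizeof(report_buf) - used,
			 "\nCONTEXT: while operating in relation %s", info->relname);
}

/* Stand-in for XactLockTableWait(): pretend we time out and report. */
static void
xact_lock_table_wait(void)
{
	emit_report("process still waiting for ShareLock");
}

/* Stand-in for XactLockTableWaitWithInfo(): push, wait, pop. */
static void
xact_lock_table_wait_with_info(const char *relname)
{
	WaitInfo	info = {relname};
	ContextCallback cb = {context_stack, wait_info_callback, &info};

	context_stack = &cb;
	xact_lock_table_wait();
	context_stack = cb.previous;
}
```

The design point the patch exploits is that the wrapper adds the CONTEXT line without the underlying wait routine knowing anything about relations or tuples, which is also why callers that lack that information (like SnapBuildFindSnapshot()) can keep using the plain function.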

Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-11 Thread Rajeev rastogi
On 10 March 2014 23:44, Tom Lane wrote:

 Unfortunately, while testing it I noticed that there's a potentially
 fatal backwards-compatibility problem, namely that the COPY n status
 gets printed on stdout, which is the same place that COPY OUT data is
 going.  While this isn't such a big problem for interactive use, usages
 like this one are pretty popular:
 
psql -c 'copy mytable to stdout' mydatabase | some-program
 
 With the patch, COPY n gets included in the data sent to some-program,
 which never happened before and is surely not what the user wants.
 The same if the -c string uses \copy.
 
 There are several things we could do about this:
 
 1. Treat this as a non-backwards-compatible change, and document that
 people have to use -q if they don't want the COPY tag in the output.
 I'm not sure this is acceptable.
 
 2. Kluge ProcessResult so that it continues to not pass back a PGresult
 for the COPY TO STDOUT case, or does so only in limited circumstances
 (perhaps only if isatty(stdout), for instance).

 I'm inclined to think #2 is the best answer if we can't stomach #1.

Is it OK to have different status output for different flavors of the COPY
command? I am afraid that it will become kind of inconsistent.

Also, not providing the command result status may be inconsistent with the
behavior of other SQL commands.

I agree that it breaks backward compatibility, but I am not sure if anyone
is so tightly coupled with this (or whether they will be affected by the
additional status result).

To me option #1 seems more suitable, especially since there is an option to
disable the status output by giving -q.

Please provide your opinion or let me know if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi






Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-11 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

MauMau maumau...@gmail.com writes:
To put the question in other words, is it safe to load a multi-threaded PL
library in the single-threaded backend process, if the PL only calls SPI in
the main thread?


When it breaks, we're not going to be concerned.


I may not understand your nuance.  Which of the following do you mean?

* PL/Java's design is dangerous in terms of the mixture of single- and
multi-threading, and we cannot be 100% sure whether there's really no
problem.

* SPI must not be used in a multi-threaded process, even if only one
thread calls SPI functions at a time.  So what we can say is that
PL/Java is theoretically not safe in terms of SPI.


Regards
MauMau









[HACKERS] Disk usage for intermediate results in join

2014-03-11 Thread Parul Lakkad
Hi,

I am trying to figure out when disk is used to store intermediate results
while performing joins in Postgres.

According to my findings using 'EXPLAIN ANALYZE', only merge sort uses disk.
Can anyone please throw some more light on this?

Thanks,
Parul


Re: [HACKERS] [ISSUE] pg_dump: schema with OID 0 does not exist

2014-03-11 Thread Robert Haas
On Tue, Mar 11, 2014 at 12:23 AM, Prakash Itnal prakash...@gmail.com wrote:
 Can someone confirm whether this is really an issue, or suggest any
 reasons for the missing rows?

Well, your database is definitely getting corrupted somehow.  But
there's no information in your email which would enable someone to
guess how.

I blogged on this subject a while back, but I don't know whether it's
relevant to your situation:

http://rhaas.blogspot.com/2012/03/why-is-my-database-corrupted.html

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Disk usage for intermediate results in join

2014-03-11 Thread Heikki Linnakangas

On 03/11/2014 01:24 PM, Parul Lakkad wrote:

Hi,

I am trying to figure out when disk is used to store intermediate results
while performing joins in postgres.

According to my findings using 'EXPLAIN ANALYZE', only merge sort uses disk.
Can anyone please throw some more light on this?


Hash joins will also spill to disk if the hash-side of the join is large 
enough. The planner usually tries to avoid it, but sometimes it happens.


- Heikki




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-03-11 Thread Alvaro Herrera
MauMau wrote:
 Hi, Amit san,
 
 I'm replying to your previous email.  I wanted to reply to your
 latest mail below, but I removed it from my mailer by mistake.
 
 http://www.postgresql.org/message-id/CAA4eK1LAg6ndZdWLb5e=Ep5DzcE8KZU=JbmO+tFwySYHm2ja=q...@mail.gmail.com
 
 Do you know how I can reply to an email which was deleted locally?
 I thought I could download an old mail by clicking the raw link and
 importing it into the mailer.  However, it requires username/password
 input, which seems to be different from the one for editing the
 CommitFest.  I couldn't find out how to authenticate myself.

The box that asks for a password tells you what the user/password is.  I
think it's something like archives/archives or similar.  The password is
there only to keep spammers out, not to provide any real auth.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Magnus Hagander
On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:

 Hi,

 I've noticed that db_user_namespace has had the following note
 attached to it since 2002:

 This feature is intended as a temporary measure until a complete
 solution is found. At that time, this option will be removed.

 It will be 12 years this year since this temporary measure was
 added.  I'm just wondering, is there any complete solution that
 anyone had in mind yet?  Or should this just be deprecated?


I'd say +1 to remove it. That would also make it possible to get rid of
password authentication...

But we might want to officially deprecate it in 9.4, and then remove it
later?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:
 It will be 12 years this year since this temporary measure was
 added.  I'm just wondering, is there any complete solution that
 anyone had in mind yet?  Or should this just be deprecated?

 I'd say +1 to remove it. That would also make it possible to get rid of
 password authentication...

If we remove it without providing a substitute feature, people who are
using it will rightly complain.

Are you claiming there are no users, and if so, on what evidence?

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Magnus Hagander
On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:
  It will be 12 years this year since this temporary measure was
  added.  I'm just wondering, is there any complete solution that
  anyone had in mind yet?  Or should this just be deprecated?

  I'd say +1 to remove it. That would also make it possible to get rid of
  password authentication...

 If we remove it without providing a substitute feature, people who are
 using it will rightly complain.

 Are you claiming there are no users, and if so, on what evidence?


I am claiming that I don't think anybody is using it, yes.

Based on the fact that I have *never* come across it on any system I've
seen since, well, forever. Except once, I think, many years ago, when
someone had enabled it by mistake and needed my help to remove it...

But we should absolutely deprecate it first in that case. Preferably
visibly (e.g. with a log message when people use it). That would at
least get the people who use it to let us know they do, so we can figure
out whether anybody actually does - and de-deprecate it if so.

Or if someone wants to fix it properly, of course :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-11 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 From: Tom Lane t...@sss.pgh.pa.us
 When it breaks, we're not going to be concerned.

 I may not understand your nuance.  Which of the following do you mean?

 * PL/Java's design is dangerous in terms of the mixture of single- and 
 multi-threading, and we cannot be 100% sure whether there's really no 
 problem.

That, more or less.  There is exactly zero provision in the Postgres
code for multiple threads to exist inside a backend process.  It's
possible that PL/Java manages to completely insulate the Java world
from the C world, so that the C code never sees more than one thread.
But any leakage at all in that abstraction is probably going to cause
bugs; and as I said, we (PG hackers) are not going to consider such
bugs to be our problem.

On platforms where the standard libc supports threading (which is most,
these days), I'd be particularly worried about leakage along the path
java -> libc -> postgres.  If libc becomes aware that there are multiple
threads executing inside the process, it's likely to change behaviors.

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Are you claiming there are no users, and if so, on what evidence?

 I am claiming that I don't think anybody is using that, yes.

 Based on the fact that I have *never* come across it on any system I've
 come across since, well, forever. Except once I think, many years ago, when
 someone had enabled it by mistake and needed my help to remove it...

A slightly more scientific basis for that would be to ask on
pgsql-general.

 Or if someone wants to fix it properly of course :)

Yeah, that's what we've been hoping for for 12 years.  I stopped holding
my breath awhile ago.

Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
has been.  I'm just expecting lots of push-back if we try.  And it's kind
of hard to resist push-back when you don't have a substitute to offer.

regards, tom lane




Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-11 Thread Tom Lane
Rajeev rastogi rajeev.rast...@huawei.com writes:
 On 10 March 2014 23:44, Tom Lane wrote:
 Unfortunately, while testing it I noticed that there's a potentially
 fatal backwards-compatibility problem, namely that the COPY n status
 gets printed on stdout, which is the same place that COPY OUT data is
 going.  While this isn't such a big problem for interactive use, usages
 like this one are pretty popular:
 
 psql -c 'copy mytable to stdout' mydatabase | some-program
 
 With the patch, COPY n gets included in the data sent to some-program,
 which never happened before and is surely not what the user wants.
 The same if the -c string uses \copy.

 Is it OK to have different status output for different flavors of the COPY
 command?  I am afraid that it will become kind of inconsistent.

Well, that's the big question here.

We already do have different status output for different kinds of COPY,
ie we don't report it for COPY FROM STDIN/TO STDOUT.  What now emerges
is that there's a good reason for the omission in the case of TO STDOUT.
I certainly hadn't remembered that, and there's no documentation of it
in either code comments or the SGML docs.

After sleeping on it, I'm inclined to think we should continue to not
print status for COPY TO STDOUT.  Aside from the risk of breaking scripts,
there's a decent analogy to be made to SELECT: we don't print a status
tag for that either.

That leaves the question of whether we want to start printing a tag for
the COPY FROM STDIN case.  I don't think that'd create much risk of
breaking anything, and the analogy to SELECT doesn't hold either.
OTOH, Robert opined upthread that FROM STDIN and TO STDOUT shouldn't
behave differently; does that argument still impress anyone?  And given
that different COPY cases are still going to behave differently, maybe
we should just stick with the status quo.  It's been like this for a
mighty long time with few complaints.

In any case, some documentation and code comment changes would be
appropriate ...

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 09:57 AM, Tom Lane wrote:

Magnus Hagander mag...@hagander.net writes:

On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Are you claiming there are no users, and if so, on what evidence?

I am claiming that I don't think anybody is using that, yes.
Based on the fact that I have *never* come across it on any system I've
come across since, well, forever. Except once I think, many years ago, when
someone had enabled it by mistake and needed my help to remove it...

A slightly more scientific basis for that would be to ask on
pgsql-general.


Or if someone wants to fix it properly of course :)

Yeah, that's what we've been hoping for for 12 years.  I stopped holding
my breath awhile ago.

Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
has been.  I'm just expecting lots of push-back if we try.  And it's kind
of hard to resist push-back when you don't have a substitute to offer.





Or we try to make it work. I don't think the idea is inherently bad, and 
I know there are people (like ISPs) who would like to have it work 
properly. Maybe in these days when most people are on dedicated VMs this 
matters less, but I don't think shared database servers are totally dead 
yet.


The docs say:

   db_user_namespace causes the client's and server's user name
   representation to differ. Authentication checks are always done with
   the server's user name so authentication methods must be configured
   for the server's user name, not the client's. Because md5 uses the
   user name as salt on both the client and server, md5 cannot be used
   with db_user_namespace.

Is that the only major issue? Why not have the server strip out the @db 
part if this is on? If we made this an initdb-time setting rather than a 
GUC then we'd remove the problems caused by turning this on and off. I'm 
not sure what other problems that might cause, but it doesn't seem 
totally intractable at first glance.


cheers

andrew




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote:
 Or we try to make it work. I don't think the idea is inherently bad,
 and I know there are people (like ISPs) who would like to have it
 work properly. Maybe in these days when most people are on dedicated
 VMs this matters less, but I don't think shared database servers are
 totally dead yet.

Agreed.  There are certainly pretty big hosting companies out there
which are already doing multi-tenant PG, but they're using their own
approaches instead of anything we provide (because what we provide
sucks, basically..).

 The docs say:
 
db_user_namespace causes the client's and server's user name
representation to differ. Authentication checks are always done with
the server's user name so authentication methods must be configured
for the server's user name, not the client's. Because md5 uses the
user name as salt on both the client and server, md5 cannot be used
with db_user_namespace.
 
 Is that the only major issue? Why not have the server strip out the
 @db part if this is on? If we made this an initdb-time setting
 rather than a GUC then we'd remove the problems caused by turning
 this on and off. I'm not sure what other problems that might cause,
 but it doesn't seem totally intractable at first glance.

Isn't the other issue for ISPs essentially that we don't have row-level
security for our global catalogs?  As in, we can't limit what's in
pg_authid to only those entries a given user should be able to see?  I
don't think db_user_namespace addresses that issue (but I didn't go look
either).

Thanks,

Stephen




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 12:37 PM, Stephen Frost wrote:


Isn't the other issue for ISPs essentially that we don't have row-level
security for our global catalogs?  as in- we can't limit what's in
pg_authid to only those entries a given user should be able to see?  I
don't think db_user_namespace addresses that issue (but I didn't go look
either).





Possibly, but we don't have to solve every problem at once. As the 
Sermon on the Mount says, "Sufficient unto the day is the evil thereof."


cheers

andrew





Re: [HACKERS] Retain dynamic shared memory segments for postmaster lifetime

2014-03-11 Thread Robert Haas
On Mon, Mar 10, 2014 at 11:26 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Mon, Mar 10, 2014 at 11:37 PM, Robert Haas robertmh...@gmail.com wrote:
 Looks good, committed.  However, I changed it so that
 dsm_keep_segment() does not also perform the equivalent of
 dsm_keep_mapping(); those are two separate operations.

 So are you expecting that if someone needs to retain a dynamic segment
 till PM lifetime, they should call both dsm_keep_segment() and
 dsm_keep_mapping()?

If they want to keep both the mapping and the segment, yes.  But in
general those two things are independent of each other.  A process
could want to map the segment and store some data in it, and then it
could want to unmap the segment; and then later the segment could be
mapped again (perhaps from some other backend) to get the data out.

 If we don't call both, it can lead to following warning:
 postgres=# select dsm_demo_create('this message is from session-new', 1);
 WARNING:  dynamic shared memory leak: segment 1402373971 still referenced

Well, that's just an artifact of the coding of dsm_demo_create().
Code that doesn't use dsm_keep_mapping() needs to be sure to call
dsm_detach() in the non-error path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Simon Riggs
On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Joe Conway m...@joeconway.com writes:
 I am probably missing something obvious, but why does the
 AccessShareLock remain held on a table after a SELECT statement is
 complete when in a transaction block?

 *Any* lock acquired by user command is held till end of transaction;
 AccessShareLock isn't special.

 In general, releasing early would increase the risk of undesirable
 behaviors such as tables changing definition mid-transaction.

I thought it a good question at first, but the workaround is simple...
just don't use multi-step transactions; submit each request as a
separate transaction.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Alvaro Herrera
Tom Lane escribió:
 =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  You are correct. pg_dump export reloptions using WITH clause of CREATE
  TABLE statement. I.e.:
 
  CREATE TABLE foo (
  )
  WITH (autovacuum_enabled=false, bdr.do_replicate=false);
 
  So if this statement checks for 'bdr' extension is loaded then in partial
  restore it can be fail.
 
 I see absolutely *nothing* wrong with failing that command if bdr is not
 installed.  For an analogy, if this table includes a column of type bar
 defined by some extension baz, we are certainly going to fail the
 CREATE TABLE if baz isn't installed.
 
 Now, if bdr is installed but the validation doesn't happen unless bdr
 is loaded in some sense, then that is an implementation deficiency
 that I think we can insist be rectified before this feature is accepted.

So, I spent some time on this patch the last couple of days to add some
validation.  But after submitting it, it seems to me that there wasn't
as much consensus on how to handle validation as I'd first thought.

So, the first and simplest way to go about this, of course, is just not
validate anything. This is what Fabrízio's patch does.  So the table
owner can execute
  ALTER TABLE mary SET (supercalifragilisticexpialidocious.umbrella_flight = 
'hell yeah')
and that would be it.  Whether a module makes use of this later or not,
is not that guy's problem.  This is mostly what we do for GUCs, note, so
it's not exactly groundbreaking.

As a second possibility, my patch as posted allows one to register a
namespace.  So pg_dump can do this:
  SELECT pg_register_option_namespace('supercalifragilisticexpialidocious');
and then create the table just like the above ALTER TABLE.  Note that
the variable name, and the value, are not checked until later.  If a
module comes later and says "hey, I own that super- option namespace,
and I have option umbrella_flight but it's a boolean" (by calling
add_bool_reloption), that will raise an error.  Note that in my patch as
posted, if you set the parameter umbrella_flight='better not' on an
index, but the parameter has only been defined for tables
(RELOPT_KIND_HEAP), it will be silently accepted.  Also note that we can
add a function equivalent to EmitWarningOnPlaceholders (Andres' idea),
so that any unrecognized option will cause some noise and it can be
identified right away.  Since only table owners can set options, this
seems more than good to me; it's not like table owners are going to mess
up by adding pointless options just for the heck of it.
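As a thought experiment, the register-namespace-then-validate-lazily behaviour of this second proposal can be modelled in a few lines of Python. The function names mirror the ones mentioned in this thread, but the code is purely illustrative, not the patch's C implementation.

```python
# Hypothetical model of the second proposal: namespaces are registered up
# front, option values are stored unvalidated, and checking happens only
# when a module later claims the option.
registered_namespaces = set()
stored_options = {}   # (namespace, option) -> raw string value

def pg_register_option_namespace(ns):
    registered_namespaces.add(ns)

def set_reloption(ns, opt, value):
    """ALTER TABLE ... SET (ns.opt = value): accepted without validation."""
    if ns not in registered_namespaces:
        raise ValueError(f'unrecognized option namespace "{ns}"')
    stored_options[(ns, opt)] = value

def add_bool_reloption(ns, opt):
    """A module claims an option; stored values are only checked now."""
    raw = stored_options.get((ns, opt))
    if raw is not None and raw not in ("true", "false"):
        raise ValueError(f'invalid boolean value "{raw}" for {ns}.{opt}')

pg_register_option_namespace("supercalifragilisticexpialidocious")
set_reloption("supercalifragilisticexpialidocious", "umbrella_flight", "hell yeah")
try:
    add_bool_reloption("supercalifragilisticexpialidocious", "umbrella_flight")
except ValueError as err:
    print(err)   # the error surfaces only once the module claims the option
```

Setting an option in an unregistered namespace fails immediately, while a bogus value in a registered namespace is only rejected when the owning module declares the option's type.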


A further possibility requires modules to also register options (not
only namespaces), and to validate each and every option as soon as it's
created.  So if you try to set an option that has not been previously
registered by a module, that will raise an error right there.  This
seems to be what Tom, Robert and Peter want.  However, getting there
seems very laborious; apparently, we will need a new system catalog to
register option validators, for starters.  We'll also need a way to load
the relevant module whenever a table gets an option belonging to a module
that isn't loaded.  (I think this will fall out automatically if the
validator is registered, because when the validator is called, the module
is loaded by the system.)

One slight problem with this is what to do with extensions that don't
provide any C code.  Some use cases require options that can be set and
accessed only from SQL code.


My question here for the hackers community at large is this: are we okay
with implementing the second option I propose above?  If we are, then I
will see about finalizing the patch and getting it committed.  If we are
not, and we're determined that only the third option is acceptable, I
will jump out of this thread and stop wasting everyone's time.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Atri Sharma
On Tue, Mar 11, 2014 at 10:56 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
  Joe Conway m...@joeconway.com writes:
  I am probably missing something obvious, but why does the
  AccessShareLock remain held on a table after a SELECT statement is
  complete when in a transaction block?
 
  *Any* lock acquired by user command is held till end of transaction;
  AccessShareLock isn't special.
 
  In general, releasing early would increase the risk of undesirable
  behaviors such as tables changing definition mid-transaction.

 I thought good question at first, but the workaround is simple...
 just don't use multi-step transactions, submit each request as a
 separate transaction.


 Wouldn't that tend to get inefficient?

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Simon Riggs
On 11 March 2014 17:29, Atri Sharma atri.j...@gmail.com wrote:



 On Tue, Mar 11, 2014 at 10:56 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
  Joe Conway m...@joeconway.com writes:
  I am probably missing something obvious, but why does the
  AccessShareLock remain held on a table after a SELECT statement is
  complete when in a transaction block?
 
  *Any* lock acquired by user command is held till end of transaction;
  AccessShareLock isn't special.
 
  In general, releasing early would increase the risk of undesirable
  behaviors such as tables changing definition mid-transaction.

 I thought good question at first, but the workaround is simple...
 just don't use multi-step transactions, submit each request as a
 separate transaction.


 Wouldn't that tend to get inefficient?

Please outline your alternate proposal so we can judge the comparative
efficiency.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Atri Sharma
On Tue, Mar 11, 2014 at 11:07 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 17:29, Atri Sharma atri.j...@gmail.com wrote:
 
 
 
  On Tue, Mar 11, 2014 at 10:56 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
 
  On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
   Joe Conway m...@joeconway.com writes:
   I am probably missing something obvious, but why does the
   AccessShareLock remain held on a table after a SELECT statement is
   complete when in a transaction block?
  
   *Any* lock acquired by user command is held till end of transaction;
   AccessShareLock isn't special.
  
   In general, releasing early would increase the risk of undesirable
   behaviors such as tables changing definition mid-transaction.
 
  I thought good question at first, but the workaround is simple...
  just don't use multi-step transactions, submit each request as a
  separate transaction.
 
 
  Wouldn't that tend to get inefficient?

 Please outline your alternate proposal so we can judge the comparative
 efficiency.



I don't have an alternate proposal yet. I was just wondering if per-step
transactions could lead to a drop in performance.

If that is the best way to go, I am all for it.

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Simon Riggs
On 11 March 2014 17:26, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Tom Lane escribió:
 =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  You are correct. pg_dump export reloptions using WITH clause of CREATE
  TABLE statement. I.e.:

  CREATE TABLE foo (
  )
  WITH (autovacuum_enabled=false, bdr.do_replicate=false);

  So if this statement checks for 'bdr' extension is loaded then in partial
  restore it can be fail.

 I see absolutely *nothing* wrong with failing that command if bdr is not
 installed.  For an analogy, if this table includes a column of type bar
 defined by some extension baz, we are certainly going to fail the
 CREATE TABLE if baz isn't installed.

 Now, if bdr is installed but the validation doesn't happen unless bdr
 is loaded in some sense, then that is an implementation deficiency
 that I think we can insist be rectified before this feature is accepted.

 So, I spent some time on this patch the last couple of days to add some
 validation.  But after submitting it, it seems to me that there wasn't
 as much consensus on how to handle validation as I'd first thought.

 So, the first and simplest way to go about this, of course, is just not
 validate anything. This is what Fabrízio's patch does.  So the table
 owner can execute
   ALTER TABLE mary SET (supercalifragilisticexpialidocious.umbrella_flight = 
 'hell yeah')
 and that would be it.  Whether a module makes use of this later or not,
 is not that guy's problem.  This is mostly what we do for GUCs, note, so
 it's not exactly groundbreaking.

If a module fails to use a parameter, that may be a problem. But
forcing us to validate this using user-written code may not improve
the situation.

What happens if I have two extensions that both use the namespace foo?
That means we would run two validation routines on it, and if they
disagree on the set of options and values then we are hosed.

-1 to *requiring* validation for table-level options for exactly the
same reasons we no longer validate custom GUCs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Why is AccessShareLock held until end of transaction?

2014-03-11 Thread Joe Conway

On 03/11/2014 12:26 PM, Simon Riggs wrote:
 On 11 March 2014 03:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Joe Conway m...@joeconway.com writes:
 I am probably missing something obvious, but why does the 
 AccessShareLock remain held on a table after a SELECT statement
 is complete when in a transaction block?
 
 *Any* lock acquired by user command is held till end of
 transaction; AccessShareLock isn't special.
 
 In general, releasing early would increase the risk of
 undesirable behaviors such as tables changing definition
 mid-transaction.
 
 I thought good question at first, but the workaround is
 simple... just don't use multi-step transactions, submit each
 request as a separate transaction.

Yeah, I told them that already. Unfortunately in this environment it
is not an option. It isn't a huge problem, but I did find it
surprising (as did the client) that a purely read-only transaction
could cause a deadlock with a concurrent CREATE TABLE.

It would seem that once the SELECT statement has finished we could
drop the AccessShareLock, but I guess that would open a can of worms
that we don't want to contemplate.
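The deadlock described above (a read-only transaction's retained ACCESS SHARE lock versus concurrent DDL) boils down to a cycle in the wait-for graph. A toy model, with made-up transaction and table names:

```python
# Toy wait-for-graph model of the scenario above (names are made up):
# T1 still holds ACCESS SHARE on table "a" from an earlier SELECT and
# now wants a lock on "b"; T2 (the DDL) holds a lock on "b" and wants
# one on "a".  Neither can proceed: a deadlock.
held = {"T1": {"a"}, "T2": {"b"}}
wants = {"T1": "b", "T2": "a"}

def waits_for(txn):
    """Return the transaction holding the lock that txn wants, if any."""
    target = wants.get(txn)
    if target is None:
        return None
    return next((t for t, locks in held.items() if target in locks), None)

def has_deadlock(start):
    """Follow the wait-for chain; revisiting a node means a cycle."""
    seen, txn = set(), start
    while txn is not None and txn not in seen:
        seen.add(txn)
        txn = waits_for(txn)
    return txn is not None

print(has_deadlock("T1"))  # True: T1 -> T2 -> T1
```

Releasing the share lock at end of statement instead of end of transaction would remove T1's held lock on "a" and break the cycle, which is exactly the behaviour change under discussion.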

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support




Re: [HACKERS] Cleaner build output when not much has changed

2014-03-11 Thread Gurjeet Singh
On Mon, Mar 10, 2014 at 8:12 PM, Alvaro Herrera alvhe...@2ndquadrant.comwrote:

 Gurjeet Singh wrote:
  On Tue, Nov 26, 2013 at 12:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
   Gurjeet Singh gurj...@singh.im writes:
I was looking for ways to reduce the noise in Postgres make output,
    specifically, I wanted to eliminate the "Nothing to be done for
    `all'" messages, since they don't add much value, and just add to the
    clutter.
  
   Why don't you just use make -s if you don't want to see that?
   The example output you show is not much less verbose than before.
 
  I have a shell function that now adds --no-print-directory to my make
  command. This patch combined with that switch makes the output really
 clean
  (at least from my perspective). Since the use of a command-line switch
 can
  be easily left to personal choice, I am not proposing to add that or its
  makefile-equivalent. But modifying the makefiles to suppress noise is not
  something everyone can be expected to do, and do right.

 FWIW you can add a src/Makefile.custom file with this:

 all:
 @true

 and it will get rid of most noise.


As I noted in the first email in this thread, this causes a warning:

GNUmakefile:14: warning: overriding commands for target `all'
/home/gurjeet/dev/POSTGRES/src/Makefile.custom:2: warning: ignoring old
commands for target `all'

I have since settled for `make -s`. On slow builds it keeps me guessing for
a long time, without any indication of progress, but I've learned to live
with that.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com http://www.enterprisedb.com


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Fabrízio de Royes Mello
On Tue, Mar 11, 2014 at 2:43 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 17:26, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
  Tom Lane escribió:
  =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com
writes:
   You are correct. pg_dump export reloptions using WITH clause of
CREATE
   TABLE statement. I.e.:
 
   CREATE TABLE foo (
   )
   WITH (autovacuum_enabled=false, bdr.do_replicate=false);
 
   So if this statement checks for 'bdr' extension is loaded then in
partial
   restore it can be fail.
 
  I see absolutely *nothing* wrong with failing that command if bdr is
not
  installed.  For an analogy, if this table includes a column of type bar
  defined by some extension baz, we are certainly going to fail the
  CREATE TABLE if baz isn't installed.
 
  Now, if bdr is installed but the validation doesn't happen unless bdr
  is loaded in some sense, then that is an implementation deficiency
  that I think we can insist be rectified before this feature is
accepted.
 
  So, I spent some time on this patch the last couple of days to add some
  validation.  But after submitting it, it seems to me that there wasn't
  as much consensus on how to handle validation as I'd first thought.
 
  So, the first and simplest way to go about this, of course, is just not
  validate anything. This is what Fabrízio's patch does.  So the table
  owner can execute
ALTER TABLE mary SET
(supercalifragilisticexpialidocious.umbrella_flight = 'hell yeah')
  and that would be it.  Whether a module makes use of this later or not,
  is not that guy's problem.  This is mostly what we do for GUCs, note, so
  it's not exactly groundbreaking.

 If a module fails to use a parameter that may be a problem. But
 forcing us to validate this using user written code may not improve
 the situation.

 What happens if I have two extensions that both use the namespace foo?
 That means we would run two validation routines on it, and if they
 disagree on the set of options and values then we are hosed.

 -1 to *requiring* validation for table-level options for exactly the
 same reasons we no longer validate custom GUCs.


In a previous email [1] I asked about alternatives to drive the work, but
unfortunately no one replied. So, because we already do that for custom GUCs,
and it is the simplest way to implement this feature, that is what I did.

Regards,

[1]
http://www.postgresql.org/message-id/cafcns+r1zxtruzleceuj1sf9qr6ciks7he-esmkzoznh4nx...@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 -1 to *requiring* validation for table-level options for exactly the
 same reasons we no longer validate custom GUCs.

Well, that is an interesting analogy, but I'm not sure how much it applies
here.  In the case of a GUC, you can fairly easily validate it once the
module does get loaded (and before the module actually tries to do
anything with it).  I don't see how that's going to work for table
options.  I trust nobody is seriously proposing that on module load, we're
going to scan the whole of pg_class looking to see if there are incorrect
settings.  (Even if we did, what would we do about it?  Not try to force a
pg_class update, for sure.  And what if the module is loading into the
postmaster thanks to a preload spec?)

I don't really think partial validation makes sense.  We could just remove
the whole topic, and tell extension authors that it's up to them to defend
themselves against bizarre values stored for their table options.  But I'm
wondering if there's really so much use-case for a feature like that.

regards, tom lane




[HACKERS] The case against multixact GUCs

2014-03-11 Thread Josh Berkus
Hackers,

In the 9.3.3 updates, we added three new GUCs to control multixact
freezing.  This was an unprecedented move in my memory -- I can't recall
ever adding a GUC to a minor release which wasn't backwards
compatibility for a security fix.  This was a mistake.

What makes these GUCs worse is that nobody knows how to set them; nobody
on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
users (or 1 in 10 people on this list) know what a multixact *is*.

Further, there's no clear justification why these cannot be set to be
the same as our other freeze ages (which our users also don't
understand), or a constant calculated portion of them, or just a
constant.  Since nobody anticipated someone adding a GUC in a minor
release, there was no discussion of this topic that I can find; the new
GUCs were added as a side effect of fixing the multixact vacuum issue.
 Certainly I would have raised a red flag if the discussion of the new
GUCs hadn't been buried deep inside really long emails.

Adding new GUCs which nobody has any idea how to set, or can even
explain to new users, is not a service to our users.  These should be
removed.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] logical decoding documentation?

2014-03-11 Thread Peter Eisentraut
Where, if anywhere, is the current documentation for writing or using a
logical decoding output plugin consumer thingy?

I'm trying to find my way into it ...

src/backend/replication/logical/logical.c, which textually contains most
of the functions that appear to interact with the test_decoding module,
contains this in the header comment:


The idea is that a consumer provides three callbacks, one to read WAL,
one to prepare a data write, and a final one for actually writing since
their implementation depends on the type of consumer.  Check
logicalfunc.c for an example implementations of a fairly simple consumer
and a implementation of a WAL reading callback that's suitable for
simpler consumers.


There is no file logicalfunc.c.  And test_decoding actually uses five
callbacks, not three.

Is a consumer the same as a decoder?

test_decoding.c contains this:

/* These must be available to pg_dlsym() */
static void pg_decode_startup(LogicalDecodingContext *ctx,
OutputPluginOptions *opt, bool is_init);
...

which is surely wrong.




Re: [HACKERS] logical decoding documentation?

2014-03-11 Thread Andres Freund
Hi,

On 2014-03-11 15:57:39 -0400, Peter Eisentraut wrote:
 Where, if anywhere, is the current documentation for writing or using a
 logical decoding output plugin consumer thingy?

There's a pending patch for it. The corresponding commit is
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commit;h=5eeedd55b2d7e53b5fdcdab6a8e74bb666d75bcc

I welcome feedback about it. I've spent a fair bit of time immersed in
this stuff, and I am not really sure anymore what's understandable and
whatnot ;)

It's referencing pg_recvlogical which isn't committed yet (that's the
commit just before), that's why the docs weren't committed with the
feature itself.

 src/backend/replication/logical/logical.c, which textually contains most
 of the functions that appear to interact with the test_decoding module,
 contains this in the header comment:
 
 
 The idea is that a consumer provides three callbacks, one to read WAL,
 one to prepare a data write, and a final one for actually writing since
 their implementation depends on the type of consumer.  Check
 logicalfunc.c for an example implementations of a fairly simple consumer
 and a implementation of a WAL reading callback that's suitable for
 simpler consumers.
 
 
 There is no file logicalfunc.c.

Hrmpf: There's a missing 's', it's logicalfuncs.c.

 And test_decoding actually uses five callbacks, not three.

The callbacks the logicalfuncs.c header comment is talking about are for
adding a new method of outputting data. Currently you can stream out
changes via walsender and via the SQL SRFs. But it might be interesting to
e.g. consume the changes in a bgworker, without going through either SQL
or walsender. To do that you need the three callbacks referenced above.
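As a structural sketch of those three callbacks (in Python with illustrative field names; the real API is C and lives in logical.c/logicalfuncs.c), a consumer might be modelled like this:

```python
# Structural sketch, not the real C API: the three callbacks a consumer
# provides are one to read WAL, one to prepare a data write, and one to
# actually perform the write.  All names here are illustrative only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ConsumerCallbacks:
    read_wal: Callable[[int], bytes]     # fetch a WAL record at a position
    prepare_write: Callable[[], None]    # e.g. start an output message
    do_write: Callable[[bytes], None]    # ship the decoded data out

out = []
cb = ConsumerCallbacks(
    read_wal=lambda lsn: b"record@%d" % lsn,
    prepare_write=lambda: out.append(b"HDR"),
    do_write=lambda data: out.append(data),
)

record = cb.read_wal(42)
cb.prepare_write()
cb.do_write(record)
print(out)  # [b'HDR', b'record@42']
```

The point of the split is that a walsender, an SQL SRF, and a hypothetical bgworker consumer can each plug in their own implementations of these three hooks while sharing the decoding machinery.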

 Is a consumer the same as a decoder?

A consumer is just the recipient of the changestream, i.e. the walsender
that streams out the changestream, or the SRF that spills the data into
a tuplestore.

I don't think the term decoder is used anywhere, but if it is, it'd
be the output plugin.

 test_decoding.c contains this:
 
 /* These must be available to pg_dlsym() */
 static void pg_decode_startup(LogicalDecodingContext *ctx,
 OutputPluginOptions *opt, bool is_init);
 ...
 
 which is surely wrong.

Hm, these days the comment should be above _PG_init() and
_PG_output_plugin_init(). That changed around several times...

Could you perhaps commit the attached patch fixing the issues you
mentioned?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/contrib/test_decoding/test_decoding.c b/contrib/test_decoding/test_decoding.c
index e356c7c..31aa012 100644
--- a/contrib/test_decoding/test_decoding.c
+++ b/contrib/test_decoding/test_decoding.c
@@ -33,6 +33,7 @@
 
 PG_MODULE_MAGIC;
 
+/* These must be available to pg_dlsym() */
 extern void		_PG_init(void);
 extern void		_PG_output_plugin_init(OutputPluginCallbacks *cb);
 
@@ -43,7 +44,6 @@ typedef struct
 	bool		include_timestamp;
 } TestDecodingData;
 
-/* These must be available to pg_dlsym() */
 static void pg_decode_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 			  bool is_init);
 static void pg_decode_shutdown(LogicalDecodingContext *ctx);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 13a22d4..04e407a 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -12,14 +12,17 @@
  *together provide logical decoding, primarily by providing so
  *called LogicalDecodingContexts. The goal is to encapsulate most of the
  *internal complexity for consumers of logical decoding, so they can
- *create and consume a changestream with a low amount of code.
+ *create and consume a changestream with a low amount of code. Builtin
+ *consumers are the walsender and SQL SRF interface, but it's possible to
+ *add further ones without changing core code, e.g. to consume changes in
+ *a bgworker.
  *
  *The idea is that a consumer provides three callbacks, one to read WAL,
  *one to prepare a data write, and a final one for actually writing since
  *their implementation depends on the type of consumer.  Check
- *logicalfunc.c for an example implementations of a fairly simple consumer
+ *logicalfuncs.c for an example implementation of a fairly simple consumer
  *and a implementation of a WAL reading callback that's suitable for
- *simpler consumers.
+ *simple consumers.
  *-
  */
 



Re: [HACKERS] proposal (9.5) : psql unicode border line styles

2014-03-11 Thread Pavel Stehule
Hello

I had to reduce the allowed line styles to single or double, because the
Unicode box-drawing characters only provide the combinations
single+double and single+thick.

postgres=# \l
  List of databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access
privileges
---+--+--+-+-+---
 postgres  | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=c/postgres  +
   |  |  | | |
postgres=CTc/postgres
 template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=c/postgres  +
   |  |  | | |
postgres=CTc/postgres
(3 rows)

postgres=# \pset border 2
Border style (border) is 2.
postgres=# \pset linestyle unicode
Line style (linestyle) is unicode.
postgres=# \l
   List of databases
┌───┬──┬──┬─┬─┬───┐
│   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   │
├───┼──┼──┼─┼─┼───┤
│ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   │
│ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
│ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
└───┴──┴──┴─┴─┴───┘
(3 rows)

postgres=# \pset unicode_header_linestyle double
Unicode border linestyle is double.
postgres=# \l
   List of databases
┌───┬──┬──┬─┬─┬───┐
│   Name│  Owner   │ Encoding │   Collate   │Ctype│   Access
privileges   │
╞═══╪══╪══╪═╪═╪═══╡
│ postgres  │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8
│   │
│ template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
│ template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
=c/postgres  ↵│
│   │  │  │ │ │
postgres=CTc/postgres │
└───┴──┴──┴─┴─┴───┘
(3 rows)

postgres=#
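As a rough model of the proposed settings, each line element maps to one of the allowed styles, and a style selects a set of Unicode box-drawing characters. The character tables below are illustrative, not psql's actual ones:

```python
# Rough model: each line element gets one of the allowed styles, and a
# style selects the Unicode box-drawing characters used to draw it.
# These tables are illustrative, not psql's actual ones.
HORIZ = {"single": "─", "double": "═", "thick": "━"}
TEE = {"single": "┬", "double": "╦", "thick": "┳"}
CORNERS = {"single": ("┌", "┐"), "double": ("╔", "╗"), "thick": ("┏", "┓")}

def top_border(widths, style="double"):
    """Draw the top border line for columns of the given widths."""
    left, right = CORNERS[style]
    return left + TEE[style].join(HORIZ[style] * w for w in widths) + right

print(top_border([3, 12], style="double"))
print(top_border([3, 12], style="single"))
```

The header separator, row separator, and column separator would each get the same treatment with their own style lookup, which is the per-element configurability the \pset proposal describes.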

Regards

Pavel



2014-03-07 19:24 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

 I am returning back to this topic. Last time I proposed styles:


 http://www.postgresql.org/message-id/cafj8prclgoktryjpbtoncpgyftrcz-zgfowdc1jqulb+ede...@mail.gmail.com

 http://postgres.cz/wiki/Pretty_borders_in_psql

 This experiment fails, but there are some interesting tips in discuss.

 So I propose a slightly different proposal: choose one predefined style
 for each table line element. These styles are active only when linestyle
 is unicode.

 So possible line elements are:

 * border,
 * header_separator,
 * row_separator,
 * column_separator,

 Possible styles (for each element)

 * none,
  * single,
 * double,
 * thick,

 It should have enough variability to define all the styles proposed earlier.
 I hope this proposal is safe and simple to use. Styles should be
 saved persistently in the .psqlrc file - and some examples can be in the
 documentation.

 Usage:

 \pset linestyle_border double
 \pset linestyle_header_separator single
 \pset linestyle_row_separator single
 \pset linestyle_column_separator single

 \pset linestyle unicode

 ╔═══╤╤═══╗
 ║ a │ b  │   c   ║
 ╟───┼┼───╢
 ║ 1 │ 2012-05-24 │ Hello ║
 ╟───┼┼───╢
 ║ 2 │ 2012-05-25 │ Hello ║
 ║   ││ World ║
 ╚═══╧╧═══╝
 (2 rows)


 Comments, ideas ?

 Regards

 Pavel








From ea7a046afbd6f517b37606f3e52bd2e401d1fad2 Mon Sep 17 00:00:00 2001
From: Pavel Stehule pavel.steh...@gooddata.com
Date: Tue, 11 Mar 2014 20:57:08 +0100
Subject: [PATCH] initial

---
 doc/src/sgml/ref/psql-ref.sgml |  36 
 src/bin/psql/command.c | 118 +++-
 src/bin/psql/print.c   | 199 -
 src/bin/psql/print.h   |  22 -
 src/bin/psql/startup.c |   5 ++
 src/bin/psql/tab-complete.c|  13 ++-
 6 files changed, 345 insertions(+), 48 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8813be8..3e7e748 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2273,6 +2273,42 @@ lo_import 152801
   /para
   /listitem
   

Re: [HACKERS] The case against multixact GUCs

2014-03-11 Thread Alvaro Herrera
Sigh ...

Josh Berkus wrote:

 Further, there's no clear justification why these cannot be set to be
 the same as our other freeze ages (which our users also don't
 understand), or a constant calculated portion of them, or just a
 constant.

Calculated portion was my first proposal.  The objection that was raised
was that there's no actual correlation between Xid consumption rate and
multixact consumption rate.  That observation is correct; in some use
cases multixacts will be consumed faster, in others they will be
consumed slower.  So there's no way to have multixact cleanup not cause
extra autovacuum load if we don't have the parameters.

 Since nobody anticipated someone adding a GUC in a minor
 release, there was no discussion of this topic that I can find; the new
 GUCs were added as a side effect of fixing the multixact vacuum issue.
  Certainly I would have raised a red flag if the discussion of the new
 GUCs hadn't been buried deep inside really long emails.

When problems are tough, explanations get long.  There's no way around
that.  I cannot go highlighting text in red hoping that people will read
those parts.

 Adding new GUCs which nobody has any idea how to set, or can even
 explain to new users, is not a service to our users.  These should be
 removed.

I don't think we're going to remove the parameters.  My interpretation
of the paragraph above is "can we please have some documentation that
explains how to set these parameters".  To that, the answer is sure, we
can.  However, I don't have time to write it at this point.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] The case against multixact GUCs

2014-03-11 Thread David Johnston
Josh Berkus wrote
 Hackers,
 
 In the 9.3.3 updates, we added three new GUCs to control multixact
 freezing.  This was an unprecedented move in my memory -- I can't recall
 ever adding a GUC to a minor release which wasn't backwards
 compatibility for a security fix.  This was a mistake.

It probably should have been handled better but the decision to make these
parameters visible itself doesn't seem to be the wrong decision - especially
when limited to a fairly recently released back-branch.


 What makes these GUCs worse is that nobody knows how to set them; nobody
 on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
 users (or 1 in 10 people on this list) know what a multixact *is*.

That isn't a reason in itself to not have the capability if it is actually
needed.


 Further, there's no clear justification why these cannot be set to be
 the same as our other freeze ages (which our users also don't
 understand), or a constant calculated portion of them, or just a
 constant.  Since nobody anticipated someone adding a GUC in a minor
 release, there was no discussion of this topic that I can find; the new
 GUCs were added as a side effect of fixing the multixact vacuum issue.
  Certainly I would have raised a red flag if the discussion of the new
 GUCs hadn't been buried deep inside really long emails.

The release documentation makes a pointed claim that the situation WAS that
the two were identical; but the different consumption rates dictated making
the multi-xact configuration independently configurable.  So in effect the
GUC was always present - just not user-visible.

Even if there are not any current best practices surrounding this topic, at
least this way there is an actual place to put the derived value as methods
are developed.  As a starting point one can simply look at the defaults and,
if they have changed the value of the non-multixact setting, apply the same
factor to the corresponding multixact one.

Now, obviously someone has to think to actually do that - and the release
notes probably should have provided such guidance - but as I state
explicitly below the issue is more about insufficient communication and
education and less about providing the flexibility.


 Adding new GUCs which nobody has any idea how to set, or can even
 explain to new users, is not a service to our users.  These should be
 removed.

Or we should insist that those few that do have an understanding create some
kind of wiki document, or even a documentation section, to educate those
that are not as knowledgeable in this area.

For good reason much of the recent focus in this area has been on actually
getting the feature to work.  Presuming that it is a desirable feature -
which it hopefully is, given that it made it into the wild - such focus has
obviously been necessary given the apparent complexity of the feature (as
evidenced by the recent serious bug reports), but hopefully the feature
itself is now mostly locked down and education can begin.

David J.








Re: [HACKERS] GIN improvements part2: fast scan

2014-03-11 Thread Tomas Vondra
Hi all,

a quick question that just occurred to me - do you plan to tweak the cost
estimation for GIN indexes, in this patch?

IMHO it would be appropriate, given the improvements and gains, but it
seems to me gincostestimate() was not touched by this patch.

I just ran into this while testing some jsonb stuff, and after creating
a GIN and GIST indexes on the same column, I get these two plans:

===

db=# explain analyze select count(*) from messages_2 where headers ?
'x-virus-scanned';

  QUERY PLAN

 Aggregate  (cost=1068.19..1068.20 rows=1 width=0) (actual
time=400.149..400.150 rows=1 loops=1)
   ->  Bitmap Heap Scan on messages_2  (cost=10.44..1067.50 rows=278
width=0) (actual time=27.974..395.840 rows=70499 loops=1)
 Recheck Cond: (headers ? 'x-virus-scanned'::text)
 Rows Removed by Index Recheck: 33596
 Heap Blocks: exact=40978
         ->  Bitmap Index Scan on messages_2_gist_idx  (cost=0.00..10.37
rows=278 width=0) (actual time=21.762..21.762 rows=104095 loops=1)
   Index Cond: (headers ? 'x-virus-scanned'::text)
 Planning time: 0.052 ms
 Total runtime: 400.179 ms
(9 rows)

Time: 400,467 ms

db=# drop index messages_2_gist_idx;
DROP INDEX

db=# explain analyze select count(*) from messages_2 where headers ?
'x-virus-scanned';
  QUERY PLAN

 Aggregate  (cost=1083.91..1083.92 rows=1 width=0) (actual
time=39.130..39.130 rows=1 loops=1)
   ->  Bitmap Heap Scan on messages_2  (cost=26.16..1083.22 rows=278
width=0) (actual time=11.285..36.248 rows=70499 loops=1)
 Recheck Cond: (headers ? 'x-virus-scanned'::text)
 Heap Blocks: exact=23896
         ->  Bitmap Index Scan on messages_2_gin_idx  (cost=0.00..26.09
rows=278 width=0) (actual time=7.974..7.974 rows=70499 loops=1)
   Index Cond: (headers ? 'x-virus-scanned'::text)
 Planning time: 0.064 ms
 Total runtime: 39.160 ms
(8 rows)

Time: 39,509 ms

===

So while the GIN plan seems to be just slightly more expensive than the
GiST one, it's actually way faster.

Granted, most won't have GIN and GIST index on the same column at the
same time, but bad cost estimate may cause other issues. Maybe I could
achieve this by tweaking the various cost GUCs, but ISTM that tweaking
the cost estimation would be appropriate.

regards
Tomas




Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Tomas Vondra
Hi,

I've spent a few hours stress-testing this a bit - loading a mail
archive with ~1M of messages (with headers stored in a jsonb column) and
then doing queries on that. Good news - no crashes or any such issues so
far. The queries that I ran manually seem to return sane results.

The only problem I ran into is with limited index row size with GIN
indexes. I understand it's not a bug, but I admit I haven't realized I
might run into it in this case ...

The data I used for testing is just a bunch of e-mail messages, with
headers stored as jsonb, so each row has something like this in
headers column:

{
 from : John Doe j...@example.com,
 to : [Jane Doe j...@example.com, Jack Doe j...@example.com],
 cc : ...,
 bcc : ...,
 ... various other headers ...
}

The snag is that some of the header values may be very long, exceeding
the limit of 1352 bytes and causing errors like this:

  ERROR:  index row size 1416 exceeds maximum 1352 for index gin_idx

A good example of such header is dkim-signature which basically
contains the whole message digitally signed with DKIM. The signature
tends to be long and non-compressible, thanks to the signature.

I'm wondering what's the best way around this, because I suspect many
new users (especially those attracted by jsonb and GIN improvements)
will run into this. Maybe not immediately, but eventually they'll try to
insert a jsonb with a long value, and it will fail ...

With btree indexes on text I would probably create an index on
substr(column,0,1000) or something like that, but doing that with JSON
seems a bit strange.
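For plain text that workaround would look something like this (sketch only -
the "subject" key and the index name are made up, not from my test data):

```sql
-- Hypothetical btree workaround: index only a prefix of one extracted
-- value, so no single index row can exceed the size limit.
CREATE INDEX messages_2_subject_idx
    ON messages_2 (substr(headers ->> 'subject', 0, 1000));
```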

I assume we need to store the actual values in the GIN index (so a hash
is not sufficient), right?

GIST indexes work, but with that I have to give up the significant
performance gains that we got thanks to Alexander's GIN patches.

regards
Tomas




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 03/11/2014 06:57 AM, Tom Lane wrote:
 Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
 has been.  I'm just expecting lots of push-back if we try.  And it's kind
 of hard to resist push-back when you don't have a substitute to offer.

 Yeah, what we really need is encapsulated per-DB users and local
 superusers.  I think every agrees that this is the goal, but nobody
 wants to put in the work to implement a generalized solution.

Well ... you know, how hard would it really be?  Extend the primary
key of pg_authid to be (username, database_oid) with the convention
of database_oid = 0 for a globally known user.  Add some syntax to
CREATE USER to indicate whether you're creating a global user
(the default) or one known only within the current database.  I'm
not quite sure what to do with local users for pg_dump/pg_dumpall;
perhaps pg_dump should get the responsibility of creating local users?
But otherwise it seems like there are no issues that we'd not have to
solve anyway if we wanted to make db_user_name less of a crock.
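Purely as illustration, the syntax might look something like this (invented
grammar - nothing like it exists today):

```sql
-- Hypothetical syntax sketch for the two kinds of users:
CREATE USER admin PASSWORD 'xyzzy';                -- global: (admin, 0)
CREATE USER joe IN DATABASE db1 PASSWORD 'xyzzy';  -- local:  (joe, db1's oid)
```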

In particular, I'd like to see an exclusion that prevents local users
from having the same name as any global user, so that we don't have
ambiguity in GRANT and similar commands.  This doesn't seem simple to
enforce (if we supported partial indexes on system catalogs, it would
be ...) but surely this representation is more amenable to enforcing it
than the existing one.

regards, tom lane




Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Peter Geoghegan
On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote:
   ERROR:  index row size 1416 exceeds maximum 1352 for index gin_idx

All index AMs have similar restrictions.

 A good example of such header is dkim-signature which basically
 contains the whole message digitally signed with DKIM. The signature
 tends to be long and non-compressible, thanks to the signature.

 I'm wondering what's the best way around this, because I suspect many
 new users (especially those attracted by jsonb and GIN improvements)
 will run into this. Maybe not immediately, but eventually they'll try to
 insert a jsonb with a long value, and it will fail ...

The jsonb_hash_ops operator class just stores a 32-bit integer hash
value (it always sets the recheck flag, which only some of the other
default GIN opclass' strategies do). It only supports containment, and
not the full variety of operators that the default opclass supports,
which is why it isn't the default. I think that in practice the
general recommendation will be that when indexing at the top level,
use jsonb_hash_ops. When indexing nested items, use the more flexible
default GIN opclass. That seems like a pretty smart trade-off to me.
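In DDL terms the trade-off would look roughly like this (a sketch; table
and index names borrowed from Tomas' test):

```sql
-- Default opclass: supports ?, ?|, ?& and @>, but stores the keys and
-- values themselves, so a long value can exceed the index row size limit.
CREATE INDEX messages_2_gin_idx ON messages_2 USING gin (headers);

-- jsonb_hash_ops: stores only a 32-bit hash per item, so long values are
-- fine, but only containment is supported and the heap is always rechecked.
CREATE INDEX messages_2_gin_hash_idx
    ON messages_2 USING gin (headers jsonb_hash_ops);
```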

The more I think about it, the more inclined I am to lose GiST support
entirely for the time being. It lets us throw out about 700 lines of C
code, which is a very significant fraction of the total, removes the
one open bug, and removes the least understood part of the code. The
GiST opclass is not particularly compelling for this.

-- 
Peter Geoghegan




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-11 Thread Simon Riggs
On 11 March 2014 18:33, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 -1 to *requiring* validation for table-level options for exactly the
 same reasons we no longer validate custom GUCs.

 Well, that is an interesting analogy, but I'm not sure how much it applies
 here.  In the case of a GUC, you can fairly easily validate it once the
 module does get loaded (and before the module actually tries to do
 anything with it).  I don't see how that's going to work for table
 options.  I trust nobody is seriously proposing that on module load, we're
 going to scan the whole of pg_class looking to see if there are incorrect
 settings.  (Even if we did, what would we do about it?  Not try to force a
 pg_class update, for sure.  And what if the module is loading into the
 postmaster thanks to a preload spec?)

Thank goodness for that. Strict validation does seem scary.

 I don't really think partial validation makes sense.  We could just remove
 the whole topic, and tell extension authors that it's up to them to defend
 themselves against bizarre values stored for their table options.  But I'm
 wondering if there's really so much use-case for a feature like that.

DBAs are fairly used to the idea that if you put crap data in the
database then bad things happen. We provide the table, they provide
the data. Validation is possible, but not enforced as essential.
(Except in terms of the datatype - but then we are also validating
data to specific types here).

So I think that DBAs will also cope rather well with table-level
options without us nannying them.

There is nothing more annoying than needing to run scripts in a
specific sequence to make them work, or dumps that fail because
certain modules aren't loaded yet (or cannot ever be so). And maybe
the DBA wants to annotate tables based on a design and then later move
to implement modules to take advantage of the annotation.

Having an option be set and yet be unvalidated and/or unused is no
more annoying than having a column in a table that is known incorrect
and/or not accessed. Searching for badly set options needs to be
possible, even easy, but hard validation can cause problems. And if we
try and force it, what's to stop people from using a dummy validator
just to circumvent the strictness?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 The docs say:

 db_user_namespace causes the client's and server's user name
 representation to differ. Authentication checks are always done with
 the server's user name so authentication methods must be configured
 for the server's user name, not the client's. Because md5 uses the
 user name as salt on both the client and server, md5 cannot be used
 with db_user_namespace.

 Is that the only major issue?

That's one major issue.

Another one is the hacky way of distinguishing global from per-db users
(ie, user must append @ to log in as a global user).  I'd love to get rid
of that requirement, but not sure how.  Would it be all right for the
server to first probe for user@db and then for just user, or vice versa?
How would this integrate with pg_hba.conf?

 Why not have the server strip out the @db part if this is on?

Hmm ... that's a thought.  IIRC, the client doesn't actually know that
this is going on, it just sends the user name, and hashes against that
too.  If the server remembered the bare user name it could hash against
that as well.

At least, that would work for db-local usernames.  It *doesn't* work for
global names, because the client side has no idea that the @ ought to not
be counted for hashing.  Again, if we could get rid of that convention,
it'd be far less messy.
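The salt mismatch is easy to see with the built-in md5() function (assume
the password is 'secret'):

```sql
-- What gets stored is 'md5' || md5(password || username).  With
-- db_user_namespace on, the server-side name is 'joe@db1' while the
-- client hashes with plain 'joe', so the two can never agree:
SELECT 'md5' || md5('secret' || 'joe')     AS client_side,
       'md5' || md5('secret' || 'joe@db1') AS server_side;
```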

A possible objection would be that changing the definition of what to hash
would invalidate existing MD5 password hashes; but since MD5 passwords
have never been allowed with db_user_namespace names anyway, that doesn't
seem very forceful.

 If we made this an initdb-time setting rather than a 
 GUC then we'd remove the problems caused by turning this on and off.

Why do we need to restrict that?

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 07:41 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

The docs say:
 db_user_namespace causes the client's and server's user name
 representation to differ. Authentication checks are always done with
 the server's user name so authentication methods must be configured
 for the server's user name, not the client's. Because md5 uses the
 user name as salt on both the client and server, md5 cannot be used
 with db_user_namespace.
Is that the only major issue?

That's one major issue.

Another one is the hacky way of distinguishing global from per-db users
(ie, user must append @ to log in as a global user).  I'd love to get rid
of that requirement, but not sure how.  Would it be all right for the
server to first probe for user@db and then for just user, or vice versa?


user@db first, I think. I guess that means don't name your global users 
the same as db-specific users. Maybe we should prevent a conflict? Or if 
we allow a conflict then only require user@ in conflicting cases.




How would this integrate with pg_hba.conf?


Not sure.




Why not have the server strip out the @db part if this is on?

Hmm ... that's a thought.  IIRC, the client doesn't actually know that
this is going on, it just sends the user name, and hashes against that
too.  If the server remembered the bare user name it could hash against
that as well.

At least, that would work for db-local usernames.  It *doesn't* work for
global names, because the client side has no idea that the @ ought to not
be counted for hashing.  Again, if we could get rid of that convention,
it'd be far less messy.



Right.



A possible objection would be that changing the definition of what to hash
would invalidate existing MD5 password hashes; but since MD5 passwords
have never been allowed with db_user_namespace names anyway, that doesn't
seem very forceful.


Agreed.




If we made this an initdb-time setting rather than a
GUC then we'd remove the problems caused by turning this on and off.

Why do we need to restrict that?





Oh, probably a thinko on my part. I thought the setting would affect the
stored password, but now that I think about it some more I see it doesn't.



I think we might be on the track of something useful here.


cheers

andrew




Re: [HACKERS] jsonb and nested hstore

2014-03-11 Thread Peter Geoghegan
On Tue, Mar 11, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote:
 I think that in practice the
 general recommendation will be that when indexing at the top level,
 use jsonb_hash_ops. When indexing nested items, use the more flexible
 default GIN opclass. That seems like a pretty smart trade-off to me.

By which I mean: index nested items using an expressional GIN index.
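For instance (a sketch, using the "to" key from Tomas' test data earlier in
the thread):

```sql
-- Expression index on one nested item rather than the whole column:
CREATE INDEX messages_2_to_gin_idx
    ON messages_2 USING gin ((headers -> 'to'));
```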


-- 
Peter Geoghegan




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Josh Berkus
On 03/11/2014 06:57 AM, Tom Lane wrote:
 Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
 has been.  I'm just expecting lots of push-back if we try.  And it's kind
 of hard to resist push-back when you don't have a substitute to offer.

Yeah, what we really need is encapsulated per-DB users and local
superusers.  I think every agrees that this is the goal, but nobody
wants to put in the work to implement a generalized solution.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 09:37 PM, Tom Lane wrote:



In particular, I'd like to see an exclusion that prevents local users
from having the same name as any global user, so that we don't have
ambiguity in GRANT and similar commands.  This doesn't seem simple to
enforce (if we supported partial indexes on system catalogs, it would
be ...) but surely this representation is more amenable to enforcing it
than the existing one.




Should be workable if you're creating a local name - just check against 
the list of global roles. For creating global names, maybe we could keep 
an invisible shared catalog (or maybe only visible to global superusers) 
of local names / db oids. New global names would be checked against that 
catalog. Something like that.


cheers

andrew




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 03/11/2014 09:37 PM, Tom Lane wrote:
 In particular, I'd like to see an exclusion that prevents local users
 from having the same name as any global user, so that we don't have
 ambiguity in GRANT and similar commands.  This doesn't seem simple to
 enforce (if we supported partial indexes on system catalogs, it would
 be ...) but surely this representation is more amenable to enforcing it
 than the existing one.

 Should be workable if you're creating a local name - just check against 
 the list of global roles.

Concurrent creations won't be safe without some sort of locking scheme.
A unique index would be a lot better way of plugging that hole than a
system-wide lock on user creation.  But not sure how to define a unique
index that allows (joe, db1) to coexist with (joe, db2) but not with
(joe, 0).
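The difficulty can be sketched in plain SQL (a hypothetical table standing
in for pg_authid):

```sql
-- A two-column unique index gives per-database uniqueness ...
CREATE TABLE authid_sketch (rolname name NOT NULL, roldb oid NOT NULL);
CREATE UNIQUE INDEX ON authid_sketch (rolname, roldb);
-- ... so (joe, db1) and (joe, db2) can coexist.  But nothing rejects
-- (joe, 0) alongside them: the conflicting rows differ in roldb, so the
-- unique index never compares them.
```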

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Aidan Van Dyk
So I'll admit to using it, only in toy setups...

I use it with trust and ident, on local connections though, not password

I try to keep my laptops clean of mysqld, and I use PG.  And only on my
laptop/PC, I make a database for every user...  And every app gets a
userid, and a schema.  Every user gets passwordless access to their
database.  And the userid associates the app, and the defaults that get
used on their connections.

So, I think it's neat, but wouldn't be put out if it was removed ...


 On Tue, Mar 11, 2014 at 9:47 AM, Magnus Hagander mag...@hagander.net wrote:

 On Tue, Mar 11, 2014 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Sun, Mar 9, 2014 at 9:00 PM, Thom Brown t...@linux.com wrote:
  It will be 12 years this year since this temporary measure was
  added.  I'm just wondering, is there any complete solution that
  anyone had in mind yet?  Or should this just be deprecated?

  I'd say +1 to remove it. That would also make it possible to get rid of
  password authentication...

 If we remove it without providing a substitute feature, people who are
 using it will rightly complain.

 Are you claiming there are no users, and if so, on what evidence?


 I am claiming that I don't think anybody is using that, yes.

 Based on the fact that I have *never* come across it on any system I've
 come across since, well, forever. Except once I think, many years ago, when
 someone had enabled it by mistake and needed my help to remove it...

 But we should absolutely deprecate it first in that case. Preferably
 visibly (e.g. with a log message when people use it). That could at least
 get the people who use it to let us know they do, so we can figure out
 whether anyone does - and de-deprecate it if so.

 Or if someone wants to fix it properly of course :)

 --
  Magnus Hagander
  Me: http://www.hagander.net/
  Work: http://www.redpill-linpro.com/




-- 
Aidan Van Dyk                        Create like a god,
ai...@highrise.ca                    command like a king,
http://www.highrise.ca/              work like a slave.


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 11:06 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 03/11/2014 09:37 PM, Tom Lane wrote:

In particular, I'd like to see an exclusion that prevents local users
from having the same name as any global user, so that we don't have
ambiguity in GRANT and similar commands.  This doesn't seem simple to
enforce (if we supported partial indexes on system catalogs, it would
be ...) but surely this representation is more amenable to enforcing it
than the existing one.

Should be workable if you're creating a local name - just check against
the list of global roles.

Concurrent creations won't be safe without some sort of locking scheme.
A unique index would be a lot better way of plugging that hole than a
system-wide lock on user creation.  But not sure how to define a unique
index that allows (joe, db1) to coexist with (joe, db2) but not with
(joe, 0).





Create (joe, db1), (joe, db2) ... for each global user? Might get a tad 
ugly if you have lots of global users or lots of databases.


Just trying to be a bit creative :-)

cheers

andrew




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Jaime Casanova
On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).


And why do you want that restriction? When you log in you need to specify
the db, right? If you don't specify the db then it's the global 'joe'
that wants to log in.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Andrew Dunstan


On 03/11/2014 11:50 PM, Jaime Casanova wrote:

On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

But not sure how to define a unique
index that allows (joe, db1) to coexist with (joe, db2) but not with
(joe, 0).


and why you want that restriction? when you login you need to specify
the db, right? if you don't specify the db then is the global 'joe'
that want to login



You can't login without specifying a db. Every session is connected to a db.

cheers

andrew




Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-11 Thread Rajeev rastogi
On 11 March 2014 19:52, Tom Lane wrote:

 After sleeping on it, I'm inclined to think we should continue to not
 print status for COPY TO STDOUT.  Aside from the risk of breaking
 scripts, there's a decent analogy to be made to SELECT: we don't print
 a status tag for that either.

It is correct that SELECT does not print a conventional status tag, but it
still prints the number of rows selected (e.g. (2 rows)) along with the
actual row values, which can very well be considered a kind of status.
From this result the user can make out how many rows have been selected.

But in case of COPY TO STDOUT, if we don't print anything, then the user
has no direct way of finding out how many rows were copied from the table
to STDOUT, which might have been very useful.

Please let me know your opinion or if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi







Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread Jaime Casanova
On Tue, Mar 11, 2014 at 10:52 PM, Andrew Dunstan and...@dunslane.net wrote:

 On 03/11/2014 11:50 PM, Jaime Casanova wrote:

 On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

 and why you want that restriction? when you login you need to specify
 the db, right? if you don't specify the db then is the global 'joe'
 that want to login


 You can't login without specifying a db. Every session is connected to a db.


then, what's the problem with global users?

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157




Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

2014-03-11 Thread Haribabu Kommi
On Thu, Mar 6, 2014 at 10:15 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2014-03-06 18:17 GMT+09:00 Haribabu Kommi kommi.harib...@gmail.com:
 I will update you later regarding the performance test results.


I ran the performance test on the cache scan patch and below are the readings.

Configuration:

Shared_buffers - 512MB
cache_scan.num_blocks - 600
checkpoint_segments - 255

Machine:
OS - centos - 6.4
CPU - 4 core 2.5 GHZ
Memory - 4GB

                 Head      Patched    Diff
Select - 500K    772ms     2659ms     -200%
Insert - 400K    3429ms    1948ms     43%  (I am not sure how it improved in this case)
delete - 200K    2066ms    3978ms     -92%
update - 200K    3915ms    5899ms     -50%

This patch shows very well how the custom scan interface can be used, but
the patch itself has some performance problems which need to be
investigated.

I attached the test script file used for the performance test.

Regards,
Hari Babu
Fujitsu Australia
shared_buffers = 512MB
shared_preload_libraries = 'cache_scan'
cache_scan.num_blocks = 600
checkpoint_segments = 255

\timing

--cache scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

CREATE TRIGGER test_cache_row_sync AFTER INSERT OR UPDATE OR DELETE ON test FOR 
ROW EXECUTE PROCEDURE cache_scan_synchronizer();
CREATE TRIGGER test_cache_stmt_sync AFTER TRUNCATE ON test FOR STATEMENT 
EXECUTE PROCEDURE cache_scan_synchronizer();

vacuum analyze test;
explain select count(*) from test where f1 > 0;
select count(*) from test where f1 > 0;
select count(*) from test where f1 > 0;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;

--sequence scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

vacuum analyze test;
set cache_scan.enabled = off;

explain select count(*) from test where f1 > 0;
select count(*) from test where f1 > 0;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;


--cache scan select 500K
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

CREATE TRIGGER test_cache_row_sync AFTER INSERT OR UPDATE OR DELETE ON test FOR 
ROW EXECUTE PROCEDURE cache_scan_synchronizer();
CREATE TRIGGER test_cache_stmt_sync AFTER TRUNCATE ON test FOR STATEMENT 
EXECUTE PROCEDURE cache_scan_synchronizer();

vacuum analyze test;
explain select count(*) from test where f1 % 10 = 5;
select count(*) from test where f1 % 10 = 5;
select count(*) from test where f1 % 10 = 5;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;

--sequence scan select 500K
create table test(f1 int, f2 char(70), f3 float, f4 char(100));

truncate table test;
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia 
software tech pvt ltd');
checkpoint;

vacuum analyze test;
set cache_scan.enabled = off;

explain select count(*) from test where f1 % 10 = 5;
select count(*) from test where f1 % 10 = 5;

delete from test where f1 > 100 and f1 <= 120;

update test set f1 = f1 + 300 where f1 > 120 and f1 <= 140;

insert into test values (generate_series(101, 140), 'fujitsu', 1.1, 
'Australia software tech pvt ltd');

drop table test cascade;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread David Johnston
Andrew Dunstan wrote
 On 03/11/2014 11:50 PM, Jaime Casanova wrote:
 On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane <tgl@.pa> wrote:
 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

 and why do you want that restriction? When you log in you need to specify
 the db, right? If you don't specify the db then it is the global 'joe'
 that wants to log in

 
 You can't log in without specifying a db. Every session is connected to a
 db.
 
 cheers
 
 andrew

Random Thoughts:

So if dave is already a user in db1, only that specific dave can be made a
global user; any other dave would be disallowed.

Would user/password be a better PK, even with the obvious issue that the
password part of the key can change? user/password to database is properly
a many-to-many relationship. Or see next for something simpler.

A simple implementation would have the global users copied into each
database as it is created. There would also be a link from each of the
database-specific users to the global master, so that a password change
issued against the global user propagates to all the database-specific
copies.
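A minimal sketch of that propagation idea. Everything here is invented for
illustration: global_users and local_users are hypothetical stand-ins, not
real catalogs, and this ignores the fact that real per-database user lists
would live in separate databases, so cross-database propagation would need
more machinery than a plain trigger:

```sql
-- hypothetical stand-ins for a shared and a per-database user catalog
CREATE TABLE global_users (
    usename text PRIMARY KEY,
    passwd  text
);

CREATE TABLE local_users (
    usename        text,
    dbname         text,
    passwd         text,
    is_global_copy boolean DEFAULT false,  -- copied over from a global user?
    PRIMARY KEY (usename, dbname)
);

-- push a global password change down to every database-specific copy
CREATE FUNCTION propagate_global_passwd() RETURNS trigger AS $$
BEGIN
    UPDATE local_users
       SET passwd = NEW.passwd
     WHERE usename = NEW.usename
       AND is_global_copy;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER global_passwd_sync
    AFTER UPDATE OF passwd ON global_users
    FOR EACH ROW EXECUTE PROCEDURE propagate_global_passwd();
```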

Construction of a new global user would cause an immediate copy-over, which
would fail if the simple name is already in use in any of the existing
databases. Promotion becomes a problem that would ideally have a solution,
but then you are back to needing to distinguish between dave-db1 and
dave-db2...
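For what it's worth, the unique-index puzzle quoted upthread, allowing
(joe, db1) to coexist with (joe, db2) but not with (joe, 0), can be
approximated with an exclusion constraint rather than a plain unique index.
A sketch against a toy table (not the real shared catalog), assuming
btree_gist is installed and taking dbid = 0 to mean "global":

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- toy stand-in for a shared users catalog
CREATE TABLE toy_users (
    usename text NOT NULL,
    dbid    int  NOT NULL   -- 0 means "global"
);

-- A global user occupies the whole dbid range; a local user occupies just
-- its own db. Overlap means conflict, so (joe,1) and (joe,2) coexist, a
-- duplicate (joe,1) is rejected, and (joe,0) collides with any local joe.
ALTER TABLE toy_users ADD CONSTRAINT toy_users_scope
    EXCLUDE USING gist (
        usename WITH =,
        (CASE WHEN dbid = 0 THEN int4range(NULL, NULL)
              ELSE int4range(dbid, dbid, '[]') END) WITH &&
    );

INSERT INTO toy_users VALUES ('joe', 1);  -- ok
INSERT INTO toy_users VALUES ('joe', 2);  -- ok
INSERT INTO toy_users VALUES ('joe', 0);  -- fails: overlaps both rows above
```

Whether the shared catalog machinery could support anything like a gist
exclusion constraint is another question entirely; this just shows the
uniqueness rule itself is expressible.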

It would be nice if all users could be global and there were some way to
give them permissions on databases.

Or go to the opposite extreme: all users are local, and the concept of
global would have to be added by those desiring it. Basically, move
user management at the cluster level into a cluster support application and
leave databases free-standing.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/db-user-namespace-a-temporary-measure-tp5795305p5795609.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

