[HACKERS] Transactions involving multiple postgres foreign servers

2016-10-03 Thread Masahiko Sawada
On Tue, Oct 4, 2016 at 1:26 PM, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote:
>>>
>>> Why always rollback any dangling transaction? There can be a case that
>>> a foreign server has a dangling transaction which needs to be
>>> committed because the portions of that transaction on the other shards
>>> are committed.
>>
>> Right, we can heuristically make a decision whether we do COMMIT or
>> ABORT on the local server.
>> For example, if COMMIT PREPARED succeeded on at least one foreign
>> server, the local server returns OK to the client, and the other
>> dangling transactions should be committed later.
>> We can find out whether we should commit or abort the dangling
>> transaction by checking CLOG.
>
> Heuristics cannot become the default behavior. A user should be given
> an option to choose a heuristic, and he should be aware of the
> pitfalls when using this heuristic. I guess, first, we need to get a
> solution which ensures that the transaction gets committed on all the
> servers or is rolled back on all the foreign servers involved. AFAIR,
> my patch did that. Once we have that kind of solution, we can think
> about heuristics.

I meant that we could determine it heuristically only when the remote
server crashed in the 2nd phase of 2PC.
For example, what should the local server return to the client when no
remote server returns OK to the local server in the 2nd phase of 2PC for
more than statement_timeout seconds? OK, or an error?

>>
>> But we need to handle the case where the CLOG file containing the XID
>> necessary for resolving a dangling transaction is truncated.
>> If the user runs VACUUM FREEZE just after the remote server crashed,
>> it could be truncated.
>
> Hmm, this needs to be fixed. Even my patch relied on XID to determine
> whether the transaction committed or rolled back locally and thus to
> decide whether it should be committed or rolled back on all the
> foreign servers involved. I think I had taken care of the issue you
> have pointed out here. Can you please verify the same?
>
>>
>>> The way gid is crafted, there is no way to check whether the given
>>> prepared transaction was created by the local server or not. Probably
>>> the local server needs to add a unique signature in GID to identify
>>> the transactions prepared by itself. That signature should be
>>> transferred to the standby to cope with the failover of the local server.
>>
>> Maybe we can use the database system identifier in the control file.
>
> Maybe.
>
>>
>>> In this idea, one has to keep on polling the foreign server to find
>>> any dangling transactions. In the usual scenario, we shouldn't have a
>>> large number of dangling transactions, and thus periodic polling might
>>> be a waste.
>>
>> We can optimize it by storing the XID that is resolved heuristically
>> into the control file or system catalog, for example.
>>
>
> There will be many such XIDs. We don't want to dump so many things in
> the control file, especially when that's not control data. A system
> catalog is out of the question, since a rollback of the local
> transaction would make those rows in the system catalog invisible.
> That's the reason why I chose to write the foreign prepared
> transactions to files rather than a system catalog.
>

We can store the lowest in-doubt transaction ID (call it the in-doubt
XID) that needs to be resolved later in the control file, and never
truncate the CLOG containing XIDs greater than the in-doubt XID.
We need to try to resolve such transactions only when the in-doubt XID
is not NULL.
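A rough C sketch of that idea, assuming a hypothetical
FdwXactGetLowestInDoubtXid() helper that stands in for "read the
in-doubt XID recorded in the control file" (it does not exist);
TransactionIdDidCommit() is the existing CLOG lookup:

#include "postgres.h"
#include "access/transam.h"     /* TransactionIdDidCommit() etc. */

/*
 * Sketch only: clamp the CLOG truncation horizon so that the status
 * bits needed to resolve in-doubt foreign transactions survive
 * VACUUM FREEZE.
 */
static TransactionId
ClampClogTruncationPoint(TransactionId frozenXID)
{
    TransactionId indoubt_xid = FdwXactGetLowestInDoubtXid();

    if (TransactionIdIsValid(indoubt_xid) &&
        TransactionIdPrecedes(indoubt_xid, frozenXID))
        frozenXID = indoubt_xid;  /* keep CLOG back to the in-doubt XID */

    return frozenXID;
}

/*
 * With the CLOG intact, resolving a dangling remote transaction is a
 * lookup of the local outcome: COMMIT PREPARED if the local transaction
 * committed, ROLLBACK PREPARED otherwise.
 */
static bool
ShouldCommitDanglingXact(TransactionId local_xid)
{
    return TransactionIdDidCommit(local_xid);
}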

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Transactions involving multiple postgres foreign servers

2016-10-03 Thread Amit Langote

Hi,

On 2016/10/04 13:26, Ashutosh Bapat wrote:
>>>
>>> Why always rollback any dangling transaction? There can be a case that
>>> a foreign server has a dangling transaction which needs to be
>>> committed because the portions of that transaction on the other shards
>>> are committed.
>>
>> Right, we can heuristically make a decision whether we do COMMIT or
>> ABORT on the local server.
>> For example, if COMMIT PREPARED succeeded on at least one foreign
>> server, the local server returns OK to the client, and the other
>> dangling transactions should be committed later.
>> We can find out whether we should commit or abort the dangling
>> transaction by checking CLOG.
> 
> Heuristics cannot become the default behavior. A user should be given
> an option to choose a heuristic, and he should be aware of the
> pitfalls when using this heuristic. I guess, first, we need to get a
> solution which ensures that the transaction gets committed on all the
> servers or is rolled back on all the foreign servers involved. AFAIR,
> my patch did that. Once we have that kind of solution, we can think
> about heuristics.

I wonder if Sawada-san is referring to some sort of quorum-based
(atomic) commitment protocol [1, 2]. I agree that that would be an
advanced technique for handling limitations such as the blocking nature
of the basic two-phase commit protocol in the face of communication
failures; in other words, it is meant for better availability rather
than correctness.

Thanks,
Amit

[1]
https://en.wikipedia.org/wiki/Quorum_(distributed_computing)#Quorum-based_voting_in_commit_protocols

[2] http://hub.hku.hk/bitstream/10722/158032/1/Content.pdf






Re: [HACKERS] Hash Indexes

2016-10-03 Thread Amit Kapila
On Tue, Oct 4, 2016 at 10:06 AM, Amit Kapila  wrote:
> On Thu, Sep 29, 2016 at 8:27 PM, Amit Kapila  wrote:
>> On Thu, Sep 29, 2016 at 6:04 AM, Robert Haas  wrote:
>>> On Wed, Sep 28, 2016 at 3:04 PM, Robert Haas  wrote:
>>>
>>> As I was looking at the old text regarding deadlock risk, I realized
>>> what may be a serious problem.  Suppose process A is performing a scan
>>> of some hash index.  While the scan is suspended, it attempts to take
>>> a lock and is blocked by process B.  Process B, meanwhile, is running
>>> VACUUM on that hash index.  Eventually, it will do
>>> LockBufferForCleanup() on the hash bucket on which process A holds a
>>> buffer pin, resulting in an undetected deadlock. In the current
>>> coding, A would hold a heavyweight lock and B would attempt to acquire
>>> a conflicting heavyweight lock, and the deadlock detector would kill
>>> one of them.  This patch probably breaks that.  I notice that that's
>>> the only place where we attempt to acquire a buffer cleanup lock
>>> unconditionally; every place else, we acquire the lock conditionally,
>>> so there's no deadlock risk.  Once we resolve this problem, the
>>> paragraph about deadlock risk in this section should be revised to
>>> explain whatever solution we come up with.
>>>
>>> By the way, since VACUUM must run in its own transaction, B can't be
>>> holding arbitrary locks, but that doesn't seem quite sufficient to get
>>> us out of the woods.  It will at least hold ShareUpdateExclusiveLock
>>> on the relation being vacuumed, and process A could attempt to acquire
>>> that same lock.
>>>
>>
>> Right, I think there is a danger of deadlock in the above situation.
>> It needs some more thought.
>>
>
> I think one way to avoid the risk of deadlock in the above scenario is
> to take the cleanup lock conditionally: if we get the cleanup lock,
> then we will delete the items as we do in the patch now; otherwise, we
> will just mark the tuples as dead and ensure that we don't try to
> remove tuples that are moved-by-split.  Now, I think the question is
> how these dead tuples will be removed.  We need a separate mechanism to
> clear dead tuples for hash indexes anyway, since during scans we mark
> tuples as dead if the corresponding tuple in the heap is dead, but they
> are not removed later.  This is already taken care of in the btree code
> via the kill_prior_tuple optimization.  So I think clearing of dead
> tuples can be handled by a separate patch.
>

I think we can also remove the dead tuples the next time vacuum visits
the bucket and is able to acquire the cleanup lock.  Right now, we just
check whether the corresponding heap tuple is dead; we can add an
additional check so that if the current item is already marked dead in
the index, it is considered in the list of deletable items.
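As a rough illustration of the conditional-lock idea (not the actual
patch: _hash_delete_items() and _hash_mark_items_dead() are placeholder
names for whatever the deletion and hint-marking paths end up being,
while ConditionalLockBufferForCleanup() and LockBuffer() are existing
primitives):

#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

static void
hash_vacuum_bucket(Relation rel, Buffer bucket_buf)
{
    if (ConditionalLockBufferForCleanup(bucket_buf))
    {
        /* No scan holds a pin: safe to physically remove dead items. */
        _hash_delete_items(rel, bucket_buf);
        LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
    }
    else
    {
        /*
         * Some scan still has the bucket pinned.  Only mark known-dead
         * tuples with LP_DEAD, skipping moved-by-split tuples; a later
         * vacuum that does get the cleanup lock can then treat the
         * LP_DEAD flag as an additional "deletable" test.
         */
        LockBuffer(bucket_buf, BUFFER_LOCK_EXCLUSIVE);
        _hash_mark_items_dead(rel, bucket_buf);
        LockBuffer(bucket_buf, BUFFER_LOCK_UNLOCK);
    }
}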


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] Un-include access/heapam.h

2016-10-03 Thread Amit Langote
I noticed that un-including access/heapam.h from the following files
leaves things just fine.

src/backend/commands/aggregatecmds.c
src/backend/commands/collationcmds.c
src/backend/commands/conversioncmds.c
src/backend/commands/lockcmds.c

It seems that any calls into heapam.c that there used to be have since
moved to the corresponding backend/catalog/* files or elsewhere.

Attached a patch.

Thanks,
Amit
diff --git a/src/backend/commands/aggregatecmds.c b/src/backend/commands/aggregatecmds.c
index b36f09e..19db38d 100644
--- a/src/backend/commands/aggregatecmds.c
+++ b/src/backend/commands/aggregatecmds.c
@@ -22,7 +22,6 @@
  */
 #include "postgres.h"
 
-#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 0c75d16..9bba748 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,7 +14,6 @@
  */
 #include "postgres.h"
 
-#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
diff --git a/src/backend/commands/conversioncmds.c b/src/backend/commands/conversioncmds.c
index 175d4ab..2c85d26 100644
--- a/src/backend/commands/conversioncmds.c
+++ b/src/backend/commands/conversioncmds.c
@@ -14,7 +14,6 @@
  */
 #include "postgres.h"
 
-#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index 175d1f3..a0c0d75 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -14,7 +14,6 @@
  */
 #include "postgres.h"
 
-#include "access/heapam.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_inherits_fn.h"
 #include "commands/lockcmds.h"



Re: [HACKERS] Hash Indexes

2016-10-03 Thread Amit Kapila
On Thu, Sep 29, 2016 at 8:27 PM, Amit Kapila  wrote:
> On Thu, Sep 29, 2016 at 6:04 AM, Robert Haas  wrote:
>> On Wed, Sep 28, 2016 at 3:04 PM, Robert Haas  wrote:
>>
>> As I was looking at the old text regarding deadlock risk, I realized
>> what may be a serious problem.  Suppose process A is performing a scan
>> of some hash index.  While the scan is suspended, it attempts to take
>> a lock and is blocked by process B.  Process B, meanwhile, is running
>> VACUUM on that hash index.  Eventually, it will do
>> LockBufferForCleanup() on the hash bucket on which process A holds a
>> buffer pin, resulting in an undetected deadlock. In the current
>> coding, A would hold a heavyweight lock and B would attempt to acquire
>> a conflicting heavyweight lock, and the deadlock detector would kill
>> one of them.  This patch probably breaks that.  I notice that that's
>> the only place where we attempt to acquire a buffer cleanup lock
>> unconditionally; every place else, we acquire the lock conditionally,
>> so there's no deadlock risk.  Once we resolve this problem, the
>> paragraph about deadlock risk in this section should be revised to
>> explain whatever solution we come up with.
>>
>> By the way, since VACUUM must run in its own transaction, B can't be
>> holding arbitrary locks, but that doesn't seem quite sufficient to get
>> us out of the woods.  It will at least hold ShareUpdateExclusiveLock
>> on the relation being vacuumed, and process A could attempt to acquire
>> that same lock.
>>
>
> Right, I think there is a danger of deadlock in the above situation.
> It needs some more thought.
>

I think one way to avoid the risk of deadlock in the above scenario is
to take the cleanup lock conditionally: if we get the cleanup lock,
then we will delete the items as we do in the patch now; otherwise, we
will just mark the tuples as dead and ensure that we don't try to
remove tuples that are moved-by-split.  Now, I think the question is
how these dead tuples will be removed.  We need a separate mechanism to
clear dead tuples for hash indexes anyway, since during scans we mark
tuples as dead if the corresponding tuple in the heap is dead, but they
are not removed later.  This is already taken care of in the btree code
via the kill_prior_tuple optimization.  So I think clearing of dead
tuples can be handled by a separate patch.

I have also thought about using the page-scan-at-a-time idea which has
been discussed upthread [1], but I think we can't completely eliminate
the need to out-wait scans (the cleanup lock requirement) for scans that
are started while a split is in progress, or for non-MVCC scans, as
described in that e-mail [1].  We might be able to find some way to
solve the problem with this approach, but I think it will be somewhat
complicated, and much more work is required compared to the previous
approach.

Which of the above approaches do you prefer for resolving this problem?
Or let me know if you have a better idea for solving it.


[1] - 
https://www.postgresql.org/message-id/CAA4eK1Jj1UqneTXrywr%3DGg87vgmnMma87LuscN_r3hKaHd%3DL2g%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Transactions involving multiple postgres foreign servers

2016-10-03 Thread Ashutosh Bapat
>>
>> Why always rollback any dangling transaction? There can be a case that
>> a foreign server has a dangling transaction which needs to be
>> committed because the portions of that transaction on the other shards
>> are committed.
>
> Right, we can heuristically make a decision whether we do COMMIT or
> ABORT on the local server.
> For example, if COMMIT PREPARED succeeded on at least one foreign
> server, the local server returns OK to the client, and the other
> dangling transactions should be committed later.
> We can find out whether we should commit or abort the dangling
> transaction by checking CLOG.

Heuristics cannot become the default behavior. A user should be given
an option to choose a heuristic, and he should be aware of the
pitfalls when using this heuristic. I guess, first, we need to get a
solution which ensures that the transaction gets committed on all the
servers or is rolled back on all the foreign servers involved. AFAIR,
my patch did that. Once we have that kind of solution, we can think
about heuristics.

>
> But we need to handle the case where the CLOG file containing the XID
> necessary for resolving a dangling transaction is truncated.
> If the user runs VACUUM FREEZE just after the remote server crashed, it
> could be truncated.

Hmm, this needs to be fixed. Even my patch relied on XID to determine
whether the transaction committed or rolled back locally and thus to
decide whether it should be committed or rolled back on all the
foreign servers involved. I think I had taken care of the issue you
have pointed out here. Can you please verify the same?

>
>> The way gid is crafted, there is no way to check whether the given
>> prepared transaction was created by the local server or not. Probably
>> the local server needs to add a unique signature in GID to identify
>> the transactions prepared by itself. That signature should be
>> transferred to the standby to cope with the failover of the local server.
>
> Maybe we can use the database system identifier in the control file.

Maybe.

>
>> In this idea, one has to keep on polling the foreign server to find
>> any dangling transactions. In the usual scenario, we shouldn't have a
>> large number of dangling transactions, and thus periodic polling might
>> be a waste.
>
> We can optimize it by storing the XID that is resolved heuristically
> into the control file or system catalog, for example.
>

There will be many such XIDs. We don't want to dump so many things in
the control file, especially when that's not control data. A system
catalog is out of the question, since a rollback of the local
transaction would make those rows in the system catalog invisible.
That's the reason why I chose to write the foreign prepared transactions
to files rather than a system catalog.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] multivariate statistics (v19)

2016-10-03 Thread Michael Paquier
On Mon, Oct 3, 2016 at 8:25 PM, Heikki Linnakangas  wrote:
> Yeah. The idea was to use something like pg_node_tree to store all the
> different kinds of statistics, the histogram, the MCV, and the functional
> dependencies, in one datum. Or JSON, maybe. It sounds better than an opaque
> bytea blob, although I'd prefer something more relational. For the
> functional dependencies, I think we could get away with a simple float
> array, so let's do that in the first cut, and revisit this for the MCV and
> histogram later.

OK. A second thing was related to the use of schemas in the new system
catalogs. As mentioned in [1], those could be removed.
[1]:
https://www.postgresql.org/message-id/cab7npqtu40q5_nsghvomjfbyh1hdtqmbfdj+kwfjspam35b...@mail.gmail.com

> Separate columns for the functional dependencies, the MCVs,
> and the histogram probably make sense anyway.

Probably...
-- 
Michael




Re: [HACKERS] Transactions involving multiple postgres foreign servers

2016-10-03 Thread Masahiko Sawada
On Wed, Sep 28, 2016 at 3:30 PM, Ashutosh Bapat
 wrote:
> On Wed, Sep 28, 2016 at 10:43 AM, Masahiko Sawada  
> wrote:
>> On Tue, Sep 27, 2016 at 9:06 PM, Ashutosh Bapat
>>  wrote:
>>> On Tue, Sep 27, 2016 at 2:54 PM, Masahiko Sawada  
>>> wrote:
 On Mon, Sep 26, 2016 at 9:07 PM, Ashutosh Bapat
  wrote:
> On Mon, Sep 26, 2016 at 5:25 PM, Masahiko Sawada  
> wrote:
>> On Mon, Sep 26, 2016 at 7:28 PM, Ashutosh Bapat
>>  wrote:
>>> My original patch added code to manage the files for 2 phase
>>> transactions opened by the local server on the remote servers. This
>>> code was mostly inspired from the code in twophase.c which manages the
>>> file for prepared transactions. The logic to manage 2PC files has
>>> changed since [1] and has been optimized. One of the things I wanted
>>> to do is see, if those optimizations are applicable here as well. Have
>>> you considered that?
>>>
>>>
>>
>> Yeah, we're considering it.
>> After these changes are committed, we will post a patch incorporating
>> these changes.
>>
>> But what we need to do first is have a discussion in order to get
>> consensus.
>> Since the current design of this patch is to transparently execute the
>> DCL of 2PC on foreign servers, it changes a lot of code and is
>> complicated.
>
> Can you please elaborate? I am not able to understand what DCL is
> involved here. According to [1], examples of DCL are the GRANT and
> REVOKE commands.

 I meant transaction management commands such as PREPARE TRANSACTION and
 COMMIT/ABORT PREPARED.
 The web page I referred to might be wrong, sorry.

>> Another approach I have is to push down the DCL to only those foreign
>> servers that support the 2PC protocol, which is similar to DML
>> push-down.
>> This approach would be simpler than the current idea and easy for a
>> distributed transaction manager to use.
>
> Again, can you please elaborate how that would be different from the
> current approach and how it simplifies the code.
>

 The idea is just to push down PREPARE TRANSACTION and COMMIT/ROLLBACK
 PREPARED to foreign servers that support 2PC.
 With this idea, the client needs to do the following operations when a
 foreign server is involved in the transaction.

 BEGIN;
 UPDATE parent_table SET ...; -- update including foreign server
 PREPARE TRANSACTION 'xact_id';
 COMMIT PREPARED 'xact_id';

 The above PREPARE TRANSACTION and COMMIT PREPARED commands are pushed
 down to the foreign server.
 That is, the client needs to execute PREPARE TRANSACTION and

 In this idea, I think that we don't need to do the following:

 * Providing the prepare ID of 2PC.
   The current patch adds a new API prepare_id_provider(), but we can
 use the prepare ID of 2PC that is used on the parent server.

 * Keeping track of the status of foreign servers.
   The current patch keeps track of the status of foreign servers
 involved in the transaction, but this idea just pushes down the
 transaction management commands to the foreign servers.
   So I think that we no longer need to do that.
>>>
 COMMIT/ROLLBACK PREPARED explicitly.
>>>
>>> The problem with this approach is the same as the one previously
>>> stated. If the connection between the local and foreign server is
>>> lost between PREPARE and COMMIT, the prepared transaction on the
>>> foreign server remains dangling; none other than the local server
>>> knows what to do with it, and the local server has lost track of the
>>> prepared transaction on the foreign server. So, just pushing down
>>> those commands doesn't work.
>>
>> Yeah, my idea is only a first step.
>> A mechanism that resolves the dangling foreign transactions and a
>> resolver worker process are necessary.
>>

 * Adding a max_prepared_foreign_transactions parameter.
   It means that the number of transactions involving foreign servers
 is the same as max_prepared_transactions.

>>>
>>> That isn't exactly true. max_prepared_foreign_transactions indicates
>>> how many transactions can be prepared on the foreign servers, which
>>> in the method you propose should have a cap of
>>> max_prepared_transactions * number of foreign servers.
>>
>> Oh, I understood, thanks.
>>
>> Consider a sharding solution using postgres_fdw (that is, the parent
>> postgres server has multiple shard postgres servers): we would need to
>> increase max_prepared_foreign_transactions whenever a new shard server
>> is added to the cluster, or allocate enough in advance. But estimating
>> a sufficient max_prepared_foreign_transactions would not be easy; for
>> example, can we estimate it by (max throughput of the system) * (the
>> number of foreign servers)?
>>
>> One new idea I came up with is that we set the transaction ID on the
>> parent server as the global transaction ID (gid) that is prepared on
>> the shard servers.
>> And pg_fdw_resolver worker process perio

Re: [HACKERS] Tracking wait event for latches

2016-10-03 Thread Michael Paquier
On Tue, Oct 4, 2016 at 1:55 AM, Robert Haas  wrote:
> On Mon, Oct 3, 2016 at 3:30 AM, Michael Paquier
>  wrote:
>> [ new patch ]
>
> I think this is unnecessarily awkward for callers; the attached
> version takes a different approach which I think will be more
> convenient.  The attached version also (1) moves a lot more of the
> logic from latch.c/h to pgstat.c/h, which I think is more appropriate;
> (2) more thoroughly separates the wait events by class; (3) renames
> SecureRead/SecureWrite to ClientRead/ClientWrite (whether to also
> rename the C functions is an interesting question, but not the most
> pressing one IMHO), (4) creates a real wait event for GetSafeSnapshot
> and removes the unnecessary and overly generic ProcSleep and
> ProcSignal wait events, and (5) incorporates a bit of copy editing.

OK with that.

> I've tested that this seems to work in basic cases, but more testing
> is surely welcome.  If there are no major objections, I will commit
> this version.

In pgstat_get_wait_event_type you are forgetting WAIT_IPC.
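For reference, the missing piece is presumably one more branch along
these lines (the class and variable names follow the patch under review,
so treat them as assumptions rather than committed API):

    case WAIT_IPC:
        event_type = "IPC";
        break;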

+
+ IPC
+ BgWorkerShutdown
+ Waiting for background worker to shut down.
+
Here this should be morerows=9. You removed two entries, and added one
with SafeSnapshot.

The rest looks good to me. Thanks for the feedback and the time!
-- 
Michael





Re: [HACKERS] pageinspect: Hash index support

2016-10-03 Thread Michael Paquier
On Mon, Oct 3, 2016 at 9:52 PM, Jesper Pedersen
 wrote:
> Maybe "Returned with Feedback" is more appropriate, as there is still
> development left.

I have switched it to "waiting on author".
-- 
Michael




[HACKERS] pnstrdup considered armed and dangerous

2016-10-03 Thread Andres Freund
Hi,

A colleague of mine just wrote innocent-looking code like
char *shardRelationName = pnstrdup(relationName, NAMEDATALEN);
which is wrong at the moment if relationName isn't preallocated to
NAMEDATALEN size.

/*
 * pnstrdup
 *  Like pstrdup(), but append null byte to a
 *  not-necessarily-null-terminated input string.
 */
char *
pnstrdup(const char *in, Size len)
{
char   *out = palloc(len + 1);

memcpy(out, in, len);
out[len] = '\0';
return out;
}

Isn't that a somewhat weird behaviour / implementation? It is not really
like strndup(), which one might believe it to be analogous to...
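For comparison, a strndup-style variant would stop at an earlier NUL
instead of unconditionally copying len bytes; a sketch only, with
pnstrdup_safe as a hypothetical name rather than a concrete proposal:

#include "postgres.h"
#include <string.h>

char *
pnstrdup_safe(const char *in, Size len)
{
    Size    actual = strnlen(in, len);  /* <= len, stops at first NUL */
    char   *out = palloc(actual + 1);

    memcpy(out, in, actual);
    out[actual] = '\0';
    return out;
}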

Greetings,

Andres Freund




Re: [HACKERS] Question / requests.

2016-10-03 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Sep 30, 2016 at 11:20 AM, Francisco Olarte
>  wrote:
> > After some messages due to vacuumdb auto-deadlocking itself on the
> > system tables when doing a parallel vacuum of a full database, I
> > suggested adding some flags to make vacuumdb process schemas. I was
> > asked whether I could write a patch for that and I am thinking of
> > doing it.
> 
> What messages are you seeing, exactly? "auto-deadlocking" isn't a thing.

https://www.postgresql.org/message-id/57EBC9AE.2060302%40163.com

I wonder if the real answer isn't just to disallow -f with parallel
vacuuming.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] proposal: psql \setfileref

2016-10-03 Thread Gilles Darold
On 03/10/2016 at 23:03, Robert Haas wrote:
> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold  wrote:
>> 4) Another problem is that, as it stands, this patch will allow anyone
>> to upload into a column the content of any system file that can be
>> read by the postgres system user, and then allow a non-system user to
>> read its content.
> I thought this was a client-side feature, so that it would let a
> client upload any file that the client can read, but not things that
> can only be read by the postgres system user.
>

Yes, that's right; sorry for the noise. Forget this fourth report.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org





Re: [HACKERS] proposal: psql \setfileref

2016-10-03 Thread Robert Haas
On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold  wrote:
> 4) Another problem is that, as it stands, this patch will allow anyone
> to upload into a column the content of any system file that can be read
> by the postgres system user, and then allow a non-system user to read
> its content.

I thought this was a client-side feature, so that it would let a
client upload any file that the client can read, but not things that
can only be read by the postgres system user.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] asynchronous execution

2016-10-03 Thread Robert Haas
On Wed, Sep 28, 2016 at 12:30 AM, Amit Khandekar  wrote:
> On 24 September 2016 at 06:39, Robert Haas  wrote:
>> Since Kyotaro Horiguchi found that my previous design had a
>> system-wide performance impact due to the ExecProcNode changes, I
>> decided to take a different approach here: I created an async
>> infrastructure where both the requestor and the requestee have to be
>> specifically modified to support parallelism, and then modified Append
>> and ForeignScan to cooperate using the new interface.  Hopefully that
>> means that anything other than those two nodes will suffer no
>> performance impact.  Of course, it might have other problems
>
> I see that the reason why you re-designed the asynchronous execution
> implementation is because the earlier implementation showed
> performance degradation in local sequential and local parallel scans.
> But I checked that the ExecProcNode() changes were not so
> significant as to cause the degradation.

I think we need some testing to prove that one way or the other.  If
you can do some - say on a plan with multiple nested loop joins with
inner index-scans, which will call ExecProcNode() a lot - that would
be great.  I don't think we can just rely on "it doesn't seem like it
should be slower", though - ExecProcNode() is too important a function
for us to guess at what the performance will be.

The thing I'm really worried about with either implementation is what
happens when we start to add asynchronous capability to multiple
nodes.  For example, if you imagine a plan like this:

Append
-> Hash Join
   -> Foreign Scan
   -> Hash
      -> Seq Scan
-> Hash Join
   -> Foreign Scan
   -> Hash
      -> Seq Scan

In order for this to run asynchronously, you need not only Append and
Foreign Scan to be async-capable, but also Hash Join.  That's true in
either approach.  Things are slightly better with the original
approach, but the basic problem is there in both cases.  So it seems
we need an approach that will make adding async capability to a node
really cheap, which seems like it might be a problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Learning to hack Postgres - Keeping track of ctids

2016-10-03 Thread Robert Haas
On Mon, Oct 3, 2016 at 4:30 PM, Emrul  wrote:
> I suspect you're right. I've looked at the code and it will be very
> difficult (especially if I want to do it as an extension rather than
> patching Postgres), and with all the stuff I'd need to do to make it
> work, you're also right that it probably won't improve upon just using
> primary key IDs.

Fortunately, primary keys are awesome, and are designed for exactly
the problem you're trying to solve, so that's OK.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Learning to hack Postgres - Keeping track of ctids

2016-10-03 Thread Emrul
I suspect you're right. I've looked at the code and it will be very
difficult (especially if I want to do it as an extension rather than
patching Postgres), and with all the stuff I'd need to do to make it
work, you're also right that it probably won't improve upon just using
primary key IDs.

I've scrapped any further exploration; too hard and no real benefit.







Re: [HACKERS] Question / requests.

2016-10-03 Thread Robert Haas
On Fri, Sep 30, 2016 at 11:20 AM, Francisco Olarte
 wrote:
> After some messages due to vacuumdb auto-deadlocking itself on the
> system tables when doing a parallel vacuum of a full database, I
> suggested adding some flags to make vacuumdb process schemas. I was
> asked whether I could write a patch for that and I am thinking of
> doing it.

What messages are you seeing, exactly? "auto-deadlocking" isn't a thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Learning to hack Postgres - Keeping track of ctids

2016-10-03 Thread Robert Haas
On Thu, Sep 29, 2016 at 4:15 PM, Emrul  wrote:
> What I'd like to do is store a reference to all the links from one record
> using an array type that stores links to all related tables.
>
> At first, I succeeded in doing this using primary key IDs, and this
> works fine.  However, I'd like to be able to bypass the index lookup
> altogether by storing the ctids in my array instead of the primary key
> IDs.

I suspect you're going to find that this is very difficult and doesn't
actually make anything any better than just using the primary key IDs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [GENERAL] Understanding “max_wal_size” and “min_wal_size” parameters default values from postgresql.conf file

2016-10-03 Thread otar shavadze
Thank you very much

On Mon, Oct 3, 2016 at 11:46 PM, Tom Lane  wrote:

> otar shavadze  writes:
> > name         | setting | unit
> > -------------+---------+------
> > max_wal_size | 64      |
> > min_wal_size | 5       |
>
> > I have 2 questions:
>
> > 1) Why don't these values match the default values shown in the
> > docs?
> > I never changed config settings at all.
>
> They do match the defaults.
>
> > 2) Why is the unit column empty/NULL for these parameters? What do
> > the values 64 and 5 mean in this case? MB? GB? Or what?
>
> The problem seems to be that somebody missed out adding GUC_UNIT_XSEGS
> to the switch in GetConfigOptionByNum.  It should be showing you
> something like "16MB" in the unit column, I think.
>
> regards, tom lane
>


Re: [HACKERS] Showing parallel status in \df+

2016-10-03 Thread Pavel Stehule
2016-10-03 22:03 GMT+02:00 Tom Lane :

> Pavel Stehule  writes:
> > 2016-10-03 21:54 GMT+02:00 Robert Haas :
> >> On Fri, Sep 30, 2016 at 8:47 PM, Tom Lane  wrote:
> >>> Personally I'm on the edge of washing my hands of the whole thing...
>
> >> The hand-washing strategy has a lot to recommend it; this thread is
> >> going nowhere fast.  I don't care enough to put up a big stink about
> >> the idea of removing PL source code from \df+ output, but it's not
> >> what I'd choose to do; let's call me -0 on that option.
>
> I can write the patch - I am sure the cleaned-up \df+ output will be
> better than what we have now.
>
> Writing a patch is not the problem.  Getting consensus on what it should
> do is the problem.
>

I sense consensus on removing PL source code from \df+. There is partial
consensus on keeping this field (renamed) for the C and internal
languages. I am not sure about consensus on enhancing \sf.

The first point is almost settled -- the others are not, but it is not
necessary to do them now. Whoever needs some special functionality can
run a direct query on pg_proc. It is not major functionality - there is
more than one possible substitution - so cleaning it up without any
other changes should be OK too.

Regards

Pavel



>
> regards, tom lane
>


Re: [HACKERS] Showing parallel status in \df+

2016-10-03 Thread Tom Lane
Pavel Stehule  writes:
> 2016-10-03 21:54 GMT+02:00 Robert Haas :
>> On Fri, Sep 30, 2016 at 8:47 PM, Tom Lane  wrote:
>>> Personally I'm on the edge of washing my hands of the whole thing...

>> The hand-washing strategy has a lot to recommend it; this thread is
>> going nowhere fast.  I don't care enough to put up a big stink about
>> the idea of removing PL source code from \df+ output, but it's not
>> what I'd choose to do; let's call me -0 on that option.

> I can write the patch - I am sure the cleaned-up \df+ output will be
> better than what we have now.

Writing a patch is not the problem.  Getting consensus on what it should
do is the problem.

regards, tom lane




Re: [HACKERS] Showing parallel status in \df+

2016-10-03 Thread Pavel Stehule
2016-10-03 21:54 GMT+02:00 Robert Haas :

> On Fri, Sep 30, 2016 at 8:47 PM, Tom Lane  wrote:
> > Well, alternatively, can we get a consensus for doing that?  People
> > did speak against removing PL source code from \df+ altogether, but
> > maybe they're willing to reconsider if the alternative is doing nothing.
> >
> > Personally I'm on the edge of washing my hands of the whole thing...
>
> The hand-washing strategy has a lot to recommend it; this thread is
> going nowhere fast.  I don't care enough to put up a big stink about
> the idea of removing PL source code from \df+ output, but it's not
> what I'd choose to do; let's call me -0 on that option.
>

I can write the patch - I am sure the cleaned-up \df+ output will be
better than what we have now.

Regards

Pavel



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] proposal: psql \setfileref

2016-10-03 Thread Gilles Darold
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:not tested

Contents & Purpose
==================

This patch adds a new type of psql variable: file references, which can
be set using the command \setfileref. The value of the named variable is
the path to the referenced file. It allows simple insertion of large
data without the necessity of manual escaping or using the large object
API. Use:

postgres=# create table test (col1 bytea);
postgres=# \setfileref a ~/avatar.gif
postgres=# insert into test values(:a);

The content of the file is returned as bytea; the text output can be
used when escaping is not required. In this case, use convert_from() to
obtain text output.

postgres=# create table test (col1 text);
postgres=# \setfileref a ~/README.txt
postgres=# insert into test values(convert_from(:a, 'utf8'));

The content of file reference variables is not persistent in memory.

File reference variables can be listed using \set:

postgres=# \set
...
b = ^'~/README.txt'

An empty file produces an empty bytea (\x).

Initial Run
===========

The patch applies cleanly to HEAD and doesn't break anything; at least,
the regression tests all pass successfully.

Feedback on testing
===================

I see several problems.

1) An error reading the referenced file reports the system error and
then a syntax error on the variable:

postgres=# \setfileref a /etc/sudoers
postgres=# insert into test values (:a);
/etc/sudoers: Permission denied
ERROR:  syntax error at or near ":"
LINE 1: insert into t1 values (:a);

2) Trying to load a file larger than the 1GB limit (2254110720 bytes
here) does not fail; it silently inserts an empty bytea:

postgres=# \setfileref b /home/VMs/ISOs/sol-10-u11-ga-x86-dvd.iso
postgres=# insert into test values (:b);
INSERT 0 1
postgres=# ANALYZE test;
postgres=# SELECT relname, relkind, relpages, 
pg_size_pretty(relpages::bigint*8192) FROM pg_class WHERE relname='test';
 relname | relkind | relpages | pg_size_pretty 
---------+---------+----------+----------------
 test    | r       |        1 | 8192 bytes
(1 row)

postgres=# select * from test;
 col1 
------
 \x
(1 row)

This should not insert an empty bytea; it should probably raise an error
instead.

Trying to load a smaller file (666894336 bytes) whose bytea-escaped size
exceeds the 1GB limit reports:

postgres=# \setfileref a /var/ISOs/CentOS-7-x86_64-Minimal-1503-01.iso
postgres=# insert into t1 values (1, :a, 'tr');
ERROR:  out of memory
DETAIL:  Cannot enlarge string buffer containing 0 bytes by 1333788688 
more bytes.
Time: 1428.657 ms (00:01.429)

This raises an error because bytea escaping increases the content size,
which is the expected behavior.

3) The path does not allow piping to a system command, which is secure
behavior, but it is quite easy to perform a DoS by using special files
like:

postgres=# \setfileref b /dev/random
postgres=# insert into t1 values (:b);

This will never end until we kill the psql session. I think we must also
prevent non-regular files from being referenced, using S_ISREG.

I think all three errors can be caught and prevented at variable
declaration time by running some tests on the file:

is it readable?
is it a regular file?
is its size small enough?

so that an error is reported directly when the file reference variable
is set.
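As an illustration, such checks might look like this on the psql side (a
sketch only: psql_error() is psql's existing error-reporting helper,
while check_fileref() and MAX_FILEREF_SIZE are hypothetical names, and
the size limit is an assumption chosen to stay under the 1GB bytea cap):

#include "postgres_fe.h"
#include "common.h"             /* psql_error() */
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

#define MAX_FILEREF_SIZE (512 * 1024 * 1024)    /* assumed limit */

static bool
check_fileref(const char *path)
{
    struct stat st;

    if (access(path, R_OK) != 0 || stat(path, &st) != 0)
    {
        psql_error("%s: %s\n", path, strerror(errno));
        return false;           /* missing or unreadable */
    }
    if (!S_ISREG(st.st_mode))
    {
        psql_error("%s: not a regular file\n", path);
        return false;           /* rejects /dev/random, pipes, ... */
    }
    if (st.st_size > MAX_FILEREF_SIZE)
    {
        psql_error("%s: file is too large\n", path);
        return false;           /* keep escaped content under 1GB */
    }
    return true;
}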

4) Another problem is that, as it stands, this patch will allow anyone
to upload into a column the content of any system file that can be read
by the postgres system user, and then allow a non-system user to read
its content. For example, connected as a basic PostgreSQL-only user:

testdb=> select current_user;
 current_user 
--------------
 user1
(1 row)
testdb=> \setfileref b /etc/passwd
testdb=> insert into test values (:b);
INSERT 0 1

then to read the file:

testdb=> select convert_from(col1, 'utf8') from t1; 

Maybe the referenced files should be limited to some part of the
filesystem, or the feature should be limited to superusers only.

5) There is no documentation at all for this feature. Here is a proposal
for inclusion in doc/src/sgml/ref/psql-ref.sgml:

\setfileref name value
Sets the internal variable name as a reference to the file path given as
value. To unset a variable, use the \unset command.

File references are shown as the file path prefixed with the ^ character
when using the \set command alone.

Valid variable names can contain characters, digits, and underscores.
See the section Variables below for details. Variable names are
 

Re: [HACKERS] Removing link-time cross-module refs in contrib

2016-10-03 Thread Andres Freund
On 2016-10-03 15:40:12 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-10-03 14:49:20 -0400, Tom Lane wrote:
> >> ... ignoring unresolved symbols in shlibs is the default
> >> on Linux, and while you can make it throw errors, that just leads to
> >> errors for all the references into the core backend.  Not very helpful.
> >> AFAICS, GNU ld lacks any equivalent to macOS' -bundle_loader switch,
> >> which is what we'd need to make this usable.
> 
> > Hm. I wonder if it's actually possible to link against the main backend,
> > when compiling as a position-independent-executable...
> 
> I tried that, didn't work.

Too bad :(.   The only way I could see to check this properly would be
to LD_PRELOAD the .so against postgres at build time. That should then
error out if there are undefined symbols, without the issue that
_PG_init might error out for some reason.

Regards,

Andres




Re: [HACKERS] Showing parallel status in \df+

2016-10-03 Thread Robert Haas
On Fri, Sep 30, 2016 at 8:47 PM, Tom Lane  wrote:
> Well, alternatively, can we get a consensus for doing that?  People
> did speak against removing PL source code from \df+ altogether, but
> maybe they're willing to reconsider if the alternative is doing nothing.
>
> Personally I'm on the edge of washing my hands of the whole thing...

The hand-washing strategy has a lot to recommend it; this thread is
going nowhere fast.  I don't care enough to put up a big stink about
the idea of removing PL source code from \df+ output, but it's not
what I'd choose to do; let's call me -0 on that option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] "Re: Question about grant create on database and pg_dump/pg_dumpall

2016-10-03 Thread Robert Haas
On Thu, Sep 29, 2016 at 11:29 AM, Tom Lane  wrote:
> The fundamental thing we have to do in order to move forward on this is
> to rethink what's the division of labor between pg_dump and pg_dumpall.
> I find the patch as presented quite unacceptable because it's made no
> effort to do that (or even to touch the documentation).
>
> What do people think of this sketch:
>
> 1. pg_dump without --create continues to do what it does today, ie it just
> dumps objects within the database, assuming that database-level properties
> will already be set correctly for the target database.
>
> 2. pg_dump with --create creates the target database and also sets all
> database-level properties (ownership, ACLs, ALTER DATABASE SET, etc etc).
>
> 3. pg_dumpall loses all code relating to individual-database creation
> and property setting and instead relies on pg_dump --create to do that.
> This would leave only the code relating to "pg_dumpall -g" (ie, dump roles
> and tablespaces) within pg_dumpall itself.

Seems like a good sketch.

> One thing that would still be messy is that presumably "pg_dumpall -g"
> would issue ALTER ROLE SET commands, but it's unclear what to do with
> ALTER ROLE IN DATABASE SET commands.  Should those become part of
> "pg_dump --create"'s charter?  It seems like not, but I'm not certain.

I could go either way on this.

> Another thing that requires some thought is that pg_dumpall is currently
> willing to dump ACLs and other properties for template1/template0, though
> it does not invoke pg_dump on them.  If we wanted to preserve that
> behavior while still moving the code that does those things to pg_dump,
> pg_dump would have to grow an option that would let it do that.  But
> I'm not sure how much of that behavior is actually sensible.

I'm not sure, either, but it's usually bad when dump-and-restore
doesn't dump-and-restore things which a user might reasonably have
changed.  That tends to lead to bug reports and/or pg_upgrade
failures.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] Understanding “max_wal_size” and “min_wal_size” parameters default values from postgresql.conf file

2016-10-03 Thread Tom Lane
otar shavadze  writes:
> name         | setting | unit
> -------------+---------+------
> max_wal_size | 64      |
> min_wal_size | 5       |

> I have 2 questions:

> 1) Why don't these values match the default values shown in the docs?
> I never changed config settings at all.

They do match the defaults.

> 2) Why is the unit column empty/NULL for these parameters? What do the
> values 64 and 5 mean in this case? MB? GB? Or what?

The problem seems to be that somebody missed out adding GUC_UNIT_XSEGS
to the switch in GetConfigOptionByNum.  It should be showing you
something like "16MB" in the unit column, I think.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Copy-editing for contrib/pg_visibility documentation.

2016-10-03 Thread Robert Haas
On Mon, Oct 3, 2016 at 3:33 PM, Kevin Grittner  wrote:
>> Anyway, we've probably beaten this horse to death.
>
> Just to be sure of that, I'll cite the Chicago Manual of Style (my
> preferred style guide), which seems to chart a course somewhere in
> the middle:
>
> http://www.chicagomanualofstyle.org/qanda/data/faq/topics/Whichvs.That.html

I'd say that charts a policy which I can live with.

/me ducks

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Removing link-time cross-module refs in contrib

2016-10-03 Thread Tom Lane
Andres Freund  writes:
> On 2016-10-03 14:49:20 -0400, Tom Lane wrote:
>> ... ignoring unresolved symbols in shlibs is the default
>> on Linux, and while you can make it throw errors, that just leads to
>> errors for all the references into the core backend.  Not very helpful.
>> AFAICS, GNU ld lacks any equivalent to macOS' -bundle_loader switch,
>> which is what we'd need to make this usable.

> Hm. I wonder if it's actually possible to link against the main backend,
> when compiling as a position-independent-executable...

I tried that, didn't work.

>> A workable compromise for Linux might be to enable -Wl,-z,now which
>> changes unresolved symbol resolution from lazy to on-load.

> Hm, I think that's the default already, no?

Oh, maybe, I was looking for compile switches.  Will test.

regards, tom lane




Re: [HACKERS] contrib/pg_visibility craps out in assert-enabled builds

2016-10-03 Thread Tom Lane
Robert Haas  writes:
> On Fri, Sep 30, 2016 at 10:24 PM, Tom Lane  wrote:
>> The problem seems to be that HeapTupleSatisfiesVacuum asserts
>> Assert(ItemPointerIsValid(&htup->t_self));
>> while collect_corrupt_items hasn't bothered to set up the t_self
>> field of the HeapTupleData it's passing in.  This would imply that
>> you never tested this code in an assert-enabled build, which I find
>> surprising.  Am I missing something?

> My standard build script uses --enable-cassert, so I doubt that's the
> case.  It's more likely that on my system it just happened to find
> some non-zero garbage at that point on the stack that made it not fail
> the assertion.

Yeah, after I looked closer at what the Assert is actually testing,
I realized it was just blind luck that I'd managed to see it fail.
It's a pretty weak test :-(.  Anyway, fixed now.
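For the archives, the shape of the fix is simply to fill in the fields
the assertion checks before calling the visibility routine; roughly as
below, where blkno, offnum, itemid, page and relid are assumed to be the
loop variables collect_corrupt_items already has in hand:

    HeapTupleData tuple;

    /* Initialize the fields HeapTupleSatisfiesVacuum asserts on. */
    ItemPointerSet(&tuple.t_self, blkno, offnum);
    tuple.t_tableOid = relid;
    tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
    tuple.t_len = ItemIdGetLength(itemid);

    if (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buffer) == HEAPTUPLE_LIVE)
    {
        /* tuple is live; proceed with the all-visible check */
    }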

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Copy-editing for contrib/pg_visibility documentation.

2016-10-03 Thread Kevin Grittner
On Mon, Oct 3, 2016 at 9:22 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Sure, I'm not arguing with trying to be formal.  The grammatical rule
>> that you're describing doesn't exist for me, though.  I believe that
>> "that" can only introduce a restrictive clause, whereas "which" can
>> introduce either a descriptive or a restrictive clause.
>
> Yeah, as was noted downthread, that's the British view of it.

Even in the Midwest I have frequently heard people arguing to avoid
"that" in most situations where either could work.  I ran into one
professor who went to what I considered silly lengths to expurgate
the word from documents.

> Anyway, we've probably beaten this horse to death.

Just to be sure of that, I'll cite the Chicago Manual of Style (my
preferred style guide), which seems to chart a course somewhere in
the middle:

http://www.chicagomanualofstyle.org/qanda/data/faq/topics/Whichvs.That.html

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: ON UPDATE REMOVE foreign key action

2016-10-03 Thread Vitaly Burovoy
On 10/3/16, Kirill Berezin  wrote:
> *One-line Summary:* On a foreign key update we are unable to remove all
> dependent records. Currently we have "ON REMOVE CASCADE DELETE", but no
> "ON UPDATE CASCADE DELETE". We can only update the field to NULL or
> DEFAULT.

I think there are three causes why we don't have it implemented.
The first one is that there is no such grammar in the SQL spec (your
version is also wrong: the SQL spec has "ON DELETE CASCADE" as well as
"ON UPDATE CASCADE" [or any other action instead of "CASCADE"]).

The second one is that in almost all cases there is no reason to delete
rows because a referenced row was updated. If these rows are still
connected, they should be updated; if not, they are left as is ("NO
ACTION") or have the reference link removed ("SET NULL" or "SET
DEFAULT").
These rows hold data; that's why they are still in the tables. They can
be deleted (by reference) if and only if the "parent" or "linked" data
(all data, not just the referenced key) is deleted.

> *Business Use-case:* Cache expiration on hash/version update. Revoke all
> access on account id update.

> In my case I met this situation: I am using access links to share a
> user account. The account owner can give a private link to somebody,
> and their session becomes mirrored. (Owner access to the account is
> granted.)

And the third cause is avoiding bad design. If you have to give access
to anyone and you know the access will be revoked sooner or later, it is
wise to give a private link with a different identifier, which can be
easily found and removed by a grantor ID (your ID).

> You can imagine Facebook desktop and mobile sessions.

Which, of course, have different session IDs. You can revoke a session
without renaming your own.

> It's just a shortcut for entering credentials. Now I am implementing
> "revoke all but me". It's done simply: since each user is UUID-indexed,
> I just generate a new UUID for the current account. The old UUID
> becomes invalid for other sessions, since no record is found in the
> database.
> I want to remove any pending access links, to prevent a bad guy from
> restoring access.
> I can possibly set the linked account to NULL,

Why not just delete them when the grantor revokes access?

> and then clear the record on expiration, but I feel that automatically
> removing it on the update event is more rational.

I personally don't see the necessity of introducing new non-spec
grammar. If you think I have not understood you, send an example with a
schema --- what you have now and how you expect it to work.

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] Rename max_parallel_degree?

2016-10-03 Thread Robert Haas
On Fri, Sep 30, 2016 at 12:23 PM, Julien Rouhaud
 wrote:
> I've already fixed all the other issues mentioned upthread, but I'm
> facing a problem with this one.  Assuming that the bgworker classes are
> supposed to be mutually exclusive, I don't see a simple and clean way
> to add such a check in SanityCheckBackgroundWorker().  Am I missing
> something obvious, or can someone give me some advice on this?

My advice is "don't worry about it".  If somebody thinks of something
that can be usefully added there, it will take very little time to add
it and test that it works.  Don't get hung up on that for now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] contrib/pg_visibility craps out in assert-enabled builds

2016-10-03 Thread Robert Haas
On Fri, Sep 30, 2016 at 10:24 PM, Tom Lane  wrote:
> So I tried using pg_visibility's pg_check_visible() as part of
> testing the business with pg_upgrade generating faulty visibility
> maps on bigendian servers, and it instantly generated an assert
> failure here:
>
> #2  0x0041de78 in ExceptionalCondition (conditionName=<value
> temporarily unavailable, due to optimizations>, errorType=<value
> temporarily unavailable, due to optimizations>, fileName=<value
> temporarily unavailable, due to optimizations>, lineNumber=<value
> temporarily unavailable, due to optimizations>) at assert.c:54
> #3  0x0045c410 in HeapTupleSatisfiesVacuum (htup=0x0, OldestXmin=9170,
> buffer=2958) at tqual.c:1169
> #4  0x00a41c3c in tuple_all_visible (tup=0xbfffd8e4, OldestXmin=9170,
> buffer=<value temporarily unavailable, due to optimizations>) at
> pg_visibility.c:719
> #5  0x00a420a8 in collect_corrupt_items (relid=46802, all_visible=<value
> temporarily unavailable, due to optimizations>, all_frozen=<value
> temporarily unavailable, due to optimizations>) at pg_visibility.c:630
> #6  0x00a4262c in pg_check_visible (fcinfo=0x104b704) at
> pg_visibility.c:328
>
> The problem seems to be that HeapTupleSatisfiesVacuum asserts
>
> Assert(ItemPointerIsValid(&htup->t_self));
>
> while collect_corrupt_items hasn't bothered to set up the t_self
> field of the HeapTupleData it's passing in.  This would imply that
> you never tested this code in an assert-enabled build, which I find
> surprising.  Am I missing something?

My standard build script uses --enable-cassert, so I doubt that's the
case.  It's more likely that on my system it just happened to find
some non-zero garbage at that point on the stack that made it not fail
the assertion.

/me tests.

Yep.  I just checked out 71d05a2c7b82379bb1013a0e338906349c54ed85 and
added an elog() immediately after the call to
HeapTupleSatisfiesVacuum(), and I can hit that elog() without failing
an assertion.  Furthermore I can see that debug_assertions = on.

> (I'm not really sure *why* HeapTupleSatisfiesVacuum contains this
> Assert, because it doesn't do anything with t_self, but nonetheless
> the Assert has been there for several years.  Seems to have been
> inserted by you, in fact.)

I suspect if we reviewed the discussion leading up to
0518eceec3a1cc2b71da04e839f05f555fdd8567 we'd find it.  I don't
actually recall the discussion of checking t_self specifically, but I
know checking t_tableOid had been requested multiple times.  I suspect
we just decided to check both.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Removing link-time cross-module refs in contrib

2016-10-03 Thread Andres Freund
On 2016-10-03 14:49:20 -0400, Tom Lane wrote:
> > On 2016-10-03 12:29:18 -0400, Tom Lane wrote:
> >> The patch seems pretty successful in terms of being noninvasive to
> >> the code.  I think the major objection to it would be that we no
> >> longer have any direct compiler-verified connection between the
> >> signatures of the called functions in hstore/plpython and what
> >> hstore_plpython thinks they are.
> 
> > We could instead add a AssertVariableIsOfType(), besides the library
> > lookup maybe?
> 
> Hmm ... had not occurred to me that that might work on a function name.
> I'll go try it.

I'm pretty sure it does, I've used it for that in the past.


> Andres Freund  writes:
> >> If we were to push forward with this idea, the remaining work
> >> would be to fix the other two contrib transform modules similarly,
> >> after which I would want to revert the changes in commit cac765820
> >> and later that suppressed linker errors for unresolved symbols in
> >> contrib links.  The possibility of getting back that error detection
> >> is actually the main motivation for this IMO.
> 
> > That'd be rather neat.
> 
> I experimented with that aspect a bit today.  For macOS it's sufficient
> to remove the "-Wl,-undefined,dynamic_lookup" linker flag that cac765820
> added.  However, ignoring unresolved symbols in shlibs is the default
> on Linux, and while you can make it throw errors, that just leads to
> errors for all the references into the core backend.  Not very helpful.
> AFAICS, GNU ld lacks any equivalent to macOS' -bundle_loader switch,
> which is what we'd need to make this usable.

Hm. I wonder if it's actually possible to link against the main backend,
when compiling as a position-independent-executable...


> A workable compromise for Linux might be to enable -Wl,-z,now which
> changes unresolved symbol resolution from lazy to on-load.  That would
> at least guarantee that any testing whatsoever would detect incorrect
> references, even if the bad call wasn't exercised.

Hm, I think that's the default already, no? At least Linux has
#define pg_dlopen(f)	dlopen((f), RTLD_NOW | RTLD_GLOBAL)
which should trigger that behaviour, and I do remember seeing errors
because of non-exercised function and variable references.


Regards,

Andres




Re: [HACKERS] Removing link-time cross-module refs in contrib

2016-10-03 Thread Tom Lane
Andres Freund  writes:
> On 2016-10-03 12:29:18 -0400, Tom Lane wrote:
>> The patch seems pretty successful in terms of being noninvasive to
>> the code.  I think the major objection to it would be that we no
>> longer have any direct compiler-verified connection between the
>> signatures of the called functions in hstore/plpython and what
>> hstore_plpython thinks they are.

> We could instead add a AssertVariableIsOfType(), besides the library
> lookup maybe?

Hmm ... had not occurred to me that that might work on a function name.
I'll go try it.

>> If we were to push forward with this idea, the remaining work
>> would be to fix the other two contrib transform modules similarly,
>> after which I would want to revert the changes in commit cac765820
>> and later that suppressed linker errors for unresolved symbols in
>> contrib links.  The possibility of getting back that error detection
>> is actually the main motivation for this IMO.

> That'd be rather neat.

I experimented with that aspect a bit today.  For macOS it's sufficient
to remove the "-Wl,-undefined,dynamic_lookup" linker flag that cac765820
added.  However, ignoring unresolved symbols in shlibs is the default
on Linux, and while you can make it throw errors, that just leads to
errors for all the references into the core backend.  Not very helpful.
AFAICS, GNU ld lacks any equivalent to macOS' -bundle_loader switch,
which is what we'd need to make this usable.

A workable compromise for Linux might be to enable -Wl,-z,now which
changes unresolved symbol resolution from lazy to on-load.  That would
at least guarantee that any testing whatsoever would detect incorrect
references, even if the bad call wasn't exercised.

No idea about what to do to change this on Windows.  If we had error
detection on Linux+Windows+macOS that would be good enough for me ---
anyone who feels like fixing it for minor platforms is welcome to,
but it would be unlikely that we'd detect any new problems.

regards, tom lane




Re: [HACKERS] Macro customizable hashtable / bitmapscan & aggregation perf

2016-10-03 Thread Andres Freund
Hi,


On 2016-10-03 13:26:09 +0200, Arthur Silva wrote:
> On Sat, Oct 1, 2016 at 2:44 AM, Andres Freund  wrote:
> A couple of comments.
> * 80% occupancy is a bit conservative for RH hashing, it works well up to
> 90% if you use the early stops due to distance. So that TODO is worth
> pursuing.

I found 90% a tiny bit slower during modifications, due to the increased
moving of cells around.  I actually implemented that TODO at some point;
it's not actually a very clear win for narrow elements and mid-sized
tables - the additional instructions for the distance computations have
a cost.

I've played with a different version of robin hood hashing, where the
buckets are ordered by hash-value. I.e. the hashvalue is shifted right,
instead of being masked, to get the hash bucket. That makes the hashtable
ordered by hash-value, so the distance checks simply become >=.  The
problem with that is that it only works if you have an "overflow" area at
the end of the table, which is hard to size correctly.


> * An optimized memory layout for RH hashmap is along the lines of
> HHHKVKVKV, using one bit of the hash as the present/absent flag. That plays
> specially well with large occupancies (~90%). Due to RH the probings on the
> H array are very short and within a single cache line. Even with a 31bit
> hash it's reallyyy rare to have to probe more than 1 entry in the KV array.
> Essentially guaranteeing that the 99% percentile of lookups will only hit a
> couple of cache lines (if you ignore crossing cache lines and other
> details).

That seems likely to end up being noticeably more complicated - I'm not
sure the benefits outweigh the cpu overhead of the more complex lookups.
I'd like to get the basics in; we can optimize further later on.  Based
on my instrumentation the majority of time is currently spent
in the cache-miss for the initial bucket, everything else inside the
hash table code is quite small. After that it's hash value computations
in the execGrouping case.

Greetings,

Andres Freund




Re: [HACKERS] Removing link-time cross-module refs in contrib

2016-10-03 Thread Andres Freund
Hi,

On 2016-10-03 12:29:18 -0400, Tom Lane wrote:
> The patch seems pretty successful in terms of being noninvasive to
> the code.  I think the major objection to it would be that we no
> longer have any direct compiler-verified connection between the
> signatures of the called functions in hstore/plpython and what
> hstore_plpython thinks they are.  That is, if someone were to
> modify hstore and change the signature of say hstoreCheckKeyLen,
> this would not cause any compiler complaints in hstore_plpython,
> just runtime problems.

> A slightly ugly way of alleviating that risk would be to put the
> function typedefs right beside the externs in the originating
> modules, eg in hstore.h
> 
>  extern size_t hstoreCheckKeyLen(size_t len);
> +typedef size_t (*hstoreCheckKeyLen_t) (size_t len);

We could instead add a AssertVariableIsOfType(), besides the library
lookup maybe?


> If we were to push forward with this idea, the remaining work
> would be to fix the other two contrib transform modules similarly,
> after which I would want to revert the changes in commit cac765820
> and later that suppressed linker errors for unresolved symbols in
> contrib links.  The possibility of getting back that error detection
> is actually the main motivation for this IMO.

That'd be rather neat.


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench more operators & functions

2016-10-03 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> It already is a script, it's just hardwired as a string constant in
> >> pgbench.c rather than being a separate file.  I think Fabien is
> >> suggesting that it could be changed to more nearly approximate the
> >> actual TPC-B spec, but IMO that would be a seriously bad idea because
> >> it would invalidate all cross-version performance comparisons.  We
> >> decided years ago that the default script is what it is and we aren't
> >> going to change it to try to match TPC-B more exactly.
> 
> > If we could replicate what the hardwired script does in an external
> > script, keeping that as the default, and then provide a 'Closer to
> > TPC-B' script, then I'm all for that.
> 
> I've got no objection to a more-nearly-TPC-B script as an option.
> But why do you feel the need to pull the default script out into
> a separate file?  Seems to me that just adds maintenance complexity,
> and the need for pgbench to have a notion of a library directory,
> for little gain.

Part of it is a feeling that we should really be 'eating our own
dogfood' when it comes to pgbench, but also that it seems to add
unnecessary C-level code to an otherwise general-purpose utility
for no particular reason except "that's how it was first written."

Perhaps that's overkill for this case and you have an interesting point
that it might require additional code in pgbench (though I'm not
completely convinced of that...), so I won't push too hard on it, but I
still think it'd be "better" to have pgbench just be the general purpose
utility and not also have some built-in thing, even if it's obvious that
it could just be a script.

Thanks!

Stephen




Re: [HACKERS] pgbench more operators & functions

2016-10-03 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> It already is a script, it's just hardwired as a string constant in
>> pgbench.c rather than being a separate file.  I think Fabien is
>> suggesting that it could be changed to more nearly approximate the
>> actual TPC-B spec, but IMO that would be a seriously bad idea because
>> it would invalidate all cross-version performance comparisons.  We
>> decided years ago that the default script is what it is and we aren't
>> going to change it to try to match TPC-B more exactly.

> If we could replicate what the hardwired script does in an external
> script, keeping that as the default, and then provide a 'Closer to
> TPC-B' script, then I'm all for that.

I've got no objection to a more-nearly-TPC-B script as an option.
But why do you feel the need to pull the default script out into
a separate file?  Seems to me that just adds maintenance complexity,
and the need for pgbench to have a notion of a library directory,
for little gain.

regards, tom lane




Re: [HACKERS] pgbench more operators & functions

2016-10-03 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Fabien COELHO (coe...@cri.ensmp.fr) wrote:
> >> Indeed, some kind of "if" is needed, for instance to implement
> >> "tpc-b" correctly.
> 
> > That's an interesting point..  Have you thought about ripping out the
> > built-in TPC-B-like functionality of pgbench and making that into a
> > script instead?
> 
> It already is a script, it's just hardwired as a string constant in
> pgbench.c rather than being a separate file.  I think Fabien is
> suggesting that it could be changed to more nearly approximate the
> actual TPC-B spec, but IMO that would be a seriously bad idea because
> it would invalidate all cross-version performance comparisons.  We
> decided years ago that the default script is what it is and we aren't
> going to change it to try to match TPC-B more exactly.

If we could replicate what the hardwired script does in an external
script, keeping that as the default, and then provide a 'Closer to
TPC-B' script, then I'm all for that.  If the existing "hardwired
script" is really just a string constant, then we shouldn't need to
worry about that invalidating prior runs.

> >> The SQL syntax for CASE is on the very heavy side and would be quite
> >> complicated to implement in pgbench, so I rejected that and selected
> >> the simplest possible function for the job.
> 
> > I'm not quite sure that I follow why you feel that CASE would be too
> > difficult to implement here..?
> 
> If you want simple, you could provide just a subset of CASE (ie, only
> the CASE WHEN boolexpr variant).  I think inventing some random new syntax
> is a bad idea.

Agreed.

Thanks!

Stephen




Re: [HACKERS] Hash Indexes

2016-10-03 Thread Tom Lane
Jeff Janes  writes:
> I've done a simple comparison using pgbench's default transaction, in which
> all the primary keys have been dropped and replaced with indexes of either
> hash or btree type, alternating over many rounds.

> I run 'pgbench -c16 -j16 -T 900 -M prepared' on an 8 core machine with a
> scale of 40.  All the data fits in RAM, but not in shared_buffers (128MB).

> I find a 4% improvement for hash indexes over btree indexes, 9324.744
> vs 9727.766.  The difference is significant at p-value of 1.9e-9.

Thanks for doing this work!

> The four versions of hash indexes (HEAD, concurrent, wal, cache, applied
> cumulatively) have no statistically significant difference in performance
> from each other.

Interesting.

> I think I don't see improvement in hash performance with the concurrent and
> cache patches because I don't have enough cores to get to the contention
> that those patches are targeted at.

Possibly.  However, if the cache patch is not a prerequisite to the WAL
fixes, IMO somebody would have to demonstrate that it has a measurable
performance benefit before it would get in.  It certainly doesn't look
like it's simplifying the code, so I wouldn't take it otherwise.

I think, though, that this is enough to put to bed the argument that
we should toss the hash AM entirely.  If it's already competitive with
btree today, despite the lack of attention that it's gotten, there is
reason to hope that it will be a significant win (for some use-cases,
obviously) in future.  We should now get back to reviewing these patches
on their own merits.

regards, tom lane




Re: [HACKERS] Hash Indexes

2016-10-03 Thread Jeff Janes
On Thu, Sep 29, 2016 at 5:14 PM, Robert Haas  wrote:

> On Thu, Sep 29, 2016 at 8:07 PM, Peter Geoghegan  wrote:
> > On Wed, Sep 28, 2016 at 8:06 PM, Andres Freund 
> wrote:
> >> On 2016-09-28 15:04:30 -0400, Robert Haas wrote:
> >>> Andres already
> >>> stated that he thinks working on btree-over-hash would be more
> >>> beneficial than fixing hash, but at this point it seems like he's the
> >>> only one who takes that position.
> >>
> >> Note that I did *NOT* take that position. I was saying that I think we
> >> should evaluate whether that's not a better approach, doing some simple
> >> performance comparisons.
> >
> > I, for one, agree with this position.
>
> Well, I, for one, find it frustrating.  It seems pretty unhelpful to
> bring this up only after the code has already been written.  The first
> post on this thread was on May 10th.  The first version of the patch
> was posted on June 16th.  This position was first articulated on
> September 15th.
>
> But, by all means, please feel free to do the performance comparison
> and post the results.  I'd be curious to see them myself.
>


I've done a simple comparison using pgbench's default transaction, in which
all the primary keys have been dropped and replaced with indexes of either
hash or btree type, alternating over many rounds.

I run 'pgbench -c16 -j16 -T 900 -M prepared' on an 8 core machine with a
scale of 40.  All the data fits in RAM, but not in shared_buffers (128MB).
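
The per-round index swap is essentially the following sketch (index names
invented; plain indexes, since hash indexes cannot back a primary key),
and similarly for pgbench_branches and pgbench_tellers:

  ALTER TABLE pgbench_accounts DROP CONSTRAINT pgbench_accounts_pkey;
  CREATE INDEX pgbench_accounts_aid_hash ON pgbench_accounts USING hash (aid);
  -- and for the btree rounds:
  -- CREATE INDEX pgbench_accounts_aid_btree ON pgbench_accounts USING btree (aid);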

I find a 4% improvement for hash indexes over btree indexes, 9324.744
vs 9727.766.  The difference is significant at p-value of 1.9e-9.

The four versions of hash indexes (HEAD, concurrent, wal, cache, applied
cumulatively) have no statistically significant difference in performance
from each other.

I certainly don't see how btree-over-hash-over-integer could be better than
direct btree-over-integer.

I think I don't see improvement in hash performance with the concurrent and
cache patches because I don't have enough cores to get to the contention
that those patches are targeted at.  But since the concurrent patch is a
prerequisite to the wal patch, that is enough to justify it even without a
demonstrated performance boost.  A 4% gain is not astonishing, but is nice
to have provided we can get it without giving up crash safety.

Cheers,

Jeff


Re: [HACKERS] Tracking wait event for latches

2016-10-03 Thread Robert Haas
On Mon, Oct 3, 2016 at 3:30 AM, Michael Paquier
 wrote:
> [ new patch ]

I think this is unnecessarily awkward for callers; the attached
version takes a different approach which I think will be more
convenient.  The attached version also (1) moves a lot more of the
logic from latch.c/h to pgstat.c/h, which I think is more appropriate;
(2) more thoroughly separates the wait events by class; (3) renames
SecureRead/SecureWrite to ClientRead/ClientWrite (whether to also
rename the C functions is an interesting question, but not the most
pressing one IMHO), (4) creates a real wait event for GetSafeSnapshot
and removes the unnecessary and overly generic ProcSleep and
ProcSignal wait events, and (5) incorporates a bit of copy editing.
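
With the patch applied, the new classes surface through the existing
pg_stat_activity columns; an illustrative query:

    SELECT pid, wait_event_type, wait_event
      FROM pg_stat_activity
     WHERE wait_event IS NOT NULL;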

I've tested that this seems to work in basic cases, but more testing
is surely welcome.  If there are no major objections, I will commit
this version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 8ca1c1c..9badfe6 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -17,6 +17,7 @@
 #include "access/xact.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "storage/latch.h"
 #include "utils/hsearch.h"
 #include "utils/memutils.h"
@@ -496,7 +497,7 @@ pgfdw_get_result(PGconn *conn, const char *query)
 			wc = WaitLatchOrSocket(MyLatch,
    WL_LATCH_SET | WL_SOCKET_READABLE,
    PQsocket(conn),
-   -1L);
+   -1L, WAIT_EXTENSION);
 			ResetLatch(MyLatch);
 
 			CHECK_FOR_INTERRUPTS();
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index f400785..c5d7728 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -679,6 +679,42 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   buffer in question.
  
 
+
+ 
+  Activity: The server process is idle.  This is used by
+  system processes waiting for activity in their main processing loop.
+  wait_event will identify the specific wait point.
+ 
+
+
+ 
+  Extension: The server process is waiting for activity
+  in an extension module.  This category is useful for modules to
+  track custom waiting points.
+ 
+
+
+ 
+  Client: The server process is waiting for some activity
+  on a socket from user applications, and that the server expects
+  something to happen that is independent from its internal processes.
+  wait_event will identify the specific wait point.
+ 
+
+
+ 
+  IPC: The server process is waiting for some activity
+  from another process in the server.  wait_event will
+  identify the specific wait point.
+ 
+
+
+ 
+  Timeout: The server process is waiting for a timeout
+  to expire.  wait_event will identify the specific wait
+  point.
+ 
+

   
  
@@ -1085,6 +1121,139 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  BufferPin
  Waiting to acquire a pin on a buffer.
 
+
+ Activity
+ ArchiverMain
+ Waiting in main loop of the archiver process.
+
+
+ AutoVacuumMain
+ Waiting in main loop of autovacuum launcher process.
+
+
+ BgWriterHibernate
+ Waiting in background writer process, hibernating.
+
+
+ BgWriterMain
+ Waiting in main loop of background writer process background worker.
+
+
+ CheckpointerMain
+ Waiting in main loop of checkpointer process.
+
+
+ PgStatMain
+ Waiting in main loop of the statistics collector process.
+
+
+ RecoveryWalAll
+ Waiting for WAL from any kind of source (local, archive or stream) at recovery.
+
+
+ RecoveryWalStream
+ Waiting for WAL from a stream at recovery.
+
+
+ SysLoggerMain
+ Waiting in main loop of syslogger process.
+
+
+ WalReceiverMain
+ Waiting in main loop of WAL receiver process.
+
+
+ WalSenderMain
+ Waiting in main loop of WAL sender process.
+
+
+ WalWriterMain
+ Waiting in main loop of WAL writer process.
+
+
+ Client
+ ClientRead
+ Waiting to read data from the client.
+
+
+ ClientWrite
+ Waiting to write data from the client.
+
+
+ SSLOpenServer
+ Waiting for SSL while attempting connection.
+
+   

Re: [HACKERS] pgbench more operators & functions

2016-10-03 Thread Tom Lane
Stephen Frost  writes:
> * Fabien COELHO (coe...@cri.ensmp.fr) wrote:
>> In the attached patched I only included pg operators, plus "xor"
>> which I feel is missing and does not seem to harm.

> I'm pretty sure we should hold off on adding 'xor' until it's actually
> in PG proper, otherwise we run a very serious risk that whatever we do
> to add it in PG will be different from what you're suggesting we do here
> for pgbench.

Agreed --- we don't really want stuff in pgbench's language that's not
in regular SQL, IMO.

>> Indeed, some kind of "if" is needed, for instance to implement
>> "tpc-b" correctly.

> That's an interesting point..  Have you thought about ripping out the
> built-in TPC-B-like functionality of pgbench and making that into a
> script instead?

It already is a script, it's just hardwired as a string constant in
pgbench.c rather than being a separate file.  I think Fabien is
suggesting that it could be changed to more nearly approximate the
actual TPC-B spec, but IMO that would be a seriously bad idea because
it would invalidate all cross-version performance comparisons.  We
decided years ago that the default script is what it is and we aren't
going to change it to try to match TPC-B more exactly.

>> The SQL syntax for CASE is on the very heavy side and would be quite
>> complicated to implement in pgbench, so I rejected that and selected
>> the simplest possible function for the job.

> I'm not quite sure that I follow why you feel that CASE would be too
> difficult to implement here..?

If you want simple, you could provide just a subset of CASE (ie, only
the CASE WHEN boolexpr variant).  I think inventing some random new syntax
is a bad idea.
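
For illustration only (a sketch, not a grammar proposal), that subset
might allow something like

    \set delta CASE WHEN :aid < 100 THEN 1 ELSE -1 END

in a \set expression.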

regards, tom lane




[HACKERS] Removing link-time cross-module refs in contrib

2016-10-03 Thread Tom Lane
Pursuant to Andres' suggestion in
https://www.postgresql.org/message-id/20161002223927.57xns3arkdg4h...@alap3.anarazel.de
attached is a draft patch that gets rid of link-time references
from hstore_plpython to both hstore and plpython.  I've verified
that this allows "LOAD 'hstore_plpython'" to succeed in a fresh
session without having loaded the prerequisite modules first.

The patch seems pretty successful in terms of being noninvasive to
the code.  I think the major objection to it would be that we no
longer have any direct compiler-verified connection between the
signatures of the called functions in hstore/plpython and what
hstore_plpython thinks they are.  That is, if someone were to
modify hstore and change the signature of say hstoreCheckKeyLen,
this would not cause any compiler complaints in hstore_plpython,
just runtime problems.

A slightly ugly way of alleviating that risk would be to put the
function typedefs right beside the externs in the originating
modules, eg in hstore.h

 extern size_t hstoreCheckKeyLen(size_t len);
+typedef size_t (*hstoreCheckKeyLen_t) (size_t len);

I'm not sure if that is a net improvement or not --- thoughts?

If we were to push forward with this idea, the remaining work
would be to fix the other two contrib transform modules similarly,
after which I would want to revert the changes in commit cac765820
and later that suppressed linker errors for unresolved symbols in
contrib links.  The possibility of getting back that error detection
is actually the main motivation for this IMO.

Comments?

regards, tom lane

diff --git a/contrib/hstore_plpython/Makefile b/contrib/hstore_plpython/Makefile
index c4dad6f..a55c9a1 100644
*** a/contrib/hstore_plpython/Makefile
--- b/contrib/hstore_plpython/Makefile
*** MODULE_big = hstore_plpython$(python_maj
*** 4,10 
  OBJS = hstore_plpython.o $(WIN32RES)
  PGFILEDESC = "hstore_plpython - hstore transform for plpython"
  
! PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpython $(python_includespec) -I$(top_srcdir)/contrib/hstore
  
  EXTENSION = hstore_plpythonu hstore_plpython2u hstore_plpython3u
  DATA = hstore_plpythonu--1.0.sql hstore_plpython2u--1.0.sql hstore_plpython3u--1.0.sql
--- 4,10 
  OBJS = hstore_plpython.o $(WIN32RES)
  PGFILEDESC = "hstore_plpython - hstore transform for plpython"
  
! PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpython $(python_includespec) -I$(top_srcdir)/contrib/hstore -DPLPYTHON_LIBNAME='"plpython$(python_majorversion)"'
  
  EXTENSION = hstore_plpythonu hstore_plpython2u hstore_plpython3u
  DATA = hstore_plpythonu--1.0.sql hstore_plpython2u--1.0.sql hstore_plpython3u--1.0.sql
*** include $(top_builddir)/src/Makefile.glo
*** 23,41 
  include $(top_srcdir)/contrib/contrib-global.mk
  endif
  
! # In configurations that forbid undefined symbols in libraries, link with each
! # dependency.  This does preclude pgxs builds.
  ifeq ($(PORTNAME), aix)
  rpathdir = $(pkglibdir):$(python_libdir)
! SHLIB_LINK += ../hstore/libhstore.exp $(python_libspec) $(python_additional_libs) $(sort $(wildcard ../../src/pl/plpython/libplpython*.exp))
! endif
  ifeq ($(PORTNAME), win32)
! SHLIB_LINK += ../hstore/libhstore.a $(sort $(wildcard ../../src/pl/plpython/libpython*.a)) $(sort $(wildcard ../../src/pl/plpython/libplpython*.a))
  endif
- 
- ifeq ($(PORTNAME), cygwin)
- SHLIB_LINK += -L../hstore -lhstore -L../../src/pl/plpython \
- 	-lplpython$(python_majorversion) $(python_libspec)
  endif
  
  REGRESS_OPTS += --load-extension=hstore
--- 23,40 
  include $(top_srcdir)/contrib/contrib-global.mk
  endif
  
! # We must link libpython explicitly
  ifeq ($(PORTNAME), aix)
  rpathdir = $(pkglibdir):$(python_libdir)
! SHLIB_LINK += $(python_libspec) $(python_additional_libs)
! else
  ifeq ($(PORTNAME), win32)
! # ... see silliness in plpython Makefile ...
! SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a))
! else
! rpathdir = $(python_libdir)
! SHLIB_LINK += $(python_libspec)
  endif
  endif
  
  REGRESS_OPTS += --load-extension=hstore
diff --git a/contrib/hstore_plpython/hstore_plpython.c b/contrib/hstore_plpython/hstore_plpython.c
index 6f2751a..9e2eb2f 100644
*** a/contrib/hstore_plpython/hstore_plpython.c
--- b/contrib/hstore_plpython/hstore_plpython.c
***
*** 1,4 
--- 1,5 
  #include "postgres.h"
+ 
  #include "fmgr.h"
  #include "plpython.h"
  #include "plpy_typeio.h"
***
*** 6,11 
--- 7,67 
  
  PG_MODULE_MAGIC;
  
+ extern void _PG_init(void);
+ 
+ /* Linkage to functions in plpython module */
+ typedef char *(*PLyObject_AsString_t) (PyObject *plrv);
+ 
+ static PLyObject_AsString_t PLyObject_AsString_p;
+ 
+ #define PLyObject_AsString PLyObject_AsString_p
+ 
+ /* Linkage to functions in hstore module */
+ typedef HStore *(*hstoreUpgrade_t) (Datum orig);
+ typedef int (*hstoreUniquePairs_t) (Pairs *a, int32 l, int32 *buflen);
+ typedef HStore *(*hstorePairs_t) (Pairs *pairs, int32 pcount, int32 buflen);
+ ty

[HACKERS] Proposal: ON UPDATE REMOVE foreign key action

2016-10-03 Thread Kirill Berezin
*One-line Summary:* On a foreign key update we are unable to remove all
dependent records. Currently we have "ON DELETE CASCADE", but no "ON UPDATE
CASCADE DELETE". We can only update the field to NULL or DEFAULT.


*Business Use-case:* Cache expiration on hash/version update. Revoke all
access on account id update.

In my case I met this situation: I am using access links to share a user
account. The account owner can give a private link to somebody, and that
person's session becomes mirrored (the owner's access to the account is
granted). You can imagine Facebook desktop and mobile sessions. It's just a
shortcut for entering credentials. Now I am implementing "revoke all but
me". It's done simply: since each user is uuid-indexed, I just generate a
new uuid for the current account. The old uuid becomes invalid for other
sessions, since no record is found in the database.
I want to remove any pending access links, to prevent a bad guy from
restoring access. I can possibly set the linked account to NULL, and then
clear the record on expiration, but I feel that automatic removal on an
update event is more rational.


*User impact with the change:* Instead of writing "on update" triggers for
each dependent table (roughly like the sketch below), the desired action is
done with a single line.
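
A sketch of that per-table trigger workaround (invented names, untested):

    CREATE FUNCTION drop_links_on_uuid_change() RETURNS trigger AS $$
    BEGIN
        -- remove dependent rows before the referenced key changes
        DELETE FROM access_links WHERE account_id = OLD.id;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER drop_links_before_uuid_change
        BEFORE UPDATE OF id ON accounts
        FOR EACH ROW
        WHEN (OLD.id IS DISTINCT FROM NEW.id)
        EXECUTE PROCEDURE drop_links_on_uuid_change();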

*Implementation details:* On cascade, switch the "update" action to "delete".

*Estimated Development Time:* A few hours or less.

*Opportunity Window Period:* Not applicable; minor feature.

*Budget Money:* I am ready to implement it myself, if approved.

*Contact Information:* ene...@exsul.net


Re: [HACKERS] pgbench more operators & functions

2016-10-03 Thread Stephen Frost
Fabien,

* Fabien COELHO (coe...@cri.ensmp.fr) wrote:
> >>bitwise: <<, >>, &, |, ^/#, ~
> >>comparisons: =/==, <>/!=, <, <=, >, >=
> >>logical: and/&&, or/||, xor/^^, not, !
> >
> >I'm not sure that we want to introduce operators '&&', '||' as logical
> >'and' and 'or' when those have specific meaning in PG which is different
> >(array overlaps and concatenation, specifically).  I can certainly see
> >how it could be useful to be able to perform string concatenation in a
> >\set command in pgbench, for example..
> >
> >Also, are we sure that we want to have both '=' and '==' for comparison?
> 
> The reason I added C operators is that Pg SQL has '!=' and a few
> others, so once some are there I do not see why others should not be
> there as well for consistency.
> 
> Now I agree that having operators which behave differently from psql
> to pgbench is a bad idea.
> 
> In the attached patched I only included pg operators, plus "xor"
> which I feel is missing and does not seem to harm.

I'm pretty sure we should hold off on adding 'xor' until it's actually
in PG proper, otherwise we run a very serious risk that whatever we do
to add it in PG will be different from what you're suggesting we do here
for pgbench.

> >[...] I certainly understand how the if() function could be useful
> 
> Indeed, some kind of "if" is needed, for instance to implement
> "tpc-b" correctly.

That's an interesting point..  Have you thought about ripping out the
built-in TPC-B-like functionality of pgbench and making that into a
script instead?  Doing so would likely point out places where we need to
add functionality or cases which aren't handled very well.  We'd also be
able to rip out all that code from pgbench and make it into a general
utility instead of "general utility + other stuff."

> >, but I'm not entirely sure that the if(expression, true-result,
> >false-result) is the best approach.  In particular, I recall cases
> >with other languages where that gets to be quite ugly when there
> >are multiple levels of if/else using that approach.
> 
> I think that pgbench is not a programming language, but an
> expression language, which means that you have to provide a result,
> so you can only have balanced if/then/else, which simplifies things
> a bit compared to "full" language.
> 
> The SQL syntax for CASE is on the very heavy side and would be quite
> complicated to implement in pgbench, so I rejected that and selected
> the simplest possible function for the job.

I'm not quite sure that I follow why you feel that CASE would be too
difficult to implement here..?

> Maybe the "if" function could be named something fancier to avoid
> possible future clash on an eventual "if" keyword? Note that excel
> has an IF function. Maybe "conditional"... However, as I do not
> envision pgbench growing to a full language, I would be fine keeping
> "if" as it is.

Excel would be one of the ugly if-nesting cases that I was thinking
of, actually.  I'm certainly not thrilled with the idea of modelling
anything off of it.

> >The table should really have a row per operator, which is what we
> >do in pretty much all of the core documention. [...]
> 
> Ok for one line per operator.

Thanks!

> >The question which was brought up about having a line-continuation
> >(eg: '\') character would be useful,
> 
> :-) I submitted a patch for that, which got rejected and resulted in
> Tom making pgbench share psql lexer. This is a good thing, although
> the continuation extension was lost in the process. Also this means
> that now such logic would probably be shared with psql, not
> necessarily a bad thing either, but clearly material for another
> patch.

Sure, that makes sense to do independently.

> >which really makes me wonder if we shouldn't try to come up with a
> >way to create functions in a pgbench script that can later be used
> >in a \set command.
> 
> I do not think that pgbench script is a good target to define
> functions, because the script is implicitely in a very large loop
> which is executing it over and over again. I do not think that it
> would make much sense to redefine functions on each transaction...
> So my opinion is that without a compeling use case, I would avoid
> such a feature, which would need some careful thinking on the design
> side, and the implementation of which is non trivial.

If we're going to replace the built-in TPC-B functionality then we're
going to need a way to pass in an 'init' script of some kind which only
gets run once, that could certainly be where functions could be
defined.  As for the use-case for functions, well, we need an if()
construct, as discussed, and having a need for loops doesn't seem too
far off from needing conditional control structures.

> >With such an approach, we could have proper control structures for
> >conditionals (if/else), loops (while/for), etc, and not complicate
> >the single-statement set of operators with those constructs.
> 
> If you want control, you can already use a DO 

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2016-10-03 Thread Daniel Verite
Craig Ringer wrote:

> I think it's mostly of interest to app authors and driver developers
> and that's what it's aimed at. pg_restore could benefit a lot too.

Wouldn't pgbench benefit from it?
It was mentioned some time ago [1], in relation to the
\into construct, that client-server latency was important enough to
justify the use of a "\;" separator between statements, to send them
as a group.

But with the libpq batch API, maybe this could be modernized
with meta-commands like this:
  \startbatch
  ...
  \endbatch
which would essentially call PQbeginBatchMode() and PQsendEndBatch().
Inside the batch section, collecting results would be interleaved
with sending queries.
Interdependencies between results and subsequent queries could
be handled or ignored, depending on how sophisticated we'd want
this.

This might also draw more users to the batch API, because
it would make it easier to check how exactly it affects
the performance of specific sequences of SQL statements to be
grouped in a batch.
For instance it would make sense for programmers to benchmark mock-ups
of their code with pgbench with/without batching, before embarking on
adapting it from blocking mode to asynchronous/non-blocking mode.
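
As a sketch, using the hypothetical meta-commands above and the standard
pgbench variables:

  \startbatch
  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
  \endbatch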


[1]
https://www.postgresql.org/message-id/alpine.DEB.2.20.1607140925400.1962%40sto

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] COPY command with RLS bug

2016-10-03 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Sat, Oct 1, 2016 at 3:11 AM, Stephen Frost  wrote:
> > Comments and testing welcome, of course, though it's looking pretty good
> > to me at this point and I'll likely commit it in another day or two
> > unless issues are found.
> 
> +* nodes/value.h) that correspond to the column name
> "correspondS"?
> 
> Except that, it looks good to me.

Ah, good point, fixed.

Just working through back-patching it and doing final checks, will be
pushing it shortly.

Thanks!

Stephen




Re: [HACKERS] Renaming of pg_xlog and pg_clog

2016-10-03 Thread Tom Lane
Robert Haas  writes:
> I think the tests for PQserverVersion(conn) / 100 >= 1000 are strange.
I submit that either PQserverVersion(conn) >= 100000 or
PQserverVersion(conn) / 10000 >= 10 is an easier-to-understand test.
> I vote for the first style.

+1, that's the way most existing tests of this sort are written.

(Right at the moment, there's kind of a lot of zeroes there, but
once we get to version 11 it'll be less bad.)

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Copy-editing for contrib/pg_visibility documentation.

2016-10-03 Thread Tom Lane
Robert Haas  writes:
> Sure, I'm not arguing with trying to be formal.  The grammatical rule
> that you're describing doesn't exist for me, though.  I believe that
> "that" can only introduce a restrictive clause, whereas "which" can
> introduce either a descriptive or a restrictive clause.

Yeah, as was noted downthread, that's the British view of it.

> It's impossible for me to imagine someone reading "functions which return text
> must do X" and coming away with the conclusion that all functions
> return text.

I deliberately chose an example in which the implication was silly, but
in other cases it's less silly and so it may not be clear to the reader
that you didn't intend to imply it.

> The reason I tend to prefer "which" is that "that" can mean lots of
> other things, too.

Sure, but you can make examples in the other direction as well.  FWIW,
I agree that it's a good idea to try to avoid "that that" and similar
cases where confusion could be introduced by multiple possible meanings
of "that"; and this particular grammatical rule sometimes loses out in
such cases.  But the changes you complained about didn't involve any
such situation.

Anyway, we've probably beaten this horse to death.

regards, tom lane




Re: [HACKERS] Renaming of pg_xlog and pg_clog

2016-10-03 Thread Christoph Berg
Re: Michael Paquier 2016-09-30 


"pg_trans" is used in two places:

> -pg_clog records the commit status for each transaction that has been assigned
> +pg_transaction records the commit status for each transaction that has been
> assigned

> - /* copy old commit logs to new data dir */
> - copy_subdir_files("pg_clog");
> + /*
> +  * Copy old commit logs to new data dir. pg_clog has been renamed to
> +  * pg_transaction in post-10 clusters.

(Fwiw, I'd prefer something shorter than the imho clumsy
pg_transaction. "pg_xact" sounded very fine for me, it also combines
nicely with pg_subxact and the existing pg_multixact.)

Christoph




Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-10-03 Thread Christoph Berg
Hi Gilles,

I've just tried v4 of the patch. The OID you picked for
pg_current_logfile doesn't work anymore, but after increasing it
randomly by 1, it compiles. I like the added functionality,
especially that "select pg_read_file(pg_current_logfile());" just
works.

What bugs me is the new file "pg_log_file" in PGDATA. It clutters the
directory listing. I wouldn't know where else to put it, but you might
want to cross-check that with the thread that is trying to reshuffle
the directory layout to make it easier to exclude files from backups.
(Should this file be part of backups?)

It's probably correct to leave the file around on shutdown (given it's
still a correct pointer). But there might be a case for removing it on
startup if logging_collector isn't active anymore.

Also, pg_log_file is tab-completion-unfriendly, it conflicts with
pg_log/. Maybe name it current_logfile?

Another thing that might possibly be improved is csv logging:

# select pg_read_file(pg_current_logfile());
 pg_read_file  
───
 LOG:  ending log output to stderr↵
 HINT:  Future log output will go to log destination "csvlog".↵

-rw-------  1 cbe staff 1011 Okt  3 15:06 postgresql-2016-10-03_150602.csv
-rw-------  1 cbe staff   96 Okt  3 15:06 postgresql-2016-10-03_150602.log

... though it's unclear what to do if both stderr and csvlog are
selected.

Possibly NULL should be returned if only "syslog" is selected.
(Maybe remove pg_log_file once 'HINT:  Future log output will go to
log destination "syslog".' is logged?)

Christoph




Re: [HACKERS] Decoding proacl

2016-10-03 Thread Stephen Frost
Greetings Bene,

* Benedikt Grundmann (bgrundm...@janestreet.com) wrote:
> I'm trying to understand how to decode proacl in pg_proc.  The
> documentation says:
> 
> PostgreSQL grants default privileges on some types of objects to PUBLIC.
> ... EXECUTE privilege for functions; ... Also, these initial default
> privilege settings can be changed using the ALTER DEFAULT PRIVILEGES
> command.
> 
I also found an email by
Tom saying that NULL means the default of execute to public.
> 
> Questions:
> 
> a) Does NULL mean execute to public?  Or does it mean whatever
> pg_default_acl contains for functions?

NULL means that the default rights are in place.  For functions that
means 'PUBLIC' has 'EXECUTE' rights on the function, yes.

> b) What does it mean if pg_default_acl is empty?

This means that no 'DEFAULT PRIVILEGES' have been set using the command
'ALTER DEFAULT PRIVILEGES'.  See:

https://www.postgresql.org/docs/current/static/sql-alterdefaultprivileges.html
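
An entry appears there only after something along these lines (purely an
example):

    ALTER DEFAULT PRIVILEGES IN SCHEMA public
        REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;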

> c) If NULL means execute to public can somebody explain why this happens:
> 
> postgres_prod@proddb_testing=# select proacl from pg_proc where proname =
> 'hstore_eq';
> ─[ RECORD 1 ]
> proacl │ ¤
> 
> Time: 87.862 ms
> postgres_prod@proddb_testing=# grant execute on function hstore_eq(hstore,
> hstore) to public;
> GRANT
> Time: 88.931 ms
> postgres_prod@proddb_testing=# select proacl from pg_proc where proname =
> 'hstore_eq';
> ─[ RECORD 1 ]
> proacl │ *{=X/postgres_prod,postgres_prod=X/postgres_prod}*
> 
> I would have expected the bold to still be NULL.  Also I haven't found any
> combination of statements to set it back to NULL (short of dropping and
> recreating hstore_eq).

The act of running the GRANT causes the ACL to be set explicitly.  There
is no way to set it back to NULL once it's been set (but you can
certainly set the privileges to any possible value, including one which
has the same effect as having it be set to NULL..).

> Which leads me to I guess the most important questions.
> 
> d) Other than compactness in the representation of acls is there any
> practical difference between an the above representation and having NULL in
> proacl?

No.

If you're curious, check the acldefault() function.  Using that you can
see what a value of 'NULL' actually means, and for a function, you'll
see it means exactly the same as the ACL above.
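
For instance (the expected result, per the above, is sketched in the
comment):

    SELECT acldefault('f', (SELECT oid FROM pg_roles
                            WHERE rolname = 'postgres_prod'));
    -- {=X/postgres_prod,postgres_prod=X/postgres_prod}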

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: Copy-editing for contrib/pg_visibility documentation.

2016-10-03 Thread Robert Haas
On Mon, Oct 3, 2016 at 8:51 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Why do you keep insisting on changing case where I've written "which"
>> to instead say "that" in situations where AFAIK either is perfectly
>> correct?  I find such changes at best neutral, and in some cases
>> worse.
>
> What I was taught in school was that "that" introduces a restrictive
> clause, i.e. one that limits the membership of whatever group was
> just mentioned, while "which" introduces a descriptive clause, i.e.
> one that just provides more information about the group.  So for
> example
>
> Functions that return a pass-by-reference type must do X.
>
> is correct, while
>
> Functions, which return a pass-by-reference type, must do X.
>
> carries an implication that *all* functions in the system return
> pass-by-reference types.  Even if you think that that's obviously
> silly, it may confuse readers who are accustomed to this distinction
> being drawn.  On the other hand, this is fine:
>
> Functions that return text, which is a pass-by-reference type,
> must do X.
>
> I've made the point more obvious in the above by setting off descriptive
> clauses with commas, which is a common thing to do.  But the punctuation
> is optional.
>
> I realize that this is nitpickery, and wouldn't usually bother about
> the distinction in, say, code comments. But we are striving to be
> somewhat formal in the user-facing documentation, no?

Sure, I'm not arguing with trying to be formal.  The grammatical rule
that you're describing doesn't exist for me, though.  I believe that
"that" can only introduce a restrictive clause, whereas "which" can
introduce either a descriptive or a restrictive clause.  It's
impossible for me to imagine someone reading "functions which return text
must do X" and coming away with the conclusion that all functions
return text.

The reason I tend to prefer "which" is that "that" can mean lots of
other things, too.  Consider:

Donald Trump takes advantage of tax loopholes.  The tax loopholes that
that man uses should be eliminated.
vs.
Donald Trump takes advantage of tax loopholes.  The tax loopholes
which that man uses should be eliminated.

In my opinion, the second one is considerably superior.  If "which"
can only introduce a descriptive clause, then what does the second one
mean?  The sentence must now be construed to mean that some
unspecified set of tax loopholes should be eliminated.  We know that
each of them are used by Donald Trump, but not that they include every
loophole used by Donald Trump.  Of course, nobody would read the
sentence that way: it is drop-dead obvious that "which that man uses"
is intended to define the set of tax loopholes, not to describe it.  I
certainly used such constructions many times in the papers I wrote in
college, which I think also qualify as formal writing, and I don't
think I got dinged for doing so.

I tend to find the construction involving "which" to lead to easier
reading, because there's no ambiguity about it. Note that this is
perfectly fine:

The tax loopholes Donald Trump uses should be eliminated.

I don't know what that's called from the standpoint of formal grammar,
but it's clear enough.  So presumably I can also replace "Donald
Trump" with "that man":

The tax loopholes that man uses should be eliminated.

Well, now when I get to the word "that", my brain freezes up for a
second: is this introducing a clause beginning with "that", or is
introducing a clause beginning with nothing with "that" as the first
word of the clause?  I can't tell until I keep reading, and I might
have to backup and reread to be sure I'm understanding the meaning
correctly.  In contrast, if the clause had been introduced with
"which", it would have been clear immediately, which is a desirable
property for documentation to have.

I apologize for writing an email that mentions "Donald Trump" no less
than 7 times.  Sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Renaming of pg_xlog and pg_clog

2016-10-03 Thread Robert Haas
On Fri, Sep 30, 2016 at 1:45 AM, Michael Paquier
 wrote:
> As there have been no reviews at code level, I am moving that to the next CF.

Code review of 0001:

I think the tests for PQserverVersion(conn) / 100 >= 1000 are strange.
I submit that either PQserverVersion(conn) >= 100000 or
PQserverVersion(conn) / 10000 >= 10 is an easier-to-understand test.
I vote for the first style.  So in pg_basebackup.c I'd do:

#define MINIMUM_VERSION_FOR_PG_WAL 100000
...
int version = PQserverVersion(conn);
...
if (version >= MINIMUM_VERSION_FOR_PG_WAL)
   /* do whatever */

Also, for this sort of thing:

+    if (!conn)
+        /* Error message already written in GetConnection() */
+        exit(1);

...because of the comment, I'd add braces around this.

Tom changed the error-reporting framework for pg_upgrade in
f002ed2b8e45286fe73a36e119cba2016ea0de19, so you'll need to do some
rebasing over that.  I haven't checked what the new conventions are,
but obviously you'll want to try to be consistent with them.

Other than those minor nitpicks, I think this looks OK.  I'm not sure
we have consensus on 0002, but 0001 seems to be popular with far more
people than not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pageinspect: Hash index support

2016-10-03 Thread Jesper Pedersen

On 09/29/2016 04:02 PM, Peter Eisentraut wrote:

On 9/29/16 4:00 PM, Peter Eisentraut wrote:

Since the commit fest is drawing to a close, I'll set this patch as
returned with feedback.


Actually, the CF app informs me that moving to the next CF is more
appropriate, so I have done that.



Ok, thanks for your feedback.

Maybe "Returned with Feedback" is more appropriate, as there is still 
development left.


Best regards,
 Jesper





Re: [HACKERS] [COMMITTERS] pgsql: Copy-editing for contrib/pg_visibility documentation.

2016-10-03 Thread Tom Lane
Robert Haas  writes:
> Why do you keep insisting on changing case where I've written "which"
> to instead say "that" in situations where AFAIK either is perfectly
> correct?  I find such changes at best neutral, and in some cases
> worse.

What I was taught in school was that "that" introduces a restrictive
clause, i.e. one that limits the membership of whatever group was
just mentioned, while "which" introduces a descriptive clause, i.e.
one that just provides more information about the group.  So for
example

Functions that return a pass-by-reference type must do X.

is correct, while

Functions, which return a pass-by-reference type, must do X.

carries an implication that *all* functions in the system return
pass-by-reference types.  Even if you think that that's obviously
silly, it may confuse readers who are accustomed to this distinction
being drawn.  On the other hand, this is fine:

Functions that return text, which is a pass-by-reference type,
must do X.

I've made the point more obvious in the above by setting off descriptive
clauses with commas, which is a common thing to do.  But the punctuation
is optional.

I realize that this is nitpickery, and wouldn't usually bother about
the distinction in, say, code comments. But we are striving to be
somewhat formal in the user-facing documentation, no?

regards, tom lane




Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-10-03 Thread Christoph Berg
Re: Fabien COELHO 2016-10-03 
> > I "\set" a bunch of lengthy SQL commands in there, e.g.
> 
> I agree that this looks like a desirable feature, however I would tend to
> see that as material for another independent patch.

Sure, my question was by no means intending to stop your pgbench patch
from going forward by adding extra requirements.

> Hmmm. I'm not sure how this is parsed. If this is considered a string '...',
> then maybe \set should wait for the end of the string instead of the end of
> the line, i.e. no continuation would be needed...
> 
>  \set config '
> SELECT name, ...
>CASE ... END
> FROM pg_settings
> WHERE ...;'

I guess that would be the sane solution here, yes. Not adding any \
chars at the end of the line would also mean cut-and-paste of the RHS
content would work.

Thanks for the feedback!

Christoph




Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-03 Thread Anastasia Lubennikova

03.10.2016 05:22, Michael Paquier:

On Tue, Sep 27, 2016 at 12:17 AM, Anastasia Lubennikova
 wrote:

Ok, I'll write it in a few days.

Marked as returned with feedback per last emails exchanged.


The only complaint about this patch was the lack of a README,
which is fixed now (see the attachment). So I added it to the new CF,
marked as ready for committer.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d4f9090..99735ce 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1483,7 +1483,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1509,7 +1509,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_P(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1529,9 +1529,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2021,10 +2021,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, if no primary key exists.
  */
 static char **
-get_pkey_attnames(Relation rel, int16 *numatts)
+get_pkey_attnames(Relation rel, int16 *indnkeyatts)
 {
 	Relation	indexRelation;
 	ScanKeyData skey;
@@ -2034,8 +2034,8 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 	char	  **result = NULL;
 	TupleDesc	tupdesc;
 
-	/* initialize numatts to 0 in case no primary key exists */
-	*numatts = 0;
+	/* initialize indnkeyatts to 0 in case no primary key exists */
+	*indnkeyatts = 0;
 
 	tupdesc = rel->rd_att;
 
@@ -2056,12 +2056,12 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
-			if (*numatts > 0)
+			*indnkeyatts = index->indnkeyatts;
+			if (*indnkeyatts > 0)
 			{
-				result = (char **) palloc(*numatts * sizeof(char *));
+				result = (char **) palloc(*indnkeyatts * sizeof(char *));
 
-				for (i = 0; i < *numatts; i++)
+				for (i = 0; i < *indnkeyatts; i++)
 					result[i] = SPI_fname(tupdesc, index->indkey.values[i]);
 			}
 			break;
diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 7352b29..c024cf3 100644
--- a/contrib/tcn/tcn.c
+++ b/contrib/tcn/tcn.c
@@ -138,9 +138,9 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 		/* we're only interested if it is the primary key and valid */
 		if (index->indisprimary && IndexIsValid(index))
 		{
-			int			numatts = index->indnatts;
+			int			indnkeyatts = index->indnkeyatts;
 
-			if (numatts > 0)
+			if (indnkeyatts > 0)
 			{
 				int			i;
 
@@ -150,7 +150,7 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 				appendStringInfoCharMacro(payload, ',');
 				appendStringInfoCharMacro(payload, operation);
 
-				for (i = 0; i < numatts; i++)
+				for (i = 0; i < indnkeyatts; i++)
 				{
 					int			colno = index->indkey.values[i];
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 322d8d6..0983c93 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3564,6 +3564,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 40f201b..92bef4a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -110,6 +110,8 @@ typedef struct IndexAmRoutine
 	bool		amclusterable;
 /* does AM handle pr

Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-10-03 Thread Fabien COELHO


Hello Christoph,


Attached patch does what is described in the title, hopefully. Continuations
in other pgbench backslash-commands should be dealt with elsewhere...


Would (a similar version of) that patch also apply to .psqlrc?


Pgbench has its own lexer & parser for \set expressions, so the 
continuation is handled there.



I "\set" a bunch of lengthy SQL commands in there, e.g.


I agree that this looks like a desirable feature, however I would tend to 
see that as material for another independent patch.


I think that .psqlrc is really just a "psql" script, so the handling
would be somewhere in psql... I'll have a look.


\set config 'SELECT name, current_setting(name), CASE source WHEN 
$$configuration file$$ THEN sourcefile||$$:$$||sourceline ELSE source 
END FROM pg_settings WHERE source <> $$default$$;'


Hmmm. I'm not sure how this is parsed. If this is considered a string 
'...', then maybe \set should wait for the end of the string instead of 
the end of the line, i.e. no continuation would be needed...


 \set config '
SELECT name, ...
   CASE ... END
FROM pg_settings
WHERE ...;'


Being able to split that over several lines would greatly improve
maintainability. (Though I do realize this would also require a notation
for splitting/continuing strings.)


Yep. I'm not sure exactly which feature is needed.

--
Fabien.




Re: [HACKERS] Tuplesort merge pre-reading

2016-10-03 Thread Peter Geoghegan
On Mon, Oct 3, 2016 at 3:39 AM, Heikki Linnakangas  wrote:
>> Can't you just use state->tapeRange, and remove the "continue"? I
>> recommend referring to "now-exhausted input tapes" here, too.
>
>
> Don't think so. result_tape == tapeRange only when the merge was done in a
> single pass (or you're otherwise lucky).

Ah, yes. Logical tape assignment/physical tape number confusion on my part here.

>> * I'm not completely prepared to give up on using
>> MemoryContextAllocHuge() within logtape.c just yet. Maybe you're right
>> that it couldn't possibly matter that we impose a MaxAllocSize cap
>> within logtape.c (per tape), but I have slight reservations that I
>> need to address. Maybe a better way of putting it would be that I have
>> some reservations about possible regressions at the very high end,
>> with very large workMem. Any thoughts on that?
>
>
> Meh, I can't imagine that using more than 1 GB for a read-ahead buffer could
> make any difference in practice. If you have a very large work_mem, you'll
> surely get away with a single merge pass, and fragmentation won't become an
> issue. And 1GB should be more than enough to trigger OS read-ahead.

I had a non-specific concern, not a concrete suspicion about this. I
think I'll figure it out when I rebase the parallel CREATE INDEX patch
on top of this and test it.

> Committed with some final kibitzing. Thanks for the review!

Thanks for working on this!

> PS. This patch didn't fix bug #14344, the premature reuse of memory with
> tuplesort_gettupleslot. We'll still need to come up with 1. a backportable
> fix for that, and 2. perhaps a different fix for master.

Agreed. It seemed like you favor not changing memory ownership
semantics for 9.6. I'm not sure that that's the easiest approach for
9.6, but let's discuss that over on the dedicated thread soon.

-- 
Peter Geoghegan




Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-10-03 Thread Karl O. Pinc
On Mon, 3 Oct 2016 13:35:16 +0900
Michael Paquier  wrote:
> 
> Moved to next CF, the patch still applies. Karl, you have registered
> to review this patch a couple of months back but nothing happened. I
> have removed your name for now. If you have time, don't hesitate to
> come back to it.

Right.  Hope to get back to it soon.  Won't register until I'm ready
to look at it.


Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein




Re: [HACKERS] Macro customizable hashtable / bitmapscan & aggregation perf

2016-10-03 Thread Arthur Silva
On Sat, Oct 1, 2016 at 2:44 AM, Andres Freund  wrote:

> Hi,
>
> On 2016-07-26 17:43:33 -0700, Andres Freund wrote:
> > In the attached patch I've attached simplehash.h, which can be
> > customized by a bunch of macros, before being inlined.  There's also a
> > patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via
> > execGrouping.c.
>
> Attached is a significantly improved version of this series.  The main
> changes are:
>
> - The hash table now uses robin-hood style hash collision handling. See the
>   commit message of 0002 (or simplehash.h) for an introduction. That
>   significantly reduces the impact of "clustering" due to linear
>   addressing.
> - Significant comment and correctness fixes, both in simplehash.h
> - itself, and 0003/0004.
> - a lot of other random performance improvements for the hashing code.
>
>
> Unfortunately I'm running out battery right now, so I don't want to
> re-run the benchmarks posted upthread (loading takes a while). But the
> last time I ran them all the results after the patches were better than
> before.
>
>
> This patch series currently consists of four patches:
> - 0001 - add unlikely/likely() macros. The use here is not entirely
>   mandatory, but they do provide a quite consistent improvement. There
>   are other threads where they proved to be beneficial, so I see little
>   reason not to introduce them.
> - 0002 - the customizable hashtable itself. It now contains documentation.
> - 0003 - convert tidbitmap.c to use the new hashmap. This includes a fix
>   for the issue mentioned in [1], to improve performance in heavily lossy
>   scenarios (otherwise we could regress in some extreme cases)
> - 0004 - convert execGrouping.c, e.g. used by hash aggregates
>
>
> While not quite perfect yet, I do think this is at a state where input
> is needed.  I don't want to go a lot deeper into this rabbit hole,
> before we have some agreement that this is a sensible course of action.
>
>
> I think there's several possible additional users of simplehash.h,
> e.g. catcache.c - which has an open coded, not particularly fast hash
> table and frequently shows up in profiles - but I think the above two
> conversions are plenty to start with.
>
>
> Comments?
>
>
> Greetings,
>
> Andres Freund
>
> [1] http://archives.postgresql.org/message-id/20160923205843.zcs
> 533sqdzlh4cpo%40alap3.anarazel.de
>

This is a great addition.

A couple of comments.

* 80% occupancy is a bit conservative for RH hashing; it works well up to
90% if you use the early stops based on distance, so that TODO is worth
pursuing.

* An optimized memory layout for an RH hashmap is along the lines of
HHHKVKV, using one bit of the hash as the present/absent flag. That plays
especially well with large occupancies (~90%). Due to RH, the probes on the
H array are very short and stay within a single cache line. Even with a
31-bit hash it's really rare to have to probe more than 1 entry in the KV
array, essentially guaranteeing that the 99th percentile of lookups will
only hit a couple of cache lines (if you ignore crossing cache lines and
other details).
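
To sketch that layout concretely, here is a minimal, self-contained
lookup over such a table. Everything below is illustrative: the names,
the 31-bit tag, and the top-bit occupied flag are one reading of the
layout described above, not simplehash.h's actual code.

#include <stdint.h>
#include <stddef.h>

#define RH_OCCUPIED 0x80000000u
#define RH_HASHMASK 0x7FFFFFFFu

typedef struct
{
	uint32_t	key;
	uint32_t	value;
} RHPair;

typedef struct
{
	uint32_t   *hashes;			/* HHH...: occupied flag + 31-bit hash */
	RHPair	   *pairs;			/* ...KVKV: parallel key/value array */
	uint32_t	mask;			/* capacity - 1; capacity is 2^n */
} RHTable;

static RHPair *
rh_lookup(RHTable *tab, uint32_t hash, uint32_t key)
{
	uint32_t	tag = hash & RH_HASHMASK;
	uint32_t	pos = tag & tab->mask;
	uint32_t	dist = 0;		/* our distance from the home slot */

	for (;;)
	{
		uint32_t	h = tab->hashes[pos];

		/* empty slot: the key cannot be in the table */
		if ((h & RH_OCCUPIED) == 0)
			return NULL;

		/*
		 * Robin-hood early stop: if the resident entry sits closer to
		 * its own home slot than we are to ours, our key would have
		 * displaced it on insert, so it cannot appear further on.
		 */
		if (dist > ((pos - (h & RH_HASHMASK)) & tab->mask))
			return NULL;

		/* cheap 31-bit tag check first; only then touch the KV array */
		if ((h & RH_HASHMASK) == tag && tab->pairs[pos].key == key)
			return &tab->pairs[pos];

		pos = (pos + 1) & tab->mask;
		dist++;
	}
}

The H array is what stays hot: at ~90% occupancy the probe sequences
are short enough that the KV array is typically touched exactly once,
on the final tag match.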


Re: [HACKERS] multivariate statistics (v19)

2016-10-03 Thread Heikki Linnakangas

On 10/03/2016 04:46 AM, Michael Paquier wrote:

On Fri, Sep 30, 2016 at 8:10 PM, Heikki Linnakangas  wrote:

This patch set is in pretty good shape, the only problem is that it's so big
that no-one seems to have the time or courage to do the final touches and
commit it.


Did you see my suggestions about simplifying its SQL structure? You
could shave some code without impacting the base set of features.


Yeah. The idea was to use something like pg_node_tree to store all the 
different kinds of statistics, the histogram, the MCV, and the 
functional dependencies, in one datum. Or JSON, maybe. It sounds better 
than an opaque bytea blob, although I'd prefer something more 
relational. For the functional dependencies, I think we could get away 
with a simple float array, so let's do that in the first cut, and 
revisit this for the MCV and histogram later. Separate columns for the 
functional dependencies, the MCVs, and the histogram, probably makes 
sense anyway.
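
To make the "simple float array" idea concrete, here is a minimal
sketch, purely illustrative and not from the patch, of how the
(determinant => dependent) pairs of an N-column statistics object
could be mapped to fixed positions in one such array:

/*
 * Position of the degree for the dependency (det => dep) in a flat
 * array of N * (N - 1) floats; det and dep are 0-based column indexes
 * within the statistics object, det != dep.
 */
static int
dependency_index(int nattrs, int det, int dep)
{
	/* det rows of (nattrs - 1) slots; skip the diagonal */
	return det * (nattrs - 1) + dep - (dep > det ? 1 : 0);
}

With nattrs = 3, the pairs a=>b, a=>c, b=>a, b=>c, c=>a, c=>b land at
positions 0 through 5, so a single float4[] per statistics object
would indeed suffice for the first cut.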


- Heikki





[HACKERS] Decoding proacl

2016-10-03 Thread Benedikt Grundmann
I'm trying to understand how to decode proacl in pg_proc.  The
documentation says:

PostgreSQL grants default privileges on some types of objects to PUBLIC.
... EXECUTE privilege for functions; ... Also, these initial default
privilege settings can be changed using the ALTER DEFAULT PRIVILEGES
command.

I also found an email by Tom saying that NULL means the default of
execute to public.

Questions:

a) Does NULL mean execute to public?  Or does it mean whatever
pg_default_acl contains for functions?

b) What does it mean if pg_default_acl is empty?

c) If NULL means execute to public, can somebody explain why this happens:

postgres_prod@proddb_testing=# select proacl from pg_proc where proname =
'hstore_eq';
─[ RECORD 1 ]
proacl │ ¤

Time: 87.862 ms
postgres_prod@proddb_testing=# grant execute on function hstore_eq(hstore,
hstore) to public;
GRANT
Time: 88.931 ms
postgres_prod@proddb_testing=# select proacl from pg_proc where proname =
'hstore_eq';
─[ RECORD 1 ]
proacl │ {=X/postgres_prod,postgres_prod=X/postgres_prod}

I would have expected that value to still be NULL.  Also, I haven't found
any combination of statements to set it back to NULL (short of dropping
and recreating hstore_eq).

Which leads me to what I guess is the most important question.

d) Other than compactness in the representation of ACLs, is there any
practical difference between the above representation and having NULL in
proacl?

Thanks in advance,

Bene


Re: [HACKERS] asynchronous execution

2016-10-03 Thread Kyotaro HORIGUCHI
Thank you for the thought.


At Fri, 23 Sep 2016 21:09:03 -0400, Robert Haas  wrote 
in 
> [ Adjusting subject line to reflect the actual topic of discussion better. ]
> 
> On Fri, Sep 23, 2016 at 9:29 AM, Robert Haas  wrote:
> > On Fri, Sep 23, 2016 at 8:45 AM, Amit Khandekar  
> > wrote:
> >> For e.g., in the above plan which you specified, suppose :
> >> 1. Hash Join has called ExecProcNode() for the child foreign scan b, and so
> >> is
> >> waiting in ExecAsyncWaitForNode(foreign_scan_on_b).
> >> 2. The event wait list already has foreign scan on a that is on a different
> >> subtree.
> >> 3. This foreign scan a happens to be ready, so in
> >> ExecAsyncWaitForNode (), ExecDispatchNode(foreign_scan_a) is called,
> >> which returns with result_ready.
> >> 4. Since it returns result_ready, its parent node is now inserted in the
> >> callbacks array, and so its parent (Append) is executed.
> >> 5. But, this Append planstate is already in the middle of executing Hash
> >> join, and is waiting for HashJoin.
> >
> > Ah, yeah, something like that could happen.  I've spent much of this
> > week working on a new design for this feature which I think will avoid
> > this problem.  It doesn't work yet - in fact I can't even really test
> > it yet.  But I'll post what I've got by the end of the day today so
> > that anyone who is interested can look at it and critique.
> 
> Well, I promised to post this, so here it is.  It's not really working
> all that well at this point, and it's definitely not doing anything
> that interesting, but you can see the outline of what I have in mind.
> Since Kyotaro Horiguchi found that my previous design had a
> system-wide performance impact due to the ExecProcNode changes, I
> decided to take a different approach here: I created an async
> infrastructure where both the requestor and the requestee have to be
> specifically modified to support parallelism, and then modified Append
> and ForeignScan to cooperate using the new interface.  Hopefully that
> means that anything other than those two nodes will suffer no
> performance impact.  Of course, it might have other problems

The previous framework didn't need to distinguish async-capable from
incapable nodes from the parent node's view; that is what the changes
in ExecProcNode were required for. Instead, this new one removes the
ExecProcNode changes by distinguishing the two kinds of node in
async-aware parents, that is, Append. This no longer involves
async-unaware nodes in the tuple bubbling-up mechanism, so the
reentrancy problem doesn't seem to occur.

On the other hand, consider for example the following plan, regardless
of its practicality (there should be better examples):

(async-unaware node)
  -> NestLoop
       -> Append
            -> n * ForeignScan
       -> Append
            -> n * ForeignScan

If the NestLoop and Appends are async-aware, all of the ForeignScans
can run asynchronously with the previous framework: the topmost
NestLoop is awakened when the firing of any ForeignScan makes a tuple
bubble up to it. This works because of the
no-need-to-distinguish-aware-or-not nature provided by the
ExecProcNode changes.

On the other hand, with the new one, in order to do the same thing,
ExecAppend has in turn to behave differently depending on whether its
parent is async-aware or not (a sketch follows below). Doing that
looks bothersome, and I am not confident about it.
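
To make that asymmetry concrete, here is a rough sketch of what an
async-aware ExecAppend might look like. is_async_capable() and
ExecAsyncWaitForResult() are hypothetical placeholders, not part of
the actual interface in the posted patch:

static TupleTableSlot *
ExecAppendSketch(AppendState *node)
{
	for (;;)
	{
		PlanState  *subnode = node->appendplans[node->as_whichplan];
		TupleTableSlot *slot;

		if (is_async_capable(subnode))
			slot = ExecAsyncWaitForResult(node, subnode);	/* event loop */
		else
			slot = ExecProcNode(subnode);	/* classic synchronous pull */

		if (!TupIsNull(slot))
			return slot;

		/* this subplan is exhausted; advance, or report end of Append */
		if (++node->as_whichplan >= node->as_nplans)
			return NULL;		/* real code would return an empty slot */
	}
}

The point is the branch at the top: the parent has to know which kind
of child it is driving, which is exactly the distinction the previous
framework hid inside ExecProcNode.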

I will examine this further intensively, especially for performance
degradation and obstacles to completing this.


> Some notes:
> 
> - EvalPlanQual rechecks are broken.
> - EXPLAIN ANALYZE instrumentation is broken.
> - ExecReScanAppend is broken, because the async stuff needs some way
> of canceling an async request and I didn't invent anything like that
> yet.
> - The postgres_fdw changes pretend to be async but aren't actually.
> It's just a demo of (part of) the interface at this point.
> - The postgres_fdw changes also report all pg-fdw paths as
> async-capable, but actually the direct-modify ones aren't, so the
> regression tests fail.
> - Errors in the executor can leak the WaitEventSet.  Probably we need
> to modify ResourceOwners to be able to own WaitEventSets.
> - There are probably other bugs, too.
> 
> Whee!
> 
> Note that I've tried to solve the re-entrancy problems by (1) putting
> all of the event loop's state inside the EState rather than in local
> variables and (2) having the function that is called to report arrival
> of a result be thoroughly different than the function that is used to
> return a tuple to a synchronous caller.
> 
> Comments welcome, if you're feeling brave enough to look at anything
> this half-baked.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-10-03 Thread Christoph Berg
Re: Fabien COELHO 2016-10-03 
> 
> Attached patch does what is described in the title, hopefully. Continuations
> in other pgbench backslash-commands should be dealt with elsewhere...

Would (a similar version of) that patch also apply to .psqlrc? I
"\set" a bunch of lengthy SQL commands in there, e.g.

\set config 'SELECT name, current_setting(name), CASE source WHEN 
$$configuration file$$ THEN sourcefile||$$:$$||sourceline ELSE source END FROM 
pg_settings WHERE source <> $$default$$;'

Being able to split that over several lines would greatly improve
maintainability. (Though I do realize this would also require a notation
for splitting/continuing strings.)

Christoph




Re: [HACKERS] Tuplesort merge pre-reading

2016-10-03 Thread Heikki Linnakangas

On 09/30/2016 04:08 PM, Peter Geoghegan wrote:

On Thu, Sep 29, 2016 at 4:10 PM, Heikki Linnakangas  wrote:

Bah, I fumbled the initSlabAllocator() function, attached is a fixed
version.


This looks much better. It's definitely getting close. Thanks for
being considerate of my more marginal concerns. More feedback:


Fixed most of the things you pointed out, thanks.


* Minor issues with initSlabAllocator():

You call the new function initSlabAllocator() as follows:


+	if (state->tuples)
+		initSlabAllocator(state, numInputTapes + 1);
+	else
+		initSlabAllocator(state, 0);


Isn't the number of slots (the second argument to initSlabAllocator())
actually just numInputTapes when we're "state->tuples"? And so,
shouldn't the "+ 1" bit happen within initSlabAllocator() itself? It
can just inspect "state->tuples" itself. In short, maybe push a bit
more into initSlabAllocator(). Making the arguments match those passed
to initTapeBuffers() a bit later would be nice, perhaps.


The comment above that explains the "+ 1". init_slab_allocator allocates 
the number of slots that was requested, and the caller is responsible 
for deciding how many slots are needed. Yeah, we could remove the 
argument and move the logic altogether into init_slab_allocator(), but I 
think it's clearer this way. Matter of taste, I guess.
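
For readers following along, here is a sketch of the shape being
discussed. All names and sizes are illustrative, not the committed
code; the point is only that the slab is one allocation carved into
equal slots threaded onto a freelist, and that the caller picks the
slot count:

#include <stdlib.h>

#define SLAB_SLOT_SIZE 1024		/* assumed fixed buffer size */

typedef union SlabSlot
{
	union SlabSlot *nextfree;	/* while free: freelist link */
	char		buffer[SLAB_SLOT_SIZE];	/* while in use: tuple storage */
} SlabSlot;

typedef struct
{
	char	   *slabMemory;		/* the single backing allocation */
	SlabSlot   *slabFreeHead;	/* head of the freelist */
} SlabState;

static void
init_slab_allocator(SlabState *state, int numSlots)
{
	state->slabMemory = NULL;
	state->slabFreeHead = NULL;

	if (numSlots > 0)
	{
		/* error handling omitted in this sketch */
		state->slabMemory = malloc((size_t) numSlots * SLAB_SLOT_SIZE);

		/* thread every slot onto the freelist */
		for (int i = 0; i < numSlots; i++)
		{
			SlabSlot   *slot = (SlabSlot *)
				(state->slabMemory + (size_t) i * SLAB_SLOT_SIZE);

			slot->nextfree = state->slabFreeHead;
			state->slabFreeHead = slot;
		}
	}
}

The caller would then ask for numInputTapes + 1 slots when sorting
tuples (one read buffer per input tape, plus one slot for the tuple
most recently handed to the caller), and 0 otherwise, matching the
snippet quoted above.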



* This could be simpler, I think:


+	/* Release the read buffers on all the other tapes, by rewinding them. */
+	for (tapenum = 0; tapenum < state->maxTapes; tapenum++)
+	{
+		if (tapenum == state->result_tape)
+			continue;
+		LogicalTapeRewind(state->tapeset, tapenum, true);
+	}


Can't you just use state->tapeRange, and remove the "continue"? I
recommend referring to "now-exhausted input tapes" here, too.


Don't think so. result_tape == tapeRange only when the merge was done in 
a single pass (or you're otherwise lucky).



* I'm not completely prepared to give up on using
MemoryContextAllocHuge() within logtape.c just yet. Maybe you're right
that it couldn't possibly matter that we impose a MaxAllocSize cap
within logtape.c (per tape), but I have slight reservations that I
need to address. Maybe a better way of putting it would be that I have
some reservations about possible regressions at the very high end,
with very large workMem. Any thoughts on that?


Meh, I can't imagine that using more than 1 GB for a read-ahead buffer 
could make any difference in practice. If you have a very large 
work_mem, you'll surely get away with a single merge pass, and 
fragmentation won't become an issue. And 1GB should be more than enough 
to trigger OS read-ahead.


Committed with some final kibitzing. Thanks for the review!

PS. This patch didn't fix bug #14344, the premature reuse of memory with 
tuplesort_gettupleslot. We'll still need to come up with 1. a 
backportable fix for that, and 2. perhaps a different fix for master.


- Heikki




Re: [HACKERS] WIP: About CMake v2

2016-10-03 Thread Stas Kelvich
> On 17 Sep 2016, at 20:21, Yury Zhuravlev  wrote:
> 
> Michael Paquier wrote:
>> On Sat, Sep 17, 2016 at 1:40 AM, Yury Zhuravlev
>>  wrote:
>>> Michael Paquier wrote:
>>> I merged master to my branch and I spent time to porting all changes. I hope
>>> send patch in the weekend without terrible flaws.
>> 
>> By the way, I noticed that you did not register this patch in the current 
>> CF..
> 
> Now I have published the first version of the patch. This patch is not
> ready for commit yet; all the current tasks you can read here:
> https://github.com/stalkerg/postgres_cmake/issues
> 
> I hope we are really close to the end.
> In this patch I forbade in-source builds to avoid overwriting the
> current Makefiles. We will be able to remove all Makefiles only after
> we are confident in CMake; you don't need to support two systems.
> During commitfests the CMake build system will be supported by me.
> I need help with the buildfarm because my knowledge of Perl is very bad
> (I have thought about rewriting the buildfarm in Python).
> 
> I hope for your support.


Tried to generate an Xcode project out of cmake; the build fails on
genbki.pl: can't locate Catalog.pm (which lives in
src/backend/catalog/Catalog.pm).

-- 
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company





[HACKERS] pgbench - allow backslash continuations in \set expressions

2016-10-03 Thread Fabien COELHO


Attached patch does what is described in the title, hopefully. 
Continuations in other pgbench backslash-commands should be dealt with 
elsewhere...


Also attached is a small test script.

While looking at the code, I noticed that newline is \n. Maybe it should 
be (\r|\n|\r\n) to allow for MacOS & Windows. I have not changed that for 
now.
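
For reference, the equivalent treatment in plain C terms, a
hypothetical helper rather than anything in the patch, would be:

/*
 * Given a pointer to a backslash, return how many characters the line
 * continuation spans, or 0 if it is not a continuation.  Accepts \n
 * (Unix), \r\n (Windows), and bare \r (old MacOS).
 */
static int
continuation_length(const char *s)
{
	if (s[0] != '\\')
		return 0;
	if (s[1] == '\n')
		return 2;				/* backslash + LF */
	if (s[1] == '\r')
		return (s[2] == '\n') ? 3 : 2;	/* backslash + CRLF or CR */
	return 0;
}

In the flex rules of the attached patch, that would simply be a wider
newline pattern, along the lines of (\r\n|\r|\n), rather than a helper
function.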


--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 285608d..6f068cd 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -818,6 +818,8 @@ pgbench  options  dbname
  
   Sets variable varname to a value calculated
   from expression.
+  The expression may span several lines with backslash-newline
+  continuations.
   The expression may contain integer constants such as 5432,
   double constants such as 3.14159,
   references to variables :variablename,
@@ -832,7 +834,8 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % \
+   (10 * :scale) + 1
 
 

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 20891a3..97e7a79 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -65,6 +65,8 @@ alnum			[a-zA-Z0-9_]
 space			[ \t\r\f\v]
 nonspace		[^ \t\r\f\v\n]
 newline			[\n]
+/* continuations in pgbench expressions */
+continuation	\\{newline}
 
 /* Exclusive states */
 %x EXPR
@@ -138,6 +140,8 @@ newline			[\n]
 	return FUNCTION;
 }
 
+{continuation}	{ /* ignore */ }
+
 {newline}		{
 	/* report end of command */
 	last_was_newline = true;





Re: [HACKERS] Backporting PostgresNode.pm

2016-10-03 Thread Michael Paquier
On Mon, Oct 3, 2016 at 4:51 PM, Craig Ringer  wrote:
> Tom mentioned in "9.6 TAP tests and extensions" [1] that
>
> "[I]n the past we've often regretted it when we failed
> to back-patch TAP infrastructure fixes all the way back to 9.4."
>
> [...]
>
> If folks think this is a good idea I can do the compat wrapper for the
> existing tests and submit patches. I've got src/test/recovery running
> happily for 9.5 already.

+1 from here.
-- 
Michael




Re: [HACKERS] pg_hba_file_settings view patch

2016-10-03 Thread Vitaly Burovoy
On 10/2/16, Michael Paquier  wrote:
> On Mon, Oct 3, 2016 at 3:25 PM, Vitaly Burovoy 
> wrote:
>> I guess for ability to use filtering like:
>>
>> SELECT * FROM pg_hba_rules WHERE options->>'radiusserver' LIKE
>> '%.example.com';
>>
>> I think it would be harder if options is an array of strings...
>
> With unnest() and a matching pattern, not that hard but..

I'm not a user of that feature, and I don't know what pg_hba files look
like in really big companies...

But for me filtering is more complicated than just a single comparison.
What about more complex filtering --- several radiusservers and user(s):

WHERE
options->>'radiusserver' = ANY(ARRAY['a.example.com', 'g.example.com'])
AND
options->>'radiusidentifier' = ANY(ARRAY['ID_a', 'ID_b', 'ID_c',
'ID_d', 'ID_e'])  -- or even a subquery

Again, I don't know whether it will be widely used, but in the case of
multiple param_name->param_value settings (column "options") I'd like
to see a native key-value store rather than an array of strings
(according to POLA).

I guess you're expecting the "key=value" format as the options are
written in the pg_hba file (and described in the doc), but sometimes
they are parsed and the output differs from the exact pg_hba records
(for instance, look at the "ldapurl" parameter).

-- 
Best regards,
Vitaly Burovoy




[HACKERS] Backporting PostgresNode.pm

2016-10-03 Thread Craig Ringer
Hi all

Tom mentioned in "9.6 TAP tests and extensions" [1] that

"[I]n the past we've often regretted it when we failed
to back-patch TAP infrastructure fixes all the way back to 9.4."

So maybe we should do just that? Backpatch to add PostgresNode.pm,
RecursiveCopy.pm, and the changes to TestLib.pm, as well as
src/test/recovery/* ?

TAP tests aren't run or enabled by default anyway, so it'd only affect
sites that --enable-tap-tests and run them.

The backport to 9.5 pretty much consists of "cp", with a trivial
change of wal_level = 'replica' to wal_level = 'hot_standby' in
PostgresNode.pm. We also have to delete the
multiple-synchronous-standbys tests from t/007_sync_rep.pl.

The existing TAP tests, which in 9.5 are:

./src/bin/initdb/t
./src/bin/pg_basebackup/t
./src/bin/pg_config/t
./src/bin/pg_controldata/t
./src/bin/pg_ctl/t
./src/bin/pg_rewind/t
./src/bin/pgbench/t
./src/bin/scripts/t
./src/test/ssl/t

all need some updates to use PostgresNode.pm, since
configure_hba_for_replication(), start_test_server(),
stop_test_server() etc. have gone away. But the conversion looks
pretty simple, especially with the help of a compat wrapper in
TestLib.pm.

If folks think this is a good idea I can do the compat wrapper for the
existing tests and submit patches. I've got src/test/recovery running
happily for 9.5 already.


[1] https://www.postgresql.org/message-id/6710.1473775...@sss.pgh.pa.us

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Tracking wait event for latches

2016-10-03 Thread Michael Paquier
On Mon, Oct 3, 2016 at 12:35 PM, Thomas Munro
 wrote:
> Hmm.  I like the use of pgstat in that name.  It helps with the
> confusion created by the overloading of the term 'wait event' in the
> pg_stat_activity view and the WaitEventSet API, by putting it into a
> different pseudo-namespace.
>
> + uint32 wait_event;
>
> How about a typedef for that instead of using raw uint32 everywhere?
> Something like pgstat_wait_descriptor?  Then a variable name like
> pgstat_desc, since this is most definitely not just a wait_event
> anymore.

We cannot do that because of latch.h, which now only includes
 and it would be a bad idea to add more dependencies to
PG-specific headers.

> + /* Define event to wait for */
>
> It's not defining the event to wait for at all, it's building a
> description for pgstat.
>
> + wait_event = pgstat_make_wait_event(WAIT_EXTENSION,
> + WAIT_EVENT_EXTENSION);
>
> It's not making a wait event, it's combining a class and an event.
> How about something like this:
>
> pgstat_desc = pgstat_make_wait_descriptor(WAIT_CLASS_EXTENSION,
> WAIT_EVENT_EXTENSION)?

Maybe I sound stupid here, but I am trying to keep the name of this
macro short, so I'd go for pgstat_make_wait_desc().
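
The packing itself can stay trivial. One possible shape, with the bit
split being an assumption for illustration rather than what the patch
actually does (the getter macros are likewise hypothetical):

/*
 * Hypothetical layout: wait class in the high 8 bits, event in the
 * low 24.  A single uint32 can then cross the latch/pgstat boundary
 * without latch.h needing any PG-specific includes.
 */
#define pgstat_make_wait_desc(classId, eventId) \
	((((uint32) (classId)) << 24) | (((uint32) (eventId)) & 0x00FFFFFFu))
#define pgstat_get_wait_class(desc)	((uint32) (desc) >> 24)
#define pgstat_get_wait_event(desc)	((uint32) (desc) & 0x00FFFFFFu)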

>   /* Sleep until there's something to do */
>   wc = WaitLatchOrSocket(MyLatch,
> WL_LATCH_SET | WL_SOCKET_READABLE,
> PQsocket(conn),
> -   -1L);
> +   -1L,
> +   wait_event);
>
> ... then use 'pgstat_desc' here.

For this one I agree, your naming is better. It is good to let callers
of WaitLatch know where this is actually used. I have also added
comments on top of WaitLatch & friends to mention what pgstat_desc
does; that's useful for the user.
-- 
Michael
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 8ca1c1c..ffa72b8 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -17,6 +17,7 @@
 #include "access/xact.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "storage/latch.h"
 #include "utils/hsearch.h"
 #include "utils/memutils.h"
@@ -491,12 +492,18 @@ pgfdw_get_result(PGconn *conn, const char *query)
 		while (PQisBusy(conn))
 		{
 			int			wc;
+			uint32		pgstat_desc;
+
+			/* Define event to wait for */
+			pgstat_desc = pgstat_make_wait_desc(WAIT_EXTENSION,
+												WAIT_EVENT_EXTENSION);
 
 			/* Sleep until there's something to do */
 			wc = WaitLatchOrSocket(MyLatch,
 								   WL_LATCH_SET | WL_SOCKET_READABLE,
 								   PQsocket(conn),
-								   -1L);
+								   -1L,
+								   pgstat_desc);
 			ResetLatch(MyLatch);
 
 			CHECK_FOR_INTERRUPTS();
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index f400785..bb975c1 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -679,6 +679,42 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   buffer in question.
  
 
+
+ 
+  Activity: The server process is idle.  This is used by
+  system processes waiting for activity in their main processing loop.
+  wait_event will identify the specific wait point.
+ 
+
+
+ 
+  Extension: The server process is waiting for activity
+  in an extension module.  This category is useful for modules to
+  track custom waiting points.
+ 
+
+
+ 
+  Client: The server process is waiting for some activity
+  on a socket from user applications, and that the server expects
+  something to happen that is independent from its internal processes.
+  wait_event will identify the specific wait point.
+ 
+
+
+ 
+  IPC: The server process is waiting for some activity
+  from another process in the server.  wait_event will
+  identify the specific wait point.
+ 
+
+
+ 
+  Timeout: The server process is waiting for a timeout
+  to expire.  wait_event will identify the specific wait
+  point.
+ 
+

   
  
@@ -1085,6 +1121,143 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  BufferPin
  Waiting to acquire a pin on a buffer.
 
+
+ Activity
+ ArchiverMain
+ Waiting in main loop of the archiver process.
+
+
+ AutoVacuumMain
+ Waiting in main loop of autovacuum launcher process.
+
+
+ BgWriterHibernate
+ Waiting in background writer process, hibernating.
+
+
+ BgWriterMain
+ Waiting in main loop of background writer process background worker.
+
+
+ CheckpointerMain
+ Waiting in main loop of checkpointer process.
+
+
+ PgStatMain
+ Waiting in main loop of the st