Re: [HACKERS] asynchronous execution

2017-06-28 Thread Amit Langote
Hi,

On 2017/06/29 13:45, Kyotaro HORIGUCHI wrote:
> Thank you for looking this.
> 
> At Wed, 28 Jun 2017 10:23:54 +0200, Antonin Houska wrote:
>> Kyotaro HORIGUCHI  wrote:
>>
>>> The patch got conflicted. This is a new version just rebased to
>>> the current master. Furtuer amendment will be taken later.
>>
>> Can you please explain this part of make_append() ?
>>
>> /* Currently async on partitioned tables is not available */
>> Assert(nasyncplans == 0 || partitioned_rels == NIL);
>>
>> I don't think the output of Append plan is supposed to be ordered even if the
>> underlying relation is partitioned. Besides ordering, is there any other
>> reason not to use the asynchronous execution?
> 
> It was just a developmental sentinel that will remind me later to
> consider the declarative partitions since I didn't have an idea
> of the differences (or the similarity) between appendrels and
> partitioned_rels. It is never to say the condition cannot
> make. I'll check it out and will support partitioned_rels sooner.
> Sorry for having left it as it is.

When making an Append for a partitioned table, among the arguments passed
to make_append(), 'partitioned_rels' is a list of RT indexes of
partitioned tables in the inheritance tree of which the aforementioned
partitioned table is the root.  'appendplans' is a list of subplans for
scanning the leaf partitions in the tree.  Note that the 'appendplans'
list contains no members corresponding to the partitioned tables, because
we don't need to scan them (only leaf relations contain any data).

The point of having the 'partitioned_rels' list in the resulting Append
plan is so that the executor can identify those relations and take the
appropriate locks on them.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-06-28 Thread Kyotaro HORIGUCHI
Thank you for looking this.

At Wed, 28 Jun 2017 10:23:54 +0200, Antonin Houska  wrote in 
<4579.1498638234@localhost>
> Kyotaro HORIGUCHI  wrote:
> 
> > The patch got conflicted. This is a new version just rebased to
> > the current master. Furtuer amendment will be taken later.
> 
> Can you please explain this part of make_append() ?
> 
> /* Currently async on partitioned tables is not available */
> Assert(nasyncplans == 0 || partitioned_rels == NIL);
> 
> I don't think the output of Append plan is supposed to be ordered even if the
> underlying relation is partitioned. Besides ordering, is there any other
> reason not to use the asynchronous execution?

It was just a developmental sentinel that will remind me later to
consider the declarative partitions since I didn't have an idea
of the differences (or the similarity) between appendrels and
partitioned_rels. It is never to say the condition cannot
make. I'll check it out and will support partitioned_rels sooner.
Sorry for having left it as it is.

> And even if there was some, the planner should ensure that executor does not
> fire the assertion statement above. The script attached shows an example how
> to cause the assertion failure.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Craig Ringer
On 29 June 2017 at 12:23, Craig Ringer  wrote:

> It does. But I don't see anywhere that extra round trips have been discussed.

Ah, right, they're implied by having the server respond with some
downversion message and ignore input until the client sends a new
startup message. That'll only happen when a too-new client connects to
an older server, but that's probably not an especially unusual case.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Craig Ringer
On 29 June 2017 at 10:27, Tom Lane  wrote:
> Craig Ringer  writes:
>> On 29 June 2017 at 03:01, Robert Haas  wrote:
>>> It wouldn't be
>>> so bad if unrecognized parameters were just ignored; the client would
>>> know from the ServerProtocolVersion (or ParameterStatus) message that
>>> server had ignored those options and could respond as it saw fit.
>
>> Yeah. In retrospect it's a pity the key/value pairs don't come with
>> some kind of optional/required flag so the client can say "error if
>> you don't understand this, it's important" or "I don't care if you
>> don't understand this, and I'll notice when you fail to GUC_REPORT it
>> back to me". Or some convention like underscore-prefixing for
>> optional.
>
> Yeah.  Back in the day I helped design the PNG image format, and one
> of the better ideas in it was to make a distinction between critical and
> noncritical chunks within a PNG file; that was exactly the idea you're
> getting at here.  I agree with the suggestion to drive this off a
> parameter-name-based convention.  Having leading underscore indicate
> a noncritical parameter sounds fine.
>
> I don't really like any of the ideas that have been mentioned that would
> introduce extra network round trips into session startup.  It's expensive
> enough already.  The only thing we seem to be really hurting on is the
> ability for the client to add extensions to the original StartupMessage,
> and this seems like it can fix that.

It does. But I don't see anywhere that extra round trips have been discussed.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] "SELECT *" vs hidden columns and logical column order

2017-06-28 Thread Thomas Munro
Hi hackers,

I am aware of at three potential projects that would change the
meaning of "SELECT *":

1.  Incremental MATERIALIZED VIEW maintenance probably needs to be
able to use a hidden counter column which you can ask for by name but
will otherwise not show up to users:

https://www.postgresql.org/message-id/1371480075.55528.yahoomail...@web162901.mail.bf1.yahoo.com

2.  SQL:2011 temporal tables track system time and/or valid time with
columns that users create and then declare to be temporal control
columns.  I don't think they show up unless you name them directly (I
didn't check the standard but I noticed that it's that way in another
product), so I guess that's basically the same as (1).

3.  Logical column order aka ALTER COLUMN POSITION, a recurring topic
on pgsql-hackers for which patches have been written but nothing has
so far managed to stick:

https://www.postgresql.org/message-id/flat/20141209174146.GP1768%40alvh.no-ip.org

Suppose someone wanted to chip away at a small piece of incremental
matviews by inventing a way to declare 'hidden' columns: is there
really a dependency here as implied in the 2013 email above?  Is
anyone planning to revive logical column order?

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [GSOC][weekly report 4] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-06-28 Thread Mengxing Liu
In the last week:

I added a tpcb benchmark and refactored the test code. It likes a framework 
now. We can add other benchmarks easily if necessary. 
https://github.com/liumx10/pg-bench
Analyzed the code acquiring SerializableFinishedListLock and provided a new 
proposal to improve it. My proposal is as below. 
As I reported in previous emails, lots of backend were blocked by 
SerializableFinishedListLock in heavy contention. 
There are only three functions acquiring this lock:

SummarizeOldestCommittedSxact: After transforming the conflict information into 
SLRU format, we need to release this SerailizableXact object. The lock protects 
the release operation and removing it from the finished list.
ReleasePredicateLocks: After releasing the predicate locks of the current 
transaction, if the transaction is rolled back, we need to release the 
SerializableXact object; if the transaction is committed, we should insert it 
to the finished list. The lock protects the release operation or insert 
operation.
ClearOldPredicateLocks: Look through the finished transaction list to find if 
any transaction object can be released. Thus the lock protects looking through 
the list, releasing transaction objects, and removing objects from the list.
As we can see, the lock protects two things: 1) the finished transaction list  
2) releasing serializable transaction objects. 
Using a single global lock to protect all will cause a lot of contentions. 
So I decoupled the lock's duty into two parts: protecting the finished 
transaction list and 
protecting releasing a single serializable transaction objects.


The SerializableFinishedListLock is reserved to protect the finished 
transaction list. 
Thus the function SummarizeOldestCommittedSxact and ReleasePredicateLocks are 
not changed.
For function ClearOldPredicateLocks, I scan the list and pick up the 
transactions which could be released first,
but not release these objects directly. After releasing the 
SerializableFinishedListLock, invoking
 ReleaseOneSerializableXact to release them.


At first, I want to use a partition lock or spinlock in each SerailizableXact 
object to protect ReleaseOneSerializableXact.
But I found it is not necessary to add new locks. in 
ReleaseOneSerializableXact, firstly, it released all predicate locks, 
which is protected by SerializablePredicateLockListLock; Then, it released all 
conflicts, which is protected by SerializableXactHashLock.
So I didn't change the function ReleaseOneSerializableXact. 


I have implemented this idea. But unfortunately, it didn't improve the 
performance or reduce the contention on SerializableFinishedListLock. 
I'll try to find out why in the next days.
But I'm really looking forward to your advice for my proposal. We can't be too 
careful to modify the code related to locks. 


--

Sincerely


Mengxing Liu








Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Tom Lane
Craig Ringer  writes:
> On 29 June 2017 at 03:01, Robert Haas  wrote:
>> It wouldn't be
>> so bad if unrecognized parameters were just ignored; the client would
>> know from the ServerProtocolVersion (or ParameterStatus) message that
>> server had ignored those options and could respond as it saw fit.

> Yeah. In retrospect it's a pity the key/value pairs don't come with
> some kind of optional/required flag so the client can say "error if
> you don't understand this, it's important" or "I don't care if you
> don't understand this, and I'll notice when you fail to GUC_REPORT it
> back to me". Or some convention like underscore-prefixing for
> optional.

Yeah.  Back in the day I helped design the PNG image format, and one
of the better ideas in it was to make a distinction between critical and
noncritical chunks within a PNG file; that was exactly the idea you're
getting at here.  I agree with the suggestion to drive this off a
parameter-name-based convention.  Having leading underscore indicate
a noncritical parameter sounds fine.

I don't really like any of the ideas that have been mentioned that would
introduce extra network round trips into session startup.  It's expensive
enough already.  The only thing we seem to be really hurting on is the
ability for the client to add extensions to the original StartupMessage,
and this seems like it can fix that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPDATE of partition key

2017-06-28 Thread Amit Langote
Hi Amit,

On 2017/06/28 20:43, Amit Khandekar wrote:
> In attached patch v12

The patch no longer applies and fails to compile after the following
commit was made yesterday:

commit 501ed02cf6f4f60c3357775eb07578aebc912d3a
Author: Andrew Gierth 
Date:   Wed Jun 28 18:55:03 2017 +0100

Fix transition tables for partition/inheritance.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken hint bits (freeze)

2017-06-28 Thread Bruce Momjian
On Fri, Jun 23, 2017 at 06:17:47PM +0300, Sergey Burladyan wrote:
> PS:
> I successfully upgraded last night from 9.2 to 9.4 and find other issue :-)
> 
> It is about hash index and promote:
> 1. create master
> 2. create standby from it
> 3. create unlogged table and hash index like:
>  create unlogged table test (id int primary key, v text);
>  create index on test using hash (id);
> 3. stop master
> 4. promote standby
> 
> now, if you try to upgrade this new promoted master pg_upgrade will stop
> on this hash index:
> error while creating link for relation "public.test_id_idx" 
> ("s/9.2/base/16384/16393" to "m/9.4/base/16422/16393"): No such file or 
> directory
> Failure, exiting
> 
> I touch this file (s/9.2/base/16384/16393) and rerun pg_upgrade from
> scratch and it complete successfully.

Sergey, can you please test if the table "test" is not unlogged, does
pg_upgrade still fail on the hash index file?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Craig Ringer
On 29 June 2017 at 09:44, Craig Ringer  wrote:

> I
> can't personally think of much right away that wouldn't work pretty
> well in a follow-on message.

Actually, I take that back, there's one thing that's bugged me for a
while that wouldn't work well this way: determining the correct text
encoding with which to interpret the database name, user name, and any
string GUC values. Right now we take a wild stab in the dark and use
the server encoding.

e.g. in a terminal with LC_ALL=en_US.UTF-8 I run

postgres=# CREATE ROLE "café";
CREATE ROLE
postgres=# CREATE DATABASE "café" with owner "café";
CREATE DATABASE
postgres=# \du ca*
List of roles
 Role name |  Attributes  | Member of
---+--+---
 café  | Cannot login | {}

postgres=# \l ca*
List of databases
 Name | Owner | Encoding |   Collate   |Ctype| Access privileges
--+---+--+-+-+---
 café | café  | UTF8 | en_AU.UTF-8 | en_AU.UTF-8 |
(1 row)


then in a terminal with LC_ALL=en_US.ISO-8859-1 I run:

[craig@ayaki-localdomain log]$ psql -U "café"
psql: FATAL:  role "café" does not exist
[craig@ayaki-localdomain log]$ psql
psql (9.6.3, server 9.5.7)
Type "help" for help.

craig=> \du ca*
List of roles
 Role name |  Attributes  | Member of
---+--+---
 café  | Cannot login | {}

craig=> \l ca*
List of databases
 Name | Owner | Encoding |   Collate   |Ctype| Access privileges
--+---+--+-+-+---
 café | café  | UTF8 | en_AU.UTF-8 | en_AU.UTF-8 |
(1 row)

craig=> \c café
FATAL:  database "café" does not exist
Previous connection kept


Um, say what?

This happens because psql does server=>client encoding conversion once
connected, but there's no way to convert the startup message with the
username.

That IMO is an argument to allow startup message format change. (It'd
also then let the server reply with correctly-encoded pre-auth
messages for errors, something we currently fail to do).

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What does it mean by XLOG_BACKUP_RECORD?

2017-06-28 Thread Michael Paquier
On Thu, Jun 29, 2017 at 10:28 AM, Masahiko Sawada  wrote:
> While reading source codes I found the following comment in xlog.c.
>
> /*
>  * Have we passed our safe starting point? Note that minRecoveryPoint is
>  * known to be incorrectly set if ControlFile->backupEndRequired, until
>  * the XLOG_BACKUP_RECORD arrives to advise us of the correct
>  * minRecoveryPoint. All we know prior to that is that we're not
>  * consistent yet.
>  */
> if (!reachedConsistency && !ControlFile->backupEndRequired &&
> minRecoveryPoint <= lastReplayedEndRecPtr &&
> XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
>
> What does XLOG_BACKUP_RECORED means by? I could not find such XLOG info value.
> Does it mean XLOG_BACKUP_END?

This comment is a thinko, it refers to XLOG_BACKUP_END. This comment
block could be reworded a bit, it looks cleaner to me to say
"ControlFile->backupEndRequired is false" instead of just referring to
the variable itself. Worse, the current comment implies that
minRecoveryPoint is incorrectly set if it is true. Bleh.
-- 
Michael


xlog-comment-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Craig Ringer
On 29 June 2017 at 03:01, Robert Haas  wrote:

> One problem with that is that it means that the format of the
> StartupMessage itself can never change, which I think is not a good
> choice.

The startup message could be immediately followed by another
supplemental message, though.

- Startup["Protocol 3.2"]
- Extra Info Message

A server that only knows protocol version 3.1 would read the 3.1
message and reply with a NegotiateProtocolVersion or whatever, then
*discard all messages until it sees a new startup message*, so it
wouldn't be upset. Quite like what we do when reading until Sync after
an error.

Existing servers would bounce the connection when they saw the 3.2
field and the client would have to reconnect with 3.0. But we could
possibly even backpatch the nicer behaviour, since it's simple and
will only ever come into effect when sent a higher protocol version
msg that no existing client generates.

 If we ever landed up wanting to greatly revise the startup message
down the track, we could land up sending a stub startup message with
protocol 3.4 or whatever that just tells aware servers "real startup
message follows". Bit of a waste, but not that bad given how expensive
connections are. A dummy startup message is quite small.

I don't have a strong opinion here, just raising the possibility that
not being able to vary the startup message format may not be the end
of the world.

>  If, for example, we want to add a new option to the startup
> message (aside from the existing user, database, options, and
> replication keywords), we really can't do that today.

Right. An obvious example would be to put a starttls-like request in
it, asking the server to initiate TLS negotiation if supported.

Or an option asking the server to determine the DB to connect to based
on some property of the user.

Similarly, I've repeatedly wanted ways to specify client support for /
requests for optional protocol messages, like sending the xid and
commit record lsn of a commit along with the commit message.

Both those could be done in immediate follow-up messages, though. I
can't personally think of much right away that wouldn't work pretty
well in a follow-on message. There's a small one-off protocol size
overhead with doing it that way, but no extra latency, so who cares?

> It wouldn't be
> so bad if unrecognized parameters were just ignored; the client would
> know from the ServerProtocolVersion (or ParameterStatus) message that
> server had ignored those options and could respond as it saw fit.

Yeah. In retrospect it's a pity the key/value pairs don't come with
some kind of optional/required flag so the client can say "error if
you don't understand this, it's important" or "I don't care if you
don't understand this, and I'll notice when you fail to GUC_REPORT it
back to me". Or some convention like underscore-prefixing for
optional.

>> It's possible that we should do something that's not based on just a
>> linear protocol version number, but instead involves some kind of
>> bitmask of capabilities, say.  So the ServerProtocolVersion message
>> maybe needs to be a bit more complicated, and if the client does get
>> one back, maybe it should forward a ClientProtocolVersion message with
>> its own bitmask.  But these things could be designed in detail later.
>
> Right.  So for example we could decide that any parameter names that
> are passed in the startup packet that begin with an underscore are not
> GUCs but some kind of protocol extension; the server replies with some
> message containing a list of the ones which were not understood.

There's a *lot* of value to that idea when you consider proxies like
PgPool-II and PgBouncer that may struggle to satisfy all capabilities
of some protocol 3.3 but want to offer some key parts of it.

There's a price in terms of complexity but probably not a big one.
Some places that would be "client protocol version >= 3.3" would
instead be "client has_capability(x)". Possibly cleaner to read even
if we _did_ use a linear protocol version, frankly.

Also, importantly in my opinion, this would let clients turn things
on/off easily. I badly want this myself, as I really want the server
to be able to reply to each COMMIT with an extended CommandComplete
message or extra-info message after it, containing the commit record
LSN. Similarly, I really want to be able to send the xid of an xact on
the wire when one is assigned to an xact. Many clients won't need,
want or understand these things and whether they're protocol 3.0 or
3.9 they can just not ask for them.

Capabilities will make startup messages bigger. Personally I don't
care much about that, as on modern networks it's all about latency not
message size. We'd use abbreviated capability names I expect. If the
list gets too big we could always roll up capabilities that have
become universally adopted into the next protocol version bump so it's
assumed that a client announcing proto 3.3 supports feature x 

[HACKERS] What does it mean by XLOG_BACKUP_RECORD?

2017-06-28 Thread Masahiko Sawada
Hi,

While reading source codes I found the following comment in xlog.c.

/*
 * Have we passed our safe starting point? Note that minRecoveryPoint is
 * known to be incorrectly set if ControlFile->backupEndRequired, until
 * the XLOG_BACKUP_RECORD arrives to advise us of the correct
 * minRecoveryPoint. All we know prior to that is that we're not
 * consistent yet.
 */
if (!reachedConsistency && !ControlFile->backupEndRequired &&
minRecoveryPoint <= lastReplayedEndRecPtr &&
XLogRecPtrIsInvalid(ControlFile->backupStartPoint))

What does XLOG_BACKUP_RECORED means by? I could not find such XLOG info value.
Does it mean XLOG_BACKUP_END?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken hint bits (freeze)

2017-06-28 Thread Bruce Momjian
On Sat, Jun 24, 2017 at 09:24:21AM +0530, Amit Kapila wrote:
> > I was not clear.  I was not saying there can be only one extra WAL file.
> > I am saying the "Latest checkpoint location" should be one WAL file
> > farther on the master.  I think the big problem is that we need a full
> > replay of that WAL file, not just having it one less than the master.
> >
> 
> If the user has properly shutdown, then that last file should only
> have checkpoint record, is it safe to proceed with upgrade without
> actually copying that file?

Yes, but how do we know they processed all the records in the 
second-to-last WAL file (in WAL shipping mode).

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken hint bits (freeze)

2017-06-28 Thread Bruce Momjian
On Sat, Jun 24, 2017 at 09:19:10AM +0530, Amit Kapila wrote:
> > I successfully upgraded last night from 9.2 to 9.4 and find other issue :-)
> >
> > It is about hash index and promote:
> > 1. create master
> > 2. create standby from it
> > 3. create unlogged table and hash index like:
> >  create unlogged table test (id int primary key, v text);
> >  create index on test using hash (id);
> > 3. stop master
> > 4. promote standby
> >
> > now, if you try to upgrade this new promoted master pg_upgrade will stop
> > on this hash index:
> > error while creating link for relation "public.test_id_idx" 
> > ("s/9.2/base/16384/16393" to "m/9.4/base/16422/16393"): No such file or 
> > directory
> > Failure, exiting
> >
> 
> I am not sure if this is a problem because in the version you are
> trying hash indexes are not WAL-logged and the creation of same will
> not be replicated on standby, so the error seems to be expected.

Well, it certainly should not error out like this.  I have not seen such
a failure report before.

I think the fundamental problem is that unlogged objects
(pg_class.relpersistence='u') creates a file on the master, but doesn't
create anything on the standby since it is never transmitted over the
WAL (assuming the object is created after the base backup).

I assume the standby creates them as empty when it is promoted to
primary and someone tries to access the object.  I wonder if I need to
add a boolean to each object to record if it is unlogged, and allow
copy/link to silently fail in such cases.  Does that make sense?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Ants Aasma
On Wed, Jun 28, 2017 at 8:31 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>>> However, it's certainly arguable that this is too much change for an
>>> optional post-beta patch.
>
>> Yea, I think there's a valid case to be made for that. I'm still
>> inclined to go along with this, it seems we're otherwise just adding
>> complexity we're going to remove in a bit again.
>
> I'm not hearing anyone speaking against doing this now, so I'm going
> to go ahead with it.

+1 for going for it. Besides pg_ctl it would also help cluster
management software. In Patroni we basically reimplement pg_ctl to get
at the started postmaster pid to detect a crash during postmaster
startup earlier and to be able to reliably cancel start. Getting the
current state from the pid file sounds better than having a loop poke
the server with pg_isready.

To introduce feature creep, I would like to see more detailed states
than proposed here. Specifically I'm interested in knowing when
PM_WAIT_BACKENDS ends.

When we lose quorum we restart PostgreSQL as a standby. We use a
regular fast shutdown, but that can take a long time due to the
shutdown checkpoint. The leader lease may run out during this time so
we would have to escalate to immediate shutdown or have a watchdog
fence the node. If we knew that no user backends are left we can let
the shutdown checkpoint complete at leisure without risk for split
brain.

Regards,
Ants Aasma


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Race conditions with WAL sender PID lookups

2017-06-28 Thread Michael Paquier
On Thu, Jun 29, 2017 at 7:52 AM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas  wrote:
>> I take that you are referring to the two lookups in
>> WalSndWaitForWal(), one in exec_replication_command(), one in
>> WalSndLoop() and one in WalSndDone(). First let's remember that all
>> the fields protected by the spinlock are only updated in a WAL sender
>> process, so reading them directly is safe in this context: we know
>> that no other processes will write them. And all those functions are
>> called only in the context of a WAL sender process. So the copies of
>> MyWalSnd that you are referring to here don't need a spinlock. Is
>> there something I am missing?
>
> I assume you mean "reading them unlocked" when you write "reading them
> directly".

Both expressions sound the same to me. So yes I meant that.

> If so, I agree with that rationale; I verified that it
> applies to members state, sentPtr, write, sent, flush, writeLag,
> sentLag, flushLag.  I wonder if there's a maintainability danger: as
> soon as we add a write to those members in a process other than the
> walsender itself, we have a bug.  I don't yet have an opinion on the
> severity of that problem.
>
> Other struct members such as needreload are written by other processes,
> so they should be always accessed with mutex held.  Regarding pid, it
> seems easiest if we use the rule that it must also be always accessed
> with spinlock held.  I propose updating the comment on WalSnd to this:
>
> /*
>  * Each walsender has a WalSnd struct in shared memory.
>  *
>  * This struct is protected by 'mutex', with two exceptions; one is
>  * sync_standby_priority as noted below.  The other exception is that some
>  * members are only written by the walsender process itself, and thus that
>  * process is free to read those members without holding spinlock.  pid and
>  * needreload always require the spinlock to be held for all accesses.
>  */

Agreed, a comment seems appropriate to me. That's in line with what
Horiguchi-san has mentioned upthread.

> I think I'm done with the walsender half of this patch; I still need to
> review the walreceiver part.  I will report back tomorrow 19:00 CLT.

Thanks!

> Currently, I'm considering not to backpatch any of this.

Considering how crazy the conditions to make the information fetched
by users inconsistent are met, I agree with that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] proposal: Support Unicode host variable in ECPG

2017-06-28 Thread Jing Wang
Hello,

This is a new requirement comes from some customers hoping the ECPG can
support Unicode host variable which can compatible with the same feature in
Oracle Pro*C.

https://docs.oracle.com/database/121/LNPCC/pc_05adv.htm#LNPCC3273

By using this Unicode data type, user can define Unicode variables in ECPG
application in windows platform, save the content of the Unicode host
variables into the database as database character set or get the content
from the database into the Unicode host variables.


Requirements

1. support utext keyword in ECPG

The utext is used to define the Unicode host variable in ECPG application
in windows platform.

2. support  UVARCHAR keyword in ECPG

The UVARCHAR is used to define the Unicode vary length host variable in
ECPG application in windows platform.

3. Saving the content of the Unicode variables into the database as
database character set or getting the content from the database into the
Unicode variables.

4. Firstly can only consider the UTF8 as database character set and UTF16
as the Unicode format for host variable. A datatype convert will be done
between the UTF8 and UTF16 by ECPG.

5. Since Unicode has big-endian and little-endian format, a environment
variable is used to identify them and do the endianness convert accordingly.


Usage

int main() {
EXEC SQL BEGIN DECLARE SECTION;
utext employee[20] ;/* define Unicode host variable  */
UVARCHAR address[50] ;  /* defin a vary length Unicode host
variable  */
EXEC SQL END DECLARE SECTION;

...

EXEC SQL CREATE TABLE emp (ename char(20), address varchar(50));

/* UTF8 is the database character set  */
EXEC SQL INSERT INTO emp (ename) VALUES ('Mike', '1 sydney, NSW') ;

/* Database character set converted to Unicode */
EXEC SQL SELECT ename INTO :employee FROM emp ;

/* Database character set converted to Unicode */
EXEC SQL SELECT address INTO :address FROM emp ;

wprintf(L"employee name is %s\n",employee);

wprintf(L"employee address is %s\n", address.attr);

DELETE * FROM emp;

/* Unicode converted to Database character */
EXEC SQL INSERT INTO emp (ename,address) VALUES (:employee, :address);

EXEC SQL DROP TABLE emp;
EXEC SQL DISCONNECT ALL;
 }

--
Regards,
Jing Wang
Fujitsu Australia


Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Thomas Munro
On Thu, Jun 29, 2017 at 11:04 AM, Andres Freund  wrote:
>> diff --git a/configure.in b/configure.in
>> index 11eb9c8acfc..47452bbac43 100644
>> --- a/configure.in
>> +++ b/configure.in
>> @@ -1429,7 +1429,7 @@ PGAC_FUNC_WCSTOMBS_L
>>  LIBS_including_readline="$LIBS"
>>  LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
>>
>> -AC_CHECK_FUNCS([cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred 
>> getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np readlink 
>> setproctitle setsid shm_open symlink sync_file_range towlower utime utimes 
>> wcstombs wcstombs_l])
>> +AC_CHECK_FUNCS([cbrt clock_gettime dlopen fallocate fdatasync getifaddrs 
>> getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np 
>> readlink setproctitle setsid shm_open symlink sync_file_range towlower utime 
>> utimes wcstombs wcstombs_l])
>
> Why are we going for fallocate rather than posix_fallocate()? There's
> plenty use cases that can only be done with the former, but this doesn't
> seem like one of them?
>
> Currently this is a linux specific path - but I don't actually see any
> reason to keep it that way? It seems far from unlikely that this is just
> an issue with linux...

POSIX says that posix_fallocate() only works on files.  It's an
implementation detail that shm_open() returns an fd for a tmpfs file
on Linux.  posix_fallocate() on an fd returned by shm_open() fails
under FreeBSD, and I suspect on other Unices too (but haven't tried),
because there is no requirement for POSIX shm_open() objects to be so
file-ish.  My take is that this is a Linux-specific problem requiring
a Linux-specific solution.Considering the above and the advice on
LKML from a senior kernel hacker[1] to use fallocate() to solve this
problem, I figured that was the right way to go.  But yeah, it also
seems to work with posix_fallocate() *on Linux*.

>>
>>  #ifdef USE_DSM_POSIX
>>  /*
>> + * Set the size of a virtual memory region associate with a file descriptor.
>> + * On Linux, also ensure that virtual memory is actually allocated by the
>> + * operating system to avoid nasty surprises later.
>> + *
>> + * Returns non-zero if either truncation or allocation fails, and sets 
>> errno.
>> + */
>> +static int
>> +dsm_impl_posix_resize(int fd, off_t size)
>> +{
>> + int rc;
>> +
>> + /* Truncate (or extend) the file to the requested size. */
>> + rc = ftruncate(fd, size);
>> +
>> +#ifdef HAVE_FALLOCATE
>> +#ifdef __linux__
>> + /*
>> +  * On Linux, a shm_open fd is backed by a tmpfs file.  After 
>> resizing
>> +  * with ftruncate it may contain a hole.  Accessing memory 
>> backed by a
>> +  * hole causes tmpfs to allocate pages, which fails with 
>> SIGBUS if
>> +  * there is no virtual memory available.  So we ask tmpfs to 
>> allocate
>> +  * pages here, so we can fail gracefully with ENOSPC now 
>> rather than
>> +  * risking SIGBUS later.
>> +  */
>> + if (rc == 0)
>> + {
>> + do
>> + {
>> + rc = fallocate(fd, 0, 0, size);
>> + } while (rc == -1 && errno == EINTR);
>> + if (rc != 0 && errno == ENOSYS)
>> + {
>> + /* Kernel too old (< 2.6.23). */
>> + rc = 0;
>> + }
>> + }
>> +#endif
>> +#endif
>
> I'd rather fall-back to just write() initializing the buffer, and do
> either of those in all implementations rather than just linux.

You can't write() to a shm_open() fd on FreeBSD.  I can't find wording
in POSIX either way but it's not portable, and it's not too surprising
if you consider the spirit of POSIX shm_open() to be something like
"you might want to implement this as a special kind of file, or not"
but with the intention being to mmap it, not read/write it.

Note that I'm specifically addressing the freaky Linux SIGBUS problem
here: not virtual memory overcommit, which may or may not happen on
your particular OS leading to failures independent of this, and for
example does happen with FreeBSD.  The way to avoid that on FreeBSD is
probably to disable overcommit (I haven't checked).  The Linux SIGBUS
problem is not regular overcommit, it's tmpfs file holes and is a
different failure mode and AFAIK can bite you even if you disable
overcommit on Linux like TFM tells you to.

>> diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
>> index 7a05c7e5b85..dcb9a846c7b 100644
>> --- a/src/include/pg_config.h.in
>> +++ b/src/include/pg_config.h.in
>> @@ -167,6 +167,9 @@
>>  /* Define to 1 if you have the  header file. */
>>  #undef HAVE_EDITLINE_READLINE_H
>>
>> +/* Define to 1 if you have the `fallocate' function. */
>> +#undef HAVE_FALLOCATE
>> +
>>  /* Define to 1 if you have the `fdatasync' 

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Andres Freund
On 2017-06-28 19:07:50 +1200, Thomas Munro wrote:
> I think this line is saying that it won't restart automatically:
> 
> https://github.com/torvalds/linux/blob/590dce2d4934fb909b112cd80c80486362337744/mm/shmem.c#L2884

Indeed.


> So I think we either need to mask signals with or put in an explicit
> retry loop, as shown in the attached version of the patch.  With the
> v3 patch I posted earlier, I see interrupted system call failures in
> the select_parallel regression test, but with the v4 it passes.
> Thoughts?

I think a retry loop is a lot better than blocking signals.


> > Unfounded speculation: fallocate() might actually *improve*
> > performance of DSM segments if your access pattern involves random
> > access (just to pick an example out of the air, something like...
> > building a hash table), since it's surely easier to allocate a big
> > contiguous chunk than a squillion random pages most of which divide an
> > existing hole into two smaller holes...
> 
> Bleugh... I retract this, of course we initialise the hash table in
> order anyway so this doesn't make any sense.

It's still faster to create larger mappings at once, rather than through
subsequent page faults...


> diff --git a/configure.in b/configure.in
> index 11eb9c8acfc..47452bbac43 100644
> --- a/configure.in
> +++ b/configure.in
> @@ -1429,7 +1429,7 @@ PGAC_FUNC_WCSTOMBS_L
>  LIBS_including_readline="$LIBS"
>  LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
>  
> -AC_CHECK_FUNCS([cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred 
> getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np readlink 
> setproctitle setsid shm_open symlink sync_file_range towlower utime utimes 
> wcstombs wcstombs_l])
> +AC_CHECK_FUNCS([cbrt clock_gettime dlopen fallocate fdatasync getifaddrs 
> getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np 
> readlink setproctitle setsid shm_open symlink sync_file_range towlower utime 
> utimes wcstombs wcstombs_l])

Why are we going for fallocate rather than posix_fallocate()? There's
plenty use cases that can only be done with the former, but this doesn't
seem like one of them?

Currently this is a linux specific path - but I don't actually see any
reason to keep it that way? It seems far from unlikely that this is just
an issue with linux...

>
>  #ifdef USE_DSM_POSIX
>  /*
> + * Set the size of a virtual memory region associate with a file descriptor.
> + * On Linux, also ensure that virtual memory is actually allocated by the
> + * operating system to avoid nasty surprises later.
> + *
> + * Returns non-zero if either truncation or allocation fails, and sets errno.
> + */
> +static int
> +dsm_impl_posix_resize(int fd, off_t size)
> +{
> + int rc;
> +
> + /* Truncate (or extend) the file to the requested size. */
> + rc = ftruncate(fd, size);
> +
> +#ifdef HAVE_FALLOCATE
> +#ifdef __linux__
> + /*
> +  * On Linux, a shm_open fd is backed by a tmpfs file.  After 
> resizing
> +  * with ftruncate it may contain a hole.  Accessing memory 
> backed by a
> +  * hole causes tmpfs to allocate pages, which fails with SIGBUS 
> if
> +  * there is no virtual memory available.  So we ask tmpfs to 
> allocate
> +  * pages here, so we can fail gracefully with ENOSPC now rather 
> than
> +  * risking SIGBUS later.
> +  */
> + if (rc == 0)
> + {
> + do
> + {
> + rc = fallocate(fd, 0, 0, size);
> + } while (rc == -1 && errno == EINTR);
> + if (rc != 0 && errno == ENOSYS)
> + {
> + /* Kernel too old (< 2.6.23). */
> + rc = 0;
> + }
> + }
> +#endif
> +#endif

I'd rather fall-back to just write() initializing the buffer, and do
either of those in all implementations rather than just linux.

> diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
> index 7a05c7e5b85..dcb9a846c7b 100644
> --- a/src/include/pg_config.h.in
> +++ b/src/include/pg_config.h.in
> @@ -167,6 +167,9 @@
>  /* Define to 1 if you have the  header file. */
>  #undef HAVE_EDITLINE_READLINE_H
>  
> +/* Define to 1 if you have the `fallocate' function. */
> +#undef HAVE_FALLOCATE
> +
>  /* Define to 1 if you have the `fdatasync' function. */
>  #undef HAVE_FDATASYNC

Needs pg_config.h.win32 adjustment...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Race conditions with WAL sender PID lookups

2017-06-28 Thread Alvaro Herrera
Michael Paquier wrote:

> On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas  wrote:
> > I think if you're going to fix it so that we take spinlocks on
> > MyWalSnd in a bunch of places that we didn't take them before, it
> > would make sense to fix all the places where we're accessing those
> > fields without a spinlock instead of only some of them, unless there
> > are good reasons why we only need it in some cases and not others, in
> > which case I think the patch should add comments to each place where
> > the lock was not taken explaining why it's OK.  It didn't take me and
> > my copy of vi very long to find instances that you didn't change.
> 
> I take that you are referring to the two lookups in
> WalSndWaitForWal(), one in exec_replication_command(), one in
> WalSndLoop() and one in WalSndDone(). First let's remember that all
> the fields protected by the spinlock are only updated in a WAL sender
> process, so reading them directly is safe in this context: we know
> that no other processes will write them. And all those functions are
> called only in the context of a WAL sender process. So the copies of
> MyWalSnd that you are referring to here don't need a spinlock. Is
> there something I am missing?

I assume you mean "reading them unlocked" when you write "reading them
directly".  If so, I agree with that rationale; I verified that it
applies to members state, sentPtr, write, sent, flush, writeLag,
sentLag, flushLag.  I wonder if there's a maintainability danger: as
soon as we add a write to those members in a process other than the
walsender itself, we have a bug.  I don't yet have an opinion on the
severity of that problem.

Other struct members such as needreload are written by other processes,
so they should be always accessed with mutex held.  Regarding pid, it
seems easiest if we use the rule that it must also be always accessed
with spinlock held.  I propose updating the comment on WalSnd to this:

/*
 * Each walsender has a WalSnd struct in shared memory.
 *
 * This struct is protected by 'mutex', with two exceptions; one is
 * sync_standby_priority as noted below.  The other exception is that some
 * members are only written by the walsender process itself, and thus that
 * process is free to read those members without holding spinlock.  pid and
 * needreload always require the spinlock to be held for all accesses.
 */


> Actually, perhaps it would make sense to add some Assert(am_walsender)
> in this code?

I don't think that's needed, since MyWalSnd will only be valid in a
walsender.

I think I'm done with the walsender half of this patch; I still need to
review the walreceiver part.  I will report back tomorrow 19:00 CLT.

Currently, I'm considering not to backpatch any of this.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?

2017-06-28 Thread Michael Paquier
On Wed, Jun 28, 2017 at 6:13 AM, Andres Freund  wrote:
> You seem to completely argue besides my point that the replication path
> is *more* robust by now?  And there's plenty scenarios where a faster
> startup is quite crucial for performance. The difference between an
> immediate shutdown + recovery without checkpoint to a fast shutdown can
> be very large, and that matters a lot for faster postgres updates etc.

If you go that way, it seems safer to me if users had some control
with a switch, defaulting to the previous behavior. And a complete
switch to the newer behavior could be done later on depending on what
has been found.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] transition table behavior with inheritance appears broken

2017-06-28 Thread Thomas Munro
On Thu, Jun 29, 2017 at 6:21 AM, Andrew Gierth
 wrote:
> Commits pushed.

Great news.  Thanks for stepping up to get this committed.  Thanks a
lot also to Marko, Amit L, Kevin, Robert, Noah and Peter G for the
problem reports, reviews and issue chasing.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> So when I removed the miscadmin.h include, I found out that pg_ctl is
>> also relying on PG_BACKEND_VERSIONSTR from that file.
>> 
>> There are at least three things we could do here:
>> 
>> 1. Give this up as not worth this much trouble.
>> 
>> 2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
>> other version-related macros.

> pg_config.h sounds like a decent enough solution.  It's a bit strange
> this hasn't come up before, given that that symbol is used more in
> frontend environ than backend.

Right at the moment, what I've done is to stick it into port.h beside
the declaration of find_other_exec, since the existing uses are all
as parameters of find_other_exec[_or_die].  But maybe that's a bit too
expedient.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Alvaro Herrera
Tom Lane wrote:
> Andres Freund  writes:

> So when I removed the miscadmin.h include, I found out that pg_ctl is
> also relying on PG_BACKEND_VERSIONSTR from that file.
> 
> There are at least three things we could do here:
> 
> 1. Give this up as not worth this much trouble.
> 
> 2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
> other version-related macros.

pg_config.h sounds like a decent enough solution.  It's a bit strange
this hasn't come up before, given that that symbol is used more in
frontend environ than backend.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Robert Haas
On Wed, Jun 28, 2017 at 1:47 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Here's my proposal:
>
>> - If the server receives a StartupMessage for v3.x where x > the
>> version it knows, instead of just slamming the connection shut, it
>> responds by sending some new message (let's say,
>> NegotiateProtocolVersion) specifying the highest protocol version it
>> supports.
>
>> - The client may then continue by sending a new StartupMessage with a
>> version with a version number that is within range.
>
> How about a slightly simpler design: we specify that all 3.x protocol
> versions shall be mutually compatible so long as each side knows not
> to send something that the other side doesn't understand.  The field
> in the StartupMessage is understood to be the max protocol version the
> client knows how to cope with.  If the client sends something > 3.0,
> then the server responds with a ServerProtocolVersion message containing
> the max protocol version it understands.  Now both sides know what the
> other side can do and should be able to adapt their behavior accordingly.
> (If the client doesn't get a ServerProtocolVersion message, it should
> assume server protocol 3.0.)

One problem with that is that it means that the format of the
StartupMessage itself can never change, which I think is not a good
choice.  If, for example, we want to add a new option to the startup
message (aside from the existing user, database, options, and
replication keywords), we really can't do that today.  It wouldn't be
so bad if unrecognized parameters were just ignored; the client would
know from the ServerProtocolVersion (or ParameterStatus) message that
server had ignored those options and could respond as it saw fit.  But
because they are treated as GUCs the server will error out.   This is
clearly not an entirely theoretical concern considering that
'replication' was added as a keyword here within recent memory.

> In order to make the world safe for this, we'd have to adjust existing
> releases to not complain about 3.x for x > 0, and the sooner we do that
> the better chance of clients being able to make use of this within a
> reasonable timeframe.
>
> It's possible that we should do something that's not based on just a
> linear protocol version number, but instead involves some kind of
> bitmask of capabilities, say.  So the ServerProtocolVersion message
> maybe needs to be a bit more complicated, and if the client does get
> one back, maybe it should forward a ClientProtocolVersion message with
> its own bitmask.  But these things could be designed in detail later.

Right.  So for example we could decide that any parameter names that
are passed in the startup packet that begin with an underscore are not
GUCs but some kind of protocol extension; the server replies with some
message containing a list of the ones which were not understood.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-28 13:31:27 -0400, Tom Lane wrote:
>> While looking this over again, I got worried about the fact that pg_ctl
>> is #including "miscadmin.h".  That's a pretty low-level backend header
>> and it wouldn't be surprising at all if somebody tried to put stuff in
>> it that wouldn't compile frontend-side.  I think we should take the
>> opportunity, as long as we're touching this stuff, to split the #defines
>> that describe the contents of postmaster.pid into a separate header file.
>> Maybe "utils/pidfile.h" ?

> Yes, that sounds like a valid concern, and solution.

So when I removed the miscadmin.h include, I found out that pg_ctl is
also relying on PG_BACKEND_VERSIONSTR from that file.

There are at least three things we could do here:

1. Give this up as not worth this much trouble.

2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
other version-related macros.

3. Give PG_BACKEND_VERSIONSTR its very own new header file.

Any preferences?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-06-28 Thread Antonin Houska
Kyotaro HORIGUCHI  wrote:

> The patch got conflicted. This is a new version just rebased to
> the current master. Furtuer amendment will be taken later.

Just one idea that I had while reading the code.

In ExecAsyncEventLoop you iterate estate->es_pending_async, then move the
complete requests to the end and finaly adjust estate->es_num_pending_async so
that the array no longer contains the complete requests. I think the point is
that then you can add new requests to the end of the array.

I wonder if a set (Bitmapset) of incomplete requests would make the code more
efficient. The set would contain position of each incomplete request in
estate->es_num_pending_async (I think it's the myindex field of
PendingAsyncRequest). If ExecAsyncEventLoop used this set to retrieve the
requests subject to ExecAsyncNotify etc, then the compaction of
estate->es_pending_async wouldn't be necessary.

ExecAsyncRequest would use the set to look for space for new requests by
iterating it and trying to find the first gap (which corresponds to completed
request).

And finally, item would be removed from the set at the moment the request
state is being set to ASYNCREQ_COMPLETE.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] transition table behavior with inheritance appears broken

2017-06-28 Thread Andrew Gierth
Commits pushed.

Unless I broke the buildfarm again (which I'll check up on later), or
some new issue arises with the fixes, this should close all 3 related
items for transition tables.

-- 
Andrew.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Tom Lane
Robert Haas  writes:
> Here's my proposal:

> - If the server receives a StartupMessage for v3.x where x > the
> version it knows, instead of just slamming the connection shut, it
> responds by sending some new message (let's say,
> NegotiateProtocolVersion) specifying the highest protocol version it
> supports.

> - The client may then continue by sending a new StartupMessage with a
> version with a version number that is within range.

How about a slightly simpler design: we specify that all 3.x protocol
versions shall be mutually compatible so long as each side knows not
to send something that the other side doesn't understand.  The field
in the StartupMessage is understood to be the max protocol version the
client knows how to cope with.  If the client sends something > 3.0,
then the server responds with a ServerProtocolVersion message containing
the max protocol version it understands.  Now both sides know what the
other side can do and should be able to adapt their behavior accordingly.
(If the client doesn't get a ServerProtocolVersion message, it should
assume server protocol 3.0.)

In order to make the world safe for this, we'd have to adjust existing
releases to not complain about 3.x for x > 0, and the sooner we do that
the better chance of clients being able to make use of this within a
reasonable timeframe.

It's possible that we should do something that's not based on just a
linear protocol version number, but instead involves some kind of
bitmask of capabilities, say.  So the ServerProtocolVersion message
maybe needs to be a bit more complicated, and if the client does get
one back, maybe it should forward a ClientProtocolVersion message with
its own bitmask.  But these things could be designed in detail later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPDATE of partition key

2017-06-28 Thread Amit Khandekar
On 22 June 2017 at 01:57, Robert Haas  wrote:
> On Wed, Jun 21, 2017 at 1:38 PM, Amit Khandekar  
> wrote:
 Yep, it's more appropriate to use
 ModifyTableState->rootResultRelationInfo->ri_RelationDesc somehow.  That
 is, if answer to the question I raised above is positive.
>>
>> From what I had checked earlier when coding that part,
>> rootResultRelInfo is NULL in case of inserts, unless something has
>> changed in later commits. That's the reason I decided to use the first
>> resultRelInfo.
>
> We're just going around in circles here.  Saying that you decided to
> use the first child's resultRelInfo because you didn't have a
> resultRelInfo for the parent is an explanation of why you wrote the
> code the way you did, but that doesn't make it correct.  I want to
> know why you think it's correct.

Yeah, that was just an FYI on how I decided to use the first
resultRelInfo; it was not for explaining why using first resultRelInfo
is correct. So upthread, I have tried to explain.

>
> I think it's probably wrong, because it seems to me that if the INSERT
> code needs to use the parent's ResultRelInfo rather than the first
> child's ResultRelInfo, the UPDATE code probably needs to do the same.
> Commit d3cc37f1d801a6b5cad9bf179274a8d767f1ee50 got rid of
> resultRelInfos for non-leaf partitions, and commit
> e180c8aa8caf5c55a273d4a8e6092e77ff3cff10 added the resultRelInfo back
> for the topmost parent, because otherwise it didn't work correctly.



Regarding rootResultRelInfo , it would have been good if
rootResultRelInfo was set for both insert and update, but it isn't set
for inserts.

For inserts :
In ExecInitModifyTable(), ModifyTableState->rootResultRelInfo  remains
NULL because ModifyTable->rootResultRelIndex is = -1 :
/* If modifying a partitioned table, initialize the root table info */
if (node->rootResultRelIndex >= 0)
   mtstate->rootResultRelInfo = estate->es_root_result_relations +
node->rootResultRelIndex;


ModifyTable->rootResultRelIndex = -1 because it does not get set since
ModifyTable->partitioned_rels is NULL :

/*
* If the main target relation is a partitioned table, the
* following list contains the RT indexes of partitioned child
* relations including the root, which are not included in the
* above list.  We also keep RT indexes of the roots
* separately to be identitied as such during the executor
* initialization.
*/
if (splan->partitioned_rels != NIL)
{
root->glob->nonleafResultRelations =
list_concat(root->glob->nonleafResultRelations,
list_copy(splan->partitioned_rels));
/* Remember where this root will be in the global list. */
splan->rootResultRelIndex = list_length(root->glob->rootResultRelations);
root->glob->rootResultRelations =
lappend_int(root->glob->rootResultRelations,
linitial_int(splan->partitioned_rels));
}

ModifyTable->partitioned_rels is NULL because inheritance_planner()
does not get called for INSERTs; instead, grouping_planner() gets
called :

subquery_planner()
{
/*
* Do the main planning.  If we have an inherited target relation, that
* needs special processing, else go straight to grouping_planner.
*/
if (parse->resultRelation && rt_fetch(parse->resultRelation,
parse->rtable)->inh)
   inheritance_planner(root);
else
   grouping_planner(root, false, tuple_fraction);

}

Above, inh is false in case of inserts.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Andres Freund
On 2017-06-28 13:31:27 -0400, Tom Lane wrote:
> I'm not hearing anyone speaking against doing this now, so I'm going
> to go ahead with it.

Cool.


> While looking this over again, I got worried about the fact that pg_ctl
> is #including "miscadmin.h".  That's a pretty low-level backend header
> and it wouldn't be surprising at all if somebody tried to put stuff in
> it that wouldn't compile frontend-side.  I think we should take the
> opportunity, as long as we're touching this stuff, to split the #defines
> that describe the contents of postmaster.pid into a separate header file.
> Maybe "utils/pidfile.h" ?

Yes, that sounds like a valid concern, and solution.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>> However, it's certainly arguable that this is too much change for an
>> optional post-beta patch.

> Yea, I think there's a valid case to be made for that. I'm still
> inclined to go along with this, it seems we're otherwise just adding
> complexity we're going to remove in a bit again.

I'm not hearing anyone speaking against doing this now, so I'm going
to go ahead with it.

> The 8-space thing in multiple places is a bit ugly.  How about having a
> a bunch of constants declared in one place?

While looking this over again, I got worried about the fact that pg_ctl
is #including "miscadmin.h".  That's a pretty low-level backend header
and it wouldn't be surprising at all if somebody tried to put stuff in
it that wouldn't compile frontend-side.  I think we should take the
opportunity, as long as we're touching this stuff, to split the #defines
that describe the contents of postmaster.pid into a separate header file.
Maybe "utils/pidfile.h" ?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


protocol version negotiation (Re: [HACKERS] Libpq PGRES_COPY_BOTH - version compatibility)

2017-06-28 Thread Robert Haas
On Mon, Mar 28, 2011 at 7:07 PM, Tom Lane  wrote:
> I wrote:
>> Now if we had a track record showing that we could tweak the protocol
>> version without causing problems, it'd be fine with me to do it for this
>> usage.  But we don't, and this particular case doesn't seem like the
>> place to start.
>
> And, btw, a moment's study of the protocol version checking code in
> postmaster.c shows that bumping the minor version number to 3.1 *would*
> break things: a client requesting 3.1 from a current postmaster would
> get a failure.
>
> Maybe we oughta change that logic --- it's not clear to me that there's
> any meaningful difference between major and minor numbers given the
> current postmaster behavior.

I just had a conversation at Postgres Vision that reminded me of the
old thread quoted aboce, and it seems to me that what Tom said there
is a good idea.  We've made quite a number of protocol version changes
over the years -- most recently, SCRAM introduced AuthenticationSASL
-- without ever bumping the protocol version.  In the case of the
change that this thread was originally about, namely the introduction
of CopyBoth, it could be argued that there is no real need to bump the
protocol version, because the client initiates the use of the new
protocol message and, before doing so, can know whether the server
version is new enough to support that protocol message based on the
ParameterStatus message for server_version which is sent after
authentication is complete.  Fair enough, but:

1. If we want to add a new message which is initiated by the server
rather than the client, the server currently has no way of knowing
whether the client is new enough to understand the message, because
the client never identifies itself except by submitting a protocol
version which must always be exactly 3.0 or the connection will be
killed.  SCRAM itself suffers from this problem; if an older libpq
tries to connect to a v10 server, they'll get a lousy error message
unless they've got a minor release new enough to contain
96d0f988b150aa0b52b44b8c1adbc7ef59262a1a or one of its siblings.
That's not the worst problem ever, but at the same time it's not
really a good design to rely on people upgrading the clients in order
to get the correct message about the client being too old.  It would
be much better if the client identified itself to the server.  Then if
the server sees that the client is old, it can give the client an
appropriate error, and if the client is new enough, then it can be
sure that it will be understood.

2. If we want to add a new message is initiated by the client before
authentication is complete, ParameterStatus messages won't have been
sent yet, so the client has no idea what capabilities the server might
be able to support.

3. It's a bit unclean to use the server version as a proxy for the
wire protocol version; there are non-PostgreSQL servers which speak
the PostgreSQL wire protocol, and making them impersonate a specific
version of PostgreSQL in order to clarify which wire protocol features
they support isn't really very nice.  So maybe there's a case for
bumping the minor version each time we actually change the protocol
rather than only in the cases where the client has no other method of
discovering the server version.  For example, you can imagine a
connection proxy like pgpool or pgbouncer wanting to limit the client
to a particular wire protocol version -- the highest that it supports
-- while proxying to a newer server version sitting behind it.  That
doesn't work if the server version is being abused as a sort of
back-door protocol version.

Here's my proposal:

- If the server receives a StartupMessage for v3.x where x > the
version it knows, instead of just slamming the connection shut, it
responds by sending some new message (let's say,
NegotiateProtocolVersion) specifying the highest protocol version it
supports.

- The client may then continue by sending a new StartupMessage with a
version with a version number that is within range.

If we had this, then bumping the minor protocol version wouldn't
require retrying connections, which among other problems generates
scary error messages in the logs; instead, they would succeed, but
with one additional round trip.  That seems like it would represent a
substantial improvement over what we have now, where bumping the
protocol version is such an incredibly painful thing that we never do
it, even when we've actually broken wire protocol compatibility!

Of course, it would take a few releases before server versions that
support this mechanism were widespread, and so clients using a
hypothetical version 3.1 would still have to be prepared to retry the
whole connection, but if we use that as a reason not to make a change
of this sort then it's not clear how things ever get any better than
they are now.  And it's clear that there is pent-up demand for the
ability to make minor additions to the protocol without breaking the
whole 

Re: [HACKERS] building Postgresql-9.0.10 with patching MERGE keyword

2017-06-28 Thread Tom Lane
Lelisa Diriba  writes:
> I am trying to build with Patching MERGE statement or  keyword in
> Postgresql 9.0.10 on ubuntu OS  and am getting a couple errors related to
> bootstrap libraries.

9.0 is a bit old to be of any interest ...

> In file included from bootparse.y:420:0:
> bootscanner.l: In function 'boot_yylex':
> bootscanner.l:110:10: error: 'XFORCE' undeclared (first use in this
> function)

In 9.5 and up, I see this in bootparse.y:

%token XFORCE XNOT XNULL

but previous versions have neither that declaration nor any uses of
those symbols.  I suspect you tried to back-patch a modern version
of bootscanner.l without also back-patching bootparse.y.  Or maybe
you just need to remove and rebuild bootparse.c.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Typo in comment in postgres_fdw.c

2017-06-28 Thread Albe Laurenz
Attached is a fix for a small typo I found.

Yours,
Laurenz Albe


comment.patch
Description: comment.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \set AUTOROLLBACK ON

2017-06-28 Thread Joel Jacobson
On Mon, Jun 26, 2017 at 9:35 PM, David G. Johnston
 wrote:
> We already have SET TRANSACTION READ ONLY.

But in my use-case I am OK with the query doing write operations,
since sometimes you need to test something in prod (that cannot be
tested easily locally) but you want to ROLLBACK the query as quickly
as possible to avoid locking things longer than necessary. Currently
I'm just manually appending "; rollback;" to the query to make sure
its rollbacked.

The best thing I can do today is to map some button on the keyboard to
automatically type "; rollback; [enter]" instead of hitting [enter] to
fire-off the query, but it would be nice if it was built-in psql so
you could never commit something by mistake unless you explicitly exit
the AUTOROLLBACK mode.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] doc: Fix typo

2017-06-28 Thread Zero King
---
 doc/src/sgml/installation.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index becf868..809cacb 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -120,7 +120,7 @@ su - postgres
   edit previous commands.  This is very helpful and is strongly
   recommended.  If you don't want to use it then you must specify
   the --without-readline option to
-  configure. As an alternative, you can often use the
+  configure. As an alternative, you can also use the
   BSD-licensed libedit library, originally
   developed on NetBSD. The
   libedit library is
-- 
2.13.2



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable storage

2017-06-28 Thread Amit Kapila
On Wed, Jun 28, 2017 at 7:43 AM, Haribabu Kommi
 wrote:
>
> On Wed, Jun 28, 2017 at 12:00 AM, Alexander Korotkov
>  wrote:
>>
>> On Tue, Jun 27, 2017 at 4:19 PM, Amit Kapila 
>> wrote:
>>>
>>> On Thu, Jun 22, 2017 at 5:46 PM, Alexander Korotkov
>>>  wrote:
>>>
>>> Probably, some other algorithm for vacuum.  I am not sure current
>>> vacuum with its parameters can be used so easily.  One thing that
>>> might need some thoughts is that is it sufficient to say that keep
>>> autovacuum as off and call some different API for places where the
>>> vacuum can be invoked manually like Vacuum command to the developer
>>> implementing some different strategy for vacuum or we need something
>>> more as well.
>>
>>
>> What kind of other vacuum algorithm do you expect?  It would be rather
>> easier to understand if we would have examples.
>>
>> For me, changing of vacuum algorithm is not needed for just heap page
>> format change.  Existing vacuum algorithm could just call page format API
>> functions for manipulating individual pages.
>>
>> Changing of vacuum algorithm might be needed for more invasive changes
>> than just heap page format.  However, we should first understand what these
>> changes could be and how are they consistent with rest of API design.
>
>
> Yes, I agree that we need some changes in the vacuum to handle the pluggable
> storage.
> Currently I didn't fully checked the changes that are needed in vacuum, but
> I feel the low level changes of the function are enough, and also there
> should be
> some option from storage handler to decide whether it needs a vacuum or not?
> Based on this flag, the vacuum may be skipped on those tables. So these
> handlers
> no need to register those API's.
>

Something in that direction sounds reasonable to me.  I am also not
very clear what kind of pluggability will be required for vacuum.  I
think for now we can park this problem and try to tackle tuple format
and page format related stuff.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable storage

2017-06-28 Thread Amit Kapila
On Tue, Jun 27, 2017 at 7:23 PM, Alexander Korotkov
 wrote:
> On Tue, Jun 27, 2017 at 4:08 PM, Amit Kapila 
> wrote:
>>
>> Similarly,
>> if the page format is changed you need to consider all page scan API's
>> like heapgettup_pagemode/heapgetpage.
>
>
> If page format is changed, then heapgettup_pagemode/heapgetpage should use
> appropriate API functions for manipulating page items.  I'm very afraid of
> overriding whole heapgettup_pagemode/heapgetpage and monstrous functions
> like heap_update without understanding of clear use-case.  It's definitely
> not needed for changing heap page format.
>

Yeah, we might not change them completely, there will always be some
common parts, but as of now, this API considers that we can return a
pointer to tuple on the page with just having a pin on the buffer.
This is based on the assumption that nobody can update the current
tuple, only a new tuple will be created on the update.  For some use
cases like in-place updates, we might want to do that differently.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] building Postgresql-9.0.10 with patching MERGE keyword

2017-06-28 Thread Lelisa Diriba
Hello! I am a Junior Postgresql hacker.i was start from building and using
it.please Help me to fixing this error.
I am trying to build with Patching MERGE statement or  keyword in
Postgresql 9.0.10 on ubuntu OS  and am getting a couple errors related to
bootstrap libraries. These are occuring during "make -C bootstrap all".
Here is the output:
make -C bootstrap all
make[3]: Entering directory '/home/postgres/postgres/src/backend/bootstrap'
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
-g -I. -I. -I../../../src/include -D_GNU_SOURCE   -c -o bootparse.o
bootparse.c -MMD -MP -MF .deps/bootparse.Po
In file included from bootparse.y:420:0:
bootscanner.l: In function 'boot_yylex':
bootscanner.l:110:10: error: 'XFORCE' undeclared (first use in this
function)
bootscanner.l:110:10: note: each undeclared identifier is reported only
once for each function it appears in
In file included from bootparse.y:420:0:
bootscanner.l:111:10: error: 'XNOT' undeclared (first use in this function)
  yylval.str = scanstr(yytext);
  ^
In file included from bootparse.y:420:0:
bootscanner.l:112:10: error: 'XNULL' undeclared (first use in this function)
  return(ID);
  ^
../../../src/Makefile.global:566: recipe for target 'bootparse.o' failed
make[3]: *** [bootparse.o] Error 1
common.mk:41: recipe for target 'bootstrap-recursive' failed
make[2]: *** [bootstrap-recursive] Error 2
Makefile:17: recipe for target 'all' failed
make[1]: *** [all] Error 2
GNUmakefile:12: recipe for target 'all' failed
make: *** [all] Error 2


Re: [HACKERS] [PATCH] pg_reload_backend to signal SIGHUP to a specific backend

2017-06-28 Thread Remi Colinet
Hi Yugo,

The patch looks fine. Installed and tested.

But a similar function already exists for the Postmaster process.

Datum
pg_reload_conf(PG_FUNCTION_ARGS)
{
if (kill(PostmasterPid, SIGHUP))
{
ereport(WARNING,
(errmsg("failed to send signal to
postmaster: %m")));
PG_RETURN_BOOL(false);
}

PG_RETURN_BOOL(true);
}

I would suggest to merge your patch with above function which is close to
yours.
Btw, your patch looks better that above function code.

You may for instance retain pg_reload_conf() without argument for the same
purpose as currently (Postmaster signaling).
And provide a pid argument to signal a given backend.

Regards
Remi


2017-06-28 12:17 GMT+02:00 Yugo Nagata :

> Hi,
>
> Attached is a patch of pg_reload_backend that is a function signaling
> SIGHUP to a specific backend. The original idea is from Michael Paquier[1].
> The documatation isn't included in this patch yet.
>
> We can change some parameters of other backend using the function
> as bellow.
>
> postgres=# alter system set log_min_messages to debug2;
> ALTER SYSTEM
> postgres=# select pg_reload_backend(18375);
>  pg_reload_backend
> ---
>  t
> (1 row)
>
> There are some usecases. For example:
>
> - changing log configuration in other backends temporally.
> - changing cost parameters for planner in other backends.
> - changing autovacuum parameters of an already running autovacuum worker.
>
>
> Hoever, this function need the superuser previlige to execute as well as
> pg_reload_conf(). Although we can grant the execution to public, it is
> useless for normal users bacause only superuser can change postgresql.conf.
>
> To allow normal users to change parameters in ohter backends, instead
> we might want another feature, for example, a function like set_config()
> than can change other backend's parameters using shared memory and SIGUSR1.
>
> Any comments would be appreciated.
>
> Regards,
>
> [1]
> https://www.postgresql.org/message-id/CAB7nPqT4y8-
> QoGKEugk99_tFuEOtAshcs5kxOeZ_0w27UtdGyA%40mail.gmail.com
>
> --
> Yugo Nagata 
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] UPDATE of partition key

2017-06-28 Thread Amit Khandekar
On 26 June 2017 at 08:37, Amit Khandekar  wrote:
> On 22 June 2017 at 01:41, Robert Haas  wrote:
 Second, it will amount to a functional bug if you get a
 different answer than the planner did.
>>>
>>> Actually, the per-leaf WCOs are meant to be executed on the
>>> destination partitions where the tuple is moved, while the WCOs
>>> belonging to the per-subplan resultRelInfo are meant for the
>>> resultRelinfo used for the UPDATE plans. So actually it should not
>>> matter whether they look same or different, because they are fired at
>>> different objects. Now these objects can happen to be the same
>>> relations though.
>>>
>>> But in any case, it's not clear to me how the mapped WCO and the
>>> planner's WCO would yield a different answer if they are both the same
>>> relation. I am possibly missing something. The planner has already
>>> generated the withCheckOptions for each of the resultRelInfo. And then
>>> we are using one of those to re-generate the WCO for a leaf partition
>>> by only adjusting the attnos. If there is already a WCO generated in
>>> the planner for that leaf partition (because that partition was
>>> present in mtstate->resultRelInfo), then the re-built WCO should be
>>> exactly look same as the earlier one, because they are the same
>>> relations, and so the attnos generated in them would be same since the
>>> Relation TupleDesc is the same.
>>
>> If the planner's WCOs and mapped WCOs are always the same, then I
>> think we should try to avoid generating both.  If they can be
>> different, but that's intentional and correct, then there's no
>> substantive problem with the patch but the comments need to make it
>> clear why we are generating both.
>>
>>> Actually I meant, "above works for only local updates. For
>>> row-movement-updates, we need per-leaf partition WCOs, because when
>>> the row is inserted into target partition, that partition may be not
>>> be included in the above planner resultRelInfo, so we need WCOs for
>>> all partitions". I think this said comment should be sufficient if I
>>> add this in the code ?
>>
>> Let's not get too focused on updating the comment until we are in
>> agreement about what the code ought to be doing.  I'm not clear
>> whether you accept the point that the patch needs to be changed to
>> avoid generating the same WCOs and returning lists in both the planner
>> and the executor.
>
> Yes, we can re-use the WCOs generated in the planner, as an
> optimization, since those we re-generate for the same relations will
> look exactly the same. The WCOs generated by planner (in
> inheritance_planner) are generated when (in adjust_appendrel_attrs())
> we change attnos used in the query to refer to the child RTEs and this
> adjusts the attnos of the WCOs of the child RTEs. So the WCOs of
> subplan resultRelInfo are actually the parent table WCOs, but only the
> attnos changed. And in ExecInitModifyTable() we do the same thing for
> leaf partitions, although using different function
> map_variable_attnos().

In attached patch v12,  during UPDATE tuple routing setup, for each
leaf partition, we now check if it is present already in one of the
UPDATE per-subplan resultrels. If present, we re-use them rather than
creating a new one and opening the table again.

So the mtstate->mt_partitions is now an array of ResultRelInfo
pointers. That pointer points to either the UPDATE per-subplan result
rel, or a newly allocated ResultRelInfo.

For each of the leaf partitions, we have to search through the
per-subplan resultRelInfo oids to check if there is a match. To do
this, I have created a temporary hash table which stores oids and the
ResultRelInfo pointers of mtstate->resultRelInfo array, and which can
be used to search the oid for each of the leaf partitions.

This patch version has handled only the above discussion point. I will
follow up with the other points separately.


update-partition-key_v12.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_reload_backend to signal SIGHUP to a specific backend

2017-06-28 Thread Dmitry Dolgov
> On 28 June 2017 at 12:17, Yugo Nagata  wrote:
>
> Hi,
>
> Attached is a patch of pg_reload_backend that is a function signaling
> SIGHUP to a specific backend. The original idea is from Michael
Paquier[1].
> The documatation isn't included in this patch yet.

I have few questions. I'm curious, why this function returns something
different from bool when I'm passing null as an argument?

=# select pg_reload_backend(27961);
WARNING:  PID 27961 is not a PostgreSQL server process
WARNING:  failed to send signal to backend: 27961
 pg_reload_backend
---
 f
(1 row)

=# select pg_reload_backend(27962);
 pg_reload_backend
---
 t
(1 row)

=# select pg_reload_backend(null);
 pg_reload_backend
---

(1 row)

Also for some reason I can't grant an execute permission on this function,
am I doing something wrong?

=# grant execute on function pg_reload_backend() to test_user;
ERROR:  function pg_reload_backend() does not exist
=# grant execute on function pg_reload_conf() to test_user;
GRANT


Re: [HACKERS] pg_receivewal and messages printed in non-verbose mode

2017-06-28 Thread Daniel Gustafsson
> On 14 Jun 2017, at 06:04, Michael Paquier  wrote:
> 
> On Tue, Jun 13, 2017 at 4:50 PM, Craig Ringer  wrote:
>> On 13 June 2017 at 14:33, Michael Paquier  wrote:
>>> Those come from stop_streaming in pg_receivewal.c. Shouldn't those
>>> messages only show up to the user if --verbose is used? It seems
>>> strange to me that at least the first one is written to the user as
>>> that's not an error after promoting a standby.
>> 
>> I agree. At least the first should be --verbose only.
> 
> I have been looking at all the code surrounding pg_receivewal and
> pg_recvlogical and those are indeed the two only places where we print
> a message in non-verbose mode even if those are not explicit errors.
> pg_recvlogical does not show up any messages when it is signaled or
> when it receives SIGINT or reaches the end of LSN position. I don't
> think that this is worth complicating the code for, just noticed the
> inconsistency on the way.
> 
> Perhaps a committer will care about that. Or not. For now I am just
> adding that in the CF.

+1, this patch makes --verbose behave consistently across these programs.
Since Ctrl-C’ing the program to terminate is not an error, I agree that it
should require verbose mode too.  I took a look as well and concur with your
findings.  Moving to Ready for Committer.

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] pg_reload_backend to signal SIGHUP to a specific backend

2017-06-28 Thread Yugo Nagata
Hi,

Attached is a patch of pg_reload_backend that is a function signaling
SIGHUP to a specific backend. The original idea is from Michael Paquier[1].
The documatation isn't included in this patch yet.

We can change some parameters of other backend using the function
as bellow.

postgres=# alter system set log_min_messages to debug2;
ALTER SYSTEM
postgres=# select pg_reload_backend(18375);
 pg_reload_backend 
---
 t
(1 row)

There are some usecases. For example:

- changing log configuration in other backends temporally.
- changing cost parameters for planner in other backends.
- changing autovacuum parameters of an already running autovacuum worker.


Hoever, this function need the superuser previlige to execute as well as
pg_reload_conf(). Although we can grant the execution to public, it is
useless for normal users bacause only superuser can change postgresql.conf.

To allow normal users to change parameters in ohter backends, instead
we might want another feature, for example, a function like set_config()
than can change other backend's parameters using shared memory and SIGUSR1.

Any comments would be appreciated.

Regards,

[1]
https://www.postgresql.org/message-id/CAB7nPqT4y8-QoGKEugk99_tFuEOtAshcs5kxOeZ_0w27UtdGyA%40mail.gmail.com

-- 
Yugo Nagata 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0fdad0c..3a9a020 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1128,6 +1128,7 @@ REVOKE EXECUTE ON FUNCTION pg_wal_replay_pause() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_wal_replay_resume() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_rotate_logfile() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_reload_conf() FROM public;
+REVOKE EXECUTE ON FUNCTION pg_reload_backend(int) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_current_logfile() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_current_logfile(text) FROM public;
 
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 62341b8..83b6c45 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -339,6 +339,36 @@ pg_reload_conf(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(true);
 }
 
+/*
+ * Signal to reload the database configuration for a specified backend
+ *
+ * Permission checking for this function is managed through the normal
+ * GRANT system.
+ */
+Datum
+pg_reload_backend(PG_FUNCTION_ARGS)
+{
+	int			pid = PG_GETARG_INT32(0);
+	int			r = pg_signal_backend(pid, SIGHUP);
+
+	if (r == SIGNAL_BACKEND_NOSUPERUSER)
+		ereport(WARNING,
+(errmsg("must be a superuser to reload superuser process")));
+
+	if (r == SIGNAL_BACKEND_NOPERMISSION)
+		ereport(WARNING,
+ (errmsg("must be a member of the role whose process is being reloaded or member of pg_signal_backend")));
+
+	if (r)
+	{
+		ereport(WARNING,
+(errmsg("failed to send signal to backend: %d", pid)));
+		PG_RETURN_BOOL(false);
+	}
+
+	PG_RETURN_BOOL(true);
+}
+
 
 /*
  * Rotate log file
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 1191b4a..a05e78a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3255,6 +3255,8 @@ DESCR("true if wal replay is paused");
 
 DATA(insert OID = 2621 ( pg_reload_conf			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ ));
 DESCR("reload configuration files");
+DATA(insert OID = 3409 ( pg_reload_backend			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 16 "23" _null_ _null_ _null_ _null_ _null_ pg_reload_backend _null_ _null_ _null_ ));
+DESCR("reload configuration files");
 DATA(insert OID = 2622 ( pg_rotate_logfile		PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ ));
 DESCR("rotate log file");
 DATA(insert OID = 3800 ( pg_current_logfilePGNSP PGUID 12 1 0 0 0 f f f f f f v s 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_current_logfile _null_ _null_ _null_ ));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Small comment fix in partition.c

2017-06-28 Thread Masahiko Sawada
Hi,

Attached patch for $subject.
A period is missing at the end of sentence.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


small_fix_partition.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] building Postgresql-9.0.10

2017-06-28 Thread Lelisa Diriba
Hello! I am a Junior Postgresql hacker.i was start from building and using
it.please Help me to fixing this error.
I am trying to build with Patching MERGE statement or  keyword in
Postgresql 9.0.10 on ubuntu OS  and am getting a couple errors related to
bootstrap libraries. These are occuring during "make -C bootstrap all".
Here is the output:
make -C bootstrap all
make[3]: Entering directory '/home/postgres/postgres/src/backend/bootstrap'
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
-g -I. -I. -I../../../src/include -D_GNU_SOURCE   -c -o bootparse.o
bootparse.c -MMD -MP -MF .deps/bootparse.Po
In file included from bootparse.y:420:0:
bootscanner.l: In function 'boot_yylex':
bootscanner.l:110:10: error: 'XFORCE' undeclared (first use in this
function)
bootscanner.l:110:10: note: each undeclared identifier is reported only
once for each function it appears in
In file included from bootparse.y:420:0:
bootscanner.l:111:10: error: 'XNOT' undeclared (first use in this function)
  yylval.str = scanstr(yytext);
  ^
In file included from bootparse.y:420:0:
bootscanner.l:112:10: error: 'XNULL' undeclared (first use in this function)
  return(ID);
  ^
../../../src/Makefile.global:566: recipe for target 'bootparse.o' failed
make[3]: *** [bootparse.o] Error 1
common.mk:41: recipe for target 'bootstrap-recursive' failed
make[2]: *** [bootstrap-recursive] Error 2
Makefile:17: recipe for target 'all' failed
make[1]: *** [all] Error 2
GNUmakefile:12: recipe for target 'all' failed
make: *** [all] Error 2


Re: [HACKERS] asynchronous execution

2017-06-28 Thread Antonin Houska
Kyotaro HORIGUCHI  wrote:

> The patch got conflicted. This is a new version just rebased to
> the current master. Furtuer amendment will be taken later.

Can you please explain this part of make_append() ?

/* Currently async on partitioned tables is not available */
Assert(nasyncplans == 0 || partitioned_rels == NIL);

I don't think the output of Append plan is supposed to be ordered even if the
underlying relation is partitioned. Besides ordering, is there any other
reason not to use the asynchronous execution?

And even if there was some, the planner should ensure that executor does not
fire the assertion statement above. The script attached shows an example how
to cause the assertion failure.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

OPT_HOST="-h localhost"
USER=postgres

for x in $(psql $OPT_HOST -c "SELECT datname FROM pg_database WHERE datname ~ 
'shard.*' " postgres -A -t)

do
echo dropping $x
dropdb $OPT_HOST $x
done;

createdb $OPT_HOST shard

psql $OPT_HOST -U $USER shard -c "
CREATE EXTENSION postgres_fdw;
CREATE TABLE orders(customer_id int, product_id int)
   PARTITION BY LIST (customer_id);"
createdb $OPT_HOST shard_0
psql $OPT_HOST -U $USER shard_0 -c "
CREATE TABLE orders_0 AS
SELECT trunc(2 * random())::int AS customer_id,
   trunc(100 * random())::int AS product_id
FROM generate_series(1, 5000);
ANALYZE;"
psql $OPT_HOST -U $USER shard -c "
CREATE SERVER server_0 FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'shard_0', fetch_size 
'1000',
use_remote_estimate 'true', fdw_tuple_cost '10');
CREATE USER MAPPING FOR CURRENT_USER SERVER server_0 OPTIONS 
(user '$USER');
IMPORT FOREIGN SCHEMA public FROM SERVER server_0 INTO public;
ALTER TABLE orders ATTACH PARTITION orders_0 FOR VALUES IN (0, 
1);"

createdb $OPT_HOST shard_1
psql $OPT_HOST -U $USER shard_1 -c "
CREATE TABLE orders_1 AS
SELECT trunc(2 * random())::int + 2 AS customer_id,
   trunc(100 * random())::int + 2 AS product_id
FROM generate_series(1, 5000);
ANALYZE;"
psql $OPT_HOST -U $USER shard -c "
CREATE SERVER server_1 FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'shard_1', fetch_size 
'1000',
use_remote_estimate 'true', fdw_tuple_cost '10');
CREATE USER MAPPING FOR CURRENT_USER SERVER server_1 OPTIONS 
(user '$USER');
IMPORT FOREIGN SCHEMA public FROM SERVER server_1 INTO public;
ALTER TABLE orders ATTACH PARTITION orders_1 FOR VALUES IN (2, 
3);"


psql $OPT_HOST -U $USER shard -c "SELECT * FROM orders"

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default Partition for Range

2017-06-28 Thread Beena Emerson
Hello Rahila,

On Wed, Jun 28, 2017 at 12:34 PM, Rahila Syed  wrote:
> Hi Beena,
>
> I started testing and reviewing the patch. Can you update the patch as v5
> patch does not apply cleanly on master?
>

Thanks for looking into this.

The patch is to be applied on Jeevn's partition patch
(http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg316818.html)
which can be applied over commit
a12c09ad86e60a8acb269744b8ee86429dda2cd8.


-- 

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default Partition for Range

2017-06-28 Thread Beena Emerson
On Wed, Jun 28, 2017 at 12:34 PM, Rahila Syed  wrote:
> Hi Beena,
>
> I started testing and reviewing the patch. Can you update the patch as v5
> patch does not apply cleanly on master?
>

I am currently working on Dilip's comments, I will update the patch soon.

-- 

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-06-28 Thread Pavel Stehule
2017-06-28 10:04 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
> + if (success)
>> + {
>> + char   *ntuples = PQcmdTuples(results);
>> + SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : "0");
>> + SetVariable(pset.vars, "ERROR", "FALSE");
>> + }
>> + else
>> + {
>> + SetVariable(pset.vars, "ROW_COUNT", "0");
>> + SetVariable(pset.vars, "ERROR", "TRUE");
>> + }
>> +}
>>
>> It can be simplified
>>
>> SetVariable(pset.vars, "ROW_COUNT", success ? PQcmdTuples(results) : 0);
>>
>
> According to the documentation, PQcmdTuples returns "" in some cases and
> ISTM we want "0" instead for consistency, so that it is always a number. I
> rejected calling PQcmdTuples twice:
>
>   ..., success && *PQcmdTuples(results) ? PQcmdTuples(results) : "0")
>
> Thus it makes the "if (success)" necessary for ROW_COUNT, and then it
> looked simpler to handle ERROR the same way.
>
> Now if the semantics is changed to put as row count whatever comes out of
> the function, even if not a count, then the code could indeed be simplified
> as you suggest.


Understand

Pavel


>
>
> --
> Fabien.
>


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-06-28 Thread Fabien COELHO


Hello Pavel,


+ if (success)
+ {
+ char   *ntuples = PQcmdTuples(results);
+ SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : "0");
+ SetVariable(pset.vars, "ERROR", "FALSE");
+ }
+ else
+ {
+ SetVariable(pset.vars, "ROW_COUNT", "0");
+ SetVariable(pset.vars, "ERROR", "TRUE");
+ }
+}

It can be simplified

SetVariable(pset.vars, "ROW_COUNT", success ? PQcmdTuples(results) : 0);


According to the documentation, PQcmdTuples returns "" in some cases and 
ISTM we want "0" instead for consistency, so that it is always a number. I 
rejected calling PQcmdTuples twice:


  ..., success && *PQcmdTuples(results) ? PQcmdTuples(results) : "0")

Thus it makes the "if (success)" necessary for ROW_COUNT, and then it 
looked simpler to handle ERROR the same way.


Now if the semantics is changed to put as row count whatever comes out of 
the function, even if not a count, then the code could indeed be 
simplified as you suggest.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-06-28 Thread Pavel Stehule
2017-06-28 9:25 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
> I agree that the existing "SetVariableBool" function is a misnommer, it
>>> should be "SetVariableOn" given what it does, and it is not what we need.
>>>
>>
>> switching default setting from ON to TRUE requires wider discussion -
>>
>
> Yep.
>
> in this moment I like to have special function "SetVariableON".
>>
>
> I'm fine with this, but this make it a change totally unrelated to this
> patch as it would not use the function... Moreover, this function would not
> use an hypothetical "set var bool" function because of the debatable on/off
> vs true/false change.
>
> Also, a "set var bool" function would be called only twice, which is not
> very beneficial for a oneliner, so I left it out.
>
> I agree that there is some common structure, but ISTM that the
>>> AcceptResult function is called in a variety of situation where variables
>>> are not to be set (eg "internal" queries, not user provided queries), so
>>> I
>>> thought it best to keep the two apart.
>>>
>>
>> I understand, but It is not nice, really  - maybe only switch can be moved
>> to some inlining function  like IsSuccess() - more .. with this function,
>> the SetResultVariables function will be more cleaner
>>
>
> Indeed. Attached v5 does that.


juju - something like this

+ if (success)
+ {
+ char   *ntuples = PQcmdTuples(results);
+ SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : "0");
+ SetVariable(pset.vars, "ERROR", "FALSE");
+ }
+ else
+ {
+ SetVariable(pset.vars, "ROW_COUNT", "0");
+ SetVariable(pset.vars, "ERROR", "TRUE");
+ }
+}

It can be simplified

SetVariable(pset.vars, "ROW_COUNT", success ? PQcmdTuples(results) : 0);
SetVariable(pset.vars, "ERROR", success ? "FALSE" : "TRUE");

Regards

Pavel


>
>
> --
> Fabien.


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-06-28 Thread Fabien COELHO


Hello Pavel,

I agree that the existing "SetVariableBool" function is a misnommer, it 
should be "SetVariableOn" given what it does, and it is not what we 
need.


switching default setting from ON to TRUE requires wider discussion -


Yep.


in this moment I like to have special function "SetVariableON".


I'm fine with this, but this make it a change totally unrelated to this 
patch as it would not use the function... Moreover, this function would 
not use an hypothetical "set var bool" function because of the debatable 
on/off vs true/false change.


Also, a "set var bool" function would be called only twice, which is not 
very beneficial for a oneliner, so I left it out.



I agree that there is some common structure, but ISTM that the
AcceptResult function is called in a variety of situation where variables
are not to be set (eg "internal" queries, not user provided queries), so I
thought it best to keep the two apart.


I understand, but It is not nice, really  - maybe only switch can be moved
to some inlining function  like IsSuccess() - more .. with this function,
the SetResultVariables function will be more cleaner


Indeed. Attached v5 does that.

--
Fabien.diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 9faa365..bc9a2e4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3452,6 +3452,35 @@ bar
   
 
   
+   ERROR
+   
+
+ Whether the last query failed, as a boolean.
+
+   
+  
+
+  
+   ERROR_CODE
+   
+
+ The error code associated to the last query, or
+ 0 if no error occured.
+
+   
+  
+
+  
+   ERROR_MESSAGE
+   
+
+ If the last query failed, the associated error message,
+ otherwise an empty string.
+
+   
+  
+
+  
 FETCH_COUNT
 
 
@@ -3656,6 +3685,24 @@ bar
   
 
   
+   STATUS
+   
+
+ Text status of the last query.
+
+   
+  
+
+  
+   ROW_COUNT
+   
+
+ How many rows were returned or affected by the last query.
+
+   
+  
+
+  
 QUIET
 
 
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 044cdb8..02fa89a 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -492,6 +492,48 @@ ResetCancelConn(void)
 #endif
 }
 
+/*
+ * ResultIsSuccess
+ *
+ * Tell whether query result is a success.
+ */
+static bool
+ResultIsSuccess(const PGresult *result)
+{
+	bool success;
+	if (!result)
+		success = false;
+	else
+	{
+		ExecStatusType	restat = PQresultStatus(result);
+
+		/* check all possible PGRES_ */
+		switch (restat)
+		{
+			case PGRES_EMPTY_QUERY:
+			case PGRES_TUPLES_OK:
+			case PGRES_COMMAND_OK:
+			case PGRES_COPY_OUT:
+			case PGRES_COPY_IN:
+			case PGRES_COPY_BOTH:
+			case PGRES_SINGLE_TUPLE:
+success = true;
+break;
+			case PGRES_BAD_RESPONSE:
+			case PGRES_NONFATAL_ERROR:
+			case PGRES_FATAL_ERROR:
+success = false;
+break;
+			default:
+/* dead code */
+success = false;
+psql_error("unexpected PQresultStatus: %d\n", restat);
+break;
+		}
+	}
+
+	return success;
+}
 
 /*
  * AcceptResult
@@ -504,34 +546,7 @@ ResetCancelConn(void)
 static bool
 AcceptResult(const PGresult *result)
 {
-	bool		OK;
-
-	if (!result)
-		OK = false;
-	else
-		switch (PQresultStatus(result))
-		{
-			case PGRES_COMMAND_OK:
-			case PGRES_TUPLES_OK:
-			case PGRES_EMPTY_QUERY:
-			case PGRES_COPY_IN:
-			case PGRES_COPY_OUT:
-/* Fine, do nothing */
-OK = true;
-break;
-
-			case PGRES_BAD_RESPONSE:
-			case PGRES_NONFATAL_ERROR:
-			case PGRES_FATAL_ERROR:
-OK = false;
-break;
-
-			default:
-OK = false;
-psql_error("unexpected PQresultStatus: %d\n",
-		   PQresultStatus(result));
-break;
-		}
+	bool		OK = ResultIsSuccess(result);
 
 	if (!OK)
 	{
@@ -1213,6 +1228,38 @@ PrintQueryResults(PGresult *results)
 	return success;
 }
 
+/*
+ * Set special variables for "front door" queries
+ * - STATUS: last query status
+ * - ERROR: TRUE/FALSE, whether an error occurred
+ * - ERROR_CODE: code if an error occured, or "0"
+ * - ERROR_MESSAGE: message if an error occured, or ""
+ * - ROW_COUNT: how many rows were returned or affected, or "0"
+ */
+static void
+SetResultVariables(PGresult *results)
+{
+	bool			success = ResultIsSuccess(results);
+	ExecStatusType	restat = PQresultStatus(results);
+	char 		   *code = PQresultErrorField(results, PG_DIAG_SQLSTATE);
+	char 		   *mesg = PQresultErrorField(results, PG_DIAG_MESSAGE_PRIMARY);
+
+	SetVariable(pset.vars, "STATUS", PQresStatus(restat) + strlen("PGRES_"));
+	SetVariable(pset.vars, "ERROR_CODE", code ? code : "0");
+	SetVariable(pset.vars, "ERROR_MESSAGE", mesg ? mesg : "");
+
+	if (success)
+	{
+		char   *ntuples = PQcmdTuples(results);
+		SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples 

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Thomas Munro
On Wed, Jun 28, 2017 at 5:19 PM, Thomas Munro
 wrote:
> On Wed, Aug 24, 2016 at 2:58 AM, Robert Haas  wrote:
>> Now, for bigger segment sizes, I think there actually could be a
>> little bit of a noticeable performance hit here, because it's not just
>> about total elapsed time.  Even if the code eventually touches all of
>> the memory, it might not touch it all before starting to fire up
>> workers or whatever else it wants to do with the DSM segment.  But I'm
>> thinking we still need to bite the bullet and pay the expense, because
>> crash-and-restart cycles are *really* bad.
>
> Here is a new rebased version of this patch, primarily to poke this
> thread as an unresolved question.  This patch is not committable as is
> though: I discovered that parallel query can cause fallocate to return
> with errno == EINTR.  I haven't yet investigated whether fallocate is
> supposed to be restartable, or signals should be blocked, or something
> else is wrong.  Another question is whether the call to ftruncate() is
> actually necessary before the call to fallocate().

I think this line is saying that it won't restart automatically:

https://github.com/torvalds/linux/blob/590dce2d4934fb909b112cd80c80486362337744/mm/shmem.c#L2884

Compare this patch (not in the kernel tree) that suggests that line
should be changed to cause restart:

https://lkml.org/lkml/2016/3/3/987

- error = -EINTR;
+ error = -ERESTARTSYS;

So I think we either need to mask signals with or put in an explicit
retry loop, as shown in the attached version of the patch.  With the
v3 patch I posted earlier, I see interrupted system call failures in
the select_parallel regression test, but with the v4 it passes.
Thoughts?

> Unfounded speculation: fallocate() might actually *improve*
> performance of DSM segments if your access pattern involves random
> access (just to pick an example out of the air, something like...
> building a hash table), since it's surely easier to allocate a big
> contiguous chunk than a squillion random pages most of which divide an
> existing hole into two smaller holes...

Bleugh... I retract this, of course we initialise the hash table in
order anyway so this doesn't make any sense.

-- 
Thomas Munro
http://www.enterprisedb.com


fallocate-v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default Partition for Range

2017-06-28 Thread Rahila Syed
Hi Beena,

I started testing and reviewing the patch. Can you update the patch as v5
patch does not apply cleanly on master?

Thank you,
Rahila Syed

On Wed, Jun 21, 2017 at 8:43 PM, Dilip Kumar  wrote:

> On Wed, Jun 21, 2017 at 8:08 PM, Robert Haas 
> wrote:
> > I think somebody should do some testing of the existing code with
> > valgrind.  And then apply the list-partitioning patch and this patch,
> > and do some more testing with valgrind.  It seems to be really easy to
> > miss these uninitialized access problems during code review.
>
> I think it will help,  but it may not help in all the scenarios
> because most of the places we are allocating memory with palloc0 (
> initialized with 0) but it never initialized with RANGE_DATUM_DEFAULT
> except the first element (in the case of DEFAULT partition).  And,
> later they may be considered as RANGE_DATUM_FINITE (because its value
> is 0).
>
> One solution can be that if bound is DEFAULT then initialize with
> RANGE_DATUM_DEFAULT for the complete content array for that partition
> bound instead of just first.  Otherwise, we need to be careful of
> early exiting wherever we are looping the content array of the DEFAULT
> bound.
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-06-28 Thread Pavel Stehule
2017-06-27 17:30 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
> We can introduce macro SetVariableBool(vars, varname, bool) instead
>>>
>>>  SetVariable(pset.vars, "ERROR", "FALSE");
>>>
>>
>> I checked source code, and it requires little bit more harder refactoring
>> because now we have SetVariableBool - what is unhappy name, because it
>> initialize variable to ON value. It is question what is better name?
>>
>
> The boolean values (on/off 1/0 true/false...) accepted for pg settings is
> probably convenient but also somehow fuzzy.
>
> From a programming point of view, I like booleans to have either true or
> false values, and nothing else.
>
> I agree that the existing "SetVariableBool" function is a misnommer, it
> should be "SetVariableOn" given what it does, and it is not what we need.
>

switching default setting from ON to TRUE requires wider discussion - in
this moment I like to have special function "SetVariableON".

>
> Here is a v4 which attempts to extend & reuse the function. People might
> be surprised that TRUE is used where ON was used before, so I'm not sure.
>
> I found more interesting issue - the code of  SetResultVariables is
>> partially redundant with AcceptResult - maybe the switch there can be
>> shared.
>>
>
> I agree that there is some common structure, but ISTM that the
> AcceptResult function is called in a variety of situation where variables
> are not to be set (eg "internal" queries, not user provided queries), so I
> thought it best to keep the two apart.


I understand, but It is not nice, really  - maybe only switch can be moved
to some inlining function  like IsSuccess() - more .. with this function,
the SetResultVariables function will be more cleaner

Regards

Pavel

>
>
> --
> Fabien.