Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-09 Thread Jim Nasby

On 3/8/17 9:34 AM, Andreas Karlsson wrote:

Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.


Hm, I cannot think of any real-life scenario where this will be an issue
based on my personal experience with PostgreSQL, but if you can think of
one, please provide it. I will try to ponder some more on this myself.


The case I currently have is to allow tracking database objects similar 
to (but not the same as) how we track the objects that belong to an 
extension [1]. That currently depends on event triggers to keep names 
updated if they're changed, as well as making use of the reg* types. If 
an event trigger fired as part of the index rename (essentially treating 
it like an ALTER INDEX), then I should be able to work around that.


The ultimate reason for doing this is to provide something similar to 
extensions (create a bunch of database objects that are all bound 
together), but also similar to classes in OO languages (so you can have 
multiple instances).[2]


Admittedly, this is pretty off the beaten path and I certainly wouldn't 
hold up the patch because of it. I am hoping that it'd be fairly easy to 
fire an event trigger as if someone had just renamed the index.


1: https://github.com/decibel/object_reference
2: https://github.com/decibel/pg_classy
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com




Re: [HACKERS] ANALYZE command progress checker

2017-03-09 Thread vinayak

Thank you for reviewing the patch.

The attached patch also incorporates Michael's and Amit's comments.

On 2017/03/07 15:45, Haribabu Kommi wrote:



On Tue, Mar 7, 2017 at 5:01 PM, Michael Paquier wrote:



@@ -496,7 +499,6 @@ do_analyze_rel(Relation onerel, int options,
VacuumParams *params,
numrows = (*acquirefunc) (onerel, elevel,
  rows, targrows,
  &totalrows, &totaldeadrows);
-
/*
Useless diff.


Fixed.


+ 
+   ANALYZE is currently collecting the sample rows.
+   The sample it reads is taken randomly. Its size depends on
+   the default_statistics_target parameter value.
+ 
This should use a  markup for default_statistics_target.


Fixed.



@@ -203,6 +204,8 @@ analyze_rel(Oid relid, RangeVar *relation, int
options,
if (onerel->rd_rel->relkind == RELKIND_RELATION ||
onerel->rd_rel->relkind == RELKIND_MATVIEW)
{
+   pgstat_progress_start_command(PROGRESS_COMMAND_ANALYZE,
+  RelationGetRelid(onerel));
It seems to me that the report should begin in do_analyze_rel().


Fixed.


some more comments,

+/* Report compute heap stats phase */
+pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);

The above analyze phase is updated inside a for loop; instead, just set 
it once above the loop.

Fixed.
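
(For reference, the fix amounts to hoisting the call out of the
per-attribute loop; roughly, using the names from the patch:)

/* Report the compute-heap-stats phase once, before the loop */
pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
                             PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);

for (i = 0; i < attr_cnt; i++)
{
    /* ... per-attribute statistics computation, unchanged ... */
}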


+ /* Report compute index stats phase */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS);

Irrespective of whether the index exists on the table or not, the 
above analyze phase is set. It is better to set the above phase and the 
index cleanup phase only when there are indexes on the table.


Agreed. Fixed.

+/* Report total number of heap blocks and collectinf sample row phase*/

Typo? collecting?


Fixed.


+/* Report total number of heap blocks and collectinf sample row phase*/
+initprog_val[0] = PROGRESS_ANALYZE_PHASE_COLLECT_HEAP_SAMPLE_ROWS;
+initprog_val[1] = totalblocks;
+pgstat_progress_update_multi_param(2, initprog_index, initprog_val);
The acquire_sample_rows function is also called from the
acquire_inherited_sample_rows function, so adding the phase in that
function will report wrong info.


I agree with you.


+#define PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS 2

Why is there no code added for this phase? Any specific reason?

I am thinking about how to report this phase. Do you have any suggestions?
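
One idea I am considering (just a sketch, not in the attached patch) is to
report it at the top of acquire_inherited_sample_rows(), before the loop
over the child relations:

static int
acquire_inherited_sample_rows(Relation onerel, int elevel,
                              HeapTuple *rows, int targrows,
                              double *totalrows, double *totaldeadrows)
{
    /* Advertise that we are collecting sample rows across children */
    pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
                                 PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS);

    /* ... existing logic: find the children, then sample each one ... */
}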


+/* Phases of analyze */

This can be written as follows for better understanding, and also to be
similar to vacuum:

/* Phases of analyze (as advertised via PROGRESS_ANALYZE_PHASE) */


Done.

Regards,
Vinayak Pokale
NTT Open Source Software Center


pg_stat_progress_analyze_v2.patch
Description: binary/octet-stream



Re: [HACKERS] increasing the default WAL segment size

2017-03-09 Thread tushar

On 03/10/2017 11:23 AM, Beena Emerson wrote:



Thank you for your reviews, Kuntal, Jim, and Ashutosh.

Attached is an updated 02 patch which:

 1. Calls RetrieveXLogSegSize(conn) in pg_receivewal.c
 2. Removes the warning on Windows
 3. Changes PATH_MAX in pg_waldump to MAXPGPATH

Regarding the usage of the WAL file size as the XLogSegSize, I agree 
with what Robert has said. Generally, the WAL file will be of the 
expected wal_segment_size; for it to have any other size, especially 
another valid power-of-2 value, is extremely rare, and I feel it is not 
a major cause for concern.

We (Prabhat and I) have started basic testing of this feature.
Two quick issues:

1) At the time of initdb we set "--wal-segsize 4", so all the WAL 
files should be 4 MB each, but in the postgresql.conf file it says:


#wal_keep_segments = 0  # in logfile segments, 16MB each; 0 disables


So the comment (16MB) mentioned against the parameter 
'wal_keep_segments' looks wrong; we should either remove it or modify it.


2) Getting an "Aborted (core dumped)" error at the time of running 
pg_basebackup (this issue occurs only on Linux32, not on Linux64); 
we have double-checked to confirm it.

Steps to reproduce on Linux32
=============================
fetch the sources
apply both the patches
 ./configure --with-zlib   --enable-debug  --enable-cassert 
--enable-depend --prefix=$PWD/edbpsql --with-openssl CFLAGS="-g -O0"; 
make all install

Performed initdb with switch "--wal-segsize 4"
start the server
run pg_basebackup

[centos@tushar-centos bin]$ ./pg_basebackup -v -D /tmp/myslave
*** glibc detected *** ./pg_basebackup: free(): invalid pointer: 
0x08da7f00 ***

======= Backtrace: =========
/lib/libc.so.6[0xae7e31]
/home/centos/pg10_10mar/postgresql/edbpsql/lib/libpq.so.5(PQclear+0x16d)[0x6266f5]
./pg_basebackup[0x8051441]
./pg_basebackup[0x804e7b5]
/lib/libc.so.6(__libc_start_main+0xe6)[0xa8dd26]
./pg_basebackup[0x804a231]
======= Memory map: ========
00153000-0017b000 r-xp  fc:01 1271 /lib/libk5crypto.so.3.1
0017b000-0017c000 r--p 00028000 fc:01 1271 /lib/libk5crypto.so.3.1
0017c000-0017d000 rw-p 00029000 fc:01 1271 /lib/libk5crypto.so.3.1
0017d000-0017e000 rw-p  00:00 0
0017e000-0018 r-xp  fc:01 1241 /lib/libkeyutils.so.1.3
0018-00181000 r--p 1000 fc:01 1241 /lib/libkeyutils.so.1.3
00181000-00182000 rw-p 2000 fc:01 1241 /lib/libkeyutils.so.1.3
002ad000-002b9000 r-xp  fc:01 1152 /lib/libnss_files-2.12.so
002b9000-002ba000 r--p b000 fc:01 1152 /lib/libnss_files-2.12.so
002ba000-002bb000 rw-p c000 fc:01 1152 /lib/libnss_files-2.12.so
004ad000-004b r-xp  fc:01 1267 /lib/libcom_err.so.2.1
004b-004b1000 r--p 2000 fc:01 1267 /lib/libcom_err.so.2.1
004b1000-004b2000 rw-p 3000 fc:01 1267 /lib/libcom_err.so.2.1
004ec000-005c3000 r-xp  fc:01 1199   /lib/libkrb5.so.3.3
005c3000-005c9000 r--p 000d6000 fc:01 1199   /lib/libkrb5.so.3.3
005c9000-005ca000 rw-p 000dc000 fc:01 1199   /lib/libkrb5.so.3.3
00617000-00642000 r-xp  fc:01 2099439 
/home/centos/pg10_10mar/postgresql/edbpsql/lib/libpq.so.5.10
00642000-00644000 rw-p 0002a000 fc:01 2099439 
/home/centos/pg10_10mar/postgresql/edbpsql/lib/libpq.so.5.10

00792000-0079c000 r-xp  fc:01 1255 /lib/libkrb5support.so.0.1
0079c000-0079d000 r--p 9000 fc:01 1255 /lib/libkrb5support.so.0.1
0079d000-0079e000 rw-p a000 fc:01 1255 /lib/libkrb5support.so.0.1
007fd000-0083b000 r-xp  fc:01 1280 /lib/libgssapi_krb5.so.2.2
0083b000-0083c000 r--p 0003e000 fc:01 1280 /lib/libgssapi_krb5.so.2.2
0083c000-0083d000 rw-p 0003f000 fc:01 1280 /lib/libgssapi_krb5.so.2.2
0083f000-009ed000 r-xp  fc:01 292057 /usr/lib/libcrypto.so.1.0.1e
009ed000-009fd000 r--p 001ae000 fc:01 292057 /usr/lib/libcrypto.so.1.0.1e
009fd000-00a04000 rw-p 001be000 fc:01 292057 /usr/lib/libcrypto.so.1.0.1e
00a04000-00a07000 rw-p  00:00 0
00a51000-00a6f000 r-xp  fc:01 14109  /lib/ld-2.12.so
00a6f000-00a7 r--p 0001d000 fc:01 14109  /lib/ld-2.12.so
00a7-00a71000 rw-p 0001e000 fc:01 14109  /lib/ld-2.12.so
00a77000-00c08000 r-xp  fc:01 14110  /lib/libc-2.12.so
00c08000-00c0a000 r--p 00191000 fc:01 14110  /lib/libc-2.12.so
00c0a000-00c0b000 rw-p 00193000 fc:01 14110  /lib/libc-2.12.so
00c0b000-00c0e000 rw-p  00:00 0
00c1-00c22000 r-xp  fc:01 14355  /lib/libz.so.1.2.3
00c22000-00c23000 r--p 00011000 fc:01 14355  /lib/libz.so.1.2.3
00c23000-00c24000 rw-p 00012000 fc:01 14355  /lib/libz.so.1.2.3
00c52000-00c55000 r-xp  fc:01 14375  /lib/libdl-2.12.so
00c55000-00c56000 r--p 2000 fc:01 14375  /lib/libdl-2.12.so
00c56000-00c57000 rw-p 3000 fc:01 14375  /lib/libdl-2.12.so
00c59000-00c7 r-xp  fc:01 14379 /lib/libpthread-2.12.so
00c7-00c71000 r--p 00016000 fc:01 14379 /lib/libpthread-2.12.so
00c71000-00c72000 rw-p 00017000 fc:01 14379 /lib/libpthread-2.12.so
00c72000-00c74000 

Re: [HACKERS] Adding support for Default partition in partitioning

2017-03-09 Thread Jim Nasby

On 3/7/17 10:30 AM, Keith Fiske wrote:

I'm all for this feature and had suggested it back in the original


FWIW, I was working with a system just today that has an overflow partition.


thread to add partitioning to 10. I agree that adding a new partition
should not move any data out of the default. It's easy enough to set up


+1


a monitor to watch for data existing in the default. Perhaps also adding
a column to pg_partitioned_table that contains the oid of the default
partition so it's easier to identify from a system catalog perspective
and make that monitoring easier. I don't even see a need for it to fail


I agree that there should be a way to identify the default partition.


either and not quite sure how that would even work? If they can't add a
necessary child due to data being in the default, how can they ever get
it out?


Yeah, was wondering that as well...
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com




Re: [HACKERS] Parallel Append implementation

2017-03-09 Thread Ashutosh Bapat
On Fri, Mar 10, 2017 at 11:33 AM, Ashutosh Bapat wrote:
>>
>> But as far as code is concerned, I think the two-list approach will
>> turn out to be less simple if we derive corresponding two different
>> arrays in AppendState node. Handling two different arrays during
>> execution does not look clean. Whereas, the bitmapset that I have used
>> in Append has turned out to be very simple. I just had to do the below
>> check (and that is the only location) to see if it's a partial or
>> non-partial subplan. There is nowhere else any special handling for
>> non-partial subpath.
>>
>> /*
>> * Increment worker count for the chosen node, if at all we found one.
>> * For non-partial plans, set it to -1 instead, so that no other workers
>> * run it.
>> */
>> if (min_whichplan != PA_INVALID_PLAN)
>> {
>>if (bms_is_member(min_whichplan,
>> ((Append*)state->ps.plan)->partial_subplans_set))
>>padesc->pa_info[min_whichplan].pa_num_workers++;
>>else
>>padesc->pa_info[min_whichplan].pa_num_workers = -1;
>> }
>>
>> Now, since Bitmapset field is used during execution with such
>> simplicity, why not have this same data structure in AppendPath, and
>> re-use bitmapset field in Append plan node without making a copy of
>> it. Otherwise, if we have two lists in AppendPath, and a bitmap in
>> Append, again there is going to be code for data structure conversion.
>>
>
> I think there is some merit in separating out non-parallel and
> parallel plans within the same array or outside it. The current logic
> to assign plan to a worker looks at all the plans, unnecessarily
> hopping over the un-parallel ones after they are given to a worker. If
> we separate those two, we can keep assigning new workers to the
> non-parallel plans first and then iterate over the parallel ones when
> a worker needs a plan to execute. We might eliminate the need for
> special value -1 for num workers. You may separate those two kinds in
> two different arrays or within the same array and remember the
> smallest index of a parallel plan.

Further to that, with this scheme and the scheme to distribute workers
equally irrespective of the maximum workers per plan, you don't need
to "scan" the subplans to find the one with minimum workers. If you
treat the array of parallel plans as a circular queue, the plan to be
assigned next to a worker will always be the plan next to the one
which got assigned to the given worker. Once you have assigned workers
to non-parallel plans, initialize a shared variable next_plan to point
to the first parallel plan. When a worker comes asking for a plan,
assign the plan pointed to by next_plan and update it to the next plan
in the circular queue.
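
In code, the idea looks roughly like this (names are illustrative, not
from the patch; padesc lives in shared memory and callers are assumed
to hold the Append's synchronization lock):

static int
choose_next_plan(ParallelAppendDesc *padesc)
{
    int whichplan = padesc->pa_next_plan;

    /*
     * Advance the shared cursor; wrap from the last plan back to the
     * first parallel one, skipping the non-parallel plans at the front.
     */
    if (++padesc->pa_next_plan >= padesc->pa_nplans)
        padesc->pa_next_plan = padesc->pa_first_parallel_plan;

    return whichplan;
}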

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-09 Thread Peter Eisentraut
On 3/8/17 16:49, Andres Freund wrote:
>> make check-world -j2 seems to run fine for me.
> 
> Hm, I at least used to get a lot of spurious failures with this. I
> e.g. don't think the free port selection is race free.

I was also not sure about that, but as Michael has pointed out, that
doesn't matter anymore, because it now uses a private socket directory.

I have been pounding it a bit, and every so often the test_decoding
tests fail in mysterious ways, but otherwise it seems to work fine.  I'm
curious what you are seeing.

Combining make -j10 and prove -j4, I get the run time down to 2 minutes
and a bit, from 20+ minutes.  Using the -O option, if you have GNU make
>= 4, is also useful to get some more sane output.
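
For example (PROVE_FLAGS is the existing makefile hook for passing
options to prove; -O needs GNU make >= 4):

make -O -j10 check-world PROVE_FLAGS='-j4'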

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2017-03-09 Thread Craig Ringer
On 10 March 2017 at 02:55, Robert Haas  wrote:

> Well, that's why I tried to advocate that there should be two separate
> XID limits in shared memory: leave what's there now the way it is, and
> then add a new limit for "don't try to look up XIDs before this point:
> splat".  I still think that'd be less risky.

I'm coming back to this "cold" after an extended break, so I hope I
get the details right.

TL;DR: doing it that way duplicates a bunch of stuff and is ugly
without offering significant benefit over just fixing the original.


I started out with the approach you suggested, but it turns out to be
less helpful than you'd think. Simply advancing a new lower limit
field before beginning truncation is no help; there's then a race
where the lower-limit field can be advanced after we test it but
before we actually do the SLRU read from clog. To guard against this
it's necessary for SLRU truncation to take an exclusive lock during
which it advances the lower bound. That way a concurrent reader can
take the lock in shared mode before reading the lower bound and hold
it until it finishes the clog read, knowing that vacuum cannot then
truncate the data out from under it because it'll block trying to
advance the lower limit.

A spinlock isn't suitable for this. While we can take the lock only
briefly to update the limit field in vac_truncate_clog, in
txid_status() we have to hold it from when we test the boundary
through to when we finish the SLRU clog lookup, and that lookup does
I/O and might sleep. If we release it after testing the lower bound
but before the SLRU lookup our race comes back since vacuum can jump
in and truncate it out from under us. So now we need a new LWLock used
only for vac_truncate_clog before advancing the clog truncation.

I implemented just that - a new ClogTruncateLock in the lwlocks array
and a new field in ShmemVariableCache for the lower xid bound, IIRC.
Other than requiring an extra lwlock acquisition for vac_truncate_clog
it works fine ... for the master.
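
In outline it looked roughly like this (a sketch; the exact names in my
WIP differ):

/* txid_status(): bound test and clog read under one shared lock */
LWLockAcquire(ClogTruncateLock, LW_SHARED);
if (TransactionIdPrecedes(xid, ShmemVariableCache->oldestClogXid))
    ereport(ERROR,
            (errmsg("transaction %u is no longer in the commit log", xid)));
status = TransactionIdGetStatus(xid, &lsn);  /* may do I/O while held */
LWLockRelease(ClogTruncateLock);

/* vac_truncate_clog(): advance the bound exclusively, before truncating */
LWLockAcquire(ClogTruncateLock, LW_EXCLUSIVE);
ShmemVariableCache->oldestClogXid = newOldestXid;
LWLockRelease(ClogTruncateLock);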

But it doesn't fix the much bigger race on the standby. We only emit
WAL for xid limits after we truncate clog, and the clog truncation
record doesn't record the new limit.

So now we need a new, somewhat redundant, xlog record and redo
function for this lower clog bound pointer. Which, really, is only
actually tracking a slightly more up-to-date version of oldestXid.

At that point I was just papering over a race that should just be
fixed at its source instead. Advance oldestXid before truncating clog,
and write a clog truncation record that includes the new oldestXid.

So... I can go back to the old approach and just add the new xlog
record and redo method, new lwlock, new shmemvariablecache field, etc,
if you're really concerned this approach is risky. But I'd rather fix
the original problem instead.

It might be helpful if I separate out the patch that touches oldestXid
from the rest for separate review, so I'll update here with a 2-patch
series instead of a squashed single patch soon.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM

2017-03-09 Thread Jim Nasby

On 3/6/17 8:34 PM, Masahiko Sawada wrote:

I don't think it can say "1 frozen pages" because the number of
skipped pages according to visibility map is always more than 32
(SKIP_PAGES_THRESHOLD).


That's just an artifact of how the VM currently works. I'm not a fan of 
cross dependencies like that unless there's a pretty good reason.


BTW, I think there's already a function that handles the pluralization 
for you. IIRC it's one of the things you can add to an ereport() call.
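
For reference, that's errmsg_plural(); usage here would look something
like this (message text illustrative):

ereport(elevel,
        (errmsg_plural("skipped %u frozen page",
                       "skipped %u frozen pages",
                       nfrozen, nfrozen)));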

--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Tom Lane
Amit Kapila  writes:
> Just to let you know that I think I have figured out the reason for the
> failure.  If we run the regressions with the attached patch, it will make
> the regression tests fail consistently in the same way.  The patch just
> makes all transaction status updates go via the group clog update
> mechanism.

This does *not* give me a warm fuzzy feeling that this patch was
ready to commit.  Or even that it was tested to the claimed degree.

regards, tom lane




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Amit Kapila
On Fri, Mar 10, 2017 at 10:51 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Mar 9, 2017 at 9:17 PM, Tom Lane  wrote:
>>> Buildfarm thinks eight wasn't enough.
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam&dt=2017-03-10%2002%3A00%3A01
>
>> At first I was confused how you knew that this was the fault of this
>> patch, but this seems like a pretty good indicator:
>> TRAP: FailedAssertion("!(curval == 0 || (curval == 0x03 && status !=
>> 0x00) || curval == status)", File: "clog.c", Line: 574)
>
> Yeah, that's what led me to blame the clog-group-update patch.
>
>> I'm not sure whether it's related to this problem or not, but now that
>> I look at it, this (preexisting) comment looks like entirely wishful
>> thinking:
>>  * If we update more than one xid on this page while it is being written
>>  * out, we might find that some of the bits go to disk and others don't.
>>  * If we are updating commits on the page with the top-level xid that
>>  * could break atomicity, so we subcommit the subxids first before we 
>> mark
>>  * the top-level commit.
>
> Maybe, but that comment dates to 2008 according to git, and clam has
> been, er, happy as a clam up to now.  My money is on a newly-introduced
> memory-access-ordering bug.
>
> Also, I see clam reported in green just now, so it's not 100%
> reproducible :-(
>

Just to let you know that I think I have figured out the reason for the
failure.  If we run the regressions with the attached patch, it will make
the regression tests fail consistently in the same way.  The patch just
makes all transaction status updates go via the group clog update
mechanism.  Now, the reason for the problem is that the patch has
relied on the XidCache in PGPROC for subtransactions when they are not
overflowed, which is okay for Commits, but not for Rollback to
Savepoint and Rollback.  For Rollback to Savepoint, we just pass the
particular (sub)transaction id to abort, but the group mechanism will
abort all the sub-transactions in that top transaction.  I am still
analysing what could be the best way to fix this issue.  I think there
could be multiple ways to fix this problem.  One way is that we can
advertise the fact that the status update for the transaction involves
subtransactions and then use the XidCache for actually processing the
status update.  Second is to advertise all the subtransaction ids for
which status needs to be updated, but I am sure that is not at all
efficient as it will consume a lot of memory.  The last resort could be
that we don't use the group clog update optimization when the
transaction has sub-transactions.
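
To illustrate the last option, the gating could be as simple as this at
the top of the group-update entry point (a sketch with an abbreviated
signature, not an actual patch):

static bool
TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status,
                                int nsubxids)
{
    /*
     * Last resort: skip the group-update optimization when the
     * transaction has subtransactions; the caller then falls back to
     * the ordinary path that updates the page under CLogControlLock.
     */
    if (nsubxids > 0)
        return false;

    /* ... group-update logic for the simple single-xid case ... */
    return true;
}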

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


force_clog_group_commit_v1.patch
Description: Binary data



Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-09 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of David Steele
> PostgreSQL currently requires the file mode mask (umask) to be 0077.
> However, this precludes the possibility of a user in the postgres group
> performing a backup (or whatever).  Now that
> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
> unprivileged user, it makes sense to also allow a (relatively) unprivileged
> user to perform the backup at the file system level as well.

I'd like to help review this.  First, let me give some questions and comments.


1. What's the concrete use case for this feature?  Do you intend to extend the 
concept of multiple DBAs to the full range of administration of a single 
database instance, or just multiple OS users for database backup?
If you think that multiple OS user support is desirable to reduce the 
administration burden on a single person, then isn't an automated backup 
sufficient (such as with cron)? 

2. Backup should always be considered together with recovery.  If you allow 
another OS user to back up the database, can you allow him to recover the 
database as well?
For example, assume the PostgreSQL user account (the OS user who does initdb 
and pg_ctl start/stop) is dba1, and dba2 backs up the database using tar or 
cpio.
When dba2 restores the backup, the owner of the database cluster becomes dba2.  
If the file permissions only allow one user to write the files, then dba1 can't 
start the instance.

3. The default location of the SSL key file is $PGDATA, so the permission of the 
key file is likely to become 0640.  But the current code requires it to be 
0600; see src/backend/libpq/be-secure-openssl.c (and the sketch below).

4. I've seen a few users place a .pgpass file in $PGDATA and set the 
environment variable PGPASSFILE to point to it.  They expect it to be backed up 
with the other database files.  So I'm afraid the permission of the .pgpass file 
may also become 0640 at some point.  However, the current code requires it to be 
0600.  See src/interfaces/libpq/fe-connect.c.

5. I think some explanation of the concept of multiple OS users is necessary, 
such as here:

16.1. Short Version
https://www.postgresql.org/docs/devel/static/install-short.html

18.2. Creating a Database Cluster
https://www.postgresql.org/docs/devel/static/creating-cluster.html


[FYI]
Oracle instructs the user, during the software installation, to put "umask 022" 
in ~/.bashrc or so.
MySQL's files in the data directory appear to be 0640.
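
For reference, the checks mentioned in points 3 and 4 follow the same
stat()-based pattern; condensed, the server-side one is roughly:

struct stat buf;

if (stat(ssl_key_file, &buf) == 0 &&
    (buf.st_mode & (S_IRWXG | S_IRWXO)) != 0)
    ereport(FATAL,
            (errcode(ERRCODE_CONFIG_FILE_ERROR),
             errmsg("private key file \"%s\" has group or world access",
                    ssl_key_file)));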

Regards
Takayuki Tsunakawa




Re: [HACKERS] Parallel Append implementation

2017-03-09 Thread Ashutosh Bapat
>
> But as far as code is concerned, I think the two-list approach will
> turn out to be less simple if we derive corresponding two different
> arrays in AppendState node. Handling two different arrays during
> execution does not look clean. Whereas, the bitmapset that I have used
> in Append has turned out to be very simple. I just had to do the below
> check (and that is the only location) to see if it's a partial or
> non-partial subplan. There is nowhere else any special handling for
> non-partial subpath.
>
> /*
> * Increment worker count for the chosen node, if at all we found one.
> * For non-partial plans, set it to -1 instead, so that no other workers
> * run it.
> */
> if (min_whichplan != PA_INVALID_PLAN)
> {
>if (bms_is_member(min_whichplan,
> ((Append*)state->ps.plan)->partial_subplans_set))
>padesc->pa_info[min_whichplan].pa_num_workers++;
>else
>padesc->pa_info[min_whichplan].pa_num_workers = -1;
> }
>
> Now, since Bitmapset field is used during execution with such
> simplicity, why not have this same data structure in AppendPath, and
> re-use bitmapset field in Append plan node without making a copy of
> it. Otherwise, if we have two lists in AppendPath, and a bitmap in
> Append, again there is going to be code for data structure conversion.
>

I think there is some merit in separating out non-parallel and
parallel plans within the same array or outside it. The current logic
to assign plan to a worker looks at all the plans, unnecessarily
hopping over the un-parallel ones after they are given to a worker. If
we separate those two, we can keep assigning new workers to the
non-parallel plans first and then iterate over the parallel ones when
a worker needs a plan to execute. We might eliminate the need for
special value -1 for num workers. You may separate those two kinds in
two different arrays or within the same array and remember the
smallest index of a parallel plan.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] increasing the default WAL segment size

2017-03-09 Thread Beena Emerson
Hello,

On Tue, Mar 7, 2017 at 10:46 AM, Ashutosh Sharma wrote:

> Hi,
>
> I took a look at this patch. Overall, the patch looks good to me.
> However, there are some review comments that I would like to share,
>
> 1. I think the macro 'PATH_MAX' used in pg_waldump.c is specific
> to Linux. It needs to be changed to some constant value, or maybe
> MAXPGPATH, in order to make it platform independent.
>
> 2. As already mentioned by Jim and Kuntal upthread, you are trying to
> detect the configured WAL segment size in pg_waldump.c and
> pg_standby.c based on the size of a random WAL file, which
> doesn't look like a good idea. But then I think the only option we
> have is to pass the location of the pg_control file to the pg_waldump
> module along with the start and end WAL segments.
>
> 3. When trying to compile '02-initdb-walsegsize-v2.patch' on Windows,
> I got this warning message:
>
> Warning 1 warning C4005: 'DEFAULT_XLOG_SEG_SIZE' : macro
> redefinition
> c:\users\ashu\postgresql\src\include\pg_config_manual.h 20
>
> Apart from these, I am not having any comments as of now. I am still
> validating the patch on Windows. If I find any issues, I will update
> it.
>
Thank you for your reviews, Kuntal, Jim, and Ashutosh.

Attached is an updated 02 patch which:

   1. Calls RetrieveXLogSegSize(conn) in pg_receivewal.c
   2. Removes the warning on Windows
   3. Changes PATH_MAX in pg_waldump to MAXPGPATH

Regarding the usage of the WAL file size as the XLogSegSize, I agree with
what Robert has said. Generally, the WAL file will be of the expected
wal_segment_size; for it to have any other size, especially another valid
power-of-2 value, is extremely rare, and I feel it is not a major cause of
concern.
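
(For what it's worth, testing whether a detected size is even a legal
segment size is cheap; a hypothetical helper, not from the patch:)

/* Legal WAL segment sizes are powers of 2 between 1 MB and 1 GB */
static bool
is_valid_wal_seg_size(uint64 size)
{
    return size >= 1024 * 1024 &&
           size <= 1024 * 1024 * 1024 &&
           (size & (size - 1)) == 0;
}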

Thank you,

Beena Emerson
EnterpriseDB: http://www.enterprisedb.com


01-add-XLogSegmentOffset-macro.patch
Description: Binary data


02-initdb-walsegsize-v3.patch
Description: Binary data



Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-09 Thread Peter Eisentraut
On 2/28/17 20:58, David Steele wrote:
> This patch introduces a new initdb param, -u/--file-mode-mask, and a new
> GUC, file_mode_mask, to allow the default mode of files and directories
> in the $PGDATA directory to be modified.

The postmaster.pid file appears not to observe the configured mask.

There ought to be a test, perhaps under src/bin/initdb/, to check for
that kind of thing.

There is no documentation update for initdb.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)

2017-03-09 Thread Kuntal Ghosh
On Fri, Mar 10, 2017 at 3:11 AM, Andres Freund  wrote:
> On 2017-03-09 16:37:29 -0500, Tom Lane wrote:
>> Robert Haas  writes:
>> > On Thu, Mar 9, 2017 at 2:30 PM, Peter Eisentraut
>> >  wrote:
>> >> In practice, I think it's common to do a quick select * from
>> >> pg_stat_activity to determine whether a database instance is in use.
>>
>> > I thought of the same kind of thing, and it was discussed upthread.
>> > There seemed to be more votes for keeping it all in one view, but that
>> > could change if more people vote.
>>
>> I've not been paying much attention to this thread, but it seems like
>> something that would help Peter's use-case and have other uses as well
>> is a new column that distinguishes different process types --- user
>> session, background worker, autovacuum worker, etc.
>
> The patches upthread add precisely such a column.
>
The patch exposes auxiliary processes, autovacuum launcher and
bgworker along with other backends in pg_stat_activity. It also adds
an extra column, named proc_type (suggested by Craig
and Robert), to indicate the type of process in the pg_stat_activity view.
proc_type includes:

* client backend
* autovacuum launcher
* wal sender
* bgworker
* writer
* checkpointer
* wal writer
* wal receiver

Here is the present output with the relevant columns.
postgres=# SELECT wait_event_type, wait_event, state, proc_type FROM
pg_stat_activity;
 wait_event_type |     wait_event      | state  |      proc_type
-----------------+---------------------+--------+---------------------
 Activity        | AutoVacuumMain      | idle   | autovacuum launcher
 Activity        | LogicalLauncherMain | idle   | bgworker
 Activity        | WalSenderMain       | idle   | wal sender
                 |                     | active | client backend
 Activity        | BgWriterMain        | idle   | writer
 Activity        | CheckpointerMain    | idle   | checkpointer
 Activity        | WalWriterMain       | idle   | wal writer
(7 rows)


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-09 Thread Peter Eisentraut
On 2/28/17 13:02, Dmitry Dolgov wrote:
> +
> +-- Extract value by key
> +SELECT ('{"a": 1}'::jsonb)['a'];
> +
> +-- Extract nested value by key path
> +SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
> +
> +-- Extract element by index
> +SELECT ('[1, "2", null]'::jsonb)['1'];
> +
> +-- Update value by key
> +UPDATE table_name set jsonb_field['key'] = 1;
> +
> +-- Select records using where clause with subscripting
> +SELECT * from table_name where jsonb_field['key'] = '"value"';
> +

I see a possible problem here:  This design only allows one subscripting
function.  But what you'd really want in this case is at least two: one
taking an integer type for selecting by array index, and one taking text
for selecting by field name.

I suppose that since a given value can only be either an array or an
object, there is no ambiguity, but I think this might also lose some
error checking.  It might also not work the same way for other types.

It looks like your jsonb subscripting function just returns null if it
can't find a field, which is also a bit dubious.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Tom Lane
Robert Haas  writes:
> On Thu, Mar 9, 2017 at 9:17 PM, Tom Lane  wrote:
>> Buildfarm thinks eight wasn't enough.
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam&dt=2017-03-10%2002%3A00%3A01

> At first I was confused how you knew that this was the fault of this
> patch, but this seems like a pretty good indicator:
> TRAP: FailedAssertion("!(curval == 0 || (curval == 0x03 && status !=
> 0x00) || curval == status)", File: "clog.c", Line: 574)

Yeah, that's what led me to blame the clog-group-update patch.

> I'm not sure whether it's related to this problem or not, but now that
> I look at it, this (preexisting) comment looks like entirely wishful
> thinking:
>  * If we update more than one xid on this page while it is being written
>  * out, we might find that some of the bits go to disk and others don't.
>  * If we are updating commits on the page with the top-level xid that
>  * could break atomicity, so we subcommit the subxids first before we mark
>  * the top-level commit.

Maybe, but that comment dates to 2008 according to git, and clam has
been, er, happy as a clam up to now.  My money is on a newly-introduced
memory-access-ordering bug.

Also, I see clam reported in green just now, so it's not 100%
reproducible :-(

regards, tom lane




Re: [HACKERS] Parallel Append implementation

2017-03-09 Thread Amit Khandekar
On 10 March 2017 at 10:13, Ashutosh Bapat wrote:
> On Thu, Mar 9, 2017 at 6:28 PM, Robert Haas  wrote:
>> On Thu, Mar 9, 2017 at 7:42 AM, Ashutosh Bapat
>>  wrote:

 +if (rel->partial_pathlist != NIL &&
 +(Path *) linitial(rel->partial_pathlist) == subpath)
 +partial_subplans_set = bms_add_member(partial_subplans_set, i);

 This seems like a scary way to figure this out.  What if we wanted to
 build a parallel append subpath with some path other than the
 cheapest, for some reason?

Yes, there was an assumption that an append subpath will be either the
cheapest non-partial path or the cheapest (i.e. first in the list)
partial path, although in the patch there are no Asserts to make sure
that a common rule has been followed at both these places.

 I think you ought to record the decision
 that set_append_rel_pathlist makes about whether to use a partial path
 or a parallel-safe path, and then just copy it over here.
>>>
>>> I agree that assuming that a subpath is non-partial path if it's not
>>> cheapest of the partial paths is risky. In fact, we can not assume
>>> that even when it's not one of the partial_paths since it could have
>>> been kicked out or was never added to the partial path list like
>>> reparameterized path. But if we have to save the information about
>>> which of the subpaths are partial paths and which are not in
>>> AppendPath, it would take some memory, noticeable for thousands of
>>> partitions, which will leak if the path doesn't make into the
>>> rel->pathlist.
>>
>> True, but that's no different from the situation for any other Path
>> node that has substructure.  For example, an IndexPath has no fewer
>> than 5 list pointers in it.  Generally we assume that the number of
>> paths won't be large enough for the memory used to really matter, and
>> I think that will also be true here.  And an AppendPath has a list of
>> subpaths, and if I'm not mistaken, those list nodes consume more
>> memory than the tracking information we're thinking about here will.
>>
>
> What I have observed is that we try to keep the memory usage to a
> minimum, trying to avoid memory consumption as much as possible. Most
> of that substructure gets absorbed by the planner or is shared across
> paths. Append path lists are an exception to that, but we need
> something to hold all subpaths together and list is PostgreSQL's way
> of doing it. So, that's kind of unavoidable. And may be we will find
> some reason for almost every substructure in paths.
>
>> I think you're thinking about this issue because you've been working
>> on partitionwise join where memory consumption is a big issue, but
>> there are a lot of cases where that isn't really a big deal.
>
> :).
>
>>
>>> The purpose of that information is to make sure that we
>>> allocate only one worker to that plan. I suggested that we use
>>> path->parallel_workers for the same, but it seems that's not
>>> guaranteed to be reliable. The reasons were discussed upthread. Is
>>> there any way to infer whether we can allocate more than one workers
>>> to a plan by looking at the corresponding path?
>>
>> I think it would be smarter to track it some other way.  Either keep
>> two lists of paths, one of which is the partial paths and the other of
>> which is the parallel-safe paths, or keep a bitmapset indicating which
>> paths fall into which category.
>
> I like two lists: it consumes almost no memory (two list headers
> instead of one) compared to non-parallel-append when there are
> non-partial paths and, what's more, it consumes no extra memory when all
> paths are partial.

I agree that the two-lists approach will consume less memory than a
bitmapset. Keeping two lists effectively adds an extra pointer field
to the AppendPath size, but that size will not grow with the number of
subpaths, whereas the Bitmapset will.

But as far as code is concerned, I think the two-list approach will
turn out to be less simple if we derive corresponding two different
arrays in AppendState node. Handling two different arrays during
execution does not look clean. Whereas, the bitmapset that I have used
in Append has turned out to be very simple. I just had to do the below
check (and that is the only location) to see if it's a partial or
non-partial subplan. There is nowhere else any special handling for
non-partial subpath.

/*
* Increment worker count for the chosen node, if at all we found one.
* For non-partial plans, set it to -1 instead, so that no other workers
* run it.
*/
if (min_whichplan != PA_INVALID_PLAN)
{
   if (bms_is_member(min_whichplan,
((Append*)state->ps.plan)->partial_subplans_set))
   padesc->pa_info[min_whichplan].pa_num_workers++;
   else
   padesc->pa_info[min_whichplan].pa_num_workers = -1;
}

Now, since Bitmapset field is used during execution 

Re: [HACKERS] [PATCH] Enabling atomics on ARM64

2017-03-09 Thread Tom Lane
Andres Freund  writes:
> FWIW, Unless somebody seriously protests, I'm going to commit this soon-ish.

Sure, if you want to take responsibility for it, push away.

regards, tom lane




Re: [HACKERS] allow referring to functions without arguments when unique

2017-03-09 Thread Peter Eisentraut
On 3/7/17 00:32, Michael Paquier wrote:
>> OK. After a lookup, I am just seeing opfamily, opclass missing, so
>> this patch is doing it as you describe.

I'm not sure what you mean here.

>> @@ -7198,6 +7198,33 @@ function_with_argtypes:
>> n->objargs = extractArgTypes($2);
>> $$ = n;
>> }
>> This may not have arguments listed, so is function_with_argtypes really 
>> adapted?

Well, we could do something like function_with_opt_argtypes?

>> +   /*
>> +* If no arguments were specified, the name must yield a unique 
>> candidate.
>> +*/
>> +   if (nargs == -1 && clist)
>> +   {
>> +   if (clist->next)
>> +   ereport(ERROR,
>> I would have used list_length here for clarity.

This is actually not a "List" node.

>> The comment at the top of LookupFuncName() needs a refresh. The caller
>> can as well just use a function name without arguments.

Fixed.

> =# set search_path to 'public,popo';

I think you mean

set search_path to public,popo;

> =# drop function dup2;
> ERROR:  42883: function dup2() does not exist
> LOCATION:  LookupFuncName, parse_func.c:1944
> In this case I would have expected an error telling that the name is
> ambiguous. FuncnameGetCandidates() returns an empty list.

Your example works correctly if you set the schema path correctly.
However, the error message is misleading with the parentheses.  I have
updated that to create a different error message for this case, and
added a test case.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From ac83b6f71433d2101f67dd148a4af2aac78129ee Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 9 Mar 2017 23:58:48 -0500
Subject: [PATCH v2] Allow referring to functions without arguments when unique

In DDL commands referring to an existing function, allow omitting the
argument list if the function name is unique in its schema, per SQL
standard.

This uses the same logic that the regproc type uses for finding
functions by name only.
---
 doc/src/sgml/ref/alter_extension.sgml   |  2 +-
 doc/src/sgml/ref/alter_function.sgml| 13 +
 doc/src/sgml/ref/alter_opfamily.sgml|  7 +++--
 doc/src/sgml/ref/comment.sgml   |  2 +-
 doc/src/sgml/ref/create_cast.sgml   |  6 ++--
 doc/src/sgml/ref/create_transform.sgml  | 12 +---
 doc/src/sgml/ref/drop_function.sgml | 35 ---
 doc/src/sgml/ref/grant.sgml |  2 +-
 doc/src/sgml/ref/revoke.sgml|  2 +-
 doc/src/sgml/ref/security_label.sgml|  2 +-
 src/backend/nodes/copyfuncs.c   |  1 +
 src/backend/nodes/equalfuncs.c  |  1 +
 src/backend/parser/gram.y   | 27 ++
 src/backend/parser/parse_func.c | 37 +++--
 src/include/nodes/parsenodes.h  |  3 ++
 src/test/regress/expected/create_function_3.out | 11 +++-
 src/test/regress/sql/create_function_3.sql  |  8 ++
 17 files changed, 143 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/ref/alter_extension.sgml b/doc/src/sgml/ref/alter_extension.sgml
index de6d6dca16..a7c0927d1c 100644
--- a/doc/src/sgml/ref/alter_extension.sgml
+++ b/doc/src/sgml/ref/alter_extension.sgml
@@ -39,7 +39,7 @@
   EVENT TRIGGER object_name |
   FOREIGN DATA WRAPPER object_name |
   FOREIGN TABLE object_name |
-  FUNCTION function_name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) |
+  FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] |
   MATERIALIZED VIEW object_name |
   OPERATOR operator_name (left_type, right_type) |
   OPERATOR CLASS object_name USING index_method |
diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml
index 0388d06b95..168eeb7c52 100644
--- a/doc/src/sgml/ref/alter_function.sgml
+++ b/doc/src/sgml/ref/alter_function.sgml
@@ -21,15 +21,15 @@
 
  
 
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 action [ ... ] [ RESTRICT ]
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 RENAME TO new_name
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 SET SCHEMA new_schema
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 DEPENDS ON EXTENSION extension_name
 
 where action is one 

[HACKERS] Logical replication origin tracking fix

2017-03-09 Thread Petr Jelinek
Hi,

while discussing with Craig issues around restarting the logical
replication stream related to the patch he posted [1], I realized that
we track the wrong origin LSN in the logical replication apply.

We currently track commit_lsn, which is the *start* of the commit
record; what we need to track is end_lsn, which is the *end* of the
commit record, otherwise we might request a transaction that was
already replayed if the subscriber instance has crashed right after
commit.

The attached patch fixes that.
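
The substance of the change is one line in the apply worker's commit
handling, roughly:

 /* in apply_handle_commit(), when updating origin tracking */
-replorigin_session_origin_lsn = commit_data.commit_lsn;
+replorigin_session_origin_lsn = commit_data.end_lsn;
 replorigin_session_origin_timestamp = commit_data.committime;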

[1]
https://www.postgresql.org/message-id/CAMsr+YGFvikx-U_mHQ0mAzTarqvCpwzvsPKv=7mfp9scdrm...@mail.gmail.com

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


0001-Fix-remote-position-tracking-in-logical-replication.patch
Description: binary/octet-stream



Re: [HACKERS] Logical replication existing data copy

2017-03-09 Thread Petr Jelinek
On 09/03/17 18:48, Peter Eisentraut wrote:
> On 3/6/17 05:27, Petr Jelinek wrote:
>> And lastly I changed the automagic around exporting, not exporting and
>> using the snapshot produced by CREATE_REPLICATION_SLOT into explicit
>> parameter for the CREATE_REPLICATION_SLOT command (and added simple
>> framework for adding more of those if needed in the future).
> 
> It might be useful to make this into a separate patch, for clarity.
> 

Yes, already working on that (at least part of it), since it's useful
for other patches in CF.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical replication existing data copy

2017-03-09 Thread Petr Jelinek
On 09/03/17 18:46, Peter Eisentraut wrote:
> On 3/6/17 05:27, Petr Jelinek wrote:
>> updated and rebased version of the patch attached.
> 
> Some comments on this patch:
> 
> Can we have a better explanation of different snapshot options in
> CREATE_REPLICATION_SLOT.  We use all these variants in different
> places, but it's not fully documented why.  Think about interested
> users reading this.
> 

I am not quite sure what/where you want me to expand the docs; the
protocol doc explains what the options do. Are you missing the reasoning
for why we use them in the calling code?

> 
> errmsg("subscription table %u in subscription %u does not exist",
> 
> Use names instead of IDs if possible.
> 

Well, this message is more or less equal to a cache lookup failure; the
problem is that neither the relation nor the subscription is guaranteed
to exist if this fails. I can write code that emits two different
messages depending on whether they do, though I'm not quite sure it's
worth it there.

> 
> +   libpqrcv_table_list,
> +   libpqrcv_table_info,
> +   libpqrcv_table_copy,
> 
> I don't think these functions belong into the WAL receiver API, since
> they are specific to this particular logical replication
> implementation.  I would just make an API function libpqrc_exec_sql()
> that runs a query, and then put the table_*() functions as wrappers
> around that into tablesync.c.
> 

Yeah, I was wondering about that as well. The reason why it's in
libpqwalreceiver is that if we create some libpqrc_exec_sql as you say,
we'll need code that converts the PQresult to something consumable by the
backend, which has no access to the libpq API, and that seemed like quite
a bit of boilerplate work. I really don't want to write another libpq-like
or SPI-like interface for this. Maybe it would be acceptable to return the
result as a tuplestore?

> 
> Not sure what the worker init stuff in ApplyLauncherShmemInit() is
> doing.  Could be commented more.
> 

Eh, I copy-pasted the comment there from a different place; will fix.

> 
> max_sync_workers_per_subscription could be PGC_SIGHUP, I think.  And
> the minimum could be 0, to stop any syncing?
> 
> 
> pg_subscription_rel.h: I'm not sure under which circumstances we need
> to use BKI_ROWTYPE_OID.
> 
> 
> Does pg_subscription_rel need an OID column?  The index
> SubscriptionRelOidIndexId is not used anywhere.
>

Ah, leftover from the time it used dependencies for tracking.

> 
> You removed this command from the tests:
> 
> ALTER SUBSCRIPTION testsub SET PUBLICATION testpub, testpub1;
> 
> I suppose because it causes a connection.  We should have an option to
> prevent that (NOCONNECT/NOREFRESH?).

NOREFRESH makes more sense I guess.

> 
> Why was the test 'check replication origin was dropped on subscriber'
> removed?
> 

I don't know what you mean?

> 
> Attached also a small patch with some stylistic changes.
> 

Thanks.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Upgrading postmaster's log messages about bind/listen errors

2017-03-09 Thread Tom Lane
Robert Haas  writes:
> On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway  wrote:
>> On 03/09/2017 12:27 PM, Tom Lane wrote:
>>> For good measure I also added a DEBUG1 log message reporting successful
>>> binding to a port.  I'm not sure if there's an argument for putting this
>>> out at LOG level (i.e. by default) --- any thoughts about that?

>> +1 for making it LOG instead of DEBUG1

> I would tend to vote against that, because startup is getting
> gradually chattier and chattier, and I think this isn't likely to be
> of interest to very many people most of the time.

Yeah, my thought was that if we've gotten along without this for 20 years,
it's probably not of interest to most people most of the time.

However, if we're measuring this on a scale of usefulness to the average
DBA, I would argue that it's of more interest than any of these messages
that currently appear by default:

2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound 
protections are now enabled
2017-03-09 23:40:12.335 EST [19339] LOG:  autovacuum launcher started
2017-03-09 23:40:12.336 EST [19341] LOG:  logical replication launcher started

The first of those is surely past its sell-by date.  As for the other two,
we should log *failure* to start, but not the normal case.

regards, tom lane




Re: [HACKERS] Parallel Append implementation

2017-03-09 Thread Ashutosh Bapat
On Thu, Mar 9, 2017 at 6:28 PM, Robert Haas  wrote:
> On Thu, Mar 9, 2017 at 7:42 AM, Ashutosh Bapat
>  wrote:
>>>
>>> +if (rel->partial_pathlist != NIL &&
>>> +(Path *) linitial(rel->partial_pathlist) == subpath)
>>> +partial_subplans_set = bms_add_member(partial_subplans_set, i);
>>>
>>> This seems like a scary way to figure this out.  What if we wanted to
>>> build a parallel append subpath with some path other than the
>>> cheapest, for some reason?  I think you ought to record the decision
>>> that set_append_rel_pathlist makes about whether to use a partial path
>>> or a parallel-safe path, and then just copy it over here.
>>
>> I agree that assuming that a subpath is non-partial path if it's not
>> cheapest of the partial paths is risky. In fact, we can not assume
>> that even when it's not one of the partial_paths since it could have
>> been kicked out or was never added to the partial path list like
>> reparameterized path. But if we have to save the information about
>> which of the subpaths are partial paths and which are not in
>> AppendPath, it would take some memory, noticeable for thousands of
>> partitions, which will leak if the path doesn't make into the
>> rel->pathlist.
>
> True, but that's no different from the situation for any other Path
> node that has substructure.  For example, an IndexPath has no fewer
> than 5 list pointers in it.  Generally we assume that the number of
> paths won't be large enough for the memory used to really matter, and
> I think that will also be true here.  And an AppendPath has a list of
> subpaths, and if I'm not mistaken, those list nodes consume more
> memory than the tracking information we're thinking about here will.
>

What I have observed is that we try to keep the memory usage to a
minimum, trying to avoid memory consumption as much as possible. Most
of that substructure gets absorbed by the planner or is shared across
paths. Append path lists are an exception to that, but we need
something to hold all subpaths together and list is PostgreSQL's way
of doing it. So, that's kind of unavoidable. And maybe we will find
some reason for almost every substructure in paths.

> I think you're thinking about this issue because you've been working
> on partitionwise join where memory consumption is a big issue, but
> there are a lot of cases where that isn't really a big deal.

:).

>
>> The purpose of that information is to make sure that we
>> allocate only one worker to that plan. I suggested that we use
>> path->parallel_workers for the same, but it seems that's not
>> guaranteed to be reliable. The reasons were discussed upthread. Is
>> there any way to infer whether we can allocate more than one workers
>> to a plan by looking at the corresponding path?
>
> I think it would be smarter to track it some other way.  Either keep
> two lists of paths, one of which is the partial paths and the other of
> which is the parallel-safe paths, or keep a bitmapset indicating which
> paths fall into which category.

I like two lists: it consumes almost no memory (two list headers
instead of one) compared to non-parallel-append when there are
non-partial paths and, what's more, it consumes no extra memory when all
paths are partial.
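
In other words, something like this in the path node (a sketch of the
shape only, not actual patch code):

typedef struct AppendPath
{
    Path        path;
    List       *partial_subpaths;     /* may be run by several workers */
    List       *nonpartial_subpaths;  /* must each get exactly one worker */
} AppendPath;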

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-03-09 Thread Ashutosh Bapat
>>
>> The new name merge_fdw_options() is shorter than the one I chose, but
>> we are not exactly merging options for an upper relation since there
>> isn't the other fpinfo to merge from. But I think we can live with
>> merge_fdw_options().
>
> Perhaps "combine" is a better word? I didn't really see a problem with
> that. After I posted this I wished I'd made the function even more
> generic to accept either side as NULL and make up the new fpinfo from
> the non-NULL one, or merge if both were non-NULL.  I liked that way
> much better than giving the function too much knowledge of what its
> purpose actually is. It's more likely to get used for something else
> in the future, which means there's less chance that someone else will
> make the same mistake.

It's more like a copy for an upper rel and a merge for a join rel. Your
point about making it side-agnostic is good, but I doubt we want to
spend more effort there. If somebody writes code with the input plan
stuffed in the innerrel instead of the outerrel, they will notice it
immediately when testing, as an assertion will fail or there will be a
straight segfault. We will decide what to do then.
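
For the record, a minimal sketch of the side-agnostic shape (the
copy/merge split and the fields touched here are assumptions, not the
committed API):

    /* Hypothetical: accept either input as NULL; copy from the
     * non-NULL one, merge when both are present. */
    static void
    merge_fdw_options(PgFdwRelationInfo *fpinfo,
                      const PgFdwRelationInfo *fpinfo_o,
                      const PgFdwRelationInfo *fpinfo_i)
    {
        Assert(fpinfo_o != NULL || fpinfo_i != NULL);

        if (fpinfo_o == NULL)
            fpinfo_o = fpinfo_i;        /* degenerate case: plain copy */

        fpinfo->use_remote_estimate = fpinfo_o->use_remote_estimate;
        fpinfo->fetch_size = fpinfo_o->fetch_size;

        if (fpinfo_i != NULL && fpinfo_i != fpinfo_o)
        {
            /* merging a join: OR the boolean option from the other side */
            fpinfo->use_remote_estimate |= fpinfo_i->use_remote_estimate;
        }
    }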

>
>> Once I fixed that, the test cases started showing an assertion failure,
>> since the fpinfo of a base relation cannot have an outerrel. Fixed the
>> assertion as well. If we are passing the fpinfos of the joining
>> relations, there's no need to have outerrel and innerrel in the join's
>> fpinfo.
>
> Looks like you forgot to update the comment on the Assert()

Yes, and I also forgot to update the function prologue to refer to
fpinfo_o/i instead of the inner and outer relations. The attached patch
corrects it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


foreign_outerjoin_pushdown_fix_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 8:17 PM, Bruce Momjian  wrote:
>> In my opinion, we expose query id (and dbid, and userid) as the
>> canonical identifier for each pg_stat_statements entry, and have done
>> so for some time. That's the stable API -- not query text. I'm aware
>> of cases where query text was used as an identifier, but that ended up
>> being hashed anyway.
>
> Speaking of hash values for queries, someone once asked me if we could
> display a hash value for queries displayed in pg_stat_activity and
> pg_stat_statements so they could take a running query and look in
> pg_stat_statements to see how long it usually runs.  It seemed like a
> useful idea to me.

I agree.

> I don't think they can hash the query manually because of the constants
> involved.

It would be a matter of having postgres expose Query.queryId (or the
similar field in PlannedStmt, I suppose). Morally, that field belongs
to pg_stat_statements, since it was written to make the query
normalization stuff work, and because everything would break if
another extension attempted to use it as pg_stat_statements does.
Whether or not that makes it okay to expose the hash value in
pg_stat_activity like that is above my pay grade, as Tom would say.
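
For illustration, if pg_stat_activity grew a queryid column matching
the one pg_stat_statements computes (no such column exists today), the
lookup could be as simple as:

    -- Hypothetical: a.queryid does not exist yet; shown for illustration.
    SELECT a.pid, a.query, s.calls, s.mean_time
    FROM pg_stat_activity a
    JOIN pg_stat_statements s
      ON s.dbid = a.datid
     AND s.userid = a.usesysid
     AND s.queryid = a.queryid
    WHERE a.state = 'active';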

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-09 Thread Bruce Momjian
On Sat, Mar  4, 2017 at 10:52:44AM -0800, Peter Geoghegan wrote:
> On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane  wrote:
> > Meh ... we've generally regretted it when we "solved" a backwards
> > compatibility problem by introducing a GUC that changes query semantics.
> > I'm inclined to think we should either do it or not.
> 
> In my opinion, we expose query id (and dbid, and userid) as the
> canonical identifier for each pg_stat_statements entry, and have done
> so for some time. That's the stable API -- not query text. I'm aware
> of cases where query text was used as an identifier, but that ended up
> being hashed anyway.

Speaking of hash values for queries, someone once asked me if we could
display a hash value for queries displayed in pg_stat_activity and
pg_stat_statements so they could take a running query and look in
pg_stat_statements to see how long it usually runs.  It seemed like a
useful idea to me.

I don't think they can hash the query manually because of the constants
involved.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-09 Thread Petr Jelinek
On 09/03/17 19:50, Peter van Hardenberg wrote:
> Anecdotally, we just stored dates as strings and used a convention (key
> ends in "_at", I believe) to interpret them. The lack of support for
> dates in JSON is well-known, universally decried... and not a problem
> the PostgreSQL community can fix.
> 

The original complaint was about JSON_VALUE extracting a date, but I
don't understand why there is a problem with that; the SQL/JSON spec
defines that behavior. The RETURNING clause there is more or less just
shorthand for casting, with some advanced options.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign partition DDL regression tests

2017-03-09 Thread Ashutosh Bapat
On Thu, Mar 9, 2017 at 11:44 PM, Robert Haas  wrote:
> On Thu, Mar 9, 2017 at 1:19 AM, Ashutosh Bapat
>  wrote:
>>>> At least we need to update the documentation.
>>>
>>> Got a proposal?
>>
>> How about something like attached?
>
> Committed with some revisions.

Thanks.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea_output vs make installcheck

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 6:45 PM, Neha Khatri  wrote:
> Sorry about the naive question, but if someone has set the GUC bytea_output
> = 'escape', then the intention seems to be to obtain bytea output in
> 'escape' format.
> With this, if an installcheck is done, that might also have been done with
> the expectation that the output will be in 'escape' format. In that case,
> how justified is it to hard-code the format for the regression database?
> However, I agree that there are not many bytea outputs in the current
> regression suite.

I don't understand this.  People don't run the regression tests to get
the output.  They run the regression tests to see whether they pass.
While it may not be possible to make them pass with arbitrarily-crazy
settings, that's not a reason not to patch up the cases we can handle
sanely.
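
For instance, something along these lines would pin the format for the
regression database regardless of the installation-wide setting (an
illustrative sketch, not necessarily the exact committed fix):

    ALTER DATABASE regression SET bytea_output = 'hex';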

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 7:12 PM, Peter Geoghegan  wrote:
>> Hm - I think it's fair to export RecentGlobalXmin, given that we
>> obviously use it across modules in core code.  I see very little reason
>> not to export it.
>
> Well, the assertion is completely useless as anything but documentation...

I came up with the attached.

-- 
Peter Geoghegan
From 05a8c0de7da7cd9ee4d8436abdf8368c1cf8ac19 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Thu, 9 Mar 2017 19:17:07 -0800
Subject: [PATCH] Fix amcheck build on Windows.

Remove documenting assertion around RecentGlobalXmin, rather than
marking it PGDLLIMPORT.

In passing, update some comments that did not reflect the state of
amcheck as-committed.
---
 contrib/amcheck/verify_nbtree.c | 74 +++--
 1 file changed, 35 insertions(+), 39 deletions(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 5724aa6..d21b209 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -26,7 +26,6 @@
 #include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
-#include "utils/snapmgr.h"
 
 
 PG_MODULE_MAGIC;
@@ -55,8 +54,8 @@ typedef struct BtreeCheckState
 
 	/* B-Tree Index Relation */
 	Relation	rel;
-	/* ExclusiveLock held? */
-	bool		exclusivelylocked;
+	/* ShareLock held on heap/index, rather than AccessShareLock? */
+	bool		strongerlock;
 	/* Per-page context */
 	MemoryContext targetcontext;
 	/* Buffer access strategy */
@@ -94,7 +93,7 @@ PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
 static void bt_index_check_internal(Oid indrelid, bool parentcheck);
 static inline void btree_index_checkable(Relation rel);
-static void bt_check_every_level(Relation rel, bool exclusivelylocked);
+static void bt_check_every_level(Relation rel, bool strongerlock);
 static BtreeLevel bt_check_level_from_leftmost(BtreeCheckState *state,
 			 BtreeLevel level);
 static void bt_target_page_check(BtreeCheckState *state);
@@ -256,23 +255,26 @@ btree_index_checkable(Relation rel)
  * logical order, verifying invariants as it goes.
  *
  * It is the caller's responsibility to acquire appropriate heavyweight lock on
- * the index relation, and advise us if extra checks are safe when an
- * ExclusiveLock is held.
+ * the index relation, and advise us if extra checks are safe when a ShareLock
+ * is held.
  *
- * An ExclusiveLock is generally assumed to prevent any kind of physical
+ * A ShareLock is generally assumed to prevent any kind of physical
  * modification to the index structure, including modifications that VACUUM may
  * make.  This does not include setting of the LP_DEAD bit by concurrent index
  * scans, although that is just metadata that is not able to directly affect
  * any check performed here.  Any concurrent process that might act on the
  * LP_DEAD bit being set (recycle space) requires a heavyweight lock that
- * cannot be held while we hold an ExclusiveLock.  (Besides, even if that could
+ * cannot be held while we hold a ShareLock.  (Besides, even if that could
  * happen, the ad-hoc recycling when a page might otherwise split is performed
  * per-page, and requires an exclusive buffer lock, which wouldn't cause us
  * trouble.  _bt_delitems_vacuum() may only delete leaf items, and so the extra
  * parent/child check cannot be affected.)
+ *
+ * Note that we rely on RecentGlobalXmin interlock against recycling, just like
+ * standard index scans.  See note on RecentGlobalXmin/B-Tree page deletion.
  */
 static void
-bt_check_every_level(Relation rel, bool exclusivelylocked)
+bt_check_every_level(Relation rel, bool strongerlock)
 {
 	BtreeCheckState *state;
 	Page		metapage;
@@ -281,17 +283,11 @@ bt_check_every_level(Relation rel, bool exclusivelylocked)
 	BtreeLevel	current;
 
 	/*
-	 * RecentGlobalXmin assertion matches index_getnext_tid().  See note on
-	 * RecentGlobalXmin/B-Tree page deletion.
-	 */
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
-
-	/*
 	 * Initialize state for entire verification operation
 	 */
 	state = palloc(sizeof(BtreeCheckState));
 	state->rel = rel;
-	state->exclusivelylocked = exclusivelylocked;
+	state->strongerlock = strongerlock;
 	/* Create context for page */
 	state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
  "amcheck context",
@@ -353,7 +349,7 @@ bt_check_every_level(Relation rel, bool exclusivelylocked)
 
 /*
  * Given a left-most block at some level, move right, verifying each page
- * individually (with more verification across pages for "exclusivelylocked"
+ * individually (with more verification across pages for "strongerlock"
  * callers).  Caller should pass the true root page as the leftmost initially,
  * working their way down by passing what is returned for the last call here
  * until level 0 (leaf page level) was reached.
@@ -428,7 +424,7 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 			 * locking, check 

Re: [HACKERS] Upgrading postmaster's log messages about bind/listen errors

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway  wrote:
> On 03/09/2017 12:27 PM, Tom Lane wrote:
>> Over in
>> https://www.postgresql.org/message-id/flat/201703072317.01345.john.iliffe%40iliffe.ca
>> we spent quite a lot of effort to diagnose what turned out to be a simple
>> networking misconfiguration.  It would probably have taken a lot less
>> effort if the postmaster were more forthcoming about exactly what address
>> it's trying to bind to.  I seem to recall having wanted to include that
>> info in the messages many years ago, but at the time we lacked any
>> reasonably-portable way to decode a struct addrinfo.  Now we have
>> pg_getnameinfo_all(), so PFA a patch to include the specific address in
>> any complaint about failures in the socket/bind/listen sequence.
>>
>> For good measure I also added a DEBUG1 log message reporting successful
>> binding to a port.  I'm not sure if there's an argument for putting this
>> out at LOG level (i.e. by default) --- any thoughts about that?
>
> +1 for making it LOG instead of DEBUG1

I would tend to vote against that, because startup is getting
gradually chattier and chattier, and I think this isn't likely to be
of interest to very many people most of the time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 9:34 PM, Amit Kapila  wrote:
> Do we really need to set LSN on this page (or mark it dirty), if so
> why?  Are you worried about restoration of FPI or something else?

I haven't thought through all of the possible consequences and am a
bit too tired to do so just now, but doesn't it seem rather risky to
invent a whole new way of using these xlog functions?
src/backend/access/transam/README describes how to do write-ahead
logging properly, and neither MarkBufferDirty() nor PageSetLSN() is
described as an optional step.
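
For reference, a condensed sketch of the sequence that README
prescribes; the rmgr id and record info flag below are placeholders,
not the hash AM's actual values:

    /* Per src/backend/access/transam/README: no step here is optional. */
    XLogRecPtr  recptr;

    START_CRIT_SECTION();

    /* ... modify the page in shared buffers ... */
    MarkBufferDirty(buf);

    XLogBeginInsert();
    XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
    recptr = XLogInsert(RM_HASH_ID, XLOG_HASH_EXAMPLE);  /* placeholders */
    PageSetLSN(BufferGetPage(buf), recptr);

    END_CRIT_SECTION();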

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 7:10 PM, Andres Freund  wrote:
>>   verify_nbtree.obj : error LNK2001: unresolved external symbol
>> RecentGlobalXmin
>> [C:\buildfarm\buildenv\HEAD\pgsql.build\amcheck.vcxproj]
>>
>> Rather than marking RecentGlobalXmin as PGDLLIMPORT, I'd rather just
>> remove the documenting assertion and leave that comment as-is. I'll
>> work on a patch for this soon.
>
> Hm - I think it's fair to export RecentGlobalXmin, given that we
> obviously use it across modules in core code.  I see very little reason
> not to export it.

Well, the assertion is completely useless as anything but documentation...


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Andres Freund
Hi,

On 2017-03-09 19:04:46 -0800, Peter Geoghegan wrote:
> On Thu, Mar 9, 2017 at 4:41 PM, Andres Freund  wrote:
> > I don't really expect buildfarm fallout, but ..
> 
> Unfortunately, there is some on Windows:

Thanks for monitoring.


>   verify_nbtree.obj : error LNK2001: unresolved external symbol
> RecentGlobalXmin
> [C:\buildfarm\buildenv\HEAD\pgsql.build\amcheck.vcxproj]
> 
> Rather than marking RecentGlobalXmin as PGDLLIMPORT, I'd rather just
> remove the documenting assertion and leave that comment as-is. I'll
> work on a patch for this soon.

Hm - I think it's fair to export RecentGlobalXmin, given that we
obviously use it across modules in core code.  I see very little reason
not to export it.
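
Concretely, that would just mean adding PGDLLIMPORT to the existing
declaration, something like this (header location from memory):

    /* in utils/snapmgr.h, if I recall correctly */
    extern PGDLLIMPORT TransactionId RecentGlobalXmin;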

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 4:41 PM, Andres Freund  wrote:
> And pushed with these.

Thanks.

> I don't really expect buildfarm fallout, but ..

Unfortunately, there is some on Windows:

  verify_nbtree.obj : error LNK2001: unresolved external symbol
RecentGlobalXmin
[C:\buildfarm\buildenv\HEAD\pgsql.build\amcheck.vcxproj]

Rather than marking RecentGlobalXmin as PGDLLIMPORT, I'd rather just
remove the documenting assertion and leave that comment as-is. I'll
work on a patch for this soon.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bug in get_partition_for_tuple

2017-03-09 Thread Amit Langote
Just observed a crash due to a thinko in the logic that handles a NULL
partition key.  The absence of a null-accepting partition in this case
should have caused an error; instead, the current code proceeds with
the comparison, resulting in a crash.

create table p (a int, b char) partition by list (b);
create table p1 partition of p for values in ('a');
insert into p values (1);   -- crashes

Attached patch fixes that and adds a test.

Thanks,
Amit
>From 41f4d396267a817419eb338ef437e2c33d6862d9 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 10 Mar 2017 11:53:41 +0900
Subject: [PATCH] Fix a bug in get_partition_for_tuple

Handle the NULL partition key more carefully in the case of list
partitioning.
---
 src/backend/catalog/partition.c  | 10 +++---
 src/test/regress/expected/insert.out |  7 +++
 src/test/regress/sql/insert.sql  |  6 ++
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index e01ef864f0..a5ba7448eb 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1726,10 +1726,14 @@ get_partition_for_tuple(PartitionDispatch *pd,
 		errmsg("range partition key of row contains null")));
 		}
 
-		if (partdesc->boundinfo->has_null && isnull[0])
-			/* Tuple maps to the null-accepting list partition */
+		/*
+		 * A null partition key is only acceptable if a null-accepting list
+		 * partition exists.
+		 */
+		cur_index = -1;
+		if (isnull[0] && partdesc->boundinfo->has_null)
 			cur_index = partdesc->boundinfo->null_index;
-		else
+		else if (!isnull[0])
 		{
 			/* Else bsearch in partdesc->boundinfo */
 			bool		equal = false;
diff --git a/src/test/regress/expected/insert.out b/src/test/regress/expected/insert.out
index 116854e142..9553fc597a 100644
--- a/src/test/regress/expected/insert.out
+++ b/src/test/regress/expected/insert.out
@@ -365,6 +365,13 @@ DETAIL:  Failing row contains (1, 2).
 insert into mlparted1 (a, b) values (2, 3);
 ERROR:  new row for relation "mlparted11" violates partition constraint
 DETAIL:  Failing row contains (3, 2).
+-- check routing error through a list partitioned table when the key is null
+create table lparted_nonullpart (a int, b char) partition by list (b);
+create table lparted_nonullpart_a partition of lparted_nonullpart for values in ('a');
+insert into lparted_nonullpart values (1);
+ERROR:  no partition of relation "lparted_nonullpart" found for row
+DETAIL:  Partition key of the failing row contains (b) = (null).
+drop table lparted_nonullpart;
 -- check that RETURNING works correctly with tuple-routing
 alter table mlparted drop constraint check_b;
 create table mlparted12 partition of mlparted1 for values from (5) to (10);
diff --git a/src/test/regress/sql/insert.sql b/src/test/regress/sql/insert.sql
index c56c3c22f8..e66783bb9b 100644
--- a/src/test/regress/sql/insert.sql
+++ b/src/test/regress/sql/insert.sql
@@ -226,6 +226,12 @@ insert into mlparted values (1, 2);
 -- selected by tuple-routing
 insert into mlparted1 (a, b) values (2, 3);
 
+-- check routing error through a list partitioned table when the key is null
+create table lparted_nonullpart (a int, b char) partition by list (b);
+create table lparted_nonullpart_a partition of lparted_nonullpart for values in ('a');
+insert into lparted_nonullpart values (1);
+drop table lparted_nonullpart;
+
 -- check that RETURNING works correctly with tuple-routing
 alter table mlparted drop constraint check_b;
 create table mlparted12 partition of mlparted1 for values from (5) to (10);
-- 
2.11.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 6:25 PM, Robert Haas  wrote:
> On Thu, Mar 9, 2017 at 7:29 PM, Thomas Munro
>  wrote:
>> Suppressing ENOENT because it's not clear which backend actually owns
>> a file is a bit different from using it to detect that there are no
>> more segments...
>
> +1, emphatically.

I don't feel strongly either way on this, FWIW.

>> Obviously I screwed some things up in the code I
>> posted, but I think the general idea that the DSM segment owns the
>> files on disk is a good one.
>
> +1 to that, too.  The DSM has exactly the lifetime that we want here;
> no backend or resowner involved in the transaction does.

I actually agree with you, from a theoretical perspective. But the
consequences of that theoretical point seem pretty extensive.
Following through with that by pushing much more file state into
shared memory has significant complexity and risk, if it's possible at
all without something like "palloc(), but for shared memory". I would
like to see a practical benefit.

>> I figure that it (via the detach hook)
>> should be able to delete the files using only data in shmem, without
>> reference to any backend-local memory, so that file cleanup is
>> entirely disconnected from backend-local resource cleanup (fds and
>> memory).
>
> That's one approach.  Another approach is to somehow tie the two
> together; for example, I thought about teaching FileClose that where
> it currently calls unlink() to get rid of the temporary file, it would
> first go check some shared reference count and decrement it, skipping
> the unlink if the result was >0.  But that only works if you have a
> separate chunk of shared memory per File, which seems like it won't
> work for what you need.

It won't work for what Thomas needs because that decision is made per
segment. But, the decision to decrement refcount really does need to
be made at the BufFile level, so Thomas is still not wrong to
structure things that way.

What somewhat justifies the idea of an on_dsm_detach() callback using
a pointer located in shared memory to reach local memory in one
backend (for error handling purposes) is the fact that it's already
tacitly assumed that the local memory used for the BufFile is released
only after the resowner clean-up of BufFiles. Otherwise, somebody might
get in big trouble by calling BufFileClose() or something similar in an
error path. Arguably, the reliance on ordering already exists today.

I'm not saying that that's a good plan, or even an acceptable
trade-off. Pick your poison.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 9:17 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think eight is enough.  Committed with some cosmetic changes.
>
> Buildfarm thinks eight wasn't enough.
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam&dt=2017-03-10%2002%3A00%3A01

At first I was confused about how you knew that this was the fault of
this patch, but this seems like a pretty clear indicator:

TRAP: FailedAssertion("!(curval == 0 || (curval == 0x03 && status !=
0x00) || curval == status)", File: "clog.c", Line: 574)

I'm not sure whether it's related to this problem or not, but now that
I look at it, this (preexisting) comment looks like entirely wishful
thinking:

 * If we update more than one xid on this page while it is being written
 * out, we might find that some of the bits go to disk and others don't.
 * If we are updating commits on the page with the top-level xid that
 * could break atomicity, so we subcommit the subxids first before we mark
 * the top-level commit.

The problem with that is the word "before".  There are no memory
barriers here, so there's zero guarantee that other processes see the
writes in the order they're performed here.  But it might be a stretch
to suppose that that would cause this symptom.

Maybe we should replace that Assert() with an elog() and dump out the
actual values.
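
Something like the following sketch, using the clog.h names for the
0x00/0x03 states (untested):

    if (!(curval == 0 ||
          (curval == TRANSACTION_STATUS_SUB_COMMITTED &&
           status != TRANSACTION_STATUS_IN_PROGRESS) ||
          curval == status))
        elog(PANIC, "unexpected clog state: curval = %d, status = %d",
             (int) curval, (int) status);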

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-09 Thread Amit Kapila
On Thu, Mar 9, 2017 at 11:15 PM, Robert Haas  wrote:
> On Thu, Mar 9, 2017 at 10:23 AM, Amit Kapila  wrote:
>> Right, if we use XLogReadBufferForRedoExtended() instead of
>> XLogReadBufferExtended()/LockBufferForCleanup in the replay routine,
>> then it will restore the FPI when required and can serve our purpose of
>> acquiring a cleanup lock on the primary bucket page.  Yet another way
>> could be to store some information like the block number, relfilenode,
>> and forknum (maybe we can get away with less info) of the primary
>> bucket in the fixed part of each of these records and then use that to
>> read the page and take a cleanup lock.  Now, since in most cases this
>> buffer needs to be registered only when there are multiple overflow
>> pages, I think having that fixed information might hurt in some cases.
>
> Just because the WAL record gets larger?  I think you could arrange to
> record the data only in the cases where it is needed, but I'm also not
> sure it would matter that much anyway.  Off-hand, it seems better than
> having to mark the primary bucket page dirty (because you have to set
> the LSN) every time any page in the chain is modified.
>

Do we really need to set LSN on this page (or mark it dirty), if so
why?  Are you worried about restoration of FPI or something else?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Amit Kapila
On Fri, Mar 10, 2017 at 7:47 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think eight is enough.  Committed with some cosmetic changes.
>
> Buildfarm thinks eight wasn't enough.
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam&dt=2017-03-10%2002%3A00%3A01
>

Will look into this. I don't have access to that machine, but it looks
to be a POWER machine, and I have access to a somewhat similar one.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 7:29 PM, Thomas Munro
 wrote:
> Suppressing ENOENT because it's not clear which backend actually owns
> a file is a bit different from using it to detect that there are no
> more segments...

+1, emphatically.

> Obviously I screwed some things up in the code I
> posted, but I think the general idea that the DSM segment owns the
> files on disk is a good one.

+1 to that, too.  The DSM has exactly the lifetime that we want here;
no backend or resowner involved in the transaction does.

> I figure that it (via the detach hook)
> should be able to delete the files using only data in shmem, without
> reference to any backend-local memory, so that file cleanup is
> entirely disconnected from backend-local resource cleanup (fds and
> memory).

That's one approach.  Another approach is to somehow tie the two
together; for example, I thought about teaching FileClose that where
it currently calls unlink() to get rid of the temporary file, it would
first go check some shared reference count and decrement it, skipping
the unlink if the result was >0.  But that only works if you have a
separate chunk of shared memory per File, which seems like it won't
work for what you need.
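
A minimal sketch of that idea, with the shared counter invented here
for illustration (nothing like it exists in fd.c today):

    #include "storage/fd.h"
    #include "port/atomics.h"
    #include <unistd.h>

    /* Hypothetical close path for a shared temp file. */
    static void
    SharedTempFileClose(File file, pg_atomic_uint32 *refcount)
    {
        /* FilePathName() returns a transient pointer owned by fd.c */
        char   *path = FilePathName(file);

        if (pg_atomic_sub_fetch_u32(refcount, 1) == 0)
            unlink(path);           /* last reference: remove from disk */

        /* assumes FileClose() itself no longer unlinks the file */
        FileClose(file);
    }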

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 2:19 PM, Peter Geoghegan  wrote:
> Quite a lot of thought seems to have gone into making the
> fd.c/resowner mechanism leak-proof in the face of errors. So, quite
> apart from what that approaches misses out on, I really don't want to
> change resource management too much. As I went into in my "recap", it
> seems useful to change as little as possible about temp files.

I think this is just wishful thinking.  resowners are backend-private,
and they are robust as long as they are managing files accessed by a
single backend.  Trying to bend them into some kind of a state where
they can manage a resource shared across multiple cooperating backends
is fundamentally changing the model.  What matters is not whether the
old model was any good but whether the new one is any good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix hard-coded relkind constants in pg_dump.c.

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 10:59 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Fri, Mar 10, 2017 at 9:19 AM, Tom Lane  wrote:
>>> Existing style is mostly to inject relkind values into constructed
>>> query strings using %c.  I did not bother to touch places that did it
>>> like that, but really a better technique is to stringify the RELKIND
>>> macro, at least in places where you'd want single quotes around the
>>> code character.  That avoids any runtime effort and keeps the RELKIND
>>> symbol close to where it's used.
>
> I have been wondering about the lack of readability of those
> hardcoded relkinds in the code for some time but... Wouldn't it be
> better to also change psql's describe.c and tab_complete.c, as well
> as the pg_upgrade and initdb code?
>
> Yup, working on it.
>
> There's a fair number of other fields that could stand the same treatment,
> but for now I'm just going to touch references to relkind.

Yes, I just noticed postgres_fdw as well...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Tom Lane
Robert Haas  writes:
> I think eight is enough.  Committed with some cosmetic changes.

Buildfarm thinks eight wasn't enough.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam&dt=2017-03-10%2002%3A00%3A01

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Changing references of password encryption to hashing

2017-03-09 Thread Michael Paquier
Hi all,

As discussed here:
https://www.postgresql.org/message-id/98cafcd0-5557-0bdf-4837-0f2b7782d...@joeconway.com
In the documentation and code comments we use "encryption" to describe
what is actually hashing, which is confusing.

Attached is a patch for HEAD to change the documentation to match hashing.

There are a couple of things I noticed on the way:
1) There is the user-visible PQencryptPassword in libpq, which
actually hashes the password rather than encrypting it :) (a small
usage sketch follows below)
2) There is also pg_md5_encrypt() in the code, as well as
pg_md5_hash(). Those might be better renamed; at least I would think
that pg_md5_encrypt should be pg_md5_password_hash, because a salt is
used as well, and the routine is dedicated to working on passwords.
Thoughts?
3) createuser also has --encrypt and --unencrypted; perhaps those
should be renamed? Honestly, I don't really think that this is worth
the breakage, and the option names match the SQL commands.
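
To illustrate point 1 with a tiny libpq program (the function is real;
the role name and password here are of course made up):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* Despite its name, this hashes: "md5" || md5(password || role) */
        char   *hashed = PQencryptPassword("secret", "alice");

        printf("%s\n", hashed);    /* prints md5 followed by 32 hex digits */
        PQfreemem(hashed);
        return 0;
    }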

I did not touch those things in the attached, which covers only
documentation and comment changes.

An open item has been added on the wiki.

Thanks,
-- 
Michael


hashed-passwords-doc.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-09 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> (And no, I don't especially
>> approve of RELKIND_SEQUENCE being 'S' either, but it's far too late to
>> change that.)

> FWIW the reason SEQUENCE uses S instead of 's' is that the latter was
> taken for "special" relations, which we removed a few releases ago
> (commit 3a694bb0a1).

Yeah, I'd just been reminded of that by some code in describe.c.
So there actually is a reason for sequences to be 'S', or once was.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: psql show index with type info

2017-03-09 Thread Robert Haas
On Mon, Mar 6, 2017 at 9:30 AM, Stephen Frost  wrote:
> * Amos Bird (amosb...@gmail.com) wrote:
>> Well, the prefix is used to differentiate other \d commands, like
>> this,
>
> Ah, ok, fair enough.
>
> Should we consider differentiating different table types also?  I
> suppose those are primarily just logged and unlogged, but I could see
> that being useful information to know when doing a \dt.  Not a big deal
> either way though, and this patch stands on its own certainly.

Logged vs. unlogged isn't the same thing as identifying the access
method.  Anything with storage can come in logged, unlogged, and
temporary varieties, but an access method is essentially a way of
saying that one object has a different physical layout than another.
Right now there is, indeed, only one heap access method, but some of
my colleagues and I are scheming over how to change that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSOC Introduction / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-09 Thread Thomas Munro
On Fri, Mar 10, 2017 at 2:04 PM, Stephen Frost  wrote:
> George,
>
> * George Papadrosou (gpapadro...@gmail.com) wrote:
>> my name is George Papadrosou, this is my first semester as graduate student 
>> at Georgia Tech and would like to submit a proposal to Google Summer of 
>> Code, for the project "Eliminate O(N^2) scaling from rw-conflict tracking in 
>> serializable transactions”.
>
> Fantastic!  I'll let Kevin comment on your questions regarding that
> project.

+1, welcome.  It would be very cool to see SERIALIZABLE get faster.

>> Until then, I would like to familiarize myself a bit with the codebase and 
>> fix some bug/todo. I didn’t find many [E] marked tasks in the todo list so 
>> the task I was thinking of is "\s without arguments (display history) fails 
>> with libedit, doesn't use pager either - psql \s not working on OSX”. 
>> However, it works on my OSX El Capitan laptop with Postgres 9.4.4. Would you 
>> suggest some other starter task?
>
> One of the best things which you can do to start learning the PostgreSQL
> code base is to review existing patches which have been proposed for
> inclusion.  Now is a great time to be doing that as the feature freeze
> for the next version of PostgreSQL (PG v10) is at the end of the month
> and there's a ton of patches which need reviewing.
>
> The patches which need reviewing can be seen here:
>
> https://commitfest.postgresql.org/13/

+1

This is a large commitfest so there's a pretty wide range of
interesting stuff to help out with, and many ways to help including
documentation, testing and code review.


One small patch that just might interest you, based on your interest
in SERIALIZABLE and also in parallelism, is my fledgling attempt to
connect those two features:

https://commitfest.postgresql.org/13/1004/

Unfortunately SERIALIZABLE support has not yet been included with a
couple of really important recent Postgres features: parallelism and
streaming replication.  I suspect the latter may be quite hard to fix
and the former quite easy.  If you're currently studying the
SERIALIZABLE internal data structures then you might like to review
that patch and tell me if my optimism is entirely misplaced, and
figure out how to test and break it...


That said, please pick anything that interests you!

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-09 Thread Alvaro Herrera
Tom Lane wrote:

> (And no, I don't especially
> approve of RELKIND_SEQUENCE being 'S' either, but it's far too late to
> change that.)

FWIW the reason SEQUENCE uses S instead of 's' is that the latter was
taken for "special" relations, which we removed a few releases ago
(commit 3a694bb0a1).

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib modules and relkind check

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 7:23 PM, Michael Paquier
 wrote:
> Thanks. Shouldn't this fix be back-patched? pg_visibility should fail
> properly for indexes and other relkinds even in 9.6. pgstattuple can
> also trigger failures. It would be confusing for users to face "could
> not open file" kind of errors instead of a proper error message. Note
> that I am fine with producing those patches if there is a resource issue
> for any of you two.

I don't see a real need to back-patch this.  I mean, it probably
wouldn't break anything, but it feels more like we're raising our
standards than fixing bugs per se.  I don't think it benefits us when
users see the release notes for a new minor release series cluttered
with essentially non-critical fixes.  It makes people worry about
whether upgrading is safe.  You might brand those people as silly, but
they (indirectly) pay my mortgage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 4:29 PM, Thomas Munro
 wrote:
> On Fri, Mar 10, 2017 at 8:19 AM, Peter Geoghegan  wrote:
>> by having state for each segment, it ends up actually *relying on*
>> ENOENT-on-unlink() as a condition that terminates execution:
>
> Yeah, this seems to fall out of the requirement to manage a growable
> number of partition files in a fixed space.  I wonder how this could
> go wrong.  One way would be for a crash-restart to happen (which
> leaves all temporary files in place by design, though it could clean
> them up like a normal restart if it wanted to), followed by a very
> long running cluster eventually generating the same (pid, set number)
> pair.  I think I see a simple way to defend against that, which I'll
> write about in the PHJ thread.

I am not expressing any real opinion about the idea of relying on or
suppressing ENOENT-on-unlink() just yet. What's clear is that that's
unorthodox. I seldom have any practical reason to make a distinction
between unorthodox and unacceptable. It's usually easier to just not
do the unorthodox thing. Maybe this is one of the rare exceptions.

> Thanks.  I will respond with code and comments over on the PHJ thread.
> Aside from the broken extendBufFile behaviour you mentioned, I'll look
> into the general modularity complaints I'm hearing about fd.c and
> buffile.c interaction.

buffile.c should stop pretending to care about anything other than
temp files, IMV. 100% of all clients that want temporary files go
through buffile.c. 100% of all clients that want non-temp files (files
which are not marked FD_TEMPORARY) access fd.c directly, rather than
going through buffile.c.

>> But my observations illustrate the difficulty with tying
>> resource manager resources in local memory (temp segments, and
>> associated BufFile(s)) with clean-up at shared memory segment
>> detachment. It's possible, but ticklish enough that you'll end up with
>> some wart or other when you're done.

> Suppressing ENOENT because it's not clear which backend actually owns
> a file is a bit different from using it to detect that there are no
> more segments...  Obviously I screwed some things up in the code I
> posted, but I think the general idea that the DSM segment owns the
> files on disk is a good one.  I figure that it (via the detach hook)
> should be able to delete the files using only data in shmem, without
> reference to any backend-local memory, so that file cleanup is
> entirely disconnected from backend-local resource cleanup (fds and
> memory).

The problem with that is that it kind of implies that we have to
invent a mechanism that 100% supersedes fd.c resource management. We
still want to free backend local vFDs and so on, using fd.c cleanup,
so that file descriptors are not leaked, but it's not clear we can do
that by half measure. Wouldn't even that have to be duplicated to deal
with shared memory state by way of your on_dsm_detach() callback? And,
if it did, it would presumably need to keep track of every single
segment, much like fd.c. But the number of segments isn't predictable
in advance, so you're forced to confront that problem, which you hoped
to avoid. And, where does all of this leave max_files_per_process
enforcement?

> Looking forward to seeing your v9.  We need to figure out what common
> ground we can find here.  Quality issues with the code I posted aside
> (which I'm sorting out), it seems that the underlying reason we're not
> converging yet is my requirement for a variable number of partitions
> managed in a fixed shmem space, resulting in the ENOENT-as-terminator
> behaviour that you don't like.

I agree that that's the snag we're getting caught on here. In even
simpler terms, what you need to do with BufFiles is somewhat more
difficult than what I need to do (for either of us), and now that time
grows short, I feel the need to draw a line under those differences.
At the same time, I would at the very least like to avoid doing you a
disservice by not at least anticipating your needs.

> I'm 100% happy to switch parts of what
> I'm doing to using anything you come up with if it can support a
> dynamic set of N partitions from fixed set of M participants.  I'm
> fairly sure that people won't be happy with two separate ways to
> manage shared temporary files.  I also don't want patch A to be
> derailed by patch B, but even if these things finish up in different
> Postgres releases we'll still have to solve the problem.

That's all fair. And, in case it isn't clear, I think that all of this
BufFile + fd.c + resowner.c + on_dsm_detach() stuff might be the
single biggest yak shaving expedition I've ever found myself on. It
just seems like the kind of thing where the only workable approach is
to try to converge on a good plan iteratively. That isn't anybody's
fault.

So, yes, there is a bit of friction between the requirements for
parallel tuplesort, and for parallel hash join, but I see that as
useful, 

Re: [HACKERS] [COMMITTERS] pgsql: Fix hard-coded relkind constants in pg_dump.c.

2017-03-09 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Mar 10, 2017 at 9:19 AM, Tom Lane  wrote:
>> Existing style is mostly to inject relkind values into constructed
>> query strings using %c.  I did not bother to touch places that did it
>> like that, but really a better technique is to stringify the RELKIND
>> macro, at least in places where you'd want single quotes around the
>> code character.  That avoids any runtime effort and keeps the RELKIND
>> symbol close to where it's used.

> I have been wondering about the lack of readability of those
> hardcoded relkinds in the code for some time but... Wouldn't it be
> better to also change psql's describe.c and tab_complete.c, as well
> as the pg_upgrade and initdb code?

Yup, working on it.

There's a fair number of other fields that could stand the same treatment,
but for now I'm just going to touch references to relkind.
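
To make the stringification trick concrete (the macro names below are
ad hoc; c.h's CppAsString2() serves the same purpose):

    #define RELKIND_STR_(x) #x
    #define RELKIND_STR(x)  RELKIND_STR_(x)  /* expand, then stringify */

    /* Run-time injection, as most existing call sites do: */
    appendPQExpBuffer(query, "WHERE c.relkind = '%c'", RELKIND_RELATION);

    /* Compile-time: RELKIND_RELATION is defined as 'r', so this emits
     * the literal WHERE c.relkind = 'r' with no run-time formatting: */
    appendPQExpBufferStr(query, "WHERE c.relkind = "
                         RELKIND_STR(RELKIND_RELATION));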

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 8:47 PM, Tom Lane  wrote:
> Amit Langote  writes:
>> On 2017/03/10 9:14, Jeff Janes wrote:
>>> I think we can just save $. and use that, as in the attached.
>
>> The patch works for me.
>
> Me too.  Pushed; we'll soon see if that makes the oldest buildfarm
> critters happy.

$. has been around since the stone age (a.k.a. when I was a teenager)
so we should be fine there.  I hope.  That whole input_line_number
thing was making me a bit nervous, but I couldn't find any reference
to when that had been added in a quick Google search, which gave me
hope that it was old enough that we'd be OK.

Sorry for the hassle.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bizarre choice of case for RELKIND_PARTITIONED_TABLE

2017-03-09 Thread Tom Lane
I wrote:
> Robert Haas  writes:
>> For reasons which must've seemed good to whoever instituted the
>> policy, pg_dump refers to relkinds using the bare letters rather than
>> the constants.

> Even in pg_dump, it appears to me that the large majority of relkind
> references use the symbolic names.

After further study, I think it was psql/describe.c you were remembering.
I cleaned that up...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Enabling atomics on ARM64

2017-03-09 Thread Andres Freund
On 2017-03-10 10:29:39 +0900, Michael Paquier wrote:
> On Fri, Mar 10, 2017 at 3:01 AM, Robert Haas  wrote:
> > On Wed, Mar 8, 2017 at 10:18 PM, Roman Shaposhnik  wrote:
> >> I'd like to offer a forward port from a change I'm contributing
> >> the Greenplum code base over here:
> >> https://github.com/greenplum-db/gpdb/pull/1983
> 
> +1.
> 
> > Thanks.   Please add this to https://commitfest.postgresql.org/14/ so
> > it doesn't get forgotten.  I'm afraid your patch is probably too late
> > for v10 (the deadline for patches to be considered for v10 was March
> > 1st, though somebody might propose to make an exception), but v11 will
> > be here soon enough.
> 
> Yeah, one month earlier would have been fine though. This is not
> really invasive.

FWIW, unless somebody seriously protests, I'm going to commit this soon-ish.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-09 Thread Tom Lane
Amit Langote  writes:
> On 2017/03/10 9:14, Jeff Janes wrote:
>> I think we can just save $. and use that, as in the attached.

> The patch works for me.

Me too.  Pushed; we'll soon see if that makes the oldest buildfarm
critters happy.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables

2017-03-09 Thread Amit Langote
On 2017/03/10 10:26, Michael Paquier wrote:
> On Thu, Mar 9, 2017 at 11:15 PM, Stephen Frost  wrote:
>> While reviewing Amit Langote's patch to handle partitioned tables
>> properly in various contrib modules (mostly by throwing an error since
>> things like pageinspect aren't going to work on the empty 'parent'
>> table), I went looking through contrib for other modules that do
>> something with relkind and noticed that postgres_fdw's IMPORT SCHEMA
>> would pull in the child tables (relkind = 'r') but would ignore the
>> parent table (relkind = 'P', or soon to be 'p', I guess).
> 
> It is not as straight-forward as it seems. A foreign table can be
> defined as a child (use of PARTITION OF), but not as a parent (use
> PARTITION BY), and IMPORT SCHEMA has to issue queries to create
> foreign tables. It seems to me that the correct fix here is to ignore
> child tables that are part of a partition, and just include the parent
> in what is imported so that when querying the parent through
> postgres_fdw all the child partitions are considered automatically.
> Thoughts?

I think that makes sense.  The query in postgresImportForeignSchema() that
fetches the information about remote tables should be fixed to include
relkind = 'P' tables (partitioned tables) but exclude relispartition =
true (partitions).  Something like below:

-   "WHERE c.relkind IN ('r', 'v', 'f', 'm') "
+   "WHERE c.relkind IN ('r', 'v', 'f', 'm', 'P') "
+   "  AND NOT c.relispartition "

It means we don't import tables that are partitions of some other
table.  Since we import the parent, we get access to those partitions
through it anyway.

I would like to hear more opinions of course.

> Of course this should be documented.

+1

>> I tend to view this as an issue which should be added to the open items
>> list and resolved before PG10 (though perhaps it could be done after
>> feature freeze), but I could see an argument that it should be just a
>> documented limitation of postgres_fdw and that adding such support would
>> be a new feature.
> 
> Agreed. I think that this is a bug, because any database having one
> partitioning set of tables would fail to import automatically except
> by excluding some tables, and that's annoying for the user.

Agreed too.

>> In any case, this seems like an issue that should be addressed one way
>> or the other, so I'll add it to the open items list.  I'm not planning
>> to work on fixing it myself, but if someone proposes a patch which looks
>> reasonable, I'll try to find time for it.
> 
> Well, I worked on IMPORT SCHEMA, so I'm fine to do something. After
> the CF is done though.

Thanks.

Regards,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Enabling atomics on ARM64

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 3:01 AM, Robert Haas  wrote:
> On Wed, Mar 8, 2017 at 10:18 PM, Roman Shaposhnik  wrote:
>> I'd like to offer a forward port from a change I'm contributing
>> the Greenplum code base over here:
>> https://github.com/greenplum-db/gpdb/pull/1983

+1.

> Thanks.   Please add this to https://commitfest.postgresql.org/14/ so
> it doesn't get forgotten.  I'm afraid your patch is probably too late
> for v10 (the deadline for patches to be considered for v10 was March
> 1st, though somebody might propose to make an exception), but v11 will
> be here soon enough.

Yeah, one month earlier would have been fine though. This is not
really invasive.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables

2017-03-09 Thread Michael Paquier
On Thu, Mar 9, 2017 at 11:15 PM, Stephen Frost  wrote:
> While reviewing Amit Langote's patch to handle partitioned tables
> properly in various contrib modules (mostly by throwing an error since
> things like pageinspect aren't going to work on the empty 'parent'
> table), I went looking through contrib for other modules that do
> something with relkind and noticed that postgres_fdw's IMPORT SCHEMA
> would pull in the child tables (relkind = 'r') but would ignore the
> parent table (relkind = 'P', or soon to be 'p', I guess).

It is not as straight-forward as it seems. A foreign table can be
defined as a child (use of PARTITION OF), but not as a parent (use
PARTITION BY), and IMPORT SCHEMA has to issue queries to create
foreign tables. It seems to me that the correct fix here is to ignore
child tables that are part of a partition, and just include the parent
in what is imported so that when querying the parent through
postgres_fdw all the child partitions are considered automatically.
Thoughts?

Of course this should be documented.

> I tend to view this as an issue which should be added to the open items
> list and resolved before PG10 (though perhaps it could be done after
> feature freeze), but I could see an argument that it should be just a
> documented limitation of postgres_fdw and that adding such support would
> be a new feature.

Agreed. I think that this is a bug, because any database having one
partitioning set of tables would fail to import automatically except
by excluding some tables, and that's annoying for the user.

> In any case, this seems like an issue that should be addressed one way
> or the other, so I'll add it to the open items list.  I'm not planning
> to work on fixing it myself, but if someone proposes a patch which looks
> reasonable, I'll try to find time for it.

Well, I worked on IMPORT SCHEMA, so I'm fine to do something. After
the CF is done though.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Documentation improvements for partitioning

2017-03-09 Thread Amit Langote
On 2017/03/10 3:26, Robert Haas wrote:
> I think you might have the titles for 0002 and 0003 backwards.

Oops, you're right.

> On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote wrote:
>> 0002: some cosmetic fixes to create_table.sgml
> 
> I think this sentence may be unclear to some readers:
> 
> + One might however want to set it for only some partitions,
> +  which is possible by doing SET NOT NULL on individual
> +  partitions.
> 
> I think you could replace this with something like: Even if there is
> no NOT NULL constraint on the parent, such a constraint
> can still be added to individual partitions, if desired; that is, the
> children can disallow nulls even if the parent allows them, but not
> the other way around.

Reads much better, done that way.  Thanks.

>> 0003: add clarification about NOT NULL constraint on partition columns in
>>   alter_table.sgml
> 
> This is about list-ifying a note, but I think we should try to
> de-note-ify it.  It's a giant block of text that is not obviously more
> noteworthy than the surrounding text; I think <note> should be saved
> for things that particularly need to be called out.

OK.  The patch is now just about de-note-ifying the block of text.  Since
I don't see any other lists in the Parameters portion of the page, I also
take back my list-ifying proposal.

Attached updated patches.

Thanks,
Amit
From a159c9aa3ee7f2c51084f94243be16a30242d7a6 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 3 Mar 2017 16:39:24 +0900
Subject: [PATCH 1/3] Rewrite sections in ddl.sgml related to partitioning

Merge sections Partitioned Tables and Partitioning into one section
called Table Partitioning and Related Solutions.
---
 doc/src/sgml/ddl.sgml | 1359 +
 1 file changed, 707 insertions(+), 652 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 09b5b3ff70..a2dd39df54 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2772,14 +2772,181 @@ VALUES ('Albany', NULL, NULL, 'NY');

   
 
-  
-   Partitioned Tables
+  
+   Table Partitioning and Related Solutions
+
+   
+partitioning
+   
+
+   
+table
+partitioning
+   
 

 partitioned table

 

+PostgreSQL supports basic table
+partitioning. This section describes why and how to implement
+partitioning as part of your database design.
+   
+
+   
+ Overview
+
+
+ Partitioning refers to splitting what is logically one large table into
+ smaller physical pieces.  Partitioning can provide several benefits:
+
+ 
+  
+   Query performance can be improved dramatically in certain situations,
+   particularly when most of the heavily accessed rows of the table are in a
+   single partition or a small number of partitions.  The partitioning
+   substitutes for leading columns of indexes, reducing index size and
+   making it more likely that the heavily-used parts of the indexes
+   fit in memory.
+  
+ 
+
+ 
+  
+   When queries or updates access a large percentage of a single
+   partition, performance can be improved by taking advantage
+   of sequential scan of that partition instead of using an
+   index and random access reads scattered across the whole table.
+  
+ 
+
+ 
+  
+   Bulk loads and deletes can be accomplished by adding or removing
+   partitions, if that requirement is planned into the partitioning design.
+   Doing ALTER TABLE DETACH PARTITION followed by
+   DROP TABLE is far faster than a bulk operation.  These
+   commands also entirely avoid the VACUUM overhead
+   caused by a bulk DELETE.
+  
+ 
+
+ 
+  
+   Seldom-used data can be migrated to cheaper and slower storage media.
+  
+ 
+
+
+ The benefits will normally be worthwhile only when a table would
+ otherwise be very large. The exact point at which a table will
+ benefit from partitioning depends on the application, although a
+ rule of thumb is that the size of the table should exceed the physical
+ memory of the database server.
+
+
+
+ The following forms of partitioning can be implemented in
+ PostgreSQL:
+
+ 
+  
+   Range Partitioning
+
+   
+
+ The table is partitioned into ranges defined
+ by a key column or set of columns, with no overlap between
+ the ranges of values assigned to different partitions.  For
+ example one might partition by date ranges, or by ranges of
+ identifiers for particular business objects.
+
+   
+  
+
+  
+   List Partitioning
+
+   
+
+ The table is partitioned by explicitly listing which key values
+ appear in each partition.
+
+   
+  
+ 
+
+
+
+ The following partitioning methods are currently supported:
+
+ 
+  
+ 

Re: [HACKERS] GSOC Introduction / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-09 Thread Stephen Frost
George,

* George Papadrosou (gpapadro...@gmail.com) wrote:
> my name is George Papadrosou, this is my first semester as graduate student 
> at Georgia Tech and would like to submit a proposal to Google Summer of Code, 
> for the project "Eliminate O(N^2) scaling from rw-conflict tracking in 
> serializable transactions”.

Fantastic!  I'll let Kevin comment on your questions regarding that
project.

> Until then, I would like to familiarize myself a bit with the codebase and 
> fix some bug/todo. I didn’t find many [E] marked tasks in the todo list so 
> the task I was thinking is "\s without arguments (display history) fails with 
> libedit, doesn't use pager either - psql \s not working on OSX”. However, it 
> works on my OSX El Capitan laptop with Postgres 9.4.4. Would you suggest some 
> other starter task?

One of the best things which you can do to start learning the PostgreSQL
code base is to review existing patches which have been proposed for
inclusion.  Now is a great time to be doing that as the feature freeze
for the next version of PostgreSQL (PG v10) is at the end of the month
and there's a ton of patches which need reviewing.

The patches which need reviewing can be seen here:

https://commitfest.postgresql.org/13/

There's lots of information on our wiki and other places about
developing PostgreSQL, you can start here:

https://wiki.postgresql.org/wiki/Development_information

and in particular:

https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F

We look forward to working with you!  Welcome!

Thanks!

Stephen




Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-09 Thread Amit Langote
On 2017/03/10 9:14, Jeff Janes wrote:
> On Thu, Mar 9, 2017 at 3:20 PM, Robert Haas wrote:
> 
> On Mon, Mar 6, 2017 at 11:37 AM, Dagfinn Ilmari Mannsåker wrote:
> > David Christensen writes:
> >>> Hi David,
> >>>
> >>> Here's a review of your patch.
> >>
> >> Hi Ilmari, thanks for your time and review.  I’m fine with the revised
> >> version.
> >
> > Okay, I've marked the patch as Ready For Committer.
> 
> Committed.   Hopefully this doesn't contain any Perl bits that are
> sufficiently new as to cause problems for our older BF members ... I
> guess we'll see.
> 
> 
> 
> Bad luck there.  I'm getting this error on CentOS6.8, perl v5.10.1
>
> Can't locate object method "input_line_number" via package "IO::Handle" at
> ../../../src/backend/catalog/Catalog.pm line 82, <INPUT_FILE> line 148.
> make[3]: *** [fmgrtab.c] Error 25
> make[2]: *** [utils/fmgroids.h] Error 2
> make[2]: *** Waiting for unfinished jobs
> Can't locate object method "input_line_number" via package "IO::Handle" at
> ../../../src/backend/catalog/Catalog.pm line 82, <INPUT_FILE> line 148.
> make[3]: *** [postgres.bki] Error 25
> make[2]: *** [submake-schemapg] Error 2
> make[1]: *** [all-backend-recurse] Error 2
> make: *** [all-src-recurse] Error 2

Same here.  Some buildfarm animals failed too.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2017-03-10%2000%3A55%3A42

> I think we can just save $. and use that, as in the attached.

The patch works for me.

Thanks,
Amit






Re: [HACKERS] Re: [COMMITTERS] pgsql: Throw an error if a DATA() line contains wrong # of attributes.

2017-03-09 Thread Tom Lane
Michael Paquier writes:
> On Fri, Mar 10, 2017 at 8:20 AM, Robert Haas wrote:
>> Throw an error if a DATA() line contains wrong # of attributes.

> dromedary is unhappy because of this commit, and it is using perl 5.10.0:

My RHEL6 workstation (5.10.1) is also completely broken by this.  Kindly
revert so I can get some work done.

regards, tom lane




Re: [HACKERS] Should buffer of initialization fork have a BM_PERMANENT flag

2017-03-09 Thread Michael Paquier
On Thu, Mar 9, 2017 at 10:25 PM, Artur Zakirov wrote:
> I think these are good fixes. I've checked them, and in my opinion they
> are correct.
>
> The code also looks good.

Having something with conflicts is not nice, so attached is a rebased version.

> I have run regression and TAP tests. They all passed without error.
>
> I think the patch can be marked as "Ready for Committer" after rebase.

Thanks for the review.
-- 
Michael


unlogged-flush-fix-head-v2.patch
Description: Binary data



Re: [HACKERS] contrib modules and relkind check

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 9:34 AM, Stephen Frost wrote:
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> Thanks. Shouldn't this fix be back-patched? pg_visibility should fail
>> properly for indexes and other relkinds even in 9.6. pgstattuple can
>> also trigger failures. It would be confusing for users to face "could
>> not open file" kind of errors instead of a proper error message. Note
>> that I am fine to produce those patches if there is a resource issue
>> for any of you two.
>
> I'm not really convinced that back-patching this is worthwhile, which is
> why I didn't go through the effort to do so.  A reasonable error will be
> thrown in either case, after all, without any bad behavior happening,
> from what I can see.

Okay, I don't mind as nobody has complained about pgstattuple in particular.

> That said, if you feel strongly enough about it to propose appropriate
> patches for the back-branches, I'll try and look at them before the next
> set of point releases, but I'm not going to deal with them this month as
> I'd like to get through as much of the CF for PG10 as we can.

Nah, let's see if somebody else complains.
-- 
Michael




[HACKERS] Re: [COMMITTERS] pgsql: Throw an error if a DATA() line contains wrong # of attributes.

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 8:20 AM, Robert Haas wrote:
> Throw an error if a DATA() line contains wrong # of attributes.
>
> David Christensen, reviewed by Dagfinn Ilmari Mannsåker
>
> Discussion: 
> http://postgr.es/m/20170215154018.fs5vwtqhp5d2s...@veeddeux.attlocal.net
>
> Branch
> --
> master
>
> Details
> ---
> http://git.postgresql.org/pg/commitdiff/7666e73a2e9e0e91a9f3b246a4299b44daadcbf8

dromedary is unhappy because of this commit, and it is using perl 5.10.0:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2017-03-10%2000%3A03%3A34

Here are the details:
Can't locate object method "input_line_number" via package
"IO::Handle" at ../../../src/backend/catalog/Catalog.pm line 82,
<INPUT_FILE> line 148.
Can't locate object method "input_line_number" via package
"IO::Handle" at ../../../src/backend/catalog/Catalog.pm line 82,
<INPUT_FILE> line 148.
-- 
Michael




[HACKERS] GSOC Introduction / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-09 Thread George Papadrosou
Hello psql hackers,

my name is George Papadrosou, this is my first semester as graduate student at 
Georgia Tech and would like to submit a proposal to Google Summer of Code, for 
the project "Eliminate O(N^2) scaling from rw-conflict tracking in serializable 
transactions”.

A short bio, I have a CS undergraduate degree from Athens University of 
Economics and Business. I had taken two databases courses where the first one 
was about sql, relational algebra, xpath and generally using an RDBMS while the 
second one was more about the internals, like  storage, indexing, query 
planning and transactions.

I have 3+ years professional experience in web technologies with a focus on the 
backend and I recently started my master with specialization in computing 
systems. One of my first courses that I am finishing this May is High 
Performance Computing(parallel algorithms), which seems to be closely related 
to this GSOC project.

I have not done any research on databases yet but I regard this project as an 
opportunity to make an initial contact with postgres' internals until I dive 
more into database algorithms. My future goal is to work on databases full
time.

I am going to prepare a draft proposal for this project and share it with you 
soon. The project’s description is pretty clear, do you think it should be more 
strictly defined in the proposal?

Until then, I would like to familiarize myself a bit with the codebase and fix 
some bug/todo. I didn’t find many [E] marked tasks in the todo list so the task 
I was thinking is "\s without arguments (display history) fails with libedit, 
doesn't use pager either - psql \s not working on OSX”. However, it works on my 
OSX El Capitan laptop with Postgres 9.4.4. Would you suggest some other starter 
task?

Looking forward to hearing from you soon.
Best regards,
George




Re: [HACKERS] [COMMITTERS] pgsql: Fix hard-coded relkind constants in pg_dump.c.

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 9:19 AM, Tom Lane wrote:
> Fix hard-coded relkind constants in pg_dump.c.
>
> Although it's reasonable to expect that most of these constants will
> never change, that does not make it good programming style to hard-code
> the value rather than using the RELKIND_FOO macros.  There were only
> a few such violations, and all relatively new AFAICT.
>
> Existing style is mostly to inject relkind values into constructed
> query strings using %c.  I did not bother to touch places that did it
> like that, but really a better technique is to stringify the RELKIND
> macro, at least in places where you'd want single quotes around the
> code character.  That avoids any runtime effort and keeps the RELKIND
> symbol close to where it's used.

I have been wondering about the lack of readability of those
hardcoded relkinds in the code for some time but... Wouldn't it be
better to also change psql's describe.c and tab_complete.c, as well
as the pg_upgrade and initdb code?
-- 
Michael




Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Andres Freund
On 2017-03-09 16:29:24 -0800, Peter Geoghegan wrote:
> I am generally in favor of more inclusive Reviewed-By lists. I suggest
> that we expand it to:
> 
> "Reviewed-By: Andres Freund, Thomas Vondra, Thomas Munro, Anastasia
> Lubennikova, Robert Haas, Amit Langote"

And pushed with these.

I don't really expect buildfarm fallout, but ..

- Andres




Re: [HACKERS] contrib modules and relkind check

2017-03-09 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> Thanks. Shouldn't this fix be back-patched? pg_visibility should fail
> properly for indexes and other relkinds even in 9.6. pgstattuple can
> also trigger failures. It would be confusing for users to face "could
> not open file" kind of errors instead of a proper error message. Note
> that I am fine to produce those patches if there is a resource issue
> for any of you two.

I'm not really convinced that back-patching this is worthwhile, which is
why I didn't go through the effort to do so.  A reasonable error will be
thrown in either case, after all, without any bad behavior happening,
from what I can see.

That said, if you feel strongly enough about it to propose appropriate
patches for the back-branches, I'll try and look at them before the next
set of point releases, but I'm not going to deal with them this month as
I'd like to get through as much of the CF for PG10 as we can.

> +-- an actual index of a partitiond table should work though
> Typo here => s/partitiond/partitioned/

Will fix.

Thanks!

Stephen




Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Thomas Munro
On Fri, Mar 10, 2017 at 8:19 AM, Peter Geoghegan wrote:
> by having state for each segment, it ends up actually *relying on*
> ENOENT-on-unlink() as a condition that terminates execution:

Yeah, this seems to fall out of the requirement to manage a growable
number of partition files in a fixed space.  I wonder how this could
go wrong.  One way would be for a crash-restart to happen (which
leaves all temporary files in place by design, though it could clean
them up like a normal restart if it wanted to), followed by a very
long running cluster eventually generating the same (pid, set number)
pair.  I think I see a simple way to defend against that, which I'll
write about in the PHJ thread.

On Fri, Mar 10, 2017 at 10:01 AM, Peter Geoghegan wrote:
> IOW, I cannot see why 0007-hj-shared-buf-file-v6.patch leaves parallel
> hash join with BufFiles that are even capable of being expanded past
> 1GiB (1 segment), which is surely not acceptable -- it's a bug. Have I
> just missed something?

Bleugh, yeah that's completely busted.  Fixing.

> segments). I suggest that the MAX_PHYSICAL_FILESIZE stress-test I
> sometimes perform [1] be incorporated in Thomas' testing.

Right, will do.

> (Looks more carefully...)
>
> I now see that PathNameCreateFile() is being called, a new function
> which is largely just a new interface to the existing
> OpenTemporaryFileInTablespace() temp files. So, if you look carefully,
> you notice that you do in fact end up with FD_TEMPORARY fd.c segments
> here...so maybe temp_file_limit isn't broken, because, it turns out,
> 0007-hj-shared-buf-file-v6.patch is still *halfway* buying into the
> conventional idea of file temp-ness. But buffile.c is left behind
> ("file->isTemp == false"), so AFAICT it must still be broken simply
> because new segments cannot be written, per BufFileCreate() comments
> quoted above.
>
> This has become more of a PHJ review, which is off-topic for this
> thread.

Thanks.  I will respond with code and comments over on the PHJ thread.
Aside from the broken extendBufFile behaviour you mentioned, I'll look
into the general modularity complaints I'm hearing about fd.c and
buffile.c interaction.

> But my observations illustrate the difficulty with tying
> resource manager resources in local memory (temp segments, and
> associated BufFile(s)) with clean-up at shared memory segment
> detachment. It's possible, but ticklish enough that you'll end up with
> some wart or other when you're done. The question, then, is what wart
> is the most acceptable for parallel tuplesort? I see two plausible
> paths forward right now:
>
> 1. Selectively suppress spurious ENOENT-on-unlink() LOG message in the
> brief window where it might be spurious. This is the easiest option.
> (There is less code this way.)
>
> 2. Do more or less what I've already drafted as my V9, which is
> described in opening mail to this thread, while accepting the ugliness
> that that approach brings with it.

Suppressing ENOENT because it's not clear which backend actually owns
a file is a bit different from using it to detect that there are no
more segments...  Obviously I screwed some things up in the code I
posted, but I think the general idea that the DSM segment owns the
files on disk is a good one.  I figure that it (via the detach hook)
should be able to delete the files using only data in shmem, without
reference to any backend-local memory, so that file cleanup is
entirely disconnected from backend-local resource cleanup (fds and
memory).

Looking forward to seeing your v9.  We need to figure out what common
ground we can find here.  Quality issues with the code I posted aside
(which I'm sorting out), it seems that the underlying reason we're not
converging yet is my requirement for a variable number of partitions
managed in a fixed shmem space, resulting in the ENOENT-as-terminator
behaviour that you don't like.  I'm 100% happy to switch parts of what
I'm doing to using anything you come up with if it can support a
dynamic set of N partitions from fixed set of M participants.  I'm
fairly sure that people won't be happy with two separate ways to
manage shared temporary files.  I also don't want patch A to be
derailed by patch B, but even if these things finish up in different
Postgres releases we'll still have to solve the problem.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 3:52 PM, Andres Freund wrote:
>
> But I also think it's more important to get some initial verification
> functionality in, than resolving this conflict. I do, also quite
> strongly, think we'll be better served with having something like what
> you're proposing than nothing, and I don't have time/energy to turn your
> patch into what I'm envisioning, nor necessarily the buyin.  Hence I'm
> planning to commit the current interface.
>
> Attached is your original patch, and a patch editorializing things. I do
> plan to merge those before pushing.

Your revisions all seem fine. No problems with any of it.

> Who would you like to see added to Reviewed-By?

I am generally in favor of more inclusive Reviewed-By lists. I suggest
that we expand it to:

"Reviewed-By: Andres Freund, Thomas Vondra, Thomas Munro, Anastasia
Lubennikova, Robert Haas, Amit Langote"

Thanks for your help with this!

-- 
Peter Geoghegan




Re: [HACKERS] contrib modules and relkind check

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 8:59 AM, Amit Langote wrote:
> Hi Stephen,
>
> On 2017/03/10 6:48, Stephen Frost wrote:
>> Amit, Michael,
>>
>> * Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
>>> On 2017/03/09 11:51, Michael Paquier wrote:
 OK, I am marking that as ready for committer.
>>>
>>> Thanks.
>>
>> Thanks for this, I've pushed this now.  I do have a few notes about
>> changes that I made from your patch;
>>
>> - Generally speaking, the user-facing functions come first in these .c
>>   files, with a prototype at the top for the static functions defined
>>   later on in the file.  I went ahead and did that for the functions you
>>   added too.
>>
>> - I added more comments to the regression tests, in particular, we
>>   usually comment when tests are expected to fail.
>>
>> - I added some additional regression tests to cover more cases,
>>   particularly ones for things that weren't being tested at all.
>>
>> - Not the fault of your patch, but there were cases where elog() was
>>   being used when it really should have been ereport(), so I changed
>>   those cases to all be, hopefully, consistent throughout.
>
> Thanks a lot for all the improvements and committing.

Thanks. Shouldn't this fix be back-patched? pg_visibility should fail
properly for indexes and other relkinds even in 9.6. pgstattuple can
also trigger failures. It would be confusing for users to face "could
not open file" kind of errors instead of a proper error message. Note
that I am fine to produce those patches if there is a resource issue
for any of you two.
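
To illustrate the confusing failure mode on 9.6 as it stands (an
untested sketch; the exact error text will vary):

    CREATE TABLE t1 (a int);
    CREATE INDEX t1_a_idx ON t1 (a);
    SELECT * FROM pg_visibility('t1_a_idx');
    -- 9.6 fails with a low-level "could not open file"-style error
    -- here, instead of a clean message about the unsupported relkind.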

+-- an actual index of a partitiond table should work though
Typo here => s/partitiond/partitioned/
-- 
Michael




Re: [HACKERS] Partitioning vs ON CONFLICT

2017-03-09 Thread Amit Langote
On 2017/03/10 9:10, Amit Langote wrote:
> On 2017/03/09 23:25, Robert Haas wrote:
>> On Fri, Feb 17, 2017 at 1:47 AM, Amit Langote wrote:
>>> I updated the patch.  Now it's reduced to simply removing the check in
>>> transformInsertStmt() that prevented using *any* ON CONFLICT on
>>> partitioned tables at all.
>>
>> This patch no longer applies.
> 
> Rebased patch is attached.

Oops, really attached this time,

Thanks,
Amit
From a6377bf0a060676ae461314f62f8da4aac8896f8 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 17 Feb 2017 14:18:01 +0900
Subject: [PATCH] ON CONFLICT DO NOTHING should work with partitioned tables

Currently, a check in transformInsertStmt() prevents *any*
ON CONFLICT clause from being specified on a partitioned table,
even those specifying DO NOTHING as the alternative action.  It
is harmless to allow those, so remove that check.  It would still
not be possible to use DO UPDATE with partitioned tables though,
because infer_arbiter_indexes() will eventually error out upon
failing to find a unique/exclusion constraint.  Remember it is
not at the moment possible to create these constraints on
partitioned tables.

Adds a test and updates the note in the documentation about using ON CONFLICT
with partitioned tables.
---
 doc/src/sgml/ddl.sgml |  8 ++--
 src/backend/parser/analyze.c  |  8 
 src/test/regress/expected/insert_conflict.out | 10 ++
 src/test/regress/sql/insert_conflict.sql  | 10 ++
 4 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 09b5b3ff70..de57930520 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3854,8 +3854,12 @@ ANALYZE measurement;
 
 
  
-  INSERT statements with ON CONFLICT
-  clause are currently not allowed on partitioned tables.
+  Using the ON CONFLICT clause with partitioned tables
+  will cause an error if DO UPDATE is specified as the
+  alternative action, because it requires specifying a unique or exclusion
+  constraint to determine if there is a conflict.  Currently, it is not
+  possible to create indexes on partitioned tables required to implement
+  such constraints.
  
 
 
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 3571e50aea..25699fbc4a 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -842,16 +842,8 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
 
 	/* Process ON CONFLICT, if any. */
 	if (stmt->onConflictClause)
-	{
-		/* Bail out if target relation is partitioned table */
-		if (pstate->p_target_rangetblentry->relkind == RELKIND_PARTITIONED_TABLE)
-			ereport(ERROR,
-	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("ON CONFLICT clause is not supported with partitioned tables")));
-
 		qry->onConflict = transformOnConflictClause(pstate,
 	stmt->onConflictClause);
-	}
 
 	/*
 	 * If we have a RETURNING clause, we need to add the target relation to
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
index 8d005fddd4..c90d381b34 100644
--- a/src/test/regress/expected/insert_conflict.out
+++ b/src/test/regress/expected/insert_conflict.out
@@ -786,3 +786,13 @@ select * from selfconflict;
 (3 rows)
 
 drop table selfconflict;
+-- check that the following works:
+-- insert into partitioned_table on conflict do nothing
+create table parted_conflict_test (a int, b char) partition by list (a);
+create table parted_conflict_test_1 partition of parted_conflict_test for values in (1);
+insert into parted_conflict_test values (1, 'a') on conflict do nothing;
+insert into parted_conflict_test values (1, 'a') on conflict do nothing;
+-- however, on conflict do update not supported yet
+insert into parted_conflict_test values (1) on conflict (a) do update set b = excluded.b where excluded.a = 1;
+ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT specification
+drop table parted_conflict_test, parted_conflict_test_1;
diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql
index df3a9b59b5..78bffc783d 100644
--- a/src/test/regress/sql/insert_conflict.sql
+++ b/src/test/regress/sql/insert_conflict.sql
@@ -471,3 +471,13 @@ commit;
 select * from selfconflict;
 
 drop table selfconflict;
+
+-- check that the following works:
+-- insert into partitioned_table on conflict do nothing
+create table parted_conflict_test (a int, b char) partition by list (a);
+create table parted_conflict_test_1 partition of parted_conflict_test for values in (1);
+insert into parted_conflict_test values (1, 'a') on conflict do nothing;
+insert into parted_conflict_test values (1, 'a') on conflict do nothing;
+-- however, on conflict do update not supported yet
+insert into parted_conflict_test values (1) on conflict (a) do update set b = excluded.b where 

Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-09 Thread Jeff Janes
On Thu, Mar 9, 2017 at 3:20 PM, Robert Haas wrote:

> On Mon, Mar 6, 2017 at 11:37 AM, Dagfinn Ilmari Mannsåker wrote:
> > David Christensen writes:
> >>> Hi David,
> >>>
> >>> Here's a review of your patch.
> >>
> >> Hi Ilmari, thanks for your time and review.  I’m fine with the revised
> version.
> >
> > Okay, I've marked the patch as Ready For Committer.
>
> Committed.   Hopefully this doesn't contain any Perl bits that are
> sufficiently new as to cause problems for our older BF members ... I
> guess we'll see.
>


Bad luck there.  I'm getting this error on CentOS6.8, perl v5.10.1

Can't locate object method "input_line_number" via package "IO::Handle" at
../../../src/backend/catalog/Catalog.pm line 82, <INPUT_FILE> line 148.
make[3]: *** [fmgrtab.c] Error 25
make[2]: *** [utils/fmgroids.h] Error 2
make[2]: *** Waiting for unfinished jobs
Can't locate object method "input_line_number" via package "IO::Handle" at
../../../src/backend/catalog/Catalog.pm line 82, <INPUT_FILE> line 148.
make[3]: *** [postgres.bki] Error 25
make[2]: *** [submake-schemapg] Error 2
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2


I think we can just save $. and use that, as in the attached.


I sabotaged a random line in src/include/catalog/pg_amop.h and it seems
to report the error correctly.

Cheers,

Jeff
diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
new file mode 100644
index 767a2ec..cb5fcc8
*** a/src/backend/catalog/Catalog.pm
--- b/src/backend/catalog/Catalog.pm
*** sub Catalogs
*** 65,70 ****
--- 65,71 ----
  			$_ .= $next_line;
  			redo;
  		}
+ 		my $input_line_number = $.;
  
  		# Strip useless whitespace and trailing semicolons.
  		chomp;
*** sub Catalogs
*** 80,86 ****
  		elsif (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
  		{
  			check_natts($filename, $catalog{natts}, $3,
! 						$input_file, INPUT_FILE->input_line_number);
  
  			push @{ $catalog{data} }, { oid => $2, bki_values => $3 };
  		}
--- 81,87 ----
  		elsif (/^DATA\(insert(\s+OID\s+=\s+(\d+))?\s+\(\s*(.*)\s*\)\s*\)$/)
  		{
  			check_natts($filename, $catalog{natts}, $3,
! 						$input_file, $input_line_number);
  
  			push @{ $catalog{data} }, { oid => $2, bki_values => $3 };
  		}



Re: [HACKERS] Partitioning vs ON CONFLICT

2017-03-09 Thread Amit Langote
On 2017/03/09 23:25, Robert Haas wrote:
> On Fri, Feb 17, 2017 at 1:47 AM, Amit Langote wrote:
>> I updated the patch.  Now it's reduced to simply removing the check in
>> transformInsertStmt() that prevented using *any* ON CONFLICT on
>> partitioned tables at all.
> 
> This patch no longer applies.

Rebased patch is attached.

Thanks,
Amit






Re: [HACKERS] contrib modules and relkind check

2017-03-09 Thread Amit Langote
Hi Stephen,

On 2017/03/10 6:48, Stephen Frost wrote:
> Amit, Michael,
> 
> * Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
>> On 2017/03/09 11:51, Michael Paquier wrote:
>>> OK, I am marking that as ready for committer.
>>
>> Thanks.
> 
> Thanks for this, I've pushed this now.  I do have a few notes about
> changes that I made from your patch;
> 
> - Generally speaking, the user-facing functions come first in these .c
>   files, with a prototype at the top for the static functions defined
>   later on in the file.  I went ahead and did that for the functions you
>   added too.
> 
> - I added more comments to the regression tests, in particular, we
>   usually comment when tests are expected to fail.
> 
> - I added some additional regression tests to cover more cases,
>   particularly ones for things that weren't being tested at all.
> 
> - Not the fault of your patch, but there were cases where elog() was
>   being used when it really should have been ereport(), so I changed
>   those cases to all be, hopefully, consistent throughout.

Thanks a lot for all the improvements and committing.

Thanks,
Amit






Re: [HACKERS] bytea_output vs make installcheck

2017-03-09 Thread David G. Johnston
On Thu, Mar 9, 2017 at 4:45 PM, Neha Khatri wrote:

> On Fri, Mar 10, 2017 at 6:14 AM, Peter Eisentraut wrote:
>
>> On 2/14/17 16:50, Jeff Janes wrote:
>> > make installcheck currently fails against a server running
>> > with bytea_output = escape.
>> >
>> > Making it succeed is fairly easy, and I think it is worth doing.
>> >
>> > Attached are two options for doing that.  One overrides bytea_output
>> > locally where-ever needed, and the other overrides it for the entire
>> > 'regression' database.
>>
>> I would use option 2 here (ALTER DATABASE) and be done with it.  Some
>> people didn't like using ALTER DATABASE, but it's consistent with
>> existing use.  If someone wants to change that, that can be independent
>> of this issue.
>
>
> Sorry about the naive question, but if someone has set the GUC
> bytea_output = 'escape', then the intention seems to be to obtain the
> output in 'escape' format for bytea.
> With this, if an installcheck is done, that might also have been done with
> the expectation that the output will be in 'escape' format. In that case,
> how much is it justified to hard-code the format for the regression
> database?  However, I agree that there are not many bytea outputs in the
> current regression suite.
>
>
At a high level (which is all I know here): if we leave behind tests
that at least exercise bytea_output = 'escape' mode to ensure it is
functioning properly, then I'd have no problem having the tests of
other features that depend on bytea_output, but are coded to compare
against the now-default output format, set that runtime-configurable
mode to whatever they require.  If the choice of output mode is simply
a byproduct, we should be free to set it to whatever we need for the
currently executing test.

If a simple way of doing this involves fixing the default to what the
suite expects, and one-off changing it when testing escape-mode stuff,
that seems like a reasonable position to take.  Having to set
bytea_output when it isn't the item under test seems like it's just
going to add noise.
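
Concretely, that could look something like this (assuming the database
name used by the suite):

    ALTER DATABASE regression SET bytea_output = 'hex';

with any test that is actually about escape mode doing its own

    SET bytea_output = 'escape';
    -- ... exercise escape-format output here ...
    RESET bytea_output;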

David J.


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Andres Freund
On 2017-03-06 20:40:59 -0800, Peter Geoghegan wrote:
> On Mon, Mar 6, 2017 at 3:57 PM, Andres Freund wrote:
> > I'm ok with not immediately doing so, but I think Peter's design isn't
> > in line with achieving something like this.
>
> I would be okay with doing this if we had a grab-bag of expensive
> checks, that all pretty much require some combination of
> ExclusiveLocks and/or ShareLocks anyway, and are amenable to being
> combined. I could see that happening, but we're a long way off from
> that. When it does happen, we might have other things that are a bit
> like bt_index_parent_check(), in terms of locking requirements and
> degree of thoroughness, that might themselves require other parameters
> specific to the verification that they perform that we cannot
> anticipate. For example, maybe bt_index_parent_check() could have a
> parameter that is a probability that any given IndexTuple will be
> verified against its heap tuple. Then you could have something like a
> PRNG seed argument, so you can check different tuples every day. There
> are many possibilities, and I hope that eventually amcheck has this
> kind of very rich functionality.

> When you use the interface you describe to "stack" several checks at
> once, less important parameters, like the ones I suggest are going to
> be an awkward fit, and so will probably be excluded. I'm not opposed
> to having what amounts to a second interface to what bt_index_check()
> does, to make this kind of thing work where it makes sense. Really,
> bt_index_parent_check() is almost an alternative interface to the same
> functionality today. There can be another one in the future, if it
> serves a purpose, and the locking requirements are roughly the same
> for all the checks. I'd be fine with that. Let's just get the basic
> feature in for now, though.

I disagree quite strongly with pretty much all of that.


But I also think it's more important to get some initial verification
functionality in, than resolving this conflict. I do, also quite
strongly, think we'll be better served with having something like what
you're proposing than nothing, and I don't have time/energy to turn your
patch into what I'm envisioning, nor necessarily the buyin.  Hence I'm
planning to commit the current interface.

Attached is your original patch, and a patch editorializing things. I do
plan to merge those before pushing.

Who would you like to see added to Reviewed-By?

Regards,

Andres
From 60c670d251dd4244a09215e4f6c22e444d2bff7c Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 9 Mar 2017 15:50:40 -0800
Subject: [PATCH 1/2] Add amcheck extension to contrib.

This is the beginning of a collection of SQL-callable functions to
verify the integrity of data files.  For now it only contains code to
verify B-Tree indexes.

This adds two SQL-callable functions, validating B-Tree consistency to
a varying degree.  Check the, extensive, docs for details.

The goal is to later extend the coverage of the module to further
access methods, possibly including the heap.  Once checks for
additional access methods exist, we'll likely add some "dispatch"
functions that cover multiple access methods.

Author: Peter Geoghegan, editorialized by Andres Freund
Reviewed-By: Andres Freund, Tomas Vondra
Discussion: cam3swzqzlmhmwmbqjzk+prkxrnuz4w90wymuwfkev8mz3de...@mail.gmail.com
---
 contrib/Makefile |1 +
 contrib/amcheck/Makefile |   19 +
 contrib/amcheck/amcheck--1.0.sql |   24 +
 contrib/amcheck/amcheck.control  |5 +
 contrib/amcheck/verify_nbtree.c  | 1238 ++
 doc/src/sgml/amcheck.sgml|  275 +
 doc/src/sgml/contrib.sgml|1 +
 doc/src/sgml/filelist.sgml   |1 +
 src/tools/pgindent/typedefs.list |2 +
 9 files changed, 1566 insertions(+)
 create mode 100644 contrib/amcheck/Makefile
 create mode 100644 contrib/amcheck/amcheck--1.0.sql
 create mode 100644 contrib/amcheck/amcheck.control
 create mode 100644 contrib/amcheck/verify_nbtree.c
 create mode 100644 doc/src/sgml/amcheck.sgml

diff --git a/contrib/Makefile b/contrib/Makefile
index 9a74e1b99f..e84eb67008 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -6,6 +6,7 @@ include $(top_builddir)/src/Makefile.global
 
 SUBDIRS = \
 		adminpack	\
+		amcheck		\
 		auth_delay	\
 		auto_explain	\
 		bloom		\
diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
new file mode 100644
index 00..f14c7aa58b
--- /dev/null
+++ b/contrib/amcheck/Makefile
@@ -0,0 +1,19 @@
+# contrib/amcheck/Makefile
+
+MODULE_big	= amcheck
+OBJS		= verify_nbtree.o $(WIN32RES)
+
+EXTENSION = amcheck
+DATA = amcheck--1.0.sql
+PGFILEDESC = "amcheck - functions to verify access method invariants"
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/amcheck
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include 

Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-09 Thread Tels
Hi Aleksander,

noticed this in your patch:

> * Add a corespinding entry to pgStatTabHash.

"corresponding"

Also a question: Some one-line comments are

 /* Comment. */

while others are

 /*
  * Comment.
  */

Why the difference?

Hope this helps,

Tels

PS: Sorry if this appears twice, I fumbled the From: ...




Re: [HACKERS] bytea_output vs make installcheck

2017-03-09 Thread Neha Khatri
On Fri, Mar 10, 2017 at 6:14 AM, Peter Eisentraut wrote:

> On 2/14/17 16:50, Jeff Janes wrote:
> > make installcheck currently fails against a server running
> > with bytea_output = escape.
> >
> > Making it succeed is fairly easy, and I think it is worth doing.
> >
> > Attached are two options for doing that.  One overrides bytea_output
> > locally where-ever needed, and the other overrides it for the entire
> > 'regression' database.
>
> I would use option 2 here (ALTER DATABASE) and be done with it.  Some
> people didn't like using ALTER DATABASE, but it's consistent with
> existing use.  If someone wants to change that, that can be independent
> of this issue.


Sorry about the naive question, but if someone has set the GUC bytea_output
= 'escape', then the intention seems to be to obtain the output in 'escape'
format for bytea.
With this, if an installcheck is done, that might also have been done with
the expectation that the output will be in 'escape' format. In that case,
how much is it justified to hard-code the format for the regression
database?  However, I agree that there are not many bytea outputs in the
current regression suite.

Regards,
Neha


[HACKERS] SQL Standard Feature T211

2017-03-09 Thread Kevin Grittner
[separate thread from transition table patch, since a different
audience might be interested]

Four things are required to claim support for Feature T211, "Basic
trigger capability":
 - support for the CREATE TRIGGER statement
 - the ability to declare and reference transition tables for AFTER
   triggers
 - support for the DROP TRIGGER statement
 - support for TRIGGER in the GRANT and REVOKE statements

With the addition of transition tables we have all four, although
I'm not sure whether the CREATE TRIGGER statement conforms closely
enough to claim the feature.  The two basic issues I can think of
are:
 - we don't allow the OLD and NEW transition variable names to be
   specified -- using hard-coded values instead
 - we don't support the standard <triggered action> formats,
   instead supporting only our EXECUTE PROCEDURE syntax
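
To make those two gaps concrete: what we accept now looks something
like this (names made up), which covers the transition table and
GRANT pieces:

    CREATE TRIGGER audit_trig
        AFTER UPDATE ON audit_target
        REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows
        FOR EACH STATEMENT
        EXECUTE PROCEDURE audit_func();

    GRANT TRIGGER ON audit_target TO some_role;

The standard would additionally allow renaming the row variables
(REFERENCING OLD ROW AS o NEW ROW AS n) and spelling the action as a
bare triggered SQL statement rather than our EXECUTE PROCEDURE form.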

Do we leave T211, "Basic trigger capability" on our "unsupported"
list until those two points are addressed?  Does anyone know of
anything else that would need to be done?

--
Kevin Grittner




Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-09 Thread Robert Haas
On Mon, Mar 6, 2017 at 11:37 AM, Dagfinn Ilmari Mannsåker wrote:
> David Christensen writes:
>>> Hi David,
>>>
>>> Here's a review of your patch.
>>
>> Hi Ilmari, thanks for your time and review.  I’m fine with the revised 
>> version.
>
> Okay, I've marked the patch as Ready For Committer.

Committed.   Hopefully this doesn't contain any Perl bits that are
sufficiently new as to cause problems for our older BF members ... I
guess we'll see.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops

2017-03-09 Thread Tels
Hello,

On Thu, March 9, 2017 9:04 am, Alexander Korotkov wrote:
> On Wed, Feb 1, 2017 at 2:43 PM, Emre Hasegeli wrote:
>
>> > I think this patch is already in a good shape.
>>
>> I am sorry for introducing this bug.  This fix looks good to me as well.
>
>
> I checked this patch too.  And it seems good to me as well.
> Should we mark it as "ready for committer"?

I can't comment on the code, but the grammar in the comments caught my eye:

> +/* Can any range from range_box does not extend higher than this
> +   argument? */
> +
> +static bool
> +overLower2D(RangeBox *range_box, Range *query)
> +{
> +  return FPle(range_box->left.low, query->high) &&
> +  FPle(range_box->right.low, query->high);
> +}

The sentence sounds quite garbled in English. I'm not entirely sure what
it should be, but given the comment below "/* Can any range from range_box
to be higher than this argument? */" maybe something like:

/* Does any range from range_box extend to the right side of the query? */

If used, an analog wording should be used for overHigher2D's comment like:

/* Does any range from range_box extend to the left side of the query? */

Also:

/* Can any range from range_box to be higher than this argument? */

should be:

/* Can any range from range_box be higher than this argument? */

Another question: Does it make sense to add the "minimal bad example for
the '&<' case" as test case, too? After all, it should pass the test after
the patch.

Best regards,

Tels




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-09 Thread Robert Haas
On Tue, Jan 31, 2017 at 11:35 PM, Michael Paquier wrote:
>> Thanks for the review.
>
> Moved to CF 2017-03, the 8th commit fest of this patch.

I think eight is enough.  Committed with some cosmetic changes.

I think the turning point for this somewhat-troubled patch was when we
realized that, while results were somewhat mixed on whether it
improved performance, wait event monitoring showed that it definitely
reduced contention significantly.  However, I just realized that in
both this case and in the case of group XID clearing, we weren't
advertising a wait event for the PGSemaphoreLock calls that are part
of the group locking machinery.  I think we should fix that, because a
quick test shows that can happen fairly often -- not, I think, as
often as we would have seen LWLock waits without these patches, but
often enough that you'll want to know.  Patch attached.
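
For anyone wanting to repeat that kind of measurement, repeatedly
sampling pg_stat_activity during a benchmark run is enough to see
where backends are waiting, e.g.:

    SELECT wait_event_type, wait_event, count(*)
    FROM pg_stat_activity
    WHERE wait_event IS NOT NULL
    GROUP BY 1, 2
    ORDER BY count(*) DESC;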

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


group-update-waits-v1.patch
Description: Binary data



Re: [HACKERS] delta relations in AFTER triggers

2017-03-09 Thread Kevin Grittner
On Tue, Mar 7, 2017 at 6:28 PM, Kevin Grittner wrote:

> New patch attached.

And bit-rotted less than 24 hours later by fcec6caa.

New patch attached just to fix bit-rot.

That conflicting patch might be a candidate to merge into the new
Ephemeral Named Relation provided by my patch, for more flexibility
and extensibility...

-- 
Kevin Grittner


transition-v12.diff.gz
Description: GNU Zip compressed data



Re: [HACKERS][REVIEW] macaddr 64 bit (EUI-64) datatype support

2017-03-09 Thread Stephen Frost
Hari Babu,

* Haribabu Kommi (kommi.harib...@gmail.com) wrote:
> On Wed, Feb 1, 2017 at 6:27 AM, Vitaly Burovoy wrote:
> > The new status of this patch is: Ready for Committer
> 
> Thanks for the review.

I've started taking a look at this with an eye towards committing it
soon.

Thanks!

Stephen




Re: [HACKERS] Performance issue after upgrading from 9.4 to 9.6

2017-03-09 Thread Naytro Naytro
2017-03-09 18:28 GMT+01:00 Robert Haas:

> On Thu, Mar 9, 2017 at 7:47 AM, Naytro Naytro wrote:
> > We are having some performance issues after we upgraded to newest
> > version of PostgreSQL, before it everything was fast and smooth.
> >
> > Upgrade was done by pg_upgrade from 9.4 directly do 9.6.1. Now we
> > upgraded to 9.6.2 with no improvement.
> >
> > Some information about our setup: Freebsd, Solaris (SmartOS), simple
> > master-slave using streaming replication.
> >
> > Problem:
> > Very high system CPU when master is streaming replication data, CPU
> > goes up to 77%. Only one process is generating this load, it's a
> > postgresql startup process. When I attached a truss to this process I
> > saw a lot o read calls with almost the same number of errors (EAGAIN).
> > read(6,0x7fffa0c7,1) ERR#35 'Resource temporarily unavailable'
> >
> > Descriptor 6 is a pipe
> >
> > Read call try to read one byte over and over, I looked up to source
> > code and I think this file is responsible for this behavior
> > src/backend/storage/ipc/latch.c. There was no such file in 9.4.
>
> Our latch implementation did get overhauled pretty thoroughly in 9.6;
> see primarily commit 98a64d0bd713cb89e61bef6432befc4b7b5da59e.  But I
> can't guess what is going wrong here based on this information.  It
> might help if you can pull some stack backtraces from the startup
> process.


Dtrace from solaris: http://pastebin.com/u03uVKbr


Re: [HACKERS] Performance issue after upgrading from 9.4 to 9.6

2017-03-09 Thread Naytro Naytro
2017-03-09 20:19 GMT+01:00 Andres Freund:

> Hi,
>
> On 2017-03-09 13:47:35 +0100, Naytro Naytro wrote:
> > We are having some performance issues after we upgraded to newest
> > version of PostgreSQL, before it everything was fast and smooth.
> >
> > Upgrade was done by pg_upgrade from 9.4 directly do 9.6.1. Now we
> > upgraded to 9.6.2 with no improvement.
> >
> > Some information about our setup: Freebsd, Solaris (SmartOS), simple
> > master-slave using streaming replication.
>
> Which node is on which of those, and where is the high load?
>
>
High load is only on the slaves: FreeBSD (master+slave) and Solaris (only
slaves)


>
> > Problem:
> > Very high system CPU when master is streaming replication data, CPU
> > goes up to 77%. Only one process is generating this load, it's a
> > postgresql startup process. When I attached a truss to this process I
> > saw a lot o read calls with almost the same number of errors (EAGAIN).
>
> Hm. Just to clarify: The load is on the *receiving* side, in the startup
> process?  Because the load doesn't quite look that way...
>
>
Yes


>
> > read(6,0x7fffa0c7,1) ERR#35 'Resource temporarily unavailable'
> >
> > Descriptor 6 is a pipe
>
> That's presumably a latches internal pipe.  Could you redo that
> truss/strace with timestamps attached?  Does truss show signals
> received? The above profile would e.g. make a lot more sense if not.  Is
> the wal receiver sending signals?
>
>
Truss from Solaris: http://pastebin.com/WajedZ8Y and FreeBSD:
http://pastebin.com/DB5iT8na
FreeBSD truss should show signals by default

Dtrace from solaris: http://pastebin.com/u03uVKbr


>
> > Read call try to read one byte over and over, I looked up to source
> > code and I think this file is responsible for this behavior
> > src/backend/storage/ipc/latch.c. There was no such file in 9.4.
>
> It was "just" moved (and expanded), used to be at
> src/backend/port/unix_latch.c.
>
> There normally shouldn't be that much "latch traffic" in the startup
> process, we'd expect to block from within WaitForWALToBecomeAvailable().
>
> Hm.  Any chance you've configured a recovery_min_apply_delay?  Although
> I'd expect more timestamp calls in that case.
>
>
No, we don't have this option configured


>
> Greetings,
>
> Andres Freund
>


Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure

2017-03-09 Thread David Christensen

> On Mar 9, 2017, at 1:01 PM, Robert Haas wrote:
> 
> On Sun, Feb 19, 2017 at 12:02 PM, David Christensen wrote:
>> Hi Robert, this is part of a larger patch which *does* enable the checksums 
>> online; I’ve been extracting the necessary pieces out with the understanding 
>> that some people thought the disable code might be useful in its own merit.  
>> I can add documentation for the various states.  The CHECKSUM_REVALIDATE is 
>> the only one I feel is a little wibbly-wobbly; the other states are required 
>> because of the fact that enabling checksums requires distinguishing between 
>> “in process of enabling” and “everything is enabled”.
> 
> Ah, sorry, I had missed that patch.
> 
>> My design notes for the patch were submitted to the list with little 
>> comment; see: 
>> https://www.postgresql.org/message-id/1E6E64E9-634B-43F4-8AA2-CD85AD92D2F8%40endpoint.com
>> 
>> I have since added the WAL logging of checksum states, however I’d be glad 
>> to take feedback on the other proposed approaches (particularly the system 
>> catalog changes + the concept of a checksum cycle).]
> 
> I think the concept of a checksum cycle might be overkill.  It would
> be useful if we ever wanted to change the checksum algorithm, but I
> don't see any particular reason why we need to build infrastructure
> for that.  If we wanted to change the checksum algorithm, I think it
> likely that we'd do that in the course of, say, widening it to 4 bytes
> as part of a bigger change in the page format, and this infrastructure
> would be too narrow to help with that.

I hear what you are saying; however, a boolean on pg_class/pg_database to
show whether the relation in question has been checksummed is insufficient
if we allow arbitrary enabling/disabling while enabling is in progress.  In
particular, if we disable checksums while enabling was in progress and had
only a boolean to indicate whether the checksums are complete for a
relation, there will have been a window when new pages in a relation were
*not* checksummed while the system table flag indicated that the checksum
was up-to-date, which is false.  This would lead to checksum failures when
those pages are encountered.  Similar failures will occur when doing a
pg_upgrade of an in-progress enabling.  Saying you can’t disable/cancel the
checksum process while it’s running seems undesirable too: what happens if
you have a system failure mid-process?

We could certainly avoid this problem by just saying that we have to start
over with the entire cluster and process every page from scratch, rather
than trying to preserve/track state that we know is good, perhaps only
dirtying buffers which have a non-matching checksum while we’re in the
conversion state.  But this seems undesirable to me; plus, if we made it
work in a single session we’d (presumably) have a long-running snapshot
open, which would wreak havoc while it processes the entire database (as
someone had suggested, by doing it in a single function that just hangs
while it’s running).

I think we still need a way to short-circuit the process/incrementally update 
and note which tables have been checksummed and which ones haven’t.  (Perhaps I 
could make the code which currently checks DataChecksumsEnabled() a little 
smarter, depending on the relation it’s looking at.)
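
For clarity, the user-visible surface of all this stays small; roughly
(the function name comes from this patch, and the exact signature may
still change):

    SHOW data_checksums;              -- existing read-only setting
    SELECT pg_disable_checksums();    -- proposed by this patch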

As per one of Jim’s suggestions, I’ve been looking at making the
data_checksum_state localized per database at postinit time, but really it
probably doesn’t matter to anything but the buffer code whether it’s a
global setting, a per-database setting, or what.


> While I'm glad to see work finally going on to allow enabling and
> disabling checksums, I think it's too late to put this in v10.  We
> have a rough rule that large new patches shouldn't be appearing for
> the first time in the last CommitFest, and I think this falls into
> that category.  I also think it would be unwise to commit just the
> bits of the infrastructure that let us disable checksums without
> having time for a thorough review of the whole thing; to me, the
> chances that we'll make decisions that we later regret seem fairly
> high.  I would rather wait for v11 and have a little more time to try
> to get everything right.

Makes sense, but to that end I think we need to have at least some agreement 
on how this should work ahead of time.  Obviously it’s easier to critique 
existing code, but I don’t want to go out in left field if a particular 
approach is going to be obviously unworkable per some of the more experienced 
devs here.
--
David Christensen
End Point Corporation
da...@endpoint.com
785-727-1171







Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-09 Thread David Rowley
On 10 March 2017 at 06:17, Robert Haas  wrote:

> On Thu, Mar 9, 2017 at 11:50 AM, Dilip Kumar  wrote:
> > On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar  wrote:
> >> I slightly modified your query to reproduce this issue.
> >>
> >> explain analyze select * from r1 where value<555;
> >>
> >> Patch is attached to fix the problem.
> >
> > I forgot to mention the cause of the problem.
> >
> > if (istate->schunkptr < istate->nchunks)
> > {
> >     PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
> >     PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
> >     BlockNumber chunk_blockno;
> >
> > In the above if condition we have only checked istate->schunkptr <
> > istate->nchunks, which means we have some chunk left, so we are safe to
> > access idxchunks.  But just after that we are accessing
> > ptbase[idxpages[istate->spageptr]] without checking whether accessing
> > idxpages is safe or not.
> >
> > tbm_iterator already handles this case; I broke it in
> > tbm_shared_iterator.
>
> I don't know if this is the only problem -- it would be good if David
> could retest -- but it's certainly *a* problem, so committed.
>

Thanks for committing, and generally parallelising more stuff.

I confirm that my test case is now working again.

I'll be in this general area today, so will mention if I stumble over
anything that looks broken.
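
For reference, a minimal sketch of the kind of guard tbm_shared_iterate needs 
at the spot quoted above (based on that snippet; the exact committed code may 
differ):

	if (istate->schunkptr < istate->nchunks)
	{
		PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
		BlockNumber chunk_blockno;

		chunk_blockno = chunk->blockno + istate->schunkbit;

		/*
		 * Dereference idxpages only when exact pages remain: once only
		 * lossy chunks are left, spageptr has reached npages, and
		 * idxpages[istate->spageptr] would read past the array's end.
		 */
		if (istate->spageptr >= istate->npages ||
			chunk_blockno < ptbase[idxpages[istate->spageptr]].blockno)
		{
			/* Return a lossy page indicator from the chunk. */
			output->blockno = chunk_blockno;
			output->ntuples = -1;
			output->recheck = true;
			istate->schunkbit++;
			return output;
		}
	}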

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] rename pg_log directory?

2017-03-09 Thread Bruce Momjian
On Tue, Mar  7, 2017 at 12:01:04AM +0100, Andreas Karlsson wrote:
> On 02/27/2017 03:05 PM, Peter Eisentraut wrote:
> >How about changing the default for log_directory from 'pg_log' to, say,
> >'log'?
> 
> I have attached a patch which does this. I do not care much about which name
> is picked as long as we get rid of the "pg_" prefix.
> 
> Btw, is there a reason for why global and base do not have the "pg_" prefix?

"data" and "base" were chosen because it is a "data-base", but with the
pg_ prefixes it would be a pg_data_pg_base.  ;-)
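
For reference, the change amounts to flipping the built-in default, so a 
stock configuration would read roughly like this (assuming the logging 
collector is enabled; the path is relative to the data directory):

	logging_collector = on
	log_directory = 'log'		# previously 'pg_log'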

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] compiler warning in set_tablefunc_size_estimates

2017-03-09 Thread Robert Haas
On Thu, Mar 9, 2017 at 4:39 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I tried a non-cassert compile on a machine that has a pickier compiler
>> than my laptop, and got:
>
>> costsize.c: In function ‘set_tablefunc_size_estimates’:
>> costsize.c:4574:17: error: variable ‘rte’ set but not used
>> [-Werror=unused-but-set-variable]
>
>> That appears to be a legitimate gripe.  Perhaps:
>
> I think PG_USED_FOR_ASSERTS_ONLY would be a better solution.  It's
> only happenstance that the function currently uses the RTE just
> for this; if it grows another use, your approach would be harder
> to clean up.

Yeah, we might have to revert the entire -4/+1 line patch.

Actually, the thing I don't like about that is that then we're still
emitting code for the planner_rt_fetch.  That probably doesn't cost much,
but why do it?
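
For context, a minimal sketch of the PG_USED_FOR_ASSERTS_ONLY approach under 
discussion, assuming the function body is roughly the current one (in 
assert-disabled builds the attribute silences the warning, but the 
planner_rt_fetch() call itself still generates code, which is the objection 
above):

void
set_tablefunc_size_estimates(PlannerInfo *root, RelOptInfo *rel)
{
	RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;

	/* Should only be applied to base relations that are functions */
	Assert(rel->relid > 0);
	rte = planner_rt_fetch(rel->relid, root);
	Assert(rte->rtekind == RTE_TABLEFUNC);

	rel->tuples = 100;

	/* Now estimate number of output rows, etc. */
	set_baserel_size_estimates(root, rel);
}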

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-03-09 Thread Michael Paquier
On Thu, Mar 9, 2017 at 10:43 PM, Peter Eisentraut
 wrote:
> On 3/8/17 02:34, Michael Paquier wrote:
>> This patch looks good to me.
>
> committed

Thanks.
-- 
Michael




Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog

2017-03-09 Thread Michael Paquier
On Thu, Mar 9, 2017 at 10:54 PM, Peter Eisentraut
 wrote:
> On 3/7/17 11:16, Robert Haas wrote:
>> Well, if the problem you're trying to solve is "retain WAL for as long
>> as possible without running out of disk space and having everything go
>> kablooey", then it would solve that problem, and I think that's a very
>> reasonable problem to want to solve.
>
> Could be.  I'm not sure what that means for the presented patch, though.
>  Or whether it addresses Michael's original use case at all.

The patch adding an end-segment command does address my problem, because I 
just want to be sure that there is enough space left on disk for one complete 
segment, and it is fine to check for that once the last segment is complete. 
This needs some additional effort, but that's no big deal either.

Having something like --limit-retained-segments partially addresses
it, as long as there is a way to define an automatic mode, based on
statvfs() obviously.
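
For illustration, a minimal sketch of the statvfs()-based check such an 
automatic mode could make (the helper name and the policy around it are 
hypothetical):

#include <stdbool.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: returns true when the target directory has at
 * least enough free space for one more complete WAL segment.  An
 * automatic --limit-retained-segments mode could drop the oldest
 * finished segments until this returns true.
 */
static bool
enough_space_for_segment(const char *dir, uint64_t wal_segment_size)
{
	struct statvfs fs;

	if (statvfs(dir, &fs) != 0)
		return false;			/* be conservative if the check fails */

	/* f_bavail counts fragments of f_frsize bytes available to us */
	return (uint64_t) fs.f_bavail * fs.f_frsize >= wal_segment_size;
}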
-- 
Michael




Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-09 Thread Michael Paquier
On Fri, Mar 10, 2017 at 12:00 AM, Joe Conway  wrote:
> On 03/09/2017 06:34 AM, Robert Haas wrote:
>> On Thu, Mar 9, 2017 at 7:59 AM, Michael Paquier
>>  wrote:
>>> Yes, I agree with that for MD5, and after looking around I can see as
>>> well (for example here: http://prosody.im/doc/plain_or_hashed) that
>>> "SCRAM-hashed" is used. Now, there are also references to the salt,
>>> like in protocol.sgml:
>>> "The salt to use when encrypting the password."
>>> Joe, do you think that in this case using the term "hashing" would be
>>> more appropriate? I would think so, as we use it to hash the password.
>>
>> I'm not Joe, but I think that would be more appropriate.
>
> I am Joe and I agree ;-)

OK I'll spawn a new thread on the matter.
-- 
Michael




Re: [HACKERS] contrib modules and relkind check

2017-03-09 Thread Stephen Frost
Amit, Michael,

* Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
> On 2017/03/09 11:51, Michael Paquier wrote:
> > OK, I am marking that as ready for committer.
> 
> Thanks.

Thanks for this, I've pushed this now.  I do have a few notes about
changes that I made from your patch:

- Generally speaking, the user-facing functions come first in these .c
  files, with a prototype at the top for the static functions defined
  later on in the file.  I went ahead and did that for the functions you
  added too.

- I added more comments to the regression tests, in particular, we
  usually comment when tests are expected to fail.

- I added some additional regression tests to cover more cases,
  particularly ones for things that weren't being tested at all.

- Not the fault of your patch, but there were cases where elog() was
  being used when it really should have been ereport(), so I changed
  those cases to be, hopefully, consistent throughout (see the sketch
  below).
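
For illustration, a hedged sketch of the elog()-to-ereport() cleanup meant 
here (the message text and error code are examples, not the committed 
wording):

	/* Before: elog() carries no useful SQLSTATE for a user-facing error. */
	elog(ERROR, "\"%s\" is not a table or materialized view",
		 RelationGetRelationName(rel));

	/* After: ereport() attaches a proper errcode for clients to act on. */
	ereport(ERROR,
			(errcode(ERRCODE_WRONG_OBJECT_TYPE),
			 errmsg("\"%s\" is not a table or materialized view",
					RelationGetRelationName(rel))));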

Thanks again!

Stephen




  1   2   3   >