Re: [HACKERS] plpgsql - additional extra checks

2017-03-16 Thread David Steele
On 1/13/17 6:55 AM, Marko Tiikkaja wrote:
> On Fri, Jan 13, 2017 at 2:46 AM, Jim Nasby wrote:
> > On 1/11/17 5:54 AM, Pavel Stehule wrote:
> > > +too_many_rows
> > > +
> > > + 
> > > +  When a result is assigned to a variable by the
> > > +  INTO clause,
> > > +  checks if the query returns more than one row. In this case
> > > +  the assignment is usually not deterministic - and it can
> > > +  signal some issues in the design.
> >
> > Shouldn't this also apply to
> >
> >     var := blah FROM some_table WHERE ...;
> >
> > ?
> >
> > AIUI that's one of the beefs the plpgsql2 project has.
>
> No, not at all.  That syntax is undocumented and only works because
> PL/PgSQL is a hack internally.  We don't use it, and frankly I don't
> think anyone should.

This patch still applies cleanly and compiles at cccbdde.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] [PATCH] Suppress Clang 3.9 warnings

2017-03-16 Thread Tom Lane
Noah Misch writes:
> On Wed, Mar 15, 2017 at 12:04:11PM +0300, Aleksander Alekseev wrote:
>> If it's not too much trouble, could you please explain what will happen
>> on such platforms?

> Both port.h and a system header will furnish a strlcpy() declaration.  The #if
> you modified exists to avoid that, and your change would make it ineffective
> for Clang.  This will have no symptoms, or it will elicit a warning.

The reason why this is bad is that port.h's declaration might be different
from the system headers'.  That's not hypothetical; for example, on my
Mac laptop, strlcpy is declared

size_t strlcpy(char * restrict dst, const char * restrict src, size_t size);

whereas of course there's no "restrict" in port.h.  To make matters worse,
it looks like strlcpy is actually a macro expanding to
'__builtin___strlcpy_chk'.  And the compiler on this box *is* clang,
meaning the proposed patch would affect it.  When I try it, I get
boatloads of errors (not warnings) like these:

In file included from ../../src/include/postgres_fe.h:25:
In file included from ../../src/include/c.h:1125:
../../src/include/port.h:403:15: error: expected parameter declarator
extern size_t strlcpy(char *dst, const char *src, size_t siz);
  ^
/usr/include/secure/_string.h:105:44: note: expanded from macro 'strlcpy'
  __builtin___strlcpy_chk (dest, src, len, __darwin_obsz (dest))
   ^

../../src/include/port.h:403:15: error: conflicting types for '__builtin___strlcpy_chk'
/usr/include/secure/_string.h:105:3: note: expanded from macro 'strlcpy'

In short, if this were to get committed, it would get reverted within
minutes, because more than a few of us use Macs.
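
For context, the guard being discussed looks roughly like this (a simplified
sketch of the idea, not the exact port.h text):

    /* declare strlcpy ourselves only when the platform does not */
    #ifndef HAVE_STRLCPY
    extern size_t strlcpy(char *dst, const char *src, size_t siz);
    #endif

Weakening that #if for clang makes both this declaration and the system one
(here a macro expanding to __builtin___strlcpy_chk) visible in the same
translation unit, which is exactly the conflict shown in the errors above.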

regards, tom lane




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-16 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> So, attached is the alternative fix for this issue.
> Please share your thoughts.

I assume you prefer the alternative fix because it's simpler.

> I would also like to hear Craig's opinion on it before applying this fix
> to the original patch, just to make sure am not missing anything here.

+1

The main question is whether the predicates enforced
by PQsetSingleRowMode() apply in batch mode in all cases
when it's legit to call that function. Two predicates
that may be problematic are:

    if (conn->asyncStatus != PGASYNC_BUSY)
        return 0;

and

    if (conn->result)
        return 0;

The general case with batch mode is, from the doc:
"The client interleaves result processing with sending batch queries"
Note that I've not even tested that here; I've tested
batching a bunch of queries in a first step and getting the results
in a second step.
I am not confident that the above predicates will be true
in all cases. Also, your alternative fix assumes that we add
a user-visible exception to PQsetSingleRowMode in batch mode,
whereby it must not be called as currently documented:
  "call PQsetSingleRowMode immediately after a successful call of
   PQsendQuery (or a sibling function)"
My gut feeling is that this is not the right direction; I'd prefer making
single-row mode a per-query attribute internally and keeping
PQsetSingleRowMode's contract unchanged from the
user's perspective.
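
For reference, the currently documented (non-batch) call sequence is a
straightforward pattern; a minimal sketch using only standard libpq calls
(process_row is a hypothetical handler):

    if (PQsendQuery(conn, "SELECT ...") && PQsetSingleRowMode(conn))
    {
        PGresult *res;

        while ((res = PQgetResult(conn)) != NULL)
        {
            /* one PGRES_SINGLE_TUPLE result per row, then PGRES_TUPLES_OK */
            process_row(res);
            PQclear(res);
        }
    }

Making single-row mode a per-query attribute internally would let this same
calling pattern carry over to batches without a documented exception.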


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] scram and \password

2017-03-16 Thread Heikki Linnakangas

On 03/14/2017 11:14 PM, Tom Lane wrote:

> In short, I don't think that argument refutes my position that "md5"
> in pg_hba.conf should be understood as allowing SCRAM passwords too.


Yeah, let's do that. Here's a patch.

I had some terminology trouble with the docs. What do you call a user 
that has "md5X" in pg_authid.rolpassword? What about someone with a 
SCRAM verifier? I used the terms "those users that have an MD5 hash set 
in the system catalog", and "users that have set their password as a 
SCRAM verifier", but it feels awkward.


The behavior when a user doesn't exist, or doesn't have a valid 
password, is a bit subtle. Previously, with 'md5' authentication, we 
would send the client an MD5 challenge, and fail with "invalid password" 
error after receiving the response. And with 'scram' authentication, we 
would perform a dummy authentication exchange, with a made-up salt. This 
is to avoid revealing to an unauthenticated client whether or not the 
user existed.


With this patch, the dummy authentication logic for 'md5' is a bit more 
complicated. I made it look at the password_encryption GUC, and send the 
client a dummy MD5 or SCRAM challenge based on that. The idea is that 
most users presumably have a password of that type, so we use that 
method for the dummy authentication, to make it look as "normal" as 
possible. It's not perfect: if password_encryption is set to 'scram' 
and you probe for a user that has an MD5 password set, you can tell that 
it's a valid user from the fact that the server sends an MD5 challenge.
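
In rough pseudo-code, the dummy-challenge choice described above amounts to
something like this (names are illustrative, not the actual patch):

    /* sketch: user unknown, or no valid password on record */
    if (!user_exists || !password_valid)
    {
        if (Password_encryption == PASSWORD_TYPE_SCRAM)
            perform_dummy_scram_exchange();   /* made-up salt, always fails */
        else
            perform_dummy_md5_exchange();     /* random salt, always fails */
    }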


In practice, I'm not sure how good this dummy authentication thing 
really is anyway. Even on older versions, I'd wager a guess that if you 
tried hard enough, you could tell if a user exists or not based on 
timing, for example. So I think this is good enough. But it's worth 
noting and discussing.


- Heikki

From 4a6856c1becca8905a5255661b6b64b1aed64ec8 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Thu, 16 Mar 2017 15:45:16 +0200
Subject: [PATCH 1/1] Allow SCRAM authentication, when pg_hba.conf says 'md5'.

If a user has a SCRAM verifier in pg_authid.rolpassword, there's no reason
we cannot attempt to perform SCRAM authentication instead of MD5. In the
worst case, the client doesn't support SCRAM, and the authentication will
fail. But previously, it would fail for sure, because we would not even
try. SCRAM is strictly more secure than MD5, so there's no harm in trying
it. This allows for a more graceful transition from MD5 passwords to SCRAM,
as user passwords can be switched to SCRAM incrementally, without changing
pg_hba.conf.

Refactor the code in auth.c to support that better. Notably, we now have
to look up the user's pg_authid entry before sending the password
challenge, also when performing MD5 authentication. Also simplify the
concept of a "doomed" authentication. Previously, if a user had a password,
but it had expired, we still performed SCRAM authentication (but always
returned an error at the end) using the salt and iteration count from the
expired password. Now we construct a fake salt, like we do when the user
doesn't have a password or doesn't exist at all. That simplifies
get_role_password(), and we no longer need to distinguish the "user has an
expired password" and "user does not exist" cases in auth.c.

On second thoughts, also rename uaSASL to uaSCRAM. It refers to the
mechanism specified in pg_hba.conf, and while we use SASL for SCRAM
authentication at the protocol level, the mechanism should be called SCRAM,
not SASL. As a comparison, we have uaLDAP, even though it looks like the
plain 'password' authentication at the protocol level.
---
 doc/src/sgml/client-auth.sgml  |  38 -
 src/backend/libpq/auth-scram.c | 104 ++---
 src/backend/libpq/auth.c   | 173 ++---
 src/backend/libpq/crypt.c  |  44 ---
 src/backend/libpq/hba.c|   2 +-
 src/include/libpq/crypt.h  |   3 +-
 src/include/libpq/hba.h|   2 +-
 src/include/libpq/scram.h  |   2 +-
 8 files changed, 207 insertions(+), 161 deletions(-)

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index bbd52a5418..db200d4b76 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -412,23 +412,22 @@ hostnossl  database  
user

 

-md5
+scram
 
  
-  Require the client to supply a double-MD5-hashed password for
-  authentication.
-  See  for details.
+  Perform SCRAM-SHA-256 authentication to verify the user's
+  password. See  for details.
  
 

 

-scram
+md5
 
  
-  Perform SCRAM-SHA-256 authentication to verify the user's
-  password.
-  See  for details.
+  Perform SCRAM-SHA-256 or MD5 authentication to verify the
+  user's 

[HACKERS] Re: [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send

2017-03-16 Thread David Steele
On 1/9/17 11:33 PM, Jon Nelson wrote:
> On Sat, Jan 7, 2017 at 7:48 PM, Jim Nasby wrote:
> > On 1/5/17 12:55 PM, Jonathon Nelson wrote:
> > > Attached please find a patch for PostgreSQL 9.4 which changes the
> > > maximum amount of data that the wal sender will send at any point in
> > > time from the hard-coded value of 128KiB to a user-controllable value
> > > up to 16MiB. It has been primarily tested under 9.4 but there has
> > > been some testing with 9.5.
> >
> > To make sure this doesn't get lost, please add it to
> > https://commitfest.postgresql.org. Please verify the patch will
> > apply against current HEAD and pass make check-world.
>
> Attached please find a revision of the patch, changed in the following ways:
>
> 1. removed a call to debug2.
> 2. applies cleanly against master (as of
> 8c5722948e831c1862a39da2bb5d793a6f2aabab)
> 3. one small indentation fix, one small verbiage fix.
> 4. switched to calculating the upper bound using XLOG_SEG_SIZE rather
> than hard-coding 16384.
> 5. the git author is - obviously - different.
>
> make check-world passes.
> I have added it to the commitfest.
> I have verified with strace that up to 16MB sends are being used.
> I have verified that the GUC properly grumps about values greater than
> XLOG_SEG_SIZE / 1024 or smaller than 4.

This patch applies cleanly on cccbdde and compiles.  However,
documentation in config.sgml is needed.

The concept is simple enough though there seems to be some argument
about whether or not the patch is necessary.  In my experience 128K
should be more than large enough for a chunk size, but I'll buy the
argument that libpq is acting as a barrier in this case.

I'm marking this patch "Waiting on Author" for required documentation.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] postgres_fdw: support parameterized foreign joins

2017-03-16 Thread Arthur Zakirov
Hello,

2017-02-27 12:40 GMT+03:00 Etsuro Fujita :
> Hi,
>
> I'd like to propose to support parameterized foreign joins.  Attached is a
> patch for that, which has been created on top of [1].
>

Can you rebase the patch? It does not apply now.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 5:13 AM, Petr Jelinek wrote:
> Hmm now that you mention it, I remember discussing something similar
> with you last year in Dallas in regards to parallel query. IIRC Windows
> should not have this problem but other systems with EXEC_BACKEND do.
> Don't remember the details though.

Generally, extension code can't use bgw_main safely, and must use
bgw_library_name and bgw_function_name.  But bgw_main is supposedly
safe for core code.  If it's not even safe there, then I guess we
should remove it entirely as a useless foot-gun.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 12:29 AM, Haribabu Kommi wrote:
> On Fri, Feb 24, 2017 at 3:17 PM, Seki, Eiji wrote:
>> Thank you for your comments.
>>
>> I reflected these comments to the attached patch. And I renamed IGNORE_XXX
>> flags to PROCARRAY_XXX flags.
>
> I checked the latest patch and I have some comments.
>
> +static int
> +ConvertProcarrayFlagToProcFlag(int flags)
>
> I feel this function is not needed if we try to maintain the same flag
> values for both PROC_XXX and PROCARRAY_XXX, with comments at both
> declaration sites to make sure that anyone modifying the flag values
> updates them in both places. It is usually rare that the flag values
> get changed.

Yeah, it doesn't seem like a good idea to add additional computation
to something that's already a known hot spot.
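
One way to keep the flags in sync with no runtime conversion at all is to
define the new names as aliases of the existing bits; a sketch (the
PROCARRAY_* names here are illustrative):

    /* procarray.h: these must track the PROC_* flags in proc.h */
    #define PROCARRAY_VACUUM_FLAG      PROC_IN_VACUUM
    #define PROCARRAY_ANALYZE_FLAG     PROC_IN_ANALYZE

together with a comment at both declaration sites, as suggested above.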

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Should we eliminate or reduce HUP from docs?

2017-03-16 Thread Bruce Momjian
On Fri, Mar 10, 2017 at 11:57:30AM -0800, Joshua Drake wrote:
> Hello,
> 
> I am a bad speaker: I am writing a talk three weeks before the conference
> (as opposed to on the plane). I noticed that in the docs we still reference
> passing SIGHUP to reload the conf file, but we now have pg_reload_conf().
> 
> It seems the use of pg_reload_conf() would provide a better canonical
> interface for our users, especially those who are not used to
> interacting with the OS (Windows, Oracle, etc.) for databases.

FYI, I did apply this patch for PG 10:

commit 10c064ce4dad088ba2d8b978bff6009b9f22dc3a
Author: Bruce Momjian 
Date:   Tue Oct 25 11:26:15 2016 -0400

Consistently mention 'SELECT pg_reload_conf()' in config files

Previously we only mentioned SIGHUP and 'pg_ctl reload' in
postgresql.conf and pg_hba.conf.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Changing references of password encryption to hashing

2017-03-16 Thread Stephen Frost
* Joe Conway (m...@joeconway.com) wrote:
> On 03/16/2017 06:19 AM, Robert Haas wrote:
> > On Mon, Mar 13, 2017 at 4:48 AM, Craig Ringer  wrote:
> >> So I'm in favour of fixing the docs but I'm not keen on changing the
> >> SQL syntax in a way that just kind of papers over part of the
> >> problems.
> > 
> > I agree.  I think that trying to design new SQL syntax at this point
> > is unlikely to be a good idea - we're just about out of time here, and
> > some people who might care about this are busy on other things, and
> > the deadline for patches that do new things has long since passed.
> > But I like the idea of trying to improve the documentation.
> 
> Agreed. I think the documentation fixes definitely should be done, but
> understand that the grammar is a longer term issue with backward
> compatibility implications. Acknowledging the problem is the first step ;-)

+1

Thanks!

Stephen




Re: [HACKERS] wait events for disk I/O

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 8:28 AM, Rahila Syed  wrote:
> Thank you for the updated patch.
>
> I have applied and tested it on latest sources and the patch looks good to
> me.

The documentation puts the new wait events in a pretty random order.
I think they should be alphabetized, like we do with the IPC events.
I also suggest we change the naming scheme so that the kind of thing
being operated on is first and this is followed by the operation name.
This will let us keep related entries next to each other after
alphabetizing.  So with that principle in mind:

- instead of ReadDataBlock etc. I propose DataFileRead, DataFileWrite,
DataFileSync, DataFileExtend, DataFileFlush, DataFilePrefetch,
DataFileTruncate.  Using "file" instead of "block" avoids singular/plural
confusion.
- instead of RelationSync and RelationImmedSync I propose
DataFileSync and DataFileImmediateSync; these are md.c operations like
the previous set, so why name them differently?
- instead of WriteRewriteDataBlock and SyncRewriteDataBlock and
TruncateLogicalMappingRewrite, which aren't consistent with each other
even though they are related, I propose LogicalRewriteWrite,
LogicalRewriteSync, and LogicalRewriteTruncate, which are also closer
to the names of the functions that contain those wait points
- ReadBuffile and WriteBuffile seem OK, but I propose BufFileRead and
BufFileWrite, again reversing the order and also tweaking the
capitalization
- in keeping with our new policy of referring to xlog as wal in user
visible interfaces, I propose WALRead, WALCopyRead, WALWrite,
WALInitWrite, WALCopyWrite, WALBootstrapWrite, WALInitSync,
WALBootstrapSync, WALSyncMethodAssign
- for control file ops, ControlFileRead, ControlFileWrite,
ControlFileWriteUpdate, ControlFileSync, ControlFileSyncUpdate
- ReadApplyLogicalMapping and friends seem to have to do with the
reorderbuffer code, so maybe ReorderBufferRead etc.
- there seems to be some discrepancy between the documentation and
pgstat_get_wait_io for the snapbuild stuff.  maybe SnapBuildWrite,
SnapBuildSync, SnapBuildRead.
- SLRURead, SLRUWrite, etc.
- TimelineHistoryRead, etc.
- the syslogger changes should be dropped, since the syslogger is not
and should not be connected to shared memory
- the replslot terminology seems like a case of odd capitalization and
overeager abbreviation.  why not ReplicationSlotRead,
ReplicationSlotWrite, etc?  similarly RelationMapRead,
RelationMapWrite, etc?
- CopyFileRead, CopyFileWrite
- LockFileCreateRead, etc.
- AddToDataDirLockFileRead is a little long and incomprehensible;
maybe LockFileUpdateRead etc.
- DSMWriteZeroBytes, maybe?

Of course the constants should be renamed to match.
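
With the proposed scheme, the name lookup would read roughly like this
(hypothetical constants, following pgstat_get_wait_io()'s existing
switch-on-enum shape):

    case WAIT_EVENT_DATA_FILE_READ:
        event_name = "DataFileRead";
        break;
    case WAIT_EVENT_DATA_FILE_SYNC:
        event_name = "DataFileSync";
        break;
    /* ... */

so alphabetizing the documentation keeps related entries adjacent in the
code as well.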

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] scram and \password

2017-03-16 Thread Joe Conway
On 03/16/2017 06:52 AM, Heikki Linnakangas wrote:
> On 03/14/2017 11:14 PM, Tom Lane wrote:
>> In short, I don't think that argument refutes my position that "md5"
>> in pg_hba.conf should be understood as allowing SCRAM passwords too.
> 
> Yeah, let's do that. Here's a patch.
> 
> I had some terminology trouble with the docs. What do you call a user
> that has "md5X" in pg_authid.rolpassword? What about someone with a
> SCRAM verifier? I used the terms "those users that have an MD5 hash set
> in the system catalog", and "users that have set their password as a
> SCRAM verifier", but it feels awkward.

maybe something like:
"those users with an MD5 hashed password"
"those users with a SCRAM verifier hash"

> The behavior when a user doesn't exist, or doesn't have a valid
> password, is a bit subtle. Previously, with 'md5' authentication, we
> would send the client an MD5 challenge, and fail with "invalid password"
> error after receiving the response. And with 'scram' authentication, we
> would perform a dummy authentication exchange, with a made-up salt. This
> is to avoid revealing to an unauthenticated client whether or not the
> user existed.
> 
> With this patch, the dummy authentication logic for 'md5' is a bit more
> complicated. I made it look at the password_encryption GUC, and send the
> client a dummy MD5 or SCRAM challenge based on that. The idea is that
> most users presumably have a password of that type, so we use that
> method for the dummy authentication, to make it look as "normal" as
> possible. It's not perfect: if password_encryption is set to 'scram'
> and you probe for a user that has an MD5 password set, you can tell that
> it's a valid user from the fact that the server sends an MD5 challenge.

Presumably if you are unauthenticated you don't have any way to know
what password_encryption is set to, so this seems pretty reasonable.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-16 Thread Emre Hasegeli
> Hopefully, this time I got it correct.  Since I am unable to reproduce
> the issue, I will again need your help in verifying the fix.

It is not crashing with the new patch.  Thank you.




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 6:48 AM, Ashutosh Bapat wrote:
>> I thought the whole point here was that NOT doing that caused the
>> memory usage for partitionwise join to get out of control.  Am I
>> confused?
>
> We took a few steps to reduce the memory footprint of partition-wise
> join in [1] and [2]. According to the numbers reported in [1] and then
> in [2], if the total memory consumed by the planner is 44MB (memory
> consumed by paths: 150K) for a 5-way non-partition-wise join between
> tables with 1000 partitions, partition-wise join consumed 192MB, which
> is 4.4 times the non-partition-wise case. The earlier implementation
> of blowing away a memory context after each top-level child-join just
> got rid of the paths created for that child-join. The total memory
> consumed by paths created for all the child-joins was about 150MB.
> Remember that we cannot get rid of memory consumed by expressions,
> RelOptInfos, RestrictInfos etc. since their pointers will be copied
> into the plan nodes.

All right, I propose that we revise our plan for attacking this
problem.  The code in this patch that proposes to reduce memory
utilization is very complicated and it's likely to cause us to miss
this release altogether if we keep hacking on it.  So, I propose that
you refactor this patch series so that the first big patch is
partition-wise join without any of the optimizations that save memory
- essentially the sample_partition_fraction = 1 case with all
memory-saving optimizations removed.  If it's only there to save
memory, rip it out.  Also, change the default value of
enable_partition_wise_join to false and document that turning it on
may cause a large increase in planner memory utilization, and that's
why it's not enabled by default.
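
For reference, an off-by-default boolean GUC is a single-entry change in
guc.c; roughly this shape (a sketch, not the actual patch):

    {
        {"enable_partition_wise_join", PGC_USERSET, QUERY_TUNING_METHOD,
            gettext_noop("Enables partition-wise join."),
            NULL
        },
        &enable_partition_wise_join,
        false,
        NULL, NULL, NULL
    },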

If we get that committed, then we can have follow-on patches that add
the incremental path creation stuff and other memory-saving features,
and then at the end we can flip the default from "off" to "on".
Probably that last part will slip beyond v10 since we're only two
weeks from the end of the release cycle, but I think that's still
better than having everything slip.  Let's also put the multi-level
partition-wise join stuff ahead of the memory-saving stuff, because
being able to do only a single level of partition-wise join is a
fairly unimpressive feature; I'm not sure this is really even
committable without that.

I realize in some sense that I'm telling you to go and undo all of the
work that you just did based on what I told you before, but I think
we've actually made some pretty good progress here: it's now clear
that there are viable strategies for getting the memory usage down to
an acceptable level, and we've got draft patches for those strategies.
So committing the core feature without immediately including that work
can't be regarded as breaking everything hopelessly; rather, it now
looks (I think, anyway) like a reasonable intermediate step towards
the eventual goal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Push down more full joins in postgres_fdw

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 12:48 PM, Robert Haas  wrote:
> On Thu, Mar 16, 2017 at 11:46 AM, David Steele  wrote:
>> $ patch -p1 < ../other/postgres-fdw-subquery-support-v15.patch
>> patching file contrib/postgres_fdw/deparse.c
>> Hunk #11 succeeded at 1371 (offset -3 lines).
>> Hunk #12 succeeded at 1419 (offset -3 lines).
>> Hunk #13 succeeded at 1486 (offset -3 lines).
>> Hunk #14 succeeded at 2186 (offset -3 lines).
>> Hunk #15 succeeded at 3082 (offset -3 lines).
>> patching file contrib/postgres_fdw/expected/postgres_fdw.out
>> patching file contrib/postgres_fdw/postgres_fdw.c
>> Hunk #1 succeeded at 669 (offset 1 line).
>> Hunk #2 succeeded at 1245 (offset -1 lines).
>> Hunk #3 succeeded at 2557 (offset -1 lines).
>> Hunk #4 succeeded at 4157 (offset 3 lines).
>> Hunk #5 succeeded at 4183 (offset 3 lines).
>> Hunk #6 succeeded at 4212 (offset 3 lines).
>> Hunk #7 succeeded at 4315 (offset 3 lines).
>> patching file contrib/postgres_fdw/postgres_fdw.h
>> patching file contrib/postgres_fdw/sql/postgres_fdw.sql
>>
>> Since these are just offsets I'll leave the patch as "Needs review" but
>> an updated patch would be appreciated.
>
> I don't think that's really needed.  Offsets don't hurt anything.
> Even fuzz is OK.  As long as the hunks are applying, I think it's
> fine.
>
> Incidentally, I'm reading through this one now.

And ... I don't see anything to complain about, so, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] LWLock optimization for multicore Power machines

2017-03-16 Thread David Steele
On 2/21/17 9:54 AM, Bernd Helmle wrote:
> On Tuesday, 2017-02-14 at 15:53 +0300, Alexander Korotkov wrote:
>> +1
>> And you could try to use pg_wait_sampling to sample wait events.
> 
> I've tried this with your example from your blog post[1] and got this:
> 
> (pgbench scale 1000)
> 
> pgbench -Mprepared -S -n -c 100 -j 100 -T 300 -P2 pgbench2
> 
> SELECT-only:
> 
> SELECT * FROM profile_log ;
>              ts             |  event_type   |     event     | count
> ----------------------------+---------------+---------------+-------
>  2017-02-21 15:21:52.45719  | LWLockNamed   | ProcArrayLock |     8
>  2017-02-21 15:22:11.19594  | LWLockTranche | lock_manager  |     1
>  2017-02-21 15:22:11.19594  | LWLockNamed   | ProcArrayLock |    24
>  2017-02-21 15:22:31.220803 | LWLockNamed   | ProcArrayLock |     1
>  2017-02-21 15:23:01.255969 | LWLockNamed   | ProcArrayLock |     1
>  2017-02-21 15:23:11.272254 | LWLockNamed   | ProcArrayLock |     2
>  2017-02-21 15:23:41.313069 | LWLockNamed   | ProcArrayLock |     1
>  2017-02-21 15:24:31.37512  | LWLockNamed   | ProcArrayLock |    19
>  2017-02-21 15:24:41.386974 | LWLockNamed   | ProcArrayLock |     1
>  2017-02-21 15:26:41.530399 | LWLockNamed   | ProcArrayLock |     1
> (10 rows)
> 
> Write pgbench runs have far more events logged; see the attached text
> file. Maybe this is of interest...
> 
> 
> [1] http://akorotkov.github.io/blog/2016/03/25/wait_monitoring_9_6/

This patch applies cleanly at cccbdde.  It doesn't break compilation on
amd64, but I can't test on a Power-based machine.

Alexander, have you had a chance to look at Bernd's results?

-- 
-David
da...@pgmasters.net




Re: [HACKERS] PinBuffer() no longer makes use of strategy

2017-03-16 Thread David Steele
On 2/4/17 2:47 PM, Alexander Korotkov wrote:
> On Sat, Feb 4, 2017 at 4:33 AM, Andres Freund wrote:
> > On 2017-02-03 19:13:45 -0600, Jim Nasby wrote:
> > > No, I noticed it while reading code. Removing that does mean that if any
> > > non-default strategy (in any backend) hits that buffer again then the
> > > buffer will almost certainly migrate into the main buffer pool the next
> > > time one of the rings hits that buffer.
> >
> > Well, as long as the buffer is used from the ring, BufferAlloc() will
> > reset the usagecount when rechristening the buffer. So unless anything
> > happens in between the buffer being remapped last and remapped next,
> > it'll be reused. Right?
> >
> > The only case where I can see the old logic mattering positively is for
> > synchronized seqscans.  For pretty much everything else that logic seems
> > worse, because it essentially prevents any buffers ever staying in s_b
> > when only ringbuffer accesses are performed.
> >
> > I'm tempted to put the old logic back, but more because this likely was
> > unintentional, not because I think it's clearly better.
>
> +1
> Yes, it was an unintentional change.  So we should put the old logic back
> unless we have proof that this change makes it better.
> Patch is attached.  I tried to add some comments, but probably they are
> not enough.

This patch looks pretty straightforward, and it applies cleanly and
compiles at cccbdde.

It's not a straight revert, though, so still seems to need review.

Jim, do you know when you'll have a chance to look at that?

-- 
-David
da...@pgmasters.net




Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 10:56 AM, Emre Hasegeli  wrote:
>> Hopefully, this time I got it correct.  Since I am unable to reproduce
>> the issue, I will again need your help in verifying the fix.
>
> It is not crashing with the new patch.  Thank you.

Thanks for confirming.  Some review comments on v2:

+    if (istate->pagetable)

Please compare explicitly to InvalidDsaPointer.

+    if (iterator->ptbase)
+        ptbase = iterator->ptbase->ptentry;
+    if (iterator->ptpages)
+        idxpages = iterator->ptpages->index;
+    if (iterator->ptchunks)
+        idxchunks = iterator->ptchunks->index;

Similarly.
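
That is, spell the tests out against the sentinel, or use the dsa.h helper
that does the same thing:

    if (iterator->ptbase != InvalidDsaPointer)
        ptbase = iterator->ptbase->ptentry;

    /* equivalently */
    if (DsaPointerIsValid(iterator->ptbase))
        ptbase = iterator->ptbase->ptentry;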

Dilip, please also provide a proposed commit message describing what
this is fixing.  Is it just the TBM_EMPTY case, or is there anything
else?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send

2017-03-16 Thread Robert Haas
On Mon, Jan 9, 2017 at 4:27 PM, Jonathon Nelson  wrote:
> [I have not done a rigid analysis, here, but...]
>
> I *think* libpq is the culprit here.
>
> walsender says "Hey, libpq - please send (up to) 128KB of data!" and doesn't
> "return" until it's "sent". Then it sends more.  Regardless of the
> underlying cause (nagle, tcp congestion control algorithms, umpteen
> different combos of hardware and settings, etc..) in almost every test I saw
> improvement (usually quite a bit). This was most easily observable with high
> bandwidth-delay product links, but my time in the lab is somewhat limited.

This seems plausible to me.  If it takes X amount of time for the
upper layers to put Y amount of data into libpq's buffers, that
imposes some limit on overall throughput.

I mean, is it not sufficient to know that the performance improvement
is happening?  If it's happening, there's an explanation for why it's
happening.

It would be good if somebody else could try to reproduce these results, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-03-16 Thread Robert Haas
On Thu, Feb 2, 2017 at 12:24 PM, David Fetter  wrote:
>> Also, somebody who wants a check like that isn't necessarily going
>> to want "no WHERE clause" training wheels.  So you're going to need
>> to think about facilities to enable or disable different checks.
>
> This is just the discussion I'd hoped for.  I'll draft up a patch in
> the next day or two, reflecting what's gone so far.

It looks like this was never produced, and it's been over a month.  A
patch that hasn't been updated in over a month and doesn't have
complete consensus doesn't seem like something we should still be
thinking about committing in the second half of March, so I'm going to
mark this Returned with Feedback.

On the substance of the issue, I think there's no problem with having
a module like this, and I think it's fine if it only handles the
WHERE-less case in the first version.  Somebody can add more later if
they want.  But naming the module in a generic way so that it lends
itself to such additions seems like a pretty good plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Microvacuum support for Hash Index

2017-03-16 Thread Ashutosh Sharma
>>> Sure, but we are not clearing it conditionally.  I am not sure how it
>>> will be cleared after recovery if it gets set during normal operation.
>>> Moreover, btree already clears a similar flag during replay (refer
>>> btree_xlog_delete).
>>
>> You were right. In case data checksums are enabled or wal_log_hints is set
>> to true, 'LH_PAGE_HAS_DEAD_TUPLES' will get WAL-logged and therefore
>> needs to be cleared on the standby as well.
>>
>
> I was thinking about what bad things can happen if we don't clear this
> flag during replay. The main thing that comes to mind is that after crash
> recovery, if the flag is set, inserts on that page might need to
> traverse all the tuples on that page once the page is full, even if
> there are no dead tuples on that page.  The flag can be cleared later when
> there are dead tuples on that page and we actually delete them, but I
> don't think that is a price worth paying for not clearing the flag
> during replay.

Yes, you are absolutely correct. If we do not clear this flag during
replay, there is a possibility of _hash_doinsert() unnecessarily
scanning a page with no space, on the assumption that the page has some
dead tuples in it, which is not true.
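
The replay-side clearing itself is small; roughly this (a sketch of the
idea, not the exact patch):

    /* in the hash redo routine, once the page is otherwise replayed */
    Page           page = BufferGetPage(buffer);
    HashPageOpaque pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);

    pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
    MarkBufferDirty(buffer);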

>
>> Attached is the patch that
>> clears this flag on standby during replay.
>>
>
> Don't you think we should also clear it during the replay of
> XLOG_HASH_DELETE?  We might want to log the clearing of the flag along
> with the WAL record for XLOG_HASH_DELETE.
>

Yes, it should be cleared. I completely missed this part in my hurry.
Thanks for pointing it out. I have taken care of it in the attached v2
patch.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


0001-Reset-LH_PAGE_HAS_DEAD_TUPLES-flag-on-standby-during.patch
Description: binary/octet-stream



[HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

2017-03-16 Thread David Steele
On 2/14/17 4:03 PM, Tom Lane wrote:
> Jim Nasby  writes:
>> On 2/14/17 1:18 PM, Tom Lane wrote:
>>> One point that could use further review is whether the de-duplication
>>> algorithm is actually correct.  I'm only about 95% convinced by the
>>> argument I wrote in planunionor.c's header comment.
> 
>> I'll put some thought into it and see if I can find any holes. Are you 
>> only worried about the removal of "useless" rels or is there more?
> 
> Well, the key point is whether it's really OK to de-dup on the basis
> of only the CTIDs that are not eliminated in any UNION arm.  I was
> feeling fairly good about that until I thought of the full-join-to-
> left-join-to-no-join conversion issue mentioned in the comment.
> Now I'm wondering if there are other holes; or maybe I'm wrong about
> that one and it's not necessary to be afraid of full joins.

This patch applies cleanly (with offsets) and compiles at cccbdde.

Jim, have you had time to think about this?  Any insights?

-- 
-David
da...@pgmasters.net




Re: [HACKERS] GSOC - TOAST'ing in slices

2017-03-16 Thread George Papadrosou
Hello all, 

thank you for your replies.  I agree with Alexander Korotkov that it is 
important to have a quality patch at the end of the summer. 

Stephen, you mentioned PostGIS, but the conversation seems to lean towards 
JSONB. What are your thoughts?

Also, if I am to include some ideas/approaches in the proposal, it seems I 
should really focus on understanding how a specific data type is used, queried 
and indexed, which is a lot of exploring for a newcomer in postgres code.

In the meanwhile, I am trying to find out how jsonb is indexed and queried.
After I grasp the current situation I will be able to think about new approaches.

Regards,
George 

> On 15 Mar 2017, at 15:53, Tom Lane wrote:
> 
> Robert Haas writes:
>> On Tue, Mar 14, 2017 at 10:03 PM, George Papadrosou
>>  wrote:
>>> The project’s idea is implement different slicing approaches according to
>>> the value’s datatype. For example a text field could be split upon character
>>> boundaries while a JSON document would be split in a way that allows fast
>>> access to it’s keys or values.
> 
>> Hmm.  So if you had a long text field containing multibyte characters,
>> and you split it after, say, every 1024 characters rather than after
>> every N bytes, then you could do substr() without detoasting the whole
>> field.  On the other hand, my guess is that you'd waste a fair amount
>> of space in the TOAST table, because it's unlikely that the chunks
>> would be exactly the right size to fill every page of the table
>> completely.  On balance it seems like you'd be worse off, because
>> substr() probably isn't all that common an operation.
> 
> Keep in mind also that slicing on "interesting" boundaries rather than
> with the current procrustean-bed approach could save you at most one or
> two chunk fetches per access.  So the upside seems limited.  Moreover,
> how are you going to know whether a given toast item has been stored
> according to your newfangled approach?  I doubt we're going to accept
> forcing a dump/reload for this.
> 
> IMO, the real problem here is to be able to predict which chunk(s) to
> fetch at all, and I'd suggest focusing on that part of the problem rather
> than changes to physical storage.  It's hard to see how to do anything
> very smart for text (except in the single-byte-encoding case, which is
> already solved).  But the JSONB format was designed with some thought
> to this issue, so you might be able to get some traction there.
> 
>   regards, tom lane
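
For what it's worth, predicting chunks is already trivial in the one case the
current scheme handles well, because chunk boundaries are fixed; a sketch
using the standard constants:

    /* which chunks hold bytes [off, off + len) of a toasted value? */
    int32   first_chunk = off / TOAST_MAX_CHUNK_SIZE;
    int32   last_chunk  = (off + len - 1) / TOAST_MAX_CHUNK_SIZE;

The hard part, per the quoted advice, is getting an equivalent prediction
when the interesting boundaries (characters, JSONB keys) are not at fixed
byte offsets.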



Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-03-16 Thread Jeff Janes
On Thu, Mar 9, 2017 at 4:59 AM, Michael Paquier wrote:

> On Thu, Mar 9, 2017 at 1:17 AM, Joe Conway  wrote:
> > On 03/07/2017 08:29 PM, Tom Lane wrote:
> >> Michael Paquier  writes:
> >>> here is a separate thread dedicated to the following extension for
> >>> CREATE/ALTER ROLE: PASSWORD ('value' USING 'method').
> >>
> >> The parentheses seem weird ... do we really need those?
> >
> > +1
>
> Seeing 3 opinions in favor of that, let's do so then. I have updated
> the patch to not use parenthesis.
>

The regression tests only exercise the CREATE ROLE...USING version, not the
ALTER ROLE...USING version.

+and plain for an non-hashed password.  If the password
+string is already in MD5-hashed or SCRAM-hashed, then it is
+stored hashed as-is.

In the last line, I think "stored as-is" sounds better.

Other than that, it looks good to me.

Cheers,

Jeff


Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-16 Thread Dilip Kumar
On Thu, Mar 16, 2017 at 8:42 PM, Robert Haas  wrote:
> Thanks for confirming.  Some review comments on v2:
>
> +    if (istate->pagetable)
>
> Please compare explicitly to InvalidDsaPointer.

fixed

> +    if (iterator->ptbase)
> +        ptbase = iterator->ptbase->ptentry;
> +    if (iterator->ptpages)
> +        idxpages = iterator->ptpages->index;
> +    if (iterator->ptchunks)
> +        idxchunks = iterator->ptchunks->index;
>
> Similarly.

fixed

Also fixed at:

+ if (ptbase)
+     pg_atomic_init_u32(&ptbase->refcount, 0);

>
> Dilip, please also provide a proposed commit message describing what
> this is fixing.  Is it just the TBM_EMPTY case, or is there anything
> else?

Okay, I have added the commit message in the patch.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


fix_tbm_empty_v3.patch
Description: Binary data



Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-16 Thread Dilip Kumar
On Thu, Mar 16, 2017 at 8:26 PM, Emre Hasegeli  wrote:
>> Hopefully, this time I got it correct.  Since I am unable to reproduce
>> the issue, I will again need your help in verifying the fix.
>
> It is not crashing with the new patch.  Thank you.

Thanks for verifying.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-16 Thread David Steele
On 2/21/17 4:58 AM, Mithun Cy wrote:
> Thanks, Amit
> 
> On Mon, Feb 20, 2017 at 9:51 PM, Amit Kapila  wrote:
>> How will high and lowmask calculations work in this new strategy?
>> Till now they always work on doubling strategy and I don't see you
>> have changed anything related to that code.  Check below places.
> 
> It is important that the mask be of the form (2^x) - 1 if we are to retain
> the same hash map function. So the mask variables will take the same values
> as before. The only place I think we need a change is _hash_metapinit();
> unfortunately, I did not test the case where we build the hash
> index on already existing tuples. Now I have fixed that in the latest
> patch.
> 
> 
>> Till now, we have worked hard for not changing the page format in a
>> backward incompatible way, so it will be better if we could find some
>> way of doing this without changing the meta page format in a backward
>> incompatible way.
> 
> We are not adding any new variables or deleting any; we increase the
> size of hashm_spares, and hence the mapping functions should be adjusted.
> The problem is the block allocation, and its management is based on
> the fact that all of the buckets (2^x in number) belonging to a
> particular split-point are allocated at once and together. The
> hashm_spares is used to record those allocations, and that will be used
> further by the map functions to reach a particular block in the file. If
> we want to change the way we allocate the buckets then hashm_spares
> will change and hence the mapping function. So I do not think we can avoid
> the incompatibility issue.
> 
> One thing I can think of is to increase the hashm_version of the
> hash index; then for old indexes we can continue to use the doubling
> method and its mapping, and for new indexes we can use the new way as above.
> 
>> Have you considered storing some information in
>> shared memory based on which we can decide what percentage of
>> buckets are allocated in the current table half?  I think we might be able
>> to construct this information after crash recovery as well.
> 
> I think all of the above data has to be persistent. I am not able to
> understand what should/can be stored in shared buffers. Can you
> please correct me if I am wrong?
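
For reference, the invariant being defended is the one the bucket mapping
relies on; the existing logic is roughly:

    /* masks must stay of the form 2^n - 1 for this to work */
    bucket = hashkey & highmask;
    if (bucket > maxbucket)
        bucket = bucket & lowmask;

which is why any new allocation strategy still has to keep lowmask and
highmask as all-ones bit patterns.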

This patch does not apply at cccbdde:

$ patch -p1 < ../other/expand_hashbucket_efficiently_02
patching file src/backend/access/hash/hashovfl.c
Hunk #1 succeeded at 49 (offset 1 line).
Hunk #2 succeeded at 67 (offset 1 line).
patching file src/backend/access/hash/hashpage.c
Hunk #1 succeeded at 502 with fuzz 1 (offset 187 lines).
Hunk #2 succeeded at 518 with fuzz 2 (offset 168 lines).
Hunk #3 succeeded at 562 (offset 163 lines).
Hunk #4 succeeded at 744 (offset 124 lines).
Hunk #5 FAILED at 774.
Hunk #6 succeeded at 869 (offset 19 lines).
Hunk #7 succeeded at 1450 (offset 242 lines).
Hunk #8 succeeded at 1464 (offset 242 lines).
Hunk #9 succeeded at 1505 (offset 242 lines).
1 out of 9 hunks FAILED -- saving rejects to file
src/backend/access/hash/hashpage.c.rej
patching file src/backend/access/hash/hashutil.c
Hunk #1 succeeded at 150 (offset 1 line).
patching file src/include/access/hash.h
Hunk #2 succeeded at 180 (offset 12 lines).
Hunk #3 succeeded at 382 (offset 18 lines).

It does apply with fuzz on 2b32ac2, so it looks like c11453c and
subsequent commits are the cause.  They represent a fairly substantial
change to hash indexes by introducing WAL logging so I think you should
reevaluate your patches to be sure they still function as expected.

Marked "Waiting on Author".

-- 
-David
da...@pgmasters.net




Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-16 Thread Tom Lane
Andres Freund  writes:
> On 2017-03-15 20:09:03 -0400, Tom Lane wrote:
>> That scares me quite a bit, because it smells exactly like the sort of
>> premature optimization that bit us on the rear in CVE-2016-5423 (cf commit
>> f0c7b789a).

> I don't think there's a danger similar to f0c7b789a here, because the
> "caller" (i.e. the node that needs the expression's result) expects
> resvalue/null to be overwritten.

Yeah, that's what I thought when I wrote the broken code in ExecEvalCase,
too.  It was wrong.  Basically you've got to be sure that no aliasing
can occur, and I think the only way to be safe about that is to have a
very clear rule about where results are allowed to get returned to,
preferably one that doesn't ever re-use the same target.  (I think the
repeated use of the same subexpression result address for the arms of
an AND or OR is okay, but it would be a good idea to have a clear
statement of why.)

The thing that actually made the ExecEvalCase code into a bug was that
we were using ExprContext-level fields to store the current caseValue,
allowing aliasing to occur across nested CASEs.  I think that in
this implementation, it ought to be possible to get rid of
ExprContext.caseValue_datum et al altogether, in favor of some storage
location that's private to each CASE expression.  I'm a bit disappointed
that that doesn't seem to have happened.

Eventually, I would also like to find a way to remove the restriction
imposed by the other part of f0c7b789a, ie that we can't inline a SQL
function when that would result in intermixing two levels of CASE
expression.  An implementation along the lines of what I've sketched
above could handle that easily enough, as long as we could identify
which nested level of CASE a particular CaseTestExpr belongs to.
I don't know how to do that offhand, but it seems like it ought to be
soluble if we put a bit of thought into it.

regards, tom lane




[HACKERS] Re: [COMMITTERS] pgsql: Remove objname/objargs split for referring to objects

2017-03-16 Thread Alvaro Herrera
Michael Paquier wrote:

> What are you using as CFLAGS? As both typenames should be normally
> set, what about initializing those fields with NULL and add an
> assertion like the attached?

Actually, my compiler was right -- this was an ancient bug I introduced
in 9.5 (commit a61fd533), and this new warning was my compiler being a
bit smarter now for some reason.  The problem is we were trying to
extract String value from a TypeName node, which obviously doesn't work
very well.

I pushed a real fix, not just a compiler-silencer, along with a few
lines in object_address.sql to make sure it works properly.  Maybe we
need a few more tests cases for other parts of pg_get_object_address.

Pushed fix, backpatched to 9.5.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Re: [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send

2017-03-16 Thread David Steele
On 3/16/17 11:53 AM, Jon Nelson wrote:
> On Thu, Mar 16, 2017 at 9:59 AM, David Steele wrote:
> > On 1/9/17 11:33 PM, Jon Nelson wrote:
> > > On Sat, Jan 7, 2017 at 7:48 PM, Jim Nasby wrote:
> > > > On 1/5/17 12:55 PM, Jonathon Nelson wrote:
> > > > > Attached please find a patch for PostgreSQL 9.4 which changes the
> > > > > maximum amount of data that the wal sender will send at any point in
> > > > > time from the hard-coded value of 128KiB to a user-controllable value
> > > > > up to 16MiB. It has been primarily tested under 9.4 but there has
> > > > > been some testing with 9.5.
> > > >
> > > > To make sure this doesn't get lost, please add it to
> > > > https://commitfest.postgresql.org. Please verify the patch will
> > > > apply against current HEAD and pass make check-world.
> > >
> > > Attached please find a revision of the patch, changed in the following ways:
> > >
> > > 1. removed a call to debug2.
> > > 2. applies cleanly against master (as of
> > > 8c5722948e831c1862a39da2bb5d793a6f2aabab)
> > > 3. one small indentation fix, one small verbiage fix.
> > > 4. switched to calculating the upper bound using XLOG_SEG_SIZE rather
> > > than hard-coding 16384.
> > > 5. the git author is - obviously - different.
> > >
> > > make check-world passes.
> > > I have added it to the commitfest.
> > > I have verified with strace that up to 16MB sends are being used.
> > > I have verified that the GUC properly grumps about values greater than
> > > XLOG_SEG_SIZE / 1024 or smaller than 4.
> >
> > This patch applies cleanly on cccbdde and compiles.  However,
> > documentation in config.sgml is needed.
> >
> > The concept is simple enough though there seems to be some argument
> > about whether or not the patch is necessary.  In my experience 128K
> > should be more than large enough for a chunk size, but I'll buy the
> > argument that libpq is acting as a barrier in this case.
> >
> > I'm marking this patch "Waiting on Author" for required documentation.
> 
> 
> Thank you for testing and the comments.  I have some updates:
> 
> - I set up a network at home and - in some very quick testing - was
> unable to observe any obvious performance difference regardless of chunk
> size
> - Before I could get any real testing done, one of the machines I was
> using for testing died and won't even POST, which has put a damper on
> said testing (as you might imagine).
> - There is a small issue with the patch: a lower-bound of 4 is not
> appropriate; it should be XLOG_BLCKSZ / 1024 (I can submit an updated
> patch if that is appropriate)
> - I am, at this time, unable to replicate the earlier results however I
> can't rule them out, either.

My recommendation is that we mark this patch "Returned with Feedback" to
allow you time to test and refine the patch.  You can resubmit once it
is ready.

Thanks,
-- 
-David
da...@pgmasters.net




[HACKERS] Re: Floating point comparison inconsistencies of the geometric types

2017-03-16 Thread David Steele
On 2/1/17 6:36 AM, Emre Hasegeli wrote:
>> Got it, but if other people don't agree then this is going nowhere.
> 
> Yes.  As I wrote, I don't particularly care about functions like "is
> point on line".  I can prepare a patch to fix as many problems as
> possible around those operators by preserving the current epsilon.
> 
> I thought we were arguing about *all* operators.  Having containment
> and placement operators consistent with each other is the primary
> thing I am trying to fix.  Is removing the epsilon from them
> acceptable?
> 
> We can also stay away from changing operators like "~=" to minimise
> the compatibility break, if we keep the epsilon in some places.  We can
> instead document this operator as "close enough", and introduce
> another symbol for a real "the same" operator.
> 
> That said, there are some places where it is hard to decide whether to
> apply the epsilon or not.  For example, we can keep the epsilon to check
> whether two lines are parallel, but then should we return the intersection
> point, or not?  Those issues may become clearer when I start
> working on it, if preserving the epsilon for those operators is the way
> to go forward.

The current patches do not apply cleanly at cccbdde:

$ git apply ../other/0001-float-header-v03.patch
error: patch failed: contrib/btree_gist/btree_ts.c:1
error: contrib/btree_gist/btree_ts.c: patch does not apply
error: patch failed: contrib/postgres_fdw/postgres_fdw.c:26
error: contrib/postgres_fdw/postgres_fdw.c: patch does not apply
error: patch failed: src/backend/access/gist/gistutil.c:14
error: src/backend/access/gist/gistutil.c: patch does not apply
error: patch failed: src/backend/utils/adt/float.c:339
error: src/backend/utils/adt/float.c: patch does not apply
error: patch failed: src/backend/utils/adt/geo_ops.c:14
error: src/backend/utils/adt/geo_ops.c: patch does not apply
error: patch failed: src/backend/utils/misc/guc.c:68
error: src/backend/utils/misc/guc.c: patch does not apply
error: patch failed: src/include/utils/builtins.h:334
error: src/include/utils/builtins.h: patch does not apply

I don't believe this patch should be in the "Needs review" state anyway.
There are clearly a number of issues that need work and agreement.

Given that this thread has been idle since the beginning of February and
no resolution is likely for v10, I'm marking this submission "Returned
with Feedback".

-- 
-David
da...@pgmasters.net




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-03-16 Thread David Steele
On 2/1/17 3:59 PM, Pavel Stehule wrote:
> Hi
> 
> 2017-01-24 21:33 GMT+01:00 Pavel Stehule  >:
> 
> Perhaps that's as simple as renaming all the existing _ns_*
> functions to _block_ and then adding support for pragmas...
> 
> Since you're adding cursor_options to PLpgSQL_expr it should
> probably be removed as an option to exec_*.
> 
> I have to recheck it. Some cursor options going from dynamic
> cursor variables and are related to dynamic query - not query
> that creates query string.  
> 
> hmm .. so current state is better due using options like
> CURSOR_OPT_PARALLEL_OK
> 
>  if (expr->plan == NULL)
> exec_prepare_plan(estate, expr, (parallelOK ?
>   CURSOR_OPT_PARALLEL_OK : 0) |
> expr->cursor_options);
> 
> This option is not a permanent feature of the expression - so I
> cannot remove the cursor_options argument from exec_*
> 
> I did some minor cleanup - removed cursor_options from plpgsql_var
> 
> + basic doc

This patch still applies cleanly and compiles at cccbdde.

Any reviewers want to have a look?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread David Steele
On 2/5/17 11:04 AM, Andrew Borodin wrote:
> Hi, Jeff!
> 
> 2017-02-05 3:45 GMT+05:00 Jeff Davis :
>> On Sun, Jan 22, 2017 at 10:32 PM, Jeff Davis  wrote:
>>> On Sat, Jan 21, 2017 at 4:25 AM, Andrew Borodin  
>>> wrote:
>>> One idea I had that might be simpler is to use a two-stage page
>>> delete. The first stage would remove the link from the parent and mark
>>> the page deleted, but leave the right link intact and prevent
>>> recycling. The second stage would follow the chain of right links
>>> along each level, removing the right links to deleted pages and
>>> freeing the page to be recycled.
>>
>> Do you think this approach is viable as a simplification?
> 
> To consider this approach I need to ask several questions.
> 
> 1. Are we discussing simplification of the existing GIN vacuum? Or the
> proposed GIN vacuum? Please note that they do not differ in the way a
> page is unlinked; the function ginDeletePage() is almost untouched.
> 
> 2. What do you mean by "stage"? In terms of the paper "A symmetric
> concurrent B-tree algorithm" by Lanin: a stage is an
> uninterrupted period of holding locks on a nonempty page set.
> Here's the picture https://www.screencast.com/t/xUpGKgkkU from L
> Both papers (L and L) tend to avoid lock coupling, which is
> inevitable if you want to do the parent unlink first. To prevent insertion
> of records on a page, you have to mark it. If you are doing this in
> the same stage in which you unlink from the parent - you have to hold both locks.
> 
> 3. What do you want to simplify? Currently we unlink a page from its
> parent and left sibling in one stage, at the cost of acquiring a CleanUp
> lock (the way we acquire it is the difference between the current and
> patched versions).
> A 2-stage algorithm will not be simpler, yet it will be more concurrent.
> Please note that in the absence of high fence keys in the GIN B-tree we
> actually should implement the algorithm from figure 3A
> https://www.screencast.com/t/2cfGZtrzaz0z  (it would be incorrect only
> in the presence of high keys)
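
In rough C terms, the two-stage idea quoted above could look like the
sketch below (GinPageGetOpaque() and GIN_DELETED are the real GIN flag
machinery; the function shapes and the elided helpers are hypothetical,
a sketch of the idea rather than the actual patch):

/*
 * Stage 1: unlink the page from its parent and mark it deleted, but
 * leave the right link intact so concurrent scans can still traverse
 * it.  The page must not be recycled yet.
 */
static void
gin_delete_stage_one(Buffer parent, Buffer leaf)
{
    /* caller holds exclusive locks on both buffers */

    /* remove the downlink in the parent (hypothetical helper), then
     * flag the leaf page as deleted */
    GinPageGetOpaque(BufferGetPage(leaf))->flags |= GIN_DELETED;
}

/*
 * Stage 2: walk each level along right links; whenever the right
 * neighbor is marked deleted, repoint our right link past it, and only
 * then hand the skipped page over for recycling.
 */
static void
gin_delete_stage_two(Buffer leftmost)
{
    /* chain walk elided in this sketch */
}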

This patch applies cleanly and compiles at cccbdde.

Jeff, any thoughts on Andrew's responses?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-03-16 Thread David Steele
On 3/16/17 12:41 PM, Robert Haas wrote:
> On Thu, Mar 16, 2017 at 12:39 PM, David Steele  wrote:
>>> Anyway, I committed the patch posted here.  Or the important line out
>>> of the two, anyway.  :-)
>>
>> It seems that this submission should be marked as "Committed" with
>> Robert as the committer.  Am I missing something?
> 
> I think you are right.  Sorry that I missed that step.

Done.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 12:39 PM, David Steele  wrote:
>> Anyway, I committed the patch posted here.  Or the important line out
>> of the two, anyway.  :-)
>
> It seems that this submission should be marked as "Committed" with
> Robert as the committer.  Am I missing something?

I think you are right.  Sorry that I missed that step.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw bug in 9.6

2017-03-16 Thread David Steele
On 1/23/17 4:56 AM, Etsuro Fujita wrote:
> On 2017/01/20 14:24, Ashutosh Bapat wrote:
>> On Thu, Jan 19, 2017 at 10:38 PM, Robert Haas 
>> wrote:
>>> On Wed, Jan 18, 2017 at 10:26 PM, Ashutosh Bapat
>>>  wrote:
> Yes, I think that's broadly the approach Tom was recommending.
> 
 I don't have problem with that approach, but the latest patch has
 following problems.
> 
 2. There are many cases where the new function would return no local
 path and hence postgres_fdw doesn't push down the join [1]. This will
 be a performance regression compared to 9.6.
> 
>>> Some, or many?  How many?
> 
>> AFAIU, the problem esp. with parameterized paths is this: If rel1 is
>> required to be parameterized by rel2 (because of lateral references?),
>> a nested loop join with rel2 as outer relation and rel1 as inner
>> relation is possible. postgres_fdw and hence this function, however
>> doesn't consider all the possible join combinations and thus when this
>> function is presented with rel1 as outer and rel2 as inner would
>> refuse to create a path. More generally, while creating local paths,
>> we try many combinations of participating relations, some of which do
>> not produce local paths and some of them do (AFAIK, it happens in case
>> of lateral references, but there might be other cases). postgres_fdw,
>> however considers only a single combination, which may or may not have
>> produced local path. In such a case, postgres_fdw would create a
>> foreign join path but won't get a local path and thus bail out.
> 
> I had second thoughts; one idea how to build parameterized paths to
> avoid this issue is: as in postgresGetForeignPaths, to (1) identify
> which outer relations could supply additional safe-to-send-to-remote
> join clauses, and (2) build a parameterized path and its alternative
> local-join path for each such outer relation.  In #1, we could look at
> the join relation's ppilist to identify interesting outer relations.  In
> #2, the local-join path corresponding to each such outer relation could
> be built from the cheapest-total paths for the outer and inner
> relations, using CreateLocalJoinPath, so that the result path has the
> outer relation as its required_outer.
> 
>>> I'm a bit sketchy about this kind of thing, though:
>>>
>>> ! if (required_outer)
>>>   {
>>> ! bms_free(required_outer);
>>> ! return NULL;
>>>   }
>>>
>>> I don't understand what would stop that from making this fail to
>>> generate parameterized paths in some cases in which it would be legal
>>> to do so, and the comment is very brief.
> 
>> I am not so much worried about this though :).
>> GetExistingLocalJoinPath() also does not handle that case. The new
>> function is not making it worse in this case.
>> 731 /* Skip parameterised paths. */
>> 732 if (path->param_info != NULL)
>> 733 continue;
> 
> One idea to remove such extra checks is to pass the required_outer to
> CreateLocalJoinPath like the attached.  As described above, the caller
> would have that set before calling that function, so we wouldn't need to
> calculate that set in that function.
> 
> Other changes:
> 
> * Also modified CreateLocalJoinPath so that we pass outer/inner paths,
> not outer/inner rels, because it would be more flexible for the FDW to
> build the local-join path from paths it chose.
> * Fixed a bug that I missed the RIGHT JOIN case in CreateLocalJoinPath.
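
A rough pseudocode rendering of the ppilist idea quoted above (ppilist
and ppi_req_outer are real planner fields; CreateLocalJoinPath is the
function the patch proposes, and add_foreign_join_path is a made-up
placeholder for the path-construction step):

ListCell   *lc;

foreach(lc, joinrel->ppilist)
{
    ParamPathInfo *ppi = (ParamPathInfo *) lfirst(lc);
    Relids         required_outer = ppi->ppi_req_outer;
    Path          *epq_path;

    /* build the alternative local-join path for EPQ rechecks */
    epq_path = CreateLocalJoinPath(root, joinrel,
                                   outerrel->cheapest_total_path,
                                   innerrel->cheapest_total_path,
                                   required_outer);
    if (epq_path == NULL)
        continue;           /* no safe local join for this case */

    /* build the parameterized foreign-join path, with epq_path as its
     * fdw_outerpath and required_outer as its parameterization */
    add_foreign_join_path(root, joinrel, required_outer, epq_path);
}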

This patch does not apply cleanly at cccbdde:

$ patch -p1 < ../other/epqpath-for-foreignjoin-5.patch
patching file contrib/postgres_fdw/expected/postgres_fdw.out
Hunk #6 succeeded at 4134 (offset 81 lines).
Hunk #7 succeeded at 4275 (offset 81 lines).
patching file contrib/postgres_fdw/postgres_fdw.c
Hunk #1 succeeded at 4356 (offset 3 lines).
Hunk #2 succeeded at 4386 (offset 3 lines).
patching file contrib/postgres_fdw/sql/postgres_fdw.sql
patching file doc/src/sgml/fdwhandler.sgml
patching file src/backend/foreign/foreign.c
Hunk #2 FAILED at 696.
1 out of 2 hunks FAILED -- saving rejects to file
src/backend/foreign/foreign.c.rej
patching file src/backend/optimizer/path/joinpath.c
Hunk #1 FAILED at 25.
Hunk #2 succeeded at 109 (offset 27 lines).
Hunk #3 succeeded at 134 (offset 27 lines).
Hunk #4 succeeded at 197 (offset 27 lines).
Hunk #5 succeeded at 208 (offset 27 lines).
Hunk #6 succeeded at 225 (offset 27 lines).
Hunk #7 succeeded at 745 (offset 113 lines).
Hunk #8 FAILED at 894.
Hunk #9 succeeded at 1558 (offset 267 lines).
Hunk #10 succeeded at 1609 (offset 268 lines).
2 out of 10 hunks FAILED -- saving rejects to file
src/backend/optimizer/path/joinpath.c.rej
patching file src/include/foreign/fdwapi.h
Hunk #1 succeeded at 237 (offset 2 lines).
patching file src/include/nodes/relation.h
Hunk #1 succeeded at 914 (offset 10 lines).
Hunk #2 succeeded at 2057 (offset 47 lines).

Marked "Waiting on 

Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-16 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> I don't think there's a danger similar to f0c7b789a here, because the
>> "caller" (i.e. the node that needs the expression's result) expects
>> resvalue/null to be overwritten.

> Yeah, that's what I thought when I wrote the broken code in ExecEvalCase,
> too.  It was wrong.

Along the same line, I notice that you've got some expr step types
overwriting their own input, the various flavors of EEOP_BOOLTEST for
example.  Maybe that's all right but it doesn't really give me a warm
feeling, especially when other single-argument operations like
EEOP_BOOL_NOT_STEP are done differently.  Again, I think a clear
explanation of the design is essential to allow people to reason about
whether this sort of trickery is safe.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: recursive json_populate_record()

2017-03-16 Thread David Steele
On 2/1/17 12:53 AM, Michael Paquier wrote:
> On Thu, Jan 26, 2017 at 6:49 AM, Tom Lane  wrote:
>> Nikita Glukhov  writes:
>>> On 25.01.2017 23:58, Tom Lane wrote:
 I think you need to take a second look at the code you're producing
 and realize that it's not so clean either.  This extract from
 populate_record_field, for example, is pretty horrid:
>>
>>> But what if we introduce some helper macros like this:
>>
>>> #define JsValueIsNull(jsv) \
>>>  ((jsv)->is_json ? !(jsv)->val.json.str \
>>>  : !(jsv)->val.jsonb || (jsv)->val.jsonb->type == jbvNull)
>>
>>> #define JsValueIsString(jsv) \
>>>  ((jsv)->is_json ? (jsv)->val.json.type == JSON_TOKEN_STRING \
>>>  : (jsv)->val.jsonb && (jsv)->val.jsonb->type == jbvString)
>>
>> Yeah, I was wondering about that too.  I'm not sure that you can make
>> a reasonable set of helper macros that will fix this, but if you want
>> to try, go for it.
>>
>> BTW, just as a stylistic thing, I find "a?b:c||d" unreadable: I have
>> to go back to the manual every darn time to convince myself whether
>> that means (a?b:c)||d or a?b:(c||d).  It's better not to rely on
>> the reader (... or the author) having memorized C's precedence rules
>> in quite that much detail.  Extra parens help.
> 
> Moved to CF 2017-03 as discussion is going on and more review is
> needed on the last set of patches.
> 
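
As an aside on the precedence point above: in C, "?:" binds more loosely
than "||", so a?b:c||d parses as a ? b : (c || d).  A tiny standalone
check (illustrative only):

#include <stdio.h>

int
main(void)
{
    int a = 1, b = 0, c = 0, d = 1;

    /* "?:" has lower precedence than "||" */
    printf("%d\n", a ? b : c || d);     /* 0: parsed as a ? b : (c || d) */
    printf("%d\n", (a ? b : c) || d);   /* 1: the other reading differs */
    return 0;
}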

The current patches do not apply cleanly at cccbdde:

$ git apply ../other/0001-introduce-JsonContainerXxx-macros-v04.patch
error: patch failed: src/backend/utils/adt/jsonb_util.c:328
error: src/backend/utils/adt/jsonb_util.c: patch does not apply
error: patch failed: src/backend/utils/adt/jsonfuncs.c:1266
error: src/backend/utils/adt/jsonfuncs.c: patch does not apply
error: patch failed: src/include/utils/jsonb.h:218
error: src/include/utils/jsonb.h: patch does not apply

In addition, it appears a number of suggestions have been made by Tom
that warrant new patches.  Marked "Waiting on Author".

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send

2017-03-16 Thread Jon Nelson
On Thu, Mar 16, 2017 at 9:59 AM, David Steele  wrote:

> On 1/9/17 11:33 PM, Jon Nelson wrote:
> >
> > On Sat, Jan 7, 2017 at 7:48 PM, Jim Nasby  > > wrote:
> >
> > On 1/5/17 12:55 PM, Jonathon Nelson wrote:
> >
> > Attached please find a patch for PostgreSQL 9.4 which changes the
> > maximum amount of data that the wal sender will send at any
> point in
> > time from the hard-coded value of 128KiB to a user-controllable
> > value up
> > to 16MiB. It has been primarily tested under 9.4 but there has
> > been some
> > testing with 9.5.
> >
> >
> > To make sure this doesn't get lost, please add it to
> > https://commitfest.postgresql.org
> > . Please verify the patch will
> > apply against current HEAD and pass make check-world.
> >
> >
> > Attached please find a revision of the patch, changed in the following
> ways:
> >
> > 1. removed a call to debug2.
> > 2. applies cleanly against master (as of
> > 8c5722948e831c1862a39da2bb5d793a6f2aabab)
> > 3. one small indentation fix, one small verbiage fix.
> > 4. switched to calculating the upper bound using XLOG_SEG_SIZE rather
> > than hard-coding 16384.
> > 5. the git author is - obviously - different.
> >
> > make check-world passes.
> > I have added it to the commitfest.
> > I have verified with strace that up to 16MB sends are being used.
> > I have verified that the GUC properly grumps about values greater than
> > XLOG_SEG_SIZE / 1024 or smaller than 4.
>
> This patch applies cleanly on cccbdde and compiles.  However,
> documentation in config.sgml is needed.
>
> The concept is simple enough though there seems to be some argument
> about whether or not the patch is necessary.  In my experience 128K
> should be more than large enough for a chunk size, but I'll buy the
> argument that libpq is acting as a barrier in this case.
>
> I'm marking this patch "Waiting on Author" for the required documentation.
>

Thank you for testing and the comments.  I have some updates:

- I set up a network at home and - in some very quick testing - was unable
to observe any obvious performance difference regardless of chunk size
- Before I could get any real testing done, one of the machines I was using
for testing died and won't even POST, which has put a damper on said
testing (as you might imagine).
- There is a small issue with the patch: a lower bound of 4 is not
appropriate; it should be XLOG_BLCKSZ / 1024 (I can submit an updated patch
if that is appropriate)
- I am, at this time, unable to replicate the earlier results; however, I
can't rule them out, either.
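
For reference, the guc.c entry could then look roughly like this with the
corrected lower bound (the GUC name and variable follow the patch; the
exact entry is a sketch, not committed code):

/* working value, in KB; kept alongside the other walsender GUCs */
static int max_wal_send_kb;

{
    {"max_wal_send", PGC_SIGHUP, REPLICATION_SENDING,
        gettext_noop("Sets the maximum amount of WAL sent per walsender round."),
        NULL,
        GUC_UNIT_KB
    },
    &max_wal_send_kb,
    128,                    /* default: the old hard-coded 128KiB */
    XLOG_BLCKSZ / 1024,     /* corrected lower bound */
    XLOG_SEG_SIZE / 1024,   /* upper bound: one WAL segment */
    NULL, NULL, NULL
},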


--
Jon


Re: [HACKERS] Push down more full joins in postgres_fdw

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 11:46 AM, David Steele  wrote:
> $ patch -p1 < ../other/postgres-fdw-subquery-support-v15.patch
> patching file contrib/postgres_fdw/deparse.c
> Hunk #11 succeeded at 1371 (offset -3 lines).
> Hunk #12 succeeded at 1419 (offset -3 lines).
> Hunk #13 succeeded at 1486 (offset -3 lines).
> Hunk #14 succeeded at 2186 (offset -3 lines).
> Hunk #15 succeeded at 3082 (offset -3 lines).
> patching file contrib/postgres_fdw/expected/postgres_fdw.out
> patching file contrib/postgres_fdw/postgres_fdw.c
> Hunk #1 succeeded at 669 (offset 1 line).
> Hunk #2 succeeded at 1245 (offset -1 lines).
> Hunk #3 succeeded at 2557 (offset -1 lines).
> Hunk #4 succeeded at 4157 (offset 3 lines).
> Hunk #5 succeeded at 4183 (offset 3 lines).
> Hunk #6 succeeded at 4212 (offset 3 lines).
> Hunk #7 succeeded at 4315 (offset 3 lines).
> patching file contrib/postgres_fdw/postgres_fdw.h
> patching file contrib/postgres_fdw/sql/postgres_fdw.sql
>
> Since these are just offsets I'll leave the patch as "Needs review" but
> an updated patch would be appreciated.

I don't think that's really needed.  Offsets don't hurt anything.
Even fuzz is OK.  As long as the hunks are applying, I think it's
fine.

Incidentally, I'm reading through this one now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash on promotion when recovery.conf is renamed

2017-03-16 Thread David Steele
On 12/20/16 2:54 AM, Michael Paquier wrote:
> On Sat, Dec 17, 2016 at 9:23 PM, Magnus Hagander  wrote:
>> On Fri, Dec 16, 2016 at 7:08 AM, Michael Paquier 
>> wrote:
>>> Looking at PrescanPreparedTransactions(), I am thinking as well that it
>>> would
>>> be better to get a hard failure when bumping on a corrupted 2PC file.
>>> Future
>>> files are one thing, but corrupted files should be treated more carefully.
>>
>>
>> Again without looking at it, I agree (so much easier that way :P). Ignoring
>> corruption is generally a bad idea. Failing hard makes the user notice the
>> error, and makes it possible to initiate recovery from a standby or from
>> backups or something, or to *intentionally* remove/clear/ignore it.
> 
> And I am finishing with the two patches attached:
> - 0001 changes the 2PC checks so that corrupted entries are FATAL.
> PreScanPreparedTransaction is used when a hot standby is initialized.
> In this case a failure protects the range of XIDs generated,
> potentially saving the data from corruption. At the end of recovery,
> this is done before any on-disk actions are taken.
> - 0002 is the thing that Heikki has sent previously to minimize the
> window between end-of-recovery record write and timeline history file
> archiving.
> 
> I am attaching that to next CF.

This patch still applies cleanly and compiles at cccbdde.

Any idea when you'll have a chance to review?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Renaming of pg_xlog and pg_clog

2017-03-16 Thread David Steele
On 1/17/17 2:31 AM, Michael Paquier wrote:
> On Tue, Nov 29, 2016 at 1:35 PM, Michael Paquier
>  wrote:
>> On Tue, Nov 22, 2016 at 8:35 PM, Haribabu Kommi
>>  wrote:
>>> Hi Craig,
>>>
>>> This is a gentle reminder.
>>>
>>> you assigned as reviewer to the current patch in the 11-2016 commitfest.
>>> But you haven't shared your review yet. Please share your review about
>>> the patch. This will help us in smoother operation of commitfest.
>>>
>>> Please Ignore if you already shared your review.
>>
>> I have moved this CF entry to 2017-01, the remaining, still unreviewed
>> patch are for renaming pg_subxact and pg_clog.
> 
> The second patch has rotten a little because of the backup
> documentation. By the way, is something going to happen in the CF?
> Craig, you are a reviewer of this patch.

This patch does not apply cleanly at cccbdde:

$ git apply ../other/0001-Rename-pg_clog-to-pg_xact.patch
error: doc/src/sgml/ref/pg_resetxlog.sgml: No such file or directory
error: patch failed: src/backend/postmaster/autovacuum.c:2468
error: src/backend/postmaster/autovacuum.c: patch does not apply

Marked "Waiting on Author".

I'd really like to see the rest of the renames happen for v10.  It seems
like the process got stalled after the pg_wal rename.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Push down more full joins in postgres_fdw

2017-03-16 Thread David Steele
On 1/30/17 6:30 AM, Etsuro Fujita wrote:
> On 2017/01/27 21:25, Etsuro Fujita wrote:
>> On 2017/01/27 20:04, Ashutosh Bapat wrote:
>>> I think we should pick up your patch on
>>> 27th December, update the comment per your mail on 5th Jan. I will
>>> review it once and list down the things left to committer's judgement.
> 
>> Sorry, I started thinking we went in the wrong direction.  I added to
>> deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for
>> each subquery present in a given join tree prior to deparsing a whole
>> remote query.  But that's nothing but an overhead; we need to create a
>> tlist for the top-level query because we use it as fdw_scan_tlist, but
>> not for subqueries, and we need to create retrieved_attrs for the
>> top-level query while deparsing the targetlist in
>> deparseExplicitTargetList, but not for subqueries.  So, we would need
>> some work to avoid such useless overhead.
> 
> I think we can avoid the useless overhead by adding a new function,
> deparseSubqueryTargetList, that deparses expressions in the given
> relation's reltarget, not the tlist, as a SELECT clause of the subquery
> representing the given relation.  That would also allow us to make the
> 1-to-1 relationship between the subquery output columns and their
> aliases more explicit, which was your original comment.  Please find
> attached the new version.  (The patch doesn't need the patch to avoid
> duplicate construction of the tlist, discussed upthread.)
> 
> Other changes:
> * I went back to make_outerrel_subquery and make_innerrel_subquery,
> which are flags to indicate whether to deparse the input relations as
> subqueries.  is_subquery_rel would work well for handling the cases of
> full joins with restrictions on the input relations, but I noticed that
> that wouldn't work well when extending to handle the cases where we
> deparse the input relations as subqueries to evaluate PHVs remotely.
> * Since appendSubqueryAlias in the previous patch is pretty small, I
> included the code into deparseRangeTblRef.

This patch does not apply cleanly at cccbdde:

$ patch -p1 < ../other/postgres-fdw-subquery-support-v15.patch
patching file contrib/postgres_fdw/deparse.c
Hunk #11 succeeded at 1371 (offset -3 lines).
Hunk #12 succeeded at 1419 (offset -3 lines).
Hunk #13 succeeded at 1486 (offset -3 lines).
Hunk #14 succeeded at 2186 (offset -3 lines).
Hunk #15 succeeded at 3082 (offset -3 lines).
patching file contrib/postgres_fdw/expected/postgres_fdw.out
patching file contrib/postgres_fdw/postgres_fdw.c
Hunk #1 succeeded at 669 (offset 1 line).
Hunk #2 succeeded at 1245 (offset -1 lines).
Hunk #3 succeeded at 2557 (offset -1 lines).
Hunk #4 succeeded at 4157 (offset 3 lines).
Hunk #5 succeeded at 4183 (offset 3 lines).
Hunk #6 succeeded at 4212 (offset 3 lines).
Hunk #7 succeeded at 4315 (offset 3 lines).
patching file contrib/postgres_fdw/postgres_fdw.h
patching file contrib/postgres_fdw/sql/postgres_fdw.sql

Since these are just offsets I'll leave the patch as "Needs review" but
an updated patch would be appreciated.

Thanks,
-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench more operators & functions

2017-03-16 Thread David Steele
On 2/4/17 4:51 AM, Fabien COELHO wrote:
> 
> Hello,
> 
>> For my 2c, at least, while I'm definitely interested in this, it's not
>> nearly high enough on my plate with everything else going on to get any
>> attention in the next few weeks, at least.
>>
>> I do think that, perhaps, this patch may deserve a bit of a break, to
>> allow people to come back to it with a fresh perspective, so perhaps
>> moving it to the next commitfest would be a good idea, in a Needs Review
>> state.
> 
> So, let's try again for the next CF...
> 
> Here is a v9 which includes some more cleanup, hopefully in the expected
> direction, which is to make pgbench expressions behave like SQL
> expressions, and which I hope takes into account all other feedback as well.
> 
> 
> CONTEXT
> 
> Pgbench has been given an expression parser (878fdcb8) which allows the
> use of full expressions instead of doing one-at-a-time operations. This
> parser has been extended with functions (7e137f84) & the double type
> (86c43f4e). The first batch of functions was essentially a PoC about how
> to add new functions with various requirements. Pgbench's default
> "tpcb-like" test takes advantage of these additions to reduce the number
> of lines it needs.
> 
> 
> MOTIVATION
> 
> This patch aims at providing actually useful functions for benchmarking.
> The functions and operators provided here are usual basic operations.
> They are not chosen randomly, but are simply taken from existing
> benchmarks:
> 
> In TPC-B 2.0.0 section 5.3.5 and TPC-C 5.11 section 2.5.1.2, the
> selection of accounts uses a test (if ...), logical conditions (AND, OR)
> and comparisons (<, =, >=, >).
> 
> In TPC-C 5.11 section 2.1.6, a bitwise or (|) is used to skew a
> distribution based on two uniform distributions.
> 
> In TPC-C 5.11 section 5.2.5.4, a log function is used to determine
> "think time", which can be truncated (i.e. "least" function, already in
> pgbench).
> 
> 
> CONTENTS
> 
> The attached patch provides a consistent set of functions and operators
> based on the above examples, with operator precedence taken from
> postgres SQL parser:
> 
> - "boolean" type support is added, because it has been requested that
> pgbench should be as close as SQL expressions as possible. This induced
> some renaming as some functions & struct fields where named "num"
> because they where expecting an int or a double, but a boolean is not
> really a numeral.
> 
> - SQL comparisons (= <> < > <= >=) plus pg SQL "!=", which result in a
> boolean.
> 
> - SQL logical operators (and or not) on booleans.
> 
> - SQL bitwise operators taken from pg: | & # << >> ~.
> 
> - mod SQL function as a synonymous for %.
> 
> - ln and exp SQL functions.
> 
> - SQL CASE/END conditional structure.
> 
> The patch also includes documentation and additional tap tests.
> A test script is also provided.
> 
> This version is strict about typing, mimicking postgres behavior. For
> instance, using an int as a boolean results in an error. It is easy to
> make it more tolerant of types, which was the previous behavior, before
> it was suggested to follow SQL behavior.
> 
> Together with another submitted patch about retrieving query results,
> the added capabilities allow the implementation of strictly conforming
> TPC-B transactions.

This patch applies cleanly and compiles at cccbdde with some whitespace
issues.

$ patch -p1 < ../other/pgbench-more-ops-funcs-9.patch
(Stripping trailing CRs from patch.)
patching file doc/src/sgml/ref/pgbench.sgml
(Stripping trailing CRs from patch.)
patching file src/bin/pgbench/exprparse.y
(Stripping trailing CRs from patch.)
patching file src/bin/pgbench/exprscan.l
(Stripping trailing CRs from patch.)
patching file src/bin/pgbench/pgbench.c
(Stripping trailing CRs from patch.)
patching file src/bin/pgbench/pgbench.h
(Stripping trailing CRs from patch.)
patching file src/bin/pgbench/t/002_pgbench.pl

Any reviewers want to have a look?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Potential data loss of 2PC files

2017-03-16 Thread David Steele
On 2/13/17 12:10 AM, Michael Paquier wrote:
> On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier
>  wrote:
>> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas  wrote:
>>> If that can happen, don't we have the same problem in many other places?
>>> Like, all the SLRUs? They don't fsync the directory either.
>>
>> Right, pg_commit_ts and pg_clog enter in this category.
> 
> Implemented as attached.
> 
>>> Is unlink() guaranteed to be durable, without fsyncing the directory? If
>>> not, then we need to fsync() the directory even if there are no files in it
>>> at the moment, because some might've been removed earlier in the checkpoint
>>> cycle.
>>
>> Hm... I am not an expert in file systems. At least on ext4 I can see
>> that unlink() is atomic, but not durable. So if an unlink() is
>> followed by a power failure, the previously unlinked file could still
>> be there if the parent directory is not fsync'd.
> 
> So I have been doing more work on this patch, with the following things done:
> - Flush pg_clog, pg_commit_ts and pg_twophase at checkpoint phase to
> ensure their durability.
> - Create a durable_unlink() routine to give a way to perform a durable
> file removal.
> I am now counting 111 calls to unlink() in the backend code, and
> looking at all of them most look fine with plain unlink() if they are
> not made durable as they work on temporary files (see timeline.c for
> example), with some exceptions:
> - In pg_stop_backup, the old backup_label and tablespace_map removal
> should be durable to avoid putting the system in a wrong state after
> power loss. Other calls of unlink() are followed by durable_rename so
> they are fine if left as such.
> - Removal of old WAL segments should be durable as well. There is
> already an effort to rename them durably in the case of a recycled
> segment. In case of a power loss, a file that should have been
> removed could remain in pg_xlog.
> 
> Looking around, I have bumped as well on the following bug report for
> SQlite which is in the same category of things:
> http://sqlite.1065341.n5.nabble.com/Potential-bug-in-crash-recovery-code-unlink-and-friends-are-not-synchronous-td68885.html
> Scary to see that in this case durability can be a problem at
> transaction commit...
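
In rough terms, such a durable_unlink() can look like this (modeled on
the existing durable_rename() pattern in fd.c, where fsync_parent_path()
already lives; treat this exact function as the shape of the proposal,
not committed API):

int
durable_unlink(const char *fname, int elevel)
{
    if (unlink(fname) < 0)
    {
        ereport(elevel,
                (errcode_for_file_access(),
                 errmsg("could not remove file \"%s\": %m", fname)));
        return -1;
    }

    /*
     * The unlink() is atomic but not durable by itself: fsync the
     * parent directory so the removal survives a power failure.
     */
    if (fsync_parent_path(fname, elevel) != 0)
        return -1;

    return 0;
}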

This patch applies cleanly and compiles at cccbdde.

Ashutosh, do you know when you'll have a chance to review?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-03-16 Thread David Steele
On 2/2/17 2:47 PM, Robert Haas wrote:
> On Wed, Feb 1, 2017 at 9:47 PM, Jim Nasby  wrote:
>> Before doing that the first thing to look at would be why the limit is
>> currently INT_MAX / 2 instead of INT_MAX.
> 
> Generally the rationale for GUCs with limits of that sort is that
> there is or might be code someplace that multiplies the value by 2 and
> expects the result not to overflow.
> 
> I expect that increasing the maximum value of shared_buffers beyond
> what can be stored by an integer could have a noticeable distributed
> performance cost for the entire system.  It might be a pretty small
> cost, but then again maybe not; for example, BufferDesc's buf_id
> member would have to get wider, and probably the freeNext member, too.
> Andres already did unspeakable things to make a BufferDesc fit into
> one cache line for performance reasons, so that wouldn't be great
> news.
> 
> Anyway, I committed the patch posted here.  Or the important line out
> of the two, anyway.  :-)

It seems that this submission should be marked as "Committed" with
Robert as the committer.  Am I missing something?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-03-16 Thread Corey Huinker
On Mon, Mar 13, 2017 at 9:28 PM, Amit Langote  wrote:

> On 2017/03/14 10:08, Corey Huinker wrote:
> >> I don't think the plan itself will change as a result of applying this
> >> patch. You might however be able to observe some performance
> improvement.
> >
> > I could see no performance improvement, even with 16 separate queries
> > combined with UNION ALL. Query performance was always within +/- 10% of
> > a 9.6 instance given the same script. I must be missing something.
>
> Hmm, maybe I'm missing something too.
>
> Anyway, here is an older message on this thread from Horiguchi-san where
> he shared some of the test cases that this patch improves performance for:
>
> https://www.postgresql.org/message-id/20161018.103051.
> 30820907.horiguchi.kyotaro%40lab.ntt.co.jp
>
> From that message:
>
> 
> I measured performance and had the following result.
>
> t0  - SELECT sum(a) FROM ;
> pl  - SELECT sum(a) FROM <4 local children>;
> pf0 - SELECT sum(a) FROM <4 foreign children on single connection>;
> pf1 - SELECT sum(a) FROM <4 foreign children on dedicate connections>;
>
> The result is written as "time (std dev )"
>
> sync
>   t0: 3820.33 (  1.88)
>   pl: 1608.59 ( 12.06)
>  pf0: 7928.29 ( 46.58)
>  pf1: 8023.16 ( 26.43)
>
> async
>   t0: 3806.31 (  4.49)0.4% faster (should be error)
>   pl: 1629.17 (  0.29)1.3% slower
>  pf0: 6447.07 ( 25.19)   18.7% faster
>  pf1: 1876.80 ( 47.13)   76.6% faster
> 
>
> IIUC, pf0 and pf1 is the same test case (all 4 foreign tables target the
> same server) measured with different implementations of the patch.
>
> Thanks,
> Amit
>

I reworked the test such that all of the foreign tables inherit from the
same parent table, and if you query that parent you do get async execution. But it
doesn't work when just stringing together those foreign tables with UNION
ALLs.

I don't know how to proceed with this review if that was a goal of the
patch.


Re: [HACKERS] [PATCH] Suppress Clang 3.9 warnings

2017-03-16 Thread Noah Misch
This mailing list does not welcome top-post replies.

On Wed, Mar 15, 2017 at 12:04:11PM +0300, Aleksander Alekseev wrote:
> > This is wrong on platforms that do have strlcpy() in libc.
> 
> If it no too much trouble could you please explain what will happen
> on such platforms?

Both port.h and a system header will furnish a strlcpy() declaration.  The #if
you modified exists to avoid that, and your change would make it ineffective
for Clang.  This will have no symptoms, or it will elicit a warning.

> On what platform did you check it?

None.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-03-16 Thread Pavel Stehule
Hi

2017-03-15 17:21 GMT+01:00 Stephen Frost :

> Pavel,
>
> I started looking through this to see if it might be ready to commit and
> I don't believe it is.  Below are my comments about the first patch, I
> didn't get to the point of looking at the others yet since this one had
> issues.
>
> * Pavel Stehule (pavel.steh...@gmail.com) wrote:
> > 2017-01-09 17:24 GMT+01:00 Jason O'Donnell :
> > > gstore/gbstore:
>
> I don't see the point to 'gstore'- how is that usefully different from
> just using '\g'?  Also, the comments around these are inconsistent, some
> say they can only be used with a filename, others say it could be a
> filename or a pipe+command.
>

\gstore ensures a dump of raw row data. It can be replaced by \g with some
other settings, but if the query result is not unique, then the output can
be messy - which is not possible with \gbstore.

More interesting is \gbstore, which uses the binary API - it can be used for
bytea fields or for XML fields with an implicit, correct encoding conversion.
\gbstore cannot be replaced by \g.


>
> There's a whitespace-only hunk that shouldn't be included.
>
> I don't agree with the single-column/single-row restriction on these.  I
> can certainly see a case where someone might want to, say, dump out a
> bunch of binary integers into a file for later processing.
>
> The tab-completion for 'gstore' wasn't correct (you didn't include the
> double-backslash).  The patch also has conflicts against current master
> now.
>
> I guess my thinking about moving this forward would be to simplify it to
> just '\gb' which will pull the data from the server side in binary
> format and dump it out to the filename or command given.  If there's a
> new patch with those changes, I'll try to find time to look at it.
>

OK, I'll prepare a patch.


>
> I would recommend going through a detailed review of the other patches
> also before rebasing and re-submitting them also, in particular look to
> make sure that the comments are correct and consistent, that there are
> comments where there should be (generally speaking, whole functions
> should have at least some comments in them, not just the function header
> comment, etc).
>
> Lastly, I'd suggest creating a 'psql.source' file for the regression
> tests instead of just throwing things into 'misc.source'.  Seems like we
> should probably have more psql-related testing anyway and dumping
> everything into 'misc.source' really isn't a good idea.
>
> Thanks!
>
> Stephen
>


Re: [HACKERS] Microvacuum support for Hash Index

2017-03-16 Thread Ashutosh Sharma
On Thu, Mar 16, 2017 at 11:11 AM, Amit Kapila  wrote:
> On Wed, Mar 15, 2017 at 9:23 PM, Ashutosh Sharma  
> wrote:
>>
>>>
>>> Few other comments:
>>> 1.
>>> + if (ndeletable > 0)
>>> + {
>>> + /* No ereport(ERROR) until changes are logged */
>>> + START_CRIT_SECTION();
>>> +
>>> + PageIndexMultiDelete(page, deletable, ndeletable);
>>> +
>>> + pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
>>> + pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
>>>
>>> You clearing this flag while logging the action, but same is not taken
>>> care during replay. Any reasons?
>>
>> That's because we conditionally WAL-log this flag status and, when we
>> do so, we take an FPI of it.
>>
>
> Sure, but we are not clearing it conditionally.  I am not sure how,
> after recovery, it will be cleared if it gets set during normal operation.
> Moreover, btree already clears a similar flag during replay (refer to
> btree_xlog_delete).

You were right. In case data checksums are enabled or wal_log_hints is set
to true, 'LH_PAGE_HAS_DEAD_TUPLES' will get WAL-logged and therefore
needs to be cleared on the standby as well. Attached is the patch that
clears this flag on the standby during replay.
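
In outline, the replay-side fix is a fragment like this inside the hash
delete redo routine (record, lsn and buffer come from the redo machinery;
the block reference and the elided delete step are illustrative):

if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
{
    Page            page = BufferGetPage(buffer);
    HashPageOpaque  pageopaque;

    /* ... redo the PageIndexMultiDelete() part of the record ... */

    /* mirror btree_xlog_delete: clear the hint flag on the standby */
    pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
    pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;

    PageSetLSN(page, lsn);
    MarkBufferDirty(buffer);
}
if (BufferIsValid(buffer))
    UnlockReleaseBuffer(buffer);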

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


0001-Reset-LH_PAGE_HAS_DEAD_TUPLES-flag-on-standby-when.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Andres Freund
On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote:
> On 16/03/17 04:42, Andres Freund wrote:
> > On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
> >> Hi,
> >>
> >> I just unstuck a bunch of my buildfarm animals.  That triggered some
> >> spurious failures (on piculet, calliphoridae, mylodon), but also one
> >> that doesn't really look like that:
> >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2002%3A40%3A03
> >>
> >> with the pertinent point being:
> >>
> >> == stack trace: 
> >> pgsql.build/src/test/regress/tmp_check/data/core ==
> >> [New LWP 1894]
> >> [Thread debugging using libthread_db enabled]
> >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> >> Core was generated by `postgres: bgworker: logical replication launcher
> >> '.
> >> Program terminated with signal SIGSEGV, Segmentation fault.
> >> #0  0x55e265bff5e3 in ?? ()
> >> #0  0x55e265bff5e3 in ?? ()
> >> #1  0x55d3ccabed0d in StartBackgroundWorker () at 
> >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
> >> #2  0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) 
> >> at 
> >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
> >> #3  0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at 
> >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205
> >>
> >> it's possible that me killing things and upgrading caused this, but
> >> given this is a backend running EXEC_BACKEND, I'm a bit suspicous that
> >> it's more than that.  The machine is a bit backed up at the moment, so
> >> it'll probably be a while till it's at that animal/branch again,
> >> otherwise I'd not have mentioned this.
> > 
> > For some reason it ran again pretty soon. And I'm afraid it's indeed an
> > issue:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2003%3A30%3A02
> > 
> 
> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
> seems to work fine on my two machines. I don't see anything else
> different on culicidae though. Sadly the backtrace is not that
> informative either. I'll try to investigate more but it will take time...

I can give you a login to that machine, it doesn't do anything but run
buildfarm animals...  Will have to be my tomorrow however.

(Also need to fix config for older branches that don't work with
the upgraded ssl. This is a really bad situation :()

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)

2017-03-16 Thread Michael Paquier
On Wed, Mar 15, 2017 at 9:14 PM, Kuntal Ghosh
 wrote:
> I've attached the updated patches.

Thanks for the new versions. This begins to look really clear.

> In 0001-Infra-to-expose-all-backend-processes-in-pg_stat_get.patch,
> I've extended BackendStatusArray to store auxiliary processes.
> Backends
> use slots indexed in the range from 1 to MaxBackends (inclusive), so
> we can use MaxBackends + AuxBackendType + 1 as the index of the slot
> for an auxiliary process. Also, I've added a backend_type to describe
> the type of the backend process. The type includes:
> * autovacuum launcher
> * autovacuum worker
> * background writer
> * bgworker
> * client backend
> * checkpointer
> * startup
> * walreceiver
> * walsender
> * walwriter
> In 0002-Expose-stats-for-all-backends.patch, I've added the required
> code for reporting activity of different auxiliary processes,
> autovacuum launcher and bgworker processes.
> In 0003-Add-backend_type-column-in-pg_stat_get_activity.patch, I've
> added a column named backend_type in pg_stat_get_activity to show the
> type of the process to user.
>
> There are some pg_stat_* functions where showing all the backends
> doesn't make much sense. For example,
> postgres=# SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
>pg_stat_get_backend_activity(s.backendid) AS query
> FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
>   pid  |  query
> ---+--
>  17300 | SELECT pg_stat_get_backend_pid(s.backendid) AS pid, +
>|pg_stat_get_backend_activity(s.backendid) AS query   +
>| FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
>  16925 | 
>  16927 | 
>  16926 | 
>  16929 | 
> IMHO, this scenario can be easily avoided by filtering backends using
> backend_type. I'm not sure whether we should add any logic in the code
> for handling such cases. Thoughts?

Having some activity really depends on the backend type (see
autovacuum workers for example, which fill in the query field), so my
2c here is that we leave things as your patch proposes. If at some point
it makes sense to add something in the query field, we could always
discuss it separately and patch it accordingly.

+/* Total number of backends including auxiliary */
+#define NumBackendStatSlots (MaxBackends + NUM_AUXPROCTYPES)
+
This variable remains localized in pgstat.c, so let's define it there.

+  bgworker, background writer,
That's really bike-shedding, but we could say here "background worker"
and be consistent with the rest.

+/* Total number of backends including auxiliary */
+#define NumBackendStatSlots (MaxBackends + NUM_AUXPROCTYPES)
This could be a bit more precise, telling as well that MaxBackends
includes autovacuum workers and background workers.

- * --
+ *
+ * Each auxiliary process also maintains a PgBackendStatus struct in shared
+ * memory.
  */
Better not to delete this line; keeping it prevents pgindent from touching
this comment block.

Did you check whether this patch works with EXEC_BACKEND? Sorry, I don't have
a Windows workstation at hand now, but as AuxiliaryProcs is
NON_EXEC_STATIC...

+   /* We have userid for client-backends and wal-sender processes */
+   if (beentry->st_backendType == B_BACKEND ||
beentry->st_backendType == B_WAL_SENDER)
+   beentry->st_userid = GetSessionUserId();
+   else
+   beentry->st_userid = InvalidOid;
This can be true as well for bgworkers defining a role OID when
connecting with BackgroundWorkerInitializeConnection().

+   /*
+* Before returning, report autovacuum launcher process in the
+* PgBackendStatus array.
+*/
+   pgstat_bestart();
return;
Wouldn't that be better in AutoVacLauncherMain()?

+   /*
+* Before returning, report the background worker process in the
+* PgBackendStatus array.
+*/
+   if (!bootstrap)
+   pgstat_bestart();
Ditto with BackgroundWriterMain().

@@ -808,6 +836,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
nulls[12] = true;
nulls[13] = true;
nulls[14] = true;
+   nulls[23] = true;
}
It's not possible to have backend_type set to NULL, no?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: BUG #13755: pgwin32_is_service not checking if SECURITY_SERVICE_SID is disabled

2017-03-16 Thread Heikki Linnakangas

On 11/08/2016 07:56 AM, Michael Paquier wrote:

On Tue, Nov 8, 2016 at 2:25 PM, Tsunakawa, Takayuki
 wrote:

From: pgsql-hackers-ow...@postgresql.org

[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
Things are this way since b15f9b08 that introduced pgwin32_is_service().
Still, by considering what you say, you definitely have a point that if
postgres is started by another service running as Local System logs are
going where they should not. Let's remove the check for LocalSystem but
still check for SE_GROUP_ENABLED.


I did some testing on patch v5, on my Windows 8 VM. I launched cmd as 
Administrator, and registered the service with:


pg_ctl register -D data

I.e. without specifying a user. When I start the service with "net 
start", it refuses to start, and there are no messages in the event log. 
It refuses to start because "data" is not a valid directory, so that's 
correct. But the error message about that is lost.


Added some debugging messages to win32_is_service(), and it confirms 
that with this patch (v5), win32_is_service() incorrectly returns false, 
while unmodified git master returns true, and writes the error message 
to the event log.


So, I think we still need the check for Local System.


So, without any refactoring work, isn't the attached patch just fine?
That seems to work properly for me.


Just taking a look at the patch, I'm sure it will work.

Committer (Heikki?),
v5 is refactored for HEAD, and v6 is for previous releases without refactoring. 
 I'd like v5 to be applied to at least HEAD, as it removes a lot of unnecessary 
code.


+if (!CheckTokenMembership(NULL, AdministratorsSid, ) ||
+!CheckTokenMembership(NULL, PowerUsersSid, ))
 {
-if ((EqualSid(AdministratorsSid, Groups->Groups[x].Sid) &&
- (Groups->Groups[x].Attributes & SE_GROUP_ENABLED)) ||
-(EqualSid(PowerUsersSid, Groups->Groups[x].Sid) &&
- (Groups->Groups[x].Attributes & SE_GROUP_ENABLED)))
-{
-success = TRUE;
-break;
-}
+log_error("could not check access token membership: error code %lu\n",
+GetLastError());
+exit(1);
 }
I just looked more deeply at your refactoring patch, and I didn't know
about CheckTokenMembership()... The whole logic of your patch depends
on it. That's quite a cleanup that you have here. It looks like the
former implementation simply had no knowledge of this routine, or it
would have been used.


Yeah, CheckTokenMembership() seems really handy. Let's switch to that, 
but without removing the checks for Local System.
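
In outline, the service check then reduces to something like this (a
Win32 sketch: SID construction via AllocateAndInitializeSid() and the
cleanup are trimmed, and the return convention is illustrative):

BOOL    member_service = FALSE;
BOOL    member_system = FALSE;

/* CheckTokenMembership() only considers *enabled* group SIDs, which is
 * exactly what the original bug report asked for. */
if (!CheckTokenMembership(NULL, ServiceSid, &member_service) ||
    !CheckTokenMembership(NULL, LocalSystemSid, &member_system))
{
    log_error("could not check access token membership: error code %lu\n",
              GetLastError());
    return -1;
}

return (member_service || member_system) ? 1 : 0;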


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Petr Jelinek
On 16/03/17 04:42, Andres Freund wrote:
> On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
>> Hi,
>>
>> I just unstuck a bunch of my buildfarm animals.  That triggered some
>> spurious failures (on piculet, calliphoridae, mylodon), but also one
>> that doesn't really look like that:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2002%3A40%3A03
>>
>> with the pertinent point being:
>>
>> == stack trace: 
>> pgsql.build/src/test/regress/tmp_check/data/core ==
>> [New LWP 1894]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by `postgres: bgworker: logical replication launcher  
>>   '.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x55e265bff5e3 in ?? ()
>> #0  0x55e265bff5e3 in ?? ()
>> #1  0x55d3ccabed0d in StartBackgroundWorker () at 
>> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
>> #2  0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at 
>> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
>> #3  0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at 
>> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205
>>
>> it's possible that me killing things and upgrading caused this, but
>> given this is a backend running EXEC_BACKEND, I'm a bit suspicous that
>> it's more than that.  The machine is a bit backed up at the moment, so
>> it'll probably be a while till it's at that animal/branch again,
>> otherwise I'd not have mentioned this.
> 
> For some reason it ran again pretty soon. And I'm afraid it's indeed an
> issue:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2003%3A30%3A02
> 

Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
seems to work fine on my two machines. I don't see anything else
different on culicidae though. Sadly the backtrace is not that
informative either. I'll try to investigate more but it will take time...

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Andres Freund
On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote:
> On 16/03/17 04:42, Andres Freund wrote:
> > On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
> >> Hi,
> >>
> >> I just unstuck a bunch of my buildfarm animals.  That triggered some
> >> spurious failures (on piculet, calliphoridae, mylodon), but also one
> >> that doesn't really look like that:
> >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2002%3A40%3A03
> >>
> >> with the pertinent point being:
> >>
> >> == stack trace: 
> >> pgsql.build/src/test/regress/tmp_check/data/core ==
> >> [New LWP 1894]
> >> [Thread debugging using libthread_db enabled]
> >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> >> Core was generated by `postgres: bgworker: logical replication launcher
> >> '.
> >> Program terminated with signal SIGSEGV, Segmentation fault.
> >> #0  0x55e265bff5e3 in ?? ()
> >> #0  0x55e265bff5e3 in ?? ()
> >> #1  0x55d3ccabed0d in StartBackgroundWorker () at 
> >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
> >> #2  0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) 
> >> at 
> >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
> >> #3  0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at 
> >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205
> >>
> >> it's possible that me killing things and upgrading caused this, but
> >> given this is a backend running EXEC_BACKEND, I'm a bit suspicious that
> >> it's more than that.  The machine is a bit backed up at the moment, so
> >> it'll probably be a while till it's at that animal/branch again,
> >> otherwise I'd not have mentioned this.
> > 
> > For some reason it ran again pretty soon. And I'm afraid it's indeed an
> > issue:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2003%3A30%3A02
> > 
> 
> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
> seems to work fine on my two machines. I don't see anything else
> different on culicidae though. Sadly the backtrace is not that
> informative either. I'll try to investigate more but it will take time...

Worthwhile additional failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2002%3A55%3A01

Same animal, also EXEC_BACKEND, but 9.6.

A quick look at the relevant line:
/*
 * If bgw_main is set, we use that value as the initial entrypoint.
 * However, if the library containing the entrypoint wasn't loaded at
 * postmaster startup time, passing it as a direct function pointer is 
not
 * possible.  To work around that, we allow callers for whom a function
 * pointer is not available to pass a library name (which will be 
loaded,
 * if necessary) and a function name (which will be looked up in the 
named
 * library).
 */
if (worker->bgw_main != NULL)
entrypt = worker->bgw_main;

makes the issue clear - we appear to be assuming that bgw_main is
meaningful across processes.  Which it isn't in the EXEC_BACKEND case
when ASLR is in use...
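
To make that concrete: the EXEC_BACKEND-safe registration style passes
the entry point by library and symbol name, so each exec'd process
resolves it in its own address space. A sketch, with made-up extension
and function names (the BackgroundWorker fields are the real ones):

BackgroundWorker worker;

memset(&worker, 0, sizeof(worker));
worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
worker.bgw_main = NULL;         /* do NOT pass a raw function pointer */
snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_extension");
snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_worker_main");
snprintf(worker.bgw_name, BGW_MAXLEN, "example worker");
RegisterBackgroundWorker(&worker);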

This kinda sounds familiar, but a quick google search doesn't find
anything relevant.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Petr Jelinek
On 16/03/17 09:53, Andres Freund wrote:
> On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote:
>> On 16/03/17 04:42, Andres Freund wrote:
>>> On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> I just unstuck a bunch of my buildfarm animals.  That triggered some
>>>> spurious failures (on piculet, calliphoridae, mylodon), but also one
>>>> that doesn't really look like that:
>>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2002%3A40%3A03
>>>>
>>>> with the pertinent point being:
>>>>
>>>> == stack trace: 
>>>> pgsql.build/src/test/regress/tmp_check/data/core ==
>>>> [New LWP 1894]
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>>> Core was generated by `postgres: bgworker: logical replication launcher
>>>> '.
>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>> #0  0x55e265bff5e3 in ?? ()
>>>> #0  0x55e265bff5e3 in ?? ()
>>>> #1  0x55d3ccabed0d in StartBackgroundWorker () at 
>>>> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
>>>> #2  0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) 
>>>> at 
>>>> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
>>>> #3  0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at 
>>>> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205
>>>>
>>>> it's possible that me killing things and upgrading caused this, but
>>>> given this is a backend running EXEC_BACKEND, I'm a bit suspicious that
>>>> it's more than that.  The machine is a bit backed up at the moment, so
>>>> it'll probably be a while till it's at that animal/branch again,
>>>> otherwise I'd not have mentioned this.
>>>
>>> For some reason it ran again pretty soon. And I'm afraid it's indeed an
>>> issue:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2003%3A30%3A02
>>>
>>
>> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
>> seems to work fine on my two machines. I don't see anything else
>> different on culicidae though. Sadly the backtrace is not that
>> informative either. I'll try to investigate more but it will take time...
> 
> Worthwhile additional failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2017-03-16%2002%3A55%3A01
> 
> Same animal, also EXEC_BACKEND, but 9.6.
> 
> A quick look at the relevant line:
>   /*
>    * If bgw_main is set, we use that value as the initial entrypoint.
>    * However, if the library containing the entrypoint wasn't loaded at
>    * postmaster startup time, passing it as a direct function pointer is not
>    * possible.  To work around that, we allow callers for whom a function
>    * pointer is not available to pass a library name (which will be loaded,
>    * if necessary) and a function name (which will be looked up in the named
>    * library).
>    */
>   if (worker->bgw_main != NULL)
>   	entrypt = worker->bgw_main;
> 
> makes the issue clear - we appear to be assuming that bgw_main is
> meaningful across processes.  Which it isn't in the EXEC_BACKEND case
> when ASLR is in use...
> 
> This kinda sounds familiar, but a quick google search doesn't find
> anything relevant.

Hmm now that you mention it, I remember discussing something similar
with you last year in Dallas in regards to parallel query. IIRC Windows
should not have this problem but other systems with EXEC_BACKEND do.
Don't remember the details though.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioned tables and relfilenode

2017-03-16 Thread Ashutosh Bapat
>>
>> Will it help to retain the partition hierarchy as an inheritance
>> hierarchy and then collapse it while creating append paths? That will
>> be needed by partition-wise join and will be helpful in partition pruning
>> without using constraints, and so on. So maybe we could use that
>> infrastructure to simplify the logic here. The patch is available as
>> 0013 in [1].
>>
>> [1] cafjfprfqotrr6cm3soobhmhevdkffaz6pyyg4grzsomuw08...@mail.gmail.com
>
> IMHO, it would be better to keep those patches separate because the
> problems being solved are different.  By the way, one of the reasons that
> patch (as I had written it) was skipped was because it didn't cover the
> inheritance_planner() case [1].  Your patch 0013 at the link should be
> updated (maybe I should report on the partitionwise joins thread as well)
> in some way to handle the update/delete case, because this happens:
>
> create table p (a int, b char) partition by list (a);
> create table p1 partition of p for values in (1) partition by list (b);
> create table p1a partition of p1 for values in ('a');
> create table p2 partition of p for values in (2);
>
> explain (costs off) update p set a = a, b = 'b';
> QUERY PLAN
> ---
>  Update on p
>Update on p
>Update on p1 p
>Update on p2
>->  Seq Scan on p
>->  Result
>  ->  Append
>->  Seq Scan on p1
>->  Seq Scan on p1a
>->  Seq Scan on p2
> (10 rows)
>
> update p set a = a, b = 'b';
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

Thanks for pointing that out. I am able to reproduce the crash. I
think we will need to teach it to add the indirect children as well.
It looks like we are missing a test case for this scenario; I had run
the regression tests with that patch and didn't catch any failures or
crashes.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Consistency checking for hash indexes

2017-03-16 Thread Ashutosh Sharma
Hi,

Attached is the patch that allows the WAL consistency tool to mask the
'LH_PAGE_HAS_DEAD_TUPLES' flag in hash indexes. The flag was added as
part of the 'Microvacuum support for Hash index' patch. I have already
tested it using Kuntal's WAL consistency tool and it works fine.
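
For anyone curious what the masking amounts to, it is essentially clearing
the flag in hash_mask() before pages are compared. A rough sketch — the
page-type checks and exact placement follow the committed function, but
treat the details as illustrative:

/*
 * Sketch: LH_PAGE_HAS_DEAD_TUPLES can be set without being WAL-logged,
 * so clear it before the consistency comparison.
 */
HashPageOpaque opaque = (HashPageOpaque) PageGetSpecialPointer(page);
int			pagetype = opaque->hasho_flag & LH_PAGE_TYPE;

if (pagetype == LH_BUCKET_PAGE || pagetype == LH_OVERFLOW_PAGE)
	opaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;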

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Wed, Mar 15, 2017 at 11:27 AM, Kuntal Ghosh
 wrote:
> On Wed, Mar 15, 2017 at 12:32 AM, Robert Haas  wrote:
>> On Mon, Mar 13, 2017 at 10:36 AM, Ashutosh Sharma  
>> wrote:
>>> Couple of review comments,,
>>>
>>> You may also need to update the documentation as now we are also going
>>> to support wal consistency check for hash index. The current
>>> documentation does not include hash index.
>>>
>>> +only records originating from those resource managers.  Currently,
>>> +the supported resource managers are heap,
>>> +heap2, btree, gin,
>>> +gist, sequence, spgist,
>>> +brin, and generic. Only
>>
>> Did that, committed this.  Also ran pgindent over hash_mask() and
>> fixed an instance of dubious capitalization.
> Thanks Robert for the commit.
>
>
> --
> Thanks & Regards,
> Kuntal Ghosh
> EnterpriseDB: http://www.enterprisedb.com


0001-Allow-WAL-consistency-tool-to-mask-LH_PAGE_HAS_DEAD_.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Quals not pushed down into lateral

2017-03-16 Thread Andres Freund
Hi,

During citus development we noticed that restrictions aren't pushed down
into lateral subqueries, even if they semantically could.  For example,
in this dumbed down example:

postgres[31776][1]=# CREATE TABLE t_2(id serial primary key);
postgres[31776][1]=# CREATE TABLE t_1(id serial primary key);

Comparing:

postgres[31776][1]=# EXPLAIN SELECT * FROM t_1 JOIN (SELECT * FROM t_2 GROUP BY id) s ON (t_1.id = s.id) WHERE t_1.id = 3;
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Nested Loop  (cost=0.31..16.37 rows=1 width=8)
   ->  Index Only Scan using t_1_pkey on t_1  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 3)
   ->  Group  (cost=0.15..8.17 rows=1 width=4)
         Group Key: t_2.id
         ->  Index Only Scan using t_2_pkey on t_2  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (id = 3)
(7 rows)

with:

postgres[31776][1]=# EXPLAIN SELECT * FROM t_1, LATERAL (SELECT * FROM t_2 WHERE t_1.id = t_2.id GROUP BY id) s WHERE t_1.id = 3;
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Nested Loop  (cost=0.31..16.37 rows=1 width=8)
   ->  Index Only Scan using t_1_pkey on t_1  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 3)
   ->  Group  (cost=0.15..8.17 rows=1 width=4)
         Group Key: t_2.id
         ->  Index Only Scan using t_2_pkey on t_2  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (id = t_1.id)

it's noticeable that the former has id = 3 pushed down into both
relations index scans, whereas the latter doesn't.


This seems like a worthwhile future optimization opportunity.


I've not looked into this in any detail, but the proximate source is
that set_subquery_pathlist() doesn't see any baserstrictinfos to push
down.  Which makes sense, because t_1.id = t_2.id isn't "visible" (in
the sense of deconstruct_jointree dealing with it) to the outside.

It seems possible to look into rel->lateral_vars, check whether that's
member of some equivclass, and then push the relevant equivalences down
(after taking care that the Var from the outside is known as a Param on
the inside).


I'm not planning to work on this anytime soon, but I thought it'd be
useful to have a searchable reference point about the topic.  If
somebody wants to work on it...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Petr Jelinek
On 16/03/17 09:44, Andres Freund wrote:
> On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote:
>> On 16/03/17 04:42, Andres Freund wrote:
>>> On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
 Hi,

 I just unstuck a bunch of my buildfarm animals.  That triggered some
 spurious failures (on piculet, calliphoridae, mylodon), but also one
 that doesn't really look like that:
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03

 with the pertinent point being:

 == stack trace: 
 pgsql.build/src/test/regress/tmp_check/data/core ==
 [New LWP 1894]
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
 Core was generated by `postgres: bgworker: logical replication launcher
 '.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x55e265bff5e3 in ?? ()
 #0  0x55e265bff5e3 in ?? ()
 #1  0x55d3ccabed0d in StartBackgroundWorker () at 
 /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
 #2  0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) 
 at 
 /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
 #3  0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at 
 /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205

 it's possible that me killing things and upgrading caused this, but
 given this is a backend running EXEC_BACKEND, I'm a bit suspicous that
 it's more than that.  The machine is a bit backed up at the moment, so
 it'll probably be a while till it's at that animal/branch again,
 otherwise I'd not have mentioned this.
>>>
>>> For some reason it ran again pretty soon. And I'm afraid it's indeed an
>>> issue:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02
>>>
>>
>> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
>> seems to work fine on my two machines. I don't see anything else
>> different on culicidae though. Sadly the backtrace is not that
>> informative either. I'll try to investigate more but it will take time...
> 
> I can give you a login to that machine, it doesn't do anything but run
> buildfarm animals...  Will have to be my tomorrow however.
> 

That would be helpful, thanks.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-03-16 Thread Ashutosh Bapat
On Thu, Mar 16, 2017 at 3:57 PM, Amit Khandekar  wrote:
> On 12 March 2017 at 08:50, Robert Haas  wrote:
>>> However, Ashutosh's response made me think of something: one thing is
>>> that we probably do want to group all of the non-partial plans at the
>>> beginning of the Append so that they get workers first, and put the
>>> partial plans afterward.  That's because the partial plans can always
>>> be accelerated by adding more workers as they become available, but
>>> the non-partial plans are just going to take as long as they take - so
>>> we want to start them as soon as possible.  In fact, what we might
>>> want to do is actually sort the non-partial paths in order of
>>> decreasing cost, putting the most expensive one first and the others
>>> in decreasing order after that - and then similarly afterward with the
>>> partial paths.  If we did that, we wouldn't need to store a bitmapset
>>> OR two separate lists.  We could just store the index of the first
>>> partial plan in the list.  Then you can test whether a path is partial
>>> by checking whether this_index >= first_partial_index.
>
> Attached is an updated patch v7, which does the above. Now,
> AppendState->subplans has all non-partial subplans followed by all
> partial subplans, with the non-partial subplans in the order of
> descending total cost. Also, for convenience, the AppendPath also now
> has similar ordering in its AppendPath->subpaths. So there is a new
> field both in Append and AppendPath : first_partial_path/plan, which
> has value 0 if there are no non-partial subpaths.
>
> Also the backend now scans in reverse, so that it does not take up the
> most expensive path.
>
> There are also some changes in the costing. Now that we know that
> the very first path is the costliest non-partial path, we can use its
> total cost as the total cost of the Append in case all the partial path
> costs are lower.
>
> Modified/enhanced an existing test scenario in
> src/test/regress/select_parallel.sql so that Parallel Append is
> covered.
>
> As suggested by Robert, since pa_info->pa_finished was the only field
> in pa_info, removed the ParallelAppendDescData.pa_info structure, and
> instead brought pa_info->pa_finished into ParallelAppendDescData.
>
 +static inline void
 +exec_append_scan_first(AppendState *appendstate)
 +{
 +    appendstate->as_whichplan = 0;
 +}

 I don't think this is buying you anything, and suggest backing it out.
>>>
>>> This is required for sequential Append, so that we can start executing
>>> from the first subplan.
>>
>> My point is that there's really no point in defining a static inline
>> function containing one line of code.  You could just put that line of
>> code in whatever places need it, which would probably be more clear.
>
> Did the same.

Some comments
+ * Check if we are already finished plans from parallel append. This
+ * can happen if all the subplans are finished when this worker
+ * has not even started returning tuples.
+ */
+    if (node->as_padesc && node->as_whichplan == PA_INVALID_PLAN)
+        return ExecClearTuple(node->ps.ps_ResultTupleSlot);
From the comment, it looks like this condition will be encountered before the
backend returns any tuple. But this code is part of the loop which returns the
tuples. Shouldn't this be outside the loop? Why do we want to check a condition
for every row returned when the condition can happen only once, and that too
before returning any tuple?

Why do we need the following code in both ExecAppendInitializeWorker() and
ExecAppendInitializeDSM()? Both of those things happen before starting the
actual execution, so one of those should suffice?
+/* Choose the optimal subplan to be executed. */
+(void) parallel_append_next(node);

There is no pa_num_workers now, so probably this should get updated. Per the
comment we should also get rid of SpinLockAcquire() and SpinLockRelease()?
+ *    purpose. The spinlock is used so that it does not change the
+ *    pa_num_workers field while workers are choosing the next node.

BTW, sa_finished seems to be a misnomer. The plan is not finished yet, but it
wants no more workers. So, should it be renamed as sa_no_new_workers or
something like that?

In parallel_append_next() we shouldn't need to call goto_next_plan() twice. If
the plan indicated by pa_next_plan is finished, all the plans must have
finished. This should be true if we set pa_next_plan to 0 at the time of
initialization. Any worker picking up pa_next_plan will set it to the next
valid plan. So the next worker asking for plan should pick pa_next_plan and
set it to the next one and so on.

I am wondering whether goto_next_plan() can be simplified with some modulo
arithmetic, e.g. (whichplan - first_plan)++ % (last_plan - first_plan)
+ first_plan.
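
Something like the following, where first_plan, last_plan and whichplan
are illustrative names rather than the patch's actual fields:

/* Hypothetical modulo-arithmetic successor: advance by one and wrap
 * from last_plan back to first_plan in a single expression. */
next_plan = (whichplan - first_plan + 1) % (last_plan - first_plan + 1) + first_plan;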

I am still reviewing the patch.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] multivariate statistics (v25)

2017-03-16 Thread David Rowley
On 16 March 2017 at 09:45, Alvaro Herrera  wrote:

> Here's another version of 0002 after cleaning up almost everything from
> David's review.  I also added tests for ALTER STATISTICS in
> sql/alter_generic.sql which made me realize there were three crasher bugs
> in here; fixed all those.  It also made me realize that psql's \d was a
> little bit too generous with dropped columns in a stats object.  That
> should all behave better now.
>

Thanks for fixing.

As you mentioned to me off-list about missing pg_dump support, I've gone
and implemented that in the attached patch.

I followed how pg_dump works for indexes, and
created pg_get_statisticsextdef() in ruleutils.c. I was unsure if I should
be naming this pg_get_statisticsdef() instead.

I also noticed there's no COMMENT ON support either, so I added that too.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


extstats_pg_dump_and_comment_support.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-16 Thread Robert Haas
On Wed, Mar 15, 2017 at 3:44 PM, Pavan Deolasee
 wrote:
> I couldn't find a better way without a lot of complex infrastructure. Even
> though we now have ability to mark index pointers and we know that a given
> pointer either points to the pre-WARM chain or post-WARM chain, this does
> not solve the case when an index does not receive a new entry. In that case,
> both pre-WARM and post-WARM tuples are reachable via the same old index
> pointer. The only way we could deal with this is to mark index pointers as
> "common", "pre-warm" and "post-warm". But that would require us to update
> the old pointer's state from "common" to "pre-warm" for the index whose keys
> are being updated. May be it's doable, but might be more complex than the
> current approach.

/me scratches head.

Aren't pre-warm and post-warm just (better) names for blue and red?
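
(For readers following along, the three reachability states under
discussion are roughly these; the enum is purely illustrative, not code
from the patch:)

/* Hypothetical illustration of the index-pointer states discussed above. */
typedef enum WarmPointerState
{
	INDEXPTR_COMMON,	/* reaches both pre- and post-WARM tuple chains */
	INDEXPTR_PRE_WARM,	/* reaches only the pre-WARM chain */
	INDEXPTR_POST_WARM	/* reaches only the post-WARM chain */
} WarmPointerState;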

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical decoding of two-phase transactions

2017-03-16 Thread Stas Kelvich

> On 16 Mar 2017, at 14:44, Craig Ringer  wrote:
> 
> I'm going to try to pick this patch up and amend its interface per our
> discussion earlier, see if I can get it committable.

I’m working right now on an issue with building snapshots for decoding prepared
transactions. I hope I'll send an updated patch later today.

> -- 
> Craig Ringer   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Error handling in transactions

2017-03-16 Thread Peter van Hardenberg
After the previous thread, Jean-Paul, Ads, Alvaro and I were discussing
the use-case described by Joshua and trying to think about mitigating
strategies. Before getting into a discussion of a proposed solution, I'll
try and expand on the reasoning behind why I think this is a problem worth
solving.

First, discoverability of the current ON_ERROR_ROLLBACK=interactive is
poor. How would a user ever know that this was available as an option they
may want to set? Even if they could be told it was an option (in say a hint
message on a transaction abort) they would only find out about this after
the fact when the damage (a lost transaction) was done.

So let's try and imagine a solution where a user who has made a mistake in
a transaction might be able to gracefully recover but where the current
semantics are preserved.

In this case, we'd want a transaction not to abort immediately (allowing
recoverability) but not to commit if there was an error.

To make this work, an error during a transaction would not trigger an
immediate ROLLBACK but would instead set a session state say,
ERROR_TRIGGERED.

Most statements would not be allowed to execute in this state and each
statement would return an error describing the current state. A COMMIT
would then finally trigger the ROLLBACK, closing the transaction scope.

If the user were interested in recovering their transaction, they could set
ERROR_TRIGGERED back to "false", send any commands they wanted (retrying
part of the transaction, or whatever.) It might be simplest to prevent all
statements besides reading or setting ERROR_TRIGGERED but it may be
desirable to allow non-DDL/DML statements in order to aid in diagnosing
what happened.

This would also allow for programmatic error handling during transactions
without the overhead of savepoints by checking the value of ERROR_TRIGGERED
after each statement and handling it as appropriate.
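
To make the proposed control flow concrete, here is a hypothetical sketch
in plain C pseudocode — not PostgreSQL internals; do_commit() and
do_rollback() are placeholders:

/* An error latches a session flag instead of aborting immediately;
 * COMMIT consults the flag and downgrades itself to ROLLBACK. */
static bool error_triggered = false;

static void
on_statement_error(void)
{
	error_triggered = true;		/* remember the error; don't roll back yet */
}

static void
on_commit_command(void)
{
	if (error_triggered)
		do_rollback();			/* COMMIT becomes ROLLBACK */
	else
		do_commit();
}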

Of course, the additional complexity of this feature is greater than simply
updating a default value but I'm certainly willing to accept the argument
that setting a new default to a potentially destructive setting is
problematic. Still, I do believe that the current state of affairs is
painful and problematic and this is a problem worth solving.

-- 
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


Re: [HACKERS] [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver

2017-03-16 Thread Petr Jelinek
On 15/03/17 17:55, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 03/03/2017 11:11 PM, Tom Lane wrote:
>>> Yeah, I was wondering if this is just exposing a pre-existing bug.
>>> However, the "normal" path operates by repeatedly invoking PQconnectPoll
>>> (cf. connectDBComplete) so it's not immediately obvious how such a bug
>>> would've escaped detection.
> 
>> (After a long period of fruitless empirical testing I turned to the code)
>> Maybe I'm missing something, but connectDBComplete() handles a return of
>> PGRES_POLLING_OK as a success while connectDBStart() seems not to. I
>> don't find anywhere in our code other than libpqwalreceiver that
>> actually uses that interface, so it's not surprising if it's now
>> failing. So my bet is it is indeed a long-standing bug.
> 
> Meh ... that argument doesn't hold water, because the old code here called
> PQconnectdbParams which is just PQconnectStartParams then
> connectDBComplete.  So the problem cannot be in connectDBStart; that's
> common to both paths.  It has to be some discrepancy between what
> connectDBComplete does and what the new loop in libpqwalreceiver is doing.
> 
> The original loop coding in 1e8a85009 was not very close to the documented
> spec for PQconnectPoll at all, and while e434ad39a made it closer, it's
> still not really the same: connectDBComplete doesn't call PQconnectPoll
> until the socket is known read-ready or write-ready.  The walreceiver loop
> does not guarantee that, but would make an additional call after any
> random other wakeup.  It's not very clear why bowerbird, and only
> bowerbird, would be seeing such wakeups --- but I'm having a really hard
> time seeing any other explanation for the change in behavior.  (I wonder
> whether bowerbird is telling us that WaitLatchOrSocket can sometimes
> return prematurely on Windows.)
> 
> I'm also pretty sure that the ResetLatch call is in the wrong place which
> could lead to missed wakeups, though that's the opposite of the immediate
> problem.
> 
> I'll try correcting these things and we'll see if it gets any better.
> 

Looks like that didn't help either.

I setup my own Windows machine and can reproduce the issue. I played
around a bit and could not really find a fix other than adding
WL_TIMEOUT and short timeout to WaitLatchOrSocket (it does wait a very
long time on the WaitLatchOrSocket otherwise before failing).

So I wonder if this is the same issue that caused us using different
coding for WaitLatchOrSocket in pgstat.c (lines ~3918-3940).
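
For reference, the polling contract Tom describes above looks roughly like
this when followed to the letter — a sketch, with wait_for_socket_readable()
and wait_for_socket_writable() standing in for whatever wait primitive the
caller uses, and error handling elided:

/* Canonical PQconnectPoll() loop: wait for the readiness the previous
 * call asked for before calling PQconnectPoll() again. */
PostgresPollingStatusType status = PGRES_POLLING_WRITING;

while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED)
{
	int			sock = PQsocket(conn);

	if (status == PGRES_POLLING_READING)
		wait_for_socket_readable(sock);		/* placeholder */
	else
		wait_for_socket_writable(sock);		/* placeholder */

	status = PQconnectPoll(conn);
}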

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speedup twophase transactions

2017-03-16 Thread Michael Paquier
On Thu, Mar 16, 2017 at 7:18 PM, Nikhil Sontakke
 wrote:
>> + *  * RecoverPreparedTransactions(), StandbyRecoverPreparedTransactions()
>> + *    and PrescanPreparedTransactions() have been modified to go through
>> + *    gxact->inredo entries that have not made it to disk yet.
>>
>> It seems to me that there should be an initial scan of pg_twophase at
>> the beginning of recovery, discarding on the way, with a WARNING,
>> entries that are older than the checkpoint redo horizon. This should
>> fill in shmem entries using something close to PrepareRedoAdd(), and
>> mark those entries as inredo. Then, at the end of recovery,
>> PrescanPreparedTransactions does not need to look at the entries in
>> pg_twophase. And that's the case as well of
>> RecoverPreparedTransaction(). I think that you could get the patch
>> much simplified this way, as any 2PC data can be fetched directly from
>> WAL segments and there is no need to rely on scans of pg_twophase,
>> this is replaced by scans of entries in TwoPhaseState.
>>
>
> I don't think this will work. We cannot replace pg_twophase with shmem
> entries + WAL pointers. This is because we cannot expect to have WAL entries
> around for long-running prepared transactions which survive across checkpoints.

But at the beginning of recovery, we can mark such entries with ondisk
and inredo, in which case the WAL pointers stored in the shmem entries
do not matter because the data is already on disk.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-16 Thread Dilip Kumar
On Thu, Mar 16, 2017 at 5:14 PM, Dilip Kumar  wrote:
> pg_atomic_write_u32_impl(val=0) at generic.h:57, queue = 
> 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
>>>   * frame #0: 0x000100caf314 postgres`tbm_prepare_shared_iterate 
>>> [inlined] pg_atomic_write_u32_impl(val=0) at generic.h:57 [opt]
>>> frame #1: 0x000100caf314 postgres`tbm_prepare_shared_iterate 
>>> [inlined] pg_atomic_init_u32_impl(val_=0) at generic.h:163 [opt]
>>> frame #2: 0x000100caf314 postgres`tbm_prepare_shared_iterate 
>>> [inlined] pg_atomic_init_u32(val=0) + 17 at atomics.h:237 [opt]
>
> By looking at the call stack I got the problem location.  I am
> reviewing other parts of the code if there are the similar mistake at
> other places. Soon I will post the patch.  Thanks for the help.

Based on the call stack I have tried to fix the issue. The problem is
that there was some uninitialized pointer access (in some special cases,
i.e. TBM_EMPTY, when the pagetable is not created at all).

fix_tbm_empty.patch fixed some of them but introduced the one you
are seeing in your call stack.

Hopefully, this time I got it correct.  Since I am unable to reproduce
the issue, I will again need your help in verifying the fix.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


fix_tbm_empty_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wait events for disk I/O

2017-03-16 Thread Rahila Syed
Thank you for the updated patch.

I have applied and tested it on the latest sources and the patch looks good
to me.

>I am not quite sure about this, as this is for stat statements. Apart from
>the place you found there are many other fwrite() calls into
>pg_stat_statements, and I intentionally haven't added an event here, as it
>is a very small write about stats, and it doesn't look like we should add
>events for those calls.

I agree that this writes only a small amount of data, although tracking it
could be useful in scenarios where pg_stat_statements is lagging due to an
I/O bottleneck. I will leave this decision to the committer.

Thank you,
Rahila Syed

On Wed, Mar 15, 2017 at 1:03 PM, Rushabh Lathia 
wrote:

> Thanks Rahila for reviewing this patch.
>
> On Tue, Mar 14, 2017 at 8:13 PM, Rahila Syed 
> wrote:
>
>> Hello,
>>
>> I applied and tested this patch on latest sources and it works fine.
>>
>> Following are some comments,
>>
>> >+   /* Wait event for SNRU */
>> >+   WAIT_EVENT_READ_SLRU_PAGE,
>> Typo in the comment.
>>
>>
> Fixed.
>
>
>> >FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush,
>> WAIT_EVENT_FLUSH_DATA_BLOCK);
>> This call is inside mdwriteback() which can flush more than one block so
>> should  WAIT_EVENT _FLUSH_DATA_BLOCK
>> be renamed to WAIT_EVENT_FLUSH_DATA_BLOCKS?
>>
>>
> Changed with WAIT_EVENT_FLUSH_DATA_BLOCKS.
>
>
>> Should calls to write() in following functions be tracked too?
>>  qtext_store()  - This is related to pg_stat_statements
>>
>>
>
> I am not quite sure about this, as this is for stat statements. Apart from
> the place you found there are many other fwrite() calls into
> pg_stat_statements, and I intentionally haven't added an event here, as it
> is a very small write about stats, and it doesn't look like we should add
> events for those calls.
>
>
>
>> dsm_impl_mmap() - This is in relation to creating dsm segments.
>>
>>
> Added new event here. Actually particular write call is zero-filling the
> DSM file.
>
>
>> write_auto_conf_file()-  This is called when updated configuration
>> parameters are
>>  written to a temp file.
>>
>>
> write_auto_conf_file() is getting called during the ALTER SYSTEM call.
> Here write
> happen only when someone explicitly run the ALTER SYSTEM call. This is
> administrator call and so doesn't seem like necessary to add separate wait
> event
> for this.
>
> PFA latest patch with other fixes.
>
>
>>
>> On Wed, Mar 8, 2017 at 4:50 PM, Rushabh Lathia 
>> wrote:
>>
>>>
>>>
>>> On Wed, Mar 8, 2017 at 8:23 AM, Robert Haas 
>>> wrote:
>>>
 On Tue, Mar 7, 2017 at 9:32 PM, Amit Kapila 
 wrote:
 > On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas 
 wrote:
 >> On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila 
 wrote:
 >>> Sure, if you think both Writes and Reads at OS level can have some
 >>> chance of blocking in obscure cases, then we should add a wait event
 >>> for them.
 >>
 >> I think writes have a chance of blocking in cases even in cases that
 >> are not very obscure at all.
 >
 > Point taken for writes, but I think in general we should have some
 > criteria based on which we can decide whether to have a wait event for
 > a particular call. It should not happen that we have tons of wait
 > events and out of which, only a few are helpful in most of the cases
 > in real-world scenarios.

 Well, the problem is that if you pick and choose which wait events to
 add based on what you think will be common, you're actually kind of
 hosing yourself. Because now when something uncommon happens, suddenly
 you don't get any wait event data and you can't tell what's happening.
 I think the number of new wait events added by Rushabh's patch is
 wholly reasonable.  Yeah, some of those are going to be a lot more
 common than others, but so what?  We add wait events so that we can
 find out what's going on.  I don't want to sometimes know when a
 backend is blocked on an I/O.  I want to ALWAYS know.


>>> Yes, I agree with Robert. It is difficult to judge what we want and what
>>> we don't want. Deciding that something is not useful information, and
>>> later ending up in a situation where we have to re-think and add the
>>> missing pieces, is not good. Having more information about the system,
>>> especially for monitoring purposes, is always good.
>>>
>>> I am attaching another version of the patch, as I found a stupid mistake
>>> in the earlier version, where I missed initializing the initial value of
>>> the WaitEventIO enum. Also, the earlier version was not applying cleanly
>>> on the current sources.
>>>
>>>
>>>
>>> --
>>> Rushabh Lathia
>>> EnterpriseDB: http://www.enterprisedb.com
>>> The Enterprise PostgreSQL Company
>>>
>>>
>>>
>>> --

Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Andres Freund
On 2017-03-16 09:27:59 -0400, Robert Haas wrote:
> On Thu, Mar 16, 2017 at 5:13 AM, Petr Jelinek
>  wrote:
> > Hmm now that you mention it, I remember discussing something similar
> > with you last year in Dallas in regards to parallel query. IIRC Windows
> > should not have this problem but other systems with EXEC_BACKEND do.
> > Don't remember the details though.
> 
> Generally, extension code can't use bgw_main safely, and must use
> bgw_library_name and bgw_function_name.  But bgw_main is supposedly
> safe for core code.

I indeed think it's not safe, and it's going to get less and less safe
on windows (or EXEC_BACKEND).  I don't think we can afford to disable
ASLR in the long run (I indeed suspect that'll just be disallowed at some
point), and that's the only thing making it safe-ish in combination with
EXEC_BACKEND.


> If it's not even safe there, then I guess we should remove it entirely
> as a useless foot-gun.

I indeed think that's the right consequence.  One question is what to
replace it with exactly - are we guaranteed we can dynamically lookup
symbols by name in the main binary on every platform?  Alternatively we
can just hardcode a bunch of bgw_function_name values that are matched
to specific functions if bgw_library_name is NULL - I suspect that'd be
the easiest / least worrysome portability-wise.
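
For illustration, registering by name instead of by pointer looks like
this — the fields are the existing bgworker API, while my_extension and
my_worker_main are hypothetical names:

/* Sketch: register a worker by library and function name, so the
 * symbol lookup happens in the new process and ASLR doesn't matter. */
BackgroundWorker worker;

memset(&worker, 0, sizeof(worker));
worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
worker.bgw_restart_time = BGW_NEVER_RESTART;
snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_extension");
snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_worker_main");
snprintf(worker.bgw_name, BGW_MAXLEN, "my extension worker");
RegisterBackgroundWorker(&worker);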

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread Peter Geoghegan
On Thu, Mar 16, 2017 at 11:53 AM, Andrew Borodin  wrote:
> 2.  Thus, L&Y fully concurrent vacuum is possible, indeed, and
> furthermore Theodor suggested that I should implement not only page
> eviction, but also page merge and a tree condense algorithm.

I think that it's very hard to make merging of pages that are not
completely empty work, while also using the L&Y algorithm.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Bitmap scans a bit broken

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 1:50 PM, Dilip Kumar  wrote:
> fixed

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 8:48 AM, Ashutosh Bapat
 wrote:
> Why do we need following code in both ExecAppendInitializeWorker() and
> ExecAppendInitializeDSM()? Both of those things happen before starting the
> actual execution, so one of those should suffice?
> +/* Choose the optimal subplan to be executed. */
> +(void) parallel_append_next(node);

ExecAppendInitializeWorker runs only in workers, but
ExecAppendInitializeDSM runs only in the leader.

> BTW, sa_finished seems to be a misnomor. The plan is not finished yet, but it
> wants no more workers. So, should it be renamed as sa_no_new_workers or
> something like that?

I think that's not going to improve clarity.  The comments can clarify
the exact semantics.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 6:27 AM, Amit Khandekar  wrote:
> Attached is an updated patch v7, which does the above.

Some comments:

- You've added a GUC (which is good) but not documented it (which is
bad) or added it to postgresql.conf.sample (also bad).

- You've used a loop inside a spinlock-protected critical section,
which is against project policy.  Use an LWLock; define and document a
new builtin tranche ID.

- The comment for pa_finished claims that it is the number of workers
executing the subplan, but it's a bool, not a count; I think this
comment is just out of date.

- paths_insert_sorted_by_cost() is a hand-coded insertion sort.  Can't
we find a way to use qsort() for this instead of hand-coding a slower
algorithm?  I think we could just create an array of the right length,
stick each path into it from add_paths_to_append_rel, and then qsort()
the array based on .  Then the result can be
turned into a list.

- Maybe the new helper functions in nodeAppend.c could get names
starting with exec_append_, to match the style of
exec_append_initialize_next().

- There's a superfluous whitespace change in add_paths_to_append_rel.

- The substantive changes in add_paths_to_append_rel don't look right
either.  It's not clear why accumulate_partialappend_subpath is
getting called even in the non-enable_parallelappend case.  I don't
think the logic for the case where we're not generating a parallel
append path needs to change at all.

- When parallel append is enabled, I think add_paths_to_append_rel
should still consider all the same paths that it does today, plus one
extra.  The new path is a parallel append path where each subpath is
the cheapest subpath for that childrel, whether partial or
non-partial.  If !enable_parallelappend, or if all of the cheapest
subpaths are partial, then skip this.  (If all the cheapest subpaths
are non-partial, it's still potentially useful.)  In other words,
don't skip consideration of parallel append just because you have a
partial path available for every child rel; it could be the better option anyway.

- I think the way cost_append() works is not right.  What you've got
assumes that you can just multiply the cost of a partial plan by the
parallel divisor to recover the total cost, which is not true because
we don't divide all elements of the plan cost by the parallel divisor
-- only the ones that seem like they should be divided.  Also, it
could be smarter about what happens with the costs of non-partial
paths. I suggest the following algorithm instead.

1. Add up all the costs of the partial paths.  Those contribute
directly to the final cost of the Append.  This ignores the fact that
the Append may escalate the parallel degree, but I think we should
just ignore that problem for now, because we have no real way of
knowing what the impact of that is going to be.

2. Next, estimate the cost of the non-partial paths.  To do this, make
an array of Cost of that length and initialize all the elements to
zero, then add the total cost of each non-partial plan in turn to the
element of the array with the smallest cost, and then take the maximum
of the array elements as the total cost of the non-partial plans.  Add
this to the result from step 1 to get the total cost.
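
A sketch of step 2, with illustrative names (Cost, List and the list
macros are the usual planner types):

/* Distribute non-partial subpath costs over num_workers slots, always
 * adding to the least-loaded slot, then take the maximum as the
 * estimated completion cost. */
static Cost
estimate_nonpartial_cost(List *nonpartial_subpaths, int num_workers)
{
	Cost	   *costarr = (Cost *) palloc0(sizeof(Cost) * num_workers);
	Cost		result = 0;
	ListCell   *lc;
	int			i;

	foreach(lc, nonpartial_subpaths)
	{
		Path	   *subpath = (Path *) lfirst(lc);
		int			min_index = 0;

		/* find the least-loaded slot */
		for (i = 1; i < num_workers; i++)
			if (costarr[i] < costarr[min_index])
				min_index = i;
		costarr[min_index] += subpath->total_cost;
	}

	/* the Append can't finish before its most loaded slot does */
	for (i = 0; i < num_workers; i++)
		result = Max(result, costarr[i]);

	pfree(costarr);
	return result;
}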

- In get_append_num_workers, instead of the complicated formula with
log() and 0.693, just add the list lengths and call fls() on the
result.  Integer arithmetic FTW!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-03-16 Thread Tom Lane
Corey Huinker  writes:
> I reworked the test such that all of the foreign tables inherit from the
> same parent table, and if you query that you do get async execution. But it
> doesn't work when just stringing together those foreign tables with UNION
> ALLs.

> I don't know how to proceed with this review if that was a goal of the
> patch.

Whether it was a goal or not, I'd say there is something either broken
or incorrectly implemented if you don't see that.  The planner (and
therefore also the executor) generally treats inheritance the same as
simple UNION ALL.  If that's not the case here, I'd want to know why.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread Andrew Borodin
2017-03-16 21:27 GMT+05:00 David Steele :
> This patch applies cleanly and compiles at cccbdde.
>
> Jeff, any thoughts on Andrew's responses?

Hi, David!

I've got some updates on the matter of this patch, since my
understanding of the B-tree had been bothering me.
Currently, I'm at PgConf.Russia, where I've contacted Theodor Sigaev,
and he answered my questions about the GIN.
0. I think that the proposed patch is safe (deadlock free, does not
introduce new livelocks, and all the resources are guarded properly).
1. There _are_ high keys in the posting trees; they are just called
rightmost keys, but in fact they are high keys in terms of the L&Y
algorithm.
2. Thus, L&Y fully concurrent vacuum is possible, indeed, and
furthermore Theodor suggested that I should implement not only page
eviction, but also page merge and a tree condense algorithm.
3. Eventually, I'll do that, certainly, but currently I can't
predict the time it'll take. I think I'll start somewhere in the
summer, maybe right after GiST intrapage indexing.

As for now, I think that having this patch in PostgreSQL 10 is viable.

Best regards, Andrey Borodin.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] temp_buffers vs temp vs local and explain

2017-03-16 Thread Joshua D. Drake

-hackers,

I was reviewing an explain plan today and with some help from Andrew G, 
I got a lot more information than I deserved. It did however bring up 
quite a usability issue that I think we should consider.


Let's review the following two lines:

Sort Method: external merge  Disk: 19352kB
   Buffers: shared hit=257714, temp read=8822 written=8808

Now the first line is pretty obvious. We spilled over work_mem and hit 
the disk for ~ 20MB of use.


The second line is not so clear.

Buffers: shared_buffers? We hit 257714 of those. That makes sense, but 
what about temp? Temp refers to temp files, not temp_buffers or temp 
tables. temp_buffers refers to temp tables (a la CREATE TEMP TABLE) but 
is represented as "local" in an explain plan. Further, the values of temp 
are blocks, not bytes.


Basically, it is a little convoluted.

I am not 100% what the answer here is but it seems more consistency 
might be a good start.


Also, it would be a huge boon for many (almost all) of our users if we 
could just do (something like) this:


EXPLAIN (ANALYZE,SUMMARY)

And it said:

Query 1

shared_buffers
  *
  *
work_mem
  * Total Used =
  * In Memory =
  * On Disk =
Rows
  * Estimated =
  * Actual =

etc...

I know that access to the details is needed, but for day-to-day 
operations, a huge portion of our users just want to know how 
much memory they need, or whether they need a faster disk, etc.


Thanks,

JD




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ls_waldir() & pg_ls_logdir()

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 6:09 AM, Dave Page  wrote:
> Hmm, good point. Google seems to be saying there isn't one. Patch
> updated as you suggest (and I've added back in a function declaration
> that got lost in the rebasing of the last version).

OK, I took another look at this:

- The documentation wasn't consistent with itself about the order in
which the three columns were mentioned.  I changed it to say name,
size, modification time both places and made the code also return the
columns in that order.  And I renamed the columns to name, size, and
modification, the last of which was chosen to match pg_stat_file().

- I added an error check for the stat() call.

- I moved the code to genfile.c where pg_ls_dir() already is; it seems
to fit within the charter of that file.

- I changed it to build a heap tuple directly instead of converting to
text and then back to datums (sketched below).  Seems less error-prone
that way, and more consistent with what's done elsewhere in genfile.c.

- I made it use a static-allocated buffer instead of a palloc'd one,
just so it doesn't leak into the surrounding context.

- I removed the function prototype and instead declared the helper
function static.  If there's an intent to expose that function to
extensions, the prototype should be in a header, not the .c file.

- I adjusted the language in the documentation to be a bit more
similar to what we've done elsewhere.

With those changes, committed.
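
The direct heap-tuple construction mentioned above looks roughly like this
in the genfile.c style — a sketch; de and attrib come from the surrounding
readdir()/stat() loop, and tupdesc from the SRF setup:

/* Build (name, size, modification) without a detour through text. */
Datum		values[3];
bool		nulls[3] = {false, false, false};
HeapTuple	tuple;

values[0] = CStringGetTextDatum(de->d_name);
values[1] = Int64GetDatum((int64) attrib.st_size);
values[2] = TimestampTzGetDatum(time_t_to_timestamptz(attrib.st_mtime));

tuple = heap_form_tuple(tupdesc, values, nulls);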

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication launcher crash on buildfarm

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 2:55 PM, Andres Freund  wrote:
> I indeed think it's not safe, and it's going to get less and less safe
> on windows (or EXEC_BACKEND).  I don't think we can afford to disable
> ASLR in the long run (I indeed supect that'll just be disallowed at some
> point), and that's the only thing making it safe-ish in combination with
> EXEC_BACKEND.

Ugh.

>> If it's not even safe there, then I guess we should remove it entirely
>> as a useless foot-gun.
>
> I indeed think that's the right consequence.  One question is what to
> replace it with exactly - are we guaranteed we can dynamically lookup
> symbols by name in the main binary on every platform?

I don't know the answer to that question.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] reorder tablespaces in basebackup tar stream for backup_label

2017-03-16 Thread Michael Banck
Hi,

sorry, it took me a while to get back to this.

Am Freitag, den 03.03.2017, 15:44 +0900 schrieb Michael Paquier:
> On Wed, Feb 22, 2017 at 9:23 PM, Bernd Helmle  wrote:
> > The comment in the code says explicitely to add the base directory to
> > the end of the list, not sure if that is out of a certain reason.
> >
> > I'd say this is an oversight in the implementation. I'm currently
> > working on a tool using the streaming protocol directly and i've
> > understood it exactly the way, that the default tablespace is the first
> > one in the stream.
> >
> > So +1 for the patch.
> 
> Commit 507069de has switched the main directory from the beginning to
> the end of the list, and the thread about this commit is here:
> https://www.postgresql.org/message-id/AANLkTikgmZRkBuQ%2B_hcwPBv7Cd7xW48Ev%3DUBHA-k4v0W%40mail.gmail.com
> 
> +   /* Add a node for the base directory at the beginning.  This way, the
> +* backup_label file is always the first file to be sent. */
> ti = palloc0(sizeof(tablespaceinfo));
> ti->size = opt->progress ? sendDir(".", 1, true, tablespaces,
> true) : -1;
> -   tablespaces = lappend(tablespaces, ti);
> +   tablespaces = lcons(ti, tablespaces);
> So, the main directory is located at the end on purpose. When using
> --wal-method=fetch the WAL segments are part of the main tarball, so
> if you send the main tarball first you would need to generate a second
> tarball with the WAL segments that have been generated between the
> moment the main tarball has finished until the end of the last
> tablespace taken if you want to have a consistent backup. 

Ah, thanks for pointing that out, I've missed that in my testing.

> Your patch would work with the stream mode though.

Or, if not requesting the "WAL" option of the replication protocol's
BASE_BACKUP command.

I agree it doesn't make sense to start messing with fetch mode, but I
don't think we guarantee any ordering of tablespaces (to wit, Bernd was
pretty sure it was the other way around all the time), nor do I
personally think that having the main tablespace first for non-WAL/stream
and last for WAL/fetch would be confusing, though I understand this is
debatable.

So I've updated the patch to only switch the main tablespace to be first
in case WAL isn't included; please find it attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

From 2532c4a659eb32527c489d1a65caa080e181dbd0 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Sun, 26 Feb 2017 17:59:38 +0100
Subject: [PATCH] Reorder tablespaces for non-WAL streaming basebackups.

The replication protocol documentation appears to express that the main
tablespace is the first to be sent, however, it is actually the last
one in order for the WAL files to be appended to it. This makes the
backup_label file (which gets prepended to the main tablespace)
inconveniently end up in the middle of the basebackup stream if other
tablespaces are present.

Change this so that the main tablespace is the first to be sent in case
no WAL files are requested, ensuring that backup_label is the first file
in the stream in this case.
---
 src/backend/replication/basebackup.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 09ecc15..ef3c115 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -230,10 +230,16 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
 		else
 			statrelpath = pgstat_stat_directory;
 
-		/* Add a node for the base directory at the end */
+		/* Add a node for the base directory, either at the end or, if
+		 * WAL is not included, at the beginning.  This means the
+		 * backup_label file is the first file to be sent in the latter
+		 * case. */
 		ti = palloc0(sizeof(tablespaceinfo));
 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, true) : -1;
-		tablespaces = lappend(tablespaces, ti);
+		if (opt->includewal)
+			tablespaces = lappend(tablespaces, ti);
+		else
+			tablespaces = lcons(ti, tablespaces);
 
 		/* Send tablespace header */
 		SendBackupHeader(tablespaces);
-- 
2.1.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench more operators & functions

2017-03-16 Thread Fabien COELHO


Hello David,

This patch applies cleanly and compiles at cccbdde with some whitespace 
issues.


$ patch -p1 < ../other/pgbench-more-ops-funcs-9.patch
(Stripping trailing CRs from patch.)


My guess is that your mailer changed the eol-style of the file when saving 
it:


  sh> sha1sum pg-patches/pgbench-more-ops-funcs-9.patch
  608a601561f4cba982f0ce92df30d1868338342b

ISTM that the standard mime-type of *.patch and *.diff is really 
"text/x-diff", so my ubuntu laptop is somehow right to put that in 
"/etc/mime.types", but this seems to have annoying consequences at least on 
Macs.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread Andrew Borodin
2017-03-16 23:55 GMT+05:00 Peter Geoghegan :
> On Thu, Mar 16, 2017 at 11:53 AM, Andrew Borodin  wrote:
>> 2.  Thus, L fully concurrent vacuum is possible, indeed, and
>> furthermore Theodor suggested that I should implement not only page
>> eviction, but also page merge and tree condence algorithm.
>
> I think that it's very hard to make merging of pages that are not
> completely empty work, while also using the L algorithm.

That's true. This is a distant plan...

Best regards, Andrey Borodin.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-16 Thread Denish Patel
Hi Dave,

The patch failed applied...

patch -p1 < /home/vagrant/pg_monitor.diff
patching file contrib/pg_buffercache/Makefile
patching file contrib/pg_buffercache/pg_buffercache--1.2--1.3.sql
patching file contrib/pg_buffercache/pg_buffercache.control
patching file contrib/pg_freespacemap/Makefile
patching file contrib/pg_freespacemap/pg_freespacemap--1.1--1.2.sql
patching file contrib/pg_freespacemap/pg_freespacemap.control
patching file contrib/pg_stat_statements/Makefile
patching file contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql
patching file contrib/pg_stat_statements/pg_stat_statements.c
patching file contrib/pg_stat_statements/pg_stat_statements.control
patching file contrib/pg_visibility/Makefile
Hunk #1 succeeded at 4 with fuzz 1.
patching file contrib/pg_visibility/pg_visibility--1.1--1.2.sql
patching file contrib/pg_visibility/pg_visibility.control
patching file contrib/pgrowlocks/pgrowlocks.c
patching file contrib/pgstattuple/pgstattuple--1.4--1.5.sql
patching file doc/src/sgml/catalogs.sgml
Hunk #1 succeeded at 10027 (offset 11 lines).
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 19364 (offset 311 lines).
Hunk #2 succeeded at 19648 (offset 311 lines).
patching file doc/src/sgml/pgbuffercache.sgml
patching file doc/src/sgml/pgfreespacemap.sgml
patching file doc/src/sgml/pgrowlocks.sgml
patching file doc/src/sgml/pgstatstatements.sgml
patching file doc/src/sgml/pgstattuple.sgml
patching file doc/src/sgml/pgvisibility.sgml
patching file doc/src/sgml/user-manag.sgml
patching file src/backend/catalog/system_views.sql
Hunk #1 FAILED at 1099.
1 out of 1 hunk FAILED -- saving rejects to file 
src/backend/catalog/system_views.sql.rej
patching file src/backend/replication/walreceiver.c
patching file src/backend/utils/adt/dbsize.c
Hunk #1 succeeded at 17 (offset -1 lines).
Hunk #2 succeeded at 89 (offset -1 lines).
Hunk #3 succeeded at 179 (offset -1 lines).
patching file src/backend/utils/adt/pgstatfuncs.c
patching file src/backend/utils/misc/guc.c
Hunk #2 succeeded at 6678 (offset 10 lines).
Hunk #3 succeeded at 6728 (offset 10 lines).
Hunk #4 succeeded at 8021 (offset 10 lines).
Hunk #5 succeeded at 8053 (offset 10 lines).
patching file src/include/catalog/pg_authid.h

Reject file contents...

cat src/backend/catalog/system_views.sql.rej
--- src/backend/catalog/system_views.sql
+++ src/backend/catalog/system_views.sql
@@ -1099,3 +1099,7 @@

 REVOKE EXECUTE ON FUNCTION pg_ls_logdir() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_ls_waldir() FROM public;
+GRANT EXECUTE ON FUNCTION pg_ls_logdir() TO pg_monitor;
+GRANT EXECUTE ON FUNCTION pg_ls_waldir() TO pg_monitor;
+
+GRANT pg_read_all_gucs TO pg_monitor;

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing references of password encryption to hashing

2017-03-16 Thread Joe Conway
On 03/16/2017 06:19 AM, Robert Haas wrote:
> On Mon, Mar 13, 2017 at 4:48 AM, Craig Ringer  wrote:
>> So I'm in favour of fixing the docs but I'm not keen on changing the
>> SQL syntax in a way that just kind of papers over part of the
>> problems.
> 
> I agree.  I think that trying to design new SQL syntax at this point
> is unlikely to be a good idea - we're just about out of time here, and
> some people who might care about this are busy on other things, and
> the deadline for patches that do new things has long since passed.
> But I like the idea of trying to improve the documentation.


Agreed. I think the documentation fixes definitely should be done, but
understand that the grammar is a longer term issue with backward
compatibility implications. Acknowledging the problem is the first step ;-)

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 7:19 AM, Ashutosh Bapat
 wrote:
> On Thu, Mar 16, 2017 at 7:10 AM, Robert Haas  wrote:
>> So I am looking at this part of 0008:
>>
>> +   /*
>> +    * Do not copy parent_rinfo and child_rinfos because 1. they create a
>> +    * circular dependency between child and parent RestrictInfo 2. dropping
>> +    * those links just means that we lose some memory optimizations. 3. There
>> +    * is a possibility that the child and parent RestrictInfos themselves may
>> +    * have got copied and thus the old links may no longer be valid. The
>> +    * caller may set up those links itself, if needed.
>> +    */
>>
>> I don't think that it's very clear whether or not this is safe.  I
>> experimented with making _copyRestrictInfo PANIC,
>
> I am not able to understand how to make _copyRestrictInfo PANIC. Can
> you please share the patch or compiler flags or settings? I will look
> at the case below once I have that.

I just put elog(PANIC, "_copyRestrictInfo") into the function.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Size vs size_t

2017-03-16 Thread Thomas Munro
On Fri, Mar 17, 2017 at 10:39 AM, Andres Freund  wrote:
> On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
>> Robert Haas  writes:
>> > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund  wrote:
>> >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>> > Well, I don't think we want to end up with a mix of Size and size_t in
>> > related code.  That buys nobody anything.  I'm fine with replacing
>> > Size with size_t if they are always equivalent, but there's no sense
>> > in having a jumble of styles.
>>
>> I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
>> a lot of merge pain for back-patching, while not actually buying anything
>> much concretely.  I think this falls under the same policy we use for many
>> other stylistic details, ie make new code look like the code right around
>> it.  But I'm fine with entirely-new files standardizing on size_t.
>
> That seems like sane policy.  I'm a bit doubtful that the pain would be
> all that bad, but I'm also not wild about trying.

Naive replacement in new files (present in master but not in 9.6) with
the attached script, followed by a couple of manual corrections where
Size was really an English word in a comment, gets the attached diff.

 src/backend/access/hash/hash_xlog.c|  26 ++--
 src/backend/replication/logical/launcher.c |   4 +-
 src/backend/utils/misc/backend_random.c|   4 +-
 src/backend/utils/mmgr/dsa.c   |  94 ++---
 src/backend/utils/mmgr/freepage.c  | 202 ++--
 src/backend/utils/mmgr/slab.c  |  34 ++---
 src/include/lib/simplehash.h   |   6 +-
 src/include/replication/logicallauncher.h  |   2 +-
 src/include/utils/backend_random.h |   2 +-
 src/include/utils/dsa.h|  10 +-
 src/include/utils/freepage.h   |  24 ++--
 src/include/utils/relptr.h |   4 +-
 12 files changed, 206 insertions(+), 206 deletions(-)

That might be just about enough for size_t to catch up...

-- 
Thomas Munro
http://www.enterprisedb.com


Size-to-size_t.sh
Description: Bourne shell script


Size-to-size_t.patch
Description: Binary data



Re: [HACKERS] BUG #13755: pgwin32_is_service not checking if SECURITY_SERVICE_SID is disabled

2017-03-16 Thread MauMau
From: Heikki Linnakangas
> So, I think we still need the check for Local System.

Thanks, fixed and confirmed that the error message is output in the
event log.

Regards
MauMau



win32-security-service-v7.patch
Description: Binary data



[HACKERS] [PATCH] Remove defunct and unnecessary link

2017-03-16 Thread David Christensen
The HA docs reference a “glossary” link which is no longer accessible, and
IMHO linking off-site is unlikely to be useful in general anyway.  This
simple patch removes the link.

Best,

David
--
David Christensen
End Point Corporation
da...@endpoint.com
785-727-1171




0001-Remove-defunct-and-unnecessary-doc-link.patch
Description: Binary data



Re: [HACKERS] \h tab-completion

2017-03-16 Thread Peter Eisentraut
On 3/15/17 22:46, Andreas Karlsson wrote:
> On 03/01/2017 02:47 PM, Peter Eisentraut wrote:
>> Instead of creating another copy of list_ALTER, let's use the
>> words_after_create list and write a version of
>> create_command_generator/drop_command_generator.
> 
> Good idea. Here is a patch with that.

Committed with some tweaking.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Making clausesel.c Smarter

2017-03-16 Thread Tom Lane
David Steele  writes:
> Anyone familiar with the planner available to review this patch?

FWIW, it's on my radar but I don't expect to get to it real soon,
as there's other stuff I deem higher priority.  In the meantime,
I don't want to stand in the way of someone else looking at it.

regards, tom lane




Re: [HACKERS] Size vs size_t

2017-03-16 Thread Tom Lane
Andres Freund  writes:
> On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
>> The short answer to that is that "Size" predates the universal acceptance
>> of size_t.  If we were making these decisions today, or anytime since the
>> early 2000s, we'd surely have just gone with size_t.  But it wasn't a
>> realistic option in the 90s.

> Just out of curiosity I checked when we switched to backing Size with
> size_t:
> 1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d

Yeah.  We inherited the previous definition (as "unsigned int") from
Berkeley.  I wasn't involved then, of course, but I follow their reasoning
perfectly because I remember fighting the same type of portability battles
with libjpeg in the early 90s.  "size_t" was invented by the ANSI C
committee (hence, 1989 or 1990) and had only very haphazard penetration
until the late 90s.  If you wanted to write portable code you couldn't
depend on it.

regards, tom lane




Re: [HACKERS] Size vs size_t

2017-03-16 Thread Tom Lane
Thomas Munro  writes:
> Naive replacement in new files (present in master but not in 9.6) with
> the attached script, followed by a couple of manual corrections where
> Size was really an English word in a comment, gets the attached diff.

In the case of mmgr/slab.c, a lot of those uses of Size probably
correspond to instantiations of the MemoryContext APIs; so blindly
changing them to "size_t" seems like a bit of a type violation
(and might indeed draw warnings from pickier compilers).  Don't
know if any of the other things you've identified here have similar
entanglements.
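
To make the concern concrete, here is a tiny stand-in for that situation
(hypothetical names throughout, not the actual memnodes.h definitions):
the API struct declares its callback with Size, and an implementation
rewritten to size_t keeps matching only because the two are currently the
same typedef.

    #include <stddef.h>

    typedef size_t Size;        /* as in PostgreSQL's c.h */

    /* simplified stand-in for a MemoryContext-style methods struct */
    typedef struct SketchMethods
    {
        Size    (*get_chunk_space) (void *chunk);   /* declared with Size */
    } SketchMethods;

    /*
     * An implementation blindly rewritten to size_t still slots in here
     * only because Size happens to be a typedef of size_t; if the two
     * ever diverged, a picky compiler would rightly complain about the
     * function-pointer mismatch.
     */
    static size_t
    sketch_get_chunk_space(void *chunk)
    {
        (void) chunk;
        return 0;
    }

    static const SketchMethods sketch_methods = {
        sketch_get_chunk_space
    };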

regards, tom lane




Re: [HACKERS] Renaming of pg_xlog and pg_clog

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 9:31 PM, Michael Paquier
 wrote:
> On Fri, Mar 17, 2017 at 12:19 AM, David Steele  wrote:
>> This patch does not apply cleanly at cccbdde:
>>
>> $ git apply ../other/0001-Rename-pg_clog-to-pg_xact.patch
>> error: doc/src/sgml/ref/pg_resetxlog.sgml: No such file or directory
>> error: patch failed: src/backend/postmaster/autovacuum.c:2468
>> error: src/backend/postmaster/autovacuum.c: patch does not apply
>
> This has rotten again...
>
>> Marked "Waiting on Author".
>>
>> I'd really like to see the rest of the renames happen for v10.  It seems
>> like the process got stalled after the pg_wal rename.
>
> Let's see what happens, attached are refreshed versions. Uncertainty
> is the fun part of a CF.

I understand that the point of renaming pg_clog to pg_xact is that
pg_clog contains the dreaded letters l-o-g, which we hypothesize
causes DBAs to remove it.  (Alternate hypothesis: "So, that's what's
clogging my database!")

Renaming pg_subtrans to pg_subxact has no such redeeming properties.

Moreover, with each of these renamings, we're further separating what
things are called in the code (xlog, clog, subtrans) from what they're
called in the filesystem (wal, xact, subxact).

So if we must rename pg_clog, OK, but can't we leave pg_subtrans
alone?  It's not hurting anybody.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Renaming of pg_xlog and pg_clog

2017-03-16 Thread Michael Paquier
On Fri, Mar 17, 2017 at 11:17 AM, Robert Haas  wrote:
> I understand that the point of renaming pg_clog to pg_xact is that
> pg_clog contains the dreaded letters l-o-g, which we hypothesize
> causes DBAs to remove it.  (Alternate hypothesis: "So, that's what's
> clogging my database!")
>
> Renaming pg_subtrans to pg_subxact has no such redeeming properties.
>
> More, with each of these renamings, we're further separating what
> things are called in the code (xlog, clog, subtrans) with what they're
> called in the filesystem (wal, xact, subxact).
>
> So if we must rename pg_clog, OK, but can't we leave pg_subtrans
> alone?  It's not hurting anybody.

The only real argument behind the renaming of pg_subtrans is
consistency with pg_xact, because both deal with transactions. I don't
personally mind if this portion of the renaming is left out; as you
say, anything labelled with "log" is what started this thread.
-- 
Michael




Re: [HACKERS] Potential data loss of 2PC files

2017-03-16 Thread Ashutosh Bapat
On Thu, Mar 16, 2017 at 10:17 PM, David Steele  wrote:
> On 2/13/17 12:10 AM, Michael Paquier wrote:
>> On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier
>>  wrote:
>>> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas  
>>> wrote:
 If that can happen, don't we have the same problem in many other places?
 Like, all the SLRUs? They don't fsync the directory either.
>>>
>>> Right, pg_commit_ts and pg_clog enter in this category.
>>
>> Implemented as attached.
>>
 Is unlink() guaranteed to be durable, without fsyncing the directory? If
 not, then we need to fsync() the directory even if there are no files in it
 at the moment, because some might've been removed earlier in the checkpoint
 cycle.
>>>
>>> Hm... I am not an expert in file systems. At least on ext4 I can see
>>> that unlink() is atomic, but not durable. So if an unlink() is
>>> followed by a power failure, the previously unlinked file could still
>>> be there if the parent directory is not fsync'd.
>>
>> So I have been doing more work on this patch, with the following things done:
>> - Flush pg_clog, pg_commit_ts and pg_twophase at checkpoint phase to
>> ensure their durability.
>> - Create a durable_unlink() routine to give a way to perform a durable
>> file removal.
>> I am now counting 111 calls to unlink() in the backend code; looking
>> at all of them, most are fine with plain unlink() even if not made
>> durable, as they work on temporary files (see timeline.c for
>> example), with some exceptions:
>> - In pg_stop_backup, the removal of the old backup_label and
>> tablespace_map should be durable, to avoid leaving the system in a
>> wrong state after a power loss. The other unlink() calls there are
>> followed by durable_rename, so they are fine if left as they are.
>> - Removal of old WAL segments should be durable as well. There is
>> already an effort to rename them durably when a segment is recycled.
>> After a power loss, a file that should have been removed could
>> otherwise remain in pg_xlog.
>>
>> Looking around, I have bumped as well on the following bug report for
>> SQlite which is in the same category of things:
>> http://sqlite.1065341.n5.nabble.com/Potential-bug-in-crash-recovery-code-unlink-and-friends-are-not-synchronous-td68885.html
>> Scary to see that in this case durability can be a problem at
>> transaction commit...
>
> This patch applies cleanly and compiles at cccbdde.
>
> Ashutosh, do you know when you'll have a chance to review?

The scope of this work has expanded since I last reviewed it and
marked it as RFC. Right now I am busy with partition-wise joins and do
not have sufficient time to take a look at the expanded scope.
However, I can come back to this after partition-wise joins, though
that may stretch to the end of the commitfest. Sorry.
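
For reference, the durable file removal under discussion boils down to
unlink() followed by an fsync() of the parent directory. A minimal
sketch under that assumption, with hypothetical names and none of the
fd.c plumbing the actual patch would use:

    #include <fcntl.h>
    #include <libgen.h>
    #include <string.h>
    #include <unistd.h>

    /*
     * Minimal sketch of a durable unlink: remove the file, then fsync()
     * the parent directory so that the removal itself survives a power
     * failure.  Hypothetical, simplified code -- the real patch would
     * go through PostgreSQL's fd.c machinery and ereport() handling.
     */
    static int
    durable_unlink_sketch(const char *path)
    {
        char    buf[1024];      /* truncation ignored for brevity */
        int     dirfd;

        if (unlink(path) < 0)
            return -1;

        /* dirname() may scribble on its argument, so work on a copy */
        strncpy(buf, path, sizeof(buf) - 1);
        buf[sizeof(buf) - 1] = '\0';

        dirfd = open(dirname(buf), O_RDONLY);
        if (dirfd < 0)
            return -1;

        /* flush the removal of the directory entry to stable storage */
        if (fsync(dirfd) < 0)
        {
            close(dirfd);
            return -1;
        }
        return close(dirfd);
    }

It is the directory fsync, not the unlink() itself, that makes the
removal crash-safe on file systems like ext4, where unlink() is atomic
but not durable.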

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] logical decoding of two-phase transactions

2017-03-16 Thread Craig Ringer
On 17 March 2017 at 08:10, Stas Kelvich  wrote:

> While working on this I've spotted quite a nasty corner case with
> aborted prepared transactions. I have some not-so-great ideas for how
> to fix it, but maybe my view is blurred and I missed something, so I
> want to ask here first.
>
> Suppose we create a table, then alter it in a 2PC transaction, and
> then abort that transaction. pg_class will then have something like
> this:
>
> xmin | xmax | relname
> 100  | 200| mytable
> 200  | 0| mytable
>
> After that abort, tuple (100,200,mytable) becomes visible again, and
> if we alter the table once more, the xmax of the first tuple will be
> set to the current xid, resulting in the following:
>
> xmin | xmax | relname
> 100  | 300| mytable
> 200  | 0| mytable
> 300  | 0| mytable
>
> At that moment we have lost the information that the first tuple was
> deleted by our prepared transaction.

Right. And since we don't control when the prepared xact aborts, we
don't control when those overwrites can start happening either. We can
and should check whether a 2PC xact is aborted before we start
decoding it, so we can skip it if it's already aborted; but it could
also abort *while* we're decoding it, and then the data needed for its
snapshot gets clobbered.

This hasn't mattered in the past because prepared xacts (and
especially aborted 2PC xacts) have never needed snapshots; we've never
needed to do anything from the perspective of a prepared xact.

I think we'll probably need to lock the 2PC xact so that it cannot be
aborted or committed until we finish decoding it. So we lock it, then
check if it's already aborted/already committed/in progress. If it's
aborted, treat it like any normal aborted xact. If it's committed,
treat it like any normal committed xact. If it's in progress, keep the
lock and decode it.

People using logical decoding for 2PC will presumably want to control
2PC via logical decoding, so they're not so likely to mind such a
lock.
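
Sketched as C, the control flow described above is roughly the
following. Every name here is a hypothetical stand-in rather than a
PostgreSQL internal, since where such a lock would live (e.g. the
gxact machinery) is exactly what's being discussed:

    /*
     * Hypothetical sketch of the guard described above; none of these
     * names are PostgreSQL internals.  The point is the control flow:
     * pin the prepared xact so it can neither commit nor abort under
     * us, branch on its state, and keep the pin while decoding an
     * in-progress one.
     */
    typedef enum
    {
        GX_IN_PROGRESS,
        GX_COMMITTED,
        GX_ABORTED
    } GXactState;

    /* stand-ins for "lock the gxact", "query its state", and so on */
    static void lock_gxact(const char *gid) { (void) gid; }
    static void unlock_gxact(const char *gid) { (void) gid; }
    static GXactState gxact_state(const char *gid) { (void) gid; return GX_IN_PROGRESS; }
    static void decode_as_aborted(const char *gid) { (void) gid; }
    static void decode_as_committed(const char *gid) { (void) gid; }
    static void decode_holding_lock(const char *gid) { (void) gid; }

    static void
    decode_prepared_xact(const char *gid)
    {
        lock_gxact(gid);        /* blocks COMMIT/ROLLBACK PREPARED for gid */

        switch (gxact_state(gid))
        {
            case GX_ABORTED:
                decode_as_aborted(gid);     /* skip, like any aborted xact */
                break;
            case GX_COMMITTED:
                decode_as_committed(gid);   /* like any committed xact */
                break;
            case GX_IN_PROGRESS:
                decode_holding_lock(gid);   /* catalogs can't be clobbered */
                break;
        }

        unlock_gxact(gid);
    }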

> * First, try to scan the catalog filtering out tuples with xmax
> bigger than snapshot->xmax, as those were possibly deleted by our tx.
> Then, if nothing is found, scan the usual way.

I don't think that'll be at all viable with the syscache/relcache
machinery. Way too intrusive.

> * Do not decode such transaction at all.

Yes, that's what I'd like to do, per above.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Partitioned tables and relfilenode

2017-03-16 Thread Robert Haas
On Thu, Mar 16, 2017 at 6:03 AM, Amit Langote
 wrote:
> I think we'll need to store *somewhere* the mapping of which inh=false
> partitioned table RTE is the child of which inh=true (IOW, parent)
> partitioned table RTE.

I mean, for the children you're going to scan, that seems to be
necessary so that you can do things like translate targetlists to use
the correct varno.  But for the children you're not going to scan,
well, you need to know which ones they are so you can lock them, but
do you really need the parent-child mappings?  Or just a list of which
ones there are?

> I've come to think that AppendRelInfos, although they
> contain extraneous information that won't be used, are better than
> inventing something new altogether for the time being.  AppendRelInfos are
> referred to a few times by query_planner() steps before we eventually get
> to either set_append_rel_pathlist() or inheritance_planner(), so not
> changing that approach seems less worrisome for now.  So now if we both
> create child RTEs and AppendRelInfos for the partitioned tables, we don't
> need to change expand_inherited_rtentry() at all with this patch.
> Finally, set_append_rel_size/pathlist() and inheritance_planner() skip the
> child partitioned table RTEs, because no path/plan need to be created.  We
> can do away with having to create RelOptInfos for child partitioned table
> RTEs, which I found to be not that invasive.

Yes, but on the flip side, you're having to add code in a lot of
places -- I think I counted 7 -- where you turn around and ignore
those AppendRelInfos.  That's a lot; how do we know we've got them
all?  I'm not sure what the patch would look like the other way, but
I'm hoping that you could just keep the list of partitioned table RTIs
someplace that mostly gets ignored, and then all of that special
handling could be ripped out.

> Append node elision does not occur in the one-child case.  With the patch:

Oh, OK.  Somehow the commit message you included led me to the
contrary conclusion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changing references of password encryption to hashing

2017-03-16 Thread Robert Haas
On Mon, Mar 13, 2017 at 4:48 AM, Craig Ringer  wrote:
> So I'm in favour of fixing the docs but I'm not keen on changing the
> SQL syntax in a way that just kind of papers over part of the
> problems.

I agree.  I think that trying to design new SQL syntax at this point
is unlikely to be a good idea - we're just about out of time here, and
some people who might care about this are busy on other things, and
the deadline for patches that do new things has long since passed.
But I like the idea of trying to improve the documentation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] logical decoding of two-phase transactions

2017-03-16 Thread Craig Ringer
On 16 March 2017 at 19:52, Stas Kelvich  wrote:

>
> I'm working right now on an issue with building snapshots for
> decoding prepared transactions. I hope I'll send an updated patch
> later today.


Great.

What approach are you taking?

It looks like the snapshot builder actually does most of the work we
need for this already, maintaining a stack of snapshots we can use. It
might be as simple as invalidating the relcache/syscache when we exit
(and enter?) decoding of a prepared 2pc xact, since it violates the
usual assumption of logical decoding that we decode things strictly in
commit-time order.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-16 Thread Corey Huinker
On Mon, Mar 13, 2017 at 5:21 PM, Tom Lane  wrote:

> "Daniel Verite"  writes:
> > Tom Lane wrote:
> >> when we see \if is that we do nothing but absorb text
> >> until we see the matching \endif.  At that point we could bitch and
> throw
> >> everything away if, say, there's \elif after \else, or anything else you
> >> want to regard as a "compile time error".  Otherwise we start execution,
> >> and from there on it probably has to behave as we've been discussing.
> >> But this'd be pretty unfriendly from an interactive standpoint, and I'm
> >> not really convinced that it makes for significantly better error
> >> reporting.
>
> > On the whole, isn't that a reasonable model to follow for psql?
>
> One thing that occurs to me after more thought is that with such a model,
> we could not have different lexing rules for live vs not-live branches,
> since we would not have made those decisions before scanning the input.
> This seems problematic.  Even if you discount the question of whether
> variable expansion is allowed to change command-boundary decisions, we'd
> still not want backtick execution to happen everywhere in the block, ISTM.
>
> Maybe we could fix things so that backtick execution happens later, but
> it would be a pretty significant and invasive change to backslash command
> execution, I'm afraid.
>
> regards, tom lane
>

Ok, I've got some time now and I'm starting to dig into this. I'd like to
restate what I *think* my feedback is, in case I missed or misunderstood
something.

1. Convert perl tests to a single regular regression test.

2. Have MainLoop() pass the cond_stack to the lexer via
psql_scan_set_passthrough(scan_state, (void *) cond_stack);
(a sketch of such a stack appears at the end of this message)

3. Change command scans to scan the whole boolean expression, not just
OT_NORMAL.

There are a couple of ways to go about this. My gut reaction is to create a
new scan type OT_BOOL_EXPR, which for the time being is the same as
OT_WHOLE_LINE, but could one day be something different.

4. Change variable expansion and backtick execution in false branches to
match new policy.

I've inferred that the current preference is for no expansion and no
execution.

5. Allow contextually-correct invalid boolean expressions to map to false.

Out-of-context \endif, \else, and \elif commands remain as errors to be
ignored; invalid expressions in an \if or legally-placed \elif are just
treated as false.

Did I miss anything?
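
One more data point for point 2: a minimal sketch of the kind of
conditional stack that would be passed through to the lexer. All types
and names here are hypothetical, not psql's actual implementation; the
idea is that both command dispatch and the lexer consult cond_active()
to decide whether to execute commands, expand variables, or run
backticks.

    #include <stdbool.h>
    #include <stdlib.h>

    typedef enum
    {
        IFSTATE_TRUE,           /* this branch is active */
        IFSTATE_FALSE,          /* inactive; a later \elif/\else may fire */
        IFSTATE_IGNORED         /* whole construct dead: parent was false */
    } IfState;

    typedef struct IfFrame
    {
        IfState         state;
        struct IfFrame *next;
    } IfFrame;

    typedef struct
    {
        IfFrame    *head;
    } ConditionalStack;

    /* \if pushes a frame */
    static void
    cond_push(ConditionalStack *cs, IfState state)
    {
        IfFrame    *f = malloc(sizeof(IfFrame));

        if (f == NULL)
            return;             /* real code would report out-of-memory */
        f->state = state;
        f->next = cs->head;
        cs->head = f;
    }

    /* \endif pops one; false signals \endif without a matching \if */
    static bool
    cond_pop(ConditionalStack *cs)
    {
        IfFrame    *f = cs->head;

        if (f == NULL)
            return false;
        cs->head = f->next;
        free(f);
        return true;
    }

    /* true when every enclosing branch is active: commands should run,
     * variables should expand, backticks should execute */
    static bool
    cond_active(const ConditionalStack *cs)
    {
        const IfFrame *f;

        for (f = cs->head; f != NULL; f = f->next)
            if (f->state != IFSTATE_TRUE)
                return false;
        return true;
    }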


  1   2   >