Re: [HACKERS] Some bugs in psql_complete of psql

2015-12-08 Thread Kyotaro HORIGUCHI
Hello, thank you for reviewing.

# I injured a fingertip; it's quite a nuisance and makes me even
# slower at typing..

I posted another patch that totally refactors psql_complete, so I'm
not posting a revised version of this one for now, but this discussion
is important to me, so please continue discussing it with me a bit
more.

At Fri, 4 Dec 2015 22:39:08 +0900, Fujii Masao  wrote in 

> On Fri, Nov 6, 2015 at 11:27 AM, Kyotaro HORIGUCHI
>  wrote:
> > Hello, thank you for the comments.
> >
> > The revised version of this patch is attached.
> 
> Thanks for updating the patch!
> 
> I tested whether the following patterns work as expected or not.
> 
> CREATE UNIQUE INDEX CONCURRENTLY name ON
> CREATE UNIQUE INDEX CONCURRENTLY ON
> CREATE UNIQUE INDEX name ON
> CREATE UNIQUE INDEX ON
> CREATE INDEX CONCURRENTLY name ON
> CREATE INDEX CONCURRENTLY ON
> CREATE INDEX name ON
> CREATE INDEX ON
> 
> Then I found the following problems.
> 
> "CREATE UNIQUE INDEX CONCURRENTLY " didn't suggest "ON".

This worked fine for me..

> "CREATE UNIQUE INDEX ON " suggested nothing.
> "CREATE INDEX ON " suggested nothing.

Oops... The comment for the entry says (tab-complete.c:2355 @
master).

|  * Complete INDEX <name> ON <table> with a list of table columns (which
|  * should really be in parens)

But it was *actually* a completion for "INDEX [CONCURRENTLY] ON name"
or "INDEX name ON", in a dubious way. I broke it by fixing it to match
the comment. I have seen the same kind of dubiousness in several
places, and it is one of the annoyances of the current implementation.
I proposed using a regex-like minilanguage in another thread; it lets
the matching be written in almost the same format as what is currently
written in the comments (see the sketch below the link). What do you
think about this? (I'll post it later.)

http://www.postgresql.org/message-id/20151126.144512.10228250.horiguchi.kyot...@lab.ntt.co.jp
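
To illustrate the idea, a rule in such a minilanguage could look roughly
like the following (a purely hypothetical sketch -- neither these
identifiers nor this pattern syntax exist anywhere in PostgreSQL; it only
shows the intended shape):

/* Hypothetical: completion rules written almost verbatim like the comments */
typedef struct CompletionRule
{
	const char *pattern;		/* words to match; [] marks optional words */
	const char *suggestion;		/* what to offer when the pattern matches */
} CompletionRule;

static const CompletionRule index_rules[] = {
	{"CREATE [UNIQUE] INDEX [CONCURRENTLY] [name]", "ON"},
	{"CREATE [UNIQUE] INDEX [CONCURRENTLY] [name] ON", "<list of tables>"},
};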


> "CREATE INDEX CONCURRENTLY " didn't suggest "ON".

This was not suggested before this patch, either.


> BTW, I found that tab-completion for DROP INDEX has the following problems.
> 
> "DROP INDEX " didn't suggest "CONCURRENTLY".
> "DROP INDEX CONCURRENTLY name " suggested nothing.

Yeah, these behave the same as the current code. We will find many
instances of this kind of incompleteness of completion in the current
implementation. But the current approach keeps us from enriching the
completion in this direction, I believe.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-08 Thread Ashutosh Bapat
On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita 
wrote:

> Hi Ashutosh,
>
> On 2015/12/02 20:45, Ashutosh Bapat wrote:
>
>> It's been a long time since the last patch on this thread was posted. I
>> have started working on supporting join pushdown for postgres_fdw.
>>
>
> Thanks for the work!
>
> Generating paths
>> 
>>
>
>> A join between two foreign relations is considered safe to push down if
>> 1. The joining sides are pushable.
>> 2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and
>>    ANTI joins are not considered right now, because of difficulties in
>>    constructing the queries involving those. The join clauses of
>>    SEMI/ANTI joins are not in a form that can be readily converted to an
>>    IN/EXISTS/NOT EXISTS kind of expression. We might consider this as a
>>    future optimization.
>> 3. Joining sides do not have clauses which can not be pushed down to the
>>    foreign server. For an OUTER join this is important since those
>>    clauses need to be applied before performing the join and thus the
>>    join can not be pushed to the foreign server. An example is
>>        SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 WHERE local_cond) ft2
>>            ON (join clause)
>>    Here the local_cond on ft2 needs to be executed before performing the
>>    LEFT JOIN between ft1 and ft2. This condition can be relaxed for an
>>    INNER join by pulling the local clauses up the join tree. But this
>>    needs more investigation and is not considered in this version.
>> 4. The join conditions (e.g. conditions in the ON clause) are all safe to
>>    push down. This is important for OUTER joins as pushing down join
>>    clauses partially and applying the rest locally changes the result.
>>    There are ways [1] by which a partial OUTER join can be completed by
>>    applying unpushable clauses locally and then nullifying the nullable
>>    side and eliminating duplicate non-nullable side rows. But that's
>>    again out of scope of the first version of postgres_fdw join pushdown.
>>
>
> As for 4, as commented in the patch, we could relax the requirement that
> all the join conditions (given by JoinPathExtraData's restrictlist) need to
> be safe to push down to the remote server;
> * In case of inner join, all the conditions would not need to be safe.
> * In case of outer join, all the "otherclauses" would not need to be safe,
> while I think all the "joinclauses" need to be safe to get the right
> results (where "joinclauses" and "otherclauses" are defined by
> extract_actual_join_clauses).  And I think we should do this relaxation to
> some extent for 9.6, to allow more joins to be pushed down.
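
For illustration, such a relaxed check could be sketched as below (only a
sketch, assuming the 9.5-era signatures of extract_actual_join_clauses()
and postgres_fdw's is_foreign_expr(); the function itself is hypothetical
and not code from the actual patch):

/*
 * Sketch only: do the join clauses allow pushing down this (outer) join?
 */
static bool
join_clauses_are_pushable(PlannerInfo *root, RelOptInfo *joinrel,
						  JoinPathExtraData *extra)
{
	List	   *joinquals;
	List	   *otherquals;
	ListCell   *lc;

	/* split the restrictlist into join clauses and "other" clauses */
	extract_actual_join_clauses(extra->restrictlist, &joinquals, &otherquals);

	foreach(lc, joinquals)
	{
		Expr	   *expr = (Expr *) lfirst(lc);

		/* for an outer join, every join clause must be pushable */
		if (!is_foreign_expr(root, joinrel, expr))
			return false;
	}

	/* "otherclauses" that are not pushable can simply be evaluated locally */
	return true;
}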


agreed. I will work on those.


> I don't know about [1].  May I see more information about [1]?
>
>
Generating plan
>> ===
>>
>
>> The rest of this section describes the logic to construct the SQL for a
>> join; the logic is implemented as the function deparseSelectSqlForRel().
>>
>> deparseSelectSqlForRel() builds the SQL for a given joinrel (and now for
>> a baserel as well) recursively.
>> For joinrels
>> 1. it constructs SQL representing either side of the join, by calling
>>    itself in recursive fashion.
>> 2. These SQLs are converted into subqueries and become part of the FROM
>>    clause with the appropriate JOIN type and clauses. The left and right
>>    subqueries are given aliases "l" and "r" respectively. The columns in
>>    each subquery are aliased as "a1", "a2", "a3" and so on. Thus the
>>    third column on the left side can be referenced as "l.a3" at any
>>    recursion level.
>> 3. A targetlist is added representing the columns in the join result
>>    expected at that level.
>> 4. The join clauses are added as part of the ON clause.
>> 5. Any clauses that the planner has deemed fit to be evaluated at that
>>    level of the join are added as part of the WHERE clause.
>>
>
> Honestly, I'm not sure that that is a good idea.  One reason for that is
> that a query string constructed by the procedure is difficult to read
> especially when the procedure is applied recursively.  So, I'm thinking to
> revise the procedure so as to construct a query string with a flattened
> FROM clause, as discussed in eg, [2].
>
>
Just to confirm, the hook discussed in [2] is not in place yet, right? I
can find only one hook for foreign join:

typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
                                              RelOptInfo *joinrel,
                                              RelOptInfo *outerrel,
                                              RelOptInfo *innerrel,
                                              JoinType jointype,
                                              JoinPathExtraData *extra);

This hook takes an inner and outer relation, so it cannot be used for an
N-way join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join
pushdown in postgres_fdw? Am I missing something?

Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-08 Thread Kyotaro HORIGUCHI
Thank you for looking at this and for the comment.

At Mon, 7 Dec 2015 15:00:32 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] [PATCH] Equivalence Class Filters

2015-12-08 Thread Jeremy Harris
On 07/12/15 16:44, Simon Riggs wrote:
> There are many optimizations we might adopt, yet planning time is a factor.
> It seems simple enough to ignore more complex optimizations if we have
> already achieved a threshold cost (say 10). Such a test would add nearly
> zero time for the common case. We can apply the optimizations in some kind
> of ordering depending upon the cost, so we are careful to balance the
> cost/benefit of trying certain optimizations.

Given parallelism, why not continue planning after initiating a
cancellable execution, giving a better plan to be used if the
execution runs for long enough?
-- 
Cheers,
  Jeremy




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-08 Thread Michael Paquier
On Tue, Dec 8, 2015 at 6:31 PM, Kyotaro HORIGUCHI
 wrote:
>
> Thank you for looking at this and for the comment.
>
> At Mon, 7 Dec 2015 15:00:32 +0900, Michael Paquier 
>  wrote in 
> 

Re: [HACKERS] Error with index on unlogged table

2015-12-08 Thread Andres Freund
On 2015-12-03 22:09:43 +0100, Andres Freund wrote:
> On 2015-11-20 16:11:15 +0900, Michael Paquier wrote:
> > +   if (bkpb.fork == INIT_FORKNUM)
> > +   {
> > +   SMgrRelation srel;
> > +   srel = smgropen(bkpb.node, InvalidBackendId);
> > +   smgrimmedsync(srel, INIT_FORKNUM);
> > +   smgrclose(srel);
> > +   }
>
> A smgrwrite() instead of a smgrimmedsync() should be sufficient here.

More importantly, the smgrimmedsync() won't actually achieve anything -
RestoreBackupBlockContents() will just have written the data into shared
buffers. smgrimmedsync() doesn't flush that.

And further, just flushing the buffer directly to disk using
smgrwrite(), without reflecting that in the in-memory state, doesn't
seem prudent. At the very least we'll write all those blocks twice. It
also likely misses dealing with checksums and such.

What I think we really ought to do instead is to use "proper" buffer
functionality, e.g. flush the buffer via FlushBuffer().

It seems slightly better not to directly expose FlushBuffer() and
instead add a tiny wrapper. Couldn't come up with a grand name tho.

For me the attached, preliminary, patch, fixes the problem in master;
previous branches ought to look mostly similar, except the flush moved
to RestoreBackupBlockContents/RestoreBackupBlock.


Does anybody have a better idea? Suitable for the back-branches?


I'm kinda wondering if it wouldn't have been better to go through shared
buffers in ResetUnloggedRelationsInDbspaceDir() instead of using
copy_file().

Regards,

Andres
>From 9c4a7691595c1edfc12c5583cd007bb569f9db94 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 8 Dec 2015 13:56:18 +0100
Subject: [PATCH] Preliminary-unlogged-rel-recovery-fix

---
 src/backend/access/transam/xlogutils.c |  9 +++++++++
 src/backend/storage/buffer/bufmgr.c    | 21 +++++++++++++++++++++
 src/include/storage/bufmgr.h           |  1 +
 3 files changed, 31 insertions(+)

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index a5003c3..9073a85 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -368,6 +368,15 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
 
 		MarkBufferDirty(*buf);
 
+		/*
+		 * At the end of crash recovery the init forks of unlogged relations
+		 * are copied, without going through shared buffers. So we need to
+		 * force the on-disk state of init forks to always be in sync with the
+		 * state in shared buffers.
+		 */
+		if (forknum == INIT_FORKNUM)
+			FlushOneBuffer(*buf);
+
 		return BLK_RESTORED;
 	}
 	else
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 63188a3..b32543b 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -2989,6 +2989,27 @@ FlushDatabaseBuffers(Oid dbid)
 }
 
 /*
+ * Flush a previously, shared or exclusively, locked and pinned buffer to the
+ * OS.
+ */
+void
+FlushOneBuffer(Buffer buffer)
+{
+	BufferDesc *bufHdr;
+
+	/* currently not needed, but no fundamental reason not to support */
+	Assert(!BufferIsLocal(buffer));
+
+	Assert(BufferIsPinned(buffer));
+
+	bufHdr = GetBufferDescriptor(buffer - 1);
+
+	Assert(LWLockHeldByMe(bufHdr->content_lock));
+
+	FlushBuffer(bufHdr, NULL);
+}
+
+/*
  * ReleaseBuffer -- release the pin on a buffer
  */
 void
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 0f59201..09b2b11 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -176,6 +176,7 @@ extern void CheckPointBuffers(int flags);
 extern BlockNumber BufferGetBlockNumber(Buffer buffer);
 extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation,
 ForkNumber forkNum);
+extern void FlushOneBuffer(Buffer buffer);
 extern void FlushRelationBuffers(Relation rel);
 extern void FlushDatabaseBuffers(Oid dbid);
 extern void DropRelFileNodeBuffers(RelFileNodeBackend rnode,
-- 
2.6.0.rc3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: add \pset true/false

2015-12-08 Thread Kyotaro HORIGUCHI
Hello,

At Sat, 5 Dec 2015 21:05:29 +0900, Michael Paquier  
wrote in 
> On Fri, Dec 4, 2015 at 6:06 PM, Pavel Stehule  wrote:
> > long time I am dream about integrating Lua to psql
> >
> > It is fast enough for these purpose and can be used for custom macros, ..
> 
> Things are perhaps digressing too much here...

But is it another example of per-column pluggable filter? (I
cannot imagine a concrete usage scenario, though.)

> Regarding the patch, I
> would tend to think that we should just reject it and try to cruft
> something that could be more pluggable if there is really a need.
> Thoughts?

Honestly speaking, I feel similarly to you :p  For the original
objective, I personally would do something like the following.

psql postgres -c 'select * from tb' | perl -ne 's/t/○/g, s/f/×/g if ($. > 2); print'

I think that almost all replacements of this kind can be accomplished
by post-filtering like this, especially with the help of changing the
field separator.


I have shown what it would look like to use a DLL or a
script/executable for this purpose (the latter could be specified as a
command line parameter, but would otherwise have much the same shape).
The former looks too complicated for just this purpose, and the latter
looks simple but rather peculiar. However, I doubt there is another
way that is both simple and clean, and I doubt there are many uses for
a pluggable mechanism in psql.

A quite concise (and as yet unseen) way of platform-independent DLL
loading might allow us to use a DLL for this purpose?

Or, separating the platform-independent dll-loading stuff out of the
backend and sharing it with the frontend would allow this? (This could
be acceptable.)


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Minor comment update in setrefs.c

2015-12-08 Thread Etsuro Fujita
Hi,

Attached is a small patch to adjust a comment in setrefs.c: in
set_foreignscan_references, fdw_recheck_quals also gets adjusted to
reference the foreign scan tuple in the case of a foreign join, so I
added "etc." to the comment there, matching the comment for the case
of a simple foreign table scan.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***
*** 1108,1114  set_foreignscan_references(PlannerInfo *root,
  
  	if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
  	{
! 		/* Adjust tlist, qual, fdw_exprs to reference foreign scan tuple */
  		indexed_tlist *itlist = build_tlist_index(fscan->fdw_scan_tlist);
  
  		fscan->scan.plan.targetlist = (List *)
--- 1108,1117 
  
  	if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
  	{
! 		/*
! 		 * Adjust tlist, qual, fdw_exprs, etc. to reference foreign scan
! 		 * tuple
! 		 */
  		indexed_tlist *itlist = build_tlist_index(fscan->fdw_scan_tlist);
  
  		fscan->scan.plan.targetlist = (List *)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: add \pset true/false

2015-12-08 Thread Michael Paquier
On Tue, Dec 8, 2015 at 7:18 PM, Kyotaro HORIGUCHI
 wrote:
> At Sat, 5 Dec 2015 21:05:29 +0900, Michael Paquier 
>  wrote in 
> 
> > Regarding the patch, I
> > would tend to think that we should just reject it and try to cruft
> > something that could be more pluggable if there is really a need.
> > Thoughts?
>
> Honestly speaking, I feel similarly to you :p  For the original
> objective, I personally would do something like the following.

Are there other opinions? The -1 team is in the majority at the end of this thread..
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-08 Thread Etsuro Fujita

On 2015/12/08 17:27, Ashutosh Bapat wrote:

On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita wrote:



Generating paths
================

A join between two foreign relations is considered safe to push down if



4. The join conditions (e.g. conditions in the ON clause) are all safe to
   push down. This is important for OUTER joins as pushing down join
   clauses partially and applying the rest locally changes the result.
   There are ways [1] by which a partial OUTER join can be completed by
   applying unpushable clauses locally and then nullifying the nullable
   side and eliminating duplicate non-nullable side rows. But that's
   again out of scope of the first version of postgres_fdw join pushdown.



As for 4, as commented in the patch, we could relax the requirement
that all the join conditions (given by JoinPathExtraData's
restrictlist) need to be safe to push down to the remote server;
* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be
safe, while I think all the "joinclauses" need to be safe to get the
right results (where "joinclauses" and "otherclauses" are defined by
extract_actual_join_clauses).  And I think we should do this
relaxation to some extent for 9.6, to allow more joins to be pushed
down.



agreed. I will work on those.


Great!


Generating plan
===



The rest of this section describes the logic to construct the SQL for a
join; the logic is implemented as the function deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for a given joinrel (and now for
a baserel as well) recursively.
For joinrels
1. it constructs SQL representing either side of the join, by calling
   itself in recursive fashion.
2. These SQLs are converted into subqueries and become part of the FROM
   clause with the appropriate JOIN type and clauses. The left and right
   subqueries are given aliases "l" and "r" respectively. The columns in
   each subquery are aliased as "a1", "a2", "a3" and so on. Thus the
   third column on the left side can be referenced as "l.a3" at any
   recursion level.
3. A targetlist is added representing the columns in the join result
   expected at that level.
4. The join clauses are added as part of the ON clause.
5. Any clauses that the planner has deemed fit to be evaluated at that
   level of the join are added as part of the WHERE clause.



Honestly, I'm not sure that that is a good idea.  One reason for
that is that a query string constructed by the procedure is
difficult to read especially when the procedure is applied
recursively.  So, I'm thinking to revise the procedure so as to
construct a query string with a flattened FROM clause, as discussed
in eg, [2].



Just to confirm, the hook discussed in [2] is not in place yet, right? I
can find only one hook for foreign join:

typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
                                              RelOptInfo *joinrel,
                                              RelOptInfo *outerrel,
                                              RelOptInfo *innerrel,
                                              JoinType jointype,
                                              JoinPathExtraData *extra);

This hook takes an inner and outer relation, so it cannot be used for an
N-way join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join
pushdown in postgres_fdw? Am I missing something?


I don't mean it.  I'm thinking that I'll just revise the procedure so as 
to generate a FROM clause that is something like "from c left join d on 
(...) full join e on (...)" based on the existing hook you mentioned.



TODOs
=



In another thread Robert, Fujita-san and Kaigai-san are discussing
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw will need to be added once those changes get committed.



Yeah, we would need those changes including helper functions to
create a local join execution plan for that support.  I'd like to
add those changes to your updated patch if it's okay.



Right now, we do not have any support for postgres_fdw join pushdown. I
was thinking of adding at least minimal support for it using this
patch, maybe by preventing join pushdown when there are row marks, for
now. That way, we at least have some way to play with postgres_fdw
join pushdown.

Re: [HACKERS] Rework the way multixact truncations work

2015-12-08 Thread Andres Freund
On 2015-12-04 21:55:29 -0500, Noah Misch wrote:
> On Thu, Dec 03, 2015 at 07:03:21PM +0100, Andres Freund wrote:
> > Sorry, but I really just want to see these changes as iterative patches
> > ontop of 9.5/HEAD instead of this process. I won't revert the reversion
> > if you push it anyway, but I think it's a rather bad idea.
> 
> I hear you.

Not just me.

> I evaluated your request and judged that the benefits you cited
> did not make up for the losses I cited.  Should you wish to change my mind,
> your best bet is to find defects in the commits I proposed.  If I introduced
> juicy defects, that discovery would lend much weight to your conjectures.

I've absolutely no interest in "proving you wrong". And my desire to
review patches that are in a, in my opinion, barely reviewable format is
pretty low as well.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2015-12-08 Thread Andres Freund
Hi,

On 2015-12-08 12:53:49 +0300, Alexander Korotkov wrote:
> ​This is why atomic increment *could be* cheaper than loop over CAS and, it
> worth having experiments. ​Another idea is that we can put arbitrary logic
> between lwarx and stwcx. Thus, we can implement PinBuffer using single loop
> of lwarx and stwcx which could be better than loop of CAS.

You can't really put that much between an ll/sc - the hardware is only
able to track a very limited number of cacheline references.

> 3) lwlock-increment.patch – LWLockAttemptLock change state using atomic
> increment operation instead of loop of CAS. This patch does it for
> LWLockAttemptLock like pinunpin-increment.patch does for PinBuffer.
> Actually, this patch is not directly related to buffer manager. However,
> it's nice to test loop of CAS vs atomic increment in different places.

Yea, that's a worthwhile improvement. Actually it's how the first
versions of the lwlock patches worked - unfortunately I couldn't see big
differences on hardware I had available at the time.

There's some more trickyness required than what you have in your patch
(afaics at least). The problem is that when you 'optimistically'
increment by LW_VAL_SHARED and notice that there actually was another
locker, you possibly, until you've 'fixed' the state, are blocking new
exclusive lockers from acquiring the locks.  So you additionally need to
do special handling in these cases, and check the queue more.
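
For illustration, the optimistic path being discussed looks roughly like
this (a sketch only: LW_VAL_SHARED, LW_VAL_EXCLUSIVE and the pg_atomic_*
calls exist, but this function and its surroundings are hypothetical):

static bool
LWLockAttemptLockShared(LWLock *lock)
{
	uint32		old_state;

	/* optimistically add a shared-lock reference in one atomic operation */
	old_state = pg_atomic_fetch_add_u32(&lock->state, LW_VAL_SHARED);

	if (old_state & LW_VAL_EXCLUSIVE)
	{
		/* lock is held exclusively: undo the bogus increment ... */
		pg_atomic_fetch_sub_u32(&lock->state, LW_VAL_SHARED);

		/*
		 * ... and, per the point above, do extra work here: while the bogus
		 * increment was visible, exclusive lockers may have been blocked,
		 * so the wait queue needs to be rechecked.
		 */
		return false;
	}

	return true;
}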


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2015-12-08 Thread Alexander Korotkov
On Tue, Dec 8, 2015 at 1:04 PM, Andres Freund  wrote:

> On 2015-12-08 12:53:49 +0300, Alexander Korotkov wrote:
> > This is why atomic increment *could be* cheaper than a loop over CAS,
> > and it is worth experimenting. Another idea is that we can put arbitrary
> > logic between lwarx and stwcx. Thus, we can implement PinBuffer using a
> > single loop of lwarx and stwcx, which could be better than a loop of CAS.
>
> You can't really put that much between an ll/sc - the hardware is only
> able to track a very limited number of cacheline references.
>

I have some doubts about this, but I didn't find the place where it's
explicitly documented. In the case of LWLockAttemptLock there is not very
much between lwarx/stwcx: 4 instructions, while CAS has 2 instructions.
Could you please share a link to the docs, if any?


> > 3) lwlock-increment.patch – LWLockAttemptLock change state using atomic
> > increment operation instead of loop of CAS. This patch does it for
> > LWLockAttemptLock like pinunpin-increment.patch does for PinBuffer.
> > Actually, this patch is not directly related to buffer manager. However,
> > it's nice to test loop of CAS vs atomic increment in different places.
>
> Yea, that's a worthwhile improvement. Actually it's how the first
> versions of the lwlock patches worked - unfortunately I couldn't see big
> differences on hardware I had available at the time.
>
> There's some more trickyness required than what you have in your patch
> (afaics at least). The problem is that when you 'optimistically'
> increment by LW_VAL_SHARED and notice that there actually was another
> locker, you possibly, until you've 'fixed' the state, are blocking new
> exclusive lockers from acquiring the locks.  So you additionally need to
> do special handling in these cases, and check the queue more.
>

Agreed. This patch needs to be carefully verified. Current experiments
just show that it is a promising direction for improvement. I'll come up
with a better version of this patch.

Also, after testing on large machines I have another observation to share.
For now, LWLock doesn't guarantee that an exclusive lock will ever be
acquired (assuming each shared lock duration is finite). That is because
when there is no exclusive lock, new shared locks aren't queued and the
LWLock state is changed directly. Thus, a process which tries to acquire
an exclusive lock has to wait for a gap in the shared locks. But with high
concurrency for the shared lock that could happen very rarely, or never.

We did see this on a big Intel machine in practice. pgbench -S takes the
shared ProcArrayLock very frequently. Once some number of connections is
reached, new connections hang on getting the exclusive ProcArrayLock. I
think we could do some workaround for this problem. For instance, when an
exclusive lock waiter hits some timeout it could set some special bit
which prevents others from getting new shared locks.
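
Purely as an illustration of that last idea (the flag, its bit position
and these helpers are hypothetical; none of them exist in PostgreSQL):

#define LW_FLAG_BLOCK_NEW_SHARED	((uint32) 1 << 28)	/* hypothetical bit */

/* Called by an exclusive waiter once it has waited longer than a threshold. */
static void
LWLockBlockNewShared(LWLock *lock)
{
	pg_atomic_fetch_or_u32(&lock->state, LW_FLAG_BLOCK_NEW_SHARED);
}

/* The shared-lock fast path would consult the bit before incrementing. */
static bool
LWLockSharedFastPathAllowed(LWLock *lock)
{
	return (pg_atomic_read_u32(&lock->state) & LW_FLAG_BLOCK_NEW_SHARED) == 0;
}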

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


[HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Sandeep Thakkar
Hi

We have registered two new buildfarm animals for RHEL 7 on PPC64 and
PPC64LE (Little Endian). configure for the 9.3, 9.2 and 9.1 branches
failed on ppc64le with the error "error: cannot guess build type; you must
specify one". This is because config.guess in these branches doesn't
expect the arch to be ppc64le (in its case statement). When I passed the
build type "powerpc64le-unknown-linux-gnu" to the configure script, the
build succeeded.

So, config.guess should be changed to include the build type for ppc64le
like it is in 9.4+ branches.

-- 
Sandeep Thakkar


Re: [HACKERS] Error with index on unlogged table

2015-12-08 Thread Andres Freund
Hi,

On 2015-12-04 21:57:54 +0900, Michael Paquier wrote:
> On Fri, Dec 4, 2015 at 5:10 PM, Andres Freund  wrote:
> >> Let's go for XLOG_FPI_FLUSH.
> >
> > I think the other way is a bit better, because we can add new flags
> > without changing the WAL format.
> 
> Hm. On the contrary, I think that it would make more sense to have a
> flag as well for FOR_HINT honestly, those are really the same
> operations, and FOR_HINT is just here for statistic purposes.

I don't think it's all that much the same operation. And WRT statistics
purpose: Being able to easily differentiate FOR_HINT is important for
pg_xlogdump --stats, but not at all for XLOG_FPI_FLUSH.

> >> --- a/src/backend/access/transam/xloginsert.c
> >> +++ b/src/backend/access/transam/xloginsert.c
> >> @@ -932,10 +932,13 @@ XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
> >>   * If the page follows the standard page layout, with a PageHeader and 
> >> unused
> >>   * space between pd_lower and pd_upper, set 'page_std' to TRUE. That 
> >> allows
> >>   * the unused space to be left out from the WAL record, making it smaller.
> >> + *
> >> + * If 'is_flush' is set to TRUE, relation will be requested to flush
> >> + * immediately its data at replay after replaying this full page image.
> >>   */
> >
> > s/is_flush/flush_immed/? And maybe say that it 'will be flushed to the
> > OS immediately after replaying the record'?
> 
> s/OS/stable storage?

It's not flushed to stable storage here. Just to the OS.

> diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
> index 99337b0..fff48ab 100644
> --- a/src/backend/access/brin/brin.c
> +++ b/src/backend/access/brin/brin.c
> @@ -682,7 +682,15 @@ brinbuildempty(PG_FUNCTION_ARGS)
>   brin_metapage_init(BufferGetPage(metabuf), BrinGetPagesPerRange(index),
>  BRIN_CURRENT_VERSION);
>   MarkBufferDirty(metabuf);
> - log_newpage_buffer(metabuf, false);
> +
> + /*
> +  * For unlogged relations, this page should be immediately flushed
> +  * to disk after being replayed. This is necessary to ensure that the
> +  * initial on-disk state of unlogged relations is preserved as the
> +  * on-disk files are copied directly at the end of recovery.
> +  */
> + log_newpage_buffer(metabuf, false,
> + index->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED);
>   END_CRIT_SECTION();
>  
>   UnlockReleaseBuffer(metabuf);
> diff --git a/src/backend/access/brin/brin_pageops.c 
> b/src/backend/access/brin/brin_pageops.c
> index f876f62..572fe20 100644
> --- a/src/backend/access/brin/brin_pageops.c
> +++ b/src/backend/access/brin/brin_pageops.c
> @@ -865,7 +865,7 @@ brin_initialize_empty_new_buffer(Relation idxrel, Buffer 
> buffer)
>   page = BufferGetPage(buffer);
>   brin_page_init(page, BRIN_PAGETYPE_REGULAR);
>   MarkBufferDirty(buffer);
> - log_newpage_buffer(buffer, true);
> + log_newpage_buffer(buffer, true, false);
>   END_CRIT_SECTION();
>  
>   /*
> diff --git a/src/backend/access/gin/gininsert.c 
> b/src/backend/access/gin/gininsert.c
> index 49e9185..17c168a 100644
> --- a/src/backend/access/gin/gininsert.c
> +++ b/src/backend/access/gin/gininsert.c
> @@ -450,14 +450,22 @@ ginbuildempty(PG_FUNCTION_ARGS)
>   ReadBufferExtended(index, INIT_FORKNUM, P_NEW, RBM_NORMAL, 
> NULL);
>   LockBuffer(RootBuffer, BUFFER_LOCK_EXCLUSIVE);
>  
> - /* Initialize and xlog metabuffer and root buffer. */
> + /*
> +  * Initialize and xlog metabuffer and root buffer. For unlogged
> +  * relations, those pages need to be immediately flushed to disk
> +  * after being replayed. This is necessary to ensure that the
> +  * initial on-disk state of unlogged relations is preserved when
> +  * they get reset at the end of recovery.
> +  */
>   START_CRIT_SECTION();
>   GinInitMetabuffer(MetaBuffer);
>   MarkBufferDirty(MetaBuffer);
> - log_newpage_buffer(MetaBuffer, false);
> + log_newpage_buffer(MetaBuffer, false,
> + index->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED);
>   GinInitBuffer(RootBuffer, GIN_LEAF);
>   MarkBufferDirty(RootBuffer);
> - log_newpage_buffer(RootBuffer, false);
> + log_newpage_buffer(RootBuffer, false,
> + index->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED);
>   END_CRIT_SECTION();
>  
>   /* Unlock and release the buffers. */
> diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
> index 53bccf6..6a20031 100644
> --- a/src/backend/access/gist/gist.c
> +++ b/src/backend/access/gist/gist.c
> @@ -84,7 +84,8 @@ gistbuildempty(PG_FUNCTION_ARGS)
>   START_CRIT_SECTION();
>   GISTInitBuffer(buffer, F_LEAF);
>   MarkBufferDirty(buffer);
> - log_newpage_buffer(buffer, true);
> + log_newpage_buffer(buffer, true,
> + index->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED);

Re: [HACKERS] [PROPOSAL] Backup and recovery of pg_statistic

2015-12-08 Thread Simon Riggs
On 27 November 2015 at 15:52, Dmitry Ivanov  wrote:

Currently there is no way to backup 'pg_statistic' because columns of
> 'anyarray' type cannot be reconstructed solely with their textual
> representation. Meanwhile, all that is needed to solve this problem is a
> small
> extension capable of retrieving the element type of an 'anyarray' object
> and
> recreating this particular 'anyarray' object using the 'array_in'
> procedure.
>

Please submit a patch on core for this.

-- 
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2015-12-08 Thread Amit Kapila
On Tue, Dec 8, 2015 at 3:56 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:
>
> Agreed. This patch needs to be carefully verified. Current experiments
> just show that it is a promising direction for improvement. I'll come up
> with a better version of this patch.
>
> Also, after testing on large machines I have another observation to share.
> For now, LWLock doesn't guarantee that an exclusive lock will ever be
> acquired (assuming each shared lock duration is finite). That is because
> when there is no exclusive lock, new shared locks aren't queued and the
> LWLock state is changed directly. Thus, a process which tries to acquire
> an exclusive lock has to wait for a gap in the shared locks.
>

I think this has the potential to starve exclusive lockers in the worst case.


> But with high concurrency for the shared lock that could happen very
> rarely, or never.
>
> We did see this on a big Intel machine in practice. pgbench -S takes the
> shared ProcArrayLock very frequently. Once some number of connections is
> reached, new connections hang on getting the exclusive ProcArrayLock. I
> think we could do some workaround for this problem. For instance, when an
> exclusive lock waiter hits some timeout it could set some special bit
> which prevents others from getting new shared locks.
>
>
I think a timeout-based solution would lead to giving priority to
exclusive lock waiters (assume a case where each exclusive lock waiter
times out one after another) and make shared lockers wait, and a
timer-based solution might turn out to be costly for general cases where
the wait is not so long.  Another way could be to check whether the
exclusive locker has had to wait repeatedly for a couple of rounds, and
then set such a bit.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Tab-comletion for RLS

2015-12-08 Thread Masahiko Sawada
Hi all,

I found some missing tab-completions for RLS in 9.5.

* ALTER POLICY [TAB]
I expected the list of policy names to appear, but nothing appears.

* ALTER POLICY hoge_policy ON [TAB]
I expected the list of table names related to the specified policy to
appear, but all table names appear.

* ALTER POLICY ... ON ... TO [TAB]
I expected { role_name | PUBLIC | CURRENT_USER | SESSION_USER } to appear,
but only role_name and PUBLIC appear.
The same problem exists in "CREATE POLICY ... ON ... TO [TAB]".

Problems #1 and #2 exist in 9.5 and later, but #3 exists only in 9.5
because it was unintentionally fixed by commit
2f8880704a697312d8d10ab3a2ad7ffe4b5e3dfd. I think we should apply the
necessary part of that commit to 9.5 as well, though?

The attached patches are:
* 000_fix_tab_completion_rls.patch
  fixes problems #1 and #2, and is for the master branch and REL9_5_STABLE.
* 001_fix_tab_completion_rls_for_95.patch
  fixes problem #3, and is for REL9_5_STABLE only.

Regards,

--
Masahiko Sawada


000_fix_tab_completion_rls.patch
Description: Binary data


001_fix_tab_completion_rls_for_95.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Tom Lane
Sandeep Thakkar  writes:
> So, config.guess should be changed to include the build type for ppc64le
> like it is in 9.4+ branches.

So far as I can tell from a quick troll of the git history, we have never
ever updated config.guess/config.sub in released branches.  I'm a bit
hesitant to do it in this case either: it would amount to retroactively
adding support for a platform, which sure sounds like a new feature.

My vote would be to adjust your buildfarm critter to only try to build
9.4 and up.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PoC] Asynchronous execution again (which is not parallel)

2015-12-08 Thread Robert Haas
On Mon, Nov 30, 2015 at 7:47 AM, Kyotaro HORIGUCHI
 wrote:
> "Asynchronous execution" is a feature to start substantial work
> of nodes before doing Exec*. This can reduce total startup time
> by folding startup time of multiple execution nodes. Especially
> effective for the combination of joins or appends and their
> multiple children that needs long time to startup.
>
> This patch does that by inserting another phase "Start*" between
> ExecInit* and Exec* to launch parallel processing including
> pgworker and FDWs before requesting the very first tuple of the
> result.

I have thought about this, too, but I'm not very convinced that this
is the right model.  In a typical case involving parallelism, you hope
to have the Gather node as close to the top of the plan tree as
possible.  Therefore, the start phase will not happen much before the
first execution of the node, and you don't get much benefit.
Moreover, I think that prefetching can be useful not only at the start
of the query - which is the only thing that your model supports - but
also in mid-query.  For example, consider an Append of two ForeignScan
nodes.  Ideally we'd like to return the results in the order that they
become available, rather than serially.  This model might help with
that for the first batch of rows you fetch, but not after that.

There are a couple of other problems here that are specific to this
example.  You get a benefit here because you've got two Gather nodes
that both get kicked off before we try to read tuples from either, but
that's generally something to avoid - you can only use 3 processes and
typically at most 2 of those will actually be running (as opposed to
sleeping) at the same time: the workers will run to completion, and
then the leader will wake up and do its thing.   I'm not saying our
current implementation of parallel query scales well to a large number
of workers (it doesn't) but I think that's more about improving the
implementation than any theoretical problem, so this seems a little
worse.  Also, currently, both merge and hash joins have an
optimization wherein if the outer side of the join turns out to be
empty, we avoid paying the startup cost for the inner side of the
join; kicking off the work on the inner side of the merge join
asynchronously before we've gotten any tuples from the outer side
loses the benefit of that optimization.

I suspect there is no single paradigm that will help with all of the
cases where asynchronous execution is useful.  We're going to need a
series of changes that are targeted at specific problems.  For
example, here it would be useful to have one side of the join confirm
at the earliest possible stage that it will definitely return at least
one tuple eventually, but then return control to the caller so that we
can kick off the other side of the join.  The sort node never
eliminates anything, so as soon as the sequential scan underneath it
coughs up a tuple, we're definitely getting a return value eventually.
At that point it's safe to kick off the other Gather node.  I don't
quite know how to design a signalling system for that, but it could be
done.

But is it important enough to be worthwhile?  Maybe, maybe not.  I
think we should be working toward a world where the Gather is at the
top of the plan tree as often as possible, in which case
asynchronously kicking off a Gather node won't be that exciting any
more - see notes on the "parallelism + sorting" thread where I talk
about primitives that would allow massively parallel merge joins,
rather than 2 or 3 way parallel.  From my point of view, the case
where we really need some kind of asynchronous execution solution is a
ForeignScan, and in particular a ForeignScan which is the child of an
Append.  In that case it's obviously really useful to be able to kick
off all the foreign scans and then return a tuple from whichever one
coughs it up first.  Is that the ONLY case where asynchronous
execution is useful?  Probably not, but I bet it's the big one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 5:49 AM, Etsuro Fujita
 wrote:
> On 2015/12/08 3:06, Tom Lane wrote:
>> Robert Haas  writes:
>>> I think the core system likely needs visibility into where paths and
>>> plans are present in node trees, and putting them somewhere inside
>>> fdw_private would be going in the opposite direction.
>
>> Absolutely.  You don't really want FDWs having to take responsibility
>> for setrefs.c processing of their node trees, for example.  This is why
>> e.g. ForeignScan has both fdw_exprs and fdw_private.
>>
>> I'm not too concerned about whether we have to adjust FDW-related APIs
>> as we go along.  It's been clear from the beginning that we'd have to
>> do that, and we are nowhere near a point where we should promise that
>> we're done doing so.
>
> OK, I'd vote for Robert's idea, then.  I'd like to discuss the next
> thing about his patch.  As I mentioned in [1], the following change in
> the patch will break the EXPLAIN output.
>
> @@ -205,6 +218,11 @@ ExecInitForeignScan(ForeignScan *node, EState
> *estate, int eflags)
> scanstate->fdwroutine = fdwroutine;
> scanstate->fdw_state = NULL;
>
> +   /* Initialize any outer plan. */
> +   if (outerPlanState(scanstate))
> +   outerPlanState(scanstate) =
> +   ExecInitNode(outerPlan(node), estate, eflags);
> +
>
> As pointed out by Horiguchi-san, that's not correct, though; we should
> initialize the outer plan if outerPlan(node) != NULL, not
> outerPlanState(scanstate) != NULL.  Attached is an updated version of
> his patch.

Oops, good catch.

> I'm also attaching an updated version of the postgres_fdw
> join pushdown patch.

Is that based on Ashutosh's version of the patch, or are the two of
you developing independent of each other?  We should avoid dueling
patches if possible.

> You can find the breaking examples by doing the
> regression tests in the postgres_fdw patch.  Please apply the patches in
> the following order:
>
> epq-recheck-v6-efujita (attached)
> usermapping_matching.patch in [2]
> add_GetUserMappingById.patch in [2]
> foreign_join_v16_efujita2.patch (attached)
>
> As I proposed upthread, I think we could fix that by handling the outer
> plan as in the patch [3]; a) the core initializes the outer plan and
> stores it into somewhere in the ForeignScanState node, not the lefttree
> of the ForeignScanState node, during ExecInitForeignScan, and b) when
> the RecheckForeignScan routine gets called, the FDW extracts the plan
> from the given ForeignScanState node and executes it.  What do you think
> about that?

I think the actual regression test outputs are fine, and that your
desire to suppress part of the plan tree from showing up in the
EXPLAIN output is misguided.  I like it just the way it is.  To
prevent user confusion, I think that when we add support to
postgres_fdw for this we might also want to add some documentation
explaining how to interpret this EXPLAIN output, but I don't think
there's any problem with the output itself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-08 Thread Amit Kapila
On Tue, Dec 8, 2015 at 9:41 AM, Haribabu Kommi 
wrote:
>
> On Sat, Dec 5, 2015 at 3:31 AM, Alvaro Herrera 
wrote:
> > Haribabu Kommi wrote:
> >
> >> How about as follows?
> >>
> >> postgres=# select * from pg_hba_lookup('all','all','::1');
> >>  line_number | type  | database |  user   |  address  | hostname | method | options |  mode
> >> -------------+-------+----------+---------+-----------+----------+--------+---------+---------
> >>           84 | local | ["all"]  | ["all"] |           |          | trust  | {}      | skipped
> >>           86 | host  | ["all"]  | ["all"] | 127.0.0.1 |          | trust  | {}      | skipped
> >>           88 | host  | ["all"]  | ["all"] | ::1       |          | trust  | {}      | matched
> >> (3 rows)
> >
> > What did you do to the whitespace when posting that table?  I had to
> > reformat it pretty heavily to understand what you had.
> > Anyway, I think the "mode" column should be right after the line number.
> > I assume the "reason" for skipped lines is going to be somewhere in the
> > table too.
>
> When I copy-paste the output from psql, it doesn't come out properly, so
> I adjusted it to look right, but after sending the mail it still looks
> ugly.
>
> I added the reason column as the last column of the table and moved mode
> to the second column.
>

Few assorted comments:

1.
+ /*
+ * SQL-accessible SRF to return all the settings from the pg_hba.conf
+ * file.
+ */
+ Datum
+ pg_hba_lookup_2args(PG_FUNCTION_ARGS)
+ {
+ return pg_hba_lookup(fcinfo);
+ }
+
+ /*
+  * SQL-accessible SRF to return all the settings from the pg_hba.conf
+  * file.
+  */
+ Datum
+ pg_hba_lookup_3args(PG_FUNCTION_ARGS)
+ {
+ return pg_hba_lookup(fcinfo);
+ }

I think it is better to have a check on the number of arguments in the
above functions, similar to what we have in ginarrayextract_2args.
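
Something along these lines, mirroring ginarrayextract_2args (a sketch of
the suggestion, not a fragment of the patch):

Datum
pg_hba_lookup_2args(PG_FUNCTION_ARGS)
{
	if (PG_NARGS() != 2)		/* should not happen */
		elog(ERROR, "pg_hba_lookup_2args requires two arguments");
	return pg_hba_lookup(fcinfo);
}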

2.
+
+ /*
+ * Reload authentication config files too to refresh
+ * pg_hba_conf view data.
+ */
+ if (!load_hba())
+ {
+ ereport(DEBUG1,
+ (errmsg("Falure in reloading pg_hba.conf, pg_hba_conf view may show stale
information")));
+ load_hba_failure = true;
+ }
+
+ load_hba_failure = false;

Won't the above code set load_hba_failure to false even in the failure
case?
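
For instance, a corrected flow could look like this (sketch only):

	if (!load_hba())
	{
		ereport(DEBUG1,
				(errmsg("failure in reloading pg_hba.conf, pg_hba_conf view may show stale information")));
		load_hba_failure = true;
	}
	else
		load_hba_failure = false;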

3.
+ Datum
+ pg_hba_lookup(PG_FUNCTION_ARGS)
+ {
+ char *user;
+ char *database;
+ char *address;
+ char*hostname;
+ bool ssl_inuse = false;
+ struct sockaddr_storage addr;
+ hba_lookup_args_mode args_mode = TWO_ARGS_MODE; /* Minimum number of
arguments */
+
+ /*
+ * We must use the Materialize mode to be safe against HBA file reloads
+ * while the cursor is open. It's also more efficient than having to look
+ * up our current position in the parsed list every time.
+ */
+ ReturnSetInfo *rsi = (ReturnSetInfo *)fcinfo->resultinfo;
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ (errmsg("only superuser can view pg_hba.conf settings";

To be consistent with other similar messages, it is better to start this
message with "must be superuser ..."; refer to other similar superuser
checks in xlogfuncs.c.
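
I.e., something like this (sketch):

	if (!superuser())
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("must be superuser to view pg_hba.conf settings")));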



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Erroneous cost estimation for nested loop join

2015-12-08 Thread Robert Haas
On Wed, Dec 2, 2015 at 8:42 PM, Bruce Momjian  wrote:
> No one mentioned the random page docs so I will quote it here:
>
> 
> http://www.postgresql.org/docs/9.5/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
>
> Random access to mechanical disk storage is normally much more 
> expensive
> than four times sequential access. However, a lower default is used
> (4.0) because the majority of random accesses to disk, such as indexed
> reads, are assumed to be in cache. The default value can be thought of
> as modeling random access as 40 times slower than sequential, while
> expecting 90% of random reads to be cached.
>
> If you believe a 90% cache rate is an incorrect assumption for your
> workload, you can increase random_page_cost to better reflect the true
> cost of random storage reads. Correspondingly, if your data is likely 
> to
> be completely in cache, such as when the database is smaller than the
> total server memory, decreasing random_page_cost can be appropriate.
> Storage that has a low random read cost relative to sequential, e.g.
> solid-state drives, might also be better modeled with a lower value 
> for
> random_page_cost.
>
> What we don't have is way to know how much is in the cache, not only at
> planning time, but at execution time.  (Those times are often
> different for prepared queries.)  I think that is the crux of what has
> to be addressed here.
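
(To spell out the arithmetic in the passage quoted above: with 90% of
random reads assumed cached and essentially free, the expected cost per
random read is 0.1 * 40 = 4.0 sequential-page units, which is where the
4.0 default comes from.)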

I think that paragraph is more of an apology for the system that we've
got than a description of what a good one would look like.  If I have
a 1MB table and a 1TB, they are not equally likely to be cached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Should psql exit when the log file can't be written into?

2015-12-08 Thread Daniel Verite
  Hi,

Following last week's commit 344cdff :

Author: Tom Lane 
Date:   Thu Dec 3 14:28:58 2015 -0500

Clean up some psql issues around handling of the query output file.

Formerly, if "psql -o foo" failed to open the output file "foo", it would
print an error message but then carry on as though -o had not been
specified at all.  This seems contrary to expectation: a program that
cannot open its output file normally fails altogether.  Make psql do
exit(1) after reporting the error.
[...]

I notice that --log-file also reports an error but continues nonetheless
if the file can't be opened.
The attached trivial patch makes it fail and exit instead.

Also there's the fact that once opened, psql currently ignores errors
when writing to that file.

Without going as far as tracking every write in print.c,  there are at
least two places in the code where it would be easy to abort the
processing, should we want to, at the beginning of SendQuery() and
PSQLexec(). The code here looks like:

  if (pset.logfile)
  {
fprintf(pset.logfile,
_("* QUERY **\n"
  "%s\n"
  "**\n\n"), query);
fflush(pset.logfile);
  }

At this point if fprintf() returns a negative value, maybe we should
error out and exit rather than ignore it.
Or at least do that when ON_ERROR_STOP is on?
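
For example, the check could look something like this (a sketch; whether
to exit unconditionally or only under ON_ERROR_STOP is exactly the open
question):

  if (pset.logfile)
  {
    if (fprintf(pset.logfile,
                _("* QUERY **\n"
                  "%s\n"
                  "**\n\n"), query) < 0)
    {
      psql_error("could not write to log file: %s\n", strerror(errno));
      exit(EXIT_FAILURE);   /* or only abort when ON_ERROR_STOP is on */
    }
    fflush(pset.logfile);
  }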

The typical case when it might happen is a disk full condition.
In this scenario, I can imagine a psql exit(1) being both desirable
or not. Certain users could be satisfied that a script went to completion
even in this client-side degraded state.
But other users could be unhappy about the queries being executed
without leaving the expected traces, if the logging part happens to
be essential to them.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 4e5021a..bdd10ba 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -275,8 +275,11 @@ main(int argc, char *argv[])
 	{
 		pset.logfile = fopen(options.logfilename, "a");
 		if (!pset.logfile)
+		{
 			fprintf(stderr, _("%s: could not open log file \"%s\": %s\n"),
 	pset.progname, options.logfilename, strerror(errno));
+			exit(EXIT_FAILURE);
+		}
 	}
 
 	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-08 Thread Tom Lane
Andreas Seltenreich  writes:
>> I no longer see "failed to build any n-way joins" after pulling, but
>> there are still instances of "could not devise a query plan". Samples below.

> sorry, I spoke too soon: nine of the former have been logged through the
> night.  I'm attaching a larger set of sample queries this time in case
> that there are still multiple causes for the observed errors.

Hm.  At least in the first of these cases, the problem is that the code
I committed yesterday doesn't account for indirect lateral dependencies.
That is, if S1 depends on S2 which depends on the inner side of an outer
join, it now knows not to join S2 directly to the outer side of the outer
join, but it doesn't realize that the same must apply to S1.

Maybe we should redefine lateral_relids as the transitive closure of
a rel's lateral dependencies?  Not sure.
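
FWIW, the transitive-closure idea could be spelled out roughly like this
(a sketch of the idea only, wrapped in a hypothetical helper; not a tested
or committed fix):

static void
close_over_lateral_relids(PlannerInfo *root)
{
	bool		changed;

	/* Repeat until no rel's lateral_relids grows any further. */
	do
	{
		Index		rti;

		changed = false;
		for (rti = 1; rti < root->simple_rel_array_size; rti++)
		{
			RelOptInfo *brel = root->simple_rel_array[rti];
			int			k;

			if (brel == NULL || brel->lateral_relids == NULL)
				continue;

			k = -1;
			while ((k = bms_next_member(brel->lateral_relids, k)) >= 0)
			{
				RelOptInfo *orel = root->simple_rel_array[k];

				/* fold orel's own lateral dependencies into brel's */
				if (orel != NULL &&
					!bms_is_subset(orel->lateral_relids, brel->lateral_relids))
				{
					brel->lateral_relids =
						bms_add_members(brel->lateral_relids,
										orel->lateral_relids);
					changed = true;
				}
			}
		}
	} while (changed);
}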

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should psql exit when the log file can't be written into?

2015-12-08 Thread Tom Lane
"Daniel Verite"  writes:
> I notice that --log-file also reports an error but continues nonetheless
> if the file can't be opened.
> The attached trivial patch makes it fail and exit instead.

I agree with doing that.

> Also there's the fact that once opened, psql currently ignores errors
> when writing to that file.
> Without going as far as tracking every write in print.c,  there are at
> least two places in the code where it would be easy to abort the
> processing, should we want to, at the beginning of SendQuery() and
> PSQLexec().

I do not think this is a good approach.  Disk-full conditions, as an
example, are quite likely to appear and disappear over the course of
a run, if other programs are creating/deleting files.  Thus we might
lose some log output and never notice, if we only test for errors
here and there.  Providing partial coverage would then provide users
with a false sense of confidence that their log isn't truncated.

(It might work to check ferror() every so often, instead of checking
results for individual output calls, but I'm not sure if all the
functions we use will set ferror().)
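
A periodic check might look like this (sketch; where exactly to put it is
part of the question):

	/* e.g. once per query, or just before psql exits */
	if (pset.logfile && ferror(pset.logfile))
	{
		psql_error("error detected while writing to the log file\n");
		exit(EXIT_FAILURE);
	}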

I'm lukewarm about whether psql should complain about I/O errors other
than open failure, but I think if we do it, it needs to be full
coverage.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 12:24 PM, Andrew Dunstan  wrote:
>>> So, config.guess should be changed to include the build type for ppc64le
>>> like it is in 9.4+ branches.
>>
>> So far as I can tell from a quick troll of the git history, we have never
>> ever updated config.guess/config.sub in released branches.  I'm a bit
>> hesitant to do it in this case either: it would amount to retroactively
>> adding support for a platform, which sure sounds like a new feature.
>>
>> My vote would be to adjust your buildfarm critter to only try to build
>> 9.4 and up.
>>
>
> Or put what he said works in his critter's build-farm.conf in the
> config_opts section, something like:
>
> --build=powerpc64le-unknown-linux-gnu

I don't really want to get into an argument about this, but is the
reason we haven't updated config.guess and config.sub in the past that
it presents an actual stability risk, or just that nobody's asked
before?  Because the first one is a good reason not to do it now, but
the second one isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2015-12-08 Thread Robert Haas
On Thu, Dec 3, 2015 at 12:01 AM, David Rowley
 wrote:
> On 27 July 2015 at 04:58, Heikki Linnakangas  wrote:
>> This patch seems sane to me, as far as it goes. However, there's no
>> planner or executor code to use the aggregate combining for anything. I'm
>> not a big fan of dead code, I'd really like to see something to use this.
>
> I've attached an updated version of the patch. The main change from last
> time is that I've added executor support and exposed this to the planner via
> two new parameters in make_agg().
>
> I've also added EXPLAIN support, this will display "Partial
> [Hash|Group]Aggregate" for cases where the final function won't be called
> and displays "Finalize [Hash|Group]Aggregate" when combining states and
> finalizing aggregates.
>
> This patch is currently intended for foundation work for parallel
> aggregation.

Given Tom's dislike of executor nodes that do more than one thing, I
fear he's not going to be very happy about combining Aggregate,
PartialAggregate, FinalizeAggregate, GroupAggregate,
PartialGroupAggregate, FinalizeGroupAggregate, HashAggregate,
PartialHashAggregate, and FinalizeHashAggregate under one roof.
However, it's not at all obvious to me what would be any better.
nodeAgg.c is already a very big file, and this patch adds a
surprisingly small amount of code to it.

I think the parameter in CREATE AGGREGATE ought to be called
COMBINEFUNC rather than CFUNC.  There are a lot of English words that
begin with C, and it's not self-evident which one is meant.
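
For readers not following the patch closely: a combine function just merges two partial transition states into one.  As an illustrative example (not taken from the patch), for an aggregate whose transition state is a plain int8 running sum, combining is simply addition:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(int8_sum_combine);

/* Merge two partial states of a hypothetical int8 sum aggregate. */
Datum
int8_sum_combine(PG_FUNCTION_ARGS)
{
    int64       state1 = PG_GETARG_INT64(0);
    int64       state2 = PG_GETARG_INT64(1);

    PG_RETURN_INT64(state1 + state2);
}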

"iif performing the final aggregate stage we'll check the qual" should
probably say "If" with a capital letter rather than "iif" without one.

I think it would be useful to have a test patch associated with this
patch that generates a FinalizeAggregate + PartialAggregate combo
instead of an aggregate, and run that through the regression tests.
There will obviously be EXPLAIN differences, but in theory nothing
else should blow up.  Of course, such tests will become more
meaningful as we add more combine-functions, but this would at least
be a start.

Generally, I think that this patch will be excellent infrastructure
for parallel aggregate and I think we should try to get it committed
soon.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Weird portability issue in TestLib.pm's slurp_file

2015-12-08 Thread Tom Lane
For grins I tried running the TAP tests on my ancient HPUX box that
hosts pademelon and gaur.  At first they showed a lot of failures,
which I eventually determined were happening because slurp_file was
only retrieving part of the postmaster logfile, causing issues_sql_like
to mistakenly report a failure.  I don't know exactly why slurp_file
is misbehaving; it may well be a bug in perl 5.8.9.  But anyway,
rewriting it like this makes the problem go away:

diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 
da67f33c7e38929067350c7cf0da8fd8c5c1e43d..3940891e5d0533af93bacf3ff4af5a6fb5f10117
 100644
*** a/src/test/perl/TestLib.pm
--- b/src/test/perl/TestLib.pm
*** sub slurp_dir
*** 158,166 
  
  sub slurp_file
  {
local $/;
!   local @ARGV = @_;
!   my $contents = <>;
$contents =~ s/\r//g if $Config{osname} eq 'msys';
return $contents;
  }
--- 158,169 
  
  sub slurp_file
  {
+   my ($filename) = @_;
local $/;
!   open(my $in, $filename)
! or die "could not read \"$filename\": $!";
!   my $contents = <$in>;
!   close $in;
$contents =~ s/\r//g if $Config{osname} eq 'msys';
return $contents;
  }

To my admittedly-no-perl-expert eye, the existing coding is too cute
by half anyway, eg what it will do in error situations is documented
nowhere that I can find in man perlop.

Any objections to pushing this?

regards, tom lane

PS: I'm not planning to turn on TAP testing on pademelon/gaur; looks like
that would add about 40 minutes to what's already an hour and a half per
buildfarm run.  But it might be good to check it manually every so often.




Re: [HACKERS] Rework the way multixact truncations work

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 6:43 AM, Andres Freund  wrote:
> On 2015-12-04 21:55:29 -0500, Noah Misch wrote:
>> On Thu, Dec 03, 2015 at 07:03:21PM +0100, Andres Freund wrote:
>> > Sorry, but I really just want to see these changes as iterative patches
>> > ontop of 9.5/HEAD instead of this process. I won't revert the reversion
>> > if you push it anyway, but I think it's a rather bad idea.
>>
>> I hear you.
>
> Not just me.
>
>> I evaluated your request and judged that the benefits you cited
>> did not make up for the losses I cited.  Should you wish to change my mind,
>> your best bet is to find defects in the commits I proposed.  If I introduced
>> juicy defects, that discovery would lend much weight to your conjectures.
>
> I've absolutely no interest in "proving you wrong". And my desire to
> review patches that are in a, in my opinion, barely reviewable format is
> pretty low as well.

I agree.  Noah, it seems to me that you are offering a novel theory of
how patches should be submitted, reviewed, and committed, but you've
got three people, two of them committers, telling you that we don't
like that approach.  I seriously doubt you're going to find anyone who
does.  When stuff gets committed to the tree, people want to be
able to answer the question "what has just now changed?" and it is
indisputable that what you want to do here will make that harder.
That's not a one-time problem for Andres during the course of review;
that is a problem for every single person who looks at the commit
history from now until the end of time.  I don't think you have the
right to force your proposed approach through in the face of concerted
opposition.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Andrew Dunstan



On 12/08/2015 10:27 AM, Tom Lane wrote:

Sandeep Thakkar  writes:

So, config.guess should be changed to include the build type for ppc64le
like it is in 9.4+ branches.

So far as I can tell from a quick troll of the git history, we have never
ever updated config.guess/config.sub in released branches.  I'm a bit
hesitant to do it in this case either: it would amount to retroactively
adding support for a platform, which sure sounds like a new feature.

My vote would be to adjust your buildfarm critter to only try to build
9.4 and up.

Or put what he said works in his critter's build-farm.conf in the
config_opts section, something like:

--build=powerpc64le-unknown-linux-gnu

cheers

andrew




Re: [HACKERS] pg_rewind test race condition..?

2015-12-08 Thread Oleksii Kliukin
Hi,

On Wed, Apr 29, 2015, at 01:36 AM, Heikki Linnakangas wrote:
> The problem seems to be that when the standby is promoted, it's a 
> so-called "fast promotion", where it writes an end-of-recovery record 
> and starts accepting queries before creating a real checkpoint. 
> pg_rewind looks at the TLI in the latest checkpoint, as it's in the 
> control file, but that isn't updated until the checkpoint completes. I 
> don't see it on my laptop normally, but I can reproduce it if I insert a 
> "sleep(5)" in StartupXLog, just before it requests the checkpoint:
> 
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -7173,7 +7173,10 @@ StartupXLOG(void)
>* than is appropriate now that we're not in standby mode anymore.
>*/
>   if (fast_promoted)
> +   {
> +   sleep(5);
>   RequestCheckpoint(CHECKPOINT_FORCE);
> +   }
>   }
> 
> The simplest fix would be to force a checkpoint in the regression test, 
> before running pg_rewind. It's a bit of a cop out, since you'd still get 
> the same issue when you tried to do the same thing in the real world. It 
> should be rare in practice - you'd not normally run pg_rewind 
> immediately after promoting the standby - but a better error message at 
> least would be nice..

I think we came across this issue in production. We run a Python daemon
called 'Patroni' (https://github.com/zalando/patroni) in order to
automate failovers. It stops the current master before promotion of one
of the replicas, and runs pg_rewind on the former master to make it
"compatible" with the new one. Naturally, since pg_rewind is launched
automatically, it is called right after detecting that the new master is
running.

What we get sometimes is  "source and target cluster are on the same
timeline". Per the explanation from Heikki, it happens when pg_rewind
connects to the new master after promotion but before the first
checkpoint with the new timeline. Since "fast promotion" is the only
option for pg_ctl promote, I think it might be beneficial to add an
option for pg_rewind to issue a checkpoint (I worked around this
problem by adding a checkpoint call to Patroni; another way would be to
create the 'fallback_promote' file and do the 'slow' promote without
relying on pg_ctl promote functionality).
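
For illustration, roughly the comparison that trips here, simplified (the field names are from pg_control.h; the function name and wording are assumed):

#include "catalog/pg_control.h"

/*
 * After a fast promotion, checkPointCopy.ThisTimeLineID in the new
 * master's control file still carries the old timeline until the first
 * checkpoint on the new timeline completes, so this test can pass even
 * though the clusters have in fact diverged.
 */
static bool
clusters_look_like_same_timeline(const ControlFileData *source,
                                 const ControlFileData *target)
{
    return source->checkPointCopy.ThisTimeLineID ==
        target->checkPointCopy.ThisTimeLineID;
}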

There is another weird case I cannot explain easily. If I do a single
promotion/rewind while executing an insert that takes quite some time
(insert into test select id from generate_series(1, 1000) id), and that
insert is actually interrupted by the Patroni promotion (which does a
checkpoint and pg_ctl -mf stop on the former master, and pg_ctl promote
on the replica candidate), I get the following (postgresql0 is the
rewound node; note that the latest checkpoint location there is lower
than the prior checkpoint one):

$ pg_controldata data/postgresql0
pg_control version number:942
Catalog version number:   201511071
Database system identifier:   6225948429980199864
Database cluster state:   in archive recovery
pg_control last modified: Tue Dec  8 17:20:16 2015
Latest checkpoint location:   0/628
Prior checkpoint location:0/6000138
Latest checkpoint's REDO location:0/628
Latest checkpoint's REDO WAL file:00010006
Latest checkpoint's TimeLineID:   1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:  0/944
Latest checkpoint's NextOID:  16390
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:931
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTs:   0
Latest checkpoint's newestCommitTs:   0
Time of latest checkpoint:Tue Dec  8 17:20:04 2015
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location: 0/6020580
Min recovery ending loc's timeline:   2
Backup start location:0/0
Backup end location:  0/0
End-of-backup record required:no
wal_level setting:hot_standby
wal_log_hints setting:on
max_connections setting:  100
max_worker_processes setting: 8
max_prepared_xacts setting:   0
max_locks_per_xact setting:   64
track_commit_timestamp setting:   off
Maximum data alignment:   8
Database block size:  8192
Blocks per segment of large relation: 131072
WAL block size:   8192
Bytes per WAL segment:16777216
Maximum length of identifiers:64
Maximum columns in an index:  32
Maximum size of a TOAST chunk:1996
Size of a large-object chunk: 2048
Date/time type storage:   64-bit integers

Re: [HACKERS] Remaining 9.5 open items

2015-12-08 Thread Robert Haas
On Thu, Dec 3, 2015 at 9:51 PM, Noah Misch  wrote:
> On Tue, Dec 01, 2015 at 11:05:47AM -0500, Robert Haas wrote:
>> On Mon, Nov 30, 2015 at 2:43 PM, Tom Lane  wrote:
>> > * Foreign join pushdown vs EvalPlanQual
>> >
>> > Is this fixed by 5fc4c26db?  If not, what remains to do?
>>
>> Unfortunately, no.  That commit allows FDWs to do proper EPQ handling
>> for plain table scans, but it proves to be inadequate for EPQ handling
>> for joins. Solving that problem will require another patch, and,
>> modulo a bunch of cosmetic issues, I'm reasonably happy with KaiGai
>> Kohei's latest submission.  I'll respond in more detail on that
>> thread, but the question I want to raise here is: do we want to
>> back-patch those changes to 9.5 at this late date?
>
> Yes.  If 9.5 added a bad interface, better to fix the interface even now than
> to live with the bad one.

OK, I've pushed the latest patch for that issue to master and 9.5.
I'm not completely positive we've killed this problem dead, but I hope
so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-08 Thread Merlin Moncure
Hello,

The following patch deals with a long-standing gripe of mine: the
terminal frequently gets garbled when typing.  I guess this problem is
entirely dependent on pager settings and your interaction patterns with
the window (in particular, whether you tend to resize the window while
the pager is open).  Experimenting with the problem, it became pretty
clear that libreadline for whatever reason does not get the signal from
the kernel telling it that the bounds have changed.  This problem does
not manifest 100% of the time; you may have to get the pager to open,
resize the window, close the pager, and recall a previous long line (or
type a new one) several times to make the problem occur.  Nevertheless,
the attached seems to end the problem.

This adds a dependency in print.c on input.h, for the USE_READLINE macro
and the readline header.

merlin

diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 655850b..ede736e 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -27,6 +27,7 @@
 #include "common.h"
 #include "mbprint.h"
 #include "print.h"
+#include "input.h"

 /*
  * We define the cancel_pressed flag in this file, rather than common.c where
@@ -2247,6 +2248,13 @@ ClosePager(FILE *pagerpipe)
 #ifndef WIN32
pqsignal(SIGPIPE, SIG_DFL);
 #endif
+#ifdef USE_READLINE
+   /*
+* Force libreadline to recheck the terminal size in case pager may
+* have handled any terminal resize events.
+*/
+   rl_resize_terminal();
+#endif
}
 }




Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Tom Lane
Robert Haas  writes:
> I don't really want to get into an argument about this, but is the
> reason we haven't updated config.guess and config.sub in the past that
> it presents an actual stability risk, or just that nobody's asked
> before?  Because the first one is a good reason not to do it now, but
> the second one isn't.

Well, I see at least one case in the git history where we explicitly
declined to update config.guess/config.sub:

Author: Tom Lane 
Branch: master Release: REL9_3_BR [5c7603c31] 2013-06-04 15:42:02 -0400
Branch: REL9_2_STABLE Release: REL9_2_5 [612ecf311] 2013-06-04 15:42:21 -0400

Add ARM64 (aarch64) support to s_lock.h.

Use the same gcc atomic functions as we do on newer ARM chips.
(Basically this is a copy and paste of the __arm__ code block,
but omitting the SWPB option since that definitely won't work.)

Back-patch to 9.2.  The patch would work further back, but we'd also
need to update config.guess/config.sub in older branches to make them
build out-of-the-box, and there hasn't been demand for it.

Mark Salter


More generally, I think "does updating config.guess, in itself, pose
a stability risk" is a false statement of the issue.  The real issue is
do we want to start supporting a new platform in 9.1-9.3; that is, IMO
if we accept this request then we are buying into doing *whatever is
needed* to support ppc64le on those branches.  Maybe that will stop with
config.guess/config.sub, or maybe it won't.

Setting this precedent will also make it quite hard to reject future
requests to back-patch support for other new platforms.

I'm not planning to go to war about this issue either.  But I do think
there's a slippery-slope hazard here, and we should be prepared for
the logical consequences of accepting such a request.  Or if we're
going to have a policy allowing this request but not every such request,
somebody had better enunciate what that policy is.

regards, tom lane

(BTW, so far as direct stability hazards go, I would guess that the
key question is how much version skew can be tolerated between autoconf
and config.guess/config.sub. I have no idea about that; Peter E. might.
But I do note that pre-9.4 branches use an older autoconf version.)




Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-08 Thread Robert Haas
On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia  wrote:
> Thanks Ashutosh.
>
> Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> looks good to me.

This patch needs a rebase.

It's not going to work to say this is a patch proposed for commit when
it's still got a TODO comment in it that obviously needs to be
changed.   And the formatting of that long comment is pretty weird,
too, and not consistent with other functions in that same file (e.g.
get_remote_estimate, ec_member_matches_foreign, create_cursor).

Aside from that, I think before we commit this, somebody should do
some testing that demonstrates that this is actually a good idea.  Not
as part of the test case set for this patch, but just in general.
Merge joins are typically going to be relevant for large tables, but
the examples in the regression tests are necessarily tiny.  I'd like
to see some sample data and some sample queries that get appreciably
faster with this code.  If we can't find any, we don't need the code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2015-12-08 Thread Robert Haas
On Thu, Dec 3, 2015 at 1:48 AM, Amit Kapila  wrote:
> I think the way to address is don't add backend to Group list if it is
> not intended to update the same page as Group leader.  For transactions
> to be on different pages, they have to be 32768 transactionid's far apart
> and I don't see much possibility of that happening for concurrent
> transactions that are going to be grouped.

That might work.

>> My idea for how this could possibly work is that you could have a list
>> of waiting backends for each SLRU buffer page.
>
> Won't this mean that first we need to ensure that page exists in one of
> the buffers and once we have page in SLRU buffer, we can form the
> list and ensure that before eviction, the list must be processed?
> If my understanding is right, then for this to work we need to probably
> acquire CLogControlLock in Shared mode in addition to acquiring it
> in Exclusive mode for updating the status on page and performing
> pending updates for other backends.

Hmm, that wouldn't be good.  You're right: this is a problem with my
idea.  We can try what you suggested above and see how that works.  We
could also have two or more slots for groups - if a backend doesn't
get the lock, it joins the existing group for the same page, or else
creates a new group if any slot is unused.  I think it might be
advantageous to have at least two groups because otherwise things
might slow down when some transactions are rolling over to a new page
while others are still in flight for the previous page.  Perhaps we
should try it both ways and benchmark.
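
To make the idea concrete, a rough sketch of the slot-selection rule (all names here are invented, and the real thing would need atomics or a lock around the slot bookkeeping):

#define N_CLOG_GROUPS 2             /* two slots, as suggested above */

typedef struct ClogUpdateGroup
{
    int     pageno;                 /* clog page this group is batching updates for */
    int     nmembers;               /* 0 means the slot is free */
    /* per-member xid/status arrays would live here */
} ClogUpdateGroup;

static ClogUpdateGroup groups[N_CLOG_GROUPS];

/*
 * Return the group this backend should join (the caller would then add
 * itself under the appropriate lock), or NULL to update the page directly.
 */
static ClogUpdateGroup *
choose_clog_group(int pageno)
{
    int     i;

    for (i = 0; i < N_CLOG_GROUPS; i++)
    {
        if (groups[i].nmembers > 0 && groups[i].pageno == pageno)
            return &groups[i];      /* join the existing group for this page */
        if (groups[i].nmembers == 0)
        {
            groups[i].pageno = pageno;  /* claim an unused slot */
            return &groups[i];
        }
    }
    return NULL;                    /* no suitable slot; fall back to a direct update */
}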

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Fwd: [HACKERS] Another little thing about psql wrapped expanded output

2015-12-08 Thread Robert Haas
On Wed, Dec 2, 2015 at 5:24 PM, Tom Lane  wrote:
> Jeff Janes  writes:
>> On Wed, Dec 2, 2015 at 12:37 PM, Josh Berkus  wrote:
>>> If you're fixing the dashed-line code, is there a way to say that we
>>> never have more than a reasonable number of dashes (ideally, the width
>>> of the terminal) no matter how wide the data is?  Having 4000 dashes
>>> because of large text on one row is kinda painful, and not at all useful.
>
>> If you use the default format (\pset format aligned) in expanded mode, then
>> I agree with you we shouldn't print a half screen full of dashes to
>> separate every tuple.
>
> Don't think I agree.  Suppose that you have a wider-than-screen table
> and you use a pager to scroll left and right in that.  If we shorten the
> dashed lines, then once you scroll to the right of wherever they stop,
> you lose that visual cue separating the rows.  This matters a lot if
> only a few of the column values are very wide: everywhere else, there's
> gonna be lots of whitespace.

For what it's worth, I'm with Josh and Jeff.  My pager, like nearly
everybody else's, is less.  And it's not stupid to have a behavior
that works reasonably with less's default settings.  I haven't kept a
count of the number of times I've had to scroll down through endless
pages of dashes in order to find some data that's not dashes, but it's
surely quite a few.

Your point is also valid, so I don't mean to detract from that.  But
the status quo is definitely annoying.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-08 Thread Robert Haas
On Thu, Dec 3, 2015 at 11:28 PM, Michael Paquier
 wrote:
> This does not impact the run, but it creates unwelcome warnings in the
> logs. This is actually caused by the following code in PostgresNode
> that uses an incorrect check to see if the file has been correctly
> opened or not:
> open my $pidfile, $self->data_dir . "/postmaster.pid";
> if (not defined $pidfile)
>
> One way to fix this is to use if(open(...)), a second way I know of is
> to check if the opened file handle matches tell($pidfile) == -1. The
> patch attached uses the first method to fix the issue.

My Perl-fu must be getting weak.  What's wrong with the existing code?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Robert Haas
On Sun, Dec 6, 2015 at 9:27 AM, Michael Paquier
 wrote:
> On Sun, Dec 6, 2015 at 10:56 PM, Michael Paquier
>  wrote:
>> Thanks, I looked at that again and problem is fixed as attached.
>
> Er, not exactly... poll_query_until in PostgresNode.pm is using psql
> -c without the --no-psqlrc switch so this patch causes the regression
> tests of pg_rewind to fail. Fixed as attached.

Committed.  Go, team.

This has been a long time coming.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread David Fetter
On Tue, Dec 08, 2015 at 01:51:57PM -0500, Robert Haas wrote:
> On Fri, Dec 4, 2015 at 11:08 AM, Catalin Iacob  wrote:
> > On Fri, Dec 4, 2015 at 3:47 PM, Robert Haas  wrote:
> >> For the most part, the cleanups in this version are just cosmetic: I
> >> fixed some whitespace damage, and reverted some needless changes to
> >> the psql references page that were whitespace-only adjustments.  In a
> >> few places, I tweaked documentation or comment language.
> >
> > Sorry for the docs whitespace-only changes, I did that.
> >
> > I realized before the submission I made the diff bigger than it needed
> > to be, but that's because I used M-q in Emacs to break the lines I did
> > change and that reformatted the whole paragraph including some
> > unchanged lines. Breaking all the lines by hand would be quite a job,
> > and any time you go back and tweak the wording or so you need to do it
> > again. So I just used M-q and sent the result of that.
> > Do you happen to know of a better way to do this?
> 
> No.  I always redo the indentation by hand and then look at the diff
> afterwards to see if there's anything I can strip out.

There's also an excellent git check-whitespace thing Peter Eisentraut
put together:

http://peter.eisentraut.org/blog/2014/11/04/checking-whitespace-with-git/

> I also use vim rather than emacs, except when my hand is steady enough
> for the magnetized needle approach.[1]

I figured you for more the butterfly type.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] W-TinyLfu for cache eviction

2015-12-08 Thread Robert Haas
On Thu, Dec 3, 2015 at 2:27 AM, Vladimir Sitnikov
 wrote:
> I've recently noticed W-TinyLfu cache admission policy (see [1]) being
> used for caffeine "high performance caching library for Java 8".
> It demonstrates high cache hit ratios (see [2]) and enables to build
> high-throughput caches (see caffeine in [3])
> Authors explicitly allow implementations of the algorithm (see [4]).
>
> Does it make sense to evaluate the algorithm for buffer replacement?

Maybe.  Want to code it up?
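
For anyone looking for a starting point, here is a minimal, PostgreSQL-agnostic sketch of the TinyLFU admission idea: keep an approximate per-page access frequency (a tiny count-min sketch below) and admit a newcomer only if it looks hotter than the eviction victim.  The doorkeeper and aging parts of W-TinyLfu are omitted, and none of this is tied to the buffer manager yet:

#include <stdbool.h>
#include <stdint.h>

#define SKETCH_WIDTH 4096
#define SKETCH_DEPTH 4

static uint8_t counters[SKETCH_DEPTH][SKETCH_WIDTH];

static uint32_t
slot_for(uint32_t key, uint32_t row)
{
    uint32_t h = key * 2654435761u + row * 40503u;

    return (h ^ (h >> 16)) % SKETCH_WIDTH;
}

/* bump the approximate access count for a page */
static void
record_access(uint32_t key)
{
    int row;

    for (row = 0; row < SKETCH_DEPTH; row++)
    {
        uint8_t *c = &counters[row][slot_for(key, row)];

        if (*c < UINT8_MAX)
            (*c)++;
    }
}

/* count-min estimate: minimum over the rows */
static uint8_t
estimate(uint32_t key)
{
    uint8_t min = UINT8_MAX;
    int     row;

    for (row = 0; row < SKETCH_DEPTH; row++)
    {
        uint8_t c = counters[row][slot_for(key, row)];

        if (c < min)
            min = c;
    }
    return min;
}

/* TinyLFU admission: keep the victim unless the candidate looks hotter */
static bool
admit(uint32_t candidate, uint32_t victim)
{
    return estimate(candidate) > estimate(victim);
}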

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-08 Thread Pavel Stehule
2015-12-05 8:59 GMT+01:00 Pavel Stehule :

>
>
> 2015-11-30 16:34 GMT+01:00 Daniel Verite :
>
>> Pavel Stehule wrote:
>>
>> > [ \rotate being a wrong name ]
>>
>> Here's an updated patch.
>>
>
> Today I have a time to play with it. I am sorry for delay.
>
>
>>
>> First it renames the command to \crosstabview, which hopefully may
>> be more consensual, at least it's semantically much closer to crosstab .
>>
>
> Thank you very much - it is good name.
>
>
>>
>> > The important question is sorting output. The vertical header is
>> > sorted by first appearance in result. The horizontal header is
>> > sorted in ascending or descending order. This is unfriendly for
>> > often use case - month names. This can be solved by third parameter
>> > - sort function.
>>
>> I've thought that sorting with an external function would be too
>> complicated for this command, but sorting ascending by default
>> was not the right choice either.
>> So I've changed to sorting by first appearance in result (like the
>> vertical
>> header), and sorting ascending or descending only when specified
>> (with +colH or -colH syntax).
>>
>> So the synopsis becomes: \crosstabview [ colV [+ | -]colH ]
>>
>> Example with a time series (daily mean temperatures in Paris,2014),
>> month names across, day numbers down :
>>
>> select
>>   to_char(w_date,'DD') as day ,
>>   to_char(w_date,'Mon') as month,
>>   w_temp from weather
>>   where w_date between '2014-01-01' and '2014-12-31'
>>   order by w_date
>> \crosstabview
>>
>>  day | Jan | Feb | Mar | Apr | May | Jun | ...[cut]
>> -+-+-+-+-+-+-+-
>>  01  |   8 |   8 |   6 |  16 |  12 |  15 |
>>  02  |  10 |   6 |   6 |  15 |  12 |  16 |
>>  03  |  11 |   5 |   7 |  14 |  11 |  17 |
>>  04  |  10 |   6 |   8 |  12 |  12 |  14 |
>>  05  |   6 |   7 |   8 |  14 |  16 |  14 |
>>  06  |  10 |   9 |   9 |  16 |  17 |  20 |
>>  07  |  11 |  10 |  10 |  18 |  14 |  24 |
>>  08  |  11 |   8 |  12 |  10 |  13 |  22 |
>>  09  |  10 |   6 |  14 |  12 |  16 |  22 |
>>  10  |   6 |   7 |  14 |  14 |  14 |  19 |
>>  11  |   7 |   6 |  12 |  14 |  12 |  21 |
>> ...cut..
>>  28  |   4 |   7 |  10 |  12 |  14 |  16 |
>>  29  |   4 | |  14 |  10 |  15 |  16 |
>>  30  |   5 | |  14 |  14 |  17 |  18 |
>>  31  |   5 | |  14 | |  16 | |
>>
>> The month names come out in the expected order here,
>> contrary to what happened with the previous iteration of
>> the patch which forced a sort in all cases.
>> Here it plays out well because the single "ORDER BY w_date" is
>> simultaneously OK for the vertical and horizontal headers,
>> a common case for time series.
>>
>> For more complicated cases, when the horizontal and vertical
>> headers should be ordered independantly, and
>> in addition the horizontal header should not be sorted
>> by its values, I've toyed with the idea of sorting by another
>> column which would supposedly be added in the query
>> just for sorting, but it loses much in simplicity. For the more
>> complex stuff, users can always turn to the server-side methods
>> if needed.
>>
>>
> Usually you do not have a natural order for both dimensions - I miss a
> possibility to set [+/-] order for the vertical dimension
>
> For my query
>
> select sum(amount) as amount, to_char(date_trunc('month', closed),'TMmon')
> as Month, customer
>   from data group by customer, to_char(date_trunc('month', closed),
> 'TMmon'), extract(month from closed)
>   order by extract(month from closed);
>
> I cannot to push order by customer - and I have to use
>
>
> select sum(amount) as amount, extract(month from closed) as Month,
> customer from data group by customer, extract(month from closed) order by
> customer;
>
> and \crosstabview 3 +2
>
> So the possibility to enforce order for the vertical dimension and use data
> order for the horizontal dimension can be really useful. Another way is
> using a special column for sorting
>
> some like \crosstabview verticalcolumn horizontalcolumn
> sorthorizontalcolumn
>
>
> Next - I use "fetch_count" > 0. Your new version works only with
> "fetch_count <= 0". It is a limit - but I think it is acceptable. In this
> case some warning should be displayed - something like "crosstabview doesn't
> work with FETCH_COUNT > 0"
>
> I miss support for autocomplete and \?
>
>
> Regards
>
> Pavel
>
>
I made a few minor changes in your patch

1. autocomplete + warning on active FETCH_COUNT (the warning should be
replaced by an error, since the statement shows nothing)

2. support for labels

postgres=# \d data
   Table "public.data"
┌──┬─┬───┐
│  Column  │  Type   │ Modifiers │
╞══╪═╪═══╡
│ id   │ integer │   │
│ customer │ text│   │
│ name │ text│   │
│ amount   │ integer │   │
│ expected │ text│   │
│ closed   │ date│   │
└──┴─┴───┘

postgres=# select sum(amount) as amount, extract(month from 

Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Robert Haas
On Fri, Dec 4, 2015 at 11:08 AM, Catalin Iacob  wrote:
> On Fri, Dec 4, 2015 at 3:47 PM, Robert Haas  wrote:
>> For the most part, the cleanups in this version are just cosmetic: I
>> fixed some whitespace damage, and reverted some needless changes to
>> the psql references page that were whitespace-only adjustments.  In a
>> few places, I tweaked documentation or comment language.
>
> Sorry for the docs whitespace-only changes, I did that.
>
> I realized before the submission I made the diff bigger than it needed
> to be, but that's because I used M-q in Emacs to break the lines I did
> change and that reformatted the whole paragraph including some
> unchanged lines. Breaking all the lines by hand would be quite a job,
> and any time you go back and tweak the wording or so you need to do it
> again. So I just used M-q and sent the result of that.
> Do you happen to know of a better way to do this?

No.  I always redo the indentation by hand and then look at the diff
afterwards to see if there's anything I can strip out.

I also use vim rather than emacs, except when my hand is steady enough
for the magnetized needle approach.[1]

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://xkcd.com/378/




Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-08 Thread Tom Lane
Merlin Moncure  writes:
> The following patch deals with a long standing gripe of mine that the
> terminal frequently gets garbled so that when typing.

Hm.  I wonder whether rl_resize_terminal() exists in every iteration
of libreadline and libedit.

regards, tom lane




Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 1:35 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I don't really want to get into an argument about this, but is the
>> reason we haven't updated config.guess and config.sub in the past that
>> it presents an actual stability risk, or just that nobody's asked
>> before?  Because the first one is a good reason not to do it now, but
>> the second one isn't.
>
> Well, I see at least one case in the git history where we explicitly
> declined to update config.guess/config.sub:
>
> Author: Tom Lane 
> Branch: master Release: REL9_3_BR [5c7603c31] 2013-06-04 15:42:02 -0400
> Branch: REL9_2_STABLE Release: REL9_2_5 [612ecf311] 2013-06-04 15:42:21 -0400
>
> Add ARM64 (aarch64) support to s_lock.h.
>
> Use the same gcc atomic functions as we do on newer ARM chips.
> (Basically this is a copy and paste of the __arm__ code block,
> but omitting the SWPB option since that definitely won't work.)
>
> Back-patch to 9.2.  The patch would work further back, but we'd also
> need to update config.guess/config.sub in older branches to make them
> build out-of-the-box, and there hasn't been demand for it.
>
> Mark Salter

Right, but that commit cites lack of demand, rather than anything else.

> More generally, I think "does updating config.guess, in itself, pose
> a stability risk" is a false statement of the issue.  The real issue is
> do we want to start supporting a new platform in 9.1-9.3; that is, IMO
> if we accept this request then we are buying into doing *whatever is
> needed* to support ppc64le on those branches.  Maybe that will stop with
> config.guess/config.sub, or maybe it won't.

I'm sympathetic to that concern.  On the other hand, if somebody were
to discover, say, a bug in s_lock.h that causes a compile failure on
ppcle, I think we'd treat that as a back-patchable fix and whack it
into all supported branches.  If somebody were to come along and say,
you can't use GetWeirdWindowsCrap() on the latest version of Windows,
you have to instead use GetDumbWindowsCrap(), we would presumably
back-patch that as far as convenient also.  Are those kind of changes
adding support for a new platform, or just fixing bugs?  Now, on the
flip side, the original port to Windows was not back-patched, nor
should it have been: that was clearly major new development, not just
tweaking.

With respect to this particular thing, it's hard for me to imagine
that anything will go wrong on ppcle that we wouldn't consider a
back-patchable fix, so there might be no harm in adjusting
config.guess and config.sub.  On the other hand, maybe it'd be better
to get the buildfarm critter set up first (using the workaround Andrew
suggested) and then reevaluate.  If it seems to work, I see little
harm in saying, oh yeah sure it's supported, but we don't actually
know that yet.

> Setting this precedent will also make it quite hard to reject future
> requests to back-patch support for other new platforms.
>
> I'm not planning to go to war about this issue either.  But I do think
> there's a slippery-slope hazard here, and we should be prepared for
> the logical consequences of accepting such a request.  Or if we're
> going to have a policy allowing this request but not every such request,
> somebody had better enunciate what that policy is.

I don't have a real clear opinion about this just at the moment, but I
think "those changes look scary to back-patch" is a pretty decent
heuristic.  "That looks like new development" is another one I'd feel
confident to deploy.

> (BTW, so far as direct stability hazards go, I would guess that the
> key question is how much version skew can be tolerated between autoconf
> and config.guess/config.sub. I have no idea about that; Peter E. might.
> But I do note that pre-9.4 branches use an older autoconf version.)

That's a good question, and I wonder if this is why it works from 9.4.
I don't remember an explicit decision to begin supporting ppcle, and
I'm not sure in any case that I'd endorse the view that whatever
config.guess has heard of, we support.  I thought the policy was more
like "whatever we have a working BF member for, we support".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-08 Thread Merlin Moncure
On Tue, Dec 8, 2015 at 2:33 PM, Merlin Moncure  wrote:
> On Tue, Dec 8, 2015 at 2:02 PM, Tom Lane  wrote:
>> I wrote:
>>> Merlin Moncure  writes:
 The following patch deals with a long standing gripe of mine that the
 terminal frequently gets garbled so that when typing.
>>
>>> Hm.  I wonder whether rl_resize_terminal() exists in every iteration
>>> of libreadline and libedit.
>>
>> Quick followup: rl_resize_terminal() exists in GNU readline at least as
>> far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
>> at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
>> So we'd need a configure test for this.
>
> All right, I guess this answers the question I was thinking, 'can this
> be backpatched?' (basically, now).

er, meant to say, 'no'.

> merlin




Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-08 Thread Merlin Moncure
On Tue, Dec 8, 2015 at 2:02 PM, Tom Lane  wrote:
> I wrote:
>> Merlin Moncure  writes:
>>> The following patch deals with a long standing gripe of mine that the
>>> terminal frequently gets garbled so that when typing.
>
>> Hm.  I wonder whether rl_resize_terminal() exists in every iteration
>> of libreadline and libedit.
>
> Quick followup: rl_resize_terminal() exists in GNU readline at least as
> far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
> at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
> So we'd need a configure test for this.

All right, I guess this answers the question I was thinking, 'can this
be backpatched?' (basically, now).  I'll work up a configure test and
submit it to the appropriate commit fest.

merlin




Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-08 Thread Robert Haas
On Thu, Dec 3, 2015 at 3:52 AM, amul sul  wrote:
> Hi ALL,
>
> Need your suggestions.
> initially_valid flag is added to make column constraint valid. (commit : 
> http://www.postgresql.org/message-id/e1q2ak9-0006hk...@gemulon.postgresql.org)
>
>
> IFAICU, initially_valid and skip_validation values are mutually exclusive at 
> constraint creation(ref: gram.y), unless it set explicitly.
>
> Can we pass initially_valid instead of !skip_validation to StoreRelCheck() in 
> AddRelationNewConstraints(), as shown below?

The comments say this:

 * If skip_validation is true then we skip checking that the existing rows
 * in the table satisfy the constraint, and just install the catalog entries
 * for the constraint.  A new FK constraint is marked as valid iff
 * initially_valid is true.  (Usually skip_validation and initially_valid
 * are inverses, but we can set both true if the table is known empty.)

These comments are accurate.  If you create a FK constraint as not
valid, the system decides that it's really valid after all, but if you
add a CHECK constraint as not valid, the system just believes you:

rhaas=# create table foo (a serial primary key);
CREATE TABLE
rhaas=# create table bar (a int, foreign key (a) references foo (a)
not valid, check (a != 0) not valid);
CREATE TABLE
rhaas=# \d bar
  Table "public.bar"
 Column |  Type   | Modifiers
+-+---
 a  | integer |
Check constraints:
"bar_a_check" CHECK (a <> 0) NOT VALID
Foreign-key constraints:
"bar_a_fkey" FOREIGN KEY (a) REFERENCES foo(a)

I suspect this is an oversight.  We allowed FOREIGN KEY constraints to
be not valid in 722bf7017bbe796decc79c1fde03e7a83dae9ada by Simon
Riggs, but didn't add allow it for CHECK constraints until Alvaro's
commit of 897795240cfaaed724af2f53ed2c50c9862f951f a few months later.
My guess is that there's no reason for these not to behave in the same
way, but they don't.  Amul's proposed one-liner might be one part of
actually fixing that, but it wouldn't be enough by itself: you'd also
need to teach transformCreateStmt to set the initially_valid flag to
true, maybe by adding a new function transformCheckConstraints or so.
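
A rough sketch of what such a helper might look like (signature and placement assumed, not an actual patch); CreateStmtContext and the Constraint flags are the existing parse_utilcmd.c structures:

static void
transformCheckConstraints(CreateStmtContext *cxt, bool skipValidation)
{
    ListCell   *lc;

    foreach(lc, cxt->ckconstraints)
    {
        Constraint *constraint = (Constraint *) lfirst(lc);

        /* the new table is known empty, so the constraint is trivially valid */
        constraint->initially_valid = true;
        if (skipValidation)
            constraint->skip_validation = true;
    }
}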

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] fix for readline terminal size problems when window is resized with open pager

2015-12-08 Thread Tom Lane
I wrote:
> Merlin Moncure  writes:
>> The following patch deals with a long standing gripe of mine that the
>> terminal frequently gets garbled so that when typing.

> Hm.  I wonder whether rl_resize_terminal() exists in every iteration
> of libreadline and libedit.

Quick followup: rl_resize_terminal() exists in GNU readline at least as
far back as 4.0 (released Feb 1999).  However, it doesn't seem to be there
at all in libedit; I don't see it in OS X Yosemite's headers, anyway.
So we'd need a configure test for this.
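
Assuming a configure probe along the lines of AC_CHECK_FUNCS([rl_resize_terminal]) defining a symbol such as HAVE_RL_RESIZE_TERMINAL (the macro name is an assumption), the call in ClosePager() would then be wrapped like this:

#ifdef HAVE_RL_RESIZE_TERMINAL
    /* let readline re-read the terminal dimensions after the pager exits */
    rl_resize_terminal();
#endif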

regards, tom lane




Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Pavel Stehule
2015-12-08 20:09 GMT+01:00 Robert Haas :

> On Sun, Dec 6, 2015 at 9:27 AM, Michael Paquier
>  wrote:
> > On Sun, Dec 6, 2015 at 10:56 PM, Michael Paquier
> >  wrote:
> >> Thanks, I looked at that again and problem is fixed as attached.
> >
> > Er, not exactly... poll_query_until in PostgresNode.pm is using psql
> > -c without the --no-psqlrc switch so this patch causes the regression
> > tests of pg_rewind to fail. Fixed as attached.
>
> Committed.  Go, team.
>
> This has been a long time coming.
>

great, thank you very much you and all

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Michael Paquier
On Wed, Dec 9, 2015 at 5:08 AM, Pavel Stehule  wrote:
> 2015-12-08 20:09 GMT+01:00 Robert Haas :
>> On Sun, Dec 6, 2015 at 9:27 AM, Michael Paquier
>>  wrote:
>> > On Sun, Dec 6, 2015 at 10:56 PM, Michael Paquier
>> >  wrote:
>> >> Thanks, I looked at that again and problem is fixed as attached.
>> >
>> > Er, not exactly... poll_query_until in PostgresNode.pm is using psql
>> > -c without the --no-psqlrc switch so this patch causes the regression
>> > tests of pg_rewind to fail. Fixed as attached.
>>
>> Committed.  Go, team.
>>
>> This has been a long time coming.
>
>
> great, thank you very much you and all

Thanks! This is now marked as committed in the CF app...
-- 
Michael




Re: [HACKERS] Error with index on unlogged table

2015-12-08 Thread Andres Freund
On 2015-12-09 16:30:47 +0900, Michael Paquier wrote:
> > I'm kinda wondering if it wouldn't have been better to go through shared
> > buffers in ResetUnloggedRelationsInDbspaceDir() instead of using
> > copy_file().
> 
> For deployment with large shared_buffers settings, wouldn't that be
> actually more costly than the current way of doing? We would need to
> go through all the buffers at least once and look for the INIT_FORKNUM
> present to flush them.

We could just check the file sizes on disk, and then check the
contents of all the pages for each file.




Re: [HACKERS] Remaining 9.5 open items

2015-12-08 Thread Etsuro Fujita

On 2015/12/09 2:56, Robert Haas wrote:

On Thu, Dec 3, 2015 at 9:51 PM, Noah Misch  wrote:

On Tue, Dec 01, 2015 at 11:05:47AM -0500, Robert Haas wrote:

On Mon, Nov 30, 2015 at 2:43 PM, Tom Lane  wrote:

* Foreign join pushdown vs EvalPlanQual

Is this fixed by 5fc4c26db?  If not, what remains to do?



Unfortunately, no.  That commit allows FDWs to do proper EPQ handling
for plain table scans, but it proves to be inadequate for EPQ handling
for joins. Solving that problem will require another patch, and,
modulo a bunch of cosmetic issues, I'm reasonably happy with KaiGai
Kohei's latest submission.  I'll respond in more detail on that
thread, but the question I want to raise here is: do we want to
back-patch those changes to 9.5 at this late date?



Yes.  If 9.5 added a bad interface, better to fix the interface even now than
to live with the bad one.



OK, I've pushed the latest patch for that issue to master and 9.5.
I'm not completely positive we've killed this problem dead, but I hope
so.


Thank you for committing the patch!

Sorry, I overlooked a typo in docs: s/more that one/more than one/ 
Please find attached a patch.


Best regards,
Etsuro Fujita
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 0090e24..dc2d890 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -793,7 +793,7 @@ RecheckForeignScan (ForeignScanState *node, TupleTableSlot *slot);
  ForeignScan.  When a recheck is required, this subplan
  can be executed and the resulting tuple can be stored in the slot.
  This plan need not be efficient since no base table will return more
- that one row; for example, it may implement all joins as nested loops.
+ than one row; for example, it may implement all joins as nested loops.
 

 



Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Tom Lane
Robert Haas  writes:
> With respect to this particular thing, it's hard for me to imagine
> that anything will go wrong on ppcle that we wouldn't consider a
> back-patchable fix, so there might be no harm in adjusting
> config.guess and config.sub.

FWIW, I also suspect that supporting ppc64le would not really take
much more than updating config.guess/config.sub; there's no evidence
in the git logs that we had to fix anything else in the newer branches.

My concern here is about establishing project policy about whether
we will or won't consider back-patching support for newer platforms.
I think that the default answer should be "no", and I'd like to
see us set down some rules about what it takes to override that.

Obviously, setting up a buildfarm member helps a good deal.  But
is that sufficient, or necessary?

> I'm not sure in any case that I'd endorse the view that whatever
> config.guess has heard of, we support.

config.guess support is a necessary but certainly not sufficient
condition.

regards, tom lane




Re: [HACKERS] Using quicksort for every external sort run

2015-12-08 Thread Jeff Janes
On Mon, Dec 7, 2015 at 9:01 AM, Jeff Janes  wrote:
>
> The patched one ends with a 2-way, two sequential 233-way merges, and
> a final 233-way merge:
>
> LOG:  finished 2-way merge step: CPU 62.08s/435.70u sec elapsed 587.52 sec
> LOG:  finished 233-way merge step: CPU 77.94s/660.11u sec elapsed 897.51 sec
> LOG:  a multi-pass external merge sort is required (234 tape maximum)
> HINT:  Consider increasing the configuration parameter "maintenance_work_mem".
> LOG:  finished 233-way merge step: CPU 94.55s/884.63u sec elapsed 1185.17 sec
> LOG:  performsort done (except 233-way final merge): CPU
> 94.76s/884.69u sec elapsed 1192.01 sec
> LOG:  external sort ended, 2541656 disk blocks used: CPU
> 202.65s/1771.50u sec elapsed 3963.90 sec
>
>
> If you just look at the final merges of each, they should have the
> same number of tuples going through them (i.e. all of the tuples) but
> the patched one took well over twice as long, and all that time was IO
> time, not CPU time.
>
> I reversed out the memory pooling patch, and that shaved some time
> off, but nowhere near bringing it back to parity.
>
> I think what is going on here is that the different numbers of runs
> with the patched code just makes it land in an anti-sweet spot in the
> tape emulation and buffering algorithm.
>
> Each tape gets 256kB of buffer.  But two tapes have one third of the
> tuples each; the other third is spread over all the other tapes almost
> equally (or maybe one tape has 2/3 of the tuples, if the output of one
> 233-way nonfinal merge was selected as the input of the other one).
> Once the large tape(s) has depleted its buffer, the others have had
> only slightly more than 1kB each depleted.  Yet when it goes to fill
> the large tape, it also tops off every other tape while it is there,
> which is not going to get much read-ahead performance on them, leading
> to a lot of random IO.

The final merge only refills each tape buffer as that buffer gets
depleted, rather than refilling all of them whenever any is depleted,
so my explanation doesn't work. But move it back one layer.  There are
3 sequential 233-way merges.  The first one produces a giant tape run.
The second one consumes that giant tape run along with 232 small tape
runs.  At this point, the logic I describe above does come into play,
refilling each of the buffers for the small runs much too often,
freeing blocks on the tape emulation for those runs in dribs and
drabs.  Those free blocks get re-used by the giant output tape run, in
a scattered fashion.

Then in the next (final) merge, it has to read in this huge
fragmented tape run emulation, generating a lot of random IO to read
it.

With the patched code, the average length of reads on files in
pgsql_tmp between lseeks or changing to a different file descriptor is
8, while in the unpatched code it is 14.

>
> Now, I'm not sure why this same logic wouldn't apply to the unpatched
> code with 118-way merge too.  So maybe I am all wet here.  It seems
> like that imbalance would be enough to also cause the problem.

So my current theory is that it takes one large merge to generate an
unbalanced tape, one large merge where that large unbalanced tape
leads to fragmenting the output tape, and one final merge to be slowed
down by this fragmentation.

I looked at https://code.google.com/p/ioapps/ as Peter recommended,
but couldn't figure out what to do with it.  The only conclusion I got
from ioprofiler was that it spent a lot of time reading files in
pgsql_tmp.   I found just doing
strace -y -ttt -T -p 
and then analyzing with perl one-liners to work better, but it could
just be the learning curve.




Re: [HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-08 Thread Michael Paquier
On Wed, Dec 9, 2015 at 8:57 AM, Alvaro Herrera  wrote:
> Michael Paquier wrote:
>
>> This code should have checked for the return result of open instead of
>> looking at $pidfile. This has been noticed by Noah as well afterwards
>> and already addressed as 9821492.
>
> Oops, sorry I didn't credit you in the commit message.

That's fine. Don't worry.
-- 
Michael




Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-08 Thread Olson, Ken
I have downloaded a fresh copy of the Win x64 installer 
(postgresql-9.4.5-2-windows-x64.exe) from 
http://www.enterprisedb.com/products-services-training/pgdownload. The output 
of pg_config and associated directory listing of include/server:

BINDIR = C:/PROGRA~1/POSTGR~1/9.4/bin
DOCDIR = C:/PROGRA~1/POSTGR~1/9.4/doc
HTMLDIR = C:/PROGRA~1/POSTGR~1/9.4/doc
INCLUDEDIR = C:/PROGRA~1/POSTGR~1/9.4/include
PKGINCLUDEDIR = C:/PROGRA~1/POSTGR~1/9.4/include
INCLUDEDIR-SERVER = C:/PROGRA~1/POSTGR~1/9.4/include/server
LIBDIR = C:/PROGRA~1/POSTGR~1/9.4/lib
PKGLIBDIR = C:/PROGRA~1/POSTGR~1/9.4/lib
LOCALEDIR = C:/PROGRA~1/POSTGR~1/9.4/share/locale
MANDIR = C:/Program Files/PostgreSQL/9.4/man
SHAREDIR = C:/PROGRA~1/POSTGR~1/9.4/share
SYSCONFDIR = C:/Program Files/PostgreSQL/9.4/etc
PGXS = C:/Program Files/PostgreSQL/9.4/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = --enable-thread-safety --enable-integer-datetimes --enable-nls 
--with-ldap --with-ossp-uuid --with-libxml --with-libxslt --with-tcl 
--with-perl --with-python
VERSION = PostgreSQL 9.4.5
 Volume in drive C has no label.
 Volume Serial Number is 78D5-D08D

 Directory of C:\PROGRA~1\POSTGR~1\9.4\include\server

12/06/2015  01:58 PM  .
12/06/2015  01:58 PM  ..
12/06/2015  01:58 PM  access
12/06/2015  01:58 PM  bootstrap
12/06/2015  01:59 PM  catalog
12/06/2015  01:58 PM  commands
12/06/2015  01:58 PM  common
12/06/2015  01:58 PM  datatype
12/06/2015  01:58 PM  executor
12/06/2015  01:58 PM  foreign
12/06/2015  01:58 PM  lib
12/06/2015  01:58 PM  libpq
12/06/2015  01:58 PM  mb
12/06/2015  01:58 PM  nodes
12/06/2015  01:58 PM  optimizer
12/06/2015  01:58 PM  parser
12/06/2015  01:58 PM  port
12/06/2015  01:58 PM  portability
12/06/2015  01:58 PM  postmaster
12/06/2015  01:58 PM  regex
12/06/2015  01:58 PM  replication
12/06/2015  01:58 PM  rewrite
12/06/2015  01:58 PM  snowball
12/06/2015  01:58 PM  storage
12/06/2015  01:58 PM  tcop
12/06/2015  01:58 PM  tsearch
12/06/2015  01:58 PM  utils
11/19/2015  12:19 AM            30,882 c.h
11/19/2015  12:19 AM            30,626 fmgr.h
11/19/2015  12:19 AM            10,711 funcapi.h
11/19/2015  12:19 AM             3,986 getaddrinfo.h
11/19/2015  12:19 AM               660 getopt_long.h
11/19/2015  12:19 AM            15,482 miscadmin.h
11/19/2015  12:45 AM            21,702 pg_config.h
11/19/2015  12:45 AM               253 pg_config_ext.h
11/19/2015  12:19 AM            10,875 pg_config_manual.h
11/19/2015  12:45 AM            13,392 pg_config_os.h
11/19/2015  12:19 AM             1,084 pg_getopt.h
11/19/2015  12:19 AM               316 pg_trace.h
11/19/2015  12:19 AM            27,166 pgstat.h
11/19/2015  12:19 AM               606 pgtar.h
11/19/2015  12:19 AM             2,309 pgtime.h
11/19/2015  12:19 AM            27,489 plpgsql.h
11/19/2015  12:19 AM            14,096 port.h
11/19/2015  12:19 AM            21,398 postgres.h
11/19/2015  12:19 AM             2,109 postgres_ext.h
11/19/2015  12:19 AM               763 postgres_fe.h
11/19/2015  12:19 AM               843 rusagestub.h
11/19/2015  12:19 AM             2,379 windowapi.h
              22 File(s)        239,127 bytes
  27 Dir(s)  26,142,257,152 bytes free

Ken

-Original Message-
From: Chapman Flack [mailto:c...@anastigmatix.net] 
Sent: Saturday, December 05, 2015 4:07 PM
To: Olson, Ken
Subject: Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

Ken,

Do you have a moment to respond to Bruce's questions here, about what files
*are* put into $INCLUDEDIR_SERVER by the Windows PG installer you've used, and 
just what installer that is (supplied by EDB, right?)?

Thanks,
-Chap


On 12/05/15 15:35, Bruce Momjian wrote:
> So, for me, the big question is why dynloader.h isn't getting copied 
> into /include/server.  (There is a mention in the 'Makefile' about vpath 
> builds needing to copy dynloader.h manually --- is vpath being used 
> for the Windows installers somehow?)
> 
> Can you show me what files you have in /include/server, without your 
> copying the dynloader.h file manually?  Where did you get that Windows 
> installer?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Double linking MemoryContext children

2015-12-08 Thread Kevin Grittner
On Sun, Dec 6, 2015 at 7:30 PM, Jim Nasby  wrote:
> On 9/14/15 7:24 AM, Jan Wieck wrote:
>> On 09/12/2015 11:35 AM, Kevin Grittner wrote:
>>
>>> On the other hand, a grep indicates that there are two places that
>>> MemoryContextData.nextchild is set (and we therefore probably need
>>> to also set the new field), and Jan's proposed patch only changes
>>> one of them.  If we do this, I think we need to change both places
>>> that are affected, so ResourceOwnerCreate() in resowner.c would
>>> need a line or two added.
>>
>> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not
>> MemoryContextData.nextchild.
>
> Anything ever happen with this? 

Jan was right; the code for operating on resource owners was
similar enough that I mistook it for memory context code in a quick
review of grep results looking for any places that Jan might have
missed.  I went over it all again and couldn't resist adding an
Assert() at one point, but otherwise it looks good.

An optimized build without assertions runs his 2 statement DO
test in 25646.811 ms without the patch and 2933.754 ms with the
patch.
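
To illustrate the structure under discussion: with only a singly linked
child list, delinking a context from its parent requires walking the
sibling list to find the predecessor, whereas an additional back-link makes
the delink constant-time. A rough C sketch of the idea (field names are
illustrative and allocator details are elided; this is a hypothetical
outline, not the committed patch):

typedef struct MemoryContextData
{
	struct MemoryContextData *parent;		/* owning context */
	struct MemoryContextData *firstchild;	/* head of child list */
	struct MemoryContextData *prevchild;	/* back-link (the new field) */
	struct MemoryContextData *nextchild;	/* forward link */
	/* ... allocator-specific fields elided ... */
} MemoryContextData;

/* Unlink "context" from its parent's child list in O(1). */
static void
UnlinkFromParent(MemoryContextData *context)
{
	MemoryContextData *parent = context->parent;

	if (context->prevchild != NULL)
		context->prevchild->nextchild = context->nextchild;
	else if (parent != NULL)
		parent->firstchild = context->nextchild;

	if (context->nextchild != NULL)
		context->nextchild->prevchild = context->prevchild;

	context->parent = NULL;
	context->prevchild = NULL;
	context->nextchild = NULL;
}

With a singly linked list the same operation has to scan from
parent->firstchild until it finds the node whose nextchild is "context",
which is what made the delink O(n) in the number of siblings.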

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Equivalence Class Filters

2015-12-08 Thread Jim Nasby

On 12/7/15 7:26 PM, David Rowley wrote:

I was talking to Thomas Munro yesterday about this, and he mentioned
that it would be quite nice to have some stats on how much time is spent
in the planner, vs executor. He came up with the idea of adding a column
to pg_stat_statements to record the planning time.


I think that would be useful. Maybe something in pg_stat_database too.


If we could get real statistics on real world cases and we discovered
that, for example on average that the total CPU time of planning was
only 1% of execution time, then it would certainly make adding 2%
overhead onto the planner a bit less of a worry, as this would just be
2% of 1% (0.02%). Such information, if fed back into the community might
be able to drive us in the right direction when it comes to deciding
what needs to be done to resolve this constant issue with accepting
planner improvement patches.


Might be nice, but I think it's also pretty unnecessary.

I've dealt with dozens of queries that took minutes to hours to run. Yet 
I can't recall ever having an EXPLAIN on one of these take more than a 
few seconds. I tend to do more OLTP stuff so maybe others have 
experienced long-running EXPLAIN, in which case it'd be great to know that.


Actually, I have seen long EXPLAIN times, but only as part of trying to 
aggressively increase *_collapse_limit. IIRC I was able to increase one 
of those to 14 and one to 18 before plan time became unpredictably bad 
(it wasn't always bad though, just sometimes).



I believe that with parallel query on the horizon for 9.6 that we're now
aiming to support bigger OLAP type database than ever before. So if we
ignore patches like this one then it appears that we have some
conflicting goals in the community as it seems that we're willing to add
the brawn, but we're not willing to add the brain. If this is the case
then it's a shame, as I think we can have both. So I very much agree on
the fact that we must find a way to maintain support and high
performance of small OLTP databases too.


+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HELP!!! The WAL Archive is taking up all space

2015-12-08 Thread FattahRozzaq
Hi all,

Thank you for all of your responses.
Meanwhile, I will repost this at pgsql-gene...@postgresql.org


Regards,
Fattah

On 09/12/2015, David G. Johnston  wrote:
> On Tue, Dec 8, 2015 at 4:43 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Tue, Dec 8, 2015 at 3:33 AM, FattahRozzaq  wrote:
>>
>>> Hi all,
>>>
>>> Please help...
>>>
>>> I have 1 master PostgreSQL and 1 standby PostgreSQL.
>>> Both servers has the same OS Linux Debian Wheezy, the same hardware.
>>>
>>> Both server hardware:
>>> CPU: 24 cores
>>> RAM: 128GB
>>> Disk-1: 800GB SAS (for OS, logs, WAL archive directory)
>>> Disk-2: 330GB SSD (for PostgreSQL data directory, except WAL archive
>>> and except pg_log)
>>>
>>> The part of the configuration are as below:
>>> checkpoint_segments = 64
>>> checkpoint_completion_target = 0.9
>>> default_statistics_target = 10
>>> maintenance_work_mem = 1GB
>>> effective_cache_size = 64GB
>>> shared_buffers = 24GB
>>> work_mem = 5MB
>>> wal_buffers = 8MB
>>> wal_keep_segments = 4096
>>> wal_level = hot_standby
>>> max_wal_senders = 10
>>> archive_mode = on
>>> archive_command = 'cp -i %p /home/postgres/archive/master/%f'
>>>
>>>
>>> The WAL archive is at /home/postgres/archive/master/, right?
>>> This directory consume more than 750GB of Disk-1.
>>> Each segment in the /home/postgres/archive/master/ is 16MB each
>>> There are currently 47443 files in this folder.
>>>
>>> I want to limit the total size use by WAL archive to around 200-400 GB.
>>>
>>> Do I set the segment too big?
>>> wal_keep_segments = 4096
>>> checkpoint_segments = 64
>>>
>>> What value should I set for it?
>>>
>>
>> In which case you need to calculate how long it takes to accumulate that
>> much archive data and then perform a base backup roughly that often after
>> which point any WAL older that the point at which you began the backup
>> can
>> be removed.
>>
>> You cannot just limit how large the WAL archive is since removing any WAL
>> file will pretty much make any attempt at restoration fail.​
>>
>> David J.
>>
>>
> ​While valid I missed that you have a streaming replica on the other end
> that should be removing files as they are loaded pending the retention
> setting...see Michael's response for better information.
>
> David J.​
>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HELP!!! The WAL Archive is taking up all space

2015-12-08 Thread Michael Paquier
On Tue, Dec 8, 2015 at 7:33 PM, FattahRozzaq  wrote:
> The WAL archive is at /home/postgres/archive/master/, right?
> This directory consume more than 750GB of Disk-1.
> Each segment in the /home/postgres/archive/master/ is 16MB each
> There are currently 47443 files in this folder.
>
> I want to limit the total size use by WAL archive to around 200-400 GB.

This kind of question is better suited to pgsql-general. pgsql-hackers
is where discussions related to features and development happen.

There is no magic value. This depends on the data policy retention you
want to have for your backups. More information here:
http://www.postgresql.org/docs/devel/static/continuous-archiving.html
If you don't need this many segments, you should just decrease it. If
you need more, buy more disk space.

> Do I set the segment too big?
> wal_keep_segments = 4096
> What value should I set for it?

That's a lot, but it depends on what you are after; it allows up to 200GB
of WAL segments to be kept. This would be useful if you expect to recover
large instances, i.e. where a base backup takes a long time, so that a
standby replaying behind can still connect back to its root node within
this window of segments.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Double linking MemoryContext children

2015-12-08 Thread Thom Brown
On 7 December 2015 at 01:30, Jim Nasby  wrote:
> On 9/14/15 7:24 AM, Jan Wieck wrote:
>>
>> On 09/12/2015 11:35 AM, Kevin Grittner wrote:
>>
>>> On the other hand, a grep indicates that there are two places that
>>> MemoryContextData.nextchild is set (and we therefore probably need
>>> to also set the new field), and Jan's proposed patch only changes
>>> one of them.  If we do this, I think we need to change both places
>>> that are affected, so ResourceOwnerCreate() in resowner.c would
>>> need a line or two added.
>>
>>
>> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not
>> MemoryContextData.nextchild.
>
>
> Anything ever happen with this? 

Yeah, it was committed... a few mins ago.

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: [HACKERS] Another little thing about psql wrapped expanded output

2015-12-08 Thread David G. Johnston
On Tue, Dec 8, 2015 at 12:36 PM, Robert Haas  wrote:

> On Wed, Dec 2, 2015 at 5:24 PM, Tom Lane  wrote:
> > Jeff Janes  writes:
> >> On Wed, Dec 2, 2015 at 12:37 PM, Josh Berkus  wrote:
> >>> If you're fixing the dashed-line code, is there a way to say that we
> >>> never have more than a reasonable number of dashes (ideally, the width
> >>> of the terminal) no matter how wide the data is?  Having 4000 dashes
> >>> because of large text on one row is kinda painful, and not at all
> useful.
> >
> >> If you use the default format (\pset format aligned) in expanded mode,
> then
> >> I agree with you we shouldn't print a half screen full of dashes to
> >> separate every tuple.
> >
> > Don't think I agree.  Suppose that you have a wider-than-screen table
> > and you use a pager to scroll left and right in that.  If we shorten the
> > dashed lines, then once you scroll to the right of wherever they stop,
> > you lose that visual cue separating the rows.  This matters a lot if
> > only a few of the column values are very wide: everywhere else, there's
> > gonna be lots of whitespace.
>
> For what it's worth, I'm with Josh and Jeff.  My pager, like nearly
> everybody else's, is less.  And it's not stupid to have a behavior
> that works reasonably with less's default settings.  I haven't kept a
> count of the number of times I've had to scroll down through endless
> pages of dashes in order to find some data that's not dashes, but it's
> surely quite a few.
>
> Your point is also valid, so I don't mean to detract from that.  But
> the status quo is definitely annoying.


​for those wishing to change the status quo the question is whether there
needs to be a way to get back to the present behavior and, more generally,
configure the behavior to taste while still having a reasonable default.

Losing a bit of usability in not being able to identify record boundaries
while viewing off to the right is a trade-off that feels right to
me.  During interactive use SELECT * is quite useful but is hampered on
relations that just happen to have a wide column that you don't care about
but also don't want to waste the effort to specify all column names except
that one.

​So +1 from me.

David J.​
​


Re: [HACKERS] Patch: ResourceOwner optimization for tables with many partitions

2015-12-08 Thread Robert Haas
On Fri, Dec 4, 2015 at 7:15 AM, Aleksander Alekseev
 wrote:
> Current implementation of ResourceOwner uses arrays to store resources
> like TupleDescs, Snapshots, etc. When we want to release one of these
> resources ResourceOwner finds it with linear scan. Granted, resource
> array are usually small so its not a problem most of the time. But it
> appears to be a bottleneck when we are working with tables which have a
> lot of partitions.
>
> To reproduce this issue:
> 1. run `./gen.pl 1 | psql my_database postgres`
> 2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database`
> 3. in second terminal run `sudo perf top -u postgres`
>
> Both gen.pl and q.sql are attached to this message.
>
> You will see that postgres spends a lot of time in ResourceOwnerForget*
> procedures:
>
>  32.80%  postgres   [.] list_nth
>  20.29%  postgres   [.] ResourceOwnerForgetRelationRef
>  12.87%  postgres   [.] find_all_inheritors
>   7.90%  postgres   [.] get_tabstat_entry
>   6.68%  postgres   [.] ResourceOwnerForgetTupleDesc
>   1.17%  postgres   [.] hash_search_with_hash_value
>  ... < 1% ...
>
> I would like to suggest a patch (see attachment) which fixes this
> bottleneck. Also I discovered that there is a lot of code duplication in
> ResourceOwner. Patch fixes this too. The idea is quite simple. I just
> replaced arrays with something that could be considered hash tables,
> but with some specific optimizations.
>
> After applying this patch we can see that bottleneck is gone:
>
>  42.89%  postgres   [.] list_nth
>  18.30%  postgres   [.] find_all_inheritors
>  10.97%  postgres   [.] get_tabstat_entry
>   1.82%  postgres   [.] hash_search_with_hash_value
>   1.21%  postgres   [.] SearchCatCache
>  ... < 1% ...
>
> For tables with thousands of partitions we gain on average 32.5% more TPS.
>
> As far as I can see in the same time patch doesn't make anything worse.
> `make check` passes with asserts enabled and disabled. There is no
> performance degradation according to both standard pgbench benchmark
> and benchmark described above for tables with 10 and 100 partitions.

I do think it's interesting to try to speed up the ResourceOwner
mechanism when there are many resources owned.  We've had some forays
in that direction in the past, and I think it's useful to continue
that work.  But I also think that this patch, while interesting, is
not something we can seriously consider committing in its current
form, because:

1. It lacks adequate comments and submission notes to explain the
general theory of operation of the patch.

2. It seems to use uint8 * rather freely as a substitute for char * or
void *, which doesn't seem like a great idea.

3. It doesn't follow PostgreSQL formatting conventions (uncuddled
curly braces, space after if and before open paren, etc.).

4. It mixes together multiple ideas in a single patch, not only
introducing a hashing concept but also striping a brand-new layer of
abstraction across the resource-owner mechanism.  I am not sure that
layer of abstraction is a very good idea, but if it needs to be done,
I think it should be a separate patch.
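
For context, the core idea of the submission (replacing the linear-scan
arrays with a pointer-keyed hash table so that the ResourceOwnerForget*
operations stop being O(n)) can be sketched in isolation roughly as
follows. This is a simplified, hypothetical illustration only: table growth
and the PostgreSQL-specific resource types are omitted, and it is not the
submitted patch.

#include <stddef.h>
#include <stdint.h>

#define RESHASH_SLOTS 64	/* power of two; a real table must grow when nearly full */

typedef struct ResourceHash
{
	void	   *slots[RESHASH_SLOTS];
	int			nitems;
} ResourceHash;

static uint32_t
ptr_hash(const void *p)
{
	/* multiplicative hash of the pointer value */
	return (uint32_t) (((uintptr_t) p >> 4) * 2654435761u);
}

/* Remember a resource reference; expected O(1) with linear probing. */
static void
reshash_remember(ResourceHash *h, void *res)
{
	uint32_t	i = ptr_hash(res) & (RESHASH_SLOTS - 1);

	while (h->slots[i] != NULL)
		i = (i + 1) & (RESHASH_SLOTS - 1);
	h->slots[i] = res;
	h->nitems++;
}

/* Forget a resource reference; expected O(1), versus a linear array scan. */
static int
reshash_forget(ResourceHash *h, void *res)
{
	uint32_t	i = ptr_hash(res) & (RESHASH_SLOTS - 1);

	while (h->slots[i] != NULL && h->slots[i] != res)
		i = (i + 1) & (RESHASH_SLOTS - 1);
	if (h->slots[i] == NULL)
		return 0;				/* was never remembered */

	/*
	 * Clear the slot, then re-insert the remainder of the probe cluster so
	 * that later lookups still find entries that had probed past this slot.
	 */
	h->slots[i] = NULL;
	h->nitems--;
	for (i = (i + 1) & (RESHASH_SLOTS - 1);
		 h->slots[i] != NULL;
		 i = (i + 1) & (RESHASH_SLOTS - 1))
	{
		void	   *moved = h->slots[i];

		h->slots[i] = NULL;
		h->nitems--;
		reshash_remember(h, moved);
	}
	return 1;
}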

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 6:09 PM, Michael Paquier
 wrote:
> On Wed, Dec 9, 2015 at 4:47 AM, Robert Haas  wrote:
>> On Thu, Dec 3, 2015 at 11:28 PM, Michael Paquier
>>  wrote:
>> > This does not impact the run, but it creates unwelcome warnings in the
>> > logs. This is actually caused by the following code in PostgresNode
>> > that uses an incorrect check to see if the file has been correctly
>> > opened or not:
>> > open my $pidfile, $self->data_dir . "/postmaster.pid";
>> > if (not defined $pidfile)
>> >
>> > One way to fix this is to use if(open(...)), a second way I know of is
>> > to check if the opened file handle matches tell($pidfile) == -1. The
>> > patch attached uses the first method to fix the issue.
>>
>> My Perl-fu must be getting weak.  What's wrong with the existing code?
>
> This code should have checked for the return result of open instead of
> looking at $pidfile. This has been noticed by Noah as well afterwards
> and already addressed as 9821492.

I see that open() returns a value, but I figured $pidfile would end up
as undef if open failed.  I see that's not the case:

[rhaas pgsql]$ perl -MData::Dumper -e 'open(my $pidfile, "<",
"/fscsfasf") || warn $!; print Dumper($pidfile);'
No such file or directory at -e line 1.
$VAR1 = \*{'::$pidfile'};

Boy, that's awful.  Whoever designed that bit of wonderfulness should
have their language design license revoked.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] HELP!!! The WAL Archive is taking up all space

2015-12-08 Thread FattahRozzaq
Hi all,

Please help...

I have 1 master PostgreSQL and 1 standby PostgreSQL.
Both servers has the same OS Linux Debian Wheezy, the same hardware.

Both server hardware:
CPU: 24 cores
RAM: 128GB
Disk-1: 800GB SAS (for OS, logs, WAL archive directory)
Disk-2: 330GB SSD (for PostgreSQL data directory, except WAL archive
and except pg_log)

The part of the configuration are as below:
checkpoint_segments = 64
checkpoint_completion_target = 0.9
default_statistics_target = 10
maintenance_work_mem = 1GB
effective_cache_size = 64GB
shared_buffers = 24GB
work_mem = 5MB
wal_buffers = 8MB
wal_keep_segments = 4096
wal_level = hot_standby
max_wal_senders = 10
archive_mode = on
archive_command = 'cp -i %p /home/postgres/archive/master/%f'


The WAL archive is at /home/postgres/archive/master/, right?
This directory consume more than 750GB of Disk-1.
Each segment in the /home/postgres/archive/master/ is 16MB each
There are currently 47443 files in this folder.

I want to limit the total size use by WAL archive to around 200-400 GB.

Do I set the segment too big?
wal_keep_segments = 4096
checkpoint_segments = 64

What value should I set for it?



Regards,
Fattah


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HELP!!! The WAL Archive is taking up all space

2015-12-08 Thread David G. Johnston
On Tue, Dec 8, 2015 at 3:33 AM, FattahRozzaq  wrote:

> Hi all,
>
> Please help...
>
> I have 1 master PostgreSQL and 1 standby PostgreSQL.
> Both servers has the same OS Linux Debian Wheezy, the same hardware.
>
> Both server hardware:
> CPU: 24 cores
> RAM: 128GB
> Disk-1: 800GB SAS (for OS, logs, WAL archive directory)
> Disk-2: 330GB SSD (for PostgreSQL data directory, except WAL archive
> and except pg_log)
>
> The part of the configuration are as below:
> checkpoint_segments = 64
> checkpoint_completion_target = 0.9
> default_statistics_target = 10
> maintenance_work_mem = 1GB
> effective_cache_size = 64GB
> shared_buffers = 24GB
> work_mem = 5MB
> wal_buffers = 8MB
> wal_keep_segments = 4096
> wal_level = hot_standby
> max_wal_senders = 10
> archive_mode = on
> archive_command = 'cp -i %p /home/postgres/archive/master/%f'
>
>
> The WAL archive is at /home/postgres/archive/master/, right?
> This directory consume more than 750GB of Disk-1.
> Each segment in the /home/postgres/archive/master/ is 16MB each
> There are currently 47443 files in this folder.
>
> I want to limit the total size use by WAL archive to around 200-400 GB.
>
> Do I set the segment too big?
> wal_keep_segments = 4096
> checkpoint_segments = 64
>
> What value should I set for it?
>

In which case you need to calculate how long it takes to accumulate that
much archive data and then perform a base backup roughly that often after
which point any WAL older than the point at which you began the backup can
be removed.

You cannot just limit how large the WAL archive is since removing any WAL
file will pretty much make any attempt at restoration fail.​

David J.


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-08 Thread Haribabu Kommi
On Wed, Dec 9, 2015 at 2:36 AM, Amit Kapila  wrote:
>
> Few assorted comments:

Thanks for the review.

> 1.
> + /*
> + * SQL-accessible SRF to return all the settings from the pg_hba.conf
> + * file.
> + */
> + Datum
> + pg_hba_lookup_2args(PG_FUNCTION_ARGS)
> + {
> + return pg_hba_lookup(fcinfo);
> + }
> +
> + /*
> +  * SQL-accessible SRF to return all the settings from the pg_hba.conf
> +  * file.
> +  */
> + Datum
> + pg_hba_lookup_3args(PG_FUNCTION_ARGS)
> + {
> + return pg_hba_lookup(fcinfo);
> + }
>
> I think it is better to have check on number of args in the
> above functions similar to what we have in ginarrayextract_2args.

ginarrayextract_2args is a deprecated function that checks and returns an
error if the user calls it with two arguments.  But for the pg_hba_lookup
function, providing two arguments is a valid scenario. A check could only
verify whether the provided number of arguments is two or not. Is it
really required?

> 2.
> +
> + /*
> + * Reload authentication config files too to refresh
> + * pg_hba_conf view data.
> + */
> + if (!load_hba())
> + {
> + ereport(DEBUG1,
> + (errmsg("Falure in reloading pg_hba.conf, pg_hba_conf view may show stale
> information")));
> + load_hba_failure = true;
> + }
> +
> + load_hba_failure = false;
>
> Won't the above code set load_hba_failure as false even in
> failure case.

Fixed.

> 3.
> + Datum
> + pg_hba_lookup(PG_FUNCTION_ARGS)
> + {
> + char *user;
> + char *database;
> + char *address;
> + char*hostname;
> + bool ssl_inuse = false;
> + struct sockaddr_storage addr;
> + hba_lookup_args_mode args_mode = TWO_ARGS_MODE; /* Minimum number of
> arguments */
> +
> + /*
> + * We must use the Materialize mode to be safe against HBA file reloads
> + * while the cursor is open. It's also more efficient than having to look
> + * up our current position in the parsed list every time.
> + */
> + ReturnSetInfo *rsi = (ReturnSetInfo *)fcinfo->resultinfo;
> +
> + if (!superuser())
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + (errmsg("only superuser can view pg_hba.conf settings";
>
> To be consistent with other similar messages, it is better to
> start this message with "must be superuser ..", refer other
> similar superuser checks in xlogfuncs.c

Updated as "must be superuser to view".

Attached updated patch after taking care of review comments.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-08 Thread Michael Paquier
On Wed, Dec 9, 2015 at 4:47 AM, Robert Haas  wrote:
>
> On Thu, Dec 3, 2015 at 11:28 PM, Michael Paquier
>  wrote:
> > This does not impact the run, but it creates unwelcome warnings in the
> > logs. This is actually caused by the following code in PostgresNode
> > that uses an incorrect check to see if the file has been correctly
> > opened or not:
> > open my $pidfile, $self->data_dir . "/postmaster.pid";
> > if (not defined $pidfile)
> >
> > One way to fix this is to use if(open(...)), a second way I know of is
> > to check if the opened file handle matches tell($pidfile) == -1. The
> > patch attached uses the first method to fix the issue.
>
> My Perl-fu must be getting weak.  What's wrong with the existing code?

This code should have checked for the return result of open instead of
looking at $pidfile. This has been noticed by Noah as well afterwards
and already addressed as 9821492.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-08 Thread Alvaro Herrera
Michael Paquier wrote:

> This code should have checked for the return result of open instead of
> looking at $pidfile. This has been noticed by Noah as well afterwards
> and already addressed as 9821492.

Oops, sorry I didn't credit you in the commit message.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2015-12-08 Thread Robert Haas
On Fri, Dec 4, 2015 at 12:53 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Still, maybe we should try to sneak at least this much into
>> 9.5 RSN, because I have to think this is going to help people with
>> mostly-NULL (or mostly-really-wide) columns.
>
> Please no.  We are trying to get to release, not destabilize things.

Well, OK, but I don't really see how that particular bit is anything
other than a bug fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HELP!!! The WAL Archive is taking up all space

2015-12-08 Thread David G. Johnston
On Tue, Dec 8, 2015 at 4:43 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Dec 8, 2015 at 3:33 AM, FattahRozzaq  wrote:
>
>> Hi all,
>>
>> Please help...
>>
>> I have 1 master PostgreSQL and 1 standby PostgreSQL.
>> Both servers has the same OS Linux Debian Wheezy, the same hardware.
>>
>> Both server hardware:
>> CPU: 24 cores
>> RAM: 128GB
>> Disk-1: 800GB SAS (for OS, logs, WAL archive directory)
>> Disk-2: 330GB SSD (for PostgreSQL data directory, except WAL archive
>> and except pg_log)
>>
>> The part of the configuration are as below:
>> checkpoint_segments = 64
>> checkpoint_completion_target = 0.9
>> default_statistics_target = 10
>> maintenance_work_mem = 1GB
>> effective_cache_size = 64GB
>> shared_buffers = 24GB
>> work_mem = 5MB
>> wal_buffers = 8MB
>> wal_keep_segments = 4096
>> wal_level = hot_standby
>> max_wal_senders = 10
>> archive_mode = on
>> archive_command = 'cp -i %p /home/postgres/archive/master/%f'
>>
>>
>> The WAL archive is at /home/postgres/archive/master/, right?
>> This directory consume more than 750GB of Disk-1.
>> Each segment in the /home/postgres/archive/master/ is 16MB each
>> There are currently 47443 files in this folder.
>>
>> I want to limit the total size use by WAL archive to around 200-400 GB.
>>
>> Do I set the segment too big?
>> wal_keep_segments = 4096
>> checkpoint_segments = 64
>>
>> What value should I set for it?
>>
>
> In which case you need to calculate how long it takes to accumulate that
> much archive data and then perform a base backup roughly that often after
> which point any WAL older that the point at which you began the backup can
> be removed.
>
> You cannot just limit how large the WAL archive is since removing any WAL
> file will pretty much make any attempt at restoration fail.​
>
> David J.
>
>
​While valid I missed that you have a streaming replica on the other end
that should be removing files as they are loaded pending the retention
setting...see Michael's response for better information.

David J.​


Re: [HACKERS] [PATCH] Equivalence Class Filters

2015-12-08 Thread Jim Nasby

On 12/8/15 3:52 AM, Jeremy Harris wrote:

On 07/12/15 16:44, Simon Riggs wrote:

There are many optimizations we might adopt, yet planning time is a factor.
It seems simple enough to ignore more complex optimizations if we have
already achieved a threshold cost (say 10). Such a test would add nearly
zero time for the common case. We can apply the optimizations in some kind
of ordering depending upon the cost, so we are careful to balance the
cost/benefit of trying certain optimizations.


Given parallelism, why not continue planning after initiating a
cancellable execution, giving a better plan to be used if the
execution runs for long enough?


Because that would take significantly more work than what Simon is 
proposing.


That said, I think the ability to restart with a different plan is 
something we might need, for cases when we discover the plan estimates 
were way off. If that ever gets built it might be useful for what you 
propose as well.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-08 Thread Amit Langote
On 2015/12/09 5:50, Robert Haas wrote:
> On Thu, Dec 3, 2015 at 3:52 AM, amul sul  wrote:
>> Can we pass initially_valid instead of !skip_validation to StoreRelCheck() 
>> in AddRelationNewConstraints(), as shown below?
> 
> The comments say this:
> 
>  * If skip_validation is true then we skip checking that the existing rows
>  * in the table satisfy the constraint, and just install the catalog entries
>  * for the constraint.  A new FK constraint is marked as valid iff
>  * initially_valid is true.  (Usually skip_validation and initially_valid
>  * are inverses, but we can set both true if the table is known empty.)
> 
> These comments are accurate.  If you create a FK constraint as not
> valid, the system decides that it's really valid after all, but if you
> add a CHECK constraint as not valid, the system just believes you:
> 
> rhaas=# create table foo (a serial primary key);
> CREATE TABLE
> rhaas=# create table bar (a int, foreign key (a) references foo (a)
> not valid, check (a != 0) not valid);
> CREATE TABLE
> rhaas=# \d bar
>   Table "public.bar"
>  Column |  Type   | Modifiers
> +-+---
>  a  | integer |
> Check constraints:
> "bar_a_check" CHECK (a <> 0) NOT VALID
> Foreign-key constraints:
> "bar_a_fkey" FOREIGN KEY (a) REFERENCES foo(a)

Didn't realize that marking constraints NOT VALID during table creation
was syntactically possible. Now I see that the same grammar rule for table
constraints is used for both create table and alter table add constraint.

> I suspect this is an oversight.  We allowed FOREIGN KEY constraints to
> be not valid in 722bf7017bbe796decc79c1fde03e7a83dae9ada by Simon
> Riggs, but didn't add allow it for CHECK constraints until Alvaro's
> commit of 897795240cfaaed724af2f53ed2c50c9862f951f a few months later.
> My guess is that there's no reason for these not to behave in the same
> way, but they don't.  Amul's proposed one-liner might be one part of
> actually fixing that, but it wouldn't be enough by itself: you'd also
> need to teach transformCreateStmt to set the initially_valid flag to
> true, maybe by adding a new function transformCheckConstraints or so.

So, any NOT VALID specification for a FK constraint is effectively
overridden in transformFKConstraints() at table creation time but the same
doesn't happen for CHECK constraints. I agree that that could be fixed,
then as you say, Amul's one-liner would make sense.
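
To make that concrete, a rough sketch of what such a
transformCheckConstraints() might look like is below; it simply marks the
new table's CHECK constraints as initially valid, mirroring what
transformFKConstraints() already does for FK constraints. The function body
and its call site are assumptions for illustration, not a tested patch.

/*
 * transformCheckConstraints (sketch)
 *		At CREATE TABLE time the table is necessarily empty, so a CHECK
 *		constraint can be marked valid even if NOT VALID was written.
 */
static void
transformCheckConstraints(CreateStmtContext *cxt)
{
	ListCell   *ckclist;

	foreach(ckclist, cxt->ckconstraints)
	{
		Constraint *constraint = (Constraint *) lfirst(ckclist);

		constraint->initially_valid = true;
	}
}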

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: [HACKERS] Another little thing about psql wrapped expanded output

2015-12-08 Thread Jim Nasby

On 12/8/15 1:36 PM, Robert Haas wrote:

Your point is also valid, so I don't mean to detract from that.  But
the status quo is definitely annoying.


+1, and I even use -S.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Equivalence Class Filters

2015-12-08 Thread Robert Haas
On Mon, Dec 7, 2015 at 8:26 PM, David Rowley
 wrote:
> The biggest frustration for me is that the way Tom always seems to argue his
> point it's as if planning time is roughly the same or more expensive than
> execution time, and likely that's true in many cases, but I would imagine in
> more normal cases that execution time is longer, although I've never had the
> guts to stand up and argued this as I don't have any stats to back me up.

I think this is missing the point.  It is possible to have a query
planner optimization that is so expensive that it loses even when it
improves the plan, but I don't think this optimization falls into that
category.  This particular change would not have been requested as
many times as it has been if people didn't keep rediscovering that, on
a certain class of queries, it works *really* well.  The problem that
I understand Tom to be concerned about is the queries where the
optimization consumes additional planning time without delivering a
better plan - and those where, without careful thought, it might even
deliver a worse plan.

Now, I'm not sure Tom is right about that, but he might be.  The class
of queries we're talking about here are the ones where somebody
constrains a column that is part of an equivalence class with multiple
members.  Perhaps the best example is a large join, let's say 10
tables, where the join column is the same for all tables; thus, we
have an equivalence class with 10 members.  Suppose further we have an
inequality condition applied to one member of that equivalence class.

Currently, we'll apply that inequality to the table that the user
actually mentioned and ignore all the others; in theory, it could be
right to enforce that inequality condition on any nonempty subset of
the 10 tables we've got.  It might be right to pick exactly one of
those tables and enforce the inequality there, or it might be right to
enforce it on some of them, or it might be right to enforce it on all
of them.  The reason why the answer isn't automatically "all of them"
is because, first of all, it's possible that enforcing the condition
at a particular table costs more to filter out the rows that we save
in execution time at higher levels of the plan tree.  For example,
consider A JOIN B ON A.X = B.X WHERE A.X > 100.  It might be that
the range of A.X is [0,101] but the range of B.X is
[100,200]; so enforcing the inequality against A is very
selective but enforcing it against B filters out basically nothing.
Furthermore, there are some cases involving parameterized paths where
enforcing the inequality multiple times is definitely bad: for
example, if we've got a nested loop where the outer side is a seq scan
that enforces the condition and the inner side is an index probe, it
is just a waste to retest it on the inner side.  We already know that
the outer row passes the inequality, so the inner row will necessarily
pass also.  This doesn't apply to merge or hash joins, and it also
doesn't apply to all nested loops: scans that aren't paramaterized by
the equivalence-class column can still benefit from separate
enforcement of the inequality.

Considering the factors mentioned in the previous paragraph, it seems
likely to me that a naive patch that propagates the inequalities to
every relation where they could hypothetically be enforced will be a
significant loss in some cases even just in execution cost, ignoring
completely the effects on planning time.  And, on the flip side,
figuring out what non-empty subset of relations forms the optimal set
upon which to enforce the inequality seems like a complicated problem.
  A first cut might be to enforce the inequality against the relation
where it's believed to be most selective, and to also enforce it in
paths for other relations that aren't parameterized by the
equivalence-class column mentioned in the inequality provided that the
selectivity is thought to be above some threshold ... but I'm not sure
this is very principled, and it's certainly not obvious what arbitrary
threshold would be appropriate.  The threshold where multiple
enforcement of the qual starts to pay off also depends on the cost of
the qual: an expensive qual has to filter out more rows than a cheap
qual to be worthwhile.  I'm not sure how to estimate all this, but I
don't have trouble believing that if not done carefully it could
either (a) cost a lot of planning time or (b) generate lousy plans.

Now, all that having been said, I think this is a pretty important
optimization.  Lots of people have asked for it, and I think it would
be worth expending some brainpower to try to figure out a way to be
smarter than we are now, which is, in a nutshell, as dumb as possible.
You're completely right that, on an OLAP query that's going to run for
a minute, a few extra milliseconds of planning time could be totally
OK even if they don't yield any benefit on most queries. Maybe we can
get the cost of this feature down 

Re: [HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-08 Thread Michael Paquier
On Wed, Dec 9, 2015 at 11:23 AM, Robert Haas wrote:
> Boy, that's awful.  Whoever designed that bit of wonderfulness should
> have their language design license revoked.

If one wanted to base the check on the file handle itself, he/she could
check for tell($pidfile) == -1, which corresponds to an undetermined
position; that's ugly, but we would need to live with it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using quicksort for every external sort run

2015-12-08 Thread Greg Stark
On Wed, Dec 9, 2015 at 12:02 AM, Jeff Janes  wrote:
>
>
> Then in the next (final) merge, it is has to read in this huge
> fragmented tape run emulation, generating a lot of random IO to read
> it.

This seems fairly plausible. Logtape.c is basically implementing a
small filesystem and doesn't really make any attempt to avoid
fragmentation. The reason it does this is so that we can reuse blocks
and avoid needing to store 2x disk space for the temporary space. I
wonder if we're no longer concerned about keeping the number of tapes
down if it makes sense to give up on this goal too and just write out
separate files for each tape letting the filesystem avoid
fragmentation. I suspect it would also be better for filesystems like
ZFS and SSDs where rewriting blocks can be expensive.


> With the patched code, the average length of reads on files in
> pgsql_tmp between lseeks or changing to a different file descriptor is
> 8, while in the unpatched code it is 14.

I don't think Peter did anything to the scheduling of the merges so I
don't see how this would be different. It might just have hit a
preexisting case by changing the number and size of tapes.

I also don't think the tapes really ought to be so unbalanced. I've
noticed some odd things myself -- like what does a 1-way merge mean
here?

LOG:  finished writing run 56 to tape 2 (9101313 blocks): CPU
0.19s/10.97u sec elapsed 16.68 sec
LOG:  finished writing run 57 to tape 3 (9084929 blocks): CPU
0.19s/11.14u sec elapsed 19.08 sec
LOG:  finished writing run 58 to tape 4 (9101313 blocks): CPU
0.20s/11.31u sec elapsed 19.26 sec
LOG:  performsort starting: CPU 0.20s/11.48u sec elapsed 19.44 sec
LOG:  finished writing run 59 to tape 5 (9109505 blocks): CPU
0.20s/11.49u sec elapsed 19.44 sec
LOG:  finished writing final run 60 to tape 6 (8151041 blocks): CPU
0.20s/11.55u sec elapsed 19.50 sec
LOG:  finished 1-way merge step (1810433 blocks): CPU 0.20s/11.58u sec
elapsed 19.54 sec   <-=
LOG:  finished 10-way merge step (19742721 blocks): CPU 0.20s/12.23u
sec elapsed 20.19 sec
LOG:  finished 13-way merge step (2389 blocks): CPU 0.20s/13.15u
sec elapsed 21.11 sec
LOG:  finished 13-way merge step (4777 blocks): CPU 0.22s/14.07u
sec elapsed 23.13 sec
LOG:  finished 14-way merge step (4777 blocks): CPU 0.24s/15.65u
sec elapsed 24.74 sec
LOG:  performsort done (except 14-way final merge): CPU 0.24s/15.66u
sec elapsed 24.75 sec

I wonder if something's wrong with the merge scheduling.

Fwiw attached are two patches for perusal. One is a trivial patch to
add the size of the tape to trace_sort output. I guess I'll just apply
that without discussion. The other replaces the selection sort with an
open coded sort network for cases up to 8 elements. (Only in the perl
generated qsort for the moment). I don't have the bandwidth to
benchmark this for the moment but if anyone's interested in trying I
suspect it'll make a small but noticeable difference. I'm guessing
2-5%.

-- 
greg
 src/backend/utils/sort/logtape.c   | 9 +
 src/backend/utils/sort/tuplesort.c | 7 +--
 src/include/utils/logtape.h| 1 +
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index 252ba22..20251ab 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -1014,3 +1014,12 @@ LogicalTapeSetBlocks(LogicalTapeSet *lts)
 {
 	return lts->nFileBlocks;
 }
+
+/*
+ * Obtain total disk space currently used by a LogicalTapeSet, in blocks.
+ */
+long
+LogicalTapeBlocks(LogicalTapeSet *lts, int tapenum)
+{
+	return lts->tapes[tapenum].numFullBlocks * BLCKSZ + 1;
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 6572af8..5d996f2 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2424,7 +2424,9 @@ mergeonerun(Tuplesortstate *state)
 
 #ifdef TRACE_SORT
 	if (trace_sort)
-		elog(LOG, "finished %d-way merge step: %s", state->activeTapes,
+		elog(LOG, "finished %d-way merge step (%ld blocks): %s", 
+			 state->activeTapes,
+			 LogicalTapeBlocks(state->tapeset, destTape),
 			 pg_rusage_show(>ru_start));
 #endif
 }
@@ -2654,9 +2656,10 @@ dumptuples(Tuplesortstate *state, bool alltuples)
 
 #ifdef TRACE_SORT
 			if (trace_sort)
-elog(LOG, "finished writing%s run %d to tape %d: %s",
+elog(LOG, "finished writing%s run %d to tape %d (%ld blocks): %s",
 	 (state->memtupcount == 0) ? " final" : "",
 	 state->currentRun, state->destTape,
+	 LogicalTapeBlocks(state->tapeset, state->destTape),
	 pg_rusage_show(&state->ru_start));
 #endif
 
diff --git a/src/include/utils/logtape.h b/src/include/utils/logtape.h
index 47426ce..0b7aefe 100644
--- a/src/include/utils/logtape.h
+++ b/src/include/utils/logtape.h
@@ -40,5 +40,6 @@ extern bool LogicalTapeSeek(LogicalTapeSet *lts, int tapenum,
 extern void LogicalTapeTell(LogicalTapeSet *lts, int 

Re: [HACKERS] Using quicksort for every external sort run

2015-12-08 Thread Greg Stark
On 9 Dec 2015 02:44, "Peter Geoghegan"  wrote:
>
> I guess you mean insertion sort. What's the theoretical justification
> for the change?

Er, right. Insertion sort.

The sort networks I used here are optimal both in number of comparisons and
depth. I suspect modern CPUs actually manage to do some of the comparisons
in parallel even.

I was experimenting with using SIMD registers and did a non SIMD
implementation like this first and noticed it was doing 15% fewer
comparisons than insertion sort and ran faster. That was for sets of 8, I'm
not sure there's as much saving on smaller sets.
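
For readers unfamiliar with the construct, a sorting network is a fixed,
data-independent sequence of compare-exchange steps, so the comparison
pattern does not depend on how ordered the input already is. A generic C
sketch for 4 elements, which needs 5 comparators at depth 3, looks roughly
like this (an illustration only, not the generated code attached upthread,
which covers sets up to 8 elements and uses the tuplesort comparator rather
than plain ints):

#define CMPSWAP(a, b) \
	do { if ((a) > (b)) { int swap_tmp_ = (a); (a) = (b); (b) = swap_tmp_; } } while (0)

/* Optimal sorting network for 4 elements: 5 comparators, depth 3. */
static void
sort4_network(int v[4])
{
	CMPSWAP(v[0], v[1]);
	CMPSWAP(v[2], v[3]);
	CMPSWAP(v[0], v[2]);
	CMPSWAP(v[1], v[3]);
	CMPSWAP(v[1], v[2]);
}

#undef CMPSWAP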


Re: [HACKERS] [PATCH] Equivalence Class Filters

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 11:12 PM, David Rowley
 wrote:
> My point was more of a response to the general condition that we cannot add
> any planner overhead unless the extra CPU cycles spent are almost certainly
> going improve the plan.

I hope that's an overstatement of the requirement, because otherwise
it boils down to "don't improve the planner".  Most things we do to
try to improve plans are going to generate paths that will only
sometimes be better than the paths we're already generating.  "almost
certainly" is a high bar to clear.

> hmm, well I think it's more than likely that my view on this has been skewed
> by the fact that we push equality quals down with the eclass contains a
> Const without any regard that it could generate a worse plan or add needless
> CPU overhead for checking the quals. If you rewrite your above paragraph
> with equality instead of inequality then it still applies. A JOIN B ON A.X =
> B.X WHERE A.X = 100, will cause B.X = 100 to be pushed into B, but
> what if B.X values are all set to 100? I've not seen anyone complain
> about us doing that.  This is the reason I limited the patch to only deal
> with members of the BTREE op-class, I assumed that if there's some BTREE
> index helping with the equality qual then that same index may well just help
> with a qual using the <, <=, > or >= operator, and also I thought that in
> most cases all these btree inequality operations will have about the same
> CPU cost as equality.
>
> Would you be able to explain the difference between the two? Or is it just a
> question of the likelihood?

Likelihood, I guess.  I mean, typically if you are doing an equijoin
on two or more relations, the join column is unique, so there's only
going to be one row in each of A and B that is equal to a given
constant.  If A.X and/or B.X contain duplicate values, then the output
is going to have a number of rows equal to the product of the number
of duplicates in each, sort of like a Cartesian join.  That scenario
could happen, but it's a pretty strange kind of query which I would be
disinclined to spend a lot of time optimizing.  OTOH, something like
SELECT * FROM orders o, lines l WHERE l.order_id = o.order_id AND
o.order_id > 100 is a pretty normal query, at least IMHO.
Worrying about how that's going to perform with various data
distributions seems like a pretty reasonable thing to care about.

>> Furthermore, there are some cases involving parameterized paths where
>> enforcing the inequality multiple times is definitely bad: for
>> example, if we've got a nested loop where the outer side is a seq scan
>> that enforces the condition and the inner side is an index probe, it
>> is just a waste to retest it on the inner side.  We already know that
>> the outer row passes the inequality, so the inner row will necessarily
>> pass also.  This doesn't apply to merge or hash joins, and it also
>> doesn't apply to all nested loops: scans that aren't paramaterized by
>> the equivalence-class column can still benefit from separate
>> enforcement of the inequality.
>>
> I guess that could be fixed by somehow marking these pushed quals as
> optional and having parameterised scans ignore optional quals.

Yeah.

> I have to admit that I didn't give it much thought as all of the cases that
> I tested turned what used to be nested loop with a parameterised inner index
> scan into a merge join, which was faster in my test case. Of course, that
> does not mean that I claim that this merge join will be faster in all cases
> or that the plan will switch to using a merge join in all cases.

Interesting that it turned into a merge join and that that was faster.

> Again this is one of the reason that I limited it to btree operators only.
> I don't quite know how to estimate this either as the extra rows being
> filtered may mean that a sort no longer spills to disk, or a hash join no
> longer multi-batches. I'd imagine filtering would be extra worthwhile in
> such cases, so I doubt setting the threshold to some constant value is the
> correct way, and most likely considering the path with and without the
> qual(s) would be far too costly.

It might be OK, particularly for OLTP queries, but it's certainly not
going to be the cheapest thing anybody ever did.

> Well I agree with that. I've got no interest in slowing anything down. I've
> been busy working with big databases for quite a while now, so perhaps my
> point of view is coming from that direction still.

Yeah.

> So anyway, it's quite clear that we don't want the patch in its current
> form, and I'm not volunteering any more time at this stage to improve it, so
> for this reason I won't add it the January commit fest.

Too bad, but I understand.  I hope you come back around to working on
this further at some point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list 

Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Pavel Stehule
2015-12-09 6:10 GMT+01:00 Michael Paquier :

> On Wed, Dec 9, 2015 at 2:07 PM, Pavel Stehule 
> wrote:
> > should be noted, recorded somewhere so this introduce possible
> compatibility
> > issue - due default processing .psqlrc.
>
> That's written in the commit log, so I guess that's fine.
>

ook


> --
> Michael
>


Re: [HACKERS] Error with index on unlogged table

2015-12-08 Thread Michael Paquier
On Tue, Dec 8, 2015 at 9:57 PM, Andres Freund  wrote:
> For me the attached, preliminary, patch, fixes the problem in master;
> previous branches ought to look mostly similar, except the flush moved
> to RestoreBackupBlockContents/RestoreBackupBlock.
> Does anybody have a better idea? Suitable for the back-branches?

Not really I am afraid..

> I'm kinda wondering if it wouldn't have been better to go through shared
> buffers in ResetUnloggedRelationsInDbspaceDir() instead of using
> copy_file().

For deployments with large shared_buffers settings, wouldn't that actually
be more costly than the current way of doing it? We would need to go
through all the buffers at least once and look for the INIT_FORKNUM pages
present in order to flush them.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-08 Thread Amit Kapila
On Wed, Dec 9, 2015 at 6:48 AM, Haribabu Kommi 
wrote:
>
> On Wed, Dec 9, 2015 at 2:36 AM, Amit Kapila 
wrote:
> >
> > I think it is better to have check on number of args in the
> > above functions similar to what we have in ginarrayextract_2args.
>
> ginarrayextract_2args is an deprecated function that checks and returns
> error if user is using with two arguments.  But in pg_hba_lookup function,
> providing two argument is a valid scenario. The check can be added only
> to verify whether the provided number of arguments are two or not. Is it
> really required?
>

I think we can live without such a check especially because you
already have required check for function args in pg_hba_lookup
function.

> > 2.
> > +
> > + /*
> > + * Reload authentication config files too to refresh
> > + * pg_hba_conf view data.
> > + */
> > + if (!load_hba())
> > + {
> > + ereport(DEBUG1,
> > + (errmsg("Falure in reloading pg_hba.conf, pg_hba_conf view may show
stale
> > information")));
> > + load_hba_failure = true;
> > + }
> > +
> > + load_hba_failure = false;
> >
> > Won't the above code set load_hba_failure as false even in
> > failure case.
>
> Fixed.
>

Another, bigger issue I see in the above part of the code is that it doesn't
seem safe to call load_hba() at that place in PostgresMain(): currently
load_hba() uses a context created from PostmasterContext to perform the
parsing and some other work, and that PostmasterContext won't be available
at that time, since it is deleted immediately after InitPostgres()
completes.  So either we need to make PostmasterContext not go away after
InitPostgres(), or load_hba() shouldn't use it and should rather use
CurrentMemoryContext similar to ProcessConfigFile, or maybe use
TopMemoryContext instead of PostmasterContext if possible.  I think
this needs some more thought.

Apart from the above, this patch doesn't seem to work on Windows or
for EXEC_BACKEND builds, as we are loading the hba file in a
temporary context (PostmasterContext, refer PerformAuthentication)
which won't be alive for the backend's lifetime.  This works on Linux
because of the fork mechanism, which lets backends inherit the parsed file
(parsed_hba_lines). This is a slightly tricky problem to solve and we
have a couple of options: (a) use TopMemoryContext instead of
PostmasterContext to load hba; (b) use CurrentMemoryContext; (c) pass the
parsed hba file for Windows/EXEC_BACKEND using the save_backend_variables/
restore_backend_variables mechanism; or perhaps you have another idea.
If you don't have a better idea, then you can evaluate the above ideas
and see which one makes more sense.
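
As a rough sketch of option (a), parsing into a context that outlives
PostmasterContext, the shape would be roughly as below. The context name
and exact call site are assumptions for illustration, not the actual patch;
the five-argument AllocSetContextCreate() form is the one in use at the
time.

	MemoryContext hbacxt;
	MemoryContext oldcxt;

	hbacxt = AllocSetContextCreate(TopMemoryContext,
								   "hba parser context",
								   ALLOCSET_DEFAULT_MINSIZE,
								   ALLOCSET_DEFAULT_INITSIZE,
								   ALLOCSET_DEFAULT_MAXSIZE);
	oldcxt = MemoryContextSwitchTo(hbacxt);
	/* tokenize and parse the pg_hba.conf lines here, so that
	 * parsed_hba_lines survives InitPostgres(); the context would need to
	 * be reset or deleted on the next reload to avoid leaking */
	MemoryContextSwitchTo(oldcxt);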




With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Using quicksort for every external sort run

2015-12-08 Thread Peter Geoghegan
On Tue, Dec 8, 2015 at 6:39 PM, Greg Stark  wrote:
> Fwiw attached are two patches for perusal. One is a trivial patch to
> add the size of the tape to trace_sort output. I guess I'll just apply
> that without discussion.

+1

> The other replaces the selection sort with an
> open coded sort network for cases up to 8 elements. (Only in the perl
> generated qsort for the moment). I don't have the bandwidth to
> benchmark this for the moment but if anyone's interested in trying I
> suspect it'll make a small but noticeable difference. I'm guessing
> 2-5%.

I guess you mean insertion sort. What's the theoretical justification
for the change?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using quicksort for every external sort run

2015-12-08 Thread Peter Geoghegan
On Tue, Dec 8, 2015 at 6:44 PM, Peter Geoghegan  wrote:
> On Tue, Dec 8, 2015 at 6:39 PM, Greg Stark  wrote:
>> Fwiw attached are two patches for perusal. One is a trivial patch to
>> add the size of the tape to trace_sort output. I guess I'll just apply
>> that without discussion.
>
> +1

>> +/*
>> + * Obtain total disk space currently used by a LogicalTapeSet, in blocks.
>> + */
>> +long
>> +LogicalTapeBlocks(LogicalTapeSet *lts, int tapenum)
>> +{
>> + return lts->tapes[tapenum].numFullBlocks * BLCKSZ + 1;
>> +}

Why multiply by BLCKSZ here?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Equivalence Class Filters

2015-12-08 Thread David Rowley
On 9 December 2015 at 15:14, Robert Haas  wrote:

> On Mon, Dec 7, 2015 at 8:26 PM, David Rowley
>  wrote:
> > The biggest frustration for me is that the way Tom always seems to argue
> his
> > point it's as if planning time is roughly the same or more expensive than
> > execution time, and likely that's true in many cases, but I would
> imagine in
> > more normal cases that execution time is longer, although I've never had
> the
> > guts to stand up and argued this as I don't have any stats to back me up.
>
> I think this is missing the point.  It is possible to have a query
> planner optimization that is so expensive that it loses even when it
> improves the plan, but I don't think this optimization falls into that
> category.  This particular change would not have been requested as
> many times as it has been if people didn't keep rediscovering that, on
> a certain class of queries, it works *really* well.  The problem that
> I understand Tom to be concerned about is the queries where the
> optimization consumes additional planning time without delivering a
> better plan - and those where, without careful thought, it might even
> deliver a worse plan.
>
>
My point was more of a response to the general condition that we cannot add
any planner overhead unless the extra CPU cycles spent are almost certainly
going improve the plan.


> Now, I'm not sure Tom is right about that, but he might be.  The class
> of queries we're talking about here are the ones where somebody
> constrains a column that is part of an equivalence class with multiple
> members.  Perhaps the best example is a large join, let's say 10
> tables, where the join column is the same for all tables; thus, we
> have an equivalence class with 10 members.  Suppose further we have an
> inequality condition applied to one member of that equivalence class.
>
> Currently, we'll apply that inequality to the table that the user
> actually mentioned and ignore all the others; in theory, it could be
> right to enforce that inequality condition on any nonempty subset of
> the 10 tables we've got.  It might be right to pick exactly one of
> those tables and enforce the inequality there, or it might be right to
> enforce it on some of them, or it might be right to enforce it on all
> of them.  The reason why the answer isn't automatically "all of them"
> is because, first of all, it's possible that enforcing the condition
> at a particular table costs more to filter out the rows that we save
> in execution time at higher levels of the plan tree.  For example,
> consider A JOIN B ON A.X = B.X WHERE A.X > 100.  It might be that
> the range of A.X is [0,101] but the range of B.X is
> [100,200]; so enforcing the inequality against A is very
> selective but enforcing it against B filters out basically nothing.
>

hmm, well I think it's more than likely that my view on this has been
skewed by the fact that we push equality quals down when the eclass
contains a Const, without any regard for whether that could generate a worse
plan or add needless CPU overhead for checking the quals. If you rewrite your
paragraph above with equality instead of inequality, it still applies: A JOIN
B ON A.X = B.X WHERE A.X = 100 will cause B.X = 100 to be pushed
into B, but what if all of B.X's values happen to be 100? I've not seen anyone
complain about us doing that.  This is the reason I limited the patch to
dealing only with members of the BTREE op-class: I assumed that if there's
some BTREE index helping with the equality qual then that same index may
well also help with a qual using the <, <=, > or >= operator, and I also
thought that in most cases these btree inequality operators will have
about the same CPU cost as equality.

Would you be able to explain the difference between the two? Or is it just
a question of the likelihood?



> Furthermore, there are some cases involving parameterized paths where
> enforcing the inequality multiple times is definitely bad: for
> example, if we've got a nested loop where the outer side is a seq scan
> that enforces the condition and the inner side is an index probe, it
> is just a waste to retest it on the inner side.  We already know that
> the outer row passes the inequality, so the inner row will necessarily
> pass also.  This doesn't apply to merge or hash joins, and it also
> doesn't apply to all nested loops: scans that aren't parameterized by
> the equivalence-class column can still benefit from separate
> enforcement of the inequality.
>
>
I guess that could be fixed by somehow marking these pushed quals as
optional and having parameterised scans ignore optional quals.
I have to admit that I didn't give it much thought, as all of the cases that
I tested turned what used to be a nested loop with a parameterised inner
index scan into a merge join, which was faster in my test case. Of course,
that does not mean that I claim that this merge join will be faster 

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 10:00 PM, Etsuro Fujita
 wrote:
>> I think the actual regression test outputs are fine, and that your
>> desire to suppress part of the plan tree from showing up in the
>> EXPLAIN output is misguided.  I like it just the way it is.  To
>> prevent user confusion, I think that when we add support to
>> postgres_fdw for this we might also want to add some documentation
>> explaining how to interpret this EXPLAIN output, but I don't think
>> there's any problem with the output itself.
>
> I'm not sure that that's a good idea.  One reason is that I think it
> would be more confusing to users when more than two foreign tables are
> involved in a foreign join, as shown in the following example.  Note that the
> outer plans will be shown recursively.  Another reason is that there is no
> consistency between the costs of the outer plans and that of the main plan.

I still don't really see a problem here, but, regardless, the solution
can't be to hide nodes that are in fact present from the user.  We can
talk about making further changes here, but hiding the nodes
altogether is categorically out in my mind.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-08 Thread Kouhei Kaigai
> On Tue, Dec 8, 2015 at 10:00 PM, Etsuro Fujita
>  wrote:
> >> I think the actual regression test outputs are fine, and that your
> >> desire to suppress part of the plan tree from showing up in the
> >> EXPLAIN output is misguided.  I like it just the way it is.  To
> >> prevent user confusion, I think that when we add support to
> >> postgres_fdw for this we might also want to add some documentation
> >> explaining how to interpret this EXPLAIN output, but I don't think
> >> there's any problem with the output itself.
> >
> > I'm not sure that that's a good idea.  One reason is that I think it
> > would be more confusing to users when more than two foreign tables are
> > involved in a foreign join, as shown in the following example.  Note that the
> > outer plans will be shown recursively.  Another reason is that there is no
> > consistency between the costs of the outer plans and that of the main plan.
> 
> I still don't really see a problem here, but, regardless, the solution
> can't be to hide nodes that are in fact present from the user.  We can
> talk about making further changes here, but hiding the nodes
> altogether is categorically out in my mind.
>
Fujita-san,

If you really want to hide the alternative sub-plan, you can move the
outer planstate onto somewhere private field on BeginForeignScan,
then kick ExecProcNode() at the ForeignRecheck callback by itself.
Explain walks down the sub-plan if outerPlanState(planstate) is
valid. So, as long as your extension keeps the planstate privately,
it is not visible from the EXPLAIN.

Of course, I don't recommend it.
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Pavel Stehule
2015-12-09 1:27 GMT+01:00 Michael Paquier :

> On Wed, Dec 9, 2015 at 5:08 AM, Pavel Stehule 
> wrote:
> > 2015-12-08 20:09 GMT+01:00 Robert Haas :
> >> On Sun, Dec 6, 2015 at 9:27 AM, Michael Paquier
> >>  wrote:
> >> > On Sun, Dec 6, 2015 at 10:56 PM, Michael Paquier
> >> >  wrote:
> >> >> Thanks, I looked at that again and problem is fixed as attached.
> >> >
> >> > Er, not exactly... poll_query_until in PostgresNode.pm is using psql
> >> > -c without the --no-psqlrc switch so this patch causes the regression
> >> > tests of pg_rewind to fail. Fixed as attached.
> >>
> >> Committed.  Go, team.
> >>
> >> This has been a long time coming.
> >
> >
> > great, thank you very much you and all
>
> Thanks! This is now marked as committed in the CF app...
>

It should be noted and recorded somewhere that this introduces a possible
compatibility issue, due to .psqlrc now being processed by default.

Regards

Pavel


> --
> Michael
>


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-08 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 8 Dec 2015 20:50:39 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-08 Thread Amit Langote
On 2015/12/09 11:19, Amit Langote wrote:
> On 2015/12/09 5:50, Robert Haas wrote:
>> I suspect this is an oversight.  We allowed FOREIGN KEY constraints to
>> be not valid in 722bf7017bbe796decc79c1fde03e7a83dae9ada by Simon
>> Riggs, but didn't add allow it for CHECK constraints until Alvaro's
>> commit of 897795240cfaaed724af2f53ed2c50c9862f951f a few months later.
>> My guess is that there's no reason for these not to behave in the same
>> way, but they don't.  Amul's proposed one-liner might be one part of
>> actually fixing that, but it wouldn't be enough by itself: you'd also
>> need to teach transformCreateStmt to set the initially_valid flag to
>> true, maybe by adding a new function transformCheckConstraints or so.
> 
> So, any NOT VALID specification for a FK constraint is effectively
> overridden in transformFKConstraints() at table creation time but the same
> doesn't happen for CHECK constraints. I agree that that could be fixed,
> then as you say, Amul's one-liner would make sense.

So, how about attached?

I think it may be enough to flip initially_valid to true in
transformTableConstraint() when in a CREATE TABLE context.
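
A separate transformCheckConstraints() helper along the lines Robert
suggested could look roughly like this (a hypothetical sketch only, not what
the attached patch does; the patch sets the flag inline in
transformTableConstraint()):

static void
transformCheckConstraints(CreateStmtContext *cxt)
{
	ListCell   *ckclist;

	if (cxt->ckconstraints == NIL)
		return;

	/*
	 * When creating a new table (not ALTER TABLE), the table starts out
	 * empty, so every CHECK constraint can be marked valid up front,
	 * regardless of whether NOT VALID was written.
	 */
	if (!cxt->isalter)
	{
		foreach(ckclist, cxt->ckconstraints)
		{
			Constraint *constraint = (Constraint *) lfirst(ckclist);

			constraint->initially_valid = true;
		}
	}
}

Either way, the net effect should match the inline change in the attached
patch.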

Regarding Amul's proposed change, there is one minor inconsistency.
StoreRelCheck() is called in two places: in AddRelationNewConstraints(),
where we can safely switch from passing the value of !skip_validation to
passing initially_valid, and in StoreConstraints(), where we cannot, because
a CookedConstraint is used there and it doesn't have an initially_valid
field.  Nevertheless, the attached patch includes the former change.

Thoughts?

Thanks,
Amit
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7d7d062..04c4f8f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -2349,7 +2349,7 @@ AddRelationNewConstraints(Relation rel,
 		 * OK, store it.
 		 */
 		constrOid =
-			StoreRelCheck(rel, ccname, expr, !cdef->skip_validation, is_local,
+			StoreRelCheck(rel, ccname, expr, cdef->initially_valid, is_local,
 		  is_local ? 0 : 1, cdef->is_no_inherit, is_internal);
 
 		numchecks++;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 344a40c..ce45804 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -687,6 +687,10 @@ transformTableConstraint(CreateStmtContext *cxt, Constraint *constraint)
 			break;
 
 		case CONSTR_CHECK:
+			/* Is this better done in a transformCheckConstraint? */
+			if (!cxt->isalter)
+constraint->initially_valid = true;
+
 			cxt->ckconstraints = lappend(cxt->ckconstraints, constraint);
 			break;
 
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 61669b6..c91342f 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -187,7 +187,7 @@ Table "public.constraint_rename_test"
  b  | integer | 
  c  | integer | 
 Check constraints:
-"con1" CHECK (a > 0)
+"con1" CHECK (a > 0) NOT VALID
 
 CREATE TABLE constraint_rename_test2 (a int CONSTRAINT con1 CHECK (a > 0), d int) INHERITS (constraint_rename_test);
 NOTICE:  merging column "a" with inherited definition
@@ -217,7 +217,7 @@ Table "public.constraint_rename_test"
  b  | integer | 
  c  | integer | 
 Check constraints:
-"con1foo" CHECK (a > 0)
+"con1foo" CHECK (a > 0) NOT VALID
 Number of child tables: 1 (Use \d+ to list them.)
 
 \d constraint_rename_test2
@@ -243,7 +243,7 @@ Table "public.constraint_rename_test"
  b  | integer | 
  c  | integer | 
 Check constraints:
-"con1foo" CHECK (a > 0)
+"con1foo" CHECK (a > 0) NOT VALID
 "con2bar" CHECK (b > 0) NO INHERIT
 Number of child tables: 1 (Use \d+ to list them.)
 
@@ -271,7 +271,7 @@ Table "public.constraint_rename_test"
 Indexes:
 "con3foo" PRIMARY KEY, btree (a)
 Check constraints:
-"con1foo" CHECK (a > 0)
+"con1foo" CHECK (a > 0) NOT VALID
 "con2bar" CHECK (b > 0) NO INHERIT
 Number of child tables: 1 (Use \d+ to list them.)
 
@@ -1820,7 +1820,7 @@ CREATE TABLE test_inh_check_child() INHERITS(test_inh_check);
  a  | double precision | 
  b  | double precision | 
 Check constraints:
-"test_inh_check_a_check" CHECK (a > 10.2::double precision)
+"test_inh_check_a_check" CHECK (a > 10.2::double precision) NOT VALID
 Number of child tables: 1 (Use \d+ to list them.)
 
 \d test_inh_check_child
@@ -1851,7 +1851,7 @@ ALTER TABLE test_inh_check ALTER COLUMN a TYPE numeric;
  a  | numeric  | 
  b  | double precision | 
 Check constraints:
-"test_inh_check_a_check" CHECK (a::double precision > 10.2::double precision)
+"test_inh_check_a_check" CHECK (a::double precision > 10.2::double precision) NOT VALID
 Number of child tables: 1 (Use \d+ to list them.)
 
 \d test_inh_check_child
@@ -1861,7 +1861,7 @@ Number of child tables: 1 (Use \d+ to list them.)
  a  | numeric  | 
  b  | double precision | 
 Check constraints:
-

Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Sandeep Thakkar
Hi Tom

With --build=powerpc64le-unknown-linux-gnu in the config_opts section
of build-farm.conf,
the build and the regression tests were successful.

For now, until a decision is made on this, I have enabled only 9.4+ runs on
ppc64le. The results from this buildfarm member, 'clam', are now being
reported.

On Wed, Dec 9, 2015 at 12:05 AM, Tom Lane  wrote:

> Robert Haas  writes:
> > I don't really want to get into an argument about this, but is the
> > reason we haven't updated config.guess and config.sub in the past that
> > it presents an actual stability risk, or just that nobody's asked
> > before?  Because the first one is a good reason not to do it now, but
> > the second one isn't.
>
> Well, I see at least one case in the git history where we explicitly
> declined to update config.guess/config.sub:
>
> Author: Tom Lane 
> Branch: master Release: REL9_3_BR [5c7603c31] 2013-06-04 15:42:02 -0400
> Branch: REL9_2_STABLE Release: REL9_2_5 [612ecf311] 2013-06-04 15:42:21
> -0400
>
> Add ARM64 (aarch64) support to s_lock.h.
>
> Use the same gcc atomic functions as we do on newer ARM chips.
> (Basically this is a copy and paste of the __arm__ code block,
> but omitting the SWPB option since that definitely won't work.)
>
> Back-patch to 9.2.  The patch would work further back, but we'd also
> need to update config.guess/config.sub in older branches to make them
> build out-of-the-box, and there hasn't been demand for it.
>
> Mark Salter
>
>
> More generally, I think "does updating config.guess, in itself, pose
> a stability risk" is a false statement of the issue.  The real issue is
> do we want to start supporting a new platform in 9.1-9.3; that is, IMO
> if we accept this request then we are buying into doing *whatever is
> needed* to support ppc64le on those branches.  Maybe that will stop with
> config.guess/config.sub, or maybe it won't.
>
> Setting this precedent will also make it quite hard to reject future
> requests to back-patch support for other new platforms.
>
> I'm not planning to go to war about this issue either.  But I do think
> there's a slippery-slope hazard here, and we should be prepared for
> the logical consequences of accepting such a request.  Or if we're
> going to have a policy allowing this request but not every such request,
> somebody had better enunciate what that policy is.
>
> regards, tom lane
>
> (BTW, so far as direct stability hazards go, I would guess that the
> key question is how much version skew can be tolerated between autoconf
> and config.guess/config.sub. I have no idea about that; Peter E. might.
> But I do note that pre-9.4 branches use an older autoconf version.)
>



-- 
Sandeep Thakkar


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-08 Thread Etsuro Fujita

On 2015/12/09 1:13, Robert Haas wrote:

On Tue, Dec 8, 2015 at 5:49 AM, Etsuro Fujita
 wrote:

I'd like to discuss the next
thing about his patch.  As I mentioned in [1], the following change in
the patch will break the EXPLAIN output.

@@ -205,6 +218,11 @@ ExecInitForeignScan(ForeignScan *node, EState
*estate, int eflags)
 scanstate->fdwroutine = fdwroutine;
 scanstate->fdw_state = NULL;

+   /* Initialize any outer plan. */
+   if (outerPlanState(scanstate))
+   outerPlanState(scanstate) =
+   ExecInitNode(outerPlan(node), estate, eflags);
+

As pointed out by Horiguchi-san, that's not correct, though; we should
initialize the outer plan if outerPlan(node) != NULL, not
outerPlanState(scanstate) != NULL.  Attached is an updated version of
his patch.
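
Spelled out, the corrected hunk amounts to the following (a sketch of the
fix just described, modulo the exact comment wording in the attached patch):

	/*
	 * Initialize the outer plan, if the FDW supplied one (e.g. a local
	 * join plan used to re-evaluate the join for EvalPlanQual rechecks).
	 */
	if (outerPlan(node))
		outerPlanState(scanstate) =
			ExecInitNode(outerPlan(node), estate, eflags);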



I'm also attaching an updated version of the postgres_fdw
join pushdown patch.



Is that based on Ashutosh's version of the patch, or are the two of
you developing independent of each other?  We should avoid dueling
patches if possible.


That's not based on his version.  I'll add the changes I've made to his 
patch.  IIUC, his version is an updated version of Hanada-san's original 
patches that I had modified, so I guess I could do that easily. 
(I've added a helper function for creating a local join execution plan 
for a given foreign join, but that was rushed work, so I'll rewrite it.)



You can find the breaking examples by doing the
regression tests in the postgres_fdw patch.  Please apply the patches in
the following order:

epq-recheck-v6-efujita (attached)
usermapping_matching.patch in [2]
add_GetUserMappingById.patch in [2]
foreign_join_v16_efujita2.patch (attached)

As I proposed upthread, I think we could fix that by handling the outer
plan as in the patch [3]; a) the core initializes the outer plan and
stores it into somewhere in the ForeignScanState node, not the lefttree
of the ForeignScanState node, during ExecInitForeignScan, and b) when
the RecheckForeignScan routine gets called, the FDW extracts the plan
from the given ForeignScanState node and executes it.  What do you think
about that?



I think the actual regression test outputs are fine, and that your
desire to suppress part of the plan tree from showing up in the
EXPLAIN output is misguided.  I like it just the way it is.  To
prevent user confusion, I think that when we add support to
postgres_fdw for this we might also want to add some documentation
explaining how to interpret this EXPLAIN output, but I don't think
there's any problem with the output itself.


I'm not sure that that's a good idea.  One reason is that I think it would 
be more confusing to users when more than two foreign tables are involved 
in a foreign join, as shown in the following example.  Note that the outer 
plans will be shown recursively.  Another reason is that there is no 
consistency between the costs of the outer plans and that of the main plan.


postgres=# explain verbose select * from foo, bar, baz where foo.a = 
bar.a and bar.a = baz.a for update;


                                  QUERY PLAN
--------------------------------------------------------------------------------
 LockRows  (cost=100.00..100.45 rows=15 width=96)
   Output: foo.a, bar.a, baz.a, foo.*, bar.*, baz.*
   ->  Foreign Scan  (cost=100.00..100.30 rows=15 width=96)
 Output: foo.a, bar.a, baz.a, foo.*, bar.*, baz.*
 Relations: ((public.foo) INNER JOIN (public.bar)) INNER JOIN 
(public.baz)
 Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1, r.a2 FROM 
(SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a9, ROW(l.a9) FROM (SELECT 
a a9 FROM p
ublic.foo FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT r.a9, ROW(r.a9) 
FROM (SELECT a a9 FROM public.bar FOR UPDATE) r) r (a1, a2) ON ((l.a1 = 
r.a1))) l
 (a1, a2, a3, a4) INNER JOIN (SELECT r.a9, ROW(r.a9) FROM (SELECT a a9 
FROM public.baz FOR UPDATE) r) r (a1, a2) ON ((l.a1 = r.a1))

 ->  Hash Join  (cost=272.13..272.69 rows=15 width=96)
   Output: foo.a, foo.*, bar.a, bar.*, baz.a, baz.*
   Hash Cond: (foo.a = baz.a)
   ->  Foreign Scan  (cost=100.00..100.04 rows=2 width=64)
 Output: foo.a, foo.*, bar.a, bar.*
 Relations: (public.foo) INNER JOIN (public.bar)
 Remote SQL: SELECT l.a1, l.a2, r.a1, r.a2 FROM 
(SELECT l.a9, ROW(l.a9) FROM (SELECT a a9 FROM public.foo FOR UPDATE) l) 
l (a1, a2)
INNER JOIN (SELECT r.a9, ROW(r.a9) FROM (SELECT a a9 FROM public.bar FOR 
UPDATE) r) r (a1, a2) ON ((l.a1 = 

Re: [HACKERS] Using quicksort for every external sort run

2015-12-08 Thread Peter Geoghegan
On Tue, Dec 8, 2015 at 7:09 PM, Peter Geoghegan  wrote:
> Why multiply by BLCKSZ here?

I ask because LogicalTapeSetBlocks() returns blocks directly, not bytes,
and I'd expect the same here. Also, the callers seem to expect blocks,
not bytes.
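
If it is meant to mirror LogicalTapeSetBlocks(), I would have expected
something more like this (just a sketch; whether the trailing "+ 1" is the
right way to account for the current, partially-filled block is for you to
confirm):

/*
 * Obtain disk space currently used by one tape of a LogicalTapeSet,
 * in blocks.
 */
long
LogicalTapeBlocks(LogicalTapeSet *lts, int tapenum)
{
	return lts->tapes[tapenum].numFullBlocks + 1;
}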



-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-12-08 Thread Michael Paquier
On Wed, Dec 9, 2015 at 2:07 PM, Pavel Stehule  wrote:
> It should be noted and recorded somewhere that this introduces a possible
> compatibility issue, due to .psqlrc now being processed by default.

That's written in the commit log, so I guess that's fine.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On columnar storage (2)

2015-12-08 Thread Jeff Janes
Could we get this rebased past the merge of the parallel execution commits?

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error with index on unlogged table

2015-12-08 Thread Michael Paquier
On Tue, Dec 8, 2015 at 10:09 PM, Andres Freund  wrote:
> On 2015-12-04 21:57:54 +0900, Michael Paquier wrote:
>> On Fri, Dec 4, 2015 at 5:10 PM, Andres Freund  wrote:
>> >> Let's go for XLOG_FPI_FLUSH.
>> >
>> > I think the other way is a bit better, because we can add new flags
>> > without changing the WAL format.
>>
>> Hm. On the contrary, I think that it would make more sense to have a
>> flag as well for FOR_HINT honestly, those are really the same
>> operations, and FOR_HINT is just here for statistic purposes.
>
> I don't think it's all that much the same operation. And WRT statistics
> purpose: Being able to easily differentiate FOR_HINT is important for
> pg_xlogdump --stats, but not at all for XLOG_FPI_FLUSH.

OK. Switched back to using XLOG_FPI_FLUSH.

>> [stuff about unlogged relations in code]
>
> I might be missing something here - but isn't it pretty much guaranteed
> that all these are unlogged relations? Otherwise *buildempty() wouldn't
> have been called, right?

Yep, that's ensured in index.c when calling ambuildempty. Let's flush
it unconditionally then in those code paths.

>>   else if (info == XLOG_BACKUP_END)
>>   {
>> @@ -178,9 +183,6 @@ xlog_identify(uint8 info)
>>   case XLOG_FPI:
>>   id = "FPI";
>>   break;
>> - case XLOG_FPI_FOR_HINT:
>> - id = "FPI_FOR_HINT";
>> - break;
>>   }
>
> Really don't want to do that.

OK, added back.

>> @@ -9391,14 +9394,34 @@ xlog_redo(XLogReaderState *record)
>>* resource manager needs to generate conflicts, it has to 
>> define a
>>* separate WAL record type and redo routine.
>>*
>> -  * XLOG_FPI_FOR_HINT records are generated when a page needs 
>> to be
>> -  * WAL- logged because of a hint bit update. They are only 
>> generated
>> -  * when checksums are enabled. There is no difference in 
>> handling
>> -  * XLOG_FPI and XLOG_FPI_FOR_HINT records, they use a 
>> different info
>> -  * code just to distinguish them for statistics purposes.
>> +  * Records flagged with 'for_hint_bits' are generated when a 
>> page needs
>> +  * to be WAL- logged because of a hint bit update. They are 
>> only
>> +  * generated when checksums are enabled. There is no 
>> difference in
>> +  * handling records when this flag is set, it is used for 
>> statistics
>> +  * purposes.
>> +  *
>> +  * Records flagged with 'is_flush' indicate that the page 
>> immediately
>> +  * needs to be written to disk, not just to shared buffers. 
>> This is
>> +  * important if the on-disk state is to be the authoritative, 
>> not the
>> +  * state in shared buffers. E.g. because on-disk files may 
>> later be
>> +  * copied directly.
>>*/
>>   if (XLogReadBufferForRedo(record, 0, &buffer) != BLK_RESTORED)
>>   elog(ERROR, "unexpected XLogReadBufferForRedo result 
>> when restoring backup block");
>> +
>> + if (xlrec.is_flush)
>> + {
>> + RelFileNode rnode;
>> + ForkNumber  forknum;
>> + BlockNumber blkno;
>> + SMgrRelation srel;
>> +
>> + (void) XLogRecGetBlockTag(record, 0, &rnode, &forknum, &blkno);
>> + srel = smgropen(rnode, InvalidBackendId);
>> + smgrwrite(srel, forknum, blkno, BufferGetPage(buffer), 
>> false);
>> + smgrclose(srel);
>> + }
>
> That'd leave the dirty bit set on the buffer...

OK, I have used an equivalent of FlushBuffer: FlushSingleBuffer, which is a
copycat of the one in your previous patch... I am also attaching a patch for
~9.4, which uses XLOG_HEAP_NEWPAGE instead.
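
So the redo branch ends up shaped roughly like this (a sketch of the intent
only; FlushSingleBuffer is the helper added in the attached patch, so treat
its exact signature as tentative):

	if (XLogReadBufferForRedo(record, 0, &buffer) != BLK_RESTORED)
		elog(ERROR, "unexpected XLogReadBufferForRedo result when restoring backup block");

	if (xlrec.is_flush)
	{
		/*
		 * Write the restored page out through the buffer manager rather
		 * than smgrwrite()'ing it behind the buffer manager's back, so the
		 * buffer is left clean once the on-disk copy is authoritative.
		 */
		FlushSingleBuffer(buffer);		/* helper from the attached patch */
	}

	UnlockReleaseBuffer(buffer);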
-- 
Michael


20151209_replay_unlogged_master_95.patch
Description: binary/octet-stream


20151209_replay_unlogged_94.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Include ppc64le build type for back branches

2015-12-08 Thread Peter Eisentraut
On 12/8/15 1:06 PM, Robert Haas wrote:
> I don't really want to get into an argument about this, but is the
> reason we haven't updated config.guess and config.sub in the past that
> it presents an actual stability risk, or just that nobody's asked
> before?  Because the first one is a good reason not to do it now, but
> the second one isn't.

We had some incompatibility issues with these updates in the very
distant past, but I don't think this would be an issue anymore.  The
updates themselves are better and smaller now, and the buildfarm
coverage is quite good.  It would be prudent, however, to do a manual
verification of the changes, especially in the distant back branches.

I think there is a slippery-slope argument, but I don't think we get
that many new platforms all the time.  If people start wanting
a new ARM variant every six weeks, then we'll have to put a stop to it,
perhaps.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Weird portability issue in TestLib.pm's slurp_file

2015-12-08 Thread Robert Haas
On Tue, Dec 8, 2015 at 12:31 PM, Tom Lane  wrote:
> For grins I tried running the TAP tests on my ancient HPUX box that
> hosts pademelon and gaur.  At first they showed a lot of failures,
> which I eventually determined were happening because slurp_file was
> only retrieving part of the postmaster logfile, causing issues_sql_like
> to mistakenly report a failure.  I don't know exactly why slurp_file
> is misbehaving; it may well be a bug in perl 5.8.9.  But anyway,
> rewriting it like this makes the problem go away:
>
> diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
> index 
> da67f33c7e38929067350c7cf0da8fd8c5c1e43d..3940891e5d0533af93bacf3ff4af5a6fb5f10117
>  100644
> *** a/src/test/perl/TestLib.pm
> --- b/src/test/perl/TestLib.pm
> *** sub slurp_dir
> *** 158,166 
>
>   sub slurp_file
>   {
> local $/;
> !   local @ARGV = @_;
> !   my $contents = <>;
> $contents =~ s/\r//g if $Config{osname} eq 'msys';
> return $contents;
>   }
> --- 158,169 
>
>   sub slurp_file
>   {
> +   my ($filename) = @_;
> local $/;
> !   open(my $in, $filename)
> ! or die "could not read \"$filename\": $!";
> !   my $contents = <$in>;
> !   close $in;
> $contents =~ s/\r//g if $Config{osname} eq 'msys';
> return $contents;
>   }
>
> To my admittedly-no-perl-expert eye, the existing coding is too cute
> by half anyway, eg what it will do in error situations is documented
> nowhere that I can find in man perlop.
>
> Any objections to pushing this?

Your rewrite looks like a big improvement to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

