Re: [HACKERS] Microvacuum support for Hash Index

2016-11-01 Thread Ashutosh Sharma
Hi,

> While replaying the delete/vacuum record on standby, it can conflict
> with some already-running queries.  Basically, the replay can remove
> rows that may still be visible on the standby.  You need to resolve
> conflicts similar to what we do for btree delete records (refer to
> btree_xlog_delete).

Agreed, thanks for pointing this out. I have taken care of it in the
attached v2 patch.
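For reference, here is a minimal sketch of the shape of that conflict
handling in the hash redo path, modeled on btree_xlog_delete(); the record
type and field names below are assumptions, not necessarily what the v2
patch uses:

/* Hedged sketch only -- modeled on btree_xlog_delete(), names assumed */
static void
hash_xlog_vacuum_one_page(XLogReaderState *record)
{
	xl_hash_vacuum *xldata = (xl_hash_vacuum *) XLogRecGetData(record);

	/*
	 * If we are running in hot standby, the tuples being removed may still
	 * be visible to running queries, so resolve the conflict before
	 * applying the change, just as the btree delete redo does.
	 */
	if (InHotStandby)
	{
		RelFileNode rnode;

		XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
		ResolveRecoveryConflictWithSnapshot(xldata->latestRemovedXid, rnode);
	}

	/* ... then replay the page changes as usual ... */
}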

> + /*
> + * Write-lock the meta page so that we can decrement
> + * tuple count.
> + */
> + _hash_chgbufaccess(rel, metabuf, HASH_NOLOCK, HASH_WRITE);
> +
> + _hash_vacuum_one_page(rel, metabuf, buf, bucket_buf,
> +  (buf == bucket_buf) ? true : false);
> +
> + _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK);
>
> It seems the meta page lock is held here for longer than required, and it
> is also not needed when there are no deletable items on the page. You can
> take the metapage lock before decrementing the count.

Ok. Corrected. Please refer to the attached v2 patch.
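Roughly, the reworked flow looks like the sketch below (a sketch only,
reusing the names from the quoted hunk; the v2 patch itself is
authoritative):

/* Sketch only: skip the metapage entirely when nothing is deletable */
if (ndeletable > 0)
{
	PageIndexMultiDelete(page, deletable, ndeletable);

	/* Hold the metapage write-lock just for the tuple-count decrement */
	_hash_chgbufaccess(rel, metabuf, HASH_NOLOCK, HASH_WRITE);
	metap->hashm_ntuples -= ndeletable;
	_hash_chgbufaccess(rel, metabuf, HASH_WRITE, HASH_NOLOCK);
}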


> Spurious space.  There are some other similar spurious white space
> changes in patch, remove them as well.

Corrected. Please refer to the attached v2 patch.
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index db73f05..4a4d614 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -157,7 +157,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
 	if (buildstate.spool)
 	{
 		/* sort the tuples and insert them into the index */
-		_h_indexbuild(buildstate.spool);
+		_h_indexbuild(buildstate.spool, heap->rd_node);
 		_h_spooldestroy(buildstate.spool);
 	}
 
@@ -196,6 +196,8 @@ hashbuildCallback(Relation index,
 	Datum		index_values[1];
 	bool		index_isnull[1];
 	IndexTuple	itup;
+	Relation	rel;
+	RelFileNode	rnode;
 
 	/* convert data to a hash key; on failure, do not insert anything */
 	if (!_hash_convert_tuple(index,
@@ -212,8 +214,12 @@ hashbuildCallback(Relation index,
 		/* form an index tuple and point it at the heap tuple */
 		itup = index_form_tuple(RelationGetDescr(index),
 								index_values, index_isnull);
+		/* Get RelfileNode from relation OID */
+		rel = relation_open(htup->t_tableOid, NoLock);
+		rnode = rel->rd_node;
+		relation_close(rel, NoLock);
 		itup->t_tid = htup->t_self;
-		_hash_doinsert(index, itup);
+		_hash_doinsert(index, itup, rnode);
 		pfree(itup);
 	}
 
@@ -245,7 +251,7 @@ hashinsert(Relation rel, Datum *values, bool *isnull,
 	itup = index_form_tuple(RelationGetDescr(rel), index_values, index_isnull);
 	itup->t_tid = *ht_ctid;
 
-	_hash_doinsert(rel, itup);
+	_hash_doinsert(rel, itup, heapRel->rd_node);
 
 	pfree(itup);
 
@@ -325,14 +331,21 @@ hashgettuple(IndexScanDesc scan, ScanDirection dir)
 		if (scan->kill_prior_tuple)
 		{
 			/*
-			 * Yes, so mark it by setting the LP_DEAD state in the item flags.
+			 * Yes, so remember it for later. (We'll deal with all such
+			 * tuples at once right after leaving the index page or at
+			 * end of scan.)
 			 */
-			ItemIdMarkDead(PageGetItemId(page, offnum));
+			if (so->killedItems == NULL)
+				so->killedItems = palloc(MaxIndexTuplesPerPage *
+										 sizeof(HashScanPosItem));
 
-			/*
-			 * Since this can be redone later if needed, mark as a hint.
-			 */
-			MarkBufferDirtyHint(buf, true);
+			if (so->numKilled < MaxIndexTuplesPerPage)
+			{
+				so->killedItems[so->numKilled].heapTid = so->hashso_heappos;
+				so->killedItems[so->numKilled].indexOffset =
+							ItemPointerGetOffsetNumber(&(so->hashso_curpos));
+				so->numKilled++;
+			}
 		}
 
 		/*
@@ -439,6 +452,9 @@ hashbeginscan(Relation rel, int nkeys, int norderbys)
 
 	so->hashso_skip_moved_tuples = false;
 
+	so->killedItems = NULL;
+	so->numKilled = 0;
+
 	scan->opaque = so;
 
 	return scan;
@@ -454,6 +470,10 @@ hashrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
 	HashScanOpaque so = (HashScanOpaque) scan->opaque;
 	Relation	rel = scan->indexRelation;
 
+	/* Before leaving current page, deal with any killed items */
+	if (so->numKilled > 0)
+		hashkillitems(scan);
+
 	_hash_dropscanbuf(rel, so);
 
 	/* set position invalid (this will cause _hash_first call) */
@@ -480,6 +500,10 @@ hashendscan(IndexScanDesc scan)
 	HashScanOpaque so = (HashScanOpaque) scan->opaque;
 	Relation	rel = scan->indexRelation;
 
+	/* Before leaving current page, deal with any killed items */
+	if (so->numKilled > 0)
+		hashkillitems(scan);
+
 	_hash_dropscanbuf(rel, so);
 
 	pfree(so);
@@ -809,6 +833,15 @@ hashbucketcleanup(Relation rel, Buffer bucket_buf,
 			PageIndexMultiDelete(page, deletable, ndeletable);
 			bucket_dirty = true;
 
+			/*
+			 * Let us mark the page as clean if vacuum removes the DEAD tuples
+			 * from an index page. We do this by clearing LH_PAGE_HAS_DEAD_TUPLES
+			 * flag.
+			 */
+			if (tuples_removed && *tuples_removed > 0 &&
+				opaque->hasho_flag & LH_PAGE_HAS_DEAD_TUPLES)
+				opaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
+
 			MarkBufferDirty(buf);
 
 			/* XLOG stuff */
diff --git 

Re: [HACKERS] WAL consistency check facility

2016-11-01 Thread Michael Paquier
On Tue, Nov 1, 2016 at 10:31 PM, Robert Haas  wrote:
> IMHO, your rewrite of this patch was a bit heavy-handed.

OK... Sorry for that.

> I haven't
> scrutinized the code here so maybe it was a big improvement, and if so
> fine, but if not it's better to collaborate with the author than to
> take over.

The review ended up turning into a large rewrite, and that seemed more
understandable than a review enumerating all the small tweaks and things I
ran into while reading it. I have also experimented with a couple of ideas
in the patch I attached, so in the end it should be a gain for everybody. I
think the latest patch is an improvement; if you want to form your own
opinion on the matter, looking at the differences between the two patches
would be the most direct way to go.

> In any case, yeah, I think you should put that back.

Here you go: this parameter is back, and the allocation of the masked
buffers is now done beforehand, close to the moment the XLogReader is
allocated. I have also removed wal_consistency from PostgresNode.pm; small
buildfarm machines would really suffer with it, and hamster is very good at
tracking race conditions when running TAP tests. On top of that I have
replaced a bunch of 0xF thingies with their PG_UINT_MAX equivalents to keep
things cleaner.

Now, I have put the GUC-related code back into exactly the shape it had
originally. Here are a couple of comments regarding it after review:
- Let's drop 'none' as a magic keyword. Users are going to use an
empty string, and the default should be defined as such IMO.
- Using an allocated array of booleans to store the value for each
RMGR could be replaced by an integer manipulated with bitwise shifts. Your
option looks better and makes the code cleaner (see the illustration below).
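For illustration only (this is not code from the patch), the two
representations being compared look roughly like this:

/* Illustration only -- names are made up, not from the patch */
/* (a) allocated array of booleans, indexed by resource manager ID */
bool	   *wal_consistency;		/* palloc'd with RM_MAX_ID + 1 entries */
#define rmgr_check_enabled_array(rmid)	(wal_consistency[rmid])

/* (b) the same information packed into one integer with bitwise shifts */
uint32		wal_consistency_mask;
#define rmgr_check_enabled_mask(rmid) \
	((wal_consistency_mask & ((uint32) 1 << (rmid))) != 0)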

One more nitpicking remark: code comments don't refer much to RMIDs; they
use the term "resource managers" more generally. I'd suggest doing the
same.
-- 
Michael


walconsistency_v10.patch
Description: application/download



Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-11-01 Thread Tomas Vondra

On 11/01/2016 08:13 PM, Robert Haas wrote:

On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
 wrote:

Honestly, I have no idea what to think about this ...


I think a lot of the details here depend on OS scheduler behavior.
For example, here's one of the first scalability graphs I ever did:

http://rhaas.blogspot.com/2011/09/scalability-in-graphical-form-analyzed.html

It's a nice advertisement for fast-path locking, but look at the funny
shape of the red and green lines between 1 and 32 cores.  The curve is
oddly bowl-shaped.  As the post discusses, we actually dip WAY under
linear scalability in the 8-20 core range and then shoot up like a
rocket afterwards so that at 32 cores we actually achieve super-linear
scalability. You can't blame this on anything except Linux.  Someone
shared BSD graphs (I forget which flavor) with me privately and they
don't exhibit this poor behavior.  (They had different poor behaviors
instead - performance collapsed at high client counts.  That was a
long time ago so it's probably fixed now.)

This is why I think it's fundamentally wrong to look at this patch and
say "well, contention goes down, and in some cases that makes
performance go up, but because in other cases it decreases performance
or increases variability we shouldn't commit it".  If we took that
approach, we wouldn't have fast-path locking today, because the early
versions of fast-path locking could exhibit *major* regressions
precisely because of contention shifting to other locks, specifically
SInvalReadLock and msgNumLock.  (cf. commit
b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4).  If we say that because the
contention on those other locks can get worse as a result of
contention on this lock being reduced, or even worse, if we try to
take responsibility for what effect reducing lock contention might
have on the operating system scheduler discipline (which will
certainly differ from system to system and version to version), we're
never going to get anywhere, because there's almost always going to be
some way that reducing contention in one place can bite you someplace
else.



I don't think I've suggested not committing any of the clog patches (or 
other patches in general) because shifting the contention somewhere else 
might cause regressions. At the end of the last CF, however, I stated that 
we need to better understand the impact on various workloads, and I think 
Amit agreed with that conclusion.


We have that understanding now, I believe - also thanks to your idea of 
sampling wait events data.


You're right that we can't fix all the contention points in one patch, and 
that shifting the contention may cause regressions. But we should at 
least understand which workloads might be impacted, how serious the 
regressions may get, etc. That is why all the testing was done.




I also believe it's pretty normal for patches that remove lock
contention to increase variability.  If you run an auto race where
every car has a speed governor installed that limits it to 80 kph,
there will be much less variability in the finish times than if you
remove the governor, but that's a stupid way to run a race.  You won't
get much innovation around increasing the top speed of the cars under
those circumstances, either.  Nobody ever bothered optimizing the
contention around msgNumLock before fast-path locking happened,
because the heavyweight lock manager burdened the system so heavily
that you couldn't generate enough contention on it to matter.
Similarly, we're not going to get much traction around optimizing the
other locks to which contention would shift if we applied this patch
unless we apply it.  This is not theoretical: EnterpriseDB staff have
already done work on trying to optimize WALWriteLock, but it's hard to
get a benefit.  The more other contention we eliminate, the
easier it will be to see whether a proposed change to WALWriteLock
helps.


Sure, I understand that. My main worry was that people will get worse 
performance with the next major version than what they get now (assuming 
we don't manage to address the other contention points). That is 
difficult to explain to users & customers, no matter how reasonable it 
seems to us.


The difference is that both the fast-path locks and msgNumLock went into 
9.2, so end users probably never saw that regression. But we don't 
know whether that will be the case for clog and WAL.


Perhaps you have a working patch addressing the WAL contention, so that 
we could see how that changes the results?



> Of course, we'll also be more at the mercy of operating system

> scheduler discipline, but that's not all a bad thing either.  The
> Linux kernel guys have been known to run PostgreSQL to see whether
> proposed changes help or hurt, but they're not going to try those
> tests after applying patches that we rejected because they expose us
> to existing Linux shortcomings.



I might be wrong, but I doubt the kernel guys are running particularly 
wide set 

Re: [HACKERS] WAL consistency check facility

2016-11-01 Thread Peter Geoghegan
On Mon, Oct 31, 2016 at 5:31 AM, Robert Haas  wrote:
> On Fri, Oct 28, 2016 at 2:05 AM, Michael Paquier
>  wrote:
>> And here we go. Here is a review as well as a large brush-up for this
>> patch. A couple of things:
>> - wal_consistency is using a list of RMGRs, at the cost of being
>> PGC_POSTMASTER. I'd suggest making it PGC_SUSER, and use a boolean (I
>> have been thinking hard about that, and still I don't see the point).
>> It is rather easy to for example default it to false, and enable it to
>> true to check if a certain code path is correctly exercised or not for
>> WAL consistency. Note that this simplification reduces the patch size
>> by 100~150 lines. I know, I know, I'd expect some complains about
>> that
>
> I don't understand how you can fail to see the point of that.  As you
> yourself said, this facility generates a ton of WAL.  If you're
> focusing on one AM, why would you want to be forced to incur the
> overhead for every other AM?  A good deal has been written about this
> upthread already, and just saying "I don't see the point" seems to be
> ignoring the explanations already given.

+1. I strongly agree.


-- 
Peter Geoghegan




Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Amit Langote
On 2016/11/02 2:53, Robert Haas wrote:
> On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
>  wrote:
>>  [ new patches ]
> 
> Reviewing 0005:
> 
> Your proposed commit messages says this:
> 
>> If relation is the target table (UPDATE and DELETE), flattening is
>> done regardless (scared to modify inheritance_planner() yet).

I should have explained my thinking behind why I ended up not handling the
result relation case:

While set_append_rel_size() can be safely invoked recursively on the roots
of partition sub-trees, I was not quite sure if inheritance_planner() is
amenable to such recursive invocation.  While the former is relatively
straightforward baserel processing, the latter is not-so-straightforward
transformation of the whole query for every target child relation of the
target parent.

> In the immortal words of Frank Herbert: “I must not fear. Fear is the
> mind-killer. Fear is the little-death that brings total obliteration.
> I will face my fear. I will permit it to pass over me and through me.
> And when it has gone past I will turn the inner eye to see its path.
> Where the fear has gone there will be nothing. Only I will remain.”
> 
> In other words, I'm not going to accept fear as a justification for
> randomly-excluding the target-table case from this code.  If there's
> an actual reason not to do this in that case or some other case, then
> let's document that reason.  But weird warts in the code that are
> justified only by nameless anxiety are not good.

Perhaps the above comment expands a little on the actual reason.  Though I
guess it still amounts to giving up on doing a full analysis of whether
recursive processing within inheritance_planner() is feasible.

I think we could just skip this patch as long as a full investigation into
inheritance_planner() issues is not done.  It's possible that there might
be other places in the planner code which are not amenable to the proposed
multi-level AppendRelInfos.  Ashutosh had reported one [1], wherein
lateral joins wouldn't work with multi-level partitioned table owing to
the multi-level AppendRelInfos (the patch contains a hunk to address that).

> Of course, the prior question is whether we should EVER be doing this.
> I realize that something like this MAY be needed for partition-wise
> join, but the mission of this patch is not to implement partition-wise
> join.  Does anything in this patch series really require this?  If so,
> what?  If not, how about we leave it out and refactor it when that
> change is really needed for something?

Nothing *requires* it as such.  One benefit I see is that exclusion of the
root of some partition sub-tree means the whole sub-tree is excluded in
one go.  Currently, because of the flattening, every relation in that
sub-tree would be excluded separately, needlessly repeating the expensive
constraint exclusion proof. But I guess that's just an optimization.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFjFpReZF34MDbY95xoATi0xVj2mAry4-LHBWVBayOc8gj%3Diqg%40mail.gmail.com






Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-01 Thread Karl O. Pinc
On Mon, 31 Oct 2016 09:26:27 +0100
Gilles Darold  wrote:

> Le 30/10/2016 à 08:04, Karl O. Pinc a écrit :

> Attached patch v11 include your patch.
> 
> >
> > I have some questions about logfile_writename():
> >
> >  Why does the logfile_open() call fail silently?

> logfile_open() "fail silently" in logfile_writename(), like in other
> parts of syslogger.c where it is called, because this is a function()
> that already report a message when an error occurs ("could not open
> log file..."). I think I have already replied to this question.

Please apply the attached patch on top of your v11 patch.
It simulates a logfile_open() failure.  Upon simulated failure you do
not get a "currrent_logfile" file, and neither do you get any
indication of any problems anywhere in the logs.  It's failing
silently.

To test I create a cluster, start the server, and look for
current_logfile and at the logs.

(I finally got around to writing down the process I use to install
and run a patched server, instead of just poking it with a
stick until it works every time I get back to hacking pg.  
I'd be happy to share my process with you
if you're interested.  If you cannot reproduce my results please
share with me your procedure for cluster creation and runtime testing
so I can see why I find a problem and you don't.  Thank you.)

I don't expect to have a lot of time to work on pg in the next
36 hours.  After that I hope to push this through to completion.
I did want to get something back to you now, hence this email.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


patch_pg_current_logfile-v11.diff.silentfail
Description: Binary data



Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-11-01 Thread David Rowley
On 31 October 2016 at 18:37, David Rowley  wrote:
> I've rebased the changes I made to address this back in April to current 
> master.

Please note that I went ahead and marked this as "Ready for
committer". It was previously marked as such in a previous commitfest.
The changes made since the last version were based on feedback from Tom.

If anyone thinks this is not correct then please mark as "Ready for review".

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-01 Thread Tom Lane
Vik Fearing  writes:
> I'm really late to this discussion, and I apologize for that; but I'm
> wondering why we're doing all this through some random file on disk.

Well, the log collector is intentionally not connected to very much.

> Why not just use the stats collector and have everything we'd want in a
> pg_stat_logger view just like we have for pg_stat_archiver and others?

This would, at minimum, require the log collector process to not start
until after the stats collector (so that it could be connected to the
stats collector's input socket).  But perhaps more to the point, it
establishes the stats collector as infrastructure for the log collector,
which seems pretty backwards.  It's not totally out of the question that
that could result in a deadlock --- stats collector trying to write to
the log while log collector is trying to write to the stats socket,
and both getting blocked.  Also, this doesn't seem like the sort of
info that people would be okay with having the stats collector drop
under load.

I have to agree that the intermediate disk file is kind of ugly.
But passing this info through the stats collector is not better.

regards, tom lane




Re: [HACKERS] Hash Indexes

2016-11-01 Thread Robert Haas
On Mon, Oct 24, 2016 at 10:30 AM, Amit Kapila  wrote:
> [ new patches ]

I looked over parts of this today, mostly the hashinsert.c changes.

+/*
+ * Copy bucket mapping info now;  The comment in _hash_expandtable where
+ * we copy this information and calls _hash_splitbucket explains why this
+ * is OK.
+ */

So, I went and tried to find the comments to which this comment is
referring and didn't have much luck.  At the point this code is
running, we have a pin but no lock on the metapage, so this is only
safe if changing any of these fields requires a cleanup lock on the
metapage.  If that's true, it seems like you could just make the
comment say that; if it's false, you've got problems.

This code seems rather pointless anyway, the way it's written.  All of
these local variables are used in exactly one place, which is here:

+_hash_finish_split(rel, metabuf, buf, nbuf, maxbucket,
+   highmask, lowmask);

But you hold the same locks at the point where you copy those values
into local variables and the point where that code runs.  So if the
code is safe as written, you could instead just pass
metap->hashm_maxbucket, metap->hashm_highmask, and
metap->hashm_lowmask to that function instead of having these local
variables.  Or, for that matter, you could just let that function read
the data itself: it's got metabuf, after all.

+ * In future, if we want to finish the splits during insertion in new
+ * bucket, we must ensure the locking order such that old bucket is locked
+ * before new bucket.

Not if the locks are conditional anyway.

+nblkno = _hash_get_newblk(rel, pageopaque);

I think this is not a great name for this function.  It's not clear
what "new blocks" refers to, exactly.  I suggest
FIND_SPLIT_BUCKET(metap, bucket) or OLD_BUCKET_TO_NEW_BUCKET(metap,
bucket) returning a new bucket number.  I think that macro can be
defined as something like this: bucket + (1 <<
(fls(metap->hashm_maxbucket) - 1)). Then do nblkno =
BUCKET_TO_BLKNO(metap, newbucket) to get the block number.  That'd all
be considerably simpler than what you have for hash_get_newblk().
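In other words, something along these lines (a sketch of the suggestion,
untested):

/* Sketch of the suggested macro -- the name is just a suggestion */
#define OLD_BUCKET_TO_NEW_BUCKET(metap, bucket) \
	((bucket) + (1 << (fls((metap)->hashm_maxbucket) - 1)))

/*
 * ... and at the call site, instead of _hash_get_newblk():
 *
 *		newbucket = OLD_BUCKET_TO_NEW_BUCKET(metap, pageopaque->hasho_bucket);
 *		nblkno = BUCKET_TO_BLKNO(metap, newbucket);
 */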

Here's some test code I wrote, which seems to work:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>

int
newbucket(int bucket, int nbuckets)
{
    assert(bucket < nbuckets);
    return bucket + (1 << (fls(nbuckets) - 1));
}

int
main(int argc, char **argv)
{
    int         nbuckets = 1;
    int         restartat = 1;
    int         splitbucket = 0;

    while (splitbucket < 32)
    {
        printf("old bucket %d splits to new bucket %d\n", splitbucket,
               newbucket(splitbucket, nbuckets));
        if (++splitbucket >= restartat)
        {
            splitbucket = 0;
            restartat *= 2;
        }
        ++nbuckets;
    }

    exit(0);
}

Moving on ...

 /*
  * ovfl page exists; go get it.  if it doesn't have room, we'll
- * find out next pass through the loop test above.
+ * find out next pass through the loop test above.  Retain the
+ * pin, if it is a primary bucket page.
  */
-		_hash_relbuf(rel, buf);
+		if (pageopaque->hasho_flag & LH_BUCKET_PAGE)
+			_hash_chgbufaccess(rel, buf, HASH_READ, HASH_NOLOCK);
+		else
+			_hash_relbuf(rel, buf);

It seems like it would be cheaper, safer, and clearer to test whether
buf != bucket_buf here, rather than examining the page opaque data.
That's what you do down at the bottom of the function when you ensure
that the pin on the primary bucket page gets released, and it seems
like it should work up here, too.
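That is, something like this sketch:

/* Sketch of the suggested simplification */
if (buf != bucket_buf)
	_hash_relbuf(rel, buf);		/* release pin and lock */
else
	_hash_chgbufaccess(rel, buf, HASH_READ, HASH_NOLOCK);	/* keep the pin */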

+		bool		retain_pin = false;
+
+		/* page flags must be accessed before releasing lock on a page. */
+		retain_pin = pageopaque->hasho_flag & LH_BUCKET_PAGE;

Similarly.

I have also attached a patch with some suggested comment changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


hashinsert-comments.patch
Description: application/download



Re: [HACKERS] Proposal: scan key push down to heap [WIP]

2016-11-01 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Dilip Kumar
> Sent: Saturday, October 29, 2016 3:48 PM
> To: Andres Freund
> Cc: Tom Lane; Alvaro Herrera; pgsql-hackers
> Subject: Re: [HACKERS] Proposal: scan key push down to heap [WIP]
> 
> On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund  wrote:
> > The gains are quite noticeable in some cases. So if we can make it work
> > without noticeable downsides...
> >
> > What I'm worried about though is that this, afaics, will quite
> > noticeably *increase* total cost in cases with a noticeable number of
> > columns and a not that selective qual. The reason for that being that
> > HeapKeyTest() uses heap_getattr(), whereas upper layers use
> > slot_getattr(). The latter "caches" repeated deforms, the former
> > doesn't... That'll lead to deforming being essentially done twice, and
> > it's quite often already a major cost of query processing.
> 
> What about putting slot reference inside HeapScanDesc ?. I know it
> will make ,heap layer use executor structure but just a thought.
> 
> I have quickly hacked this way where we use slot reference in
> HeapScanDesc and directly use
>  slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
> use _heap_getattr) and measure the worst case performance (what you
> have mentioned above.)
> 
> My Test: (21 column table with varchar in beginning + qual is on last
> few column + varying selectivity )
> 
> postgres=# \d test
>   Table "public.test"
>  Column |   Type| Modifiers
> +---+---
>  f1 | integer   |
>  f2 | character varying |
>  f3 | integer   |
>  f4 | integer   |
>  f5 | integer   |
>  f6 | integer   |
>  f7 | integer   |
>  f8 | integer   |
>  f9 | integer   |
>  f10| integer   |
>  f11| integer   |
>  f12| integer   |
>  f13| integer   |
>  f14| integer   |
>  f15| integer   |
>  f16| integer   |
>  f17| integer   |
>  f18| integer   |
>  f19| integer   |
>  f20| integer   |
>  f21| integer   |
> 
> tuple count : 1000 (10 Million)
> explain analyze select * from test where f21< $1 and f20 < $1 and f19
> < $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).
> 
> Target code base:
> ---
> 1. Head
> 2. Heap_scankey_pushdown_v1
> 3. My hack for keeping slot reference in HeapScanDesc
> (v1+use_slot_in_HeapKeyTest)
> 
> Result:
> Selectivity Head   scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
> 0.1 3880  2980 2747
> 0.2 4041  3187 2914
> 0.5 5051  4921 3626
> 0.8 5378  7296 3879
> 1.0 6161  8525 4575
> 
> Performance graph is attached in the mail..
> 
> Observation:
> 
> 1. Heap_scankey_pushdown_v1, start degrading after very high
> selectivity (this behaviour is only visible if table have 20 or more
> columns, I tested with 10 columns but with that I did not see any
> regression in v1).
> 
> 2. (v1+use_slot_in_HeapKeyTest) is always winner, even at very high 
> selectivity.
> 
Prior to this interface change, a starting point might be to restrict scan key
pushdown to cases where the OpExpr references a column with a static
attcacheoff. Such a column can be fetched without walking the tuple from the
beginning, so the tuple-deforming cost is not a downside.
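A rough sketch of that restriction (the function and variable names are
assumed, for illustration only):

/* Rough illustration only -- names are assumed, not from any patch */
static bool
qual_column_is_pushdown_safe(TupleDesc tupdesc, AttrNumber varattno)
{
	Form_pg_attribute attr = tupdesc->attrs[varattno - 1];

	/*
	 * attcacheoff >= 0 means the column starts at a fixed offset, so
	 * heap_getattr() can fetch it without walking the earlier attributes
	 * and the pushed-down qual does not pay the deforming cost twice.
	 */
	return attr->attcacheoff >= 0;
}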

By the way, I'm a bit skeptical about whether this enhancement is really worth
the effort it takes, because we can now easily increase the number of
processor cores running a seq-scan with a qualifier, especially when it has
high selectivity.
What are your thoughts?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 



Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-01 Thread Vik Fearing
On 03/09/2016 07:32 PM, Gilles Darold wrote:
> This patch implements the pg_current_logfile() function that can be
> used as follow. The function returns NULL when logging_collector
> is not active and outputs a warning.

I'm really late to this discussion, and I apologize for that; but I'm
wondering why we're doing all this through some random file on disk.

Why not just use the stats collector and have everything we'd want in a
pg_stat_logger view just like we have for pg_stat_archiver and others?
That makes the most sense to me.

We could then also count the number of rotations per time/size and whatnot.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Specifying the log file name of pgbench -l option

2016-11-01 Thread Fabien COELHO


Hello Masahiko,


So I would suggest to:
 - fix the compilation issue
 - leave -l/--log as it is, i.e. use "pgbench_log" as a prefix
 - add --log-prefix=... (long option only) for changing this prefix


I agree. It's better to add a separate option to specify the prefix
of the log file instead of changing the existing behaviour. The attached
latest patch incorporates the review comments.
Please review it.


Patch applies, compiles and works for me.

This new option is for benchmarking, so "benchmarking_option_set" should 
be set to true.


To simplify the code, I would suggest initializing "logfile_prefix" 
explicitly to the default value, which is then overridden when the 
option is set; that allows getting rid of the "prefix" variable later.
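Something like the following sketch (names assumed; not the actual patch):

/* Sketch of the suggestion -- names assumed, not the actual patch */
static char *logfile_prefix = "pgbench_log";	/* explicit default up front */

/* --log-prefix handling then just overrides that default: */
static void
set_log_prefix(const char *arg)
{
	benchmarking_option_set = true;
	logfile_prefix = pg_strdup(arg);
}

/*
 * ... and the later code builds the file name from logfile_prefix directly,
 * with no separate "prefix" variable, e.g.:
 *		snprintf(logpath, sizeof(logpath), "%s.%d",
 *				 logfile_prefix, (int) getpid());
 */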


There is no reason to change the documentation by breaking a line at 
another place if the text is not changed (where ... with 1).


I would suggest simplifying the documentation a little bit:
  "prefix of log file ..." ->
  "default log file prefix that can be changed with ..."

Otherwise the explanations seem clear enough to me. If a native English 
speaker could check them though, it would be nice.


--
Fabien.




Re: [HACKERS] pageinspect: Hash index support

2016-11-01 Thread Peter Eisentraut
On 10/3/16 8:52 AM, Jesper Pedersen wrote:
> On 09/29/2016 04:02 PM, Peter Eisentraut wrote:
>> On 9/29/16 4:00 PM, Peter Eisentraut wrote:
>>> Since the commit fest is drawing to a close, I'll set this patch as
>>> returned with feedback.
>>
>> Actually, the CF app informs me that moving to the next CF is more
>> appropriate, so I have done that.
>>
> 
> Ok, thanks for your feedback.
> 
> Maybe "Returned with Feedback" is more appropriate, as there is still 
> development left.

I have applied the documentation change that introduced subsections,
which seems quite useful independently.  I have also committed the tests
that I proposed and will work through the failures.

As no new patch has been posted for the 2016-11 CF, I will close the
patch entry now.  Please submit an updated patch when you have the time,
keeping an eye on ongoing work to update hash indexes.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] auto_explain vs. parallel query

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 10:58 AM, Tomas Vondra
 wrote:
> Damn! You're right of course. Who'd guess I need more coffee this early?
>
> Attached is a fix replacing the flag with an array of flags, indexed by
> ParallelMasterBackendId. Hopefully that makes it work with multiple
> concurrent parallel queries ... still, I'm not sure this is the right
> solution.

I feel like it isn't.  I feel like this ought to go in the DSM for
that parallel query, not the main shared memory segment, but I'm not
sure how to accomplish that offhand.  Also, if we do solve it this
way, surely we don't need the locking.  The flag's only set before any
workers have started and never changes thereafter.
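For illustration, the shape being discussed is roughly this sketch (names
assumed; not the actual patch):

/* Sketch only -- names are assumed, not the actual patch */
static bool *auto_explain_active;	/* in shared memory, MaxBackends + 1 slots */

/* Leader: runs before any worker is launched, so no lock is required. */
static void
auto_explain_mark_leader(void)
{
	auto_explain_active[MyBackendId] = true;
}

/* Worker: the flag never changes while workers exist, so a plain read is safe. */
static bool
auto_explain_leader_is_active(void)
{
	return ParallelMasterBackendId != InvalidBackendId &&
		auto_explain_active[ParallelMasterBackendId];
}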

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] commitfest 2016-11 status summary

2016-11-01 Thread Haribabu Kommi
Hi All,

The 2016-11 commitfest is officially started. Please add any further
development patches into 2017-01 commitfest.

The current status summary is:

Needs review: 98
Waiting on author: 9
Ready for Committer: 18
Committed: 19
Moved to next CF: 0
Rejected: 0
Returned with feedback: 3
TOTAL: 147


Already there is an overall progress of 14%.

There are plenty of patches that are in "ready for committer" state;
committers, please have a look at those patches and come to some conclusion
on them.

There are some patches on "waiting on author" with no updates from author
for a long time, please take some action within 7-10 days.

There are still some patches with no reviewer assigned. Please consider
reviewing some patches. The review can cover code, testing, etc. - whichever
you are comfortable with. We need your help.

PS: This is my first time as CFM.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] commit fest manager for CF 2016-11?

2016-11-01 Thread Haribabu Kommi
On Tue, Nov 1, 2016 at 11:19 PM, Michael Paquier 
wrote:

> On Tue, Nov 1, 2016 at 4:53 PM, Michael Paquier
>  wrote:
> > On Tue, Nov 1, 2016 at 12:20 PM, Thomas Munro
> >  wrote:
> >> On Tue, Nov 1, 2016 at 4:15 PM, Peter Eisentraut
> >>  wrote:
> >>> On 10/31/16 9:39 PM, Michael Paquier wrote:
>  We are on the 1st, at least in my timezone. So the CF should start
>  really soon, meaning that no new patches would get in. In a couple of
>  hours that would be fine? I recall that the last times we tried to
>  sync with the US East coast time.
> >>>
> >>> https://en.wikipedia.org/wiki/Anywhere_on_Earth
> >>
> >> +1
> >
> > OK, why not. And so this lets exactly 4 hours at the moment I write this
> email.
>
> And done.
>

Thanks Michael. I will send the commitfest summary in a different mail.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Typo in comment in contrib/postgres_fdw/deparse.c

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 8:20 AM, Etsuro Fujita
 wrote:
> I ran into a typo in a comment in contrib/postgres_fdw/deparse.c. Please
> find attached a patch.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-11-01 Thread Robert Haas
On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
 wrote:
> Honestly, I have no idea what to think about this ...

I think a lot of the details here depend on OS scheduler behavior.
For example, here's one of the first scalability graphs I ever did:

http://rhaas.blogspot.com/2011/09/scalability-in-graphical-form-analyzed.html

It's a nice advertisement for fast-path locking, but look at the funny
shape of the red and green lines between 1 and 32 cores.  The curve is
oddly bowl-shaped.  As the post discusses, we actually dip WAY under
linear scalability in the 8-20 core range and then shoot up like a
rocket afterwards so that at 32 cores we actually achieve super-linear
scalability. You can't blame this on anything except Linux.  Someone
shared BSD graphs (I forget which flavor) with me privately and they
don't exhibit this poor behavior.  (They had different poor behaviors
instead - performance collapsed at high client counts.  That was a
long time ago so it's probably fixed now.)

This is why I think it's fundamentally wrong to look at this patch and
say "well, contention goes down, and in some cases that makes
performance go up, but because in other cases it decreases performance
or increases variability we shouldn't commit it".  If we took that
approach, we wouldn't have fast-path locking today, because the early
versions of fast-path locking could exhibit *major* regressions
precisely because of contention shifting to other locks, specifically
SInvalReadLock and msgNumLock.  (cf. commit
b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4).  If we say that because the
contention on those other locks can get worse as a result of
contention on this lock being reduced, or even worse, if we try to
take responsibility for what effect reducing lock contention might
have on the operating system scheduler discipline (which will
certainly differ from system to system and version to version), we're
never going to get anywhere, because there's almost always going to be
some way that reducing contention in one place can bite you someplace
else.

I also believe it's pretty normal for patches that remove lock
contention to increase variability.  If you run an auto race where
every car has a speed governor installed that limits it to 80 kph,
there will be much less variability in the finish times than if you
remove the governor, but that's a stupid way to run a race.  You won't
get much innovation around increasing the top speed of the cars under
those circumstances, either.  Nobody ever bothered optimizing the
contention around msgNumLock before fast-path locking happened,
because the heavyweight lock manager burdened the system so heavily
that you couldn't generate enough contention on it to matter.
Similarly, we're not going to get much traction around optimizing the
other locks to which contention would shift if we applied this patch
unless we apply it.  This is not theoretical: EnterpriseDB staff have
already done work on trying to optimize WALWriteLock, but it's hard to
get a benefit.  The more other contention we eliminate, the
easier it will be to see whether a proposed change to WALWriteLock
helps.  Of course, we'll also be more at the mercy of operating system
scheduler discipline, but that's not all a bad thing either.  The
Linux kernel guys have been known to run PostgreSQL to see whether
proposed changes help or hurt, but they're not going to try those
tests after applying patches that we rejected because they expose us
to existing Linux shortcomings.

I don't want to be perceived as advocating too forcefully for a patch
that was, after all, written by a colleague.  However, I sincerely
believe it's a mistake to say that a patch which reduces lock
contention must show a tangible win or at least no loss on every piece
of hardware, on every kernel, at every client count with no increase
in variability in any configuration.  Very few (if any) patches are
going to be able to meet that bar, and if we make that the bar, people
aren't going to write patches to reduce lock contention in PostgreSQL.
For that to be worth doing, you have to be able to get the patch
committed in finite time.  We've spent an entire release cycle
dithering over this patch.  Several alternative patches have been
written that are not any better (and the people who wrote those
patches don't seem especially interested in doing further work on them
anyway).  There is increasing evidence that the patch is effective at
solving the problem it claims to solve, and that any downsides are
just the result of poor lock-scaling behavior elsewhere which we could
be working on fixing if we weren't still spending time on this.  Is
that really not good enough?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] DROP FUNCTION of multiple functions

2016-11-01 Thread Fabrízio de Royes Mello
On Tue, Nov 1, 2016 at 2:55 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:
>
> Here is a patch series that implements several changes in the internal
> grammar and node representation of function signatures.  They are not
> necessarily meant to be applied separately, but they explain the
> progression of the changes nicely, so I left them like that for review.
>
> The end goal is to make some user-visible changes in DROP FUNCTION and
> possibly other commands that refer to functions.
>
> With these patches, it is now possible to use DROP FUNCTION to drop
> multiple functions at once: DROP FUNCTION func1(), func2(), func3().
> Other DROP commands already supported that, but DROP FUNCTION didn't
> because the internal representation was complicated and couldn't handle
it.
>
> The next step after this would be to allow referring to functions
> without having to supply the arguments, if the name is unique.  This is
> an SQL-standard feature and would be very useful for dealing "business
> logic" functions with 10+ arguments.  The details of that are to be
> worked out, but with the help of the present changes, this would be a
> quite localized change, because the grammar representation is well
> encapsulated.
>

Really nice... just one small question about 006: can't we reduce the code below?

@@ -823,8 +823,7 @@ get_object_address(ObjectType objtype, List *objname,
List *objargs,
 {
 FuncWithArgs *fwa = (FuncWithArgs *) linitial(objname);
 address.classId = ProcedureRelationId;
-address.objectId =
-LookupAggNameTypeNames(fwa->funcname,
fwa->funcargs, missing_ok);
+address.objectId = LookupAggWithArgs(fwa, missing_ok);
 address.objectSubId = 0;
 break;
 }
@@ -832,8 +831,7 @@ get_object_address(ObjectType objtype, List *objname,
List *objargs,
 {
 FuncWithArgs *fwa = (FuncWithArgs *) linitial(objname);
 address.classId = ProcedureRelationId;
-address.objectId =
-LookupFuncNameTypeNames(fwa->funcname,
fwa->funcargs, missing_ok);
+address.objectId = LookupFuncWithArgs(fwa, missing_ok);
 address.objectSubId = 0;
 break;
 }

Regards,


--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Corey Huinker
On Tue, Nov 1, 2016 at 2:24 PM, Robert Haas  wrote:

> Well, I'm not sure we've exactly reached consensus here, and you're
> making me feel like I just kicked a puppy.
>

It was hyperbole. I hope you found it as funny to read as I did to write.
This is a great feature and I'm not going to make "perfect" the enemy of
"good".


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 2:11 PM, Corey Huinker  wrote:
> On Tue, Nov 1, 2016 at 2:01 PM, Robert Haas  wrote:
>> Yeah.  That syntax has some big advantages, though.  If we define that
>> partition as START ('2014-01-01') INCLUSIVE END ('2014-12-31')
>> INCLUSIVE, there's no way for the system to tell that the there's no
>> gap between the that ending bound and the starting bound of the 2015
>> partition, because the system has no domain-specific knowledge that
>> there is no daylight between 2014-12-31 and 2015-01-01.  So if we
>> allow things to be specified that way, then people will use that
>> syntax and then complain when it doesn't perform quite as well as
>> START ('2014-01-01') END ('2015-01-01').  Maybe the difference isn't
>> material and maybe we don't care; what do you think?
>
> It was a fight I didn't expect to win, and even if we don't get
> [x,x]-expressible partitions, at least we're not in the Oracle
> context-waterfall, where the lower bound of your partition is determined by
> the upper bound of the NEXT partition.

I certainly agree that was a horrible design decision on Oracle's
part.  It's really messy.  If you drop a partition, it changes the
partition constraint for the adjacent partition.  Then you want to add
the partition back, say, but you have to first check whether the
adjacent partition, whose legal range has been expanded, has any
values that are out of bounds.  Which it can't, but you don't know
that, so you have to scan the whole partition.  Aargh!  Maybe this
somehow works out in their system - I'm not an Oracle expert - but I
think it's absolutely vital that we don't replicate it into
PostgreSQL.  (I have some, ahem, first-hand knowledge of how that
works out.)

>> (I really don't want to get tied up adding a system for adding and
>> subtracting one to and from arbitrary data types.  Life is too short.
> If that requires that users cope with a bit of cognitive dissonance,
>> well, it's not the first time something like that will have happened.
> I have some cognitive dissonance about the fact that creat(2) has no
>> trailing "e" but truncate(2) does, and moreover the latter can be used
>> to make a file longer rather than shorter.  But, hey, that's what you
>> get for choosing a career in computer science.)
>
> That noise you heard was the sound of my dream dying.

Well, I'm not sure we've exactly reached consensus here, and you're
making me feel like I just kicked a puppy.  However, my goal here is
not to kill your dream but to converge on a committable patch as
quickly as possible.  Adding increment/decrement operators to every
discrete(-ish) type we have is not the way to accomplish that.  To the
contrary, that's just about guaranteed to make this patch take an
extra release cycle to finish.  Now, that does not necessarily mean we
can't keep the INCLUSIVE/EXCLUSIVE stuff -- after all, Amit has
already written it -- but then we have to live with the fact that
+1/-1 based optimizations and matching are not going to be there.
Whether it's still worth having fully-open and fully-closed ranges on
general principle -- and whether the lack of such optimizations
changes the calculus -- is what we are now debating.

More votes welcome.  Right now I count 2 votes for keeping the
inclusive-exclusive stuff (Amit, Corey) and two for nuking it from
orbit (Robert, Francisco).  I'm not going to put my foot down on this
point against a contrary consensus, so please chime in.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Francisco Olarte
Robert:

On Tue, Nov 1, 2016 at 7:09 PM, Robert Haas  wrote:
> In defense of Corey's position, that's not so easy.  First, \0 doesn't
> work; our strings can't include null bytes.  Second, the minimum legal
> character depends on the collation in use.  It's not so easy to figure
> out what the "next" string is, even though there necessarily must be
> one.

I'm aware of that, I just wanted to point out that it can be done on strings.

> I think we're all in agreement that half-open intervals should not
> only be allowed, but the default.  The question is whether it's a good
> idea to also allow other possibilities.

In my experience, people continuously misuse them. I would especially
like to have them disallowed on timestamp columns (and other
real-like data, including numeric). But knowing a few things cannot be
done without them, and some others are easier with them, is enough for
allowing them as an explicit non-default for me.

Francisco Olarte.




Re: [HACKERS] DROP FUNCTION of multiple functions

2016-11-01 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Here is a patch series that implements several changes in the internal
> grammar and node representation of function signatures.  They are not
> necessarily meant to be applied separately, but they explain the
> progression of the changes nicely, so I left them like that for review.

I think patches 1-4 and 6 are pretty clearly straight cleanup and it
seems fine to me to push them ahead of the rest.  I didn't review super
carefully but it looks good on a quick glance.  I'd just advise to pay
special attention to the objectaddress.sql test and the UI involved
there (which you appear to have done already).

Patch 5 is obviously a new feature and I would put that one in a
separate commit from the first four plus six.  (I'm not clear why you
put 6 after 5 instead of before.)

I'm not at all sure about patch 7.  Maybe that one is okay, not sure,
but since it's bigger I didn't look at it.



-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Corey Huinker
>
> OTOH I've seen a lot of people bitten by [2014-01-01,2014-12-31] on
> TIMESTAMP intervals.
>

No argument there.


> Everybody remembers December has 31 days, but when we have to do
> MONTHLY partitions, if you use closed intervals someone always miskeys
> the number of days, or forgets whether a particular year is leap or
> not, and when doing it automatically I always have to code it as start
> + 1 month - 1 day. In my experience having the non-significant part of
> the dates (days in the monthly case, months too in yearly cases) both 1
> and equal in start and end makes it easier to check and identify, and
> less error prone.
>

Being able to define partitions by expressions based on the values I want
would be a good thing.


> You just do the classical ( I've had to do it ) closed end || minimum
> char ( "XYZ","XYZ\0" in this case ). It is not that difficult as
> strings have a global order, the next string to any one is always that
> plus the \0, or whatever your minimum is.
>

I've thought about doing that in the past, but wasn't sure it would
actually work correctly. If you have experience with it doing so, that
would be encouraging. How does that *look* when you print your partition
layout though?


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Francisco Olarte
On Tue, Nov 1, 2016 at 7:01 PM, Robert Haas  wrote:
> In the end, keywords are not the defining issue here; the issue is
> whether all of this complexity around inclusive and exclusive bounds
> carries its weight, and whether we want to be committed to that.
>
> Any other opinions out there?

If it were up to me I would opt for just half-open intervals. The only
problem I've ever had with them is when working with FINITE ranges,
i.e., there is no way of expressing the full range of an 8-bit integer with
half-open intervals of 8-bit integers, but I would happily pay that
cost for the benefit of not having people unintentionally make
non-contiguous date/timestamp intervals, which I periodically suffer from.

Francisco Olarte.




Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Corey Huinker
On Tue, Nov 1, 2016 at 2:01 PM, Robert Haas  wrote:

> Yeah.  That syntax has some big advantages, though.  If we define that
> partition as START ('2014-01-01') INCLUSIVE END ('2014-12-31')
> INCLUSIVE, there's no way for the system to tell that the there's no
> gap between the that ending bound and the starting bound of the 2015
> partition, because the system has no domain-specific knowledge that
> there is no daylight between 2014-12-31 and 2015-01-01.  So if we
> allow things to be specified that way, then people will use that
> syntax and then complain when it doesn't perform quite as well as
> START ('2014-01-01') END ('2015-01-01').  Maybe the difference isn't
> material and maybe we don't care; what do you think?
>

It was a fight I didn't expect to win, and even if we don't get
[x,x]-expressible partitions, at least we're not in the Oracle
context-waterfall, where the lower bound of your partition is determined by
the upper bound of the NEXT partition.

(I really don't want to get tied up adding a system for adding and
> subtracting one to and from arbitrary data types.  Life is too short.
> If that requires that users cope with a bit of cognitive dissonance,
> well, it's not the first time something like that will have happened.
> I have some cognitive dissonance about the fact that creat(2) has no
> trailing "e" but truncate(2) does, and moreover the latter can be used
> to make a file longer rather than shorter.  But, hey, that's what you
> get for choosing a career in computer science.)
>

That noise you heard was the sound of my dream dying.


Re: [HACKERS] Using a latch between a background worker process and a thread

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 12:35 PM, Abbas Butt  wrote:
> Hi,
> Consider this situation:
> 1. I have a background worker process.
> 2. The process creates a latch, initializes it using InitLatch & resets it.
> 3. It then creates a thread and passes the latch created in step 2 to it.
> To pass it, the process uses the last argument of pthread_create.
> 4. The thread blocks by calling WaitLatch.
> 5. The process after some time sets the latch using SetLatch.
>
> The thread does not notice that the latch has been set and keeps waiting.
>
> My question is:
> Are latches supposed to work between a process and a thread created by that
> process?

Nothing in the entire backend is guaranteed to work if you spawn
multiple threads within the same process.

Including this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 2:05 PM, Francisco Olarte  wrote:
>> /me raises hand.  We have tables with a taxonomy in them where the even data
>> splits don't fall on single letter boundaries, and often the single string
>> values have more rows than entire letters. In those situations, being able
>> to express ['XYZ','XYZ'] is important.  ['XYZ,'XZ') would let 'XYZ1' bleed
>> into the partition and ['XYZ','XYZ1') lets in other values, and so I go
>> chasing down the non-discrete set rabbit hole.
>
> You just do the classical ( I've had to do it ) closed end || minimum
> char ( "XYZ","XYZ\0" in this case ). It is not that difficult as
> strings have a global order, the next string to any one is always that
> plus the \0, or whatever your minimum is.

In defense of Corey's position, that's not so easy.  First, \0 doesn't
work; our strings can't include null bytes.  Second, the minimum legal
character depends on the collation in use.  It's not so easy to figure
out what the "next" string is, even though there necessarily must be
one.

> The problem is with anything similar to a real number, but then there
> I've always opted for half-open interval, as they can cover the line
> without overlapping, unlike closed ones.
>
> Anyway, as long as someone makes sure HALF-OPEN intervals are allowed,
> I'm fine (I do not remember the name, but I once had to work with a
> system that only allowed closed or open intervals and it was a real PITA).

I think we're all in agreement that half-open intervals should not
only be allowed, but the default.  The question is whether it's a good
idea to also allow other possibilities.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Francisco Olarte
On Tue, Nov 1, 2016 at 6:49 PM, Corey Huinker  wrote:
>
> On Tue, Nov 1, 2016 at 12:57 PM, Robert Haas  wrote:
>> For strings and numeric types that are not integers, there is in
>> theory a loss of power.  If you want a partition that allows every
>> value starting with 'a' plus the string 'b' but not anything after
>> that, you are out of luck.  START ('a') END ('b') INCLUSIVE would have
>> done exactly what you want, but now you need to store the first string
>> that you *don't* want to include in that partition, and what's that?
>> Dunno.  Or similarly if you want to store everything from 1.0 up to
>> and including 2.0 but nothing higher, you can't, really.
> Exactly. This is especially true for date ranges. There's a lot of cognitive
> dissonance in defining the "2014" partition as < '2015-01-01', as was the
> case in Oracle waterfall-style partitioning. That was my reasoning for
> pushing for range-ish syntax as well as form.

OTOH I've seen a lot of people bitten by [2014-01-01,2014-12-31] on
TIMESTAMP intervals.

Everybody remembers December has 31 days, but when we have to do
MONTHLY partitions with closed intervals someone always miskeys the
number of days, or forgets whether a particular year is a leap year,
and when doing it automatically I always have to code it as start
+ 1 month - 1 day. In my experience having the non-significant part of
the dates ( days in the monthly case, months too in the yearly case ) both 1
and equal in start and end makes it easier to check and identify, and
less error prone.
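A small runnable illustration of that failure mode, using ordinary comparisons
rather than any partitioning syntax (the sample timestamps are made up for the
example):

  -- Closed monthly bounds built as start + 1 month - 1 day silently drop
  -- almost all of the last day when the key is a timestamp; the half-open
  -- form does not.
  SELECT ts,
         (ts BETWEEN date '2014-02-01'
                 AND date '2014-02-01' + interval '1 month' - interval '1 day')
             AS closed_bounds,
         (ts >= date '2014-02-01'
          AND ts <  date '2014-02-01' + interval '1 month')
             AS half_open
  FROM (VALUES (timestamp '2014-02-28 00:00'),
               (timestamp '2014-02-28 17:30'),
               (timestamp '2014-03-01 00:00')) AS t(ts);
  -- 2014-02-28 17:30 is inside the month but outside the closed bounds.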

>> But who wants that?  People who are doing prefix-based partitioning of
>> their text keys are going to want all of the 'a' things together, and
>> all of the 'b' things in another category.  Same for ranges of
>> floating-point numbers, which are also probably an unlikely candidate
>> for a partitioning key anyway.
> /me raises hand.  We have tables with a taxonomy in them where the even data
> splits don't fall on single letter boundaries, and often the single string
> values have more rows than entire letters. In those situations, being able
> to express ['XYZ','XYZ'] is important.  ['XYZ','XZ') would let 'XYZ1' bleed
> into the partition and ['XYZ','XYZ1') lets in other values, and so I go
> chasing down the non-discrete set rabbit hole.

You just do the classical ( I've had to do it ) closed end || minimum
char ( "XYZ","XYZ\0" in this case ). It is not that difficult as
strings have a global order, the next string to any one is always that
plus the \0, or whatever your minimum is.
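Sketched with ordinary comparisons (not the proposed partition syntax), and
using chr(1) rather than \0 because PostgreSQL text cannot contain a NUL and
the true minimum character is collation-dependent, as Robert points out in his
reply:

  -- The "closed end || minimum char" trick: the half-open range
  -- [ 'XYZ', 'XYZ' || chr(1) ) admits exactly the key 'XYZ' in most collations.
  SELECT s,
         (s >= 'XYZ' AND s < ('XYZ' || chr(1))) AS in_partition
  FROM (VALUES ('XY'), ('XYZ'), ('XYZ1'), ('XZ')) AS t(s);
  -- only 'XYZ' satisfies the predicate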

The problem is with anything similar to a real number, but then there
I've always opted for half-open interval, as they can cover the line
without overlapping, unlike closed ones.

Anyway, as long as anyone makes sure HALF-OPEN intervals are allowed,
I'm fine ( I do not remember the name, but once had to work with a
system that only allowed closed or open and it was a real PITA.)

Francisco Olarte.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 1:49 PM, Corey Huinker  wrote:
> Exactly. This is especially true for date ranges. There's a lot of cognitive
> dissonance in defining the "2014" partition as < '2015-01-01', as was the
> case in Oracle waterfall-style partitioning. That was my reasoning for
> pushing for range-ish syntax as well as form.

Yeah.  That syntax has some big advantages, though.  If we define that
partition as START ('2014-01-01') INCLUSIVE END ('2014-12-31')
INCLUSIVE, there's no way for the system to tell that the there's no
gap between the that ending bound and the starting bound of the 2015
partition, because the system has no domain-specific knowledge that
there is no daylight between 2014-12-31 and 2015-01-01.  So if we
allow things to be specified that way, then people will use that
syntax and then complain when it doesn't perform quite as well as
START ('2014-01-01') END ('2015-01-01').  Maybe the difference isn't
material and maybe we don't care; what do you think?

(I really don't want to get tied up adding a system for adding and
subtracting one to and from arbitrary data types.  Life is too short.
If that requires that users cope with a bit of cognitive dissonance,
well, it's not the first time something like that will have happened.
I have some cognitive dissonance about the fact that creat(2) has no
trailing "e" but truncate(2) does, and moreover the latter can be used
to make a file longer rather than shorter.  But, hey, that's what you
get for choosing a career in computer science.)

>> But who wants that?  People who are doing prefix-based partitioning of
>> their text keys are going to want all of the 'a' things together, and
>> all of the 'b' things in another category.  Same for ranges of
>> floating-point numbers, which are also probably an unlikely candidate
>> for a partitioning key anyway.
>
> /me raises hand.  We have tables with a taxonomy in them where the even data
> splits don't fall on single letter boundaries, and often the single string
> values have more rows than entire letters. In those situations, being able
> to express ['XYZ','XYZ'] is important.  ['XYZ','XZ') would let 'XYZ1' bleed
> into the partition and ['XYZ','XYZ1') lets in other values, and so I go
> chasing down the non-discrete set rabbit hole.

Hmm.  I have to admit that I hadn't considered the case where you have
a range partitioning scheme but one of the ranges includes only a
single string.  If that's an important use case, that might be a fatal
problem with my proposal.  :-(

> If we're worried about keywords, maybe a BOUNDED '[]' clause?

In the end, keywords are not the defining issue here; the issue is
whether all of this complexity around inclusive and exclusive bounds
carries its weight, and whether we want to be committed to that.

Any other opinions out there?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
 wrote:
>  [ new patches ]

Reviewing 0005:

Your proposed commit message says this:

> If relation is the target table (UPDATE and DELETE), flattening is
> done regardless (scared to modify inheritance_planner() yet).

In the immortal words of Frank Herbert: “I must not fear. Fear is the
mind-killer. Fear is the little-death that brings total obliteration.
I will face my fear. I will permit it to pass over me and through me.
And when it has gone past I will turn the inner eye to see its path.
Where the fear has gone there will be nothing. Only I will remain.”

In other words, I'm not going to accept fear as a justification for
randomly excluding the target-table case from this code.  If there's
an actual reason not to do this in that case or some other case, then
let's document that reason.  But weird warts in the code that are
justified only by nameless anxiety are not good.

Of course, the prior question is whether we should EVER be doing this.
I realize that something like this MAY be needed for partition-wise
join, but the mission of this patch is not to implement partition-wise
join.  Does anything in this patch series really require this?  If so,
what?  If not, how about we leave it out and refactor it when that
change is really needed for something?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Corey Huinker
On Tue, Nov 1, 2016 at 12:57 PM, Robert Haas  wrote:

> For strings and numeric types that are not integers, there is in
> theory a loss of power.  If you want a partition that allows every
> value starting with 'a' plus the string 'b' but not anything after
> that, you are out of luck.  START ('a') END ('b') INCLUSIVE would have
> done exactly what you want, but now you need to store the first string
> that you *don't* want to include in that partition, and what's that?
> Dunno.  Or similarly if you want to store everything from 1.0 up to
> and including 2.0 but nothing higher, you can't, really.
>
>
Exactly. This is especially true for date ranges. There's a lot of
cognitive dissonance in defining the "2014" partition as < '2015-01-01', as
was the case in Oracle waterfall-style partitioning. That was my reasoning
for pushing for range-ish syntax as well as form.


> But who wants that?  People who are doing prefix-based partitioning of
> their text keys are going to want all of the 'a' things together, and
> all of the 'b' things in another category.  Same for ranges of
> floating-point numbers, which are also probably an unlikely candidate
> for a partitioning key anyway.
>

/me raises hand.  We have tables with a taxonomy in them where the even
data splits don't fall on single letter boundaries, and often the single
string values have more rows than entire letters. In those situations,
being able to express ['XYZ','XYZ'] is important.  ['XYZ','XZ') would let
'XYZ1' bleed into the partition and ['XYZ','XYZ1') lets in other values,
and so I go chasing down the non-discrete set rabbit hole.

If we're worried about keywords, maybe a BOUNDED '[]' clause?


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
 wrote:
>> Insisting that you can't drop a child without detaching it first seems
>> wrong to me.  If I already made this comment and you responded to it,
>> please point me back to whatever you said.  However, my feeling is
>> this is flat wrong and absolutely must be changed.
>
> I said the following [1]:
>
> | Hmm, I don't think I like this.  Why should it be necessary to detach
> | a partition before dropping it?  That seems like an unnecessary step.
>
> I thought we had better lock the parent table when removing one of its
> partitions and it seemed a bit odd to lock the parent table when dropping
> a partition using DROP TABLE?  OTOH, with ALTER TABLE parent DETACH
> PARTITION, the parent table is locked anyway.

That "OTOH" part seems like a pretty relevant point.

Basically, I think people expect to be able to say "DROP THINGTYPE
thingname" or at most "DROP THINGTYPE thingname CASCADE" and have that
thing go away.  I'm opposed to anything which requires some other
series of steps without a very good reason, and possible surprise
about the precise locks that the command requires isn't a good enough
reason from my point of view.

>> I wonder if it's really a good idea for the partition constraints to
>> be implicit; what is the benefit of leaving those uncatalogued?
>
> I did start out that way - ie, catalogued implicit constraints, but later
> thought it might not be good to end up with multiple copies of essentially
> the same information.  With cataloguing them will come dependencies and
> all places that know about pg_constraint.
>
> In the long term, I think we're only going to need them because we want to
> enforce them when directly inserting data into partitions.

See my other email on this topic.  I agree there are some complexities
here, including making sure that pg_dump does the right thing.  But I
think it's the right way to go.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Specifying the log file name of pgbench -l option

2016-11-01 Thread Masahiko Sawada
On Wed, Nov 2, 2016 at 1:41 AM, Fabien COELHO  wrote:
>
>> The log file generated by the pgbench -l option has a fixed file name,
>> 'pgbench_log..', and it's a little complicated for a
>> script that runs pgbench repeatedly to identify the log file name.
>> The attached patch makes it possible to specify the log file name. I think
>> it's useful for users who want to run pgbench repeatedly in a script
>> and collect and analyze the results.
>>
>> The one thing I'm concerned about is that this patch changes the -l option
>> so that it requires an argument.
>> But changing its behavior would be better than adding a new option.
>>
>> Please give me feedback.
>
>
> Patch applies but does not compile, because "logfilename" is not declared.
> I guess "logfile" was meant instead.
>
> I understand and agree that in some cases having only a predefined file
> prefix in the current directory as the only option can be a hindrance for
> scripts which use pgbench and rely on the log.
>
> I'm not at ease either with changing the behavior of such an option, as some
> people may be happy with it and some script may be using it. I would suggest
> not to do so.
>
> Moreover, what is provided is not a file name, but a prefix used to build
> file names.
>
> So I would suggest to:
>  - fix the compilation issue
>  - leave -l/--log as it is, i.e. use "pgbench_log" as a prefix
>  - add --log-prefix=... (long option only) for changing this prefix
>

Thank you for reviewing this patch!

I agree. It's better to add a separate option to specify the prefix
of the log file instead of changing the existing behaviour. The attached
latest patch incorporates the review comments.
Please review it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 285608d..100cdb0 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -614,6 +614,16 @@ pgbench  options  dbname
   
  
 
+ 
+  --log-prefix=prefix
+  
+   
+Prefix of transaction log file. If this option is not given,
+pgbench_log is used.
+   
+  
+ 
+
 

 
@@ -1121,15 +1131,17 @@ END;
With the -l option but without the --aggregate-interval,
pgbench writes the time taken by each transaction
to a log file.  The log file will be named
-   pgbench_log.nnn, where
+   pgbench_log.nnn,
+   where pgbench_log is the prefix of log file that can
+   be changed by specifying --log-prefix, and where
nnn is the PID of the pgbench process.
If the -j option is 2 or higher, creating multiple worker
threads, each will have its own log file. The first worker will use the
same name for its log file as in the standard single worker case.
The additional log files for the other workers will be named
-   pgbench_log.nnn.mmm,
-   where mmm is a sequential number for each worker starting
-   with 1.
+   pgbench_log.nnn.mmm,
+   where mmm is a sequential number
+   for each worker starting with 1.
   
 
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d44cfda..32ffdff 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -180,6 +180,7 @@ char	   *pghost = "";
 char	   *pgport = "";
 char	   *login = NULL;
 char	   *dbName;
+char	   *logfile_prefix;
 const char *progname;
 
 #define WSEP '@'/* weight separator */
@@ -511,6 +512,7 @@ usage(void)
 		   "  --aggregate-interval=NUM aggregate data over NUM seconds\n"
 		"  --progress-timestamp use Unix epoch timestamps for progress\n"
 		   "  --sampling-rate=NUM  fraction of transactions to log (e.g., 0.01 for 1%%)\n"
+		   "  --log-prefix=PREFIX  prefix of transaction time log file (default: pgbench_log)\n"
 		   "\nCommon options:\n"
 		   "  -d, --debug  print debugging output\n"
 	  "  -h, --host=HOSTNAME  database server host or socket directory\n"
@@ -3643,6 +3645,7 @@ main(int argc, char **argv)
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
 		{"progress-timestamp", no_argument, NULL, 6},
+		{"log-prefix", required_argument, NULL, 7},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -3990,6 +3993,9 @@ main(int argc, char **argv)
 progress_timestamp = true;
 benchmarking_option_set = true;
 break;
+			case 7:
+logfile_prefix = pg_strdup(optarg);
+break;
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit(1);
@@ -4087,6 +4093,12 @@ main(int argc, char **argv)
 		exit(1);
 	}
 
+	if (!use_log && logfile_prefix)
+	{
+		fprintf(stderr, "log file prefix (--log-prefix) is allowed only when actually logging transactions\n");
+		exit(1);
+	}
+
 	if (duration > 0 && agg_interval > duration)
 	{
 		fprintf(stderr, "number of seconds for aggregation (%d) must not be higher than test duration (%d)\n", agg_interval, duration);
@@ -4388,11 +4400,13 @@ 

Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
 wrote:
> [ new patches ]

Reviewing 0006:

This patch seems scary.  I sort of assumed from the title -- "Teach a
few places to use partition check quals." -- that this was an optional
thing, some kind of optimization from which we could reap further
advantage once the basic infrastructure was in place.  But it's not
that at all.  It's absolutely necessary that we do this, or data
integrity is fundamentally compromised.  How do we know that we've
found all of the places that need to be taught about these new,
uncatalogued constraints?

I'm feeling fairly strongly like you should rewind and make the
partitioning constraints normal catalogued constraints.  That's got a
number of advantages, most notably that we can be sure they will be
properly enforced by the entire system (modulo existing bugs, of
course).  Also, they'll show up automatically in tools like psql's \d
output, pgAdmin, and anything else that is accustomed to being able to
find constraints in the catalog.  We do need to make sure that those
constraints can't be dropped (or altered?) inappropriately, but that's
a relatively small problem.  If we stick with the design you've got
here, every client tool in the world needs to be updated, and I'm not
seeing nearly enough advantage in this system to justify that kind of
upheaval.

In fact, as far as I can see, the only advantage of this approach is
that when the insert arrives through the parent and is routed to the
child by whatever tuple-routing code we end up with (I guess that's
what 0008 does), we get to skip checking the constraint, saving CPU
cycles.  That's probably an important optimization, but I don't think
that putting the partitioning constraint in the catalog in any way
rules out the possibility of performing that optimization.  It's just
that instead of having the partitioning excluded-by-default and then
sometimes choosing to include it, you'll have it included-by-default
and then sometimes choose to exclude it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-01 Thread Robert Haas
On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
 wrote:
>> 4. I'm somewhat wondering if we ought to just legislate that the lower
>> bound is always inclusive and the upper bound is always exclusive.
>> The decision to support both inclusive and exclusive partition bounds
>> is responsible for an enormous truckload of complexity in this patch,
>> and I have a feeling it is also going to be a not-infrequent cause of
>> user error.
>
> I thought we decided at some point to go with range type like notation to
> specify range partition bound because of its flexibility.  I agree though
> that with that flexibility, there will be more input combinations that will
> cause errors.  As for the internal complexity, it's not clear to me whether
> it will be reduced by always-inclusive lower and always-exclusive upper
> bounds.  We would still need to store the inclusive flag with individual
> PartitionRangeBound and consider it when comparing them with each other
> and with partition key of tuples.

We did.  But I now think that was kind of silly.  I mean, we also
talked about not having a hard distinction between list partitioning
and range partitioning, but you didn't implement it that way and I
think that's probably a good thing.  The question is - what's the
benefit of allowing this to be configurable?

For integers, there's absolutely no difference in expressive power.
If you want to allow 1000-2000 with both bounds inclusive, you can
just say START (1000) END (2001) instead of START (1000) END (2000)
INCLUSIVE.  This is also true for any other datatype where it makes
sense to talk about "the next value" and "the previous value".
Instead of making the upper bound inclusive, you can just end at the
next value instead.  If you were tempted to make the lower bound
exclusive, same thing.
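For instance, a quick sanity check in plain SQL (made-up bounds, just to
illustrate the equivalence for integers):

  -- START (1000) END (2000) INCLUSIVE versus START (1000) END (2001):
  -- the two predicates never disagree over the integer keys of interest.
  SELECT count(*) AS disagreements
  FROM generate_series(0, 3000) AS k
  WHERE (k >= 1000 AND k <= 2000) IS DISTINCT FROM (k >= 1000 AND k < 2001);
  -- disagreements = 0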

For strings and numeric types that are not integers, there is in
theory a loss of power.  If you want a partition that allows every
value starting with 'a' plus the string 'b' but not anything after
that, you are out of luck.  START ('a') END ('b') INCLUSIVE would have
done exactly what you want, but now you need to store the first string
that you *don't* want to include in that partition, and what's that?
Dunno.  Or similarly if you want to store everything from 1.0 up to
and including 2.0 but nothing higher, you can't, really.

But who wants that?  People who are doing prefix-based partitioning of
their text keys are going to want all of the 'a' things together, and
all of the 'b' things in another category.  Same for ranges of
floating-point numbers, which are also probably an unlikely candidate
for a partitioning key anyway.

So let's look at the other side.  What do we gain by excluding this
functionality?  Well, we save a parser keyword: INCLUSIVE no longer
needs to be a keyword.  We also can save some code in
make_one_range_bound(), RelationBuildPartitionDesc(),
copy_range_bound(), partition_rbound_cmp(), and partition_rbound_eq().

Also, if we exclude this now as I'm proposing, we can always add it
back later if it turns out that people need it.  On the other hand, if
we include it in the first version, it's going to be very hard to get
rid of it if it turns out we don't want it.  Once we release support
for a syntax, we're kinda stuck with it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for UUIDs

2016-11-01 Thread Adam Brusselback
So I apologize in advance if I didn't follow the process exactly; I was
going to attempt to review this to move it along, but ran into issues
applying the patch cleanly to master.  I fixed the issues I was having
applying it, and created a new patch (attached).

Managed to test it out after I got it applied, and it is working as
expected for exclusion constraints, as well as normal indexes.

This test was performed on Windows 10 (64 bit), and Postgres was compiled
using MSYS2.


btree_gist_uuid_7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Specifying the log file name of pgbench -l option

2016-11-01 Thread Fabien COELHO



The log file generated by the pgbench -l option has a fixed file name,
'pgbench_log..', and it's a little complicated for a
script that runs pgbench repeatedly to identify the log file name.
The attached patch makes it possible to specify the log file name. I think
it's useful for users who want to run pgbench repeatedly in a script
and collect and analyze the results.

The one thing I'm concerned about is that this patch changes the -l option
so that it requires an argument.
But changing its behavior would be better than adding a new option.

Please give me feedback.


Patch applies but does not compile, because "logfilename" is not declared.
I guess "logfile" was meant instead.

I understand and agree that in some cases having only a predefined file
prefix in the current directory as the only option can be a hindrance for 
scripts which use pgbench and rely on the log.


I'm not at ease either with changing the behavior of such an option, as 
some people may be happy with it and some script may be using it. I would 
suggest not to do so.


Moreover, what is provided is not a file name, but a prefix used to build 
file names.


So I would suggest to:
 - fix the compilation issue
 - leave -l/--log as it is, i.e. use "pgbench_log" as a prefix
 - add --log-prefix=... (long option only) for changing this prefix

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Using a latch between a background worker process and a thread

2016-11-01 Thread Abbas Butt
Hi,
Consider this situation:
1. I have a background worker process.
2. The process creates a latch, initializes it using InitLatch & resets it.
3. It then creates a thread and passes the latch created in step 2 to it.
To pass it, the process uses the last argument of pthread_create.
4. The thread blocks by calling WaitLatch.
5. The process after some time sets the latch using SetLatch.

The thread does not notice that the latch has been set and keeps waiting.

My question is:
Are latches supposed to work between a process and a thread created by that
process?

Thanks.

-- 
Abbas
Architect

Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com

Follow us on Twitter
@EnterpriseDB

Visit EnterpriseDB for tutorials, webinars, whitepapers and more



Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 11:21 AM, Mithun Cy  wrote:
> On Tue, Nov 1, 2016 at 7:44 PM, Robert Haas  wrote:
>>> Starting program: /home/mithun/libpqbin/bin/./psql
>>> postgres://%2home%2mithun:/postgres -U mithun1
>>Can you provide a concrete test scenario or some test code that fails?
>>connhost is supposed to be getting set in connectOptions2(), which is
>>run (AIUI) when the PGconn is first created.  So I think that it will
>>be set in all scenarios of the type you mention, but I might be
>>missing something.
>
> Sorry if my sentence is confusing.
> If I give a proper hexadecimal encoding, % followed by 2 hex digits (e.g.
> %2f, %2a), everything is fine. When I pass an invalid hex encoding,
> e.g. %2h, %2m, in the host string, e.g.
> "postgres://%2home%2mithun:/postgres", then "PQconnectdbParams()" fails
> before calling connectOptions2(). In that case the failed PQconnectdbParams()
> also returns a PGconn where connhost is not set. If we call PQpass() or
> PQReset() on such a PGconn we get a crash.
>
> A simple test case which crash is:
> ./psql  'postgres://%2hxxx:/postgres'
>
> Call stack:
> --
> #0  0x77bb8d4f in PQpass (conn=0x68aa10) at fe-connect.c:5582
> #1  0x77bb907a in PQconnectionNeedsPassword (conn=0x68aa10) at
> fe-connect.c:5727
> #2  0x004130aa in main (argc=2, argv=0x7fffdff8) at
> startup.c:250

Ah, nuts.  Thanks, good catch.  Should be fixed in the attached version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


multihost-v3.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] save redundant code for pseudotype I/O functions

2016-11-01 Thread Peter Eisentraut
On 10/28/16 9:25 AM, Tom Lane wrote:
> Would it be better to use CppAsString and CppConcat instead of directly
> using # and ##, for consistency with what we do elsewhere?

I modeled this after several similar places in gin and tsearch code,
which use ##:

contrib/btree_gin/btree_gin.c
src/backend/tsearch/wparser_def.c
src/backend/utils/adt/tsvector_op.c

Do people prefer the macros over this?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-01 Thread Mithun Cy
On Tue, Nov 1, 2016 at 7:44 PM, Robert Haas  wrote:
>> Starting program: /home/mithun/libpqbin/bin/./psql
>> postgres://%2home%2mithun:/postgres -U mithun1
>Can you provide a concrete test scenario or some test code that fails?
>connhost is supposed to be getting set in connectOptions2(), which is
>run (AIUI) when the PGconn is first created.  So I think that it will
>be set in all scenarios of the type you mention, but I might be
>missing something.

Sorry if my sentence is confusing.
If I give a proper hexadecimal encoding, % followed by 2 hex digits (e.g.
%2f, %2a), everything is fine. When I pass an invalid hex encoding,
e.g. %2h, %2m, in the host string, e.g.
"postgres://%2home%2mithun:/postgres",
then "PQconnectdbParams()" fails before calling connectOptions2(). In that
case the failed PQconnectdbParams() also returns a PGconn where connhost is not
set. If we call PQpass() or PQReset() on such a PGconn we get a crash.

A simple test case which crash is:
./psql  'postgres://%2hxxx:/postgres'

Call stack:
--
#0  0x77bb8d4f in PQpass (conn=0x68aa10) at fe-connect.c:5582
#1  0x77bb907a in PQconnectionNeedsPassword (conn=0x68aa10) at
fe-connect.c:5727
#2  0x004130aa in main (argc=2, argv=0x7fffdff8) at
startup.c:250

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] emergency outage requiring database restart

2016-11-01 Thread Merlin Moncure
On Tue, Nov 1, 2016 at 8:56 AM, Tom Lane  wrote:
> Merlin Moncure  writes:
>> On Mon, Oct 31, 2016 at 10:32 AM, Oskari Saarenmaa  wrote:
>>> Your production system's postgres backends probably have a lot more open
>>> files associated with them than the simple test case does.  Since Postgres
>>> likes to keep files open as long as possible and only closes them when you
>>> need to free up fds to open new files, it's possible that your production
>>> backends have almost all allowed fds used when you execute your pl/sh
>>> function.
>>>
>>> If that's the case, the sqsh process that's executed may not have enough fds
>>> to do what it wanted to do and because of busted error handling could end up
>>> writing to fds that were opened by Postgres and point to $PGDATA files.
>
>> Does that apply?  the mechanics are a sqsh function that basically does:
>> cat foo.sql  | sqsh 
>> pipe redirection opens a new process, right?
>
> Yeah, but I doubt that either level of the shell would attempt to close
> inherited file handles.
>
> The real problem with Oskari's theory is that it requires not merely
> busted, but positively brain-dead error handling in the shell and/or
> sqsh, ie ignoring open() failures altogether.  That seems kind of
> unlikely.  Still, I suspect he might be onto something --- there must
> be some reason you can reproduce the issue in production and not in
> your test bed, and number-of-open-files is as good a theory as I've
> heard.
>
> Maybe the issue is not with open() failures, but with the resulting
> FD numbers being much larger than sqsh is expecting.  It would be
> weird to try to store an FD in something narrower than int, but
> I could see a use of select() being unprepared for large FDs.
> Still, it's hard to translate that idea into scribbling on the
> wrong file...

Looking at the sqsh code, nothing really stands out.  It's highly
developed and all obvious errors are checked.  There certainly could
be a freak bug in there (or in libfreetds which sqsh links to) doing
the damage though.  In the meantime I'll continue to try and work a
reliable reproduction. This particular routine only gets called in
batches on a quarterly basis so things have settled down.

Just a thought: could COPY be tricked into writing into the wrong file
descriptor?  For example, if a file was killed with an rm -rf and the
fd-pressured backend reopened the fd immediately?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain vs. parallel query

2016-11-01 Thread Tomas Vondra

On 11/01/2016 03:29 PM, Robert Haas wrote:

On Tue, Nov 1, 2016 at 10:21 AM, Tomas Vondra
 wrote:

Clearly we need to pass some information to the worker processes, so that
they know whether to instrument the query or not. I don't know if there's a
good non-invasive way to do that from an extension - the easiest way I can
think of is using a bit of shared memory to pass the "sample query" flag -
attached is a patch that does that, and it seems to be working fine for me.


Uh, isn't this going to break as soon as there are multiple parallel
queries in process at the same time?



Damn! You're right of course. Who'd guess I need more coffee this early?

Attached is a fix replacing the flag with an array of flags, indexed by 
ParallelMasterBackendId. Hopefully that makes it work with multiple 
concurrent parallel queries ... still, I'm not sure this is the right 
solution.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 4ccd2aa..192a3a1 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -14,8 +14,13 @@
 
 #include 
 
+#include "access/parallel.h"
 #include "commands/explain.h"
 #include "executor/instrument.h"
+#include "miscadmin.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
 #include "utils/guc.h"
 
 PG_MODULE_MAGIC;
@@ -47,10 +52,13 @@ static ExecutorStart_hook_type prev_ExecutorStart = NULL;
 static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
 
 /* Is the current query sampled, per backend */
 static bool current_query_sampled = true;
 
+static int parent;
+
 #define auto_explain_enabled() \
 	(auto_explain_log_min_duration >= 0 && \
 	 (nesting_level == 0 || auto_explain_log_nested_statements))
@@ -65,6 +73,18 @@ static void explain_ExecutorRun(QueryDesc *queryDesc,
 static void explain_ExecutorFinish(QueryDesc *queryDesc);
 static void explain_ExecutorEnd(QueryDesc *queryDesc);
 
+typedef struct AutoExplainState
+{
+	LWLock	   *lock;			/* protects shared state */
+	bool		sampled[FLEXIBLE_ARRAY_MEMBER];	/* should we sample this query? */
+} AutoExplainState;
+
+static AutoExplainState *explainState;
+
+static void explain_shmem_startup(void);
+static bool is_query_sampled(void);
+static void set_query_sampled(bool sample);
+static Size explain_shmem_size(void);
 
 /*
  * Module load callback
@@ -72,6 +92,9 @@ static void explain_ExecutorEnd(QueryDesc *queryDesc);
 void
 _PG_init(void)
 {
+	if (!process_shared_preload_libraries_in_progress)
+		elog(ERROR, "auto_explain can only be loaded via shared_preload_libraries");
+
 	/* Define custom GUC variables. */
 	DefineCustomIntVariable("auto_explain.log_min_duration",
 		 "Sets the minimum execution time above which plans will be logged.",
@@ -176,6 +199,8 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	RequestAddinShmemSpace(explain_shmem_size());
+	RequestNamedLWLockTranche("auto_explain", 1);
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -187,6 +212,15 @@ _PG_init(void)
 	ExecutorFinish_hook = explain_ExecutorFinish;
 	prev_ExecutorEnd = ExecutorEnd_hook;
 	ExecutorEnd_hook = explain_ExecutorEnd;
+
+	prev_shmem_startup_hook = shmem_startup_hook;
+	shmem_startup_hook = explain_shmem_startup;
+}
+
+static Size
+explain_shmem_size(void)
+{
+	return offsetof(AutoExplainState, sampled) + MaxConnections * sizeof(bool);
 }
 
 /*
@@ -200,6 +234,7 @@ _PG_fini(void)
 	ExecutorRun_hook = prev_ExecutorRun;
 	ExecutorFinish_hook = prev_ExecutorFinish;
 	ExecutorEnd_hook = prev_ExecutorEnd;
+	shmem_startup_hook = prev_shmem_startup_hook;
 }
 
 /*
@@ -208,15 +243,20 @@ _PG_fini(void)
 static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
+	parent = MyBackendId;
+
 	/*
 	 * For rate sampling, randomly choose top-level statement. Either all
 	 * nested statements will be explained or none will.
 	 */
-	if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
+	if (auto_explain_log_min_duration >= 0 && nesting_level == 0 && (! IsParallelWorker()))
+	{
 		current_query_sampled = (random() < auto_explain_sample_rate *
  MAX_RANDOM_VALUE);
+		set_query_sampled(current_query_sampled);
+	}
 
-	if (auto_explain_enabled() && current_query_sampled)
+	if (auto_explain_enabled() && is_query_sampled())
 	{
 		/* Enable per-node instrumentation iff log_analyze is required. */
 		if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
@@ -235,7 +275,7 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	else
 		standard_ExecutorStart(queryDesc, eflags);
 
-	if (auto_explain_enabled() && current_query_sampled)
+	if 

Re: [HACKERS] [PATCH] pgpassfile connection option

2016-11-01 Thread Julian Markwort

On 01.11.2016 at 15:14, Fabien COELHO wrote:


Thus, when returning with an error, if conn->pgpassfile was set and a 
password was necessary, we must have tried that pgpassfile, so i got 
rid of the field "dot_pgpass_used"


No, you should not have done that, because it changes a feature which 
was to warn *only* when the password was coming from file.


The warning is wrong, the password was typed directly, not retrieved 
from a file. The "dot_pgpass_used" boolean is still required to avoid 
that. 


I see... I was so focused on looking for things to declutter that I
missed that case. I'll address that in the next revision.


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-11-01 Thread Robert Haas
On Mon, Oct 31, 2016 at 7:44 PM, Abhijit Menon-Sen  wrote:
> At 2016-09-28 13:13:56 -0400, robertmh...@gmail.com wrote:
>> I hope that the fact that there's been no discussion for the last
>> three weeks doesn't mean this effort is dead; I would like very
>> much to see it move forward.
>
> Here's an updated patch. Sorry, I got busy elswhere.
>
> I struggled with the handling of recovery_target a little. For example,
> one suggested alternative was:
>
> recovery_target_type = xid
> recovery_target_value = …
>
> The problem with implementing it this way is that the _value setting
> cannot be parsed without already having parsed the _type, and I didn't
> want to force that sort of dependency.
>
> What I've done instead is to make this work:
>
> recovery_target = xid|time|name|lsn|immediate
> recovery_target_xid = …
> recovery_target_time = …
> recovery_target_name = …
> recovery_target_lsn = …
>
> The recovery_target_xxx values are parsed as they used to be, but the
> one that's used is the one that's set in recovery_target. That's easy to
> explain, and the patch is much less intrusive, but I'm certainly open to
> suggestions to improve this, and I have the time to work on this patch
> with a view towards getting it committed in this cycle.

I liked Heikki's suggestion (at some point quite a while ago now) of
recovery_target = 'xid 123' or recovery_target='lsn 0/723' or
whatever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain vs. parallel query

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 10:21 AM, Tomas Vondra
 wrote:
> Clearly we need to pass some information to the worker processes, so that
> they know whether to instrument the query or not. I don't know if there's a
> good non-invasive way to do that from an extension - the easiest way I can
> think of is using a bit of shared memory to pass the "sample query" flag -
> attached is a patch that does that, and it seems to be working fine for me.

Uh, isn't this going to break as soon as there are multiple parallel
queries in process at the same time?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain vs. parallel query

2016-11-01 Thread Tomas Vondra

On 11/01/2016 02:15 PM, Robert Haas wrote:

On Mon, Oct 31, 2016 at 6:35 PM, Tomas Vondra
 wrote:

While debugging something on 9.6, I've noticed that auto_explain handles
parallel queries in a slightly strange way - both the leader and all the
workers log their chunk of the query (i.e. the leader logs explain for the
whole query, while workers only log the parts they've been executing).

So for example for a query with 2 workers, the log look like this:

   2016-10-31 23:10:23.481 CET [12278] LOG:  duration: 32.562 ms  pl ...
   Query Text:   ...
   Parallel Seq Scan on tables  (cost=0.00..15158.25 rows=220 wi ...
 Filter: ((table_datoid < '10'::oid) AND (table_nspo ...
 Rows Removed by Filter: 140614  ...
   2016-10-31 23:10:23.481 CET [12277] LOG:  duration: 32.325 ms  pl ...
   Query Text:   ...
   Parallel Seq Scan on tables  (cost=0.00..15158.25 rows=220 wi ...
 Filter: ((table_datoid < '10'::oid) AND (table_nspo ...
 Rows Removed by Filter: 80948   ...
   2016-10-31 23:10:23.483 CET [12259] LOG:  duration: 38.997 ms  pl ...
   Query Text: explain analyze select * from tables where table_ ...
   Gather  (cost=1000.00..16211.15 rows=529 width=356) (actual t ...
 Workers Planned: 2  ...
 Workers Launched: 2 ...
 ->  Parallel Seq Scan on tables  (cost=0.00..15158.25 rows= ...
   Filter: ((table_datoid < '10'::oid) AND (tabl ...
   Rows Removed by Filter: 105570...

I'd say that's fairly surprising, and I haven't found any discussion about
auto_explain vs. parallel queries in the archives (within the last year), so
I assume this may not be entirely intentional.


Yeah, that's not entirely intentional.  :-(


Not only this does not match the output when running EXPLAIN ANALYZE
manually, it also provides no information about which messages from workers
belong to which "leader" message.

Another thing is that this interacts with sampling in a rather unfortunate
way, because each process evaluates the sampling condition independently. So
for example with auto_explain.sample_rate=0.5 a random subset of
worker/leader explain plans will be logged.

But this also applies to the conditions in ExecutorStart, which enables
instrumentation only when the query gets sampled - so when the leader gets
sampled but not all the workers, the counters in the query logged by the
leader are incomplete.

I do believe those are bugs that make auto_explain rather unusable with
parallel queries. Adding IsParallelWorker() to the condition in ExecutorEnd
should fix the extra logging messages (and log only from the leader).

Not sure how to fix the sampling - simply adding IsParallelWorker() to
ExecutorStart won't do the trick, because we don't want to enable
instrumentation only for sample queries. So I guess this needs to be decided
in the leader, and communicated to the workers somehow.


Yes, deciding in the leader and communicating to workers seems like
the way to go, but I'm not sure how simple that's going to be to
implement.



Clearly we need to pass some information to the worker processes, so 
that they know whether to instrument the query or not. I don't know if 
there's a good non-invasive way to do that from an extension - the 
easiest way I can think of is using a bit of shared memory to pass the 
"sample query" flag - attached is a patch that does that, and it seems 
to be working fine for me.


The bad thing is that this requires auto_explain to be loaded from 
shared_preload_libraries, because it needs to allocate the chunk of 
shared memory. That's quite annoying, because auto_explain is often 
useful for ad-hoc investigations, so being able to load it on the fly is 
valuable. But perhaps DSM would fix this (sorry, my knowledge of the DSM 
infrastructure is limited)?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 4ccd2aa..a9e87bd 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -14,8 +14,13 @@
 
 #include 
 
+#include "access/parallel.h"
 #include "commands/explain.h"
 #include "executor/instrument.h"
+#include "miscadmin.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
 #include "utils/guc.h"
 
 PG_MODULE_MAGIC;
@@ -47,6 +52,7 @@ static ExecutorStart_hook_type prev_ExecutorStart = NULL;
 static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
+static 

Re: [HACKERS] [PATCH] pgpassfile connection option

2016-11-01 Thread Fabien COELHO


Hello Julian,


Alright, here goes another one:


Patch v3 applies, make check ok, feature tested on Linux, one small issue 
found, see below.


1. Cleaned up the clutter with getPgPassFilename - the function is now named 
fillDefaultPGPassFile() and only does exactly that.


Ok.

2. Since a connection option "pgpassfile" or environment variable 
"PGPASSFILE" are picked up in conninfo_add_defaults() or in case a password 
was needed, but neither a pgpassfile connection option or environment 
variable were set, we'd have filled the conn->pgpassfile field with the 
"default" ~/.pgpass stuff.


Ok.

Thus, when returning with an error, if conn->pgpassfile was set and a 
password was necessary, we must have tried that pgpassfile, so i got rid of 
the field "dot_pgpass_used"


No, you should not have done that, because it changes a feature which was 
to warn *only* when the password was coming from file.


in the pg_conn struct and the pgpassfile string is always used in the 
error message.


 sh> touch dot_pass_empty
 sh> LD_LIBRARY_PATH=./src/interfaces/libpq \
psql "dbname=test host=localhost user=test pgpassfile=./dot_pass_empty"
 Password: BAD_PASSWORD_TYPED
 psql: FATAL:  password authentication failed for user "test"
 password retrieved from file "./dot_pass_empty"

The warning is wrong, the password was typed directly, not retrieved from 
a file. The "dot_pgpass_used" boolean is still required to avoid that.



3. Going on, I renamed "dot_pg_pass_warning()" to "PGPassFileWarning()"


This makes sense, its name is not necessarily ".pgpass".

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 9:43 AM, Mithun Cy  wrote:
>> Starting program: /home/mithun/libpqbin/bin/./psql
>> postgres://%2fhome%2fmithun:/postgres -U mithun1
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> psql: could not connect to server: No such file or directory
>> Is the server running locally and accepting
>> connections on Unix domain socket "/home/mithun/.s.PGSQL."?
>
> While testing further along these lines I accidentally found a crash when an
> invalid encoding "%2" (not recognised as hex) was used.
> Test:
>>
>> Starting program: /home/mithun/libpqbin/bin/./psql
>> postgres://%2home%2mithun:/postgres -U mithun1

You seem to be confused about how to escape characters in URLs.  The
format is a % followed by exactly two hex digits.  In this case, that
would give you %2h and %2m, and h and m are not hex characters, so the
error message is right.  Your input is not valid.

> There can be a PGconn which has no connhost.
> The exposed APIs PQpass() and PQreset()->connectDBStart() access conn->connhost
> without checking whether it is set.

Can you provide a concrete test scenario or some test code that fails?
 connhost is supposed to be getting set in connectOptions2(), which is
run (AIUI) when the PGconn is first created.  So I think that it will
be set in all scenarios of the type you mention, but I might be
missing something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve output of BitmapAnd EXPLAIN ANALYZE

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 9:46 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> I don't like Tom's proposal of trying to fake up a value here when
>> EXPLAIN ANALYZE is in use.  Reporting "exact" and "lossy" values for
>> BitmapAnd would be a fine enhancement, but artificially trying to
>> flatten that back into a row count is going to be confusing, not
>> helpful.  (Just last week I saw a case where the fact that many pages
>> were being lossified caused a performance problem ... so treating
>> lossy pages as if they don't exist would have led to a lot of
>> head-scratching, because under Tom's proposal the row count would have
>> been way off.)
>
> It would very often be the case that the value I suggested would be exact,
> so this complaint seems off-base to me.

From my point of view, something that very often gives the right
answers isn't acceptable.  We certainly wouldn't accept a query
optimization that very often gives the right answers.  It's gotta
always give the right answer.

> If we were willing to add an additional output line, we could also report
> the number of lossy pages in the result bitmap, and people would then
> know not to trust the reported rowcount as gospel.  But it's still useful
> to have it.  I'm envisioning output like
>
>->  BitmapOr  (cost=... rows=2000 width=0) (actual time=... rows=1942 
> loops=1)
>
> in the no-lossy-pages case, otherwise
>
>->  BitmapOr  (cost=... rows=4000 width=0) (actual time=... rows=3945 
> loops=1)
>  Lossy Bitmap: exact entries=2469, lossy pages=123
>
> There's nothing misleading about that, IMO.  (Exercise for the reader:
> what rows/page estimate did I assume?)

(4000-2469)/123 = 12.44715 ?

I think it's inherently misleading to report values that were
concocted specifically for EXPLAIN ANALYZE.  Things that we report
there should have some underlying reality or relevance.  People -
including me - tend to assume they do, and you don't want to spend
time chasing down something that's PURELY an EXPLAIN ANALYZE artifact
with no actual relevance to the runtime behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] emergency outage requiring database restart

2016-11-01 Thread Andres Freund
On 2016-11-01 09:56:45 -0400, Tom Lane wrote:
> The real problem with Oskari's theory is that it requires not merely
> busted, but positively brain-dead error handling in the shell and/or
> sqsh, ie ignoring open() failures altogether.  That seems kind of
> unlikely.  Still, I suspect he might be onto something --- there must
> be some reason you can reproduce the issue in production and not in
> your test bed, and number-of-open-files is as good a theory as I've
> heard.

I've seen shell code akin to
exec 16>somefile  # assume fd 16 is unused
more than once :(


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] emergency outage requiring database restart

2016-11-01 Thread Tom Lane
Merlin Moncure  writes:
> On Mon, Oct 31, 2016 at 10:32 AM, Oskari Saarenmaa  wrote:
>> Your production system's postgres backends probably have a lot more open
>> files associated with them than the simple test case does.  Since Postgres
>> likes to keep files open as long as possible and only closes them when you
>> need to free up fds to open new files, it's possible that your production
>> backends have almost all allowed fds used when you execute your pl/sh
>> function.
>> 
>> If that's the case, the sqsh process that's executed may not have enough fds
>> to do what it wanted to do and because of busted error handling could end up
>> writing to fds that were opened by Postgres and point to $PGDATA files.

> Does that apply?  the mechanics are a sqsh function that basically does:
> cat foo.sql  | sqsh 
> pipe redirection opens a new process, right?

Yeah, but I doubt that either level of the shell would attempt to close
inherited file handles.

The real problem with Oskari's theory is that it requires not merely
busted, but positively brain-dead error handling in the shell and/or
sqsh, ie ignoring open() failures altogether.  That seems kind of
unlikely.  Still, I suspect he might be onto something --- there must
be some reason you can reproduce the issue in production and not in
your test bed, and number-of-open-files is as good a theory as I've
heard.

Maybe the issue is not with open() failures, but with the resulting
FD numbers being much larger than sqsh is expecting.  It would be
weird to try to store an FD in something narrower than int, but
I could see a use of select() being unprepared for large FDs.
Still, it's hard to translate that idea into scribbling on the
wrong file...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve output of BitmapAnd EXPLAIN ANALYZE

2016-11-01 Thread Tom Lane
Robert Haas  writes:
> I don't like Tom's proposal of trying to fake up a value here when
> EXPLAIN ANALYZE is in use.  Reporting "exact" and "lossy" values for
> BitmapAnd would be a fine enhancement, but artificially trying to
> flatten that back into a row count is going to be confusing, not
> helpful.  (Just last week I saw a case where the fact that many pages
> were being lossified caused a performance problem ... so treating
> lossy pages as if they don't exist would have led to a lot of
> head-scratching, because under Tom's proposal the row count would have
> been way off.)

It would very often be the case that the value I suggested would be exact,
so this complaint seems off-base to me.

If we were willing to add an additional output line, we could also report
the number of lossy pages in the result bitmap, and people would then
know not to trust the reported rowcount as gospel.  But it's still useful
to have it.  I'm envisioning output like

   ->  BitmapOr  (cost=... rows=2000 width=0) (actual time=... rows=1942 
loops=1)

in the no-lossy-pages case, otherwise

   ->  BitmapOr  (cost=... rows=4000 width=0) (actual time=... rows=3945 
loops=1)
 Lossy Bitmap: exact entries=2469, lossy pages=123

There's nothing misleading about that, IMO.  (Exercise for the reader:
what rows/page estimate did I assume?)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-01 Thread Mithun Cy
On Tue, Nov 1, 2016 at 6:54 PM, Robert Haas  wrote:
>That's the wrong syntax.  If you look in
> https://www.postgresql.org/docs/devel/static/libpq-connect.html under
> "32.1.1.2. Connection URIs", it gives an example of how to include a
> slash in a pathname.  You have to use %2F, or else the URL parser will
> think you're starting a new section of the URI.
>I believe this works fine if you use the correct syntax.

Sorry that was a mistake from me. Now it appears fine.

>
> Starting program: /home/mithun/libpqbin/bin/./psql
> postgres://%2fhome%2fmithun:/postgres -U mithun1
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> psql: could not connect to server: No such file or directory
> Is the server running locally and accepting
> *connections on Unix domain socket "/home/mithun/.s.PGSQL."*?



Accidentally when testing further on this line I found a crash when invalid
encoding "%2" (not recognised as hexa) was used.
Test:

> Starting program: /home/mithun/libpqbin/bin/*./psql
> postgres://%2home%2mithun:/postgres -U mithun1*

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib64/libthread_db.so.1".


> Program received signal SIGSEGV, Segmentation fault.

0x77bb8d4f in PQpass (conn=0x68aaa0) at fe-connect.c:5582

5582 password = conn->connhost[conn->whichhost].password;

Missing separate debuginfos, use: debuginfo-install
> glibc-2.17-106.el7_2.6.x86_64 ncurses-libs-5.9-13.20130511.el7.x86_64
> readline-6.2-9.el7.x86_64


There can be PGconn which have no connhost.
exposed API's PQpass, PQreset->connectDBStart access conn->connhost without
checking whether it is set.

>That output seems fine to me.  In a real connection string, you're not
>likely to have so many duplicated addresses, and it's good for the
>error message to make clear which addresses were tried and what
>happened for each one.

Agree, Thanks.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] emergency outage requiring database restart

2016-11-01 Thread Merlin Moncure
On Mon, Oct 31, 2016 at 10:32 AM, Oskari Saarenmaa  wrote:
> 27.10.2016, 21:53, Merlin Moncure kirjoitti:
>>
>> As noted earlier, I was not able to reproduce the issue with
>> crashme.sh, which was:
>>
>> NUM_FORKS=16
>> do_parallel psql -p 5432  -c"select PushMarketSample('1740')"
>> castaging_test
>> do_parallel psql -p 5432  -c"select PushMarketSample('4400')"
>> castaging_test
>> do_parallel psql -p 5432  -c"select PushMarketSample('2160')"
>> castaging_test
>> do_parallel psql -p 5432  -c"select PushMarketSample('6680')"
>> castaging_test
>> 
>>
>> (do_parallel is simple wrapper to executing the command in parallel up
>> to NUM_FORKS).   This is on the same server and cluster as above.
>> This kind of suggests that either
>> A) there is some concurrent activity from another process that is
>> tripping the issue
>> or
>> B) there is something particular to the session invoking the function
>> that is participating in the problem.  As the application is
>> structured, a single threaded node.js app is issuing the query that is
>> high traffic and long lived.  It's still running in fact and I'm kind
>> of tempted to find some downtime to see if I can still reproduce via
>> the UI.
>
> Your production system's postgres backends probably have a lot more open
> files associated with them than the simple test case does.  Since Postgres
> likes to keep files open as long as possible and only closes them when you
> need to free up fds to open new files, it's possible that your production
> backends have almost all allowed fds used when you execute your pl/sh
> function.
>
> If that's the case, the sqsh process that's executed may not have enough fds
> to do what it wanted to do and because of busted error handling could end up
> writing to fds that were opened by Postgres and point to $PGDATA files.

Does that apply?  the mechanics are a sqsh function that basically does:
cat foo.sql  | sqsh 

pipe redirection opens a new process, right?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-11-01 Thread Robert Haas
On Mon, Oct 31, 2016 at 5:51 PM, Michael Paquier
 wrote:
> Hehe, I was expecting you to jump on those lines. While looking at the
> patch I have simplified it first to focus on the core engine of the
> thing. Adding back this code sounds fine to me as there is a wall of
> contestation. I offer to do it myself if the effort is the problem.

IMHO, your rewrite of this patch was a bit heavy-handed.  I haven't
scrutinized the code here so maybe it was a big improvement, and if so
fine, but if not it's better to collaborate with the author than to
take over.  In any case, yeah, I think you should put that back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix bug in handling of dropped columns in pltcl triggers

2016-11-01 Thread Michael Paquier
On Tue, Nov 1, 2016 at 4:17 AM, Jim Nasby  wrote:
> While reviewing code coverage in pltcl, I uncovered a bug in trigger
> function return handling. If you returned the munged name of a dropped
> column, that would silently be ignored. It would be unusual to hit this,
> since dropped columns end up with names like "...pg.dropped.2...",
> but since that's still a legitimate name for a column silently ignoring it
> seems rather bogus.

It seems to me that this patch breaks $TG_relatts and what existing
applications would except from it:
  
   $TG_relatts
   

 A Tcl list of the table column names, prefixed with an empty list
 element. So looking up a column name in the list with
Tcl's
 lsearch command returns the element's number starting
 with 1 for the first column, the same way the columns are customarily
 numbered in PostgreSQL.  (Empty list
 elements also appear in the positions of columns that have been
 dropped, so that the attribute numbering is correct for columns
 to their right.)

   
  
As this is a behavior present since 2004, it does not sound wise to change it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-01 Thread Robert Haas
On Tue, Nov 1, 2016 at 2:03 AM, Mithun Cy  wrote:
> On Thu, Oct 27, 2016 at 11:15 PM, Robert Haas  wrote:
>> Thanks.  Here's a new version with a fix for that issue and also a fix
>> for PQconnectionNeedsPassword(), which was busted in v1.
> I did some more testing of the patch for both URI and (host, port) parameter
> pairs. I did test for ipv4, ipv6, UDS type of sockets, Also tested different
> password using .pgpass. I did not see any major issue with. All seems to
> work as defined. Tested the behaviors of API PQport, PQhost, PQpass, PQreset
> all works as defined.
>
> I also have following observation which can be considered if it is valid.
>
> 1. URI do not support UDS, where as (host,port) pair support same.
> But there is one problem with this, instead of reporting it, we discard the
> URI input (@conninfo_uri_parse_options) and try to connect to default unix
> socket
>
> [mithun@localhost bin]$ ./psql
> postgres:///home/mithun:,127.0.0.1:/postgres -U mithun
> psql: could not connect to server: No such file or directory
> Is the server running locally and accepting
> connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
> server is running on both the sockets.

That's the wrong syntax.  If you look in
https://www.postgresql.org/docs/devel/static/libpq-connect.html under
"32.1.1.2. Connection URIs", it gives an example of how to include a
slash in a pathname.  You have to use %2F, or else the URL parser will
think you're starting a new section of the URI.

I believe this works fine if you use the correct syntax.

> 2. If we fail to connect we get an error message for each of the addrinfo we
> tried to connect, wondering if we could simplify same. Just a suggestion.
> [mithun@localhost bin]$ ./psql postgres://,,,/postgres
> psql: could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Connection refused
> Is the server running on host "" (127.0.0.1) and accepting
> TCP/IP connections on port 5432?

That output seems fine to me.  In a real connection string, you're not
likely to have so many duplicated addresses, and it's good for the
error message to make clear which addresses were tried and what
happened for each one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve output of BitmapAnd EXPLAIN ANALYZE

2016-11-01 Thread Robert Haas
On Mon, Oct 31, 2016 at 6:56 AM, Emre Hasegeli  wrote:
> The BRIN Bitmap Index Scan has the same problem.  I have seen people
> confused by this.  I think N/A would clearly improve the situation.

I agree.  Or perhaps better still, leave rows=%.0f out altogether when
we don't have a meaningful value to report.  If it were OK to use some
unimportant-looking value as a proxy for "undefined", the SQL standard
wouldn't include nulls.

I don't like Tom's proposal of trying to fake up a value here when
EXPLAIN ANALYZE is in use.  Reporting "exact" and "lossy" values for
BitmapAnd would be a fine enhancement, but artificially trying to
flatten that back into a row count is going to be confusing, not
helpful.  (Just last week I saw a case where the fact that many pages
were being lossified caused a performance problem ... so treating
lossy pages as if they don't exist would have led to a lot of
head-scratching, because under Tom's proposal the row count would have
been way off.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pgpassfile connection option

2016-11-01 Thread Julian Markwort

Welp, I was in head over heels, sorry for my messy email...

*2. No more "dot_pgpass_used" - we fill the conn->pgpassfile field with 
any options that have been provided (connection parameter, environment 
variable, "default" ~/.pgpass) and in case there has been an error with 
the authentification, we only use conn->pgpassfile in the error message.


Fabien Coelho wrote:
I agree that it is currently unconvincing. I'm unclear about the 
correct level of messages that should be displayed for a library. I 
think that it should be pretty silent by default if all is well but 
that some environment variable should be able to put a more verbose 
level... 
fe_connect.c in general seems very talkative about it's errors - even if 
it's only about files not beeing found, so I'll include an error message 
for that case next time.


Julian


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain vs. parallel query

2016-11-01 Thread Robert Haas
On Mon, Oct 31, 2016 at 6:35 PM, Tomas Vondra
 wrote:
> While debugging something on 9.6, I've noticed that auto_explain handles
> parallel queries in a slightly strange way - both the leader and all the
> workers log their chunk of the query (i.e. the leader logs explain for the
> whole query, while workers only log the parts they've been executing).
>
> So for example for a query with 2 workers, the log look like this:
>
>2016-10-31 23:10:23.481 CET [12278] LOG:  duration: 32.562 ms  pl ...
>Query Text:   ...
>Parallel Seq Scan on tables  (cost=0.00..15158.25 rows=220 wi ...
>  Filter: ((table_datoid < '10'::oid) AND (table_nspo ...
>  Rows Removed by Filter: 140614  ...
>2016-10-31 23:10:23.481 CET [12277] LOG:  duration: 32.325 ms  pl ...
>Query Text:   ...
>Parallel Seq Scan on tables  (cost=0.00..15158.25 rows=220 wi ...
>  Filter: ((table_datoid < '10'::oid) AND (table_nspo ...
>  Rows Removed by Filter: 80948   ...
>2016-10-31 23:10:23.483 CET [12259] LOG:  duration: 38.997 ms  pl ...
>Query Text: explain analyze select * from tables where table_ ...
>Gather  (cost=1000.00..16211.15 rows=529 width=356) (actual t ...
>  Workers Planned: 2  ...
>  Workers Launched: 2 ...
>  ->  Parallel Seq Scan on tables  (cost=0.00..15158.25 rows= ...
>Filter: ((table_datoid < '10'::oid) AND (tabl ...
>Rows Removed by Filter: 105570...
>
> I'd say that's fairly surprising, and I haven't found any discussion about
> auto_explain vs. parallel queries in the archives (within the last year), so
> I assume this may not be entirely intentional.

Yeah, that's not entirely intentional.  :-(

> Not only this does not match the output when running EXPLAIN ANALYZE
> manually, it also provides no information about which messages from workers
> belong to which "leader" message.
>
> Another thing is that this interacts with sampling in a rather unfortunate
> way, because each process evaluates the sampling condition independently. So
> for example with auto_explain.sample_rate=0.5 a random subset of
> worker/leader explain plans will be logged.
>
> But this also applies to the conditions in ExecutorStart, which enables
> instrumentation only when the query gets sampled - so when the leader gets
> sampled but not all the workers, the counters in the query logged by the
> leader are incomplete.
>
> I do believe those are bugs that make auto_explain rather unusable with
> parallel queries. Adding IsParallelWorker() to the condition in ExecutorEnd
> should fix the extra logging messages (and log only from the leader).
>
> Not sure how to fix the sampling - simply adding IsParallelWorker() to
> ExecutorStart won't do the trick, because we don't want to enable
> instrumentation only for sample queries. So I guess this needs to be decided
> in the leader, and communicated to the workers somehow.

Yes, deciding in the leader and communicating to workers seems like
the way to go, but I'm not sure how simple that's going to be to
implement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pgpassfile connection option

2016-11-01 Thread Julian Markwort

Alright, here goes another one:
1. Cleaned up the clutter with getPgPassFilename - the function is now 
named fillDefaultPGPassFile() and only does exactly that.
2. Since a connection option "pgpassfile" or environment variable 
"PGPASSFILE" are picked up in conninfo_add_defaults() or in case a 
password was needed, but neither a pgpassfile connection option or 
environment variable were set, we'd have filled the conn->pgpassfile 
field with the "default" ~/.pgpass stuff.
Thus, when returning with an error, if conn->pgpassfile was set and a 
password was necessary, we must have tried that pgpassfile, so i got rid 
of the field "dot_pgpass_used" in  the pg_conn struct and the pgpassfile 
string is always used in the error message.

3. Going on, I renamed "dot_pg_pass_warning()" to "PGPassFileWarning()"

Kind regards,
Julian Markwort
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 4e34f00..1bd5597 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -930,7 +930,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
 Note that authentication is likely to fail if host
 is not the name of the server at network address hostaddr.
 Also, note that host rather than hostaddr
-is used to identify the connection in ~/.pgpass (see
+is used to identify the connection in a password file (see
 ).

 
@@ -986,6 +986,16 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
  
 
+ 
+  pgpassfile
+  
+  
+Specifies the name of the file used to lookup passwords.
+Defaults to the password file (see ).
+  
+  
+ 
+
  
   connect_timeout
   
@@ -6862,9 +6872,8 @@ myEventProc(PGEventId evtId, void *evtInfo, void 
*passThrough)
   
PGPASSFILE
   
-  PGPASSFILE specifies the name of the password file to
-  use for lookups.  If not set, it defaults to ~/.pgpass
-  (see ).
+  PGPASSFILE behaves the same as the  connection parameter.
  
 
 
@@ -7136,13 +7145,15 @@ myEventProc(PGEventId evtId, void *evtInfo, void 
*passThrough)
   
 
   
-   The file .pgpass in a user's home directory or the
-   file referenced by PGPASSFILE can contain passwords to
+   The file .pgpass in a user's home directory can 
contain passwords to
be used if the connection requires a password (and no password has been
specified  otherwise). On Microsoft Windows the file is named
%APPDATA%\postgresql\pgpass.conf (where
%APPDATA% refers to the Application Data subdirectory in
the user's profile).
+   Alternatively, a password file can be specified
+   using the connection parameter 
+   or the environment variable PGPASSFILE.
   
 
   
diff --git a/src/interfaces/libpq/fe-connect.c 
b/src/interfaces/libpq/fe-connect.c
index f3a9e5a..e8f8fe1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -184,6 +184,10 @@ static const internalPQconninfoOption PQconninfoOptions[] 
= {
"Database-Password", "*", 20,
offsetof(struct pg_conn, pgpass)},
 
+   {"pgpassfile", "PGPASSFILE", NULL, NULL,
+   "Database-Password-File", "", 64,
+   offsetof(struct pg_conn, pgpassfile)},
+
{"connect_timeout", "PGCONNECT_TIMEOUT", NULL, NULL,
"Connect-timeout", "", 10,  /* strlen(INT32_MAX) == 
10 */
offsetof(struct pg_conn, connect_timeout)},
@@ -375,9 +379,9 @@ static int parseServiceFile(const char *serviceFile,
 bool *group_found);
 static char *pwdfMatchesString(char *buf, char *token);
 static char *PasswordFromFile(char *hostname, char *port, char *dbname,
-char *username);
-static bool getPgPassFilename(char *pgpassfile);
-static void dot_pg_pass_warning(PGconn *conn);
+char *username, char *pgpassfile);
+static bool fillDefaultPGPassFile(PGconn *conn);
+static void PGPassFileWarning(PGconn *conn);
 static void default_threadlock(int acquire);
 
 
@@ -804,18 +808,22 @@ connectOptions2(PGconn *conn)
 */
if (conn->pgpass == NULL || conn->pgpass[0] == '\0')
{
+   if(!conn->pgpassfile || conn->pgpassfile[0] =='\0')
+   {
+   fillDefaultPGPassFile(conn);
+   }
+
if (conn->pgpass)
free(conn->pgpass);
+   /* We'll pass conn->pgpassfile regardless of it's contents - 
checks happen in PasswordFromFile() */
conn->pgpass = PasswordFromFile(conn->pghost, conn->pgport,
-   
conn->dbName, conn->pguser);
+   
conn->dbName, conn->pguser, conn->pgpassfile);
if (conn->pgpass == NULL)
{
conn->pgpass = strdup(DefaultPassword);
 

Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-11-01 Thread Fabien COELHO


Hello Rafia,


Well with this new approach, the example you gave previously for better
readability:


\set bid
 CASE WHEN random(0, 99) < 85
   THEN :tbid
   ELSE :abid + (:abid >= :tbid)
 END


will give error at the first line.


Indeed you are right for the patch I sent, but it is ok if the initial 
state is COEX, i.e. it does not allow an empty expression.


In general, this new approach is likely to create confusions in such 
cases.


See attached version.

As an end-user one needs to be real careful to check what portions have 
to split between lines. Keeping this in mind, I'd prefer the previous 
approach.


I will not fight over this one. I like it in "scala", though, and I find 
it rather elegant, especially as backslashes are quite ugly.


Another reason not to take it is that it would be much harder to have that 
in psql as well.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 285608d..f066be1 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -826,13 +826,17 @@ pgbench  options  dbname
   %) with their usual precedence and associativity,
   function calls, and
   parentheses.
+  Expressions can spead several lines till completed: the parsing is
+  pursued if a token at the end of the line cannot end an expression
+  or if there is an unclosed parenthesis.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) %
+   (10 * :scale) + 1
 
 

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 20891a3..0d6fcd6 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -43,6 +43,16 @@ static bool last_was_newline = false;
 extern int	expr_yyget_column(yyscan_t yyscanner);
 extern void expr_yyset_column(int column_no, yyscan_t yyscanner);
 
+/* the expression cannot end on this token */
+#define TO_COEX	   \
+	cur_state->start_state = COEX; \
+	BEGIN(COEX)
+
+/* continuation if unclosed parentheses */
+#define TO_EXPR			\
+	cur_state->start_state = cur_state->paren_depth > 0 ? COEX : EXPR;	\
+	BEGIN(cur_state->start_state)
+
 %}
 
 /* Except for the prefix, these options should match psqlscan.l */
@@ -67,7 +77,7 @@ nonspace		[^ \t\r\f\v\n]
 newline			[\n]
 
 /* Exclusive states */
-%x EXPR
+%x EXPR COEX
 
 %%
 
@@ -104,46 +114,64 @@ newline			[\n]
 	return 0;
 }
 
+	/* COEX (continued expression) state */
+
+{
+
+{newline}		{ /* ignore */ }
+
+}
+
 	/* EXPR state */
 
 {
 
-"+"{ return '+'; }
-"-"{ return '-'; }
-"*"{ return '*'; }
-"/"{ return '/'; }
-"%"{ return '%'; }
-"("{ return '('; }
-")"{ return ')'; }
-","{ return ','; }
+{newline}		{
+	/* report end of command */
+	last_was_newline = true;
+	return 0;
+}
+}
+
+	/* EXPR & COEX states common rules */
+
+{
+
+"+"{ TO_COEX; return '+'; }
+"-"{ TO_COEX; return '-'; }
+"*"{ TO_COEX; return '*'; }
+"/"{ TO_COEX; return '/'; }
+"%"{ TO_COEX; return '%'; }
+"("{ cur_state->paren_depth++; TO_COEX; return '('; }
+")"{ cur_state->paren_depth--; TO_EXPR; return ')'; }
+","{ TO_COEX; return ','; }
 
 :{alnum}+		{
 	yylval->str = pg_strdup(yytext + 1);
+	TO_EXPR;
 	return VARIABLE;
 }
 {digit}+		{
 	yylval->ival = strtoint64(yytext);
+	TO_EXPR;
 	return INTEGER_CONST;
 }
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
 	yylval->dval = atof(yytext);
+	TO_EXPR;
 	return DOUBLE_CONST;
 }
 \.{digit}+([eE][-+]?{digit}+)?	{
 	yylval->dval = atof(yytext);
+	TO_EXPR;
 	return DOUBLE_CONST;
 }
 {alpha}{alnum}*	{
 	yylval->str = pg_strdup(yytext);
+	TO_COEX;
 	return FUNCTION;
 }
 
-{newline}		{
-	/* report end of command */
-	last_was_newline = true;
-	return 0;
-}
-
 {space}+		{ /* ignore */ }
 
 .{
@@ -289,7 +317,7 @@ expr_scanner_init(PsqlScanState state,
 		yy_switch_to_buffer(state->scanbufhandle, state->scanner);
 
 	/* Set start state */
-	state->start_state = EXPR;
+	state->start_state = COEX;
 
 	return state->scanner;
 }


cont2.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Typo in comment in contrib/postgres_fdw/deparse.c

2016-11-01 Thread Etsuro Fujita

Hi,

I ran into a typo in a comment in contrib/postgres_fdw/deparse.c.  
Please find attached a patch.


Best regards,
Etsuro Fujita
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 450693a..66b059a 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -847,7 +847,7 @@ deparse_type_name(Oid type_oid, int32 typemod)
  *
  * The output targetlist contains the columns that need to be fetched from the
  * foreign server for the given relation.  If foreignrel is an upper relation,
- * then the output targetlist can also contains expressions to be evaluated on
+ * then the output targetlist can also contain expressions to be evaluated on
  * foreign server.
  */
 List *

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] commit fest manager for CF 2016-11?

2016-11-01 Thread Michael Paquier
On Tue, Nov 1, 2016 at 4:53 PM, Michael Paquier
 wrote:
> On Tue, Nov 1, 2016 at 12:20 PM, Thomas Munro
>  wrote:
>> On Tue, Nov 1, 2016 at 4:15 PM, Peter Eisentraut
>>  wrote:
>>> On 10/31/16 9:39 PM, Michael Paquier wrote:
 We are on the 1st, at least in my timezone. So the CF should start
 really soon, meaning that no new patches would get in. In a couple of
 hours that would be fine? I recall that the last times we tried to
 sync with the US East coast time.
>>>
>>> https://en.wikipedia.org/wiki/Anywhere_on_Earth
>>
>> +1
>
> OK, why not. And so this lets exactly 4 hours at the moment I write this 
> email.

And done.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Push down more full joins in postgres_fdw

2016-11-01 Thread Etsuro Fujita

On 2016/10/27 20:41, Etsuro Fujita wrote:

Here is the updated version, which includes the restructuring you
proposed.  Other than the above issue and the alias issue we discussed,
I addressed all your comments except one on testing; I tried to add test
cases where the remote query is deparsed as nested subqueries, but I
couldn't because IIUC, reduce_outer_joins reduced full joins to inner
joins or left joins.  So, I added two test cases: (1) the joining
relations are both base relations (actually, we already have that) and
(2) one of the joining relations is a base relation and the other is a
join relation.  I rebased the patch to HEAD, so I added a test case with
aggregate pushdown also.


I've updated another patch for evaluating PHVs remotely.  Attached is an 
updated version of the patch, which is created on top of the patch for 
supporting deparsing subqueries.  You can find test cases where a remote 
join query is deparsed as nested subqueries.


Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 49,54 
--- 49,55 
  #include "nodes/nodeFuncs.h"
  #include "nodes/plannodes.h"
  #include "optimizer/clauses.h"
+ #include "optimizer/placeholder.h"
  #include "optimizer/prep.h"
  #include "optimizer/tlist.h"
  #include "optimizer/var.h"
***
*** 157,162  static void deparseRelabelType(RelabelType *node, deparse_expr_cxt *context);
--- 158,164 
  static void deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context);
  static void deparseNullTest(NullTest *node, deparse_expr_cxt *context);
  static void deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context);
+ static void deparseExprInPlaceHolderVar(PlaceHolderVar *node, deparse_expr_cxt *context);
  static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
   deparse_expr_cxt *context);
  static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
***
*** 169,177  static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
  static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
     bool make_subquery, List **params_list);
  static void appendSubselectAlias(StringInfo buf, int tabno, int ncols);
! static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel,
  	  int *tabno, int *colno);
! static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno);
  static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
  static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
  static void appendAggOrderBy(List *orderList, List *targetList,
--- 171,180 
  static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
     bool make_subquery, List **params_list);
  static void appendSubselectAlias(StringInfo buf, int tabno, int ncols);
! static void getSubselectAliasInfo(Expr *node, RelOptInfo *foreignrel,
  	  int *tabno, int *colno);
! static bool isSubqueryExpr(Expr *node, PlannerInfo *root, RelOptInfo *foreignrel,
! 			   int *tabno, int *colno);
  static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
  static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
  static void appendAggOrderBy(List *orderList, List *targetList,
***
*** 762,767  foreign_expr_walker(Node *node,
--- 765,789 
  	state = FDW_COLLATE_UNSAFE;
  			}
  			break;
+ 		case T_PlaceHolderVar:
+ 			{
+ PlaceHolderVar *phv = (PlaceHolderVar *) node;
+ PlaceHolderInfo *phinfo = find_placeholder_info(glob_cxt->root,
+ phv, false);
+ 
+ /*
+  * If the PHV's contained expression is computable on the
+  * remote server, we consider the PHV safe to send.
+  */
+ if (phv->phlevelsup == 0 &&
+ 	bms_is_subset(phinfo->ph_eval_at,
+   glob_cxt->foreignrel->relids))
+ 	return foreign_expr_walker((Node *) phv->phexpr,
+ 			   glob_cxt, outer_cxt);
+ else
+ 	return false;
+ 			}
+ 			break;
  		default:
  
  			/*
***
*** 856,862  deparse_type_name(Oid type_oid, int32 typemod)
   * foreign server.
   */
  List *
! build_tlist_to_deparse(RelOptInfo *foreignrel)
  {
  	List	   *tlist = NIL;
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
--- 878,884 
   * foreign server.
   */
  List *
! build_tlist_to_deparse(PlannerInfo *root, RelOptInfo *foreignrel)
  {
  	List	   *tlist = NIL;
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
***
*** 869,883  build_tlist_to_deparse(RelOptInfo *foreignrel)
  		return fpinfo->grouped_tlist;
  
  	/*
! 	 * We require columns specified in foreignrel->reltarget->exprs and those
! 	 * required for evaluating the local conditions.
  	 */
! 	tlist = add_to_flat_tlist(tlist,
! 	   pull_var_clause((Node *) foreignrel->reltarget->exprs,
! 	   

Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-11-01 Thread Rafia Sabih
Well with this new approach, the example you gave previously for better
readability:

> \set bid
>  CASE WHEN random(0, 99) < 85
>THEN :tbid
>ELSE :abid + (:abid >= :tbid)
>  END

will give error at the first line. In general, this new approach is likely
to create confusions in such cases. As an end-user one needs to be real
careful to check what portions have to split between lines. Keeping this in
mind, I'd prefer the previous approach.

On Tue, Nov 1, 2016 at 4:23 PM, Fabien COELHO  wrote:

>
> Attached patch does what is described in the title, hopefully.
>> Continuations in other pgbench backslash-commands should be dealt with
>> elsewhere...
>>
>> Also attached is a small test script.
>>
>
> Here is another approach, with "infered" continuations: no backslash is
> needed, the parsing is pursued if the last token of the line cannot end an
> expression (eg an operator) or if there is an unclosed parenthesis.
>
> I think that backslashes are less surprising for the classically minded
> user, but this one is more fun:-) Also, this version changes a little more
> the scanner because on each token the next state (continued or not) must be
> decided.
>
> --
> Fabien.
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


Re: [HACKERS] Fix checkpoint skip logic on idle systems by tracking LSN progress

2016-11-01 Thread Kyotaro HORIGUCHI
Thanks for merging. It still applies on the current master with
some displacements.

At Wed, 5 Oct 2016 15:18:53 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-11-01 Thread Fabien COELHO


Attached patch does what is described in the title, hopefully. Continuations 
in other pgbench backslash-commands should be dealt with elsewhere...


Also attached is a small test script.


Here is another approach, with "infered" continuations: no backslash is 
needed, the parsing is pursued if the last token of the line cannot end an 
expression (eg an operator) or if there is an unclosed parenthesis.


I think that backslashes are less surprising for the classically minded 
user, but this one is more fun:-) Also, this version changes a little more 
the scanner because on each token the next state (continued or not) must 
be decided.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 285608d..f066be1 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -826,13 +826,17 @@ pgbench  options  dbname
   %) with their usual precedence and associativity,
   function calls, and
   parentheses.
+  Expressions can spead several lines till completed: the parsing is
+  pursued if a token at the end of the line cannot end an expression
+  or if there is an unclosed parenthesis.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) %
+   (10 * :scale) + 1
 
 

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 20891a3..1b0d4d0 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -43,6 +43,16 @@ static bool last_was_newline = false;
 extern int	expr_yyget_column(yyscan_t yyscanner);
 extern void expr_yyset_column(int column_no, yyscan_t yyscanner);
 
+/* the expression cannot end on this token */
+#define TO_COEX	   \
+	cur_state->start_state = COEX; \
+	BEGIN(COEX)
+
+/* continuation if unclosed parentheses */
+#define TO_EXPR			\
+	cur_state->start_state = cur_state->paren_depth > 0 ? COEX : EXPR;	\
+	BEGIN(cur_state->start_state)
+
 %}
 
 /* Except for the prefix, these options should match psqlscan.l */
@@ -67,7 +77,7 @@ nonspace		[^ \t\r\f\v\n]
 newline			[\n]
 
 /* Exclusive states */
-%x EXPR
+%x EXPR COEX
 
 %%
 
@@ -104,46 +114,64 @@ newline			[\n]
 	return 0;
 }
 
+	/* COEX (continued expression) state */
+
+{
+
+{newline}		{ /* ignore */ }
+
+}
+
 	/* EXPR state */
 
 {
 
-"+"{ return '+'; }
-"-"{ return '-'; }
-"*"{ return '*'; }
-"/"{ return '/'; }
-"%"{ return '%'; }
-"("{ return '('; }
-")"{ return ')'; }
-","{ return ','; }
+{newline}		{
+	/* report end of command */
+	last_was_newline = true;
+	return 0;
+}
+}
+
+	/* EXPR & COEX states common rules */
+
+{
+
+"+"{ TO_COEX; return '+'; }
+"-"{ TO_COEX; return '-'; }
+"*"{ TO_COEX; return '*'; }
+"/"{ TO_COEX; return '/'; }
+"%"{ TO_COEX; return '%'; }
+"("{ cur_state->paren_depth++; TO_COEX; return '('; }
+")"{ cur_state->paren_depth--; TO_EXPR; return ')'; }
+","{ TO_COEX; return ','; }
 
 :{alnum}+		{
 	yylval->str = pg_strdup(yytext + 1);
+	TO_EXPR;
 	return VARIABLE;
 }
 {digit}+		{
 	yylval->ival = strtoint64(yytext);
+	TO_EXPR;
 	return INTEGER_CONST;
 }
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
 	yylval->dval = atof(yytext);
+	TO_EXPR;
 	return DOUBLE_CONST;
 }
 \.{digit}+([eE][-+]?{digit}+)?	{
 	yylval->dval = atof(yytext);
+	TO_EXPR;
 	return DOUBLE_CONST;
 }
 {alpha}{alnum}*	{
 	yylval->str = pg_strdup(yytext);
+	TO_COEX;
 	return FUNCTION;
 }
 
-{newline}		{
-	/* report end of command */
-	last_was_newline = true;
-	return 0;
-}
-
 {space}+		{ /* ignore */ }
 
 .{


cont2.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] BUG #14350: VIEW with INSTEAD OF INSERT TRIGGER and COPY. Missing feature or working as designed.

2016-11-01 Thread Dilip Kumar
On Tue, Nov 1, 2016 at 12:40 PM, Haribabu Kommi
 wrote:
> COPY command is treated as an UTILITY command, During the query
> processing, the rules are not applied on the COPY command, and in the
> execution of COPY command, it just inserts the data into the heap and
> indexes with direct calls.
>
> I feel supporting the COPY command for the case-2 is little bit complex
> and may not be that much worth.

I agree it will be complex..
>
> Attached is the updated patch with doc changes.

Now since we are adding support for INSTEAD OF TRIGGER in COPY FROM
command, It will be good that we provide a HINT to user if INSTEAD of
trigger does not exist, like we do in case of insert ?

INSERT case:
postgres=# insert into ttt_v values(7);
2016-11-01 14:31:39.845 IST [72343] ERROR:  cannot insert into view "ttt_v"
2016-11-01 14:31:39.845 IST [72343] DETAIL:  Views that do not select
from a single table or view are not automatically updatable.
2016-11-01 14:31:39.845 IST [72343] HINT:  To enable inserting into
the view, provide an INSTEAD OF INSERT trigger or an unconditional ON
INSERT DO INSTEAD rule.

COPY case:
postgres=# COPY ttt_v FROM stdin;
2016-11-01 14:31:52.235 IST [72343] ERROR:  cannot copy to view "ttt_v"
2016-11-01 14:31:52.235 IST [72343] STATEMENT:  COPY ttt_v FROM stdin;


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-11-01 Thread Fabien COELHO


Hello Rafia,


Seems like an interesting addition to pgbench interface, but not sure where
it's required, it'll be good if you may provide some cases where it's
utility can be witnessed. Something like where you absolutely need
continuations in expression.


It is never "absolutely needed", you can always put the expression on one 
longer line.


As an long line example, supposing some other currently submitted patches 
are committed to pgbench, for implementing tpc-b according to the 
specification one may write the following:


  \set bid CASE WHEN random(0, 99) < 85 THEN :tbid ELSE :abid + (:abid >= 
:tbid) END

or

  \set bid \
 CASE WHEN random(0, 99) < 85  \
   THEN :tbid  \
   ELSE :abid + (:abid >= :tbid)   \
 END

I find the second version more readable, but it is a really matter of 
taste, obviously.


While applying it is giving some trailing whitespace errors, please 
correct them.


 sh> git checkout -b tmp
 sh> git apply ~/pgbench-set-continuation-1.patch
 sh> git commit -a
 sh>

I do not get any errors, and I have not found any trailing spaces in the 
patch. Could you give me the precise error message you got?


As an additional comment you may consider reformatting 
following snippet



{continuation} { /* ignore */ }

  as



{continuation} {
/* ignore */
}


Hmmm... Other lex rules to ignore spaces use the one line syntax, I would 
prefer to keep it the same as those.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] commit fest manager for CF 2016-11?

2016-11-01 Thread Michael Paquier
On Tue, Nov 1, 2016 at 12:20 PM, Thomas Munro
 wrote:
> On Tue, Nov 1, 2016 at 4:15 PM, Peter Eisentraut
>  wrote:
>> On 10/31/16 9:39 PM, Michael Paquier wrote:
>>> We are on the 1st, at least in my timezone. So the CF should start
>>> really soon, meaning that no new patches would get in. In a couple of
>>> hours that would be fine? I recall that the last times we tried to
>>> sync with the US East coast time.
>>
>> https://en.wikipedia.org/wiki/Anywhere_on_Earth
>
> +1

OK, why not. And so this lets exactly 4 hours at the moment I write this email.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2016-11-01 Thread Rafia Sabih
Hie Fabien,
Seems like an interesting addition to pgbench interface, but not sure where
it's required, it'll be good if you may provide some cases where it's
utility can be witnessed. Something like where you absolutely need
continuations in expression.

While applying it is giving some trailing whitespace errors, please correct
them.
As an additional comment you may consider  reformatting following snippet

> {continuation} { /* ignore */ }
>
>   as

> {continuation} {
> /* ignore */
> }

Thanks.

On Mon, Oct 3, 2016 at 6:16 PM, Christoph Berg  wrote:

> Re: Fabien COELHO 2016-10-03 
> > > I "\set" a bunch of lengthy SQL commands in there, e.g.
> >
> > I agree that this looks like a desirable feature, however I would tend to
> > see that as material for another independent patch.
>
> Sure, my question was by no means intending to stop your pgbench patch
> from going forward by adding extra requirements.
>
> > Hmmm. I'm not sure how this is parsed. If this is considered a string
> '...',
> > then maybe \set should wait for the end of the string instead of the end
> of
> > the line, i.e. no continuation would be needed...
> >
> >  \set config '
> > SELECT name, ...
> >CASE ... END
> > FROM pg_settings
> > WHERE ...;'
>
> I guess that would be the sane solution here, yes. Not adding any \
> chars at the end of the line would also mean cut-and-paste of the RHS
> content would work.
>
> Thanks for the feedback!
>
> Christoph
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


[HACKERS] Re: [BUGS] BUG #14350: VIEW with INSTEAD OF INSERT TRIGGER and COPY. Missing feature or working as designed.

2016-11-01 Thread Haribabu Kommi
On Tue, Nov 1, 2016 at 3:54 PM, Dilip Kumar  wrote:

> On Tue, Oct 4, 2016 at 1:07 PM, Haribabu Kommi 
> wrote:
> > I think currently there is no handling of INSTEAD of triggers in the copy
> > functionality.
> >
> > It didn't seem difficult to the support the same, until unless there are
> any
> > problems for complext queries, so after adding the INSTEAD of triggers
> > check and calling the ExecIRInsertTriggers function, the Copy is also
> > working for the view.
> >
> > Attached is a POC patch of the same. I didn't checked all the possible
> > scenarios.
>
> We support insert into view in below 2 cases..
>
> 1. INSTEAD OF INSERT trigger
> 2. or an unconditional ON INSERT DO INSTEAD rule
>

Yes, I agree that the above are the two cases where the insert is possible
on a view other than updatable views.



> In your patch we are supporting first one in COPY command, Will it not
> be good to support second one also in COPY command ?
>


COPY command is treated as an UTILITY command, During the query
processing, the rules are not applied on the COPY command, and in the
execution of COPY command, it just inserts the data into the heap and
indexes with direct calls.

I feel supporting the COPY command for the case-2 is little bit complex
and may not be that much worth.

Attached is the updated patch with doc changes.

Regards,
Hari Babu
Fujitsu Australia


copy_to_view_1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-01 Thread Mithun Cy
On Thu, Oct 27, 2016 at 11:15 PM, Robert Haas  wrote:
> Thanks.  Here's a new version with a fix for that issue and also a fix
> for PQconnectionNeedsPassword(), which was busted in v1.
I did some more testing of the patch for both URI and (host, port)
parameter pairs. I did test for ipv4, ipv6, UDS type of sockets, Also
tested different password using .pgpass. I did not see any major issue
with. All seems to work as defined. Tested the behaviors of API PQport,
PQhost, PQpass, PQreset all works as defined.

I also have following observation which can be considered if it is valid.

1. URI do not support UDS, where as (host,port) pair support same.
But there is one problem with this, instead of reporting it, we discard the
URI input (@conninfo_uri_parse_options) and try to connect to default unix
socket

[mithun@localhost bin]$ ./psql postgres:///home/mithun:,
127.0.0.1:/postgres -U mithun
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
server is running on both the sockets.

2. If we fail to connect we get an error message for each of the addrinfo
we tried to connect, wondering if we could simplify same. Just a suggestion.
[mithun@localhost bin]$ ./psql postgres://,,,/postgres
psql: could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "" (127.0.0.1) and accepting
TCP/IP connections on port 5432?

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com