Re: pgbench - add pseudo-random permutation function

2018-09-25 Thread Thomas Munro
On Wed, Sep 19, 2018 at 7:07 AM Fabien COELHO  wrote:
> > I reviewed 'pgbench-prp-func/pgbench-prp-func-4.patch'. [...] I think
> > this patch is fine.

modular_multiply() is an interesting device.  I will leave this to
committers with a stronger mathematical background than me, but I have
a small comment in passing:

+#ifdef HAVE__BUILTIN_POPCOUNTLL
+ return __builtin_popcountll(n);
+#else /* use clever no branching bitwise operator version */

I think it is not enough for the compiler to have
__builtin_popcountll().  The CPU that this is eventually executed on
must actually have the POPCNT instruction[1] (or equivalent on ARM,
POWER etc), or the program will die with SIGILL.  There are CPUs in
circulation produced in this decade that don't have it.  I have
previously considered something like this[2], but realised you would
therefore need a runtime check.  There are a couple of ways to do
that: see commit a7a73875 for one example, also
__builtin_cpu_supports(), and direct CPU ID bit checks (some
platforms).  There is also the GCC "multiversion" system that takes
care of that for you but that worked only for C++ last time I checked.

[1] https://en.wikipedia.org/wiki/SSE4#POPCNT_and_LZCNT
[2] 
https://www.postgresql.org/message-id/CAEepm%3D3g1_fjJGp38QGv%2B38BC2HHVkzUq6s69nk3mWLgPHqC3A%40mail.gmail.com
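For illustration only (not code from the patch), the branch-free fallback the #else branch alludes to, plus the kind of one-time runtime check discussed above, might be sketched like this; the dispatch relies on GCC/clang's __builtin_cpu_supports() and only helps on x86-64:

```c
#include <stdint.h>

/* Branch-free "SWAR" population count, usable on any CPU. */
int
popcount64_portable(uint64_t n)
{
	n = n - ((n >> 1) & UINT64_C(0x5555555555555555));
	n = (n & UINT64_C(0x3333333333333333)) +
		((n >> 2) & UINT64_C(0x3333333333333333));
	n = (n + (n >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
	return (int) ((n * UINT64_C(0x0101010101010101)) >> 56);
}

/*
 * Sketch of the runtime check: use the hardware instruction only when
 * the executing CPU actually reports POPCNT, else take the portable
 * fallback.  Non-x86 or non-GCC/clang builds always use the fallback.
 */
int
popcount64(uint64_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	if (__builtin_cpu_supports("popcnt"))
		return __builtin_popcountll(n);
#endif
	return popcount64_portable(n);
}
```

A real implementation would cache the CPU-support result rather than re-query it on every call, as the commit referenced above does for CRC instructions.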

-- 
Thomas Munro
http://www.enterprisedb.com



Re: pgsql: Remove obsolete function TupleDescGetSlot().

2018-09-25 Thread Michael Paquier
On Tue, Sep 25, 2018 at 06:42:51PM -0700, Andres Freund wrote:
> My point is that FuncCallContext->slot isn't actually all that related
> to TupleDescGetSlot() and could be used entirely independently.

That's fair.  Why not just replace the existing comment with something
like "slot can be used in your own context to store tuple values,
useful in the context of user-defined SRFs"?  Just my 2c.
--
Michael


signature.asc
Description: PGP signature


Re: Segfault when creating partition with a primary key and sql_drop trigger exists

2018-09-25 Thread Michael Paquier
On Tue, Sep 25, 2018 at 01:39:59PM +0200, Dmitry Dolgov wrote:
> Shouldn't it be fixed by adding EventTriggerAlterTableStart? Judging from the
> following call of ATController, we can just pass NULL as parsetree.

Hmm.  I don't think that this is correct as this data could always be
used to fetch a command tag, right?  It seems to me instead that we
should pass down IndexStmt and handle things like the attached.
Thoughts?
--
Michael
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 9229f619d2..b0cb5514d3 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -50,6 +50,7 @@
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
+#include "commands/event_trigger.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -212,7 +213,8 @@ relationHasPrimaryKey(Relation rel)
 void
 index_check_primary_key(Relation heapRel,
 		IndexInfo *indexInfo,
-		bool is_alter_table)
+		bool is_alter_table,
+		IndexStmt *stmt)
 {
 	List	   *cmds;
 	int			i;
@@ -280,7 +282,11 @@ index_check_primary_key(Relation heapRel,
 	 * unduly.
 	 */
 	if (cmds)
+	{
+		EventTriggerAlterTableStart((Node *) stmt);
 		AlterTableInternal(RelationGetRelid(heapRel), cmds, true);
+		EventTriggerAlterTableEnd();
+	}
 }
 
 /*
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index ab3d9a0a48..4fc279e86f 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -666,7 +666,7 @@ DefineIndex(Oid relationId,
 	 * Extra checks when creating a PRIMARY KEY index.
 	 */
 	if (stmt->primary)
-		index_check_primary_key(rel, indexInfo, is_alter_table);
+		index_check_primary_key(rel, indexInfo, is_alter_table, stmt);
 
 	/*
 	 * If this table is partitioned and we're creating a unique index or a
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1205dbc0b5..64e1059e94 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7074,7 +7074,7 @@ ATExecAddIndexConstraint(AlteredTableInfo *tab, Relation rel,
 
 	/* Extra checks needed if making primary key */
 	if (stmt->primary)
-		index_check_primary_key(rel, indexInfo, true);
+		index_check_primary_key(rel, indexInfo, true, stmt);
 
 	/* Note we currently don't support EXCLUSION constraints here */
 	if (stmt->primary)
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index f20c5f789b..35a29f3498 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -40,7 +40,8 @@ typedef enum
 
 extern void index_check_primary_key(Relation heapRel,
 		IndexInfo *indexInfo,
-		bool is_alter_table);
+		bool is_alter_table,
+		IndexStmt *stmt);
 
 #define	INDEX_CREATE_IS_PRIMARY				(1 << 0)
 #define	INDEX_CREATE_ADD_CONSTRAINT			(1 << 1)




Re: pgsql: Remove obsolete function TupleDescGetSlot().

2018-09-25 Thread Andres Freund
On 2018-09-26 10:38:51 +0900, Michael Paquier wrote:
> On Tue, Sep 25, 2018 at 05:10:39PM -0700, Andres Freund wrote:
> >> git grep TupleDescGetSlot
> >> doc/src/sgml/xfunc.sgml: * user-defined SRFs that use the deprecated TupleDescGetSlot().
> >> src/include/funcapi.h:   * user-defined SRFs that use the deprecated TupleDescGetSlot().
> > 
> > But here I'm less convinced. It's not entirely clear to me that the only
> > real reason for this to exist actually was TupleDescGetSlot(). OTOH, I
> > can't really see any proper reason to have it either.
> 
> I have not been following the recent thread about the refactoring of
> TupleSlot and such very closely, but if you don't plan to use
> TupleTableSlot in FuncCallContext in the future, cannot this just go
> away?  The function is not here anymore, so my take would be to get rid
> of all things which relied on its presence.

My point is that FuncCallContext->slot isn't actually all that related
to TupleDescGetSlot() and could be used entirely independently.

Greetings,

Andres Freund



Re: pgsql: Remove obsolete function TupleDescGetSlot().

2018-09-25 Thread Michael Paquier
On Tue, Sep 25, 2018 at 05:10:39PM -0700, Andres Freund wrote:
>> git grep TupleDescGetSlot
>> doc/src/sgml/xfunc.sgml: * user-defined SRFs that use the deprecated TupleDescGetSlot().
>> src/include/funcapi.h:   * user-defined SRFs that use the deprecated TupleDescGetSlot().
> 
> But here I'm less convinced. It's not entirely clear to me that the only
> real reason for this to exist actually was TupleDescGetSlot(). OTOH, I
> can't really see any proper reason to have it either.

I have not been following the recent thread about the refactoring of
TupleSlot and such very closely, but if you don't plan to use
TupleTableSlot in FuncCallContext in the future, cannot this just go
away?  The function is not here anymore, so my take would be to get rid
of all things which relied on its presence.
--
Michael




Re: when set track_commit_timestamp on, database system abort startup

2018-09-25 Thread Michael Paquier
On Tue, Sep 25, 2018 at 11:54:53PM +0900, Masahiko Sawada wrote:
> Can we use "XLOG_PARAMETER_CHANGE record" instead of "record
> XLOG_PARAMETER_CHANGE" in the second hunk, because the comment of the
> first hunk uses it?  The other part of this patch looks good to me.

I've used your suggestion, and committed down to 9.5, where this fix
applies.  The TAP test is only present in v11 and above.  I have also
noticed some comments which became incorrect after the fix, so I changed
them on the way.
--
Michael




Re: Shared buffer access rule violations?

2018-09-25 Thread Thomas Munro
On Thu, Aug 9, 2018 at 12:59 PM Asim R P  wrote:
> On Tue, Aug 7, 2018 at 7:00 PM, Peter Geoghegan  wrote:
> > I wonder if it would be a better idea to enable Valgrind's memcheck to
> > mark buffers as read-only or read-write. We've considered doing
> > something like that for years, but for whatever reason nobody followed
> > through.
>
> Basic question: how do you mark buffers as read-only using memcheck
> tool?  Running installcheck with valgrind didn't uncover any errors:
>
> valgrind --trace-children=yes pg_ctl -D datadir start
> make installcheck-parallel

Presumably with VALGRIND_xxx macros, but is there a way to make that
work for shared memory?

I like the mprotect() patch.  This could be enabled on a build farm
animal.  I guess it would either fail explicitly or behave incorrectly
for VM page size > BLCKSZ depending on OS, but at least on Linux/amd64
you have to go out of your way to get pages > 4KB so that seems OK for
a debugging feature.

I would like to do something similar with DSA, to electrify
superblocks and whole segments that are freed.  That would have caught
a recent bug in DSA itself and in client code.
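As a standalone toy (not the patch itself), the basic mprotect() device works like this on a typical Linux system: revoke write permission while a page should be read-only so stray writes fault immediately, and restore it around a legitimate modification.  Note that mprotect() on malloc'd memory is not strictly portable; the real patch operates on mmap'd shared memory.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Toy version of "electrifying" a buffer: revoke write permission while
 * the page should be read-only, restore it before a legitimate write.
 * Returns 0 on success, -1 on any allocation/mprotect failure.
 */
int
electrify_roundtrip(void)
{
	long		pagesize = sysconf(_SC_PAGESIZE);
	void	   *buf;

	/* Simulate one shared buffer: a single page-aligned block. */
	if (posix_memalign(&buf, pagesize, pagesize) != 0)
		return -1;

	/* "Unpinned": a stray write to buf would now die with SIGSEGV. */
	if (mprotect(buf, pagesize, PROT_READ) != 0)
		return -1;

	/* "Pin for modification": restore write access, then write. */
	if (mprotect(buf, pagesize, PROT_READ | PROT_WRITE) != 0)
		return -1;
	((char *) buf)[0] = 1;

	free(buf);
	return 0;
}
```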

-- 
Thomas Munro
http://www.enterprisedb.com



RE: libpq debug log

2018-09-25 Thread Iwata, Aya
Hi, 

Sorry for my late response.

> Between perf/systemtap/dtrace and wireshark, you can already do pretty much
> all of that.  Have you looked at those and found anything missing?

These commands do provide detailed information.  However, I think a
trace log is still necessary, for the following reasons.

1. Ease of use for users
The log needs to present information in a form that database users can
readily understand.  This trace log is retrieved on the application
server side, so not only database developers but also application
developers will collect and check it.  Also, some of those commands
report internal PostgreSQL function names; the trace log would be more
useful for users who are not familiar with PostgreSQL internals
(e.g. application developers).


2. Obtain the same information on every OS
Some of those commands depend on the OS.  I think it is important that
the trace log provides consistent information across operating systems.

Regards,
Aya Iwata




Re: shared-memory based stats collector

2018-09-25 Thread Kyotaro HORIGUCHI
Hello. Thank you for the comments.

At Thu, 20 Sep 2018 10:37:24 -0700, Andres Freund  wrote in 
<20180920173724.5w2n2nwkxtyi4...@alap3.anarazel.de>
> Hi,
> 
> On 2018-09-20 09:55:27 +0200, Antonin Houska wrote:
> > I've spent some time reviewing this version.
> > 
> > Design
> > --
> > 
> > 1. Even with your patch the stats collector still uses an UDP socket to
> >receive data. Now that the shared memory API is there, shouldn't the
> >messages be sent via shared memory queue? [1] That would increase the
> >reliability of message delivery.
> > 
> >I can actually imagine backends inserting data into the shared hash 
> > tables
> >themselves, but that might make them wait if the same entries are 
> > accessed
> >by another backend. It should be much cheaper just to insert message into
> >the queue and let the collector process it. In future version the 
> > collector
> >can launch parallel workers so that writes by backends do not get blocked
> >due to full queue.
> 
> I don't think either of these is right. I think it's crucial to get rid
> of the UDP socket, but I think using a shmem queue is the wrong
> approach. Not just because postgres' shm_mq is single-reader/writer, but
> also because it's plainly unnecessary.  Backends should attempt to
> update the shared hashtable, but acquire the necessary lock
> conditionally, and leave the pending updates of the shared hashtable to
> a later time if they cannot acquire the lock.

OK.  I just intended to avoid reading many bytes from a file, and
thought that the writer side could be resolved later.

Currently, locks on the shared stats table are acquired by the dshash
mechanism in a partition-wise manner.  The number of partitions is
currently fixed at 2^7 = 128, but writes to the same table conflict
with each other regardless of the number of partitions.  As a first
step, I'm going to add conditional-locking capability to
dshash_find_or_insert, and have each backend hold a queue of its
pending updates.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: transction_timestamp() inside of procedures

2018-09-25 Thread Bruce Momjian
On Tue, Sep 25, 2018 at 03:01:31PM -0700, David G. Johnston wrote:
> On Sat, Sep 22, 2018 at 5:27 PM, Bruce Momjian  wrote:
> 
> On Fri, Sep 21, 2018 at 06:35:02PM -0400, Bruce Momjian wrote:
> > Does the SQL standard have anything to say about CURRENT_TIMESTAMP in
> > procedures?  Do we need another function that does advance on procedure
> > commit?
> 
> I found a section in the SQL standards that talks about it, but I don't
> understand it.  Can I quote the paragraph here?
> 
> 
> I've seen others do it; and small sections of copyrighted material posted for
> commentary or critique would likely be covered by "fair use" exemptions.

Well, it is an entire paragraph, so it might be too much.  If you
download the zip file here:

http://www.wiscorp.com/sql200n.zip

and open 5CD2-02-Foundation-2006-01.pdf, at the top of page 289 under
General Rules, paragraph label 3 has the description.  It talks about
procedure statements and trigger functions, which seems promising.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: pgsql: Remove obsolete function TupleDescGetSlot().

2018-09-25 Thread Andres Freund
On 2018-09-26 09:04:14 +0900, Michael Paquier wrote:
> Hi Andres,
> 
> On Tue, Sep 25, 2018 at 11:39:05PM +, Andres Freund wrote:
> > Remove obsolete function TupleDescGetSlot().
> > 
> > TupleDescGetSlot() was kept around for backward compatibility for
> > user-written SRFs. With the TupleTableSlot abstraction work, that code
> > will need to be version specific anyway, so there's no point in
> > keeping the function around any longer.
> 
> There are still references in the code to this function, and a
> declaration of it:

Hrmpf :/. Thanks for catching.


> src/include/funcapi.h: * TupleTableSlot *TupleDescGetSlot(TupleDesc tupdesc) - Builds a
> src/include/funcapi.h:extern TupleTableSlot *TupleDescGetSlot(TupleDesc tupdesc);

These two clearly need to go.


> git grep TupleDescGetSlot
> doc/src/sgml/xfunc.sgml: * user-defined SRFs that use the deprecated TupleDescGetSlot().
> src/include/funcapi.h:   * user-defined SRFs that use the deprecated TupleDescGetSlot().

But here I'm less convinced. It's not entirely clear to me that the only
real reason for this to exist actually was TupleDescGetSlot(). OTOH, I
can't really see any proper reason to have it either.

Greetings,

Andres Freund



Re: TupleTableSlot abstraction

2018-09-25 Thread Andres Freund
On 2018-09-04 18:35:34 +0530, Amit Khandekar wrote:
> The attached v11 tar has the above set of changes.


> Subject: [PATCH 07/14] Restructure TupleTableSlot to allow tuples other than
>  HeapTuple

> +
> +/*
> + * This is a function used by all getattr() callbacks which deal with a heap
> + * tuple or some tuple format which can be represented as a heap tuple e.g. a
> + * minimal tuple.
> + *
> + * heap_getattr considers any attnum beyond the attributes available in the
> + * tuple as NULL. This function however returns the values of missing
> + * attributes from the tuple descriptor in that case. Also this function does
> + * not support extracting system attributes.
> + *
> + * If the attribute needs to be fetched from the tuple, the function fills in
> + tts_values and tts_isnull arrays up to the required attnum.
> + */
> +Datum
> +tts_heap_getattr_common(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
> + int attnum, bool *isnull)

I'm still *vehemently* opposed to the introduction of this.



> @@ -2024,7 +2024,18 @@ FormIndexDatum(IndexInfo *indexInfo,
>   Datum   iDatum;
 	boolisNull;
>  
> - if (keycol != 0)
> + if (keycol < 0)
> + {
> + HeapTupleTableSlot *hslot = (HeapTupleTableSlot *)slot;
> +
> + /* Only heap tuples have system attributes. */
> + Assert(TTS_IS_HEAPTUPLE(slot) || TTS_IS_BUFFERTUPLE(slot));
> +
> + iDatum = heap_getsysattr(hslot->tuple, keycol,
> +  slot->tts_tupleDescriptor,
> +  &isNull);
> + }
> + else if (keycol != 0)
>   {
>   /*
>  * Plain index column; get the value we need directly from the

This now should access the system column via the slot, right?  There's
other places like this IIRC.



> diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
> index 9d6e25a..1b4e726 100644
> --- a/src/backend/executor/execExprInterp.c
> +++ b/src/backend/executor/execExprInterp.c
> @@ -490,54 +490,21 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
>  
>   EEO_CASE(EEOP_INNER_SYSVAR)
>   {
> - int attnum = op->d.var.attnum;
> - Datum   d;
> -
> - /* these asserts must match defenses in slot_getattr */
> - Assert(innerslot->tts_tuple != NULL);
> - Assert(innerslot->tts_tuple != &(innerslot->tts_minhdr));
> -
> - /* heap_getsysattr has sufficient defenses against bad attnums */
> - d = heap_getsysattr(innerslot->tts_tuple, attnum,
> - innerslot->tts_tupleDescriptor,
> - op->resnull);
> - *op->resvalue = d;
> + ExecEvalSysVar(state, op, econtext, innerslot);

These changes should be in a separate commit.


> +const TupleTableSlotOps TTSOpsHeapTuple = {
> + sizeof(HeapTupleTableSlot),
> + .init = tts_heap_init,

The first field should also use a named field (same in following cases).


> @@ -185,6 +1020,7 @@ ExecResetTupleTable(List *tupleTable,	/* tuple table */
>   {
>   TupleTableSlot *slot = lfirst_node(TupleTableSlot, lc);
>  
> + slot->tts_cb->release(slot);
>   /* Always release resources and reset the slot to empty */
>   ExecClearTuple(slot);
>   if (slot->tts_tupleDescriptor)
> @@ -240,6 +1076,7 @@ void
>  ExecDropSingleTupleTableSlot(TupleTableSlot *slot)
>  {
>   /* This should match ExecResetTupleTable's processing of one slot */
> + slot->tts_cb->release(slot);
>   Assert(IsA(slot, TupleTableSlot));
>   ExecClearTuple(slot);
>   if (slot->tts_tupleDescriptor)

ISTM that release should be called *after* clearing the slot.


> @@ -56,11 +56,28 @@ tqueueReceiveSlot(TupleTableSlot *slot, DestReceiver *self)
>   TQueueDestReceiver *tqueue = (TQueueDestReceiver *) self;
>   HeapTuple   tuple;
>   shm_mq_result result;
> + booltuple_copied = false;
> +
> + /*
> +  * Get the tuple out of the slot, if necessary converting the slot's
> +  * contents into a heap tuple by copying.  In the latter case we need
> +  * to free the copy.
> +  */
> + if (TTS_IS_HEAPTUPLE(slot) || TTS_IS_BUFFERTUPLE(slot))
> + {
> + tuple = ExecFetchSlotTuple(slot, true);
> + tuple_copied = false;
> + }
> + else
> + {
> + tuple = ExecCopySlotTuple(slot);
> + 

Re: pgsql: Remove obsolete function TupleDescGetSlot().

2018-09-25 Thread Michael Paquier
Hi Andres,

On Tue, Sep 25, 2018 at 11:39:05PM +, Andres Freund wrote:
> Remove obsolete function TupleDescGetSlot().
> 
> TupleDescGetSlot() was kept around for backward compatibility for
> user-written SRFs. With the TupleTableSlot abstraction work, that code
> will need to be version specific anyway, so there's no point in
> keeping the function around any longer.

There are still references in the code to this function, and a
declaration of it:
git grep TupleDescGetSlot
doc/src/sgml/xfunc.sgml: * user-defined SRFs that use the deprecated TupleDescGetSlot().
src/include/funcapi.h:   * user-defined SRFs that use the deprecated TupleDescGetSlot().
src/include/funcapi.h: * TupleTableSlot *TupleDescGetSlot(TupleDesc tupdesc) - Builds a
src/include/funcapi.h:extern TupleTableSlot *TupleDescGetSlot(TupleDesc tupdesc);
--
Michael




Re: TupleTableSlot abstraction

2018-09-25 Thread Andres Freund
Hi,

On 2018-09-04 18:35:34 +0530, Amit Khandekar wrote:
> Subject: [PATCH 05/14] Use tts_flags instead of previous bool members
> 
> Pack the boolean members in TupleTableSlot into a 16 bit tts_flags.
> This reduces the size of TupleTableSlot since each bool member takes
> at least a byte on the platforms where bool is defined as a char.
> 
> Ashutosh Bapat and Andres Freund

> +
> +/* true = slot is empty */
> +#define  TTS_ISEMPTY (1 << 1)
> +#define IS_TTS_EMPTY(slot)   ((slot)->tts_flags & TTS_ISEMPTY)
> +#define SET_TTS_EMPTY(slot) ((slot)->tts_flags |= TTS_ISEMPTY)
> +#define RESET_TTS_EMPTY(slot) ((slot)->tts_flags &= ~TTS_ISEMPTY)

The flag stuff is the right way, but I'm a bit dubious about the
accessor functions.  I can see open-coding the accesses, using the
macros, or potentially going towards using bitfields.

Any comments?
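For comparison, the three candidate styles look roughly like this for the "empty" flag (abbreviated illustration, not the patch's exact definitions):

```c
#include <stdbool.h>
#include <stdint.h>

#define TTS_FLAG_EMPTY	(1 << 1)

typedef struct SlotA
{
	uint16_t	tts_flags;
} SlotA;

/*
 * Option 1: open-code the test at each call site:
 *     if (slot->tts_flags & TTS_FLAG_EMPTY) ...
 *
 * Option 2: accessor macros wrapping the same operations:
 */
#define TTS_EMPTY(slot)			(((slot)->tts_flags & TTS_FLAG_EMPTY) != 0)
#define TTS_SET_EMPTY(slot)		((slot)->tts_flags |= TTS_FLAG_EMPTY)
#define TTS_CLEAR_EMPTY(slot)	((slot)->tts_flags &= ~TTS_FLAG_EMPTY)

/*
 * Option 3: bitfields -- compact source, but field layout is
 * implementation-defined, which can matter for structs shared across
 * compilers or inspected by debugging tools.
 */
typedef struct SlotC
{
	bool		tts_isempty:1;
	bool		tts_shouldfree:1;
} SlotC;
```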

- Andres



Re: transction_timestamp() inside of procedures

2018-09-25 Thread Andres Freund
On 2018-09-25 14:50:02 -0700, Andres Freund wrote:
> ISTM this is an issue that belongs on the open items list. Peter, could
> you comment?

Done so, per discussion with the rest of the RMT.



Re: TupleTableSlot abstraction

2018-09-25 Thread Andres Freund
On 2018-09-04 18:35:34 +0530, Amit Khandekar wrote:
> The attached v11 tar has the above set of changes.

- I've pushed 0003 - although that commit seems to have included a lot
  of things unrelated to the commit message. I guess two patches have
  accidentally been merged?  Could you split out the part of the changes
  that was mis-squashed?
- I've pushed an extended version of 0001.
- I've pushed 0002, after some minor polishing
- I've pushed 0004

Greetings,

Andres Freund



Re: PG vs macOS Mojave

2018-09-25 Thread Tom Lane
Thomas Munro  writes:
> ... and now I'm on macOS 10.14.  I removed all traces of MacPorts from
> my PATH and configure line to test this.  --with-tcl worked, but
> --with-perl couldn't find "perl.h".  Then I realised that it was
> because I was still on Xcode 9, so I was in for another ~5GB of
> upgrading to get to Xcode 10.  After that, it all worked fine.

Interesting.  IME, updating Xcode first usually gives you a working
system, for some value of "work".  Right now I've updated to Xcode
10 but not yet Mojave on my main laptop, and PG works except that
the linker warns about out-of-sync library files during some link
steps.  I don't see that on the other machine where I've done both
updates.  I don't recall ever having tried the other order, but
evidently it (sometimes?) has some issues.

Maybe we should make the test in configure for whether to prepend
PG_SYSROOT be more specific, ie

-if test -d "$PG_SYSROOT$perl_archlibexp" ; then
+if test -f "$PG_SYSROOT$perl_archlibexp/CORE/perl.h" ; then
   perl_includedir="$PG_SYSROOT$perl_archlibexp"

I think what must've happened to you is that in Xcode 9, the
$PG_SYSROOT$perl_archlibexp directory exists but doesn't contain
the header files.  Now, once you'd updated the main system,
neither would $perl_archlibexp, so you're going to fail either
way in that state :-(.  But this feels a bit safer somehow.

regards, tom lane



Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2018-09-25 Thread Tom Lane
Heikki Linnakangas  writes:
> On 27/06/18 21:57, Daniel Gustafsson wrote:
>> Courtesy of the ever-present Murphy I managed to forget some testcode in
>> src/backend/Makefile which broke compilation for builds without secure
>> transport, attached v8 patch fixes that.

> I've read through this patch now in more detail. Looks pretty good, but 
> I have a laundry list of little things below. The big missing item is 
> documentation.

I did some simple testing on this today.  The patch has bit-rotted,
mostly as a consequence of 77291139c which removed tls-unique channel
binding.  That's probably a good thing for the Secure Transport code,
since it wasn't supporting that anyway, but it needs to be updated.

I ripped out the failing-to-compile code (maybe that was too simplistic?)
but could not figure out whether there was anything still useful about
the diffs in ssl/t/002_scram.pl, so I left those out.  Anyway, the
attached update applies cleanly to today's HEAD, and the openssl part
still compiles cleanly and passes regression tests.  The securetransport
part compiles cleanly, but it fails 8 of the 68 tests in 001_ssltests.pl.
I'm not sure how many of those might be new and how many were there as
of the previous submission.

> The "-framework" option, being added to CFLAGS, is clang specific. I 
> think we need some more autoconf magic, to make this work with gcc.

AFAIK, Apple's "gcc" *is* clang; it certainly has no problem with
these switches (and I rather doubt there's any hope of linking to
Secure Transport without 'em).  However, I agree that the technique
of teaching random makefiles about this explicitly is mighty ugly,
and we ought to put the logic into configure instead, if possible.
Possibly it could be modeled on LDAP_LIBS or ICU_LIBS, ie configure
sets up a macro that pulls in the openssl libraries, or the
secure transport libraries, or $other-implementation, or nothing.
The CFLAGS hacks need similar treatment (btw, should they be
CPPFLAGS hacks instead?  I think they're morally equivalent to
-I switches).  And avoid using "override" if at all possible.

Some other comments:

* I notice that contrib/sslinfo hasn't been touched.  That probably
ought to be on the to-do list for this, though I don't insist that
it needs to be in the first commit.

* I'm not sure what the "keychain" additions to test/ssl/Makefile
are for, but they don't seem to be wired up to anything.  Running
"make keychains" has no impact on the test failures, either.

* I do not like making the libpq connection parameters for this be
#ifdef'd out when the option isn't selected.  I believe project policy is
that we accept all parameters always, and either ignore unsupported ones,
or throw errors if they're set to nondefault values (cf. the comment above
the sslmode parameter in fe-connect.c).  I realize that some earlier
patches like GSSAPI did not get the word on this, but that's not a reason
to emulate their mistake.  I'm not sure about the equivalent decision
w.r.t. backend GUCs, but we need to figure that out.

* In place of changes like
-#ifdef USE_SSL
+#if defined(USE_SSL) && defined(USE_OPENSSL)
I'd be inclined to just do "#ifdef USE_OPENSSL", ie assume that macro
can't be set without USE_SSL.

regards, tom lane

diff --git a/configure b/configure
index 23ebfa8..6bdc96a 100755
--- a/configure
+++ b/configure
@@ -708,6 +708,7 @@ UUID_EXTRA_OBJS
 with_uuid
 with_systemd
 with_selinux
+with_securetransport
 with_openssl
 with_ldap
 with_krb_srvnam
@@ -854,6 +855,7 @@ with_bsd_auth
 with_ldap
 with_bonjour
 with_openssl
+with_securetransport
 with_selinux
 with_systemd
 with_readline
@@ -1554,6 +1556,7 @@ Optional Packages:
   --with-ldap build with LDAP support
   --with-bonjour  build with Bonjour support
   --with-openssl  build with OpenSSL support
+  --with-securetransport  build with Apple Secure Transport support
   --with-selinux  build with SELinux support
   --with-systemd  build with systemd support
   --without-readline  do not use GNU Readline nor BSD Libedit for editing
@@ -7968,6 +7971,41 @@ $as_echo "$with_openssl" >&6; }
 
 
 #
+# Apple Secure Transport
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Apple Secure Transport support" >&5
+$as_echo_n "checking whether to build with Apple Secure Transport support... " >&6; }
+
+
+
+# Check whether --with-securetransport was given.
+if test "${with_securetransport+set}" = set; then :
+  withval=$with_securetransport;
+  case $withval in
+yes)
+
+$as_echo "#define USE_SECURETRANSPORT 1" >>confdefs.h
+
+  ;;
+no)
+  :
+  ;;
+*)
+  as_fn_error $? "no argument expected for --with-securetransport option" "$LINENO" 5
+  ;;
+  esac
+
+else
+  with_securetransport=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_securetransport" >&5
+$as_echo "$with_securetransport" >&6; }
+
+
+#
 # SELinux
 #
 { $as_echo 

Re: PG vs macOS Mojave

2018-09-25 Thread Thomas Munro
On Wed, Sep 26, 2018 at 3:14 AM Tom Lane  wrote:
> Thomas Munro  writes:
> > On Tue, Sep 25, 2018 at 4:49 PM Tom Lane  wrote:
> >> I've tested this on all the macOS versions I have at hand, and it
> >> doesn't seem to break anything.  Only part (1) could possibly
> >> affect other platforms, and that seems safe enough.
> >>
> >> I'd like to commit and backpatch this, because otherwise longfin
> >> is going to start falling over when I upgrade its host to Mojave.
>
> > Looks good on this 10.13.4 system.  About to upgrade to 10.14...
>
> Thanks for testing!  I'll set to work on back-patching that.

... and now I'm on macOS 10.14.  I removed all traces of MacPorts from
my PATH and configure line to test this.  --with-tcl worked, but
--with-perl couldn't find "perl.h".  Then I realised that it was
because I was still on Xcode 9, so I was in for another ~5GB of
upgrading to get to Xcode 10.  After that, it all worked fine.


--
Thomas Munro
http://www.enterprisedb.com



Re: Allowing printf("%m") only where it actually works

2018-09-25 Thread Michael Paquier
On Tue, Sep 25, 2018 at 12:05:42PM -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
>> Actually I think it *is* useful to do it like this, because then the
>> user knows to fix the netmsg.dll problem so that they can continue to
>> investigate the winsock problem.  If we don't report the secondary error
>> message, how are users going to figure out how to fix the problem?
> 
> OK, I'm fine with doing it like that if people want it.

+1.
--
Michael




Re: Proposal for Signal Detection Refactoring

2018-09-25 Thread Michael Paquier
On Mon, Sep 24, 2018 at 09:27:35PM -0700, Andres Freund wrote:
> On 2018-09-25 11:50:47 +0900, Michael Paquier wrote:
>> PGDLLIMPORT changes don't get back-patched as well...
> 
> We've been more aggressive with that lately, and I think that's good. It
> just is an unnecessary portability barrier for extensions to be
> judicious in applying that.

Agreed.  Are there any objections with the proposal of changing the
interruption pending flags in miscadmin.h to use sig_atomic_t?
ClientConnectionLost would gain PGDLLIMPORT at the same time.
--
Michael




Re: Slotification of partition tuple conversion

2018-09-25 Thread Andres Freund
Hi Amit,

Could you rebase this patch, it doesn't apply anymore.

Greetings,

Andres Freund



Re: Calculate total_table_pages after set_base_rel_sizes()

2018-09-25 Thread David Rowley
On Tue, 25 Sep 2018 at 10:23, Edmund Horner  wrote:
> I have a small tweak.  In make_one_rel, we currently have:
>
> /*
>  * Compute size estimates and consider_parallel flags for each base rel,
>  * then generate access paths.
>  */
> set_base_rel_sizes(root);
> set_base_rel_pathlists(root);
>
> Your patch inserts code between the two lines.  I think the comment should be 
> split too.
>
> /* Compute size estimates and consider_parallel flags for each base rel. 
> */
> set_base_rel_sizes(root);
>
> // NEW CODE
>
> /* Generate access paths. */
> set_base_rel_pathlists(root);

Thanks for looking at this.

I've changed that in the attached updated patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v2-0001-Calculate-total_table_pages-after-set_base_rel_si.patch
Description: Binary data


Re: transction_timestamp() inside of procedures

2018-09-25 Thread David G. Johnston
On Sat, Sep 22, 2018 at 5:27 PM, Bruce Momjian  wrote:

> On Fri, Sep 21, 2018 at 06:35:02PM -0400, Bruce Momjian wrote:
> > Does the SQL standard have anything to say about CURRENT_TIMESTAMP in
> > procedures?  Do we need another function that does advance on procedure
> > commit?
>
> I found a section in the SQL standards that talks about it, but I don't
> understand it.  Can I quote the paragraph here?


I've seen others do it; and small sections of copyrighted material posted
for commentary or critique would likely be covered by "fair use" exemptions.

David J.


Please, can I be a mentor for Google Code In

2018-09-25 Thread Saheed Bolarinwa
Hello,
I am a Master's degree student at the University of Pécs, Hungary.
I would like to be a mentor for Google Code-in for the PostgreSQL project.

Please, how can I apply?

Thank you.

Saheed


Re: transction_timestamp() inside of procedures

2018-09-25 Thread Andres Freund
Hi,

On 2018-09-20 19:40:40 -0400, Bruce Momjian wrote:
> This function shows that only clock_timestamp() advances inside a
> procedure, not statement_timestamp() or transaction_timestamp():
> 
>   CREATE OR REPLACE PROCEDURE test_timestamp () AS $$
>   DECLARE
>   str TEXT;
>   BEGIN
>   WHILE TRUE LOOP
>   -- clock_timestamp() is updated on every loop
>   SELECT clock_timestamp() INTO str;
>   RAISE NOTICE 'clock   %', str;
>   SELECT statement_timestamp() INTO str;
>   RAISE NOTICE 'statement   %', str;
>   SELECT transaction_timestamp() INTO str;
>   RAISE NOTICE 'transaction %', str;
>   COMMIT;
>   
>   PERFORM pg_sleep(2);
>   END LOOP;
>   END
>   $$ LANGUAGE plpgsql;
> 
>   CALL test_timestamp();
>   NOTICE:  clock   2018-09-20 19:38:22.575794-04
>   NOTICE:  statement   2018-09-20 19:38:22.575685-04
>   NOTICE:  transaction 2018-09-20 19:38:22.575685-04
> 
> -->   NOTICE:  clock   2018-09-20 19:38:24.578027-04
>   NOTICE:  statement   2018-09-20 19:38:22.575685-04
>   NOTICE:  transaction 2018-09-20 19:38:22.575685-04
> 
> This surprised me since I expected a new timestamp after commit.  Is
> this something we want to change or document?  Are there other
> per-transaction behaviors we should adjust?

ISTM this is an issue that belongs on the open items list. Peter, could
you comment?

Greetings,

Andres Freund



clarify documentation of BGW_NEVER_RESTART ?

2018-09-25 Thread Chapman Flack
I did not notice until today that there is some ambiguity in
this paragraph:

  bgw_restart_time is the interval, in seconds, that postgres should
  wait before restarting the process, in case it crashes. It can be
  any positive value, or BGW_NEVER_RESTART, indicating not to restart
  the process in case of a crash.

I had been reading "in case _it_ crashes" and "in case of _a_ crash"
as "in case _the background worker_ crashes", so I assumed with
BGW_NEVER_RESTART I was saying I don't want my worker restarted if
_it_ flakes out while PG is otherwise operating normally.

But I was surprised when the unrelated crash of a different, normal
backend left my background worker killed and never restarted. I had
always regarded the fatal-error kick-out-all-backends-and-recover
handling as essentially equivalent to a PG restart, so I had expected
it to start the bgworker over just as a real restart would.

But sure enough, ResetBackgroundWorkerCrashTimes() gets called in
that case, and treats every worker with BGW_NEVER_RESTART as gone
and forgotten. So it seems the "it" in "it crashes" can be "the
background worker" or "postgres itself" or "any shmem-connected
backend".

If the wording fooled me it might fool somebody else too. I can
work on wordsmithing a patch to match the doc to the behavior,
but wanted first to check that the behavior is what was intended.

-Chap



Re: Query is over 2x slower with jit=on

2018-09-25 Thread Andres Freund
Hi,

On 2018-09-25 12:50:34 -0700, Andres Freund wrote:
> On 2018-09-25 01:47:49 -0700, Andres Freund wrote:
> > Planning Time: 0.152 ms
> > JIT:
> >   Functions: 4
> >   Options: Inlining true, Optimization true, Expressions true, Deforming 
> > true
> >   Timing: Generation 0.955 ms, Inlining 157.422 ms, Optimization 11.751 ms, 
> > Emission 11.454 ms, Total 181.582 ms
> > Execution Time: 184.197 ms
> 
> With parallelism on, this is the aggregated cost of doing JIT
> compilation. I wonder if, in VERBOSE mode, we should separately display
> the cost of JIT for the leader.  Comments?

I've pushed the change without that bit - it's just a few additional
lines if we want to change that.

Greetings,

Andres Freund



Re: Query is over 2x slower with jit=on

2018-09-25 Thread Andres Freund
Hi,

On 2018-09-25 01:47:49 -0700, Andres Freund wrote:
> Forcing parallelism and JIT to be on, I get:
> 
> postgres[20216][1]=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS) SELECT count(*) FROM 
> pg_class;
> 
> Finalize Aggregate  (cost=15.43..15.44 rows=1 width=8) (actual 
> time=183.282..183.282 rows=1 loops=1)
>   Output: count(*)
>   Buffers: shared hit=13
>   ->  Gather  (cost=15.41..15.42 rows=2 width=8) (actual 
> time=152.835..152.904 rows=2 loops=1)
> Output: (PARTIAL count(*))
> Workers Planned: 2
> Workers Launched: 2
> JIT for worker 0:
>   Functions: 2
>   Options: Inlining true, Optimization true, Expressions true, 
> Deforming true
>   Timing: Generation 0.480 ms, Inlining 78.789 ms, Optimization 5.797 
> ms, Emission 5.554 ms, Total 90.620 ms
> JIT for worker 1:
>   Functions: 2
>   Options: Inlining true, Optimization true, Expressions true, 
> Deforming true
>   Timing: Generation 0.475 ms, Inlining 78.632 ms, Optimization 5.954 
> ms, Emission 5.900 ms, Total 90.962 ms
> Buffers: shared hit=13
> ->  Partial Aggregate  (cost=15.41..15.42 rows=1 width=8) (actual 
> time=90.608..90.609 rows=1 loops=2)
>   Output: PARTIAL count(*)
>   Buffers: shared hit=13
>   Worker 0: actual time=90.426..90.426 rows=1 loops=1
> Buffers: shared hit=7
>   Worker 1: actual time=90.791..90.792 rows=1 loops=1
> Buffers: shared hit=6
>   ->  Parallel Seq Scan on pg_catalog.pg_class  (cost=0.00..14.93 
> rows=193 width=0) (actual time=0.006..0.048 rows=193 loops=2)
> Output: relname, relnamespace, reltype, reloftype, 
> relowner, relam, relfilenode, reltablespace, relpages, reltuples, 
> relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, 
> relkind, relnatts, relchecks, relhasoids, relhasrules, relhastriggers, 
> relhassubclass, relrowsecurity, relfo
> Buffers: shared hit=13
> Worker 0: actual time=0.006..0.047 rows=188 loops=1
>   Buffers: shared hit=7
> Worker 1: actual time=0.006..0.048 rows=198 loops=1
>   Buffers: shared hit=6
> Planning Time: 0.152 ms
> JIT:
>   Functions: 4
>   Options: Inlining true, Optimization true, Expressions true, Deforming true
>   Timing: Generation 0.955 ms, Inlining 157.422 ms, Optimization 11.751 ms, 
> Emission 11.454 ms, Total 181.582 ms
> Execution Time: 184.197 ms

With parallelism on, this is the aggregated cost of doing JIT
compilation. I wonder if, in VERBOSE mode, we should separately display
the cost of JIT for the leader.  Comments?

- Andres



Re: txid_status returns NULL for recently committed transactions

2018-09-25 Thread Michail Nikolaev
Hi, thanks for the reply!

> What are you using it for?

I want to use it to validate the status of related entities in another
database (a queue) within a short interval after a PG transaction commits
or rolls back.

> I can't reproduce that...

Yes, it happens only with one cluster. All others work as expected.

> Your mailer appears to do very annoying things by converting numbers to
phone numbers.

Sorry.

> It's from the last epoch. Plain xids are 32bit wide, the epochs deal
> with values that are bigger. And 2207340131 is less than 2^31 in the
> past.

Yes, and probably it is cause of the issue.

ShmemVariableCache->oldestClogXid = 2207340131
xid_epoch = 1
xid = 150227913
TransactionIdPrecedes(xid, ShmemVariableCache->oldestClogXid) returns TRUE,
then TransactionIdInRecentPast returns FALSE and txid_status returns NULL.

But xid_epoch (1) and xid (150227913) are correct values from my active (or
recently committed) transaction.

>> SELECT txid_status(BIGINT '4294967295') -> 'committed'.
>> SELECT txid_status(BIGINT '4294967296') -> NULL
> Why do you think that is the wrong result?

Let's leave it for now (maybe my misunderstanding).  I think it is better to
deal with the "txid_status(txid_current()) -> NULL" issue first.

Thanks,
Michail.


Re: txid_status returns NULL for recently committed transactions

2018-09-25 Thread Andres Freund
Hi,

On 2018-09-25 19:47:40 +0300, Michail Nikolaev wrote:
> I see strange behaviour of txid_status with one of my production servers.

What are you using it for?


> SELECT txid_status(txid_current()) -> NULL

I can't reproduce that...


> SELECT txid_current() -> 4447342811
> 
> It also returns null for recently committed and aborted transactions.
> 
> SELECT datname, age(datfrozenxid), datfrozenxid FROM pg_database;
> template   0239961994 4207340131
> postgres   289957756 4157344369 <(415)%20734-4369>
> template1 289957661 4157344464 <(415)%20734-4464>
> project  682347934 3764954191

"0239961994" - I can't see how a leading zero is possible.

Your mailer appears to do very annoying things by converting numbers to
phone numbers.


> GDB shows it happens because of TransactionIdPrecedes(xid,
> ShmemVariableCache->oldestClogXid)) at txid.c:155.
> 
> xid_epoch = 1
> xid = 150227913
> now_epoch = 1
> now_epoch_last_xid = 150468261
> ShmemVariableCache->oldestClogXid = 2207340131 <(220)%20734-0131>
> 
> The strangest thing is ShmemVariableCache->oldestClogXid == 2207340131
> <(220)%20734-0131> which is greater than current xid without an epoch
> (150227913) .

It's from the last epoch.  Plain xids are 32bit wide, the epochs deal
with values that are bigger.  And 2207340131 is less than 2^31 in the
past.
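For readers following along, the modulo-2^31 comparison at the heart of this can be sketched as follows (simplified from the logic in PostgreSQL's transam.c; the real function also special-cases permanent XIDs below FirstNormalTransactionId):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Simplified modulo-2^31 XID comparison: id1 precedes id2 if the signed
 * 32-bit difference is negative, i.e. id2 is less than 2^31 "ahead" of id1
 * on the XID circle. */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    int32_t diff = (int32_t) (id1 - id2);

    return diff < 0;
}
```

With the values from this thread, xid_precedes(150227913, 2207340131) is true — so xid 150227913 is treated as older than oldestClogXid — while xid_precedes(59856482, 2207340131) is false, because the distance crosses the 2^31 wraparound boundary.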

> Postgres was updated from 9.6 to PostgreSQL 10.4 (Ubuntu
> 10.4-2.pgdg14.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit.
> 
> Also, another small issue:
> 
> On the same database:
> SELECT txid_status(BIGINT '4294967295') -> 'committed'.
> SELECT txid_status(BIGINT '4294967296') -> NULL
> 
> Both tx ids are from my head and not valid.
> First 'committed' happens because of int32 overflow:
> TransactionIdPrecedes(59856482, 2207340131 <(220)%20734-0131>) == false

Why do you think that is the wrong result?

Greetings,

Andres Freund



Re: txid_status returns NULL for recently committed transactions

2018-09-25 Thread Michail Nikolaev
Sorry, some auto-formatting in the recent email.

Again with fixed formatting:

I see strange behaviour of txid_status with one of my production servers.

SELECT txid_status(txid_current()) -> NULL
SELECT txid_current() -> 4447342811

It also returns null for recently committed and aborted transactions.

SELECT datname, age(datfrozenxid), datfrozenxid FROM pg_database;
template 0239961994 4207340131
postgres 289957756 4157344369
template1 289957661 4157344464
project 682347934 3764954191

GDB shows it happens because of TransactionIdPrecedes(xid,
ShmemVariableCache->oldestClogXid)) at txid.c:155.

xid_epoch = 1
xid = 150227913
now_epoch = 1
now_epoch_last_xid = 150468261
ShmemVariableCache->oldestClogXid = 2207340131

The strangest thing is ShmemVariableCache->oldestClogXid == 2207340131
which is greater than current xid without an epoch (150227913) .

Postgres was updated from 9.6 to PostgreSQL 10.4 (Ubuntu
10.4-2.pgdg14.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit.

Also, another small issue:

On the same database:
SELECT txid_status(BIGINT '4294967295') -> 'committed'.
SELECT txid_status(BIGINT '4294967296') -> NULL

Both tx ids are from my head and not valid.
First 'committed' happens because of int32 overflow:
TransactionIdPrecedes(59856482, 2207340131) == false

Could anyone help me with understanding of txid_status behaviour?

Thanks,
Michail.



вт, 25 сент. 2018 г. в 19:47, Michail Nikolaev :

> Hello everyone.
>
> I see strange behaviour of txid_status with one of my production servers.
>
> SELECT txid_status(txid_current()) -> NULL
> SELECT txid_current() -> 4447342811
>
> It also returns null for recently committed and aborted transactions.
>
> SELECT datname, age(datfrozenxid), datfrozenxid FROM pg_database;
> template   0239961994 4207340131
> postgres   289957756 4157344369 <(415)%20734-4369>
> template1 289957661 4157344464 <(415)%20734-4464>
> project  682347934 3764954191
>
> GDB shows it happens because of TransactionIdPrecedes(xid,
> ShmemVariableCache->oldestClogXid)) at txid.c:155.
>
> xid_epoch = 1
> xid = 150227913
> now_epoch = 1
> now_epoch_last_xid = 150468261
> ShmemVariableCache->oldestClogXid = 2207340131 <(220)%20734-0131>
>
> The strangest thing is ShmemVariableCache->oldestClogXid == 2207340131
> <(220)%20734-0131> which is greater than current xid without an epoch
> (150227913) .
>
> Postgres was updated from 9.6 to PostgreSQL 10.4 (Ubuntu
> 10.4-2.pgdg14.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit.
>
> Also, another small issue:
>
> On the same database:
> SELECT txid_status(BIGINT '4294967295') -> 'committed'.
> SELECT txid_status(BIGINT '4294967296') -> NULL
>
> Both tx ids are from my head and not valid.
> First 'committed' happens because of int32 overflow:
> TransactionIdPrecedes(59856482, 2207340131 <(220)%20734-0131>) == false
>
> Could anyone help me with understanding of txid_status behaviour?
>
> Thanks,
> Michail.
>


Re: [PATCH] Include application_name in "connection authorized" log message

2018-09-25 Thread Don Seiler
On Mon, Sep 24, 2018 at 4:10 PM Stephen Frost  wrote:

>
> * Don Seiler (d...@seiler.us) wrote:
> >
> > OK I created a new function called clean_ascii() in common/string.c. I
> call
> > this from my new logic in postmaster.c as well as replacing the logic in
> > guc.c's check_application_name() and check_cluster_name().
>
> Since we're putting it into common/string.c (which seems pretty
> reasonable to me, at least), I went ahead and changed it to be
> 'pg_clean_ascii'.  I didn't see any other obvious cases where we could
> use this function (though typecmds.c does have an interesting ASCII
> check for type categories..).
>

Good idea, makes it all a bit more uniform.
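For readers skimming the thread, the essence of such an ASCII-scrubbing helper can be sketched like this (illustrative only, not the committed pg_clean_ascii code): every byte outside the printable-ASCII range is replaced in place with '?', so untrusted strings like application_name are safe to embed in a log line.

```c
/* Sketch of an ASCII scrubber in the spirit of the helper discussed here:
 * replace any byte outside the printable-ASCII range (32..126) with '?',
 * modifying the string in place. With a (possibly) signed char, bytes
 * >= 0x80 compare as negative and are replaced too. */
static void
clean_ascii_sketch(char *str)
{
    for (char *p = str; *p != '\0'; p++)
    {
        if (*p < 32 || *p > 126)
            *p = '?';
    }
}
```

For example, an application_name containing a UTF-8 "é" (two bytes) and a newline would come out with those three bytes turned into '?'.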


> Otherwise, I added some comments, added application_name to the
> replication 'connection authorized' messages (seems like we really
> should be consistent across all of them...), ran it through pgindent,
> and updated a variable name or two here and there.
>

This looks great. Thanks for cleaning it up and all your help along the way!


> > I've been fighting my own confusion with git and rebasing and fighting
> the
> > same conflicts over and over and over, but this patch should be what I
> > want. If anyone has time to review my git process, I would appreciate
> it. I
> > must be doing something wrong to have these same conflicts every time I
> > rebase (or I completely misunderstand what it actually does).
>
> I'd be happy to chat about it sometime, of course, just have to find
> time when we both have a free moment. :)
>

Gladly! Hopefully there is some other low-hanging fruit that would be
within my grasp to help out again in the future.


> Attached is the updated patch.  If you get a chance to look over it
> again and make sure it looks good to you, that'd be great.  I did a bit
> of testing of it myself but wouldn't complain if someone else wanted to
> also.
>

Reviewed and approved, for whatever my approval is worth. Thanks again!

Don.

-- 
Don Seiler
www.seiler.us


txid_status returns NULL for recently committed transactions

2018-09-25 Thread Michail Nikolaev
Hello everyone.

I see strange behaviour of txid_status with one of my production servers.

SELECT txid_status(txid_current()) -> NULL
SELECT txid_current() -> 4447342811

It also returns null for recently committed and aborted transactions.

SELECT datname, age(datfrozenxid), datfrozenxid FROM pg_database;
template   0239961994 4207340131
postgres   289957756 4157344369 <(415)%20734-4369>
template1 289957661 4157344464 <(415)%20734-4464>
project  682347934 3764954191

GDB shows it happens because of TransactionIdPrecedes(xid,
ShmemVariableCache->oldestClogXid)) at txid.c:155.

xid_epoch = 1
xid = 150227913
now_epoch = 1
now_epoch_last_xid = 150468261
ShmemVariableCache->oldestClogXid = 2207340131 <(220)%20734-0131>

The strangest thing is ShmemVariableCache->oldestClogXid == 2207340131
<(220)%20734-0131> which is greater than current xid without an epoch
(150227913) .

Postgres was updated from 9.6 to PostgreSQL 10.4 (Ubuntu
10.4-2.pgdg14.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit.

Also, another small issue:

On the same database:
SELECT txid_status(BIGINT '4294967295') -> 'committed'.
SELECT txid_status(BIGINT '4294967296') -> NULL

Both tx ids are from my head and not valid.
First 'committed' happens because of int32 overflow:
TransactionIdPrecedes(59856482, 2207340131 <(220)%20734-0131>) == false

Could anyone help me with understanding of txid_status behaviour?

Thanks,
Michail.


Re: Allowing printf("%m") only where it actually works

2018-09-25 Thread Tom Lane
Alvaro Herrera  writes:
> On 2018-Sep-25, Tom Lane wrote:
>> We could possibly write something like
>> 
>> sprintf(wserrbuf, "winsock error %d (could not load netmsg.dll to translate:
>> error code %lu)", err, GetLastError());
>> 
>> but I'm unconvinced that that's useful.

> Actually I think it *is* useful to do it like this, because then the
> user knows to fix the netmsg.dll problem so that they can continue to
> investigate the winsock problem.  If we don't report the secondary error
> message, how are users going to figure out how to fix the problem?

OK, I'm fine with doing it like that if people want it.

regards, tom lane



Re: FETCH FIRST clause PERCENT option

2018-09-25 Thread Mark Dilger



> On Sep 25, 2018, at 8:08 AM, Mark Dilger  wrote:
> 
> 
> 
>> On Sep 25, 2018, at 5:07 AM, Surafel Temesgen  wrote:
>> 
>> hey
>> 
>> On 9/21/18, Mark Dilger  wrote:
>> 
>>> Surafel, there are no regression tests that I can see in your patch.  It
>>> would help if you added some, as then I could see precisely what behavior
>>> you are expecting.
>> Thank you for looking at it.  The attached patch adds regression tests.
>> 
> 
> Surafel,
> 
> I messed around with your changes to the grammar and it seems you don't
> need to add PERCENT as a reserved keyword.  Moving this to the unreserved
> keyword section does not cause any shift/reduce errors, and the regression
> tests still pass.  Relative to your patch v4, these changes help:

I spoke too soon.  The main regression tests pass, but your change to
src/test/modules/test_ddl_deparse/sql/create_table.sql per Thomas's
suggestion is no longer needed, since PERCENT no longer needs to be
quoted.

I recommend you also apply the following to your v4 patch, which just
rolls back that one change you made, and at least for me, is enough
to get `make check-world` to pass:

diff --git a/src/test/modules/test_ddl_deparse/sql/create_table.sql b/src/test/modules/test_ddl_deparse/sql/create_table.sql
index 4325de2d04..5e78452729 100644
--- a/src/test/modules/test_ddl_deparse/sql/create_table.sql
+++ b/src/test/modules/test_ddl_deparse/sql/create_table.sql
@@ -94,7 +94,7 @@ CREATE TABLE student (
 ) INHERITS (person);
 
 CREATE TABLE stud_emp (
-   "percent"   int4
+   percent int4
 ) INHERITS (emp, student);





Re: Allowing printf("%m") only where it actually works

2018-09-25 Thread Alvaro Herrera
On 2018-Sep-25, Tom Lane wrote:

> Michael Paquier  writes:

> > Ok.  I won't fight hard on that.  Why change the error message from
> > "could not load netmsg.dll" to "unrecognized winsock error" then?  The
> > original error string gives much more context.
> 
> As the code stands, what you'll get told about is the error code
> returned by the failed LoadLibrary call; the original winsock error
> code is reported nowhere.  I think that's backwards.

I agree that the winsock problem is the main one we should be reporting,
including its numeric error code.  Even if we can't translate it, the
numeric value can be translated by web-searching, if it comes to that.

> We could possibly write something like
> 
> sprintf(wserrbuf, "winsock error %d (could not load netmsg.dll to translate:
> error code %lu)", err, GetLastError());
> 
> but I'm unconvinced that that's useful.

Actually I think it *is* useful to do it like this, because then the
user knows to fix the netmsg.dll problem so that they can continue to
investigate the winsock problem.  If we don't report the secondary error
message, how are users going to figure out how to fix the problem?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PG vs macOS Mojave

2018-09-25 Thread Tom Lane
Thomas Munro  writes:
> On Tue, Sep 25, 2018 at 4:49 PM Tom Lane  wrote:
>> I've tested this on all the macOS versions I have at hand, and it
>> doesn't seem to break anything.  Only part (1) could possibly
>> affect other platforms, and that seems safe enough.
>> 
>> I'd like to commit and backpatch this, because otherwise longfin
>> is going to start falling over when I upgrade its host to Mojave.

> Looks good on this 10.13.4 system.  About to upgrade to 10.14...

Thanks for testing!  I'll set to work on back-patching that.

regards, tom lane



Re: FETCH FIRST clause PERCENT option

2018-09-25 Thread Mark Dilger



> On Sep 25, 2018, at 5:07 AM, Surafel Temesgen  wrote:
> 
> hey
> 
> On 9/21/18, Mark Dilger  wrote:
> 
>> Surafel, there are no regression tests that I can see in your patch.  It
>> would help if you added some, as then I could precisely what behavior you
>> are expecting.
> Thank you for looking at it.  The attached patch adds regression tests.
> 

Surafel,

I messed around with your changes to the grammar and it seems you don't
need to add PERCENT as a reserved keyword.  Moving this to the unreserved
keyword section does not cause any shift/reduce errors, and the regression
tests still pass.  Relative to your patch v4, these changes help:

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a97ee30924..9872cf7257 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -15186,6 +15186,7 @@ unreserved_keyword:
| PARTITION
| PASSING
| PASSWORD
+   | PERCENT
| PLANS
| POLICY
| PRECEDING
@@ -15465,7 +15466,6 @@ reserved_keyword:
| ONLY
| OR
| ORDER
-   | PERCENT
| PLACING
| PRIMARY
| REFERENCES
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 150d51df8f..f23fee0653 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -300,7 +300,7 @@ PG_KEYWORD("partial", PARTIAL, UNRESERVED_KEYWORD)
 PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD)
 PG_KEYWORD("passing", PASSING, UNRESERVED_KEYWORD)
 PG_KEYWORD("password", PASSWORD, UNRESERVED_KEYWORD)
-PG_KEYWORD("percent", PERCENT, RESERVED_KEYWORD)
+PG_KEYWORD("percent", PERCENT, UNRESERVED_KEYWORD)
 PG_KEYWORD("placing", PLACING, RESERVED_KEYWORD)
 PG_KEYWORD("plans", PLANS, UNRESERVED_KEYWORD)
 PG_KEYWORD("policy", POLICY, UNRESERVED_KEYWORD)




Re: "could not reattach to shared memory" on buildfarm member dory

2018-09-25 Thread Noah Misch
On Mon, Sep 24, 2018 at 01:53:05PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > In this proof of concept, the
> > postmaster does not close its copy of a backend socket until the backend
> > exits.
> 
> That seems unworkable because it would interfere with detection of client
> connection drops.  But since you say this is just a POC, maybe you
> intended to fix that?  It'd probably be all right for the postmaster to
> hold onto the socket until the new backend reports successful attach,
> using the same signaling mechanism you had in mind for the other way.

It wasn't relevant to the concept being proven, so I suspended decisions in that
area.  Arranging for socket closure is a simple matter of programming.

> Overall, I agree that neither of these approaches are exactly attractive.
> We're paying a heck of a lot of performance or complexity to solve a
> problem that shouldn't even be there, and that we don't understand well.
> In particular, the theory that some privileged code is injecting a thread
> into every new process doesn't square with my results at
> https://www.postgresql.org/message-id/15345.1525145612%40sss.pgh.pa.us
> 
> I think our best course of action at this point is to do nothing until
> we have a clearer understanding of what's actually happening on dory.
> Perhaps such understanding will yield an idea for a less painful fix.

I see.



Re: when set track_commit_timestamp on, database system abort startup

2018-09-25 Thread Masahiko Sawada
On Tue, Sep 25, 2018 at 1:43 PM, Michael Paquier  wrote:
> Sawada-san,
>
> On Mon, Sep 24, 2018 at 08:28:45AM +0900, Michael Paquier wrote:
>> Wouldn't it be better to incorporate the new test as part of
>> 004_restart.pl?  This way, we avoid initializing a full instance, which
>> is always a good thing as that's very costly.  The top of this file also
>> mentions that it tests clean restarts, but it bothers also about crash
>> recovery.
>
> I have been able to work on this bug, and rewrote the proposed test case
> as attached.  The test can only get on v11 and HEAD.  What do you think?

Thank you for working on this.

+
+# Start the server, which generates a XLOG_PARAMETER_CHANGE record where
+# the parameter change is registered.
 $node_master->start;

+# Now restart again the server so as no record XLOG_PARAMETER_CHANGE are
+# replayed with the follow-up immediate shutdown.
+$node_master->restart;

Can we use "XLOG_PARAMETER_CHANGE record" instead of "record
XLOG_PARAMETER_CHANGE" in the second hunk, since the comment in the
first hunk uses that wording?  The other part of this patch looks good to me.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] SERIALIZABLE on standby servers

2018-09-25 Thread Kevin Grittner
On Fri, Sep 21, 2018 at 7:29 AM Thomas Munro
 wrote:

> I'll add it to the next Commitfest so I know when to rebase it.

I signed up as reviewer in that CF.

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

2018-09-25 Thread Tom Lane
Andrew Gierth  writes:
> So I've tried to rough out a decision tree for the various options on
> how this might be implemented (discarding the "use precedence hacks"
> option). Opinions? Additions?

I think it'd be worth at least drafting an implementation for the
lexical-lookahead fix.  I think it's likely that we'll need to extend
base_yylex to do more lookahead in the future even if we don't do it
for this, given the SQL committee's evident love for COBOL-ish syntax
and lack of regard for what you can do in LALR(1).
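To make the idea concrete, the kind of lexical-lookahead filter being discussed can be sketched with a toy token stream (purely illustrative — the real filter is the base_yylex wrapper in src/backend/parser/parser.c, and the token names here are made up): the wrapper peeks one token ahead and folds a two-word construct into a single token, so the grammar itself stays LALR(1).

```c
/* Toy one-token-lookahead filter: fold IGNORE followed by NULLS into a
 * single combined token before the parser sees it. */

enum toy_token { T_EOF, T_IGNORE, T_NULLS, T_IGNORE_NULLS, T_OTHER };

struct toy_lexer
{
    const enum toy_token *input;    /* raw token stream, T_EOF-terminated */
    int pos;
};

static enum toy_token
filtered_lex(struct toy_lexer *lx)
{
    enum toy_token cur = lx->input[lx->pos];

    if (cur == T_EOF)
        return T_EOF;
    lx->pos++;
    /* One token of lookahead: merge the two-word construct. */
    if (cur == T_IGNORE && lx->input[lx->pos] == T_NULLS)
    {
        lx->pos++;
        return T_IGNORE_NULLS;
    }
    return cur;
}
```

Extending this to deeper lookahead is where the bookkeeping gets fiddly, which is the cost Tom alludes to.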

The questions of how we interface to the individual window functions
are really independent of how we handle the parsing problem.  My
first inclination is to just pass the flags down to the window functions
(store them in WindowObject and provide some additional inquiry functions
in windowapi.h) and let them deal with it.

>   If the clauses are legal on all window functions, what to do about existing
>   window functions for which the clauses do not make sense?

Option 1: do nothing, document that nothing happens if w.f. doesn't
implement it.

Option 2: record whether the inquiry functions got called.  At end of
query, error out if they weren't and the options were used.

It's also worth wondering if we couldn't just implement the flags in
some generic fashion and not need to involve the window functions at
all.  FROM LAST, for example, could and perhaps should be implemented
by inverting the sort order.  Possibly IGNORE NULLS could be implemented
inside the WinGetFuncArgXXX functions?  These behaviors might or might
not make much sense with other window functions, but that doesn't seem
like it's our problem.
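As a small sketch of what IGNORE NULLS inside the argument-fetching layer could mean (illustrative only — the function name and interface are invented, not the windowapi.h API): a lag-style fetch walks backwards from the current row, counting only non-null rows.

```c
#include <stdbool.h>

/* Sketch of IGNORE NULLS semantics for a lag-style fetch: starting just
 * before position pos, walk backwards counting only non-null rows, and
 * return the index of the offset-th such row, or -1 if there is none.
 * An implementation inside the WinGetFuncArgXXX layer could skip rows
 * the same way. */
static int
lag_ignore_nulls_index(const bool *isnull, int pos, int offset)
{
    for (int i = pos - 1; i >= 0; i--)
    {
        if (!isnull[i] && --offset == 0)
            return i;
    }
    return -1;
}
```

The analogous forward walk gives lead(... IGNORE NULLS), and inverting the scan direction is essentially the FROM LAST idea mentioned above.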

regards, tom lane



Re: Allowing printf("%m") only where it actually works

2018-09-25 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Sep 24, 2018 at 01:18:47PM -0400, Tom Lane wrote:
>> Well, we have to change the code somehow to make it usable in frontend
>> as well as backend.  And we can *not* have it do exit(1) in libpq.
>> So the solution I chose was to make it act the same as if FormatMessage
>> were to fail.  I don't find this behavior unreasonable: what is really
>> important is the original error code, not whether we were able to
>> pretty-print it.  I think the ereport(FATAL) coding is a pretty darn
>> bad idea even in the backend.

> Ok.  I won't fight hard on that.  Why change the error message from
> "could not load netmsg.dll" to "unrecognized winsock error" then?  The
> original error string gives much more context.

As the code stands, what you'll get told about is the error code
returned by the failed LoadLibrary call; the original winsock error
code is reported nowhere.  I think that's backwards.

We could possibly write something like

sprintf(wserrbuf, "winsock error %d (could not load netmsg.dll to translate:
error code %lu)", err, GetLastError());

but I'm unconvinced that that's useful.

regards, tom lane



Re: Proposal for Signal Detection Refactoring

2018-09-25 Thread Tom Lane
Chris Travers  writes:
> However,  what I think one could do is use a struct of volatile
> sig_atomic_t members and macros for checking/setting.  Simply writing a
> value is safe in C89 and higher.

Yeah, we could group those flags in a struct, but what's the point?
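For concreteness, the grouping Chris describes might look roughly like this (purely illustrative names, not a proposal for miscadmin.h): each member stays individually sig_atomic_t, since C guarantees atomic stores only for that type, not for whole structs.

```c
#include <signal.h>

/* Illustrative grouping of pending-interrupt flags into one struct;
 * each member is separately volatile sig_atomic_t so that handler
 * stores to any one flag remain safe. */
typedef struct PendingFlags
{
    volatile sig_atomic_t interrupt;
    volatile sig_atomic_t query_cancel;
    volatile sig_atomic_t proc_die;
} PendingFlags;

static PendingFlags pending;

/* Accessor macros of the kind proposed for checking/setting. */
#define SET_PENDING(flag)   ((pending.flag) = 1)
#define CLEAR_PENDING(flag) ((pending.flag) = 0)
#define PENDING(flag)       ((pending.flag) != 0)
```

The grouping buys namespacing and a single place to document the flags; whether that outweighs churn in existing call sites is the open question.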

regards, tom lane



Re: FETCH FIRST clause PERCENT option

2018-09-25 Thread Surafel Temesgen
hey

On 9/21/18, Mark Dilger  wrote:

> Surafel, there are no regression tests that I can see in your patch.  It
> would help if you added some, as then I could see precisely what behavior
> you are expecting.
Thank you for looking at it.  The attached patch adds regression tests.
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142afa..8491b7831a 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -44,7 +44,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionexpression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
 [ LIMIT { count | ALL } ]
 [ OFFSET start [ ROW | ROWS ] ]
-[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
+[ FETCH { FIRST | NEXT } [ count ] [ PERCENT ] { ROW | ROWS } ONLY ]
 [ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ]
 
 where from_item can be one of:
@@ -1397,7 +1397,7 @@ OFFSET start
 which PostgreSQL also supports.  It is:
 
 OFFSET start { ROW | ROWS }
-FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
+FETCH { FIRST | NEXT } [ count ] [ PERCENT ] { ROW | ROWS } ONLY
 
 In this syntax, the start
 or count value is required by
@@ -1407,7 +1407,8 @@ FETCH { FIRST | NEXT } [ count ] {
 ambiguity.
 If count is
 omitted in a FETCH clause, it defaults to 1.
-ROW
+with PERCENT count specifies the maximum number of rows to return 
+in percentage.ROW
 and ROWS as well as FIRST
 and NEXT are noise words that don't influence
 the effects of these clauses.
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index bb28cf7d1d..1d5b48bbe9 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -43,6 +43,7 @@ ExecLimit(PlanState *pstate)
 	LimitState *node = castNode(LimitState, pstate);
 	ScanDirection direction;
 	TupleTableSlot *slot;
+	TupleDesc   tupleDescriptor;
 	PlanState  *outerPlan;
 
 	CHECK_FOR_INTERRUPTS();
@@ -52,6 +53,8 @@ ExecLimit(PlanState *pstate)
 	 */
 	direction = node->ps.state->es_direction;
 	outerPlan = outerPlanState(node);
+	slot = node->ps.ps_ResultTupleSlot;
+	tupleDescriptor = node->ps.ps_ResultTupleSlot->tts_tupleDescriptor;
 
 	/*
 	 * The main logic is a simple state machine.
@@ -60,6 +63,23 @@ ExecLimit(PlanState *pstate)
 	{
 		case LIMIT_INITIAL:
 
+			if (node->limitOption == PERCENTAGE)
+			{
+
+			/*
+			 * Find all rows the plan will return.
+			 */
+for (;;)
+{
+	slot = ExecProcNode(outerPlan);
+	if (TupIsNull(slot))
+	{
+		break;
+	}
+	tuplestore_puttupleslot(node->totalTuple, slot);
+}
+			}
+
 			/*
 			 * First call for this node, so compute limit/offset. (We can't do
 			 * this any earlier, because parameters from upper nodes will not
@@ -87,24 +107,46 @@ ExecLimit(PlanState *pstate)
 return NULL;
 			}
 
-			/*
-			 * Fetch rows from subplan until we reach position > offset.
-			 */
-			for (;;)
+			if (node->limitOption == PERCENTAGE)
 			{
-slot = ExecProcNode(outerPlan);
-if (TupIsNull(slot))
+for (;;)
 {
-	/*
-	 * The subplan returns too few tuples for us to produce
-	 * any output at all.
-	 */
-	node->lstate = LIMIT_EMPTY;
-	return NULL;
+	slot = MakeSingleTupleTableSlot(tupleDescriptor);
+	if (!tuplestore_gettupleslot(node->totalTuple, true, true, slot))
+	{
+		node->lstate = LIMIT_EMPTY;
+		return NULL;
+	}
+	else
+	{
+		node->subSlot = slot;
+		if (++node->position > node->offset)
+		break;
+	}
+}
+			}
+			else if (node->limitOption == EXACT_NUMBER)
+			{
+
+/*
+ * Fetch rows from subplan until we reach position > offset.
+ */
+for (;;)
+{
+	slot = ExecProcNode(outerPlan);
+	if (TupIsNull(slot))
+	{
+		/*
+		 * The subplan returns too few tuples for us to produce
+		 * any output at all.
+		 */
+		node->lstate = LIMIT_EMPTY;
+		return NULL;
+	}
+	node->subSlot = slot;
+	if (++node->position > node->offset)
+		break;
 }
-node->subSlot = slot;
-if (++node->position > node->offset)
-	break;
 			}
 
 			/*
@@ -144,18 +186,34 @@ ExecLimit(PlanState *pstate)
 
 	return NULL;
 }
+if (node->limitOption == PERCENTAGE)
+{
+	if (tuplestore_gettupleslot(node->totalTuple, true, false, slot))
+	{
+		node->subSlot = slot;
+		node->position++;
+	}
+	else
+	{
+		node->lstate = LIMIT_SUBPLANEOF;
+		return NULL;
+	}
+}
+else if (node->limitOption == EXACT_NUMBER)
+{
 
-/*
- * Get next tuple from subplan, if any.
- */
-slot = ExecProcNode(outerPlan);
-if (TupIsNull(slot))
-{
-	node->lstate = LIMIT_SUBPLANEOF;
-	return NULL;
+	 /*
+	 * Get next tuple from subplan, if any.
+	 */
+	slot = ExecProcNode(outerPlan);
+	if (TupIsNull(slot))
+	{
+		

Re: Re[2]: Adding a note to protocol.sgml regarding CopyData

2018-09-25 Thread Amit Kapila
On Tue, Sep 25, 2018 at 4:51 AM Bradley DeJong  wrote:
>
> On 2018-09-22, Amit Kapila wrote ...
>  > ... duplicate the same information in different words at three
> different places ...
>
> I count 7 different places. In the protocol docs, there is the old
> mention in the "Summary of Changes since Protocol 2.0" section
>

The text below is present in the section you quoted:
The special \. last line is not
needed anymore, and is not sent during COPY OUT.
(It is still recognized as a terminator during COPY
IN, but its use is deprecated and will eventually be
removed.)

This thread started with the need to mention this in protocol.sgml, and it
turns out it is already mentioned there, although at a different place than
the one proposed.  Personally, I don't see the need to add it to more
places.  Does anybody else have an opinion on this matter?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Collation versioning

2018-09-25 Thread Christoph Berg
Re: Thomas Munro 2018-09-24 

> I wonder if we would be practically constrained to using the
> distro-supplied ICU (by their policies of not allowing packages to
> ship their own copies ICU); it seems like it.  I wonder which distros
> allow multiple versions of ICU to be installed.  I see that Debian 9.5
> only has 57 in the default repo, but the major version is in the
> package name (what is the proper term for that kind of versioning?)
> and it doesn't declare a conflict with other versions, so that's
> promising.

The point of the put-the-version/soname-into-the-package-name business
is to allow co-installation of (mostly library) packages[*], so this
is indeed possible with the ICU package on Debian.

The bad news is that this applies to the binary package name only, the
source package (from which the binaries are built) is only called
"icu". As Debian only ships one version of a source package in a
release, there can only be one libicu$version.deb in a release.
(This means you can have libicu52.deb and libicu60.deb installed in
parallel, but after upgrading, libicu52 won't have any installation
source. You can either keep or remove the package, but not reinstall
it.)

The fix would be to include the version in the source package name as
well, like postgresql-NN and llvm-toolchain-NN. (And then find a
maintainer willing to maintain the bunch.)

Christoph

[*] now historical footnote: this wasn't the case with the
libperl-5.xx packages which conflicted with each other, which is why
upgrading Debian used to remove the postgresql-plperl-$oldmajor
version on upgrade. This has now been fixed for the stretch->buster
upgrade.



Re: PATCH: Update snowball stemmers

2018-09-25 Thread Arthur Zakirov
On Mon, Sep 24, 2018 at 05:36:39PM -0400, Tom Lane wrote:
> I reviewed and pushed this.

Great! Thank you.

> As a cross-check on the patch, I cloned the Snowball github repo
> and built the derived files in it.  I noticed that they'd incorporated
> several new stemmers since 2007 --- not only your Nepali one, but
> half a dozen more besides.  Since the point here is (IMO) mostly to
> follow their lead on what's interesting, I went ahead and added those
> as well.

Agreed.  It is a good decision, and it may attract more users.

> Although I added nepali.stop from the other patch, I've not done
> anything about updating our other stopword lists.  Presumably those
> are a bit obsolete by now as well.  I wonder if we can prevail on
> the Snowball people to make those available in some less painful way
> than scraping them off assorted web pages.  Ideally they'd stick them
> into their git repo ...

They have a repository, snowball-website [1]; it is the source repository
for the snowballstem.org web site.  It also stores stopwords for various
languages (for example, for English [2]).  I checked a couple of languages:
their Russian and Danish stopword lists look like PostgreSQL's stopword
lists, but their English stopword list is different.

Stopword lists are missing for the following languages:
- arabic
- irish
- lithuanian
- nepali - I can create a pull request to add it to snowball-website
- tamil

There is also another project, called Stopwords ISO [3], but I'm not sure
about it.  It stores stopword lists from various sources, and it also has
stopwords for the languages mentioned above, except for Nepali and Tamil.

I think I could make a script which generates stopword files from the
snowball-website repository; it could be run periodically.  Would that be
suitable?  It would also be good to move the missing stopword lists from
Stopwords ISO to snowball-website...

1 - https://github.com/snowballstem/snowball-website/tree/master/algorithms
2 - 
https://github.com/snowballstem/snowball-website/blob/master/algorithms/english/stop.txt
3 - https://github.com/stopwords-iso

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Segfault when creating partition with a primary key and sql_drop trigger exists

2018-09-25 Thread Dmitry Dolgov
> On Mon, 24 Sep 2018 at 17:58, Alvaro Herrera  wrote:
>
> On 2018-Sep-20, Marco Slot wrote:
>
> > We're seeing a segmentation fault when creating a partition of a
> > partitioned table with a primary key when there is a sql_drop trigger on
> > Postgres 11beta4.
> >
> > We discovered it because the Citus extension creates a sql_drop trigger,
> > but it's otherwise unrelated to the Citus extension:
> > https://github.com/citusdata/citus/issues/2390
>
> Thanks for the reproducer.  Will research.

Shouldn't it be fixed by adding EventTriggerAlterTableStart? Judging from the
following call of ATController, we can just pass NULL as parsetree.


sql_drop_on_partition.patch
Description: Binary data


RE: libpq debug log

2018-09-25 Thread Iwata, Aya
Let me explain this trace log in a bit more detail.

This log is not turned on by default.
Turn it on when an issue occurs and you want to check the information in the
application in order to clarify the cause.

I will present three use cases for this log.

1. Confirming the cause of an out-of-memory error
Suppose an out-of-memory error occurred while storing the data received
from the database, but the contents or length of that data are not known.
A trace log can be obtained to find these out, and to see what kind of data
appears in which part (e.g. the query result received from the database).

2. Confirming protocol errors
When there is an error in the protocol messages sent by the client and an
error occurs on the database server, the messages the client sent can be
checked.  When the network is unstable, the log can be checked to see
whether the database server is receiving protocol messages.

3. Investigating processing delays
If processing in the application is slow while processing in the database
is fast, the cause of the delay can be investigated.  Four kinds of
timestamps can be obtained from the log:

 Timestamp when the SQL statement started
 Timestamp when data began to be sent to the backend
 Timestamp when data was successfully received in the application
 Timestamp when the SQL statement ended

The differences between these times can then be checked to find out which
part of the process takes time.


Let me restate the functionality I proposed.

To get the trace log, set PGLOGDIR/logdir and PGLOGSIZE/logsize.
These parameters are set as environment variables or in the connection
service file.
- logdir or PGLOGDIR : directory where the log file is written
- logsize or PGLOGSIZE : maximum log size. When the log file size exceeds
PGLOGSIZE, the log is output to another file.

The log file name is determined as follows:
libpq-%ConnectionID-%Y-%m-%d_%H%M%S.log

This is a trace log example:

...
Start: 2018-09-03 18:16:35.357 Connection(1)
Status: Connection
Send message: 2018-09-03 18:16:35.358

Receive message: 2018-09-03 18:16:35.359

End: 2018-09-03 18:16:35.360
...
Start: 2018-09-03 18:16:35.357 Connection(1)...(1), (2)
Query: DECLARE myportal CURSOR FOR select * from pg_database...(3)
Send message: 2018-09-03 18:16:35.358   ...(4)
...(5)
Receive message: 2018/09/03 18:16:35.359...(6)
   ...(7)
End: 2018-09-03 18:16:35.360...(8)
...


(1) Timestamp when SQL started
(2) Connection ID (Identify the connection)
(3) SQL statement
(4) Timestamp when information began to be sent to the backend
(5) send message to backend (Result of query, Protocol messages)
(6) Timestamp when information is successfully received in the application
(7) receive message from backend (Result of query, Protocol messages)
(8) Timestamp when SQL ended


Regards,
Iwata Aya




Re: Query is over 2x slower with jit=on

2018-09-25 Thread Andres Freund
On 2018-09-18 10:03:02 +0530, Amit Khandekar wrote:
> Attached v3 patch that does the above change.

Attached is a revised version of that patch.  I've changed quite a few
things:
- I've reverted the split of "base" and "provider specific" contexts - I
  don't think it really buys us anything here.

- I've reverted the context creation changes - instead of creating a
  context in the leader just to store instrumentation in the worker,
  there's now a new EState->es_jit_combined_instr.

- That also means worker instrumentation doesn't get folded into the
  leader's instrumentation. This seems good for the future and for
  extensions - it's not actually "linear" time that's spent doing
  JIT in workers (& leader), as all of that work happens in
  parallel. Being able to disentangle that seems important.

- Only allocate / transport JIT instrumentation if instrumentation is
  enabled.

- Smaller cleanups.

Forcing parallelism and JIT to be on, I get:

postgres[20216][1]=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS) SELECT count(*) FROM 
pg_class;

Finalize Aggregate  (cost=15.43..15.44 rows=1 width=8) (actual 
time=183.282..183.282 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=13
  ->  Gather  (cost=15.41..15.42 rows=2 width=8) (actual time=152.835..152.904 
rows=2 loops=1)
Output: (PARTIAL count(*))
Workers Planned: 2
Workers Launched: 2
JIT for worker 0:
  Functions: 2
  Options: Inlining true, Optimization true, Expressions true, 
Deforming true
  Timing: Generation 0.480 ms, Inlining 78.789 ms, Optimization 5.797 
ms, Emission 5.554 ms, Total 90.620 ms
JIT for worker 1:
  Functions: 2
  Options: Inlining true, Optimization true, Expressions true, 
Deforming true
  Timing: Generation 0.475 ms, Inlining 78.632 ms, Optimization 5.954 
ms, Emission 5.900 ms, Total 90.962 ms
Buffers: shared hit=13
->  Partial Aggregate  (cost=15.41..15.42 rows=1 width=8) (actual 
time=90.608..90.609 rows=1 loops=2)
  Output: PARTIAL count(*)
  Buffers: shared hit=13
  Worker 0: actual time=90.426..90.426 rows=1 loops=1
Buffers: shared hit=7
  Worker 1: actual time=90.791..90.792 rows=1 loops=1
Buffers: shared hit=6
  ->  Parallel Seq Scan on pg_catalog.pg_class  (cost=0.00..14.93 
rows=193 width=0) (actual time=0.006..0.048 rows=193 loops=2)
Output: relname, relnamespace, reltype, reloftype, 
relowner, relam, relfilenode, reltablespace, relpages, reltuples, 
relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, 
relkind, relnatts, relchecks, relhasoids, relhasrules, relhastriggers, 
relhassubclass, relrowsecurity, relfo
Buffers: shared hit=13
Worker 0: actual time=0.006..0.047 rows=188 loops=1
  Buffers: shared hit=7
Worker 1: actual time=0.006..0.048 rows=198 loops=1
  Buffers: shared hit=6
Planning Time: 0.152 ms
JIT:
  Functions: 4
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 0.955 ms, Inlining 157.422 ms, Optimization 11.751 ms, 
Emission 11.454 ms, Total 181.582 ms
Execution Time: 184.197 ms


Instead of "JIT for worker 0" we could do "Worker 0: JIT", but I'm not
sure that's better, given that the JIT group is multi-line output.

Currently structured formats show this as:
   "JIT": {
 "Worker Number": 1,
 "Functions": 2,
 "Options": {
   "Inlining": true,
   "Optimization": true,
   "Expressions": true,
   "Deforming": true
 },
 "Timing": {
   "Generation": 0.374,
   "Inlining": 70.341,
   "Optimization": 8.006,
   "Emission": 7.670,
   "Total": 86.390
 }
   },


This needs a bit more polish tomorrow, but I'm starting to like where
this is going.  Comments?

Greetings,

Andres Freund
>From dd7d504df9fca0ee0cc0a48930980cda2ad8be1d Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 25 Sep 2018 01:46:34 -0700
Subject: [PATCH] Collect JIT instrumentation from workers.

Author: Amit Khandekar and Andres Freund
Reviewed-By:
Discussion: https://postgr.es/m/caj3gd9elrz51rk_gtkod+71idcjpb_n8ec6vu2aw-vicsae...@mail.gmail.com
Backpatch: 11-
---
 contrib/auto_explain/auto_explain.c |  6 +-
 src/backend/commands/explain.c  | 87 +++--
 src/backend/executor/execMain.c | 15 +
 src/backend/executor/execParallel.c | 82 +++
 src/backend/jit/jit.c   | 11 
 src/backend/jit/llvm/llvmjit.c  | 14 ++---
 src/backend/jit/llvm/llvmjit_expr.c |  2 +-
 src/include/commands/explain.h  |  3 +-
 src/include/executor/execParallel.h |  1 +
 src/include/jit/jit.h   | 27 +++--
 src/include/nodes/execnodes.h   |  8 +++
 11 files changed, 209 insertions(+), 47 deletions(-)

diff --git 

Re: PG vs macOS Mojave

2018-09-25 Thread Thomas Munro
On Tue, Sep 25, 2018 at 4:49 PM Tom Lane  wrote:
> I've tested this on all the macOS versions I have at hand, and it
> doesn't seem to break anything.  Only part (1) could possibly
> affect other platforms, and that seems safe enough.
>
> I'd like to commit and backpatch this, because otherwise longfin
> is going to start falling over when I upgrade its host to Mojave.

Looks good on this 10.13.4 system.  About to upgrade to 10.14...

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Proposal for Signal Detection Refactoring

2018-09-25 Thread Chris Travers
On Mon, Sep 24, 2018 at 7:06 PM Andres Freund  wrote:

> On 2018-09-24 11:45:10 +0200, Chris Travers wrote:
> > I did some more reading.
> >
> > On Mon, Sep 24, 2018 at 10:15 AM Chris Travers  >
> > wrote:
> >
> > > First, thanks for taking the time to write this.  Its very helpful.
> > > Additional thoughts inline.
> > >
> > > On Mon, Sep 24, 2018 at 2:12 AM Michael Paquier 
> > > wrote:
> > >
> > >>
> > >> There could be value in refactoring things so as all the *Pending
> flags
> > >> of miscadmin.h get stored into one single volatile sig_atomic_t which
> > >> uses bit-wise markers, as that's at least 4 bytes because that's
> stored
> > >> as an int for most platforms and can be performed as an atomic
> operation
> > >> safely across signals (If my memory is right;) ).  And this leaves a
> lot
> > >> of room for future flags.
> > >>
> > >
> > > Yeah I will look into this.
> > >
> >
> >
> > Ok so having looked into this a bit more
> >
> > It looks like to be strictly conforming, you can't just use a series of
> > flags because neither C 89 or 99 guarantee that sig_atomic_t is
> read/write
> > round-trip safe in signal handlers.  In other words, if you read, are
> > pre-empted by another signal, and then write, you may clobber what the
> > other signal handler did to the variable.  So you need atomics, which are
> > C11
> >
> > What I would suggest instead at least for an initial approach is:
> >
> > 1.  A struct of volatile bools statically stored
> > 2.  macros for accessing/setting/clearing flags
> > 3.  Consistent use of these macros throughout the codebase.
> >
> > To make your solution work it looks like we'd need C11 atomics which
> would
> > be nice and maybe at some point we decide to allow newer feature, or we
> > could wrap this itself in checks for C11 features and provide atomic
> flags
> > in the future.  It seems that the above solution would strictly comply to
> > C89 and pose no concurrency issues.
>
> It's certainly *NOT* ok to use atomics in signal handlers
> indiscriminately, on some hardware / configure option combinations
> they're backed by spinlocks (or even semaphores) and thus *NOT*
> interrupt save.
>
> This doesn't seem to solve an actual problem, why are we discussing
> changing this? What'd be measurably improved, worth the cost of making
> backpatching more painful?


I am not sure this is measurable as a problem, but I am fairly sure it is
tangible and likely to become more so over time.

Right now, the approach is to use a series of globally defined individual
variables for handling interrupt conditions.  Because this has grown
organically over time, the naming of these variables isn't entirely
consistent.  Consequently, code that needs to check why it was interrupted
faces a fragile process, one which imposes a dependency directly on
internals.  We now have two different areas in the code which handle
interrupt conditions, and two more places that non-intrusively check
whether certain kinds of interrupt conditions are pending.

My sense is that the number of places which have to be aware of these sorts
of things is likely to grow in the future.  Therefore I think it would be a
good idea, sooner rather than later, to ensure that the structures,
mechanisms, and interfaces are forward-looking, and that they make
stability guarantees we can hold well into the future without burdening
code maintenance much.  So maybe there is some short-term pain in
back-patching, but the goal is to make sure that long-term back-patching is
easier, and that checking pending interrupt status is safer.

I am not sold on using bit flags here.  I don't think they are safe in C89
or C99 (perhaps in C11, with caveats).

My suspicion is that when I get to a second attempt at this problem, it
will look much closer to what we have now, just better encapsulated.
-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: Proposal for Signal Detection Refactoring

2018-09-25 Thread Chris Travers
On Tue, Sep 25, 2018 at 3:03 AM Tom Lane  wrote:

> Michael Paquier  writes:
> > And then within separate signal handlers things like:
> > void
> > StatementCancelHandler(SIGNAL_ARGS)
> > {
> > [...]
> > signalPendingFlags |= PENDING_INTERRUPT | PENDING_CANCEL_QUERY;
> > [...]
> > }
>
> AFAICS this still wouldn't work.  The machine code is still going to
> look (on many machines) like "load from signalPendingFlags,
> OR in some bits, store to signalPendingFlags".  So there's still a
> window for another signal handler to interrupt that and store some
> bits that will get lost.
>
> You could only fix that by blocking all signal handling during the
> handler, which would be expensive and rather pointless.
>
> I do not think that it's readily possible to improve on the current
> situation with one sig_atomic_t per flag.
>


After a fair bit of reading I think there are ways of doing this in C11 but
I don't think those are portable to C99.

In C99 (and, in practice, C89: the C99 committee noted there were no
known C89 implementations where reading was unsafe), reading or writing a
static sig_atomic_t inside a signal handler is safe, but a read-modify-write
round trip is *not* guaranteed not to clobber.  While I could be wrong, I
think it is only in C11 that you get any round-trip operations which are
guaranteed not to clobber in the language itself.

Basically, we are a long way from being able to treat these as a single
flags value.

However, what I think one could do is use a struct of volatile
sig_atomic_t members, with macros for checking and setting them.  Simply
writing a value is safe in C89 and higher.



> regards, tom lane
>


-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-09-25 Thread Thomas Munro
On Sat, Sep 22, 2018 at 4:52 PM Amit Kapila  wrote:
> On Sat, Sep 22, 2018 at 2:28 AM Andres Freund  wrote:
> > On 2018-09-22 08:54:57 +1200, Thomas Munro wrote:
> > > On Fri, Sep 21, 2018 at 4:43 PM Tom Lane  wrote:
> > > > Unless it looks practical to support this behavior in the Windows
> > > > and SysV cases, I think we should get rid of it rather than expend
> > > > effort on supporting it for just some platforms.
> > >
> > > We can remove it in back-branches without breaking API compatibility:
> > >
> > > 1.  Change dsm_impl_can_resize() to return false unconditionally (I
> > > suppose client code is supposed to check this before using
> > > dsm_resize(), though I'm not sure why it has an "impl" in its name if
> > > it's part of the public interface of this module).
> > > 2.  Change dsm_resize() and dsm_remap() to raise an error conditionally.
> > > 3.  Rip out the DSM_OP_RESIZE cases from various places.
> > >
> > > Then in master, remove all of those functions completely.  However,
> > > I'd feel like a bit of a vandal.  Robert and Amit probably had plans
> > > for that code...?
> >
> > Robert, Amit: ^
>
> I went through and checked the original proposal [1] to see if any use
> case is mentioned there, but nothing related has been discussed.   I
> couldn't think of much use of this facility except maybe for something
> like parallelizing correlated sub-queries where the size of an outer var
> can change across executions and we might need to resize the initially
> allocated memory.  This is just a wild thought, I don't have any
> concrete idea about this.   Having said that, I don't object to
> removing this especially because the implementation doesn't seem to be
> complete.  In future, if someone needs such a facility, they can first
> develop a complete version of this API.

Thanks for looking into that.  Here's a pair of draft patches to
disable and then remove dsm_resize() and dsm_remap().

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Desupport-dsm_resize-and-dsm_remap.patch
Description: Binary data


0002-Remove-dsm_resize-and-dsm_remap.patch
Description: Binary data