Re: [HACKERS] Time to drop old-style (V0) functions?

2017-02-28 Thread Andres Freund
On 2017-02-28 23:15:15 -0800, Andres Freund wrote:
> On 2016-12-08 13:34:41 -0800, Andres Freund wrote:
> > Hi,
> > 
> > I'm wondering if it's not time for $subject:
> > - V0 causes confusion / weird crashes when PG_FUNCTION_INFO_V1 was
> >   forgotten
> > - They have us keep weird hacks around just for the sake of testing V0
> > - they actually cost performance, because we have to zero-initialize
> >   Datums, even if the corresponding isnull marker is set.
> > - they allow calling arbitrary functions pretty easily
> > 
> > I don't see any reason to keep them around. I seriously doubt anybody
> > is using them seriously in anything but error.
> 
> Patches attached.

One unaddressed question in those patches is what we do with
src/backend/utils/fmgr/README - I'm not quite sure what its purpose is,
in its current state.  If we want to keep it, we'd probably have to
pretty aggressively revise it?

- Andres




Re: [HACKERS] Time to drop old-style (V0) functions?

2017-02-28 Thread Andres Freund
On 2016-12-08 13:34:41 -0800, Andres Freund wrote:
> Hi,
> 
> I'm wondering if it's not time for $subject:
> - V0 causes confusion / weird crashes when PG_FUNCTION_INFO_V1 was
>   forgotten
> - They have us keep weird hacks around just for the sake of testing V0
> - they actually cost performance, because we have to zero-initialize
>   Datums, even if the corresponding isnull marker is set.
> - they allow calling arbitrary functions pretty easily
> 
> I don't see any reason to keep them around. I seriously doubt anybody
> is using them seriously in anything but error.

Patches attached.
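
For readers less familiar with the two conventions, here is a minimal sketch
of the difference at issue (illustrative only, not taken from the patches
themselves):

#include "postgres.h"
#include "fmgr.h"

/* Old-style (V0): a bare C function; fmgr assumes this convention for any
 * dynamically loaded function that lacks a PG_FUNCTION_INFO_V1 record,
 * which is where the "forgot the macro" crashes come from. */
int32
add_one_v0(int32 arg)
{
    return arg + 1;
}

/* New-style (V1): registered via the macro; arguments and NULLs are
 * accessed through the FunctionCallInfo interface. */
PG_FUNCTION_INFO_V1(add_one_v1);

Datum
add_one_v1(PG_FUNCTION_ARGS)
{
    int32       arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}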
From dc6de0b47e1013c165c1026df65bda62cba8d8b7 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 28 Feb 2017 23:14:58 -0800
Subject: [PATCH 1/2] Move contrib/seg to only use V1 calling conventions.

A later commit will remove V0 support.

Author: Andres Freund
Discussion: https://postgr.es/m/20161208213441.k3mbno4twhg2q...@alap3.anarazel.de
---
 contrib/seg/seg.c | 453 +-
 1 file changed, 244 insertions(+), 209 deletions(-)

diff --git a/contrib/seg/seg.c b/contrib/seg/seg.c
index 895d879498..ab6b017282 100644
--- a/contrib/seg/seg.c
+++ b/contrib/seg/seg.c
@@ -47,60 +47,53 @@ PG_FUNCTION_INFO_V1(seg_center);
 /*
 ** GiST support methods
 */
-bool gseg_consistent(GISTENTRY *entry,
-SEG *query,
-StrategyNumber strategy,
-Oid subtype,
-bool *recheck);
-GISTENTRY  *gseg_compress(GISTENTRY *entry);
-GISTENTRY  *gseg_decompress(GISTENTRY *entry);
-float	   *gseg_penalty(GISTENTRY *origentry, GISTENTRY *newentry, float *result);
-GIST_SPLITVEC *gseg_picksplit(GistEntryVector *entryvec, GIST_SPLITVEC *v);
-bool		gseg_leaf_consistent(SEG *key, SEG *query, StrategyNumber strategy);
-bool		gseg_internal_consistent(SEG *key, SEG *query, StrategyNumber strategy);
-SEG		   *gseg_union(GistEntryVector *entryvec, int *sizep);
-SEG		   *gseg_binary_union(SEG *r1, SEG *r2, int *sizep);
-bool	   *gseg_same(SEG *b1, SEG *b2, bool *result);
+PG_FUNCTION_INFO_V1(gseg_consistent);
+PG_FUNCTION_INFO_V1(gseg_compress);
+PG_FUNCTION_INFO_V1(gseg_decompress);
+PG_FUNCTION_INFO_V1(gseg_picksplit);
+PG_FUNCTION_INFO_V1(gseg_penalty);
+PG_FUNCTION_INFO_V1(gseg_union);
+PG_FUNCTION_INFO_V1(gseg_same);
+static Datum gseg_leaf_consistent(Datum key, Datum query, StrategyNumber strategy);
+static Datum gseg_internal_consistent(Datum key, Datum query, StrategyNumber strategy);
+static Datum gseg_binary_union(Datum r1, Datum r2, int *sizep);
 
 
 /*
 ** R-tree support functions
 */
-bool		seg_same(SEG *a, SEG *b);
-bool		seg_contains_int(SEG *a, int *b);
-bool		seg_contains_float4(SEG *a, float4 *b);
-bool		seg_contains_float8(SEG *a, float8 *b);
-bool		seg_contains(SEG *a, SEG *b);
-bool		seg_contained(SEG *a, SEG *b);
-bool		seg_overlap(SEG *a, SEG *b);
-bool		seg_left(SEG *a, SEG *b);
-bool		seg_over_left(SEG *a, SEG *b);
-bool		seg_right(SEG *a, SEG *b);
-bool		seg_over_right(SEG *a, SEG *b);
-SEG		   *seg_union(SEG *a, SEG *b);
-SEG		   *seg_inter(SEG *a, SEG *b);
-void		rt_seg_size(SEG *a, float *sz);
+PG_FUNCTION_INFO_V1(seg_same);
+PG_FUNCTION_INFO_V1(seg_contains);
+PG_FUNCTION_INFO_V1(seg_contained);
+PG_FUNCTION_INFO_V1(seg_overlap);
+PG_FUNCTION_INFO_V1(seg_left);
+PG_FUNCTION_INFO_V1(seg_over_left);
+PG_FUNCTION_INFO_V1(seg_right);
+PG_FUNCTION_INFO_V1(seg_over_right);
+PG_FUNCTION_INFO_V1(seg_union);
+PG_FUNCTION_INFO_V1(seg_inter);
+static void rt_seg_size(SEG *a, float *size);
 
 /*
 ** Various operators
 */
-int32		seg_cmp(SEG *a, SEG *b);
-bool		seg_lt(SEG *a, SEG *b);
-bool		seg_le(SEG *a, SEG *b);
-bool		seg_gt(SEG *a, SEG *b);
-bool		seg_ge(SEG *a, SEG *b);
-bool		seg_different(SEG *a, SEG *b);
+PG_FUNCTION_INFO_V1(seg_cmp);
+PG_FUNCTION_INFO_V1(seg_lt);
+PG_FUNCTION_INFO_V1(seg_le);
+PG_FUNCTION_INFO_V1(seg_gt);
+PG_FUNCTION_INFO_V1(seg_ge);
+PG_FUNCTION_INFO_V1(seg_different);
 
 /*
 ** Auxiliary funxtions
 */
 static int	restore(char *s, float val, int n);
 
-
 /*
  * Input/Output functions
  */
 
+
 Datum
 seg_in(PG_FUNCTION_ARGS)
 {
@@ -193,13 +186,17 @@ seg_upper(PG_FUNCTION_ARGS)
 ** the predicate x op query == FALSE, where op is the oper
 ** corresponding to strategy in the pg_amop table.
 */
-bool
-gseg_consistent(GISTENTRY *entry,
-SEG *query,
-StrategyNumber strategy,
-Oid subtype,
-bool *recheck)
+Datum
+gseg_consistent(PG_FUNCTION_ARGS)
 {
+	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+	Datum		query = PG_GETARG_DATUM(1);
+	StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+
+	/* Oid		subtype = PG_GETARG_OID(3); */
+	bool	   *recheck = (bool *) PG_GETARG_POINTER(4);
+
+
 	/* All cases served by this function are exact */
 	*recheck = false;
 
@@ -208,71 +205,74 @@ gseg_consistent(GISTENTRY *entry,
 	 * gseg_leaf_consistent
 	 */
 	if 

Re: [HACKERS] Proposal : Parallel Merge Join

2017-02-28 Thread Amit Kapila
On Wed, Mar 1, 2017 at 11:24 AM, Dilip Kumar  wrote:
> On Wed, Mar 1, 2017 at 11:13 AM, Amit Kapila  wrote:
>> I think for now we can keep the parallel safety check for cheapest
>> inner path, though it will be of use only for the very first time we
>> compare the paths in that loop.  I am not sure if there is any other
>> better way to handle the same.
>
> Done that way.
>

Thanks, your patch looks good to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [GENERAL] C++ port of Postgres

2017-02-28 Thread Andres Freund
Hi,

On 2017-02-28 23:42:45 -0500, Peter Eisentraut wrote:
> On 1/26/17 22:46, Andres Freund wrote:
> > On 2016-09-30 15:24:09 -0400, Peter Eisentraut wrote:
> >> Yeah, I have committed a few of the patches now and I'll close the CF
> >> entry now.  Thanks for your research.
> > 
> > Are you planning to push more of these at some point?
> 
> Sure, let's take a look.

Cool.


> Getting rid of the C++ keywords would open up the possibility of using
> -Wc++-compat, which you have expressed interest in, I think.

Indeed. And if we go down this path, I'm going to argue that we should
add that by default.

> Subject: [PATCH v2 01/23] Fix mixup of bool and ternary value
> 
> ---
>  src/backend/access/gin/ginscan.c | 2 +-
>  src/include/access/gin_private.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/backend/access/gin/ginscan.c 
> b/src/backend/access/gin/ginscan.c
> index c3ce0479c5..c83375d6b4 100644
> --- a/src/backend/access/gin/ginscan.c
> +++ b/src/backend/access/gin/ginscan.c
> @@ -147,7 +147,7 @@ ginFillScanKey(GinScanOpaque so, OffsetNumber attnum,
>   key->nuserentries = nUserQueryValues;
>  
>   key->scanEntry = (GinScanEntry *) palloc(sizeof(GinScanEntry) * 
> nQueryValues);
> - key->entryRes = (bool *) palloc0(sizeof(bool) * nQueryValues);
> + key->entryRes = (GinTernaryValue *) palloc0(sizeof(GinTernaryValue) * 
> nQueryValues);
>  
>   key->query = query;
>   key->queryValues = queryValues;
> diff --git a/src/include/access/gin_private.h 
> b/src/include/access/gin_private.h
> index 34e7339f05..f1f395ac85 100644
> --- a/src/include/access/gin_private.h
> +++ b/src/include/access/gin_private.h
> @@ -281,7 +281,7 @@ typedef struct GinScanKeyData
>   int nadditional;
>  
>   /* array of check flags, reported to consistentFn */
> - bool   *entryRes;
> + GinTernaryValue *entryRes;
>   bool(*boolConsistentFn) (GinScanKey key);
>   GinTernaryValue (*triConsistentFn) (GinScanKey key);
>   FmgrInfo   *consistentFmgrInfo;

That seems like a pretty clear-cut case given the usage of entryRes[i].


> From 7c2f4f31df0eec816d6bb17aa6df2e7f3c6da03e Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut 
> Date: Tue, 30 Aug 2016 12:00:00 -0400
> Subject: [PATCH v2 02/23] Fix LDFLAGS test for C++
> 
> The test looks to link to particular function, so we need to make that
> function have C linkage.
> ---
>  config/c-compiler.m4 |  9 -
>  configure| 27 ---
>  2 files changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
> index 7d901e1f1a..21fb4645c4 100644
> --- a/config/c-compiler.m4
> +++ b/config/c-compiler.m4
> @@ -353,7 +353,14 @@ AC_DEFUN([PGAC_PROG_CC_LDFLAGS_OPT],
>  AC_CACHE_CHECK([whether $CC supports $1], [Ac_cachevar],
>  [pgac_save_LDFLAGS=$LDFLAGS
>  LDFLAGS="$pgac_save_LDFLAGS $1"
> -AC_RUN_IFELSE([AC_LANG_PROGRAM([extern void $2 (); void (*fptr) () = 
> $2;],[])],
> +AC_RUN_IFELSE([AC_LANG_PROGRAM([#ifdef __cplusplus
> +extern "C" {
> +#endif
> +extern void $2 ();
> +#ifdef __cplusplus
> +}
> +#endif
> +void (*fptr) () = $2;],[])],
>[Ac_cachevar=yes],
>[Ac_cachevar=no],
>[Ac_cachevar="assuming no"])

Hm. I'm a bit confused here.


The description of PGAC_PROG_CC_LDFLAGS_OPT isn't exactly great:

# PGAC_PROG_CC_LDFLAGS_OPT
# 
# Given a string, check if the compiler supports the string as a
# command-line option. If it does, add the string to LDFLAGS.

doesn't even mention that the second parameter is something kind of
required, and what it could be used for.

Nor is:
# For reasons you'd really rather not know about, this checks whether
# you can link to a particular function, not just whether you can link.
# In fact, we must actually check that the resulting program runs :-(

exactly helpful...


But if we accept the premise of the current way to do things, this
mostly makes sense.  I do wonder if we really should use CC to run a C++
compiler, but ...


> From fd5248ca63ce3cf35f22cebd2088c4dfd3d994a4 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut 
> Date: Tue, 30 Aug 2016 12:00:00 -0400
> Subject: [PATCH v2 03/23] Add test for -Wmissing-prototypes
> 
> This was part of the hard-coded default warnings set in C, but the
> option is not valid for C++ (since prototypes are always required).

>  
>  if test "$GCC" = yes -a "$ICC" = no; then
> -  CFLAGS="-Wall -Wmissing-prototypes -Wpointer-arith"
> +  CFLAGS="-Wall -Wpointer-arith"
> +  # not valid for C++
> +  PGAC_PROG_CC_CFLAGS_OPT([-Wmissing-prototypes])
># These work in some but not all gcc versions
>PGAC_PROG_CC_CFLAGS_OPT([-Wdeclaration-after-statement])
>PGAC_PROG_CC_CFLAGS_OPT([-Wendif-labels])

Makes sense.


> From 3447e7cc85c1f2836d882e19952e6db1950a2607 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut 

Re: [HACKERS] Proposal : Parallel Merge Join

2017-02-28 Thread Dilip Kumar
On Wed, Mar 1, 2017 at 11:13 AM, Amit Kapila  wrote:
> I think for now we can keep the parallel safety check for cheapest
> inner path, though it will be of use only for the very first time we
> compare the paths in that loop.  I am not sure if there is any other
> better way to handle the same.

Done that way.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


parallel_mergejoin_v8.patch
Description: Binary data



Re: [HACKERS] Proposal : Parallel Merge Join

2017-02-28 Thread Amit Kapila
On Tue, Feb 28, 2017 at 9:28 PM, Dilip Kumar  wrote:
> On Mon, Feb 27, 2017 at 10:27 AM, Amit Kapila  wrote:
>> Okay, but in that case don't you think it is better to consider the
>> parallel safety of cheapest_total_inner only when we don't find any
>> cheap parallel_safe innerpath by reducing the sort keys?
>
> Well,  we can do that but suppose cheapest_total_inner is not parallel
> safe and we do not get any parallel safe path which is cheaper than
> cheapest_total_inner, then we just end up making the merge join path
> with the cheapest parallel safe path but we might have missed some of
> the paths whose pathkey is covering more ordered keys.  Still, it's
> hard to argue what it better because we can always say that if we try
> only cheapest parallel safe path we will generate fewer paths.
>

I think for now we can keep the parallel safety check for cheapest
inner path, though it will be of use only for the very first time we
compare the paths in that loop.  I am not sure if there is any other
better way to handle the same.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-02-28 Thread Jim Nasby

On 1/24/17 10:43 PM, Jim Nasby wrote:



I strongly suggest making this design effort a separate thread, and
focusing on the SPI improvements that give "free" no-user-action
performance boosts here.


Fair enough. I posted the SPI portion of that yesterday. That should be
useful for pl/R and possibly pl/perl. pl/tcl could make use of it, but
it would end up executing arbitrary tcl code in the middle of portal
execution, which doesn't strike me as a great idea. Unfortunately, I
don't think plpgsql could make much use of this for similar reasons.

I'll post a plpython patch that doesn't add the output format control.


I've attached the results of that. Unfortunately the speed improvement 
is only 27% at this point (with 999 tuples). Presumably that's 
because it's constructing a brand new dictionary from scratch for each 
tuple.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
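
To illustrate the idea described above -- handing rows to a callback as the
executor produces them, instead of materializing them in a tuplestore --
here is a rough sketch of a caller-supplied DestReceiver.
SPI_execute_callback() is the function added by the patch below; the
receiver itself is illustrative only:

#include "postgres.h"
#include "executor/tuptable.h"
#include "tcop/dest.h"

/* Illustrative receiver that just counts rows as the executor emits them. */
typedef struct CountingReceiver
{
    DestReceiver pub;           /* must be the first field */
    uint64      rows;
} CountingReceiver;

static bool
counting_receive(TupleTableSlot *slot, DestReceiver *self)
{
    CountingReceiver *me = (CountingReceiver *) self;

    me->rows++;                 /* a real consumer would inspect the slot here */
    return true;                /* keep executing */
}

static void counting_startup(DestReceiver *self, int operation, TupleDesc typeinfo) {}
static void counting_shutdown(DestReceiver *self) {}
static void counting_destroy(DestReceiver *self) {}

/*
 * Hypothetical usage with the function added by the patch:
 *
 *    CountingReceiver r = {{counting_receive, counting_startup,
 *                           counting_shutdown, counting_destroy,
 *                           DestNone}, 0};
 *
 *    SPI_execute_callback("SELECT ...", true, 0, (DestReceiver *) &r);
 */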
From 7ef3e944c1ee8266d70fafae080afc6beb492102 Mon Sep 17 00:00:00 2001
From: Jim Nasby 
Date: Wed, 25 Jan 2017 12:57:40 -0600
Subject: [PATCH 1/2] Add SPI_execute_callback() and callback-based
 DestReceiver.

Instead of placing results in a tuplestore, this method of execution
uses the supplied callback when creating the Portal for a query.
---
 src/backend/executor/spi.c | 76 --
 src/backend/tcop/dest.c| 11 +++
 src/include/executor/spi.h |  4 +++
 src/include/tcop/dest.h|  1 +
 4 files changed, 83 insertions(+), 9 deletions(-)

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 55f97b14e6..d55e06509f 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, 
SPIPlanPtr plan);
 
 static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  Snapshot snapshot, Snapshot 
crosscheck_snapshot,
- bool read_only, bool fire_triggers, uint64 
tcount);
+ bool read_only, bool fire_triggers, uint64 
tcount,
+ DestReceiver *callback);
 
 static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
@@ -320,7 +321,34 @@ SPI_execute(const char *src, bool read_only, long tcount)
 
res = _SPI_execute_plan(, NULL,
InvalidSnapshot, 
InvalidSnapshot,
-   read_only, true, 
tcount);
+   read_only, true, 
tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+int
+SPI_execute_callback(const char *src, bool read_only, long tcount,
+   DestReceiver *callback)
+{
+   _SPI_plan   plan;
+   int res;
+
+   if (src == NULL || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   memset(, 0, sizeof(_SPI_plan));
+   plan.magic = _SPI_PLAN_MAGIC;
+   plan.cursor_options = 0;
+
+   _SPI_prepare_oneshot_plan(src, );
+
+   res = _SPI_execute_plan(, NULL,
+   InvalidSnapshot, 
InvalidSnapshot,
+   read_only, true, 
tcount, callback);
 
_SPI_end_call(true);
return res;
@@ -354,7 +382,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const 
char *Nulls,

_SPI_convert_params(plan->nargs, plan->argtypes,

Values, Nulls),
InvalidSnapshot, 
InvalidSnapshot,
-   read_only, true, 
tcount);
+   read_only, true, 
tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+
+/* Execute a previously prepared plan with a callback Destination */
+int
+SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls,
+bool read_only, long tcount, DestReceiver 
*callback)
+{
+   int res;
+
+   if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   if (plan->nargs > 0 && Values == NULL)
+   return SPI_ERROR_PARAM;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   res = _SPI_execute_plan(plan,
+   

Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Kuntal Ghosh
On Wed, Mar 1, 2017 at 10:53 AM, Andres Freund  wrote:
> On 2017-03-01 10:47:45 +0530, Kuntal Ghosh wrote:
>> if (insertdist > curdist)
>> {
>> swap the entry to be inserted with the current entry.
>> Try to insert the current entry in the same logic.
>> }
>>
>> So, the second approach will not cause all the followers to be shifted by 1.
>
> How not? You'll have to do that game until you found a free slot.
You can skip increasing the curdist of some elements for which it is
already pretty high. Suppose we have the following hash table and an
element e:
_,_,_,i,_,_,j,j+1,j+2,_,_
We want to insert e at i, but there is a collision and we start
probing. At j, the condition if (insertdist > curdist) is satisfied and
we swap e with the element at j. Now we try to insert e (= the element
that was at j), and so on. In this process, if the elements at j+1, j+2
already have a high distance from their optimal location, you can skip
those elements and go ahead. But in the existing approach, we increase
their distance further. Thoughts?


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Radix tree for character conversion

2017-02-28 Thread Michael Paquier
On Tue, Feb 28, 2017 at 5:34 PM, Kyotaro HORIGUCHI
 wrote:
> At Tue, 28 Feb 2017 15:20:06 +0900, Michael Paquier 
>  wrote in 
> 
>> +conv.o: conv.c char_converter.c
>> This also can go away.
>
> Touching char_converter.c will be ignored if it is removed. Did
> you mistake it for map_checker?

That was not what I meant: as pg_mb_radix_conv() is only used in
conv.c, it may be better to just remove char_converter.c completely.

> And the code-comment pointed in the comment by the previous mail
> is rewritten as Robert's suggestion.

Fine for me.

-distclean: clean
+distclean:
rm -f $(TEXTS)

-maintainer-clean: distclean
-   rm -f $(MAPS)
-
+maintainer-clean:
+   rm -f $(TEXTS) $(MAPS)
Well, I would have assumed that this should not change..

The last version of the patch looks to be in rather good shape to me; we
are also sure that the sanity checks on the old maps and the new maps
match, per the previous runs with map_checker. One thing that still
needs some extra opinions is what to do with the old maps:
1) Just remove them, replacing the old maps by the new radix tree maps.
2) Keep them around in the backend code, even if they are useless.
3) Use a GUC to be able to switch from one to the other, giving a
fallback method in case of emergency.
4) Use an extension module to store the old maps as well as the
previous build code, so that sanity checks can still be performed on the
new maps.

I would vote for 2), to reduce long term maintenance burdens and after
seeing all the sanity checks that have been done in previous versions.
-- 
Michael




Re: [HACKERS] [POC] hash partitioning

2017-02-28 Thread amul sul
On Tue, Feb 28, 2017 at 8:03 PM, Yugo Nagata  wrote:
> Hi all,
>
> Now we have declarative partitioning, but hash partitioning is not
> implemented yet. Attached is a POC patch to add the hash partitioning
> feature. I know we will need more discussions about the syntax and other
> specifications before going ahead with the project, but I think this
> runnable code might help to discuss what and how we implement this.
>

Great.

> * Description
>
> In this patch, the hash partitioning implementation is basically based
> on the list partitioning mechanism. However, the partition bound cannot be
> specified explicitly; it is used internally as the hash partition
> index, which is calculated when a partition is created or attached.
>
> The tentative syntax to create a partitioned table is as below:
>
>  CREATE TABLE h (i int) PARTITION BY HASH(i) PARTITIONS 3 USING hashint4;
>
> The number of partitions is specified by PARTITIONS, which is currently
> constant and cannot be changed, but I think this needs to be changeable
> in some manner. A hash function is specified by USING. Maybe specifying
> the hash function could be omitted, and in this case a default hash
> function corresponding to the key type would be used.
>
> A partition table can be created as below:
>
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;
>
> The FOR VALUES clause cannot be used, and the partition bound is
> calculated automatically as a partition index of a single integer value.
>
> When trying to create more partitions than the number specified
> by PARTITIONS, you get an error.
>
> postgres=# create table h4 partition of h;
> ERROR:  cannot create hash partition more than 3 for h
>
> An inserted record is stored in the partition whose index equals
> abs(hashfunc(key)) % (number of partitions). In the above
> example, this is abs(hashint4(i))%3.
>
> postgres=# insert into h (select generate_series(0,20));
> INSERT 0 21
>
> postgres=# select *,tableoid::regclass from h;
>  i  | tableoid
> +--
>   0 | h1
>   1 | h1
>   2 | h1
>   4 | h1
>   8 | h1
>  10 | h1
>  11 | h1
>  14 | h1
>  15 | h1
>  17 | h1
>  20 | h1
>   5 | h2
>  12 | h2
>  13 | h2
>  16 | h2
>  19 | h2
>   3 | h3
>   6 | h3
>   7 | h3
>   9 | h3
>  18 | h3
> (21 rows)
>
> * Todo / discussions
>
> In this patch, we cannot change the number of partitions specified
> by PARTITIONS. If we can change this, the partitioning rule
> (partition index = abs(hashfunc(key)) % number of partitions)
> also changes, and then we need to reallocate records between
> partitions.
>
> In this patch, the user can specify a hash function with USING. However,
> we might need default hash functions which are useful and
> proper for hash partitioning.
>
IMHO, we should try to keep the create partition syntax simple and
aligned with the other partition strategies. For example:
CREATE TABLE h (i int) PARTITION BY HASH(i);

I agree that a fixed number of partitions is unavoidable with modulo
hashing, but we can do better with other hashing techniques.  Have you
thought about linear hashing[1] or consistent hashing[2]?  This would
allow us to add/drop partitions with minimal row movement.

+1 for a default hash function corresponding to the partitioning key type.

Regards,
Amul

[1] https://en.wikipedia.org/wiki/Linear_hashing
[2] https://en.wikipedia.org/wiki/Consistent_hashing
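
To make the row-movement point concrete, here is a small self-contained demo
(purely illustrative -- not tied to the patch, and using a toy mixing
function rather than hashint4): it counts how many keys change partition
when going from 3 to 4 partitions under plain modulo placement versus a toy
one-point-per-partition hash ring (consistent hashing):

#include <stdio.h>

#define NKEYS 100000

/* toy 32-bit mixing function, stand-in for a real hash function */
static unsigned
hash_u32(unsigned x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* modulo placement: partition = hash(key) % nparts */
static int
part_modulo(unsigned key, int nparts)
{
    return (int) (hash_u32(key) % (unsigned) nparts);
}

/*
 * Toy consistent hashing: each partition owns one point on a 32-bit ring;
 * a key belongs to the partition whose point is closest going clockwise
 * from the key's hash (unsigned subtraction handles the wraparound).
 * Real implementations use many points per partition to balance the shares.
 */
static int
part_ring(unsigned key, int nparts)
{
    unsigned    h = hash_u32(key);
    unsigned    bestdist = ~0U;
    int         best = 0;
    int         p;

    for (p = 0; p < nparts; p++)
    {
        unsigned    point = hash_u32(0xdead0000U + (unsigned) p);
        unsigned    dist = point - h;

        if (dist < bestdist)
        {
            bestdist = dist;
            best = p;
        }
    }
    return best;
}

int
main(void)
{
    int         moved_mod = 0;
    int         moved_ring = 0;
    unsigned    k;

    for (k = 0; k < NKEYS; k++)
    {
        if (part_modulo(k, 3) != part_modulo(k, 4))
            moved_mod++;
        if (part_ring(k, 3) != part_ring(k, 4))
            moved_ring++;
    }

    /* modulo moves roughly three quarters of the keys; the ring moves only
     * the share taken over by the newly added partition */
    printf("modulo: %d of %d keys move, ring: %d of %d keys move\n",
           moved_mod, NKEYS, moved_ring, NKEYS);
    return 0;
}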


Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Andres Freund
On 2017-03-01 10:47:45 +0530, Kuntal Ghosh wrote:
> if (insertdist > curdist)
> {
> swap the entry to be inserted with the current entry.
> Try to insert the current entry in the same logic.
> }
> 
> So, the second approach will not cause all the followers to be shifted by 1.

How not? You'll have to do that game until you found a free slot.




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Kuntal Ghosh
On Wed, Mar 1, 2017 at 9:38 AM, Andres Freund  wrote:
> Hi,
>
> On 2017-03-01 09:33:07 +0530, Kuntal Ghosh wrote:
>> On Wed, Mar 1, 2017 at 9:19 AM, Andres Freund  wrote:
>> >> So, I was looking for other alternatives and I've found one called
>> >> RobinHood hashing.
>> >
>> > simplehash.h implements robin hood hashing.
>
>> But, it doesn't implement the swapping idea, right?
>
> It does, that's the if (insertdist > curdist) block in SH_INSERT.
> Unless I misunderstand what you're proposing?
>
I think the idea of the existing implementation is:

if (insertdist > curdist)
{
find an empty space at the end of the cluster.
shift all the followers by 1 position to make space for the element to
be inserted.
insert the element at the current place.
}

What I'm trying to say is:

if (insertdist > curdist)
{
swap the entry to be inserted with the current entry.
Try to insert the current entry in the same logic.
}

So, the second approach will not cause all the followers to be shifted by 1.
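
In code, that second approach looks roughly like this (an illustrative
sketch of robin-hood insertion with swapping, not simplehash.h itself; it
assumes a power-of-two sized table that always has at least one free slot):

#include <stdbool.h>
#include <stdint.h>

typedef struct Entry
{
    uint32_t    hash;
    bool        used;
    /* ... payload ... */
} Entry;

static void
robinhood_insert(Entry *table, uint32_t mask, Entry newentry)
{
    uint32_t    pos = newentry.hash & mask;
    uint32_t    insertdist = 0;     /* how far newentry is from its optimal slot */

    for (;;)
    {
        Entry      *cur = &table[pos];
        uint32_t    curdist;

        if (!cur->used)
        {
            newentry.used = true;
            *cur = newentry;
            return;
        }

        /* how far the resident entry is from its own optimal slot */
        curdist = (pos - (cur->hash & mask)) & mask;

        if (insertdist > curdist)
        {
            /*
             * The incoming entry is "poorer" than the resident one: steal
             * the slot and let the displaced (richer) entry continue the
             * search from here, carrying its current displacement.
             */
            Entry       displaced = *cur;

            *cur = newentry;
            newentry = displaced;
            insertdist = curdist;
        }

        pos = (pos + 1) & mask;
        insertdist++;
    }
}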

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Kuntal Ghosh
On Wed, Mar 1, 2017 at 9:33 AM, Kuntal Ghosh  wrote:
> On Wed, Mar 1, 2017 at 9:19 AM, Andres Freund  wrote:
>> That's without the patch in
>> http://archives.postgresql.org/message-id/20161123083351.5vramz52nmdokhzz%40alap3.anarazel.de
>> ? With that patch it should complete without that change, right?
>>
> No, it's with the patch in the above link which is
> 0001-Resize-simplehash.h-table-in-case-of-long-runs.patch. But, I've
> kept the SH_FILLFACTOR to 0.8 and SH_MAX_FILLFACTOR to 0.95. I'll do
> another round of testing with the constants provided by you.
>
I've tested with the patch
0001-Resize-simplehash.h-table-in-case-of-long-runs.patch and the
results are same, i.e., hash table grows even when it is only 10-12%
filled. I've attached the logfile for reference.
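
For readers following along, the idea behind that patch is roughly the
following (an illustrative sketch only, not the actual simplehash.h change):
grow the table not just when the fillfactor is exceeded, but also when an
insertion has to walk an unusually long run of occupied slots -- which would
explain a table growing while only 10-12% full if the keys cluster badly.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Entry
{
    uint32_t    hash;
    bool        used;
} Entry;

typedef struct Table
{
    Entry      *data;
    uint32_t    mask;       /* size - 1; size is a power of two */
    uint32_t    members;
} Table;

/* hypothetical threshold: a probe run longer than this means clustering */
#define MAX_RUN_BEFORE_GROW 32

static void table_insert(Table *tb, uint32_t hash);

static void
table_grow(Table *tb)
{
    Table       old = *tb;
    uint32_t    newsize = (old.mask + 1) * 2;
    uint32_t    i;

    tb->data = calloc(newsize, sizeof(Entry));
    tb->mask = newsize - 1;
    tb->members = 0;

    for (i = 0; i <= old.mask; i++)
        if (old.data[i].used)
            table_insert(tb, old.data[i].hash);
    free(old.data);
}

/* initial allocation and normal fillfactor-based growth omitted for brevity */
static void
table_insert(Table *tb, uint32_t hash)
{
    uint32_t    pos = hash & tb->mask;
    uint32_t    run = 0;

    while (tb->data[pos].used)
    {
        pos = (pos + 1) & tb->mask;

        /* grow early if we hit a long run of occupied slots, even though
         * the overall fillfactor may still be low */
        if (++run > MAX_RUN_BEFORE_GROW)
        {
            table_grow(tb);
            table_insert(tb, hash);
            return;
        }
    }
    tb->data[pos].hash = hash;
    tb->data[pos].used = true;
    tb->members++;
}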

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


logfile
Description: Binary data



Re: [HACKERS] brin autosummarization -- autovacuum "work items"

2017-02-28 Thread Thomas Munro
On Wed, Mar 1, 2017 at 5:58 PM, Alvaro Herrera  wrote:
> I think one of the most serious issues with BRIN indexes is how they
> don't get updated automatically as the table is filled.  This patch
> attempts to improve on that.  During brininsert() time, we check whether
> we're inserting the first item on the first page in a range.  If we are,
> request autovacuum to do a summarization run on that table.  This is
> dependent on a new reloption for BRIN called "autosummarize", default
> off.

Nice.

> The way the request works is that autovacuum maintains a DSA which can
> be filled by backends with "work items".  Currently, work items can
> specify a BRIN summarization of some specific index; in the future we
> could use this framework to request other kinds of things that do not
> fit in the "dead tuples / recently inserted tuples" logic that autovac
> currently uses to decide to vacuum/analyze tables.
>
> However, it seems I have not quite gotten the hang of DSA just yet,
> because after a couple of iterations, crashes occur.  I think the reason
> has to do with either a resource owner clearing the DSA at an unwelcome
> time, or perhaps there's a mistake in my handling of DSA "relative
> pointers" stuff.

Ok, I'll take a look.  It's set up for ease of use in short lifespan
situations like parallel query, and there are a few extra hoops to
jump through for longer lived DSA areas.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] [POC] hash partitioning

2017-02-28 Thread Rushabh Lathia
On Tue, Feb 28, 2017 at 8:03 PM, Yugo Nagata  wrote:

> Hi all,
>
> Now we have declarative partitioning, but hash partitioning is not
> implemented yet. Attached is a POC patch to add the hash partitioning
> feature. I know we will need more discussions about the syntax and other
> specifications before going ahead with the project, but I think this
> runnable code might help to discuss what and how we implement this.
>
> * Description
>
> In this patch, the hash partitioning implementation is basically based
> on the list partitioning mechanism. However, the partition bound cannot be
> specified explicitly; it is used internally as the hash partition
> index, which is calculated when a partition is created or attached.
>
> The tentative syntax to create a partitioned table is as below:
>
>  CREATE TABLE h (i int) PARTITION BY HASH(i) PARTITIONS 3 USING hashint4;
>
> The number of partitions is specified by PARTITIONS, which is currently
> constant and cannot be changed, but I think this needs to be changeable in
> some manner. A hash function is specified by USING. Maybe specifying the
> hash function could be omitted, and in this case a default hash function
> corresponding to the key type would be used.
>
> A partition table can be created as below:
>
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;
>
> The FOR VALUES clause cannot be used, and the partition bound is
> calculated automatically as a partition index of a single integer value.
>
> When trying to create more partitions than the number specified
> by PARTITIONS, you get an error.
>
> postgres=# create table h4 partition of h;
> ERROR:  cannot create hash partition more than 3 for h
>
> An inserted record is stored in the partition whose index equals
> abs(hashfunc(key)) % (number of partitions). In the above
> example, this is abs(hashint4(i))%3.
>
> postgres=# insert into h (select generate_series(0,20));
> INSERT 0 21
>
> postgres=# select *,tableoid::regclass from h;
>  i  | tableoid
> +--
>   0 | h1
>   1 | h1
>   2 | h1
>   4 | h1
>   8 | h1
>  10 | h1
>  11 | h1
>  14 | h1
>  15 | h1
>  17 | h1
>  20 | h1
>   5 | h2
>  12 | h2
>  13 | h2
>  16 | h2
>  19 | h2
>   3 | h3
>   6 | h3
>   7 | h3
>   9 | h3
>  18 | h3
> (21 rows)
>
>
This is good. I will have a closer look into the patch, but here are a
few quick comments.

- The CREATE HASH partition syntax adds two new keywords, and ideally
we should try to avoid adding additional keywords. Also, I can see that
the HASH keyword has been added, but I don't see any use of the newly
added keyword in gram.y.

- Also, I didn't like the idea of fixing the number of partitions in the
CREATE TABLE syntax. That's something that needs to be able to change.



> * Todo / discussions
>
> In this patch, we cannot change the number of partitions specified
> by PARTITIONS. If we can change this, the partitioning rule
> (partition index = abs(hashfunc(key)) % number of partitions)
> also changes, and then we need to reallocate records between
> partitions.
>
> In this patch, the user can specify a hash function with USING. However,
> we might need default hash functions which are useful and
> proper for hash partitioning.
>

+1

- With a fixed default hash function and no partition count specified
during CREATE TABLE, we don't need the two new additional columns in the
pg_partitioned_table catalog.


> Currently, even when we issue a SELECT query with a condition,
> postgres looks into all partitions regardless of each partition's
> constraint, because the constraint is a complicated expression such as
> "abs(hashint4(i))%3 = 0".
>
> postgres=# explain select * from h where i = 10;
> QUERY PLAN
> --
>  Append  (cost=0.00..125.62 rows=40 width=4)
>->  Seq Scan on h  (cost=0.00..0.00 rows=1 width=4)
>  Filter: (i = 10)
>->  Seq Scan on h1  (cost=0.00..41.88 rows=13 width=4)
>  Filter: (i = 10)
>->  Seq Scan on h2  (cost=0.00..41.88 rows=13 width=4)
>  Filter: (i = 10)
>->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4)
>  Filter: (i = 10)
> (9 rows)
>
> However, if we rewrite the condition into the same expression
> as the partition's constraint, postgres can exclude unrelated
> tables from the search targets. So, we might avoid the problem
> by converting the qual properly before calling predicate_refuted_by().
>
> postgres=# explain select * from h where abs(hashint4(i))%3 =
> abs(hashint4(10))%3;
> QUERY PLAN
> --
>  Append  (cost=0.00..61.00 rows=14 width=4)
>->  Seq Scan on h  (cost=0.00..0.00 rows=1 width=4)
>  Filter: ((abs(hashint4(i)) % 3) = 2)
>->  Seq Scan on h3  (cost=0.00..61.00 rows=13 width=4)
>  Filter: ((abs(hashint4(i)) % 3) = 2)
> (5 rows)
>
> Best regards,
> Yugo Nagata
>
> --
> Yugo Nagata 
>
>

Re: [HACKERS] patch proposal

2017-02-28 Thread Venkata B Nagothi
On Wed, Mar 1, 2017 at 1:14 AM, Venkata B Nagothi  wrote:

> Hi David,
>
> On Tue, Jan 31, 2017 at 6:49 AM, David Steele  wrote:
>
>> On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
>>
>> > I will be adding the tests in
>> > src/test/recovery/t/003_recovery_targets.pl. My tests would look more or less
>> > similar to recovery_target_action. I do not see any of the tests added
>> > for the parameter recovery_target_action ? Anyways, i will work on
>> > adding tests for recovery_target_incomplete.
>>
>> Do you know when those will be ready?
>>
>
> Attached are both the patches with tests included.
>
>
>>
>> >
>> >
>> > 4) I'm not convinced that fatal errors by default are the way to go.
>> > It's a pretty big departure from what we have now.
>> >
>> >
>> > I have changed the code to generate ERRORs instead of FATALs. Which is
>> > in the patch recoveryStartPoint.patch
>>
>> I think at this point in recovery any ERROR at all will probably be
>> fatal.  Once you have some tests in place we'll know for sure.
>>
>> Either way, the goal would be to keep recovery from proceeding and let
>> the user adjust their targets.  Since this is a big behavioral change
>> which will need buy in from the community.
>>
>
> Hoping to get some comments / feedback from the community on the second
> patch too.
>
> > Also, I don't think it's very intuitive how
>> recovery_target_incomplete
>> > works.  For instance, if I set recovery_target_incomplete=promote
>> and
>> > set recovery_target_time, but pick a backup after the specified
>> time,
>> > then there will be an error "recovery_target_time %s is older..."
>> rather
>> > than a promotion.
>> >
>> >
>> > Yes, that is correct and that is the expected behaviour. Let me explain
>> -
>>
>> My point was that this combination of options could lead to confusion on
>> the part of users.  However, I'd rather leave that alone for the time
>> being and focus on the first patch.
>>
>> > I have split the patch into two different
>> > pieces. One is to determine if the recovery can start at all and other
>> > patch deals with the incomplete recovery situation.
>>
>> I think the first patch looks promising and I would rather work through
>> that before considering the second patch.  Once there are tests for the
>> first patch I will complete my review.
>>
>
> I have added all the tests for the second patch and all seem to be working
> fine.
>
> Regarding the first patch, I have included all the tests except for the
> test case on recovery_target_time.
> I did write the test, but the test is behaving kind of weirdly, which I
> am working through to resolve.
>

This issue has been resolved. All good now. To avoid any further confusion,
I have attached the latest versions (4) of both patches with all the
tests included.

I have already changed the patch status to "Needs review".

Thank you !

Regards,

Venkata B N
Database Consultant


recoveryStartPoint.patch-version4
Description: Binary data


recoveryTargetIncomplete.patch-version4
Description: Binary data



[HACKERS] brin autosummarization -- autovacuum "work items"

2017-02-28 Thread Alvaro Herrera
I think one of the most serious issues with BRIN indexes is how they
don't get updated automatically as the table is filled.  This patch
attempts to improve on that.  During brininsert() time, we check whether
we're inserting the first item on the first page in a range.  If we are,
request autovacuum to do a summarization run on that table.  This is
dependent on a new reloption for BRIN called "autosummarize", default
off.

The way the request works is that autovacuum maintains a DSA which can
be filled by backends with "work items".  Currently, work items can
specify a BRIN summarization of some specific index; in the future we
could use this framework to request other kinds of things that do not
fit in the "dead tuples / recently inserted tuples" logic that autovac
currently uses to decide to vacuum/analyze tables.
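
As a rough illustration of that mechanism (the struct layout and names
below are mine, not necessarily the patch's): a backend allocates a small
record in the shared DSA area and links it onto a list that autovacuum
walks later.

#include "postgres.h"

#include "storage/block.h"
#include "utils/dsa.h"

/* illustrative work-item layout; the real patch's struct may differ */
typedef enum WorkItemType
{
	WORKITEM_BRIN_SUMMARIZE
} WorkItemType;

typedef struct WorkItem
{
	WorkItemType type;
	Oid			database;
	Oid			indexrelid;
	BlockNumber blockno;		/* page range to summarize */
	dsa_pointer next;			/* next item in the shared list */
} WorkItem;

/*
 * Backend side: allocate an item in the shared area and push it onto the
 * list head (the caller is assumed to hold whatever lock protects *head).
 */
static void
request_work_item(dsa_area *area, dsa_pointer *head,
				  Oid db, Oid index, BlockNumber blkno)
{
	dsa_pointer itemp = dsa_allocate(area, sizeof(WorkItem));
	WorkItem   *item = (WorkItem *) dsa_get_address(area, itemp);

	item->type = WORKITEM_BRIN_SUMMARIZE;
	item->database = db;
	item->indexrelid = index;
	item->blockno = blkno;
	item->next = *head;
	*head = itemp;
}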

However, it seems I have not quite gotten the hang of DSA just yet,
because after a couple of iterations, crashes occur.  I think the reason
has to do with either a resource owner clearing the DSA at an unwelcome
time, or perhaps there's a mistake in my handling of DSA "relative
pointers" stuff.

This patch was initially written by Simon Riggs, who envisioned that
brininsert itself would invoke the summarization.  However, this doesn't
work because summarization requires having ShareUpdateExclusive lock,
which brininsert doesn't have.  So I modified things to instead use the
DSA stuff.  (He also set things up so that brininsert would only
summarize the just-filled range, but I didn't preserve that idea in the
autovacuum-based implementation; some changed lines there can probably
be removed.)

-- 
Álvaro Herrera                PostgreSQL Expert, https://www.2ndQuadrant.com/
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index 6448b18..480895b 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -74,9 +74,13 @@
tuple; those tuples remain unsummarized until a summarization run is
invoked later, creating initial summaries.
This process can be invoked manually using the
-   brin_summarize_new_values(regclass) function,
-   or automatically when VACUUM processes the table.
+   brin_summarize_new_values(regclass) function;
+   automatically when VACUUM processes the table;
+   or by automatic summarization executed by autovacuum, as insertions
+   occur.  (This last trigger is disabled by default and is enabled with
+   the parameter autosummarize.)
   
+
  
 
 
diff --git a/doc/src/sgml/ref/create_index.sgml 
b/doc/src/sgml/ref/create_index.sgml
index fcb7a60..80d9c39 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -382,7 +382,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS 
] 
 

-BRIN indexes accept a different parameter:
+BRIN indexes accept different parameters:

 

@@ -396,6 +396,16 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS 
] 
 

+
+   
+autosummarize
+
+
+ Defines whether a summarization run is invoked for the previous page
+ range whenever an insertion is detected on the next one.
+
+
+   

   
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b22563b..01586ff 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -26,6 +26,7 @@
 #include "catalog/pg_am.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "utils/builtins.h"
@@ -60,10 +61,12 @@ typedef struct BrinOpaque
BrinDesc   *bo_bdesc;
 } BrinOpaque;
 
+#define BRIN_ALL_BLOCKRANGES   InvalidBlockNumber
+
 static BrinBuildState *initialize_brin_buildstate(Relation idxRel,
   BrinRevmap *revmap, 
BlockNumber pagesPerRange);
 static void terminate_brin_buildstate(BrinBuildState *state);
-static void brinsummarize(Relation index, Relation heapRel,
+static void brinsummarize(Relation index, Relation heapRel, BlockNumber 
pageRange,
  double *numSummarized, double *numExisting);
 static void form_and_insert_tuple(BrinBuildState *state);
 static void union_tuples(BrinDesc *bdesc, BrinMemTuple *a,
@@ -126,8 +129,11 @@ brinhandler(PG_FUNCTION_ARGS)
  * with those of the new tuple.  If the tuple values are not consistent with
  * the summary tuple, we need to update the index tuple.
  *
+ * If autosummarization is enabled, check if we need to summarize the previous
+ * page range.
+ *
  * If the range is not currently summarized (i.e. the revmap returns NULL for
- * it), there's nothing to do.
+ * it), there's nothing to do for this tuple.
  */
 bool
 brininsert(Relation idxRel, Datum *values, bool *nulls,
@@ -141,6 +147,7 @@ brininsert(Relation idxRel, Datum *values, bool *nulls,
Buffer  buf = InvalidBuffer;
MemoryContext tupcxt = NULL;
MemoryContext oldcxt = CurrentMemoryContext;
+   bool 

Re: [HACKERS] rename pg_log directory?

2017-02-28 Thread Peter Eisentraut
On 2/28/17 06:07, Magnus Hagander wrote:
> server_log seems like a better choice then I think. So +1 for that.
> 
> In theory cluster_log since it's a "cluster level log", but given how
> many people already get confused by the term cluster being used that
> way, I think that while maybe technically correct, that would be a very
> bad choice. 

Well, the premise was that we wanted to rename xlog because people think
that the "log" is "the log".  So there should be no confusion if there
is only one "log" left.  (cough, cough, clog, cough, cough)  I think if
we invent nonstandard names, it will be more confusing, not less.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Should we cacheline align PGXACT?

2017-02-28 Thread Ashutosh Sharma
On Tue, Feb 28, 2017 at 11:44 PM, Simon Riggs  wrote:

> On 28 February 2017 at 11:34, Ashutosh Sharma 
> wrote:
>
>
>> So, Here are the pgbench results I got with '
>> *reduce_pgxact_access_AtEOXact.v2.patch*' on a read-write workload.
>>
>
> Thanks for performing a test.
>
> I see a low yet noticeable performance gain across the board on that
> workload.
>
> That is quite surprising to see a gain on that workload. The main workload
> we have been discussing was the full read-only test (-S). For that case the
> effect should be much more noticeable based upon Andres' earlier comments.
>
> Would it be possible to re-run the test using only the -S workload? Thanks
> very much.
>

Okay, I already had the results for the read-only workload but just forgot to
share them along with the results for the read-write test. Here are the
results for the read-only test:

CLIENT COUNT   TPS (HEAD)   TPS (PATCH)   % IMPROVEMENT
           4        26322         31259    18.75617354
           8        63499         69472     9.406447346
          16       181155        186534     2.96928045
          32       333504        337533     1.208081462
          64       350341        353747     0.9721956608
          72       366339        373898     2.063389374
         128       443381        478562     7.934710779
         180       299875        334118    11.41909129
         196       269194        275525     2.351835479
         256       220027        235995     7.257291151

The pgbench settings and non-default params are,

pgbench -i -s 300 postgres
pgbench -M prepared -c $thread -j $thread -T $time_for_reading -S postgres

where, time_for_reading = 10mins

*non default param:*
shared_buffers=8GB
max_connections=300

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

>
>> *pgbench settings:*
>> pgbench -i -s 300 postgres
>> pgbench -M prepared -c $thread -j $thread -T $time_for_reading  postgres
>>
>> where, time_for_reading = 30mins
>>
>> *non default GUC param*
>> shared_buffers=8GB
>> max_connections=300
>>
>> pg_wal is located in SSD.
>>
>> CLIENT COUNT   TPS (HEAD)   TPS (PATCH)   % IMPROVEMENT
>>            4         2588          2601     0.5023183926
>>            8         5094          5098     0.0785237534
>>           16        10294         10307     0.1262871576
>>           32        19779         19815     0.182011224
>>           64        27908         28346     1.569442454
>>           72        27823         28416     2.131330194
>>          128        28455         28618     0.5728342998
>>          180        26739         26879     0.5235797898
>>          196        27820         27963     0.5140186916
>>          256        28763         28969     0.7161978931
>>
>> Also, Excel sheet (results-readwrite-300-SF.xlsx) containing the results
>> for all the 3 runs is attached.
>>
>
>
>
> --
> Simon Riggs                http://www.2ndQuadrant.com/
> 
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] rename pg_log directory?

2017-02-28 Thread Peter Eisentraut
On 2/27/17 09:51, Tom Lane wrote:
> No objection to the basic point, but "log" seems perhaps a little too
> generic to me.  Would something like "server_log" be better?

Well, "log" is pretty well established.  There is /var/log, and if you
unpack a, say, Kafka or Cassandra distribution, they also come with a
log or logs directory.


-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-02-28 Thread Andres Freund
On 2017-02-28 20:18:35 -0800, Andres Freund wrote:
> - Andres, hoping the buildfarm turns greener

Oh well, that didn't work. Investigating.




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-02-28 Thread Andres Freund
On 2017-02-28 10:41:22 -0800, Andres Freund wrote:
> Hi,
> 
> On 2017-02-27 23:44:20 -0800, Andres Freund wrote:
> > *preliminary* patch attached. This needs a good bit of polishing
> > (primarily comment work, verifying that valgrind works), but I'm too
> > tired now.
> > 
> > I'm not quite sure how to deal with mmgr/README - it's written as kind
> > of a historical document, and the "Mechanisms to Allow Multiple Types of
> > Contexts" is already quite out of date.  I think it'd be good to rip out
> > all the historical references and just describe the current state, but
> > I'm not really enthusiastic about tackling that :/
> 
> While still not enthusiastic, I took a stab at doing so.  While still
> not perfect, I do think this is an improvement.
> 
> Is anybody uncomfortable going away from the current historical account
> style?

I've pushed these now.  I'm not claiming that the README revision is
perfect, but we can incrementally improve it further...

- Andres, hoping the buildfarm turns greener




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Andres Freund
Hi,

On 2017-03-01 09:33:07 +0530, Kuntal Ghosh wrote:
> On Wed, Mar 1, 2017 at 9:19 AM, Andres Freund  wrote:
> >> So, I was looking for other alternatives and I've found one called
> >> RobinHood hashing.
> >
> > simplehash.h implements robin hood hashing.

> But, it doesn't implement the swapping idea, right?

It does, that's the if (insertdist > curdist) block in SH_INSERT.
Unless I misunderstand what you're proposing?

- Andres




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Kuntal Ghosh
On Wed, Mar 1, 2017 at 9:19 AM, Andres Freund  wrote:
> On 2017-03-01 09:13:15 +0530, Kuntal Ghosh wrote:
>> On Wed, Mar 1, 2017 at 8:48 AM, Andres Freund  wrote:
>> >> BTW, another option to consider is lowering the target fillfactor.
>> >> IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
>> >> issue.  Obviously, it's better to detect when we need a lower
>> >> fillfactor than to always use a lower one, but obviously the tighter
>> >> you pack the hash table, the more likely it is that you're going to
>> >> have these kinds of problems.
>> >
>> > Yea, that'd be one approach, but I feel better dynamically increasing
>> > the fillfactor like in the patch. Even with a lower fillfactor you could
>> > see issues, and as you say a higher fillfactor is nice [TM]; after the
>> > patch I played with *increasing* the fillfactor, without being able to
>> > measure a downside.
>
>> I've tested with different fill factors and the query got completed
>> with fill factor 0.6.
>
> That's without the patch in
> http://archives.postgresql.org/message-id/20161123083351.5vramz52nmdokhzz%40alap3.anarazel.de
> ? With that patch it should complete without that change, right?
>
No, it's with the patch in the above link which is
0001-Resize-simplehash.h-table-in-case-of-long-runs.patch. But, I've
kept the SH_FILLFACTOR to 0.8 and SH_MAX_FILLFACTOR to 0.95. I'll do
another round of testing with the constants provided by you.

>> With linear probing, the performance degrades more quickly at high
>> fill factors because of primary clustering, a tendency for one
>> collision to cause more nearby collisions. So, increasing the fill
>> factor further doesn't seem to be a good idea.
>
> Well, the idea is increasing it, but *also* detecting clustering and
> adaptively resizing earlier.
>
>
>> So, I was looking for other alternatives and I've found one called
>> RobinHood hashing.
>
> simplehash.h implements robin hood hashing.
>
>
But, it doesn't implement the swapping idea, right?


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-02-28 Thread Corey Huinker
>
>
>>
>> Alternatively if the structure must really be kept, then deal with errors
>> in a first switch, read value *after* switch and deal with other errors
>> there, then start a second switch, and adjust the documentation accordingly?
>>
>>   switch
>> errors
>>   read
>>   if
>> errors
>>   // no error
>>   switch
>>
>
>
It's now something more like:

switch
  error-conditions
if no-errors
  read
  if was a boolean
  switch last-state

It doesn't strike me as much cleaner, but it's no worse, either.
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ae58708..9d245a9 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2035,6 +2035,83 @@ hello 10
 
   
 
+  
+\if expr
+\elif expr
+\else
+\endif
+
+
+This group of commands implements nestable conditional blocks, like
+this:
+
+
+-- set ON_ERROR_STOP in case the variables are not valid boolean expressions
+\set ON_ERROR_STOP on
+-- check for the existence of two separate records in the database and store
+-- the results in separate psql variables
+SELECT
+EXISTS(SELECT 1 FROM customer WHERE customer_id = 123) as is_customer,
+EXISTS(SELECT 1 FROM employee WHERE employee_id = 456) as is_employee
+\gset
+\if :is_customer
+SELECT * FROM customer WHERE customer_id = 123;
+\elif :is_employee
+\echo 'is not a customer but is an employee'
+SELECT * FROM employee WHERE employee_id = 456;
+\else
+\if yes
+\echo 'not a customer or employee'
+\else
+\echo 'this should never print'
+\endif
+\endif
+
+
+Conditional blocks must begin with a \if and end
+with an \endif, and the pairs must be found in
+the same source file. If an EOF is reached on the main file or an
+\include-ed file before all local
+\if-\endif are matched, then
+psql will raise an error.
+
+
+A conditional block can have any number of 
+\elif clauses, which may optionally be followed by a
+single \else clause.
+
+
+The \if and \elif commands
+read the rest of the line and evaluate it as a boolean expression.
+Currently, expressions are limited to a single unquoted string
+which are evaluated like other on/off options, so the valid values
+are any unambiguous case insensitive matches for one of:
+true, false, 
1,
+0, on, off,
+yes, no.  So 
+t, T, and tR
+will all match true.
+
+Expressions that do not properly evaluate to true or false will
+generate an error and cause the \if or
+\elif command to fail.  Because that behavior may
+change branching context in undesirable ways (executing code which
+was intended to be skipped, causing \elif,
+\else, and \endif commands to
+pair with the wrong \if, etc), it is
+recommended that scripts which use conditionals also set
+ON_ERROR_STOP.
+
+
+Lines within false branches are parsed normally for query/command
+completion, but any queries are not sent to the server, and any
+commands other than conditionals (\if,
+\elif, \else,
+\endif) are ignored. Conditional commands are
+still checked for valid nesting.
+
+
+  
 
   
 \ir or \include_relative 
filename
@@ -3703,8 +3780,9 @@ testdb= INSERT INTO my_table VALUES 
(:'content');
 
 
 In prompt 1 normally =,
-but ^ if in single-line mode,
-or ! if the session is disconnected from the
+but @ if the session is in a false conditional
+block, or ^ if in single-line mode, or
+! if the session is disconnected from the
 database (which can happen if \connect fails).
 In prompt 2 %R is replaced by a character that
 depends on why psql expects more input:
diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index c53733f..1492b66 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -21,7 +21,7 @@ REFDOCDIR= $(top_srcdir)/doc/src/sgml/ref
 override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
 
-OBJS=  command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
+OBJS=  command.o common.o conditional.o help.o input.o stringutils.o 
mainloop.o copy.o \
startup.o prompt.o variables.o large_obj.o describe.o \
crosstabview.o tab-complete.o \
sql_help.o psqlscanslash.o \
@@ -61,8 +61,16 @@ uninstall:
 
 clean distclean:
rm -f psql$(X) $(OBJS) lex.backup
+   rm -rf tmp_check
 
 # files removed here are supposed to be in the distribution tarball,
 # so do not clean them in the clean/distclean rules
 maintainer-clean: distclean
rm -f 

Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Andres Freund
On 2017-03-01 09:21:47 +0530, Robert Haas wrote:
> On Wed, Mar 1, 2017 at 8:48 AM, Andres Freund  wrote:
> >> BTW, another option to consider is lowering the target fillfactor.
> >> IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
> >> issue.  Obviously, it's better to detect when we need a lower
> >> fillfactor than to always use a lower one, but obviously the tighter
> >> you pack the hash table, the more likely it is that you're going to
> >> have these kinds of problems.
> >
> > Yea, that'd be one approach, but I feel better dynamically increasing
> > the fillfactor like in the patch. Even with a lower fillfactor you could
> > see issues, and as you say a higher fillfactor is nice [TM]; after the
> > patch I played with *increasing* the fillfactor, without being able to
> > measure a downside.
> 
> I am going to bet that increasing the fillfactor would be a mistake.
> The expected length of a clump in the table is going to be about
> 1/(1-ff), which increases very quickly beyond the current value of
> 0.8.  For example, if the actual fillfactor is 0.9 and the keys are
> perfectly distributed -- nine in a row and then one empty slot,
> lather, rinse, repeat -- probing for a key that's not there has a 10%
> chance of hitting an empty slot, a 10% chance of hitting the slot just
> before an empty slot, and so on.  So the expected number of probes to
> find that there's no match is 4.5.  However, it could easily happen
> that you have 3 or 4 empty slots in a row in one place and 27 or 36
> occupied slots in a row in another place, and that could cause all
> kinds of grief.  Moreover, real-world hash functions on real-world
> data aren't always perfect, which increases the chances of things
> going off track.  I'm not exactly sure how high a fillfactor is too
> high, but note that
> https://en.wikipedia.org/wiki/Hash_table#Open_addressing claims that
> "performance dramatically degrades when the load factor grows beyond
> 0.7 or so", which isn't cited but the same text appears in
> https://www.unf.edu/~wkloster/3540/wiki_book2.pdf for whatever that's
> worth.

That's without the "trick" of "robin hood" hashing though:
https://en.wikipedia.org/wiki/Hash_table#Robin_Hood_hashing
http://codecapsule.com/2013/11/17/robin-hood-hashing-backward-shift-deletion/
https://www.pvk.ca/Blog/2013/11/26/the-other-robin-hood-hashing/

it probably depends a bit on the scenario how high you want to have the
fillfactor - for hashaggs a high fill factor would probably be good
(they're often "read" heavy), for bitmapscans less so.

But I guess there's not much point in immediately increasing the
fillfactor, can do that in a second step.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] perlcritic

2017-02-28 Thread Peter Eisentraut
I posted this about 18 months ago but then ran out of steam.  In the
meantime, some people have been going around doing various Perl code
cleanups in parts of the code, so it seems it makes sense to proceed
with this.  We use "use strict" everywhere now, so some of the original
patch has gone away.  Here is an updated patch.  The testing
instructions below still apply.  Especially welcome would be ideas on
how to address some of the places I have marked with ## no critic.

On 8/31/15 23:57, Peter Eisentraut wrote:
> We now have 80+ Perl files in our tree, and it's growing.  Some of those
[actually >=117 now]
> files were originally written for Perl 4, and the coding styles and
> quality are quite, uh, divergent.  So I figured it's time to clean up
> that code a bit.  I ran perlcritic over the tree and cleaned up all the
> warnings at level 5 (the default, least severe).
> 
> Testing guidelines:
> 
> - Many files are part of the regular build or test process.
> 
> - msvc files need to be tested separately.  I tested as best as I could
> on a non-Windows system.
> 
> - There are a couple of one-offs in contrib and src/test that need to be
> run manually.
> 
> - The stuff under utils/mb/Unicode/
[has already been cleaned up separately]

> To install perlcritic, run
> 
> cpan -i Perl::Critic
> 
> and then run
> 
> perlcritic .
> 
> at the top of the tree (or a subdirectory).

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From af08d7e1b7a947a3f94bb1cef7508c3f926cc35a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 4 Jan 2017 12:00:00 -0500
Subject: [PATCH v2 1/2] Clean up Perl code according to perlcritic severity
 level 5

---
 contrib/intarray/bench/create_test.pl|  20 +--
 doc/src/sgml/generate-errcodes-table.pl  |   2 +-
 doc/src/sgml/mk_feature_tables.pl|  12 +-
 src/backend/catalog/Catalog.pm   |   8 +-
 src/backend/catalog/genbki.pl|  64 -
 src/backend/parser/check_keywords.pl |  30 ++---
 src/backend/storage/lmgr/generate-lwlocknames.pl |  30 ++---
 src/backend/utils/Gen_fmgrtab.pl |  32 ++---
 src/backend/utils/generate-errcodes.pl   |   2 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  26 ++--
 src/bin/pg_ctl/t/001_start_stop.pl   |  14 +-
 src/bin/psql/create_help.pl  |  28 ++--
 src/interfaces/ecpg/preproc/check_rules.pl   |  12 +-
 src/interfaces/libpq/test/regress.pl |  10 +-
 src/pl/plperl/plc_perlboot.pl|   4 +-
 src/pl/plperl/plc_trusted.pl |   2 +-
 src/pl/plperl/text2macro.pl  |   8 +-
 src/pl/plpgsql/src/generate-plerrcodes.pl|   2 +-
 src/pl/plpython/generate-spiexceptions.pl|   2 +-
 src/pl/tcl/generate-pltclerrcodes.pl |   2 +-
 src/test/locale/sort-test.pl |   4 +-
 src/test/perl/PostgresNode.pm|   8 +-
 src/test/perl/TestLib.pm |  16 +--
 src/test/ssl/ServerSetup.pm  |  48 +++
 src/tools/fix-old-flex-code.pl   |   4 +-
 src/tools/msvc/Install.pm|  10 +-
 src/tools/msvc/Mkvcbuild.pm  |   2 +-
 src/tools/msvc/Project.pm|  28 ++--
 src/tools/msvc/Solution.pm   | 162 +++
 src/tools/msvc/build.pl  |   8 +-
 src/tools/msvc/builddoc.pl   |   2 +-
 src/tools/msvc/gendef.pl |  18 +--
 src/tools/msvc/install.pl|   4 +-
 src/tools/msvc/mkvcbuild.pl  |   4 +-
 src/tools/msvc/pgbison.pl|   4 +-
 src/tools/msvc/pgflex.pl |  12 +-
 src/tools/msvc/vcregress.pl  |  19 +--
 src/tools/pginclude/pgcheckdefines   |  32 ++---
 src/tools/pgindent/pgindent  |   4 +-
 src/tools/version_stamp.pl   |   6 +-
 src/tools/win32tzlist.pl |   6 +-
 41 files changed, 356 insertions(+), 355 deletions(-)

diff --git a/contrib/intarray/bench/create_test.pl b/contrib/intarray/bench/create_test.pl
index 1323b31e4d..f3262df05b 100755
--- a/contrib/intarray/bench/create_test.pl
+++ b/contrib/intarray/bench/create_test.pl
@@ -15,8 +15,8 @@
 
 EOT
 
-open(MSG, ">message.tmp") || die;
-open(MAP, ">message_section_map.tmp") || die;
+open(my $msg, '>', "message.tmp") || die;
+open(my $map, '>', "message_section_map.tmp") || die;
 
 srand(1);
 
@@ -42,16 +42,16 @@
 	}
 	if ($#sect < 0 || rand() < 0.1)
 	{
-		print MSG "$i\t\\N\n";
+		print $msg "$i\t\\N\n";
 	}
 	else
 	{
-		print MSG "$i\t{" . join(',', @sect) . "}\n";
-		map { print MAP "$i\t$_\n" } @sect;
+		print $msg 

Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Robert Haas
On Wed, Mar 1, 2017 at 8:48 AM, Andres Freund  wrote:
>> BTW, another option to consider is lowering the target fillfactor.
>> IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
>> issue.  Obviously, it's better to detect when we need a lower
>> fillfactor than to always use a lower one, but obviously the tighter
>> you pack the hash table, the more likely it is that you're going to
>> have these kinds of problems.
>
> Yea, that'd be one approach, but I feel better dynamically increasing
> the fillfactor like in the patch. Even with a lower fillfactor you could
> see issues, and as you say a higher fillfactor is nice [TM]; after the
> patch I played with *increasing* the fillfactor, without being able to
> measure a downside.

I am going to bet that increasing the fillfactor would be a mistake.
The expected length of a clump in the table is going to be about
1/(1-ff), which increases very quickly beyond the current value of
0.8.  For example, if the actual fillfactor is 0.9 and the keys are
perfectly distributed -- nine in a row and then one empty slot,
lather, rinse, repeat -- probing for a key that's not there has a 10%
chance of hitting an empty slot, a 10% chance of hitting the slot just
before an empty slot, and so on.  So the expected number of probes to
find that there's no match is 4.5.  However, it could easily happen
that you have 3 or 4 empty slots in a row in one place and 27 or 36
occupied slots in a row in another place, and that could cause all
kinds of grief.  Moreover, real-world hash functions on real-world
data aren't always perfect, which increases the chances of things
going off track.  I'm not exactly sure how high a fillfactor is too
high, but note that
https://en.wikipedia.org/wiki/Hash_table#Open_addressing claims that
"performance dramatically degrades when the load factor grows beyond
0.7 or so", which isn't cited but the same text appears in
https://www.unf.edu/~wkloster/3540/wiki_book2.pdf for whatever that's
worth.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Andres Freund
On 2017-03-01 09:13:15 +0530, Kuntal Ghosh wrote:
> On Wed, Mar 1, 2017 at 8:48 AM, Andres Freund  wrote:
> >> BTW, another option to consider is lowering the target fillfactor.
> >> IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
> >> issue.  Obviously, it's better to detect when we need a lower
> >> fillfactor than to always use a lower one, but obviously the tighter
> >> you pack the hash table, the more likely it is that you're going to
> >> have these kinds of problems.
> >
> > Yea, that'd be one approach, but I feel better dynamically increasing
> > the fillfactor like in the patch. Even with a lower fillfactor you could
> > see issues, and as you say a higher fillfactor is nice [TM]; after the
> > patch I played with *increasing* the fillfactor, without being able to
> > measure a downside.

> I've tested with different fill factors and the query got completed
> with fill factor 0.6.

That's without the patch in
http://archives.postgresql.org/message-id/20161123083351.5vramz52nmdokhzz%40alap3.anarazel.de
? With that patch it should complete without that change, right?

> With linear probing, the performance degrades more quickly at high
> fill factors because of primary clustering, a tendency for one
> collision to cause more nearby collisions. So, increasing the fill
> factor further doesn't seem to be a good idea.

Well, the idea is increasing it, but *also* detecting clustering and
adaptively resizing earlier.
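
(As a purely hypothetical illustration of "detecting clustering and adaptively
resizing earlier" -- the names, thresholds, and structure below are invented
for this sketch and are not taken from the actual simplehash patch:)

/* Hypothetical sketch only: grow a linear-probing table "early" when inserts
 * start producing long probe distances or lots of shifting, even though the
 * nominal fillfactor has not been reached.  All names and constants here are
 * invented for illustration. */
#include <stdbool.h>
#include <stdint.h>

#define GROW_MAX_PROBE_DIST 32      /* a single insert probed too far */
#define GROW_MAX_MOVED      512     /* too many entries shifted since last grow */

typedef struct
{
    uint64_t    size;           /* number of slots */
    uint64_t    members;        /* number of occupied slots */
    uint64_t    moved;          /* entries shifted since the last resize */
    double      fillfactor;     /* nominal grow threshold, e.g. 0.8 */
} HashTableStats;

/* Called from the insert path with this insert's probe distance and the
 * number of existing entries that had to be shifted to make room.  Returns
 * true if the caller should double the table and retry the insert. */
static bool
should_grow(HashTableStats *st, int probe_dist, int shifted_now)
{
    st->moved += shifted_now;

    if (st->members >= (uint64_t) (st->fillfactor * st->size))
        return true;            /* ordinary fillfactor-driven growth */

    if (probe_dist > GROW_MAX_PROBE_DIST || st->moved > GROW_MAX_MOVED)
    {
        st->moved = 0;
        return true;            /* clustering detected: grow early */
    }

    return false;
}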


> So, I was looking for other alternatives and I've found one called
> RobinHood hashing.

simplehash.h implements robin hood hashing.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-02-28 Thread Pavel Stehule
>
>
>>>
>> Good work - it will be pretty big patch.
>>
>> There is an intersection with the implementation of XMLTABLE. I prepared an
>> executor infrastructure, so it could reduce the size of this patch a little bit.
>>
>
> we considered your XMLTABLE patch, but it's itself pretty big and in
> unknown state.
>

It is big, but it is hard to expect that JSON_TABLE can be shorter once you
resolve all committers' requests.

The last patch should be close to its final state.


>
>
>>
>> Taking only Oracle as origin can be risk - in details Oracle doesn't
>> respects owns proposal to standard.
>>
>
> we used the original standard document!  I suggest Oracle to those who
> don't have access to the standard. Yes, there are some problems in Oracle's
> implementation.
>
>
>>
>> This is last commitfest for current release cycle - are you sure, so is
>> good idea to push all mentioned features?
>>
>
> This would be a great feature for Release 10 and I understand all risks.
> Hopefully, community will help us. We have resources to continue our work
> and will do as much as possible to satisfy community requirements. It's not
> our fault, that standard was released so late :)
>

It is not your fault. Ok, I am looking for patches.

Regards

Pavel



>
>
>
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>>
>>> Best regards,
>>>
>>> Oleg
>>>
>>>
>>> --
>>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>
>>>
>>
>


Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Kuntal Ghosh
On Wed, Mar 1, 2017 at 8:48 AM, Andres Freund  wrote:
>> BTW, another option to consider is lowering the target fillfactor.
>> IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
>> issue.  Obviously, it's better to detect when we need a lower
>> fillfactor than to always use a lower one, but obviously the tighter
>> you pack the hash table, the more likely it is that you're going to
>> have these kinds of problems.
>
> Yea, that'd be one approach, but I feel better dynamically increasing
> the fillfactor like in the patch. Even with a lower fillfactor you could
> see issues, and as you say a higher fillfactor is nice [TM]; after the
> patch I played with *increasing* the fillfactor, without being able to
> measure a downside.
I've tested with different fill factors and the query got completed
with fill factor 0.6.

With linear probing, the performance degrades more quickly at high
fill factors because of primary clustering, a tendency for one
collision to cause more nearby collisions. So, increasing the fill
factor further doesn't seem to be a good idea. Besides, when I tested
with the patch, I saw the hash table grow even when only 10-12% of the
hash table was filled.

LOG:  size: 8388608, members: 845056, filled: 0.100739, total chain:
735723, max chain: 37, avg chain: 0.870620, total_collisions: 219101,
max_collisions: 6, avg_collisions: 0.259274
STATEMENT:  explain analyze select l_orderkey from lineitem group by l_orderkey;
LOG:  size: 16777216, members: 845056, filled: 0.050369, total chain:
197047, max chain: 6, avg chain: 0.233176, total_collisions: 121185,
max_collisions: 5, avg_collisions: 0.143405
STATEMENT:  explain analyze select l_orderkey from lineitem group by l_orderkey;
LOG:  size: 16777216, members: 2127649, filled: 0.126818, total chain:
1471327, max chain: 35, avg chain: 0.691527, total_collisions: 486310,
max_collisions: 6, avg_collisions: 0.228567
STATEMENT:  explain analyze select l_orderkey from lineitem group by l_orderkey;
LOG:  size: 33554432, members: 2127649, filled: 0.063409, total chain:
413073, max chain: 7, avg chain: 0.194145, total_collisions: 265766,
max_collisions: 5, avg_collisions: 0.124911
STATEMENT:  explain analyze select l_orderkey from lineitem group by l_orderkey;

This seems bad. We're wasting a lot of memory in the hash table.

Apart from that, I was thinking about the following optimization in the code.
/*
 * If the bucket is not empty, we either found a match (in which case
 * we're done), or we have to decide whether to skip over or move the
 * colliding entry. When the colliding element's distance to its
 * optimal position is smaller than the to-be-inserted entry's, we
 * shift the colliding entry (and its followers) forward by one.
 */
I'm worried about the fact that we are increasing the chain length of
each follower. It may happen that we've increased the chain length of
the last element by a huge factor.

So, I was looking for other alternatives and I found one called
Robin Hood hashing. In this approach, when you probe for a position to
insert a new element, if the probe length for the existing element is
less than the current probe length for the element being inserted, you
swap the two elements and keep going. It leads to a low variance in
chain lengths, rather than piling the extra probes onto the last
elements. Has this approach already been considered?
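
(To make that insertion rule concrete, here is a small, self-contained toy in
C -- the identity "hash", table size, and names are all invented for this
illustration; this is not the simplehash.h code:)

/* Toy Robin Hood insertion into a linear-probing table of int64 keys.
 * A slot's "displacement" is how far it sits from its optimal position;
 * an insert that has probed farther than the resident entry's displacement
 * evicts that entry and keeps inserting the evicted key instead. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 1024                 /* power of two, so we can mask */

typedef struct { int64_t key; bool used; } Slot;

static uint32_t optimal_slot(int64_t key)
{
    return (uint32_t) key & (TABLE_SIZE - 1);   /* toy "hash": identity */
}

static uint32_t displacement(uint32_t slotno, int64_t key)
{
    return (slotno - optimal_slot(key)) & (TABLE_SIZE - 1);
}

static void rh_insert(Slot *table, int64_t key)
{
    uint32_t slotno = optimal_slot(key);
    uint32_t dist = 0;              /* how far we have probed so far */

    for (;;)
    {
        Slot *s = &table[slotno];

        if (!s->used)               /* empty slot: claim it */
        {
            s->key = key;
            s->used = true;
            return;
        }

        /* Resident entry is closer to home than we are: evict it and
         * continue the insertion with the evicted key ("rob the rich"). */
        if (displacement(slotno, s->key) < dist)
        {
            int64_t evicted = s->key;

            s->key = key;
            key = evicted;
            dist = displacement(slotno, key);
        }

        slotno = (slotno + 1) & (TABLE_SIZE - 1);
        dist++;
    }
}

int main(void)
{
    static Slot table[TABLE_SIZE];          /* zero-initialized: all slots empty */
    int64_t keys[] = {3, 1027, 2051, 4};    /* 3, 1027, 2051 all map to slot 3 */
    unsigned i;

    for (i = 0; i < sizeof(keys) / sizeof(keys[0]); i++)
        rh_insert(table, keys[i]);

    for (i = 3; i <= 6; i++)
        printf("slot %u: %lld\n", i,
               table[i].used ? (long long) table[i].key : -1LL);

    return 0;
}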


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-02-28 Thread David Steele
On 2/28/17 10:22 PM, Robert Haas wrote:
> On Tue, Feb 28, 2017 at 6:22 AM, David Steele  wrote:
>>>> I'm not sure that's the case.  It seems like it should lock just as
>>>> multiple backends would now.  One process would succeed and the others
>>>> would error.  Maybe I'm missing something?
>>>
>>> Hm, any errors happening in the workers would be reported to the
>>> leader, meaning that even if one worker succeeded in running
>>> pg_start_backup() it would be reported as an error at the end to the
>>> client, no? By marking the exclusive function restricted we make sure
>>> that it is just the leader that fails or succeeds.
>>
>> Good point, and it strengthens the argument beyond, "it just seems right."
> 
> I think the argument should be based on whether or not the function
> depends on backend-private state that will not be synchronized.
> That's the definition of what makes something parallel-restricted or
> not.

Absolutely.  Yesterday was a long day so I may have (perhaps) become a
bit flippant.

> It looks like pg_start_backup() and pg_stop_backup() depend on the
> backend-private global variable nonexclusive_backup_running, so they
> should be parallel-restricted.

Agreed.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] some dblink refactoring

2017-02-28 Thread Corey Huinker
On Tue, Feb 28, 2017 at 10:09 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> Here is a patch to refactor some macro hell in dblink.
>
> This patch was discussed in the background sessions thread as a
> prerequisite for some work there, but I figure I'll make a separate
> thread for it to give everyone interested in dblink a chance to respond
> separate from the other thread.
>

+1

Any chance we can make get_connect_string() a core function or at least
externally accessible?


Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-02-28 Thread Robert Haas
On Tue, Feb 28, 2017 at 6:22 AM, David Steele  wrote:
>>> I'm not sure that's the case.  It seems like it should lock just as
>>> multiple backends would now.  One process would succeed and the others
>>> would error.  Maybe I'm missing something?
>>
>> Hm, any errors happening in the workers would be reported to the
>> leader, meaning that even if one worker succeeded in running
>> pg_start_backup() it would be reported as an error at the end to the
>> client, no? By marking the exclusive function restricted we make sure
>> that it is just the leader that fails or succeeds.
>
> Good point, and it strengthens the argument beyond, "it just seems right."

I think the argument should be based on whether or not the function
depends on backend-private state that will not be synchronized.
That's the definition of what makes something parallel-restricted or
not.

It looks like pg_start_backup() and pg_stop_backup() depend on the
backend-private global variable nonexclusive_backup_running, so they
should be parallel-restricted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cast result of copyObject()

2017-02-28 Thread Peter Eisentraut
> The problem this patch would address is that currently you can write
> 
> SomeNode *x = ...;
> 
> ...
> 
> 
> OtherNode *y = copyObject(x);
> 
> and there is no notice about that potential mistake.
> 
> This patch makes that an error.
> 
> If you are sure about what you are doing, you can add a cast.  But
> casting the result of copyObject() should be limited to a few specific
> cases where the result is assigned to a generic Node * or something like
> that.

Updated patch, which I will register in the commit fest for some wider
portability testing (Windows!).

(changed subject copyNode -> copyObject (was excited about castNode at
the time))
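
(For readers who want to see the shape of the mechanism: a minimal sketch,
assuming the patch renames the copying function -- called copyObjectImpl()
below purely for illustration -- and wraps it in a macro when the compiler
has typeof. The exact names in the patch may differ.)

/* Sketch only: a typeof-based cast preserves the argument's pointer type,
 * so mismatched assignments are diagnosed by the compiler. */
extern void *copyObjectImpl(const void *from);

#ifdef HAVE_TYPEOF
#define copyObject(obj) ((typeof(obj)) copyObjectImpl(obj))
#else
#define copyObject(obj) copyObjectImpl(obj)
#endif

/*
 * With HAVE_TYPEOF defined, this now draws an incompatible-pointer warning
 * or error:
 *
 *     SomeNode  *x = ...;
 *     OtherNode *y = copyObject(x);
 *
 * while assigning to a generic Node * requires an explicit cast:
 *
 *     Node *n = (Node *) copyObject(x);
 */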

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From a6cccd06e22cbd4d84da1c2d1b085c68ae3d8a9a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 28 Feb 2017 15:07:05 -0500
Subject: [PATCH v2] Cast result of copyObject() to correct type

copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.
---
 config/c-compiler.m4  | 27 +
 configure | 40 +++
 configure.in  |  1 +
 src/backend/bootstrap/bootstrap.c |  4 ++--
 src/backend/commands/createas.c   |  2 +-
 src/backend/commands/event_trigger.c  |  8 +++
 src/backend/commands/prepare.c|  4 ++--
 src/backend/commands/view.c   |  2 +-
 src/backend/nodes/copyfuncs.c |  4 
 src/backend/optimizer/path/indxpath.c |  4 ++--
 src/backend/optimizer/plan/createplan.c   |  6 ++---
 src/backend/optimizer/plan/initsplan.c|  8 +++
 src/backend/optimizer/plan/planagg.c  |  4 ++--
 src/backend/optimizer/plan/planner.c  |  4 ++--
 src/backend/optimizer/plan/setrefs.c  | 26 ++--
 src/backend/optimizer/plan/subselect.c| 14 +--
 src/backend/optimizer/prep/prepjointree.c |  2 +-
 src/backend/optimizer/prep/preptlist.c|  2 +-
 src/backend/optimizer/prep/prepunion.c|  4 ++--
 src/backend/optimizer/util/tlist.c| 12 +-
 src/backend/parser/analyze.c  |  2 +-
 src/backend/parser/gram.y |  2 +-
 src/backend/parser/parse_clause.c |  2 +-
 src/backend/parser/parse_expr.c   |  2 +-
 src/backend/parser/parse_relation.c   |  2 +-
 src/backend/parser/parse_utilcmd.c|  6 ++---
 src/backend/rewrite/rewriteHandler.c  |  8 +++
 src/backend/rewrite/rewriteManip.c|  8 +++
 src/backend/tcop/postgres.c   |  6 ++---
 src/backend/utils/cache/plancache.c   | 14 +--
 src/backend/utils/cache/relcache.c|  8 +++
 src/include/nodes/nodes.h |  4 
 src/include/optimizer/tlist.h |  4 ++--
 src/include/pg_config.h.in|  6 +
 src/include/pg_config.h.win32 |  6 +
 35 files changed, 173 insertions(+), 85 deletions(-)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 7d901e1f1a..7afaec5f85 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -178,6 +178,33 @@ fi])# PGAC_C_STATIC_ASSERT
 
 
 
+# PGAC_C_TYPEOF
+# -
+# Check if the C compiler understands typeof or a variant.  Define
+# HAVE_TYPEOF if so, and define 'typeof' to the actual key word.
+#
+AC_DEFUN([PGAC_C_TYPEOF],
+[AC_CACHE_CHECK(for typeof, pgac_cv_c_typeof,
+[pgac_cv_c_typeof=no
+for pgac_kw in typeof __typeof__ decltype; do
+  AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
+[int x = 0;
+$pgac_kw(x) y;
+y = x;
+return y;])],
+[pgac_cv_c_typeof=$pgac_kw])
+  test "$pgac_cv_c_typeof" != no && break
+done])
+if test "$pgac_cv_c_typeof" != no; then
+  AC_DEFINE(HAVE_TYPEOF, 1,
+[Define to 1 if your compiler understands `typeof' or something similar.])
+  if test "$pgac_cv_c_typeof" != typeof; then
+AC_DEFINE(typeof, $pgac_cv_c_typeof, [Define to how the compiler spells `typeof'.])
+  fi
+fi])# PGAC_C_TYPEOF
+
+
+
 # PGAC_C_TYPES_COMPATIBLE
 # ---
 # Check if the C compiler understands __builtin_types_compatible_p,
diff --git a/configure b/configure
index b5cdebb510..47a1a59e8e 100755
--- a/configure
+++ b/configure
@@ -11399,6 +11399,46 @@ if test x"$pgac_cv__static_assert" = xyes ; then
 $as_echo "#define HAVE__STATIC_ASSERT 1" >>confdefs.h
 
 fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for typeof" >&5
+$as_echo_n "checking for typeof... " >&6; }
+if 

Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Andres Freund
On 2017-03-01 08:42:35 +0530, Robert Haas wrote:
> On Tue, Feb 28, 2017 at 11:16 PM, Andres Freund  wrote:
> >> To me, it seems like a big problem that we have large, unfixed
> >> performance regressions with simplehash four months after it went in.
> >
> > Yea, I agree.  I'm fairly sure that the patch I posted in that thread
> > actually fixes the issue (and would also have made the already-applied hash
> > patch of yours a small-ish optimization). I held back because I
> > disliked the idea of magic constants, and I couldn't figure out a way to
> > properly "derive" them - but I'm inclined to simply live with the magic
> > constants.
> 
> I think it's better to push in a proposed fix and see how it holds up
> than to leave the tree in an unfixed state for long periods of time.

Yea, that was a mistake.  I kind of was hoping for an epiphany that
has not yet come.


> BTW, another option to consider is lowering the target fillfactor.
> IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
> issue.  Obviously, it's better to detect when we need a lower
> fillfactor than to always use a lower one, but obviously the tighter
> you pack the hash table, the more likely it is that you're going to
> have these kinds of problems.

Yea, that'd be one approach, but I feel better dynamically increasing
the fillfactor like in the patch. Even with a lower fillfactor you could
see issues, and as you say a higher fillfactor is nice [TM]; after the
patch I played with *increasing* the fillfactor, without being able to
measure a downside.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] use SQL standard error code for nextval

2017-02-28 Thread Peter Eisentraut
The SQL standard defines a separate error code for nextval exhausting
the sequence space.  I haven't found any discussion of this in the
archives, so it seems this was just not considered or not yet in
existence when the error codes were introduced.  Here is a patch to
correct it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From db87587803458c6ade81819441321e8761d5ef7a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 28 Feb 2017 15:14:14 -0500
Subject: [PATCH] Use SQL standard error code for nextval

---
 src/backend/commands/sequence.c | 4 ++--
 src/backend/utils/errcodes.txt  | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e0df642254..5820fee5a3 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -694,7 +694,7 @@ nextval_internal(Oid relid)
 
 	snprintf(buf, sizeof(buf), INT64_FORMAT, maxv);
 	ereport(ERROR,
-		  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+		  (errcode(ERRCODE_SEQUENCE_GENERATOR_LIMIT_EXCEEDED),
 		   errmsg("nextval: reached maximum value of sequence \"%s\" (%s)",
   RelationGetRelationName(seqrel), buf)));
 }
@@ -717,7 +717,7 @@ nextval_internal(Oid relid)
 
 	snprintf(buf, sizeof(buf), INT64_FORMAT, minv);
 	ereport(ERROR,
-		  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+		  (errcode(ERRCODE_SEQUENCE_GENERATOR_LIMIT_EXCEEDED),
 		   errmsg("nextval: reached minimum value of sequence \"%s\" (%s)",
   RelationGetRelationName(seqrel), buf)));
 }
diff --git a/src/backend/utils/errcodes.txt b/src/backend/utils/errcodes.txt
index 46aadd76f7..b6e0e987a8 100644
--- a/src/backend/utils/errcodes.txt
+++ b/src/backend/utils/errcodes.txt
@@ -188,6 +188,7 @@ Section: Class 22 - Data Exception
 22004EERRCODE_NULL_VALUE_NOT_ALLOWED null_value_not_allowed
 22002EERRCODE_NULL_VALUE_NO_INDICATOR_PARAMETER  null_value_no_indicator_parameter
 22003EERRCODE_NUMERIC_VALUE_OUT_OF_RANGE numeric_value_out_of_range
+2200HEERRCODE_SEQUENCE_GENERATOR_LIMIT_EXCEEDED  sequence_generator_limit_exceeded
 22026EERRCODE_STRING_DATA_LENGTH_MISMATCHstring_data_length_mismatch
 22001EERRCODE_STRING_DATA_RIGHT_TRUNCATION   string_data_right_truncation
 22011EERRCODE_SUBSTRING_ERRORsubstring_error
-- 
2.12.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Robert Haas
On Tue, Feb 28, 2017 at 11:16 PM, Andres Freund  wrote:
>> To me, it seems like a big problem that we have large, unfixed
>> performance regressions with simplehash four months after it went in.
>
> Yea, I agree.  I'm fairly sure that the patch I posted in that thread
> actually fixes the issue (and would also have made the already-applied hash
> patch of yours a small-ish optimization). I held back because I
> disliked the idea of magic constants, and I couldn't figure out a way to
> properly "derive" them - but I'm inclined to simply live with the magic
> constants.

I think it's better to push in a proposed fix and see how it holds up
than to leave the tree in an unfixed state for long periods of time.
If the fix is good, then the fact that there's a magic constant
doesn't really matter all that much, and it can always be improved
later.  On the other hand, if the fix is bad, pushing it improves the
chances of finding the problems.  Not many people are going to test an
uncommitted fix.

BTW, another option to consider is lowering the target fillfactor.
IIRC, Kuntal mentioned to me that cranking it down seemed to fix the
issue.  Obviously, it's better to detect when we need a lower
fillfactor than to always use a lower one, but obviously the tighter
you pack the hash table, the more likely it is that you're going to
have these kinds of problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] some dblink refactoring

2017-02-28 Thread Peter Eisentraut
Here is a patch to refactor some macro hell in dblink.

This patch was discussed in the background sessions thread as a
prerequisite for some work there, but I figure I'll make a separate
thread for it to give everyone interested in dblink a chance to respond
separate from the other thread.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 2e1fc0b0c50452bac91461a2317c28a8718fe89f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 25 Dec 2016 12:00:00 -0500
Subject: [PATCH v2 1/2] dblink: Replace some macros by static functions

Also remove some unused code and the no longer useful dblink.h file.
---
 contrib/dblink/dblink.c | 290 +++-
 contrib/dblink/dblink.h |  39 ---
 2 files changed, 137 insertions(+), 192 deletions(-)
 delete mode 100644 contrib/dblink/dblink.h

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index e0d6778a08..67d6699066 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -61,8 +61,6 @@
 #include "utils/tqual.h"
 #include "utils/varlena.h"
 
-#include "dblink.h"
-
 PG_MODULE_MAGIC;
 
 typedef struct remoteConn
@@ -146,98 +144,102 @@ typedef struct remoteConnHashEnt
 /* initial number of connection hashes */
 #define NUMCONN 16
 
-/* general utility */
-#define xpfree(var_) \
-	do { \
-		if (var_ != NULL) \
-		{ \
-			pfree(var_); \
-			var_ = NULL; \
-		} \
-	} while (0)
-
-#define xpstrdup(var_c, var_) \
-	do { \
-		if (var_ != NULL) \
-			var_c = pstrdup(var_); \
-		else \
-			var_c = NULL; \
-	} while (0)
-
-#define DBLINK_RES_INTERNALERROR(p2) \
-	do { \
-			msg = pchomp(PQerrorMessage(conn)); \
-			if (res) \
-PQclear(res); \
-			elog(ERROR, "%s: %s", p2, msg); \
-	} while (0)
-
-#define DBLINK_CONN_NOT_AVAIL \
-	do { \
-		if(conname) \
-			ereport(ERROR, \
-	(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST), \
-	 errmsg("connection \"%s\" not available", conname))); \
-		else \
-			ereport(ERROR, \
-	(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST), \
-	 errmsg("connection not available"))); \
-	} while (0)
-
-#define DBLINK_GET_CONN \
-	do { \
-			char *conname_or_str = text_to_cstring(PG_GETARG_TEXT_PP(0)); \
-			rconn = getConnectionByName(conname_or_str); \
-			if (rconn) \
-			{ \
-conn = rconn->conn; \
-conname = conname_or_str; \
-			} \
-			else \
-			{ \
-connstr = get_connect_string(conname_or_str); \
-if (connstr == NULL) \
-{ \
-	connstr = conname_or_str; \
-} \
-dblink_connstr_check(connstr); \
-conn = PQconnectdb(connstr); \
-if (PQstatus(conn) == CONNECTION_BAD) \
-{ \
-	msg = pchomp(PQerrorMessage(conn)); \
-	PQfinish(conn); \
-	ereport(ERROR, \
-			(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION), \
-			 errmsg("could not establish connection"), \
-			 errdetail_internal("%s", msg))); \
-} \
-dblink_security_check(conn, rconn); \
-if (PQclientEncoding(conn) != GetDatabaseEncoding()) \
-	PQsetClientEncoding(conn, GetDatabaseEncodingName()); \
-freeconn = true; \
-			} \
-	} while (0)
-
-#define DBLINK_GET_NAMED_CONN \
-	do { \
-			conname = text_to_cstring(PG_GETARG_TEXT_PP(0)); \
-			rconn = getConnectionByName(conname); \
-			if (rconn) \
-conn = rconn->conn; \
-			else \
-DBLINK_CONN_NOT_AVAIL; \
-	} while (0)
-
-#define DBLINK_INIT \
-	do { \
-			if (!pconn) \
-			{ \
-pconn = (remoteConn *) MemoryContextAlloc(TopMemoryContext, sizeof(remoteConn)); \
-pconn->conn = NULL; \
-pconn->openCursorCount = 0; \
-pconn->newXactForCursor = FALSE; \
-			} \
-	} while (0)
+static char *
+xpstrdup(const char *in)
+{
+	if (in == NULL)
+		return NULL;
+	return pstrdup(in);
+}
+
+static void
+dblink_res_internalerror(PGconn *conn, PGresult *res, const char *p2)
+{
+	char	   *msg = pchomp(PQerrorMessage(conn));
+	if (res)
+		PQclear(res);
+	elog(ERROR, "%s: %s", p2, msg);
+}
+
+static void pg_attribute_noreturn()
+dblink_conn_not_avail(const char *conname)
+{
+	if (conname)
+		ereport(ERROR,
+(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST),
+ errmsg("connection \"%s\" not available", conname)));
+	else
+		ereport(ERROR,
+(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST),
+ errmsg("connection not available")));
+}
+
+static void
+dblink_get_conn(char *conname_or_str,
+PGconn * volatile *conn_p, char **conname_p, volatile bool *freeconn_p)
+{
+	remoteConn *rconn = getConnectionByName(conname_or_str);
+	PGconn	   *conn;
+	char	   *conname;
+	bool		freeconn;
+
+	if (rconn)
+	{
+		conn = rconn->conn;
+		conname = conname_or_str;
+		freeconn = false;
+	}
+	else
+	{
+		const char *connstr;
+
+		connstr = get_connect_string(conname_or_str);
+		if (connstr == NULL)
+			connstr = conname_or_str;
+		dblink_connstr_check(connstr);
+		conn = PQconnectdb(connstr);
+		if (PQstatus(conn) == CONNECTION_BAD)
+		{
+			char	   *msg = 

Re: [HACKERS] port of INSTALL file generation to XSLT

2017-02-28 Thread Peter Eisentraut
On 2/28/17 08:57, Magnus Hagander wrote:
> It appears we need pandoc 1.13 to get the good output.  This won't be
> available until Debian stretch.
> 
> So for PG10, I propose the attached revised patch which keeps using lynx
> but uses xsltproc instead of jade.
> 
> 
> It is available for people not using debian though? Might it be
> worthwhile to make it dependent on the version of pandoc -- so use that
> method if people have the newer pandoc and fall back to lynx if they don't? 

Well, this really only runs once every couple of months on borka and
here or there for those building their own snapshot tarballs.  I don't
think we need to cater to a wide audience here.  In fact, variety could
be bad here:  We don't want to find out that a tarball was rolled with
the wrong variant.

The pandoc change can be revisited independently.  The main point right
now is to get away from the DSSSL toolchain.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] allow referring to functions without arguments when unique

2017-02-28 Thread Peter Eisentraut
This is the "grand finale" that goes on top of the "DROP FUNCTION of
multiple functions" patch set.  The purpose is to allow referring to
functions without having to spell out the argument list, when the
function name is unique.  This is especially useful when having to
operate on "business logic" functions with many many arguments.  It's an
SQL standard feature, and it applies everywhere a function is referred
to in the grammar.  We already have the lookup logic for the regproc
type, and thanks to the grand refactoring of the parser representation
of functions, this is quite a small patch.  There is a bit of
reduce/reduce parser mystery, to keep the reviewer entertained.  (The
equivalent could be implemented for aggregates and operators, but I
haven't done that here yet.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 40eeb753abc83bbb0e38997fb518f9cd663f1729 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 25 Feb 2017 19:13:15 -0500
Subject: [PATCH] Allow referring to functions without arguments when unique

In DDL commands referring to an existing function, allow omitting the
argument list if the function name is unique in its schema, per SQL
standard.

This uses the same logic that the regproc type uses for finding
functions by name only.
---
 doc/src/sgml/ref/alter_extension.sgml   |  2 +-
 doc/src/sgml/ref/alter_function.sgml| 13 ++--
 doc/src/sgml/ref/alter_opfamily.sgml|  7 ---
 doc/src/sgml/ref/comment.sgml   |  2 +-
 doc/src/sgml/ref/create_cast.sgml   |  6 --
 doc/src/sgml/ref/create_transform.sgml  | 12 +++
 doc/src/sgml/ref/drop_function.sgml | 26 
 doc/src/sgml/ref/grant.sgml |  2 +-
 doc/src/sgml/ref/revoke.sgml|  2 +-
 doc/src/sgml/ref/security_label.sgml|  2 +-
 src/backend/nodes/copyfuncs.c   |  1 +
 src/backend/nodes/equalfuncs.c  |  1 +
 src/backend/parser/gram.y   | 27 +
 src/backend/parser/parse_func.c | 17 +++-
 src/include/nodes/parsenodes.h  |  3 +++
 src/test/regress/expected/create_function_3.out |  9 -
 src/test/regress/sql/create_function_3.sql  |  7 +++
 17 files changed, 113 insertions(+), 26 deletions(-)

diff --git a/doc/src/sgml/ref/alter_extension.sgml b/doc/src/sgml/ref/alter_extension.sgml
index de6d6dca16..a7c0927d1c 100644
--- a/doc/src/sgml/ref/alter_extension.sgml
+++ b/doc/src/sgml/ref/alter_extension.sgml
@@ -39,7 +39,7 @@
   EVENT TRIGGER object_name |
   FOREIGN DATA WRAPPER object_name |
   FOREIGN TABLE object_name |
-  FUNCTION function_name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) |
+  FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] |
   MATERIALIZED VIEW object_name |
   OPERATOR operator_name (left_type, right_type) |
   OPERATOR CLASS object_name USING index_method |
diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml
index 0388d06b95..168eeb7c52 100644
--- a/doc/src/sgml/ref/alter_function.sgml
+++ b/doc/src/sgml/ref/alter_function.sgml
@@ -21,15 +21,15 @@
 
  
 
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 action [ ... ] [ RESTRICT ]
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 RENAME TO new_name
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 SET SCHEMA new_schema
-ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] )
+ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
 DEPENDS ON EXTENSION extension_name
 
 where action is one of:
@@ -75,7 +75,8 @@ Parameters
 name
 
  
-  The name (optionally schema-qualified) of an existing function.
+  The name (optionally schema-qualified) of an existing function.  If no
+  argument list is specified, the name must be unique in its schema.
  
 

diff --git a/doc/src/sgml/ref/alter_opfamily.sgml b/doc/src/sgml/ref/alter_opfamily.sgml
index 4511c7f7b2..0bafe5b8f8 100644
--- a/doc/src/sgml/ref/alter_opfamily.sgml
+++ b/doc/src/sgml/ref/alter_opfamily.sgml
@@ -25,7 +25,7 @@
   {  OPERATOR strategy_number operator_name ( op_type, op_type )
   [ FOR SEARCH | FOR ORDER BY sort_family_name ]
| FUNCTION support_number [ ( op_type [ , op_type ] 

Re: [HACKERS] background sessions

2017-02-28 Thread Peter Eisentraut
> For additional entertainment, I include patches that integrate
> background sessions into dblink.  So dblink can open a connection to a
> background session, and then you can use the existing dblink functions
> to send queries, read results, etc.  People use dblink to make
> self-connections to get autonomous subsessions, so this would directly
> address that use case.  The 0001 patch is some prerequisite refactoring
> to remove an ugly macro mess, which is useful independent of this.  0002
> is the actual patch.

Updated patch, mainly with improved error handling and some tidying up.

Related to this is also the patch in

as a resource control mechanism.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 48058db2817b8955ea3424bc77b55bb5782ffb0d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 28 Dec 2016 12:00:00 -0500
Subject: [PATCH v3] Add background sessions

This adds a C API to run SQL statements in a background worker,
communicating by FE/BE protocol over a DSM message queue.  This can be
used to execute statements and transactions separate from the main
foreground session.

Also included is a PL/Python interface to this functionality.
---
 doc/src/sgml/bgsession.sgml | 273 +++
 doc/src/sgml/filelist.sgml  |   1 +
 doc/src/sgml/plpython.sgml  | 105 ++-
 doc/src/sgml/postgres.sgml  |   1 +
 doc/src/sgml/stylesheet-common.xsl  |   1 +
 src/backend/commands/variable.c |   5 +
 src/backend/libpq/pqmq.c|  26 +-
 src/backend/storage/ipc/shm_mq.c|  19 +
 src/backend/tcop/Makefile   |   2 +-
 src/backend/tcop/bgsession.c| 929 
 src/backend/tcop/postgres.c |  38 +-
 src/include/commands/variable.h |   1 +
 src/include/libpq/pqmq.h|   1 +
 src/include/storage/shm_mq.h|   3 +
 src/include/tcop/bgsession.h|  32 +
 src/include/tcop/tcopprot.h |  10 +
 src/pl/plpython/Makefile|   2 +
 src/pl/plpython/expected/plpython_bgsession.out | 217 ++
 src/pl/plpython/expected/plpython_test.out  |   7 +-
 src/pl/plpython/plpy_bgsession.c| 509 +
 src/pl/plpython/plpy_bgsession.h|  18 +
 src/pl/plpython/plpy_main.h |   3 +
 src/pl/plpython/plpy_planobject.c   |   1 +
 src/pl/plpython/plpy_planobject.h   |   2 +
 src/pl/plpython/plpy_plpymodule.c   |   5 +
 src/pl/plpython/plpy_spi.c  |  29 +-
 src/pl/plpython/plpy_spi.h  |   3 +
 src/pl/plpython/sql/plpython_bgsession.sql  | 177 +
 28 files changed, 2388 insertions(+), 32 deletions(-)
 create mode 100644 doc/src/sgml/bgsession.sgml
 create mode 100644 src/backend/tcop/bgsession.c
 create mode 100644 src/include/tcop/bgsession.h
 create mode 100644 src/pl/plpython/expected/plpython_bgsession.out
 create mode 100644 src/pl/plpython/plpy_bgsession.c
 create mode 100644 src/pl/plpython/plpy_bgsession.h
 create mode 100644 src/pl/plpython/sql/plpython_bgsession.sql

diff --git a/doc/src/sgml/bgsession.sgml b/doc/src/sgml/bgsession.sgml
new file mode 100644
index 00..14efb1b495
--- /dev/null
+++ b/doc/src/sgml/bgsession.sgml
@@ -0,0 +1,273 @@
+
+
+
+ Background Session API
+
+ 
+  The background session API is a C API for creating additional database
+  sessions in the background and running SQL statements in them.  A background
+  session behaves like a normal (foreground) session in that it has session
+  state, transactions, can run SQL statements, and so on.  Unlike a foreground
+  session, it is not connected directly to a client.  Instead the foreground
+  session can use this API to execute SQL statements and retrieve their
+  results.  Higher-level integrations, such as in procedural languages, can
+  make this functionality available to clients.  Background sessions are
+  independent from their foreground sessions in their session and transaction
+  state.  So a background session cannot see uncommitted data in foreground
+  sessions or vice versa, and there is no preferential treatment about
+  locking.  Like all sessions, background sessions are separate processes.
+  Foreground and background sessions communicate over shared memory messages
+  queues instead of the sockets that a client/server connection uses.
+ 
+
+ 
+  Background sessions can be useful in a variety of scenarios when effects
+  that are independent of the foreground session are to be achieved, for
+  example:
+  
+   
+
+ Commit data independent of whether a foreground 

Re: [HACKERS] New Committer - Andrew Gierth

2017-02-28 Thread Amit Kapila
On Tue, Feb 28, 2017 at 11:52 PM, Stephen Frost  wrote:
> Greetings!
>
> The PostgreSQL committers would like to welcome Andrew Gierth as a new
> committer for the PostgreSQL project.
>

Congratulations!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Creation of "Future" commit fest, named 2017-07

2017-02-28 Thread Michael Paquier
On Tue, Feb 28, 2017 at 9:28 AM, David Steele  wrote:
> On 2/27/17 7:27 PM, Michael Paquier wrote:
>> On Tue, Feb 28, 2017 at 9:21 AM, David Steele  wrote:
>>> Looks like it's me, then.  I don't seem to have admin privileges to
>>> close the commitfest or I don't know where the option is located.
>>>
>>> Can somebody grant privileges or close the CF at 3/1 12AM AoE (UTC-12)
>>> for me?
>>
>> I have no way to give the administration rights, but as I got them I
>> am fine to mark the CF as in progress around this time.
>
> That works for me.  I have all the privileges I need to run the CF
> otherwise.

Still 10 hours left if you want to propose a patch for the last CF!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-02-28 Thread Amit Langote
Nagata-san,

On 2017/02/28 23:33, Yugo Nagata wrote:
> Hi all,
> 
> Now we have a declarative partitioning, but hash partitioning is not
> implemented yet. Attached is a POC patch to add the hash partitioning
> feature. I know we will need more discussions about the syntax and other
> specifications before going ahead the project, but I think this runnable
> code might help to discuss what and how we implement this.

Great!

> * Description
> 
> In this patch, the hash partitioning implementation is basically based
> on the list partitioning mechanism. However, partition bounds cannot be
> specified explicitly; instead, the bound is used internally as a hash partition
> index, which is calculated when a partition is created or attached.
> 
> The tentative syntax to create a partitioned table is as below:
> 
>  CREATE TABLE h (i int) PARTITION BY HASH(i) PARTITIONS 3 USING hashint4;
> 
> The number of partitions is specified by PARTITIONS, which is currently
> constant and cannot be changed, but I think this will need to be changed in
> some manner. A hash function is specified by USING. Maybe specifying the hash
> function could be omitted, in which case a default hash function
> corresponding to the key type would be used.
> 
> A partition table can be created as below:
> 
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;
> 
> FOR VALUES clause cannot be used, and the partition bound is
> calculated automatically as a partition index of a single integer value.
> 
> When trying to create more partitions than the number specified
> by PARTITIONS, an error occurs.
> 
> postgres=# create table h4 partition of h;
> ERROR:  cannot create hash partition more than 3 for h

Instead of having to create each partition individually, wouldn't it be
better if the following command

CREATE TABLE h (i int) PARTITION BY HASH (i) PARTITIONS 3;

created the partitions *automatically*?

It makes sense to provide a way to create individual list and range
partitions separately, because users can specify custom bounds for each.
We don't need that for hash partitions, so why make users run separate
commands (without the FOR VALUES clause) anyway?  We may perhaps need to
offer a way to optionally specify a user-defined name for each partition
in the same command, along with tablespace, storage options, etc.  By
default, the names would be generated internally and the user can ALTER
individual partitions after the fact to specify tablespace, etc.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] help to identify the reason that extension's C function returns array get segmentation fault

2017-02-28 Thread 钱新林
Thanks for your clues.

The system I have used to debug the code is x86 64-bit, running Ubuntu 14.04
and Postgres 9.3.13. I have revised the code and it now looks like the following:

Datum vquery(PG_FUNCTION_ARGS) {

int array_len = PG_GETARG_INT32(0);
int64 * node_ids;
ArrayType * retarr;
Datum * vals ;
int i;

SPI_connect();

//some code to retrieve data from various tables
// node_ids are allocated and filled up

vals = SPI_palloc(array_len * sizeof(Datum));
memset (vals, 0, array_len * sizeof(Datum));

// fill the vals up
for (i = 0 ; i < array_len ; i++)
  vals[i] = Int64GetDatum((node_ids[i]));

retarr = construct_array(vals, array_len, INT8OID, sizeof(int64), true, 'd');

SPI_finish();

PG_RETURN_ARRAYTYPE_P(retarr);
}

It seems to solve the problem. I have tested the code for a while and no
more segmentation faults have been reported.

I have built PostgreSQL with --enable-debug and --enable-cassert, but when I
use the binary with gdb no symbol file gets loaded. I will research this
further and use it to facilitate debugging. Thanks.
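
(Note that construct_array() above still allocates its result in the current
memory context, which while connected to SPI is released by SPI_finish(). A
minimal sketch of the "copy it out to the caller's context before
SPI_finish()" approach described in the quoted reply below might look like
this -- assuming PostgreSQL 9.3-era SPI, with the real table lookups replaced
by dummy values:)

/* Sketch: build the array while connected to SPI, then copy the finished
 * varlena into the upper executor (caller's) context with SPI_palloc()
 * before SPI_finish() releases SPI memory. */
#include "postgres.h"
#include "fmgr.h"
#include "catalog/pg_type.h"
#include "executor/spi.h"
#include "utils/array.h"

PG_FUNCTION_INFO_V1(vquery_sketch);

Datum
vquery_sketch(PG_FUNCTION_ARGS)
{
    int         array_len = PG_GETARG_INT32(0);
    int64      *node_ids;
    Datum      *vals;
    ArrayType  *tmp;
    ArrayType  *result;
    int         i;

    SPI_connect();

    /* stand-in for the real table lookups that fill node_ids */
    node_ids = (int64 *) palloc(array_len * sizeof(int64));
    for (i = 0; i < array_len; i++)
        node_ids[i] = (int64) i;

    vals = (Datum *) palloc(array_len * sizeof(Datum));
    for (i = 0; i < array_len; i++)
        vals[i] = Int64GetDatum(node_ids[i]);

    /* allocated in the SPI context; it will vanish at SPI_finish() */
    tmp = construct_array(vals, array_len, INT8OID, sizeof(int64),
                          FLOAT8PASSBYVAL, 'd');

    /* copy the whole varlena into the caller's context while still connected */
    result = (ArrayType *) SPI_palloc(VARSIZE(tmp));
    memcpy(result, tmp, VARSIZE(tmp));

    SPI_finish();

    PG_RETURN_ARRAYTYPE_P(result);
}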

On Tue, Feb 28, 2017 at 12:54 PM, Tom Lane  wrote:

> =?UTF-8?B?6ZKx5paw5p6X?=  writes:
> > I have written an extension to manage openstreetmap data. There is a C
> > function to perform spatial top k query on several  tables and return an
> > array of int8 type as result. The code skeleton of this function is as
> > follows:
>
> There are a remarkable lot of bugs in this code fragment.  Many of them
> would not bite you as long as you are running on 64-bit Intel hardware,
> but that doesn't make them not bugs.
>
> > Datum vquery(PG_FUNCTION_ARGS) {
>
> > int array_len = PG_GETARG_INT32(0);
> > long * node_ids;
>
> > SPI_connect();
>
> > //some code to retrieve data from various tables
> > // node_ids are allocated and filled up
>
> > ArrayType * retarr;
> > Datum * vals ;
>
> > vals = palloc0(array_len * sizeof(long));
>
> Datum is not necessarily the same as "long".
>
> > // fill the vals up
> > for (i = 0 ; i < array_len ; i++)
> >   vals[i] = Int64GetDatum((node_ids[i]));
>
> int64 is not necessarily the same as "long", either.
>
> > retarr = construct_array(vals, retcnt, INT8OID, sizeof(long), true, 'i');
>
> Again, INT8 is not the same size as "long", and it's not necessarily
> pass-by-val, and it's *certainly* not integer alignment.
>
> > SPI_finish();
>
> > PG_RETURN_ARRAYTYPE_P(retarr);
>
> But I think what's really biting you, probably, is that construct_array()
> made the array in CurrentMemoryContext which at that point was the SPI
> execution context; which would be deleted by SPI_finish.  So you're
> returning a dangling pointer.  You need to do something to either copy
> the array value out to the caller's context, or build it there in the
> first place.
>
> BTW, this failure would be a lot less intermittent if you were testing
> in a CLOBBER_FREED_MEMORY build.  I would go so far as to say you should
> *never* develop or test C code for the Postgres backend without using
> the --enable-cassert configure option for your build.  You're simply
> tossing away a whole lot of debug support if you don't.
>
> regards, tom lane
>


[HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2017-02-28 Thread Haribabu Kommi
Subject changed for better context of the patch.
(was - Re: Question about grant create on database and pg_dump/pg_dumpall)

On Fri, Sep 30, 2016 at 12:29 AM, Tom Lane
 wrote:
>
>1. pg_dump without --create continues to do what it does today, ie it just
>dumps objects within the database, assuming that database-level properties
>will already be set correctly for the target database.
>
>2. pg_dump with --create creates the target database and also sets all
>database-level properties (ownership, ACLs, ALTER DATABASE SET, etc etc).
>
>3. pg_dumpall loses all code relating to individual-database creation
>and property setting and instead relies on pg_dump --create to do that.
>This would leave only the code relating to "pg_dumpall -g" (ie, dump roles
>and tablespaces) within pg_dumpall itself.

I removed all the database related code from pg_dumpall and moved the
necessary part of the code into pg_dump and called pg_dump with --create
option from pg_dumpall to ensure that all the database create commands
are getting dumped.

Except for the postgres and template1 databases, the CREATE DATABASE
command is issued for all databases. The dump of all other properties
is the same for every database.

>One thing that would still be messy is that presumably "pg_dumpall -g"
>would issue ALTER ROLE SET commands, but it's unclear what to do with
>ALTER ROLE IN DATABASE SET commands.  Should those become part of
>"pg_dump --create"'s charter?  It seems like not, but I'm not certain.

Yes, I also moved the ALTER ROLE IN DATABASE SET commands into the
pg_dump --create command; this way it will be easier to dump all the
database objects using pg_dumpall -g and pg_dump -C.

>Another thing that requires some thought is that pg_dumpall is currently
>willing to dump ACLs and other properties for template1/template0, though
>it does not invoke pg_dump on them.  If we wanted to preserve that
>behavior while still moving the code that does those things to pg_dump,
>pg_dump would have to grow an option that would let it do that.  But
>I'm not sure how much of that behavior is actually sensible.

Currently the ACLs and other changes related to the template databases are
getting dumped with the --create option in pg_dump. Do we still need
another option?

>This would probably take a pg_dump archive version bump, since I think
>we don't currently record enough information for --create to do this
>(and we can't just cram the extra commands into the DATABASE entry,
>since we don't know whether --create will be specified to pg_restore).
>But we've done those before.

No specific code is required related to the archive version check.
Do we still need to bump the archive version? It just adds some new
commands as part of --create in pg_dump.

Patch attached. Some more docs still need to be added.

Comments?

[1] - https://www.postgresql.org/message-id/21573.1475162...@sss.pgh.pa.us

Regards,
Hari Babu
Fujitsu Australia


pg_dump_changes_1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PATCH: Configurable file mode mask

2017-02-28 Thread David Steele
PostgreSQL currently requires the file mode mask (umask) to be 0077.
However, this precludes the possibility of a user in the postgres group
performing a backup (or whatever).  Now that
pg_start_backup()/pg_stop_backup() privileges can be delegated to an
unprivileged user, it makes sense to also allow a (relatively)
unprivileged user to perform the backup at the file system level as well.

This patch introduces a new initdb param, -u/--file-mode-mask, and a new
GUC, file_mode_mask, to allow the default mode of files and directories
in the $PGDATA directory to be modified.
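
(As a stand-alone illustration of how such a mask composes with the mode used
at file-creation time -- not PostgreSQL code; the 0027 value matches the
group-read example in the documentation hunk below:)

/* With a file mode mask of 0027, a file created with mode 0666 ends up as
 * 0640: read/write for the owner, read-only for the group, nothing for
 * others -- which is what lets a user in the database owner's group read
 * the cluster's files. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;
    int         fd;

    umask(0027);                        /* the mask under discussion */

    fd = open("mask_demo.tmp", O_CREAT | O_WRONLY, 0666);
    if (fd < 0)
        return 1;

    fstat(fd, &st);
    printf("resulting mode: %o\n", st.st_mode & 07777);    /* prints 640 */

    close(fd);
    unlink("mask_demo.tmp");
    return 0;
}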

This obviously required mode changes in a number of places, so at the
same time the BasicOpenFile(), OpenTransientFile(), and
PathNameOpenFile() have been split into versions that either use the
default permissions or allow custom permissions.  In the end there was
only one call to the custom permission version (be-fsstubs.c:505) for
all three variants.

The following three calls (at the least) need to be reviewed:

bin/pg_dump/pg_backup_directory.c:194
src/port/mkdtemp.c:190
bin/pg_basebackup.c:599:655:1399

And this call needs serious consideration:

bin/pg_rewind/file_ops.c:214

Besides that, there should be tests to make sure the masks are working as
expected; these could be added to the initdb TAP tests, though no
mask tests exist at this time.  Tests making sure all file operations
produce the correct modes would need to go in a new module, perhaps the
new backup tests proposed in [1].

Adam Brightwell developed the patch based on an initial concept by me
and Stephen Frost.  I added the refactoring in fd.c and some additional
documentation.

This patch applies cleanly on 016c990 but may fare badly over time due
to the number of files modified.

-- 
-David
da...@pgmasters.net

[1]
https://www.postgresql.org/message-id/758e3fd1-45b4-5e28-75cd-e9e7f93a4...@pgmasters.net
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 62dec87..98f8170 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1858,8 +1858,7 @@ qtext_store(const char *query, int query_len,
*query_offset = off;
 
/* Now write the data into the successfully-reserved part of the file */
-   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDWR | O_CREAT | PG_BINARY,
-  S_IRUSR | S_IWUSR);
+   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDWR | O_CREAT | PG_BINARY);
if (fd < 0)
goto error;
 
@@ -1923,7 +1922,7 @@ qtext_load_file(Size *buffer_size)
int fd;
struct stat stat;
 
-   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDONLY | PG_BINARY, 0);
+   fd = OpenTransientFile(PGSS_TEXT_FILE, O_RDONLY | PG_BINARY);
if (fd < 0)
{
if (errno != ENOENT)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 1b390a2..2371878 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -812,6 +812,38 @@ include_dir 'conf.d'
   
  
 
+ 
+  file_mode_mask (integer)
+  
+   file_mode_mask configuration parameter
+  
+  
+  
+   
+Sets the file mode mask (umask) for the data directory. The parameter
+value is expected to be a numeric mode specified in the format
+accepted by the chmod and
+umask system calls. (To use the customary octal
+format the number must start with a 0 (zero).)
+   
+
+   
+The default file_mode_mask is 
0077,
+meaning that only the database owner can read and write files in
+the data directory.  For example, setting the
+file_mode_mask to 0027 would 
allow
+any user in the same group as the database owner to read all database
+files, which would be useful for producing a backup using a relatively
+unprivileged user.
+   
+
+   
+This parameter can only be set at server start.
+   
+
+  
+ 
+
  
   bonjour (boolean)
   
diff --git a/src/backend/access/heap/rewriteheap.c 
b/src/backend/access/heap/rewriteheap.c
index c7b283c..53a2acc 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1010,8 +1010,7 @@ logical_rewrite_log_mapping(RewriteState state, 
TransactionId xid,
src->off = 0;
memcpy(src->path, path, sizeof(path));
src->vfd = PathNameOpenFile(path,
-   O_CREAT 
| O_EXCL | O_WRONLY | PG_BINARY,
-   S_IRUSR 
| S_IWUSR);
+   O_CREAT 
| O_EXCL | O_WRONLY | PG_BINARY);
if (src->vfd < 0)
ereport(ERROR,
  

Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-02-28 Thread Amit Langote
Hi,

On 2017/02/28 23:25, Aleksander Alekseev wrote:
> Hello.
> 
> I decided to figure out whether the current implementation of declarative
> partitioning has any bottlenecks when there are a lot of partitions. Here
> is what I did [1].

Thanks for sharing.

> Then:
> 
> ```
> # 2580 is some pk that exists
> echo 'select * from part_test where pk = 2580;' > t.sql
> pgbench -j 7 -c 7 -f t.sql -P 1 -T 300 eax
> ```
> 
> `perf top` showed two bottlenecks [2]. A stacktrace for the first one
> looks like this [3]:
> 
> ```
> 0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') at 
> pgstat.c:1689
> 1689  if (entry->t_id == rel_id)
> #0  0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') 
> at pgstat.c:1689
> #1  0x007a4275 in pgstat_initstats (rel=0x7f4af3fd41f8) at 
> pgstat.c:1666
> #2  0x004c7090 in relation_open (relationId=25696, lockmode=0) at 
> heapam.c:1137
> #3  0x004c72c9 in heap_open (relationId=25696, lockmode=0) at 
> heapam.c:1291
> (skipped)
> ```
> 
> And here is a stacktrace for the second bottleneck [4]:
> 
> ```
> 0x00584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, 
> numparents=0x0) at pg_inherits.c:199
> 199   forboth(lo, rels_list, li, rel_numparents)
> #0  0x00584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, 
> numparents=0x0) at pg_inherits.c:199
> #1  0x0077fc9f in expand_inherited_rtentry (root=0x1badcb8, 
> rte=0x1b630b8, rti=1) at prepunion.c:1408
> #2  0x0077fb67 in expand_inherited_tables (root=0x1badcb8) at 
> prepunion.c:1335
> #3  0x00767526 in subquery_planner (glob=0x1b63cc0, parse=0x1b62fa0, 
> parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0) at planner.c:568
> (skipped)
> ```
> 
> The first one could be easily fixed by introducing a hash table
> (rel_id -> pgStatList entry). Perhaps hash table should be used only
> after some threshold. Unless there are any objections I will send a
> corresponding patch shortly.

I have never thought about this one really.

> I didn't explored the second bottleneck closely yet but at first glance
> it doesn't look much more complicated.

I don't know which way you're thinking of fixing this, but a planner patch
to implement faster partition-pruning would take care of it, I
think.  As you may know, even declarative partitioned tables currently
depend on constraint exclusion for partition-pruning, and the planner's
current approach to handling inheritance requires opening all the child
tables (partitions), whereas the new approach hopefully shouldn't need to do
that.  I am not sure whether looking for a more localized fix for this would
be worthwhile, although I may be wrong.
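
For reference, a minimal sketch of the dynahash-based lookup Aleksander
describes for the first bottleneck could look roughly like the following
(struct, variable and function names are hypothetical, not the proposed
patch; it would live next to get_tabstat_entry() in pgstat.c and use
utils/hsearch.h):

typedef struct TabStatHashEntry
{
	Oid			t_id;				/* hash key: relation OID */
	PgStat_TableStatus *tsa_entry;	/* pointer into the tabstat arrays */
} TabStatHashEntry;

static HTAB *pgStatTabHash = NULL;

static PgStat_TableStatus *
find_tabstat_entry_hashed(Oid rel_id)
{
	TabStatHashEntry *hash_entry;

	if (pgStatTabHash == NULL)
		return NULL;			/* nothing cached yet */

	hash_entry = (TabStatHashEntry *)
		hash_search(pgStatTabHash, &rel_id, HASH_FIND, NULL);

	return hash_entry ? hash_entry->tsa_entry : NULL;
}

static void
remember_tabstat_entry(Oid rel_id, PgStat_TableStatus *tsa_entry)
{
	TabStatHashEntry *hash_entry;
	bool		found;

	if (pgStatTabHash == NULL)
	{
		HASHCTL		ctl;

		MemSet(&ctl, 0, sizeof(ctl));
		ctl.keysize = sizeof(Oid);
		ctl.entrysize = sizeof(TabStatHashEntry);
		pgStatTabHash = hash_create("pgstat relid lookup", 128, &ctl,
									HASH_ELEM | HASH_BLOBS);
	}

	hash_entry = (TabStatHashEntry *)
		hash_search(pgStatTabHash, &rel_id, HASH_ENTER, &found);
	hash_entry->tsa_entry = tsa_entry;
}

Such a hash only pays off once a backend has touched many relations, which
is presumably where the threshold mentioned above would come in.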

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-02-28 Thread Lukas Fittl
Hi,

Currently pg_stat_statements replaces constant values with ? characters.
I've seen this be a problem on multiple occasions, in particular since it
conflicts with the use of ? as an operator.

I'd like to propose changing the replacement character from ? to instead be
a parameter (like $1).

My main motivation is to aid external tools that parse pg_stat_statements
output (like [0]), and currently have to resort to complex parser patches
to distinguish operators from replacement characters.

First of all, attached 0001_pgss_additional_regression_tests.v1.patch which
increases regression test coverage for a few scenarios relevant to this
discussion.

Then, there are two variants I've prepared; only one of the two is to be
committed:

A) 0002_pgss_mask_with_incrementing_params.v1.patch: Replace constants with
$1, $2, $3 (etc) in the normalized query, whilst respecting any existing
params in the counting

B) 0003_pgss_mask_with_zero_param.v1.patch: Replace constants with $0 in
the normalized query

Ideally I'd see A turn into something we can commit, but it involves a bit
more computation to get the parameter numbers right. The benefit is that we
could use PREPARE/EXECUTE with pg_stat_statements output.

B is intentionally simple, and would address the primary issue at hand
(distinguishing from the ? operator).
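
To make the difference concrete, the masking step could emit its replacement
text along these lines (an illustrative fragment only, not taken from the
attached patches):

#include <stdbool.h>
#include <stdio.h>

/*
 * Append the replacement text for one masked constant to the normalized
 * query text.  With variant A, next_param_no continues counting after any
 * $n parameters the original query already contained; variant B always
 * writes $0.  Returns the number of bytes written.
 */
static int
append_masked_constant(char *dst, bool variant_a, int next_param_no)
{
	if (variant_a)
		return sprintf(dst, "$%d", next_param_no);
	return sprintf(dst, "$0");
}

With variant A the normalized text can even be fed back to PREPARE/EXECUTE,
while variant B merely keeps the replacement from colliding with the ?
operator.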

I'm also adding this patch to the commitfest starting tomorrow.

Best,
Lukas

[0] https://github.com/lfittl/pg_query#parsing-a-normalized-query

-- 
Lukas Fittl


0001_pgss_additional_regression_tests.v1.patch
Description: Binary data


0002_pgss_mask_with_incrementing_params.v1.patch
Description: Binary data


0003_pgss_mask_with_zero_param.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding support for Default partition in partitioning

2017-02-28 Thread Rahila Syed
Hello,

Currently, inserting data into a partitioned table is not allowed when the
data does not fit into any of its partitions.

The attached patch provides a capability to add a default partition to a
list
partitioned table as follows.

postgres=# CREATE TABLE list_partitioned (
a int
) PARTITION BY LIST (a);
CREATE TABLE

postgres=# CREATE TABLE part_default PARTITION OF list_partitioned FOR
VALUES IN (DEFAULT);
CREATE TABLE

postgres=# CREATE TABLE part_1 PARTITION OF list_partitioned FOR VALUES IN
(4,5);
CREATE TABLE

postgres=# insert into list_partitioned values (9);
INSERT 0 1

postgres=# select * from part_default;
 a
---
 9
(1 row)

The attached patch is in a preliminary stage and has the following ToDos:
1. Adding pg_dump support.
2. Documentation
3. Handling adding a new partition to a partitioned table
   with a default partition.
   This will require moving tuples from the existing default partition to
   the newly created partition if they satisfy its partition bound.
4. Handling of updates of the partition key in a default partition. As per
the current design it should throw an error if the update requires the tuple
to be moved to any other partition. But this can be changed by the following
proposal.

https://www.postgresql.org/message-id/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=
yjo3u3qhyp8e...@mail.gmail.com

I am adding it to the current commitfest with the status Waiting on Author
as I will submit an updated patch addressing the above ToDos.
Kindly give your suggestions.

Thank you,
Rahila Syed


default_list_partition_v1.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]

2017-02-28 Thread Kouhei Kaigai
The attached patch is a rebased version of the pass-down LIMIT clause patch,
which I forgot to register in the last CF.

It allows the required number of rows to be passed down to the sub-plans, not
only the ones we have individually handled at pass_down_bound().
Its primary target is controlling the number of remote tuples transferred over
the network connection by postgres_fdw.

According to the past discussion, we add a new field @ps_numTuples on
the PlanState to represent the required number of tuples.
The Limit node assigns a particular number to this field of its sub-plan, so
the sub-plan knows that its upper node does not require the entire set of
tuples and can adjust its execution strategy.
Like MergeAppend, the sub-plan can also pass the bound down to its own
sub-plans, if it makes sense and works correctly.
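
As a rough illustration of the mechanism (the function and the field handling
below are approximate sketches, not the exact patch code):

/*
 * Sketch: a Limit node handing its bound down to its child PlanState.
 */
static void
limit_pass_down_bound_sketch(PlanState *child, int64 offset, int64 count)
{
	/* the child still has to produce the rows that OFFSET will skip */
	child->ps_numTuples = offset + count;

	/*
	 * A child such as MergeAppend, or a ForeignScan driven by postgres_fdw,
	 * can propagate the bound to its own children in turn, e.g. by adding
	 * a LIMIT clause to the query sent to the remote server.
	 */
}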

This feature is potentially a basis for GPU-based sorting on top of
CustomScan, because it is advantageous for a workload that picks up the
top-N tuples if the data is small enough to load into GPU RAM.

Thanks,

PG-Strom Project / NEC OSS Promotion Center
KaiGai Kohei 


> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kohei KaiGai
> Sent: Tuesday, January 03, 2017 12:07 PM
> To: Kaigai Kouhei(海外 浩平) 
> Cc: Jeevan Chalke ; Robert Haas
> ; pgsql-hackers@postgresql.org; Etsuro Fujita
> ; Andres Freund 
> Subject: Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan
> [take-2]
> 
> Oops, I overlooked that this patch was marked as "returned with feedback", not
> "moved to the next CF".
> 
> Its status has not been changed since the last update. (Code was revised
> according to the last comment by Jeevan, but CF-Nov was time up at that
> time.)
> 
> How do I handle the patch?
> 
> 2016-12-05 16:49 GMT+09:00 Kouhei Kaigai :
> > Hello,
> >
> > Sorry for my late response.
> > The attached patch reflects your comments.
> >
> >> Here are few comments on latest patch:
> >>
> >>
> >> 1.
> >> make/make check is fine, however I am getting regression failure in
> >> postgres_fdw contrib module (attached regression.diff).
> >> Please investigate and fix.
> >>
> > It was an incorrect interaction when postgres_fdw tries to push down
> > sorting to the remote side. We cannot attach LIMIT clause on the plain
> > scan path across SORT, however, the previous version estimated the
> > cost for the plain scan with LIMIT clause even if local sorting is needed.
> > If remote scan may return just 10 rows, estimated cost of the local
> > sort is very lightweight, thus, this unreasonable path was chosen.
> > (On the other hands, its query execution results were correct because
> > ps_numTuples is not delivered across Sort node, so ForeignScan
> > eventually scanned all the remote tuples. It made correct results but
> > not optimal from the viewpoint of performance.)
> >
> > The v3 patch estimates the cost with remote LIMIT clause only if
> > supplied pathkey strictly matches with the final output order of the
> > query, thus, no local sorting is expected.
> >
> > Some of the regression test cases still have different plans but due
> > to the new optimization by remote LIMIT clause.
> > Without remote LIMIT clause, some of regression test cases preferred
> > remote-JOIN + local-SORT then local-LIMIT.
> > Once we have remote-LIMIT option, it allows to discount the cost for
> > remote-SORT by choice of top-k heap sorting.
> > It changed the optimizer's decision on some test cases.
> >
> > Potential one big change is the test case below.
> >
> >  -- CROSS JOIN, not pushed down
> >  EXPLAIN (VERBOSE, COSTS OFF)
> >  SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1,
> > t2.c1 OFFSET 100 LIMIT 10;
> >
> > It assumed CROSS JOIN was not pushed down due to the cost for network
> > traffic, however, remote LIMIT reduced the estimated number of tuples
> > to be moved. So, all of the CROSS JOIN + ORDER BY + LIMIT became to
> > run on the remote side.
> >
> >> 2.
> >> + *
> >> + * MEMO: root->limit_tuples is not attached when query
> >> contains
> >> + * grouping-clause or aggregate functions. So, we don's
> adjust
> >> + * rows even if LIMIT  is supplied.
> >>
> >> Can you please explain why you are not doing this for grouping-clause
> >> or aggregate functions.
> >>
> > See grouping_planner() at optimizer/plan/planner.c It puts an invalid
> > value on the root->limit_tuples if query has GROUP BY clause, so we
> > cannot know number of tuples to be returned even if we have upper
> > limit actually.
> >
> > /*
> >  * Figure out whether there's a hard limit on the number of rows
> that
> >  * query_planner's result subplan needs to return.  Even if we
> know a
> >  * hard limit overall, it doesn't apply if the query has any
> >  

Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-02-28 Thread Jan Michálek
Here is what I have.
I need a little help with psql.out, because of \pset format wrapped. I don't
know how to keep it at a fixed width.

2017-02-28 14:23 GMT+01:00 Jan Michálek :

> Current state is something like this (diff is attached).
> I currently don't have regression tests, tab completion, etc.; I will add
> these following the structure of the asciidoc commit.
>
> Output is tested using retext: rst is OK, but md has a problem with cells
> containing newlines (I must find out how it is possible to create a table
> with these in markdown).
>
> [jelen@laptak patch_postgre_rst]$
> [jelen@laptak psql]$ ./psql
> psql (9.6.2, server 9.6.1)
> Type "help" for help.
>
> jelen=# \pset linestyle markdown
> Line style is markdown.
> jelen=# values(E'nasral Franta\nna trabanta','Žluťoučký kobyly'),
> (,E'a\tb') \g | xclip
> jelen=# values(E'nasral Franta\nna trabanta','Žluťoučký kobyly'),
> (,E'a\tb') \g
>
> |column1| column2  |
> |---|--|
> | nasral Franta | Žluťoučký kobyly |
> | na trabanta   |  |
> | ' | a   b|
>
>
> (2 rows)
>
> jelen=# \pset linestyle rst
> Line style is rst.
> jelen=# values(E'nasral Franta\nna trabanta','Žluťoučký kobyly'),
> (,E'a\tb') \g
> +---+--+
> |column1| column2  |
> +===+==+
> | nasral Franta+| Žluťoučký kobyly |
> | na trabanta   |  |
> +---+--+
> | ' | a   b|
> +---+--+
>
> (2 rows)
>
> jelen=#
>
> 2017-02-24 0:46 GMT+01:00 Michael Paquier :
>
>> On Fri, Feb 24, 2017 at 3:09 AM, Jan Michálek 
>> wrote:
>> > I can try it, doesn`t look dificult, but I`m worry, that I`m not able to
>> > write clean, pretty code.
>>
>> If you want to have something available in Postgres 10, you had better
>> be quick. The last commit fest of the development cycle of Postgres 10
>> begins on the 1st of March, you need to to register your patch here:
>> https://commitfest.postgresql.org/13/
>> Here are also some rough guidelines about submitting a patch:
>> https://wiki.postgresql.org/wiki/Submitting_a_Patch
>> --
>> Michael
>>
>
>
>
> --
> Jelen
> Starší čeledín datovýho chlíva
>



-- 
Jelen
Starší čeledín datovýho chlíva
diff -ru a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
--- a/doc/src/sgml/ref/psql-ref.sgml2017-02-06 22:45:25.0 +0100
+++ b/doc/src/sgml/ref/psql-ref.sgml2017-03-01 00:37:19.0 +0100
@@ -2373,6 +2373,7 @@
   
   Sets the border line drawing style to one
   of ascii, old-ascii,
+ , rst, markdown
   or unicode.
   Unique abbreviations are allowed.  (That would mean one
   letter is enough.)
@@ -2408,6 +2409,12 @@
   again in the left-hand margin of the following line.
   
 
+ rst and markdown format
+ tables for use in restructured text or markdown documents. Both of them 
+ work well only with format aligned and border
+ 2.
+ 
+
   
   When the border setting is greater than zero,
   the linestyle option also determines the
diff -ru a/src/bin/psql/command.c b/src/bin/psql/command.c
--- a/src/bin/psql/command.c2017-02-06 22:45:25.0 +0100
+++ b/src/bin/psql/command.c2017-03-01 01:12:17.0 +0100
@@ -2584,9 +2584,17 @@
popt->topt.line_style = _asciiformat_old;
else if (pg_strncasecmp("unicode", value, vallen) == 0)
popt->topt.line_style = _utf8format;
+   /* format markdown
+   */
+   else if (pg_strncasecmp("markdown", value, vallen) == 0)
+   popt->topt.line_style = _markdown;
+   /* format rst
+   */
+   else if (pg_strncasecmp("rst", value, vallen) == 0)
+   popt->topt.line_style = _rst;
else
{
-   psql_error("\\pset: allowed line styles are ascii, 
old-ascii, unicode\n");
+   psql_error("\\pset: allowed line styles are ascii, 
old-ascii, unicode, markdown, rst\n");
return false;
}
 
diff -ru a/src/bin/psql/help.c b/src/bin/psql/help.c
--- a/src/bin/psql/help.c   2017-02-06 22:45:25.0 +0100
+++ b/src/bin/psql/help.c   2017-03-01 00:59:07.0 +0100
@@ -374,7 +374,7 @@
fprintf(output, _("  fieldsep_zero  set field separator for 
unaligned output to zero byte\n"));
fprintf(output, _("  footer enable or disable display of 
the table footer [on, off]\n"));
fprintf(output, _("  format set output format [unaligned, 
aligned, wrapped, html, asciidoc, ...]\n"));
-   fprintf(output, _("  linestyle  

Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-02-28 Thread Andres Freund
On 2017-02-28 15:59:08 +0100, Andreas Karlsson wrote:
> On 02/28/2017 03:13 PM, Bruce Momjian wrote:
> > I might have added that one; the text is:
> > 
> > Consider disallowing multiple queries in PQexec()
> > as an additional barrier to SQL injection attacks
> > 
> > and it is a "consider" item.  Should it be moved to the Wire Protocol
> > Changes / v4 Protocol section or removed?
> 
> A new protocol version won't solve the breakage of the C API, so I am not
> sure we can ever drop this feature other than by adding a new function
> something in the protocol to support this.

The protocol and C APIs to enforce this are already available, no? The
extended protocol (and thus PQexecParams/PQexecPrepared/...) doesn't allow
multiple statements:

/*
 * We only allow a single user statement in a prepared statement. This 
is
 * mainly to keep the protocol simple --- otherwise we'd need to worry
 * about multiple result tupdescs and things like that.
 */
if (list_length(parsetree_list) > 1)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("cannot insert multiple commands into a prepared 
statement")));

So if you don't want to allow multiple statements, use PQexecParams et
al.
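
As a minimal, self-contained example of that route (the connection string and
query are made up for illustration; a query string containing two statements,
e.g. "SELECT 1; SELECT 2", fails here with the error quoted above):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	const char *values[1] = {"42"};
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
		return 1;

	/* exactly one parameterized statement goes through the extended protocol */
	res = PQexecParams(conn,
					   "SELECT $1::int",
					   1,		/* number of parameters */
					   NULL,	/* let the server infer parameter types */
					   values,
					   NULL,	/* text-format params, lengths not needed */
					   NULL,	/* all params in text format */
					   0);		/* ask for text-format results */

	if (PQresultStatus(res) == PGRES_TUPLES_OK)
		printf("%s\n", PQgetvalue(res, 0, 0));

	PQclear(res);
	PQfinish(conn);
	return 0;
}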

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-02-28 Thread Oleg Bartunov
On Tue, Feb 28, 2017 at 10:55 PM, Pavel Stehule 
wrote:

> Hi
>
>
> 2017-02-28 20:08 GMT+01:00 Oleg Bartunov :
>
>> Hi there,
>>
>>
>> Attached patch is an implementation of SQL/JSON data model from SQL-2016
>> standard (ISO/IEC 9075-2:2016(E)), which was published 2016-12-15 and is
>> available only for purchase from ISO web site (
>> https://www.iso.org/standard/63556.html). Unfortunately I didn't find
>> any public sources of the standard or any preview documents, but Oracle
>> implementation of json support in 12c release 2 is very close (
>> http://docs.oracle.com/database/122/ADJSN/json-in-oracle-database.htm),
>> also we used https://livesql.oracle.com/  to understand some details.
>>
>> Postgres has already two json data types - json and jsonb and
>> implementing another json data type, which strictly conforms the standard,
>> would be not a good idea. Moreover, SQL standard doesn’t describe data
>> type, but only data model, which “comprises SQL/JSON items and SQL/JSON
>> sequences. The components of the SQL/JSON data model are:
>>
>> 1) An SQL/JSON item is defined recursively as any of the following:
>>
>> a) An SQL/JSON scalar, defined as a non-null value of any of the
>> following predefined (SQL) types:
>>
>> character string with character set Unicode, numeric, Boolean, or
>> datetime.
>>
>> b) An SQL/JSON null, defined as a value that is distinct from any value
>> of any SQL type.
>>
>> NOTE 122 — An SQL/JSON null is distinct from the SQL null value.
>>
>> c) An SQL/JSON array, defined as an ordered list of zero or more SQL/JSON
>> items, called the SQL/JSON
>>
>> elements of the SQL/JSON array.
>>
>> d) An SQL/JSON object, defined as an unordered collection of zero or more
>> SQL/JSON members….
>>
>> “
>>
>> Our jsonb corresponds to SQL/JSON with UNIQUE KEYS and implicit ordering
>> of keys and our main intention was to provide support of jsonb as a most
>> important and usable data type.
>>
>> We created repository for reviewing (ask for write access) -
>> https://github.com/postgrespro/sqljson/tree/sqljson
>>
>> Examples of usage can be found in src/test/regress/sql/sql_json.sql
>>
>> The whole documentation about json support should be reorganized and
>> added, and we plan to do this before release. We need help of community
>> here.
>>
>> Our goal is to provide support of main features of SQL/JSON to release
>> 10, as we discussed at developers meeting in Brussels (Andrew Dunstan has
>> kindly agreed to review the patch).
>>
>> We had not much time to develop the complete support, because of standard
>> availability), but hope all major features are here, namely, all nine
>> functions as described in the standard (but see implementation notes below):
>>
>> “All manipulation (e.g., retrieval, creation, testing) of SQL/JSON items
>> is performed through a number of SQL/JSON functions. There are nine such
>> functions, categorized as SQL/JSON retrieval functions and SQL/JSON
>> construction functions. The SQL/JSON retrieval functions are characterized
>> by operating on JSON data and returning an SQL value (possibly a Boolean
>> value) or a JSON value. The SQL/JSON construction functions return JSON
>> data created from operations on SQL data or other JSON data.
>>
>> The SQL/JSON retrieval functions are:
>>
>> — : extracts an SQL value of a predefined type from
>> a JSON text.
>>
>> — : extracts a JSON text from a JSON text.
>>
>> — : converts a JSON text to an SQL table.
>>
>> — : tests whether a string value is or is not properly
>> formed JSON text.
>>
>> — : tests whether an SQL/JSON path expression
>> returns any SQL/JSON items.
>>
>> The SQL/JSON construction functions are:
>>
>> — : generates a string that is a serialization
>> of an SQL/JSON object.
>>
>> — : generates a string that is a serialization of
>> an SQL/JSON array.
>>
>> — : generates, from an aggregation of
>> SQL data, a string that is a serialization
>>
>> of an SQL/JSON object.
>>
>> — : generates, from an aggregation of
>> SQL data, a string that is a serialization
>>
>> of an SQL/JSON array.
>>
>> A JSON-returning function is an SQL/JSON construction function or
>> JSON_QUERY.”
>>
>> The standard describes SQL/JSON path language, which used by SQL/JSON
>> query operators to query JSON. It defines path language as string literal.
>> We implemented the path language as  JSONPATH data type, since other
>> approaches are not friendly to planner and executor.
>>
>> The functions and JSONPATH provide a new functionality for json support,
>> namely, ability to operate (in standard specified way) with json structure
>> at SQL-language level - the often requested feature by the users.
>>
>> The patch is consists of about 15000 insertions (about 5000 lines are
>> from tests), passes all regression tests and doesn’t touches critical
>> parts, so we hope with community help to bring it to committable state.
>>
>> Authors: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov and Alexander

Re: [HACKERS] SQL/JSON in PostgreSQL

2017-02-28 Thread Pavel Stehule
Hi


2017-02-28 20:08 GMT+01:00 Oleg Bartunov :

> Hi there,
>
>
> Attached patch is an implementation of SQL/JSON data model from SQL-2016
> standard (ISO/IEC 9075-2:2016(E)), which was published 2016-12-15 and is
> available only for purchase from ISO web site (
> https://www.iso.org/standard/63556.html). Unfortunately I didn't find any
> public sources of the standard or any preview documents, but Oracle
> implementation of json support in 12c release 2 is very close (
> http://docs.oracle.com/database/122/ADJSN/json-in-oracle-database.htm),
> also we used https://livesql.oracle.com/  to understand some details.
>
> Postgres has already two json data types - json and jsonb and implementing
> another json data type, which strictly conforms the standard, would be not
> a good idea. Moreover, SQL standard doesn’t describe data type, but only
> data model, which “comprises SQL/JSON items and SQL/JSON sequences. The
> components of the SQL/JSON data model are:
>
> 1) An SQL/JSON item is defined recursively as any of the following:
>
> a) An SQL/JSON scalar, defined as a non-null value of any of the following
> predefined (SQL) types:
>
> character string with character set Unicode, numeric, Boolean, or datetime.
>
> b) An SQL/JSON null, defined as a value that is distinct from any value of
> any SQL type.
>
> NOTE 122 — An SQL/JSON null is distinct from the SQL null value.
>
> c) An SQL/JSON array, defined as an ordered list of zero or more SQL/JSON
> items, called the SQL/JSON
>
> elements of the SQL/JSON array.
>
> d) An SQL/JSON object, defined as an unordered collection of zero or more
> SQL/JSON members….
>
> “
>
> Our jsonb corresponds to SQL/JSON with UNIQUE KEYS and implicit ordering
> of keys and our main intention was to provide support of jsonb as a most
> important and usable data type.
>
> We created repository for reviewing (ask for write access) -
> https://github.com/postgrespro/sqljson/tree/sqljson
>
> Examples of usage can be found in src/test/regress/sql/sql_json.sql
>
> The whole documentation about json support should be reorganized and
> added, and we plan to do this before release. We need help of community
> here.
>
> Our goal is to provide support of main features of SQL/JSON to release 10,
> as we discussed at developers meeting in Brussels (Andrew Dunstan has
> kindly agreed to review the patch).
>
> We had not much time to develop the complete support, because of standard
> availability), but hope all major features are here, namely, all nine
> functions as described in the standard (but see implementation notes below):
>
> “All manipulation (e.g., retrieval, creation, testing) of SQL/JSON items
> is performed through a number of SQL/JSON functions. There are nine such
> functions, categorized as SQL/JSON retrieval functions and SQL/JSON
> construction functions. The SQL/JSON retrieval functions are characterized
> by operating on JSON data and returning an SQL value (possibly a Boolean
> value) or a JSON value. The SQL/JSON construction functions return JSON
> data created from operations on SQL data or other JSON data.
>
> The SQL/JSON retrieval functions are:
>
> — : extracts an SQL value of a predefined type from a
> JSON text.
>
> — : extracts a JSON text from a JSON text.
>
> — : converts a JSON text to an SQL table.
>
> — : tests whether a string value is or is not properly
> formed JSON text.
>
> — : tests whether an SQL/JSON path expression
> returns any SQL/JSON items.
>
> The SQL/JSON construction functions are:
>
> — : generates a string that is a serialization of
> an SQL/JSON object.
>
> — : generates a string that is a serialization of
> an SQL/JSON array.
>
> — : generates, from an aggregation of
> SQL data, a string that is a serialization
>
> of an SQL/JSON object.
>
> — : generates, from an aggregation of
> SQL data, a string that is a serialization
>
> of an SQL/JSON array.
>
> A JSON-returning function is an SQL/JSON construction function or
> JSON_QUERY.”
>
> The standard describes SQL/JSON path language, which used by SQL/JSON
> query operators to query JSON. It defines path language as string literal.
> We implemented the path language as  JSONPATH data type, since other
> approaches are not friendly to planner and executor.
>
> The functions and JSONPATH provide a new functionality for json support,
> namely, ability to operate (in standard specified way) with json structure
> at SQL-language level - the often requested feature by the users.
>
> The patch is consists of about 15000 insertions (about 5000 lines are from
> tests), passes all regression tests and doesn’t touches critical parts, so
> we hope with community help to bring it to committable state.
>
> Authors: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov and Alexander
> Korotkov
>
> Implementation notes:
>
>
>1.
>
>We didn’t implemented ‘datetime’ support, since it’s not clear from
>standard.
>2.
>
>JSON_OBJECT/JSON_OBJECTAGG (KEY  VALUE , ...) doesn’t
>  

Re: [HACKERS] Logical replication existing data copy

2017-02-28 Thread Erik Rijkers

On 2017-02-28 07:38, Erik Rijkers wrote:

On 2017-02-27 15:08, Petr Jelinek wrote:


0001-Use-asynchronous-connect-API-in-libpqwalreceiver.patch 
+
0002-Fix-after-trigger-execution-in-logical-replication.patch   
+
0003-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION.patch 
+
snapbuild-v4-0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch 
+

snapbuild-v4-0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
snapbuild-v4-0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch 
 +
snapbuild-v4-0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  
+
snapbuild-v4-0005-Skip-unnecessary-snapshot-builds.patch
+

0001-Logical-replication-support-for-initial-data-copy-v6.patch


This is the most frequent error that happens while doing pgbench-runs 
over logical replication: I run it continuously all day, and every few 
hours an error occurs of the kind seen below: a table (pgbench_history, 
mostly) ends up 1 row short (673466 instead of 673467).  I have the 
script wait a long time before calling it an error (because in theory it 
could still 'finish' and end successfully, although that has not 
happened yet once the system got into this state).


-- pgbench -c 16 -j 8 -T 120 -P 24 -n -M simple  --  scale 25

[...]
6972 a,b,t,h:  250 25250 673467   e53236c09 643235708 
f952814c3 559d618cd   master
6973 a,b,t,h:  250 25250 673466   e53236c09 643235708 
f952814c3 4b09337e3   replica NOK   a22fb00a6

-- wait another 5 s   (total 20 s) (unchanged 1)
-- getting md5 (cb)
6972 a,b,t,h:  250 25250 673467   e53236c09 643235708 
f952814c3 559d618cd   master
6973 a,b,t,h:  250 25250 673466   e53236c09 643235708 
f952814c3 4b09337e3   replica NOK   a22fb00a6

-- wait another 5 s   (total 25 s) (unchanged 2)
-- getting md5 (cb)
6972 a,b,t,h:  250 25250 673467   e53236c09 643235708 
f952814c3 559d618cd   master
6973 a,b,t,h:  250 25250 673466   e53236c09 643235708 
f952814c3 4b09337e3   replica NOK   a22fb00a6

-- wait another 5 s   (total 30 s) (unchanged 3)
-- getting md5 (cb)
6972 a,b,t,h:  250 25250 673467   e53236c09 643235708 
f952814c3 559d618cd   master
6973 a,b,t,h:  250 25250 673466   e53236c09 643235708 
f952814c3 4b09337e3   replica NOK   a22fb00a6

-- wait another 5 s   (total 35 s) (unchanged 4)


I gathered some info in this (probably deadlocked) state in the hope 
there is something suspicious in there:



UID PID   PPID  C STIME TTY  STAT   TIME CMD
rijkers   71203  1  0 20:06 pts/57   S  0:00 postgres -D 
/var/data1/pg_stuff/pg_installations/pgsql.logical_replication2/data -p 
6973 -c wal_level=replica [...]
rijkers   71214  71203  0 20:06 ?Ss 0:00  \_ postgres: 
logger process
rijkers   71216  71203  0 20:06 ?Ss 0:00  \_ postgres: 
checkpointer process
rijkers   71217  71203  0 20:06 ?Ss 0:00  \_ postgres: 
writer process
rijkers   71218  71203  0 20:06 ?Ss 0:00  \_ postgres: wal 
writer process
rijkers   71219  71203  0 20:06 ?Ss 0:00  \_ postgres: 
autovacuum launcher process
rijkers   71220  71203  0 20:06 ?Ss 0:00  \_ postgres: stats 
collector process
rijkers   71221  71203  0 20:06 ?Ss 0:00  \_ postgres: 
bgworker: logical replication launcher
rijkers   71222  71203  0 20:06 ?Ss 0:00  \_ postgres: 
bgworker: logical replication worker 30042


rijkers   71201  1  0 20:06 pts/57   S  0:00 postgres -D 
/var/data1/pg_stuff/pg_installations/pgsql.logical_replication/data -p 
6972 -c wal_level=logical [...]
rijkers   71206  71201  0 20:06 ?Ss 0:00  \_ postgres: 
logger process
rijkers   71208  71201  0 20:06 ?Ss 0:00  \_ postgres: 
checkpointer process
rijkers   71209  71201  0 20:06 ?Ss 0:00  \_ postgres: 
writer process
rijkers   71210  71201  0 20:06 ?Ss 0:00  \_ postgres: wal 
writer process
rijkers   71211  71201  0 20:06 ?Ss 0:00  \_ postgres: 
autovacuum launcher process
rijkers   71212  71201  0 20:06 ?Ss 0:00  \_ postgres: stats 
collector process
rijkers   71213  71201  0 20:06 ?Ss 0:00  \_ postgres: 
bgworker: logical replication launcher
rijkers   71223  71201  0 20:06 ?Ss 0:00  \_ postgres: wal 
sender process rijkers [local] idle





-- replica:
 port | shared_buffers | work_mem | m_w_m | e_c_s
--++--+---+---
 6973 | 100MB  | 50MB | 2GB   | 64GB
(1 row)

select  current_setting('port') as port
, datname   as db
,  to_char(pg_database_size(datname), '9G999G999G999G999')
 || ' (' ||  pg_size_pretty(pg_database_size(datname)) || ')' as 
dbsize

, pid
, application_name  as app
, xact_start
, query_start
, regexp_replace( cast(now() - query_start as text), 
E'\.[[:digit:]]+\$', '')   

[HACKERS] Cost model for parallel CREATE INDEX

2017-02-28 Thread Peter Geoghegan
There are a couple of open items for the parallel CREATE INDEX patch
that at this point represent blockers to commit, IMV. The first is
around a deficiency in the shared refcount mechanism, which is well
understood and doesn't need to be rehashed on this thread. The second
is the cost model, which is what I want to talk about here.

Currently, the cost model scales the number of workers at logarithmic
intervals, in the style of compute_parallel_worker(), but without
considering heap pages (that's actually broken out, for now). I'm
calling a new function that lives in planner.c, right next to
plan_cluster_use_sort(). ISTM that we ought to be considering both
heap pages and indexes pages, which makes the new signature of
compute_parallel_worker() (which now anticipates the needs of parallel
index scans) interesting to me, so that's something that I need to
address.

Right now, as of V8, we:

0. See if the parallel_workers *index* storage parameter is set (I've
added this new storage parameter, but I think that Amit has or will
add the same new index storage parameter for parallel index scan [1]).

1. Estimate/project the size of the finished index, using a new
nbtpage.c function. Note that this does the right thing with partial
indexes and things like that. It uses pg_statistic stats, where
available.

(My testing patch 0002-* has had an SQL-callable function that lets
reviewers easily determine what the projected size of some existing
index is, which might be a good idea to polish up and include as a
general purpose tool, apropos of nothing -- it is typically very
accurate [2]).

2. Input that size into the compute_parallel_worker()-style
logarithmic scaling of number of workers.

3. Calculate how many workers will have at least a full
maintenance_work_mem share doled out, while still having at least
min_parallel_relation_size of workMem in tuplesort.c.

(I guess I should say min_parallel_table_scan_size or even
min_parallel_index_scan_size now, but whatever).

4. Return the smaller of the worker calculations from steps 2 and 3.

So, a low maintenance_work_mem may cap our original suggested number
of workers. This cap isn't particularly likely to be applied, though.
Note also that the max_parallel_workers_maintenance GUC is given the
opportunity to cap things off. This is the utility statement
equivalent of max_parallel_workers_per_gather.
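
Put as pseudo-code, the above amounts to something like this (the constants
and names are illustrative placeholders, not the actual V8 code; step 1's
projected index size arrives here as an argument):

static int
compute_parallel_sort_workers_sketch(int requested_workers,
									 BlockNumber projected_index_pages,
									 long maintenance_work_mem_kb)
{
	int			scaled_workers = 1;
	int			mem_limited_workers;

	/* step 0: an explicit parallel_workers storage parameter wins outright */
	if (requested_workers >= 0)
		return requested_workers;

	/* step 2: scale workers logarithmically with the projected index size */
	while (projected_index_pages >= 3 * MIN_PARALLEL_SIZE_PAGES &&
		   scaled_workers < WORKER_CAP)
	{
		scaled_workers++;
		projected_index_pages /= 3;
	}

	/* step 3: each worker should still get a useful share of workMem */
	mem_limited_workers =
		Max(1, (int) (maintenance_work_mem_kb / MIN_SORT_MEM_PER_WORKER_KB));

	/* step 4: take the smaller of the two limits */
	return Min(scaled_workers, mem_limited_workers);
}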

Issues with this:

* This scales based on output size (projected index size), not input
size (heap scan input). Apparently, that's what we always do right
now.

* This is dissimilar to how we cost parallel seq scans. There, the
only cost associated with going with a parallel access path is fixed
startup overheads, and IPC costs (paid in parallel_tuple_cost units).
So, we're not doing a comparison against a serial and parallel plan,
even though we might want to, especially because parallel CREATE INDEX
always uses temp files, unlike serial CREATE INDEX. cost_sort() is
never involved in any of this, and in any case isn't prepared to cost
parallel sorts right now.

* OTOH, there is less sense in doing the equivalent of charging for
IPC overhead that something like a Gather node incurs costs for during
planning, because to some degree the IPC involved is inherently
necessary. If you were going to get an external sort anyway, well,
that still involves temp files that are written to and read from.
Whether or not it's the same backend that does the writing as the
reading in all cases may matter very little. So, the main factor that
discourages parallel sequential scans doesn't really exist for
parallel CREATE INDEX.

I am tempted to move to something closer to what you see elsewhere,
where a serial path and a partial path are both created. This would make
some sense for parallel CREATE INDEX, because you happen to have the
issue of parallelism effectively forcing an external sort. But
otherwise, it wouldn't make that much sense, because parallelism is
always going to help up to the point that all cores are in use, or at
least not hurt. Testing shows this to be the case. It's not as if
there are obvious IPC costs that push things against parallelism. This
is generally good, because the danger of using too many workers is
much less pronounced -- it's demonstrably very small, especially if
you assume a baseline of a serial external sort based CREATE INDEX.

What direction does the cost model need to go in?

I still lean towards the approach V8 takes, though the scaling should
possibly use heap pages and index pages with
compute_parallel_worker(), while not worrying about the input/output
distinction. It's not very appealing to have to teach cost_sort()
about any of this, since the things that considers currently are hard
to map onto parallelism. Besides, it's not as if there is a world of
difference between a serial internal sort CREATE INDEX, and a parallel
external sort CREATE INDEX with lots of memory. It's not a potentially
very large difference, as we see with the sort 

[HACKERS] SQL/JSON in PostgreSQL

2017-02-28 Thread Oleg Bartunov
Hi there,


Attached patch is an implementation of SQL/JSON data model from SQL-2016
standard (ISO/IEC 9075-2:2016(E)), which was published 2016-12-15 and is
available only for purchase from ISO web site (
https://www.iso.org/standard/63556.html). Unfortunately I didn't find any
public sources of the standard or any preview documents, but Oracle
implementation of json support in 12c release 2 is very close (
http://docs.oracle.com/database/122/ADJSN/json-in-oracle-database.htm),
also we used https://livesql.oracle.com/  to understand some details.

Postgres already has two json data types - json and jsonb - and implementing
another json data type, which strictly conforms to the standard, would not be
a good idea. Moreover, the SQL standard doesn't describe a data type, but only a
data model, which “comprises SQL/JSON items and SQL/JSON sequences. The
components of the SQL/JSON data model are:

1) An SQL/JSON item is defined recursively as any of the following:

a) An SQL/JSON scalar, defined as a non-null value of any of the following
predefined (SQL) types:

character string with character set Unicode, numeric, Boolean, or datetime.

b) An SQL/JSON null, defined as a value that is distinct from any value of
any SQL type.

NOTE 122 — An SQL/JSON null is distinct from the SQL null value.

c) An SQL/JSON array, defined as an ordered list of zero or more SQL/JSON
items, called the SQL/JSON

elements of the SQL/JSON array.

d) An SQL/JSON object, defined as an unordered collection of zero or more
SQL/JSON members….

“

Our jsonb corresponds to SQL/JSON with UNIQUE KEYS and implicit ordering of
keys, and our main intention was to provide support of jsonb as the most
important and usable data type.

We created repository for reviewing (ask for write access) -
https://github.com/postgrespro/sqljson/tree/sqljson


Examples of usage can be found in src/test/regress/sql/sql_json.sql

The whole documentation about json support should be reorganized and added,
and we plan to do this before release. We need the community's help here.

Our goal is to provide support of main features of SQL/JSON to release 10,
as we discussed at developers meeting in Brussels (Andrew Dunstan has
kindly agreed to review the patch).

We did not have much time to develop the complete support (because of the
standard's late availability), but we hope all major features are here, namely, all nine
functions as described in the standard (but see implementation notes below):

“All manipulation (e.g., retrieval, creation, testing) of SQL/JSON items is
performed through a number of SQL/JSON functions. There are nine such
functions, categorized as SQL/JSON retrieval functions and SQL/JSON
construction functions. The SQL/JSON retrieval functions are characterized
by operating on JSON data and returning an SQL value (possibly a Boolean
value) or a JSON value. The SQL/JSON construction functions return JSON
data created from operations on SQL data or other JSON data.

The SQL/JSON retrieval functions are:

— : extracts an SQL value of a predefined type from a
JSON text.

— : extracts a JSON text from a JSON text.

— : converts a JSON text to an SQL table.

— : tests whether a string value is or is not properly
formed JSON text.

— : tests whether an SQL/JSON path expression
returns any SQL/JSON items.

The SQL/JSON construction functions are:

— : generates a string that is a serialization of
an SQL/JSON object.

— : generates a string that is a serialization of
an SQL/JSON array.

— : generates, from an aggregation of
SQL data, a string that is a serialization

of an SQL/JSON object.

— : generates, from an aggregation of SQL
data, a string that is a serialization

of an SQL/JSON array.

A JSON-returning function is an SQL/JSON construction function or
JSON_QUERY.”

The standard describes the SQL/JSON path language, which is used by SQL/JSON query
operators to query JSON. It defines the path language as a string literal. We
implemented the path language as a JSONPATH data type, since other
approaches are not friendly to the planner and executor.

The functions and JSONPATH provide a new functionality for json support,
namely, ability to operate (in standard specified way) with json structure
at SQL-language level - the often requested feature by the users.

The patch consists of about 15000 insertions (about 5000 lines are from
tests), passes all regression tests, and doesn't touch critical parts, so
we hope with community help to bring it to a committable state.

Authors: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov and Alexander Korotkov

Implementation notes:


   1.

   We didn't implement 'datetime' support, since it's not clear from the
   standard.
   2.

   JSON_OBJECT/JSON_OBJECTAGG (KEY  VALUE , ...) isn't
   implemented; only (:, …) and ( VALUE , …) are
   supported, because of grammar conflicts with the leading KEY keyword.
   3.

   FORMAT (JSON|JSONB) in JSON_ARRAYAGG with a subquery isn't supported,
   because of grammar conflicts with the non-reserved word FORMAT.
   4.

   JSONPATH 

Re: [HACKERS] PATCH: two slab-like memory allocators

2017-02-28 Thread Andres Freund
Hi,

On 2017-02-27 23:44:20 -0800, Andres Freund wrote:
> *preliminary* patch attached. This needs a good bit of polishing
> (primarily comment work, verifying that valgrind works), but I'm too
> tired now.
> 
> I'm not quite sure how to deal with mmgr/README - it's written as kind
> of a historical document, and the "Mechanisms to Allow Multiple Types of
> Contexts" is already quite out of date.  I think it'd be good to rip out
> all the historical references and just describe the current state, but
> I'm not really enthusiastic about tackling that :/

While still not enthusiastic, I took a stab at doing so.  While still
not perfect, I do think this is an improvement.

Is anybody uncomfortable going away from the current historical account
style?

- Andres
>From 5eef2abe5e593b6a9b072b58bbecadbe15689a01 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 28 Feb 2017 10:36:29 -0800
Subject: [PATCH 2/2] Overhaul mmgr/'s README.

The README was written as a "historical account", and that style
hasn't aged particularly well.  Rephrase it to describe the current
situation, instead of having various version specific comments.

This also updates the description of how allocated chunks are
associated with their corresponding context, the method of which has
changed with the previous commit.

Author: Andres Freund
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc729...@2ndquadrant.com
---
 src/backend/utils/mmgr/README | 292 +++---
 1 file changed, 132 insertions(+), 160 deletions(-)

diff --git a/src/backend/utils/mmgr/README b/src/backend/utils/mmgr/README
index f97d7653de..b83b29c268 100644
--- a/src/backend/utils/mmgr/README
+++ b/src/backend/utils/mmgr/README
@@ -1,15 +1,7 @@
 src/backend/utils/mmgr/README
 
-Notes About Memory Allocation Redesign
-==
-
-Up through version 7.0, Postgres had serious problems with memory leakage
-during large queries that process a lot of pass-by-reference data.  There
-was no provision for recycling memory until end of query.  This needed to be
-fixed, even more so with the advent of TOAST which allows very large chunks
-of data to be passed around in the system.  This document describes the new
-memory management system implemented in 7.1.
-
+Memory Context System Design Overview
+=
 
 Background
 --
@@ -38,10 +30,10 @@ to or get more memory from the same context the chunk was originally
 allocated in.
 
 At all times there is a "current" context denoted by the
-CurrentMemoryContext global variable.  The backend macro palloc()
-implicitly allocates space in that context.  The MemoryContextSwitchTo()
-operation selects a new current context (and returns the previous context,
-so that the caller can restore the previous context before exiting).
+CurrentMemoryContext global variable.  palloc() implicitly allocates space
+in that context.  The MemoryContextSwitchTo() operation selects a new current
+context (and returns the previous context, so that the caller can restore the
+previous context before exiting).
 
 The main advantage of memory contexts over plain use of malloc/free is
 that the entire contents of a memory context can be freed easily, without
@@ -60,8 +52,10 @@ The behavior of palloc and friends is similar to the standard C library's
 malloc and friends, but there are some deliberate differences too.  Here
 are some notes to clarify the behavior.
 
-* If out of memory, palloc and repalloc exit via elog(ERROR).  They never
-return NULL, and it is not necessary or useful to test for such a result.
+* If out of memory, palloc and repalloc exit via elog(ERROR).  They
+never return NULL, and it is not necessary or useful to test for such
+a result.  With palloc_extended() that behavior can be overridden
+using the MCXT_ALLOC_NO_OOM flag.
 
 * palloc(0) is explicitly a valid operation.  It does not return a NULL
 pointer, but a valid chunk of which no bytes may be used.  However, the
@@ -71,28 +65,18 @@ error.  Similarly, repalloc allows realloc'ing to zero size.
 * pfree and repalloc do not accept a NULL pointer.  This is intentional.
 
 
-pfree/repalloc No Longer Depend On CurrentMemoryContext

+The Current Memory Context
+--
 
-Since Postgres 7.1, pfree() and repalloc() can be applied to any chunk
-whether it belongs to CurrentMemoryContext or not --- the chunk's owning
-context will be invoked to handle the operation, regardless.  This is a
-change from the old requirement that CurrentMemoryContext must be set
-to the same context the memory was allocated from before one can use
-pfree() or repalloc().
-
-There was some consideration of getting rid of CurrentMemoryContext entirely,
-instead requiring the target memory context for allocation to be specified
-explicitly.  But we decided that would be too much notational overhead ---
-we'd 

Re: [HACKERS] New Committer - Andrew Gierth

2017-02-28 Thread David Fetter
On Tue, Feb 28, 2017 at 01:22:22PM -0500, Stephen Frost wrote:
> Greetings!
> 
> The PostgreSQL committers would like to welcome Andrew Gierth as a
> new committer for the PostgreSQL project.

Congratulations!

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New Committer - Andrew Gierth

2017-02-28 Thread Andres Freund
Hi,


On 2017-02-28 13:22:22 -0500, Stephen Frost wrote:
> Greetings!
> 
> The PostgreSQL committers would like to welcome Andrew Gierth as a new
> committer for the PostgreSQL project.
> 
> Andrew - welcome!

Congratulations Andrew!

We now have two Andrews and one Andres amongst the committers...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] New Committer - Andrew Gierth

2017-02-28 Thread Stephen Frost
Greetings!

The PostgreSQL committers would like to welcome Andrew Gierth as a new
committer for the PostgreSQL project.

Andrew - welcome!

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Should we cacheline align PGXACT?

2017-02-28 Thread Simon Riggs
On 28 February 2017 at 11:34, Ashutosh Sharma  wrote:


> So, Here are the pgbench results I got with '
> *reduce_pgxact_access_AtEOXact.v2.patch*' on a read-write workload.
>

Thanks for performing a test.

I see a low yet noticeable performance gain across the board on that
workload.

That is quite surprising to see a gain on that workload. The main workload
we have been discussing was the full read-only test (-S). For that case the
effect should be much more noticeable based upon Andres' earlier comments.

Would it be possible to re-run the test using only the -S workload? Thanks
very much.


>
> *pgbench settings:*
> pgbench -i -s 300 postgres
> pgbench -M prepared -c $thread -j $thread -T $time_for_reading  postgres
>
> where, time_for_reading = 30mins
>
> *non default GUC param*
> shared_buffers=8GB
> max_connections=300
>
> pg_wal is located in SSD.
>
> CLIENT COUNT   TPS (HEAD)   TPS (PATCH)   % IMPROVEMENT
> 4              2588         2601          0.5023183926
> 8              5094         5098          0.0785237534
> 16             10294        10307         0.1262871576
> 32             19779        19815         0.182011224
> 64             27908        28346         1.569442454
> 72             27823        28416         2.131330194
> 128            28455        28618         0.5728342998
> 180            26739        26879         0.5235797898
> 196            27820        27963         0.5140186916
> 256            28763        28969         0.7161978931
>
> Also, Excel sheet (results-readwrite-300-SF.xlsx) containing the results
> for all the 3 runs is attached.
>



-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] avoid bloat from CREATE INDEX CONCURRENTLY

2017-02-28 Thread Simon Riggs
On 28 February 2017 at 13:30, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 28 February 2017 at 13:05, Tom Lane  wrote:
>>> Um ... isn't there a transaction boundary there anyway?
>
>> Yes, the patch releases the snapshot early, so it does not hold it
>> once the build scan has completed. This allows the sort and build
>> phases to occur without holding back the xmin.
>
> Oh ... so Alvaro explained it badly.  The reason this is specific to
> btree is that it's the only AM with any significant post-scan building
> time.
>
> However, now that I read the patch: this is a horribly ugly hack.
> I really don't like the API (if it even deserves the dignity of that
> name) that you've added to snapmgr.  I supposwe the zero documentation
> for it fits in nicely with the fact that it's a badly-thought-out kluge.

WTF. Frankly, knowing it would generate such a ridiculously negative
response was the reason it wasn't me that submitted it and why it's not
fully documented. Documentation in this case would be a short
paragraph in the index AM, explaining for the user what is already in
code comments.

You're right to point out that there is significant post-scan build
time and the reduction in bloat during that time is well worth the
trouble. I'm pleased to have thought of it and to have contributed it
to the community.

> I think it would be better to just move the responsibility for snapshot
> popping in this sequence to the index AMs, full stop.

There were two choices: a) leave the responsibility to the index AM,
giving a clean API, or b) don't trust that all index AMs would know or
implement this correctly. If the index AM doesn't implement this
correctly it becomes a crash bug, which seemed unacceptable in an
extensible server.

After implementing (a), I chose (b) and took extra time to implement
the ugly API in preference to the possibility of a crash bug. I am
open to following consensus on that and to resubmit other patches as
required.
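
To spell out the mechanism under discussion, its shape is roughly the
following (a sketch only; the helper names are invented and this is not the
submitted patch):

void
build_index_with_early_snapshot_release(Relation heap, Relation index,
										IndexInfo *indexInfo)
{
	Snapshot	snapshot = RegisterSnapshot(GetTransactionSnapshot());

	/* the snapshot is only needed while the heap is actually being scanned */
	scan_heap_collecting_tuples(heap, index, indexInfo, snapshot);

	/*
	 * Release it before the potentially long sort and build phases, so this
	 * backend no longer holds back the global xmin while they run.
	 */
	UnregisterSnapshot(snapshot);

	sort_and_write_index(index);
}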

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow pg_dumpall to work without pg_authid

2017-02-28 Thread Simon Riggs
On 28 February 2017 at 16:12, Stephen Frost  wrote:
> Robert,
>
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Sun, Feb 19, 2017 at 12:24 AM, Robins Tharakan  wrote:
>> > I would like to work on a patch to accommodate restricted environments 
>> > (such
>> > as AWS RDS Postgres) which don't allow pg_authid access since their
>> > definition of Superuser is just a regular user with extra permissions.
>> >
>> > Would you consider a patch to add a flag to work around this restriction,
>> > Or, do you prefer that this be maintained outside core?
>>
>> I am a little surprised that this patch has gotten such a good
>> reception.  We haven't in the past been all that willing to accept
>> core changes for the benefit of forks of PostgreSQL; extensions, sure,
>> but forks?  Maybe we should take the view that Amazon has broken this
>> and Amazon ought to fix it, rather than making it our job to (try to)
>> work around their bugs.
>
> I suspect it's gotten a reasonably good reception because it's about
> making it possible to do more things as a non-superuser, which is
> something that we've got a number of different people working on
> currently, with two interesting patches from Dave geared towards doing
> that for monitoring.  I have no doubt that there will be other users of
> this, which means that it isn't "for the benefit of forks" but for users
> of core too.

If this was of benefit only to one company it would not get time or
attention from me. I thought that would have been obvious enough, but
if not, I'm happy to say it clearly.

The enhanced patch by me removes any mention of specific vendors or approaches.

I've edited the stated reason for the patch on the CF app, so its
clearer as to why this might be acceptable.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance degradation in TPC-H Q18

2017-02-28 Thread Andres Freund
Hi,


On 2017-02-26 19:30:32 +0530, Robert Haas wrote:
> On Wed, Feb 22, 2017 at 11:23 AM, Kuntal Ghosh
>  wrote:
> > While conducting the experiments for parallelism, Rafia came across a
> > hang in Q18 when plan uses partial and finalize hash aggregate. This
> > could be seen on both scale factors - 20 and 300, on setting work_mem
> > high enough so that the query uses hash aggregate. It seems that
> > commit b81b5a96f424531b97cdd1dba97d9d1b9c9d372e does not solve the
> > issue completely.
> 
> Andres, any thoughts?  Isn't the same issue that we were discussing
> https://www.postgresql.org/message-id/CA+TgmoYNO8qouPVO=1q2axzuxe942d_t5bcvzd9ikoc9tb3...@mail.gmail.com
> over a month ago?

Yes, I presume that it is.


> To me, it seems like a big problem that we have large, unfixed
> performance regressions with simplehash four months after it went in.

Yea, I agree.  I'm fairly sure that the patch I posted in that thread
actually fixes the issue (and would also have made the already-applied hash
patch of yours a small-ish optimization). I held back because I
disliked the idea of magic constants, and I couldn't figure out a way to
properly "derive" them - but I'm inclined to simply live with the magic
constants.
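
To sketch the kind of heuristic I have in mind (the function name and both
constants below are invented for illustration; they are not taken from the
actual patch):

static inline bool
hashtable_should_grow(uint32 members, uint32 size, uint32 insert_distance)
{
    const uint32 max_distance = 25;     /* invented constant */
    const double min_fillfactor = 0.1;  /* invented constant */

    /*
     * Grow not only at a fixed fill factor, but also when an insertion had
     * to probe unusually far from its ideal bucket - that clustering is
     * what makes lookups degenerate into a linear search.
     */
    if (insert_distance > max_distance &&
        members >= (uint32) (size * min_fillfactor))
        return true;

    /* retain the usual fill-factor based growth as well */
    return members >= (uint32) (size * 0.9);
}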


> I hate to suggest ripping the whole thing out, and it seems like
> overkill

Yea, I don't think we're there yet.  Let me push the patch that resizes
adaptively, and we can see how things are going from there.


> , but it's pretty clear to me that the current state of things
> is unacceptable, and that we're going to have a lot of unhappy users
> if we ship it the way that it is right now.

Yea, if we can't improve upon the current state, we'd need to revert.


> I want to point out that
> the kinds of problems we're hitting here are exactly what I told you I
> was worried about before it went in - that the average-case
> performance would be better but that there would be
> all-too-easy-to-hit cases where things got much worse because the
> whole thing degenerated into a linear search.  Not only did that
> happen, but there seem to be multiple ways of producing it without
> half trying, of which b81b5a96f424531b97cdd1dba97d9d1b9c9d372e fixed
> only one.

Note that that one is also fixed with what I'm proposing (but it's a
worthwhile small-ish improvement nonetheless).


> Something that's 5-10% faster in common cases but 2x or 10x
> slower when things go wrong is not an improvement.

The hash-table part is more like 3x faster - but as so often when making
things faster, the next bottleneck is just around the corner...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-28 Thread Andreas Karlsson

Hi,

Here is a third take on this feature, heavily based on Michael Paquier's 
2.0 patch. This time the patch does not attempt to preserve the index 
oids, but instead creates new indexes and moves all dependencies from 
the old indexes to the new before dropping the old ones. The only 
downside I can see to this approach is that we will no longer be able to 
reindex catalog tables concurrently, but in return it should be easier 
to confirm that this approach can be made to work.


This patch relies on the fact that we can change the indisvalid flag of 
indexes transactionally, and as far as I can tell this is the case now that 
we have MVCC catalog access.


The code does some extra intermediate commits when building the indexes 
to avoid long running transactions.


How REINDEX CONCURRENTLY operates:

For each table:

1. Create new indexes without populating them, and lock the tables and 
indexes for the session.


2. After waiting for all running transactions populate each index in a 
separate transaction and set them to ready.


3. After waiting again for all running transactions validate each index 
in a separate transaction (but not setting them to valid just yet).


4. Swap all dependencies over from each old index to the new index and 
rename the old and the new indexes (the old index becomes <name>_ccold and 
<name>_new becomes <name>), and set the isprimary and isexclusion flags. Here we 
also mark the new indexes as valid and the old indexes as invalid.


5. After waiting for all running transactions we change each index from 
invalid to dead.


6. After waiting for all running transactions we drop each index.

7. Drop all session locks.

Andreas
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 
  Acquired by VACUUM (without FULL),
- ANALYZE, CREATE INDEX CONCURRENTLY, and
+ ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
   convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  REINDEX will perform a concurrent build if 
+  CONCURRENTLY is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY
+  command. Indexes of toast relations can be rebuilt with REINDEX
+  CONCURRENTLY.
  
 
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 

+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+
+   
 VERBOSE
 
  
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is finished. This could have a severe effect if the system is
+a live production database. Very large tables can take many hours to be
+indexed, and even for smaller tables, an index rebuild can lock out writers
+for periods that are unacceptably long for a production system.
+   
+
+   
+PostgreSQL supports rebuilding indexes with minimum locking
+of writes.  This method is invoked by specifying the
+CONCURRENTLY option of 

[HACKERS] update comments about CatalogUpdateIndexes

2017-02-28 Thread Tomas Vondra

Hi,

commit 2f5c9d9c9ce removed CatalogUpdateIndexes and replaced it with 
CatalogTupleInsert/CatalogTupleUpdate, which do both the operation and 
index update.


But there remained three places in comments still referencing the 
removed CatalogUpdateIndexes. Attached is a patch fixing those places. 
It also removes the comment attribution to "bjm" from syscache.c, 
because after this modification it is no longer the original comment.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


catalog-update-indexes-comments.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-02-28 Thread Brandur Leach
Hi Julien,

Thanks for the expedient reply, even after I'd dropped the
ball for so long :)

> Indeed, I should have checked more examples :/ There
> isn't any clear pattern for this, so I guess any one
> would be ok.

Yeah, agreed. If it's alright with you, I ended up moving
the naming back to `macaddr_cmp_internal` just because it
results in a smaller final diff.

> Thanks. I'm ok with this, but maybe a native english
> speaker would have a better opinion on this.

Cool! I just re-read my own comment a few days later and I
think that it still mostly makes sense, but definitely open
to other edits if anyone else has one.
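
For anyone skimming, that comment describes roughly the following conversion.
This is only an illustrative sketch, not the code in the attached patch: the
function name, the 64-bit-Datum assumption, and the use of the compiler's
__builtin_bswap64 are mine.

#include "postgres.h"
#include "utils/inet.h"          /* macaddr */
#include "utils/sortsupport.h"

static Datum
macaddr_abbrev_convert_sketch(Datum original, SortSupport ssup)
{
    macaddr    *authoritative = (macaddr *) DatumGetPointer(original);
    Datum       res = 0;

    /* Copy the six address bytes; the remaining two bytes stay zero. */
    memcpy(&res, authoritative, 6);

#ifndef WORDS_BIGENDIAN
    /*
     * The bytes were copied most-significant-first, so on little-endian
     * hardware the value must be byte-swapped before it can be compared
     * as a native unsigned integer.
     */
    res = (Datum) __builtin_bswap64((uint64) res);
#endif

    return res;
}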

> One last thing, I noticed that you added:
>
> +static int macaddr_internal_cmp(macaddr *a1, macaddr *a2);
>
> but the existing function is declared as
>
> static int32
> macaddr_internal_cmp(macaddr *a1, macaddr *a2)
>
> I'd be in favor to declare both as int.

Great catch! I have no idea how I missed that. I've done as
you suggested and made them both "int", which seems
consistent with SortSupport implementations elsewhere.

> After this, I think this patch will be ready for committer.

Excellent! I've attached a new (and hopefully final)
version of the patch.

Two final questions about the process if you'd be so kind:

* Should I change the status on the Commitfest link [1] or
  do I leave that to you (or someone else like a committer)?

* I've been generating a new OID value with the
  `unused_oids` script, but pretty much every time I rebase
  I collide with someone else's addition and need to find a
  new one. Is it better for me to pick an OID in an exotic
  range for my final patch, or that a committer just finds
  a new one (if necessary) as they're putting it into
  master?

Thanks again!
Brandur

[1] https://commitfest.postgresql.org/10/743/


On Sat, Feb 25, 2017 at 9:56 AM, Julien Rouhaud 
wrote:

> On Sun, Feb 05, 2017 at 01:56:41PM -0800, Brandur Leach wrote:
>
> Hello Brandur, thanks for the updated patch!
>
> >
> > > * you used macaddr_cmp_internal() function name, for uuid
> > >   the same function is named uuid_internal_cmp().  Using
> > >   the same naming pattern is probably better.
> >
> > I was a little split on this one! It's true that UUID uses
> > `_internal_cmp`, but `_cmp_internal` is also used in a
> > number of places like `enum`, `timetz`, and `network`. I
> > don't have a strong feeling about it either way, so I've
> > changed it to `_internal_cmp` to match UUID as you
> > suggested.
> >
>
> Indeed, I should have checked more examples :/ There isn't any clear
> pattern
> for this, so I guess any one would be ok.
>
> > > * the function comment on macaddr_abbrev_convert()
> > >   doesn't mention specific little-endian handling
> >
> > I tried to bake this into the comment text. Here are the
> > relevant lines of the amended version:
> >
> > * Packs the bytes of a 6-byte MAC address into a Datum and treats it
> as
> > an
> > * unsigned integer for purposes of comparison. On a 64-bit machine,
> > there
> > * will be two zeroed bytes of padding. The integer is converted to
> > native
> > * endianness to facilitate easy comparison.
> >
> > > * "There will be two bytes of zero padding on the least
> > >   significant end"
> > >
> > > "least significant bits" would be better
> >
> > Also done. Here is the amended version:
> >
> > * On a 64-bit machine, zero out the 8-byte datum and copy the 6
> bytes of
> > * the MAC address in. There will be two bytes of zero padding on the
> end
> > * of the least significant bits.
> >
>
> Thanks.  I'm ok with this, but maybe a native english speaker would have a
> better opinion on this.
>
> > > * This patch will trigger quite a lot modifications after
> > >   a pgindent run.  Could you try to run pgindent on mac.c
> > >   before sending an updated patch?
> >
> > Good call! I've run the new version through pgindent.
> >
>
> Thanks also, no more issue here.
>
> > Let me know if you have any further feedback and/or
> > suggestions. Thanks!
>
> One last thing, I noticed that you added:
>
> +static int macaddr_internal_cmp(macaddr *a1, macaddr *a2);
>
> but the existing function is declared as
>
> static int32
> macaddr_internal_cmp(macaddr *a1, macaddr *a2)
>
> I'd be in favor to declare both as int.
>
> After this, I think this patch will be ready for committer.
>
> --
> Julien Rouhaud
> http://dalibo.com - http://dalibo.org
>


0003-Implement-SortSupport-for-macaddr-data-type.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow pg_dumpall to work without pg_authid

2017-02-28 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Sun, Feb 19, 2017 at 12:24 AM, Robins Tharakan  wrote:
> > I would like to work on a patch to accommodate restricted environments (such
> > as AWS RDS Postgres) which don't allow pg_authid access since their
> > definition of Superuser is just a regular user with extra permissions.
> >
> > Would you consider a patch to add a flag to work around this restriction,
> > Or, do you prefer that this be maintained outside core?
> 
> I am a little surprised that this patch has gotten such a good
> reception.  We haven't in the past been all that willing to accept
> core changes for the benefit of forks of PostgreSQL; extensions, sure,
> but forks?  Maybe we should take the view that Amazon has broken this
> and Amazon ought to fix it, rather than making it our job to (try to)
> work around their bugs.

I suspect it's gotten a reasonably good reception because it's about
making it possible to do more things as a non-superuser, which is
something that we've got a number of different people working on
currently, with two interesting patches from Dave geared towards doing
that for monitoring.  I have no doubt that there will be other users of
this, which means that it isn't "for the benefit of forks" but for users
of core too.

> On the other hand, maybe that approach is narrow-minded.

I agree that we shouldn't be accepting changes into core which are only
applicable and helpful for forks of PG.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Proposal : Parallel Merge Join

2017-02-28 Thread Dilip Kumar
On Mon, Feb 27, 2017 at 10:27 AM, Amit Kapila  wrote:
> Okay, but in that case don't you think it is better to consider the
> parallel safety of cheapest_total_inner only when we don't find any
> cheap parallel_safe innerpath by reducing the sort keys?

Well, we can do that, but suppose cheapest_total_inner is not parallel
safe and we do not get any parallel-safe path which is cheaper than
cheapest_total_inner; then we just end up making the merge join path
with the cheapest parallel-safe path, but we might have missed some of
the paths whose pathkeys cover more ordered keys.  Still, it's
hard to argue which is better, because we can always say that if we try
only the cheapest parallel-safe path we will generate fewer paths.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel bitmap heap scan

2017-02-28 Thread Dilip Kumar
On Sun, Feb 26, 2017 at 9:26 PM, Robert Haas  wrote:
> I'm not entirely sure about the calling convention for
> tbm_free_shared_area() but the rest seems OK.

Changed
>
>> 2. Now tbm_free is not freeing any of the shared members which can be
>> accessed by other worker so tbm_free is safe to call from
>> ExecEndBitmapHeapScan without any safety check or ref count.
>
> That also seems fine. We ended up with something very similar in the
> Parallel Index Scan patch.
>
>> 0002:
>> 1. We don't need ExecShutdownBitmapHeapScan anymore because now we are
>> not freeing any shared member in ExecEndBitmapHeapScan.
>> 2. In ExecReScanBitmapHeapScan we will call tbm_free_shared_area to
>> free the shared members of the TBM.
>> 3. After that, we will free TBMSharedIteratorState what we allocated
>> using tbm_prepare_shared_iterate.
>
> Check.  But I think tbm_free_shared_area() should also free the object
> itself, instead of making the caller do that separately.

Right, done that way.
>
> +if (DsaPointerIsValid(node->pstate->tbmiterator))
> +{
> +/* First we free the shared TBM members using the shared state */
> +tbm_free_shared_area(dsa, node->pstate->tbmiterator);
> +dsa_free(dsa, node->pstate->tbmiterator);
> +}
> +
> +if (DsaPointerIsValid(node->pstate->prefetch_iterator))
> +dsa_free(dsa, node->pstate->prefetch_iterator);
>
> The fact that these cases aren't symmetric suggests that your
> abstraction is leaky.  I'm guessing that you can't call
> tbm_free_shared_area because the two iterators share one copy of the
> underlying iteration arrays, and the TBM code isn't smart enough to
> avoid freeing them twice.  You're going to have to come up with a
> better solution to that problem; nodeBitmapHeapScan.c shouldn't know
> about the way the underlying storage details are managed.  (Maybe you
> need to reference-count the iterator arrays?)

Converted the iterator arrays into a structure and maintained a refcount. I
had to do the same thing for the pagetable as well, because that is also
shared across iterators.
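
Roughly, the shape I ended up with looks like the sketch below (the struct and
function names are invented for illustration; it assumes the dsa.h API and
that the caller serializes access to refcount):

typedef struct TBMSharedIterationArrays
{
    int         refcount;       /* iterators still referencing this */
    dsa_pointer pagetable;      /* shared page table */
    dsa_pointer spages;         /* sorted page entries */
    dsa_pointer schunks;        /* sorted chunk entries */
} TBMSharedIterationArrays;

static void
tbm_release_shared_arrays(dsa_area *dsa, dsa_pointer arrays_dp)
{
    TBMSharedIterationArrays *arrays;

    arrays = (TBMSharedIterationArrays *) dsa_get_address(dsa, arrays_dp);

    /* whichever iterator is released last frees the DSA memory exactly once */
    if (--arrays->refcount == 0)
    {
        if (DsaPointerIsValid(arrays->pagetable))
            dsa_free(dsa, arrays->pagetable);
        if (DsaPointerIsValid(arrays->spages))
            dsa_free(dsa, arrays->spages);
        if (DsaPointerIsValid(arrays->schunks))
            dsa_free(dsa, arrays->schunks);
        dsa_free(dsa, arrays_dp);
    }
}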

>
> +if (node->inited)
> +goto start_iterate;
>
> My first programming teacher told me not to use goto.  I've
> occasionally violated that rule, but I need a better reason than you
> have here.  It looks very easy to avoid.

Changed
>
> +pbms_set_parallel(outerPlanState(node));
>
> I think this should be a flag in the plan, and the planner should set
> it correctly, instead of having it be a flag in the executor that the
> executor sets.  Also, the flag itself should really be called
> something that involves the word "shared" rather than "parallel",
> because the bitmap will not be created in parallel, but it will be
> shared.
Done
>
> Have you checked whether this patch causes any regression in the
> non-parallel case?  It adds a bunch of "if" statements, so it might.
> Hopefully not, but it needs to be carefully tested.

I tested the new patch on my local machine with multiple
executions and it doesn't show any regression.  The attached file
perf_result contains the explain analyze output for one of the queries
(head, patch with 0 workers, and patch with 2 workers).  I will run
the test with all the TPC-H queries that use a bitmap plan on the
bigger machine and post the results next week.
>
> @@ -48,10 +48,11 @@
>  #include "utils/snapmgr.h"
>  #include "utils/tqual.h"
>
> -
>  static TupleTableSlot *BitmapHeapNext(BitmapHeapScanState *node);
>
> Unnecessary.
Fixed
>
> +static bool pbms_is_leader(ParallelBitmapState *pbminfo);
> +static void pbms_set_parallel(PlanState *node);
>
> I don't think this "pbms" terminology is very good.  It's dissimilar
> to the way other functions in this file are named.  Maybe
> BitmapShouldInitializeSharedState().
Changed
>
> I think that some of the bits that this function makes conditional on
> pstate should be factored out into inline functions.  Like:
>
> -if (node->prefetch_target >= node->prefetch_maximum)
> - /* don't increase any further */ ;
> -else if (node->prefetch_target >= node->prefetch_maximum / 2)
> -node->prefetch_target = node->prefetch_maximum;
> -else if (node->prefetch_target > 0)
> -node->prefetch_target *= 2;
> -else
> -node->prefetch_target++;
> +if (!pstate)
> +{
> +if (node->prefetch_target >= node->prefetch_maximum)
> + /* don't increase any further */ ;
> +else if (node->prefetch_target >= node->prefetch_maximum / 2)
> +node->prefetch_target = node->prefetch_maximum;
> +else if (node->prefetch_target > 0)
> +node->prefetch_target *= 2;
> +else
> +node->prefetch_target++;
> +}
> +else if (pstate->prefetch_target < 

Re: [HACKERS] IF (NOT) EXISTS in psql-completion

2017-02-28 Thread Stephen Frost
* David Fetter (da...@fetter.org) wrote:
> On Mon, Feb 27, 2017 at 11:53:17PM -0500, Stephen Frost wrote:
> > * Kyotaro HORIGUCHI (horiguchi.kyot...@lab.ntt.co.jp) wrote:
> > > I suppose it is for suggesting what kind of word should come
> > > there, or avoiding silence for a tab. Or for symmetry with other
> > > types of manipulation, like DROP. Another possibility is creating
> > > multiple objects with similar names, say CREATE TABLE employee_x1,
> > > CREATE TABLE employee_x2. Just trying to complete existing
> > > *schema* is one more another possible objective.
> > 
> > I don't buy any of these arguments either.  I *really* don't want us
> > going down some road where we try to make sure that hitting 'tab'
> > never fails...
> 
> Wouldn't that just be a correct, grammar-aware implementation of tab
> completion?  Why wouldn't you want that?

No, it wouldn't, it would mean we have to provide something for cases
where it doesn't make sense to try and provide an answer, as being
discussed here for CREATE TABLE.

We can't provide an answer based on tab-completion to what you want to
call your new table.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] timeouts in PostgresNode::psql

2017-02-28 Thread Andrew Dunstan


On 02/28/2017 07:39 AM, Alvaro Herrera wrote:
> Lately I've been wondering about backpatching the whole TAP test
> infrastructure, all the way back.  As we notice bugs, it's really useful
> to use newly added tests in all branches; but currently PostgresNode
> doesn't work with old branches, particularly since the '-w' switch was
> removed from pg_ctl invokations in PostgresNode->start and ->restart
> methods -- (the test just fail without any indication of what is going
> on).
>


+1

cheers

andrew

-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] hash partitioning

2017-02-28 Thread Aleksander Alekseev
Hi, Yugo.

Looks like a great feature! I'm going to take a closer look at your code
and write up feedback shortly. For now I can only say that you forgot
to include some documentation in the patch.

I've added a corresponding entry to the current commitfest [1]. Hope you
don't mind. If it's not too much trouble could you please register on the
commitfest site and add yourself to this entry as an author? I'm pretty
sure someone uses this information for writing release notes or
something like that.

[1] https://commitfest.postgresql.org/13/1059/

On Tue, Feb 28, 2017 at 11:33:13PM +0900, Yugo Nagata wrote:
> Hi all,
> 
> Now we have declarative partitioning, but hash partitioning is not
> implemented yet. Attached is a POC patch to add the hash partitioning
> feature. I know we will need more discussion about the syntax and other
> specifications before going ahead with the project, but I think this runnable
> code might help us discuss what to implement and how.
> 
> * Description
> 
> In this patch, the hash partitioning implementation is basically based
> on the list partitioning mechanism. However, partition bounds cannot be
> specified explicitly; instead the bound is used internally as the hash partition
> index, which is calculated when a partition is created or attached.
> 
> The tentative syntax to create a partitioned table is as below:
> 
>  CREATE TABLE h (i int) PARTITION BY HASH(i) PARTITIONS 3 USING hashint4;
> 
> The number of partitions is specified by PARTITIONS, which is currently
> constant and cannot be changed, but I think this needs to be changeable in
> some manner. A hash function is specified by USING. Maybe specifying the hash
> function could be omitted, in which case a default hash function
> corresponding to the key type would be used.
> 
> A partition table can be created as below:
> 
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;
> 
> The FOR VALUES clause cannot be used, and the partition bound is
> calculated automatically as a partition index of a single integer value.
> 
> When trying to create more partitions than the number specified
> by PARTITIONS, an error is raised.
> 
> postgres=# create table h4 partition of h;
> ERROR:  cannot create hash partition more than 3 for h
> 
> An inserted record is stored in a partition whose index equals
> abs(hashfunc(key)) % number-of-partitions. In the above
> example, this is abs(hashint4(i))%3.
> 
> postgres=# insert into h (select generate_series(0,20));
> INSERT 0 21
> 
> postgres=# select *,tableoid::regclass from h;
>  i  | tableoid 
> +--
>   0 | h1
>   1 | h1
>   2 | h1
>   4 | h1
>   8 | h1
>  10 | h1
>  11 | h1
>  14 | h1
>  15 | h1
>  17 | h1
>  20 | h1
>   5 | h2
>  12 | h2
>  13 | h2
>  16 | h2
>  19 | h2
>   3 | h3
>   6 | h3
>   7 | h3
>   9 | h3
>  18 | h3
> (21 rows)
> 
> * Todo / discussions
> 
> In this patch, we cannot change the number of partitions specified
> by PARTITIONS. If we can change this, the partitioning rule
> (partition index = abs(hashfunc(key)) % number of partitions)
> is also changed and then we need to reallocate records between
> partitions.
> 
> In this patch, the user can specify a hash function with USING. However,
> we might need default hash functions which are useful and
> proper for hash partitioning. 
> 
> Currently, even when we issue a SELECT query with a condition,
> postgres looks into all partitions regardless of each partition's
> constraint, because the constraint is a complicated expression such as "abs(hashint4(i))%3 = 0".
> 
> postgres=# explain select * from h where i = 10;
> QUERY PLAN
> --
>  Append  (cost=0.00..125.62 rows=40 width=4)
>->  Seq Scan on h  (cost=0.00..0.00 rows=1 width=4)
>  Filter: (i = 10)
>->  Seq Scan on h1  (cost=0.00..41.88 rows=13 width=4)
>  Filter: (i = 10)
>->  Seq Scan on h2  (cost=0.00..41.88 rows=13 width=4)
>  Filter: (i = 10)
>->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4)
>  Filter: (i = 10)
> (9 rows)
> 
> However, if we rewrite a condition into the same form of expression
> as the partition constraint, postgres can exclude unrelated
> tables from the search targets. So, we might avoid the problem
> by converting the qual properly before calling predicate_refuted_by().
> 
> postgres=# explain select * from h where abs(hashint4(i))%3 = 
> abs(hashint4(10))%3;
> QUERY PLAN
> --
>  Append  (cost=0.00..61.00 rows=14 width=4)
>->  Seq Scan on h  (cost=0.00..0.00 rows=1 width=4)
>  Filter: ((abs(hashint4(i)) % 3) = 2)
>->  Seq Scan on h3  (cost=0.00..61.00 rows=13 width=4)
>  Filter: ((abs(hashint4(i)) % 3) = 2)
> (5 rows)
> 
> Best regards,
> Yugo Nagata
> 
> -- 
> Yugo Nagata 

> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
> index 41c0056..3820920 100644
> --- 

Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-02-28 Thread Andreas Karlsson

On 02/28/2017 03:13 PM, Bruce Momjian wrote:

I might have added that one; the text is:

Consider disallowing multiple queries in PQexec()
as an additional barrier to SQL injection attacks

and it is a "consider" item.  Should it be moved to the Wire Protocol
Changes / v4 Protocol section or removed?


A new protocol version won't solve the breakage of the C API, so I am not 
sure we can ever drop this feature other than by adding a new function and 
something in the protocol to support this.
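
For what it's worth, an application that wants a single-statement restriction
today can already get it on the client side, since PQexecParams() accepts at
most one SQL command per call. A minimal sketch, assuming a connected
PGconn *conn and the usual libpq-fe.h/stdio.h includes:

    const char *ids[1] = {"42"};
    PGresult   *res;

    /* PQexecParams rejects strings containing more than one command */
    res = PQexecParams(conn,
                       "SELECT balance FROM accounts WHERE id = $1",
                       1, NULL, ids, NULL, NULL, 0);
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQclear(res);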


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] btree_gin and btree_gist for enums

2017-02-28 Thread Andrew Dunstan


On 02/27/2017 04:41 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> OK, here's the whole series of patches.
> I've not tested it at all, but this looks generally sane in a quick
> once-over.
>
> A minor quibble is that in 0003, you weren't terribly consistent about
> argument order --- in some places you have the FmgrInfo argument added
> before the collation argument, and in some places after.  I'd suggest
> trying to make the argument orders consistent with the fmgr.c support
> functions.  (I'm generally -1 on blindly adding stuff at the end.)
>
>   

Thanks for reviewing.


I don't mind changing it, but if I do I'm inclined to make it as
consistent as possible with the 0002 patch, which did put all the
FmgrInfo arguments at the end - there's not any other more obvious place
for them in that case, as there is no collation argument.

cheers

andrew

-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [POC] hash partitioning

2017-02-28 Thread Yugo Nagata
Hi all,

Now we have declarative partitioning, but hash partitioning is not
implemented yet. Attached is a POC patch to add the hash partitioning
feature. I know we will need more discussion about the syntax and other
specifications before going ahead with the project, but I think this runnable
code might help us discuss what to implement and how.

* Description

In this patch, the hash partitioning implementation is basically based
on the list partitioning mechanism. However, partition bounds cannot be
specified explicitly; instead the bound is used internally as the hash partition
index, which is calculated when a partition is created or attached.

The tentative syntax to create a partitioned table is as below:

 CREATE TABLE h (i int) PARTITION BY HASH(i) PARTITIONS 3 USING hashint4;

The number of partitions is specified by PARTITIONS, which is currently
constant and cannot be changed, but I think this needs to be changeable in
some manner. A hash function is specified by USING. Maybe specifying the hash
function could be omitted, in which case a default hash function
corresponding to the key type would be used.

A partition table can be created as below:

 CREATE TABLE h1 PARTITION OF h;
 CREATE TABLE h2 PARTITION OF h;
 CREATE TABLE h3 PARTITION OF h;

The FOR VALUES clause cannot be used, and the partition bound is
calculated automatically as a partition index of a single integer value.

When trying to create more partitions than the number specified
by PARTITIONS, an error is raised.

postgres=# create table h4 partition of h;
ERROR:  cannot create hash partition more than 3 for h

An inserted record is stored in a partition whose index equals
abs(hashfunc(key)) % number-of-partitions. In the above
example, this is abs(hashint4(i))%3.

postgres=# insert into h (select generate_series(0,20));
INSERT 0 21

postgres=# select *,tableoid::regclass from h;
 i  | tableoid 
+--
  0 | h1
  1 | h1
  2 | h1
  4 | h1
  8 | h1
 10 | h1
 11 | h1
 14 | h1
 15 | h1
 17 | h1
 20 | h1
  5 | h2
 12 | h2
 13 | h2
 16 | h2
 19 | h2
  3 | h3
  6 | h3
  7 | h3
  9 | h3
 18 | h3
(21 rows)
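
Spelled out as code, the routing rule above would look roughly like this (an
illustrative sketch only, not an excerpt from the patch; "hashfunc" stands for
whatever function was named in the USING clause, hashint4 in the example):

static int
hash_partition_index(FmgrInfo *hashfunc, Datum key, int16 nparts)
{
    int32       hashvalue;

    hashvalue = DatumGetInt32(FunctionCall1(hashfunc, key));

    /*
     * Mirrors the rule described above; a real implementation would also
     * have to be careful about abs(INT_MIN).
     */
    return abs(hashvalue) % nparts;
}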

* Todo / discussions

In this patch, we cannot change the number of partitions specified
by PARTITIONS. If we can change this, the partitioning rule
(partition index = abs(hashfunc(key)) % number of partitions)
is also changed and then we need to reallocate records between
partitions.

In this patch, the user can specify a hash function with USING. However,
we might need default hash functions which are useful and
proper for hash partitioning. 

Currently, even when we issue a SELECT query with a condition,
postgres looks into all partitions regardless of each partition's
constraint, because the constraint is a complicated expression such as "abs(hashint4(i))%3 = 0".

postgres=# explain select * from h where i = 10;
QUERY PLAN
--
 Append  (cost=0.00..125.62 rows=40 width=4)
   ->  Seq Scan on h  (cost=0.00..0.00 rows=1 width=4)
 Filter: (i = 10)
   ->  Seq Scan on h1  (cost=0.00..41.88 rows=13 width=4)
 Filter: (i = 10)
   ->  Seq Scan on h2  (cost=0.00..41.88 rows=13 width=4)
 Filter: (i = 10)
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4)
 Filter: (i = 10)
(9 rows)

However, if we rewrite a condition into the same form of expression
as the partition constraint, postgres can exclude unrelated
tables from the search targets. So, we might avoid the problem
by converting the qual properly before calling predicate_refuted_by().

postgres=# explain select * from h where abs(hashint4(i))%3 = 
abs(hashint4(10))%3;
QUERY PLAN
--
 Append  (cost=0.00..61.00 rows=14 width=4)
   ->  Seq Scan on h  (cost=0.00..0.00 rows=1 width=4)
 Filter: ((abs(hashint4(i)) % 3) = 2)
   ->  Seq Scan on h3  (cost=0.00..61.00 rows=13 width=4)
 Filter: ((abs(hashint4(i)) % 3) = 2)
(5 rows)

Best regards,
Yugo Nagata

-- 
Yugo Nagata 
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 41c0056..3820920 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -3074,7 +3074,7 @@ StorePartitionKey(Relation rel,
   AttrNumber *partattrs,
   List *partexprs,
   Oid *partopclass,
-  Oid *partcollation)
+  Oid *partcollation, int16 partnparts, Oid hashfunc)
 {
 	int			i;
 	int2vector *partattrs_vec;
@@ -3121,6 +3121,8 @@ StorePartitionKey(Relation rel,
 	values[Anum_pg_partitioned_table_partrelid - 1] = ObjectIdGetDatum(RelationGetRelid(rel));
 	values[Anum_pg_partitioned_table_partstrat - 1] = CharGetDatum(strategy);
 	values[Anum_pg_partitioned_table_partnatts - 1] = Int16GetDatum(partnatts);
+	values[Anum_pg_partitioned_table_partnparts - 1] = Int16GetDatum(partnparts);
+	values[Anum_pg_partitioned_table_parthashfunc - 1] = ObjectIdGetDatum(hashfunc);
 	values[Anum_pg_partitioned_table_partattrs - 1] = PointerGetDatum(partattrs_vec);
 	

[HACKERS] Declarative partitioning optimization for large amount of partitions

2017-02-28 Thread Aleksander Alekseev
Hello.

I decided to figure out whether the current implementation of declarative
partitioning has any bottlenecks when there are a lot of partitions. Here
is what I did [1].

```
-- init schema

\timing on

CREATE TABLE part_test (pk int not null, k int, v varchar(128)) PARTITION BY 
RANGE(pk);

do $$
declare
i integer;
begin
for i in 1 .. 1
loop
raise notice 'i = %', i;
execute ('CREATE TABLE part_test_' || i ||
 ' PARTITION OF part_test FOR VALUES FROM (' ||
 (1 + (i-1)*1000) || ') to (' || ( (i * 1000) + 1) || ');'
);
end loop;
end $$;

-- fill tables with some data

do $$
declare
i integer;
begin
for i in 1 .. 100*1000
loop
raise notice 'i = %', i;
execute ('insert into part_test values ( ceil(random()*(1-1)*1000), 
ceil(random()*1*1000),  || ceil(random()*1*1000) );');
end loop;
end $$;
```

Then:

```
# 2580 is some pk that exists
echo 'select * from part_test where pk = 2580;' > t.sql
pgbench -j 7 -c 7 -f t.sql -P 1 -T 300 eax
```

`perf top` showed two bottlenecks [2]. A stacktrace for the first one
looks like this [3]:

```
0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') at 
pgstat.c:1689
1689if (entry->t_id == rel_id)
#0  0x007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') 
at pgstat.c:1689
#1  0x007a4275 in pgstat_initstats (rel=0x7f4af3fd41f8) at pgstat.c:1666
#2  0x004c7090 in relation_open (relationId=25696, lockmode=0) at 
heapam.c:1137
#3  0x004c72c9 in heap_open (relationId=25696, lockmode=0) at 
heapam.c:1291
(skipped)
```

And here is a stacktrace for the second bottleneck [4]:

```
0x00584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, 
numparents=0x0) at pg_inherits.c:199
199 forboth(lo, rels_list, li, rel_numparents)
#0  0x00584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, 
numparents=0x0) at pg_inherits.c:199
#1  0x0077fc9f in expand_inherited_rtentry (root=0x1badcb8, 
rte=0x1b630b8, rti=1) at prepunion.c:1408
#2  0x0077fb67 in expand_inherited_tables (root=0x1badcb8) at 
prepunion.c:1335
#3  0x00767526 in subquery_planner (glob=0x1b63cc0, parse=0x1b62fa0, 
parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0) at planner.c:568
(skipped)
```

The first one could be easily fixed by introducing a hash table
(rel_id -> pgStatList entry). Perhaps the hash table should be used only
after some threshold of entries. Unless there are any objections I will send a
corresponding patch shortly.
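
The rough shape I have in mind is something like this (a sketch only; the
names are not final and the pgStatTabList bookkeeping is omitted):

```
/*
 * Auxiliary hash table keyed by relation OID, kept in sync with
 * pgStatTabList, so get_tabstat_entry() doesn't have to walk the whole
 * list for every relation_open() when there are thousands of partitions.
 */
typedef struct TabStatHashEntry
{
    Oid                 t_id;       /* hash key: relation OID */
    PgStat_TableStatus *tsa_entry;  /* pointer into pgStatTabList */
} TabStatHashEntry;

static HTAB *pgStatTabHash = NULL;

static PgStat_TableStatus *
find_tabstat_entry_hashed(Oid rel_id)
{
    TabStatHashEntry *hash_entry;

    if (pgStatTabHash == NULL)
        return NULL;            /* fall back to the linear search */

    hash_entry = (TabStatHashEntry *)
        hash_search(pgStatTabHash, &rel_id, HASH_FIND, NULL);

    return hash_entry != NULL ? hash_entry->tsa_entry : NULL;
}
```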

I haven't explored the second bottleneck closely yet, but at first glance
it doesn't look much more complicated.

Please don't hesitate to share your thoughts regarding this matter.

[1] http://afiskon.ru/s/e3/5f47af9102_benchmark.txt
[2] http://afiskon.ru/s/00/2008c4ae66_temp.png
[3] http://afiskon.ru/s/23/650f0afc89_stack.txt
[4] http://afiskon.ru/s/03/a7e685a4db_stack2.txt

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] rename pg_log directory?

2017-02-28 Thread Jorge Solórzano
On Tue, Feb 28, 2017 at 5:07 AM, Magnus Hagander 
wrote:

>
> server_log seems like a better choice then I think. So +1 for that.
>

server_log +1


>
> In theory cluster_log since it's a "cluster level log", but given how many
> people already get confused by the term cluster being used that way, I
> think that while maybe technically correct, that would be a very bad
> choice.
>


Re: [HACKERS] patch proposal

2017-02-28 Thread Venkata B Nagothi
Hi David,

On Tue, Jan 31, 2017 at 6:49 AM, David Steele  wrote:

> On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
>
> > I will be adding the tests in
> > src/test/recovery/t/003_recovery_targets.pl
> > . My tests would look more or less
> > similar to recovery_target_action. I do not see any of the tests added
> > for the parameter recovery_target_action ? Anyways, i will work on
> > adding tests for recovery_target_incomplete.
>
> Do you know when those will be ready?
>

Attached are both the patches with tests included.


>
> >
> >
> > 4) I'm not convinced that fatal errors by default are the way to go.
> > It's a pretty big departure from what we have now.
> >
> >
> > I have changed the code to generate ERRORs instead of FATALs. Which is
> > in the patch recoveryStartPoint.patch
>
> I think at this point in recovery any ERROR at all will probably be
> fatal.  Once you have some tests in place we'll know for sure.
>
> Either way, the goal would be to keep recovery from proceeding and let
> the user adjust their targets.  Since this is a big behavioral change
> which will need buy in from the community.
>

Hoping to get some comments / feedback from the community on the second
patch too.

> Also, I don't think it's very intuitive how recovery_target_incomplete
> > works.  For instance, if I set recovery_target_incomplete=promote
> and
> > set recovery_target_time, but pick a backup after the specified time,
> > then there will be an error "recovery_target_time %s is older..."
> rather
> > than a promotion.
> >
> >
> > Yes, that is correct and that is the expected behaviour. Let me explain -
>
> My point was that this combination of options could lead to confusion on
> the part of users.  However, I'd rather leave that alone for the time
> being and focus on the first patch.
>
> > I have split the patch into two different
> > pieces. One is to determine if the recovery can start at all and other
> > patch deals with the incomplete recovery situation.
>
> I think the first patch looks promising and I would rather work through
> that before considering the second patch.  Once there are tests for the
> first patch I will complete my review.
>

I have added all the tests for the second patch and all seem to be working
fine.

Regarding the first patch, I have included all the tests except for the
test case on recovery_target_time.
I did write the test, but it is behaving oddly, which I am
working through to resolve.

Regards,

Venkata B N
Database Consultant


recoveryStartPoint.patch-version3
Description: Binary data


recoveryTargetIncomplete.patch-version4
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-02-28 Thread Bruce Momjian
On Tue, Feb 28, 2017 at 09:04:29AM -0500, Tom Lane wrote:
> Surafel Temesgen  writes:
> > This assignment is on todo list and has a benefit of providing an
> > additional defense against SQL-injection attacks.
> 
> This is on the todo list?  Really?  It seems unlikely to be worth the
> backwards-compatibility breakage.  I certainly doubt that we could
> get away with unconditionally rejecting such cases with no "off" switch,
> as you have here.
> 
> > Previous mailing list discussion is here
> > 
> 
> That message points out specifically that we *didn't* plan to do this.
> Perhaps back then (ten years ago) we could have gotten away with the
> compatibility breakage, but now I doubt it.

I might have added that one; the text is:

Consider disallowing multiple queries in PQexec()
as an additional barrier to SQL injection attacks 

and it is a "consider" item.  Should it be moved to the Wire Protocol
Changes / v4 Protocol section or removed?
 
-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-02-28 Thread Tom Lane
Surafel Temesgen  writes:
> This assignment is on todo list and has a benefit of providing an
> additional defense against SQL-injection attacks.

This is on the todo list?  Really?  It seems unlikely to be worth the
backwards-compatibility breakage.  I certainly doubt that we could
get away with unconditionally rejecting such cases with no "off" switch,
as you have here.

> Previous mailing list discussion is here
> 

That message points out specifically that we *didn't* plan to do this.
Perhaps back then (ten years ago) we could have gotten away with the
compatibility breakage, but now I doubt it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] port of INSTALL file generation to XSLT

2017-02-28 Thread Magnus Hagander
On Mon, Feb 27, 2017 at 4:55 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 12/31/16 07:33, Magnus Hagander wrote:
> > Borka being a standard debian jessie install has Pandoc
> > 1.12.4.2~dfsg-1+b14. Should be no problem installing that.
> >
> > I ran a "make INSTALL" on a jessie box, and it comes out with some
> > formatting still in the file, see attachment. That seems incorrect.
>
> It appears we need pandoc 1.13 to get the good output.  This won't be
> available until Debian stretch.
>
> So for PG10, I propose the attached revised patch which keeps using lynx
> but uses xsltproc instead of jade.
>
>
It is available for people not using debian though? Might it be worthwhile
to make it dependent on the version of pandoc -- so use that method if
people have the newer pandoc and fall back to lynx if they don't?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] BRIN de-summarize ranges

2017-02-28 Thread Tom Lane
Alvaro Herrera  writes:
> Here's a small patch to make a BRIN page range unsummarized.  This is
> useful if data has been deleted, and the heap pages are now used for
> completely different data.

This seems remarkably, um, manual.  Why shouldn't users expect the system
to take care of this for them?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-02-28 Thread Pavan Deolasee
Hello All,

During the second heap scan of CREATE INDEX CONCURRENTLY, we're only
interested in the tuples which were inserted after the first scan was
started. All such tuples can only exist in pages which have their VM bit
unset. So I propose the attached patch, which consults the VM during the
second scan and skips all-visible pages. We apply the same trick of skipping
pages only if a certain threshold of pages can be skipped, to ensure the OS's
read-ahead is not disturbed.
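
The core of the idea looks roughly like the sketch below (the function name
and threshold value are illustrative, not taken from the attached patch; it
assumes the visibilitymap.h API):

#define CIC_SKIP_PAGES_THRESHOLD    ((BlockNumber) 32)  /* illustrative */

static BlockNumber
next_block_for_second_scan(Relation heapRel, BlockNumber blkno,
                           BlockNumber nblocks, Buffer *vmbuffer)
{
    BlockNumber next = blkno;

    /* count consecutive pages whose all-visible bit is set */
    while (next < nblocks &&
           (visibilitymap_get_status(heapRel, next, vmbuffer) &
            VISIBILITYMAP_ALL_VISIBLE) != 0)
        next++;

    /* only skip if the run is long enough to keep read-ahead effective */
    if (next - blkno >= CIC_SKIP_PAGES_THRESHOLD)
        return next;

    return blkno;
}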

The patch obviously shows a significant reduction in time for building an index
concurrently on very large tables which are not being updated frequently
and which were vacuumed recently (so that VM bits are set). I can post
performance numbers if there is interest. For tables that are being updated
heavily, the threshold-based skipping was indeed useful, and without it we saw
a slight regression.

Since VM bits are only set during VACUUM which conflicts with CIC on the
relation lock, I don't see any risk of incorrectly skipping pages that the
second scan should have scanned.

Comments?

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


cic_skip_all_visible_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] avoid bloat from CREATE INDEX CONCURRENTLY

2017-02-28 Thread Tom Lane
Simon Riggs  writes:
> On 28 February 2017 at 13:05, Tom Lane  wrote:
>> Um ... isn't there a transaction boundary there anyway?

> Yes, the patch releases the snapshot early, so it does not hold it
> once the build scan has completed. This allows the sort and build
> phases to occur without holding back the xmin.

Oh ... so Alvaro explained it badly.  The reason this is specific to
btree is that it's the only AM with any significant post-scan building
time.

However, now that I read the patch: this is a horribly ugly hack.
I really don't like the API (if it even deserves the dignity of that
name) that you've added to snapmgr.  I suppose the zero documentation
for it fits in nicely with the fact that it's a badly-thought-out kluge.

I think it would be better to just move the responsibility for snapshot
popping in this sequence to the index AMs, full stop.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-02-28 Thread Jan Michálek
The current state is something like this (diff is attached).
I currently don't have regression tests, tab completion, etc.; I will add these
things following the structure of the asciidoc commit.

Output is tested using ReText; rst is OK, md has a problem with cells containing
newlines (I must find out how it is possible to create a table with this in
markdown).

[jelen@laptak patch_postgre_rst]$
[jelen@laptak psql]$ ./psql
psql (9.6.2, server 9.6.1)
Type "help" for help.

jelen=# \pset linestyle markdown
Line style is markdown.
jelen=# values(E'nasral Franta\nna trabanta','Žluťoučký kobyly'),
(,E'a\tb') \g | xclip
jelen=# values(E'nasral Franta\nna trabanta','Žluťoučký kobyly'),
(,E'a\tb') \g

|column1| column2  |
|---|--|
| nasral Franta | Žluťoučký kobyly |
| na trabanta   |  |
| ' | a   b|


(2 rows)

jelen=# \pset linestyle rst
Line style is rst.
jelen=# values(E'nasral Franta\nna trabanta','Žluťoučký kobyly'),
(,E'a\tb') \g
+---+--+
|column1| column2  |
+===+==+
| nasral Franta+| Žluťoučký kobyly |
| na trabanta   |  |
+---+--+
| ' | a   b|
+---+--+

(2 rows)

jelen=#

2017-02-24 0:46 GMT+01:00 Michael Paquier :

> On Fri, Feb 24, 2017 at 3:09 AM, Jan Michálek 
> wrote:
> > I can try it, doesn`t look dificult, but I`m worry, that I`m not able to
> > write clean, pretty code.
>
> If you want to have something available in Postgres 10, you had better
> be quick. The last commit fest of the development cycle of Postgres 10
> begins on the 1st of March, you need to to register your patch here:
> https://commitfest.postgresql.org/13/
> Here are also some rough guidelines about submitting a patch:
> https://wiki.postgresql.org/wiki/Submitting_a_Patch
> --
> Michael
>



-- 
Jelen
Senior farmhand of the data barn
diff -ru a/src/bin/psql/command.c b/src/bin/psql/command.c
--- a/src/bin/psql/command.c2017-02-06 22:45:25.0 +0100
+++ b/src/bin/psql/command.c2017-02-28 01:07:26.0 +0100
@@ -2584,9 +2584,17 @@
popt->topt.line_style = _asciiformat_old;
else if (pg_strncasecmp("unicode", value, vallen) == 0)
popt->topt.line_style = _utf8format;
+   /* format markdown
+   */
+   else if (pg_strncasecmp("markdown", value, vallen) == 0)
+   popt->topt.line_style = _markdown;
+   /* format rst
+   */
+   else if (pg_strncasecmp("rst", value, vallen) == 0)
+   popt->topt.line_style = _rst;
else
{
-   psql_error("\\pset: allowed line styles are ascii, 
old-ascii, unicode\n");
+   psql_error("\\pset: allowed line styles are ascii, 
old-ascii, unicode, markdown, rst\n");
return false;
}
 
diff -ru a/src/fe_utils/print.c b/src/fe_utils/print.c
--- a/src/fe_utils/print.c  2017-02-06 22:45:25.0 +0100
+++ b/src/fe_utils/print.c  2017-02-28 13:34:11.0 +0100
@@ -57,6 +57,48 @@
 static printTableFooter default_footer_cell = {default_footer, NULL};
 
 /* Line style control structures */
+const printTextFormat pg_markdown =
+{
+   "markdown",
+   {
+   {"", "", "", ""},
+   {"-", "|", "|", "|"},
+   {"", "", "", ""},
+   {"", "|", "|", "|"}
+   },
+   "|",
+   "|",
+   "|",
+   " ",
+   "+",
+   " ",
+   " ",
+   ".",
+   ".",
+   true
+};
+
+const printTextFormat pg_rst =
+{
+   "rst",
+   {
+   {"-", "+", "+", "+"},
+   {"=", "+", "+", "+"},
+   {"-", "+", "+", "+"},
+   {"", "|", "|", "|"}
+   },
+   "|",
+   "|",
+   "|",
+   " ",
+   "+",
+   " ",
+   "+",
+   ".",
+   ".",
+   true
+};
+
 const printTextFormat pg_asciiformat =
 {
"ascii",
@@ -623,6 +665,12 @@
if (opt_border > 2)
opt_border = 2;
 
+   if (format == _markdown)
+  opt_border = 2;
+
+   if (format == _rst)
+  opt_border = 2;
+
if (cont->ncolumns > 0)
{
col_count = cont->ncolumns;
@@ -1124,13 +1172,20 @@
fputc('\n', fout);
 
} while (more_lines);
+
+   /* add line after every record 
*/
+   if (opt_border == 2 && format == _rst)
+  _print_horizontal_line(col_count, width_wrap, opt_border,
+PRINT_RULE_BOTTOM, format, fout);
}
 
if (cont->opt->stop_table)
{
printTableFooter *footers = 

Re: [HACKERS] avoid bloat from CREATE INDEX CONCURRENTLY

2017-02-28 Thread Simon Riggs
On 28 February 2017 at 13:05, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> This patch reduces the amount of bloat you get from running CREATE INDEX
>> CONCURRENTLY by destroying the snapshot taken in the first phase, before
>> entering the second phase.  This allows the global xmin to advance,
>
> Um ... isn't there a transaction boundary there anyway?

Yes, the patch releases the snapshot early, so it does not hold it
once the build scan has completed. This allows the sort and build
phases to occur without holding back the xmin.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Avoiding bloat in CIC

2017-02-28 Thread Simon Riggs
In CREATE INDEX CONCURRENTLY it seems possible to release the index
build snapshot early, so we can reset our xmin before we complete the
sort and write the main part of the index.

Patch to implement this attached.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


avoid_bloat_from_cic.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-02-28 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
> As I stated here and at the PGConf.ASIA developer meeting last year, I'd
> like to propose statement-level rollback feature.

I do not really see how this would ever get past the compatibility
problems that forced us to give up on server-side autocommit years ago.

If you want to provide a client-side facility for this, perhaps that could
fly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] avoid bloat from CREATE INDEX CONCURRENTLY

2017-02-28 Thread Tom Lane
Alvaro Herrera  writes:
> This patch reduces the amount of bloat you get from running CREATE INDEX
> CONCURRENTLY by destroying the snapshot taken in the first phase, before
> entering the second phase.  This allows the global xmin to advance,

Um ... isn't there a transaction boundary there anyway?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] postgres_fdw: evaluate placeholdervars on remote server

2017-02-28 Thread Etsuro Fujita

Hi,

Here is a patch for $subject.  This is the same as what I proposed in  
combination with a feature for full joins [1]; this would allow us to  
push down left/right/full joins with PHVs to the remote server and improve  
how whole-row references are deparsed.  Since this is implemented on top of  
the feature for full joins (ie, the deparser logic for subqueries), I  
proposed this on that thread, but this is slightly independent from that  
feature (and we haven't discussed this in detail on that thread), so I  
think it's better to start a new thread.  Attached is a new version, which  
is created on top of [2].  I'll add this to the upcoming CF.


Best regards,
Etsuro Fujita

[1]  
https://www.postgresql.org/message-id/c449261a-b033-dc02-9254-2fe5b7044795%40lab.ntt.co.jp
[2]  
https://www.postgresql.org/message-id/920e660b-6fec-6022-759d-e96e37dd5984%40lab.ntt.co.jp
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 49,54 
--- 49,55 
  #include "nodes/nodeFuncs.h"
  #include "nodes/plannodes.h"
  #include "optimizer/clauses.h"
+ #include "optimizer/placeholder.h"
  #include "optimizer/prep.h"
  #include "optimizer/tlist.h"
  #include "optimizer/var.h"
***
*** 158,163  static void deparseRelabelType(RelabelType *node, deparse_expr_cxt *context);
--- 159,165 
  static void deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context);
  static void deparseNullTest(NullTest *node, deparse_expr_cxt *context);
  static void deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context);
+ static void deparseExprInPlaceHolderVar(PlaceHolderVar *node, deparse_expr_cxt *context);
  static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
   deparse_expr_cxt *context);
  static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
***
*** 184,192  static Node *deparseSortGroupClause(Index ref, List *tlist,
  /*
   * Helper functions
   */
! static bool is_subquery_var(Var *node, RelOptInfo *foreignrel,
! 			int *relno, int *colno);
! static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
  		  int *relno, int *colno);
  
  
--- 186,195 
  /*
   * Helper functions
   */
! static bool is_subquery_expr(Expr *node, PlannerInfo *root,
! 			 RelOptInfo *foreignrel,
! 			 int *relno, int *colno);
! static void get_relation_column_alias_ids(Expr *node, RelOptInfo *foreignrel,
  		  int *relno, int *colno);
  
  
***
*** 771,776  foreign_expr_walker(Node *node,
--- 774,798 
  	state = FDW_COLLATE_UNSAFE;
  			}
  			break;
+ 		case T_PlaceHolderVar:
+ 			{
+ PlaceHolderVar *phv = (PlaceHolderVar *) node;
+ PlaceHolderInfo *phinfo = find_placeholder_info(glob_cxt->root,
+ phv, false);
+ 
+ /*
+  * If the PHV's contained expression is computable on the
+  * remote server, we consider the PHV safe to send.
+  */
+ if (phv->phlevelsup == 0 &&
+ 	bms_is_subset(phinfo->ph_eval_at,
+   glob_cxt->foreignrel->relids))
+ 	return foreign_expr_walker((Node *) phv->phexpr,
+ 			   glob_cxt, outer_cxt);
+ else
+ 	return false;
+ 			}
+ 			break;
  		default:
  
  			/*
***
*** 865,871  deparse_type_name(Oid type_oid, int32 typemod)
   * foreign server.
   */
  List *
! build_tlist_to_deparse(RelOptInfo *foreignrel)
  {
  	List	   *tlist = NIL;
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
--- 887,893 
   * foreign server.
   */
  List *
! build_tlist_to_deparse(PlannerInfo *root, RelOptInfo *foreignrel)
  {
  	List	   *tlist = NIL;
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
***
*** 878,892  build_tlist_to_deparse(RelOptInfo *foreignrel)
  		return fpinfo->grouped_tlist;
  
  	/*
! 	 * We require columns specified in foreignrel->reltarget->exprs and those
! 	 * required for evaluating the local conditions.
  	 */
! 	tlist = add_to_flat_tlist(tlist,
! 	   pull_var_clause((Node *) foreignrel->reltarget->exprs,
! 	   PVC_RECURSE_PLACEHOLDERS));
! 	tlist = add_to_flat_tlist(tlist,
! 			  pull_var_clause((Node *) fpinfo->local_conds,
! 			  PVC_RECURSE_PLACEHOLDERS));
  
  	return tlist;
  }
--- 900,972 
  		return fpinfo->grouped_tlist;
  
  	/*
! 	 * Fetch all expressions in the given relation's reltarget if the
! 	 * reltarget_is_shippable flag is set TRUE.  Otherwise, fetch shippable
! 	 * expressions in the reltarget plus expressions required for evaluating
! 	 * non-shippable expressions in the reltarget.
  	 */
! 	if (fpinfo->reltarget_is_shippable)
! 		tlist = add_to_flat_tlist(tlist, foreignrel->reltarget->exprs);
! 	else
! 	{
! 		List	   *exprs = NIL;
! 		ListCell   *lc;
! 
! 		/* Note: we have at least one non-shippable PHV in the reltarget. */
! 		foreach(lc, foreignrel->reltarget->exprs)
! 		{
! 			Node	   *node = (Node *) lfirst(lc);
! 

Re: [HACKERS] timeouts in PostgresNode::psql

2017-02-28 Thread Alvaro Herrera
Michael Paquier wrote:
> On Mon, Feb 27, 2017 at 11:28 AM, Craig Ringer  wrote:

> > Instead of
> >
> > $exc_save !~ /^$timeout_exception.*/
> >
> > I've updated to:
> >
> > $exc_save !~ /^\Q$timeout_exception\E/
> >
> > i.e. don't do an unnecessary wildcard match at the end, and disable
> > metachar interpretation in the substituted range.
> >
> > Still needs applying to pg9.6 and pg10.
> 
> I did not understand at first what you meant, but after looking at the
> commit message of the patch things are clear:
> Newer Perl or IPC::Run versions default to appending the filename to string
> exceptions, e.g. the exception
> psql timed out
>  is thrown as
> psql timed out at /usr/share/perl5/vendor_perl/IPC/Run.pm line 2961.

Hmm, I think this is really a bugfix that we should backpatch all the
way back to where we introduced PostgresNode.

Lately I've been wondering about backpatching the whole TAP test
infrastructure, all the way back.  As we notice bugs, it's really useful
to use newly added tests in all branches; but currently PostgresNode
doesn't work with old branches, particularly since the '-w' switch was
removed from pg_ctl invocations in the PostgresNode->start and ->restart
methods (the tests just fail without any indication of what is going
on).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-02-28 Thread Oleg Bartunov
The proposed patch may look minor, but I consider it an important
feature, one that Oracle and Microsoft already have; that's why I asked
Dmitry to work on it and get it done before feature freeze.  My
comments follow below the post.

On Tue, Feb 28, 2017 at 1:59 PM, Dmitry Dolgov <9erthali...@gmail.com>
wrote:

> Hi all
>
> I would like to propose patch with a set of new small functions for fts in
> case of
> jsonb data type:
>
> * to_tsvector(config, jsonb) - make a tsvector from all string values and
>   elements of a jsonb object.  To prevent tsquery from finding a
>   phrase consisting of lexemes from two different values/elements, this
>   function adds an increment to the position of each lexeme from every
>   new value/element.
>
> * ts_headline(config, jsonb, tsquery, options) - generate a headline
> directly
>   from jsonb object
>
> Here are examples of how they work:
>
> ```
> =# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee
> fff"}}'::jsonb);
>to_tsvector
> -------------------------------------------------
>  'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8
> (1 row)
>
>
> =# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc
> ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');
>  ts_headline
> --------------------------------
>  aaa <bbb>  ccc <ddd>
> (1 row)
> ```
>

> Any comments or suggestions?
>

1. add json support
2. ts_headline should return the original json with highlighting.  As a
first try the proposed ts_headline could be OK, but it probably needs a
special option.
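
To illustrate the position increment described in the proposal, here is
a hedged sketch, assuming the proposed to_tsvector(config, jsonb)
signature: a phrase that spans two different jsonb values should not
match, while a phrase within a single value should.

```
-- The gap inserted between values means a phrase query crossing the
-- value boundary should not match ...
SELECT to_tsvector('english', '{"a": "aaa bbb", "b": "ccc ddd"}'::jsonb)
       @@ phraseto_tsquery('english', 'bbb ccc');   -- expected: f

-- ... while a phrase inside a single value still matches.
SELECT to_tsvector('english', '{"a": "aaa bbb", "b": "ccc ddd"}'::jsonb)
       @@ phraseto_tsquery('english', 'aaa bbb');   -- expected: t
```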



> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions

2017-02-28 Thread Amit Kapila
On Mon, Feb 27, 2017 at 12:21 PM, Robert Haas  wrote:
> On Mon, Feb 27, 2017 at 8:33 AM, Amit Kapila  wrote:
>> Is there any easy way to find out which way is less expensive?
>
> No.  But that's a separate problem.  I'm just saying we shouldn't
> arbitrarily prohibit parallelism for parallel-safe functions.
>
>> Even if we find some way, or just make a rule that when an outer query
>> uses parallelism we force function calls to run serially, how do we
>> achieve that?  I mean, in each worker we can ensure that individual
>> statements from a function run serially (by having a check of
>> isparallelworker() in the gather node), but having a similar check in
>> the master backend is tricky, or maybe we don't want to care about
>> that in the master backend.  Do you have any suggestions on how to
>> make it work?
>
> I don't understand what's wrong with the existing logic in standard_planner().
>

When such a function (one containing statements that have parallel
plans) is executed as part of another parallel plan, it can end up
spawning workers without bound.  Consider a query like  select *
from t1 where c1 < func1();  this can use a parallel scan for t1, and
then the master backend, during its partial scan of t1, can spawn a
new set of workers for the queries inside func1().  This can happen
multiple times if a parallel query inside func1() in turn calls some
other function func2() that also runs a parallel query.  Now, this
might be okay, but today there is no situation in which Gather
execution can invoke another Gather node, so it is worth considering
whether we want to allow it.
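
A hedged sketch of that scenario (table and function names are made up,
and the planner behaviour is described only as a possibility):

```
-- func1() runs a query that can itself get a parallel plan.
CREATE FUNCTION func1() RETURNS int
    LANGUAGE sql PARALLEL SAFE
    AS $$ SELECT count(*)::int FROM t2 $$;

-- If the scan of t1 is parallelised, the leader evaluates func1() while
-- executing its own partial scan; each such evaluation could in
-- principle start another Gather with its own set of workers.
SELECT * FROM t1 WHERE c1 < func1();
```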


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we cacheline align PGXACT?

2017-02-28 Thread Ashutosh Sharma
Hi,

On Fri, Feb 24, 2017 at 12:22 PM, Ashutosh Sharma 
wrote:

On Fri, Feb 24, 2017 at 10:41 AM, Simon Riggs  wrote:
> > On 24 February 2017 at 04:41, Ashutosh Sharma 
> wrote:
> >>
> >> Okay. As suggested by Alexander, I have changed the order of reading and
> >> doing initdb for each pgbench run. With these changes, I got following
> >> results at 300 scale factor with 8GB of shared buffer.
> >
> >
> > Would you be able to test my patch also please?
> >
>
> Sure, I can do that and share the results by early next week. Thanks.
>

So, here are the pgbench results I got with
'reduce_pgxact_access_AtEOXact.v2.patch' on a read-write workload.

pgbench settings:
pgbench -i -s 300 postgres
pgbench -M prepared -c $thread -j $thread -T $time_for_reading postgres

where time_for_reading = 30 mins

non-default GUC params:
shared_buffers = 8GB
max_connections = 300

pg_wal is located on an SSD.

CLIENT COUNT   TPS (HEAD)   TPS (PATCH)   % IMPROVEMENT
           4         2588          2601    0.5023183926
           8         5094          5098    0.0785237534
          16        10294         10307    0.1262871576
          32        19779         19815    0.182011224
          64        27908         28346    1.569442454
          72        27823         28416    2.131330194
         128        28455         28618    0.5728342998
         180        26739         26879    0.5235797898
         196        27820         27963    0.5140186916
         256        28763         28969    0.7161978931

Also, an Excel sheet (results-readwrite-300-SF.xlsx) containing the
results for all 3 runs is attached.


results-readwrite-300-SF_EOXACT.xlsx
Description: MS-Excel 2007 spreadsheet

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wrong variable type in KeepLogSeg

2017-02-28 Thread Magnus Hagander
On Tue, Feb 28, 2017 at 3:17 AM, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> Hello, I found a variable definition with wrong type
> specification in KeepLogSeg, which doesn't harm anything.
>
> > static void
> > KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
> > {
> > ...
> > /* then check whether slots limit removal further */
> > if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
> > {
> > XLogRecPtr    slotSegNo;
> >
> > XLByteToSeg(keep, slotSegNo);
>
>
> slotSegNo should be an XLogSegNo.  Both types share the same
> underlying type, so it doesn't harm anything.
>
> This is back-patchable up to 9.4.
>

Nice catch. Applied and backpatched.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] rename pg_log directory?

2017-02-28 Thread Magnus Hagander
On Tue, Feb 28, 2017 at 4:01 AM, Bruce Momjian  wrote:

> On Mon, Feb 27, 2017 at 09:51:26AM -0500, Tom Lane wrote:
> > Peter Eisentraut  writes:
> > > How about changing the default for log_directory from 'pg_log' to, say,
> > > 'log'?
> >
> > > We have been emphasizing that the prefix "pg_" is for things reserved
> to
> > > PostgreSQL, whereas the pg_log directory is entirely an arbitrary
> > > user-space name.  Also, with a different name, the directory would
> stand
> > > out more between all the other pg_* directories in the data directory.
> >
> > No objection to the basic point, but "log" seems perhaps a little too
> > generic to me.  Would something like "server_log" be better?
>
> "activity_log"?  I like the idea of a rename.


server_log seems like the better choice then, I think.  So +1 for that.

In theory it would be cluster_log, since it's a "cluster-level log", but
given how many people already get confused by the term cluster being
used that way, I think that, while perhaps technically correct, it
would be a very bad choice.
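
Whichever default wins, existing setups that want to keep the old name
would only need a one-line override; a minimal sketch (only relevant
when logging_collector is on):

```
ALTER SYSTEM SET log_directory = 'pg_log';
SELECT pg_reload_conf();   -- log_directory can be changed with a reload
```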


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

