Re: [HACKERS] Setting pd_lower in GIN metapage
On Fri, Nov 3, 2017 at 1:10 AM, Amit Kapila wrote: > On Fri, Nov 3, 2017 at 2:54 AM, Tom Lane wrote: >> I've marked the CF entry closed. However, I'm not sure if we're quite >> done with this thread. Weren't we going to adjust nbtree and hash to >> be more aggressive about labeling their metapages as REGBUF_STANDARD? > > I have already posted the patches [1] for the same in this thread and > those are reviewed [2][3] as well. I have adjusted the comments as per > latest commit. Please find updated patches attached. Confirmed. Setting those makes sense even if REGBUF_WILL_INIT is set, at least for page masking. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - use enum for meta commands
[ pgbench-enum-meta-2.patch ] Pushed with some trivial cosmetic adjustments (pgindent changed it more than I did). Ok. Thanks. -- Fabien.
Re: [HACKERS] PATCH: pgbench - option to build using ppoll() for larger connection counts
Hello, Could you rebase the v11 patch? -- Fabien.
Re: [HACKERS] ArrayLists instead of List (for some things)
On 3 November 2017 at 03:26, Craig Ringer wrote: > On 2 November 2017 at 22:22, David Rowley > wrote: >> Maybe, but the new implementation is not going to do well with places >> where we perform lcons(). Probably many of those places could be >> changed to lappend(), but I bet there's plenty that need prepend. > > Yeah, and it's IMO significant that pretty much every nontrivial > system with ADTs offers multiple implementations of list data > structures, often wrapped with a common API. > > Java's Collections, the STL, you name it. I've never really looked at much Java, but I've done quite a bit of dotnet stuff in my time, and they have a common interface, ICollection, which is implemented by a bunch of classes, something like: ConcurrentDictionary (hash table) Dictionary (hash table) HashSet (hash table) LinkedList (some kinda linked list) List (arrays, probably) SortedDictionary (BST, I think) SortedList (no idea, perhaps a btree) SortedSet Bag (no idea, but does not care about any order) Probably there are more, but the point is that they obviously don't believe there's a one-size-fits-all type. I don't think we should do that either. However, I've proved nothing on the performance front yet, so that should probably be my next step. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
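The lcons()-vs-lappend() concern above can be sketched in plain C. This is a minimal, hypothetical dynamic array (not PostgreSQL's actual List implementation): appending just writes to the end in O(1) amortized time, while prepending has to shift every existing element first, which is the cost an array-backed List would pay at each lcons() call site.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct ArrayList
{
    int    *items;
    int     len;
    int     cap;
} ArrayList;

/* Double the capacity when full (error handling omitted for brevity). */
static void
al_grow(ArrayList *al)
{
    if (al->len == al->cap)
    {
        al->cap = al->cap ? al->cap * 2 : 8;
        al->items = realloc(al->items, al->cap * sizeof(int));
    }
}

/* lappend() analogue: O(1) amortized, writes past the last element. */
static void
al_append(ArrayList *al, int v)
{
    al_grow(al);
    al->items[al->len++] = v;
}

/* lcons() analogue: O(n), must shift the whole array right by one slot. */
static void
al_prepend(ArrayList *al, int v)
{
    al_grow(al);
    memmove(al->items + 1, al->items, al->len * sizeof(int));
    al->items[0] = v;
    al->len++;
}
```

This is why converting hot lcons() call sites to lappend() (or keeping a separate structure for prepend-heavy lists) matters if List moves to an array representation.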
Re: [HACKERS] SSL and Encryption
On Fri, Nov 3, 2017 at 3:19 AM, Craig Ringer wrote: > This is probably off topic for pgsql-hackers. > > For password crypto please go read the SCRAM thread and the PostgreSQL > 10 release notes. The SCRAM discussion is spread mainly across two threads with hundreds of emails, which may discourage even the bravest. Here are links to the important documentation: https://www.postgresql.org/docs/current/static/auth-methods.html#auth-password https://www.postgresql.org/docs/10/static/sasl-authentication.html And PostgreSQL implements SCRAM-SHA-256 following RFCs 7677 and 5802: https://tools.ietf.org/html/rfc5802 https://tools.ietf.org/html/rfc7677 -- Michael
Re: [HACKERS] path toward faster partition pruning
On 31 October 2017 at 21:43, Amit Langote wrote: > Attached updated version of the patches addressing some of your comments I've spent a bit of time looking at these but I'm out of time for now. So far I have noted down the following: 1. This comment seems wrong. /* * Since the clauses in rel->baserestrictinfo should all contain Const * operands, it should be possible to prune partitions right away. */ How about PARTITION BY RANGE (a) and SELECT * FROM parttable WHERE a > b; ? baserestrictinfo in this case will contain a single RestrictInfo with an OpExpr containing two Var args and it'll come right through that function too. 2. This code is way more complex than it needs to be. if (num_parts > 0) { int j; all_indexes = (int *) palloc(num_parts * sizeof(int)); j = 0; if (min_part_idx >= 0 && max_part_idx >= 0) { for (i = min_part_idx; i <= max_part_idx; i++) all_indexes[j++] = i; } if (!bms_is_empty(other_parts)) while ((i = bms_first_member(other_parts)) >= 0) all_indexes[j++] = i; if (j > 1) qsort((void *) all_indexes, j, sizeof(int), intcmp); } It looks like the min/max partition stuff is just complicating things here. If you need to build this array of all_indexes[] anyway, I don't quite understand the point of the min/max. It seems like min/max would probably work a bit nicer if you didn't need the other_parts BitmapSet, so I recommend just getting rid of min/max completely and just have a BitmapSet with bit set for each partition's index you need, you'd not need to go to the trouble of performing a qsort on an array and you could get rid of quite a chunk of code too. The entire function would then not be much more complex than: partindexes = get_partitions_from_clauses(parent, partclauses); while ((i = bms_first_member(partindexes)) >= 0) { AppendRelInfo *appinfo = rel->part_appinfos[i]; result = lappend(result, appinfo); } Then you can also get rid of your intcmp() function too. 3. 
Following code has the wrong size calculation: memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType *)); should be PARTITION_MAX_KEYS * sizeof(NullTestType). It might have worked on your machine if you're compiling as 32 bit. I'll continue on with the review in the next few days. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
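The sizeof mix-up above is a classic C pitfall worth spelling out: on a typical LP64 build, sizeof(NullTestType *) is 8 while the enum itself is usually 4 bytes, so the buggy memset writes twice the array's size and runs past the end of keynullness. On a 32-bit build pointer and enum happen to be the same size, which is why it "worked". A minimal, self-contained illustration (the enum and the constant are stand-ins, not the actual patch code):

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the real enum; the values themselves don't matter here. */
typedef enum NullTestType
{
    IS_NULL,
    IS_NOT_NULL
} NullTestType;

#define PARTITION_MAX_KEYS 32

/*
 * Clears the array with the CORRECT element size and returns how many
 * bytes the array actually occupies.  Using sizeof(NullTestType *) here
 * instead would ask memset to write PARTITION_MAX_KEYS * sizeof(void *)
 * bytes -- a buffer overrun whenever pointers are wider than the enum.
 */
static size_t
keynullness_array_size(void)
{
    NullTestType keynullness[PARTITION_MAX_KEYS];

    memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType));
    return sizeof(keynullness);
}
```

The compiler accepts both spellings silently, which is what makes this bug easy to write and hard to spot in review.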
Re: [HACKERS] SSL and Encryption
On 3 November 2017 at 11:16, chiru r wrote: > Hi , > > Please suggest the best cipher suite to configure OpenSSL for PostgreSQL > Server and client?. > > How to use other than md5 encryption algorithm to encrypt the passwords in > PostgreSQL? This is probably off topic for pgsql-hackers. For password crypto please go read the SCRAM thread and the PostgreSQL 10 release notes. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] SSL and Encryption
Hi, Please suggest the best cipher suite to configure OpenSSL for the PostgreSQL server and client. Also, how can we use an encryption algorithm other than md5 to encrypt the passwords in PostgreSQL? Thanks, Chiru
Re: [HACKERS] path toward faster partition pruning
On 31 October 2017 at 21:43, Amit Langote wrote: > Attached updated version of the patches addressing some of your comments > above and fixing a bug that Rajkumar reported [1]. As mentioned there, > I'm including here a patch (the 0005 of the attached) to tweak the default > range partition constraint to be explicit about null values that it might > contain. So, there are 6 patches now and what used to be patch 0005 in > the previous set is patch 0006 in this version of the set. Hi Amit, I've been looking over this. I see the latest patches conflict with cf7ab13bf. Can you send patches rebased on current master? Thanks -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Small improvement to compactify_tuples
Sokolov Yura writes: > [ 0001-Improve-compactify_tuples.patch, v5 or thereabouts ] I started to review this patch. I spent a fair amount of time on beautifying the code, because I found it rather ugly and drastically undercommented. Once I had it to the point where it seemed readable, I went to check the shellsort algorithm against Wikipedia's entry, and found that this appears to be an incorrect implementation of shellsort: where pg_shell_sort_pass has for (_i = off; _i < _n; _i += off) \ it seems to me that we need to have for (_i = off; _i < _n; _i += 1) \ or maybe just _i++. As-is, this isn't h-sorting the whole file, but just the subset of entries that have multiple-of-h indexes (ie, the first of the h distinct subfiles that should get sorted). The bug is masked by the final pass of plain insertion sort, but we are not getting the benefit we should get from the earlier passes. However, I'm a bit dubious that it's worth fixing that; instead my inclination would be to rip out the shellsort implementation entirely. The code is only using it for the nitems <= 48 case (which makes the first three offset steps certainly no-ops) and I am really unconvinced that it's worth expending the code space for a shellsort rather than plain insertion sort in that case, especially when we have good reason to think that the input data is nearly sorted. BTW, the originally given test case shows no measurable improvement on my box. I was eventually able to convince myself by profiling that the patch makes us spend less time in compactify_tuples, but this test case isn't a very convincing one. So, quite aside from the bug, I'm not excited about committing the attached as-is. I think we should remove pg_shell_sort and just use pg_insertion_sort. If somebody can show a test case that provides a measurable speed improvement from the extra code, I could be persuaded to reconsider. I also wonder if the nitems <= 48 cutoff needs to be reconsidered in light of this. 
But since I can hardly measure any benefit from the patch at all, I'm not in the best position to test different values for that cutoff. Have not looked at the 0002 patch yet. regards, tom lane diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c index 41642eb..1af1b85 100644 *** a/src/backend/storage/page/bufpage.c --- b/src/backend/storage/page/bufpage.c *** *** 18,23 --- 18,24 #include "access/itup.h" #include "access/xlog.h" #include "storage/checksum.h" + #include "utils/inline_sort.h" #include "utils/memdebug.h" #include "utils/memutils.h" *** typedef struct itemIdSortData *** 425,439 } itemIdSortData; typedef itemIdSortData *itemIdSort; ! static int itemoffcompare(const void *itemidp1, const void *itemidp2) { - /* Sort in decreasing itemoff order */ return ((itemIdSort) itemidp2)->itemoff - ((itemIdSort) itemidp1)->itemoff; } /* * After removing or marking some line pointers unused, move the tuples to * remove the gaps caused by the removed items. */ --- 426,542 } itemIdSortData; typedef itemIdSortData *itemIdSort; ! /* Comparator for sorting in decreasing itemoff order */ ! static inline int itemoffcompare(const void *itemidp1, const void *itemidp2) { return ((itemIdSort) itemidp2)->itemoff - ((itemIdSort) itemidp1)->itemoff; } /* + * Sort an array of itemIdSort's on itemoff, descending. + * + * This uses Shell sort. Given that array is small and itemoffcompare + * can be inlined, it is much faster than general-purpose qsort. + */ + static void + sort_itemIds_small(itemIdSort itemidbase, int nitems) + { + pg_shell_sort(itemIdSortData, itemidbase, nitems, itemoffcompare); + } + + /* + * Sort an array of itemIdSort's on itemoff, descending. 
+ * + * This uses bucket sort: + * - single pass of stable prefix sort on high 8 bits of itemoffs + * - then insertion sort on buckets larger than 1 element + */ + static void + sort_itemIds(itemIdSort itemidbase, int nitems) + { + /* number of buckets to use: */ + #define NSPLIT 256 + /* divisor to scale input values into 0..NSPLIT-1: */ + #define PREFDIV (BLCKSZ / NSPLIT) + /* per-bucket counts; we need two extra elements, see below */ + uint16 count[NSPLIT + 2]; + itemIdSortData copy[Max(MaxIndexTuplesPerPage, MaxHeapTuplesPerPage)]; + int i, + max, + total, + pos, + highbits; + + Assert(nitems <= lengthof(copy)); + + /* + * Count how many items in each bucket. We assume all itemoff values are + * less than BLCKSZ, therefore dividing by PREFDIV gives a value less than + * NSPLIT. + */ + memset(count, 0, sizeof(count)); + for (i = 0; i < nitems; i++) + { + highbits = itemidbase[i].itemoff / PREFDIV; + count[highbits]++; + } + + /* + * Now convert counts to bucket position info, placing the buckets in + *
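For reference, the textbook shellsort loop structure Tom describes — each h-pass must start an insertion from every index, advancing the outer loop by 1 rather than by h — looks like this in plain C. This is a generic sketch with an arbitrary gap sequence, not the patch's macro-based pg_shell_sort:

```c
#include <stddef.h>

/*
 * Shell sort: each pass h-sorts the WHOLE array.  The crucial detail is
 * that the outer loop advances i by 1 (visiting every element), not by h;
 * advancing by h would only sort the single subfile at indexes 0, h, 2h...
 */
static void
shell_sort(int *a, size_t n)
{
    static const size_t gaps[] = {57, 23, 9, 4, 1};

    for (size_t g = 0; g < sizeof(gaps) / sizeof(gaps[0]); g++)
    {
        size_t h = gaps[g];

        for (size_t i = h; i < n; i++)  /* i += 1, NOT i += h */
        {
            int     tmp = a[i];
            size_t  j = i;

            /* Insertion within the h-strided subfile containing index i. */
            while (j >= h && a[j - h] > tmp)
            {
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = tmp;
        }
    }
}
```

Because the final gap is 1, the last pass is a plain insertion sort and the result is always correct even with the buggy stride — which is exactly why the bug Tom found was masked, costing only the speed benefit of the earlier passes.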
[HACKERS] Skip unneeded temp file in 'make html'
Folks, Please find attached a patch for $Subject. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate Don't make an unneeded temp file In passing, make a slight correction to the regex. --- doc/src/sgml/Makefile | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile index 428eb569fc..f4ded1b3eb 100644 --- a/doc/src/sgml/Makefile +++ b/doc/src/sgml/Makefile @@ -147,14 +147,12 @@ INSTALL.xml: standalone-profile.xsl standalone-install.xml postgres.xml # if we try to do "make all" in a VPATH build without the explicit # $(srcdir) on the postgres.sgml dependency in this rule. GNU make bug? postgres.xml: $(srcdir)/postgres.sgml $(ALLSGML) - $(OSX) $(SPFLAGS) $(SGMLINCLUDE) -x lower $< >$@.tmp - $(call mangle-xml,book) + $(OSX) $(SPFLAGS) $(SGMLINCLUDE) -x lower $< | $(call mangle-xml,book) define mangle-xml -$(PERL) -p -e 's/\[(aacute|acirc|aelig|agrave|amp|aring|atilde|auml|bull|copy|eacute|egrave|gt|iacute|lt|mdash|nbsp|ntilde|oacute|ocirc|oslash|ouml|pi|quot|scaron|uuml) *\]/\&\1;/gi;' \ +$(PERL) -p -e 's/\[(aacute|acirc|aelig|agrave|amp|aring|atilde|auml|bull|copy|eacute|egrave|gt|iacute|lt|mdash|nbsp|ntilde|oacute|ocirc|oslash|ouml|pi|quot|scaron|uuml) *\]/\&$$1;/gio;' \ -e '$$_ .= qq{http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd;>\n} if $$. == 1;' \ - <$@.tmp > $@ -rm $@.tmp + > $@ endef -- 2.13.6
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On Fri, Nov 3, 2017 at 2:24 PM, Peter Geoghegan wrote: > Thomas Munro wrote: >> That way you don't have to opt in to BufFile's >> double buffering and segmentation schemes just to get shared file >> clean-up, if for some reason you want direct file handles. > > Is that something that you really think is possible? It's pretty far fetched, but maybe shared temporary relation files accessed via smgr.c/md.c? Or maybe future things that don't want to read/write through a buffer but instead want to mmap it. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Thomas Munro wrote: >> I'm going to make an item on my personal TODO list for that. No useful insights on that right now, though. > I decided to try that, but it didn't really work: fd.h gets included by front-end code, so I can't very well define a struct and declare functions that deal in dsm_segment and slock_t. On the other hand it does seem a bit better for these shared file sets to work in terms of File, not BufFile. Realistically, fd.h has a number of functions that are really owned by buffile.c already. This sounds fine. > That way you don't have to opt in to BufFile's double buffering and segmentation schemes just to get shared file clean-up, if for some reason you want direct file handles. Is that something that you really think is possible? -- Peter Geoghegan
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On Wed, Nov 1, 2017 at 2:11 PM, Peter Geoghegan wrote: > On Tue, Oct 31, 2017 at 5:07 PM, Thomas Munro > wrote: >> Another complaint is that perhaps fd.c >> knows too much about buffile.c's business. For example, >> RemovePgTempFilesInDir() knows about the ".set" directories created by >> buffile.c, which might be called a layering violation. Perhaps the >> set/directory logic should move entirely into fd.c, so you'd call >> FileSetInit(FileSet *), not BufFileSetInit(BufFileSet *), and then >> BufFileOpenShared() would take a FileSet *, not a BufFileSet *. >> Thoughts? > > I'm going to make an item on my personal TODO list for that. No useful > insights on that right now, though. I decided to try that, but it didn't really work: fd.h gets included by front-end code, so I can't very well define a struct and declare functions that deal in dsm_segment and slock_t. On the other hand it does seem a bit better for these shared file sets to work in terms of File, not BufFile. That way you don't have to opt in to BufFile's double buffering and segmentation schemes just to get shared file clean-up, if for some reason you want direct file handles. So in the v24 parallel hash patch set I just posted over in the other thread I have moved it into its own translation unit sharedfileset.c and made it work with File objects. buffile.c knows how to use it as a source of segment files. I think that's better. > If the new standard is that you have temp file names that suggest the > purpose of each temp file, then that may be something that parallel > CREATE INDEX should buy into. Yeah, I guess that could be useful. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Setting pd_lower in GIN metapage
On Fri, Nov 3, 2017 at 2:54 AM, Tom Lane wrote: > Amit Langote writes: >> On 2017/09/26 16:30, Michael Paquier wrote: >>> Cool, let's switch it back to a ready for committer status then. > >> Sure, thanks. > > Pushed with some cosmetic adjustments --- mostly, making the comments more > explicit about why we need the apparently-redundant assignments to > pd_lower. > > I've marked the CF entry closed. However, I'm not sure if we're quite > done with this thread. Weren't we going to adjust nbtree and hash to > be more aggressive about labeling their metapages as REGBUF_STANDARD? > I have already posted the patches [1] for the same in this thread and those are reviewed [2][3] as well. I have adjusted the comments as per latest commit. Please find updated patches attached. [1] - https://www.postgresql.org/message-id/CAA4eK1%2BTKg6ZK%3DmF14x_wf2KrmOxoMJ6z7YUK3-78acaYLwQ8Q%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAB7nPqTE4-GCaLtDh%3DJBcgUKR6B5WkvRLC-NpOqkgybi4FhHPw%40mail.gmail.com [3] - https://www.postgresql.org/message-id/CAB7nPqQBtDW43ABnWEdoHP6A2ToedzDFdpykbGjpO2wuZNiQnw%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com change_metapage_usage_btree-v3.patch change_metapage_usage_hash-v3.patch
Re: [HACKERS] Parallel Hash take II
On Mon, Oct 30, 2017 at 1:49 PM, Thomas Munro wrote: > A couple of stupid things outstanding: > > 1. EXPLAIN ANALYZE for Parallel Hash "actual" shows the complete row > count, which is interesting to know (or not? maybe I should show it > somewhere else?), but the costing shows the partial row estimate used > for costing purposes. Fixed. > 2. The BufFileSet's temporary directory gets created even if you > don't need it for batches. Duh. Fixed. I also refactored shared temporary files a bit while looking into this. The shared file ownership mechanism is now promoted to its own translation unit sharedfileset.c and it now works with fd.c files. buffile.c can still make use of it. That seems like a better division of labour. > 3. I don't have a good query rescan regression query yet. I wish I > could write my own query plans to test the executor. I found a query that rescans a parallel-aware hash join and added a couple of variants to the regression tests. I also decluttered the EXPLAIN ANALYZE output for enable_parallel_hash = off a bit. -- Thomas Munro http://www.enterprisedb.com parallel-hash-v24.patchset.tgz
Re: [HACKERS] Linking libpq statically to libssl
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 10/27/17 08:24, Daniele Varrazzo wrote: > > I have a problem building binary packages for psycopg2. Binary > > packages ship with their own copies of libpq and libssl; > > Aside from the advice of "don't do that" ... > > > however if > > another python package links to libssl the library will be imported > > twice with conflicting symbols, likely resulting in a segfault (see > > https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if > > a python script both connects to postgres and opens an https resource. > > ... the standard solutions to this problem are symbol versioning and > linker flags to avoid making all symbols globally available. libpq has > symbol versioning. Maybe the libssl you are using does not. Also, for > example, if you dlopen() with RTLD_LOCAL, the symbols will not be > globally available, so there should be no conflicts. Uh, libpq doesn't actually have symbol versioning, at least not on the installed Ubuntu packages for PG10, as shown by objdump -T : 00011d70 gDF .text 0054 Base PQconnectStart and we've certainly not spent effort that I've seen to try to actually make libpq work when multiple versions of libpq are linked into the same running backend. I addressed that in my comments also and I don't believe you could guarantee that things would operate correctly even with symbol versioning being used. Perhaps if you versioned your symbols with something wildly different from what's on the system and hope that what's on the system never ends up with that version then maybe it'd work, but that's quite far from the intended use-case of symbol versioning. > This needs cooperation from various different parties, and the details > will likely be platform dependent. But it's generally a solved problem. Right, it's solved when there's one source of truth for the library which is being maintained consistently and carefully. 
That allows multiple versions of libraries to be installed concurrently and linked into a running process at the same time, because the group maintaining those different versions of the library make sure that the symbols are versioned with appropriate versions and they make sure to not break ABI for those symbols unless they are also changing the version and putting out a new version. This was all solved with GLIBC a very long time ago, but it's a heck of a lot of work to get it right and correct, and that's when you've got a bunch of people who work on libraries all the time and carefully control all of the versions which are installed on the OS in coordination. Trying to do so when you can't control what's happening with the other library strikes me as highly likely to result in a whole lot of difficulties. Thanks! Stephen
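For what it's worth, the symbol versioning being discussed here is driven by a GNU ld version script. The sketch below is hypothetical — the version node name and symbol list are made up for illustration, and this is not libpq's actual export mechanism (libpq derives its exported-symbol list from exports.txt at build time):

```
/* libpq-custom.map -- hypothetical version script for a privately
 * bundled libpq; every listed symbol gets stamped with the custom
 * version node, and everything else is hidden from the dynamic
 * symbol table. */
LIBPQ_PSYCOPG_1.0 {
    global:
        PQconnectStart;
        PQconnectdb;
        PQfinish;
    local:
        *;
};
```

It would be applied at link time with something like `gcc -shared -Wl,--version-script=libpq-custom.map ...`. As Stephen notes, stamping a wild custom version this way only avoids clashes if no other copy of the library in the process ever uses the same version node — which is exactly the coordination problem he describes.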
Re: [HACKERS] [PATCH] A hook for session start
On Thu, Nov 2, 2017 at 5:42 AM, Aleksandr Parfenov < a.parfe...@postgrespro.ru> wrote: > > The following review has been posted through the commitfest application: > make installcheck-world: not tested > Implements feature: not tested > Spec compliant: not tested > Documentation:not tested > > Hi, > > Unfortunately, patches 0001 and 0002 don't apply to current master. > > The new status of this patch is: Waiting on Author > Thanks for your review. Rebased versions attached. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog: http://fabriziomello.github.io >> Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello >> Github: http://github.com/fabriziomello diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 2c7260e..24346fb 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -169,6 +169,9 @@ static ProcSignalReason RecoveryConflictReason; static MemoryContext row_description_context = NULL; static StringInfoData row_description_buf; +/* Hook for plugins to get control at start of session */ +session_start_hook_type session_start_hook = NULL; + /* * decls for routines only used in this file * @@ -3857,6 +3860,9 @@ PostgresMain(int argc, char *argv[], if (!IsUnderPostmaster) PgStartTime = GetCurrentTimestamp(); + if (session_start_hook) + (*session_start_hook) (dbname, username); + /* * POSTGRES main processing loop begins here * diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h index f8c535c..d349592 100644 --- a/src/include/tcop/tcopprot.h +++ b/src/include/tcop/tcopprot.h @@ -35,6 +35,11 @@ extern PGDLLIMPORT const char *debug_query_string; extern int max_stack_depth; extern int PostAuthDelay; +/* Hook for plugins to get control at start of session */ +typedef void (*session_start_hook_type) (const char *dbname, + const char *username); +extern PGDLLIMPORT session_start_hook_type 
session_start_hook; + /* GUC-configurable parameters */ typedef enum diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 24346fb..0dbbd7b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -169,9 +169,9 @@ static ProcSignalReason RecoveryConflictReason; static MemoryContext row_description_context = NULL; static StringInfoData row_description_buf; -/* Hook for plugins to get control at start of session */ +/* Hook for plugins to get control at start or end of session */ session_start_hook_type session_start_hook = NULL; - +session_end_hook_type session_end_hook = NULL; /* * decls for routines only used in this file * @@ -196,6 +196,7 @@ static void drop_unnamed_stmt(void); static void log_disconnections(int code, Datum arg); static void enable_statement_timeout(void); static void disable_statement_timeout(void); +static void do_session_end_hook(int code, Datum arg); /* @@ -3864,6 +3865,12 @@ PostgresMain(int argc, char *argv[], (*session_start_hook) (dbname, username); /* + * Setup handler to session end hook + */ + if (IsUnderPostmaster) + on_proc_exit(do_session_end_hook, 0); + + /* * POSTGRES main processing loop begins here * * If an exception is encountered, processing resumes here so we abort the @@ -4615,3 +4622,15 @@ disable_statement_timeout(void) stmt_timeout_active = false; } } + +/* + * on_proc_exit handler to call session end hook + */ +static void +do_session_end_hook(int code, Datum arg) +{ + Port *port = MyProcPort; + + if (session_end_hook) + (*session_end_hook) (port->database_name, port->user_name); +} \ No newline at end of file diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h index d349592..b7fb8c3 100644 --- a/src/include/tcop/tcopprot.h +++ b/src/include/tcop/tcopprot.h @@ -35,10 +35,14 @@ extern PGDLLIMPORT const char *debug_query_string; extern int max_stack_depth; extern int PostAuthDelay; -/* Hook for plugins to get control at start of session */ +/* Hook for 
plugins to get control at start and end of session */ typedef void (*session_start_hook_type) (const char *dbname, const char *username); +typedef void (*session_end_hook_type) (const char *dbname, + const char *username); + extern PGDLLIMPORT session_start_hook_type session_start_hook; +extern PGDLLIMPORT session_end_hook_type session_end_hook; /* GUC-configurable parameters */ diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile index b7ed0af..a3c8c1e 100644 --- a/src/test/modules/Makefile +++
Re: [HACKERS] SQL/JSON in PostgreSQL
On 2017-02-28 20:08, Oleg Bartunov wrote: > Attached patch is an implementation of SQL/JSON data model from SQL-2016 > standard (ISO/IEC 9075-2:2016(E)) I've faintly started looking into this. > We created repository for reviewing (ask for write access) - > https://github.com/postgrespro/sqljson/tree/sqljson > Examples of usage can be found in src/test/regress/sql/sql_json.sql > The whole documentation about json support should be reorganized and added, > and we plan to do this before release. We need help of community here. > The standard describes SQL/JSON path language, which used by SQL/JSON query > operators to query JSON. It defines path language as string literal. We > implemented the path language as JSONPATH data type, since other > approaches are not friendly to planner and executor. I was a bit sad to discover that I can't PREPARE jsq AS SELECT JSON_QUERY('{}', $1); I assume because of this part of the updated grammar: json_path_specification: Sconst { $$ = $1; } ; Would it make sense, fundamentally, to allow variables there? After Andrew Gierth's analysis of this grammar problem, I understand that it's not reasonable to expect JSON_TABLE() to support variable jsonpaths, but maybe it would be feasible for everything else? From Andrew's changes to the new grammar (see attached) it seems to me that at least that part is possible. Or should I forget about trying to implement the other part? 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index adfe9b1..f459996 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -624,6 +624,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); json_table_plan_cross json_table_plan_primary json_table_default_plan + json_path_specification %type json_arguments json_passing_clause_opt @@ -635,8 +636,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type json_returning_clause_opt -%type json_path_specification - json_table_column_path_specification_clause_opt +%type json_table_column_path_specification_clause_opt json_table_path_name json_as_path_name_clause_opt @@ -845,6 +845,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %nonassoc UNBOUNDED /* ideally should have same precedence as IDENT */ %nonassoc ERROR_P EMPTY_P DEFAULT ABSENT /* JSON error/empty behavior */ +%nonassoc COLUMNS FALSE_P KEEP OMIT PASSING TRUE_P UNKNOWN %nonassoc IDENT GENERATED NULL_P PARTITION RANGE ROWS PRECEDING FOLLOWING CUBE ROLLUP %left Op OPERATOR /* multi-character ops and user-defined operators */ %left '+' '-' @@ -14472,7 +14473,7 @@ json_context_item: ; json_path_specification: - Sconst { $$ = $1; } + a_expr { $$ = $1; } ; json_as_path_name_clause_opt: @@ -14802,7 +14803,7 @@ json_table_formatted_column_definition: ; json_table_nested_columns: - NESTED path_opt json_path_specification + NESTED path_opt Sconst json_as_path_name_clause_opt json_table_columns_clause {
Re: [HACKERS] MERGE SQL Statement for PG11
On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote: > Nico Williams wrote: > >A MERGE mapped to a DML like this: > > This is a bad idea. An implementation like this is not at all > maintainable. Assuming the DELETE issue can be addressed, why would this not be maintainable? > >can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE. > > That's not handling concurrency -- it's silently ignoring an error. Who > is to say that the conflict that IGNORE ignored is associated with a row > visible to the MVCC snapshot of the statement? IOW, why should the DELETE > affect any row? Ah, yes, we'd have to make sure the DELETE does not delete rows that could not be inserted. There's... no way to find out what those would have been -- RETURNING won't mention them, though it'd be a nice addition to UPSERT to have a way to do that, and it'd make this mapping feasible. > There are probably a great many reasons why you need a ModifyTable > executor node that keeps around state, and explicitly indicates that a > MERGE is a MERGE. For example, we'll probably want statement level > triggers to execute in a fixed order, regardless of the MERGE, RLS will > probably require explicit knowledge of MERGE semantics, and so on. Wouldn't those fire anyways in a statement like the one I mentioned? > FWIW, your example doesn't actually have a source (just a target), so it > isn't actually like MERGE. That can be added. I was trying to keep it pithy. Nico --
Re: [HACKERS] pgbench - use enum for meta commands
Fabien COELHO writes:
> [ pgbench-enum-meta-2.patch ]

Pushed with some trivial cosmetic adjustments (pgindent changed it more than I did).

regards, tom lane
Re: [HACKERS] GnuTLS support
On 09/18/2017 07:04 PM, Jeff Janes wrote:> You fixed the first issue, but I still get the second one: be-secure-gnutls.c: In function 'get_peer_certificate': be-secure-gnutls.c:667: error: 'GNUTLS_X509_CRT_LIST_SORT' undeclared (first use in this function) be-secure-gnutls.c:667: error: (Each undeclared identifier is reported only once be-secure-gnutls.c:667: error: for each function it appears in.) Thanks again for testing the code. I have now rebased the patch and fixed the second issue. I tested that it works on CentOS 6. Work which remains: - sslinfo - pgcrypto - Documentation - Decide if what I did with the config is a good idea Andreas diff --git a/configure b/configure index 4ecd2e1922..1ba34dfced 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -837,6 +838,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1531,6 +1533,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -5997,6 +6000,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? 
"no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10164,6 +10202,94 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char gnutls_init (); +int +main () +{ +return gnutls_init (); + ; + return 0; +} +_ACEOF +for ac_lib in '' gnutls; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_gnutls_init=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_gnutls_init+:} false; then : + break +fi +done +if ${ac_cv_search_gnutls_init+:} false; then : + +else + ac_cv_search_gnutls_init=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5 +$as_echo "$ac_cv_search_gnutls_init" >&6; } +ac_res=$ac_cv_search_gnutls_init +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +else + as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5 +fi + + # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted + # certificate chains. 
+ ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include +#include + +" +if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl +_ACEOF + + for ac_func in gnutls_pkcs11_set_pin_function +do : + ac_fn_c_check_func "$LINENO" "gnutls_pkcs11_set_pin_function" "ac_cv_func_gnutls_pkcs11_set_pin_function" +if test "x$ac_cv_func_gnutls_pkcs11_set_pin_function" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_GNUTLS_PKCS11_SET_PIN_FUNCTION 1 +_ACEOF + +fi +done + +fi + if test "$with_pam" = yes ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5 $as_echo_n "checking for pam_start in -lpam... " >&6; } @@ -10961,6 +11087,17 @@ else fi +fi + +if test "$with_gnutls" = yes ; then + ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default" +if test
Re: [HACKERS] MERGE SQL Statement for PG11
Nico Williams wrote:
> A MERGE mapped to a DML like this:
>
>   WITH updated AS (
>       UPDATE SET ... WHERE
>       RETURNING
>   )
>   , inserted AS (
>       INSERT INTO
>       SELECT ... WHERE NOT IN (SELECT FROM updated) AND ..
>       ON CONFLICT DO NOTHING -- see below!
>       RETURNING
>   )
>   DELETE FROM
>   WHERE NOT IN (SELECT FROM updated) AND
>         NOT IN (SELECT FROM inserted) AND ...;

This is a bad idea. An implementation like this is not at all maintainable.

> can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.

That's not handling concurrency -- it's silently ignoring an error. Who is to say that the conflict that IGNORE ignored is associated with a row visible to the MVCC snapshot of the statement? IOW, why should the DELETE affect any row?

There are probably a great many reasons why you need a ModifyTable executor node that keeps around state, and explicitly indicates that a MERGE is a MERGE. For example, we'll probably want statement level triggers to execute in a fixed order, regardless of the MERGE, RLS will probably require explicit knowledge of MERGE semantics, and so on.

FWIW, your example doesn't actually have a source (just a target), so it isn't actually like MERGE.

-- Peter Geoghegan
Re: [HACKERS] MERGE SQL Statement for PG11
On Thu, Nov 02, 2017 at 02:05:19PM -0700, Peter Geoghegan wrote:
> Simon Riggs wrote:
> > So in your view we should make no attempt to avoid concurrent errors,
> > even when we have the capability to do so (in some cases) and doing so
> > would be perfectly compliant with the SQL Standard.
>
> Yes. That's what I believe. I believe this because I can't see a way to
> do this that isn't a mess, and because ON CONFLICT DO UPDATE exists and
> works well for the cases where we can do better in READ COMMITTED mode.

A MERGE mapped to a DML like this:

  WITH updated AS (
      UPDATE SET ... WHERE
      RETURNING
  )
  , inserted AS (
      INSERT INTO
      SELECT ... WHERE NOT IN (SELECT FROM updated) AND ..
      ON CONFLICT DO NOTHING -- see below!
      RETURNING
  )
  DELETE FROM
  WHERE NOT IN (SELECT FROM updated) AND
        NOT IN (SELECT FROM inserted) AND ...;

can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.

Now, one could write a MERGE that produces conflicts even without concurrency, so adding ON CONFLICT DO NOTHING by default as above... seems not-quite-correct. But presumably one wouldn't write MERGE statements that produce conflicts in the absence of concurrency, so this seems close enough to me.

Another thing is that MERGE itself could get an ON CONFLICT clause for the INSERT portion of the MERGE, allowing one to ignore some conflicts and not others, though there would be no need for DO UPDATE, only DO NOTHING for conflict resolution :) This seems better.

I do believe this mapping is correct, and could be implemented entirely in src/backend/parser/gram.y! Am I wrong about this?

Nico
--
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Alvaro Herrera writes:
> Tom Lane wrote:
>> So what would happen if we just don't summarize partial ranges?

> Index scan would always have to read all the heap pages for that partial
> range. Maybe not a big issue, but when you finish loading a table, it'd
> be good to have a mechanism to summarize that partial range ...

I guess if the ranges are big this might not be nice.

> Rather than remove the capability, I'd be inclined to make
> brin_summarize_new_values summarize the final partial range, and have
> VACUUM not do it. Would that be too inconsistent?

That doesn't really get you out of the problem that this is an abuse of the relation extension lock, and is likely to cause issues when people try to optimize that locking mechanism. Why is it that the regular technique doesn't work, ie create a placeholder tuple and let it get added to by any insertions that happen?

regards, tom lane
Re: [HACKERS] Setting pd_lower in GIN metapage
Amit Langote writes:
> On 2017/09/26 16:30, Michael Paquier wrote:
>> Cool, let's switch it back to a ready for committer status then.

> Sure, thanks.

Pushed with some cosmetic adjustments --- mostly, making the comments more explicit about why we need the apparently-redundant assignments to pd_lower.

I've marked the CF entry closed. However, I'm not sure if we're quite done with this thread. Weren't we going to adjust nbtree and hash to be more aggressive about labeling their metapages as REGBUF_STANDARD?

regards, tom lane
Re: [HACKERS] MERGE SQL Statement for PG11
Simon Riggs wrote:
> I think people imagined you had worked out how to make MERGE run
> concurrently, I certainly did, but in fact you're just saying you don't
> believe it ever should.

I'm certain that they didn't think that at all. But I'll let them speak for themselves.

> That is strange since the SQL Standard specifically allows the
> implementation to decide upon concurrent behaviour.

And yet nobody else decided to do what you propose with this apparent leeway. (See the UPSERT wiki page for many references that confirm this.)

> So in your view we should make no attempt to avoid concurrent errors,
> even when we have the capability to do so (in some cases) and doing so
> would be perfectly compliant with the SQL Standard.

Yes. That's what I believe. I believe this because I can't see a way to do this that isn't a mess, and because ON CONFLICT DO UPDATE exists and works well for the cases where we can do better in READ COMMITTED mode.

Did you know that Oracle doesn't have an EPQ style mechanism at all? Instead, it rolls back the entire statement and retries it from scratch. That is not really any further from or closer to the standard than the EPQ stuff, because the standard doesn't say anything about what should happen as far as READ COMMITTED conflict handling goes. My point here is that all of the stuff I'm talking about is only relevant in READ COMMITTED mode, in areas where the standard never provides us with guidance.

If you can rely on SSI, then there is no difference between what you propose and what I propose anyway, except that what I propose is more general and will have better performance, especially for batch MERGEs. If READ COMMITTED didn't exist, implementing ON CONFLICT would have been more or less free of controversy.

> Yes, that certainly will make an easier patch for MERGE.

Indeed, it will.

-- Peter Geoghegan
Re: [HACKERS] Deadlock in ALTER SUBSCRIPTION REFRESH PUBLICATION
On 10/24/17 13:13, Konstantin Knizhnik wrote:
> The reason for this deadlock seems clear: ALTER SUBSCRIPTION starts a
> transaction at one node and tries to create a slot at the other node, which
> waits for completion of all active transactions while building a snapshot.
> Is there any way to avoid this deadlock?

I don't see a way to avoid it in general, unless we come up with a novel way of creating replication slots.

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [PATCH] Add ALWAYS DEFERRED option for constraints
On Thu, Nov 02, 2017 at 04:20:19PM -0400, Peter Eisentraut wrote:
> I haven't really thought about this feature too hard; I just want to
> give you a couple of code comments.

Thanks!

> I think the catalog structure, and relatedly also the parser structures,
> could be made more compact. We currently have condeferrable and
> condeferred to represent three valid states (NOT DEFERRABLE, DEFERRABLE
> INITIALLY IMMEDIATE, DEFERRABLE INITIALLY DEFERRED). You are adding
> conalwaysdeferred, but you are adding only one additional state (ALWAYS
> DEFERRED). So we end up with three bool fields to represent four
> states. I think this should all be rolled into one char field with four
> states.

I thought about this. I couldn't see a way to make the two existing boolean columns have a combination meaning "ALWAYS DEFERRED" that might not break something else. Since (condeferred AND NOT condeferrable) is an impossible combination today, I could use it to mean ALWAYS DEFERRED. I'm not sure how safe that would be. And it does seem like a weird way to express ALWAYS DEFERRED, though it would work.

Replacing condeferred and condeferrable with a char column also occurred to me, and though I assume backwards-incompatible changes to pg_catalog tables are fair game, I assumed everyone would prefer avoiding such changes where possible. Also, a backwards-incompatible change to the table would significantly enlarge the patch, as more version checks would be needed, particularly regarding upgrades (which are otherwise trivial).

I felt adding a new column was probably safest. I'll make a backwards-incompatible change if requested, naturally, but I guess I'd want to get wider consensus on that, as I fear others may not agree. That fear may just be due to my ignorance of the community's preference as to pg_catalog backwards-compatibility vs. cleanliness.

Hmmm, must I do anything special about _downgrades_? Does PostgreSQL support downgrades?

> In psql and pg_dump, if you are querying new catalog fields, you need to
> have a version check to have a different query for >=PG11. (This would
> likely apply whether you adopt my suggestion above or not.)

Ah, yes, of course. I will add such a check.

> Maybe a test case in pg_dump would be useful.

Will do.

> Other than that, this looks like a pretty complete patch.

Thanks for the review! It's a small-ish patch, and my very first for PG. It was fun writing it. I greatly appreciate that PG source is easy to read.

Nico
--
Re: [HACKERS] Linking libpq statically to libssl
On 10/27/17 08:24, Daniele Varrazzo wrote: > I have a problem building binary packages for psycopg2. Binary > packages ship with their own copies of libpq and libssl; Aside from the advice of "don't do that" ... > however if > another python package links to libssl the library will be imported > twice with conflicting symbols, likely resulting in a segfault (see > https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if > a python script both connects to postgres and opens an https resource. ... the standard solutions to this problem are symbol versioning and linker flags to avoid making all symbols globally available. libpq has symbol versioning. Maybe the libssl you are using does not. Also, for example, if you dlopen() with RTLD_LOCAL, the symbols will not be globally available, so there should be no conflicts. This needs cooperation from various different parties, and the details will likely be platform dependent. But it's generally a solved problem. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic result sets from procedures
Peter Eisentraut wrote: > CREATE PROCEDURE pdrstest1() > LANGUAGE SQL > AS $$ > DECLARE c1 CURSOR WITH RETURN FOR SELECT * FROM cp_test2; > DECLARE c2 CURSOR WITH RETURN FOR SELECT * FROM cp_test3; > $$; > > CALL pdrstest1(); > > and that returns those two result sets to the client. If applied to plpgsql, to return a dynamic result, the following does work: CREATE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE query text:= 'SELECT 1 AS col1, 2 AS col2'; BEGIN EXECUTE 'DECLARE c CURSOR WITH RETURN FOR ' || query; END; $$; This method could be used, for instance, to build a pivot with dynamic columns in a single client-server round-trip, which is not possible today with the query-calling-functions interface. More generally, I guess this should help in the whole class of situations where the client needs polymorphic results, which is awesome. But instead of having procedures not return anything, couldn't they return whatever resultset(s) they want to ("no resultset" being just a particular case of "anything"), so that we could leave out cursors and simply write: CREATE PROCEDURE test() LANGUAGE plpgsql AS $$ RETURN QUERY EXECUTE 'SELECT 1 AS col1, 2 AS col2'; END; $$; Or is that not possible or not desirable? Similarly, for the SQL language, I wonder if the above example could be simplified to: CREATE PROCEDURE pdrstest1() LANGUAGE SQL AS $$ SELECT * FROM cp_test2; SELECT * FROM cp_test3; $$; by which the two result sets would go back to the client again without declaring explicit cursors. Currently, it does not error out and no result set is sent. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MERGE SQL Statement for PG11
On 2 November 2017 at 19:16, Peter Geoghegan wrote:
> Simon Riggs wrote:
>> So if I understand you correctly, in your view MERGE should just fail
>> with an ERROR if it runs concurrently with other DML?
>
> That's certainly my opinion on the matter. It seems like that might be
> the consensus, too.

Given that I only just found out what you've been talking about, I don't believe that anybody else did either. I think people imagined you had worked out how to make MERGE run concurrently, I certainly did, but in fact you're just saying you don't believe it ever should.

That is strange since the SQL Standard specifically allows the implementation to decide upon concurrent behaviour.

> Without meaning to sound glib: we already did make it work for a
> special, restricted case that is important enough to justify introducing
> a couple of kludges -- ON CONFLICT DO UPDATE/upsert.
>
> I do agree that what I propose for MERGE will probably cause confusion;
> just look into Oracle's MERGE implementation for examples of this. We
> ought to go out of our way to make it clear that MERGE doesn't provide
> these guarantees.

So in your view we should make no attempt to avoid concurrent errors, even when we have the capability to do so (in some cases) and doing so would be perfectly compliant with the SQL Standard.

Yes, that certainly will make an easier patch for MERGE.

Or are you arguing against allowing any patch for MERGE?

Now we have more clarity, who else agrees with this?

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] MERGE SQL Statement for PG11
On Thu, Nov 02, 2017 at 12:51:45PM -0700, Peter Geoghegan wrote:
> Nico Williams wrote:
> > If you want to ignore conflicts arising from concurrency you could
> > always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I
> > proposed earlier. Thus a MERGE CONCURRENTLY could just do that.
> >
> > Is there any reason not to map MERGE as I proposed?
>
> Performance, for one. MERGE generally has a join that can be optimized
> like an UPDATE FROM join.

Ah, right, I think my mapping was pessimal. How about this mapping instead then:

  WITH updated AS (
      UPDATE SET ... WHERE
      RETURNING
  )
  , inserted AS (
      INSERT INTO
      SELECT ... WHERE NOT IN (SELECT FROM updated) AND ..
      /*
       * Add ON CONFLICT DO NOTHING here to avoid conflicts in the face
       * of concurrency.
       */
      RETURNING
  )
  DELETE FROM
  WHERE NOT IN (SELECT FROM updated) AND
        NOT IN (SELECT FROM inserted) AND ...;

?

If a MERGE has no delete clause, then the mapping would be:

  WITH updated AS (
      UPDATE SET ... WHERE
      RETURNING
  )
  INSERT INTO
  SELECT ... WHERE NOT IN (SELECT FROM updated) AND ..
  /*
   * Add ON CONFLICT DO NOTHING here to avoid conflicts in the face
   * of concurrency.
   */
  ;

> I haven't studied this question in any detail, but FWIW I think that
> using CTEs for merging is morally equivalent to a traditional MERGE
> implementation. [...]

I agree. So why not do that initially? Optimize later.

Such a MERGE mapping could be implemented entirely within src/backend/parser/gram.y ... Talk about cheap to implement, review, and maintain! Also, this would be notionally very simple. Any optimizations to CTE query/DML execution would be generic and applicable to MERGE and other things besides. If mapping MERGE to CTE-using DMLs motivates such optimizations, all the better.

> [...]. It may actually be possible to map from CTEs to a MERGE
> statement, but I don't think that that's a good approach to implementing
> MERGE.

Surely not every DML with CTEs can map to MERGE. Maybe I misunderstood your comment?

> Most of the implementation time will probably be spent doing things like
> making sure MERGE behaves appropriately with triggers, RLS, updatable
> views, and so on. That will take quite a while, but isn't particularly
> technically challenging IMV.

Note that mapping to a DML with CTEs as above gets triggers, RLS, and updatable views right from the get-go, because DMLs with CTEs, and DMLs as CTEs, surely do as well.

Nico
--
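To make the shape of the proposed mapping concrete, here is a filled-in version; all table and column names (target, source, id, val) are invented for illustration and are not from the thread:

```sql
-- Hypothetical concrete form of the CTE mapping: update matched rows,
-- insert unmatched ones (absorbing races via ON CONFLICT DO NOTHING),
-- then delete target rows touched by neither branch.
WITH updated AS (
    UPDATE target t
       SET val = s.val
      FROM source s
     WHERE t.id = s.id
 RETURNING t.id
)
, inserted AS (
    INSERT INTO target (id, val)
    SELECT s.id, s.val
      FROM source s
     WHERE s.id NOT IN (SELECT id FROM updated)
    ON CONFLICT DO NOTHING
    RETURNING id
)
DELETE FROM target t
 WHERE t.id NOT IN (SELECT id FROM updated)
   AND t.id NOT IN (SELECT id FROM inserted);
```

Note that all three sub-statements run against the same snapshot, so the CTEs cannot see each other's effects except through the RETURNING sets, which is what the NOT IN filters rely on.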
Re: [HACKERS] [PATCH] Add ALWAYS DEFERRED option for constraints
I haven't really thought about this feature too hard; I just want to give you a couple of code comments.

I think the catalog structure, and relatedly also the parser structures, could be made more compact. We currently have condeferrable and condeferred to represent three valid states (NOT DEFERRABLE, DEFERRABLE INITIALLY IMMEDIATE, DEFERRABLE INITIALLY DEFERRED). You are adding conalwaysdeferred, but you are adding only one additional state (ALWAYS DEFERRED). So we end up with three bool fields to represent four states. I think this should all be rolled into one char field with four states.

In psql and pg_dump, if you are querying new catalog fields, you need to have a version check to have a different query for >=PG11. (This would likely apply whether you adopt my suggestion above or not.)

Maybe a test case in pg_dump would be useful.

Other than that, this looks like a pretty complete patch.

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
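For reference, the three states representable today and how they land in pg_constraint (this shows current behavior, on made-up tables, not the patch):

```sql
-- The three existing states map onto two booleans in pg_constraint:
CREATE TABLE a (x int UNIQUE NOT DEFERRABLE);
CREATE TABLE b (x int UNIQUE DEFERRABLE INITIALLY IMMEDIATE);
CREATE TABLE c (x int UNIQUE DEFERRABLE INITIALLY DEFERRED);

SELECT conrelid::regclass, condeferrable, condeferred
  FROM pg_constraint
 WHERE contype = 'u'
   AND conrelid IN ('a'::regclass, 'b'::regclass, 'c'::regclass);
-- a: (f, f)   b: (t, f)   c: (t, t)
```

The fourth combination, condeferrable = false with condeferred = true, is unused today, which is what makes both the "impossible combination" trick and the single-char-field redesign possible.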
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> If VACUUM and brin_summarize_new_values both ignore the partial
> >> range, then what else would request this? Can't we just decree
> >> that we don't summarize the partial range, period?
>
> > brin_summarize_range() can do it.
>
> So what would happen if we just don't summarize partial ranges?

Index scan would always have to read all the heap pages for that partial range. Maybe not a big issue, but when you finish loading a table, it'd be good to have a mechanism to summarize that partial range ...

Rather than remove the capability, I'd be inclined to make brin_summarize_new_values summarize the final partial range, and have VACUUM not do it. Would that be too inconsistent?

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
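The functions under discussion are callable directly from SQL; a quick sketch (the table and index names are invented):

```sql
-- A BRIN index on a hypothetical append-only table.
CREATE TABLE events (ts timestamptz);
CREATE INDEX events_brin ON events USING brin (ts);

-- Summarize all as-yet-unsummarized ranges (VACUUM triggers this too):
SELECT brin_summarize_new_values('events_brin');

-- Summarize just the range containing a given heap block (PG 10+),
-- e.g. to pick up the final partial range after a bulk load:
SELECT brin_summarize_range('events_brin', 0);
```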
Re: [HACKERS] MERGE SQL Statement for PG11
Simon Riggs wrote:
> So if I understand you correctly, in your view MERGE should just fail
> with an ERROR if it runs concurrently with other DML?

That's certainly my opinion on the matter. It seems like that might be the consensus, too.

Obviously there are things that you as a user can do about this on your own, like opt to use a higher isolation level, or manually LOCK TABLE. For some use cases, including bulk loading for OLAP, users might just know that there isn't going to be concurrent activity because it's not an OLTP system.

If this still seems odd to you, then consider that exactly the same situation exists with UPDATE. A user could want their UPDATE to affect a row where no row version is actually visible to their MVCC snapshot, because they have an idea about reliably updating the latest row. UPDATE doesn't work like that, of course. Is this unacceptable because the user expects that it should work that way?

Bear in mind that ON CONFLICT DO UPDATE *can* actually update a row when there is no version of it visible to the snapshot. It can also update a row where there is a concurrent DELETE + INSERT, and the tuples with the relevant unique index values end up not even being part of the same update chain in each case (MVCC-snapshot-visible vs. latest). IOW, you may end up updating a completely different logical row to the row with the conflicting value that is visible to your MVCC snapshot!

> i.e. if a race condition between the query and an INSERT runs
> concurrently with another INSERT
>
> We have no interest in making that work?

Without meaning to sound glib: we already did make it work for a special, restricted case that is important enough to justify introducing a couple of kludges -- ON CONFLICT DO UPDATE/upsert.

I do agree that what I propose for MERGE will probably cause confusion; just look into Oracle's MERGE implementation for examples of this. We ought to go out of our way to make it clear that MERGE doesn't provide these guarantees.
-- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MERGE SQL Statement for PG11
Nico Williams wrote:
> If you want to ignore conflicts arising from concurrency you could
> always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I
> proposed earlier. Thus a MERGE CONCURRENTLY could just do that.
>
> Is there any reason not to map MERGE as I proposed?

Performance, for one. MERGE generally has a join that can be optimized like an UPDATE FROM join.

I haven't studied this question in any detail, but FWIW I think that using CTEs for merging is morally equivalent to a traditional MERGE implementation. It may actually be possible to map from CTEs to a MERGE statement, but I don't think that that's a good approach to implementing MERGE.

Most of the implementation time will probably be spent doing things like making sure MERGE behaves appropriately with triggers, RLS, updatable views, and so on. That will take quite a while, but isn't particularly technically challenging IMV.

-- Peter Geoghegan
Re: [HACKERS] proposal: schema variables
On Thu, Nov 02, 2017 at 11:48:44AM -0400, Tom Lane wrote:
> Nico Williams writes:
> > With access controls, GUCs could become schema variables, and settings
> > from postgresql.conf could move into the database itself (which I think
> > would be nice).
>
> People re-propose some variant of that every so often, but it never works,
> because it ignores the fact that some of the GUCs' values are needed
> before you can access system catalogs at all, or in places where relying
> on system catalog access would be a bad idea.

ISTM that it should be possible to break the chicken-and-egg issue by storing the config variables in such a way that knowing only the pgdata directory path suffices to find them. That's effectively the case already, in that postgresql.conf is found... there.

One could probably do this as a PoC entirely as a SQL-coded VIEW that reads and writes (via the adminpack module's pg_catalog.pg_file_write()) postgresql.conf (without preserving comments, or with some rules regarding comments so that they are effectively attached to params).

Nico
--
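Part of this shape already exists in stock PostgreSQL: ALTER SYSTEM stores settings in a file under the data directory (postgresql.auto.conf), which is read at startup before any catalog access, and pg_settings exposes where each value came from. It is not the schema-variable idea, but it shows SQL-driven config living under pgdata:

```sql
-- Existing machinery, no assumptions: write a setting into
-- $PGDATA/postgresql.auto.conf and reload it without a restart.
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();

-- Inspect which settings came from configuration files, and from where:
SELECT name, setting, source, sourcefile
  FROM pg_settings
 WHERE source = 'configuration file';
```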
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Haribabu Kommi writes:
> The changes are fine and now it reports proper live tuples in both
> pg_class and stats. The other issue of continuous vacuum operation
> leading to decrease of number of live tuples is not related to this
> patch and that can be handled separately.

I did not like this patch too much, because it did nothing to fix the underlying problem of confusion between "live tuples" and "total tuples"; in fact, it made that worse, because now the comment on LVRelStats.new_rel_tuples is a lie. And there's at least one usage of that field value where I think we do want total tuples not live tuples: where we pass it to index AM cleanup functions. Indexes don't really care whether heap entries are live or not, only that they're there. Plus the VACUUM VERBOSE log message that uses the field is supposed to be reporting total tuples not live tuples.

I hacked up the patch trying to make that better, finding along the way that contrib/pgstattuple shared in the confusion about what it was trying to count. Results attached.

However, I'm not sure we're there yet, because there remains a fairly nasty discrepancy even once we've gotten everyone onto the same page about reltuples counting just live tuples: VACUUM and ANALYZE have different definitions of what's "live". In particular they do not treat INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same. Should we try to do something about that? If so, what? It looks like ANALYZE's behavior is pretty tightly constrained, judging by the comments in acquire_sample_rows.

Another problem is that it looks like CREATE INDEX will set reltuples to the total number of heap entries it chose to index, because that is what IndexBuildHeapScan counts. Maybe we should adjust that?
regards, tom lane diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c index 5bf0613..68211c6 100644 *** a/contrib/pgstattuple/pgstatapprox.c --- b/contrib/pgstattuple/pgstatapprox.c *** statapprox_heap(Relation rel, output_typ *** 68,74 Buffer vmbuffer = InvalidBuffer; BufferAccessStrategy bstrategy; TransactionId OldestXmin; - uint64 misc_count = 0; OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM); bstrategy = GetAccessStrategy(BAS_BULKREAD); --- 68,73 *** statapprox_heap(Relation rel, output_typ *** 153,177 tuple.t_tableOid = RelationGetRelid(rel); /* ! * We count live and dead tuples, but we also need to add up ! * others in order to feed vac_estimate_reltuples. */ switch (HeapTupleSatisfiesVacuum(, OldestXmin, buf)) { - case HEAPTUPLE_RECENTLY_DEAD: - misc_count++; - /* Fall through */ - case HEAPTUPLE_DEAD: - stat->dead_tuple_len += tuple.t_len; - stat->dead_tuple_count++; - break; case HEAPTUPLE_LIVE: stat->tuple_len += tuple.t_len; stat->tuple_count++; break; ! case HEAPTUPLE_INSERT_IN_PROGRESS: ! case HEAPTUPLE_DELETE_IN_PROGRESS: ! misc_count++; break; default: elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result"); --- 152,172 tuple.t_tableOid = RelationGetRelid(rel); /* ! * We follow VACUUM's lead in counting INSERT_IN_PROGRESS and ! * DELETE_IN_PROGRESS tuples as "live". */ switch (HeapTupleSatisfiesVacuum(, OldestXmin, buf)) { case HEAPTUPLE_LIVE: + case HEAPTUPLE_INSERT_IN_PROGRESS: + case HEAPTUPLE_DELETE_IN_PROGRESS: stat->tuple_len += tuple.t_len; stat->tuple_count++; break; ! case HEAPTUPLE_DEAD: ! case HEAPTUPLE_RECENTLY_DEAD: ! stat->dead_tuple_len += tuple.t_len; ! stat->dead_tuple_count++; break; default: elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result"); *** statapprox_heap(Relation rel, output_typ *** 184,191 stat->table_len = (uint64) nblocks * BLCKSZ; stat->tuple_count = vac_estimate_reltuples(rel, false, nblocks, scanned, ! 
stat->tuple_count + misc_count); /* * Calculate percentages if the relation has one or more pages. --- 179,190 stat->table_len = (uint64) nblocks * BLCKSZ; + /* + * Extrapolate the live-tuple count to the whole table in the same way + * that VACUUM does. (XXX shouldn't we also extrapolate tuple_len?) + */ stat->tuple_count = vac_estimate_reltuples(rel, false, nblocks, scanned, ! stat->tuple_count); /* * Calculate percentages if the relation has one or more pages. diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index ef60a58..87bba31 100644 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *** SCRAM-SHA-256$iteration *** 1739,1746 float4 !Number of rows in the table. This is only an estimate used by the !planner. It is updated by
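The live-versus-total disagreement discussed above can be observed from SQL by comparing the planner's estimate in pg_class with the stats collector's per-table counters after running VACUUM versus ANALYZE. The table name here is illustrative:

```sql
-- Compare pg_class.reltuples (planner estimate, updated by VACUUM,
-- ANALYZE, and CREATE INDEX) with the stats collector's live/dead
-- tuple counts; 'my_table' is an invented name.
SELECT c.reltuples AS pg_class_estimate,
       s.n_live_tup,
       s.n_dead_tup
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname = 'my_table';
```

Running VACUUM and then ANALYZE on a table with in-progress concurrent transactions can show the two numbers drifting apart, which is the discrepancy Tom describes.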
Re: [HACKERS] MERGE SQL Statement for PG11
On Thu, Nov 02, 2017 at 06:49:18PM +, Simon Riggs wrote: > On 1 November 2017 at 18:20, Peter Geoghegan wrote: > > In Postgres, you can avoid duplicate violations with MERGE by using a > > higher isolation level (these days, those are turned into a > > serialization error at higher isolation levels when no duplicate is > > visible to the xact's snapshot). > > So if I understand you correctly, in your view MERGE should just fail > with an ERROR if it runs concurrently with other DML? > > i.e. if a race condition between the query and an INSERT runs > concurrently with another INSERT > > We have no interest in making that work? If you map MERGE to a DML with RETURNING-DML CTEs as I suggested before, how would that interact with concurrent DMLs? The INSERT DML of the mapped statement could produce conflicts that abort the whole MERGE, correct? If you want to ignore conflicts arising from concurrency you could always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I proposed earlier. Thus a MERGE CONCURRENTLY could just do that. Is there any reason not to map MERGE as I proposed? Such an implementation of MERGE wouldn't be online because CTEs are always implemented sequentially currently. That's probably reason enough to eventually produce a native implementation of MERGE, ... or to revamp the CTE machinery to allow such a mapping to be online. Nico --
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Alvaro Herrera writes: > Tom Lane wrote: >> If VACUUM and brin_summarize_new_values both ignore the partial >> range, then what else would request this? Can't we just decree >> that we don't summarize the partial range, period? > brin_summarize_range() can do it. So what would happen if we just don't summarize partial ranges? regards, tom lane
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Tom Lane wrote: > Alvaro Herrera writes: > > 2. when summarization is requested on the partial range at the end of a > > table, we acquire extension lock on the rel, then compute relation size > > and run summarization with the lock held. This guarantees that we don't > > miss any pages. This is bad for concurrency though, so it's only done > > in that specific scenario. > > Hm, I wonder how this will play with the active proposals around > reimplementing relation extension locks. All that work seems to be > assuming that the extension lock is only held for a short time and > nothing much beyond physical extension is done while holding it. > I'm afraid that you may be introducing a risk of e.g. deadlocks > if you do this. Ouch ... yeah, that could be a problem. Another idea I had was to just insert the placeholder tuple while holding the extension lock, then release the lock while the summarization is done. It would be a bit of a break of the current separation of concerns, but I'm not convinced that the current setup is perfect, so maybe that's okay. > If VACUUM and brin_summarize_new_values both ignore the partial > range, then what else would request this? Can't we just decree > that we don't summarize the partial range, period? brin_summarize_range() can do it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] MERGE SQL Statement for PG11
If nothing else, anyone needing MERGE can port their MERGE statements to a DML with DML-containing CTEs... The generic mapping would be something like this, I think: WITH rows AS (SELECT FROM WHERE ) , updated AS ( UPDATE SET ... WHERE IN (SELECT FROM rows) /* matched */ RETURNING ) , inserted AS ( INSERT INTO SELECT ... WHERE NOT IN (SELECT FROM rows) /* not matched */ RETURNING ) DELETE FROM WHERE (...) AND NOT IN (SELECT FROM updated UNION SELECT FROM inserted); Nico --
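A concrete (and heavily simplified) instance of that generic mapping, for invented tables target(k, v) and src(k, v), might read as follows. All names are illustrative, and this ignores the concurrency caveats raised elsewhere in the thread:

```sql
-- Illustrative sketch of MERGE-as-CTEs; table/column names are invented.
WITH rows AS (
    SELECT t.k FROM target t JOIN src s USING (k)      -- matched keys
), updated AS (
    UPDATE target t
    SET v = s.v
    FROM src s
    WHERE t.k = s.k
      AND t.k IN (SELECT k FROM rows)                  -- WHEN MATCHED
    RETURNING t.k
), inserted AS (
    INSERT INTO target (k, v)
    SELECT s.k, s.v
    FROM src s
    WHERE s.k NOT IN (SELECT k FROM rows)              -- WHEN NOT MATCHED
    RETURNING k
)
DELETE FROM target
WHERE k NOT IN (SELECT k FROM updated
                UNION
                SELECT k FROM inserted);
```

Note that all the data-modifying sub-statements run against the same snapshot, which is part of why, as Nico concedes, such a mapping would not interact gracefully with concurrent DML the way a native implementation could.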
Re: [HACKERS] MERGE SQL Statement for PG11
On 1 November 2017 at 18:20, Peter Geoghegan wrote: > In Postgres, you can avoid duplicate violations with MERGE by using a > higher isolation level (these days, those are turned into a > serialization error at higher isolation levels when no duplicate is > visible to the xact's snapshot). So if I understand you correctly, in your view MERGE should just fail with an ERROR if it runs concurrently with other DML? i.e. if a race condition between the query and an INSERT runs concurrently with another INSERT We have no interest in making that work? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
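For reference, the SQL-standard MERGE shape being debated here (not available in PostgreSQL at the time of this thread) looks roughly like this, with invented table names:

```sql
-- SQL-standard MERGE skeleton; 'target' and 'src' are illustrative.
MERGE INTO target t
USING src s
   ON t.k = s.k
WHEN MATCHED THEN
    UPDATE SET v = s.v
WHEN NOT MATCHED THEN
    INSERT (k, v) VALUES (s.k, s.v);
```

The concurrency question in this subthread is what should happen when, under READ COMMITTED, a concurrent INSERT adds a row that neither the MATCHED nor NOT MATCHED branch saw in the statement's snapshot.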
Re: [HACKERS] MERGE SQL Statement for PG11
Robert Haas wrote: And if, in the meantime, MERGE can only handle the cases where there is a unique index, then it can only handle the cases INSERT .. ON CONFLICT UPDATE can cover, which makes it, as far as I can see, syntactic sugar over what we already have. Maybe it's not entirely - you might be planning to make some minor functional enhancements - but it's not clear what those are, and I feel like whatever it is could be done with less work and more elegance by just extending the INSERT .. ON CONFLICT UPDATE syntax. +1 Marko Tiikkaja's INSERT ... ON CONFLICT SELECT patch, which is in the current CF [1], moves things in this direction. I myself have occasionally wondered if it was worth adding an alternative DO DELETE conflict_action. This could appear alongside DO UPDATE, and be applied using MERGE-style conditions. All of these things seem like small adjuncts to ON CONFLICT because they're all just an alternative way of modifying or projecting the tuple that is locked by ON CONFLICT. Everything new would have to happen after the novel ON CONFLICT handling has already completed. The only reason that I haven't pursued this is because it doesn't seem that compelling. I mention it now because it's worth acknowledging that ON CONFLICT could be pushed a bit further in this direction. Of course, this still falls far short of making ON CONFLICT entirely like MERGE. [1] https://commitfest.postgresql.org/15/1241/ -- Peter Geoghegan
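The existing capability being contrasted here is the ON CONFLICT clause, which requires an arbiter unique index. A minimal example of what it already covers (table and column names invented), plus the hypothetical DO DELETE variant mooted above:

```sql
-- What INSERT .. ON CONFLICT DO UPDATE already covers today; requires
-- a unique index on target(k). Names are invented.
INSERT INTO target (k, v) VALUES (1, 'new')
ON CONFLICT (k) DO UPDATE
    SET v = EXCLUDED.v
    WHERE target.v IS DISTINCT FROM EXCLUDED.v;  -- optional DO UPDATE condition

-- The DO DELETE conflict_action mentioned above is hypothetical syntax,
-- i.e. NOT valid SQL; it would look something like:
--   ON CONFLICT (k) DO DELETE WHERE target.v = 'stale';
```

As the surrounding discussion notes, anything expressible this way operates on the single tuple locked by the conflict arbiter, which is why it stops well short of full MERGE semantics.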
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Alvaro Herrera writes: > 1. in VACUUM or brin_summarize_new_values, we only process fully loaded > ranges, and ignore the partial range at end of table. OK. > 2. when summarization is requested on the partial range at the end of a > table, we acquire extension lock on the rel, then compute relation size > and run summarization with the lock held. This guarantees that we don't > miss any pages. This is bad for concurrency though, so it's only done > in that specific scenario. Hm, I wonder how this will play with the active proposals around reimplementing relation extension locks. All that work seems to be assuming that the extension lock is only held for a short time and nothing much beyond physical extension is done while holding it. I'm afraid that you may be introducing a risk of e.g. deadlocks if you do this. If VACUUM and brin_summarize_new_values both ignore the partial range, then what else would request this? Can't we just decree that we don't summarize the partial range, period? regards, tom lane
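For context, these are the two user-facing entry points under discussion (the index name is illustrative):

```sql
-- Summarize all as-yet-unsummarized page ranges of a BRIN index
-- (the same work that VACUUM triggers):
SELECT brin_summarize_new_values('my_brin_idx'::regclass);

-- Summarize just the range containing a given heap block number;
-- this is the function that can still be pointed at the partial
-- range at the end of the table:
SELECT brin_summarize_range('my_brin_idx'::regclass, 0);
```

brin_summarize_range() is the reason the "just never summarize the partial range" decree doesn't quite work: a user can explicitly request summarization of that range.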
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Tomas Vondra wrote: > Unfortunately, I think we still have a problem ... I've been wondering > if we end up producing correct indexes, so I've done a simple test. Here's a proposed patch that should fix this problem (and does, in my testing). Would you please give it a try? This patch changes two things: 1. in VACUUM or brin_summarize_new_values, we only process fully loaded ranges, and ignore the partial range at end of table. 2. when summarization is requested on the partial range at the end of a table, we acquire extension lock on the rel, then compute relation size and run summarization with the lock held. This guarantees that we don't miss any pages. This is bad for concurrency though, so it's only done in that specific scenario. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From a760bfd44afeff8d1629599411ac7ce87acc7d26 Mon Sep 17 00:00:00 2001 From: Alvaro HerreraDate: Thu, 2 Nov 2017 18:35:10 +0100 Subject: [PATCH] Fix summarization concurrent with relation extension --- src/backend/access/brin/brin.c | 91 -- 1 file changed, 70 insertions(+), 21 deletions(-) diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c index b3aa6d1ced..7696c0700c 100644 --- a/src/backend/access/brin/brin.c +++ b/src/backend/access/brin/brin.c @@ -29,6 +29,7 @@ #include "postmaster/autovacuum.h" #include "storage/bufmgr.h" #include "storage/freespace.h" +#include "storage/lmgr.h" #include "utils/builtins.h" #include "utils/index_selfuncs.h" #include "utils/memutils.h" @@ -68,6 +69,10 @@ static BrinBuildState *initialize_brin_buildstate(Relation idxRel, static void terminate_brin_buildstate(BrinBuildState *state); static void brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange, double *numSummarized, double *numExisting); +static void determine_summarization_range(BlockNumber pageRange, + BlockNumber heapNumBlocks, + BlockNumber pagesPerRange, + BlockNumber *startBlk, 
BlockNumber *endBlk); static void form_and_insert_tuple(BrinBuildState *state); static void union_tuples(BrinDesc *bdesc, BrinMemTuple *a, BrinTuple *b); @@ -1253,30 +1258,16 @@ brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange, BlockNumber startBlk; BlockNumber endBlk; - /* determine range of pages to process; nothing to do for an empty table */ - heapNumBlocks = RelationGetNumberOfBlocks(heapRel); - if (heapNumBlocks == 0) - return; - revmap = brinRevmapInitialize(index, , NULL); - if (pageRange == BRIN_ALL_BLOCKRANGES) + /* determine range of pages to process */ + heapNumBlocks = RelationGetNumberOfBlocks(heapRel); + determine_summarization_range(pageRange, heapNumBlocks, pagesPerRange, + , ); + if (startBlk > heapNumBlocks) { - startBlk = 0; - endBlk = heapNumBlocks; - } - else - { - startBlk = (pageRange / pagesPerRange) * pagesPerRange; - /* Nothing to do if start point is beyond end of table */ - if (startBlk > heapNumBlocks) - { - brinRevmapTerminate(revmap); - return; - } - endBlk = startBlk + pagesPerRange; - if (endBlk > heapNumBlocks) - endBlk = heapNumBlocks; + brinRevmapTerminate(revmap); + return; } /* @@ -1287,9 +1278,31 @@ brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange, { BrinTuple *tup; OffsetNumber off; + boolext_lock_held = false; CHECK_FOR_INTERRUPTS(); + /* +* Whenever we're scanning a range that would go past what we know to +* be end-of-relation, we need to ensure we scan to the true end of the +* relation, or we risk missing tuples in recently added pages. To do +* this, we hold the relation extension lock from here till we're done +* summarizing, and compute the relation size afresh now. The logic in +* determine_summarization_range ensures that this is not done in the +* common cases of vacuum or brin_summarize_new_values(), but instead +* only when we're specifically asked to summarize the last range in +* the relation. +*/ + if (heapBlk + pagesPerRange >
Re: [HACKERS] initdb w/ restart
On 9/28/17 15:40, Jesper Pedersen wrote: > In a case of > > initdb /tmp/pgsql > > followed by > > pg_ctl -D /tmp/pgsql/ -l /tmp/logfile restart > > you'll get > > pg_ctl: PID file "/tmp/pgsql/postmaster.pid" does not exist > Is server running? > starting server anyway > pg_ctl: could not read file "/tmp/pgsql/postmaster.opts" > > The attached patch changes the message to "trying to start server > anyway" to highlight it is an attempt, not something that will happen. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] proposal: schema variables
On Thu, Nov 2, 2017 at 9:05 PM, Nico Williams wrote: >> Overloading SET to handle both variables and GUCs seems likely to >> create problems, possibly including security problems. For example, >> maybe a security-definer function could leave behind variables to >> trick the calling code into failing to set GUCs that it intended to >> set. Or maybe creating a variable at the wrong time will just break >> things randomly. > > That's already true of GUCs, since there are no access controls on > set_config()/current_setting(). No, it isn't. Right now, SET always refers to a GUC, never a variable, so there's no possibility of getting confused about whether it's intending to change a GUC or an eponymous variable. Once you make SET able to change either one of two different kinds of objects, then that possibility does exist. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
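Nico's point about the lack of access control refers to the built-in GUC functions, which any session can call for custom (dotted-name) settings. The setting name below is illustrative:

```sql
-- No ACL applies here: any user can write and read a custom GUC.
-- Third argument false = session scope (true would be transaction-local).
SELECT set_config('myapp.current_user_id', '42', false);
SELECT current_setting('myapp.current_user_id');  -- returns the text '42'
```

Robert's counterpoint is orthogonal to this: today SET unambiguously names a GUC, and overloading it to also resolve schema variables is what would introduce the new ambiguity.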
[HACKERS] Parallel Plans and Cost of non-filter functions
I'm working on a custom aggregate that generates a serialized data format. The preparation of the geometry before being formatted is pretty intense, so it is probably a good thing for that work to be done in parallel, in partial aggregates. Here's an example SQL call: EXPLAIN analyze SELECT length(ST_AsMVT(a)) FROM ( SELECT ST_AsMVTGeom(p.geom, ::geometry_literal, 4096, 0, true), gid, fed_num FROM pts_10 p WHERE p.geom && ::geometry_literal AND p.geom IS NOT NULL ) a; The ST_AsMVTGeom() function can be comically expensive, so it's really good to have it run in partial aggregates. But the cost of the function seems to be ignored. (First note that, in order to consistently get parallel plans I have to brutally suppress parallel_tuple_cost, as described here http://blog.cleverelephant.ca/2017/10/parallel-postgis-2.html) Whether I get a parallel aggregate seems entirely determined by the number of rows, not the cost of preparing those rows. When changing the number of rows in the subquery, with a LIMIT, I can change from a seq scan to a parallel seq scan and finally to a parallel aggregate, as the number of rows goes up. An odd effect: when I have enough rows to get a parallel seq scan, I can flip it back to a seq scan by *increasing* the cost of ST_AsMVTGeom. That seems odd and backwards. Is there anywhere a guide or rough description to how costs are used in determining parallel plans? The empirical approach starts to wear one down after a while :) P.
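The two knobs referred to above are the planner's per-tuple transfer cost and the per-function cost stored in pg_proc.procost. A sketch of how they would be adjusted (the PostGIS function signature here is assumed, not verified):

```sql
-- Coax parallel plans by zeroing the cost of shipping tuples from
-- workers to the leader, per the blog post mentioned above:
SET parallel_tuple_cost = 0;

-- Tell the planner the per-call function is expensive. The exact
-- argument list of ST_AsMVTGeom is an assumption here; COST is a
-- multiplier of cpu_operator_cost.
ALTER FUNCTION st_asmvtgeom(geometry, geometry, integer, integer, boolean)
    COST 10000;
```

The puzzling behavior described is that raising COST this way apparently discouraged, rather than encouraged, the parallel plan.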
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 2, 2017 at 10:26 PM, Peter Geoghegan wrote: > On Thu, Nov 2, 2017 at 9:44 AM, Robert Haas wrote: >> The second commit (22576734b805fb0952f9be841ca8f643694ee868) is where >> I think things get a lot more dangerous. The problem (as Andres >> pointed out to me this afternoon) is that it seems possible to end up >> with a situation where there should be two HOT chains on a page, and >> because of the weakened xmin/xmax checking rules, we end up thinking >> that the second one is a continuation of the first one, which will be >> all kinds of bad news. That would be particularly likely to happen in >> releases from before we invented HEAP_XMIN_FROZEN, when there's no >> xmin/xmax matching at all, but could happen on later releases if we >> are extraordinarily unlucky (i.e. xmin of the first tuple in the >> second chain happens to be the same as the pre-freeze xmax in the old >> chain, probably because the same XID was used to update the page in >> two consecutive epochs). Fortunately, that commit is (I think) not >> released anywhere. > > FWIW, if you look at the second commit > (22576734b805fb0952f9be841ca8f643694ee868) carefully, you'll realize > that it doesn't even treat those two cases differently. It was buggy > even on its own terms. The FrozenTransactionId test used an xmin from > HeapTupleHeaderGetXmin(), not HeapTupleHeaderGetRawXmin(). Oh, wow. You seem to be correct. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] MERGE SQL Statement for PG11
On Tue, Oct 31, 2017 at 5:14 PM, Simon Riggs wrote: > I've proposed a SQL Standard compliant implementation that would do > much more than be new syntax over what we already have. > > So these two claims aren't accurate: "radical difference" and "syntax > sugar over a capability we have". I think those claims are pretty accurate. The design of INSERT .. ON CONFLICT UPDATE and the supporting machinery only work if there is a unique index. It provides radically different behavior than what you get with any other DML statement, with completely novel concurrency behavior, and there is no reasonable way to emulate that behavior in cases where no relevant unique index exists. Therefore, if MERGE eventually uses INSERT .. ON CONFLICT UPDATE when a relevant unique index exists and does something else, such as your proposal of taking a strong lock, or Peter's proposal of doing this in a concurrency-oblivious manner, in other cases, then those two cases will behave very differently. And if, in the meantime, MERGE can only handle the cases where there is a unique index, then it can only handle the cases INSERT .. ON CONFLICT UPDATE can cover, which makes it, as far as I can see, syntactic sugar over what we already have. Maybe it's not entirely - you might be planning to make some minor functional enhancements - but it's not clear what those are, and I feel like whatever it is could be done with less work and more elegance by just extending the INSERT .. ON CONFLICT UPDATE syntax. And it does seem to be your intention to only handle the cases which the INSERT .. ON CONFLICT UPDATE infrastructure can cover, because upthread you wrote this: "I didn't say it but my intention was to just throw an ERROR if no single unique index can be identified." I don't think anybody's putting words into your mouth here. We're just reading what you wrote.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pgbench - use enum for meta commands
On Thu, 2 Nov 2017 17:50:52 +0100 (CET) Fabien COELHO wrote: > Updated version attached. Here is the patch. Sorry for the noise. Everything is alright. Patch is ready for committer. -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: [HACKERS] [PATCH] Minor patch to correct symbol name in logs
On 9/20/17 01:56, Vaishnavi Prabakaran wrote: > Backend's lo_functions were renamed to avoid conflicting with libpq > prototypes in commit - 6fc547960dbe0b8bd6cefae5ab7ec3605a5c46fc. > > Logs inside those functions still refer to old symbol names, here is the > small patch to update the same. I think those debug messages refer to the SQL name of the function, so they look correct to me as is. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [COMMITTERS] pgsql: passwordcheck: Add test suite
On 9/15/17 00:20, Michael Paquier wrote: > On Fri, Sep 15, 2017 at 12:02 PM, Michael Paquier wrote: >> On Fri, Sep 15, 2017 at 11:46 AM, Peter Eisentraut wrote: >>> passwordcheck: Add test suite >>> >>> Also improve one error message. >>> >>> Reviewed-by: David Steele >> >> Sorry for showing up late for this topic. >> +REGRESS_OPTS = --temp-config $(srcdir)/passwordcheck.conf >> +REGRESS = passwordcheck >> +# disabled because these tests require setting shared_preload_libraries >> +NO_INSTALLCHECK = 1 >> You could have avoided all that by just issuing "load >> 'passwordcheck';" at the beginning of the test. And you gain support >> for installcheck this way. > > In short you just need the attached ;) committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 2, 2017 at 9:44 AM, Robert Haas wrote: > The second commit (22576734b805fb0952f9be841ca8f643694ee868) is where > I think things get a lot more dangerous. The problem (as Andres > pointed out to me this afternoon) is that it seems possible to end up > with a situation where there should be two HOT chains on a page, and > because of the weakened xmin/xmax checking rules, we end up thinking > that the second one is a continuation of the first one, which will be > all kinds of bad news. That would be particularly likely to happen in > releases from before we invented HEAP_XMIN_FROZEN, when there's no > xmin/xmax matching at all, but could happen on later releases if we > are extraordinarily unlucky (i.e. xmin of the first tuple in the > second chain happens to be the same as the pre-freeze xmax in the old > chain, probably because the same XID was used to update the page in > two consecutive epochs). Fortunately, that commit is (I think) not > released anywhere. FWIW, if you look at the second commit (22576734b805fb0952f9be841ca8f643694ee868) carefully, you'll realize that it doesn't even treat those two cases differently. It was buggy even on its own terms. The FrozenTransactionId test used an xmin from HeapTupleHeaderGetXmin(), not HeapTupleHeaderGetRawXmin(). -- Peter Geoghegan
Re: [HACKERS] MERGE SQL Statement for PG11
On Thu, Nov 2, 2017 at 8:28 AM, Craig Ringer wrote: > On 1 November 2017 at 01:55, Peter Geoghegan wrote: >> The problem here is: Iff the first statement uses ON CONFLICT >> infrastructure, doesn't the absence of WHEN NOT MATCHED imply >> different semantics for the remaining updates and deletes in the >> second version of the query? You've removed what seems like a neat >> adjunct to the MERGE, but it actually changes everything else too when >> using READ COMMITTED. > > Would these concerns be alleviated by adding some kind of Pg-specific > decoration that constrained concurrency-safe MERGEs? > > So your first statement would be > > MERGE CONCURRENTLY ... > > and when you removed the WHEN NOT MATCHED clause it'd ERROR because > that's no longer able to be done with the same concurrency-safe > semantics? > > I don't know if this would be helpful TBH, or if it would negate > Simon's compatibility goals. Just another idea. Yes, that fixes the problem. Of course, it also turns MERGE CONCURRENTLY into syntactic sugar for INSERT ON CONFLICT UPDATE, which brings one back to the question of exactly what we're trying to achieve here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pgbench - use enum for meta commands
Updated version attached. . Here is the patch. Sorry for the noise. -- Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 5d8a01c..6bd3e52 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -373,11 +373,21 @@ typedef enum QueryMode static QueryMode querymode = QUERY_SIMPLE; static const char *QUERYMODE[] = {"simple", "extended", "prepared"}; +typedef enum MetaCommand +{ + META_NONE, + META_SET, + META_SETSHELL, + META_SHELL, + META_SLEEP +} MetaCommand; + typedef struct { char *line; /* text of command line */ int command_num; /* unique index of this Command struct */ int type; /* command type (SQL_COMMAND or META_COMMAND) */ + MetaCommand meta; /* meta command identifier, if appropriate */ int argc; /* number of command words */ char *argv[MAX_ARGS]; /* command word list */ PgBenchExpr *expr; /* parsed expression, if needed */ @@ -1721,6 +1731,26 @@ evaluateExpr(TState *thread, CState *st, PgBenchExpr *expr, PgBenchValue *retval } } +/* return meta-command enum identifier */ +static MetaCommand +getMetaCommand(const char * cmd) +{ + MetaCommand mc; + if (cmd == NULL) + mc = META_NONE; + else if (pg_strcasecmp(cmd, "set") == 0) + mc = META_SET; + else if (pg_strcasecmp(cmd, "setshell") == 0) + mc = META_SETSHELL; + else if (pg_strcasecmp(cmd, "shell") == 0) + mc = META_SHELL; + else if (pg_strcasecmp(cmd, "sleep") == 0) + mc = META_SLEEP; + else + mc = META_NONE; + return mc; +} + /* * Run a shell command. The result is assigned to the variable if not NULL. * Return true if succeeded, or false on error. 
@@ -2214,7 +2244,7 @@ doCustom(TState *thread, CState *st, StatsData *agg) fprintf(stderr, "\n"); } - if (pg_strcasecmp(argv[0], "sleep") == 0) + if (command->meta == META_SLEEP) { /* * A \sleep doesn't execute anything, we just get the @@ -2240,7 +2270,7 @@ doCustom(TState *thread, CState *st, StatsData *agg) } else { - if (pg_strcasecmp(argv[0], "set") == 0) + if (command->meta == META_SET) { PgBenchExpr *expr = command->expr; PgBenchValue result; @@ -2259,7 +2289,7 @@ doCustom(TState *thread, CState *st, StatsData *agg) break; } } - else if (pg_strcasecmp(argv[0], "setshell") == 0) + else if (command->meta == META_SETSHELL) { bool ret = runShellCommand(st, argv[1], argv + 2, argc - 2); @@ -2279,7 +2309,7 @@ doCustom(TState *thread, CState *st, StatsData *agg) /* succeeded */ } } - else if (pg_strcasecmp(argv[0], "shell") == 0) + else if (command->meta == META_SHELL) { bool ret = runShellCommand(st, NULL, argv + 1, argc - 1); @@ -3023,6 +3053,7 @@ process_sql_command(PQExpBuffer buf, const char *source) my_command = (Command *) pg_malloc0(sizeof(Command)); my_command->command_num = num_commands++; my_command->type = SQL_COMMAND; + my_command->meta = META_NONE; initSimpleStats(_command->stats); /* @@ -3091,7 +3122,9 @@ process_backslash_command(PsqlScanState sstate, const char *source) my_command->argv[j++] = pg_strdup(word_buf.data); my_command->argc++; - if (pg_strcasecmp(my_command->argv[0], "set") == 0) + my_command->meta = getMetaCommand(my_command->argv[0]); + + if (my_command->meta == META_SET) { /* For \set, collect var name, then lex the expression. 
*/ yyscan_t yyscanner; @@ -3146,7 +3179,7 @@ process_backslash_command(PsqlScanState sstate, const char *source) expr_scanner_offset(sstate), true); - if (pg_strcasecmp(my_command->argv[0], "sleep") == 0) + if (my_command->meta == META_SLEEP) { if (my_command->argc < 2) syntax_error(source, lineno, my_command->line, my_command->argv[0], @@ -3187,13 +3220,13 @@ process_backslash_command(PsqlScanState sstate, const char *source) my_command->argv[2], offsets[2] - start_offset); } } - else if (pg_strcasecmp(my_command->argv[0], "setshell") == 0) + else if (my_command->meta == META_SETSHELL) { if (my_command->argc < 3) syntax_error(source, lineno, my_command->line, my_command->argv[0], "missing argument", NULL, -1); } - else if (pg_strcasecmp(my_command->argv[0], "shell") == 0) + else if (my_command->meta == META_SHELL) { if (my_command->argc < 2) syntax_error(source, lineno, my_command->line, my_command->argv[0], @@ -3201,6 +3234,7 @@ process_backslash_command(PsqlScanState sstate, const char *source) } else { + /* my_command->meta == META_NONE */ syntax_error(source, lineno, my_command->line, my_command->argv[0], "invalid command", NULL, -1); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - use enum for meta commands
> The only thing I'm not quite sure about is a comment "which meta
> command ...". Maybe it's better to write it without a question word,
> something like "meta command identifier..."?

Ok. I agree. Updated version attached. I also added a const on a function parameter.

Just a note about the motivation: I want to add the same "\if" syntax added to psql, but it requires looking at the meta command in a number of places to manage the automaton status, and the strcmp solution looked both ugly and inefficient. So this small refactoring is just a preliminary to the "\if" patch, some day, after this one gets committed, if it gets committed.

The new status of this patch is: Ready for Committer

Thanks for the review.

-- 
Fabien.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing wal_keep_segments as default configuration in PostgresNode.pm
On Thu, Nov 2, 2017 at 4:47 PM, Peter Eisentraut wrote:
> On 9/11/17 21:55, Michael Paquier wrote:
>> I tend to think that while all the other parameters make sense to
>> deploy instances that need few resources, wal_keep_segments may cause
>> up to 350MB of WAL segments to be kept in each pg_wal's instance,
>> while max_wal_size is set at 128MB. The only test in the code tree in
>> need of wal_keep_segments is actually pg_rewind, which enforces
>> checkpoints after the rewind to update the source's control file.
>>
>> So, thoughts about the attached that reworks this portion of PostgresNode.pm?
>
> Committed.
>
> Besides the resource usage, it would probably be bad if a
> wal_keep_segments setting papered over problems with replication slots
> for example.

Thanks! I almost forgot this patch.
-- 
Michael

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 2, 2017 at 4:20 AM, Andres Freund wrote:
> I think the problem is on the pruning, rather than the freezing side. We
> can't freeze a tuple if it has an alive predecessor - rather than
> weakening this, we should be fixing the pruning to not have the alive
> predecessor.

Excellent catch.

> If the update xmin is actually below the cutoff we can remove the tuple
> even if there's live lockers - the lockers will also be present in the
> newer version of the tuple. I verified that for me that fixes the
> problem. Obviously that'd require some comment work and more careful
> diagnosis.

I didn't even know that that was safe.

> I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
> the face of previously corrupted tuple chains and pg_upgraded clusters -
> it can lead to tuples being considered related, even though they're
> from entirely independent hot chains. Especially when upgrading 9.3 post
> your fix, to current releases.

Frankly, I'm relieved that you got to this. I was highly suspicious of a5736bf754c82d8b86674e199e232096c679201d, even beyond my specific, actionable concern about how it failed to handle the 9.3/FrozenTransactionId xmin case as special. As I went into in the "heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead bug" thread, these commits left us with a situation where there didn't seem to be a reliable way of knowing whether or not it is safe to interrogate clog for a given heap tuple using a tool like amcheck. And, it wasn't obvious that you couldn't have a codepath that failed to account for pre-cutoff non-frozen tuples -- codepaths that call TransactionIdDidCommit() despite it actually being unsafe.

If I'm not mistaken, your proposed fix restores sanity there.
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing wal_keep_segments as default configuration in PostgresNode.pm
On 9/11/17 21:55, Michael Paquier wrote: > I tend to think that while all the other parameters make sense to > deploy instances that need few resources, wal_keep_segments may cause > up to 350MB of WAL segments to be kept in each pg_wal's instance, > while max_wal_size is set at 128MB. The only test in the code tree in > need of wal_keep_segments is actually pg_rewind, which enforces > checkpoints after the rewind to update the source's control file. > > So, thoughts about the attached that reworks this portion of PostgresNode.pm? Committed. Besides the resource usage, it would probably be bad if a wal_keep_segments setting papered over problems with replication slots for example. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 2, 2017 at 8:25 PM, Alvaro Herrera wrote:
> Pushed the reverts.
>
> I noticed while doing so that REL_10_STABLE contains the bogus commits.
> Does that change our opinion regarding what to do for people upgrading
> to a version containing the broken commits? I don't think so, because
>
> 1) we hope that not many people will trust their data to 10.0
> immediately after release
> 2) the bug is very low probability
> 3) it doesn't look like we can do a lot about it anyway.

Just to be clear, it looks like "Fix freezing of a dead HOT-updated tuple" (46c35116ae1acc8826705ef2a7b5d9110f9d6e84) went in before 10.0 was stamped, but "Fix traversal of half-frozen update chains" (22576734b805fb0952f9be841ca8f643694ee868) went in afterwards and is therefore unreleased at present.

Users of 10.0 who hit the code introduced by 46c35116ae1acc8826705ef2a7b5d9110f9d6e84 will have XIDs stored in the xmax fields of tuples that predate relfrozenxid. Those tuples will be hinted-committed. That's not good, but it might not really have much in the way of consequences. *IF* the next VACUUM doesn't get confused by the old XID, then it will prune the tuple then and I think we'll be OK. And I think it won't, because it should just call HeapTupleSatisfiesVacuum() and that should see that HEAP_XMAX_COMMITTED is set and not actually try to consult the old CLOG. If that hint bit can ever get lost - or fail to propagate to a standby - then we have more trouble, but the fact that it's set by a logged operation makes me hope that can't happen. Furthermore, that follow-on VACUUM should indeed arrive in due time, because we will not have marked the page all-visible -- HeapTupleSatisfiesVacuum() will NOT have returned HEAPTUPLE_LIVE when called from lazy_scan_heap(), and therefore we will have set all_visible = false.

The second commit (22576734b805fb0952f9be841ca8f643694ee868) is where I think things get a lot more dangerous.
The problem (as Andres pointed out to me this afternoon) is that it seems possible to end up with a situation where there should be two HOT chains on a page, and because of the weakened xmin/xmax checking rules, we end up thinking that the second one is a continuation of the first one, which will be all kinds of bad news. That would be particularly likely to happen in releases from before we invented HEAP_XMIN_FROZEN, when there's no xmin/xmax matching at all, but could happen on later releases if we are extraordinarily unlucky (i.e. xmin of the first tuple in the second chain happens to be the same as the pre-freeze xmax in the old chain, probably because the same XID was used to update the page in two consecutive epochs). Fortunately, that commit is (I think) not released anywhere. Personally, I think it would be best to push the release out a week. I think we understand this well enough now that we can fix it relatively easily, but haste makes bugs, and (I know you're all tired of hearing me say this) patches that implicate the on-disk format are scary. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Alvaro Herrera writes:
> Tom Lane wrote:
>> Where are we on this --- do you want me to push the brin_doupdate
>> fix I proposed, or were you intending to merge that into a
>> larger patch?
> Please push your fixes, I'll post my proposed patch for the other bug
> afterwards; they are unrelated problems after all.

OK, I was not sure if you considered them related or not. Will push.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Tom Lane wrote:
> Alvaro Herrera writes:
> >> Hmm, I'm pretty sure we stress-tested brin in pretty much the same way.
> >> But I see this misbehavior too. Looking ...
>
> > Turns out that this is related to concurrent growth of the table while
> > the summarization process is scanning -- so new pages have appeared at
> > the end of the table after the end point has been determined. It would
> > be a pain to determine number of blocks for each range, so I'm looking
> > for a simple way to fix it without imposing so much overhead.
>
> Where are we on this --- do you want me to push the brin_doupdate
> fix I proposed, or were you intending to merge that into a
> larger patch?

Please push your fixes, I'll post my proposed patch for the other bug afterwards; they are unrelated problems after all. If you prefer me to push your fixes, I can do that -- let me know.

> If I'm to do it, is there a reason not to back-patch to all branches
> with BRIN?

No, all these fixes should go back to 9.5.
-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Alvaro Herrera writes:
>> Hmm, I'm pretty sure we stress-tested brin in pretty much the same way.
>> But I see this misbehavior too. Looking ...

> Turns out that this is related to concurrent growth of the table while
> the summarization process is scanning -- so new pages have appeared at
> the end of the table after the end point has been determined. It would
> be a pain to determine number of blocks for each range, so I'm looking
> for a simple way to fix it without imposing so much overhead.

Where are we on this --- do you want me to push the brin_doupdate fix I proposed, or were you intending to merge that into a larger patch?

If I'm to do it, is there a reason not to back-patch to all branches with BRIN?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Minor comment issue in receivelog.c
Please find a minor comment fix for receivelog.c, HandleCopyStream(). The comment talks about a START_STREAMING command, but I think START_REPLICATION is what's meant here.

Bernd

diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 888458f4a9..befbbb092b 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -763,7 +763,7 @@ ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos, uint32 *timeline)
 
 /*
  * The main loop of ReceiveXlogStream. Handles the COPY stream after
- * initiating streaming with the START_STREAMING command.
+ * initiating streaming with the START_REPLICATION command.
  *
  * If the COPY ends (not necessarily successfully) due a message from the
  * server, returns a PGresult and sets *stoppos to the last byte written.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Document pgstattuple privileges without ambiguity
On 8/21/17 03:47, Feike Steenbergen wrote: > When installing pgstattuple on 10, the documentation about its > privileges was unclear to me. (Does the pg_stat_scan_tables role get > EXECUTE privileges by default or not?). I agree that this has gotten a bit confusing after apparently being patched around a bit recently. I have rewritten it a bit to make it clearer. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: schema variables
2017-11-02 13:35 GMT+01:00 Robert Haas:
> On Thu, Oct 26, 2017 at 12:51 PM, Pavel Stehule wrote:
> > The variables can be modified by SQL command SET (this is taken from
> > standard, and it is natural)
> >
> > SET varname = expression;
>
> Overloading SET to handle both variables and GUCs seems likely to
> create problems, possibly including security problems. For example,
> maybe a security-definer function could leave behind variables to
> trick the calling code into failing to set GUCs that it intended to
> set. Or maybe creating a variable at the wrong time will just break
> things randomly.

The syntax CREATE OR REPLACE FUNCTION xxx $$ ... $$ SET GUC=, ... is always related only to GUCs, so there should not be any security risk there. It is another reason why GUCs and variables should be separated.

I know there is a risk of collision. There are two possibilities:

a) use a different keyword - but that is outside SQL/PSM and outside what other databases do.

b) detect a possible collision and raise an error when an assignment is ambiguous. I am thinking about a solution similar to the one used in plpgsql, where there is a possibility of collision between a SQL identifier and a plpgsql variable.

Regards

Pavel

> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: schema variables
Nico Williams writes:
> With access controls, GUCs could become schema variables, and settings
> from postgresql.conf could move into the database itself (which I think
> would be nice).

People re-propose some variant of that every so often, but it never works, because it ignores the fact that some of the GUCs' values are needed before you can access system catalogs at all, or in places where relying on system catalog access would be a bad idea. Sure, we could have two completely different configuration mechanisms so that some of the variables could be "inside the database", but that doesn't seem like a net improvement to me. The point of the Grand Unified Configuration mechanism was to be unified, after all.

I'm on board with having a totally different mechanism for session variables. The fact that people have been abusing GUC to store user-defined variables doesn't make it a good way to do that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
On 2017-11-02 10:38:34 -0400, Tom Lane wrote:
> David Rowley writes:
> > On 3 November 2017 at 03:17, Tom Lane wrote:
> >> We've jacked up the List API and driven a new implementation underneath
> >> once before. Maybe it's time to do that again.
>
> > Maybe, but the new implementation is not going to do well with places
> > where we perform lcons(). Probably many of those places could be
> > changed to lappend(), but I bet there's plenty that need prepend.
>
> [ shrug... ] To me, that means this implementation isn't necessarily
> the right solution.
>
> It seems to me that a whole lot of the complaints about this could be
> resolved simply by improving the List infrastructure to allocate ListCells
> in batches. That would address the question of "too much palloc traffic"
> and greatly improve the it-accesses-all-over-memory situation too.

It'll however not solve the problem that pointer-chained data structures have very poor access patterns.

> Possibly there are more aggressive changes that could be implemented
> without breaking too many places, but I think it would be useful to
> start there and see what it buys us.

I'm suspicious that that will result in causing pain twice, once doing the compromise thing, and once doing it right.

FWIW, I'd started pursuing converting a lot of executor uses of lists with arrays. But everywhere I did, particularly the uses inside the expression machinery, I noticed that larger changes were needed. For the expression stuff we needed a much larger architectural change (hence the stuff in 10 I did with Tom's help). The other big user of lists at the execution phase are targetlists - I think there the architectural change is also bigger: Rebuilding/analyzing all the targetlists, tupledescs etc at execution time is the architectural problem, not the use of lists itself (although that's also bad).
The amount of cycles spent in ExecTypeFromTL() and friends is partially caused by the use of lists, but the real problem is that we're doing all that work in the first place. So I personally think most uses of lists at execution time are architecturally wrong, and therefore optimizing their implementation is just a bandaid.

Where I think the arguments for a more optimized implementation are better is the parser, where we really don't know ahead of time the size of the data we're dealing with.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: schema variables
2017-11-02 16:07 GMT+01:00 Craig Ringer:
> On 26 October 2017 at 15:21, Pavel Stehule wrote:
> > Hi,
> >
> > I propose a new database object - a variable.
>
> Didn't we have a pretty long discussion about this already in
>
> Yeah.
>
> https://www.postgresql.org/message-id/flat/CAMsr%2BYF0G8_FehQyFS8gSfnEer9OPsMOvpfniDJOVGQzJzHzsw%40mail.gmail.com#CAMsr+YF0G8_fehqyfs8gsfneer9opsmovpfnidjovgqzjzh...@mail.gmail.com
>
> It'd be nice if you summarised any outcomes from that and addressed
> it, rather than taking this as a new topic.

I am sorry. This thread follows on from the one mentioned, and I started it with a small recapitulation.

Regards

Pavel

> -- 
> Craig Ringer                  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] proposal: schema variables
2017-11-02 16:35 GMT+01:00 Nico Williams:
> On Thu, Nov 02, 2017 at 06:05:54PM +0530, Robert Haas wrote:
> > On Thu, Oct 26, 2017 at 12:51 PM, Pavel Stehule wrote:
> > > The variables can be modified by SQL command SET (this is taken from
> > > standard, and it is natural)
> > >
> > > SET varname = expression;
> >
> > Overloading SET to handle both variables and GUCs seems likely to
> > create problems, possibly including security problems. For example,
> > maybe a security-definer function could leave behind variables to
> > trick the calling code into failing to set GUCs that it intended to
> > set. Or maybe creating a variable at the wrong time will just break
> > things randomly.
>
> That's already true of GUCs, since there are no access controls on
> set_config()/current_setting().
>
> Presumably "schema variables" would really just be GUC-like and not at
> all like lexically scoped variables. And also subject to access
> controls, thus an overall improvement on set_config()/current_setting().
>
> With access controls, GUCs could become schema variables, and settings
> from postgresql.conf could move into the database itself (which I think
> would be nice).

I am sorry, but I don't plan that. The behaviour of GUCs is too different from the behaviour of variables. But I am planning that system GUCs can be "moved" to pg_catalog, so that it is possible to specify any object exactly.

Regards

Pavel

> Nico
> --
Re: [HACKERS] proposal: schema variables
On Thu, Nov 02, 2017 at 06:05:54PM +0530, Robert Haas wrote:
> On Thu, Oct 26, 2017 at 12:51 PM, Pavel Stehule wrote:
> > The variables can be modified by SQL command SET (this is taken from
> > standard, and it is natural)
> >
> > SET varname = expression;
>
> Overloading SET to handle both variables and GUCs seems likely to
> create problems, possibly including security problems. For example,
> maybe a security-definer function could leave behind variables to
> trick the calling code into failing to set GUCs that it intended to
> set. Or maybe creating a variable at the wrong time will just break
> things randomly.

That's already true of GUCs, since there are no access controls on set_config()/current_setting().

Presumably "schema variables" would really just be GUC-like and not at all like lexically scoped variables. And also subject to access controls, thus an overall improvement on set_config()/current_setting().

With access controls, GUCs could become schema variables, and settings from postgresql.conf could move into the database itself (which I think would be nice).

Nico
--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Client Connection redirection support for PostgreSQL
On 2 November 2017 at 14:02, Satyanarayana Narlapuram wrote:
> Proposal:
>
> Add the ability to the PostgreSQL server instance to route the traffic to a
> different server instance based on the rules defined in the server's pg_hba.conf
> configuration file. At a high level this enables offloading the user
> requests to a different server instance based on the rules defined in the
> pg_hba.conf configuration file.

pg_hba.conf is "host based access [control]". I'm not sure it's really the right place.

> Some of the interesting scenarios this
> enables include but are not limited to - rerouting traffic based on the client
> hosts, users, database, etc. specified, redirecting read-only query traffic
> to the hot standby replicas, and in multi-master scenarios.

When this has come up before, one of the issues has been determining what exactly should constitute "read only" vs "read write" for the purposes of redirecting work. There are a bunch of issues there.

If you're doing "read only" txns and then do something "read write" and get redirected, the destination doesn't have your prepared statements, any WITH HOLD cursors, temp tables, etc you were working with. Strangeness ensues.

But we now have session-intent stuff though. So we could possibly do it at session level. Backends used just for a redirect would be pretty expensive though.
-- 
Craig Ringer                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: schema variables
On 26 October 2017 at 15:21, Pavel Stehule wrote:
> Hi,
>
> I propose a new database object - a variable.

Didn't we have a pretty long discussion about this already, in

https://www.postgresql.org/message-id/flat/CAMsr%2BYF0G8_FehQyFS8gSfnEer9OPsMOvpfniDJOVGQzJzHzsw%40mail.gmail.com#camsr+yf0g8_fehqyfs8gsfneer9opsmovpfnidjovgqzjzh...@mail.gmail.com

Yeah. It'd be nice if you summarised any outcomes from that and addressed it, rather than taking this as a new topic.
-- 
Craig Ringer                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Custom compression methods
On 2 November 2017 at 17:41, Ildus Kurbangaliev wrote:
> In this patch compression methods are suitable for MAIN and EXTENDED
> storage, like in the current implementation in postgres. Just instead of
> only LZ4 you can specify any other compression method.

We've had this discussion before. Please read the "pluggable compression support" thread. See you in a few days ;) sorry, it's kinda long.

https://www.postgresql.org/message-id/flat/20130621000900.GA12425%40alap2.anarazel.de#20130621000900.ga12...@alap2.anarazel.de

IIRC there were some concerns about what happened with pg_upgrade, with consuming precious toast bits, and a few other things.
-- 
Craig Ringer                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
On 3 November 2017 at 03:27, Stephen Frost wrote:
> * David Rowley (david.row...@2ndquadrant.com) wrote:
>> We could get around a few of these problems if Lists were implemented
>> internally as arrays, however, arrays are pretty bad if we want to
>> delete an item out the middle as we'd have to shuffle everything over
>> one spot. Linked lists are much better here... at least they are when
>> you already have the cell you want to remove... so we can't expect
>> that we can just rewrite List to use arrays instead.
>
> This actually depends on just how you delete the item out of the array,
> though there's a trade-off there. If you "delete" the item by marking
> it as "dead" then you don't need to re-shuffle everything, but, of
> course, then you have to make sure to 'skip' over those by running down
> the array for each list_nth() call- but here's the thing, with a small
> list that's all in a tightly packed array, that might not be too bad.
> Certainly far better than having to grab random memory on each step and
> I'm guessing you'd only actually store the "data" item for a given list
> member in the array if it's an integer/oid/etc list, not when it's
> actually a pointer being stored in the list, so you're always going to
> be working with pretty tightly packed arrays where scanning the list on
> the list_nth call really might be darn close to as fast as using an
> offset to the actual entry in the array, unless it's a pretty big list,
> but I don't think we've actually got lots of multi-hundred-deep lists.

I was hoping to have list_nth always be O(1) so that we'd never be tempted again to use arrays directly, like we are now for things like simple_rel_array[].

Probably it could be made to work quickly in most cases by tracking whether there are any deleted items and just doing the direct array lookups if nothing is deleted; if something is deleted, then visit index n minus bms_num_members() of some deleted bitmapset for the bits earlier than the Nth element.
bms_num_members() could be made faster with 64-bit bitmap words and __builtin_popcountl(), falling back on the number_of_ones[] array when that's not available. It would also require a bms_num_members_before() function.

In the implementation that I attached, I showed an example of how to delete items out of an array list, which I think is quite a bit cleaner than how we generally do it today with List (e.g. in reconsider_outer_join_clauses()). However, it'll still perform badly when just a single known item is being removed.
-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Andres Freund writes:
> > > Do we care about people upgrading to unreleased versions? We could do
> > > nothing, document it in the release notes, or ???
> >
> > Do nothing.
>
> Agreed. Not much we can do there.

Pushed the reverts.

I noticed while doing so that REL_10_STABLE contains the bogus commits. Does that change our opinion regarding what to do for people upgrading to a version containing the broken commits? I don't think so, because

1) we hope that not many people will trust their data to 10.0 immediately after release
2) the bug is very low probability
3) it doesn't look like we can do a lot about it anyway.

I'll experiment with Andres' proposed fix now.
-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
David Rowley writes:
>> It seems to me that a whole lot of the complaints about this could be
>> resolved simply by improving the List infrastructure to allocate ListCells
>> in batches. That would address the question of "too much palloc traffic"
>> and greatly improve the it-accesses-all-over-memory situation too.
>>
>> Possibly there are more aggressive changes that could be implemented
>> without breaking too many places, but I think it would be useful to
>> start there and see what it buys us.

> Yeah, certainly would be a good way of determining the worth of changing.

BTW, with just a little more work that could be made to address the list-nth-is-expensive problem too. I'm imagining a call that preallocates an empty list with room for N ListCells (or perhaps, if we need to preserve compatibility with NIL == NULL, creates a single-element list with room for N-1 more cells), plus some tracking in list.c of whether the list cells have been consumed in order. Given the typical use-case of building lists strictly with lappend, list_nth() could index directly into the ListCell slab as long as it saw the List header's "is_linear" flag was true.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
On 3 November 2017 at 03:38, Tom Lane wrote:
> David Rowley writes:
>> On 3 November 2017 at 03:17, Tom Lane wrote:
>>> We've jacked up the List API and driven a new implementation underneath
>>> once before. Maybe it's time to do that again.
>
>> Maybe, but the new implementation is not going to do well with places
>> where we perform lcons(). Probably many of those places could be
>> changed to lappend(), but I bet there's plenty that need prepend.
>
> [ shrug... ] To me, that means this implementation isn't necessarily
> the right solution.

True. Of course, it could be worked around, probably with some bit flags to record if lcons was called, and then consuming the elements in reverse. We might have to shuffle things along a bit for Lists that get both lcons and lappend calls.

The API break that I can't see a way around is:

/* does the list have 0 elements? */
if (list == NIL) ...

That would need to be rewritten as:

if (list_length(list) == 0) ...

> It seems to me that a whole lot of the complaints about this could be
> resolved simply by improving the List infrastructure to allocate ListCells
> in batches. That would address the question of "too much palloc traffic"
> and greatly improve the it-accesses-all-over-memory situation too.
>
> Possibly there are more aggressive changes that could be implemented
> without breaking too many places, but I think it would be useful to
> start there and see what it buys us.

Yeah, certainly would be a good way of determining the worth of changing.
-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
David Rowley writes: > On 3 November 2017 at 03:17, Tom Lane wrote: >> We've jacked up the List API and driven a new implementation underneath >> once before. Maybe it's time to do that again. > Maybe, but the new implementation is not going to do well with places > where we perform lcons(). Probably many of those places could be > changed to lappend(), but I bet there's plenty that need prepend. [ shrug... ] To me, that means this implementation isn't necessarily the right solution. It seems to me that a whole lot of the complaints about this could be resolved simply by improving the List infrastructure to allocate ListCells in batches. That would address the question of "too much palloc traffic" and greatly improve the it-accesses-all-over-memory situation too. Possibly there are more aggressive changes that could be implemented without breaking too many places, but I think it would be useful to start there and see what it buys us. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - use enum for meta commands
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed Hi, Looks good to me. The only thing I'm not quite sure about is a comment "which meta command ...". Maybe it's better to write it without a question word, something like "meta command identifier..."? The new status of this patch is: Ready for Committer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
David, * David Rowley (david.row...@2ndquadrant.com) wrote: > Our List implementation internally uses linked lists which are > certainly good for some things, but pretty bad at other things. Linked > lists are pretty bad when you want to fetch the Nth element, as it > means looping over each previous element until you get to the Nth. > They're also pretty bad for a modern CPU due to their memory access > pattern. I can certainly understand this complaint. > We could get around a few of these problems if Lists were implemented > internally as arrays, however, arrays are pretty bad if we want to > delete an item out of the middle as we'd have to shuffle everything over > one spot. Linked lists are much better here... at least they are when > you already have the cell you want to remove... so we can't expect > that we can just rewrite List to use arrays instead. This actually depends on just how you delete the item out of the array, though there's a trade-off there. If you "delete" the item by marking it as "dead" then you don't need to re-shuffle everything, but, of course, then you have to make sure to 'skip' over those by running down the array for each list_nth() call - but here's the thing: with a small list that's all in a tightly packed array, that might not be too bad. Certainly far better than having to grab random memory on each step. I'm guessing you'd only actually store the "data" item for a given list member in the array if it's an integer/oid/etc list, not when it's actually a pointer being stored in the list, so you're always going to be working with pretty tightly packed arrays, where scanning the list on the list_nth call really might be darn close to as fast as using an offset to the actual entry in the array - unless it's a pretty big list, but I don't think we've actually got lots of multi-hundred-deep lists. Of course, you'd have to watch out for cases where there's a lot of deletes happening, since that would effectively make this list longer. 
Anyway, just something to consider. Having to actually store the "delete" flag would have some cost to it also, unless you happen to already be storing something per entry and have an extra bit to spare. Others who are more versed in CPU magics than I am are certainly welcome to comment on if they think this makes sense to consider, I'm no CPU architecture guru. > Anyway, please don't debate the usages of the new type here. As for > all the above plans, I admit to not having a full handle on them yet. +1. Thanks! Stephen
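The mark-dead-and-skip approach described above can be sketched in a few lines of C. This is a toy with invented names, not proposed code; it just shows the trade: deletion becomes O(1) with no memmove(), at the cost of an nth-element scan that has to step over tombstones in the (still tightly packed) array.

```c
#include <stdbool.h>
#include <assert.h>

#define VEC_MAX 8

/* Array list where deletion only sets a tombstone flag; nth-element
 * access scans the packed array and skips dead slots. */
typedef struct
{
	int			nused;			/* slots handed out, live or dead */
	int			ndead;			/* tombstoned slots */
	int			items[VEC_MAX];
	bool		dead[VEC_MAX];
} VecList;

static void
vec_append(VecList *v, int item)
{
	v->items[v->nused] = item;
	v->dead[v->nused] = false;
	v->nused++;
}

static void
vec_delete(VecList *v, int slot)
{
	/* no shuffling; just mark the slot dead */
	v->dead[slot] = true;
	v->ndead++;
}

static int
vec_nth(const VecList *v, int n)
{
	/* linear scan over a cache-friendly array, skipping tombstones */
	for (int i = 0; i < v->nused; i++)
		if (!v->dead[i] && n-- == 0)
			return v->items[i];
	return -1;					/* out of range */
}
```

As the email notes, heavy deletion makes the scan effectively longer, so such a structure would want occasional compaction once ndead grows.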
Re: [HACKERS] ArrayLists instead of List (for some things)
On 2 November 2017 at 22:22, David Rowley wrote: > On 3 November 2017 at 03:17, Tom Lane wrote: >> We've jacked up the List API and driven a new implementation underneath >> once before. Maybe it's time to do that again. > > Maybe, but the new implementation is not going to do well with places > where we perform lcons(). Probably many of those places could be > changed to lappend(), but I bet there's plenty that need prepend. Yeah, and it's IMO significant that pretty much every nontrivial system with ADTs offers multiple implementations of list data structures, often wrapped with a common API. Java's Collections, the STL, you name it. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
On 2 November 2017 at 22:17, Tom Lane wrote: > David Rowley writes: >> Comments on the design are welcome, but I was too late to the >> commitfest, so there are other priorities. However, if you have a >> strong opinion, feel free to voice it. > > I do not like replacing Lists piecemeal; that's a recipe for ongoing > API breakage and back-patching pain. Plus we'll then have *four* > different linked-list implementations in the backend, which sure > seems like too many. > > We've jacked up the List API and driven a new implementation underneath > once before. Maybe it's time to do that again. I know some systems use hybrid linked array-lists, where linked list cells are multi-element. https://en.wikipedia.org/wiki/Unrolled_linked_list I don't have much experience with them myself. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
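For reference, the unrolled-linked-list shape mentioned above looks roughly like this (a toy sketch with invented names, not code anyone has proposed here): each cell holds several elements, so an nth lookup hops whole chunks and touches far fewer cache lines than a one-datum-per-cell list, while appends and whole-chunk traversal stay cheap.

```c
#include <stdlib.h>
#include <assert.h>

#define CHUNK_CAP 4				/* elements per cell; real code would tune this */

typedef struct Chunk
{
	int			n;				/* elements used in this chunk */
	int			vals[CHUNK_CAP];
	struct Chunk *next;
} Chunk;

typedef struct
{
	Chunk	   *head;
	Chunk	   *tail;
	int			length;
} UList;

static void
ul_append(UList *l, int v)
{
	if (l->tail == NULL || l->tail->n == CHUNK_CAP)
	{
		Chunk	   *c = calloc(1, sizeof(Chunk));

		if (l->tail)
			l->tail->next = c;
		else
			l->head = c;
		l->tail = c;
	}
	l->tail->vals[l->tail->n++] = v;
	l->length++;
}

static int
ul_nth(const UList *l, int n)
{
	Chunk	   *c = l->head;

	/* skip whole chunks at a time; only the tail chunk is partial */
	while (n >= c->n)
	{
		n -= c->n;
		c = c->next;
	}
	return c->vals[n];
}
```

Mid-list insertion and deletion are where unrolled lists get fiddly (splitting and merging chunks), which is presumably why nobody is claiming they're a drop-in List replacement.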
Re: [HACKERS] ArrayLists instead of List (for some things)
On 3 November 2017 at 03:17, Tom Lane wrote: > We've jacked up the List API and driven a new implementation underneath > once before. Maybe it's time to do that again. Maybe, but the new implementation is not going to do well with places where we perform lcons(). Probably many of those places could be changed to lappend(), but I bet there's plenty that need prepend. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ArrayLists instead of List (for some things)
David Rowley writes: > Comments on the design are welcome, but I was too late to the > commitfest, so there are other priorities. However, if you have a > strong opinion, feel free to voice it. I do not like replacing Lists piecemeal; that's a recipe for ongoing API breakage and back-patching pain. Plus we'll then have *four* different linked-list implementations in the backend, which sure seems like too many. We've jacked up the List API and driven a new implementation underneath once before. Maybe it's time to do that again. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Andres Freund writes: > > Do we care about people upgrading to unreleased versions? We could do > > nothing, document it in the release notes, or ??? > > Do nothing. Agreed. Not much we can do there. Thanks! Stephen
[HACKERS] ArrayLists instead of List (for some things)
Hackers, Our List implementation internally uses linked lists which are certainly good for some things, but pretty bad at other things. Linked lists are pretty bad when you want to fetch the Nth element, as it means looping over each previous element until you get to the Nth. They're also pretty bad for a modern CPU due to their memory access pattern. We could get around a few of these problems if Lists were implemented internally as arrays, however, arrays are pretty bad if we want to delete an item out of the middle as we'd have to shuffle everything over one spot. Linked lists are much better here... at least they are when you already have the cell you want to remove... so we can't expect that we can just rewrite List to use arrays instead. I've attached a first cut implementation of "AList". I was originally going to call it "ArrayList", but my fingers were getting tired, so I changed it. This first cut version replaces nothing in the code base to use ALists. This email (and the timing of it) is more of a marker that this is being worked on. I know in particular that Andres has complained at least once about List usages in the executor, which I agree is not great. Looking at the usage of list_nth() is quite scary! My plans for this include: * Make targetlists from createplan onwards use ALists. Every time I look at build_path_tlist() I gawk at the number of palloc calls that are going on inside those lappend() calls. We already know how many items will be in the list, so with AList we could just alist_premake(list_length(path->pathtarget->exprs)) and we'd never have to palloc() another element for that list again. We do the same again in various mutator functions in setrefs.c too! * list_nth usage in adjust_appendrel_attrs_mutator() to speed up translation between parent and child Vars * And also, umm, simple_rte_array and simple_rel_array. 
Well, we'd still need somewhere to store all the RelOptInfos, but the idea is that it would be rather nice if we could avoid expanding the entire inheritance hierarchy of a relation, and it would be rather nice if we could just add the partitions that we need, rather than adding them all and setting dummy paths for the ones we're not interested in. Anyway, please don't debate the usages of the new type here. As for all the above plans, I admit to not having a full handle on them yet. I wrote the Array List implementation so I could get a better idea on how possible each of those would be, so, for now, to prevent any duplicate work, here it is... (Including Andres because I know he's mentioned his dislike for List in the executor) Comments on the design are welcome, but I was too late to the commitfest, so there are other priorities. However, if you have a strong opinion, feel free to voice it. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services 0001-Basic-implementation-of-array-lists-AList.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
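To make the alist_premake() idea concrete, here is a minimal guess at what such an API might look like. This is not taken from the attached patch - every name and detail below is invented from the description in the email, and a real version would store Datums/pointers and grow with repalloc in a memory context rather than realloc.

```c
#include <stdlib.h>
#include <assert.h>

/* Guessed AList shape: a plain growable array, optionally preallocated
 * to its final size so that building a known-length list never
 * reallocates and never allocates per element. */
typedef struct AList
{
	int			length;
	int			capacity;
	int		   *elements;
} AList;

static AList *
alist_premake(int capacity)
{
	AList	   *a = malloc(sizeof(AList));

	a->length = 0;
	a->capacity = capacity;
	a->elements = malloc(capacity * sizeof(int));
	return a;
}

static void
alist_append(AList *a, int v)
{
	if (a->length == a->capacity)
	{
		/* only reached when the caller's size estimate was wrong */
		a->capacity *= 2;
		a->elements = realloc(a->elements, a->capacity * sizeof(int));
	}
	a->elements[a->length++] = v;
}

/* the whole point: nth element is a plain array index */
static int
alist_nth(const AList *a, int n)
{
	return a->elements[n];
}
```

Under this shape, the build_path_tlist() pattern - premake to list_length(path->pathtarget->exprs), then append in a loop - does exactly one allocation for the whole target list.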
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Andres Freund writes: > Do we care about people upgrading to unreleased versions? We could do > nothing, document it in the release notes, or ??? Do nothing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Client Connection redirection support for PostgreSQL
On Thu, Nov 02, 2017 at 06:02:43AM +, Satyanarayana Narlapuram wrote: > Proposal: > Add the ability to the PostgreSQL server instance to route the > traffic to a different server instance based on the rules defined in > the server's pg_hba.conf configuration file. At a high level this > enables offloading the user requests to a different server instance > based on the rules defined in the pg_hba.conf configuration file. > Some of the interesting scenarios this enables include, but are not > limited to: rerouting traffic based on the client hosts, users, > database, etc. specified, redirecting read-only query traffic to the > hot standby replicas, and multi-master scenarios. What advantages do you see in doing this in the backend over the current system where the concerns are separated, i.e. people use connection poolers like pgbouncer to do the routing? > The rules to route the traffic will be provided in the pg_hba.conf > file. The proposal is to add a new optional field 'RoutingList' to > the record format. The RoutingList contains a comma-separated list of > one or more servers that the traffic can be routed to. In the > absence of this new field there is no change to the current login > code path for both the server and the client. RoutingList can be > updated for each new connection to balance the load across multiple > server instances > RoutingList format: > server_address1:port, server_address2:port... Would it make sense also to include an optional routing algorithm or pointer to a routing function for each RoutingList, or do you see this as entirely the client's responsibility? > The message flow > > 1. Client connects to the server, and server accepts the connections How does this work with SSL? > 2. Client sends the startup message > 3. Server looks at the rules configured in the pg_hba.conf file and > * If the rule matches redirection >i. Send a special message with the RoutingList described above >ii. 
Server disconnects > > * If the rule doesn't have RoutingList defined > >i. Server proceeds in the existing code path and sends auth request > > 1. Client gets the list of addresses and attempts to connect to a > server in the list provided until the first successful connections > is established or the list is exhausted. If the client can't > connect to any server instance on the RoutingList, client reports > the login failure message. > > Backward compatibility: > There are a few ways to provide the backward compatibility, and each > approach has their own advantages and disadvantage and are listed > below > > 1. Bumping the protocol version - old server instances may not > understand the new client protocol This sounds more attractive, assuming that the feature is. > 2. Adding additional optional parameter routing_enabled, without > bumping the protocol version. In this approach, old Postgres > server instances may not understand this and fail the connections. Silently changing an API without bumping the protocol version seems like a bad plan, even when it's an additive change as you propose here. > 3. The current proposal - to keep it in the hba.conf and let the > server admin deal with the configuration by taking conscious > choice on the configuration of routing list based on the clients > connecting to the server instance. How would clients identify themselves as eligible without a protocol version bump? > Backward compatibility scenarios: > > * The feature is not usable for the existing clients, and the > new servers shouldn't set the routing list if they expect any > connections from the legacy clients. We should do either (1) or > (2) in the above list to achieve this. Otherwise need to rely on > the admin to take care of the settings. > * For the new client connecting to the old server, there is no > change in the message flow So to DoS the server, what's required is a flock of old clients? 
I presume there's a good reason to reroute rather than serve these requests. > * For the new clients to the new server, the message flow will be based > on the routing list field in the configuration. > This proposal is in a very early stage; comments and feedback are very much > appreciated. Comments and feedback have begun. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add two-arg for of current_setting(NAME, FALLBACK)
On 11/1/17 14:02, Nico Williams wrote: > There already _is_ a two-argument form of current_setting() that yours > somewhat conflicts with: > >current_setting(setting_name [, missing_ok ]) > > https://www.postgresql.org/docs/current/static/functions-admin.html > > I often use > > coalesce(current_setting(setting_name, true), default_value_here) > > as an implementation of current_setting() with a default value. That appears to address this use case then. Do we need the new proposed variant still? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] list of credits for release notes
On 10/3/17 12:31, Dang Minh Huong wrote: > Sorry but please change "Huong Dangminh"(my name) to "Dang Minh Huong". done -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Hi, On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote: > Andres Freund wrote: > > I think the problem is on the pruning, rather than the freezing side. We > > can't freeze a tuple if it has an alive predecessor - rather than > > weakening this, we should be fixing the pruning to not have the alive > > predecessor. > > I gave a look at HTSV back then, but I didn't find what the right tweak > was, but then I only tried changing the return value to DEAD and > DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based > on OldestXmin didn't occur to me ... I was thinking that the fact that > there were live lockers meant that the tuple could not be removed, > obviously failing to notice that the subsequent versions of the tuple > would be good enough. I'll try to write up a commit based on that idea. I think there's some comment work needed too, Robert and I were both confused by a few things. I'm unfortunately travelling atm - it's evening here, and I'll be flying back to the US all Saturday. I'm fairly sure I'll be able to come up with a decent patch tomorrow, but I'll need review and testing by others. > > If the update xmin is actually below the cutoff we can remove the tuple > > even if there's live lockers - the lockers will also be present in the > > newer version of the tuple. I verified that for me that fixes the > > problem. Obviously that'd require some comment work and more careful > > diagnosis. > > Sounds good. > > I'll revert those commits then, keeping the test, and then you can > commit your change. OK? Generally that sounds good - but you can't keep the testcase in without the new fix - the buildfarm would immediately turn red. I guess the best thing would be to temporarily remove it from the schedule? Do we care about people upgrading to unreleased versions? We could do nothing, document it in the release notes, or ??? 
Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Andres Freund wrote: > I spent some time discussing this with Robert today (with both of us > alternating between feeling the other and ourselves as stupid), and the > conclusion I think is that the problem is on the pruning, rather than > the freezing side. Thanks both for spending some more time on this. > I think the problem is on the pruning, rather than the freezing side. We > can't freeze a tuple if it has an alive predecessor - rather than > weakening this, we should be fixing the pruning to not have the alive > predecessor. I gave a look at HTSV back then, but I didn't find what the right tweak was, but then I only tried changing the return value to DEAD and DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based on OldestXmin didn't occur to me ... I was thinking that the fact that there were live lockers meant that the tuple could not be removed, obviously failing to notice that the subsequent versions of the tuple would be good enough. > If the update xmin is actually below the cutoff we can remove the tuple > even if there's live lockers - the lockers will also be present in the > newer version of the tuple. I verified that for me that fixes the > problem. Obviously that'd require some comment work and more careful > diagnosis. Sounds good. I'll revert those commits then, keeping the test, and then you can commit your change. OK? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: schema variables
On Thu, Oct 26, 2017 at 12:51 PM, Pavel Stehule wrote: > The variables can be modified by SQL command SET (this is taken from > the standard, and it's natural) > > SET varname = expression; Overloading SET to handle both variables and GUCs seems likely to create problems, possibly including security problems. For example, maybe a security-definer function could leave behind variables to trick the calling code into failing to set GUCs that it intended to set. Or maybe creating a variable at the wrong time will just break things randomly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 2, 2017 at 4:50 PM, Andres Freund wrote: > I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in > the face of previously corrupted tuple chains and pg_upgraded clusters - > it can lead to tuples being considered related, even though they're > from entirely independent hot chains. Especially when upgrading 9.3 post > your fix, to current releases. I think this is a key point. If the new behavior were merely not entirely correct, we could perhaps refine it later. But it's not only not correct - it actually has the potential to create new problems that didn't exist before those commits. And if we release without reverting those commits then we can't change our mind later. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Client Connection redirection support for PostgreSQL
Proposal: Add the ability to the PostgreSQL server instance to route the traffic to a different server instance based on the rules defined in the server's pg_hba.conf configuration file. At a high level this enables offloading the user requests to a different server instance based on the rules defined in the pg_hba.conf configuration file. Some of the interesting scenarios this enables include, but are not limited to: rerouting traffic based on the client hosts, users, database, etc. specified, redirecting read-only query traffic to the hot standby replicas, and multi-master scenarios. The rules to route the traffic will be provided in the pg_hba.conf file. The proposal is to add a new optional field 'RoutingList' to the record format. The RoutingList contains a comma-separated list of one or more servers that the traffic can be routed to. In the absence of this new field there is no change to the current login code path for both the server and the client. RoutingList can be updated for each new connection to balance the load across multiple server instances. RoutingList format: server_address1:port, server_address2:port... The message flow 1. Client connects to the server, and server accepts the connections 2. Client sends the startup message 3. Server looks at the rules configured in the pg_hba.conf file and * If the rule matches redirection i. Send a special message with the RoutingList described above ii. Server disconnects * If the rule doesn't have RoutingList defined i. Server proceeds in the existing code path and sends auth request 4. Client gets the list of addresses and attempts to connect to a server in the list provided until the first successful connection is established or the list is exhausted. If the client can't connect to any server instance on the RoutingList, the client reports the login failure message. 
Backward compatibility: There are a few ways to provide the backward compatibility, and each approach has its own advantages and disadvantages, listed below 1. Bumping the protocol version - old server instances may not understand the new client protocol 2. Adding an additional optional parameter routing_enabled, without bumping the protocol version. In this approach, old Postgres server instances may not understand this and fail the connections. 3. The current proposal - to keep it in the hba.conf and let the server admin deal with the configuration by making a conscious choice on the configuration of the routing list based on the clients connecting to the server instance. Backward compatibility scenarios: * The feature is not usable for the existing clients, and the new servers shouldn't set the routing list if they expect any connections from the legacy clients. We should do either (1) or (2) in the above list to achieve this. Otherwise we need to rely on the admin to take care of the settings. * For the new client connecting to the old server, there is no change in the message flow * For the new clients to the new server, the message flow will be based on the routing list field in the configuration. This proposal is in a very early stage; comments and feedback are very much appreciated. Thanks, Satya
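For illustration only, a pg_hba.conf line carrying the proposed optional RoutingList field might look like the following. The field name, placement, and syntax are all hypothetical - nothing like this exists in current PostgreSQL, and the host names shown are placeholders:

    # TYPE  DATABASE  USER     ADDRESS      METHOD  [ROUTINGLIST]
    host    all       ro_user  10.0.0.0/8   md5     standby1.example.com:5432,standby2.example.com:5432

Under the proposal, a startup packet matching such a line would receive the routing list and be disconnected, and the client would then retry against each listed server in order.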
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Hi, On 2017-09-28 14:47:53 +, Alvaro Herrera wrote: > Fix freezing of a dead HOT-updated tuple > > Vacuum calls page-level HOT prune to remove dead HOT tuples before doing > liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples. But > concurrent transaction commit/abort may turn DEAD some of the HOT tuples > that survived the prune, before HeapTupleSatisfiesVacuum tests them. > This happens to activate the code that decides to freeze the tuple ... > which resuscitates it, duplicating data. > > (This is especially bad if there's any unique constraints, because those > are now internally violated due to the duplicate entries, though you > won't know until you try to REINDEX or dump/restore the table.) > > One possible fix would be to simply skip doing anything to the tuple, > and hope that the next HOT prune would remove it. But there is a > problem: if the tuple is older than freeze horizon, this would leave an > unfrozen XID behind, and if no HOT prune happens to clean it up before > the containing pg_clog segment is truncated away, it'd later cause an > error when the XID is looked up. > > Fix the problem by having the tuple freezing routines cope with the > situation: don't freeze the tuple (and keep it dead). In the cases that > the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag > so that there is no need to look up the XID in pg_clog later on. I think this is the wrong fix - the assumption that ctid chains can be validated based on the prev-xmax = cur-xmin is fairly ingrained into the system, and we shouldn't just be breaking it. The need to later lobotomize the checks, in a5736bf754, is some evidence of that. I spent some time discussing this with Robert today (with both of us alternating between feeling the other and ourselves as stupid), and the conclusion I think is that the problem is on the pruning, rather than the freezing side. 
FWIW, I don't think the explanation in the commit message of how the problem triggers is actually correct - the testcase you added doesn't have any transactions concurrently committing / aborting when crashing. Rather the problem is that the liveness checks for freezing are different from the ones for pruning. HTSV considers xmin RECENTLY_DEAD when there's a multi xmax with at least one alive locker, whereas pruning thinks it has to do something because there's the member xid below the cutoff. No concurrent activity is needed to trigger that. I think the problem is on the pruning, rather than the freezing side. We can't freeze a tuple if it has an alive predecessor - rather than weakening this, we should be fixing the pruning to not have the alive predecessor. The relevant logic in HTSV is:

    if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
    {
        TransactionId xmax;

        if (MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
        {
            /* already checked above */
            Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));

            xmax = HeapTupleGetUpdateXid(tuple);

            /* not LOCKED_ONLY, so it has to have an xmax */
            Assert(TransactionIdIsValid(xmax));

            if (TransactionIdIsInProgress(xmax))
                return HEAPTUPLE_DELETE_IN_PROGRESS;
            else if (TransactionIdDidCommit(xmax))
                /* there are still lockers around -- can't return DEAD here */
                return HEAPTUPLE_RECENTLY_DEAD;
            /* updating transaction aborted */
            return HEAPTUPLE_LIVE;

with the problematic branch being the TransactionIdDidCommit() case. Rather than unconditionally returning HEAPTUPLE_RECENTLY_DEAD, we should test the update xid against OldestXmin and return DEAD / RECENTLY_DEAD according to that. If the update xmin is actually below the cutoff we can remove the tuple even if there's live lockers - the lockers will also be present in the newer version of the tuple. I verified that for me that fixes the problem. Obviously that'd require some comment work and more careful diagnosis. 
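The proposed change - deciding DEAD vs RECENTLY_DEAD for a committed updater by comparing the update xid against OldestXmin - can be modeled in isolation like this. This is a toy with stand-in types, not the actual HeapTupleSatisfiesVacuum() code; in particular the real TransactionIdPrecedes() handles xid wraparound, which this stub ignores.

```c
#include <stdbool.h>
#include <assert.h>

/* Stand-ins for backend infrastructure (transam.h in the real tree). */
typedef unsigned int TransactionId;

static bool
TransactionIdPrecedes(TransactionId a, TransactionId b)
{
	return a < b;				/* ignores wraparound, unlike the real one */
}

typedef enum
{
	HEAPTUPLE_DEAD,
	HEAPTUPLE_RECENTLY_DEAD
} HTSV_Result;

/*
 * Model of the proposed replacement for the TransactionIdDidCommit(xmax)
 * branch quoted above: instead of unconditionally returning
 * RECENTLY_DEAD because lockers are still alive, compare the committed
 * update xid against OldestXmin.  If the updater is old enough, the
 * tuple version can be pruned - the live lockers are also present in
 * the newer version of the tuple.
 */
static HTSV_Result
committed_updater_result(TransactionId update_xid, TransactionId OldestXmin)
{
	if (TransactionIdPrecedes(update_xid, OldestXmin))
		return HEAPTUPLE_DEAD;
	return HEAPTUPLE_RECENTLY_DEAD;
}
```

This captures only the decision rule; the real fix also needs the comment work and careful diagnosis mentioned above.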
I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in the face of previously corrupted tuple chains and pg_upgraded clusters - it can lead to tuples being considered related, even though they're from entirely independent hot chains. Especially when upgrading 9.3 post your fix, to current releases. In short, I think the two commits should be reverted, and replaced with a fix along what I'm outlining above. There'll be some trouble for people that upgraded to an unreleased version, but I don't really see what we could do about that. I could be entirely wrong - I've been travelling for the last two weeks and my brain is somewhat more fried than usual. Regards, Andres -- Sent via pgsql-hackers mailing list