Re: [HACKERS] UPDATE of partition key

2017-06-18 Thread Amit Khandekar
When I tested partition-key-update on a partitioned table having no
child partitions, it crashed. This is because there is an
Assert(mtstate->mt_num_partitions > 0) for creating the
partition-to-root map, which fails if there are no partitions under
the partitioned table. Actually we should skip creating this map if
there are no partitions under the partitioned table on which UPDATE is
run. So the attached patch has this new change to fix it (and an
appropriate additional test case added):

--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2006,15 +2006,14 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 * descriptor of a source partition does not match the root partition
 * descriptor. In such case we need to convert tuples to the
root partition
 * tuple descriptor, because the search for destination partition starts
-* from the root.
+* from the root. Skip this setup if it's not a partition key update or
+* if there are no partitions below this partitioned table.
 */
-   if (is_partitionkey_update)
+   if (is_partitionkey_update && mtstate->mt_num_partitions > 0)
{
TupleConversionMap **tup_conv_maps;
TupleDesc   outdesc;

-   Assert(mtstate->mt_num_partitions > 0);
-
mtstate->mt_resultrel_maps =
(TupleConversionMap **)
palloc0(sizeof(TupleConversionMap*) * nplans);
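In spirit, the guard behaves as follows. This is a self-contained sketch with hypothetical stand-in types (ModifyTableStateSketch is not the real executor structure, just an illustration of the skip-when-empty logic):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the executor state fields discussed above. */
typedef struct ModifyTableStateSketch
{
	int    mt_num_partitions;	/* leaf partitions under the target table */
	void **mt_resultrel_maps;	/* per-partition conversion maps, or NULL */
} ModifyTableStateSketch;

/*
 * Allocate the partition-to-root map array only for a partition-key
 * UPDATE *and* only when there is at least one leaf partition; with
 * zero partitions there is nothing to map, so the allocation is
 * skipped instead of asserting mt_num_partitions > 0.
 */
static void
init_resultrel_maps(ModifyTableStateSketch *mtstate, int is_partitionkey_update)
{
	if (is_partitionkey_update && mtstate->mt_num_partitions > 0)
		mtstate->mt_resultrel_maps =
			calloc(mtstate->mt_num_partitions, sizeof(void *));
	else
		mtstate->mt_resultrel_maps = NULL;
}
```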

On 15 June 2017 at 23:06, Amit Khandekar  wrote:
> On 13 June 2017 at 15:40, Amit Khandekar  wrote:
>> While rebasing my patch for the below recent commit, I realized that a
>> similar issue exists for the update-tuple-routing patch as well:
>>
>> commit 78a030a441966d91bc7e932ef84da39c3ea7d970
>> Author: Tom Lane 
>> Date:   Mon Jun 12 23:29:44 2017 -0400
>>
>> Fix confusion about number of subplans in partitioned INSERT setup.
>>
>> The above issue was about incorrectly using 'i' in
>> mtstate->mt_plans[i] while handling WITH CHECK OPTIONS in
>> ExecInitModifyTable(), where 'i' was actually meant to index
>> mtstate->mt_partitions[] (whose length is mtstate->mt_num_partitions).
>> Actually, for INSERT there is only a single plan element in the
>> mtstate->mt_plans[] array.
>>
>> Similarly, for update-tuple routing, we cannot use
>> mtstate->mt_plans[i], because 'i' refers to a position in
>> mtstate->mt_partitions[], whereas mtstate->mt_plans is not at all in
>> the order of mtstate->mt_partitions; in fact mt_plans contains only
>> the plans for the partitions that remain after pruning, so it can
>> well be a small subset of the total partitions.
>>
>> I am working on an updated patch to fix the above.
>
> Attached patch v10 fixes the above. The existing code builds WCO
> constraints for each leaf partition; with the patch, that code is now
> applicable to row-movement updates as well, so the assertions in the
> code are updated to allow that. Secondly, the mapping for each of the
> leaf partitions was constructed using the root partition's attributes.
> Now in the patch, mtstate->resultRelInfo[0] (i.e. the first
> resultRelInfo) is used as the reference. So effectively,
> map_partition_varattnos() now represents not just a parent-to-partition
> mapping, but a mapping between any two partitions/partitioned tables.
> It's done this way so that we can have common WCO-building code for
> inserts as well as updates. For example, for inserts the first (and
> only) WCO belongs to node->nominalRelation, so nominalRelation is used
> for map_partition_varattnos(), whereas for updates the first WCO
> belongs to the first resultRelInfo, which is not the same as
> nominalRelation. So in the patch, in both cases, we use the first
> resultRelInfo and its WCO for map_partition_varattnos().
>
> A similar thing is done for the RETURNING expressions.
>
> -
>
> Another change in the patch: for ExecInitQual() on the WCO quals,
> mtstate->ps is used as the parent, rather than the first plan. For
> updates, the first plan does not belong to the parent partition. In
> fact, I think in all cases we should use mtstate->ps as the parent;
> mtstate->mt_plans[0] doesn't look like it should be considered the
> parent of these expressions. Maybe it does not matter which parent we
> link these quals to, because there is no ReScan for ExecModifyTable().
>
> Note that for RETURNING projection expressions, we do use mtstate->ps.
>
> 
>
> There is another issue I discovered. Row movement works fine if the
> destination leaf partition has a different attribute ordering than the
> root: the existing insert-tuple-routing mapping handles that. But if
> the source partition has a different ordering w.r.t. the root, there
> is a problem: there is no mapping in the opposite direction, i.e. from
> the leaf to the root. And we require that because the tuple of the
> source leaf partition needs to be 
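The missing leaf-to-root direction amounts to remapping attribute numbers from the leaf's column order back into the root's. A minimal self-contained illustration (plain ints stand in for tuple datums; attmap is a hypothetical name, not the real executor structure):

```c
#include <assert.h>

#define NATTS 3

/*
 * Convert a tuple's values from a leaf partition's column order into
 * the root's order: attmap[i] gives the leaf position of the root's
 * i-th attribute.  This is the leaf-to-root analogue of the existing
 * root-to-leaf conversion used by insert tuple routing.
 */
static void
convert_leaf_to_root(const int *leaf_values, const int *attmap, int *root_values)
{
	for (int i = 0; i < NATTS; i++)
		root_values[i] = leaf_values[attmap[i]];
}
```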

Re: [HACKERS] Getting server crash on Windows when using ICU collation

2017-06-18 Thread Ashutosh Sharma
Hi,

On Sun, Jun 18, 2017 at 6:39 PM, Amit Kapila  wrote:
> On Sat, Jun 17, 2017 at 8:27 PM, Ashutosh Sharma  
> wrote:
>> Hi,
>>
>> On Sat, Jun 17, 2017 at 6:38 PM, Peter Eisentraut
>>  wrote:
>>> On 6/16/17 23:46, Amit Kapila wrote:
 I have just posted one way
 to determine if icu library has support for ucol_strcollUTF8, see if
 that sounds like a way forward to you.
>>>
>>> I'm not in a position to test such patches, so someone else will have to
>>> take that on.
>>
>
> I can verify the patch on Win7 setup to which I have access if that helps.
>
>> Well, I have tested it from my end. I basically tried to test it by
>> running the multi-byte tests as Amit suggested and by verifying things
>> manually through the debugger. Also, I had mistakenly attached the
>> wrong patch in my earlier email; here I attach the correct patch.
>>
>
> Is it possible for you to once verify this patch with icu library
> version where ucol_strcollUTF8 is not defined?

Okay, I have verified the attached patch with following ICU versions
and didn't find any problem

1) ICU 4.8.1.1
2) ICU 49.1.2
3) ICU 50.1.2
4) ICU 53.1

The first two ICU versions, 'ICU 4.8.1.1' and 'ICU 49.1.2', don't have
'ucol_strcollUTF8' defined, whereas the last two, 'ICU 50.1.2' and
'ICU 53.1', include 'ucol_strcollUTF8'.
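The decision the build script makes boils down to comparing the major version against 50, the release in which ucol_strcollUTF8 appeared. A hedged sketch of that check in C (the Perl script parses the output of `uconv -V` similarly):

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Return true if this ICU version string names a release that provides
 * ucol_strcollUTF8 (added in ICU 50).  Only the leading major-version
 * number matters, so strtol's "parse until non-digit" behaviour is
 * exactly what we want for strings like "4.8.1.1" or "50.1.2".
 */
static int
icu_has_strcoll_utf8(const char *version)
{
	return strtol(version, NULL, 10) >= 50;
}
```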

>
> + # get the icu version.
> + my $output = `uconv -V /? 2>&1`;
> + $? >> 8 == 0 or die "uconv command not found";
>
> If we don't find uconv, isn't it better to fall back to the non-UTF8
> version rather than saying the command was not found?

Well, if any ICU package is installed on the system then we will
certainly find the uconv command. The only case where we can see such
an error is when the user doesn't have any ICU package installed and
is somehow trying to perform an ICU-enabled build, in which case the
build configuration has to fail, which I think is the expected
behaviour. Anyway, it is not uconv that decides whether we should fall
back to non-UTF8; it's the ICU version that decides whether to stick
with UTF8 or fall back to the non-UTF8 version. Thanks.

>
> + print $o "#define HAVE_UCOL_STRCOLLUTF8 1\n"
> + if ($self->{ICUVersion} >= 50);
>
> Indentation looks slightly odd, see the other usage in the same file.

I have corrected it. Please have a look into the attached patch. Thanks.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


detect_ICU_version_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PATCH: Making constant StringInfo

2017-06-18 Thread Craig Ringer
Hi all

TL;DR: formalize using a StringInfo over an existing buffer as a cursor
for the pq_getmsg functions.

In src/backend/replication/logical/worker.c we intern a pre-existing
string in a StringInfo so it can be used with the pq_getmsg functions
etc.

StringInfoData s;



s.data = buf;
s.len = len;
s.cursor = 0;
s.maxlen = -1;


and this strikes me as something that stringinfo.h should expose and
bless, so we don't have to fiddle with the guts of a StringInfo
directly.

Reasonable?

void
initConstantStringInfo(StringInfo str, const char *buf, Size length);

resetStringInfo() on such a string would raise an error, as would
appending and enlarging. (Right now resetting will just let you
trample on the original buffer). If anyone later wants to be able to
take such an interned stringinfo and make a writeable buffer with it,
a new function like "reallocateStringInfo" could do that, but it's
easy enough to initStringInfo a new string and appendBinaryStringInfo
it, so there's not much point.

The reason for not just copying the string when attempts are made to
modify it is clear ownership. A StringInfo usually owns its buffer, but
these ones don't, and we don't want to leave the result of a given
append call as "maybe this stringinfo still references an
outside-owned buffer, maybe it doesn't". There doesn't seem to be much
call for a StringInfo that starts with an external buffer and appends
to it until it runs out of room, copying it only when needed.
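For readers outside the thread, here is a self-contained sketch of the proposed shape (names mirror the proposal, but the struct is a stand-alone imitation, not the real StringInfoData):

```c
#include <assert.h>

/* Stand-alone imitation of StringInfoData for illustration. */
typedef struct
{
	const char *data;
	int			len;
	int			cursor;
	int			maxlen;		/* -1 marks a constant (caller-owned) buffer */
} ConstStringInfoSketch;

/* Wrap an existing buffer as a read-only cursor; no copy is made. */
static void
initConstantStringInfoSketch(ConstStringInfoSketch *str, const char *buf, int len)
{
	str->data = buf;
	str->len = len;
	str->cursor = 0;
	str->maxlen = -1;
}

/* Read one byte and advance, as a pq_getmsgbyte()-style reader would. */
static char
getByteSketch(ConstStringInfoSketch *str)
{
	assert(str->cursor < str->len);
	return str->data[str->cursor++];
}
```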

Patch for constant StringInfo attached.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 2a5f70aa69dca92e6d255b797e28cc814a39aa29 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 19 Jun 2017 12:12:20 +0800
Subject: [PATCH v1] Introduce "constant StringInfo"

Introduce constant StringInfo's, which are used as cursors over pre-existing
data by the pq_getmsg functions etc.
---
 src/backend/lib/stringinfo.c | 23 +++
 src/backend/replication/logical/worker.c |  6 +---
 src/include/lib/stringinfo.h | 50 ++--
 3 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index fd15567..6bc19ea 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -52,6 +52,21 @@ initStringInfo(StringInfo str)
 	resetStringInfo(str);
 }
 
+void
+initConstantStringInfo(StringInfo str, const char *buf, Size length)
+{
+	str->data = buf;
+	str->len = length;
+	str->maxlen = -1;
+	str->cursor = 0;
+}
+
+bool
+stringInfoIsConstant(StringInfo str)
+{
+	return str->maxlen == -1;
+}
+
 /*
  * resetStringInfo
  *
@@ -61,6 +76,7 @@ initStringInfo(StringInfo str)
 void
 resetStringInfo(StringInfo str)
 {
+	Assert(!stringInfoIsConstant(str));
 	str->data[0] = '\0';
 	str->len = 0;
 	str->cursor = 0;
@@ -77,6 +93,7 @@ resetStringInfo(StringInfo str)
 void
 appendStringInfo(StringInfo str, const char *fmt,...)
 {
+	Assert(!stringInfoIsConstant(str));
 	for (;;)
 	{
 		va_list		args;
@@ -117,6 +134,7 @@ appendStringInfoVA(StringInfo str, const char *fmt, va_list args)
 	size_t		nprinted;
 
 	Assert(str != NULL);
+	Assert(!stringInfoIsConstant(str));
 
 	/*
 	 * If there's hardly any space, don't bother trying, just fail to make the
@@ -169,6 +187,7 @@ void
 appendStringInfoChar(StringInfo str, char ch)
 {
 	/* Make more room if needed */
+	Assert(!stringInfoIsConstant(str));
 	if (str->len + 1 >= str->maxlen)
 		enlargeStringInfo(str, 1);
 
@@ -186,6 +205,7 @@ appendStringInfoChar(StringInfo str, char ch)
 void
 appendStringInfoSpaces(StringInfo str, int count)
 {
+	Assert(!stringInfoIsConstant(str));
 	if (count > 0)
 	{
 		/* Make more room if needed */
@@ -208,6 +228,7 @@ void
 appendBinaryStringInfo(StringInfo str, const char *data, int datalen)
 {
 	Assert(str != NULL);
+	Assert(!stringInfoIsConstant(str));
 
 	/* Make more room if needed */
 	enlargeStringInfo(str, datalen);
@@ -246,6 +267,8 @@ enlargeStringInfo(StringInfo str, int needed)
 {
 	int			newlen;
 
+	Assert(!stringInfoIsConstant(str));
+
 	/*
 	 * Guard against out-of-range "needed" values.  Without this, we can get
 	 * an overflow or infinite loop in the following.
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c67720b..4160a7f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1049,11 +1049,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
 	/* Ensure we are reading the data into our memory context. */
 	MemoryContextSwitchTo(ApplyMessageContext);
 
-	s.data = buf;
-	s.len = len;
-	s.cursor = 0;
-	s.maxlen = -1;
-
+	initConstantStringInfo(&s, buf, len);
 	c = pq_getmsgbyte(&s);
 
 	if (c == 'w')
diff --git 

Re: [HACKERS] proposal psql \gdesc

2017-06-18 Thread Pavel Stehule
2017-06-16 8:45 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
> new update - rebase, changed message
>>
>
> I did yet another rebase of your patch after Tom alphabetically ordered
> backslash commands. Here is the result.


It looks good.

Thank you

Pavel


>
>
> --
> Fabien.


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-06-18 Thread Pavel Stehule
2017-06-17 7:58 GMT+02:00 Fabien COELHO :

>
>> I don't have any other comments. The implementation is trivial. I
>> reran all the tests and they passed.
>>
>> I'll mark this patch as ready for committers.
>>
>
> Oops, I just noticed a stupid confusion on my part which got through, I
> was setting "ERROR" as "success", inverting the expected boolean value.
>
> Here is a fixed version.


I missed it too.

We could introduce a macro SetVariableBool(vars, varname, value) instead of

 SetVariable(pset.vars, "ERROR", "FALSE");
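A sketch of what such a macro could look like; SetVariable here is a local stand-in for psql's function of the same name, merely recording its arguments:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for psql's SetVariable(): record the last assignment made. */
static const char *last_name;
static const char *last_value;

static void
SetVariable(void *vars, const char *name, const char *value)
{
	(void) vars;				/* unused in this sketch */
	last_name = name;
	last_value = value;
}

/* The suggested convenience macro: map a C boolean to "TRUE"/"FALSE". */
#define SetVariableBool(vars, name, val) \
	SetVariable((vars), (name), (val) ? "TRUE" : "FALSE")
```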

Regards

Pavel

>
>
> --
> Fabien.


[HACKERS] REPLICA IDENTITY FULL

2017-06-18 Thread Tatsuo Ishii
While playing around with logical replication, I am confused by the
behavior of REPLICA IDENTITY FULL.

First I created a table having 2 INT columns with no keys. If I
execute ALTER TABLE REPLICA IDENTITY FULL, replication for
UPDATE/DELETE works.

In the session below, port 11002 is the publisher side, while 11003 is
the subscriber side.

+ psql -e -p 11002 -c create table t3(i int, j int); test
create table t3(i int, j int);
CREATE TABLE
+ psql -e -p 11003 -c create table t3(i int, j int); test
create table t3(i int, j int);
CREATE TABLE
+ psql -e -p 11002 -c alter table t3 replica identity full test
alter table t3 replica identity full
ALTER TABLE
+ psql -e -p 11002 -c insert into t3 values(1,1); test
insert into t3 values(1,1);
INSERT 0 1
+ psql -e -p 11002 -c insert into t3 values(2,2); test
insert into t3 values(2,2);
INSERT 0 1
+ psql -e -p 11002 -c insert into t3 values(2,2); test
insert into t3 values(2,2);
INSERT 0 1
+ psql -e -p 11003 -c ALTER SUBSCRIPTION mysub REFRESH PUBLICATION; test
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
NOTICE:  added subscription for table public.t3
ALTER SUBSCRIPTION
+ sleep 3
+ psql -e -p 11003 -c select * from t3; test
select * from t3;
 i | j 
---+---
 1 | 1
 2 | 2
 2 | 2
(3 rows)

+ psql -e -p 11002 -c update t3 set j = 10 where i = 2 and j = 2; test
update t3 set j = 10 where i = 2 and j = 2;
UPDATE 2
+ psql -e -p 11003 -c select * from t3; test
select * from t3;
 i | j  
---+
 1 |  1
 2 | 10
 2 | 10
(3 rows)

+ psql -e -p 11002 -c delete from t3 where i = 2; test
delete from t3 where i = 2;
DELETE 2
+ psql -e -p 11003 -c select * from t3; test
Pager usage is off.
select * from t3;
 i | j 
---+---
 1 | 1
(1 row)

However, if a table has text columns, UPDATE/DELETE replication does
not work any more. Am I missing something?

+ psql -e -p 11002 -c create table t4(i text, j text); test
create table t4(i text, j text);
CREATE TABLE
+ psql -e -p 11003 -c create table t4(i text, j text); test
create table t4(i text, j text);
CREATE TABLE
+ psql -e -p 11002 -c alter table t4 replica identity full test
alter table t4 replica identity full
ALTER TABLE
+ psql -e -p 11002 -c insert into t4 values('a','a'); test
insert into t4 values('a','a');
INSERT 0 1
+ psql -e -p 11002 -c insert into t4 values('b','b'); test
insert into t4 values('b','b');
INSERT 0 1
+ psql -e -p 11002 -c insert into t4 values('b','b'); test
insert into t4 values('b','b');
INSERT 0 1
+ psql -e -p 11003 -c ALTER SUBSCRIPTION mysub REFRESH PUBLICATION; test
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
NOTICE:  added subscription for table public.t4
ALTER SUBSCRIPTION
+ sleep 3
+ psql -e -p 11003 -c select * from t4; test
select * from t4;
 i | j 
---+---
 a | a
 b | b
 b | b
(3 rows)

+ psql -e -p 11002 -c update t4 set j = 'c' where i = 'b' and j = 'b'; test
update t4 set j = 'c' where i = 'b' and j = 'b';
UPDATE 2
+ psql -e -p 11003 -c select * from t4; test
select * from t4;
 i | j 
---+---
 a | a
 b | b
 b | b
(3 rows)

+ psql -e -p 11002 -c delete from t4 where i = 'b'; test
delete from t4 where i = 'b';
DELETE 2
+ psql -e -p 11003 -c select * from t4; test
select * from t4;
 i | j 
---+---
 a | a
 b | b
 b | b
(3 rows)


Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Directory pg_replslot is not properly cleaned

2017-06-18 Thread Andres Freund
Hi,

On 2017-06-07 15:46:45 -0300, Fabrízio de Royes Mello wrote:
> > >> Just adding Dimitriy to conversation... previous email I provided was
> > >wrong.
> > >>
> > >
> > >Does anyone have some thought about this critical issue?
> > >
> >
> > I plan to look into it over the next few days.
> >
> 
> Thanks...

As noted in
http://archives.postgresql.org/message-id/20170619023014.qx7zjmnkzy3fwpfl%40alap3.anarazel.de
I've pushed a fix for this.  Sorry for it taking this long.

The fix will be included in the next set of minor releases.

Greetings,

Andres Freund




Re: [HACKERS] Something is rotten in publication drop

2017-06-18 Thread Noah Misch
On Thu, Jun 15, 2017 at 02:04:04AM +, Noah Misch wrote:
> On Fri, Jun 09, 2017 at 11:45:58AM -0400, Tom Lane wrote:
> > Peter Eisentraut  writes:
> > > On 6/8/17 23:53, Tom Lane wrote:
> > >> ! ERROR:  publication "addr_pub" does not exist
> > 
> > > The \d+ command attempts to print out any publications that the relation
> > > is part of.  To find the publications it is part of, it runs this query:
> > 
> > > "SELECT pub.pubname\n"
> > > " FROM pg_catalog.pg_publication pub,\n"
> > > "  pg_catalog.pg_get_publication_tables(pub.pubname)\n"
> > > "WHERE relid = '%s'\n"
> > > "ORDER BY 1;",
> > 
> > > pg_get_publication_tables() calls GetPublicationByName(), which throws
> > > this error.
> > 
> > > So I suppose that if a publication is dropped between the time
> > > pg_publication is read and the function is called, you could get this 
> > > error.
> > 
> > Yeah, I'd suspected as much but not tracked it down yet.
> > 
> > > How could we improve that?
> > 
> > What we've done in many comparable situations is to allow a
> > catalog-probing function to return NULL instead of failing
> > when handed an OID or other identifier that it can't locate.
> > Here it seems like pg_get_publication_tables() needs to use
> > missing_ok = TRUE and then return zero rows for a null result.
> > Arguably, a nonexistent publication publishes no tables, so
> > it's not clear that's not the Right Thing anyway.
> > 
> > BTW, isn't the above command a hugely inefficient way of finding
> > the publications for the target rel?  Unless you've got a rather
> > small number of rather restricted publications, seems like it's
> > going to take a long time.  Maybe we don't care too much about
> > manual invocations of \d+, but I bet somebody will carp if there's
> > not a better way to find this out.  Maybe a better answer is to
> > define a more suitable function pg_publications_for_table(relid)
> > and let it have the no-error-for-bad-OID behavior.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Peter,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
> 
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] improve release-note for pg_current_logfile()

2017-06-18 Thread Tatsuo Ishii
>>> Add function pg_current_logfile() to read syslog's current stderr and 
>>> csvlog output file names (Gilles Darold)
>> 
>> This confused me because "syslog" is one of the logging methods, along
>> with stderr and csvlog. I guess syslogger is intended, but I think
>> "logging collector" is clearer for users because that is the term used
>> in the pg_current_logfile() documentation.
> 
> His proposal is changing the line above to:
> 
> Add function pg_current_logfile() to read logging collector's current stderr 
> and csvlog output file names (Gilles Darold)
> 
> Looks like a reasonable fix to me. If there's no objection, I will
> commit this.

Done.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Parallel Aggregation support for aggregate functions that use transitions not implemented for array_agg

2017-06-18 Thread Tomas Vondra

Hi,

On 6/7/17 5:52 AM, Regina Obe wrote:

On 6/6/17 13:52, Regina Obe wrote:

It seems CREATE  AGGREGATE was expanded in 9.6 to support
parallelization of aggregate functions using transitions, with the
addition of serialfunc and deserialfunc to the aggregate definitions.

https://www.postgresql.org/docs/10/static/sql-createaggregate.html

I was looking at the PostgreSQL 10 source code for some example usages
of this and was hoping that array_agg and string_agg would support the feature.



I'm not sure how you would parallelize these, since in most uses
you want to have a deterministic output order.



--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Good point.  If that's the reason it wasn't done, that's fine; I just
wasn't sure.

But if you didn't have an ORDER BY in your aggregate usage, and you
did have those transition functions, it shouldn't be any different from
any other use case, right?
I imagine you are right that most folks who use array_agg and
string_agg usually combine them with array_agg(... ORDER BY ...)



I think what he had in mind is something like

SELECT array_agg(x) FROM (
   SELECT x FROM bar ORDER BY y
) foo;

i.e. a subquery producing the data in predictable order.

>

My main reason for asking is that most of the PostGIS geometry and
raster aggregate functions use transitions and were patterned after
array agg.




In the case of PostGIS the sorting is done internally, really only to
take advantage of things like cascaded-union algorithms. That is always
done, though (so even if each worker does it on just its batch, that's
still better than having only one worker). So I think it's still very
beneficial to break the work into separate jobs, since in the end the
gather will have, say, 2 biggish geometries or 2 biggish rasters to
union if you have 2 workers, which is still better than having a
million smallish geometries/rasters to union.

I'm not sure I got your point correctly, but if you can (for example)
sort the per-worker results as part of the "serialize" function, and
benefit from that while combining the results in the gather, then sure,
that should be a huge win.
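The sort-per-worker-then-merge idea can be sketched in miniature: each worker sorts only its own batch, and the gather step merges the already-sorted batches in linear time instead of re-sorting everything (plain ints stand in for geometries/rasters here):

```c
#include <assert.h>
#include <stdlib.h>

/* Ascending comparison for qsort. */
static int
cmp_int(const void *a, const void *b)
{
	int x = *(const int *) a, y = *(const int *) b;

	return (x > y) - (x < y);
}

/* The "serialize" side: each worker sorts only its own partial result. */
static void
worker_sort(int *batch, int n)
{
	qsort(batch, n, sizeof(int), cmp_int);
}

/* The gather side: merge two already-sorted batches in O(na + nb). */
static void
gather_merge(const int *a, int na, const int *b, int nb, int *out)
{
	int i = 0, j = 0, k = 0;

	while (i < na && j < nb)
		out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
	while (i < na)
		out[k++] = a[i++];
	while (j < nb)
		out[k++] = b[j++];
}
```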


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-18 Thread Andrew Gierth
> "Thomas" == Thomas Munro  writes:

 Thomas> Thanks both for the review.  New version of patch #2 attached.

I'm looking to commit this soon; if anyone has any further comment now
would be a good time to speak up.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] Transition tables vs ON CONFLICT

2017-06-18 Thread Andrew Gierth
> "Thomas" == Thomas Munro  writes:

 Thomas> That accidentally removed a comment that I wanted to keep.
 Thomas> Here is a better version.

I plan to commit this soon; if anyone has any comment to make, now would
be a good time.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] transition table behavior with inheritance appears broken

2017-06-18 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:

 Andrew> Unfortunately I've been delayed over the past couple of days,
 Andrew> but I have Thomas' latest patchset in hand and will be working
 Andrew> on it over the rest of the week. Status update by 23:59 BST on
 Andrew> Sun 18th, by which time I hope to have everything finalized
 Andrew> (all three issues, not just the inheritance one).

I have, I believe, completed my review of the patchset. My conclusion is
that the fix appears to be sound and I haven't been able to find any
further issues with it; so I think Thomas's patches should be committed
as-is. Unless anyone objects I will do this within the next few days.

(Any preferences for whether it should be one commit or 3 separate ones?)

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] Decimal64 and Decimal128

2017-06-18 Thread Thomas Munro
On Mon, Jun 19, 2017 at 2:24 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, Jun 17, 2017 at 11:58 PM, Thomas Munro
>>  wrote:
>>> On Sun, Jun 18, 2017 at 2:31 PM, Robert Haas  wrote:
 What would be the point of that?
>
>>> We'd accept and display the new SQL:2016 standard type name with
>>> length, but by mapping it onto different internal types we could use a
>>> pass-by-value type when it fits in a Datum.
>
>> Uggh.  I'll repeat what has been said on this mailing list many times
>> before: the SQL standards committee often seems to make life
>> unnecessarily difficult with its choice of syntax.
>
> We could do what we did with FLOAT(n), which is to accept the new
> typename syntax but convert it to simple typenames decfloatN, and
> not worry about reversing the transformation on output.
>
> But the real question is whether we want to get that deeply invested
> in a type that couldn't be considered standard for many years to come.
> (Unless somebody wants to write an all-software fallback implementation,
> which I sure don't.)

There are already two well known all-software implementations:

1.  IBM's decNumber[1] seems to be the more popular and is about
20kloc with a choice of ICU or GPL license.  pgdecimal[3] (the
experimental extension by Feng Tian and Pavel Stehule that this thread
announced) uses that (an earlier version used the C language extension
types like _Decimal64 instead). Several projects seem to be using it
in-tree, including GCC.
2.  Intel's RDFPMathLib[2] is much larger.

So I guess the questions would be:

1.  Intel or IBM?
2.  In tree or out of tree dependency?
3.  Also support the new C TS extension types (_Decimal64 etc) as an
alternative for C compilers that have the extension, for the benefit
of xlc/POWER systems?

I speculate that decNumber in-tree would be the path of least
resistance (assuming the "ICU 1.8.1 and later" license[4] would be
acceptable -- to my untrained eye it looks rather BSD-ish -- and
20kloc isn't viewed as excessive), and further that a standard
compliant version might have some good reasons to be in core rather
than in an extension like pgdecimal:

1.  We'd need gram.y + format_type.c support to get the property I
mentioned above (standard typename mapping to more than one internal
type in order to get pass-by-value for good performance with the
Datum-sized variant).
2.  There are probably some casts and conversions among this and the
existing number types and rules for parsing constants etc that finish
up needing core changes.
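On point 1 above, the pass-by-value question is purely a width check. A minimal sketch (DatumSketch and Dec128Sketch are hypothetical stand-ins; PostgreSQL's real Datum is pointer-sized):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration. */
typedef uintptr_t DatumSketch;	/* pointer-sized, like Datum */
typedef struct
{
	uint64_t	hi, lo;			/* a 128-bit decimal payload */
} Dec128Sketch;

/*
 * A type can be pass-by-value only if its storage fits in a Datum.
 * A 64-bit DECFLOAT(16) would qualify on 64-bit platforms; the
 * 128-bit variant never fits and must be passed by reference.
 */
static int
fits_in_datum(size_t typesize)
{
	return typesize <= sizeof(DatumSketch);
}
```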

[1] http://speleotrove.com/decimal/
[2] 
https://software.intel.com/en-us/articles/intel-decimal-floating-point-math-library
[3] https://github.com/vitesse-ftian/pgdecimal
[4] https://spdx.org/licenses/ICU.html

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] how are the rpms configured that are available in PostgreSQL RPM Building Project - Yum Repository

2017-06-18 Thread Cook, Malcolm
Great – perfect – and …. I have one installed – we will investigate…


From: David G. Johnston [mailto:david.g.johns...@gmail.com]
Sent: Friday, June 16, 2017 8:59 PM
To: Cook, Malcolm 
Cc: pgsql-hackers@PostgreSQL.org; McGee, Jenny 
Subject: Re: [HACKERS] how are the rpms configured that are available in 
PostgreSQL RPM Building Project - Yum Repository

On Fri, Jun 16, 2017 at 12:40 PM, Cook, Malcolm 
> wrote:
Hi,

I am referring to the contents of https://yum.postgresql.org/ (specifically 
version 9.6 rpms for CentoOS7)

More specifically, I wonder if they are configured with: --with-python
--with-tcl --with-pam --with-ldap

And do they install the Additional Supplied Modules 
(https://www.postgresql.org/docs/9.1/static/contrib.html ) as provided when 
performing a

   make install-world

Please advise if my possibly novice question would be better posed
elsewhere, or if I should expect to find answers in the fine manual; I
looked but did not see…

This may provide the info you need.  Note you'll actually need an
installed version to run it against.

https://www.postgresql.org/docs/current/static/app-pgconfig.html

This will show what extensions are available in said cluster.

https://www.postgresql.org/docs/current/static/view-pg-available-extensions.html

David J.



Re: [HACKERS] WIP: Data at rest encryption

2017-06-18 Thread Tomas Vondra

Hi all,

I've noticed this thread got resurrected a few days ago, but I haven't 
managed to read all the messages until today. I do have a bunch of 
comments, but let me share them as a single consistent message instead 
of sending a thousand responses to individual messages.



1) Threat model
---

Firstly, I think the thread would seriously benefit from an explanation 
and discussion of the threat model - what types of attacks it's meant to 
address, and what attacks it can't defend against.


My understanding is that data-at-rest encryption generally addresses 
only the "someone stole the disk" case and pretty much nothing else.


Moreover, I claim that encryption implemented at the database level is 
strictly weaker than LUKS or encrypted disks, because it simply 
reveals a lot more information even without decryption (file sizes, 
timestamps, etc.)


That is a serious issue in practice, and researchers have been proving 
that for a long time now. I do recommend this paper from Cornell Tech as 
a great starting point (it cites many papers relevant to this thread):


Why Your Encrypted Database Is Not Secure
Paul Grubbs, Thomas Ristenpart, Vitaly Shmatikov
http://eprint.iacr.org/2017/468.pdf

The paper explains how encryption schemes on general-purpose databases 
fail, due to exactly such side-channels. MVCC, logging and other side 
channels turn all attackers into "persistent passive attackers".


Now, this does not mean the feature is useless - nothing is perfect, and 
security is not a binary feature. It certainly makes attacks more 
difficult compared to a plaintext database. But it's untrue that it's 
basically LUKS, just implemented at the database level.


I'm not suggesting that we should not pursue this idea, but the threat 
model is a crucial piece of information missing in this thread.



2) How do other databases do it?
---

It was repeatedly mentioned that other databases support this type of 
encryption. So how do they deal with the hard parts? For example how do 
they get and protect the encryption key?


I agree with Stephen that we should not require full key management in 
v1 of the patch; that's an incredibly difficult thing. And it 
largely depends on the hardware (e.g. it should be possible to move the 
key to TrustZone on ARM / SGX on Intel).



3) Why do users prefer this to FDE?
---

I suppose we're discussing this feature because we've been asked about 
it by users/customers who can't use FDE. Can we reach out to them and ask 
them about the reasons? Why can't they use FDE?


It was mentioned in the thread that the FDE solutions are not portable 
between different systems, but honestly - is that an issue? You probably 
can't copy the data directory anyway due to locale differences. If you're 
running multiple operating systems, FDE is just one of many differences.



4) Other solutions?
---

Clearly, FDE (at the block device level) and DB-level encryption are not 
the only solutions. There are also filesystem-level solutions now, for 
example.


ext4 (since kernel 4.1) and f2fs (since kernel 4.2) allow encryption at 
the directory level, are transparent to user space, and include things 
like key management (well, that's provided by the kernel). NTFS can do 
something quite similar using EFS.


https://lwn.net/Articles/639427/

https://blog.quarkslab.com/a-glimpse-of-ext4-filesystem-level-encryption.html

Of course, if you happen to use another filesystem (e.g. XFS), this 
won't work for you. But then there's eCryptfs, for example:


https://en.wikipedia.org/wiki/ECryptfs


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-18 Thread Alexander Korotkov
On Sun, Jun 18, 2017 at 12:41 AM, Mark Rofail 
wrote:

> *Questions:*
>
>- I'd like to check that anyelem and anyarray have the same element
>type. but anyelem is obtained from PG_FUNCTION_ARGS as a Datum. How
>can I make such a check?
>
>
As far as I know, it's implicitly checked during the query analysis stage.
You don't have to implement your own check inside the function implementation.
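A quick way to see that check in action (a sketch; the function name and body here are made up for illustration):

```sql
-- A polymorphic function taking (anyarray, anyelem):
CREATE FUNCTION elem_in(arr anyarray, elem anyelem) RETURNS boolean
    AS $$ SELECT elem = ANY(arr) $$ LANGUAGE sql;

SELECT elem_in(ARRAY[1,2,3], 2);         -- OK: both resolve to integer
SELECT elem_in(ARRAY[1,2,3], 'x'::text); -- rejected at analysis time:
-- ERROR:  function elem_in(integer[], text) does not exist
```

i.e. mismatched element types never reach the function body, so the C implementation can rely on the element type of the anyarray argument matching the anyelem argument.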

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] INSERT ... ON CONFLICT () SELECT

2017-06-18 Thread Peter Geoghegan
On Sun, Jun 18, 2017 at 4:33 AM, Matt Pulver  wrote:
> This would be nearly identical to the existing error message that is
> produced when running:
>
> INSERT INTO example (name) VALUES ('foo'), ('foo')
> ON CONFLICT (name) DO UPDATE SET value=1
> RETURNING *
>
>
> which gives the error message:
> "ERROR:  ON CONFLICT DO UPDATE command cannot affect row a second time
> HINT:  Ensure that no rows proposed for insertion within the same command
> have duplicate constrained values."

FWIW, the fact that ON CONFLICT DO UPDATE will never update a row that
it itself inserted has proved valuable. For example, transition
tables, which appear within statement-level triggers, have no need to
consider the case where an inserted tuple appears both for the INSERT
statement case and for the UPDATE statement case (in another form).

> Technically, an error doesn't need to be produced for the above "ON CONFLICT
> DO SELECT" statement - I think it is still perfectly well-defined to return
> duplicate rows as in option 1.

Returning rows with duplicate values seems rather unorthodox.

-- 
Peter Geoghegan




Re: [HACKERS] Typo in drop_publication.sgml

2017-06-18 Thread Julien Rouhaud
On Sun, Jun 18, 2017 at 07:42:54PM +0200, Magnus Hagander wrote:
> On Sun, Jun 18, 2017 at 8:46 AM, Julien Rouhaud 
> wrote:
> 
> > Patch attached.
> >
> 
> Applied, thanks.
> 

Thanks.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Typo in drop_publication.sgml

2017-06-18 Thread Magnus Hagander
On Sun, Jun 18, 2017 at 8:46 AM, Julien Rouhaud 
wrote:

> Patch attached.
>

Applied, thanks.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Making server name part of the startup message

2017-06-18 Thread narlapurams
Thank you, Alvaro, Andres, Magnus, Peter, and Tom for your thoughts! I
consolidated all the responses, provided a high-level overview of the
Azure PostgreSQL database architecture, and added further details at the
bottom of the message.

> Tom Lane:  We don't actually have any concept of a server name at the
> moment, and it isn't very clear what introducing that concept would buy.
> Please explain.


>  Tom Lane: I think for instance you could have one pgbouncer instance (or
> whatever pooler) pointing to several different servers.  So the client
> connects to the pooler and indicates which of the servers to connect to.
>  Magnus: The normal one to use for pgbouncer today is, well, "database
> name". You can then have pgbouncer map different databases to different
> backend servers. It's fairly common in my experience to have things like
> "dbname" and "dbname-ro" (for example) as different database names with
> one mapping to the master and one mapping to a load-balanced set of
> standbys, and things like that. ISTM that using the database name is a
> good choice for that.
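(For reference, the database-name mapping Magnus describes is ordinary pgbouncer configuration; the hostnames here are invented:)

```ini
; pgbouncer.ini -- map logical database names to different backend servers
[databases]
dbname    = host=master.example.com     port=5432 dbname=dbname
dbname-ro = host=standby-lb.example.com port=5432 dbname=dbname
```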

PgBouncer, for example, assumes that the database names are unique across
the database clusters it is serving. Our front-end Gateways can serve tens
of thousands of Postgres servers spanning multiple customers and
organizations, and enforcing unique database names is not possible for the
users of the service.

> Magnus: For the original idea in this thread, using something like
> dbname@server seems a more logical choice than username@server.
We considered this option, but connecting to the database from GUI tools
is then not very intuitive (or possible). Also, the \c command would then
require including the full cluster_name every time the user connects.

> Tom Lane: I should think that in such cases, the end client is exactly
> not what you want to be choosing which server it gets redirected to. 
> You'd be wanting to base that on policies defined at the pooler.  There
> are 
> already plenty of client-supplied attributes you could use as inputs for
> such policies (user name and application name, for instance). Why do we
> need to incur a protocol break to add another one?

This is optional and is not a protocol break. It doesn't make the cluster
name field mandatory in the startup message. If the client specifies the
extra parameter in the connection string to include the server name in the
startup message, then it is included; otherwise it is not. In a proxy
scenario, the end client's startup message doesn't need to include the
server name, and it is optional for the proxy to include this field when
sending the startup message to the server. For the Azure PostgreSQL
service it is preferred to set this field instead of appending the cluster
name to the user name.

Yes, there are other fields like application name, but it is not easy to
use them from GUI tools like pgAdmin. Overloading a field is also not
always intuitive to users, and some applications are potentially using
them for different purposes. For some clients the default database name is
the user name, and as we overload the user name today, the startup message
has user@clustername in it. This causes logins to fail with an invalid
database name, as that database doesn't exist on the server. Using the
database name may not be ideal either, because GUI tools don't assume the
database name has a server name in it.

> Peter: I think this could be useful if it's something like what HTTP uses.
The proposal is similar to the Host header field in HTTP/1.1, which allows
the origin server or gateway to differentiate between internally-ambiguous
URLs, such as the root "/" URL of a server hosting multiple host names on
a single IP address. For reference:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23  Azure
Database for PostgreSQL follows a similar pattern, where all the database
cluster DNS records point to our gateway nodes, and we would like to
resolve them based on the server name field.


High-level overview of the Azure Database for PostgreSQL service.

Azure Database for PostgreSQL is a cloud service that hosts several
PostgreSQL database clusters (a.k.a. servers) within a region, potentially
tens of thousands of database clusters. We have several front-end proxies
(called Gateways) deployed to proxy customer connections to the
appropriate database cluster. Each database cluster has a unique name, and
a DNS record is provisioned with that name. For example, if pgserver is
the name of the database cluster, then it has a DNS record
pgserver.postgres.database.azure.com associated with it. The DNS record of
a customer database server points to the front-end Gateways, and the
customer request reaches these Gateways. A Gateway requires the database
cluster name to proxy the connection to the appropriate database cluster;
in the absence of this it is impossible for us to proxy the request. A
startup message containing the server name helps us route the 

Re: [HACKERS] Preliminary results for proposed new pgindent implementation

2017-06-18 Thread Tom Lane
Peter Eisentraut  writes:
> On 6/16/17 10:51, Tom Lane wrote:
>> So I'm back to the position that we ought to stick the indent
>> code under src/tools/ in our main repo.  Is anyone really
>> seriously against that?

> I think it would be better to have it separate.

> Other than for reasons of principle and general modularity of the world,
> I would like this to be available separately for separate download,
> packaging, etc. to it can be applied to extension projects without
> having to download and build (a specific version of) PostgreSQL.  The
> code formatting in extension projects is, um, questionable.  In fact, if
> it's a better indent period, I would like to package it for the general
> public.

Well, the direction I'm headed in for addressing the portability issues
is to make it depend on the usual PG build environment, notably c.h and
libpgport.  If we don't want it in-tree, it can be built using PGXS,
but it'll still require a PG installation somewhere in order to get built.
Making it independent of both FreeBSD and PG is a significantly larger
project, and one I don't personally intend to tackle.  (And, if someone
does tackle that, I don't exactly see why having our own copy in-tree
would stop them.)

However ... off-list discussion with Piotr indicates that he's unwilling
to touch the license text without permission from FreeBSD core and/or
legal teams.  While the 4-clause license is certainly no impediment to
using indent, we don't want any such text in our tree, so that seems
like a showstopper, at least until the license question is resolved.

Accordingly, I'll proceed with setting up a repo for it on
git.postgresql.org.

> If the vote is to put it into the tree, I would request not to do it in
> PG10.  At this point, we should be winding things down and not open up
> new areas of activity.

I'm confused by this.  Are you objecting to switching to the new indent
version for v10?

regards, tom lane




Re: [HACKERS] initdb initalization failure for collation "ja_JP"

2017-06-18 Thread Tom Lane
Marco Atzeri  writes:
> Building on Cygwin the latest 10 beta1 or head source,
> make check fails as:
> ...
> performing post-bootstrap initialization ... 2017-05-31 23:23:22.214 
> CEST [16860] FATAL:  collation "ja_JP" for encoding "EUC_JP" already exists

Hmph.  Could we see the results of "locale -a | grep ja_JP" ?

regards, tom lane




Re: [HACKERS] Decimal64 and Decimal128

2017-06-18 Thread Tom Lane
Robert Haas  writes:
> On Sat, Jun 17, 2017 at 11:58 PM, Thomas Munro
>  wrote:
>> On Sun, Jun 18, 2017 at 2:31 PM, Robert Haas  wrote:
>>> What would be the point of that?

>> We'd accept and display the new SQL:2016 standard type name with
>> length, but by mapping it onto different internal types we could use a
>> pass-by-value type when it fits in a Datum.

> Uggh.  I'll repeat what has been said on this mailing list many times
> before: the SQL standards committee often seems to make life
> unnecessarily difficult with its choice of syntax.

We could do what we did with FLOAT(n), which is to accept the new
typename syntax but convert it to simple typenames decfloatN, and
not worry about reversing the transformation on output.
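(For comparison, the FLOAT(n) precedent: the parser maps the spelled precision onto one of two concrete types, and the transformation is not reversed on output:)

```sql
SELECT pg_typeof(x) AS small, pg_typeof(y) AS big
FROM (SELECT 1.5::float(10) AS x, 1.5::float(40) AS y) t;
-- FLOAT(1..24) is stored as real, FLOAT(25..53) as double precision
```

A DECFLOAT(16)/DECFLOAT(34) mapping onto decfloat16/decfloat34 would presumably work the same way.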

But the real question is whether we want to get that deeply invested
in a type that couldn't be considered standard for many years to come.
(Unless somebody wants to write an all-software fallback implementation,
which I sure don't.)

regards, tom lane




Re: [HACKERS] outfuncs.c utility statement support

2017-06-18 Thread Tom Lane
Peter Eisentraut  writes:
> An empty token produces "<>", so just debackslashing wouldn't work.

pg_strtok recognizes "<>" and returns length = 0, so debackslash()
would produce the right answer AFAICS (admittedly, I haven't tested).
But I don't really want to do it like that because of the wasted
palloc space and cycles.

> Maybe
> local_node->fldname = length ? token[0] : '\0';
> ?

Doesn't cope with backslash-quoted characters.  If we're going to bother
to do anything here, I think we ought to make it reversible for all
possible characters.

regards, tom lane




Re: [HACKERS] Decimal64 and Decimal128

2017-06-18 Thread Robert Haas
On Sat, Jun 17, 2017 at 11:58 PM, Thomas Munro
 wrote:
> On Sun, Jun 18, 2017 at 2:31 PM, Robert Haas  wrote:
>> On Sat, Jun 17, 2017 at 3:50 PM, Thomas Munro
>>  wrote:
>>> On Sun, Jun 18, 2017 at 5:38 AM, Robert Haas  wrote:
 I feel like these would logically just be different types, like int4
 and int8 are.  We don't have integer(9) and integer(18).
>>>
>>> Hmm.  Perhaps format_type.c could render decfloat16 as decfloat(16)
>>> and decfloat34 as decfloat(34), and gram.y could have a production
>>> that selects the right one when you write DECFLOAT(x) and rejects
>>> values of x other than 16 and 34.
>>
>> What would be the point of that?
>
> We'd accept and display the new SQL:2016 standard type name with
> length, but by mapping it onto different internal types we could use a
> pass-by-value type when it fits in a Datum.

Uggh.  I'll repeat what has been said on this mailing list many times
before: the SQL standards committee often seems to make life
unnecessarily difficult with its choice of syntax.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Getting server crash on Windows when using ICU collation

2017-06-18 Thread Amit Kapila
On Sat, Jun 17, 2017 at 8:27 PM, Ashutosh Sharma  wrote:
> Hi,
>
> On Sat, Jun 17, 2017 at 6:38 PM, Peter Eisentraut
>  wrote:
>> On 6/16/17 23:46, Amit Kapila wrote:
>>> I have just posted one way
>>> to determine if icu library has support for ucol_strcollUTF8, see if
>>> that sounds like a way forward to you.
>>
>> I'm not in a position to test such patches, so someone else will have to
>> take that on.
>

I can verify the patch on Win7 setup to which I have access if that helps.

> Well, I have tested it from my end. I've basically tried to test it by
> running multi-byte tests as Amit suggested and by verifying the things
> manually through a debugger. And, I had mistakenly attached the wrong
> patch in my earlier email. Here, I attach the correct patch.
>

Is it possible for you to verify this patch once with an ICU library
version where ucol_strcollUTF8 is not defined?

+ # get the icu version.
+ my $output = `uconv -V /? 2>&1`;
+ $? >> 8 == 0 or die "uconv command not found";

If we don't find uconv, isn't it better to fall back to the non-UTF8
version rather than saying the command was not found?

+ print $o "#define HAVE_UCOL_STRCOLLUTF8 1\n"
+ if ($self->{ICUVersion} >= 50);

Indentation looks slightly odd, see the other usage in the same file.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index

2017-06-18 Thread Shubham Barai
Hi,

Please find the updated patch here.

Regards,
Shubham

On 16 June 2017 at 15:54, Shubham Barai  wrote:

> Hi, hackers!
>
> I have created my first patch for predicate locking in gist index. It
> includes a test for verification of serialization failures and a test to
> check false positives.
> I am submitting my patch a little late because there were some issues with
> "make check" that I was trying to solve. Now, the patch passes all existing
> tests.
>
> Regards,
> Shubham
>


Predicate-Locking-in-Gist-index_2.patch
Description: Binary data



Re: [HACKERS] INSERT ... ON CONFLICT () SELECT

2017-06-18 Thread Matt Pulver
On Sat, Jun 17, 2017 at 9:55 PM, Peter Geoghegan  wrote:

> On Sat, Jun 17, 2017 at 7:49 AM, Matt Pulver 
> wrote:
> > With the proposed "INSERT ... ON CONFLICT () SELECT" feature, the
> > get_or_create_id() function is simplified to:
>
> Are you locking the existing rows? Because otherwise, the
> determination that they're conflicting can become obsolete immediately
> afterwards. (I guess you would be.)
>

If that is required in order to return the rows in their conflicted state,
then yes.


> The problem with this design and similar designs is that presumably
> the user is sometimes projecting the conflicting rows with the
> intention of separately updating them in a wCTE. That might not work,
> because only ON CONFLICT doesn't use the MVCC snapshot, in order to
> ensure that an UPDATE is guaranteed when an INSERT cannot go ahead.
> That isn't what you're doing in the example you gave, but surely some
> users would try to do things like that, and get very confused.
>
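(For context, the pattern Peter describes — projecting the conflicted rows with the proposed syntax and then separately updating them in a wCTE — would look roughly like this; it uses the proposed "DO SELECT" syntax and the example table from later in this message, so it is a sketch, not runnable today:)

```sql
WITH conflicted AS (
    INSERT INTO example (name) VALUES ('foo')
    ON CONFLICT (name) DO SELECT
    RETURNING id
)
UPDATE example SET value = value + 1
WHERE id IN (SELECT id FROM conflicted);
```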

Ultimately the proposed "INSERT ... ON CONFLICT () DO SELECT" syntax is
still an INSERT statement, not a SELECT, so a user should not expect rows
returned from it to be available for UPDATE/DELETE in another part of a
wCTE. Anyone who understands this behavior for an INSERT statement, let
alone the current "INSERT ... ON CONFLICT DO UPDATE", should not be too
surprised if the same thing applies to the new "INSERT ... ON CONFLICT DO
SELECT".


> I think that what you propose to do here would likely create a lot of
> confusion by mixing MVCC semantics with special UPSERT visibility
> semantics ("show me the latest row version visible to any possible
> snapshot for the special update") even without a separate UPDATE, in
> fact. Would you be okay if "id" appeared duplicated in the rows you
> project in your new syntax, even when there is a separate unique
> constraint on that column? I suppose that there is some risk of things
> like that today, but this would make the "sleight of hand" used by ON
> CONFLICT DO UPDATE more likely to cause problems.
>

Good point. Here is an example using the example table from my previous
email:

INSERT INTO example (name) VALUES ('foo'), ('foo')
ON CONFLICT (name) DO SELECT
RETURNING *


Here are a couple options of how to handle this:

1) Return two identical rows (with the same id).
2) Produce an error, with error message:
"ERROR:  ON CONFLICT DO SELECT command cannot reference row a second time
HINT:  Ensure that no rows proposed for insertion within the same command
have duplicate constrained values."

This would be nearly identical to the existing error message that is
produced when running:

INSERT INTO example (name) VALUES ('foo'), ('foo')
ON CONFLICT (name) DO UPDATE SET value=1
RETURNING *


which gives the error message:
"ERROR:  ON CONFLICT DO UPDATE command cannot affect row a second time
HINT:  Ensure that no rows proposed for insertion within the same command
have duplicate constrained values."

Technically, an error doesn't *need* to be produced for the above "ON
CONFLICT DO SELECT" statement - I think it is still perfectly well-defined
to return duplicate rows as in option 1. Option 2 is more for the benefit
of the user who is probably doing something wrong by attempting to INSERT a
set of rows that violate a constraint. What do you think would be best?

Thank you for the discussion.

Best regards,
Matt


[HACKERS] Typo in insert.sgml

2017-06-18 Thread Julien Rouhaud
Patch attached.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 94dad00870..0e327e5b2e 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -228,7 +228,7 @@ INSERT INTO table_name [ AS INSERT INTO tbl2 OVERRIDING USER VALUE SELECT * FROM
 tbl1 will copy from tbl1 all columns that
 are not identity columns in tbl2 but will continue
-the sequence counters for any identity columns.
+to use the sequence counters for any identity columns.

   
  



[HACKERS] Typo in drop_publication.sgml

2017-06-18 Thread Julien Rouhaud
Patch attached.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/ref/drop_publication.sgml 
b/doc/src/sgml/ref/drop_publication.sgml
index 517d142251..8e45a43982 100644
--- a/doc/src/sgml/ref/drop_publication.sgml
+++ b/doc/src/sgml/ref/drop_publication.sgml
@@ -46,8 +46,8 @@ DROP PUBLICATION [ IF EXISTS ] name
 IF EXISTS
 
  
-  Do not throw an error if the extension does not exist. A notice is issued
-  in this case.
+  Do not throw an error if the publication does not exist. A notice is
+  issued in this case.
  
 

