Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0

2016-12-11 Thread Michael Paquier
On Fri, Dec 9, 2016 at 4:11 PM, Michael Paquier
 wrote:
> On Fri, Dec 9, 2016 at 3:23 PM, Michael Paquier
>  wrote:
>> This basically means that if the latch is set, we don't wait at all
>> and drop the ball. I am wondering: isn't that a problem even if
>> WL_LATCH_SET is *not* set? If I read this code correctly, even if
>> caller has not set WL_LATCH_SET and the latch is set, then the wait
>> will stop.
>
> Nah. I misread the code. set->latch is not NULL only if WL_LATCH_SET is 
> enabled.

OK, I think that I have spotted an issue after an additional read of the
code. When a WSA event is used for read/write activity on a socket,
the same WSA event gets reused again and again. That's fine for
performance reasons, but what I think is not fine is the fact that
this WSA event is *not* reset even once in-between calls to
WaitEventAdjustWin32() to adjust an event HANDLE to wait for, and
FeBeWaitSet is used a lot. That's one inconsistency with the old
code, which always closed the WSA event after using it, unconditionally.
Now that we cache it, I think we had better put it in a clean
state every time we want to use it again.

Together with the pointer miscalculation I pointed out upthread, that
gives the attached patch.

Ashutosh, could you try it and see if it improves things?
-- 
Michael


fix-wait-event-win32.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jacana hung after failing to acquire random number

2016-12-11 Thread Michael Paquier
On Mon, Dec 12, 2016 at 4:32 PM, Heikki Linnakangas  wrote:
> That test is broken. It looks like the x"$VAR" = x"constant" idiom, but the 
> left side of the comparison doesn't have the 'x'. Oops.

Good catch.

> This makes me wonder if we should work a bit harder to get a good error
> message, if acquiring a random number fails for any reason. This needs to
> work in the frontend as well backend, but we could still have an elog(LOG,
> ...) there, inside an #ifndef FRONTEND block.

Yeah, this has been itching me as well. We could return the error
message in a psprintf()'d string and let the callers of
pg_strong_random() use it, as they are responsible for the error
handling. What do you think?
-- 
Michael




Re: [HACKERS] jacana hung after failing to acquire random number

2016-12-11 Thread Heikki Linnakangas

On 12/12/2016 05:58 AM, Michael Paquier wrote:

On Sun, Dec 11, 2016 at 9:06 AM, Andrew Dunstan  wrote:


jacana (mingw, 64-bit compiler, no openssl) is currently hung on "make
check". After starting the autovacuum launcher there are 120 messages in its
log about "Could not acquire random number". Then nothing.


So I suspect the problem here is commit
fe0a0b5993dfe24e4b3bcf52fa64ff41a444b8f1, although I haven't looked in
detail.


Shouldn't we want the postmaster to shut down if it's not going to go
further? Note that frogmouth, also mingw, which builds with openssl, doesn't
have this issue.


Did you unlock it in some way at the end? Here is the shape of the
report for others:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2016-12-10%2022%3A00%3A15
And here is of course the interesting bit:
2016-12-10 17:25:38.822 EST [584c80e2.ddc:2] LOG:  could not acquire
random number
2016-12-10 17:25:39.869 EST [584c80e2.ddc:3] LOG:  could not acquire
random number
2016-12-10 17:25:40.916 EST [584c80e2.ddc:4] LOG:  could not acquire
random number

I am not seeing any problems with MSVC without openssl, so that's a
problem specific to MinGW. I am beginning to wonder whether it is
actually a good idea to cache the crypt context and then re-use it.
Using a new context every time is definitely not good performance-wise,
though.


Actually, looking at the config.log on jacana, it's trying to use 
/dev/urandom:


configure:15028: checking for /dev/urandom
configure:15041: result: yes
configure:15054: checking which random number source to use
configure:15073: result: /dev/urandom

And looking closer at configure.in, I can see why:

  elif test "$PORTNAME" = x"win32" ; then
USE_WIN32_RANDOM=1

That test is broken. It looks like the x"$VAR" = x"constant" idiom, but 
the left side of the comparison doesn't have the 'x'. Oops.
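The failure mode is easy to reproduce in a plain shell (the variable value is just an example):

```shell
# The x"..." idiom guards 'test' against empty strings and values that
# look like operators, but it only works if BOTH sides carry the guard.
PORTNAME="win32"

# Broken (as in configure.in): the left side lacks the 'x', so this
# compares "win32" with "xwin32" -- never equal, and the Windows random
# source is never selected.
if test "$PORTNAME" = x"win32"; then
    echo "broken test matched"
else
    echo "broken test did not match"
fi

# Fixed: both sides prefixed, so "xwin32" = "xwin32".
if test x"$PORTNAME" = x"win32"; then
    echo "fixed test matched"
fi
```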


Fixed that, let's see if it made jacana happy again.

This makes me wonder if we should work a bit harder to get a good error 
message, if acquiring a random number fails for any reason. This needs 
to work in the frontend as well backend, but we could still have an 
elog(LOG, ...) there, inside an #ifndef FRONTEND block.


- Heikki





[HACKERS] Parallel Index-only scan

2016-12-11 Thread Rafia Sabih
Hello all,
This is to propose a patch enabling parallel index-only scans. With the
parallel index scan patch around [1], adding a mechanism for
parallelising index-only scans makes sense. Without it, for queries
preferring index-only scans, it is likely that at higher scale a
parallel index or parallel seq scan serves as a cheaper alternative to
an index-only scan, and query performance might then suffer because of
the added processing of index or seq scans.

Performance
-----------
Consider the performance of a simple query on TPC-H schema,

explain analyse select count(*) from lineitem where l_shipdate < date
'1995-01-03';

Without parallel index-only scan, parallel seq scan got picked and it took
around 23 seconds for the query to execute,

> Finalize Aggregate  (cost=651586.63..651586.64 rows=1 width=8) (actual
> time=22853.872..22853.873 rows=1 loops=1)
>->  Gather  (cost=651586.21..651586.62 rows=4 width=8) (actual
> time=22853.684..22853.864 rows=5 loops=1)
>  Workers Planned: 4
>  Workers Launched: 4
>  ->  Partial Aggregate  (cost=650586.21..650586.22 rows=1 width=8)
> (actual time=22850.489..22850.489 rows=1 loops=5)
>->  Parallel Seq Scan on lineitem  (cost=0.00..618021.73
> rows=13025795 width=0) (actual time=0.035..20553.495 rows=10342437 loops=5)
>  Filter: (l_shipdate < '1995-01-03'::date)
>  Rows Removed by Filter: 13656485
>  Planning time: 0.225 ms
>  Execution time: 22855.196 ms
>
However, with parallel index-only scan, it took only 8.5 seconds,

Finalize Aggregate  (cost=568883.69..568883.70 rows=1 width=8) (actual
> time=8548.993..8548.993 rows=1 loops=1)
>->  Gather  (cost=568883.27..568883.68 rows=4 width=8) (actual
> time=8548.789..8548.976 rows=5 loops=1)
>  Workers Planned: 4
>  Workers Launched: 4
>  ->  Partial Aggregate  (cost=567883.27..567883.28 rows=1 width=8)
> (actual time=8541.929..8541.929 rows=1 loops=5)
>->  Parallel Index Only Scan using idx_l_shipdate on
> lineitem  (cost=0.57..535318.78 rows=13025795 width=0) (actual
> time=0.113..5866.729 rows=10342437 loops=5)
>  Index Cond: (l_shipdate < '1995-01-03'::date)
>  Heap Fetches: 0
>  Planning time: 0.266 ms
>  Execution time: 8569.735 ms


The effect of parallel index-only scan is more visible in complex
queries where parallelism is enabled up to the top of the plan tree;
e.g., the following query takes 118 ms on head,

explain analyse select sum(l_extendedprice * l_discount) as revenue,
avg(o_orderkey) from orders, lineitem where o_orderkey < 1 and
o_orderkey = l_orderkey group by l_shipmode order by revenue

 Sort  (cost=24396.44..24396.45 rows=7 width=75) (actual
> time=118.823..118.825 rows=7 loops=1)
>Sort Key: (sum((lineitem.l_extendedprice * lineitem.l_discount)))
>Sort Method: quicksort  Memory: 25kB
>->  HashAggregate  (cost=24396.23..24396.34 rows=7 width=75) (actual
> time=118.749..118.786 rows=7 loops=1)
>  Group Key: lineitem.l_shipmode
>  ->  Nested Loop  (cost=1.13..24293.11 rows=10312 width=27)
> (actual time=0.096..73.198 rows=9965 loops=1)
>->  Index Only Scan using orders_pkey on orders
>  (cost=0.56..46.48 rows=2578 width=4) (actual time=0.072..1.663 rows=2503
> loops=1)
>  Index Cond: (o_orderkey < 1)
>  Heap Fetches: 0
>->  Index Scan using idx_lineitem_orderkey on lineitem
>  (cost=0.57..6.45 rows=296 width=31) (actual time=0.018..0.023 rows=4
> loops=2503)
>  Index Cond: (l_orderkey = orders.o_orderkey)
>  Planning time: 1.062 ms
>  Execution time: 118.977 ms


With parallel index-only scan, the performance improves to 40 ms,

Sort  (cost=7191.33..7191.35 rows=7 width=75) (actual time=40.475..40.476
> rows=7 loops=1)
>Sort Key: (sum((lineitem.l_extendedprice * lineitem.l_discount)))
>Sort Method: quicksort  Memory: 25kB
>->  Finalize GroupAggregate  (cost=7190.78..7191.23 rows=7 width=75)
> (actual time=40.168..40.451 rows=7 loops=1)
>  Group Key: lineitem.l_shipmode
>  ->  Sort  (cost=7190.78..7190.85 rows=28 width=75) (actual
> time=40.105..40.127 rows=35 loops=1)
>Sort Key: lineitem.l_shipmode
>Sort Method: quicksort  Memory: 29kB
>->  Gather  (cost=7187.22..7190.11 rows=28 width=75)
> (actual time=39.344..39.983 rows=35 loops=1)
>  Workers Planned: 4
>  Workers Launched: 4
>  ->  Partial HashAggregate  (cost=6187.22..6187.31
> rows=7 width=75) (actual time=25.981..26.011 rows=7 loops=5)
>Group Key: lineitem.l_shipmode
>->  Nested Loop  (cost=1.13..6084.10 rows=10312
> width=27) (actual time=0.139..16.352 rows=1993 loops=5)
>  ->  Parallel Index Only Scan using
> 

Re: [HACKERS] Declarative partitioning - another take

2016-12-11 Thread Amit Langote

Hi Tomas,

On 2016/12/12 10:02, Tomas Vondra wrote:
> On 12/07/2016 07:20 PM, Robert Haas wrote:
>> I've committed 0001 - 0006 with that correction and a few other
>> adjustments.  There's plenty of room for improvement here, and almost
>> certainly some straight-up bugs too, but I think we're at a point
>> where it will be easier and less error-prone to commit follow on
>> changes incrementally rather than by continuously re-reviewing a very
>> large patch set for increasingly smaller changes.
>>
> 
> I've been working on a review / testing of the partitioning patch, but
> have been unable to submit it before the commit due to a lot of travel.
> However, at least some of the points seem to be still valid, so let me
> share it as an after-commit review. Most of the issues are fairly minor
> (possibly even nitpicking).

Thanks a lot for the review comments and testing!

I attach a patch that implements some of the fixes that you suggest below.
 Also, I am merging a patch I sent earlier [1] to avoid having to look at
multiple patches.

> review
> --
> 
> 1) src/include/catalog/pg_partitioned_table.h contains this bit:
> 
>  * $PostgreSQL: pgsql/src/include/catalog/pg_partitioned_table.h $

Fixed.

> 2) I'm wondering whether having 'table' in the catalog name (and also in
> the new relkind) is too limiting. I assume we'll have partitioned indexes
> one day, for example - do we expect to use the same catalogs?

I am not sure I understand your idea of partitioned indexes, but I doubt
it would require entries in the catalog under consideration.  Could you
perhaps elaborate more?

> 3) A comment within BeginCopy (in copy.c) says:
> 
> * Initialize state for CopyFrom tuple routing.  Watch out for
> * any foreign partitions.
> 
> But the code does not seem to be doing that (at least I don't see any
> obvious checks for foreign partitions). Also, the comment should probably
> at least mention why foreign partitions need extra care.

It's a stale comment after I took out the code that handled foreign
partitions in final versions of the tuple-routing patch.  You may have
noticed in the commit message that tuple-routing does not work for foreign
partitions, because the changes I'd proposed weren't that good.  Comment
fixed anyway.

> To nitpick, the 'pd' variable in that block seems rather pointless - we
> can assign directly to cstate->partition_dispatch_info.

OK, fixed.  Also, fixed similar code in ExecInitModifyTable().

> 4) I see GetIndexOpClass() got renamed to ResolveOpClass(). I find the new
> name rather strange - all other similar functions start with "Get", so I
> guess "GetOpClass()" would be better. But I wonder if the rename was even
> necessary, considering that it still deals with index operator classes
> (although now also in context of partition keys). If the rename really is
> needed, isn't that a sign that the function does not belong to indexcmds.c
> anymore?

The fact that both index keys and partition keys have very similar
specification syntax means there would be some common code handling the
same, of which this is one (maybe, only) example.  I didn't see much point
in finding a new place for the function, although I can see why it may be
a bit confusing to future readers of the code.

About the new name - seeing the top line in the old header comment, what
the function does is resolve a possibly explicitly specified operator
class name to an operator class OID.  I hadn't changed the name in my
original patch, but Robert suggested the rename, which I thought made sense.

> 5) Half the error messages use 'child table' while the other half uses
> 'partition'. I think we should be using 'partition' as child tables really
> have little meaning outside inheritance (which is kinda hidden behind the
> new partitioning stuff).

One way to go about it may be to check, at every site that can report
an error involving child tables (aka "partitions"), whether the context
tells which name to use.  I think it's possible, because we have
access to the parent relation at all such sites, and looking at its
relkind tells whether to call the child tables "partitions".

> 6) The psql tab-completion seems a bit broken, because it only offers the
> partitions, not the parent table. Which is usually exactly the opposite of
> what the user wants.

Actually, I have not implemented any support for tab-completion for the
new DDL.  I will work on that.

> testing
> ---
> 
> I've also done quite a bit of testing with different partition layouts
> (single-level list/range partitioning, multi-level partitioning etc.),
> with fairly large number (~10k) of partitions. The scripts I used are
> available here: https://bitbucket.org/tvondra/partitioning-tests
> 
> 1) There seems to be an issue when a partition is created and then
> accessed within the same transaction, i.e. for example
> 
> BEGIN;
> ... create parent ...
> ... create partition 
> ... insert into parent ...
> COMMIT;
> 
> which 

Re: [HACKERS] Broken SSL tests in master

2016-12-11 Thread Heikki Linnakangas

On 12/01/2016 09:45 PM, Andres Freund wrote:

And nobody has added a buildfarm module to run it manually on their
servers either :(


I just added a module to run "make -C src/test/ssl check" in chipmunk. 
So at least that's covered now.


http://pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=chipmunk&dt=2016-12-10%2012%3A08%3A31&stg=test-ssl-check

- Heikki





Re: [HACKERS] proposal: psql statements \gstore \gstore_binary (instead COPY RAW)

2016-12-11 Thread Pavel Stehule
2016-12-11 18:23 GMT+01:00 Pavel Stehule :

> Hi
>
> 2016-12-09 18:39 GMT+01:00 Pavel Stehule :
>
>> Hi
>>
>> For a long time I have been pushing COPY RAW - without success.
>>
>> Now I propose a functionally similar solution - reduced to the psql
>> console only.
>>
>> We have the statement \g for executing a query and \gset for executing
>> and storing the result in memory; I propose \gstore for storing the
>> result in a file and \gstore_binary for storing the result in a file
>> with binary passing. The query result should be one row, one column.
>>
>> Usage:
>>
>> SELECT image FROM accounts WHERE id = xxx
>> \gstore_binary ~/image.png
>>
>> What do you think about this proposal?
>>
>
> here is a poc patch
>
> Regards
>
> Pavel
>
> Usage:
>
> postgres=# set client_encoding to 'latin2';
> SET
> Time: 1,561 ms
> postgres=# select a from foo
> postgres-# \gbstore ~/doc.xml
> Time: 1,749 ms
>
> content of doc.xml
> příliš žluťoučký kůň se napil
> žluté vody
>
>
second update - + doc

The export/import regression tests are a little bit heavy - I'll write
them together with the counterpart for loading a file's content.

Regards

Pavel


>
>> Regards
>>
>> Pavel
>>
>
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 9915731..7e2fa96 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1944,6 +1944,31 @@ hello 10
 
   
 
+
+  
+\gstore [ filename ]
+\gstore [ |command ]
+\gbstore [ filename ]
+\gbstore [ |command ]
+
+
+ Sends the current query input buffer to the server and stores the
+ raw query's output in filename or pipes the output
+ to the shell command command.  The file or command is
+ written to only if the query successfully returns exactly one row and
+ one column with a non-null result, not if the query fails or is a
+ non-data-returning SQL command. For example:
+
+= SELECT avatar FROM users WHERE id = 123
+- \gbstore ~/avatar.png
+
+
+
+  
+
+
   
 \h or \help [ command ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a9a2fdb..e8fabb9 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -929,6 +929,27 @@ exec_command(const char *cmd,
 		status = PSQL_CMD_SEND;
 	}
 
+	/* \gstore [filename], \gbstore [filename] -- send query and store result in (binary) file */
+	else if (strcmp(cmd, "gstore") == 0 ||
+			  (strcmp(cmd, "gbstore") == 0))
+	{
+		char	   *fname = psql_scan_slash_option(scan_state,
+												   OT_FILEPIPE, NULL, false);
+
+		if (!fname)
+			pset.gfname = pg_strdup("");
+		else
+		{
+			expand_tilde(&fname);
+			pset.gfname = pg_strdup(fname);
+		}
+
+		pset.raw_flag = true;
+		pset.binres_flag = (strcmp(cmd, "gbstore") == 0);
+		free(fname);
+		status = PSQL_CMD_SEND;
+	}
+
 	/* help */
 	else if (strcmp(cmd, "h") == 0 || strcmp(cmd, "help") == 0)
 	{
@@ -1064,7 +1085,6 @@ exec_command(const char *cmd,
 		free(opt2);
 	}
 
-
 	/* \o -- set query output */
 	else if (strcmp(cmd, "o") == 0 || strcmp(cmd, "out") == 0)
 	{
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index a7789df..d4b4f15 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -854,6 +854,85 @@ StoreQueryTuple(const PGresult *result)
 	return success;
 }
 
+/*
+ * StoreRawResult: the returned value (possibly binary) is displayed
+ * or stored in file. The result should be exactly one row, one column.
+ */
+static bool
+StoreRawResult(const PGresult *result)
+{
+	bool		success = true;
+
+	if (PQntuples(result) < 1)
+	{
+		psql_error("no rows returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQntuples(result) > 1)
+	{
+		psql_error("more than one row returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQnfields(result) < 1)
+	{
+		psql_error("no columns returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQnfields(result) > 1)
+	{
+		psql_error("more than one column returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQgetisnull(result, 0, 0))
+	{
+		psql_error("returned value is null for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else
+	{
+		char	   *value;
+		int			length;
+		FILE	   *fout = NULL;
+		bool		is_pipe = false;
+
+		value = PQgetvalue(result, 0, 0);
+		length = PQgetlength(result, 0, 0);
+
+		if (pset.gfname && *(pset.gfname) != '\0')
+		{
+			if (!openQueryOutputFile(pset.gfname, &fout, &is_pipe))
+				success = false;
+			if (success && is_pipe)
+				disable_sigpipe_trap();
+		}
+
+		if (success)
+		{
+			success = fwrite(value, 1, length, fout != NULL ? fout : pset.queryFout) == length;
+			if (!success)
+				psql_error("%s: %s\n", pset.gfname, strerror(errno));
+
+			if (success)
+				success = fflush(fout != NULL ? fout : pset.queryFout) == 0;
+
+			if (!success)
+				psql_error("%s: %s\n", pset.gfname, strerror(errno));
+
+			if 

Re: [HACKERS] pgcrypto compilation error due to stack-allocated EVP_CIPHER_CTX

2016-12-11 Thread Michael Paquier
On Fri, Dec 9, 2016 at 10:22 AM, Michael Paquier
 wrote:
> Thanks for looking at the patch. Looking forward to hearing more!

Here is an updated patch based on which reviews should be done. I have
fixed the issue you have reported, and upon additional lookup I have
noticed that returning -1 when failing on EVP_CIPHER_CTX_new() in
px_find_cipher() is dead wrong. The error code should be
PXE_CIPHER_INIT.
-- 
Michael


pgcrypto-openssl11-fix-v4.patch
Description: invalid/octet-stream



Re: [HACKERS] [COMMITTERS] pgsql: Implement table partitioning.

2016-12-11 Thread Amit Langote
On 2016/12/10 7:55, Keith Fiske wrote:
> Working on a blog post for this feature and just found some more
> inconsistencies with the doc examples. Looks like the city_id column was
> defined in the measurements table when it should be in the cities table.
> The addition of the partition to the cities table fails since it's missing.
> 
> Examples should look like this:
> 
> CREATE TABLE measurement (
> logdate date not null,
> peaktemp    int,
> unitsales   int
> ) PARTITION BY RANGE (logdate);
> 
> CREATE TABLE cities (
> city_id bigserial not null,
> name text not null,
> population   int
> ) PARTITION BY LIST (initcap(name));
> 
> I actually changed my example to have city_id use bigserial to show that
> sequences are inherited automatically. May be good to show that in the docs.

Attached is a documentation patch fixing the inconsistencies in the examples
that Keith reports, and also improving them a bit (the cities_west example
sounds a bit contrived, now that I think about it).

Also, I posted a patch earlier [1] to mention the limitation that row
movement caused by UPDATE is treated as an error.  I have combined it into
this patch, so that all the proposed documentation fixes are together.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/a4f261c2-8554-f443-05ff-d97dddc19689%40lab.ntt.co.jp
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index a6a43c4b30..333b01db36 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -715,7 +715,7 @@ ALTER TABLE [ IF EXISTS ] name

 

-ATTACH PARTITION partition_name partition_bound_spec
+ATTACH PARTITION partition_name FOR VALUES partition_bound_spec
 
  
   This form attaches an existing table (which might itself be partitioned)
@@ -1332,7 +1332,7 @@ ALTER TABLE measurement
Attach a partition to list partitioned table:
 
 ALTER TABLE cities
-ATTACH PARTITION cities_west FOR VALUES IN ('Los Angeles', 'San Francisco');
+ATTACH PARTITION cities_ab FOR VALUES IN ('a', 'b');
 
 
   
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 8bf8af302b..58f8bf6d6a 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -248,7 +248,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI

 

-PARTITION OF parent_table
+PARTITION OF parent_table FOR VALUES partition_bound_spec
 
  
   Creates the table as partition of the specified
@@ -275,7 +275,8 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
  
   Rows inserted into a partitioned table will be automatically routed to
   the correct partition.  If no suitable partition exists, an error will
-  occur.
+  occur.  Also, if updating a row in a given partition causes it to move
+  to another partition due to the new partition key, an error will occur.
  
 
  
@@ -1477,7 +1478,6 @@ CREATE TABLE employees OF employee_type (
Create a range partitioned table:
 
 CREATE TABLE measurement (
-city_id int not null,
 logdate date not null,
 peaktemp        int,
 unitsales   int
@@ -1488,9 +1488,10 @@ CREATE TABLE measurement (
Create a list partitioned table:
 
 CREATE TABLE cities (
+city_id  bigserial not null,
 name text not null,
-population   int,
-) PARTITION BY LIST (initcap(name));
+population   bigint,
+) PARTITION BY LIST (left(lower(name), 1));
 
 
   
@@ -1498,30 +1499,30 @@ CREATE TABLE cities (
 
 CREATE TABLE measurement_y2016m07
 PARTITION OF measurement (
-unitsales WITH OPTIONS DEFAULT 0
+unitsales DEFAULT 0
 ) FOR VALUES FROM ('2016-07-01') TO ('2016-08-01');
 
 
   
Create partition of a list partitioned table:
 
-CREATE TABLE cities_west
+CREATE TABLE cities_ab
 PARTITION OF cities (
 CONSTRAINT city_id_nonzero CHECK (city_id != 0)
-) FOR VALUES IN ('Los Angeles', 'San Francisco');
+) FOR VALUES IN ('a', 'b');
 
 
   
Create partition of a list partitioned table that is itself further
partitioned and then add a partition to it:
 
-CREATE TABLE cities_west
+CREATE TABLE cities_ab
 PARTITION OF cities (
 CONSTRAINT city_id_nonzero CHECK (city_id != 0)
-) FOR VALUES IN ('Los Angeles', 'San Francisco') PARTITION BY RANGE (population);
+) FOR VALUES IN ('a', 'b') PARTITION BY RANGE (population);
 
-CREATE TABLE cities_west_1_to_10
-PARTITION OF cities_west FOR VALUES FROM (1) TO (10);
+CREATE TABLE cities_ab_1_to_10
+PARTITION OF cities_ab FOR VALUES FROM (1) TO (10);
 
  
 
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 06f416039b..00c984d8d5 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -526,6 +526,17 @@ INSERT oid count
  
+ 
+ 
+  Notes
+
+  
+   If the 

Re: [HACKERS] jsonb problematic operators

2016-12-11 Thread Craig Ringer
On 12 December 2016 at 12:52, Tom Lane  wrote:
> Craig Ringer  writes:
>> It's definitely annoying, in both directions. ? wasn't a great choice
>> for an operator character but it's logical and was grandfathered over
>> from hstore.
>
> It was grandfathered from a lot further back than that.  A quick look
> into the system catalogs says that core Postgres currently has 21
> operators that include "?" in their names.  Three of those are the
> jsonb operators, and the other 18 have been there since circa 1997.
> (Most of them seem to date to Tom Lockhart's commit 3c2d74d2a, but
> "" is present in Berkeley Postgres v4r2, released in 1994.)
>
> I do not have a lot of patience with client-side code that's unable
> to deal with operator names containing "?".  It's not like this
> requirement has blindsided anybody in this century.

Pretty much. Nor is it the only oddity you have to deal with when
working with different DBMSes.

PgJDBC allows you to write ??, which is ugly, but tolerable, since the
JDBC spec doesn't have an escape syntax for it. We could've extended
the JDBC escape syntax with a new kind of {postgres } escape if we'd
wanted instead, but it'd still be nonportable so the driver went for
the less verbose option. PDO should do something similar. It's not
like client code using our ? operators has to be portable.

I didn't realise Pg's use of ? was that old, so thanks. That makes
offering alternatives much less appealing.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Hash Indexes

2016-12-11 Thread Amit Kapila
On Mon, Dec 12, 2016 at 10:25 AM, Jeff Janes  wrote:
> On Sun, Dec 11, 2016 at 8:37 PM, Amit Kapila 
> wrote:
>>
>> On Sun, Dec 11, 2016 at 11:54 AM, Amit Kapila 
>> wrote:
>> > On Wed, Dec 7, 2016 at 2:02 AM, Jeff Janes  wrote:
>> >
>> > With above fixes, the test ran successfully for more than a day.
>> >
>>
>> There was a small typo in the previous patch which is fixed in
>> attached.  I don't think it will impact the test results if you have
>> already started the test with the previous patch, but if not, then it
>> is better to test with attached.
>
>
> Thanks,  I've already been running the previous one for several hours, and
> so far it looks good.
>

Thanks.

>  I've tried forward porting it to the WAL patch to
> test that as well, but didn't have any luck doing so.
>

I think we can verify WAL patch separately.  I am already working on it.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Hash Indexes

2016-12-11 Thread Jeff Janes
On Sun, Dec 11, 2016 at 8:37 PM, Amit Kapila 
wrote:

> On Sun, Dec 11, 2016 at 11:54 AM, Amit Kapila 
> wrote:
> > On Wed, Dec 7, 2016 at 2:02 AM, Jeff Janes  wrote:
> >
> > With above fixes, the test ran successfully for more than a day.
> >
>
> There was a small typo in the previous patch which is fixed in
> attached.  I don't think it will impact the test results if you have
> already started the test with the previous patch, but if not, then it
> is better to test with attached.
>

Thanks,  I've already been running the previous one for several hours, and
so far it looks good.  I've tried forward porting it to the WAL patch to
test that as well, but didn't have any luck doing so.

Cheers,

Jeff


Re: [HACKERS] jsonb problematic operators

2016-12-11 Thread Tom Lane
Craig Ringer  writes:
> It's definitely annoying, in both directions. ? wasn't a great choice
> for an operator character but it's logical and was grandfathered over
> from hstore.

It was grandfathered from a lot further back than that.  A quick look
into the system catalogs says that core Postgres currently has 21
operators that include "?" in their names.  Three of those are the
jsonb operators, and the other 18 have been there since circa 1997.
(Most of them seem to date to Tom Lockhart's commit 3c2d74d2a, but
"" is present in Berkeley Postgres v4r2, released in 1994.)

I do not have a lot of patience with client-side code that's unable
to deal with operator names containing "?".  It's not like this
requirement has blindsided anybody in this century.

regards, tom lane




Re: [HACKERS] Hash Indexes

2016-12-11 Thread Amit Kapila
On Tue, Dec 6, 2016 at 1:29 PM, Jeff Janes  wrote:
> On Thu, Dec 1, 2016 at 10:54 PM, Amit Kapila 
> wrote:
>
> With the latest HASH WAL patch applied, I get different but apparently
> related errors
>
> 41993 UPDATE XX002 2016-12-05 22:28:45.333 PST:ERROR:  index "foo_index_idx"
> contains corrupted page at block 27602
> 41993 UPDATE XX002 2016-12-05 22:28:45.333 PST:HINT:  Please REINDEX it.
> 41993 UPDATE XX002 2016-12-05 22:28:45.333 PST:STATEMENT:  update foo set
> count=count+1 where index=$1
>

This is not a problem of the WAL patch per se.  It should be fixed by
the hash index bug fix patch I sent upthread.  However, after the bug
fix patch, the WAL patch needs to be rebased, which I will do and send
after verification.  In the meantime, it would be great if you could
verify the posted fix.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-11 Thread Dilip Kumar
On Tue, Dec 6, 2016 at 9:19 AM, Andres Freund  wrote:
> 0009  WIP: Add minimal keytest implementation.
>
> More or less experimental patch that tries to implement simple
> expression of the OpExpr(ScalarVar, Const) into a single expression
> evaluation step.  The benefits probably aren't big enough iff we do end
> up doing JITing of expressions.

It seems like we are trying to achieve the same thing with the 'heap
scan key push down' patch [1] as well. But I think with this patch you
are covering OpExpr(ScalarVar, Const) for all the cases, whereas with
[1] we are currently only doing it for seq scan, while trying to make
it generic for other nodes as well.

So do you see any advantage in continuing [1]?  Is there something
extra we can achieve with [1] that we cannot get with this patch?

https://www.postgresql.org/message-id/CAFiTN-takT6Z4s3tGDwyC9bhYf%2B1gumpvW5bo_fpeNUy%2BrL-kg%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Hash Indexes

2016-12-11 Thread Amit Kapila
On Sun, Dec 11, 2016 at 11:54 AM, Amit Kapila  wrote:
> On Wed, Dec 7, 2016 at 2:02 AM, Jeff Janes  wrote:
>
> With above fixes, the test ran successfully for more than a day.
>

There was a small typo in the previous patch which is fixed in
attached.  I don't think it will impact the test results if you have
already started the test with the previous patch, but if not, then it
is better to test with attached.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_hashindex_issues_v2.patch
Description: Binary data



Re: [HACKERS] jsonb problematic operators

2016-12-11 Thread Craig Ringer
On 11 December 2016 at 18:52, Geoff Winkless  wrote:
> On 9 Dec 2016 17:54, "Andres Freund"  wrote:
>
> On 2016-12-09 12:17:32 -0500, Robert Haas wrote:
>> As Geoff says, you don't have to use the operators; you could use the
>> equivalent functions instead.  Every operator just gets turned into a
>> function call internally, so this is always possible.
>
> Well, except that only operators support indexing :(
>
>
> Really? Seems like an odd design decision.
>
> The only other simple suggestion then would be to use PDO named parameters
> instead of positional ones. Much nicer syntax anyway, IMO.

You can also create alternate names for the operators, but it's a bit
of a pain if you want indexing support. I thought we had defined
alternative names for exactly this reason, but I don't see them...

It's definitely annoying, in both directions. ? wasn't a great choice
for an operator character, but it's logical and was grandfathered over
from hstore. PDO not offering any way to escape parameter binding
characters is at least as bad. What client interface provides no way
to pass through strings it would otherwise treat as special?

Does PDO cope if you use the OPERATOR("pg_catalog".?) form? Or does it
still try to bind the parameter? e.g.

postgres=> SELECT '{}'::jsonb OPERATOR("pg_catalog".?) 'fred';
 ?column?
--
 f
(1 row)

Does PDO let you double question marks to escape them, writing ?? or
\? instead of ? or anything like that?

If not, I suggest that you (a) submit a postgres patch adding
alternative operator names for ? and ?|, and (b) submit a PDO patch to
allow ?? or \? as an escape for ? .
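For what it's worth, the unescaping pass such a PDO patch would need
is tiny. Here is a hedged sketch in C of the doubling convention (the
function name and exact behavior are my assumptions, not PDO's actual
code): a doubled "??" collapses to a literal "?", while a lone "?"
survives for the driver to treat as a bind placeholder.

```c
#include <assert.h>				/* for the usage assertions below */
#include <stddef.h>
#include <string.h>

/*
 * Collapse each doubled "??" in "in" to a single literal '?', leaving
 * lone '?' characters untouched (a real driver would then treat those
 * as bind placeholders).  "out" must be at least as large as "in".
 */
static void
unescape_qmarks(const char *in, char *out)
{
	size_t		j = 0;

	for (size_t i = 0; in[i] != '\0'; i++)
	{
		out[j++] = in[i];
		if (in[i] == '?' && in[i + 1] == '?')
			i++;				/* swallow the second '?' of the pair */
	}
	out[j] = '\0';
}
```

With that in place, a query written as SELECT '{}'::jsonb ?? 'fred'
would reach the server as SELECT '{}'::jsonb ? 'fred'.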

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Declarative partitioning - another take

2016-12-11 Thread Amit Langote

Hi,

On 2016/12/11 10:02, Venkata B Nagothi wrote:
> On Fri, Dec 9, 2016 at 11:11 PM, Amit Langote 
> wrote:
>> On Fri, Dec 9, 2016 at 3:16 PM, Venkata B Nagothi 
>> wrote:
>>> I am testing the partitioning feature from the latest master and got the
>>> following error while loading the data -
>>>
>>> db01=# create table orders_y1993 PARTITION OF orders FOR VALUES FROM
>>> ('1993-01-01') TO ('1993-12-31');
>>> CREATE TABLE
>>>
>>> db01=# copy orders from '/data/orders-1993.csv' delimiter '|';
>>> ERROR:  could not read block 6060 in file "base/16384/16412": read only
>> 0 of
>>> 8192 bytes
>>> CONTEXT:  COPY orders, line 376589:
>>> "9876391|374509|O|54847|1997-07-16|3-MEDIUM
>>  |Clerk#01993|0|ithely
>>> regular pack"
>>
>> Hmm.   Could you tell what relation the file/relfilenode 16412 belongs to?
>>
> 
> db01=# select relname from pg_class where relfilenode=16412 ;
>relname
> --
>  orders_y1997
> (1 row)
> 
> 
> I VACUUMED the partition and then re-ran the copy command and no luck.
> 
> db01=# vacuum orders_y1997;
> VACUUM
> 
> db01=# copy orders from '/data/orders-1993.csv' delimiter '|';
> ERROR:  could not read block 6060 in file "base/16384/16412": read only 0
> of 8192 bytes
> CONTEXT:  COPY orders, line 376589:
> "9876391|374509|O|54847|1997-07-16|3-MEDIUM   |Clerk#01993|0|ithely
> regular pack"
> 
> I do not quite understand the below behaviour as well. I VACUUMED 1997
> partition and then i got an error for 1992 partition and then after 1996
> and then after 1994 and so on.
> 

[ ... ]

> db01=# vacuum orders_y1997;
> VACUUM
> db01=# copy orders from '/data/orders-1993.csv' delimiter '|';
> ERROR:  could not read block 6060 in file "base/16384/16412": read only 0
> of 8192 bytes
> CONTEXT:  COPY orders, line 376589:
> "9876391|374509|O|54847|1997-07-16|3-MEDIUM   |Clerk#01993|0|ithely
> regular pack"
> db01=#
> 
> Am i not understanding anything here ?

I could not reproduce this issue.  Also, I could not say what might have
gone wrong based only on the information I have seen so far.

Have you tried inserting the same data using insert?

create table orders_unpartitioned (like orders);
copy orders_unpartitioned from '/data/orders-1993.csv';
insert into orders select * from orders_unpartitioned;

Thanks,
Amit






Re: [HACKERS] PATCH: two slab-like memory allocators

2016-12-11 Thread Petr Jelinek
On 01/12/16 03:26, Tomas Vondra wrote:
> 
> Dne 11/27/2016 v 11:02 PM Andres Freund napsal(a):
>> On 2016-11-27 22:21:49 +0100, Petr Jelinek wrote:
>>> On 27/11/16 21:47, Andres Freund wrote:
 Hi,

>> +typedef struct SlabBlockData *SlabBlock;/* forward
>> reference */
>> +typedef struct SlabChunkData *SlabChunk;
>>
>> Can we please not continue hiding pointers behind typedefs? It's a
>> bad
>> pattern, and that it's fairly widely used isn't a good excuse to
>> introduce further usages of it.
>>
>
> Why is it a bad pattern?

 It hides what is passed by reference, and what by value, and it
 makes it
 a guessing game whether you need -> or . since you don't know whether
 it's a pointer or the actual object. All to save a * in parameter and
 variable declaration?...

>>>
>>> FWIW I don't like that pattern either although it's used in many
>>> parts of our code-base.
>>
>> But relatively few new ones, most of it is pretty old.
>>
> 
> I do agree it's not particularly pretty pattern, but in this case it's
> fairly isolated in the mmgr sources, and I quite value the consistency
> in this part of the code (i.e. that aset.c, slab.c and generation.c all
> use the same approach). So I haven't changed this.
> 
> The attached v7 fixes the off-by-one error in slab.c, causing failures
> in test_decoding isolation tests, and renames Gen to Generation, as
> proposed by Petr.
> 

I'd be happy with this patch now (as in committer ready) except that it
does have some merge conflicts after the recent commits, so rebase is
needed.
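
For readers skimming the thread, the two declaration styles under
debate can be contrasted in a small standalone sketch (the type names
echo the patch, but the program itself is illustrative only): with the
hidden-pointer typedef, nothing at the call site says the argument is
a pointer, so the reader has to look up the typedef to know whether ->
or . applies.

```c
#include <assert.h>				/* for the usage assertions below */

typedef struct SlabChunkData
{
	int			size;
} SlabChunkData;

/* Style under debate: the '*' is hidden behind the typedef. */
typedef SlabChunkData *SlabChunk;

/* "chunk" is a pointer, but the signature does not show it. */
static int
chunk_size_hidden(SlabChunk chunk)
{
	return chunk->size;
}

/* Alternative style: pass-by-reference is visible in the signature. */
static int
chunk_size_explicit(SlabChunkData *chunk)
{
	return chunk->size;
}
```

Both compile to the same thing; the argument is purely about
readability, which is why consistency across the mmgr files carries
weight either way.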

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] jacana hung after failing to acquire random number

2016-12-11 Thread Michael Paquier
On Sun, Dec 11, 2016 at 9:06 AM, Andrew Dunstan  wrote:
>
> jacana (mingw, 64 bit compiler, no openssl) is currently hung on "make
> check". After starting the autovacuum launcher there are 120 messages in its
> log about "Could not acquire random number". Then nothing.
>
>
> So I suspect the problem here is commit
> fe0a0b5993dfe24e4b3bcf52fa64ff41a444b8f1, although I haven't looked in
> detail.
>
>
> Shouldn't we want the postmaster to shut down if it's not going to go
> further? Note that frogmouth, also mingw, which builds with openssl, doesn't
> have this issue.

Did you unlock it in some way at the end? Here is the shape of the
report for others:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2016-12-10%2022%3A00%3A15
And here is of course the interesting bit:
2016-12-10 17:25:38.822 EST [584c80e2.ddc:2] LOG:  could not acquire
random number
2016-12-10 17:25:39.869 EST [584c80e2.ddc:3] LOG:  could not acquire
random number
2016-12-10 17:25:40.916 EST [584c80e2.ddc:4] LOG:  could not acquire
random number

I am not seeing any problems with MSVC without openssl, so that's a
problem specific to MinGW. I am beginning to wonder if it is actually
a good idea to cache the crypt context and then re-use it. Using a new
context every time is definitely not great performance-wise, though. A
second difference is the missing CRYPT_MACHINE_KEYSET and
CRYPT_NEWKEYSET flags. So, with something like the patch attached, do
you see improvements? What this patch does is to use a different
context at each call, with the key container flags set to allow proper
access to it (as there are winxp-only limitations here). I have tried
to compile with MinGW in my environment, but my gcc compiler keeps
crashing, so I cannot reproduce the problem directly, I am afraid.
-- 
Michael
diff --git a/src/port/pg_strong_random.c b/src/port/pg_strong_random.c
index 6d3aa38..34cab6f 100644
--- a/src/port/pg_strong_random.c
+++ b/src/port/pg_strong_random.c
@@ -31,14 +31,6 @@
 #include 
 #endif
 
-#ifdef WIN32
-/*
- * Cache a global crypto provider that only gets freed when the process
- * exits, in case we need random numbers more than once.
- */
-static HCRYPTPROV hProvider = 0;
-#endif
-
 #if defined(USE_DEV_URANDOM)
 /*
  * Read (random) bytes from a file.
@@ -111,28 +103,36 @@ pg_strong_random(void *buf, size_t len)
 	 * Windows has CryptoAPI for strong cryptographic numbers.
 	 */
 #elif defined(USE_WIN32_RANDOM)
-	if (hProvider == 0)
+	HCRYPTPROV	hProvider;
+	DWORD		flags;
+
+	flags = CRYPT_VERIFYCONTEXT | CRYPT_SILENT | CRYPT_MACHINE_KEYSET;
+
+	/* Create a crypto provider */
+	if (!CryptAcquireContext(&hProvider,
+			 NULL,
+			 MS_DEF_PROV,
+			 PROV_RSA_FULL,
+			 flags))
 	{
+		/* If previous creation failed, try with a new key container */
+		flags |= CRYPT_NEWKEYSET;
		if (!CryptAcquireContext(&hProvider,
  NULL,
  MS_DEF_PROV,
  PROV_RSA_FULL,
- CRYPT_VERIFYCONTEXT | CRYPT_SILENT))
-		{
-			/*
-			 * On failure, set back to 0 in case the value was for some reason
-			 * modified.
-			 */
-			hProvider = 0;
-		}
+ flags))
+			return false;
 	}
-	/* Re-check in case we just retrieved the provider */
-	if (hProvider != 0)
+
+	if (!CryptGenRandom(hProvider, len, buf))
 	{
-		if (CryptGenRandom(hProvider, len, buf))
-			return true;
+		CryptReleaseContext(hProvider, 0);
+		return false;
 	}
-	return false;
+
+	CryptReleaseContext(hProvider, 0);
+	return true;
 
 	/*
 	 * Read /dev/urandom ourselves.



Re: [HACKERS] snapbuild woes

2016-12-11 Thread Craig Ringer
On 12 December 2016 at 00:36, Kevin Grittner  wrote:
> On Sun, Dec 11, 2016 at 1:17 AM, Craig Ringer
>  wrote:
>> On 11 Dec. 2016 06:50, "Petr Jelinek"  wrote:
>>> On 10/12/16 23:10, Petr Jelinek wrote:
>>>
 The attached 0002-Skip-unnecessary-snapshot-builds.patch changes this
 behavior so that we don't make snapshots for transactions that we seen
 wholly and know that they didn't make catalog changes even if we didn't
 reach consistency yet.
>>>
>>> Eh, attached wrong patch. This one is correct.
>>
>> Attached no patch second time?
>
> I see an attachment, and it shows in the archives.
>
> https://www.postgresql.org/message-id/aee1d499-e3ca-e091-56da-1ee6a47741c8%402ndquadrant.com

Sorry for the noise, apparently my phone's mail client was being dense.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Declarative partitioning - another take

2016-12-11 Thread Tomas Vondra

Hi,

On 12/07/2016 07:20 PM, Robert Haas wrote:
> On Wed, Dec 7, 2016 at 11:53 AM, Erik Rijkers  wrote:
>>> My bad.  The fix I sent last night for one of the cache flush issues
>>> wasn't quite right.  The attached seems to fix it.
>> Yes, fixed here too.  Thanks.
>
> Thanks for the report - that was a good catch.
>
> I've committed 0001 - 0006 with that correction and a few other
> adjustments.  There's plenty of room for improvement here, and almost
> certainly some straight-up bugs too, but I think we're at a point
> where it will be easier and less error-prone to commit follow on
> changes incrementally rather than by continuously re-reviewing a very
> large patch set for increasingly smaller changes.
>

I've been working on a review / testing of the partitioning patch, but 
have been unable to submit it before the commit due to a lot of travel. 
However, at least some of the points seem to be still valid, so let me 
share it as an after-commit review. Most of the issues are fairly minor 
(possibly even nitpicking).



review
--

1) src/include/catalog/pg_partitioned_table.h still contains this
obsolete CVS keyword line, which should probably be removed:

 * $PostgreSQL: pgsql/src/include/catalog/pg_partitioned_table.h $

2) I'm wondering whether having 'table' in the catalog name (and also in 
the new relkind) is too limiting. I assume we'll have partitioned 
indexes one day, for example - do we expect to use the same catalogs?


3) A comment within BeginCopy (in copy.c) says:

* Initialize state for CopyFrom tuple routing.  Watch out for
* any foreign partitions.

But the code does not seem to be doing that (at least I don't see any 
obvious checks for foreign partitions). Also, the comment should 
probably at least mention why foreign partitions need extra care.


To nitpick, the 'pd' variable in that block seems rather pointless - we 
can assign directly to cstate->partition_dispatch_info.


4) I see GetIndexOpClass() got renamed to ResolveOpClass(). I find the 
new name rather strange - all other similar functions start with "Get", 
so I guess "GetOpClass()" would be better. But I wonder if the rename 
was even necessary, considering that it still deals with index operator 
classes (although now also in context of partition keys). If the rename 
really is needed, isn't that a sign that the function does not belong to 
indexcmds.c anymore?


5) Half the error messages use 'child table' while the other half uses 
'partition'. I think we should be using 'partition' as child tables 
really have little meaning outside inheritance (which is kinda hidden 
behind the new partitioning stuff).


6) The psql tab-completion seems a bit broken, because it only offers 
the partitions, not the parent table, which is usually exactly the 
opposite of what the user wants.



testing
---

I've also done quite a bit of testing with different partition layouts 
(single-level list/range partitioning, multi-level partitioning etc.), 
with fairly large number (~10k) of partitions. The scripts I used are 
available here: https://bitbucket.org/tvondra/partitioning-tests


1) There seems to be an issue when a partition is created and then 
accessed within the same transaction, i.e. for example


BEGIN;
... create parent ...
... create partition 
... insert into parent ...
COMMIT;

which simply fails with an error like this:

ERROR:  no partition of relation "range_test_single" found for row
DETAIL:  Failing row contains (99000, 99000).

Of course, the partition is there. And interestingly enough, this works 
perfectly fine when executed without the explicit transaction, so I 
assume it's some sort of cache invalidation mix-up.


2) When doing a SELECT COUNT(*) from the partitioned table, I get a plan 
like this:


  QUERY PLAN
---
 Finalize Aggregate  (cost=124523.64..124523.65 rows=1 width=8)
   ->  Gather  (cost=124523.53..124523.64 rows=1 width=8)
 Workers Planned: 1
 ->  Partial Aggregate  (cost=123523.53..123523.54 rows=1 ...)
   ->  Append  (cost=0.00..108823.53 rows=5880001 width=0)
 ->  Parallel Seq Scan on parent ...
 ->  Parallel Seq Scan on partition_1 ...
 ->  Parallel Seq Scan on partition_2 ...
 ->  Parallel Seq Scan on partition_3 ...
 ->  Parallel Seq Scan on partition_4 ...
 ->  Parallel Seq Scan on partition_5 ...
 ->  Parallel Seq Scan on partition_6 ...
 ... and the rest of the 10k partitions

So if I understand the plan correctly, we first do a parallel scan of 
the parent, then partition_1, partition_2 etc. But AFAIK we scan the 
tables in an Append sequentially, and each partition only has 1000 
rows, so the parallel execution is rather inefficient unless we scan 
the partitions concurrently.


In any case, as this happens even with plain 

Re: [HACKERS] building HEAD on macos fails with #error no source of random numbers configured

2016-12-11 Thread Jim Nasby

On 12/9/16 9:53 AM, Heikki Linnakangas wrote:


I'm not sure what --no-recursion does, but I would say that we'd
consider that unsupported as well.


Interesting. Running config.status adds those --no-create --no-recursion
flags automatically. You can see them in the command-line at the top of
config.log, too. I never bothered to check what they do...


AIUI they have to do with config dependency checking (where a simple 
make will detect if config needs to run again or not). I've been bitten 
by this in the past as well. Maybe there's a way to get config to spit 
out a warning for those options and have make filter the warning out.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Optimization for index-only scans with filter conditions

2016-12-11 Thread Tom Lane
Mateusz Stefek  writes:
> Attached is a patch, which changes the order of the checks. I.e. the
> visibility of the row is confirmed only after the qual condition is
> evaluated to true.

This cannot be accepted, because it's unsafe to try to apply user
code to a dead tuple.  The tuple might reference toast data that's
not there anymore, or in the worst case it might not even match
the table rowtype anymore.

Even without that problem, I would think that whether it is a win or
not would be mighty workload-dependent.

regards, tom lane




Re: [HACKERS] [sqlsmith] Crash in tsquery_rewrite/QTNBinary

2016-12-11 Thread Tom Lane
Artur Zakirov  writes:
> 2016-12-07 9:06 GMT+03:00 Andreas Seltenreich :
>> the following query crashes master as of 4212cb7.

> It happens because 'I' is stop word and substitute query becomes
> empty. And for queries above we need recursive dropvoidsubtree()
> function. Without this patch this function cleans only first level of
> tree. And query above becomes: '6 | void'.

> Firstly I made recursive dropvoidsubtree(). But attached patch cleans
> query tree in dofindsubquery() to avoid extra tree scan.

This patch looks good to me.  I have to admit that I'd been suspicious
of dropvoidsubtree() the last time I looked at this code, but I didn't
have adequate reason to touch it.  Pushed with some minor comment
adjustments.

regards, tom lane




[HACKERS] Optimization for index-only scans with filter conditions

2016-12-11 Thread Mateusz Stefek
Hi,

I noticed that during an index-only scan the filter check is executed
after the visibility of a tuple is confirmed. This means a lot of
unnecessary heap fetches, if the filter condition is highly selective.

In the application I'm maintaining, the visibility map is mostly
dirty. Index-only scans deteriorate into normal index-scans, even when
there is no row matching the filter conditions.

Attached is a patch, which changes the order of the checks. I.e. the
visibility of the row is confirmed only after the qual condition is
evaluated to true.

To see the performance benefits, consider the following example:
create table test_table (id integer primary key, a integer, b text, c
text, filler text);
insert into test_table (id, a, b, c, filler) select s, 1,
md5(s::text), md5(s::text), repeat('a', 100) from
generate_series(1000, 1999) s;
insert into test_table (id, a, b, c, filler) select s, 2,
md5(s::text), md5(s::text), repeat('a', 100) from
generate_series(2000, 2999) s;
insert into test_table (id, a, b, c, filler) select s, 3,
md5(s::text), md5(s::text), repeat('a', 100) from
generate_series(3000, 3999) s;
create index test_index on test_table (a, b, c);
explain analyze select a, b, c from test_table where a = 2 and b > c
order by a, b, c limit 1;

Before the changes:
"Limit  (cost=0.69..11.80 rows=1 width=68) (actual
time=8136433.536..8136433.536 rows=0 loops=1)"
"  ->  Index Only Scan using test_index on test_table
(cost=0.69..555456.69 rows=5 width=68) (actual
time=8136433.533..8136433.533 rows=0 loops=1)"
"Index Cond: (a = 2)"
"Filter: (b > c)"
"Rows Removed by Filter: 1000"
"Heap Fetches: 1000"
"Planning time: 2.054 ms"
"Execution time: 8136433.564 ms"

After the changes:
"Limit  (cost=0.69..11.80 rows=1 width=68) (actual
time=3060.548..3060.548 rows=0 loops=1)"
"  ->  Index Only Scan using test_index on test_table
(cost=0.69..555456.69 rows=5 width=68) (actual
time=3060.544..3060.544 rows=0 loops=1)"
"Index Cond: (a = 2)"
"Filter: (b > c)"
"Rows Removed by Filter: 1000"
"Heap Fetches: 0"
"Planning time: 11.684 ms"
"Execution time: 3060.615 ms"
The example is extreme, but I would say it doesn't differ much from
what I see in my production environment.

About the naming in the solution:
There's already a concept of "index recheck" which seems to be a
perfect name for the functionality I'm introducing. However, after
looking into the code, I don't think it can be reused, because the
current rechecks are evaluated before the quals, and ExecScanFetch
contains very specific code.

Please let me know what you think about the change.

--
Mateusz Stefek


indexonlyfilter.patch
Description: Binary data



Re: [HACKERS] proposal: psql statements \gstore \gstore_binary (instead COPY RAW)

2016-12-11 Thread Pavel Stehule
Hi

2016-12-09 18:39 GMT+01:00 Pavel Stehule :

> Hi
>
> Long time I am pushing a COPY RAW - without success.
>
> Now I propose functionally similar solution - reduced to only to psql
> console
>
> Now we have a statement \g for execution query, \gset for exec and store
> result in memory and I propose \gstore for storing result in file and
> \gstore_binary for storing result in file with binary passing. The query
> result should be one row, one column.
>
> Usage:
>
> SELECT image FROM accounts WHERE id = xxx
> \gstore_binary ~/image.png
>
> What do you think about this proposal?
>

Here is a PoC patch.

Regards

Pavel

Usage:

postgres=# set client_encoding to 'latin2';
SET
Time: 1,561 ms
postgres=# select a from foo
postgres-# \gbstore ~/doc.xml
Time: 1,749 ms

content of doc.xml
příliš žluťoučký kůň se napil
žluté vody


> Regards
>
> Pavel
>
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a9a2fdb..2cf54bf 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -929,6 +929,27 @@ exec_command(const char *cmd,
 		status = PSQL_CMD_SEND;
 	}
 
+	/* \gstore [filename], \gbstore [filename] -- send query and store result in (binary) file */
+	else if (strcmp(cmd, "gstore") == 0 ||
+			  (strcmp(cmd, "gbstore") == 0))
+	{
+		char	   *fname = psql_scan_slash_option(scan_state,
+   OT_FILEPIPE, NULL, false);
+
+		if (!fname)
+			pset.gfname = pg_strdup("");
+		else
+		{
			expand_tilde(&fname);
+			pset.gfname = pg_strdup(fname);
+		}
+
+		pset.raw_flag = true;
+		pset.binary_result = (strcmp(cmd, "gbstore") == 0);
+		free(fname);
+		status = PSQL_CMD_SEND;
+	}
+
 	/* help */
 	else if (strcmp(cmd, "h") == 0 || strcmp(cmd, "help") == 0)
 	{
@@ -1064,7 +1085,6 @@ exec_command(const char *cmd,
 		free(opt2);
 	}
 
-
 	/* \o -- set query output */
 	else if (strcmp(cmd, "o") == 0 || strcmp(cmd, "out") == 0)
 	{
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index a7789df..b2e437a 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -854,6 +854,85 @@ StoreQueryTuple(const PGresult *result)
 	return success;
 }
 
+/*
+ * StoreRawResult: the returned value (possibly binary) is displayed
+ * or stored in file. The result should be exactly one row, one column.
+ */
+static bool
+StoreRawResult(const PGresult *result)
+{
+	bool		success = true;
+
+	if (PQntuples(result) < 1)
+	{
+		psql_error("no rows returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQntuples(result) > 1)
+	{
+		psql_error("more than one row returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQnfields(result) < 1)
+	{
+		psql_error("no columns returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQnfields(result) > 1)
+	{
+		psql_error("more than one column returned for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else if (PQgetisnull(result, 0, 0))
+	{
+		psql_error("returned value is null for \\gstore or \\gbstore\n");
+		success = false;
+	}
+	else
+	{
+		char	   *value;
+		int			length;
+		FILE	   *fout = NULL;
+		bool		is_pipe = false;
+
+		value = PQgetvalue(result, 0, 0);
+		length = PQgetlength(result, 0, 0);
+
+		if (pset.gfname && *(pset.gfname) != '\0')
+		{
			if (!openQueryOutputFile(pset.gfname, &fout, &is_pipe))
+success = false;
+			if (success && is_pipe)
+disable_sigpipe_trap();
+		}
+
+		if (success)
+		{
+			success = fwrite(value, 1, length, fout != NULL ? fout : pset.queryFout) == length;
+			if (!success)
+psql_error("%s: %s\n", pset.gfname, strerror(errno));
+
+			if (success)
+success = fflush(fout != NULL ? fout : pset.queryFout) == 0;
+
+			if (!success)
+psql_error("%s: %s\n", pset.gfname, strerror(errno));
+
+			if (fout != NULL)
+			{
+if (is_pipe)
+{
+	pclose(fout);
+	restore_sigpipe_trap();
+}
+else
+	fclose(fout);
+			}
+		}
+	}
+
+	return success;
+}
 
 /*
  * ExecQueryTuples: assuming query result is OK, execute each query
@@ -1124,6 +1203,8 @@ PrintQueryResults(PGresult *results)
 success = ExecQueryTuples(results);
 			else if (pset.crosstab_flag)
 success = PrintResultsInCrosstab(results);
+			else if (pset.raw_flag)
+success = StoreRawResult(results);
 			else
 success = PrintQueryTuples(results);
 			/* if it's INSERT/UPDATE/DELETE RETURNING, also print status */
@@ -1278,7 +1359,8 @@ SendQuery(const char *query)
 	}
 
 	if (pset.fetch_count <= 0 || pset.gexec_flag ||
-		pset.crosstab_flag || !is_select_command(query))
+		pset.crosstab_flag || !is_select_command(query) ||
+		pset.raw_flag)
 	{
 		/* Default fetch-it-all-and-print mode */
 		instr_time	before,
@@ -1287,7 +1369,16 @@ SendQuery(const char *query)
 		if (pset.timing)
 			INSTR_TIME_SET_CURRENT(before);
 
-		results = PQexec(pset.db, query);
+		if (pset.binary_result)
+			results = PQexecParams(pset.db, query,
+	0,
+	NULL,
+	NULL,
+	NULL,
+	NULL,
+	pset.binary_result);
+		else
+			

Re: [HACKERS] snapbuild woes

2016-12-11 Thread Kevin Grittner
On Sun, Dec 11, 2016 at 1:17 AM, Craig Ringer
 wrote:
> On 11 Dec. 2016 06:50, "Petr Jelinek"  wrote:
>> On 10/12/16 23:10, Petr Jelinek wrote:
>>
>>> The attached 0002-Skip-unnecessary-snapshot-builds.patch changes this
>>> behavior so that we don't make snapshots for transactions that we seen
>>> wholly and know that they didn't make catalog changes even if we didn't
>>> reach consistency yet.
>>
>> Eh, attached wrong patch. This one is correct.
>
> Attached no patch second time?

I see an attachment, and it shows in the archives.

https://www.postgresql.org/message-id/aee1d499-e3ca-e091-56da-1ee6a47741c8%402ndquadrant.com

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb problematic operators

2016-12-11 Thread Geoff Winkless
On 9 Dec 2016 17:54, "Andres Freund"  wrote:

On 2016-12-09 12:17:32 -0500, Robert Haas wrote:
> As Geoff says, you don't have to use the operators; you could use the
> equivalent functions instead.  Every operator just gets turned into a
> function call internally, so this is always possible.

Well, except that only operators support indexing :(


Really? Seems like an odd design decision.

The only other simple suggestion then would be to use PDO named parameters
instead of positional ones. Much nicer syntax anyway, IMO.

Geoff


Re: [HACKERS] background sessions

2016-12-11 Thread Andrew Borodin
Hi!

I'm planning to review this patch.
I have some questions; maybe the answers could help me understand the
patch better.
1. As far as I can see, we cannot use COPY FROM STDIN in a bg session?
Since one of the purposes is to orchestrate transactions, maybe that
would be valuable.
2. From my point of view the interface of the feature is the strong
point here (compared to pg_background). But it does not help the
programmer follow good practice: a bgworker is a valuable and limited
resource, so maybe we could start a session with something like
TryBeginSession()? Maybe this violates some policies or idioms
I'm not aware of.
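
To make point 2 concrete, here is a rough sketch of the kind of
interface I have in mind (TryBeginSession(), EndSession() and the
MAX_WORKER_SESSIONS cap are all hypothetical names of mine, not
anything in the patch): instead of erroring out when no bgworker slot
is free, the caller gets a false return and can retry later or fall
back to doing the work in-process.

```c
#include <assert.h>				/* for the usage assertions below */
#include <stdbool.h>

/* Hypothetical cap on concurrent background sessions. */
#define MAX_WORKER_SESSIONS 4

static int	sessions_in_use = 0;

/*
 * Try to reserve a background worker slot.  Returns false instead of
 * throwing an error when all slots are busy, so the caller can retry
 * later or degrade gracefully.
 */
static bool
TryBeginSession(void)
{
	if (sessions_in_use >= MAX_WORKER_SESSIONS)
		return false;
	sessions_in_use++;
	return true;
}

/* Release a previously reserved slot. */
static void
EndSession(void)
{
	if (sessions_in_use > 0)
		sessions_in_use--;
}
```

Whether such a Try-style entry point fits the existing bgworker
registration idioms is exactly what I'm unsure about.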

Thank you for your work.

Best regards, Andrey Borodin.




[HACKERS] Typo in doc/src/sgml/catalogs.sgml

2016-12-11 Thread Joel Jacobson
Hi,

I found a minor typo at
https://www.postgresql.org/docs/9.6/static/catalog-pg-am.html.

pg_catalog.pg_am. amhandler is of type "oid" according to the
documentation, but it's actually of type "regproc" in reality.

Compare with e.g. pg_aggregate where the columns of type "regproc" is
both "regproc" in the documentation and in reality.

To fix:

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 29738b0..93deda4 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -566,7 +566,7 @@

  
   amhandler
-  oid
+  regproc
   pg_proc.oid
   
OID of a handler function that is responsible for supplying information

Best regards,

Joel Jacobson
Trustly




Re: [HACKERS] pg_background contrib module proposal

2016-12-11 Thread Andrew Borodin
2016-12-09 18:00 GMT+05:00 Robert Haas :
> It looks like this could be reworked as a client of Peter Eisentraut's
> background sessions code, which I think is also derived from
> pg_background:
>
> http://archives.postgresql.org/message-id/e1c2d331-ee6a-432d-e9f5-dcf85cffa...@2ndquadrant.com
>
> That might be good, because then we wouldn't have to maintain two
> copies of the code.

The code looks quite different; I mean the bgsession.c code versus the
pg_background.c code. Definitely, there is a possibility to refactor
both patches to share a common subset of base routines; they operate
with similar concepts. But to start, it's better to choose which patch
goes first, or to merge them.

There is no way to make one on the base of the other, since they both
require some work. Personally, I like the C code from pg_background
more: it is far better commented and gives the user more information
about exceptions. But the interface of bgsessions is crisper. Finally,
they solve different problems.

I signed up to review there too (in the background sessions patch). I
hope I'll have enough resources to provide a decent review for both in
December, before the commitfest.

Best regards, Andrey Borodin.

