Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-11-13 Thread Alexander Kuzmenkov

I've pushed the executor part of this, but mostly not the planner part,
because I didn't think the latter was anywhere near ready for prime
time: the regression test changes it was causing were entirely bogus.


Hi Tom,

Thanks for the commit and the explanation. I'll try to address your 
comments if I continue working on the planner part.


--

Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted

2017-11-13 Thread Masahiko Sawada
Hi,

Commit e2c79e14 prevented multiple concurrent cleanups of the pending list
in a GIN index. But I think there is still a possibility that vacuum could
miss tuples that should be deleted if someone else is cleaning up the
pending list.

In ginInsertCleanup(), we lock the GIN meta page with LockPage() and can
wait for a concurrent cleanup process if stats == NULL. The source code
comment says that this happens when ginInsertCleanup() is called by
[auto]vacuum/analyze or gin_clean_pending_list(). I agree with that
behavior. However, looking at the callers, stats is NULL only if the
pending list exceeds the threshold during insertion, or if only analyzing
is performed by an autovacuum worker or an ANALYZE command. So I think we
should set inVacuum = (stats != NULL) instead. Also, we might want
autoanalyze and the ANALYZE command to wait for a concurrent cleanup as
well. The attached patch fixes these two issues. If this is a bug, we
should back-patch it to 9.6.

void
ginInsertCleanup(GinState *ginstate, bool full_clean,
 bool fill_fsm, IndexBulkDeleteResult *stats)
{

(snip)

bool        inVacuum = (stats == NULL);

/*
 * We would like to prevent concurrent cleanup process. For that we will
 * lock metapage in exclusive mode using LockPage() call. Nobody other
 * will use that lock for metapage, so we keep possibility of concurrent
 * insertion into pending list
 */

if (inVacuum)
{
/*
 * We are called from [auto]vacuum/analyze or gin_clean_pending_list()
 * and we would like to wait concurrent cleanup to finish.
 */
LockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
workMemory =
(IsAutoVacuumWorkerProcess() && autovacuum_work_mem != -1) ?
autovacuum_work_mem : maintenance_work_mem;
}
else
{
/*
 * We are called from regular insert and if we see concurrent cleanup
 * just exit in hope that concurrent process will clean up pending
 * list.
 */
if (!ConditionalLockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock))
return;
workMemory = work_mem;
}
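
For readers without the attachment, the core of the proposed change is
simply to flip that test (a sketch only; the actual patch may also make
the analyze-only callers wait):

    /*
     * Proposed: callers that pass a non-NULL IndexBulkDeleteResult, i.e.
     * [auto]vacuum, must wait for a concurrent cleanup; the regular
     * insertion path passes NULL and may just give up.
     */
    bool        inVacuum = (stats != NULL);     /* was: (stats == NULL) */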

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


fix_ginInsertCleanup.patch
Description: Binary data



Re: [HACKERS] Partition-wise aggregation/grouping

2017-11-13 Thread Konstantin Knizhnik



On 11.11.2017 23:29, Konstantin Knizhnik wrote:

On 10/27/2017 02:01 PM, Jeevan Chalke wrote:

Hi,

Attached new patch-set here. Changes include:

1. Added a separate patch for costing the Append node, as discussed
up-front in the patch-set.
2. Since we now cost the Append node, we don't need the
partition_wise_agg_cost_factor GUC, so removed that. The remaining patch is
hence merged into the main implementation patch.
3. Updated rows in test-cases so that we will get partition-wise plans.

Thanks


I applied the partition-wise-agg-v6.tar.gz patch to master and used the
shard.sh example from
https://www.postgresql.org/message-id/14577.1509723225%40localhost

The plan for count(*) is the following:

shard=# explain select count(*) from orders;
                                      QUERY PLAN
------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=100415.29..100415.30 rows=1 width=8)
   ->  Append  (cost=50207.63..100415.29 rows=2 width=8)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_0  (cost=101.00..50195.13 rows=5000 width=0)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_1  (cost=101.00..50195.13 rows=5000 width=0)



We really do calculate a partial aggregate for each partition, but to do so
we still have to fetch all the data from the remote host.

So for foreign partitions such plans are absolutely inefficient.
Maybe it should be combined with some other patch?
For example, with the agg_pushdown_v4.tgz patch from
https://www.postgresql.org/message-id/14577.1509723225%40localhost ?

But it does not apply after the partition-wise-agg-v6.tar.gz patch.
Also, postgres_fdw in 11devel is able to push down aggregates even without
the agg_pushdown_v4.tgz patch.


In 0009-Teach-postgres_fdw-to-push-aggregates-for-child-rela.patch
there is the following check:

 /* Partial aggregates are not supported. */
+   if (extra->isPartial)
+   return;

If we just comment out this check, the produced plan will be the following:

shard=# explain select sum(product_id) from orders;
                            QUERY PLAN
------------------------------------------------------------------
 Finalize Aggregate  (cost=308.41..308.42 rows=1 width=8)
   ->  Append  (cost=144.18..308.41 rows=2 width=8)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_0 orders)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_1 orders)
(6 rows)

And that is actually the desired plan!
Obviously such an approach will not always work: the FDW does not really
support partial aggregates now.
But for the most frequently used aggregates (sum, min, max, count),
aggtype == aggtranstype, so there is no difference between partial and
normal aggregate calculation.
So instead of the (extra->isPartial) condition we could add a more complex
check that traverses the pathtarget expressions and verifies that they can
be evaluated this way. Or... extend the FDW API to support partial
aggregation.
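
To illustrate the direction, something like the following could replace the
blanket rejection (a rough, hypothetical sketch, not code from any posted
patch; it ignores aggregates with internal/serialized transition states,
and the function name is made up):

    #include "postgres.h"
    #include "nodes/primnodes.h"
    #include "nodes/relation.h"

    /*
     * Return true if every aggregate in the pushed-down target produces its
     * transition value directly as its result, so the remote partial result
     * can be used as-is (sum, min, max, count fall into this category).
     */
    static bool
    partial_agg_ok_for_pushdown(PathTarget *target)
    {
        ListCell   *lc;

        foreach(lc, target->exprs)
        {
            Node       *expr = (Node *) lfirst(lc);

            if (IsA(expr, Aggref))
            {
                Aggref     *agg = (Aggref *) expr;

                if (agg->aggtranstype != agg->aggtype)
                    return false;
            }
        }
        return true;
    }

With something like that, the (extra->isPartial) early return in 0009 could
be relaxed to reject only targets for which this check fails.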


But even the last plan is not ideal: it will compute the partial results at
each remote node sequentially.

There is the parallel append patch:
https://www.postgresql.org/message-id/CAJ3gD9ctEcrVUmpY6fq_JUB6WDKGXAGd70EY68jVFA4kxMbKeQ%40mail.gmail.com
but the FDW doesn't support parallel scan, so parallel append cannot be
applied in this case.
And we actually do not need parallel append with all its dynamic workers
here.
We just need to start the commands on all remote servers and only then
fetch the results (which can be done sequentially).


I am investigating the problem of efficient execution of OLAP queries on
sharded tables (tables with remote partitions).

After reading all these threads and the corresponding patches, it seems to
me that we already have most of the pieces of the puzzle; what we need is
to put them in the right places, and perhaps add the missing ones.

I wonder whether somebody is already busy with this, and whether I can
somehow help here?

Also, I am not quite sure about the best approach to parallel execution of
a distributed query on all nodes.
Should we make postgres_fdw parallel-safe and use parallel append? How
difficult would that be?
Or, in addition to parallel append, should we also have an "asynchronous
append" which would be able to initiate execution on all nodes?
It seems to be close to merge append, because it would have to traverse all
cursors simultaneously.


The second approach looks easier to implement. But in the case of a sharded
table, a distributed query may need to traverse both remote and local
shards, and this approach doesn't allow several local shards to be
processed in parallel.




I attach a small patch for postgres_fdw.c which allows concurrent execution
of aggregates by all remote servers (when they are accessed through
postgres_fdw).
I have added a "postgres_fdw.use_prefetch" GUC to enable/disable
prefetching of data in postgres_fdw.
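
For readers without the attachment, this is roughly how such a GUC is
normally registered from an extension's _PG_init(); the patch itself is not
reproduced here, so the actual wiring may differ:

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/guc.h"

    static bool use_prefetch = true;

    void
    _PG_init(void)
    {
        /* postgres_fdw.use_prefetch: enable/disable prefetching of remote results */
        DefineCustomBoolVariable("postgres_fdw.use_prefetch",
                                 "Prefetch results from remote servers concurrently.",
                                 NULL,
                                 &use_prefetch,
                                 true,
                                 PGC_USERSET,
                                 0,
                                 NULL, NULL, NULL);
    }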


Re: [HACKERS] How to implement a SP-GiST index as a extension module?

2017-11-13 Thread Alexander Korotkov
Hi!

On Mon, Nov 13, 2017 at 6:47 AM, Connor Wolf  wrote:

> Ok, I've managed to get my custom index working.
>

Good!

> It's all on github here: https://github.com/fake-name/pg-spgist_hamming, if
> anyone else needs a fuzzy-image searching system
> that can integrate into postgresql..
>
> It should be a pretty good basis for anyone else to use if they want to
> implement a SP-GiST index too.
>

I took a look at the code, and I'm a bit confused :)
It appears that you're indexing int8 values.  That seems like an
unrealistically short representation for an image signature.
Also, the name of the repository made me think that Hamming distance would
be used to compare signatures.  But after looking at the code, I see that
the plain absolute value of the difference is used for that purpose.

static double
getDistance(Datum v1, Datum v2)
{
int64_t a1 = DatumGetInt64(v1);
int64_t a2 = DatumGetInt64(v2);
int64_t diff = Abs(a1 - a2);
fprintf_to_ereport("getDistance %ld <-> %ld : %ld", a1, a2, diff);
return diff;
}

For such a notion of distance, you don't need a VP-tree or other complex
indexing; a B-tree is quite enough in this case.  Alternatively, the
distance function is not what it was meant to be.
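
For comparison, a Hamming distance over 64-bit signatures would look
roughly like this (a sketch, not code from the repository under
discussion):

static double
getHammingDistance(Datum v1, Datum v2)
{
    uint64      x = (uint64) DatumGetInt64(v1) ^ (uint64) DatumGetInt64(v2);
    int         count = 0;

    /* count the differing bits; a popcount builtin could be used instead */
    while (x)
    {
        x &= x - 1;
        count++;
    }
    return (double) count;
}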

It would be useful if you provided a complete usage example of this
extension: from image-to-signature conversion to search queries.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [Patch] Log SSL certificate verification errors

2017-11-13 Thread Laurenz Albe
Graham Leggett wrote:
> Currently neither the server side nor the client side SSL certificate verify
> callback does anything, leading to potential hair-tearing-out moments.
>
> The following patch to master implements logging of all certificate
> verification failures, as well as (crucially) which certificates failed to 
> verify,
> and at what depth, so the admin can zoom in straight onto the problem without 
> any guessing.

+1 for the idea.

I have been in this situation before, and any information that helps to
clarify what the problem is would be a great help.

Yours,
Laurenz Albe




Re: [HACKERS] 10beta1 sequence regression failure on sparc64

2017-11-13 Thread Jonathan Jacobson

Christoph, what beta2 change was it that fixed that problem?

I'm having exactly the same regression test failure in version 10.1 (not beta) 
running on Solaris 9

Compiler used: GCC 4.6.4
CFLAGS: -O2 -m64


On 13/07/2017 20:05, Christoph Berg wrote:

Re: To Andres Freund 2017-05-24 <20170524170921.7pykzbt54dlfk...@msg.df7cb.de>

If we had a typo or something in that code, the build farm should have
caught it by now.

I would try compiling with lower -O and see what happens.

Trying -O0 now.

Sorry for the late reply, I was hoping to catch you on IRC, but then
didn't manage to be around at a time that would fit the timezone
shift.

The -O0 build passed the regression tests. Not sure what that means
for our problem, though.

Fwiw, the problem is fixed with beta2, even with -O2:

https://buildd.debian.org/status/logs.php?pkg=postgresql-10=sparc64

Christoph








Re: [HACKERS] Fix bloom WAL tap test

2017-11-13 Thread Alexander Korotkov
Hi!

On Fri, Nov 10, 2017 at 9:12 PM, Tom Lane  wrote:

> I wrote:
> > Is there anything we can do to cut the runtime of the TAP test to
> > the point where running it by default wouldn't be so painful?
>
> As an experiment, I tried simply cutting the size of the test table 10X:
>
> diff --git a/contrib/bloom/t/001_wal.pl b/contrib/bloom/t/001_wal.pl
> index 1b319c9..566abf9 100644
> --- a/contrib/bloom/t/001_wal.pl
> +++ b/contrib/bloom/t/001_wal.pl
> @@ -57,7 +57,7 @@ $node_standby->start;
>  $node_master->safe_psql("postgres", "CREATE EXTENSION bloom;");
>  $node_master->safe_psql("postgres", "CREATE TABLE tst (i int4, t
> text);");
>  $node_master->safe_psql("postgres",
> -"INSERT INTO tst SELECT i%10, substr(md5(i::text), 1, 1) FROM
> generate_series(1,10) i;"
> +"INSERT INTO tst SELECT i%10, substr(md5(i::text), 1, 1) FROM
> generate_series(1,1) i;"
>  );
>  $node_master->safe_psql("postgres",
> "CREATE INDEX bloomidx ON tst USING bloom (i, t) WITH (col1 =
> 3);");
> @@ -72,7 +72,7 @@ for my $i (1 .. 10)
> test_index_replay("delete $i");
> $node_master->safe_psql("postgres", "VACUUM tst;");
> test_index_replay("vacuum $i");
> -   my ($start, $end) = (11 + ($i - 1) * 1, 10 + $i *
> 1);
> +   my ($start, $end) = (10001 + ($i - 1) * 1000, 1 + $i * 1000);
> $node_master->safe_psql("postgres",
>  "INSERT INTO tst SELECT i%10, substr(md5(i::text), 1, 1) FROM
> generate_series($start,$end) i;"
> );
>
> This about halved the runtime of the TAP test, and it changed the coverage
> footprint not at all according to lcov.  (Said coverage is only marginally
> better than what we get without running the bloom TAP test, AFAICT.)
>
> It seems like some effort could be put into both shortening this test
> and improving the amount of code it exercises.
>

Thank you for committing the patch which fixes the TAP test.
I'll try to improve the coverage of this test and reduce its run time.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] proposal: schema variables

2017-11-13 Thread Pavel Golub
Hello, Pavel.

You wrote:

PS> Hi,

PS> I propose a new database object - a variable. A variable is a
PS> persistent object that holds an unshared, session-scoped,
PS> non-transactional, in-memory value of any type, like variables in
PS> other languages. The persistence is required to make static checks
PS> possible, but it can be limited to the session - the variables can be
PS> temporary.

Great idea.

PS> My proposal is related to session variables from Sybase, MSSQL or
PS> MySQL (based on the @ or @@ prefix), package variables from
PS> Oracle (access is controlled by scope), and schema variables from
PS> DB2. Each design comes from different sources and traditions and
PS> has its own advantages and disadvantages. The basis of my proposal is
PS> the use of schema variables as session variables for stored procedures.
PS> It should help people who try to port complex projects to PostgreSQL
PS> from other databases.

PS> The Sybase (T-SQL) design is good for interactive work, but it
PS> is weak for use in stored procedures - static checks are not
PS> possible, and it is not possible to set access rights on variables.

PS> The Ada design (used in Oracle), based on scope, is great, but our
PS> environment is not nested. And we should support PLs other than
PS> PLpgSQL more strongly.

PS> There are not many other possibilities - a variable that
PS> should be accessible from different PLs and different procedures (over
PS> time) has to live somewhere above the PLs, and only the schema is there.

PS> The variable can be created by CREATE statement:

PS> CREATE VARIABLE public.myvar AS integer;
PS> CREATE VARIABLE myschema.myvar AS mytype;

PS> CREATE [TEMP] VARIABLE [IF NOT EXISTS] name AS type 
PS>   [ DEFAULT expression ] [[NOT] NULL]
PS>   [ ON TRANSACTION END { RESET | DROP } ]
PS>   [ { VOLATILE | STABLE } ];


PS> It is dropped by the command DROP VARIABLE [ IF EXISTS ] varname.

PS> Access is controlled by the usual access rights - by the
PS> commands GRANT/REVOKE. The possible rights are: READ, WRITE.

PS> The variables can be modified by the SQL command SET (this is taken
PS> from the standard, and it is natural):

PS> SET varname = expression;

I propose a LET keyword for this, to distinguish GUCs from variables, e.g.

LET varname = expression;

PS> Unfortunately we use the SET command for a different purpose. But I
PS> think we can solve it with a few tricks. The first is
PS> moving our GUCs to the pg_catalog schema. We can control the strictness
PS> of the SET command. In one variant, we can detect custom GUCs and allow
PS> them; in another we can disallow custom GUCs and allow only schema
PS> variables. A new LET command could be an alternative.



PS> The variables should be usable in queries implicitly (without a JOIN):


PS> SELECT varname;


PS> The SEARCH_PATH is used when varname is looked up. The variables
PS> can be used anywhere query parameters are allowed.



PS> I hope this proposal is good enough and simple.


PS> Comments, notes?


PS> regards


PS> Pavel






-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com





Re: [HACKERS] path toward faster partition pruning

2017-11-13 Thread Amit Langote
Horiguchi-san,

Thanks for taking a look.  Replying to all your emails here.

On 2017/11/10 12:30, Kyotaro HORIGUCHI wrote:
> In 0002, bms_add_range has a bit naive-looking loop
> 
> + while (wordnum <= uwordnum)
> + {
> + bitmapword mask = (bitmapword) ~0;
> +
> + /* If working on the lower word, zero out bits below 'lower'. */
> + if (wordnum == lwordnum)
> + {
> + int lbitnum = BITNUM(lower);
> + mask >>= lbitnum;
> + mask <<= lbitnum;
> + }
> +
> + /* Likewise, if working on the upper word, zero bits above 
> 'upper' */
> + if (wordnum == uwordnum)
> + {
> + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) + 
> 1);
> + mask <<= ushiftbits;
> + mask >>= ushiftbits;
> + }
> +
> + a->words[wordnum++] |= mask;
> + }
> 
> Without some aggressive optimization, the loop spends most of its
> time on checks and jumps that do nothing, especially with many
> partitions, and it is somewhat unintuitive.
> 
> The following uses a bit tricky bitmap operation but
> is straightforward as a whole.
> 
> =
> /* fill the bits upper from BITNUM(lower) (0-based) of the first word */
> a->words[wordnum++] |= ~(bitmapword)((1 << BITNUM(lower)) - 1);
> 
> /* fill up intermediate words */
> while (wordnum < uwordnum)
>     a->words[wordnum++] = ~(bitmapword) 0;
> 
> /* fill up to BITNUM(upper) bit (0-based) of the last word */
> a->words[wordnum++] |=
>   (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1));
> =

Considering also David's comment downthread, I will try to incorporate
this into bms_add_range().
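
For illustration, the combined version could look roughly like this (a
sketch only; it assumes the WORDNUM()/BITNUM() macros local to bitmapset.c
and that the words array has already been enlarged up to uwordnum, as the
patch does):

    int         lwordnum = WORDNUM(lower);
    int         uwordnum = WORDNUM(upper);
    int         wordnum = lwordnum;
    bitmapword  lomask = (~(bitmapword) 0) << BITNUM(lower);
    bitmapword  himask = (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - 1 - BITNUM(upper));

    if (lwordnum == uwordnum)
        a->words[wordnum] |= (lomask & himask);     /* range within one word */
    else
    {
        a->words[wordnum++] |= lomask;              /* partial first word */
        while (wordnum < uwordnum)
            a->words[wordnum++] = ~(bitmapword) 0;  /* full middle words */
        a->words[wordnum] |= himask;                /* partial last word */
    }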

> In 0003, 
> 
> +match_clauses_to_partkey(RelOptInfo *rel,
> ...
> +  if (rinfo->pseudoconstant &&
> +      (IsA(clause, Const) &&
> +       ((((Const *) clause)->constisnull) ||
> +        !DatumGetBool(((Const *) clause)->constvalue))))
> +  {
> +*constfalse = true;
> +continue;
> 
> If we call this function in both conjunction and disjunction
> contexts (the latter only in the recursive case), constfalse ==
> true means no clauses are needed at all in the former case.
> 
> Since (I think) a plain list of RestrictInfo is expected to be
> treated as a conjunction (it's quite dubious, though..),

I think it makes sense to consider a list of RestrictInfo's, such as
baserestrictinfo, that is passed as input to match_clauses_to_partkey(),
to be mutually conjunctive for our purpose here.

> we
> might do better to call this for each subnode of a disjunction. Or,
> like match_clauses_to_index, we might do better to provide
> match_clause_to_partkey(rel, rinfo, contains_const), which
> returns NULL if constfalse. (I'm not confident about this..)

After reading your comment, I realized that it was wrong for the recursive
call to match_clauses_to_partkey() to pass the arguments of an OR clause
all at once.  That broke the assumption mentioned above that all of the
clauses in the list passed to match_clauses_to_partkey() are mutually
conjunctive.  Instead, we must make a single-member list for each of the OR
clause's arguments and pass those individually.

Then, if we get constfalse for all of the OR's arguments, we return
constfalse=true to the original caller.
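
In pseudo-C, the OR handling then becomes something like this (a sketch;
the exact signature of match_clauses_to_partkey() in the patch differs):

    ListCell   *lc;
    bool        all_arms_constfalse = true;

    foreach(lc, ((BoolExpr *) clause)->args)
    {
        Node       *arm = (Node *) lfirst(lc);
        bool        arm_constfalse = false;

        /* each OR arm is recursed into as its own single-member (conjunctive) list */
        (void) match_clauses_to_partkey(rel, list_make1(arm), &arm_constfalse);

        if (!arm_constfalse)
            all_arms_constfalse = false;
    }

    if (all_arms_constfalse)
        *constfalse = true;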

> 
> +  /*
> +   * If no commutator exists, cannot flip the qual's args,
> +   * so give up.
> +   */
> +  if (!OidIsValid(expr_op))
> +continue;
> 
> I suppose it's better to keep leftop and rightop as-is rather than
> flipping them over so that the Var is placed on the left side. Does that
> make things so complex?

The reason to do it that way is that code placed far away (the code in
partition.c that extracts constant values to use for pruning from the
matched clauses) can then always assume that clauses determined to be
useful for partition pruning come in the 'partkey op constant' form.

> + * It the operator happens to be '<>', which is never listed
> If?

Right, will fix.

> +if (!op_in_opfamily(expr_op, partopfamily))
> +{
> +  Oidnegator = get_negator(expr_op);
> +
> +  if (!OidIsValid(negator) ||
> +!op_in_opfamily(negator, partopfamily))
> +continue;
> 
> classify_partition_bounding_keys() checks the same thing by
> checking whether the negator's strategy is
> BTEqualStrategyNumber. (I'm not sure the operator is guaranteed
> to be a btree one, though..) Don't they need to be handled in a
> similar way?

You're right.  The <>'s negator may not always be a btree operator.  So,
we should add a check in match_clauses_to_partkey() that list or range
partitioning is in use, because only those require a btree operator
family.  We now have hash partitioning, so we need to be careful not to
assume that all partitioning operators come from btree operator families.
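
Concretely, I'm thinking of a check along these lines (a sketch only;
key->strategy and partopfamily come from the surrounding code):

    if (!op_in_opfamily(expr_op, partopfamily))
    {
        Oid         negator = get_negator(expr_op);

        /*
         * A "partkey <> const" clause is usable only for list/range
         * partitioning, whose operator families are btree families, and
         * only if the negator is the equality member of that family.
         */
        if (key->strategy == PARTITION_STRATEGY_HASH ||
            !OidIsValid(negator) ||
            get_op_opfamily_strategy(negator, partopfamily) != BTEqualStrategyNumber)
            continue;
    }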

If match_clauses_to_partkey() 

Re: [HACKERS] Jsonb transform for pl/python

2017-11-13 Thread Anthony Bykov
On Thu, 09 Nov 2017 12:26:46 +
Aleksander Alekseev  wrote:

> The following review has been posted through the commitfest
> application: make installcheck-world:  tested, failed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:tested, passed
> 
> Hello Anthony,
> 
> Great job!
> 
> I decided to take a closer look on your patch. Here are some defects
> I discovered.
> 
> > +   Additional extensions are available that implement transforms
> > for
> > +   the jsonb type for the language PL/Python.  The
> > +   extensions for PL/Perl are called  
> 
> 1. The part regarding PL/Perl is obviously from another patch.
> 
> 2. jsonb_plpython2u and jsonb_plpythonu are marked as relocatable,
> while jsonb_plpython3u is not. Is it a mistake? Anyway if an
> extension is relocatable there should be a test that checks this.
> 
> 3. Not all json types are test-covered. Tests for 'true' :: jsonb,
> '3.14' :: jsonb and 'null' :: jsonb are missing.
> 
> 4. jsonb_plpython.c:133 - "Iterate trhrough Jsonb object." Typo, it
> should be "through" or probably even "over".
> 
> 5. It looks like you've implemented transform in two directions
> Python <-> JSONB, however I see tests only for Python <- JSONB case.
> 
> 6. Tests passed on Python 2.7.14 but failed on 3.6.2:
> 
> >  CREATE EXTENSION jsonb_plpython3u CASCADE;
> > + ERROR:  could not access file "$libdir/jsonb_plpython3": No such
> > file or directory  
> 
> module_pathname in jsonb_plpython3u.control should be
> $libdir/jsonb_plpython3u, not $libdir/jsonb_plpython3.
> 
> Tested on Arch Linux x64, GCC 7.2.0.
> 
> The new status of this patch is: Waiting on Author
> 

Hello, Aleksander.
Thank you for your time. The defects you noticed have been fixed.
Please find attached the new version of the patch (it is called
0001-jsonb_plpython-extension-v2.patch).

Most of the changes were made to fix the defects (the list of defects can
be found in the quotation at the beginning of this message), but the
algorithm for iterating through the incoming jsonb was also changed so that
it looks tidier.
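
For reference, the iteration follows the standard backend pattern for
walking a jsonb value (the actual transform code is in the attached patch
and may differ in detail):

    #include "postgres.h"
    #include "utils/jsonb.h"

    static void
    walk_jsonb(Jsonb *in)
    {
        JsonbIterator *it = JsonbIteratorInit(&in->root);
        JsonbValue  v;
        JsonbIteratorToken r;

        while ((r = JsonbIteratorNext(&it, &v, false)) != WJB_DONE)
        {
            /*
             * r is one of WJB_BEGIN_OBJECT, WJB_KEY, WJB_VALUE, WJB_ELEM,
             * WJB_BEGIN_ARRAY, WJB_END_ARRAY or WJB_END_OBJECT; the
             * transform builds the corresponding Python dict/list/scalar
             * from v at this point.
             */
        }
    }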

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/Makefile b/contrib/Makefile
index 8046ca4..d9d9817 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -81,9 +81,9 @@ ALWAYS_SUBDIRS += hstore_plperl
 endif
 
 ifeq ($(with_python),yes)
-SUBDIRS += hstore_plpython ltree_plpython
+SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython
 else
-ALWAYS_SUBDIRS += hstore_plpython ltree_plpython
+ALWAYS_SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython
 endif
 
 # Missing:
diff --git a/contrib/jsonb_plpython/Makefile b/contrib/jsonb_plpython/Makefile
new file mode 100644
index 000..6371d11
--- /dev/null
+++ b/contrib/jsonb_plpython/Makefile
@@ -0,0 +1,39 @@
+# contrib/jsonb_plpython/Makefile
+
+MODULE_big = jsonb_plpython$(python_majorversion)u
+OBJS = jsonb_plpython.o $(WIN32RES)
+PGFILEDESC = "jsonb_plpython - transform between jsonb and plpythonu"
+
+PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpython $(python_includespec) -DPLPYTHON_LIBNAME='"plpython$(python_majorversion)"'
+
+EXTENSION = jsonb_plpythonu jsonb_plpython2u jsonb_plpython3u
+DATA = jsonb_plpythonu--1.0.sql jsonb_plpython2u--1.0.sql jsonb_plpython3u--1.0.sql
+
+REGRESS = jsonb_plpython$(python_majorversion)
+REGRESS_PLPYTHON3_MANGLE := $(REGRESS)
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/jsonb_plpython
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+# We must link libpython explicitly
+ifeq ($(PORTNAME), win32)
+# ... see silliness in plpython Makefile ...
+SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a))
+else
+rpathdir = $(python_libdir)
+SHLIB_LINK += $(python_libspec) $(python_additional_libs)
+endif
+
+ifeq ($(python_majorversion),2)
+REGRESS_OPTS += --load-extension=plpython2u
+else
+REGRESS_OPTS += --load-extension=plpython3u
+endif
diff --git a/contrib/jsonb_plpython/expected/jsonb_plpython2.out b/contrib/jsonb_plpython/expected/jsonb_plpython2.out
new file mode 100644
index 000..8ad5338
--- /dev/null
+++ b/contrib/jsonb_plpython/expected/jsonb_plpython2.out
@@ -0,0 +1,478 @@
+CREATE EXTENSION jsonb_plpython2u CASCADE;
+-- test jsonb -> python dict
+CREATE FUNCTION test1(val jsonb) RETURNS int
+LANGUAGE plpython2u
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, dict)
+plpy.info(sorted(val.items()))
+return len(val)
+$$;
+SELECT test1('{"a":1, "c":"NULL"}'::jsonb);
+INFO:  [('a', Decimal('1')), ('c', 'NULL')]
+ test1 
+---
+ 2
+(1 row)
+
+-- test jsonb -> python dict
+-- complex dict with dicts as value
+CREATE FUNCTION test1complex(val jsonb) RETURNS int
+LANGUAGE plpython2u
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, dict)
+assert(val == {"d":{"d": 1}})
+return 

Re: [HACKERS] UPDATE of partition key

2017-11-13 Thread David Rowley
 On 10 November 2017 at 16:42, Amit Khandekar  wrote:
[ update-partition-key_v23.patch ]

Hi Amit,

Thanks for working on this. I'm looking forward to seeing this go in.

So... I've signed myself up to review the patch, and I've just had a
look at it (after first reading this entire email thread!).

Overall the patch looks like it's in quite good shape. I think I do
agree with Robert about the UPDATE anomaly that's been discussed. I
don't think we're painting ourselves into any corner by not having
this working correctly right away. Anyone who's using some trigger
workarounds for the current lack of support for updating the partition
key is already going to have the same issues, so at least this will
save them some troubles implementing triggers and give them much
better performance. I see you've documented this fact too, which is
good.

I'm writing this email now as I've just run out of review time for today.

Here's what I noted down during my first pass:

1. Closing command tags in docs should not be abbreviated

triggers are concerned, AFTER DELETE and

This changed in c29c5789. I think Peter will be happy if you don't
abbreviate the closing tags.

2. "about to do" would read better as "about to perform"

 concurrent session, and it is about to do an UPDATE

I think this paragraph could be more clear if we identified the
sessions with a number.

Perhaps:
   Suppose session 1 is performing an UPDATE on a
   partition key; meanwhile, session 2 tries to perform an UPDATE
   or DELETE operation on the same row.
   Session 2 can silently miss the row due to session 1's activity.  In
   such a case, session 2's UPDATE/DELETE,
   being unaware of the row's movement, interprets this as meaning that the
   row has just been deleted, so there is nothing to be done for this row.
   Whereas, in the usual case where the table is not partitioned, or where
   there is no row movement, the second session would have identified the
   newly updated row and carried out the UPDATE/DELETE
   on this new row version.


3. Integer width. get_partition_natts returns int but we assign to int16.

int16 partnatts = get_partition_natts(key);

Confusingly, get_partition_col_attnum() returns int16 instead of AttrNumber,
but that's already incorrect in the existing code.

4. The following code could just pull_varattnos(partexprs, 1, _keycols);

foreach(lc, partexprs)
{
Node*expr = (Node *) lfirst(lc);

pull_varattnos(expr, 1, _keycols);
}
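
i.e. something like the following (the '&' and the bitmapset variable's
full name were mangled in the archive, so the name below is only
illustrative):

    /* pull_varattnos() walks the whole expression list in one call */
    pull_varattnos((Node *) partexprs, 1, &partexpr_attnums);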

5. Triggers. Do we need a new "TG_" tag to allow trigger functions to do
something special when the DELETE/INSERT is a partition move? I have audit
tables in mind here: it may appear as though a user performed a DELETE
when they actually performed an UPDATE. Giving visibility of this to the
trigger function would allow the application to work around it.

6. change "row" to "a row" and "old" to "the old"

* depending on whether the event is for row being deleted from old

But to be honest, I'm having trouble parsing the comment. I think it
would be better to
say explicitly when the row will be NULL rather than "depending on
whether the event"

7. I'm confused with how this change came about. If the old comment
was correct here then the comment you're referring to here should
remain in ExecPartitionCheck(), but you're saying it's in
ExecConstraints().

/* See the comments in ExecConstraints. */

If the comment really is in ExecConstraints(), then you might want to
give an overview of what you mean, then reference ExecConstraints() if
more details are required.

8. I'm having trouble parsing this comment:

 * 'update_rri' has the UPDATE per-subplan result rels.

I think "has" should be "contains" ?

9. Also, this should likely be reworded:

 * 'num_update_rri' : number of UPDATE per-subplan result rels. For INSERT,
 *  this is 0.

 'num_update_rri' number of elements in 'update_rri' array or zero for INSERT.

10. There should be no space before the '?'

/* Is this leaf partition present in the update resultrel ? */

11. I'm struggling to understand this comment:

* This is required when converting tuple as per root
* partition tuple descriptor.

"tuple" should probably be "the tuple", but not quite sure what you
mean by "as per root".

I may have misunderstood, but maybe it should read:

* This is required when we convert the partition's tuple to
* be compatible with the partitioned table's tuple descriptor.

12. I think "as well" would be better written as "either".

* If we didn't open the partition rel, it means we haven't
* initialized the result rel as well.

13. I'm unsure what is meant by the following comment:

* Verify result relation is a valid target for insert operation. Even
* for updates, we are doing this for tuple-routing, so again, we need
* to check the validity for insert operation.

I'm not quite sure where UPDATE comes in here as we're only checking for INSERT?

14. Use of underscores instead of camelCase.


Re: [HACKERS] proposal: schema variables

2017-11-13 Thread Pavel Stehule
Hi

2017-11-13 13:15 GMT+01:00 Pavel Golub :

> Hello, Pavel.
>
> You wrote:
>
> [...]
> PS> The variables can be modified by the SQL command SET (this is taken
> PS> from the standard, and it is natural):
>
> PS> SET varname = expression;
>
> I propose a LET keyword for this, to distinguish GUCs from variables, e.g.
>
> LET varname = expression;
>

 It is one possible variant. I plan to implement more variants and then
choose one.

Regards

Pavel



Re: [HACKERS] Parallel Hash take II

2017-11-13 Thread Thomas Munro
Hi Andres and Peter,

Please see below for inline responses to your feedback.  New patch attached.

On Wed, Nov 8, 2017 at 10:01 AM, Andres Freund  wrote:
> +set min_parallel_table_scan_size = 0;
> +set parallel_setup_cost = 0;
> +-- Make a simple relation with well distributed keys and correctly
> +-- estimated size.
> +create table simple as
> +  select generate_series(1, 2) AS id, 
> 'aa';
> +alter table simple set (parallel_workers = 2);
> +analyze simple;
> +-- Make a relation whose size we will under-estimate.  We want stats
> +-- to say 1000 rows, but actually there are 20,000 rows.
> +create table bigger_than_it_looks as
> +  select generate_series(1, 2) as id, 
> 'aa';
> +alter table bigger_than_it_looks set (autovacuum_enabled = 'false');
> +alter table bigger_than_it_looks set (parallel_workers = 2);
> +delete from bigger_than_it_looks where id <= 19000;
> +vacuum bigger_than_it_looks;
> +analyze bigger_than_it_looks;
> +insert into bigger_than_it_looks
> +  select generate_series(1, 19000) as id, 
> 'aa';
>
> It seems kinda easier to just manipulate ndistinct and reltuples...

Done.

> +set max_parallel_workers_per_gather = 0;
> +set work_mem = '4MB';
>
> I hope there's a fair amount of slop here - with different archs you're
> going to see quite some size differences.

Yeah, this is a problem I wrestled with.  See next.

> +-- The "good" case: batches required, but we plan the right number; we
> +-- plan for 16 batches, and we stick to that number, and peak memory
> +-- usage says within our work_mem budget
> +-- non-parallel
> +set max_parallel_workers_per_gather = 0;
> +set work_mem = '128kB';
>
> So how do we know that's actually the case we're testing rather than
> something arbitrarily different? There's IIRC tests somewhere that just
> filter the json explain output to the right parts...

Yeah, good idea.  My earlier attempts to dump out the hash join
dimensions ran into problems with architecture sensitivity and then
some run-to-run non-determinism in the parallel case (due to varying
fragmentation depending on how many workers get involved in time).
The attached version tells you about batch growth without reporting
the exact numbers, except in the "ugly" case where we know that there
is only one possible outcome because the extreme skew detector is
guaranteed to go off after the first nbatch increase (I got rid of all
other tuples except ones with the same key to make this true).

This exercise did reveal a bug in
0008-Show-hash-join-per-worker-information-in-EXPLAIN-ANA.patch
though: it is capturing shared instrumentation too soon in the
non-Parallel Hash case so the nbatch reported by EXPLAIN ANALYZE might
be too low if we grew while probing.  Oops.  Will post a fix for that.

> +/*
> + * Build the name for a given segment of a given BufFile.
> + */
> +static void
> +MakeSharedSegmentName(char *name, const char *buffile_name, int segment)
> +{
> +   snprintf(name, MAXPGPATH, "%s.%d", buffile_name, segment);
> +}
>
> Not a fan of this name - you're not "making" a filename here (as in
> allocating or such). I think I'd just remove the Make prefix.

Done.  I also changed some similar code where I'd used GetXXX when
building paths.

> +/*
> + * Open a file that was previously created in another backend with
> + * BufFileCreateShared in the same SharedFileSet using the same name.  The
> + * backend that created the file must have called BufFileClose() or
> + * BufFileExport() to make sure that it is ready to be opened by other
> + * backends and render it read-only.
> + */
>
> Is it actually guaranteed that it's another backend / do we rely on
> that?

No, it could be any backend that is attached to the SharedFileSet,
including the current one.  Wording improved.
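
For what it's worth, the intended usage looks roughly like this (names as
in the patch under review; the final API may differ):

    /* producer: any backend attached to the SharedFileSet */
    BufFile    *wfile = BufFileCreateShared(fileset, "batch0.inner");

    BufFileWrite(wfile, data, len);
    BufFileClose(wfile);        /* or BufFileExport() to keep it open read-only */

    /* consumer: any backend attached to the same SharedFileSet */
    BufFile    *rfile = BufFileOpenShared(fileset, "batch0.inner");

    if (rfile != NULL)
    {
        BufFileRead(rfile, buf, len);
        BufFileClose(rfile);
    }

    /* optional proactive cleanup, by exactly one backend */
    BufFileDeleteShared(fileset, "batch0.inner");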

> +BufFile *
> +BufFileOpenShared(SharedFileSet *fileset, const char *name)
> +{
>
> +   /*
> +* If we didn't find any files at all, then no BufFile exists with 
> this
> +* tag.
> +*/
> +   if (nfiles == 0)
> +   return NULL;
>
> s/tag/name/?

Fixed.

> +/*
> + * Delete a BufFile that was created by BufFileCreateShared in the given
> + * SharedFileSet using the given name.
> + *
> + * It is not necessary to delete files explicitly with this function.  It is
> + * provided only as a way to delete files proactively, rather than waiting 
> for
> + * the SharedFileSet to be cleaned up.
> + *
> + * Only one backend should attempt to delete a given name, and should know
> + * that it exists and has been exported or closed.
> + */
> +void
> +BufFileDeleteShared(SharedFileSet *fileset, const char *name)
> +{
> +   charsegment_name[MAXPGPATH];
> +   int segment = 0;
> +   boolfound = false;
> +
> +   /*
> +* We don't know how many segments the file has.  We'll keep deleting
> +* until we 

[HACKERS] Migration to pglister - Before

2017-11-13 Thread Stephen Frost
Greetings,

We will be migrating these lists to pglister in the next few minutes.

This final email on the old list system is intended to let you know
that future emails will have different headers and you will need to
adjust your filters.

The changes which we expect to be most significant to users can be found
on the wiki here: https://wiki.postgresql.org/wiki/PGLister_Announce

Once the migration of these lists is complete, an 'after' email will be
sent out.

Thanks!

Stephen

