Re: [HACKERS] path toward faster partition pruning

2017-10-25 Thread Beena Emerson
Hello,

On Wed, Oct 25, 2017 at 1:07 PM, Amit Langote
 wrote:
> On 2017/10/25 15:47, Amit Langote wrote:
>> On 2017/10/24 1:38, Beena Emerson wrote:
>>> I had noticed this and also that this crash:
>>>
>>> tprt PARTITION BY RANGE(Col1)
>>>tprt_1 FOR VALUES FROM (1) TO (50001) PARTITION BY RANGE(Col1)
>>>   tprt_11 FOR VALUES FROM (1) TO (1),
>>>   tprt_1d DEFAULT
>>>tprt_2 FOR VALUES FROM (50001) TO (11)
>>>
>>> EXPLAIN (COSTS OFF) SELECT * FROM tprt WHERE col1 BETWEEN 2 AND 7;
>>> server closed the connection unexpectedly
>>> This probably means the server terminated abnormally
>>> before or while processing the request.
>>> The connection to the server was lost. Attempting reset: Failed.
>>> !>
>>
>> ...and this (crash) were due to bugs in the 0005 patch.
>
> [  ]
>
>> Should be fixed in the attached updated version.
>
> Oops, not quite.  The crash that Beena reported wasn't fixed (or rather
> reintroduced by some unrelated change after once confirming it was fixed).
>
> Really fixed this time.

Some minor comments:

1. wrong function name (0003)

The comment on the function get_partitions_from_clauses_guts uses the wrong
name: "_using_" is written instead of "_from_".

 /*
+ * get_partitions_using_clauses_guts
+ * Determine relation's partitions that satisfy *all* of the clauses
+ * in the list (return value describes the set of such partitions)
+ *

2. typo information (0003)

+/*
+ * classify_partition_bounding_keys
+ * Classify partition clauses into equal, min, max keys, along with any
+ * Nullness constraints and return that informatin in the output argument

3. misspell admissible (0003)
+* columns, whereas a prefix of all partition key columns is addmissible
+* as min and max keys.

4. double and? (0002)
+* as part of the operator family, check if its negator
+* exists and and that the latter is compatible with

5. typo inequality (0002)
+* (key < val OR key > val), if the partitioning method
+* supports such notion of inequlity.


6. typo output (0005)
+* return it as the only scannable partition, that means the query
+* doesn't want null values in its outout.

7. typo provide (0005)
+   /* Valid keys->eqkeys must provoide all partition keys. */
+   Assert(keys->n_eqkeys == 0 || keys->n_eqkeys == partkey->partnatts);

8. comment of struct PartClause (0003)
+/*
+ * Information about a clause matched with a partition key column kept to
+ * avoid repeated recomputation in remove_redundant_clauses().
+ */

Instead of "repeated recomputation", we could use just "repeated
computation" or simply "recomputation".



-- 

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [COMMITTERS] pgsql: Process variadic arguments consistently in json functions

2017-10-25 Thread Michael Paquier
On Wed, Oct 25, 2017 at 5:24 AM, Andrew Dunstan  wrote:
> Process variadic arguments consistently in json functions
>
> json_build_object and json_build_array and the jsonb equivalents did not
> correctly process explicit VARIADIC arguments. They are modified to use
> the new extract_variadic_args() utility function which abstracts away
> the details of the call method.
>
> Michael Paquier, reviewed by Tom Lane and Dmitry Dolgov.
>
> Backpatch to 9.5 for the jsonb fixes and 9.4 for the json fixes, as
> that's where they originated.

- * Copyright (c) 2014-2017, PostgreSQL Global Development Group
+ * COPYRIGHT (c) 2014-2017, PostgreSQL Global Development Group
Andrew, I have just noticed that this noise diff has crept in. You may
want to fix that.
-- 
Michael




Re: [HACKERS] Timeline ID in backup_label file

2017-10-25 Thread Craig Ringer
On 26 October 2017 at 02:50, Michael Paquier  wrote:
> Hi all,
>
> Lately, in order to extract some information from a backup_label file
> in python I have found myself doing something like the following to
> get a timeline number (feel free to reuse that code, especially the
> regex pattern):
> import re
>
> pattern = re.compile('^START WAL LOCATION:.*([0-9A-F]+\/[0-9A-F]+).*\(file ([0-9A-F]+)\)')
> m = pattern.match(backup_label_line)
> if m:
>     current_lsn = m.group(1)
>     current_segment = m.group(2)
>     current_tli = str(int(current_segment[:8], 16))
>
> That's more or less similar to what read_backup_label()@xlog.c does
> using sscanf(), still I keep wondering why we need to go through this
> much complication to get the origin timeline number of a base backup,
> information which is IMO as important as the start LSN when working on
> backup methods.
>
> I would find it interesting to add, at the bottom of the backup_label file,
> a new field called "START TIMELINE: %d" to put this information into a
> more understandable shape. Any opinions?

Strong "yes" from me.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Continuous integration on Windows?

2017-10-25 Thread Thomas Munro
On Sat, Oct 14, 2017 at 3:08 AM, Andrew Dunstan
 wrote:
> I'll add some info to the wiki. Unfortunately, the tests fail on the
> tablespace test because they are running as a privileged user. We need
> to find a way around that, still looking into it. (See
> 
> which describes how I get around that when running by hand).

Thanks for that pointer.  "runas" seems to be no good in batch files
since it asks for a password interactively, but I managed to get the
tablespace test to pass by running it like this:

test_script:
  - net user testuser password1234! /add
  - psexec.exe -u testuser -p password1234! -w
c:\projects\postgres\src\tools\msvc perl.exe vcregress.pl check

(It probably goes without saying but I'll say it anyway for anyone who
might find this: this plaintext-password-in-script-files stuff is
intended for use on self-destructing isolated build bot images only
and should never be done on a computer you care about.)

Hooray!  Now I can go and figure out why my Parallel Hash regression
test is failing with file permissions problems on Windows...

-- 
Thomas Munro
http://www.enterprisedb.com




[HACKERS] Re: [BUGS] BUG #14849: jsonb_build_object doesn't like VARIADIC calls very much

2017-10-25 Thread Marko Tiikkaja
On Wed, Oct 25, 2017 at 5:32 PM, Michael Paquier 
wrote:

> On Mon, Oct 23, 2017 at 6:50 AM, Michael Paquier
>  wrote:
> > Okay, attached is what I think a fully implemented patch should look
> > like. On top of what Andrew has done, I added and reworked the
> > following:
> > - removed duplicate error handling.
> > - documented the function in funcapi.h and funcapi.c.
> > - Added a new section in funcapi.h to outline that this is for support
> > of VARIADIC inputs.
> > I have added a commit message as well. Hope this helps.
>
> For the sake of the archives, the introduction of
> extract_variadic_args is committed with f3c6e8a2, and the JSON fixes
> with 18fc4ecf. Thanks Andrew for the commit, and thanks Tom, Andrew
> and Dmitry for the reviews.


Thx yo.


.m


[HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-25 Thread David Rowley
It seems like a good idea for the planner not to generate Append or
MergeAppend paths when the node contains just a single subpath. If we
were able to remove these then the planner would have more flexibility
to build a better plan.

As of today, because we include this needless [Merge]Append node, we
cannot parallelise scans below the Append. We must also include a
Materialize node in Merge Joins since both MergeAppend and Append
don't support mark and restore. Also, as shown in [1], there's an
overhead to pulling tuples through nodes.

I've been looking into resolving this issue, but the ways I've explored
so far seem to be bending the current planner a bit out of shape.

Method 1:

In set_append_rel_size() detect when just a single subpath would be
added to the Append path. Set a promotion_child RelOptInfo field in
the base rel's RelOptInfo, and during make_one_rel, after
set_base_rel_sizes() modify the joinlist to get rid of the parent and
use the child instead.

This pretty much breaks the assumption we have that the final rel will
have all the relids in root->all_baserels. We have an Assert in
make_one_rel() which checks that this is true.

There are also complications around set_rel_size() generating paths
for subqueries which may be parameterized paths with the Append parent
required for the parameterization to work.

Method 2:

Invent a ProxyPath concept that allows us to add Paths belonging to
one relation to another relation. The ProxyPaths can have
translation_vars to translate targetlists to reference the correct
Vars. These ProxyPaths could exist up as far as createplan, where we
could perform the translation and build the ProxyPath's subnode
instead.
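
To make that a bit more concrete, the node could look roughly like the
sketch below (field names are purely illustrative, not a worked-out
proposal):

typedef struct ProxyPath
{
    Path        path;               /* "belongs" to the proxying relation */
    Path       *subpath;            /* the real path from the other relation */
    List       *translation_vars;   /* how to translate the targetlist Vars */
} ProxyPath;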

This method is not exactly clean either, as there are various places
around the planner where we'd need to give special treatment to these
ProxyPaths; for example, is_projection_capable_path() would need to return
the projection capability of the subpath within the ProxyPath, since we
never actually create a "Proxy" node.

Probably either of these two methods could be made to work. I prefer
method 2, since that infrastructure could one day be put towards scanning a
materialized view instead of the relation in order to satisfy some query.
However, method 2 also appears to require some Var translation in Path
targetlists which contain this ProxyPath; either that, or some global Var
replacement would need to be done during setrefs.

I'm posting here in the hope that it will steer my development of this
in a direction that's acceptable to the community.

Perhaps there is also a "Method 3" which I've not thought about which
would be much cleaner.

[1] 
https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Implementing pg_receivewal --no-sync

2017-10-25 Thread Michael Paquier
On Tue, Oct 24, 2017 at 11:05 PM, Kuntal Ghosh
 wrote:
> +   
> +By default, pg_receivewal flushes a WAL segment's
> +contents each time a feedback message is sent to the server depending
> +on the interval of time defined by
> +--status-interval.
> IMHO, it's okay to remove the part 'depending on
> the ... --status-interval'.

This sentence is actually wrong: a flush is not issued each time a feedback
message is sent. You need to use either --synchronous or --slot
for that, and the docs are already clear on the matter.

> +This option causes
> +pg_receivewal to not issue such flushes waiting,
> Did you mean 'to not issue such flush waitings'?

Reading the patch again, "waiting" should not be there. I have
reworded the documentation completely anyway; hopefully it is
simpler now.

> + [ 'pg_receivewal', '-D', $stream_dir, '--synchronous', '--no-sync' ],
> + 'failure if --synchronous specified without --no-sync');
> s/without/with

Right.
-- 
Michael


pg_receivewal_nosync_v2.patch
Description: Binary data



[HACKERS] Timeline ID in backup_label file

2017-10-25 Thread Michael Paquier
Hi all,

Lately, in order to extract some information from a backup_label file
in python I have found myself doing something like the following to
get a timeline number (feel free to reuse that code, especially the
regex pattern):
import re

pattern = re.compile('^START WAL LOCATION:.*([0-9A-F]+\/[0-9A-F]+).*\(file ([0-9A-F]+)\)')
m = pattern.match(backup_label_line)
if m:
    current_lsn = m.group(1)
    current_segment = m.group(2)
    current_tli = str(int(current_segment[:8], 16))

That's more or less similar to what read_backup_label()@xlog.c does
using sscanf(), still I keep wondering why we need to go through this
much complication to get the origin timeline number of a base backup,
information which is IMO as important as the start LSN when working on
backup methods.

I would find it interesting to add, at the bottom of the backup_label file,
a new field called "START TIMELINE: %d" to put this information into a
more understandable shape. Any opinions?
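
To sketch the idea (purely illustrative, not actual xlog.c code), the
backend side could then read the value back with something like:

TimeLineID  tli_from_file;

if (fscanf(lfp, "START TIMELINE: %u\n", &tli_from_file) != 1)
    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));

External tools would similarly only need to match a line of the form
"START TIMELINE: N" instead of doing hexadecimal arithmetic on the
segment file name.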
-- 
Michael




Re: [HACKERS] WIP: BRIN multi-range indexes

2017-10-25 Thread Tomas Vondra
Apparently I've managed to botch the git format-patch thing :-( Attached
are both patches (the first one adding BRIN bloom indexes, the other one
adding the BRIN multi-range). Hopefully I got it right this time ;-)

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From c27f02d2839e08ebcee852448ed3838c8932d2ea Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Mon, 23 Oct 2017 22:48:58 +0200
Subject: [PATCH 1/2] brin bloom v1

---
 doc/src/sgml/brin.sgml   | 236 ++
 src/backend/access/brin/Makefile |   2 +-
 src/backend/access/brin/brin_bloom.c | 727 +++
 src/include/catalog/pg_amop.h|  59 +++
 src/include/catalog/pg_amproc.h  | 153 +++
 src/include/catalog/pg_opclass.h |  25 ++
 src/include/catalog/pg_opfamily.h|  20 +
 src/include/catalog/pg_proc.h|  10 +
 src/test/regress/expected/opr_sanity.out |   3 +-
 9 files changed, 1233 insertions(+), 2 deletions(-)
 create mode 100644 src/backend/access/brin/brin_bloom.c

diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index b7483df..49d53e1 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -118,6 +118,13 @@


 
+ abstime_bloom_ops
+ abstime
+ 
+  =
+ 
+
+
  abstime_minmax_ops
  abstime
  
@@ -129,6 +136,13 @@
  
 
 
+ int8_bloom_ops
+ bigint
+ 
+  =
+ 
+
+
  int8_minmax_ops
  bigint
  
@@ -180,6 +194,13 @@
  
 
 
+ bytea_bloom_ops
+ bytea
+ 
+  =
+ 
+
+
  bytea_minmax_ops
  bytea
  
@@ -191,6 +212,13 @@
  
 
 
+ bpchar_bloom_ops
+ character
+ 
+  =
+ 
+
+
  bpchar_minmax_ops
  character
  
@@ -202,6 +230,13 @@
  
 
 
+ char_bloom_ops
+ "char"
+ 
+  =
+ 
+
+
  char_minmax_ops
  "char"
  
@@ -213,6 +248,13 @@
  
 
 
+ date_bloom_ops
+ date
+ 
+  =
+ 
+
+
  date_minmax_ops
  date
  
@@ -224,6 +266,13 @@
  
 
 
+ float8_bloom_ops
+ double precision
+ 
+  =
+ 
+
+
  float8_minmax_ops
  double precision
  
@@ -235,6 +284,13 @@
  
 
 
+ inet_bloom_ops
+ inet
+ 
+  =
+ 
+
+
  inet_minmax_ops
  inet
  
@@ -258,6 +314,13 @@
  
 
 
+ int4_bloom_ops
+ integer
+ 
+  =
+ 
+
+
  int4_minmax_ops
  integer
  
@@ -269,6 +332,13 @@
  
 
 
+ interval_bloom_ops
+ interval
+ 
+  =
+ 
+
+
  interval_minmax_ops
  interval
  
@@ -280,6 +350,13 @@
  
 
 
+ macaddr_bloom_ops
+ macaddr
+ 
+  =
+ 
+
+
  macaddr_minmax_ops
  macaddr
  
@@ -291,6 +368,13 @@
  
 
 
+ macaddr8_bloom_ops
+ macaddr8
+ 
+  =
+ 
+
+
  macaddr8_minmax_ops
  macaddr8
  
@@ -302,6 +386,13 @@
  
 
 
+ name_bloom_ops
+ name
+ 
+  =
+ 
+
+
  name_minmax_ops
  name
  
@@ -313,6 +404,13 @@
  
 
 
+ numeric_bloom_ops
+ numeric
+ 
+  =
+ 
+
+
  numeric_minmax_ops
  numeric
  
@@ -324,6 +422,13 @@
  
 
 
+ pg_lsn_bloom_ops
+ pg_lsn
+ 
+  =
+ 
+
+
  pg_lsn_minmax_ops
  pg_lsn
  
@@ -335,6 +440,13 @@
  
 
 
+ oid_bloom_ops
+ oid
+ 
+  =
+ 
+
+
  oid_minmax_ops
  oid
  
@@ -366,6 +478,13 @@
  
 
 
+ float4_bloom_ops
+ real
+ 
+  =
+ 
+
+
  float4_minmax_ops
  real
  
@@ -377,6 +496,13 @@
  
 
 
+ reltime_bloom_ops
+ reltime
+ 
+  =
+ 
+
+
  reltime_minmax_ops
  reltime
  
@@ -388,6 +514,13 @@
  
 
 
+ int2_bloom_ops
+ smallint
+ 
+  =
+ 
+
+
  int2_minmax_ops
  smallint
  
@@ -399,6 +532,13 @@
  
 
 
+ text_bloom_ops
+ text
+ 
+  =
+ 
+
+
  text_minmax_ops
  text
  
@@ -410,6 +550,13 @@
  
 
 
+ tid_bloom_ops
+ tid
+ 
+  =
+ 
+
+
  tid_minmax_ops
  tid
  
@@ -421,6 +568,13 @@
  
 
 
+ timestamp_bloom_ops
+ timestamp without time zone
+ 
+  =
+ 
+
+
  timestamp_minmax_ops
  timestamp without time zone
  
@@ -432,6 +586,13 @@
  
 
 
+ timestamptz_bloom_ops
+ timestamp with time zone
+ 
+  =
+ 
+
+
  timestamptz_minmax_ops
  timestamp with time zone
  
@@ -443,6 +604,13 @@
  
 
 
+ 

[HACKERS] WIP: BRIN multi-range indexes

2017-10-25 Thread Tomas Vondra
Hi all,

A couple of days ago I've shared a WIP patch [1] implementing BRIN
indexes based on bloom filters. One inherent limitation of that approach
is that it can only support equality conditions - that's perfectly fine
in many cases (e.g. with UUIDs it's rare to use range queries, etc.).

[1]
https://www.postgresql.org/message-id/5d78b774-7e9c-c94e-12cf-fef51cc89b1a%402ndquadrant.com

But in other cases that restriction is pretty unacceptable, e.g. with
timestamps that are queried mostly using range conditions. A common
issue is that while the data is initially well correlated (giving us
nice narrow min/max ranges in the BRIN index), this degrades over time
(typically due to DELETE/UPDATE and then new rows routed to free space).
There are not many options to prevent this, and fixing it pretty much
requires CLUSTER on the table.

This patch addresses this by introducing BRIN indexes with a more complex
"summary".
Instead of keeping just a single "minmax interval", we maintain a list
of "minmax intervals", which allows us to track "gaps" in the data.

To illustrate the improvement, consider this table:

create table a (val float8) with (fillfactor = 90);
insert into a select i::float from generate_series(1,1000) s(i);
update a set val = 1 where random() < 0.01;
update a set val = 1000 where random() < 0.01;

Which means the column 'val' is almost perfectly correlated with the
position in the table (which would be great for BRIN minmax indexes),
but then 1% of the values is set to 1 and 10.000.000. That means pretty
much every range will be [1,1000], which makes this BRIN index
mostly useless, as illustrated by these explain plans:

create index on a using brin (val) with (pages_per_range = 16);

explain analyze select * from a where val = 100;
  QUERY PLAN

 Bitmap Heap Scan on a  (cost=54.01..10691.02 rows=8 width=8)
(actual time=5.901..785.520 rows=1 loops=1)
   Recheck Cond: (val = '100'::double precision)
   Rows Removed by Index Recheck: 999
   Heap Blocks: lossy=49020
   ->  Bitmap Index Scan on a_val_idx
 (cost=0.00..54.00 rows=3400 width=0)
 (actual time=5.792..5.792 rows=490240 loops=1)
 Index Cond: (val = '100'::double precision)
 Planning time: 0.119 ms
 Execution time: 785.583 ms
(8 rows)

explain analyze select * from a where val between 100 and 1;
  QUERY PLAN
--
 Bitmap Heap Scan on a  (cost=55.94..25132.00 rows=7728 width=8)
  (actual time=5.939..858.125 rows=9695 loops=1)
   Recheck Cond: ((val >= '100'::double precision) AND
  (val <= '1'::double precision))
   Rows Removed by Index Recheck: 9990305
   Heap Blocks: lossy=49020
   ->  Bitmap Index Scan on a_val_idx
 (cost=0.00..54.01 rows=10200 width=0)
 (actual time=5.831..5.831 rows=490240 loops=1)
 Index Cond: ((val >= '100'::double precision) AND
  (val <= '1'::double precision))
 Planning time: 0.139 ms
 Execution time: 871.132 ms
(8 rows)

Obviously, the queries do scan the whole table and then eliminate most
of the rows in "Index Recheck". Decreasing pages_per_range does not
really make a measurable difference in this case - it eliminates maybe
10% of the rechecks, but most pages still have very wide minmax range.

With the patch, it looks about like this:

create index on a using brin (val float8_minmax_multi_ops)
with (pages_per_range = 16);

explain analyze select * from a where val = 100;
  QUERY PLAN
---
 Bitmap Heap Scan on a  (cost=830.01..11467.02 rows=8 width=8)
(actual time=7.772..8.533 rows=1 loops=1)
   Recheck Cond: (val = '100'::double precision)
   Rows Removed by Index Recheck: 3263
   Heap Blocks: lossy=16
   ->  Bitmap Index Scan on a_val_idx
 (cost=0.00..830.00 rows=3400 width=0)
 (actual time=7.729..7.729 rows=160 loops=1)
 Index Cond: (val = '100'::double precision)
 Planning time: 0.124 ms
 Execution time: 8.580 ms
(8 rows)


explain analyze select * from a where val between 100 and 1;
 QUERY PLAN
--
 Bitmap Heap Scan on a  (cost=831.94..25908.00 rows=7728 width=8)
(actual time=9.318..23.715 rows=9695 loops=1)
   Recheck Cond: ((val >= '100'::double precision) AND
  (val <= '1'::double precision))
   Rows Removed by Index Recheck: 3361
   Heap 

[HACKERS] Re: [BUGS] BUG #14849: jsonb_build_object doesn't like VARIADIC calls very much

2017-10-25 Thread Michael Paquier
On Mon, Oct 23, 2017 at 6:50 AM, Michael Paquier
 wrote:
> Okay, attached is what I think a fully implemented patch should look
> like. On top of what Andrew has done, I added and reworked the
> following:
> - removed duplicate error handling.
> - documented the function in funcapi.h and funcapi.c.
> - Added a new section in funcapi.h to outline that this is for support
> of VARIADIC inputs.
> I have added a commit message as well. Hope this helps.

For the sake of the archives, the introduction of
extract_variadic_args is committed with f3c6e8a2, and the JSON fixes
with 18fc4ecf. Thanks Andrew for the commit, and thanks Tom, Andrew
and Dmitry for the reviews.
-- 
Michael




Re: [HACKERS] CurTransactionContext freed before transaction COMMIT ???

2017-10-25 Thread Michael Paquier
On Wed, Oct 25, 2017 at 7:34 AM, Gaddam Sai Ram
 wrote:
> Thanks for the response,
>
> Can you check if CurTransactionContext is valid at that point?
>
>
> Any Postgres function to check if CurTransactionContext is valid or not?

Assert(MemoryContextIsValid(CurTransactionContext));
-- 
Michael




Re: [HACKERS] CurTransactionContext freed before transaction COMMIT ???

2017-10-25 Thread Gaddam Sai Ram
Thanks for the response,

> Can you check if CurTransactionContext is valid at that point?

Is there any Postgres function to check whether CurTransactionContext is
valid or not?

> To see if this problem is related to CurTransactionContext, can you try to
> populate the list in TopMemoryContext and see if that works.

Did you mean TopTransactionContext?

As of now, we don't free our dlist; we rely solely on Postgres to free it
when it frees the TopTransactionContext. But if we allocate in
TopMemoryContext, we need to take care of freeing our allocations ourselves.

One more issue: we have hit this bug only once in all the testing we did, so
reproducing it seems very difficult.



Regards,

G. Sai Ram






Re: [HACKERS] CurTransactionContext freed before transaction COMMIT ???

2017-10-25 Thread Amit Kapila
On Tue, Oct 24, 2017 at 7:42 PM, Gaddam Sai Ram  wrote:

>
> Hello people,
>
> We are implementing in-memory index. As a part of that, during
> index callbacks, under CurTransactionContext, we cache all the DMLs of a
> transaction in dlist(postgres's doubly linked list).
> We registered transaction callback, and in transaction
> pre-commit event(XACT_EVENT_PRE_COMMIT), we iterate through all cached
> DMLs(dlist) and populate in my in-memory data structure.
>
>--> For detailed explanation of our implementation, please look
> into below thread.
>https://www.postgresql.org/message-id/15f4ea99b34.
> e69a4e1b633.8937474039794097334%40zohocorp.com
>
>--> It was working fine until I was greeted with a segmentation
> fault while accessing dlist!
>
>--> On further examining I found that dlist head is not NULL
> and it is pointing to some address, but the container I extracted is
> pointing to 0x7f7f7f7f7f7f7f5f, and I was not able to access any members in
> my container.
>
>--> https://wiki.postgresql.org/wiki/Developer_FAQ#Why_
> are_my_variables_full_of_0x7f_bytes.3F
> From the above wiki, we came to a conclusion that the memory
> context in which that dlist was present was freed.
> And yes CLOBBER_FREED_MEMORY is enabled by passing
> --enable-cassert to enable asserts in my code.
>
>--> Normally the memory context inside XACT_EVENT_PRE_COMMIT is
> TopTransactionContext but in this particular case, the memory context was
> 'MessageContext'. Little unusual! Not sure why!
>
>--> So basically CurTransactionContext shouldn't get freed
> before transaction COMMIT.
>--> So what has actually happened here??? Kindly help us with
> your insights!
>
>
Can you check if CurTransactionContext is valid at that point?  To see if
this problem is related to CurTransactionContext, can you try to populate
the list in TopMemoryContext and see if that works.
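
Something along these lines, perhaps (the entry type and list names below
are made up, just to illustrate the experiment):

MemoryContext oldcxt;
MyCachedDML  *entry;

oldcxt = MemoryContextSwitchTo(TopMemoryContext);
entry = (MyCachedDML *) palloc0(sizeof(MyCachedDML));
/* ... fill in the entry ... */
dlist_push_tail(&my_cached_dml_list, &entry->node);
MemoryContextSwitchTo(oldcxt);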


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Pluggable storage

2017-10-25 Thread Amit Kapila
On Wed, Oct 25, 2017 at 11:37 AM, Robert Haas  wrote:
> On Fri, Oct 20, 2017 at 5:47 AM, Amit Kapila  wrote:
>> I think what we need here is a way to register satisfies function
>> (SnapshotSatisfiesFunc) in SnapshotData for different storage engines.
>
> I don't see how that helps very much.  SnapshotSatisfiesFunc takes a
> HeapTuple as an argument, and it cares in detail about that tuple's
> xmin, xmax, and infomask, and it sets hint bits.  All of that is bad,
> because an alternative storage engine is likely to use a different
> format than HeapTuple and to not have hint bits (or at least not in
> the same form we have them now).  Also, it doesn't necessarily have a
> Boolean answer to the question "can this snapshot see this tuple?".
> It may be more like "given this TID, what tuple if any can I see
> there?" or "given this tuple, what version of it would I see with this
> snapshot?".
>
> Another thing to consider is that, if we could replace satisfiesfunc,
> it would probably break some existing code.  There are multiple places
> in the code that compare snapshot->satisfies to
> HeapTupleSatisfiesHistoricMVCC and HeapTupleSatisfiesMVCC.
>
> I think the storage API should just leave snapshots alone.  If a
> storage engine wants to call HeapTupleSatisfiesVisibility() with that
> snapshot, it can do so.  Otherwise it can switch on
> snapshot->satisfies and handle each case however it likes.
>

How will it switch the satisfies function at runtime?  There are places
where we might know which visibility function (*MVCC, *Dirty, etc.) needs to
be called, but I think there are other places (like heap_fetch) where it
is not clear and we decide based on what is stored in
snapshot->satisfies.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] Jsonb transform for pl/python

2017-10-25 Thread Anthony Bykov
Hi.
I've implemented jsonb transform
(https://www.postgresql.org/docs/9.5/static/sql-createtransform.html)
for pl/python. 

1. '{"1":1}'::jsonb is transformed into dict {"1"=>1}, while
'["1",2]'::jsonb is transformed into list(not tuple!) ["1", 2]

2. If there is a numeric value appear in jsonb, it will be transformed
to decimal through string (Numeric->String->Decimal). Not the best
solution, but as far as I understand this is usual practise in
postgresql to serialize Numerics and de-serialize them.

3. Decimal is transformed into jsonb through string
(Decimal->String->Numeric).
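
As a rough sketch of the direction described in point 2 (simplified, not the
exact patch code; decimal_constructor stands for a cached reference to
Python's decimal.Decimal class):

static PyObject *
numeric_to_decimal(Numeric num)
{
    /* Numeric -> text ... */
    char       *str = DatumGetCString(DirectFunctionCall1(numeric_out,
                                                          NumericGetDatum(num)));
    /* ... text -> decimal.Decimal */
    PyObject   *result = PyObject_CallFunction(decimal_constructor, "s", str);

    pfree(str);
    return result;
}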

An example may also be helpful to understand the extension. The function
"test" below transforms the incoming jsonb into Python, transforms it back
into jsonb, and returns it.

create extension jsonb_plpython2u cascade;

create or replace function test(val jsonb)
returns jsonb
transform for type jsonb
language plpython2u
as $$
return (val);
$$;

select test('{"1":1,"example": null}'::jsonb);

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/Makefile b/contrib/Makefile
index e84eb67..d6b7170 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -82,9 +82,9 @@ ALWAYS_SUBDIRS += hstore_plperl
 endif
 
 ifeq ($(with_python),yes)
-SUBDIRS += hstore_plpython ltree_plpython
+SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython
 else
-ALWAYS_SUBDIRS += hstore_plpython ltree_plpython
+ALWAYS_SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython
 endif
 
 # Missing:
diff --git a/contrib/jsonb_plpython/Makefile b/contrib/jsonb_plpython/Makefile
new file mode 100644
index 000..1e34d86
--- /dev/null
+++ b/contrib/jsonb_plpython/Makefile
@@ -0,0 +1,39 @@
+# contrib/jsonb_plpython/Makefile
+
+MODULE_big = jsonb_plpython$(python_majorversion)u
+OBJS = jsonb_plpython.o $(WIN32RES)
+PGFILEDESC = "jsonb_plpython - transform between jsonb and plpythonu"
+
+PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpython $(python_includespec) -DPLPYTHON_LIBNAME='"plpython$(python_majorversion)"'
+
+EXTENSION = jsonb_plpython$(python_majorversion)u
+DATA = jsonb_plpython$(python_majorversion)u--1.0.sql
+
+REGRESS = jsonb_plpython$(python_majorversion)
+REGRESS_PLPYTHON3_MANGLE := $(REGRESS)
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/jsonb_plpython
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+# We must link libpython explicitly
+ifeq ($(PORTNAME), win32)
+# ... see silliness in plpython Makefile ...
+SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a))
+else
+rpathdir = $(python_libdir)
+SHLIB_LINK += $(python_libspec) $(python_additional_libs)
+endif
+
+ifeq ($(python_majorversion),2)
+REGRESS_OPTS += --load-extension=plpython2u
+else
+REGRESS_OPTS += --load-extension=plpython3u
+endif
diff --git a/contrib/jsonb_plpython/expected/jsonb_plpython.out b/contrib/jsonb_plpython/expected/jsonb_plpython.out
new file mode 100644
index 000..be104af
--- /dev/null
+++ b/contrib/jsonb_plpython/expected/jsonb_plpython.out
@@ -0,0 +1,118 @@
+CREATE EXTENSION jsonb_plpythonu CASCADE;
+-- test jsonb -> python dict
+CREATE FUNCTION test1(val jsonb) RETURNS int
+LANGUAGE plpythonu
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, dict)
+plpy.info(sorted(val.items()))
+return len(val)
+$$;
+SELECT test1('{"a":1, "c":"NULL"}'::jsonb);
+INFO:  [('a', Decimal('1')), ('c', 'NULL')]
+ test1 
+---
+ 2
+(1 row)
+
+-- test jsonb -> python dict
+-- complex dict with dicts as value
+CREATE FUNCTION test1complex(val jsonb) RETURNS int
+LANGUAGE plpythonu
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, dict)
+assert(val == {"d":{"d": 1}})
+return len(val)
+$$;
+SELECT test1complex('{"d":{"d": 1}}'::jsonb);
+ test1complex 
+--
+1
+(1 row)
+
+-- test jsonb[] -> python dict
+-- dict with array as value
+CREATE FUNCTION test1arr(val jsonb) RETURNS int
+LANGUAGE plpythonu
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, dict)
+assert(val == {"d": [12,1]})
+return len(val)
+$$;
+SELECT test1arr('{"d":[12,1]}'::jsonb);
+ test1arr 
+--
+1
+(1 row)
+
+-- test jsonb[] -> python list
+-- simple list
+CREATE FUNCTION test2arr(val jsonb) RETURNS int
+LANGUAGE plpythonu
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, list)
+assert(val == [12,1])
+return len(val)
+$$;
+SELECT test2arr('[12,1]'::jsonb);
+ test2arr 
+--
+2
+(1 row)
+
+-- test jsonb[] -> python list
+-- array of dicts
+CREATE FUNCTION test3arr(val jsonb) RETURNS int
+LANGUAGE plpythonu
+TRANSFORM FOR TYPE jsonb
+AS $$
+assert isinstance(val, list)
+assert(val == [{"a":1,"b":2},{"c":3,"d":4}])
+return len(val)
+$$;
+SELECT test3arr('[{"a":1,"b":2},{"c":3,"d":4}]'::jsonb);
+ test3arr 
+--
+2
+(1 row)
+
+-- test jsonb int -> python int
+CREATE FUNCTION test1int(val jsonb) RETURNS int

Re: [HACKERS] pgbench - allow to store select results into variables

2017-10-25 Thread Pavel Stehule
Hi

2017-10-20 18:37 GMT+02:00 Fabien COELHO :

>
> Here is a v12.
>
> Here is a v13, which is just a rebase after the documentation xml-ization.
>

I am looking to this patch.

Not sure if "cset" is best name - maybe "eset" .. like embeded set?

The code of append_sql_command is not too readable and is not enough
commented.

I don't understand why you pass a param compounds to append_sql_command,
when this value is stored in my_command->compound from create_sql_command?

Or maybe some unhappy field or variable names was chosen.

Regards

Pavel

>
> --
> Fabien.


Re: [HACKERS] path toward faster partition pruning

2017-10-25 Thread Beena Emerson
Hello Amit,

Thanks for the updated patches

On Wed, Oct 25, 2017 at 1:07 PM, Amit Langote
 wrote:
> On 2017/10/25 15:47, Amit Langote wrote:
>> On 2017/10/24 1:38, Beena Emerson wrote:
>>> I had noticed this and also that this crash:
>>>
>>> tprt PARTITION BY RANGE(Col1)
>>>tprt_1 FOR VALUES FROM (1) TO (50001) PARTITION BY RANGE(Col1)
>>>   tprt_11 FOR VALUES FROM (1) TO (1),
>>>   tprt_1d DEFAULT
>>>tprt_2 FOR VALUES FROM (50001) TO (11)
>>>
>>> EXPLAIN (COSTS OFF) SELECT * FROM tprt WHERE col1 BETWEEN 2 AND 7;
>>> server closed the connection unexpectedly
>>> This probably means the server terminated abnormally
>>> before or while processing the request.
>>> The connection to the server was lost. Attempting reset: Failed.
>>> !>
>>
>> ...and this (crash) were due to bugs in the 0005 patch.
>
> [  ]
>
>> Should be fixed in the attached updated version.
>
> Oops, not quite.  The crash that Beena reported wasn't fixed (or rather
> reintroduced by some unrelated change after once confirming it was fixed).
>
> Really fixed this time.
>

The crashes are fixed. However, the handling of the DEFAULT partition in
various queries is not correct.

Case 1: In this case, the DEFAULT partition should be selected, but it is not.
 DROP TABLE tprt;
  CREATE TABLE tprt (col1 int, col2 int) PARTITION BY range(col1);
  CREATE TABLE tprt_1 PARTITION OF tprt FOR VALUES FROM (1) TO (50001)
PARTITION BY list(col1);
  CREATE TABLE tprt_11 PARTITION OF tprt_1 FOR VALUES IN (2, 25000);
  CREATE TABLE tprt_12 PARTITION OF tprt_1 FOR VALUES IN (5, 35000);
  CREATE TABLE tprt_13 PARTITION OF tprt_1 FOR VALUES IN (1);
  CREATE TABLE tprt_1d PARTITION OF tprt_1 DEFAULT;


postgres=#  EXPLAIN (COSTS OFF) SELECT * FROM tprt WHERE col1 < 1;
QUERY PLAN
--
 Result
   One-Time Filter: false
(2 rows)


Case 2: In this case, the DEFAULT partition need not be selected, but it is.

DROP TABLE  tprt;
  CREATE TABLE tprt (col1 int, col2 int) PARTITION BY range(col1);
  CREATE TABLE tprt_1 PARTITION OF tprt FOR VALUES FROM (1) TO (50001)
PARTITION BY range(col1);
  CREATE TABLE tprt_11 PARTITION OF tprt_1 FOR VALUES FROM (1) TO (1);
  CREATE TABLE tprt_12 PARTITION OF tprt_1 FOR VALUES FROM (1) TO (2);
  CREATE TABLE tprt_13 PARTITION OF tprt_1 FOR VALUES FROM (2) TO (3);
  CREATE TABLE tprt_1d PARTITION OF tprt_1 DEFAULT;
  INSERT INTO tprt SELECT generate_series(1,5), generate_series(1,5);

postgres=#  EXPLAIN (COSTS OFF) SELECT * FROM tprt WHERE col1 < 1;
   QUERY PLAN

 Append
   ->  Seq Scan on tprt_11
 Filter: (col1 < 1)
   ->  Seq Scan on tprt_1d
 Filter: (col1 < 1)
(5 rows)


-- 

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pgbench more operators & functions

2017-10-25 Thread Pavel Stehule
Hi

2017-10-20 18:36 GMT+02:00 Fabien COELHO :

>
> Here is a v13. No code changes, but TAP tests added to maintain pgbench
> coverage to green.
>
> Here is a v14, which is just a rebase after the documentation xml-ization.
>

all tests passed
no problems with doc building


> --
> Fabien.


Re: [HACKERS] Pluggable storage

2017-10-25 Thread Robert Haas
On Fri, Oct 20, 2017 at 5:47 AM, Amit Kapila  wrote:
> I think what we need here is a way to register satisfies function
> (SnapshotSatisfiesFunc) in SnapshotData for different storage engines.

I don't see how that helps very much.  SnapshotSatisfiesFunc takes a
HeapTuple as an argument, and it cares in detail about that tuple's
xmin, xmax, and infomask, and it sets hint bits.  All of that is bad,
because an alternative storage engine is likely to use a different
format than HeapTuple and to not have hint bits (or at least not in
the same form we have them now).  Also, it doesn't necessarily have a
Boolean answer to the question "can this snapshot see this tuple?".
It may be more like "given this TID, what tuple if any can I see
there?" or "given this tuple, what version of it would I see with this
snapshot?".

Another thing to consider is that, if we could replace satisfiesfunc,
it would probably break some existing code.  There are multiple places
in the code that compare snapshot->satisfies to
HeapTupleSatisfiesHistoricMVCC and HeapTupleSatisfiesMVCC.

I think the storage API should just leave snapshots alone.  If a
storage engine wants to call HeapTupleSatisfiesVisibility() with that
snapshot, it can do so.  Otherwise it can switch on
snapshot->satisfies and handle each case however it likes.  I don't
see how generalizing a Snapshot for other storage engines really buys
us anything except complexity and the danger of reducing performance
for the existing heap.
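
To illustrate the kind of dispatch I mean (only a sketch; the MyAM* names
are made up and not part of any actual API):

static bool
my_am_tuple_visible(MyAMTuple *tuple, Snapshot snapshot)
{
    /* Dispatch on the snapshot type the core code handed us. */
    if (snapshot->satisfies == HeapTupleSatisfiesMVCC)
        return my_am_visible_mvcc(tuple, snapshot);
    else if (snapshot->satisfies == HeapTupleSatisfiesDirty)
        return my_am_visible_dirty(tuple, snapshot);
    else if (snapshot->satisfies == HeapTupleSatisfiesSelf)
        return my_am_visible_self(tuple);
    else if (snapshot->satisfies == HeapTupleSatisfiesAny)
        return true;

    elog(ERROR, "snapshot type not supported by this storage engine");
    return false;               /* keep compiler quiet */
}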

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Implementing pg_receivewal --no-sync

2017-10-25 Thread Kuntal Ghosh
On Wed, Oct 25, 2017 at 6:07 AM, Michael Paquier
 wrote:
> Hi all,
>
> After thinking a bit on the subject, I have decided to submit a patch
> to do $subject. This makes pg_receivewal more consistent with
> pg_basebackup. This option is mainly useful for testing, something
> that becomes way more doable since support for --endpos has been
> added.
>
> Unsurprisingly, --synchronous and --no-sync are incompatible options.
+   
+By default, pg_receivewal flushes a WAL segment's
+contents each time a feedback message is sent to the server depending
+on the interval of time defined by
+--status-interval.
IMHO, it's okay to remove the part 'depending on
the ... --status-interval'.

+This option causes
+pg_receivewal to not issue such flushes waiting,
Did you mean 'to not issue such flush waitings'?


+ [ 'pg_receivewal', '-D', $stream_dir, '--synchronous', '--no-sync' ],
+ 'failure if --synchronous specified without --no-sync');
s/without/with


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parallel Hash take II

2017-10-25 Thread Thomas Munro
On Tue, Oct 24, 2017 at 10:10 PM, Thomas Munro
 wrote:
> Here is an updated patch set that does that ^.

It's a bit hard to understand what's going on with the v21 patch set I
posted yesterday because EXPLAIN ANALYZE doesn't tell you anything
interesting.  Also, if you apply the multiplex_gather patch[1] I
posted recently and set multiplex_gather to off then it doesn't tell
you anything at all, because the leader has no hash table (I suppose
that could happen with unpatched master given sufficiently bad
timing).  Here's a new version with an extra patch that adds some
basic information about load balancing to EXPLAIN ANALYZE, inspired by
what commit bf11e7ee did for Sort.

Example output:

enable_parallel_hash = on, multiplex_gather = on:

 ->  Parallel Hash (actual rows=100 loops=3)
   Buckets: 131072  Batches: 16
   Leader:Shared Memory Usage: 3552kB  Hashed: 396120  Batches Probed: 7
   Worker 0:  Shared Memory Usage: 3552kB  Hashed: 276640  Batches Probed: 6
   Worker 1:  Shared Memory Usage: 3552kB  Hashed: 327240  Batches Probed: 6
   ->  Parallel Seq Scan on simple s (actual rows=33 loops=3)

 ->  Parallel Hash (actual rows=1000 loops=8)
   Buckets: 131072  Batches: 256
   Leader:Shared Memory Usage: 2688kB  Hashed: 1347720
Batches Probed: 36
   Worker 0:  Shared Memory Usage: 2688kB  Hashed: 1131360
Batches Probed: 33
   Worker 1:  Shared Memory Usage: 2688kB  Hashed: 1123560
Batches Probed: 38
   Worker 2:  Shared Memory Usage: 2688kB  Hashed: 1231920
Batches Probed: 38
   Worker 3:  Shared Memory Usage: 2688kB  Hashed: 1272720
Batches Probed: 34
   Worker 4:  Shared Memory Usage: 2688kB  Hashed: 1234800
Batches Probed: 33
   Worker 5:  Shared Memory Usage: 2688kB  Hashed: 1294680
Batches Probed: 37
   Worker 6:  Shared Memory Usage: 2688kB  Hashed: 1363240
Batches Probed: 35
   ->  Parallel Seq Scan on big s2 (actual rows=125 loops=8)

enable_parallel_hash = on, multiplex_gather = off (ie no leader participation):

 ->  Parallel Hash (actual rows=100 loops=2)
   Buckets: 131072  Batches: 16
   Worker 0:  Shared Memory Usage: 3520kB  Hashed: 475920  Batches Probed: 9
   Worker 1:  Shared Memory Usage: 3520kB  Hashed: 524080  Batches Probed: 8
   ->  Parallel Seq Scan on simple s (actual rows=50 loops=2)

enable_parallel_hash = off, multiplex_gather = on:

 ->  Hash (actual rows=100 loops=3)
   Buckets: 131072  Batches: 16
   Leader:Memory Usage: 3227kB
   Worker 0:  Memory Usage: 3227kB
   Worker 1:  Memory Usage: 3227kB
   ->  Seq Scan on simple s (actual rows=100 loops=3)

enable_parallel_hash = off, multiplex_gather = off:

 ->  Hash (actual rows=100 loops=2)
   Buckets: 131072  Batches: 16
   Worker 0:  Memory Usage: 3227kB
   Worker 1:  Memory Usage: 3227kB
   ->  Seq Scan on simple s (actual rows=100 loops=2)

parallelism disabled (traditional single-line output, unchanged):

 ->  Hash (actual rows=100 loops=1)
   Buckets: 131072  Batches: 16  Memory Usage: 3227kB
   ->  Seq Scan on simple s (actual rows=100 loops=1)

(It actually says "Tuples Hashed", not "Hashed" but I edited the above
to fit on a standard punchcard.)  Thoughts?

[1] 
https://www.postgresql.org/message-id/CAEepm%3D2U%2B%2BLp3bNTv2Bv_kkr5NE2pOyHhxU%3DG0YTa4ZhSYhHiw%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


parallel-hash-v22.patchset.tgz
Description: GNU Zip compressed data
