Re: [HACKERS] Aggregates push-down to partitions

2017-11-09 Thread Maksim Milyutin

Hi Konstantin!


09.11.17 20:14, Konstantin Knizhnik wrote:
It is still far from the ideal plan because each worker works with 
all partitions, instead of splitting the partitions between workers and 
calculating partial aggregates for each partition.


But if we add an FDW table as a child of the parent table, then a parallel 
scan cannot be used and we get the worst possible plan:


postgres=# create foreign table derived_fdw() inherits(base) server pg_fdw options (table_name 'derived1');
CREATE FOREIGN TABLE

postgres=# explain select sum(x) from base;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
   ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
         ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
         ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=4)
         ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=4)
         ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
(6 rows)

So we sequentially pull all the data to this node and compute the aggregates 
locally.
The ideal plan would calculate partial aggregates at all nodes in parallel 
and then combine the partial results.

It requires two changes:
1. Replace Aggregate->Append with 
Finalize_Aggregate->Append->Partial_Aggregate (see the sketch below).
2. Concurrent execution of Append. This can be done in two different ways: 
we can try to use the existing parallel-worker infrastructure and replace 
Append with Gather, which seems to be the best approach for local 
partitioning. In the case of remote (FDW) partitions, it is enough
to separate starting the execution (PQsendQuery in postgres_fdw) from 
fetching the results, so it requires some changes in the FDW protocol.
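
For the first change, the intended plan shape can be emulated by hand with 
the tables from the example above (a sketch only; for sum() the partial and 
final aggregates happen to be the same function, which is what makes this 
rewrite valid):

-- Emulating Finalize_Aggregate -> Append -> Partial_Aggregate manually:
-- each branch computes a partial aggregate over one partition and the outer
-- query combines (finalizes) them.  The planner change would produce this
-- shape automatically and, ideally, run the branches concurrently.
select sum(partial_sum) as sum
from (
    select sum(x) as partial_sum from derived1
    union all
    select sum(x) as partial_sum from derived2
    union all
    select sum(x) as partial_sum from derived_fdw
) as partials;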



I wonder if somebody has already investigated this problem or is working in 
this direction.

Maybe there are already some patches proposed?
I have searched the hackers archive but didn't find anything relevant...
Are there any suggestions about the best approach to implement this 
feature?




Maybe the problem you describe is solved in this thread [1] through the 
introduction of the Parallel Append node?


1. 
https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com


--
Regards,
Maksim Milyutin





Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-10-07 Thread Maksim Milyutin

07.10.17 16:34, Robert Haas wrote:


On Fri, Oct 6, 2017 at 12:37 PM, Alvaro Herrera  wrote:
One thing I'm a bit worried about is how to name these subordinate
indexes.  They have to have names because that's how pg_class works,
and those names can't all be the same, again because that's how
pg_class works.  There's no problem right away when you first create
the partitioned index, because you can just pick names out of a hat
using whatever name-generating algorithm seems best.  However, when
you dump-and-restore (including but not limited to the pg_upgrade
case) you've got to preserve those names.  If you just generate a new
name that may or may not be the same as the old one, then it may
collide with a user-specified name that only occurs later in the dump.
Also, you'll have trouble if the user has applied a COMMENT or a
SECURITY LABEL to the index because that command works by name, or if
the user has a reference to the index name inside a function or
whatever.

These are pretty annoying corner-case bugs because they're not likely
to come up very often.  Most people won't notice or care if the index
name changes.  But I don't think it's acceptable to just ignore the
problem.  An idea I had was to treat the abstract index - to use your
term - sort of the way we treat an extension.  Normally, when you
create an index on a partitioned table, it cascades, but for dump and
restore purpose, we tag on some syntax that says "well, don't actually
create the subordinate indexes, i'll tell you about those later".
Then for each subordinate index we issue a separate CREATE INDEX
command followed by ALTER INDEX abstract_index ATTACH PARTITION
concrete_index or something of that sort.  That means you can't
absolutely count on the parent index to have all of the children it's
supposed to have but maybe that's OK.


AFAICS, the main problem with naming is generating new unique names for 
subordinate indexes at the stage of migrating the data schema (pg_dump, 
pg_upgrade, etc.). And we cannot specify these names in the 'CREATE INDEX 
partitioned_index' statement, therefore we have to regenerate them.


In this case I propose to restore the index name hierarchy *bottom-up*, 
i.e. first create the indexes for the leaf partitions and then create the 
ones for their parents up to the root, explicitly specifying names. When 
creating an index on a parent table we have to check whether there already 
exists an index on a child table that could serve as its child index 
(identical definition). If so, we do not generate a new index but implicitly 
attach that index to the parent one.
If we have an incomplete index hierarchy, e.g. we previously dropped some 
partitions' indexes, then recreating the parent's index would regenerate 
(not attach) indexes for those partitions. We could drop those redundant 
generated indexes after the parent's index is built. This decision is not 
straightforward, but it lets us keep treating the 'CREATE INDEX 
partitioned_table' statement as a cascading operation.
As a result, we can specify a name for each concrete index while recreating 
the whole hierarchy (see the sketch below).
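
For illustration, the restore part of a dump could then look roughly like 
this (a sketch only; the table, column and index names are made up, and the 
implicit attachment of matching child indexes is the proposed behaviour, not 
something that exists today):

-- leaf indexes first, keeping the names from the old cluster
create index part1_a_idx on part1 (a);
create index part2_a_idx on part2 (a);

-- then the parent index: instead of generating new child indexes it would
-- find the identical existing ones and implicitly attach them
create index parted_a_idx on parted_table (a);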


--
Regards,
Maksim Milyutin





Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-10-06 Thread Maksim Milyutin

Hi!


On 06.10.2017 19:37, Alvaro Herrera wrote:

As you propose, IMO this new feature would use the standard index
creation syntax:
CREATE [UNIQUE] INDEX someindex ON parted_table (a, b);

This command currently throws an error.  We'd have it do two things:

1. create catalog rows in pg_class and pg_index for the main table,
indicating the existence of this partitioned index.  These would not
point to an actual index (since the main table itself is empty), but
instead to an "abstract" index.  This abstract index can be used by
various operations; see below.


Robert Haas proposed[1] to use the name "partitioned index" (instead of 
"abstract index"), which would be reflected in the 'relkind' field of 
pg_class as the RELKIND_PARTITIONED_INDEX value.
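
With such a relkind, partitioned indexes would also become easy to spot in 
the catalog, e.g. (a sketch only; the relkind letter 'I' used here is just an 
assumption, the actual value has not been decided yet):

-- list the "abstract" (partitioned) indexes, assuming relkind 'I' is chosen for them
select c.relname, c.relnamespace::regnamespace as schema
from pg_class c
where c.relkind = 'I';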



I propose that an index declared UNIQUE throws an error for now.  We can
implement uniqueness (for the case where the indexed columns match the
partitioning key) later, once we sort out all the issues here first.  I
think unique indexes would be very useful even with that limitation, but
let's have it as a separate project.


Yes, global uniqueness through local unique indexes requires further work 
related to foreign keys on partitioned tables, full support of INSERT ... ON 
CONFLICT, etc. It makes sense to implement it after the current stage of 
work.



I think using pg_depend as the mechanism to link partition indexes to
parent is a bad idea.  How about pg_inherits instead?  Seems more
appropriate.


Greg Stark proposed[2] to use a new pg_index field of oid type that refers 
to the parent pg_index entry. AFAIC pg_inherits also makes sense, but 
semantically it deals with inheriting tables. IMHO using this catalog table 
to define the relationship between a partitioned table and its partitions 
looks like a hack to reuse the constraint exclusion logic for partition 
pruning. A rough comparison of the two lookups is sketched below.
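
The two approaches would be queried roughly like this (a sketch only; 
'parted_a_idx' is a made-up index name and 'parentindexid' is a hypothetical 
column name for Greg's proposal):

-- with pg_inherits: find the children of a partitioned index
select inhrelid::regclass as child_index
from pg_inherits
where inhparent = 'parted_a_idx'::regclass;

-- with a new pg_index column pointing at the parent (hypothetical name):
-- select indexrelid::regclass from pg_index
-- where parentindexid = 'parted_a_idx'::regclass;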



Creating hierarchy-wide indexes for existing partitioned tables is likely
to take a long time, so we must include CONCURRENTLY as an option.  This
will need some transaction machinery support in order to ensure that
each partition gets its index created at some point in a long chain of
transactions, and that the whole thing is marked valid only at the end.
Also, if the creation is interrupted (because of a crash or a regular
shutdown), it'll be useful to be able to continue rather than being
forced to start from scratch.


This option looks very difficult to me to implement. I would be interested 
to see the implementation.



During ALTER TABLE ... ATTACH PARTITION, we check that an index
satisfying the abstract index exists (and we create a pg_inherits link).
If not, the command is aborted.


We could create the necessary index for the partition instead of aborting 
the ALTER TABLE ... ATTACH PARTITION statement, as sketched below.
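
In other words (a sketch of the suggested behaviour only; the table, 
partition and column names are made up):

-- under the proposal above this would abort if part_new lacks a matching index;
-- the suggestion is to create the missing index automatically instead:
alter table parted_table attach partition part_new for values in (42);
-- ... implicitly doing the equivalent of:
-- create index on part_new (a, b);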



We need to come up with some way to generate names for each partition
index.


I think calling 'ChooseIndexName(RelationGetRelationName(childrel), 
namespaceId, indexColNames, ...)' resolves this problem.



I am going to work on this now.


That will be great! For me, the difficult part of this project is 
integrating the functionality described above with the existing postgres 
code. Also, IMO this project is very important because it opens the way for 
features such as global uniqueness of partitioned table columns, so any 
delay in the implementation is bad.


I would like to review and test your intermediate results.


1. 
https://www.postgresql.org/message-id/CA%2BTgmoY5UOUnW%3DMcwT7xUB_2W5dAkvOg5kD20Spx5gF-Ad47cA%40mail.gmail.com
2. 
https://www.postgresql.org/message-id/CAM-w4HOVftuv5RVi3a%2BsRV6nBpg204w7%3DL8MwPXVvYBFo1uM1Q%40mail.gmail.com


--
Regards,
Maksim Milyutin



Re: [HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-10-03 Thread Maksim Milyutin

Hi HORIGUCHI-san!


Thanks for the valuable comments.


03.10.17 4:30, Kyotaro HORIGUCHI wrote:


The first thought that patch gave me is that the problem is not
limited to constants. Actually the following sequence also
reproduces similar failure even with this patch.

create table t2 (x int , y int);
create type pair as (x int, y int);
prepare test as select row(x, y)::pair from t2;
drop type pair;
execute test;
| ERROR:  cache lookup failed for type 16410

In this case the causal expression is in the following form.

   TargetEntry (
 expr = (
   RowExpr:
 typeid = 16410,
 row_format = COERCE_EXPLICIT_CAST,
 args = List (Var(t2.x), Var(t2.y))
 )
   )


Yeah, the RowExpr has no dependency on the relation of the type with 
oid=16410. I think the routine 'fix_expr_common' needs to be reworked to 
take into account any possible dependency of an expression on a composite 
type.


For the November commitfest I'll post a patch that covers your case.

--
Regards,
Maksim Milyutin





Re: [HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-10-02 Thread Maksim Milyutin

Hi, Alexander!

Thanks for the comments.


02.10.17 20:02, Alexander Korotkov wrote:
Please, register this patch at upcoming commitfest to ensure it 
wouldn't be forgotten.
Regression test changes (both .sql and .out) are essential parts of 
the patch.  I see no point in posting them separately.  Please, 
incorporate them into your patch.


OK, I'll take your advice.

Did you check this patch with manually created composite type (made by 
CREATE TYPE typname AS ...)?

I have tested the following case:

create type pair as (x int, y int);
prepare test as select json_populate_record(null::pair, '{"x": 1, "y": 2}'::json);

drop type pair cascade;

execute test;

-- The following output is obtained before patch
ERROR:  cache lookup failed for type 16419

-- After applying patch
ERROR:  type "pair" does not exist

But after recreating the 'pair' type I get the following message:
ERROR:  cached plan must not change result type

I don't know whether this is the right behavior. Anyhow, your point is good 
motivation to experiment with and investigate different scenarios of working 
with a cached plan that depends on a non-stable type. Thanks for that.


--
Regards,
Maksim Milyutin



Re: [HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-10-02 Thread Maksim Milyutin

On 26.09.2017 23:25, Maksim Milyutin wrote:


25.09.17 20:50, Maksim Milyutin wrote:

I have found a problem when sequentially calling a function that casts a 
constant to the composite type of a temporary table that is deleted at each 
transaction termination (i.e. at each function call completion).



For example, we have the following function 'test':

CREATE OR REPLACE FUNCTION test()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
begin
    create temp table tbl (id int) on commit drop;
    perform json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt;
end;
$function$


On the second and subsequent calls we'll observe the following error:

ERROR:  cache lookup failed for type 16392


I investigated the problem and realized that the result type of the function 
*json_populate_record* (the funcresulttype field of the FuncExpr struct) as 
well as the type of the first null argument (the consttype field of the 
Const struct) refer to the invalid composite type related to the temporary 
table 'tbl'. Namely, they take the oid value obtained from the first 
initialization of 'tbl'. The plan of the query *'perform 
json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt'* is cached and 
is not invalidated at each function call. This is because the statement of 
this query doesn't have any dependency on the 'tbl' relation (the 
relationOids list of the CachedPlanSource struct).



The attached patch resolves this problem by adding a dependency on the 
relation when a Const expression of that relation's composite type is 
detected.


The updated patchset contains a more transparent definition of the composite 
type for the constant node and a regression test for the buggy case 
described above.



Is there any interest in the problem described in this thread?

--
Regards,
Maksim Milyutin



Re: [HACKERS] Add support for tuple routing to foreign partitions

2017-10-02 Thread Maksim Milyutin

Hi Fujita-san!


On 11.09.2017 16:01, Etsuro Fujita wrote:

Here is an updated version of the patch.

* Query planning: the patch creates copies of Query/Plan with a 
foreign partition as target from the original Query/Plan for each 
foreign partition and invokes PlanForeignModify with those copies, to 
allow the FDW to do query planning for remote INSERT with the existing 
API.  To make such Queries the similar way inheritance_planner does, I 
modified transformInsertStmt so that the inh flag for the partitioned 
table's RTE is set to true, which allows (1) expand_inherited_rtentry 
to build an RTE and AppendRelInfo for each partition in the 
partitioned table and (2) make_modifytable to build such Queries using 
adjust_appendrel_attrs and those AppendRelInfos.


* explain.c: I modified show_modifytable_info so that we can show 
remote queries for foreign partitions in EXPLAIN for INSERT into a 
partitioned table the same way as for inherited UPDATE/DELETE cases.  
Here is an example:


postgres=# explain verbose insert into pt values (1), (2);
    QUERY PLAN
---
 Insert on public.pt  (cost=0.00..0.03 rows=2 width=4)
   Foreign Insert on public.fp1
 Remote SQL: INSERT INTO public.t1(a) VALUES ($1)
   Foreign Insert on public.fp2
 Remote SQL: INSERT INTO public.t2(a) VALUES ($1)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)
 Output: "*VALUES*".column1
(7 rows)

I think I should add more explanation about the patch, but I don't 
have time today, so I'll write additional explanation in the next 
email. Sorry about that.


Could you update your patch? It doesn't apply to the current state of 
master; namely, there is a conflict in src/backend/executor/execMain.c.


--
Regards,
Maksim Milyutin





Re: [HACKERS] Partitions: \d vs \d+

2017-09-29 Thread Maksim Milyutin

On 29.09.2017 04:33, Amit Langote wrote:


So, we should be looking at partconstraintdef only when verbose is true,
because that's only when we set it to a valid value.  Now, if
partconstraintdef is NULL even after verbose is true, that means backend
returned that there exists no constraint for that partition, which I
thought would be true for a default partition (because the commit that
introduced default partitions also introduced "No partition constraint"),
but it's really not.

For example, \d and \d+ show contradictory outputs for a default partition.

create table p (a int) partition by list (a);
create table p1 partition of p for values in (1);
create table pd partition of p default;

\d pd
  Table "public.pd"
  Column |  Type   | Collation | Nullable | Default
+-+---+--+-
  a  | integer |   |  |
Partition of: p DEFAULT
No partition constraint

\d+ pd
 Table "public.pd"
  Column |  Type   | Collation | Nullable | Default | Storage | Stats
target | Description
+-+---+--+-+-+--+-
  a  | integer |   |  | | plain   |  |
Partition of: p DEFAULT
Partition constraint: (NOT ((a IS NOT NULL) AND (a = ANY (ARRAY[1]


Perhaps, there is no case when "No partition constraint" should be output,
but I may be missing something.


Anyhow, we have to protect ourselves from empty output from 
*pg_get_partition_constraintdef*, and printing *No partition constraint* 
would be a good starting point for examining why we didn't get any 
constraint definition.


--
Regards,
Maksim Milyutin



Re: [HACKERS] Partitions: \d vs \d+

2017-09-28 Thread Maksim Milyutin

On 28.09.2017 16:29, Jesper Pedersen wrote:


On 09/28/2017 09:19 AM, Maksim Milyutin wrote:
E.g. "No partition constraint" vs. "Partition constraint: 
satisfies_hash_partition(...)".


I also noticed the inconsistency of printing "No partition constraint" in 
non-verbose mode and "Partition constraint: ..." in verbose mode for 
partitions, regardless of the type of partitioning.
The attached small patch removes any output about the partition constraint 
in non-verbose mode.




Yeah, that could be one way.

It should likely be backported to REL_10_STABLE, so the question is if 
we are too late in the release cycle to change that output.


I want to prepare a more complete patch for the "Partition constraint" 
output. For example, I encountered crude output with repetitive conjuncts 
for a subpartition whose parent is partitioned by the same key:


Partition constraint: ((i IS NOT NULL) AND (i >= 30) AND (i < 40) AND 
(i IS NOT NULL) AND (i = ANY (ARRAY[30, 31])))
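
Such a constraint can be reproduced with a subpartition that is first 
range-partitioned and then list-partitioned on the same column (a sketch 
consistent with the output above; the table names are made up):

create table p (i int) partition by range (i);
create table p_30_40 partition of p
    for values from (30) to (40) partition by list (i);
create table p_30_31 partition of p_30_40 for values in (30, 31);

-- \d+ p_30_31 then shows both parents' constraints concatenated,
-- repeating the (i IS NOT NULL) conjunct.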


--
Regards,
Maksim Milyutin



Re: [HACKERS] Partitions: \d vs \d+

2017-09-28 Thread Maksim Milyutin

Hi!


On 28.09.2017 16:02, Jesper Pedersen wrote:

Hi,

Using hash partitions I noticed that \d gives

D=# \d T_p63
   Table "public.T_p63"
    Column | Type  | Collation | Nullable | Default
---+---+---+--+-



Partition of: T FOR VALUES WITH (modulus 64, remainder 63)
No partition constraint
Indexes:
    "T_p63" btree (X, Y)

where as \d+ gives

D=# \d+ T_p63
   Table "public.T_p63"
    Column | Type  | Collation | Nullable | Default | 
Storage  | Stats target | Description
---+---+---+--+-+--+--+- 





Partition of: T FOR VALUES WITH (modulus 64, remainder 63)
Partition constraint: satisfies_hash_partition(64, 63, 
hashint4extended(X, '8816678312871386367'::bigint))

Indexes:
    "T_p63" btree (X, Y)

E.g. "No partition constraint" vs. "Partition constraint: 
satisfies_hash_partition(...)".


I also noticed the inconsistency of printing "No partition constraint" in 
non-verbose mode and "Partition constraint: ..." in verbose mode for 
partitions, regardless of the type of partitioning.
The attached small patch removes any output about the partition constraint 
in non-verbose mode.


--
Regards,
Maksim Milyutin

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d22ec68..b301219 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -1900,13 +1900,16 @@ describeOneTableDetails(const char *schemaname,
 			  partdef);
 			printTableAddFooter(&cont, tmpbuf.data);
 
-			/* If there isn't any constraint, show that explicitly */
-			if (partconstraintdef == NULL || partconstraintdef[0] == '\0')
-printfPQExpBuffer(&tmpbuf, _("No partition constraint"));
-			else
-printfPQExpBuffer(&tmpbuf, _("Partition constraint: %s"),
-  partconstraintdef);
-			printTableAddFooter(&cont, tmpbuf.data);
+			if (verbose)
+			{
+/* If there isn't any constraint, show that explicitly */
+if (partconstraintdef == NULL || partconstraintdef[0] == '\0')
+	printfPQExpBuffer(&tmpbuf, _("No partition constraint"));
+else
+	printfPQExpBuffer(&tmpbuf, _("Partition constraint: %s"),
+	  partconstraintdef);
+printTableAddFooter(&cont, tmpbuf.data);
+			}
 
 			PQclear(result);
 		}



Re: [HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-09-26 Thread Maksim Milyutin

25.09.17 20:50, Maksim Milyutin wrote:

I have found a problem when sequentially calling a function that casts a 
constant to the composite type of a temporary table that is deleted at each 
transaction termination (i.e. at each function call completion).



For example, we have the following function 'test':

CREATE OR REPLACE FUNCTION test()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
begin
    create temp table tbl (id int) on commit drop;
    perform json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt;
end;
$function$


On the second and subsequent calls we'll observe the following error:

ERROR:  cache lookup failed for type 16392


I investigated the problem and realized that the result type of the function 
*json_populate_record* (the funcresulttype field of the FuncExpr struct) as 
well as the type of the first null argument (the consttype field of the 
Const struct) refer to the invalid composite type related to the temporary 
table 'tbl'. Namely, they take the oid value obtained from the first 
initialization of 'tbl'. The plan of the query *'perform 
json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt'* is cached and 
is not invalidated at each function call. This is because the statement of 
this query doesn't have any dependency on the 'tbl' relation (the 
relationOids list of the CachedPlanSource struct).



The attached patch resolves this problem by adding a dependency on the 
relation when a Const expression of that relation's composite type is 
detected.


The updated patchset contains a more transparent definition of the composite 
type for the constant node and a regression test for the buggy case 
described above.


--
Regards,
Maksim Milyutin

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b0c9e94459..482b0d227c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -78,6 +78,12 @@ typedef struct
 	(((con)->consttype == REGCLASSOID || (con)->consttype == OIDOID) && \
 	 !(con)->constisnull)
 
+/*
+ * Check if a Const node has composite type
+ */
+#define ISTABLETYPECONST(con) \
+	(get_typtype((con)->consttype) == TYPTYPE_COMPOSITE)
+
 #define fix_scan_list(root, lst, rtoffset) \
 	((List *) fix_scan_expr(root, (Node *) (lst), rtoffset))
 
@@ -1410,6 +1416,12 @@ fix_expr_common(PlannerInfo *root, Node *node)
 			root->glob->relationOids =
 				lappend_oid(root->glob->relationOids,
 							DatumGetObjectId(con->constvalue));
+
+		/* Check whether const has composite type */
+		if (ISTABLETYPECONST(con))
+			root->glob->relationOids =
+				lappend_oid(root->glob->relationOids,
+							get_typ_typrelid(con->consttype));
 	}
 	else if (IsA(node, GroupingFunc))
 	{
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index c2eeff1614..5c0be47778 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -220,6 +220,28 @@ execute p2;
1
 (1 row)
 
+-- Check that invalidation deals with casting const value to temporary
+-- composite type reinitialized on each new transaction
+create function cache_query_with_composite_const() returns void as $$
+begin
+create temp table tbl(id int) on commit drop;
+
+-- Plan of the next query has to be rebuilt on each new call of function
+-- due to casting first argument 'null' to recreated temprary table 'tbl'
+perform json_populate_record(null::tbl, '{"id": 0}'::json);
+end$$ language plpgsql;
+select cache_query_with_composite_const();
+ cache_query_with_composite_const 
+--
+ 
+(1 row)
+
+select cache_query_with_composite_const();
+ cache_query_with_composite_const 
+--
+ 
+(1 row)
+
 -- Check DDL via SPI, immediately followed by SPI plan re-use
 -- (bug in original coding)
 create function cachebug() returns void as $$
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index cb2a551487..41be0d6bf4 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -140,6 +140,21 @@ create temp sequence seq;
 
 execute p2;
 
+-- Check that invalidation deals with casting const value to temporary
+-- composite type reinitialized on each new transaction
+
+create function cache_query_with_composite_const() returns void as $$
+begin
+create temp table tbl(id int) on commit drop;
+
+-- Plan of the next query has to be rebuilt on each new call of function
+-- due to casting first argument 'null' to recreated temprary tab

Re: [HACKERS] Repetitive code in RI triggers

2017-09-26 Thread Maksim Milyutin

On 19.09.2017 11:09, Daniel Gustafsson wrote:


Removing reviewer Maksim Milyutin from patch entry due to inactivity and
community account email bouncing.  Maksim: if you are indeed reviewing this
patch, then please update the community account and re-add to the patch entry.

cheers ./daniel


Daniel, thanks for noticing. I have updated my account and re-added to 
the patch entry.


Ildar, your patch conflicts with the current HEAD of the master branch; 
please update it.


--
Regards,
Maksim Milyutin





[HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-09-25 Thread Maksim Milyutin

Hello everyone!


I have found a problem when sequentially calling a function that casts a 
constant to the composite type of a temporary table that is deleted at each 
transaction termination (i.e. at each function call completion).



For example, we have the following function 'test':

CREATE OR REPLACE FUNCTION test()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
begin
    create temp table tbl (id int) on commit drop;
    perform json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt;
end;
$function$


On the second and subsequent calls we'll observe the following error:

ERROR: cache lookup failed for type 16392


I investigated the problem and realized that the result type of the function 
*json_populate_record* (the funcresulttype field of the FuncExpr struct) as 
well as the type of the first null argument (the consttype field of the 
Const struct) refer to the invalid composite type related to the temporary 
table 'tbl'. Namely, they take the oid value obtained from the first 
initialization of 'tbl'. The plan of the query *'perform 
json_populate_record(null::tbl, '{ "id": 11 }'::json) as tt'* is cached and 
is not invalidated at each function call. This is because the statement of 
this query doesn't have any dependency on the 'tbl' relation (the 
relationOids list of the CachedPlanSource struct).



The attached patch resolves this problem by adding a dependency on the 
relation when a Const expression of that relation's composite type is 
detected.


--
Regards,
Maksim Milyutin

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b0c9e94..bf5e4f4 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -78,6 +78,12 @@ typedef struct
 	(((con)->consttype == REGCLASSOID || (con)->consttype == OIDOID) && \
 	 !(con)->constisnull)
 
+/*
+ * Check if a Const node has composite type
+ */
+#define ISTABLETYPECONST(con) \
+	(OidIsValid(get_typ_typrelid((con)->consttype)))
+
 #define fix_scan_list(root, lst, rtoffset) \
 	((List *) fix_scan_expr(root, (Node *) (lst), rtoffset))
 
@@ -1410,6 +1416,12 @@ fix_expr_common(PlannerInfo *root, Node *node)
 			root->glob->relationOids =
 				lappend_oid(root->glob->relationOids,
 							DatumGetObjectId(con->constvalue));
+
+		/* Check whether const has composite type */
+		if (ISTABLETYPECONST(con))
+			root->glob->relationOids =
+				lappend_oid(root->glob->relationOids,
+							get_typ_typrelid(con->consttype));
 	}
 	else if (IsA(node, GroupingFunc))
 	{
diff --git a/src/test/regress/sql/plancache.sql b/src/test/regress/sql/plancache.sql
index cb2a551..41be0d6 100644
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -140,6 +140,21 @@ create temp sequence seq;
 
 execute p2;
 
+-- Check that invalidation deals with casting const value to temporary
+-- composite type reinitialized on each new transaction
+
+create function cache_query_with_composite_const() returns void as $$
+begin
+create temp table tbl(id int) on commit drop;
+
+-- Plan of the next query has to be rebuilt on each new call of function
+-- due to casting first argument 'null' to recreated temprary table 'tbl'
+perform json_populate_record(null::tbl, '{"id": 0}'::json);
+end$$ language plpgsql;
+
+select cache_query_with_composite_const();
+select cache_query_with_composite_const();
+
 -- Check DDL via SPI, immediately followed by SPI plan re-use
 -- (bug in original coding)
 



Re: [HACKERS] auto_explain : log queries with wrong estimation

2017-08-24 Thread Maksim Milyutin

On 24.08.2017 14:56, Adrien Nayrat wrote:


Hi hackers,


Hi,

I am trying to make a patch for auto_explain in order to log queries with 
wrong estimation.


I compare the planned rows: queryDesc->planstate->plan->plan_rows

vs ntuples: queryDesc->planstate->instrument->ntuples;


AFAICS you want to introduce two additional per-node variables:
 - auto_explain_log_estimate_ratio, which denotes the minimum ratio (>= 1) 
between the real value and the planned one. I would add a 'min' prefix 
before 'ratio'.
 - auto_explain_log_estimate_min_rows - the minimum absolute difference 
between those two values. IMHO this name is somewhat poor; the suffix 
'min_diff_rows' looks better.
If the actual ratio and difference both exceed these thresholds, you log the 
query (a usage sketch is below). Am I right?
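
With the existing auto_explain settings plus the proposed ones, usage could 
look roughly like this (a sketch only; the two log_estimate_* parameters are 
the hypothetical names discussed above, they do not exist yet):

LOAD 'auto_explain';
SET auto_explain.log_analyze = on;            -- needed so instrumentation is collected
SET auto_explain.log_min_duration = 0;
-- hypothetical new settings from this proposal:
SET auto_explain.log_estimate_min_ratio = 10;        -- actual rows / planned rows >= 10
SET auto_explain.log_estimate_min_diff_rows = 1000;  -- |actual - planned| >= 1000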


If I understand correctly, instrumentation is used only with explain. So my 
patch works only with explain (and segfaults without it).


Instrumentation is initialized only with analyze (when log_analyze is true) [1].


Is there a simple way to get ntuples?


It's an interesting question. At the time, I didn't find any way to get the 
number of tuples emitted from a node.



1. contrib/auto_explain/auto_explain.c:221

--
Regards,
Maksim Milyutin



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-08-10 Thread Maksim Milyutin

10.08.17 23:01, Robert Haas wrote:


On Wed, Apr 19, 2017 at 5:25 AM, Maksim Milyutin
 wrote:

Ok, thanks for the feedback. Then I'll use a new relkind for local
partitioned index in further development.

Any update on this?



I'll continue this work soon. Sorry for such a long delay.

--
Regards,
Maksim Milyutin





Re: [HACKERS] [PATCH] New command to monitor progression of long running queries

2017-04-19 Thread Maksim Milyutin

On 19.04.2017 17:13, Remi Colinet wrote:

Maksim,


2017-04-18 20:31 GMT+02:00 Maksim Milyutin mailto:m.milyu...@postgrespro.ru>>:

On 18.04.2017 17:39, Remi Colinet wrote:


Regarding the queryDesc state of SQL query upon receiving a
request to
report its execution progress, it does not bring any issue. The
request
is noted when the signal is received by the monitored backend.
Then, the
backend continues its execution code path. When interrupts are
checked
in the executor code, the request will be dealt.


Yes, interrupts are checked in the CHECK_FOR_INTERRUPTS entries.

When the request is being dealt, the monitored backend will stop its
execution and report the progress of the SQL query. Whatever is the
status of the SQL query, progress.c code checks the status and
report
either that the SQL query does not have a valid status, or
otherwise the
current execution state of the SQL query.

SQL query status checking is about:
- idle transaction
- out of transaction status
- null planned statement
- utility command
- self monitoring

Other tests can be added if needed to exclude some SQL query
state. Such
checking is done in void HandleProgressRequest(void).
I do not see why a SQL query progression would not be possible
in this
context. Even when the queryDesc is NULL, we can just report a
 output. This is currently the case with the patch
suggested.


It's an interesting question - to what extent the active query is in a 
usable state at that stage of execution. Tom Lane noticed that
CHECK_FOR_INTERRUPTS doesn't give us a 100% guarantee of full
consistency [1].


I wonder what you mean about usable state.



A usable query state is one suitable for analysis, IOW we have a consistent 
QueryDesc object. This term was introduced by Tom Lane in [1]. I suppose he 
meant the case when a query fails with an error and, before the transaction 
aborts, we bump into *CHECK_FOR_INTERRUPTS* in a place where the QueryDesc 
may be inconsistent and further reading from it would give us an invalid 
result.




Currently, the code suggested tests the queryDesc pointer and all the
sub nodes pointers in order to detect NULL pointers. When the progress
report is collected by the backend, this backend does the collect and
consequently does not run the query. So the query tree is not being
modified. At this moment, whatever is the query state, we can manage to
deal with its static state. It is only a tree which could also be just a
NULL pointer in the most extreme case. Such case is dealt in the current
code.



Perhaps deep checking of the QueryDesc would allow us to consider it 
consistent.



1. https://www.postgresql.org/message-id/24182.1472745492%40sss.pgh.pa.us

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-04-19 Thread Maksim Milyutin

On 19.04.2017 11:42, Ashutosh Bapat wrote:

On Tue, Apr 18, 2017 at 4:43 PM, Maksim Milyutin
 wrote:


Local partitioned indexes can be recognized by checking the relkind of the
table to which the index refers. Something like this:

heap = relation_open(IndexGetRelation(indexid, false), heapLockmode);
if (heap->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
/* indexid is local index on partitioned table */


An index on partitioned table can be global index (yet to be
implemented) or a local index. We can not differentiate between those
just by looking at the relation on which they are built.



We could refine the criteria for a local partitioned index later, 
encapsulating it in a macro, e.g., adding a new flag to pg_index that 
differentiates the type of an index on a partitioned table.




These cases must be caught. But since partitioned tables don't participate 
in query plans, their indexes are inaccessible to the executor.
The REINDEX operation is overridden by my patch.



A global index would have storage for a partitioned table whereas a
local index wouldn't have any storage for a partitioned table.

I agree with Amit that we need new relkinds for local as well as global indexes.



Ok, thanks for the feedback. Then I'll use a new relkind for local 
partitioned index in further development.




--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [PATCH] New command to monitor progression of long running queries

2017-04-18 Thread Maksim Milyutin

On 18.04.2017 17:39, Remi Colinet wrote:

Hello Maksim,

The core implementation I suggested for the new PROGRESS command uses
different functions from the one used by EXPLAIN for its core
implementation.
Some source code is shared with EXPLAIN command. But this shared code is
only related to quals, properties, children, subPlans and few other nodes.

All other code for PROGRESS is new code.

I don't believe explain.c code can be fully shared with the one of the
new PROGRESS command. These 2 commands have different purposes.
The core code used for the new PROGRESS command is very different from
the core code used for EXPLAIN.



Perhaps you will be forced to duplicate significant snippets of 
functionality from explain.c into your progress.c.




Regarding the queryDesc state of SQL query upon receiving a request to
report its execution progress, it does not bring any issue. The request
is noted when the signal is received by the monitored backend. Then, the
backend continues its execution code path. When interrupts are checked
in the executor code, the request will be dealt.



Yes, interrupts are checked in the CHECK_FOR_INTERRUPTS entries.


When the request is being dealt, the monitored backend will stop its
execution and report the progress of the SQL query. Whatever is the
status of the SQL query, progress.c code checks the status and report
either that the SQL query does not have a valid status, or otherwise the
current execution state of the SQL query.

SQL query status checking is about:
- idle transaction
- out of transaction status
- null planned statement
- utility command
- self monitoring

Other tests can be added if needed to exclude some SQL query state. Such
checking is done in void HandleProgressRequest(void).
I do not see why a SQL query progression would not be possible in this
context. Even when the queryDesc is NULL, we can just report a  output. This is currently the case with the patch suggested.



It's an interesting question - to what extent the active query is in a 
usable state at that stage of execution. Tom Lane noticed that 
CHECK_FOR_INTERRUPTS doesn't give us a 100% guarantee of full 
consistency [1].



So far, I've found this new command very handy. It allows to evaluate
the time needed to complete a SQL query.



Could you explain how you get the percentage of execution for the nodes of 
the plan tree and for the query overall?



A further improvement would be to report the memory, disk and time
resources used by the monitored backend. An overuse of memory, disk and
time resources can prevent the SQL query to complete.



This functionality is implemented to some extent in explain.c; you can see 
my patch to this file [2]. I only manipulate runtime statistics (the data in 
the 'Instrumentation' structure) to print the state of a running query.



1. https://www.postgresql.org/message-id/24182.1472745492%40sss.pgh.pa.us
2. 
https://github.com/postgrespro/pg_query_state/blob/master/runtime_explain.patch


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [PATCH] New command to monitor progression of long running queries

2017-04-18 Thread Maksim Milyutin
The patch is still work in progress. It is meanwhile stable. It
can be used with regular queries. Utilities commands are not supported
for the moment.
Documentation is not yet written.

Regards
Remi



I have implemented an analogous feature as the extension *pg_query_state* 
[1], the idea of which I proposed in the thread [2]. Together with this 
extension I provided some patches to the postgres core to enable sending 
custom signals to a working backend (similar to your PROCSIG_PROGRESS) and 
to print the current query state through changes to the 'ExplainNode' 
function.


I have implemented the same mechanics as you:
1) interrupt the working backend through ProcSignal;
2) handle the progress request in a CHECK_FOR_INTERRUPTS entry;
3) transfer the query state through shared memory to the caller.
But the criticism of my approach was that the 'QueryDesc' structure, on the 
basis of which the query state is formed, can be inconsistent in the places 
where CHECK_FOR_INTERRUPTS appears [3].


I plan to propose the custom_signal patch to the community as soon as 
possible and, as a consequence, free *pg_query_state* from its dependency on 
patches to the postgres core. In the longer term, I want to resolve the 
problem of being bound to the CHECK_FOR_INTERRUPTS entries, perhaps by 
patching the executor, and implement a robust PROGRESS command. A usage 
sketch of the extension is below.
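
Usage of the extension looks roughly like this (a sketch only; the exact 
function signature may differ, see the repository in [1] for the actual 
interface):

-- in one session: a long-running query, e.g.
-- select count(*) from big_table t1 join big_table t2 on t1.id = t2.id;

-- in another session: ask that backend for its current execution state
-- (12345 here stands for the pid of the monitored backend)
select * from pg_query_state(12345);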



1. https://github.com/postgrespro/pg_query_state
2. 
https://www.postgresql.org/message-id/dbfb1a42-ee58-88fd-8d77-550498f52bc5%40postgrespro.ru

3. https://www.postgresql.org/message-id/24182.1472745492%40sss.pgh.pa.us


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-04-18 Thread Maksim Milyutin

On 18.04.2017 13:08, Amit Langote wrote:

Hi,



Hi, Amit!


On 2017/04/17 23:00, Maksim Milyutin wrote:


Ok, thanks for the note.

But I want to discuss whether introducing a new relkind for a partitioned 
index is really necessary. I could change the control flow in partitioned 
index creation (via a conditional statement in the 'index_create' routine in 
the attached patch) and not enter the 'heap_create' routine at all. This 
frees us from integrating a new relkind into different places of the 
Postgres code. But we would have to copy-paste some specific code from the 
'heap_create' function, e.g., the definition of relfilenode and tablespaceid 
for the new index, and perhaps more when the 'heap_create' routine is 
extended.


I may be missing something, but isn't it that a new relkind will be needed
anyway?  How does the rest of the code distinguish such index objects once
they are created?


Local partitioned indexes can be recognized by checking the relkind of the 
table to which the index refers. Something like this:


heap = relation_open(IndexGetRelation(indexid, false), heapLockmode);
if (heap->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
/* indexid is local index on partitioned table */


Is it possible that some other code may try to access
the storage for an index whose indrelid is a partitioned table?



These cases must be caught. But since partitioned tables don't participate 
in query plans, their indexes are inaccessible to the executor. The REINDEX 
operation is overridden by my patch.



--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-04-17 Thread Maksim Milyutin

On 10.04.2017 14:20, Robert Haas wrote:

On Tue, Apr 4, 2017 at 12:10 PM, Maksim Milyutin
 wrote:

1. I have added a new relkind for local indexes named RELKIND_LOCAL_INDEX
(literal 'l').


Seems like it should maybe be RELKIND_PARTITIONED_INDEX.  There's
nothing particularly "local" about it.  I suppose what you're going
for is that it's not global, but in a way it *is* global to the
partitioning hierarchy.  That's the point.  It's just that it's
partitioned.



Ok, thanks for the note.

But I want to discuss whether introducing a new relkind for a partitioned 
index is really necessary. I could change the control flow in partitioned 
index creation (via a conditional statement in the 'index_create' routine in 
the attached patch) and not enter the 'heap_create' routine at all. This 
frees us from integrating a new relkind into different places of the 
Postgres code. But we would have to copy-paste some specific code from the 
'heap_create' function, e.g., the definition of relfilenode and tablespaceid 
for the new index, and perhaps more when the 'heap_create' routine is 
extended.

What do you think about this approach?


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 2328b92..9c15bc9 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -41,6 +41,7 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_constraint_fn.h"
+#include "catalog/pg_depend.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_tablespace.h"
@@ -849,17 +850,33 @@ index_create(Relation heapRelation,
 	 * we fail further down, it's the smgr's responsibility to remove the disk
 	 * file again.)
 	 */
-	indexRelation = heap_create(indexRelationName,
-namespaceId,
-tableSpaceId,
-indexRelationId,
-relFileNode,
-indexTupDesc,
-RELKIND_INDEX,
-relpersistence,
-shared_relation,
-mapped_relation,
-allow_system_table_mods);
+	if (heapRelation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+		indexRelation = heap_create(indexRelationName,
+	namespaceId,
+	tableSpaceId,
+	indexRelationId,
+	relFileNode,
+	indexTupDesc,
+	RELKIND_INDEX,
+	relpersistence,
+	shared_relation,
+	mapped_relation,
+	allow_system_table_mods);
+	else
+		indexRelation =
+			RelationBuildLocalRelation(indexRelationName,
+	   namespaceId,
+	   indexTupDesc,
+	   indexRelationId,
+	   (OidIsValid(relFileNode)) ?
+		relFileNode : indexRelationId,
+	   (tableSpaceId == MyDatabaseTableSpace) ?
+		InvalidOid : tableSpaceId,
+	   shared_relation,
+	   mapped_relation,
+	   relpersistence,
+	   RELKIND_INDEX);
+
 
 	Assert(indexRelationId == RelationGetRelid(indexRelation));
 
@@ -1549,10 +1566,14 @@ index_drop(Oid indexId, bool concurrent)
 		TransferPredicateLocksToHeapRelation(userIndexRelation);
 	}
 
-	/*
-	 * Schedule physical removal of the files
-	 */
-	RelationDropStorage(userIndexRelation);
+	if (userHeapRelation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+	{
+
+		/*
+		 * Schedule physical removal of the files
+		 */
+		RelationDropStorage(userIndexRelation);
+	}
 
 	/*
 	 * Close and flush the index's relcache entry, to ensure relcache doesn't
@@ -3294,6 +3315,111 @@ IndexGetRelation(Oid indexId, bool missing_ok)
 }
 
 /*
+ * Find all leaf indexes included into local index with 'indexId' oid and lock
+ * all dependent indexes and respective relations.
+ *
+ * Search is performed in pg_depend table since all indexes belonging to child
+ * tables depends on index from parent table.
+ *
+ *	indexId: the oid of local index whose leaf indexes need to find
+ *	result: list of result leaf indexes
+ *	depRel: already opened pg_depend relation
+ *	indexLockmode: lockmode for indexes' locks
+ *	heapLockmode: lockmode for relations' locks
+ */
+static void
+findDepedentLeafIndexes(Oid indexId, List **result, Relation depRel,
+		LOCKMODE indexLockmode, LOCKMODE heapLockmode)
+{
+	ScanKeyData 		key[3];
+	int	nkeys;
+	SysScanDesc 		scan;
+	HeapTuple			tup;
+	List			   *localSubIndexIds = NIL;
+	ListCell		   *lc;
+
+	ScanKeyInit(&key[0],
+Anum_pg_depend_refclassid,
+BTEqualStrategyNumber, F_OIDEQ,
+ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+Anum_pg_depend_refobjid,
+BTEqualStrategyNumber, F_OIDEQ,
+ObjectIdGetDatum(indexId));
+	nkeys = 2;
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+			  NULL, nkeys, key);
+
+	while (HeapTupleIsVa

Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-04-10 Thread Maksim Milyutin

On 10.04.2017 13:46, Greg Stark wrote:

On 4 April 2017 at 17:10, Maksim Milyutin  wrote:


3. As I noticed early pg_depend table is used for cascade deleting indexes
on partitioned table and its children. I also use pg_depend to determine
relationship between parent and child indexes when reindex executes
recursively on child indexes.

Perhaps, it's not good way to use pg_depend to determine the relationship
between parent and child indexes because the kind of this relationship is
not defined. I could propose to add into pg_index table specific field of
'oidvector' type that specify oids of dependent indexes for the current
local index.



Alternately you could have an single oid in pg_index on each of the
children that specifies which local index is its parent. That would
probably require a new index on that column so you could look up all
the children efficiently.

I think it would behave more sensibly when you're adding or removing a
partition, especially if you want to add many partitions in parallel
using multiple transactions. An oidvector of children would
effectively mean you could only be doing one partition creation or
deletion at a time.



Thanks for your comment. Your approach sounds better than mine. I'll try it.

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Declarative partitioning - another take

2017-04-07 Thread Maksim Milyutin

On 07.04.2017 13:05, Etsuro Fujita wrote:

On 2016/12/14 16:20, Etsuro Fujita wrote:

On 2016/12/09 19:46, Maksim Milyutin wrote:

I would like to work on two tasks:
 - insert (and eventually update) tuple routing for foreign partition.
 - the ability to create an index on the parent and have all of the
children inherit it;

The first one has been implemented in pg_pathman somehow, but the code
relies on dirty hacks, so the FDW API has to be improved. As for the
extended index support, it doesn't look like a super-hard task.



That would be great!  I'd like to help review the first one.


There seems to be no work on the first one, so I'd like to work on that.


Yes, you can start working on this; I'll join later as a reviewer.



--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-04-04 Thread Maksim Milyutin

On 01.03.2017 13:53, Maksim Milyutin wrote:

Hi hackers!

As I've understood from thread [1], the main issue with creating local
indexes for partitions is supporting the REINDEX and DROP INDEX operations
on parent partitioned tables. Furthermore, Robert Haas mentioned the
problem of creating an index on a key that is represented in partitions by a
single value (or a primitive interval) [1], i.e. under list partitioning or
range partitioning with a unit interval.

I would like to propose the following solution:

1. Create indexes for the hierarchy of partitioned tables and partitions
recursively. Don't create a relfilenode for indexes on parents, only
entries in the catalog (much like the partitioned table's storage
elimination in [2]). The abstract index for a partitioned table exists only
to reference the indexes of the child tables, in order to perform REINDEX
and DROP INDEX operations.

2. Record the created indexes in the pg_depend table so that indexes of
child tables depend on the corresponding indexes of parent tables with
dependency type DEPENDENCY_NORMAL, so that an index can be removed
separately for partitions and recursively/separately for partitioned tables.

3. REINDEX on an index of a partitioned table would perform this operation
on the existing indexes of the corresponding partitions. Here it is
necessary to handle operations such as REINDEX SCHEMA | DATABASE |
SYSTEM so that partitions' indexes aren't re-indexed multiple times
in a row.

Any thoughts?
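
A sketch of the intended user-visible behaviour under this proposal (the
object names are made up; the syntax is the same as for ordinary indexes):

create table measurement (logdate date, value int) partition by range (logdate);
create table measurement_2017 partition of measurement
    for values from ('2017-01-01') to ('2018-01-01');

-- recursively creates a physical index on each partition and a catalog-only
-- ("abstract") entry for the partitioned table itself
create index measurement_value_idx on measurement (value);

-- operate on all dependent child indexes through the abstract index
reindex index measurement_value_idx;
drop index measurement_value_idx;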

1.
https://www.postgresql.org/message-id/CA+TgmoZUwj=qynak+f7xef4w_e2g3xxdmnsnzmzjuinhrco...@mail.gmail.com

2.
https://www.postgresql.org/message-id/2b0d42f2-3a53-763b-c9c2-47139e4b1c2e%40lab.ntt.co.jp




I want to present the first version of patches that implement local 
indexes for partitioned tables and discuss some technical details of 
that implementation.



1. I have added a new relkind for local indexes named 
RELKIND_LOCAL_INDEX (literal 'l').


This was done because physical storage is created in the 'heap_create' 
function and we need to suppress storage creation, as is done for 
partitioned tables. Since the information that the index belongs to a 
partitioned table is not available in the 'heap_create' function (the 
pg_index entry for the index is not created yet), I chose the least painful 
way - adding a specific relkind for an index on a partitioned table.
I suppose this will require integrating the new relkind into different 
places of the source code, so I'm ready to consider other proposals on this 
point.


2. My implementation doesn't support concurrent creation of a local index 
(CREATE INDEX CONCURRENTLY). As I understand it, this operation involves 
nontrivial manipulation of snapshots, and I don't know how to implement 
concurrent creation of multiple indexes. On this point I ask for help from 
the community.


3. As I noted earlier, the pg_depend table is used for cascade deletion of 
indexes on a partitioned table and its children. I also use pg_depend to 
determine the relationship between parent and child indexes when REINDEX 
executes recursively on child indexes (a lookup sketch is below).


Perhaps using pg_depend to determine the relationship between parent and 
child indexes is not a good way, because the kind of this relationship is 
not defined. I could propose adding to the pg_index table a specific field 
of 'oidvector' type that specifies the oids of the dependent indexes of the 
current local index.
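
The pg_depend lookup that the patch relies on amounts to something like this 
(a sketch only; 'parent_idx' is a made-up index name, and this roughly 
mirrors the scan keys used in findDepedentLeafIndexes in the attached 
patch):

-- child indexes are recorded as depending on the parent ("local") index
select objid::regclass as child_index
from pg_depend
where classid    = 'pg_class'::regclass
  and refclassid = 'pg_class'::regclass
  and refobjid   = 'parent_idx'::regclass
  and deptype    = 'n';   -- DEPENDENCY_NORMAL, as in point 2 of the proposal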



At this stage I want to discuss only the technical details of the local 
indexes' implementation. The problems related to merging the existing 
partition indexes into a local index tree, determining global uniqueness of 
a column through a local index, and syntax notes, I want to raise later.



CC welcome!

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index cc5ac8b..bec3983 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -154,7 +154,8 @@ index_open(Oid relationId, LOCKMODE lockmode)
 
 	r = relation_open(relationId, lockmode);
 
-	if (r->rd_rel->relkind != RELKIND_INDEX)
+	if (r->rd_rel->relkind != RELKIND_INDEX &&
+		r->rd_rel->relkind != RELKIND_LOCAL_INDEX)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("\"%s\" is not an index",
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index fc088b2..26a10e9 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1107,7 +1107,8 @@ doDeletion(const ObjectAddress *object, int flags)
 			{
 char		relKind = get_rel_relkind(object->objectId);
 
-if (relKind == RELKIND_INDEX)
+if (relKind == RELKIND_INDEX ||
+	relKind == RELKIND_LOCAL_INDEX)
 {
 	bool		concurrent = ((flags & PERFORM_DELETION_CONCURRENTLY) != 0);
 
diff --git a/src/backend/catalog/heap.c b/src/backend

Re: [HACKERS] Partitioned tables and relfilenode

2017-03-27 Thread Maksim Milyutin

On 24.03.2017 03:54, Amit Langote wrote:


And here is the updated patch.



Perhaps you forgot my fix in the updated patch

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 3999e6e..36917c8 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1823,7 +1823,8 @@ heap_drop_with_catalog(Oid relid)
 */
if (rel->rd_rel->relkind != RELKIND_VIEW &&
rel->rd_rel->relkind != RELKIND_COMPOSITE_TYPE &&
-   rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE)
+   rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
+   rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
{
    RelationDropStorage(rel);
}

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Partitioned tables and relfilenode

2017-03-23 Thread Maksim Milyutin

Hi!

I have noticed that unlinking of nonexistent physical storage is scheduled 
for a partitioned table when we execute a DROP TABLE statement on it. 
Although this action doesn't generate any error under the typical behavior 
of postgres, because the absence of the storage is caught by an if-statement 
[1], I think it is not safe.


My patch fixes this issue.

1. src/backend/storage/smgr/md.c:1385

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 3999e6e..36917c8 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1823,7 +1823,8 @@ heap_drop_with_catalog(Oid relid)
 	 */
 	if (rel->rd_rel->relkind != RELKIND_VIEW &&
 		rel->rd_rel->relkind != RELKIND_COMPOSITE_TYPE &&
-		rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE)
+		rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
+		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 	{
 		RelationDropStorage(rel);
 	}



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-03-02 Thread Maksim Milyutin

On 02.03.2017 11:41, Robert Haas wrote:

Sounds generally good.  One thing to keep in mind is that - in this
system - a UNIQUE index on the parent doesn't actually guarantee
uniqueness across the whole partitioning hierarchy unless it so
happens that the index columns or expressions are the same as the
partitioning columns or expressions.  That's a little a
counterintuitive, and people have already been complaining that a
partitioned table + partitions doesn't look enough like a plain table.
However, I'm not sure there's a better alternative, because somebody
might want partition-wise unique indexes even though that doesn't
guarantee global uniqueness.  So I think if someday we have global
indexes, then we can plan to use some other syntax for that, like
CREATE GLOBAL [ UNIQUE ] INDEX.


Yes, I absolutely agree with your message that cross-partition uniqueness 
can only be guaranteed through a global index on the partitioned table, 
apart from the case when the index key is the same as the partitioning key 
(or the index comprises the partitioning key in general).


Thanks for your comment. I'll try to propose the first patches as soon 
as possible.


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [POC] hash partitioning

2017-03-01 Thread Maksim Milyutin

On 01.03.2017 05:14, Amit Langote wrote:

Nagata-san,


A partition table can be create as bellow;

 CREATE TABLE h1 PARTITION OF h;
 CREATE TABLE h2 PARTITION OF h;
 CREATE TABLE h3 PARTITION OF h;

FOR VALUES clause cannot be used, and the partition bound is
calclulated automatically as partition index of single integer value.

When trying create partitions more than the number specified
by PARTITIONS, it gets an error.

postgres=# create table h4 partition of h;
ERROR:  cannot create hash partition more than 3 for h


Instead of having to create each partition individually, wouldn't it be
better if the following command

CREATE TABLE h (i int) PARTITION BY HASH (i) PARTITIONS 3;

created the partitions *automatically*?


It's a good idea, but in this case we can't create a hash partition that is 
itself a partitioned table, and as a consequence we are unable to create 
subpartitions. My understanding is that a table can be made partitioned only 
through the CREATE TABLE statement, not ALTER TABLE. For this reason the 
newly created partitions would only be regular tables.


We could achieve the desired result by creating a separate partitioned 
table and doing a DETACH/ATTACH manipulation, though. But IMO that's not a 
flexible approach.


It would be a good thing if a regular table could be made partitioned 
through a separate command. Then your idea would not be restrictive.



--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] Proposal: Local indexes for partitioned table

2017-03-01 Thread Maksim Milyutin

Hi hackers!

As I've understood from thread [1], the main issue with creating local 
indexes for partitions is supporting the REINDEX and DROP INDEX operations 
on the parent partitioned tables. Furthermore, Robert Haas mentioned the 
problem of creating an index on a key that is represented in each partition 
by a single value (or a primitive interval) [1], i.e. under list 
partitioning or range partitioning with a unit interval.


I would like to propose the following solution:

1. Create the index recursively for the whole hierarchy of partitioned 
tables and partitions. Don't create a relfilenode for indexes on parents, 
only entries in the catalog (much like the elimination of the partitioned 
table's storage in [2]). The abstract index on a partitioned table serves 
only as a reference to the indexes of the child tables, so that REINDEX and 
DROP INDEX operations can be performed.


2. Record the created indexes in the pg_depend table so that the indexes of 
child tables depend on the corresponding indexes of the parent tables with 
dependency type DEPENDENCY_NORMAL. This way an index can be removed 
separately for partitions and recursively/separately for partitioned tables.


3. REINDEX on the index of a partitioned table would perform this operation 
on the existing indexes of the corresponding partitions. In this case it is 
necessary to take operations such as REINDEX SCHEMA | DATABASE | SYSTEM 
into account so that partition indexes aren't re-indexed multiple times 
in a row.


Any thoughts?

1. 
https://www.postgresql.org/message-id/CA+TgmoZUwj=qynak+f7xef4w_e2g3xxdmnsnzmzjuinhrco...@mail.gmail.com
2. 
https://www.postgresql.org/message-id/2b0d42f2-3a53-763b-c9c2-47139e4b1c2e%40lab.ntt.co.jp


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Declarative partitioning - another take

2016-12-09 Thread Maksim Milyutin

Hi, everyone!


08.12.16 18:25, Robert Haas wrote:

Of course, this is the beginning, not the end.  I've been thinking
about next steps -- here's an expanded list:

- more efficient plan-time partition pruning (constraint exclusion is too slow)
- run-time partition pruning
- partition-wise join (Ashutosh Bapat is already working on this)
- try to reduce lock levels
- hash partitioning
- the ability to create an index on the parent and have all of the
children inherit it; this should work something like constraint
inheritance.  you could argue that this doesn't add any real new
capability but it's a huge usability feature.
- teaching autovacuum enough about inheritance hierarchies for it to
update the parent statistics when they get stale despite the lack of
any actual inserts/updates/deletes to the parent.  this has been
pending for a long time, but it's only going to get more important
- row movement (aka avoiding the need for an ON UPDATE trigger on each
partition)
- insert (and eventually update) tuple routing for foreign partitions
- not scanning the parent
- fixing the insert routing so that we can skip tuple conversion where possible
- fleshing out the documentation


I would like to work on two tasks:
 - insert (and eventually update) tuple routing for foreign partitions;
 - the ability to create an index on the parent and have all of the 
children inherit it.


The first one has been implemented in pg_pathman after a fashion, but the 
code relies on dirty hacks, so the FDW API has to be improved. As for the 
extended index support, it doesn't look like a super-hard task.


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company





Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-09-02 Thread Maksim Milyutin

01.09.2016 18:58, Tom Lane пишет:

Maksim Milyutin  writes:

On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin <m.milyu...@postgrespro.ru> wrote:
Yes, but the problem is that nothing gives you the guarantee that at the
moment you decide to handle the interrupt, the QueryDesc structures
you're looking at are in a good shape for Explain* functions to run on
them.  Even if that appears to be the case in your testing now, there's
no way to tell if that will be the case at any future point in time.



CHECK_FOR_INTERRUPTS are located in places where query state (QueryDesc
structure) is more or less consistent.


Really?  Even if that's 100% true today, which I wouldn't bet very much
on, it seems like a really dangerous property to insist on system-wide.
The only restriction we have ever placed on CHECK_FOR_INTERRUPTS is that
it occur at places where it'd be safe to throw elog(ERROR), and in general
we don't assume that the active query is still in a usable state after
an error.  What you propose would amount to a new restriction that nothing
can ever call any nontrivial subroutine while the active query tree is
less than fully valid (because the subroutine might contain a
CHECK_FOR_INTERRUPTS somewhere).  That sounds impractical and
unenforceable.


Ok, thanks! I could propose a different approach: when CHECK_FOR_INTERRUPTS 
occurs, set a hook to be executed by the following ExecProcNode(). The hook 
is going to be provided by my extension. It is expected to send the current 
query state to some recipient backend (the one who asked for it). I think 
the active query is consistent after any node has worked off one or zero 
rows. After it has sent all the necessary data, the hook will disable itself.


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-09-01 Thread Maksim Milyutin



On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin <m.milyu...@postgrespro.ru> wrote:

On Mon, Aug 29, 2016 at 5:22 PM, maksim <m.milyu...@postgrespro.ru> wrote:

Hi, hackers!

Now I complete extension that provides facility to see the
current
state of query execution working on external session in form of
EXPLAIN ANALYZE output. This extension works on 9.5 version,
for 9.6
and later it doesn't support detailed statistics for
parallel nodes yet.

I want to present patches to the latest version of
PostgreSQL core
to enable this extension.

Hello,

Did you publish the extension itself yet?


Hello, extension for version 9.5 is available in repository
https://github.com/postgrespro/pg_query_state/tree/master.

Last year (actually, exactly one year ago) I was trying to do
something
very similar, and it quickly turned out that signals are not the
best
way to make this sort of inspection.  You can find the discussion
here:

https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=xx_hq7qq5ndr4jpo+h-o5vp6i...@mail.gmail.com


Thanks for link!

My patch *custom_signal.patch* resolves the problem of «heavy»
signal handlers. In essence, I follow the course offered in
*procsignal.c* file. They define *ProcSignalReason* values on which
the SUGUSR1 is multiplexed. Signal recent causes setting flags for
*ProcessInterrupt* actuating, i.e. procsignal_sigusr1_handler() only
sets specific flags. When CHECK_FOR_INTERRUPTS appears later on
query execution *ProcessInterrupt* is called. Then triggered user
defined signal handler is executed.
As a result, we have a deferred signal handling.


Yes, but the problem is that nothing gives you the guarantee that at the
moment you decide to handle the interrupt, the QueryDesc structures
you're looking at are in a good shape for Explain* functions to run on
them.  Even if that appears to be the case in your testing now, there's
no way to tell if that will be the case at any future point in time.


CHECK_FOR_INTERRUPTS calls are located in places where the query state (the 
QueryDesc structure) is more or less consistent. In these macro calls I pass 
the QueryDesc to the Explain* functions. I know for sure that the elementary 
statistics-updating functions (e.g. InstrStartNode, InstrStopNode, etc.) 
don't contain CHECK_FOR_INTERRUPTS within themselves, therefore the 
statistics, at least at the node level, are consistent when the backend is 
ready to transfer its state.
The problem may rather be in the interpretation of the collected statistics 
in the Explain* functions. In my practice there was a case when a wrong 
number of inserted rows was shown for an INSERT ON CONFLICT request. That 
request consisted of two parts, a SELECT from the table and an INSERT with a 
check on a predicate, and I was interrupted between these parts. The formula 
for inserted rows was the number of rows extracted by the SELECT minus the 
rows rejected by the INSERT, so I got an imaginary inserted row. I removed 
printing the number of inserted rows in the explain of a running query 
because I don't know whether the INSERT node has processed that last row. 
But the remaining number of rejected rows was deterministic, and I showed it.



Another problem is use if shm_mq facility, because it manipulates the
state of process latch.  This is not supposed to happen to a backend
happily performing its tasks, at random due to external factors, and
this is what the proposed approach introduces


In the Postgres source code, most WaitLatch() calls on the process latch are 
surrounded by a loop and form a pattern like this:


  for (;;)
  {
 if (desired_state_has_occured)
   break;

 WaitLatch(MyLatch);
 CHECK_FOR_INTERRUPTS();
 ResetLatch(MyLatch)
  }

The motivation for this design is clearly illustrated by an extract from a 
comment in the Postgres core: usage of "the generic process latch has to be 
robust against unrelated wakeups: Always check that the desired state has 
occurred, and wait again if not" [1].


I mean that randomly setting the process latch while a query is executing 
doesn't affect other usages of that latch later in the code.



1. src/backend/storage/lmgr/proc.c:1724 for ProcWaitForSignal function

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-08-31 Thread Maksim Milyutin



On 2016-08-30 11:22:43 +0300, Maksim Milyutin wrote:

Hi,

On 2016-08-29 18:22:56 +0300, maksim wrote:

Now I complete extension that provides facility to see the current state of
query execution working on external session in form of EXPLAIN ANALYZE
output. This extension works on 9.5 version, for 9.6 and later it doesn't
support detailed statistics for parallel nodes yet.


Could you expand a bit on what you want this for exactly?


Max goal - to push my extension to postgres core. But now it's ready only
for 9.5. Prerequisites of this extension are patches presented here.


I'm asking what you want this for. "An extension" isn't a detailed
description...



I want to provide a facility to fetch the state of a query running in some 
other backend on the same server. In essence, it's going to be a 
microlevel monitoring tool. A typical use case looks like this:


1) assume the 1st backend executes a simple query:
 select * from foo join bar on foo.c1=bar.c1
2) somebody tries to fetch the state of that backend, so he addresses it 
by pid:

 select * from pg_query_state(pid := <1st_backend_pid>)
3) he'll get a detailed description of the state, something like this:

Hash Join (Current loop: actual rows=0, loop number=1)
   Hash Cond: (foo.c1 = bar.c1)
   ->  Seq Scan on foo (Current loop: actual rows=1, loop number=1)
   ->  Hash (Current loop: actual rows=0, loop number=1)
   Buckets: 131072  Batches: 8  Memory Usage: 1kB
   ->  Seq Scan on bar (Current loop: actual rows=49, loop 
number=1)


Note that I've added *Current loop* records with the number of emitted rows 
(*actual rows*) and the *loop number* attached to each node. We could also 
add timing info.


For parallel nodes I want to print statistics for each worker separately 
(it's not finished yet).


You could also watch my screencast (it's short enough) to get the idea: 
https://asciinema.org/a/981bed2lu7r8sx60u5lsjei30





2. Patch that enables to interrupt the query executor
(executor_hooks.patch).
This patch enables to hang up hooks on executor function of each node
(ExecProcNode). I define hooks before any node execution and after
execution.
I use this patch to add possibility of query tracing by emitted rows from
any node. I interrupt query executor after any node delivers one or zero
rows to upper node. And after execution of specific number trace steps I can
get the query state of traceable backend which will be somewhat
deterministic. I use this possibility for regression tests of my extension.


This will increase executor overhead.


In simple case we have checks on existence of hooks.


That *is* noticeable.



Then I'll seriously consider hiding the hook check inside the 
"if (instrument)" statement, thanks!


 >>> I think we'll need to find a way
 >>> to hide this behind the existing if (instrument) branches.
 >>
 >> And so can be. It doesn't matter for trace mode. But I think instrument
 >> branch is intended only for collecting statistics by nodes.
 >
 > I can't follow here. That's all what analyze is about?
 >

I meant that hiding the hooks there is not a universal solution. If the 
'instrument' variable is empty (e.g. for a query without ANALYZE), the hooks 
become disabled. But in my case 'instrument' is initialized anyway, so I 
don't care about that.


 >> 3. Patch that enables to output runtime explain statistics
 >> (runtime_explain.patch).
 >> This patch extends the regular explain functionality. The problem 
is in the
 >> point that regular explain call makes result output after query 
execution

 >> performing InstrEndLoop on nodes where necessary. My patch introduces
 >> specific flag *runtime* that indicates whether we explain running 
query and
 >> does some insertions in source code dedicated to output the 
statistics of

 >> running query.
 >
 > Unless I'm missing something this doesn't really expose a user of this
 > functionality?
 >

Probably I could keep *runtime_explain.patch* out of the Postgres core by 
copying *explain.c* into the module's directory and customizing it further 
for my purposes. But in that case I'd have to maintain a 'local explain', 
fixing bugs and coping with other issues from time to time (e.g. on a major 
upgrade).


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-08-30 Thread Maksim Milyutin

Hi,

On 2016-08-29 18:22:56 +0300, maksim wrote:

Now I complete extension that provides facility to see the current state of
query execution working on external session in form of EXPLAIN ANALYZE
output. This extension works on 9.5 version, for 9.6 and later it doesn't
support detailed statistics for parallel nodes yet.


Could you expand a bit on what you want this for exactly?


The ultimate goal is to push my extension into the Postgres core, but for 
now it's ready only for 9.5. The patches presented here are prerequisites 
for this extension.



2. Patch that enables to interrupt the query executor
(executor_hooks.patch).
This patch enables to hang up hooks on executor function of each node
(ExecProcNode). I define hooks before any node execution and after
execution.
I use this patch to add possibility of query tracing by emitted rows from
any node. I interrupt query executor after any node delivers one or zero
rows to upper node. And after execution of specific number trace steps I can
get the query state of traceable backend which will be somewhat
deterministic. I use this possibility for regression tests of my extension.


This will increase executor overhead.


In the simple case we only have checks for the existence of the hooks. We 
may assume that only lightweight processing is done in the hooks during 
regular query execution. In my case (query tracing), I set the hooks up in 
trace mode and clear them otherwise.


> I think we'll need to find a way
> to hide this behind the existing if (instrument) branches.

That could also be done. It doesn't matter for trace mode, but I think the 
instrument branch is intended only for collecting per-node statistics.
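
A minimal sketch of the "hide it behind if (instrument)" idea (this is not 
the actual patch; ExecProcNode_pre_hook/ExecProcNode_post_hook and 
ExecProcNodeReal are placeholder names):

#include "postgres.h"
#include "executor/executor.h"
#include "executor/instrument.h"

/* placeholder hook pointers, roughly as executor_hooks.patch might define them */
static void (*ExecProcNode_pre_hook) (PlanState *node) = NULL;
static void (*ExecProcNode_post_hook) (PlanState *node, TupleTableSlot *slot) = NULL;

extern TupleTableSlot *ExecProcNodeReal(PlanState *node);  /* placeholder */

TupleTableSlot *
ExecProcNode(PlanState *node)
{
    TupleTableSlot *result;

    if (node->instrument)
    {
        InstrStartNode(node->instrument);
        if (ExecProcNode_pre_hook)          /* only checked when instrumented */
            ExecProcNode_pre_hook(node);
    }

    result = ExecProcNodeReal(node);        /* stands in for the dispatch switch */

    if (node->instrument)
    {
        if (ExecProcNode_post_hook)
            ExecProcNode_post_hook(node, result);
        InstrStopNode(node->instrument, TupleIsNull(result) ? 0.0 : 1.0);
    }

    return result;
}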



3. Patch that enables to output runtime explain statistics
(runtime_explain.patch).
This patch extends the regular explain functionality. The problem is in the
point that regular explain call makes result output after query execution
performing InstrEndLoop on nodes where necessary. My patch introduces
specific flag *runtime* that indicates whether we explain running query and
does some insertions in source code dedicated to output the statistics of
running query.


Unless I'm missing something this doesn't really expose a user of this
functionality?


This patch exists only for the extension. As described in 
https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=xx_hq7qq5ndr4jpo+h-o5vp6i...@mail.gmail.com 
the regular ExplainNode() prints statistics only after query execution and 
asserts in InstrEndLoop(). My patch resolves this problem and rewrites the 
formulas for the statistic parameters so that they are appropriate for 
running queries, without affecting regular EXPLAIN output.



--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-08-30 Thread Maksim Milyutin

On Mon, Aug 29, 2016 at 5:22 PM, maksim <m.milyu...@postgrespro.ru> wrote:

Hi, hackers!

Now I complete extension that provides facility to see the current
state of query execution working on external session in form of
EXPLAIN ANALYZE output. This extension works on 9.5 version, for 9.6
and later it doesn't support detailed statistics for parallel nodes yet.

I want to present patches to the latest version of PostgreSQL core
to enable this extension.

Hello,

Did you publish the extension itself yet?



Hello, the extension for version 9.5 is available in the repository 
https://github.com/postgrespro/pg_query_state/tree/master.



Last year (actually, exactly one year ago) I was trying to do something
very similar, and it quickly turned out that signals are not the best
way to make this sort of inspection.  You can find the discussion
here: 
https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=xx_hq7qq5ndr4jpo+h-o5vp6i...@mail.gmail.com


Thanks for the link!

My patch *custom_signal.patch* resolves the problem of «heavy» signal 
handlers. In essence, I follow the approach used in the *procsignal.c* 
file: it defines *ProcSignalReason* values onto which SIGUSR1 is 
multiplexed. Receiving the signal only sets flags for *ProcessInterrupts* 
to act on, i.e. procsignal_sigusr1_handler() just sets specific flags. 
When CHECK_FOR_INTERRUPTS is reached later during query execution, 
*ProcessInterrupts* is called, and then the triggered user-defined signal 
handler is executed.

As a result, we have deferred signal handling.
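
A minimal sketch of that deferred dispatch (the names here are illustrative; 
the actual patch keeps an array of handlers, one per custom reason):

#include "postgres.h"
#include <signal.h>
#include "miscadmin.h"

static volatile sig_atomic_t custom_signal_pending = false;
static void (*custom_signal_callback) (void) = NULL;   /* set at registration */

/* runs in signal handler context: only set flags, do no real work here */
static void
custom_signal_set_flag(void)
{
    custom_signal_pending = true;
    InterruptPending = true;    /* make CHECK_FOR_INTERRUPTS() take notice */
}

/* called from ProcessInterrupts(), i.e. at a CHECK_FOR_INTERRUPTS() point,
 * where it is safe to run the "heavy" user-defined handler */
void
ProcessCustomSignal(void)
{
    if (custom_signal_pending && custom_signal_callback != NULL)
    {
        custom_signal_pending = false;
        custom_signal_callback();
    }
}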

The patch *runtime_explain.patch* addresses the problem of the error from 
InstrEndLoop(). I catch all places where this unlucky function is called 
and wrap them in checks on a *runtime* flag. This flag indicates whether 
*ExplainQuery* is called for a running query. I also extend the explain 
output; you can see the details in README.md in the repository.


--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-08-29 Thread maksim

Hi, hackers!

I have now completed an extension that provides a facility to see the 
current state of a query executing in another session, in the form of 
EXPLAIN ANALYZE output. This extension works on version 9.5; for 9.6 and 
later it doesn't support detailed statistics for parallel nodes yet.


I want to present patches against the latest version of the PostgreSQL core 
to enable this extension.



1. A patch that provides a facility to send a user signal to another backend 
(custom_signals.patch).


This patch transparently extends the process signal interface to enable 
sending user-defined signals (I call them custom signals) and defining 
callbacks for them. Formally it looks like this:


/* Register custom signal and define signal callback */

Reason = RegisterCustomProcSignalHandler(sighandler);

void
sighandler(void)
{
    ...
}

/* From another session we can send the process signal to the first backend */
SendProcSignal(pid, Reason, backendId);

The InterruptPending flag is set during process signal handling, and 
sighandler is called from CHECK_FOR_INTERRUPTS.

I use this patch to notify the other backend that it should send its state 
to the caller.

2. A patch that makes it possible to interrupt the query executor 
(executor_hooks.patch).
This patch makes it possible to hang hooks on the executor function of each 
node (ExecProcNode). I define hooks before and after the execution of any 
node.
I use this patch to add the possibility of tracing a query by the rows 
emitted from any node: I interrupt the query executor after any node 
delivers one or zero rows to the upper node. After executing a specific 
number of trace steps, I can get the query state of the traced backend, 
which will be somewhat deterministic. I use this possibility for the 
regression tests of my extension.


3. A patch that makes it possible to output runtime explain statistics 
(runtime_explain.patch).
This patch extends the regular explain functionality. The problem is that a 
regular explain call produces its output after query execution, performing 
InstrEndLoop on nodes where necessary. My patch introduces a specific flag 
*runtime* that indicates whether we are explaining a running query, and it 
makes some insertions in the source code dedicated to outputting the 
statistics of a running query.
I want to note that this patch is complete only for Postgres version 9.5. 
For later versions, detailed statistics for parallel nodes are not 
implemented yet; I'm working on this feature now.
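
A minimal sketch of the kind of change the patch makes inside explain.c 
(assuming the new ExplainState field is called "runtime"; the real patch may 
differ in details):

/* a fragment, as it might look inside ExplainNode() */
double      nloops;

if (!es->runtime)                   /* es->runtime: the proposed flag */
    InstrEndLoop(planstate->instrument);

/* for a running query the loop in progress has to be counted as well,
 * otherwise the per-loop row counts come out wrong */
nloops = planstate->instrument->nloops + (es->runtime ? 1.0 : 0.0);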



That's all. CC welcome!

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index a3d6ac5..07270a9 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -59,12 +59,17 @@ typedef struct
  */
 #define NumProcSignalSlots	(MaxBackends + NUM_AUXPROCTYPES)
 
+static bool CustomSignalPendings[NUM_CUSTOM_PROCSIGNALS];
+static ProcSignalHandler_type CustomHandlers[NUM_CUSTOM_PROCSIGNALS];
+
 static ProcSignalSlot *ProcSignalSlots = NULL;
 static volatile ProcSignalSlot *MyProcSignalSlot = NULL;
 
 static bool CheckProcSignal(ProcSignalReason reason);
 static void CleanupProcSignalState(int status, Datum arg);
 
+static void CustomSignalInterrupt(ProcSignalReason reason);
+
 /*
  * ProcSignalShmemSize
  *		Compute space needed for procsignal's shared memory
@@ -165,6 +170,57 @@ CleanupProcSignalState(int status, Datum arg)
 }
 
 /*
+ * RegisterCustomProcSignalHandler
+ * 		Assign specific handler of custom process signal with new ProcSignalReason key.
+ * 		Return INVALID_PROCSIGNAL if all custom signals have been assigned.
+ */
+ProcSignalReason
+RegisterCustomProcSignalHandler(ProcSignalHandler_type handler)
+{
+	ProcSignalReason reason;
+
+	/* iterate through custom signal keys to find free spot */
+	for (reason = PROCSIG_CUSTOM_1; reason <= PROCSIG_CUSTOM_N; reason++)
+		if (!CustomHandlers[reason - PROCSIG_CUSTOM_1])
+		{
+			CustomHandlers[reason - PROCSIG_CUSTOM_1] = handler;
+			return reason;
+		}
+	return INVALID_PROCSIGNAL;
+}
+
+/*
+ * AssignCustomProcSignalHandler
+ * 		Assign handler of custom process signal with specific ProcSignalReason key.
+ * 		Return old ProcSignal handler.
+ * 		Assume incoming reason is one of custom ProcSignals.
+ */
+ProcSignalHandler_type
+AssignCustomProcSignalHandler(ProcSignalReason reason, ProcSignalHandler_type handler)
+{
+	ProcSignalHandler_type old;
+
+	Assert(reason >= PROCSIG_CUSTOM_1 && reason <= PROCSIG_CUSTOM_N);
+
+	old = CustomHandlers[reason - PROCSIG_CUSTOM_1];
+	CustomHandlers[reason - PROCSIG_CUSTOM_1] = handler;
+	return old;
+}
+
+/*
+ * GetCustomProcSignalHandler
+ * 		Get handler of custom process signal.
+ *		Assume incoming reason is one of custom ProcSignals.
+ */
+ProcSignalHandler_type
+GetCustomProcSignalHandler(ProcSignalReason reason)
+{
+	Assert(reason >= PROCSIG_CUSTOM_1 && reason <= PROCSIG_CUSTOM_N);

[HACKERS] Fwd: [BUG] Print timing statistics of trigger execution under turned off timing option of EXPLAIN ANALYZE

2016-08-12 Thread maksim

Hello, hackers!


At the moment, EXPLAIN ANALYZE with the timing option turned off, after 
executing a query that fires a trigger, still prints the total execution 
time of the trigger function:


postgres=# EXPLAIN (ANALYZE, timing false) insert INTO foo 
values(101, '');

QUERY PLAN
--
 Insert on foo  (cost=0.00..0.01 rows=1 width=36) (actual rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=36) (actual rows=1 loops=1)
 Planning time: 0.038 ms
 Trigger unique_foo_c1: time=0.000 calls=1
 Execution time: 340.766 ms
(5 rows)

My patch fixes this problem.

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index dbd27e5..808e23e 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -698,8 +698,11 @@ report_triggers(ResultRelInfo *rInfo, bool show_relname, ExplainState *es)
 appendStringInfo(es->str, " for constraint %s", conname);
 			if (show_relname)
 appendStringInfo(es->str, " on %s", relname);
-			appendStringInfo(es->str, ": time=%.3f calls=%.0f\n",
-			 1000.0 * instr->total, instr->ntuples);
+			if (es->timing)
+appendStringInfo(es->str, ": time=%.3f calls=%.0f\n",
+ 1000.0 * instr->total, instr->ntuples);
+			else
+appendStringInfo(es->str, ": calls=%.0f\n", instr->ntuples);
 		}
 		else
 		{
@@ -707,7 +710,8 @@ report_triggers(ResultRelInfo *rInfo, bool show_relname, ExplainState *es)
 			if (conname)
 ExplainPropertyText("Constraint Name", conname, es);
 			ExplainPropertyText("Relation", relname, es);
-			ExplainPropertyFloat("Time", 1000.0 * instr->total, 3, es);
+			if (es->timing)
+ExplainPropertyFloat("Time", 1000.0 * instr->total, 3, es);
 			ExplainPropertyFloat("Calls", instr->ntuples, 0, es);
 		}
 



Re: [HACKERS] FreeSpaceMap hashtable out of memory

2003-10-01 Thread Maksim Likharev
It is problematic to produce a small enough subset, due to the large DB and 
the randomness of the situation.
But here is what we see in the server log file, see below:

It seems like 
WARNING:  ShmemAlloc: out of memory
ERROR:  FreeSpaceMap hashtable out of memory

go together. Is this related to the size of shared memory, i.e. could 
increasing shared memory solve the thing, or is that just a coincidence?

Another thing (from the code): it seems that hash_search is trying to insert 
a new entry into DynHash and cannot allocate memory in the DynHash context. 
So another question: is the DynHash context upper-limited, in other words, 
can it not grow beyond 8 * 1024 * 1024 bytes?

Any suggestions are welcome, 
Thank you.


--- LOG

LOG:  all server processes terminated; reinitializing shared memory and 
semaphores
IpcMemoryCreate: shmget(key=5432001, size=4669440, 03600) failed: Not
enough 
core

This error usually means that PostgreSQL's request for a shared
memory segment exceeded available memory or swap space.
To reduce the request size (currently 4669440 bytes), reduce
PostgreSQL's shared_buffers parameter (currently 256) and/or
its max_connections parameter (currently 128).

The PostgreSQL Administrator's Guide contains more information about
shared memory configuration.


Error 2:
WARNING:  ShmemAlloc: out of memory
ERROR:  FreeSpaceMap hashtable out of memory



-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 01, 2003 2:51 PM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]
Subject: Re: [HACKERS] FreeSpaceMap hashtable out of memory 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
> Using PG under Cygwin we having following error message during INSERT
> INTO
> "FreeSpaceMap hashtable out of memory".

Hm, that's not supposed to happen.  Can you create a reproducible
example?

regards, tom lane



[HACKERS] FreeSpaceMap hashtable out of memory

2003-10-01 Thread Maksim Likharev
Hi,
Using PG under Cygwin, we are getting the following error message during an 
INSERT INTO:
"FreeSpaceMap hashtable out of memory".

What does that mean?
And if for a moment we step outside the 'PG under Cygwin' context, what in 
general is this message about and, more importantly, how do we fix it?

Thank you.



Re: [HACKERS] Single-file DBs WAS: Need concrete "Why Postgres

2003-08-22 Thread Maksim Likharev
Just want to add my $0.03 on that.
A single-file DB has some benefits, such as:
1. The ability to allocate one contiguous chunk on disk, which 
   significantly reduces disk seeks; 
   mostly ancestral, but still true today (I still see DBs living on 
   non-array storage).
2. And of course countless design considerations.

Saying that corruption in a single-file DB is more severe is probably not 
true, because one way or another you have to restore from backup, unless you 
can yank the corrupted table/index/etc. file from some place else.

More importantly, I have no idea how Oracle does it, but for MS SQL the 
transaction log is a separate file, so if you lose the transaction log... 
that is another story, but it has nothing to do with single- vs. multi-file 
DBs.

And as for SQL Server, a DB is usually 2 files, db and log, which is very, 
very useful in handling: I can just copy or zip one file, or attach/detach 
the DB.

Anyway, there is no black and white here, just different designs and 
different limitations. On Windows it is very reasonable, and I would say 
correct, to have a single DB file. On *nix it is difficult (using POSIX 
calls) to have a single file due to the 2GB limit on 32-bit CPUs, and so 
on...
Choices, choices...


-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]
Sent: Friday, August 22, 2003 1:05 PM
To: Jan Wieck
Cc: Andrew Dunstan; PostgreSQL-development
Subject: Re: [HACKERS] Single-file DBs WAS: Need concrete "Why Postgres

<>

Yes, but you've just added a significant amount to the work the DB
system 
needs to do in recovery.   PostgreSQL just needs to check for, and
recover 
from, issues with LSN headers and transactions.   Single-file DBs, like
SQL 
Server, need to also check and audit the internal file partitioning.

In my experience (a lot of MS SQL, more MS Access than I want to talk
about, 
and a little Oracle) corruption failures on single-file databases are
more 
frequent than databases which depend on the host OS, and such failures
are 
much more severe when the occur.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco



Re: [HACKERS] [GENERAL] PG crash on simple query, story continues

2003-07-12 Thread Maksim Likharev
Possible, but where before almost every tenth query crashed the server, now 
it stays up, and that's all I care about.


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Saturday, July 12, 2003 2:05 PM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [HACKERS] [GENERAL] PG crash on simple query, story
continues 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
> So following modification seems to fixed all PG (7.3/7.3.3)crashes on
> Solaris ( NON C LOCALE )

Given that the problem is Solaris' tendency to write more data than
the specified output buffer length allows, I'd think this is still
risking a core dump (due to null pointer dereference).

regards, tom lane



Re: [HACKERS] [GENERAL] PG crash on simple query, story continues

2003-07-11 Thread Maksim Likharev
So the following modification seems to have fixed all PG (7.3/7.3.3) crashes 
on Solaris (non-C locale):

selfuncs.c line 2356:

I changed:
xfrmsize = strlen(val) + 32;/*arbitrary pad value here...*/
to
xfrmsize = strxfrm(NULL, val, 0) + 32;

So basically, instead of a wild guess at the transformed string size, I ask 
"strxfrm" for it.

The +32 is out of my desperation; strxfrm(NULL, val, 0) + 1 should be fine 
(I have not tested that)...
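
A standalone sketch of the sizing pattern being proposed here (plain C with 
palloc, not the exact selfuncs.c code):

#include "postgres.h"
#include <string.h>

/* ask strxfrm() how much space the transformed string needs, then
 * allocate exactly that, instead of guessing strlen(val) + padding */
static char *
transform_string(const char *val)
{
    size_t      needed = strxfrm(NULL, val, 0);
    char       *xfrmstr = palloc(needed + 1);

    (void) strxfrm(xfrmstr, val, needed + 1);
    return xfrmstr;
}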

Out of curiosity:
interestingly, the following condition now seems impossible to hit; of 
course, if something did go terribly wrong, should we die here, return the 
original string, or return an empty string?

if (xfrmlen >= xfrmsize) {
pfree(xfrmstr);
xfrmstr = (char *) palloc(xfrmlen + 1);
xfrmlen = strxfrm(xfrmstr, val, xfrmlen + 1);
}


Again, this fixed all the crashes on Sun 5.8 (PG 7.3.3, en_US locale, LATIN1 
encoding). Generic patch...

P.S.
NO SUPPORT, NO WARRANTY, NO NOTHING, just for your information.

Regards.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 08, 2003 3:58 PM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [GENERAL] PG crash on simple query, story continues 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
>  On failure, strxfrm() returns (size_t)-1.

Not according to the Single Unix Specification, Linux, or HP-UX;
I don't have any others to check.  But anyway, that is not causing
your problem, since palloc(0) would complain not dump core.

> I am on SunOS 5.8, 

Solaris, eh?  IIRC, it was Solaris that we last heard about broken
strxfrm on.  Better check to see if Sun has a fix for this.

regards, tom lane



Re: [HACKERS] [GENERAL] PG crash on simple query, story continues

2003-07-08 Thread Maksim Likharev
I would prefer to dump that gar.xxg and put PG on Linux,
but this is not up to me.
Thanks for the help.



-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 08, 2003 3:58 PM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [GENERAL] PG crash on simple query, story continues 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
>  On failure, strxfrm() returns (size_t)-1.

Not according to the Single Unix Specification, Linux, or HP-UX;
I don't have any others to check.  But anyway, that is not causing
your problem, since palloc(0) would complain not dump core.

> I am on SunOS 5.8, 

Solaris, eh?  IIRC, it was Solaris that we last heard about broken
strxfrm on.  Better check to see if Sun has a fix for this.

regards, tom lane



Re: [HACKERS] [GENERAL] PG crash on simple query, story continues

2003-07-08 Thread Maksim Likharev
>>  !if error happend, xfrmlen will be (size_t)-1
>No it won't; see the man page for strxfrm.

RETURN VALUES
 Upon successful completion, strxfrm() returns the length  of
 the  transformed  string (not including the terminating null
 byte). If the value returned is n or more, the  contents  of
 the array pointed to by s1 are indeterminate.

 On failure, strxfrm() returns (size_t)-1.

but you are right, it is strxfrm() that returns more than allowed,
most likely in the following condition:
strxfrm(xfrmstr, val, 0)

one extra byte for the null terminator.

I am on SunOS 5.8; 
BTW, on Linux it works.


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 08, 2003 11:45 AM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [GENERAL] PG crash on simple query, story continues 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
> ! I would say very interesting aproach, 
> ! why not just
>   xfrmsize = strxfrm(xfrmstr, NULL, 0);

strxfrm doesn't work that way (and if it did, it would give back a
malloc'd not a palloc'd string).

>   !if error happend, xfrmlen will be (size_t)-1

No it won't; see the man page for strxfrm.

This does raise an interesting thought though: what platform are you on?
It seems to me that we've heard of buggy versions of strxfrm that write
more bytes than they're allowed to, thereby clobbering palloc's data
structures.

regards, tom lane



Re: [HACKERS] [GENERAL] PG crash on simple query, story continues

2003-07-08 Thread Maksim Likharev
Hi, I have a very interesting suspicion;
see my comments (marked with !).

convert_string_datum

...
!this is my case
if (!lc_collate_is_c())
{
/* Guess that transformed string is not much bigger than
original */
xfrmsize = strlen(val) + 32;/* arbitrary pad value
here... */

! I would say a very interesting approach, 
! why not just
xfrmsize = strxfrm(xfrmstr, NULL, 0);

! fine  
xfrmstr = (char *) palloc(xfrmsize);
!fine
xfrmlen = strxfrm(xfrmstr, val, xfrmsize);

!if an error happened, xfrmlen will be (size_t)-1
if (xfrmlen >= xfrmsize) {
!yep did not make it
/* Oops, didn't make it */
pfree(xfrmstr);

!what are we allocating here? 0 bytes
xfrmstr = (char *) palloc(xfrmlen + 1);

!BOOM
xfrmlen = strxfrm(xfrmstr, val, xfrmlen + 1);
}
pfree(val);
val = xfrmstr;
}



-Original Message-
From: Maksim Likharev 
Sent: Tuesday, July 08, 2003 9:35 AM
To: 'Tom Lane'
Cc: [EMAIL PROTECTED]; '[EMAIL PROTECTED]'
Subject: RE: [GENERAL] PG crash on simple query, story continues 


After upgrading to 7.3.3 we have the following:

signal 11
#0  0x254f38 in pfree ()
#1  0x1fde44 in convert_to_scalar ()
#2  0x1faafc in scalarineqsel ()
#3  0x1fd574 in mergejoinscansel ()
#4  0x14fec8 in cost_mergejoin ()
#5  0x16b820 in create_mergejoin_path ()
#6  0x155048 in sort_inner_and_outer ()
#7  0x154dd0 in add_paths_to_joinrel ()
#8  0x1567cc in make_join_rel ()
#9  0x15669c in make_jointree_rel ()
#10 0x14dd28 in make_fromexpr_rel ()
#11 0x14d6d0 in make_one_rel ()
#12 0x15d328 in subplanner ()
#13 0x15d218 in query_planner ()
#14 0x15f29c in grouping_planner ()
#15 0x15d93c in subquery_planner ()
#16 0x15d5e4 in planner ()
#17 0x1a6a94 in pg_plan_query ()
#18 0x1a712c in pg_exec_query_string ()
#19 0x1a8fd8 in PostgresMain ()
#20 0x172698 in DoBackend ()
#21 0x171ac4 in BackendStartup ()
#22 0x16ff14 in ServerLoop ()
#23 0x16f780 in PostmasterMain ()
#24 0x128e60 in main ()



-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, July 07, 2003 10:14 PM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]
Subject: Re: [GENERAL] PG crash on simple query, story continues 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
> SELECT p.docid FROM prod.t_documents AS p 
>  INNER JOIN t_tempdocs AS t 
>   ON p.docid = t.docid
>   LEFT OUTER JOIN prod.t_refs AS ct
>   ON ct.docid = p.docid;

> here is a stack trace:
>  00252174 AllocSetAlloc (3813b0, 15, 251fe0, 20, 0, ffbee2f8) + 194
>  002532e4 MemoryContextAlloc (3813b0, 15, 11, 7efefeff, 81010100,
ff00)
> + 68
>  0020dc0c varcharin (ffbee378, ffbee378, 20dae4, 0, 0, ffbee3f0) + 128
>  00243570 FunctionCall3 (ffbee4a8, 3c1ce8, 0, 324, 0, ffbee5c4) + 11c
>  0023e6c4 get_attstatsslot (3d6410, 413, 324, 2, 0, ffbee5c4) + 2b0
>  001f8cb4 scalarineqsel (3bb978, 42a, 0, 3bffa8, 40f0e8, 413) + 288
>  001fb824 mergejoinscansel (3bb978, 3c0080, 3c0968, 3c0970, 0, 1) +
23c

Hmm, it would seem there's something flaky about your pg_statistic
entries.  Could we see the pg_stats rows for the columns mentioned
in this query?

regards, tom lane



Re: [HACKERS] [GENERAL] PG crash on simple query, story continues

2003-07-08 Thread Maksim Likharev
After upgrading to 7.3.3 we have the following:

signal 11
#0  0x254f38 in pfree ()
#1  0x1fde44 in convert_to_scalar ()
#2  0x1faafc in scalarineqsel ()
#3  0x1fd574 in mergejoinscansel ()
#4  0x14fec8 in cost_mergejoin ()
#5  0x16b820 in create_mergejoin_path ()
#6  0x155048 in sort_inner_and_outer ()
#7  0x154dd0 in add_paths_to_joinrel ()
#8  0x1567cc in make_join_rel ()
#9  0x15669c in make_jointree_rel ()
#10 0x14dd28 in make_fromexpr_rel ()
#11 0x14d6d0 in make_one_rel ()
#12 0x15d328 in subplanner ()
#13 0x15d218 in query_planner ()
#14 0x15f29c in grouping_planner ()
#15 0x15d93c in subquery_planner ()
#16 0x15d5e4 in planner ()
#17 0x1a6a94 in pg_plan_query ()
#18 0x1a712c in pg_exec_query_string ()
#19 0x1a8fd8 in PostgresMain ()
#20 0x172698 in DoBackend ()
#21 0x171ac4 in BackendStartup ()
#22 0x16ff14 in ServerLoop ()
#23 0x16f780 in PostmasterMain ()
#24 0x128e60 in main ()



-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, July 07, 2003 10:14 PM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]
Subject: Re: [GENERAL] PG crash on simple query, story continues 


"Maksim Likharev" <[EMAIL PROTECTED]> writes:
> SELECT p.docid FROM prod.t_documents AS p 
>  INNER JOIN t_tempdocs AS t 
>   ON p.docid = t.docid
>   LEFT OUTER JOIN prod.t_refs AS ct
>   ON ct.docid = p.docid;

> here is a stack trace:
>  00252174 AllocSetAlloc (3813b0, 15, 251fe0, 20, 0, ffbee2f8) + 194
>  002532e4 MemoryContextAlloc (3813b0, 15, 11, 7efefeff, 81010100,
ff00)
> + 68
>  0020dc0c varcharin (ffbee378, ffbee378, 20dae4, 0, 0, ffbee3f0) + 128
>  00243570 FunctionCall3 (ffbee4a8, 3c1ce8, 0, 324, 0, ffbee5c4) + 11c
>  0023e6c4 get_attstatsslot (3d6410, 413, 324, 2, 0, ffbee5c4) + 2b0
>  001f8cb4 scalarineqsel (3bb978, 42a, 0, 3bffa8, 40f0e8, 413) + 288
>  001fb824 mergejoinscansel (3bb978, 3c0080, 3c0968, 3c0970, 0, 1) +
23c

Hmm, it would seem there's something flaky about your pg_statistic
entries.  Could we see the pg_stats rows for the columns mentioned
in this query?

regards, tom lane



[HACKERS] Share state ( allocated memory ) across two C functions...

2003-06-30 Thread Maksim Likharev
Hi,
I have an interesting question that is blocking me right now.
Suppose I have 2 functions:

1. preparestate
2. doajob

The first allocates some state using
MemoryContextAlloc(TopTransactionContext)
or something like it, and the other function uses that memory.

The question is: how do I look up that memory in the second function, doajob?

Of course I can return a handle (a pointer) from the first function and 
accept that pointer in the second function, but in this case I have to check 
that pointer for validity and so on...

Is there any good practice (some way to do so) for that?

Thank you.
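
One common pattern is to keep the state in a module-level pointer, allocated 
in a transaction-lifetime memory context. A minimal sketch, assuming both 
functions are ordinary V1 C functions in the same module (real code would 
also reset the pointer at transaction end):

#include "postgres.h"
#include "fmgr.h"
#include "utils/memutils.h"

typedef struct JobState
{
    int         nitems;
    /* ... whatever the job needs ... */
} JobState;

/* module-level pointer shared by both functions within one backend */
static JobState *job_state = NULL;

PG_FUNCTION_INFO_V1(preparestate);
Datum
preparestate(PG_FUNCTION_ARGS)
{
    /* allocate in a context that lives until the end of the transaction */
    job_state = (JobState *)
        MemoryContextAllocZero(TopTransactionContext, sizeof(JobState));
    PG_RETURN_VOID();
}

PG_FUNCTION_INFO_V1(doajob);
Datum
doajob(PG_FUNCTION_ARGS)
{
    if (job_state == NULL)
        elog(ERROR, "preparestate must be called first");

    job_state->nitems++;        /* use the shared state */
    PG_RETURN_VOID();
}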




Re: [HACKERS] TO_CHAR SO SLOW???

2003-06-24 Thread Maksim Likharev
Yes, it is TO_CHAR.
It looks like an OS (Sun) related issue; I assume PG uses some of the libc 
functions.
It looks like nonsense to me: what is so damn difficult about that 
(formatting dates)?
I'm going to try date_part, it might help me.

Too bad EXPLAIN does not provide statistics on the time spent inside a 
function call; that would be very helpful in such a case.
In comparison with Microsoft SQL, the productivity of the 
profiling/debugging tools is, sorry to say, far behind.



-Original Message-
From: Karel Zak [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 24, 2003 1:04 AM
To: Maksim Likharev
Cc: [EMAIL PROTECTED]
Subject: Re: [HACKERS] TO_CHAR SO SLOW???


On Mon, Jun 23, 2003 at 06:08:19PM -0700, Maksim Likharev wrote:
> Hi,
> I have some SQL function, just regular function selects data by using
4
> joins nothing fancy,
> but one thing pretty noticeable,
> I have to display 3 different columns with same date formatted
> differently,
> here are 3 different snippets:
> 
> 1. SELECT t.x,t.y,TO_CHAR(t.dt, 'DD/MM/') 
>   FROM ( SELECT x, y, dt FROM  ) AS t
>   ...
> 2. SELECT t.x,t.y,TO_CHAR(t.dt, 'DD/MM/'), TO_CHAR(t.dt,
'Mon-')
>   FROM ( SELECT x, y, dt FROM  ) AS t
>   ..
> 3. SELECT t.x,t.y,TO_CHAR(t.dt, 'DD/MM/'), TO_CHAR(t.dt,
> 'Mon-'), TO_CHAR(t.dt, '')
>   FROM ( SELECT x, y, dt FROM  ) AS t
>   ...
> 
> # 1: 15000 rows, I getting data for  130 sec
> # 2: 15000 rows, I getting data for  160 sec
> # 3: 15000 rows, I getting data for  220 sec
> 
> adding different fields into output change query time only marginally
> but adding or removing to_char, 
> just heavily knocks performance.
> 
> is it TO_CHAR so slow??

 I don't think to_char() is so slow. What happen with performance
 if you use t.dt without formatting or if try some other function
 an example extract()?
 
 SELECT t.x, t.y, t.dt FROM ( SELECT x, y, dt FROM  ) AS t;
 
 SELECT t.x, t.y, EXTRACT(year from t.dt) 
 FROM ( SELECT x, y, dt FROM  ) AS t;
 
Karel

-- 
 Karel Zak  <[EMAIL PROTECTED]>
 http://home.zf.jcu.cz/~zakkr/



[HACKERS] TO_CHAR SO SLOW???

2003-06-23 Thread Maksim Likharev
Hi,
I have an SQL function, just a regular function that selects data using 4 
joins, nothing fancy, but one thing is pretty noticeable:
I have to display 3 different columns with the same date formatted 
differently. Here are 3 different snippets:

1. SELECT t.x,t.y,TO_CHAR(t.dt, 'DD/MM/')   
FROM ( SELECT x, y, dt FROM  ) AS t
...
2. SELECT t.x,t.y,TO_CHAR(t.dt, 'DD/MM/'), TO_CHAR(t.dt, 'Mon-')
FROM ( SELECT x, y, dt FROM  ) AS t
..
3. SELECT t.x,t.y,TO_CHAR(t.dt, 'DD/MM/'), TO_CHAR(t.dt,
'Mon-'), TO_CHAR(t.dt, '')
FROM ( SELECT x, y, dt FROM  ) AS t
...

# 1: 15000 rows, I getting data for  130 sec
# 2: 15000 rows, I getting data for  160 sec
# 3: 15000 rows, I getting data for  220 sec

Adding different fields to the output changes the query time only 
marginally, but adding or removing to_char just heavily knocks performance.

Is TO_CHAR really so slow??

P.S.
Postgres 7.3




[HACKERS] SELECT TAKES A LOOOONG TIME

2003-06-10 Thread Maksim Likharev
Hi,
could somebody please explain to me why the following select
SELECT docid FROM prod.guids 
GROUP BY docid HAVING ( COUNT(docid) > 1 )

takes 15 minutes on a 2-processor box with 1M rows, where the number of 
duplicates is around 300K, and docid is indexed, not null, and char(16)?

Maybe I am doing something wrong?
Thank you.
