Re: [HACKERS] Possible problem in Custom Scan API

2017-04-18 Thread Dmitry Ivanov

Tom Lane wrote:

I'm unimpressed by this part --- we couldn't back-patch such a change, and
I think it's unnecessary anyway in 9.6+ because the scan provider could
generate a nondefault pathtarget if it wants this to happen.


You're right, of course. Thank you very much!


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible problem in Custom Scan API

2017-04-12 Thread Dmitry Ivanov

Tom Lane wrote:

I'm coming around to the idea that it'd be better to disable physical
tlists for custom scans.


I've been thinking about this all along, and it seems that this is a decent 
decision. However, I've made a tiny bugfix patch which allows CustomScans 
to notify the core code that they can handle physical targetlists. 
Extension authors won't have to change anything provided that CustomPath is 
allocated using palloc0().



However, I'm hesitant to make such a change in the back branches; if
there's anyone using custom scans who's negatively affected, they'd be
rightfully unhappy.  Are you satisfied if we change this in HEAD?


It's kind of bittersweet. I'm really glad that you've changed your mind and 
this is going to be fixed in HEAD, but this change is crucial for our 
extension (if we use it with vanilla postgres).


Maybe you'll like my patch more.


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 71678d08dc..4b0322530b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -761,6 +761,9 @@ use_physical_tlist(PlannerInfo *root, Path *path, int flags)
 	if (rel->reloptkind != RELOPT_BASEREL)
 		return false;
 
+	if (IsA(path, CustomPath) && ((CustomPath *) path)->forbid_physical_tlist)
+		return false;
+
 	/*
 	 * Can't do it if any system columns or whole-row Vars are requested.
 	 * (This could possibly be fixed but would take some fragile assumptions
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 48862d93ae..5a1f6cee47 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1089,6 +1089,7 @@ typedef struct CustomPath
  * nodes/extensible.h */
 	List	   *custom_paths;	/* list of child Path nodes, if any */
 	List	   *custom_private;
+	bool		forbid_physical_tlist;
 	const struct CustomPathMethods *methods;
 } CustomPath;
 



Re: [HACKERS] Possible problem in Custom Scan API

2017-04-11 Thread Dmitry Ivanov

Tom Lane wrote:

Uh, no, construction of a custom plan node is entirely driven by the
PlanCustomPath method as far as I can see.  You're free to ignore what
create_scan_plan did and insert your own tlist.


Are you sure? Even if that's true, this targetlist should still contain each 
and every Var that's been requested. If I'm right, the only way to ensure 
that is to call build_path_tlist(), which is static (oops!). Perhaps I 
could write my own, but it uses replace_nestloop_params() (again, static), 
and the problem cascades from there.


This effectively means that it would be nice if I could just use the 
existing machinery. The fix would be quite trivial.


By the way, what if our CustomScan is a top node? Let's take a look at 
create_plan():


/* Recursively process the path tree, demanding the correct tlist result */
plan = create_plan_recurse(root, best_path, CP_EXACT_TLIST);

...
if (!IsA(plan, ModifyTable))
    apply_tlist_labeling(plan->targetlist, root->processed_tlist);
...

If I spoil the targetlist, everything blows up in apply_tlist_labeling():

Assert(list_length(dest_tlist) == list_length(src_tlist));

since the lengths may differ. It's much safer to use the tlist provided by 
the core code, but alas, sometimes it's not an option.



In particular, if what
your custom path actually is is a rescan of something like an
IndexOnlyScan, why don't you just copy the IOS's result tlist?


I'm actively working on a prototype of partition pruning. It could be much 
simpler if I just patched the core, but we have a working extension for 9.5 
and 9.6, and we can't throw it away just yet. I wouldn't bother you if I 
didn't encounter a broken query :)




--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Possible problem in Custom Scan API

2017-04-11 Thread Dmitry Ivanov

Tom Lane wrote:

Reading between the lines, I think the problem may be that you're not
being careful about how you set up custom_scan_tlist.  But since the
core code has zero involvement in that decision, it's hard to see why
it would be a core code bug.


I'm sorry, but that's not quite what I meant. It's *the core code* that builds 
the targetlist, and sometimes it may decide to provide a physical targetlist 
instead of the one that's *really needed*. The broader targetlist contains 
Vars that cannot be supplied by the IndexOnlyScan node, hence the error.


We cannot come up with our own targetlist, because we can't tell whether this 
is a top node that must return exactly the requested tuple (CP_EXACT_TLIST), 
or whether the physical-tlist optimization just made it look that way.


Append works only because it doesn't allow projections, and it can never 
get a physical targetlist if an index doesn't contain all columns.


But everything changes when we use a CustomScan: PostgreSQL pays no 
attention to it. For example, if our CustomScan is a child of a NestLoop, the 
latter will call this (create_nestloop_plan):


inner_plan = create_plan_recurse(root, best_path->innerjoinpath, 0);

Since NestLoop can make projections, it doesn't enforce the CP_EXACT_TLIST 
flag, and our CustomScan may end up with a full physical targetlist (see 
the code of create_scan_plan() if you don't believe me) INSTEAD OF the one 
it has actually been asked to produce. Meanwhile, Append behaves as it 
should, since it doesn't care about physical tlists.


It's not just my imagination.


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Possible problem in Custom Scan API

2017-04-11 Thread Dmitry Ivanov

Tom Lane wrote:

Uh, why would you see that?  The planner would never generate an
IndexOnlyScan in the first place if the query required any columns
not available from the index.


True, but as you can see, create_append_plan() produces its own targetlist:

static Plan *
create_append_plan(PlannerInfo *root, AppendPath *best_path)
{
    Append     *plan;
    List       *tlist = build_path_tlist(root, &best_path->path);
    ...

If we replace Append with some custom node, the plan will instantly become 
invalid (it won't be able to build a projection from 'custom_scan_tlist' 
to 'targetlist'). However, this doesn't mean that the node is unable to 
produce the same result.



--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] Possible problem in Custom Scan API

2017-04-11 Thread Dmitry Ivanov

Hi hackers,

I'm struggling to understand one particular thing about Custom Scan API.

As you may know, there's a function called use_physical_tlist(), which aims 
to eliminate meaningless projections during query execution. Any scan node 
(e.g. CustomScan) is expected to take advantage of physical targetlists... 
except for the IndexOnlyScan (for obvious reasons):


createplan.c, create_scan_plan():

if (use_physical_tlist(root, best_path, flags))
{
    if (best_path->pathtype == T_IndexOnlyScan)
    {
        /* For index-only scan, the preferred tlist is the index's */
        tlist = copyObject(((IndexPath *) best_path)->indexinfo->indextlist);
        ...
    }
    ...
}

In theory, CustomScans should be able to use any Plan nodes as children (i.e. 
the 'custom_plans' list), but as far as I can tell, there's no way to 
override the behavior of use_physical_tlist(), which means that we might see 
something like this:


ERROR:  variable not found in subplan target list

if we use a child IndexOnlyScan whose index does not include some of the 
relation's columns.


Is there any existing workaround?


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Declarative partitioning - another take

2016-12-15 Thread Dmitry Ivanov

Hi everyone,

Looks like "sql_inheritance" GUC is affecting partitioned tables:

explain (costs off) select * from test;
          QUERY PLAN          
------------------------------
 Append
   ->  Seq Scan on test
   ->  Seq Scan on test_1
   ->  Seq Scan on test_2
   ->  Seq Scan on test_1_1
   ->  Seq Scan on test_1_2
   ->  Seq Scan on test_1_1_1
   ->  Seq Scan on test_1_2_1
(8 rows)


set sql_inheritance = off;


explain (costs off) select * from test;
    QUERY PLAN    
------------------
 Seq Scan on test
(1 row)


I might be wrong, but IMO this should not happen. Queries involving UPDATE, 
DELETE, etc. on partitioned tables are basically broken with this setting. 
Moreover, there's no point in performing such operations on the parent table 
alone, since it's supposed to be empty at all times.


I've come up with a patch which fixes this behavior for UPDATE, DELETE, 
TRUNCATE, and also in transformTableEntry(). It may be hacky, but it gives 
the idea.


I didn't touch RenameConstraint() and renameatt(), since changing them would 
break the ALTER TABLE ONLY command.



--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7a574dc..67e118e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1183,7 +1183,7 @@ ExecuteTruncate(TruncateStmt *stmt)
 	{
 		RangeVar   *rv = lfirst(cell);
 		Relation	rel;
-		bool		recurse = interpretInhOption(rv->inhOpt);
+		bool		recurse;
 		Oid			myrelid;
 
 		rel = heap_openrv(rv, AccessExclusiveLock);
@@ -1198,6 +1198,12 @@ ExecuteTruncate(TruncateStmt *stmt)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
+		/* Use interpretInhOption() unless it's a partitioned table */
+		if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+			recurse = interpretInhOption(rv->inhOpt);
+		else
+			recurse = true;
+
 		if (recurse)
 		{
 			ListCell   *child;
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 5e65fe7..a3772f7 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -367,6 +367,7 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
 	Query	   *qry = makeNode(Query);
 	ParseNamespaceItem *nsitem;
 	Node	   *qual;
+	RangeTblEntry *rte;
 
 	qry->commandType = CMD_DELETE;
 
@@ -384,6 +385,11 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
 		 true,
 		 ACL_DELETE);
 
+	/* Set "inh" if table is partitioned */
+	rte = rt_fetch(qry->resultRelation, pstate->p_rtable);
+	if (rte->relkind == RELKIND_PARTITIONED_TABLE)
+		rte->inh = true;
+
 	/* grab the namespace item made by setTargetTable */
 	nsitem = (ParseNamespaceItem *) llast(pstate->p_namespace);
 
@@ -2164,6 +2170,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
 	Query	   *qry = makeNode(Query);
 	ParseNamespaceItem *nsitem;
 	Node	   *qual;
+	RangeTblEntry *rte;
 
 	qry->commandType = CMD_UPDATE;
 	pstate->p_is_insert = false;
@@ -2181,6 +2188,11 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
 		 true,
 		 ACL_UPDATE);
 
+	/* Set "inh" if table is partitioned */
+	rte = rt_fetch(qry->resultRelation, pstate->p_rtable);
+	if (rte->relkind == RELKIND_PARTITIONED_TABLE)
+		rte->inh = true;
+
 	/* grab the namespace item made by setTargetTable */
 	nsitem = (ParseNamespaceItem *) llast(pstate->p_namespace);
 
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 751de4b..215ec73 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -439,6 +439,10 @@ transformTableEntry(ParseState *pstate, RangeVar *r)
 	rte = addRangeTableEntry(pstate, r, r->alias,
 			 interpretInhOption(r->inhOpt), true);
 
+	/* Set "inh" if table is partitioned */
+	if (rte->relkind == RELKIND_PARTITIONED_TABLE)
+		rte->inh = true;
+
 	return rte;
 }
 



Re: [HACKERS] Declarative partitioning - another take

2016-12-12 Thread Dmitry Ivanov
Huh, this code is broken as well. We have to ignore partitions that don't 
have any subpartitions. Patch is attached below (v2).



--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 219d380..6555c7c 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -950,7 +950,8 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 			   *parted_rels;
 	ListCell   *lc;
 	int			i,
-				k;
+				k,
+				partitioned_children_so_far = 0;
 
 	/*
 	 * Lock partitions and make a list of the partitioned ones to prepare
@@ -1001,7 +1002,7 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 		PartitionKey partkey = RelationGetPartitionKey(partrel);
 		PartitionDesc partdesc = RelationGetPartitionDesc(partrel);
 		int			j,
-					m;
+					my_partitioned_children;
 
 		pd[i] = (PartitionDispatch) palloc(sizeof(PartitionDispatchData));
 		pd[i]->reldesc = partrel;
@@ -1010,7 +1011,7 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 		pd[i]->partdesc = partdesc;
 		pd[i]->indexes = (int *) palloc(partdesc->nparts * sizeof(int));
 
-		m = 0;
+		my_partitioned_children = 0;
 		for (j = 0; j < partdesc->nparts; j++)
 		{
 			Oid			partrelid = partdesc->oids[j];
@@ -1026,11 +1027,21 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 				 * We can assign indexes this way because of the way
 				 * parted_rels has been generated.
 				 */
-				pd[i]->indexes[j] = -(i + 1 + m);
-				m++;
+				pd[i]->indexes[j] = -(1 +
+									  my_partitioned_children +
+									  partitioned_children_so_far);
+
+				my_partitioned_children++;
 			}
 		}
 		i++;
+
+		/*
+		 * Children of this parent should be placed after all
+		 * partitioned children of all previous parents, so we
+		 * have to take this into account.
+		 */
+		partitioned_children_so_far += my_partitioned_children;
 	}
 
 	return pd;



Re: [HACKERS] Declarative partitioning - another take

2016-12-12 Thread Dmitry Ivanov

Hi guys,

Looks like I've just encountered a bug. Please excuse the messy email; I 
don't have much time at the moment.



Here's the test case:

create table test(val int) partition by range (val);

create table test_1 partition of test for values from (1) to (1000) partition by range(val);
create table test_2 partition of test for values from (1000) to (2000) partition by range(val);
create table test_1_1 partition of test_1 for values from (1) to (500) partition by range(val);
create table test_1_2 partition of test_1 for values from (500) to (1000) partition by range(val);

create table test_1_1_1 partition of test_1_1 for values from (1) to (500);
create table test_1_2_1 partition of test_1_2 for values from (500) to (1000);



/* insert a row into "test_1_2_1" */
insert into test values(600);


/* what we EXPECT to see */
select *, tableoid::regclass from test;
 val |  tableoid  
-----+------------
 600 | test_1_2_1
(1 row)


/* what we ACTUALLY see */
insert into test values(600);
ERROR:  no partition of relation "test_1_1" found for row
DETAIL:  Failing row contains (600).


How does this happen? This is what the "PartitionDispatch" array looks like:

test | test_1 | test_2 | test_1_1 | test_1_2

which means that this code (partition.c : 1025):


/*
* We can assign indexes this way because of the way
* parted_rels has been generated.
*/
pd[i]->indexes[j] = -(i + 1 + m);


doesn't work, since partitions are not always placed right after their parent 
(as index "m" assumes).


We have to take into account the total amount of partitions we've 
encountered so far (right before index "i").


I've attached a patch with a hotfix, but the code looks so-so and has a 
smell. I think it must be rewritten. This bug hunt surely took a while: I 
had to recheck all of the steps several times.



--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 219d380..119a41d 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -950,7 +950,8 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 			   *parted_rels;
 	ListCell   *lc;
 	int			i,
-				k;
+				k,
+				children_so_far = 0;
 
 	/*
 	 * Lock partitions and make a list of the partitioned ones to prepare
@@ -1026,11 +1027,18 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 				 * We can assign indexes this way because of the way
 				 * parted_rels has been generated.
 				 */
-				pd[i]->indexes[j] = -(i + 1 + m);
+				pd[i]->indexes[j] = -(1 + m + children_so_far);
 				m++;
 			}
 		}
 		i++;
+
+		/*
+		 * Children of this parent are placed
+		 * after all children of all previous parents,
+		 * so we have to take this into account.
+		 */
+		children_so_far += partdesc->nparts;
 	}
 
 	return pd;



Re: [HACKERS] Declarative partitioning - another take

2016-12-08 Thread Dmitry Ivanov

That would be fantastic.  I and my colleagues at EnterpriseDB can
surely help review; of course, maybe you and some of your colleagues
would like to help review our patches, too. 


Certainly, I'll start reviewing as soon as I get familiar with the code.



Do you think this is
likely to be something where you can get something done quickly, with
the hope of getting it into v10?  


Yes, I've just cleared my schedule in order to make this possible. I'll 
bring in the patches ASAP.



--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Declarative partitioning - another take

2016-12-08 Thread Dmitry Ivanov

Hi everyone,


Of course, this is the beginning, not the end.  I've been thinking
about next steps -- here's an expanded list:


- more efficient plan-time partition pruning (constraint exclusion is too slow)

- run-time partition pruning
- try to reduce lock levels
...


We (PostgresPro) have been working on pg_pathman for quite a while, and 
since it's obviously going to become a thing of the past, it would be a 
wasted effort if we didn't try to participate.


For starters, I'd love to work on both plan-time & run-time partition 
pruning. I created a custom node for run-time partition elimination, so I 
think I'm capable of developing something similar.



--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Typo in comment to function LockHasWaitersRelation() [master branch]

2016-08-22 Thread Dmitry Ivanov
> Hi hackers,
> 
> I've found a typo in a comment to function LockHasWaitersRelation()
> [lmgr.c : 271, master branch]:
> >> This is a functiion to check

Attached a patch.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index 7b08555..d0fdb94 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -268,7 +268,7 @@ UnlockRelation(Relation relation, LOCKMODE lockmode)
 /*
  *		LockHasWaitersRelation
  *
- * This is a functiion to check if someone else is waiting on a
+ * This is a function to check if someone else is waiting on a
  * lock, we are currently holding.
  */
 bool



[HACKERS] Typo in comment to function LockHasWaitersRelation() [master branch]

2016-08-22 Thread Dmitry Ivanov
Hi hackers,

I've found a typo in a comment to function LockHasWaitersRelation() [lmgr.c : 
271, master branch]:

>> This is a functiion to check


-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Bug in WaitForBackgroundWorkerShutdown() [REL9_5_STABLE]

2016-08-04 Thread Dmitry Ivanov
> Recently I've encountered a strange crash while calling elog(ERROR, "...")
> after the WaitForBackgroundWorkerShutdown() function. It turns out that
> _returns_ inside the PG_TRY()..PG_CATCH() block are *highly* undesirable,
> since they leave PG_exception_stack pointing to a local struct in a dead
> frame, which is an obvious UB. I've attached a patch which fixes this
> behavior in the aforementioned function, but there might be more errors
> like that elsewhere.

Forgot some curly braces, my bad. v2 attached.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 76619e9..fd55262 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1025,13 +1025,16 @@ WaitForBackgroundWorkerShutdown(BackgroundWorkerHandle *handle)
 
 			status = GetBackgroundWorkerPid(handle, &pid);
 			if (status == BGWH_STOPPED)
-				return status;
+				break;
 
 			rc = WaitLatch(&MyProc->procLatch,
 						   WL_LATCH_SET | WL_POSTMASTER_DEATH, 0);
 
 			if (rc & WL_POSTMASTER_DEATH)
-				return BGWH_POSTMASTER_DIED;
+			{
+				status = BGWH_POSTMASTER_DIED;
+				break;
+			}
 
 			ResetLatch(&MyProc->procLatch);
 		}



[HACKERS] Bug in WaitForBackgroundWorkerShutdown() [REL9_5_STABLE]

2016-08-04 Thread Dmitry Ivanov
Recently I've encountered a strange crash while calling elog(ERROR, "...") 
after the WaitForBackgroundWorkerShutdown() function. It turns out that 
_returns_ inside the PG_TRY()..PG_CATCH() block are *highly* undesirable, 
since they leave PG_exception_stack pointing to a local struct in a dead 
frame, which is an obvious UB. I've attached a patch which fixes this behavior 
in the aforementioned function, but there might be more errors like that 
elsewhere.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 76619e9..5ecb55a 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1025,13 +1025,14 @@ WaitForBackgroundWorkerShutdown(BackgroundWorkerHandle *handle)
 
 			status = GetBackgroundWorkerPid(handle, &pid);
 			if (status == BGWH_STOPPED)
-				return status;
+				break;
 
 			rc = WaitLatch(&MyProc->procLatch,
 						   WL_LATCH_SET | WL_POSTMASTER_DEATH, 0);
 
 			if (rc & WL_POSTMASTER_DEATH)
-				return BGWH_POSTMASTER_DIED;
+				status = BGWH_POSTMASTER_DIED;
+				break;
 
 			ResetLatch(&MyProc->procLatch);
 		}



Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-21 Thread Dmitry Ivanov
There are some previously unnoticed errors in the docs (master branch); this 
patch fixes them.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 930c8f0..78eaf74 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1523,34 +1523,6 @@ SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10);
 
  
  
-  setweight
- 
-
-  setweight(query tsquery, weight "char") returns tsquery
- 
-
- 
-  
-   setweight returns a copy of the input query in which every
-   position has been labeled with the given weight(s), either
-   A, B, C,
-   D or their combination. These labels are retained when
-   queries are concatenated, allowing words from different parts of a document
-   to be weighted differently by ranking functions.
-  
-
-  
-   Note that weight labels apply to positions, not
-   lexemes.  If the input query has been stripped of
-   positions then setweight does nothing.
-  
- 
-
-
-
-
- 
- 
   numnode
  
 
@@ -2588,7 +2560,7 @@ more sample word(s) : more indexed word(s)
 

 Specific stop words recognized by the subdictionary cannot be
-specified;  instead use - to mark the location where any
+specified;  instead use ? to mark the location where any
 stop word can appear.  For example, assuming that a and
 the are stop words according to the subdictionary:
 



Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-06 Thread Dmitry Ivanov
> > It seems that everything is settled now, so here's the patch introducing
> > the '<->' and '<N>' operators. I've made the necessary changes to docs &
> > regression tests.
> 
> I noticed that I had accidentally trimmed whitespace in the docs, so this is
> a better one.

After a brief but reasonable discussion with Teodor, I've come up with a 
cleaner piece of code for phrase operator parsing. The patch is included below.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-06 Thread Dmitry Ivanov
> > > It seems that everything is settled now, so here's the patch introducing
> > > the '<->' and '<N>' operators. I've made the necessary changes to docs &
> > > regression tests.
> > 
> > I noticed that I had accidentally trimmed whitespace in the docs, so this
> > is a better one.
> 
> After a brief but reasonable discussion with Teodor I've come up with a more
> proper piece of code for phrase operator parsing. The patch is included
> below.

Attached the patch. Sorry for the inconvenience.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+---

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-05 Thread Dmitry Ivanov
> It seems that everything is settled now, so here's the patch introducing the
> '<->' and '<N>' operators. I've made the necessary changes to docs &
> regression tests.

I noticed that I had accidentally trimmed whitespace in the docs, so this is 
a better one.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'moscow') as query where que

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-31 Thread Dmitry Ivanov
Hi Teodor,

I've looked through your version and made a few adjustments.

> Pls, remove tsquery_setweight from patch because it's not a part of phrase
> search and it isn't mentioned in docs. Please, make a separate patch for it.

I've removed tsquery_setweight as you requested. I'm going to add it to the 
next commitfest.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread Dmitry Ivanov
Sorry for the delay; I desperately needed some time to finish a bunch of 
dangling tasks.

I've added some new comments and clarified the ones that were obscure. 
Moreover, I felt an urge to recheck most parts of the code since apparently 
nobody (besides myself) has gone so far yet.

On 25.03.16 18:42 MSK, David Steele wrote:
> Time is short and it's not encouraging that you say there is "still much
> work to be done".

Indeed, I was inaccurate. I am more than interested in the positive outcome.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-19 Thread Dmitry Ivanov
Hi, Artur

I've made an attempt to fix some of the issues you've listed, although there's 
still much work to be done. I'll add some comments later.

> This function has the duplicated piece from the tsvector_setweight()
> from tsvector_op.c. You can define new function for it.

I'm not sure it's worth the trouble. IMO these functions are relatively small 
and we won't benefit from extracting the duplicate code.

> These functions was removed in 9acb9007de30b3daaa9efc16763c3bc6e3e0a92d.
> It seems that tsvector_op.c was not synchronized with the master.

Got it, thanks!

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+--

Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-03-18 Thread Dmitry Ivanov
Hi Constantin,

I did a quick review of your patch, and here are my comments:

- This patch applies cleanly to the current HEAD (61d2ebdbf91).

- Code compiles without warnings.

- Currently there's no documentation regarding the parallel GIN build feature 
and the GUC variables it provides.

- Built indexes seem to work normally.


Performance
---

I've made a few runs on my laptop (i7-4710HQ, default postgresql.conf); here 
are the results:

workers    avg. time (s)
0          412
4          133
8          81

Looks like 8 workers plus the backend do the job about 5x faster than a sole 
backend, which is good!


Code style
--

There are some things that you've probably overlooked, such as:

> task->heap_oid = heap->rd_id;
> task->index_oid = index->rd_id;

You could replace direct access to the 'rd_id' field with the RelationGetRelid 
macro.

> static void ginDumpEntry(GinState *ginstate,
>volatile WorkerResult *r

Parameter 'r' is unused, you could remove it.

Some of the functions and pieces of code that you've added do not comply with 
the formatting conventions, e.g.

> static int claimSomeBlocks(...
> static GinEntryStack *pushEntry(

>> // The message consists of 2 or 3 parts. iovec allows us to send them as

etc.

Keep up the good work!




Re: [HACKERS] [WIP] ALTER ... OWNER TO ... CASCADE

2016-02-16 Thread Dmitry Ivanov
> Sometimes I hit the following. You have created a database and schema inside
> it from the superuser (i.e. postgres). Than you want to change ownership of
> whole database to another user (i.e. alice), but only this database, not
> all other objects in all other databases. 

Actually, it skips all entries that belong to other databases:

/*
 * We only operate on shared objects and objects in the current
 * database
 */
if (sdepForm->dbid != MyDatabaseId &&
sdepForm->dbid != InvalidOid)
continue;

> It seems that REASSIGN OWNED doesn’t solve this already.

Yes, it doesn't solve this case. This is because, if the superuser that created 
the database is 'pinned' (e.g. postgres), it is impossible to track any object 
which depends on it, since such dependencies are not recorded in pg_shdepend 
(note the comment below):

if (isSharedObjectPinned(AuthIdRelationId, roleid, sdepRel))
{
.
ereport(ERROR,
(errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
 errmsg("cannot reassign ownership

/*
 * There's no need to tell the whole truth, which is that we
 * didn't track these dependencies at all ...
 */
}

This prevents you from doing something like:

test=# reassign owned by postgres to test;
ERROR:  cannot reassign ownership of objects owned by role postgres because 
they are required by the database system

I think that my solution might fit better.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [WIP] ALTER ... OWNER TO ... CASCADE

2016-02-15 Thread Dmitry Ivanov
> Dmitry Ivanov <d.iva...@postgrespro.ru> writes:
> > As of now there's no way to transfer the ownership of an object and all
> > its
> > dependent objects in one step. One has to manually alter the owner of each
> > object, be it a table, a schema or something else.
> 
> TBH, this sounds like a completely terrible idea.  There are far too many
> sorts of dependencies across which one would not expect ownership to
> propagate; for example, use of a function in a view, or even just a
> foreign key dependency between two tables.

Well, actually this is a statement of fact, and I don't propose enabling this 
option for every possible dependency. This patch includes an experimental 
feature that anyone can try and discuss, nothing more. Besides, it acts a lot 
like 'drop ... cascade' (the findDependentObjects() function is used under the 
hood), so the behavior is totally predictable. It also records every object 
that has been touched, so no change goes unnoticed.

> I'm not even clear that there are *any* cases where this behavior is
> wanted, other than perhaps ALTER OWNER on an extension

At the very least this might be useful for changing the owner of all tables 
that inherit from some table.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] [WIP] ALTER ... OWNER TO ... CASCADE

2016-02-15 Thread Dmitry Ivanov
Hi hackers,

Recently I've been working on a CASCADE option for the ALTER ... OWNER TO 
statement. Although it's still a working prototype, I think it's time to share 
my work.


Introduction


As of now there's no way to transfer the ownership of an object and all its 
dependent objects in one step. One has to manually alter the owner of each 
object, be it a table, a schema or something else. This patch adds the 
'CASCADE' option to every 'ALTER X OWNER TO' statement, including the 'ALTER 
DATABASE db OWNER TO user CASCADE' which turns out to be a delicate matter.


Implementation
==

There are two functions that process the 'ALTER ... OWNER' statement: 
ExecAlterOwnerStmt() and ATExecCmd(). The latter deals with the tasks that 
involve all kinds of relations, while the former handles the remaining kinds of 
objects. Basically, all I had to do was add a 'cascade' flag to the 
corresponding parsenodes and make these functions call the dependency tree 
walker function (which changes the ownership of the dependent objects if 
needed). Of course, there are various corner cases for each kind of object that 
require special treatment, but the code speaks for itself.

The aforementioned 'ALTER DATABASE db ...' is handled in a special way. Since 
objects that don't belong to the current database are hidden, it is impossible 
to change their owner directly, so we have to do the job in a background worker 
that is connected to 'db'. I'm not sure whether this is the best solution 
available.


What works
==

Actually, it seems to work in simple cases like 'a table with its inheritors' 
or 'a schema full of tables', but of course there might be things I've 
overlooked. There are some regression tests, though, and I'll continue to 
write some more.


What's dubious
==

It is unclear what kinds of objects should be transferred in case of database 
ownership change, since there's no way to get the full list of objects that 
depend on a given database. Currently the code changes ownership of all 
schemas (excluding the 'information_schema' and some others) and their 
contents, but this is a temporary limitation.

Feedback is welcome!


-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..54be671 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -21,6 +21,7 @@
 #include "catalog/index.h"
 #include "catalog/objectaccess.h"
 #include "catalog/pg_amop.h"
+#include "catalog/pg_aggregate.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
 #include "catalog/pg_authid.h"
@@ -40,6 +41,7 @@
 #include "catalog/pg_foreign_server.h"
 #include "catalog/pg_language.h"
 #include "catalog/pg_largeobject.h"
+#include "catalog/pg_largeobject_metadata.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
@@ -56,6 +58,7 @@
 #include "catalog/pg_ts_template.h"
 #include "catalog/pg_type.h"
 #include "catalog/pg_user_mapping.h"
+#include "commands/alter.h"
 #include "commands/comment.h"
 #include "commands/defrem.h"
 #include "commands/event_trigger.h"
@@ -66,6 +69,7 @@
 #include "commands/seclabel.h"
 #include "commands/trigger.h"
 #include "commands/typecmds.h"
+#include "commands/tablecmds.h"
 #include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteRemove.h"
@@ -77,17 +81,6 @@
 #include "utils/tqual.h"
 
 
-/*
- * Deletion processing requires additional state for each ObjectAddress that
- * it's planning to delete.  For simplicity and code-sharing we make the
- * ObjectAddresses code support arrays with or without this extra state.
- */
-typedef struct
-{
-	int			flags;			/* bitmask, see bit definitions below */
-	ObjectAddress dependee;		/* object whose deletion forced this one */
-} ObjectAddressExtra;
-
 /* ObjectAddressExtra flag bits */
 #define DEPFLAG_ORIGINAL	0x0001		/* an original deletion target */
 #define DEPFLAG_NORMAL		0x0002		/* reached via normal dependency */
@@ -97,17 +90,6 @@ typedef struct
 #define DEPFLAG_REVERSE		0x0020		/* reverse internal/extension link */
 
 
-/* expansible list of ObjectAddresses */
-struct ObjectAddresses
-{
-	ObjectAddress *refs;		/* => palloc'd array */
-	ObjectAddressExtra *extras; /* => palloc'd array, or NULL if not used */
-	int			numrefs;		/* current number of references */
-	int			maxrefs;		/* current size of palloc'd array(s) */
-};
-
-/* typedef ObjectAddresses appears in dependency.h */
-
 /* threaded list of ObjectAddresses, for recursion detection */
 ty

Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-12 Thread Dmitry Ivanov
> OK, that seems reasonable from here.  What I'm still fuzzy about is
> why including stdbool.h causes a failure. Is it because it defines a
> type called "bool" that clashes with ours?  That seems like the
> obvious explanation, but then how does that not cause the compiler to
> just straight-up error out?

stdbool.h defines 'bool' as a macro expanding to the built-in '_Bool' type. The 
standard says:

C99 and C11 §6.3.1.2/1 “When any scalar value is converted to _Bool, the 
result is 0 if the value compares equal to 0; otherwise, the result is 1.”

It seems that MSVC's bool (_Bool) assignment complies with the standard.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-01 Thread Dmitry Ivanov
Hi Hackers,

Although PostgreSQL is capable of performing some FTS (full text search) 
queries, there's still room for improvement. Phrase search support could 
become a great addition to the existing set of features.


Introduction


It is no secret that one can make Google search for an exact phrase (instead 
of an unordered lexeme set) simply by enclosing it within double quotes. This 
is a really nice feature which helps to save the time that would otherwise be 
spent on annoying result filtering.

One weak spot of the current FTS implementation is that there is no way to 
specify the desired lexeme order (since it would not make any difference at 
all). In the end, the search engine looks for each lexeme individually, which 
means that a hypothetical end user would have to discard documents that don't 
include the search phrase by hand. This problem is solved by the patch below 
(it should apply cleanly to 61ce1e8f1).


Problem description
===

The problem comes from the lack of lexeme ordering operator. Consider the 
following example:

select q @@ plainto_tsquery('fatal error') from 
unnest(array[to_tsvector('fatal error'), to_tsvector('error is not fatal')]) 
as q;
?column?
--
t
t
(2 rows)

Clearly the latter match is not the best result in case we wanted to find 
exactly the "fatal error" phrase. That's when the need for a lexeme ordering 
operator arises:

select q @@ to_tsquery('fatal ? error') from unnest(array[to_tsvector('fatal 
error'), to_tsvector('error is not fatal')]) as q;
?column?
--
t
f
(2 rows)


Implementation
==

The ? (FOLLOWED BY) binary operator takes the form "?" or "?[N]" where 0 <= N < 
~16K. If N is provided, the distance between the left and right operands must 
be no greater than N. For example:

select to_tsvector('postgres has taken severe damage') @@ to_tsquery('postgres 
? (severe ? damage)');
 ?column?
--
 f
(1 row)

select to_tsvector('postgres has taken severe damage') @@ to_tsquery('postgres 
?[4] (severe ? damage)');
 ?column?
--
 t
(1 row)

The new function phraseto_tsquery([ regconfig, ] text) takes advantage of the 
"?[N]" operator to facilitate phrase search:

select to_tsvector('postgres has taken severe damage') @@ 
phraseto_tsquery('severely damaged');
 ?column?
--
 t
(1 row)


This patch was originally developed by Teodor Sigaev and Oleg Bartunov in 
2009, so all credit goes to them. Any feedback is welcome.


-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index f70cfe7..9b0a703 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -3925,8 +3925,9 @@ SELECT to_tsvector('english', 'The Fat Rats');
 
  A tsquery value stores lexemes that are to be
  searched for, and combines them honoring the Boolean operators
-  (AND), | (OR), and
- ! (NOT).  Parentheses can be used to enforce grouping
+  (AND), | (OR),
+ ! (NOT) and ? (FOLLOWED BY) phrase search
+ operator.  Parentheses can be used to enforce grouping
  of the operators:
 
 
@@ -3947,8 +3948,8 @@ SELECT 'fat  rat  ! cat'::tsquery;
 
 
  In the absence of parentheses, ! (NOT) binds most tightly,
- and  (AND) binds more tightly than
- | (OR).
+ and  (AND) and ? (FOLLOWED BY)
+ both bind more tightly than | (OR).
 
 
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 139aa2b..7585f6c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9060,6 +9060,12 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
 !'cat'


+ ?? 
+tsquery followed by tsquery
+to_tsquery('fat') ?? to_tsquery('rat')
+'fat' ? 'rat'
+   
+   
  @ 
 tsquery contains another ?
 'cat'::tsquery @ 'cat  rat'::tsquery
@@ -9154,6 +9160,18 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple

 
  
+  phraseto_tsquery
+ 
+ phraseto_tsquery( config regconfig ,  query text)
+
+tsquery
+produce tsquery ignoring punctuation
+phraseto_tsquery('english', 'The Fat Rats')
+'fat' ? 'rat'
+   
+   
+
+ 
   querytree
  
  querytree(query tsquery)
@@ -9177,6 +9195,15 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple


 
+ setweight(tsquery, "char")
+
+tsquery
+add weight to each element of tsquery
+setweight('fat ? cat  rat:B'::tsquery, 'A')
+( 'fat':A ? 'cat':A )  'rat':AB
+   
+   
+
  
   strip
  
@@ -9269,6 +9296,27 @@ CREATE TYPE rainbow AS E

Re: [HACKERS] [PROPOSAL] Backup and recovery of pg_statistic

2015-12-28 Thread Dmitry Ivanov
Here's the patch with an 'array_elemtype' procedure which returns an array's 
element type as an oid. It should apply on top of commit fc995b. I wasn't sure 
whether I should start a new thread.


-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e08bf60..671c6d5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11695,6 +11695,9 @@ SELECT NULLIF(value, '(none)') ...
 array_dims
   
   
+array_elemtype
+  
+  
 array_fill
   
   
@@ -11794,6 +11797,17 @@ SELECT NULLIF(value, '(none)') ...

 
  
+  array_elemtype(anyarray)
+ 
+
+oid
+returns the element type of an array as oid
+array_elemtype(ARRAY[1,2,3])
+23
+   
+   
+
+ 
   array_fill(anyelement, int[],
   , int[])
  
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 72d308a..c2883e9 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -158,6 +158,18 @@ static int width_bucket_array_variable(Datum operand,
 			Oid collation,
 			TypeCacheEntry *typentry);
 
+/*
+ * array_elemtype :
+ *		returns the element type of the array
+ *		pointed to by "v" as an Oid.
+ */
+Datum
+array_elemtype(PG_FUNCTION_ARGS)
+{
+	AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
+	
+	return ObjectIdGetDatum(AARR_ELEMTYPE(v));
+}
 
 /*
  * array_in :
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d8640db..de55c01 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -892,6 +892,8 @@ DATA(insert OID = 2092 (  array_upper	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s
 DESCR("array upper dimension");
 DATA(insert OID = 2176 (  array_length	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 23 "2277 23" _null_ _null_ _null_ _null_ _null_ array_length _null_ _null_ _null_ ));
 DESCR("array length");
+DATA(insert OID = 3317 (  array_elemtype   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 26 "2277" _null_ _null_ _null_ _null_ _null_	array_elemtype _null_ _null_ _null_ ));
+DESCR("array element type");
 DATA(insert OID = 3179 (  cardinality	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23 "2277" _null_ _null_ _null_ _null_ _null_ array_cardinality _null_ _null_ _null_ ));
 DESCR("array cardinality");
 DATA(insert OID = 378 (  array_append	   PGNSP PGUID 12 1 0 0 0 f f f f f f i s 2 0 2277 "2277 2283" _null_ _null_ _null_ _null_ _null_ array_append _null_ _null_ _null_ ));
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 716e756..22c0fc9 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -328,6 +328,7 @@ extern bool Array_nulls;
 /*
  * prototypes for functions defined in arrayfuncs.c
  */
+extern Datum array_elemtype(PG_FUNCTION_ARGS);
 extern Datum array_in(PG_FUNCTION_ARGS);
 extern Datum array_out(PG_FUNCTION_ARGS);
 extern Datum array_recv(PG_FUNCTION_ARGS);



Re: [HACKERS] [PROPOSAL] Backup and recovery of pg_statistic

2015-12-24 Thread Dmitry Ivanov
> There's still the problem that it won't work across a major version
> upgrade that makes any change whatsoever to the rowtype of pg_statistic.

I'm sorry if my previous explanation was poor; this time I am going to be more 
detailed.

Suppose you want to upgrade from 9.4 to 9.6. In that case you would use the 
pg_upgrade utility provided by release 9.6, which means that it's the 9.6 
pg_dump that would have to connect to the older instance and prepare tuples to 
be inserted into the pg_statistic of the newer instance. The pg_dump utility 
would have to convert statistical data to the new format (for example, by 
adding placeholders for new columns), so the generated INSERT statements would 
be fine provided that pg_dump is up to date.

The documentation states that we should always run the pg_upgrade binary of 
the new server, not the old one [http://www.postgresql.org/docs/9.5/static/
pgupgrade.html, Usage, #9]. This means that pg_upgrade will definitely use a 
fresh version of the pg_dump utility that is aware of all possible pitfalls.

Furthermore, each INSERT statement consists of textually serialized columns of 
pg_statistic. Columns of 'anyarray' type are deserialized using the 'array_in' 
procedure, which performs various sanity checks, including a check of the 
element type. Thus it is not possible to insert an anyarray object that would 
crash the server.


-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [PROPOSAL] Backup and recovery of pg_statistic

2015-12-23 Thread Dmitry Ivanov
> That concern is exactly the reason why we never did this originally.
> In particular, emitting raw INSERTs into pg_statistic is just plain
> foolish; we have changed the rowtype of pg_statistic in the past and
> are likely to do so again.  At a minimum we would need such a facility
> to be proof against addition of more statistic "slots" (more columns)
> to pg_statistic.

While this approach may indeed look dumb, it is intended to be used only in 
conjunction with the 'binary-upgrade' option, which effectively means that 
each pg_dump-generated INSERT statement has to be compatible only with the 
release that includes that very pg_dump version. Thus, there's no need for 
validation.

> And of course there was code to emit such a dump, producing one
> dump statement per occupied "slot" in pg_statistic plus one call to
> the other function per pg_statistic row.

> An API like this seems a good deal more future-proof than plain INSERTs.

This sounds really good, but I doubt that this is necessary if we're to just 
preserve statistical data during an upgrade.

> Ideally, ordinary users
> could use this facility to transfer statistics for their own tables, just
> as they can use pg_dump ... but without adequate validity checking, it's
> way too much of a security hazard to allow that.

This could be implemented separately from pg_dump if needed. The current 
proposal aims at preserving statistical data during a database upgrade, which 
is a rather important feature required by many DBAs. Of course, it would be 
great if there were a way for a user to dump and restore stats for his own 
tables on a whim, but, in my opinion, that is far less important.





[HACKERS] Regarding recovery configuration

2015-12-01 Thread Dmitry Ivanov
Hi hackers,

I'd like to share some thoughts on 'recovery.conf'-related variables.


Introduction


The 'recovery.conf' file is used to set up the recovery configuration of a 
PostgreSQL server. Its structure closely resembles that of the 
'postgresql.conf' file, whose settings are written to the GUC (which stands 
for Grand Unified Configuration) at startup. There are several variables in 
the PostgreSQL backend, e.g.,

* recoveryRestoreCommand
* recoveryEndCommand
* recoveryCleanupCommand
* recoveryTarget
  ...

that store values read from the 'recovery.conf' file during server startup. 
Although they represent the recovery configuration, they are not included in 
the GUC and thus are not directly accessible via SHOW. Furthermore, the lack 
of an iterable collection of these variables (i.e. a sorted array of 
'config_generic' structures) significantly complicates the 
'readRecoveryCommandFile' procedure, since it mainly consists of numerous 
string comparisons and variable assignments. Lastly, 'recovery.conf' is not 
reloadable, which means that there is no way to change recovery settings 
(e.g. 'primary_conninfo') after startup. This feature could come in handy if, 
for example, a master-slave connection terminated not long after the master's 
password had been changed; there would be no need to restart the slave in 
that case.


Possible changes


There are at least two ways to facilitate the management of these variables:

1. Combine them into a set similar to the GUC and name it something like RUC 
(Recovery Unified Configuration), thus providing separate variable storage for 
each configuration file. Some of its members might be read-only.

2. Add them to the GUC, perhaps with a special context.

Both approaches would allow us to create, for example, a dedicated view for 
the 'recovery.conf' file using a drastically simplified procedure written in 
C. It would also become easier to reload at least some settings from 
'recovery.conf' at runtime.

Suggestions and comments are welcome.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] [PROPOSAL] Backup and recovery of pg_statistic

2015-11-27 Thread Dmitry Ivanov
Hi hackers,

This is my proposal for 'pg_dump' utility enhancements which enable backup and 
recovery of statistical data stored in the 'pg_statistic' table. Statistical 
data is very valuable to the query planner, so support engineers have to 
regenerate it manually with ANALYZE whenever they migrate or restore a 
database from backup, which is unpleasant. This proposal aims to improve the 
overall situation.


Problem description
===

Currently there is no way to back up 'pg_statistic' because columns of 
'anyarray' type cannot be reconstructed solely from their textual 
representation. Meanwhile, all that is needed to solve this problem is a small 
extension capable of retrieving the element type of an 'anyarray' object and 
recreating this particular 'anyarray' object using the 'array_in' procedure. 
Another vital feature is the ability to transform the various object 
identifiers stored in this table into textual representations in order to 
make them portable. Such functionality could be easily implemented, which is 
why I've put together a tiny proof-of-concept extension that demonstrates the 
possibility of recovery.


Design
==

Several things come to mind when we think of the 'pg_statistic' recovery:

* A procedure written in C is needed in order to determine the element type 
of an 'anyarray' object. The returned 'oid' will be used to reconstruct the 
'anyarray' object using the 'array_in' procedure.

arr := array_in('{1,2,3}', 'text'::regtype::oid, -1);
anyarray_elemtype(arr) -> 25 ('text'::regtype::oid)

* The columns 'starelid' (relation identifier) and 'staop' (operator 
identifier) have type 'oid', so their values could be invalid within a new 
DB, because the object IDs of newly recovered relations and operators might 
have changed during recovery. Such values should be substituted with proper 
casts to internal types, for example:

65554 (relid) -> 'public.test'::regclass::oid
15 (staopN) -> 'pg_catalog.=(pg_catalog.int4, pg_catalog.int8)'

Note that every type is schema-qualified in order to avoid naming conflicts.

* The type of the column referenced by the 'staattnum' column also needs to 
be checked for the sake of consistency: it should remain the same, otherwise 
the collected stats won't be of any use.

* The main procedure will simply generate a bunch of INSERT queries (one 
query per row of 'pg_statistic') which can be saved to a text file.
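A hypothetical sketch (illustrative Python, not the extension's actual C code, with an abridged column list) of what one such generated statement could look like, with oids replaced by the schema-qualified casts described above so the INSERT survives oid reassignment in the target database:

```python
# Render one pg_statistic row as a portable INSERT: the relation oid is
# emitted as a regclass cast instead of a raw number (abridged columns).
def stat_row_to_insert(relname, attnum, stats):
    """Build an INSERT for one stats row; 'stats' holds the remaining values."""
    values = [f"'{relname}'::regclass::oid", str(attnum)] + [str(v) for v in stats]
    return ("INSERT INTO pg_statistic (starelid, staattnum, stanullfrac, stadistinct) "
            f"VALUES ({', '.join(values)});")

print(stat_row_to_insert("public.test", 1, [0.0, -1]))
# -> INSERT INTO pg_statistic (starelid, staattnum, stanullfrac, stadistinct)
#    VALUES ('public.test'::regclass::oid, 1, 0.0, -1);
```

Because every identifier is emitted as a cast from a schema-qualified name, the resulting file can be replayed with plain psql on the restored database.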


Proof of concept


Interface
-

Currently there's a PoC extension which contains several functions:

dump_statistic() - returns a set of INSERT queries;

anyarray_elemtype(anyarray) - returns the object identifier of an element 
type;

to_schema_qualified_operator(opid oid) - converts the 'opid' (operator ID) to a 
schema-qualified operator name;
to_schema_qualified_relname(relid oid) - converts the 'relid' (relation ID) to 
a schema-qualified relation name;
to_schema_qualified_type(typid oid) - converts the 'typid' (type ID) to a 
schema-qualified type name;

to_attname(rel text, colnum smallint) - returns the name of the Nth column of 
the specified table 'rel';
to_attnum(rel text, col text) - converts the table name 'rel' and the column 
name 'col' to a column number;

to_atttype(rel text, col text) - returns the type of the column 'col';
to_atttype(rel text, colnum smallint) - overloaded for the column number 
'colnum';

The extension is compatible with versions 9.4 and above.

Usage example
-

DB=# \copy (select dump_statistic()) to 'stat_backup.sql'
$ psql DB < stat_backup.sql


Proposed changes to pg_dump
===

Now that the approach has been developed, it may be applied to improve the 
'pg_dump' utility. Some minor code changes would make 'pg_dump' emit 
specially-formed recovery INSERTs for 'pg_statistic' in 'binary-upgrade' 
mode, thus allowing us to restore the saved stats after an upgrade.


Conclusion
==

I've attached a tarball with sources so that anyone could try the extension.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

dump_stat.tar.gz
Description: application/compressed-tar
