Re: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2024-05-16 Thread Quan Zongliang




On 2024/5/16 00:58, Robert Haas wrote:

On Tue, Apr 16, 2024 at 3:16 AM Quan Zongliang  wrote:

Following the discussion with Jian He, a GUC hook is used to check whether the
XID needs to be output. If it does, the statement log can be delayed until the
XID is available.


I appreciate the work that both of you have put into this, but I think
we should reject this patch and remove the TODO item. We currently
have some facilities (like log_statement) that log the statement
before parsing it or executing it, and others (like
log_min_duration_statement) that log it afterward. That is probably
not documented as clearly as it should be, but it's long-established
behavior.

What this patch does is change the behavior of log_statement so that
log_statement sometimes logs the statement before it's executed, and
sometimes after the statement. I think that's going to be confusing
and unhelpful. In particular, right now you can assume that if you set
log_statement=all and there's a statement running, it's already been
logged. With this change, that would sometimes be true and sometimes
false.

For example, suppose that at 9am sharp, I run an UPDATE command that
takes ten seconds to complete. Right now, the log_statement message
will appear at 9am. With this change, it will run at 9am if I do it
inside a transaction block that has an XID already, and at 9:00:10am
if I do it in a transaction block that does not yet have an XID, or if
I do it outside of a transaction. I don't think the benefit of getting
the XID in the log message is nearly enough to justify such a strange
behavior.

I thought about writing the statement log when the XID is assigned, but that 
seemed too complicated.
I'm inclined to keep it for a while, until we find a good way or give up. 
It's a reasonable request, after all.






Re: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2024-04-16 Thread Quan Zongliang



On 2024/3/11 09:25, Quan Zongliang wrote:



On 2024/3/4 15:48, jian he wrote:


Maybe we can tolerate LOG, first output the query plan then statement.


It is more appropriate to let the extension solve its own problems.
Of course, this change is not easy to implement.
Due to the way XID is assigned, there seems to be no good solution at 
the moment.



Following the discussion with Jian He, a GUC hook is used to check whether the 
XID needs to be output. If it does, the statement log can be delayed until the 
XID is available.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 76f48b13d2..217910c9de 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1068,11 +1068,14 @@ exec_simple_query(const char *query_string)
/* Log immediately if dictated by log_statement */
if (check_log_statement(parsetree_list))
{
-   ereport(LOG,
-   (errmsg("statement: %s", query_string),
-errhidestmt(true),
-errdetail_execute(parsetree_list)));
-   was_logged = true;
+   if (!Log_has_xid || TransactionIdIsValid(GetTopTransactionIdIfAny()))
+   {
+   ereport(LOG,
+   (errmsg("statement: %s", query_string),
+   errhidestmt(true),
+   errdetail_execute(parsetree_list)));
+   was_logged = true;
+   }
}
 
/*
@@ -1283,6 +1286,16 @@ exec_simple_query(const char *query_string)
 
PortalDrop(portal, false);
 
+   /* Log if dictated by log_statement and has not been logged. */
+   if (!was_logged && check_log_statement(parsetree_list))
+   {
+   ereport(LOG,
+   (errmsg("statement: %s", query_string),
+   errhidestmt(true),
+   errdetail_execute(parsetree_list)));
+   was_logged = true;
+   }
+
if (lnext(parsetree_list, parsetree_item) == NULL)
{
/*
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 605ff3b045..e7dd9f5027 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -79,6 +79,7 @@
 #include "storage/ipc.h"
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
+#include "utils/guc.h"
 #include "utils/guc_hooks.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -114,6 +115,9 @@ char   *Log_destination_string = NULL;
 bool   syslog_sequence_numbers = true;
 bool   syslog_split_messages = true;
 
+/* Whether the transaction id will appear in the log messages. */
+bool   Log_has_xid = false;
+
 /* Processed form of backtrace_functions GUC */
 static char *backtrace_function_list;
 
@@ -286,6 +290,50 @@ message_level_is_interesting(int elevel)
 }
 
 
+/*
+ * check_log_line_prefix: GUC check_hook for Log_line_prefix
+ */
+bool check_log_line_prefix(char **newval, void **extra, GucSource source)
+{
+   const char *p;
+
+   if ((*newval) == NULL)
+   {
+   Log_has_xid = false;
+   return true;
+   }
+
+   Log_has_xid = false;
+
+   for (p = (*newval); *p != '\0'; p++)
+   {
+   if (*p != '%')
+   {
+   /* literal char, just skip */
+   continue;
+   }
+
+   /* must be a '%', so skip to the next char */
+   p++;
+   if (*p == '\0')
+   break;  /* format error - ignore it */
+   else if (*p == '%')
+   {
+   /* string contains %% */
+   continue;
+   }
+
+   /* process the option */
+   if (*p == 'x')
+   {
+   Log_has_xid = true;
+   break;
+   }
+   }
+
+   return true;
+}
+
 /*
  * in_error_recursion_trouble --- are we at risk of infinite error recursion?
  *
diff --git a/src/backend/utils/misc/guc_tables.c 
b/src/backend/utils/misc/guc_tables.c
index c68fdc008b..96948e5005 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -4103,7 +4103,7 @@ struct config_string ConfigureNamesString[] =
},
&Log_line_prefix,
"%m [%p] ",
-   NULL, NULL, NULL
+   check_log_line_prefix, NULL, NULL
},
 
{
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 054dd2bf62..95361

Re: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2024-03-10 Thread Quan Zongliang




On 2024/3/4 15:48, jian he wrote:


Maybe we can tolerate LOG, first output the query plan then statement.


It is more appropriate to let the extension solve its own problems.
Of course, this change is not easy to implement.
Due to the way XID is assigned, there seems to be no good solution at 
the moment.






Re: Change the bool member of the Query structure to bits

2024-02-20 Thread Quan Zongliang




On 2024/2/20 19:18, Tomas Vondra wrote:

On 2/20/24 11:11, Quan Zongliang wrote:


Sorry. I forgot to save a file. This is the latest.

On 2024/2/20 18:07, Quan Zongliang wrote:


The Query structure has an increasing number of bool attributes. This
is likely to increase in the future. And they have the same
properties. Wouldn't it be better to store them in bits? Common
statements don't use them, so they have little impact. This also saves
memory space.



Hi,

Are we really adding bools to Query that often? A bit of git-blame says
it's usually multiple years to add a single new flag, which is what I'd
expect. I doubt that'll change.

As for the memory savings, can you quantify how much memory this would save?

I highly doubt that's actually true (or at least measurable). The Query
struct has ~256B, the patch cuts that to ~232B. But we allocate stuff in
power-of-2, so we'll allocate 256B chunk anyway. And we allocate very
few of those objects anyway ...

regards


That makes sense. Withdraw.





Re: Change the bool member of the Query structure to bits

2024-02-20 Thread Quan Zongliang




On 2024/2/20 23:45, Tom Lane wrote:

Quan Zongliang  writes:

The Query structure has an increasing number of bool attributes. This is
likely to increase in the future. And they have the same properties.
Wouldn't it be better to store them in bits? Common statements don't use
them, so they have little impact. This also saves memory space.


I'm -1 on that, for three reasons:

* The amount of space saved is quite negligible.  If queries had many
Query structs then it could matter, but they don't.

* This causes enough code churn to create a headache for back-patching.

* This'll completely destroy the readability of these flags in
pprint output.

I'm not greatly in love with the macro layer you propose, either,
but those details don't matter because I think we should just
leave well enough alone.

regards, tom lane

I get it. Withdraw.





Re: Change the bool member of the Query structure to bits

2024-02-20 Thread Quan Zongliang


Sorry. I forgot to save a file. This is the latest.

On 2024/2/20 18:07, Quan Zongliang wrote:


The Query structure has an increasing number of bool attributes. This is 
likely to increase in the future. And they have the same properties. 
Wouldn't it be better to store them in bits? Common statements don't use 
them, so they have little impact. This also saves memory space.


--
Quan Zongliang

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 142dcfc995..a28d625147 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3341,7 +3341,7 @@ estimate_path_cost_size(PlannerInfo *root,
 
/* Collect statistics about aggregates for estimating costs. */
MemSet(&aggcosts, 0, sizeof(AggClauseCosts));
-   if (root->parse->hasAggs)
+   if (QueryHasAggs(root->parse))
{
get_agg_clause_costs(root, AGGSPLIT_SIMPLE, &aggcosts);
}
@@ -6762,7 +6762,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
Costtotal_cost;
 
/* Nothing to be done, if there is no grouping or aggregation required. 
*/
-   if (!parse->groupClause && !parse->groupingSets && !parse->hasAggs &&
+   if (!parse->groupClause && !parse->groupingSets && !QueryHasAggs(parse) &&
!root->hasHavingQual)
return;
 
@@ -6859,7 +6859,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
Assert(parse->sortClause);
 
/* We don't support cases where there are any SRFs in the targetlist */
-   if (parse->hasTargetSRFs)
+   if (QueryHasTargetSRFs(parse))
return;
 
/* Save the input_rel as outerrel in fpinfo */
@@ -7007,7 +7007,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
return;
 
/* We don't support cases where there are any SRFs in the targetlist */
-   if (parse->hasTargetSRFs)
+   if (QueryHasTargetSRFs(parse))
return;
 
/* Save the input_rel as outerrel in fpinfo */
diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index dce898c751..e328bd9242 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -398,7 +398,7 @@ DefineView(ViewStmt *stmt, const char *queryString,
 * DefineQueryRewrite(), but that function will complain about a bogus 
ON
 * SELECT rule, and we'd rather the message complain about a view.
 */
-   if (viewParse->hasModifyingCTE)
+   if (QueryHasModifyingCTE(viewParse))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("views must not contain data-modifying 
statements in WITH")));
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 0f811fd2fc..cdc94f082d 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -488,7 +488,7 @@ init_execution_state(List *queryTree_list,
/* Utility commands require no planning. */
stmt = makeNode(PlannedStmt);
stmt->commandType = CMD_UTILITY;
-   stmt->canSetTag = queryTree->canSetTag;
+   stmt->canSetTag = QueryCanSetTag(queryTree);
stmt->utilityStmt = queryTree->utilityStmt;
stmt->stmt_location = queryTree->stmt_location;
stmt->stmt_len = queryTree->stmt_len;
@@ -540,7 +540,7 @@ init_execution_state(List *queryTree_list,
newes->stmt = stmt;
newes->qd = NULL;
 
-   if (queryTree->canSetTag)
+   if (QueryCanSetTag(queryTree))
lasttages = newes;
 
preves = newes;
@@ -1650,7 +1650,7 @@ check_sql_fn_retval(List *queryTreeLists,
{
Query  *q = lfirst_node(Query, lc2);
 
-   if (q->canSetTag)
+   if (QueryCanSetTag(q))
{
parse = q;
parse_cell = lc2;
@@ -1934,7 +1934,7 @@ tlist_coercion_finished:
newquery = makeNode(Query);
newquery->commandType = CMD_SELECT;
newquery->querySource = parse->querySource;
-   newquery->canSetTag = true;
+   QuerySetFlag(newquery, CAN_SET_TAG);
newquery->targetList = upper_tlist;
 
   

Change the bool member of the Query structure to bits

2024-02-20 Thread Quan Zongliang


The Query structure has an increasing number of bool attributes, and that 
number is likely to keep growing. They all have the same properties, so 
wouldn't it be better to store them as bits? Common statements don't use 
them, so the impact is small, and it also saves memory space.
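
The macro layer used in the patch below (QueryHasAggs, QueryCanSetTag, 
QuerySetFlag, ...) is not shown in this excerpt. Just as an illustration of 
the idea, and not the actual patch, such a layer could look roughly like 
this, assuming a single flags word replaces the individual bool fields:

/* illustrative sketch only, not the actual patch */
#define HAS_AGGS            (1 << 0)
#define HAS_TARGET_SRFS     (1 << 1)
#define HAS_MODIFYING_CTE   (1 << 2)
#define CAN_SET_TAG         (1 << 3)

/* assumes a hypothetical "uint32 flags" member in struct Query */
#define QueryHasAggs(q)          (((q)->flags & HAS_AGGS) != 0)
#define QueryHasTargetSRFs(q)    (((q)->flags & HAS_TARGET_SRFS) != 0)
#define QueryHasModifyingCTE(q)  (((q)->flags & HAS_MODIFYING_CTE) != 0)
#define QueryCanSetTag(q)        (((q)->flags & CAN_SET_TAG) != 0)
#define QuerySetFlag(q, f)       ((q)->flags |= (f))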


--
Quan Zongliang

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 142dcfc995..a28d625147 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3341,7 +3341,7 @@ estimate_path_cost_size(PlannerInfo *root,
 
/* Collect statistics about aggregates for estimating costs. */
MemSet(&aggcosts, 0, sizeof(AggClauseCosts));
-   if (root->parse->hasAggs)
+   if (QueryHasAggs(root->parse))
{
get_agg_clause_costs(root, AGGSPLIT_SIMPLE, &aggcosts);
}
@@ -6762,7 +6762,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
Costtotal_cost;
 
/* Nothing to be done, if there is no grouping or aggregation required. 
*/
-   if (!parse->groupClause && !parse->groupingSets && !parse->hasAggs &&
+   if (!parse->groupClause && !parse->groupingSets && !QueryHasAggs(parse) &&
!root->hasHavingQual)
return;
 
@@ -6859,7 +6859,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
Assert(parse->sortClause);
 
/* We don't support cases where there are any SRFs in the targetlist */
-   if (parse->hasTargetSRFs)
+   if (QueryHasTargetSRFs(parse))
return;
 
/* Save the input_rel as outerrel in fpinfo */
@@ -7007,7 +7007,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
return;
 
/* We don't support cases where there are any SRFs in the targetlist */
-   if (parse->hasTargetSRFs)
+   if (QueryHasTargetSRFs(parse))
return;
 
/* Save the input_rel as outerrel in fpinfo */
diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index dce898c751..e328bd9242 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -398,7 +398,7 @@ DefineView(ViewStmt *stmt, const char *queryString,
 * DefineQueryRewrite(), but that function will complain about a bogus 
ON
 * SELECT rule, and we'd rather the message complain about a view.
 */
-   if (viewParse->hasModifyingCTE)
+   if (QueryHasModifyingCTE(viewParse))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("views must not contain data-modifying 
statements in WITH")));
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 0f811fd2fc..cdc94f082d 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -488,7 +488,7 @@ init_execution_state(List *queryTree_list,
/* Utility commands require no planning. */
stmt = makeNode(PlannedStmt);
stmt->commandType = CMD_UTILITY;
-   stmt->canSetTag = queryTree->canSetTag;
+   stmt->canSetTag = QueryCanSetTag(queryTree);
stmt->utilityStmt = queryTree->utilityStmt;
stmt->stmt_location = queryTree->stmt_location;
stmt->stmt_len = queryTree->stmt_len;
@@ -540,7 +540,7 @@ init_execution_state(List *queryTree_list,
newes->stmt = stmt;
newes->qd = NULL;
 
-   if (queryTree->canSetTag)
+   if (QueryCanSetTag(queryTree))
lasttages = newes;
 
preves = newes;
@@ -1650,7 +1650,7 @@ check_sql_fn_retval(List *queryTreeLists,
{
Query  *q = lfirst_node(Query, lc2);
 
-   if (q->canSetTag)
+   if (QueryCanSetTag(q))
{
parse = q;
parse_cell = lc2;
@@ -1934,7 +1934,7 @@ tlist_coercion_finished:
newquery = makeNode(Query);
newquery->commandType = CMD_SELECT;
newquery->querySource = parse->querySource;
-   newquery->canSetTag = true;
+   QuerySetFlag(newquery, CAN_SET_TAG);
newquery->targetList = upper_tlist;
 
/* We need a moderately realistic colnames list for the 
subquery RTE */
diff --git a/src/backend/optimizer/path/allpaths.c 
b/src/backend/optimizer/path/allpaths.c
index d404fbf262..bd359fd636 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ 

Re: Improvement discussion of custom and generic plans

2024-02-19 Thread Quan Zongliang

Add the GUC parameter.

On 2024/1/30 21:25, Quan Zongliang wrote:



On 2023/11/3 15:27, Quan Zongliang wrote:

Hi

We have one such problem. A table field has skewed data. Statistics:
n_distinct | -0.4481973
most_common_vals   | {5f006ca25b52ed78e457b150ee95a30c}
most_common_freqs  | {0.5518474}


Data generation:

CREATE TABLE s_user (
  user_id varchar(32) NOT NULL,
  corp_id varchar(32),
  status int NOT NULL
  );

insert into s_user
select md5('user_id ' || a), md5('corp_id ' || a),
  case random()<0.877675 when true then 1 else -1 end
   FROM generate_series(1,10031) a;

insert into s_user
select md5('user_id ' || a), md5('corp_id 10032'),
  case random()<0.877675 when true then 1 else -1 end
   FROM generate_series(10031,22383) a;

CREATE INDEX s_user_corp_id_idx ON s_user USING btree (corp_id);

analyze s_user;


1. First, define a PREPARE statement
prepare stmt as select count(*) from s_user where status=1 and corp_id 
= $1;


2. Run it five times. Choose the custom plan.
explain (analyze,buffers) execute 
stmt('5f006ca25b52ed78e457b150ee95a30c');


Here's the plan:
  Aggregate  (cost=639.84..639.85 rows=1 width=8) (actual 
time=4.653..4.654 rows=1 loops=1)

    Buffers: shared hit=277
    ->  Seq Scan on s_user  (cost=0.00..612.76 rows=10830 width=0) 
(actual time=1.402..3.747 rows=10836 loops=1)
  Filter: ((status = 1) AND ((corp_id)::text = 
'5f006ca25b52ed78e457b150ee95a30c'::text))

  Rows Removed by Filter: 11548
  Buffers: shared hit=277
  Planning Time: 0.100 ms
  Execution Time: 4.674 ms
(8 rows)

3.From the sixth time. Choose generic plan.
We can see that there is a huge deviation between the estimate and the 
actual value:
  Aggregate  (cost=11.83..11.84 rows=1 width=8) (actual 
time=4.424..4.425 rows=1 loops=1)

    Buffers: shared hit=154 read=13
    ->  Bitmap Heap Scan on s_user  (cost=4.30..11.82 rows=2 width=0) 
(actual time=0.664..3.371 rows=10836 loops=1)

  Recheck Cond: ((corp_id)::text = $1)
  Filter: (status = 1)
  Rows Removed by Filter: 1517
  Heap Blocks: exact=154
  Buffers: shared hit=154 read=13
  ->  Bitmap Index Scan on s_user_corp_id_idx  
(cost=0.00..4.30 rows=2 width=0) (actual time=0.635..0.635 rows=12353 
loops=1)

    Index Cond: ((corp_id)::text = $1)
    Buffers: shared read=13
  Planning Time: 0.246 ms
  Execution Time: 4.490 ms
(13 rows)

This is because in the choose_custom_plan function, the generic plan 
is attempted after executing the custom plan five times.


 if (plansource->num_custom_plans < 5)
 return true;

The generic plan uses var_eq_non_const to estimate the average 
selectivity.


These are facts that many people already know. So a brief introduction.


Our users actually use such parameter conditions in very complex 
PREPARE statements. Once they use the generic plan for the sixth time. 
The execution time will change from 5 milliseconds to 5 minutes.



To improve this problem. The following approaches can be considered:

1. Determine whether data skew exists in the PREPARE statement 
parameter conditions based on the statistics.
However, there is no way to know if the user will use the skewed 
parameter.


2.When comparing the cost of the generic plan with the average cost of 
the custom plan(function choose_custom_plan). Consider whether the 
maximum cost of a custom plan executed is an order of magnitude 
different from the cost of a generic plan.
If the first five use a small selectivity condition. And after the 
sixth use a high selectivity condition. Problems will still arise.


3.Trace the execution time of the PREPARE statement. When an execution 
time is found to be much longer than the average execution time, the 
custom plan is forced to run.



Is there any better idea?

I tried to do a demo. Add a member paramid to Const. When Const is 
generated by Param, the Const is identified as coming from Param. Then 
check in var_eq_const to see if the field in the condition using this 
parameter is skewed. If so, choose_custom_plan returns true every time, 
forcing custom_plan to be used.
Only conditional expressions such as var eq param or param eq var can be 
supported.

If it makes sense. Continue to improve this patch.


--
Quan Zongliang

diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index 8b76e98529..14b8bec6ff 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -154,6 +154,8 @@ boolenable_partition_pruning = true;
 bool   enable_presorted_aggregate = true;
 bool   enable_async_append = true;
 
+double skewed_param_factor = DEFAULT_SKEWED_PARAM_FACTOR;
+
 typedef struct
 {
PlannerInfo *root;
diff --git a/src/backend/optimizer/util/clauses.c 
b/src/backend/optimizer/util/clauses.c
index edc25d712e..8b922c0c95 100644
--- a/

Re: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2024-02-01 Thread Quan Zongliang


A little tweak to the code:
GetTopTransactionIdIfAny() != InvalidTransactionId
changed to
TransactionIdIsValid(GetTopTransactionIdIfAny())


On 2024/1/12 08:51, jian he wrote:

Hi

...

with patch:
src3=# explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
duration: 10010.167 ms  plan:
 Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
 Function Scan on pg_sleep  (cost=0.00..0.01 rows=1 width=4)
(actual time=10010.155..10010.159 rows=1 loops=1)
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
statement: explain(analyze, costs off) select 1 from pg_sleep(10);
  QUERY PLAN
-
  Function Scan on pg_sleep (actual time=10010.155..10010.159 rows=1 loops=1)
  Planning Time: 0.115 ms
  Execution Time: 10010.227 ms
(3 rows)

This problem does exist for a statement that takes a long time to run.
The XID is assigned only when the first tuple is changed. If the user wants 
to see it in a single statement log line, they have to wait until the 
statement has finished executing, and we don't know how long that will take. 
It is not appropriate to output the log twice just because of the xid. 
Besides, without parsing log_line_prefix we don't know whether the user wants 
to see the xid at all.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1a34bd3715..bd08b64450 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1064,8 +1064,9 @@ exec_simple_query(const char *query_string)
 */
parsetree_list = pg_parse_query(query_string);
 
-   /* Log immediately if dictated by log_statement */
-   if (check_log_statement(parsetree_list))
+   /* Log immediately if dictated by log_statement and XID assigned. */
+   if (TransactionIdIsValid(GetTopTransactionIdIfAny()) &&
+   check_log_statement(parsetree_list))
{
ereport(LOG,
(errmsg("statement: %s", query_string),
@@ -1282,6 +1283,16 @@ exec_simple_query(const char *query_string)
 
PortalDrop(portal, false);
 
+   /* Log if dictated by log_statement and has not been logged. */
+   if (!was_logged && check_log_statement(parsetree_list))
+   {
+   ereport(LOG,
+   (errmsg("statement: %s", query_string),
+   errhidestmt(true),
+   errdetail_execute(parsetree_list)));
+   was_logged = true;
+   }
+
if (lnext(parsetree_list, parsetree_item) == NULL)
{
/*


Re: Improvement discussion of custom and generic plans

2024-01-30 Thread Quan Zongliang



On 2023/11/3 15:27, Quan Zongliang wrote:

Hi

We have one such problem. A table field has skewed data. Statistics:
n_distinct | -0.4481973
most_common_vals   | {5f006ca25b52ed78e457b150ee95a30c}
most_common_freqs  | {0.5518474}


Data generation:

CREATE TABLE s_user (
  user_id varchar(32) NOT NULL,
  corp_id varchar(32),
  status int NOT NULL
  );

insert into s_user
select md5('user_id ' || a), md5('corp_id ' || a),
  case random()<0.877675 when true then 1 else -1 end
   FROM generate_series(1,10031) a;

insert into s_user
select md5('user_id ' || a), md5('corp_id 10032'),
  case random()<0.877675 when true then 1 else -1 end
   FROM generate_series(10031,22383) a;

CREATE INDEX s_user_corp_id_idx ON s_user USING btree (corp_id);

analyze s_user;


1. First, define a PREPARE statement
prepare stmt as select count(*) from s_user where status=1 and corp_id = 
$1;


2. Run it five times. Choose the custom plan.
explain (analyze,buffers) execute stmt('5f006ca25b52ed78e457b150ee95a30c');

Here's the plan:
  Aggregate  (cost=639.84..639.85 rows=1 width=8) (actual 
time=4.653..4.654 rows=1 loops=1)

    Buffers: shared hit=277
    ->  Seq Scan on s_user  (cost=0.00..612.76 rows=10830 width=0) 
(actual time=1.402..3.747 rows=10836 loops=1)
  Filter: ((status = 1) AND ((corp_id)::text = 
'5f006ca25b52ed78e457b150ee95a30c'::text))

  Rows Removed by Filter: 11548
  Buffers: shared hit=277
  Planning Time: 0.100 ms
  Execution Time: 4.674 ms
(8 rows)

3.From the sixth time. Choose generic plan.
We can see that there is a huge deviation between the estimate and the 
actual value:
  Aggregate  (cost=11.83..11.84 rows=1 width=8) (actual 
time=4.424..4.425 rows=1 loops=1)

    Buffers: shared hit=154 read=13
    ->  Bitmap Heap Scan on s_user  (cost=4.30..11.82 rows=2 width=0) 
(actual time=0.664..3.371 rows=10836 loops=1)

  Recheck Cond: ((corp_id)::text = $1)
  Filter: (status = 1)
  Rows Removed by Filter: 1517
  Heap Blocks: exact=154
  Buffers: shared hit=154 read=13
  ->  Bitmap Index Scan on s_user_corp_id_idx  (cost=0.00..4.30 
rows=2 width=0) (actual time=0.635..0.635 rows=12353 loops=1)

    Index Cond: ((corp_id)::text = $1)
    Buffers: shared read=13
  Planning Time: 0.246 ms
  Execution Time: 4.490 ms
(13 rows)

This is because in the choose_custom_plan function, the generic plan is 
attempted after executing the custom plan five times.


     if (plansource->num_custom_plans < 5)
     return true;

The generic plan uses var_eq_non_const to estimate the average selectivity.

These are facts that many people already know. So a brief introduction.


Our users actually use such parameter conditions in very complex PREPARE 
statements. Once they use the generic plan for the sixth time. The 
execution time will change from 5 milliseconds to 5 minutes.



To improve this problem. The following approaches can be considered:

1. Determine whether data skew exists in the PREPARE statement parameter 
conditions based on the statistics.

However, there is no way to know if the user will use the skewed parameter.

2.When comparing the cost of the generic plan with the average cost of 
the custom plan(function choose_custom_plan). Consider whether the 
maximum cost of a custom plan executed is an order of magnitude 
different from the cost of a generic plan.
If the first five use a small selectivity condition. And after the sixth 
use a high selectivity condition. Problems will still arise.


3.Trace the execution time of the PREPARE statement. When an execution 
time is found to be much longer than the average execution time, the 
custom plan is forced to run.



Is there any better idea?

I tried to do a demo. A paramid member is added to Const; when a Const is 
generated from a Param, the Const is marked as coming from that Param. 
var_eq_const then checks whether the column compared against this parameter 
is skewed. If it is, choose_custom_plan returns true every time, forcing the 
custom plan to be used.
Only conditional expressions of the form var = param or param = var are 
supported for now.

If it makes sense, I will continue to improve this patch.
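
To make the idea above concrete, here is a rough sketch of what the check 
inside var_eq_const could look like. This is only an illustration, not the 
submitted patch; const_came_from_param and mark_plan_has_skewed_param are 
made-up names, while skewed_param_factor is the GUC the patch adds.

/* Illustrative sketch only, not the actual patch.  If the Const came from a
 * Param and any MCV of the compared column is more frequent than
 * skewed_param_factor, remember that so choose_custom_plan() keeps picking
 * the custom plan. */
if (const_came_from_param && HeapTupleIsValid(vardata->statsTuple))
{
    AttStatsSlot sslot;

    if (get_attstatsslot(&sslot, vardata->statsTuple,
                         STATISTIC_KIND_MCV, InvalidOid,
                         ATTSTATSSLOT_NUMBERS))
    {
        for (int i = 0; i < sslot.nnumbers; i++)
        {
            if (sslot.numbers[i] >= skewed_param_factor)
            {
                mark_plan_has_skewed_param();   /* made-up hook */
                break;
            }
        }
        free_attstatsslot(&sslot);
    }
}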


--
Quan Zongliang

diff --git a/src/backend/optimizer/util/clauses.c 
b/src/backend/optimizer/util/clauses.c
index 94eb56a1e7..3384520dc1 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2489,6 +2489,8 @@ eval_const_expressions_mutator(Node *node,

pval,

prm->isnull,

typByVal);
+   if (paramLI->paramFetch 
== NULL)
+   

Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-11-23 Thread Quan Zongliang



On 2023/11/24 03:39, Pavel Stehule wrote:



I modified the documentation a little bit - we don't need to extra 
propose SQL array syntax, I think.
I rewrote regress tests - we don't need to test unsupported 
functionality (related to RECORD).


- all tests passed


I wrote two examples of errors:
  user_id users.user_id%ROWTYPE[];
  user_id users.user_id%ROWTYPE ARRAY[4][3];

Fixed.


Regards

Pavel


 >
 > Regards
 >
 > Pavel
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 5977534a62..0eedf4a55d 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -766,6 +766,40 @@ SELECT merge_fields(t.*) FROM table1 t WHERE ... ;

   
 
+  
+   Arrays of Copying Types and Row Types
+
+
+name variable%TYPE[];
+name table_name%ROWTYPE[];
+
+
+   
+Arrays of Copying Types and Row Types is defined by appending square 
brackets
+([]) to the %TYPE or 
%ROWTYPE.
+Its definition is similar to PostgreSQL's arrays described in .
+For example:
+
+user_id users.user_id%TYPE[];
+user_id users%ROWTYPE[];
+
+The syntax allows the exact size of arrays to be specified. However, the 
current
+implementation ignores any supplied array size limits, i.e., the behavior 
is the
+same as for arrays of unspecified length.
+   
+
+   
+An alternative syntax, which conforms to the SQL standard by using the 
keyword
+ARRAY, can be used for one-dimensional or 
multi-dimensional
+arrays too:
+
+user_id users.user_id%TYPE ARRAY;
+user_id users%ROWTYPE ARRAY[4][3];
+
+As before, however, PostgreSQL does not enforce the size restriction in 
any case.
+   
+  
+
   
Record Types
 
@@ -794,6 +828,12 @@ SELECT merge_fields(t.*) FROM table1 t WHERE ... ;
 calling query is parsed, whereas a record variable can change its row
 structure on-the-fly.

+
+   
+The RECORD cannot be used for declaration of variable
+of an array type. "Copying Types" as shown in 
+are not supported too.
+   
   
 
   
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index a341cde2c1..a9cb15df6d 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -2095,6 +2095,29 @@ plpgsql_build_datatype(Oid typeOid, int32 typmod,
return typ;
 }
 
+/*
+ * Returns an array for type specified as argument.
+ */
+PLpgSQL_type *
+plpgsql_datatype_arrayof(PLpgSQL_type *dtype)
+{
+   Oid array_typeid;
+
+   if (dtype->typisarray)
+   return dtype;
+
+   array_typeid = get_array_type(dtype->typoid);
+
+   if (!OidIsValid(array_typeid))
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("could not find array type for data 
type \"%s\"",
+   
format_type_be(dtype->typoid;
+
+   return plpgsql_build_datatype(array_typeid, dtype->atttypmod,
+ 
dtype->collation, NULL);
+}
+
 /*
  * Utility subroutine to make a PLpgSQL_type struct given a pg_type entry
  * and additional details (see comments for plpgsql_build_datatype).
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 6a09bfdd67..aa9103cf0e 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -2789,7 +2789,7 @@ read_datatype(int tok)
StringInfoData ds;
char   *type_name;
int startlocation;
-   PLpgSQL_type *result;
+   PLpgSQL_type *result = NULL;
int parenlevel = 0;
 
/* Should only be called while parsing DECLARE sections */
@@ -2817,15 +2817,11 @@ read_datatype(int tok)
   K_TYPE, "type"))
{
result = plpgsql_parse_wordtype(dtname);
-   if (result)
-   return result;
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
{
result = plpgsql_parse_wordrowtype(dtname);
-   if (result)
-   return result;
}
}
}
@@ -2841,15 +2837,11 @@ read_datatype(int tok)
   K_TYPE, "type"))
{
result = plpgsql_parse_wordtype(dtname);
-   if (result)
-   return result;
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
   

Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-11-23 Thread Quan Zongliang



On 2023/11/20 17:33, Pavel Stehule wrote:




I did some deeper check:

- I don't like too much parser's modification (I am sending alternative 
own implementation) - the SQL parser allows richer syntax, and for full 
functionality is only few lines more

Agree.



- original patch doesn't solve %ROWTYPE

(2023-11-20 10:04:36) postgres=# select * from foo;
┌────┬────┐
│ a  │ b  │
╞════╪════╡
│ 10 │ 20 │
│ 30 │ 40 │
└────┴────┘
(2 rows)

(2023-11-20 10:08:29) postgres=# do $$
declare v foo%rowtype[];
begin
   v := array(select row(a,b) from foo);
   raise notice '%', v;
end;
$$;
NOTICE:  {"(10,20)","(30,40)"}
DO


two little fixes
1. spelling mistake
  ARRAY [ icons ] --> ARRAY [ iconst ]
2. code bug
  if (!OidIsValid(dtype->typoid)) --> if (!OidIsValid(array_typeid))



- original patch doesn't solve type RECORD
the error message should be more intuitive, although the arrays of 
record type can be supported, but it probably needs bigger research.


(2023-11-20 10:10:34) postgres=# do $$
declare r record; v r%type[];
begin
   v := array(select row(a,b) from foo);
   raise notice '%', v;
end;
$$;
ERROR:  syntax error at or near "%"
LINE 2: declare r record; v r%type[];
                              ^
CONTEXT:  invalid type name "r%type[]"


Currently only scalar variables are supported.
This error is consistent with the r%type error. And record arrays are 
not currently supported.
Support for r%type should be considered first. For now, let r%type[] 
report the same error as record[].

I prefer to implement it with a new patch.


- missing documentation
My English is not good. I wrote it down, please correct it. Add a note 
in the "Record Types" documentation that arrays and "Copying Types" are 
not supported yet.




- I don't like using the word "partitioned" in the regress test name 
"partitioned_table". It is confusing

fixed



Regards

Paveldiff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 5977534a62..ad02c9f561 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -766,6 +766,30 @@ SELECT merge_fields(t.*) FROM table1 t WHERE ... ;

   
 
+  
+   Arrays of Copying Types and Row Types
+
+
+name variable%TYPE[];
+name table_name%ROWTYPE[];
+
+
+   
+Arrays of Copying Types and Row Types is defined by appending square 
brackets
+([]) to the %TYPE or 
%ROWTYPE.
+Its definition is similar to PostgreSQL's array types. It is possible to
+specify the exact size of the array. The keyword ARRAY can also be used.
+For example:
+
+user_id users.user_id%TYPE[4][2];
+user_id users.user_id%ROWTYPE ARRAY[4][];
+
+However, the current implementation ignores any supplied array size 
limits, i.e.,
+the behavior is the same as for arrays of unspecified length.
+The current implementation does not enforce the declared number of 
dimensions either.
+   
+  
+
   
Record Types
 
@@ -794,6 +818,11 @@ SELECT merge_fields(t.*) FROM table1 t WHERE ... ;
 calling query is parsed, whereas a record variable can change its row
 structure on-the-fly.

+
+   
+RECORD does not support being defined as an array.
+"Copying Types" as shown in  is 
also not supported.
+   
   
 
   
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index a341cde2c1..a9cb15df6d 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -2095,6 +2095,29 @@ plpgsql_build_datatype(Oid typeOid, int32 typmod,
return typ;
 }
 
+/*
+ * Returns an array for type specified as argument.
+ */
+PLpgSQL_type *
+plpgsql_datatype_arrayof(PLpgSQL_type *dtype)
+{
+   Oid array_typeid;
+
+   if (dtype->typisarray)
+   return dtype;
+
+   array_typeid = get_array_type(dtype->typoid);
+
+   if (!OidIsValid(array_typeid))
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("could not find array type for data 
type \"%s\"",
+   
format_type_be(dtype->typoid;
+
+   return plpgsql_build_datatype(array_typeid, dtype->atttypmod,
+ 
dtype->collation, NULL);
+}
+
 /*
  * Utility subroutine to make a PLpgSQL_type struct given a pg_type entry
  * and additional details (see comments for plpgsql_build_datatype).
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 6a09bfdd67..7778bffefc 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -2789,7 +2789,7 @@ read_datatype(int tok)
StringInfoData ds;
char   *type_name;
int startlocation;
-   PLpgSQL_type *result;
+   PLpgSQL_type *result = NULL;
int parenlevel = 0;
 
/* Should only be called while parsing DECLARE sections */
@@ -2817,15 +2817,11 @@ read_datatype(int tok)
 

Improvement discussion of custom and generic plans

2023-11-03 Thread Quan Zongliang

Hi

We have one such problem. A table field has skewed data. Statistics:
n_distinct | -0.4481973
most_common_vals   | {5f006ca25b52ed78e457b150ee95a30c}
most_common_freqs  | {0.5518474}


Data generation:

CREATE TABLE s_user (
 user_id varchar(32) NOT NULL,
 corp_id varchar(32), 


 status int NOT NULL
 );

insert into s_user
select md5('user_id ' || a), md5('corp_id ' || a),
 case random()<0.877675 when true then 1 else -1 end
  FROM generate_series(1,10031) a;

insert into s_user
select md5('user_id ' || a), md5('corp_id 10032'),
 case random()<0.877675 when true then 1 else -1 end
  FROM generate_series(10031,22383) a;

CREATE INDEX s_user_corp_id_idx ON s_user USING btree (corp_id);

analyze s_user;


1. First, define a PREPARE statement
prepare stmt as select count(*) from s_user where status=1 and corp_id = $1;

2. Run it five times. The custom plan is chosen.
explain (analyze,buffers) execute stmt('5f006ca25b52ed78e457b150ee95a30c');

Here's the plan:
 Aggregate  (cost=639.84..639.85 rows=1 width=8) (actual 
time=4.653..4.654 rows=1 loops=1)

   Buffers: shared hit=277
   ->  Seq Scan on s_user  (cost=0.00..612.76 rows=10830 width=0) 
(actual time=1.402..3.747 rows=10836 loops=1)
 Filter: ((status = 1) AND ((corp_id)::text = 
'5f006ca25b52ed78e457b150ee95a30c'::text))

 Rows Removed by Filter: 11548
 Buffers: shared hit=277
 Planning Time: 0.100 ms
 Execution Time: 4.674 ms
(8 rows)

3. From the sixth execution, the generic plan is chosen.
We can see a huge deviation between the estimate and the actual value:
 Aggregate  (cost=11.83..11.84 rows=1 width=8) (actual 
time=4.424..4.425 rows=1 loops=1)

   Buffers: shared hit=154 read=13
   ->  Bitmap Heap Scan on s_user  (cost=4.30..11.82 rows=2 width=0) 
(actual time=0.664..3.371 rows=10836 loops=1)

 Recheck Cond: ((corp_id)::text = $1)
 Filter: (status = 1)
 Rows Removed by Filter: 1517
 Heap Blocks: exact=154
 Buffers: shared hit=154 read=13
 ->  Bitmap Index Scan on s_user_corp_id_idx  (cost=0.00..4.30 
rows=2 width=0) (actual time=0.635..0.635 rows=12353 loops=1)

   Index Cond: ((corp_id)::text = $1)
   Buffers: shared read=13
 Planning Time: 0.246 ms
 Execution Time: 4.490 ms
(13 rows)

This is because in the choose_custom_plan function, the generic plan is 
attempted after executing the custom plan five times.


if (plansource->num_custom_plans < 5)
return true;

The generic plan uses var_eq_non_const to estimate the average selectivity.
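
(As a rough cross-check of where that estimate comes from: with n_distinct = 
-0.4481973 on roughly 22,384 rows, var_eq_non_const assumes about 0.4481973 * 
22384, i.e. around 10,000 distinct values, so an equality condition is 
estimated to match about 1/10,000 of the table. That is the rows=2 seen in 
the bitmap scan above, while the skewed value actually matches 10,836 rows.)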

These are facts that many people already know. So a brief introduction.


Our users actually use such parameter conditions in very complex PREPARE 
statements. Once the generic plan is chosen at the sixth execution, the 
execution time changes from 5 milliseconds to 5 minutes.



To improve this problem, the following approaches can be considered:

1. Determine from the statistics whether data skew exists in the PREPARE 
statement's parameter conditions.

However, there is no way to know whether the user will actually use the skewed parameter.

2. When comparing the cost of the generic plan with the average cost of 
the custom plans (in the choose_custom_plan function), consider whether the 
maximum cost of any executed custom plan differs from the cost of the 
generic plan by an order of magnitude.
If the first five executions use a condition with small selectivity and the 
sixth uses one with high selectivity, problems will still arise.


3. Trace the execution time of the PREPARE statement. When an execution is 
found to take much longer than the average execution time, force the 
custom plan to be used.



Is there any better idea?

--
Quan Zongliang





Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-10-17 Thread Quan Zongliang




On 2023/10/17 12:15, Pavel Stehule wrote:



On Tue, 17 Oct 2023 at 3:30, Quan Zongliang wrote:




On 2023/10/16 20:05, Pavel Stehule wrote:
 >
 >
 > On Mon, 16 Oct 2023 at 13:56, Daniel Gustafsson wrote:
 >
 >      > On 16 Oct 2023, at 12:15, Quan Zongliang wrote:
 >
 >      > Implement TODO item:
 >      > PL/pgSQL
 >      > Incomplete item Allow handling of %TYPE arrays, e.g.
tab.col%TYPE[]
 >
 >     Cool!  While I haven't looked at the patch yet, I've wanted this
 >     myself many
 >     times in the past when working with large plpgsql codebases.
 >
 >      > As a first step, deal only with [], such as
 >      > xxx.yyy%TYPE[]
 >      > xxx%TYPE[]
 >      >
 >      > It can be extended to support multi-dimensional and complex
 >     syntax in the future.
 >
 >     Was this omitted due to complexity of implementation or for some
 >     other reason?
 >
Because of complexity.

 >
 > There is no reason for describing enhancement. The size and
dimensions
 > of postgresql arrays are dynamic, depends on the value, not on
 > declaration. Now, this information is ignored, and can be
compatibility
 > break to check and enforce this info.
 >
Yes. I don't think it's necessary.
If anyone needs it, we can continue to enhance it in the future.


I don't think it is possible to do it.  But there is another missing 
functionality, if I remember well. There is no possibility to declare 
variables for elements of array.

The way it's done now is more like laziness.

Is it possible to do that?
If the parser encounters %TYPE[][], it can be parsed, and then 
parse_datatype can do the rest.


For example, for partitioned_table.a%TYPE[][100][]: parse the type 
name (int4) of partitioned_table.a%TYPE and append the trailing [][100][]. 
Passing "int4[][100][]" to parse_datatype gives us the array 
definition we want.


Isn't this code a little ugly?



I propose syntax xxx.yyy%ELEMENTTYPE and xxx%ELEMENTTYPE

What do you think about it?
I can't find any other relational database with such an implementation, but 
it seems like a good idea. It would make writing stored procedures more 
convenient.




Regards

Pavel


 >
 >     --
 >     Daniel Gustafsson
 >
 >
 >







Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-10-16 Thread Quan Zongliang




On 2023/10/16 20:05, Pavel Stehule wrote:



On Mon, 16 Oct 2023 at 13:56, Daniel Gustafsson wrote:


 > On 16 Oct 2023, at 12:15, Quan Zongliang wrote:

 > Implement TODO item:
 > PL/pgSQL
 > Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

Cool!  While I haven't looked at the patch yet, I've wanted this
myself many
times in the past when working with large plpgsql codebases.

 > As a first step, deal only with [], such as
 > xxx.yyy%TYPE[]
 > xxx%TYPE[]
 >
 > It can be extended to support multi-dimensional and complex
syntax in the future.

Was this omitted due to complexity of implementation or for some
other reason?


Because of complexity.



There is no reason for describing enhancement. The size and dimensions 
of postgresql arrays are dynamic, depends on the value, not on 
declaration. Now, this information is ignored, and can be compatibility 
break to check and enforce this info.



Yes. I don't think it's necessary.
If anyone needs it, we can continue to enhance it in the future.



--
Daniel Gustafsson









Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-10-16 Thread Quan Zongliang



The error message still seems ambiguous:
   do not support multi-dimensional arrays in PL/pgSQL

Wouldn't this be better?
   do not support multi-dimensional %TYPE arrays in PL/pgSQL


On 2023/10/17 09:19, Quan Zongliang wrote:


Attached new patch
   More explicit error messages based on type.


On 2023/10/16 18:15, Quan Zongliang wrote:



Implement TODO item:
PL/pgSQL
Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

As a first step, deal only with [], such as
xxx.yyy%TYPE[]
xxx%TYPE[]

It can be extended to support multi-dimensional and complex syntax in 
the future.



--
Quan Zongliang






Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-10-16 Thread Quan Zongliang


Attached new patch
  More explicit error messages based on type.


On 2023/10/16 18:15, Quan Zongliang wrote:



Implement TODO item:
PL/pgSQL
Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

As a first step, deal only with [], such as
xxx.yyy%TYPE[]
xxx%TYPE[]

It can be extended to support multi-dimensional and complex syntax in 
the future.



--
Quan Zongliang

diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 6a09bfdd67..c6377dcf2c 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -23,6 +23,7 @@
 #include "parser/scanner.h"
 #include "parser/scansup.h"
 #include "utils/builtins.h"
+#include "utils/syscache.h"
 
 #include "plpgsql.h"
 
@@ -76,6 +77,7 @@ staticPLpgSQL_expr*read_sql_expression2(int 
until, int until2,

  int *endtoken);
 static PLpgSQL_expr*read_sql_stmt(void);
 static PLpgSQL_type*read_datatype(int tok);
+static PLpgSQL_type*read_datatype_array(PLpgSQL_type *elem_typ);
 static PLpgSQL_stmt*make_execsql_stmt(int firsttoken, int location);
 static PLpgSQL_stmt_fetch *read_fetch_direction(void);
 static void complete_direction(PLpgSQL_stmt_fetch *fetch,
@@ -2783,6 +2785,74 @@ read_sql_construct(int until,
return expr;
 }
 
+static PLpgSQL_type *
+read_datatype_array(PLpgSQL_type *elem_typ)
+{
+   int tok;
+   HeapTuple   type_tup = NULL;
+   Form_pg_typetype_frm;
+
+   Assert(elem_typ);
+   if (!OidIsValid(elem_typ->typoid))
+   return elem_typ;
+
+   tok = yylex();
+   /* Next token is not square bracket. */
+   if (tok != '[')
+   {
+   plpgsql_push_back_token(tok);
+
+   return elem_typ;
+   }
+
+   tok = yylex();
+   /* For now, deal only with []. */
+   if (tok != ']')
+   {
+   plpgsql_push_back_token(tok);
+   plpgsql_push_back_token('[');
+
+   return elem_typ;
+   }
+
+   type_tup = SearchSysCache1(TYPEOID,
+ 
ObjectIdGetDatum(elem_typ->typoid));
+
+   /* should not happen. */
+   if (!HeapTupleIsValid(type_tup))
+   elog(ERROR, "cache lookup failed for type %u", 
elem_typ->typoid);
+
+   type_frm = (Form_pg_type) GETSTRUCT(type_tup);
+
+   if (OidIsValid(type_frm->typarray))
+   {
+   Oid arrtyp_oid = type_frm->typarray;
+
+   ReleaseSysCache(type_tup);
+
+   return plpgsql_build_datatype(arrtyp_oid,
+   elem_typ->atttypmod,
+   elem_typ->collation,
+   NULL);
+   }
+
+   if (type_frm->typcategory != TYPCATEGORY_ARRAY)
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+   errmsg("type \"%s[]\" does not exist",
+   NameStr(type_frm->typname)),
+   parser_errposition(yylloc)));
+
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+   errmsg("do not support multi-dimensional arrays 
in PL/pgSQL"),
+   parser_errposition(yylloc)));
+
+   ReleaseSysCache(type_tup);
+
+   return elem_typ;
+}
+
 static PLpgSQL_type *
 read_datatype(int tok)
 {
@@ -2818,7 +2888,9 @@ read_datatype(int tok)
{
result = plpgsql_parse_wordtype(dtname);
if (result)
-   return result;
+   {
+   return read_datatype_array(result);
+   }
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
@@ -2842,7 +2914,9 @@ read_datatype(int tok)
{
result = plpgsql_parse_wordtype(dtname);
if (result)
-   return result;
+   {
+   return read_datatype_array(result);
+   }
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
@@ -2866,7 +2940,9 @@ read_datatype(int tok)
  

PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

2023-10-16 Thread Quan Zongliang



Implement TODO item:
PL/pgSQL
Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]

As a first step, deal only with [], such as
xxx.yyy%TYPE[]
xxx%TYPE[]

It can be extended to support multi-dimensional and complex syntax in 
the future.



--
Quan Zongliang

diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 6a09bfdd67..3ff12b7af9 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -23,6 +23,7 @@
 #include "parser/scanner.h"
 #include "parser/scansup.h"
 #include "utils/builtins.h"
+#include "utils/syscache.h"
 
 #include "plpgsql.h"
 
@@ -76,6 +77,7 @@ staticPLpgSQL_expr*read_sql_expression2(int 
until, int until2,

  int *endtoken);
 static PLpgSQL_expr*read_sql_stmt(void);
 static PLpgSQL_type*read_datatype(int tok);
+static PLpgSQL_type*read_datatype_array(PLpgSQL_type *elem_typ);
 static PLpgSQL_stmt*make_execsql_stmt(int firsttoken, int location);
 static PLpgSQL_stmt_fetch *read_fetch_direction(void);
 static void complete_direction(PLpgSQL_stmt_fetch *fetch,
@@ -2783,6 +2785,55 @@ read_sql_construct(int until,
return expr;
 }
 
+static PLpgSQL_type *
+read_datatype_array(PLpgSQL_type *elem_typ)
+{
+   int tok;
+   HeapTuple   type_tup = NULL;
+   Form_pg_typetype_frm;
+   Oid arrtyp_oid;
+
+   Assert(elem_typ);
+   if (!OidIsValid(elem_typ->typoid))
+   return elem_typ;
+
+   tok = yylex();
+   /* Next token is not square bracket. */
+   if (tok != '[')
+   {
+   plpgsql_push_back_token(tok);
+
+   return elem_typ;
+   }
+
+   tok = yylex();
+   /* For now, deal only with []. */
+   if (tok != ']')
+   {
+   plpgsql_push_back_token('[');
+   plpgsql_push_back_token(tok);
+
+   return elem_typ;
+   }
+
+   type_tup = SearchSysCache1(TYPEOID,
+ 
ObjectIdGetDatum(elem_typ->typoid));
+   if (!HeapTupleIsValid(type_tup))
+   return elem_typ;
+
+   type_frm = (Form_pg_type) GETSTRUCT(type_tup);
+   arrtyp_oid = type_frm->typarray;
+   ReleaseSysCache(type_tup);
+
+   if (OidIsValid(arrtyp_oid))
+   return plpgsql_build_datatype(arrtyp_oid,
+   elem_typ->atttypmod,
+   elem_typ->collation,
+   NULL);
+   else
+   return elem_typ;
+}
+
 static PLpgSQL_type *
 read_datatype(int tok)
 {
@@ -2818,7 +2869,9 @@ read_datatype(int tok)
{
result = plpgsql_parse_wordtype(dtname);
if (result)
-   return result;
+   {
+   return read_datatype_array(result);
+   }
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
@@ -2842,7 +2895,9 @@ read_datatype(int tok)
{
result = plpgsql_parse_wordtype(dtname);
if (result)
-   return result;
+   {
+   return read_datatype_array(result);
+   }
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
@@ -2866,7 +2921,9 @@ read_datatype(int tok)
{
result = plpgsql_parse_cwordtype(dtnames);
if (result)
-   return result;
+   {
+   return read_datatype_array(result);
+   }
}
else if (tok_is_keyword(tok, ,

K_ROWTYPE, "rowtype"))
diff --git a/src/test/regress/expected/plpgsql.out 
b/src/test/regress/expected/plpgsql.out
index 272f5d2111..8db28c1122 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -5814,6 +5814,31 @@ SELECT * FROM list_partitioned_table() AS t;
  2
 (2 rows)
 
+CREATE OR REPLACE FUNCTION array_partitioned_table()
+RETURNS SETOF partitioned_table.a%TYPE AS $$
+DECLARE
+i int;
+row partitioned_table%ROWTYPE;
+a_val 

Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2023-10-07 Thread Quan Zongliang



Implement TODO item:

Fix log_line_prefix to display the transaction id (%x) for statements 
not in a transaction block

Currently it displays zero.


Check whether the XID has been assigned at the point where the statement 
log is currently printed; if it has not, no statement log is output there.
Then, just before finish_xact_command, if the statement has not yet been 
logged, log it there, where the XID is available.


DML that does not actually modify any data still does not get an XID.

[32718][788] LOG:  statement: insert into t1 values(1,0,'');
[32718][789] LOG:  statement: delete from t1;
[32718][0] LOG:  statement: delete from t1;
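
(The prefix above is presumably produced by something along the lines of 
log_line_prefix = '[%p][%x] ', i.e. backend PID followed by the XID. The 
second DELETE finds no rows to remove, so no XID is ever assigned and %x 
stays 0, which matches the point above.)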


--
Quan Zongliang

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21b9763183..b73676ab9d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1063,8 +1063,9 @@ exec_simple_query(const char *query_string)
 */
parsetree_list = pg_parse_query(query_string);
 
-   /* Log immediately if dictated by log_statement */
-   if (check_log_statement(parsetree_list))
+   /* Log immediately if dictated by log_statement and XID assigned. */
+   if (GetTopTransactionIdIfAny() != InvalidTransactionId &&
+   check_log_statement(parsetree_list))
{
ereport(LOG,
(errmsg("statement: %s", query_string),
@@ -1281,6 +1282,16 @@ exec_simple_query(const char *query_string)
 
PortalDrop(portal, false);
 
+   /* Log if dictated by log_statement and has not been logged. */
+   if (!was_logged && check_log_statement(parsetree_list))
+   {
+   ereport(LOG,
+   (errmsg("statement: %s", query_string),
+   errhidestmt(true),
+   errdetail_execute(parsetree_list)));
+   was_logged = true;
+   }
+
if (lnext(parsetree_list, parsetree_item) == NULL)
{
/*


Re: Improving the heapgetpage function improves performance in common scenarios

2023-09-06 Thread Quan Zongliang




On 2023/9/6 17:07, John Naylor wrote:


On Wed, Sep 6, 2023 at 2:50 PM Quan Zongliang wrote:


 > If not optimized(--enable-debug CFLAGS='-O0'), there is a clear
 > difference. When the compiler does the optimization, the performance is
 > similar. I think the compiler does a good enough optimization with
 > "pg_attribute_always_inline" and the last two constant parameters when
 > calling heapgetpage_collect.

So as we might expect, more specialization (Andres' patch) has no 
apparent downsides in this workload. (While I'm not sure of the point of 
testing at -O0, I think we can conclude that less-bright compilers will 
show some improvement with either patch.)


If you agree, do you want to withdraw your patch from the commit fest?


Ok.

--
John Naylor
EDB: http://www.enterprisedb.com






Re: Improving the heapgetpage function improves performance in common scenarios

2023-09-06 Thread Quan Zongliang




On 2023/9/6 15:50, Quan Zongliang wrote:



On 2023/9/5 18:46, John Naylor wrote:


On Tue, Sep 5, 2023 at 4:27 PM Quan Zongliang wrote:


 > Here's how I test it
 >     EXPLAIN ANALYZE SELECT * FROM orders;

Note that EXPLAIN ANALYZE has quite a bit of overhead, so it's not 
good for these kinds of tests.


 > I'll also try Andres Freund's test method next.

Commit f691f5b80a85 from today removes another source of overhead in 
this function, so I suggest testing against that, if you wish to test 
again.



Test with the latest code of the master branch, see the attached results.

If not optimized(--enable-debug CFLAGS='-O0'), there is a clear 
difference. When the compiler does the optimization, the performance is 
similar. I think the compiler does a good enough optimization with 
"pg_attribute_always_inline" and the last two constant parameters when 
calling heapgetpage_collect.


One note: the first execution time in each attachment is not included 
in the average.





--
John Naylor
EDB: http://www.enterprisedb.com






Re: Improving the heapgetpage function improves performance in common scenarios

2023-09-06 Thread Quan Zongliang



On 2023/9/5 18:46, John Naylor wrote:


On Tue, Sep 5, 2023 at 4:27 PM Quan Zongliang wrote:


 > Here's how I test it
 >     EXPLAIN ANALYZE SELECT * FROM orders;

Note that EXPLAIN ANALYZE has quite a bit of overhead, so it's not good 
for these kinds of tests.


 > I'll also try Andres Freund's test method next.

Commit f691f5b80a85 from today removes another source of overhead in 
this function, so I suggest testing against that, if you wish to test again.



Test with the latest code of the master branch, see the attached results.

If not optimized(--enable-debug CFLAGS='-O0'), there is a clear 
difference. When the compiler does the optimization, the performance is 
similar. I think the compiler does a good enough optimization with 
"pg_attribute_always_inline" and the last two constant parameters when 
calling heapgetpage_collect.




--
John Naylor
EDB: http://www.enterprisedb.com

Re: Improving the heapgetpage function improves performance in common scenarios

2023-09-05 Thread Quan Zongliang




On 2023/9/5 16:15, John Naylor wrote:


On Thu, Aug 24, 2023 at 5:55 PM Quan Zongliang <quanzongli...@yeah.net> wrote:


 > In the function heapgetpage. If a table is not updated very frequently.
 > Many actions in tuple loops are superfluous. For all_visible pages,
 > loctup does not need to be assigned, nor does the "valid" variable.
 > CheckForSerializableConflictOutNeeded from
 > HeapCheckForSerializableConflictOut function, it only need to inspect at

Thanks for submitting! A few weeks before this, there was another 
proposal, which specializes code for all paths, not just one. That patch 
also does so without duplicating the loop:


https://www.postgresql.org/message-id/20230716015656.xjvemfbp5fysj...@awork3.anarazel.de


Nice patch. I'm sorry I didn't notice it before.


 > the beginning of the cycle only once. Using vtune you can clearly see
 > the result (attached heapgetpage.jpg).
 >
 > So by splitting the loop logic into two parts, the vtune results show
 > significant improvement (attached heapgetpage-allvis.jpg).

For future reference, it's not clear at all from the screenshots what 
the improvement will be for the user. In the above thread, the author 
shares testing methodology as well as timing measurements. This is 
useful for reproducibilty, as well as convincing others that the change 
is important.



Here's how I tested it:
   EXPLAIN ANALYZE SELECT * FROM orders;
Maybe the test wasn't good enough. Although the modified result looks 
better, it fluctuates a lot and is hard to compare, so I used the vtune 
results instead.


My patch mainly eliminates:
1. Assignment of the "loctup" struct variable (in vtune you can see that 
these 4 lines have a significant overhead: 0.4, 1.0, 0.2, 0.4).

2. Assignment of the "valid" variable (overhead 0.6).
3. The HeapCheckForSerializableConflictOut function call (overhead 0.6).

The overhead is not identical from test to test, but in all of them it is 
too obvious to ignore. The screenshots are mainly meant to show the three 
improvements mentioned above.
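To make the idea concrete, here is a minimal, self-contained sketch of the 
split loop. The types and names are simplified stand-ins, not the real 
heapam structures, and this is not the attached patch:

#include <stdio.h>
#include <stdbool.h>

typedef struct DemoTuple
{
	int		lineoff;	/* stand-in for the item offset */
	bool	visible;	/* stand-in for the per-tuple visibility result */
} DemoTuple;

/*
 * Collect offsets of visible tuples from one page.  When the page is known
 * to be all-visible, the per-tuple work (filling a scratch tuple, the
 * "valid" flag, the serializable-conflict check) is hoisted out entirely,
 * which is the point of splitting the loop.
 */
static int
collect_visible_offsets(const DemoTuple *tuples, int ntuples,
						bool all_visible, int *offsets)
{
	int			n = 0;
	int			i;

	if (all_visible)
	{
		/* fast path: no per-tuple checks at all */
		for (i = 0; i < ntuples; i++)
			offsets[n++] = tuples[i].lineoff;
	}
	else
	{
		/* general path: keep the per-tuple checks inside the loop */
		for (i = 0; i < ntuples; i++)
		{
			if (tuples[i].visible)
				offsets[n++] = tuples[i].lineoff;
		}
	}

	return n;
}

int
main(void)
{
	DemoTuple	page[] = {{1, true}, {2, false}, {3, true}};
	int			offsets[3];
	int			n = collect_visible_offsets(page, 3, false, offsets);

	printf("%d visible tuples\n", n);
	return 0;
}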


I'll also try Andres Freund's test method next.


--
John Naylor
EDB: http://www.enterprisedb.com






Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Quan Zongliang




On 2023/6/17 06:46, Tom Lane wrote:

Quan Zongliang  writes:

Perhaps we should discard this (dups cnt > 1) restriction?


That's not going to happen on the basis of one test case that you
haven't even shown us.  The implications of doing it are very unclear.
In particular, I seem to recall that there are bits of logic that
depend on the assumption that MCV entries always represent more than
one row.  The nmultiple calculation Tomas referred to may be failing
because of that, but I'm worried about there being other places.



The statistics for the other table look like this:
stadistinct | 6
stanumbers1 | {0.50096667,0.49736667,0.0012}
stavalues1  | {v22,v23,v5}

The values that appear twice in the small table (v1 and v2) do not 
appear here. The true value of stadistinct is 18 instead of 6 (three 
values in the small table do not appear here).


When calculating the selectivity:
if (nd2 > sslot2->nvalues)
  totalsel1 += unmatchfreq1 * otherfreq2 / (nd2 - sslot2->nvalues);

totalsel1 = 0
nd2 = 21
sslot2->nvalues = 2
unmatchfreq1 = 0.0002016420476
otherfreq2 = 0.82608695328235626

result: totalsel1 = 0.043473913749706022
rows = 0.043473913749706022 * 23 * 2,000,000 = 1999800



Basically, you're proposing a rather fundamental change in the rules
by which Postgres has gathered statistics for decades.  You need to
bring some pretty substantial evidence to support that.  The burden
of proof is on you, not on the status quo.

regards, tom lane






Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Quan Zongliang




On 2023/6/16 23:39, Tomas Vondra wrote:



On 6/16/23 11:25, Quan Zongliang wrote:


We have a small table with only 23 rows and 21 values.

The resulting MCV and histogram is as follows
stanumbers1 | {0.08695652,0.08695652}
stavalues1  | {v1,v2}
stavalues2  |
{v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18,v19,v20,v21}

An incorrect number of rows was estimated when HashJoin was done with
another large table (about 2 million rows).

Hash Join  (cost=1.52..92414.61 rows=2035023 width=0) (actual
time=1.943..1528.983 rows=3902 loops=1)



That's interesting. I wonder how come the estimate gets this bad simply
by skipping values entries with a single row in the sample, which means
we know the per-value selectivity pretty well.

I guess the explanation has to be something strange happening when
estimating the join condition selectivity, where we combine MCVs from
both sides of the join (which has to be happening here, otherwise it
would not matter what gets to the MCV).

It'd be interesting to know what's in the other MCV, and what are the
other statistics for the attributes (ndistinct etc.).

Or even better, a reproducer SQL script that builds two tables and then
joins them.

The other table is severely skewed: most of its rows cannot join with the 
small table. This special case causes the inaccurate cost estimate.



The reason is that the MCV of the small table excludes values with rows
of 1. Put them in the MCV in the statistics to get the correct result.

Using the conservative samplerows <= attstattarget doesn't completely
solve this problem. It can solve this case.

After modification we get statistics without histogram:
stanumbers1 | {0.08695652,0.08695652,0.04347826,0.04347826, ... }
stavalues1  | {v,v2, ... }

And we have the right estimates:
Hash Join  (cost=1.52..72100.69 rows=3631 width=0) (actual
time=1.447..1268.385 rows=3902 loops=1)



I'm not against building a "complete" MCV, but I guess the case where
(samplerows <= num_mcv) is pretty rare. Why shouldn't we make the MCV
complete whenever we decide (ndistinct <= num_mcv)?

That would need to happen later, because we don't have the ndistinct
estimate yet at this point - we'd have to do the loop a bit later (or
likely twice).

FWIW the patch breaks the calculation of nmultiple (and thus likely the
ndistinct estimate).

It's not just small tables. If a column's values are nearly unique, it 
causes the same problem, because we exclude values that occur only once. 
samplerows <= num_mcv only solves one scenario.

Perhaps we should discard this (dups cnt > 1) restriction?



regards







Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Quan Zongliang


We have a small table with only 23 rows and 21 values.

The resulting MCV and histogram are as follows
stanumbers1 | {0.08695652,0.08695652}
stavalues1  | {v1,v2}
stavalues2  | 
{v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18,v19,v20,v21}


An incorrect number of rows was estimated when HashJoin was done with 
another large table (about 2 million rows).


Hash Join  (cost=1.52..92414.61 rows=2035023 width=0) (actual 
time=1.943..1528.983 rows=3902 loops=1)


The reason is that the MCV of the small table excludes values that appear 
in only one row. Putting them into the MCV statistics gives the correct 
result.


Using the conservative samplerows <= attstattarget doesn't completely 
solve this problem. It can solve this case.


After modification we get statistics without histogram:
stanumbers1 | {0.08695652,0.08695652,0.04347826,0.04347826, ... }
stavalues1  | {v,v2, ... }

And we have the right estimates:
Hash Join  (cost=1.52..72100.69 rows=3631 width=0) (actual 
time=1.447..1268.385 rows=3902 loops=1)



Regards,

--
Quan Zongliang
Beijing Vastdata Co., LTD

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 52ef462dba..08ea4243f5 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -2518,7 +2518,7 @@ compute_scalar_stats(VacAttrStatsP stats,
{
/* Reached end of duplicates of this value */
ndistinct++;
-   if (dups_cnt > 1)
+   if (dups_cnt > 1 || samplerows <= num_mcv)
{
nmultiple++;
if (track_cnt < num_mcv ||


Re: Why enable_hashjoin Completely disables HashJoin

2023-04-04 Thread Quan Zongliang




On 2023/4/3 19:44, Tomas Vondra wrote:

On 4/3/23 12:23, Quan Zongliang wrote:

Hi,

I found that the enable_hashjoin disables HashJoin completely.
It's in the function add_paths_to_joinrel:

if (enable_hashjoin || jointype == JOIN_FULL)
 hash_inner_and_outer(root, joinrel, outerrel, innerrel,
     jointype, );

Instead, it should add a disable cost to the cost calculation of
hashjoin. And now final_cost_hashjoin does the same thing:

if (!enable_hashjoin)
 startup_cost += disable_cost;


enable_mergejoin has the same problem.

Test case:

CREATE TABLE t_score_01(
s_id int,
s_score int,
s_course char(8),
c_id int);

CREATE TABLE t_student_01(
s_id int,
s_name char(8));

insert into t_score_01 values(
generate_series(1, 100), random()*100, 'course', generate_series(1,
100));

insert into t_student_01 values(generate_series(1, 100), 'name');

analyze t_score_01;
analyze t_student_01;

SET enable_hashjoin TO off;
SET enable_nestloop TO off;
SET enable_mergejoin TO off;

explain select count(*)
from t_student_01 a join t_score_01 b on a.s_id=b.s_id;

After disabling all three, the HashJoin path should still be chosen.



It's not clear to me why that behavior would be desirable? Why is this
an issue you need so solve?

Because someone noticed that when he set enable_hashjoin, 
enable_mergejoin and enable_nestloop to off, the statement seemed to get 
stuck (actually because it chose the NestedLoop path, which took a very 
long time to run).
If enable_hashjoin and enable_mergejoin disable generating those two 
paths, then enable_nestloop should do the same thing, but it doesn't.



AFAIK the reason why some paths are actually disabled (not built at all)
while others are only penalized by adding disable_cost is that we need
to end up with at least one way to execute the query. So we pick a path
that we know is possible (e.g. seqscan) and hard-disable other paths.
But the always-possible path is only soft-disabled by disable_cost.

For joins, we do the same thing. The hash/merge joins may not be
possible, because the data types may not have hash/sort operators, etc.
Nestloop is always possible. So we soft-disable nestloop but
hard-disable hash/merge joins.

I doubt we want to change this behavior, unless there's a good reason to
do that ...
It doesn't have to change, because selecting NestedLoop doesn't really 
get stuck either; it just takes too long to run.


I will change the patch status to Withdrawn.



regards







Why enable_hashjoin Completely disables HashJoin

2023-04-03 Thread Quan Zongliang

Hi,

I found that the enable_hashjoin disables HashJoin completely.
It's in the function add_paths_to_joinrel:

if (enable_hashjoin || jointype == JOIN_FULL)
hash_inner_and_outer(root, joinrel, outerrel, innerrel,
jointype, );

Instead, it should add a disable cost to the cost calculation of 
hashjoin. And now final_cost_hashjoin does the same thing:


if (!enable_hashjoin)
startup_cost += disable_cost;


enable_mergejoin has the same problem.

Test case:

CREATE TABLE t_score_01(
s_id int,
s_score int,
s_course char(8),
c_id int);

CREATE TABLE t_student_01(
s_id int,
s_name char(8));

insert into t_score_01 values(
generate_series(1, 100), random()*100, 'course', generate_series(1, 
100));


insert into t_student_01 values(generate_series(1, 100), 'name');

analyze t_score_01;
analyze t_student_01;

SET enable_hashjoin TO off;
SET enable_nestloop TO off;
SET enable_mergejoin TO off;

explain select count(*)
from t_student_01 a join t_score_01 b on a.s_id=b.s_id;

After disabling all three, the HashJoin path should still be chosen.

Attached is the patch file.

--
Quan Zongliang
Vastdata

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index bd51e4f972..4b5ab7f7ff 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -206,14 +206,13 @@ add_paths_to_joinrel(PlannerInfo *root,
 * way of implementing a full outer join, so override enable_mergejoin 
if
 * it's a full join.
 */
-   if (enable_mergejoin || jointype == JOIN_FULL)
-   extra.mergeclause_list = select_mergejoin_clauses(root,
-   
  joinrel,
-   
  outerrel,
-   
  innerrel,
-   
  restrictlist,
-   
  jointype,
-   
  _allowed);
+   extra.mergeclause_list = select_mergejoin_clauses(root,
+   
  joinrel,
+   
  outerrel,
+   
  innerrel,
+   
  restrictlist,
+   
  jointype,
+   
  _allowed);
 
/*
 * If it's SEMI, ANTI, or inner_unique join, compute correction factors
@@ -316,9 +315,8 @@ add_paths_to_joinrel(PlannerInfo *root,
 * before being joined.  As above, disregard enable_hashjoin for full
 * joins, because there may be no other alternative.
 */
-   if (enable_hashjoin || jointype == JOIN_FULL)
-   hash_inner_and_outer(root, joinrel, outerrel, innerrel,
-jointype, );
+   hash_inner_and_outer(root, joinrel, outerrel, innerrel,
+jointype, );
 
/*
 * 5. If inner and outer relations are foreign tables (or joins) 
belonging


Re: Bug: When user-defined AM is used, the index path cannot be selected correctly

2022-08-17 Thread Quan Zongliang



On 2022/8/17 10:03, Tom Lane wrote:

Quan Zongliang  writes:

1. When using extended PGroonga
...
3. Neither ID = 'f' nor id= 't' can use the index correctly.


This works fine for btree indexes.  I think the actual problem
is that IsBooleanOpfamily only accepts the btree and hash
opclasses, and that's what needs to be improved.  Your proposed
patch fails to do that, which makes it just a crude hack that
solves some aspects of the issue (and probably breaks other
things).

It might work to change IsBooleanOpfamily so that it checks to
see whether BooleanEqualOperator is a member of the opclass.
That's basically what we need to know before we dare generate
substitute index clauses.  It's kind of an expensive test
though, and the existing coding assumes that IsBooleanOpfamily
is cheap ...

regards, tom lane


New patch attached.

It seems that partitions do not use AM other than btree and hash.
Rewrite only indxpath.c and check if it is a custom AM.

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 7d176e7b00..30df5cb2f6 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -23,6 +23,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_type.h"
+#include "commands/defrem.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
@@ -2295,7 +2296,7 @@ match_clause_to_indexcol(PlannerInfo *root,
 
/* First check for boolean-index cases. */
opfamily = index->opfamily[indexcol];
-   if (IsBooleanOpfamily(opfamily))
+   if (IsBooleanAmOpfamily(index->relam, opfamily))
{
iclause = match_boolean_index_clause(root, rinfo, indexcol, 
index);
if (iclause)
diff --git a/src/include/catalog/pg_opfamily.h 
b/src/include/catalog/pg_opfamily.h
index 8dc9ce01bb..5c4cc616d8 100644
--- a/src/include/catalog/pg_opfamily.h
+++ b/src/include/catalog/pg_opfamily.h
@@ -58,6 +58,12 @@ DECLARE_UNIQUE_INDEX_PKEY(pg_opfamily_oid_index, 2755, 
OpfamilyOidIndexId, on pg
 #define IsBooleanOpfamily(opfamily) \
((opfamily) == BOOL_BTREE_FAM_OID || (opfamily) == BOOL_HASH_FAM_OID)
 
+#define IsBooleanAmOpfamily(amid, opfamily) \
+   ((opfamily) == BOOL_BTREE_FAM_OID || (opfamily) == BOOL_HASH_FAM_OID || 
\
+   ((amid) >= FirstNormalObjectId && \
+   OidIsValid(GetDefaultOpClass(BOOLOID, (amid 
\
+   )
+
 #endif /* 
EXPOSE_TO_CLIENT_CODE */
 
 #endif /* PG_OPFAMILY_H */
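
For comparison, a minimal sketch (backend code, not part of the attached 
patch) of the membership test Tom suggests upthread, assuming the existing 
lsyscache helper op_in_opfamily(); note that this implies a syscache 
lookup, while the current IsBooleanOpfamily() is just a cheap OID 
comparison:

#include "postgres.h"

#include "catalog/pg_operator.h"
#include "utils/lsyscache.h"

/*
 * Treat an opfamily as "boolean" if the boolean equality operator is a
 * member of it, instead of hard-coding the btree/hash opfamily OIDs.
 * Sketch only; existing callers of IsBooleanOpfamily() assume it is cheap.
 */
static bool
opfamily_can_handle_bool(Oid opfamily)
{
	return op_in_opfamily(BooleanEqualOperator, opfamily);
}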


Bug: When user-defined AM is used, the index path cannot be selected correctly

2022-08-16 Thread Quan Zongliang


Hi,

1. When using extended PGroonga

CREATE EXTENSION pgroonga;

CREATE TABLE memos (
  id boolean,
  content varchar
);

CREATE INDEX idxA ON memos USING pgroonga (id);

2. Disable bitmapscan and seqscan:

SET enable_seqscan=off;
SET enable_indexscan=on;
SET enable_bitmapscan=off;

3. Neither id = 'f' nor id = 't' can use the index correctly.

postgres=# explain select * from memos where id='f';
QUERY PLAN
--
 Seq Scan on memos  (cost=100.00..101.06 rows=3 width=33)
   Filter: (NOT id)
(2 rows)

postgres=# explain select * from memos where id='t';
QUERY PLAN
--
 Seq Scan on memos  (cost=100.00..101.06 rows=3 width=33)
   Filter: id
(2 rows)

postgres=# explain select * from memos where id>='t';
QUERY PLAN
---
 Index Scan using idxa on memos  (cost=0.00..4.01 rows=2 width=33)
   Index Cond: (id >= true)
(2 rows)



The reason is that these expressions are converted to BoolExpr and Var 
nodes, and match_clause_to_indexcol does not use them to check for the 
boolean-index cases.

patch attached.

--
Quan Zongliang
Beijing Vastdata

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 7d176e7b00..31229ce16c 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -2295,7 +2295,7 @@ match_clause_to_indexcol(PlannerInfo *root,
 
/* First check for boolean-index cases. */
opfamily = index->opfamily[indexcol];
-   if (IsBooleanOpfamily(opfamily))
+   if (IsBooleanOpfamily(opfamily) || IsA(clause, BoolExpr) || IsA(clause, 
Var))
{
iclause = match_boolean_index_clause(root, rinfo, indexcol, 
index);
if (iclause)


Re: bugfix: when the blocksize is 32k, the function page_header of pageinspect returns negative numbers.

2021-07-08 Thread Quan Zongliang



On 2021/7/9 9:50 AM, Michael Paquier wrote:

On Fri, Jul 09, 2021 at 09:26:37AM +0800, Quan Zongliang wrote:

new patch attached


That's mostly fine at quick glance.  Here are some comments.

Please add pageinspect--1.9--1.10.sql within the patch.  Using git,
you can do that with a simple "git add".  With the current shape of
the patch, one has to manually copy pageinspect--1.9--1.10.sql into
contrib/pageinspect/ to be able to test it.

+SELECT lower, upper, special, pagesize, version FROM
page_header(get_raw_page('test1', 0));
+ lower | upper | special | pagesize | version
+---+---+-+--+-
+28 |  8152 |8192 | 8192 |   4
+(1 row)
I would not test all the fields, just pagesize and version perhaps?

+   if (TupleDescAttr(tupdesc, 3)->atttypid == INT2OID)
+   values[3] = UInt16GetDatum(page->pd_lower);
+   else
+   values[3] = Int32GetDatum(page->pd_lower);
Let's make the style more consistent with brinfuncs.c, by grouping all
the fields together in a switch/case, like that:
switch ((TupleDescAttr(tupdesc, 3)->atttypid)):
{
 case INT2OID:
 /* fill in values with UInt16GetDatum() */
break;
 case INT4OID:
 /* fill in values with Int32GetDatum() */
break;
 default:
 elog(ERROR, "blah");
}

+ALTER EXTENSION pageinspect UPDATE TO '1.10';
+\df page_header
+SELECT lower, upper, special, pagesize, version FROM 
page_header(get_raw_page('test1', 0));
No need for this test as page.sql already stresses page_header() for
the latest version.  Perhaps we could just UPDATE TO '1.9' before
running your query to show that it is the last version of pageinspect
where the older page_header() appeared.

The documentation of pageinspect does not require an update, as far as
I can see.  So we are good there.
--
Michael



Thanks for the comments.
Done
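
For context, the dispatch Michael describes might look roughly like this 
inside page_header() in rawpage.c (a sketch only; "values", "tupdesc" and 
"page" are the locals already present there, and the attached diff below 
is what was actually built and tested):

	switch (TupleDescAttr(tupdesc, 3)->atttypid)
	{
		case INT2OID:
			/* pre-1.10 SQL definition: smallint output columns */
			values[3] = UInt16GetDatum(page->pd_lower);
			values[4] = UInt16GetDatum(page->pd_upper);
			values[5] = UInt16GetDatum(page->pd_special);
			values[6] = UInt16GetDatum(PageGetPageSize(page));
			break;
		case INT4OID:
			/* 1.10 SQL definition: integer output columns */
			values[3] = Int32GetDatum(page->pd_lower);
			values[4] = Int32GetDatum(page->pd_upper);
			values[5] = Int32GetDatum(page->pd_special);
			values[6] = Int32GetDatum(PageGetPageSize(page));
			break;
		default:
			elog(ERROR, "unexpected output type for page_header()");
	}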
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 2d330ddb28..068134 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -13,7 +13,8 @@ OBJS = \
rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.8--1.9.sql \
+DATA =  pageinspect--1.9--1.10.sql \
+   pageinspect--1.8--1.9.sql \
pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
diff --git a/contrib/pageinspect/expected/oldextversions.out 
b/contrib/pageinspect/expected/oldextversions.out
index 04dc7f8640..1bb3919504 100644
--- a/contrib/pageinspect/expected/oldextversions.out
+++ b/contrib/pageinspect/expected/oldextversions.out
@@ -36,5 +36,19 @@ SELECT * FROM bt_page_items('test1_a_idx', 1);
   1 | (0,1) |  16 | f | f| 01 00 00 00 00 00 00 01 | f
| (0,1) | 
 (1 row)
 
+ALTER EXTENSION pageinspect UPDATE TO '1.9';
+\df page_header
+   
   List of functions
+ Schema |Name | Result data type | 
Argument data types 

| Type 
++-+--+-+--
+ public | page_header | record   | page bytea, OUT lsn pg_lsn, OUT 
checksum smallint, OUT flags smallint, OUT lower smallint, OUT upper smallint, 
OUT special smallint, OUT pagesize smallint, OUT version smallint, OUT 
prune_xid xid | func
+(1 row)
+
+SELECT pagesize, version FROM page_header(get_raw_page('test1', 0));
+ pagesize | version 
+--+-
+ 8192 |   4
+(1 row)
+
 DROP TABLE test1;
 DROP EXTENSION pageinspect;
diff --git a/contrib/pageinspect/pageinspect--1.9--1.10.sql 
b/contrib/pageinspect/pageinspect--1.9--1.10.sql
new file mode 100644
index 00..8dd9a27505
--- /dev/null
+++ b/contrib/pageinspect/pageinspect--1.9--1.10.sql
@@ -0,0 +1,21 @@
+/* contrib/pageinspect/pageinspect--1.9--1.10.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pageinspect UPDATE TO '1.10'" to load this file. 
\quit
+
+--
+-- page_header()
+--
+DROP FUNCTION page_header(IN page bytea);
+CREATE FUNCTION page_header(IN page bytea,
+OUT lsn pg_lsn,
+OUT checksum smallint,
+OUT flags smallint,
+OUT lower int,
+OUT upper int,
+OUT special int,
+OUT pagesize int,
+OUT version smallint,
+OUT prune_xid xid)
+AS 'MODULE_PATHNAME', 'page_header'
+LANGUAGE C STRICT PARALLEL SAFE;
diff --git a/contrib/pageinspect/pageinspect.control 
b/contrib/pageinspect/pageinspect.control
index bd716769a1..7

Re: bugfix: when the blocksize is 32k, the function page_header of pageinspect returns negative numbers.

2021-07-08 Thread Quan Zongliang



On 2021/7/8 3:54 PM, Michael Paquier wrote:

On Wed, Jul 07, 2021 at 11:28:06PM -0500, Justin Pryzby wrote:

I think you can refer to this prior commit for guidance.

commit f18aa1b203930ed28cfe42e82d3418ae6277576d
Author: Peter Eisentraut 
Date:   Tue Jan 19 10:28:05 2021 +0100

 pageinspect: Change block number arguments to bigint


Yes, thanks.  Peter's recent work is what I had in mind.  I agree that
we could improve the situation, and the change is not complicated once
you know what needs to be done.  It needs to be done as follows:
- Create a new pageinspect--1.9--1.10.sql.
- Provide a proper implementation for older extension version based on
the output parameter types, with a lookup at the TupleDesc for the
function to adapt.
- Add tests to sql/oldextversions.sql to stress the old function based
on smallint results.
- Bump pageinspect.control.

Quan, would you like to produce a patch?  That's a good exercise to
understand how the maintenance of extensions is done.


new patch attached


--
Michael

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 2d330ddb28..068134 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -13,7 +13,8 @@ OBJS = \
rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.8--1.9.sql \
+DATA =  pageinspect--1.9--1.10.sql \
+   pageinspect--1.8--1.9.sql \
pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
diff --git a/contrib/pageinspect/expected/oldextversions.out 
b/contrib/pageinspect/expected/oldextversions.out
index 04dc7f8640..9546791e23 100644
--- a/contrib/pageinspect/expected/oldextversions.out
+++ b/contrib/pageinspect/expected/oldextversions.out
@@ -36,5 +36,32 @@ SELECT * FROM bt_page_items('test1_a_idx', 1);
   1 | (0,1) |  16 | f | f| 01 00 00 00 00 00 00 01 | f
| (0,1) | 
 (1 row)
 
+\df page_header
+   
   List of functions
+ Schema |Name | Result data type | 
Argument data types 

| Type 
++-+--+-+--
+ public | page_header | record   | page bytea, OUT lsn pg_lsn, OUT 
checksum smallint, OUT flags smallint, OUT lower smallint, OUT upper smallint, 
OUT special smallint, OUT pagesize smallint, OUT version smallint, OUT 
prune_xid xid | func
+(1 row)
+
+SELECT lower, upper, special, pagesize, version FROM 
page_header(get_raw_page('test1', 0));
+ lower | upper | special | pagesize | version 
+---+---+-+--+-
+28 |  8152 |8192 | 8192 |   4
+(1 row)
+
+ALTER EXTENSION pageinspect UPDATE TO '1.10';
+\df page_header
+   
 List of functions
+ Schema |Name | Result data type | 
  Argument data types   
| 
Type 
++-+--+-+--
+ public | page_header | record   | page bytea, OUT lsn pg_lsn, OUT 
checksum smallint, OUT flags smallint, OUT lower integer, OUT upper integer, 
OUT special integer, OUT pagesize integer, OUT version smallint, OUT prune_xid 
xid | func
+(1 row)
+
+SELECT lower, upper, special, pagesize, version FROM 
page_header(get_raw_page('test1', 0));
+ lower | upper | special | pagesize | version 
+---+---+-+--+-
+28 |  8152 |8192 | 8192 |   4
+(1 row)
+
 DROP TABLE test1;
 DROP EXTENSION pageinspect;
diff --git a/contrib/pageinspect/pageinspect.control 
b/contrib/pageinspect/pageinspect.control
index bd716769a1..7cdf37913d 100644
--- a/contrib/pageinspect/pageinspect.control
+++ b/contrib/pageinspect/pageinspect.control
@@ -1,5 +1,5 @@
 # pageinspect extension
 comment = 'inspect the contents of database pages at a low level'
-default_version = '1.9'
+default_version = '1.10'
 module_pathname = '$libdir/pageinspect'
 relocatable = true
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index e9fee73bc4..2d1f90e3bc 100644
--- a/contrib/pageinspect/rawpage.c
+++ 

Re: bugfix: when the blocksize is 32k, the function page_header of pageinspect returns negative numbers.

2021-07-08 Thread Quan Zongliang




On 2021/7/8 3:54 PM, Michael Paquier wrote:

On Wed, Jul 07, 2021 at 11:28:06PM -0500, Justin Pryzby wrote:

I think you can refer to this prior commit for guidance.

commit f18aa1b203930ed28cfe42e82d3418ae6277576d
Author: Peter Eisentraut 
Date:   Tue Jan 19 10:28:05 2021 +0100

 pageinspect: Change block number arguments to bigint


Yes, thanks.  Peter's recent work is what I had in mind.  I agree that
we could improve the situation, and the change is not complicated once
you know what needs to be done.  It needs to be done as follows:
- Create a new pageinspect--1.9--1.10.sql.
- Provide a proper implementation for older extension version based on
the output parameter types, with a lookup at the TupleDesc for the
function to adapt.
- Add tests to sql/oldextversions.sql to stress the old function based
on smallint results.
- Bump pageinspect.control.

Quan, would you like to produce a patch?  That's a good exercise to
understand how the maintenance of extensions is done.


Ok. I'll do it.


--
Michael







bugfix: when the blocksize is 32k, the function page_header of pageinspect returns negative numbers.

2021-07-07 Thread Quan Zongliang


If the block size is 32k, the function page_header of the pageinspect 
module returns negative numbers:


postgres=# select * from page_header(get_raw_page('t1',0));
lsn| checksum | flags | lower | upper | special | pagesize | 
version | prune_xid

---+--+---+---+---+-+--+-+---
 0/174CF58 |0 | 0 |28 | 32736 |  -32768 |   -32768 | 
   4 | 0

(1 row)
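
The negative numbers are just the 16-bit output columns wrapping around; a 
standalone illustration in plain C (not PostgreSQL code, assuming the usual 
two's-complement conversion):

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	uint16_t	pagesize = 32768;	/* BLCKSZ when built with 32 kB blocks */
	int16_t		as_smallint = (int16_t) pagesize;	/* what a smallint column shows */

	printf("%u reported through a smallint column reads back as %d\n",
		   pagesize, as_smallint);
	return 0;
}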


This patch changes the output parameters lower, upper, special and 
pagesize to int32.


postgres=# select * from page_header(get_raw_page('t1',0));
lsn| checksum | flags | lower | upper | special | pagesize | 
version | prune_xid

---+--+---+---+---+-+--+-+---
 0/19EA640 |0 | 0 |28 | 32736 |   32768 |32768 | 
   4 | 0

(1 row)


--
Quan Zongliang
diff --git a/contrib/pageinspect/pageinspect--1.5.sql 
b/contrib/pageinspect/pageinspect--1.5.sql
index 1e40c3c97e..f71eff19c5 100644
--- a/contrib/pageinspect/pageinspect--1.5.sql
+++ b/contrib/pageinspect/pageinspect--1.5.sql
@@ -23,10 +23,10 @@ CREATE FUNCTION page_header(IN page bytea,
 OUT lsn pg_lsn,
 OUT checksum smallint,
 OUT flags smallint,
-OUT lower smallint,
-OUT upper smallint,
-OUT special smallint,
-OUT pagesize smallint,
+OUT lower int,
+OUT upper int,
+OUT special int,
+OUT pagesize int,
 OUT version smallint,
 OUT prune_xid xid)
 AS 'MODULE_PATHNAME', 'page_header'
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 7272b21016..af04c1a688 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -316,10 +316,10 @@ page_header(PG_FUNCTION_ARGS)
values[0] = LSNGetDatum(lsn);
values[1] = UInt16GetDatum(page->pd_checksum);
values[2] = UInt16GetDatum(page->pd_flags);
-   values[3] = UInt16GetDatum(page->pd_lower);
-   values[4] = UInt16GetDatum(page->pd_upper);
-   values[5] = UInt16GetDatum(page->pd_special);
-   values[6] = UInt16GetDatum(PageGetPageSize(page));
+   values[3] = Int32GetDatum(page->pd_lower);
+   values[4] = Int32GetDatum(page->pd_upper);
+   values[5] = Int32GetDatum(page->pd_special);
+   values[6] = Int32GetDatum(PageGetPageSize(page));
values[7] = UInt16GetDatum(PageGetPageLayoutVersion(page));
values[8] = TransactionIdGetDatum(page->pd_prune_xid);
 


Remove unused code from the KnownAssignedTransactionIds submodule

2021-06-08 Thread Quan Zongliang


Hi,

In the KnownAssignedTransactionIds sub-module, two lines of unused code 
were left behind by a previous change.


--
Quan Zongliang
CPUG
diff --git a/src/backend/storage/ipc/procarray.c 
b/src/backend/storage/ipc/procarray.c
index b448533564..99ce8c752a 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3363,8 +3363,6 @@ RecordKnownAssignedTransactionIds(TransactionId xid)
 
/* ShmemVariableCache->nextFullXid must be beyond any observed 
xid */
AdvanceNextFullTransactionIdPastXid(latestObservedXid);
-   next_expected_xid = latestObservedXid;
-   TransactionIdAdvance(next_expected_xid);
}
 }
 


Re: bugfix: invalid bit/varbit input causes the log file to be unreadable

2020-06-29 Thread Quan Zongliang

Good.

I tested it, and it looks fine.

Thank you.


On 2020/6/29 1:10 AM, Tom Lane wrote:

I wrote:

Even granting the premise, the proposed patch seems like a significant
decrease in user-friendliness for typical cases.  I'd rather see us
make an effort to print one valid-per-the-DB-encoding character.
Now that we can rely on snprintf to count %s restrictions in bytes,
I think something like this should work:
  errmsg("\"%.*s\" is not a valid binary digit",
 pg_mblen(sp), sp)));
But the real problem is that this is only the tip of the iceberg.
You didn't even hit all the %c usages in varbit.c.

I went through all the %c format sequences in the backend to see which
ones could use this type of fix.  There were not as many as I'd expected,
but still a fair number.  (I skipped cases where the input was coming from
the catalogs, as well as some non-user-facing debug printouts.)  That
leads to the attached patch, which seems to do the job without breaking
anything that works today.

regards, tom lane

PS: I failed to resist the temptation to improve some shoddy error
messages nearby in pageinspect/heapfuncs.c.
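
As a standalone illustration (plain C, not PostgreSQL code) of why "%.*s" 
with a byte count keeps the character intact while "%c" splits it; the 2 
below stands in for what pg_mblen() returns for this input:

#include <stdio.h>

int
main(void)
{
	const char *sp = "\xC3\x99";	/* UTF-8 encoding of 'Ù' */

	/* old style: emits the lone first byte 0xC3, corrupting the log line */
	printf("\"%c\" is not a valid binary digit\n", *sp);

	/* suggested style: emits all bytes of the one character */
	printf("\"%.*s\" is not a valid binary digit\n", 2, sp);

	return 0;
}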



bugfix: invalid bit/varbit input causes the log file to be unreadable

2020-06-26 Thread Quan Zongliang


The bit/varbit type input functions cause file_fdw to fail to read the 
logfile normally.


1. Server conf:
server_encoding = UTF8
locale = zh_CN.UTF-8

2. Create external tables using file_fdw

CREATE EXTENSION file_fdw;
CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE pglog (
 log_time timestamp(3) with time zone,
 user_name text,
 database_name text,
 process_id integer,
 connection_from text,
 session_id text,
 session_line_num bigint,
 command_tag text,
 session_start_time timestamp with time zone,
 virtual_transaction_id text,
 transaction_id bigint,
 error_severity text,
 sql_state_code text,
 message text,
 detail text,
 hint text,
 internal_query text,
 internal_query_pos integer,
 context text,
 query text,
 query_pos integer,
 location text,
 application_name text
) SERVER pglog
OPTIONS ( filename 'log/postgresql-2020-06-16_213409.csv',
 format 'csv');

Everything works normally up to this point.

3. bit/varbit input
 select b'Ù';

The foreign table cannot be accessed. SELECT * FROM pglog will get:
invalid byte sequence for encoding "UTF8": 0xc3 0x22


The reason is that the error message in the bit_in / varbit_in functions 
outputs the offending byte directly with %c, which causes the log file to 
contain byte sequences that cannot be decoded correctly.


The attachment is a patch.



diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index f0c6a44b84..506cc35446 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -230,8 +230,8 @@ bit_in(PG_FUNCTION_ARGS)
else if (*sp != '0')
ereport(ERROR,

(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-errmsg("\"%c\" is not a valid 
binary digit",
-   *sp)));
+errmsg("\"0x%02X\" is not a 
valid binary digit",
+   (unsigned 
char)*sp)));
 
x >>= 1;
if (x == 0)
@@ -531,8 +531,8 @@ varbit_in(PG_FUNCTION_ARGS)
else if (*sp != '0')
ereport(ERROR,

(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-errmsg("\"%c\" is not a valid 
binary digit",
-   *sp)));
+errmsg("\"0x%02X\" is not a 
valid binary digit",
+   (unsigned 
char)*sp)));
 
x >>= 1;
if (x == 0)


Re: Restore replication settings when modifying a field type

2020-02-10 Thread Quan Zongliang

On 2020/1/15 08:30, Quan Zongliang wrote:

On 2020/1/3 17:14, Peter Eisentraut wrote:

On 2019-11-01 04:39, Euler Taveira wrote:

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.


Yeah, I don't think we need to do the full dance of reverse compiling 
the SQL command and reexecuting it, as the patch currently does.  
That's only necessary for rebuilding the index itself.  For re-setting 
the replica identity, we can just use the internal API as you say.


Also, a few test cases would be nice for this patch.



I'm a little busy. I'll write a new patch in a few days.


new patch attached.


test case 1:
create table t1 (col1 int,col2 int not null,
  col3 int not null,unique(col2,col3));
alter table t1 replica identity using index t1_col2_col3_key;
alter table t1 alter col3 type text;
alter table t1 alter col3 type smallint using col3::int;

test case 2:
create table t2 (col1 varchar(10), col2 text not null,
  col3 timestamp not null,unique(col2,col3),
  col4 int not null unique);
alter table t2 replica identity using index t2_col2_col3_key;
alter table t2 alter col3 type text;
alter table t2 replica identity using index t2_col4_key;
alter table t2 alter col4 type timestamp using '2020-02-11'::timestamp;

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b7c8d663fc..de4da64e06 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -176,6 +176,7 @@ typedef struct AlteredTableInfo
List   *changedConstraintDefs;  /* string definitions of same */
List   *changedIndexOids;   /* OIDs of indexes to rebuild */
List   *changedIndexDefs;   /* string definitions of same */
+   List   *changedReplIdentName;   /* fullname of index to reset 
REPLICA IDENTIFY */
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
@@ -482,6 +483,7 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo 
*tab, Relation rel,

   AlterTableCmd *cmd, LOCKMODE lockmode);
 static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab);
 static void RememberIndexForRebuilding(Oid indoid, AlteredTableInfo *tab);
+static void RememberReplicaIdentForRebuilding(Oid indoid, AlteredTableInfo 
*tab);
 static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
   LOCKMODE 
lockmode);
 static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
@@ -553,6 +555,7 @@ static void refuseDupeIndexAttach(Relation parentIdx, 
Relation partIdx,
  Relation 
partitionTbl);
 static List *GetParentedForeignKeyRefs(Relation partition);
 static void ATDetachCheckNoForeignKeyRefs(Relation partition);
+static void relation_mark_replica_identity(Relation rel, char ri_type, Oid 
indexOid, bool is_internal);
 
 
 /* 
@@ -4274,7 +4277,10 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
Relationrel;
ListCell   *lcmd;
 
-   if (subcmds == NIL)
+   if (subcmds == NIL && pass != AT_PASS_MISC)
+   continue;
+   if (subcmds == NIL && pass == AT_PASS_MISC &&
+   tab->changedReplIdentName==NIL)
continue;
 
/*
@@ -4295,6 +4301,18 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
if (pass == AT_PASS_ALTER_TYPE)
ATPostAlterTypeCleanup(wqueue, tab, lockmode);
 
+   if (pass == AT_PASS_MISC && tab->changedReplIdentName)
+   {
+   Relationidxrel;
+   ObjectAddress address = 
get_object_address(OBJECT_INDEX,
+   
(Node*)tab->changedReplIdentName, ,
+   
AccessExclusiveLock, false);
+   relation_close(idxrel, NoLock);
+
+   relation_mark_replica_identity(rel, 
REPLICA_IDENTITY_INDEX,
+   
address.objectId, true);
+   }
+
relation_close(rel, NoLock);
   

Re: Restore replication settings when modifying a field type

2020-01-14 Thread Quan Zongliang

On 2020/1/3 17:14, Peter Eisentraut wrote:

On 2019-11-01 04:39, Euler Taveira wrote:

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.


Yeah, I don't think we need to do the full dance of reverse compiling 
the SQL command and reexecuting it, as the patch currently does.  That's 
only necessary for rebuilding the index itself.  For re-setting the 
replica identity, we can just use the internal API as you say.


Also, a few test cases would be nice for this patch.



I'm a little busy. I'll write a new patch in a few days.




Re: Add a GUC variable that control logical replication

2019-11-06 Thread Quan Zongliang

On 2019/11/1 20:49, Peter Eisentraut wrote:

On 2019-10-20 00:23, Euler Taveira wrote:

You can probably achieve that using ALTER PUBLICATION to disable
publication of deletes or truncates, as the case may be, either
permanently or just for the duration of the operations you want to skip.


... then you are skipping all tables in the publication.


You can group tables into different publications and set the 
subscription to subscribe to multiple publications if you need this kind 
of granularity.


In any case, this kind of thing needs to be handled by the decoding 
plugin based on its configuration policies and depending on its needs. 
For example, let's say you have two decoding plugins running: one for a 
replication system and one for writing an audit log.  It would not be 
appropriate to disable logging for both of them because of some 
performance optimization for one of them.  And it would also not be 
appropriate to do this with a USERSET setting.


If we need different hooks or more DDL commands do this better, then 
that can be considered.  But this seems to be the wrong way to do it.




What the user needs is a single replication link that selectively skips 
some transactions, and this choice should only affect the sessions doing 
the bulk deletes. Operations from other sessions are not affected and can 
continue to produce replication messages.
For example, session 1 wants to bulk-delete 1 million old rows from table 
T1, and this should not be replicated. At the same time, session 2 deletes 
10 records from T1, which is expected to be passed on through replication.
Therefore, two decoding plugins cannot meet this requirement. It is also 
inappropriate to temporarily disable the subscription, because that skips 
all transactions for a certain period of time.


--
Quan Zongliang
神州飞象(北京)数据科技有限公司
Our strength comes from the most advanced open-source database, PostgreSQL
zongliang.q...@postgresdata.com





Re: Restore replication settings when modifying a field type

2019-11-04 Thread Quan Zongliang

On 2019/10/28 12:39, Kyotaro Horiguchi wrote:

Hello.

# The patch no longer applies on the current master. Needs a rebasing.


New patch, rebased on master branch.


At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang  
wrote in

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of
the rebuild.


I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows change of the type of c2 into integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't affect perhaps as expected.

S=# select * from t;
  c1 | c2
+
   0 | 00
   1 | 01
(2 rows)



So I developed a patch. If the user modifies the field type. The
associated index is REPLICA IDENTITY. Rebuild and restore replication
settings.


Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway I think it is a problem that replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility
with subscribers." is needed.

regards.



diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 05593f3316..02f8dbeabf 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -174,6 +174,8 @@ typedef struct AlteredTableInfo
List   *changedConstraintDefs;  /* string definitions of same */
List   *changedIndexOids;   /* OIDs of indexes to rebuild */
List   *changedIndexDefs;   /* string definitions of same */
+   OidchangedReplIdentOid; /* OID of index to reset 
REPLICA IDENTIFY */
+   char   *changedReplIdentDef;/* string definitions of same */
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
@@ -459,6 +461,7 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo 
*tab, Relation rel,

   AlterTableCmd *cmd, LOCKMODE lockmode);
 static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab);
 static void RememberIndexForRebuilding(Oid indoid, AlteredTableInfo *tab);
+static void RememberReplicaIdentForRebuilding(Oid indoid, AlteredTableInfo 
*tab);
 static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
   LOCKMODE 
lockmode);
 static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
@@ -10715,6 +10718,9 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation 
rel,
{
Assert(foundObject.objectSubId 
== 0);

RememberIndexForRebuilding(foundObject.objectId, tab);
+
+   if 
(RelationGetForm(rel)->relreplident==REPLICA_IDENTITY_INDEX)
+   
RememberReplicaIdentForRebuilding(foundObject.objectId, tab);
}
else if (relKind == RELKIND_SEQUENCE)
{
@@ -10749,9 +10755,18 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation 
rel,
}
 
case OCLASS_CONSTRAINT:
-   Assert(foundObject.objectSubId == 0);
-   
RememberConstraintForRebuilding(foundObject.objectId, tab);
-   break;
+   {
+   Oid indId;
+
+   Assert(foundObject.objectSubId == 0);
+   
RememberConstraintForRebuilding(foundObject.objectId, tab);
+
+   indId = 
get_constraint_index(foundOb

Re: Restore replication settings when modifying a field type

2019-10-31 Thread Quan Zongliang

On 2019/10/28 12:39, Kyotaro Horiguchi wrote:

Hello.

# The patch no longer applies on the current master. Needs a rebasing.

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang  
wrote in

In fact, the replication property of the table has not been modified,
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
specified index property 'indisreplident' is set to false because of
the rebuild.


I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows change of the type of c2 into integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't affect perhaps as expected.

S=# select * from t;
  c1 | c2
+
   0 | 00
   1 | 01
(2 rows)



So I developed a patch. If the user modifies the field type. The
associated index is REPLICA IDENTITY. Rebuild and restore replication
settings.


Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway I think it is a problem that replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility
with subscribers." is needed.

In fact, the scenario we encountered is like this: a column of a user's 
table is of type "smallint", and it turns out that this range is not 
sufficient, so they change it to "int". At that point the REPLICA IDENTITY 
setting is lost and the user does not realize it. By the time they found 
out, logical replication had not been producing output normally for that 
period, and they had to find other ways to get the data back.
This user's logical replication issues standard SQL statements to other 
relational databases through a plugin they developed themselves, and they 
have thousands of tables to replicate.
So I think this patch is appropriate in this scenario. As for the matching 
problem between publishers and subscribers, I'm afraid it's hard to solve 
here. If this is not a suitable modification, I can withdraw it and see if 
there's a better way.


If necessary, I'll modify it again and rebase it onto the master branch.


regards.







Restore replication settings when modifying a field type

2019-10-26 Thread Quan Zongliang


When the user modifies the type of the REPLICA IDENTITY column, the 
logical replication settings are lost.


For example:

postgres=# \d+ t1
Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats 
target | Description

+-+---+--+-+-+--+-
 col1   | integer |   |  | | plain   | 
|
 col2   | integer |   | not null | | plain   | 
|

Indexes:
"t1_col2_key" UNIQUE CONSTRAINT, btree (col2) REPLICA IDENTITY


postgres=# alter table t1 alter col2 type smallint;
ALTER TABLE
postgres=# \d+ t1
 Table "public.t1"
 Column |   Type   | Collation | Nullable | Default | Storage | Stats 
target | Description

+--+---+--+-+-+--+-
 col1   | integer  |   |  | | plain   | 
 |
 col2   | smallint |   | not null | | plain   | 
 |

Indexes:
"t1_col2_key" UNIQUE CONSTRAINT, btree (col2)

In fact, the replication property of the table has not been modified, 
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously 
specified index property 'indisreplident' is set to false because of the 
rebuild.


So I developed a patch: if the user modifies the column type and the 
associated index is the REPLICA IDENTITY index, rebuild it and restore 
the replication settings.


Regards,
Quan Zongliang
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b53f6ed3ac..c21372fe51 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -175,6 +175,8 @@ typedef struct AlteredTableInfo
List   *changedConstraintDefs;  /* string definitions of same */
List   *changedIndexOids;   /* OIDs of indexes to rebuild */
List   *changedIndexDefs;   /* string definitions of same */
+   OidchangedReplIdentOid; /* OID of index to reset 
REPLICA IDENTIFY */
+   char   *changedReplIdentDef;/* string definitions of same */
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
@@ -428,6 +430,7 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo 
*tab, Relation rel,
  AlterTableCmd *cmd, LOCKMODE 
lockmode);
 static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab);
 static void RememberIndexForRebuilding(Oid indoid, AlteredTableInfo *tab);
+static void RememberReplicaIdentForRebuilding(Oid indoid, AlteredTableInfo 
*tab);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char 
*colName,
List *options, 
LOCKMODE lockmode);
 static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
@@ -9991,6 +9994,9 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
{
Assert(foundObject.objectSubId 
== 0);

RememberIndexForRebuilding(foundObject.objectId, tab);
+
+   if 
(RelationGetForm(rel)->relreplident==REPLICA_IDENTITY_INDEX)
+   
RememberReplicaIdentForRebuilding(foundObject.objectId, tab);
}
else if (relKind == RELKIND_SEQUENCE)
{
@@ -10010,8 +10016,14 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation 
rel,
}
 
case OCLASS_CONSTRAINT:
-   Assert(foundObject.objectSubId == 0);
-   
RememberConstraintForRebuilding(foundObject.objectId, tab);
+   {
+   Oid indId;
+   Assert(foundObject.objectSubId == 0);
+   
RememberConstraintForRebuilding(foundObject.objectId, tab);
+   indId = 
get_constraint_index(foundObject.objectId);
+   if (OidIsValid(indId))
+   
RememberReplicaIdentForRebuilding(indId, tab);
+   }
break;
 
case OCLASS_REWRITE:
@@ -10324,6 +10336,36 @@ RememberConstraintForRebuilding(Oid conoid, 
AlteredTableInfo *tab)
}
 }
 
+/*
+ * Subroutine for ATExecAlterColumnType: remember that a replica identify
+ * needs to be reset (which we might already know).
+ */
+static void
+RememberReplicaIdentForRebuilding(Oid indoid, AlteredTable

Re: Add a GUC variable that control logical replication

2019-09-18 Thread Quan Zongliang

On 2019/9/18 17:11, Peter Eisentraut wrote:

On 2019-09-18 10:39, Quan Zongliang wrote:

Sybase has a feature to turn off replication at the session level: set
replication = off, which can be temporarily turned off when there is a
maintenance action on the table. Our users also want this feature.


These kinds of feature requests are always dubious because just because
Sybase behaves this way for some implementation or architectural reason
doesn't necessarily mean it makes sense for PostgreSQL too.


Agree

Why do you need to turn off replication when there is "maintenance" on a
table?  What does that even mean?

In one table, the user only keeps data for a period of time and deletes 
expired records every day, involving about 10 million to 20 million 
records at a time. They do not want such bulk operations to be passed on 
through logical replication.

My English is not good; I used the wrong word, "maintenance", in my 
description.





Add a GUC variable that control logical replication

2019-09-18 Thread Quan Zongliang



Sybase has a feature to turn off replication at the session level (set 
replication = off), which can be used to temporarily turn replication off 
when there is a maintenance action on the table. Our users also want this 
feature.
I added a new flag bit in xinfo and control it with a session-level 
variable: when the variable is turned off, the flag is written when the 
transaction commits, and logical decoding then abandons the transaction 
just like an aborted transaction. Since PostgreSQL has two types of 
replication, I call the variable "logical_replication" to avoid 
confusion; its default value is true.


Sample SQL

insert into a values(100);
set logical_replication to off;
insert into a values(200);
reset logical_replication;
insert into a values(300);

pg_recvlogical output(the second is not output.)
BEGIN 492
table public.a: INSERT: col1[integer]:100
COMMIT 492
BEGIN 494
table public.a: INSERT: col1[integer]:300
COMMIT 494

I'm not sure this is the most appropriate way. What do you think?

Regards,
Quan Zongliang
diff --git a/src/backend/access/transam/xact.c 
b/src/backend/access/transam/xact.c
index 9162286c98..d7a04539d6 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -82,6 +82,8 @@ bool  XactDeferrable;
 
 intsynchronous_commit = SYNCHRONOUS_COMMIT_ON;
 
+bool   logical_replication = true;
+
 /*
  * When running as a parallel worker, we place only a single
  * TransactionStateData on the parallel worker's state stack, and the XID
@@ -5535,6 +5537,9 @@ XactLogCommitRecord(TimestampTz commit_time,
xl_origin.origin_timestamp = 
replorigin_session_origin_timestamp;
}
 
+   if (!logical_replication)
+   xl_xinfo.xinfo |= XACT_XINFO_LOGIREPL_OFF;
+
if (xl_xinfo.xinfo != 0)
info |= XLOG_XACT_HAS_INFO;
 
diff --git a/src/backend/replication/logical/decode.c 
b/src/backend/replication/logical/decode.c
index c53e7e2279..b11c89ff74 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -618,6 +618,7 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer 
*buf,
 */
if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
(parsed->dbId != InvalidOid && parsed->dbId != 
ctx->slot->data.database) ||
+   (parsed->xinfo & XACT_XINFO_LOGIREPL_OFF) ||
ctx->fast_forward || FilterByOrigin(ctx, origin_id))
{
for (i = 0; i < parsed->nsubxacts; i++)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 90ffd89339..0880a2765f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1952,6 +1952,16 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
 
+   {
+   {"logical_replication", PGC_USERSET, REPLICATION_MASTER,
+   gettext_noop("Close logical replication of transactions 
in the session."),
+   NULL
+   },
+   _replication,
+   true,
+   NULL, NULL, NULL
+   },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index d714551704..1a7f52ac50 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -81,6 +81,9 @@ typedef enum
 /* Synchronous commit level */
 extern int synchronous_commit;
 
+/* turn on/off logical replication */
+extern boollogical_replication;
+
 /*
  * Miscellaneous flag bits to record events which occur on the top level
  * transaction. These flags are only persisted in MyXactFlags and are intended
@@ -168,6 +171,12 @@ typedef void (*SubXactCallback) (SubXactEvent event, 
SubTransactionId mySubid,
 #define XACT_XINFO_HAS_AE_LOCKS(1U << 6)
 #define XACT_XINFO_HAS_GID (1U << 7)
 
+/*
+ * It indicates that the data affected by this transaction does not need
+ * to be included in the logical replication.
+ */
+#define XACT_XINFO_LOGIREPL_OFF(1U << 28)
+
 /*
  * Also stored in xinfo, these indicating a variety of additional actions that
  * need to occur when emulating transaction effects during recovery.


Re: enhance SPI to support EXECUTE commands

2019-09-05 Thread Quan Zongliang

On 2019/9/5 17:33, Pavel Stehule wrote:



On Thu, Sep 5, 2019 at 10:57 AM Quan Zongliang 
<zongliang.q...@postgresdata.com> wrote:


On 2019/9/5 16:31, Pavel Stehule wrote:
 >
 >
 > On Thu, Sep 5, 2019 at 10:25 AM Quan Zongliang
 > <zongliang.q...@postgresdata.com> wrote:
 >
 >     On 2019/9/5 15:09, Pavel Stehule wrote:
 >      >
 >      >
 >      > On Thu, Sep 5, 2019 at 8:39 Quan Zongliang
 >      > <zongliang.q...@postgresdata.com> wrote:
 >      >
 >      >     Dear hackers,
 >      >
 >      >     I found that such a statement would get 0 in PL/pgSQL.
 >      >
 >      >     PREPARE smt_del(int) AS DELETE FROM t1;
 >      >     EXECUTE 'EXECUTE smt_del(100)';
 >      >     GET DIAGNOSTICS j = ROW_COUNT;
 >      >
 >      >     In fact, this is a problem with SPI, it does not support
 >     getting result
 >      >     of the EXECUTE command. I made a little enhancement.
Support
 >     for the
 >      >     number of rows processed when executing
INSERT/UPDATE/DELETE
 >     statements
 >      >     dynamically.
 >      >
 >      >
 >      > Is there some use case for support this feature?
 >      >
 >     A user deletes the data in PL/pgSQL using the above method,
hoping
 >     to do
 >     more processing according to the number of rows affected, and
found
 >     that
 >     each time will get 0.
 >
 >     Sample code:
 >     PREPARE smt_del(int) AS DELETE FROM t1 WHERE c=$1;
 >     EXECUTE 'EXECUTE smt_del(100)';
 >     GET DIAGNOSTICS j = ROW_COUNT;
 >
 >
 > This has not sense in plpgsql. Why you use PREPARE statement
explicitly?
 >
Yes, I told him to do it another way, and the problem has been solved.

Under psql, we can get this result

flying=# EXECUTE smt_del(100);
DELETE 1

So I think this may be an oversight in SPI; it would be better to handle it there.


Personally, I would not support features that allow bad code.

My code simply follows the existing handling of the CREATE TABLE AS and COPY 
statements. In spi.c, it looks like this:


if (IsA(stmt->utilityStmt, CreateTableAsStmt))  // original code
...
else if (IsA(stmt->utilityStmt, CopyStmt))  // original code
...
else if (IsA(stmt->utilityStmt, ExecuteStmt))  // my code

My patch was not developed for this PL/pgSQL approach; I just happened to 
find the problem through it.




Pavel


 >
 >     IF j=1 THEN
 >         do something
 >     ELSIF j=0 THEN
 >         do something
 >
 >     Here j is always equal to 0.
 >
 >
 >
 >     Regards
 >
 >      > Regards
 >      >
 >      > Pavel
 >      >
 >      >
 >      >     Regards,
 >      >     Quan Zongliang
 >      >
 >








Re: enhance SPI to support EXECUTE commands

2019-09-05 Thread Quan Zongliang

On 2019/9/5 16:31, Pavel Stehule wrote:



On Thu, Sep 5, 2019 at 10:25 Quan Zongliang <zongliang.q...@postgresdata.com> wrote:


On 2019/9/5 15:09, Pavel Stehule wrote:
 >
 >
 > On Thu, Sep 5, 2019 at 8:39 Quan Zongliang
 > <zongliang.q...@postgresdata.com> wrote:
 >
 >     Dear hackers,
 >
 >     I found that such a statement would get 0 in PL/pgSQL.
 >
 >     PREPARE smt_del(int) AS DELETE FROM t1;
 >     EXECUTE 'EXECUTE smt_del(100)';
 >     GET DIAGNOSTICS j = ROW_COUNT;
 >
 >     In fact, this is a problem with SPI, it does not support
getting result
 >     of the EXECUTE command. I made a little enhancement. Support
for the
 >     number of rows processed when executing INSERT/UPDATE/DELETE
statements
 >     dynamically.
 >
 >
 > Is there some use case for support this feature?
 >
A user was deleting data in PL/pgSQL using the above method, hoping to do
further processing based on the number of rows affected, and found that it
always returned 0.

Sample code:
PREPARE smt_del(int) AS DELETE FROM t1 WHERE c=$1;
EXECUTE 'EXECUTE smt_del(100)';
GET DIAGNOSTICS j = ROW_COUNT;


This makes no sense in plpgsql. Why do you use a PREPARE statement explicitly?


Yes, I told him to do it another way, and the problem has been solved.
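
For example, something like this works directly in PL/pgSQL (a sketch of the alternative I suggested; the table and value are just the ones from the earlier example):

DO $$
DECLARE
    j bigint;
BEGIN
    EXECUTE 'DELETE FROM t1 WHERE c = $1' USING 100;
    GET DIAGNOSTICS j = ROW_COUNT;
    RAISE NOTICE 'deleted % rows', j;
END;
$$;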

Under psql, we can get this result

flying=# EXECUTE smt_del(100);
DELETE 1

So I think this may be an oversight in SPI; it would be better to handle it there.




IF j=1 THEN
    do something
ELSIF j=0 THEN
    do something

Here j is always equal to 0.



Regards

 > Regards
 >
 > Pavel
 >
 >
 >     Regards,
 >     Quan Zongliang
 >







Re: enhance SPI to support EXECUTE commands

2019-09-05 Thread Quan Zongliang

On 2019/9/5 15:09, Pavel Stehule wrote:



On Thu, Sep 5, 2019 at 8:39 Quan Zongliang <zongliang.q...@postgresdata.com> wrote:


Dear hackers,

I found that a statement like the following gets a ROW_COUNT of 0 in PL/pgSQL.

PREPARE smt_del(int) AS DELETE FROM t1;
EXECUTE 'EXECUTE smt_del(100)';
GET DIAGNOSTICS j = ROW_COUNT;

In fact, this is a problem with SPI: it does not support getting the result
of the EXECUTE command. I made a small enhancement to report the number of
rows processed when INSERT/UPDATE/DELETE statements are executed this way.


Is there some use case for supporting this feature?

A user was deleting data in PL/pgSQL using the above method, hoping to do 
further processing based on the number of rows affected, and found that it 
always returned 0.


Sample code:
PREPARE smt_del(int) AS DELETE FROM t1 WHERE c=$1;
EXECUTE 'EXECUTE smt_del(100)';
GET DIAGNOSTICS j = ROW_COUNT;

IF j=1 THEN
  do something
ELSIF j=0 THEN
  do something

Here j is always equal to 0.

Regards


Regards

Pavel


Regards,
Quan Zongliang







enhance SPI to support EXECUTE commands

2019-09-05 Thread Quan Zongliang

Dear hackers,

I found that a statement like the following gets a ROW_COUNT of 0 in PL/pgSQL.

PREPARE smt_del(int) AS DELETE FROM t1;
EXECUTE 'EXECUTE smt_del(100)';
GET DIAGNOSTICS j = ROW_COUNT;

In fact, this is a problem with SPI: it does not support getting the result 
of the EXECUTE command. I made a small enhancement to report the number of 
rows processed when INSERT/UPDATE/DELETE statements are executed this way.
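
For reference, the completion tags that the patch parses look like this (the INSERT tag carries an oid before the row count, which is why the code skips one field for INSERT before reading the count):

INSERT 0 1    -- "INSERT <oid> <rows>"
UPDATE 3      -- "UPDATE <rows>"
DELETE 1      -- "DELETE <rows>"

The patch handles ExecuteStmt next to the existing CreateTableAsStmt and CopyStmt cases in _SPI_execute_plan().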


Regards,
Quan Zongliang
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 22dd55c378..b364492e0e 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2272,6 +2272,27 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					_SPI_current->processed = pg_strtouint64(completionTag + 5,
 															 NULL, 10);
 				}
+				else if (IsA(stmt->utilityStmt, ExecuteStmt))
+				{
+					if (strncmp(completionTag, "INSERT ", 7) == 0)
+					{
+						char *p = completionTag + 7;
+						/* INSERT: skip oid and space */
+						while (*p && *p != ' ')
+							p++;
+						if (*p != 0)
+						{
+							_SPI_current->processed =
+								pg_strtouint64(p, NULL, 10);
+						}
+					}
+					else if (strncmp(completionTag, "UPDATE ", 7) == 0 ||
+							 strncmp(completionTag, "DELETE ", 7) == 0)
+					{
+						_SPI_current->processed =
+							pg_strtouint64(completionTag + 7, NULL, 10);
+					}
+				}
}
 
/*