RE: Failed transaction statistics to measure the logical replication progress

2021-09-30 Thread Hou, Zhijie/侯 志杰
On Tues, Sep 28, 2021 6:05 PM Amit Kapila  wrote:
> On Tue, Sep 28, 2021 at 11:35 AM Masahiko Sawada  
> wrote:
> >
> > On Tue, Sep 28, 2021 at 1:54 PM Amit Kapila 
> wrote:
> > >
> > > >
> > > > Then, if, we proceed in this direction,
> > > > the place to implement those stats
> > > > would be on the LogicalRepWorker struct, instead ?
> > > >
> > >
> > > Or, we can make existing stats persistent and then add these stats on
> > > top of it. Sawada-San, do you have any thoughts on this matter?
> >
> > I think that making existing stats including received_lsn and
> > last_msg_receipt_time persistent by using stats collector could cause
> > massive reporting messages. We can report these messages with a
> > certain interval to reduce the amount of messages but we will end up
> > seeing old stats on the view.
> >
> 
> Can't we keep the current and new stats both in-memory and persist on
> disk? So, the persistent stats data will be used to fill the in-memory
> counters after restarting of workers, otherwise, we will always refer
> to in-memory values.

I think this approach works, but I have another concern about it.

The current pg_stat_subscription view is listed under "Dynamic Statistics Views" in
the documentation; the data in it is about the worker process, and the view shows
what the current worker did. But if we make the new xact stats persistent, then
they are no longer about what the current worker did; they look more like historic
data for the subscription.

Adding a new view seems reasonable, but it would introduce yet another
subscription-related view, which might be too much. OTOH, I can see there are
already several views [1] that include xact stats, so maybe adding another one is
acceptable?

[1]
pg_stat_xact_all_tables
pg_stat_xact_sys_tables
pg_stat_xact_user_tables
pg_stat_xact_user_functions

Best regards,
Hou zj


RE: A reloption for partitioned tables - parallel_workers

2021-02-18 Thread Hou, Zhijie
> hi,
> 
> Here we go, my first patch... solves
> https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520
> e...@www.fastmail.com
> 

Hi,

partitioned_table_reloptions(Datum reloptions, bool validate)
 {
+   static const relopt_parse_elt tab[] = {
+       {"parallel_workers", RELOPT_TYPE_INT,
+       offsetof(StdRdOptions, parallel_workers)},
+   };
+
    return (bytea *) build_reloptions(reloptions, validate,
                                      RELOPT_KIND_PARTITIONED,
-                                     0, NULL, 0);
+                                     sizeof(StdRdOptions),
+                                     tab, lengthof(tab));
 }

I noticed that you plan to store parallel_workers in the same struct
(StdRdOptions) as a normal heap relation.
It seems better to store it in a separate struct.

And as commit 1bbd608 said:

> Splitting things has the advantage to
> make the information stored in rd_options include only the necessary
> information, reducing the amount of memory used for a relcache entry
> with partitioned tables if new reloptions are introduced at this level.
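
For example, something roughly like this could work (just an illustrative
sketch; the struct name here is made up, not taken from your patch):

/* Illustrative sketch only: a dedicated reloptions struct for partitioned
 * tables, so their rd_options carries only what they actually need. */
typedef struct PartitionedTableRdOptions
{
    int32       vl_len_;            /* varlena header (do not touch directly!) */
    int         parallel_workers;   /* max number of parallel workers */
} PartitionedTableRdOptions;

The tab[] entry and the build_reloptions() call above would then use
offsetof()/sizeof() of that struct instead of StdRdOptions.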


What do you think?

Best regards,
Houzj










RE: Determine parallel-safety of partition relations for Inserts

2021-02-17 Thread Hou, Zhijie
Hi,

Attaching v6 patches with the changes:
  * rebase the code on Greg's latest parallel insert patch.
  * fix some code comments.
  * add some tests to cover partitioned tables.

Please consider it for further review.

Best regards,
Houzj





v6_0004-reloption-parallel_dml-test-and-doc.patch
Description: v6_0004-reloption-parallel_dml-test-and-doc.patch


v6_0001-guc-option-enable_parallel_dml-src.patch
Description: v6_0001-guc-option-enable_parallel_dml-src.patch


v6_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: v6_0002-guc-option-enable_parallel_dml-doc-and-test.patch


v6_0003-reloption-parallel_dml-src.patch.patch
Description: v6_0003-reloption-parallel_dml-src.patch.patch


RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-10 Thread Hou, Zhijie
> > > It did have a performance gain, but I think it's not big enough to
> > > make up for the extra index cost.
> > > What do you think?
> >
> > Yes... as you suspect, I'm afraid the benefit from parallel bitmap
> > scan may not compensate for the loss of the parallel insert operation.
> >
> > The loss is probably due to 1) more index page splits, 2) more buffer
> > writes (table and index), and 3) internal locks for things such as
> > relation extension and page content protection.  To investigate 3), we
> > should want something like [1], which tells us the wait event
> > statistics (wait count and time for each wait event) per session or
> > across the instance like Oracle, MySQL and EDB provide.  I want to
> > continue this in the near future.
> 
> What would the result look like if you turn off
> parallel_leader_participation?  If the leader is freed from
> reading/writing the table and index, the index page splits and internal
> lock contention may decrease enough to recover part of the loss.
> 
> https://www.postgresql.org/docs/devel/parallel-plans.html
> 
> "In a parallel bitmap heap scan, one process is chosen as the leader. That
> process performs a scan of one or more indexes and builds a bitmap indicating
> which table blocks need to be visited. These blocks are then divided among
> the cooperating processes as in a parallel sequential scan. In other words,
> the heap scan is performed in parallel, but the underlying index scan is
> not."

If I disable parallel_leader_participation:

For max_parallel_workers_per_gather = 4, there is still a performance degradation.

For max_parallel_workers_per_gather = 2, the performance degradation does not
happen in most cases.
There is occasional noise (a performance degradation), but most of the results
(about 80%) are good.

Best regards,
houzj


postgres=# explain (ANALYZE, BUFFERS, VERBOSE, WAL)  insert into testscan 
select a from x where a<8 or (a%2=0 and a>19990);
   QUERY PLAN
-
 Gather  (cost=4272.20..1269111.15 rows=79918 width=0) (actual 
time=381.764..382.715 rows=0 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   Buffers: shared hit=407094 read=4 dirtied=1085 written=1158
   WAL: records=260498 bytes=17019359
   ->  Insert on public.testscan  (cost=3272.20..1260119.35 rows=0 width=0) 
(actual time=378.227..378.229 rows=0 loops=5)
 Buffers: shared hit=407094 read=4 dirtied=1085 written=1158
 WAL: records=260498 bytes=17019359
 Worker 0:  actual time=376.638..376.640 rows=0 loops=1
   Buffers: shared hit=88281 dirtied=236 written=337
   WAL: records=56167 bytes=3674994
 Worker 1:  actual time=377.889..377.892 rows=0 loops=1
   Buffers: shared hit=81231 dirtied=213 written=99
   WAL: records=52175 bytes=3386885
 Worker 2:  actual time=377.388..377.389 rows=0 loops=1
   Buffers: shared hit=82544 dirtied=232 written=279
   WAL: records=52469 bytes=3443843
 Worker 3:  actual time=377.733..377.734 rows=0 loops=1
   Buffers: shared hit=87728 dirtied=225 written=182
   WAL: records=56307 bytes=3660758
 ->  Parallel Bitmap Heap Scan on public.x  (cost=3272.20..1260119.35 
rows=19980 width=8) (actual time=5.832..14.787 rows=26000 loops=5)
   Output: x.a, NULL::integer
   Recheck Cond: ((x.a < 8) OR (x.a > 19990))
   Filter: ((x.a < 8) OR (((x.a % 2) = 0) AND (x.a > 
19990)))
   Rows Removed by Filter: 1
   Heap Blocks: exact=167
   Buffers: shared hit=1475
   Worker 0:  actual time=5.203..14.921 rows=28028 loops=1
 Buffers: shared hit=209
   Worker 1:  actual time=5.165..14.403 rows=26039 loops=1
 Buffers: shared hit=196
   Worker 2:  actual time=5.188..14.284 rows=26177 loops=1
 Buffers: shared hit=195
   Worker 3:  actual time=5.151..14.760 rows=28104 loops=1
 Buffers: shared hit=208
   ->  BitmapOr  (cost=3272.20..3272.20 rows=174813 width=0) 
(actual time=8.288..8.290 rows=0 loops=1)
 Buffers: shared hit=500
 ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1468.38 
rows=79441 width=0) (actual time=3.681..3.681 rows=7 loops=1)
   Index Cond: (x.a < 8)
   Buffers: shared hit=222
 ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1763.86 
rows=95372 width=0) (actual time=4.605..4.605 rows=10 loops=1)
   Index Cond: (x.a > 19990)
   Buffers: shared hit=278
 Planning:
   Buffers: shared hit=19
 Planning Time: 0.173 ms
 Execution Time: 382.776 ms
(47 rows)

RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-10 Thread Hou, Zhijie
> What are the results if disable the bitmap heap scan(Set enable_bitmapscan
> = off)? If that happens to be true, then we might also want to consider
> if in some way we can teach parallel insert to cost more in such cases.
> Another thing we can try is to integrate a parallel-insert patch with the
> patch on another thread [1] and see if that makes any difference but not
> sure if we want to go there at this stage unless it is simple to try that
> out?

If we disable bitmapscan, the performance degradation does not seem to happen.

[Parallel]
postgres=# explain (ANALYZE, BUFFERS, VERBOSE, WAL)  insert into testscan 
select a from x where a<8 or (a%2=0 and a>19990);
QUERY PLAN
---
 Gather  (cost=1000.00..2090216.68 rows=81338 width=0) (actual 
time=0.226..5488.455 rows=0 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   Buffers: shared hit=393364 read=1079535 dirtied=984 written=1027
   WAL: records=260400 bytes=16549513
   ->  Insert on public.testscan  (cost=0.00..2081082.88 rows=0 width=0) 
(actual time=5483.113..5483.114 rows=0 loops=4)
 Buffers: shared hit=393364 read=1079535 dirtied=984 written=1027
 WAL: records=260400 bytes=16549513
 Worker 0:  actual time=5483.116..5483.117 rows=0 loops=1
   Buffers: shared hit=36306 read=264288 dirtied=100 written=49
   WAL: records=23895 bytes=1575860
 Worker 1:  actual time=5483.220..5483.222 rows=0 loops=1
   Buffers: shared hit=39750 read=280476 dirtied=101 written=106
   WAL: records=26141 bytes=1685083
 Worker 2:  actual time=5482.844..5482.845 rows=0 loops=1
   Buffers: shared hit=38660 read=263713 dirtied=105 written=250
   WAL: records=25318 bytes=1657396
 Worker 3:  actual time=5483.272..5483.274 rows=0 loops=1
   Buffers: shared hit=278648 read=271058 dirtied=678 written=622
   WAL: records=185046 bytes=11631174
 ->  Parallel Seq Scan on public.x  (cost=0.00..2081082.88 rows=20334 
width=8) (actual time=4001.641..5287.248 rows=32500 loops=4)
   Output: x.a, NULL::integer
   Filter: ((x.a < 8) OR (((x.a % 2) = 0) AND (x.a > 
19990)))
   Rows Removed by Filter: 49967500
   Buffers: shared hit=1551 read=1079531
   Worker 0:  actual time=5335.456..5340.757 rows=11924 loops=1
 Buffers: shared hit=281 read=264288
   Worker 1:  actual time=5335.559..5341.766 rows=13049 loops=1
 Buffers: shared hit=281 read=280476
   Worker 2:  actual time=5335.534..5340.964 rows=12636 loops=1
 Buffers: shared hit=278 read=263712
   Worker 3:  actual time=0.015..5125.503 rows=92390 loops=1
 Buffers: shared hit=711 read=271055
 Planning:
   Buffers: shared hit=19
 Planning Time: 0.175 ms
 Execution Time: 5488.493 ms

[Serial]
postgres=# explain (ANALYZE, BUFFERS, VERBOSE, WAL)  insert into testscan 
select a from x where a<8 or (a%2=0 and a>19990);
QUERY PLAN
---
 Insert on public.testscan  (cost=0.00..5081085.52 rows=0 width=0) (actual 
time=19311.642..19311.643 rows=0 loops=1)
   Buffers: shared hit=392485 read=1079694 dirtied=934 written=933
   WAL: records=260354 bytes=16259841
   ->  Seq Scan on public.x  (cost=0.00..5081085.52 rows=81338 width=8) (actual 
time=0.010..18997.317 rows=12 loops=1)
 Output: x.a, NULL::integer
 Filter: ((x.a < 8) OR (((x.a % 2) = 0) AND (x.a > 19990)))
 Rows Removed by Filter: 199870001
 Buffers: shared hit=1391 read=1079691
 Planning:
   Buffers: shared hit=10
 Planning Time: 0.125 ms
 Execution Time: 19311.700 ms

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-10 Thread Hou, Zhijie
> > >
> > > else if (IsA(node, Query))
> > > {
> > > Query  *query = (Query *) node;
> > >
> > > /* SELECT FOR UPDATE/SHARE must be treated as unsafe */
> > > if (query->rowMarks != NULL)
> > > {
> > > context->max_hazard = PROPARALLEL_UNSAFE;
> > > return true;
> > > }
> >
> > In my understanding, the max_parallel_hazard() query tree walk is to
> > find constructs that are parallel unsafe in that they call code that
> > can't run in parallel mode.  For example, FOR UPDATE/SHARE on
> > traditional heap AM tuples calls AssignTransactionId() which doesn't
> > support being called in parallel mode.  Likewise ON CONFLICT ... DO
> > UPDATE calls heap_update() which doesn't support parallelism.  I'm not
> > aware of any such hazards downstream of ExecValuesScan().
> >
> > > >You're trying to
> > > > add something that bails based on second-guessing that a parallel
> > > >path  would not be chosen, which I find somewhat objectionable.
> > > >
> > > > If the main goal of bailing out is to avoid doing the potentially
> > > > expensive modification safety check on the target relation, maybe
> > > > we should try to somehow make the check less expensive.  I
> > > > remember reading somewhere in the thread about caching the result
> > > > of this check in relcache, but haven't closely studied the feasibility
> of doing so.
> > > >
> > >
> > > There's no "second-guessing" involved here.
> > > There is no underlying way of dividing up the VALUES data of
> > > "INSERT...VALUES" amongst the parallel workers, even if the planner
> > > was updated to produce a parallel-plan for the "INSERT...VALUES"
> > > case (apart from the fact that spawning off parallel workers to
> > > insert that data would almost always result in worse performance
> > > than a non-parallel plan...) The division of work for parallel
> > > workers is part of the table AM
> > > (scan) implementation, which is not invoked for "INSERT...VALUES".
> >
> > I don't disagree that the planner would not normally assign a parallel
> > path simply to pull values out of a VALUES list mentioned in the
> > INSERT command, but deciding something based on the certainty of it in
> > an earlier planning phase seems odd to me.  Maybe that's just me
> > though.
> >
> 
> I think it is more of a case where neither is a need for parallelism nor
> we want to support parallelism of it. The other possibility for such a check
> could be at some earlier phase say in standard_planner [1] where we are
> doing checks for other constructs where we don't want parallelism (I think
> the check for 'parse->hasModifyingCTE' is quite similar). If you see in
> that check as well we just assume other operations to be in the category
> of parallel-unsafe. I think we should rule out such cases earlier than later.
> Do you have better ideas than what Greg has done to avoid parallelism for
> such cases?
> 
> [1] -
> standard_planner()
> {
> ..
> if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 && IsUnderPostmaster &&
> parse->commandType == CMD_SELECT &&
> !parse->hasModifyingCTE &&
> max_parallel_workers_per_gather > 0 &&
> !IsParallelWorker())
> {
> /* all the cheap tests pass, so scan the query tree */
> glob->maxParallelHazard = max_parallel_hazard(parse); parallelModeOK =
> glob->(glob->maxParallelHazard != PROPARALLEL_UNSAFE);
> }
> else
> {
> /* skip the query tree scan, just assume it's unsafe */
> glob->maxParallelHazard = PROPARALLEL_UNSAFE; parallelModeOK = false;
> }

+1.

In the current parallel_dml option patch, I put this check and some other
high-level checks in a separate function called is_parallel_possible_for_modify.


- * PROPARALLEL_UNSAFE, PROPARALLEL_RESTRICTED, PROPARALLEL_SAFE
+ * Check at a high-level if parallel mode is able to be used for the specified
+ * table-modification statement.
+ * It's not possible in the following cases:
+ *
+ *  1) enable_parallel_dml is off
+ *  2) UPDATE or DELETE command
+ *  3) INSERT...ON CONFLICT...DO UPDATE
+ *  4) INSERT without SELECT on a relation
+ *  5) the reloption parallel_dml_enabled is not set for the target table
+ *
+ * (Note: we don't do in-depth parallel-safety checks here, we do only the
+ * cheaper tests that can quickly exclude obvious cases for which
+ * parallelism isn't supported, to avoid having to do further parallel-safety
+ * checks for these)
  */
+bool
+is_parallel_possible_for_modify(Query *parse)
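
Roughly, the body is just a sequence of cheap early-exit tests in that order
(a simplified sketch, not the exact patch code; the helper names marked below
are only indicative):

bool
is_parallel_possible_for_modify(Query *parse)
{
    /* (1) the GUC switch is off */
    if (!enable_parallel_dml)
        return false;

    /* (2) only INSERT is supported, so UPDATE/DELETE bail out here */
    if (!IsModifySupportedInParallelMode(parse->commandType))
        return false;

    /* (3) INSERT ... ON CONFLICT ... DO UPDATE is not supported */
    if (parse->onConflict != NULL &&
        parse->onConflict->action == ONCONFLICT_UPDATE)
        return false;

    /* (4) INSERT without an underlying SELECT on a relation
     * (helper name is only indicative, not from the patch) */
    if (!query_has_relation_in_select(parse))
        return false;

    /* (5) the target table's parallel_dml_enabled reloption
     * (again, the helper name is only indicative) */
    return target_rel_has_parallel_dml_enabled(parse);
}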





And I put the call to this function at an earlier place, like the following:


if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
IsUnderPostmaster &&
(parse->commandType == CMD_SELECT ||
-   (enable_parallel_dml &&
-   IsModifySupportedInParallelMode(parse->commandType))) &&
+   is_parallel_possible_for_modify(parse)) &&
!parse->hasModifyingCTE &&
max_parallel_workers_per_gather > 0 &&
!IsParallelWorker())


If this looks good, 

RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-09 Thread Hou, Zhijie
> > --
> > postgres=# select pg_size_pretty(pg_indexes_size('testscan_index'));
> >  pg_size_pretty
> > 
> >  4048 kB
> > (1 row)
> >
> > postgres=# select pg_size_pretty(pg_relation_size('testscan_index'));
> >  pg_size_pretty
> > 
> >  4768 kB
> > (1 row)
> 
> Which of the above shows the table size?  What does pg_indexes_size()
> against an index (testscan_index) return?

Sorry, maybe the table name is a little confusing,
but 'testscan_index' is actually a table's name.

pg_indexes_size returns the size of the indexes attached to the table.
pg_relation_size returns the table's size.

Did I miss something?



> > IMO, due to the different pattern of inserts with parallel execution, the
> > btree insert cost is higher than in the serial case.
> >
> > At the same time, the parallel plan does not have a huge performance gain
> > with bitmapscan, so the extra cost of the btree index results in a
> > performance degradation.
> 
> How did you know that the parallelism didn't have a huge performance gain
> with bitmap scan?
> 
> [serial]
>->  Bitmap Heap Scan on public.x  (cost=3272.20..3652841.26 rows=79918
> width=8) (actual time=8.096..41.005 rows=12 loops=1)
> 
> [parallel]
>  ->  Parallel Bitmap Heap Scan on public.x
> (cost=3272.20..1260119.35 rows=19980 width=8) (actual time=5.832..14.787
> rows=26000 loops=5)

I tested the case without the insert (just the query using bitmapscan):

[serial]:
postgres=# explain analyze select a from x where a<8 or (a%2=0 and 
a>19990);
QUERY PLAN
--
 Bitmap Heap Scan on x  (cost=3258.59..3647578.53 rows=81338 width=4) (actual 
time=8.091..34.222 rows=12 loops=1)
   Recheck Cond: ((a < 8) OR (a > 19990))
   Filter: ((a < 8) OR (((a % 2) = 0) AND (a > 19990)))
   Rows Removed by Filter: 5
   Heap Blocks: exact=975
   ->  BitmapOr  (cost=3258.59..3258.59 rows=173971 width=0) (actual 
time=7.964..7.965 rows=0 loops=1)
 ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1495.11 rows=80872 
width=0) (actual time=3.451..3.451 rows=7 loops=1)
   Index Cond: (a < 8)
 ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1722.81 rows=93099 
width=0) (actual time=4.513..4.513 rows=10 loops=1)
   Index Cond: (a > 19990)
 Planning Time: 0.108 ms
 Execution Time: 38.136 ms

[parallel]
postgres=# explain analyze select a from x where a<8 or (a%2=0 and 
a>19990);
   QUERY PLAN

 Gather  (cost=4258.59..1266704.42 rows=81338 width=4) (actual 
time=9.177..22.592 rows=12 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Bitmap Heap Scan on x  (cost=3258.59..1257570.62 rows=20334 
width=4) (actual time=6.402..12.882 rows=26000 loops=5)
 Recheck Cond: ((a < 8) OR (a > 19990))
 Filter: ((a < 8) OR (((a % 2) = 0) AND (a > 19990)))
 Rows Removed by Filter: 1
 Heap Blocks: exact=1
 ->  BitmapOr  (cost=3258.59..3258.59 rows=173971 width=0) (actual 
time=8.785..8.786 rows=0 loops=1)
   ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1495.11 rows=80872 
width=0) (actual time=3.871..3.871 rows=7 loops=1)
 Index Cond: (a < 8)
   ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1722.81 rows=93099 
width=0) (actual time=4.914..4.914 rows=10 loops=1)
 Index Cond: (a > 19990)
 Planning Time: 0.158 ms
 Execution Time: 26.951 ms
(15 rows)

It did have a performance gain, but I think it's not big enough to make up for
the extra index cost.
What do you think?

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-09 Thread Hou, Zhijie
> > So far, what I have found is that:
> > With Tang's conf, a parallel insert generates more WAL records than a
> > serial insert (IMO, this is the main reason for the performance
> > degradation). See the attachment for the plan info.
> >
> > I have tried altering the target table to unlogged, and then the
> > performance degradation no longer happens.
> >
> > And the additional WAL records seem related to the index on the target
> > table.
> > If the target table does not have any index, the WAL records are the
> > same between the parallel plan and the serial plan.
> > Also, there is no performance degradation without the index.
> 
> 
> [serial]
>  Insert on public.testscan  (cost=3272.20..3652841.26 rows=0 width=0)
> (actual time=360.474..360.476 rows=0 loops=1)
>Buffers: shared hit=392569 read=3 dirtied=934 written=933
>WAL: records=260354 bytes=16259841
> 
> [parallel]
>->  Insert on public.testscan  (cost=3272.20..1260119.35 rows=0
> width=0) (actual time=378.227..378.229 rows=0 loops=5)
>  Buffers: shared hit=407094 read=4 dirtied=1085 written=1158
>  WAL: records=260498 bytes=17019359
> 
> 
> More pages are dirtied and written in the parallel execution.  Aren't the
> index and possibly the target table bigger with parallel execution than
> with serial execution?  That may be due to the difference of inserts of
> index keys.

Yes, the table size and index size are bigger with parallel execution.

table and index's size after parallel insert
--
postgres=# select pg_size_pretty(pg_indexes_size('testscan_index'));
 pg_size_pretty

 4048 kB
(1 row)

postgres=#
postgres=# select pg_size_pretty(pg_relation_size('testscan_index'));
 pg_size_pretty

 4768 kB
(1 row)
--

table and index's size after serial insert 
--
postgres=# select pg_size_pretty(pg_indexes_size('testscan_index'));
 pg_size_pretty

 2864 kB
(1 row)

postgres=# select pg_size_pretty(pg_relation_size('testscan_index'));
 pg_size_pretty

 4608 kB
--


To Amit:
> I think you might want to see which exact WAL records are extra by using 
> pg_waldump?

Yes, thanks for the hint. I was doing that, and the result is as follows:

The heap WAL record count is the same between parallel and serial: 12, which is
the row count of the query result.

Parallel Btree WAL records (130500 records):
--
INSERT_LEAF:129500
INSERT_UPPER:497
SPLIT_L:172
SPLIT_R:328
INSERT_POST:0
DEDUP:0
VACUUM:0
DELETE:0
MARK_PAGE_HALFDEAD:0
UNLINK_PAGE:0
UNLINK_PAGE_META:0
NEWROOT:3
REUSE_PAGE:0
META_CLEANUP:0
--

Serial Btree WAL records (130355 records):
--
INSERT_LEAF:129644
INSERT_UPPER:354
SPLIT_L:0
SPLIT_R:355
INSERT_POST:0
DEDUP:0
VACUUM:0
DELETE:0
MARK_PAGE_HALFDEAD:0
UNLINK_PAGE:0
UNLINK_PAGE_META:0
NEWROOT:2
REUSE_PAGE:0
META_CLEANUP:0
--

IMO, due to the different pattern of inserts with parallel execution, the btree
insert cost is higher than in the serial case.

At the same time, the parallel plan does not have a huge performance gain with
bitmapscan, so the extra cost of the btree index results in a performance
degradation.
Does that make sense?

Best regards,
Houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-09 Thread Hou, Zhijie
> > > postgres=# explain verbose insert into testscan select a from x
> > > where
> > > a<8 or (a%2=0 and a>19990);
> > > QUERY PLAN
> > >
> 
> > > --
> > > -
> > >  Gather  (cost=4346.89..1281204.64 rows=81372 width=0)
> > >Workers Planned: 4
> > >->  Insert on public.testscan  (cost=3346.89..1272067.44 rows=0
> > > width=0)
> > >  ->  Parallel Bitmap Heap Scan on public.x1
> > > (cost=3346.89..1272067.44 rows=20343 width=8)
> > >Output: x1.a, NULL::integer
> > >Recheck Cond: ((x1.a < 8) OR (x1.a > 19990))
> > >Filter: ((x1.a < 8) OR (((x1.a % 2) = 0) AND
> > > (x1.a >
> > > 19990)))
> > >->  BitmapOr  (cost=3346.89..3346.89 rows=178808
> > > width=0)
> > >  ->  Bitmap Index Scan on x1_a_idx
> > > (cost=0.00..1495.19 rows=80883 width=0)
> > >Index Cond: (x1.a < 8)
> > >  ->  Bitmap Index Scan on x1_a_idx
> > > (cost=0.00..1811.01 rows=97925 width=0)
> > >Index Cond: (x1.a > 19990)
> > >
> > > PSA is my postgresql.conf file, maybe you can have a look. Besides,
> > > I didn't do any parameters tuning in my test session.
> >
> > I reproduced this on my machine.
> >
> > I think we'd better do "analyze" before the insert, which makes this easier
> > to reproduce.
> > Like:
> >
> > -
> > analyze;
> > explain analyze verbose insert into testscan select a from x where
> > a<8 or (a%2=0 and a>19990);
> > -
> 
> OK then.
> Can you check if just the underlying SELECTs are run (without INSERT), is
> there any performance degradation when compared to a non-parallel scan?

It seems there is no performance degradation without the insert.

So far, what I have found is that:
With Tang's conf, a parallel insert generates more WAL records than a serial
insert (IMO, this is the main reason for the performance degradation).
See the attachment for the plan info.

I have tried altering the target table to unlogged, and then the performance
degradation no longer happens.

And the additional WAL records seem related to the index on the target table.
If the target table does not have any index, the WAL records are the same
between the parallel plan and the serial plan.
Also, there is no performance degradation without the index.

I am still looking at this problem; if anyone has thoughts about it, I would be
grateful if you could share them.

Best regards,
houzj





postgres=# explain (ANALYZE, BUFFERS, VERBOSE, WAL)  insert into testscan 
select a from x where a<8 or (a%2=0 and a>19990);
   QUERY PLAN

 Insert on public.testscan  (cost=3272.20..3652841.26 rows=0 width=0) (actual 
time=360.474..360.476 rows=0 loops=1)
   Buffers: shared hit=392569 read=3 dirtied=934 written=933
   WAL: records=260354 bytes=16259841
   ->  Bitmap Heap Scan on public.x  (cost=3272.20..3652841.26 rows=79918 
width=8) (actual time=8.096..41.005 rows=12 loops=1)
 Output: x.a, NULL::integer
 Recheck Cond: ((x.a < 8) OR (x.a > 19990))
 Filter: ((x.a < 8) OR (((x.a % 2) = 0) AND (x.a > 19990)))
 Rows Removed by Filter: 5
 Heap Blocks: exact=975
 Buffers: shared hit=1475
 ->  BitmapOr  (cost=3272.20..3272.20 rows=174813 width=0) (actual 
time=7.975..7.976 rows=0 loops=1)
   Buffers: shared hit=500
   ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1468.38 rows=79441 
width=0) (actual time=3.469..3.470 rows=7 loops=1)
 Index Cond: (x.a < 8)
 Buffers: shared hit=222
   ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1763.86 rows=95372 
width=0) (actual time=4.499..4.499 rows=10 loops=1)
 Index Cond: (x.a > 19990)
 Buffers: shared hit=278
 Planning:
   Buffers: shared hit=10
 Planning Time: 0.126 ms
 Execution Time: 360.547 ms
(22 rows)
postgres=# explain (ANALYZE, BUFFERS, VERBOSE, WAL)  insert into testscan 
select a from x where a<8 or (a%2=0 and a>19990);
   QUERY PLAN
-
 Gather  (cost=4272.20..1269111.15 rows=79918 width=0) (actual 
time=381.764..382.715 rows=0 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   Buffers: shared hit=407094 read=4 dirtied=1085 written=1158
   WAL: records=260498 bytes=17019359
   ->  Insert on public.testscan  (cost=3272.20..1260119.35 rows=0 

RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-08 Thread Hou, Zhijie
> > Did it actually use a parallel plan in your testing?
> > When I ran these tests with the Parallel INSERT patch applied, it did
> > not naturally choose a parallel plan for any of these cases.
> 
> Yes, these cases pick parallel plan naturally on my test environment.
> 
> postgres=# explain verbose insert into testscan select a from x where
> a<8 or (a%2=0 and a>19990);
> QUERY PLAN
> --
> -
>  Gather  (cost=4346.89..1281204.64 rows=81372 width=0)
>Workers Planned: 4
>->  Insert on public.testscan  (cost=3346.89..1272067.44 rows=0
> width=0)
>  ->  Parallel Bitmap Heap Scan on public.x1
> (cost=3346.89..1272067.44 rows=20343 width=8)
>Output: x1.a, NULL::integer
>Recheck Cond: ((x1.a < 8) OR (x1.a > 19990))
>Filter: ((x1.a < 8) OR (((x1.a % 2) = 0) AND (x1.a >
> 19990)))
>->  BitmapOr  (cost=3346.89..3346.89 rows=178808
> width=0)
>  ->  Bitmap Index Scan on x1_a_idx
> (cost=0.00..1495.19 rows=80883 width=0)
>Index Cond: (x1.a < 8)
>  ->  Bitmap Index Scan on x1_a_idx
> (cost=0.00..1811.01 rows=97925 width=0)
>Index Cond: (x1.a > 19990)
> 
> PSA is my postgresql.conf file, maybe you can have a look. Besides, I didn't
> do any parameters tuning in my test session.

I reproduced this on my machine.

I think we'd better do "analyze" before the insert, which makes this easier to
reproduce.
Like:

-
analyze;
explain analyze verbose insert into testscan select a from x where a<8 or 
(a%2=0 and a>19990);
-

Best regards,
houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-07 Thread Hou, Zhijie
> Posting an updated set of patches.

A minor comment about doc.

+  
+Where the above target table features are determined to be, at worst,
+parallel-restricted, rather than parallel-unsafe, at least a parallel table
+scan may be used in the query plan for the INSERT
+statement. For more information about Parallel Safety, see
+.
+  

It does not seem to mention that a parallel table scan may also be used when the
target table is a foreign or temporary table.

So how about:

+  
+Where the target table is a foreign/temporary table or the above target
+table features are determined to be, at worst, parallel-restricted, rather
+than parallel-unsafe, at least a parallel table scan may be used in the
+query plan for the INSERT statement. For more information
+about Parallel Safety, see .
+  

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Hou, Zhijie
> > > 
> > > static Query *
> > > rewriteRuleAction(Query *parsetree,
> > > ...
> > > if (sub_action_ptr)
> > > +   {
> > > *sub_action_ptr = sub_action;
> > > +   rule_action->hasModifyingCTE |=
> > parsetree->hasModifyingCTE;
> > > +   }
> > > 
> > >
> > > And the Basic test passed.
> > > What do you think ?
> >
> > That is very close to what I was going to suggest, which is this:
> >
> > diff --git a/src/backend/rewrite/rewriteHandler.c
> > b/src/backend/rewrite/rewriteHandler.c
> > index 0672f497c6..3c4417af98 100644
> > --- a/src/backend/rewrite/rewriteHandler.c
> > +++ b/src/backend/rewrite/rewriteHandler.c
> > @@ -631,6 +631,8 @@ rewriteRuleAction(Query *parsetree,
> > checkExprHasSubLink((Node *)
> > rule_action->returningList);
> > }
> >
> > +   rule_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
> > +
> > return rule_action;
> >  }
> 
> 
>   if (parsetree->cteList != NIL && sub_action->commandType !=
> CMD_UTILITY)
>   {
>   ...
>   sub_action->cteList = list_concat(sub_action->cteList,
>   }
> 
> Is it possible when sub_action is CMD_UTILITY ?
> In this case the CTE will be copied to the new one; should we set the
> flag in this case ?

Sorry, a typo in my words.
In this case the CTE will not be copied to the new one; should we still set the
flag in this case?

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Hou, Zhijie
> > 
> > static Query *
> > rewriteRuleAction(Query *parsetree,
> > ...
> > if (sub_action_ptr)
> > +   {
> > *sub_action_ptr = sub_action;
> > +   rule_action->hasModifyingCTE |=
> parsetree->hasModifyingCTE;
> > +   }
> > 
> >
> > And the Basic test passed.
> > What do you think ?
> 
> That is very close to what I was going to suggest, which is this:
> 
> diff --git a/src/backend/rewrite/rewriteHandler.c
> b/src/backend/rewrite/rewriteHandler.c
> index 0672f497c6..3c4417af98 100644
> --- a/src/backend/rewrite/rewriteHandler.c
> +++ b/src/backend/rewrite/rewriteHandler.c
> @@ -631,6 +631,8 @@ rewriteRuleAction(Query *parsetree,
> checkExprHasSubLink((Node *)
> rule_action->returningList);
> }
> 
> +   rule_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
> +
> return rule_action;
>  }


if (parsetree->cteList != NIL && sub_action->commandType != CMD_UTILITY)
{
...
sub_action->cteList = list_concat(sub_action->cteList,
}

Is it possible when sub_action is CMD_UTILITY ?
In this case the CTE will be copied to the new one; should we set the flag
in this case ?

Best regard,
houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Hou, Zhijie
> > I took a look into the hasModifyingCTE bugfix recently, and found a
> > possible bug case without the parallel insert patch.
> >
> > -
> > drop table if exists test_data1;
> > create table test_data1(a int, b int) ; insert into test_data1 select
> > generate_series(1,1000), generate_series(1,1000); set
> > force_parallel_mode=on;
> >
> > CREATE TEMP TABLE bug6051 AS
> >   select i from generate_series(1,3) as i;
> >
> > SELECT * FROM bug6051;
> > CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD select a as
> > i from test_data1;
> >
> > WITH t1 AS ( DELETE FROM bug6051 RETURNING * ) INSERT INTO bug6051
> > SELECT * FROM t1;
> >
> > ***
> > ***ERROR:  cannot assign XIDs during a parallel operation
> > ***
> > -
> >
> > I debugged it, and the parse tree did have a modifying CTE after the rewrite.
> > I think if we can properly set hasModifyingCTE, we can avoid this error by
> > not considering a parallel plan for this.
> >
> 
> Thanks. You've identified that the bug exists for SELECT too. I've verified
> that the issue is fixed by the bugfix included in the Parallel INSERT patch.
> Are you able to review my bugfix?
> Since the problem exists for SELECT in the current Postgres code, I'd like
> to pull that bugfix out and provide it as a separate fix.
> My concern is that there may well be a better way to fix the issue - for
> example, during the re-writing, rather than after the query has been
> re-written.
Hi,

I took a look at the fix and have some thoughts on it.
(Please correct me if you have tried this idea and found something is wrong)

IMO, the main reason hasModifyingCTE ends up false is that the rewriter does not
update hasModifyingCTE when copying the existing 'cteList' to the rewritten
query.

It seems there is only one place where the cteList is copied separately.
---
static Query *
rewriteRuleAction(Query *parsetree,
...
    /* OK, it's safe to combine the CTE lists */
    sub_action->cteList = list_concat(sub_action->cteList,
                                      copyObject(parsetree->cteList));
+   sub_action->hasModifyingCTE |= parsetree->hasModifyingCTE;


Based on the above, if we update hasModifyingCTE here, we may solve this
problem.

And there is another point here: sub_action may not be the final parse tree.
If the rule is defined like "DO INSTEAD insert into xx select xx from xx", the
rewriter will put the cteList into a subquery in the parse tree's rtable.
In this case, we may also need to update the final parse tree too.
(I think you know this case; I found the same logic in the latest patch.)


static Query *
rewriteRuleAction(Query *parsetree,
...
    if (sub_action_ptr)
+   {
        *sub_action_ptr = sub_action;
+       rule_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
+   }


And the basic tests passed.
What do you think?

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Hou, Zhijie
Hi,

I took a look at the hasModifyingCTE bugfix recently,
and found a possible bug case even without the parallel insert patch.

-
drop table if exists test_data1;
create table test_data1(a int, b int) ;
insert into test_data1 select generate_series(1,1000), generate_series(1,1000);
set force_parallel_mode=on;

CREATE TEMP TABLE bug6051 AS
  select i from generate_series(1,3) as i;

SELECT * FROM bug6051;
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD select a as i from 
test_data1;

WITH t1 AS ( DELETE FROM bug6051 RETURNING * ) INSERT INTO bug6051 SELECT * 
FROM t1;

***
***ERROR:  cannot assign XIDs during a parallel operation
***
-

I debugged it, and the parse tree did have a modifying CTE after the rewrite.
I think if we can properly set hasModifyingCTE, we can avoid this error by not
considering a parallel plan in this case.

Thoughts?

Best regards,
houzj







RE: Determine parallel-safety of partition relations for Inserts

2021-02-02 Thread Hou, Zhijie
> For v3_0003-reloption-parallel_dml-src.patch :
> +   table_close(rel, NoLock);

> Since the rel would always be closed, it seems the return value from 
> RelationGetParallelDML() can be assigned to a variable, followed by call to 
> table_close(), then the return statement.

Thanks for the comment. Fixed in the latest patch.

Best regards,
houzj




RE: Determine parallel-safety of partition relations for Inserts

2021-02-02 Thread Hou, Zhijie
Hi,

Attaching v5 patches with the changes:
  * rebase the code on Greg's latest parallel insert patch
  * fix some code style.

Please consider it for further review.

Best regards,
Houzj
 
 





v5_0004-reloption-parallel_dml-test-and-doc.patch
Description: v5_0004-reloption-parallel_dml-test-and-doc.patch


v5_0001-guc-option-enable_parallel_dml-src.patch
Description: v5_0001-guc-option-enable_parallel_dml-src.patch


v5_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: v5_0002-guc-option-enable_parallel_dml-doc-and-test.patch


v5_0003-reloption-parallel_dml-src.patch.patch
Description: v5_0003-reloption-parallel_dml-src.patch.patch


RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-02 Thread Hou, Zhijie
> 
> I've had a further look at this, and this walker function is doing a lot
> of work recursing the parse tree, and I'm not sure that it reliably retrieves
> the information that we;re looking for, for all cases of different SQL
> queries. Unless it can be made much more efficient and specific to our needs,
> I think we should not try to do this optimization, because there's too much
> overhead. Also, keep in mind that for the current parallel SELECT
> functionality in Postgres, I don't see any similar optimization being
> attempted (and such optimization should be attempted at the SELECT level).
> So I don't think we should be attempting such optimization in this patch
> (but could be attempted in a separate patch, just related to current parallel
> SELECT functionality).

Yes, agreed.
I was worried about the overhead it might bring too;
we can remove this from the current patch.

Best regards,
houzj




RE: Determine parallel-safety of partition relations for Inserts

2021-02-01 Thread Hou, Zhijie
> > IMO, it seems more readable to extract all the checks that we can do
> > before the safety check and put them in the new function.
> >
> > Please consider it for further review.
> >
> 
> I've updated your v2 patches and altered some comments and documentation
> changes (but made no code changes) - please compare against your v2 patches,
> and see whether you agree with the changes to the wording.
> In the documentation, you will also notice that in your V2 patch, it says
> that the "parallel_dml_enabled" table option defaults to false.
> As it actually defaults to true, I changed that in the documentation too.

Hi greg,

Thanks a lot for the document update, LGTM.

Attaching v4 patches with the changes:
 * fix some typos in the code.
 * store the partitioned-table reloptions in a separate struct PartitionedOptions,
   making it more standard and easier to extend in the future.

Please consider it for further review.

Best regards,
Houzj





v4_0001-guc-option-enable_parallel_dml-src.patch
Description: v4_0001-guc-option-enable_parallel_dml-src.patch


v4_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: v4_0002-guc-option-enable_parallel_dml-doc-and-test.patch


v4_0003-reloption-parallel_dml-src.patch.patch
Description: v4_0003-reloption-parallel_dml-src.patch.patch


v4_0004-reloption-parallel_dml-test-and-doc.patch
Description: v4_0004-reloption-parallel_dml-test-and-doc.patch


RE: Parallel INSERT (INTO ... SELECT ...)

2021-02-01 Thread Hou, Zhijie
Hi,

When developing the reloption patch, I noticed some issues in the patch.

1).
> - Reduce Insert parallel-safety checks required for some SQL, by noting
> that the subquery must operate on a relation (check for RTE_RELATION in
> subquery range-table)

+   foreach(lcSub, rte->subquery->rtable)
+   {
+   rteSub = lfirst_node(RangeTblEntry, lcSub);
+   if (rteSub->rtekind == RTE_RELATION)
+   {
+   hasSubQueryOnRelation = true;
+   break;
+   }
+   }
It seems we cannot search for RTE_RELATION only in the rtable,
because an RTE_RELATION may exist in other places, such as:

---
--** explain insert into target select (select * from test);
Subplan's subplan

--** with cte as (select * from test) insert into target select * from cte;
In query's ctelist.
---

Maybe we should use a walker function [1] to
search the subquery and the cteList.

2).

+--
+-- Test INSERT into temporary table with underlying query.
+-- (should not use a parallel plan)
+--

Maybe the comment here needs some change, since
we currently support a parallel plan for temp tables.

3)
Do you think we can add a test case for foreign tables,
to test a parallel query with a serial insert on a foreign table?


[1]
static bool
relation_walker(Node *node)
{
    if (node == NULL)
        return false;

    else if (IsA(node, RangeTblEntry))
    {
        RangeTblEntry *rte = (RangeTblEntry *) node;

        if (rte->rtekind == RTE_RELATION)
            return true;

        return false;
    }

    else if (IsA(node, Query))
    {
        Query  *query = (Query *) node;

        /* Recurse into subselects */
        return query_tree_walker(query, relation_walker,
                                 NULL, QTW_EXAMINE_RTES_BEFORE);
    }

    /* Recurse to check arguments */
    return expression_tree_walker(node, relation_walker, NULL);
}


Best regards,
houzj




RE: Determine parallel-safety of partition relations for Inserts

2021-01-31 Thread Hou, Zhijie
Hi greg,

Thanks for the review !

> Personally, I think a table's "parallel_dml" option should be ON by default.
> It's annoying to have to separately enable it for each and every table being
> used, when I think the need to turn it selectively OFF should be fairly
> rare.

Yes, I agreed.
Changed.

> And I'm not sure that "parallel_dml" is the best name for the table option
> - because it sort of implies parallel dml WILL be used - but that isn't
> true, it depends on other factors too.
> So I think (to be consistent with other table option naming) it would have
> to be something like "parallel_dml_enabled".

Agreed.
Changed to parallel_dml_enabled.

Attaching v2 patches which address the comments above.

Some further refactoring:

I introduced a new function is_parallel_possible_for_modify(), which decides
whether to do the safety check.

IMO, it seems more readable to extract all the checks that we can do before the
safety check and put them in the new function.

Please consider it for further review.

Best regards,
houzj




v2_0003-reloption-parallel_dml-src.patch
Description: v2_0003-reloption-parallel_dml-src.patch


v2_0004-reloption-parallel_dml-test-and-doc.patch
Description: v2_0004-reloption-parallel_dml-test-and-doc.patch


v2_0001-guc-option-enable_parallel_dml-src.patch
Description: v2_0001-guc-option-enable_parallel_dml-src.patch


v2_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: v2_0002-guc-option-enable_parallel_dml-doc-and-test.patch


RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
> Hi,
> 
> Attaching v1 patches, introducing options which let the user manually control
> whether to use parallel DML.
> 
> About the patches:
> 1. add a new GUC option: enable_parallel_dml (boolean)
> 2. add a new table option: parallel_dml (boolean)
> 
>    The default of both is off (false).
> 
> The user can set enable_parallel_dml in postgresql.conf or in a session to
> enable parallel DML.
> If the user wants specific tables to use parallel insert, they can set the
> table's parallel_dml option to on.
> 
> Some notes:
> (1)
> Currently, if the GUC option enable_parallel_dml is set to on but the table's
> parallel_dml is off, the planner still does not consider parallel for DML.
> 
> This way, if the user wants to use parallel DML, they have to set
> enable_parallel_dml=on and also set parallel_dml=on.
> If someone dislikes this, I think we can also let the table option
> parallel_dml default to on; then the user only needs to set
> enable_parallel_dml=on.
> 
> (2)
> For the parallel_dml option:
> If the target table is partitioned and its parallel_dml is set to on, the
> planner will ignore the option value of its children.
> This is because we cannot divide the DML plan across separate tables, so we
> just check the target table itself.
> 
> 
> Thoughts and comments are welcome.

The patches are based on the v13 patch (parallel INSERT SELECT) in [1].

[1] 
https://www.postgresql.org/message-id/CAJcOf-ejV8iU%2BYpuV4qbYEY-2vCG1QF2g3Gxn%3DZ%2BPyUH_5f84A%40mail.gmail.com

Best regards,
houzj




RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
Hi,

Attaching v1 patches, introducing options which let the user manually control
whether to use parallel DML.

About the patches:
1. add a new GUC option: enable_parallel_dml (boolean)
2. add a new table option: parallel_dml (boolean)

   The default of both is off (false).

The user can set enable_parallel_dml in postgresql.conf or in a session to
enable parallel DML.
If the user wants specific tables to use parallel insert, they can set the
table's parallel_dml option to on.

Some notes:
(1)
Currently, if the GUC option enable_parallel_dml is set to on but the table's
parallel_dml is off, the planner still does not consider parallel for DML.

This way, if the user wants to use parallel DML, they have to set
enable_parallel_dml=on and also set parallel_dml=on.
If someone dislikes this, I think we can also let the table option parallel_dml
default to on; then the user only needs to set enable_parallel_dml=on.

(2)
For the parallel_dml option:
If the target table is partitioned and its parallel_dml is set to on, the
planner will ignore the option value of its children.
This is because we cannot divide the DML plan across separate tables, so we
just check the target table itself.


Thoughts and comments are welcome.

Best regards,
houzj




v1_0004-reloption-parallel_dml-test-and-doc.patch
Description: v1_0004-reloption-parallel_dml-test-and-doc.patch


v1_0001-guc-option-enable_parallel_dml-src.patch
Description: v1_0001-guc-option-enable_parallel_dml-src.patch


v1_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: v1_0002-guc-option-enable_parallel_dml-doc-and-test.patch


v1_0003-reloption-parallel_dml-src.patch
Description: v1_0003-reloption-parallel_dml-src.patch


RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
> > Hi
> >
> > I have a question about the parameter for DML.
> >
> > Suppose we define the parameter as a table option.
> >
> > For a partitioned table, if we set the parent table's parallel_dml=on,
> > and set one of its partitions' parallel_dml=off, it seems we cannot divide
> > the plan per table.
> >
> > For this case, should we just check the parent's parameter ?
> >
> 
> I think so. IIUC, this means the Inserts where target table is parent table,
> those will just check the option of the parent table and then ignore the
> option value for child tables whereas we will consider childrel's option
> for Inserts where target table is one of the child table, right?
> 

Yes, I think we can just check the target table itself.

Best regards,
houzj





RE: Determine parallel-safety of partition relations for Inserts

2021-01-28 Thread Hou, Zhijie
> From: Amit Kapila 
> > Good question. I think if we choose to have a separate parameter for
> > DML, it can probably a boolean to just indicate whether to enable
> > parallel DML for a specified table and use the parallel_workers
> > specified in the table used in SELECT.
> 
> Agreed.

Hi

I have a question about the parameter for DML.

Suppose we define the parameter as a table option.

For a partitioned table, if we set the parent table's parallel_dml=on and set
one of its partitions' parallel_dml=off, it seems we cannot divide the plan per
table.

For this case, should we just check the parent's parameter?

Best regards,
houzj




RE: fix typo in reorderbuffer.c

2021-01-26 Thread Hou, Zhijie
> >> Combocid's ==>> Combocids
> >
> > Ping again, just in case it’s missed.
> 
> There are in the comments "ComboCIDs", "combocids" and "ComboCids" on top
> of the existing Combocid's.  Your patch introduces a fourth way of writing
> that.  It may be better to just have one way to govern them all.

Hi Michael,

Thanks for the review.

Actually, "Combocid's" in the patch is the first word of the sentence, so the 
first char is uppercase(may be not a new style).

I agree that it’s better to have a common way, but some different style of 
combocid follow the variable or functionname[1].
We may need to change the var name or function name too.

Personally , I prefer "combocids".
But do you think we can change that in a separate patch apart from this typo 
patch ?


[1]:
void
SerializeComboCIDState(Size maxsize, char *start_address)

static int  usedComboCids = 0;  /* number of elements in comboCids */
...

Best regards,
houzj






RE: fix typo in reorderbuffer.c

2021-01-26 Thread Hou, Zhijie
> I found a possible typo in reorderbuffer.c
> 
>  *   has got a combocid. Combocid's are only valid for the duration
> of a
> 
> Combocid's ==>> Combocids
> 
> Attatching a small patch to correct it.
> 

Ping again, just in case it’s missed.




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-26 Thread Hou, Zhijie
Hi,

When testing the patch with the following kinds of SQL:

---
Insert into part_table select 1;
Insert into part_table select generate_series(1,1,1);
Insert into part_table select * from testfunc();
---

We usually use these SQL statements to initialize the table or for testing
purposes.

Personally, I think we do not need to do the parallel safety check for these
cases, because there is no chance for the SELECT part to use a parallel plan.

I thought we aim to not check parallel safety unless a parallel plan is
possible, so I was wondering whether it is possible to avoid the check in these
cases.

I did a quick check of the code. An immature idea is to check whether there is
an RTE_RELATION in the query; if not, we skip the safety check.
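
Something along these lines, just to show the idea (a rough sketch only; the
function name is made up):

/* Rough sketch: return true if the query's range table references any plain
 * relation; if it does not, the SELECT part cannot use a parallel table scan,
 * so we could skip the parallel-safety check of the target table. */
static bool
query_references_relation(Query *parse)
{
    ListCell   *lc;

    foreach(lc, parse->rtable)
    {
        RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);

        if (rte->rtekind == RTE_RELATION)
            return true;
    }
    return false;
}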

I am not sure whether it is worth doing that; any thoughts?

Best regards,
Houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-26 Thread Hou, Zhijie
> I think we can allow parallel insert in this case, because the column value
> is determined according to the DEFAULT definition of the target table
> specified in the INSERT statement.  This is described here:
> 
> https://www.postgresql.org/docs/devel/sql-createtable.html
> 
> "Defaults may be specified separately for each partition. But note that
> a partition's default value is not applied when inserting a tuple through
> a partitioned table."
> 
> So the parallel-unsafe function should not be called.

Thanks for the explanation.
I think you are right, I did miss it.

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-25 Thread Hou, Zhijie
Hi,

I have an issue of the check about column default expressions.

+   if (command_type == CMD_INSERT)
+   {
+       /*
+        * Column default expressions for columns in the target-list are
+        * already being checked for parallel-safety in the
+        * max_parallel_hazard() scan of the query tree in standard_planner().
+        */
+
+       tupdesc = RelationGetDescr(rel);
+       for (attnum = 0; attnum < tupdesc->natts; attnum++)


IMO, max_parallel_hazard() only checks the parent table's default expressions.
But if the table has partitions and a partition has its own default expressions,
max_parallel_hazard() does not seem to check those, and we do not seem to check
them either.

I am not sure whether we should allow parallel insert in this case.

Example:

-
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather=4;

create table origin(a int);
insert into origin values(generate_series(1,5000));

create or replace function bdefault_unsafe () returns int language plpgsql parallel unsafe as $$
begin
RETURN 5;
end $$;

create table parttable1 (a int, b name) partition by range (a);
create table parttable1_1 partition of parttable1 for values from (0) to (5000);
create table parttable1_2 partition of parttable1 for values from (5000) to (1);

alter table parttable1_1 ALTER COLUMN b SET DEFAULT bdefault_unsafe();

postgres=# explain insert into parttable1 select * from origin ;
   QUERY PLAN   

 Gather  (cost=0.00..41.92 rows=5865 width=0)
   Workers Planned: 3
   ->  Insert on parttable1  (cost=0.00..41.92 rows=0 width=0)
 ->  Parallel Seq Scan on origin  (cost=0.00..41.92 rows=1892 width=68)
(4 rows)

postgres=# explain insert into parttable1_1 select * from origin ;
QUERY PLAN 
---
 Insert on parttable1_1  (cost=0.00..1348.00 rows=0 width=0)
   ->  Seq Scan on origin  (cost=0.00..1348.00 rows=5000 width=68)
(2 rows)

-

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-25 Thread Hou, Zhijie
Hi,

When reading the code of rel_max_parallel_hazard_for_modify in 0001, I noticed
there are many places that call table_close().
Personally, I find it a little confusing.

Do you think it would be better to do the table_open/close outside of
rel_max_parallel_hazard_for_modify?

Like:

static bool rel_max_parallel_hazard_for_modify(Relation rel,
                                               CmdType command_type,
                                               max_parallel_hazard_context *context);
...
Relation relation = table_open(rte->relid, NoLock);
(void) rel_max_parallel_hazard_for_modify(relation, parse->commandType, &context);
table_close(relation, NoLock);


And we do not seem to need the lockmode param with the above definition.


Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-24 Thread Hou, Zhijie
Hi,

After doing some tests to cover the code paths in patch 0001,
I have some suggestions for the 0002 test cases.


(1)
+       /* Check parallel-safety of any expressions in the partition key */
+       if (get_partition_col_attnum(pkey, i) == 0)
+       {
+           Node   *check_expr = (Node *) lfirst(partexprs_item);
+
+           if (max_parallel_hazard_walker(check_expr, context))
+           {
+               table_close(rel, lockmode);
+               return true;
+           }

The test cases do not seem to cover the above code (i.e., when the table has a
parallel-unsafe expression in the partition key).

Personally, I use the following SQL to cover this:
-
create table partkey_unsafe_key_expr_t (a int4, b name) partition by range ((fullname_parallel_unsafe('',a::varchar)));
explain (costs off) insert into partkey_unsafe_key_expr_t select unique1, stringu1 from tenk1;
-


(2)
I noticed that most of the test cases test all of parallel safe/unsafe/restricted.
But the index-expression tests do not seem to cover parallel restricted.
How about adding a test case like:
-
create or replace function fullname_parallel_restricted(f text, l text) returns text as $$
begin
return f || l;
end;
$$ language plpgsql immutable parallel restricted;

create table names4(index int, first_name text, last_name text);
create index names4_fullname_idx on names4 (fullname_parallel_restricted(first_name, last_name));

--
-- Test INSERT with parallel-restricted index expression
-- (should create a parallel plan)
--
explain (costs off) insert into names4 select * from names;
-

(3)
+   /* Recursively check each partition ... */
+   pdesc = RelationGetPartitionDesc(rel);
+   for (i = 0; i < pdesc->nparts; i++)
+   {
+       if (rel_max_parallel_hazard_for_modify(pdesc->oids[i],
+                                              command_type,
+                                              context,
+                                              AccessShareLock))
+       {
+           table_close(rel, lockmode);
+           return true;
+       }
+   }

It seems we do not have a test case that covers a parallel-unsafe expression (or
similar) in a partition.
How about adding one test case for a parallel-unsafe partition?

Best regards,
houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Hou, Zhijie

> > And there seems to be another solution for this:
> >
> > In the patch, we only use { ii_Expressions, ii_NumIndexAttrs,
> > ii_IndexAttrNumbers } from the IndexInfo, which we can seemingly get from
> > "Relation->rd_index".
> >
> > Based on the above, maybe we do not need to call BuildIndexInfo to build
> > the IndexInfo.
> > It can avoid some unnecessary cost.
> > And in this solution we do not need to remove expression_planner.
> >
> 
> Hmmm, when I debugged my simple test case, I found rel->rd_index was NULL,
> so it seems that the call to BuildIndexInfo is needed.
> (have I understood your suggestion correctly?)
> 

Hi greg,

Thanks for debugging this.

Maybe I missed something, but I am not sure about the case where rel->rd_index
is NULL, because the function BuildIndexInfo does not seem to have a NULL check
for index->rd_index.
Like the following:

BuildIndexInfo(Relation index)
{
IndexInfo  *ii;
Form_pg_index indexStruct = index->rd_index;
int i;
int numAtts;

/* check the number of keys, and copy attr numbers into the IndexInfo */
numAtts = indexStruct->indnatts;


And the patch do not have NULL-check for index->rd_index too.
So I thought we can assume index->rd_index is not null, but it seems I may 
missed something ?

Can you please share the test case with me ?

I used the following code to replace the call to BuildIndexInfo,
and installcheck passed.

Example:
+Form_pg_index indexStruct = index_rel->rd_index;
+List *ii_Expressions = RelationGetIndexExpressions(index_rel);
+int ii_NumIndexAttrs = indexStruct->indnatts;
+AttrNumber  ii_IndexAttrNumbers[INDEX_MAX_KEYS];

+for (int i = 0; i < ii_NumIndexAttrs; i++)
+ii_IndexAttrNumbers[i] = indexStruct->indkey.values[i];

Best regards,
houzj






RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Hou, Zhijie
Hi

I took a look at v12-0001 patch, here are some comments:

1.
+   /*
+* Setup the context used in finding the max parallel-mode hazard.
+*/
+   Assert(initial_max_parallel_hazard == 0 ||
+  initial_max_parallel_hazard == PROPARALLEL_SAFE ||
+  initial_max_parallel_hazard == PROPARALLEL_RESTRICTED);
+   context.max_hazard = initial_max_parallel_hazard == 0 ?
+   PROPARALLEL_SAFE : initial_max_parallel_hazard;

I am not quite sure when "max_parallel_hazard == 0" can occur.
Does it mean the case where the max_parallel_hazard_context has not been initialized?


2.
Some tiny code style suggestions

+   if (con->contype == CONSTRAINT_CHECK)
+   {
+   char   *conbin;
+   Datum   val;
+   boolisnull;
+   Expr   *check_expr;

How about :

if (con->contype != CONSTRAINT_CHECK)
continue;

3.
+   if (keycol == 0)
+   {
+   /* Found an index expression */
+
+   Node   *index_expr;

Like 2, how about:

if (keycol != 0)
    continue;


4.
+   ListCell   *index_expr_item = 
list_head(index_info->ii_Expressions);
...
+   index_expr = (Node *) 
lfirst(index_expr_item);
+   index_expr = (Node *) 
expression_planner((Expr *) index_expr);

It seems BuildIndexInfo has already called eval_const_expressions for ii_Expressions,
via the flow: BuildIndexInfo -> RelationGetIndexExpressions -> eval_const_expressions.

So, IMO, we do not need to call expression_planner for the expression again.


And there seems to be another solution for this:

In the patch, we only use { ii_Expressions, ii_NumIndexAttrs, ii_IndexAttrNumbers } from
the IndexInfo, which can apparently be obtained from "Relation->rd_index".

Based on the above, maybe we do not need to call BuildIndexInfo to build the IndexInfo at
all. That would avoid some unnecessary cost, and with this approach we would not need to
remove expression_planner either.

What do you think?

Best regards,
houzj
















RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Hou, Zhijie
> >
> > +
> > +   index_oid_list = RelationGetIndexList(rel);
> > ...
> >
> > As mentioned in the comments of RelationGetIndexList:
> > * we return a copy of the list palloc'd in the caller's context.  The
> > caller
> > * may list_free() the returned list after scanning it.
> >
> > Shall we list_free(index_oid_list) at the end of function ?
> > Just to avoid potential memory leak.
> >
> 
> I think that's a good idea, so I'll make that update in the next version
> of the patch.
> I do notice, however, that there seems to be quite a few places in the 
> Postgres
> code where RelationGetIndexList() is being called without a corresponding
> list_free() being called - obviously instead relying on it being deallocated
> when the caller's memory-context is destroyed.
> 

Yes, it will be deallocated when the caller's memory-context is destroyed.

Currently, the parallel safety check examines each partition.
I am just a little worried that if there are lots of partitions, it may cause high memory
usage.

And there is another place like this:

1.
+   conbin = TextDatumGetCString(val);
+   check_expr = stringToNode(conbin);

It seems we can free the conbin once it is no longer needed (for the same reason as
above); see the sketch below. What do you think?
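
A rough sketch of what I mean (stringToNode builds its own node tree, so the string is no
longer needed afterwards):

	conbin = TextDatumGetCString(val);
	check_expr = stringToNode(conbin);
	pfree(conbin);	/* the node tree does not reference conbin anymore */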


Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Hou, Zhijie
> > So I think we're saying that if the target table is a foreign table or
> > temporary table, it can be regarded as PARALLEL_RESTRICTED, right?
> >
> 
> Yes
> 
> IMO, PARALLEL_RESTRICTED currently enable parallel select but disable
> parallel insert.
> So, the INSERT only happen in leader worker which seems safe to insert into
> tempory/foreigh table.
> 
> In addition, there are some other restriction about parallel select which
> seems can be removed:
> 
> 1.- Target table has a parallel-unsafe trigger, index expression, column
> default
> expression or check constraint
> 2.- Target table is a partitioned table with a parallel-unsafe partition
> key
> expression or support function
> 
> If the Insert's target table is the type listed above, Is there some reasons
> why we can not support parallel select ?
> It seems only leader worker will execute the trigger and key-experssion
> which seems safe.
> (If I miss something about it, please let me know)


So sorry, please ignore the above; I was thinking of something else.

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Hou, Zhijie
> > > Hi
> > >
> > > > > I may be wrong, and if I miss sth in previous mails, please give
> > > > > me some
> > > > hints.
> > > > > IMO, serial insertion with underlying parallel SELECT can be
> > > > > considered for foreign table or temporary table, as the
> > > > > insertions only
> > > > happened in the leader process.
> > > > >
> > > >
> > > > I don't think we support parallel scan for temporary tables. Can
> > > > you please try once both of these operations without Insert being
> > > > involved? If you are able to produce a parallel plan without
> > > > Insert then we can see why it is not supported with Insert.
> > >
> > > Sorry, may be I did not express it clearly, I actually means the case
> when insert's target(not in select part) table is temporary.
> > > And you are right that parallel select is not enabled when temporary
> table is in select part.
> > >
> >
> > I think Select can be parallel for this case and we should support this
> case.
> >
> 
> So I think we're saying that if the target table is a foreign table or
> temporary table, it can be regarded as PARALLEL_RESTRICTED, right?
>

Yes

IMO, PARALLEL_RESTRICTED currently enables parallel SELECT but disables parallel INSERT.
So the INSERT only happens in the leader, which seems safe for inserting into a
temporary/foreign table.

In addition, there are some other restrictions on parallel SELECT which it seems could be
removed:

1. Target table has a parallel-unsafe trigger, index expression, column default
expression or check constraint
2. Target table is a partitioned table with a parallel-unsafe partition key
expression or support function

If the INSERT's target table is one of the types listed above, is there some reason why
we cannot support parallel SELECT?
It seems only the leader will execute the triggers and key expressions, which looks safe.
(If I am missing something here, please let me know)

Best regards,
houzj







RE: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-20 Thread Hou, Zhijie
> > > Attaching v15 patch set. Please consider it for further review.
> >
> > Hi
> >
> > I have some comments for the 0001 patch
> >
> > In v15-0001-postgres_fdw-function-to-discard-cached-connecti
> >
> > 1.
> > +  If there is no open connection to the given foreign server,
> false
> > +  is returned. If no foreign server with the given name is found,
> > + an error
> >
> > Do you think it's better add some testcases about:
> > call postgres_fdw_disconnect and postgres_fdw_disconnect_all
> > when there is no open connection to the given foreign server
> 
> Do you mean a test case where foreign server exists but
> postgres_fdw_disconnect() returns false because there's no connection for
> that server?


Yes, I read this from the doc, so I think it's better to test this.




> > 2.
> > +   /*
> > +* For the given server, if we closed connection
> or it is still in
> > +* use, then no need of scanning the cache
> further.
> > +*/
> > +   if (entry->server_hashvalue == hashvalue &&
> > +   (entry->xact_depth > 0 || result))
> > +   {
> > +   hash_seq_term();
> > +   break;
> > +   }
> >
> > If I am not wrong, is the following condition always true ?
> > (entry->xact_depth > 0 || result)
> 
> It's not always true. But it seems like it's too confusing, please have
> a look at the upthread suggestion to change this with can_terminate_scan
> boolean.

Thanks for the reminder, I will look at that.



Best regards,
houzj




RE: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-20 Thread Hou, Zhijie
> Attaching v15 patch set. Please consider it for further review.

Hi

I have some comments for the 0001 patch

In v15-0001-postgres_fdw-function-to-discard-cached-connecti

1.
+  If there is no open connection to the given foreign server, 
false
+  is returned. If no foreign server with the given name is found, an error

Do you think it's better to add some test cases for calling postgres_fdw_disconnect and
postgres_fdw_disconnect_all when there is no open connection to the given foreign server?

2.
+   /*
+* For the given server, if we closed connection or it 
is still in
+* use, then no need of scanning the cache further.
+*/
+   if (entry->server_hashvalue == hashvalue &&
+   (entry->xact_depth > 0 || result))
+   {
+   hash_seq_term();
+   break;
+   }

If I am not wrong, is the following condition always true ?
(entry->xact_depth > 0 || result)

Best regards,
houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Hou, Zhijie
> 
> Thanks for the feedback.
> Posting an updated set of patches. Changes are based on feedback, as detailed
> below:
Hi

It seems some previous comments [1][2] have not been addressed in the current patch.
Just to make sure they are not missed.

[1]
https://www.postgresql.org/message-id/77e1c06ffb2240838e5fc94ec8dcb7d3%40G08CNEXMBPEKD05.g08.fujitsu.local

[2]
https://www.postgresql.org/message-id/CAA4eK1LMmz58ej5BgVLJ8VsUGd%3D%2BKcaA8X%3DkStORhxpfpODOxg%40mail.gmail.com

Best regards,
houzj




RE: Add Nullif case for eval_const_expressions_mutator

2021-01-19 Thread Hou, Zhijie
Hi

Thanks for the review.

> It's a bit unfortunate now that between OpExpr, DistinctExpr, NullIfExpr,
> and to a lesser extent ScalarArrayOpExpr we will now have several different
> implementations of nearly the same thing, without any explanation why one
> approach was chosen here and another there.  We should at least document
> this.

I am not quite sure where to document the difference.
For now, I tried to add some comments for the NullIf case to explain why this one is
different.

+   /*
+* Since NullIf is not common enough to deserve 
specially
+* optimized code, use ece_generic_processing 
and
+* ece_evaluate_expr to simplify the code as 
much as possible.
+*/

Any suggestions ?

> Some inconsistencies I found: The code for DistinctExpr calls
> expression_tree_mutator() directly, but your code for NullIfExpr calls
> ece_generic_processing(), even though the explanation in the comment for
> DistinctExpr would apply there as well.
> 
> Your code for NullIfExpr doesn't appear to call set_opfuncid() anywhere.

IMO, set_opfuncid will be called from ece_evaluate_expr, via the following flow:

ece_evaluate_expr -> evaluate_expr -> fix_opfuncids -> fix_opfuncids_walker ->
set_opfuncid

And we do not need the opfuncid until we call ece_evaluate_expr.
So, to keep the code as simple as possible, I did not call set_opfuncid in
eval_const_expressions_mutator again.


> I would move your new block for NullIfExpr after the block for DistinctExpr.
> That's the order in which these blocks appear elsewhere for generic node
> processing.
> 

Changed.


> Check your whitespace usage:
> 
>  if(!has_nonconst_input)
> 
> should have a space after the "if".  (It's easy to fix of course, but I
> figure I'd point it out here since you have submitted several patches with
> this style, so it's perhaps a habit to break.)

Changed.


> Perhaps add a comment to the tests like this so it's clear what they are
> for:
> 
> diff --git a/src/test/regress/sql/case.sql
> b/src/test/regress/sql/case.sql index 4742e1d0e0..98e3fb8de5 100644
> --- a/src/test/regress/sql/case.sql
> +++ b/src/test/regress/sql/case.sql
> @@ -137,6 +137,7 @@ CREATE TABLE CASE2_TBL (
> FROM CASE_TBL a, CASE2_TBL b
> WHERE COALESCE(f,b.i) = 2;
> 
> +-- Tests for constant subexpression simplification
>   explain (costs off)
>   SELECT * FROM CASE_TBL WHERE NULLIF(1, 2) = 2;

Added.

Attaching the v3 patch; please consider it for further review.

Best regards,
houzj





v3_0001-add-nullif-case-for-eval_const_expressions.patch
Description: v3_0001-add-nullif-case-for-eval_const_expressions.patch


RE: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-18 Thread Hou, Zhijie
> >>> +1 to add it after "dropped (Note )", how about as follows
> >>> with slight changes?
> >>>
> >>> dropped (Note that server name of an invalid connection can be NULL
> >>> if the server is dropped), and then such .
> >>
> >> Yes, I like this one. One question is; "should" or "is" is better
> >> than "can" in this case because the server name of invalid connection
> >> is always NULL when its server is dropped?
> >
> > I think "dropped (Note that server name of an invalid connection will
> > be NULL if the server is dropped), and then such ."
> 
> Sounds good to me. So patch attached.

+1

Best regards,
houzj




RE: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-01-17 Thread Hou, Zhijie
> We need to create the loopback3 with user mapping public, otherwise the
> test might become unstable as shown below. Note that loopback and
> loopback2 are not dropped in the test, so no problem with them.
> 
>  ALTER SERVER loopback OPTIONS (ADD use_remote_estimate 'off');  DROP
> SERVER loopback3 CASCADE;
>  NOTICE:  drop cascades to 2 other objects
> -DETAIL:  drop cascades to user mapping for postgres on server loopback3
> +DETAIL:  drop cascades to user mapping for bharath on server loopback3
> 
> Attaching v2 patch for postgres_fdw_get_connections. Please have a look.
Hi

I have a comment for the doc about postgres_fdw_get_connections.

+postgres_fdw_get_connections(OUT server_name text, OUT 
valid boolean) returns setof record
+
+ 
+  This function returns the foreign server names of all the open
+  connections that postgres_fdw established from
+  the local session to the foreign servers. It also returns whether
+  each connection is valid or not. false is returned
+  if the foreign server connection is used in the current local
+  transaction but its foreign server or user mapping is changed or
+  dropped, and then such invalid connection will be closed at
+  the end of that transaction. true is returned
+  otherwise. If there are no open connections, no record is returned.
+  Example usage of the function:

The documentation does not seem to mention the case where the function returns NULL for
server_name. Users may be a little confused about why NULL was returned.

Best regards,
houzj




RE: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

2021-01-14 Thread Hou, Zhijie
> On Thu, Jan 14, 2021 at 5:36 PM Li Japin  wrote
> > Do we really need to access PUBLICATIONRELMAP in this patch? What if
> > we just set it to false in the else condition of (if (publish &&
> > (relkind != RELKIND_PARTITIONED_TABLE || pub->pubviaroot)))
> >
> > Thank for you review. I agree with you, it doesn’t need to access
> > PUBLICATIONRELMAP, since We already get the publication oid in
> > GetRelationPublications(relid) [1], which also access
> > PUBLICATIONRELMAP.  If the current pub->oid does not in pubids, the
> publish is false, so we do not need publish the table.
> 
> +1. This is enough.
> 
> > I have another question, the data->publications is a list, when did it
> has more then one items?
> 
> IIUC, when the single table is associated with multiple publications, then
> data->publications will have multiple entries. Though I have not tried,
> we can try having two or three publications for the same table and verify
> that.
> 
> > 0001 - change as you suggest.
> > 0002 - does not change.


Hi

In the 0001 patch:

The comment says "Relation is not associated with the publication anymore".
That comment was accurate when is_relation_part_of_publication() was checked in the
previous version of the patch.

But with the code placed in the else branch, it is not only the case described in the
comment (the relation was dropped from the publication) that reaches it; some other
cases, such as partitioned tables, will hit the else branch too.

Do you think we need to fix the comment a little to make it accurate?

And it seems we could add a "continue;" in the else branch.
Of course this is not strictly necessary.

Best regards,
houzj





fix typo in reorderbuffer.c

2021-01-14 Thread Hou, Zhijie
Hi

I found a possible typo in reorderbuffer.c

 *   has got a combocid. Combocid's are only valid for the duration of a

Combocid's ==>> Combocids

Attaching a small patch to correct it.

Best regards,
houzj




0001-fix-typo.patch
Description: 0001-fix-typo.patch


RE: remove unneeded pstrdup in fetch_table_list

2021-01-13 Thread Hou, Zhijie
> >>> Your observation seems correct to me, though I have not tried to
> >>> test your patch.
> >>
> >> +1 to apply, this looks correct and passes tests.  Scanning around I
> >> +didn't see
> >> any other occurrences of the same pattern.
> >
> > Thanks. I am thinking to backpatch this even though there is no
> > problem reported from any production system. What do you think?
> 
> No objections from me.

+1

Best regards,
houzj






RE: Single transaction in the tablesync worker?

2021-01-13 Thread Hou, Zhijie
> I am not exactly sure of the concern. (If the extra info below does not
> help can you please describe your concern with more details).
> 
> This [v14] patch code/feature is only referring to the immediate stopping
> of only the *** "tablesync" *** worker (if any) for any/each table being
> removed from the subscription. It has nothing to say about the "apply" worker
> of the subscription, which continues replicating as before.
> 
> OTOH, I think the other mail problem is not really related to the "tablesync"
> workers. As you can see (e.g. steps 7,8,9,10 of [2]), that problem is
> described as continuing over multiple transactions to replicate unexpected
> rows - I think this could only be done by the subscription "apply" worker,
> and is after the "tablesync" worker has gone away.
> 
> So AFAIK these are 2 quite unrelated problems, and would be solved
> independently.
> 
> It just happens that they are both exposed using ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION;

So sorry for the confusion, you are right that these are 2 quite unrelated 
problems.
I misunderstood the 'stop the worker' here.


+   /* Immediately stop the worker. */
+   logicalrep_worker_stop_at_commit(subid, relid); 
/* prevent re-launching */
+   logicalrep_worker_stop(subid, relid); /* stop 
immediately */

Do you think we could add a comment describing which type of worker is stopped here (the
tablesync worker)?
And should we add some more comments explaining the reason for the "immediately stop"?
That may make the code easier to understand; a rough sketch follows.
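
For instance (only a sketch of possible wording, not the patch's actual comment):

	/*
	 * Stop the tablesync worker (if any) for the relation being removed from
	 * the subscription: first make sure it cannot be re-launched after this
	 * transaction commits, then stop it immediately so it does not keep
	 * copying a table that is no longer subscribed.
	 */
	logicalrep_worker_stop_at_commit(subid, relid);
	logicalrep_worker_stop(subid, relid);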

Best regards,
Houzj




remove unneeded pstrdup in fetch_table_list

2021-01-12 Thread Hou, Zhijie
Hi

The function fetch_table_list gets the table names from the publisher and returns a list
of table names.
When appending a name to the list, it uses the following code:

**
nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
tablelist = lappend(tablelist, rv);
**

Here nspname and relname are copied again, which seems unnecessary: both are obtained
from TextDatumGetCString, and TextDatumGetCString already returns a newly palloc'd
string.

**
result = (char *) palloc(len + 1);
memcpy(result, VARDATA_ANY(tunpacked), len);
result[len] = '\0';

if (tunpacked != t)
pfree(tunpacked);

return result;
**

This may hurt when a lot of tables are replicated, so I tried to fix it; a sketch of the
change is below, and the attached patch contains the actual fix.
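
Roughly, the change would look like this:

nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
Assert(!isnull);
relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
/* both strings are already palloc'd copies, so no pstrdup is needed */
rv = makeRangeVar(nspname, relname, -1);
tablelist = lappend(tablelist, rv);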

Best regards,
houzj








0001-remove-unneeded-pstrdup.patch
Description: 0001-remove-unneeded-pstrdup.patch


RE: Single transaction in the tablesync worker?

2021-01-12 Thread Hou, Zhijie
> Also PSA some detailed logging evidence of some test scenarios involving
> Drop/AlterSubscription:
> + Test-20210112-AlterSubscriptionRefresh-ok.txt =
> AlterSubscription_refresh which successfully drops a tablesync slot
> + Test-20210112-AlterSubscriptionRefresh-warning.txt =
> AlterSubscription_refresh gives WARNING that it cannot drop the tablesync
> slot (which no longer exists)
> + Test-20210112-DropSubscription-warning.txt = DropSubscription with a
> disassociated slot_name gives a WARNING that it cannot drop the tablesync
> slot (due to broken connection)

Hi

> * The AlterSubscription_refresh (v14+) is now more similar to 
> DropSubscription w.r.t to stopping workers for any "removed" tables.
I have a question about the above feature.

With the patch, it seems the worker is not stopped in the case of [1].
I probably missed something - should we stop the worker in such a case?

[1] 
https://www.postgresql.org/message-id/CALj2ACV%2B0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw%40mail.gmail.com

Best regards,
houzj





RE: Add Nullif case for eval_const_expressions_mutator

2021-01-11 Thread Hou, Zhijie
> > I think this patch should be about a tenth the size.  Try modeling it
> > on the T_SubscriptingRef-etc case, ie, use ece_generic_processing and
> > then ece_evaluate_expr to cover the generic cases.  OpExpr is common
> > enough to deserve specially optimized code, but NullIf isn't, so shorter
> is better.
> 
> Thanks for the review.
> 
> Attaching v2 patch , which followed the suggestion to use
> ece_generic_processing and ece_evaluate_expr to simplify the code.
> 
> Please have a check.

Sorry, I found the code can still be simplified further.
Attaching the new patch.

Best regards,
houzj





v2_1-0001-add-nullif-case-for-eval_const_expressions.patch
Description: v2_1-0001-add-nullif-case-for-eval_const_expressions.patch


RE: Add Nullif case for eval_const_expressions_mutator

2021-01-11 Thread Hou, Zhijie
> > I notice that there are no Nullif case in eval_const_expression.
> > Since Nullif is similar to Opexpr and is easy to implement, I try add
> > this case in eval_const_expressions_mutator.
> 
> I think this patch should be about a tenth the size.  Try modeling it on
> the T_SubscriptingRef-etc case, ie, use ece_generic_processing and then
> ece_evaluate_expr to cover the generic cases.  OpExpr is common enough to
> deserve specially optimized code, but NullIf isn't, so shorter is better.

Thanks for the review.

Attaching the v2 patch, which follows the suggestion
to use ece_generic_processing and ece_evaluate_expr to simplify the code.

Please have a look.

Best regards,
houzj





v2-0001-add-nullif-case-for-eval_const_expressions.patch
Description: v2-0001-add-nullif-case-for-eval_const_expressions.patch


RE: Parallel Inserts in CREATE TABLE AS

2021-01-10 Thread Hou, Zhijie
> Attaching v21 patch set, which has following changes:
> 1) 0001 - changed fpes->ins_cmd_type ==
> PARALLEL_INSERT_CMD_CREATE_TABLE_AS to fpes->ins_cmd_type !=
> PARALLEL_INSERT_CMD_UNDEF
> 2) 0002 - reworded the commit message.
> 3) 0003 - added cmin, xmin test case to one of the parallel insert cases
> to ensure leader and worker insert the tuples in the same xact and replaced
> memory usage output in numbers like 25kB to NkB to make the tests stable.
> 4) 0004 - updated one of the test output to be in NkB and made the assertion
> in SetParallelInsertState to be not under an if condition.
> 
> There's one open point [1] on selective skipping of error "cannot insert
> tuples in a parallel worker" in heap_prepare_insert(), thoughts are
> welcome.
> 
> Please consider the v21 patch set for further review.

Hi,

I took a look into the new patch and have some comments.

1.
+   /*
+* Do not consider tuple cost in case of we intend to perform parallel
+* inserts by workers. We would have turned on the ignore flag in
+* apply_scanjoin_target_to_paths before generating Gather path for the
+* upper level SELECT part of the query.
+*/
+   if ((root->parse->parallelInsCmdTupleCostOpt &
+PARALLEL_INSERT_SELECT_QUERY) &&
+   (root->parse->parallelInsCmdTupleCostOpt &
+PARALLEL_INSERT_CAN_IGN_TUP_COST))

Can we just check PARALLEL_INSERT_CAN_IGN_TUP_COST here ?
IMO, PARALLEL_INSERT_CAN_IGN_TUP_COST will be set only when 
PARALLEL_INSERT_SELECT_QUERY is set.
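
For example (just a sketch; the body stays the same as it is now):

+	if (root->parse->parallelInsCmdTupleCostOpt &
+		 PARALLEL_INSERT_CAN_IGN_TUP_COST)
+	{
+		/* ... same handling as before ... */
+	}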


2.
+static void
+ParallelInsCmdEstimate(ParallelContext *pcxt, ParallelInsertCmdKind ins_cmd,
+  void *ins_info)
...
+   info = (ParallelInsertCTASInfo *) ins_info;
+   intoclause_str = nodeToString(info->intoclause);
+   intoclause_len = strlen(intoclause_str) + 1;

+static void
+SaveParallelInsCmdInfo(ParallelContext *pcxt, ParallelInsertCmdKind ins_cmd,
+  void *ins_info)
...
+   info = (ParallelInsertCTASInfo *)ins_info;
+   intoclause_str = nodeToString(info->intoclause);
+   intoclause_len = strlen(intoclause_str) + 1;
+   intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len);

I noticed the above code calls nodeToString and strlen twice, which seems unnecessary.
Do you think it's better to compute the results of nodeToString and strlen once and pass
them to where they are used? A rough sketch follows.
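
Only a sketch - the extra parameters are illustrative, not the patch's actual signatures:

	char   *intoclause_str = nodeToString(((ParallelInsertCTASInfo *) ins_info)->intoclause);
	int		intoclause_len = strlen(intoclause_str) + 1;

	/* compute once here, then hand both values down */
	ParallelInsCmdEstimate(pcxt, ins_cmd, intoclause_str, intoclause_len);
	...
	SaveParallelInsCmdInfo(pcxt, ins_cmd, intoclause_str, intoclause_len);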

3.
+   if (node->need_to_scan_locally || node->nworkers_launched == 0)
+   {
+   EState *estate = node->ps.state;
+   TupleTableSlot *outerTupleSlot;
+
+   for(;;)
+   {
+   /* Install our DSA area while executing the plan. */
+   estate->es_query_dsa =
+   node->pei ? node->pei->area : NULL;
...
+   node->ps.state->es_processed++;
+   }

How about using the local variable estate, i.e. 'estate->es_processed++;',
instead of node->ps.state->es_processed++;?

Best regards,
houzj







RE: Single transaction in the tablesync worker?

2021-01-07 Thread Hou, Zhijie
> PSA the v12 patch for the Tablesync Solution1.
> 
> Differences from v11:
>   + Added PG docs to mention the tablesync slot
>   + Refactored tablesync slot drop (done by
> DropSubscription/process_syncing_tables_for_sync)
>   + Fixed PG docs mentioning wrong state code
>   + Fixed wrong code comment describing TCOPYDONE state
> 
Hi

I looked into the new patch and have some comments.

1.
+   /* Setup replication origin tracking. */
①+  originid = replorigin_by_name(originname, true);
+   if (!OidIsValid(originid))
+   {

②+  originid = replorigin_by_name(originname, true);
+   if (originid != InvalidRepOriginId)
+   {

These are two different code styles for checking whether originid is valid.
Both are fine, but do you think it's better to use the same style in both places?


2.
  *  state to SYNCDONE.  There might be zero changes applied between
  *  CATCHUP and SYNCDONE, because the sync worker might be ahead 
of the
  *  apply worker.
+ *- The sync worker has a intermediary state TCOPYDONE which comes 
after
+ * DATASYNC and before SYNCWAIT. This state indicates that the 
initial

This comment about TCOPYDONE would be better placed at [1]* below, which is between
DATASYNC and SYNCWAIT.

 * - Tablesync worker starts; changes table state from INIT to DATASYNC 
while
 *   copying.
 [1]*
 * - Tablesync worker finishes the copy and sets table state to 
SYNCWAIT;
 *   waits for state change.

3.
+   /*
+* To build a slot name for the sync work, we are limited to 
NAMEDATALEN -
+* 1 characters.
+*
+* The name is calculated as pg_%u_sync_%u (3 + 10 + 6 + 10 + '\0'). 
(It's
+* actually the NAMEDATALEN on the remote that matters, but this scheme
+* will also work reasonably if that is different.)
+*/
+   StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small");   /* for 
sanity */
+
+   syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);

The comment says syncslotname is limited to NAMEDATALEN - 1 characters, but its actual
maximum size is (3 + 10 + 6 + 10 + '\0') = 30, which is not NAMEDATALEN - 1.
Should we adjust the comment here?

Best regards,
houzj






RE: Single transaction in the tablesync worker?

2021-01-06 Thread Hou, Zhijie
> PSA the v11 patch for the Tablesync Solution1.
> 
> Difference from v10:
> - Addresses several recent review comments.
> - pg_indent has been run
> 
Hi

I took a look into the patch and have some comments.

1.
  *   So the state progression is always: INIT -> DATASYNC -> SYNCWAIT ->
- *   CATCHUP -> SYNCDONE -> READY.
+ *   CATCHUP -> (sync worker TCOPYDONE) -> SYNCDONE -> READY.

I noticed the new state TCOPYDONE is documented between CATCHUP and SYNCDONE,
but it seems SUBREL_STATE_TCOPYDONE is actually set before SUBREL_STATE_SYNCWAIT [1].
Did I miss something here?

[1]-
+   UpdateSubscriptionRelState(MyLogicalRepWorker->subid,
+  
MyLogicalRepWorker->relid,
+  
SUBREL_STATE_TCOPYDONE,
+  
MyLogicalRepWorker->relstate_lsn);
...
/*
 * We are done with the initial data synchronization, update the state.
 */
SpinLockAcquire(&MyLogicalRepWorker->relmutex);
MyLogicalRepWorker->relstate = SUBREL_STATE_SYNCWAIT;
--


2.
i = initialize,
d = data is being copied,
+   C = table data has been copied,
s = synchronized,
r = ready (normal replication)

+#define SUBREL_STATE_TCOPYDONE 't' /* tablesync copy phase is completed
+* 
(sublsn NULL) */
The character representing 'table data has been copied' in the catalog comment ('C')
seems different from the one in the macro definition ('t').


Best regards,
houzj




RE: Parallel Inserts in CREATE TABLE AS

2021-01-05 Thread Hou, Zhijie
> >
> > +   /* Okay to parallelize inserts, so mark it. */
> > +   if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> > +   ((DR_intorel *) dest)->is_parallel = true;
> >
> > +   if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> > +   ((DR_intorel *) dest)->is_parallel = false;
> 
> We need to know exactly what is the command in above place, to dereference
> and mark is_parallel to true, because is_parallel is being added to the
> respective structures, not to the generic _DestReceiver structure. So, in
> future the above code becomes something like below:
> 
> +/* Okay to parallelize inserts, so mark it. */
> +if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +((DR_intorel *) dest)->is_parallel = true;
> +else if (ins_cmd == PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW)
> +((DR_transientrel *) dest)->is_parallel = true;
> +else if (ins_cmd == PARALLEL_INSERT_CMD_COPY_TO)
> +((DR_copy *) dest)->is_parallel = true;
> 
> In the below place, instead of new function, I think we can just have
> something like if (fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF)
> 
> > Or
> >
> > +   if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> > +   pg_atomic_add_fetch_u64(&fpes->processed,
> > + queryDesc->estate->es_processed);
> >
> > If you think the above code will extend the ins_cmd type check in the
> future, the generic function may make sense.
> 
> We can also change below to fpes->ins_cmd_type !=
> PARALLEL_INSERT_CMD_UNDEF.
> 
> +if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +receiver = ExecParallelGetInsReceiver(toc, fpes);
> 
> If okay, I will modify it in the next version of the patch.

Yes, that looks good to me.

Best regards,
houzj




RE: Parallel Inserts in CREATE TABLE AS

2021-01-05 Thread Hou, Zhijie
> > I think it makes sense.
> >
> > And if the check about ' ins_cmd == xxx1 || ins_cmd == xxx2' may be
> > used in some places, How about define a generic function with some comment
> to mention the purpose.
> >
> > An example in INSERT INTO SELECT patch:
> > +/*
> > + * IsModifySupportedInParallelMode
> > + *
> > + * Indicates whether execution of the specified table-modification
> > +command
> > + * (INSERT/UPDATE/DELETE) in parallel-mode is supported, subject to
> > +certain
> > + * parallel-safety conditions.
> > + */
> > +static inline bool
> > +IsModifySupportedInParallelMode(CmdType commandType) {
> > +   /* Currently only INSERT is supported */
> > +   return (commandType == CMD_INSERT); }
> 
> The intention of assert is to verify that those functions are called for
> appropriate commands such as CTAS, Refresh Mat View and so on with correct
> parameters. I really don't think so we can replace the assert with a function
> like above, in the release mode assertion will always be true. In a way,
> that assertion is for only debugging purposes. And I also think that when
> we as the callers know when to call those new functions, we can even remove
> the assertions, if they are really a problem here. Thoughts?
Hi

Thanks for the explanation.

If the check on the command type is only used in asserts, I think you are right.
I suggested a new function because I guessed the check could be used in some other
places, such as:
Such as:

+   /* Okay to parallelize inserts, so mark it. */
+   if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+   ((DR_intorel *) dest)->is_parallel = true;

+   if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+   ((DR_intorel *) dest)->is_parallel = false;

Or

+   if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+   pg_atomic_add_fetch_u64(&fpes->processed, 
queryDesc->estate->es_processed);

If you think the above code will extend the ins_cmd type check in the future, 
the generic function may make sense.

Best regards,
houzj 





RE: Parallel Inserts in CREATE TABLE AS

2021-01-05 Thread Hou, Zhijie
> > For v20-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch :
> >
> > ParallelInsCmdEstimate :
> >
> > +   Assert(pcxt && ins_info &&
> > +  (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS));
> > +
> > +   if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> >
> > Sinc the if condition is covered by the assertion, I wonder why the if
> check is still needed.
> >
> > Similar comment for SaveParallelInsCmdFixedInfo and
> > SaveParallelInsCmdInfo
> 
> Thanks.
> 
> The idea is to have assertion with all the expected ins_cmd types, and then
> later to have selective handling for different ins_cmds. For example, if
> we add (in future) parallel insertion in Refresh Materialized View, then
> the code in those functions will be something
> like:
> 
> +static void
> +ParallelInsCmdEstimate(ParallelContext *pcxt, ParallelInsertCmdKind
> ins_cmd,
> +   void *ins_info)
> +{
> +Assert(pcxt && ins_info &&
> +   (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS ||
> +(ins_cmd == PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW));
> +
> +if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +{
> +
> +}
> +   else if (ns_cmd == PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW)
> +   {
> +
> +   }
> 
> Similarly for other functions as well.

I think that makes sense.

And if the check 'ins_cmd == xxx1 || ins_cmd == xxx2' may be used in several places,
how about defining a generic function with a comment describing its purpose?

An example in INSERT INTO SELECT patch:
+/*
+ * IsModifySupportedInParallelMode
+ *
+ * Indicates whether execution of the specified table-modification command
+ * (INSERT/UPDATE/DELETE) in parallel-mode is supported, subject to certain
+ * parallel-safety conditions.
+ */
+static inline bool
+IsModifySupportedInParallelMode(CmdType commandType)
+{
+   /* Currently only INSERT is supported */
+   return (commandType == CMD_INSERT);
+}

Best regards,
houzj




fix typo in ReorderBufferProcessTXN

2021-01-04 Thread Hou, Zhijie
Hi

I found a possible typo in the comment of  ReorderBufferProcessTXN().

* relmapper has no "historic" view, in contrast to normal
* the normal catalog during decoding. Thus repeated

Is there an extra ‘normal’ in the comment ?

in contrast to normal the normal catalog during decoding == >> in contrast to 
the normal catalog during decoding

Best regards,
houzj




0001-fix-typo.patch
Description: 0001-fix-typo.patch


RE: Parallel Inserts in CREATE TABLE AS

2021-01-04 Thread Hou, Zhijie
Hi

> 
> wrt v18-0002patch:
> 
> It looks like this introduces a state machine that goes like:
> - starts at CTAS_PARALLEL_INS_UNDEF
> - possibly moves to CTAS_PARALLEL_INS_SELECT
> - CTAS_PARALLEL_INS_TUP_COST_CAN_IGN can be added
> - if both were added at some stage, we can go to
> CTAS_PARALLEL_INS_TUP_COST_IGNORED and ignore the costs
> 
> what i'm wondering is why you opted to put logic around
> generate_useful_gather_paths and in cost_gather when to me it seems more
> logical to put it in create_gather_path? i'm probably missing something
> there?

IMO, the reason is that we want to make sure we only ignore the tuple cost when Gather is
the top node.
And generate_useful_gather_paths as called from apply_scanjoin_target_to_paths seems to
be the right place, because only there can the created Gather become the top node.
So we set the flag in apply_scanjoin_target_to_paths around generate_useful_gather_paths
to identify the top node; a rough sketch is below.
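
A rough sketch of the idea (the flag holder below is illustrative; the flag names are the
ones used in the 0002 patch):

	if (parallel_ins_info & CTAS_PARALLEL_INS_SELECT)
		parallel_ins_info |= CTAS_PARALLEL_INS_TUP_COST_CAN_IGN;

	generate_useful_gather_paths(root, rel, false);

	/* only the Gather created above can be the top node, so reset the flag */
	parallel_ins_info &= ~CTAS_PARALLEL_INS_TUP_COST_CAN_IGN;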


Best regards,
houzj




RE: Consider Parallelism While Planning For REFRESH MATERIALIZED VIEW

2020-12-29 Thread Hou, Zhijie
> Yeah without explain analyze we can not show whether the parallelism is
> picked in the test cases. What we could do is that we can add a plain RMV
> test case in write_parallel.sql after CMV so that at least we can be ensured
> that the parallelism will be picked because of the enforcement there. We
> can always see the parallelism for the select part of explain analyze CMV
> in write_parallel.sql and the same select query gets planned even in RMV
> cases.
> 
> IMO, the patch in this thread can go with test case addition to
> write_parallel.sql. since it is very small.
> 
> Thoughts?

Yes, agreed.

Best regards,
houzj




RE: [POC] Fast COPY FROM command for the table with foreign partitions

2020-12-29 Thread Hou, Zhijie
Hi

> see new version in attachment.

I took a look into the patch, and have some comments.

1.
+   PG_FINALLY();
+   {
+   copy_fmstate = NULL; /* Detect problems */
I don't quite understand this comment.
Does it mean we want to detect something like a NULL reference later on?


2.
+   PG_FINALLY();
+   {
...
+   if (!OK)
+   PG_RE_THROW();
+   }
Is this PG_RE_THROW() necessary?
IMO, PG_FINALLY (via PG_END_TRY) already re-throws if we reached the cleanup block
because an error was thrown; a sketch of the usual pattern is below.
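
A sketch of the usual pattern (assuming PG_FINALLY from elog.h, available since v13):

	PG_TRY();
	{
		/* ... send the COPY data to the foreign server ... */
	}
	PG_FINALLY();
	{
		/* cleanup runs on both paths; PG_END_TRY re-throws a caught error */
		copy_fmstate = NULL;
	}
	PG_END_TRY();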

3.
+   ereport(ERROR,
+   (errmsg("unexpected extra results 
during COPY of table: %s",
+   PQerrorMessage(conn;

I found a similar existing message, for example:

pg_log_warning("unexpected extra results during COPY of 
table \"%s\"",
   tocEntryTag);
How about following the existing message style?

4.
I noticed some non-standard code comments [1].
I think it's better to format them like:
/*
 * line 1
 * line 2
 */

[1]---
+   /* Finish COPY IN protocol. It is needed to do after successful 
copy or
+* after an error.
+*/


+/*
+ *
+ * postgresExecForeignCopy

+/*
+ *
+ * postgresBeginForeignCopy

---
Best regards,
Houzj





RE: Parallel copy

2020-12-23 Thread Hou, Zhijie
Hi

> Yes this optimization can be done, I will handle this in the next patch
> set.
> 

I have a suggestion about the parallel safety check.

As designed, the leader does not participate in the insertion of data.
If the user specifies (PARALLEL 1), there is only one worker process, which will do all
the insertion.

IMO, we can skip some of the safety checks in this case, because those checks exist to
restrict parallel insert (apart from the temporary-table case and the like).

So, how about handling (PARALLEL 1) separately?
Although it looks a bit complicated, (PARALLEL 1) does give a good performance
improvement.

Best regards,
houzj




RE: Consider Parallelism While Planning For REFRESH MATERIALIZED VIEW

2020-12-22 Thread Hou, Zhijie
> Thanks for taking a look at the patch.
> 
> The intention of the patch is to just enable the parallel mode while planning
> the select part of the materialized view, but the insertions do happen in
> the leader backend itself. That way even if there's temporary tablespace
> gets created, we have no problem.
> 
> This patch can be thought as a precursor to parallelizing inserts in refresh
> matview. I'm thinking to follow the design of parallel inserts in ctas [1]
> i.e. pushing the dest receiver down to the workers once that gets reviewed
> and finalized. At that time, we should take care of not pushing inserts
> down to workers if temporary tablespace gets created.
> 
> In summary, as far as this patch is considered we don't have any problem
> with temporary tablespace getting created with CONCURRENTLY option.
> 
> I'm planning to add a few test cases to cover this patch in matview.sql
> and post a v2 patch soon.

Thanks for your explanation!
You are right that the temporary tablespace does not affect the current patch.

Regarding the test case:
I noticed that you have posted a mail about adding EXPLAIN support for REFRESH
MATERIALIZED VIEW.
Do you think we can combine these two features in one thread?

Personally, I think a test case with EXPLAIN output would be better.

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2020-12-22 Thread Hou, Zhijie
Hi

> > I may be wrong, and if I miss sth in previous mails, please give me some
> hints.
> > IMO, serial insertion with underlying parallel SELECT can be
> > considered for foreign table or temporary table, as the insertions only
> happened in the leader process.
> >
> 
> I don't think we support parallel scan for temporary tables. Can you please
> try once both of these operations without Insert being involved? If you
> are able to produce a parallel plan without Insert then we can see why it
> is not supported with Insert.

Sorry, maybe I did not express it clearly; I actually meant the case where the INSERT's
target table (not the one in the SELECT part) is temporary.
And you are right that parallel SELECT is not enabled when a temporary table is in the
SELECT part.

I tested the cases where the INSERT's target table is temporary and where it is not.
--insert into not temporary table---
postgres=# explain (costs off) insert into notemp select * from test where i < 
600;
  QUERY PLAN   
---
 Gather
   Workers Planned: 4
   ->  Insert on notemp
 ->  Parallel Seq Scan on test
   Filter: (i < 600)

--insert into temporary table---
postgres=# explain (costs off) insert into temp select * from test where i < 
600;
   QUERY PLAN 
---
 Insert on temp
   ->  Seq Scan on test
 Filter: (i < 600)

---without insert part---
postgres=# explain (costs off) select * from test where i < 600;
   QUERY PLAN
-
 Gather
   Workers Planned: 4
   ->  Parallel Seq Scan on test
 Filter: (i < 600)

Best regards,
houzj




RE: Parallel INSERT (INTO ... SELECT ...)

2020-12-22 Thread Hou, Zhijie
Hi

I have a question about the parallel-safety checks.

If the target table is a foreign table or a temporary table,
rel_max_parallel_hazard_for_modify will return PROPARALLEL_UNSAFE,
which not only disables parallel insert but also disables the underlying parallel
SELECT.

+create temporary table temp_names (like names);
+explain (costs off) insert into temp_names select * from names;
+   QUERY PLAN
+-
+ Insert on temp_names
+   ->  Seq Scan on names
+(2 rows)

I may be wrong, and if I missed something in previous mails, please give me some hints.
IMO, serial insertion with an underlying parallel SELECT could be considered for foreign
or temporary tables, as the insertions happen only in the leader process.

Are there any special considerations for this case ?

Best regards,
houzj




RE: Consider Parallelism While Planning For REFRESH MATERIALIZED VIEW

2020-12-22 Thread Hou, Zhijie
Hi

> Added this to commitfest, in case it is useful -
> https://commitfest.postgresql.org/31/2856/

I have a question about the safety of enabling parallel SELECT.

I checked the [parallel insert into select] patch.
https://commitfest.postgresql.org/31/2844/
It seems parallel SELECT is not allowed there when the target table is a temporary table.

+   /*
+* We can't support table modification in parallel-mode if it's a 
foreign
+* table/partition (no FDW API for supporting parallel access) or a
+* temporary table.
+*/
+   if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE ||
+   RelationUsesLocalBuffers(rel))
+   {
+   table_close(rel, lockmode);
+   context->max_hazard = PROPARALLEL_UNSAFE;
+   return context->max_hazard;
+   }

For REFRESH MATERIALIZED VIEW:
if CONCURRENTLY is specified, it builds the new data in a temporary tablespace:
if (concurrent)
{
tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP, false);
relpersistence = RELPERSISTENCE_TEMP;
}

For the above case, should we still consider parallelism ?

Best regards,
houzj





RE: Parallel INSERT (INTO ... SELECT ...)

2020-12-20 Thread Hou, Zhijie
Hi

> Posting an updated set of patches to address recent feedback:
> 
> - Removed conditional-locking code used in parallel-safety checking code
> (Tsunakawa-san feedback). It turns out that for the problem test case, no
> parallel-safety checking should be occurring that locks relations because
> those inserts are specifying VALUES, not an underlying SELECT, so the
> parallel-safety checking code was updated to bail out early if no underlying
> SELECT is specified for the INSERT. No change to the test code was required.
> - Added comment to better explain the reason for treating "INSERT ...
> ON CONFLICT ... DO UPDATE" as parallel-unsafe (Dilip)
> - Added assertion to heap_prepare_insert() (Amit)
> - Updated error handling for NULL index_expr_item case (Vignesh)

+
+   index_oid_list = RelationGetIndexList(rel);
...

As mentioned in the comments of RelationGetIndexList:
* we return a copy of the list palloc'd in the caller's context.  The caller
* may list_free() the returned list after scanning it.

Shall we list_free(index_oid_list) at the end of the function, just to avoid a potential
memory leak? For example:
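
Something like (just a sketch):

	ListCell   *lc;

	index_oid_list = RelationGetIndexList(rel);
	foreach(lc, index_oid_list)
	{
		/* ... existing per-index parallel-safety checks ... */
	}
	list_free(index_oid_list);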

Best regards,
houzj





RE: Parallel Inserts in CREATE TABLE AS

2020-12-20 Thread Hou, Zhijie
Hi

The cfbot seems to complain about the test case:

Command exited with code 1
perl dumpregr.pl
=== $path ===\ndiff -w -U3 
C:/projects/postgresql/src/test/regress/expected/write_parallel.out 
C:/projects/postgresql/src/test/regress/results/write_parallel.out
--- C:/projects/postgresql/src/test/regress/expected/write_parallel.out 
2020-12-21 01:41:17.745091500 +
+++ C:/projects/postgresql/src/test/regress/results/write_parallel.out  
2020-12-21 01:47:20.375514800 +
@@ -1204,7 +1204,7 @@
->  Gather (actual rows=2 loops=1)
  Workers Planned: 3
  Workers Launched: 3
- ->  Parallel Seq Scan on temp2 (actual rows=0 loops=4)
+ ->  Parallel Seq Scan on temp2 (actual rows=1 loops=4)
Filter: (col2 < 3)
Rows Removed by Filter: 1
(14 rows)
@@ -1233,7 +1233,7 @@
->  Gather (actual rows=2 loops=1)
  Workers Planned: 3
  Workers Launched: 3
- ->  Parallel Seq Scan on temp2 (actual rows=0 loops=4)
+ ->  Parallel Seq Scan on temp2 (actual rows=1 loops=4)
Filter: (col2 < 3)
Rows Removed by Filter: 1
(14 rows)

Best regards,
houzj




RE: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-18 Thread Hou, Zhijie
Hi

I have a question about an existing test case.

"""
-- Test that alteration of server options causes reconnection SELECT c3, c4 
FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work ALTER SERVER loopback OPTIONS 
(SET dbname 'no such database'); SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 
1;  -- should fail DO $d$
BEGIN
EXECUTE $$ALTER SERVER loopback
OPTIONS (SET dbname '$$||current_database()||$$')$$;
END;
$d$;
SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work again """

IMO, the above case is designed to test the following code [1].
With the patch, it seems the code [1] will no longer be reached for this case, right?
(It seems the connection will already be disconnected in pgfdw_xact_callback.)

I do not know whether that matters - or should we add a test case to cover it?

[1] /*
 * If the connection needs to be remade due to invalidation, disconnect 
as
 * soon as we're out of all transactions.
 */
if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0)
{
elog(DEBUG3, "closing connection %p for option changes to take 
effect",
 entry->conn);
disconnect_pg_server(entry);
}


Best regards,
houzj




RE: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-17 Thread Hou, Zhijie
Hi

> Discussion here is on the point - whether to show up the invalidated
> connections in the output of the new postgres_fdw_get_connections()
> function? If we were to show, then because of the solution we proposed for
> the connection leak problem in [1], will the invalidated entries be shown
> every time?

IMO, we introduced the function postgres_fdw_get_connections so that users can decide
whether there are too many connections and whether they should be disconnected.

If the user decides to disconnect, we have two cases:
1. The user disconnects one of them.
   I think it's fine for the user to disconnect an invalidated connection, so we'd
better list the invalidated connections.

2. The user disconnects all of them. In this case,
   it seems postgres_fdw_disconnect will disconnect both invalidated and valid
connections,
   and we should let the user see which connections they are disconnecting, so we should
list the invalidated connections here as well.

Based on the above two cases, personally, I think we should list the invalidated
connections.

-
I took a look into the patch, and have a little issue:

+bool disconnect_cached_connections(uint32 hashvalue, bool all)
+   if (all)
+   {
+   hash_destroy(ConnectionHash);
+   ConnectionHash = NULL;
+   result = true;
+   }

If disconnect_cached_connections is called to disconnect all the connections, 
should we reset the 'xact_got_connection' flag ?
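
Just to illustrate what I mean (a sketch only; whether clearing the flag here is actually
safe is exactly the question above):

+	if (all)
+	{
+		hash_destroy(ConnectionHash);
+		ConnectionHash = NULL;
+		xact_got_connection = false;	/* hypothetical reset */
+		result = true;
+	}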


> [1] -
> https://www.postgresql.org/message-id/flat/CALj2ACVNcGH_6qLY-4_tXz8JLv
> A%2B4yeBThRfxMz7Oxbk1aHcpQ%40mail.gmail.com

The connection-leak patch looks good to me.
And I have the same question about the new 'have_invalid_connections' flag:
if we disconnect all the connections, should we reset that flag too?

Best regards,
houzj





RE: Parallel Inserts in CREATE TABLE AS

2020-12-15 Thread Hou, Zhijie
> From: Hou, Zhijie [mailto:houzj.f...@cn.fujitsu.com]
> Sent: Tuesday, December 15, 2020 7:30 PM
> To: Bharath Rupireddy 
> Cc: Amit Kapila ; Luc Vlaming ;
> PostgreSQL-development ; Zhihong Yu
> ; Dilip Kumar 
> Subject: RE: Parallel Inserts in CREATE TABLE AS
> 
> > Thanks for the append use case.
> >
> > Here's my analysis on pushing parallel inserts down even in case the
> > top node is Append.
> >
> > For union cases which need to remove duplicate tuples, we can't push
> > the inserts or CTAS dest receiver down. If I'm not wrong, Append node
> > is not doing duplicate removal(??), I saw that it's the HashAggregate
> > node (which is the top node that removes the duplicate tuples). And
> > also for except/except all/intersect/intersect all cases we receive
> > HashSetOp nodes on top of Append. So for both cases, our check for
> > Gather or Append at the top node is enough to detect this to not allow
> parallel inserts.
> >
> > For union all:
> > case 1: We can push the CTAS dest receiver to each Gather node  Append
> >  ->Gather
> >  ->Parallel Seq Scan
> >  ->Gather
> >  ->Parallel Seq Scan
> >   ->Gather
> >  ->Parallel Seq Scan
> >
> > case 2: We can still push the CTAS dest receiver to each Gather node.
> > Non-Gather nodes will do inserts as they do now i.e. by sending tuples
> > to Append and from there to CTAS dest receiver.
> >  Append
> >  ->Gather
> >  ->Parallel Seq Scan
> >  ->Seq Scan / Join / any other non-Gather node
> >  ->Gather
> >  ->Parallel Seq Scan
> >  ->Seq Scan / Join / any other non-Gather node
> >
> > case 3:  We can push the CTAS dest receiver to Gather  Gather
> >  ->Parallel Append
> >  ->Parallel Seq Scan
> >  ->Parallel Seq Scan
> >
> > case 4: We can push the CTAS dest receiver to Gather  Gather
> >  ->Parallel Append
> >  ->Parallel Seq Scan
> >  ->Parallel Seq Scan
> >  ->Seq Scan / Join / any other non-Gather node
> >
> > Please let me know if I'm missing any other possible use case.
> >
> > Thoughts?
> 
> 
> Yes, The analysis looks right to me.
> 
> 
> > As suggested by Amit earlier, I kept the 0001 patch(so far) such that
> > it doesn't have the code to influence the planner to consider parallel
> > tuple cost as 0. It works on the plan whatever gets generated and
> > decides to allow parallel inserts or not. And in the 0002 patch, I
> > added the code for influencing the planner to consider parallel tuple
> > cost as 0. Maybe we can have a 0003 patch for tests alone.
> >
> > Once we are okay with the above analysis and use cases, we can
> > incorporate the Append changes to respective patches.
> >
> > Hope that's okay.
> 
> A little explanation about how to push down the ctas info in append.
> 
> 1. about how to ignore tuple cost in this case.
> IMO, it create gather path under append like the following:
> query_planner
> -make_one_rel
> --set_base_rel_sizes
> ---set_rel_size
> set_append_rel_size (*)
> -set_rel_size
> --set_subquery_pathlist
> ---subquery_planner
> grouping_planner
> -apply_scanjoin_target_to_paths
> --generate_useful_gather_paths
> 
> set_append_rel_size seems the right place where we can check and set a flag
> to ignore tuple cost later.
> We can set the flag for two cases when there is no parent path will be
> created(such as : limit,sort,distinct...):
> i) query_level is 1
> ii) query_level > 1 and we have set the flag in the parent_root.
> 
> The case ii) is to check append under append:
> Append
>->Append
>->Gather
>->Other plan
> 
> 2.about how to push ctas info down.
> 
> We traversing the whole plans tree, and we only care Append and Gather type.
> Gather: It set the ctas dest info and returned true at once if the gathernode
> does not have projection.
> Append: It will recursively traversing the subplan of Appendnode and will
> reture true if one of the subplan can be parallel.
> 
> +PushDownCTASParallelInsertState(DestReceiver *dest, PlanState *ps) {
> + bool parallel = false;
> +
> + if(ps == NULL)
> + return parallel;
> +
> + if(IsA(ps, AppendState))
> + {
> + AppendState *aps = (AppendState *) ps;
> + for(int i = 0; i < aps->as_nplans; i++)
> + {
> + parallel |=
> PushDownC

RE: Parallel Inserts in CREATE TABLE AS

2020-12-15 Thread Hou, Zhijie
> Thanks for the append use case.
> 
> Here's my analysis on pushing parallel inserts down even in case the top
> node is Append.
> 
> For union cases which need to remove duplicate tuples, we can't push the
> inserts or CTAS dest receiver down. If I'm not wrong, Append node is not
> doing duplicate removal(??), I saw that it's the HashAggregate node (which
> is the top node that removes the duplicate tuples). And also for
> except/except all/intersect/intersect all cases we receive HashSetOp nodes
> on top of Append. So for both cases, our check for Gather or Append at the
> top node is enough to detect this to not allow parallel inserts.
> 
> For union all:
> case 1: We can push the CTAS dest receiver to each Gather node  Append
>  ->Gather
>  ->Parallel Seq Scan
>  ->Gather
>  ->Parallel Seq Scan
>   ->Gather
>  ->Parallel Seq Scan
> 
> case 2: We can still push the CTAS dest receiver to each Gather node.
> Non-Gather nodes will do inserts as they do now i.e. by sending tuples to
> Append and from there to CTAS dest receiver.
>  Append
>  ->Gather
>  ->Parallel Seq Scan
>  ->Seq Scan / Join / any other non-Gather node
>  ->Gather
>  ->Parallel Seq Scan
>  ->Seq Scan / Join / any other non-Gather node
> 
> case 3:  We can push the CTAS dest receiver to Gather  Gather
>  ->Parallel Append
>  ->Parallel Seq Scan
>  ->Parallel Seq Scan
> 
> case 4: We can push the CTAS dest receiver to Gather  Gather
>  ->Parallel Append
>  ->Parallel Seq Scan
>  ->Parallel Seq Scan
>  ->Seq Scan / Join / any other non-Gather node
> 
> Please let me know if I'm missing any other possible use case.
> 
> Thoughts?


Yes, the analysis looks right to me.


> As suggested by Amit earlier, I kept the 0001 patch(so far) such that it
> doesn't have the code to influence the planner to consider parallel tuple
> cost as 0. It works on the plan whatever gets generated and decides to allow
> parallel inserts or not. And in the 0002 patch, I added the code for
> influencing the planner to consider parallel tuple cost as 0. Maybe we can
> have a 0003 patch for tests alone.
> 
> Once we are okay with the above analysis and use cases, we can incorporate
> the Append changes to respective patches.
> 
> Hope that's okay.

A little explanation about how the CTAS info is pushed down when Append is involved.

1. How to ignore the tuple cost in this case.
IMO, the Gather paths under Append are created via the following flow:
query_planner
-make_one_rel
--set_base_rel_sizes
---set_rel_size
set_append_rel_size (*)
-set_rel_size
--set_subquery_pathlist
---subquery_planner
grouping_planner
-apply_scanjoin_target_to_paths
--generate_useful_gather_paths

set_append_rel_size seems to be the right place where we can check and set a flag so
that the tuple cost can be ignored later.
We can set the flag in the two cases where no parent path (such as Limit, Sort,
Distinct ...) will be created:
i) query_level is 1
ii) query_level > 1 and the flag has already been set in the parent_root.

Case ii) covers Append under Append:
Append
   ->Append
   ->Gather
   ->Other plan
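
As a rough illustration of i) and ii) (purely a sketch; "ignore_parallel_tuple_cost"
is an assumed new PlannerInfo field, not something taken from the current patch),
the check in set_append_rel_size() could look like:

    /*
     * Sketch only: mark this level so that the tuple transfer cost can be
     * ignored when costing the Gather paths created under the Append.
     * query_level and parent_root are existing PlannerInfo fields.
     */
    if (root->query_level == 1 ||
        (root->query_level > 1 &&
         root->parent_root != NULL &&
         root->parent_root->ignore_parallel_tuple_cost))
        root->ignore_parallel_tuple_cost = true;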

2. How to push the CTAS info down.

We traverse the whole plan tree, and we only care about the Append and Gather types.
Gather: It sets the CTAS dest info and returns true immediately if the Gather
node does not have a projection.
Append: It recursively traverses the subplans of the Append node and returns
true if one of the subplans can run in parallel.

+PushDownCTASParallelInsertState(DestReceiver *dest, PlanState *ps)
+{
+   bool        parallel = false;
+
+   if (ps == NULL)
+       return parallel;
+
+   if (IsA(ps, AppendState))
+   {
+       AppendState *aps = (AppendState *) ps;
+
+       for (int i = 0; i < aps->as_nplans; i++)
+       {
+           parallel |= PushDownCTASParallelInsertState(dest,
+                                                       aps->appendplans[i]);
+       }
+   }
+   else if (IsA(ps, GatherState) && !ps->ps_ProjInfo)
+   {
+       GatherState *gstate = (GatherState *) ps;
+
+       parallel = true;
+
+       ((DR_intorel *) dest)->is_parallel = true;
+       gstate->dest = dest;
+       ps->plan->plan_rows = 0;
+   }
+
+   return parallel;
+}

Best regards,
houzj




0001-support-pctas-in-append-parallel-inserts.patch
Description: 0001-support-pctas-in-append-parallel-inserts.patch


0002-support-pctas-in-append-tuple-cost-adjustment.patch
Description: 0002-support-pctas-in-append-tuple-cost-adjustment.patch


RE: Parallel Inserts in CREATE TABLE AS

2020-12-14 Thread Hou, Zhijie
Hi

> Attaching v11 patch set. Please review it further.

Currently with the patch, we can allow parallel CTAS only when the top node is Gather.
When the top node is Append and Gather is a sub-node of the Append, I think we can
still enable parallel CTAS by pushing the parallel CTAS down to the sub-node Gather,
such as:

Append
-->Gather
    ->Create table
        ->Seqscan
-->Gather
    ->Create table
        ->Seqscan

And the use case seems common to me, such as:
select * from A where xxx union all select * from B where xxx;

I attach a WIP patch which just shows the possibility of this feature.
The patch is based on the latest v11-patch.

What do you think?

Best regards,
houzj







0001-patch-for-pctas-in-append.patch
Description: 0001-patch-for-pctas-in-append.patch


RE: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs

2020-12-10 Thread Hou, Zhijie
> IMO, let's not change the 1) behaviour to 3) with the patch. If agreed,
> I can do the following way in ExplainOneUtility and will add a comment on
> why we are doing this.
>
> if (es->analyze)
>  (void) CheckRelExistenceInCTAS(ctas, true);
>
> Thoughts?

Agreed.

Just in case, I took a look at Oracle 12’s behavior about [explain CTAS].
Oracle 12 will output the plan without throwing any message in this case.

Best regards,
houzj



RE: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs

2020-12-10 Thread Hou, Zhijie
> Currently, for CTAS or CREATE MATERIALIZED VIEW(CMV) without if-not-exists
> clause, the existence of the relation gets checked during the execution
> of the select part and an error is thrown there.
> All the unnecessary rewrite and planning for the select part would have
> happened just to fail later. However, if if-not-exists clause is present,
> then a notice is issued and returned immediately without any further rewrite
> or planning for the select part. This seems somewhat inconsistent to me.
> 
> I propose to check the relation existence early in ExecCreateTableAs() as
> well as in ExplainOneUtility() and throw an error in case it exists already
> to avoid unnecessary rewrite, planning and execution of the select part.
> 
> Attaching a patch. Note that I have not added any test cases as the existing
> test cases in create_table.sql and matview.sql would cover the code.
> 
> Thoughts?

Personally, I think it makes sense, as other commands (such as CREATE
EXTENSION/INDEX ...) throw that error before any further operation too.

I am just a little worried about the behavior change of [explain CTAS].
Maybe someone will complain about the change from normal explain output to an error.

And I took a look into the patch.

+   StringInfoData emsg;
+
+   initStringInfo(&emsg);
+
+   if (level == NOTICE)
+   appendStringInfo(&emsg,

Using the variables emsg and level seems a little complicated to me.
How about just:

if (!is_explain && ctas->if_not_exists)
ereport(NOTICE,xxx
else
ereport(ERROR,xxx
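
Spelled out a little more (purely illustrative; relname and namespaceId are
assumed locals here, and the message wording is only an example, not necessarily
what the patch emits):

if (OidIsValid(get_relname_relid(relname, namespaceId)))
{
    if (!is_explain && ctas->if_not_exists)
        ereport(NOTICE,
                (errcode(ERRCODE_DUPLICATE_TABLE),
                 errmsg("relation \"%s\" already exists, skipping", relname)));
    else
        ereport(ERROR,
                (errcode(ERRCODE_DUPLICATE_TABLE),
                 errmsg("relation \"%s\" already exists", relname)));
}

In the NOTICE branch the caller would then return without rewriting, planning
or executing the select part.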

Best regards,
houzj





RE: Parallel Inserts in CREATE TABLE AS

2020-12-10 Thread Hou, Zhijie
Hi

+   allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo &&
+   plannedstmt->parallelModeNeeded &&
+   plannedstmt->planTree &&
+   IsA(plannedstmt->planTree, Gather) &&
+   plannedstmt->planTree->lefttree &&
+   plannedstmt->planTree->lefttree->parallel_aware &&
+   plannedstmt->planTree->lefttree->parallel_safe;

I noticed it checks both IsA(ps, GatherState) and IsA(plannedstmt->planTree,
Gather).
Does that mean it is possible that IsA(ps, GatherState) is true but
IsA(plannedstmt->planTree, Gather) is false?

I did some tests but did not find such a case.


Best regards,
houzj






Fix typo about generate_gather_paths

2020-12-08 Thread Hou, Zhijie
Hi

Since ba3e76c, the optimizer calls generate_useful_gather_paths() instead of
generate_gather_paths() from outside.

But I noticed that some comments still talk about generate_gather_paths rather
than generate_useful_gather_paths.
I think we should fix these comments, so I tried to replace
"generate_gather_paths" with "generate_useful_gather_paths" in them.

Best regards,
houzj





0001-fix-typo-about-generate_gather_paths.patch
Description: 0001-fix-typo-about-generate_gather_paths.patch


RE: Parallel Inserts in CREATE TABLE AS

2020-12-08 Thread Hou, Zhijie
> > I'm not quite sure how to address this. Can we not allow the planner
> > to consider that the select is for CTAS and check only after the
> > planning is done for the Gather node and other checks?
> >
> 
> IIUC, you are saying that we should not influence the cost of gather node
> even when the insertion would be done by workers? I think that should be
> our fallback option anyway but that might miss some paths to be considered
> parallel where the cost becomes more due to parallel_tuple_cost (aka tuple
> transfer cost). I think the idea is we can avoid the tuple transfer cost
> only when Gather is the top node because only at that time we can push
> insertion down, right? How about if we have some way to detect the same
> before calling generate_useful_gather_paths()? I think when we are calling
> apply_scanjoin_target_to_paths() in grouping_planner(), if the
> query_level is 1, it is for CTAS, and it doesn't have a chance to create
> UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can
> probably assume that the Gather will be top_node. I am not sure about this
> but I think it is worth exploring.
> 

I took a look at the parallel insert patch and have the same idea.
https://commitfest.postgresql.org/31/2844/

 * Consider generating Gather or Gather Merge paths.  We must only do 
this
 * if the relation is parallel safe, and we don't do it for child rels 
to
 * avoid creating multiple Gather nodes within the same plan. We must do
 * this after all paths have been generated and before set_cheapest, 
since
 * one of the generated paths may turn out to be the cheapest one.
 */
if (rel->consider_parallel && !IS_OTHER_REL(rel))
generate_useful_gather_paths(root, rel, false);

IMO the Gather path created here seems to be the right one that could possibly
ignore the parallel tuple cost in the CTAS case.
But we need to check the following parse options, each of which would create a
path on top of the Gather path here (see the sketch after this list):

if (root->parse->rowMarks)
if (limit_needed(root->parse))
if (root->parse->sortClause)
if (root->parse->distinctClause)
if (root->parse->hasWindowFuncs)
if (root->parse->groupClause || root->parse->groupingSets || 
root->parse->hasAggs || root->hasHavingQual)
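
For illustration only (this helper is not part of the patch), the combined check
could be wrapped up like this, where limit_needed() is the planner function used
in the list above:

static bool
gather_can_be_top_node(PlannerInfo *root)
{
    Query      *parse = root->parse;

    /*
     * True when none of the parse options above would put another node on
     * top of the Gather, so the Gather created by
     * generate_useful_gather_paths() can end up as the top node of the plan.
     */
    return parse->rowMarks == NIL &&
        !limit_needed(parse) &&
        parse->sortClause == NIL &&
        parse->distinctClause == NIL &&
        !parse->hasWindowFuncs &&
        parse->groupClause == NIL &&
        parse->groupingSets == NIL &&
        !parse->hasAggs &&
        !root->hasHavingQual;
}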

Best regards,
houzj




RE: Parallel copy

2020-12-07 Thread Hou, Zhijie
> > 4.
> > A suggestion for CacheLineInfo.
> >
> > It use appendBinaryStringXXX to store the line in memory.
> > appendBinaryStringXXX will double the str memory when there is no enough
> spaces.
> >
> > How about call enlargeStringInfo in advance, if we already know the whole
> line size?
> > It can avoid some memory waste and may impove a little performance.
> >
> 
> Here we will not know the size beforehand, in some cases we will start
> processing the data when current block is populated and keep processing
> block by block, we will come to know of the size at the end. We cannot use
> enlargeStringInfo because of this.
> 
> Attached v11 patch has the fix for this, it also includes the changes to
> rebase on top of head.

Thanks for the explanation.

I think there are still cases where we can know the size.

+* line_size will be set. Read the line_size again to be sure 
if it is
+* completed or partial block.
+*/
+   dataSize = pg_atomic_read_u32(&lineInfo->line_size);
+   if (dataSize != -1)
+   {

If I am not wrong, this seems to be the branch that processes a fully populated block.
I think we can check copiedSize here: if copiedSize == 0, that means dataSize
is the size of the whole line, and in this case we can do the enlarge.
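
Roughly like this (a sketch only, reusing the variable names from the quoted
hunk; the surrounding code is elided):

dataSize = pg_atomic_read_u32(&lineInfo->line_size);
if (dataSize != -1)
{
    /*
     * Nothing has been copied yet and the whole line is populated, so the
     * final size is known up front: reserve it once instead of letting
     * appendBinaryStringInfo() repeatedly double the buffer.
     */
    if (copiedSize == 0)
        enlargeStringInfo(&cstate->line_buf, dataSize);

    /* ... copy the block contents as the patch already does ... */
}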


Best regards,
houzj






RE: Parallel Inserts in CREATE TABLE AS

2020-12-06 Thread Hou, Zhijie
Hi

+   /*
+* Flag to let the planner know that the SELECT query is for CTAS. This 
is
+* used to calculate the tuple transfer cost from workers to gather 
node(in
+* case parallelism kicks in for the SELECT part of the CTAS), to zero 
as
+* each worker will insert its share of tuples in parallel.
+*/
+   if (IsParallelInsertInCTASAllowed(into, NULL))
+   query->isForCTAS = true;


+   /*
+* We do not compute the parallel_tuple_cost for CTAS because the 
number of
+* tuples that are transferred from workers to the gather node is zero 
as
+* each worker, in parallel, inserts the tuples that are resulted from 
its
+* chunk of plan execution. This change may make the parallel plan cheap
+* among all other plans, and influence the planner to consider this
+* parallel plan.
+*/
+   if (!(root->parse->isForCTAS &&
+   root->query_level == 1))
+   run_cost += parallel_tuple_cost * path->path.rows;

I noticed that the parallel_tuple_cost will still be ignored
when Gather is not the top node.

Example:
Create table test(i int);
insert into test values(generate_series(1,1000,1));
explain create table ntest3 as select * from test where i < 200 limit 
1;

  QUERY PLAN   
---
 Limit  (cost=1000.00..97331.33 rows=1000 width=4)
   ->  Gather  (cost=1000.00..97331.33 rows=1000 width=4)
 Workers Planned: 2
 ->  Parallel Seq Scan on test  (cost=0.00..96331.33 rows=417 width=4)
   Filter: (i < 200)


isForCTAS will be true because of [create table as], and the
query_level is always 1 because there is no subquery.
So even if Gather is not the top node, the parallel tuple cost will still be ignored.

Does that work as expected?

Best regards,
houzj







Re: A new function to wait for the backend exit after termination

2020-12-04 Thread hou zhijie
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

Thanks for the new patch; it LGTM and works as expected.

The new status of this patch is: Ready for Committer


RE: A new function to wait for the backend exit after termination

2020-12-04 Thread Hou, Zhijie
Hi,

When testing pg_terminate_backend_and_wait with a parallel query,
I noticed that the function is not defined as parallel safe.

I am not very familiar with the criteria for whether a function should be
parallel safe.
But I found that the following functions are all defined as parallel safe:

pg_promote
pg_terminate_backend(integer)
pg_sleep*

Is there a reason why pg_terminate_backend_and_wait is not parallel safe?
(I'm sorry if I missed something in previous mails.)

Best regards,
houzj




RE: A new function to wait for the backend exit after termination

2020-12-03 Thread Hou, Zhijie
Hi,

-however only superusers can terminate superuser backends.
+however only superusers can terminate superuser backends. When no
+wait and timeout are
+provided, only SIGTERM is sent to the backend with the given process
+ID and false is returned immediately. But the

I tested the case where no wait and timeout are provided.
True is returned, as shown below, which seems different from the doc.

postgres=# select pg_terminate_backend(pid);
 pg_terminate_backend 
--
 t
(1 row)

Best regards,
houzj





RE: A new function to wait for the backend exit after termination

2020-12-02 Thread Hou, Zhijie
> 
> I changed the status to 'Waiting on Author'.
> The others of the patch LGTM,
> I think it can be changed to Ready for Committer again, when this comment
> is confirmed.
> 

I am sorry, I forgot a possible typo comment.

+{ oid => '16386', descr => 'terminate a backend process and wait for it\'s 
exit or until timeout occurs'

Does the following change look better?

Wait for it\'s exit => Wait for its exit

Best regards,
houzj




RE: A new function to wait for the backend exit after termination

2020-12-02 Thread Hou, Zhijie
> > +   ereport(WARNING,
> > +   (errmsg("could not wait for the termination of the
> backend with PID %d within %ld milliseconds",
> > +   pid, timeout)));
> 
> > The code use %ld to print int64 type.
> > How about use INT64_FORMAT, which looks more appropriate.
> 
> This is a translatable message, so INT64_FORMAT is no good -- we need
> something that is the same across platforms.  The current project standard
> for this problem is to use "%lld" and explicitly cast the argument to long
> long int to match that.

Thank you for pointing that out,
and sorry for not thinking of it.

Yes, we can use %lld with (long long int) timeout.
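
That is, something like the following (message text taken from the quoted hunk):

ereport(WARNING,
        (errmsg("could not wait for the termination of the backend with PID %d within %lld milliseconds",
                pid, (long long int) timeout)));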

Best regards,
houzj






RE: A new function to wait for the backend exit after termination

2020-12-02 Thread Hou, Zhijie
Hi

I took a look at the patch, and here are some comments.

1.
+
+   ereport(WARNING,
+   (errmsg("could not wait for the termination of the 
backend with PID %d within %ld milliseconds",
+   pid, timeout)));
+

The code uses %ld to print an int64 value.
How about using INT64_FORMAT, which looks more appropriate?

2.
+   if (timeout <= 0)
+   {
+   ereport(WARNING,
+   (errmsg("timeout cannot be negative or zero: 
%ld", timeout)));
+   PG_RETURN_BOOL(r);
+   }

The same as 1.

3.
+pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
+{
+   int pid = PG_GETARG_DATUM(0);

+pg_wait_backend(PG_FUNCTION_ARGS)
+{
+   int pid = PG_GETARG_INT32(0);

The code uses different macros to get pid.
How about using PG_GETARG_INT32(0) for both?


I changed the status to 'Waiting on Author'.
The rest of the patch LGTM;
I think it can be changed to Ready for Committer again once these comments are
confirmed.


Best regards,
houzj








Avoid using lcons and list_delete_first in plan_union_children()

2020-12-01 Thread Hou, Zhijie
Hi,

In function plan_union_children(),
I found that the lcons and list_delete_first calls can easily be replaced by
lappend and list_delete_last, as sketched below.

I also found a previous commit doing a similar thing, so I tried to improve
this one.

Previous commit:
d97b714a219959a50f9e7b37ded674f5132f93f3
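
To make the intent concrete, the change follows this generic pattern (a sketch,
not the exact plan_union_children() code; note that consuming from the tail
instead of the head visits entries in a different order, so items must be
pushed accordingly if the original order has to be preserved):

/* Before: O(N) head operations on the array-based List */
pending = lcons(item, pending);
...
item = linitial(pending);
pending = list_delete_first(pending);

/* After: O(1) tail operations */
pending = lappend(pending, item);
...
item = llast(pending);
pending = list_delete_last(pending);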

Best regards,
houzj




0001-Avoid-using-lcons-and-list_delete_first.patch
Description: 0001-Avoid-using-lcons-and-list_delete_first.patch


RE: [PATCH] Keeps tracking the uniqueness with UniqueKey

2020-11-30 Thread Hou, Zhijie
Hi 

I looked into the patch again and have some comments.

1.
+   Size oid_cmp_len = sizeof(Oid) * ind1->ncolumns;
+
+   return ind1->ncolumns == ind2->ncolumns &&
+   ind1->unique == ind2->unique &&
+   memcmp(ind1->indexkeys, ind2->indexkeys, sizeof(int) * 
ind1->ncolumns) == 0 &&
+   memcmp(ind1->opfamily, ind2->opfamily, oid_cmp_len) == 0 &&
+   memcmp(ind1->opcintype, ind2->opcintype, oid_cmp_len) == 0 &&
+   memcmp(ind1->sortopfamily, ind2->sortopfamily, oid_cmp_len) == 
0 &&
+   equal(get_tlist_exprs(ind1->indextlist, true),
+ get_tlist_exprs(ind2->indextlist, true));

The lengths of sortopfamily, opfamily and opcintype seem to be ->nkeycolumns,
not ->ncolumns.
I checked function get_relation_info, which initializes the IndexOptInfo.
(If there are other places that determine the length, please correct me.)


2.

+   COPY_SCALAR_FIELD(ncolumns);
+   COPY_SCALAR_FIELD(nkeycolumns);
+   COPY_SCALAR_FIELD(unique);
+   COPY_SCALAR_FIELD(immediate);
+   /* We just need to know if it is NIL or not */
+   COPY_SCALAR_FIELD(indpred);
+   COPY_SCALAR_FIELD(predOK);
+   COPY_POINTER_FIELD(indexkeys, from->ncolumns * sizeof(int));
+   COPY_POINTER_FIELD(indexcollations, from->ncolumns * sizeof(Oid));
+   COPY_POINTER_FIELD(opfamily, from->ncolumns * sizeof(Oid));
+   COPY_POINTER_FIELD(opcintype, from->ncolumns * sizeof(Oid));
+   COPY_POINTER_FIELD(sortopfamily, from->ncolumns * sizeof(Oid));
+   COPY_NODE_FIELD(indextlist);

The same as 1:
it should use nkeycolumns, if I am right.


3.
+   foreach(lc, newnode->indextlist)
+   {
+   TargetEntry *tle = lfirst_node(TargetEntry, lc);
+   /* Index on expression is ignored */
+   Assert(IsA(tle->expr, Var));
+   tle->expr = (Expr *) find_parent_var(appinfo, (Var *) 
tle->expr);
+   newnode->indexkeys[idx] = castNode(Var, tle->expr)->varattno;
+   idx++;
+   }

The count variable 'idx' can be replaced by foreach_current_index().
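
For example, the loop could then be written as follows (find_parent_var() is
the patch's own helper; the rest is unchanged from the quoted hunk):

foreach(lc, newnode->indextlist)
{
    TargetEntry *tle = lfirst_node(TargetEntry, lc);

    /* Index on expression is ignored */
    Assert(IsA(tle->expr, Var));
    tle->expr = (Expr *) find_parent_var(appinfo, (Var *) tle->expr);
    newnode->indexkeys[foreach_current_index(lc)] = castNode(Var, tle->expr)->varattno;
}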


Best regards,
houzj





support IncrementalSortPath type in outNode()

2020-11-30 Thread Hou, Zhijie
Hi,

I found that the IncrementalSortPath type is not handled in outNode().
It does not matter at runtime though.

Just for the convenience of debugging,
when debugging with gdb we can then use "call pprint(incresortpath)" to list
the node info.

So I tried to support IncrementalSortPath in outNode().

Best regards,
houzj





0001-support-incremental-sort-in-outnode.patch
Description: 0001-support-incremental-sort-in-outnode.patch


RE: Parallel Inserts in CREATE TABLE AS

2020-11-26 Thread Hou, Zhijie
Hi,

> > I took a deep look at the projection logic.
> > In most cases, you are right that Gather node does not need projection.
> >
> > In some rare cases, such as Subplan (or initplan I guess).
> > The projection will happen in Gather node.
> >
> > The example:
> >
> > Create table test(i int);
> > Create table test2(a int, b int);
> > insert into test values(generate_series(1,1000,1));
> > insert into test2 values(generate_series(1,1000,1),
> > generate_series(1,1000,1));
> >
> > postgres=# explain(verbose, costs off) select test.i,(select i from
> (select * from test2) as tt limit 1) from test where test.i < 2000;
> >QUERY PLAN
> > 
> >  Gather
> >Output: test.i, (SubPlan 1)
> >Workers Planned: 2
> >->  Parallel Seq Scan on public.test
> >  Output: test.i
> >  Filter: (test.i < 2000)
> >SubPlan 1
> >  ->  Limit
> >Output: (test.i)
> >->  Seq Scan on public.test2
> >  Output: test.i
> >
> > In this case, projection is necessary, because the subplan will be
> > executed in projection.
> >
> > If it is skipped, the created table will lose some data.
> >
> 
> Thanks a lot for the use case. Yes with the current patch table will lose
> data related to the subplan. On analyzing further, I think we can not allow
> parallel inserts in the cases when the Gather node has some projections
> to do. Because the workers can not perform that projection. So, having
> ps_ProjInfo in the Gather node is an indication for us to disable parallel
> inserts and only the leader can do the insertions after the Gather node
> does the required projections.
> 
> Thoughts?
> 

Agreed.
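
For clarity, the rule being agreed on could be sketched as below (illustrative
only; "allow_parallel_inserts" is an assumed variable, and the real patch folds
this into its own checks):

/*
 * If the Gather node has a projection of its own (e.g. the SubPlan case in
 * the quoted example), only the leader can evaluate it, so parallel inserts
 * must be disabled and the leader inserts the projected tuples itself.
 */
if (IsA(ps, GatherState) && ps->ps_ProjInfo != NULL)
    allow_parallel_inserts = false;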


2.
@@ -166,6 +228,16 @@ ExecGather(PlanState *pstate)
{
ParallelContext *pcxt;
 
+   /*
+* Take the necessary information to be passed to 
workers for
+* parallel inserts in CTAS.
+*/
+   if (ISCTAS(node->ps.intoclause))
+   {
+   node->ps.lefttree->intoclause = 
node->ps.intoclause;
+   node->ps.lefttree->objectid = node->ps.objectid;
+   }
+
/* Initialize, or re-initialize, shared state needed by 
workers. */
if (!node->pei)
node->pei = 
ExecInitParallelPlan(node->ps.lefttree,

I found that the code passes intoclause and objectid to the Gather node's lefttree.
Is that necessary? It seems only the Gather node will use the information.


Best regards,
houzj





RE: Parallel Inserts in CREATE TABLE AS

2020-11-25 Thread Hou, Zhijie
Hi ,

> On Thu, Nov 26, 2020 at 7:47 AM Hou, Zhijie 
> wrote:
> >
> > Hi,
> >
> > I have an issue about the following code:
> >
> > econtext = node->ps.ps_ExprContext;
> > ResetExprContext(econtext);
> >
> > +   if (ISCTAS(node->ps.intoclause))
> > +   {
> > +   ExecParallelInsertInCTAS(node);
> > +   return NULL;
> > +   }
> >
> > /* If no projection is required, we're done. */
> > if (node->ps.ps_ProjInfo == NULL)
> > return slot;
> >
> > /*
> >  * Form the result tuple using ExecProject(), and return it.
> >  */
> > econtext->ecxt_outertuple = slot;
> > return ExecProject(node->ps.ps_ProjInfo);
> >
> > It seems the projection will be skipped.
> > Is this because projection is not required in this case ?
> > (I'm not very familiar with where the projection will be.)
> >
> 
> For parallel inserts in CTAS, I don't think we need to project the tuples
> being returned from the underlying plan nodes, and also we have nothing
> to project from the Gather node further up. The required projection will
> happen while the tuples are being returned from the underlying nodes and
> the projected tuples are being directly fed to CTAS's dest receiver
> intorel_receive(), from there into the created table. We don't need
> ExecProject again in ExecParallelInsertInCTAS().
> 
> For instance, projection will always be done when the tuple is being returned
> from an underlying sequential scan node(see ExecScan() -->
> ExecProject() and this is true for both leader and workers. In both leader
> and workers, we are just calling CTAS's dest receiver intorel_receive().
> 
> Thoughts?

I took a deep look at the projection logic.
In most cases, you are right that the Gather node does not need a projection.

In some rare cases, such as a SubPlan (or an InitPlan, I guess),
the projection will happen in the Gather node.

The example:

Create table test(i int);
Create table test2(a int, b int);
insert into test values(generate_series(1,1000,1));
insert into test2 values(generate_series(1,1000,1), generate_series(1,1000,1));

postgres=# explain(verbose, costs off) select test.i,(select i from (select * 
from test2) as tt limit 1) from test where test.i < 2000;
   QUERY PLAN

 Gather
   Output: test.i, (SubPlan 1)
   Workers Planned: 2
   ->  Parallel Seq Scan on public.test
 Output: test.i
 Filter: (test.i < 2000)
   SubPlan 1
 ->  Limit
   Output: (test.i)
   ->  Seq Scan on public.test2
 Output: test.i

In this case, the projection is necessary,
because the subplan will be executed during the projection.

If it is skipped, the created table will lose some data.



Best regards,
houzj




RE: Parallel Inserts in CREATE TABLE AS

2020-11-25 Thread Hou, Zhijie
Hi,

I have a question about the following code:

econtext = node->ps.ps_ExprContext;
ResetExprContext(econtext);
 
+   if (ISCTAS(node->ps.intoclause))
+   {
+   ExecParallelInsertInCTAS(node);
+   return NULL;
+   }

/* If no projection is required, we're done. */
if (node->ps.ps_ProjInfo == NULL)
return slot;

/*
 * Form the result tuple using ExecProject(), and return it.
 */
econtext->ecxt_outertuple = slot;
return ExecProject(node->ps.ps_ProjInfo);

It seems the projection will be skipped.
Is this because the projection is not required in this case?
(I'm not very familiar with where the projection happens.)

If the projection is not required here, shall we add a comment about it?

Best regards,
houzj






RE: Parallel Inserts in CREATE TABLE AS

2020-11-24 Thread Hou, Zhijie
Hi,

I'm very interested in this feature,
and I'm looking at the patch; here are some comments.

1.
+   if (!TupIsNull(outerTupleSlot))
+   {
+   (void) 
node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
+   node->ps.state->es_processed++;
+   }
+
+   if(TupIsNull(outerTupleSlot))
+   break;
+   }

How about the following style:

if (TupIsNull(outerTupleSlot))
break;

(void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
node->ps.state->es_processed++;

which looks cleaner.


2.
+
+   if (into != NULL &&
+   IsA(into, IntoClause))
+   {

The check can be replaced by ISCTAS(into).


3.
+   /*
+* For parallelizing inserts in CTAS i.e. making each
+* parallel worker inerst it's tuples, we must send
+* information such as intoclause(for each worker

'inerst' looks like a typo (insert).


4.
+   /* Estimate space for into clause for CTAS. */
+   if (ISCTAS(planstate->intoclause))
+   {
+   intoclausestr = nodeToString(planstate->intoclause);
+   shm_toc_estimate_chunk(>estimator, strlen(intoclausestr) 
+ 1);
+   shm_toc_estimate_keys(>estimator, 1);
+   }
...
+   if (intoclausestr != NULL)
+   {
+   char *shmptr = (char *)shm_toc_allocate(pcxt->toc,
+   
strlen(intoclausestr) + 1);
+   strcpy(shmptr, intoclausestr);
+   shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr);
+   }

The code here calls strlen(intoclausestr) twice.
Looking at the existing code in ExecInitParallelPlan,
it stores the strlen in a variable.

So how about the following style, matching the existing code in
ExecInitParallelPlan:

intoclause_len = strlen(intoclausestr);
...
/* Store serialized intoclause. */
intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1);
memcpy(intoclause_space, intoclausestr, intoclause_len + 1);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space);


5.
+   if (intoclausestr != NULL)
+   {
+   char *shmptr = (char *)shm_toc_allocate(pcxt->toc,
+   
strlen(intoclausestr) + 1);
+   strcpy(shmptr, intoclausestr);
+   shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr);
+   }
+
/* Set up the tuple queues that the workers will write into. */
-   pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);
+   if (intoclausestr == NULL)
+   pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);

The two checks on intoclausestr can be combined like:

if (intoclausestr != NULL)
{
...
}
else
{
...
}

Best regards,
houzj






RE: Parallel copy

2020-11-19 Thread Hou, Zhijie
Hi Vignesh,

I took a look at the v10 patch set. Here are some comments:

1. 
+/*
+ * CheckExprParallelSafety
+ *
+ * Determine if where cluase and default expressions are parallel safe & do not
+ * have volatile expressions, return true if condition satisfies else return
+ * false.
+ */

'cluase' seems a typo.


2.
+   /*
+* Make sure that no worker has consumed this element, 
if this
+* line is spread across multiple data blocks, worker 
would have
+* started processing, no need to change the state to
+* LINE_LEADER_POPULATING in this case.
+*/
+   (void) 
pg_atomic_compare_exchange_u32(>line_state,
+   
  _line_state,
+   
  LINE_LEADER_POPULATED);
About the comment

+* started processing, no need to change the state to
+* LINE_LEADER_POPULATING in this case.

Does it mean 'no need to change the state to LINE_LEADER_POPULATED' here?


3.
+ * 3) only one worker should choose one line for processing, this is handled by
+ *using pg_atomic_compare_exchange_u32, worker will change the state to
+ *LINE_WORKER_PROCESSING only if line_state is LINE_LEADER_POPULATED.

In the latest patch, it will set the state to LINE_WORKER_PROCESSING if 
line_state is LINE_LEADER_POPULATED or LINE_LEADER_POPULATING.
So the comment here seems wrong.


4.
A suggestion for CacheLineInfo.

It uses appendBinaryStringXXX to store the line in memory.
appendBinaryStringXXX will double the string memory when there is not enough space.

How about calling enlargeStringInfo in advance, if we already know the whole
line size?
It can avoid some memory waste and may improve performance a little.


Best regards,
houzj






Add Nullif case for eval_const_expressions_mutator

2020-11-10 Thread Hou, Zhijie
Hi

I noticed that there is no NULLIF case in eval_const_expressions.
Since NULLIF is similar to an OpExpr and is easy to implement,
I tried adding this case in eval_const_expressions_mutator, roughly as sketched below.
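
For reference, such a case could look roughly like the following inside
eval_const_expressions_mutator() (a sketch modeled on how other OpExpr-like
nodes are handled there with the existing ece_* helper macros; not necessarily
identical to the attached patch):

case T_NullIfExpr:
    {
        NullIfExpr *expr;

        /* Copy the node and const-simplify its arguments */
        expr = (NullIfExpr *) ece_generic_processing(node);

        /* Must set opfuncid before checking whether it is safe to evaluate */
        set_opfuncid((OpExpr *) expr);

        /* If all arguments are Consts, we can fold the whole expression */
        if (ece_all_arguments_const(expr) &&
            ece_function_is_safe(expr->opfuncid, context))
            return ece_evaluate_expr(expr);

        return (Node *) expr;
    }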

Best regards,
houzj




0001-patch-eval-NULLIF.patch
Description: 0001-patch-eval-NULLIF.patch


RE: Parallel copy

2020-11-05 Thread Hou, Zhijie
Hi

> 
> my $bytes = $ARGV[0];
> for(my $i = 0; $i < $bytes; $i+=8){
>  print "longdata";
> }
> print "\n";
> 
> 
> postgres=# copy longdata from program 'perl /tmp/longdata.pl 1'
> with (parallel 2);
> 
> This gets stuck forever (or at least I didn't have the patience to wait
> it finish). Both worker processes are consuming 100% of CPU.

I had a look at this problem.

The ParallelCopyDataBlock has a size limit:
uint8   skip_bytes;
char    data[DATA_BLOCK_SIZE];  /* data read from file */

It seems the input line is so long that the leader process runs out of the
shared memory that is shared among the parallel copy workers,
and the leader process keeps waiting for a free block.

The worker process waits until line_state becomes LINE_LEADER_POPULATED,
but the leader process won't set the line_state until it has read the whole line.

So it is stuck forever.
Maybe we should reconsider this situation.

The stack is as follows:

Leader stack:
#3  0x0075f7a1 in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=41, timeout=timeout@entry=1, 
wait_event_info=wait_event_info@entry=150994945) at latch.c:411
#4  0x005a9245 in WaitGetFreeCopyBlock 
(pcshared_info=pcshared_info@entry=0x7f26d2ed3580) at copyparallel.c:1546
#5  0x005a98ce in SetRawBufForLoad (cstate=cstate@entry=0x2978a88, 
line_size=67108864, copy_buf_len=copy_buf_len@entry=65536, 
raw_buf_ptr=raw_buf_ptr@entry=65536, 
copy_raw_buf=copy_raw_buf@entry=0x7fff4cdc0e18) at copyparallel.c:1572
#6  0x005a1963 in CopyReadLineText (cstate=cstate@entry=0x2978a88) at 
copy.c:4058
#7  0x005a4e76 in CopyReadLine (cstate=cstate@entry=0x2978a88) at 
copy.c:3863

Worker stack:
#0  GetLinePosition (cstate=cstate@entry=0x29e1f28) at copyparallel.c:1474
#1  0x005a8aa4 in CacheLineInfo (cstate=cstate@entry=0x29e1f28, 
buff_count=buff_count@entry=0) at copyparallel.c:711
#2  0x005a8e46 in GetWorkerLine (cstate=cstate@entry=0x29e1f28) at 
copyparallel.c:885
#3  0x005a4f2e in NextCopyFromRawFields (cstate=cstate@entry=0x29e1f28, 
fields=fields@entry=0x7fff4cdc0b48, nfields=nfields@entry=0x7fff4cdc0b44) at 
copy.c:3615
#4  0x005a50af in NextCopyFrom (cstate=cstate@entry=0x29e1f28, 
econtext=econtext@entry=0x2a358d8, values=0x2a42068, nulls=0x2a42070) at 
copy.c:3696
#5  0x005a5b90 in CopyFrom (cstate=cstate@entry=0x29e1f28) at 
copy.c:2985


Best regards,
houzj





RE: psql \df choose functions by their arguments

2020-11-03 Thread Hou, Zhijie
Hi

(sorry, I forgot to cc the hackers list)

> Improve psql \df to choose functions by their arguments

I think this is useful.

I have some comments on the patch.

1.
> * Abbreviations of common types is permitted (because who really likes 
> to type out "character varying"?), so the above could also be written as:

Some abbreviations of common types are not added to type_abbreviations[],
such as:

int8                  => bigint
int2                  => smallint
int4, int             => integer
float4                => real
float8, float, double => double precision
(and the same for the corresponding array types)

A single flat array seems difficult to handle this with; maybe we can use a
two-dimensional array or a struct, e.g. something like the sketch below.
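
One possible shape for that (illustrative only, not what the patch implements;
the canonical names follow the list above):

typedef struct
{
    const char *abbrev;     /* what the user may type */
    const char *canonical;  /* what the catalog calls it */
} TypeAbbrev;

static const TypeAbbrev type_abbreviations[] = {
    {"int8", "bigint"},
    {"int2", "smallint"},
    {"int4", "integer"},
    {"int", "integer"},
    {"float4", "real"},
    {"float8", "double precision"},
    {"float", "double precision"},
    {"double", "double precision"},
};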

2.
And I think it's better to update the '\?' help text for '\df[+]' in function
slashUsage(unsigned short int pager).

Best regards,
houzj




Fix typo in xlogreader.h and procarray.c

2020-11-02 Thread Hou, Zhijie
Hi

I found some possible typos in procarray.c and xlogreader.h:

- * For VACUUM separate horizons (used to to decide which deleted tuples must
+ * For VACUUM separate horizons (used to decide which deleted tuples must

- * Callers supply a page_read callback if they want to to call
+ * Callers supply a page_read callback if they want to call

Best regards,
houzj




0001-fix-typo.patch
Description: 0001-fix-typo.patch


RE: Parallel copy

2020-10-28 Thread Hou, Zhijie
Hi 

I found some issues in v9-0002.

1.
+
+   elog(DEBUG1, "[Worker] Processing - line position:%d, block:%d, 
unprocessed lines:%d, offset:%d, line size:%d",
+write_pos, lineInfo->first_block,
+pg_atomic_read_u32(_blk_ptr->unprocessed_line_parts),
+offset, pg_atomic_read_u32(>line_size));
+

write_pos and the other variables printed here are of type uint32; I think
it's better to use '%u' in the elog message.

2.
+* line_size will be set. Read the line_size again to be sure 
if it is
+* completed or partial block.
+*/
+   dataSize = pg_atomic_read_u32(&lineInfo->line_size);
+   if (dataSize)

It uses dataSize (type int) to hold a uint32 value, which seems a little dangerous.
Would it be better to define dataSize as uint32 here?

3.
Since the functions with 'Cstate' in their names have been changed to 'CState',
I think we can rename PopulateCommonCstateInfo as well.

4.
+   if (pcdata->worker_line_buf_count)

I think checks like the above can be written as 'if (xxx > 0)', which seems
easier to understand.


Best regards,
houzj




Fix typo in src/backend/utils/cache/lsyscache.c

2020-10-24 Thread Hou, Zhijie
Hi

I found that the comment of function get_attgenerated(Oid relid, AttrNumber attnum)
seems wrong.
The function is given the attribute number, not the name.

/*
  * get_attgenerated
  *
- * Given the relation id and the attribute name,
+ * Given the relation id and the attribute number,
  * return the "attgenerated" field from the attribute relation.
  *
  * Errors if not found.

Best regards,
houzj




0001-fix-typo.patch
Description: 0001-fix-typo.patch

