Re: [HACKERS] Block level parallel vacuum

2019-10-27 Thread Dilip Kumar
On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada  wrote:
>
> On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar  wrote:
> >
> > On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada wrote:
> > >
> > > To explain my idea in more detail: the first worker who enters
> > > vacuum_delay_point adds its local value to the shared value and resets
> > > the local value to 0. The worker then sleeps if the shared value exceeds
> > > VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit
> > > from the shared value. Since vacuum_delay_point is typically called
> > > once per page processed, I expect there will be no such problem. Thoughts?
> >
> > Oh right, I assumed that you were adding the local balance to the
> > shared value only when the local balance exceeds the VacuumCostLimit,
> > but you are adding it to the shared value every time in
> > vacuum_delay_point.  So I think your idea is correct.
>
> I've attached the updated patch set.
>
> First three patches add new variables and a callback to index AM.
>
> Next two patches are the main part to support parallel vacuum. I've
> incorporated all review comments I got so far. The memory layout of
> the variable-length index statistics might be a bit complex. It's similar
> to the format of a heap tuple header, having a null bitmap, and both the
> size of the index statistics and the actual data for each index follow.
>
> Last patch is a PoC patch that implements the shared vacuum cost
> balance. For now it's separate, but after testing both approaches it
> will be merged into the 0004 patch. I'll test both next week.
>
> This patch set can be applied on top of the patch [1] that improves
> gist index bulk-deletion, so canparallelvacuum of the gist index is true.
>
> [1] 
> https://www.postgresql.org/message-id/CAFiTN-uQY%2BB%2BCLb8W3YYdb7XmB9hyYFXkAy3C7RY%3D-YSWRV1DA%40mail.gmail.com
>
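For illustration, here is a minimal sketch of the shared-balance idea
quoted above (the shared counter VacuumSharedCostBalance and the atomics
usage are assumptions for illustration, not taken from the actual patch):

/*
 * Sketch only: VacuumSharedCostBalance is an assumed pg_atomic_uint32
 * living in the DSM segment; VacuumCostBalance, VacuumCostLimit and
 * VacuumCostDelay are the existing globals.
 */
static void
vacuum_delay_point_shared(void)
{
	uint32		shared;

	/* add the local balance to the shared value and reset the local one */
	shared = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
									 (uint32) VacuumCostBalance);
	VacuumCostBalance = 0;

	if (shared >= (uint32) VacuumCostLimit)
	{
		/* subtract the limit before sleeping, as described above */
		pg_atomic_fetch_sub_u32(VacuumSharedCostBalance, VacuumCostLimit);
		pg_usleep((long) (VacuumCostDelay * 1000L));
	}
}
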
I haven't yet read the new patch set, but I have noticed one
thing: we are getting the size of the statistics using the AM
routine, yet we are copying those statistics from local memory to
the shared memory directly using memcpy.  Wouldn't it be a good
idea to have an AM-specific routine to get them copied from the local
memory to the shared memory?  I am not sure whether it is worth it, but
my thought behind this point is that it would let the AM keep its local
stats in any form (e.g. it could store a pointer in them) and
serialize them while copying to the shared stats.  And later, when the
shared stats are passed back to the AM, it can deserialize them into
its local form and use them.  (A rough sketch follows the excerpt below.)

+		/*
+		 * Since all vacuum workers write the bulk-deletion result at
+		 * different slots we can write them without locking.
+		 */
+		if (!shared_indstats->updated && stats[idx] != NULL)
+		{
+			memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+			shared_indstats->updated = true;
+
+			/*
+			 * no longer need the locally allocated result and now
+			 * stats[idx] points to the DSM segment.
+			 */
+			pfree(stats[idx]);
+			stats[idx] = bulkdelete_res;
+		}
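
For illustration, hypothetical callbacks along those lines might look
like this (every name below is invented for illustration and is not part
of the posted patches):

/*
 * Hypothetical IndexAmRoutine additions: the AM serializes its private
 * bulk-delete statistics into its DSM slot and later reconstructs them
 * from it, instead of the caller doing a plain memcpy.
 */
typedef void (*amserializebulkdeleteresult_function) (IndexBulkDeleteResult *stats,
													  void *shared_dest);

typedef IndexBulkDeleteResult *(*amdeserializebulkdeleteresult_function) (void *shared_src);

An AM that stores a pointer inside its IndexBulkDeleteResult could then
flatten the pointed-to data in the serialize callback, which a plain
memcpy cannot do.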

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




RE: Proposal: Add more compile-time asserts to expose inconsistencies.

2019-10-27 Thread Smith, Peter
From: Andres Freund  Sent: Sunday, 27 October 2019 12:03 PM

>>Ideally, the implementation should end up calling _Static_assert() 
>>somehow, so that we get the compiler's native error message.

OK. I can work on that.

>>We could do a configure check for whether _Static_assert() works at 
>>file scope.  I don't know what the support for that is, but it seems to 
>>work in gcc and clang

> I think it should work everywhere that has static assert. So we should need a 
> separate configure check.

Er, that's a typo right? I think you meant: "So we *shouldn't* need a separate 
configure check"

Kind Regards
---
Peter Smith
Fujitsu Australia


Re: Proposal: Add more compile-time asserts to expose inconsistencies.

2019-10-27 Thread Andres Freund
On 2019-10-27 11:44:54 +, Smith, Peter wrote:
> From: Andres Freund  Sent: Sunday, 27 October 2019 12:03 PM
> > I think it should work everywhere that has static assert. So we should need 
> > a separate configure check.
> 
> Er, that's a typo right? I think you meant: "So we *shouldn't* need a 
> separate configure check"

Yes.




Re: Binary support for pgoutput plugin

2019-10-27 Thread Dmitry Dolgov
> On Mon, Jun 17, 2019 at 10:29:26AM -0400, Dave Cramer wrote:
> > Which is what I have done. Thanks
> >
> > I've attached both patches for comments.
> > I still have to add documentation.
>
> Additional patch for documentation.

Thank you for the patch! Unfortunately 0002 has some conflicts, could
you please send a rebased version? In the meantime I have a few questions:

-LogicalRepRelId
+void
 logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)
 {
	char		action;
-   LogicalRepRelId relid;
-
-   /* read the relation id */
-   relid = pq_getmsgint(in, 4);

action = pq_getmsgbyte(in);
if (action != 'N')
@@ -175,7 +173,6 @@ logicalrep_read_insert(StringInfo in, LogicalRepTupleData *newtup)

logicalrep_read_tuple(in, newtup);

-   return relid;
 }



-   relid = logicalrep_read_insert(s, &newtup);
+   /* read the relation id */
+   relid = pq_getmsgint(s, 4);
rel = logicalrep_rel_open(relid, RowExclusiveLock);
+
+   logicalrep_read_insert(s, &newtup);

Maybe I'm missing something: what is the reason for moving pq_getmsgint
out of logicalrep_read_*? Just from the patch, the code seems
equivalent.

> There is one obvious hack where in binary mode I reset the input
> cursor to allow the binary input to be re-read. From what I can tell
> the alternative is to convert the data in logicalrep_read_tuple, but
> that would require moving a lot of the logic currently in worker.c to
> proto.c. This seems minimally invasive.

Which logic would have to be moved, for example? Actually it sounds more
natural to me if this functionality lived in e.g. logicalrep_read_tuple
rather than slot_store_data, since it has to do with reading data. And
it seems that in pglogical it's also located in pglogical_read_tuple.




Re: Proposition to use '==' as synonym for 'IS NOT DISTINCT FROM'

2019-10-27 Thread Tom Lane
I wrote:
> To clarify, what I have in mind here doesn't have any effect whatever
> on the parse tree or the execution semantics, it's just about offering
> an alternative SQL textual representation.

Continuing this thread ... if we were just trying to fix the
dump/restore issue without regard for verbosity, I think I'd propose
that we implement syntaxes like

x IS DISTINCT FROM y
x IS DISTINCT (=) FROM y
x IS DISTINCT (schema.=) FROM y
x IS NOT DISTINCT FROM y
x IS NOT DISTINCT (=) FROM y
x IS NOT DISTINCT (schema.=) FROM y

with the understanding that the parenthesized operator name is what
to use for the underlying equality comparison, and that in the absence
of any name, the parser looks up "=" (which is what it does today).
Thus the first two alternatives are precisely equivalent, as are the
fourth and fifth.  Also, to support row-wise comparisons, we could
allow cases like

ROW(a,b) IS NOT DISTINCT (schema1.=, schema2.=) FROM ROW(x,y)

ruleutils.c could proceed by looking up the operator(s) normally,
and skipping the verbose syntax when they would print as just "=",
so that we don't need to emit nonstandard SQL for common cases.

I haven't actually checked to ensure that Bison can handle this,
but since DISTINCT and FROM are both fully reserved words, it seems
virtually certain that this would work without syntactic ambiguity.
It also seems relatively understandable as to what it means.

But of course, this is the exact opposite of addressing Eugen's
concern about verbosity :-(.  Can we pack the same functionality
into fewer characters?

regards, tom lane




Re: [Proposal] Arbitrary queries in postgres_fdw

2019-10-27 Thread David Fetter
On Fri, Oct 25, 2019 at 05:17:18PM +0200, rto...@carto.com wrote:
> Dear all,
> 
> We stumbled upon a few cases in which retrieving information from the
> foreign server may turn out to be pretty useful before creating any
> foreign table, especially info related to the catalog, e.g. a list of
> schemas or tables the user has access to.
> 
> I thought of using dblink for it, but that requires duplication of
> server and user mapping details and it adds its own management of
> connections.
> 
> Then I thought a better approach may be a mix of both: a function to
> issue arbitrary queries to the foreign server reusing all the details
> encapsulated in the server and user mapping. It would use the same
> pool of connections.

There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
does something similar to this.  Might it be possible to incorporate
it into the previous patch that implemented that feature?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Selecting fields from a RowExpr

2019-10-27 Thread Tom Lane
At pgconf.eu, someone whose name I've forgotten pointed out to me
that this doesn't work:

regression=# select (row(1, 2.0)).f1;
ERROR:  could not identify column "f1" in record data type
LINE 1: select (row(1, 2.0)).f1;
                             ^

The fields of an anonymous rowtype are certainly named f1, f2, etc,
so it seems like this *should* work.  A related case is

regression=# select (row(1, 2.0)).*;
ERROR:  record type has not been registered

Admittedly, these probably aren't terribly useful cases in practice,
but it's unfortunate that they don't work as one would expect.
So I propose the attached patch to make them work.

The underlying reason for both of these failures is that RowExpr
doesn't carry a typmod, so if it's of type RECORD then
get_expr_result_type doesn't know how to find a tupdesc for it.
The minimum-code solution is to teach get_expr_result_type to build
a tupdesc directly from the RowExpr, and that seems to be necessary
for complicated cases like

select (r).f1 from (select row(1, 2.0) as r) ss;

In an earlier version of the patch I chose to add in some fast-path
logic in ParseComplexProjection and ExpandRowReference, so as to
make the really simple cases shown above a bit less inefficient.
But on second thought, these are such corner cases that it doesn't
seem worth carrying extra code for them.  The cases that are more
likely to arise in practice are like that last example, and we
can't optimize that in the parser.  (The planner will optimize
FieldSelect-from-RowExpr after flattening subqueries, which is
probably as much as we really need to do here.)

I don't feel a need to back-patch this, but I would like to push
it into HEAD.

Thoughts?

regards, tom lane

diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index b7fac5d..4688fbc 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -227,6 +227,38 @@ get_expr_result_type(Node *expr,
 		  NULL,
 		  resultTypeId,
 		  resultTupleDesc);
+	else if (expr && IsA(expr, RowExpr) &&
+			 ((RowExpr *) expr)->row_typeid == RECORDOID)
+	{
+		/* We can resolve the record type by generating the tupdesc directly */
+		RowExpr    *rexpr = (RowExpr *) expr;
+		TupleDesc	tupdesc;
+		AttrNumber	i = 1;
+		ListCell   *lcc,
+				   *lcn;
+
+		tupdesc = CreateTemplateTupleDesc(list_length(rexpr->args));
+		Assert(list_length(rexpr->args) == list_length(rexpr->colnames));
+		forboth(lcc, rexpr->args, lcn, rexpr->colnames)
+		{
+			Node	   *col = (Node *) lfirst(lcc);
+			char	   *colname = strVal(lfirst(lcn));
+
+			TupleDescInitEntry(tupdesc, i,
+							   colname,
+							   exprType(col),
+							   exprTypmod(col),
+							   0);
+			TupleDescInitEntryCollation(tupdesc, i,
+										exprCollation(col));
+			i++;
+		}
+		if (resultTypeId)
+			*resultTypeId = rexpr->row_typeid;
+		if (resultTupleDesc)
+			*resultTupleDesc = BlessTupleDesc(tupdesc);
+		return TYPEFUNC_COMPOSITE;
+	}
 	else
 	{
 		/* handle as a generic expression; no chance to resolve RECORD */
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index a272305..2a273f8 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -436,6 +436,45 @@ where i8 in (row(123,456)::int8_tbl, '(4567890123456789,123)');
  4567890123456789 | 123
 (2 rows)
 
+-- Check ability to select columns from an anonymous rowtype
+select (row(1, 2.0)).f1;
+ f1 
+----
+  1
+(1 row)
+
+select (row(1, 2.0)).f2;
+ f2  
+-----
+ 2.0
+(1 row)
+
+select (row(1, 2.0)).nosuch;  -- fail
+ERROR:  could not identify column "nosuch" in record data type
+LINE 1: select (row(1, 2.0)).nosuch;
+                             ^
+select (row(1, 2.0)).*;
+ f1 | f2  
+----+-----
+  1 | 2.0
+(1 row)
+
+select (r).f1 from (select row(1, 2.0) as r) ss;
+ f1 
+----
+  1
+(1 row)
+
+select (r).f3 from (select row(1, 2.0) as r) ss;  -- fail
+ERROR:  could not identify column "f3" in record data type
+LINE 1: select (r).f3 from (select row(1, 2.0) as r) ss;
+                   ^
+select (r).* from (select row(1, 2.0) as r) ss;
+ f1 | f2  
+----+-----
+  1 | 2.0
+(1 row)
+
 -- Check some corner cases involving empty rowtypes
 select ROW();
  row 
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index 7e080c0..83cf4a1 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -171,6 +171,15 @@ where i8 in (row(123,456)::int8_tbl, '(4567890123456789,123)');
 select * from int8_tbl i8
 where i8 in (row(123,456)::int8_tbl, '(4567890123456789,123)');
 
+-- Check ability to select columns from an anonymous rowtype
+select (row(1, 2.0)).f1;
+select (row(1, 2.0)).f2;
+select (row(1, 2.0)).nosuch;  -- fail
+select (row(1, 2.0)).*;
+select (r).f1 from (select row(1, 2.0) as r) ss;
+select (r).f3 from (select row(1, 2.0) as r) ss;  -- fail
+select (r).* from (select row(1, 2.0) as r) ss;
+
 -- Check some corner cases 

Re: Selecting fields from a RowExpr

2019-10-27 Thread Pavel Stehule
Hi

ne 27. 10. 2019 v 19:47 odesílatel Tom Lane  napsal:

> At pgconf.eu, someone whose name I've forgotten pointed out to me
> that this doesn't work:
>
> regression=# select (row(1, 2.0)).f1;
> ERROR:  could not identify column "f1" in record data type
> LINE 1: select (row(1, 2.0)).f1;
>                              ^
>
> The fields of an anonymous rowtype are certainly named f1, f2, etc,
> so it seems like this *should* work.  A related case is
>
> regression=# select (row(1, 2.0)).*;
> ERROR:  record type has not been registered
>
> Admittedly, these probably aren't terribly useful cases in practice,
> but it's unfortunate that they don't work as one would expect.
> So I propose the attached patch to make them work.
>
> The underlying reason for both of these failures is that RowExpr
> doesn't carry a typmod, so if it's of type RECORD then
> get_expr_result_type doesn't know how to find a tupdesc for it.
> The minimum-code solution is to teach get_expr_result_type to build
> a tupdesc directly from the RowExpr, and that seems to be necessary
> for complicated cases like
>
> select (r).f1 from (select row(1, 2.0) as r) ss;
>
> In an earlier version of the patch I chose to add in some fast-path
> logic in ParseComplexProjection and ExpandRowReference, so as to
> make the really simple cases shown above a bit less inefficient.
> But on second thought, these are such corner cases that it doesn't
> seem worth carrying extra code for them.  The cases that are more
> likely to arise in practice are like that last example, and we
> can't optimize that in the parser.  (The planner will optimize
> FieldSelect-from-RowExpr after flattening subqueries, which is
> probably as much as we really need to do here.)
>
> I don't feel a need to back-patch this, but I would like to push
> it into HEAD.
>

I sometimes hit this limit, and more consistent behavior of composite
types would be nice.

It's a new feature, and there is no reason for back-patching.

Regards

Pavel

>
> Thoughts?
>
> regards, tom lane
>
>


Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock(PG10.7)

2019-10-27 Thread Alexander Korotkov
Hi, Peter!

On Fri, Oct 4, 2019 at 12:05 AM Alexander Korotkov wrote:
> I'm planning to continue work on README, comments and commit messages.

It took me a lot of effort to revise the concurrency section in the README.
I can't judge the result, but I probably did my best.  I'd like to
commit this patchset including both bug fixes and the README update, but
I'd like you to take a look at the README patch first.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-Fix-deadlock-between-ginDeletePage-and-ginStepRigh-3.patch
Description: Binary data


0002-Fix-traversing-to-the-deleted-GIN-page-via-downlin-3.patch
Description: Binary data


0003-Revise-concurrency-section-in-GIN-README-3.patch
Description: Binary data


update ALTER TABLE with ATTACH PARTITION lock mode

2019-10-27 Thread Justin Pryzby
Commit 898e5e32 (Allow ATTACH PARTITION with only ShareUpdateExclusiveLock)
updates ddl.sgml but not alter_table.sgml, which only says:

https://www.postgresql.org/docs/12/release-12.html
|An ACCESS EXCLUSIVE lock is held unless explicitly noted.

Find attached a patch, which also improves the language in several related places.

"Without such a constraint": SUCH could refer to either of the constraints..

"because it is no longer necessary.": In our use case, we prefer to keep the
redundant constraint, to avoid having to add it back if we detach/reattach
again in the future..
From c820a81fba0a6c2388ec58fc0204ca833523e81e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 27 Oct 2019 18:54:24 -0500
Subject: [PATCH v1 1/2] Mention reduced locking strength of ATTACH PARTITION..

See commit 898e5e32
---
 doc/src/sgml/ref/alter_table.sgml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index c8dfa19..a184bed 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -900,6 +900,13 @@ WITH ( MODULUS numeric_literal, REM
   the scan of the new partition, it is always skipped when the default
   partition is a foreign table.
  
+
+ 
+  Attaching a partition acquires a SHARE UPDATE EXCLUSIVE
+  lock on the partitioned table, in addition to an
+  ACCESS EXCLUSIVE lock on the partition.
+ 
+
 

 
-- 
2.7.4

From b1c9c50228ebd3d2d511382ebd6cbae08788e376 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 27 Oct 2019 18:38:23 -0500
Subject: [PATCH v1 2/2] Tweak language for ATTACH PARTITION docs

---
 doc/src/sgml/ddl.sgml |  8 
 doc/src/sgml/ref/alter_table.sgml | 12 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b097b80..8b60d8b 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3972,14 +3972,14 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
 
  Before running the ATTACH PARTITION command, it is
  recommended to create a CHECK constraint on the table to
- be attached describing the desired partition constraint.  That way,
+ be attached matching the desired partition constraint.  That way,
  the system will be able to skip the scan which is otherwise needed to validate the implicit
- partition constraint. Without such a constraint, the table will be
+ partition constraint. Without the CHECK constraint, the table will be
  scanned to validate the partition constraint while holding an
  ACCESS EXCLUSIVE lock on that partition
  and a SHARE UPDATE EXCLUSIVE lock on the parent table.
- One may then drop the constraint after ATTACH PARTITION
- is finished, because it is no longer necessary.
+ It may be desired to drop the redundant CHECK constraint
+ after ATTACH PARTITION is finished.
 
 
 
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index a184bed..91ec626 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -841,7 +841,7 @@ WITH ( MODULUS numeric_literal, REM
   or as a default partition by using DEFAULT.
   For each index in the target table, a corresponding
   one will be created in the attached table; or, if an equivalent
-  index already exists, will be attached to the target table's index,
+  index already exists, it will be attached to the target table's index,
   as if ALTER INDEX ATTACH PARTITION had been executed.
   Note that if the existing table is a foreign table, it is currently not
   allowed to attach the table as a partition of the target table if there
@@ -864,17 +864,17 @@ WITH ( MODULUS numeric_literal, REM
   already exist.
   If any of the CHECK constraints of the table being
   attached are marked NO INHERIT, the command will fail;
-  such a constraint must be recreated without the NO INHERIT
+  such constraints must be recreated without the NO INHERIT
   clause.
  
 
  
   If the new partition is a regular table, a full table scan is performed
-  to check that no existing row in the table violates the partition
+  to check that existing rows in the table do not violate the partition
   constraint.  It is possible to avoid this scan by adding a valid
-  CHECK constraint to the table that would allow only
-  the rows satisfying the desired partition constraint before running this
-  command.  It will be determined using such a constraint that the table
+  CHECK constraint to the table that allows only
+  rows satisfying the desired partition constraint before running this
+  command.  The CHECK constraint will be used to determine that the table
   need not be scanned to validate the partition constraint.  This does not
   work, however, if any of the part

RE: Proposal: Add more compile-time asserts to expose inconsistencies.

2019-10-27 Thread Smith, Peter
From: Andres Freund  Sent: Sunday, 27 October 2019 12:03 PM
> My proposal for this really was just to use this as a fallback for when 
> static assert isn't available. Which in turn I was just suggesting because 
> Tom wanted a fallback.

The patch is updated to use the "extern" technique only when
"_Static_assert" is unavailable.

PSA.

Kind Regards,
---
Peter Smith
Fujitsu Australia


ct_asserts_StaticAssertDecl_3.patch
Description: ct_asserts_StaticAssertDecl_3.patch


Re: PL/Python fails on new NetBSD/PPC 8.0 install

2019-10-27 Thread Tom Lane
Awhile back I wrote:
> I noticed that the old NetBSD 5.1.5 installation I had on my G4 Mac
> was no longer passing our regression tests, because it has a strtof()
> that is sloppy about underflow.  Rather than fight with that I decided
> to update it to something shinier (well, as shiny as you can get on
> hardware that's old enough to apply for a driver's license).  I stuck in
> NetBSD/macppc 8.0, and things seem to work, except that PL/Python
> crashes on launch.  I see something like this in the postmaster log:

> Traceback (most recent call last):
>   File "", line 1162, in 
> _install_external_importers
>   File "", line 980, in _find_and_load
>   File "", line 149, in __enter__
>   File "", line 84, in acquire
> RuntimeError: no current thread ident
> Fatal Python error: initexternalimport: external importer setup failed
> 
> Current thread 0x (most recent call first):
> 2019-06-18 17:40:59.629 EDT [20764] LOG:  server process (PID 23714) was 
> terminated by signal 6: Abort trap
> 2019-06-18 17:40:59.629 EDT [20764] DETAIL:  Failed process was running: 
> CREATE FUNCTION stupid() RETURNS text AS 'return "zarkon"' LANGUAGE 
> plpython3u;

So ... I just got this identical failure on NetBSD 8.1 on a shiny
new Intel NUC box.  So that removes the excuse of old unsupported
hardware, and leaves us with the conclusion that PL/Python is
flat out broken on recent NetBSD.

This is with today's HEAD of our code and the python37-3.7.4/amd64
package from NetBSD 8.1.

BTW, the only somewhat-modern NetBSD machine in our buildfarm is
coypu, which is running NetBSD/macppc 8.0 ... but what it is testing
PL/Python against is python 2.7.15, so the fact that it doesn't
fail can probably be explained as a python 2 vs python 3 thing.

regards, tom lane




Re: PL/Python fails on new NetBSD/PPC 8.0 install

2019-10-27 Thread Benjamin Scherrey
None of the output provides any clue to me, but I do know that Python 3.7
has some issues with a lot of versions of OpenSSL, based on a
disagreement between devs on both projects. This was a problem for me when
trying to build Python 3.7 on my Kubuntu 14.04 system. I've seen this issue
reported across all targets for Python, including FreeBSD, so I expect it's
likely to also happen on NetBSD.

Perhaps this might be related to the problem?

On Mon, Oct 28, 2019, 8:12 AM Tom Lane  wrote:

> Awhile back I wrote:
> > I noticed that the old NetBSD 5.1.5 installation I had on my G4 Mac
> > was no longer passing our regression tests, because it has a strtof()
> > that is sloppy about underflow.  Rather than fight with that I decided
> > to update it to something shinier (well, as shiny as you can get on
> > hardware that's old enough to apply for a driver's license).  I stuck in
> > NetBSD/macppc 8.0, and things seem to work, except that PL/Python
> > crashes on launch.  I see something like this in the postmaster log:
>
> > Traceback (most recent call last):
> >   File "", line 1162, in
> _install_external_importers
> >   File "", line 980, in _find_and_load
> >   File "", line 149, in __enter__
> >   File "", line 84, in acquire
> > RuntimeError: no current thread ident
> > Fatal Python error: initexternalimport: external importer setup failed
> >
> > Current thread 0x (most recent call first):
> > 2019-06-18 17:40:59.629 EDT [20764] LOG:  server process (PID 23714) was
> terminated by signal 6: Abort trap
> > 2019-06-18 17:40:59.629 EDT [20764] DETAIL:  Failed process was running:
> CREATE FUNCTION stupid() RETURNS text AS 'return "zarkon"' LANGUAGE
> plpython3u;
>
> So ... I just got this identical failure on NetBSD 8.1 on a shiny
> new Intel NUC box.  So that removes the excuse of old unsupported
> hardware, and leaves us with the conclusion that PL/Python is
> flat out broken on recent NetBSD.
>
> This is with today's HEAD of our code and the python37-3.7.4/amd64
> package from NetBSD 8.1.
>
> BTW, the only somewhat-modern NetBSD machine in our buildfarm is
> coypu, which is running NetBSD/macppc 8.0 ... but what it is testing
> PL/Python against is python 2.7.15, so the fact that it doesn't
> fail can probably be explained as a python 2 vs python 3 thing.
>
> regards, tom lane
>
>
>


Re: CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+

2019-10-27 Thread Arthur Zakirov

Hello Tomas,

On 2019/10/13 10:26, Tomas Vondra wrote:

over in pgsql-bugs [1] we got a report about CREATE TEXT SEARCH
DICTIONARY causing segfaults on 12.0. Simply running

    CREATE TEXT SEARCH DICTIONARY hunspell_num (Template=ispell,
    DictFile=hunspell_sample_num, AffFile=hunspell_sample_long);

does trigger a crash, 100% of the time. The crash was reported on 12.0,
but it's in fact present since 9.6.

On 9.5 the example does not work, because that version does not (a)
include the hunspell dictionaries used in the example, and (b) it does
not support long flags. So even after copying the dictionaries and
tweaking them a bit it still passes without a crash.


This crash is not because of long flags, but because of aliases (more 
thoughts below).



Looking at the commit history of spell.c, there seems to be a bunch of
commits in 2016 (e.g. f4ceed6ceba3) touching exactly this part of the
code (hunspell), and it also correlates quite nicely with the affected
branches (9.6+). So my best guess is it's a bug in those changes.


Yeah, there were a lot of changes.


So it simply grabs whatever it finds in the dict file, parses it and
then (later) we use it as an index into the AffixData array, even if
the value is way out of bounds.


Yes, we enter this code if an affix file defines aliases (the AF
parameter). The AffixData array is used to store those aliases.


You can find more about the hunspell format here:
https://linux.die.net/man/4/hunspell

In the example we have the following aliases:
AF 11
AF cZ   #1
AF cL   #2
...
AF sB   #11

And in the dictionary file we should use their indexes (from 1 to 11).
These aliases define sets of flags, and in the dict file we can use only
a single index:

book/3
book/11

but not:
book/3,4
book/2,11

I added a check for this last case in the attached patch. PostgreSQL
will raise an error if it sees a non-numeric, non-whitespace character
after the index.


Aliases can be used with all flag types: 'default' (i.e. FM_CHAR),
'long', and, if I'm not mistaken, 'num'.



So I think we need some sort of cross-check here. We certainly need to
make NISortDictionary() check the affix value is within AffixData
bounds, and error out when the index is non-sensical (maybe negative
and/or exceeding nAffixData).


I agree; I attached a patch which does this. I also added a couple of
asserts and tests, and fixed a condition in getAffixFlagSet():

-   if (curaffix > 0 && curaffix <= Conf->nAffixData)
+   if (curaffix > 0 && curaffix < Conf->nAffixData)

I think it could be a bug, because curaffix can't be equal to
Conf->nAffixData.



Maybe there's a simple way to check if the affix/dict files match.


I'm not sure how to properly fix this either. The only thing we could 
check is commas in affix flags in a dict file:


book/302,301,202,303

FM_CHAR and FM_LONG dictionaries can't have commas. They should have the 
following affix flags:


book/sGsJpUsS   # 4 affixes for FM_LONG
book/GJUS   # 4 affixes for FM_CHAR

But I guess they could have numbers in flags (as the help says "Set flag
type. Default type is the extended ASCII (8-bit) character.") and other
non-alphanumeric characters (as some language dictionaries have):


book/s1s2s3s4   # 4 affixes for FM_LONG
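
A rough sketch of that comma check (the helper name is invented; this is
not part of the attached patch):

/*
 * Sketch (not from the attached patch): reject dict-file flag strings
 * containing commas when the affix file declared FM_CHAR or FM_LONG
 * flags, since commas are only meaningful for numeric flag lists.
 */
static void
checkDictFlagString(IspellDict *Conf, const char *flags)
{
	if (Conf->flagMode != FM_NUM && strchr(flags, ',') != NULL)
		ereport(ERROR,
				(errcode(ERRCODE_CONFIG_FILE_ERROR),
				 errmsg("invalid affix flag \"%s\"", flags)));
}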

--
Artur
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index c715f06b8e..1b8766659c 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -458,6 +458,8 @@ IsAffixFlagInUse(IspellDict *Conf, int affix, const char *affixflag)
 	if (*affixflag == 0)
 		return true;
 
+	Assert(affix < Conf->nAffixData);
+
 	flagcur = Conf->AffixData[affix];
 
 	while (*flagcur)
@@ -1160,13 +1162,17 @@ getAffixFlagSet(IspellDict *Conf, char *s)
 				(errcode(ERRCODE_CONFIG_FILE_ERROR),
 				 errmsg("invalid affix alias \"%s\"", s)));
 
-	if (curaffix > 0 && curaffix <= Conf->nAffixData)
+	if (curaffix > 0 && curaffix < Conf->nAffixData)
 
 		/*
 		 * Do not subtract 1 from curaffix because empty string was added
 		 * in NIImportOOAffixes
 		 */
 		return Conf->AffixData[curaffix];
+	else if (curaffix > Conf->nAffixData)
+		ereport(ERROR,
+				(errcode(ERRCODE_CONFIG_FILE_ERROR),
+				 errmsg("invalid affix alias \"%s\"", s)));
 	else
 		return VoidString;
 }
@@ -1561,6 +1567,8 @@ MergeAffix(IspellDict *Conf, int a1, int a2)
 {
 	char	  **ptr;
 
+	Assert(a1 < Conf->nAffixData && a2 < Conf->nAffixData);
+
 	/* Do not merge affix flags if one of affix flags is empty */
 	if (*Conf->AffixData[a1] == '\0')
 		return a2;
@@ -1603,9 +1611,10 @@ MergeA

RE: Proposal: Add more compile-time asserts to expose inconsistencies.

2019-10-27 Thread Smith, Peter
From: Kyotaro Horiguchi  Sent: Monday, 28 October 2019 1:26 PM

> It is missing the __cplusplus case?

My use cases for the macro are only in C code, so that's all I was
interested in at this time.
If somebody else wants to extend the patch for C++ as well (and test it),
they can do so.

Kind Regards,
---
Peter Smith
Fujitsu Australia






Allow cluster_name in log_line_prefix

2019-10-27 Thread Craig Ringer
Hi folks

I was recently surprised to notice that log_line_prefix doesn't support a
cluster_name placeholder. I suggest adding one. If I don't hear objections
I'll send a patch.

Before anyone asks "but why?!":

* A constant (short) string in log_line_prefix is immensely useful when
working with logs from multi-node systems. Whether that's physical
streaming replication, logical replication, Citus, whatever, it doesn't
matter. It's worth paying the small storage price for sanity when looking
at logs.

* Yes you can embed it directly into log_line_prefix. But then it gets
copied by pg_basebackup or whatever you're using to clone standbys etc., so
you can easily end up with multiple instances reporting the same name.
This rather defeats the purpose.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise


Re: Restore replication settings when modifying a field type

2019-10-27 Thread Kyotaro Horiguchi
Hello.

# The patch no longer applies on the current master. It needs a rebase.

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang wrote in
> In fact, the replication property of the table has not been modified,
> and it is still 'i' (REPLICA_IDENTITY_INDEX). But the previously
> specified index property 'indisreplident' is set to false because of
> the rebuild.

I suppose that this behavior is intended. A change of column types on the
publisher side can break the agreement on replica identity with
subscribers, so the replica identity setting cannot be restored
unconditionally. For a (somewhat artificial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows changing the type of c2 to integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't take effect on the subscriber as one might expect:

S=# select * from t;
 c1 | c2 
----+----
  0 | 00
  1 | 01
(2 rows)


> So I developed a patch. If the user modifies the field type. The
> associated index is REPLICA IDENTITY. Rebuild and restore replication
> settings.

Explicitly setting the replica identity presumes that the user is sure
the setting works correctly. Implicitly rebuilding it after a type change
can silently break that assumption.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers...

Anyway, I think it is a problem that the replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirming compatibility
with subscribers." is needed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Remove one use of IDENT_USERNAME_MAX

2019-10-27 Thread Kyotaro Horiguchi
Hello.

At Sat, 26 Oct 2019 08:55:03 +0200, Peter Eisentraut wrote in
> IDENT_USERNAME_MAX is the maximum length of the information returned
> by an ident server, per RFC 1413.  Using it as the buffer size in peer
> authentication is inappropriate.  It was done here because of the
> historical relationship between peer and ident authentication.  But
> since it's also completely useless code-wise, remove it.

I think one of the reasons for the coding is the fact that *pw is
documented as being placed in static storage, which can be overwritten by
succeeding calls to the getpw*() functions. I think we can trust that
check_usermap() never calls them, but I suppose some comment is
needed.
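
For example, a comment along these lines near the getpwuid() call
(wording is only a suggestion):

/*
 * pw points into static storage that any later getpw*() call may
 * overwrite; we rely on check_usermap() not calling any of them
 * while we still need pw->pw_name.
 */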

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [DOC] Fix for the missing pg_stat_progress_cluster view phase column value

2019-10-27 Thread Kyotaro Horiguchi
At Sat, 26 Oct 2019 05:13:49 +, "Shinoda, Noriyoshi (PN Japan A&PS Delivery)" wrote in
> I found a missing column value in the pg_stat_progress_cluster view documentation.
> Reading the src/backend/catalog/system_views.sql file, it seems that
> 'writing new heap' can be output in the 'phase' column.
> The attached patch adds a description of the 'writing new heap' value
> output in the 'phase' column.

Good catch!

By the way, the table mentions the phases common to CLUSTER and VACUUM
FULL. I wonder why some of them are described as "CLUSTER is" and others
as "The command is"...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




RE: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

2019-10-27 Thread imai.yoshik...@fujitsu.com
Hi Dhruv,

On Sun, June 30, 2019 at 7:30 AM, Goel, Dhruv wrote:
> > On Jun 10, 2019, at 1:20 PM, Goel, Dhruv  wrote:
> >> On Jun 9, 2019, at 5:33 PM, Tom Lane  wrote:
> >> Andres Freund  writes:
> >>> On June 9, 2019 8:36:37 AM PDT, Tom Lane  wrote:
>  I think you are mistaken that doing transactional updates in
>  pg_index is OK.  If memory serves, we rely on xmin of the pg_index
>  row for purposes such as detecting whether a concurrently-created
>  index is safe to use yet.
> >
> > I took a deeper look regarding this use case but was unable to find more
> > evidence. As part of this patch, we essentially make concurrently-created
> > indexes safe to use only if the transaction started after the xmin of
> > Phase 3. Even today concurrent indexes can not be used for transactions
> > before this xmin because of the wait (which I am trying to get rid of in
> > this patch); is there any other denial of service you are talking about?
> > Both the other states indislive, indisready can be transactional updates
> > as far as I understand. Is there anything more I am missing here?
> 
> 
> Hi,
> 
> I did some more concurrency testing here through some python scripts which 
> compare the end state of the concurrently
> created indexes. I also back-ported this patch to PG 9.6 and ran some custom 
> concurrency tests (Inserts, Deletes, and
> Create Index Concurrently) which seem to succeed. The intermediate states 
> unfortunately are not easy to test in an
> automated manner, but to be fair concurrent indexes could never be used for 
> older transactions. Do you have more
> inputs/ideas on this patch?

According to commit 3c8404649 [1], transactional updates in pg_index are not
safe with the non-MVCC catalog scans used before PG 9.4.
But it seems to me that we can use transactional updates in pg_index after
commit 813fb03155 [2], which got rid of SnapshotNow.

If we apply this patch back to 9.3 or earlier, we might need to consider
another way, or take Andres's suggestion (which I don't fully understand
yet), but to which versions do we want/need to apply this patch?

Also, if we apply this patch in this way, there are several comments to be
fixed which describe the method of CREATE INDEX CONCURRENTLY.

ex.
[index.c]
/*
* validate_index - support code for concurrent index builds
...
* After completing validate_index(), we wait until all transactions that
* were alive at the time of the reference snapshot are gone; this is
* necessary to be sure there are none left with a transaction snapshot
* older than the reference (and hence possibly able to see tuples we did
* not index).  Then we mark the index "indisvalid" and commit.  Subsequent
* transactions will be able to use it for queries.
...
validate_index()
{
}


[1] 
https://github.com/postgres/postgres/commit/3c84046490bed3c22e0873dc6ba492e02b8b9051#diff-b279fc6d56760ed80ce4178de1401d2c
[2] 
https://github.com/postgres/postgres/commit/813fb0315587d32e3b77af1051a0ef517d187763#diff-b279fc6d56760ed80ce4178de1401d2c

--
Yoshikazu Imai




Re: [HACKERS] Block level parallel vacuum

2019-10-27 Thread Dilip Kumar
On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar  wrote:
>
> On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila  wrote:
> >
> > On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar  wrote:
> > >
> > > On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar  
> > > wrote:
> > > >
> > > > On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > I am thinking if we can write the patch for both the approaches (a.
> > > > > compute shared costs and try to delay based on that, b. try to divide
> > > > > the I/O cost among workers as described in the email above[1]) and do
> > > > > some tests to see the behavior of throttling, that might help us in
> > > > > deciding what is the best strategy to solve this problem, if any.
> > > > > What do you think?
> > > >
> > > > I agree with this idea.  I can come up with a POC patch for approach
> > > > (b).  Meanwhile, if someone is interested to quickly hack with the
> > > > approach (a) then we can do some testing and compare.  Sawada-san,
> > > > by any chance will you be interested to write POC with approach (a)?
> > > > Otherwise, I will try to write it after finishing the first one
> > > > (approach b).
> > > >
> > > I have come up with the POC for approach (a).

> > Can we compute the overall throttling (sleep time) in the operation
> > separately for heap and index, then divide the index's sleep time by
> > the number of workers and add it to the heap's sleep time?  Then, it
> > will be a bit easier to compare the data between the parallel and
> > non-parallel cases.
I have come up with a patch to compute the total delay during the
vacuum.  The idea of computing the total cost delay is:

Total cost delay = total delay of heap scan + total delay of
index/worker.  A patch is attached for the same.

I have prepared this patch on top of the latest parallel vacuum
patch [1].  I have also rebased the patch for approach (b) for
dividing the vacuum cost limit and done some testing to measure the
I/O throttling.  The attached patches 0001-POC-compute-total-cost-delay
and 0002-POC-divide-vacuum-cost-limit can be applied on top of
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch.  I haven't
rebased on top of v31-0006, because v31-0006 implements the I/O
throttling with one approach and 0002-POC-divide-vacuum-cost-limit
does the same with another approach.  But
0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as
well (with just a 1-2 line conflict).

Testing:  I have performed two tests, one with same-size indexes and the
second with different-size indexes, and measured the total I/O delay
with the attached patch.

Setup:
VacuumCostDelay=10ms
VacuumCostLimit=2000

Test1 (Same size index):
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,50) as i;
delete from test where a < 20;

                 Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
Total Delay      1784 (ms)        1398 (ms)          1938 (ms)


Test2 (Variable size dead tuple in index)
create table test(a int, b varchar, c varchar);
create index idx1 on test(a);
create index idx2 on test(b) where a > 10;
create index idx3 on test(c) where a > 15;

insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
generate_series(1,50) as i;
delete from test where a < 20;

                 Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
Total Delay      1438 (ms)        1029 (ms)          1529 (ms)


Conclusion:
1. The tests show that the total I/O delay is significantly less with
the parallel vacuum.
2. With the vacuum cost divide patch the problem is solved, but the
delay is a bit more compared to the non-parallel version.  The reason
could be the problem discussed at [2], but it needs further investigation.

Next, I will test with the v31-0006 (shared vacuum cost) patch.  I
will also try to test different types of indexes.

[1] 
https://www.postgresql.org/message-id/CAD21AoBMo9dr_QmhT%3DdKh7fmiq7tpx%2ByLHR8nw9i5NZ-SgtaVg%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAA4eK1%2BPeiFLdTuwrE6CvbNdx80E-O%3DZxCuWB2maREKFD-RaCA%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


0001-POC-compute-total-cost-delay.patch
Description: Binary data


0002-POC-divide-vacuum-cost-limit.patch
Description: Binary data


Re: [DOC] Fix for the missing pg_stat_progress_cluster view phase column value

2019-10-27 Thread Michael Paquier
On Sat, Oct 26, 2019 at 05:13:49AM +, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:
> The attached patch adds a description of the 'writing new heap'
> value output in the 'phase' column. 

Indeed, fixed.  Thanks for the patch.
--
Michael


signature.asc
Description: PGP signature


Re: define bool in pgtypeslib_extern.h

2019-10-27 Thread Amit Kapila
On Sat, Oct 26, 2019 at 10:49 PM Tom Lane  wrote:
>
> I'm inclined to think that we need to make ecpglib.h's bool-related
> definitions exactly match c.h, which will mean that it has to pull in
>  on most platforms, which will mean adding a control symbol
> for that to ecpg_config.h.  I do not think we should export
> HAVE_STDBOOL_H and SIZEOF_BOOL there though; probably better to have
> configure make the choice and export something named like PG_USE_STDBOOL.
>

This sounds reasonable to me, but we also might want to do something
for probes.d.  To be clear, I am not immediately planning to write a
patch for this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Duplicate entries in pg_depend after REINDEX CONCURRENTLY

2019-10-27 Thread Michael Paquier
On Fri, Oct 25, 2019 at 03:43:18PM +0900, Michael Paquier wrote:
> Attached is a patch to fix the issue.  As we know that the old index
> will have a definition and dependencies that match with the old one, I
> think that we should just remove any dependency records on the new
> index before moving the new set of dependencies from the old to the
> new index.  The patch includes regression tests that scan pg_depend to
> check that everything remains consistent after REINDEX CONCURRENTLY.
> 
> Any thoughts?

I have done more tests for this one through the day, and committed the
patch.  There is still one bug pending related to partitioned indexes
where REINDEX CONCURRENTLY is cancelled after phase 4 (swap) has
committed.  I am still looking more into that.
--
Michael


signature.asc
Description: PGP signature


Re: [DOC] Fix for the missing pg_stat_progress_cluster view phase column value

2019-10-27 Thread Michael Paquier
On Mon, Oct 28, 2019 at 02:26:39PM +0900, Kyotaro Horiguchi wrote:
> By the way, the table mentions the phases common to CLUSTER and
> VACUUM FULL. I wonder why some of them are described as "CLUSTER is"
> and others as "The command is"...

Because VACUUM FULL does not use the sort-and-scan mode, no?
--
Michael


signature.asc
Description: PGP signature


Re: WIP: expression evaluation improvements

2019-10-27 Thread Soumyadeep Chakraborty
Hi Andres,

Apologies, I realize my understanding of symbol resolution and the
referenced_functions mechanism wasn't correct. Thank you for your very
helpful explanations.

> There's also a related edge-case where are unable to figure out a symbol
> name in llvm_function_reference(), and then resort to creating a global
> variable pointing to the function.

Indeed.

> If indeed the only case this is being hit is language PL handlers, it
> might be better to instead work out the symbol name for that handler -
> we should be able to get that via pg_language.lanplcallfoid.

I took a stab at this (on top of your patch set):
v1-0001-Resolve-PL-handler-names-for-JITed-code-instead-o.patch

> Which cases are you talking about here? Because I don't think there's
> any others where would know a symbol name to add to referenced_functions
> in the first place?

I had misunderstood the intent of referenced_functions.

> I do want to benefit from getting accurate signatures for patch
> [PATCH v2 26/32] WIP: expression eval: relative pointer suppport
> I had a number of cases where I passed the wrong parameters, and llvm
> couldn't tell me...

I took a stab:
v1-0001-Rely-on-llvmjit_types-for-building-EvalFunc-calls.patch


On a separate note, I had earlier submitted a patch to optimize functions
at an earlier point, in accordance with the code comment:
/*
 * Do function level optimization. This could be moved to the point where
 * functions are emitted, to reduce memory usage a bit.
 */
 LLVMInitializeFunctionPassManager(llvm_fpm);
Refer:
https://www.postgresql.org/message-id/flat/cae-ml+_oe4-shvn0aa_qakc5qkzvqvainxwb1ztuut67spm...@mail.gmail.com
I have rebased that patch on top of your patch set. Here it is:
v2-0001-Optimize-generated-functions-earlier-to-lower-mem.patch

--
Soumyadeep


v2-0001-Optimize-generated-functions-earlier-to-lower-mem.patch
Description: Binary data


v1-0001-Rely-on-llvmjit_types-for-building-EvalFunc-calls.patch
Description: Binary data


v1-0001-Resolve-PL-handler-names-for-JITed-code-instead-o.patch
Description: Binary data


Re: [HACKERS] Block level parallel vacuum

2019-10-27 Thread Amit Kapila
On Sun, Oct 27, 2019 at 12:52 PM Dilip Kumar  wrote:
>
> On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada  wrote:
> >
> >
> I haven't yet read the new patch set, but I have noticed one
> thing: we are getting the size of the statistics using the AM
> routine, yet we are copying those statistics from local memory to
> the shared memory directly using memcpy.  Wouldn't it be a good
> idea to have an AM-specific routine to get them copied from the local
> memory to the shared memory?  I am not sure whether it is worth it, but
> my thought behind this point is that it would let the AM keep its local
> stats in any form (e.g. it could store a pointer in them) and
> serialize them while copying to the shared stats.  And later, when the
> shared stats are passed back to the AM, it can deserialize them into
> its local form and use them.
>

You have a point, but after changing the gist index, we don't have any
current usage for indexes that need something like that.  So, on one
side there is some value in having an API to copy the stats, but on
the other side, without a clear use for such an API, it might not be
good to expose a new one.  I think we can expose such an API in the
future if a need arises.  Do you or anyone else know of any external
index AM that has such a need?

Few minor comments while glancing through the latest patchset.

1. I think you can merge the 0001*, 0002*, and 0003* patches into one
patch, as all three expose a new variable/function from IndexAmRoutine.

2.
+prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
+{
+ char *p = (char *) GetSharedIndStats(lvshared);
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;

I think this function won't be called from an autovacuum worker process,
at least not as of now, so isn't it better to have an Assert for that?
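
That is, something like this (sketch):

/*
 * Sketch of the suggestion: this path is not reached from an
 * autovacuum worker today, so assert that rather than branching.
 */
Assert(!IsAutoVacuumWorkerProcess());
int			vac_work_mem = maintenance_work_mem;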

3.
+void
+heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)

This function performs a parallel operation on the index, so why start
the name with heap?  It would be better to name it
index_parallel_vacuum_main or simply parallel_vacuum_main.

4.
/* useindex = true means two-pass strategy; false means one-pass */
@@ -128,17 +280,12 @@ typedef struct LVRelStats
  BlockNumber pages_removed;
  double tuples_deleted;
  BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
- /* List of TIDs of tuples we intend to delete */
- /* NB: this list is ordered by TID address */
- int num_dead_tuples; /* current # of entries */
- int max_dead_tuples; /* # slots allocated in array */
- ItemPointer dead_tuples; /* array of ItemPointerData */
+ LVDeadTuples *dead_tuples;
  int num_index_scans;
  TransactionId latestRemovedXid;
  bool lock_waiter_detected;
 } LVRelStats;

-
 /* A few variables that don't seem worth passing around as parameters */
 static int elevel = -1;

It seems like a spurious line removal.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com