Re: [HACKERS] Issues with logical replication

2017-11-29 Thread Petr Jelinek
On 29/11/17 20:11, Stas Kelvich wrote:
> 
>> On 29 Nov 2017, at 18:46, Petr Jelinek  wrote:
>>
>> What I don't understand is how it leads to crash (and I could not
>> reproduce it using the pgbench file attached in this thread either) and
>> moreover how it leads to 0 xid being logged. The only explanation I can
>> come up is that some kind of similar race has to be in
>> LogStandbySnapshot() but we explicitly check for 0 xid value there.
>>
> 
> Zero xid isn’t logged. Loop in XactLockTableWait() does following:
> 
> for (;;)
> {
> Assert(TransactionIdIsValid(xid));
> Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny()));
> 
>   <...>
>   
> xid = SubTransGetParent(xid);
> }
> 
> So if last statement is reached for top transaction then next iteration
> will crash in first assert. And it will be reached if whole this loop
> happens before transaction acquired heavyweight lock.
> 
> Probability of that crash can be significantly increased be adding sleep
> between xid generation and lock insertion in AssignTransactionId().
> 

Yes that helps thanks. Now that I reproduced it I understand, I was
confused by the backtrace that said xid was 0 on the input to
XactLockTableWait() but that's not the case, it's what xid is changed to
in the inner loop.

So what happens is that we manage to do LogStandbySnapshot(), decode the
logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in
between GetNewTransactionId() and XactLockTableInsert() calls in
AssignTransactionId() for that same transaction.

I guess the probability of this happening is increased by the fact that
GetRunningTransactionData() acquires XidGenLock so if there is
GetNewTransactionId() running in parallel it will wait for it to finish
and we'll log immediately after that.

Hmm that means that Robert's loop idea will not help and ProcArrayLock
will not save us either. Maybe we could either rewrite XactLockTableWait
or create another version of it with SubTransGetParent() call replaced
by SubTransGetTopmostTransaction() as that will return the same top
level xid in case the input xid wasn't a subxact. That would make it
safe to be called on transactions that didn't acquire lock on themselves
yet.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: Would a BGW need shmem_access or database_connection to enumerate databases?

2017-11-29 Thread Andres Freund
On 2017-11-29 18:23:40 -0500, Chapman Flack wrote:
> On 11/29/2017 05:54 PM, Michael Paquier wrote:
> 
> > Yes. That's actually what the autovacuum launcher does. It connects
> > using InitPostgres(NULL, InvalidOid, NULL, NULL), and then scans
> > pg_database to fetch a list (see get_database_list).
> 
> Thanks! It looks like if get_database_list were not static, it
> would be just the thing I'm looking for.
> 
> Would an SPI query of pg_database also work, in the
> bgw-connected-to-null-dbname context? I'm just wondering if
> that might be clearer/fewer LOC than just copying the lower-level
> approach from get_database_list.

SQL won't really work in a non-database connected context.



Re: [HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)

2017-11-29 Thread Michael Paquier
On Mon, Mar 20, 2017 at 6:33 PM, Alexander Korotkov
 wrote:
> Thank you for the report.
> Please, find rebased patch in the attachment.

This patch cannot be applied. Please provide a rebased version. I am
moving it to next CF with waiting on author as status.
-- 
Michael



Re: Use of uninitialized variables in ExecFindPartition() for parent partition without leaves (HEAD only)

2017-11-29 Thread Amit Langote
Hi Michael.

On 2017/11/30 9:07, Michael Paquier wrote:
> Hi all,
> 
> Since commit 4e5fe9ad (committer Robert Haas and author Amit Langote),
> coverity has been complaining that the new code of ExecFindPartition()
> may use a set of values and isnull values which never get initialized.
> This is a state which can be easily reached with the following SQLs of
> a parent partition with no children:
> create table parent_tab (a int) partition by list ((a+0));
> insert into parent_tab values (1);
> 
> The mistake is visibly that FormPartitionKeyDatum() should be called
> even for a root partition, initializing the fields of isnull and
> values on the way. That's actually what happens in execMain.c's
> ExecFindPartition() if you look at REL_10_STABLE. So the answer would
> be to move a bit down the quick exit code. Attached is a patch.
> 
> Amit, that's your code. What do you think?

Thanks for the report.  That's clearly a bug. :-(

Your patch seems enough to fix it.  I added a test and expanded the
comment a bit in the attached updated version.

Thanks,
Amit
diff --git a/src/backend/executor/execPartition.c 
b/src/backend/executor/execPartition.c
index d275cefe1d..8b334ea2ff 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -206,13 +206,6 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
slot = myslot;
}
 
-   /* Quick exit */
-   if (partdesc->nparts == 0)
-   {
-   result = -1;
-   break;
-   }
-
/*
 * Extract partition key from tuple. Expression evaluation 
machinery
 * that FormPartitionKeyDatum() invokes expects ecxt_scantuple 
to
@@ -223,6 +216,17 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
 */
ecxt->ecxt_scantuple = slot;
FormPartitionKeyDatum(parent, slot, estate, values, isnull);
+
+   /*
+* Nothing for get_partition_for_tuple() to do if there are no
+* partitions to begin with.
+*/
+   if (partdesc->nparts == 0)
+   {
+   result = -1;
+   break;
+   }
+
cur_index = get_partition_for_tuple(rel, values, isnull);
 
/*
@@ -472,7 +476,7 @@ FormPartitionKeyDatum(PartitionDispatch pd,
 }
 
 /*
- * BuildSlotPartitionKeyDescription
+ * ExecBuildSlotPartitionKeyDescription
  *
  * This works very much like BuildIndexValueDescription() and is currently
  * used for building error messages when ExecFindPartition() fails to find
diff --git a/src/test/regress/expected/insert.out 
b/src/test/regress/expected/insert.out
index b7b37dbc39..dcbaad8e2f 100644
--- a/src/test/regress/expected/insert.out
+++ b/src/test/regress/expected/insert.out
@@ -165,6 +165,10 @@ create table range_parted (
a text,
b int
 ) partition by range (a, (b+0));
+-- no partitions, so fail
+insert into range_parted values ('a', 11);
+ERROR:  no partition of relation "range_parted" found for row
+DETAIL:  Partition key of the failing row contains (a, (b + 0)) = (a, 11).
 create table part1 partition of range_parted for values from ('a', 1) to ('a', 
10);
 create table part2 partition of range_parted for values from ('a', 10) to 
('a', 20);
 create table part3 partition of range_parted for values from ('b', 1) to ('b', 
10);
diff --git a/src/test/regress/sql/insert.sql b/src/test/regress/sql/insert.sql
index 310b818076..0150b6bb0f 100644
--- a/src/test/regress/sql/insert.sql
+++ b/src/test/regress/sql/insert.sql
@@ -90,6 +90,10 @@ create table range_parted (
a text,
b int
 ) partition by range (a, (b+0));
+
+-- no partitions, so fail
+insert into range_parted values ('a', 11);
+
 create table part1 partition of range_parted for values from ('a', 1) to ('a', 
10);
 create table part2 partition of range_parted for values from ('a', 10) to 
('a', 20);
 create table part3 partition of range_parted for values from ('b', 1) to ('b', 
10);


Re: [HACKERS] Block level parallel vacuum WIP

2017-11-29 Thread Michael Paquier
On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada  wrote:
> Yeah, I was thinking the commit is relevant with this issue but as
> Amit mentioned this error is emitted by DROP SCHEMA CASCASE.
> I don't find out the cause of this issue yet. With the previous
> version patch, autovacuum workers were woking with one parallel worker
> but it never drops relations. So it's possible that the error might
> not have been relevant with the patch but anywayI'll continue to work
> on that.

This depends on the extension lock patch from
https://www.postgresql.org/message-id/flat/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=gn...@mail.gmail.com/
if I am following correctly. So I propose to mark this patch as
returned with feedback for now, and come back to it once the root
problems are addressed. Feel free to correct me if you think that's
not adapted.
-- 
Michael



Re: [HACKERS] path toward faster partition pruning

2017-11-29 Thread Robert Haas
On Wed, Nov 22, 2017 at 3:59 AM, Amit Langote
 wrote:
> It seems I wrote an Assert in the code to support hash partitioning that
> wasn't based on a valid assumption.  I was wrongly assuming that all hash
> partitions for a given modulus (largest modulus) must exist at any given
> time, but that isn't the case.

Committed 0003 with some adjustments:

* Renamed the new test to partition_prune.
* Moved the test to what I thought was a better place in the schedule
file, and made it consistent between serial_schedule and
parallel_schedule.
* commutates -> commuted
* removed wrong /* empty */ comment
* Updated expected output.  It surprised me a bit that the tests
weren't passing as you had them, but the differences I got - all
related to mc3p_default - seemed correct to me

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Unclear regression test for postgres_fdw

2017-11-29 Thread Antonin Houska
The following test

-- Input relation to aggregate push down hook is not safe to pushdown and thus
-- the aggregate cannot be pushed down to foreign server.
explain (verbose, costs off)
select count(t1.c3) from ft1 t1, ft1 t2 where t1.c1 = postgres_fdw_abs(t1.c2);

produces the following plan

QUERY PLAN  
  
--
 Aggregate
   Output: count(t1.c3)
   ->  Nested Loop
 Output: t1.c3
 ->  Foreign Scan on public.ft1 t2
   Remote SQL: SELECT NULL FROM "S 1"."T 1"
 ->  Materialize
   Output: t1.c3
   ->  Foreign Scan on public.ft1 t1
 Output: t1.c3
 Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" = 
public.postgres_fdw_abs(c2)))

which is not major problem as such, but gdb shows that the comment "aggregate
cannot be pushed" is not correct. In fact, postgresGetForeignUpperPaths()
*does* create the upper path.

The reason that UPPERREL_GROUP_AGG is eventually not used seems to be that
postgresGetForeignJoinPaths() -> add_foreign_grouping_paths() ->
estimate_path_cost_size() estimates the join cost in rather generic way. While
the remote server can push the join clause down to the inner relation of NL,
the postgres_fdw cost computation assumes that the join clause is applied to
each pair of output and input tuple.

I don't think that the postgres_fdw's estimate can be fixed easily, but if the
impact of "shipability" on (not) using the upper relation should be tested, we
need a different test.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: [HACKERS] static assertions in C++

2017-11-29 Thread Andres Freund
On 2017-11-29 09:41:15 -0500, Robert Haas wrote:
> On Wed, Nov 29, 2017 at 9:26 AM, Peter Eisentraut
>  wrote:
> > I'd still like a review of this patch.
> 
> I don't think there's much to review apart from this one issue.
> Neither Tom nor I seem to be convinced about:
> 
> +/* not worth providing a workaround */

FWIW, I think that's a perfectly reasonable choice. Adding complications
in making static assertions work for random archaic compilers when
compiling with c++ just doesn't seem worth more than a few mins of
thought.

Greetings,

Andres Freund



<    1   2