Hi,

On 2018-02-13 12:41:26 +0530, amul sul wrote:
> From 08c8c7ece7d9411e70a780dbeed89d81419db6b6 Mon Sep 17 00:00:00 2001
> From: Amul Sul <sula...@gmail.com>
> Date: Tue, 13 Feb 2018 12:37:52 +0530
> Subject: [PATCH 1/2] Invalidate ip_blkid v5
> 
> v5:
>  - Added code in heap_mask to skip wal_consistency_checking[7]
>  - Fixed previous todos.
> 
> v5-wip2:
>  - Minor changes in RelationFindReplTupleByIndex() and
>    RelationFindReplTupleSeq()
> 
>  - TODO;
>    Same as the privious
> 
> v5-wip: Update w.r.t Amit Kapila's comments[6].
>  - Reverted error message in nodeModifyTable.c from 'tuple to be locked'
>    to 'tuple to be updated'.
> 
>  - TODO:
>  1. Yet to made a decision of having LOG/ELOG/ASSERT in the
>     RelationFindReplTupleByIndex() and RelationFindReplTupleSeq().
> 
> v4: Rebased on "UPDATE of partition key v35" patch[5].
> 
> v3: Update w.r.t Amit Kapila's[3] & Alvaro Herrera[4] comments
>  - typo in all error message and comment : "to an another" -> "to another"
>  - error message change : "tuple to be updated" -> "tuple to be locked"
>  - In ExecOnConflictUpdate(), error report converted into assert &
>   comments added.
> 
> v2: Updated w.r.t Robert review comments[2]
>  - Updated couple of comment of heap_delete argument and ItemPointerData
>  - Added same concurrent update error logic in ExecOnConflictUpdate,
>    RelationFindReplTupleByIndex and RelationFindReplTupleSeq
> 
> v1: Initial version -- as per Amit Kapila's suggestions[1]
>  - When tuple is being moved to another partition then ip_blkid in the
>    tuple header mark to InvalidBlockNumber.

Very nice and instructive to keep this in a submission's commit message.



> diff --git a/src/backend/access/heap/heapam.c 
> b/src/backend/access/heap/heapam.c
> index 8a846e7dba..f4560ee9cb 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -3037,6 +3037,8 @@ xmax_infomask_changed(uint16 new_infomask, uint16 
> old_infomask)
>   *   crosscheck - if not InvalidSnapshot, also check tuple against this
>   *   wait - true if should wait for any conflicting update to commit/abort
>   *   hufd - output parameter, filled in failure cases (see below)
> + *   row_moved - true iff the tuple is being moved to another partition
> + *                           table due to an update of partition key. 
> Otherwise, false.
>   *

I don't think 'row_moved' is a good variable name for this. Moving a row
in our heap format can mean a lot of things. Maybe 'to_other_part' or
'changing_part'?


> +     /*
> +      * Sets a block identifier to the InvalidBlockNumber to indicate such an
> +      * update being moved tuple to another partition.
> +      */
> +     if (row_moved)
> +             BlockIdSet(&((tp.t_data->t_ctid).ip_blkid), InvalidBlockNumber);

The parens here are set in a bit werid way. I assume that's from copying
it out of ItemPointerSet()?  Why aren't you just using 
ItemPointerSetBlockNumber()?


I think it'd be better if we followed the example of specultive inserts
and created an equivalent of HeapTupleHeaderSetSpeculativeToken. That'd
be a heck of a lot easier to grep for...


> @@ -9314,6 +9323,18 @@ heap_mask(char *pagedata, BlockNumber blkno)
>                        */
>                       if (HeapTupleHeaderIsSpeculative(page_htup))
>                               ItemPointerSet(&page_htup->t_ctid, blkno, off);
> +
> +                     /*
> +                      * For a deleted tuple, a block identifier is set to the

I think this 'the' is superflous.


> +                      * InvalidBlockNumber to indicate that the tuple has 
> been moved to
> +                      * another partition due to an update of partition key.

But I think it should be 'the partition key'.


> +                      * Like speculative tuple, to ignore any inconsistency 
> set block
> +                      * identifier to current block number.

This doesn't quite parse.


> +                      */
> +                     if (!BlockNumberIsValid(
> +                                     
> BlockIdGetBlockNumber((&((page_htup->t_ctid).ip_blkid)))))
> +                             BlockIdSet(&((page_htup->t_ctid).ip_blkid), 
> blkno);
>               }

That formatting looks wrong. I think it should be replaced by a macro
like mentioned above.


>               /*
> diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
> index 160d941c00..a770531e14 100644
> --- a/src/backend/commands/trigger.c
> +++ b/src/backend/commands/trigger.c
> @@ -3071,6 +3071,11 @@ ltrmark:;
>                                       ereport(ERROR,
>                                                       
> (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
>                                                        errmsg("could not 
> serialize access due to concurrent update")));
> +                             if 
> (!BlockNumberIsValid(BlockIdGetBlockNumber(&((hufd.ctid).ip_blkid))))
> +                                     ereport(ERROR,
> +                                                     
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +                                                      errmsg("tuple to be 
> locked was already moved to another partition due to concurrent update")));

Yes, given that we repeat this in multiple places, I *definitely* want
to see this wrapped in a macro with a descriptive name.


> diff --git a/src/backend/executor/nodeLockRows.c 
> b/src/backend/executor/nodeLockRows.c
> index 7961b4be6a..b07b7092de 100644
> --- a/src/backend/executor/nodeLockRows.c
> +++ b/src/backend/executor/nodeLockRows.c
> @@ -218,6 +218,11 @@ lnext:
>                                       ereport(ERROR,
>                                                       
> (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
>                                                        errmsg("could not 
> serialize access due to concurrent update")));
> +                             if 
> (!BlockNumberIsValid(BlockIdGetBlockNumber(&((hufd.ctid).ip_blkid))))
> +                                     ereport(ERROR,
> +                                                     
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +                                                      errmsg("tuple to be 
> locked was already moved to another partition due to concurrent update")));
> +

Why are we using ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE rather than
ERRCODE_T_R_SERIALIZATION_FAILURE?  A lot of frameworks have builtin
logic to retry serialization failures, and this kind of thing is going
to resolved by retrying, no?


> diff --git a/src/test/isolation/expected/partition-key-update-1.out 
> b/src/test/isolation/expected/partition-key-update-1.out
> new file mode 100644

I'd like to see tests that show various interactions with ON CONFLICT.

Greetings,

Andres Freund

Reply via email to