[jira] [Commented] (HAWQ-255) Checkpoint is blocked by TRANSACTION ABORT for INSERTING INTO a big partition table

ASF GitHub Bot (JIRA) Thu, 17 Dec 2015 19:23:33 -0800

    [ 
https://issues.apache.org/jira/browse/HAWQ-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063347#comment-15063347
 ]


ASF GitHub Bot commented on HAWQ-255:
-------------------------------------

Github user liming01 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47990910
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
        willHaveObjectsFromSmgr =
                        PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -   if (willHaveObjectsFromSmgr)
    -   {
    -           /*
    -            * We need to ensure the recording of the [distributed-]commit record and the
    -            * persistent post-commit work will be done either before or after a checkpoint.
    -            */
    -           CHECKPOINT_START_LOCK;
    -   }
    +   /* In previous version, we ensured the recording of the [distributed-]commit record and the
    +    * persistent post-commit work will be done either before or after a checkpoint.
    +    *
    +    * However the persistent table status will be synchronized with AOSeg_XXXX
    +    * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
    +    * We don't need to worry about inconsistent states between them. So no
    +    * CHECKPOINT_START_LOCK any more.
    +    */
    --- End diff --
    
    The status of the current transaction will not be truncated at (4). The
    truncation-related code path is:
    CreateCheckPoint() -> CheckPointGuts() -> CheckPointMultiXact() ->
    TruncateMultiXact() -> SimpleLruTruncate()
    In this code path, cutoffPage is based on the minimum of
    OldestMemberMXactId and OldestVisibleMXactId across all backends' procs.
    
    OldestMemberMXactId and OldestVisibleMXactId for the current backend's
    proc are reset in AtEOXact_MultiXact(). In both AbortTransaction() and
    CommitTransaction(), this function is called after AtEOXact_smgr().
    
    So during recovery, we can tell that the current transaction has already
    committed, and that all persistent tables and hdfs files should be redone.
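
The cutoff argument above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not HAWQ or PostgreSQL source: BackendSlot,
truncation_cutoff() and reset_slot() are hypothetical names, the two fields
merely stand in for OldestMemberMXactId and OldestVisibleMXactId, and ID
wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MultiXactId;
#define InvalidMultiXactId ((MultiXactId) 0)

/* Hypothetical per-backend slot; the real values live in shared memory. */
typedef struct {
    MultiXactId oldestMember;   /* stands in for OldestMemberMXactId  */
    MultiXactId oldestVisible;  /* stands in for OldestVisibleMXactId */
} BackendSlot;

/*
 * Toy truncation cutoff: the smallest valid ID that any backend still
 * advertises, or nextMXact when no backend advertises one.
 * Assumption: plain integer comparison, no wraparound handling.
 */
static MultiXactId
truncation_cutoff(const BackendSlot *slots, size_t n, MultiXactId nextMXact)
{
    MultiXactId cutoff = nextMXact;

    assert(nextMXact != InvalidMultiXactId);
    for (size_t i = 0; i < n; i++) {
        MultiXactId m = slots[i].oldestMember;
        MultiXactId v = slots[i].oldestVisible;

        if (m != InvalidMultiXactId && m < cutoff)
            cutoff = m;
        if (v != InvalidMultiXactId && v < cutoff)
            cutoff = v;
    }
    return cutoff;
}

/* Toy counterpart of the end-of-transaction reset of one backend's values. */
static void
reset_slot(BackendSlot *slot)
{
    slot->oldestMember = InvalidMultiXactId;
    slot->oldestVisible = InvalidMultiXactId;
}
```

In this model, as long as a backend has not yet reached its reset step, the
cutoff cannot move past the IDs it advertises, so a checkpoint running in that
window would not truncate state that the backend still needs.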


> Checkpoint is blocked by TRANSACTION ABORT for INSERTING INTO a big partition 
> table
> -----------------------------------------------------------------------------------
>
>                 Key: HAWQ-255
>                 URL: https://issues.apache.org/jira/browse/HAWQ-255
>             Project: Apache HAWQ
>          Issue Type: Bug
>            Reporter: Ming LI
>            Assignee: Lei Chang
>
> If other INSERT commands are running in parallel at the same time, a large
> number of pg_xlog files will be generated. If the system or master node
> crashes at that point, recovery will take a very long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
