[ 
https://issues.apache.org/jira/browse/HAWQ-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061711#comment-15061711
 ] 

ASF GitHub Bot commented on HAWQ-255:
-------------------------------------

Github user liming01 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/191#discussion_r47879838
  
    --- Diff: src/backend/access/transam/xact.c ---
    @@ -2317,14 +2317,14 @@ CommitTransaction(void)
        willHaveObjectsFromSmgr =
                        PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
     
    -   if (willHaveObjectsFromSmgr)
    -   {
    -           /*
    -            * We need to ensure the recording of the [distributed-]commit record and the
    -            * persistent post-commit work will be done either before or after a checkpoint.
    -            */
    -           CHECKPOINT_START_LOCK;
    -   }
    +   /* In previous version, we ensured the recording of the [distributed-]commit record and the
    +    * persistent post-commit work will be done either before or after a checkpoint.
    +    *
    +    * However the persistent table status will be synchronized with AOSeg_XXXX
    +    * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
    +    * We don't need to worry about inconsistent states between them. So no
    +    * CHECKPOINT_START_LOCK any more.
    +    */
    --- End diff --
    
    HAWQ currently does not report an error when (4) fails. To reproduce, I set
    a breakpoint in gdb at xact.c:2427 and ran the statements below. While the
    session was stopped at the "COMMIT", I ran "hadoop dfsadmin -safemode enter"
    to put HDFS into safe mode, then continued in gdb. The commit finished
    successfully with only a warning. The ABORT of a transaction that includes
    'CREATE TABLE' has a similar problem.
    
    postgres=# begin transaction ISOLATION LEVEL SERIALIZABLE;
    BEGIN
    postgres=# drop table tableinfs2;
    DROP TABLE
    postgres=# commit;
    WARNING:  could not remove relation directory 24974/16387/24975: Input/output error
    CONTEXT:  Dropping file-system object -- Relation Directory: '24974/16387/24975'
    COMMIT
    postgres=# select * from tableinfs2;
    ERROR:  relation "tableinfs2" does not exist
    
    As for your question above: the recovery process will redo (4), and it
    likewise reports only a warning if it fails to drop the file and/or fails
    to modify the persistent table. Thanks.
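
The behaviour described above (a failed post-commit file drop produces a warning while the commit still succeeds) can be modelled with a minimal sketch. This is an illustration under stated assumptions, not HAWQ code: `commit_result`, `drop_dir_fn`, `drop_ok`, `drop_eio`, and `commit_with_pending_drop` are hypothetical names, and the HDFS safe-mode failure is simulated by returning `EIO`.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the storage layer: returns 0 on success,
 * or -1 with errno set (for example EIO while HDFS is in safe mode). */
typedef int (*drop_dir_fn)(const char *path);

typedef struct {
    int committed;  /* 1 once the transaction is reported as committed */
    int warnings;   /* number of post-commit cleanup failures */
} commit_result;

/* Simulated drop that succeeds. */
int drop_ok(const char *path)  { (void) path; return 0; }

/* Simulated drop that fails with an I/O error. */
int drop_eio(const char *path) { (void) path; errno = EIO; return -1; }

/* Assumed model only: the commit record is treated as already written,
 * so a failed directory removal is downgraded to a warning and the
 * commit still stands. */
commit_result commit_with_pending_drop(const char *path, drop_dir_fn drop)
{
    commit_result r = { 1, 0 };  /* commit record already recorded */

    assert(path != NULL && drop != NULL);
    if (drop(path) != 0) {
        fprintf(stderr,
                "WARNING:  could not remove relation directory %s: %s\n",
                path, strerror(errno));
        r.warnings++;
    }
    return r;
}
```

In this model, calling `commit_with_pending_drop("24974/16387/24975", drop_eio)` returns `committed == 1` with `warnings == 1`, which mirrors the psql session above. A recovery pass that redoes the drop would, under the same assumption, call the drop function again and warn again if it still fails.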


> Checkpoint is blocked by TRANSACTION ABORT for INSERTING INTO a big partition 
> table
> -----------------------------------------------------------------------------------
>
>                 Key: HAWQ-255
>                 URL: https://issues.apache.org/jira/browse/HAWQ-255
>             Project: Apache HAWQ
>          Issue Type: Bug
>            Reporter: Ming LI
>            Assignee: Lei Chang
>
> If other INSERT commands are running in parallel at the same time, this
> will generate a lot of pg_xlog files. If the system/master nodes crash at
> that point, recovery will take a very long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
