[
https://issues.apache.org/jira/browse/HAWQ-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061711#comment-15061711
]
ASF GitHub Bot commented on HAWQ-255:
-------------------------------------
Github user liming01 commented on a diff in the pull request:
https://github.com/apache/incubator-hawq/pull/191#discussion_r47879838
--- Diff: src/backend/access/transam/xact.c ---
@@ -2317,14 +2317,14 @@ CommitTransaction(void)
willHaveObjectsFromSmgr =
PersistentEndXactRec_WillHaveObjectsFromSmgr(EndXactRecKind_Commit);
- if (willHaveObjectsFromSmgr)
- {
- /*
- * We need to ensure the recording of the [distributed-]commit record and the
- * persistent post-commit work will be done either before or after a checkpoint.
- */
- CHECKPOINT_START_LOCK;
- }
+ /* In previous version, we ensured the recording of the [distributed-]commit record and the
+ * persistent post-commit work will be done either before or after a checkpoint.
+ *
+ * However the persistent table status will be synchronized with AOSeg_XXXX
+ * table and hdfs file in PersistentRecovery_Scan() at recovery PASS2.
+ * We don't need to worry about inconsistent states between them. So no
+ * CHECKPOINT_START_LOCK any more.
+ */
--- End diff --
HAWQ currently does not report an error when (4) occurs. To check, I set a
breakpoint in gdb at xact.c:2427 and ran the statements below. While the session
was stopped at the "COMMIT", I ran "hadoop dfsadmin -safemode enter" to put HDFS
into safe mode, then continued in gdb. The commit finished successfully with
only a warning. The same problem occurs for an ABORT of a transaction that
includes 'CREATE TABLE'.
postgres=# begin transaction ISOLATION LEVEL SERIALIZABLE;
BEGIN
postgres=# drop table tableinfs2;
DROP TABLE
postgres=# commit;
WARNING: could not remove relation directory 24974/16387/24975: Input/output error
CONTEXT: Dropping file-system object -- Relation Directory: '24974/16387/24975'
COMMIT
postgres=# select * from tableinfs2;
ERROR: relation "tableinfs2" does not exist
As for your question above: the recovery process will redo (4), and it will
likewise report only a warning if it fails to drop the file and/or fails to
modify the persistent table. Thanks.
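To make the warn-and-continue behavior concrete, here is a minimal standalone
sketch. It is not HAWQ code: the names DropResult and drop_directory_warn_only
are hypothetical, and POSIX rmdir() on a local directory stands in for the
HDFS removal. It only shows the pattern of downgrading a failed drop to a
warning so the caller (commit or a recovery pass) can proceed, and of treating
an already-missing directory as success so the step can be redone.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not HAWQ identifiers. */
typedef enum
{
    DROP_OK,            /* directory removed by this call */
    DROP_ALREADY_GONE,  /* nothing to remove, e.g. the step is being redone */
    DROP_WARNED         /* removal failed; warning printed, caller continues */
} DropResult;

/*
 * Try to remove an empty directory. A failure is reported as a warning on
 * stderr and returned to the caller instead of being raised as an error.
 */
DropResult
drop_directory_warn_only(const char *path)
{
    if (rmdir(path) == 0)
        return DROP_OK;

    /* A missing path means there is nothing left to drop. */
    if (errno == ENOENT)
        return DROP_ALREADY_GONE;

    fprintf(stderr,
            "WARNING: could not remove relation directory %s: %s\n",
            path, strerror(errno));
    return DROP_WARNED;
}
```

With this shape the failed drop in the session above would come back as
DROP_WARNED and the commit would still complete, which matches the observed
behavior; whether a warning is the right severity is the open question here.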
> Checkpoint is blocked by TRANSACTION ABORT for INSERTING INTO a big partition
> table
> -----------------------------------------------------------------------------------
>
> Key: HAWQ-255
> URL: https://issues.apache.org/jira/browse/HAWQ-255
> Project: Apache HAWQ
> Issue Type: Bug
> Reporter: Ming LI
> Assignee: Lei Chang
>
> If other INSERT commands are running in parallel at the same time, a lot of
> pg_xlog files are generated. If the system/master node crashes at that
> point, recovery will take a very long time.