[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673126#comment-16673126 ]
Allan Yang commented on HBASE-19953:
------------------------------------

The discussion needs to continue here. I also noticed that we can get an RPC timeout when altering or truncating a big table because of the modifications in this issue. This issue turns the whole alter/truncate into a synchronous operation, so the operation time will be unacceptable if the table is huge. Even in 1.x, modifying a table is an async operation: we do not wait for the regions to be reopened, but use admin.getAlterStatus() to check whether it has finished. So I think 'Avoid calling post* hook when procedure fails' does not hold here, since in 1.x we call postModifyTable even before the modify process finishes. And in 2.x, we have a hook named postCompletedModifyTableAction, which is guaranteed to be executed only after the whole process finishes.

Last but not least, as mentioned in HBASE-20658, the sync latch is released after the prepare state in DDLs like enable/disable, unlike alter/truncate (which only release it after the whole process finishes). So there is an inconsistency here: we are trying hard to make sure postModifyTable is called only after the whole process finishes, but other post* hooks like postEnableTable are not held to the same rule.

I think we can revert the change here; otherwise, users will suffer an RPC timeout when altering or truncating big tables.

[~elserj], [~stack], [~Apache9]

> Avoid calling post* hook when procedure fails
> ---------------------------------------------
>
>                 Key: HBASE-19953
>                 URL: https://issues.apache.org/jira/browse/HBASE-19953
>             Project: HBase
>          Issue Type: Bug
>          Components: master, proc-v2
>            Reporter: Ramesh Mani
>            Assignee: Josh Elser
>            Priority: Critical
>             Fix For: 2.0.0-beta-2, 2.0.0
>
>         Attachments: HBASE-19952.001.branch-2.patch,
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\*
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user
> provides a namespace that isn't empty, the procedure does a rollback (which
> is just a no-op), but this doesn't propagate an exception up to the
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh
> pointing it out a bit better to me that the code executes a bit differently
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)