[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677085#comment-16677085 ] Josh Elser commented on HBASE-19953: {quote} [~elserj], actually postDeleteNamespace still have the race condition you mentioned above... Since the sync latch is released just after DELETE_NAMESPACE_PREPARE state in DeleteNamespaceProcedure. So when the time postDeleteNamespace is called, the namespace may also exists... {quote} Ugh, that's no good. Thanks for pointing it out. bq. But anyway, let me open another issue for it, the thing I want to solve most is the RPC timeout of modifying tables. Sounds good. Will look for it on the dev list (or feel free to ping me if you have yet to create it). Thanks sir! > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, > HBASE-19953.branch-2.0.addendum.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676002#comment-16676002 ] Allan Yang commented on HBASE-19953: [~elserj], actually postDeleteNamespace still have the race condition you mentioned above... Since the sync latch is released just after DELETE_NAMESPACE_PREPARE state in DeleteNamespaceProcedure. So when the time postDeleteNamespace is called, the namespace may also exists... But anyway, let me open another issue for it, the thing I want to solve most is the RPC timeout of modifying tables. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, > HBASE-19953.branch-2.0.addendum.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675476#comment-16675476 ] Josh Elser commented on HBASE-19953: [~allan163], I understand the RPC timeout issue that you bring up, but I think the right solution is to fix how we execute postHooks. If we applied your patch now, it would introduce a regression for the original issue that I created this change. Specifically, that issue is that when a 2.x client (which has procv2 support) submits an operation that doesn't use a blocking lock, there is a race condition between the execution of the postHook and the procedure for the operation itself. Specifically, the original problem was that an Apache Ranger-owned postDeleteNamespace hook was expecting that the HBase namespace would not exist when it was invoked. In reality, the namespace still existed because the DeleteNamespaceProcedure had not yet run when we used the non-blocking latch. Your patch would reintroduce this problem directly. So, there are two goals: one (that you raised) is that we want clients to not have to hold open an RPC for a procedure to complete. The second (that I raised here) is that we must not call a post\* hook before the actual operation is completed. I think the way to solve both of these is to push execution of the post\* hook into the procedure itself. That way, we do not execute the hook too "soon" and the client can just poll the procedure, waiting for completion (as we want). > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, > HBASE-19953.branch-2.0.addendum.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675435#comment-16675435 ] Josh Elser commented on HBASE-19953: {quote} After a careful think, I think we don't need to revert this(I've already changed the comment above), we only need to turn ModifyTable to a async op(Which is the only sync DDL for 2.x client now). Josh Elser you can see HMaster.truncateTable. We also use ProcedurePrepareLatch.createLatch(2, 0) to make sure the 2.x client won't sync wait here. Uploaded a addendum to clarify my point. {quote} Thanks, Allan. I appreciate the clarification -- I felt bad lobbing a "-1" onto this as I did. Glad we can talk it out. The addendum is very helpful and I appreciate that too. I need to refresh my head around the original problem and think about the solution you have suggested. I think it makes sense to me, but I need to make sure things still work as they did in 2.0.0 :). I'll do this now before I get pulled in another direction. On that note, since this did go out in 2.0.0, can you make this change under a different Jira issue instead of as an addendum to this one? That would remove potential confusion about 2.0.3 having it but not 2.0.1 and 2.0.2. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, > HBASE-19953.branch-2.0.addendum.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674284#comment-16674284 ] Allan Yang commented on HBASE-19953: {quote} In reality, a synchronous API for DDL operations is super-useful – applications can't reasonably proceed to run if an action hasn't completed. So, I'd pose the question: how would we know when to say that a DDL operation is "completed enough"? {quote} For clients >2.0, it is very simple, we only need to check whether the procedure is finished. For clients in 1.x, admin.getAlterStauts() is used to check the status(for modify table), but since in 2.x, getAlterStauts is deprecated , so we need to make 1.x client to wait in sync. {quote} I am -1 on just reverting this. {quote} After a careful think, I think we don't need to revert this, we only need to turn ModifyTable to a async op(Which is the only sync DDL for 2.x client now). [~elserj] you can see HMaster.truncateTable. We also use ProcedurePrepareLatch.createLatch(2, 0) to make sure the 2.x client won't sync wait here. Uploaded a addendum to clarify my point. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673150#comment-16673150 ] Josh Elser commented on HBASE-19953: {quote}So I think 'Avoid calling post* hook when procedure fails' is not stand here, since in 1.x we will call postModifyTable even before the modify process finish. {quote} This is surprising to me. I thought the code was otherwise. I'll have to take another look. {quote}I think we can revert the change here, otherwise, user will suffer a RPC timeout when alter/truncate big tables. {quote} Caveat: if they're using the synchronous API, right? {quote}Last but not least, as mentioned in HBASE-20658, the sync latch will be release after prepare state in DDLs like enable/disable other than alter/truncate(which only release it after the whole process finish). So there is a inconsistency here, we are trying hard to make sure postModifyTable to be called only after the whole process finish, but for other post* hooks like postEnableTable, they are not. {quote} How would we reconcile this? IIRC, our public API states that the call will block until the operation is complete. Seems to me that you're suggesting that we do not actually adhere to that in all places, nor that we should. In reality, a synchronous API for DDL operations is super-useful – applications can't reasonably proceed to run if an action hasn't completed. So, I'd pose the question: how would we know when to say that a DDL operation is "completed enough"? I am -1 on just reverting this. That would break a use-case while fixing another – not the right way to solve this. Let's talk through the semantics we can reasonably implement and what we need to provide for users. From there, we can figure out what we can safely implement. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673126#comment-16673126 ] Allan Yang commented on HBASE-19953: The discussion need to be continued here , I also noticed that We can get a RPC timeout when alter/truncate a big table because of the modifications in this issue. This issue turns the whole alter/truncate into a sync op, the op time will be unacceptable if the table is huge. Even in 1.x, modifying table is a async op, we will not wait the regions to be reopened, but use admin.getAlterStauts() to check if finish. So I think 'Avoid calling post* hook when procedure fails' is not stand here, since in 1.x we will call postModifyTable even before the modify process finish. And, in 2.x, we have a hook named postCompletedModifyTableAction which can ensure only be executed after the whole process finish. Last but not least, as mentioned in HBASE-20658, the sync latch will be release after prepare state in DDLs like enable/disable other than alter/truncate(which only release it after the whole process finish). So there is a inconsistency here, we are trying hard to make sure postModifyTable to be called only after the whole process finish, but for other post* hooks like postEnableTable, they are not. I think we can revert the change here, otherwise, user will suffer a RPC timeout when alter/truncate big tables. [~elserj], [~stack],[~Apache9] > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2, 2.0.0 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372295#comment-16372295 ] Hudson commented on HBASE-19953: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4626 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4626/]) HBASE-19953 Ensure post DDL hooks are only called after successful (elserj: rev d9b8dcc1d300ae114febc22dbc71866088387111) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineNamespaceProcedure.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableNamespaceManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/ClusterSchema.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/ClusterSchemaServiceImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ProcedurePrepareLatch.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterObserverPostCalls.java > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370671#comment-16370671 ] Josh Elser commented on HBASE-19953: {quote}What you have done here is clean. Maybe some more comment to the effect that the latch is just-for the Procedure preparation – that we are not blocking for the whole procedure run – though the naming of the latch is enough I suppose. That noop latch is 'interesting'... {quote} Haha, yes, I agree with you. I wanted to leave a comment here also covering the fact that we _could_ do this differently if holding open this RPC is a problem. Let me try to document here as well for the inevitable case when I forget... Taking {{createTable}} as an example, the client submits the async calls which submits the procedure to create a table. With this patch, the Master will block the client from receiving the procedure ID it needs to wait on for the createTable to finish to also ensure that the prepare finishes (sanity-checks – the table we want to create doesn't exist already). This wait _should_ be short so this is OK. The worry would be that (for an unseen reason), this takes a long time or gets stuck, which leaves an RPC+thread open in the Master, leading to a denial of service. Instead, we could make two "async checks" on the client: the first would be to block for the result of the "prepare" of our createTable, and then a second to ensure that the procedure actually completes. This is obviously more work, but it's something we can keep in our back-pocket. Will add some docs to the commit and push this one. Thanks for the review, Stack! > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370645#comment-16370645 ] stack commented on HBASE-19953: --- Sorry for the neglect [~elserj]. Its just that whenever I try to look at this issue, I get an instant headache. bq. The manifestation of this was Apache Atlas was getting audit events for things that didn't actually happen which is no good. Yes. What you have done here is clean. Maybe some more comment to the effect that the latch is just-for the Procedure preparation -- that we are not blocking for the whole procedure run -- though the naming of the latch is enough I suppose. That noop latch is 'interesting'... +1 on commit. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370588#comment-16370588 ] Hadoop QA commented on HBASE-19953: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 0s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 54s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 8s{color} | {color:red} hbase-server: The patch generated 27 new + 251 unchanged - 7 fixed = 278 total (was 258) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 12s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m 58s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}132m 11s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}165m 17s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db | | JIRA Issue | HBASE-19953 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12911258/HBASE-19953.003.branch-2.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 7c4ea773de21 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | branch-2 / a5443c18d2 | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11584/artifact/patchprocess/diff-checkstyle-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11584/testReport/ | | Max. process+thread count | 4748 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11584/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367893#comment-16367893 ] Josh Elser commented on HBASE-19953: {quote}Only have been paying half-attention. You are putting in place a blocking latch for all DDL? Downside of calling post on fail ? The CP doesn't know the op failed? {quote} Right, the 1.x post semantics were that it would only get called when the operation was successful. In 2.0, this means that, because the pre-conditional checks that are done (e.g. can't delete a namespace that doesn't exist) don't fire and fail until after we already called the hook which results in the post hook getting called all the time. The manifestation of this was Apache Atlas was getting audit events for things that didn't actually happen which is no good. The easiest fix without rewriting a bunch is to use the blocking latch, but a more "elegant" (maybe just complicated) solution would be to push the post-hook call down into the Procedure implementation itself so that we can release the client RPC thread. I don't think either is super critical since all of these preparatory checks are fast. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367881#comment-16367881 ] stack commented on HBASE-19953: --- Pardon me. Only have been paying half-attention. You are putting in place a blocking latch for all DDL? Downside of calling post on fail ? The CP doesn't know the op failed? Thanks [~elserj] > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367698#comment-16367698 ] Hadoop QA commented on HBASE-19953: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 36s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 46s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 19s{color} | {color:red} hbase-server: The patch generated 21 new + 227 unchanged - 2 fixed = 248 total (was 229) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedjars {color} | {color:red} 3m 44s{color} | {color:red} patch has 10 errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 16m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 20s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}147m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.client.TestAsyncTableBatch | | | hadoop.hbase.TestJMXListener | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db | | JIRA Issue | HBASE-19953 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910945/HBASE-19953.002.branch-2.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 5a6228ba31c0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh | | git revision | branch-2 / 33212cf6c8 | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11552/artifact/patchprocess/diff-checkstyle-hbase-server.txt | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/11552/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11552/testReport/ | | asflicense |
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367509#comment-16367509 ] Josh Elser commented on HBASE-19953: Silly little fix to get the UT running. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19952.001.branch-2.patch, > HBASE-19953.002.branch-2.patch > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366499#comment-16366499 ] Hadoop QA commented on HBASE-19953: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 56s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 31s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 58s{color} | {color:red} hbase-server: The patch generated 21 new + 227 unchanged - 2 fixed = 248 total (was 229) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedjars {color} | {color:red} 2m 32s{color} | {color:red} patch has 10 errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 0s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 51s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}130m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.procedure.TestMasterObserverPostCalls | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db | | JIRA Issue | HBASE-19953 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910825/HBASE-19952.001.branch-2.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 7243ffd9ddff 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | branch-2 / 30dd595d3b | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11547/artifact/patchprocess/diff-checkstyle-hbase-server.txt | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/11547/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11547/testReport/ | | asflicense |
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366207#comment-16366207 ] Josh Elser commented on HBASE-19953: So, copy-paste'ing the latch code already there elsewhere didn't work because the default latch implementation is a no-op which doesn't wait (because we have a client which is async-procedure compatible) I can force a latch which will wait as a temporary workaround, which will have the (undesired) side-effect of holding the RPC open. Not the end of the world for now. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364707#comment-16364707 ] Josh Elser commented on HBASE-19953: Mighty [~stack]: one more (basic?) question for you. I'm seeing some discrepancies between CreateTableProcedure and DeleteNamespaceProcedure in how 'pre-checks' are handled. Specifically the difference between {{CreateTableProcedure#prepareCreate(..)}} and {{DeleteNamespaceProcedure#prepareDelete(..)}}. The former calls {{setFailure(..)}} with exceptions that would prevent a table-creation from happening (e.g. table already exists) while the latter just throws the Exception. Are these logically equivalent at a higher level? Or is this a sign that DeleteNS needs to be updated? > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362622#comment-16362622 ] Josh Elser commented on HBASE-19953: Let me take a look at that. Thanks for the pointer, sir. > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362607#comment-16362607 ] stack commented on HBASE-19953: --- Looking at related Master-side Operations, I see them take a latch in the NonceProcedureRunnable implementation. When latch is thrown, they call the post op. See enableTable, createTable, etc. This delete namespace should do similar? Later we should come back and get rid of all these latches (and then we'll have to figure how Observer can monitor Procedure). > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356207#comment-16356207 ] Josh Elser commented on HBASE-19953: Ok, this naive solution definitely doesn't work. Async is built into the procedure itself – this appears to be a nasty oversight on our part. The existing semantics from branch-1 are that if the operation fails, the post hook is not invoked. This was easy to implement as they were synchronous calls. I feel like to do this properly, we would need to push the post hook invocation down into the procedure itself so that it's automatically invoked. The downside of this approach is that is convolutes the Procedure implementation with the Observer hook logic. Maybe we can push this into the {{NonceProcedureRunnable}} logic instead? (in {{MasterProcedureUtil#submitProcedure()}}) > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails
[ https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355964#comment-16355964 ] Josh Elser commented on HBASE-19953: My hunch is that we need to do something like the following, but that's just a guess. Need to play with it. {code:java} diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java index 999272a5bd..b770e664e6 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java @@ -2971,7 +2971,10 @@ public class HMaster extends HRegionServer implements MasterServices { // Execute the operation synchronously - wait for the operation to complete before // continuing. setProcId(getClusterSchema().deleteNamespace(name, getNonceKey())); - getMaster().getMasterCoprocessorHost().postDeleteNamespace(name); + if (!getProcedureExecutor().getResult(getProcId()).isFailed()) { + getMaster().getMasterCoprocessorHost().postDeleteNamespace(name); + } }{code} > Avoid calling post* hook when procedure fails > - > > Key: HBASE-19953 > URL: https://issues.apache.org/jira/browse/HBASE-19953 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 >Reporter: Ramesh Mani >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-2 > > > Ramesh pointed out a case where I think we're mishandling some post\* > MasterObserver hooks. Specifically, I'm looking at the deleteNamespace. > We synchronously execute the DeleteNamespace procedure. When the user > provides a namespace that isn't empty, the procedure does a rollback (which > is just a no-op), but this doesn't propagate an exception up to the > NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh > pointing it out a bit better to me that the code executes a bit differently > than we actually expect. > I think we need to double-check our post hooks and make sure we aren't > invoking them when the procedure actually failed. cc/ [~Apache9], [~stack]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)