[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-06 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677085#comment-16677085
 ] 

Josh Elser commented on HBASE-19953:


{quote}
[~elserj], actually postDeleteNamespace still have the race condition you 
mentioned above... Since the sync latch is released just after 
DELETE_NAMESPACE_PREPARE state in DeleteNamespaceProcedure. So when the time 
postDeleteNamespace is called, the namespace may also exists...
{quote}

Ugh, that's no good. Thanks for pointing it out.

bq.  But anyway, let me open another issue for it, the thing I want to solve 
most is the RPC timeout of modifying tables. 

Sounds good. Will look for it on the dev list (or feel free to ping me if you 
have yet to create it).

Thanks sir!

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, 
> HBASE-19953.branch-2.0.addendum.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-05 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676002#comment-16676002
 ] 

Allan Yang commented on HBASE-19953:


[~elserj], actually postDeleteNamespace still have the race condition you 
mentioned above... Since the sync latch is released just after 
DELETE_NAMESPACE_PREPARE state in DeleteNamespaceProcedure. So when the time 
postDeleteNamespace is called, the namespace may also exists...
But anyway, let me open another issue for it, the thing I want to solve most is 
the RPC timeout of modifying tables. 

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, 
> HBASE-19953.branch-2.0.addendum.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-05 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675476#comment-16675476
 ] 

Josh Elser commented on HBASE-19953:


[~allan163], I understand the RPC timeout issue that you bring up, but I think 
the right solution is to fix how we execute postHooks.

If we applied your patch now, it would introduce a regression for the original 
issue that I created this change. Specifically, that issue is that when a 2.x 
client (which has procv2 support) submits an operation that doesn't use a 
blocking lock, there is a race condition between the execution of the postHook 
and the procedure for the operation itself.

Specifically, the original problem was that an Apache Ranger-owned 
postDeleteNamespace hook was expecting that the HBase namespace would not exist 
when it was invoked. In reality, the namespace still existed because the 
DeleteNamespaceProcedure had not yet run when we used the non-blocking latch. 
Your patch would reintroduce this problem directly.

So, there are two goals: one (that you raised) is that we want clients to not 
have to hold open an RPC for a procedure to complete. The second (that I raised 
here) is that we must not call a post\* hook before the actual operation is 
completed.

I think the way to solve both of these is to push execution of the post\* hook 
into the procedure itself. That way, we do not execute the hook too "soon" and 
the client can just poll the procedure, waiting for completion (as we want).

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, 
> HBASE-19953.branch-2.0.addendum.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-05 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675435#comment-16675435
 ] 

Josh Elser commented on HBASE-19953:


{quote}
After a careful think, I think we don't need to revert this(I've already 
changed the comment above), we only need to turn ModifyTable to a async 
op(Which is the only sync DDL for 2.x client now). Josh Elser you can see 
HMaster.truncateTable. We also use ProcedurePrepareLatch.createLatch(2, 0) to 
make sure the 2.x client won't sync wait here.
Uploaded a addendum to clarify my point.
{quote}

Thanks, Allan. I appreciate the clarification -- I felt bad lobbing a "-1" onto 
this as I did. Glad we can talk it out. The addendum is very helpful and I 
appreciate that too.

I need to refresh my head around the original problem and think about the 
solution you have suggested. I think it makes sense to me, but I need to make 
sure things still work as they did in 2.0.0 :). I'll do this now before I get 
pulled in another direction.

On that note, since this did go out in 2.0.0, can you make this change under a 
different Jira issue instead of as an addendum to this one? That would remove 
potential confusion about 2.0.3 having it but not 2.0.1 and 2.0.2.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch, 
> HBASE-19953.branch-2.0.addendum.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-03 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674284#comment-16674284
 ] 

Allan Yang commented on HBASE-19953:


{quote}
In reality, a synchronous API for DDL operations is super-useful – applications 
can't reasonably proceed to run if an action hasn't completed. So, I'd pose the 
question: how would we know when to say that a DDL operation is "completed 
enough"?
{quote}
For clients >2.0, it is very simple, we only need to check whether the 
procedure is finished. For clients in 1.x, admin.getAlterStauts() is used to 
check the status(for modify table), but since in 2.x, getAlterStauts is 
deprecated , so we need to make 1.x client to wait in sync.
{quote}
I am -1 on just reverting this. 
{quote}
After a careful think, I think we don't need to revert this, we only need to 
turn ModifyTable to a async op(Which is the only sync DDL for 2.x client now). 
[~elserj] you can see HMaster.truncateTable. We also use 
ProcedurePrepareLatch.createLatch(2, 0) to make sure the 2.x client won't sync 
wait here.
Uploaded a addendum to clarify  my point.



> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-02 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673150#comment-16673150
 ] 

Josh Elser commented on HBASE-19953:


{quote}So I think 'Avoid calling post* hook when procedure fails' is not stand 
here, since in 1.x we will call postModifyTable even before the modify process 
finish.
{quote}
This is surprising to me. I thought the code was otherwise. I'll have to take 
another look.
{quote}I think we can revert the change here, otherwise, user will suffer a RPC 
timeout when alter/truncate big tables.
{quote}
Caveat: if they're using the synchronous API, right?
{quote}Last but not least, as mentioned in HBASE-20658, the sync latch will be 
release after prepare state in DDLs like enable/disable other than 
alter/truncate(which only release it after the whole process finish). So there 
is a inconsistency here, we are trying hard to make sure postModifyTable to be 
called only after the whole process finish, but for other post* hooks like 
postEnableTable, they are not.
{quote}
How would we reconcile this? IIRC, our public API states that the call will 
block until the operation is complete. Seems to me that you're suggesting that 
we do not actually adhere to that in all places, nor that we should.

In reality, a synchronous API for DDL operations is super-useful – applications 
can't reasonably proceed to run if an action hasn't completed. So, I'd pose the 
question: how would we know when to say that a DDL operation is "completed 
enough"?

I am -1 on just reverting this. That would break a use-case while fixing 
another – not the right way to solve this. Let's talk through the semantics we 
can reasonably implement and what we need to provide for users. From there, we 
can figure out what we can safely implement.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-11-02 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673126#comment-16673126
 ] 

Allan Yang commented on HBASE-19953:


The discussion need to be continued here , I also noticed that We can get a RPC 
timeout when alter/truncate a big table because of the modifications in this 
issue.
This issue turns the whole alter/truncate into a sync op, the op time will be 
unacceptable if the table is huge.
Even in 1.x, modifying table is a async op, we will not wait the regions to be 
reopened, but use admin.getAlterStauts() to check if finish.
So I think 'Avoid calling post* hook when procedure fails' is not stand here, 
since in 1.x we will call postModifyTable even before the modify process finish.
And, in 2.x, we have a hook named postCompletedModifyTableAction which can 
ensure only be executed after the whole process finish.
Last but not least, as mentioned in HBASE-20658, the sync latch will be release 
after prepare state in DDLs like enable/disable other than alter/truncate(which 
only release it after the whole process finish). So there is a inconsistency 
here, we are trying hard to make sure postModifyTable  to be called only after 
the whole process finish, but for other post* hooks like postEnableTable, they 
are not.
I think we can revert the change here, otherwise, user will suffer a RPC 
timeout when alter/truncate big tables.
[~elserj], [~stack],[~Apache9]

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372295#comment-16372295
 ] 

Hudson commented on HBASE-19953:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4626 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4626/])
HBASE-19953 Ensure post DDL hooks are only called after successful (elserj: rev 
d9b8dcc1d300ae114febc22dbc71866088387111)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineNamespaceProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableNamespaceManager.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ClusterSchema.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ClusterSchemaServiceImpl.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ProcedurePrepareLatch.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestMasterObserverPostCalls.java


> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-20 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370671#comment-16370671
 ] 

Josh Elser commented on HBASE-19953:


{quote}What you have done here is clean. Maybe some more comment to the effect 
that the latch is just-for the Procedure preparation – that we are not blocking 
for the whole procedure run – though the naming of the latch is enough I 
suppose. That noop latch is 'interesting'...
{quote}
Haha, yes, I agree with you. I wanted to leave a comment here also covering the 
fact that we _could_ do this differently if holding open this RPC is a problem. 
Let me try to document here as well for the inevitable case when I forget...

Taking {{createTable}} as an example, the client submits the async calls which 
submits the procedure to create a table. With this patch, the Master will block 
the client from receiving the procedure ID it needs to wait on for the 
createTable to finish to also ensure that the prepare finishes (sanity-checks – 
the table we want to create doesn't exist already). This wait _should_ be short 
so this is OK. The worry would be that (for an unseen reason), this takes a 
long time or gets stuck, which leaves an RPC+thread open in the Master, leading 
to a denial of service. Instead, we could make two "async checks" on the 
client: the first would be to block for the result of the "prepare" of our 
createTable, and then a second to ensure that the procedure actually completes. 
This is obviously more work, but it's something we can keep in our back-pocket.

Will add some docs to the commit and push this one. Thanks for the review, 
Stack!

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370645#comment-16370645
 ] 

stack commented on HBASE-19953:
---

Sorry for the neglect [~elserj]. Its just that whenever I try to look at this 
issue, I get an instant headache.

bq. The manifestation of this was Apache Atlas was getting audit events for 
things that didn't actually happen which is no good.

Yes.

What you have done here is clean. Maybe some more comment to the effect that 
the latch is just-for the Procedure preparation -- that we are not blocking for 
the whole procedure run -- though the naming of the latch is enough I suppose. 
That noop latch is 'interesting'...

+1 on commit.




> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch, HBASE-19953.003.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370588#comment-16370588
 ] 

Hadoop QA commented on HBASE-19953:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 0s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
54s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
8s{color} | {color:red} hbase-server: The patch generated 27 new + 251 
unchanged - 7 fixed = 278 total (was 258) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
12s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
14m 58s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}132m 
11s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}165m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19953 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12911258/HBASE-19953.003.branch-2.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7c4ea773de21 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / a5443c18d2 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11584/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11584/testReport/ |
| Max. process+thread count | 4748 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11584/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |



[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-16 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367893#comment-16367893
 ] 

Josh Elser commented on HBASE-19953:


{quote}Only have been paying half-attention. You are putting in place a 
blocking latch for all DDL? Downside of calling post on fail ? The CP doesn't 
know the op failed?
{quote}
Right, the 1.x post semantics were that it would only get called when the 
operation was successful. In 2.0, this means that, because the pre-conditional 
checks that are done (e.g. can't delete a namespace that doesn't exist) don't 
fire and fail until after we already called the hook which results in the post 
hook getting called all the time. The manifestation of this was Apache Atlas 
was getting audit events for things that didn't actually happen which is no 
good.

The easiest fix without rewriting a bunch is to use the blocking latch, but a 
more "elegant" (maybe just complicated) solution would be to push the post-hook 
call down into the Procedure implementation itself so that we can release the 
client RPC thread.

I don't think either is super critical since all of these preparatory checks 
are fast.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367881#comment-16367881
 ] 

stack commented on HBASE-19953:
---

Pardon me. Only have been paying half-attention. You are putting in place a 
blocking latch for all DDL? Downside of calling post on fail ? The CP doesn't 
know the op failed? Thanks [~elserj]

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367698#comment-16367698
 ] 

Hadoop QA commented on HBASE-19953:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
36s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
46s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 21 new + 227 
unchanged - 2 fixed = 248 total (was 229) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m 
44s{color} | {color:red} patch has 10 errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
16m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 20s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncTableBatch |
|   | hadoop.hbase.TestJMXListener |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19953 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12910945/HBASE-19953.002.branch-2.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 5a6228ba31c0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / 33212cf6c8 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11552/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11552/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11552/testReport/ |
| asflicense | 

[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-16 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367509#comment-16367509
 ] 

Josh Elser commented on HBASE-19953:


Silly little fix to get the UT running.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19952.001.branch-2.patch, 
> HBASE-19953.002.branch-2.patch
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366499#comment-16366499
 ] 

Hadoop QA commented on HBASE-19953:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
56s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
31s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
58s{color} | {color:red} hbase-server: The patch generated 21 new + 227 
unchanged - 2 fixed = 248 total (was 229) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  2m 
32s{color} | {color:red} patch has 10 errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 51s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterObserverPostCalls |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19953 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12910825/HBASE-19952.001.branch-2.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7243ffd9ddff 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / 30dd595d3b |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11547/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11547/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11547/testReport/ |
| asflicense | 

[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-15 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366207#comment-16366207
 ] 

Josh Elser commented on HBASE-19953:


So, copy-paste'ing the latch code already there elsewhere didn't work because 
the default latch implementation is a no-op which doesn't wait (because we have 
a client which is async-procedure compatible)

I can force a latch which will wait as a temporary workaround, which will have 
the (undesired) side-effect of holding the RPC open. Not the end of the world 
for now.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-14 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364707#comment-16364707
 ] 

Josh Elser commented on HBASE-19953:


Mighty [~stack]: one more (basic?) question for you. I'm seeing some 
discrepancies between CreateTableProcedure and DeleteNamespaceProcedure in how 
'pre-checks' are handled. Specifically the difference between 
{{CreateTableProcedure#prepareCreate(..)}} and 
{{DeleteNamespaceProcedure#prepareDelete(..)}}. The former calls 
{{setFailure(..)}} with exceptions that would prevent a table-creation from 
happening (e.g. table already exists) while the latter just throws the 
Exception.

Are these logically equivalent at a higher level? Or is this a sign that 
DeleteNS needs to be updated?

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-13 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362622#comment-16362622
 ] 

Josh Elser commented on HBASE-19953:


Let me take a look at that. Thanks for the pointer, sir.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362607#comment-16362607
 ] 

stack commented on HBASE-19953:
---

Looking at related Master-side Operations, I see them take a latch in the 
NonceProcedureRunnable implementation. When latch is thrown, they call the post 
op. See enableTable, createTable, etc. This delete namespace should do similar? 
Later we should come back and get rid of all these latches (and then we'll have 
to figure how Observer can monitor Procedure).

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-07 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356207#comment-16356207
 ] 

Josh Elser commented on HBASE-19953:


Ok, this naive solution definitely doesn't work. Async is built into the 
procedure itself – this appears to be a nasty oversight on our part.

The existing semantics from branch-1 are that if the operation fails, the post 
hook is not invoked. This was easy to implement as they were synchronous calls.

I feel like to do this properly, we would need to push the post hook invocation 
down into the procedure itself so that it's automatically invoked. The downside 
of this approach is that is convolutes the Procedure implementation with the 
Observer hook logic. Maybe we can push this into the {{NonceProcedureRunnable}} 
logic instead? (in {{MasterProcedureUtil#submitProcedure()}})

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-07 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355964#comment-16355964
 ] 

Josh Elser commented on HBASE-19953:


My hunch is that we need to do something like the following, but that's just a 
guess. Need to play with it.
{code:java}
diff --git 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
index 999272a5bd..b770e664e6 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
@@ -2971,7 +2971,10 @@ public class HMaster extends HRegionServer implements 
MasterServices {
 // Execute the operation synchronously - wait for the operation to 
complete before
 // continuing.
 setProcId(getClusterSchema().deleteNamespace(name, getNonceKey()));
-    getMaster().getMasterCoprocessorHost().postDeleteNamespace(name);
+    if (!getProcedureExecutor().getResult(getProcId()).isFailed()) {
+  getMaster().getMasterCoprocessorHost().postDeleteNamespace(name);
+    }
   }{code}

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)