[
https://issues.apache.org/jira/browse/HBASE-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868334#comment-15868334
]
Francis Liu edited comment on HBASE-17624 at 2/15/17 6:58 PM:
--------------------------------------------------------------
{quote}
Tell me more Francis Liu about deadlock. How? The concurrency issue addressed
was just making it so all modification to rsgroup state was single-threaded;
{quote}
Both read/write access can't be single threaded. Consider the situation:
1. move_rsgroup_servers is called
2. while #1 is happening rsgroup region is in transition (rpc thread in #1
holds monitor lock)
3. while #2 is happening meta is in transition.
The balancer tries to figure out a plan for the meta region and tries to get the
monitor lock but can't. The rpc thread won't release the monitor lock since the
rsgroup region never gets assigned. The rsgroup region never gets assigned
because it can't update meta with the new state.
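To make the cycle concrete, here is a minimal sketch of the scenario above. It is
not HBase code: the class, lock and latch names are all hypothetical stand-ins,
and a ReentrantLock is used in place of the monitor so that timeouts can be
added and the demo terminates. In the real scenario both waits are unbounded.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the lock cycle; none of these names exist in HBase.
class RsGroupLockCycleSketch {

    /**
     * Runs the scenario once. Returns true if the "balancer" obtained the lock,
     * false if it gave up because the "rpc" thread was still holding it.
     */
    static boolean balancerGetsLock() {
        // Stand-in for the monitor that guards rsgroup state.
        final ReentrantLock groupLock = new ReentrantLock();
        // Counted down only once meta has a plan and can be assigned.
        final CountDownLatch metaAssigned = new CountDownLatch(1);
        final CountDownLatch rpcHoldsLock = new CountDownLatch(1);

        // Steps 1 and 2: the rpc thread takes the lock, then waits for the
        // rsgroup region to be assigned, which in turn needs meta.
        Thread rpc = new Thread(() -> {
            groupLock.lock();
            try {
                rpcHoldsLock.countDown();
                // Bounded here only so the sketch terminates.
                metaAssigned.await(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                groupLock.unlock();
            }
        });
        rpc.start();

        boolean acquired = false;
        try {
            rpcHoldsLock.await();
            // Step 3: the balancer needs the same lock to plan meta.
            acquired = groupLock.tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                try {
                    metaAssigned.countDown(); // meta could now be assigned
                } finally {
                    groupLock.unlock();
                }
            }
            rpc.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired;
    }
}
```

With these timeouts balancerGetsLock() returns false: the balancer gives up
while the rpc thread is still holding the lock and waiting on something only the
balancer can unblock.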
There's a good chance this can be reproduced just by moving both the rsgroup and
meta regions onto the same RS and calling move_rsgroup_servers on that RS.
A bunch of different actors will query group affiliation, so we can't have
writes block reads.
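One way to keep writes from blocking reads, sketched here as an assumption and
not as what the patch does, is to publish an immutable snapshot of the
server-to-group map through a volatile reference. Writers serialize among
themselves, readers never take a lock. GroupMapSnapshot and its methods are
hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical copy-on-write holder for group affiliation; not an HBase class.
class GroupMapSnapshot {
    // Readers only ever see a fully built, unmodifiable map.
    private volatile Map<String, String> serverToGroup = Collections.emptyMap();

    /** Lock-free read: never waits on a writer. Returns null if unknown. */
    String groupOf(String server) {
        return serverToGroup.get(server);
    }

    /** Writers are serialized; each one publishes a fresh snapshot. */
    synchronized void moveServer(String server, String group) {
        Map<String, String> copy = new HashMap<>(serverToGroup);
        copy.put(server, group);
        serverToGroup = Collections.unmodifiableMap(copy);
    }
}
```

The trade-off is that every write copies the map, which is only reasonable if
group changes are rare compared to lookups.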
{quote}
previous it seemed loosey-goosey – least I couldn't figure out the regime.
{quote}
Looks like the first patch already had these changes so I can't really respond
to what may or may not be loosey goosey. I'll try to review the changes today
and see what other issues could be there.
Regarding Guava classes, sounds like this is a new policy. The HostAndPort
change was one of the requirements to get a '-1' withdrawn. Prior to that we had
a similar class with a different name. Not having Guava classes exposed sounds
better to me tho.
was (Author: toffer):
{quote}
Tell me more Francis Liu about deadlock. How? The concurrency issue addressed
was just making it so all modification to rsgroup state was single-threaded;
{quote}
Both read/write access can't be single threaded. Consider the situation:
1. move_rsgroup_servers is called
2. while #1 is happening rsgroup region is in transition (rpc thread in #1
holds monitor lock)
3. while #2 is happening meta is in transition.
The balancer tries to figure out a plan for the meta region and tries to get the
monitor lock but can't. The rpc thread won't release the monitor lock since the
rsgroup region never gets assigned. The rsgroup region never gets assigned
because it can't update meta with the new state.
There's a good chance this can be reproduced just by moving both the rsgroup and
meta regions onto the same RS and calling move_rsgroup_servers on that RS.
A bunch of threads will query group affiliation, so we can't have writes
block reads.
{quote}
previous it seemed loosey-goosey – least I couldn't figure out the regime.
{quote}
Looks like the first patch already had these changes so I can't really respond
to what may or may not be loosey goosey. I'll try to review the changes today
and see what other issues could be there.
Regarding Guava classes, sounds like this is a new policy. The HostAndPort
change was one of the requirements to get a '-1' withdrawn. Prior to that we had
a similar class with a different name. Not having Guava classes exposed sounds
better to me tho.
> Address late review of HBASE-6721, rsgroups feature
> ---------------------------------------------------
>
> Key: HBASE-17624
> URL: https://issues.apache.org/jira/browse/HBASE-17624
> Project: HBase
> Issue Type: Bug
> Components: rsgroup
> Reporter: stack
> Assignee: stack
> Fix For: 2.0.0
>
> Attachments: HBASE-17624.master.001.patch,
> HBASE-17624.master.002.patch, HBASE-17624.master.003.patch,
> HBASE-17624.master.004.patch, HBASE-17624.master.005.patch,
> HBASE-17624.master.006.patch, HBASE-17624.master.007.patch,
> HBASE-17624.master.008.patch, HBASE-17624.master.009.patch,
> HBASE-17624.master.010.patch
>
>
> An internal review by [~busbey] and [~appy] turned up a bunch of good
> findings going over HBASE-6721. They found some really good stuff: a Guava
> type is part of our public API, and concurrency in a few core classes is
> inconsistent.
> Patch coming.