[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2012-01-05 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4899:
--

Affects Version/s: 0.92.0  (was: 0.92.1)
    Fix Version/s: 0.92.0  (was: 0.90.0)

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>
> Before assigning regions in ServerShutdownHandler#process, the master checks 
> whether each region is in transition (RIT).
> However, this check doesn't work as expected in the following case:
> 1. Move region A from server B to server C.
> 2. Kill server B.
> 3. Start server B immediately.
> Let's see what happens in the code for the above case:
> {code}
> for step 1:
> 1.1 server B closes region A
> 1.2 master calls setOffline for region A
>     (AssignmentManager#setOffline: this.regions.remove(regionInfo))
> 1.3 server C starts to open region A (not completed)
> for step 3:
> master ServerShutdownHandler#process() for server B
> {
> ..
> splitlog()
> ...
> List regionsInTransition =
> this.services.getAssignmentManager()
> .processServerShutdown(this.serverName);
> ...
> Skip regions that were in transition unless CLOSING or PENDING_CLOSE
> ...
> assign region
> }
> {code}
> In fact, when ServerShutdownHandler#process() runs 
> this.services.getAssignmentManager().processServerShutdown(this.serverName), 
> region A is in RIT (step 1.3 has not completed), but the returned List 
> regionsInTransition doesn't contain it, because region A was removed from 
> AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
> Therefore, region A will be assigned twice.
> In fact, killing and restarting a single server twice can also easily cause a 
> region to be assigned twice.
> Besides the above cause, there is another possibility: 
> when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, 
> the result includes a region that is in RIT at that moment.
> But after MetaReader.getServerUserRegions completes, the region has been 
> opened on another server and is no longer in RIT.
> In our testing environment, where balancing, moving, and killing are executed 
> periodically, regions are often assigned twice, which is troublesome because it 
> affects other test cases.
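
The first race in the quoted description can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: the class `DoubleAssignSketch`, its two maps, and the `recheckRit` flag are hypothetical stand-ins, not HBase classes, and the guard shown is one possible way to close the gap, not necessarily what the attached patches do. It assumes that `processServerShutdown` reports only those in-transition regions that are still mapped to the dead server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified model of the master's bookkeeping.
// Regions, servers and states are plain strings; nothing here is an HBase type.
class DoubleAssignSketch {
    // region -> hosting server (stands in for AssignmentManager.regions)
    final Map<String, String> regions = new HashMap<>();
    // region -> transition state (stands in for the regions-in-transition map)
    final Map<String, String> regionsInTransition = new HashMap<>();

    // Step 1.2: setOffline drops the region from the regions map.
    void setOffline(String region) {
        regions.remove(region);
    }

    // Assumed behaviour: report only RIT regions still mapped to the dead server.
    List<String> processServerShutdown(String deadServer) {
        List<String> reported = new ArrayList<>();
        for (Map.Entry<String, String> e : regions.entrySet()) {
            if (e.getValue().equals(deadServer)
                    && regionsInTransition.containsKey(e.getKey())) {
                reported.add(e.getKey());
            }
        }
        return reported;
    }

    // Decide which regions that meta lists for the dead server get assigned.
    // Simplification: every reported region is skipped (the real handler keeps
    // CLOSING and PENDING_CLOSE ones). recheckRit is a hypothetical guard that
    // also consults the RIT map directly.
    List<String> regionsToAssign(String deadServer, List<String> metaRegions,
                                 boolean recheckRit) {
        List<String> reported = processServerShutdown(deadServer);
        List<String> toAssign = new ArrayList<>();
        for (String region : metaRegions) {
            if (reported.contains(region)) continue;
            if (recheckRit && regionsInTransition.containsKey(region)) continue;
            toAssign.add(region);
        }
        return toAssign;
    }

    // State after steps 1.1 to 1.3: A was on B, is offline, and C is opening it.
    static DoubleAssignSketch afterMoveStarted() {
        DoubleAssignSketch master = new DoubleAssignSketch();
        master.regions.put("A", "B");
        master.setOffline("A");
        master.regionsInTransition.put("A", "OPENING");
        return master;
    }

    public static void main(String[] args) {
        // Meta still lists A under B because C has not finished opening it.
        List<String> meta = Arrays.asList("A");
        System.out.println("without re-check: "
                + afterMoveStarted().regionsToAssign("B", meta, false));
        System.out.println("with re-check:    "
                + afterMoveStarted().regionsToAssign("B", meta, true));
    }
}
```

Without the re-check, region A is not in the reported list and is handed out again while server C is still opening it; with the re-check, the handler in this model leaves A alone.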

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-12-09 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4899:
-

Fix Version/s: 0.90.0  (was: 0.92.1)

This will be in 0.92 after all; cutting a new RC.

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.90.0
>
> Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-12-01 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4899:
-

   Resolution: Fixed
Fix Version/s: 0.92.1
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Applied to trunk and the 0.92 branch.  I ran all tests; the first time, 
TestReplication failed.  It's failing on trunk and 0.92 at the moment.  The 
second time I ran them, all tests passed.  Thanks for the patch, Chunhui.

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.1
>
> Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-30 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-4899:


Attachment: hbase-4899v3.patch

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-30 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4899:
-

 Priority: Critical  (was: Major)
Affects Version/s: 0.92.1  (was: 0.92.0)

Upping priority and marking against 0.92.1.  Will pull into 0.92.0 if there is 
another RC.

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Attachments: hbase-4899.patch, hbase-4899v2.patch
>
>




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-30 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-4899:


Attachment: hbase-4899v2.patch

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: chunhui shen
> Assignee: chunhui shen
> Attachments: hbase-4899.patch, hbase-4899v2.patch
>
>




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-29 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4899:
--

Status: Patch Available  (was: Open)

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: chunhui shen
> Assignee: chunhui shen
> Attachments: hbase-4899.patch
>
>




[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-29 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4899:
--

  Description: 
Before assigning regions in ServerShutdownHandler#process, the master checks 
whether each region is in transition (RIT).
However, this check doesn't work as expected in the following case:
1. Move region A from server B to server C.
2. Kill server B.
3. Start server B immediately.

Let's see what happens in the code for the above case:
{code}
for step 1:
1.1 server B closes region A
1.2 master calls setOffline for region A
    (AssignmentManager#setOffline: this.regions.remove(regionInfo))
1.3 server C starts to open region A (not completed)
for step 3:
master ServerShutdownHandler#process() for server B
{
..
splitlog()
...
List regionsInTransition =
this.services.getAssignmentManager()
.processServerShutdown(this.serverName);
...
Skip regions that were in transition unless CLOSING or PENDING_CLOSE
...
assign region
}
{code}
In fact, when ServerShutdownHandler#process() runs 
this.services.getAssignmentManager().processServerShutdown(this.serverName), 
region A is in RIT (step 1.3 has not completed), but the returned List 
regionsInTransition doesn't contain it, because region A was removed from 
AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
Therefore, region A will be assigned twice.

In fact, killing and restarting a single server twice can also easily cause a 
region to be assigned twice.
Besides the above cause, there is another possibility: 
when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, 
the result includes a region that is in RIT at that moment.
But after MetaReader.getServerUserRegions completes, the region has been 
opened on another server and is no longer in RIT.

In our testing environment, where balancing, moving, and killing are executed 
periodically, regions are often assigned twice, which is troublesome because it 
affects other test cases.

  was:
Before assigning regions in ServerShutdownHandler#process, the master checks 
whether each region is in transition (RIT).
However, this check doesn't work as expected in the following case:
1. Move region A from server B to server C.
2. Kill server B.
3. Start server B immediately.

Let's see what happens in the code for the above case:
{code}
for step 1:
1.1 server B closes region A
1.2 master calls setOffline for region A
    (AssignmentManager#setOffline: this.regions.remove(regionInfo))
1.3 server C starts to open region A (not completed)
for step 3:
master ServerShutdownHandler#process() for server B
{
..
splitlog()
...
List regionsInTransition =
this.services.getAssignmentManager()
.processServerShutdown(this.serverName);
...
Skip regions that were in transition unless CLOSING or PENDING_CLOSE
...
assign region
}

In fact, when ServerShutdownHandler#process() runs 
this.services.getAssignmentManager().processServerShutdown(this.serverName), 
region A is in RIT (step 1.3 has not completed), but the returned List 
regionsInTransition doesn't contain it, because region A was removed from 
AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
Therefore, region A will be assigned twice.
{code}

In fact, killing and restarting a single server twice can also easily cause a 
region to be assigned twice.
Besides the above cause, there is another possibility: 
when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, 
the result includes a region that is in RIT at that moment.
But after MetaReader.getServerUserRegions completes, the region has been 
opened on another server and is no longer in RIT.

In our testing environment, where balancing, moving, and killing are executed 
periodically, regions are often assigned twice, which is troublesome because it 
affects other test cases.

Affects Version/s: 0.92.0

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: chunhui shen
> Attachments: hbase-4899.patch
>
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-29 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-4899:


Attachment: hbase-4899.patch

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
> Issue Type: Bug
> Reporter: chunhui shen
> Attachments: hbase-4899.patch
>
>