Hi,

While debugging an existing QE HBase test script, I found that in the following scenario (create - freeze - thaw) the script determined that there was no live server:

The region server hor10n03.gq1.ygridcore.net,42175,1405108984098 was considered dead by the new master because of the 'freeze' action.

The new region server, hor10n03.gq1.ygridcore.net,46329,1405120269524 <http://hor10n03.gq1.ygridcore.net:60941/>, was live.

However, the master didn't remove the first one from the dead servers list because the ports do not match. The QE script drew its conclusion because 1 (live) - 1 (dead) = 0.

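To make the arithmetic concrete, here is a minimal Python sketch of the counting problem. It is not the QE script or HBase master code; the function names and both pruning rules are hypothetical, and the only thing taken from the report is the two server names in host,port,startcode form.

```python
from typing import List, NamedTuple


class ServerName(NamedTuple):
    host: str
    port: int
    start_code: int  # third field of the server name, treated as a start timestamp


def parse_server_name(name: str) -> ServerName:
    """Split a 'host,port,startcode' string into its three fields."""
    host, port, start_code = name.split(",")
    return ServerName(host, int(port), int(start_code))


def prune_by_host_and_port(dead: List[ServerName], live: List[ServerName]) -> List[ServerName]:
    # Assumed current behaviour: a dead entry is dropped only when a live
    # server reports the same host AND the same port.
    live_endpoints = {(s.host, s.port) for s in live}
    return [d for d in dead if (d.host, d.port) not in live_endpoints]


def prune_by_host_and_start_code(dead: List[ServerName], live: List[ServerName]) -> List[ServerName]:
    # Hypothetical alternative: drop a dead entry when a live server on the
    # same host has a later start code, regardless of port.
    return [
        d for d in dead
        if not any(s.host == d.host and s.start_code > d.start_code for s in live)
    ]


def naive_live_count(live: List[ServerName], dead: List[ServerName]) -> int:
    # The subtraction the QE script appears to perform.
    return len(live) - len(dead)


live = [parse_server_name("hor10n03.gq1.ygridcore.net,46329,1405120269524")]
dead = [parse_server_name("hor10n03.gq1.ygridcore.net,42175,1405108984098")]

print(naive_live_count(live, prune_by_host_and_port(dead, live)))        # 0
print(naive_live_count(live, prune_by_host_and_start_code(dead, live)))  # 1
```

Note that the host-only rule would be wrong on a node that legitimately runs several region servers on different ports, which is one reason reusing the original port may be the simpler fix.
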
You can observe this scenario here: http://hor10n01.gq1.ygridcore.net:50938/master-status

The region server was brought up on the same node, and the previous port was still free:

[hortonzy@hor10n03 ~]$ sudo netstat -tulpn | grep 42175
[hortonzy@hor10n03 ~]$

So I think the proper action would be to reuse the previous port when thawing. Please comment.
