[ https://issues.apache.org/jira/browse/HADOOP-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526183 ]
Jim Kellerman commented on HADOOP-1816:
---------------------------------------

If a region server cannot contact HDFS, it should shut itself down. In this case the master will notice when the region server's lease times out and reassign the region.

> [hbase] Scan of .META. does socket timeout over and over again (rather than
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-1816
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1816
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: Jim Kellerman
>            Priority: Trivial
>         Attachments: excerpt.txt
>
>
> A mismatch in the code on the cluster revealed an infinite loop. The .META.
> scanner is doing a socket timeout trying to contact a borked region server
> (The borked server was having trouble contacting hdfs because of a code
> version mismatch -- it was sort-of-working). We retry the timeout up to the
> retry limit but then, rather than try and redeploy the unreachable .META., we
> just drop back into scanning at the old location.... I'll attach a log that
> illustrates the goings-on.
> I think this is likely a trivial issue since it shouldn't really ever happen....

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
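
For illustration only, here is a minimal Java sketch of the behaviour the report asks for: retry the scan up to the retry limit and, once the limit is exhausted, signal that .META. needs to be redeployed instead of dropping back into scanning at the old location. None of this is taken from the HBase source. The class `MetaScanRetrySketch`, the `ScanAttempt` interface, the `Outcome` enum and the `scanWithRetries` method are all hypothetical names, and treating every `IOException` as a failed attempt is a simplifying assumption.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class MetaScanRetrySketch {

    /** Hypothetical stand-in for one scan attempt against the server hosting .META. */
    interface ScanAttempt {
        void run() throws IOException;
    }

    /** Hypothetical result of a bounded retry. */
    enum Outcome { SCANNED, NEEDS_REASSIGNMENT }

    /**
     * Tries the scan at most retryLimit times. If every attempt fails,
     * the caller is told to reassign instead of retrying the same location.
     */
    static Outcome scanWithRetries(ScanAttempt attempt, int retryLimit) {
        for (int tries = 0; tries < retryLimit; tries++) {
            try {
                attempt.run();
                return Outcome.SCANNED;
            } catch (IOException e) {
                // Assumption: any I/O failure (including a socket timeout)
                // counts as one failed attempt; fall through and retry.
            }
        }
        // Retry limit reached: do not loop back to the old location.
        return Outcome.NEEDS_REASSIGNMENT;
    }

    public static void main(String[] args) {
        // Simulate a region server that always times out.
        ScanAttempt alwaysTimesOut = () -> {
            throw new SocketTimeoutException("simulated timeout");
        };
        System.out.println(scanWithRetries(alwaysTimesOut, 3));
    }
}
```

The point of the sketch is only that the retry loop has a distinct exit once the limit is hit, so the caller can act on it. How reassignment would actually be triggered (for example via the lease timeout on the master mentioned in the comment) is outside what this example shows.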