[ 
https://issues.apache.org/jira/browse/HDFS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499163#comment-13499163
 ] 

Eli Collins commented on HDFS-4047:
-----------------------------------

That's correct. I captured that in the comments in the patch but not in the 
jira, sorry. I should have called that out explicitly here.

{code}
-   * No matter what kind of exception we get, keep retrying to offerService().
-   * That's the loop that connects to the NameNode and provides basic DataNode
-   * functionality.
...
+   * Main loop for each BP thread. It retries on IOExceptions, only
+   * stops when "shouldRun" or "shouldServiceRun" are false, ie
+   * on shutdown or refreshNamenodes (or non-IOE).
{code}

My thinking from HDFS-2882 and HDFS-4201 is that we shouldn't soldier on in the 
case of an RTE, e.g. an NPE due to a BP failing to initialize, as this likely 
indicates a host configuration error. I could also see the point of view that 
the DN shouldn't stop running just because one BP failed, since the other may 
be alive and well. What do you think?
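
To make the behavior described in the new comment concrete, here is a rough 
sketch of the loop shape. It is not the patch itself: ActorLoopSketch, Service, 
stop() and ioFailures are hypothetical names made up for illustration, and a 
single shouldRun flag stands in for the "shouldRun"/"shouldServiceRun" pair.

```java
import java.io.IOException;

/** Hypothetical stand-in for a BP service thread; NOT the real BPServiceActor. */
class ActorLoopSketch {
  /** Hypothetical hook representing one pass of offerService(). */
  interface Service {
    void offerService() throws IOException;
  }

  // Assumption: one flag stands in for both shouldRun and shouldServiceRun.
  private volatile boolean shouldRun = true;
  private final Service service;
  int ioFailures = 0; // illustrative counter, only written by the loop thread

  ActorLoopSketch(Service service) {
    this.service = service;
  }

  /** Models shutdown or refreshNamenodes asking the thread to stop. */
  void stop() {
    shouldRun = false;
  }

  /** Single loop: retry on IOException, let anything else end the loop. */
  void run() {
    while (shouldRun) {
      try {
        service.offerService();
      } catch (IOException ioe) {
        // Treated as transient (e.g. the NN is unreachable): go around again.
        ioFailures++;
      }
      // RuntimeExceptions are deliberately not caught, so an NPE from a BP
      // that failed to initialize propagates out of run() and stops the thread.
    }
  }
}
```

With this shape an IOException just causes another pass, while an RTE ends 
this BP thread only. Whether the rest of the DN should keep running after 
that is the open question above.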

                
> BPServiceActor has nested shouldRun loops
> -----------------------------------------
>
>                 Key: HDFS-4047
>                 URL: https://issues.apache.org/jira/browse/HDFS-4047
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Minor
>         Attachments: HADOOP-4047.patch, HDFS-4047.patch, hdfs-4047.txt, 
> hdfs-4047.txt
>
>
> BPServiceActor#run and offerService both have while shouldRun loops. We only 
> need the outer one, ie we can hoist the info log from offerService out to run 
> and remove the while loop.
> {code}
> BPServiceActor#run:
> while (shouldRun()) {
>   try {
>     offerService();
>   } catch (Exception ex) {
> ...
> offerService:
> while (shouldRun()) {
>   try {
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
