[ 
https://issues.apache.org/jira/browse/HDFS-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588059#comment-16588059
 ] 

CR Hota commented on HDFS-13834:
--------------------------------

[~linyiqun] Thanks for reviewing the patch.

Setting the flag to false would make the thread finish and exit (something 
similar to what already happens in this error case). The point here is that if 
a fatal error occurs that is not caught as an IOException, the thread should 
recover and try to create a connection by taking another task from the creator 
queue. Therefore the flag should remain true.

I am wondering how to inject this failure in order to write a test case.
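
One possible way to inject the failure, as a rough sketch only: queue a task whose connection attempt throws an unchecked exception, queue a normal task after it, and check that the thread is still alive and processed the second task. Everything below is a simplified stand-in written for illustration (FakePool, Creator, and survivesInjectedFault are hypothetical names, not the router classes or the attached patch), and it assumes the loop gains a catch for Throwable that logs and leaves the running flag true.

```java
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class CreatorLoopSketch {

  /** Hypothetical stand-in for ConnectionPool: only the call the loop needs. */
  interface FakePool {
    void newConnection() throws IOException;
  }

  /** Simplified creator loop; one queue entry is one connection request. */
  static class Creator extends Thread {
    private final BlockingQueue<FakePool> queue;
    private volatile boolean running = true;

    Creator(BlockingQueue<FakePool> queue) {
      this.queue = queue;
      setDaemon(true);
    }

    @Override
    public void run() {
      while (this.running) {
        try {
          FakePool pool = this.queue.take();
          try {
            pool.newConnection();
          } catch (IOException e) {
            System.err.println("Cannot create a new connection: " + e);
          }
        } catch (InterruptedException e) {
          // Shutdown was requested: this is the only path that clears the flag.
          this.running = false;
        } catch (Throwable t) {
          // Unexpected failure: log it, keep the flag true, take the next task.
          System.err.println("Unexpected error in connection creator: " + t);
        }
      }
    }

    void shutdown() {
      this.running = false;
      interrupt();
    }
  }

  /** Returns true if the thread outlives an injected unchecked exception. */
  static boolean survivesInjectedFault() {
    BlockingQueue<FakePool> queue = new LinkedBlockingQueue<>();
    CountDownLatch secondTaskRan = new CountDownLatch(1);
    // First task: the injected fault, deliberately not an IOException.
    queue.add(() -> { throw new IllegalStateException("injected fault"); });
    // Second task: only runs if the thread recovered from the first one.
    queue.add(secondTaskRan::countDown);

    Creator creator = new Creator(queue);
    creator.start();
    try {
      boolean ran = secondTaskRan.await(5, TimeUnit.SECONDS);
      boolean alive = creator.isAlive();
      creator.shutdown();
      creator.join(5000);
      return ran && alive;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("creator survived: " + survivesInjectedFault());
  }
}
```

In a real test the same idea would presumably need a pool (or a mock of it) whose newConnection() throws a RuntimeException; whether the router code allows substituting one is something I have not checked.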

 

> RBF: Connection creator thread should catch Throwable
> -----------------------------------------------------
>
>                 Key: HDFS-13834
>                 URL: https://issues.apache.org/jira/browse/HDFS-13834
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: CR Hota
>            Assignee: CR Hota
>            Priority: Critical
>         Attachments: HDFS-13834.0.patch, HDFS-13834.1.patch
>
>
> Connection creator thread is a single thread that's responsible for creating 
> all downstream namenode connections.
> This is a very critical thread and hence should not die under 
> exception/error scenarios.
> We saw this behavior in production systems where the thread died, leaving the 
> router process in a bad state.
> The thread should also catch a generic error/exception.
> {code}
>     @Override
>     public void run() {
>       while (this.running) {
>         try {
>           ConnectionPool pool = this.queue.take();
>           try {
>             int total = pool.getNumConnections();
>             int active = pool.getNumActiveConnections();
>             if (pool.getNumConnections() < pool.getMaxSize() &&
>                 active >= MIN_ACTIVE_RATIO * total) {
>               ConnectionContext conn = pool.newConnection();
>               pool.addConnection(conn);
>             } else {
>               LOG.debug("Cannot add more than {} connections to {}",
>                   pool.getMaxSize(), pool);
>             }
>           } catch (IOException e) {
>             LOG.error("Cannot create a new connection", e);
>           }
>         } catch (InterruptedException e) {
>           LOG.error("The connection creator was interrupted");
>           this.running = false;
>         }
>       }
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
