[ 
https://issues.apache.org/jira/browse/HDFS-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13834:
-------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: HDFS-13891
           Status: Resolved  (was: Patch Available)

Thanks [~crh] for the fix.
Committed to HDFS-13891.

> RBF: Connection creator thread should catch Throwable
> -----------------------------------------------------
>
>                 Key: HDFS-13834
>                 URL: https://issues.apache.org/jira/browse/HDFS-13834
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: CR Hota
>            Assignee: CR Hota
>            Priority: Critical
>             Fix For: HDFS-13891
>
>         Attachments: HDFS-13834-HDFS-13891.0.patch, 
> HDFS-13834-HDFS-13891.1.patch, HDFS-13834-HDFS-13891.2.patch, 
> HDFS-13834-HDFS-13891.3.patch, HDFS-13834-HDFS-13891.4.patch, 
> HDFS-13834-HDFS-13891.5.patch, HDFS-13834.0.patch, HDFS-13834.1.patch
>
>
> Connection creator thread is a single thread thats responsible for creating 
> all downstream namenode connections.
> This is very critical thread and hence should not die understand 
> exception/error scenarios.
> We saw this behavior in production systems where the thread died leaving the 
> router process in bad state.
> The thread should also catch a generic error/exception.
> {code}
>     @Override
>     public void run() {
>       while (this.running) {
>         try {
>           ConnectionPool pool = this.queue.take();
>           try {
>             int total = pool.getNumConnections();
>             int active = pool.getNumActiveConnections();
>             if (pool.getNumConnections() < pool.getMaxSize() &&
>                 active >= MIN_ACTIVE_RATIO * total) {
>               ConnectionContext conn = pool.newConnection();
>               pool.addConnection(conn);
>             } else {
>               LOG.debug("Cannot add more than {} connections to {}",
>                   pool.getMaxSize(), pool);
>             }
>           } catch (IOException e) {
>             LOG.error("Cannot create a new connection", e);
>           }
>         } catch (InterruptedException e) {
>           LOG.error("The connection creator was interrupted");
>           this.running = false;
>         }
>       }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to