[ 
https://issues.apache.org/jira/browse/HDFS-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CR Hota updated HDFS-13834:
---------------------------
    Description: 
The connection creator thread is a single thread that's responsible for creating all
downstream namenode connections.

This is a very critical thread and hence should not die under
exception/error scenarios.

We saw this behavior in production systems where the thread died, leaving the
router process in a bad state.

The thread should also catch a generic error/exception.

{code}

    @Override
    public void run() {
      while (this.running) {
        try {
          ConnectionPool pool = this.queue.take();
          try {
            int total = pool.getNumConnections();
            int active = pool.getNumActiveConnections();
            if (pool.getNumConnections() < pool.getMaxSize() &&
                active >= MIN_ACTIVE_RATIO * total) {
              ConnectionContext conn = pool.newConnection();
              pool.addConnection(conn);
            } else {
              LOG.debug("Cannot add more than {} connections to {}",
                  pool.getMaxSize(), pool);
            }
          } catch (IOException e) {
            LOG.error("Cannot create a new connection", e);
          }
        } catch (InterruptedException e) {
          LOG.error("The connection creator was interrupted");
          this.running = false;
        }
      }
    }
{code}

  was:
The connection creator thread is a single thread that's responsible for creating all
downstream namenode connections.

This is a very critical thread and hence should not die under
exception/error scenarios.

We saw this behavior in production systems where the thread died, leaving the
router process in a bad state.

The thread should also catch a generic error/exception.

{code}
{}


> RBF: Connection creator thread should catch Throwable
> -----------------------------------------------------
>
>                 Key: HDFS-13834
>                 URL: https://issues.apache.org/jira/browse/HDFS-13834
>             Project: Hadoop HDFS
>          Issue Type: Bug
>         Environment: {code:java}
>     @Override
>     public void run() {
>       while (this.running) {
>         try {
>           ConnectionPool pool = this.queue.take();
>           try {
>             int total = pool.getNumConnections();
>             int active = pool.getNumActiveConnections();
>             if (pool.getNumConnections() < pool.getMaxSize() &&
>                 active >= MIN_ACTIVE_RATIO * total) {
>               ConnectionContext conn = pool.newConnection();
>               pool.addConnection(conn);
>             } else {
>               LOG.debug("Cannot add more than {} connections to {}",
>                   pool.getMaxSize(), pool);
>             }
>           } catch (IOException e) {
>             LOG.error("Cannot create a new connection", e);
>           }
>         } catch (InterruptedException e) {
>           LOG.error("The connection creator was interrupted");
>           this.running = false;
>         }
>       }
>     }
> {code}
>            Reporter: CR Hota
>            Assignee: CR Hota
>            Priority: Critical
>
> The connection creator thread is a single thread that's responsible for creating
> all downstream namenode connections.
> This is a very critical thread and hence should not die under
> exception/error scenarios.
> We saw this behavior in production systems where the thread died, leaving the
> router process in a bad state.
> The thread should also catch a generic error/exception.
> {code}
>     @Override
>     public void run() {
>       while (this.running) {
>         try {
>           ConnectionPool pool = this.queue.take();
>           try {
>             int total = pool.getNumConnections();
>             int active = pool.getNumActiveConnections();
>             if (pool.getNumConnections() < pool.getMaxSize() &&
>                 active >= MIN_ACTIVE_RATIO * total) {
>               ConnectionContext conn = pool.newConnection();
>               pool.addConnection(conn);
>             } else {
>               LOG.debug("Cannot add more than {} connections to {}",
>                   pool.getMaxSize(), pool);
>             }
>           } catch (IOException e) {
>             LOG.error("Cannot create a new connection", e);
>           }
>         } catch (InterruptedException e) {
>           LOG.error("The connection creator was interrupted");
>           this.running = false;
>         }
>       }
>     }
> {code}
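
One possible shape of the fix, sketched outside the Router code base: the
inner catch is widened from IOException to Throwable, so a failure while
handling one queue item is logged and the loop moves on to the next item.
Task, Creator and survivesFailures() are hypothetical names used only for
this illustration. They stand in for the ConnectionPool work and are not part
of Hadoop. Whether every Error should really be swallowed is a judgment call;
the sketch only shows the mechanism.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ResilientCreatorSketch {

  /** Hypothetical stand-in for the per-pool connection work. */
  interface Task {
    void run() throws Exception;
  }

  /** Hypothetical creator thread; not the Router's actual class. */
  static class Creator extends Thread {
    private final BlockingQueue<Task> queue;
    private volatile boolean running = true;

    Creator(BlockingQueue<Task> queue) {
      super("creator-sketch");
      this.queue = queue;
      setDaemon(true);
    }

    @Override
    public void run() {
      while (this.running) {
        try {
          Task task = this.queue.take();
          try {
            task.run();
          } catch (Throwable t) {
            // Log and move on to the next queue item instead of
            // letting the failure end the thread.
            System.err.println("Task failed, creator continues: " + t);
          }
        } catch (InterruptedException e) {
          // Interruption is still treated as a shutdown request.
          this.running = false;
        }
      }
    }
  }

  /** Returns true if a task still runs after two earlier tasks failed. */
  static boolean survivesFailures() {
    BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
    CountDownLatch processed = new CountDownLatch(1);
    Creator creator = new Creator(queue);
    creator.start();
    try {
      // An unchecked exception and an Error, neither of which an
      // IOException-only catch would handle.
      queue.put(() -> { throw new IllegalStateException("simulated"); });
      queue.put(() -> { throw new Error("simulated"); });
      queue.put(processed::countDown);
      return processed.await(5, TimeUnit.SECONDS) && creator.isAlive();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      creator.interrupt(); // stop the sketch thread
    }
  }

  public static void main(String[] args) {
    System.out.println("creator survived: " + survivesFailures());
  }
}
```

With a catch limited to IOException, the first simulated failure would end
the thread and the third task would never run.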



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
