Jira is down, but for the record -- the tests that have recently been
filling up disk space and causing havoc on the test machines are failing
because Hadoop threads fall into an endless loop in DataXceiverServer
(when the test framework calls interrupt on leaked threads).
Simplifying a bit, the loop looks like this:

  public void run() {
    Peer peer = null;
    while (datanode.shouldRun && !datanode.shutdownForUpgrade) {
      try {
        peer = peerServer.accept();
        ...
      } catch (IOException ie) {
        // The failure is only logged; the loop then retries immediately.
        IOUtils.cleanup(null, peer);
        LOG.warn(datanode.getDisplayName() + ":DataXceiverServer: ", ie);
      }
    }
  }

There are no timeouts on this loop; once accept() starts failing, it
just keeps logging forever. I don't know if this "datanode" can be
cleaned up properly, but it definitely should be (in an afterclass
hook). Otherwise the logs will keep growing and there's not much we can
do about it from the test infrastructure's point of view.
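
To illustrate the mechanism, here is a minimal, self-contained sketch.
It is not Hadoop code: it assumes the accept call behaves like a
blocking java.nio ServerSocketChannel (an interrupt closes the channel,
and every later accept() fails immediately), which I have not verified
for the actual peerServer. The class name, method name, and the
maxFailures cap are made up for the demo; the cap exists only so the
demo terminates.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Hadoop or the test framework.
final class InterruptedAcceptLoop {

  /**
   * Runs an accept loop on its own thread, interrupts that thread, and
   * returns how many IOExceptions the loop caught before it exited.
   * maxFailures caps the loop so the demo cannot spin forever.
   */
  static int countFailures(boolean stopOnInterrupt, int maxFailures)
      throws Exception {
    AtomicInteger failures = new AtomicInteger();
    try (ServerSocketChannel server = ServerSocketChannel.open()) {
      server.bind(new InetSocketAddress("127.0.0.1", 0));
      Thread acceptor = new Thread(() -> {
        while (failures.get() < maxFailures) {
          try {
            server.accept().close();
          } catch (IOException e) {
            // The original loop would log a warning here on every pass.
            failures.incrementAndGet();
            // One possible guard: leave the loop once interrupted.
            if (stopOnInterrupt && Thread.currentThread().isInterrupted()) {
              return;
            }
          }
        }
      });
      acceptor.start();
      Thread.sleep(200);     // give the thread time to block in accept()
      acceptor.interrupt();  // what the test framework does to leaked threads
      acceptor.join();
    }
    return failures.get();
  }
}
```

Without the guard, the loop burns through the whole cap almost
instantly, since the channel is already closed and each accept() throws
right away; with the guard, it exits after the first failure. Whether an
interrupt check like this is the right fix for DataXceiverServer is a
separate question; shutting the datanode down in the test's afterclass
hook would avoid the situation altogether.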

D.
