[ https://issues.apache.org/jira/browse/ZOOKEEPER-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295137#comment-15295137 ]
Chris Nauroth commented on ZOOKEEPER-2366: ------------------------------------------ Hello [~fpj]. Thank you for picking up this patch. I think the approach of propagating the exceptions looks like the right direction. Here are a few comments. {code} try { acceptThread.join(); } catch (InterruptedException e) { LOG.error("Error joining old acceptThread when reconfiguring client port " + e.getMessage()); } {code} This wasn't introduced by your patch, but it's generally a bad practice to catch an {{InterruptedException}} and neither propagate it nor re-raise the interrupted status. Other layers of code might be expecting to react to thread interruption. Since we're touching this code, do you think we should add a call to {{Thread.currentThread().interrupt()}}? {code} LOG.error("Error reconfiguring client port to " + addr + " " + e.getMessage()); {code} This is also not a problem introduced by your patch, but can we pass along the {{e}} as the second argument to log the full stack trace for better diagnosis? {code} try { LOG.info("binding to port " + addr); parentChannel = bootstrap.bind(addr); localAddress = addr; } catch (Exception e) { LOG.error("Error while reconfiguring", e); throw new IOException(e.getMessage()); } finally { oldChannel.close(); } {code} If there is an exception thrown from the try body, and then {{oldChannel.close()}} also throws an exception, then the caller will receive the second exception. Generally, the first exception is probably more interesting for root cause information, so I think it's preferrable to catch and log the exception in the finally block, or if you prefer, only throw from the finally block if the whole try block was successful. {code} try { self.processReconfig(newQV, designatedLeader, zk.getZxid(), true); } catch (IOException e) { return false; } {code} Does this exception need to be logged at this layer, or do all exceptions already get logged somewhere within the {{QuorumPeer#processReconfig}} call? I think the test works for covering this change, but I see some incorrect indentation. Would you please clean that up? > Reconfiguration of client port causes a socket leak > --------------------------------------------------- > > Key: ZOOKEEPER-2366 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2366 > Project: ZooKeeper > Issue Type: Bug > Components: quorum > Affects Versions: 3.5.0 > Reporter: Timothy Ward > Assignee: Flavio Junqueira > Priority: Blocker > Fix For: 3.5.2 > > Attachments: ZOOKEEPER-2366.patch, ZOOKEEPER-2366.patch, > zookeeper.patch > > > The NIOServerCnxnFactory reconfigure method can leak server sockets, and > hence make ports unusable until the JVM restarts: > The first line of the method takes a reference to the current > ServerSocketChannel and then the next line replaces it. The subsequent > interactions with the server socket can fail (for example if the > reconfiguration tries to bind to an in-use port). If they fail *before* the > call to oldSS.close() then oldSS is *never* closed. This holds that port open > forever, and prevents the user from rolling back to the previous port! > The code from reconfigure is shown below: > ServerSocketChannel oldSS = ss; > try { > this.ss = ServerSocketChannel.open(); > ss.socket().setReuseAddress(true); > LOG.info("binding to port " + addr); > ss.socket().bind(addr); > ss.configureBlocking(false); > acceptThread.setReconfiguring(); > oldSS.close(); > acceptThread.wakeupSelector(); > try { > acceptThread.join(); > } catch (InterruptedException e) { > LOG.error("Error joining old acceptThread when > reconfiguring client port " + e.getMessage()); > } > acceptThread = new AcceptThread(ss, addr, selectorThreads); > acceptThread.start(); > } catch(IOException e) { > LOG.error("Error reconfiguring client port to " + addr + " " + > e.getMessage()); > } -- This message was sent by Atlassian JIRA (v6.3.4#6332)