[ 
https://issues.apache.org/jira/browse/ACCUMULO-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Tubbs resolved ACCUMULO-3478.
-----------------------------------------
    Resolution: Workaround

Since this was initially reported, there have been several changes to improve 
the port search functionality. As this issue describes, when port searching 
isn't being used we probably still retry too aggressively on the same port 
instead of throwing a useful exception. However, users deploying the software 
can avoid the problem relatively easily by not configuring Accumulo with ports 
that are already in use, relying on mechanisms outside Accumulo such as 
configuration management, port reservation, and external monitoring.

I think it makes sense to close this issue as OBE (overtaken by events), given 
its age and the more recent advancements that make port search more usable. If 
there is still something to be done here, please open a new issue or PR at 
https://github.com/apache/accumulo

> TServerUtils.startServer port already taken case is lacking
> -----------------------------------------------------------
>
>                 Key: ACCUMULO-3478
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3478
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1
>            Reporter: Josh Elser
>            Priority: Major
>
> When a thrift server is configured to use a fixed port, we will loop 100 
> times trying to bind to the port, which is (likely) going to stay bound. 
> Because the server socket sets SO_REUSEADDR, we shouldn't hit the case where 
> the previous application has died or gone away but we are still unable to 
> bind to the port.
> If we're explicitly given a port to start the thrift server on, and are 
> unable to start it, I think it would be better to throw an exception which 
> would kill the process instead of retrying for at least 25 seconds first.
> Because we've already copied TNonblockingServerSocket, we can replace the 
> TTransportException thrown with a more meaningful exception and work on 
> getting better semantics in upstream thrift (if necessary).
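
The behavior proposed in the description can be sketched roughly as follows. 
This is a minimal illustration using plain java.net sockets, not Accumulo's 
TServerUtils or Thrift's TNonblockingServerSocket; the class PortBinder, the 
exception PortUnavailableException, and the methods bindFixed and 
bindWithSearch are hypothetical names chosen for the example. It assumes a 
fixed port should fail on the first bind error, while a searchable port may 
try the next few ports.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper; not part of Accumulo or Thrift.
class PortBinder {

    // Hypothetical exception carrying a more meaningful message than a bare bind error.
    static class PortUnavailableException extends IOException {
        PortUnavailableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Single bind attempt with SO_REUSEADDR set, as the issue describes.
    private static ServerSocket bindOnce(int port) throws IOException {
        ServerSocket socket = new ServerSocket();
        try {
            socket.setReuseAddress(true);
            socket.bind(new InetSocketAddress(port));
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the unbound socket
            throw e;
        }
    }

    // Fixed port: fail immediately instead of retrying the same port.
    static ServerSocket bindFixed(int port) throws PortUnavailableException {
        try {
            return bindOnce(port);
        } catch (IOException e) {
            throw new PortUnavailableException(
                "Unable to bind configured port " + port
                    + "; it is probably in use by another process", e);
        }
    }

    // Port search: try consecutive ports starting at startPort, up to maxAttempts.
    static ServerSocket bindWithSearch(int startPort, int maxAttempts)
            throws PortUnavailableException {
        IOException last = null;
        for (int i = 0; i < maxAttempts && startPort + i <= 65535; i++) {
            try {
                return bindOnce(startPort + i);
            } catch (IOException e) {
                last = e; // remember the failure and move on to the next port
            }
        }
        throw new PortUnavailableException(
            "No free port found in " + maxAttempts + " attempts starting at "
                + startPort, last);
    }
}
```

A caller that was given an explicit port would call PortBinder.bindFixed(port) 
and let the exception end the process, while a caller with port search enabled 
would call PortBinder.bindWithSearch(port, n) for some small n.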



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
