Abhishek Girish created DRILL-2917:
--------------------------------------

             Summary: Drillbit process fails to restart with address-already-in-use error due to unclean shutdown
                 Key: DRILL-2917
                 URL: https://issues.apache.org/jira/browse/DRILL-2917
             Project: Apache Drill
          Issue Type: Bug
          Components: Client - CLI
    Affects Versions: 0.9.0
            Reporter: Abhishek Girish
            Assignee: Daniel Barclay (Drill)


On a 4-node cluster, some Drillbits fail to come up, complaining that the 
address is already in use. 

The previous Drillbit process (if any) was not listed as running by `jps`, yet 
the Web UI continued to show all processes as up. 

{code}
# jps
<No Drillbit Process>

# /opt/mapr/drill/drill-0.9.0/bin/drillbit.sh stop
no drillbit to stop because no pid file /opt/mapr/drill/drill-0.9.0/drillbit.pid

# /opt/mapr/drill/drill-0.9.0/bin/drillbit.sh restart
no drillbit to stop because no pid file /opt/mapr/drill/drill-0.9.0/drillbit.pid
starting drillbit, logging to /opt/mapr/drill/drill-0.9.0/logs/drillbit.out

# jps
<No Drillbit Process>
{code}

Drillbit.out:
{code}
Exception in thread "main" org.apache.drill.exec.exception.DrillbitStartupException: Failure during initial startup of Drillbit.
        at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:87)
        at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:66)
        at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:166)
Caused by: org.apache.drill.exec.exception.DrillbitStartupException: Could not bind Drillbit
        at org.apache.drill.exec.rpc.BasicServer.bind(BasicServer.java:158)
        at org.apache.drill.exec.service.ServiceEngine.start(ServiceEngine.java:65)
        at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:241)
        at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:84)
        ... 2 more
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:444)
        at sun.nio.ch.Net.bind(Net.java:436)
        ...
        ...
{code}
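
When a bind fails like this, one way to see what still holds the port is to filter the listening-socket table by port number. The following is a minimal sketch, not part of the original report: the helper name `listeners_on_port` is hypothetical, it assumes `ss -ltnp` output with the local address in the fourth column, and it uses 31010 (Drill's default user port) only as an example, since the trace does not say which port was refused.

```shell
#!/bin/sh
# Hypothetical helper: read `ss -ltnp` style output on stdin and print the
# rows whose local address ends in the given port.
# Assumption: the local address is the 4th whitespace-separated column.
listeners_on_port() {
    awk -v port="$1" '{
        # Split "addr:port" (or "[::]:port") on ":" and compare the last part.
        n = split($4, parts, ":")
        if (n > 1 && parts[n] == port) print
    }'
}

# Usage (run as root so the owning process is shown):
#   ss -ltnp | listeners_on_port 31010
```

If a row comes back while `jps` shows no Drillbit, the PID in that row is the process to inspect.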

It turns out the Drillbit had failed to shut down cleanly and an internal 
process was still running. 

{code}
# ps -ef |grep drill
mapr      2807     1  0 Apr25 ?        00:00:00 bash /opt/mapr/drill/drill-0.9.0/bin/drillbit.sh internal_start drillbit
mapr      2862  2807  0 Apr25 ?        00:18:54 /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.65.x86_64/jre/bin/java -Dlog.path=/opt/mapr/drill/drill-0.9.0/log/drillbit.log -Xms1G -Xmx16G -XX:MaxDirectMemorySize=48G -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=1G -ea -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.sasl.client=false -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -cp /opt/mapr/drill/drill-0.9.0/conf:/opt/mapr/drill/drill-0.9.0/jars/*:/opt/mapr/drill/drill-0.9.0/jars/ext/*:/opt/mapr/drill/drill-0.9.0/jars/3rdparty/*:/opt/mapr/drill/drill-0.9.0/jars/classb/* org.apache.drill.exec.server.Drillbit
{code}

Killing this leftover process allowed the Drillbits to come up on all nodes. 
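
Because the pid file is gone, `drillbit.sh stop` cannot find the leftover JVM, so it has to be located from the process table. A minimal sketch of that lookup follows. The helper name `drillbit_pids` is hypothetical, and it assumes `ps -ef` output with the PID in the second column and the Drillbit main class on the command line, as in the listing in this report.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (2nd column) of every process whose command line names the Drillbit
# main class. The escaped dots match literal dots only, which also keeps
# awk's own command line (it contains backslashes) from matching.
drillbit_pids() {
    awk '/org\.apache\.drill\.exec\.server\.Drillbit/ { print $2 }'
}

# Usage:
#   ps -ef | drillbit_pids     # list candidate PIDs
#   kill <pid>                 # stop the leftover JVM, then restart the Drillbit
```

Matching on the main class rather than on the string "drill" skips the `drillbit.sh internal_start` wrapper shell and any `grep drill` process.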



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
