[
https://issues.apache.org/jira/browse/DRILL-6935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
James Turton closed DRILL-6935.
-------------------------------
Assignee: James Turton
Resolution: Won't Do
This needs to be fixed in Hadoop.
> Apache drill show a lot of CLOSE_WAIT states when we access https://ip
> address:8047 caused the URL is not available
> -------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-6935
> URL: https://issues.apache.org/jira/browse/DRILL-6935
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.13.0
> Reporter: ken
> Assignee: James Turton
> Priority: Critical
>
> Hi Team,
> Hope all is good.
> We need your help.
> Here is the Apache Drill process that we installed on our server:
> drill 19220 1 17 16:48 ? 00:15:32 /usr/java/jdk/bin/java -Xms8G -Xmx8G -XX:MaxDirectMemorySize=96G -XX:ReservedCodeCacheSize=1024m -Ddrill.exec.enable-epoll=false -XX:+CMSClassUnloadingEnabled -XX:+UseG1GC -Dlog.path=/var/log/drill/drillbit.log -Dlog.query.path=/var/log/drill/drillbit_queries.json -cp /usr/local/apache-drill-1.13.1/conf:/usr/local/apache-drill-1.13.1/jars/*:/usr/local/apache-drill-1.13.1/jars/ext/*:/usr/local/apache-drill-1.13.1/jars/3rdparty/*:/usr/local/apache-drill-1.13.1/jars/classb/*:/usr/local/apache-drill-1.13.1/jars/3rdparty/linux/* org.apache.drill.exec.server.Drillbit
> root 23651 23227 0 18:16 pts/1 00:00:00 grep --color=auto java
>
> Question 1:
> There are a lot of CLOSE_WAIT states when I access Apache Drill at
> https://ip address:8047 <https://theremin.digitalalchemy.net.au:8047/>
> (I have changed our server IP to xxxx for security reasons). As a result,
> we can't access Apache Drill at https://ip address:8047
> <https://theremin.digitalalchemy.net.au:8047/>, so we can't check which SQL
> run failed.
>
> tcp6 0 0 :::8047 :::* LISTEN 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54132 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.100.222:52986 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:53009 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54131 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.3.119:61202 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54366 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54129 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:58627 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:58486 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54134 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:53008 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.3.119:56226 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:52991 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.3.119:51172 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.3.119:36136 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54133 CLOSE_WAIT 19220/java
> tcp6 24 0 192.168.xxxx:8047 192.168.100.131:57474 ESTABLISHED 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54069 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54130 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:53001 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:52985 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:52990 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54212 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.100.131:58628 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.100.131:53955 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:57391 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.3.119:41219 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54307 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:53000 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.222:52984 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54308 CLOSE_WAIT 19220/java
> tcp6 1 0 192.168.xxxx:8047 192.168.3.119:46189 CLOSE_WAIT 19220/java
> tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54211 CLOSE_WAIT 19220/java
>
> Question 2:
> Our Apache Drill goes down frequently; it seems to be due to a memory
> leak. However, we have configured 96G of memory for Apache Drill, so can
> you please advise how we can identify which SQL took a lot of memory, and
> how we can improve our performance?
>
> Error Id: 40d789a6-91ee-4e0b-bfc9-a26358a43df3 on server:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (67043328)
> Allocator(op:14:0:0:HashPartitionSender) 1000000/67043328/101535744/10000000000 (res/actual/peak/limit)
>
> Fragment 14:0
>
> [Error Id: 40d789a6-91ee-4e0b-bfc9-a26358a43df3 on theremin.root.digitalalchemy:31010]
> at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.13.0.jar:1.13.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:300) [drill-java-exec-1.13.0.jar:1.13.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.13.0.jar:1.13.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:266) [drill-java-exec-1.13.0.jar:1.13.0]
> at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.13.0.jar:1.13.0]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> )
>
> Thank you.
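
For anyone triaging a similar report, one quick way to see which clients are leaving sockets in CLOSE_WAIT is to tally a netstat listing like the one quoted above by remote host. The following is only a minimal sketch in Python. It assumes the usual netstat column order (proto, Recv-Q, Send-Q, local address, foreign address, state, PID/program). The function name `count_close_wait` and the `SAMPLE` text are illustrative and not part of Drill; the sample rows are a handful taken from the listing in the report.

```python
from collections import Counter


def count_close_wait(netstat_text):
    """Tally CLOSE_WAIT sockets per remote host from netstat-style text."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local foreign state pid/program
        if len(fields) < 6 or fields[5] != "CLOSE_WAIT":
            continue
        # Drop the port from the foreign address to group by host.
        remote_host = fields[4].rsplit(":", 1)[0]
        counts[remote_host] += 1
    return counts


# A few rows from the quoted listing (hypothetical sample input).
SAMPLE = """\
tcp6 0 0 :::8047 :::* LISTEN 19220/java
tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54132 CLOSE_WAIT 19220/java
tcp6 1 0 192.168.xxxx:8047 192.168.100.222:52986 CLOSE_WAIT 19220/java
tcp6 1 0 192.168.xxxx:8047 192.168.3.119:61202 CLOSE_WAIT 19220/java
tcp6 24 0 192.168.xxxx:8047 192.168.100.131:57474 ESTABLISHED 19220/java
tcp6 518 0 192.168.xxxx:8047 192.168.100.131:54069 CLOSE_WAIT 19220/java
"""

if __name__ == "__main__":
    for host, n in count_close_wait(SAMPLE).most_common():
        print(host, n)
```

CLOSE_WAIT means the remote side has closed the connection and the local process has not yet closed its own socket, so a per-host tally like this mainly shows which clients were talking to port 8047 when the server stopped cleaning up connections.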