[ https://issues.apache.org/jira/browse/SPARK-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366071#comment-15366071 ]

Slava commented on SPARK-16378:
-------------------------------

Another issue related to the same behavior is a resource leak: each time a 
HiveContext is created, it opens a new socket that isn't closed when the 
SparkContext is stopped. As a result, Spark switches to another port on each 
invocation:

java    55295 dropwizard  358u  IPv6  27186032  0t0  TCP *:9544 (LISTEN)
java    55295 dropwizard  359u  IPv6  27188547  0t0  TCP *:9545 (LISTEN)
java    55295 dropwizard  360u  IPv6  27191750  0t0  TCP *:9546 (LISTEN)
java    55295 dropwizard  361u  IPv6  27193491  0t0  TCP *:9547 (LISTEN)
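
For reference, a minimal sketch of the create/stop cycle that produces this 
(hypothetical code against the Spark 1.6 Java API; the SparkConf setup is 
illustrative, not taken from the application in question):

{code}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class PortLeakRepro {
    public static void main(String[] args) {
        // Illustrative configuration; any master/app name reproduces the cycle.
        SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("port-leak-repro");
        for (int i = 0; i < 4; i++) {
            JavaSparkContext jsc = new JavaSparkContext(conf);
            HiveContext hiveContext = new HiveContext(jsc);
            jsc.stop(); // stops the SparkContext, yet lsof still shows the previous LISTEN socket
        }
    }
}
{code}

Checking lsof -p <pid> between iterations shows the listener count growing as 
above; the bind failure below is then logged on the next creation.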

WARN  [2016-07-07 12:53:07,721] org.apache.spark.util.Utils: Service 'HTTP file server' could not bind on port 9545. Attempting port 9546.
INFO  [2016-07-07 12:53:07,721] org.spark-project.jetty.server.Server: jetty-8.y.z-SNAPSHOT
WARN  [2016-07-07 12:53:07,725] org.spark-project.jetty.util.component.AbstractLifeCycle: FAILED [email protected]:9546: java.net.BindException: Address already in use
! java.net.BindException: Address already in use
! at java.net.PlainSocketImpl.socketBind(Native Method) ~[na:1.7.0_95]
! at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376) ~[na:1.7.0_95]
! at java.net.ServerSocket.bind(ServerSocket.java:376) ~[na:1.7.0_95]
! at java.net.ServerSocket.<init>(ServerSocket.java:237) ~[na:1.7.0_95]
! at java.net.ServerSocket.<init>(ServerSocket.java:181) ~[na:1.7.0_95]
! at org.spark-project.jetty.server.bio.SocketConnector.newServerSocket(SocketConnector.java:96) ~[spark-core_2.10-1.6.0.jar:1.6.0]
! at org.spark-project.jetty.server.bio.SocketConnector.open(SocketConnector.java:85) ~[spark-core_2.10-1.6.0.jar:1.6.0]
! at org.spark-project.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316) ~[spark-core_2.10-1.6.0.jar:1.6.0]
! at org.spark-project.jetty.server.bio.SocketConnector.doStart(SocketConnector.java:156) ~[spark-core_2.10-1.6.0.jar:1.6.0]
! at org.spark-project.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.spark-project.jetty.server.Server.doStart(Server.java:293) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.spark-project.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.HttpServer.org$apache$spark$HttpServer$$doStart(HttpServer.scala:105) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1964) [spark-core_2.10-1.6.0.jar:1.6.0]
! at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) [scala-library-2.10.5.jar:na]
! at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1955) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.HttpServer.start(HttpServer.scala:62) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.HttpFileServer.initialize(HttpFileServer.scala:46) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.rpc.netty.HttpBasedFileServer.startFileServer(HttpBasedFileServer.scala:55) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.rpc.netty.HttpBasedFileServer.getFileServer(HttpBasedFileServer.scala:46) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.rpc.netty.HttpBasedFileServer.addJar(HttpBasedFileServer.scala:34) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.SparkContext.liftedTree2$1(SparkContext.scala:1654) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.SparkContext.addJar(SparkContext.scala:1653) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.SparkContext$$anonfun$14.apply(SparkContext.scala:487) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.SparkContext$$anonfun$14.apply(SparkContext.scala:487) [spark-core_2.10-1.6.0.jar:1.6.0]
! at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.5.jar:na]
! at org.apache.spark.SparkContext.<init>(SparkContext.scala:487) [spark-core_2.10-1.6.0.jar:1.6.0]
! at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59) [spark-core_2.10-1.6.0.jar:1.6.0]
! at com.kenshoo.rtcruleengine.generalfeed.configuration.SparkContextProviderImpl.createSparkContext(SparkContextProviderImpl.java:71) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.configuration.SparkContextProviderImpl.getSparkContext(SparkContextProviderImpl.java:36) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.configuration.HiveContextProviderImpl.getHiveContext(HiveContextProviderImpl.java:31) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.health.HiveHealthValidator.doCheck(HiveHealthValidator.java:45) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.health.HiveHealthValidator.access$000(HiveHealthValidator.java:21) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.health.HiveHealthValidator$1.get(HiveHealthValidator.java:40) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.health.HiveHealthValidator$1.get(HiveHealthValidator.java:37) [general-feed-processor-1.0.1797.jar:na]
! at com.google.common.base.Suppliers$ExpiringMemoizingSupplier.get(Suppliers.java:192) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.generalfeed.health.HiveHealthValidator.validate(HiveHealthValidator.java:34) [general-feed-processor-1.0.1797.jar:na]
! at com.kenshoo.rtcruleengine.healthcheck.HiveHealthCheck.check(HiveHealthCheck.java:23) [rtc-rule-engine-runner-1.0.1797.jar:na]
! at com.yammer.metrics.core.HealthCheck.execute(HealthCheck.java:195) [general-feed-processor-1.0.1797.jar:na]
! at com.yammer.metrics.core.HealthCheckRegistry.runHealthChecks(HealthCheckRegistry.java:53) [general-feed-processor-1.0.1797.jar:na]
! at com.yammer.metrics.reporting.HealthCheckServlet.doGet(HealthCheckServlet.java:61) [metrics-servlet-2.2.0.jar:na]
! at javax.servlet.http.HttpServlet.service(HttpServlet.java:735) [javax.servlet-3.0.0.v201112011016.jar:na]
! at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) [javax.servlet-3.0.0.v201112011016.jar:na]
! at com.yammer.metrics.reporting.AdminServlet.doGet(AdminServlet.java:103) [metrics-servlet-2.2.0.jar:na]
! at javax.servlet.http.HttpServlet.service(HttpServlet.java:735) [javax.servlet-3.0.0.v201112011016.jar:na]
! at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) [javax.servlet-3.0.0.v201112011016.jar:na]
! at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669) [jetty-servlet-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:457) [jetty-servlet-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) [jetty-servlet-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.Server.handle(Server.java:368) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) [jetty-http-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) [jetty-http-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) [jetty-server-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) [jetty-util-8.1.10.v20130312.jar:8.1.10.v20130312]
! at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) [jetty-util-8.1.10.v20130312.jar:8.1.10.v20130312]
! at java.lang.Thread.run(Thread.java:745) [na:1.7.0_95]
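
Since the trace shows the contexts being created from a periodic health check, 
one possible mitigation until this is fixed is to cache a single context for 
the lifetime of the process instead of creating one per check. A sketch of 
that idea (the class and method names here are illustrative, not taken from 
the reporter's code):

{code}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

// Lazily creates one HiveContext and reuses it, so repeated health checks
// don't accumulate sockets or metastore files.
public final class CachedHiveContextProvider {
    private static volatile HiveContext instance;

    private CachedHiveContextProvider() {}

    public static HiveContext get(SparkConf conf) {
        if (instance == null) {
            synchronized (CachedHiveContextProvider.class) {
                if (instance == null) {
                    instance = new HiveContext(new JavaSparkContext(conf));
                }
            }
        }
        return instance;
    }
}
{code}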


> HiveContext doesn't release resources
> -------------------------------------
>
>                 Key: SPARK-16378
>                 URL: https://issues.apache.org/jira/browse/SPARK-16378
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, SQL
>    Affects Versions: 1.6.0
>         Environment: Linux Ubuntu
>            Reporter: Slava
>            Priority: Minor
>
> I am running this simple code:
> {code} 
> HiveContext hiveContext = new HiveContext(new JavaSparkContext(conf));
> hiveContext.sparkContext().stop();
> {code} 
> Each HiveContext creation creates 100+ .dat files.
> They can be counted with "ls -l | grep dat | wc -l" and listed with "ls -l | grep dat" in the /proc/PID/fd directory:
> lrwx------ 1 dropwizard dropwizard 64 Jul  4 21:39 891 -> /tmp/spark-3625050e-6d18-421f-89ae-9859e9edfb9f/metastore/seg0/c650.dat
> lrwx------ 1 dropwizard dropwizard 64 Jul  4 21:39 893 -> /tmp/spark-3625050e-6d18-421f-89ae-9859e9edfb9f/metastore/seg0/c670.dat
> lrwx------ 1 dropwizard dropwizard 64 Jul  4 21:39 895 -> /tmp/spark-3625050e-6d18-421f-89ae-9859e9edfb9f/metastore/seg0/c690.dat
> In my application I use a short-lived context: I create and stop it repeatedly.
> It seems that stopping the SparkContext doesn't stop the HiveContext, so these files (and, it seems, other resources) aren't released (deleted). HiveContext itself doesn't have a stop method.
> Thus the next time I create the context, it creates another 100+ files. Eventually I run out of open file descriptors and get a "Too many open files" error, which leads to a server crash.
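
One way to confirm the descriptor growth described above is to count the 
entries in /proc/self/fd around a single create/stop cycle. A sketch, assuming 
Linux (the class name is illustrative):

{code}
import java.io.File;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class FdLeakCheck {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("fd-leak-check");
        int before = new File("/proc/self/fd").list().length;
        HiveContext hiveContext = new HiveContext(new JavaSparkContext(conf));
        hiveContext.sparkContext().stop();
        int after = new File("/proc/self/fd").list().length;
        // A large positive delta after stop() indicates descriptors left open.
        System.out.println("fd delta after one create/stop cycle: " + (after - before));
    }
}
{code}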


