zhli1142015 commented on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-641681617
> > Lots of the paths actually do close the iterator explicitly. And I'm
still not sure how .toSeq doesn't consume it.
>
> I'm not sure you're referring to the KV store iterator, as I don't see it. Could you please point out some places?
>
> > Recall that the underlying iterator closes itself when it consumes all
elements.
>
> Maybe you're referring to `CompletionIterator`, whose implementation still needs `close` to be called explicitly. That implementation lives in the core module, and the KV store module doesn't even know it exists.
> (And I vaguely remember an issue where an iterator was not fully consumed and the completion function was never called.)
>
> > We can't get at the iterator that .asScala produces but we might be able
to add a close() method to the view or something, but that's getting ugly
>
> I agree the current code is nicely concise, but I'm really worried that we aren't treating a resource leak as a serious issue and weighing it among the trade-offs. I'm not sure I can agree with the view that a resource leak can be tolerated for the sake of code beauty. The problem simply wasn't observed because Linux allows deleting files that are still in use - if we had an extensive Windows user base, the problem would have been raised much earlier.
>
> We can still use the try-with-resources pattern with the value returned by `closeableIterator` - we can still wrap it with `asScala`, since it also implements `Iterator`. That forces us to live with a couple of extra lines instead of a one-liner, but I'm not sure that matters.
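
For reference, the try/finally shape being suggested would look roughly like this (a minimal sketch; the `viewToSeq` helper and its name are illustrative, not the PR's actual change):

```scala
import scala.collection.JavaConverters._
import org.apache.spark.util.kvstore.KVStore

// Illustrative helper: materialize the view into a strict collection inside
// try, then always close the underlying LevelDB iterator in finally.
def viewToSeq[T, B](store: KVStore, klass: Class[T])(fn: T => B): Seq[B] = {
  val iter = store.view(klass).closeableIterator()
  try {
    // `toList` forces full materialization before `close` runs.
    iter.asScala.map(fn).toList
  } finally {
    iter.close()
  }
}
```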
@srowen, I found more AppStatusStore-related stack traces in the log, and also one case for `closeableIterator`; I believe this is a case where not all elements are consumed before the DB is closed (roughly the shape sketched after the first trace below).
BTW, from my experiment, if the iterator is not closed but the DB is, no data can be read from the iterator at all (`NoSuchElementException`); a reproduction sketch follows the full stack trace below.
```
at org.apache.spark.status.AppStatusStore.resourceProfileInfo(AppStatusStore.scala:56)
at org.apache.spark.status.AppStatusStore.jobsList(AppStatusStore.scala:60)
at org.apache.spark.status.AppStatusStore.executorList(AppStatusStore.scala:86)
at org.apache.spark.status.AppStatusStore.executorSummary(AppStatusStore.scala:421)
at org.apache.spark.status.AppStatusStore.rddList(AppStatusStore.scala:425)
at org.apache.spark.status.AppStatusStore.streamBlocksList(AppStatusStore.scala:503)
at org.apache.spark.status.AppStatusStore.constructTaskDataList(AppStatusStore.scala:537)
```
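
These call sites are roughly of the following shape (a simplified sketch; the `latest` helper is illustrative). Per the discussion above, the `LevelDBIterator` closes itself only when it is fully exhausted, so any early exit leaves the native handle open:

```scala
import org.apache.spark.util.kvstore.KVStore

// Illustrative early-exit read, similar in shape to `lastStageAttempt` in the
// trace below: only the first element is consumed, so the iterator never
// reaches exhaustion, never self-closes, and leaks unless closed explicitly.
def latest[T](store: KVStore, klass: Class[T]): Option[T] = {
  val iter = store.view(klass).reverse().closeableIterator()
  if (iter.hasNext) Some(iter.next()) else None  // leak: `iter.close()` is never called
}
```

The `closeableIterator` case below shows the construction stack trace logged for exactly such a leaked iterator.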
```
Iterator is not closed before db is closed: 1424827908, construct stack trace: java.lang.Throwable
    at org.apache.spark.util.kvstore.LevelDBIterator.<init>(LevelDBIterator.java:58)
    at org.apache.spark.util.kvstore.LevelDB$1.iterator(LevelDB.java:201)
    at org.apache.spark.util.kvstore.KVStoreView.**closeableIterator**(KVStoreView.java:117)
    at org.apache.spark.status.AppStatusStore.lastStageAttempt(AppStatusStore.scala:122)
    at org.apache.spark.ui.jobs.JobPage.$anonfun$render$6(JobPage.scala:205)
    at org.apache.spark.status.AppStatusStore.asOption(AppStatusStore.scala:436)
    at org.apache.spark.ui.jobs.JobPage.$anonfun$render$5(JobPage.scala:205)
    at org.apache.spark.ui.jobs.JobPage.$anonfun$render$5$adapted(JobPage.scala:202)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.ui.jobs.JobPage.render(JobPage.scala:202)
    at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:89)
    at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.sparkproject.jetty.servlet.ServletHolder.handle(ServletHolder.java:873)
    at org.sparkproject.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
    at org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:95)
    at org.sparkproject.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
    at org.apache.spark.deploy.history.ApplicationCacheCheckFilter.doFilter(ApplicationCache.scala:404)
    at org.sparkproject.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
    at org.sparkproject.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
    at org.sparkproject.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
    at org.sparkproject.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
    at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
    at org.sparkproject.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
    at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
    at org.sparkproject.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
    at org.sparkproject.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
    at org.sparkproject.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:753)
    at org.sparkproject.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
    at org.sparkproject.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.sparkproject.jetty.server.Server.handle(Server.java:505)
    at org.sparkproject.jetty.server.HttpChannel.handle(HttpChannel.java:370)
    at org.sparkproject.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
    at org.sparkproject.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
    at org.sparkproject.jetty.io.FillInterest.fillable(FillInterest.java:103)
    at org.sparkproject.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
    at org.sparkproject.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
    at java.lang.Thread.run(Thread.java:748)
```
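
And a self-contained sketch of the `NoSuchElementException` experiment mentioned above (the `Entry` type, temp path, and demo object are all hypothetical):

```scala
import java.io.File
import java.nio.file.Files
import org.apache.spark.util.kvstore.{KVIndex, LevelDB}

// Bean-style type so the kvstore module's plain Jackson serializer can handle
// it; the @KVIndex getter provides the natural key the store requires.
class Entry {
  private var id: String = _
  @KVIndex def getId: String = id
  def setId(v: String): Unit = { id = v }
}

object IteratorAfterCloseDemo {
  def main(args: Array[String]): Unit = {
    val path = new File(Files.createTempDirectory("kvstore-demo").toFile, "test.ldb")
    val db = new LevelDB(path)
    Seq("a", "b", "c").foreach { k =>
      val e = new Entry
      e.setId(k)
      db.write(e)
    }
    // Leak the iterator on purpose, then close the DB underneath it.
    val iter = db.view(classOf[Entry]).closeableIterator()
    db.close()
    iter.next()  // per the observation above: throws NoSuchElementException
  }
}
```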