[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-12 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-642344689


   > Looks fine in general - one thing looks to be missed is, 
`LevelDBIterator.close()` should also call `db.closeIterator(this);` to make 
sure tracker can indicate it.
   
   @HeartSaVioR, thanks for your comments. Sure, updated; the iterator now 
notifies the tracker via `db.notifyIteratorClosed(this);`.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-11 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-642344689


   > Looks fine in general - one thing looks to be missed is, 
`LevelDBIterator.close()` should also call `db.closeIterator(this);` to make 
sure tracker can indicate it.
   
   @HeartSaVioR, thanks for your comments. I renamed `closeIterator` to 
`notifyIteratorClosed` and removed the `finalize` method, as it is no longer 
needed. With this PR, a `LevelDBIterator` is closed either by an explicit 
`close()` call or by `DB.close()`, so it cannot leak JNI resources after the 
DB is closed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-10 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-641752768


   > Then how about capture the exception and ask the user to increase the 
related configuration or try loading the page again?
   
   Because of the disk space limitation, we can only mitigate it by stopping 
the service and manually cleaning the disk cache, which is a little annoying.
   
   > What is the benefit of this PR to users?
   
   We actually run the Spark History Server as a long-running service to 
provide a diagnostics experience to others. The benefit is improved History 
Server reliability: on our side, we no longer need to periodically stop the 
service and manually clean its disk cache.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-641752768


   > Then how about capture the exception and ask the user to increase the 
related configuration or try loading the page again?
   
   Because of the disk space limitation, we can only mitigate it by stopping 
the service and manually cleaning the disk cache, which is a little annoying.
   
   > What is the benefit of this PR to users?
   
   I think the cause of the issue is a resource leak (a file handle on Windows 
that prevents `HistoryServerDiskManager` from releasing space) caused by a 
race condition, and my PR tries to fix it. We actually run the Spark History 
Server as a long-running service to provide a diagnostics experience to 
others. The benefit to us is that we no longer need to periodically stop and 
restart the HS.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-641708063


   > Of course relying on finalize is wrong, but I don't think the intent was 
to rely on finalize. Not closing these iterators is a bug. I see one case it 
clearly isn't; there may be others but haven't spotted them. It'd be nice to 
fix them all instead of the change in this patch but we may want to fix what we 
can see and also make the change in this patch for now.
   
   Thanks for your comments; I get your point.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-641681617


   > > Lots of the paths actually do close the iterator explicitly. And I'm 
still not sure how .toSeq doesn't consume it.
   > 
   > I'm not sure you're referring KV store iterator, as I don't see it. Could 
you please point out some places?
   > 
   > > Recall that the underlying iterator closes itself when it consumes all 
elements.
   > 
   > Maybe you refer to `CompletionIterator` which still need to call `close` 
explicitly on its implementation. That implementation is on core module and KV 
store module doesn't even know about such existence.
   > (And I barely remember that there has been some issue on iterator not 
fully consumed and completion function is missed to be called.)
   > 
   > > We can't get at the iterator that .asScala produces but we might be able 
to add a close() method to the view or something, but that's getting ugly
   > 
   > I agree current code is pretty much concise, but I'm really worrying that 
we don't consider resource leak as serious one and be the one of items in 
trade-offs. I'm not sure I can agree with the view that resource leak can be 
tolerated to make code beauty. The problem wasn't observed because Linux deals 
with the file deletion on usage - if we have been having extensive Windows 
users then the problem should be raised much earlier.
   > 
   > We can still play with try-with-resource pattern with the returned value 
of `closeableIterator` - it may still allow us to wrap with asScala, since it 
also implements Iterator. That forces us to live with a couple of more lines 
instead of one-liner but I'm not sure it matters.
   
   @srowen, I found more `AppStatusStore`-related stack traces in the log, and 
also one case for `closeableIterator`; I believe this is the case where not 
all elements are consumed before the DB is closed.
   BTW, from my experiment, if the iterator is not closed but the DB is, no 
data can be read from the iterator at all (`NoSuchElementException`). So I 
think closing iterators when the DB is closed should be safe; `AppStatusStore` 
is closed at the same time anyway.
   
   at org.apache.spark.status.AppStatusStore.resourceProfileInfo(AppStatusStore.scala:56)
   at org.apache.spark.status.AppStatusStore.jobsList(AppStatusStore.scala:60)
   at org.apache.spark.status.AppStatusStore.executorList(AppStatusStore.scala:86)
   at org.apache.spark.status.AppStatusStore.executorSummary(AppStatusStore.scala:421)
   at org.apache.spark.status.AppStatusStore.rddList(AppStatusStore.scala:425)
   at org.apache.spark.status.AppStatusStore.streamBlocksList(AppStatusStore.scala:503)
   at org.apache.spark.status.AppStatusStore.constructTaskDataList(AppStatusStore.scala:537)

   Iterator is not closed before db is closed: 1424827908, construct stack trace:
   java.lang.Throwable
   at org.apache.spark.util.kvstore.LevelDBIterator.<init>(LevelDBIterator.java:58)
   at org.apache.spark.util.kvstore.LevelDB$1.iterator(LevelDB.java:201)
   at org.apache.spark.util.kvstore.KVStoreView.closeableIterator(KVStoreView.java:117)
   at org.apache.spark.status.AppStatusStore.lastStageAttempt(AppStatusStore.scala:122)
   at org.apache.spark.ui.jobs.JobPage.$anonfun$render$6(JobPage.scala:205)
   at org.apache.spark.status.AppStatusStore.asOption(AppStatusStore.scala:436)
   at org.apache.spark.ui.jobs.JobPage.$anonfun$render$5(JobPage.scala:205)
   at org.apache.spark.ui.jobs.JobPage.$anonfun$render$5$adapted(JobPage.scala:202)
   at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
   at scala.collection.immutable.List.foreach(List.scala:392)
   at scala.collection.TraversableLike.map(TraversableLike.scala:238)
   at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
   at scala.collection.immutable.List.map(List.scala:298)
   at org.apache.spark.ui.jobs.JobPage.render(JobPage.scala:202)
   at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:89)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
   at org.sparkproject.jetty.servlet.ServletHolder.handle(ServletHolder.java:873)
   at org.sparkproject.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
   at org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:95)
   at org.sparkproject.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
   at org.apache.spark.deploy.history.ApplicationCacheCheckFilter.doFilter(ApplicationCache.scala:404)
   at org.sparkproject.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
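For reference, the try-with-resources pattern suggested earlier for `closeableIterator` would look roughly like this (a sketch with a simplified stand-in iterator, not the real `LevelDBIterator`):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TwrSketch {
    // simplified stand-in for LevelDBIterator: an Iterator that is AutoCloseable
    static class CloseableIterator<T> implements Iterator<T>, AutoCloseable {
        private final Iterator<T> delegate;
        boolean closed = false;
        CloseableIterator(Iterator<T> delegate) { this.delegate = delegate; }
        public boolean hasNext() { return delegate.hasNext(); }
        public T next() { return delegate.next(); }
        public void close() { closed = true; } // would release the native handle
    }
    static int sum(List<Integer> data) {
        // the iterator is closed even if consumption stops early or throws
        try (CloseableIterator<Integer> it = new CloseableIterator<>(data.iterator())) {
            int s = 0;
            while (it.hasNext()) s += it.next();
            return s;
        }
    }
    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3))); // 6
    }
}
```

The cost is a couple of extra lines per call site compared with the one-liner `.asScala` style, but the close is then guaranteed on every path.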
   

[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox


zhli1142015 edited a comment on pull request #28769:
URL: https://github.com/apache/spark/pull/28769#issuecomment-641655009


   If we look at the comment below, I think that even when the caller calls 
`close`, there may still be a leak.
   
https://github.com/apache/spark/blob/master/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java
```java
/**
 * Closes the given iterator if the DB is still open. Trying to close a JNI LevelDB handle
 * with a closed DB can cause JVM crashes, so this ensures that situation does not happen.
 */
void closeIterator(LevelDBIterator it) throws IOException {
  synchronized (this._db) {
    DB _db = this._db.get();
    if (_db != null) {
      it.close();
    }
  }
}
```
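Note that under this guard, a `close` issued after the DB has already been closed silently does nothing, so the iterator's native handle is never released — a minimal model of that control flow (hypothetical stand-in classes, not the real kvstore code):

```java
import java.util.concurrent.atomic.AtomicReference;

public class GuardSketch {
    // stand-in for the iterator's JNI resource
    static class NativeHandle { boolean released = false; }
    // simplified stand-in for LevelDB's reference-guarded DB handle
    static class Db {
        final AtomicReference<Object> _db = new AtomicReference<>(new Object());
        void close() { _db.set(null); }
        // mirrors the quoted closeIterator: only closes while the DB is open
        void closeIterator(Iter it) {
            synchronized (this._db) {
                if (_db.get() != null) it.release();
            }
        }
    }
    static class Iter {
        final NativeHandle handle = new NativeHandle();
        void release() { handle.released = true; }
    }
    public static void main(String[] args) {
        Db db = new Db();
        Iter it = new Iter();
        db.close();           // DB closed first...
        db.closeIterator(it); // ...so this is a no-op and the handle leaks
        System.out.println(it.handle.released); // false
    }
}
```

This is exactly why closing all tracked iterators inside `DB.close()` itself, as this PR does, is the safe point to release them.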



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org