[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19786


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-21 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152347531
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -34,10 +34,10 @@ import org.apache.hadoop.fs.permission.FsAction
 import org.apache.hadoop.hdfs.DistributedFileSystem
 import org.apache.hadoop.hdfs.protocol.HdfsConstants
 import org.apache.hadoop.security.AccessControlException
+import org.fusesource.leveldbjni.internal.NativeDB
 
 import org.apache.spark.{SecurityManager, SparkConf, SparkException}
 import org.apache.spark.deploy.SparkHadoopUtil
-import org.apache.spark.deploy.history.config._
--- End diff --

This import is not used.


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-21 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152345297
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -570,7 +575,6 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
 }
 
 val logPath = fileStatus.getPath()
-logInfo(s"Replaying log path: $logPath")
--- End diff --

I removed this line because the function call `replay` below already logs a duplicate 
"Replaying log path: ..." message.


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-21 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152338077
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
--- End diff --

Well, the comment is copied. I think Vanzin means compatibility with older 
versions of the LevelDB release.
The previous case is about the version of the format that the history server 
writes into LevelDB.


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152322074
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
--- End diff --

`is either old`: I think the old-version problem is handled by the previous 
`case`?


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-20 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152040491
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
+logWarning(s"Failed to load disk store $path for app listing.", e)
--- End diff --

Yes, I thought about catching `DBException`, but in this PR I just followed 
the existing handling for opening the disk store for an app: 
https://github.com/apache/spark/blob/0ffa7c488fa8156e2a1aa282e60b7c36b86d8af8/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L307


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152039449
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
+logWarning(s"Failed to load disk store $path for app listing.", e)
--- End diff --

I see, shall we catch `DBException` instead of the general `Exception`?
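The suggestion above can be sketched with a minimal, self-contained example of catching a narrow exception type instead of the general `Exception`. Note that `CorruptStoreException` and `openStore` below are hypothetical stand-ins for illustration, not Spark's actual KVStore or leveldbjni API:

```scala
import java.io.File

// Illustrative stand-in for a store-corruption failure (not Spark's API).
class CorruptStoreException(msg: String) extends Exception(msg)

// Hypothetical open that fails when the on-disk store is corrupted.
def openStore(path: File, corrupt: Boolean): String = {
  if (corrupt) throw new CorruptStoreException(s"cannot open $path")
  s"store at $path"
}

def openWithRecovery(path: File, corrupt: Boolean): String =
  try {
    openStore(path, corrupt)
  } catch {
    // Catching the narrow type avoids deleting an otherwise valid store
    // when an unrelated, transient error (e.g. permissions) is thrown.
    case _: CorruptStoreException =>
      // The real code would delete the on-disk files here, then retry.
      openStore(path, corrupt = false)
  }

println(openWithRecovery(new File("/tmp/listing.ldb"), corrupt = true))
```

The trade-off being debated: a broad `case e: Exception` guarantees the history server starts, at the cost of possibly discarding a valid db on a transient failure; a narrow catch only recovers from failures known to mean corruption.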


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-20 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152038882
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
+logWarning(s"Failed to load disk store $path for app listing.", e)
--- End diff --

@cloud-fan No.
Currently the history server writes to a LevelDB file for each application (if 
its page is accessed), and there is one db called listing.ldb which stores the 
listing.
If the version does not match, then all the db files should be deleted.
In this PR, if only the listing db is corrupted, we just delete the listing db 
and recreate it.
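The two recovery scopes described above can be sketched as follows. This is an illustrative model only; `StoreFailure`, `filesToDelete`, and the case names are hypothetical, not Spark's actual types:

```scala
import java.io.File

// Hypothetical model of the two failure modes discussed above.
sealed trait StoreFailure
case object VersionMismatch extends StoreFailure   // handled by the first case
case object ListingCorrupted extends StoreFailure  // handled by the new case

def filesToDelete(storeDir: File, failure: StoreFailure): Seq[File] =
  failure match {
    case VersionMismatch =>
      // A format/version mismatch makes every db under the store
      // directory suspect: per-application dbs plus the listing db.
      Option(storeDir.listFiles()).map(_.toSeq).getOrElse(Nil)
    case ListingCorrupted =>
      // Only the listing cache is rebuilt (from the event logs);
      // the per-application dbs stay intact.
      Seq(new File(storeDir, "listing.ldb"))
  }

println(filesToDelete(new File("/tmp/spark-history"), ListingCorrupted))
```

Since listing.ldb is derived from the event logs, deleting and rebuilding it loses no data, which is what makes the narrower recovery safe.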


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r152037332
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
+logWarning(s"Failed to load disk store $path for app listing.", e)
--- End diff --

Shall we merge it into the previous case?


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-20 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r151988844
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
+logWarning(s"Failed to load disk store $path for app listing.", e)
--- End diff --

No, the file listing.ldb is built from the event logs and stores the list of 
applications (the history server home page). It is used like a cache.


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-20 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19786#discussion_r151983008
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -132,15 +132,20 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
   AppStatusStore.CURRENT_VERSION, logDir.toString())
 
 try {
-  open(new File(path, "listing.ldb"), metadata)
+  open(dbPath, metadata)
 } catch {
   // If there's an error, remove the listing database and any existing 
UI database
   // from the store directory, since it's extremely likely that 
they'll all contain
   // incompatible information.
   case _: UnsupportedStoreVersionException | _: 
MetadataMismatchException =>
 logInfo("Detected incompatible DB versions, deleting...")
 path.listFiles().foreach(Utils.deleteRecursively)
-open(new File(path, "listing.ldb"), metadata)
+open(dbPath, metadata)
+  case e: Exception =>
+// Get rid of the old data and re-create it. The store is either 
old or corrupted.
+logWarning(s"Failed to load disk store $path for app listing.", e)
--- End diff --

Is there a danger here that a transient error causes this to delete an 
otherwise valid file?


---




[GitHub] spark pull request #19786: [SPARK-22559][CORE]history server: handle excepti...

2017-11-19 Thread gengliangwang
GitHub user gengliangwang opened a pull request:

https://github.com/apache/spark/pull/19786

[SPARK-22559][CORE]history server: handle exception on opening corrupted 
listing.ldb

## What changes were proposed in this pull request?
Currently history server v2 fails to start if `listing.ldb` is corrupted.
This patch gets rid of the corrupted `listing.ldb` and re-creates it.
The exception handling follows [opening disk store for 
app](https://github.com/apache/spark/blob/0ffa7c488fa8156e2a1aa282e60b7c36b86d8af8/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L307)

## How was this patch tested?
manual test

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gengliangwang/spark listingException

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19786.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19786


commit 054cd7e208c908aa50cc8030b15feee9d2ce71e9
Author: Wang Gengliang 
Date:   2017-11-20T06:59:40Z

history server: handle exception on opening corrupted listing.ldb




---
