[GitHub] [spark] shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] Task summary metrics are wrong when there are running tasks

2019-08-08 Thread GitBox
shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] 
Task summary metrics are wrong when there are running tasks
URL: https://github.com/apache/spark/pull/25369#discussion_r312271621
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -156,7 +162,8 @@ private[spark] class AppStatusStore(
 // cheaper for disk stores (avoids deserialization).
 val count = {
   Utils.tryWithResource(
-if (store.isInstanceOf[InMemoryStore]) {
+if (isInMemoryStore) {
 
 Review comment:
   @vanzin In Spark-2.2, as per this line, it doesn't count metrics of a 
running task. That was the intend behind #23088 (which eventually fixes this 
issue as well)
   
https://github.com/apache/spark/blob/7c7d7f6a878b02ece881266ee538f3e1443aa8c1/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala#L340
   
   verified locally also, 
`sc.parallelize(1 to 160, 1).map( i => Thread.sleep(1000)).collect()`
   
   ![Screenshot 2019-08-09 at 4 03 09 
AM](https://user-images.githubusercontent.com/23054875/62742208-4beb1a80-ba5b-11e9-97a7-b01f9515b5fd.png)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] Task summary metrics are wrong when there are running tasks

2019-08-08 Thread GitBox
shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] 
Task summary metrics are wrong when there are running tasks
URL: https://github.com/apache/spark/pull/25369#discussion_r312271621
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -156,7 +162,8 @@ private[spark] class AppStatusStore(
 // cheaper for disk stores (avoids deserialization).
 val count = {
   Utils.tryWithResource(
-if (store.isInstanceOf[InMemoryStore]) {
+if (isInMemoryStore) {
 
 Review comment:
   @vanzin In Spark-2.2, as per this line, it doesn't count metrics of a 
running task. 
https://github.com/apache/spark/blob/7c7d7f6a878b02ece881266ee538f3e1443aa8c1/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala#L340
   
   verified locally also, 
`sc.parallelize(1 to 160, 1).map( i => Thread.sleep(1000)).collect()`
   
   ![Screenshot 2019-08-09 at 4 03 09 
AM](https://user-images.githubusercontent.com/23054875/62742208-4beb1a80-ba5b-11e9-97a7-b01f9515b5fd.png)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] Task summary metrics are wrong when there are running tasks

2019-08-07 Thread GitBox
shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] 
Task summary metrics are wrong when there are running tasks
URL: https://github.com/apache/spark/pull/25369#discussion_r311719400
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -156,7 +162,8 @@ private[spark] class AppStatusStore(
 // cheaper for disk stores (avoids deserialization).
 val count = {
   Utils.tryWithResource(
-if (store.isInstanceOf[InMemoryStore]) {
+if (isInMemoryStore) {
 
 Review comment:
   @gengliangwang If I understand correctly,  ~~#23008~~ #23088 is actually 
fixing this issue. right ?(At least history server case).
   Because, `count` is always filtering out the running tasks, as 
`ExecutorRunTime` will be defined for only finished tasks. But `scan tasks` 
contains all the tasks, including running tasks. 
   
https://github.com/apache/spark/blob/a59fdc4b5783be591a236bfc60d1107caa818412/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L169
   
   
https://github.com/apache/spark/blob/a59fdc4b5783be591a236bfc60d1107caa818412/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L263-L265
   
   So, reverting ~~23008~~ 23088 and hence this PR, would not fix the issue? 
Please correct me if I am wrong.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] Task summary metrics are wrong when there are running tasks

2019-08-07 Thread GitBox
shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] 
Task summary metrics are wrong when there are running tasks
URL: https://github.com/apache/spark/pull/25369#discussion_r311719400
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -156,7 +162,8 @@ private[spark] class AppStatusStore(
 // cheaper for disk stores (avoids deserialization).
 val count = {
   Utils.tryWithResource(
-if (store.isInstanceOf[InMemoryStore]) {
+if (isInMemoryStore) {
 
 Review comment:
   @gengliangwang If I understand correctly, #23088 is actually fixing this 
issue. right ?(At least history server case).
   Because, `count` is always filtering out the running tasks, as 
`ExecutorRunTime` will be defined for only finished tasks. But `scan tasks` 
contains all the tasks, including running tasks. 
   
https://github.com/apache/spark/blob/a59fdc4b5783be591a236bfc60d1107caa818412/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L169
   
   
https://github.com/apache/spark/blob/a59fdc4b5783be591a236bfc60d1107caa818412/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L263-L265
   
   So, reverting 23088 and hence this PR, would not fix the issue? Please 
correct me if I am wrong.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] Task summary metrics are wrong when there are running tasks

2019-08-07 Thread GitBox
shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] 
Task summary metrics are wrong when there are running tasks
URL: https://github.com/apache/spark/pull/25369#discussion_r311719400
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -156,7 +162,8 @@ private[spark] class AppStatusStore(
 // cheaper for disk stores (avoids deserialization).
 val count = {
   Utils.tryWithResource(
-if (store.isInstanceOf[InMemoryStore]) {
+if (isInMemoryStore) {
 
 Review comment:
   @gengliangwang If I understand correctly, #23008 is actually fixing this 
issue. right ?(At least history server case).
   Because, `count` is always filtering out the running tasks, as 
`ExecutorRunTime` will be defined for only finished tasks. But `scan tasks` 
contains all the tasks, including running tasks. 
   
https://github.com/apache/spark/blob/a59fdc4b5783be591a236bfc60d1107caa818412/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L169
   
   
https://github.com/apache/spark/blob/a59fdc4b5783be591a236bfc60d1107caa818412/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L263-L265
   
   So, reverting 23008 and hence this PR, would not fix the issue? Please 
correct me if I am wrong.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] Task summary metrics are wrong when there are running tasks

2019-08-07 Thread GitBox
shahidki31 commented on a change in pull request #25369: [SPARK-28638][WebUI] 
Task summary metrics are wrong when there are running tasks
URL: https://github.com/apache/spark/pull/25369#discussion_r311479709
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -156,7 +162,8 @@ private[spark] class AppStatusStore(
 // cheaper for disk stores (avoids deserialization).
 val count = {
   Utils.tryWithResource(
-if (store.isInstanceOf[InMemoryStore]) {
+if (isInMemoryStore) {
 
 Review comment:
   Thanks @gengliangwang . History server uses `InMemory` store by default. For 
Disk store case,  this isn't an optimal way for finding success task. I am yet 
to raise a PR for supporting for Disk store case.  
   
   I think you just need to add,
`isInMemoryStore: Boolean =  listener.isDefined || 
store.isInstanceOf[InMemoryStore]`
   
   Will test your code with all the scenarios.
   Any case, this would be temporary as we need to support for Diskstore also.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org