[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-04-27 Thread jisookim0513
Github user jisookim0513 closed the pull request at:

https://github.com/apache/spark/pull/16714



[GitHub] spark issue #16714: [SPARK-16333][Core] Enable EventLoggingListener to log l...

2017-04-27 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/16714
  
Ok, not including the updated blocks in task metrics reduced the size of
our event logs. But I am closing this PR, as the current implementation doesn't
seem to be the right approach. Thanks for the input.



[GitHub] spark issue #16714: [SPARK-16333][Core] Enable EventLoggingListener to log l...

2017-04-27 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/16714
  
@vanzin @ajbozarth if you guys think having an option to skip logging
internal accumulators (in my case I don't use the SQL UI) and completely
getting rid of updated block statuses are not needed, I can close this PR.



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-04-27 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16714#discussion_r113776549
  
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -343,10 +376,14 @@ private[spark] object JsonProtocol {
       ("Bytes Written" -> taskMetrics.outputMetrics.bytesWritten) ~
         ("Records Written" -> taskMetrics.outputMetrics.recordsWritten)
     val updatedBlocks =
-      JArray(taskMetrics.updatedBlockStatuses.toList.map { case (id, status) =>
-        ("Block ID" -> id.toString) ~
-          ("Status" -> blockStatusToJson(status))
-      })
+      if (omitUpdatedBlockStatuses) {
--- End diff --

@vanzin @ajbozarth #17412 gets rid of updated block statuses from the
accumulable but not from task metrics. If you think it's ok not to have an
option to get rid of updated block statuses, then I can just get rid of them
here.
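
For reference, a minimal sketch of what the completed branch above could look like; the `JArray(Nil)` else-arm is my assumption about the omitted part of the hunk, not the patch's exact code:

    // Sketch: when the flag is set, write an empty array instead of the
    // verbose per-block statuses; otherwise keep the original serialization.
    val updatedBlocks =
      if (omitUpdatedBlockStatuses) {
        JArray(Nil)
      } else {
        JArray(taskMetrics.updatedBlockStatuses.toList.map { case (id, status) =>
          ("Block ID" -> id.toString) ~
            ("Status" -> blockStatusToJson(status))
        })
      }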



[GitHub] spark issue #16714: [SPARK-16333][Core] Enable EventLoggingListener to log l...

2017-04-27 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/16714
  
I would still like not to have internal accumulators in the event logs, as
well as updated block statuses. @vanzin would you be ok with eliminating all
internal accumulators and having an option to skip logging updated block statuses?



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-28 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16714#discussion_r103569094
  
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -62,18 +62,21 @@ private[spark] object JsonProtocol {
   * JSON serialization methods for SparkListenerEvents |
   * -------------------------------------------------- */

-  def sparkEventToJson(event: SparkListenerEvent): JValue = {
+  def sparkEventToJson(
+      event: SparkListenerEvent,
+      omitInternalAccums: Boolean = false,
+      omitUpdatedBlockStatuses: Boolean = false): JValue = {
     event match {
--- End diff --

stageSubmitted/stageCompleted/jobStart should use `omitInternalAccums`, but
not jobEnd; jobEnd's interface hasn't changed. `omitUpdatedBlockStatuses` is
intended to be used only for taskEnd, because that's when updated block statuses
are reported. Thanks for catching this; I will add omitInternalAccums to
stageSubmitted and jobStart (a sketch of the intended dispatch is below).
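
A sketch of that intended dispatch in `sparkEventToJson`; the stageSubmitted/stageCompleted cases follow the diff quoted in another comment in this thread, while the jobStart and taskEnd signatures are assumptions based on the plan above:

    event match {
      case stageSubmitted: SparkListenerStageSubmitted =>
        stageSubmittedToJson(stageSubmitted, omitInternalAccums)
      case stageCompleted: SparkListenerStageCompleted =>
        stageCompletedToJson(stageCompleted, omitInternalAccums)
      case jobStart: SparkListenerJobStart =>
        jobStartToJson(jobStart, omitInternalAccums)
      case taskEnd: SparkListenerTaskEnd =>
        // taskEnd is the only event that reports updated block statuses
        taskEndToJson(taskEnd, omitInternalAccums, omitUpdatedBlockStatuses)
      // jobEnd keeps its existing one-argument call
      case jobEnd: SparkListenerJobEnd =>
        jobEndToJson(jobEnd)
      case _ => parse(mapper.writeValueAsString(event))
    }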



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-28 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16714#discussion_r103565658
  
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -97,61 +100,80 @@ private[spark] object JsonProtocol {
       case logStart: SparkListenerLogStart =>
         logStartToJson(logStart)
       case metricsUpdate: SparkListenerExecutorMetricsUpdate =>
-        executorMetricsUpdateToJson(metricsUpdate)
+        executorMetricsUpdateToJson(metricsUpdate, omitInternalAccums)
       case blockUpdated: SparkListenerBlockUpdated =>
         throw new MatchError(blockUpdated)  // TODO(ekl) implement this
       case _ => parse(mapper.writeValueAsString(event))
     }
   }

-  def stageSubmittedToJson(stageSubmitted: SparkListenerStageSubmitted): JValue = {
-    val stageInfo = stageInfoToJson(stageSubmitted.stageInfo)
+  def stageSubmittedToJson(
+      stageSubmitted: SparkListenerStageSubmitted,
+      omitInternalAccums: Boolean = false): JValue = {
+    val stageInfo = stageInfoToJson(stageSubmitted.stageInfo, omitInternalAccums)
     val properties = propertiesToJson(stageSubmitted.properties)
     ("Event" -> SPARK_LISTENER_EVENT_FORMATTED_CLASS_NAMES.stageSubmitted) ~
       ("Stage Info" -> stageInfo) ~
       ("Properties" -> properties)
   }

-  def stageCompletedToJson(stageCompleted: SparkListenerStageCompleted): JValue = {
+  def stageCompletedToJson(
+      stageCompleted: SparkListenerStageCompleted,
+      omitInternalAccums: Boolean = false): JValue = {
     val stageInfo = stageInfoToJson(stageCompleted.stageInfo)
--- End diff --

Yes, thank you for catching it. I think it got omitted while I was merging 
stuff. Will fix this.
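
The fix is a one-line change, passing the flag through the same way the stageSubmitted case above does:

    val stageInfo = stageInfoToJson(stageCompleted.stageInfo, omitInternalAccums)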



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-28 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16714#discussion_r103564868
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -64,6 +64,12 @@ private[spark] class EventLoggingListener(
   private val shouldOverwrite = sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", false)
   private val outputBufferSize = sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
+  // To reduce the size of event logs, we can omit logging all of internal accumulables for metrics.
+  private val omitInternalAccumulables =
--- End diff --

@vanzin I added CPU time because back then I was pulling stage metrics from
the history server and needed CPU time. Here's the PR for the change:
https://github.com/apache/spark/pull/10212. Looking at the code, CPU time
should be there, so I think something is wrong on my end. That's a separate
problem, though, and I don't think the CPU time metric should increase the size
of event logs much. I can't think of a use case for internal accumulables then,
so I think it makes sense to delete this. If anyone wants to use internal
accumulables for stage metrics, they should be able to capture them when a
stage finishes, rather than from the History Server. A sketch of that approach
follows.
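
A minimal sketch of capturing stage-level accumulables live with a listener, instead of reading them back from the History Server; the listener API is standard Spark, and the printing is only illustrative:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted}

    // Illustrative listener: read a stage's accumulables as the stage
    // completes, so nothing has to be recovered from the event logs later.
    class StageAccumulablesListener extends SparkListener {
      override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
        val info = stageCompleted.stageInfo
        info.accumulables.values.foreach { acc =>
          println(s"stage ${info.stageId}: ${acc.name.getOrElse("unnamed")} = ${acc.value.getOrElse("")}")
        }
      }
    }

    // Register it with: sc.addSparkListener(new StageAccumulablesListener)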



[GitHub] spark issue #12436: [SPARK-14649][CORE] DagScheduler should not run duplicat...

2017-02-21 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/12436
  
@sitalkedia have you had a chance to work on this issue and open a new PR?



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-15 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16714#discussion_r101442539
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -64,6 +64,12 @@ private[spark] class EventLoggingListener(
   private val shouldOverwrite = sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", false)
   private val outputBufferSize = sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
+  // To reduce the size of event logs, we can omit logging all of internal accumulables for metrics.
+  private val omitInternalAccumulables =
+    sparkConf.getBoolean("spark.eventLog.omitInternalAccumulables", false)
+  // To reduce the size of event logs, we can omit logging "Updated Block Statuses" metric.
+  private val omitUpdatedBlockStatuses =
+    sparkConf.getBoolean("spark.eventLog.omitUpdatedBlockStatuses", false)
--- End diff --

I am not sure if updated block statuses are used for the UI. At first I
wondered whether the information was used to reconstruct the Storage page, but
I checked the usage of `TaskMetrics.updatedBlockStatuses` and it doesn't seem
to be used anywhere except when the task metrics are converted to a JSON
object. Actually, I am not sure the Storage tab is working at all, unless I am
missing something. I don't think `/applications/[app-id]/storage/rdd` returns
any meaningful information.



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-15 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16714#discussion_r101438216
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -64,6 +64,12 @@ private[spark] class EventLoggingListener(
   private val shouldOverwrite = sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", false)
   private val outputBufferSize = sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
+  // To reduce the size of event logs, we can omit logging all of internal accumulables for metrics.
+  private val omitInternalAccumulables =
--- End diff --

I don't think this information is used to reconstruct the job UI. I am not sure
how it got included in event logs, but I think some people might be using it to
get internal metrics for a stage from the history server through its REST API.
For example, the CPU time metric is not included in the stage metrics you get
by querying the history server endpoint
`/applications/[app-id]/stages/[stage-id]`.
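
For what it's worth, a sketch of that access pattern; the `/api/v1` base path is the history server's standard REST prefix, while the host, port, and IDs are placeholders:

    import scala.io.Source

    // Illustrative only: fetch a stage's metrics from the history server's
    // REST API and print the raw JSON.
    object StageMetricsFetcher {
      def main(args: Array[String]): Unit = {
        val appId = "app-20170115120000-0001" // placeholder application ID
        val stageId = 3                       // placeholder stage ID
        val url = s"http://history-server:18080/api/v1/applications/$appId/stages/$stageId"
        println(Source.fromURL(url).mkString)
      }
    }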



[GitHub] spark issue #16714: [SPARK-16333][Core] Enable EventLoggingListener to log l...

2017-01-26 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/16714
  
Not sure why the second test build failed in the PySpark unit tests; I only
changed comments.



[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-01-26 Thread jisookim0513
GitHub user jisookim0513 opened a pull request:

https://github.com/apache/spark/pull/16714

[SPARK-16333][Core] Enable EventLoggingListener to log less

## What changes were proposed in this pull request?

Starting from Spark 2.0, task metrics are reported in the form of accumulators.
This is good, but it also causes excessive event logs because the metrics are
logged twice (once under "Accumulators" and once under "Task Metrics"). For
applications with lots of tasks, the event logs can reach tens of GB, and it is
not feasible for the Spark History Server to parse such logs and reconstruct
the job UI.

This PR adds an option for EventLoggingListener not to log the internal
accumulators that back task metrics. It also adds an option not to log the
"Updated Block Statuses" metric, which is quite verbose and might not be needed
on some occasions (see the configuration sketch below).
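
A minimal sketch of opting in to both flags when building the SparkConf; the two `spark.eventLog.omit*` property names come from this patch's EventLoggingListener changes, and `spark.eventLog.enabled` is standard Spark configuration:

    import org.apache.spark.SparkConf

    // Both new flags default to false, so existing event logs are unchanged
    // unless a user explicitly opts in.
    val conf = new SparkConf()
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.omitInternalAccumulables", "true")  // skip internal accumulators
      .set("spark.eventLog.omitUpdatedBlockStatuses", "true")  // skip updated block statuses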

After updating to Spark 2.0, the event log of one of our applications jumped
from ~1 GB to over 40 GB. With this patch, event log sizes went back to roughly
their previous sizes under Spark 1.5.2.

## How was this patch tested?

Unit tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/metamx/spark enable-less-eventlogs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16714.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16714







[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-09-23 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/10212
  
@vanzin thanks a lot!



[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-09-23 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/10212
  
@vanzin thanks, I was about to ask for a retest :)



[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80184799
  
--- Diff: core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala ---
@@ -1097,7 +1100,9 @@ private[spark] object JsonProtocolSuite extends Assertions {
   |  },
   |  "Task Metrics": {
   |    "Executor Deserialize Time": 300,
+  |    "Executor Deserialize CPU Time": 0,
--- End diff --

Yeah, I tested it on my testing cluster, but this makes sense. I will add
non-zero CPU times by setting the CPU times to the same values as the given
wall times.



[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80184744
  
--- Diff: core/src/test/resources/HistoryServerExpectations/complete_stage_list_json_expectation.json ---
@@ -6,6 +6,7 @@
   "numCompleteTasks" : 8,
   "numFailedTasks" : 0,
   "executorRunTime" : 162,
+  "executorCpuTime" : 0,
--- End diff --

Oh no, these are expected outputs. I think the inputs are stored under 
`src/test/resources/spark-events`.



[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80156532
  
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -759,7 +761,15 @@ private[spark] object JsonProtocol {
       return metrics
     }
     metrics.setExecutorDeserializeTime((json \ "Executor Deserialize Time").extract[Long])
+    metrics.setExecutorDeserializeCpuTime((json \ "Executor Deserialize CPU Time") match {
+      case JNothing => 0
+      case x => x.extract[Long]}
+    )
     metrics.setExecutorRunTime((json \ "Executor Run Time").extract[Long])
+    metrics.setExecutorCpuTime((json \ "Executor CPU Time") match {
+      case JNothing => 0
+      case x => x.extract[Long]}
--- End diff --

Will fix this.
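
A sketch of the tidier shape the fix could take, assuming a `jsonOption`-style helper that maps `JNothing` to `None`, as JsonProtocol does for other optional fields:

    // Backward-compatible reads: event logs from older Spark versions lack
    // the CPU time fields, so fall back to 0 when a key is absent.
    metrics.setExecutorDeserializeCpuTime(
      jsonOption(json \ "Executor Deserialize CPU Time").map(_.extract[Long]).getOrElse(0L))
    metrics.setExecutorCpuTime(
      jsonOption(json \ "Executor CPU Time").map(_.extract[Long]).getOrElse(0L))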



[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80156386
  
--- Diff: core/src/test/resources/HistoryServerExpectations/complete_stage_list_json_expectation.json ---
@@ -6,6 +6,7 @@
   "numCompleteTasks" : 8,
   "numFailedTasks" : 0,
   "executorRunTime" : 162,
+  "executorCpuTime" : 0,
--- End diff --

Hmm, I thought HistoryServerSuite runs with the included log files (which don't
have CPU time). So this is an expected result, since those logs don't have CPU
time fields.



[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80155278
  
--- Diff: core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala ---
@@ -1097,7 +1100,9 @@ private[spark] object JsonProtocolSuite extends Assertions {
   |  },
   |  "Task Metrics": {
   |    "Executor Deserialize Time": 300,
+  |    "Executor Deserialize CPU Time": 0,
--- End diff --

AFAIK, JsonProtocolSuite creates a JSON string from the event created by
`makeTaskMetrics()`:
`makeTaskMetrics(300L, 400L, 500L, 600L, 700, 800, hasHadoopInput = true, hasOutput = false)`.
I tried changing `makeTaskMetrics()` to accept deserialize CPU time and CPU
time as arguments, but that ended up violating scalastyle by having more than
10 parameters.
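
One way around the parameter cap would be a small parameter object; the case class and regrouped signature below are hypothetical, not from the suite:

    import org.apache.spark.executor.TaskMetrics

    // Hypothetical bundle pairing each wall time with its CPU counterpart,
    // keeping makeTaskMetrics under scalastyle's 10-parameter limit.
    case class TestTimes(
        deserializeTime: Long,
        deserializeCpuTime: Long,
        runTime: Long,
        cpuTime: Long)

    // Sketch of the reworked signature; the body would build TaskMetrics from
    // times.deserializeTime, times.cpuTime, etc., exactly as the old helper did.
    def makeTaskMetrics(
        times: TestTimes,
        resultSize: Long,
        bytesRead: Long,
        bytesWritten: Long,
        hasHadoopInput: Boolean,
        hasOutput: Boolean): TaskMetrics = ???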



[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-09-20 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/10212
  
@vanzin could you merge this? Thanks!



[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-09-20 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/10212
  
@vanzin this PR had passed all tests. Could you merge it if I fix the 
recently introduced conflicts?



[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-08-19 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/10212
  
@vanzin I updated the patch



[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-08-07 Thread jisookim0513
Github user jisookim0513 commented on the issue:

https://github.com/apache/spark/pull/10212
  
@vanzin sure will do



[GitHub] spark pull request: [SPARK-12221] add cpu time to metrics

2016-03-03 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r54951079
  
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -718,6 +719,7 @@ private[spark] object JsonProtocol {
     metrics.setHostname((json \ "Host Name").extract[String])
     metrics.setExecutorDeserializeTime((json \ "Executor Deserialize Time").extract[Long])
     metrics.setExecutorRunTime((json \ "Executor Run Time").extract[Long])
+    metrics.setExecutorCpuTime((json \ "Executor CPU Time").extract[Long])
--- End diff --

Yeah, it won't be able to deserialize history logs from an earlier version.
Would it be better to make this backward-compatible? (Sorry for the super late
response.)



[GitHub] spark pull request: add cpu time to metrics

2015-12-08 Thread jisookim0513
GitHub user jisookim0513 opened a pull request:

https://github.com/apache/spark/pull/10212

add cpu time to metrics

Currently task metrics don't support executor CPU time, so there's no way 
to calculate how much CPU time a stage/task took from History Server metrics. 
This PR enables reporting CPU time.
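
A minimal sketch of how executor CPU time can be sampled per task on the JVM; this is the standard java.lang.management API, and whether the patch wires it up exactly this way is not shown in this thread:

    import java.lang.management.ManagementFactory

    // getCurrentThreadCpuTime returns nanoseconds of CPU consumed by the
    // current thread, so sampling it around the task body yields CPU time.
    val threadMXBean = ManagementFactory.getThreadMXBean
    val startCpu =
      if (threadMXBean.isCurrentThreadCpuTimeSupported) threadMXBean.getCurrentThreadCpuTime else 0L

    // ... run the task body ...

    val cpuTimeNs =
      if (threadMXBean.isCurrentThreadCpuTimeSupported) threadMXBean.getCurrentThreadCpuTime - startCpu else 0L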

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jisookim0513/spark add-cpu-time-metric

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10212


commit 30752cb9b3e91366fe2ac16ca769e8fc7e8dcf54
Author: jisookim <jisookim0...@gmail.com>
Date:   2015-12-08T23:44:39Z

add cpu time to metrics



