[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-20 Thread ksakellis
Github user ksakellis commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70740442
  
@pwendell yeah, looks good. Sorry I was also away. 





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-19 Thread sryza
Github user sryza commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70607997
  
@pwendell sorry, was out for the weekend, but this LGTM.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-19 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70466063
  
Okay I'm pulling this in with the minor style fix.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4020





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70395176
  
LGTM - @sryza  and @ksakellis  look okay to you?





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-17 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23130489
  
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -44,42 +44,62 @@ class TaskMetrics extends Serializable {
   /**
    * Host's name the task runs on
    */
-  var hostname: String = _
-
+  private var _hostname: String = _
+  def hostname = _hostname
+  private[spark] def setHostname(value : String) = _hostname = value
--- End diff --

Should be `value: String` with no space (not sure why our style checks didn't catch this).
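
For reference, a minimal sketch of the suggested fix, i.e. the same setter from the quoted diff with the space before the colon removed:

    private[spark] def setHostname(value: String) = _hostname = value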





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70311854
  
  [Test build #25675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25675/consoleFull) for PR 4020 at commit [`39f3810`](https://github.com/apache/spark/commit/39f3810f6583cc257635b295205ae94b0cdee053).
  * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-16 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70311929
  
Updated to resolve merge conflicts with #3120, fix some small style issues, 
and correctly use setters instead of inc/dec in certain places in the code.






[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70321469
  
  [Test build #25675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25675/consoleFull) for PR 4020 at commit [`39f3810`](https://github.com/apache/spark/commit/39f3810f6583cc257635b295205ae94b0cdee053).
  * This patch **passes all tests**.
  * This patch merges cleanly.
  * This patch adds the following public classes _(experimental)_:
    * `case class SparkListenerJobEnd(`






[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70321476
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25675/
Test PASSed.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70189111
  
  [Test build #25621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25621/consoleFull) for PR 4020 at commit [`e446287`](https://github.com/apache/spark/commit/e446287b866eedeb74e68c2f800acf29250d2a76).
  * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread sryza
Github user sryza commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70189173
  
Had a look over this, and it mostly looks good, but it looks like there are many places where the patch replaces assignment with incrementing.  It would be good to take a close look and pull all of these out.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70180180
  
  [Test build #25619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25619/consoleFull) for PR 4020 at commit [`6444391`](https://github.com/apache/spark/commit/644439144dba2f1a2c0cac29da16a0fc7a52b109).
  * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23056056
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
   val serviceTime = System.currentTimeMillis() - taskStart
   val metrics = attemptedTask.flatMap(t => t.metrics)
   for (m <- metrics) {
-    m.executorRunTime = serviceTime
-    m.jvmGCTime = gcTime - startGCTime
+    m.incExecutorRunTime(serviceTime)
--- End diff --

I'm not sure whether the original behavior is necessarily correct. If the 
goal is to track total run time for the task, why does it make sense to do an 
assignment anywhere instead of an accumulation? 
 







[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70186438
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25619/
Test FAILed.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70186432
  
  [Test build #25619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25619/consoleFull) for PR 4020 at commit [`6444391`](https://github.com/apache/spark/commit/644439144dba2f1a2c0cac29da16a0fc7a52b109).
  * This patch **fails Spark unit tests**.
  * This patch merges cleanly.
  * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread sryza
Github user sryza commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23055589
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
   val serviceTime = System.currentTimeMillis() - taskStart
   val metrics = attemptedTask.flatMap(t => t.metrics)
   for (m <- metrics) {
-    m.executorRunTime = serviceTime
-    m.jvmGCTime = gcTime - startGCTime
+    m.incExecutorRunTime(serviceTime)
--- End diff --

Will this replace `=` with `+=`?  This applies in a couple of places above as well.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70214178
  
Hi Patrick - I did look over #3120. That one will definitely need to be merged first, and then we can finish this. Thanks.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread sryza
Github user sryza commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23057341
  
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable {
   /**
    * Number of bytes written for the shuffle by this task
    */
-  @volatile var shuffleBytesWritten: Long = _
-
+  @volatile private var _shuffleBytesWritten: Long = _
+  def shuffleBytesWritten = _shuffleBytesWritten
+  private[spark] def incShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
+  private[spark] def decShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
+
   /**
    * Time the task spent blocking on writes to disk or buffer cache, in nanoseconds
    */
-  @volatile var shuffleWriteTime: Long = _
--- End diff --

`_shuffleWriteTime` can only get incremented from a single thread.  It's 
marked volatile so that other threads can read from it. Using an `AtomicLong` 
would unnecessarily bloat the size of the task results.  I probably should have 
documented this access pattern better.
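
A minimal sketch of the single-writer pattern described here, with a hypothetical wrapper class name (`SingleWriterMetric`); the field and accessor names mirror the quoted diff, and only the task thread is assumed to call the increment:

    class SingleWriterMetric extends Serializable {
      // Written only by the task thread; @volatile makes the latest value
      // visible to other threads (e.g. a heartbeat thread) that only read it.
      @volatile private var _shuffleWriteTime: Long = 0L
      def shuffleWriteTime: Long = _shuffleWriteTime
      private[spark] def incShuffleWriteTime(value: Long): Unit = {
        _shuffleWriteTime += value  // safe only because there is a single writer
      }
    }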





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23057946
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
   val serviceTime = System.currentTimeMillis() - taskStart
   val metrics = attemptedTask.flatMap(t => t.metrics)
   for (m <- metrics) {
-    m.executorRunTime = serviceTime
-    m.jvmGCTime = gcTime - startGCTime
+    m.incExecutorRunTime(serviceTime)
--- End diff --

Understood. Do you think it would be best to just add a setter function to 
address these cases directly?





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread sryza
Github user sryza commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23057216
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
   val serviceTime = System.currentTimeMillis() - taskStart
   val metrics = attemptedTask.flatMap(t => t.metrics)
   for (m <- metrics) {
-    m.executorRunTime = serviceTime
-    m.jvmGCTime = gcTime - startGCTime
+    m.incExecutorRunTime(serviceTime)
--- End diff --

Task GC time is measured as (total time spent in GC since the process started) minus (that same total, snapshotted when the task started).  Incrementing every time we update the metric would end up double-counting certain time regions.  A few other metrics work this way as well.  If you do notice issues with the logic, it might be good to propose fixes in a separate JIRA so that we can give them careful consideration on their own.
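
A small sketch of the snapshot-and-subtract bookkeeping described above, using the standard JMX GC beans; the helper name `totalGcTime` and the surrounding flow are simplified for illustration, not copied from the Executor code:

    import java.lang.management.ManagementFactory
    import scala.collection.JavaConverters._

    // Total GC time (in ms) accumulated by this JVM since the process started.
    def totalGcTime(): Long =
      ManagementFactory.getGarbageCollectorMXBeans.asScala.map(_.getCollectionTime).sum

    val startGCTime = totalGcTime()               // snapshot taken when the task starts
    // ... the task runs ...
    val taskGcTime = totalGcTime() - startGCTime  // assigned (not accumulated) into the metric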





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70195240
  
  [Test build #25621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25621/consoleFull) for PR 4020 at commit [`e446287`](https://github.com/apache/spark/commit/e446287b866eedeb74e68c2f800acf29250d2a76).
  * This patch **passes all tests**.
  * This patch merges cleanly.
  * This patch adds the following public classes _(experimental)_:
    * `class UDFRegistration (sqlContext: SQLContext) extends org.apache.spark.Logging `






[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70195248
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25621/
Test PASSed.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23060808
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
   val serviceTime = System.currentTimeMillis() - taskStart
   val metrics = attemptedTask.flatMap(t => t.metrics)
   for (m <- metrics) {
-    m.executorRunTime = serviceTime
-    m.jvmGCTime = gcTime - startGCTime
+    m.incExecutorRunTime(serviceTime)
--- End diff --

The purpose of this original patch was to just lock down the visibility, so I think it's fine to have package-private setters in cases where we don't simply increment/decrement. We could also just only have setters (and not inc/dec). Basically, as long as they are private, I don't mind. Maybe there is some value in restricting certain ones to inc/dec to provide more safety.
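
A sketch of the two flavors under discussion, as they might look inside `TaskMetrics` (the inc variant matches the quoted diff; the setter is the alternative being considered here, both kept `private[spark]`):

    @volatile private var _executorRunTime: Long = 0L
    def executorRunTime: Long = _executorRunTime
    // Setter: for values that are assigned once per task attempt.
    private[spark] def setExecutorRunTime(value: Long): Unit = { _executorRunTime = value }
    // Increment: for values that genuinely accumulate over the task's lifetime.
    private[spark] def incExecutorRunTime(value: Long): Unit = { _executorRunTime += value }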





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70201497
  
Conflicts abound!





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23056809
  
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable {
   /**
    * Number of bytes written for the shuffle by this task
    */
-  @volatile var shuffleBytesWritten: Long = _
-
+  @volatile private var _shuffleBytesWritten: Long = _
+  def shuffleBytesWritten = _shuffleBytesWritten
+  private[spark] def incShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
+  private[spark] def decShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
+
   /**
    * Time the task spent blocking on writes to disk or buffer cache, in nanoseconds
    */
-  @volatile var shuffleWriteTime: Long = _
--- End diff --

Can we make these `AtomicLong`s so that the incrementing can be thread-safe? I have a PR out that does this for input metrics:

https://github.com/apache/spark/pull/3120/files#diff-1bd3dc38f6306e0a822f93d62c32b1d0R226

It'd be good to do this throughout.
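
A minimal sketch of the `AtomicLong` variant being suggested, with a hypothetical class name; `addAndGet` makes concurrent increments safe at the cost of a slightly larger serialized field:

    import java.util.concurrent.atomic.AtomicLong

    class AtomicShuffleWriteMetrics extends Serializable {
      private val _shuffleBytesWritten = new AtomicLong(0L)
      def shuffleBytesWritten: Long = _shuffleBytesWritten.get()
      private[spark] def incShuffleBytesWritten(value: Long): Unit = {
        _shuffleBytesWritten.addAndGet(value)  // atomic read-modify-write, safe across threads
      }
    }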





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-70192435
  
Btw - this may conflict with #3120, so maybe we should hold off until that's merged.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request:

https://github.com/apache/spark/pull/4020#discussion_r23058543
  
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable {
   /**
    * Number of bytes written for the shuffle by this task
    */
-  @volatile var shuffleBytesWritten: Long = _
-
+  @volatile private var _shuffleBytesWritten: Long = _
+  def shuffleBytesWritten = _shuffleBytesWritten
+  private[spark] def incShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
+  private[spark] def decShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
+
   /**
    * Time the task spent blocking on writes to disk or buffer cache, in nanoseconds
    */
-  @volatile var shuffleWriteTime: Long = _
--- End diff --

Fair enough, but it is hard to guarantee that only one thread will increment the value. We could mark the class as not thread-safe in the docs, but it might be a ticking time bomb. Is the overhead of AtomicLong concerning enough to justify risking concurrency issues down the line?

Speaking of the shuffle metrics, can we get rid of the array of shuffleReadMetrics (depsShuffleReadMetrics) and the merge step in favor of using AtomicLongs in ShuffleReadMetrics? That way TaskMetrics could just hand the same thread-safe ShuffleReadMetrics to the tasks, and there would be no need to call updateShuffleReadMetrics periodically in the Executor heartbeat code.





[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-69798914
  
  [Test build #25473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25473/consoleFull) for PR 4020 at commit [`1149e78`](https://github.com/apache/spark/commit/1149e780977ae6fe7b72562696a5c7181160b48c).
  * This patch **passes all tests**.
  * This patch merges cleanly.
  * This patch adds the following public classes _(experimental)_:
    * `val classServer   = new HttpServer(conf, outputDir, new SecurityManager(conf), classServerPort, "HTTP class server")`






[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4020#issuecomment-69798922
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25473/
Test PASSed.

