[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70740442 @pwendell yeah, looks good. Sorry I was also away. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70607997 @pwendell sorry, was out for the weekend, but this LGTM.
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70466063 Okay I'm pulling this in with the minor style fix.
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4020
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70395176 LGTM - @sryza and @ksakellis look okay to you?
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23130489

--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -44,42 +44,62 @@ class TaskMetrics extends Serializable {
   /**
    * Host's name the task runs on
    */
-  var hostname: String = _
-
+  private var _hostname: String = _
+  def hostname = _hostname
+  private[spark] def setHostname(value : String) = _hostname = value
--- End diff --

Should be `value: String` with no space (not sure why our style checks didn't catch this).
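The pattern under review can be shown in a minimal, self-contained sketch: the mutable backing field becomes private, reads go through a public getter, and writes go through a narrow setter. Class and method names here are hypothetical; in TaskMetrics the setter would be `private[spark]`, which is left public below only so the snippet compiles outside the org.apache.spark package.

```scala
// Sketch of the accessor pattern from the diff, with the reviewed style
// fix applied (`value: String`, no space before the colon).
class MetricsSketch {
  private var _hostname: String = _          // backing field, no longer directly writable
  def hostname: String = _hostname           // public read access
  // Would be `private[spark]` in TaskMetrics; public here so the sketch
  // is usable outside the org.apache.spark package.
  def setHostname(value: String): Unit = _hostname = value
}

val m = new MetricsSketch
m.setHostname("worker-1")
```

This keeps the public API read-only while letting Spark internals mutate the metric.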
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70311854 [Test build #25675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25675/consoleFull) for PR 4020 at commit [`39f3810`](https://github.com/apache/spark/commit/39f3810f6583cc257635b295205ae94b0cdee053). * This patch merges cleanly.
Github user ilganeli commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70311929 Updated to resolve merge conflicts with #3120, fix some small style issues, and correctly use setters instead of inc/dec in certain places in the code.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70321469 [Test build #25675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25675/consoleFull) for PR 4020 at commit [`39f3810`](https://github.com/apache/spark/commit/39f3810f6583cc257635b295205ae94b0cdee053). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class SparkListenerJobEnd(`
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70321476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25675/
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70189111 [Test build #25621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25621/consoleFull) for PR 4020 at commit [`e446287`](https://github.com/apache/spark/commit/e446287b866eedeb74e68c2f800acf29250d2a76). * This patch merges cleanly.
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70189173 Had a look over, and this mostly looks good, but it looks like there are many places where the patch replaces assigning with incrementing. It would be good to take a close look and pull all these out.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70180180 [Test build #25619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25619/consoleFull) for PR 4020 at commit [`6444391`](https://github.com/apache/spark/commit/644439144dba2f1a2c0cac29da16a0fc7a52b109). * This patch merges cleanly.
Github user ilganeli commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23056056

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
         val serviceTime = System.currentTimeMillis() - taskStart
         val metrics = attemptedTask.flatMap(t => t.metrics)
         for (m <- metrics) {
-          m.executorRunTime = serviceTime
-          m.jvmGCTime = gcTime - startGCTime
+          m.incExecutorRunTime(serviceTime)
--- End diff --

I'm not sure whether the original behavior is necessarily correct. If the goal is to track total run time for the task, why does it make sense to do an assignment anywhere instead of an accumulation?
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70186438 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25619/
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70186432 [Test build #25619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25619/consoleFull) for PR 4020 at commit [`6444391`](https://github.com/apache/spark/commit/644439144dba2f1a2c0cac29da16a0fc7a52b109). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23055589

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
         val serviceTime = System.currentTimeMillis() - taskStart
         val metrics = attemptedTask.flatMap(t => t.metrics)
         for (m <- metrics) {
-          m.executorRunTime = serviceTime
-          m.jvmGCTime = gcTime - startGCTime
+          m.incExecutorRunTime(serviceTime)
--- End diff --

Will this replace `=` with `+=`? This applies in a couple places above as well.
Github user ilganeli commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70214178 Hi Patrick - I did look over 3120. That one will definitely need to be merged first and then we can finish this. Thanks.
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23057341

--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable {
   /**
    * Number of bytes written for the shuffle by this task
    */
-  @volatile var shuffleBytesWritten: Long = _
-
+  @volatile private var _shuffleBytesWritten: Long = _
+  def shuffleBytesWritten = _shuffleBytesWritten
+  private[spark] def incShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
+  private[spark] def decShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
+
   /**
    * Time the task spent blocking on writes to disk or buffer cache, in nanoseconds
    */
-  @volatile var shuffleWriteTime: Long = _
--- End diff --

`_shuffleWriteTime` can only get incremented from a single thread. It's marked volatile so that other threads can read from it. Using an `AtomicLong` would unnecessarily bloat the size of the task results. I probably should have documented this access pattern better.
Github user ilganeli commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23057946

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
         val serviceTime = System.currentTimeMillis() - taskStart
         val metrics = attemptedTask.flatMap(t => t.metrics)
         for (m <- metrics) {
-          m.executorRunTime = serviceTime
-          m.jvmGCTime = gcTime - startGCTime
+          m.incExecutorRunTime(serviceTime)
--- End diff --

Understood. Do you think it would be best to just add a setter function to address these cases directly?
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23057216

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
         val serviceTime = System.currentTimeMillis() - taskStart
         val metrics = attemptedTask.flatMap(t => t.metrics)
         for (m <- metrics) {
-          m.executorRunTime = serviceTime
-          m.jvmGCTime = gcTime - startGCTime
+          m.incExecutorRunTime(serviceTime)
--- End diff --

Task GC time is measured by looking at ((total time spent in GC since the process started) - (total time spent in GC since the process started, snapshotted when the task started)). Incrementing every time we update the metric would end up double-counting certain time regions. A few other metrics work this way as well. If you do notice issues with the logic, it might be good to propose fixes in a separate JIRA, so that we can give them careful consideration on their own.
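The double-counting argument can be seen in a minimal standalone sketch. The names below (a fake cumulative clock standing in for the JVM's GC MXBean, and a hypothetical update helper) are illustrative only; the point is that the metric is a difference of two snapshots, so each update must overwrite the previous value rather than add to it.

```scala
// Sketch of the snapshot-based GC-time metric described above.
var totalGcTimeMs = 0L              // stand-in for cumulative process-wide GC time
val startGCTime = totalGcTimeMs     // snapshot taken when the task starts

var jvmGCTime = 0L                  // the per-task metric
// Set, don't increment: the right-hand side already spans the whole task so far.
def updateGcMetric(): Unit = jvmGCTime = totalGcTimeMs - startGCTime

totalGcTimeMs += 30                 // 30 ms of GC occur
updateGcMetric()                    // metric is now 30
totalGcTimeMs += 20                 // 20 more ms of GC occur
updateGcMetric()                    // metric is now 50; `+=` here would give 80
```

With `+=` in `updateGcMetric`, the first 30 ms would be counted again on the second update, which is the double-counting sryza warns about.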
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70195240 [Test build #25621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25621/consoleFull) for PR 4020 at commit [`e446287`](https://github.com/apache/spark/commit/e446287b866eedeb74e68c2f800acf29250d2a76). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class UDFRegistration (sqlContext: SQLContext) extends org.apache.spark.Logging `
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70195248 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25621/
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23060808

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -257,8 +257,8 @@ private[spark] class Executor(
         val serviceTime = System.currentTimeMillis() - taskStart
         val metrics = attemptedTask.flatMap(t => t.metrics)
         for (m <- metrics) {
-          m.executorRunTime = serviceTime
-          m.jvmGCTime = gcTime - startGCTime
+          m.incExecutorRunTime(serviceTime)
--- End diff --

The purpose of this original patch was to just lock down the visibility, so I think it's fine to have package-private setters in cases where we don't simply increment/decrement. We could also just only have setters (and not inc/dec). Basically as long as they are private, I don't mind. Maybe there is some value to restricting certain ones to inc/dec to provide more safety.
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70201497 Conflicts abound!
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23056809

--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable {
   /**
    * Number of bytes written for the shuffle by this task
    */
-  @volatile var shuffleBytesWritten: Long = _
-
+  @volatile private var _shuffleBytesWritten: Long = _
+  def shuffleBytesWritten = _shuffleBytesWritten
+  private[spark] def incShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
+  private[spark] def decShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
+
   /**
    * Time the task spent blocking on writes to disk or buffer cache, in nanoseconds
    */
-  @volatile var shuffleWriteTime: Long = _
--- End diff --

Can we make these AtomicLongs so that the incrementing is thread safe? I have a PR out that does this for input metrics: https://github.com/apache/spark/pull/3120/files#diff-1bd3dc38f6306e0a822f93d62c32b1d0R226. It'd be good to do this throughout.
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70192435 Btw - this may conflict with #3120 so maybe we should hold off until that's merged.
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23058543

--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable {
   /**
    * Number of bytes written for the shuffle by this task
    */
-  @volatile var shuffleBytesWritten: Long = _
-
+  @volatile private var _shuffleBytesWritten: Long = _
+  def shuffleBytesWritten = _shuffleBytesWritten
+  private[spark] def incShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
+  private[spark] def decShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
+
   /**
    * Time the task spent blocking on writes to disk or buffer cache, in nanoseconds
    */
-  @volatile var shuffleWriteTime: Long = _
--- End diff --

Fair enough, but it is hard to guarantee that only one thread will increment the value. We could mark the class as not thread safe in the docs, but it might be a ticking time bomb. Is the overhead of AtomicLong that concerning, given the risk of concurrency issues down the line? Speaking of shuffle metrics, can we get rid of the array of shuffleReadMetrics (depsShuffleReadMetrics) and the merge step in favor of using AtomicLongs in ShuffleReadMetrics? That way TaskMetrics could just return the same thread-safe ShuffleReadMetrics to the tasks, and there wouldn't be a need to call updateShuffleReadMetrics periodically in the Executor heartbeat code.
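The trade-off being weighed can be made concrete with a small sketch (class names hypothetical). On a `@volatile Long`, `+=` is a read-modify-write that is only correct under a single writer; `java.util.concurrent.atomic.AtomicLong` makes the increment itself atomic, at the cost of an extra heap object per counter, which is what sryza means by bloating the serialized task results.

```scala
import java.util.concurrent.atomic.AtomicLong

// Single-writer pattern from the diff: volatile gives visibility to
// reader threads, but the increment itself is NOT atomic.
class VolatileCounter {
  @volatile private var _value: Long = 0L
  def value: Long = _value
  def inc(delta: Long): Unit = _value += delta  // read, add, write: three steps
}

// ksakellis's alternative: the increment is atomic, safe under
// concurrent writers, but each counter is a separate heap object.
class AtomicCounter {
  private val _value = new AtomicLong(0L)
  def value: Long = _value.get()
  def inc(delta: Long): Unit = _value.addAndGet(delta)
}

val v = new VolatileCounter
v.inc(10); v.inc(5)
val a = new AtomicCounter
a.inc(10); a.inc(5)
```

Both produce the same result single-threaded; they diverge only when two threads call `inc` concurrently, where `VolatileCounter` can lose updates.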
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-69798914 [Test build #25473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25473/consoleFull) for PR 4020 at commit [`1149e78`](https://github.com/apache/spark/commit/1149e780977ae6fe7b72562696a5c7181160b48c). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `val classServer = new HttpServer(conf, outputDir, new SecurityManager(conf), classServerPort, "HTTP class server")`
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-69798922 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25473/