[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22240#discussion_r212863571 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDDBarrier.scala --- @@ -22,15 +22,22 @@ import scala.reflect.ClassTag import org.apache.spark.TaskContext import org.apache.spark.annotation.{Experimental, Since} -/** Represents an RDD barrier, which forces Spark to launch tasks of this stage together. */ -class RDDBarrier[T: ClassTag](rdd: RDD[T]) { +/** + * :: Experimental :: + * Wraps an RDD in a barrier stage, which forces Spark to launch tasks of this stage together. + * [[RDDBarrier]] instances are created by [[RDD.barrier]]. + */ +@Experimental +@Since("2.4.0") +class RDDBarrier[T: ClassTag] private[spark] (rdd: RDD[T]) { --- End diff -- also hide the constructor here --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22240#discussion_r212863543 --- Diff: core/src/main/scala/org/apache/spark/BarrierTaskInfo.scala --- @@ -28,4 +28,4 @@ import org.apache.spark.annotation.{Experimental, Since} */ @Experimental @Since("2.4.0") -class BarrierTaskInfo(val address: String) +class BarrierTaskInfo private[spark] (val address: String) --- End diff -- hide the constructor since this is not to be constructed by user --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22240#discussion_r212863507 --- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala --- @@ -21,25 +21,31 @@ import java.util.{Properties, Timer, TimerTask} import scala.concurrent.duration._ import scala.language.postfixOps - -import org.apache.spark.annotation.{Experimental, Since} +import org.apache.spark.annotation.{DeveloperApi, Experimental, Since} import org.apache.spark.executor.TaskMetrics import org.apache.spark.memory.TaskMemoryManager import org.apache.spark.metrics.MetricsSystem import org.apache.spark.rpc.{RpcEndpointRef, RpcTimeout} import org.apache.spark.util.{RpcUtils, Utils} -/** A [[TaskContext]] with extra info and tooling for a barrier stage. */ -class BarrierTaskContext( +/** + * :: Experimental :: + * A [[TaskContext]] with extra contextual info and tooling for tasks in a barrier stage. + * Use [[BarrierTaskContext#get]] to obtain the barrier context for a running barrier task. + */ +@Experimental +@Since("2.4.0") +class BarrierTaskContext private[spark] ( --- End diff -- Made the constructor package private to force users get it from `#get()`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22240#discussion_r212863444 --- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala --- @@ -68,7 +74,7 @@ class BarrierTaskContext( * * CAUTION! In a barrier stage, each task must have the same number of barrier() calls, in all * possible code branches. Otherwise, you may get the job hanging or a SparkException after - * timeout. Some examples of misuses listed below: + * timeout. Some examples of '''misuses''' listed below: --- End diff -- use bold font to make sure users don't misread --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22240#discussion_r212863381 --- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala --- @@ -21,25 +21,31 @@ import java.util.{Properties, Timer, TimerTask} import scala.concurrent.duration._ import scala.language.postfixOps - -import org.apache.spark.annotation.{Experimental, Since} +import org.apache.spark.annotation.{DeveloperApi, Experimental, Since} import org.apache.spark.executor.TaskMetrics import org.apache.spark.memory.TaskMemoryManager import org.apache.spark.metrics.MetricsSystem import org.apache.spark.rpc.{RpcEndpointRef, RpcTimeout} import org.apache.spark.util.{RpcUtils, Utils} -/** A [[TaskContext]] with extra info and tooling for a barrier stage. */ -class BarrierTaskContext( +/** + * :: Experimental :: + * A [[TaskContext]] with extra contextual info and tooling for tasks in a barrier stage. + * Use [[BarrierTaskContext#get]] to obtain the barrier context for a running barrier task. + */ +@Experimental +@Since("2.4.0") +class BarrierTaskContext private[spark] ( override val stageId: Int, override val stageAttemptNumber: Int, override val partitionId: Int, override val taskAttemptId: Long, override val attemptNumber: Int, -override val taskMemoryManager: TaskMemoryManager, +private[spark] override val taskMemoryManager: TaskMemoryManager, --- End diff -- This is not exposed by `TaskContext`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/22240 [WIP] [SPARK-25248] [CORE] Audit barrier APIs for 2.4 ## What changes were proposed in this pull request? I made one pass over barrier APIs added to Spark 2.4 and updates some scopes and docs. TODOs: - [ ] scala doc - [ ] python doc You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark SPARK-25248 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22240.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22240 commit 19e380ab4f7242f2f2ef48aca81445b0adf0a87d Author: Xiangrui Meng Date: 2018-08-27T04:18:54Z update barrier Scala doc --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org