[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...

2018-08-26 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/22240#discussion_r212863571
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDDBarrier.scala ---
@@ -22,15 +22,22 @@ import scala.reflect.ClassTag
 import org.apache.spark.TaskContext
 import org.apache.spark.annotation.{Experimental, Since}
 
-/** Represents an RDD barrier, which forces Spark to launch tasks of this stage together. */
-class RDDBarrier[T: ClassTag](rdd: RDD[T]) {
+/**
+ * :: Experimental ::
+ * Wraps an RDD in a barrier stage, which forces Spark to launch tasks of this stage together.
+ * [[RDDBarrier]] instances are created by [[RDD.barrier]].
+ */
+@Experimental
+@Since("2.4.0")
+class RDDBarrier[T: ClassTag] private[spark] (rdd: RDD[T]) {
--- End diff --

also hide the constructor here


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...

2018-08-26 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/22240#discussion_r212863543
  
--- Diff: core/src/main/scala/org/apache/spark/BarrierTaskInfo.scala ---
@@ -28,4 +28,4 @@ import org.apache.spark.annotation.{Experimental, Since}
  */
 @Experimental
 @Since("2.4.0")
-class BarrierTaskInfo(val address: String)
+class BarrierTaskInfo private[spark] (val address: String)
--- End diff --

Hide the constructor, since this class is not meant to be constructed by users.
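
A minimal sketch of what a qualified-private constructor does, for readers unfamiliar with the modifier. The names `Engine`, `TaskInfo`, and `launch` are hypothetical and not part of Spark, and an enclosing object stands in for the `spark` package so that the example compiles as a single file:

```scala
object Engine {
  // The constructor is callable only from code inside `Engine`;
  // the class itself and its `address` field stay public.
  final class TaskInfo private[Engine] (val address: String)

  // Framework-side factory: the only way to obtain an instance.
  def launch(address: String): TaskInfo = new TaskInfo(address)
}

object ConstructorDemo {
  def main(args: Array[String]): Unit = {
    val info = Engine.launch("host-1:4040")
    println(info.address)
    // new Engine.TaskInfo("host-2:4040") // would not compile here: constructor is inaccessible
  }
}
```

User code can still read `address` and pass `TaskInfo` values around, but it cannot create instances itself, which appears to be the intent of the change in this diff.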





[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...

2018-08-26 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/22240#discussion_r212863507
  
--- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ---
@@ -21,25 +21,31 @@ import java.util.{Properties, Timer, TimerTask}
 
 import scala.concurrent.duration._
 import scala.language.postfixOps
-
-import org.apache.spark.annotation.{Experimental, Since}
+import org.apache.spark.annotation.{DeveloperApi, Experimental, Since}
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.memory.TaskMemoryManager
 import org.apache.spark.metrics.MetricsSystem
 import org.apache.spark.rpc.{RpcEndpointRef, RpcTimeout}
 import org.apache.spark.util.{RpcUtils, Utils}
 
-/** A [[TaskContext]] with extra info and tooling for a barrier stage. */
-class BarrierTaskContext(
+/**
+ * :: Experimental ::
+ * A [[TaskContext]] with extra contextual info and tooling for tasks in a barrier stage.
+ * Use [[BarrierTaskContext#get]] to obtain the barrier context for a running barrier task.
+ */
+@Experimental
+@Since("2.4.0")
+class BarrierTaskContext private[spark] (
--- End diff --

Made the constructor package-private to force users to get it from `#get()`.
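
The pattern described here (a hidden constructor plus a static accessor) can be sketched roughly as follows. This is an illustration only, not Spark's implementation: `TaskRuntime`, `Context`, and `runTask` are hypothetical names, and the thread-local storage is an assumption made for the sake of a runnable example.

```scala
object TaskRuntime {
  // Only code inside `TaskRuntime` may construct a Context.
  final class Context private[TaskRuntime] (val partitionId: Int)

  // One context slot per thread; empty when no task is running.
  private val current = new ThreadLocal[Context]

  // User-facing accessor: returns the context of the task running on this thread.
  def get(): Context = {
    val ctx = current.get()
    require(ctx != null, "get() called outside a running task")
    ctx
  }

  // Framework-side: install the context, run the task body, then clean up.
  def runTask[A](partitionId: Int)(body: => A): A = {
    current.set(new Context(partitionId))
    try body finally current.remove()
  }
}

object GetDemo {
  def main(args: Array[String]): Unit = {
    val id = TaskRuntime.runTask(3) { TaskRuntime.get().partitionId }
    println(id)
  }
}
```

Because the framework is the only code that can construct the context, a value returned by `get()` always belongs to a task that is actually running.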





[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...

2018-08-26 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/22240#discussion_r212863444
  
--- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ---
@@ -68,7 +74,7 @@ class BarrierTaskContext(
*
* CAUTION! In a barrier stage, each task must have the same number of barrier() calls, in all
* possible code branches. Otherwise, you may get the job hanging or a SparkException after
-   * timeout. Some examples of misuses listed below:
+   * timeout. Some examples of '''misuses''' listed below:
--- End diff --

Use bold font to make sure users don't misread this.
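
To illustrate why the caution in this doc comment matters, here is a small analogy built on `java.util.concurrent.CyclicBarrier` rather than Spark's barrier (which needs a running cluster). Two "tasks" share a barrier; one syncs twice and the other only once, so the second sync can never complete and ends in a timeout. The object and method names are hypothetical.

```scala
import java.util.concurrent.{CyclicBarrier, TimeUnit, TimeoutException}

object BarrierMismatchDemo {
  // Returns true if the unmatched second sync of "task 0" times out.
  def secondSyncTimesOut(timeoutMs: Long): Boolean = {
    val barrier = new CyclicBarrier(2)

    // "Task 1" calls the barrier once and then finishes.
    val task1 = new Thread(new Runnable {
      override def run(): Unit = barrier.await()
    })
    task1.start()

    barrier.await() // "task 0", first sync: matched by task 1
    task1.join()

    try {
      // "Task 0", second sync: no other task will ever arrive.
      barrier.await(timeoutMs, TimeUnit.MILLISECONDS)
      false
    } catch {
      case _: TimeoutException => true
    }
  }

  def main(args: Array[String]): Unit =
    println(secondSyncTimesOut(200L))
}
```

Without the timeout argument, the second `await()` would block indefinitely, which corresponds to the "job hanging" case the doc comment warns about.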





[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...

2018-08-26 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/22240#discussion_r212863381
  
--- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ---
@@ -21,25 +21,31 @@ import java.util.{Properties, Timer, TimerTask}
 
 import scala.concurrent.duration._
 import scala.language.postfixOps
-
-import org.apache.spark.annotation.{Experimental, Since}
+import org.apache.spark.annotation.{DeveloperApi, Experimental, Since}
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.memory.TaskMemoryManager
 import org.apache.spark.metrics.MetricsSystem
 import org.apache.spark.rpc.{RpcEndpointRef, RpcTimeout}
 import org.apache.spark.util.{RpcUtils, Utils}
 
-/** A [[TaskContext]] with extra info and tooling for a barrier stage. */
-class BarrierTaskContext(
+/**
+ * :: Experimental ::
+ * A [[TaskContext]] with extra contextual info and tooling for tasks in a barrier stage.
+ * Use [[BarrierTaskContext#get]] to obtain the barrier context for a running barrier task.
+ */
+@Experimental
+@Since("2.4.0")
+class BarrierTaskContext private[spark] (
 override val stageId: Int,
 override val stageAttemptNumber: Int,
 override val partitionId: Int,
 override val taskAttemptId: Long,
 override val attemptNumber: Int,
-override val taskMemoryManager: TaskMemoryManager,
+private[spark] override val taskMemoryManager: TaskMemoryManager,
--- End diff --

This is not exposed by `TaskContext`.





[GitHub] spark pull request #22240: [WIP] [SPARK-25248] [CORE] Audit barrier APIs for...

2018-08-26 Thread mengxr
GitHub user mengxr opened a pull request:

https://github.com/apache/spark/pull/22240

[WIP] [SPARK-25248] [CORE] Audit barrier APIs for 2.4

## What changes were proposed in this pull request?

I made one pass over the barrier APIs added in Spark 2.4 and updated some scopes and docs.

TODOs:
- [ ] scala doc
- [ ] python doc

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mengxr/spark SPARK-25248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22240


commit 19e380ab4f7242f2f2ef48aca81445b0adf0a87d
Author: Xiangrui Meng 
Date:   2018-08-27T04:18:54Z

update barrier Scala doc



