Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22011#discussion_r208093660
--- Diff: python/pyspark/rdd.py ---
@@ -2429,6 +2441,29 @@ def _wrap_function(sc, func, deserializer, serializer, profiler=None):
sc.pythonVer, broadcast_vars,
sc._javaAccumulator)
+class RDDBarrier(object):
+
+    """
+    .. note:: Experimental
+
+    An RDDBarrier turns an RDD into a barrier RDD, which forces Spark to launch
+    tasks of the stage that contains this RDD together.
+    """
+
+    def __init__(self, rdd):
+        self.rdd = rdd
+        self._jrdd = rdd._jrdd
+
+    def mapPartitions(self, f, preservesPartitioning=False):
+        """
+        Return a new RDD by applying a function to each partition of this RDD.
+        """
--- End diff ---
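For orientation, here is a minimal usage sketch of the API quoted in the diff above. It is hypothetical: it assumes the `RDD.barrier()` entry point added elsewhere in this PR returns an `RDDBarrier` over the source RDD, and the method names simply follow the diff rather than a released API.

```python
# Hypothetical sketch only: assumes RDD.barrier() (added elsewhere in this PR)
# returns an RDDBarrier wrapping the source RDD, as shown in the diff above.
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
rdd = sc.parallelize(range(8), numSlices=4)

def sum_partition(iterator):
    # In a barrier stage, Spark launches all tasks of the stage together,
    # so every partition's task starts alongside its peers.
    yield sum(iterator)

print(rdd.barrier().mapPartitions(sum_partition).collect())  # -> [1, 5, 9, 13]
```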
Shall we match the documentation, or why is it different?
FWIW, for a code block, just `` `blabla` `` should be good enough. It is nicer if it is linked properly with something like `` :class:`ClassName` ``.
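As an illustration only (the wording below is hypothetical, not a proposal for this PR's final docstring), the two styles would look like this:

```python
# Illustration only: hypothetical wording, not the PR's final docstring.
# Per the suggestion above, plain backticks (`mapPartitions`) are enough for a
# code span, while :class:`RDD` also cross-links to the class in the API docs.
class RDDBarrier(object):
    """
    .. note:: Experimental

    An :class:`RDDBarrier` turns an :class:`RDD` into a barrier RDD, which
    forces Spark to launch tasks of the stage that contains this RDD together.
    Use `mapPartitions` to run a function over each partition of the wrapped RDD.
    """
```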
---