[GitHub] spark pull request #21386: [SPARK-23928][SQL][WIP] Add shuffle collection fu...

pkuwm Mon, 21 May 2018 15:17:18 -0700

Github user pkuwm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21386#discussion_r189729116
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2268,6 +2268,21 @@ def array_sort(col):
         return Column(sc._jvm.functions.array_sort(_to_java_column(col)))
     
     
    +@since(2.4)
    +def shuffle(col):
    +    """
    +    Collection function: Generate a random permutation of the given array.
    +
    +    :param col: name of column or expression
    +
    +    >>> df = spark.createDataFrame([([2, 1, 3],),([2, 1, None, 3],),([1],),([],)], ['data'])
    +    >>> df.select(shuffle(df.data).alias('r')).collect()
    +    [Row(r=[1, 3, 2]), Row(r=[3, None, 1, 2]), Row(r=[1]), Row(r=[])]
    --- End diff --
    
    Cool. My bad. I'm not familiar with this; I thought they were just doc-like comments. Will fix it.
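
A minimal sketch of the point above, assuming the concern is that the `>>>` lines in a docstring are executed as doctests rather than treated as plain comments. The helper `shuffled` and the runner `run_examples` are hypothetical stand-ins and not part of the PR; only the standard-library `doctest` and `random` modules are used. Because a random permutation cannot be matched against a fixed expected output, one possible fix is to mark such an example with `# doctest: +SKIP`.

```python
import doctest
import random


def shuffled(values):
    """Return a randomly permuted copy of ``values`` (hypothetical helper).

    The length is deterministic, so this example is safe to check:

    >>> len(shuffled([2, 1, 3]))
    3

    The element order is random, so this example is skipped, not compared:

    >>> shuffled([2, 1, 3])  # doctest: +SKIP
    [1, 3, 2]
    """
    result = list(values)
    random.shuffle(result)
    return result


def run_examples(func):
    """Execute the ``>>>`` examples in ``func``'s docstring; return the summary."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Make the function visible under its own name inside the examples.
    globs = {func.__name__: func}
    for test in finder.find(func, name=func.__name__, globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False)


# ``results.failed`` counts the examples whose actual output differed
# from the output written in the docstring.
results = run_examples(shuffled)
```

Without the `+SKIP` directive, the second example would fail whenever the permutation differs from the one written in the docstring, which is presumably why a fixed expected output for `shuffle` is a problem.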



