[GitHub] spark issue #22140: [SPARK-25072][PySpark] Forbid extra value for custom Row

HyukjinKwon Sun, 09 Sep 2018 19:21:02 -0700

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22140
  
    Yeah, I would at least avoid backporting this to branch-2.3, since the 
release is very close. It looks like a bug to me as well.
    
    One nitpick is the case with an RDD operation:
    
    ```python
    >>> from pyspark.sql import Row
    >>> row_class = Row("c1", "c2")
    >>> row = row_class(1, 2, 3)
    >>> spark.sparkContext.parallelize([row]).map(lambda r: r.c1).collect()
    [1]
    ```
    
    This is really unlikely, and I even wonder whether it makes any sense, but 
there might still be such a case, although creating a namedtuple-like row with 
more values than fields should itself be disallowed, as fixed here.
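
To illustrate the kind of check being discussed, here is a minimal sketch in plain Python. `SketchRow` is a hypothetical stand-in, not PySpark's `Row`, and the error message wording is an assumption; it only shows the idea of rejecting more values than there are fields while keeping attribute access by field name.

```python
# Hypothetical stand-in for a custom Row class; this is NOT PySpark's implementation.
class SketchRow(tuple):
    """Tuple with named fields that rejects more values than it has fields."""

    def __new__(cls, fields, values):
        fields = tuple(fields)
        values = tuple(values)
        # The check under discussion: extra values are an error.
        if len(values) > len(fields):
            raise ValueError(
                "expected %d values for fields %r but got %r"
                % (len(fields), fields, values)
            )
        row = super().__new__(cls, values)
        row._fields = fields
        return row

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[self._fields.index(name)]
        except ValueError:
            raise AttributeError(name)


row = SketchRow(("c1", "c2"), (1, 2))
print(row.c1)

try:
    SketchRow(("c1", "c2"), (1, 2, 3))
except ValueError as exc:
    print("rejected:", exc)
```

With a check like this, the three-value row from the RDD example could not be constructed in the first place, which is why code relying on the old behavior would break.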
    
    Can we simply take this out of branch-2.3?



