Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19226#discussion_r139035222
  
    --- Diff: python/pyspark/tests.py ---
    @@ -644,6 +644,18 @@ def test_cartesian_chaining(self):
                 set([(x, (y, y)) for x in range(10) for y in range(10)])
             )
     
    +    def test_zip_chaining(self):
    +        # Tests for SPARK-21985
    +        rdd = self.sc.parallelize('abc')
    --- End diff --
    
    I'd set the number of partitions explicitly here, because whether `zip` 
reserializes the RDD depends on it.
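
A rough sketch of why the partition count could matter, written as a plain-Python model rather than real PySpark code. The helper names `parallelize_batch_size` and `zip_would_rebatch` are hypothetical, and the formula is an assumption about how `parallelize` picks its batch size (length divided by the number of slices, capped at 1024 and floored at 1); actual behaviour may differ between Spark versions.

```python
def parallelize_batch_size(num_elements, num_slices, configured_batch_size=0):
    """Assumed model of the batch size picked when parallelizing a collection.

    configured_batch_size=0 stands for "no explicit batch size configured".
    """
    return max(1, min(num_elements // num_slices, configured_batch_size or 1024))


def zip_would_rebatch(left_batch, right_batch):
    """Assumed model of the zip check: differing batch sizes force reserialization."""
    return left_batch != right_batch


# Assumption: an already zipped RDD is treated as unbatched (batch size 1),
# so chaining a second zip compares 1 against the parallelized batch size.
ZIPPED_BATCH = 1

for num_slices in (1, 2, 3):
    batch = parallelize_batch_size(len('abc'), num_slices)
    print(num_slices, batch, zip_would_rebatch(ZIPPED_BATCH, batch))
```

Under these assumptions, only some partition counts take the reserialization path, so pinning the count (for example by passing a second argument to `parallelize`) would keep the test deterministic about which path it covers.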

