Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19226#discussion_r139035222

    --- Diff: python/pyspark/tests.py ---
    @@ -644,6 +644,18 @@ def test_cartesian_chaining(self):
                 set([(x, (y, y)) for x in range(10) for y in range(10)])
             )

    +    def test_zip_chaining(self):
    +        # Tests for SPARK-21985
    +        rdd = self.sc.parallelize('abc')
    --- End diff --

    I'd set the explicit number of partitions because `zip` reserializes it depending on this.
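
A minimal sketch of what the suggestion could look like, not the change that was actually committed. It assumes a Spark build that already includes the SPARK-21985 fix. The helper names (`expected_left_nested`, `expected_right_nested`, `check_zip_chaining`) and the partition count of 2 are hypothetical, chosen only for illustration. The one point it shows is passing `numSlices` to `parallelize` so the partition count, and with it whatever batching `zip` has to reconcile, does not depend on the context's default parallelism.

```python
def expected_left_nested(items):
    """Pairs expected from rdd.zip(rdd).zip(rdd)."""
    return {((x, x), x) for x in items}


def expected_right_nested(items):
    """Pairs expected from rdd.zip(rdd.zip(rdd))."""
    return {(x, (x, x)) for x in items}


def check_zip_chaining(sc, num_partitions=2):
    """Run both chained-zip checks against a SparkContext-like object `sc`.

    `num_partitions=2` is an arbitrary illustrative value (an assumption,
    not taken from the pull request).
    """
    data = 'abc'
    # Passing numSlices explicitly pins the partition count instead of
    # inheriting the context's default parallelism.
    rdd = sc.parallelize(data, num_partitions)
    # Both nesting orders of the chained zip should yield the expected pairs.
    assert set(rdd.zip(rdd).zip(rdd).collect()) == expected_left_nested(data)
    assert set(rdd.zip(rdd.zip(rdd)).collect()) == expected_right_nested(data)
    return rdd
```

With a live `pyspark.SparkContext` named `sc`, this would be invoked as `check_zip_chaining(sc)`. Inside `tests.py` the same idea reduces to `rdd = self.sc.parallelize('abc', 2)`.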