Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20678#discussion_r170799189
--- Diff: python/pyspark/sql/tests.py ---
@@ -3493,19 +3495,42 @@ def create_pandas_data_frame(self):
data_dict["4_float_t"] = np.float32(data_dict["4_float_t"])
return pd.DataFrame(data=data_dict)
- def test_unsupported_datatype(self):
- schema = StructType([StructField("map", MapType(StringType(),
IntegerType()), True)])
- df = self.spark.createDataFrame([(None,)], schema=schema)
- with QuietTest(self.sc):
- with self.assertRaisesRegexp(Exception, 'Unsupported type'):
- df.toPandas()
+ @contextmanager
+ def arrow_fallback(self, enabled):
--- End diff --
Yup, makes sense. Will give it a shot.

BTW, while we are here, I was thinking of adding a more generalized version of
a util like `arrow_fallback` (rough sketch below) to reduce
configuration-specific code in the test scope, but was hesitant because this
approach is new to PySpark. WDYT? I will do another PR for this cleanup if we
all feel the same way. Cc @ueshin too.
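
In case it helps the discussion, here is a rough sketch of what such a
generalized util could look like: a context manager that sets arbitrary SQL
configs and always restores the previous values afterwards. The `sql_conf`
name, its placement as a free function taking a SparkSession, and the exact
signature are just my assumptions here, not something in this PR.

```python
from contextlib import contextmanager

@contextmanager
def sql_conf(spark, pairs):
    """
    Sets each SQL config in `pairs` (a dict of key -> value) for the duration
    of the block, then restores the previous value, or unsets the key if it
    was not set before. In tests.py this would probably end up as a method on
    the shared test base class taking `self` instead of `spark`.
    """
    assert isinstance(pairs, dict), "pairs should be a dictionary."
    keys = list(pairs.keys())
    # Remember the current values (None means the key was not set).
    old_values = [spark.conf.get(key, None) for key in keys]
    for key in keys:
        spark.conf.set(key, pairs[key])
    try:
        yield
    finally:
        # Restore or unset each config even if the block raised.
        for key, old_value in zip(keys, old_values):
            if old_value is None:
                spark.conf.unset(key)
            else:
                spark.conf.set(key, old_value)
```

Then `arrow_fallback` and other per-config helpers would reduce to something
like:

```python
with sql_conf(self.spark, {"spark.sql.execution.arrow.enabled": "false"}):
    df.toPandas()  # exercises the non-Arrow path
```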
---