[GitHub] [spark] HyukjinKwon commented on a change in pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

GitBox Thu, 24 Jun 2021 18:10:04 -0700


HyukjinKwon commented on a change in pull request #33054:
URL: https://github.com/apache/spark/pull/33054#discussion_r658388558




##########
File path: python/pyspark/sql/tests/test_dataframe.py
##########
@@ -855,6 +855,22 @@ def test_df_show(self):
         with self.assertRaisesRegex(TypeError, "Parameter 'truncate=foo'"):
             df.show(truncate='foo')
 
+    @unittest.skipIf(
+        not have_pandas or not have_pyarrow,
+        pandas_requirement_message or pyarrow_requirement_message)  # type: ignore
+    def test_to_pandas_on_spark(self):
+        from pyspark.pandas.frame import DataFrame
+        from pandas.testing import assert_frame_equal
+
+        sdf = self.spark.createDataFrame([("a", 1), ("b", 2), ("c",  3)], ["Col1", "Col2"])
+        psdf_from_sdf = sdf.to_pandas_on_spark()
+        psdf_from_sdf_with_index = sdf.to_pandas_on_spark(index_col="Col1")
+        psdf = DataFrame({"Col1": ["a", "b", "c"], "Col2": [1, 2, 3]})

Review comment:
       Maybe create a pandas DataFrame directly?
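
One way to read this suggestion, as a minimal sketch rather than the change that was actually made in the PR: build the expected values as plain pandas DataFrames and compare the converted result against them with `assert_frame_equal`. The names `expected_frames` and `assert_matches_pandas` are hypothetical helpers introduced only for illustration, and the sketch assumes the object under test exposes `to_pandas()`, as a pandas-on-Spark DataFrame does.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def expected_frames():
    """Build the expected results as plain pandas DataFrames (no Spark involved)."""
    pdf = pd.DataFrame({"Col1": ["a", "b", "c"], "Col2": [1, 2, 3]})
    # Second frame mirrors the index_col="Col1" case from the test above.
    return pdf, pdf.set_index("Col1")


def assert_matches_pandas(psdf, expected_pdf):
    """Hypothetical helper: `psdf` is any object exposing `to_pandas()`,
    for example a pandas-on-Spark DataFrame."""
    assert_frame_equal(psdf.to_pandas(), expected_pdf)


# Assumed usage inside the test, with `sdf` as created in the diff:
#   pdf, pdf_with_index = expected_frames()
#   assert_matches_pandas(sdf.to_pandas_on_spark(), pdf)
#   assert_matches_pandas(sdf.to_pandas_on_spark(index_col="Col1"), pdf_with_index)
```

Building the expectation in pandas keeps it independent of the pandas-on-Spark constructor, so the test exercises only the conversion path.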




