srowen edited a comment on issue #23877: [SPARK-26449][PYTHON] Add transform method to DataFrame API
URL: https://github.com/apache/spark/pull/23877#issuecomment-466662929

We should leave a reference to the original PR: https://github.com/apache/spark/pull/23414

I wonder if it's worth showing chaining at least two functions to highlight the point of the function? No big deal but the doctest needs fixing anyway:

```
**********************************************************************
File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/dataframe.py", line 2055, in pyspark.sql.dataframe.DataFrame.transform
Failed example:
    df = spark.createDataFrame([Row(a=170.1, b=75.0)])
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib64/pypy-2.5.1/lib-python/2.7/doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest pyspark.sql.dataframe.DataFrame.transform[0]>", line 1, in <module>
        df = spark.createDataFrame([Row(a=170.1, b=75.0)])
    NameError: global name 'Row' is not defined
**********************************************************************
File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/dataframe.py", line 2058, in pyspark.sql.dataframe.DataFrame.transform
Failed example:
    df.transform(cast_all_to_int).collect()
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib64/pypy-2.5.1/lib-python/2.7/doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest pyspark.sql.dataframe.DataFrame.transform[2]>", line 1, in <module>
        df.transform(cast_all_to_int).collect()
      File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/dataframe.py", line 2061, in transform
        result = func(self)
      File "<doctest pyspark.sql.dataframe.DataFrame.transform[1]>", line 2, in cast_all_to_int
        return input_df.select([col(c_name).cast("int") for c_name in input_df.columns])
    NameError: global name 'col' is not defined
**********************************************************************
```

No need to use `Row` anyway. Maybe best to copy the example in the other PR.
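
For illustration only, here is a rough sketch of what chaining two functions through `transform` could look like. It does not use PySpark: `MiniFrame` is a hypothetical stand-in (a list of dict rows) whose `transform` mirrors the `result = func(self)` line visible in the traceback, and `sort_columns_asc` is an assumed second function, not something taken from either PR. In the real doctest, names such as `col` would presumably need to be imported inside the example itself (for instance `from pyspark.sql.functions import col`), since that is what the `NameError` points at.

```python
class MiniFrame:
    """Hypothetical stand-in for a DataFrame: a list of dict rows (not PySpark)."""

    def __init__(self, rows):
        # Copy each row so the transformations below never mutate their input.
        self.rows = [dict(r) for r in rows]

    def transform(self, func):
        # Same shape as the `result = func(self)` line in the traceback.
        result = func(self)
        return result

    def collect(self):
        return self.rows


def cast_all_to_int(frame):
    # Cast every value in every row to int (name taken from the failing doctest).
    return MiniFrame({k: int(v) for k, v in row.items()} for row in frame.rows)


def sort_columns_asc(frame):
    # Assumed second step: reorder each row's columns alphabetically.
    return MiniFrame({k: row[k] for k in sorted(row)} for row in frame.rows)


df = MiniFrame([{"b": 75.0, "a": 170.1}])

# Each call returns a new frame, so the calls compose left to right.
chained = df.transform(cast_all_to_int).transform(sort_columns_asc)
print(chained.collect())
```

Because each function takes a frame and returns a frame, the chain reads in the order the steps are applied, which seems to be the point of having the method at all.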
