[ https://issues.apache.org/jira/browse/SPARK-41872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandeep Singh updated SPARK-41872:
----------------------------------
    Description:
{code:java}
row = self.spark.createDataFrame([("Alice", None, None, None)], schema).fillna(True).first()
self.assertEqual(row.age, None)
{code}
{code:java}
Traceback (most recent call last):
  File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/tests/test_dataframe.py", line 231, in test_fillna
    self.assertEqual(row.age, None)
AssertionError: nan != None
{code}
{code:java}
row = (
    self.spark.createDataFrame([("Alice", 10, None)], schema)
    .replace(10, 20, subset=["name", "height"])
    .first()
)
self.assertEqual(row.name, "Alice")
self.assertEqual(row.age, 10)
self.assertEqual(row.height, None)
{code}
{code:java}
Traceback (most recent call last):
  File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/tests/test_dataframe.py", line 372, in test_replace
    self.assertEqual(row.height, None)
AssertionError: nan != None
{code}

  was:
{code:java}
row = self.spark.createDataFrame([("Alice", None, None, None)], schema).fillna(True).first()
self.assertEqual(row.age, None)
{code}
{code:java}
Traceback (most recent call last):
  File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/tests/test_dataframe.py", line 231, in test_fillna
    self.assertEqual(row.age, None)
AssertionError: nan != None
{code}

> Fix DataFrame createDataframe handling of None
> ----------------------------------------------
>
>                 Key: SPARK-41872
>                 URL: https://issues.apache.org/jira/browse/SPARK-41872
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Sandeep Singh
>            Priority: Major
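Both tracebacks above report the same symptom: a field that was None in the input row is read back as NaN, and the assertion fails because NaN and None are different values in Python. The following is a minimal, Spark-free sketch of that comparison. The helper nan_to_none is hypothetical: it is not part of PySpark and is not proposed here as the fix for this issue. It would also erase genuine NaN values, so it only illustrates the distinction the tests rely on.

```python
import math
from typing import Any


def nan_to_none(value: Any) -> Any:
    """Hypothetical helper: map a float NaN to None, leave other values unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


# Stand-in for the value the failing tests read from row.age / row.height.
observed = float("nan")

# NaN is a float, not a null, so it never compares equal to None.
# This is the comparison behind "AssertionError: nan != None".
print(observed == None)  # noqa: E711

# After normalization the value is a real None again.
print(nan_to_none(observed) is None)

# Ordinary values pass through untouched.
print(nan_to_none(10))
```

The sketch only shows why the assertion fails. Where the NaN is introduced on the way from createDataFrame to first() is not stated in the report above.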