Don-Burns commented on PR #33428:
URL: https://github.com/apache/spark/pull/33428#issuecomment-1655435663

   I am jumping in very late on this, but I am hoping to learn from it.
   If you are creating a DataFrame from scratch, what is the suggested way of 
creating rows with null values if passing non-string values as positional 
args is discouraged or considered a code smell?
   There are cases where a column name is valid in Spark but cannot be 
expressed as a Python keyword argument, e.g. when it has a dash in the name.
   
   
   For example, I define the schema separately and build my row data to create the DataFrame:
   
   ```python
   from pyspark.sql import SparkSession
   from pyspark.sql.types import Row, StringType, StructField, StructType
   
   spark = SparkSession.builder.getOrCreate()
   schema = StructType(
       [
           StructField("some-col", StringType(), True),
       ]
   )
   
   data = [Row("a value"), Row(None)]
   
   df = spark.createDataFrame(data=data, schema=schema)
   df.show()
   ```
   
![image](https://github.com/apache/spark/assets/56016914/c4eaec66-484c-42d2-bc31-229e657fb58d)
   
![image](https://github.com/apache/spark/assets/56016914/bb9368fb-78a6-4db9-803a-3097eb91ea4d)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

