Mihaly Hazag created SPARK-33322:
------------------------------------
Summary: Dataframe: data is wrongly presented because of column name
Key: SPARK-33322
URL: https://issues.apache.org/jira/browse/SPARK-33322
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 2.4.5
Reporter: Mihaly Hazag
Consider the code below: the `some_text` column receives the `some_int` value,
while `some_int` itself is null in the DataFrame.
!image-2020-11-03-14-42-52-840.png!
Renaming the field from `some_text` to `some_apple` fixed the problem! 🙂
!image-2020-11-03-14-43-13-528.png!
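One possible explanation, sketched here in plain Python rather than taken from Spark's source: in PySpark 2.4 a `Row` built from keyword arguments stores its fields sorted by name, so if the values are then matched to the schema by position instead of by name, `some_int` and `some_text` trade places. The helper names `sorted_row_values` and `map_positionally` are hypothetical and only mimic that assumed behavior.

```python
def sorted_row_values(**fields):
    """Mimic a Row that keeps keyword fields sorted by name (assumed 2.4 behavior)."""
    names = sorted(fields)
    return names, [fields[name] for name in names]


def map_positionally(schema_names, values):
    """Pair schema columns with values by position, ignoring field names."""
    return dict(zip(schema_names, values))


# Schema order from the reproduction: dfdt, some_text, some_int
schema_names = ["dfdt", "some_text", "some_int"]
_, values = sorted_row_values(dfdt="2020-12-18", some_text="cdsvg", some_int=100)
mismatched = map_positionally(schema_names, values)

# After the rename, alphabetical order and schema order coincide
renamed_schema = ["dfdt", "some_apple", "some_int"]
_, renamed_values = sorted_row_values(dfdt="2020-12-18", some_apple="cdsvg", some_int=100)
aligned = map_positionally(renamed_schema, renamed_values)

print(mismatched)
print(aligned)
```

Under this assumption the rename works only because `some_apple` sorts before `some_int`, which happens to match the schema order.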
Here is the code to reproduce the problem:
{code:python}
from datetime import datetime

from pyspark.sql import Row
from pyspark.sql.types import (
    StructType, StructField, DateType, StringType, IntegerType,
)

# Schema order: dfdt, some_text, some_int
schema = StructType(
    [
        StructField('dfdt', DateType(), True),
        StructField('some_text', StringType(), True),
        StructField('some_int', IntegerType(), True),
    ]
)

# `spark` is assumed to be an existing SparkSession
test_df = spark.createDataFrame([
    Row(dfdt=datetime.strptime('2020-12-18', '%Y-%m-%d'), some_text='cdsvg',
        some_int=100)
], schema)

display(test_df)  # display() as provided by the notebook environment
{code}
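A possible workaround, offered as a minimal sketch and not as a confirmed fix: build each record as a plain tuple in schema order, so that positional matching and the schema agree regardless of column names. The helper `to_schema_tuple` is hypothetical, and the field list is written out by hand here; with the schema above it could presumably be taken from `schema.names`.

```python
from datetime import date


def to_schema_tuple(record, schema_names):
    """Return the record's values as a tuple ordered like schema_names."""
    missing = [name for name in schema_names if name not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    return tuple(record[name] for name in schema_names)


# Field order copied from the schema in the reproduction
schema_names = ["dfdt", "some_text", "some_int"]
record = {"some_int": 100, "some_text": "cdsvg", "dfdt": date(2020, 12, 18)}
ordered = to_schema_tuple(record, schema_names)

# With a live session, the tuple could then be passed as:
#   spark.createDataFrame([ordered], schema)
print(ordered)
```

Because a tuple carries no field names, nothing is left to be reordered before the values reach the schema.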