Patrick Young created SPARK-23612:
-------------------------------------

             Summary: Specify formats for individual DateType and TimestampType columns in schemas
                 Key: SPARK-23612
                 URL: https://issues.apache.org/jira/browse/SPARK-23612
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 2.3.0
            Reporter: Patrick Young

[https://github.com/apache/spark/blob/407f67249639709c40c46917700ed6dd736daa7d/python/pyspark/sql/types.py#L162-L200]

It would be very helpful to be able to specify the format for individual columns in a schema when reading CSV files, rather than a single format that applies to every date column:

{code:title=Bar.python|borderStyle=solid}
from pyspark.sql.types import StructType, StructField, DateType

# Currently can only do something like:
spark.read.option("dateFormat", "yyyyMMdd").csv(...)

# Would like to be able to do something like:
schema = StructType([
    StructField("date1", DateType(format="MM/dd/yyyy"), True),
    StructField("date2", DateType(format="yyyyMMdd"), True)
])
spark.read.schema(schema).csv(...)
{code}
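
A possible workaround with the existing API, offered only as a sketch: read the date columns as plain strings and convert each one afterwards with `to_date`, which accepts a per-call pattern. The helper names `parse_date_columns` and `read_csv_with_date_formats` and the `DATE_FORMATS` mapping are hypothetical and introduced here for illustration; they are not part of PySpark. The sketch also assumes the CSV is read without a schema and without `inferSchema`, so every column arrives as a string.

```python
# Per-column patterns, taken from the example schema in this issue.
DATE_FORMATS = {"date1": "MM/dd/yyyy", "date2": "yyyyMMdd"}


def parse_date_columns(df, formats):
    """Return df with each named string column converted to DateType."""
    # Imported lazily so this module can be loaded without a Spark install.
    from pyspark.sql.functions import col, to_date

    for name, pattern in formats.items():
        # to_date(column, format) parses the string using that column's pattern.
        df = df.withColumn(name, to_date(col(name), pattern))
    return df


def read_csv_with_date_formats(spark, path, formats, **csv_options):
    """Read a CSV with all columns as strings, then parse the date columns."""
    # Hypothetical helper: no schema is passed, so columns are read as strings.
    raw = spark.read.options(**csv_options).csv(path)
    return parse_date_columns(raw, formats)
```

Usage would look something like `read_csv_with_date_formats(spark, path, DATE_FORMATS, header=True)`, where `path` is whatever file is being loaded. This costs an extra projection per date column and keeps the formats outside the schema, which is exactly what the proposed `DateType(format=...)` would avoid.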