MrPowers commented on pull request #29935:
URL: https://github.com/apache/spark/pull/29935#issuecomment-761705061


@zero323 - Adding `CalendarIntervalType` to PySpark is a great idea.

[CalendarIntervalType](https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/types/CalendarIntervalType.html) is already in the Scala API and allows for some awesome functionality.

Here's the Spark 3.0.1 behavior in Scala:
   
```scala
import java.sql.Date
import org.apache.spark.sql.functions._
import spark.implicits._ // for toDF and the $ column syntax (pre-imported in spark-shell)

val df = Seq(
  (Date.valueOf("2021-01-23"), Date.valueOf("2021-01-21"))
).toDF("date1", "date2")

// Subtracting two date columns yields a CalendarIntervalType column
df.withColumn("new_datediff", $"date1" - $"date2").show()
//+----------+----------+------------+
//|     date1|     date2|new_datediff|
//+----------+----------+------------+
//|2021-01-23|2021-01-21|      2 days|
//+----------+----------+------------+

df.withColumn("new_datediff", $"date1" - $"date2").printSchema()
//root
// |-- date1: date (nullable = true)
// |-- date2: date (nullable = true)
// |-- new_datediff: interval (nullable = true)
```
   
Getting this functionality in PySpark would be a huge win.
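
For reference, here's a rough sketch of what the PySpark equivalent might look like once `CalendarIntervalType` is exposed. This doesn't work on PySpark 3.0.1 today (that's the point of this PR), so the schema output shown is the hypothetical result mirroring the Scala example above:

```python
import datetime

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(datetime.date(2021, 1, 23), datetime.date(2021, 1, 21))],
    ["date1", "date2"],
)

# Hypothetical: with CalendarIntervalType in pyspark.sql.types, subtracting
# two date columns would surface an interval column, as it does in Scala.
df.withColumn("new_datediff", col("date1") - col("date2")).printSchema()
# root
#  |-- date1: date (nullable = true)
#  |-- date2: date (nullable = true)
#  |-- new_datediff: interval (nullable = true)
```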
   
Let me know if there is anything I can do to help you move this PR forward, @zero323!

