[ https://issues.apache.org/jira/browse/SPARK-33995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthew Powers updated SPARK-33995: ----------------------------------- Affects Version/s: (was: 3.1.0) 3.2.0 > Make datetime addition easier for years, weeks, hours, minutes, and seconds > --------------------------------------------------------------------------- > > Key: SPARK-33995 > URL: https://issues.apache.org/jira/browse/SPARK-33995 > Project: Spark > Issue Type: New Feature > Components: SQL > Affects Versions: 3.2.0 > Reporter: Matthew Powers > Priority: Minor > > There are add_months and date_add functions that make it easy to perform > datetime addition with months and days, but there isn't an easy way to > perform datetime addition with years, weeks, hours, minutes, or seconds with > the Scala/Python/R APIs. > Users need to write code like expr("first_datetime + INTERVAL 2 hours") to > add two hours to a timestamp with the Scala API, which isn't desirable. We > don't want to make Scala users manipulate SQL strings. > We can expose the [make_interval SQL > function|https://github.com/apache/spark/pull/26446/files] to make any > combination of datetime addition possible. That'll make tons of different > datetime addition operations possible and will be valuable for a wide array > of users. > make_interval takes 7 arguments: years, months, weeks, days, hours, mins, and > secs. > There are different ways to expose the make_interval functionality to > Scala/Python/R users: > * Option 1: Single make_interval function that takes 7 arguments > * Option 2: expose a few interval functions > ** make_date_interval function that takes years, months, days > ** make_time_interval function that takes hours, minutes, seconds > ** make_datetime_interval function that takes years, months, days, hours, > minutes, seconds > * Option 3: expose add_years, add_months, add_days, add_weeks, add_hours, > add_minutes, and add_seconds as Column methods. > * Option 4: Expose the add_years, add_hours, etc. as column functions. > add_weeks and date_add have already been exposed in this manner. > Option 1 is nice from a maintenance perspective cause it's a single function, > but it's not standard from a user perspective. Most languages support > datetime instantiation with these arguments: years, months, days, hours, > minutes, seconds. Mixing weeks into the equation is not standard. > As a user, Option 3 would be my preference. > col("first_datetime").addHours(2).addSeconds(30) is easy for me to remember > and type. col("first_datetime") + make_time_interval(lit(2), lit(0), > lit(30)) isn't as nice. col("first_datetime") + make_interval(lit(0), > lit(0), lit(0), lit(0), lit(2), lit(0), lit(30)) is harder still. > Any of these options is an improvement to the status quo. Let me know what > option you think is best and then I'll make a PR to implement it, building > off of Max's foundational work of course ;) -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org