andygrove opened a new issue, #3092:
URL: https://github.com/apache/datafusion-comet/issues/3092
## What is the problem the feature request solves?
> **Note:** This issue was generated with AI assistance. The specification
details have been extracted from Spark documentation and may need verification.
Comet does not currently support the Spark `next_day` function, causing
queries using this function to fall back to Spark's JVM execution instead of
running natively on DataFusion.
The `NextDay` expression returns the first date after a given start date
that falls on a specified day of the week. It always advances past the start
date, so the start date itself is never returned even when it already falls
on the target day of the week.
Supporting this expression would allow more Spark workloads to benefit from
Comet's native acceleration.
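To make the "always advances past the start date" rule concrete, the same behavior can be expressed with `java.time` (an illustrative sketch only, not how Spark or Comet implement the expression):
```scala
import java.time.{DayOfWeek, LocalDate}
import java.time.temporal.TemporalAdjusters

// TemporalAdjusters.next moves to the next occurrence strictly after the
// given date, which matches next_day's "same day exclusion" behavior.
val start = LocalDate.parse("2023-01-01") // a Sunday
val nextMonday = start.`with`(TemporalAdjusters.next(DayOfWeek.MONDAY))
// nextMonday == 2023-01-02

val monday = LocalDate.parse("2023-01-02")
val following = monday.`with`(TemporalAdjusters.next(DayOfWeek.MONDAY))
// following == 2023-01-09, not 2023-01-02
```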
## Describe the potential solution
### Spark Specification
**Syntax:**
```sql
next_day(start_date, day_of_week)
```
```scala
// DataFrame API
next_day(col("date_column"), "Monday")
next_day(col("start_date"), col("day_name"))
```
**Arguments:**
| Argument | Type | Description |
|----------|------|-------------|
| startDate | DateType | The starting date from which to find the next occurrence |
| dayOfWeek | StringType (with collation support) | The target day of week as a string (e.g., "Monday", "Tue") |
| failOnError | Boolean | Internal parameter controlling ANSI mode behavior for invalid inputs |
**Return Type:** `DateType`, represented internally as an integer count of days since the Unix epoch (1970-01-01).
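Both Spark's `DateType` and Arrow's `Date32` (the date representation used on the DataFusion side) store a date as a 32-bit count of days since 1970-01-01, so a native kernel effectively maps one day count to another. For illustration only, using `java.time`:
```scala
import java.time.LocalDate

// DateType values are day counts since the Unix epoch; next_day consumes
// one day count and produces another.
val input: Long  = LocalDate.parse("2023-01-01").toEpochDay // 19358
val output: Long = LocalDate.parse("2023-01-02").toEpochDay // 19359, i.e. next_day(input, 'Monday')
val asDate       = LocalDate.ofEpochDay(output)             // 2023-01-02
```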
**Supported Data Types:**
- **Input**: DateType for start date, StringType with collation support for
day of week
- **Output**: DateType
- **Trimming**: Supports trim collation for the day of week string parameter
**Edge Cases:**
- **Null handling**: Returns null if either input is null (null intolerant
behavior)
- **Invalid day names**: Throws `SparkIllegalArgumentException` in ANSI
mode, returns null otherwise
- **Case sensitivity**: Day of week parsing follows `DateTimeUtils` case
handling rules
- **Abbreviations**: Supports abbreviated day names (implementation
dependent on `DateTimeUtils`); the sketch after this list approximates the
accepted forms
- **Same day exclusion**: Never returns the start date itself; always
advances to the next occurrence
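To make the abbreviation and invalid-name cases concrete, here is a rough approximation of the day-name matching a native kernel would have to mirror; the exact set of accepted forms and the case/trim rules are defined by Spark's `DateTimeUtils`, which should be treated as the source of truth:
```scala
import java.time.DayOfWeek

// Illustrative approximation only; verify the accepted forms against
// Spark's DateTimeUtils before relying on them.
def parseDayOfWeek(s: String): Option[DayOfWeek] =
  s.trim.toUpperCase match {
    case "SU" | "SUN" | "SUNDAY"    => Some(DayOfWeek.SUNDAY)
    case "MO" | "MON" | "MONDAY"    => Some(DayOfWeek.MONDAY)
    case "TU" | "TUE" | "TUESDAY"   => Some(DayOfWeek.TUESDAY)
    case "WE" | "WED" | "WEDNESDAY" => Some(DayOfWeek.WEDNESDAY)
    case "TH" | "THU" | "THURSDAY"  => Some(DayOfWeek.THURSDAY)
    case "FR" | "FRI" | "FRIDAY"    => Some(DayOfWeek.FRIDAY)
    case "SA" | "SAT" | "SATURDAY"  => Some(DayOfWeek.SATURDAY)
    case _ => None // ANSI mode: SparkIllegalArgumentException; otherwise the result is null
  }
```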
**Examples:**
```sql
-- Find next Monday after January 1st, 2023
SELECT next_day('2023-01-01', 'Monday');
-- Returns: 2023-01-02
-- Using with column references
SELECT order_date, next_day(order_date, 'Friday') as next_friday
FROM orders;
-- Next Tuesday after current date
SELECT next_day(current_date(), 'Tue');
```
```scala
// DataFrame API examples
import org.apache.spark.sql.functions._
// Find next Monday for each date
df.select(col("start_date"), next_day(col("start_date"), lit("Monday")))
// Dynamic day of week from another column
df.select(next_day(col("event_date"), col("target_day")))
// Using the $ column interpolator (requires import spark.implicits._)
df.withColumn("next_sunday", next_day($"date_col", "Sunday"))
```
### Implementation Approach
See the [Comet guide on adding new
expressions](https://datafusion.apache.org/comet/contributor-guide/adding_a_new_expression.html)
for detailed instructions.
1. **Scala Serde**: Add an expression handler in
`spark/src/main/scala/org/apache/comet/serde/` (a rough sketch follows this list)
2. **Register**: Add to appropriate map in `QueryPlanSerde.scala`
3. **Protobuf**: Add message type in `native/proto/src/proto/expr.proto` if
needed
4. **Rust**: Implement in `native/spark-expr/src/` (check if DataFusion has
built-in support first)
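A very rough sketch of what step 1 could look like, purely to show the shape of the change; the helper names and signatures below (`exprToProtoInternal`, `scalarFunctionExprToProto`, the match site) are assumptions, so follow whatever pattern the existing date/time handlers in `QueryPlanSerde.scala` use today:
```scala
// HYPOTHETICAL sketch, not the actual Comet API: helper names and
// signatures are assumptions; use an existing date/time expression
// handler in QueryPlanSerde.scala as the real template.
case nd: NextDay =>
  val startProto = exprToProtoInternal(nd.startDate, inputs, binding)
  val dayProto   = exprToProtoInternal(nd.dayOfWeek, inputs, binding)
  // Resolve to a scalar function on the native side: either a DataFusion
  // built-in, if one exists, or a new kernel under native/spark-expr/src/.
  // ANSI behavior (failOnError) must either be carried through or trigger
  // a fallback to Spark.
  scalarFunctionExprToProto("next_day", startProto, dayProto)
```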
## Additional context
**Difficulty:** Medium
**Spark Expression Class:**
`org.apache.spark.sql.catalyst.expressions.NextDay`
**Related:**
- `date_add` - Add days to a date
- `date_sub` - Subtract days from a date
- `dayofweek` - Extract day of week from date
- `last_day` - Get last day of month
- `DateTimeUtils` - Underlying utility class for date operations
---
*This issue was auto-generated from Spark reference documentation.*