andygrove opened a new issue, #3092:
URL: https://github.com/apache/datafusion-comet/issues/3092

   ## What is the problem the feature request solves?
   
   > **Note:** This issue was generated with AI assistance. The specification 
details have been extracted from Spark documentation and may need verification.
   
   Comet does not currently support the Spark `next_day` function, causing 
queries using this function to fall back to Spark's JVM execution instead of 
running natively on DataFusion.
   
   The `NextDay` expression returns the first date after a given start date 
that falls on a specified day of the week. It advances from the start date to 
find the next occurrence of the target day, excluding the start date itself 
even if it matches the target day of week.
   
   Supporting this expression would allow more Spark workloads to benefit from 
Comet's native acceleration.
   
   ## Describe the potential solution
   
   ### Spark Specification
   
   **Syntax:**
   ```sql
   next_day(start_date, day_of_week)
   ```
   
   ```scala
   // DataFrame API
   next_day(col("date_column"), "Monday")
   next_day(col("start_date"), col("day_name"))
   ```
   
   **Arguments:**
   | Argument | Type | Description |
   |----------|------|-------------|
   | startDate | DateType | The starting date from which to find the next occurrence |
   | dayOfWeek | StringType (with collation support) | The target day of week as a string (e.g., "Monday", "Tue") |
   | failOnError | Boolean | Internal parameter controlling ANSI mode behavior for invalid inputs |
   
   **Return Type:** `DateType`, represented internally as an integer count of days since the epoch (1970-01-01).
   
   **Supported Data Types:**
   - **Input**: DateType for start date, StringType with collation support for 
day of week
   - **Output**: DateType
   - **Trimming**: Supports trim collation for the day of week string parameter
   
   **Edge Cases:**
   - **Null handling**: Returns null if either input is null (null-intolerant behavior)
   - **Invalid day names**: Throws `SparkIllegalArgumentException` in ANSI mode, returns null otherwise
   - **Case sensitivity**: Day-of-week parsing follows `DateTimeUtils` case-handling rules
   - **Abbreviations**: Supports abbreviated day names (the accepted forms are defined by `DateTimeUtils`)
   - **Same day exclusion**: Never returns the start date itself; always advances to the next occurrence
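   
   Given the days-since-epoch representation of `DateType`, the same-day exclusion rule reduces the core computation to modular arithmetic. A minimal sketch in Rust (not Comet's actual kernel; the function names are illustrative), using the fact that epoch day 0, 1970-01-01, was a Thursday:
   
   ```rust
   /// ISO day of week for an epoch day: 1 = Monday .. 7 = Sunday.
   /// Epoch day 0 (1970-01-01) was a Thursday, hence the +3 offset.
   fn day_of_week(epoch_day: i32) -> i32 {
       // rem_euclid keeps the result non-negative for pre-epoch dates
       (epoch_day + 3).rem_euclid(7) + 1
   }
   
   /// First epoch day strictly after `epoch_day` that falls on `target` (1..=7).
   fn next_day(epoch_day: i32, target: i32) -> i32 {
       // delta is always in 1..=7, so a start date that already matches
       // `target` rolls forward a full week (same-day exclusion)
       let delta = (target - day_of_week(epoch_day) - 1).rem_euclid(7) + 1;
       epoch_day + delta
   }
   ```
   
   A wrapper around this would propagate null for null inputs and apply the ANSI-mode error behavior for unparseable day names.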
   
   **Examples:**
   ```sql
   -- Find next Monday after January 1st, 2023
   SELECT next_day('2023-01-01', 'Monday');
   -- Returns: 2023-01-02
   
   -- Using with column references
   SELECT order_date, next_day(order_date, 'Friday') as next_friday
   FROM orders;
   
   -- Next Tuesday after current date
   SELECT next_day(current_date(), 'Tue');
   ```
   
   ```scala
   // DataFrame API examples
   import org.apache.spark.sql.functions._
   
   // Find next Monday for each date
   df.select(col("start_date"), next_day(col("start_date"), lit("Monday")))
   
   // Dynamic day of week from another column  
   df.select(next_day(col("event_date"), col("target_day")))
   
   // Using string interpolation
   df.withColumn("next_sunday", next_day($"date_col", "Sunday"))
   ```
   
   ### Implementation Approach
   
   See the [Comet guide on adding new 
expressions](https://datafusion.apache.org/comet/contributor-guide/adding_a_new_expression.html)
 for detailed instructions.
   
   1. **Scala Serde**: Add expression handler in 
`spark/src/main/scala/org/apache/comet/serde/`
   2. **Register**: Add to appropriate map in `QueryPlanSerde.scala`
   3. **Protobuf**: Add message type in `native/proto/src/proto/expr.proto` if 
needed
   4. **Rust**: Implement in `native/spark-expr/src/` (check if DataFusion has 
built-in support first)
   
   
   ## Additional context
   
   **Difficulty:** Medium
   **Spark Expression Class:** 
`org.apache.spark.sql.catalyst.expressions.NextDay`
   
   **Related:**
   - `date_add` - Add days to a date
   - `date_sub` - Subtract days from a date  
   - `dayofweek` - Extract day of week from date
   - `last_day` - Get last day of month
   - `DateTimeUtils` - Underlying utility class for date operations
   
   ---
   *This issue was auto-generated from Spark reference documentation.*
   

