andygrove opened a new issue, #3129:
URL: https://github.com/apache/datafusion-comet/issues/3129

   ## What is the problem the feature request solves?
   
   > **Note:** This issue was generated with AI assistance. The specification 
details have been extracted from Spark documentation and may need verification.
   
   Comet does not currently support the Spark `seconds_of_time` function, 
causing queries using this function to fall back to Spark's JVM execution 
instead of running natively on DataFusion.
   
   The `SecondsOfTime` expression extracts the seconds component from a 
time-related value. This is a unary expression that operates on a single child 
expression and returns the seconds portion as an integer value.
   
   Supporting this expression would allow more Spark workloads to benefit from 
Comet's native acceleration.
   
   ## Describe the potential solution
   
   ### Spark Specification
   
   **Syntax:**
   ```sql
   -- SQL syntax (implementation-dependent)
   SECONDS(time_expression)
   ```
   
   ```scala
   // DataFrame API usage
   import org.apache.spark.sql.catalyst.expressions.SecondsOfTime
   SecondsOfTime(child_expression)
   ```
   
   **Arguments:**
   | Argument | Type | Description |
   |----------|------|-------------|
   | child | Expression | The input expression containing time data from which to extract seconds |
   
   **Return Type:** Integer type representing the seconds component (typically 
0-59).
   
   **Supported Data Types:**
   - Timestamp types
   - Time-related string formats
   - Date/time structures that contain seconds information
   
   **Edge Cases:**
   - Null handling: Returns null if the child expression evaluates to null
   - Invalid time formats: Behavior depends on the underlying time parsing 
implementation
   - Leap seconds: Handles the standard 0-59 seconds range; handling of leap seconds (a value of 60) may vary
   - Timezone considerations: Seconds extraction is typically 
timezone-independent
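
   The null-propagation and range behavior above can be sketched in plain Rust. This is an illustrative snippet, not Comet code, and it assumes (hypothetically) a time-of-day value stored as microseconds since midnight; the actual physical representation should be verified against Spark's `TimeType`:

   ```rust
   /// Illustrative sketch only: assumes a time value encoded as microseconds
   /// since midnight (an assumption, not necessarily Spark's actual layout).
   fn seconds_of_time(micros_since_midnight: Option<i64>) -> Option<i32> {
       // Null in, null out: mirrors the null-handling edge case above.
       micros_since_midnight.map(|us| ((us / 1_000_000) % 60) as i32)
   }

   fn main() {
       // 14:30:45 -> 45 seconds past the minute
       let t = Some((14 * 3600 + 30 * 60 + 45) * 1_000_000i64);
       println!("{:?}", seconds_of_time(t));    // Some(45)
       println!("{:?}", seconds_of_time(None)); // None
   }
   ```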
   
   **Examples:**
   ```sql
   -- Example SQL usage (hypothetical)
   SELECT SECONDS(current_timestamp()) as current_seconds;
   SELECT SECONDS('2023-12-25 14:30:45') as extracted_seconds; -- Returns 45
   ```
   
   ```scala
   // Example DataFrame API usage
   import org.apache.spark.sql.Column
   import org.apache.spark.sql.functions._
   import org.apache.spark.sql.catalyst.expressions.SecondsOfTime

   // A Catalyst expression must be wrapped in a Column before it can be
   // used in a DataFrame transformation
   df.select(new Column(SecondsOfTime(col("timestamp_column").expr)))
   ```
   
   ### Implementation Approach
   
   See the [Comet guide on adding new 
expressions](https://datafusion.apache.org/comet/contributor-guide/adding_a_new_expression.html)
 for detailed instructions.
   
   1. **Scala Serde**: Add expression handler in 
`spark/src/main/scala/org/apache/comet/serde/`
   2. **Register**: Add to appropriate map in `QueryPlanSerde.scala`
   3. **Protobuf**: Add message type in `native/proto/src/proto/expr.proto` if 
needed
   4. **Rust**: Implement in `native/spark-expr/src/` (check if DataFusion has 
built-in support first)
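
   As a rough sketch of step 4, the native side is essentially the per-value seconds extraction applied across a nullable column. The snippet below is dependency-free Rust for illustration only; a real implementation would operate on Arrow arrays through DataFusion's scalar-function machinery, and every name here is hypothetical:

   ```rust
   /// Hypothetical column-level kernel: extracts the seconds component from a
   /// nullable column. Real Comet code would use Arrow arrays, not Vec<Option<_>>,
   /// and the microseconds-since-midnight encoding is an assumption.
   fn seconds_of_time_kernel(col: &[Option<i64>]) -> Vec<Option<i32>> {
       col.iter()
           .map(|v| v.map(|us| ((us / 1_000_000) % 60) as i32))
           .collect()
   }

   fn main() {
       // 14:30:45 -> 45; null -> null; 00:01:01 -> 1
       let input = vec![Some(52_245_000_000i64), None, Some(61_000_000)];
       println!("{:?}", seconds_of_time_kernel(&input)); // [Some(45), None, Some(1)]
   }
   ```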
   
   
   ## Additional context
   
   **Difficulty:** Medium
   **Spark Expression Class:** 
`org.apache.spark.sql.catalyst.expressions.SecondsOfTime`
   
   **Related:**
   - `MinutesOfTime` - Extracts minutes component
   - `HoursOfTime` - Extracts hours component  
   - `second()` - Built-in SQL function for seconds extraction
   - Time extraction expressions in Spark Catalyst
   
   ---
   *This issue was auto-generated from Spark reference documentation.*
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to