andygrove opened a new issue, #3129:
URL: https://github.com/apache/datafusion-comet/issues/3129
## What is the problem the feature request solves?
> **Note:** This issue was generated with AI assistance. The specification
details have been extracted from Spark documentation and may need verification.
Comet does not currently support the Spark `seconds_of_time` function, so
queries that use it fall back to Spark's JVM execution instead of running
natively on DataFusion.
The `SecondsOfTime` expression extracts the seconds component from a
time-related value. It is a unary expression: it takes a single child
expression and returns the seconds portion as an integer.
Supporting this expression would allow more Spark workloads to benefit from
Comet's native acceleration.
## Describe the potential solution
### Spark Specification
**Syntax:**
```sql
-- SQL syntax (implementation-dependent)
SECONDS(time_expression)
```
```scala
// Catalyst expression constructor (internal API)
import org.apache.spark.sql.catalyst.expressions.SecondsOfTime
SecondsOfTime(child_expression)
```
**Arguments:**
| Argument | Type | Description |
|----------|------|-------------|
| child | Expression | The input expression containing time data from which to extract seconds |
**Return Type:** Integer type representing the seconds component (typically
0-59).
**Supported Data Types:**
- Timestamp types
- Time-related string formats
- Date/time structures that contain seconds information
**Edge Cases:**
- Null handling: Returns null if the child expression evaluates to null (see
the sketch after this list)
- Invalid time formats: Behavior depends on the underlying time-parsing
implementation
- Leap seconds: The standard 0-59 range is handled; special leap-second
handling may vary
- Timezone considerations: Seconds extraction is typically
timezone-independent
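The null-propagation rule can be sanity-checked directly against the Catalyst
expression. A minimal sketch, assuming the internal `SecondsOfTime(child)`
constructor and a `TimeType()` default constructor, both of which may differ
across Spark versions:
```scala
import org.apache.spark.sql.catalyst.expressions.{Literal, SecondsOfTime}
import org.apache.spark.sql.types.TimeType

// Internal API; constructor shapes here are assumptions and may vary by
// Spark version. A null child should evaluate to null rather than throwing.
val result = SecondsOfTime(Literal(null, TimeType())).eval()
assert(result == null)
```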
**Examples:**
```sql
-- Example SQL usage (hypothetical)
SELECT SECONDS(current_timestamp()) as current_seconds;
SELECT SECONDS('2023-12-25 14:30:45') as extracted_seconds; -- Returns 45
```
```scala
// Example DataFrame API usage. SecondsOfTime is an internal Catalyst
// expression rather than a public function, so it must be wrapped in a
// Column (Column construction from an Expression varies by Spark version).
import org.apache.spark.sql.Column
import org.apache.spark.sql.catalyst.expressions.SecondsOfTime
import org.apache.spark.sql.functions.col

// Using in a DataFrame transformation
df.select(new Column(SecondsOfTime(col("timestamp_column").expr)))
```
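If the expression is implemented, a regression test in the style of Comet's
existing suites could pin down both correctness and native execution. This is
a hedged sketch: `checkSparkAnswerAndOperator` is the helper Comet's test
base uses to compare results against Spark and assert no fallback occurred,
while the TIME-typed table and literal syntax are assumptions that depend on
the Spark version under test.
```scala
// Hypothetical test for spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala
test("seconds_of_time") {
  withTable("t") {
    // Assumption: SecondsOfTime applies to Spark's TIME type, so the input
    // column is declared as TIME (availability depends on the Spark version).
    sql("CREATE TABLE t(c TIME) USING parquet")
    sql("INSERT INTO t VALUES (TIME'14:30:45'), (null)")
    // Compares against Spark and asserts the plan ran natively in Comet,
    // covering both the value case (45) and null propagation.
    checkSparkAnswerAndOperator("SELECT second(c) FROM t")
  }
}
```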
### Implementation Approach
See the [Comet guide on adding new
expressions](https://datafusion.apache.org/comet/contributor-guide/adding_a_new_expression.html)
for detailed instructions.
1. **Scala Serde**: Add an expression handler in
`spark/src/main/scala/org/apache/comet/serde/` (see the sketch after this
list)
2. **Register**: Add to appropriate map in `QueryPlanSerde.scala`
3. **Protobuf**: Add message type in `native/proto/src/proto/expr.proto` if
needed
4. **Rust**: Implement in `native/spark-expr/src/` (check if DataFusion has
built-in support first)
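A rough sketch of steps 1 and 2, modeled on the handlers shown in the Comet
guide. `CometSecondsOfTime` is a hypothetical name, the exact
`CometExpressionSerde` trait shape, helper signatures, and registration map
vary across Comet versions, and `"second"` as the native function name is an
assumption to be checked against what DataFusion actually provides:
```scala
package org.apache.comet.serde

import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression}

import org.apache.comet.serde.ExprOuterClass.Expr
import org.apache.comet.serde.QueryPlanSerde.{exprToProtoInternal, scalarFunctionExprToProto}

// Hypothetical handler; trait and helper shapes vary by Comet version.
object CometSecondsOfTime extends CometExpressionSerde {
  override def convert(
      expr: Expression,
      inputs: Seq[Attribute],
      binding: Boolean): Option[Expr] = {
    // Serialize the single child, then map the expression to a native
    // scalar function. The function name "second" is a placeholder: check
    // whether DataFusion already ships a usable kernel before writing one.
    val childExpr = exprToProtoInternal(expr.children.head, inputs, binding)
    scalarFunctionExprToProto("second", childExpr)
  }
}

// Step 2, registration in QueryPlanSerde.scala (map name illustrative):
//   classOf[SecondsOfTime] -> CometSecondsOfTime
```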
## Additional context
**Difficulty:** Medium
**Spark Expression Class:**
`org.apache.spark.sql.catalyst.expressions.SecondsOfTime`
**Related:**
- `MinutesOfTime` - Extracts minutes component
- `HoursOfTime` - Extracts hours component
- `second()` - Built-in SQL function for seconds extraction
- Time extraction expressions in Spark Catalyst
---
*This issue was auto-generated from Spark reference documentation.*