ericm-db opened a new pull request, #53819:
URL: https://github.com/apache/spark/pull/53819
### What changes were proposed in this pull request?
This PR extends the IDENTIFIED BY syntax to support streaming table-valued
functions (TVFs), complementing the existing support for streaming tables.
The changes include:
- Added grammar rules for streaming TVFs with IDENTIFIED BY clause
- Split `functionTable` into `tableFunctionCall` + clauses for consistent
  syntax
- Added validation that IDENTIFIED BY is only allowed for streaming sources
- Streaming TVFs support two syntaxes:
  1. `STREAM tvf() IDENTIFIED BY name` - clauses inside
  2. `STREAM(tvf()) IDENTIFIED BY name` - clauses outside (consistent with
     table syntax)
- Added comprehensive test coverage for both syntaxes
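As a rough illustration of the two accepted forms and the streaming-only check listed above, here is a toy Python model. It is not Spark's parser (which is grammar-based, not regex-based): the names `parse_relation`, `_FORMS`, `_CALL` and `_NAME` are hypothetical, the WATERMARK and alias clauses are ignored, and only the error text is taken from this PR description.

```python
import re

# Toy model only: every name here is hypothetical and none of this is Spark code.
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
_CALL = rf"{_IDENT}\([^()]*\)"  # a simple TVF call such as range(100)
_NAME = rf"(?:\s+IDENTIFIED\s+BY\s+(?P<name>{_IDENT}))?"  # optional naming clause

_FORMS = [
    # 1. STREAM tvf() IDENTIFIED BY name
    (True, re.compile(rf"STREAM\s+(?P<call>{_CALL}){_NAME}", re.I)),
    # 2. STREAM(tvf()) IDENTIFIED BY name
    (True, re.compile(rf"STREAM\s*\(\s*(?P<call>{_CALL})\s*\){_NAME}", re.I)),
    # Non-streaming TVF: recognized first, then validated below
    (False, re.compile(rf"(?P<call>{_CALL}){_NAME}", re.I)),
]


def parse_relation(text):
    """Return (is_streaming, tvf_call, source_name) for one FROM-clause relation."""
    for streaming, pattern in _FORMS:
        m = pattern.fullmatch(text.strip())
        if m is None:
            continue
        name = m.group("name")
        # Validation step: a source name is only meaningful for streaming sources.
        if name is not None and not streaming:
            raise ValueError(
                "IDENTIFIED BY clause is only supported for streaming sources"
            )
        return streaming, m.group("call"), name
    raise ValueError(f"unrecognized relation: {text!r}")
```

In this sketch both `STREAM range(100) IDENTIFIED BY my_range_source` and `STREAM(range(100)) IDENTIFIED BY my_range_source` yield the same streaming call and source name, while `range(100) IDENTIFIED BY x` raises the validation error.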
### Why are the changes needed?
Streaming TVFs (such as range(), or read_files() when available) need to be
nameable in the same way as streaming tables. This ensures:
- Consistent source naming across all streaming source types
- Better observability for streaming queries using TVFs
- Proper checkpoint management for TVF-based streams
### Does this PR introduce _any_ user-facing change?
Yes. Users can now use IDENTIFIED BY with streaming TVFs:
```sql
-- Non-parenthesized form
SELECT * FROM STREAM range(100) IDENTIFIED BY my_range_source
-- Parenthesized form (clauses outside for consistency)
SELECT * FROM STREAM(range(100)) IDENTIFIED BY my_range_source
-- With watermark and alias (syntax illustration: `ts` stands in for a
-- timestamp column; range() itself only produces `id`)
SELECT * FROM STREAM range(100)
  IDENTIFIED BY my_source
  WATERMARK ts DELAY OF INTERVAL 1 MINUTE
  AS src
```
Using IDENTIFIED BY on non-streaming TVFs produces a clear error:
```
IDENTIFIED BY clause is only supported for streaming sources
```
### How was this patch tested?
- Added comprehensive tests in StreamRelationParserSuite covering:
  - Non-parenthesized form with all clause combinations
  - Parenthesized form with all clause combinations
  - Validation that non-streaming TVFs reject IDENTIFIED BY
- Tests verify correct placement of clauses in both syntaxes
- All existing tests continue to pass
### Was this patch authored or co-authored using generative AI tooling?
No.