ericm-db opened a new pull request, #53805:
URL: https://github.com/apache/spark/pull/53805

   ### What changes were proposed in this pull request?
   
   This PR adds SQL syntax support for naming streaming sources using the 
`IDENTIFIED BY` clause. Users can now specify a name for a streaming source 
directly in SQL queries:
   
   ```sql
   SELECT * FROM STREAM table_name IDENTIFIED BY source_name
   ```
   
   The changes include:
   - Added the `IDENTIFIED` keyword to the SQL lexer and parser
   - Added the `identifiedByClause` grammar rule in `SqlBaseParser.g4`
   - Updated `AstBuilder` to extract and apply user-provided source names
   - Added the factory method `NamedStreamingRelation.withUserProvidedName` to 
create named relations consistently
   - Updated keyword lists in documentation and test suites
   - Added parser tests in `StreamRelationParserSuite`
   
   ### Why are the changes needed?
   
   This provides a SQL-native way to name streaming sources, complementing the 
existing programmatic `.name()` API. Named sources are essential for:
   - Tracking source metadata through the streaming pipeline
   - Keeping checkpoint locations stable as sources evolve
   - Improving observability and debugging of streaming queries
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Users can now use the `IDENTIFIED BY` clause in streaming SQL queries:
   
   **Before:**
   ```sql
   SELECT * FROM STREAM my_table  -- Source gets an auto-generated name
   ```
   
   **After:**
   ```sql
   SELECT * FROM STREAM my_table IDENTIFIED BY my_source_name
   ```
   
   The syntax supports:
   - Basic form: `STREAM table IDENTIFIED BY name`
   - With options: `STREAM table WITH (key='value') IDENTIFIED BY name`
   - With alias: `STREAM table IDENTIFIED BY name AS alias`
   - Multiple sources in joins with different names
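   
   The forms listed above might look as follows in practice. This is a sketch 
based only on the syntax described in this PR: the table names, source names, 
option key and value, and join columns are placeholders chosen for illustration 
and are not taken from the patch or its tests.
   
   ```sql
   -- With options (the option key and value are placeholders)
   SELECT * FROM STREAM events WITH (some_option='some_value') IDENTIFIED BY events_source;

   -- With alias: the source name comes before the alias
   SELECT e.* FROM STREAM events IDENTIFIED BY events_source AS e;

   -- Two streaming sources in a join, each with its own name
   SELECT o.id, p.amount
   FROM STREAM orders IDENTIFIED BY orders_source AS o
   JOIN STREAM payments IDENTIFIED BY payments_source AS p
     ON o.id = p.order_id;

   -- Backtick-quoted source name
   SELECT * FROM STREAM events IDENTIFIED BY `events_source`;
   ```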
   
   ### How was this patch tested?
   
   - Added 6 new test cases in `StreamRelationParserSuite` covering:
     - Basic IDENTIFIED BY syntax
     - Combination with aliases
     - Combination with WITH options
     - Multiple streaming sources in joins
     - Backtick-quoted identifiers
   - Updated existing test suites for the new `IDENTIFIED` keyword
   - All tests pass with the new syntax
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

