[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

stczwd Thu, 11 Oct 2018 18:27:01 -0700

Github user stczwd commented on the issue:

    https://github.com/apache/spark/pull/22575
  
    
    @WangTaoTheTonic 
    Adding the 'stream' keyword serves two purposes:
    
    - **Mark the entire SQL query as a streaming query and generate the 
SQLStreaming plan tree.**
    - **Mark the table type as UnResolvedStreamRelation.** The table is then 
parsed as a StreamingRelation or as another Relation, which matters especially 
for stream-join-batch queries such as Kafka joined with MySQL.
    
    **Besides, the 'stream' keyword makes it easier to express Structured 
Streaming in pure SQL.**
    A small example to show the importance of 'stream': read a stream from a 
Kafka stream table and join it with MySQL to count user messages.
    
      - with 'stream'
        - `select stream kafka_sql_test.name, count(door)  from kafka_sql_test 
inner join mysql_test on kafka_sql_test.name == mysql_test.name group by 
kafka_sql_test.name`
          - **It will be treated as a streaming query using the Console Sink**; 
kafka_sql_test will be parsed as a StreamingRelation and mysql_test will be 
parsed as a JDBCRelation, not a StreamingRelation.
        - `insert into csv_sql_table select stream kafka_sql_test.name, 
count(door)  from kafka_sql_test inner join mysql_test on kafka_sql_test.name 
== mysql_test.name group by kafka_sql_test.name`
          - **It will be treated as a streaming query using the FileStream 
Sink**; kafka_sql_test will be parsed as a StreamingRelation and mysql_test 
will be parsed as a JDBCRelation, not a StreamingRelation.
    
      - without 'stream'
        - `select kafka_sql_test.name, count(door) from kafka_sql_test inner 
join mysql_test on kafka_sql_test.name == mysql_test.name group by 
kafka_sql_test.name`
          - **It will be treated as a batch query**; kafka_sql_test will be 
parsed as a KafkaRelation and mysql_test will be parsed as a JDBCRelation.
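
The three cases above can be summarized as a small decision table. The sketch below is only a toy model in Python of the behaviour described in this comment, not the parser code from the PR: the `CATALOG` mapping, the `classify` function, and the regex-based table extraction are all hypothetical names and simplifications introduced for illustration.

```python
import re

# Hypothetical catalog (assumption): table name -> underlying data source.
CATALOG = {
    "kafka_sql_test": "kafka",
    "mysql_test": "jdbc",
    "csv_sql_table": "csv",
}


def classify(sql, catalog=CATALOG):
    """Toy model of how the 'stream' keyword changes query handling."""
    text = sql.strip()

    # The query is a streaming query only if 'stream' follows 'select'.
    is_stream = re.search(r"\bselect\s+stream\b", text, re.IGNORECASE) is not None

    # Collect the table names that appear after FROM or JOIN.
    tables = re.findall(r"\b(?:from|join)\s+(\w+)", text, re.IGNORECASE)

    relations = {}
    for table in tables:
        source = catalog[table]
        if source == "kafka":
            # Kafka is read as a stream only when the query is marked 'stream'.
            relations[table] = "StreamingRelation" if is_stream else "KafkaRelation"
        elif source == "jdbc":
            # The MySQL side stays a batch relation in both modes.
            relations[table] = "JDBCRelation"
        else:
            relations[table] = "OtherRelation"

    if not is_stream:
        return {"mode": "batch", "sink": None, "relations": relations}

    # Streaming query: the sink depends on whether there is an INSERT target.
    target = re.match(r"insert\s+into\s+(\w+)", text, re.IGNORECASE)
    if target is None:
        sink = "Console Sink"
    elif catalog.get(target.group(1)) == "csv":
        sink = "FileStream Sink"
    else:
        sink = "unspecified"  # not covered by the examples in this comment
    return {"mode": "streaming", "sink": sink, "relations": relations}


if __name__ == "__main__":
    body = ("kafka_sql_test.name, count(door) from kafka_sql_test "
            "inner join mysql_test on kafka_sql_test.name == mysql_test.name "
            "group by kafka_sql_test.name")
    for query in ("select stream " + body,
                  "insert into csv_sql_table select stream " + body,
                  "select " + body):
        print(classify(query))
```

The point of the model is that the keyword is the only input that flips kafka_sql_test between StreamingRelation and KafkaRelation, while mysql_test stays a JDBCRelation in every case.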



---

