[GitHub] [spark] HeartSaVioR opened a new pull request #29461: [DO-NOT-MERGE][SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement with streaming Dataset

GitBox Tue, 18 Aug 2020 00:01:41 -0700


HeartSaVioR opened a new pull request #29461:
URL: https://github.com/apache/spark/pull/29461



   ### What changes were proposed in this pull request?
   
   This patch updates the docs (both the Structured Streaming guide and the 
Dataset dropDuplicates method doc) to add a note about using SQL statements 
with a streaming Dataset.
   
   Once end users create a temp view based on a streaming Dataset, they tend 
to stop thinking about "streaming" and write whatever they would write for a 
batch query. In many cases this works, but not smoothly when streaming 
aggregation is involved: users still need to be concerned with maintaining the 
state store.
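   
   As an illustration only (this snippet is not part of the patch), the 
situation can be sketched as below. The built-in `rate` source, the view name 
`events`, and the window and watermark durations are all assumptions chosen 
for the sketch. The SQL statement looks like a batch query, but it runs as a 
streaming aggregation and keeps state; the watermark set on the Dataset before 
creating the view is what lets old state be dropped.

```scala
import org.apache.spark.sql.SparkSession

object StreamingViewSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("streaming-temp-view-sketch")
      .getOrCreate()

    // Built-in "rate" test source: emits rows with columns (timestamp, value).
    // The watermark is defined here, on the Dataset, before the view is created.
    val events = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()
      .withWatermark("timestamp", "10 seconds")

    // From here on the query is plain SQL, but the view is still streaming.
    events.createOrReplaceTempView("events")

    // Reads like a batch aggregation; it is executed as a stateful
    // streaming aggregation backed by the state store.
    val counts = spark.sql(
      """SELECT window(timestamp, '5 seconds') AS w, count(*) AS cnt
        |FROM events
        |GROUP BY window(timestamp, '5 seconds')""".stripMargin)

    val query = counts.writeStream
      .outputMode("update")
      .format("console")
      .start()

    // Run for about 20 seconds, then shut down.
    query.awaitTermination(20000)
    query.stop()
    spark.stop()
  }
}
```

   Without the `withWatermark` call, the same SQL statement would still be 
accepted in update mode, but the aggregation state would grow without bound.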
   
   ### Why are the changes needed?
   
   Although SPARK-32456 fixed the confusing error message, as a side effect 
some operations are now enabled on streaming workloads via SQL statements, 
which is error-prone if end users don't realize what they're doing.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Only doc change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



