[GitHub] spark pull request #13945: [SPARK-16256][SQL][STREAMING] Added Structured St...

koeninger Tue, 28 Jun 2016 13:22:52 -0700

GitHub user koeninger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13945#discussion_r68835399
  
    --- Diff: docs/structured-streaming-programming-guide.md ---
    @@ -0,0 +1,888 @@
    +---
    +layout: global
    +displayTitle: Structured Streaming Programming Guide [Alpha]
    +title: Structured Streaming Programming Guide
    +---
    +
    +* This will become a table of contents (this text will be scraped).
    +{:toc}
    +
    +# Overview
    +Structured Streaming is a scalable and fault-tolerant stream processing engine
    +built on the Spark SQL engine. You can express your streaming computation by
    +thinking you are running a batch computation on a static dataset, and the
    +Spark SQL engine takes care of running it incrementally and continuously
    +updating the final result as streaming data keeps arriving. You can use the
    +[Dataset/DataFrame API](sql-programming-guide.html) in Scala, Java or Python to express streaming
    +aggregations, event-time windows, stream-to-batch joins, etc. The computation
    +is executed on the same optimized Spark SQL engine. Finally, the system
    +ensures end-to-end exactly-once fault-tolerance guarantees through
    --- End diff --
    
    End-to-end exactly-once sounds like over-promising. Should probably define
    what the ends are, because destructive outputs can't be literally exactly-once
    in the face of network failures.
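
To make the concern concrete, here is a minimal sketch in plain Python. It is not Spark API; `AppendSink`, `KeyedSink`, `deliver`, and `LostAck` are hypothetical names invented for illustration. It assumes a writer that retries whenever an acknowledgement is lost, which is roughly what at-least-once delivery looks like. A sink whose writes have side effects then sees the batch twice, while a sink keyed by batch id absorbs the replay.

```python
class LostAck(Exception):
    """The write was applied, but the acknowledgement never arrived."""


class AppendSink:
    """Non-idempotent ('destructive') output: every delivery has a side effect."""

    def __init__(self):
        self.rows = []

    def write(self, batch_id, rows):
        self.rows.extend(rows)


class KeyedSink:
    """Idempotent output: replaying a batch_id overwrites the earlier copy."""

    def __init__(self):
        self.batches = {}

    def write(self, batch_id, rows):
        self.batches[batch_id] = list(rows)

    @property
    def rows(self):
        return [r for b in sorted(self.batches) for r in self.batches[b]]


def deliver(sink, batch_id, rows, drop_first_ack=True):
    """Write with retries; optionally simulate losing the first ack."""
    attempts = 0
    while True:
        attempts += 1
        try:
            sink.write(batch_id, rows)
            if drop_first_ack and attempts == 1:
                # The sink already applied the write; the sender cannot know.
                raise LostAck()
            return attempts
        except LostAck:
            continue  # The sender has no choice but to retry.


append_sink, keyed_sink = AppendSink(), KeyedSink()
deliver(append_sink, 0, ["a", "b"])
deliver(keyed_sink, 0, ["a", "b"])
print(append_sink.rows)  # ['a', 'b', 'a', 'b']
print(keyed_sink.rows)   # ['a', 'b']
```

Under these assumptions, "exactly-once" holds only for the keyed sink, and only because the replay is harmless there, which is why it matters where the "ends" of the guarantee are drawn.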



