[GitHub] spark pull request: [SPARK-4806] Streaming doc update for 1.2

JoshRosen Wed, 10 Dec 2014 17:35:32 -0800

Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3653#discussion_r21652907
  
    --- Diff: docs/streaming-flume-integration.md ---
    @@ -66,9 +66,16 @@ configuring Flume agents.
     
     ## Approach 2 (Experimental): Pull-based Approach using a Custom Sink
     Instead of Flume pushing data directly to Spark Streaming, this approach runs a custom Flume sink that allows the following.
    +
     - Flume pushes data into the sink, and the data stays buffered.
    -- Spark Streaming uses transactions to pull data from the sink. Transactions succeed only after data is received and replicated by Spark Streaming.
    -This ensures that better reliability and fault-tolerance than the previous approach. However, this requires configuring Flume to run a custom sink. Here are the configuration steps.
    +- Spark Streaming uses a [reliable Flume receiver](streaming-programming-guide.html#receiver-reliability)
    +  and transactions to pull data from the sink. Transactions succeed only after data is received and
    +  replicated by Spark Streaming.
    +
    +This ensures that stronger reliability and
    --- End diff --
    
    Can cut 'that' here, so the sentence begins "This ensures stronger reliability and".



