spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

zsxwing Thu, 10 Dec 2015 15:32:49 -0800

Repository: spark
Updated Branches:
  refs/heads/branch-1.5 4b99f72f7 -> cb0246c93



[STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

With the merge of 
[SPARK-8337](https://issues.apache.org/jira/browse/SPARK-8337), the Python 
API now has the same functionality as the Scala/Java API, so this change 
updates the description to make it more precise.

zsxwing tdas, please review. Thanks a lot.

Author: jerryshao <ss...@hortonworks.com>

Closes #10246 from jerryshao/direct-kafka-doc-update.

(cherry picked from commit 24d3357d66e14388faf8709b368edca70ea96432)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cb0246c9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cb0246c9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cb0246c9

Branch: refs/heads/branch-1.5
Commit: cb0246c9314892dbb3403488154b2d987c90b1dd
Parents: 4b99f72
Author: jerryshao <ss...@hortonworks.com>
Authored: Thu Dec 10 15:31:46 2015 -0800
Committer: Shixiong Zhu <shixi...@databricks.com>
Committed: Thu Dec 10 15:32:05 2015 -0800

----------------------------------------------------------------------
 docs/streaming-kafka-integration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cb0246c9/docs/streaming-kafka-integration.md
----------------------------------------------------------------------
diff --git a/docs/streaming-kafka-integration.md 
b/docs/streaming-kafka-integration.md
index ab7f011..d58f4f6 100644
--- a/docs/streaming-kafka-integration.md
+++ b/docs/streaming-kafka-integration.md
@@ -74,7 +74,7 @@ Next, we discuss how to use this approach in your streaming 
application.
        [Maven 
repository](http://search.maven.org/#search|ga|1|a%3A%22spark-streaming-kafka-assembly_2.10%22%20AND%20v%3A%22{{site.SPARK_VERSION_SHORT}}%22)
 and add it to `spark-submit` with `--jars`.
 
 ## Approach 2: Direct Approach (No Receivers)
-This new receiver-less "direct" approach has been introduced in Spark 1.3 to 
ensure stronger end-to-end guarantees. Instead of using receivers to receive 
data, this approach periodically queries Kafka for the latest offsets in each 
topic+partition, and accordingly defines the offset ranges to process in each 
batch. When the jobs to process the data are launched, Kafka's simple consumer 
API is used to read the defined ranges of offsets from Kafka (similar to read 
files from a file system). Note that this is an experimental feature introduced 
in Spark 1.3 for the Scala and Java API. Spark 1.4 added a Python API, but it 
is not yet at full feature parity.
+This new receiver-less "direct" approach has been introduced in Spark 1.3 to 
ensure stronger end-to-end guarantees. Instead of using receivers to receive 
data, this approach periodically queries Kafka for the latest offsets in each 
topic+partition, and accordingly defines the offset ranges to process in each 
batch. When the jobs to process the data are launched, Kafka's simple consumer 
API is used to read the defined ranges of offsets from Kafka (similar to read 
files from a file system). Note that this is an experimental feature introduced 
in Spark 1.3 for the Scala and Java API, in Spark 1.4 for the Python API.
 
 This approach has the following advantages over the receiver-based approach 
(i.e. Approach 1).
 


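The paragraph touched by this diff describes how the direct approach turns the latest Kafka offsets into per-batch offset ranges. Below is a minimal, Spark-free sketch of that bookkeeping. It is an illustration only, not Spark's implementation: `OffsetRange`, `plan_batch`, and `advance` are hypothetical names, and starting unseen partitions at offset 0 is an assumption made for the example.

```python
from collections import namedtuple

# Hypothetical record for illustration; it only mirrors the idea of an
# offset range (topic, partition, half-open interval [from, until)).
OffsetRange = namedtuple(
    "OffsetRange", ["topic", "partition", "from_offset", "until_offset"]
)


def plan_batch(consumed, latest):
    """Define the offset ranges for the next batch.

    consumed: {(topic, partition): next offset to read}
    latest:   {(topic, partition): latest offset reported by Kafka}
    """
    ranges = []
    for (topic, partition), until in sorted(latest.items()):
        # Assumption: a partition not seen before starts at offset 0.
        start = consumed.get((topic, partition), 0)
        if until > start:
            ranges.append(OffsetRange(topic, partition, start, until))
    return ranges


def advance(consumed, ranges):
    """Return the consumed-offset map after a batch has been processed."""
    updated = dict(consumed)
    for r in ranges:
        updated[(r.topic, r.partition)] = r.until_offset
    return updated


consumed = {("logs", 0): 10, ("logs", 1): 5}
latest = {("logs", 0): 15, ("logs", 1): 5}
batch = plan_batch(consumed, latest)   # only ("logs", 0) has new data
consumed = advance(consumed, batch)
```

Because each batch is defined purely by its offset ranges, a failed batch could be recomputed by reading the same ranges again, which is the intuition behind the "similar to reading files" comparison in the doc text.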
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
