Repository: bahir
Updated Branches:
  refs/heads/master 7f546fffa -> eab486427
[BAHIR-52] Update README.md formatting for source code

Update source code paragraphs to use tabs instead of ``` which is
the supported way in vanilla Jekyll.

Project: http://git-wip-us.apache.org/repos/asf/bahir/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir/commit/eab48642
Tree: http://git-wip-us.apache.org/repos/asf/bahir/tree/eab48642
Diff: http://git-wip-us.apache.org/repos/asf/bahir/diff/eab48642

Branch: refs/heads/master
Commit: eab486427186cee3c0f7ed8e440971f67f7ed832
Parents: 7f546ff
Author: Luciano Resende <[email protected]>
Authored: Wed Aug 10 12:47:07 2016 -0700
Committer: Luciano Resende <[email protected]>
Committed: Wed Aug 10 13:48:47 2016 -0700

----------------------------------------------------------------------
 sql-streaming-mqtt/README.md | 101 ++++++++++++++++----------------
 streaming-akka/README.md     |  46 +++++++----------
 streaming-mqtt/README.md     |  28 ++++-------
 streaming-twitter/README.md  |  32 +++++-------
 streaming-zeromq/README.md   |  28 ++++-------
 5 files changed, 90 insertions(+), 145 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/bahir/blob/eab48642/sql-streaming-mqtt/README.md
----------------------------------------------------------------------
diff --git a/sql-streaming-mqtt/README.md b/sql-streaming-mqtt/README.md
index fa222b1..bfb4bdc 100644
--- a/sql-streaming-mqtt/README.md
+++ b/sql-streaming-mqtt/README.md
@@ -4,26 +4,20 @@ A library for reading data from MQTT Servers using Spark SQL Streaming ( or Stru
 
 Using SBT:
 
-```scala
-libraryDependencies += "org.apache.bahir" %% "spark-sql-streaming-mqtt" % "2.0.0"
-```
+    libraryDependencies += "org.apache.bahir" %% "spark-sql-streaming-mqtt" % "2.0.0"
 
 Using Maven:
 
-```xml
-<dependency>
-    <groupId>org.apache.bahir</groupId>
-    <artifactId>spark-sql-streaming-mqtt_2.11</artifactId>
-    <version>2.0.0</version>
-</dependency>
-```
+    <dependency>
+      <groupId>org.apache.bahir</groupId>
+      <artifactId>spark-sql-streaming-mqtt_2.11</artifactId>
+      <version>2.0.0</version>
+    </dependency>
 
 This library can also be added to Spark jobs launched through `spark-shell` or `spark-submit`
 by using the `--packages` command line option. For example, to include it when starting the spark shell:
 
-```
-$ bin/spark-shell --packages org.apache.bahir:spark-sql-streaming-mqtt_2.11:2.0.0
-```
+    $ bin/spark-shell --packages org.apache.bahir:spark-sql-streaming-mqtt_2.11:2.0.0
 
 Unlike using `--jars`, using `--packages` ensures that this library and its dependencies will
 be added to the classpath. The `--packages` argument can also be used with `bin/spark-submit`.
@@ -34,33 +28,26 @@ This library is compiled for Scala 2.11 only, and intends to support Spark 2.0 o
 
 A SQL Stream can be created with data streams received through MQTT Server using,
 
-```scala
-sqlContext.readStream
-  .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
-  .option("topic", "mytopic")
-  .load("tcp://localhost:1883")
-
-```
+    sqlContext.readStream
+      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
+      .option("topic", "mytopic")
+      .load("tcp://localhost:1883")
 
 ## Enable recovering from failures.
 
 Setting values for option `localStorage` and `clientId` helps in recovering in case of a restart, by restoring the state where it left off before the shutdown.
 
-```scala
-sqlContext.readStream
-  .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
-  .option("topic", "mytopic")
-  .option("localStorage", "/path/to/localdir")
-  .option("clientId", "some-client-id")
-  .load("tcp://localhost:1883")
-
-```
+    sqlContext.readStream
+      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
+      .option("topic", "mytopic")
+      .option("localStorage", "/path/to/localdir")
+      .option("clientId", "some-client-id")
+      .load("tcp://localhost:1883")
 
 ### Scala API
 
 An example, for scala API to count words from incoming message stream.
-```scala
 // Create DataFrame representing the stream of input lines from connection to mqtt server
 val lines = spark.readStream
   .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
@@ -81,41 +68,37 @@ An example, for scala API to count words from incoming message stream.
 
 query.awaitTermination()
-```
 
 Please see `MQTTStreamWordCount.scala` for full example.
 
 ### Java API
 
 An example, for Java API to count words from incoming message stream.
 
-```java
-
-    // Create DataFrame representing the stream of input lines from connection to mqtt server.
-    Dataset<String> lines = spark
-      .readStream()
-      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
-      .option("topic", topic)
-      .load(brokerUrl).select("value").as(Encoders.STRING());
-
-    // Split the lines into words
-    Dataset<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
-      @Override
-      public Iterator<String> call(String x) {
-        return Arrays.asList(x.split(" ")).iterator();
-      }
-    }, Encoders.STRING());
-
-    // Generate running word count
-    Dataset<Row> wordCounts = words.groupBy("value").count();
-
-    // Start running the query that prints the running counts to the console
-    StreamingQuery query = wordCounts.writeStream()
-      .outputMode("complete")
-      .format("console")
-      .start();
-
-    query.awaitTermination();
-```
+    // Create DataFrame representing the stream of input lines from connection to mqtt server.
+    Dataset<String> lines = spark
+      .readStream()
+      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
+      .option("topic", topic)
+      .load(brokerUrl).select("value").as(Encoders.STRING());
+
+    // Split the lines into words
+    Dataset<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
+      @Override
+      public Iterator<String> call(String x) {
+        return Arrays.asList(x.split(" ")).iterator();
+      }
+    }, Encoders.STRING());
+
+    // Generate running word count
+    Dataset<Row> wordCounts = words.groupBy("value").count();
+
+    // Start running the query that prints the running counts to the console
+    StreamingQuery query = wordCounts.writeStream()
+      .outputMode("complete")
+      .format("console")
+      .start();
+
+    query.awaitTermination();
 
 Please see `JavaMQTTStreamWordCount.java` for full example.


http://git-wip-us.apache.org/repos/asf/bahir/blob/eab48642/streaming-akka/README.md
----------------------------------------------------------------------
diff --git a/streaming-akka/README.md b/streaming-akka/README.md
index c93b2f3..16ede09 100644
--- a/streaming-akka/README.md
+++ b/streaming-akka/README.md
@@ -5,26 +5,20 @@ A library for reading data from Akka Actors using Spark Streaming.
 
 Using SBT:
 
-```
-libraryDependencies += "org.apache.bahir" %% "spark-streaming-akka" % "2.0.0"
-```
+    libraryDependencies += "org.apache.bahir" %% "spark-streaming-akka" % "2.0.0"
 
 Using Maven:
 
-```xml
-<dependency>
-    <groupId>org.apache.bahir</groupId>
-    <artifactId>spark-streaming-akka_2.11</artifactId>
-    <version>2.0.0</version>
-</dependency>
-```
+    <dependency>
+      <groupId>org.apache.bahir</groupId>
+      <artifactId>spark-streaming-akka_2.11</artifactId>
+      <version>2.0.0</version>
+    </dependency>
 
 This library can also be added to Spark jobs launched through `spark-shell` or `spark-submit`
 by using the `--packages` command line option.
 For example, to include it when starting the spark shell:
 
-```
-$ bin/spark-shell --packages org.apache.bahir:spark-streaming_akka_2.11:2.0.0
-```
+    $ bin/spark-shell --packages org.apache.bahir:spark-streaming_akka_2.11:2.0.0
 
 Unlike using `--jars`, using `--packages` ensures that this library and its dependencies will
 be added to the classpath. The `--packages` argument can also be used with `bin/spark-submit`.
@@ -40,30 +34,28 @@ DStreams can be created with data streams received through Akka actors by using
 
 You need to extend `ActorReceiver` so as to store received data into Spark using `store(...)` methods. The supervisor strategy of this actor can be configured to handle failures, etc.
 
-```Scala
-class CustomActor extends ActorReceiver {
-  def receive = {
-    case data: String => store(data)
-  }
-}
+    class CustomActor extends ActorReceiver {
+      def receive = {
+        case data: String => store(data)
+      }
+    }
 
 // A new input stream can be created with this custom actor as
 val ssc: StreamingContext = ...
 val lines = AkkaUtils.createStream[String](ssc, Props[CustomActor](), "CustomReceiver")
-```
+
 
 ### Java API
 
 You need to extend `JavaActorReceiver` so as to store received data into Spark using `store(...)` methods. The supervisor strategy of this actor can be configured to handle failures, etc.
-```Java
-class CustomActor extends JavaActorReceiver {
-  @Override
-  public void onReceive(Object msg) throws Exception {
-    store((String) msg);
-  }
-}
+    class CustomActor extends JavaActorReceiver {
+      @Override
+      public void onReceive(Object msg) throws Exception {
+        store((String) msg);
+      }
+    }
 
 // A new input stream can be created with this custom actor as
 JavaStreamingContext jssc = ...;


http://git-wip-us.apache.org/repos/asf/bahir/blob/eab48642/streaming-mqtt/README.md
----------------------------------------------------------------------
diff --git a/streaming-mqtt/README.md b/streaming-mqtt/README.md
index 2b3d752..27124ad 100644
--- a/streaming-mqtt/README.md
+++ b/streaming-mqtt/README.md
@@ -5,26 +5,20 @@
 
 Using SBT:
 
-```
-libraryDependencies += "org.apache.bahir" %% "spark-streaming-mqtt" % "2.0.0"
-```
+    libraryDependencies += "org.apache.bahir" %% "spark-streaming-mqtt" % "2.0.0"
 
 Using Maven:
 
-```xml
-<dependency>
-    <groupId>org.apache.bahir</groupId>
-    <artifactId>spark-streaming-mqtt_2.11</artifactId>
-    <version>2.0.0</version>
-</dependency>
-```
+    <dependency>
+      <groupId>org.apache.bahir</groupId>
+      <artifactId>spark-streaming-mqtt_2.11</artifactId>
+      <version>2.0.0</version>
+    </dependency>
 
 This library can also be added to Spark jobs launched through `spark-shell` or `spark-submit`
 by using the `--packages` command line option. For example, to include it when starting the spark shell:
 
-```
-$ bin/spark-shell --packages org.apache.bahir:spark-streaming_mqtt_2.11:2.0.0
-```
+    $ bin/spark-shell --packages org.apache.bahir:spark-streaming_mqtt_2.11:2.0.0
 
 Unlike using `--jars`, using `--packages` ensures that this library and its dependencies will
 be added to the classpath. The `--packages` argument can also be used with `bin/spark-submit`.
@@ -38,17 +32,13 @@ This library is cross-published for Scala 2.10 and Scala 2.11, so users should r
 
 You need to extend `ActorReceiver` so as to store received data into Spark using `store(...)` methods.
 The supervisor strategy of this actor can be configured to handle failures, etc.
 
-```Scala
-val lines = MQTTUtils.createStream(ssc, brokerUrl, topic)
-```
+    val lines = MQTTUtils.createStream(ssc, brokerUrl, topic)
 
 ### Java API
 
 You need to extend `JavaActorReceiver` so as to store received data into Spark using `store(...)` methods. The supervisor strategy of this actor can be configured to handle failures, etc.
 
-```Java
-JavaDStream<String> lines = MQTTUtils.createStream(jssc, brokerUrl, topic);
-```
+    JavaDStream<String> lines = MQTTUtils.createStream(jssc, brokerUrl, topic);
 
 See end-to-end examples at [MQTT Examples](https://github.com/apache/bahir/tree/master/streaming-mqtt/examples)
\ No newline at end of file


http://git-wip-us.apache.org/repos/asf/bahir/blob/eab48642/streaming-twitter/README.md
----------------------------------------------------------------------
diff --git a/streaming-twitter/README.md b/streaming-twitter/README.md
index e2243e8..4d73bbe 100644
--- a/streaming-twitter/README.md
+++ b/streaming-twitter/README.md
@@ -5,26 +5,20 @@ A library for reading social data from [twitter](http://twitter.com/) using Spar
 
 Using SBT:
 
-```
-libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.0"
-```
+    libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.0"
 
 Using Maven:
 
-```xml
-<dependency>
-    <groupId>org.apache.bahir</groupId>
-    <artifactId>spark-streaming-twitter_2.11</artifactId>
-    <version>2.0.0</version>
-</dependency>
-```
+    <dependency>
+      <groupId>org.apache.bahir</groupId>
+      <artifactId>spark-streaming-twitter_2.11</artifactId>
+      <version>2.0.0</version>
+    </dependency>
 
 This library can also be added to Spark jobs launched through `spark-shell` or `spark-submit`
 by using the `--packages` command line option.
 For example, to include it when starting the spark shell:
 
-```
-$ bin/spark-shell --packages org.apache.bahir:spark-streaming_twitter_2.11:2.0.0
-```
+    $ bin/spark-shell --packages org.apache.bahir:spark-streaming_twitter_2.11:2.0.0
 
 Unlike using `--jars`, using `--packages` ensures that this library and its dependencies will
 be added to the classpath. The `--packages` argument can also be used with `bin/spark-submit`.
@@ -39,19 +33,15 @@ can be provided by any of the [methods](http://twitter4j.org/en/configuration.ht
 
 ### Scala API
 
-```Scala
-import org.apache.spark.streaming.twitter._
+    import org.apache.spark.streaming.twitter._
 
-TwitterUtils.createStream(ssc, None)
-```
+    TwitterUtils.createStream(ssc, None)
 
 ### Java API
 
-```Java
-import org.apache.spark.streaming.twitter.*;
+    import org.apache.spark.streaming.twitter.*;
 
-TwitterUtils.createStream(jssc);
-```
+    TwitterUtils.createStream(jssc);
 
 You can also either get the public stream, or get the filtered stream based on keywords.


http://git-wip-us.apache.org/repos/asf/bahir/blob/eab48642/streaming-zeromq/README.md
----------------------------------------------------------------------
diff --git a/streaming-zeromq/README.md b/streaming-zeromq/README.md
index 4184204..eddc3b4 100644
--- a/streaming-zeromq/README.md
+++ b/streaming-zeromq/README.md
@@ -5,26 +5,20 @@ A library for reading data from [ZeroMQ](http://zeromq.org/) using Spark Streami
 
 Using SBT:
 
-```
-libraryDependencies += "org.apache.bahir" %% "spark-streaming-zeromq" % "2.0.0"
-```
+    libraryDependencies += "org.apache.bahir" %% "spark-streaming-zeromq" % "2.0.0"
 
 Using Maven:
 
-```xml
-<dependency>
-    <groupId>org.apache.bahir</groupId>
-    <artifactId>spark-streaming-zeromq_2.11</artifactId>
-    <version>2.0.0</version>
-</dependency>
-```
+    <dependency>
+      <groupId>org.apache.bahir</groupId>
+      <artifactId>spark-streaming-zeromq_2.11</artifactId>
+      <version>2.0.0</version>
+    </dependency>
 
 This library can also be added to Spark jobs launched through
 `spark-shell` or `spark-submit` by using the `--packages` command line option.
 For example, to include it when starting the spark shell:
 
-```
-$ bin/spark-shell --packages org.apache.bahir:spark-streaming_zeromq_2.11:2.0.0
-```
+    $ bin/spark-shell --packages org.apache.bahir:spark-streaming_zeromq_2.11:2.0.0
 
 Unlike using `--jars`, using `--packages` ensures that this library and its dependencies will
 be added to the classpath. The `--packages` argument can also be used with `bin/spark-submit`.
@@ -36,14 +30,10 @@ This library is cross-published for Scala 2.10 and Scala 2.11, so users should r
 
 ### Scala API
 
-```Scala
-val lines = ZeroMQUtils.createStream(ssc, ...)
-```
+    val lines = ZeroMQUtils.createStream(ssc, ...)
 
 ### Java API
 
-```Java
-JavaDStream<String> lines = ZeroMQUtils.createStream(jssc, ...);
-```
+    JavaDStream<String> lines = ZeroMQUtils.createStream(jssc, ...);
 
 See end-to-end examples at [ZeroMQ Examples](https://github.com/apache/bahir/tree/master/streaming-zeromq/examples)
\ No newline at end of file