[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-62442671 Alright merging this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2564
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-6920 Hey @srowen, sorry for dropping the ball on this. I am testing this manually right now. I am afraid there may be a bug here, the same one that was spotted in this other PR: https://github.com/apache/spark/pull/2735 Could you update the code? Then I can test it manually and merge it right in.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-62231358 @tdas Done, I added a change like the one in https://github.com/apache/spark/pull/2735
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-62231777
[Test build #23078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23078/consoleFull) for PR 2564 at commit [`0d0bf29`](https://github.com/apache/spark/commit/0d0bf29ce595c078de8f4a5ec60d435e6132ad81).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-62238655
[Test build #23078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23078/consoleFull) for PR 2564 at commit [`0d0bf29`](https://github.com/apache/spark/commit/0d0bf29ce595c078de8f4a5ec60d435e6132ad81).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public final class JavaRecoverableNetworkWordCount`
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-62238658 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23078/ Test PASSed.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-58806389 Hi @tdas, does this look OK to you? Ready to go from this end. It's just a minor change to the examples, but I agree it's a worthwhile example to resurrect.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57449006 @tdas No problem, text removed. I tested the Java example using the instructions in the javadoc, and that worked. I was lazy, and didn't try it on a cluster and try killing the receiver and recovering, on the assumption that the API calls are correct and it's the tests that should make sure it works. It's a straight port of the Scala example, so if that is a valid example, this should be too.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57449117
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21107/consoleFull) for PR 2564 at commit [`35f23e3`](https://github.com/apache/spark/commit/35f23e36031635e48934573ace0aa5cf22c26bd6).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57455207
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21107/consoleFull) for PR 2564 at commit [`35f23e3`](https://github.com/apache/spark/commit/35f23e36031635e48934573ace0aa5cf22c26bd6).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public final class JavaRecoverableNetworkWordCount`
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57455214 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21107/
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57381186 @srowen Cool!! How did you test the JavaRecoverableWordCount?
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2564#discussion_r18251478
--- Diff: examples/src/main/java/org/apache/spark/examples/streaming/JavaRecoverableNetworkWordCount.java ---
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.streaming;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.util.Arrays;
+import java.util.regex.Pattern;
+
+import scala.Tuple2;
+import com.google.common.collect.Lists;
+import com.google.common.io.Files;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaPairRDD;
+import org.apache.spark.api.java.function.FlatMapFunction;
+import org.apache.spark.api.java.function.Function2;
+import org.apache.spark.api.java.function.PairFunction;
+import org.apache.spark.streaming.Durations;
+import org.apache.spark.streaming.Time;
+import org.apache.spark.streaming.api.java.JavaDStream;
+import org.apache.spark.streaming.api.java.JavaPairDStream;
+import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
+import org.apache.spark.streaming.api.java.JavaStreamingContext;
+import org.apache.spark.streaming.api.java.JavaStreamingContextFactory;
+
+/**
+ * Counts words in text encoded with UTF8 received from the network every second.
+ *
+ * Usage: JavaRecoverableNetworkWordCount hostname port checkpoint-directory output-file
+ * hostname and port describe the TCP server that Spark Streaming would connect to receive
+ * data. checkpoint-directory directory to HDFS-compatible file system which checkpoint data
+ * output-file file to which the word counts will be appended
+ *
+ * checkpoint-directory and output-file must be absolute paths
+ *
+ * To run this on your local machine, you need to first run a Netcat server
+ *
+ * `$ nc -lk `
+ *
+ * and run the example as
+ *
+ * `$ ./bin/run-example org.apache.spark.examples.streaming.JavaRecoverableNetworkWordCount \
+ * localhost ~/checkpoint/ ~/out`
+ *
+ * If the directory ~/checkpoint/ does not exist (e.g. running for the first time), it will create
+ * a new StreamingContext (will print Creating new context to the console). Otherwise, if
+ * checkpoint data exists in ~/checkpoint/, then it will create StreamingContext from
+ * the checkpoint data.
+ *
+ * To run this example in a local standalone cluster with automatic driver recovery,
--- End diff --
Can you remove this reference to spark standalone cluster mode? These instructions are old and irrelevant. Please remove from both examples :)
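The recovery behaviour described in the javadoc quoted above (create a new context when the checkpoint directory holds no data, otherwise rebuild from the checkpoint) can be sketched without Spark. This is only an illustration of the pattern under stated assumptions, not the code in the pull request and not the Spark API: `CheckpointRecoverySketch`, `Context`, `newCheckpointDir`, `getOrCreate`, `checkpoint`, and the `state` file name are all hypothetical names made up for this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

public final class CheckpointRecoverySketch {

  /** Hypothetical stand-in for a streaming context; this is not a Spark class. */
  static final class Context {
    final String origin;     // "new" or "recovered"
    final long batchesDone;  // state that should survive a restart

    Context(String origin, long batchesDone) {
      this.origin = origin;
      this.batchesDone = batchesDone;
    }
  }

  /** Creates an empty temporary directory to play the role of the checkpoint directory. */
  static Path newCheckpointDir() {
    try {
      return Files.createTempDirectory("checkpoint-sketch");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Rebuilds a context from saved state if present; otherwise asks the factory for a new one. */
  static Context getOrCreate(Path checkpointDir, Supplier<Context> factory) {
    Path state = checkpointDir.resolve("state"); // assumed file name, for illustration only
    if (!Files.exists(state)) {
      System.out.println("Creating new context");
      return factory.get();
    }
    try {
      String saved = new String(Files.readAllBytes(state), StandardCharsets.UTF_8).trim();
      return new Context("recovered", Long.parseLong(saved));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Persists the only piece of state this sketch tracks. */
  static void checkpoint(Path checkpointDir, Context context) {
    try {
      Files.createDirectories(checkpointDir);
      Files.write(checkpointDir.resolve("state"),
          Long.toString(context.batchesDone).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path dir = newCheckpointDir();
    // First run: no saved state, so the factory is invoked.
    Context first = getOrCreate(dir, () -> new Context("new", 0));
    // Pretend three batches were processed, then checkpoint.
    checkpoint(dir, new Context(first.origin, first.batchesDone + 3));
    // Second run: saved state exists, so the factory is skipped.
    Context second = getOrCreate(dir, () -> new Context("new", 0));
    System.out.println(first.origin + " -> " + second.origin + " at batch " + second.batchesDone);
  }
}
```

In the example under review, the corresponding roles appear to be played by `JavaStreamingContext` and the `JavaStreamingContextFactory` import shown in the diff; the factory is only consulted when no checkpoint data is found, which is why the "Creating new context" message is printed on the first run only.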
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/2564
SPARK-2548 [STREAMING] JavaRecoverableWordCount is missing
Here's my attempt to re-port `RecoverableNetworkWordCount` to Java, following the example of its Scala and Java siblings. I fixed a few minor doc/formatting issues along the way I believe.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/srowen/spark SPARK-2548
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2564.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #2564
commit 179b3c2ca892a8db237ff714147aabf54d7d2b3a
Author: Sean Owen so...@cloudera.com
Date: 2014-09-28T16:16:03Z
Re-port RecoverableNetworkWordCount to Java example, and touch up doc / formatting in related examples
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57090626
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20941/consoleFull) for PR 2564 at commit [`179b3c2`](https://github.com/apache/spark/commit/179b3c2ca892a8db237ff714147aabf54d7d2b3a).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57092386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20941/
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57092384
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20941/consoleFull) for PR 2564 at commit [`179b3c2`](https://github.com/apache/spark/commit/179b3c2ca892a8db237ff714147aabf54d7d2b3a).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public final class JavaRecoverableNetworkWordCount`
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57095584 Jenkins, test this please.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57095758
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20944/consoleFull) for PR 2564 at commit [`179b3c2`](https://github.com/apache/spark/commit/179b3c2ca892a8db237ff714147aabf54d7d2b3a).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57098005 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20944/
[GitHub] spark pull request: SPARK-2548 [STREAMING] JavaRecoverableWordCoun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2564#issuecomment-57098002
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20944/consoleFull) for PR 2564 at commit [`179b3c2`](https://github.com/apache/spark/commit/179b3c2ca892a8db237ff714147aabf54d7d2b3a).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public final class JavaRecoverableNetworkWordCount`