Re: Source code JavaNetworkWordcount

2014-02-05 Thread Tathagata Das
Seems good to me. BTW, it's fine to use MEMORY_ONLY (i.e. without replication) for testing, but you should turn on replication if you want fault tolerance. TD On Mon, Feb 3, 2014 at 3:19 PM, Eduardo Costa Alfaia e.costaalf...@unibs.it wrote: Hi Tathagata, You were right when you said for

Re: Source code JavaNetworkWordcount

2014-02-05 Thread Eduardo Costa Alfaia
Hi Tathagata, I am playing with NetworkWordCount.scala and made some changes like this: // Create the context with a 1 second batch size val ssc = new StreamingContext(args(0), "NetworkWordCount", Seconds(1), System.getenv("SPARK_HOME"),

Re: Source code JavaNetworkWordcount

2014-01-30 Thread Tathagata Das
Let me first ask for a few clarifications. 1. If you just want to count the words in a single text file like Don Quixote (that is, not for a stream of data), you should use only Spark. Then the program to count the frequency of words in a text file would look like this in Java. If you are not
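
The Java program from the message above is cut off in this archive listing. As a rough illustration of the counting logic only, here is a minimal plain-Java sketch that does not use Spark at all, so it runs without a cluster. It is not the program from the original message: the class name WordFrequencySketch, the method countWords, and the sample sentence are all hypothetical, and splitting on non-word characters is an assumed tokenization rule.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for a word-frequency job: plain Java streams, no Spark.
public class WordFrequencySketch {

    // Lower-cases the text, splits it on runs of non-word characters
    // (an assumed tokenization rule), and counts each distinct word.
    static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(word -> !word.isEmpty()) // split can yield a leading empty token
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        TreeMap::new,            // sorted keys, for stable output
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Prints {and=1, knight=1, squire=1, the=2}
        System.out.println(countWords("the knight and the squire"));
    }
}
```

In a Spark version, the grouping-and-counting step would presumably be expressed as a map to (word, 1) pairs followed by a reduce by key; the sketch above only shows the same result computed locally.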