Re: can't print DStream after reduce

2014-07-16 Thread Tathagata Das
Yeah. I have been wondering how to check this in the general case, across
all deployment modes, but that's a hard problem. Last week I realized that
even if we can do it just for local mode, we get the biggest bang for the
buck.

TD


Re: can't print DStream after reduce

2014-07-15 Thread Michael Campbell
I think you typo'd the JIRA id; it should be
https://issues.apache.org/jira/browse/SPARK-2475, Check whether #cores >
#receivers in local mode


Re: can't print DStream after reduce

2014-07-15 Thread Tathagata Das
Aah, right, copied from the wrong browser tab, I guess. Thanks!

TD


Re: can't print DStream after reduce

2014-07-15 Thread Tobias Pfeiffer
Hi,

thanks for creating the issue. It feels like, in the last week, more or less
half of the questions about Spark Streaming were rooted in setting the master
to local ;-)

Tobias


Re: can't print DStream after reduce

2014-07-14 Thread Tathagata Das
The problem is not really with local[1] or local per se; it arises when
there are more input streams than there are cores.
But I agree, for people who are just beginning to use it by running it
locally, there should be a check addressing this.

I made a JIRA for this.
https://issues.apache.org/jira/browse/SPARK-2464

TD
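
A rough sketch of the sizing rule described above, in the thread's own Scala
(the numReceivers value and the conf-based construction are illustrative
assumptions, not code from the thread):

// When running locally, each receiver pins one core, so reserve one
// core per input stream plus at least one core for processing.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val numReceivers = 1                         // e.g. one socket stream
val master = s"local[${numReceivers + 1}]"   // yields "local[2]" here

val conf = new SparkConf().setMaster(master).setAppName("AppName")
val ssc = new StreamingContext(conf, Seconds(1))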


Re: can't print DStream after reduce

2014-07-13 Thread Tathagata Das
Try doing DStream.foreachRDD and then printing the RDD count and further
inspecting the RDD.
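
The suggestion above can be sketched as follows (the dstream name and the
take(5) peek are illustrative assumptions, not code from the thread):

// Inspect each batch RDD directly instead of relying on DStream.print.
dstream.foreachRDD { rdd =>
  println(s"batch count = ${rdd.count()}")  // how many records this batch?
  rdd.take(5).foreach(println)              // peek at a few elements
}
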
On Jul 13, 2014 1:03 AM, Walrus theCat walrusthe...@gmail.com wrote:

 Hi,

 I have a DStream that works just fine when I say:

 dstream.print

 If I say:

 dstream.map((_, 1)).print

 that works, too.  However, if I do the following:

 dstream.reduce { case (x, y) => x }.print

 I don't get anything on my console.  What's going on?

 Thanks



Re: can't print DStream after reduce

2014-07-13 Thread Walrus theCat
Update on this:

val lines = ssc.socketTextStream("localhost", /* port elided in archive */)

lines.print // works

lines.map(_ -> 1).print // works

lines.map(_ -> 1).reduceByKey(_ + _).print // nothing printed to driver console

Just lots of:

14/07/13 11:37:40 INFO receiver.BlockGenerator: Pushed block
input-0-1405276660400
14/07/13 11:37:41 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
14/07/13 11:37:41 INFO scheduler.JobScheduler: Added jobs for time
1405276661000 ms
14/07/13 11:37:41 INFO storage.MemoryStore: ensureFreeSpace(60) called with
curMem=1275, maxMem=98539929
14/07/13 11:37:41 INFO storage.MemoryStore: Block input-0-1405276661400
stored as bytes to memory (size 60.0 B, free 94.0 MB)
14/07/13 11:37:41 INFO storage.BlockManagerInfo: Added
input-0-1405276661400 in memory on 25.17.218.118:55820 (size: 60.0 B, free:
94.0 MB)
14/07/13 11:37:41 INFO storage.BlockManagerMaster: Updated info of block
input-0-1405276661400


Any insight?

Thanks


Re: can't print DStream after reduce

2014-07-13 Thread Walrus theCat
Thanks for your interest.

lines.foreachRDD(x => println(x.count))

And I got 0 every once in a while (which I think is strange, because
lines.print prints the input I'm giving it over the socket.)


When I tried:

lines.map(_ -> 1).reduceByKey(_ + _).foreachRDD(x => println(x.count))

I got no count.

Thanks


Re: can't print DStream after reduce

2014-07-13 Thread Walrus theCat
More strange behavior:

lines.foreachRDD(x => println(x.first)) // works
lines.foreachRDD(x => println((x.count, x.first))) // no output is printed
to driver console




Re: can't print DStream after reduce

2014-07-13 Thread Walrus theCat
Great success!

I was able to get output to the driver console by changing the construction
of the Streaming Spark Context from:

 val ssc = new StreamingContext("local" /** TODO change once a cluster is up **/,
   "AppName", Seconds(1))


to:

val ssc = new StreamingContext("local[2]" /** TODO change once a cluster is up **/,
  "AppName", Seconds(1))


I found something that tipped me off that this might work by digging
through this mailing list.


Re: can't print DStream after reduce

2014-07-13 Thread Michael Campbell
This almost had me not using Spark; I couldn't get any output.  It is not
at all obvious to the layman what's going on here (and, to the best of my
knowledge, it isn't documented anywhere), but now that you know, you'll be
able to answer this question for the numerous people who will also have it.


Re: can't print DStream after reduce

2014-07-13 Thread Sean Owen
How about a PR that rejects a context configured for local or local[1]? As
I understand it, that setup is not intended to work, and it has bitten
several people.
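
One way such a guard might look, as a hypothetical sketch (the real check was
tracked under SPARK-2475; this helper and its names are assumptions, not the
actual patch):

// Warn when a local master leaves no core free for processing
// beyond the receivers.
def checkCores(master: String, numReceivers: Int): Unit = {
  val LocalN = """local\[(\d+)\]""".r
  val cores = master match {
    case "local"   => Some(1)
    case LocalN(n) => Some(n.toInt)
    case _         => None  // non-local master: can't tell statically
  }
  cores.filter(_ <= numReceivers).foreach { c =>
    println(s"WARN: only $c core(s) for $numReceivers receiver(s); " +
      "output operations may never run")
  }
}

checkCores("local", 1)     // warns
checkCores("local[2]", 1)  // fine: one core left for processing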