Re: Getting number of physical machines in Spark

2015-08-28 Thread Alexey Grishchenko
writing a Spark Streaming application to ingest from Kafka with the Receiver API and want to create one DStream per physical machine for read parallelism’s sake. How can I figure out at run time how many machines there are, so that I know how many DStreams to create? -- Best regards, Alexey
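The reply to this question is not preserved in the snippet, so the following is only a sketch of one possible approach, not the answer given in the thread. It assumes you can obtain a list of executor addresses in "host:port" form (in the Scala API, the keys of `sc.getExecutorMemoryStatus` have this shape and also include the driver). The function name `count_physical_hosts` is hypothetical.

```python
def count_physical_hosts(addresses):
    """Count distinct hosts in a list of 'host:port' strings.

    Several executors may run on the same machine, so the port is
    dropped before counting.
    """
    # rsplit on the last colon so the port is removed even if the
    # host part itself contains colons.
    hosts = {addr.rsplit(":", 1)[0] for addr in addresses}
    return len(hosts)


# Hypothetical addresses: two executors on node1, one on node2.
example = ["node1:40001", "node1:40002", "node2:40001"]
n_hosts = count_physical_hosts(example)
```

The resulting count could then be used as the number of receiver DStreams to create. Whether the driver's host should be excluded from the count depends on the deployment.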

Re: Help Explain Tasks in WebUI:4040

2015-08-28 Thread Alexey Grishchenko
what is going on? Thanks, - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Alexey Grishchenko, http://0x0fff.com

Re: Any quick method to sample rdd based on one filed?

2015-08-28 Thread Alexey Grishchenko
And I hope that, in the final result, the negative ones will be 10 times as numerous as the positive ones. What would be the most efficient way to do this? Thanks, -- Best regards, Alexey Grishchenko phone: +353 (87) 262-2154 email: programme...@gmail.com web: http://0x0fff.com
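As an illustration of the question only (the reply itself is not preserved here), one way to reach a 10:1 negative-to-positive ratio is to compute a per-class sampling fraction and then sample each record independently. The sketch below is plain Python; the label values `0` and `1` and the function names are assumptions. In PySpark, a fractions dictionary like this could be passed to `RDD.sampleByKey(withReplacement, fractions, seed)` on an RDD keyed by the label field.

```python
import random
from collections import Counter


def sampling_fractions(labels, ratio=10.0, pos=1, neg=0):
    """Return per-label keep probabilities so that, in expectation,
    negatives are `ratio` times as numerous as positives."""
    counts = Counter(labels)
    n_pos, n_neg = counts[pos], counts[neg]
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    if n_neg >= ratio * n_pos:
        # Enough negatives: keep every positive, thin out the negatives.
        return {pos: 1.0, neg: ratio * n_pos / n_neg}
    # Too few negatives: keep every negative, thin out the positives.
    return {pos: n_neg / (ratio * n_pos), neg: 1.0}


def sample_by_label(records, label_of, fractions, seed=None):
    """Keep each record with the probability assigned to its label."""
    rng = random.Random(seed)
    return [r for r in records if rng.random() < fractions[label_of(r)]]
```

The achieved ratio is only approximate, because each record is kept or dropped independently.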

Re: Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread Alexey Grishchenko
-- Best regards, Alexey Grishchenko
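The body of this message is not preserved, so the following only illustrates the question in the subject line. One common pattern is to compute the minimum and maximum in a single pass by carrying a (min, max) pair through an aggregation. The sketch below simulates that in plain Python over a list of partitions; the names `seq_op`, `comb_op` and `min_max` are hypothetical, chosen to mirror the two functions that PySpark's `RDD.aggregate(zeroValue, seqOp, combOp)` expects.

```python
from functools import reduce

# Neutral starting value: any real number replaces both bounds.
ZERO = (float("inf"), float("-inf"))


def seq_op(acc, x):
    """Fold one element into a (min, max) accumulator."""
    return (min(acc[0], x), max(acc[1], x))


def comb_op(a, b):
    """Merge two (min, max) accumulators from different partitions."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def min_max(partitions):
    """Compute (min, max) over a list of partitions in one pass each."""
    partials = [reduce(seq_op, part, ZERO) for part in partitions]
    return reduce(comb_op, partials, ZERO)
```

For example, `min_max([[3, 1], [7, 2], []])` evaluates to `(1, 7)`; empty partitions are harmless because they contribute only the neutral value.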