Re: Equally split an RDD partition into two partitions at the same node

2017-01-14 Thread Rishi Yadav
Can you provide some more details: 1. How many partitions does the RDD have? 2. How big is the cluster? On Sat, Jan 14, 2017 at 3:59 PM Fei Hu wrote: > Dear all, > > I want to equally divide an RDD partition into two partitions. That means, > the first half of elements in the
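
For anyone wanting to gather those two details quickly, a small spark-shell sketch (the input path is only a placeholder):

import org.apache.spark.rdd.RDD

val rdd: RDD[String] = sc.textFile("hdfs:///some/path")   // hypothetical input
println(s"Number of partitions: ${rdd.getNumPartitions}")
// One "host:port" key per executor gives a rough view of the cluster size.
sc.getExecutorMemoryStatus.keys.foreach(println)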

Re: Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-14 Thread shyla deshpande
Hello, I want to add that I don't even see the Streaming tab in the application UI on port 4040 when I run it on the cluster. The cluster on EC2 has 1 master node and 1 worker node. The cores used on the worker node are 2 of 2 and the memory used is 6GB of 6.3GB. Can I run a spark streaming job
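
In case it helps others hitting the same thing: the Streaming tab only shows up once the StreamingContext has actually started, so a missing output operation or a missing ssc.start()/ssc.awaitTermination() is a common cause of "no output and no error"; a receiver-based source also needs at least one core beyond the receiver. A minimal sketch of the expected structure (socketTextStream stands in for the Kafka source here; host and port are hypothetical):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSkeleton {
  def main(args: Array[String]): Unit = {
    // With a receiver-based source, at least 2 cores are needed: one is
    // occupied by the receiver, the rest do the processing.
    val conf = new SparkConf().setAppName("streaming-skeleton")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Stand-in source; the Kafka DStream would be created here instead.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Without an output operation (print, foreachRDD, saveAs...), nothing runs.
    lines.count().print()

    ssc.start()             // start the computation
    ssc.awaitTermination()  // keep the driver alive
  }
}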

Equally split an RDD partition into two partitions at the same node

2017-01-14 Thread Fei Hu
Dear all, I want to equally divide an RDD partition into two partitions. That means, the first half of the elements in the partition will create a new partition, and the second half of the elements in the partition will generate another new partition. But the two new partitions are required to be at the
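
One possible approach, sketched under the assumption that a parent partition fits in executor memory: a custom RDD that exposes two child partitions per parent partition and reports the parent's preferred locations, so both halves favor the same node (class and field names are made up for illustration):

import org.apache.spark.{NarrowDependency, Partition, TaskContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// One child partition per half of a parent partition.
private case class HalfPartition(index: Int, parentIndex: Int, firstHalf: Boolean) extends Partition

class SplitInHalfRDD[T: ClassTag](parent: RDD[T])
  extends RDD[T](parent.sparkContext, Seq(new NarrowDependency[T](parent) {
    // Child partitions 2*i and 2*i+1 both depend on parent partition i.
    override def getParents(partitionId: Int): Seq[Int] = Seq(partitionId / 2)
  })) {

  override protected def getPartitions: Array[Partition] =
    parent.partitions.flatMap { p =>
      Seq[Partition](HalfPartition(2 * p.index, p.index, firstHalf = true),
                     HalfPartition(2 * p.index + 1, p.index, firstHalf = false))
    }

  override def compute(split: Partition, context: TaskContext): Iterator[T] = {
    val hp = split.asInstanceOf[HalfPartition]
    // Materialize the parent partition to find its midpoint.
    val elems = parent.iterator(parent.partitions(hp.parentIndex), context).toArray
    val mid = elems.length / 2
    if (hp.firstHalf) elems.iterator.take(mid) else elems.iterator.drop(mid)
  }

  // Each half prefers the same locations as its parent partition.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    parent.preferredLocations(parent.partitions(split.asInstanceOf[HalfPartition].parentIndex))
}

Usage would be something like new SplitInHalfRDD(someRdd); note that if the parent is not persisted, each half recomputes its parent partition, so caching the parent first is usually worthwhile.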

Re: Debugging a PythonException with no details

2017-01-14 Thread Marco Mistroni
It seems it has to do with a UDF. Could you share a snippet of the code you are running? Kr On 14 Jan 2017 1:40 am, "Nicholas Chammas" wrote: > I’m looking for tips on how to debug a PythonException that’s very sparse > on details. The full exception is below, but the only

java.io.NotSerializableException: org.apache.spark.SparkConf

2017-01-14 Thread streamly tester
Hi, I was playing with Spark Streaming and I wanted to collect data from MQTT and publish it to Cassandra. Here is my code: package com.wouri.streamly.examples.mqtt; import java.io.Serializable; import java.math.BigDecimal; import java.util.ArrayList; import java.util.Arrays; import
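
A frequent cause of this particular exception is capturing a driver-side object such as the SparkConf (or an outer class that holds it) inside a transformation closure. A minimal Scala sketch of the pattern and the usual fix; the configuration key is made up:

import org.apache.spark.{SparkConf, SparkContext}

object ConfInClosure {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("conf-in-closure")
    val sc = new SparkContext(conf)

    val data = sc.parallelize(1 to 10)

    // Problematic: referencing `conf` inside the lambda pulls it (and whatever
    // holds it) into the task closure, which is what typically triggers
    // java.io.NotSerializableException.
    // data.map(x => (conf.get("com.example.keyspace", "test"), x))

    // Fix: read the value on the driver and close over a plain String instead.
    val keyspace = conf.get("com.example.keyspace", "test")   // hypothetical key
    val tagged = data.map(x => (keyspace, x))
    tagged.collect().foreach(println)

    sc.stop()
  }
}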

Re: Kryo On Spark 1.6.0

2017-01-14 Thread Yan Facai
For Scala, you could fix it by using: conf.registerKryoClasses(Array(Class.forName("scala.collection.mutable.WrappedArray$ofRef"))) By the way, if the class is an array of a Java primitive type, say byte[], then use: Class.forName("[B"); if it is an array of another class, say
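
Putting those pieces together, a minimal sketch of the registration (the WrappedArray class name applies to the Scala 2.10/2.11 builds current at the time; the String[] entry is just an example of a reference-type array):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("kryo-registration")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Fail fast on unregistered classes, which is how these errors usually surface.
  .set("spark.kryo.registrationRequired", "true")

conf.registerKryoClasses(Array(
  Class.forName("scala.collection.mutable.WrappedArray$ofRef"),
  Class.forName("[B"),                   // byte[]  (array of a Java primitive)
  Class.forName("[Ljava.lang.String;")   // String[] (array of a reference type)
))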

Re: spark locality

2017-01-14 Thread vincent gromakowski
Should I open a ticket to allow data locality in an IP-per-container context? 2017-01-12 23:41 GMT+01:00 Michael Gummelt : > If the executor reports a different hostname inside the CNI container, > then no, I don't think so. > > On Thu, Jan 12, 2017 at 2:28 PM, vincent
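
One rough way to check whether locality can ever match in an IP-per-container setup is to compare the hostnames the executors registered with against the preferred locations an RDD reports; a spark-shell sketch (the HDFS path is a placeholder):

val rdd = sc.textFile("hdfs:///some/path")

// Hostnames/IPs the executors registered with the driver ("host:port" keys).
val executorHosts = sc.getExecutorMemoryStatus.keys.map(_.split(":")(0)).toSet

// Preferred (block) locations for each partition, and whether any executor matches.
rdd.partitions.foreach { p =>
  val prefs = rdd.preferredLocations(p)
  println(s"partition ${p.index}: preferred=$prefs matchesExecutor=${prefs.exists(executorHosts.contains)}")
}

If the preferred locations are datanode hostnames while the executors register container hostnames or IPs, the scheduler will never see anything better than ANY locality.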