Hi, I’m getting a "Task not serializable" exception using Spark GraphX (Scala 2.11.6 / JDK 1.8 / Spark 1.5).
My vertex data is of type (VertexId, immutable Set) and my edge data is of type PartialFunction[ISet[E], ISet[E]], where each ED has a precomputed function.

My vertex program:

```scala
val vertexProgram = (id: VertexId, currentSet: ISet[E], inSet: ISet[E]) =>
  inSet // identity: just adopt the incoming set
```

My send message:

```scala
val sendMessage: EdgeTriplet[ISet[E], MonotonicTransferFunction] => Iterator[(VertexId, ISet[E])] =
  edge => {
    val f = edge.attr
    val currentSet = edge.srcAttr
    Iterator((edge.dstId, f(currentSet)))
  }
```

My message combiner:

```scala
val messageCombiner: (ISet[E], ISet[E]) => ISet[E] = (a, b) => a ++ b
```

The Pregel call:

```scala
g.pregel(bottom, Int.MaxValue, EdgeDirection.Out)(vertexProgram, sendMessage, messageCombiner)
```

I debugged the Pregel execution and found that the exception is thrown the first time Pregel calls mapReduceTriplets to aggregate the messages. I believe this happens after the initial run of the vertex program, which does not throw. I suspect the error lies in my send/combiner functions, but I am not sure. I have also tried storing the PartialFunctions inside the VD instead and still get the same error. At first I thought the error might have to do with Set and how it changes size during execution, but I have successfully run other Pregel projects using immutable sets without issue. I have also tried enclosing each method in its own class that extends Serializable, but this still gives me the same error.

Thank you for your time and information.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
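In case it helps narrow things down, here is the kind of stripped-down check I have been running outside Spark: packaging the transfer function as a top-level case class (instead of a PartialFunction literal, which captures its enclosing instance) and round-tripping it through plain Java serialization, roughly what Spark does before shipping a task. The `GenKill` name and its gen/kill semantics are just placeholders for my real transfer functions, not my actual code.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object SerializableTransferSketch {
  // Stand-ins for the real types; E and ISet are assumptions for this sketch.
  type E = Int
  type ISet[A] = Set[A]

  // Hypothetical transfer function as a top-level case class. A function
  // literal defined inside a non-serializable enclosing class captures `this`;
  // a standalone case class whose fields are all serializable does not, so it
  // can safely travel inside an edge attribute.
  case class GenKill(gen: ISet[E], kill: ISet[E]) extends (ISet[E] => ISet[E]) with Serializable {
    def apply(in: ISet[E]): ISet[E] = (in -- kill) ++ gen
  }

  // Round-trip through Java serialization; returns false if the object (or
  // anything it transitively references) is not serializable.
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

Running `serializes` on each of my closures (vertex program, send, combiner) this way all return true, which is partly why I am confused about where the non-serializable reference is coming from.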