The compiled jar is not consistent with the Python source; maybe you are running an older version of the PySpark Python code, but with the assembly jar of Spark Core 1.4?
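One quick way to confirm that kind of mismatch is to compare where the Python half of PySpark is imported from with the version the JVM side reports (the JVM side comes from the assembly jar). A rough sketch, just as an illustration (the app name is arbitrary, not from your script):

    import pyspark
    from pyspark import SparkContext

    # Which python/pyspark tree is actually on sys.path (supplies rdd.py etc.)
    print("Python source loaded from:", pyspark.__file__)

    sc = SparkContext(appName="version-check")  # app name is arbitrary
    # sc.version is reported by the JVM, i.e. by the assembly jar in use
    print("Assembly jar reports Spark:", sc.version)
    sc.stop()

If the two disagree (say, an older python/pyspark tree driving a 1.4 assembly), the PythonRDD constructor call in rdd.py won't match what the jar exposes, and you get exactly the Py4JException below.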
On Sun, Jun 21, 2015 at 7:24 AM, Shaanan Cohney <shaan...@gmail.com> wrote:
>
> Hi all,
>
> I'm having an issue running some code that works on a build of Spark I made
> (and still have), but after rebuilding it I get the traceback below. I built
> it from the 1.4.0 release with the hadoop-2.4 profile but Hadoop version 2.7,
> and I'm using Python 3. It's not vital to my work (I can use my other build),
> but I'd still like to figure out what's going on.
>
> Best,
> shaananc
>
> Traceback (most recent call last):
>   File "factor.py", line 73, in <module>
>     main()
>   File "factor.py", line 53, in main
>     poly_filename = polysel.run(sc, parameters)
>   File "/home/ubuntu/spark_apps/polysel.py", line 90, in run
>     polysel1_bestpolys = run_polysel1(sc, parameters)
>   File "/home/ubuntu/spark_apps/polysel.py", line 72, in run_polysel1
>     polysel1_bestpolys = [v for _, v in polysel1_polys.takeOrdered(nrkeep,
>         key=lambda s: s[0])]
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 1198, in takeOrdered
>     return self.mapPartitions(lambda it: [heapq.nsmallest(num, it,
>         key)]).reduce(merge)
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 762, in reduce
>     vals = self.mapPartitions(func).collect()
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 736, in collect
>     port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 2343, in _jrdd
>     bvars, self.ctx._javaAccumulator)
>   File "/usr/local/lib/python3.4/dist-packages/py4j/java_gateway.py", line 701, in __call__
>     self._fqn)
>   File "/usr/local/lib/python3.4/dist-packages/py4j/protocol.py", line 304, in get_return_value
>     format(target_id, '.', name, value))
> py4j.protocol.Py4JError: An error occurred while calling
> None.org.apache.spark.api.python.PythonRDD. Trace:
> py4j.Py4JException: Constructor org.apache.spark.api.python.PythonRDD([class
> org.apache.spark.rdd.ParallelCollectionRDD, class [B, class
> java.util.HashMap, class java.util.ArrayList, class java.lang.Boolean, class
> java.lang.String, class java.util.ArrayList, class
> org.apache.spark.Accumulator]) does not exist
>         at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:184)
>         at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:202)
>         at py4j.Gateway.invoke(Gateway.java:213)
>         at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
>         at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>         at java.lang.Thread.run(Thread.java:745)