The compiled jar is not consistent with the Python source; maybe you are
using an older version of the PySpark Python code with the assembly jar of
Spark Core 1.4?
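
A quick way to check is to compare what each side reports. This is just a
rough sketch (the app name is arbitrary, and it assumes a SparkContext can
still be created, which your traceback suggests it can):

    import pyspark
    from pyspark import SparkContext

    sc = SparkContext(appName="version-check")
    # Which Python source tree is actually being imported:
    print("pyspark source:", pyspark.__file__)
    # Which version the JVM-side assembly jar reports:
    print("JVM Spark version:", sc.version)
    sc.stop()

If the two disagree, the Python side can end up calling a PythonRDD
constructor signature that does not exist in the jar, which is exactly what
the Py4JException below complains about.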

On Sun, Jun 21, 2015 at 7:24 AM, Shaanan Cohney <shaan...@gmail.com> wrote:
>
> Hi all,
>
>
> I'm having an issue running some code that works on a build of Spark I made
> (and still have), but after rebuilding it I now get the traceback below. I
> built it from the 1.4.0 release with the hadoop-2.4 profile but Hadoop
> version 2.7 (rough build command below), and I'm using Python 3. It's not
> vital to my work (since I can use my other build), but I'd still like to
> figure out what's going on.
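>
> For reference, the build command was roughly the following (flags from
> memory, so treat it as approximate):
>
>     mvn -Phadoop-2.4 -Dhadoop.version=2.7.0 -DskipTests clean package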
>
> Best,
> shaananc
>
> Traceback (most recent call last):
>   File "factor.py", line 73, in <module>
>     main()
>   File "factor.py", line 53, in main
>     poly_filename = polysel.run(sc, parameters)
>   File "/home/ubuntu/spark_apps/polysel.py", line 90, in run
>     polysel1_bestpolys = run_polysel1(sc, parameters)
>   File "/home/ubuntu/spark_apps/polysel.py", line 72, in run_polysel1
>     polysel1_bestpolys = [v for _, v in polysel1_polys.takeOrdered(nrkeep, key=lambda s: s[0])]
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 1198, in takeOrdered
>     return self.mapPartitions(lambda it: [heapq.nsmallest(num, it, key)]).reduce(merge)
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 762, in reduce
>     vals = self.mapPartitions(func).collect()
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 736, in collect
>     port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
>   File "/home/ubuntu/spark/python/pyspark/rdd.py", line 2343, in _jrdd
>     bvars, self.ctx._javaAccumulator)
>   File "/usr/local/lib/python3.4/dist-packages/py4j/java_gateway.py", line 701, in __call__
>     self._fqn)
>   File "/usr/local/lib/python3.4/dist-packages/py4j/protocol.py", line 304, in get_return_value
>     format(target_id, '.', name, value))
> py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonRDD. Trace:
> py4j.Py4JException: Constructor org.apache.spark.api.python.PythonRDD([class org.apache.spark.rdd.ParallelCollectionRDD, class [B, class java.util.HashMap, class java.util.ArrayList, class java.lang.Boolean, class java.lang.String, class java.util.ArrayList, class org.apache.spark.Accumulator]) does not exist
> at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:184)
> at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:202)
> at py4j.Gateway.invoke(Gateway.java:213)
> at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
> at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
> at py4j.GatewayConnection.run(GatewayConnection.java:207)
> at java.lang.Thread.run(Thread.java:745)
>
>
