>>> sc._jvm.java.lang.Integer.valueOf("12")
12

FYI
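
For what it's worth, the AttributeError in the quoted message is expected: the SparkContext and its Py4J gateway live only in the driver process, so SparkContext._active_spark_context is None inside a function that runs on a worker. A minimal sketch of one workaround, assuming plain integer parsing is all that is needed, is to do the conversion in Python rather than through the JVM. The helper name parse_int is hypothetical, and Python's int is not an exact equivalent of Integer.valueOf (it is unbounded and tolerates surrounding whitespace).

```python
def parse_int(s):
    # Hypothetical helper: pure-Python stand-in for
    # java.lang.Integer.valueOf(s.strip()). It needs no JVM access,
    # so it is safe to call inside a function shipped to the workers.
    return int(s.strip())

# With a live SparkContext `sc` (as in the bin/pyspark shell), usage would be:
#   data = sc.parallelize(["123", "234"])
#   numbers = data.map(parse_int)
#   numbers.collect()

# The same function applied locally, without Spark:
parsed = [parse_int(s) for s in ["123", " 234 "]]
```

If a JVM class really is required, the call has to stay on the driver side, as in the sc._jvm example above.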

On Mon, Sep 28, 2015 at 8:08 PM, YiZhi Liu <javeli...@gmail.com> wrote:

> Hi,
>
> I'm doing some data processing in PySpark, but I can't reach the JVM
> from the workers. Here is what I did:
>
> $ bin/pyspark
> >>> data = sc.parallelize(["123", "234"])
> >>> numbers = data.map(lambda s: SparkContext._active_spark_context._jvm.java.lang.Integer.valueOf(s.strip()))
> >>> numbers.collect()
>
> This fails with:
>
> Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>   File "/mnt/hgfs/lewis/Workspace/source-codes/spark/python/lib/pyspark.zip/pyspark/worker.py", line 111, in main
>     process()
>   File "/mnt/hgfs/lewis/Workspace/source-codes/spark/python/lib/pyspark.zip/pyspark/worker.py", line 106, in process
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/mnt/hgfs/lewis/Workspace/source-codes/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
>     vs = list(itertools.islice(iterator, batch))
>   File "<stdin>", line 1, in <lambda>
> AttributeError: 'NoneType' object has no attribute '_jvm'
>
> at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:138)
> at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:179)
> at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:97)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> ... 1 more
>
> Meanwhile, _jvm works fine on the driver side:
>
> >>> SparkContext._active_spark_context._jvm.java.lang.Integer.valueOf("123".strip())
> 123
>
> The program is trivial; I just wonder what the right way is to reach
> the JVM from Python. Any help would be appreciated.
>
> Thanks
>
> --
> Yizhi Liu
> Senior Software Engineer / Data Mining
> www.mvad.com, Shanghai, China
>
