Re: SparkContext._active_spark_context returns None

2015-09-29 Thread Ted Yu
> the right way to reach JVM in python

Can you tell us more about what you want to achieve? If you want to pass some value to workers, you can use a broadcast variable.

Cheers
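A minimal sketch of the broadcast-variable route Ted suggests (the lookup table and all names here are hypothetical, not from the thread): the value is shipped to every executor once and read locally inside map, so no JVM access is needed on the workers.

from pyspark import SparkContext

sc = SparkContext(appName="broadcast-sketch")

# Driver-side value we want every worker to see (hypothetical lookup table).
lookup = {"123": "foo", "234": "bar"}
b_lookup = sc.broadcast(lookup)

data = sc.parallelize(["123", "234"])
# Workers read b_lookup.value locally; no py4j gateway is involved.
result = data.map(lambda s: b_lookup.value.get(s.strip())).collect()
print(result)  # ['foo', 'bar']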

Re: SparkContext._active_spark_context returns None

2015-09-29 Thread YiZhi Liu
Hi Ted, I think I've made a mistake. I referred to python/mllib; callJavaFunc in mllib/common.py uses SparkContext._active_spark_context because it is called from the driver. So maybe there is no explicit way to reach the JVM during RDD operations? What I want to achieve is to take a ThriftWritable
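For reference, the driver-side pattern in python/pyspark/mllib/common.py looks roughly like this (paraphrased from the Spark 1.5-era source; the real helpers also convert arguments with _py2java/_java2py). It only works because _active_spark_context is set in the driver process:

from pyspark import SparkContext

def callMLlibFunc(name, *args):
    # Valid only on the driver: _active_spark_context is None inside the
    # Python worker processes that execute rdd.map closures.
    sc = SparkContext._active_spark_context
    api = getattr(sc._jvm.PythonMLLibAPI(), name)
    return api(*args)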

Re: SparkContext._active_spark_context returns None

2015-09-28 Thread YiZhi Liu
Hi Ted, Thank you for the reply. The sc works on the driver, but how can I reach the JVM in rdd.map?
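One driver-side workaround, sketched here as an assumption rather than something settled in the thread: keep the JVM call on the driver (where sc._jvm is available, e.g. in the pyspark shell) and only ship plain Python values to the workers.

raw = ["123", "234"]
# Runs on the driver; py4j converts the returned java.lang.Integer to a Python int.
converted = [sc._jvm.java.lang.Integer.valueOf(s.strip()) for s in raw]
rdd = sc.parallelize(converted)
print(rdd.collect())  # [123, 234]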

Re: SparkContext._active_spark_context returns None

2015-09-28 Thread Ted Yu
>>> sc._jvm.java.lang.Integer.valueOf("12")
12

FYI
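For context (my reading, not stated in the thread): the call above works because the pyspark shell's driver process holds a py4j gateway, and sc._jvm is a py4j JVMView bound to it; map closures run in separate Python worker processes that have no such gateway.

$ bin/pyspark
>>> sc._jvm                                  # py4j JVMView, present in the driver only
<py4j.java_gateway.JVMView object at 0x...>
>>> sc._jvm.java.lang.Integer.valueOf("12")  # py4j returns a plain Python int
12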

SparkContext._active_spark_context returns None

2015-09-28 Thread YiZhi Liu
Hi,

I'm doing some data processing on pyspark, but I failed to reach the JVM in workers. Here is what I did:

$ bin/pyspark
>>> data = sc.parallelize(["123", "234"])
>>> numbers = data.map(lambda s:
...     SparkContext._active_spark_context._jvm.java.lang.Integer.valueOf(s.strip()))
>>>
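For completeness, a short sketch of my reading of the failure (not output copied from the thread): the lambda runs in a Python worker process where SparkContext._active_spark_context is None, so the JVM lookup raises once an action forces the map to execute; converting in pure Python avoids the JVM entirely.

>>> # numbers.collect() would fail when the closure runs on a worker,
>>> # because _active_spark_context is None there (no py4j gateway).
>>> data.map(lambda s: int(s.strip())).collect()   # pure-Python alternative
[123, 234]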