Help needed with Py4J
Hi Colleagues,

We need to call a Scala class from PySpark in an IPython notebook. We tried something like below:

    from py4j.java_gateway import java_import
    java_import(sparkContext._jvm, 'mynamespace')
    myScalaClass = sparkContext._jvm.SimpleScalaClass()
    myScalaClass.sayHello("World")

This works fine. But when we try to pass sparkContext to our class, it fails:

    myContext = _jvm.MySQLContext(sparkContext)

fails with:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-19-34330244f574> in <module>()
    ----> 1 z = _jvm.MySQLContext(sparkContext)

    C:\Users\i033085\spark\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py in __call__(self, *args)
        690
        691         args_command = ''.join(
    --> 692             [get_command_part(arg, self._pool) for arg in new_args])
        693
        694         command = CONSTRUCTOR_COMMAND_NAME +\

    C:\Users\i033085\spark\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py in get_command_part(parameter, python_proxy_pool)
        263             command_part += ';' + interface
        264         else:
    --> 265             command_part = REFERENCE_TYPE + parameter._get_object_id()
        266
        267     command_part += '\n'

    AttributeError: 'SparkContext' object has no attribute '_get_object_id'

And:

    myContext = _jvm.MySQLContext(sparkContext._jsc)

fails with:

    Constructor org.apache.spark.sql.MySQLContext([class org.apache.spark.api.java.JavaSparkContext]) does not exist

Would this be possible, or are there serialization issues that make it impossible? If not, what options do we have to instantiate our own SQLContext, written in Scala, from PySpark?

Best Regards,
Santosh
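[A minimal sketch of why the first call fails, heavily simplified from py4j.protocol.get_command_part: Py4J can only marshal primitives and proxies for objects that already live in the JVM, and such proxies carry a _get_object_id() method. PySpark's SparkContext is a plain Python wrapper, so it has no such method. The class and function names below are hypothetical stand-ins, not real Py4J API.]

```python
class JvmProxy:
    """Stands in for a py4j JavaObject: a handle to a JVM-side object."""
    def __init__(self, object_id):
        self._object_id = object_id

    def _get_object_id(self):
        return self._object_id


class PythonWrapper:
    """Stands in for pyspark's SparkContext: a pure-Python object."""


def reference_command_part(arg):
    # Simplified shape of the non-primitive branch in py4j's
    # get_command_part: build a reference command from the object id.
    return 'r' + arg._get_object_id() + '\n'


print(reference_command_part(JvmProxy('o42')))  # a JVM proxy marshals fine

try:
    reference_command_part(PythonWrapper())     # a plain Python object does not
except AttributeError as err:
    print(err)  # ...object has no attribute '_get_object_id', as in the traceback
```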
Re: Help needed with Py4J
Yeah... I am able to instantiate the simple Scala class as explained below, which is from the same JAR.

Regards,
Santosh

On May 20, 2015, at 7:26 PM, Holden Karau <hol...@pigscanfly.ca> wrote:

> Are your jars included in both the driver and worker class paths?
>
> On Wednesday, May 20, 2015, Addanki, Santosh Kumar <santosh.kumar.adda...@sap.com> wrote:
>
>> Hi Colleagues
>> We need to call a Scala class from PySpark in an IPython notebook. [...]
Re: Help needed with Py4J
Ah sorry, I missed that part (I've been dealing with some Py4J stuff today as well and maybe skimmed it a bit too quickly). Do you have your code somewhere I could take a look at? Also, does your constructor expect a JavaSparkContext or a regular SparkContext? If you look at how the SQLContext is constructed in Python, it is done using a regular SparkContext, so _jsc.sc() is used.

On Wed, May 20, 2015 at 7:32 PM, Addanki, Santosh Kumar <santosh.kumar.adda...@sap.com> wrote:

> Yeah... I am able to instantiate the simple Scala class as explained below, which is from the same JAR.
>
> On May 20, 2015, at 7:26 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
>
>> Are your jars included in both the driver and worker class paths?
>>
>> On Wednesday, May 20, 2015, Addanki, Santosh Kumar <santosh.kumar.adda...@sap.com> wrote:
>>
>>> Hi Colleagues
>>> We need to call a Scala class from PySpark in an IPython notebook. [...]

--
Cell: 425-233-8271
Twitter: https://twitter.com/holdenkarau
LinkedIn: https://www.linkedin.com/in/holdenkarau
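[Holden's pointer can be sketched as below: mirror how pyspark.sql.SQLContext builds its Java counterpart by passing the underlying Scala SparkContext, reached via _jsc.sc(), rather than the Python SparkContext or the JavaSparkContext wrapper. MySQLContext and mynamespace are the hypothetical names from this thread, and the function name is an illustration; the sketch assumes a live PySpark gateway and has not been run against a cluster.]

```python
def make_custom_sql_context(spark_context):
    """Instantiate a JVM-side SQLContext subclass from PySpark.

    Sketch only: it mirrors how pyspark.sql.SQLContext constructs its
    Java counterpart. The Scala constructor should accept an
    org.apache.spark.SparkContext, which we reach by unwrapping the
    JavaSparkContext via _jsc.sc(). MySQLContext and mynamespace are
    the names used in this thread.
    """
    # Imported lazily so the sketch stands alone; needs a live gateway.
    from py4j.java_gateway import java_import

    java_import(spark_context._jvm, 'mynamespace')
    # _jsc is the JavaSparkContext wrapper; .sc() returns the underlying
    # Scala SparkContext that the Scala-side constructor expects.
    return spark_context._jvm.MySQLContext(spark_context._jsc.sc())


# Usage (inside a running PySpark session):
#   myContext = make_custom_sql_context(sparkContext)
```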
Re: Help needed with Py4J
Are your jars included in both the driver and worker class paths?

On Wednesday, May 20, 2015, Addanki, Santosh Kumar <santosh.kumar.adda...@sap.com> wrote:

> Hi Colleagues
> We need to call a Scala class from PySpark in an IPython notebook. [...]