[ 
https://issues.apache.org/jira/browse/KUDU-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866551#comment-15866551
 ] 

Andy Stadtler commented on KUDU-1603:
-------------------------------------

We probably need a Python class that wraps KuduContext so that calling it is 
cleaner and you don't have to do ugly stuff like this:

kc = sc._jvm.org.apache.kudu.spark.kudu.KuduContext("kudu.master:7051")
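
A minimal sketch of what such a wrapper could look like. The class name, its 
constructor arguments, and the attribute forwarding are assumptions for 
illustration, not an existing kudu-python or kudu-spark API; the only call 
taken from the line above is the KuduContext constructor reached through 
sc._jvm.

```python
class KuduContext(object):
    """Hypothetical thin Python wrapper around the JVM-side KuduContext."""

    def __init__(self, sc, kudu_master):
        # sc is expected to be a pyspark SparkContext; sc._jvm is its Py4J
        # view of the JVM, exactly as used in the one-liner above.
        self.kudu_master = kudu_master
        self._jkc = sc._jvm.org.apache.kudu.spark.kudu.KuduContext(kudu_master)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. Private names are
        # rejected so a half-initialised object cannot recurse forever.
        if name.startswith("_"):
            raise AttributeError(name)
        # Forward everything else to the wrapped JVM object.
        return getattr(self._jkc, name)
```

With a live SparkContext this would be used as 
kc = KuduContext(sc, "kudu.master:7051"), and any method called on kc is 
passed through to the JVM object.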

We also probably need a helper that converts a Java ArrayList to a Scala Seq 
for KuduRDD, since Py4J converts a Python list to an ArrayList. I'm not a 
Scala person, but something simple like this works:

import java.util.ArrayList
import scala.collection.JavaConverters._

// Convert the java.util.ArrayList that Py4J passes in to a Scala Seq.
def ArrayListToSeq(al: ArrayList[String]): Seq[String] = al.asScala.toSeq
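
The same conversion could probably also be driven from the Python side 
without a new Scala helper. The sketch below is untested against a real 
cluster: to_scala_seq is a hypothetical name, and it assumes that Scala's 
JavaConverters object is reachable through sc._jvm in the Scala version 
Spark was built with.

```python
def to_scala_seq(sc, items):
    """Build a Scala Seq on the JVM from a Python iterable (hypothetical helper)."""
    # Create the java.util.ArrayList explicitly instead of relying on
    # Py4J's automatic list conversion.
    jlist = sc._jvm.java.util.ArrayList()
    for item in items:
        jlist.add(item)
    # JVM-side equivalent of `al.asScala.toSeq` in the Scala snippet above.
    converters = sc._jvm.scala.collection.JavaConverters
    return converters.asScalaBufferConverter(jlist).asScala().toSeq()
```

The returned value is a Py4J handle to the JVM-side Seq, so it is only useful 
as an argument to other JVM calls.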

> Pyspark Integration
> -------------------
>
>                 Key: KUDU-1603
>                 URL: https://issues.apache.org/jira/browse/KUDU-1603
>             Project: Kudu
>          Issue Type: New Feature
>          Components: integration, python, spark
>            Reporter: Jordan Birdsell
>              Labels: features
>
> Now that integration with the Spark Scala/Java API is in place, work can 
> begin on exposing it to Python and integrating with PySpark. For use cases 
> such as data science, this would likely be a more desirable Python 
> interface to Kudu than the current Python client.



