[ 
https://issues.apache.org/jira/browse/SPARK-12834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-12834:
------------------------------------

    Assignee: Apache Spark

> Use type conversion instead of Ser/De of Pickle to transform JavaArray and 
> JavaList
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-12834
>                 URL: https://issues.apache.org/jira/browse/SPARK-12834
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Xusen Yin
>            Assignee: Apache Spark
>
> The Ser/De code on the Python side currently reads:
> {code:title=StringIndexerModel|theme=FadeToGrey|linenumbers=true|language=python|firstline=0001|collapse=false}
> def _java2py(sc, r, encoding="bytes"):
>     if isinstance(r, JavaObject):
>         clsName = r.getClass().getSimpleName()
>         # convert RDD into JavaRDD
>         if clsName != 'JavaRDD' and clsName.endswith("RDD"):
>             r = r.toJavaRDD()
>             clsName = 'JavaRDD'
>         if clsName == 'JavaRDD':
>             jrdd = sc._jvm.SerDe.javaToPython(r)
>             return RDD(jrdd, sc)
>         if clsName == 'DataFrame':
>             return DataFrame(r, SQLContext.getOrCreate(sc))
>         if clsName in _picklable_classes:
>             r = sc._jvm.SerDe.dumps(r)
>         elif isinstance(r, (JavaArray, JavaList)):
>             try:
>                 r = sc._jvm.SerDe.dumps(r)
>             except Py4JJavaError:
>                 pass  # not picklable
>     if isinstance(r, (bytearray, bytes)):
>         r = PickleSerializer().loads(bytes(r), encoding=encoding)
>     return r
> {code}
> We use SerDe.dumps to serialize JavaArray and JavaList in PythonMLLibAPI, 
> then deserialize them with PickleSerializer on the Python side. This round 
> trip is unnecessary and inefficient. Instead, we can convert them directly 
> with type conversion, e.g. list(JavaArray) or list(JavaList). 
> In addition, there is an issue with Ser/De of Scala Array, as I noted in 
> https://issues.apache.org/jira/browse/SPARK-12780
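
A minimal sketch of the proposed type conversion, for illustration only. It
is not the actual patch. FakeJavaArray, FakeJavaList and
convert_java_sequence are hypothetical names; the two Fake* classes are
stand-ins so the example runs without a JVM. In PySpark the real types
would be py4j.java_collections.JavaArray and JavaList, which are likewise
iterable from Python.

```python
# Hypothetical stand-ins for py4j's JavaArray / JavaList (assumption: the
# real classes are iterable from Python, which is all list() needs).
class FakeJavaArray:
    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


class FakeJavaList(FakeJavaArray):
    pass


def convert_java_sequence(r, sequence_types=(FakeJavaArray, FakeJavaList)):
    """Convert a Java-backed sequence to a Python list without pickling.

    Anything that is not one of `sequence_types` is returned unchanged,
    mirroring how _java2py passes through values it does not handle.
    """
    if isinstance(r, sequence_types):
        # Plain type conversion: iterate the proxy and collect the elements.
        return list(r)
    return r


if __name__ == "__main__":
    print(convert_java_sequence(FakeJavaArray([1.0, 2.0, 3.0])))
    print(convert_java_sequence(FakeJavaList(["a", "b"])))
    print(convert_java_sequence("unchanged"))
```

Note that list() is a shallow conversion: each element is whatever iteration
over the sequence yields, so nested Java objects would still need separate
handling.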



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

