[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231883#comment-15231883 ]

DB Tsai commented on SPARK-13944:
---------------------------------

I'm going to close this PR and work on 
https://issues.apache.org/jira/browse/SPARK-14462 first. 

I think we will leave the `mllib` code untouched and stop maintaining it. 
We'll copy the code into the `ml` package, and all further development will 
happen in the new `ml` package. As a result, the two packages will not depend 
on each other, which makes them easier to maintain.

For your second point, we'll keep all the `mllib` code untouched, but users of 
the new `ml` code will have to migrate to the new `ml.linalg`. We did consider 
using type aliases, and decided against it for the same Java compatibility 
reason :)
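
To illustrate the trade-off, here is a minimal sketch, assuming two 
independent packages that each define their own vector class. `OldLinalg`, 
`NewLinalg` and `Migration.toNew` are hypothetical stand-ins, not Spark's 
classes or API. A Scala `type` alias exists only for the Scala compiler and is 
not visible to Java callers as a type, which is why the sketch uses two real 
classes and an explicit conversion instead.

```scala
// Hypothetical stand-ins for the old and new packages; not Spark's classes.
object OldLinalg {
  final case class DenseVector(values: Array[Double])
}

object NewLinalg {
  final case class DenseVector(values: Array[Double])
}

// Hypothetical helper: an explicit conversion at the boundary between packages.
object Migration {
  // Wraps the same backing array, so no element data is copied.
  def toNew(v: OldLinalg.DenseVector): NewLinalg.DenseVector =
    NewLinalg.DenseVector(v.values)
}
```

Because the two classes are unrelated types, code written against the old one 
does not compile against the new one until it converts explicitly, which is the 
migration cost mentioned above.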

For UDT, in ScalaReflection.scala line 700, instead of the following

[code]
        val udt = Utils.classForName(className)
          .getAnnotation(classOf[SQLUserDefinedType]).udt().newInstance()
[/code]

we can add a different way to get the UDT without using the annotation. 
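
As a rough sketch of that idea, the lookup could go through an explicit 
registry keyed by class name instead of reading `SQLUserDefinedType` from the 
class itself. Everything below (`SketchUDT`, `UdtRegistry`, `Point`, 
`PointUDT`) is hypothetical and self-contained. None of it is Spark's actual 
API, and it assumes the definitions are compiled as a standalone file.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a user-defined type descriptor; not Spark's class.
trait SketchUDT { def typeName: String }

// A user class that carries no annotation, and the UDT that describes it.
class Point(val x: Double, val y: Double)
class PointUDT extends SketchUDT { val typeName: String = "point" }

// Hypothetical registry mapping a user class name to its UDT class name.
object UdtRegistry {
  private val registered = mutable.Map.empty[String, String]

  def register(userClass: String, udtClass: String): Unit =
    synchronized { registered(userClass) = udtClass }

  // Instantiates the registered UDT reflectively; None if nothing is registered.
  def lookup(userClass: String): Option[SketchUDT] = synchronized {
    registered.get(userClass).map { udtName =>
      Class.forName(udtName)
        .getDeclaredConstructor()
        .newInstance()
        .asInstanceOf[SketchUDT]
    }
  }
}
```

A caller would register the pair once, for example 
`UdtRegistry.register(classOf[Point].getName, classOf[PointUDT].getName)`, and 
the reflection code could consult the registry first and fall back to the 
annotation only when `lookup` returns `None`.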

Thanks.

> Separate out local linear algebra as a standalone module without Spark 
> dependency
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-13944
>                 URL: https://issues.apache.org/jira/browse/SPARK-13944
>             Project: Spark
>          Issue Type: New Feature
>          Components: Build, ML
>    Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
>            Assignee: DB Tsai
>            Priority: Blocker
>
> Separate out linear algebra as a standalone module without Spark dependency 
> to simplify production deployment. We can call the new module 
> spark-mllib-local, which might contain local models in the future.
> The major issue is to remove dependencies on user-defined types.
> The package name will be changed from mllib to ml. For example, Vector will 
> be changed from `org.apache.spark.mllib.linalg.Vector` to 
> `org.apache.spark.ml.linalg.Vector`. The return vector type in the new ML 
> pipeline will be the one in ML package; however, the existing mllib code will 
> not be touched. As a result, this will potentially break the API. Also, when 
> the vector is loaded from mllib vector by Spark SQL, the vector will be 
> automatically converted into the one in the ml package.




