[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231883#comment-15231883 ]
DB Tsai commented on SPARK-13944:
---------------------------------

I'm going to close this PR and work on https://issues.apache.org/jira/browse/SPARK-14462 first.

I think we will leave the `mllib` code untouched and stop maintaining it. We'll copy the code into the `ml` package, and all further development will happen in the new `ml` package. As a result, the two packages will not depend on each other, which makes them easier to maintain.

On your second point: we'll keep all the `mllib` code untouched, but users of the new `ml` code will have to migrate to the new `ml.linalg`. We did consider type aliases and decided against them, for the same Java compatibility reason :)

For UDT, in ScalaReflection.scala line 700, instead of the following

[code]
val udt = Utils.classForName(className)
  .getAnnotation(classOf[SQLUserDefinedType]).udt().newInstance()
[/code]

we can add a different way to get the UDT without using the annotation.

Thanks.

> Separate out local linear algebra as a standalone module without Spark dependency
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-13944
>                 URL: https://issues.apache.org/jira/browse/SPARK-13944
>             Project: Spark
>          Issue Type: New Feature
>          Components: Build, ML
>   Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
>            Assignee: DB Tsai
>            Priority: Blocker
>
> Separate out linear algebra as a standalone module without Spark dependency
> to simplify production deployment. We can call the new module
> spark-mllib-local, which might contain local models in the future.
> The major issue is to remove dependencies on user-defined types.
> The package name will be changed from mllib to ml. For example, Vector will
> be changed from `org.apache.spark.mllib.linalg.Vector` to
> `org.apache.spark.ml.linalg.Vector`. The return vector type in the new ML
> pipeline will be the one in the ML package; however, the existing mllib code
> will not be touched.
> As a result, this will potentially break the API. Also, when a vector is
> loaded from an mllib vector by Spark SQL, it will be automatically converted
> into the one in the ml package.
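
On the UDT point in the comment above, one possible shape for "a different way to get the UDT without using the annotation" is a lookup table keyed by the user class name, so the data class itself carries no annotation and no dependency on the SQL layer. The following is only a sketch under that assumption: `SimpleUdt`, `LocalVector`, `LocalVectorUdt`, `UdtRegistry` and `UdtRegistryDemo` are hypothetical names made up for illustration, not Spark classes, and the comment does not say which mechanism will actually be used.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for a user-defined type; NOT Spark's UserDefinedType.
trait SimpleUdt { def typeName: String }

// A plain data class: no annotation, no reference to any UDT class.
final case class LocalVector(values: Array[Double])

// Hypothetical UDT describing LocalVector; it would live in the SQL-facing module.
class LocalVectorUdt extends SimpleUdt { val typeName: String = "vector" }

// Hypothetical registry mapping a user class name to a factory for its UDT.
object UdtRegistry {
  private val factories = TrieMap.empty[String, () => SimpleUdt]

  // Associate a user class name with a way to build its UDT.
  def register(userClassName: String)(factory: () => SimpleUdt): Unit =
    factories.update(userClassName, factory)

  // Resolve the UDT by class name, without inspecting annotations.
  def udtFor(userClassName: String): Option[SimpleUdt] =
    factories.get(userClassName).map(f => f())
}

object UdtRegistryDemo {
  // Registers the hypothetical UDT, then resolves it from the class name alone.
  def run(): Option[String] = {
    UdtRegistry.register(classOf[LocalVector].getName)(() => new LocalVectorUdt)
    UdtRegistry.udtFor(classOf[LocalVector].getName).map(_.typeName)
  }
}
```

With something along these lines, the reflection code would consult the registry for `className` before (or instead of) calling `getAnnotation(classOf[SQLUserDefinedType])`, and the standalone linear algebra module would not need to reference the annotation at all.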