[ https://issues.apache.org/jira/browse/MAHOUT-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762889#comment-15762889 ]
ASF GitHub Bot commented on MAHOUT-1896:
----------------------------------------
Github user andrewpalumbo commented on a diff in the pull request:
https://github.com/apache/mahout/pull/263#discussion_r93156487
--- Diff:
spark/src/test/scala/org/apache/mahout/sparkbindings/drm/DrmLikeSuite.scala ---
@@ -63,6 +65,65 @@ class DrmLikeSuite extends FunSuite with DistributedSparkSuite with DrmLikeSuite
throw new AssertionError("Block must be dense.")
keys -> block
}).norm should be < 1e-4
+
+ }
+
+ test("DRM wrap labeled points") {
+
+ import org.apache.spark.mllib.linalg.{Vectors => SparkVector}
+ import org.apache.spark.mllib.regression.LabeledPoint
+
+ val sc = mahoutCtx.asInstanceOf[SparkDistributedContext].sc
+
+ val lpRDD = sc.parallelize(Seq(LabeledPoint(1.0, SparkVector.dense(2.0, 0.0, 4.0)),
+   LabeledPoint(2.0, SparkVector.dense(3.0, 0.0, 5.0)),
+   LabeledPoint(3.0, SparkVector.dense(4.0, 0.0, 6.0))))
+
+ val lpDRM = drmWrapMLLibLabeledPoint(rdd = lpRDD)
+ val lpM = lpDRM.collect(::, ::)
+ val testM = dense((1,2,0,4), (2,3,0,5), (3,4,0,6))
+
+ assert(lpM === testM)
+ }
+ test("DRM wrap spark vectors") {
+
+ import org.apache.spark.mllib.linalg.{Vectors => SparkVector}
+
+ val sc = mahoutCtx.asInstanceOf[SparkDistributedContext].sc
--- End diff --
same implicit conversion as before.
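
For context, the suggestion presumably refers to relying on an implicit SparkDistributedContext-to-SparkContext conversion from the Mahout Spark bindings instead of the explicit asInstanceOf cast. A minimal sketch of that pattern follows; the class and conversion names here are illustrative stand-ins, not the exact bindings code:

import org.apache.spark.SparkContext
import scala.language.implicitConversions

// Illustrative stand-in for the distributed context wrapper; the real
// SparkDistributedContext is defined in org.apache.mahout.sparkbindings.
class SparkDistributedContext(val sc: SparkContext)

object ContextConversions {
  // With this implicit in scope, a value of type SparkDistributedContext can
  // call SparkContext methods such as parallelize directly, so test code no
  // longer needs the mahoutCtx.asInstanceOf[SparkDistributedContext].sc cast
  // seen in the diff above.
  implicit def sdc2sc(sdc: SparkDistributedContext): SparkContext = sdc.sc
}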
> Add convenience methods for interacting with Spark ML
> -----------------------------------------------------
>
> Key: MAHOUT-1896
> URL: https://issues.apache.org/jira/browse/MAHOUT-1896
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.12.2
> Reporter: Trevor Grant
> Assignee: Trevor Grant
> Priority: Minor
> Fix For: 0.13.0
>
>
> Currently the method for ingesting RDDs to DRM is `drmWrap`. This is a
> flexible method; however, there are many cases where the RDD to be wrapped is
> either RDD[org.apache.spark.mllib.linalg.Vector],
> RDD[org.apache.spark.mllib.regression.LabeledPoint], or DataFrame[Row] (as is
> the case when working with SparkML). It makes sense to create convenience
> methods for converting these types to DRM.
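
As a rough illustration of the kind of convenience method being proposed, here is a sketch that wraps an RDD[LabeledPoint] by keying each row with its position in the RDD and prepending the label to the feature values, matching the expected matrix in the test above. The helper name is made up for illustration, and the drmWrap call assumes its usual RDD-of-(key, Mahout vector) input; the actual PR code may differ:

import org.apache.mahout.math.{DenseVector, Vector => MahoutVector}
import org.apache.mahout.sparkbindings._
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Sketch only: LabeledPoint(1.0, [2.0, 0.0, 4.0]) becomes the DRM row
// (1, 2, 0, 4), keyed by the point's index in the RDD.
def wrapMLLibLabeledPoint(rdd: RDD[LabeledPoint]) = {
  val drmRdd = rdd.zipWithIndex().map { case (lp, idx) =>
    idx.toInt -> (new DenseVector(lp.label +: lp.features.toArray): MahoutVector)
  }
  drmWrap(drmRdd)
}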