imatiach-msft commented on a change in pull request #21632: 
[SPARK-19591][ML][MLlib] Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247767225
 
 

 ##########
 File path: mllib/src/test/scala/org/apache/spark/ml/tree/impl/TreeTests.scala
 ##########
 @@ -18,37 +18,46 @@
 package org.apache.spark.ml.tree.impl
 
 import scala.collection.JavaConverters._
+import scala.util.Random
 
 import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.api.java.JavaRDD
 import org.apache.spark.ml.attribute.{AttributeGroup, NominalAttribute, 
NumericAttribute}
-import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.feature.{Instance, LabeledPoint}
 import org.apache.spark.ml.linalg.Vectors
 import org.apache.spark.ml.tree._
+import org.apache.spark.mllib.util.TestingUtils._
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{DataFrame, SparkSession}
 
 private[ml] object TreeTests extends SparkFunSuite {
 
   /**
    * Convert the given data to a DataFrame, and set the features and label 
metadata.
+   *
    * @param data  Dataset.  Categorical features and labels must already have 
0-based indices.
    *              This must be non-empty.
    * @param categoricalFeatures  Map: categorical feature index to number of 
distinct values
    * @param numClasses  Number of classes label can take.  If 0, mark as 
continuous.
    * @return DataFrame with metadata
    */
   def setMetadata(
-      data: RDD[LabeledPoint],
+      data: RDD[_],
 
 Review comment:
   I tried to create two methods with different RDDs, one for LabeledPoint and 
the other for Instance, but I got an error complaining about duplicate methods 
(probably due to type erasure).  Maybe I misunderstood, is there a way to make 
it work?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to