Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19993#discussion_r158158375
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/param/ParamsSuite.scala ---
    @@ -430,4 +433,49 @@ object ParamsSuite extends SparkFunSuite {
         require(copyReturnType === obj.getClass,
           s"${clazz.getName}.copy should return ${clazz.getName} instead of ${copyReturnType.getName}.")
       }
    +
    +  /**
    +   * Checks that the class throws an exception in case both `inputCols` and `inputCol` are set and
    +   * in case both `outputCols` and `outputCol` are set.
    +   * These checks are performed only whether the class extends respectively both `HasInputCols` and
    +   * `HasInputCol` and both `HasOutputCols` and `HasOutputCol`.
    +   *
    +   * @param paramsClass The Class to be checked
    +   * @param spark A `SparkSession` instance to use
    +   */
    +  def checkMultiColumnParams(paramsClass: Class[_ <: Params], spark: SparkSession): Unit = {
    +    import spark.implicits._
    +    // create fake input Dataset
    +    val feature1 = Array(-1.0, 0.0, 1.0)
    +    val feature2 = Array(1.0, 0.0, -1.0)
    +    val df = feature1.zip(feature2).toSeq.toDF("feature1", "feature2")
    --- End diff --
    
    I created the DataFrame inside the method so that I control the names of its columns. Otherwise we can't ensure that those columns exist. I think the type check is performed later, so it is not a problem here. What do you think?
    
    I preferred to use `paramsClass: Class[_ <: Params]` because I need a clean instance for each of the two checks. If an instance is passed in, I cannot enforce that it is clean, i.e. that no parameters have already been set, and I would still need to copy it to create new instances, since otherwise the second check would be influenced by the first one. What do you think?
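
    The clean-instance point can be illustrated with a minimal sketch. It does not use Spark: `FakeStage`, `freshInstance` and `FreshInstanceSketch` are hypothetical names made up for illustration, and `FakeStage` is only a stand-in for a `Params` implementation with mutable settings. The sketch assumes the class has a public no-argument constructor.

```scala
// Hypothetical stand-in for a Params-like class; NOT a real Spark class.
class FakeStage {
  private var settings = Map.empty[String, Any]

  // Records a setting and returns this instance, so calls can be chained.
  def set(name: String, value: Any): this.type = {
    settings += (name -> value)
    this
  }

  def isSet(name: String): Boolean = settings.contains(name)
}

object FreshInstanceSketch {
  // Builds a new instance from the class, assuming a public no-arg constructor.
  def freshInstance[T](clazz: Class[T]): T =
    clazz.getDeclaredConstructor().newInstance()

  def main(args: Array[String]): Unit = {
    val clazz = classOf[FakeStage]

    // First check: set both the single-column and the multi-column parameter.
    val first = freshInstance(clazz)
      .set("inputCol", "feature1")
      .set("inputCols", Seq("feature1", "feature2"))

    // Second check: starts from a new instance, so nothing set above leaks in.
    val second = freshInstance(clazz)

    assert(first.isSet("inputCol") && first.isSet("inputCols"))
    assert(!second.isSet("inputCol") && !second.isSet("inputCols"))
  }
}
```

    Taking a `Class` means each check builds its own instance, so the caller cannot hand in an object whose parameters are already set.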
    
    Thanks.

