[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

viirya Mon, 23 Apr 2018 23:42:56 -0700

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20146#discussion_r183619017
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
    @@ -130,21 +159,57 @@ class StringIndexer @Since("1.4.0") (
       @Since("1.4.0")
       def setOutputCol(value: String): this.type = set(outputCol, value)
     
    +  /** @group setParam */
    +  @Since("2.4.0")
    +  def setInputCols(value: Array[String]): this.type = set(inputCols, value)
    +
    +  /** @group setParam */
    +  @Since("2.4.0")
    +  def setOutputCols(value: Array[String]): this.type = set(outputCols, value)
    +
    +  private def countByValue(
    +      dataset: Dataset[_],
    +      inputCols: Array[String]): Array[OpenHashMap[String, Long]] = {
    +
    +    val aggregator = new StringIndexerAggregator(inputCols.length)
    --- End diff --
    
    Use SQL `Aggregator` now.
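
    For readers unfamiliar with `org.apache.spark.sql.expressions.Aggregator`, here is a minimal sketch of a typed aggregator that counts string values per column in a single pass. It is not the `StringIndexerAggregator` from this pull request: the class name `ColumnValueCounter` is hypothetical, it uses immutable `Map`s instead of the `OpenHashMap` shown in the diff, and it assumes every input column has already been cast to string.

```scala
import org.apache.spark.sql.{Encoder, Encoders, Row}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical illustration only; not the StringIndexerAggregator from the PR.
// Buffer and output are one value -> count map per input column.
class ColumnValueCounter(numColumns: Int)
  extends Aggregator[Row, Array[Map[String, Long]], Array[Map[String, Long]]] {

  // Fresh buffer: one empty count map per column.
  override def zero: Array[Map[String, Long]] =
    Array.fill(numColumns)(Map.empty[String, Long])

  // Fold a single row into the per-column counts, skipping null cells.
  override def reduce(
      counts: Array[Map[String, Long]],
      row: Row): Array[Map[String, Long]] = {
    var i = 0
    while (i < numColumns) {
      if (!row.isNullAt(i)) {
        val value = row.getString(i)
        counts(i) = counts(i).updated(value, counts(i).getOrElse(value, 0L) + 1L)
      }
      i += 1
    }
    counts
  }

  // Combine two partial buffers (for example, from different partitions).
  override def merge(
      left: Array[Map[String, Long]],
      right: Array[Map[String, Long]]): Array[Map[String, Long]] = {
    var i = 0
    while (i < numColumns) {
      right(i).foreach { case (value, n) =>
        left(i) = left(i).updated(value, left(i).getOrElse(value, 0L) + n)
      }
      i += 1
    }
    left
  }

  // The buffer already has the output shape.
  override def finish(counts: Array[Map[String, Long]]): Array[Map[String, Long]] =
    counts

  // Kryo encoders are an assumption made here to keep the sketch short.
  override def bufferEncoder: Encoder[Array[Map[String, Long]]] =
    Encoders.kryo[Array[Map[String, Long]]]

  override def outputEncoder: Encoder[Array[Map[String, Long]]] =
    Encoders.kryo[Array[Map[String, Long]]]
}
```

    An aggregator like this is typically applied by passing `toColumn` to a `select` over a `Dataset[Row]` that contains only the string-typed input columns.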



