GitHub user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18736#discussion_r133127624
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala ---
    @@ -74,26 +74,41 @@ class HashingTF @Since("1.4.0") (@Since("1.4.0") override val uid: String)
     
       setDefault(numFeatures -> (1 << 18), binary -> false)
     
    +  private[this] var hashingTF = new feature.HashingTF($(numFeatures)).setBinary($(binary))
    +
       /** @group getParam */
       @Since("1.2.0")
       def getNumFeatures: Int = $(numFeatures)
     
       /** @group setParam */
       @Since("1.2.0")
    -  def setNumFeatures(value: Int): this.type = set(numFeatures, value)
    +  def setNumFeatures(value: Int): this.type = {
    +    val t = set(numFeatures, value)
    +    hashingTF = new feature.HashingTF($(numFeatures)).setBinary($(binary))
    +    t
    +  }
     
       /** @group getParam */
       @Since("2.0.0")
       def getBinary: Boolean = $(binary)
     
       /** @group setParam */
       @Since("2.0.0")
    -  def setBinary(value: Boolean): this.type = set(binary, value)
    +  def setBinary(value: Boolean): this.type = {
    +    val t = set(binary, value)
    +    hashingTF.setBinary($(binary))
    --- End diff --
    
    I don't think a ```set***``` function can do anything other than set the
    param, because the PySpark ```set***``` methods don't call the corresponding
    Scala functions. In PySpark, we collect all the params, pass them to the
    Scala side together, and then call the following function:
    ```
    def fit(dataset: Dataset[_], paramMap: ParamMap): M = {
      copy(paramMap).fit(dataset)
    }
    ```
    which skips the corresponding Scala ```set***``` functions. As a result,
    the changes your setters make to ```hashingTF``` never take effect, and the
    old instance is used instead.
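
    To illustrate the concern, here is a minimal, Spark-free sketch. Every name
    in it (```SetterBypassDemo```, ```ToyHashingTF```, ```cachedNumFeatures```)
    is hypothetical and only imitates the pattern: ```copy``` writes values
    straight into the param map of a fresh instance, roughly as
    ```copy(paramMap)``` does, so state that is only refreshed inside a setter
    stays at its default.
    ```scala
    // Toy stand-in for the Params machinery; NOT Spark's real API.
    object SetterBypassDemo {

      final class ToyHashingTF {
        // Param map holding the default value, like setDefault(numFeatures -> (1 << 18)).
        private var params: Map[String, Any] = Map("numFeatures" -> (1 << 18))

        // Cached state built eagerly, like the private hashingTF field in the diff.
        private var cachedNumFeatures: Int = 1 << 18

        def getNumFeatures: Int = params("numFeatures").asInstanceOf[Int]
        def getCachedNumFeatures: Int = cachedNumFeatures

        // Setter with a side effect: it also refreshes the cached state.
        def setNumFeatures(value: Int): this.type = {
          params = params.updated("numFeatures", value)
          cachedNumFeatures = value
          this
        }

        // Imitates copy(paramMap): a new instance is created and the values are
        // written into its param map directly; no setter is ever invoked.
        def copy(extra: Map[String, Any]): ToyHashingTF = {
          val that = new ToyHashingTF
          that.params = params ++ extra
          that
        }
      }

      def main(args: Array[String]): Unit = {
        // Scala-style usage: the setter runs, so the cache is refreshed.
        val viaSetter = new ToyHashingTF().setNumFeatures(1024)
        // PySpark-style usage: params arrive through copy, so the setter is skipped.
        val viaCopy = new ToyHashingTF().copy(Map("numFeatures" -> 1024))

        println(s"setter: param=${viaSetter.getNumFeatures} cached=${viaSetter.getCachedNumFeatures}")
        println(s"copy:   param=${viaCopy.getNumFeatures} cached=${viaCopy.getCachedNumFeatures}")
      }
    }
    ```
    In this sketch the copied instance reports the new param value while its
    cached field still holds the default, which is the mismatch described
    above. One way to avoid it, under the same assumptions, would be to derive
    the helper from the current param values at the point of use rather than
    inside the setters.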

