Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/16011
  
    @MLnick Yeah, I think copying Params from estimators to models is the most 
common case. However, I also found that some algorithms do not follow this 
rule, such as ```ALS```, which has ```ALSParams``` and ```ALSModelParams``` for 
the estimator and the model respectively.
    
    I think we can set params on models without going through the estimator, for example:
    ```
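    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.feature.{Bucketizer, QuantileDiscretizer}

    val discretizer = new QuantileDiscretizer()
    val pipeline = new Pipeline().setStages(Array(discretizer))
    val model = pipeline.fit(df)
    // QuantileDiscretizer.fit returns a Bucketizer, so the fitted stage can be cast and adjusted.
    model.stages(0).asInstanceOf[Bucketizer].setHandleInvalid("skip")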
    ```
    I know this approach is a little tricky; a better way might be to have 
```QuantileDiscretizerModel```, produced by ```QuantileDiscretizer```.
    Thinking more about it, ```Bucketizer``` is a separate transformer with 
two main params (```splits``` and ```handleInvalid```) that can be set. Users 
can provide candidates for these two params when doing cross validation to 
select the best model. But if we require that it be produced by 
```QuantileDiscretizer```, ```splits``` would be a member variable of the 
model rather than a param. From this perspective, it makes more sense to treat 
```Bucketizer``` as a separate transformer.
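    As a rough sketch of the cross-validation point, candidates for both params 
could be listed with ```ParamGridBuilder```. The column names and split values 
below are made up for illustration, and the grid would still need to be handed 
to a ```CrossValidator``` together with an estimator and an evaluator:
    ```scala
    import org.apache.spark.ml.feature.Bucketizer
    import org.apache.spark.ml.tuning.ParamGridBuilder

    // Hypothetical column names, chosen only for this example.
    val bucketizer = new Bucketizer()
      .setInputCol("hour")
      .setOutputCol("hourBucket")

    // Two candidate values for each param, since both are ordinary Params.
    val paramGrid = new ParamGridBuilder()
      .addGrid(bucketizer.splits, Seq(
        Array(Double.NegativeInfinity, 6.0, 12.0, 18.0, Double.PositiveInfinity),
        Array(Double.NegativeInfinity, 12.0, Double.PositiveInfinity)))
      .addGrid(bucketizer.handleInvalid, Seq("keep", "skip"))
      .build()
    ```
    This only works because ```splits``` is a param; if it were a member 
variable of a model, it could not appear in the grid.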

