[ 
https://issues.apache.org/jira/browse/SPARK-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527760#comment-14527760
 ] 

Joseph K. Bradley commented on SPARK-4766:
------------------------------------------

I started working on this JIRA, but I'm starting to think it's more trouble 
than it's worth.  Here are pros/cons of splitting Estimator & Model Params:

Input/output columns:
* Pro: It's odd to be able to set lrModel.labelCol.
* Con: To evaluate a model, we need to know the label column.  After the 
split, a model would still know its default evaluator, but not the label column.

Other parameters:
* Pro: It's odd to set lrModel.maxIter.  It's awkward that CrossValidator could 
mistakenly iterate over lrModel.maxIter values (if lrModel were in a Pipeline).
* Con: A lot of parameters are arguably part of the model.  (regParam, etc.)
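
To make the trade-off concrete, here is a minimal, Spark-free sketch of what 
the split could look like.  Every name in it (SharedParams, 
EstimatorOnlyParams, Lr, LrModel) is hypothetical and is not the actual 
spark.ml API; which params belong in which trait is also just an assumption 
for illustration.

```scala
// Hypothetical sketch only: none of these types exist in spark.ml.
object SplitParamsSketch {
  // Params that make sense on both the Estimator and the fitted Model.
  trait SharedParams {
    var featuresCol: String = "features"
    var regParam: Double = 0.0 // arguably part of the model itself
  }

  // Params that only matter while fitting.
  trait EstimatorOnlyParams {
    var maxIter: Int = 100
    var labelCol: String = "label"
  }

  // The model sees only the shared params, so it has no maxIter or labelCol.
  class LrModel(val weights: Vector[Double]) extends SharedParams

  class Lr extends SharedParams with EstimatorOnlyParams {
    // Stand-in for training: returns a fixed model and copies the shared
    // params onto it.
    def fit(): LrModel = {
      val model = new LrModel(Vector(0.0))
      model.featuresCol = featuresCol
      model.regParam = regParam
      model
    }
  }
}
```

With this layout, lrModel.maxIter no longer compiles, which is the "Pro" 
above.  The "Con" shows up too: lrModel.labelCol is gone as well, so an 
evaluator would have to get the label column from somewhere else.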

Pro: The separation is more technically correct.

Cons:
* The separation adds a little boilerplate.
* You also have to be careful about which validateAndTransformSchema you call 
due to inheriting from multiple traits.
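
The validateAndTransformSchema issue can be sketched in plain Scala as 
follows.  The trait layout here is an assumption made for illustration (the 
real hierarchy may differ), and schemas are reduced to sets of column names.  
When a class mixes in two unrelated traits that both define the method, Scala 
requires an explicit override, and the author has to pick the right parent.

```scala
// Hypothetical sketch only: a simplified stand-in for the real traits.
object SchemaTraitSketch {
  trait EstimatorParams {
    // Fitting needs both the features column and the label column.
    def validateAndTransformSchema(cols: Set[String]): Set[String] = {
      require(cols.contains("features") && cols.contains("label"),
        "fitting needs features and label columns")
      cols + "prediction"
    }
  }

  trait ModelParams {
    // Transforming needs only the features column.
    def validateAndTransformSchema(cols: Set[String]): Set[String] = {
      require(cols.contains("features"), "transform needs a features column")
      cols + "prediction"
    }
  }

  // Mixing in both traits forces an explicit choice of implementation.
  class Model extends EstimatorParams with ModelParams {
    override def validateAndTransformSchema(cols: Set[String]): Set[String] =
      super[ModelParams].validateAndTransformSchema(cols)

    def transformSchema(cols: Set[String]): Set[String] =
      validateAndTransformSchema(cols)
  }
}
```

Picking super[EstimatorParams] by mistake would still compile, but the model 
would then reject any input that lacks a label column.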

Given everything, I'm going to close this JIRA as not a problem.  But please 
post if you disagree.

CC: [~mengxr]

> ML Estimator Params should be distinct from Transformer Params
> --------------------------------------------------------------
>
>                 Key: SPARK-4766
>                 URL: https://issues.apache.org/jira/browse/SPARK-4766
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>
> Currently, in spark.ml, both Transformers and Estimators extend the same 
> Params classes.  There should be one Params class for the Transformer and one 
> for the Estimator.  These could sometimes be the same, but for other models, 
> we may need either (a) to make them distinct or (b) to have the Estimator 
> params class extend the Transformer one.
> E.g., it is weird to be able to do:
> {code}
> val model: LogisticRegressionModel = ...
> model.getMaxIter
> {code}
> It's also weird to be able to:
> * Wrap LogisticRegressionModel (a Transformer) with CrossValidator
> * Pass a set of ParamMaps to CrossValidator which includes parameter 
> LogisticRegressionModel.maxIter
> * (CrossValidator would try to set that parameter.)
> * I'm not sure if this would cause a failure or just be a no-op.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
