[
https://issues.apache.org/jira/browse/SPARK-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph K. Bradley updated SPARK-4766:
-------------------------------------
Description:
h2. Issue
Currently, in spark.ml, both Transformers and Estimators extend the same Params
classes. There should be one Params class for the Transformer and one for the
Estimator. These could sometimes be the same, but for other models, we may
need either (a) to make them distinct or (b) to have the Estimator params class
extend the Transformer one.
E.g., it is weird that a fitted model exposes a training-only parameter:
{code}
val model: LogisticRegressionModel = ...
model.getMaxIter()
{code}
It's also weird to be able to:
* Wrap LogisticRegressionModel (a Transformer) with CrossValidator
* Pass CrossValidator a set of ParamMaps that includes the parameter LogisticRegressionModel.maxIter
** CrossValidator would try to set that parameter.
** I'm not sure if this would cause a failure or just be a no-op.
See the comment below about Word2Vec as well, where the Estimator and Model
take different input column types.
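To make option (b) concrete, here is a minimal sketch in plain Scala with no Spark dependency. All names below (ClassifierModelParams, ClassifierEstimatorParams, SketchLogisticRegression, SketchLogisticRegressionModel) are hypothetical stand-ins for illustration, not the actual spark.ml classes or a committed design. The idea is that the Model mixes in only the prediction-time params, while the Estimator's params trait extends them and adds the training-only ones.
{code}
// Hypothetical, simplified stand-ins for spark.ml types; not the real API.

// Params needed at prediction time; shared by the Estimator and its Model.
trait ClassifierModelParams {
  protected var featuresColValue: String = "features"
  def getFeaturesCol: String = featuresColValue
}

// Training-only params. Option (b): the Estimator params extend the Model params.
trait ClassifierEstimatorParams extends ClassifierModelParams {
  protected var maxIterValue: Int = 100
  def getMaxIter: Int = maxIterValue
}

// The fitted model sees only the prediction-time params.
class SketchLogisticRegressionModel(val weights: Vector[Double])
  extends ClassifierModelParams

// The estimator sees both the shared and the training-only params.
class SketchLogisticRegression extends ClassifierEstimatorParams {
  def setMaxIter(value: Int): this.type = { maxIterValue = value; this }

  // Placeholder "training": returns a model with dummy weights.
  def fit(): SketchLogisticRegressionModel =
    new SketchLogisticRegressionModel(Vector.fill(3)(0.0))
}

object ParamsSplitSketch {
  def main(args: Array[String]): Unit = {
    val lr = new SketchLogisticRegression().setMaxIter(10)
    val model = lr.fit()
    println(lr.getMaxIter)        // available on the Estimator
    println(model.getFeaturesCol) // available on the Model
    // model.getMaxIter           // would not compile: the Model has no maxIter
  }
}
{code}
Under this layout, a ParamMap entry for maxIter could not be addressed to the model type at all, so the CrossValidator case above would be rejected at compile time rather than at runtime. It does not cover the Word2Vec case, where the input column types differ; that would seem to need option (a).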
h2. Proposal
> ML Estimator Params should be distinct from Transformer Params
> --------------------------------------------------------------
>
> Key: SPARK-4766
> URL: https://issues.apache.org/jira/browse/SPARK-4766
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 1.2.0
> Reporter: Joseph K. Bradley