[
https://issues.apache.org/jira/browse/SPARK-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wojciech Jurczyk updated SPARK-12751:
-------------------------------------
Priority: Minor (was: Major)
> Traits generated by SharedParamsCodeGen should not be private
> -------------------------------------------------------------
>
> Key: SPARK-12751
> URL: https://issues.apache.org/jira/browse/SPARK-12751
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Wojciech Jurczyk
> Priority: Minor
>
> Many Estimators and Transformers mix in traits generated by
> SharedParamsCodeGen. These estimators and transformers (like StringIndexer,
> MinMaxScaler, etc.) are publicly accessible, while the traits generated by
> SharedParamsCodeGen are private[ml]. From user code it is possible to
> invoke the methods that the traits introduce, but it is illegal to refer to
> any trait explicitly. For example, you can call setInputCol(str) on
> StringIndexer, but you are not allowed to assign a StringIndexer to a
> variable of type HasInputCol.
> {code:java}
> val x: HasInputCol = new StringIndexer() // Usage of HasInputCol is illegal.
> {code}
> As a result, it is impossible to create a collection of transformers that
> have both HasInputCol and HasOutputCol (e.g. Set[Transformer with
> HasInputCol with HasOutputCol]). We have to use structural typing and
> reflective calls like this:
> {code}
> ml.Estimator[_] { val outputCol: ml.param.Param[String] }
> {code}
> This seems easy to fix: exposing a couple of traits should not break
> anything. On the other hand, maybe it goes deeper than that.
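The situation described in the issue can be reproduced without Spark. The following is only a minimal sketch, assuming Scala 2.x: the object `ml` stands in for the `org.apache.spark.ml` package, and `HasInputCol`, `StringIndexer` and `MinMaxScaler` are simplified, hypothetical stand-ins, not Spark's real classes. `WithInputCol`, `inputCols` and `StructuralWorkaround` are names invented for the illustration.

```scala
import scala.language.reflectiveCalls

// Stand-in for the org.apache.spark.ml package (assumption, not Spark code).
object ml {
  // Qualified-private trait, like the ones generated by SharedParamsCodeGen.
  private[ml] trait HasInputCol {
    private var inputCol: String = ""
    def setInputCol(value: String): this.type = { inputCol = value; this }
    def getInputCol: String = inputCol
  }

  // Public classes that mix in the private[ml] trait.
  class StringIndexer extends HasInputCol
  class MinMaxScaler extends HasInputCol
}

// Code outside `ml`, playing the role of user code.
object StructuralWorkaround {
  // The next line would not compile here, because HasInputCol is private[ml]:
  // val x: ml.HasInputCol = new ml.StringIndexer()

  // Workaround: a structural type naming the members we need.
  // Calls through it are resolved reflectively at runtime.
  type WithInputCol = {
    def setInputCol(value: String): Any
    def getInputCol: String
  }

  def inputCols(stages: Seq[WithInputCol]): Seq[String] =
    stages.map(_.getInputCol)

  def main(args: Array[String]): Unit = {
    val stages: Seq[WithInputCol] = Seq(
      new ml.StringIndexer().setInputCol("category"),
      new ml.MinMaxScaler().setInputCol("features"))
    println(inputCols(stages))
  }
}
```

The inherited methods stay callable on the public classes, but the trait itself cannot be named outside `ml`, so the collection has to be typed structurally. If the traits were public, `Seq[ml.HasInputCol]` could replace the structural type and no reflective calls would be needed.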