[ 
https://issues.apache.org/jira/browse/SPARK-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wojciech Jurczyk updated SPARK-12751:
-------------------------------------
    Priority: Minor  (was: Major)

> Traits generated by SharedParamsCodeGen should not be private
> -------------------------------------------------------------
>
>                 Key: SPARK-12751
>                 URL: https://issues.apache.org/jira/browse/SPARK-12751
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.5.2, 1.6.0
>            Reporter: Wojciech Jurczyk
>            Priority: Minor
>
> Many Estimators and Transformers mix in traits generated by 
> SharedParamsCodeGen. These estimators and transformers (such as StringIndexer 
> and MinMaxScaler) are public, while the traits generated by 
> SharedParamsCodeGen are private[ml]. User code can invoke the methods that 
> the traits introduce, but it cannot refer to any of the traits 
> explicitly. For example, you can call setInputCol(str) on a StringIndexer, but 
> you cannot assign a StringIndexer to a variable of type HasInputCol.
> {code:java}
> val x: HasInputCol = new StringIndexer() // Usage of HasInputCol is illegal.
> {code}
> As a consequence, it is impossible to create a collection of transformers that 
> have both HasInputCol and HasOutputCol (e.g. Set[Transformer with 
> HasInputCol with HasOutputCol]). Instead, we have to use structural typing and 
> reflective calls, like this:
> {code}
> ml.Estimator[_] { val outputCol: ml.param.Param[String] }
> {code}
> This seems easy to fix: exposing a couple of traits should not break 
> anything. On the other hand, maybe it goes deeper than that.
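
The visibility problem described in the issue can be reproduced without Spark. The following is only a minimal sketch, assuming Scala 2 (the line used by Spark 1.x); the package fakeml and the names Indexer, Scaler, WithOutputCol and Demo are hypothetical stand-ins, not Spark classes. It shows a package-private trait mixed into public classes, and the structural-type workaround that relies on reflective calls.

```scala
import scala.language.reflectiveCalls

// Hypothetical stand-in for org.apache.spark.ml; these are NOT the real Spark classes.
package fakeml {
  // Package-private, like the traits generated by SharedParamsCodeGen.
  private[fakeml] trait HasOutputCol {
    val outputCol: String = "output"
  }

  // Public classes that mix in the package-private trait.
  class Indexer extends HasOutputCol
  class Scaler extends HasOutputCol {
    override val outputCol: String = "scaled"
  }
}

object Demo {
  import fakeml._

  // Does not compile outside fakeml, because the trait is not accessible here:
  // val x: HasOutputCol = new Indexer()

  // Workaround: a structural type naming only the member we need.
  type WithOutputCol = { val outputCol: String }

  // Each access to outputCol below is a reflective call.
  def outputCols(stages: Seq[WithOutputCol]): Seq[String] =
    stages.map(_.outputCol)

  def main(args: Array[String]): Unit =
    println(outputCols(Seq(new Indexer, new Scaler)))
}
```

If the trait were public, the collection could be typed as Seq[HasOutputCol] directly and no reflection would be involved, which is the change the issue asks for.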



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
