[ 
https://issues.apache.org/jira/browse/SPARK-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengruifeng updated SPARK-18757:
---------------------------------
    Description: 
Recently, I found three places in which column setters are missing: 
KMeansModel, BisectingKMeansModel and OneVsRestModel.
These three models directly inherit `Model`, which doesn't have column setters, 
so I had to add the missing setters manually in [SPARK-18625] and [SPARK-18520].
For now, models in pyspark still don't support column setters at all.
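The duplication looks roughly like this (a minimal sketch with a simplified stand-in for {{Model}}, not the actual pyspark.ml source):

```python
class Model:
    """Simplified stand-in for pyspark.ml.Model: holds params, no column setters."""
    def __init__(self):
        self._paramMap = {}

    def _set(self, **kwargs):
        self._paramMap.update(kwargs)
        return self


class KMeansModel(Model):
    # Setters must be written out by hand for this model...
    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)


class BisectingKMeansModel(Model):
    # ...and then duplicated again for every other model that
    # inherits Model directly.
    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)
```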
I suggest that we keep the hierarchy of pyspark models in line with that on the 
Scala side:
For classification and regression algs, I'm making a trial in [SPARK-18379]. In 
it, I try to copy the hierarchy from the Scala side.
For clustering algs, I think we may first create abstract classes 
{{ClusteringModel}} and {{ProbabilisticClusteringModel}}, and make clustering 
algs inherit them. Then, on the Python side, we copy the hierarchy so that we 
don't need to add setters for each alg.
For feature algs, we can also use an abstract class {{FeatureModel}} on the 
Scala side, and do the same thing.
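A rough Python sketch of what the shared clustering base classes could look like (class names from the proposal; the {{Model}} stand-in below is simplified for illustration, not the real pyspark.ml API):

```python
class Model:
    """Simplified stand-in for pyspark.ml.Model (illustration only)."""
    def __init__(self):
        self._paramMap = {}

    def _set(self, **kwargs):
        self._paramMap.update(kwargs)
        return self


class ClusteringModel(Model):
    """Proposed shared base: column setters are defined once, here."""
    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)


class ProbabilisticClusteringModel(ClusteringModel):
    """Adds the probability column on top of the common setters."""
    def setProbabilityCol(self, value):
        return self._set(probabilityCol=value)


class KMeansModel(ClusteringModel):
    pass  # inherits all column setters; no per-model boilerplate
```

With this shape, each concrete model picks up the setters by inheritance instead of re-declaring them.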

What are your opinions? [~yanboliang][~josephkb][~sethah][~srowen]



> Models in Pyspark support column setters
> ----------------------------------------
>
>                 Key: SPARK-18757
>                 URL: https://issues.apache.org/jira/browse/SPARK-18757
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML, PySpark
>            Reporter: zhengruifeng
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
