[ 
https://issues.apache.org/jira/browse/SPARK-14299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xusen Yin updated SPARK-14299:
------------------------------
    Description: 
Duplicated code that I found in scala/examples/ml:

* scala/ml
** CrossValidatorExample.scala --> ModelSelectionViaCrossValidationExample
** DecisionTreeExample.scala --> DecisionTreeRegressionExample, 
DecisionTreeClassificationExample
** GBTExample.scala --> GradientBoostedTreeClassifierExample, 
GradientBoostedTreeRegressorExample
** LinearRegressionExample.scala --> LinearRegressionWithElasticNetExample
** LogisticRegressionExample.scala --> LogisticRegressionWithElasticNetExample, 
LogisticRegressionSummaryExample
** RandomForestExample.scala --> RandomForestRegressorExample, 
RandomForestClassifierExample
** TrainValidationSplitExample.scala --> 
ModelSelectionViaTrainValidationSplitExample
** DeveloperApiExample.scala --> delete for now, because it only covers 
how to create your own classifier, etc., which can be learned easily from other 
examples and the ml code.
** SimpleParamsExample.scala --> merge with 
LogisticRegressionSummaryExample.scala
** SimpleTextClassificationPipeline.scala --> 
ModelSelectionViaCrossValidationExample
** DataFrameExample.scala --> merge with LogisticRegressionSummaryExample.scala

When merging and cleaning up this code, be sure not to disturb the existing 
example on and off blocks.

I'll take this one as an example. 

  was:
Duplicated code that I found in scala/examples/ml:

* scala/ml
** CrossValidatorExample.scala --> ModelSelectionViaCrossValidationExample
** DecisionTreeExample.scala --> DecisionTreeRegressionExample, 
DecisionTreeClassificationExample
** GBTExample.scala --> GradientBoostedTreeClassifierExample, 
GradientBoostedTreeRegressorExample
** LinearRegressionExample.scala --> LinearRegressionWithElasticNetExample
** LogisticRegressionExample.scala --> LogisticRegressionWithElasticNetExample, 
LogisticRegressionSummaryExample
** RandomForestExample.scala --> RandomForestRegressorExample, 
RandomForestClassifierExample
** TrainValidationSplitExample.scala --> 
ModelSelectionViaTrainValidationSplitExample
** DeveloperApiExample.scala --> delete for now, because it only covers 
how to create your own classifier, etc., which can be learned easily from other 
examples and the ml code.
** SimpleParamsExample.scala --> merge with 
LogisticRegressionSummaryExample.scala
** SimpleTextClassificationPipeline.scala --> 
ModelSelectionViaCrossValidationExample

* Possible code duplication (needs a double check)
** DataFrameExample.scala

When merging and cleaning up this code, be sure not to disturb the existing 
example on and off blocks.

I'll take this one as an example. 


> Scala ML examples code merge and clean up
> -----------------------------------------
>
>                 Key: SPARK-14299
>                 URL: https://issues.apache.org/jira/browse/SPARK-14299
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Examples
>            Reporter: Xusen Yin
>            Priority: Minor
>              Labels: starter
>
> Duplicated code that I found in scala/examples/ml:
> * scala/ml
> ** CrossValidatorExample.scala --> ModelSelectionViaCrossValidationExample
> ** DecisionTreeExample.scala --> DecisionTreeRegressionExample, 
> DecisionTreeClassificationExample
> ** GBTExample.scala --> GradientBoostedTreeClassifierExample, 
> GradientBoostedTreeRegressorExample
> ** LinearRegressionExample.scala --> LinearRegressionWithElasticNetExample
> ** LogisticRegressionExample.scala --> 
> LogisticRegressionWithElasticNetExample, LogisticRegressionSummaryExample
> ** RandomForestExample.scala --> RandomForestRegressorExample, 
> RandomForestClassifierExample
> ** TrainValidationSplitExample.scala --> 
> ModelSelectionViaTrainValidationSplitExample
> ** DeveloperApiExample.scala --> delete for now, because it only covers 
> how to create your own classifier, etc., which can be learned easily from 
> other examples and the ml code.
> ** SimpleParamsExample.scala --> merge with 
> LogisticRegressionSummaryExample.scala
> ** SimpleTextClassificationPipeline.scala --> 
> ModelSelectionViaCrossValidationExample
> ** DataFrameExample.scala --> merge with 
> LogisticRegressionSummaryExample.scala
> When merging and cleaning up this code, be sure not to disturb the existing 
> example on and off blocks.
> I'll take this one as an example. 
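
The note about not disturbing the example on and off blocks could be checked mechanically after a merge. The following is only a minimal sketch, not part of the issue: it assumes the blocks are delimited by comment lines of the form `// $example on$` and `// $example off$` in the Scala sources, and the function name `check_example_markers` is hypothetical.

```python
import re

# Assumed marker format: a Scala line comment containing "$example on$" or
# "$example off$" and nothing else on the line.
MARKER = re.compile(r"^\s*//\s*\$example (on|off)\$\s*$")


def check_example_markers(source):
    """Return a list of pairing problems for on/off markers (empty if none)."""
    problems = []
    open_line = None  # line number of the currently open "on" marker, if any
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.match(line)
        if match is None:
            continue
        if match.group(1) == "on":
            if open_line is not None:
                problems.append(
                    f"line {lineno}: 'on' while block from line {open_line} is still open"
                )
            open_line = lineno
        else:
            if open_line is None:
                problems.append(f"line {lineno}: 'off' without a preceding 'on'")
            open_line = None
    if open_line is not None:
        problems.append(f"line {open_line}: 'on' never closed")
    return problems


if __name__ == "__main__":
    # Illustrative input only; not taken from any file named in the issue.
    sample = "import foo\n// $example on$\nval x = 1\n// $example off$\n"
    print(check_example_markers(sample))
```

Running such a check on a merged file before and after the change would show whether any marker was dropped or left unpaired; it does not verify that the code between the markers is unchanged.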


