[ https://issues.apache.org/jira/browse/SPARK-14299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xusen Yin updated SPARK-14299:
------------------------------
Description:
Duplicated code that I found in scala/examples/ml:
* scala/ml
** CrossValidatorExample.scala --> ModelSelectionViaCrossValidationExample
** DecisionTreeExample.scala --> DecisionTreeRegressionExample,
DecisionTreeClassificationExample
** GBTExample.scala --> GradientBoostedTreeClassifierExample,
GradientBoostedTreeRegressorExample
** LinearRegressionExample.scala --> LinearRegressionWithElasticNetExample
** LogisticRegressionExample.scala --> LogisticRegressionWithElasticNetExample,
LogisticRegressionSummaryExample
** RandomForestExample.scala --> RandomForestRegressorExample,
RandomForestClassifierExample
** TrainValidationSplitExample.scala -->
ModelSelectionViaTrainValidationSplitExample
** DeveloperApiExample.scala --> I am deleting it for now because it only covers
how to create your own classifier, etc., which can be learned easily from the other
examples and the ml code.
** SimpleParamsExample.scala --> merge with
LogisticRegressionSummaryExample.scala
** SimpleTextClassificationPipeline.scala -->
ModelSelectionViaCrossValidationExample
* Possible code duplication (needs a double check)
** DataFrameExample.scala
When merging and cleaning up this code, be sure not to disturb the existing example
on and off blocks.
I'll take this one as an example.
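
A minimal sketch of how a merged file could be sanity-checked, assuming the "on and off blocks" above refer to the `// $example on$` and `// $example off$` comment markers used to extract snippets for the documentation. The object name `ExampleMarkerCheck` and the function `markersBalanced` are hypothetical helpers for illustration and are not part of Spark:

```scala
// Hypothetical helper, not part of Spark: checks that the
// "$example on$" / "$example off$" comment markers in a source file
// are still correctly paired after merging or cleaning up an example.
object ExampleMarkerCheck {
  private val On = "$example on$"
  private val Off = "$example off$"

  /** Returns true if every "on" marker is closed by an "off" marker before
    * the next "on", and no "off" marker appears outside an open block. */
  def markersBalanced(source: String): Boolean = {
    // Fold over the lines, tracking whether we are inside an open block.
    // A state of None means an unpaired marker was encountered.
    val finalState = source.split("\n").foldLeft(Option(false)) {
      case (Some(false), line) if line.contains(On)  => Some(true)
      case (Some(true), line) if line.contains(Off) => Some(false)
      case (Some(_), line) if line.contains(On) || line.contains(Off) => None
      case (state, _) => state
    }
    // Balanced only if no error occurred and no block was left open.
    finalState.contains(false)
  }

  def main(args: Array[String]): Unit = {
    // A block that is opened and closed correctly.
    val ok =
      """// $example on$
        |val x = 1
        |// $example off$""".stripMargin

    // A block whose closing marker was lost, e.g. during a merge.
    val broken =
      """// $example on$
        |val x = 1""".stripMargin

    println(markersBalanced(ok))
    println(markersBalanced(broken))
  }
}
```

Running a check like this over each merged example file before and after the cleanup would flag a marker that was dropped or duplicated by accident; it does not verify that the code between the markers is unchanged.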
was:
Duplicated code that I found in scala/examples/ml:
* scala/ml
** CrossValidatorExample.scala --> ModelSelectionViaCrossValidationExample
** DecisionTreeExample.scala --> DecisionTreeRegressionExample,
DecisionTreeClassificationExample
** GBTExample.scala --> GradientBoostedTreeClassifierExample,
GradientBoostedTreeRegressorExample
** LinearRegressionExample.scala --> LinearRegressionWithElasticNetExample
** LogisticRegressionExample.scala --> LogisticRegressionWithElasticNetExample,
LogisticRegressionSummaryExample
** RandomForestExample.scala --> RandomForestRegressorExample,
RandomForestClassifierExample
** TrainValidationSplitExample.scala -->
ModelSelectionViaTrainValidationSplitExample
** DeveloperApiExample.scala --> I am deleting it for now because it only covers
how to create your own classifier, etc., which can be learned easily from the other
examples and the ml code.
** SimpleParamsExample.scala --> merge with
LogisticRegressionSummaryExample.scala
* Possible code duplication (needs a double check)
** DataFrameExample.scala
** SimpleTextClassificationPipeline.scala
When merging and cleaning up this code, be sure not to disturb the existing example
on and off blocks.
I'll take this one as an example.
> Scala ML examples code merge and clean up
> -----------------------------------------
>
> Key: SPARK-14299
> URL: https://issues.apache.org/jira/browse/SPARK-14299
> Project: Spark
> Issue Type: Sub-task
> Components: Examples
> Reporter: Xusen Yin
> Priority: Minor
> Labels: starter