[
https://issues.apache.org/jira/browse/SPARK-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262810#comment-15262810
]
Xusen Yin commented on SPARK-14302:
-----------------------------------
We should leave them unmerged, e.g. ml.bisecting_k_means_example and
mllib.bisecting_k_means_example. Even though they are similar, each serves a
different purpose, i.e. each is used by a different documentation file.
This JIRA aims to merge duplicated code within examples/python/ml and within
examples/python/mllib, but not between the two.
For example, we have python/mllib/gaussian_mixture_model.py, which duplicates
python/mllib/gaussian_mixture_example.py. The latter contains
$example on$ and $example off$ blocks, which means it is included in the
documentation files. So we should delete the former and keep the latter.
However, according to
https://github.com/apache/spark/pull/12092#issuecomment-204276885, we should
leave example code with command-line parameters untouched, so in this case we
should keep python/mllib/gaussian_mixture_model.py as well.
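To make the two rules above concrete, here is a rough sketch of how one might
triage a candidate file. It is only an illustration: the helper names are
hypothetical, and treating a reference to sys.argv as "has command-line
parameters" is my assumption, not something the PR comment specifies.

```python
import re

# Marker strings as they are written in this comment.
EXAMPLE_ON = "$example on$"
EXAMPLE_OFF = "$example off$"


def has_example_block(source):
    """True if an '$example on$' marker is later followed by '$example off$'."""
    start = source.find(EXAMPLE_ON)
    if start == -1:
        return False
    return source.find(EXAMPLE_OFF, start + len(EXAMPLE_ON)) != -1


def uses_command_line_args(source):
    """Heuristic (assumption): a reference to sys.argv means the example
    takes command-line parameters. 'from sys import argv' is not detected."""
    return re.search(r"\bsys\.argv\b", source) is not None


def suggest_action(source):
    """Hypothetical triage: keep doc examples and command-line examples."""
    if has_example_block(source):
        return "keep: included in the docs"
    if uses_command_line_args(source):
        return "keep: has command-line parameters"
    return "candidate for removal: double check by hand"
```

One would call suggest_action(open(path).read()) for each file under
examples/python/mllib and still review the result manually, since the
heuristics can miss cases.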
> Python examples code merge and clean up
> ---------------------------------------
>
> Key: SPARK-14302
> URL: https://issues.apache.org/jira/browse/SPARK-14302
> Project: Spark
> Issue Type: Sub-task
> Components: Examples
> Reporter: Xusen Yin
> Priority: Minor
> Labels: starter
>
> Duplicated code that I found in python/examples/mllib and python/examples/ml:
> * python/ml
> ** None
> * Unsure duplications, double check
> ** dataframe_example.py
> ** kmeans_example.py
> ** simple_params_example.py
> ** simple_text_classification_pipeline.py
> * python/mllib
> ** gaussian_mixture_model.py
> ** kmeans.py
> ** logistic_regression.py
> * Unsure duplications, double check
> ** correlations.py
> ** random_rdd_generation.py
> ** sampled_rdds.py
> ** word2vec.py
> When merging and cleaning up this code, be sure not to disturb the existing
> example on and off blocks.