[jira] [Commented] (SPARK-14302) Python examples code merge and clean up

Saikat Kanjilal (JIRA) Fri, 29 Apr 2016 20:36:28 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265118#comment-15265118
 ]


Saikat Kanjilal commented on SPARK-14302:
-----------------------------------------

And here's the duplication inside the mllib directory:

mllib
Tasks in common:
* initialize SparkContext
* MLUtils.loadLibSVMFile
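
A rough sketch of how those two common tasks could be pulled into one shared helper. The function name load_libsvm and its parameters are hypothetical, not existing Spark code; only SparkContext and MLUtils.loadLibSVMFile come from PySpark. The imports sit inside the function so the module can be imported on a machine without Spark installed.

```python
def load_libsvm(app_name, path):
    """Create a SparkContext and load a LIBSVM file as an RDD of LabeledPoint.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not part of the Spark examples.
    """
    # Imported lazily so that importing this module does not require Spark.
    from pyspark import SparkContext
    from pyspark.mllib.util import MLUtils

    sc = SparkContext(appName=app_name)
    data = MLUtils.loadLibSVMFile(sc, path)
    # Return the context as well so the caller can call sc.stop() when done.
    return sc, data
```

Each example script would then start with a single call such as sc, data = load_libsvm("SomeExample", "path/to/data.txt"), where both arguments are placeholders.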

correlations and correlations_example: both use Statistics.corr; correlations 
should be generalized like correlations_example
gaussian_mixture_model and its example
kmeans and kmeans_example
word2vec and word2vec_example
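
The pairs above follow a naming pattern (name.py next to name_example.py), so candidates can be listed mechanically before reviewing them by hand. Below is a minimal sketch, assuming that pattern; find_example_pairs and the sample listing are illustrative and not part of Spark. Pairs that do not follow the pattern, such as the gaussian mixture model one, would still need a manual check.

```python
import os


def find_example_pairs(filenames, suffix="_example"):
    """Return sorted (base, example) file pairs, e.g. kmeans.py / kmeans_example.py."""
    # Collect the names of all Python files without their extension.
    stems = {os.path.splitext(f)[0] for f in filenames if f.endswith(".py")}
    pairs = []
    for stem in sorted(stems):
        example = stem + suffix
        # A pair exists only when both name.py and name_example.py are present.
        if example in stems:
            pairs.append((stem + ".py", example + ".py"))
    return pairs


# Illustrative listing; in practice this could come from os.listdir() on the
# mllib examples directory.
listing = [
    "correlations.py", "correlations_example.py",
    "kmeans.py", "kmeans_example.py",
    "word2vec.py", "word2vec_example.py",
    "sampled_rdds.py",
]

for base, example in find_example_pairs(listing):
    print(base, "<->", example)
```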


We really should think about combining the example Python files with the 
actual algorithm files (for example, kmeans and kmeans_example). What do you 
think?

Let me know your thoughts on the duplication removal ideas above before I make 
any code changes.

> Python examples code merge and clean up
> ---------------------------------------
>
>                 Key: SPARK-14302
>                 URL: https://issues.apache.org/jira/browse/SPARK-14302
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Examples
>            Reporter: Xusen Yin
>            Priority: Minor
>              Labels: starter
>
> Duplicated code that I found in python/examples/mllib and python/examples/ml:
> * python/ml
> ** None
> * Unsure duplications, double check
> ** dataframe_example.py
> ** kmeans_example.py
> ** simple_params_example.py
> ** simple_text_classification_pipeline.py
> * python/mllib
> ** gaussian_mixture_model.py
> ** kmeans.py
> ** logistic_regression.py
> * Unsure duplications, double check
> ** correlations.py
> ** random_rdd_generation.py
> ** sampled_rdds.py
> ** word2vec.py
> When merging and cleaning up that code, be sure not to disturb the existing 
> example on and off blocks.


