[ https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-11920:
--------------------------------------
            Assignee: Yanbo Liang
    Target Version/s: 1.6.0

> ML LinearRegression should use correct dataset in examples and user guide doc
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-11920
>                 URL: https://issues.apache.org/jira/browse/SPARK-11920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, ML
>            Reporter: Yanbo Liang
>            Assignee: Yanbo Liang
>            Priority: Minor
>
> ML LinearRegression uses data/mllib/sample_libsvm_data.txt as the dataset in 
> the examples and the user guide doc, but it is actually a classification 
> dataset rather than a regression dataset. We should use 
> data/mllib/sample_linear_regression_data.txt instead.
> The deeper cause is that LinearRegression with the "normal" solver cannot 
> solve this dataset correctly, possibly because the problem is 
> ill-conditioned and the labels are unreasonable. This issue has been 
> reported as SPARK-11918.
> Users who run the example code and get an exception will be confused, so we 
> should make this change to clearly illustrate the usage of the 
> LinearRegression algorithm.
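
As a rough illustration of the distinction the description draws between a classification dataset and a regression dataset, the sketch below reads the label column of LIBSVM-format rows and applies a simple heuristic. The helper names (read_labels, looks_like_classification), the max_classes threshold, and the sample rows are all hypothetical and made up for this note; they are not part of Spark and not copied from the data files named above.

```python
# Hypothetical helpers, not part of Spark: inspect labels in LIBSVM-format rows.
# Each row has the form "<label> <index>:<value> <index>:<value> ...".

def read_labels(lines):
    """Return the label (first token) of every non-empty row as a float."""
    labels = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank rows
        labels.append(float(line.split()[0]))
    return labels


def looks_like_classification(labels, max_classes=10):
    """Heuristic only: few distinct, integer-valued labels suggest class labels."""
    distinct = set(labels)
    return len(distinct) <= max_classes and all(v.is_integer() for v in distinct)


# Made-up rows for illustration; NOT taken from the Spark sample files.
classification_rows = ["0 1:0.5 3:1.2", "1 2:0.7 3:0.1", "1 1:0.9", "0 2:0.3"]
regression_rows = ["-9.49 1:0.45 2:0.36", "0.25 1:-0.83 2:0.12", "4.51 1:0.02 2:-0.6"]

print(looks_like_classification(read_labels(classification_rows)))
print(looks_like_classification(read_labels(regression_rows)))
```

Because read_labels accepts any iterable of lines, an open file handle for either data file could be passed in place of the made-up lists. The heuristic says nothing about why the "normal" solver fails; it only flags label columns that look like class labels.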



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
