[jira] [Assigned] (SPARK-11920) ML LinearRegression should use correct dataset in examples and user guide doc

Apache Spark (JIRA) Mon, 23 Nov 2015 01:12:38 -0800

     [ 
https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Apache Spark reassigned SPARK-11920:
------------------------------------

    Assignee: Apache Spark

> ML LinearRegression should use correct dataset in examples and user guide doc
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-11920
>                 URL: https://issues.apache.org/jira/browse/SPARK-11920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, ML
>            Reporter: Yanbo Liang
>            Assignee: Apache Spark
>            Priority: Minor
>
> ML LinearRegression uses data/mllib/sample_libsvm_data.txt as the dataset in 
> the examples and user guide doc, but it is actually a classification dataset 
> rather than a regression dataset. We should use 
> data/mllib/sample_linear_regression_data.txt instead.
> The deeper cause is that LinearRegression with the "normal" solver cannot 
> solve this dataset correctly, possibly because the problem is ill-conditioned 
> and the labels are unreasonable for regression. This issue has been reported 
> at SPARK-11918.
> So we should make this change in the examples and user guide, so that they 
> clearly illustrate the usage of the LinearRegression algorithm.
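
The distinction drawn in the description (classification labels versus regression labels) can be checked with a rough heuristic. The sketch below is plain Python and not part of Spark; the helper names, the `max_classes` threshold, and the sample rows are all hypothetical and made up for illustration, not taken from the actual data files. It assumes the LIBSVM text layout, where the label is the first whitespace-separated token on each line.

```python
def parse_libsvm_label(line):
    """Return the label, i.e. the first whitespace-separated token of a LIBSVM line."""
    return float(line.split()[0])


def looks_like_classification(lines, max_classes=10):
    """Heuristic: few distinct labels, all integer-valued, suggests class labels.

    max_classes is an arbitrary illustrative threshold, not a Spark setting.
    """
    labels = {parse_libsvm_label(line) for line in lines if line.strip()}
    if not labels:
        return False  # no data, nothing to conclude
    return len(labels) <= max_classes and all(v.is_integer() for v in labels)


# Made-up rows in LIBSVM format (label index:value ...), for illustration only.
classification_rows = ["0 128:51 129:159", "1 159:124 160:253", "1 125:145 126:255"]
regression_rows = ["-9.49 1:0.45 2:0.36", "0.25 1:-0.83 2:0.12", "4.8 1:0.17 2:-0.91"]

print(looks_like_classification(classification_rows))
print(looks_like_classification(regression_rows))
```

A check like this only hints at the intended task; a regression dataset with a few integer-valued targets would be misclassified, so the choice of example dataset should still be confirmed by inspection.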



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
