GitHub user yinxusen commented on the pull request:
https://github.com/apache/spark/pull/422#issuecomment-40984061
Several comments:
==========
Code here (Scala code):
http://54.82.240.23:4000/mllib-linear-methods.html#linear-support-vector-machine-svm
`// Load training data in LIBSVM format.`
`val data = MLUtils.loadLibSVMData(sc, "mllib/data/sample_libsvm_data.txt")`
It should be
`"mllib/data/sample_svm_data.txt"`
Also, that file cannot be parsed as LIBSVM format, so you need to write a
parsePoint function, just like the Python code does.
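As a rough sketch of what such a parsePoint could look like, here it is in plain Python, assuming each line of the file holds a label followed by space-separated feature values. The name `parse_point` and the sample line are only illustrative, and wrapping the result in a LabeledPoint is left out so the snippet runs without Spark.

```python
def parse_point(line):
    """Split one whitespace-separated line into (label, features).

    Assumption: the first value is the label and the rest are dense features.
    """
    values = [float(x) for x in line.split()]
    if not values:
        raise ValueError("cannot parse an empty line")
    return values[0], values[1:]


# Illustrative input line, not taken from the actual data file.
label, features = parse_point("1 0 2.5 0 0.75")
print(label, features)
```

With an RDD of text lines, the same function would be passed to a map over the lines before training.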
=========
Code here (Python code): http://54.82.240.23:4000/mllib-naive-bayes.html
`model = NaiveBayes.train(training, 1.0)`
should be
`model = NaiveBayes.train(data, 1.0)`
It is also affected by this bug: https://github.com/apache/spark/pull/463
============
Code here: http://54.82.240.23:4000/mllib-basics.html
`import org.apache.spark.mllib.util.MLUtils`
`val training: RDD[LabeledPoint] = MLUtils.loadLibSVMData(sc,
"mllib/data/sample_libsvm_data.txt")`
For consistency, it should be
`import org.apache.spark.mllib.util.MLUtils`
`import org.apache.spark.rdd.RDD`
`import org.apache.spark.mllib.regression.LabeledPoint`
`val training: RDD[LabeledPoint] = MLUtils.loadLibSVMData(sc,
"mllib/data/sample_libsvm_data.txt")`
Also, the path given there is not valid. See the first comment above for
reference.
===============
The other code examples work fine in my tests.
Besides, the hyperlinks for
* logistic regression
* linear least squares, Lasso, and ridge regression
are incorrect: they point to a localhost address. I think you need to
adjust the anchors.
Anyway, great document!