The dataset is in a fairly popular data format called the libsvm data format,
popularized by the libsvm library.
http://svmlight.joachims.org - The 'How to Use' section describes the file
format.
XGBoost uses the same file format, and its documentation is here -
https://xgboost.readthedocs.io/en/l
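
To make the format concrete: each line is a label followed by sparse
index:value pairs, and features that are not listed are taken to be zero.
Below is a minimal Python sketch of reading one such line. The function name
parse_libsvm_line and the sample row are made up for illustration (they are
not from Spark or libsvm), and the sketch does not handle SVMlight's optional
qid: tokens.

```python
def parse_libsvm_line(line):
    """Parse one libsvm-format line into (label, {feature_index: value}).

    Hypothetical helper for illustration only; not part of any library.
    """
    # The SVMlight variant allows a trailing '#' comment; drop it if present.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None  # blank or comment-only line

    tokens = body.split()
    label = float(tokens[0])  # first token is the class label or target

    features = {}
    for token in tokens[1:]:
        # Each remaining token has the form <index>:<value>.
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


# Made-up sample row: label 1, with three non-zero features.
sample = "1 3:0.5 10:1.0 27:255"
label, features = parse_libsvm_line(sample)
print(label, features)
```

In practice you would normally let the library read the file for you rather
than parse it by hand; the sketch is only meant to show what each line
contains.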
Hi,
I am trying to use Apache Spark's decision tree classifier, following the
classification example in the Spark 1.5.1 documentation
(https://spark.apache.org/docs/1.5.1/ml-decision-tree.html).
I found the dataset at
https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt