2014-06-11 4:07 GMT+02:00 Joe Bogner <[email protected]>:
> Thanks Jan-Pieter, how would I recreate the results of calculating the
> % correct with yours? I will still give it a shot on my own later. I
> pasted some code to help jumpstart the reading of the array of data:
>
Thanks for the info!
I just tried the classification of the data and this is what I get:
NB. transformed your loader into a reusable verb.
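parsefile =: 3 : 0
  file =. fread y                      NB. read the whole file as one string
  header_end =. >: file i. LF          NB. index just past the header line
  arr =. ". ];._2 header_end }. file   NB. cut on LF, then execute each row as numbers
)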
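To show what the verb does without touching the file system, here is a small sketch that traces the same three steps on a literal string. The sample text is made up for illustration and is not taken from the actual CSV files. Note that this trick of executing each row assumes non-negative numeric fields: a field such as -2 would be read as the verb negate applied to everything to its right.

```j
NB. Hypothetical CSV text: one header line, then two numeric rows.
sample =. 'label,p0,p1', LF, '1,0,255', LF, '0,12,3', LF
header_end =. >: sample i. LF        NB. index just past the first LF
rows =. ];._2 header_end }. sample   NB. character matrix, one row per line
arr =. ". rows                       NB. each row is executed as a J sentence
$ arr                                NB. shape is 2 3
```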
NB. Load training and validation labels and data
Train =: parsefile jpath '~temp/trainingsample.csv'
Validation =: parsefile jpath '~temp/validationsample.csv'
NB. separate labels (1st column) from data (the rest)
'TrainLabels TrainData' =: ({."1 ; }."1) Train
'ValidationLabels ValidationData' =: ({."1 ; }."1) Validation
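The split works because ({."1 ; }."1) is a fork: it boxes the first element of every row together with the rest of every row, and the multiple assignment unpacks the two boxes. A minimal sketch on a made-up 3-by-4 matrix (the names M, labels and data are only for this illustration):

```j
NB. Hypothetical matrix: first column holds labels, the rest holds features.
M =. 3 4 $ 7 10 11 12  2 20 21 22  5 30 31 32
'labels data' =. ({."1 ; }."1) M
labels     NB. 7 2 5
$ data     NB. 3 3
```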
NB. Classify one against all:
predicted =: 10 nnClass oaa TrainLabels;TrainData;ValidationData
NB. Assess the accuracy of our result:
OA =: 100 * (+/%#)@:=
predicted OA ValidationLabels
93.6
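In case the tacit definition is hard to read: OA compares the two label lists element by element, takes the mean of the resulting 0/1 list, and multiplies by 100. A small sketch with made-up labels (guess and truth are hypothetical names, not part of the classification code):

```j
OA =: 100 * (+/%#)@:=
guess =. 1 2 3 4      NB. hypothetical predicted labels
truth =. 1 2 0 4      NB. hypothetical true labels
guess = truth         NB. 1 1 0 1
guess OA truth        NB. 75
```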
I'd like to recommend the book that got me started on implementing all of this:
The Elements of Statistical Learning
Trevor Hastie, Robert Tibshirani, Jerome Friedman
The PDF is freely (and legally) available at
http://statweb.stanford.edu/~tibs/ElemStatLearn/
In the future, I'd be interested in toying around with more advanced
classifiers, like Support Vector Machines.
Jan-Pieter