Re: How do you evaluate the correctness and accuracy of a prediction

Raf Tue, 12 Jan 2016 23:52:21 -0800

Hello Wakan.

This is a huge point you are making, and defining a loss function can completely change the validity of an ML algo.

Depending on your task (regression, classification), I strongly suggest you create your own Metrics [1]: this, imho, could have a big impact on how the HTM region processes the data - it literally changes the "learning goal".

I'll try to clarify what I mean with a general classification example not strictly linked to NuPIC.

Let's imagine I have a simple time series classification task that's maybe a bit unrealistic, but it'll do the job. I'm receiving oil prices and I'd like to know whether now is the right moment to perform no action (label 0), to "buy" (label 1) or to "sell" (label 2). The prediction obtained by the algo would consist of the probability for each label; as an example: label 0 = 0.12 (12%), label 1 = 0.70 (70%), label 2 = 0.18 (18%).

Now, if I just evaluated the error as the distance of the predicted value from the real value in terms of RMSE, I would not notice (and, most of all, I wouldn't let my ML system notice) the subtle difference between a little mistake (the action is wrong and the price difference is not that big) and a big mistake (the action is wrong again, but the price difference is huge this time). In this case, for example, using as a loss function the monetary outcome of the trade as if we had performed it for real (including fees and commissions) could, imho, give you a better overall learning process that is more useful in the real world.

Of course, this has nothing to do with NuPIC per se, but I suppose it is common to basically all the ML algos you can think of.
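To make the oil price example concrete, here is a minimal sketch in plain Python. It is not NuPIC code and does not use the OPF metrics API; the function names, the flat fee of 0.5 and the prices are all made up for illustration. It only shows that a probability-based RMSE gives the same score to a small and a large mistake, while a money-based loss tells them apart.

```python
import math

# Labels as in the example above: 0 = no action, 1 = buy, 2 = sell.
HOLD, BUY, SELL = 0, 1, 2


def rmse_against_label(probs, true_label):
    """RMSE between the predicted probabilities and the one-hot true label."""
    target = [1.0 if i == true_label else 0.0 for i in range(len(probs))]
    squared = [(p - t) ** 2 for p, t in zip(probs, target)]
    return math.sqrt(sum(squared) / len(squared))


def trade_pnl(action, price_now, price_next, fee=0.5):
    """Hypothetical profit of trading one unit, net of an assumed flat fee."""
    if action == BUY:
        return (price_next - price_now) - fee
    if action == SELL:
        return (price_now - price_next) - fee
    return 0.0  # no action: nothing gained, nothing lost


def money_loss(probs, price_now, price_next, fee=0.5):
    """Loss defined as the negative P&L of the most probable action."""
    action = max(range(len(probs)), key=lambda i: probs[i])
    return -trade_pnl(action, price_now, price_next, fee)


# The example prediction: "buy" is the most probable label.
probs = [0.12, 0.70, 0.18]

# In both cases the price falls, so "sell" was the right call and
# the probability-based RMSE is identical.
rmse_small = rmse_against_label(probs, SELL)
rmse_big = rmse_against_label(probs, SELL)

# The money-based loss separates a small drop from a large one.
loss_small = money_loss(probs, price_now=30.0, price_next=29.0)
loss_big = money_loss(probs, price_now=30.0, price_next=20.0)

print(rmse_small, rmse_big)   # same value twice
print(loss_small, loss_big)   # the second loss is much larger
```

Taking the negative P&L of the single most probable action is only one possible choice; weighting the P&L of every action by its predicted probability would be another.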

Raf

[1] https://github.com/numenta/nupic/blob/master/src/nupic/frameworks/opf/metrics.py



On 13/01/2016 02:19, Wakan Tanka wrote:

Hello NuPIC,
How do you evaluate the correctness and accuracy of a prediction? Or, if you have multiple predictions for the same data, how do you compare which prediction was more accurate? I've seen that there is NAB [1], but to be honest I did not get deep into it, so I do not know if it might help or not. AFAIK when you want to do such things correlation should work fine, in this case the correlation between original and predicted data. But correlation works only when you have linear data; it would not work e.g. on the hotgym example, where you have repeating cycles, peaks, maybe random events on particular days etc. So my intuitive approach was to calculate the absolute difference [2] of the original and predicted value and then calculate the mean of those values. The lower the mean is, the better the prediction is. Then I realized that there is the standard deviation [3], which can be calculated from those absolute differences. The next step would be to pick out all values whose absolute difference of original and predicted value is:
1. above mean + standard deviation
2. below mean - standard deviation
This should give me an overview of how many values fall in this interval and how many don't. The dataset where more values fall in the interval is the dataset with the better prediction.
Does this make sense?
[1] http://numenta.com/blog/nab-a-benchmark-for-streaming-anomaly-detection.html
[2] https://en.wikipedia.org/wiki/Absolute_difference
[3] http://www.mathsisfun.com/data/standard-deviation.html
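
The comparison described in the quoted message (mean of the absolute differences, their standard deviation, and the share of values inside mean +/- one standard deviation) can be sketched as follows. This is only an illustration in plain Python with made-up numbers; the function name `error_summary` and the use of the population standard deviation are assumptions, not something taken from NuPIC.

```python
import statistics


def error_summary(actual, predicted):
    """Return (mean abs error, std of abs errors, fraction inside mean +/- std)."""
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    mean = statistics.mean(abs_err)
    # Population standard deviation of the absolute errors (an assumption;
    # statistics.stdev would give the sample version instead).
    std = statistics.pstdev(abs_err)
    inside = sum(1 for e in abs_err if mean - std <= e <= mean + std)
    return mean, std, inside / len(abs_err)


# Made-up data: one series of real values and two competing predictions.
actual = [10.0, 12.0, 11.0, 13.0]
pred_a = [10.5, 11.5, 11.5, 12.5]  # always off by 0.5
pred_b = [10.0, 12.0, 11.0, 17.0]  # exact except for one large miss

mean_a, std_a, frac_a = error_summary(actual, pred_a)
mean_b, std_b, frac_b = error_summary(actual, pred_b)

print(mean_a, std_a, frac_a)
print(mean_b, std_b, frac_b)
```

Note that the interval is built from each prediction's own errors, so the fraction inside it is best read together with the mean rather than on its own.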


--
Raf

www.madraf.com/algotrading
reply to: [email protected]
skype: algotrading_madraf
