As you can see, that little bugger of my neighbour's kid makes two mistakes: in the first case the error according to RMSE/MSE/MAE wouldn't be that big (440.00 - 349.23 = 90.77), while the latter case would represent a MASSIVE error for the machine learning system (in this case the kid): 698.46 - 440.00 = 258.46.
I realized that I made a mistake in the second calculation: it should be 698.46 - 349.23 = 349.23 (which is even worse!).

On 13/01/2016 11:26, Raf wrote:
Hi Wakan,

in the case of a classification task (when your dataset already has defined "labels"), the RMSE/MSE/MAE would be calculated (at least with other ML models) between the probability predicted for the right class (a value between 0.00 and 1.00) and the real class (exactly 0.00 or 1.00). That is why, for this particular job, I wouldn't advise this way of estimating the error. Even using the price as a loss function for a classification task, though... I suppose it wouldn't perform well in the real world (imho).

Nonetheless, even in the case of a pure regression task (where you try to predict the next price, for example), and even calculating RMSE/MSE/MAE as a cost function between prices... it wouldn't be useful in real life (imho). Why? Because it wouldn't take into account non-linear real-life factors like fees and commissions, price slippage and lack of liquidity.
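To be concrete, this is all a plain RMSE/MAE evaluation sees (the prices below are made up for illustration); notice there is no place in it for fees or slippage:

    import math

    # Made-up predicted vs. actual prices, purely illustrative.
    actual    = [1205.00, 1170.00, 900.00]
    predicted = [1155.00, 1160.00, 905.00]

    errors = [p - a for p, a in zip(predicted, actual)]
    mae  = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    print("MAE: %.2f  RMSE: %.2f" % (mae, rmse))
    # Only distances between numbers get scored; commissions, spread
    # widening and slippage simply cannot appear in this formula.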

What you really want to know is whether right now is a good moment to "do nothing", "buy" or "sell" - not just what the price change will be.

I'll try to give an example (sorry if I raise too many trading concepts, but in my opinion it actually fits the topic).

- It is 12:00. The current price of something is 1205.00 (something/USD). The predicted price for three hours from now (15:00) is 1155.00: it is lower, but it is not worth the trade... for now.
- One hour passes (it is 13:00) and the current real price is 1170.00. Let's imagine that the real price three hours from now will be a lot lower than the current one: 900.00. The predicted price for three hours from now is 905.00: a wonderful prediction! It is a lot lower than the current price, so we sell.

But... because this is happening around some news (every day at 15.00 CET or 16.00 CET there is some news that shakes that particular asset, and the machine learning system picked that up), your bank decides to increase the spread (or the commissions) in order to filter orders, giving precedence to bigger investments at the expense of smaller ones. Also, because there is huge volatility, your order cannot be processed at the price you hoped to sell at: your "sell" request doesn't match any "buy" order at that price, so it is filled at 960.00 instead of the price you saw (1170.00). Furthermore, due to the high network traffic at that time, the order isn't processed promptly, slipping by a couple of milliseconds, which turns into yet more price slippage. Between the widened spread/commissions and this market slippage, your profit is almost non-existent and you could even end up losing money.
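If it helps, here is a rough sketch of the money-based outcome I have in mind for that short trade; the commission and slippage figures are invented placeholders, not real market parameters:

    def trade_outcome(intended_price, exit_price, size=1.0,
                      commission=0.001, slippage=0.0):
        # Hypothetical P&L of a short trade: sell now, buy back later.
        # commission is a per-side fraction of the traded value and
        # slippage is an absolute price penalty on the fill.
        fill = intended_price - slippage        # we sell lower than hoped
        gross = (fill - exit_price) * size      # profit of the short
        fees = (fill + exit_price) * size * commission
        return gross - fees

    print(trade_outcome(1170.00, 905.00))                  # "paper" trade: ~262.9
    print(trade_outcome(1170.00, 905.00, slippage=210.0))  # real fill:     ~53.1

A loss function built on numbers like these would teach the system that the 13:00 entry is barely worth anything, which is exactly what an RMSE on prices cannot express.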

Now, let me recall the original question I have been asking my machine learning system: "is it a good moment to buy, to sell, or to do nothing?".

Of course a human mind would prefer not to enter the market during that volatility (at 13:00) - whereas some sort of "blind" ML system that looks only at the price change would be tempted to enter at exactly that moment. That is not what we wanted in the first place.


I picked this example just because I'm familiar with these kinds of scenarios, but it probably wasn't the best choice for NuPIC.

I'll make another example which, in my opinion, fits a neocortical algorithm better. Let's imagine that my neighbour has a kid who is learning the piano (this is actually happening :-) ). He is learning this song[1]: "Frère Jacques". When I'm working and I hear him through the wall, my brain (see the HTM paper, which explains this brilliantly) expects this exact sequence[2]: C - D - E - C - C - D - E - C | E - F (...). Now, while the kid learns, of course he makes mistakes (that's how we learn, after all!). When he plays a wrong note (one that is not in the above sequence), he - actually "we" :) - understands that he was wrong, because my TemporalMultiStep prediction detected an "anomaly" (trying to use NuPIC jargon here). If his brain had to use a "brutal" :) RMSE or MAE instead, it would take the frequencies of the piano notes[3] and subtract the expected (real) value from the played value.

EXPECTED VALUES (considering the middle C):
261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - 261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - | - 329.63 Hz - 349.23 Hz - ...

PLAYED VALUES (Error 1):
261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - 261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - | - 329.63 Hz - 440.00 Hz (A) ...

PLAYED VALUES (Error 2):
261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - 261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - | - 329.63 Hz - 698.46 Hz (the F of the NEXT octave instead of the current octave's F) ...


As you can see, that little bugger of my neighbour's kid makes two mistakes: in the first case the error according to RMSE/MSE/MAE wouldn't be that big (440.00 - 349.23 = 90.77), while the latter case would represent a MASSIVE error for the machine learning system (in this case the kid): 698.46 - 440.00 = 258.46.

In reality, though, the second error is much less horrible to listen to (it is at least the same note, just one octave up) than the first one (a totally different note).

If the kid had to learn using RMSE/MSE/MAE, I don't think he would be able to tell "little" errors from "massive" ones - here the metric ranks them exactly backwards - and thus he couldn't learn to play the piano.
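Just to put numbers on it, here is a tiny sketch (plain Python, nothing NuPIC-specific) comparing the raw frequency error with an octave-folded, semitone-based distance, which is closer to how our ears judge the mistake:

    import math

    F4, A4, F5 = 349.23, 440.00, 698.46  # expected F, wrong note A, F one octave up

    def hz_error(expected, played):
        # the "brutal" MAE-style distance, straight on frequencies
        return abs(played - expected)

    def semitone_error(expected, played):
        # distance in semitones, folded into one octave (pitch class)
        n = abs(12.0 * math.log(played / expected, 2)) % 12.0
        return min(n, 12.0 - n)

    print(hz_error(F4, A4), hz_error(F4, F5))              # 90.77 vs 349.23
    print(semitone_error(F4, A4), semitone_error(F4, F5))  # ~4.0 vs ~0.0

The frequency metric calls the octave slip the massive mistake; the pitch-class distance correctly treats it as the harmless one.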


In the end, what I wanted to stress - to give you my opinion - is that the choice of the loss function/metric is very, very important for defining the correct learning goal of any machine learning system.

My two cents :)

Raf


[1]: https://www.youtube.com/watch?v=eYtuOYABwes
[2]: http://www.true-piano-lessons.com/images/FrerejacquesinCtab.jpg
[3]: http://amath.colorado.edu/pub/matlab/music/frequencies.jpg





On 13/01/2016 10:24, Wakan Tanka wrote:
Thank you Raf,

why do you think you would not notice the difference between a little mistake and a big one? I suppose little mistakes will have a smaller square and big ones a bigger square. When you then average all the values obtained this way, it is possible that one big mistake will drastically change the final score, but in general that is what you want, isn't it? It doesn't matter whether you made a lot of small mistakes or one big mistake, if the amount of money you lost is the same.
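For example (made-up error values): two runs that lose the same total amount, where MAE can't tell them apart but MSE clearly flags the single big miss:

    many_small = [10.0, 10.0, 10.0, 10.0]   # four small mistakes
    one_big    = [40.0,  0.0,  0.0,  0.0]   # one big mistake, same total

    mae = lambda errs: sum(abs(e) for e in errs) / len(errs)
    mse = lambda errs: sum(e * e for e in errs) / len(errs)

    print(mae(many_small), mae(one_big))  # 10.0 vs 10.0   -> identical
    print(mse(many_small), mse(one_big))  # 100.0 vs 400.0 -> big miss dominates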

PS: I suppose that with some other metric (some kind of clustering, or maybe a simpler method using just basic histograms) it should be possible to filter out just those big mistakes

am I wrong?


On 01/13/2016 08:50 AM, Raf wrote:
Hello Wakan.

This is a huge point you are making: defining a loss function can completely change the validity of an ML algo.

Depending on your task (regression, classification) I strongly suggest you create your own Metrics[1]: this, imho, can have a big impact on how the HTM region processes the data - it literally changes the "learning goal".

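For reference, declaring a metric in the OPF looks roughly like this (the field name and params are placeholders in the style of the hotgym examples; adapt them to your own model):

    from nupic.frameworks.opf.metrics import MetricSpec

    # Placeholder field and params, hotgym-style; not specific advice.
    metric_specs = (
        MetricSpec(field='consumption',
                   metric='multiStep',
                   inferenceElement='multiStepBestPredictions',
                   params={'errorMetric': 'aae', 'window': 1000, 'steps': 1}),
    )

A fully custom loss would mean implementing your own metric in that module[1].
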
I'll try to clarify what I mean with a general classification example, not strictly linked to NuPIC. Let's imagine I have a simple time-series classification task; it's maybe a bit unrealistic, but it'll do the job. I'm receiving oil prices and I'd like to know whether now is the right moment to perform no action (label 0), to "buy" (label 1) or to "sell" (label 2). The prediction produced by the algo would consist of a probability for each label, for example: label 0 = 0.12 (12%), label 1 = 0.70 (70%), label 2 = 0.18 (18%).
Now, if I just evaluated the error using the distance of the predicted value from the real value in terms of RMSE, I would not notice (and above all I wouldn't let my ML system notice) the difference between a little mistake (the action is wrong but the price difference is not that big) and a big mistake (the action is wrong again, but this time the price difference is huge). In this case, using as a loss function the monetary outcome of the trade as if we had performed it for real (including fees and commissions) could, imho, give you a better overall learning process, one that is more useful in the real world.
Of course, this has nothing to do with NuPIC per se; I suppose it applies to basically every ML algo you can think of.
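To show that blindness concretely (the money figures are invented): the RMSE on the probability vector comes out as one fixed number, whether the wrong call was cheap or expensive:

    import math

    def rmse(pred, target):
        return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

    pred  = [0.12, 0.70, 0.18]   # P(no action), P(buy), P(sell) from above
    truth = [0.00, 0.00, 1.00]   # the right action was actually "sell"

    # The same probabilistic error, regardless of whether that wrong
    # "buy" ended up costing 0.50 or 42.00 once fees and the actual
    # price move are accounted for.
    print("RMSE: %.3f" % rmse(pred, truth))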

Raf
[1] https://github.com/numenta/nupic/blob/master/src/nupic/frameworks/opf/metrics.py



On 13/01/2016 02:19, Wakan Tanka wrote:
Hello NuPIC,

How do you evaluate the correctness and accuracy of a prediction? Or, if you have multiple predictions for the same data, how do you compare which prediction was more accurate? I've seen that there is NAB[1], but to be honest I haven't dug into it deeply, so I don't know whether it might help or not. AFAIK, when you want to do such things, correlation should work fine - in this case the correlation between the original and the predicted data. But correlation only works when the relationship is linear; it would not work e.g. on the hotgym example, where you have repeating cycles, peaks, maybe random events on particular days, etc. So my intuitive approach was to calculate the absolute difference[2] between the original and the predicted value and then take the mean of those values: the lower the mean, the better the prediction. Then I realized that there is the standard deviation[3], which can be calculated from those absolute differences. The next step would be to pick all the values whose absolute difference between original and predicted value is:
1. above mean + standard deviation
2. below mean - standard deviation

This should give me an overview of how many values fall within this interval and how many don't. The dataset where more values fall within the interval is the one with the better prediction.
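In code, what I have in mind is roughly this (toy numbers):

    import math

    def prediction_score(original, predicted):
        # mean absolute difference, its standard deviation, and the
        # fraction of points whose difference falls inside
        # [mean - std, mean + std]
        diffs = [abs(o - p) for o, p in zip(original, predicted)]
        mean = sum(diffs) / len(diffs)
        std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))
        inside = sum(1 for d in diffs if mean - std <= d <= mean + std)
        return mean, std, inside / float(len(diffs))

    original  = [10.0, 12.0, 11.0, 13.0, 12.5]
    predicted = [10.5, 11.0, 11.2, 15.0, 12.0]
    print(prediction_score(original, predicted))
    # the dataset with the lower mean and the higher in-interval
    # fraction counts as the better prediction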

Does this make sense?




[1] http://numenta.com/blog/nab-a-benchmark-for-streaming-anomaly-detection.html
[2] https://en.wikipedia.org/wiki/Absolute_difference
[3] http://www.mathsisfun.com/data/standard-deviation.html








--
Raf

www.madraf.com/algotrading
reply to: [email protected]
skype: algotrading_madraf

