As you can see, that little bugger of my neighbour's kid makes two mistakes: in the first case the error according to RMSE/MSE/MAE wouldn't be that big (440.00 - 349.23 = 90.77), while the latter case would represent a MASSIVE error for the machine learning system (in this case the kid): 698.46 - 440.00 = 258.46.
I realized that I made a mistake in the second calculation: it should be 698.46 - 349.23 = 349.23 (which is even worse!).

On 13/01/2016 11:26, Raf wrote:
Hi Wakan,

in the case of a classification task (when your dataset already has defined "labels"), the RMSE/MSE/MAE would be calculated (at least with other ML models) between the probability predicted for the right class (a value between 0.00 and 1.00) and the real class (exactly 0.00 or 1.00). That is why, for this particular job, I wouldn't advise this way of estimating the error. Even using the price as a loss function for a classification task, though... I suppose it wouldn't perform well in the real world (imho).

Nonetheless, even in the case of a pure regression task (where you try to predict the next price, for example), and even calculating RMSE/MSE/MAE as a cost function between prices... it wouldn't be useful in real life (imho). Why? Because it wouldn't take into account non-linear real-life factors like fees and commissions, price slippage and lack of liquidity.
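To be concrete, this is all a plain RMSE/MAE evaluation sees (the prices below are made up for illustration); notice there is no place in it for fees or slippage:

    import math

    # Made-up predicted vs. actual prices, purely illustrative.
    actual    = [1205.00, 1170.00, 900.00]
    predicted = [1155.00, 1160.00, 905.00]

    errors = [p - a for p, a in zip(predicted, actual)]
    mae  = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    print("MAE: %.2f  RMSE: %.2f" % (mae, rmse))
    # Only distances between numbers get scored; commissions, spread
    # widening and slippage simply cannot appear in this formula.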

What you really want to know is whether right now is a good moment to "do nothing", "buy" or "sell" - not just what the price change will be.

I'll try to give an example (sorry if I raise too many trading concepts, but in my opinion it actually fits the topic).

- It is 12:00. The current price of something is 1205.00 (something/USD). The predicted price for three hours from now (15:00) is 1155.00: it is lower, but it is not worth the trade... for now.
- One hour passes (it is 13:00) and the current real price is 1170.00. Let's imagine that the real price three hours from now will be a lot lower than the current one: 900.00. The predicted price for three hours from now is 905.00: a wonderful prediction! It is a lot lower than the current price, so we sell.

But... because this is happening around some news (every day at 15.00 CET or 16.00 CET there is some news that shakes that particular asset, and the machine learning system picked that up), your bank decides to increase the spread (or the commissions) in order to filter orders, giving precedence to bigger investments at the expense of smaller ones. Also, because there is huge volatility, your order cannot be processed at the price you hoped to sell at: your "sell" request doesn't match any "buy" order at that price, so it is filled at 960.00 instead of the price you saw (1170.00). Furthermore, due to the high network traffic at that time, the order isn't processed promptly, slipping by a couple of milliseconds, which turns into yet more price slippage. Between the widened spread/commissions and this market slippage, your profit is almost non-existent and you could even end up losing money.
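If it helps, here is a rough sketch of the money-based outcome I have in mind for that short trade; the commission and slippage figures are invented placeholders, not real market parameters:

    def trade_outcome(intended_price, exit_price, size=1.0,
                      commission=0.001, slippage=0.0):
        # Hypothetical P&L of a short trade: sell now, buy back later.
        # commission is a per-side fraction of the traded value and
        # slippage is an absolute price penalty on the fill.
        fill = intended_price - slippage        # we sell lower than hoped
        gross = (fill - exit_price) * size      # profit of the short
        fees = (fill + exit_price) * size * commission
        return gross - fees

    print(trade_outcome(1170.00, 905.00))                  # "paper" trade: ~262.9
    print(trade_outcome(1170.00, 905.00, slippage=210.0))  # real fill:     ~53.1

A loss function built on numbers like these would teach the system that the 13:00 entry is barely worth anything, which is exactly what an RMSE on prices cannot express.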

Now, let me recall the original question I have been asking my machine learning system: "is it a good moment to buy, to sell, or to do nothing?".

Of course a human mind would prefer not to enter the market during that volatility (at 13:00) - whereas some sort of "blind" ML system that looks only at the price change would be tempted to enter at exactly that moment. That is not what we wanted in the first place.


I picked this example just because I'm familiar with these kinds of scenarios, but it probably wasn't the best choice for NuPIC.

I'll make another example which, in my opinion, fits a neocortical algorithm better. Let's imagine that my neighbour has a kid who is learning the piano (this is actually happening :-) ). He is learning this song[1]: "Frère Jacques". When I'm working and I hear him through the wall, my brain (see the HTM paper, which explains this brilliantly) expects this exact sequence[2]: C - D - E - C - C - D - E - C | E - F (...). Now, while the kid learns, of course he makes mistakes (that's how we learn, after all!). When he plays a wrong note (one that is not in the above sequence), he - actually "we" :) - understands that he was wrong, because my TemporalMultiStep prediction detected an "anomaly" (trying to use NuPIC jargon here). If his brain had to use a "brutal" :) RMSE or MAE instead, it would take the frequencies of the piano notes[3] and subtract the expected (real) value from the played value.

EXPECTED VALUES (considering the middle C):
261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - 261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - | - 329.63 Hz - 349.23 Hz - ...

PLAYED VALUES (Error 1):
261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - 261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - | - 329.63 Hz - 440.00 Hz (A) ...

PLAYED VALUES (Error 2):
261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - 261.63 Hz - 293.66 Hz - 329.63 Hz - 261.63 Hz - | - 329.63 Hz - 698.46 Hz (the F of the NEXT octave instead of the current octave's F) ...


As you can see, that little bugger of my neighbour's kid makes two mistakes: in the first case the error according to RMSE/MSE/MAE wouldn't be that big (440.00 - 349.23 = 90.77), while the latter case would represent a MASSIVE error for the machine learning system (in this case the kid): 698.46 - 440.00 = 258.46.

In reality, though, the second error is much less horrible to listen to (it is at least the same note, just one octave up) than the first one (a totally different note).

If the kid had to learn using RMSE/MSE/MAE, I don't think he would be able to tell "little" errors from "massive" ones - here the metric ranks them exactly backwards - and thus he couldn't learn to play the piano.
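Just to put numbers on it, here is a tiny sketch (plain Python, nothing NuPIC-specific) comparing the raw frequency error with an octave-folded, semitone-based distance, which is closer to how our ears judge the mistake:

    import math

    F4, A4, F5 = 349.23, 440.00, 698.46  # expected F, wrong note A, F one octave up

    def hz_error(expected, played):
        # the "brutal" MAE-style distance, straight on frequencies
        return abs(played - expected)

    def semitone_error(expected, played):
        # distance in semitones, folded into one octave (pitch class)
        n = abs(12.0 * math.log(played / expected, 2)) % 12.0
        return min(n, 12.0 - n)

    print(hz_error(F4, A4), hz_error(F4, F5))              # 90.77 vs 349.23
    print(semitone_error(F4, A4), semitone_error(F4, F5))  # ~4.0 vs ~0.0

The frequency metric calls the octave slip the massive mistake; the pitch-class distance correctly treats it as the harmless one.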


In the end, what I wanted to stress - to give you my opinion - is that the choice of the loss function/metric is very, very important for defining the correct learning goal of any machine learning system.

My two cents :)

Raf


[1]: https://www.youtube.com/watch?v=eYtuOYABwes
[2]: http://www.true-piano-lessons.com/images/FrerejacquesinCtab.jpg
[3]: http://amath.colorado.edu/pub/matlab/music/frequencies.jpg





On 13/01/2016 10:24, Wakan Tanka wrote:
Thank you Raf,

why do you think you would not notice the difference between a little mistake and a big one? I suppose little mistakes will have a smaller square and big ones a bigger square. When you then average all the values obtained this way, it is possible that one big mistake will drastically change the final score, but in general that is what you want, isn't it? It doesn't matter whether you made a lot of small mistakes or one big mistake, if the amount of money you lost is the same.
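For example (made-up error values): two runs that lose the same total amount, where MAE can't tell them apart but MSE clearly flags the single big miss:

    many_small = [10.0, 10.0, 10.0, 10.0]   # four small mistakes
    one_big    = [40.0,  0.0,  0.0,  0.0]   # one big mistake, same total

    mae = lambda errs: sum(abs(e) for e in errs) / len(errs)
    mse = lambda errs: sum(e * e for e in errs) / len(errs)

    print(mae(many_small), mae(one_big))  # 10.0 vs 10.0   -> identical
    print(mse(many_small), mse(one_big))  # 100.0 vs 400.0 -> big miss dominates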

PS: I suppose that with some other metric (some kind of clustering, or maybe a simpler method using just basic histograms) it should be possible to filter out just those big mistakes

am I wrong?


On 01/13/2016 08:50 AM, Raf wrote:
Hello Wakan.

This is a huge point you are making: defining a loss function can completely change the validity of an ML algo.

Depending on your task (regression, classification) I strongly suggest you create your own Metrics[1]: this, imho, can have a big impact on how the HTM region processes the data - it literally changes the "learning goal".

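For reference, declaring a metric in the OPF looks roughly like this (the field name and params are placeholders in the style of the hotgym examples; adapt them to your own model):

    from nupic.frameworks.opf.metrics import MetricSpec

    # Placeholder field and params, hotgym-style; not specific advice.
    metric_specs = (
        MetricSpec(field='consumption',
                   metric='multiStep',
                   inferenceElement='multiStepBestPredictions',
                   params={'errorMetric': 'aae', 'window': 1000, 'steps': 1}),
    )

A fully custom loss would mean implementing your own metric in that module[1].
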
I'll try to clarify what I mean with a general classification example, not strictly linked to NuPIC. Let's imagine I have a simple time-series classification task; it's maybe a bit unrealistic, but it'll do the job. I'm receiving oil prices and I'd like to know whether now is the right moment to perform no action (label 0), to "buy" (label 1) or to "sell" (label 2). The prediction produced by the algo would consist of a probability for each label, for example: label 0 = 0.12 (12%), label 1 = 0.70 (70%), label 2 = 0.18 (18%).
Now, if I just evaluated the error using the distance of the predicted value from the real value in terms of RMSE, I would not notice (and above all I wouldn't let my ML system notice) the difference between a little mistake (the action is wrong but the price difference is not that big) and a big mistake (the action is wrong again, but this time the price difference is huge). In this case, using as a loss function the monetary outcome of the trade as if we had performed it for real (including fees and commissions) could, imho, give you a better overall learning process, one that is more useful in the real world.
Of course, this has nothing to do with NuPIC per se; I suppose it applies to basically every ML algo you can think of.
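To show that blindness concretely (the money figures are invented): the RMSE on the probability vector comes out as one fixed number, whether the wrong call was cheap or expensive:

    import math

    def rmse(pred, target):
        return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

    pred  = [0.12, 0.70, 0.18]   # P(no action), P(buy), P(sell) from above
    truth = [0.00, 0.00, 1.00]   # the right action was actually "sell"

    # The same probabilistic error, regardless of whether that wrong
    # "buy" ended up costing 0.50 or 42.00 once fees and the actual
    # price move are accounted for.
    print("RMSE: %.3f" % rmse(pred, truth))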

Raf
[1] https://github.com/numenta/nupic/blob/master/src/nupic/frameworks/opf/metrics.py



On 13/01/2016 02:19, Wakan Tanka wrote:
Hello NuPIC,

How do you evaluate the correctness and accuracy of a prediction? Or, if you have multiple predictions for the same data, how do you compare which prediction was more accurate? I've seen that there is NAB[1], but to be honest I haven't dug into it deeply, so I don't know whether it might help or not. AFAIK, when you want to do such things, correlation should work fine - in this case the correlation between the original and the predicted data. But correlation only works when the relationship is linear; it would not work e.g. on the hotgym example, where you have repeating cycles, peaks, maybe random events on particular days, etc. So my intuitive approach was to calculate the absolute difference[2] between the original and the predicted value and then take the mean of those values: the lower the mean, the better the prediction. Then I realized that there is the standard deviation[3], which can be calculated from those absolute differences. The next step would be to pick all the values whose absolute difference between original and predicted value is:
1. above mean + standard deviation
2. below mean - standard deviation

This should give me an overview of how many values fall within this interval and how many don't. The dataset where more values fall within the interval is the one with the better prediction.
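In code, what I have in mind is roughly this (toy numbers):

    import math

    def prediction_score(original, predicted):
        # mean absolute difference, its standard deviation, and the
        # fraction of points whose difference falls inside
        # [mean - std, mean + std]
        diffs = [abs(o - p) for o, p in zip(original, predicted)]
        mean = sum(diffs) / len(diffs)
        std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))
        inside = sum(1 for d in diffs if mean - std <= d <= mean + std)
        return mean, std, inside / float(len(diffs))

    original  = [10.0, 12.0, 11.0, 13.0, 12.5]
    predicted = [10.5, 11.0, 11.2, 15.0, 12.0]
    print(prediction_score(original, predicted))
    # the dataset with the lower mean and the higher in-interval
    # fraction counts as the better prediction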

Does this make sense?




[1] http://numenta.com/blog/nab-a-benchmark-for-streaming-anomaly-detection.html
[2] https://en.wikipedia.org/wiki/Absolute_difference
[3] http://www.mathsisfun.com/data/standard-deviation.html








--
Raf

www.madraf.com/algotrading
reply to: [email protected]
skype: algotrading_madraf

