I have been doing some more testing with my ski game and have some interesting results. Now I know that my model settings are not optimized (setting up a swarm is probably my next step), but the number of lines that it trains on seems to have a huge impact on how well it performs.
All of these results use an integer to represent each object's position as a scalar encoding. Width is the number of spaces between the boundary trees, training is the number of slope lines the model is trained on, and distance is how far it is able to go in the game without hitting a tree. Here are my initial test run results:

Width  Training  Distance
31     100       337
31     500       892
31     1000      342
31     2000      50k+
31     5000      50k+
23     100       334
23     500       846
23     1000      340
23     2000      50k+
23     5000      50k+
22     100       257
22     500       842
22     1000      293
22     2000      20299
22     5000      20299
21     100       256
21     500       90
21     1000      293
21     2000      4663
21     5000      1663

I thought that it was interesting that the distance went up so fast between 1000 and 2000, so I explored some of the settings in between:

Width  Training  Distance
21     1000      293
21     1500      112
21     1600      21
21     1700      4963
21     1800      4863

and digging deeper:

Width  Training  Distance
21     1610      11
21     1611      10
21     1612      2168
21     1613      2167
21     1614      2166
21     1615      2165
21     1616      2164
21     1617      5046
21     1618      5045
21     1619      5044
21     1620      5043
21     1630      5033
21     1640      5023
21     1650      5013

Does anyone know why the accuracy would go down with more training data?

Thanks,
Matt

On Aug 27, 2013, at 5:04 PM, Matt Keith <[email protected]> wrote:

> Thank you for the great information. I will put my feedback and reward based
> learning plans on hold for now.
>
> I have updated my ski program to disable learning when it does the live run.
> I have also added a seed for the random number generator, so that the
> training data is consistent each time. This will allow me to test different
> parameter settings and encoding schemes to see how they perform.
>
> Right now, I am still just using character position integers to represent the
> tree and skier positions and getting some really good results. When I give
> the model just 1000 lines of training data, it is able to ski for 342 lines
> before it crashes. However, when I increased the training data to 5000
> lines, it was able to ski for over 35000 lines before I stopped the program!
>
> I will continue to try different encodings and will make the width of the ski
> slope smaller and more challenging.
>
> I have updated the code at https://github.com/keithcom/nta_ski if anyone is
> interested.
>
> Thanks,
>
> Matt
>
> On Aug 27, 2013, at 12:51 PM, Patrick Higgins <[email protected]> wrote:
>
>> The CLA will predict; it does not have [goal based] motor
>> function or attention control yet. This has been the topic
>> of many discussions in other threads. So I think at present,
>> your challenge will be to come up with an encoding scheme
>> that allows the CLA to predict the skier's upcoming position.
>>
>> A sliding window of 1s should do it:
>>
>>
>> 0000...0001111...1111000...0000 centered
>> 1111...1111000...0000 left
>> 0000...0001111...1111 right
>>
>>
>> I would start with 128 bits total for the skier position
>> attribute (leaving many other bits for other attributes
>> if you have a 2048 column CLA matrix). Always have
>> (100) 0s and (28) 1s which gives an SDR of about 2%
>> giving a lot of overlap between similar skier positions.
>> I don't know the answer to this, but it might be ok, if
>> there is only one attribute being tracked, to use a much
>> smaller matrix. I think the sine wave test would be a
>> good parallel. I've not tried it.
>>
>> The real power of the CLA at present is its ability to
>> take in many different attributes of the "state of affairs"
>> (or readings from many sensors or distinct components
>> of a data set) and compare them to find patterns both
>> spatial and temporal, and make predictions about
>> what the next record will be or report how anomalous
>> the current record is.
>>
>> e.g. An electrical motor connected to a table saw:
>> Measured attributes:
>> Motor temperature (0 to 150°C)
>> Voltage at the motor's input (0 to 150VAC)
>> Amperage used by the motor (0 to 30Amps)
>>
>> (many more can be used to enrich the data set
>> allowing the CLA to better find patterns and predict
>> when the motor will overheat and/or possibly fail)
>>
>> Obviously when the voltage goes down and the amperage
>> goes up, the temperature of the motor will increase. These
>> are distinct attributes of the system that form patterns that the
>> CLA will learn to recognize. There are other factors involved
>> in this example: the voltage can fluctuate at
>> the source, because it changes at the panel, not due to a
>> change in the load on the motor. There are patterns in even
>> this simple domain that one might not predict are there, hidden
>> in the complexity of the system. I'm off topic now, but just wanted
>> to hopefully expose how the CLA can currently be used and
>> how this may deviate from your goals for the ski game test.
>>
>>
>>
>> Patrick
>>
>>
>>
>>
>>
>> On Aug 26, 2013, at 2:26 AM, Matt Keith wrote:
>>
>>> I thought about that, but it doesn't really address the intent of my test.
>>> Ideally, I would like to have the model learn how to play on its own
>>> without being trained beforehand, so I would like to have some type of
>>> metric for the model to optimize on for improvements.
>>>
>>
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
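
One way to read the width-21 results in the newest message is to add the training length to the distance, which gives the absolute slope line at which each run ended. The sketch below does only that arithmetic, on the figures exactly as posted. It assumes (the thread does not say this) that the live run starts on the slope line right after the training data and that the seeded slope is the same in every run. The names `crash_line` and `group_by_crash_line` are introduced here for illustration and are not from the nta_ski code.

```python
# (training lines, distance) pairs for width 21, copied from the posted results.
WIDTH_21_RUNS = [
    (100, 256), (500, 90), (1000, 293), (1500, 112), (1600, 21),
    (1610, 11), (1611, 10),
    (1612, 2168), (1613, 2167), (1614, 2166), (1615, 2165), (1616, 2164),
    (1617, 5046), (1618, 5045), (1619, 5044), (1620, 5043),
    (1630, 5033), (1640, 5023), (1650, 5013),
    (1700, 4963), (1800, 4863), (2000, 4663), (5000, 1663),
]


def crash_line(training, distance):
    """Absolute slope line where the run ended.

    ASSUMPTION: the live run begins immediately after the last training line.
    """
    return training + distance


def group_by_crash_line(runs):
    """Map each absolute crash line to the training sizes that ended there."""
    groups = {}
    for training, distance in runs:
        groups.setdefault(crash_line(training, distance), []).append(training)
    return groups


if __name__ == "__main__":
    for line, trainings in sorted(group_by_crash_line(WIDTH_21_RUNS).items()):
        print(f"slope line {line}: training sizes {trainings}")
```

Under that assumption, every run trained on 1617 to 5000 lines ends at the same absolute line (6663), the runs trained on 1612 to 1616 lines all end at 3780, and the runs trained on 1600 to 1611 lines all end at 1621. If the assumption holds, the shorter distance at 5000 training lines would reflect a later starting point on the same slope rather than a less accurate model. Whether it holds depends on how the game starts its live run.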

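
Patrick's sliding-window suggestion quoted above can be sketched in a few lines of plain Python. This is a minimal illustration, not NuPIC's own encoder API: `encode_position` and `overlap` are hypothetical helper names, and the linear mapping from position to window offset is an assumption. The 128-bit field with 28 active bits comes from the quoted message. Note that 28 of 128 bits is roughly 22% of this field (about 1.4% of a 2048-bit input).

```python
def encode_position(position, max_position, n_bits=128, n_active=28):
    """Encode an integer position as one contiguous run of 1s.

    Returns a list of n_bits values (0 or 1) containing exactly n_active
    ones. The run sits at the far left for position 0 and at the far right
    for position == max_position. ASSUMPTION: offsets scale linearly.
    """
    if max_position <= 0:
        raise ValueError("max_position must be positive")
    if not 0 <= position <= max_position:
        raise ValueError("position out of range")
    max_start = n_bits - n_active  # largest offset that keeps the run in bounds
    start = (position * max_start) // max_position
    return [1 if start <= i < start + n_active else 0 for i in range(n_bits)]


def overlap(a, b):
    """Number of bit positions that are active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


if __name__ == "__main__":
    # Width 21 gives skier positions 0..20 (hypothetical indexing).
    centre = encode_position(10, 20)
    print("".join(map(str, centre)))
    print("overlap with neighbour:", overlap(centre, encode_position(11, 20)))
    print("overlap with far cell: ", overlap(centre, encode_position(16, 20)))
```

With these parameters neighbouring positions share most of their active bits while distant positions share none, which is the property the quoted message is after. A plain integer carries no such notion of similarity unless the encoder supplies it.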