iterationCount is the number of rows of data to run through each model when swarming. The rows are read from the beginning of the file, and they are not repeated.
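For reference, here is a minimal swarm description sketch showing where iterationCount is set (the field names, value range, and file path below are placeholders; the keys follow the standard NuPIC swarm description format):

    # A minimal swarm description sketch. "timestamp", "value", the value
    # range, and the file path are placeholders for illustration only.
    import nupic.swarming.permutations_runner as permutations_runner

    SWARM_DESCRIPTION = {
        "includedFields": [
            {"fieldName": "timestamp", "fieldType": "datetime"},
            {"fieldName": "value", "fieldType": "float",
             "minValue": 0.0, "maxValue": 100.0},
        ],
        "streamDef": {
            "info": "value",
            "version": 1,
            "streams": [
                {"info": "my data",
                 "source": "file://my_data.csv",
                 "columns": ["*"]},
            ],
        },
        "inferenceType": "TemporalMultiStep",
        "inferenceArgs": {"predictionSteps": [1],
                          "predictedField": "value"},
        # Swarm over the first 3000 rows only; -1 would use the whole file.
        "iterationCount": 3000,
        "swarmSize": "medium",
    }

    # Runs the swarm and returns the best model params it found.
    model_params = permutations_runner.runWithConfig(
        SWARM_DESCRIPTION, {"maxWorkers": 4, "overwrite": True})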
If the nature of the data changes after row 3000, that change will not be incorporated into the swarm search. But if you think about it, the nature of the data would have to change dramatically for that to make a difference, and even then the HTM will learn the new patterns over time. The swarm is really just there to find the best model params for the "shape" of the data being processed. The shape is defined by the input fields, their min/max, intervals, etc. This usually holds for the lifetime of the data, and associations between fields will generally also persist for the lifetime of the data. You do want the 3000 rows you swarm over to be your best data sample, however.

Regards,
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Mon, Nov 2, 2015 at 6:33 AM, Wakan Tanka <[email protected]> wrote:
> Hello NuPIC,
>
> I was discussing with Matt on gitter a while ago
> https://gitter.im/numenta/public/archives/2015/08/19
> about the size of the "iterationCount" parameter. My original problem was
> that the swarming got stuck when I used "iterationCount": -1 on a huge
> data set (249459 lines). Matt said that everything over 3000 is waste, so
> setting "iterationCount": 3000 solved my problem. But two questions come
> to my mind:
>
> 1. What does "iterationCount" exactly mean? Is it how many rows (from the
> beginning of the file) will be involved in the swarming process, or how
> many times the swarm will iterate over the input until it finds the best
> model?
>
> 2. If it means the number of lines involved in the swarm process, isn't
> setting a fixed number (anything instead of -1) a problem? I mean, what if
> I have large data whose nature might change after the 3000th row; will
> swarming be able to handle this?
>
> Thank you
