I haven't touched NuPIC for a very long time, and I've never used swarming, but my suggestion would be to ditch swarming for now and focus on understanding the model and its parameters.
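As a rough illustration of why the encoder parameters are worth understanding, here is a toy scalar encoder in plain Python. It is a simplified sketch, not NuPIC's actual implementation: the parameter names n, w, minval and maxval mirror the ones usually found in scalar encoder settings, while the function names and the 0..1000 value range are made up for the example. It shows how the same two input values can look almost identical or clearly different to the model depending on n.

```python
# Simplified illustration of a scalar encoder; NOT NuPIC's implementation.
# n, w, minval and maxval mirror the usual encoder parameter names;
# everything else here is hypothetical.


def encode_scalar(value, minval, maxval, n, w):
    """Return a list of n bits containing one contiguous run of w ones.

    The start of the run moves linearly with `value` between minval and
    maxval. Values outside the range are clipped to the nearest bound.
    """
    if n < w or maxval <= minval:
        raise ValueError("need n >= w and maxval > minval")
    clipped = min(max(value, minval), maxval)
    n_buckets = n - w + 1  # number of distinct start positions
    fraction = (clipped - minval) / float(maxval - minval)
    start = int(round(fraction * (n_buckets - 1)))
    return [1 if start <= i < start + w else 0 for i in range(n)]


def overlap(a, b):
    """Count the bits that are active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical hourly visit counts assumed to lie in 0..1000.
    coarse = dict(minval=0, maxval=1000, n=50, w=21)
    fine = dict(minval=0, maxval=1000, n=400, w=21)
    for name, params in (("coarse", coarse), ("fine", fine)):
        a = encode_scalar(100, **params)
        b = encode_scalar(130, **params)
        print(name, "overlap between 100 and 130:", overlap(a, b))
```

With few bits, nearby values share almost all of their active bits, so the model can barely tell them apart; with more bits the same two values overlap much less. That is the kind of effect to look for when changing the encoder settings by hand.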
Since you're trying to do prediction, I'd start with the hotgym example, which successfully does the same kind of prediction, and modify the code to use your data instead (if that is not what you did already). Then, I'd fiddle with the parameters a bit manually (paying special attention to the scalar encoder) and see how the prediction is affected. If this fails to produce any interesting results, I really doubt swarming will find a magical combination of parameters that will solve your problem.

See two threads I created for more interesting tips (they also show NuPIC can do a lot better than that):

http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2013-October/001539.html (you can see the graphs by clicking on the links)
http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2013-October/001540.html

Good luck, and be sure to post your results here, even if it doesn't work out!

On Mon, May 26, 2014 at 10:36 AM, Marek Otahal <[email protected]> wrote:

> Hi Daniel,
>
> the 1-step prediction does not seem that bad to me. For readibility of the
> graph, you could plot just raw-k_step data (more images).
> You did not enclose CLA settings (description.py / model_params.py), also
> you could use swarming to get some improvement (if you don't do that
> already).
> For the predictions, 4k of data is really not that much (what is your
> expected "period" in the data? - a day, a week?), further you could:
> -use more data (eg by just repeating the 4k of data multiple times in the
> input stream).
> -for the "further ahead" predictions use aggregation
>
> Cheers, mark
>
>
> On Mon, May 26, 2014 at 2:13 PM, Daniel Cohen <[email protected]> wrote:
>
>> I asked a while ago - when I responded to further questions I didn't get
>> a further response, so here I am asking again.
>>
>> This is a data set taken from google analytics - it is almost 2 years of
>> web traffic data by date hour - that's about 4000 data points.
>>
>> I ran through the same process I used for the sine wave prediction
>> tutorial except I added more prediction steps. The attached .png is an
>> extract of the plot. Even the 1 step prediction is disappointing and I'd
>> expect better after 4000 data points. The 10 step prediction is almost
>> wholly unreliable and anything beyond that is useless. Sure the
>> predictions are within the right range of points but you couldn't base
>> anything useful off them. They don't even seem to have picked up on the
>> hourly modulation throughout a day.
>>
>> Is there any way to improve the prediction quality without simply using
>> more data points?
>>
>> search_def.json <http://justpaste.it/fhwk>
>>
>> visits.csv
>> <https://drive.google.com/file/d/0B8imHXOv0rGcSXVmREV6QTVlTjA/edit?usp=sharing>
>>
>> The visits are by datehour to increase the number of data points. I wasn't
>> expecting great things from the 1000 step prediction but I did want to
>> test it. The 10 step and 100 step I did expect much more from though.
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
> --
> Marek Otahal :o)
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

--
Pedro Tabacof
