Hi, I thought I would share some thoughts as I try to progress through the funnel outlined in the "NuPIC Consumer Engagement Strategy" at [0]. (As a side note, it is very interesting and quite refreshing to see a commercial organisation be explicit with strategy documents like these.)
Hopefully these notes will be of some value as Numenta refines the strategy. I haven't quite got to the bottom of the funnel, but hopefully I'll get there!

Watcher ⟹ Builder Obstacles <https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy#watcher--builder-obstacles>

Some things that have tripped me up which are not mentioned:

- I need to patch the source to work around the "linux3" value of sys.platform
- "source env.sh" is somewhat annoying to remember (I don't want to hard-code this in my .bashrc)
- building: why does NuPIC try to uninstall Python modules that it can find? Why not use system dependencies if they are there to use?

Builder ⟹ Experimenter Obstacles <https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy#builder--experimenter-obstacles>

NuPIC's documentation is great, but after hours of reading I still don't know how to give the CLA a CSV file (with the two extra headers) and ask it to start predicting results.

Reading through the source hasn't been as straightforward as I would have liked, and I've ended up spending quite a bit of time on it. Rather than start with hotgym.py, I have been searching down the rabbit hole of classes/methods used by OpfRunExperiment.py, given that is what I was executing to run the experiment.

[edit/update: after taking a look at hotgym.py, things actually look relatively simple to get started with; perhaps consider this a case of RTFM]

After spending some time in the source and noticing that Grok uses a database rather than CSV files, I feel that Numenta has perhaps added an obstacle for 3rd party developers here. Support for a wide variety of input formats could make it easy for people to experiment with their own data stored in data warehouses.
(Sorry, I can't remember where in the source tree I found this reference as I was reading today.)

Experimenter ⟹ Developer Obstacles <https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy#experimenter--developer-obstacles>

Some anecdotal evidence to support the barriers listed in this section.

Results interpretation is a bit tricky. When running hotgym for the first time, all I see in the terminal is the result metrics:

...
<JSON>
{
  "prediction:trivial:errorMetric='aae':steps=5:window=1000:field=consumption": 15.844123711340194,
  "prediction:trivial:errorMetric='altMAPE':steps=5:window=1000:field=consumption": 56.13599339610908,
  "multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=5:window=1000:field=consumption": 43.27516112448442,
  "prediction:trivial:errorMetric='altMAPE':steps=1:window=1000:field=consumption": 20.833911019329946,
  "prediction:trivial:errorMetric='aae':steps=1:window=1000:field=consumption": 5.86262833675565,
  "multiStepBestPredictions:multiStep:errorMetric='aae':steps=1:window=1000:field=consumption": 5.305007751770775,
  "multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=1:window=1000:field=consumption": 18.852305332800853,
  "multiStepBestPredictions:multiStep:errorMetric='aae':steps=5:window=1000:field=consumption": 12.214213466328994
}
</JSON>
...

It has taken me a long time to figure out that the actual prediction results are provided in /tmp/tmpXXXXX/generated_output.csv and that there is more output saved at /nta/logs/numenta-logs-username/OpfRunExperiment-NNNNNNNNNN-NNNN.log

The Numenta Python style guide and some of its practices do not feel familiar to Python programmers. I assume that the style guide is designed to help C++ programmers function efficiently in a dynamic language by preventing an overly large context switch. However, for me it steepens the learning curve, since I have to learn an unfamiliar style as well as the code itself.
Here are two examples that I've noticed myself wincing at:

- "_TaskRunner.getFieldInfo" (line 606 of experiment_runner.py) is not Pythonic. Python programmers would tend towards defining "_TaskRunner.fields"
- within logging, prefixing module paths with "com.numenta" is quite frustrating. I personally don't see why the namespacing is needed. The string is hard-coded in /nta/conf/default/nupic-logging.conf, so I assume it doesn't distinguish between the different code modules being run

I should note that it's very useful that NuPIC's code is so thoroughly documented. Thank you for showing such discipline.

I hope that these notes are useful!

Tim McNamara
@timClicks <http://twitter.com/timClicks> | timmcnamara.co.nz <http://timmcnamara.co.nz/>

[0] https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy
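
As a footnote to the "_TaskRunner.getFieldInfo" example above, here is a rough sketch of the two styles side by side. The class names and the "timestamp" field are made up for illustration and are not NuPIC's actual code; the real method may well return richer metadata than a tuple of names.

```python
# Hypothetical classes for illustration only; neither is taken from NuPIC.


class GetterStyleRunner(object):
    """Exposes its fields through an explicit getter method."""

    def __init__(self, field_names):
        self._field_names = list(field_names)

    def getFieldInfo(self):
        # The caller has to invoke a method to read the data.
        return tuple(self._field_names)


class PropertyStyleRunner(object):
    """Exposes the same data as a read-only attribute."""

    def __init__(self, field_names):
        self._field_names = list(field_names)

    @property
    def fields(self):
        # Reads like plain attribute access; no setter is defined,
        # so assigning to runner.fields raises AttributeError.
        return tuple(self._field_names)


getter_runner = GetterStyleRunner(["timestamp", "consumption"])
property_runner = PropertyStyleRunner(["timestamp", "consumption"])

getter_result = getter_runner.getFieldInfo()  # method call
property_result = property_runner.fields      # attribute access
```

Both variants return the same data. The property version just reads the way most Python programmers would expect, and can still be backed by a computation later without changing callers.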
