Very useful for us, Tim. Thanks for taking the time to write this up. I will 
attempt to address your comments after the (US) holiday weekend. 

Matt

On Sep 1, 2013, at 2:10 AM, Tim McNamara <[email protected]> wrote:

> Hi 
> 
> I thought I would provide some thoughts as I try to progress through the 
> funnel outlined in the "NuPIC Consumer Engagement Strategy" at [0] (as a 
> sidenote, it is very interesting and quite refreshing to see a commercial 
> organisation be explicit with strategy documents like these).
> 
> Hopefully these notes will be of some value as Numenta refines the strategy. 
> I haven't quite got to the bottom of the funnel, but hopefully I'll get there!
> 
> Watcher ⟹ Builder Obstacles
> 
> Some things that have tripped me up which are not mentioned:
> 
> - I need to patch the source to work around sys.platform returning "linux3".
> - "source env.sh" is somewhat annoying to remember (I don't want to hard code 
>   this in my .bashrc).
> - Building: why does NuPIC try to uninstall Python modules that it can find? 
>   Why not use system dependencies if they are there to use?
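
On the "linux3" point: some Python 2 builds derive sys.platform from the kernel's major version, so a 3.x kernel can report "linux3" where code expects "linux2". A minimal sketch of a version-independent check follows. The helper name is_linux is hypothetical and is not part of NuPIC.

```python
import sys


def is_linux(platform=None):
    """Return True for any Linux platform string ("linux", "linux2", "linux3").

    Hypothetical helper for illustration only; not part of NuPIC.
    """
    if platform is None:
        platform = sys.platform
    # Prefix matching avoids depending on the kernel major version that
    # some Python builds append to the platform name.
    return platform.startswith("linux")


if __name__ == "__main__":
    print("Running on Linux:", is_linux())
```

Comparing with startswith("linux") rather than equality is the idiom the Python documentation recommends for this check.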
> 
> 
> Builder ⟹ Experimenter Obstacles
> 
> NuPIC's documentation is great - but after hours of reading I still don't 
> know how to give the CLA a CSV file (with the two extra headers) and ask it 
> to start predicting results.
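
For readers unsure what the "two extra headers" look like, here is a rough sketch of the file layout. It assumes the second row holds field types and the third row holds special flags (for example "T" marking the timestamp column), as in the hotgym sample data. The function name, field names and values are made up for illustration, and the sketch covers only the file layout, not how to run a prediction.

```python
import csv
import io


def write_three_header_csv(stream, names, types, flags, rows):
    """Write a CSV with three header rows: names, types, flags.

    Hypothetical helper; the meaning of the second and third rows is an
    assumption based on the hotgym sample data.
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(names)   # row 1: field names
    writer.writerow(types)   # row 2: field types (assumed)
    writer.writerow(flags)   # row 3: special flags (assumed)
    writer.writerows(rows)   # remaining rows: the data itself


buf = io.StringIO()
write_three_header_csv(
    buf,
    names=["timestamp", "consumption"],
    types=["datetime", "float"],
    flags=["T", ""],
    rows=[
        ["2013-09-01 00:00:00", 5.3],
        ["2013-09-01 01:00:00", 5.5],
    ],
)
csv_text = buf.getvalue()
print(csv_text)
```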
> 
> Reading through the source hasn't been as straightforward as I would have 
> liked. I've ended up spending quite a bit of time on it. Rather than start with 
> hotgym.py, I have been searching down the rabbit hole of classes/methods used 
> by OpfRunExperiment.py, given that is what I was executing to run the 
> experiment.
> 
> [edit/update: after taking a look at hotgym.py, things actually look 
> relatively simple to get started with, perhaps consider this a case of RTFM]
> 
> 
> After spending some time in the source and noticing that Grok uses a 
> database rather than CSV files, I feel that Numenta has perhaps added an 
> obstacle to 3rd party developers here. Support for a wide variety of 
> input formats could make it easy for people to experiment with their own data 
> stored in data warehouses. (Sorry, I can't remember where in the source tree 
> I found this reference as I was reading today.)
> 
> 
> Experimenter ⟹ Developer Obstacles
> 
> Some anecdotal evidence to support the barriers listed in this section. 
> Results interpretation is a bit tricky. When running hotgym for the first 
> time, in the terminal, all I see is the result metrics:
> 
> ...
> <JSON>
> {
>     "prediction:trivial:errorMetric='aae':steps=5:window=1000:field=consumption": 15.844123711340194,
>     "prediction:trivial:errorMetric='altMAPE':steps=5:window=1000:field=consumption": 56.13599339610908,
>     "multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=5:window=1000:field=consumption": 43.27516112448442,
>     "prediction:trivial:errorMetric='altMAPE':steps=1:window=1000:field=consumption": 20.833911019329946,
>     "prediction:trivial:errorMetric='aae':steps=1:window=1000:field=consumption": 5.86262833675565,
>     "multiStepBestPredictions:multiStep:errorMetric='aae':steps=1:window=1000:field=consumption": 5.305007751770775,
>     "multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=1:window=1000:field=consumption": 18.852305332800853,
>     "multiStepBestPredictions:multiStep:errorMetric='aae':steps=5:window=1000:field=consumption": 12.214213466328994
> }
> </JSON>
> ...
> 
> 
> It has taken me a long time to figure out that the actual prediction results 
> are provided in /tmp/tmpXXXXX/generated_output.csv and that there is more 
> output saved at 
> /nta/logs/numenta-logs-username/OpfRunExperiment-NNNNNNNNNN-NNNN.log
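
The metric labels in the JSON above are colon-separated, so they can be unpacked with a few lines of Python to make the numbers easier to compare. This is only a sketch: the function name is hypothetical, and reading the first two components as an inference element and a metric name is an assumption drawn from the labels themselves, not from the NuPIC source.

```python
def parse_metric_label(label):
    """Split a metric label such as
    "prediction:trivial:errorMetric='aae':steps=5:window=1000:field=consumption"
    into a dictionary. Hypothetical helper, not part of NuPIC.
    """
    parts = label.split(":")
    parsed = {
        "inference_element": parts[0],  # assumed meaning of component 1
        "metric_name": parts[1],        # assumed meaning of component 2
    }
    # The remaining components are key=value pairs; string values are
    # wrapped in single quotes, which are stripped here.
    for part in parts[2:]:
        key, _, value = part.partition("=")
        parsed[key] = value.strip("'")
    return parsed


# Two entries copied from the output quoted above.
metrics = {
    "prediction:trivial:errorMetric='aae':steps=1:window=1000:field=consumption":
        5.86262833675565,
    "multiStepBestPredictions:multiStep:errorMetric='aae':steps=1:window=1000:field=consumption":
        5.305007751770775,
}

for label, value in sorted(metrics.items()):
    info = parse_metric_label(label)
    print(info["inference_element"], info["errorMetric"], info["steps"], value)
```

Grouping by errorMetric and steps puts the "trivial" and "multiStep" figures side by side for the same field and window.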
> 
> 
> The Numenta Python style guide and some of its practices are unfamiliar to 
> Python programmers. I assume that the style guide is designed to help C++ 
> programmers work efficiently in a dynamic language by preventing an overly 
> large context switch. However, to me it steepens the learning curve, as I 
> have to learn a new style.
> 
> Here are two examples that I've noticed myself wincing at:
> 
> - "_TaskRunner.getFieldInfo" (line 606 of experiment_runner.py) is not 
>   Pythonic. Python programmers would tend towards defining 
>   "_TaskRunner.fields".
> - Within logging, prefixing module paths with "com.numenta" is quite 
>   frustrating. I personally don't see why the namespacing is needed. The 
>   string is hard coded in /nta/conf/default/nupic-logging.conf, so I assume 
>   it doesn't distinguish between different code modules being run.
> 
> I should note that it's very useful that NuPIC's code is so very thoroughly 
> documented. Thank you for showing such discipline.
> 
> I hope that these notes are useful!
> 
> 
> 
> Tim McNamara
> @timClicks | timmcnamara.co.nz
> 
> [0] https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy
> 
> 
> 
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org