Response inline.
On Sun, Sep 1, 2013 at 2:10 AM, Tim McNamara <[email protected]> wrote:

> Hi
>
> I thought I would provide some thoughts as I try to progress through the
> funnel outlined in the "NuPIC Consumer Engagement Strategy" at [0] (as a
> sidenote, it is very interesting and quite refreshing to see a commercial
> organisation be explicit with strategy documents like these).
>
> Hopefully these notes will be of some value as Numenta refines the
> strategy. I haven't quite got to the bottom of the funnel, but hopefully
> I'll get there!
>
> Watcher ⟹ Builder Obstacles
> <https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy#watcher--builder-obstacles>
>
> Some things that have tripped me up which are not mentioned
>
> - I need to patch the source to work around the "linux3" result from
>   sys.platform
> - "source env.sh" is somewhat annoying to remember (don't want to hard
>   code this in my .bashrc)
> - building - why does NuPIC try to uninstall Python modules that it
>   can find? Why not use system dependencies if they are there to use?
>
> Builder ⟹ Experimenter Obstacles
> <https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy#builder--experimenter-obstacles>
>
> NuPIC's documentation is great - but after hours of reading I still don't
> know how to give the CLA a CSV file (with the two extra headers) and ask it
> to start predicting results.

There's no reason you have to read in sensorRecords from a CSV file
exclusively. You can just as easily generate the data from whatever
source you find useful. Take a look at the code from my hackathon
experiment:

https://github.com/ravaa/nupic/blob/master/predipic/runtest.py

What's fed into the model instance is just a dict mapping the record
headers to the data. I could have (and would have, if I hadn't spent the
time getting a web console running) been generating those records from a
Python generator fed from sensors (something I'm poking at now), a
socket, etc.
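To make that concrete, here is a rough sketch of the generator idea. It is
not taken from runtest.py: EchoModel is a made-up stand-in so the snippet
runs without NuPIC installed, the "timestamp" field name is my assumption
("consumption" is the hotgym field), and the random readings stand in for
a real sensor or socket. I believe the OPF model is also driven one record
at a time through a run method, but check hotgym.py rather than take my
word for it.

```python
import random
from datetime import datetime, timedelta


def sensor_records(count, start=None, seed=0):
    """Yield one dict per reading, keyed by field name.

    Stand-in for a sensor or socket: values are random, not real data.
    """
    rng = random.Random(seed)
    timestamp = start or datetime(2013, 9, 1)
    for _ in range(count):
        yield {
            "timestamp": timestamp,
            "consumption": round(rng.uniform(5.0, 50.0), 2),
        }
        timestamp += timedelta(hours=1)


class EchoModel:
    """Hypothetical stand-in for a model object; NOT NuPIC code.

    It "predicts" the value it was just given, only so that the
    feeding loop below has something to call.
    """

    def __init__(self, predicted_field):
        self.predicted_field = predicted_field

    def run(self, record):
        return {"input": record, "prediction": record[self.predicted_field]}


def feed(model, records):
    """Push records into the model one at a time and collect the results."""
    return [model.run(record) for record in records]


results = feed(EchoModel("consumption"), sensor_records(3))
for result in results:
    print(result["input"]["timestamp"], result["prediction"])
```

The point is only that feed() doesn't care where the dicts come from, so
swapping the generator for a CSV reader, a socket or a sensor loop leaves
the rest alone.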

> Reading through the source hasn't been as straightforward as I would have
> liked. I've ended up spending quite a bit of time. Rather than start with
> hotgym.py, I have been searching down the rabbit hole of classes/methods
> used by OpfRunExperiment.py, given that is what I was executing to run the
> experiment.
>
> [edit/update: after taking a look at hotgym.py, things actually look
> relatively simple to get started with, perhaps consider this a case of RTFM]
>
> After spending some time in the source and noticing that Grok uses a
> database, rather than CSV files, I feel that Numenta has perhaps added an
> obstacle here to 3rd party developers. Support for a wide variety of
> input formats could make it easy for people to experiment with their own
> data stored in data warehouses. (sorry, I can't remember where in the
> source tree I found this reference as I was reading today)
>
> Experimenter ⟹ Developer Obstacles
> <https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy#experimenter--developer-obstacles>
>
> Some anecdotal evidence to support the barriers listed in this section.
> Results interpretation is a bit tricky. When running hotgym for the first
> time, in the terminal, all I see is the result metrics:
>
> ...
> <JSON>
> {
>   "prediction:trivial:errorMetric='aae':steps=5:window=1000:field=consumption":
>     15.844123711340194,
>   "prediction:trivial:errorMetric='altMAPE':steps=5:window=1000:field=consumption":
>     56.13599339610908,
>   "multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=5:window=1000:field=consumption":
>     43.27516112448442,
>   "prediction:trivial:errorMetric='altMAPE':steps=1:window=1000:field=consumption":
>     20.833911019329946,
>   "prediction:trivial:errorMetric='aae':steps=1:window=1000:field=consumption":
>     5.86262833675565,
>   "multiStepBestPredictions:multiStep:errorMetric='aae':steps=1:window=1000:field=consumption":
>     5.305007751770775,
>   "multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=1:window=1000:field=consumption":
>     18.852305332800853,
>   "multiStepBestPredictions:multiStep:errorMetric='aae':steps=5:window=1000:field=consumption":
>     12.214213466328994
> }
> </JSON>
> ...
>
> It has taken me a long time to figure out that the actual prediction
> results are provided in /tmp/tmpXXXXX/generated_output.csv and that there
> is more output saved at
> /nta/logs/numenta-logs-username/OpfRunExperiment-NNNNNNNNNN-NNNN.log
>
> The Numenta Python style guide and some of the practices do not lend
> themselves to being familiar to Python programmers. I assume that the style
> guide is designed to assist C++ programmers function efficiently in a
> dynamic language, by preventing an overly large context switch. However, to
> me it presents a few degrees of learning curve gradient as I attempt to
> learn a style.
>
> Here are two examples that I've noticed myself wincing at:
>
> - "_TaskRunner.getFieldInfo" (line 606 of experiment_runner.py) is not
>   Pythonic. Python programmers would tend towards defining
>   "_TaskRunner.fields"
> - within logging, prefixing module paths with "com.numenta" is quite
>   frustrating. I personally don't see why the namespacing is needed. The
>   string is hard coded in /nta/conf/default/nupic-logging.conf, so I assume
>   it doesn't distinguish between different code modules being run
>
> I should note that it's very useful that NuPIC's code is so very
> thoroughly documented. Thank you for showing such discipline.
>
> I hope that these notes are useful!
>
> Tim McNamara
> @timClicks <http://twitter.com/timClicks> | timmcnamara.co.nz
>
> [0] https://github.com/numenta/nupic/wiki/NuPIC-Consumer-Engagement-Strategy
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
