Thanks, Tim.  This is great feedback, and I appreciate the suggestions.  I've 
addressed a few of your points below.

On Sep 1, 2013, at 2:10 AM, Tim McNamara <[email protected]> wrote:

> building - why does NuPIC try to uninstall Python modules that it can find? 
> Why not use system dependencies if they are there to use?
We have some items in the current sprint that address this specifically.  I can 
provide some historical context, however.  Before NuPIC was released as free 
software, the installed base was relatively small -- Numenta engineers and 
internal infrastructure.  It was easier for us to simply bundle the externals 
inline with NuPIC since we had full control over the entire process.  We've 
since moved to a model where external dependencies are managed with pip and 
setuptools, but with strict requirements on the source and versions of those 
dependencies.  As a result, if there is a conflict of versions, pip will 
attempt to remove packages which would otherwise conflict at runtime, which can 
be problematic if such packages are installed at the system level.  For now, 
you will need to forcefully remove any conflicting packages that are installed 
at the system level, or use something like virtualenv to isolate your 
environment.

You can track the progress of this in these JIRAs: 

https://issues.numenta.org/browse/NPC-303
https://issues.numenta.org/browse/NPC-304
https://issues.numenta.org/browse/NPC-305
https://issues.numenta.org/browse/NPC-306
https://issues.numenta.org/browse/NPC-308
https://issues.numenta.org/browse/NPC-316
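
On the isolation workaround mentioned above, here is a minimal sketch of 
creating an environment that ignores system-level packages.  It uses the 
"venv" module from the Python 3 standard library, which plays the same role as 
virtualenv; the function name "create_isolated_env" and the directory name are 
placeholders rather than anything in NuPIC, and the interpreter version NuPIC 
itself requires may call for virtualenv proper instead.

```python
import os
import tempfile
import venv


def create_isolated_env(env_dir):
    """Create a virtual environment that ignores system site-packages.

    Returns the path of the pyvenv.cfg file written into the environment.
    """
    # system_site_packages=False keeps system-level packages out of the
    # environment, so pip has nothing installed there to conflict with.
    # with_pip=False keeps this sketch quick; pass True to bootstrap pip.
    venv.create(env_dir, system_site_packages=False, with_pip=False)
    return os.path.join(env_dir, "pyvenv.cfg")


if __name__ == "__main__":
    # Placeholder location; any writable directory works.
    target = os.path.join(tempfile.mkdtemp(), "nupic-env")
    print("created", create_isolated_env(target))
```

Dependencies installed with the environment's own pip then stay inside that 
directory and leave the system-level packages alone.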

> After spending some time in the source and noticing that Grok uses a 
> database, rather than CSV files, I feel that Numenta has perhaps added an 
> obstacle to 3rd party developers here. Support for a wide variety of 
> input formats could make it easy for people to experiment with their own data 
> stored in data warehouses. (sorry, I can't remember where in the source tree 
> I found this reference as I was reading today)  

NuPIC only requires a database (MySQL) for swarming.  There is some interest in 
providing a SQLite abstraction, however.  Meanwhile, the CSV files are only 
used at a higher level in the examples (with some NuPIC-provided utilities) -- 
iterate through a CSV file, feed data to the model, save the results back out 
to a file.  There's no reason we couldn't build similar adapters for any data 
source.
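
To make the adapter idea concrete, here is a rough sketch of the "iterate, 
feed, collect" loop with two interchangeable record sources, one CSV and one 
SQLite.  Everything in it is hypothetical: "EchoModel", "csv_records", 
"sqlite_records" and "feed" are illustrative names, not NuPIC classes or 
utilities, and the stand-in model only tags each record rather than predicting 
anything.

```python
import csv
import io
import sqlite3


class EchoModel(object):
    """Hypothetical stand-in for a model; not a NuPIC class."""

    def run(self, record):
        # A real model would return a prediction; this one tags the record.
        result = dict(record)
        result["seen"] = True
        return result


def csv_records(handle):
    """Yield one dict per CSV row, keyed by the header row."""
    for row in csv.DictReader(handle):
        yield row


def sqlite_records(connection, query):
    """Yield one dict per row returned by a SQL query."""
    cursor = connection.execute(query)
    names = [column[0] for column in cursor.description]
    for row in cursor:
        yield dict(zip(names, row))


def feed(model, records):
    """Push records from any source through the model, collecting results."""
    return [model.run(record) for record in records]


if __name__ == "__main__":
    source = io.StringIO(u"timestamp,value\n1,5.0\n")
    print(feed(EchoModel(), csv_records(source)))

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (timestamp INTEGER, value REAL)")
    db.execute("INSERT INTO readings VALUES (1, 5.0)")
    query = "SELECT timestamp, value FROM readings"
    print(feed(EchoModel(), sqlite_records(db, query)))
```

Because "feed" only needs an iterable of dicts, supporting another data source 
comes down to writing one more generator.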

> The Numenta Python style guide and some of the practices do not lend 
> themselves to being familiar to Python programmers. I assume that the style 
> guide is 
> designed to assist C++ programmers function efficiently in a dynamic 
> language, by preventing an overly large context switch. However, to me it 
> presents a few degrees of learning curve gradient as I attempt to learn a 
> style.
> 
> Here are two examples that I've noticed myself wincing at:
> 
> "_TaskRunner.getFieldInfo" (line 606 of experiment_runner.py) is not 
> Pythonic. Python programmers would tend towards defining 
> "_TaskRunner.fields".
> 
> Within logging, prefixing module paths with "com.numenta" is quite 
> frustrating. I personally don't see why the namespacing is needed. The string 
> is hard coded in /nta/conf/default/nupic-logging.conf, so I assume it doesn't 
> distinguish between different code modules being run.
> 
> I should note that it's very useful that NuPIC's code is so very thoroughly 
> documented. Thank you for showing such discipline.

Officially, our Python style guide 
(https://github.com/numenta/nupic/wiki/Python-Style-Guide) is an extension of 
PEP-8 and PEP-257, which are commonly accepted within the broader Python 
community.  In practice, however, much of the codebase was written by 
developers with backgrounds primarily in Java and C++, which is why you will 
find examples like the ones you've pointed out.
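
For anyone considering a pull request along these lines, the contrast between 
the getter style and the attribute style might look roughly like this.  Both 
classes are hypothetical illustrations; neither is the actual _TaskRunner from 
experiment_runner.py, whose getFieldInfo may well do more than return a stored 
value.

```python
class TaskRunnerJavaStyle(object):
    """Hypothetical class exposing data through an explicit getter."""

    def __init__(self, fields):
        self._fields = list(fields)

    def getFieldInfo(self):
        # Callers write runner.getFieldInfo()
        return self._fields


class TaskRunnerPythonic(object):
    """Hypothetical class exposing the same data as a read-only property."""

    def __init__(self, fields):
        self._fields = list(fields)

    @property
    def fields(self):
        # Callers write runner.fields; assignment raises AttributeError
        return self._fields


if __name__ == "__main__":
    print(TaskRunnerJavaStyle(["timestamp"]).getFieldInfo())
    print(TaskRunnerPythonic(["timestamp"]).fields)
```

A property keeps the attribute read-only while still leaving room to compute 
the value later without changing callers.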

As usual, Pull Requests welcome!
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
