Re: [nupic-dev] Newbie question

Subutai Ahmad Tue, 16 Jul 2013 08:52:21 -0700

Some historical color on this discussion:

The CLA is complex enough that over the years we found it extremely useful
to have two parallel implementations (Python and C++) of the temporal
pooler. This has been very helpful not only in quickly prototyping changes,
but also in debugging. It is very tricky to debug the temporal pooler
(and most machine learning algorithms). If nothing crashes, and you are
getting a "reasonable" answer, how do you know the code is correct?


To help with this we took great pains to make sure the two copies match
exactly. We even make both use the same random number generator. In our
automated tests we feed in a number of input sequences using the same seed,
and ensure they both return the same results, end up with the same synapses
and connections, etc. With this process we have found and fixed some really
tricky bugs in both implementations.
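
A minimal sketch of that kind of same-seed comparison test, in Python. The
two toy classes (ReferenceLearner, FastLearner), the helper names, and all
sizes are hypothetical stand-ins for illustration. This is not NuPIC code
and not the actual temporal pooler; it only shows the shape of the check:
same seed, same inputs, then compare outputs at every step and the learned
state at the end.

```python
import random


class ReferenceLearner:
    """Hypothetical readable version: counts stored in a dict keyed by (cell, bit)."""

    def __init__(self, n_cells, n_inputs, seed):
        self.n_cells = n_cells
        self.n_inputs = n_inputs
        self.rng = random.Random(seed)  # seeded, so runs are repeatable
        self.synapses = {}

    def step(self, active_bits):
        winners = []
        for bit in sorted(active_bits):  # fixed order keeps RNG draws aligned
            cell = self.rng.randrange(self.n_cells)
            key = (cell, bit)
            self.synapses[key] = self.synapses.get(key, 0) + 1
            winners.append(cell)
        return winners

    def state(self):
        return sorted(self.synapses.items())


class FastLearner:
    """Hypothetical 'optimized' version: same behavior, flat list storage."""

    def __init__(self, n_cells, n_inputs, seed):
        self.n_cells = n_cells
        self.n_inputs = n_inputs
        self.rng = random.Random(seed)  # same generator type and seed
        self.counts = [0] * (n_cells * n_inputs)

    def step(self, active_bits):
        winners = []
        for bit in sorted(active_bits):
            cell = self.rng.randrange(self.n_cells)
            self.counts[cell * self.n_inputs + bit] += 1
            winners.append(cell)
        return winners

    def state(self):
        # Index order equals (cell, bit) order, matching the reference's sort.
        return [((i // self.n_inputs, i % self.n_inputs), c)
                for i, c in enumerate(self.counts) if c]


def make_sequences(seed, n_sequences, length, n_inputs, n_active):
    """Build repeatable input sequences; each step is a set of active bits."""
    rng = random.Random(seed)
    return [[set(rng.sample(range(n_inputs), n_active)) for _ in range(length)]
            for _ in range(n_sequences)]


def run_differential_test(seed=42, n_cells=8, n_inputs=16):
    """Feed both versions the same inputs and require identical behavior."""
    ref = ReferenceLearner(n_cells, n_inputs, seed)
    fast = FastLearner(n_cells, n_inputs, seed)
    for seq in make_sequences(seed + 1, 5, 10, n_inputs, 4):
        for t, active in enumerate(seq):
            out_ref = ref.step(active)
            out_fast = fast.step(active)
            assert out_ref == out_fast, "outputs diverged at step %d" % t
    # Matching outputs are not enough: the learned state must match too.
    assert ref.state() == fast.state(), "final synapse state differs"
    return True


if __name__ == "__main__":
    run_differential_test()
```

Comparing the internal state as well as the outputs is the important part:
two versions can agree on every output for a while and still have drifted
apart internally.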

If you examine the two implementations, they will look very different
because the C++ focuses on speed (the optimization techniques are cool and
a topic for a separate post if there is interest! :-). However, they both
generate identical results.

Currently the code has years of abuse on it, and is not fully C++. I really
hope we can clean it up and develop a pure C++ implementation of the
spatial pooler and temporal pooler. This way the code can be super fast and
easily support multiple languages.  I totally agree with Erik, etc. and
would definitely vote to continue to have a parallel Python reference
implementation.

--Subutai


On Tue, Jul 16, 2013 at 6:30 AM, Archie, Kevin <[email protected]> wrote:

>  +1
>
>  I try to stay out of language/religious discussions (my own preferences
> are eclectic) but I'm very much in favor of keeping the high-level Python
> code current, mostly because that forces some architectural flexibility
> that is likely to be the first casualty of a full conversion to C++. I
> suspect that concurrent and probably even distributed processing will
> become important as people start building multi-region models and expecting
> them to respond in finite time. One *can* do parallel programming in C++,
> but you probably don't want to start there.
>
>    - Kevin
>
>  p.s. but gosh, it would be nice to have the Python part of the build
> cleaned up.
>
>  p.p.s. I've been contemplating an update to Greenspun's tenth rule:  Any
> sufficiently complicated distributed C/C++ program contains an ad hoc,
> informally-specified, brittle, unreliable implementation of half of Erlang.
>
>
>  On Jul 16, 2013, at 7:23 AM, David Ragazzi wrote:
>
>   Hi Erik and Scott,
>
>  >>Erik wrote:
>  >>Python bindings significantly lower the barrier to entry and adoption,
> in my opinion. People who don't know python very well, can still do
> something with nupic and get some value out of the predictions relatively
> quickly, as I believe was evidenced by the hackathon.
>
>  Yes, you are right. Python is well known as an excellent language for
> prototyping, which allows scientists (without a computing background) to
> run experiments. So I think we could use Python as a prototyping tool, and
> C++ as the functional code (but not integrated one-to-one).
>
>  From what I understood, Scott says the plan is to keep synchronized
> versions for Python and C++, with automated tools to check for
> inconsistencies. My suggestion is to keep synchronized versions, but with
> Python as just a design tool and C++ as the functional code, i.e. dropping
> the dependency between them and thus dropping several tools that support
> the integration.
>
>  The development process could be:
>
>  - Any new feature should FIRST be written and tested in Python code.
> This way, people interested in research wouldn't need to know C++ and other
> tools to get used to the CLA.
>  - Once these new features are accepted and well integrated in the Python
> code, the C++ code assimilates the changes. This way, people interested in
> taking advantage of the code wouldn't need to know Python and integration
> tools.
>  - The C++ project WOULD NOT directly receive changes to business logic.
> The allowed changes would be bug fixes, optimizations, etc.
>
>  In other words, Python would play a role in the analysis/design phases, as
> UML and other tools do, while C++ would play a role in the implementation
> phase, i.e. the real code. My suggestion, of course.
>
>  My best wishes, David
>
>
>
>
>  On 16 July 2013 05:03, Erik Blas <[email protected]> wrote:
>
>> Python bindings significantly lower the barrier to entry and adoption, in
>> my opinion. People who don't know python very well, can still do something
>> with nupic and get some value out of the predictions relatively quickly, as
>> I believe was evidenced by the hackathon.
>>
>>
>> On Mon, Jul 15, 2013 at 10:58 PM, Scott Purdy <[email protected]> wrote:
>>
>>>  We intend to keep both the Python and C++ versions. The Python version
>>> will probably always use some swigged C++ code such as the sparse matrix in
>>> the spatial pooler but the C++ version will be pure, portable C++. And we
>>> will also swig the C++ implementation for different languages and run it
>>> against the same Python test suite to ensure they stay in sync functionally.
>>>  That is the plan now at least. It may change so don't hold me to it.
>>> And do let me know if you have a strong opinion for or against it.
>>>  On Jul 15, 2013 9:43 PM, "Erik Blas" <[email protected]> wrote:
>>>
>>>> I always get a kick out of that tool's name.
>>>>
>>>>  I am sad to see the Python support go, as it made for quick
>>>> prototyping of projects, but I understand why dropping it would also
>>>> make the platform more portable as a whole.
>>>>
>>>>  Perhaps I'll get to the point of keeping a python port.
>>>>
>>>     _______________________________________________
>
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>
>

