Hi Brian,

Thanks! This has been (it's not over yet) both a labor of love and a severe
push, largely for some of the same reasons you pointed out. This is my
first foray into Python, so I had all the same feelings of disorientation
and intimidation. I (as well as Numenta's flag bearer, Matt Taylor) could
also foresee, as you do, the enormous opportunity to introduce HTM theory
to a large group of developers by providing a Java version.

I have to admit, most of my effort and focus have been on getting a
thorough and tested Java version of NuPIC up and running as quickly as
possible. As a result, the only comparisons I have made are to the Python
version. As one would expect, the Java version is orders of magnitude
faster than the Python version (which mostly exists as a research platform
and as a knowledge-transfer platform for new users, because new ideas can
be implemented in it quickly). I have not (yet) had a chance to make any
comparisons between the Java and C++ versions. However, it is my goal to
make sure the Java version is at least competitive with the C++ version (if
not faster, as it could well be in a long-running, primed JVM). The
emphasis, however, is to augment the utility of NuPIC in general and to
introduce as many people as possible to these technologies, because they
are a unique and important contribution to the field of machine learning.
That is why I'm doing this!

Regards,
David

On Sun, Oct 19, 2014 at 11:31 AM, Brian Eppert <[email protected]>
wrote:

> Very impressive, must have taken a lot of determination, nice work!
>
> It’s great to see the Java port is more strongly typed. One of the
> scariest parts for me when looking at the Python code was the wealth of
> configuration parameters as (mis-typable, unconstrained) strings and
> arrays. It seems more surmountable as a neophyte to use an IDE that can
> compile and flag bad values, and provide code completion, in-place
> documentation, or “go to definition” capabilities.
>
> Another win is that having this in Java allows for native use by the other
> JVM-hosted languages like Groovy, Scala, Clojure, JRuby, etc. That makes it
> accessible to quite a few more developers, and with Java’s strong
> cross-platform support a ton of avenues of use open up.
>
> That is all wonderful, but I’m bracing myself as I ask this: what have
> you seen as far as performance compared to the NuPIC Python and C++ code?
>
>
> On Oct 18, 2014, at 10:37 AM, cogmission1 . <[email protected]>
> wrote:
>
> Hi Everybody,
>
> After 2 (looooooooong) months we finally have usable NuPIC functionality
> in Java!
>
> Repo: https://github.com/numenta/htm.java
> Wiki: https://github.com/numenta/htm.java/wiki
> Twitter: https://twitter.com/search?q=%23HtmJavaDevUpdates&src=typd
>
> Here's a blurb describing the goals, and future plans for the project:
>
> ======
>
> Throughout the development of the TemporalMemory and the SpatialPooler,
> there was an emphasis on keeping a 1-to-1 correlation between the methods
> and functions implementing each algorithm in the Java and Python versions.
> To this end, I would say that 98% of the Python tests in each module have
> the *exact* same output produced within the Java unit tests and integration
> tests. The only place where they differ is in places where calls to an
> underlying RandomNumberGenerator have a significant impact - however, even
> in those places, every other aspect of the code output is carefully
> monitored to ensure that had certain initial parameters been the same, the
> two versions (Python and Java) would produce the exact same output. This
> was achieved by altering the Python tests temporarily to be initialized
> with the same values that the Java version was initialized with - and
> making sure the output produced was the same!
>
> Additionally, a utility object (ArrayUtils) was created to bridge the gap
> left by functionality that is native to Python but doesn't exist in Java,
> and the SparseMatrix (and its subclasses, SparseBinaryMatrix and
> SparseObjectMatrix) was created to handle array shaping and vector math
> operations.
>
> There are a few architectural differences in the Java version. One is the
> abstraction of objects represented in the Python version as arrays and
> array containers into formal Objects in the Java version. Another is that
> all methods in the Java version are "functional" in that the data they
> operate on is passed in, and no state is kept in either the TemporalMemory
> or the SpatialPooler classes. The "Connections" class (inspired by Chetan's
> Connections object) acts like an isolated memory - containing all state.
> This means that two distinct Connections objects (memories) could be passed
> to the TM or SP, manipulating two entirely different layers *concurrently*
> or in parallel.
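
The "functional" design described above can be illustrated with a minimal
sketch. The names below (StatelessLayerSketch, Memory, Algorithm) are
hypothetical stand-ins and are not the actual htm.java classes or
signatures; Memory only plays the role that Connections plays in the text,
and Algorithm plays the role of a stateless TM or SP.

```java
import java.util.HashSet;
import java.util.Set;

public class StatelessLayerSketch {

    /** Hypothetical stand-in for Connections: holds ALL mutable state. */
    static final class Memory {
        final Set<Integer> activeCells = new HashSet<>();
        int iteration = 0;
    }

    /** Hypothetical stateless algorithm: no fields, state is passed in. */
    static final class Algorithm {
        void compute(Memory memory, int[] activeColumns) {
            // Every read and write goes through the Memory argument.
            memory.activeCells.clear();
            for (int column : activeColumns) {
                memory.activeCells.add(column);
            }
            memory.iteration++;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Algorithm algorithm = new Algorithm(); // one shared, stateless instance
        Memory layerA = new Memory();          // two independent "layers"
        Memory layerB = new Memory();

        // Each thread touches only its own Memory, so no locking is needed.
        Thread a = new Thread(() -> algorithm.compute(layerA, new int[] {1, 2, 3}));
        Thread b = new Thread(() -> algorithm.compute(layerB, new int[] {7, 8}));
        a.start();
        b.start();
        a.join(); // join() makes each thread's writes visible here
        b.join();

        System.out.println("layer A cells: " + layerA.activeCells);
        System.out.println("layer B cells: " + layerB.activeCells);
    }
}
```

Because the algorithm object has no fields of its own, sharing one instance
across threads is safe as long as each thread is handed a different memory
object.
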
>
>
> Roadmap:
>
> At this point the SpatialPooler can be connected to the TemporalMemory to
> produce output within a given Java project, since those two classes
> represent the major inference functionality of NuPIC. However, in order to
> exactly reproduce the convenience of the Online Prediction Framework, other
> structures would need to be implemented, so those are next on the list.
> The anticipated roadmap is as follows:
>
> 1.) Create the BaseEncoder and the derivative encoders which are currently
> relevant (since one or two may have become obsolete), culminating, I
> assume, in the GEOSpatialEncoder.
>
> 2.) Classifiers will be next on the list, which will complete the
> current hierarchy of functionality.
>
> 3.) Following this, Layer and Regional constructs will be created to
> coordinate and manage data flow in this hierarchy.
>
> 4.) Then we'll loop around and take a look at which new "Research"
> sensorimotor-based developments can be formally pulled in to guide the
> reshaping of the Java version into a form that reflects the most current
> theory.
>
> 5.) Then we'll do an optimization/performance pass over the entire
> codebase to make it at least as fast as whatever C++ version is available.
>  (*wink*)
>
>
>
>
> --
> *We find it hard to hear what another is saying because of how loudly "who
> one is", speaks...*
>
>
>


