A little over a year ago I started implementing some of the ideas from NuPIC in Java. I'm sure what you're doing is truer to the Python implementation, but some of the ideas in what I did may be useful to you. I was specifically going after high performance: colocating data in memory to avoid cache misses, flyweight objects (allocating a short-lived object in a JVM is cheap; long-lived objects are expensive), avoiding mutability to eliminate various flavors of bug, a visitor-based API similar to javac's, and separating data storage from the objects that represent HTM concepts such as cells or dendrite segments.
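To give a flavor of the flyweight-over-shared-storage idea, here is a minimal sketch written purely for illustration. It is not code from the repository, and the names SegmentStore, Segment and SynapseVisitor are made up; it assumes every segment has the same fixed number of synapses, which a real implementation may not.

```java
// Hypothetical sketch: class and method names are illustrative only,
// not taken from jhtm or NuPIC.

/** Owns the data: every synapse permanence lives in one contiguous array. */
final class SegmentStore {
    private final float[] permanences;
    private final int synapsesPerSegment;

    SegmentStore(int segmentCount, int synapsesPerSegment) {
        if (segmentCount < 0 || synapsesPerSegment <= 0) {
            throw new IllegalArgumentException("bad dimensions");
        }
        this.synapsesPerSegment = synapsesPerSegment;
        this.permanences = new float[segmentCount * synapsesPerSegment];
    }

    int segmentCount() {
        return permanences.length / synapsesPerSegment;
    }

    int synapsesPerSegment() {
        return synapsesPerSegment;
    }

    /** Creates a short-lived view; it holds an index, not a copy of the data. */
    Segment segment(int index) {
        if (index < 0 || index >= segmentCount()) {
            throw new IndexOutOfBoundsException("segment " + index);
        }
        return new Segment(this, index);
    }

    float permanence(int segment, int synapse) {
        return permanences[segment * synapsesPerSegment + synapse];
    }

    void setPermanence(int segment, int synapse, float value) {
        permanences[segment * synapsesPerSegment + synapse] = value;
    }
}

/** Immutable flyweight standing in for one dendrite segment. */
final class Segment {
    /** Visitor callback invoked once per synapse on the segment. */
    interface SynapseVisitor {
        void visit(int synapse, float permanence);
    }

    private final SegmentStore store;
    private final int index;

    Segment(SegmentStore store, int index) {
        this.store = store;
        this.index = index;
    }

    /** Walks this segment's synapses, which sit next to each other in memory. */
    void visitSynapses(SynapseVisitor visitor) {
        for (int i = 0; i < store.synapsesPerSegment(); i++) {
            visitor.visit(i, store.permanence(index, i));
        }
    }

    /** Counts synapses whose permanence is at or above the threshold. */
    int connectedCount(float threshold) {
        int[] count = {0};
        visitSynapses((synapse, permanence) -> {
            if (permanence >= threshold) {
                count[0]++;
            }
        });
        return count[0];
    }
}
```

You'd use it along the lines of `store.segment(1).connectedCount(0.2f)`: the Segment is created on demand and thrown away, while the only long-lived object is the store and its single array.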
It was a combination of an exercise to understand the problem space better and a look at ways you'd design such a thing for high performance. I got the basics in place, and then got too busy with paid work to bring it to the point where it did something useful (there are unit tests of the basic pieces). It might only be useful for inspiration, but in case any of the code is useful, I just published it on GitHub with the same license you're using: https://github.com/timboudreau/jhtm -Tim
