Re: htm.java config questions

Takenori Sato Wed, 04 Nov 2015 20:31:54 -0800

Hi David, thanks for your answers!

I tried a few of your suggestions, such as adding a SpatialPooler and changing n/w, but no luck.


Perhaps I should run swarming in python against my data,
and study the configuration produced.

- Takenori

On Thu, Nov 5, 2015 at 3:44 AM, cogmission (David Ray) <
[email protected]> wrote:

> Hi Takenori,
>
> You might think this is weird (I know I do), but as I am basically just
> one person writing and supporting HTM.java (with some appreciated help from
> community members from time to time), I haven't really had the time to
> **use** NuPIC. Therefore the scope of the questions I can faithfully answer
> is limited to setting up and using the code, together with any
> Java-related questions. NuPIC configuration questions that have to do with
> the performance of the HTM (like DateEncoder parameters, the size of W and
> N, and actual parameter settings) can be answered by anyone familiar with
> NuPIC who has struggled with that learning curve.
>
> The default parameters used are those that were in the Python network
> examples and settings that I have been told are "decent" when asking for
> help myself. NuPIC parameters are not easy, and require knowledge of the
> "rules of thumb" (typical rules for usage). For instance, W should be an
> odd number for reasons having to do with finding the "center" of a series
> of bits. Also, if you read the class documentation in Encoder.java or
> base.py (the abstract base encoder for the Python version), you will
> see some discussion of N and W and how they relate to each other.
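
To make the relationship between N and W concrete, here is a minimal standalone sketch in plain Java. It is not the HTM.java or NuPIC implementation: the class and method names are hypothetical, and it assumes a simplified non-periodic encoder in which the W active bits form one contiguous run and each bucket covers (max - min) / (N - W). Under that assumption, the n = 50, w = 21 and 0 to 10,000,000 microsecond settings from the original question give buckets of roughly 345 ms, so a 1 ms ping and a 100 ms ping would map to the same bits.

```java
import java.util.Arrays;

/** Simplified non-periodic scalar encoder sketch (hypothetical, not the library code). */
class ScalarEncodingSketch {

    /** Assumed bucket size: the value range divided by the number of start positions. */
    static double resolution(double min, double max, int n, int w) {
        return (max - min) / (n - w);
    }

    /** Encodes a value as n bits with one run of w consecutive 1s; out-of-range values are clipped. */
    static int[] encode(double value, double min, double max, int n, int w) {
        if (w <= 0 || w % 2 == 0 || n <= w) {
            throw new IllegalArgumentException("expects an odd w > 0 and n > w");
        }
        double clipped = Math.max(min, Math.min(max, value));
        // Index of the first active bit, between 0 and n - w inclusive.
        int start = (int) Math.round((clipped - min) / resolution(min, max, n, w));
        int[] bits = new int[n];
        Arrays.fill(bits, start, start + w, 1);
        return bits;
    }

    /** Middle of the active run; a single middle bit exists only because w is odd. */
    static int centerBit(int[] bits, int w) {
        for (int i = 0; i < bits.length; i++) {
            if (bits[i] == 1) {
                return i + (w - 1) / 2;
            }
        }
        return -1; // no active bit found
    }

    public static void main(String[] args) {
        int n = 50, w = 21;
        double min = 0, max = 10_000_000; // microseconds, as in the thread
        System.out.println("microseconds per bucket: " + resolution(min, max, n, w));
        System.out.println("1 ms:   " + Arrays.toString(encode(1_000, min, max, n, w)));
        System.out.println("100 ms: " + Arrays.toString(encode(100_000, min, max, n, w)));
    }
}
```

If this simplified model is close to what the real encoder does, narrowing the min/max range to the expected ping times, or raising N, would be the first things to try.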
>
> In general, the difference between the ScalarEncoder and the
> RandomDistributedScalarEncoder is that the ScalarEncoder is a bit more
> efficient but requires prior knowledge of the min and max values in your
> expected dataset. The RDSE can be used without prior knowledge of the
> bounds and so is a nice alternative for unknown data. Most people just use
> the RDSE.
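
The practical difference can be illustrated with a small standalone sketch. The names below are hypothetical and this is not the library code: it only contrasts a bounded bucket index, which needs min and max up front and clips anything outside them, with a resolution-based bucket index that accepts any value. The real RDSE additionally maps each bucket to a pseudo-random set of W bits so that neighbouring buckets overlap; that part is omitted here.

```java
/** Contrast between bounded and unbounded bucketing (hypothetical sketch). */
class BucketIndexSketch {

    /** Bounded: requires min and max; values outside the range land in an edge bucket. */
    static int boundedBucket(double value, double min, double max, int buckets) {
        double clipped = Math.max(min, Math.min(max, value));
        int index = (int) ((clipped - min) / (max - min) * buckets);
        return Math.min(index, buckets - 1); // value == max belongs to the last bucket
    }

    /** Unbounded: only a resolution is chosen; every value gets a bucket of its own. */
    static long unboundedBucket(double value, double resolution) {
        return (long) Math.floor(value / resolution);
    }

    public static void main(String[] args) {
        // With bounds 0..100, the values 100 and 500 become indistinguishable.
        System.out.println(boundedBucket(100, 0, 100, 10) + " vs " + boundedBucket(500, 0, 100, 10));
        // With a resolution of 10 and no bounds, they stay distinct.
        System.out.println(unboundedBucket(100, 10) + " vs " + unboundedBucket(500, 10));
    }
}
```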
>
> Here's a video that discusses the RDSE:
> https://www.youtube.com/watch?v=_q5W2Ov6C9E
>
> The DateEncoder class Javadoc, and the class file itself (together with
> DateEncoderTest.java), have lots of documentation in them which illustrate
> their usage. Basically, a DateEncoder is a compound encoder that has
> ScalarEncoders inside it which handle different aspects of the date
> mechanism being used.
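
The "compound encoder" idea can be sketched as follows. This is a hypothetical standalone illustration, not the DateEncoder source. It assumes, based on my reading of the NuPIC DateEncoder documentation, that a tuple such as Tuple(5, 4.0) for time of day means (width in bits, radius in hours), and that a periodic sub-encoder derives its bit count as ceil(w * range / radius). Each sub-encoder produces its own bits and the results are concatenated.

```java
import java.util.Arrays;

/** Compound date encoding built from periodic scalar encodings (hypothetical sketch). */
class DateEncodingSketch {

    /** Periodic encoding of a value on a cycle of length range, using w bits (w must be odd). */
    static int[] encodePeriodic(double value, double range, int w, double radius) {
        if (w <= 0 || w % 2 == 0) {
            throw new IllegalArgumentException("w must be a positive odd number");
        }
        int n = (int) Math.ceil(w * range / radius); // assumed sizing rule
        double resolution = range / n;
        int center = Math.min((int) Math.floor((value % range) / resolution), n - 1);
        int[] bits = new int[n];
        for (int k = -(w / 2); k <= w / 2; k++) {
            bits[Math.floorMod(center + k, n)] = 1; // wrap around the end of the cycle
        }
        return bits;
    }

    /** Joins two sub-encodings into one output, which is all "compound" means here. */
    static int[] concat(int[] a, int[] b) {
        int[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) {
        int[] timeOfDay = encodePeriodic(23.5, 24.0, 5, 4.0); // 23:30, (w, radius) = (5, 4.0)
        int[] dayOfWeek = encodePeriodic(3, 7.0, 1, 1.0);     // day index 3, (w, radius) = (1, 1.0)
        System.out.println(Arrays.toString(concat(timeOfDay, dayOfWeek)));
    }
}
```

Because the encoding is periodic, 23:30 and 00:30 share most of their active bits in this sketch, which is the property a time-of-day field needs.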
>
> The SpatialPooler is an integral part of the HTM - you usually want that.
> The only time it has been "skipped" is when you insert an encoding
> scheme of your own and want to preserve the input format. But that is
> an extreme corner case; I would advise using one in your code.
>
> Don't worry about multiple regions and layers. The capacity to have
> multiple regions and layers exists for those who need extra flexibility.
> The ability to assemble Network hierarchies is mostly a "space saver" for
> when HTM Hierarchy code is released by Numenta in the future. The "modes"
> shown in the HotGym Demo are just there for demonstration purposes and
> really there is no internal concept of "Mode" inside the Network hierarchy.
> Again, the Mode in the demo is just a switch to instruct the demo to set up
> different hierarchy styles to show that the output is the same regardless
> of the number of hierarchical components used to funnel data through.
>
> I hope this helps. You can ask Numenta engineers for rules of thumb
> regarding the individual Parameter settings.
>
> Cheers,
> David
>
> On Wed, Nov 4, 2015 at 9:45 AM, Takenori Sato <[email protected]> wrote:
>
>> Hi NuPIC community and David,
>>
>> I have some questions about how to configure my network with htm.java.
>>
>> My use case is to let HTM detect an unexpectedly high load on a server
>> through PING response times. But so far, it produces 0.0 for almost any
>> input. Sometimes it returns some other value, but those values are not
>> reasonable at all.
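
For context on what a score of 0.0 means: as I understand it, the raw anomaly score in NuPIC is the fraction of currently active columns that were not predicted at the previous step. The sketch below is a hypothetical standalone version of that calculation, not the Anomaly class from HTM.java, and the names are mine. A constant 0.0 therefore means every active column was predicted, which is one possible symptom of inputs that all encode to nearly the same bits.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Raw anomaly score as the unpredicted fraction of active columns (hypothetical sketch). */
class RawAnomalySketch {

    static double rawScore(Set<Integer> active, Set<Integer> previouslyPredicted) {
        if (active.isEmpty()) {
            return 0.0; // nothing active, so nothing was unexpected
        }
        int predictedHits = 0;
        for (int column : active) {
            if (previouslyPredicted.contains(column)) {
                predictedHits++;
            }
        }
        return (active.size() - predictedHits) / (double) active.size();
    }

    public static void main(String[] args) {
        Set<Integer> active = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> predicted = new HashSet<>(Arrays.asList(1, 2));
        System.out.println(rawScore(active, predicted)); // half of the active columns were predicted
    }
}
```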
>>
>> The biggest problem is that I am not sure at all about my configuration,
>> so I strongly suspect it is far from correct.
>>
>> For your reference, you can see my code here:
>>
>> CloudSonar project <https://github.com/ggsato/CloudSonar>
>> HTMAnomalyDetector
>> <https://github.com/ggsato/CloudSonar/blob/master/src/com/cloudian/analytics/HTMAnomalyDetector.java>
>>
>> My network configuration is based on (or, I should say, copied and pasted
>> from) NetworkDemoHarness, modified slightly in the places I believe I
>> understand.
>>
>> Here are my questions.
>>
>> *1. Parameters#getAllDefaultParameters*
>>
>> private static Network createNetwork(Sensor<ObservableSensor<String>> sensor) {
>>     *Parameters p = buildParams();*
>>     p = p.union(buildEncoderParams());
>>     return Network.create("CloudSonar", p)
>>         .add(Network.createRegion("Region")
>>             .add(Network.createLayer("Layer", p)
>>                 .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE)
>>                 .add(Anomaly.create())
>>                 .add(new TemporalMemory())
>>                 .add(sensor)));
>> }
>>
>> private static Parameters buildParams() {
>>     *return Parameters.getAllDefaultParameters(); <== THIS ONE*
>> }
>>
>> NetworkDemoHarness#getParameters confused me with its many parameters, so I
>> took only the defaults without overriding anything. Is it OK to start
>> like this?
>>
>> Also, are there any resources to learn about those parameters?
>>
>> *2. Encoders*
>>
>> My inputs are [timestamps, duration_in_micro_sec].
>>
>> private static String generateCSVInput(PollingJob job) {
>>     StringBuffer sb = new StringBuffer();
>>     sb.append(FULL_DATE_FORMAT.format(new Date())); *<== TIMESTAMP*
>>     sb.append(CSVUpdateHandler.DELIM);
>>     sb.append(TimeUnit.MICROSECONDS.convert(job.pollingStatus.duration(), TimeUnit.NANOSECONDS)); *<== DURATION*
>>     return sb.toString();
>> }
>>
>> I borrowed the config from NetworkDemoHarness#getHotGymFieldEncodingMap
>> and getNetworkDemoFieldEncodingMap (I noticed that I had mixed the two up).
>> Then I modified the highlighted parts:
>>
>>     public static Map<String, Map<String, Object>> getNetworkFieldEncodingMap() {
>>         Map<String, Map<String, Object>> fieldEncodings = setupMap(
>>                 null,
>>                 0, // n
>>                 0, // w
>>                 0, 0, 0, 0, null, null, null,
>>                 "timestamp", "datetime", "DateEncoder");
>>         fieldEncodings = setupMap(
>>                 fieldEncodings,
>>                 50,
>>                 21,
>>                 0, *10000000*, 0, 0.1, null, Boolean.TRUE, null,  *<== 0 ~ 10 sec*
>>                 CLASSFIER_FIELD, "int", "ScalarEncoder");
>>
>>
>>         fieldEncodings.get("timestamp").put(KEY.DATEFIELD_DOFW.getFieldName(), new Tuple(1, 1.0)); // Day of week
>>
>>         fieldEncodings.get("timestamp").put(KEY.DATEFIELD_TOFD.getFieldName(), new Tuple(5, 4.0)); // Time of day
>>
>>         fieldEncodings.get("timestamp").put(KEY.DATEFIELD_PATTERN.getFieldName(), *FULL_DATE*);
>>
>>         return fieldEncodings;
>>     }
>>
>> Why are all the params of DateEncoder 0 or null?
>>
>> What is the difference between ScalarEncoder
>> and RandomDistributedScalarEncoder?
>>
>> I happened to use the larger n and w from
>> getNetworkDemoFieldEncodingMap. Compared to the HotGym demo, my durations
>> are much larger than the consumption values, so a larger n makes sense.
>> But should I have set a lower w, like 6?
>>
>> I wasn't able to find information on how to set those DATEFIELD parameters.
>> PATTERN was obvious, but the other two remain unclear. In particular, what
>> is the Tuple, and what do those numbers mean?
>>
>> *3. SpatialPooler*
>>
>> NetworkAPIDemo uses SpatialPooler in every network. But it should be
>> related to spatial inputs, correct? So I dropped it from my network
>> configuration. I have read the JavaDoc, but it gave me no clue. What is it for?
>>
>> *4. Multiple Regions and Layers*
>>
>> I wasn't able to understand the difference between those 3 modes in
>> NetworkAPIDemo. I understand MULTILAYER uses multiple layers, and
>> MULTIREGION uses multiple regions. But when to use which mode in practice?
>>
>>
>> I asked all of these stupid questions, but overall I was impressed that
>> the design is easy to understand and makes it easy to integrate htm.java
>> into my own application!!
>>
>> Thanks,
>> Takenori
>>
>
>
>
> --
> *With kind regards,*
>
> David Ray
> Java Solutions Architect
>
> *Cortical.io <http://cortical.io/>*
> Sponsor of:  HTM.java <https://github.com/numenta/htm.java>
>
> [email protected]
> http://cortical.io
>
