Hi David, thanks for your answers! I tried some of your suggestions, such as adding a SpatialPooler and changing n/w, but no luck.
Perhaps I should run swarming in Python against my data and study the configuration it produces.

- Takenori

On Thu, Nov 5, 2015 at 3:44 AM, cogmission (David Ray) <[email protected]> wrote:

> Hi Takenori,
>
> You might think this is weird (I know I do), but as I am basically just
> one person writing and supporting HTM.java (with some appreciated help from
> community members from time to time), I haven't really had the time to
> **use** NuPIC. Therefore the scope of the questions I can faithfully answer
> is specific to setting up and using the code, together with any Java
> related questions. For NuPIC configurations that have to do with performance
> of the HTM (like DateEncoder parameters, the size of W and N, and actual
> parameter settings), anyone who has used NuPIC and struggled
> with that learning curve can answer you.
>
> The default parameters used are those that were in the Python network
> examples and settings that I have been told are "decent" when asking for
> help myself. NuPIC parameters are not easy, and require knowledge of the
> "rules of thumb" (typical rules for usage). For instance, W should be an
> odd number for reasons having to do with finding the "center" of a series
> of bits. Also, if you read the class documentation for Encoder.java or
> base.py (the abstract base encoder for the Python version), you will
> see some discussion of N and W and how they relate to each other.
>
> In general, the difference between the ScalarEncoder and the
> RandomDistributedScalarEncoder is that the ScalarEncoder is a bit more
> efficient but requires prior knowledge of the min and max values in your
> expected dataset. The RDSE can be used without prior knowledge of the
> bounds and so is a nice alternative for unknown data. Most people just use
> the RDSE.
>
> Here's a video that discusses the RDSE:
> https://www.youtube.com/watch?v=_q5W2Ov6C9E
>
> The DateEncoder class Javadoc, and the class file itself (together with
> DateEncoderTest.java), have lots of documentation in them which illustrates
> their usage. Basically, a DateEncoder is a compound encoder that has
> ScalarEncoders inside it which handle different aspects of the date
> mechanism being used.
>
> The SpatialPooler is an integral part of the HTM - you usually want that.
> The only time when that has been "skipped" is when inserting an encoding
> scheme of your own and you want to preserve the input format. But that is
> an extreme corner case; I would advise using one in your code.
>
> Don't worry about multiple regions and layers. The capacity to have
> multiple regions and layers exists for those who need extra flexibility.
> The ability to assemble Network hierarchies is mostly a "space saver" for
> when HTM Hierarchy code is released by Numenta in the future. The "modes"
> shown in the HotGym Demo are just there for demonstration purposes and
> really there is no internal concept of "Mode" inside the Network hierarchy.
> Again, the Mode in the demo is just a switch to instruct the demo to set up
> different hierarchy styles to show that the output is the same regardless
> of the number of hierarchical components used to funnel data through.
>
> I hope this helps. You can ask Numenta engineers for rules of thumb
> regarding the individual Parameter settings.
>
> Cheers,
> David
>
> On Wed, Nov 4, 2015 at 9:45 AM, Takenori Sato <[email protected]> wrote:
>
>> Hi NuPIC community and David,
>>
>> I have some questions about how to configure my network with htm.java.
>>
>> My use case is to let HTM detect an unexpected high load on a server
>> through PING response times. But so far, it produces 0.0 for almost any
>> input. Sometimes it returns some other value, but those values are not
>> reasonable at all.
>>
>> The biggest problem is that I am not sure at all about my configurations.
>> So I highly suspect my configurations are far from correct.
>>
>> For your reference, you can see my code here:
>>
>> CloudSonar project <https://github.com/ggsato/CloudSonar>
>> HTMAnomalyDetector
>> <https://github.com/ggsato/CloudSonar/blob/master/src/com/cloudian/analytics/HTMAnomalyDetector.java>
>>
>> My network configurations are based on (or, I should say, copied and
>> pasted from) NetworkDemoHarness. They are modified slightly where I
>> believe I understand.
>>
>> Here are my questions.
>>
>> *1. Parameters#getAllDefaultParameters*
>>
>> private static Network createNetwork(Sensor<ObservableSensor<String>> sensor) {
>>     *Parameters p = buildParams();*
>>     p = p.union(buildEncoderParams());
>>     return Network.create("CloudSonar", p)
>>         .add(Network.createRegion("Region")
>>             .add(Network.createLayer("Layer", p)
>>                 .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE)
>>                 .add(Anomaly.create())
>>                 .add(new TemporalMemory())
>>                 .add(sensor)
>>             )
>>         );
>> }
>>
>> private static Parameters buildParams() {
>>     return *Parameters.getAllDefaultParameters(); <== THIS ONE*
>> }
>>
>> NetworkDemoHarness#getParameters confused me with many parameters. So I
>> picked up only the default ones without overriding anything. Can I start
>> like this?
>>
>> Also, are there any resources to learn about those parameters?
>>
>> *2. Encoders*
>>
>> My inputs are [timestamps, duration_in_micro_sec].
>>
>> private static String generateCSVInput(PollingJob job) {
>>     StringBuffer sb = new StringBuffer();
>>     sb.append(FULL_DATE_FORMAT.format(new Date())); *<== TIMESTAMP*
>>     sb.append(CSVUpdateHandler.DELIM);
>>     sb.append(TimeUnit.MICROSECONDS.convert(job.pollingStatus.duration(),
>>         TimeUnit.NANOSECONDS)); *<== DURATION*
>>     return sb.toString();
>> }
>>
>> I borrowed the config from NetworkDemoHarness#getHotGymFieldEncodingMap
>> and getNetworkDemoFieldEncodingMap (I noticed that I had mixed them up).
>> Then I modified the highlighted parts:
>>
>> public static Map<String, Map<String, Object>> getNetworkFieldEncodingMap() {
>>     Map<String, Map<String, Object>> fieldEncodings = setupMap(
>>         null,
>>         0, // n
>>         0, // w
>>         0, 0, 0, 0, null, null, null,
>>         "timestamp", "datetime", "DateEncoder");
>>     fieldEncodings = setupMap(
>>         fieldEncodings,
>>         50,
>>         21,
>>         0, *10000000*, 0, 0.1, null, Boolean.TRUE, null, *<== 0 ~ 10 sec*
>>         CLASSFIER_FIELD, "int", "ScalarEncoder");
>>
>>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_DOFW.getFieldName(),
>>         new Tuple(1, 1.0)); // Day of week
>>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_TOFD.getFieldName(),
>>         new Tuple(5, 4.0)); // Time of day
>>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_PATTERN.getFieldName(),
>>         *FULL_DATE*);
>>
>>     return fieldEncodings;
>> }
>>
>> Why are all the params of DateEncoder 0 or null?
>>
>> What is the difference between ScalarEncoder
>> and RandomDistributedScalarEncoder?
>>
>> I happened to use the larger n and w used
>> by getNetworkDemoFieldEncodingMap. Compared to the HotGym demo, my durations
>> are much larger than the consumption values. So a larger n makes sense, but
>> should I have set a lower w, like 6?
>>
>> I wasn't able to find information on how to set those DATEFIELD parameters.
>> PATTERN was obvious, but the other two remained unclear. In particular, what
>> is the Tuple, and what do those numbers mean?
>>
>> *3. SpatialPooler*
>>
>> NetworkAPIDemo uses SpatialPooler in every network. But it should be
>> related to spatial inputs, correct? So I dropped it from my network
>> configuration. I have read the JavaDoc, but got no clue. What is it for?
>>
>> *4. Multiple Regions and Layers*
>>
>> I wasn't able to understand the difference between those 3 modes in
>> NetworkAPIDemo. I understand MULTILAYER uses multiple layers, and
>> MULTIREGION uses multiple regions. But when to use which mode in practice?
>>
>>
>> I asked all of these stupid questions, but overall, I was impressed
>> that the design makes it easy to understand how to integrate htm.java
>> into my own application!!
>>
>> Thanks,
>> Takenori
>>
>
> --
> *With kind regards,*
>
> David Ray
> Java Solutions Architect
>
> *Cortical.io <http://cortical.io/>*
> Sponsor of: HTM.java <https://github.com/numenta/htm.java>
>
> [email protected]
> http://cortical.io
>
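
A note on the n and w values discussed in this thread: the sketch below is a standalone illustration, not HTM.java code. It assumes the ScalarEncoder follows the NuPIC-style relation for a non-periodic encoder, resolution = (max - min) / (n - w), and that a non-zero n takes precedence over the 0.1 resolution passed in the encoding map. Both points are assumptions worth checking against Encoder.java. The class name, method names, and the sample ping durations are hypothetical.

```java
public class ScalarBucketSketch {

    // Width of one bucket, assuming resolution = (max - min) / (n - w)
    // for a non-periodic scalar encoder (an assumption, see the note above).
    static double resolution(double min, double max, int n, int w) {
        return (max - min) / (n - w);
    }

    // Index of the first active bit for a value. The value is clipped into
    // [min, max] and rounded to the nearest bucket.
    static int firstActiveBit(double value, double min, double max, int n, int w) {
        double clipped = Math.max(min, Math.min(max, value));
        double res = resolution(min, max, n, w);
        return (int) Math.floor((clipped - min) / res + 0.5);
    }

    public static void main(String[] args) {
        // Hypothetical ping durations in microseconds: 20 ms and 150 ms.
        double[] samples = {20_000, 150_000};

        // Settings from the thread: n = 50, w = 21, range 0 to 10 seconds.
        System.out.println("n=50, w=21, max=10s, resolution="
                + resolution(0, 10_000_000, 50, 21));
        for (double s : samples) {
            System.out.println("  " + s + " us -> first bit "
                    + firstActiveBit(s, 0, 10_000_000, 50, 21));
        }

        // A hypothetical alternative: larger n, range 0 to 1 second.
        System.out.println("n=400, w=21, max=1s, resolution="
                + resolution(0, 1_000_000, 400, 21));
        for (double s : samples) {
            System.out.println("  " + s + " us -> first bit "
                    + firstActiveBit(s, 0, 1_000_000, 400, 21));
        }
    }
}
```

Under these assumptions, n = 50 and w = 21 leave only 30 distinct encodings across the 10 second range, so a 20 ms ping and a 150 ms ping produce the same bits. That may be one reason the anomaly score rarely moves, although it is not a confirmed diagnosis. A larger n or a narrower range separates the two values.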
