Hi Phil,

Very interesting idea. I agree with you that breaking individual transactions into sequences could be an effective way of detecting transaction anomalies. Could you provide more detail regarding the fields of each transaction record? Do fields at different time steps share common elements? Across transactions, how would you evaluate whether two fields are similar?
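To make that last question concrete: by "similar" I mean that the encoded bit patterns of two field values share active bits. Here is a rough sketch in plain Python. These are toy encoders made up for illustration, not the NuPIC encoder API; the function names, bit widths and value ranges are all assumptions.

```python
# Toy encoders for illustration only (hypothetical names, not NuPIC classes).
# Each one returns the set of active bit indices for a field value.

def encode_category(value, categories, width=5):
    """Give every category its own block of `width` bits, so no two categories overlap."""
    idx = categories.index(value)
    return set(range(idx * width, (idx + 1) * width))


def encode_scalar(value, min_val, max_val, n_bits=100, width=11):
    """Slide a window of `width` active bits across `n_bits`, so nearby values share bits."""
    value = min(max(value, min_val), max_val)  # clip to the assumed range
    fraction = (value - min_val) / float(max_val - min_val)
    start = int(fraction * (n_bits - width))
    return set(range(start, start + width))


def overlap(a, b):
    """Number of active bits two encodings have in common."""
    return len(a & b)


categories = ["A", "B", "C", "D"]
print(overlap(encode_category("A", categories), encode_category("B", categories)))  # 0

near = overlap(encode_scalar(50, 0, 100), encode_scalar(52, 0, 100))
far = overlap(encode_scalar(50, 0, 100), encode_scalar(90, 0, 100))
print(near > far)  # True
```

With the category-style encoding, A and B share no bits, so nothing downstream can treat them as related. With the scalar-style encoding, 50 and 52 share most of their bits while 50 and 90 share none.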
I asked these questions because in the end, you want to ensure that the output of the spatial pooler has more overlap for semantically similar fields. If you have a small set of possible fields and each field is independent and distinct from the other fields, you can just use a simple category encoder (e.g., encode it as A, B, C, D). If there is a concept of similarity across fields, you might want to use a shared spatial pooler for all the fields and make sure that the similarity is reflected in the output of the spatial pooler.

--
Yuwei Cui
Research Engineer, Numenta Inc.

On Sat, Sep 19, 2015 at 12:50 PM, Phil iacarella <[email protected]> wrote:

> Hello,
>
> I would like to use the CLA to monitor transactions. Transactions arrive
> in a completely random order so using the TP to measure transitions between
> transactions is not useful. My thought is that I need to monitor the
> transitions within a transaction record and not between transaction records.
>
> Why am I doing this? Well, I’d like to leverage the enormous storage
> potential of the TP by storing the transitional sequences within a
> transaction. In this way, it could learn every transaction as a sequence. I
> could then, after learning, for any new record get the anomaly score of
> each TP sequence step in a transaction record, add them up and get the
> average score. This score would indicate how anomalous the record is as a
> transaction.
>
> My first idea is to break up the transaction record into multiple time
> steps and feed these fields into the CLA one at a time. The problem is that
> each of these fields could possibly require a different encoder. I can get
> around this by encoding the fields separately and passing them to the
> PassThruEncoder that would be hookup up a spatialPooler. As long as each
> field encodes to the same length this should work….right?
>
> My question is, would this confuse the SpatialPooler? Am I crossing data
> domains regarding the SpatialPooler because the data being represented is
> not of the same kind. The SpatialPooler doesn’t “know” this, but still...?
>
> The second idea is to have a separate SpatialPooler for every field in the
> record and send its output to a higher CLA in sequential order (SP1, SP2,
> SP3….end of sequence. Next record…). This way each field has its own
> SpatialPooler to learn its own data domain. The higher level SpatialPooler
> then reads this as regular input. The TP would learn the transitions
> between fields in the record. Each record would begin a new learning
> sequence.
>
> I think this could be very effective (and incredibly sensitive) in
> identifying transactions that need further inspection.
>
> Sound feasible?
>
> -Phil
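P.S. On the scoring step in your message (score each field, then average over the record), here is a minimal sketch of that bookkeeping in plain Python. I am assuming the usual definition of the raw anomaly score, i.e. the fraction of currently active columns that were not predicted at the previous step. The function names and the `skip_first` option are made up for illustration and are not NuPIC API.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted (0.0 = expected, 1.0 = novel)."""
    active = set(active_columns)
    if not active:
        return 0.0
    unpredicted = active - set(predicted_columns)
    return len(unpredicted) / float(len(active))


def record_anomaly_score(steps, skip_first=False):
    """Average the per-field scores of one record.

    `steps` is a list of (active_columns, previously_predicted_columns) pairs,
    one pair per field, in the order the fields were fed in.
    """
    if skip_first:
        steps = steps[1:]  # hypothetical option: ignore the first field of the record
    scores = [raw_anomaly_score(active, predicted) for active, predicted in steps]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


record = [
    ({1, 2, 3, 4}, set()),          # first field after a reset: nothing was predicted
    ({5, 6, 7, 8}, {5, 6, 7, 8}),   # fully predicted
    ({9, 10, 11, 12}, {9, 10}),     # half predicted
]
print(record_anomaly_score(record))                   # 0.5
print(record_anomaly_score(record, skip_first=True))  # 0.25
```

One thing to watch: if each record begins a new sequence, nothing can be predicted for the first field, so under this definition its score is always 1.0. Leaving it out of the average, as `skip_first` does here, keeps it from inflating every record's score.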
