Yuwei,
Imagine a transaction looks something like this:
Card #, date/time, $ Amount, Zipcode, Merchant ID, Tax paid,
ProductCode,…
The fields are semantically unrelated, in that Card # represents something
completely different from $ Amount. This is why I think I would need a
different SP for each field, with all the SPs feeding into the same TP.
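To make the per-field-SP idea concrete, here is a heavily simplified sketch. Everything in it is a toy stand-in, not the real NuPIC API: `FieldSP`, `learn_record`, and the hash-based projection are hypothetical illustrations of one SP per field feeding a single shared transition store (the "TP").

```python
# Toy sketch, NOT the NuPIC API: each field gets its own "SP",
# modeled as a deterministic hash-based projection to a sparse SDR,
# and every field output feeds the same transition store (the "TP").
import hashlib
import random

SDR_WIDTH = 64    # every field SP emits the same output width
ACTIVE_BITS = 8   # number of active bits per SDR

class FieldSP:
    """Hypothetical per-field pooler: maps a raw field value to a
    sparse, deterministic set of active column indices."""
    def __init__(self, field_name):
        self.field_name = field_name

    def compute(self, value):
        seed = int.from_bytes(
            hashlib.md5(f"{self.field_name}:{value}".encode()).digest()[:8],
            "big")
        return frozenset(
            random.Random(seed).sample(range(SDR_WIDTH), ACTIVE_BITS))

def learn_record(transitions, sdrs):
    """Store every within-record transition (prev SDR -> next SDR),
    standing in for what the TP would learn as a sequence."""
    for prev, nxt in zip(sdrs, sdrs[1:]):
        transitions.setdefault(prev, set()).add(nxt)

# One SP per field, all feeding the same shared transition store.
fields = ["card", "amount", "zipcode", "merchant"]
sps = {f: FieldSP(f) for f in fields}
transitions = {}

record = {"card": "4111", "amount": "19.99",
          "zipcode": "94301", "merchant": "M42"}
sdrs = [sps[f].compute(record[f]) for f in fields]
learn_record(transitions, sdrs)
```

Each record would start a fresh sequence, so transitions are learned only within a record, never across records.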
On the surface this seems like it would work, but I feel like I'm
missing something.
Another approach is to encode the transaction record and all its fields
as a single SP input, feed it through the TP, and use the "activeState"
output of the TP as input to a classifier. New transactions would then be fed
in, and the classifier would give a confidence score on how well each fits its
class. In this case the class could be the Card #. I don't think this
would be as effective.
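A rough sketch of this second approach, again with hypothetical toy pieces: `encode_record` and `OverlapClassifier` are illustrations, not NuPIC classes, and in the real setup the bits being classified would be the TP's activeState rather than the raw encoding.

```python
# Toy sketch: encode the whole record as one wide bit vector and
# score new records against per-class prototypes by bit overlap.
import hashlib
import random

FIELD_WIDTH = 32   # bits per field
FIELD_ACTIVE = 4   # active bits per field

def encode_field(name, value):
    """Deterministic toy encoder for a single field."""
    seed = int.from_bytes(
        hashlib.md5(f"{name}:{value}".encode()).digest()[:8], "big")
    return set(random.Random(seed).sample(range(FIELD_WIDTH), FIELD_ACTIVE))

def encode_record(record):
    """Concatenate per-field encodings into one wide set of bit indices."""
    bits = set()
    for i, (name, value) in enumerate(sorted(record.items())):
        bits |= {i * FIELD_WIDTH + b for b in encode_field(name, value)}
    return bits

class OverlapClassifier:
    """Keeps one union-of-bits prototype per class (e.g., per Card #)
    and scores new records by fractional bit overlap."""
    def __init__(self):
        self.prototypes = {}

    def learn(self, label, bits):
        self.prototypes.setdefault(label, set()).update(bits)

    def confidence(self, label, bits):
        proto = self.prototypes.get(label, set())
        return len(proto & bits) / max(len(bits), 1)
```

A record identical to one already learned for a card scores 1.0; records that differ in some fields overlap the prototype less and score lower.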
Thanks for your interest.
-Phil
> On Sep 22, 2015, at 1:26 PM, Yuwei Cui <[email protected]> wrote:
>
> Hi Phil,
>
> Very interesting idea. I agree with you that breaking individual transactions
> into sequences could be an effective way of detecting transaction anomalies.
> Could you provide more detail regarding the fields of each transaction
> record? Do fields at different time steps share common elements? Across
> transactions, how would you evaluate whether two fields are similar?
>
> I asked these questions because, in the end, you want to ensure that the
> output of the spatial pooler has more overlap for semantically similar fields.
> If you have a small set of possible fields and each field is independent and
> distinct from the other fields, you can just use a simple category encoder
> (e.g., encode it as A, B, C, D). If there is a concept of similarity across
> fields, you might want to use a shared spatial pooler for all the fields and
> make sure that the similarity is reflected in the output of the spatial pooler.
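A minimal sketch of the category-encoder idea above, as a toy stand-in rather than NuPIC's own CategoryEncoder: each known category gets its own non-overlapping block of active bits, so distinct categories share zero overlap.

```python
def make_category_encoder(categories, bits_per_category=3):
    """Toy category encoder: one non-overlapping block of active
    bits per category, so distinct categories never overlap."""
    index = {c: i for i, c in enumerate(categories)}
    width = len(categories) * bits_per_category

    def encode(category):
        start = index[category] * bits_per_category
        vec = [0] * width
        for b in range(start, start + bits_per_category):
            vec[b] = 1
        return vec

    return encode

encode = make_category_encoder(["A", "B", "C", "D"])
# encode("A") and encode("B") share no active bits.
```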
>
> --
> Yuwei Cui
>
> Research Engineer, Numenta Inc.
>
>
>
> On Sat, Sep 19, 2015 at 12:50 PM, Phil iacarella <[email protected]> wrote:
> Hello,
>
> I would like to use the CLA to monitor transactions. Transactions arrive in a
> completely random order so using the TP to measure transitions between
> transactions is not useful. My thought is that I need to monitor the
> transitions within a transaction record and not between transaction records.
>
> Why am I doing this? Well, I’d like to leverage the enormous storage
> potential of the TP by storing the transitional sequences within a
> transaction. In this way, it could learn every transaction as a sequence. I
> could then, after learning, for any new record get the anomaly score of each
> TP sequence step in a transaction record, add them up and get the average
> score. This score would indicate how anomalous the record is as a transaction.
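The scoring step described above can be sketched in a few lines. Here `step_scores` is just a list of floats; in the real system each value would be the TP's anomaly score at one field transition within the record.

```python
def record_anomaly_score(step_scores):
    """Average the per-step anomaly scores within one transaction
    to get a single record-level anomaly score."""
    if not step_scores:
        return 0.0
    return sum(step_scores) / len(step_scores)
```

For example, `record_anomaly_score([0.0, 0.1, 0.9, 0.2])` averages to 0.3, so one surprising transition inside an otherwise familiar record still lifts the record's score.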
>
> My first idea is to break the transaction record up into multiple time steps
> and feed the fields into the CLA one at a time. The problem is that each of
> these fields could require a different encoder. I can get around this by
> encoding the fields separately and passing them to a PassThroughEncoder
> hooked up to a SpatialPooler. As long as each field encodes to the same
> length, this should work… right?
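The concatenation step can be sketched like this; it is a toy illustration of the role the PassThroughEncoder would play, and the per-field encoders themselves are assumed to already exist and produce equal-length bit arrays.

```python
def concat_encodings(encoded_fields, field_width):
    """Join equal-length binary field encodings into one flat input
    vector for the SpatialPooler."""
    joined = []
    for bits in encoded_fields:
        if len(bits) != field_width:
            raise ValueError("every field must encode to the same width")
        joined.extend(bits)
    return joined
```

The length check matters: if one field's encoder drifts to a different width, downstream bit positions silently shift and the SP sees scrambled input.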
>
> My question is: would this confuse the SpatialPooler? Am I crossing data
> domains, since the data being represented is not all of the same kind? The
> SpatialPooler doesn't "know" this, but still...?
>
> The second idea is to have a separate SpatialPooler for every field in the
> record and send its output to a higher CLA in sequential order (SP1, SP2,
> SP3….end of sequence. Next record…). This way each field has its own
> SpatialPooler to learn its own data domain. The higher level SpatialPooler
> then reads this as regular input. The TP would learn the transitions between
> fields in the record. Each record would begin a new learning sequence.
>
> I think this could be very effective (and incredibly sensitive) in
> identifying transactions that need further inspection.
>
> Sound feasible?
>
> -Phil
>