I was watching the tutorial videos. For me, the hardest thing to wrap my mind around at the moment is the Encoders. Let me do a quick recap; you tell me if my interpretation is wrong.
0. You have "dense" data 1. Use Encoder to convert this "dense"-data to "distributed"-representation 2. Use Spatial pooler to make this "distributed"-representation a Sparse-Distributed one 3. Use Temporal pooler to "make" the sequence of SDRs an even larger SDR, thus in a sense "encoding" most often used sequences as a larger-stable-SDRs (from % standpoint view). Now the most awkward phase to me is the transition from Dense-2->Distributed representation. From what I deduce the idea is to extract nuggets of semantic information and encode it as standalone bits in a distributed fashion with some overlapping so that we don't lose the information when we make it Sparse. My question is what are the GENERAL rules of making Distributed from Dense ? Is there a general way of doing that at all OR it is domain specific ? If yes how do you approach it ? If human have to extract the initial SEMANTIC information of the data, doesn't this hamper the GENERALITY of the whole approach ? How does the Brain does it ? Let me give you an superficial example where pure Scalar-encoder may not be the best solution. Let say we are encoding temperature ... if we know the data is for human habitation we can include as semantic information in addition to the pure numbers the information about what is comfortable human temperature which may enhance the future prediction. In the same case if we are talking about chemical reaction the additional semantic information will be different temperature ranges or scales (freezing and boiling point for the specific compounds or reaction). As you see this kind of semantic information is dependent on the human factor deciding the encoding. I've been reading recently about word-embeedings in the standard NN approaches, where the question is reversed i.e. data-items are provided in context (f.e. surrounding words of the topic-word .. from text-corpora) then the distributed word representation is guessed slowly by trying to match the context-words-sequence, thus in a sense extracting the semantic-distributed-representation by the context of usage. I.E. distributed representation is a result of the context of usage. (may be this approach can solve also the problem of adaptive-scalar-encoder, by encoding the context instead of the single datum. The other obvious solution to this problem is to encode "velocity" or "jerk" of the data instead of the "distance" itself.). Again let me know of GENERAL approach to make Distributed from Dense representation, if you happen to had one ! OR if you have some other ideas... thanks -------| http://ifni.co
