Hi,

With regard to the following question:

> 2- The wiki refers to encoder outputs as SDRs. Is that necessarily the case
> and if so, to what properties of encoder design is that requirement
> attributed? (i.e. why do I need an SDR to be the output of the encoder
> as opposed to a binary vector unconstrained in density?)



 ...I happened upon another, not often cited, advantageous property of SDRs:
their quantum-like simultaneity when searching efficiently over a vast store
of representations. By taking the union of all candidate representations, one
can instantly read off the semantic characteristics that match a given search
parameter.
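The union idea above can be sketched in a few lines of Python. This is purely illustrative (the names, sizes, and the `maybe_member` helper are my own, not any NuPIC API): because every stored SDR's bits are contained in the union, one intersection test against a single union vector screens membership over the whole store.

```python
# Sketch of the union property of SDRs: membership in a store of
# patterns can be screened against one union vector. Illustrative
# names and parameters only, not any NuPIC API.
import random

N, W = 2048, 40            # vector width and on-bits (~2% sparsity)

def random_sdr(rng):
    return frozenset(rng.sample(range(N), W))

rng = random.Random(42)
store = [random_sdr(rng) for _ in range(10)]
union = frozenset().union(*store)   # one vector summarising all candidates

def maybe_member(sdr, union, threshold=0.9):
    # Every stored SDR has all its bits inside the union; an unrelated
    # random SDR almost certainly does not, so one overlap test suffices.
    return len(sdr & union) >= threshold * len(sdr)

print(maybe_member(store[0], union))   # True: stored pattern matches
```

With only ten stored patterns the union covers roughly 400 of 2048 bits, so an unrelated 40-bit SDR overlaps it by only a handful of bits and is rejected; sparsity is what keeps the union from saturating.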

Here is an interesting link which parallels Jeff's thinking on this topic:

http://people.brandeis.edu/~grinkus/SDR_and_QC.html

David Ray


On Sat, Aug 2, 2014 at 10:44 AM, Fergal Byrne <[email protected]>
wrote:

>
> Hi Nicholas,
>
> Those are some really good questions.
>
> On Sat, Aug 2, 2014 at 1:50 PM, Nicholas Mitri <[email protected]>
> wrote:
>
>> 1- Are there any specific properties that encoders need to have when
>> designing one? What’s the rationale behind them if they exist?
>>
>
> Yes, there are a couple of important properties which encodings must have.
>
> The most important one is that if you have a meaning for "semantic
> closeness" (or "distance") in the data, then close values should have
> overlapping bits in their encodings and distant values should not.
>
> An example is for scalar values (which may be arbitrarily close, of
> course), where you choose the encoding so that values within some range (or
> radius) r have the same encoding, those more than r and less than 2r differ
> by a single bit, and so on. The "traditional" scalar encoder produces
> encodings such as:
>
>     "111100000000"
>     "011110000000"
>     "001111000000"
>     "000111100000"
>     "000111100000"
>     "000011110000"
>     "000001111000"
>     "000000111100"
>     "000000111100"
>     "000000011110"
>     "000000001111"
>     "000000001111"
>
> I've written up a discussion of this, and of Chetan's newer Random
> Distributed Scalar Encoder, if you want more detail [1].
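A "traditional" scalar encoder of this kind is easy to sketch. The function below is my own simplification (parameter names and the 0-8 value range are assumptions chosen so the outputs line up with the listing above, where values within the radius r can also share one encoding): n output bits, a sliding window of w contiguous on-bits, and a bucket chosen by the value's position in the range.

```python
# Simplified sketch of a "traditional" scalar encoder: n bits, w
# contiguous on-bits, bucket index chosen by the scaled value.
# Parameter names and range are illustrative, not NuPIC's API.

def encode_scalar(value, min_val=0.0, max_val=8.0, n=12, w=4):
    buckets = n - w + 1                  # number of distinct encodings
    v = min(max(value, min_val), max_val)  # clip into range
    i = int((v - min_val) / (max_val - min_val) * (buckets - 1) + 0.5)
    return "0" * i + "1" * w + "0" * (n - w - i)

print(encode_scalar(0))   # "111100000000"
print(encode_scalar(3))   # "000111100000"
print(encode_scalar(8))   # "000000001111"
```

Note how nearby values (3 and 4) share three of their four on-bits, while values at opposite ends of the range share none, which is exactly the semantic-closeness property described above.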
>
> Sometimes you have data which does not have such distance semantics. An
> example is categorical data, where values are either members of a set or
> not, the sets are disjoint, and there are no ordering semantics (this is
> common, though not true of all categorical data). In this case you could
> either divide the encoding width into n blocks and assign a block to
> each category, or choose "random" encodings for each category.
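Both categorical schemes can be sketched as follows (illustrative only; these helpers are mine, not NuPIC's category encoder API). Block encoding guarantees zero overlap between categories by construction; random encoding relies on sparsity to make accidental overlap between distinct categories very small.

```python
# Sketch of the two categorical schemes mentioned: disjoint blocks
# versus "random" encodings. Illustrative, not NuPIC's API.
import random

def block_encode(category, categories, w=4):
    # One disjoint block of w bits per category: zero overlap by design.
    n = w * len(categories)
    i = categories.index(category) * w
    bits = ["0"] * n
    bits[i:i + w] = ["1"] * w
    return "".join(bits)

def random_encode(category, n=64, w=4):
    # Deterministic "random" bits seeded on the category name; with w
    # small relative to n, distinct categories rarely share many bits.
    rng = random.Random(category)
    on = set(rng.sample(range(n), w))
    return "".join("1" if i in on else "0" for i in range(n))

cats = ["cat", "dog", "fish"]
print(block_encode("dog", cats))   # "000011110000"
```

Block encoding is simple but its width grows linearly with the number of categories; random encoding keeps the width fixed at the cost of a small chance of bit collisions.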
>
> The rationale behind this is that most columns which activate on an input
> will also do so on "nearby" inputs, since they subsample the bits, and so
> the SDR on the layer will vary little when the inputs change a little. This
> provides stability in the face of noise, and allows the CLA to form a
> stable representation of the inputs.
>
> The other primary property is sparseness, which I'll explain in more
> detail in response to the next question.
>
>
>>  2- The wiki refers to encoder outputs as SDRs. Is that necessarily the
>> case and if so, to what properties of encoder design is that requirement
>> attributed? (i.e. why do I need an SDR to be the output of the encoder
>> as opposed to a binary vector unconstrained in density?)
>>
>
> The sparseness is necessary to take advantage of a) the improbability of
> two substantially overlapping encodings (when subsampled) being a false
> match, and b) false matches representing mild semantic errors. This is a
> statistical property of sparse representations, and it's used in techniques
> such as locality sensitive hashing [2]. Essentially, sparse binary vectors
> with most bits differing are very far apart in the high-dimensional space
> compared with those which share many bits.
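The statistical claim is easy to check numerically. In the sketch below (parameters are illustrative: 2048 bits with 40 on-bits, about 2% sparsity), the expected overlap of two unrelated SDRs is W²/N ≈ 0.78 bits, so any substantial overlap is overwhelmingly unlikely to be a false match.

```python
# Numerical illustration of the overlap statistics of random sparse
# vectors. Expected overlap of two random SDRs is W*W/N ~= 0.78 bits,
# so even the maximum over many trials stays tiny compared with W.
import random

N, W = 2048, 40
rng = random.Random(0)

def random_sdr():
    return set(rng.sample(range(N), W))

overlaps = [len(random_sdr() & random_sdr()) for _ in range(1000)]
print(max(overlaps))   # rarely exceeds a handful of bits out of 40
```

So if two SDRs share, say, 30 of 40 bits, that almost certainly reflects genuine shared semantics rather than chance, which is what makes subsampled matching reliable.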
>
> Inter-layer communication uses SDRs of course, so genuine SDRs (at ~2%
> on-bits) are the "best" encodings of data, but the CLA will work fine with
> only "quite sparse" inputs of the order of 10-15% on bits. CLA will learn
> faster if the encoding is less sparse, and the number of on-bits relates to
> discriminatory resolution, so we'll often (or even usually) use this
> less-sparse encoding regime.
>
>
>
>> 3- Is there a biological counterpart for encoders in the general sense?
>>
>
> Yes, all input to the neocortex is composed of trains of spikes, which is
> a digital encoding scheme. The brain truly receives streams of bits and
> generates the illusion that we "see" or "hear" directly.
>
>
>> 4- Encoders perform quantization on the input stream by binning similar
>> input patterns into hypercubes in feature space and assigning a single
>> label (SDR or binary representation) to each bin. The encoder resolution
>> determines the size of the hypercube. The SP essentially performs a very
>> similar task by binning the outputs of the encoder in a binary feature
>> space instead. City block distance determined by a threshold parameter
>> controls the size of the hypercubes/bins. Why is this not viewed as a
>> redundant operation by 2 consecutive modules of the HTM design? Is there a
>> strong case for allowing for it?
>>
>
> That's a very good question. There are a few parts to the answer.
>
> Firstly, independently encoded inputs are often fed into a HTM system
> which will extract correlative or causal structure between or among the
> inputs (this is how Layer 4 combines sensory and motor data in the recent
> version of Jeff's theory).
>
> Secondly, a HTM hierarchy will extract a hierarchy of feature structure by
> repeating the same algorithm at each level (and this hierarchy cannot be
> represented in a single encoding).
>
> Thirdly, HTM will extract temporal structure from a series of
> "independently" encoded inputs, which again cannot be represented in each
> single encoding.
>
> Fourthly, the sparseness of the output of each layer in HTM is a property
> of that layer, and independent of the input sparseness, so there is a
> sparseness transformation which alters the dimensionality of the output
> compared with the input.
>
> You need to view HTM as a system which extracts structure which is latent
> in the encoded data; if your encoding is so clever that it exposes all this
> structure directly, then you're right, you don't need HTM at all!
>
>
>> 5- Finally, is there any benefit to designing an encoding scheme that
>> bins inputs into hyperspheres instead of hypercubes? Would the resulting
>> combination of bins produce decision boundaries that might possibly allow
>> for better binary classification performance for example?
>>
>
> The noise levels in the brain are of the order of a bit per bit.
> Implementations of HTM use drastic simplifications (such as binary
> encodings, binary synapses, global inhibition, etc) and distributed
> representations to model this, and so the answer is "probably", but it
> seems to make little engineering sense unless hyperspheres are easier to
> implement or have some other cost advantage.
>
> Thanks again for the great questions.
>
> Regards
>
> Fergal Byrne
>
> [1] http://fergalbyrne.github.io/rdse.html
> [2] http://en.wikipedia.org/wiki/Locality-sensitive_hashing
>
>
> --
>
> Fergal Byrne, Brenter IT
>
> Author, Real Machine Intelligence with Clortex and NuPIC
> https://leanpub.com/realsmartmachines
>
> Speaking on Clortex and HTM/CLA at euroClojure Krakow, June 2014:
> http://euroclojure.com/2014/
> and at LambdaJam Chicago, July 2014: http://www.lambdajam.com
>
> http://inbits.com - Better Living through Thoughtful Technology
> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>
> e:[email protected] t:+353 83 4214179
> Join the quest for Machine Intelligence at http://numenta.org
> Formerly of Adnet [email protected] http://www.adnet.ie
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>