Thanks a lot for the paper ... that is what I was looking for.

Extrapolating from it, here is a basic scalar encoder/decoder:
---------------
import math

import numpy as np


class ScalarEncoder:

    def __init__(self, minimum=0, maximum=100, buckets=100, width=5, n=None):
        self.vmin = minimum
        self.vmax = maximum
        self.vrange = self.vmax - self.vmin
        self.width = width
        if n is None:
            self.buckets = buckets
            self.n = buckets + width - 1  # inverse of buckets = n - width + 1
        else:
            self.n = n
            self.buckets = n - width + 1

    def encode(self, value):
        # Map the value to a bucket index, clamping so the run of
        # `width` active bits always fits inside the n-bit encoding
        # (without the clamp, value == vmax overflows the last bucket).
        i = math.floor(self.buckets * ((value - self.vmin) / float(self.vrange)))
        i = min(i, self.buckets - 1)
        rv = np.zeros(self.n, dtype='uint8')
        rv[i : i + self.width] = 1
        return rv

    def decode(self, data):
        # The position of the first active bit is the bucket index.
        tmp = np.where(data == 1)[0]
        i = 0 if len(tmp) == 0 else tmp[0]
        value = (i * self.vrange) / float(self.buckets) + self.vmin
        return math.floor(value)
-------| http://ifni.co
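As a quick sanity check, here is the encoder in action (class repeated so the snippet runs standalone). The point to notice is that adjacent values share width-1 of their active bits, which is the overlap property the paper builds on:

```python
import math

import numpy as np


class ScalarEncoder:

    def __init__(self, minimum=0, maximum=100, buckets=100, width=5, n=None):
        self.vmin = minimum
        self.vmax = maximum
        self.vrange = self.vmax - self.vmin
        self.width = width
        if n is None:
            self.buckets = buckets
            self.n = buckets + width - 1
        else:
            self.n = n
            self.buckets = n - width + 1

    def encode(self, value):
        # Clamp the bucket index so the `width`-bit run fits in n bits.
        i = math.floor(self.buckets * ((value - self.vmin) / float(self.vrange)))
        i = min(i, self.buckets - 1)
        rv = np.zeros(self.n, dtype='uint8')
        rv[i : i + self.width] = 1
        return rv

    def decode(self, data):
        tmp = np.where(data == 1)[0]
        i = 0 if len(tmp) == 0 else tmp[0]
        return math.floor((i * self.vrange) / float(self.buckets) + self.vmin)


enc = ScalarEncoder(minimum=0, maximum=100, buckets=100, width=5)
a = enc.encode(42)          # bits 42..46 active
b = enc.encode(43)          # bits 43..47 active
overlap = int((a & b).sum())  # 4 shared bits = width - 1
round_trip = enc.decode(a)    # 42
```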


On Sun, Feb 21, 2016 at 2:52 PM, Alex Lavin <[email protected]> wrote:
> Hi mraptor,
> I recommend taking a look at a new paper of ours:
> Scott Purdy, "Encoding Data for HTM Systems":
> http://arxiv.org/abs/1602.05925.
>
> In your temperature encoding example, knowing the range of possible values a
> priori is indeed useful. You would simply use two independent scalar
> encoders, one for each of the two scenarios -- habitable temperatures and
> chemical reaction temperatures. The random distributed scalar encoder [1]
> comes in handy when the temperature ranges are not known; it dynamically
> adjusts the range as the min and/or max change with new data.
>
> Regarding word embeddings, you're correct that "distributed representation
> is a result of the context of usage"; the underlying assumption in
> state-of-the-art word embedding methods is that words appearing in similar
> contexts have similar meanings. By sliding a window through some corpus of
> text (with various tricks), dense distributed representations are learned,
> specifically toward use in a deep learning network. Similarly, Cortical.io
> [2] creates sparse distributed representations (SDRs), which are potentially
> more useful in a range of NLP tasks, and can be used as input to HTM models.
> To encode text into SDRs check out the python[3] or java[4] clients for
> their API.
>
> Hopefully this info will help clear up any confusion you may have!
>
> [1]
> https://github.com/numenta/nupic/blob/master/src/nupic/encoders/random_distributed_scalar.py
> [2] http://www.cortical.io/technology.html
> [3] https://github.com/cortical-io/retina-sdk.py
> [4] https://github.com/cortical-io/retina-api-java-sdk
>
> Cheers,
> Alex
>
> Alexander Lavin
> Software Engineer
> Numenta
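Out of curiosity, here is a toy sketch of the range-free idea from the reply: map a value to a run of consecutive component indices, and hash each component to a pseudo-random bit position, so no min/max is needed up front. This is only an illustration under my own assumptions, not Numenta's actual RDSE implementation; see [1] for the real one.

```python
import hashlib

import numpy as np


class ToyRDSE:
    """Toy illustration only: the real RDSE [1] grows bucket bit sets
    incrementally; here each component index is hashed to a bit."""

    def __init__(self, resolution=1.0, n=400, w=21, seed=0):
        self.resolution = resolution  # bucket size; no min/max needed
        self.n = n                    # total bits in the encoding
        self.w = w                    # components (~active bits) per value
        self.seed = seed

    def _bit(self, i):
        # Deterministically map component index i to a bit position.
        h = hashlib.md5(("%d:%d" % (self.seed, i)).encode()).hexdigest()
        return int(h, 16) % self.n

    def encode(self, value):
        b = int(np.floor(value / self.resolution))
        rv = np.zeros(self.n, dtype='uint8')
        # Adjacent buckets share w - 1 components, hence most bits.
        for i in range(b, b + self.w):
            rv[self._bit(i)] = 1
        return rv
```

Nearby values get highly overlapping encodings, distant values barely overlap, and values far outside anything seen so far still encode fine; the price is that hash collisions can leave slightly fewer than w active bits.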
