Hey Ryan,

For what it's worth, I actually agree with your initial point. I believe
that humans can create better encoders than swarming can. The example I
always use is from the point of view of American Express: do customer
payments of $100 and $200, or of $100 and $199.97, look more similar? If
the answer is the former, that similarity can't really be captured by
swarming.
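To make that concrete, here is a toy scalar-style encoder (just an
illustration I put together, not NuPIC's actual ScalarEncoder) showing why a
purely numeric encoding treats $199.97 and $200 as nearly identical, while
any "round payment amount" similarity between $100 and $200 is invisible to
it:

```python
def encode(value, minval=0.0, maxval=1000.0, n=400, w=21):
    """Toy scalar encoder: a contiguous run of w active bits out of n,
    positioned proportionally to the value. Illustration only."""
    i = int((value - minval) / (maxval - minval) * (n - w))
    return set(range(i, i + w))

def overlap(a, b):
    """Number of active bits two encodings share -- the SDR notion of
    similarity."""
    return len(encode(a) & encode(b))

print(overlap(100, 200))     # 0  -- numerically distant, no shared bits
print(overlap(200, 199.97))  # 21 -- effectively identical encodings
```

A hand-crafted encoder could add dedicated bits for a property like "round
dollar amount", so $100 and $200 would share semantics that raw magnitude
misses, and swarming over a standard scalar encoder's parameters will never
discover that on its own.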

Moreover, imagine the HTMs of the future, once hierarchy is implemented.
You will have models with 500 input fields and 50 predicted fields. I am
not sure that swarming will scale well to that, given how computationally
intensive it is. I think that creating encoders is a valuable skill to
master in preparation for that future.

The extreme example of that is Pandora. They hired 4 musicians to listen to
music for 10 years and describe it in something like SDR terms (literally
hundreds of binary questions about each song). Though that level of
fidelity might not be required in all applications =)

Sergey

On Fri, Dec 4, 2015 at 2:25 PM, Ryan Singer <[email protected]> wrote:

> Thank you all for those clear answers. The analogy of tuning retinas to
> the environment clicked for me.
>
> On Fri, Dec 4, 2015 at 4:08 PM Subutai Ahmad <[email protected]> wrote:
>
>> Hi Ryan,
>>
>> To add to Matt's and Marcus's replies, we have recently been developing
>> a better mathematical understanding of the various parameters [1]. I
>> think the math provides a lot of insight into how to set many of them.
>> We haven't folded this into the swarming process yet - it could speed
>> things up quite a bit.  At the end of the day I believe that most of the
>> parameters can be set once, and won't need to be tuned based on specific
>> problems.
>>
>> However there are currently still some parameters that are dependent on
>> inherent properties of a specific data source. For example, you might need
>> to set the encoder parameters differently depending on whether the signal
>> is really noisy or really clean.  In biology, the retinas of different
>> animals are finely tuned to their specific environments.  You might also be
>> able to do this analytically but no one has figured this out yet.
>>
>> In that sense swarming is just a temporary hack for letting the computer
>> do the work for stuff we don't yet know how to figure out analytically.
>>
>> --Subutai
>>
>> [1] http://arxiv.org/abs/1503.07469
>>
>>
>> On Fri, Dec 4, 2015 at 1:20 PM, Matthew Taylor <[email protected]> wrote:
>>
>>> Hi Ryan,
>>>
>>> You are correct, a human could procure model parameters without
>>> swarming, but it might take a very long time and a lot of experiments. That
>>> is what swarming was designed to do -- test out a bunch of different
>>> permutations of model parameters and see which ones are the best.
>>>
>>> Also, a human must choose what data fields are available to a swarm in
>>> the first place. We usually do this by guessing what data might affect the
>>> values of other data over time. Swarming helps us refine exactly how much
>>> each data field contributes to the predicted field.
>>>
>>> We have found that correlations between different input fields are
>>> sometimes not evident to the human eye, and swarming can uncover
>>> correlations that a human would rarely think logical. For an interesting
>>> discussion of this, see this mailing list thread [1].
>>>
>>> As you probably know, streams of data can have lots of different
>>> "shapes". Some have many fields, some have only one. Some fields are highly
>>> affected by the values of other fields. Some data patterns are daily, some
>>> hourly, some have no regard to time whatsoever.
>>>
>>> Finding out which input data is relevant and how to encode it is what
>>> swarming does. It also permutes over encoder parameters to find the best
>>> way to encode the input data into SDRs. This video explains it in detail
>>> [2] in case you have not seen it.
>>>
>>> Generally, you only need to swarm as a pre-processing step, and once you
>>> find good model parameters, you can feed data into an HTM model over the
>>> data's lifetime. As the patterns in the data change, the model will learn
>>> those changes online. You generally only need to re-swarm if the "shape" of
>>> the data changes.
>>>
>>> I hope this was helpful; please let me know if I answered your
>>> questions. If you are just getting started with NuPIC, you might want to
>>> try out Menorah [3] to run some models very quickly based upon data readily
>>> available in River View [4].
>>>
>>> [1]
>>> http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2015-October/011878.html
>>> [2] https://www.youtube.com/watch?v=xYPKjKQ4YZ0
>>> [3] https://github.com/nupic-community/menorah
>>> [4] http://data.numenta.org/index.html
>>>
>>> Regards,
>>> ---------
>>> Matt Taylor
>>> OS Community Flag-Bearer
>>> Numenta
>>>
>>> On Fri, Dec 4, 2015 at 12:28 PM, Ryan Singer <[email protected]> wrote:
>>>
>>>> Hello NuPIC community,
>>>>
>>>> I'm just getting started. I'm excited about NuPIC because the HTM model
>>>> and SDR data structure make so much sense intuitively. Finally, an approach
>>>> that doesn't feel arbitrary and over-engineered.
>>>>
>>>> However I'm confused about why swarming is necessary when configuring
>>>> models. I've been reading all the docs and I haven't yet found an
>>>> explanation of why a human can't arrive at the right model parameters
>>>> through reasoning.
>>>>
>>>> Am I missing some documentation or video that explains this? Any help
>>>> is appreciated.
>>>>
>>>> Ryan
>>>>
>>>
>>>
>>
