Re: [nupic-dev] HTM in Natural Language Processing

Jeff Hawkins Sun, 25 Aug 2013 12:21:41 -0700

This, or something similar, would be a good task area for the next
hackathon.  If some people are looking for a project to work on we could
have this prepared in advance.


Jeff 

 

From: nupic [mailto:[email protected]] On Behalf Of Francisco
Webber
Sent: Saturday, August 24, 2013 10:24 AM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] HTM in Natural Language Processing

 

For those who don't want to use the API and for evaluation purposes, I would
propose that we choose some reference text and I convert it into a sequence
of SDRs. This file could be used for training.

I would also generate a list of all words contained in the text, together
with their SDRs, to be used as a conversion table.

As a simple test measure we could feed a sequence of SDRs into a trained
network and see if the HTM makes the right prediction about the following
word(s). 

The last file to produce for a complete framework would be a list of, let's
say, 100 word sequences with their correct continuations.

The word sequences could be, for example, the beginnings of phrases with more
than n words (n being the number of steps ahead that the CLA can predict).

This could be the beginning of a measuring set-up that allows us to compare
different CLA implementation flavors.
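
A rough sketch of how such a measuring set-up could be wired together is
given here. It is only an illustration under assumptions: an SDR is modelled
as a set of active bit indices, the conversion table is a plain dict, the
names evaluate, closest_word and dummy_predict are made up for this example,
and dummy_predict merely stands in for a trained CLA (it does not call NuPIC
or the CEPT API).

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

SDR = FrozenSet[int]  # assumption: an SDR is the set of its active bit indices


def overlap(a: SDR, b: SDR) -> int:
    """Number of active bits two SDRs share."""
    return len(a & b)


def closest_word(sdr: SDR, table: Dict[str, SDR]) -> str:
    """Map a predicted SDR back to the table word with the largest overlap."""
    return max(table, key=lambda w: overlap(sdr, table[w]))


def evaluate(
    predict: Callable[[Sequence[SDR]], SDR],
    table: Dict[str, SDR],
    test_set: List[Tuple[List[str], str]],
) -> float:
    """Fraction of test sequences whose predicted next word is correct."""
    hits = 0
    for words, expected in test_set:
        predicted_sdr = predict([table[w] for w in words])
        if closest_word(predicted_sdr, table) == expected:
            hits += 1
    return hits / len(test_set)


# Toy data standing in for the real conversion table and test file.
table = {
    "the": frozenset({0, 1, 2}),
    "cat": frozenset({10, 11, 12}),
    "sat": frozenset({20, 21, 22}),
    "dog": frozenset({10, 11, 13}),
}
test_set = [
    (["the", "cat"], "sat"),
    (["the", "dog"], "sat"),
    (["cat", "sat"], "the"),
]


def dummy_predict(sdrs: Sequence[SDR]) -> SDR:
    """Hypothetical stand-in for a trained CLA: always a noisy 'sat'."""
    return frozenset({20, 21, 99})


accuracy = evaluate(dummy_predict, table, test_set)
print("accuracy:", accuracy)
```

Any CLA flavor that can be wrapped in a function taking a list of SDRs and
returning a predicted SDR could be plugged in for dummy_predict, so the same
table and test file would score every implementation the same way.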

 

Any suggestions for a text to choose?

 

Francisco

 

On 24.08.2013, at 17:12, Matthew Taylor wrote:





Very cool, Francisco. Here is where you can get cept API credentials:
https://cept.3scale.net/signup




---------

Matt Taylor

OS Community Flag-Bearer

Numenta

 

On Fri, Aug 23, 2013 at 5:07 PM, Francisco Webber <[email protected]> wrote:

Just a short post scriptum:

The public version of our API doesn't actually contain the generic
conversion function. But if people from the HTM community want to experiment,
just click the "Request for Beta-Program" button and I will upgrade your
accounts manually.

Francisco


On 24.08.2013, at 01:59, Francisco Webber wrote:

> Jeff,
> I thought about this already.
> We have a REST API where you can send a word in and get the SDR back, and
> vice versa.
> I invite all who want to experiment to try it out.
> You just need to get credentials at our website: www.cept.at
> <http://www.cept.at/>.
>
> In the mid-term it would be cool to create some sort of evaluation set that
> could be used to measure progress while improving the CLA.
>
> We are continuously improving our Retina, but the version that is currently
> online works pretty well already.
>
> I hope that will help.
>
> Francisco
>
> On 24.08.2013, at 01:46, Jeff Hawkins wrote:
>
>> Francisco,
>> Your work is very cool.  Do you think it would be possible to make available
>> your word SDRs (or a sufficient subset of them) for experimentation?  I
>> imagine there would be interest in the NuPIC community in training a CLA
>> on text using your word SDRs.  You might get some useful results more
>> quickly.  You could do this under a research-only license or something like
>> that.
>> Jeff
>>
>> -----Original Message-----
>> From: nupic [mailto:[email protected]] On Behalf Of Francisco Webber
>> Sent: Wednesday, August 21, 2013 1:01 PM
>> To: NuPIC general mailing list.
>> Subject: Re: [nupic-dev] HTM in Natural Language Processing
>>
>> Hello,
>> I am one of the founders of CEPT Systems and lead researcher of our Retina
>> algorithm.
>>
>> We have developed a method to represent a word by a bitmap pattern capturing
>> most of its "lexical semantics" (a text sensor). Our word-SDRs fulfill all
>> the requirements for "good" HTM input data.
>>
>> - Words with similar meaning "look" similar
>> - If you drop random bits in the representation, the semantics remain intact
>> - Only a small number (up to 5%) of bits are set in a word-SDR
>> - Every bit in the representation corresponds to a specific semantic feature
>>   of the language used
>> - The Retina (sensory organ for an HTM) can be trained on any language
>> - The Retina training process is fully unsupervised.
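
The sparsity and bit-dropping properties in the list above can be illustrated
with a small sketch. It uses random bit sets rather than real Retina output,
and the SDR size (N_BITS) and number of active bits (N_ACTIVE) are assumptions
chosen for the example, not values stated in this thread.

```python
import random

N_BITS = 16384   # assumed SDR size, not taken from the thread
N_ACTIVE = 328   # about 2% of the bits, within the "up to 5%" bound

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def random_sdr():
    """A random SDR, modelled as the set of its active bit indices."""
    return frozenset(rng.sample(range(N_BITS), N_ACTIVE))


def overlap(a, b):
    """Number of active bits two SDRs share."""
    return len(a & b)


def drop_bits(sdr, fraction):
    """Return a copy of sdr with the given fraction of active bits removed."""
    keep = int(round(len(sdr) * (1 - fraction)))
    return frozenset(rng.sample(sorted(sdr), keep))


word = random_sdr()
unrelated = random_sdr()
damaged = drop_bits(word, 0.3)  # lose 30% of the active bits

sparsity = len(word) / N_BITS
print("sparsity:", sparsity)
print("overlap(word, damaged):", overlap(word, damaged))
print("overlap(word, unrelated):", overlap(word, unrelated))
```

Because so few bits are active, two unrelated random patterns share only a
handful of bits by chance, while the damaged copy still shares every bit it
kept with the original. That gap is what makes the representation tolerant
of dropped bits.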
>>
>> We have found that the word-SDR by itself (without using any HTM yet)
>> can improve results on many NLP problems that are only poorly solved using
>> the traditional statistical approaches.
>> We use the SDRs to:
>> - Create fingerprints of text documents, which allows us to compare them for
>>   semantic similarity using simple (Euclidean) similarity measures
>> - Automatically detect polysemy and disambiguate multiple meanings
>> - Characterize any text with context terms for automatic
>>   search-engine query expansion
>>
>> We hope to successfully link up our Retina to an HTM network to go beyond
>> lexical semantics into the field of "grammatical semantics".
>> This would hopefully lead to improved abstracting, conversation,
>> question-answering and translation systems.
>>
>> Our correct web address is www.cept.at <http://www.cept.at/> (no
>> kangaroos in Vienna ;-)
>>
>> I am interested in any form of cooperation to apply HTM technology to text.
>>
>> Francisco
>>
>> On 21.08.2013, at 20:16, Christian Cleber Masdeval Braz wrote:
>>
>>>
>>> Hello.
>>>
>>> Like many of you here, I am pretty new to HTM technology.
>>>
>>> I am a researcher in Brazil and I am going to start my PhD program soon.
>>> My field of interest is NLP and the extraction of knowledge from text. I am
>>> thinking of using the ideas behind the Memory Prediction Framework to
>>> investigate semantic information retrieval from the Web and to answer
>>> questions in natural language. I intend to use the HTM implementation as
>>> the basis for this.
>>>
>>> I would appreciate it a lot if someone could answer some questions:
>>>
>>> - Is there any research relating HTM to NLP? Could you point me to it?
>>>
>>> - Is HTM suitable for addressing this problem? Could it learn, without
>>> supervision, the grammar of a language, or just help with some aspects
>>> such as Named Entity Recognition?
>>>
>>>
>>>
>>> Regards,
>>>
>>> Christian
>>>
>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org