This, or something similar, would be a good task area for the next hackathon. If some people are looking for a project to work on we could have this prepared in advance.
Jeff

From: nupic [mailto:[email protected]] On Behalf Of Francisco Webber
Sent: Saturday, August 24, 2013 10:24 AM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] HTM in Natural Language Processing

For those who don't want to use the API, and for evaluation purposes, I propose that we choose a reference text and that I convert it into a sequence of SDRs. This file could be used for training.

I would also generate a list of all words contained in the text, together with their SDRs, to be used as a conversion table.

As a simple test, we could feed a sequence of SDRs into a trained network and see whether the HTM correctly predicts the following word(s).

The last file needed for a complete framework would be a list of, let's say, 100 word sequences with their correct continuations. The word sequences could be, for example, the beginnings of phrases with more than n words (n being the number of steps ahead that the CLA can predict).

This could be the beginning of a measuring setup that allows us to compare different flavors of CLA implementation.

Any suggestions for a text to choose?

Francisco

On 24.08.2013, at 17:12, Matthew Taylor wrote:

Very cool, Francisco. Here is where you can get cept API credentials:
https://cept.3scale.net/signup

---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Fri, Aug 23, 2013 at 5:07 PM, Francisco Webber <[email protected]> wrote:

Just a short postscript: the public version of our API doesn't actually contain the generic conversion function. But if people from the HTM community want to experiment, just click the "Request for Beta-Program" button and I will upgrade your accounts manually.

Francisco

On 24.08.2013, at 01:59, Francisco Webber wrote:

> Jeff,
> I thought about this already.
> We have a REST API where you can send a word in and get the SDR back, and vice versa.
> I invite all who want to experiment to try it out.
> You just need to get credentials at our website: www.cept.at <http://www.cept.at/>.
>
> In the mid-term it would be cool to create some sort of evaluation set that could be used to measure progress while improving the CLA.
>
> We are continuously improving our Retina, but the version that is currently online already works pretty well.
>
> I hope that will help.
>
> Francisco
>
> On 24.08.2013, at 01:46, Jeff Hawkins wrote:
>
>> Francisco,
>> Your work is very cool. Do you think it would be possible to make available
>> your word SDRs (or a sufficient subset of them) for experimentation? I
>> imagine there would be interest in the NuPIC community in training a CLA
>> on text using your word SDRs. You might get some useful results more
>> quickly. You could do this under a research-only license or something like
>> that.
>> Jeff
>>
>> -----Original Message-----
>> From: nupic [mailto:[email protected]] On Behalf Of Francisco
>> Webber
>> Sent: Wednesday, August 21, 2013 1:01 PM
>> To: NuPIC general mailing list.
>> Subject: Re: [nupic-dev] HTM in Natural Language Processing
>>
>> Hello,
>> I am one of the founders of CEPT Systems and lead researcher of our retina
>> algorithm.
>>
>> We have developed a method to represent words by a bitmap pattern capturing
>> most of their "lexical semantics" (a text sensor). Our word-SDRs fulfill all
>> the requirements for "good" HTM input data:
>>
>> - Words with similar meaning "look" similar
>> - If you drop random bits in the representation the semantics remain intact
>> - Only a small number (up to 5%) of bits are set in a word-SDR
>> - Every bit in the representation corresponds to a specific semantic feature
>> of the language used
>> - The Retina (sensory organ for an HTM) can be trained on any language
>> - The Retina training process is fully unsupervised.
>>
>> We have found that the word-SDR by itself (without using any HTM yet)
>> can improve many NLP problems that are only poorly solved using the
>> traditional statistical approaches.
>> We use the SDRs to:
>> - Create fingerprints of text documents, which allows us to compare them for
>> semantic similarity using simple (Euclidean) similarity measures
>> - Automatically detect polysemy and disambiguate multiple meanings
>> - Characterize any text with context terms for automatic
>> search-engine query expansion
>>
>> We hope to successfully link up our Retina to an HTM network to go beyond
>> lexical semantics into the field of "grammatical semantics".
>> This would hopefully lead to improved abstracting, conversation, question
>> answering, and translation systems.
>>
>> Our correct web address is www.cept.at <http://www.cept.at/> (no kangaroos in Vienna ;-)
>>
>> I am interested in any form of cooperation to apply HTM technology to text.
>>
>> Francisco
>>
>> On 21.08.2013, at 20:16, Christian Cleber Masdeval Braz wrote:
>>
>>>
>>> Hello.
>>>
>>> Like many of you here, I am pretty new to HTM technology.
>>>
>>> I am a researcher in Brazil and I am going to start my PhD program soon.
>>> My field of interest is NLP and the extraction of knowledge from text. I am
>>> thinking of using the ideas behind the Memory Prediction Framework to
>>> investigate semantic information retrieval from the Web, and to answer
>>> questions in natural language. I intend to use the HTM implementation as
>>> a base to do this.
>>>
>>> I would appreciate it a lot if someone could answer some questions:
>>>
>>> - Is there any research relating HTM and NLP? Could you point me to it?
>>>
>>> - Is HTM suited to address this problem? Could it learn, without
>>> supervision, the grammar of a language, or just help in some aspects such as
>>> Named Entity Recognition?
>>>
>>> Regards,
>>>
>>> Christian
>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
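
A minimal sketch, in Python, of the evaluation set-up Francisco proposes in this thread: a word-to-SDR conversion table plus a list of word sequences with their correct continuation. Everything here is assumed for illustration. The SDRs are made-up toy bit sets rather than CEPT Retina output, the names `overlap`, `nearest_word`, `evaluate` and `train_bigram` are hypothetical, and the bigram lookup is only a stand-in predictor, not the CLA or any NuPIC API.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

SDR = FrozenSet[int]  # an SDR stored as the indices of its active bits


def overlap(a: SDR, b: SDR) -> int:
    """Number of active bits shared by two SDRs."""
    return len(a & b)


def nearest_word(sdr: SDR, table: Dict[str, SDR]) -> str:
    """Decode an SDR to the table word with the largest overlap."""
    return max(table, key=lambda word: overlap(sdr, table[word]))


def evaluate(
    predict: Callable[[List[SDR]], SDR],
    table: Dict[str, SDR],
    cases: Sequence[Tuple[List[str], str]],
) -> float:
    """Fraction of test sequences whose next word is predicted correctly."""
    hits = 0
    for prefix, expected in cases:
        predicted = predict([table[word] for word in prefix])
        if nearest_word(predicted, table) == expected:
            hits += 1
    return hits / len(cases)


def train_bigram(
    sentences: Sequence[List[str]], table: Dict[str, SDR]
) -> Callable[[List[SDR]], SDR]:
    """Stand-in predictor (NOT the CLA): remembers the last SDR seen after each SDR."""
    memory: Dict[SDR, SDR] = {}
    for sentence in sentences:
        for current, following in zip(sentence, sentence[1:]):
            memory[table[current]] = table[following]

    def predict(sdrs: List[SDR]) -> SDR:
        # Unknown context yields an empty SDR.
        return memory.get(sdrs[-1], frozenset())

    return predict


# Toy conversion table with invented bits; real word-SDRs would come from the Retina.
TABLE: Dict[str, SDR] = {
    "the": frozenset({0, 1, 2}),
    "cat": frozenset({10, 11, 12}),
    "dog": frozenset({10, 11, 13}),  # shares bits with "cat": similar meaning
    "sat": frozenset({20, 21, 22}),
    "ran": frozenset({20, 23, 24}),
}

predict = train_bigram([["the", "cat", "sat"], ["the", "dog", "ran"]], TABLE)

# Word sequences paired with their correct continuation.
CASES = [
    (["the", "cat"], "sat"),
    (["the", "dog"], "ran"),
    (["cat"], "ran"),  # deliberately wrong for this toy predictor
]

accuracy = evaluate(predict, TABLE, CASES)
print(f"next-word accuracy: {accuracy:.2f}")
```

Decoding by largest overlap is one simple choice; it also tolerates dropped bits, since `nearest_word(frozenset({10, 12}), TABLE)` still resolves to "cat". Any CLA implementation could be compared on the same files by passing its own prediction function as `predict`.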
