Daniel,

What version of Python are you using? What version of NuPIC are you using? (cd into the checkout directory, run "git log -1", and paste the commit SHA.)
Thanks,
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Wed, Oct 16, 2013 at 12:57 AM, Daniel Jachyra <[email protected]> wrote:
> Please, could you help me with the following error:
>
> nupic@nupic-vm:~/nupic_nlp-master$ ./run_association_experiment.py resources/animals.txt resources/vegetables.txt -p 100 -t 1000
> Prediction output for 1000 pairs of terms
>
> #COUNT  TERM ONE  TERM TWO  | TERM TWO PREDICTION
> --------------------------------------------------------------------
> Traceback (most recent call last):
>   File "./run_association_experiment.py", line 80, in <module>
>     main()
>   File "./run_association_experiment.py", line 76, in main
>     runner.random_dual_association(args[0], args[1])
>   File "/home/nupic/nupic_nlp-master/nupic_nlp/runner.py", line 65, in random_dual_association
>     self.associate(associations)
>   File "/home/nupic/nupic_nlp-master/nupic_nlp/runner.py", line 40, in associate
>     term2_prediction = self._feed_term(term1, fetch_result)
>   File "/home/nupic/nupic_nlp-master/nupic_nlp/runner.py", line 75, in _feed_term
>     predicted_bitmap = self.nupic.feed(sdr_array)
>   File "/home/nupic/nupic_nlp-master/nupic_nlp/nupic_words.py", line 24, in feed
>     predicted_cells = tp.getPredictedState()
>   File "/home/nupic/nta/eng/lib/python2.7/site-packages/nupic/research/TP10X2.py", line 296, in __getattr__
>     raise AttributeError("'TP' object has no attribute '%s'" % name)
> AttributeError: 'TP' object has no attribute 'getPredictedState'
>
> -----Original Message-----
> From: nupic [mailto:[email protected]] On Behalf Of Matthew Taylor
> Sent: Monday, October 07, 2013 1:41 AM
> To: NuPIC general mailing list.
> Subject: Re: [nupic-dev] NLP experiments with NuPIC
>
> I've added some work to my NuPIC / NLP repo that does POS predictions:
>
> https://github.com/rhyolight/nupic_nlp#parts-of-speech
>
> This experiment does not require the CEPT API, so anyone should be able to run it just by checking it out and installing.
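[Editor's note: for anyone hitting the same AttributeError — the traceback above shows that the C++-backed TP10X2 class raises AttributeError for getPredictedState, which the pure-Python TP exposes. A minimal defensive-lookup sketch (the two stand-in classes below are illustrations, not real NuPIC classes):

```python
class Tp10x2Like(object):
    """Stand-in mimicking TP10X2's behavior per the traceback above:
    unknown attribute lookups raise AttributeError."""
    def __getattr__(self, name):
        raise AttributeError("'TP' object has no attribute '%s'" % name)


class PyTpLike(object):
    """Stand-in for a TP implementation that does expose getPredictedState()."""
    def getPredictedState(self):
        return [0, 1, 0]  # dummy predicted-cell array


def predicted_state(tp):
    """Return the TP's predicted state, or None if this TP build
    lacks the method (getattr's default swallows the AttributeError)."""
    get = getattr(tp, "getPredictedState", None)
    if get is None:
        return None
    return get()


print(predicted_state(Tp10x2Like()))  # → None
print(predicted_state(PyTpLike()))    # → [0, 1, 0]
```

The real fix is to construct the pure-Python TP (or a TP build that implements the method) in nupic_words.py; the guard above just makes the failure explicit instead of a crash.]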
> It parses a given corpus, decodes all the parts-of-speech tags for each sentence, and uses a category encoder to pass the POS into NuPIC, predicting the next POS.
>
> Here is some example output:
>
> $ ./run_pos_experiment.py -t 06_how_thor_got_the_hammer.txt
> ...
> All      determiner   pronoun
> the      determiner   noun
> gods     noun         noun
> felt     past tense   .
> very     adverb       preposition
> sorry    adjective    proper noun
> for      preposition  noun
> little   adjective    pronoun
> Brok     proper noun  noun
> .        .            past tense
> They     pronoun      pronoun
> thought  past tense   past tense
> Loki     proper noun  pronoun
> '        past tense
> s        noun         noun
> things   noun         .
> were     past tense   .
> fine     noun         preposition
> .        .            .
> ...
>
> Column 1: input words
> Column 2: POS
> Column 3: predicted POS for the same word
>
> There are some interesting things here. NuPIC commonly predicts a pronoun as the first word after a sentence, because that's the most common word starting a sentence within the corpus. It also always predicts that a noun will follow a determiner, because they usually do.
>
> While NuPIC isn't doing great, it does tend to pick up small POS phrases, and is pretty good at predicting the ends of sentences. But this POS problem is not something I'd expect it to nail, frankly. It's not something a human can do well either. Each phrase is a tree, and at any point in the phrase it could branch in multiple directions. NuPIC is going to make its best guess, but will likely be wrong most of the time. A more interesting experiment would be to turn this into an anomaly experiment. Once it's been trained on some text, incoming nonsense grammar should trigger high anomaly scores.
>
> Another thing you might note is that NLTK doesn't tag all the words properly. Words like "bit" are commonly mis-categorized as a noun instead of a verb in phrases like "the horse bit the dog", and vice versa. If anyone is experienced with NLTK, I'd be happy to get some help improving POS tag accuracy.
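[Editor's note: a toy sketch of the category-encoding idea mentioned above. This is a simplified illustration, not NuPIC's actual SDRCategoryEncoder: each POS tag simply gets its own non-overlapping block of active bits, so distinct categories never share bits.

```python
def encode_category(category, categories, width_per_category=3):
    """Return a dense 0/1 list in which each category owns one
    contiguous, non-overlapping block of active bits."""
    total_width = width_per_category * len(categories)
    encoding = [0] * total_width
    start = categories.index(category) * width_per_category
    for i in range(start, start + width_per_category):
        encoding[i] = 1
    return encoding


pos_tags = ["noun", "verb", "determiner", "pronoun"]
print(encode_category("verb", pos_tags))
# → [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
```

Because the blocks don't overlap, every tag's encoding is maximally distinct — which is what you want for categorical inputs with no natural similarity between values.]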
>
> I don't have time to continue these experiments, but I hope this lays some of the groundwork for anyone interested in the NLP focus of the Hackathon. I've added this to our list of NLP challenges on our wiki:
>
> https://github.com/numenta/nupic/wiki/Natural-Language-Processing#challenges
> ---------
> Matt Taylor
> OS Community Flag-Bearer
> Numenta
>
> On Thu, Oct 3, 2013 at 10:01 AM, Matthew Taylor <[email protected]> wrote:
>> Oh, by the way, keep in mind that I'm still a Python novice. Improvements, clarifications, and pull requests are welcome!
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>>
>> On Thu, Oct 3, 2013 at 9:59 AM, Matthew Taylor <[email protected]> wrote:
>>> I've been putting together some experiments with NLP and CEPT's word SDRs. Thanks to Subutai and Francisco for your help with this.
>>>
>>> I've got some initial decent results, at least proving that we can take CEPT's SDRs as input for the CLA, get predicted SDRs back out, and get the "similar terms" for the SDR from CEPT's API.
>>>
>>> https://github.com/rhyolight/nupic_nlp
>>>
>>> The README on that repo is extensive, so if you are interested, please get a CEPT API key[1] and try it out with your own word associations.
>>> Here is an example (from the README):
>>>
>>> $ ./run_association_experiment.py resources/animals.txt resources/vegetables.txt -p 100 -t 1000
>>> Prediction output for 1000 pairs of terms
>>>
>>> #COUNT  TERM ONE    TERM TWO   | TERM TWO PREDICTION
>>> --------------------------------------------------------------------
>>> # 100   salmon      endive     | lentil
>>> # 101   crocodile   borage     |
>>> # 102   wolf        turmeric   | amaranth
>>> # 103   termite     chickweed  |
>>> # 104   quail       poke       |
>>> # 105   woodpecker  shallot    |
>>> # 106   echidna     caper      | tomato
>>> # 107   panther     guar       |
>>> # 108   ape         tomatillo  | chrysanthemum
>>> # 109   bee         cabbage    |
>>> # 110   seahorse    sorrel     |
>>> # 111   camel       tomatillo  | lemongrass
>>> # 112   rat         chives     |
>>> # 113   crab        yam        | turnip
>>>
>>> This script takes a random term from the first file and a random term from the second. It converts each term to an SDR through the CEPT API and feeds term #1 and term #2 into NuPIC, bypassing the spatial pooler and sending it right into the TP (as described in the hello_tp example[2]). The next prediction after feeding in term #1 is preserved and printed to the console. Then it resets the TP so that it can only learn that simple one->two relationship. In the sample above, NuPIC should only be predicting plants or vegetables, given that the association I'm training it on is "animal" --> "vegetable".
>>>
>>> This trivial example seems to be working rather well, although NuPIC doesn't always have a valid SDR prediction. The predictions it does create almost always seem to be some sort of plant. Even more interesting is that sometimes NuPIC predicts SDRs that resolve to words outside the range of the input values.
>>>
>>> Happy hacking!
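[Editor's note: a sketch of the sparse-to-dense step this pipeline performs before the SDR can be handed to the TP. It assumes (as the traceback's `sdr_array` variable suggests) that a word's SDR arrives from the CEPT API as a list of ON-bit indices within a fixed-size retina; `bitmap_to_dense` is a hypothetical helper name, not the repo's actual function.

```python
def bitmap_to_dense(on_bits, width):
    """Expand a sparse list of active-bit indices into a dense
    0/1 array of the given width, suitable for tp.compute()."""
    dense = [0] * width
    for i in on_bits:
        dense[i] = 1
    return dense


# Tiny 10-bit retina for illustration; CEPT's real retinas are much larger.
print(bitmap_to_dense([0, 5, 7], width=10))
# → [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]
```

The experiment then feeds the dense arrays for term #1 and term #2 to the TP in sequence and resets it after each pair, so the only temporal structure it can learn is the one->two association.]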
>>> ---------
>>> Matt Taylor
>>> OS Community Flag-Bearer
>>> Numenta
>>>
>>> [1] https://cept.3scale.net/signup (YOU MUST upgrade your account to use the API endpoints this project requires; email [email protected] and tell him you're working on NuPIC NLP tasks and he'll upgrade you.)
>>> [2] https://github.com/numenta/nupic/blob/master/examples/tp/hello_tp.py

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
