I've been putting together some experiments with NLP and CEPT's word SDRs. Thanks to Subutai and Francisco for your help with this.
I've got some decent initial results, at least proving that we can take CEPT's SDRs as input for the CLA, get predicted SDRs back out, and get the "similar terms" for each predicted SDR from CEPT's API.

https://github.com/rhyolight/nupic_nlp

The README on that repo is extensive, so if you are interested, please get a CEPT API key[1] and try it out with your own word associations. Here is an example (from the README):

$ ./run_association_experiment.py resources/animals.txt resources/vegetables.txt -p 100 -t 1000
Prediction output for 1000 pairs of terms

#COUNT        TERM ONE        TERM TWO | TERM TWO PREDICTION
--------------------------------------------------------------------
#  100          salmon          endive | lentil
#  101       crocodile          borage |
#  102            wolf        turmeric | amaranth
#  103         termite       chickweed |
#  104           quail            poke |
#  105      woodpecker         shallot |
#  106         echidna           caper | tomato
#  107         panther            guar |
#  108             ape       tomatillo | chrysanthemum
#  109             bee         cabbage |
#  110        seahorse          sorrel |
#  111           camel       tomatillo | lemongrass
#  112             rat          chives |
#  113            crab             yam | turnip

This script takes a random term from the first file and a random term from the second. It converts each term to an SDR through the CEPT API and feeds term #1 and then term #2 into NuPIC, bypassing the spatial pooler and sending the SDRs straight into the TP (as described in the hello_tp example[2]). The prediction NuPIC makes after being fed term #1 is preserved and printed to the console. The script then resets the TP so that it can only learn that simple one->two relationship.

In the sample above, NuPIC should only be predicting plants or vegetables, given that the association I'm training it on is "animal" --> "vegetable". This trivial example seems to be working rather well, although NuPIC doesn't always have a valid SDR prediction (those are the blank entries in the last column). The predictions it does create almost always seem to be some sort of plant. Even more interesting is that sometimes NuPIC predicts SDRs that resolve to words outside the range of the input values.

Happy hacking!
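P.S. For anyone who would rather see the loop than read about it, here is a minimal Python sketch of the procedure described above. It is not the code from the repo and it uses neither NuPIC nor the CEPT API. term_to_sdr, ToyPairMemory, and run_association are hypothetical stand-ins made up for illustration: the first fakes an SDR lookup with seeded random bits, and the second is a plain dictionary that only remembers exact transitions, so unlike the TP it cannot generalize to terms it has not seen.

```python
import random


def term_to_sdr(term, size=2048, active=40):
    """Hypothetical stand-in for the CEPT lookup (no network call).

    Returns a deterministic set of active bit indices for a term. Real CEPT
    SDRs encode meaning; these seeded random ones do not. The size and the
    number of active bits are arbitrary choices for this sketch.
    """
    rng = random.Random(term)
    return frozenset(rng.sample(range(size), active))


class ToyPairMemory:
    """Hypothetical stand-in for the TP: remembers which SDR followed which."""

    def __init__(self):
        self._transitions = {}
        self._previous = None

    def compute(self, sdr):
        # Learn the transition previous -> current, if there is a previous input.
        if self._previous is not None:
            self._transitions[self._previous] = sdr
        self._previous = sdr
        # Return the prediction for the next input (empty if nothing is known).
        return self._transitions.get(sdr, frozenset())

    def reset(self):
        # Forget the sequence position so the next pair starts fresh.
        self._previous = None


def run_association(first_terms, second_terms, total=20, seed=0):
    """Feed random (term one, term two) pairs and record each prediction."""
    rng = random.Random(seed)
    memory = ToyPairMemory()
    # Reverse lookup standing in for the "similar terms" call.
    sdr_to_term = {term_to_sdr(t): t for t in second_terms}
    results = []
    for count in range(total):
        one = rng.choice(first_terms)
        two = rng.choice(second_terms)
        prediction = memory.compute(term_to_sdr(one))  # prediction after term one
        memory.compute(term_to_sdr(two))               # learn one -> two
        memory.reset()                                 # keep pairs independent
        results.append((count, one, two, sdr_to_term.get(prediction, "")))
    return results


if __name__ == "__main__":
    animals = ["salmon", "wolf", "crab"]
    vegetables = ["endive", "turmeric", "yam"]
    for count, one, two, predicted in run_association(animals, vegetables):
        print("#%5d %15s %15s | %s" % (count, one, two, predicted))
```

In this toy version the prediction column is blank the first time an animal shows up, simply because nothing has been stored for it yet. That says nothing about why the real TP sometimes produces no valid prediction.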
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

[1] https://cept.3scale.net/signup (YOU MUST upgrade your account to use the API endpoints this project requires, email [email protected] and tell him you're working on NuPIC NLP tasks and he'll upgrade you.)
[2] https://github.com/numenta/nupic/blob/master/examples/tp/hello_tp.py
