I have used NLTK in Python before to do POS tagging, but honestly it is not very good.
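A lookup table is one alternative to a statistical tagger. Below is a minimal sketch of the "position plus POS" encoding suggested further down this thread. Everything named here is an assumption for illustration: the tiny hand-written `LEXICON` stands in for a real word list such as the Moby file, the tag names are made up, and `encode_sentence` is a hypothetical helper, not part of NuPIC or NLTK.

```python
# Hypothetical mini-lexicon (assumption): one POS tag per lowercase word.
# A real lexicon would be loaded from a much larger word list.
LEXICON = {
    "where": "ADV",
    "is": "VERB",
    "the": "DET",
    "station": "NOUN",
    "i": "PRON",
    "know": "VERB",
}


def encode_sentence(sentence, lexicon=LEXICON, unknown="UNK"):
    """Return one (position, POS) pair per word, in sentence order."""
    # Lowercase, split on whitespace, and trim punctuation from each token.
    words = [w.strip(".,?!;:") for w in sentence.lower().split()]
    words = [w for w in words if w]
    # Words missing from the lexicon get the placeholder tag.
    return [(i, lexicon.get(w, unknown)) for i, w in enumerate(words)]


question = encode_sentence("Where is the station?")
statement = encode_sentence("I know where the station is.")
print(question)
print(statement)
```

Here "where" gets the same tag in both sentences but a different position, which is the distinction the encoder idea relies on. A word with several possible tags would need a disambiguation rule that this sketch does not attempt.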
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Fri, Oct 16, 2015 at 10:19 AM, cogmission (David Ray) <[email protected]> wrote:

> @Carin between those two resources we should be able to come up with an
> adequate word "look up" mechanism eh?
>
> On Fri, Oct 16, 2015 at 12:12 PM, cogmission (David Ray) <[email protected]> wrote:
>
>> Here's a resource: The Moby Part of Speech file!!!
>>
>> Linked on my server: www.mindlab.ai/mobypos.txt
>>
>> That's one resource!
>>
>> On Fri, Oct 16, 2015 at 12:05 PM, cogmission (David Ray) <[email protected]> wrote:
>>
>>> Yep, precisely. Do it in the encoder! The encoder would take in a whole
>>> sentence and encode each word according to its "position" within the
>>> sentence, and its POS. For instance: the word "Where" would be encoded
>>> differently depending on its location in the sentence...
>>>
>>> On Fri, Oct 16, 2015 at 11:50 AM, Matthew Taylor <[email protected]> wrote:
>>>
>>>> We don't have to use the fingerprints. Another way is to simply encode
>>>> the part of speech (POS) for each word. I'm sure that statements and
>>>> questions have different temporal POS patterns that should be
>>>> recognizable.
>>>>
>>>> On Fri, Oct 16, 2015 at 9:10 AM, Richard Crowder <[email protected]> wrote:
>>>>
>>>>> My 2 cents: this sounds similar to DeepQA, which helped IBM Watson
>>>>> win Jeopardy?
>>>>> http://researcher.watson.ibm.com/researcher/view_group.php?id=2099
>>>>>
>>>>> On Fri, Oct 16, 2015 at 4:39 PM, cogmission (David Ray) <[email protected]> wrote:
>>>>>
>>>>>> Awesome Idea! I for one am in!
>>>>>>
>>>>>> I think there are some questions that arise concerning capability
>>>>>> and approach.
>>>>>>
>>>>>> My main question is:
>>>>>>
>>>>>> Considering that training a Cortical.io Fingerprint will organize
>>>>>> SDRs according to subject applicability, I'm not sure whether it
>>>>>> will differentiate according to degree of interrogative-ness. I have
>>>>>> the same question as to the HTM: whether predictions and anomalies
>>>>>> can differentiate according to degree of interrogative-ness...
>>>>>>
>>>>>> So my immediate suggestion for a solution to the above is to do it
>>>>>> in the "Encoder". That is, to spatially aggregate inputs (sentences)
>>>>>> according to their Part-of-Speech question word order... For example:
>>>>>>
>>>>>> 1. Sentences beginning with Is, Are, Why, How, Do, What, Where,
>>>>>>    Whether etc. should be encoded closer to each other...
>>>>>> 2. Sentence fragments and clauses which accomplish the same as the
>>>>>>    above should have the same encoding nature.
>>>>>>
>>>>>> That's all I have for now...
>>>>>>
>>>>>> On Fri, Oct 16, 2015 at 10:23 AM, Matthew Taylor <[email protected]> wrote:
>>>>>>
>>>>>>> Hello NuPIC,
>>>>>>>
>>>>>>> Here is a question for anyone interested in NLP, Cortical.IO's API,
>>>>>>> and phrase classification...
>>>>>>>
>>>>>>> This tweet from Carin Meier got me thinking last night:
>>>>>>> https://twitter.com/gigasquid/status/654802085335068672
>>>>>>>
>>>>>>> Could we do this with text fingerprints from Cortical and HTM? What
>>>>>>> if we put together a collection of human-gathered "statements" and
>>>>>>> a list of "questions"? For each phrase, we turn each word into an
>>>>>>> SDR via Cortical's API, and train one model on the statement
>>>>>>> phrases (resetting sequences between phrases) and one on the
>>>>>>> questions. So we'll have one model that's only seen statements and
>>>>>>> one that's only seen questions.
>>>>>>>
>>>>>>> If there are typical word patterns that exist mostly in one type of
>>>>>>> phrase or another, it may be possible to feed new phrases as SDRs
>>>>>>> into each model, and use the lowest anomaly to identify whether it
>>>>>>> is a statement or question?
>>>>>>>
>>>>>>> Does this seem feasible? Is anyone interested in this project?
>>>>>>>
>>>>>>> Thanks,
>>>>>>
>>>>>> --
>>>>>> *With kind regards,*
>>>>>>
>>>>>> David Ray
>>>>>> Java Solutions Architect
>>>>>>
>>>>>> *Cortical.io <http://cortical.io/>*
>>>>>> Sponsor of: HTM.java <https://github.com/numenta/htm.java>
>>>>>>
>>>>>> [email protected]
>>>>>> http://cortical.io
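The "one model per phrase type, pick the lowest anomaly" idea from the original post can be sketched without HTM at all. The block below is only an illustration under stated assumptions: a first-order transition table stands in for an HTM model (it is not the NuPIC API), the anomaly score is simply the fraction of unseen tag transitions, and the POS tags and training phrases are made-up toy data.

```python
from collections import defaultdict

START = "<START>"  # marker that begins every phrase


class TransitionModel:
    """Stand-in for an HTM model: remembers which tag followed which tag."""

    def __init__(self):
        self.seen = defaultdict(set)

    def train(self, sequences):
        for seq in sequences:
            # Each phrase starts from START, so nothing carries over between
            # phrases (the "reset sequences between phrases" step).
            for prev, nxt in zip([START] + seq, seq):
                self.seen[prev].add(nxt)

    def anomaly(self, seq):
        """Fraction of transitions in seq that never occurred in training."""
        pairs = list(zip([START] + seq, seq))
        if not pairs:
            return 0.0
        misses = sum(1 for prev, nxt in pairs
                     if nxt not in self.seen.get(prev, ()))
        return misses / len(pairs)


def classify(seq, statement_model, question_model):
    """Pick the label of the model with the lower anomaly (ties: statement)."""
    s = statement_model.anomaly(seq)
    q = question_model.anomaly(seq)
    return "question" if q < s else "statement"


# Toy POS sequences (assumption): one list of tags per training phrase.
statements = [["PRON", "VERB", "DET", "NOUN"], ["DET", "NOUN", "VERB", "ADJ"]]
questions = [["ADV", "VERB", "DET", "NOUN"], ["VERB", "DET", "NOUN", "ADJ"]]

statement_model = TransitionModel()
statement_model.train(statements)
question_model = TransitionModel()
question_model.train(questions)

new_phrase = ["ADV", "VERB", "DET", "NOUN", "ADJ"]
print(classify(new_phrase, statement_model, question_model))
```

With real data the two models would see many overlapping transitions, so the anomaly gap would be much smaller than in this toy set. Whether an HTM's anomaly scores separate the two classes is exactly the open question raised in the thread.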
