Re: NLP idea: identify statements vs questions

Matthew Taylor Fri, 16 Oct 2015 09:51:41 -0700

We don't have to use the fingerprints. Another way is to simply encode the
part of speech (POS) for each word. I'm sure that statements and questions
have different temporal POS patterns that should be recognizable.
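
To make that concrete, here is a minimal Python sketch. It is not NuPIC code.
The tiny word-to-tag lexicon is hypothetical and stands in for a real POS
tagger. The TransitionModel class is also hypothetical: a plain bigram lookup
standing in for an HTM model. Its "anomaly" is only the fraction of tag
transitions never seen in training, not an HTM anomaly score. One model is
trained on statements and one on questions, and a new phrase gets the label of
whichever model is less surprised by it.

```python
# Hypothetical toy lexicon; a real system would use a POS tagger instead.
TOY_LEXICON = {
    "is": "VERB", "are": "VERB", "do": "VERB", "likes": "VERB", "like": "VERB",
    "why": "WH", "how": "WH", "what": "WH", "where": "WH",
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN", "fish": "NOUN",
    "you": "PRON", "it": "PRON",
    "hungry": "ADJ", "happy": "ADJ",
}
TAGS = ["VERB", "WH", "DET", "NOUN", "PRON", "ADJ", "OTHER"]


def pos_sequence(sentence):
    """Map each word to a coarse POS tag, ignoring case and punctuation."""
    words = [w.strip("?.!,").lower() for w in sentence.split()]
    return [TOY_LEXICON.get(w, "OTHER") for w in words if w]


def encode_tag(tag):
    """One-hot encode a tag: one active bit per POS category."""
    return [1 if t == tag else 0 for t in TAGS]


class TransitionModel:
    """Remembers tag-to-tag transitions (an assumed stand-in for an HTM model)."""

    def __init__(self):
        self.seen = set()

    def train(self, sentences):
        for s in sentences:
            tags = ["START"] + pos_sequence(s)
            self.seen.update(zip(tags, tags[1:]))

    def anomaly(self, sentence):
        """Fraction of transitions in the sentence never seen in training."""
        tags = ["START"] + pos_sequence(sentence)
        pairs = list(zip(tags, tags[1:]))
        if not pairs:
            return 1.0
        return sum(p not in self.seen for p in pairs) / len(pairs)


def classify(sentence, statement_model, question_model):
    """Pick the label whose model finds the sentence least anomalous."""
    s = statement_model.anomaly(sentence)
    q = question_model.anomaly(sentence)
    return "question" if q < s else "statement"


# Made-up training phrases, for illustration only.
statement_model = TransitionModel()
statement_model.train(["The cat is hungry.", "The dog likes fish.", "You are happy."])

question_model = TransitionModel()
question_model.train(["Is the cat hungry?", "Why are you happy?", "Do you like fish?"])

print(classify("Is the dog happy?", statement_model, question_model))
print(classify("The dog is happy.", statement_model, question_model))
```

Only the tag order matters here, so "Is the dog happy?" and "The dog is happy."
come out differently even though they share the same words. With an HTM, the
one-hot vectors from encode_tag would be the per-word input in place of the
Cortical fingerprints.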



---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Fri, Oct 16, 2015 at 9:10 AM, Richard Crowder <[email protected]> wrote:

> My 2 cents: this sounds similar to DeepQA, which helped IBM Watson win
> Jeopardy?
> http://researcher.watson.ibm.com/researcher/view_group.php?id=2099
>
> On Fri, Oct 16, 2015 at 4:39 PM, cogmission (David Ray) <
> [email protected]> wrote:
>
>> Awesome Idea! I for one am in!
>>
>> I think some questions arise concerning capability and approach.
>>
>> My main question is:
>>
>> Considering that training a Cortical.io Fingerprint will organize SDRs
>> according to subject applicability, I'm not sure whether it will
>> differentiate according to degree of interrogative-ness? I have the same
>> question as to the HTM; whether predictions and anomalies can differentiate
>> according to degree of interrogative-ness...
>>
>> So my immediate suggestion for a solution to the above is to do it in the
>> "Encoder". That is, to spatially aggregate inputs (sentences) according to
>> their Part-Of-Speech question word order... For example:
>>
>> 1. Sentences beginning with Is, Are, Why, How, Do, What, Where, Whether
>> etc. should be encoded closer to each other...
>> 2. Sentence fragments and clauses that accomplish the same as the above
>> should be encoded similarly.
>>
>> That's all I have for now...
>>
>> On Fri, Oct 16, 2015 at 10:23 AM, Matthew Taylor <[email protected]>
>> wrote:
>>
>>> Hello NuPIC,
>>>
>>> Here is a question for anyone interested in NLP, Cortical.IO's API, and
>>> phrase classification...
>>>
>>> This tweet from Carin Meier got me thinking last night:
>>> https://twitter.com/gigasquid/status/654802085335068672
>>>
>>> Could we do this with text fingerprints from Cortical and HTM? What if
>>> we put together a collection of human-gathered "statements" and a list of
>>> "questions"? For each phrase, we turn each word into an SDR via
>>> Cortical's API, and train one model on the statement phrases (resetting
>>> sequences between phrases) and one on the questions. So we'll have one model
>>> that's only seen statements and one that's only seen questions.
>>>
>>> If there are typical word patterns that exist mostly in one type of
>>> phrase or another, it may be possible to feed new phrases as SDRs into each
>>> model, and use the lowest anomaly to identify whether it is a statement or
>>> question?
>>>
>>> Does this seem feasible? Is anyone interested in this project?
>>>
>>> Thanks,
>>>
>>> ---------
>>> Matt Taylor
>>> OS Community Flag-Bearer
>>> Numenta
>>>
>>
>>
>>
>> --
>> *With kind regards,*
>>
>> David Ray
>> Java Solutions Architect
>>
>> *Cortical.io <http://cortical.io/>*
>> Sponsor of:  HTM.java <https://github.com/numenta/htm.java>
>>
>> [email protected]
>> http://cortical.io
>>
>
>
