Add text similarity / relevance / syntactic match component based on parse trees
--------------------------------------------------------------------------------

                 Key: OPENNLP-253
                 URL: https://issues.apache.org/jira/browse/OPENNLP-253
             Project: OpenNLP
          Issue Type: Improvement
          Components: Parser
    Affects Versions: 1.6.0
         Environment: Java
            Reporter: Boris Galitsky
             Fix For: 1.6.0


The proposed component relies on the OpenNLP parser and gives search
engineers a simple relevance verification tool based on machine learning of
syntactic parse trees.

The value for the search engineering community is that engineers don't have
to be familiar with NLP to use the syntactic generalization component: OpenNLP
does the parsing/chunking, and the proposed component then applies graph-based
learning for relevance assessment.

One expected usage scenario is that a search library such as Lucene is used,
and this component accepts or rejects each search result as relevant or
irrelevant (according to the proposed syntactic generalization measure).
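The accept / reject step in this scenario can be sketched as a threshold
filter over a pluggable scoring function. This is only an illustration under
stated assumptions: the class name RelevanceFilter, the method names, and the
threshold are hypothetical, and the token-overlap scorer is a trivial stand-in,
not the proposed syntactic generalization measure, which would be plugged in
through the same scorer parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch: keeps only the search results whose score against
// the query reaches a threshold. None of these names come from OpenNLP.
class RelevanceFilter {

    // Stand-in scorer (an assumption, NOT the proposed measure):
    // Jaccard overlap between the lowercase word sets of the two texts.
    static double tokenOverlap(String query, String candidate) {
        Set<String> q = tokens(query);
        Set<String> c = tokens(candidate);
        if (q.isEmpty() || c.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(q);
        common.retainAll(c);
        Set<String> union = new HashSet<>(q);
        union.addAll(c);
        return (double) common.size() / union.size();
    }

    // Splits text on non-word characters and drops empty tokens.
    static Set<String> tokens(String text) {
        Set<String> out =
            new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\W+")));
        out.remove("");
        return out;
    }

    // Accepts a result when scorer(query, result) >= threshold, else rejects it.
    static List<String> filter(String query, List<String> results,
                               ToDoubleBiFunction<String, String> scorer,
                               double threshold) {
        List<String> kept = new ArrayList<>();
        for (String result : results) {
            if (scorer.applyAsDouble(query, result) >= threshold) {
                kept.add(result);
            }
        }
        return kept;
    }
}
```

A caller would pass the result snippets returned by the search library, for
example RelevanceFilter.filter(query, snippets, RelevanceFilter::tokenOverlap,
0.5), and replace the scorer with one backed by parse trees once available.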

This code has been deployed commercially over the last 2 years at datran.com
and zvents.com and serves more than 20 million users monthly.

There are a number of publications on this project, including:

http://portal.acm.org/citation.cfm?id=1881190

http://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS11/paper/view/2573

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
