[
https://issues.apache.org/jira/browse/OPENNLP-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Boris Galitsky updated OPENNLP-253:
-----------------------------------
Attachment: text_similarity_proposal_for_opennlp.zip
three packages: operation, application example and its utilities
> Add text similarity / relevance / syntactic match component based on parse
> trees
> --------------------------------------------------------------------------------
>
> Key: OPENNLP-253
> URL: https://issues.apache.org/jira/browse/OPENNLP-253
> Project: OpenNLP
> Issue Type: Improvement
> Components: Parser
> Affects Versions: 1.6.0
> Environment: Java
> Reporter: Boris Galitsky
> Fix For: 1.6.0
>
> Attachments: text_similarity_proposal_for_opennlp.test.zip,
> text_similarity_proposal_for_opennlp.zip
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> The proposed component relies on the OpenNLP parser and gives search
> engineers a simple relevance verification tool based on machine learning of
> syntactic parse trees.
> The value for the search engineering community is that they don't have to be
> familiar with NLP to use the syntactic generalization component, which does
> parsing/chunking with OpenNLP and then graph-based learning for relevance
> assessment (the proposed component).
> One expected usage scenario is that a search library like Lucene is used,
> and this component accepts relevant search results and rejects irrelevant
> ones (according to the proposed syntactic generalization measure).
> This code has been deployed commercially over the last 2 years at datran.com
> and zvents.com and serves more than 20 million users monthly.
> There are a number of publications on this project, including
> http://portal.acm.org/citation.cfm?id=1881190
> http://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS11/paper/view/2573
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira