[ https://issues.apache.org/jira/browse/OPENNLP-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Wiesner closed OPENNLP-566.
----------------------------------
    Resolution: Not A Problem

Closing this as this is neither a bug, nor a problem.

> Parse thicket is a structure to represent syntactic relations in a paragraph
> ----------------------------------------------------------------------------
>
>                 Key: OPENNLP-566
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-566
>             Project: OpenNLP
>          Issue Type: New Feature
>          Components: Similarity
>            Reporter: Boris Galitsky
>            Assignee: Boris Galitsky
>            Priority: Major
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Per paper
> http://link.springer.com/chapter/10.1007%2F978-3-642-35786-2_12
> Parse Thicket Representation for Multi-sentence Search
> Boris A. Galitsky, Sergei O. Kuznetsov, Daniel Usikov
> Abstract
> We develop a graph representation and learning technique for parse structures
> for sentences and paragraphs of text. This technique is used to improve
> relevance answering complex questions where an answer is included in multiple
> sentences. We introduce Parse Thicket as a sum of syntactic parse trees
> augmented by a number of arcs for inter-sentence word-word relations such as
> coreference and taxonomic. These arcs are also derived from other sources,
> including Rhetoric Structure theory, and respective indexing rules are
> introduced, which identify inter-sentence relations and joins phrases
> connected by these relations in the search index. Generalization of syntactic
> parse trees (as a similarity measure between sentences) is defined as a set
> of maximum common sub-trees for two parse trees. Generalization of a pair of
> parse thickets to measure relevance of a question and an answer, distributed
> in multiple sentences, is defined as a set of maximal common sub-parse
> thickets.
> The proposed approach is evaluated in the product search domain of
> eBay.com, where user query includes product names, features and expressions
> for user needs, and the query keywords occur in different sentences of text.
> We demonstrate that search relevance is improved by single sentence-level
> generalization, and further increased by parse thicket generalization.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
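
For readers unfamiliar with the structures named in the quoted abstract, the following is a minimal Python sketch, not the paper's algorithm and not OpenNLP code. It assumes parse trees are nested tuples of the form (label, child, ...) with word strings as leaves. The names ParseThicket, cross_sentence_arcs, subtrees and common_maximal_subtrees are hypothetical, chosen only for this illustration. Generalization is simplified here to identical complete subtrees shared by two trees; the paper's definition is more general, and generalization of whole thickets is not attempted.

```python
from dataclasses import dataclass, field

# Assumption: a parse tree is a nested tuple (label, child, child, ...);
# leaves are plain word strings.


@dataclass
class ParseThicket:
    """Hypothetical container: one parse tree per sentence plus extra arcs."""
    trees: list
    # Each arc is (relation, (sentence_index, word), (sentence_index, word)).
    arcs: list = field(default_factory=list)

    def cross_sentence_arcs(self):
        """Return the arcs whose endpoints lie in different sentences."""
        return [arc for arc in self.arcs if arc[1][0] != arc[2][0]]


def subtrees(tree):
    """Yield every complete subtree: a node together with all its descendants."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree[1:]:  # tree[0] is the label, not a child
            yield from subtrees(child)


def common_maximal_subtrees(a, b):
    """Simplified generalization: identical subtrees of a and b that are not
    properly contained in another shared subtree."""
    common = set(subtrees(a)) & set(subtrees(b))
    return {
        t for t in common
        if not any(t != u and t in set(subtrees(u)) for u in common)
    }


# Two sentences that differ only in the verb.
tree_a = ("S",
          ("NP", ("DT", "the"), ("NN", "camera")),
          ("VP", ("VBZ", "has"), ("NP", ("JJ", "digital"), ("NN", "zoom"))))
tree_b = ("S",
          ("NP", ("DT", "the"), ("NN", "camera")),
          ("VP", ("VBZ", "needs"), ("NP", ("JJ", "digital"), ("NN", "zoom"))))

# A two-sentence paragraph with one coreference arc between sentences.
tree_c = ("S",
          ("NP", ("PRP", "it")),
          ("VP", ("VBZ", "needs"), ("NP", ("NNS", "batteries"))))
thicket = ParseThicket(
    trees=[tree_a, tree_c],
    arcs=[("coreference", (0, "camera"), (1, "it"))],
)

if __name__ == "__main__":
    print(common_maximal_subtrees(tree_a, tree_b))
    print(thicket.cross_sentence_arcs())
```

In this sketch the two example sentences share the noun phrases "the camera" and "digital zoom", so those two subtrees are what the simplified generalization returns; the smaller shared pieces inside them are dropped because they are not maximal.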