[jira] [Issue Comment Edited] (LUCENE-3938) Add query time parent child search

Martijn van Groningen (Issue Comment Edited) (JIRA) Fri, 30 Mar 2012 09:28:53 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242507#comment-13242507 ]


Martijn van Groningen edited comment on LUCENE-3938 at 3/30/12 4:27 PM:
------------------------------------------------------------------------

Added initial patch with random test.

Code usage:
{code}
ParentChildCommand command = new ParentChildCommand();
...
command.setParentField("fieldA");
command.setChildField("fieldB");
command.setTypeField("typeField");
command.setGroupChild(true);
..
// first pass: collect the top parent child hits for the query
TermTopParentChildCollector topParentChildCollector = new TermTopParentChildCollector(command);
indexSearcher.search(query, topParentChildCollector);
ParentChildResult result = topParentChildCollector.getParentChildResult();
// second pass: resolve the related documents and attach them to the result
TermParentChildResolveCollector parentChildResolveCollector = new TermParentChildResolveCollector(result, command);
indexSearcher.search(command.childrenQuery(), parentChildResolveCollector);

// render results
System.out.println("Hit count: " + result.hitCount);
for (ParentChildDoc hit : result.docs) {
   ScoreDoc parentDoc = hit.getParentDoc();
   TopDocs children = hit.getChildDocs();
   // render hit
}

{code}

It is also possible to group parent child hits. For example, if many subtitles 
of the same program match a query, they could pollute the result. When this 
"grouping" is used, only the most relevant matching document of each parent 
child relation is kept.
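
To illustrate what the grouping does, here is a minimal standalone sketch in plain Java. It does not use the classes from the patch: GroupingSketch, Hit and groupByParent are hypothetical names made up for this example, and the in-memory list stands in for real search hits. It assumes each hit knows which parent it belongs to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupingSketch {

    // Hypothetical hit: a matching document, the parent it belongs to, and its score.
    static final class Hit {
        final String parentId;
        final String docId;
        final float score;

        Hit(String parentId, String docId, float score) {
            this.parentId = parentId;
            this.docId = docId;
            this.score = score;
        }
    }

    // Keeps only the highest scoring hit per parent, sorted by score descending.
    static List<Hit> groupByParent(List<Hit> hits) {
        Map<String, Hit> bestPerParent = new LinkedHashMap<>();
        for (Hit hit : hits) {
            // Replace the stored hit only if the new one scores strictly higher.
            bestPerParent.merge(hit.parentId, hit, (a, b) -> b.score > a.score ? b : a);
        }
        List<Hit> grouped = new ArrayList<>(bestPerParent.values());
        grouped.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return grouped;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("program-1", "subtitle-1", 0.4f),
                new Hit("program-1", "subtitle-2", 0.9f),
                new Hit("program-2", "subtitle-3", 0.6f));
        for (Hit hit : groupByParent(hits)) {
            System.out.println(hit.parentId + " -> " + hit.docId + " (" + hit.score + ")");
        }
    }
}
```

With grouping, the two subtitles of program-1 collapse into a single hit, so one program with many matching subtitles can no longer fill the whole result.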
                
> Add query time parent child search
> ----------------------------------
>
>                 Key: LUCENE-3938
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3938
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/join
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3938.patch
>
>
> At the moment there is support for index time parent child search with two 
> query implementations and a collector. Index time parent child search 
> requires that documents are indexed in a block, which isn't ideal for 
> updatability. Take tv content and subtitles, both being separate documents: 
> updating already indexed tv content with subtitles would then require 
> re-indexing the subtitles as well.
> This issue focuses on the collector part for query time parent child search. 
> I started implementing this a while back. Basically, a two pass search 
> performs the parent child search. In the first pass the top N parent child 
> documents are resolved. In the second pass the parent or the top N children 
> are resolved (depending on whether the hit is a parent or a child) and are 
> associated with the top N parent child relation documents. Patch will follow 
> soon.
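
The two pass idea from the description can be sketched in plain Java as follows. This is only an illustration under assumptions, not the code from the patch: TwoPassSketch, Doc, Relation, firstPass and secondPass are hypothetical names, documents live in in-memory lists instead of an index, and every document is assumed to carry a key that links a parent to its children.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TwoPassSketch {

    // Hypothetical document: relationKey links a parent to its children.
    static final class Doc {
        final String id;
        final String relationKey;
        final boolean parent;
        final float score;

        Doc(String id, String relationKey, boolean parent, float score) {
            this.id = id;
            this.relationKey = relationKey;
            this.parent = parent;
            this.score = score;
        }
    }

    // One parent child relation in the result, filled in by the second pass.
    static final class Relation {
        final String key;
        final float score;
        Doc parentDoc;
        final List<Doc> children = new ArrayList<>();

        Relation(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    // First pass: find the top N relations, ranked by their best matching document.
    static List<Relation> firstPass(List<Doc> matches, int topN) {
        Map<String, Float> bestScore = new LinkedHashMap<>();
        for (Doc doc : matches) {
            bestScore.merge(doc.relationKey, doc.score, Float::max);
        }
        List<Relation> relations = new ArrayList<>();
        for (Map.Entry<String, Float> entry : bestScore.entrySet()) {
            relations.add(new Relation(entry.getKey(), entry.getValue()));
        }
        relations.sort((a, b) -> Float.compare(b.score, a.score));
        return new ArrayList<>(relations.subList(0, Math.min(topN, relations.size())));
    }

    // Second pass: attach the parent and the top scoring children to each relation.
    static void secondPass(List<Relation> relations, List<Doc> candidates, int topChildren) {
        Map<String, Relation> byKey = new HashMap<>();
        for (Relation relation : relations) {
            byKey.put(relation.key, relation);
        }
        for (Doc doc : candidates) {
            Relation relation = byKey.get(doc.relationKey);
            if (relation == null) {
                continue; // not part of a top N relation
            }
            if (doc.parent) {
                relation.parentDoc = doc;
            } else {
                relation.children.add(doc);
            }
        }
        for (Relation relation : relations) {
            relation.children.sort((a, b) -> Float.compare(b.score, a.score));
            if (relation.children.size() > topChildren) {
                relation.children.subList(topChildren, relation.children.size()).clear();
            }
        }
    }

    public static void main(String[] args) {
        Doc p1 = new Doc("p1", "k1", true, 0.1f);
        Doc c1 = new Doc("c1", "k1", false, 0.9f);
        Doc c2 = new Doc("c2", "k1", false, 0.4f);
        Doc p2 = new Doc("p2", "k2", true, 0.5f);
        Doc c3 = new Doc("c3", "k2", false, 0.2f);
        List<Relation> top = firstPass(Arrays.asList(c1, c2, p2), 2);
        secondPass(top, Arrays.asList(p1, c1, c2, p2, c3), 1);
        for (Relation relation : top) {
            System.out.println(relation.key + ": parent=" + relation.parentDoc.id
                    + ", children=" + relation.children.size());
        }
    }
}
```

Because the relation is resolved at query time through the shared key, parents and children do not have to be indexed in one block, which is the updatability problem the description starts from.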
