[ https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044828#comment-13044828 ]

Mark Harwood commented on LUCENE-2454:
--------------------------------------

Below are two example tests that search employment resumes. Both use the same
optional and mandatory clauses, but in subtly different ways.
Question 1 is "who has Mahout skills and preferably used them at Lucid?", while
question 2 is "who has Mahout skills and preferably has been employed by
Lucid?". The questions differ, and so do the answers. The XML test script below
illustrates the data and queries used, defines the expected results, and runs
as an executable test.
Hopefully you can make sense of this:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<Test description="NestedQuery tests">
        <Data>
                <Index name="ResumeIndex">
                        <Analyzers class="org.apache.lucene.analysis.WhitespaceAnalyzer">
                        </Analyzers>
                        <Shard name="shard1">
                                <!-- =============================================================== -->
                                <Document pk="1">
                                        <Field name="name">grant</Field>
                                        <Field name="docType">resume</Field>
                                </Document>
                                <!-- =============================================================== -->
                                                <Document pk="2">
                                                        <Field name="employer">lucid</Field>
                                                        <Field name="docType">employment</Field>
                                                        <Field name="skills">java lucene</Field>
                                                </Document>
                                <!-- =============================================================== -->
                                                <Document pk="3">
                                                        <Field name="employer">somewhere else</Field>
                                                        <Field name="docType">employment</Field>
                                                        <Field name="skills">mahout and more mahout</Field>
                                                </Document>
                                <!-- =============================================================== -->
                                <Document pk="4">
                                        <Field name="name">sean</Field>
                                        <Field name="docType">resume</Field>
                                </Document>
                                <!-- =============================================================== -->
                                                <Document pk="5">
                                                        <Field name="employer">foo bar</Field>
                                                        <Field name="docType">employment</Field>
                                                        <Field name="skills">java</Field>
                                                </Document>
                                <!-- =============================================================== -->
                                                <Document pk="6">
                                                        <Field name="employer">some co</Field>
                                                        <Field name="docType">employment</Field>
                                                        <Field name="skills">mahout mahout and more mahout</Field>
                                                </Document>
                        </Shard>
                </Index>
        </Data>
        <Tests>
                <Test description="Who knows Mahout and preferably used it *while employed at Lucid*?">
                        <Query>
                                <NestedQuery>
                                        <!-- testing properties of individual child employment docs -->
                                        <Query>
                                                <BooleanQuery>
                                                        <Clause occurs="must">
                                                                <TermsQuery fieldName="skills">mahout</TermsQuery>
                                                        </Clause>
                                                        <Clause occurs="should">
                                                                <TermsQuery fieldName="employer">lucid</TermsQuery>
                                                        </Clause>
                                                </BooleanQuery>
                                        </Query>
                                        <ParentsFilter>
                                                <TermsFilter fieldName="docType">resume</TermsFilter>
                                        </ParentsFilter>
                                </NestedQuery>
                        </Query>
                        <ExpectedResults why="Grant's tenure at Lucid is overlooked for scoring purposes because it did not involve the required Mahout. Sean has more Mahout experience">
                                <Result fieldName="pk">4</Result>
                                <Result fieldName="pk">1</Result>
                        </ExpectedResults>
                </Test>

                <!-- ==================================================================================== -->
                
                <Test description="Different question - who knows Mahout and preferably has been employed by Lucid?">
                        <Query>
                                <BooleanQuery>
                                        <Clause occurs="must">
                                                <NestedQuery>
                                                        <!-- testing properties of one child employment doc -->
                                                        <Query>
                                                                <TermsQuery fieldName="skills">mahout</TermsQuery>
                                                        </Query>
                                                        <ParentsFilter>
                                                                <TermsFilter fieldName="docType">resume</TermsFilter>
                                                        </ParentsFilter>
                                                </NestedQuery>
                                        </Clause>
                                        <Clause occurs="should">
                                                <!-- Another NestedQuery testing properties of a *potentially different* child employment doc -->
                                                <NestedQuery>
                                                        <Query>
                                                                <TermsQuery fieldName="employer">lucid</TermsQuery>
                                                        </Query>
                                                        <ParentsFilter>
                                                                <TermsFilter fieldName="docType">resume</TermsFilter>
                                                        </ParentsFilter>
                                                </NestedQuery>
                                        </Clause>
                                </BooleanQuery>
                        </Query>
                        <ExpectedResults why="Grant has the required Mahout skills plus the optional Lucid engagement">
                                <Result fieldName="pk">1</Result>
                                <Result fieldName="pk">4</Result>
                        </ExpectedResults>
                </Test>
                <!-- ==================================================================================== -->
        </Tests>
</Test>
{code}  
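
For readers who want to see why the two questions rank the resumes differently without running the patch, here is a minimal plain-Java sketch of the same data. It does not use Lucene or the NestedQuery classes at all: the class name NestedQuerySketch, its methods, and the toy scoring (square root of term frequency, summed over child docs) are assumptions made purely for illustration and are not claimed to match the scoring in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class NestedQuerySketch {

    /** One child "employment" document (hypothetical stand-in for an indexed doc). */
    static final class Employment {
        final String employer;
        final String skills;
        Employment(String employer, String skills) {
            this.employer = employer;
            this.skills = skills;
        }
    }

    /** Toy term score: sqrt(term frequency) on whitespace tokens, 0 when absent. */
    static double termScore(String text, String term) {
        int freq = 0;
        for (String token : text.split("\\s+")) {
            if (token.equals(term)) freq++;
        }
        return Math.sqrt(freq);
    }

    /** Question 1: both clauses are tested against the SAME child document. */
    static double usedAtLucid(List<Employment> jobs) {
        double total = 0;
        for (Employment job : jobs) {
            double must = termScore(job.skills, "mahout");
            if (must > 0) {
                // The optional clause only counts on a child that passed the mandatory one.
                total += must + termScore(job.employer, "lucid");
            }
        }
        return total;
    }

    /** Question 2: each clause may be satisfied by a DIFFERENT child document. */
    static double employedByLucid(List<Employment> jobs) {
        double must = 0;
        double should = 0;
        for (Employment job : jobs) {
            must += termScore(job.skills, "mahout");
            should += termScore(job.employer, "lucid");
        }
        return must > 0 ? must + should : 0;
    }

    /** Returns the pks of matching resumes, highest score first. */
    static List<Integer> rank(Map<Integer, List<Employment>> resumes,
                              ToDoubleFunction<List<Employment>> scorer) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, List<Employment>> e : resumes.entrySet()) {
            if (scorer.applyAsDouble(e.getValue()) > 0) hits.add(e.getKey());
        }
        hits.sort((a, b) -> Double.compare(scorer.applyAsDouble(resumes.get(b)),
                                           scorer.applyAsDouble(resumes.get(a))));
        return hits;
    }

    /** The resumes from the test script: pk 1 is grant, pk 4 is sean. */
    static Map<Integer, List<Employment>> sampleResumes() {
        Map<Integer, List<Employment>> resumes = new LinkedHashMap<>();
        resumes.put(1, Arrays.asList(
                new Employment("lucid", "java lucene"),
                new Employment("somewhere else", "mahout and more mahout")));
        resumes.put(4, Arrays.asList(
                new Employment("foo bar", "java"),
                new Employment("some co", "mahout mahout and more mahout")));
        return resumes;
    }

    public static void main(String[] args) {
        Map<Integer, List<Employment>> resumes = sampleResumes();
        System.out.println(rank(resumes, NestedQuerySketch::usedAtLucid));     // [4, 1]
        System.out.println(rank(resumes, NestedQuerySketch::employedByLucid)); // [1, 4]
    }
}
```

In the first ranking Grant's Lucid job contributes nothing because that child doc lacks Mahout, so Sean's heavier Mahout usage wins. In the second, the Lucid clause is evaluated across all of Grant's child docs, which lifts him above Sean.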

> Nested Document query support
> -----------------------------
>
>                 Key: LUCENE-2454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2454
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 3.0.2
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>         Attachments: LUCENE-2454.patch, LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in 
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
