[ 
https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000000#comment-13000000
 ] 

Paul Elschot commented on LUCENE-2454:
--------------------------------------

How about an implementation for strict hierarchies that uses two fields per 
document, in the following way:

Each of the two fields contains a single (indexed) token that identifies a node 
in the nesting hierarchy. A token in the first field means that the document is 
a child of that node; a token in the second field means that the document is 
the representative of that node. Any number of levels could be allowed, but no 
cycles of course.

A merge policy then uses these fields to keep the documents in postorder, that 
is, the children of each node are immediately followed by the representative of 
that node.

Scores could then be collected at any node in the hierarchy by using term 
filters, one for each involved scorer: advancing the filter from the current 
doc provides the representative for that doc.


For example, in index order:

userDocId  nodeMemberField  nodeReprField

doc1       nodeA1           .
doc2       nodeA1           .
doc3       nodeA            nodeA1
doc4       nodeA2           .
doc5       nodeA2           .
doc6       nodeA            nodeA2

The node representatives for scoring could then be obtained by a term filter 
for nodeA.
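
As a rough illustration of the lookup described above, here is a minimal Java 
sketch that uses plain arrays instead of a Lucene index. The class and method 
names (RepresentativeLookup, memberFilter, advance) are hypothetical and not 
part of Lucene or of the attached code; positions are 0-based index positions, 
so doc1 is position 0. It assumes the postorder layout from the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a term filter over nodeMemberField; not Lucene API.
final class RepresentativeLookup {

    /** Positions of all docs whose member field equals the given node, ascending. */
    static int[] memberFilter(String[] memberField, String node) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < memberField.length; i++) {
            if (memberField[i].equals(node)) {
                hits.add(i);
            }
        }
        int[] out = new int[hits.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = hits.get(i);
        }
        return out;
    }

    /** First hit at or after target, or -1 if there is none (mimics advancing). */
    static int advance(int[] sortedHits, int target) {
        for (int hit : sortedHits) {
            if (hit >= target) {
                return hit;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // nodeMemberField values of doc1..doc6 in index order, as in the example.
        String[] member = {"nodeA1", "nodeA1", "nodeA", "nodeA2", "nodeA2", "nodeA"};
        int[] reps = memberFilter(member, "nodeA");
        for (int pos = 0; pos < member.length; pos++) {
            int rep = advance(reps, pos);
            System.out.println("doc" + (pos + 1) + " -> doc" + (rep + 1));
        }
    }
}
```

With the example data the filter for nodeA matches doc3 and doc6, so doc1 and 
doc2 advance to doc3, and doc4 and doc5 advance to doc6.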


I think this could work for the scoring part, basically along the lines of the 
code already posted here.

Could someone with more experience in segment merge policies comment on this? 
This is quite restrictive for merging as the only freedom that is left in the 
document order is the order of the children for each node.

For example, adding a leaf document doc7 for nodeA1 could result in the 
following index order:

doc4       nodeA2           .
doc5       nodeA2           .
doc6       nodeA            nodeA2
doc7       nodeA1           .
doc1       nodeA1           .
doc2       nodeA1           .
doc3       nodeA            nodeA1
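
To make the ordering constraint concrete, the following Java sketch checks 
whether a given index order satisfies it. This is only one possible reading of 
the invariant, not a merge policy: the class name PostorderCheck and the use of 
"." for an empty nodeReprField are assumptions made for the illustration. The 
check requires that the docs belonging to a node, with their own children 
already folded in, directly precede the representative of that node, and that 
no doc of a node appears after its representative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical checker for the postorder constraint; not part of Lucene.
final class PostorderCheck {

    static final String NONE = "."; // assumed marker for "no representative field"

    static boolean isPostorder(String[] member, String[] repr) {
        // Member nodes of docs (or folded subtrees) still waiting for a representative.
        Deque<String> pending = new ArrayDeque<>();
        // Nodes whose representative has already been seen.
        Set<String> closed = new HashSet<>();

        for (int i = 0; i < member.length; i++) {
            if (!NONE.equals(repr[i])) {
                String node = repr[i];
                if (closed.contains(node)) {
                    return false; // a second representative for the same node
                }
                // Fold the children that directly precede this representative.
                while (!pending.isEmpty() && pending.peek().equals(node)) {
                    pending.pop();
                }
                if (pending.contains(node)) {
                    return false; // a child is separated from its representative
                }
                closed.add(node);
            }
            if (closed.contains(member[i])) {
                return false; // doc placed after its node's representative, or a cycle
            }
            pending.push(member[i]);
        }
        return true;
    }

    public static void main(String[] args) {
        // Index order after adding doc7: doc4, doc5, doc6, doc7, doc1, doc2, doc3.
        String[] member = {"nodeA2", "nodeA2", "nodeA", "nodeA1", "nodeA1", "nodeA1", "nodeA"};
        String[] repr = {".", ".", "nodeA2", ".", ".", ".", "nodeA1"};
        System.out.println(isPostorder(member, repr));
    }
}
```

Under this reading, simply appending doc7 after doc6 in the original order 
would be rejected, because doc7 would then follow the representative of nodeA1.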




> Nested Document query support
> -----------------------------
>
>                 Key: LUCENE-2454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2454
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 3.0.2
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>         Attachments: LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in 
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
