[ https://issues.apache.org/jira/browse/SOLR-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16543976#comment-16543976 ]
David Smiley commented on SOLR-12519:
-------------------------------------

Oh I understand now. My suggestion to use PathHierarchyTokenizerFactory was centered around use-cases of querying for child docs purely by this path (e.g. all paths that look like this, etc.). If the query is to find all child docs that match some arbitrary query (which is what "childFilter" is), and furthermore _their_ ancestors, then PathHierarchyTokenizerFactory may not be so useful for that. Sorry for the wild goose chase; though I suspect we'll revisit the use of PathHierarchyTokenizerFactory in the near future.

I think we can do this with DocValues to store the nest path, and with modifications to ChildDocTransformer's loop over matching child documents. Recognize first how Lucene/Solr actually sequence the arrangement of nested child documents. Any given child document always comes _before_ its parent (and thus recursively so). Therefore, what can be done is to look at all documents _after_ a matching child document to see which of those is an ancestor of that matching child document. Detecting whether doc X + N is an ancestor of child doc X is a matter of checking whether the path at X + N is a prefix of the path at X. You stop looping forward once you reach the root document, which is tracked in the parentsFilter bits.

If that's not enough information for you to implement this, I can post a patch modification to ChildDocTransformer that will do this, and maybe you could take it further from there (e.g. restructure the ancestors into a nice hierarchy).

> Support Deeply Nested Docs In Child Documents Transformer
> ---------------------------------------------------------
>
>                 Key: SOLR-12519
>                 URL: https://issues.apache.org/jira/browse/SOLR-12519
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: mosh
>            Priority: Major
>         Attachments: SOLR-12519-no-commit.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in SOLR-12298, to make use of the meta-data fields in
> SOLR-12441, there needs to be a smarter child document transformer, which
> provides the ability to rebuild the original nested documents' structure.
> In addition, I propose that the transformer also be able to bring back
> only part of the original hierarchy, to prevent unnecessary block join
> queries. e.g.
> {code}
> {"a": "b", "c": [ {"e": "f"}, {"e": "g"} , {"h": "i"} ]}
> {code}
> In case my query is for all the children of "a:b" which contain the key "e",
> the query will be broken into two parts:
> 1. The parent query "a:b"
> 2. The child query "e:*".
> If the only children flag is on, the transformer will return the following
> documents:
> {code}[ {"e": "f"}, {"e": "g"} ]{code}
> In case the flag is not turned on (perhaps the default state), the whole
> document hierarchy will be returned, containing only the matching children:
> {code}{"a": "b", "c": [ {"e": "f"}, {"e": "g"} ]}{code}
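
As a rough illustration of the forward scan described in the comment above, here is a minimal standalone sketch. It is not the ChildDocTransformer patch and uses no Lucene/Solr APIs: the class name AncestorScan, the String[] of nest paths (standing in for the DocValues field), and the java.util.BitSet (standing in for the parentsFilter bits) are all assumptions made for illustration. The paths in this sketch carry no sibling index, so after each ancestor is found the scan narrows its target to that ancestor's path; otherwise a later sibling of the parent with the same path would also pass the prefix test.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for the ancestor scan; not part of Solr. */
public class AncestorScan {

    /**
     * True if candidate is a strict prefix of path that ends on a segment
     * boundary. The empty string is the (assumed) path of a root document.
     */
    public static boolean isAncestorPath(String candidate, String path) {
        if (candidate.length() >= path.length()) {
            return false;
        }
        return candidate.isEmpty() || path.startsWith(candidate + "/");
    }

    /**
     * Scans forward from a matching child and collects the doc ids of its
     * ancestors, stopping at the first root document.
     *
     * @param nestPaths     nest path per doc id, in index order (children first)
     * @param rootDocs      bit set for every root document (assumed analogue of parentsFilter)
     * @param matchingChild doc id of the child that matched the child filter
     */
    public static List<Integer> findAncestors(String[] nestPaths, BitSet rootDocs, int matchingChild) {
        List<Integer> ancestors = new ArrayList<>();
        if (rootDocs.get(matchingChild)) {
            return ancestors; // a root has no ancestors; do not scan into the next block
        }
        String current = nestPaths[matchingChild];
        for (int doc = matchingChild + 1; doc < nestPaths.length; doc++) {
            if (isAncestorPath(nestPaths[doc], current)) {
                ancestors.add(doc);
                current = nestPaths[doc]; // from here on, look for this doc's parent
            }
            if (rootDocs.get(doc)) {
                break; // reached the root of this block
            }
        }
        return ancestors;
    }

    public static void main(String[] args) {
        // Hypothetical block in index order: a grandchild, its parent,
        // a sibling of that parent, then the root.
        String[] nestPaths = {"/c/k", "/c", "/c", ""};
        BitSet rootDocs = new BitSet();
        rootDocs.set(3);
        System.out.println(findAncestors(nestPaths, rootDocs, 0)); // prints [1, 3]
    }
}
```

In the example block, doc 2 shares the path "/c" with the real parent (doc 1) but is skipped, because by the time the scan reaches it the target has already narrowed to "/c". Restructuring the collected ancestors into a hierarchy is left out, as in the comment.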