[
https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205156#comment-13205156
]
Hoss Man commented on SOLR-3076:
--------------------------------
bq. Maybe there can be field aliases? Eg, book_page_count:[0 to 1000] and
chapter_page_count[10:40], and the QP is told to map book_page_count ->
parent:size and chapter_page_count -> child:size? Or maybe we let the user
explicitly scope the field, eg chapter:size, book:size, book:title, etc. Not
sure...
Hmmm... i kind of understand what you're saying; but the part i'm not
understanding is even if you had field aliasing like that, given some query
string like...
{code}
book_page_count:[0 TO 1000] AND chapter_page_count:[10 TO 40]
{code}
..how would the parser know whether the user was asking for the results to be
"book documents" matching those criteria (1-1000 pages and containing at least
one chapter child of 10-40 pages), or "chapter documents" matching those
criteria (10-40 pages, contained in a book of 1-1000 pages), or "page documents"
(all pages contained in a chapter of 10-40 total pages, contained in a book
of 1-1000 total pages)?
I mean: it seems possible, and a QParser like that could totally support
configuring those types of field mappings / hierarchy definitions in init
params, but perhaps we should focus for now on the more explicit, direct
mapping QParser approach Mikhail has already started on, and consider
that as an enhancement later? (especially since it's not clear how the
indexing side will be managed/enforced -- depending on how that shapes up, it
might be possible for a QParser like you're describing, or perhaps _all_
QParsers to infer the field rules from the schema or some other configuration)
I think the syntax in Mikhail's BlockJoinParentQParserPlugin looks great as a
straightforward baseline implementation. The one straw man suggestion i might
toss out there for consideration would be to invert the use of the "filter" and
"v" local params, so instead of...
{code}
{!parent filter="parent:true"}child_name:b
{!parent filter="parent:true"}
{code}
...it might be...
{code}
{!parent of="child_names:b"}parent:true
{!parent}parent:true
{code}
...people may find that easier to read as a way to understand that the final
query will return "parent documents" constrained such that those parent
documents have children matching the "of" query. The one thing i don't like
about this "of" idea is that (compared to the "filter" param Mikhail uses) it
might be more tempting for people to use something like...
{code}
// WRONG! (i think)
q={!parent of="child_names:b"}some_parent_field:foo
{code}
...when they mean to write something like this...
{code}
q={!parent of="child_names:b"}some_query_that_identifies_the_set_of_all_parents
fq=some_parent_field:foo
{code}
...because as i understand it, it's important for the "parentFilter" to
identify *all* of the parent documents, even ones you may not want returned, so
that the ToParentBlockJoinQuery knows how to identify the parent of each
document (correct?)
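To make that concern concrete, here is a toy model in plain Java (no Lucene dependency) of how a child's parent might be resolved. It assumes documents are indexed in blocks with the children first and the parent last, so the parent of a child is the first document at or after it that belongs to the parent set. The class and method names are hypothetical, and this is only a sketch of the idea, not the actual ToParentBlockJoinQuery code.

```java
import java.util.BitSet;

class BlockJoinSketch {
    // Hypothetical helper: the parent of childDoc is the first doc id at or
    // after childDoc that is marked in the parent set, or -1 if there is none.
    static int parentOf(BitSet parents, int childDoc) {
        return parents.nextSetBit(childDoc);
    }

    public static void main(String[] args) {
        // Assumed index layout (children precede their parent):
        //   docs 0,1 are children of parent doc 2
        //   docs 3,4 are children of parent doc 5
        BitSet allParents = new BitSet();
        allParents.set(2);
        allParents.set(5);

        // A "parent filter" that wrongly selects only the parents the user
        // wants returned (the some_parent_field:foo mistake).
        BitSet someParents = new BitSet();
        someParents.set(5);

        System.out.println(parentOf(allParents, 0));  // prints 2
        System.out.println(parentOf(someParents, 0)); // prints 5
    }
}
```

With the incomplete parent set, child 0 is attributed to parent 5 even though it belongs to the block of parent 2, which is why restricting the returned parents seems better done in a separate fq.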
This type of user confusion is still possible with the syntax Mikhail's got,
but i suspect it will be less likely --- In any case, i wanted to put the idea
out there.
Given McCandless's supposition that the parent/child relationships are likely to
be very consistent, not very deep, and not vary from query to query, one thing
we could do to help mitigate this possible confusion would be:
* make the "filter" param name much longer and verbose, ie:
{{setOfAllParentsQuery}}
* make the param optional, and have it default to something specified as an
init param, ie: {{defaultSetOfAllParentsQuery}}
* make the init param mandatory
That way, in the common case people will configure things like...
{code}
<queryParser name="parent" class="solr.BlockJoinParentQParserPlugin">
<str name="defaultSetOfAllParentsQuery">type:parent</str>
</queryParser>
{code}
..and their queries will be simple...
{code}
q={!parent}          (all parent docs)
q={!parent}foo:bar   (all parent docs that contain kid docs matching foo:bar)
{code}
...but it will still be possible for people with more complex use cases to do
more complex things.
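The fallback behaviour suggested above (optional local param, mandatory init param as the default) could be sketched roughly like this. It is plain Java with no Solr classes. The param names come from the suggestion in this comment, while the class and the resolve method are hypothetical and not part of Mikhail's patch.

```java
import java.util.Map;

class ParentsQueryResolver {
    // Param name proposed in this comment; not an existing Solr param.
    static final String LOCAL_PARAM = "setOfAllParentsQuery";

    // Hypothetical helper: choose the query that identifies all parent docs.
    // The init default is mandatory; a local param overrides it when present.
    static String resolve(Map<String, String> localParams, String initDefault) {
        if (initDefault == null || initDefault.isEmpty()) {
            throw new IllegalArgumentException(
                "defaultSetOfAllParentsQuery init param is mandatory");
        }
        String override = localParams.get(LOCAL_PARAM);
        return override != null ? override : initDefault;
    }

    public static void main(String[] args) {
        // Common case: {!parent}foo:bar with no local param
        System.out.println(resolve(Map.of(), "type:parent"));
        // Complex case: the query overrides the configured default
        System.out.println(resolve(Map.of(LOCAL_PARAM, "type:book"), "type:parent"));
    }
}
```

Failing fast when the init param is missing keeps the simple {!parent} form from silently running without a parent set.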
Mikhail: some other minor feedback on the parts of your patch that i
understood (note: my lack of understanding is not a fault of your patch, it's
just that most of the block join stuff is very foreign to me)...
* please prune down "solrconfig-bjqparser.xml" so it contains only the absolute
minimum things you need for the test case, it makes it a lot easier for people
to review the patch, and for users to understand what is necessary to utilize
features demoed in the test (we have a lot of old bloated solrconfig files in
the test dir, but we're trying to stop doing that)
* the test would be a bit easier to follow if you used different letters for
the parent fields vs the child fields (abcdef, vs xyz for example)
* it would be good to have tests verifying that nested parent queries work as
expected, ie: that something like this works...
{code}
q={!parent filter="type:book" v=$chapters}
chapters=+chapter_title:Solr +_query_:{!parent filter="type:chapter" v=$pages}
pages=page_body:BlockJoin
{code}
* it would be good to have your tests introspect the cache after doing the
query to make sure the number of inserts, lookups, and hits match what you
expect.
...but like i said: all in all i think it's really good.
> Solr should support block joins
> -------------------------------
>
> Key: SOLR-3076
> URL: https://issues.apache.org/jira/browse/SOLR-3076
> Project: Solr
> Issue Type: New Feature
> Reporter: Grant Ingersoll
> Attachments: SOLR-3076.patch, bjq-vs-filters-backward-disi.patch,
> bjq-vs-filters-illegal-state.patch, parent-bjq-qparser.patch,
> parent-bjq-qparser.patch, solrconf-bjq-erschema-snippet.xml
>
>
> Lucene has the ability to do block joins, we should add it to Solr.