[ 
https://issues.apache.org/jira/browse/SOLR-10583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-10583:
----------------------------
    Attachment: SOLR-10583.patch


A while back someone asked me about the "domain change" idea in the JSON Facet 
code.  They had seen some blogs from yonik about using blockParent and 
blockChildren domain changes to facet on parent/child fields, and were 
wondering if the same sort of thing could work for a _query join_.

I wasn't too familiar with any of this code, but after poking around it seemed 
fairly straightforward.  (Except that my randomized test case ran into an 
interesting bug when the same join query was used in two different facets: I 
hacked around this by forcing a non-caching wrapper around the queries, but 
still haven't gotten to the root cause.)


The attached patch includes:

* complete code implementing this new domain change option
* Modifications to {{TestJsonFacets}} to test the new {{join}} option w/similar 
data as the existing {{blockChildren}} and {{blockParent}} tests
* a fairly robust {{TestCloudJSONFacetJoinDomain}} that does randomized facet 
queries w/these {{join}} (sometimes + {{filter}}) domain changes and then 
verifies the result counts against test queries w/equivalent {{fq}} params
** currently only uses one shard...
** ...and always requests a facet limit greater than the cardinality of the 
fields so that refinement is not an issue even if/when the test starts using 
multiple shards
* a very similar but greatly simplified {{TestCloudJSONFacet}} that does 
similar checks but only using {{filter}} domain changes
** I hacked this off of {{TestCloudJSONFacetJoinDomain}} when I first started 
getting weird test failures, to try to see if I could reproduce any of them 
with just the existing domain change code
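
To make the "verify against equivalent {{fq}} params" idea concrete, here is a rough Python sketch. It is not code from the patch (the real tests are Java), and it only builds request parameters. The facet name {{by_field}}, the parameter name {{base_q}}, and both function names are made up for illustration. It also assumes the {{join}} domain behaves like the existing {{!join}} query parser and that field values are simple tokens that need no escaping.

```python
import json


def join_domain_facet_request(base_q, facet_field, join_from, join_to, limit=100):
    """Build params for a terms facet whose domain is changed by a query join.

    'by_field' is a hypothetical facet name; 'limit' is assumed to be larger
    than the cardinality of facet_field so refinement never matters.
    """
    facet = {
        "by_field": {
            "type": "terms",
            "field": facet_field,
            "limit": limit,
            "domain": {"join": {"from": join_from, "to": join_to}},
        }
    }
    return {"q": base_q, "rows": "0", "json.facet": json.dumps(facet)}


def bucket_check_request(base_q, facet_field, value, join_from, join_to):
    """Build params for a plain query to compare against one facet bucket.

    Assumption: the bucket count for 'value' should match numFound of this
    query, where the join over the base query is applied as an fq.
    """
    return {
        "q": f"{facet_field}:{value}",
        "rows": "0",
        # 'base_q' is a hypothetical param name, dereferenced via v=$base_q
        "fq": f"{{!join from={join_from} to={join_to} v=$base_q}}",
        "base_q": base_q,
    }
```

A test along these lines would send the first request once, then send the second request for each returned bucket value and compare {{numFound}} with the bucket's {{count}}.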

Still TODO (all pretty well captured in nocommits)...

# get to the bottom of the cache issue
#* need to do this first before improving the test to ensure nothing else 
"breaks" our guaranteed failure in testBespoke
#* see nocommits in the code for ideas about possible root causes
# generate better errors if the {{join}} is malformed (i.e. not a map, or 
doesn't have both {{from}} and {{to}} specified)
#* or maybe: support the syntax {{join : "field_name"}} as syntax sugar 
for...{code}
join : { from : "field_name", to : "field_name" }
{code}
# add code+tests to ensure joining on numeric fields works
# fix the {{multipleValuesPerDocument}} logic and enhance the test so some 
fields are always single valued
# cleanup & refactor some of the code that uses {{JoinUtil}}
#* is it as efficient as it can be?
#* is there stuff we can share with ScoreJoinQParserPlugin?
# refactor the {{TestCloudJSONFacet}} & {{TestCloudJSONFacetJoinDomain}} 
duplication and make them use multiple shards
#* FWIW: join queries might seem like an odd thing to worry about testing with 
multiple shards -- but the use cases I have in mind can all leverage doc routing 
to ensure that all docs with identical values in the join field are co-located
# investigate the recent work/tests yonik's been doing on supporting refinement 
(single level?), and add {{join}} to those tests as well.
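
For the "better errors" and shorthand items above, the intended behavior could look roughly like the Python sketch below. This is only an illustration of the proposal, not the patch's Java code: {{normalize_join}} is a hypothetical helper, the error wording is invented, and the string shorthand is still just an idea at this point.

```python
def normalize_join(spec):
    """Return {"from": ..., "to": ...} or raise ValueError for a malformed join.

    Hypothetical helper illustrating the TODO items; not part of the patch.
    """
    if isinstance(spec, str):
        # Proposed shorthand: join : "field_name" means from == to == field_name
        return {"from": spec, "to": spec}
    if not isinstance(spec, dict):
        raise ValueError(
            "'join' must be a map or a field name, got " + type(spec).__name__
        )
    # Both keys are required and must be field names (strings)
    missing = [key for key in ("from", "to") if not isinstance(spec.get(key), str)]
    if missing:
        raise ValueError("'join' is missing required key(s): " + ", ".join(missing))
    return {"from": spec["from"], "to": spec["to"]}
```

With this shape, {{normalize_join("field_name")}} and {{normalize_join({"from": "field_name", "to": "field_name"})}} give the same result, while a list or a map lacking {{to}} fails with a message naming the problem.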


> Add 'join' as a new type of domain change in JSON Facets
> --------------------------------------------------------
>
>                 Key: SOLR-10583
>                 URL: https://issues.apache.org/jira/browse/SOLR-10583
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: SOLR-10583.patch
>
>
> Add support for a new (query) {{join}} option when specifying a {{domain}} 
> for a JSON Facet.
> Suggested syntax...
> {code}
> ...
> domain : { join : { from : field_foo,
>                     to : field_bar
>                 }
>        }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
