[ 
https://issues.apache.org/jira/browse/LUCENE-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated LUCENE-10571:
----------------------------------------
    Attachment: LUCENE-10571.patch
        Status: Open  (was: Open)

I'm attaching a patch with a {{HuperDuperTermFilteredPresearcher}} (name just a 
placeholder) that works the way described by introducing a 
{{MISSING_FILTERS_FIELD}} into (Query) documents in the {{QueryIndex}} which we 
then search when a Document doesn't contain any values in a specific filter 
field.
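
For readers without the patch at hand, here is a minimal plain-Java sketch of the matching rules proposed in the issue description. It does not use any Lucene APIs and is not code from the patch: the class name {{MissingFilterSemantics}}, the method {{isCandidate}}, and the map-based stand-ins for Documents and Query metadata are all hypothetical, chosen only to illustrate which Queries should remain candidates for a given Document.

```java
import java.util.Map;
import java.util.Set;

/** Plain-Java model of the proposed matching rules; hypothetical, not code from the patch. */
class MissingFilterSemantics {

  /**
   * Returns true if a query with the given metadata is a viable candidate
   * for a document with the given field values, under the proposed rules.
   */
  static boolean isCandidate(
      Set<String> filterFields,
      Map<String, Set<String>> docFieldValues,
      Map<String, String> queryMetadata) {
    for (String field : filterFields) {
      String queryValue = queryMetadata.get(field);
      if (queryValue == null) {
        // Query has no mapping for this filter field: it stays a candidate
        // whether or not the document has a value in the field.
        continue;
      }
      // Query has a mapping: the document must contain that same value.
      // A document with no values in the field can never satisfy this.
      Set<String> docValues = docFieldValues.getOrDefault(field, Set.of());
      if (!docValues.contains(queryValue)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Set<String> filters = Set.of("language");
    Map<String, Set<String>> docEn = Map.of("language", Set.of("en"));
    Map<String, Set<String>> docNone = Map.of();
    Map<String, String> queryEn = Map.of("language", "en");
    Map<String, String> queryNone = Map.of();

    System.out.println(isCandidate(filters, docEn, queryEn));     // true
    System.out.println(isCandidate(filters, docEn, queryNone));   // true
    System.out.println(isCandidate(filters, docNone, queryEn));   // false
    System.out.println(isCandidate(filters, docNone, queryNone)); // true
  }
}
```

In this model, the "query has no mapping" check stands in for what the patch does by searching {{MISSING_FILTERS_FIELD}} in the {{QueryIndex}}; how the patch actually builds that clause may differ.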

The easiest way to see the impact of this compared to 
{{TermFilteredPresearcher}} is to compare the two new 
{{testMissingFieldFiltering}} methods and the different expected results for 
each impl.

At the moment this new class is largely copy/paste duplication of 
{{TermFilteredPresearcher}} with small additions, because I'm not sure how we 
might want to expose this functionality to users.

Obviously, even if other folks agree that this is a better way to do "term 
filtering" in Monitor than how {{TermFilteredPresearcher}} currently works, 
changing the internals of {{TermFilteredPresearcher}} to "invert" its logic 
like this would be a huge back compat break. What I'm not sure about is 
whether it would make sense to make this behavior "configurable" in 
{{TermFilteredPresearcher}}, or to refactor some of the internals to allow this 
new functionality in a new subclass (which would probably be straightforward, 
but would also require _another_ subclass to support "multipass" in 
combination with this alternative filtering).

> Monitor alternative "TermFilter" Presearcher for sparse filter fields
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-10571
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10571
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/monitor
>            Reporter: Chris M. Hostetter
>            Priority: Major
>         Attachments: LUCENE-10571.patch
>
>
> One of the things that surprised me the most when looking into how the 
> {{TermFilteredPresearcher}} worked was what happens when Queries and/or 
> Documents do _NOT_ have a value in a configured filter field.
> Per the javadocs...
> {quote}Filtering by additional fields can be configured by passing a set of 
> field names. Documents that contain values in those fields will only be 
> checked against \{@link MonitorQuery} instances that have the same 
> fieldname-value mapping in their metadata.
> {quote}
> ...which is straightforward and useful in the tested example where every 
> registered Query has {{"language"}} metadata, and every Document has a 
> {{"language"}} field, but gives unintuitive results when a Query or Document 
> does *NOT* have a {{"language"}}.
> A more "intuitive" & useful (in my opinion) implementation would be 
> something that could be documented as ...
> {quote}Filtering by additional fields can be configured by passing a set of 
> field names. Documents that contain values in those fields will only be 
> checked against \{@link MonitorQuery} instances
> that have the same fieldname-value mapping in their metadata <em>or have no 
> mapping for that fieldname</em>.
> Documents that do not contain values in those fields will only be checked 
> against \{@link MonitorQuery} instances that also have no mapping for that 
> fieldname.
> {quote}
> ...i.e.: instead of being a straight "filter candidate queries by what we find 
> in the filter fields in the documents", we can instead "derive the queries 
> that are viable candidates for each document if we were restricting the set 
> of documents by those values during a 'forward search'".



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
