[
https://issues.apache.org/jira/browse/SOLR-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated SOLR-6318:
-------------------------------
Attachment: SOLR-6318__terms_QParser.patch
Here it is, with test.
From the javadoc:
bq. Finds documents whose specified field has any of the specified values. It's
like TermQParserPlugin but multi-valued, and supports a variety of internal
algorithms. Parameters:
* f: The field name (mandatory)
* separator: the separator delimiting the values in the query string. By
  default it's a " " which is special in that it splits on any consecutive
  whitespace.
* method: Any of termsFilter (default), booleanQuery, automaton,
  docValuesTermsFilter.
Note that if no values are specified then the query matches no documents.
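To make the separator behavior concrete, here is a minimal Python sketch of how I read the javadoc above. It is an illustration only, not the patch's Java code: the helper names split_values and terms_fq are hypothetical, and the handling of non-space separators (used literally, values not trimmed) is an assumption based on the issue description. The request builder always passes the separator explicitly, so it does not depend on the default.

```python
def split_values(query_string, separator=" "):
    """Split a raw query string into values, mimicking the described rules.

    Assumption: a single-space separator splits on any run of whitespace
    (which also trims the values); any other separator is applied literally
    and the values are left untrimmed.
    """
    if separator == " ":
        # str.split() with no argument splits on runs of whitespace and
        # drops leading/trailing whitespace.
        return query_string.split()
    if query_string == "":
        # No values at all: per the javadoc, the query matches no documents.
        return []
    return query_string.split(separator)


def terms_fq(field, values, separator=" ", method=None):
    """Build an fq parameter value for the proposed "terms" parser.

    Hypothetical helper: it only formats local params named in the javadoc
    (f, separator, method) and does no escaping beyond quoting the separator.
    """
    for value in values:
        if separator in value:
            raise ValueError("value %r contains the separator" % value)
    local_params = ["f=%s" % field, 'separator="%s"' % separator]
    if method is not None:
        local_params.append("method=%s" % method)
    return "{!terms " + " ".join(local_params) + "}" + separator.join(values)


if __name__ == "__main__":
    print(split_values("  code1   code2\tcode3 "))
    print(terms_fq("myfield", ["code1", "code2", "code3"]))
    print(terms_fq("myfield", ["a", "b"], separator=",", method="booleanQuery"))
```

The first builder call is the "terms" counterpart of the naive df/q.op=OR filter quoted in the issue description below.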
It would be useful if somebody ran benchmarks that would let us choose between
the algorithms based on heuristics, but this is fine for now. For example, use
method=X when the number of values exceeds some threshold, and use
docValuesTermsFilter if docValues is enabled. Note that DocValuesTermsFilter
(trunk) is known as FieldCacheTermsFilter on 4x. On 4x this feature doesn't
support DocValues (just FieldCache), whereas on trunk it supports both,
depending on whether you indexed DocValues or not (I think). That method is
also limited to single-valued fields, but there's no explicit check.
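The kind of heuristic meant above could look roughly like the following Python sketch. Everything in it is hypothetical: the function choose_method is not part of the patch, and because no benchmarks exist yet, the threshold and the method to use for large value sets ("X" above) are left for the caller to supply rather than guessed. Checking docValues before the value count is likewise an arbitrary ordering chosen for illustration.

```python
# Method names as listed in the javadoc.
METHODS = ("termsFilter", "booleanQuery", "automaton", "docValuesTermsFilter")


def choose_method(num_values, has_doc_values, multi_valued,
                  threshold, large_set_method):
    """Pick a value for the "method" parameter from simple heuristics.

    threshold and large_set_method are caller-supplied assumptions; real
    values would have to come from benchmarking.
    """
    if large_set_method not in METHODS:
        raise ValueError("unknown method: %s" % large_set_method)
    # docValuesTermsFilter is limited to single-valued fields, and the
    # parser itself performs no explicit check, so guard here.
    if has_doc_values and not multi_valued:
        return "docValuesTermsFilter"
    if num_values > threshold:
        return large_set_method
    # Otherwise fall back to the parser's default algorithm.
    return "termsFilter"


if __name__ == "__main__":
    print(choose_method(5, False, False, threshold=100,
                        large_set_method="automaton"))
```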
I'll commit this in a couple days, pending input.
> QParser for TermsFilter
> -----------------------
>
> Key: SOLR-6318
> URL: https://issues.apache.org/jira/browse/SOLR-6318
> Project: Solr
> Issue Type: New Feature
> Components: query parsers
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 4.10
>
> Attachments: SOLR-6318__terms_QParser.patch
>
>
> Some applications require filtering documents by a large number of terms.
> It's often related to security filtering. Naively this is done this way:
> {noformat}
> fq={!df=myfield q.op=OR}code1 code2 code3 code4 code5...
> {noformat}
> And this ends up being a BooleanQuery. Users then wind up hitting
> BooleanQuery.maxClauseCount (sometimes in production, sadly) and they up it to
> a huge number to get the job done.
> Solr should offer a QParser based on TermsFilter. I propose it be named
> "terms" (plural of term), and have a "separator" option defaulting to a
> space. When it's a space, the values also get trimmed, which wouldn't
> otherwise happen. The analysis logic should be the same as that for "term"
> QParser which is to call FieldType.readableToIndexed.