[
https://issues.apache.org/jira/browse/SOLR-12902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742424#comment-16742424
]
Hoss Man commented on SOLR-12902:
---------------------------------
Quick comment on something specific...
{quote}... I have added a test-case in the code to explain the scenario in
which the custom component will be helpful.
{quote}
Tirth: what you're describing is really more of an "example documentation" ...
when folks talk about having test cases for new functionality/patches, what
they mean is new JUnit powered tests that are either unit tests proving that
the underlying methods behave as documented, or integration level tests showing
that when a Solr request comes in, the search component behaves as expected (in
this case: letting the request execute and return the expected results, or
returning an expected error if it violates the configuration).
----
General feedback:
This is a type of functionality we've talked about for a long time, but one of
the reasons we (or at least "I") have never tackled it head on relates to my
main concern with the approach currently taken in the PR/patch: it sets us
down the path of needing a "laundry list" (which we have to maintain and
constantly update moving forward) of every possible param/feature (and
combinations thereof) that _some_ people *might* find problematic (with
configuration options for all of them) in order to help ensure that something
like this is useful for _most_ people.
The reason I say that is because typically when users come along to assess a
feature like this, and they are concerned about "A, B, C, & X", it's not useful
to them if it only solves "A, B, C, & D" w/o support for X. If they need their
own custom solution/plugin for preventing X, they might as well incorporate a
custom solution for "A, B, & C" as well, so they only need to worry about
configuring one solution instead of two.
The permutations of things to worry about providing configuration options for
are problematic as well, because it's not just a question of "here's *every*
solr param, let's add a config option to turn it off or limit its range of
legal values" (if it were we could maybe simplify the impl w/a "rules" syntax
that didn't need to know about specific param names) but it's also all about
the permutations of interconnected params. For example: folks who want to
support both faceting & highlighting, but not on the same requests; or
highlighting is ok, as long as rows isn't too big.
----
I think the only way to offer a really re-usable generalized solution for
something like this would be via the ScriptEngine, and letting people configure
their own set of arbitrary script(s) that could be compiled on startup, and
then evaluated against the request params (and request context). We could test
& provide some small re-usable sample/example scripts that people could choose
to mix and match or customize ... similar to how SpamAssassin rules are
provided/configured.
I think the simplest implementation on the Java side would be:
* configure a list of script files
* compile all scripts on init
* at request time loop over each script in order and eval
* if the script eval result is null, or .equals() FALSE or "new Float(0)",
continue to the next script
* if the script eval result is anything else, return its toString as an error
message
that way people could write scripts like...
{code:javascript}
if (params["rows"] > 100) {
  return "rows param is too high"
}
if (params["start"] > 10) {
  return "start param is too high"
}
if (null != params["facet.pivot"] && null != params["highlight"]) {
  ...
}
...
return 0
{code}
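To make the request-time loop above a bit more concrete, here is a minimal Java sketch of just the pass/fail logic. It's an illustration only, not a proposed patch: ScriptGuardSketch, Rule, isPass, firstError and intParam are hypothetical names (nothing like them exists in Solr), and a plain Java function stands in for each compiled script so the sketch doesn't depend on any particular script engine being installed. It also treats any numeric zero as a "pass", which is an assumption that goes slightly beyond the "new Float(0)" check described above.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch of the eval loop; none of these names exist in Solr. */
final class ScriptGuardSketch {

    /** Stand-in for one compiled script: takes request params, returns the eval result. */
    interface Rule extends Function<Map<String, String>, Object> {}

    /**
     * A result of null, Boolean.FALSE, or a numeric zero means "no objection".
     * Assumption: any Number equal to zero passes, since a script engine may
     * hand back an Integer or Double rather than a Float.
     */
    static boolean isPass(Object result) {
        if (result == null || Boolean.FALSE.equals(result)) {
            return true;
        }
        return result instanceof Number && ((Number) result).doubleValue() == 0.0;
    }

    /** Evaluates rules in order; the first non-passing result becomes the error message. */
    static Optional<String> firstError(List<Rule> rules, Map<String, String> params) {
        for (Rule rule : rules) {
            Object result = rule.apply(params);
            if (!isPass(result)) {
                return Optional.of(result.toString());
            }
        }
        return Optional.empty();
    }

    /** Parses an integer request param, falling back to a default when it is absent. */
    static int intParam(Map<String, String> params, String name, int dflt) {
        String raw = params.get(name);
        return raw == null ? dflt : Integer.parseInt(raw);
    }
}
```

A rule equivalent to the first "rows" check could then be written as a lambda that returns "rows param is too high" when intParam(params, "rows", 10) exceeds 100 and null otherwise; an empty Optional from firstError means the request is allowed through.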
But we could potentially also support a variant option for simpler scripts
w/less control over the error message returned...
* configure a NamedList mapping error strings to lists of script files, i.e....
{code:xml}
<lst name="scripts">
  <str name="limited pagination support">script1.js</str>
  <arr name="unsupported param combinations">
    <str>script2.js</str>
    <str>script3.js</str>
    <str>script4.js</str>
    ...
  </arr>
</lst>
{code}
* compile all scripts on init, maintain a mapping to their error string
* at request time loop over each script in order and eval
* if the script eval result is not .equals() TRUE then return the associated
error string
that way people could have much simpler "boolean expression" scripts like
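{code:javascript}
(params["rows"] <= 100)
&& (params["start"] <= 10)
// true unless both are present (a plain XOR would also reject requests with neither)
&& !(null != params["facet.pivot"] && null != params["highlight"])
&& ...
{code}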
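The variant loop could be sketched in Java along the same lines. Again this is only an illustration under stated assumptions: BooleanGuardSketch and Expr are hypothetical names, each "script" is modeled as a plain Java function, and an insertion-ordered Map stands in for the NamedList (a real NamedList can also hold repeated names, which this simplification ignores).

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch of the "boolean expression" variant; not an existing Solr class. */
final class BooleanGuardSketch {

    /** Stand-in for one compiled boolean-expression script. */
    interface Expr extends Function<Map<String, String>, Object> {}

    /**
     * Walks the error-string -> expressions mapping in iteration order (pass a
     * LinkedHashMap to keep configuration order). Any result that is not
     * Boolean.TRUE, including null, yields the associated error string.
     */
    static Optional<String> firstError(Map<String, List<Expr>> exprsByError,
                                       Map<String, String> params) {
        for (Map.Entry<String, List<Expr>> entry : exprsByError.entrySet()) {
            for (Expr expr : entry.getValue()) {
                if (!Boolean.TRUE.equals(expr.apply(params))) {
                    return Optional.of(entry.getKey());
                }
            }
        }
        return Optional.empty();
    }
}
```

The design trade-off is the one described above: the scripts shrink to a single expression, but the error message comes from the configuration rather than from the script itself.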
A lot of the "plumbing" code we'd need for something like this already exists
in the StatelessScriptUpdateProcessorFactory – we'd just need to refactor it
into a "ScriptUtils" helper class, and tease out some bits that require the
"Invocable" API since we wouldn't really need that here.
Thoughts?
> Block Expensive Queries custom Solr component
> ---------------------------------------------
>
> Key: SOLR-12902
> URL: https://issues.apache.org/jira/browse/SOLR-12902
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Tirth Rajen Mehta
> Priority: Minor
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Added a Block Expensive Queries custom Solr component
> ([https://github.com/apache/lucene-solr/pull/477]):
> * This search component can be plugged into your SearchHandler if you would
> like to block some well known expensive queries.
> * The queries that are currently blocked and failed by the component are deep
> pagination queries, as they are known to consume a lot of memory and CPU. These
> are:
> ** queries with a start offset which is greater than the configured
> maxStartOffset config parameter value
> ** queries with a row param value which is greater than the configured
> maxRowsFetch config parameter value