: This looks compelling! I'm also not sure what, specifically, we can
: validate in Solr's configuration... and I also don't know how much
: validation we do today. What hard errors does Solr produce on startup
: when configuration is wrong?
once upon a time solr would log some config errors, but then happily start
up anyway.
that "feature" has since been removed (as far as i know) and solr will now
fail on any concrete error it encounters in the config.
what solr doesn't currently identify as an error is "unused" config
(ie: someone types <requeestHandler /> instead of
<requestHandler/>) ... although i seem to recall a patch committed not too
long ago that started checking for unused field/fieldtype attributes when
parsing schema.xml, along the lines of what mccandles described...
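
A rough sketch of the simplest form of that "unused config" check, just to make the idea concrete. It is written in Python rather than against Solr's actual config loader, and the KNOWN_TOP_LEVEL set is a made-up whitelist for illustration, not Solr's real list of elements:

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist for illustration only; NOT Solr's real element list.
KNOWN_TOP_LEVEL = {"requestHandler", "searchComponent", "updateHandler", "query"}

def unknown_elements(config_xml):
    """Return the tag names of the root's direct children that nothing recognizes."""
    root = ET.fromstring(config_xml)
    return [child.tag for child in root if child.tag not in KNOWN_TOP_LEVEL]

sample = """
<config>
  <requestHandler name="/select" class="solr.SearchHandler"/>
  <requeestHandler name="/typo" class="solr.SearchHandler"/>
</config>
"""

# The misspelled element is reported instead of being silently ignored.
print(unknown_elements(sample))  # ['requeestHandler']
```

The obvious weakness is the fixed whitelist: anything a plugin wants to add at that level would have to be registered somewhere first.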
: do something like this: when a plugin "claims" a certain attr/element,
: this is recorded. If at the end of loading the config, there are
: unclaimed attrs/elements, then that's an error.
One potential problem with generalizing this approach is that we support
"lazy" initialization of some plugins (it might just be RequestHandlers ...
i don't remember off hand) so we'd need to watch out for that -- the whole
point is to prevent the need for instantiating expensive plugins
unless/until they are actually used, so you wouldn't want to force them
to startup just to read/claim their configs.
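
To illustrate the tension, here is a minimal sketch of the claim-tracking idea with lazy plugins left alone. It is Python, not Solr code; CORE_ATTRS and the child_claims table are hypothetical stand-ins for "what the core reads" and "what a plugin would claim once initialized", and it assumes lazy handlers are marked with startup="lazy":

```python
import xml.etree.ElementTree as ET

# Hypothetical: attributes the core itself reads on every plugin element.
CORE_ATTRS = {"name", "class", "startup"}

def check_claims(config_xml, child_claims):
    """Report unclaimed attrs/elements without initializing lazy plugins.

    child_claims maps a plugin class name to the child element tags that
    plugin would claim during init (a made-up registry for this sketch).
    Returns (problems, deferred): deferred lists plugins whose nested
    config could not be checked because they are lazy.
    """
    root = ET.fromstring(config_xml)
    problems, deferred = [], []
    for el in root:
        name = el.get("name")
        # Attributes on the element itself can be checked without the plugin.
        for attr in el.attrib:
            if attr not in CORE_ATTRS:
                problems.append(f"{name}: unclaimed attribute '{attr}'")
        # Nested config is only "claimed" at init, which lazy plugins skip.
        if el.get("startup") == "lazy":
            deferred.append(name)
            continue
        claimed = child_claims.get(el.get("class"), set())
        for child in el:
            if child.tag not in claimed:
                problems.append(f"{name}: unclaimed <{child.tag}>")
    return problems, deferred

sample = """
<config>
  <requestHandler name="/a" class="solr.SearchHandler">
    <lst name="defaults"/>
    <lsst name="invariants"/>
  </requestHandler>
  <requestHandler name="/b" class="solr.SearchHandler" startup="lazy">
    <lsst name="defaults"/>
  </requestHandler>
</config>
"""

problems, deferred = check_claims(sample, {"solr.SearchHandler": {"lst", "arr"}})
print(problems)  # ['/a: unclaimed <lsst>']
print(deferred)  # ['/b']
```

Note that the same typo in "/b" goes unreported until that handler is first used, which is exactly the gap lazy init leaves in this approach.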
As to the larger question...
: More generally, before we hash out an approach here, I'd like to know
: if anyone disagrees that we should move Solr to more strict error
: checking of its configuration on startup. I think being silent on
: configuration errors is the wrong choice... and I think that's
: generally Solr's approach today (I think? Or do we catch
: configuration errors w/ a hard error and clear message?).
...as i mentioned: if solr sees an error, it should already fail hard and
loud.
I would love to be able to do either static validation or more aggressive
sanity checking of potential typos/unused configs on startup in a way that
would catch the cases we currently miss -- w/o preventing plugins from
having their own options (i regret not using something like xml
namespaces for this from day 1) -- but i suspect that it could wind up
being a lot of work for little gain....
Anecdotally, the most common "config mistakes" i see people making
are along the lines of this:
  <requestHandler name="/myHandler" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="paramNameWithTypo">explicit</str>
      <str name="wt">paramValueWithTypo</str>
      ...
...i don't know of any static way we could validate the configs that would
deal with what are ultimately going to be runtime params.
As i said: any improvements to help catch the mistakes we can identify
would be great, but we should maintain perspective of the effort/gain
tradeoff given that there is likely nothing we can do about the basic
problem of "a string that won't be evaluated until runtime"
-Hoss