Re: spellchecker problems (bugs)

Jonathan Lee Wed, 23 Jul 2008 03:45:47 -0700

I ran into a similar issue and was able to work around it as follows:

1. Similar to what https://issues.apache.org/jira/browse/SOLR-622 will do,
issue a spellcheck.reload=true command on the firstSearcher event to read
any existing index off disk. Here are the relevant parts of my
solrconfig.xml:


  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst> 
        <str name="qt">myHandler</str>
        <str name="q">*:*</str>
        <str name="rows">0</str>
        <str name="spellcheck">true</str>
        <str name="spellcheck.reload">true</str>
      </lst>
    </arr>
  </listener>
  <requestHandler name="myHandler" ...>
    ...     
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>
  <searchComponent name="spellcheck" class="...SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">name_spell</str>
      ...
    </lst>
    <str name="queryAnalyzerFieldType">text_spell</str>
  </searchComponent>


2. I believe there is a bug in IndexBasedSpellChecker.java and
FileBasedSpellChecker.java: the analyzer variable is only set by the build
command. Therefore, when the index is reloaded but not built after starting
Solr, issuing a query with the spellcheck.q parameter causes a
NullPointerException to be thrown (SpellCheckComponent.java:158). Moving the
analyzer logic to the constructor seems to fix the problem.
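
To illustrate the failure mode described in item 2, here is a minimal,
self-contained sketch. It does not use Solr's actual classes or the attached
patch: LazyChecker, EagerChecker, and the lowercase "analyzer" are hypothetical
stand-ins that only model a field set in build() versus a field set in the
constructor.

```java
import java.util.function.Function;

public class AnalyzerInitSketch {

    // Hypothetical stand-in for a spell checker whose analyzer is set only by build().
    static class LazyChecker {
        private Function<String, String> analyzer; // stays null until build() runs

        void build()  { analyzer = String::toLowerCase; }
        void reload() { /* would reopen the on-disk index; the analyzer is not touched */ }

        // Dereferences the analyzer, so this throws if build() never ran.
        String analyze(String query) { return analyzer.apply(query); }
    }

    // Same hypothetical checker with the analyzer set up in the constructor.
    static class EagerChecker {
        private final Function<String, String> analyzer;

        EagerChecker() { analyzer = String::toLowerCase; }

        void build()  { /* would rebuild the spelling index */ }
        void reload() { /* would reopen the on-disk index */ }

        String analyze(String query) { return analyzer.apply(query); }
    }

    // Returns true if running the action throws a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        LazyChecker lazy = new LazyChecker();
        lazy.reload(); // reload without build, as after a restart
        System.out.println("lazy throws NPE: " + throwsNpe(() -> lazy.analyze("Solr")));

        EagerChecker eager = new EagerChecker();
        eager.reload();
        System.out.println("eager result: " + eager.analyze("Solr"));
    }
}
```

In this sketch the reload-only path fails for LazyChecker and works for
EagerChecker, which is the behavior difference the constructor change is
meant to produce.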

I did not see a JIRA ticket for this (nor am I sure it's a real bug :), so I
have attached a patch with these changes. Please let me know if I have
overlooked something here, or if I should attach this to an actual ticket.

-Jonathan


> From: Geoffrey Young <[EMAIL PROTECTED]>
> Reply-To: <solr-user@lucene.apache.org>
> Date: Tue, 22 Jul 2008 11:07:41 -0400
> To: <solr-user@lucene.apache.org>
> Subject: Re: spellchecker problems (bugs)
> 
> 
> 
> Shalin Shekhar Mangar wrote:
>> The problems you described in the spellchecker are noted in
>> https://issues.apache.org/jira/browse/SOLR-622 -- I shall create an issue to
>> synchronize spellcheck.build so that the index is not corrupted.
> 
> I'd like to discuss this a little...
> 
> I'm not sure that I want to rebuild the spelling index each time the
> underlying data index changes - the process takes very long and my
> updates are frequent changes to non-spelling related data.
> 
> what I'd really like is for a change to my index to not cause an
> exception.  IIRC the "old" way of using a spellchecker didn't work like
> this at all - I could completely rm data/index and leave data/spell in
> place, add new data, not issue cmd=build and the spelling parts still
> worked just fine (albeit with old data).
> 
> not to say that SOLR-622 isn't a good idea (it is) but I don't really
> think the entire solution is keeping the spellcheck index in sync.  do
> they need to be kept in sync for things not to implode on me?
> 
> --Geoff
