[ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044917#comment-13044917
 ] 

James Dyer commented on SOLR-2571:
----------------------------------

{quote}
what makes this 'decision' of correctlySpelled? Do you know?
{quote}

I took a quick look to find out.  It's more complicated than I thought!  Here's 
the basic gist (I think!):
 - If the instance of SolrSpellChecker returns frequency data and all 
suggestions have frequency >0, TRUE.
 - If the instance of SolrSpellChecker returns frequency data and any 
suggestion has frequency == 0, FALSE.
 - If the instance of SolrSpellChecker returns NO frequency data but has 
suggestions, OMIT.
 - If the instance of SolrSpellChecker returns NO suggestions, FALSE. 
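
Roughly, the rules above could be written like this.  This is only a sketch 
of my reading of the behavior, not the actual code in toNamedList(): the 
class, method and parameter names are made up, a null return stands for 
"omit the flag", and checking the "no suggestions" case first is an 
assumption on my part.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: restates the four rules listed above. The class, method, and
// parameter names are hypothetical and do not appear in Solr.
final class CorrectlySpelledRules {

    /**
     * @param suggestionCount number of suggestions the checker returned
     * @param frequencies     frequency of each suggestion, or null when the
     *                        checker returned no frequency data
     * @return Boolean.TRUE or Boolean.FALSE, or null to omit the flag
     */
    static Boolean decide(int suggestionCount, List<Integer> frequencies) {
        if (suggestionCount == 0) {
            return Boolean.FALSE;   // no suggestions (assumed to be checked first)
        }
        if (frequencies == null) {
            return null;            // suggestions but no frequency data: omit
        }
        for (int freq : frequencies) {
            if (freq == 0) {
                return Boolean.FALSE;   // at least one suggestion has frequency 0
            }
        }
        return Boolean.TRUE;        // every suggestion has frequency > 0
    }

    public static void main(String[] args) {
        System.out.println(decide(2, Arrays.asList(3, 7)));  // all frequencies > 0
        System.out.println(decide(2, Arrays.asList(3, 0)));  // one frequency is 0
        System.out.println(decide(2, null));                 // no frequency data
        System.out.println(decide(0, null));                 // no suggestions
    }
}
```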

Possibly this isn't fully accurate, but I'm at least mostly correct here.  It 
seems like the discrepancy with DirectSolrSpellChecker is because it isn't 
returning frequency info?

This all happens in SpellCheckComponent.toNamedList() ... I'm guessing the code 
here uses the presence or absence of frequency data as a kind of proxy 
indicator for whether it's dealing with IndexBasedSpellChecker or 
FileBasedSpellChecker.  Possibly it would be better if each instance of 
SolrSpellChecker had an "isCorrectlySpelled()" method that toNamedList() could 
call?  Maybe I should go open another JIRA issue for that?
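
To make the idea concrete, here is a rough sketch of what a per-checker 
method could look like.  Everything in it is hypothetical: the interface and 
class names are stand-ins rather than Solr classes, the method signature is 
a guess, and "correct means the token is in the index or word list" is just 
one plausible rule each checker might apply.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the proposed method; not part of Solr's API.
interface SpellingJudge {
    boolean isCorrectlySpelled(String token);
}

// Index-style checker: assumes a token is correct when it occurs in the index.
final class IndexBackedJudge implements SpellingJudge {
    private final Map<String, Integer> docFreq;

    IndexBackedJudge(Map<String, Integer> docFreq) {
        this.docFreq = docFreq;
    }

    @Override
    public boolean isCorrectlySpelled(String token) {
        return docFreq.getOrDefault(token, 0) > 0;
    }
}

// File-style checker: has no frequency data, so word-list membership decides.
final class FileBackedJudge implements SpellingJudge {
    private final Set<String> words;

    FileBackedJudge(Set<String> words) {
        this.words = words;
    }

    @Override
    public boolean isCorrectlySpelled(String token) {
        return words.contains(token);
    }
}

// Stands in for the caller: it asks the checker directly instead of
// inferring the answer from whether frequency data is present.
final class ResponseSketch {
    static boolean correctlySpelled(SpellingJudge checker, String token) {
        return checker.isCorrectlySpelled(token);
    }
}
```

The point of the design is that the caller no longer needs to know which 
kind of checker it has; each implementation answers the question itself.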


> IndexBasedSpellChecker "thresholdTokenFrequency" fails with a 
> ClassCastException on startup
> -------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2571
>                 URL: https://issues.apache.org/jira/browse/SOLR-2571
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 1.4.1, 3.1, 4.0
>            Reporter: James Dyer
>            Priority: Minor
>              Labels: whereIsHossManWhenYouNeedHim
>             Fix For: 3.3, 4.0
>
>         Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
> SOLR-2571.solr3.2.patch
>
>
> When parsing the configuration for "thresholdTokenFrequency", the 
> IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
> NamedList.  However, this comes through as a String.  Therefore, a 
> ClassCastException is always thrown whenever this parameter is specified.  
> The code ought to be doing "Float.parseFloat(...)" on the value.
> This looks like a nice feature to use in cases where the data contains 
> misspelled or rare words leading to spurious "correct" queries.  I would have 
> liked to have used this with a project we just completed; however, this bug 
> prevented that.  This issue came up recently in the User's mailing list, so I 
> am raising an issue now.
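
The cast failure described in the issue is easy to reproduce in isolation.  
The sketch below is not the Solr code: a plain Map stands in for the 
config-derived NamedList, and the class and method names are made up for 
illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a Map stands in for the NamedList built from the config file.
final class ThresholdParsingSketch {
    static final String KEY = "thresholdTokenFrequency";

    // Mirrors the failing pattern: the value is really a String, so the cast
    // to Float throws ClassCastException at runtime.
    static float readByCast(Map<String, Object> config) {
        return (Float) config.get(KEY);
    }

    // Parses the textual value instead, falling back to a default when the
    // parameter is absent.
    static float readByParse(Map<String, Object> config, float defaultValue) {
        Object raw = config.get(KEY);
        return raw == null ? defaultValue : Float.parseFloat(raw.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put(KEY, "0.01");   // values read from XML arrive as Strings

        System.out.println(readByParse(config, 0.0f));
        try {
            readByCast(config);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```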

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

