[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
[ https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2571:
Attachment: SOLR-2571.patch

This version takes all of DirectSolrSpellChecker's parameters as Integer and Float objects rather than Strings, as appropriate. Also, I changed the accuracy parameter to use SpellingParams.SPELLCHECK_ACCURACY ... I'm not sure if this would have validated any unit tests (I didn't see any tests that use DirectSolrSpellChecker). I think this will make DirectSolrSpellChecker more consistent with the rest of solrconfig.xml's parameter requirements. The only better option than this, maybe, would be to make it flexible and allow either the Int/Float or String in these cases. I think this latter option is not necessary, however.

IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
Key: SOLR-2571
URL: https://issues.apache.org/jira/browse/SOLR-2571
Project: Solr
Issue Type: Bug
Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Priority: Minor
Labels: whereIsHossManWhenYouNeedHim
Fix For: 3.3, 4.0
Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.solr3.2.patch

When parsing the configuration for thresholdTokenFrequency, the IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived NamedList. However, this comes through as a String. Therefore, a ClassCastException is always thrown whenever this parameter is specified. The code ought to be doing Float.parseFloat(...) on the value.

This looks like a nice feature to use in cases where the data contains misspelled or rare words leading to spurious correct queries. I would have liked to have used this with a project we just completed; however, this bug prevented that. This issue came up recently in the User's mailing list, so I am raising an issue now.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
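To illustrate the failure described in this issue, here is a minimal, hypothetical Java sketch. A plain Map stands in for Solr's NamedList (an assumption, so that the example needs no Solr classes), the class and method names are invented for illustration, and the 0.0f default for a missing value is likewise assumed. It is not the code from the attached patch; it only shows a blind cast failing on a String value and Float.parseFloat(...) succeeding on the same input.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the spell checker's config handling.
// A Map plays the role of the NamedList built from the XML configuration.
final class ThresholdCastDemo {

    // Mirrors the failing pattern: a blind cast of the raw config value.
    // Throws ClassCastException when the value arrived as a String.
    static float readByCast(Map<String, Object> config) {
        return (Float) config.get("thresholdTokenFrequency");
    }

    // The approach suggested in the issue: parse the String form of the value.
    // The 0.0f default for a missing parameter is an assumption of this sketch.
    static float readByParse(Map<String, Object> config) {
        Object raw = config.get("thresholdTokenFrequency");
        return raw == null ? 0.0f : Float.parseFloat(raw.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        // An untyped config element delivers its value as a String.
        config.put("thresholdTokenFrequency", "0.01");

        try {
            readByCast(config);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getClass().getSimpleName());
        }
        System.out.println("parsed: " + readByParse(config));
    }
}
```

Parsing via toString() also tolerates a value that already arrives as a Float, which is why it is a reasonably safe replacement for the cast in a sketch like this.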
[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
[ https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2571:
Attachment: SOLR-2571.patch

Here is that patch with Ints/Floats instead of Strings. I made a tiny adjustment to the unit test also.
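For readers weighing typed Int/Float values against Strings in this thread, the following is a hypothetical Java sketch of a tolerant reader that accepts either form. The class name, method names, and fallback arguments are all invented for illustration and do not come from the patch or from Solr's API.

```java
// Hypothetical helper: accepts a config value that is either a typed number
// (Integer, Float, ...) or its String representation.
final class FlexibleParamDemo {

    // Returns the value as a float, or the fallback when the value is absent.
    static float toFloat(Object raw, float fallback) {
        if (raw == null) {
            return fallback;
        }
        if (raw instanceof Number) {
            return ((Number) raw).floatValue();
        }
        return Float.parseFloat(raw.toString().trim());
    }

    // Returns the value as an int, or the fallback when the value is absent.
    static int toInt(Object raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        if (raw instanceof Number) {
            return ((Number) raw).intValue();
        }
        return Integer.parseInt(raw.toString().trim());
    }

    public static void main(String[] args) {
        // Both forms of the same setting yield the same result.
        System.out.println(toFloat(Float.valueOf(0.5f), 0.0f));
        System.out.println(toFloat("0.5", 0.0f));
        System.out.println(toInt(Integer.valueOf(2), 0));
        System.out.println(toInt("2", 0));
    }
}
```

The cost of this flexibility is that malformed Strings only fail at parse time with a NumberFormatException, so a strict typed-only approach may still be preferable where early validation matters.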
[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
[ https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2571:
Attachment: SOLR-2571.patch

I'm betting the jury will rule we keep this a <float/> element, so here's a patch that changes DirectSolrSpellChecker. I also added a unit test for thresholdTokenFrequency and added a (commented-out) line for it in the example solrconfig.xml.

There are 3 TODOs in the unit test code:
1. My ignorance of the expression language used in unit tests led me to write an old-style, long-form unit test. If someone can show me how to convert this to a one-liner, I would be very appreciative.
2. I found that DirectSolrSpellChecker returns results in a slightly different format than IndexBasedSpellChecker. Is this OK? Can SolrJ handle this, or do we need to tweak something there?
3. Also, in one case IndexBasedSpellChecker returns correctlySpelled=false while DirectSolrSpellChecker returns correctlySpelled=true. Is this discrepancy valid?
[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
[ https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-2571:
Labels: whereIsHossManWhenYouNeedHim (was: )
Re: [jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
lol. This is the best label since my lucid internal hossmans_pedantic_woolpack_branch branch. I was using the label woolpack till he stumbled upon my ineptitude.

On Jun 2, 2011, at 6:53 PM, Robert Muir (JIRA) wrote:

[ https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-2571:
Labels: whereIsHossManWhenYouNeedHim (was: )

- Mark Miller
lucidimagination.com
[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup
[ https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2571:
Attachment: SOLR-2571.solr3.2.patch
Attachment: SOLR-2571.patch

Patches attached for Trunk and 3.x. This patch fixes the problem for IndexBasedSpellChecker. DirectSolrSpellChecker (Trunk only) appears to be correct already. Should this patch be committed, I will add documentation for thresholdTokenFrequency to the wiki. Currently it is absent from the wiki (although documented in SmileyPugh).